Saturday, July 26, 2014

Why you cannot beat the compiler...

In the last weeks there was a big rewrite inside CodeRefractor. The main change is basically that we do in a multiple pass algorithm the building of code tree that can be built from the Main method. The changes we propagated from the frontend to the deep of the virtual methods backend (where the code is generated) and simplified.

using System;
class A
{
    public void F() { Console.WriteLine("A.F"); }
    public virtual void G() { Console.WriteLine("A.G"); }
}
class B : A
{
    public void F() { Console.WriteLine("B.F"); }
    public override void G() { Console.WriteLine("B.G"); }
}
class Test
{
    static void Main()
    {
        B b = new B();
        A a = b;
        a.F();
        a.G();
        b.G();
    }
}

Let's start with this code. Which is CR's output for Main method (C++ code)?

System_Void _Test_Main()
{

_B_F();
_B_G();
_B_G();
return;
}
 
CR doesn't have full inlining (yet!) but it has analysis steps that check if variables are used over the method boundaries and the most important part: it does it safely!

Many people may say that this code is contrived and unrealistic and I have to agree, also it has to be a "lucky case" that many things to happen:
- methods to devirtualize correctly
- the members of the method calls are not used
- the instance of B has no members  that are used, so the compiler can remove it safely

But on the other hand, this is also the strength of the compilers today: one month ago CR would not be able to write this code, but today it cans, so all you have to do to get a bit better performance is just update the compiler (or wait for the next release) and voila, you're good to go. Even more, every time when in future the compiler is able to devirtualize, it will devirtualize, over all program, and the most important bit: with no work from your side!

So, every time when you write code that targets a compiler (any compiler, not only CR), if the compiler project is improving, all you have to do is to wait, or to report bugs if the compiler doesn't compile your code, or update minimally your code to avoid the compiler issues and the code will improve itself.

Friday, July 18, 2014

Some String support

A simple program as this one:

 static void Main()
 {
        var s1 = "Hello";
        var s2 = " world";
        var s3 = s1 + s2;
        Console.Write(s3);
 }

 would not work and even seems to be a simple program, in fact is a bit harder because CR didn't have a nice way to map types and to implement a mix of the CR's internal string and the real string, so there was the need to duplicate the entire code of string and make it work.

Right now it works as the following:

[ExtensionsImplementation(typeof (string))]
    public static class StringImpl
    {
        [MapMethod]
        public static string Substring(string _this, int startIndex)
        {(...)}
        [MapMethod(IsStatic = true)]
        public static string Concat(string s1, string s2)
        {(...)}
        [MapMethod]
        public static char[] ToCharArray(CrString _this)
        {(...)}
 }
By adding implementation of mapped CrString for the System.String type makes possible to make implementations natural and as the library will grow (please spend very little time to support a simplified version of any class you think you can implement in .Net) many programs which were missing just the part of displaying some combined strings, right now is all possible.


Thursday, July 10, 2014

Improving debugging experience by improving internal structures

CodeRefractor has some clean advantages over other static compilers that target LLVM by targetting C++ but also some disadvantages.

One great advantage is that it is written in .Net and it bootstraps everything needed for output, and the output is somewhat human readable result. This also allows simple optimizations to be done as CR rewrites everything into an intermediate representation (IR) and this representation is simplified with many high level optimization steps.

So this instruction was in the past setup as a list of pairs of values: instruction kind and the instruction data and many casts were made based on this instruction. This is good design, but right now the code is represented as a specific instruction class. So if in the past for storing a label, you would have something like: pair of (Label, 23) where 23 was the label's name, right now is a class Label where it stores: JumpTo property which is set to value 23.

Similarly, more changes are to debug methods, in the past there was a similar pair of the method kind and a MethodInterpreter, and sadly the MethodInterpreter type will store based on the kind of method various fields which some of them were redundant, some of them unused, and changes of this kind. As the code right now is split, the experience in debugging CR is better also.

At last, but not at least, even is not yet fully implemented, it is important to say that the types right now are resolved using a construction named ClosureEntities, and there are steps (similar with optimization steps) that they find the unused types. This also will likely improve in future the definition of types and make possible to define easier the runtime methods. This new algorithm has one problem as for now: it doesn't solve the abstract/virtual methods and the fixing of them will be necessary in future.

Tuesday, July 1, 2014

Let's get Visual

About a month ago, Ciprian asked for developers to join CR, and I thought since I love the .Net ecosystem, I should be able to help a little here and there.After I looked at the code, having never really understood compilers or transpilers in this case, I needed an easier way to visualize what was going on behind the hood, so I could get up to speed with CR.

I then decided to mock up a "Visual Compiler" to both test CR and to get a simple call graph.
The Visual Compiler (VCR), once setup enables seamless development and testing of CR, alot of improvements can be made, such as full debugging support or clicking errors to take you to the line of code, but for now its a pretty useful tool. With this I have been able to catch up quite a bit and add/fix a few runtime features such as virtual methods and interfaces to CR.

VCR has the following features;
 - opening files
 -  enabling /  disabling specific optimizations
 - running the C# / CPP code and viewing output
 -  and testing the output of C# vs that of CPP.
 - multiple compiler support (although there is no gui to set that, yet)

With the knowledge gained, I am hoping to complete a Pet Project (C# Editor, Compiler and Interpreter for iOS), that has been in the Attic for about 2 years now.

I am currently writing a few more tests and looking into delegates to have CR almost feature complete with the C# language.

Ofcourse even with this tool, I needed alot of guidance, which Ciprian has been kind enough to provide. I am pretty optimistic of where this project is heading :)

Oops, Almost forgot, this is what VCR looks like;

VCR Running on Windows Server 2012 R2

My next blog post will be on the architecture of CR itself, stay tuned.


Enjoy CR,
Ronald Adonyo.

Wednesday, June 18, 2014

Let's box this and that

Roland, a great open-source developer joined CR with his expertise and I'm excited waiting for the new features will appear in future of CR. He also exposed some bugs in implementation and I was happy to fix them!

Some features:
- a simple implementation of auto boxing/unboxing is done
- optimizations are defined by category (no command line flag yet) but right now is possible to add just "propagation" optimizations, or just "constants folding" ones and so on
- some old optimizations were not using use-def optimizations, and right now they do. This means in short: they are a bit faster than in the past.

Monday, June 2, 2014

Status: Updates from time to time

This last month I did not have any commit as I was busy but I will try to give to CR some time. What I will try to address is mostly the maintanance of codebase as my (free) time is not always a thing I can guarantee.

So, in short what to expect:
- I want to cleanup some constructions: I am thinking mostly to make labels to use their hex notations. I will look if the CR works, because I made a bigger program and it looks CR fails (so be aware of this :( )
- I want to make a separate (example) backend/frontend: The idea is this: if anyone will have time, most likely CR will work with a rich C++ runtime (like Qt) to support some common classes. But it will be hard without knowing what is the front/backend job, or is to tweak the compiler
- I want to integrate in future to C# (most likely gmcs/csc) compiler: so you can run something like:  
[mono] crcs.exe <...>
and this code will be run the compiler frontend.
- In a longer future I want to add some unit tests

Please add me to Google+ and I will be glad anyone wants to make look into CR and is blocked in understanding how to work with the project. I will be glad to help and assist anyone working with the codebase.

Friday, May 9, 2014

Errors are mostly fixed

By mostly I mean that is somewhat good to use, and there may be a bug here and there, but the type closure bug is fixed. Also the documentation is also updated with the new architectural changes.

So, you may take the tip of the Git repo now and play back. Is not fully stable but it generates most of the code at its place.