Friday, June 7, 2013

Introduction to CodeRefractor: CIL to Native AOT

Welcome to CodeRefractor, an experimental project that compiles MSIL/CIL opcodes to C++ and then automatically builds a native (binary) version of the code.

The goal of this converter is to let MSIL code run on machines that have neither Mono nor the .NET runtime installed, while using C++ paradigms as much as possible.

Here are some questions I think an interested reader may have, with their answers.

Is CodeRefractor a code translator?
If by code translator you mean a tool that maps each opcode directly to C++, then the answer is no: it is a full compiler that uses an intermediate representation.

In brief, CodeRefractor does the following:
- reads a .NET assembly using reflection
- starts from the entry point (the Main method) and reads its CIL bytecodes/operations
- rewrites these operations into an intermediate representation
- (optionally) optimizes the intermediate representation, simplifying simple expressions and removing redundancies
- writes the intermediate representation out as C++
- invokes a C++ compiler and links the result against a static library that implements some simple operations, producing the executable

This native executable should behave as closely as possible to the original .NET application.
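To give an idea of what "simplifying simple expressions" on an intermediate representation can mean, here is a minimal constant-folding sketch in C++. It is only an illustration under my own assumptions: the `Operand` and `Instruction` types and the `fold_constants` function are hypothetical and are not CodeRefractor's actual data structures or optimizer.

```cpp
#include <string>
#include <vector>

// Hypothetical operand: either an integer literal or the name of a local.
struct Operand {
    bool is_constant;
    int value;          // meaningful only when is_constant is true
    std::string local;  // meaningful only when is_constant is false
};

Operand constant(int v) { return Operand{true, v, ""}; }
Operand local(const std::string& name) { return Operand{false, 0, name}; }

// Hypothetical three-address instruction: dest = lhs op rhs.
struct Instruction {
    std::string dest;
    char op;      // '+', '-', '*', or '=' for a plain assignment
    Operand lhs;
    Operand rhs;  // ignored when op is '='
};

// Rewrites "dest = c1 op c2" into "dest = c" when both operands are literals.
// Returns the number of instructions that were simplified.
int fold_constants(std::vector<Instruction>& code) {
    int folded = 0;
    for (Instruction& ins : code) {
        if (ins.op == '=' || !ins.lhs.is_constant || !ins.rhs.is_constant) {
            continue;  // nothing to fold here
        }
        int a = ins.lhs.value;
        int b = ins.rhs.value;
        int result = 0;
        switch (ins.op) {
            case '+': result = a + b; break;
            case '-': result = a - b; break;
            case '*': result = a * b; break;
            default: continue;  // unknown operator: leave it untouched
        }
        ins.op = '=';
        ins.lhs = constant(result);
        ins.rhs = constant(0);
        ++folded;
    }
    return folded;
}
```

With this sketch, an instruction such as `t0 = 2 + 3` becomes the plain assignment `t0 = 5`, while `t1 = t0 * 4` is left alone because one operand is a local. A real optimizer would typically also propagate the folded constant into later instructions.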

Projects with a similar design exist, most of them in the Java world:
- Excelsior JET
- GCJ
- RoboVM
- Mono --full-aot mode

This project will not be able to generate a C++ version of every C# (or Boo, F#, etc.) program, and some language specifics will make it behave differently (for example, run many times slower). However, a program written with CodeRefractor's design in mind should perform comparably to C++ code, even though it is written in C#.

How will CR handle licenses without violating them?
CR will not scan any GAC assemblies (or system ones) for instructions, but it will let users add their own replacement implementations. We recommend that users not scan code whose license does not allow it. We plan to use Mono's BCL implementation (its CIL bytecodes) as a fallback when something is missing.

C++ has no GC, so how is memory management done?
The generated code uses smart pointers and requires a C++11 compiler (to make sure the compiler supports smart pointers). This also means that users have to watch out for reference cycles. For now, the easy way to break a cycle is to set a pointer to null. Weak pointers are planned for the future.
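The cycle problem is easy to reproduce in plain C++11. The sketch below assumes that a reference field maps to a `std::shared_ptr`; the `Node` type and the `leaked_nodes` helper are hypothetical names made up for this illustration and do not come from the generated code.

```cpp
#include <memory>

// Hypothetical class standing in for a translated reference type.
struct Node {
    static int live;             // number of Node objects currently alive
    std::shared_ptr<Node> next;  // assumed mapping of a reference field
    Node() { ++live; }
    ~Node() { --live; }
};
int Node::live = 0;

// Builds two nodes that point at each other, optionally breaks the cycle,
// and returns how many nodes are still alive after the locals go away.
int leaked_nodes(bool break_cycle) {
    int before = Node::live;
    {
        auto a = std::make_shared<Node>();
        auto b = std::make_shared<Node>();
        a->next = b;
        b->next = a;            // reference cycle: a -> b -> a
        if (break_cycle) {
            a->next = nullptr;  // setting one link to null breaks the cycle
        }
    }                           // a and b go out of scope here
    return Node::live - before;
}
```

Calling `leaked_nodes(false)` reports two nodes that are never freed, because each keeps the other's reference count above zero. Calling `leaked_nodes(true)` reports zero, since clearing one link lets both counts drop to zero. A `std::weak_ptr` for one of the two links would avoid the manual step.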

Does it support generics, LINQ, reflection, the Math.Cos method, ... (put your feature here)?
No, it doesn't. Some of these limitations will be fixed as the software evolves. We hope that feedback and help from the community will let us address as many of these features as early as possible.

What does work?
Hello-world-style samples; the most complex one is the NBody benchmark. In short: classes, static and instance (non-virtual) methods, operations on many primitive types, while/for loops, array types, fields (but not properties yet), and constructors.

Where is the project?
You can study, read, and contribute to it here. The license file is not yet written, but the terms are GPL2+ for the compiler and MIT/X11 for the libraries. Rephrased for companies that want to use it: you can use it commercially, but if you make changes to the compiler, you have to contribute them back. If you leave the compiler as-is, you can change the libraries however you need. We use basically the same licensing as the Mono project.

Legal note: we are not endorsed by, nor do we have any relation with, the Mono project or Microsoft.
