Friday, November 1, 2013

Opinion: Native and Managed, what do they really mean?

Microsoft and the virtual-machine world use many terms and, depending on the emphasis, they can say things that make little sense: "native performance", "performance per watt" and so on. In my view this all rests on terms that are never clearly defined.

I notice that this emphasis has shifted even more with phones, and I cannot untangle it without definitions, which would again defeat the purpose of this entry, so I will approach it from the technology side:
- native is, in many people's minds, associated with Ahead-Of-Time (AOT) compilers: you write the code in the language of your choice, and the compiler produces a final executable that runs directly on the CPU
- managed/virtual machine means applications are compiled to something intermediary, and before execution a runtime reads this "bytecode" and compiles it on the fly

Because of how compilers work, compilation is expensive, which means most virtual machines make compromises to keep interactive applications possible. This is why virtual machines compile code somewhat lightly, reducing the analysis steps they perform on it. This means two things: big applications typically start slower than their "native" counterparts, and in many cases the generated code quality is a bit weaker, so the code runs anywhere from a few percent slower to several times slower (more on this later).

Given this view, is it true that the managed application world is really that much slower than the native one? As with many answers in life, it depends:
- most of what you see on your screen depends on the GPU (video card) to be drawn, so even if the virtual machine is slow, as long as drawing happens on a separate thread the animations may run independently
- most virtual machines compile the hottest of the hot code (including JavaScript VMs, which today tend to use a separate thread/process to compile JS), so for simple/interactive applications you will get good (enough) performance
- some VMs have parts of the code compiled ahead of time into "native" code, for example using NGen; even though NGen is not a high-quality code generator, it is good enough, and it makes your application start fast
- VMs allow using native code directly, so if a loop is not well optimized by the virtual machine, the developer can drop into native code that runs as fast as any other native code (see the first sketch after this list)
- VMs tend to have a fast memory allocator, so an allocation-heavy application may run faster than a native application, if the native application doesn't use memory pools or other caches to speed things up (see the second sketch after this list)
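To make the native-interop point concrete, here is a minimal P/Invoke sketch in C#; the library name "fastmath" and the export "sum_doubles" are made up for illustration, but the [DllImport] mechanism is the standard .NET way to call into native code:

    using System.Runtime.InteropServices;

    static class FastMath
    {
        // Hypothetical native library "fastmath" exporting sum_doubles.
        // The VM only marshals the call; the hot loop itself runs as
        // fully optimized native code.
        [DllImport("fastmath", CallingConvention = CallingConvention.Cdecl)]
        public static extern double sum_doubles(double[] values, int length);
    }

    // usage: double total = FastMath.sum_doubles(data, data.Length);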
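And for the allocator point, a small sketch of an allocation-heavy loop (the Node class is just an example of mine): under a generational GC each 'new' below is typically little more than a pointer bump, while the equivalent C++ 'new' goes through a general-purpose heap allocator unless the developer adds a pool or arena:

    sealed class Node
    {
        public int Value;
        public Node Next;
        public Node(int value, Node next) { Value = value; Next = next; }
    }

    static Node BuildList(int count)
    {
        Node head = null;
        for (int i = 0; i < count; i++)
            head = new Node(i, head); // allocation-heavy: cheap under a GC
        return head;
    }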

In this hybrid world, the managed vs. native performance question is less meaningful than it was when we talked about full Java applications 10 years ago. It is even less meaningful as GPUs, and computation on the GPU, matter a lot in today's workloads.

This is why Microsoft's "Going Native" campaign puzzled me when it launched... the easiest way to achieve this (in the "managed" world) is to compile the bytecode upfront using NGen. Microsoft already does this on Windows Phone 8, where your MSIL code is compiled in the cloud.
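For the curious, precompiling an assembly with NGen is a one-liner from an elevated developer command prompt (MyApp.exe is a placeholder name):

    ngen install MyApp.exe

After this, the runtime loads the precompiled native image instead of JIT-compiling the MSIL every time the application starts.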

People were using C# not because its performance was great, but because it was good enough. C++ started to be used again because Microsoft did not invest in improving the quality of its managed code generation for a long, long time, while on the C++ side this work never stopped: in Visual Studio's Phoenix backend, by the GCC team and, of course, by the Clang/LLVM team.

The last issue with Managed vs Native is that people use it as a marketing pitch, like here: https://www.youtube.com/watch?v=3vGV4fF4KCM (minute 34:40), where Web technologies like JavaScript are contrasted with "True Native". Even if we disregard the word "Scripted", what is the performance profile of a JS application? If you use Canvas, it is hardware accelerated today; if you load code, most of it will be treated as dead code, and what does run may be something like 5x slower than fully optimized code, which, for code that runs once, say to sum all the items in a column, is still effectively instant.


CodeRefractor borrows decisions found in other open-source projects, like Vala or Objective-C (using smart pointers instead of a GC), from Java (escape analysis and Class Hierarchy Analysis), even from GCC (pure function annotations) and from Asm.js (take a subset of the language, optimize it properly, then add another feature and optimize that one properly too), because sometimes performance matters and has to be there by design. The long-term importance of CR (right now it is just a very limited pre-alpha) is, in my view, the approach: the "native" step basically means spending a long time optimizing upfront.
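To illustrate one of those borrowed ideas, here is a small C# example of what escape analysis looks for; the Accumulator class and Sum method are mine for illustration, not CR code:

    sealed class Accumulator
    {
        public double Total;
        public void Add(double x) { Total += x; }
    }

    static double Sum(double[] values)
    {
        // 'acc' is never stored in a field, never returned and never
        // passed to unknown code, so it does not "escape" this method;
        // an escape-analysis pass may allocate it on the stack (or
        // replace it with a plain local) instead of the GC heap.
        var acc = new Accumulator();
        foreach (var value in values)
            acc.Add(value);
        return acc.Total;
    }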

What CR can bring is not always performance, but more capabilities: combined with Emscripten (a C++ to JS compiler) or Duetto, you could start from C# and end up in JavaScript. As for myself, I hope it will at least be used to migrate some Mono applications like Pinta or Tomboy where Mono is not desired (Fedora Linux, anyone!?), instead of resorting to something like GNote (a by-hand C++ translation of Tomboy).
