Wednesday, February 26, 2014

Technical: Which optimizations to expect from CodeRefractor?

This entry is not a progress report; it is about the state of CR today: which optimizations you should expect CR to perform for you, and which it will not.

I will cover the most important ones in terms of performance and everyday usage, and I will also point out where the problems are.

GC and Escape Analysis

If you look at the code CR generates, you will see that when you declare a reference it typically uses std::shared_ptr (a smart pointer, that is, a reference-counted pointer). This way of freeing unreferenced objects is very common and works fine, even for very simple operations. But every time you assign a smart pointer, the CPU has to increment (and later decrement) the reference count.

Escape analysis removes a lot of these unnecessary increments and decrements, but it has some caveats: it works really well for local variables, but it does not do well across function boundaries, especially for return values.

So how do you make your reference-counted program run fast?

First of all, remove unnecessary assignments of references.

Let's take this simple code:
class Point2D { ... public float X, Y; }
class Vector2D { ... public float X, Y; }
Vector2D ComputeVector(Point2D p1, Point2D p2)
{return new Vector2D{ X = p1.X-p2.X, Y = p1.Y-p2.Y }; }
var v = ComputeVector(p1, p2);

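Here every call creates a new Vector2D and hands it back as a return value, which is the case the analysis handles poorly. The easiest fix is simply to define Point2D and Vector2D as structs (value types). Another solution is to recycle the object, as in the following code: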
void ComputeVector(Vector2D result, Point2D p1, Point2D p2)
{
   result.X = p1.X-p2.X;
   result.Y = p1.Y-p2.Y;
}

This is good because the calling code can look something like this:
var result = new Vector2D();
ComputeVector(result, p1, p2);
Console.WriteLine(result.X);
Console.WriteLine(result.Y);

If this is your final code (or something like it) and result is not used afterwards, the result variable will be allocated on the stack, avoiding not only the smart pointers but also the memory allocation overhead.
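The value-type alternative mentioned earlier can be sketched as follows. This is a minimal illustration, assuming the members elided with "..." in the original classes are not needed; the class name VectorDemo and the sample coordinates are made up for the example. How CR represents structs in the generated C++ is not shown here.

```csharp
using System;

// Assumption: only the X and Y fields matter for this example.
struct Point2D
{
    public float X, Y;
}

struct Vector2D
{
    public float X, Y;
}

static class VectorDemo
{
    // The result is returned by value, so there is no reference to count.
    public static Vector2D ComputeVector(Point2D p1, Point2D p2)
    {
        return new Vector2D { X = p1.X - p2.X, Y = p1.Y - p2.Y };
    }

    static void Main()
    {
        var p1 = new Point2D { X = 3f, Y = 5f };
        var p2 = new Point2D { X = 1f, Y = 2f };
        var v = ComputeVector(p1, p2);
        Console.WriteLine(v.X); // 2
        Console.WriteLine(v.Y); // 3
    }
}
```

Keep in mind that structs are copied when they are assigned or passed as arguments, so the "recycle the object" version would need a ref parameter if Vector2D were a struct.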

Constants and propagation of values

CR extensively evaluates every constant and simplifies everything as far as it can (barring bugs, of course). This means that if you use constants in your program, they will be propagated throughout the body of the function. So try to parametrize your code using constants where possible. Also, if you use simple types (like int, float, etc.), I recommend creating as many intermediate variables as you want. CR will aggressively remove them at no cost to the programmer. So, as long as you are not working with references to heap objects (reference types), use as many variables as you like. The compiler will also propagate values as deep as possible into the code (there are still small caveats), so if you keep a value around just for debugging purposes, keep it: it makes your code more readable, and that is more important.
Constants not only simplify formulas, they also remove branches in if or switch statements (when the condition can be proven constant). This is good to know for another reason: if you have code like if(Debug) ... during development, you can have the peace of mind that this if statement will not be checked anywhere in the generated code.
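A small sketch of the style described above. The names Config, Debug, Scale and ScaledArea are hypothetical, invented for this example; whether CR folds this exact code is an assumption based on the description, not something verified here.

```csharp
using System;

static class Config
{
    // Hypothetical flags, named for this example only.
    public const bool Debug = false;
    public const int Scale = 4;
}

static class ConstantsDemo
{
    public static int ScaledArea(int width, int height)
    {
        // Intermediate variables of simple types, kept for readability.
        int area = width * height;
        int scaled = area * Config.Scale;

        // The condition is a compile-time constant, so the branch is a
        // candidate for removal (the C# compiler may also warn that the
        // body is unreachable).
        if (Config.Debug)
        {
            Console.WriteLine("area = " + area);
        }
        return scaled;
    }

    static void Main()
    {
        Console.WriteLine(ScaledArea(2, 3)); // 24
    }
}
```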

Purity and other analyses

Without turning this into a functional programming talk: the compiler checks for some function properties, the most important of which are Purity, ReadOnly, IsGetter and IsSetter.
IsGetter and IsSetter matter because they let the compiler inline the accessors every time you use an auto-property.
A pure function is a function that takes only simple types as input (no strings for now, sorry), returns a simple value, and changes nothing global in between. A read-only function is a function that can depend on anything (not only simple types) but does not change anything global and returns a simple type.
If you write your code using pure functions everywhere, you get the luxury of the compiler doing a lot of optimizations with them: when you call them with constants, it computes the result at compile time.
So when you write Math.Cos(0), it will be replaced with 1.0, and beyond that, the compiler will merge many common expressions, even function calls.
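To make the three categories concrete, here is a minimal sketch. Circle, Square and Area are hypothetical names for this example, and the comments describe how each function fits the definitions above, not what CR is guaranteed to do with it.

```csharp
using System;

class Circle
{
    // Auto-property: its getter and setter are trivial accessors,
    // the kind the IsGetter/IsSetter analysis is meant to inline.
    public double Radius { get; set; }
}

static class PurityDemo
{
    // Pure in the post's sense: simple-typed input, simple result,
    // nothing global is changed.
    public static double Square(double x)
    {
        return x * x;
    }

    // Read-only in the post's sense: it reads an object, changes
    // nothing, and returns a simple type.
    public static double Area(Circle c)
    {
        return Math.PI * Square(c.Radius);
    }

    static void Main()
    {
        // Pure function called twice with the same constant: a candidate
        // for compile-time evaluation and for merging the repeated call.
        double k = Square(3.0) + Square(3.0);
        Console.WriteLine(k); // 18

        var c = new Circle { Radius = 2.0 };
        Console.WriteLine(Area(c));
    }
}
```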

Unused arguments (experimental)

When you write your code, you want to make it configurable, so you may add many arguments, some of which will not be used at all. The latest (git) code removes unused arguments. Do you remember the part about using constants everywhere? If you call a function with a constant, and this is the only call, the constant is propagated into the function, and after that the argument is removed.

The resulting code is as efficient as an #if directive, but the work is done by the compiler with no effort on your side (apart from setting up the environment to enable or disable debug, or whatever flags you have in your program).
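A sketch of the situation described above, with a hypothetical function Sum whose verbose argument exists only for configurability. With a single call site passing the constant false, the description suggests the constant is propagated into the function, the branch disappears, and the argument can then be dropped; this example only shows the source pattern, not CR's actual output.

```csharp
using System;

static class UnusedArgDemo
{
    // Hypothetical helper: 'verbose' is a configuration argument.
    public static int Sum(int[] values, bool verbose)
    {
        int total = 0;
        foreach (int v in values)
        {
            total += v;
            if (verbose)
            {
                Console.WriteLine("running total: " + total);
            }
        }
        return total;
    }

    static void Main()
    {
        // The only call site, and the argument is a constant.
        Console.WriteLine(Sum(new[] { 1, 2, 3 }, false)); // 6
    }
}
```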

Conclusions

There are other optimizations, but many of them are low-level and not so practical. In short, here is a small list of things you should do to improve the performance of CodeRefractor-generated code:
- when a function's result is an object that you need to change, pass that object in from outside as a parameter
- use constants everywhere for configuring your runtime
- use local variables of simple types as often as you want
- try to write functions that do not change too much state; keep them small and with a single purpose (like a computation). They will be optimized really well, and entire calls may be removed if you write small functions
- make branches configurable with simple-type parameters. The compiler will speed up the code in some cases if it can prove that your configuration variables are set to specific values
