Around year 2000 the computing was changed drastically, even more with the launching of multi-core era around year 2005 and after. Right now, we are in another "revolution" using "computing". And every time you will see new benchmarks where a generation is 25% faster (like A8 is faster than A7) or 100% (like A7 is "up-to" 100% faster than A6), and so on!
But as there are many whereabouts in real life, benchmarks are critical to show if you can have a good experience for users:
- you want your image effect to finish fast. Even more when there is a movie to be processed and a movie as is a huge collection of photos, you care about the language is used and the time your animation is finished
- you want all cores to be used especially when 4 cores look to be the norm today or at least 2 cores with hyperthreading, so you want that all your hardware to be used
- you want to save battery for your mobile devices
- and many other reasons where when things go fast, you will feel happier than if things go slow
What you should care about
But when you as an user may want all these, sometimes you miss the mark. I simply mean that when you do your tasks, you may look in the wrong direction:
- the disk is many many times slower than memory. So let's say you have to make as fast as possible a picture you've just received on Skype from a friend and he/she asks you to use Sepia tones. Starting Photoshop (or Gimp) can take literally many seconds if not a full minute (if you have more plugins on startup). So using Paint.Net will start the picture editing program faster, you can select faster to Sepia tone and you can send eventually faster the picture back faster. This is not to say that theoretically Photoshop if you would have 100 pictures to load it may not finish them faster (or any other program on that matter), but is often when starting a smaller program makes your task to finish faster
- on a similar note: do you have enough RAM (system memory)? When your system feels slow, is very often that is one of the two factors: you have few memory on your system (and memory is really one of the most disproportionately cheap component) or your CPU is really outdated (like let's say you want to process big pictures with a Pentium 4 1.7 GHz from 2001). But even the 2nd case is what's make your system slow, is still likely that adding memory will make it faster and usable even 10 years later!
- if your application doesn't care much about RAM and disk, is (very) probably a game, so the next question is: do you have the best video card is in your budget? It can be an integrated video card, but to be the fastest you can afford. Very few games are CPU limited, and even the CPU is to slow, and you have just 30 frames per second (so you can "feel" gaming lag), you can increase game detail and you will lose very little performance (if you have a to fast video card), and all will decrease to 28 FPS. So nothing to lose here!
What Internet says
Ok, being said that, I noticed some very bad benchmarks lately over the Internet, and they are made basically to put shocking numbers. like Android L will run basically slower than Android 4.4, or the reverse (these are Google numbers) in the attached picture, or the latest of them, a bit old, but with a big surprise.
What is funny in these 3 sources, are basically these conclusions:
So why this mess? As a (somewhat) compiler writer, I understand the following: all compilers have tradeoffs and sometimes (like the preview builds) you enable debug information to get better information. Also, when you compare, you have to make sure you compare the same thing. The last item that matter.
Investigating the claims
- CR doesn't handle exceptions at all, including bounds checking, NPE, so if you have a lot of iterations, you can get really awesome generated code, but it compares a minimalist "C-like" code with .Net's full code
- CR anyway uses C++ allocator which is roughly an order of magnitude slower than the .Net's GC, so if you allocate a lot of objects .Net ca be faster even .Net has exceptions (the same story is with Mono). CR has optimizations to help on this (mostly escape analysis, but users have to write code in a compiler friendly way).
Java most of the time (in low level math) with hot loops receives like 90% of C++ performance. In my experience, CodeRefractor was a bit faster (11%) in a very tight loop with no special hand-tune of code (but still tunning as much the compiler flags) so I can confirm as a consistent experience
For the sake of correctness, V8 was not all the time faster than the JIT of the Dalvik, the Android's "Java" implementation, but Dalvik was not improved from 2010 JIT wise, but most of improvements were GC related, and that Android applications will use GPU so they will use much less CPU to draw the screen, making applications to feel faster without Dalvik to become more CPU efficient. Do you remember how I've started the post: when applications need to feel faster, the best way to improve is to improve the components, not CPU only: this is what Google teams did, and I think it was the most sensible thing to do.
As a conclusion, Android L also introduces ART an AOT technique of compilation ("similar" with CR) meaning that (most of) compilation will happen on install time, making at least compilation to be program wide and I would expect as the Google team will improve the ART runtime, will make it very likely faster (if is not already) than V8.
So, how to spot bad benchmarks?
Every benchmarks which are not made for the domain of the user (typically your application), you should state upfront is a bad benchmark. Even more when is comparing "runtimes" or VMs in different class.
This is not to say that are not useful benchmarks, but you can get so easily tricked (and even experts can make mistakes, like this amazing talk of Gil Tene about GC latency and "almost"!) asking your compiler vendor (or JS vendor) to optimize for SunSpider, instead optimizing for your (web) application (like FaceBook, GMail, or what you really use from the web).
Soon I will likely announce my next (not CR related) FOSS project, and will be in the world of Java. The reason is that I understood why Java can give so much performance if written properly and the multitude of tools, but this is for another post. Also, I want to make clear that CR will likely not be updated for a time (excluding someone is interested on some feature and needs assistance).