Saturday, February 13, 2016

Java's Flat Collections - what's the deal? (Part II)

I thought about cases when people would want to use flat collections. The most obvious are for example an "point array", "Tuple array", but as thinking more I found some interesting case which is also kind of common: "rectangle", "triangle" or similar constructs.

Typically when people define a circle for instance, would build it as:
class Circle{
   Point2f center = new Point2f();
   float radius;
}

Without noticing maybe, if you have to store for a 32bit machine one hundred of circles, you will store in fact much more data than the: center.x, center.y, radius x 4 bytes = 12 bytes per circle, and for 100 circles is 1.2 KB (more or less), but more like:
- 100 entries for the reference table: 400 bytes
- 100 headers of Circle object: 800 bytes
- 100 references to Point: 400 bytes
- 100 headers of Circle.center (Point2F): 800 bytes
- 100 x 3 floats: 1200 bytes

So instead of your payload of 1.2 KB, you are into 3.6 KB, so there is a 3X memory usage compaction.

If you have 100 Line instances which themselves have 2 instances of Point2f, you will have instead of 1600 B: (refTable) 400 + (object headers) 2400 bytes + (references to internal points) 800 + (payload) 1600 = 5200 B which is a 3.25X memory compaction.

A simple benchmark shows that not only memory is saved, but also the performance. So, if you use Line (with internal 2 points in it) and you would populate flat collections instead of plain Java objects, you will get the following numbers:

Setup values 1:12983 ms.
Read time 1:5086 ms. sum: 2085865984

If you will use Java objects, you will have a big slowdown on both reading and writing:
Setup values 2:62346 ms.
Read time 2:18781 ms. sum: 2085865984

So, you will get more than 4x speedup on write (like populating collections) and 3x speedup on read by flattening most types.

Last improvement? Not only that reflection works, but sometimes it is ugly to create a type, reflect it and use it later for this code generator of flatter types. So right now, everything the input config is JSon based, and you can create on the fly your own "layouts" (meaning a "flat object"):
{
  "typeName": "Point3D",
  "fields": ["X", "Y", "Z"],
  "fieldType": "float"}
This code would create a flat class Point3D with 3 floats in it named X, Y, Z (meaning the cursor will use a "getX/setX" and so on).

Here is the attached formatted input of the code generator file named: flatcfg.json.

No comments:

Post a Comment