Monday, November 2, 2015

Vulkan, DirectX 12 and the Low Level API craze

A bit of history

PCs and consoles have long competed to offer the best visuals and experience in games. Consoles typically had high-end specifications at launch but aged fairly quickly, because competition in the PC market was fiercer; even so, they kept offering consistent and often higher frame rates. How was that possible? Two factors played a part. First, there was no fragmentation, so programmers could fully use the hardware without coding workarounds for a specific component that does not offer hardware acceleration. Second, the hardware was open to developers (after signing an NDA) at a lower level than the classical OpenGL/DirectX APIs allow.

Enter Mantle

Mantle was AMD's idea for offering this low-level access to their hardware, and they worked with a game developer (DICE) to make it more usable for "mere mortals". Mantle had a fairly small impact on games overall, but a big impact on the industry because of its big (theoretical) potential. Later Mantle was offered as the starting point for Vulkan, and Microsoft's DirectX 12 and Apple's Metal followed suit to offer similar functionality on their (proprietary) OSes.

So what is so special about these low-level APIs? (My analysis is based mostly on Vulkan presentations and documentation, plus my (limited) understanding of DirectX 12, assuming that many things are similar.)

Three items are the most important:
- don't do most of the "rendering" on the main thread
- move part of the driver code into user space
- don't do (any) validation in "release mode"

Don't render on the main thread

In a classical OpenGL/DirectX application, rendering basically means issuing drawing commands to a driver, which processes them in a pipeline. There are also pixel and vertex shaders, which pre- and post-process pixels and geometry. For historical reasons most developers are used to drawing from the main thread, so that thread basically has to wait for the driver to finish all the drawing.

Drawing commands are now grouped into what are called command buffers, and these command buffers can be recorded on separate CPU threads, and they can be reused! Follow this presentation for more (high-level) details.

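// Pre-release-style pseudocode: vkCmdDoThisThing and vkCmdDoSomeOtherThing are placeholders.
VK_CMD_BUFFER_BEGIN_INFO info = { ... };
vkBeginCommandBuffer(cmdBuf, &info);   // start recording into cmdBuf
vkCmdDoThisThing(cmdBuf, ...);         // commands are only recorded here, not executed
vkCmdDoSomeOtherThing(cmdBuf, ...);
vkEndCommandBuffer(cmdBuf);            // finish recording; the buffer can now be submitted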


This in itself scales horizontally, both on higher-spec machines and on lower-spec (yet multi-core) ones such as ARM or Atom CPUs, which is really great for chips with many cores that are not that fast.
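To make the threading idea concrete, here is a minimal sketch in plain C++ that does not touch a real driver. CommandBuffer, record_slice, record_frame and submit are hypothetical names of my own, not Vulkan calls; the only point is that each worker thread records into its own buffer without locks, and a single thread later submits the buffers in a fixed order. It assumes worker_count is at least 1.

```cpp
#include <functional>
#include <string>
#include <thread>
#include <vector>

// Hypothetical stand-in for a command buffer: just a list of recorded
// command names. Nothing here talks to a real GPU driver.
struct CommandBuffer {
    std::vector<std::string> commands;
};

// Record the draw commands for one slice of the scene. Each worker owns its
// buffer, so no locking is needed while recording.
void record_slice(CommandBuffer& buf, int first_object, int object_count) {
    buf.commands.push_back("begin");
    for (int i = 0; i < object_count; ++i) {
        buf.commands.push_back("draw object " + std::to_string(first_object + i));
    }
    buf.commands.push_back("end");
}

// Split total_objects across worker_count threads, one buffer per thread.
// Assumes worker_count >= 1.
std::vector<CommandBuffer> record_frame(int total_objects, int worker_count) {
    std::vector<CommandBuffer> buffers(worker_count);
    std::vector<std::thread> workers;
    const int per_worker = total_objects / worker_count;
    for (int w = 0; w < worker_count; ++w) {
        const int first = w * per_worker;
        // The last worker also takes the remainder.
        const int count =
            (w == worker_count - 1) ? total_objects - first : per_worker;
        workers.emplace_back(record_slice, std::ref(buffers[w]), first, count);
    }
    for (std::thread& t : workers) {
        t.join();
    }
    return buffers;
}

// "Submit" the buffers in a fixed order from a single thread, the way a
// queue would consume them. The buffers are left untouched, so they can be
// submitted again on the next frame.
std::vector<std::string> submit(const std::vector<CommandBuffer>& buffers) {
    std::vector<std::string> stream;
    for (const CommandBuffer& buf : buffers) {
        stream.insert(stream.end(), buf.commands.begin(), buf.commands.end());
    }
    return stream;
}
```

Calling record_frame(10, 4) and then submit on the result twice gives the same command stream both times, which is the "reuse" part: recording is paid for once, submission can be repeated.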

Moving the driver code into user space

These command buffers are combined with rendering pipelines. The rendering pipelines, which include the pixel/vertex shaders, can themselves be prepared on separate threads. Pixel/vertex shaders are now loaded from a bytecode (named SPIR-V), which makes loading and processing shaders faster. This item matters less in the DirectX world because Microsoft has been doing it, as far as I understand, since DirectX 10, so if you think that your game (Dota 2, cough, cough) will get faster because it has a lot of pixel shaders to precompile, that is not going to happen.
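As a small illustration of what "bytecode" means here: a SPIR-V module is a stream of 32-bit words that starts with a five-word header (magic number, version, generator, bound, schema), and the magic number is 0x07230203. The sketch below only checks that header shape; looks_like_spirv is a hypothetical helper of mine, not part of any Vulkan API, and a real loader would validate far more than this.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// First word of every SPIR-V module, per the SPIR-V specification.
constexpr std::uint32_t kSpirvMagic = 0x07230203u;

// The header is five 32-bit words: magic, version, generator, bound, schema.
constexpr std::size_t kSpirvHeaderWords = 5;

// Hypothetical helper: a cheap sanity check before handing the words to the
// driver. It does not validate the instructions that follow the header.
bool looks_like_spirv(const std::vector<std::uint32_t>& words) {
    return words.size() >= kSpirvHeaderWords && words[0] == kSpirvMagic;
}
```

The point is that the driver receives an already parsed, binary form of the shader instead of source text it has to compile from scratch.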

Moving most of the processing into user space means both good and bad things. The good thing is that good developers will not have to wait for a driver developer to optimize a specific code path that their game needs. Another good part is that with most code in user space, things should run faster, because many drivers do "Ring" switches (jumping into kernel mode), which are very expensive calls (at the low-microseconds level, but still significant if they happen tens or hundreds of times per frame, as the rendering time per frame should be around 16 ms). The bad part, as I see it, is that the driver developers at the main video card vendors very often do a good job, so in this scenario I would expect them to have fewer ways to improve all games at once.

Don't do validation

Classical drivers check the arguments and state on every call, and skipping those checks saves CPU time. This is why you will hear things like: even when using a single core, processing is still 30% faster with DirectX 12 (or Vulkan). This is of course a double-edged sword: very weird things can happen, and nothing will tell the developers what went wrong.

The good thing is that Vulkan comes with many validation tools for "debug" mode, so you can catch the weird mismatches in your code.
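The debug/release split can be sketched as a checking layer wrapped around an unchecked core function, switched on only when you ask for it. Everything below (DrawCall, draw_unchecked, with_validation, make_draw) is hypothetical and only mimics the idea; it is not how any real Vulkan validation tool is implemented.

```cpp
#include <functional>
#include <stdexcept>

// Hypothetical draw request: read vertex_count vertices starting at
// first_vertex from a buffer that holds buffer_size vertices.
struct DrawCall {
    int buffer_size;
    int first_vertex;
    int vertex_count;
};

using DrawFn = std::function<int(const DrawCall&)>;

// "Release" path: trusts the caller completely. This stub only reports the
// last vertex index it would touch; a real driver would start reading here.
int draw_unchecked(const DrawCall& call) {
    return call.first_vertex + call.vertex_count - 1;
}

// "Debug" path: wrap the next function with argument checks.
DrawFn with_validation(DrawFn next) {
    return [next](const DrawCall& call) {
        if (call.first_vertex < 0 || call.vertex_count < 0 ||
            call.first_vertex + call.vertex_count > call.buffer_size) {
            throw std::out_of_range("draw call reads past the vertex buffer");
        }
        return next(call);
    };
}

// Pick the call chain once, up front, instead of checking on every call.
DrawFn make_draw(bool debug_build) {
    DrawFn core = draw_unchecked;
    return debug_build ? with_validation(core) : core;
}
```

With a bad request such as DrawCall{3, 2, 5}, the function from make_draw(true) throws, while the one from make_draw(false) happily returns index 6 for a buffer of 3 vertices, which is exactly the kind of silent weirdness you get in release mode.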

Should you install Windows 10 or find a Vulkan driver?

If you are a developer working with graphics, the answer may be yes; otherwise, I'm not sure. Try not to get hyped! Windows 10 had huge problems at launch with some older NVIDIA cards (like the 500 series or lower). Having DirectX 12, which theoretically would run a future game that launches a year from now, means very little for how you use your computer today.

If you don't play a lot, the situation is even worse: for most interfaces I'm aware of, most of the processing time goes to font metrics calculation, styling, layout and the like, and sadly none of them are very GPU-taxing.

Will Vulkan or DirectX 12 have a big impact? I would expect so in 2-3 years from now, not because anything will have changed for the user, but only because the industry will naturally upgrade its software stack.
