3

OpenGL and other graphics APIs support floating-point formats smaller than 32 bits (e.g. see https://www.opengl.org/wiki/Small_Float_Formats). While GPUs seem to handle these formats natively, CPUs have limited support (x86 chips can convert to/from 16-bit floats using the F16C instructions that accompany AVX). It is sometimes convenient to pack/unpack these formats on the CPU, but it is not obvious how to do so efficiently.

What are the fastest techniques for converting between 32-bit floating point and the small float formats on the CPU? Are libraries or sample implementations available?

  • 2
    As for the code, there are many examples online. Take a look at SO: http://stackoverflow.com/questions/1659440/32-bit-to-16-bit-floating-point-conversion – glampert Feb 14 '16 at 16:57

1 Answer

5

You definitely want to check out the meticulous half-to-float and float-to-half implementations by Fabian Giesen. I don't think there is anything faster than those: https://gist.github.com/rygorous/2144712 and https://gist.github.com/rygorous/2156668
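To show what such a conversion involves, here is a minimal scalar sketch in C. It is not Fabian Giesen's code and makes no attempt to match its speed; the names `half_to_float`, `float_to_half`, `bits_to_float` and `float_to_bits` are made up for this example. It assumes `float` is IEEE 754 binary32 and that the FPU is in its default round-to-nearest-even mode.

```c
#include <stdint.h>
#include <string.h>

/* Reinterpret bits without violating strict aliasing. */
static float bits_to_float(uint32_t u) { float f; memcpy(&f, &u, sizeof f); return f; }
static uint32_t float_to_bits(float f) { uint32_t u; memcpy(&u, &f, sizeof u); return u; }

/* Half layout: 1 sign bit, 5 exponent bits (bias 15), 10 mantissa bits. */
float half_to_float(uint16_t h) {
    uint32_t sign = (uint32_t)(h & 0x8000u) << 16;
    uint32_t exp  = (h >> 10) & 0x1Fu;
    uint32_t man  = h & 0x3FFu;

    if (exp == 0x1Fu)   /* infinity or NaN: keep the payload */
        return bits_to_float(sign | 0x7F800000u | (man << 13));
    if (exp != 0)       /* normal: rebias the exponent by 127 - 15 = 112 */
        return bits_to_float(sign | ((exp + 112u) << 23) | (man << 13));

    /* Zero or subnormal: the value is man * 2^-24, which is exact in float. */
    float mag = (float)man * bits_to_float(0x33800000u); /* 0x33800000 is 2^-24 */
    return bits_to_float(sign | float_to_bits(mag));
}

float_to_half_decl:;
```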

For other formats, you will find implementations in DirectXMath (part of the Windows SDK), Mesa, and similar libraries. For example, R11G11B10_Float conversions are implemented by DirectXMath in XMFLOAT3PK and by Mesa in u_format_other.c.
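As a rough illustration of the packed layout, the sketch below unpacks an R11G11B10_Float value in C. It is not taken from DirectXMath or Mesa; the names `unpack_r11g11b10f`, `small_ufloat_to_float` and `u32_as_float` are hypothetical. It assumes red occupies the low 11 bits, green the next 11 and blue the top 10, that each channel is an unsigned float with a 5-bit exponent (bias 15), and that `float` is IEEE 754 binary32.

```c
#include <stdint.h>
#include <string.h>

static float u32_as_float(uint32_t u) { float f; memcpy(&f, &u, sizeof f); return f; }

/* Decode an unsigned small float: 5 exponent bits, man_bits mantissa bits. */
static float small_ufloat_to_float(uint32_t v, unsigned man_bits) {
    uint32_t exp   = v >> man_bits;
    uint32_t man   = v & ((1u << man_bits) - 1u);
    unsigned shift = 23u - man_bits;

    if (exp == 0x1Fu)   /* infinity or NaN */
        return u32_as_float(0x7F800000u | (man << shift));
    if (exp != 0)       /* normal: rebias the exponent by 127 - 15 = 112 */
        return u32_as_float(((exp + 112u) << 23) | (man << shift));

    /* Zero or subnormal: man * 2^-(14 + man_bits), exact in float. */
    return (float)man * u32_as_float((113u - man_bits) << 23);
}

void unpack_r11g11b10f(uint32_t packed, float rgb[3]) {
    rgb[0] = small_ufloat_to_float(packed & 0x7FFu, 6);          /* red:   bits 0..10  */
    rgb[1] = small_ufloat_to_float((packed >> 11) & 0x7FFu, 6);  /* green: bits 11..21 */
    rgb[2] = small_ufloat_to_float(packed >> 22, 5);             /* blue:  bits 22..31 */
}
```

Packing goes the other way and additionally has to clamp negative inputs and round the mantissa, which is where the library implementations are worth reading.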