March 2013
984 pages
For signed 16-bit floating-point values, the smallest positive normalized value and the largest finite value that can be represented are about 6.103 × 10⁻⁵ and 65504.0, respectively.
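These two limits follow from the bit layout. As a minimal sketch, assuming the usual half-precision layout of 1 sign bit, 5 exponent bits with a bias of 15, and 10 mantissa bits, the helpers below (hypothetical names, not part of any library) compute both values:

```c
#include <math.h>

/* Hypothetical helpers for illustration only.
 * Assumed layout: 1 sign bit, 5 exponent bits (bias 15), 10 mantissa bits. */

/* Smallest positive normalized value: exponent field 1, mantissa 0,
 * which is 1.0 * 2^(1 - 15) = 2^-14. */
double f16_min_normal(void)
{
    return ldexp(1.0, 1 - 15);
}

/* Largest finite value: exponent field 30 (31 is reserved for Inf/NaN),
 * mantissa all ones, which is (2 - 2^-10) * 2^(30 - 15). */
double f16_max_finite(void)
{
    return ldexp(2.0 - ldexp(1.0, -10), 30 - 15);
}
```

The first helper yields 2⁻¹⁴ (about 6.1035 × 10⁻⁵) and the second yields 65504.0, matching the figures quoted above.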
The following routine, F32toF16(), converts a single-precision 32-bit floating-point value to a 16-bit reduced-precision form (stored as an unsigned short integer).
#define F16_EXPONENT_BITS  0x1F
#define F16_EXPONENT_SHIFT 10
#define F16_EXPONENT_BIAS  15
#define F16_MANTISSA_BITS  0x3ff
#define F16_MANTISSA_SHIFT (23 - F16_EXPONENT_SHIFT)
#define F16_MAX_EXPONENT \
    (F16_EXPONENT_BITS << F16_EXPONENT_SHIFT)

GLushort
F32toF16(GLfloat val)
{
    GLuint   f32 = (*(GLuint *) &val);
    GLushort f16 = 0;

    /* Decode IEEE 754 little-endian 32-bit ...
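For readers who want to experiment, a conversion of this kind can be sketched as below. This is an illustration under stated assumptions, not the routine from the listing: the name f32_to_f16_sketch is hypothetical, the fixed-width types from stdint.h stand in for GLuint and GLushort so that the block compiles without GL headers, memcpy replaces the pointer cast to avoid aliasing problems, and excess mantissa bits are simply truncated rather than rounded.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical sketch: convert a 32-bit float to 16-bit half-float bits.
 * Assumes IEEE 754 single precision; truncates instead of rounding. */
uint16_t f32_to_f16_sketch(float val)
{
    uint32_t f32;
    memcpy(&f32, &val, sizeof f32);           /* copy the raw bit pattern */

    uint16_t sign     = (uint16_t)((f32 >> 16) & 0x8000u);
    int32_t  exponent = (int32_t)((f32 >> 23) & 0xFFu);
    uint32_t mantissa = f32 & 0x007FFFFFu;

    if (exponent == 0xFF) {
        /* Infinity or NaN: keep one mantissa bit set so NaN stays NaN. */
        return (uint16_t)(sign | 0x7C00u | (mantissa ? 0x0200u : 0u));
    }

    exponent = exponent - 127 + 15;           /* rebias from 127 to 15 */

    if (exponent >= 0x1F) {
        return (uint16_t)(sign | 0x7C00u);    /* too large: infinity */
    }
    if (exponent <= 0) {
        if (exponent < -10) {
            return sign;                      /* too small: signed zero */
        }
        mantissa |= 0x00800000u;              /* restore the implicit 1 */
        /* Denormalized half: shift the 24-bit significand into 10 bits. */
        return (uint16_t)(sign | (mantissa >> (14 - exponent)));
    }

    /* Normalized half: 5-bit exponent plus the top 10 mantissa bits. */
    return (uint16_t)(sign | ((uint32_t)exponent << 10) | (mantissa >> 13));
}
```

Under these assumptions, 1.0f maps to 0x3C00 and 65504.0f maps to 0x7BFF, while inputs too large for the 16-bit format become infinity (0x7C00 plus the sign bit). A production converter would normally round to nearest rather than truncate.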