[0.0, 0.0078125, 0.015625, 0.0234375, 0.03125, 0.0390625, 0.046875, 0.0546875, 0.0625, 0.0703125, 0.078125, 0.0859375, 0.09375, 0.1015625, 0.109375, 0.1171875, 0.125, 0.1328125, 0.140625, 0.1484375, 0.15625, 0.1640625, 0.171875, 0.1796875, 0.1875, 0.1953125, 0.203125, 0.2109375, 0.21875, 0.2265625, 0.234375, 0.2421875, 0.25, 0.265625, 0.28125, 0.296875, 0.3125, 0.328125, 0.34375, 0.359375, 0.375, 0.390625, 0.40625, 0.421875, 0.4375, 0.453125, 0.46875, 0.484375, 0.5, 0.53125, 0.5625, 0.59375, 0.625, 0.65625, 0.6875, 0.71875, 0.75, 0.78125, 0.8125, 0.84375, 0.875, 0.90625, 0.9375, 0.96875, 1.0, 1.0625, 1.125, 1.1875, 1.25, 1.3125, 1.375, 1.4375, 1.5, 1.5625, 1.625, 1.6875, 1.75, 1.8125, 1.875, 1.9375, 2.0, 2.125, 2.25, 2.375, 2.5, 2.625, 2.75, 2.875, 3.0, 3.125, 3.25, 3.375, 3.5, 3.625, 3.75, 3.875, 4.0, 4.25, 4.5, 4.75, 5.0, 5.25, 5.5, 5.75, 6.0, 6.25, 6.5, 6.75, 7.0, 7.25, 7.5, 7.75, 8.0, 8.5, 9.0, 9.5, 10.0, 10.5, 11.0, 11.5, 12.0, 12.5, 13.0, 13.5, 14.0, 14.5, 15.0, 15.5]
\(e\leftarrow e+1\) if subnormal and \(m\leftarrow m+2^{n_m}\) if normal
\(m\) is not an \(n_m\)-bit but an \((n_m+1)\)-bit unsigned integer. The 5th bit is not stored explicitly; it is hidden and can be recovered from the value of \(e\).
As a side note, this also means that the mantissa \(m\) encodes a fixed-point number with 1 bit for the integer part and \(n_m\) bits for the fraction. Therefore, \(\frac m {2^{n_m}} \in [0, 2)\). Multiplying it by \(2^e\) makes the point float by \(e\) binary places.
\(93_{10} = 101\ 1101_2\)
\(e = \left\lfloor\frac{93}{16}\right\rfloor-4 = 1\)
\(m = (93 \mod 16) + 16 = 29\)
\(29 \cdot 2^{1-4} = 3.625\)
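The decoding steps above can be sketched in Python. This is a minimal sketch, assuming the layout implied by the worked example: no sign bit, a 3-bit exponent field with bias 4, and \(n_m = 4\) mantissa bits. The name `decode_float7` and the constants `N_M` and `BIAS` are hypothetical, introduced only for illustration.

```python
N_M = 4    # mantissa bits (assumed from the worked example)
BIAS = 4   # exponent bias (assumed from the "- 4" in the example)


def decode_float7(code: int) -> float:
    """Decode a 7-bit code (0..127) into the real value it represents."""
    if not 0 <= code < 128:
        raise ValueError("code must fit in 7 bits")
    e = (code >> N_M) - BIAS       # floor(code / 16) - 4
    m = code & ((1 << N_M) - 1)    # code mod 16
    if e == -BIAS:
        e += 1                     # subnormal: e <- e + 1, no hidden bit
    else:
        m += 1 << N_M              # normal: m <- m + 2^n_m (hidden bit)
    return m * 2.0 ** (e - N_M)
```

With these assumptions, `decode_float7(93)` evaluates to 3.625, matching the hand calculation, and `decode_float7(127)` gives the largest value, 15.5.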
It turns out that our float7 can be represented as a fixed-point number with 4 bits in the integer part and 7 bits in the fractional part, which we can summarize as 4.7 format. We can determine this by noting that a float’s mantissa is a 1.4 fixed-point number. The maximum float exponent is 3, which is equivalent to shifting the mantissa left 3 positions. The minimum float exponent is -3, which is equivalent to shifting the mantissa right 3 positions. Those shift amounts of our 1.4 mantissa mean that all floats can fit into a 4.7 fixed-point number, for a total of 11 bits.
All we need to do is create this number, by pasting the mantissa into the correct location, and then convert the large fixed-point number to decimal.
\(29_{10} = 11101_2\)
0011.1010000
\(3.625_{10} = 0011.1010000_2\)
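The pasting step can be sketched as a left shift into an 11-bit integer that is read as a 4.7 fixed-point number. This is a minimal sketch in Python, assuming the mantissa (including the hidden bit) and the exponent have already been extracted; `to_fixed_4_7` and `format_fixed_4_7` are hypothetical helper names.

```python
FRAC_BITS = 7   # fractional bits of the 4.7 format
E_MIN = -3      # minimum exponent: the 1.4 mantissa sits at the far right


def to_fixed_4_7(m: int, e: int) -> int:
    """Paste the 5-bit mantissa m into an 11-bit 4.7 fixed-point integer."""
    return m << (e - E_MIN)


def format_fixed_4_7(fixed: int) -> str:
    """Render the 11-bit pattern with a binary point after the 4 integer bits."""
    bits = format(fixed, "011b")
    return bits[:4] + "." + bits[4:]
```

For \(m = 29\) and \(e = 1\), `format_fixed_4_7(to_fixed_4_7(29, 1))` returns `"0011.1010000"`, and dividing the integer by `2 ** FRAC_BITS` gives 3.625.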
Converting the fixed-point number to decimal is then just primary-school column addition of the bit weights:
  0.0078125 * 0
+ 0.0156250 * 0
+ 0.0312500 * 0
+ 0.0625000 * 0
+ 0.1250000 * 1
+ 0.2500000 * 0
+ 0.5000000 * 1
+ 1.0000000 * 1
+ 2.0000000 * 1
+ 4.0000000 * 0
+ 8.0000000 * 0
= ---------
  3.6250000
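The same column addition can be written as a short loop over the bit string. This is a minimal sketch in Python; the function name `fixed_point_to_decimal` is hypothetical, and exact fractions are used here only so that each weight is represented without rounding.

```python
from fractions import Fraction


def fixed_point_to_decimal(bits: str) -> Fraction:
    """Sum the weights of the set bits in a string such as '0011.1010000'."""
    int_part, frac_part = bits.split(".")
    total = Fraction(0)
    # Integer bits: weights 1, 2, 4, 8, ... counted from the binary point leftwards.
    for i, bit in enumerate(reversed(int_part)):
        total += int(bit) * Fraction(2) ** i
    # Fraction bits: weights 1/2, 1/4, 1/8, ... counted from the binary point rightwards.
    for i, bit in enumerate(frac_part, start=1):
        total += int(bit) * Fraction(1, 2 ** i)
    return total
```

Calling `fixed_point_to_decimal("0011.1010000")` returns `Fraction(29, 8)`, which is 3.625 as a float.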