Floating-Point Precision Differences in Go: float32 vs float64
In Go, floating-point values are stored as either float32 or float64, which carry roughly 7 and 16 significant decimal digits respectively. The experiment described here shows how the same small program behaves very differently depending on which of the two types it uses.
The program computes 0.2 + 0.1 - 0.3 and then repeatedly doubles the result until it is no longer less than 1.0. With float64, the initial value is a tiny positive rounding error (2^-54), so the program prints 1.000000e+00 after 54 iterations, matching the behavior of the same program in C (when using double).
With float32, however, the Go program enters an infinite loop, while the C version using float does not. The sections below examine where the difference comes from.
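A minimal sketch of such a program is shown here. It assumes the initial value is 0.2 + 0.1 - 0.3 (as the float32 analysis in this article indicates) and that the loop doubles the value until it reaches 1.0. The function names and the maxIter safety cap are illustrative additions so that the float32 run stops instead of hanging.

```go
package main

import "fmt"

// maxIter is a hypothetical safety cap; it is not part of the experiment.
const maxIter = 1000

// run64 computes 0.2 + 0.1 - 0.3 in float64, then doubles the result
// until it is no longer below 1.0 (or the cap is hit).
func run64() (int, float64) {
	a := float64(0.2)
	a += 0.1
	a -= 0.3
	i := 0
	for ; a < 1.0 && i < maxIter; i++ {
		a += a
	}
	return i, a
}

// run32 performs the same steps with float32 arithmetic.
func run32() (int, float32) {
	a := float32(0.2)
	a += 0.1
	a -= 0.3
	i := 0
	for ; a < 1.0 && i < maxIter; i++ {
		a += a
	}
	return i, a
}

func main() {
	i64, a64 := run64()
	fmt.Printf("float64: after %d iterations, a = %e\n", i64, a64)

	i32, a32 := run32()
	fmt.Printf("float32: stopped after %d iterations (cap %d), a = %e\n", i32, maxIter, a32)
}
```

Without the cap, the float32 loop would simply never exit, which is the behavior the article describes.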
Examining the Floating-Point Representations
Using the math.Float32bits and math.Float64bits functions, we can observe how Go represents decimal values as IEEE 754 binary values:
float32(0.1): 00111101110011001100110011001101
float32(0.2): 00111110010011001100110011001101
float32(0.3): 00111110100110011001100110011010
float64(0.1): 0011111110111001100110011001100110011001100110011001100110011010
float64(0.2): 0011111111001001100110011001100110011001100110011001100110011010
float64(0.3): 0011111111010011001100110011001100110011001100110011001100110011
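A listing like this can be produced with a few lines of Go. The sketch below uses math.Float32bits and math.Float64bits together with zero-padded binary formatting; the helper names bits32 and bits64 are illustrative.

```go
package main

import (
	"fmt"
	"math"
)

// bits32 returns the 32-bit IEEE 754 pattern of f as a binary string.
func bits32(f float32) string {
	return fmt.Sprintf("%032b", math.Float32bits(f))
}

// bits64 returns the 64-bit IEEE 754 pattern of f as a binary string.
func bits64(f float64) string {
	return fmt.Sprintf("%064b", math.Float64bits(f))
}

func main() {
	values := []float64{0.1, 0.2, 0.3}
	for _, v := range values {
		// Convert to float32 first so the narrower pattern is printed.
		fmt.Printf("float32(%.1f): %s\n", v, bits32(float32(v)))
	}
	for _, v := range values {
		fmt.Printf("float64(%.1f): %s\n", v, bits64(v))
	}
}
```

The first bit is the sign, the next 8 (float32) or 11 (float64) bits are the exponent, and the remaining bits are the fraction.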
Analysis of float32 Precision
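Converting these bit patterns back to decimal and combining them exactly gives the starting point for a in the float32 version:
0.20000000298023224 + 0.10000000149011612 - 0.30000001192092896 = -7.4505806e-9
The exact result is negative. In actual float32 arithmetic, where each step is rounded, the intermediate sum 0.2 + 0.1 rounds to the same bit pattern as float32(0.3), so a ends up as exactly 0. Either way a is not positive, and doubling a value that is zero or negative can never reach 1.0.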
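The arithmetic above can be checked with a short sketch that evaluates the expression two ways: exactly (in float64, which is wide enough to hold the sum of these float32 values without further rounding) and with float32 rounding after each step. The function names are illustrative.

```go
package main

import "fmt"

// exactSum widens the three float32 values to float64 before combining
// them, so no additional rounding happens during the arithmetic.
func exactSum() float64 {
	x, y, z := float32(0.2), float32(0.1), float32(0.3)
	return float64(x) + float64(y) - float64(z)
}

// roundedSum combines the same values in float32, rounding to float32
// precision after the addition and again after the subtraction.
func roundedSum() float32 {
	x, y, z := float32(0.2), float32(0.1), float32(0.3)
	sum := float32(x + y)
	return float32(sum - z)
}

func main() {
	fmt.Printf("exact:   %.7e\n", exactSum())
	fmt.Printf("float32: %.7e\n", roundedSum())
}
```

Comparing the two results shows whether the starting value of the loop is negative or zero; in neither case can repeated doubling produce a value of 1.0 or more.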
C Implementation Differences
In C, the behavior differs due to differences in how the float constants are handled. Go approximates 0.1 with the closest binary representation, while C may truncate the last bit.
Go:   00111101110011001100110011001101 => 0.10000000149011612
C(?): 00111101110011001100110011001100 => 0.09999999403953552
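Note that the truncated pattern is not a more accurate representation of 0.1: it is off by about 6.0e-9, compared with about 1.5e-9 for the rounded pattern. What truncation would change is the direction of the error (just below 0.1 instead of just above it), which is why it is suggested as a reason for the different float32 outcome in C. The truncation itself is only a guess, as the question mark indicates. C compilers normally round constants to the nearest representable value as well, and an unsuffixed constant such as 0.1 has type double in C, so arithmetic that mixes it with a float is carried out in double. That extra intermediate precision is another possible source of the difference.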
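To compare the two candidate patterns directly, the following sketch decodes each one with math.Float32frombits and measures its distance from 0.1 (using the float64 approximation of 0.1 as the reference). The describe helper and the labels are illustrative.

```go
package main

import (
	"fmt"
	"math"
)

// describe decodes a float32 bit pattern and returns its value together
// with its absolute distance from the float64 approximation of 0.1.
func describe(bits uint32) (value, absErr float64) {
	value = float64(math.Float32frombits(bits))
	return value, math.Abs(value - 0.1)
}

func main() {
	patterns := []struct {
		label string
		bits  uint32
	}{
		{"rounded  ", 0x3DCCCCCD}, // 00111101110011001100110011001101
		{"truncated", 0x3DCCCCCC}, // 00111101110011001100110011001100
	}
	for _, p := range patterns {
		v, e := describe(p.bits)
		fmt.Printf("%s %032b => %.17f (|error| = %.3e)\n", p.label, p.bits, v, e)
	}
}
```

Running it prints both decoded values alongside their errors, which makes it easy to see which pattern is closer to 0.1 and on which side of 0.1 each one falls.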