Skip to content

runtime: softfloat64 fadd64/fsub64 produce incorrect result when normal operands cancel into a subnormal #79964

@omarsy

Description

@omarsy

What version of Go are you using?

Reproduced with go1.22.3. The relevant code (fpack64/fadd64 in src/runtime/softfloat64.go) is byte-identical on master and through go1.25.x, so all current releases are affected.

$ go version
go version go1.22.3 ...

Does this issue reproduce with the latest release?

Yes. src/runtime/softfloat64.go is unchanged.

What operating system and processor architecture are you using?

GOOS=linux, GOARCH=mips, GOMIPS=softfloat (i.e. any target where the runtime emulates FP via softfloat64.go). Not observable on hardware-FP targets (amd64/arm64/…), which never execute this code.

What did you do?

runtime.fadd64 returns the wrong bit pattern when two near-equal, opposite-sign normal float64 values cancel into a subnormal result (|x+y| < 2.2e-308). Because fsub64(f,g) = fadd64(f, fneg64(g)), subtraction is affected too.

Reproducer (run on a softfloat target so the runtime uses softfloat64.go):

package main

import "math"

func main() {
	a := math.Float64frombits(9333378022939403091) // -2.662107816930723e-301
	b := math.Float64frombits(110005986185704326)  //  2.662107858822336e-301
	println("add:", math.Float64bits(a+b))

	p := math.Float64frombits(110005986185704326)
	q := math.Float64frombits(110005986084627283)
	println("sub:", math.Float64bits(p-q))
}

Build and run under software float:

$ GOOS=linux GOARCH=mips GOMIPS=softfloat go build -o repro_sf main.go
$ qemu-mips ./repro_sf
add: 202154086
sub: 202154086

What did you expect to see?

The IEEE-754 result, as produced on any hardware-FP build:

add: 847895691526144     (= 4.189161324398747e-309)
sub: 847895691526144

What did you see instead?

add: 202154086           (= 9.98e-316)
sub: 202154086

The result is off by ~6 orders of magnitude — it is the un-normalized cancellation mantissa returned at the wrong scale, not a 1-ULP rounding difference.

This is not a rare corner of the corner: a differential check of fadd64 against hardware over random near-equal opposite-sign normal pairs that cancel into a subnormal found ~1,066,956 of 2,000,000 (>53%) wrong. The reported operands are just one representative.

Root cause

In fadd64, catastrophic cancellation of two near-equal opposite-sign normals leaves a very small mantissa fm, while the exponent fe is still that of the (normal) operands. fpack64(fs, fm, fe-2, trunc) is then called with, for the example above:

mant0 = 0x181940cc   (~28 significant bits, far below the implicit-bit position 1<<52)
exp0  = -1001         (a normal-range exponent)

fpack64's top normalization loop correctly shifts mant up and exp down to the true (subnormal) exponent ≈ -1026, so the exp < bias64+1 denormal branch is entered. But that branch then resets to the raw, un-normalized (mant0, exp0):

// repeat expecting denormal
mant, exp, trunc = mant0, exp0, trunc0
for exp < bias64 {            // bias64 == -1023
    trunc |= mant & 1
    mant >>= 1
    exp++
}

The realignment loop assumes mant0 is already near the implicit-bit range (true for *, /, and non-cancelling adds, where the realign loop performs the subnormal right-shift). Here exp0 == -1001 > bias64 == -1023, so the loop body never executes, and the function returns mant0 >> 1 — the small cancellation mantissa at a normal-range scale, i.e. a garbage subnormal.

Scope (verified)

  • fadd64 / fsub64: affected. These are the only callers that can pass fpack64 a mantissa far below normalized with a normal-range exponent.
  • fmul64 / fdiv64: not affected — the product/quotient mantissa handed to fpack64 is already near-normalized, so the realign loop runs as intended.
  • Direct subnormal operands: not affected.
  • float32: fadd32/fsub32/fmul32/fdiv32 all route through the float64 ops (f64to32(fadd64(f32to64(x), f32to64(y)))), and float32 operands widened to float64 cannot land on a float64 subnormal — so float32 arithmetic is not affected in practice. Note fpack32 contains the structurally identical denormal-reset code and would exhibit the same bug if it were ever handed a deeply-cancelled mantissa, but no caller does (f64to32 passes an already-normalized mantissa). A fix should still patch fpack32 to match.

Fix

Insert a left-normalization loop in the denormal branch, before the right-shift alignment, so a heavily-cancelled mant0 is shifted in the correct direction. It is a no-op for already-normalized callers (mul/div/conversions), so it cannot regress them:

		// repeat expecting denormal
		mant, exp, trunc = mant0, exp0, trunc0
+		// Re-normalize before aligning to the subnormal exponent: when mant0
+		// came from heavy cancellation it can be far below 1<<mantbits64 while
+		// exp0 is a normal-range exponent, so the right-shift below would move
+		// in the wrong direction. No-op for already-normalized callers.
+		for mant < 1<<mantbits64 {
+			mant <<= 1
+			exp--
+		}
		for exp < bias64 {
			trunc |= mant & 1
			mant >>= 1
			exp++
		}

The same change applies to fpack32 (1<<mantbits32 / bias32).

Rationale: the denormal branch re-expresses the value at the fixed subnormal exponent via M = mant0 * 2^(exp0 - bias64 - 1), implemented as a right-shift by bias64+1-exp0. That shift count is only non-negative when exp0 <= bias64+1. Heavy cancellation produces exp0 > bias64+1 (a normal-range exponent on a tiny mantissa), so the shift count goes negative and the existing loop, which can only shift right, returns garbage. Left-normalizing first restores the invariant the right-shift step assumes.

Verification

  • fadd64/fsub64 on the reported operands return the correct 847895691526144.
  • The >53% cancellation-class divergence above drops to 0 mismatches across 30M randomized cancellation pairs (17M of which yield subnormal results), and 0 across 20M random pairs over + - * /.
  • Confirmed on a real GOMIPS=softfloat build (go1.22.3): patched runtime under qemu-mips prints 847895691526144 where the unpatched build printed 202154086.
  • Adding the two operands to the base slice in src/runtime/softfloat64_test.go reproduces the failure before the fix (the existing all-pairs +/- comparison against hardware catches it) and passes after.

Notes

Metadata

Metadata

Assignees

No one assigned

    Labels

    FixPendingIssues that have a fix which has not yet been reviewed or submitted.NeedsInvestigationSomeone must examine and confirm this is a valid issue and not a duplicate of an existing one.compiler/runtimeIssues related to the Go compiler and/or runtime.

    Type

    No type
    No fields configured for issues without a type.

    Projects

    Status
    No status

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions