runtime: softfloat64 fadd64/fsub64 produce incorrect result when normal operands cancel into a subnormal

### What version of Go are you using?

Reproduced with `go1.22.3`. The relevant code (`fpack64`/`fadd64` in `src/runtime/softfloat64.go`) is **byte-identical on `master` and through `go1.25.x`**, so all current releases are affected.

```
$ go version
go version go1.22.3 ...
```

### Does this issue reproduce with the latest release?

Yes. `src/runtime/softfloat64.go` is unchanged.

### What operating system and processor architecture are you using?

`GOOS=linux`, `GOARCH=mips`, `GOMIPS=softfloat` (i.e. any target where the runtime emulates FP via `softfloat64.go`). Not observable on hardware-FP targets (amd64/arm64/…), which never execute this code.

### What did you do?

`runtime.fadd64` returns the wrong bit pattern when two near-equal, opposite-sign **normal** `float64` values cancel into a **subnormal** result (`|x+y| < 2.2e-308`). Because `fsub64(f,g) = fadd64(f, fneg64(g))`, subtraction is affected too.

Reproducer (run on a softfloat target so the runtime uses `softfloat64.go`):

```go
package main

import "math"

func main() {
	a := math.Float64frombits(9333378022939403091) // -2.662107816930723e-301
	b := math.Float64frombits(110005986185704326)  //  2.662107858822336e-301
	println("add:", math.Float64bits(a+b))

	p := math.Float64frombits(110005986185704326)
	q := math.Float64frombits(110005986084627283)
	println("sub:", math.Float64bits(p-q))
}
```

Build and run under software float:

```
$ GOOS=linux GOARCH=mips GOMIPS=softfloat go build -o repro_sf main.go
$ qemu-mips ./repro_sf
add: 202154086
sub: 202154086
```

### What did you expect to see?

The IEEE-754 result, as produced on any hardware-FP build:

```
add: 847895691526144     (= 4.189161324398747e-309)
sub: 847895691526144
```

### What did you see instead?

```
add: 202154086           (= 9.98e-316)
sub: 202154086
```

The result is off by ~6 orders of magnitude — it is the *un-normalized* cancellation mantissa returned at the wrong scale, not a 1-ULP rounding difference.

This is not a rare corner of the corner: a differential check of `fadd64` against hardware over random near-equal opposite-sign normal pairs that cancel into a subnormal found **~1,066,956 of 2,000,000 (>53%) wrong**. The reported operands are just one representative.

### Root cause

In `fadd64`, catastrophic cancellation of two near-equal opposite-sign normals leaves a very small mantissa `fm`, while the exponent `fe` is still that of the (normal) operands. `fpack64(fs, fm, fe-2, trunc)` is then called with, for the example above:

```
mant0 = 0x181940cc   (~28 significant bits, far below the implicit-bit position 1<<52)
exp0  = -1001         (a normal-range exponent)
```

`fpack64`'s top normalization loop correctly shifts `mant` up and `exp` down to the true (subnormal) exponent ≈ `-1026`, so the `exp < bias64+1` denormal branch is entered. But that branch then **resets to the raw, un-normalized `(mant0, exp0)`**:

```go
// repeat expecting denormal
mant, exp, trunc = mant0, exp0, trunc0
for exp < bias64 {            // bias64 == -1023
    trunc |= mant & 1
    mant >>= 1
    exp++
}
```

The realignment loop assumes `mant0` is already near the implicit-bit range (true for `*`, `/`, and non-cancelling adds, where the realign loop performs the subnormal right-shift). Here `exp0 == -1001 > bias64 == -1023`, so the loop body **never executes**, and the function returns `mant0 >> 1` — the small cancellation mantissa at a normal-range scale, i.e. a garbage subnormal.

### Scope (verified)

- **`fadd64` / `fsub64`**: affected. These are the only callers that can pass `fpack64` a mantissa far below normalized with a normal-range exponent.
- **`fmul64` / `fdiv64`**: not affected — the product/quotient mantissa handed to `fpack64` is already near-normalized, so the realign loop runs as intended.
- **Direct subnormal operands**: not affected.
- **float32**: `fadd32`/`fsub32`/`fmul32`/`fdiv32` all route through the float64 ops (`f64to32(fadd64(f32to64(x), f32to64(y)))`), and float32 operands widened to float64 cannot land on a float64 subnormal — so float32 arithmetic is **not** affected in practice. Note `fpack32` contains the structurally identical denormal-reset code and would exhibit the same bug if it were ever handed a deeply-cancelled mantissa, but no caller does (`f64to32` passes an already-normalized mantissa). A fix should still patch `fpack32` to match.

### Fix

Insert a left-normalization loop in the denormal branch, before the right-shift alignment, so a heavily-cancelled `mant0` is shifted in the correct direction. It is a no-op for already-normalized callers (`mul`/`div`/conversions), so it cannot regress them:

```go
		// repeat expecting denormal
		mant, exp, trunc = mant0, exp0, trunc0
+		// Re-normalize before aligning to the subnormal exponent: when mant0
+		// came from heavy cancellation it can be far below 1<<mantbits64 while
+		// exp0 is a normal-range exponent, so the right-shift below would move
+		// in the wrong direction. No-op for already-normalized callers.
+		for mant < 1<<mantbits64 {
+			mant <<= 1
+			exp--
+		}
		for exp < bias64 {
			trunc |= mant & 1
			mant >>= 1
			exp++
		}
```

The same change applies to `fpack32` (`1<<mantbits32` / `bias32`).

Rationale: the denormal branch re-expresses the value at the fixed subnormal exponent via `M = mant0 * 2^(exp0 - bias64 - 1)`, implemented as a right-shift by `bias64+1-exp0`. That shift count is only non-negative when `exp0 <= bias64+1`. Heavy cancellation produces `exp0 > bias64+1` (a normal-range exponent on a tiny mantissa), so the shift count goes negative and the existing loop, which can only shift right, returns garbage. Left-normalizing first restores the invariant the right-shift step assumes.

### Verification

- `fadd64`/`fsub64` on the reported operands return the correct `847895691526144`.
- The >53% cancellation-class divergence above drops to **0 mismatches** across 30M randomized cancellation pairs (17M of which yield subnormal results), and 0 across 20M random pairs over `+ - * /`.
- Confirmed on a real `GOMIPS=softfloat` build (`go1.22.3`): patched runtime under `qemu-mips` prints `847895691526144` where the unpatched build printed `202154086`.
- Adding the two operands to the `base` slice in `src/runtime/softfloat64_test.go` reproduces the failure before the fix (the existing all-pairs `+`/`-` comparison against hardware catches it) and passes after.

### Notes

- Found via gno (gnolang/gno#5806), which copies `softfloat64.go` verbatim to get deterministic, hardware-independent float results — so it executes this path on every platform and surfaced the divergence.
- This is a latent correctness bug for all software-float Go targets, not specific to gno.


Provide feedback

Saved searches

Use saved searches to filter your results more quickly

runtime: softfloat64 fadd64/fsub64 produce incorrect result when normal operands cancel into a subnormal #79964

What version of Go are you using?

Does this issue reproduce with the latest release?

What operating system and processor architecture are you using?

What did you do?

What did you expect to see?

What did you see instead?

Root cause

Scope (verified)

Fix

Verification

Notes

Metadata

Assignees

Labels

Type

Fields

Projects

Milestone

Relationships

Development

runtime: softfloat64 fadd64/fsub64 produce incorrect result when normal operands cancel into a subnormal #79964

Description

What version of Go are you using?

Does this issue reproduce with the latest release?

What operating system and processor architecture are you using?

What did you do?

What did you expect to see?

What did you see instead?

Root cause

Scope (verified)

Fix

Verification

Notes

Metadata

Metadata

Assignees

Labels

Type

Fields

Projects

Milestone

Relationships

Development

Issue actions