fix: out-of-bounds read in NumericUnkMaker::checkPeriod (closes #157)#158

Merged
eiennohito merged 2 commits into master from fix/numeric-period-trailing-bounds
Apr 17, 2026
Conversation

@eiennohito eiennohito commented Apr 17, 2026

Summary

  • Fixes #157 ("cause exception when analyze character string that ends with numeral and dot."). NumericUnkMaker::checkPeriod bounded the lookahead with pos + 1 < codepoints.size() but read codepoints[posPeriod + 1], where posPeriod = start + pos. On the second spawnNodes pass (start > 0) the two diverge, so inputs like 10. or ほげ4. read one past the end of the codepoint vector. This is the crash Ozeki-san reported on Windows. Reproduced on Linux under libstdc++ debug mode.
  • Replaced the wrong guard with posPeriod + 1 < codepoints.size(). Audited the sibling helpers (checkInterfix, checkSuffix, checkPrefix, checkComma) — they already bound correctly, so the bug is isolated to checkPeriod.
  • Added two regression tests covering the repro (10.) and the start > 0 path (ほげ4.). Both abort on the prior code.
  • Bumped actions/checkout from v4 to v5 to clear the Node 20 deprecation warning the newly-rewritten workflow surfaced.
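
For readers who want the index arithmetic spelled out, here is a minimal sketch of the two guards described in the summary. It is not the project's code: `CodepointVec`, `lookaheadIndex`, `oldGuardAllowsRead`, and `newGuardAllowsRead` are hypothetical names, and the real `checkPeriod` signature is assumed rather than quoted. Only the relationship `posPeriod = start + pos` and the two conditions come from the description above.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the codepoint vector mentioned in the summary.
using CodepointVec = std::vector<char32_t>;

// Absolute index that the lookahead reads: posPeriod + 1, with posPeriod = start + pos.
inline std::size_t lookaheadIndex(std::size_t start, std::size_t pos) {
  return start + pos + 1;
}

// Prior guard: bounds the relative offset only, so `start` is ignored.
inline bool oldGuardAllowsRead(const CodepointVec& cps, std::size_t start,
                               std::size_t pos) {
  (void)start;  // not consulted by the old condition
  return pos + 1 < cps.size();
}

// Fixed guard: bounds the same absolute index that is actually read.
inline bool newGuardAllowsRead(const CodepointVec& cps, std::size_t start,
                               std::size_t pos) {
  const std::size_t posPeriod = start + pos;
  return posPeriod + 1 < cps.size();
}
```

With the input `10.` (three codepoints), a second pass at `start = 1` sees the period at `pos = 1`. The old guard evaluates `2 < 3` and lets the read proceed, while `lookaheadIndex(1, 1)` is 3, which is one past the end. The new guard evaluates `3 < 3` and rejects the read. At `start = 0` the two guards agree, which is consistent with the bug only appearing on later passes.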

Test plan

  • ctest --test-dir build-asan --output-on-failure passes 10/10 suites under AddressSanitizer + UBSan + libstdc++ debug mode.
  • New numeric_creator_test cases "multi-digit number followed by trailing period does not crash" and "digit+period preceded by non-numeric context does not crash" abort on master before the fix and pass after.
  • CI green across the linux/linux-asan/macOS/Windows matrix introduced in c2b8b59.
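
The before/after behaviour in the test plan can be approximated outside the project's test suite. The sketch below is a reduction under stated assumptions, not the actual regression tests: `priorLookaheadIsDigit` and `fixedLookaheadIsDigit` are hypothetical helpers, the "is the next codepoint a digit" check is an assumed purpose for the lookahead, and `std::vector::at` stands in for the bounds-checked `operator[]` of libstdc++ debug mode so that the out-of-bounds read becomes a catchable `std::out_of_range` instead of an abort.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <vector>

using CodepointVec = std::vector<char32_t>;

inline bool isAsciiDigit(char32_t c) { return c >= U'0' && c <= U'9'; }

// Hypothetical reduction of the prior code: guard on pos, read at start + pos + 1.
inline bool priorLookaheadIsDigit(const CodepointVec& cps, std::size_t start,
                                  std::size_t pos) {
  if (pos + 1 < cps.size()) {
    return isAsciiDigit(cps.at(start + pos + 1));  // throws when past the end
  }
  return false;
}

// Hypothetical reduction of the fix: guard and read use the same absolute index.
inline bool fixedLookaheadIsDigit(const CodepointVec& cps, std::size_t start,
                                  std::size_t pos) {
  const std::size_t posPeriod = start + pos;
  if (posPeriod + 1 < cps.size()) {
    return isAsciiDigit(cps.at(posPeriod + 1));
  }
  return false;
}

// Reports whether invoking f raises std::out_of_range.
template <typename F>
bool throwsOutOfRange(F&& f) {
  try {
    f();
  } catch (const std::out_of_range&) {
    return true;
  }
  return false;
}
```

Calling `priorLookaheadIsDigit` on `10.` with `start = 1, pos = 1`, or on `ほげ4.` with `start = 2, pos = 1`, throws, while the fixed variant returns `false` for both. An input such as `1.5` with `start = 0, pos = 1` behaves the same under either version.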

Closes #157.

…digit+period

`checkPeriod` bounded the lookahead with `pos + 1 < codepoints.size()`, but
the index it actually read was `posPeriod + 1 = start + pos + 1`. On the
second pass of `spawnNodes` (start > 0) the two diverge, so inputs like
`10.` or `ほげ4.` read one past the end of the codepoint vector and
caused the crash reported in #157.

Bound the lookahead against the absolute index instead. Dropped the now-dead
`pos + 1` check so the condition reflects what is actually being guarded.

Added regression tests for `10.` and `ほげ4.`; both abort on the prior
code under libstdc++ debug mode / ASan.

Closes #157.
v4 runs on Node 20, which GitHub is forcing off the runners in
September 2026. v5 runs on Node 24 and is a drop-in replacement.
Surfaced as a warning annotation on the CI modernization run.
@eiennohito eiennohito self-assigned this Apr 17, 2026
@eiennohito eiennohito merged commit 34aa52c into master Apr 17, 2026
4 checks passed
@eiennohito eiennohito deleted the fix/numeric-period-trailing-bounds branch April 17, 2026 00:43