Merge HalfKA and Threats accumulators #6890
CodeRabbit walkthrough: This PR unifies NNUE accumulator handling by replacing templated PSQ/threat states with a single non-templated AccumulatorState containing DirtyPiece and DirtyThreats. AccumulatorStack APIs (latest/mut_latest/evaluate_side/find_last_usable_accumulator/forward/backward incremental updates) and storage are converted to non-templated, Color-keyed forms. A new apply_combined implements SIMD/scalar combined PSQ+threat delta application. update_accumulator_refresh_cache now refreshes PSQ and threat contributions together, and the separate threat-only refresh function was removed. FeatureTransformer::transform() reads psqtAccumulation from the unified latest accumulator and omits threatAccumulation additions.
ded8e9a to 5385e3a
I wonder if this makes a larger L1, say 1280, more promising?
Passed STC (https://tests.stockfishchess.org/tests/view/6a2893ad7c758d82accea129):
LLR: 3.20 (-2.94,2.94) <0.00,2.00>
Total: 23328 W: 6145 L: 5838 D: 11345
Ptnml(0-2): 50, 2463, 6346, 2740, 65
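As a quick sanity check on the numbers above, here is a minimal sketch that verifies the counts are self-consistent and derives a plain logistic Elo point estimate from W/L/D. The struct and function names are hypothetical, and this is not fishtest's own SPRT model, which works on the pentanomial distribution.

```cpp
#include <array>
#include <cmath>

// Hypothetical container for a win/loss/draw tally.
struct Wdl {
    long wins;
    long losses;
    long draws;
};

// Total number of games played.
inline long games(const Wdl& r) { return r.wins + r.losses + r.draws; }

// Average score per game, counting a draw as half a point.
inline double score(const Wdl& r) {
    return (r.wins + 0.5 * r.draws) / static_cast<double>(games(r));
}

// Standard logistic Elo difference implied by an average score s in (0, 1).
inline double elo_from_score(double s) {
    return -400.0 * std::log10(1.0 / s - 1.0);
}

// Number of game pairs in a pentanomial tally (one entry per pair outcome 0..2).
inline long pairs(const std::array<long, 5>& ptnml) {
    long n = 0;
    for (long c : ptnml)
        n += c;
    return n;
}

// The figures reported for this test.
inline Wdl reported_wdl() { return Wdl{6145, 5838, 11345}; }
inline std::array<long, 5> reported_ptnml() { return {50, 2463, 6346, 2740, 65}; }
```

With the reported figures, W + L + D matches the stated total, the pentanomial counts add up to exactly half of it (one entry per game pair), and the logistic estimate comes out at about 4.6 Elo.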
Instead of repeatedly summing HalfKA + threats at the end, it's profitable to simply store one accumulator per side that combines them. This also avoids an extra accumulator load/store and halves the cache footprint of the accumulators.
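To illustrate the idea, here is a minimal scalar sketch, not the actual Stockfish code. All names (`kDims`, `Vec`, `SplitAccumulators`, `CombinedAccumulator` and the helper functions) are hypothetical, and the width of 8 is chosen only to keep the example small.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

constexpr std::size_t kDims = 8;  // hypothetical width; real networks are far wider
using Vec = std::array<std::int16_t, kDims>;

// Split layout (sketch): two vectors per side, summed at every evaluation.
struct SplitAccumulators {
    Vec halfka{};
    Vec threats{};
};

// Combined layout (sketch): one vector per side that already holds the sum.
struct CombinedAccumulator {
    Vec sum{};
};

// The combined layout stores half as many bytes per side.
static_assert(sizeof(SplitAccumulators) == 2 * sizeof(CombinedAccumulator),
              "combined layout should be half the size of the split layout");

// Split update: each delta goes into its own vector.
inline void update_split(SplitAccumulators& acc, const Vec& psqDelta, const Vec& threatDelta) {
    for (std::size_t i = 0; i < kDims; ++i) {
        acc.halfka[i]  = static_cast<std::int16_t>(acc.halfka[i] + psqDelta[i]);
        acc.threats[i] = static_cast<std::int16_t>(acc.threats[i] + threatDelta[i]);
    }
}

// Combined update: both deltas land in the same vector in a single pass.
inline void update_combined(CombinedAccumulator& acc, const Vec& psqDelta, const Vec& threatDelta) {
    for (std::size_t i = 0; i < kDims; ++i)
        acc.sum[i] = static_cast<std::int16_t>(acc.sum[i] + psqDelta[i] + threatDelta[i]);
}

// Extra work the split layout needs at evaluation time: read both vectors and add them.
inline Vec sum_at_eval(const SplitAccumulators& acc) {
    Vec out{};
    for (std::size_t i = 0; i < kDims; ++i)
        out[i] = static_cast<std::int16_t>(acc.halfka[i] + acc.threats[i]);
    return out;
}

// Apply the same three updates to both layouts and compare the results.
inline bool layouts_agree() {
    const Vec psqDelta{1, -2, 3, -4, 5, -6, 7, -8};
    const Vec threatDelta{10, 20, 30, 40, -10, -20, -30, -40};
    SplitAccumulators   split;
    CombinedAccumulator combined;
    for (int step = 0; step < 3; ++step) {
        update_split(split, psqDelta, threatDelta);
        update_combined(combined, psqDelta, threatDelta);
    }
    return sum_at_eval(split) == combined.sum;
}
```

Because addition is associative, both layouts produce the same values, which is consistent with the change being non-functional. The combined layout simply skips the `sum_at_eval` pass and keeps half as much accumulator state around.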
For full refreshes, we always compute both HalfKA and threats simultaneously. A full threat refresh always coincides with a HalfKA refresh: it occurs only when the king crosses the center line, while HalfKA refreshes are required for ANY king move. So we don't need a separate detection path for threats.
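The implication can be sketched as follows. This is an illustration only: the names are hypothetical, and "center line" is assumed here to mean the boundary between the d- and e-files, which the description does not spell out.

```cpp
// Squares are numbered 0..63 with a1 = 0, so the low three bits give the file.
constexpr int file_of(int sq) { return sq & 7; }

// Assumption: the "center line" separates files a..d from files e..h.
constexpr bool right_half(int sq) { return file_of(sq) >= 4; }

// Hypothetical description of one move from the accumulator's point of view.
struct KingStep {
    bool kingMoved;  // true if the moving piece is this side's king
    int  from;
    int  to;
};

// HalfKA: full refresh on any king move, as stated in the description.
constexpr bool halfka_refresh(const KingStep& m) { return m.kingMoved; }

// Threats: full refresh only when the king crosses the center line.
constexpr bool threats_refresh(const KingStep& m) {
    return m.kingMoved && right_half(m.from) != right_half(m.to);
}

// Merged accumulator: the HalfKA condition alone decides the refresh.
constexpr bool combined_refresh(const KingStep& m) { return halfka_refresh(m); }

// Exhaustively check that threats never need a refresh on their own, so the
// combined condition equals "HalfKA refresh OR threats refresh".
inline bool threats_never_refresh_alone() {
    for (int k = 0; k < 2; ++k)
        for (int from = 0; from < 64; ++from)
            for (int to = 0; to < 64; ++to) {
                const KingStep m{k == 1, from, to};
                if (threats_refresh(m) && !halfka_refresh(m))
                    return false;
                if (combined_refresh(m) != (halfka_refresh(m) || threats_refresh(m)))
                    return false;
            }
    return true;
}
```

Since the threats condition is the HalfKA condition with an extra constraint, checking the HalfKA condition alone never misses a required threat refresh. The price is refreshing the threat part on king moves that stay on one side of the line, which the merged layout makes unavoidable anyway.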
I get about a 2.5% speedup locally with this, but I'd appreciate other people's measurements.
No functional change