cpu-o3: split decoupled bpu stats helpers by jensen-yan · Pull Request #624 · OpenXiangShan/GEM5

jensen-yan · 2025-12-01T03:18:00Z

Move DB/init/dumpStats and related helpers to a new file to reduce clutter.

Summary by CodeRabbit

Chores
- Refactored branch prediction statistics collection infrastructure, consolidating legacy tracing and statistics paths into a dedicated new module.
- Simplified statistics gathering mechanisms for improved maintainability.
New Features
- Enhanced branch prediction statistics module with comprehensive tracing and reporting capabilities for performance analysis.

coderabbitai · 2025-12-01T03:18:21Z

Walkthrough

This change refactors the decoupled BTB-based branch predictor by extracting legacy statistics, tracing, and misprediction handling functionality from the main implementation file into a new dedicated statistics module. A build system entry is added to compile the new statistics module.

Changes

| Cohort / File(s) | Summary |
|---|---|
| Build System Configuration `src/cpu/pred/SConscript` | Added new source file `btb/decoupled_bpred_stats.cc` to the SimObject sources list. |
| Decoupled Predictor Core `src/cpu/pred/btb/decoupled_bpred.cc` | Removed extensive legacy statistics gathering, misprediction handling, branch classification, phase-based processing, and FSQ/BTB debugging logic. Eliminated ~30 public methods including `initDB()`, `dumpStats()`, `processMisprediction()`, `classifyBranch()`, `processPhase()`, and related tracing/commit-tracking infrastructure. |
| Decoupled Predictor Statistics Module `src/cpu/pred/btb/decoupled_bpred_stats.cc` | New file containing comprehensive statistics collection infrastructure: `DBPBTBStats` struct with multi-stage counters, `BpTrace` logging facility for branch prediction traces, `initDB()` initialization for trace outputs and database configuration, `dumpStats()` for CSV report generation, branch classification logic (`classifyBranch()`, `branchClassName()`), phase-based misprediction processing (`processPhase()`, `processFetchDistributions()`, `processBTBEntries()`), and per-branch commit tracking (`commitBranch()`, `addBranchClassStat()`). |

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

- Significant refactoring involving migration of ~30+ public methods and hundreds of lines of statistics/tracing logic from decoupled_bpred.cc to the new decoupled_bpred_stats.cc module
- Requires verification that extracted statistics collection, phase processing, and branch classification logic functions identically in the new module
- Interface between the two files must be carefully validated to ensure no regressions in statistics accuracy or control flow
- New DBPBTBStats structure and BpTrace facilities need thorough inspection for correctness
- Careful examination needed to confirm all removed code paths are properly replicated in the new statistics module and that call sites are correctly updated

Possibly related PRs

cpu-o3: add fine-grained branch type stats to BPU #620: Introduces fine-grained branch-type statistics changes to DecoupledBPUWithBTB with overlapping branch classification and per-class statistics infrastructure related to this PR's new DBPBTBStats vectors and branchClassName() APIs.

Suggested reviewers

Yakkhini
CJ362ff

Poem

🐰 Statistics split and gathered bright,
From tangled code to modules tight,
Classification branches bloom,
Phase by phase they light the room,
A cleaner core, a clearer sight! 📊

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 20.00% which is insufficient. The required threshold is 80.00%. | You can run `@coderabbitai generate docstrings` to improve docstring coverage. |

✅ Passed checks (2 passed)

| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title 'cpu-o3: split decoupled bpu stats helpers' accurately describes the main change: moving statistics and database functionality to a new file to reduce code clutter. |

coderabbitai

Actionable comments posted: 1

🧹 Nitpick comments (5)

src/cpu/pred/btb/decoupled_bpred_stats.cc (5)
248-248: Signed/unsigned comparison warning.

i is int but phaseRecords.size() returns size_t. Consider using size_t for loop indices to avoid compiler warnings.
-            for (int i = 0; i < outputTopN && i < phaseRecords.size(); i++) {
+            for (size_t i = 0; i < static_cast<size_t>(outputTopN) && i < phaseRecords.size(); i++) {
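
An alternative to casting inside the loop condition is to compute the bound once. The sketch below is only illustrative: `topNCount` and `sumTopN` are hypothetical helpers, and the types of `outputTopN` (`int`) and the record container are assumptions taken from the review comment. Note that a plain `static_cast<size_t>` of a negative `outputTopN` would wrap to a very large value, whereas the original signed comparison would run zero iterations; the helper clamps negative values to zero to keep that behaviour.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical helper: how many records a "top N" report should emit.
// A negative outputTopN is clamped to zero (assumption, not from the PR).
inline std::size_t
topNCount(int outputTopN, std::size_t numRecords)
{
    const std::size_t limit =
        outputTopN > 0 ? static_cast<std::size_t>(outputTopN) : 0;
    return std::min(limit, numRecords);
}

// Example use: iterate with a size_t index and no mixed-sign comparison.
inline long
sumTopN(const std::vector<long> &records, int outputTopN)
{
    long total = 0;
    const std::size_t n = topNCount(outputTopN, records.size());
    for (std::size_t i = 0; i < n; ++i)
        total += records[i];
    return total;
}
```
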
380-392: Architecture-specific cast limits portability.

The explicit RiscvISA::PCState cast ties this code to RISC-V. If this predictor is intended for other ISAs, consider using a more generic approach or adding an ISA check/abstraction.

Also, consider removing the commented-out debug code (lines 389-391) or converting it to a proper debug macro.

800-801: Magic number for history pattern size.

The value 18 for the history resize should be defined as a named constant for maintainability and clarity.
// Add near top of file or in class header
static constexpr int MISPRED_HIST_PATTERN_SIZE = 18;

// Then use:
hist.resize(MISPRED_HIST_PATTERN_SIZE);
832-834: Assertion lacks diagnostic context.

If this assertion fails, the developer won't know which fsqId caused the issue. Consider using panic_if or adding the fsqId to the assertion message for easier debugging.
     auto streamIt = fetchStreamQueue.find(inst->fsqId);
-    assert(streamIt != fetchStreamQueue.end());
+    panic_if(streamIt == fetchStreamQueue.end(),
+             "commitBranch: fsqId %lu not found in fetchStreamQueue",
+             inst->fsqId);
     auto entry = streamIt->second;
875-877: Assertion lacks diagnostic context.

Same issue as in commitBranch() — the assertion at line 876 provides no context if it fails.
     auto it = fetchStreamQueue.find(inst->fsqId);
-    assert(it != fetchStreamQueue.end());
+    panic_if(it == fetchStreamQueue.end(),
+             "notifyInstCommit: fsqId %lu not found in fetchStreamQueue",
+             inst->fsqId);
     it->second.commitInstNum++;
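
Since the same lookup-and-check pattern appears in both functions, it could also be factored into one helper. The following is a minimal standalone sketch, not code from the PR: `StreamEntry` and `findStreamOrThrow` are hypothetical names, the queue is assumed to be a `std::map` keyed by a 64-bit id, and an exception stands in for gem5's `panic_if` so that the example compiles outside the simulator.

```cpp
#include <cstdint>
#include <map>
#include <sstream>
#include <stdexcept>
#include <string>

// Stand-in for a fetch stream entry; the real type lives in the predictor.
struct StreamEntry
{
    unsigned commitInstNum = 0;
};

// Hypothetical helper: look up an FSQ entry and, on failure, report both
// the calling function and the missing id. Inside gem5, a panic_if with
// the same message would play this role instead of the exception.
inline StreamEntry &
findStreamOrThrow(std::map<std::uint64_t, StreamEntry> &fsq,
                  std::uint64_t fsqId, const std::string &caller)
{
    auto it = fsq.find(fsqId);
    if (it == fsq.end()) {
        std::ostringstream msg;
        msg << caller << ": fsqId " << fsqId
            << " not found in fetchStreamQueue";
        throw std::runtime_error(msg.str());
    }
    return it->second;
}
```

A caller would then write something like `findStreamOrThrow(fetchStreamQueue, inst->fsqId, "notifyInstCommit").commitInstNum++;`, so the diagnostic text is produced in one place.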

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between fc34e6b and 3d69e94.

📒 Files selected for processing (3)

src/cpu/pred/SConscript (1 hunks)
src/cpu/pred/btb/decoupled_bpred.cc (0 hunks)
src/cpu/pred/btb/decoupled_bpred_stats.cc (1 hunks)

💤 Files with no reviewable changes (1)

src/cpu/pred/btb/decoupled_bpred.cc

🧰 Additional context used

🧬 Code graph analysis (1)

src/cpu/pred/btb/decoupled_bpred_stats.cc (2)

src/cpu/pred/btb/decoupled_bpred.cc (4)

nonControlSquash (546-559)

nonControlSquash (547-549)

trapSquash (561-573)

trapSquash (562-564)

src/cpu/pred/btb/decoupled_bpred.hh (2)

currentPhaseCommittedDist (949-950)

isSubPhase (937-941)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)

GitHub Check: Quick Build, Unit Tests & Smoke Test

🔇 Additional comments (7)

src/cpu/pred/btb/decoupled_bpred_stats.cc (6)

1-102: LGTM!

The initDB() function cleanly initializes multiple trace subsystems with appropriate field definitions. The consistent pattern of checking switches, initializing traces, and removing processed switches is well-structured.

339-341: Off-by-one in header generation vs data output.

The header loop uses i <= outputTopNEntries which generates outputTopNEntries + 1 entries (entry_0 to entry_1 for outputTopNEntries=1). The data output at line 361 uses the same pattern. Verify this is intentional—typically "top N" means exactly N entries, not N+1.
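
To make the count concrete, here is a small sketch of the two loop bounds. `entryHeaders` is a hypothetical function written for this illustration; only the `entry_<i>` naming and the `<=` bound come from the comment above.

```cpp
#include <string>
#include <vector>

// Hypothetical header builder. With inclusive == true the loop uses
// `i <= n`, as described in the review comment; otherwise it uses `i < n`.
inline std::vector<std::string>
entryHeaders(int n, bool inclusive)
{
    std::vector<std::string> cols;
    for (int i = 0; inclusive ? i <= n : i < n; ++i)
        cols.push_back("entry_" + std::to_string(i));
    return cols;
}
```

With `n = 1`, the inclusive form yields two columns (`entry_0`, `entry_1`) and the exclusive form yields one (`entry_0`). Either choice is consistent as long as the header loop and the data loop use the same bound.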

444-516: LGTM, with minor observation.

The statistics initialization is comprehensive and properly uses gem5's statistics framework. The derived statistics formulas (lines 503-504) correctly establish relationships for computed counters.

Consider documenting the magic number 20 at line 505 (distribution bin count) as a named constant for clarity.

543-654: LGTM!

The phase processing functions correctly handle delta computation between phases and properly maintain state for both main phases and sub-phases. The BTB entry processing logic with visit count differencing is sound.
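
The delta idea mentioned here can be sketched generically as follows. This is an assumption-laden illustration rather than the PR's implementation: `CounterMap` and `takePhaseDelta` are hypothetical, and the counters are assumed to be cumulative and keyed by a 64-bit value such as a branch PC.

```cpp
#include <cstdint>
#include <map>

// Key (for example a branch PC) -> cumulative count since simulation start.
using CounterMap = std::map<std::uint64_t, std::uint64_t>;

// Hypothetical helper: return how much each counter grew since the last
// snapshot, then advance the snapshot so the next phase starts from here.
inline CounterMap
takePhaseDelta(const CounterMap &current, CounterMap &last)
{
    CounterMap delta;
    for (const auto &kv : current) {
        auto it = last.find(kv.first);
        const std::uint64_t prev = (it == last.end()) ? 0 : it->second;
        if (kv.second > prev)
            delta[kv.first] = kv.second - prev;  // only keys that changed
    }
    last = current;
    return delta;
}
```

Keeping separate snapshots for main phases and sub-phases, as the comment describes, would amount to calling such a helper with a different `last` map for each granularity.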

933-1012: LGTM!

The misprediction classification logic is well-structured with clear conditional paths for direction wrong, no prediction, and target wrong cases. The assertions provide appropriate invariant checks.
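
For readers unfamiliar with the three cases named above, a simplified classification might look like the sketch below. All names (`MispredKind`, `BranchOutcome`, `classifyMispred`) are hypothetical, and the ordering of the checks, as well as the assumption that a missing prediction only matters when the branch is actually taken, are illustrative choices rather than the logic in the PR.

```cpp
#include <cstdint>

enum class MispredKind { None, NoPrediction, DirectionWrong, TargetWrong };

// Simplified view of one resolved branch (hypothetical fields).
struct BranchOutcome
{
    bool predicted;             // predictor produced a prediction at all
    bool predTaken;
    bool actualTaken;
    std::uint64_t predTarget;
    std::uint64_t actualTarget;
};

inline MispredKind
classifyMispred(const BranchOutcome &b)
{
    // Assumption: with no prediction the front end falls through, so only
    // an actually-taken branch counts as a misprediction.
    if (!b.predicted)
        return b.actualTaken ? MispredKind::NoPrediction : MispredKind::None;
    if (b.predTaken != b.actualTaken)
        return MispredKind::DirectionWrong;
    if (b.actualTaken && b.predTarget != b.actualTarget)
        return MispredKind::TargetWrong;
    return MispredKind::None;
}
```
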

1019-1042: LGTM!

Good use of a local lambda to avoid duplicating the map update logic across the three branch tracking maps.
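
The pattern being praised can be sketched as follows. The struct, map names, and counters are hypothetical and chosen only to show how one local lambda can serve several maps; the actual maps and fields in the PR may differ.

```cpp
#include <cstdint>
#include <map>

struct BranchCounters
{
    std::uint64_t total = 0;
    std::uint64_t mispred = 0;
};

using BranchMap = std::map<std::uint64_t, BranchCounters>;

// Hypothetical tracker with three per-PC maps.
struct BranchTracker
{
    BranchMap all, conditional, indirect;

    void
    record(std::uint64_t pc, bool isCond, bool isIndirect, bool mispred)
    {
        // One lambda holds the update logic; each map just calls it.
        auto bump = [pc, mispred](BranchMap &m) {
            BranchCounters &c = m[pc];  // inserts a zeroed entry if absent
            c.total++;
            if (mispred)
                c.mispred++;
        };
        bump(all);
        if (isCond)
            bump(conditional);
        if (isIndirect)
            bump(indirect);
    }
};
```

Because the lambda captures `pc` and `mispred` by value, a later change to the update rule (for example, adding another counter) only needs to be made once.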

src/cpu/pred/SConscript (1)

98-98: LGTM!

The new source file is correctly added to the build system, placed logically next to the related btb/decoupled_bpred.cc entry.

github-actions · 2025-12-01T03:31:02Z

🚀 Coremark Smoke Test Results

| Branch | IPC | Change |
|---|---|---|
| Base (`xs-dev`) | `1.8589` | - |
| This PR | `1.8589` | ➡️ `0.0000` (`0.00%`) |

✅ Difftest smoke test passed!

cpu-o3: split decoupled bpu stats helpers

3d69e94

moving DB/init/dumpStats etc to new file to reduce clutter.

coderabbitai bot reviewed Dec 1, 2025

Yakkhini approved these changes Dec 2, 2025

jensen-yan merged commit 868f116 into xs-dev Dec 2, 2025
2 checks passed

jensen-yan deleted the split-BPU branch December 2, 2025 03:05

coderabbitai bot mentioned this pull request Dec 26, 2025
- cpu-o3:Implement predwrongSource method #683 (Merged)

This was referenced Jan 6, 2026
- 2 taken v8 #693 (Open)
- Split microtage perf #694 (Closed)
- cpu-o3: add 2Fetch features #700 (Open)

This was referenced Jan 13, 2026
- support ChampSim and CBP2025 trace simulation #649 (Merged)
- Sc ut #710 (Merged)
- cpu-o3: simplify fetch， only support decoupled BTB mode #721 (Merged)

This was referenced Jan 21, 2026
- Remove redundant codes in v3 BPU #723 (Closed)
- bpu: remove unused code #742 (Merged)

This was referenced Jan 28, 2026
- bpu: split FTQ into independent class #743 (Closed)
- Ahead microtage index perf #741 (Closed)

coderabbitai bot mentioned this pull request Mar 9, 2026
- Utage check rtl align #773 (Open)
