Train ITTAGE at resolve stage by Yakkhini · Pull Request #630 · OpenXiangShan/GEM5

Yakkhini · 2025-12-03T02:55:35Z

Summary by CodeRabbit

Chores
- Updated branch prediction system configuration and component enablement handling.
- Enhanced prediction update filtering with improved safety checks to ensure reliable operation under various conditions.

coderabbitai · 2025-12-03T02:55:52Z

Walkthrough

The changes enable the ittage branch predictor in the KMHV3 configuration and modify BTB update logic in btb_ittage.cc and decoupled_bpred.cc to conditionally filter entries based on resolution status and component enablement flags.

Changes

Cohort / File(s)	Change Summary
Configuration update `configs/example/kmhv3.py`	Enabled ittage branch predictor (changed from False to True) and added `cpu.branchPred.ittage.resolvedUpdate = True` to enable resolution tracking for ittage, alongside existing mbtb and tage resolvedUpdate assignments.
BTB update filtering `src/cpu/pred/btb/btb_ittage.cc`	Modified BTBITTAGE::update to conditionally filter BTB entries: when getResolvedUpdate() is true, retains only entries that are indirect, non-return, and resolved; otherwise, preserves prior behavior of updating indirect non-return entries.
Update guard conditions `src/cpu/pred/btb/decoupled_bpred.cc`	Added isEnabled checks to gate UBTB and ABTB updates; ABTB update now handles missing FTQ entries by defaulting previous_startpc to 0 when enabled, or skipping update when disabled.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

btb_ittage.cc: Requires careful review of the conditional filtering logic and verification that both paths (resolvedUpdate true/false) maintain intended behavior.
decoupled_bpred.cc: Multiple guard condition additions across UBTB and ABTB updates; verify FTQ entry fallback logic is correct.
kmhv3.py: Cross-reference config changes with corresponding code paths to ensure consistency.

Possibly related PRs

PR#612: Directly enables ittage predictor in KMHV3 config with similar parameter assignments.
PR#629: Modifies ABTB update behavior and gating logic in decoupled_bpred path.
PR#628: Adjusts ittage history and update handling in both btb_ittage and decoupled_bpred components.

Suggested labels

perf

Suggested reviewers

jensen-yan

Poem

🐰 With ittage now awake and enabled bright,
BTB filters dance left and right—
Guards and flags keep updates in sight,
Branch prediction tuned for speed and might,
Resolution tracked in morning light! ✨

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 50.00% which is insufficient. The required threshold is 80.00%.	You can run `@coderabbitai generate docstrings` to improve docstring coverage.

✅ Passed checks (2 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The title accurately summarizes the main change: enabling ITTAGE training at the resolve stage, which is reflected in the configuration updates and conditional resolution tracking logic across multiple files.

📜 Recent review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 427f3c0 and ea1a5b2.

📒 Files selected for processing (1)

src/cpu/pred/btb/decoupled_bpred.cc (1 hunks)

🚧 Files skipped from review as they are similar to previous changes (1)

src/cpu/pred/btb/decoupled_bpred.cc

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)

GitHub Check: Quick Build, Unit Tests & Smoke Test
GitHub Check: perf_test / XS-GEM5 - Run performance test (spec06-0.3c)

coderabbitai

Actionable comments posted: 1

📥 Commits

Reviewing files that changed from the base of the PR and between 0a221f6 and 493cb0d.

📒 Files selected for processing (1)

configs/example/kmhv3.py (1 hunks)

configs/example/kmhv3.py

github-actions · 2025-12-03T03:03:14Z

🚀 Coremark Smoke Test Results

Branch	IPC	Change
Base (`xs-dev`)	`1.7830`	-
This PR	`1.7846`	📈 `+0.0017` (`+0.09%`)

✅ Difftest smoke test passed!

XiangShanRobot · 2025-12-03T03:59:06Z

[Generated by GEM5 Performance Robot]
commit: 493cb0d
workflow: gem5 Ideal BTB Performance Test

Ideal BTB Performance

Overall Score

	PR	Master	Diff(%)
Score	15.24	19.92	-23.47 🔴

github-actions · 2025-12-03T04:40:48Z

🚀 Coremark Smoke Test Results

Branch	IPC	Change
Base (`xs-dev`)	`1.7830`	-
This PR	`1.7921`	📈 `+0.0092` (`+0.51%`)

✅ Difftest smoke test passed!

XiangShanRobot · 2025-12-03T06:16:45Z

[Generated by GEM5 Performance Robot]
commit: 4a41c44
workflow: gem5 Ideal BTB Performance Test

Ideal BTB Performance

Overall Score

	PR	Master	Diff(%)
Score	15.19	14.64	+3.81 🟢

[Generated by GEM5 Performance Robot]
commit: 4a41c44
workflow: gem5 Ideal BTB Performance Test

Ideal BTB Performance

Overall Score

	PR	Previous Commit	Diff(%)
Score	15.19	15.24	-0.32 🔴

coderabbitai

Actionable comments posted: 0

🧹 Nitpick comments (1)

src/cpu/pred/btb/btb_ittage.cc (1)

222-232: Resolved-update filtering looks sound; consider deduping the remove_if and double-check resolved plumbing.

The new gating on getResolvedUpdate() and e.resolved preserves the old behavior when resolvedUpdate is false and correctly restricts ITTAGE training to resolved, indirect, non-return entries when it is true. The main thing to watch is that all BTB entries you expect to train on—especially stream.updateNewBTBEntry in the miss case—have resolved set by the time update() runs, otherwise they’ll be silently dropped from all_entries_to_update.

You could also slightly simplify this block by using a single predicate that incorporates getResolvedUpdate() instead of two near-identical remove_if calls; that would reduce duplication and keep the filter logic in one place. The leading // // on the comment is likely a leftover and can be cleaned up as well.

📥 Commits

Reviewing files that changed from the base of the PR and between 4a41c44 and 66aa284.

📒 Files selected for processing (1)

src/cpu/pred/btb/btb_ittage.cc (1 hunks)

🧰 Additional context used

🧬 Code graph analysis (1)

src/cpu/pred/btb/btb_ittage.cc (1)

src/cpu/pred/btb/decoupled_bpred.hh (10)

e (202-214)

e (202-202)

e (216-228)

e (216-216)

e (230-235)

e (230-230)

e (237-241)

e (237-237)

e (268-271)

e (268-268)

github-actions · 2025-12-03T09:14:20Z

🚀 Coremark Smoke Test Results

Branch	IPC	Change
Base (`xs-dev`)	`1.7830`	-
This PR	`1.7747`	📉 `-0.0082` (`-0.46%`)

✅ Difftest smoke test passed!

XiangShanRobot · 2025-12-03T10:19:46Z

[Generated by GEM5 Performance Robot]
commit: 66aa284
workflow: gem5 Ideal BTB Performance Test

Ideal BTB Performance

Overall Score

	PR	Master	Diff(%)
Score	15.36	19.89	-22.77 🔴

[Generated by GEM5 Performance Robot]
commit: 66aa284
workflow: gem5 Ideal BTB Performance Test

Ideal BTB Performance

Overall Score

	PR	Previous Commit	Diff(%)
Score	15.36	15.19	+1.10 🟢

coderabbitai

Actionable comments posted: 0

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)

src/cpu/pred/btb/btb_tage.cc (1)
396-404: Add a clarifying comment explaining why new entries are marked as resolved.

The change to set resolved = true on new BTB entries is correct and intentional. New entries reach this code path only when execution information is available (stream.exeTaken), and marking them as resolved enables them to participate in resolve-stage training for ITTAGE components when getResolvedUpdate() is true. This aligns with the PR objective to train ITTAGE at the resolve stage.

Add a comment to clarify this intent:
+        // Mark as resolved to enable resolve-stage training for ITTAGE
         potential_new_entry.resolved = true;

🧹 Nitpick comments (1)

src/cpu/pred/btb/btb_tage.cc (1)

407-415: Consider reducing code duplication in the filtering logic.

The filtering logic has duplicated lambda predicates that differ only in the additional e.resolved check. This could be simplified for better maintainability.

Apply this diff to reduce duplication:

     // Filter: only keep conditional branches that are not always taken
+    auto base_filter = [](const BTBEntry &e) { return e.isCond && !e.alwaysTaken; };
     if (getResolvedUpdate()) {
         auto remove_it = std::remove_if(all_entries.begin(), all_entries.end(),
-            [](const BTBEntry &e) { return !(e.isCond && !e.alwaysTaken && e.resolved); });
+            [&base_filter](const BTBEntry &e) { return !(base_filter(e) && e.resolved); });
         all_entries.erase(remove_it, all_entries.end());
     } else {
         auto remove_it = std::remove_if(all_entries.begin(), all_entries.end(),
-            [](const BTBEntry &e) { return !(e.isCond && !e.alwaysTaken); });
+            [&base_filter](const BTBEntry &e) { return !base_filter(e); });
         all_entries.erase(remove_it, all_entries.end());
     }

📥 Commits

Reviewing files that changed from the base of the PR and between 66aa284 and 6d0f724.

📒 Files selected for processing (1)

src/cpu/pred/btb/btb_tage.cc (1 hunks)

github-actions · 2025-12-03T11:32:34Z

🚀 Coremark Smoke Test Results

Branch	IPC	Change
Base (`xs-dev`)	`1.7830`	-
This PR	`1.7747`	📉 `-0.0082` (`-0.46%`)

✅ Difftest smoke test passed!

XiangShanRobot · 2025-12-03T13:10:30Z

[Generated by GEM5 Performance Robot]
commit: 6d0f724
workflow: gem5 Ideal BTB Performance Test

Ideal BTB Performance

Overall Score

	PR	Master	Diff(%)
Score	15.23	19.89	-23.42 🔴

[Generated by GEM5 Performance Robot]
commit: 6d0f724
workflow: gem5 Ideal BTB Performance Test

Ideal BTB Performance

Overall Score

	PR	Previous Commit	Diff(%)
Score	15.23	15.24	-0.08 🔴

github-actions · 2025-12-04T02:56:58Z

🚀 Coremark Smoke Test Results

Branch	IPC	Change
Base (`xs-dev`)	`1.7830`	-
This PR	`1.7830`	📉 `-0.0000` (`-0.00%`)

✅ Difftest smoke test passed!

coderabbitai

Actionable comments posted: 1

📥 Commits

Reviewing files that changed from the base of the PR and between 6d0f724 and e1c5b82.

📒 Files selected for processing (4)

src/cpu/pred/btb/abtb.cc (1 hunks)
src/cpu/pred/btb/decoupled_bpred.cc (2 hunks)
src/cpu/pred/btb/mbtb.cc (3 hunks)
src/cpu/pred/btb/stream_struct.hh (2 hunks)

🧰 Additional context used

🧬 Code graph analysis (2)

src/cpu/pred/btb/mbtb.cc (2)

src/cpu/pred/btb/mbtb.hh (4)

stream (127-127)

stream (163-163)

stream (171-171)

stream (281-282)

src/cpu/pred/btb/btb_ubtb.hh (2)

stream (133-133)

stream (139-139)

src/cpu/pred/btb/stream_struct.hh (1)

src/cpu/pred/loop_predictor.hh (1)

bi (238-238)

🔇 Additional comments (5)

src/cpu/pred/btb/stream_struct.hh (2)

81-86: BranchInfo::resolved wiring looks consistent with resolve-stage training

The new BranchInfo::resolved field and its initialization in both constructors default to false, with explicit true assignment only at control squash sites. Operators still compare only pc, so existing equality semantics (per-static-branch) remain unchanged while the resolved bit is purely auxiliary for training logic. This aligns well with the resolve-update design.

Also applies to: 93-96, 99-109

447-453: BTB entry resolve marking matches new BranchInfo::resolved semantics

FetchStream::markBTBEntryResolved selectively sets entry.resolved = true for matching pc in updateBTBEntries. Since these are per-stream copies, this cleanly annotates only the entries used during resolve-stage update without polluting the underlying BTB arrays.
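
The per-stream marking described here can be sketched in isolation. This is a minimal illustration under assumptions, not the code from the PR: `BTBEntry` and `FetchStream` are reduced to the fields the comment mentions, and the integer return value of `markBTBEntryResolved` is a hypothetical convenience added so the effect is easy to check.

```cpp
#include <cstdint>
#include <vector>

using Addr = std::uint64_t;

// Simplified stand-in for the real BTBEntry: only the fields used here.
struct BTBEntry {
    Addr pc;
    bool resolved;
};

// Simplified stand-in for FetchStream: it owns copies of BTB entries,
// so marking them does not modify the backing BTB storage.
struct FetchStream {
    std::vector<BTBEntry> updateBTBEntries;

    // Set resolved = true on every per-stream copy whose pc matches.
    // Returning the number of marked copies is an assumption of this sketch.
    int markBTBEntryResolved(Addr resolvedPC) {
        int marked = 0;
        for (auto &entry : updateBTBEntries) {
            if (entry.pc == resolvedPC) {
                entry.resolved = true;
                ++marked;
            }
        }
        return marked;
    }
};
```

Because `updateBTBEntries` holds copies, a vector that plays the role of the BTB array keeps `resolved == false` after the stream copy is marked.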

src/cpu/pred/btb/abtb.cc (1)

523-528: Clearing resolved on AheadBTB updates is consistent with transient resolve-state usage

Resetting entry_to_write.resolved to false before committing to the BTB keeps the ahead BTB arrays free of stale resolve annotations, while resolve information is carried only in per-stream copies (updateBTBEntries) for ITTAGE-style training. This matches the intended “ephemeral resolve bit” design.

src/cpu/pred/btb/decoupled_bpred.cc (1)

463-467: Resolve-stage flow for exeBranchInfo.resolved and per-entry resolved looks coherent, but verify call ordering

The new bits in this file establish a clear resolve-update contract:

On control squashes with a valid static_inst, handleSquash() now builds stream.exeBranchInfo with full type/target info and sets stream.exeBranchInfo.resolved = true.

markCFIResolved() tags matching entries in stream.updateBTBEntries by setting entry.resolved = true, so resolve-aware predictors can see which BTB entries in the stream have actually completed.

resolveUpdate() then calls update(stream) only on components with getResolvedUpdate()==true (e.g., ITTAGE) when stream.isHit || stream.exeTaken, and finally clears stream.exeBranchInfo.resolved to avoid double-training on later calls.

This wiring is logically sound, but the correctness hinges on external sequencing:

prepareResolveUpdateEntries(stream_id) must be called before markCFIResolved(stream_id, resolvedInstPC) so that updateBTBEntries is populated; otherwise markCFIResolved won’t actually mark anything.

Only components that are truly resolve-stage–trained should return getResolvedUpdate()==true; others should continue to rely on the existing commit-time updatePredictorComponents() path.

Given the performance regressions reported by the robot, please double-check those two assumptions in the callers (resolve path in the core) and that ITTAGE is the only predictor currently switched to the resolve-update path.

Also applies to: 615-636, 661-672
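
The ordering concern can be made concrete with a toy model. The sketch below reuses the function names from the comment, but the signatures and the `ToyStream` type are assumptions for illustration only; the real functions take a stream id and operate on the predictor's own state.

```cpp
#include <cstdint>
#include <vector>

using Addr = std::uint64_t;

struct BTBEntry {
    Addr pc;
    bool resolved;
};

// Toy stream: exeResolved stands in for stream.exeBranchInfo.resolved.
struct ToyStream {
    std::vector<BTBEntry> hitEntries;        // entries predicted for the stream
    std::vector<BTBEntry> updateBTBEntries;  // per-stream copies used for training
    bool exeResolved;
};

// Step 1: populate the per-stream update list.
void prepareResolveUpdateEntries(ToyStream &s) {
    s.updateBTBEntries = s.hitEntries;
}

// Step 2: tag matching copies; returns how many were tagged.
int markCFIResolved(ToyStream &s, Addr resolvedPC) {
    int marked = 0;
    for (auto &e : s.updateBTBEntries) {
        if (e.pc == resolvedPC) {
            e.resolved = true;
            ++marked;
        }
    }
    return marked;
}

// Step 3: count the entries a resolve-trained component would see,
// then clear the stream-level flag so a later call does not train again.
int resolveUpdate(ToyStream &s) {
    int trained = 0;
    for (const auto &e : s.updateBTBEntries) {
        if (e.resolved) {
            ++trained;
        }
    }
    s.exeResolved = false;
    return trained;
}
```

In this model, calling `markCFIResolved` before `prepareResolveUpdateEntries` marks nothing, and the later `resolveUpdate` sees zero resolved entries, which is the silent-drop failure the comment warns about.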

src/cpu/pred/btb/mbtb.cc (1)

432-467: MBTB resolved-bit handling for new vs. existing entries is internally consistent

In MBTB::getAndSetNewBTBEntry():

On a BTB miss with stream.exeTaken, you now initialize the new BTBEntry from stream.exeBranchInfo and explicitly copy stream.exeBranchInfo.resolved into entry_to_write.resolved before storing it in stream.updateNewBTBEntry.

In MBTB::buildUpdatedEntry():

Regardless of whether you start from an existing entry or the request entry, you now reset entry_to_write.resolved = false before updating counters/targets and writing back.

This achieves a clean separation:

Transient resolve information is carried on per-stream state (updateNewBTBEntry and updateBTBEntries) for resolve-stage predictors.

The physical MBTB arrays and victim cache never retain stale resolved state across updates.

The only thing to watch is timing: for new branches, getAndSetNewBTBEntry() must be called in the resolve-stage preparation path before resolveUpdate() clears stream.exeBranchInfo.resolved, otherwise updateNewBTBEntry.resolved will always be false and ITTAGE won’t see resolved new-branch allocations as intended. Please confirm that the core’s resolve callback orders prepareResolveUpdateEntries(), markCFIResolved(), and resolveUpdate() accordingly.

Also applies to: 584-595
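
The separation between transient per-stream state and stored state can be sketched as two small functions. The names `makeNewEntry` and `buildEntryForArray` are hypothetical, and the structs carry only the fields needed for the illustration; this is not the MBTB code itself.

```cpp
#include <cstdint>

using Addr = std::uint64_t;

// Reduced stand-ins for the execution-time branch info and a BTB entry.
struct BranchInfo {
    Addr pc;
    Addr target;
    bool resolved;
};

struct BTBEntry {
    Addr pc;
    Addr target;
    bool resolved;
};

// New-branch path: the per-stream candidate inherits the resolve bit
// from the execution info, so resolve-stage training can see it.
BTBEntry makeNewEntry(const BranchInfo &exe) {
    BTBEntry entry{exe.pc, exe.target, exe.resolved};
    return entry;
}

// Write-back path: the copy destined for the BTB array never keeps
// the resolve bit, whatever the request entry carried.
BTBEntry buildEntryForArray(const BTBEntry &request) {
    BTBEntry entry = request;
    entry.resolved = false;
    return entry;
}
```

The per-stream candidate keeps `resolved == true` for training, while the value returned by `buildEntryForArray` always has it cleared.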

src/cpu/pred/btb/stream_struct.hh

XiangShanRobot · 2025-12-04T03:47:15Z

[Generated by GEM5 Performance Robot]
commit: e1c5b82
workflow: gem5 Ideal BTB Performance Test

Ideal BTB Performance

Overall Score

	PR	Master	Diff(%)
Score	15.24	19.89	-23.35 🔴

[Generated by GEM5 Performance Robot]
commit: e1c5b82
workflow: gem5 Ideal BTB Performance Test

Ideal BTB Performance

Overall Score

	PR	Previous Commit	Diff(%)
Score	15.24	15.23	+0.09 🟢

coderabbitai

Actionable comments posted: 0

🧹 Nitpick comments (1)

src/cpu/pred/btb/btb_ittage.cc (1)

222-232: Resolved vs non‑resolved BTB update filter looks correct; consider tightening comment and duplication

The conditional filtering cleanly preserves old behavior when getResolvedUpdate() is false and adds the expected e.resolved requirement when it is true, so functionally this aligns with “train only on resolved indirect, non‑return branches”.

Two minor follow‑ups:

The comment // // only update indirect branches that are not returns is now slightly inaccurate for the resolved‑update path and is double‑commented. Consider clarifying it to mention the resolved requirement, or just drop it.

The two remove_if + erase blocks differ only by the extra && e.resolved. You could reduce duplication by capturing getResolvedUpdate() in a predicate and calling remove_if once, but that’s just a readability nit.

Given the performance regressions in the robot reports, it’s worth double‑checking that all intended training entries (especially stream.updateNewBTBEntry) have BTBEntry::resolved set when getResolvedUpdate() is true; otherwise they’ll be silently filtered out here.

📥 Commits

Reviewing files that changed from the base of the PR and between e1c5b82 and 2ac6ecb.

📒 Files selected for processing (2)

configs/example/kmhv3.py (1 hunks)
src/cpu/pred/btb/btb_ittage.cc (1 hunks)

🚧 Files skipped from review as they are similar to previous changes (1)

configs/example/kmhv3.py

🧰 Additional context used

🧬 Code graph analysis (1)

src/cpu/pred/btb/btb_ittage.cc (1)

src/cpu/pred/btb/test/decoupled_bpred.hh (6)

e (153-165)

e (153-153)

e (167-179)

e (167-167)

e (181-186)

e (181-181)

github-actions · 2025-12-04T03:57:28Z

🚀 Coremark Smoke Test Results

Branch	IPC	Change
Base (`xs-dev`)	`1.7830`	-
This PR	`1.7747`	📉 `-0.0082` (`-0.46%`)

✅ Difftest smoke test passed!

XiangShanRobot · 2025-12-04T05:00:18Z

[Generated by GEM5 Performance Robot]
commit: 2ac6ecb
workflow: gem5 Ideal BTB Performance Test

Ideal BTB Performance

Overall Score

	PR	Master	Diff(%)
Score	15.24	19.89	-23.39 🔴

coderabbitai

Actionable comments posted: 0

🧹 Nitpick comments (1)

src/cpu/pred/btb/btb_ittage.cc (1)
222-232: Update the comment to reflect conditional filtering behavior.

The implementation correctly introduces two-path conditional filtering based on getResolvedUpdate(). However, the comment on line 222 states "only update indirect branches that are not returns" which no longer fully describes the behavior when getResolvedUpdate() is true (where the resolved flag is also required).

Apply this diff to clarify the comment:
-    // // only update indirect branches that are not returns
     if (getResolvedUpdate()) {
+        // Only update indirect branches that are not returns AND are resolved
         auto remove_it =
             std::remove_if(all_entries_to_update.begin(), all_entries_to_update.end(),
                            [](const BTBEntry &e) { return !(e.isIndirect && !e.isReturn && e.resolved); });
         all_entries_to_update.erase(remove_it, all_entries_to_update.end());
     } else {
+        // Only update indirect branches that are not returns (original behavior)
         auto remove_it = std::remove_if(all_entries_to_update.begin(), all_entries_to_update.end(),
                                         [](const BTBEntry &e) { return !(e.isIndirect && !e.isReturn); });
         all_entries_to_update.erase(remove_it, all_entries_to_update.end());
     }

Based on PR comments, the automated performance tests show significant regressions (-22% to -23% vs master across multiple commits). While this change correctly implements the resolve-stage training feature, please verify that:

The performance delta is expected for this feature

Unit tests cover both code paths (resolvedUpdate true/false)

The resolved flag is being set correctly on BTBEntry instances
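
A check of both filter paths could be sketched as follows. This is a standalone illustration under assumptions: `BTBEntry` is reduced to three flags, `filterForIttage` is a hypothetical free function rather than part of `BTBITTAGE::update`, and the two branches are folded into one predicate that is logically equivalent to the two-branch form.

```cpp
#include <algorithm>
#include <vector>

// Reduced stand-in for BTBEntry: only the flags the filter reads.
struct BTBEntry {
    bool isIndirect;
    bool isReturn;
    bool resolved;
};

// Keep only the entries ITTAGE should train on. When resolvedUpdate is
// true the entry must also be resolved; otherwise the resolved bit is ignored.
std::vector<BTBEntry> filterForIttage(std::vector<BTBEntry> entries,
                                      bool resolvedUpdate) {
    auto drop = [resolvedUpdate](const BTBEntry &e) {
        return !(e.isIndirect && !e.isReturn &&
                 (!resolvedUpdate || e.resolved));
    };
    entries.erase(std::remove_if(entries.begin(), entries.end(), drop),
                  entries.end());
    return entries;
}
```

Feeding one resolved indirect entry, one unresolved indirect entry, one return, and one non-indirect entry through the function should keep two entries with `resolvedUpdate == false` and one with `resolvedUpdate == true`.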

📥 Commits

Reviewing files that changed from the base of the PR and between 2ac6ecb and a4c5c74.

📒 Files selected for processing (2)

configs/example/kmhv3.py (1 hunks)
src/cpu/pred/btb/btb_ittage.cc (1 hunks)

🧰 Additional context used

🧬 Code graph analysis (2)

src/cpu/pred/btb/btb_ittage.cc (1)

src/cpu/pred/btb/test/decoupled_bpred.hh (6)

e (153-165)

e (153-153)

e (167-179)

e (167-167)

e (181-186)

e (181-181)

configs/example/kmhv3.py (3)

src/cpu/pred/btb/timed_base_pred.hh (1)

resolvedUpdate (77-77)

src/cpu/pred/btb/test/abtb.test.cc (1)

abtb (59-67)

src/cpu/pred/btb/test/btb_tage.test.cc (1)

tage (276-282)

🔇 Additional comments (1)

configs/example/kmhv3.py (1)

100-100: LGTM! Configuration correctly enables resolve-stage training for ITTAGE.

The configuration changes properly enable ITTAGE with resolve-stage training, aligning with the PR objective and matching the pattern used for mbtb and tage predictors. The past review comment requesting this change has been appropriately addressed after establishing a baseline.

However, the automated performance reports in the PR comments show consistent regressions of -22% to -23% versus master across multiple commits. Please verify whether:

This performance impact is expected for resolve-stage training

The regression is specific to the test workload or affects general performance

Any tuning or optimization opportunities exist to mitigate the impact

Also applies to: 107-107

github-actions · 2025-12-04T08:06:25Z

🚀 Coremark Smoke Test Results

Branch	IPC	Change
Base (`xs-dev`)	`1.7830`	-
This PR	`1.7747`	📉 `-0.0082` (`-0.46%`)

✅ Difftest smoke test passed!

XiangShanRobot · 2025-12-04T08:21:52Z

[Generated by GEM5 Performance Robot]
commit: a4c5c74
workflow: gem5 Ideal BTB Performance Test

Ideal BTB Performance

Overall Score

	PR	Master	Diff(%)
Score	15.37	19.83	-22.49 🔴

Change-Id: I6a01e4ecb12119e1d2040eff44f28ceeefb3886d

Change-Id: Ie7812db8bc256535f019bd59e36d5688312122ef

Change-Id: I13a7033fce6096dd2ad7a5e3c58f935486b28a63

coderabbitai

Actionable comments posted: 0

🧹 Nitpick comments (1)

src/cpu/pred/btb/btb_ittage.cc (1)

222-232: Consider consolidating the duplicate filter logic.

The two branches share nearly identical code with only the lambda condition differing. You could refactor to reduce duplication:

-    // // only update indirect branches that are not returns
-    if (getResolvedUpdate()) {
-        auto remove_it =
-            std::remove_if(all_entries_to_update.begin(), all_entries_to_update.end(),
-                           [](const BTBEntry &e) { return !(e.isIndirect && !e.isReturn && e.resolved); });
-        all_entries_to_update.erase(remove_it, all_entries_to_update.end());
-    } else {
-        auto remove_it = std::remove_if(all_entries_to_update.begin(), all_entries_to_update.end(),
-                                        [](const BTBEntry &e) { return !(e.isIndirect && !e.isReturn); });
-        all_entries_to_update.erase(remove_it, all_entries_to_update.end());
-    }
+    // only update indirect branches that are not returns
+    bool resolvedUpdate = getResolvedUpdate();
+    auto remove_it = std::remove_if(
+        all_entries_to_update.begin(), all_entries_to_update.end(),
+        [resolvedUpdate](const BTBEntry &e) {
+            return !(e.isIndirect && !e.isReturn && (!resolvedUpdate || e.resolved));
+        });
+    all_entries_to_update.erase(remove_it, all_entries_to_update.end());

Also, line 222 has a double comment prefix (// //) that should be cleaned up.

📥 Commits

Reviewing files that changed from the base of the PR and between a4c5c74 and d97449c.

📒 Files selected for processing (3)

configs/example/kmhv3.py (1 hunks)
src/cpu/pred/BranchPredictor.py (1 hunks)
src/cpu/pred/btb/btb_ittage.cc (1 hunks)

🚧 Files skipped from review as they are similar to previous changes (1)

configs/example/kmhv3.py

🧰 Additional context used

🧬 Code graph analysis (1)

src/cpu/pred/btb/btb_ittage.cc (2)

src/cpu/pred/btb/decoupled_bpred.hh (10)

e (202-214)

e (202-202)

e (216-228)

e (216-216)

e (230-235)

e (230-230)

e (237-241)

e (237-237)

e (268-271)

e (268-268)

src/cpu/pred/btb/test/decoupled_bpred.hh (6)

e (153-165)

e (153-153)

e (167-179)

e (167-167)

e (181-186)

e (181-181)

🪛 Ruff (0.14.7)

src/cpu/pred/BranchPredictor.py

1003-1003: Param may be undefined, or defined from star imports

(F405)

🔇 Additional comments (1)

src/cpu/pred/BranchPredictor.py (1)

1003-1003: LGTM!

The default value change for usingS3Pred from True to False aligns with the PR's objective of training ITTAGE at the resolve stage. The static analysis warning about Param is a false positive — it's correctly imported via the standard gem5 pattern from m5.params import * on line 29.

github-actions · 2025-12-04T08:47:28Z

🚀 Coremark Smoke Test Results

Branch	IPC	Change
Base (`xs-dev`)	`1.7830`	-
This PR	`1.7747`	📉 `-0.0082` (`-0.46%`)

✅ Difftest smoke test passed!

XiangShanRobot · 2025-12-04T09:39:10Z

[Generated by GEM5 Performance Robot]
commit: d97449c
workflow: gem5 Ideal BTB Performance Test

Ideal BTB Performance

Overall Score

	PR	Master	Diff(%)
Score	15.31	19.83	-22.76 🔴

coderabbitai

Actionable comments posted: 0

🧹 Nitpick comments (1)

src/cpu/pred/btb/decoupled_bpred.cc (1)
252-261: Tighten ABTB S3 update gating and simplify previous-PC handling

The new it != fetchStreamQueue.end() check is good; it prevents dereferencing an invalid iterator when there is no previous stream entry. However, the current structure still calls abtb->updateUsingS3Pred(...) from the else branch even when abtb->isEnabled() is false. That’s a bit inconsistent with the constructor pattern where disabled components are generally not used, and it duplicates the updateUsingS3Pred call.

You can both (a) avoid touching ABTB when it’s disabled and (b) centralize previous-PC selection with something like:
-        auto it = fetchStreamQueue.find(fsqId-1);
-        if (it != fetchStreamQueue.end() && abtb->isEnabled()) {
-            auto previous_block_startpc = it->second.startPC;
-            abtb->updateUsingS3Pred(predsOfEachStage[numStages - 1], previous_block_startpc);
-        } else {
-            abtb->updateUsingS3Pred(predsOfEachStage[numStages - 1], 0);
-        }
+        if (abtb->isEnabled()) {
+            auto it = fetchStreamQueue.find(fsqId - 1);
+            Addr previous_block_startpc = 0;
+            if (it != fetchStreamQueue.end()) {
+                previous_block_startpc = it->second.startPC;
+            }
+            abtb->updateUsingS3Pred(
+                predsOfEachStage[numStages - 1], previous_block_startpc);
+        }
This keeps the “use 0 when no previous stream” behavior, but clearly skips ABTB training when the component is disabled and removes duplicated calls. If updateUsingS3Pred is intentionally callable when disabled, you can still use this pattern for clarity and to document that intent.
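
The suggested gating can also be exercised outside the simulator. The sketch below is an assumption-laden toy: `ToyABTB`, `previousStartPC`, and `trainABTB` are hypothetical names, the queue is modeled as a `std::map`, and the toy `updateUsingS3Pred` takes only the previous start PC, whereas the call in the diff also passes the final-stage predictions.

```cpp
#include <cstdint>
#include <map>

using Addr = std::uint64_t;
using FsqId = std::uint64_t;

struct FetchStream {
    Addr startPC;
};

// Toy ABTB that records how often it was trained and with which PC.
struct ToyABTB {
    bool enabled;
    int updates;
    Addr lastPrevPC;

    bool isEnabled() const { return enabled; }
    void updateUsingS3Pred(Addr previousStartPC) {
        ++updates;
        lastPrevPC = previousStartPC;
    }
};

// Start PC of the stream preceding fsqId, or 0 when none is queued.
Addr previousStartPC(const std::map<FsqId, FetchStream> &fetchStreamQueue,
                     FsqId fsqId) {
    auto it = fetchStreamQueue.find(fsqId - 1);
    return it != fetchStreamQueue.end() ? it->second.startPC : 0;
}

// Skip training entirely when the component is disabled.
void trainABTB(ToyABTB &abtb,
               const std::map<FsqId, FetchStream> &fetchStreamQueue,
               FsqId fsqId) {
    if (!abtb.isEnabled()) {
        return;
    }
    abtb.updateUsingS3Pred(previousStartPC(fetchStreamQueue, fsqId));
}
```

With this shape there is a single call site for the update, and a disabled component is never touched.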

📥 Commits

Reviewing files that changed from the base of the PR and between d97449c and 427f3c0.

📒 Files selected for processing (1)

src/cpu/pred/btb/decoupled_bpred.cc (1 hunks)

github-actions · 2025-12-04T09:57:06Z

🚀 Coremark Smoke Test Results

Branch	IPC	Change
Base (`xs-dev`)	`1.7830`	-
This PR	`1.7747`	📉 `-0.0082` (`-0.46%`)

✅ Difftest smoke test passed!

XiangShanRobot · 2025-12-04T10:56:19Z

[Generated by GEM5 Performance Robot]
commit: 427f3c0
workflow: gem5 Ideal BTB Performance Test

Ideal BTB Performance

Overall Score

	PR	Master	Diff(%)
Score	15.32	19.83	-22.74 🔴

[Generated by GEM5 Performance Robot]
commit: 427f3c0
workflow: gem5 Ideal BTB Performance Test

Ideal BTB Performance

Overall Score

	PR	Previous Commit	Diff(%)
Score	15.32	15.31	+0.02 🟢

Change-Id: If8ad467ec69fc727ce177a942ae3ebc10eceafbe

github-actions · 2025-12-05T02:42:22Z

🚀 Coremark Smoke Test Results

Branch	IPC	Change
Base (`xs-dev`)	`1.7830`	-
This PR	`1.7747`	📉 `-0.0082` (`-0.46%`)

✅ Difftest smoke test passed!

XiangShanRobot · 2025-12-05T03:32:19Z

[Generated by GEM5 Performance Robot]
commit: ea1a5b2
workflow: gem5 Ideal BTB Performance Test

Ideal BTB Performance

Overall Score

	PR	Master	Diff(%)
Score	15.31	19.83	-22.78 🔴

Yakkhini added the align-kmhv3 label Dec 3, 2025

coderabbitai bot reviewed Dec 3, 2025

configs/example/kmhv3.py (review thread resolved)

Yakkhini assigned jensen-yan Dec 3, 2025

jensen-yan previously approved these changes Dec 3, 2025

Yakkhini dismissed jensen-yan’s stale review via 66aa284 December 3, 2025 09:06

coderabbitai bot reviewed Dec 3, 2025

coderabbitai bot reviewed Dec 4, 2025

src/cpu/pred/btb/stream_struct.hh (outdated; review thread resolved)

Yakkhini force-pushed the ittage-align branch from e1c5b82 to 2ac6ecb Compare December 4, 2025 03:50

coderabbitai bot reviewed Dec 4, 2025

jensen-yan previously approved these changes Dec 4, 2025

Yakkhini dismissed jensen-yan’s stale review via a4c5c74 December 4, 2025 06:55

Yakkhini force-pushed the ittage-align branch from 2ac6ecb to a4c5c74 Compare December 4, 2025 06:55

coderabbitai bot reviewed Dec 4, 2025

cpu-o3: enable ITTAGE predictor

79207b2

Change-Id: I6a01e4ecb12119e1d2040eff44f28ceeefb3886d

Yakkhini added 2 commits December 4, 2025 16:39

cpu-o3: train ittage in resolve stage

681b100

Change-Id: Ie7812db8bc256535f019bd59e36d5688312122ef

cpu-o3: filter not resolved btb entries in ittage resolve update

f84a311

Change-Id: I13a7033fce6096dd2ad7a5e3c58f935486b28a63

Yakkhini force-pushed the ittage-align branch from a4c5c74 to d97449c Compare December 4, 2025 08:40

coderabbitai bot reviewed Dec 4, 2025

Yakkhini force-pushed the ittage-align branch from 427f3c0 to ea1a5b2 Compare December 5, 2025 02:34

cpu-o3: only update ubtb & abtb when it enabled

ea1a5b2

Change-Id: If8ad467ec69fc727ce177a942ae3ebc10eceafbe

jensen-yan approved these changes Dec 5, 2025

Yakkhini merged commit 4129e13 into xs-dev Dec 5, 2025
3 checks passed

Yakkhini deleted the ittage-align branch December 5, 2025 03:46

This was referenced Dec 8, 2025

Train potential entry correctly in resolve update #637

Merged

Add counter for components align #639

Merged

align RTL: tage window method, block resolveQ if update failed due to bank conflict. #644

Merged

This was referenced Dec 23, 2025

ITTAGE Bank Conflict Implementation #671

Closed

Using mbtb basetable align #672

Merged

coderabbitai bot mentioned this pull request Apr 2, 2026

Full resolve train #810

Open

coderabbitai bot mentioned this pull request Apr 10, 2026

Tracking legacy resolve train feature performance #823

Open
