Skip to content

Microtage merage #775

Merged
CJ362ff merged 39 commits intoxs-devfrom
microtage-ok-type
Mar 10, 2026
Merged

Microtage merage #775
CJ362ff merged 39 commits intoxs-devfrom
microtage-ok-type

Conversation

@CJ362ff
Copy link
Copy Markdown
Contributor

@CJ362ff CJ362ff commented Mar 6, 2026

Summary by CodeRabbit

Release Notes

New Features

  • Added MicroTAGE branch predictor with expanded configurability including table sizes, history depths, and banking conflict handling for improved branch prediction accuracy.
  • MicroTAGE is now enabled in supported CPU configurations with optimized default parameters.

Cao Jiaming added 30 commits March 6, 2026 11:35
Change-Id: I44b6175022a3a593ed385407bd95ec0c40c74642
Change-Id: I16ce79a7d8488d9a138d0a26f5500576ae54132e
Change-Id: I8be037a1cdacd7151f4fe8e743a32cca90a85036
Change-Id: I6b9c92c18c574a12c19532ae0b894e64c1187342
Change-Id: I953965b8e6feb1c15a238ac832d65bc16b32f496
Change-Id: I39cd471d452aa343c3dd741a80fdfa7d126e3a9f
Change-Id: I3567ae93652aac218c5b4646003abadddaf7cf32
Change-Id: If7c9d3aa68a23c36dde74d8cf3a286c9c48f3e3c
Change-Id: I83277ae5c801e9d22b594286580459d12cdec69b
Change-Id: I394246af184d3f07e02b85f06e4e5ceed368ec22
Change-Id: I762c11f8d15262fb9f1c9d443f77895fa76bbc79
Change-Id: Ia9bcdc028235447e254889d95e5ea98e7f067664
Change-Id: I56614e8ebc2dd33320d353562087ab456fc452da
Change-Id: Icdc7ac7f047a36dba6733a0e1d11e5e37aa4cdf1
Change-Id: I9cb7e7ef04efefbcbdbf705563563f50f3c83324
Change-Id: Iad486579b9ab0df207013348f02c6be30bd10cfd
Change-Id: I20e9fbfe161cd741b37bb69d46e99ee7755f79e5
Change-Id: If1187310e575a7c485d49fb89215742b1174393a
Change-Id: I2b9f9216a54405d517538958b717a8e3de2eca8a
Change-Id: I8c70b06b834fd5e713ef9ac4d3b26ec5b01ce2e6
Change-Id: Ic736ea7cc73abebb3af473510c16d837b4f535f4
Change-Id: I94ccc54b623bbf488bdaae9086c3cfff92c1fcdf
Change-Id: Ic7a506f9ee3bbbcaaf6b50f28cf74dc4a407cea8
Change-Id: Iffd21dbdb9f633841d1a9ac33dcd4e7f5f53b992
Change-Id: I93ab507fc32d1096a5fd07f68d85740577eb6ea9
Cao Jiaming added 9 commits March 6, 2026 11:40
Change-Id: I72fbca71841e9209f8803e928af907a8c1106bf2
Change-Id: If64a72e75eb5e8ee4c280a8362577617efa0c964
Change-Id: I0b5a16c19b91c36649a17171e49eb000626e7b98
Change-Id: I32ce96420169227820174c54ea4d7b423b910d22
Change-Id: I67953b502b3d2cb0f08a66f58537f6eaa800cdf1
Change-Id: Ic0faea91ca80f67e9b106eab459e3098bfe3b5a9
Change-Id: I34219f5fc8b3c4ac7ed515dd57aaca6543fcb4ba
Change-Id: I727d76e59df69d0cef0089b87a4e671ca25546b9
Change-Id: I15c1f248fa079ddd5b8cf4525fbaf23e04ddb1cf
@coderabbitai
Copy link
Copy Markdown

coderabbitai bot commented Mar 6, 2026

📝 Walkthrough

Walkthrough

This pull request introduces a new MicroTAGE branch predictor class that replaces a simpler BTBTAGE-based design with an expanded, bank-aware, multi-table TAGE predictor. Changes include configuration updates, build system integration, type signature modifications, and a comprehensive 1,100+ line implementation with prediction, update, and history management logic.

Changes

Cohort / File(s) Summary
Configuration Updates
configs/example/idealkmhv3.py, configs/example/kmhv3.py
Added comment documenting microtage configuration; enabled MicroTAGE in DecoupledBPUWithBTB (changed from False to True).
Build System Integration
src/cpu/pred/SConscript
Added MicroTAGE to SimObject BP type list, included new source file btb/microtage.cc, and introduced new DebugFlag UTAGE.
Predictor Type Definitions
src/cpu/pred/BranchPredictor.py
Refactored MicroTAGE from BTBTAGE subclass to TimedBaseBTBPredictor with extensive new public parameters (needMoreHistories, updateOnRead, numPredictors, tableSizes, TTagBitSizes, TTagPcShifts, blockSize, histLengths, banking configs, etc.); updated DecoupledBPUWithBTB.microtage field type from Param.BTBTAGE to Param.MicroTAGE.
Interface Updates
src/cpu/pred/btb/decoupled_bpred.hh
Added microtage.hh include; changed private member type from BTBTAGE\* to MicroTAGE\*.
MicroTAGE Implementation
src/cpu/pred/btb/microtage.hh, src/cpu/pred/btb/microtage.cc
Introduced complete MicroTAGE predictor with bank-aware multi-table TAGE design, including TageEntry/TageTableInfo/TagePrediction/TageMeta nested structures, core prediction/update workflows (generateSinglePrediction, lookupHelper, putPCHistory, update, canResolveUpdate, doResolveUpdate), folded history management (doUpdateHist, specUpdatePHist, recoverPHist), banking logic (getBankId, enableBankConflict), and extensive TageStats instrumentation.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs

  • PR #644: Adds the two-phase resolveUpdate API (canResolveUpdate/doResolveUpdate) and blocking/notify flow for BTB/TAGE updates that directly align with MicroTAGE's update resolution infrastructure.
  • PR #628: Modifies branch predictors to adopt path-folded history interfaces (specUpdatePHist/recoverPHist, doUpdateHist) that are now central to MicroTAGE's history management.
  • PR #615: Extends TAGE-style BTB predictor with bank-aware, position-aware designs (bank params, getTageTag/getTageIndex/getBankId) that mirror MicroTAGE's banking model.

Suggested labels

align-kmhv3

Suggested reviewers

  • Yakkhini

Poem

🐰 A tiny TAGE hops into view,
With folded histories, tables, and banks so true,
Banking on conflicts, it predicts with grace,
Microtagely swift through the fetch-decode race! ✨

🚥 Pre-merge checks | ✅ 1 | ❌ 2

❌ Failed checks (2 warnings)

Check name Status Explanation Resolution
Title check ⚠️ Warning The title 'Microtage merage' is unclear and contains a typo; it does not clearly describe the main changes. Replace with a clear, specific title such as 'Implement MicroTAGE branch predictor with TAGE-based BTB integration' to accurately reflect the substantial additions of new predictor classes and configuration parameters.
Docstring Coverage ⚠️ Warning Docstring coverage is 54.35% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (1 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
  • 📝 Generate docstrings (stacked PR)
  • 📝 Generate docstrings (commit on current branch)
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch microtage-ok-type

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick comments (7)
src/cpu/pred/btb/microtage.hh (1)

163-165: Consider documenting the UNIT_TEST visibility change.

The #ifndef UNIT_TEST guard around private: exposes internal methods during testing, which is a reasonable pattern. A brief comment explaining this design choice would help future maintainers understand the intent.

📝 Suggested documentation
+// In production builds, the following methods are private implementation details.
+// In UNIT_TEST builds, they are exposed for direct testing.
 `#ifndef` UNIT_TEST
   private:
 `#endif`
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/cpu/pred/btb/microtage.hh` around lines 163 - 165, Add a brief comment
above the `#ifndef` UNIT_TEST guard in microtage.hh explaining that the
conditional exposes internal members for unit testing; mention that the guard
toggles visibility of the subsequent private: section so test code can access
internal methods/state while production builds keep them private (reference the
UNIT_TEST macro and the private: label). Keep the comment concise and near the
guard so future maintainers understand the intent.
configs/example/idealkmhv3.py (1)

83-83: Clarify the intent of this commented-out configuration.

This commented-out line suggests MicroTAGE should be disabled in the "ideal" configuration, but since it's commented out, MicroTAGE will use its default enabled state (True per BranchPredictor.py).

If the intent is to keep MicroTAGE disabled in this ideal configuration (for performance comparison or baseline testing), consider uncommenting it. Otherwise, if MicroTAGE should be enabled here too, consider removing this commented line to avoid confusion.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@configs/example/idealkmhv3.py` at line 83, The commented config line
"cpu.branchPred.microtage.enabled = False" leaves MicroTAGE using its default
True value; decide and apply the intended behavior: if MicroTAGE should be
disabled in this ideal config, uncomment "cpu.branchPred.microtage.enabled =
False" (so the setting is explicit), otherwise remove the commented line to
avoid confusion and rely on the default True from BranchPredictor.py; update the
file accordingly so the intent is explicit.
src/cpu/pred/btb/microtage.cc (5)

772-783: Add defensive bounds check for index bit width.

Similar to the tag computation, if tableIndexBits[t] reaches 64 or more, the shift would be undefined behavior. Consider adding an assertion.

🛡️ Suggested defensive fix
 Addr
 MicroTAGE::getTageIndex(Addr pc, int t, uint64_t foldedHist)
 {
     // Create mask for tableIndexBits[t] to limit result size
+    assert(tableIndexBits[t] < 64 && "Index bit width must be less than 64");
     Addr mask = (1ULL << tableIndexBits[t]) - 1;
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/cpu/pred/btb/microtage.cc` around lines 772 - 783, In getTageIndex you
must guard against undefined behavior when tableIndexBits[t] is 64 or more
before doing (1ULL << tableIndexBits[t]); add a defensive bounds check (e.g.,
assert or conditional) that tableIndexBits[t] is in the valid range (1..63) or
handle the 64-bit case explicitly by using a full-mask (e.g., mask = ~0ULL) —
update the code around tableIndexBits[t], mask, and pcShift in
MicroTAGE::getTageIndex to mirror the safe check used in the tag computation so
shifts are never undefined.

265-273: Consider early-exit optimization for empty btbEntries.

If btbEntries is empty, the loop does nothing. An early return could slightly improve readability, though this is minor.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/cpu/pred/btb/microtage.cc` around lines 265 - 273, The loop over
btbEntries in microtage.cc performs no work when btbEntries is empty; add an
early exit by checking btbEntries.empty() before the loop (e.g., at the start of
the function or just before the for) and return immediately to improve
readability; locate the block that calls generateSinglePrediction(btb_entry,
startPC), writes to meta->preds[btb_entry.pc], calls
tageStats.updateStatsWithTagePrediction(pred, true), and pushes into results and
insert the empty-check there.

91-97: Potential redundant resize of tableTagBits.

tableTagBits is resized at line 93, but it was already initialized from p.TTagBitSizes at line 68 (or sized in the test constructor at line 55). The resize at line 93 appears unnecessary and could silently mask configuration issues where TTagBitSizes has fewer elements than numPredictors.

Consider either removing the resize or adding a more explicit validation error:

♻️ Suggested clarification
     tageTable.resize(numPredictors);
     tableIndexBits.resize(numPredictors);
-    tableTagBits.resize(numPredictors);
+    // tableTagBits already initialized from params; validate size matches
+    assert(tableTagBits.size() >= numPredictors && "TTagBitSizes must have numPredictors elements");
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/cpu/pred/btb/microtage.cc` around lines 91 - 97, The resize of
tableTagBits is redundant and can mask misconfiguration of p.TTagBitSizes;
instead of silently resizing, validate that p.TTagBitSizes contains entries for
all predictors (compare p.TTagBitSizes.size() to numPredictors) and if it
doesn't, either assert/throw a clear configuration error or fallback explicitly
with logged intent; remove the tableTagBits.resize(numPredictors) call and
replace it with a check that sets tableTagBits from p.TTagBitSizes (or fails
fast), leaving the existing tablePcShifts resize behavior as-is.

628-632: Graceful handling when prediction metadata is missing.

The code logs and returns when predMeta is null, which is good defensive programming. However, this could indicate a bug in the calling code if it happens unexpectedly.

Consider adding a stat counter to track how often this occurs, which would help diagnose issues in production:

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/cpu/pred/btb/microtage.cc` around lines 628 - 632, Add a stat counter
increment when prediction metadata is missing: in the block where predMeta is
obtained (auto predMeta =
std::static_pointer_cast<TageMeta>(stream.predMetas[getComponentIdx()])), before
returning on the null check, increment a descriptive counter (e.g.,
predMetaMissingCount or nullPredMetaStat) defined on the containing class/module
(e.g., Microtage or associated stats struct) so occurrences are tracked in
production; keep the existing DPRINTF(UTAGE, ...) and return behavior unchanged
but ensure the new counter is declared/registered with the file's stat framework
and incremented each time predMeta is null.

756-770: Add defensive bounds check for tag bit width.

The static analyzer flagged a potential undefined behavior: if tableTagBits[t] is 64 or greater, (1ULL << tableTagBits[t]) is undefined behavior. While current defaults (16 bits) are safe, a misconfigured parameter could trigger this.

🛡️ Suggested defensive fix
 Addr
 MicroTAGE::getTageTag(Addr pc, int t, uint64_t foldedHist, uint64_t altFoldedHist, Addr position)
 {
     // Create mask for tableTagBits[t] to limit result size
+    assert(tableTagBits[t] < 64 && "Tag bit width must be less than 64");
     Addr mask = (1ULL << tableTagBits[t]) - 1;
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/cpu/pred/btb/microtage.cc` around lines 756 - 770, Add a defensive bounds
check before computing mask from tableTagBits[t]: avoid doing (1ULL <<
tableTagBits[t]) when tableTagBits[t] >= 64 by computing mask conditionally
(e.g., if tableTagBits[t] >= 64 set mask = ~0ULL; else mask = (1ULL <<
tableTagBits[t]) - 1). Update the block that currently computes Addr mask =
(1ULL << tableTagBits[t]) - 1; (in the microtage tag/XOR construction using
tableTagBits[t], pcBits, foldedHist, altFoldedHist, bankBaseShift, position) to
use this guarded computation so shifts cannot be invoked with >=64.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Nitpick comments:
In `@configs/example/idealkmhv3.py`:
- Line 83: The commented config line "cpu.branchPred.microtage.enabled = False"
leaves MicroTAGE using its default True value; decide and apply the intended
behavior: if MicroTAGE should be disabled in this ideal config, uncomment
"cpu.branchPred.microtage.enabled = False" (so the setting is explicit),
otherwise remove the commented line to avoid confusion and rely on the default
True from BranchPredictor.py; update the file accordingly so the intent is
explicit.

In `@src/cpu/pred/btb/microtage.cc`:
- Around line 772-783: In getTageIndex you must guard against undefined behavior
when tableIndexBits[t] is 64 or more before doing (1ULL << tableIndexBits[t]);
add a defensive bounds check (e.g., assert or conditional) that
tableIndexBits[t] is in the valid range (1..63) or handle the 64-bit case
explicitly by using a full-mask (e.g., mask = ~0ULL) — update the code around
tableIndexBits[t], mask, and pcShift in MicroTAGE::getTageIndex to mirror the
safe check used in the tag computation so shifts are never undefined.
- Around line 265-273: The loop over btbEntries in microtage.cc performs no work
when btbEntries is empty; add an early exit by checking btbEntries.empty()
before the loop (e.g., at the start of the function or just before the for) and
return immediately to improve readability; locate the block that calls
generateSinglePrediction(btb_entry, startPC), writes to
meta->preds[btb_entry.pc], calls tageStats.updateStatsWithTagePrediction(pred,
true), and pushes into results and insert the empty-check there.
- Around line 91-97: The resize of tableTagBits is redundant and can mask
misconfiguration of p.TTagBitSizes; instead of silently resizing, validate that
p.TTagBitSizes contains entries for all predictors (compare
p.TTagBitSizes.size() to numPredictors) and if it doesn't, either assert/throw a
clear configuration error or fallback explicitly with logged intent; remove the
tableTagBits.resize(numPredictors) call and replace it with a check that sets
tableTagBits from p.TTagBitSizes (or fails fast), leaving the existing
tablePcShifts resize behavior as-is.
- Around line 628-632: Add a stat counter increment when prediction metadata is
missing: in the block where predMeta is obtained (auto predMeta =
std::static_pointer_cast<TageMeta>(stream.predMetas[getComponentIdx()])), before
returning on the null check, increment a descriptive counter (e.g.,
predMetaMissingCount or nullPredMetaStat) defined on the containing class/module
(e.g., Microtage or associated stats struct) so occurrences are tracked in
production; keep the existing DPRINTF(UTAGE, ...) and return behavior unchanged
but ensure the new counter is declared/registered with the file's stat framework
and incremented each time predMeta is null.
- Around line 756-770: Add a defensive bounds check before computing mask from
tableTagBits[t]: avoid doing (1ULL << tableTagBits[t]) when tableTagBits[t] >=
64 by computing mask conditionally (e.g., if tableTagBits[t] >= 64 set mask =
~0ULL; else mask = (1ULL << tableTagBits[t]) - 1). Update the block that
currently computes Addr mask = (1ULL << tableTagBits[t]) - 1; (in the microtage
tag/XOR construction using tableTagBits[t], pcBits, foldedHist, altFoldedHist,
bankBaseShift, position) to use this guarded computation so shifts cannot be
invoked with >=64.

In `@src/cpu/pred/btb/microtage.hh`:
- Around line 163-165: Add a brief comment above the `#ifndef` UNIT_TEST guard in
microtage.hh explaining that the conditional exposes internal members for unit
testing; mention that the guard toggles visibility of the subsequent private:
section so test code can access internal methods/state while production builds
keep them private (reference the UNIT_TEST macro and the private: label). Keep
the comment concise and near the guard so future maintainers understand the
intent.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: b05f35f2-1ad6-4647-9523-5ff4bd41cc64

📥 Commits

Reviewing files that changed from the base of the PR and between 318c57c and 9e6be62.

📒 Files selected for processing (7)
  • configs/example/idealkmhv3.py
  • configs/example/kmhv3.py
  • src/cpu/pred/BranchPredictor.py
  • src/cpu/pred/SConscript
  • src/cpu/pred/btb/decoupled_bpred.hh
  • src/cpu/pred/btb/microtage.cc
  • src/cpu/pred/btb/microtage.hh

@github-actions
Copy link
Copy Markdown

github-actions bot commented Mar 6, 2026

🚀 Coremark Smoke Test Results

Branch IPC Change
Base (xs-dev) 2.2203 -
This PR 2.2219 📈 +0.0016 (+0.07%)

✅ Difftest smoke test passed!

@XiangShanRobot
Copy link
Copy Markdown

[Generated by GEM5 Performance Robot]
commit: 9e6be62
workflow: gem5 Align BTB Performance Test(0.3c)

Align BTB Performance

Overall Score

PR Master Diff(%)
Score 17.91 17.84 +0.36 🟢

@CJ362ff CJ362ff requested a review from jensen-yan March 10, 2026 02:10
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants