Skip to content

Tage1 align#615

Merged
jensen-yan merged 6 commits intoxs-devfrom
tage1-align
Nov 28, 2025
Merged

Tage1 align#615
jensen-yan merged 6 commits intoxs-devfrom
tage1-align

Conversation

@jensen-yan
Copy link
Copy Markdown
Collaborator

@jensen-yan jensen-yan commented Nov 25, 2025

Summary by CodeRabbit

  • New Features

    • Added bank-conflict detection and simulation for the BTB/TAGE predictor
    • Per-bank statistics and tracking for monitoring bank accesses and conflicts
  • Configuration

    • Added numBanks (default 4) and enableBankConflict toggle
    • Increased tag size for TAGE entries (from 11 to 13 bits)
  • Tests

    • Added and enhanced tests for bank-conflict scenarios and position-aware tagging
  • Documentation

    • New implementation plan doc describing bank-partitioned TAGE design and validation

✏️ Tip: You can customize this high-level summary in your review settings.

now use position in 64byte, not sure use pos in 32byte?

Change-Id: Ie7f900e62d78859062c52f847ef6acf1c1392c1c
Change-Id: Iadf61c6e576840640a679c4136ce2c318129b6c2
Add enableBankConflict parameter to control bank conflict simulation:
- Default: enabled for production, disabled for unit tests
- Prevents existing unit tests from breaking

Add fine-grained per-bank statistics (non-UNIT_TEST only):
- updateBankConflictPerBank: conflict count per bank
- updateAccessPerBank: update access count per bank
- predAccessPerBank: prediction access count per bank

Add comprehensive bank conflict unit test:
- Test 1: Same bank conflict detection (enabled)
- Test 2: Different bank no conflict
- Test 3: Disabled flag prevents conflict

Fix existing SetAssociativeConflictHandling test:
- Bug: entries with different positions used same tag
- Root cause: tag calculation includes position XOR (aligned with RTL in 3a23ab4)
- Fix: calculate separate tags for each position

Change-Id: I18f666274a77b11f8cc28f9ed5c0db2806e81d19
about 0.3 score
waiting for better implement

Change-Id: Iba0f6e51e6a364fa65b2b794539b740aade0fe28
@coderabbitai
Copy link
Copy Markdown

coderabbitai bot commented Nov 25, 2025

Pre-merge checks and finishing touches

❌ Failed checks (1 warning, 1 inconclusive)
Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 76.47% which is insufficient. The required threshold is 80.00%. You can run @coderabbitai generate docstrings to improve docstring coverage.
Title check ❓ Inconclusive The title 'Tage1 align' is vague and generic, using non-descriptive terminology that doesn't convey meaningful information about the changeset. Replace with a more descriptive title that clearly explains the main change, such as 'Add bank-aware TAGE implementation with bank conflict simulation' or 'Implement bank-partitioned TAGE predictor for GEM5'.
✅ Passed checks (1 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
✨ Finishing touches
  • 📝 Generate docstrings
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch tage1-align

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (4)
src/cpu/pred/btb/btb_tage.hh (1)

347-355: Consider initializing scalar statistics in UNIT_TEST mode.

The per-bank vector statistics (updateBankConflictPerBank, updateAccessPerBank, predAccessPerBank) are guarded by #ifndef UNIT_TEST, but the scalar stats (updateBankConflict, updateDroppedDueToConflict) are always present. This is fine, but note that in unit test mode, these scalars are raw uint64_t and may not be zero-initialized by memset in all cases if the struct has non-trivial members.

docs/Gem5_Docs/frontend/TAGE_BANK_IMPLEMENTATION_PLAN.md (1)

25-30: Add language specifiers to fenced code blocks.

Several code blocks lack language specifiers which affects syntax highlighting and triggers markdown lint warnings. Based on static analysis hints.

-```
-blockSize = 32B (2^5)
+```text
+blockSize = 32B (2^5)

Similarly for the other blocks at lines 34, 351, and 362 - add appropriate language identifiers (text, cpp, or leave as text for ASCII diagrams).

Also applies to: 34-40, 351-356, 362-367

src/cpu/pred/btb/btb_tage.cc (1)

662-695: Bank conflict detection logic is correct but could benefit from a minor clarification.

The conflict detection logic correctly handles:

  1. Enabled flag check
  2. Valid prediction state check
  3. Same-bank collision detection
  4. State cleanup after conflict or consumption

One observation: when enableBankConflict is false, predBankValid is never cleared in the update path. This means predBankValid will remain true after putPCHistory even when conflicts are disabled. This is benign since the flag is only checked when enableBankConflict is true, but for cleanliness, consider always clearing it at end of update regardless of conflict mode.

     // ========== Normal Update Logic ==========
+    // Clear prediction state regardless of conflict mode
+    predBankValid = false;
+
     // Prepare BTB entries to update
src/cpu/pred/btb/test/btb_tage.test.cc (1)

882-884: Consider cleaning up dynamically allocated bankTage to avoid memory leak.

The test allocates bankTage with new but doesn't delete it. While this is minor in test code, adding cleanup would be good hygiene.

+    delete bankTage;
 }
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between ff26e96 and 49168e2.

📒 Files selected for processing (5)
  • docs/Gem5_Docs/frontend/TAGE_BANK_IMPLEMENTATION_PLAN.md (1 hunks)
  • src/cpu/pred/BranchPredictor.py (2 hunks)
  • src/cpu/pred/btb/btb_tage.cc (17 hunks)
  • src/cpu/pred/btb/btb_tage.hh (6 hunks)
  • src/cpu/pred/btb/test/btb_tage.test.cc (2 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
src/cpu/pred/btb/btb_tage.cc (1)
src/cpu/pred/btb/btb_tage.hh (16)
  • BTBTAGE (49-49)
  • BTBTAGE (114-114)
  • BTBTAGE (116-116)
  • alignedPC (168-169)
  • alignedPC (199-199)
  • alignedPC (417-424)
  • pc (172-172)
  • pc (175-175)
  • pc (179-179)
  • pc (183-183)
  • pc (186-188)
  • pc (186-186)
  • pc (191-191)
  • pc (194-194)
  • pc (287-287)
  • TageStats (368-368)
🪛 markdownlint-cli2 (0.18.1)
docs/Gem5_Docs/frontend/TAGE_BANK_IMPLEMENTATION_PLAN.md

25-25: Fenced code blocks should have a language specified

(MD040, fenced-code-language)


34-34: Fenced code blocks should have a language specified

(MD040, fenced-code-language)


351-351: Fenced code blocks should have a language specified

(MD040, fenced-code-language)


362-362: Fenced code blocks should have a language specified

(MD040, fenced-code-language)

🪛 Ruff (0.14.5)
src/cpu/pred/BranchPredictor.py

1047-1047: VectorParam may be undefined, or defined from star imports

(F405)


1059-1059: Param may be undefined, or defined from star imports

(F405)


1060-1060: Param may be undefined, or defined from star imports

(F405)

🔇 Additional comments (14)
src/cpu/pred/btb/btb_tage.hh (2)

46-52: LGTM: Bank configuration and state members well-designed.

The bank-related configuration (numBanks, bankIdWidth, bankIdShift, indexShift) is properly marked const, and the tracking state (lastPredBankId, predBankValid) is appropriately mutable. The default numBanks = 4 aligns with the implementation plan.


178-183: LGTM: Position parameter extension maintains backward compatibility.

The position parameter with default value 0 ensures existing callers continue to work while enabling position-aware tagging for bank-aware operation.

docs/Gem5_Docs/frontend/TAGE_BANK_IMPLEMENTATION_PLAN.md (1)

401-402: Documentation inconsistency: enableBankConflict parameter exists but not documented here.

The document states "当前实现总是启用bank冲突检测(无独立开关)" but the code actually implements enableBankConflict parameter (default False in BranchPredictor.py). Consider updating the documentation to reflect the actual implementation.

src/cpu/pred/btb/btb_tage.cc (7)

37-52: LGTM: Unit test constructor properly initializes bank configuration.

The bank-related fields are correctly computed and initialized:

  • bankIdWidth = ceilLog2(numBanks)
  • bankIdShift = floorLog2(blockSize)
  • indexShift = bankIdShift + bankIdWidth
  • enableBankConflict = false (appropriate for unit tests by default)

85-99: LGTM: Production constructor with appropriate warning.

The warning when updateOnRead is disabled is helpful for users, and all bank configuration is properly wired from parameters.


348-359: Per-bank prediction statistics not available in UNIT_TEST mode.

The predAccessPerBank increment is guarded by #ifndef UNIT_TEST, which means unit tests cannot verify per-bank prediction access counts. This is acceptable if the intent is to keep unit test statistics simple, but worth noting.


804-821: LGTM: Position-aware tag computation aligns with RTL.

The tag calculation correctly XORs PC bits, folded history, alt tag bits, and position. This matches the documented RTL behavior where tag = tempTag ^ cfiPosition.


829-845: LGTM: Bank-aware index calculation correctly skips bank bits.

The index computation using indexShift (which combines block offset and bank bits) ensures different banks index into separate PC space regions, avoiding bank aliasing as designed.


897-905: LGTM: getBankId implementation is correct.

The bank ID extraction (alignedPC >> bankIdShift) & ((1 << bankIdWidth) - 1) correctly extracts the bank bits from the aligned PC. For 32B blocks with 4 banks, this extracts pc[6:5] as documented.


1040-1044: Per-bank statistics vectors properly initialized.

The vectors are correctly sized to numBanks entries, matching the bank configuration.

src/cpu/pred/BranchPredictor.py (2)

1047-1047: Tag size increase from 11 to 13 bits is appropriate.

The increase from 11 to 13 bits accommodates the position XOR operation while maintaining tag uniqueness. This aligns with the position-aware tagging implementation.


1059-1060: LGTM: Bank configuration parameters with sensible defaults.

  • numBanks = 4 matches the RTL configuration
  • enableBankConflict = False by default allows backward-compatible behavior

The static analysis warnings (F405) are false positives - Param and VectorParam are legitimately imported via from m5.params import *, which is standard practice in gem5 SimObject definitions.

src/cpu/pred/btb/test/btb_tage.test.cc (2)

721-729: LGTM: Position-aware tag computation in tests.

The test correctly computes per-entry tags using position:

  • testTag1 uses position=0 for PC=0x1000
  • testTag2 uses position=2 for PC=0x1004 (calculated as (0x1004-0x1000)>>1 = 2)

This aligns with the updated getTageTag signature and ensures tags are unique per branch position within the block.


872-943: Comprehensive bank conflict test coverage.

The test effectively verifies three scenarios:

  1. Same bank conflict (enabled): Prediction on bank 1 (0x20), update on bank 1 (0xa0) → conflict detected
  2. Different bank (no conflict): Prediction on bank 0 (0x100), update on bank 2 (0x140) → no conflict
  3. Disabled flag: Same bank access with enableBankConflict=false → no conflict

The bank ID calculations are correct:

  • 0x20 → bits[6:5] = 01 → bank 1
  • 0xa0 → bits[6:5] = 01 → bank 1
  • 0x100 → bits[6:5] = 00 → bank 0
  • 0x140 → bits[6:5] = 10 → bank 2

Yakkhini
Yakkhini previously approved these changes Nov 25, 2025
@github-actions
Copy link
Copy Markdown

🚀 Coremark Smoke Test Results

Branch IPC Change
Base (xs-dev) 2.0524 -
This PR 2.0476 📉 -0.0047 (-0.23%)

✅ Difftest smoke test passed!

@XiangShanRobot
Copy link
Copy Markdown

[Generated by GEM5 Performance Robot]
commit: 49168e2
workflow: gem5 Ideal BTB Performance Test

Ideal BTB Performance

Overall Score

PR Master Diff(%)
Score 19.31 17.40 +10.94 🟢

[Generated by GEM5 Performance Robot]
commit: 49168e2
workflow: gem5 Ideal BTB Performance Test

Ideal BTB Performance

Overall Score

PR Previous Commit Diff(%)
Score 19.31 19.33 -0.14 🔴

@jensen-yan
Copy link
Copy Markdown
Collaborator Author

由于这个PR 修改了tage index 计算,会把2bit 拿去给bank id 计算,所以即便通过参数关闭bank 冲突模拟,还是会导致性能下降,待修复。

Change-Id: I27a1c24747ab22d96a2ab7fe637563215ca93a53
Copy link
Copy Markdown

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
src/cpu/pred/btb/btb_tage.cc (1)

805-821: Position XOR in tag computation may cause performance regression when bank conflict is disabled.

The position is unconditionally XORed into the tag at line 820. This changes how entries are stored and matched, even when enableBankConflict=false. According to the PR author's comment, this causes performance degradation when bank-conflict simulation is disabled.

Consider making the position XOR conditional:

 Addr
 BTBTAGE::getTageTag(Addr pc, int t, uint64_t foldedHist, uint64_t altFoldedHist, Addr position)
 {
     Addr mask = (1ULL << tableTagBits[t]) - 1;
     Addr pcBits = (pc >> floorLog2(blockSize)) & mask;
     Addr foldedBits = foldedHist & mask;
     Addr altTagBits = (altFoldedHist << 1) & mask;

-    // XOR all components together, including position (like RTL)
-    return pcBits ^ foldedBits ^ altTagBits ^ position;
+    // XOR all components together; include position only when bank simulation is enabled
+    Addr tag = pcBits ^ foldedBits ^ altTagBits;
+    if (enableBankConflict) {
+        tag ^= position;
+    }
+    return tag;
 }

This would preserve the original tag behavior when bank conflict simulation is disabled, while matching RTL behavior when enabled.

🧹 Nitpick comments (1)
docs/Gem5_Docs/frontend/TAGE_BANK_IMPLEMENTATION_PLAN.md (1)

25-30: Optional: Add language specifiers to fenced code blocks.

Several code blocks are missing language specifiers. Adding them improves readability and enables syntax highlighting.

Apply this pattern to the affected blocks:

-```
+```text
 blockSize = 32B (2^5)

For bash/shell examples at lines 371-384 and 432-446, use ```bash.

Also applies to: 34-40, 352-357, 363-368

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 49168e2 and 173395c.

📒 Files selected for processing (2)
  • docs/Gem5_Docs/frontend/TAGE_BANK_IMPLEMENTATION_PLAN.md (1 hunks)
  • src/cpu/pred/btb/btb_tage.cc (17 hunks)
🧰 Additional context used
🪛 markdownlint-cli2 (0.18.1)
docs/Gem5_Docs/frontend/TAGE_BANK_IMPLEMENTATION_PLAN.md

25-25: Fenced code blocks should have a language specified

(MD040, fenced-code-language)


34-34: Fenced code blocks should have a language specified

(MD040, fenced-code-language)


352-352: Fenced code blocks should have a language specified

(MD040, fenced-code-language)


363-363: Fenced code blocks should have a language specified

(MD040, fenced-code-language)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
  • GitHub Check: Quick Build, Unit Tests & Smoke Test
  • GitHub Check: perf_test / XS-GEM5 - Run performance test (spec06-0.3c)
🔇 Additional comments (10)
src/cpu/pred/btb/btb_tage.cc (10)

37-52: LGTM: UNIT_TEST constructor properly initializes bank-related fields.

The test constructor correctly initializes all new bank configuration fields (numBanks, bankIdWidth, bankIdShift, indexShift, enableBankConflict, lastPredBankId, predBankValid).


85-99: LGTM: Production constructor correctly wires bank configuration.

Bank parameters are properly initialized from the Python configuration, and the warning for updateOnRead=false is appropriate guidance.


215-226: Position-aware tagging is applied unconditionally.

The position XOR is now part of tag computation (see line 820), meaning entries are differentiated by their position within the block. This behavioral change affects tag matching regardless of whether enableBankConflict is enabled.

If the intent is to match RTL behavior only when bank simulation is enabled, consider guarding the position XOR:

-    // XOR all components together, including position (like RTL)
-    return pcBits ^ foldedBits ^ altTagBits ^ position;
+    // XOR all components together; include position only when bank simulation is enabled
+    if (enableBankConflict) {
+        return pcBits ^ foldedBits ^ altTagBits ^ position;
+    }
+    return pcBits ^ foldedBits ^ altTagBits;

Based on the PR author's comment that performance degrades even when bank-conflict simulation is disabled, this unconditional position XOR could be contributing to the issue.


348-359: LGTM: Prediction bank tracking is correctly implemented.

The prediction bank state is properly recorded for subsequent conflict detection in update(). Statistics are appropriately guarded for unit test builds.


541-554: Improved allocation policy avoids wasteful allocations.

The new logic correctly skips allocation when the provider's direction was correct despite the misprediction being caused by the use_alt decision. This prevents ping-pong effects between providers and gives weak providers time to train.


581-598: Allocation tagging is consistent with lookup tagging.

Position is correctly incorporated into the tag during allocation, matching the tag computation used in lookups.


662-696: Bank conflict detection is well-implemented and properly guarded.

The conflict detection:

  1. Only activates when enableBankConflict is true
  2. Correctly identifies same-bank conflicts between prediction and update
  3. Properly clears state after consumption
  4. Provides detailed debug logging

The implementation follows the documented "scheme 2" (clear at end of update).


835-848: Index computation correctly respects enableBankConflict flag.

The conditional shift at line 843 properly falls back to the original bankIdShift when bank conflict simulation is disabled. This pattern should be mirrored in getTageTag for the position XOR.


900-908: LGTM: Bank ID extraction is correct and efficient.

The bit extraction correctly isolates the bank ID bits from the aligned PC.


1011-1048: LGTM: Statistics properly extended for bank tracking.

The new per-bank statistics are correctly initialized with the appropriate number of banks, enabling detailed analysis of bank access patterns and conflicts.

@github-actions
Copy link
Copy Markdown

🚀 Coremark Smoke Test Results

Branch IPC Change
Base (xs-dev) 1.9489 -
This PR 1.9517 📈 +0.0028 (+0.14%)

✅ Difftest smoke test passed!

@jensen-yan
Copy link
Copy Markdown
Collaborator Author

由于这个PR 修改了tage index 计算,会把2bit 拿去给bank id 计算,所以即便通过参数关闭bank 冲突模拟,还是会导致性能下降,待修复。

已修复,分数和开启PR 最开始保持一致,都是19.33分,可以合并了

@jensen-yan jensen-yan requested a review from Yakkhini November 26, 2025 10:22
@XiangShanRobot
Copy link
Copy Markdown

[Generated by GEM5 Performance Robot]
commit: 173395c
workflow: gem5 Ideal BTB Performance Test

Ideal BTB Performance

Overall Score

PR Master Diff(%)
Score 19.33 17.40 +11.09 🟢

[Generated by GEM5 Performance Robot]
commit: 173395c
workflow: gem5 Ideal BTB Performance Test

Ideal BTB Performance

Overall Score

PR Previous Commit Diff(%)
Score 19.33 19.33 0.00

@jensen-yan jensen-yan merged commit 348f8f4 into xs-dev Nov 28, 2025
3 checks passed
@jensen-yan jensen-yan deleted the tage1-align branch November 28, 2025 02:28
This was referenced Dec 1, 2025
@coderabbitai coderabbitai bot mentioned this pull request Jan 6, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants