
cpu-o3: fix dispatch to align #674

Merged
tastynoob merged 1 commit into xs-dev from fix-disp
Dec 25, 2025

@tastynoob (Collaborator) commented Dec 24, 2025

Change-Id: If0a65e4ca0c6f6fead995e88977ca79e8cb7f97c

Summary by CodeRabbit

  • Refactor
    • Restructured CPU instruction queue scheduling architecture with optimized counter management and dispatch tracking.
    • Enhanced dispatch decision evaluation and resource allocation efficiency.
    • Improved per-class scheduling counter sharing and dispatch grouping mechanisms.


@tastynoob tastynoob requested a review from happy-lx December 24, 2025 10:13
@tastynoob tastynoob added the perf label Dec 24, 2025
@github-actions

🚀 Performance test triggered: spec06-0.8c

@coderabbitai

coderabbitai bot commented Dec 24, 2025

Walkthrough

The O3 CPU issue queue replaces its vector of per-OpClass counters (opNum) with per-class counter pointers (instNumClassify). IssuePort's mask field is renamed to opbits. The Scheduler adds dispatch-distance counters (totalDispCounter, accessed through dispOpdist) that are shared between OpClasses with equivalent dispatch tables.

Changes

| Cohort / File(s) | Summary |
|---|---|
| IssuePort struct refinements: src/cpu/o3/issue_queue.hh, src/cpu/o3/issue_queue.cc | Renamed mask member to opbits (type std::bitset<Num_OpClasses>); no functional change, purely semantic clarification. |
| IssueQue counter architecture: src/cpu/o3/issue_queue.hh, src/cpu/o3/issue_queue.cc | Removed vector-based opNum and introduced pointer-based instNumClassify (vector of uint8_t*); added portFuDescs (vector of bitsets per port); updated readyQmap value type to store pairs of (ReadyQue*, uint8_t*) enabling shared counter references. Counter increments changed from opNum[class]++ to (*instNumClassify[class])++. |
| Scheduler dispatch infrastructure: src/cpu/o3/issue_queue.hh, src/cpu/o3/issue_queue.cc | Renamed type alias DispPolicy to IQGroup; updated dispTable to use new type; added totalDispCounter (vector<uint8_t>) and dispOpdist (vector<uint8_t*>) to track per-opclass dispatch distances with counter reuse via reuse_table mapping. Dispatch grouping debug output introduced. |
| Dispatch scheduling logic: src/cpu/o3/issue_queue.cc | Refactored dispatch latency/busy checks to use scheduler->getCorrectedOpLat(inst) vs. derived busy_bit; replaced bit-shift logic. Lookahead/dispatch sequence calculation now uses modulo with dispOpdist per-class distance counters instead of local arrays. |
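
To illustrate the change from opNum[class]++ to (*instNumClassify[class])++, here is a minimal sketch of pointer-based per-class counters. Everything in it is hypothetical: the four-entry OpClass enum, ToyIssueQue, queueCounters, and the assumption that classes served by the same ready queue alias one counter. It is not the code from this PR.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for gem5's OpClass enum (the real one is much larger).
enum OpClass : std::size_t { IntAlu, IntMult, FloatAdd, MemRead, Num_OpClasses };

struct ToyIssueQue
{
    // Backing storage: one counter per ready queue (assumed layout).
    std::array<std::uint8_t, 2> queueCounters{};
    // Per-class view, nullptr by default. Classes that share a ready
    // queue point at the same counter.
    std::array<std::uint8_t*, Num_OpClasses> instNumClassify{};

    ToyIssueQue()
    {
        instNumClassify[IntAlu] = &queueCounters[0];
        instNumClassify[IntMult] = &queueCounters[0];  // shares with IntAlu
        instNumClassify[FloatAdd] = &queueCounters[1];
        // MemRead is not served by this queue and stays nullptr.
    }

    // Copying would leave the pointers aimed at the source object.
    ToyIssueQue(const ToyIssueQue&) = delete;
    ToyIssueQue& operator=(const ToyIssueQue&) = delete;

    void insert(OpClass op)
    {
        // Dispatch must only route classes this queue is configured for.
        assert(instNumClassify[op] != nullptr);
        (*instNumClassify[op])++;
    }

    unsigned count(OpClass op) const { return *instNumClassify[op]; }
};
```

After inserting one IntAlu and one IntMult instruction, count(IntAlu) and count(IntMult) both report 2, because the two classes increment the same underlying counter.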

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Suggested reviewers

  • happy-lx
  • jensen-yan

Poem

🐰 Counters dance through pointers now,
Dispatch distance vectors bow,
Shared and reused with clever care,
O3 scheduling springs through air!
Opbits twinkle, groupings align,
This refactored queue will shine! 🌟

Pre-merge checks and finishing touches

❌ Failed checks (1 warning, 1 inconclusive)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. | You can run @coderabbitai generate docstrings to improve docstring coverage. |
| Title check | ❓ Inconclusive | The title 'cpu-o3: fix dispatch to align' is vague and lacks specificity about what 'dispatch' problem is being fixed or what 'align' means in this context. | Consider a more descriptive title that clarifies the specific dispatch issue being addressed, such as 'cpu-o3: refactor dispatch counter tracking to support shared per-class state' or similar. |

✅ Passed checks (1 passed)

| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 2

🧹 Nitpick comments (1)
src/cpu/o3/issue_queue.cc (1)

902-934: LGTM: Dispatch counter allocation logic is correct.

The dispatch counter allocation correctly implements counter sharing:

  1. reserve() at line 905 prevents vector reallocation and pointer invalidation
  2. Counters are shared across OpClasses with identical dispatch tables
  3. Debug output (lines 928-934) aids verification
Optional: Add clarifying comment

Consider adding a comment explaining why reserve() is critical:

+    // Reserve capacity to prevent reallocation during counter allocation,
+    // which would invalidate pointers stored in dispOpdist
     totalDispCounter.reserve(Num_OpClasses);
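
The reserve-then-share pattern described in this comment can be sketched as follows. This is a simplified model under stated assumptions, not the PR's code: ToyScheduler, the int IQ ids inside IQGroup, and the std::map used as the reuse table are all hypothetical. Only the names totalDispCounter and dispOpdist come from the review.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <vector>

// Hypothetical model: an IQGroup is the list of IQ ids serving an OpClass.
using IQGroup = std::vector<int>;

struct ToyScheduler
{
    std::vector<std::uint8_t> totalDispCounter;  // counter storage
    std::vector<std::uint8_t*> dispOpdist;       // per-OpClass pointer

    explicit ToyScheduler(const std::vector<IQGroup>& dispTable)
        : dispOpdist(dispTable.size(), nullptr)
    {
        // At most one counter per OpClass is created, so reserving that
        // many slots means push_back never reallocates and the pointers
        // stored in dispOpdist stay valid.
        totalDispCounter.reserve(dispTable.size());

        // Assumed reuse table: identical groups map to one counter.
        std::map<IQGroup, std::uint8_t*> reuseTable;
        for (std::size_t op = 0; op < dispTable.size(); ++op) {
            if (dispTable[op].empty())
                continue;  // no IQ serves this class
            auto it = reuseTable.find(dispTable[op]);
            if (it == reuseTable.end()) {
                totalDispCounter.push_back(0);
                it = reuseTable.emplace(dispTable[op],
                                        &totalDispCounter.back()).first;
            }
            dispOpdist[op] = it->second;
        }
    }

    // Copying would leave dispOpdist pointing into the source object.
    ToyScheduler(const ToyScheduler&) = delete;
    ToyScheduler& operator=(const ToyScheduler&) = delete;
};
```

Without the reserve() call, a later push_back could move the vector's storage and leave every previously stored pointer dangling, which is the failure mode the suggested comment warns about.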
📜 Review details

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 8ef695e and 8bbbf01.

📒 Files selected for processing (2)
  • src/cpu/o3/issue_queue.cc
  • src/cpu/o3/issue_queue.hh
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
  • GitHub Check: Run SPEC Test / XS-GEM5 - Run performance test (spec06-0.8c)
  • GitHub Check: Quick Build, Unit Tests & Smoke Test
🔇 Additional comments (11)
src/cpu/o3/issue_queue.hh (5)

70-70: LGTM: Field rename improves clarity.

The rename from mask to opbits better conveys that this field tracks which operation classes are supported by this issue port.


115-115: LGTM: Added field for per-port OpClass tracking.

The portFuDescs field accumulates per-port functional unit and operation class combinations, which supports the dispatch optimization.


284-288: LGTM: Type alias rename improves clarity.

Renaming DispPolicy to IQGroup better conveys that this represents a group of issue queues for dispatch purposes.


143-144: Verify memory management for raw pointer field.

The new instNumClassify field holds raw pointers to per-OpClass counters. Ensure:

  1. All pointers are properly initialized before use
  2. Memory is correctly freed (verify destructor or cleanup method exists)
  3. No dangling pointers after cleanup

293-295: Pointer stability is already handled — no changes needed.

The code safely stores pointers to totalDispCounter elements via reserve() at line 905. Since reserve(Num_OpClasses) is called before any push_back() operations in the initialization loop (lines 921-922), the vector will not reallocate even after elements are added. This keeps all pointers in dispOpdist valid. Later operations like std::fill() (line 1072) only modify values without reallocating.

Likely an incorrect or invalid review comment.

src/cpu/o3/issue_queue.cc (6)

67-74: LGTM: Field rename propagated correctly.

The IssuePort constructor correctly uses opbits (renamed from mask) to track supported operation classes.


564-566: LGTM: Improved latency calculation with overflow protection.

The latency calculation now correctly handles cases where lat > 63 to prevent undefined behavior from excessive bit shifts. Setting busy_bit = -1 when lat > 63 effectively marks the port as busy for all future cycles, which is appropriate for very long-latency operations.
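
As a rough illustration of the shift guard described here, the sketch below assumes busy_bit is a 64-bit value with one bit set at position lat. That encoding and the helper name busyBitFor are assumptions; the review does not show the actual expression used in issue_queue.cc.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper, not taken from the PR.
inline std::uint64_t busyBitFor(unsigned lat)
{
    // Shifting a 64-bit value by 64 or more is undefined behaviour in
    // C++, so long latencies get a saturated value instead.
    if (lat > 63)
        return ~std::uint64_t{0};  // same bit pattern as assigning -1
    return std::uint64_t{1} << lat;
}
```

The all-ones result overlaps every cycle position in a 64-bit busy window, which matches the comment's reading that the port is treated as busy for very long-latency operations.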


686-691: Verify instNumClassify pointer is valid before dereferencing.

Line 689 dereferences instNumClassify[inst->opClass()] without a null check. This assumes the pointer was properly initialized for this OpClass. As noted in the review of lines 212-215, ensure all OpClasses handled by this IssueQue have valid counter pointers.


1066-1087: LGTM: Lookahead correctly uses shared dispatch counters.

The lookahead function properly:

  1. Resets shared counters (line 1072)
  2. Increments per-OpClass counters via dispOpdist (line 1083)
  3. Computes dispatch sequence based on counter modulo IQ group size (line 1082)

This enables balanced dispatch across issue queues in each group.
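
The three steps above can be sketched as a small round-robin routine. DispatchState, groupSize, and the lookahead signature are hypothetical simplifications; only totalDispCounter and dispOpdist are names from the review, and the real function operates on instructions rather than bare OpClass indices.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

struct DispatchState
{
    std::vector<std::uint8_t> totalDispCounter;  // shared counters
    std::vector<std::uint8_t*> dispOpdist;       // per-OpClass view
    std::vector<std::size_t> groupSize;          // IQs in each class's group
};

// Returns, for each instruction's OpClass index, the position inside its
// IQ group that the instruction would be steered to.
std::vector<std::size_t>
lookahead(DispatchState& st, const std::vector<std::size_t>& ops)
{
    // Step 1: reset the shared counters for this dispatch group.
    std::fill(st.totalDispCounter.begin(), st.totalDispCounter.end(), 0);

    std::vector<std::size_t> seq;
    seq.reserve(ops.size());
    for (std::size_t op : ops) {
        assert(st.dispOpdist[op] != nullptr && st.groupSize[op] > 0);
        std::uint8_t& dist = *st.dispOpdist[op];
        // Step 3: sequence is the counter modulo the group size.
        seq.push_back(dist % st.groupSize[op]);
        // Step 2: bump the counter through the per-class pointer.
        ++dist;
    }
    return seq;
}
```

Because classes with the same IQ group share one counter, two consecutive instructions of different classes in that group still land on different positions, which is what spreads the load across the group.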


833-840: The pointer dereferences at lines 837-838 are safe. The code design ensures that only IssueQueues supporting a given OpClass are added to dispTable[opClass] (line 864), and only IssueQueues supporting an OpClass initialize instNumClassify[opClass] with a valid pointer (line 286). Since disp_policy is instantiated with the same OpClass used to retrieve IssueQueues from dispTable (line 1076), all IssueQueues passed to the comparator must have valid pointers for disp_op. No null pointer dereference risk exists.

Likely an incorrect or invalid review comment.


212-215: Dispatch-side validation prevents unconfigured OpClasses from reaching this queue.

The design initializes instNumClassify with nullptr (line 212) and only assigns valid counters for OpClasses in this IQ's configured FUs (line 286). However, the actual runtime safety depends on the Scheduler's dispatch routing: the dispTable ensures instructions are routed only to IQs configured to handle them, with validation assert(!iqs.empty()) (line 1096) preventing dispatch of unhandled OpClasses. This design is correct as long as dispatch-side routing validation holds.

@github-actions

🚀 Coremark Smoke Test Results

| Branch | IPC | Change |
|---|---|---|
| Base (xs-dev) | 2.0659 | - |
| This PR | 2.1005 | 📈 +0.0346 (+1.67%) |

✅ Difftest smoke test passed!

@XiangShanRobot

[Generated by GEM5 Performance Robot]
commit: 8bbbf01
workflow: On-Demand SPEC Test (Tier 1.5)

Ideal BTB Performance

Overall Score

| | PR | Master | Diff(%) |
|---|---|---|---|
| Score | 19.94 | 19.91 | +0.14 🟢 |

@tastynoob tastynoob merged commit 4f70d27 into xs-dev Dec 25, 2025
4 checks passed
@tastynoob tastynoob deleted the fix-disp branch December 25, 2025 02:57
@coderabbitai coderabbitai bot mentioned this pull request Mar 17, 2026