Skip to content
Open
Show file tree
Hide file tree
Changes from 1 commit
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
7 changes: 6 additions & 1 deletion src/cpu/o3/FuncScheduler.py
Original file line number Diff line number Diff line change
Expand Up @@ -75,6 +75,11 @@ class PAgeSelector(BaseSelector):

piece = Param.Int(2, "number of instructions in a group")

class SMTBasedSelector(BaseSelector):
type = 'SMTBasedSelector'
cxx_class = 'gem5::o3::SMTBasedSelector'
cxx_header = "cpu/o3/issue_queue.hh"

class IssueQue(SimObject):
type = 'IssueQue'
cxx_class = 'gem5::o3::IssueQue'
Expand All @@ -85,7 +90,7 @@ class IssueQue(SimObject):
inports = Param.Int(2, "")
scheduleToExecDelay = Param.Cycles(2, "")
oports = VectorParam.IssuePort("")
sel = Param.BaseSelector(BaseSelector(), "Selector for this IQ (default: age first)")
sel = Param.BaseSelector(SMTBasedSelector(), "Selector for this IQ (default: age first)")
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Update the parameter description to match the new default selector.

At Line 93, the default is SMTBasedSelector() but the text still says “default: age first”.

Proposed fix
-    sel = Param.BaseSelector(SMTBasedSelector(), "Selector for this IQ (default: age first)")
+    sel = Param.BaseSelector(SMTBasedSelector(), "Selector for this IQ (default: SMT-based)")
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
sel = Param.BaseSelector(SMTBasedSelector(), "Selector for this IQ (default: age first)")
sel = Param.BaseSelector(SMTBasedSelector(), "Selector for this IQ (default: SMT-based)")
🧰 Tools
🪛 Ruff (0.15.5)

[error] 93-93: Param may be undefined, or defined from star imports

(F405)

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/cpu/o3/FuncScheduler.py` at line 93, Update the parameter description for
the selector parameter so it reflects the actual default: change the help text
passed to Param.BaseSelector for the variable sel (which is currently
constructed with SMTBasedSelector()) to mention "default: SMT-based" (or similar
accurate wording) instead of "default: age first"; ensure the description in
Param.BaseSelector(SMTBasedSelector(), "Selector for this IQ (...)") references
SMTBasedSelector as the default.


class Scheduler(SimObject):
type = 'Scheduler'
Expand Down
2 changes: 1 addition & 1 deletion src/cpu/o3/SConscript
Original file line number Diff line number Diff line change
Expand Up @@ -32,7 +32,7 @@ Import('*')

if env['CONF']['TARGET_ISA'] != 'null':
SimObject('FuncScheduler.py', sim_objects=['FUPool', 'SpecWakeupChannel',
'IssuePort', 'IssueQue', 'BaseSelector', 'PAgeSelector', 'Scheduler'])
'IssuePort', 'IssueQue', 'BaseSelector', 'PAgeSelector', 'SMTBasedSelector', 'Scheduler'])
SimObject('FuncUnitConfig.py', sim_objects=[])
SimObject('BaseO3CPU.py', sim_objects=['BaseO3CPU'], enums=[
'SMTFetchPolicy', 'SMTQueuePolicy', 'CommitPolicy', 'ROBWalkPolicy', 'ROBCompressPolicy', 'PerfRecord'])
Expand Down
10 changes: 10 additions & 0 deletions src/cpu/o3/comm.hh
Original file line number Diff line number Diff line change
Expand Up @@ -168,6 +168,12 @@ struct IssueStruct
DynInstPtr insts[MaxWidth];
};

struct SquashInfo
{
InstSeqNum squashSn;
ThreadID squashTid;
};

struct SquashVersion
{
uint8_t version;
Expand Down Expand Up @@ -246,6 +252,10 @@ struct TimeStruct
};
/** Resolved control-flow PCs produced this cycle (fetch buffers/merges). */
std::vector<ResolvedCFIEntry> resolvedCFIs; // *F

unsigned iqCount;
unsigned ldstqCount;
unsigned robCount;
};

IewComm iewInfo[MaxThreads]; // iew to rename, fetch
Expand Down
65 changes: 64 additions & 1 deletion src/cpu/o3/fetch.cc
Original file line number Diff line number Diff line change
Expand Up @@ -148,6 +148,8 @@ Fetch::Fetch(CPU *_cpu, const BaseO3CPUParams &params)
threads[tid].data = new uint8_t[fetchBufferSize];
}

initDecodeScheduler();
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

The decode scheduler policy is still effectively hardcoded.

Fetch::Fetch() calls initDecodeScheduler() immediately, but this class never assigns smtDecodePolicy from BaseO3CPUParams before that happens. As written, every instance takes the hardcoded "multi_priority" branch, so the icount/delayed paths are unreachable until the policy and delay are wired from params.

Also applies to: 393-406

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/cpu/o3/fetch.cc` at line 150, Fetch::Fetch() calls initDecodeScheduler()
before reading smtDecodePolicy and its delay from BaseO3CPUParams, so the
scheduler always uses the hardcoded "multi_priority" branch and never reaches
the icount/delayed paths; fix by initializing smtDecodePolicy and the
decode-delay field from the CPU params (BaseO3CPUParams) in the Fetch
constructor (and any other place that calls initDecodeScheduler()) before
calling initDecodeScheduler(), ensuring the policy string and delay value from
params are assigned to the Fetch instance so initDecodeScheduler() can pick the
correct branch (icount/delayed/etc.).


// Get the size of an instruction.
// stallReason size should be the same as decodeWidth,renameWidth,dispWidth
stallReason.resize(decodeWidth, StallReason::NoStall);
Expand Down Expand Up @@ -372,6 +374,41 @@ Fetch::setTimeBuffer(TimeBuffer<TimeStruct> *time_buffer)
fromCommit = timeBuffer->getWire(-commitToFetchDelay);
}

void
Fetch::initDecodeScheduler()
{
// Initialize counters (same as before)
lsqCounter = new InstsCounter();
iqCounter = new InstsCounter();
robCounter = new InstsCounter();
DPRINTF(Fetch, "Initialized SMT Decode Scheduler: 0\n");

for (ThreadID tid = 0; tid < numThreads; tid++)
{
lsqCounter->setCounter(tid, 0);
iqCounter->setCounter(tid, 0);
robCounter->setCounter(tid, 0);
}
DPRINTF(Fetch, "Initialized SMT Decode Scheduler: 1\n");

if (smtDecodePolicy == "icount") {
// Use ROB as default counter for icount
decodeScheduler = new ICountScheduler(numThreads, robCounter);
}
else if (smtDecodePolicy == "delayed") {
decodeScheduler = new DelayedICountScheduler(numThreads, robCounter, delayedSchedulerDelay);
}
else if (smtDecodePolicy == "multi_priority") {
decodeScheduler = new MultiPrioritySched(numThreads, {lsqCounter, iqCounter, robCounter});
}
else {
// Default: round-robin like (use delayed with thread cycling)
decodeScheduler = new DelayedICountScheduler(numThreads, robCounter, numThreads);
}

DPRINTF(Fetch, "Initialized SMT Decode Scheduler: %s\n", smtDecodePolicy.c_str());
}
Comment on lines +396 to +429
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Memory leak: dynamically allocated objects are never freed.

The counters and scheduler are allocated with new but there's no corresponding destructor or cleanup to delete them. This will leak memory when the Fetch object is destroyed.

🛠️ Suggested fix: store as unique_ptr or delete in destructor

Consider using smart pointers for automatic memory management:

+// In fetch.hh, change member declarations:
+std::unique_ptr<InstsCounter> lsqCounter;
+std::unique_ptr<InstsCounter> iqCounter;
+std::unique_ptr<InstsCounter> robCounter;
+std::unique_ptr<SMTScheduler> decodeScheduler;

// In initDecodeScheduler():
-    lsqCounter = new InstsCounter();
-    iqCounter  = new InstsCounter();
-    robCounter = new InstsCounter();
+    lsqCounter = std::make_unique<InstsCounter>();
+    iqCounter  = std::make_unique<InstsCounter>();
+    robCounter = std::make_unique<InstsCounter>();

Or alternatively, add cleanup in the destructor.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/cpu/o3/fetch.cc` around lines 377 - 410, The init allocates lsqCounter,
iqCounter, robCounter and decodeScheduler with new (in
Fetch::initDecodeScheduler) but they are never freed; update Fetch to manage
lifetime by either converting these members to smart pointers (e.g.,
std::unique_ptr for lsqCounter, iqCounter, robCounter, and decodeScheduler) or
by deleting them in the Fetch destructor (implement ~Fetch() to delete
lsqCounter, iqCounter, robCounter and decodeScheduler and null them). Ensure
constructors/assignments match the chosen approach (replace raw new with
make_unique when using unique_ptr or keep new if adding deletes) and update any
code that assumes raw pointers accordingly.


void
Fetch::setActiveThreads(std::list<ThreadID> *at_ptr)
{
Expand Down Expand Up @@ -1285,6 +1322,32 @@ Fetch::handleInterrupts()
}
}

ThreadID
Fetch::selectUnstalledThread()
{

// if (numThreads == 1) {
// return 0;
// }
for (ThreadID tid = 0; tid < numThreads; ++tid) {
if (!stallSig->blockFetch[tid]) {
lsqCounter->setCounter(tid, fromIEW->iewInfo[tid].ldstqCount);
iqCounter->setCounter(tid, fromIEW->iewInfo[tid].iqCount);
robCounter->setCounter(tid, fromIEW->iewInfo[tid].robCount);

} else {
lsqCounter->setCounter(tid, UINT64_MAX);
iqCounter->setCounter(tid, UINT64_MAX);
robCounter->setCounter(tid, UINT64_MAX);

}
DPRINTF(Fetch, "lsqCounter->setCounter: %d iqCounter->setCounter: %d robCounter->setCounter: %d\n",fromIEW->iewInfo[tid].ldstqCount,fromIEW->iewInfo[tid].iqCount,fromIEW->iewInfo[tid].robCount);
}

ThreadID selected = decodeScheduler->getThread();
return selected;
}
Comment on lines +1387 to +1411
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Avoid selecting threads with empty fetch queues for decode handoff.

The selector currently considers only stallSig->blockFetch. If the chosen thread has an empty fetchQueue, Lines 1397-1408 send nothing even when another unstalled thread has ready instructions.

Proposed fix
 ThreadID
 Fetch::selectUnstalledThread()
 {
+    bool hasCandidate = false;
     for (ThreadID tid = 0; tid < numThreads; ++tid) {
-        if (!stallSig->blockFetch[tid]) {
+        if (!stallSig->blockFetch[tid] && !fetchQueue[tid].empty()) {
+            hasCandidate = true;
             lsqCounter->setCounter(tid, fromIEW->iewInfo[tid].ldstqCount);
             iqCounter->setCounter(tid, fromIEW->iewInfo[tid].iqCount);
             robCounter->setCounter(tid, fromIEW->iewInfo[tid].robCount);
         } else {
             lsqCounter->setCounter(tid, UINT64_MAX);
             iqCounter->setCounter(tid, UINT64_MAX);
             robCounter->setCounter(tid, UINT64_MAX);
         }
     }
+    if (!hasCandidate) {
+        return InvalidThreadID;
+    }
 
     ThreadID selected = decodeScheduler->getThread();
     return selected;
 }
-    ThreadID tid =selectUnstalledThread();
+    ThreadID tid = selectUnstalledThread();
+    if (tid == InvalidThreadID) {
+        toDecode->fetchStallReason = stallReason;
+        return;
+    }

Also applies to: 1387-1408

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/cpu/o3/fetch.cc` around lines 1325 - 1349, The selector
Fetch::selectUnstalledThread currently only checks stallSig->blockFetch but may
return a thread whose fetchQueue is empty; update Fetch::selectUnstalledThread
to exclude threads with empty fetchQueue (e.g., fetchQueue[tid].empty() or
equivalent API) when computing candidates and when calling
decodeScheduler->getThread so decode handoff never selects an
unstalled-but-empty thread; ensure the lsqCounter/iqCounter/robCounter updates
remain correct for excluded threads (still set to UINT64_MAX) and update any
logic that relies on selected thread to loop to another ready thread if
decodeScheduler returns an empty-thread id.


void
Fetch::sendInstructionsToDecode()
{
Expand Down Expand Up @@ -1321,7 +1384,7 @@ Fetch::sendInstructionsToDecode()
return;
}

ThreadID tid = 0; // TODO: smt support
ThreadID tid =selectUnstalledThread();

// fetch totally stalled
if (stallSig->blockFetch[tid]) {
Expand Down
19 changes: 19 additions & 0 deletions src/cpu/o3/fetch.hh
Original file line number Diff line number Diff line change
Expand Up @@ -65,6 +65,7 @@
#include "mem/port.hh"
#include "sim/eventq.hh"
#include "sim/probe/probe.hh"
#include "cpu/o3/smt_sched.hh"

namespace gem5
{
Expand Down Expand Up @@ -233,6 +234,18 @@ class Fetch
/** To probe when a fetch request is successfully sent. */
ProbePointArg<RequestPtr> *ppFetchRequestSent;

// SMT Decode Scheduler
SMTScheduler* decodeScheduler;

// Counters from backend structures (to be passed in)
InstsCounter* lsqCounter;
InstsCounter* iqCounter;
InstsCounter* robCounter;

// Configuration parameters
std::string smtDecodePolicy ="multi_priority";
int delayedSchedulerDelay;
Comment on lines +237 to +247
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
set -euo pipefail

rg -n -C4 'Fetch::Fetch\s*\(|Fetch::~Fetch\s*\(' src/cpu/o3/fetch.cc

Repository: OpenXiangShan/GEM5

Length of output: 562


🏁 Script executed:

#!/bin/bash
set -euo pipefail

# Get the full constructor from line 89 until the closing brace
sed -n '89,170p' src/cpu/o3/fetch.cc

Repository: OpenXiangShan/GEM5

Length of output: 2988


🏁 Script executed:

#!/bin/bash
set -euo pipefail

# Find the initDecodeScheduler implementation
rg -n 'initDecodeScheduler' src/cpu/o3/fetch.cc -A 30

Repository: OpenXiangShan/GEM5

Length of output: 2212


🏁 Script executed:

#!/bin/bash
set -euo pipefail

# Search for where delayedSchedulerDelay is set/initialized
rg -n 'delayedSchedulerDelay\s*=' src/cpu/o3/fetch.cc

Repository: OpenXiangShan/GEM5

Length of output: 44


🏁 Script executed:

#!/bin/bash
set -euo pipefail

# Check the exact line in the header where delayedSchedulerDelay is declared
rg -n 'delayedSchedulerDelay' src/cpu/o3/fetch.hh -B 1 -A 1

Repository: OpenXiangShan/GEM5

Length of output: 156


delayedSchedulerDelay is uninitialized when used in initDecodeScheduler().

Line 399 of fetch.cc passes delayedSchedulerDelay to DelayedICountScheduler constructor, but this int is never initialized—not in the constructor, not in the class default, and not before use. The pointers (decodeScheduler, lsqCounter, iqCounter, robCounter) are initialized within initDecodeScheduler() on all code paths, but this member remains indeterminate. Add an in-class default or initialize it in the constructor:

Suggested fix
-    int delayedSchedulerDelay;
+    int delayedSchedulerDelay = 0;
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
// SMT Decode Scheduler
SMTScheduler* decodeScheduler;
// Counters from backend structures (to be passed in)
InstsCounter* lsqCounter;
InstsCounter* iqCounter;
InstsCounter* robCounter;
// Configuration parameters
std::string smtDecodePolicy ="multi_priority";
int delayedSchedulerDelay;
// SMT Decode Scheduler
SMTScheduler* decodeScheduler;
// Counters from backend structures (to be passed in)
InstsCounter* lsqCounter;
InstsCounter* iqCounter;
InstsCounter* robCounter;
// Configuration parameters
std::string smtDecodePolicy ="multi_priority";
int delayedSchedulerDelay = 0;
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/cpu/o3/fetch.hh` around lines 237 - 247, The member delayedSchedulerDelay
is used uninitialized when passed into DelayedICountScheduler in
initDecodeScheduler(); initialize it to a safe default either by adding an
in-class initializer (e.g., int delayedSchedulerDelay = 0;) or by setting it in
the class constructor for the fetch unit so delayedSchedulerDelay has a
well-defined value before initDecodeScheduler() runs; update the declaration in
fetch.hh or the fetch class constructor accordingly and ensure any configuration
parsing assigns it prior to creating DelayedICountScheduler.


public:
/** Fetch constructor. */
Fetch(CPU *_cpu, const BaseO3CPUParams &params);
Expand Down Expand Up @@ -299,6 +312,12 @@ class Fetch

/** For priority-based fetch policies, need to keep update priorityList */
void deactivateThread(ThreadID tid);

// Function to initialize scheduler
void initDecodeScheduler();

// Select a thread that is not fetch-blocked, using scheduler
ThreadID selectUnstalledThread();
private:
/** Reset this pipeline stage */
void resetStage();
Expand Down
5 changes: 4 additions & 1 deletion src/cpu/o3/iew.cc
Original file line number Diff line number Diff line change
Expand Up @@ -525,7 +525,7 @@ IEW::squash(ThreadID tid)

for (auto& dp : dispQue) {
for (auto& it : dp) {
if (it->seqNum > fromCommit->commitInfo[tid].doneSeqNum) {
if (it->seqNum > fromCommit->commitInfo[tid].doneSeqNum && (it->threadNumber == tid)) {
it->setSquashed();
}
}
Expand Down Expand Up @@ -1533,6 +1533,9 @@ IEW::executeInsts()
ThreadID tid = *activeThreads->begin();
toFetch->iewInfo[tid].resolvedCFIs.clear();

toFetch->iewInfo[tid].ldstqCount=ldstQueue.getCount(tid);
toFetch->iewInfo[tid].robCount= rob->getThreadEntries(tid);
toFetch->iewInfo[tid].iqCount= scheduler->getIQInsts(tid);
// Execute/writeback any instructions that are available.
int insts_to_execute = fromIssue->size;
fromIssue->size = 0;
Expand Down
9 changes: 6 additions & 3 deletions src/cpu/o3/inst_queue.cc
Original file line number Diff line number Diff line change
Expand Up @@ -151,7 +151,8 @@ InstructionQueue::InstructionQueue(CPU *cpu_ptr, IEW *iew_ptr,
scheduler->setCPU(cpu_ptr, &iew_ptr->ldstQueue);
scheduler->resetDepGraph(numPhysRegs);
scheduler->setMemDepUnit(memDepUnit);

scheduler->initIQICountSmtScheduler(numThreads);

resetState();
}

Expand Down Expand Up @@ -1121,7 +1122,9 @@ InstructionQueue::doSquash(ThreadID tid)

DPRINTF(IQ, "[tid:%i] Squashing until sequence number %i!\n",
tid, squashedSeqNum[tid]);
scheduler->doSquash(squashedSeqNum[tid]);
squashInfo.squashTid = tid;
squashInfo.squashSn = squashedSeqNum[tid];
scheduler->doSquash(squashInfo);

for (auto it = mdpAddrReplayLdInsts.begin(); it != mdpAddrReplayLdInsts.end();) {
if (!it->inst ||
Expand All @@ -1134,7 +1137,7 @@ InstructionQueue::doSquash(ThreadID tid)
}

for (auto it = nonSpecInsts.begin(); it != nonSpecInsts.end();) {
if (it->first > squashedSeqNum[tid]) {
if (it->first > squashedSeqNum[tid] && (it->second->threadNumber == tid)) {
auto& squashed_inst = it->second;
if (!squashed_inst->isIssued() ||
(squashed_inst->isMemRef() &&
Expand Down
1 change: 1 addition & 0 deletions src/cpu/o3/inst_queue.hh
Original file line number Diff line number Diff line change
Expand Up @@ -427,6 +427,7 @@ class InstructionQueue

/** The sequence number of the squashed instruction. */
InstSeqNum squashedSeqNum[MaxThreads];
SquashInfo squashInfo;

struct IQStats : public statistics::Group
{
Expand Down
Loading
Loading