Commit d5e0972

jensen-yan and Yakkhini authored
cpu-o3: simplify fetch, only support decoupled BTB mode (#721)
* cpu-o3: simplify fetch, only support decoupled BTB mode

  Change-Id: If39d11a07088d4f9237d5bf6c30a9dd8f53a1436

* cpu-o3: delete isDecoupledFrontend()

  Change-Id: Ibac84ee677f597f90bc7ff70a0ff6a1b1d108586

* cpu-o3: fetch: move updateBranchPredictors to tick() start

  The BPU can now be updated first so that it supplies fetch, and fetch then consumes that FTQ entry.

  Change-Id: I922b899d7626e010b5f3589b39ad4c6cec6f7c2b

* cpu-o3: fetch: split usedUpFetchTargets

  This update separates fetch state management more clearly by replacing the `usedUpFetchTargets` flag with `needFtqSupply` and `exhaustedFtqEntry`. The new flags make the FTQ supply logic explicit, so that fetch handles the availability of fetch targets correctly. The `trySupplyFtq` method is added to centralize FTQ supply attempts.

  Change-Id: Ie17a13ebf64aea7bd68f3c4c50917b3e3c9b830a

* cpu-o3: test remove ftq code

  I think we can delete this; run SPEC06 to check.

  Change-Id: If21eca420d21d8ca038dc1065e144fe346a23131

* cpu-o3: refactor fetch state management and FTQ handling

  1. Remove the `needFtqSupply` and `exhaustedFtqEntry` flags from the Fetch class, simplifying fetch state management.
  2. Streamline the FTQ supply logic so that fetch checks the availability of fetch targets directly, without maintaining additional state.
  3. Remove the `trySupplyFtq` method, as fetch now operates in a head-driven FIFO manner.

  This change improves the clarity and maintainability of the fetch logic.

  Change-Id: I7a8b312785a7cce526192de7125cea458b979c0e

* cpu-o3: fetch ftq rename some functions

  1. Reset the instruction count for FTQ entries in multiple locations to ensure accurate tracking of fetched instructions.
  2. Replace the deprecated `updateBranchPredictors` method with direct calls to `dbpbtb->tick()`.
  3. Refactor the fetch logic to use the `ftqHasHead()` and `ftqHead()` methods, making fetch target availability checks easier to read.
  4. Introduce the `consumeFetchTarget` method in the decoupled branch predictor to handle consumed fetch targets.

  These changes improve the overall structure and efficiency of the fetch stage, ensuring better integration with the branch predictor and FTQ management.

  Change-Id: I7a8b312785a7cce526192de7125cea458b979c0e

* cpu-o3: remove decoupledPredict in fetch

  1. Refactor the next-PC prediction logic to compute directly from the FTQ head entry.
  2. Introduce checks of the current PC against the predicted takenPC, simplifying the fetch path for taken branches.
  3. Update the handling of microops so that they advance the program counter correctly, keeping the instruction flow accurate.

  These changes improve the overall structure and efficiency of the fetch stage, ensuring better integration with the branch predictor and FTQ management.

  Change-Id: Iaed922245a386b695499799c9e0a3947208e0524

* cpu-o3: add some docs

  Change-Id: I56c36d4757b2a78ad193225f6eef302fd2c986ad

* cpu-o3: remove useless source code in v3 BPU

  Change-Id: I1c6093c1f63036b6346838572229e4440726d7f9

* util: remove unused scripts

  - Updated `gem5-vec.cfg` to remove the DRAMsim3 configuration.
  - Changed `gem5.cfg` to use `kmhv3.py` instead of `fs.py` for task execution.
  - Removed deprecated branch prediction options from `Options.py`.
  - Adjusted `xiangshan.py` to enforce BTB-only branch prediction and simplify configuration.
  - Deleted the unused `fs.py` file and various scripts to clean up the repository.
  - Updated example configurations to align with the new branch prediction policy.

  These changes make the configuration files clearer and easier to maintain while keeping them consistent with the current branch prediction strategy.

  Change-Id: I1c6093c1f63036b6346838572229e4440726d7f9

---------

Co-authored-by: Yakkhini <59007159+Yakkhini@users.noreply.github.com>
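The head-driven FIFO behaviour and the next-PC rule described in the commit message can be illustrated with a toy model. This is a minimal Python sketch, not the simulator's C++ code: `ToyFtq`, `FtqEntry`, `next_pc`, `fetch_all`, all field names, and the fixed 4-byte instruction size are assumptions made for illustration. Only the names `ftqHasHead()`, `ftqHead()`, `consumeFetchTarget`, and `takenPC` come from the commit message, and the Python analogues below are hypothetical.

```python
from collections import deque
from dataclasses import dataclass
from typing import Deque, List, Tuple

INST_BYTES = 4  # assumption: fixed-size instructions, no compressed encoding


@dataclass
class FtqEntry:
    """Hypothetical fetch target: a block [start_pc, end_pc) plus a prediction."""
    start_pc: int
    end_pc: int
    taken: bool = False
    taken_pc: int = 0  # PC of the branch predicted taken (if any)
    target: int = 0    # predicted target of that branch


class ToyFtq:
    """FIFO of fetch targets; fetch only ever looks at the head."""

    def __init__(self) -> None:
        self._entries: Deque[FtqEntry] = deque()

    def push(self, entry: FtqEntry) -> None:
        self._entries.append(entry)

    def ftq_has_head(self) -> bool:
        return bool(self._entries)

    def ftq_head(self) -> FtqEntry:
        return self._entries[0]

    def consume_fetch_target(self) -> FtqEntry:
        return self._entries.popleft()


def next_pc(entry: FtqEntry, pc: int) -> Tuple[int, bool]:
    """Compute the next PC from the head entry; also report if the entry is done."""
    if entry.taken and pc == entry.taken_pc:
        return entry.target, True          # predicted-taken branch ends the entry
    fallthrough = pc + INST_BYTES
    return fallthrough, fallthrough >= entry.end_pc


def fetch_all(ftq: ToyFtq) -> List[int]:
    """Drain the queue head-first and return the sequence of fetched PCs."""
    fetched: List[int] = []
    while ftq.ftq_has_head():
        entry = ftq.ftq_head()
        pc, done = entry.start_pc, False   # simplification: restart at the entry's start
        while not done:
            fetched.append(pc)
            pc, done = next_pc(entry, pc)
        ftq.consume_fetch_target()         # head is exhausted, move to the next entry
    return fetched
```

No extra "needs supply" or "exhausted" flag is kept in this sketch: whether fetch can proceed is answered entirely by asking whether the queue has a head, which is the simplification the later commits in this series describe.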
1 parent 0025245 commit d5e0972

27 files changed: +543 -2233 lines changed

.github/workflows/autotest/gem5-vec.cfg

Lines changed: 0 additions & 1 deletion
@@ -43,7 +43,6 @@ fs_path = ./configs/example/kmhv2.py --enable-riscv-vector --restore-rvv-cpt
 task ={set_env} && ./build/RISCV/gem5.opt \
     --outdir=$sublog$ \
     {fs_path} \
-    --dramsim3-ini=./ext/dramsim3/xiangshan_configs/xiangshan_DDR4_8Gb_x8_3200_2ch.ini \
     --raw-cpt --generic-rv-cpt=$binfile$
 post-task =
 except-task = echo gem5 running error!

.github/workflows/autotest/gem5.cfg

Lines changed: 1 addition & 18 deletions
@@ -169,27 +169,10 @@ pre-task =
 task = {set_var} && \
     ./build/RISCV/gem5.opt \
     --outdir=$sublog$ \
-    ./configs/example/fs.py \
+    ./configs/example/kmhv3.py \
     --enable-difftest --difftest-ref-so {ref_so_path} \
-    --xiangshan-system --cpu-type=DerivO3CPU \
-    --mem-size=8GB \
-    --caches --cacheline_size=64 \
-    --l1i_size=64kB --l1i_assoc=8 \
-    --l1d_size=64kB --l1d_assoc=4 \
-    --l1d-hwp-type=XSCompositePrefetcher \
-    --short-stride-thres=0 \
-    --l2cache --l2_size=1MB --l2_assoc=8 \
-    --l3cache --l3_size=16MB --l3_assoc=16 \
-    --l1-to-l2-pf-hint \
-    --l2-hwp-type=PrefetcherForwarder \
-    --l2-wrapper-hwp-type=WorkerPrefetcher \
-    --l2-to-l3-pf-hint \
-    --l3-hwp-type=WorkerPrefetcher \
-    --mem-type=DRAMsim3 \
-    --dramsim3-ini=./ext/dramsim3/xiangshan_configs/xiangshan_DDR4_8Gb_x8_3200_2ch.ini \
     --generic-rv-cpt=$binfile$ \
     --gcpt-restorer={gcpt_path} \
-    --bp-type=DecoupledBPUWithFTB --enable-loop-predictor --enable-jump-ahead-predictor \
     --warmup-insts=800 --warmup-insts-no-switch=50000010 --maxinsts=100000010
 post-task = echo run successfully at $binfile$
 except-task = echo gem5 running error at $binfile$!
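To see at a glance which arguments the `gem5.cfg` task no longer passes, the old and new argument lists from this hunk can be compared mechanically. The following is a small Python sketch; `OLD_TASK_ARGS`, `NEW_TASK_ARGS`, and `only_in` are illustrative names, and whether `kmhv3.py` sets equivalent defaults internally is not visible in this diff.

```python
# Argument lists copied from the hunk; placeholders such as {ref_so_path}
# and $binfile$ are kept as literal text.
OLD_TASK_ARGS = """
./configs/example/fs.py
--enable-difftest --difftest-ref-so {ref_so_path}
--xiangshan-system --cpu-type=DerivO3CPU
--mem-size=8GB
--caches --cacheline_size=64
--l1i_size=64kB --l1i_assoc=8
--l1d_size=64kB --l1d_assoc=4
--l1d-hwp-type=XSCompositePrefetcher
--short-stride-thres=0
--l2cache --l2_size=1MB --l2_assoc=8
--l3cache --l3_size=16MB --l3_assoc=16
--l1-to-l2-pf-hint
--l2-hwp-type=PrefetcherForwarder
--l2-wrapper-hwp-type=WorkerPrefetcher
--l2-to-l3-pf-hint
--l3-hwp-type=WorkerPrefetcher
--mem-type=DRAMsim3
--dramsim3-ini=./ext/dramsim3/xiangshan_configs/xiangshan_DDR4_8Gb_x8_3200_2ch.ini
--generic-rv-cpt=$binfile$
--gcpt-restorer={gcpt_path}
--bp-type=DecoupledBPUWithFTB --enable-loop-predictor --enable-jump-ahead-predictor
--warmup-insts=800 --warmup-insts-no-switch=50000010 --maxinsts=100000010
""".split()

NEW_TASK_ARGS = """
./configs/example/kmhv3.py
--enable-difftest --difftest-ref-so {ref_so_path}
--generic-rv-cpt=$binfile$
--gcpt-restorer={gcpt_path}
--warmup-insts=800 --warmup-insts-no-switch=50000010 --maxinsts=100000010
""".split()


def only_in(first, second):
    """Return the items of `first` that do not appear in `second`, in order."""
    return [arg for arg in first if arg not in second]


dropped = only_in(OLD_TASK_ARGS, NEW_TASK_ARGS)
added = only_in(NEW_TASK_ARGS, OLD_TASK_ARGS)
```

Here `added` contains only the new config script, while `dropped` holds the cache, prefetcher, memory, and branch predictor flags that were previously spelled out on the command line.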

.github/workflows/gem5-ideal-btb-perf-nosc.yml

Lines changed: 0 additions & 17 deletions
This file was deleted.

configs/common/Options.py

Lines changed: 0 additions & 10 deletions
@@ -279,16 +279,6 @@ def addCommonOptions(parser, configure_xiangshan=False):
                         help="enable bp database for specified subdatabase, "
                         "basic branch trace is enabled by default even without specifying, "
                         "available subdatabase: basic, tage, ras, loop")
-    parser.add_argument("--disable-sc", default=False, action="store_true",
-                        help="disable SC (enabled by default, only for FTBTAGE)")
-    parser.add_argument("--disable-mgsc", default=False, action="store_true",
-                        help="disable MGSC (only for BTBTAGE)")
-    parser.add_argument("--enable-loop-buffer", default=False, action="store_true",
-                        help="enable loop buffer (only for ftb branch predictor)")
-    parser.add_argument("--enable-loop-predictor", default=False, action="store_true",
-                        help="enable loop predictor (only for ftb branch predictor)")
-    parser.add_argument("--enable-jump-ahead-predictor", default=False, action="store_true",
-                        help="enable jump ahead predictor (only for ftb branch predictor)")
 
     parser.add_argument("--list-rp-types",
                         action=ListRP, nargs=0,
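One practical consequence of deleting these `add_argument` calls is that scripts still passing the removed flags will be rejected by `argparse` instead of having the flags silently ignored. The sketch below demonstrates this with a stand-in parser; `build_parser` and `accepts` are hypothetical helpers, and the parser registers only `--bp-type` rather than the full option set from `Options.py`.

```python
import argparse
from typing import List


def build_parser() -> argparse.ArgumentParser:
    """Stand-in parser: only --bp-type is registered (illustrative subset)."""
    parser = argparse.ArgumentParser(prog="toy-config")
    parser.add_argument("--bp-type", default=None)
    return parser


def accepts(argv: List[str]) -> bool:
    """Return True if the stand-in parser accepts argv, False if it rejects it."""
    try:
        build_parser().parse_args(argv)
        return True
    except SystemExit:
        # argparse reports unrecognized arguments on stderr and exits.
        return False
```

With this stand-in, `accepts(["--bp-type=DecoupledBPUWithBTB"])` is true, while `accepts(["--enable-loop-predictor"])` and `accepts(["--disable-sc"])` are false, which is the behaviour callers of the real scripts should expect after this change if they have not updated their command lines.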

configs/common/xiangshan.py

Lines changed: 15 additions & 38 deletions
@@ -329,50 +329,27 @@ def build_xiangshan_system(args):
             print(f"Trace mode: CPU {cpu.cpu_id} configured with {mode_str} translation")
 
     # configure BP
-    args.enable_loop_predictor = True
-    if args.enable_riscv_vector:
-        args.enable_loop_buffer = True
-
     for i in range(np):
         if args.kmh_align:
             test_sys.cpu[i].enable_storeSet_train = False
 
-        if args.bp_type is None or args.bp_type == 'DecoupledBPUWithFTB' or args.bp_type == 'DecoupledBPUWithBTB':
-            enable_bp_db = len(args.enable_bp_db) > 1
-            if enable_bp_db:
-                bp_db_switches = args.enable_bp_db[1] + ['basic']
-                print("BP db switches:", bp_db_switches)
-            else:
-                bp_db_switches = []
-            # for DecoupledBPUWithBTB, loop predictor and jump ahead predictor are not supported
-            #if args.bp_type == 'DecoupledBPUWithBTB':
-            if args.enable_loop_predictor or args.enable_loop_buffer:
-                print("loop predictor and loop buffer not supported for DecoupledBPUWithBTB")
-                args.enable_loop_predictor = False
-                args.enable_loop_buffer = False
-            if args.enable_jump_ahead_predictor:
-                print("jump ahead predictor not supported for DecoupledBPUWithBTB")
-                args.enable_jump_ahead_predictor = False
-
-            BPClass = DecoupledBPUWithBTB() if args.bp_type == 'DecoupledBPUWithBTB' else DecoupledBPUWithFTB()
-            test_sys.cpu[i].branchPred = BPClass(
-                bpDBSwitches=bp_db_switches,
-                enableLoopBuffer=args.enable_loop_buffer,
-                enableLoopPredictor=args.enable_loop_predictor,
-                enableJumpAheadPredictor=args.enable_jump_ahead_predictor
-            )
-            test_sys.cpu[i].branchPred.tage.enableSC = not args.disable_sc
-            test_sys.cpu[i].branchPred.isDumpMisspredPC = True
+        if args.bp_type != 'DecoupledBPUWithBTB':
+            fatal(
+                "Only --bp-type=DecoupledBPUWithBTB is supported for Xiangshan in this repo "
+                f"(got --bp-type={args.bp_type})."
+            )
 
+        enable_bp_db = len(args.enable_bp_db) > 1
+        if enable_bp_db:
+            bp_db_switches = args.enable_bp_db[1] + ['basic']
+            print("BP db switches:", bp_db_switches)
         else:
-            BPClass = ObjectList.bp_list.get(args.bp_type)
-            test_sys.cpu[i].branchPred = BPClass()
-
-        if args.indirect_bp_type:
-            IndirectBPClass = ObjectList.indirect_bp_list.get(
-                args.indirect_bp_type)
-            test_sys.cpu[i].branchPred.indirectBranchPred = \
-                IndirectBPClass()
+            bp_db_switches = []
+
+        test_sys.cpu[i].branchPred = DecoupledBPUWithBTB(
+            bpDBSwitches=bp_db_switches,
+        )
+        test_sys.cpu[i].branchPred.isDumpMisspredPC = True
 
     # configure memory related
     if args.mem_type == 'DRAMsim3':
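The new control flow in `build_xiangshan_system` reduces to two steps: reject anything other than `DecoupledBPUWithBTB`, then derive the BP database switches. The Python sketch below isolates those two steps so they can be exercised without gem5. `check_bp_type` and `resolve_bp_db_switches` are hypothetical helpers, `ValueError` stands in for gem5's `fatal`, and the shape of `enable_bp_db` (a sequence whose second element, when present, lists the requested subdatabases) is inferred from the diff rather than confirmed.

```python
from typing import List, Optional, Sequence

SUPPORTED_BP_TYPE = "DecoupledBPUWithBTB"


def check_bp_type(bp_type: Optional[str]) -> None:
    """Mirror the new guard; ValueError is a stand-in for gem5's fatal()."""
    if bp_type != SUPPORTED_BP_TYPE:
        raise ValueError(
            "Only --bp-type=DecoupledBPUWithBTB is supported for Xiangshan in this repo "
            f"(got --bp-type={bp_type})."
        )


def resolve_bp_db_switches(enable_bp_db: Sequence) -> List[str]:
    """Mirror the switch derivation: requested subdatabases plus 'basic'."""
    if len(enable_bp_db) > 1:
        # Assumed shape: element 1 holds the list of requested subdatabases.
        return list(enable_bp_db[1]) + ["basic"]
    return []
```

Note that the old condition accepted `args.bp_type is None`, whereas a strict inequality test rejects `None` as well. Unless the option's default is set to `DecoupledBPUWithBTB` elsewhere (not shown in this diff), callers may now need to pass `--bp-type` explicitly.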
