Commit c89c84d

refactor(tomasulo): migrate payload storage from FFs to distributed RAM (LUTRAM)
Move bulk data fields out of flip-flops into LUTRAM across three modules,
keeping only control/CAM-scanned bits in registers:

- reservation_station: 185-bit payload (op, imm, rm, branch_target,
  predicted_taken, predicted_target, is_fp_mem, mem_size, mem_signed,
  csr_addr, csr_imm, pc) into single sdp_dist_ram instance
  (write@dispatch, read@issue)
- load_queue: 64-bit lq_data split into lo/hi mwp_dist_ram instances
  (2 write ports each: port 0 for cache-hit/forward/mem-response,
  port 1 for AMO completion; split enables FLD partial-word writes)
- store_queue: 64-bit sq_data into duplicated sdp_dist_ram instances
  (shared CAM-resolved write, independent reads for forwarding scan
  and head writeback)

Update .f filelists with RAM primitive dependencies. Update all three
READMEs to reflect hybrid FF + LUTRAM storage strategy.
1 parent 7d6e117 commit c89c84d
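
The store_queue arrangement in the commit message (one shared write, duplicated RAM copies, independent reads) can be illustrated with a small behavioral model. This is a minimal Python sketch, not the repository's SystemVerilog: the class names, the asynchronous-read behavior, and the depth and width values are assumptions made for illustration. Only the idea of writing both copies together so that the forwarding scan and the head writeback each get their own read port comes from the message.

```python
class SdpDistRam:
    """Hypothetical model of a simple dual-port distributed RAM:
    one write port and one read port (read assumed combinational)."""

    def __init__(self, depth: int, width: int) -> None:
        self.depth = depth
        self.mask = (1 << width) - 1
        self.mem = [0] * depth

    def write(self, addr: int, data: int) -> None:
        if not 0 <= addr < self.depth:
            raise IndexError("write address out of range")
        # Truncate to the configured width, as a fixed-width RAM would.
        self.mem[addr] = data & self.mask

    def read(self, addr: int) -> int:
        if not 0 <= addr < self.depth:
            raise IndexError("read address out of range")
        return self.mem[addr]


class DuplicatedSdpRam:
    """Two copies of the same RAM, always written together, so that two
    consumers can read different addresses in the same cycle."""

    def __init__(self, depth: int, width: int) -> None:
        self.fwd_copy = SdpDistRam(depth, width)   # read by the forwarding scan
        self.head_copy = SdpDistRam(depth, width)  # read by head writeback

    def write(self, addr: int, data: int) -> None:
        # Shared write: both copies receive the same address and data.
        self.fwd_copy.write(addr, data)
        self.head_copy.write(addr, data)

    def read_forward(self, addr: int) -> int:
        return self.fwd_copy.read(addr)

    def read_head(self, addr: int) -> int:
        return self.head_copy.read(addr)
```

Duplication trades storage for read bandwidth: each copy costs as much as the original RAM, but neither reader has to arbitrate for a port. The same `SdpDistRam` model with a single copy corresponds to the reservation_station case, where one payload write happens at dispatch and one read at issue.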

12 files changed, with 482 additions and 231 deletions

formal/load_queue.sby

Lines changed: 4 additions & 0 deletions
@@ -13,13 +13,17 @@ smtbmc boolector
 
 [script]
 read -formal -sv riscv_pkg.sv
+read -sv sdp_dist_ram.sv
+read -sv mwp_dist_ram.sv
 read -sv load_unit.sv
 read -sv lq_l0_cache.sv
 read -formal -sv load_queue.sv
 prep -top load_queue
 
 [files]
 ../hw/rtl/cpu_and_mem/cpu/riscv_pkg.sv
+../hw/rtl/lib/ram/sdp_dist_ram.sv
+../hw/rtl/lib/ram/mwp_dist_ram.sv
 ../hw/rtl/cpu_and_mem/cpu/ma_stage/load_unit.sv
 ../hw/rtl/cpu_and_mem/cpu/tomasulo/load_queue/lq_l0_cache.sv
 ../hw/rtl/cpu_and_mem/cpu/tomasulo/load_queue/load_queue.sv

formal/reservation_station.sby

Lines changed: 2 additions & 0 deletions
@@ -13,9 +13,11 @@ smtbmc boolector
 
 [script]
 read -formal -sv riscv_pkg.sv
+read -sv sdp_dist_ram.sv
 read -formal -sv reservation_station.sv
 prep -top reservation_station
 
 [files]
 ../hw/rtl/cpu_and_mem/cpu/riscv_pkg.sv
+../hw/rtl/lib/ram/sdp_dist_ram.sv
 ../hw/rtl/cpu_and_mem/cpu/tomasulo/reservation_station/reservation_station.sv

formal/store_queue.sby

Lines changed: 2 additions & 0 deletions
@@ -13,9 +13,11 @@ smtbmc boolector
 
 [script]
 read -formal -sv riscv_pkg.sv
+read -sv sdp_dist_ram.sv
 read -formal -sv store_queue.sv
 prep -top store_queue
 
 [files]
 ../hw/rtl/cpu_and_mem/cpu/riscv_pkg.sv
+../hw/rtl/lib/ram/sdp_dist_ram.sv
 ../hw/rtl/cpu_and_mem/cpu/tomasulo/store_queue/store_queue.sv

hw/rtl/cpu_and_mem/cpu/tomasulo/load_queue/README.md

Lines changed: 25 additions & 12 deletions
@@ -29,17 +29,20 @@ freed when their result is accepted by the CDB arbiter (via `fu_cdb_adapter`).
 
 ## Storage Strategy
 
-All fields in FFs (not LUTRAM/BRAM). 8 entries at ~116 bits each (~928 bits
-total). Rationale:
-
-- **CAM-style tag search**: Address update must find matching `rob_tag` across
-  all entries in parallel. RAM primitives only provide single-address reads.
-- **Per-entry invalidation**: Partial flush (branch misprediction) must clear
-  individual entries by age comparison in a single cycle.
-- **Parallel priority scan**: Issue selection reads all entries to find the
-  oldest ready candidate. RAM would require sequential iteration.
-- **8 entries**: Too small for BRAM (minimum 18 Kbit), marginal for LUTRAM.
-- Same rationale as reservation stations, which use FF arrays at depths 2-8.
+**Hybrid FF + LUTRAM.** Control and CAM-scanned fields remain in FFs; the
+64-bit `data` field is stored in split lo/hi `mwp_dist_ram` instances (32 bits
+each, 2 write ports).
+
+- **FFs**: `valid`, `rob_tag`, `is_fp`, `addr_valid`, `address`, `size`,
+  `sign_ext`, `is_mmio`, `fp64_phase`, `issued`, `data_valid`, `forwarded`.
+  These require parallel CAM-style tag search, per-entry invalidation on flush,
+  and parallel priority scan for issue selection.
+- **LUTRAM** (`mwp_dist_ram`, 2 write ports each, lo/hi split):
+  - Port 0: cache-hit fast path, SQ forwarding, or memory response (mutually
+    exclusive sources).
+  - Port 1: AMO completion (can overlap with port 0).
+  - Split into lo (bits 31:0) and hi (bits 63:32) for FLD two-phase partial
+    writes (phase 0 writes lo only, phase 1 writes hi only).
 
 ## Entry Structure
 
@@ -56,7 +59,7 @@ total). Rationale:
 | fp64_phase | 1 bit | FLD phase: 0=low word, 1=high word |
 | issued | 1 bit | Sent to memory |
 | data_valid | 1 bit | Data received from memory/forward |
-| data | 64 bits | Loaded data (FLEN for FLD) |
+| data | 64 bits | Loaded data (FLEN for FLD) — in LUTRAM |
 | forwarded | 1 bit | Data from store queue forward |
 | **Total** | **~116 bits** | |
 
@@ -132,6 +135,16 @@ total). Rationale:
 flush, ordering, back-pressure, constrained random, LR/SC reservation,
 and AMO read-modify-write operations.
 
+## Dependencies
+
+| Module | Purpose |
+|--------|---------|
+| `riscv_pkg` | Type definitions |
+| `sdp_dist_ram` | Simple dual-port distributed RAM (used by `lq_l0_cache`) |
+| `mwp_dist_ram` | Multi-write-port distributed RAM for lq_data lo/hi LUTRAM |
+| `load_unit` | Byte/halfword extraction and sign extension |
+| `lq_l0_cache` | L0 data cache for OoO load hits |
+
 ## Files
 
 - `load_queue.sv` - Module implementation
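
The lo/hi split described in the README's Storage Strategy section can be made concrete with a behavioral model. The following is a rough Python sketch under stated assumptions, not the `mwp_dist_ram` implementation: the class name, the per-port tuple format, and the rule that port 1 wins when both ports hit the same entry in one cycle are all hypothetical. What it takes from the README is the 8-entry depth, the two write ports, and the FLD behavior where phase 0 writes only the low word and phase 1 writes only the high word.

```python
MASK32 = 0xFFFF_FFFF


class SplitLoadData:
    """Hypothetical model of lq_data held as two 32-bit halves (lo/hi),
    each with two write ports."""

    def __init__(self, depth: int = 8) -> None:
        self.lo = [0] * depth  # bits 31:0 of each entry
        self.hi = [0] * depth  # bits 63:32 of each entry

    def clock(self, port0=None, port1=None) -> None:
        """Apply one cycle of writes.

        Each port is None (idle) or a tuple (index, lo_word, hi_word),
        where a word of None means that half is not written.
        Assumption: port 1 is applied last, so it wins on a collision.
        """
        for port in (port0, port1):
            if port is None:
                continue
            index, lo_word, hi_word = port
            if lo_word is not None:
                self.lo[index] = lo_word & MASK32
            if hi_word is not None:
                self.hi[index] = hi_word & MASK32

    def read(self, index: int) -> int:
        """Reassemble the 64-bit data field from its two halves."""
        return (self.hi[index] << 32) | self.lo[index]


def full_write(index: int, data64: int):
    """Build a port tuple that writes both halves of a 64-bit value."""
    return (index, data64 & MASK32, (data64 >> 32) & MASK32)
```

With this model, an FLD would issue `clock(port0=(i, word0, None))` for phase 0 and `clock(port0=(i, None, word1))` for phase 1, while an AMO completion for a different entry could use `port1` in either cycle. Keeping the halves in separate RAMs means a partial write needs no read-modify-write of the other half.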

hw/rtl/cpu_and_mem/cpu/tomasulo/load_queue/load_queue.f

Lines changed: 5 additions & 1 deletion
@@ -1,9 +1,13 @@
 # Load Queue file list
-# Circular buffer tracking in-flight load instructions
+# Circular buffer tracking in-flight load instructions (hybrid FF + LUTRAM)
 
 # Package dependency
 $(ROOT)/hw/rtl/cpu_and_mem/cpu/riscv_pkg.sv
 
+# RAM primitives (lq_data LUTRAM)
+$(ROOT)/hw/rtl/lib/ram/sdp_dist_ram.sv
+$(ROOT)/hw/rtl/lib/ram/mwp_dist_ram.sv
+
 # Load unit (byte/halfword extraction and sign extension)
 $(ROOT)/hw/rtl/cpu_and_mem/cpu/ma_stage/load_unit.sv
 
