44 changes: 22 additions & 22 deletions docs/satellite_station.md
@@ -11,59 +11,59 @@
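- `simCoreDBusAddrWidth`: AXI address width of the simulation core (bits).
- `simCoreDBusDataWidth`: AXI data width of the simulation core (bits); `wordBytes = simCoreDBusDataWidth / 8`.
- `nStateBus`: number of parallel StateBus lanes; must be `>0` and a power of two (`2^k`).
- `toCoreStateBusBufferDepth`: queue depth for the direction written into corvus.
- `fromCoreStateBusBufferDepth`: queue depth for the direction read out of corvus.
- `fromCoreStateBusBufferDepth`: queue depth for the direction written into corvus.
- `toCoreStateBusBufferDepth`: queue depth for the direction read out of corvus.
- `writeQueueWStallTimeoutCycles` (module-internal constant, default 32, defined in `SatelliteStation`): timeout threshold, in cycles, for a streaming write queue whose W channel is back-pressured; set to 0 to disable. On timeout, WREADY is asserted in the timeout cycle to consume the current beat, and the burst is then terminated with SLVERR on the B channel.
- Constraint: `stateBusConfig.dstWidth + stateBusConfig.payloadWidth = simCoreDBusDataWidth`; otherwise an error is raised at configuration time.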

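## Top-level interface
- Simulation core side
- `ctrlAXI4Slave`: AXI4 Slave (ABITS/DBITS as above).
- `stateBusBufferFullInterrupt`: Bool, asserted when any write queue is full.
- `stateBusBufferFullInterrupt`: Bool, asserted when any read queue is full.

- corvus system side
- `inSyncFlag`: input, sync-tree flag, width `syncTreeConfig.flagWidth`.
- `outSyncFlag`: output, sync-tree flag, width `syncTreeConfig.flagWidth`, resets to 0, holds its value after a software write (registered in CtrlAXI4Slave).
- `nodeId`: output, StateBus node ID, width `stateBusConfig.dstWidth`, resets to 0, holds its value after a software write (registered in CtrlAXI4Slave).
- `toCoreStateBusPort`: `Vec(nStateBus, Flipped(Decoupled(StateBusPacket)))`, packets written by the simulation core and sent to corvus.
- `fromCoreStateBusPort`: `Vec(nStateBus, Decoupled(StateBusPacket))`, packets from corvus for the simulation core to read.
- `fromCoreStateBusBufferNonEmpty`: Bool, asserted when any `fromCoreStateBusBuffer` is non-empty, signalling that the simulation core has data to read.
- `fromCoreStateBusPort`: `Vec(nStateBus, Decoupled(StateBusPacket))`, packets written by the simulation core and sent to corvus.
- `toCoreStateBusPort`: `Vec(nStateBus, Flipped(Decoupled(StateBusPacket)))`, packets from corvus for the simulation core to read.
- `toCoreStateBusBufferNonEmpty`: Bool, asserted when any `toCoreStateBusBuffer` is non-empty, signalling that the simulation core has data to read.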

## Main submodules and data paths
- `CtrlAXI4Slave`: see `src/main/scala/corvus/ctrl_axi4_slave/CtrlAXI4Slave.scala`; instantiates 4 sub-controllers and stitches their address spaces together through a Crossbar.
- `toCoreStateBusBuffers`: `nStateBus` `Queue`s of depth `toCoreStateBusBufferDepth`, element type `UInt(DBITS.W)`, enqueued by AXI writes; dequeued words are split into fields and fed to `toCoreStateBusPort`.
- `fromCoreStateBusBuffers`: `nStateBus` `Queue`s of depth `fromCoreStateBusBufferDepth`, element type `UInt(DBITS.W)`, enqueued from `fromCoreStateBusPort`; an AXI read dequeues one entry.
- `fromCoreStateBusBuffers`: `nStateBus` `Queue`s of depth `fromCoreStateBusBufferDepth`, element type `UInt(DBITS.W)`, enqueued by AXI writes; dequeued words are split into fields and fed to `fromCoreStateBusPort`.
- `toCoreStateBusBuffers`: `nStateBus` `Queue`s of depth `toCoreStateBusBufferDepth`, element type `UInt(DBITS.W)`, enqueued from `toCoreStateBusPort`; an AXI read dequeues one entry.
- Field mapping: `UInt(DBITS.W) = Cat(packet.dst, packet.payload)`, with `dst` in the high bits. Both directions share the same layout.

## Address space layout (low to high address)
Let `N_RS = pow2ceil(1 + nStateBus)`, `N_WS = 2` (already a power of two), `N_RQ = nStateBus`, `N_WQ = nStateBus`. Each segment's size is `count * wordBytes`.

| Segment | Count | Purpose | Address range (relative to base=0) |
| - | - | - | - |
| Read status | `N_RS` | Read-only registers | `[0, N_RS*wordBytes)` |
| Write status | `N_WS` | Read/write registers | `[N_RS*wordBytes, (N_RS+N_WS)*wordBytes)` |
| Read queues | `N_RQ` | Dequeue one `UInt(DBITS)` | `[(N_RS+N_WS)*wordBytes, (N_RS+N_WS+N_RQ)*wordBytes)` |
| Write queues | `N_WQ` | Enqueue one `UInt(DBITS)` | `[(N_RS+N_WS+N_RQ)*wordBytes, (N_RS+N_WS+N_RQ+N_WQ)*wordBytes)` |
| Read status | `N_RS` | Read-only control registers | `[0, N_RS*wordBytes)` |
| Write status | `N_WS` | Read/write control registers | `[N_RS*wordBytes, (N_RS+N_WS)*wordBytes)` |
| Receive queues | `N_RQ` | Dequeue one `UInt(DBITS)` | `[(N_RS+N_WS)*wordBytes, (N_RS+N_WS+N_RQ)*wordBytes)` |
| Send queues | `N_WQ` | Enqueue one `UInt(DBITS)` | `[(N_RS+N_WS+N_RQ)*wordBytes, (N_RS+N_WS+N_RQ+N_WQ)*wordBytes)` |
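
As a rough cross-check of the layout table, the segment boundaries can be computed on the host side. The following C++ sketch is illustrative only: `Segment`, `SatelliteAddrMap`, and `make_addr_map` are hypothetical names that do not appear in the repository, and `pow2ceil` is assumed to mean "round up to the next power of two".

```cpp
#include <cassert>
#include <cstdint>

// Assumed meaning of pow2ceil in the text: smallest power of two >= x.
inline uint64_t pow2ceil(uint64_t x) {
  uint64_t p = 1;
  while (p < x) p <<= 1;
  return p;
}

// Half-open byte range [begin, end), relative to base = 0.
struct Segment {
  uint64_t begin;
  uint64_t end;
};

// Hypothetical container for the four segments in the table.
struct SatelliteAddrMap {
  Segment readStatus;
  Segment writeStatus;
  Segment readQueues;
  Segment writeQueues;
};

// dataWidthBits plays the role of simCoreDBusDataWidth.
inline SatelliteAddrMap make_addr_map(uint64_t dataWidthBits, uint64_t nStateBus) {
  assert(dataWidthBits % 8 == 0 && nStateBus > 0);
  const uint64_t wordBytes = dataWidthBits / 8;
  const uint64_t nRS = pow2ceil(1 + nStateBus);  // inSyncFlag + per-bus counts, padded
  const uint64_t nWS = 2;                        // outSyncFlag, nodeId
  const uint64_t nRQ = nStateBus;
  const uint64_t nWQ = nStateBus;

  SatelliteAddrMap m{};
  uint64_t cursor = 0;
  m.readStatus = {cursor, cursor + nRS * wordBytes};
  cursor = m.readStatus.end;
  m.writeStatus = {cursor, cursor + nWS * wordBytes};
  cursor = m.writeStatus.end;
  m.readQueues = {cursor, cursor + nRQ * wordBytes};
  cursor = m.readQueues.end;
  m.writeQueues = {cursor, cursor + nWQ * wordBytes};
  return m;
}
```

With a 64-bit data bus and `nStateBus = 4`, this gives `wordBytes = 8` and `N_RS = 8`, so the four segments end at byte offsets 64, 80, 112, and 144.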

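### Read-status register map (by register index)
0. `inSyncFlag`, zero-extended to `DBITS`.
1...`nStateBus`: `toCoreStateBusBuffer[i].count`, zero-extended.
The remaining registers are padded with 0 up to `N_RS` entries.
### Read-only control register map
- 0: `inSyncFlag`, zero-extended to `DBITS`.
- 1...`nStateBus`: `toCoreStateBusBuffer[i].count`, zero-extended.
- ... `N_RS`: fixed 0, used as padding.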

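### Write-status register map
### Read/write control register map
- 0: `outSyncFlag` (resets to 0; reads return the current register value; writes honor the AXI write strobe with little-endian byte lanes; the value is held after a write).
- 1: `nodeId` (resets to 0; reads return the current register value; writes honor the AXI write strobe with little-endian byte lanes; the value is held after a write).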
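
The write-strobe behavior of these registers can be modelled in a few lines. This is a minimal sketch assuming a 64-bit data bus; `apply_wstrb` is a hypothetical helper, not a function from the repository. It relies only on the standard AXI rule that strobe bit `i` enables byte lane `i`, that is, data bits `[8*i+7 : 8*i]`.

```cpp
#include <cstdint>

// Merge wdata into the current register value under an AXI byte strobe.
// Bit i of wstrb enables byte lane i (bits [8*i+7 : 8*i]); disabled lanes
// keep the old value, which is what "held after a write" implies per byte.
inline uint64_t apply_wstrb(uint64_t current, uint64_t wdata, uint8_t wstrb) {
  uint64_t mask = 0;
  for (int lane = 0; lane < 8; ++lane) {
    if (wstrb & (1u << lane)) {
      mask |= 0xFFull << (8 * lane);
    }
  }
  return (current & ~mask) | (wdata & mask);
}
```

For example, a write with strobe `0x03` replaces only the two low bytes and leaves the upper six bytes of the register unchanged.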

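### Queue access semantics
- Semantics match the streaming read/write queues of `CtrlAXI4Slave` (alignment requirements, back-pressure/dequeue behavior, and out-of-range or unaligned accesses responding OKAY and returning 0). Only that submodule's interface is reused here, so the details are not repeated.
### Receive/send queues
Each register corresponds to one `StateBus`.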

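## Interrupts
- `stateBusBufferFullInterrupt` is a level signal: it goes high when any `toCoreStateBusBuffer` becomes full and goes low once all of them are non-full again. There is no additional masking, sticky bit, or status register, which keeps the design simple.
- `fromCoreStateBusBufferNonEmpty` is a level indicator: it goes high whenever any `fromCoreStateBusBuffer` is non-empty and goes low once all are empty, so the simulation core can poll it or react when data is available.
- `toCoreStateBusBufferNonEmpty` is a level indicator: it goes high whenever any `toCoreStateBusBuffer` is non-empty and goes low once all are empty, so the simulation core can poll it or react when data is available.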
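
The level (non-sticky) behavior of both signals can be expressed as pure functions of the current queue occupancy. The sketch below is a software model under that assumption; `QueueState`, `any_full`, and `any_non_empty` are hypothetical names used only for illustration.

```cpp
#include <cstddef>
#include <vector>

// Occupancy snapshot of one buffer (hypothetical model type).
struct QueueState {
  std::size_t depth;
  std::size_t count;
};

// Models the full interrupt: high while at least one queue is full.
inline bool any_full(const std::vector<QueueState>& queues) {
  for (const auto& q : queues) {
    if (q.count >= q.depth) return true;
  }
  return false;
}

// Models the non-empty indicator: high while at least one queue holds data.
inline bool any_non_empty(const std::vector<QueueState>& queues) {
  for (const auto& q : queues) {
    if (q.count > 0) return true;
  }
  return false;
}
```

Because neither function keeps state, the modelled signal drops as soon as the condition clears, which matches the "no sticky bit" description.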

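## StateBusPacket conversion and validation
- Write direction (simulation core → corvus): the `UInt(DBITS)` written over AXI is split into `{dst, payload}` and sent to `toCoreStateBusPort`. `dstWidth` and `payloadWidth` come from `stateBusConfig`.
- Read direction (corvus → simulation core): the `StateBusPacket` output by `fromCoreStateBusPort` is combined into a `UInt(DBITS)` and enqueued.
- Write direction (simulation core → corvus): the `UInt(DBITS)` written over AXI is split into `{dst, payload}` and sent to `fromCoreStateBusPort`. `dstWidth` and `payloadWidth` come from `stateBusConfig`.
- Read direction (corvus → simulation core): the `StateBusPacket` output by `toCoreStateBusPort` is combined into a `UInt(DBITS)` and enqueued.
- Configuration check: `dstWidth + payloadWidth` must equal `simCoreDBusDataWidth`; otherwise an error is raised at elaboration time.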
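
The split and combine steps can be mirrored on the software side when preparing or decoding queue words. The following is a sketch assuming a 64-bit data bus and non-zero field widths; `StateBusPacketModel`, `pack_word`, and `unpack_word` are hypothetical names, and the width check mirrors the configuration constraint rather than reproducing the actual elaboration code.

```cpp
#include <cassert>
#include <cstdint>

// Software-side model of a StateBusPacket (hypothetical type).
struct StateBusPacketModel {
  uint64_t dst;
  uint64_t payload;
};

// Mask with the low `bits` bits set; valid for 0 < bits <= 64.
inline uint64_t low_mask(unsigned bits) {
  return bits >= 64 ? ~0ull : ((1ull << bits) - 1);
}

// Equivalent of Cat(dst, payload): dst occupies the high bits.
inline uint64_t pack_word(const StateBusPacketModel& p, unsigned dstWidth,
                          unsigned payloadWidth) {
  assert(dstWidth > 0 && payloadWidth > 0);
  assert(dstWidth + payloadWidth == 64);  // mirrors the configuration check
  return ((p.dst & low_mask(dstWidth)) << payloadWidth) |
         (p.payload & low_mask(payloadWidth));
}

// Inverse of pack_word: split a 64-bit word back into {dst, payload}.
inline StateBusPacketModel unpack_word(uint64_t word, unsigned dstWidth,
                                       unsigned payloadWidth) {
  assert(dstWidth > 0 && payloadWidth > 0);
  assert(dstWidth + payloadWidth == 64);
  return {word >> payloadWidth, word & low_mask(payloadWidth)};
}
```

With `dstWidth = 16` and `payloadWidth = 48`, packing `dst = 0xABCD` and `payload = 0x123456789ABC` yields `0xABCD123456789ABC`, and unpacking that word returns the original fields.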

## Clock/reset domain assumptions
16 changes: 8 additions & 8 deletions src/main/scala/corvus/SatelliteStation.scala
@@ -30,12 +30,12 @@ class SatelliteStation(implicit p: CorvusConfig) extends Module {
val io = IO(new Bundle {
val ctrlAXI4Slave = new CtrlAXI4IO(addrBits, dataBits)
val stateBusBufferFullInterrupt = Output(Bool())
val fromCoreStateBusBufferNonEmpty = Output(Bool())
val toCoreStateBusBufferNonEmpty = Output(Bool())
val inSyncFlag = Input(UInt(p.syncTreeConfig.flagWidth.W))
val outSyncFlag = Output(UInt(p.syncTreeConfig.flagWidth.W))
val nodeId = Output(UInt(dstWidth.W))
val toCoreStateBusPort = Vec(nStateBus, Decoupled(new StateBusPacket))
val fromCoreStateBusPort = Vec(nStateBus, Flipped(Decoupled(new StateBusPacket)))
val toCoreStateBusPort = Vec(nStateBus, Flipped(Decoupled(new StateBusPacket)))
val fromCoreStateBusPort = Vec(nStateBus, Decoupled(new StateBusPacket))
})

private val ctrlAXI =
@@ -61,8 +61,8 @@ class SatelliteStation(implicit p: CorvusConfig) extends Module {
ctrlAXI.io.axi <> io.ctrlAXI4Slave

for (i <- 0 until nStateBus) {
toCoreStateBusBuffers(i).io.enq <> ctrlAXI.io.writeQueues(i)
ctrlAXI.io.readQueues(i) <> fromCoreStateBusBuffers(i).io.deq
fromCoreStateBusBuffers(i).io.enq <> ctrlAXI.io.writeQueues(i)
ctrlAXI.io.readQueues(i) <> toCoreStateBusBuffers(i).io.deq
}

private val statusRegs = Wire(Vec(nRS, UInt(dataBits.W)))
@@ -78,15 +78,15 @@ class SatelliteStation(implicit p: CorvusConfig) extends Module {
io.outSyncFlag := ctrlAXI.io.control(0)(p.syncTreeConfig.flagWidth - 1, 0)
io.nodeId := ctrlAXI.io.control(1)(dstWidth - 1, 0)

toCoreStateBusBuffers.zip(io.toCoreStateBusPort).foreach { case (buf, port) =>
fromCoreStateBusBuffers.zip(io.fromCoreStateBusPort).foreach { case (buf, port) =>
val raw = buf.io.deq.bits
port.valid := buf.io.deq.valid
port.bits.dst := raw(dataBits - 1, payloadWidth)
port.bits.payload := raw(payloadWidth - 1, 0)
buf.io.deq.ready := port.ready
}

fromCoreStateBusBuffers.zip(io.fromCoreStateBusPort).foreach { case (buf, port) =>
toCoreStateBusBuffers.zip(io.toCoreStateBusPort).foreach { case (buf, port) =>
buf.io.enq.valid := port.valid
buf.io.enq.bits := Cat(port.bits.dst, port.bits.payload)
port.ready := buf.io.enq.ready
@@ -96,7 +96,7 @@ class SatelliteStation(implicit p: CorvusConfig) extends Module {
.map(buf => !buf.io.enq.ready)
.reduce(_ || _)

io.fromCoreStateBusBufferNonEmpty := fromCoreStateBusBuffers
io.toCoreStateBusBufferNonEmpty := toCoreStateBusBuffers
.map(_.io.deq.valid)
.reduce(_ || _)
}
8 changes: 4 additions & 4 deletions src/main/scala/corvus/Top.scala
@@ -131,7 +131,7 @@ class Top(implicit p: CorvusConfig) extends Module with RequireAsyncReset {
val satAwId = RegInit(0.U(satAxi.aw.bits.id.getWidth.W))

sat.io.ctrlAXI4Slave.ar.valid := satAxi.ar.valid
sat.io.ctrlAXI4Slave.ar.bits.addr := satAxi.ar.bits.addr
sat.io.ctrlAXI4Slave.ar.bits.addr := satAxi.ar.bits.addr - satelliteAddr.base.U // FIXME: wind up
sat.io.ctrlAXI4Slave.ar.bits.len := satAxi.ar.bits.len
sat.io.ctrlAXI4Slave.ar.bits.size := satAxi.ar.bits.size
sat.io.ctrlAXI4Slave.ar.bits.burst := satAxi.ar.bits.burst
@@ -140,7 +140,7 @@ class Top(implicit p: CorvusConfig) extends Module with RequireAsyncReset {
when(satAxi.ar.fire) { satArId := satAxi.ar.bits.id }

sat.io.ctrlAXI4Slave.aw.valid := satAxi.aw.valid
sat.io.ctrlAXI4Slave.aw.bits.addr := satAxi.aw.bits.addr
sat.io.ctrlAXI4Slave.aw.bits.addr := satAxi.aw.bits.addr - satelliteAddr.base.U
sat.io.ctrlAXI4Slave.aw.bits.len := satAxi.aw.bits.len
sat.io.ctrlAXI4Slave.aw.bits.size := satAxi.aw.bits.size
sat.io.ctrlAXI4Slave.aw.bits.burst := satAxi.aw.bits.burst
@@ -181,8 +181,8 @@ class Top(implicit p: CorvusConfig) extends Module with RequireAsyncReset {
val ring = ringNodes(idx)(i)
ring.io.nodeId := sat.io.nodeId

ring.io.fromCore <> sat.io.toCoreStateBusPort(i)
ring.io.toCore <> sat.io.fromCoreStateBusPort(i)
ring.io.fromCore <> sat.io.fromCoreStateBusPort(i)
ring.io.toCore <> sat.io.toCoreStateBusPort(i)
}

bus.controllers(3) <> peripheralOutBus.io.cores(idx)
19 changes: 17 additions & 2 deletions src/test/csrc/sim_main.cpp
@@ -10,8 +10,10 @@
#include <chrono>
#include <cstdio>
#include <cstdlib>
#include <iostream>
#include <signal.h>
#include <string>
#include <filesystem>

static vluint64_t main_time = 0;
double sc_time_stamp() { return static_cast<double>(main_time); }
@@ -26,6 +28,8 @@ static void print_help(const char *exe) {
<< " -r, --reset-cycles=N number of reset cycles (default 16)\n"
<< " -w, --wave enable FST dump (default path sim.fst)\n"
<< " --wave-file=PATH waveform output path (requires --wave)\n"
<< " -l, --log enable UART log to files\n"
<< " --log-path=PATH UART log directory (default: logs)\n"
<< " -h, --help show this message\n";
}

@@ -35,11 +39,12 @@ static SimConfig parse_args(int argc, char **argv) {
{"image", required_argument, nullptr, 'i'}, {"flash", required_argument, nullptr, 'f'},
{"mem-size", required_argument, nullptr, 'm'}, {"max-cycles", required_argument, nullptr, 'c'},
{"reset-cycles", required_argument, nullptr, 'r'}, {"wave", no_argument, nullptr, 'w'},
{"wave-file", required_argument, nullptr, 0}, {"help", no_argument, nullptr, 'h'}, {0, 0, 0, 0}};
{"wave-file", required_argument, nullptr, 0}, {"log", no_argument, nullptr, 'l'},
{"log-path", required_argument, nullptr, 0}, {"help", no_argument, nullptr, 'h'}, {0, 0, 0, 0}};

while (true) {
int idx = 0;
int opt = getopt_long(argc, argv, "hi:f:m:c:r:w", long_opts, &idx);
int opt = getopt_long(argc, argv, "hi:f:m:c:r:wl", long_opts, &idx);
if (opt == -1) break;
switch (opt) {
case 'i': cfg.image = optarg; break;
@@ -48,10 +53,13 @@ static SimConfig parse_args(int argc, char **argv) {
case 'c': cfg.max_cycles = std::strtoull(optarg, nullptr, 0); break;
case 'r': cfg.reset_cycles = std::strtoull(optarg, nullptr, 0); break;
case 'w': cfg.enable_wave = true; break;
case 'l': cfg.enable_log = true; break;
case 'h': print_help(argv[0]); std::exit(0);
case 0:
if (std::string(long_opts[idx].name) == "wave-file") {
cfg.wave_path = optarg;
} else if (std::string(long_opts[idx].name) == "log-path") {
cfg.log_path = optarg;
}
break;
default:
@@ -78,6 +86,12 @@ int main(int argc, char **argv) {
init_ram(cfg.image.empty() ? nullptr : cfg.image.c_str(), cfg.mem_size_bytes);
init_flash(cfg.flash.empty() ? nullptr : cfg.flash.c_str());

// Setup UART logs
if (cfg.enable_log) {
std::filesystem::create_directories(cfg.log_path);
init_all_uart_logs<VSimTop>(cfg.log_path);
}

VSimTop *dut = new VSimTop();
VerilatedFstC *tfp = nullptr;
if (cfg.enable_wave) {
@@ -118,6 +132,7 @@ int main(int argc, char **argv) {
tick();
cycles++;
print_all_uart_outs(dut);
if (cfg.enable_log) log_all_uart_outs(dut);
// if (dut->io_uart_in_valid) {
// int available = std::cin.rdbuf()->in_avail();
// if (available > 0) {
70 changes: 66 additions & 4 deletions src/test/csrc/sim_main.h
@@ -5,16 +5,19 @@
#include <string>
#include <type_traits>
#include <utility>
#include <fstream>

struct SimConfig {
std::string image;
std::string flash;
std::string mem_size_str;
std::string wave_path = "sim.fst";
std::string log_path = "logs";
uint64_t mem_size_bytes = 2ULL * 1024 * 1024 * 1024; // match RAM_SIZE in DifftestMem1R1W
uint64_t max_cycles = 0;
uint64_t reset_cycles = 16;
bool enable_wave = false;
bool enable_log = false;
};

template <typename...>
@@ -23,20 +26,42 @@ using void_t = void;
template <int N, typename T, typename = void>
struct UartOutLane {
static void print(T *) {}
static void init_log(const std::string&) {}
static void log(T *) {}
};

#define UART_OUT_LANES(X) \
X(0) X(1) X(2) X(3) X(4) X(5) X(6) X(7) X(8) X(9) X(10) X(11) X(12) X(13) X(14) X(15) X(16)

#define DEFINE_UART_OUT_LANE(N) \
template <typename T> \
struct UartOutLane<N, T, void_t<decltype(std::declval<T>().io_uart_##N##_out_valid), \
decltype(std::declval<T>().io_uart_##N##_out_ch)>> { \
struct UartOutLane<N, T, void_t<decltype(std::declval<T>().io_uart_##N##_out_valid), \
decltype(std::declval<T>().io_uart_##N##_out_ch)>> { \
inline static std::ofstream log_file; \
\
static void init_log(const std::string& log_path) { \
if (!log_path.empty()) { \
std::string fname = log_path + "/uart_" + std::to_string(N) + ".log"; \
log_file.open(fname); \
if (!log_file.is_open()) { \
std::cerr << "Failed to open " << fname << "\n"; \
exit(1); \
} \
} \
} \
\
static void print(T *dut) { \
if (dut->io_uart_##N##_out_valid) { \
std::cout << static_cast<char>(dut->io_uart_##N##_out_ch); \
std::cout << static_cast<char>(dut->io_uart_##N##_out_ch); \
std::cout.flush(); \
} \
} \
\
static void log(T *dut) { \
if (dut->io_uart_##N##_out_valid && log_file) { \
log_file << static_cast<char>(dut->io_uart_##N##_out_ch); \
log_file.flush(); \
} \
} \
};

@@ -45,17 +70,38 @@ UART_OUT_LANES(DEFINE_UART_OUT_LANE)
template <typename T, typename = void>
struct SingleUartOut {
static void print(T *) {}
static void init_log(const std::string&) {}
static void log(T *) {}
};

template <typename T>
struct SingleUartOut<T, void_t<decltype(std::declval<T>().io_uart_out_valid),
decltype(std::declval<T>().io_uart_out_ch)>> {
decltype(std::declval<T>().io_uart_out_ch)>> {
inline static std::ofstream log_file;

static void init_log(const std::string& log_path) {
if (!log_path.empty()) {
std::string fname = log_path + "/uart.log";
log_file.open(fname);
if (!log_file.is_open()) {
std::cerr << "Failed to open " << fname << "\n";
exit(1);
}
}
}

static void print(T *dut) {
if (dut->io_uart_out_valid) {
std::cout << static_cast<char>(dut->io_uart_out_ch);
std::cout.flush();
}
}
static void log(T *dut) {
if (dut->io_uart_out_valid && log_file) { // skip when the log stream is not open or has failed, matching UartOutLane::log
log_file << static_cast<char>(dut->io_uart_out_ch);
log_file.flush();
}
}
};

template <typename T>
@@ -67,5 +113,21 @@ void print_all_uart_outs(T *dut) {
SingleUartOut<T>::print(dut);
}

template <typename T>
void log_all_uart_outs(T *dut) {
#define CALL_UART_LOG_LANE(N) UartOutLane<N, T>::log(dut);
UART_OUT_LANES(CALL_UART_LOG_LANE)
#undef CALL_UART_LOG_LANE
SingleUartOut<T>::log(dut);
}

template <typename T>
void init_all_uart_logs(const std::string& log_path) {
#define CALL_UART_INIT_LANE(N) UartOutLane<N, T>::init_log(log_path);
UART_OUT_LANES(CALL_UART_INIT_LANE)
#undef CALL_UART_INIT_LANE
SingleUartOut<T>::init_log(log_path);
}

#undef DEFINE_UART_OUT_LANE
#undef UART_OUT_LANES