This is the main public developer-facing guide for build, deployment, and the maintained native execution paths.
- README.md: project overview and quick start
- docs/api_reference.md: Python/runtime API details
- docs/architecture.md: hardware/software architecture overview
- docs/user_guide.md: end-user setup and examples
# Clone and install
git clone https://github.com/metr0jw/Event-Driven-Spiking-Neural-Network-Accelerator-for-FPGA.git
cd Event-Driven-Spiking-Neural-Network-Accelerator-for-FPGA
# Python dev install
python3 -m venv venv
source venv/bin/activate
cd software/python
pip install -e .
pip install pytest pytest-cov black flake8 mypy
# Vivado tools
source ~/tools/2025.2/Vivado/settings64.sh
export LC_ALL=en_US.UTF-8
hardware/
├── hdl/rtl/ # Verilog RTL
│ ├── core/ # Core group, event router, connectivity table
│ └── top/ # Top-level integration (snn_core_group_top)
├── hdl/tb/ # Testbenches (3 active)
├── hls/ # Vitis HLS
│ ├── src/ # HLS source (snn_top_hls)
│ ├── include/ # Headers
│ ├── test/ # HLS testbenches
│ └── scripts/ # HLS build scripts
├── constraints/ # Timing and pin constraints
└── scripts/ # Build & simulation scripts
software/python/ # Python package
examples/ # Usage examples
docs/ # Documentation
cd hardware/scripts
./run_testbenches.sh # Run all 3 core group testbenches (55 checks)
cd hardware/hls
./scripts/build_hls.sh --clean
cd hardware/scripts
source ~/tools/2025.2/Vivado/settings64.sh
export LC_ALL=en_US.UTF-8
vivado -mode batch -source synth_core_group.tcl
Output: outputs/snn_integrated.bit
- The native library-first path is the maintained route.
- Removed from the supported path: SpikingJelly auto-conversion.
- Recommended scenarios:
  - GPU train (surrogate/STDP) -> native export -> FPGA inference
  - FPGA STDP train + inference with parity tooling
./scripts/run_scenario1_native_fpga_infer.sh \
--deployment /home/xilinx/snn/mnist_10class_deployment.npz \
--output /home/xilinx/snn/mnist_10class_results_scenario1.json
./scripts/run_scenario2_fpga_stdp_train_infer.sh \
--stdp-steps 100 \
--infer-output /home/xilinx/snn/mnist_10class_results_scenario2.json
module my_module #(
    parameter DATA_WIDTH = 16
) (
    input wire clk,
    input wire rst_n,
    input wire [DATA_WIDTH-1:0] data_in,
    output reg [DATA_WIDTH-1:0] data_out
);
    always @(posedge clk or negedge rst_n) begin
        if (!rst_n) begin
            data_out <= 0;
        end else begin
            data_out <= data_in;
        end
    end
endmodule
- Naming: snake_case for modules/signals, UPPER_SNAKE_CASE for parameters
- Reset: Active-low async reset (rst_n)
- Assignments: Non-blocking (<=) in sequential, blocking (=) in combinational
- Clock: Single domain unless noted
`timescale 1ns/1ps
module tb_my_module;
    reg clk, rst_n;
    reg [15:0] data_in;
    wire [15:0] data_out;
    my_module #(.DATA_WIDTH(16)) dut (
        .clk(clk), .rst_n(rst_n),
        .data_in(data_in), .data_out(data_out)
    );
    initial begin
        clk = 0;
        forever #5 clk = ~clk;
    end
    initial begin
        $dumpfile("work/my_module.vcd");
        $dumpvars(0, tb_my_module);
        rst_n = 0; data_in = 0;
        #20 rst_n = 1;
        #10 data_in = 16'hABCD;
        #10;
        if (data_out == 16'hABCD) $display("PASS");
        else $display("FAIL");
        $finish;
    end
endmodule
Run: iverilog -o work/test tb_my_module.v my_module.v && vvp work/test
#include "ap_int.h"
#include "hls_stream.h"
void my_function(
    hls::stream<ap_uint<32>>& input,
    hls::stream<ap_uint<32>>& output,
    ap_uint<8> config
) {
    #pragma HLS INTERFACE axis port=input
    #pragma HLS INTERFACE axis port=output
    #pragma HLS INTERFACE s_axilite port=config
    for (int i = 0; i < 100; i++) {
        #pragma HLS PIPELINE II=1
        ap_uint<32> data = input.read();
        output.write(data + config);
    }
}
cd hardware/hls
v++ -c --mode hls \
--part xc7z020clg400-1 \
--kernel my_function \
--hls.clock 10 \
--config config.ini \
src/my_function.cpp
The current HLS design targets 720 neurons with aggressive pipelining:
- Pipeline: #pragma HLS PIPELINE II=1 (all major loops, LTD, LTP, and WEIGHT_SUM, run at II=1)
- Loop unroll: #pragma HLS UNROLL factor=4 (used on the LTD_LOOP, RSTDP_INNER, and DECAY loops)
- Array partition: Weight memory uses 8 banks (cyclic factor=2 on dim=1, factor=4 on dim=2). Trace arrays use cyclic factor=4.
- Dataflow: #pragma HLS DATAFLOW for parallelism
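The weight-memory partitioning described in the list can be modeled in a few lines of Python. This is a rough sketch rather than the HLS source: it assumes the weight array is indexed as weights[i][j] and that cyclic partitioning places element (i, j) in bank (i mod 2, j mod 4). The name bank_of is hypothetical.

```python
from typing import Tuple

# Partition factors taken from the text: cyclic factor=2 on dim 1, factor=4 on dim 2.
DIM1_FACTOR = 2
DIM2_FACTOR = 4


def bank_of(i: int, j: int) -> Tuple[int, int]:
    """Return the (dim1, dim2) bank that holds element (i, j) under cyclic partitioning."""
    return (i % DIM1_FACTOR, j % DIM2_FACTOR)


# Every element maps to one of 2 * 4 = 8 banks.
banks = {bank_of(i, j) for i in range(8) for j in range(8)}
num_banks = len(banks)

# Four consecutive column indices in one row land in four different banks,
# so four reads in the same cycle do not compete for one memory port.
row_access = [bank_of(0, j) for j in range(4)]
conflict_free = len(set(row_access)) == 4

print(num_banks, conflict_free)
```

Matching the dim-2 partition factor to the unroll factor of 4 is the usual reason for this layout: each unrolled copy of the loop body gets its own bank.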
Avoid DSP usage: Use shifts instead of multiplies when possible.
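As an illustration of the shift-instead-of-multiply advice, a decay by a factor of (1 - 2^-k) needs only one shift and one subtraction. The sketch below models that integer arithmetic in Python. The function name decay_shift and the choice k=3 are assumptions for illustration, not values taken from the HLS source.

```python
def decay_shift(value: int, k: int) -> int:
    """Approximate value * (1 - 2**-k) with a right shift and a subtraction."""
    return value - (value >> k)


trace = 200
decayed = decay_shift(trace, 3)  # 200 - (200 >> 3) = 200 - 25 = 175

# Truncation caveat: values below 2**k shift to zero and therefore never decay.
stuck = decay_shift(7, 3)  # 7 - 0 = 7

print(decayed, stuck)
```

The second call shows the trade-off: small values stop decaying, so a design that relies on traces reaching zero needs a separate floor or clear condition.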
Key constants (in snn_top_hls.h):
- MAX_NEURONS = 720, MAX_SYNAPSES = 518400
- NEURON_ID_WIDTH = 10 (10-bit neuron IDs via neuron_id_t)
- WEIGHT_WIDTH = 4, MAX_INPUT_CHANNELS = 784
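The constants are consistent with one another, which a short Python check makes visible. The values are copied from the list. Reading MAX_SYNAPSES as an all-to-all 720 x 720 matrix is an inference from the arithmetic, not something the header is quoted as stating, and the storage figure ignores any padding the tools may add.

```python
# Values copied from the constants listed for snn_top_hls.h.
MAX_NEURONS = 720
MAX_SYNAPSES = 518400
NEURON_ID_WIDTH = 10
WEIGHT_WIDTH = 4

# 720 * 720 = 518400, i.e. one synapse per (pre, post) neuron pair.
assert MAX_SYNAPSES == MAX_NEURONS * MAX_NEURONS

# Neuron IDs 0..719 need 10 bits, because 512 <= 719 < 1024.
min_id_bits = (MAX_NEURONS - 1).bit_length()
assert min_id_bits == NEURON_ID_WIDTH

# Raw weight storage with 4-bit weights and no padding.
weight_bits = MAX_SYNAPSES * WEIGHT_WIDTH
weight_kib = weight_bits / 8 / 1024

print(min_id_bits, weight_bits, weight_kib)
```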
software/python/snn_fpga_accelerator/
├── __init__.py
├── accelerator.py # Main API
├── cli.py # Command-line interface
├── deploy.py # Deployment utilities
├── encoder.py # Delta-sigma encoder
├── exceptions.py # Custom exceptions
├── fpga_controller.py # FPGA control interface
├── hw_accurate_simulator.py # Bit-accurate sim (LIF, STDP)
├── layer.py # SNN layer abstraction
├── learning.py # STDP/R-STDP
├── neuron.py # HW-accurate core group sim
├── pytorch_interface.py # PyTorch integration
├── pytorch_snn_layers.py # Custom PyTorch layers
├── rtl_simulator.py # RTL simulation driver
├── spike_encoding.py # Spike encoders (Poisson, Temporal, Phase)
├── spyketorch_compat.py # SpykeTorch compatibility
├── training.py # Training loop utilities
├── utils.py # Utilities (tau conversion, visualization)
└── xrt_backend.py # XRT/Vitis backend
cd software/python
pytest tests/
pytest --cov=snn_fpga_accelerator
black .
flake8 .
mypy .
Example: Add a new spike encoder
- Define interface (spike_encoding.py):
class MyEncoder:
    def __init__(self, num_neurons, duration, my_param):
        self.num_neurons = num_neurons
        self.duration = duration
        self.my_param = my_param

    def encode(self, input_data):
        # Emit a (neuron index, spike time) pair for every input above 0.5
        spike_times = []
        for i, val in enumerate(input_data):
            if val > 0.5:
                spike_times.append((i, val * self.duration))
        return spike_times
- Add tests (tests/test_encoders.py):
import numpy as np
from snn_fpga_accelerator.spike_encoding import MyEncoder

def test_my_encoder():
    encoder = MyEncoder(10, 0.1, 1.0)
    # Fixed input: random data can occasionally contain no value above 0.5
    data = np.linspace(0.0, 1.0, 10)
    spikes = encoder.encode(data)
    assert len(spikes) > 0
- Document (docs/api_reference.md)
- Add example (examples/)
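To see what the example encoder produces, the sketch below repeats the class so it runs on its own and feeds it a fixed input. The duration of 100.0 is an arbitrary illustrative value in unspecified time units, and my_param is stored but unused, as in the example above.

```python
class MyEncoder:
    """Self-contained copy of the example encoder from this guide."""

    def __init__(self, num_neurons, duration, my_param):
        self.num_neurons = num_neurons
        self.duration = duration
        self.my_param = my_param

    def encode(self, input_data):
        # Emit (neuron index, spike time) for every input strictly above 0.5.
        spike_times = []
        for i, val in enumerate(input_data):
            if val > 0.5:
                spike_times.append((i, val * self.duration))
        return spike_times


encoder = MyEncoder(num_neurons=4, duration=100.0, my_param=1.0)
spikes = encoder.encode([0.25, 0.75, 1.0, 0.5])
print(spikes)  # [(1, 75.0), (2, 100.0)]
```

Note that an input of exactly 0.5 produces no spike because the comparison is strict, and that larger inputs spike later, since the spike time scales with the value.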
# Simulate with waveforms
cd hardware/hdl/sim
iverilog -o work/test tb_module.v module.v
vvp work/test
gtkwave work/waves.vcd
Check synthesis report: hls_output/hls/syn/report/csynth.rpt
import logging
logging.basicConfig(level=logging.DEBUG)
Check Vivado timing report: outputs/integrated_timing.rpt
Key metrics:
- WNS (Worst Negative Slack): Must be ≥ 0
- TNS (Total Negative Slack): Should be 0
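A negative WNS can be turned into an estimate of the clock the design could actually meet: the shortest workable period is the target period minus the WNS. The helper below is a minimal sketch of that arithmetic. The function name is hypothetical and the 10 ns period is only an example value.

```python
def max_frequency_mhz(clock_period_ns: float, wns_ns: float) -> float:
    """Estimate the achievable clock frequency from the target period and WNS."""
    # Negative slack lengthens the minimum period; positive slack shortens it.
    min_period_ns = clock_period_ns - wns_ns
    return 1000.0 / min_period_ns


failing = max_frequency_mhz(10.0, -0.5)  # minimum period 10.5 ns, about 95.24 MHz
passing = max_frequency_mhz(10.0, 0.0)   # exactly meets the 100 MHz target

print(f"{failing:.2f} MHz, {passing:.2f} MHz")
```

This is only a first-order estimate, since rerunning implementation at a different constraint can change placement and routing and therefore the slack.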
import time
start = time.time()
output = accel.infer(spikes)
print(f"Inference time: {time.time() - start:.3f}s")
- Vivado synthesis fails: Check for syntax errors with iverilog -t null -Wall file.v
- HLS build fails: Check the C++ syntax and add any missing #include directives
- Python import error: Run pip install -e . from software/python to install in dev mode
- Timing violations: Reduce the clock frequency or add pipeline stages
- Resource overflow: Reduce the network size or optimize modules
See CONTRIBUTING.md for guidelines.
- User Guide - Usage and examples
- API Reference - Python API
- Architecture - System design