Mean-reversion statistical arbitrage on cointegrated crypto baskets — walk-forward validation, two-tier regime filtering, bucketed position sizing, and inverse-volatility risk parity with Net Sharpe 1.76 out-of-sample.
This project implements an end-to-end quantitative research pipeline for crypto statistical arbitrage based on cointegration. Rather than trading single pairs, the strategy discovers 2–4 asset baskets with a stable cointegrating relationship, constructs an Ornstein-Uhlenbeck mean-reversion signal on each spread, and filters signals through a two-tier regime framework before deploying into a diversified portfolio.
The entire pipeline is validated via a walk-forward backtest to prevent overfitting, and the final strategy is evaluated on a fully held-out 20% test set the model never touches during calibration.
| Metric | Gross | Net (20bps) |
|---|---|---|
| Sharpe Ratio | 1.953 | 1.764 |
| Cumulative Return | +6.50% | +5.61% |
| Max Drawdown | — | -1.38% |
| Win Rate | — | 72.2% |
| Cost Drag (Sharpe) | — | 0.190 |
| Annual Turnover | — | 7.05× |
| Entry Events | — | 9 (~15 annualised) |
Important caveat: The OOS Sharpe of 1.76 is based on only 9 round-trip trades (~220 days). This is not yet statistically meaningful — several dozen independent trades at minimum would be needed before the Sharpe ratio becomes a reliable indicator of true edge.
Data Acquisition Basket Discovery Walk-Forward Backtest
────────────────── ──────────────── ─────────────────────
Binance daily OHLCV → Johansen cointegration → 365-day train windows
23 crypto assets (rank = 1 filter) 60-day test windows
Mar 2023–Mar 2026 Quality filters: 8 folds total
18 retained after ADF stationarity Parameter calibration
quality filter Variance ratio per fold (hl_window
80/20 train/test Half-life 2–15 days scaled to median HL)
Rolling HL stability
Spread predictability
Dynamic rolling weights
(re-estimated monthly)
▼
Portfolio Evaluation Basket Selection Signal Generation
──────────────────── ──────────────── ─────────────────
4 baskets deployed ← Composite score: ← OU mean-reversion
Bucketed positions mean_SR × consistency signal on dynamic
flat / 0.5 / 1.0 × log1p(n_folds) spread
Risk parity weights Filter: ≥2 folds, Two-tier regime:
inverse-vol, 30d mean_SR > 0 Hard: vol spike,
20bps tcost on Greedy diversification: ADF breakdown
signal changes only max 2 baskets/asset Soft: continuous
Market neutrality score weighting
analysis (OLS)
MIT License — free to use, modify, and distribute with attribution.