Crypto Backtesting Explained: What a Backtest Reveals (and the Data It Needs)
By Imbalance Labs Research
TL;DR
A backtest is a controlled experiment on historical data. It reveals an equity curve, drawdown, Sharpe ratio, win rate, and regime sensitivity, but only for the reality your data encodes. Backtest on price-only candles and you measure a fantasy with zero slippage and unlimited liquidity. Backtest on Level 2 orderbook depth and you measure something much closer to what live trading will actually do. If you're shopping for a backtester, the harder problem is usually the data feeding it.
What a Backtest Actually Is
A backtest replays a trading rule over historical market data and records what would have happened. You define a signal (“go long when this condition is true”), an execution model (how orders fill), and a cost model (fees and slippage). The backtester walks the data bar by bar, simulates the trades, and produces a performance record. Done honestly, it's the cheapest way to reject a bad idea before it costs real money.
Done carelessly, it's the most expensive form of self-deception in quant finance. The difference is rarely the backtesting engine — it's the assumptions and the data.
What a Backtest Reveals
A good backtest surfaces far more than a single return number. The metrics that actually matter:
- Equity curve — the shape of growth over time, and whether it came from a few lucky trades or a consistent edge.
- Maximum drawdown — the worst peak-to-trough loss. It decides whether you could actually hold the strategy through the pain.
- Sharpe / risk-adjusted return — return per unit of volatility. Drawdown-adjusted comparisons follow the same logic: a 40% return with an 80% drawdown is worse than 15% with 8%.
- Win rate & turnover — how often you're right, and how often you trade. High turnover makes the strategy exquisitely sensitive to costs.
- Regime sensitivity — does the edge survive bull, bear, and chop, or only one of them?
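As a rough sketch, the metrics listed above can be computed from a per-bar equity curve and position series with pandas. The function name `summarize`, the `bars_per_year` default, the zero risk-free rate, and the per-bar (rather than per-trade) win rate are all simplifying assumptions made for illustration, not a prescribed method.

```python
import numpy as np
import pandas as pd

BARS_PER_YEAR_5M = 365 * 24 * 12  # crypto trades 24/7; assumes 5-minute bars


def summarize(equity: pd.Series, position: pd.Series,
              bars_per_year: int = BARS_PER_YEAR_5M) -> dict:
    """Summarize a backtest from its per-bar equity curve and position.

    `equity` and `position` are hypothetical inputs sharing one index:
    equity after each bar, and the position held during that bar.
    """
    returns = equity.pct_change().dropna()

    # Maximum drawdown: worst fall from a running peak (a negative number)
    max_drawdown = (equity / equity.cummax() - 1).min()

    # Annualized Sharpe ratio with the risk-free rate assumed to be zero
    vol = returns.std()
    sharpe = np.sqrt(bars_per_year) * returns.mean() / vol if vol > 0 else float("nan")

    # Per-bar win rate: share of in-market bars that made money
    in_market = position.loc[returns.index] != 0
    win_rate = (returns[in_market] > 0).mean() if in_market.any() else float("nan")

    # Turnover: how many times the position changed
    turnover = int(position.diff().abs().sum())

    return {
        "max_drawdown": float(max_drawdown),
        "sharpe": float(sharpe),
        "win_rate": float(win_rate),
        "turnover": turnover,
    }


# Toy inputs, invented purely to exercise the function
equity = pd.Series([1.00, 1.10, 0.99, 1.089])
position = pd.Series([0, 1, 1, 1])
stats = summarize(equity, position)
print(stats)
```

Regime sensitivity can be approximated the same way: call the function on separate date ranges (for example a bull, a bear, and a sideways stretch) and compare the results.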
Notice that two of these — turnover sensitivity and the realism of the equity curve itself — depend entirely on how accurately you model execution. Which brings us to the catch.
The Catch: A Backtest Only Reveals What's in Your Data
Here is the uncomfortable truth most backtesting tutorials skip: a backtest is a function of its inputs. If your data doesn't contain liquidity information, your backtest cannot reveal liquidity risk — it will silently assume infinite liquidity at the close price.
This is the core failure of OHLCV-only backtests. A 5-minute candle tells you the high was $67,500, but not whether there was enough resting size at $67,500 to fill your order. Assume there was, and your backtest prints a beautiful equity curve that quietly evaporates in production. We unpack this in depth in Why OHLCV Models Fail.
Level 2 orderbook data closes the gap. With the resting bid/ask volume at multiple price levels, a backtest can sweep the book to compute the price you'd actually fill at — turning “assumed” execution into modelled execution. The full recipe is in Calculating Realistic Slippage with L2 Data.
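To make "sweeping the book" concrete, here is a minimal sketch of a volume-weighted fill calculation. The function name `sweep_fill_price`, the `(price, size)` tuple layout, and the example book are assumptions for illustration; a real dataset would supply the levels from its own depth columns.

```python
from typing import Sequence, Tuple


def sweep_fill_price(levels: Sequence[Tuple[float, float]], qty: float) -> float:
    """Volume-weighted average fill price for a market order of size `qty`.

    `levels` is one side of the book as (price, size) pairs, best price first.
    Raises ValueError if the visible depth cannot absorb the whole order.
    """
    if qty <= 0:
        raise ValueError("qty must be positive")
    remaining, cost = qty, 0.0
    for price, size in levels:
        take = min(remaining, size)   # consume as much as this level offers
        cost += take * price
        remaining -= take
        if remaining <= 0:
            break
    if remaining > 0:
        raise ValueError("order exceeds visible depth")
    return cost / qty


# Hypothetical ask side: only 0.4 BTC rests at the best price
asks = [(67_500.0, 0.4), (67_505.0, 1.0), (67_520.0, 2.0)]
fill = sweep_fill_price(asks, 1.0)
slippage_bps = (fill / asks[0][0] - 1) * 10_000
print(f"fill={fill:.2f} slippage={slippage_bps:.2f} bps")
```

With an OHLCV-only backtest, the same 1 BTC order would be assumed to fill entirely at the best price; here part of it has to take the second level, and the difference shows up as slippage.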
A Depth-Aware Backtest Loop
The skeleton of a realistic backtest isn't complicated — the realism comes from the data it reads. Here a position is decided from information available before each bar (no look-ahead), and every position change pays a cost:
import pandas as pd

df = pd.read_parquet("btc_l2_depth_5m.parquet")

# Signal you only get from L2 depth: top-of-book imbalance
df["obi"] = df["bid_volume_level_1"] / (
    df["bid_volume_level_1"] + df["ask_volume_level_1"]
)

fee = 5 / 10_000  # 5 bps charged on every position change (fees + slippage)
equity, pos = 1.0, 0
curve = []

for i in range(1, len(df)):
    # Decide this bar's position from the PREVIOUS bar's signal (causal)
    new_pos = 1 if df["obi"].iloc[i - 1] > 0.55 else 0
    ret = df["close_price"].iloc[i] / df["close_price"].iloc[i - 1] - 1
    cost = fee if new_pos != pos else 0.0
    equity *= 1 + new_pos * ret - cost
    pos = new_pos
    curve.append(equity)

print(f"Final equity: {equity:.3f} ({(equity - 1) * 100:+.1f}%)")

Swap the obi signal for an RSI or MACD rule and you have a fair comparison: does the orderbook signal, which needs depth data, beat a price-only indicator after costs? That question is the whole game, and it's exactly what the demo below lets you explore. (For the imbalance signal itself, see Orderbook Imbalance Signals.)
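For the price-only side of that comparison, one possible RSI rule is sketched below. The 14-bar period, the threshold of 50, and the helper names `rsi` and `rsi_positions` are assumptions chosen for illustration. The smoothing uses pandas' exponential weighting as a common approximation of Wilder's method, so values may differ slightly from other RSI implementations.

```python
import pandas as pd


def rsi(close: pd.Series, period: int = 14) -> pd.Series:
    """Relative Strength Index using Wilder-style exponential smoothing."""
    delta = close.diff()
    gain = delta.clip(lower=0)       # upward moves, zero otherwise
    loss = (-delta).clip(lower=0)    # downward moves as positive numbers
    avg_gain = gain.ewm(alpha=1 / period, adjust=False, min_periods=period).mean()
    avg_loss = loss.ewm(alpha=1 / period, adjust=False, min_periods=period).mean()
    rs = avg_gain / avg_loss         # inf when there were no losses -> RSI 100
    return 100 - 100 / (1 + rs)


def rsi_positions(close: pd.Series, period: int = 14,
                  threshold: float = 50.0) -> pd.Series:
    """Long (1) when the PREVIOUS bar's RSI is above `threshold`, else flat (0)."""
    signal = (rsi(close, period) > threshold).astype(int)
    # Shift by one bar so each position only uses information already known
    return signal.shift(1).fillna(0).astype(int)


# Synthetic price paths, invented to show the two extremes
rising = pd.Series(range(1, 41), dtype=float)
falling = pd.Series(range(40, 0, -1), dtype=float)
print(rsi_positions(rising).iloc[-1], rsi_positions(falling).iloc[-1])
```

In the loop above, `new_pos` would then read from this position series instead of the `obi` column, with the same cost per position change, so both signals are judged on identical terms.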
Try a live backtest — no setup
Every dataset page has an interactive backtester running on a real 7-day orderbook sample. Move the sliders — OBI threshold, RSI settings, cost per trade — and watch the equity curve and drawdown react against buy & hold, in your browser, on real data.
Open the BTC backtester demo →
The Data Backtesters Actually Need
If you're evaluating backtesters, you'll quickly find the engine is the easy part — Backtrader, vectorbt, QuantConnect, or a 50-line loop like the one above all work. The bottleneck is feeding them clean, realistic data. For crypto specifically, that means:
- Continuous history across a full market cycle — bull, bear, and chop — so regime sensitivity is testable.
- Level 2 depth, not just OHLCV, so execution cost is modelled rather than assumed.
- Clean, time-aligned bars — no gaps, no clock drift, no async WebSocket artifacts.
- A signal-rich venue. We source from Hyperliquid's on-chain orderbook, where resting orders are real commitments rather than the spoofing that pollutes many CEX feeds — see how to get Hyperliquid historical data.
Our historical orderbook datasets are built for exactly this: 24 instruments, 12+ months, 10-level depth, normalized into an analysis-ready 47-column schema you can drop straight into your backtester.
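Whatever the source, the "clean, time-aligned bars" requirement is easy to sanity-check before running a backtest. The sketch below assumes 5-minute bars and a series of bar timestamps; the function name `check_bar_grid` and the sample timestamps are hypothetical.

```python
import pandas as pd


def check_bar_grid(timestamps, freq: str = "5min"):
    """Return (missing, off_grid) for a series of bar timestamps.

    missing  : grid slots between the first and last bar with no bar at all
    off_grid : bars whose timestamp is not an exact multiple of `freq`
    """
    ts = pd.DatetimeIndex(pd.to_datetime(timestamps)).sort_values()
    snapped = ts.floor(freq)
    expected = pd.date_range(snapped[0], snapped[-1], freq=freq)
    missing = expected.difference(snapped)
    off_grid = ts[ts != snapped]
    return missing, off_grid


# Invented timestamps: the 00:10 bar is absent and one bar arrived 3 s late
sample = pd.Series([
    "2024-01-01 00:00:00",
    "2024-01-01 00:05:03",
    "2024-01-01 00:15:00",
])
missing, off_grid = check_bar_grid(sample)
print("missing:", list(missing))
print("off grid:", list(off_grid))
```

An empty result for both outputs is a reasonable precondition for trusting per-bar returns; a non-empty one usually means the data should be repaired or the affected window excluded.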
Frequently Asked Questions
What does a backtest actually reveal?
What data do I need to backtest a crypto strategy properly?
Can I backtest with free OHLCV data?
Why do backtested strategies fail in live trading?
Backtest on data that tells the truth
Stop measuring a fantasy. Feed your backtester real Hyperliquid Level 2 depth across 24 instruments and 12+ months — and start with a free 7-day sample.