Hyperliquid Historical Orderbook Data: How to Get L2 Depth
By Imbalance Labs Research
TL;DR
Hyperliquid is the most active on-chain perpetuals exchange, but it does not expose a ready-made historical Level 2 orderbook API. You have three options: (1) record the live WebSocket feed yourself, (2) reconstruct the book from raw node archives, or (3) buy a cleaned, time-aligned dataset. This guide explains the trade-offs and shows the Python to load L2 depth once you have it.
Why Hyperliquid Orderbook Data Is Worth the Trouble
Hyperliquid runs a fully on-chain central limit order book (CLOB) for perpetual futures. Unlike most centralized exchanges, where market makers post and pull zero-fee quotes thousands of times per second, every resting order on Hyperliquid is a genuine on-chain commitment. That property makes its Level 2 depth one of the cleanest microstructure signals available in crypto — far less polluted by spoofing and phantom liquidity. For anyone modeling slippage, order-flow imbalance, or statistical arbitrage, this is exactly the kind of orderbook data you want.
The catch: getting historical depth, not just a live feed, is where most researchers hit a wall. Below are the three realistic paths, ending with the one that needs the least engineering.
Option 1 — The Native Hyperliquid API (and its limits)
Hyperliquid exposes a public REST endpoint at POST /info. The relevant request types for orderbook work are:
- l2Book: returns the current L2 snapshot for one coin (bid/ask levels with price, size, and order count). It is a point-in-time snapshot, not a history.
- candleSnapshot: returns OHLCV candles with a limited lookback window. Useful for price, but it carries no depth information at all.
There is also a WebSocket at wss://api.hyperliquid.xyz/ws where you can subscribe to a live l2Book stream. This is the canonical way to start collecting depth — but only from the moment you connect, forward. If you want last year's data, the API cannot give it to you.
import requests

# Current L2 snapshot for BTC (NOT historical)
resp = requests.post(
    "https://api.hyperliquid.xyz/info",
    json={"type": "l2Book", "coin": "BTC"},
    timeout=10,
)
resp.raise_for_status()  # fail loudly on HTTP errors
book = resp.json()
bids, asks = book["levels"]  # each level: {"px", "sz", "n"}
print("Best bid:", bids[0], " Best ask:", asks[0])

Verdict: great for live data and prototyping, useless for backfilling history. To build a 12-month dataset this way you would have to run a collector continuously for 12 months and still handle reconnects, dropped messages, and clock drift.
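For a sense of what "run a collector" involves, here is a minimal sketch of a forward-only recorder built on the third-party websockets package. The subscription payload and message fields follow the public WebSocket documentation as we understand it, so verify them against the current docs. The function names, the JSON Lines output format, and the retry delay are illustrative assumptions, not part of any official client.

```python
import asyncio
import json
import time

WS_URL = "wss://api.hyperliquid.xyz/ws"


def subscribe_message(coin: str) -> str:
    """JSON payload that subscribes to the live l2Book stream for one coin."""
    return json.dumps(
        {"method": "subscribe", "subscription": {"type": "l2Book", "coin": coin}}
    )


def to_record(raw: str, received_at: float):
    """Flatten one WebSocket message; return None if it is not a book update."""
    msg = json.loads(raw)
    if msg.get("channel") != "l2Book":
        return None  # e.g. subscription acknowledgements
    data = msg["data"]
    bids, asks = data["levels"]
    return {
        "coin": data["coin"],
        "exchange_ms": data["time"],          # timestamp sent by the exchange
        "local_ms": int(received_at * 1000),  # local clock, useful for measuring drift
        "bids": [(lvl["px"], lvl["sz"]) for lvl in bids],
        "asks": [(lvl["px"], lvl["sz"]) for lvl in asks],
    }


async def collect(coin: str, out_path: str, retry_seconds: float = 5.0) -> None:
    """Append book snapshots to a JSON Lines file, reconnecting after any drop."""
    import websockets  # third-party: pip install websockets

    while True:
        try:
            async with websockets.connect(WS_URL) as ws:
                await ws.send(subscribe_message(coin))
                with open(out_path, "a") as out:
                    async for raw in ws:
                        record = to_record(raw, time.time())
                        if record is not None:
                            out.write(json.dumps(record) + "\n")
                            out.flush()
        except Exception as exc:  # broad on purpose: keep the collector alive
            print(f"connection lost ({exc!r})")
        await asyncio.sleep(retry_seconds)  # pause before reconnecting


# To start recording (runs until interrupted):
# asyncio.run(collect("BTC", "btc_l2.jsonl"))
```

Storing both the exchange timestamp and the local receive time makes it possible to spot gaps and clock drift afterwards. Anything that happened while the socket was down is still lost, which is the core limitation of this option.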
Option 2 — Reconstruct From Raw Node Archives
Because Hyperliquid is a blockchain, the full order-by-order history exists in the chain's raw node data, which is published as a requester-pays S3 archive. In principle you can download it and replay every order placement, fill, and cancellation to rebuild the L2 book at any past timestamp.
In practice this is a serious data-engineering effort:
- The archives are large (terabyte-scale) and you pay egress to pull them.
- You must write a deterministic replay engine that maintains the book state per instrument and emits snapshots on a fixed cadence.
- You have to handle aggregation, mid-price normalization, and gap filling before the data is usable for modeling.
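The core of such a replay engine can be sketched in a few dozen lines. The event schema below (type, oid, side, px, sz) is a hypothetical simplification chosen for illustration. The real archive format is different and needs its own parser, so treat this as the shape of the state machine rather than a working reader.

```python
from collections import defaultdict
from decimal import Decimal


class BookReplayer:
    """Rebuilds an aggregated L2 book from a stream of order-level events."""

    def __init__(self):
        self.orders = {}  # oid -> (side, price, remaining size)
        self.levels = {"bid": defaultdict(Decimal), "ask": defaultdict(Decimal)}

    def _remove(self, oid, size):
        """Take `size` off a resting order and off its price level."""
        side, px, remaining = self.orders[oid]
        size = min(size, remaining)
        self.levels[side][px] -= size
        if self.levels[side][px] == 0:
            del self.levels[side][px]  # drop empty price levels
        if size == remaining:
            del self.orders[oid]
        else:
            self.orders[oid] = (side, px, remaining - size)

    def apply(self, event):
        """Apply one place, fill, or cancel event to the book state."""
        kind, oid = event["type"], event["oid"]
        if kind == "place":
            side, px, sz = event["side"], Decimal(event["px"]), Decimal(event["sz"])
            self.orders[oid] = (side, px, sz)
            self.levels[side][px] += sz
        elif kind == "fill" and oid in self.orders:
            self._remove(oid, Decimal(event["sz"]))
        elif kind == "cancel" and oid in self.orders:
            self._remove(oid, self.orders[oid][2])

    def snapshot(self, depth=10):
        """Top `depth` levels per side as (price, size), best price first."""
        bids = sorted(self.levels["bid"].items(), reverse=True)[:depth]
        asks = sorted(self.levels["ask"].items())[:depth]
        return bids, asks


# Hypothetical events: two bids at the same price, one ask, a partial fill, a cancel
events = [
    {"type": "place", "oid": 1, "side": "bid", "px": "100.0", "sz": "2"},
    {"type": "place", "oid": 2, "side": "bid", "px": "100.0", "sz": "1"},
    {"type": "place", "oid": 3, "side": "ask", "px": "100.5", "sz": "4"},
    {"type": "fill", "oid": 3, "sz": "1.5"},
    {"type": "cancel", "oid": 1},
]
book = BookReplayer()
for event in events:
    book.apply(event)
bids, asks = book.snapshot()
```

Decimal is used so that repeated additions and subtractions at a price level return exactly to zero. A production engine would also call snapshot() each time the event timestamp crosses a bar boundary, keep one replayer per instrument, and deal with order modifications and missing data.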
Verdict: this is the only fully self-serve route to deep history, but it can consume weeks of engineering and meaningful cloud spend before you write a single line of research code. This is precisely the problem we built Imbalance Labs to remove.
Option 3 — Ready-to-Use Derived Datasets
The third option is to skip collection entirely and start from a cleaned, time-aligned dataset. Our historical orderbook datasets reconstruct Hyperliquid L2 depth, aggregate it into fixed bars, and normalize it into an analysis-ready schema. The 5-minute Standard release covers 24 instruments over 12+ months with 47 engineered columns per row: OHLCV, cumulative bid/ask volume at 10 depth levels, and the basis-point distance of each level from mid-price. See the full 47-column schema for the exact field list.
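To make the depth columns concrete, the sketch below derives cumulative volume and basis-point distance from a single snapshot. It assumes a "level" means the k-th best price level. Only the bid_volume_level_N naming pattern is taken from this article; the mid_price and distance column names are hypothetical, and the schema documentation remains the authority on the real fields.

```python
def depth_features(bids, asks, levels=10):
    """Derive depth columns from one snapshot.

    bids and asks are lists of (price, size) floats, best price first.
    """
    mid = (bids[0][0] + asks[0][0]) / 2
    row = {"mid_price": mid}
    for side, book in (("bid", bids), ("ask", asks)):
        cumulative = 0.0
        for i, (px, sz) in enumerate(book[:levels], start=1):
            cumulative += sz  # total size from the best price down to level i
            row[f"{side}_volume_level_{i}"] = cumulative
            # distance of this level from mid, in basis points (1 bp = 0.01%)
            row[f"{side}_distance_bps_level_{i}"] = abs(px - mid) / mid * 10_000
    return row


features = depth_features(
    bids=[(99.0, 1.0), (98.0, 2.0)],
    asks=[(101.0, 3.0), (102.0, 1.0)],
)
```

With a mid-price of 100, the first bid level sits 100 bps away and holds a cumulative size of 1, while the second sits 200 bps away with a cumulative size of 3.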
Because each file ships as compressed CSV or Parquet, loading a full instrument's history takes a single call (shown here for Parquet):
import pandas as pd

# Cleaned, time-aligned Hyperliquid L2 depth: 12+ months in one file
df = pd.read_parquet("btc_l2_depth_5m.parquet")

# Order-book imbalance at the top of book
df["obi_l1"] = df["bid_volume_level_1"] / (
    df["bid_volume_level_1"] + df["ask_volume_level_1"]
)
print(df[["timestamp_utc", "close_price", "obi_l1"]].tail())

From here you can go straight to research (computing imbalance signals, estimating slippage, or feeding 12 months of depth into an RL environment) instead of babysitting a WebSocket collector.
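The same ratio extends to deeper levels. The helper below is a small sketch that assumes the volume columns follow the bid_volume_level_N / ask_volume_level_N pattern for every level (only level 1 appears in this article). The function name is ours, chosen for illustration.

```python
def depth_imbalance(data, level):
    """Bid share of cumulative volume at a given depth level.

    `data` can be a plain dict of floats (one row) or a pandas DataFrame;
    the arithmetic is the same in both cases.
    """
    bid = data[f"bid_volume_level_{level}"]
    ask = data[f"ask_volume_level_{level}"]
    return bid / (bid + ask)


# One hypothetical row: bid-heavy at the top of book, ask-heavy deeper down
row = {
    "bid_volume_level_1": 3.0, "ask_volume_level_1": 1.0,
    "bid_volume_level_5": 10.0, "ask_volume_level_5": 30.0,
}
top = depth_imbalance(row, 1)   # 0.75
deep = depth_imbalance(row, 5)  # 0.25
```

On the loaded frame, the equivalent call would be df["obi_l5"] = depth_imbalance(df, 5). Comparing shallow and deep imbalance is one simple way to see whether pressure at the touch is backed by the rest of the book. Rows where both sides are empty need separate handling, since the ratio is undefined there.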
Which Option Should You Choose?
| Approach | Gets History? | Effort | Best For |
|---|---|---|---|
| Native /info API + WS | Forward only | Medium (ongoing) | Live trading, prototyping |
| Raw node archives | Yes, full | Very high | Teams with data infra |
| Derived datasets | Yes, instant | None | Research & backtesting |
If your edge is in research and you bill your time at anything close to a quant's rate, Option 3 is almost always the rational choice. The cost of a dataset is a rounding error against weeks of pipeline engineering.
Next Steps: From Data to Signal
Once you have clean Hyperliquid depth, the interesting work begins. Two companion reads:
- Orderbook Imbalance Signals — turning bid/ask depth into predictive features.
- Why OHLCV Models Fail — why depth-aware slippage modeling beats candle-only backtests.
Frequently Asked Questions
Does Hyperliquid have a historical orderbook data API?
Can I get free Hyperliquid historical data?
What is the difference between L2 orderbook data and OHLCV candles?
Why use Hyperliquid data instead of a centralized exchange (CEX)?
Skip the Pipeline
Get cleaned, time-aligned Hyperliquid L2 orderbook depth across 24 instruments and 12+ months — ready for Pandas and DuckDB. Try the free 7-day sample first.
Full 47-column schema documentation available.