Why DEX Data? Escaping the CEX Noise.
CEX Orderbooks Are Full of Noise
- ✕ CEX Spoofing: Traditional orderbooks (like Binance) are filled with HFT spoofing and microstructural noise
- ✕ Months of Data Engineering: Building L2 depth pipelines from raw decentralized exchange archives requires custom parsers, timestamp alignment, and massive compute, all before you even start your research
- ✕ Zero-fee API spam creates phantom liquidity that disappears at execution
- ✕ Raw L2/L3 feeds produce terabytes of microstructural noise
Cleaned and Normalized L2 Data, Ready to Use
- ✓ True Market Intent: DEX orderbooks from top L1 Perp chains reflect genuine liquidity and true institutional positioning, free from zero-fee API spam
- ✓ Ready-to-Use CSV Orderbook Datasets: We've done the heavy data engineering. Our proprietary pipeline handles the raw exchange data, aligns timestamps, normalizes depth, and packages it for instant Pandas/DuckDB ingestion
- ✓ Time-aligned orderbook bars: no clock drift, no asynchrony. High-fidelity 5-minute resolution for noise-free microstructure research
- ✓ 10-level depth: granular bid/ask profiles, cumulative volumes, and distance from mid-price
- ✓ Pre-computed Orderbook Imbalance & Stat-Arb Ready: derived features and the missing piece for your CEX vs. DEX statistical arbitrage models
Hyperliquid Orderbook Data
Liquidity profiles are sourced and derived from Hyperliquid — the leading L1 Perpetual Decentralized Exchange — capturing the most active on-chain derivatives orderbook flow.
Every Row. Every Field. Documented.
Each row = one 5-minute aggregated perpetual futures orderbook snapshot — ready for crypto backtesting and deep learning
| Column | Type | Description |
|---|---|---|
| timestamp_utc | DateTime | ISO 8601 UTC timestamp |
| instrument_symbol | String | Trading pair (e.g., BTC-USDT) |
| open_price | Float | Mid-price at bar open |
| high_price | Float | Highest mid-price in bar |
| low_price | Float | Lowest mid-price in bar |
| close_price | Float | Mid-price at bar close |
| interval_traded_volume | Float | Taker flow volume proxy |
| bid_volume_level_1..10 | Float | Cumulative passive bid volume |
| ask_volume_level_1..10 | Float | Cumulative passive ask volume |
| bid_distance_level_1..10 | Float | Distance from mid-price (bps) |
| ask_distance_level_1..10 | Float | Distance from mid-price (bps) |
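As a rough sketch of how a file with this schema might be loaded and sanity-checked in Pandas: the snippet below assumes the `1..10` ranges expand to column names such as `bid_volume_level_1` through `bid_volume_level_10` (the exact expanded names are an assumption, not confirmed by the table), and it uses a synthetic one-row file in place of a real download.

```python
import io
import pandas as pd

LEVELS = range(1, 11)  # 10 depth levels per side

def expected_columns():
    """Return the 47 column names implied by the schema table (names assumed)."""
    base = ["timestamp_utc", "instrument_symbol", "open_price", "high_price",
            "low_price", "close_price", "interval_traded_volume"]
    depth = [f"{side}_{kind}_level_{i}"
             for kind in ("volume", "distance")
             for side in ("bid", "ask")
             for i in LEVELS]
    return base + depth

def load_bars(source):
    """Read bars from a path or file-like object and verify the header."""
    df = pd.read_csv(source, parse_dates=["timestamp_utc"])
    missing = set(expected_columns()) - set(df.columns)
    if missing:
        raise ValueError(f"missing columns: {sorted(missing)}")
    return df

# Synthetic one-row file standing in for a real download.
cols = expected_columns()
row = ["2025-03-01T00:00:00Z", "BTC-USDT", 100.0, 101.0, 99.5, 100.5, 12.3]
row += [float(i) for i in range(1, 41)]  # placeholder depth values
csv_text = ",".join(cols) + "\n" + ",".join(map(str, row)) + "\n"

bars = load_bars(io.StringIO(csv_text))
print(bars.shape)
```

When `load_bars` is given a path ending in `.gz`, `pd.read_csv` infers the gzip compression from the extension, so a compressed file should load the same way.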
Built For Quantitative Minds
Quant Researchers
Build order flow features for LSTM, Transformer, and Reinforcement Learning (RL) trading environments without months of data engineering.
Execution Algos
Backtest TWAP, VWAP, and iceberg strategies against real depth profiles with institutional backtesting data. Estimate slippage pre-deployment.
Market Makers
Study bid-ask dynamics, quote density, and liquidity provision patterns across 24 instruments.
Academics
Institutional-quality crypto data for quantitative research, without exchange partnerships or Bloomberg terminals.
One-Time Purchase. No Subscriptions.
Choose your resolution. Delivered instantly after payment. Compressed CSV.
- ✓ 1 of 24 instruments · ~96K bars
- ✓ Personal license · Non-commercial
- ✓ All 24 instruments · ~2.3M bars
- ✓ Team (10 users) · Commercial use
- ✓ Live strategy feeding · Priority support
Need redistribution rights, custom data, or API access?
Contact us for Enterprise pricing ($5,000+) →
24 Instruments. March 2025 – February 2026.
Capturing the 2025/2026 crypto market transition — bull runs, corrections, and regime changes.
Need Different Data?
We can generate orderbook depth data for any cryptocurrency, at any candle interval (1m, 5m, 15m, 1h, etc.), covering any time period. Custom depth levels (up to 30) and exchange selection available.
Request Custom Dataset →
Try Before You Buy
7-day sample of all 24 instruments. 47 columns. 5-minute resolution.
Just want a quick look? Download BTC sample directly — no email, no spam.
↓ BTC 7-Day Sample (CSV, 739 KB)
Want all 24 instruments? Enter your email above for the full sample pack.
Market Microstructure Data: OHLCV Is Not Enough.
Traditional candlestick data tells you what happened. Orderbook depth data tells you why it happened — and what's about to happen next.
See the Invisible Forces
Every price candle hides a storm of bid-ask dynamics. Large institutional orders, spoofing walls, and liquidity vacuums shape price action — but are invisible in OHLCV data. Our 10-level depth profiles expose the micro-structure behind every 5-minute bar: cumulative passive liquidity on both sides, measured in actual volume and basis-point distance from mid-price.
Deep Learning & ML-Ready Feature Set
47 pre-computed columns per row means you skip months of data engineering. Feed directly into LSTM networks, Transformer architectures, reinforcement learning environments, or gradient-boosted models. The bid-ask imbalance ratio — widely cited in academic microstructure literature — is computable in one line from our schema. No WebSocket parsing, no clock-drift alignment, no GPU-intensive normalization.
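The one-line imbalance computation could look like the sketch below. It assumes the volume columns are cumulative, so the level-10 column already holds total resting volume through level 10, and it uses the common normalized form (bid − ask) / (bid + ask). The expanded column names and the sample numbers are illustrative assumptions.

```python
import pandas as pd

def depth_imbalance(df: pd.DataFrame, level: int = 10) -> pd.Series:
    """Normalized bid-ask imbalance in [-1, 1] from cumulative depth through `level`."""
    bid = df[f"bid_volume_level_{level}"]
    ask = df[f"ask_volume_level_{level}"]
    return (bid - ask) / (bid + ask)

# Two synthetic bars: one bid-heavy, one balanced.
bars = pd.DataFrame({
    "bid_volume_level_10": [30.0, 20.0],
    "ask_volume_level_10": [10.0, 20.0],
})
bars["imbalance_l10"] = depth_imbalance(bars)
print(bars["imbalance_l10"].tolist())  # [0.5, 0.0]
```

A bar with zero volume on both sides yields NaN here, so a real pipeline would need to decide how to treat empty books.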
Legally Clean IP
Terms of Service often restrict redistribution of raw Level 2 orderbook data from centralized exchanges. Our datasets are classified as Derived Data: aggregated, transformed, and mathematically computed from raw inputs. The original tick-level snapshots are not recoverable or reverse-engineerable. You get institutional-quality depth intelligence without the legal risk.
Your Statistical Arbitrage Research Pipeline, Accelerated
Load
Import CSV.gz directly into Pandas, DuckDB, or Spark. No parsing, no cleaning needed.
Engineer
Compute bid-ask imbalance, depth slope, liquidity concentration, and 100+ features from 47 raw columns.
Train
Feed into RL environments, LSTM/Transformers, or XGBoost. 2.3M rows = 12 months of replay buffer.
Deploy
Backtest slippage-aware strategies against real depth. Validate before risking capital on live markets.
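To make the slippage-aware step concrete, here is a minimal sketch of one way to estimate fill cost from a single bar's depth profile. It assumes each cumulative volume column holds the resting volume through that level and that the incremental volume at a level fills at that level's distance from mid. Both are simplifying assumptions, and the function name and sample numbers are hypothetical.

```python
def estimate_slippage_bps(order_size, cum_volumes, distances_bps):
    """Volume-weighted average distance from mid (bps) paid to fill `order_size`.

    cum_volumes[i] is the cumulative resting volume through level i, and
    distances_bps[i] is that level's distance from mid-price in basis points.
    Returns None if the order exceeds the visible depth.
    """
    if order_size <= 0:
        raise ValueError("order_size must be positive")
    filled = 0.0
    cost = 0.0      # running sum of (volume taken * distance in bps)
    prev_cum = 0.0
    for cum, dist in zip(cum_volumes, distances_bps):
        available = cum - prev_cum            # incremental volume at this level
        take = min(available, order_size - filled)
        cost += take * dist
        filled += take
        prev_cum = cum
        if filled >= order_size:
            return cost / order_size
    return None

# Synthetic three-level ask side: 5 units at 1 bp, 7 more at 2 bps, 8 more at 4 bps.
ask_cum = [5.0, 12.0, 20.0]
ask_dist = [1.0, 2.0, 4.0]
print(estimate_slippage_bps(8.0, ask_cum, ask_dist))   # 1.375
print(estimate_slippage_bps(25.0, ask_cum, ask_dist))  # None (exceeds visible depth)
```

Because each bar is a 5-minute aggregate, an estimate like this describes typical depth over the interval and not the book at any single instant.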
Built by a Quant, for Quants.
How on-chain orderbook data changed everything.
Hi. I'm the creator of Imbalance Labs. Outside of my day job, I spend my time researching financial markets using Reinforcement Learning algorithms.
A while back, I hit a wall. I was feeding my RL agents standard OHLCV candle data. The math checked out, but my models were blind to what actually matters: real liquidity, spread, and slippage. I quickly realized that in the context of AI-driven trading, traditional candles are nothing more than a liquidity illusion.
RL models are ruthless. If you don't give them market intent context, they make naïve decisions. I knew that to level up, I needed to give my trading bot full visibility into market microstructure — historical Level 2 orderbook data.
That's when the real engineering nightmare began. Getting clean, historical L2 depth data from leading decentralized exchanges is borderline impossible for most researchers. Instead of training models, I spent weeks building data infrastructure: a proprietary pipeline that ingests raw exchange data, cleans it, normalizes the depth profiles, and compresses it into analysis-ready formats.
When I finally plugged the finished dataset into my bot's training environment, the difference in learning quality and risk management was massive.
That's when it hit me: if I, as a hobbyist, needed this data badly enough to spend weeks building infrastructure to process it, then professional analysts, quants, and researchers are certainly fighting the same battle.
That's how Imbalance Labs was born. I did the worst, most tedious infrastructure work so you don't have to. Download the data and jump straight to what matters — training models and testing strategies.
Frequently Asked Questions
What is orderbook depth data and how is it different from OHLCV?
Which instruments and timeframe does the dataset cover?
Can I use this data for machine learning and deep learning?
Can you create custom datasets with different instruments, timeframes, or depth levels?
Is this data legally safe to use?
What format is the data delivered in?
Legal Disclaimer
The datasets distributed by Imbalance Labs constitute an Aggregated Liquidity and Orderbook Depth Index — a proprietary, mathematically derived analytical product. All raw order book data has been independently collected, aggregated across time intervals, normalized to mid-price reference frames, and transformed through statistical computations (cumulative volume aggregation, basis-point distance normalization).
This product is classified as Derived Data under standard market data licensing frameworks. It does not constitute, reproduce, or redistribute any raw, unmodified exchange data stream. The original tick-level order book snapshots are not included, recoverable, or reverse-engineerable from this dataset.
Imbalance Labs is not affiliated with, endorsed by, or officially connected to any cryptocurrency exchange or decentralized protocol. All exchange names and trademarks are the property of their respective owners.
This data is provided for research, backtesting, and analytical purposes only. It does not constitute financial advice, trading signals, or investment recommendations. Users assume full responsibility for any trading or investment decisions made using this data.