How do I model slippage if I don't know my fill prices in advance?

Use the half-spread as a baseline, scaled by your trade size relative to typical volume. For instruments where you'd be a meaningful fraction of typical volume, double the half-spread as a margin of safety.
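One way to turn this rule into a number is sketched below. The function name, the 1% participation threshold, and the linear ramp from 1x to 2x the half-spread are all assumptions made for illustration; the text only says to scale with size and to double the half-spread when the order is a meaningful fraction of typical volume.

```python
def estimate_slippage_bps(half_spread_bps, order_qty, typical_volume,
                          size_threshold=0.01):
    """Estimate one-way slippage in basis points.

    Assumption (not from the article): the multiplier ramps linearly from
    1x to 2x the half-spread as participation rises from 0 to
    size_threshold, then stays at 2x.
    """
    if typical_volume <= 0:
        raise ValueError("typical_volume must be positive")
    participation = order_qty / typical_volume
    multiplier = 1.0 + min(participation / size_threshold, 1.0)
    return half_spread_bps * multiplier


# Hypothetical instrument: 5 bps half-spread, 1,000,000 shares typical volume.
small_order = estimate_slippage_bps(5.0, 1_000, 1_000_000)   # close to the baseline
large_order = estimate_slippage_bps(5.0, 20_000, 1_000_000)  # doubled half-spread
```

The threshold is the parameter to calibrate first: what counts as a "meaningful fraction" of volume differs widely between instruments.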

Where do I get point-in-time data?

Norgate Data, Quandl (now Nasdaq Data Link), and Refinitiv (now part of LSEG) all offer point-in-time datasets. Free sources usually cover only instruments that still trade today. The cost difference (free vs. $50–$200/month) is meaningful but worth it for any serious strategy development.

How long should the out-of-sample test period be?

At least 30% of total available history, drawn from the most recent period. Recent data is the most demanding test because market microstructure changes over time, so the latest period is closest to the conditions the strategy will actually trade in.

Is paper trading required before live deployment?

Strongly recommended. The 30-day paper trading period is when you discover slippage modeling errors, latency issues, and adversarial-market drag. The paper-trading P&L tells you what to expect from live deployment within reasonable bounds.

How do I know if my strategy is overfit?

Compare in-sample vs. out-of-sample Sharpe ratio. If the out-of-sample Sharpe is less than 60% of the in-sample Sharpe, the strategy is probably overfit. The fewer parameters your strategy has, the less likely it is to be overfit; the more parameters, the more rigorous the out-of-sample testing has to be.
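The comparison can be sketched in a few lines. This is a minimal illustration, assuming daily returns, a zero risk-free rate, and 252 trading periods per year; the function names and the sample return series are hypothetical.

```python
import math
import statistics


def sharpe_ratio(returns, periods_per_year=252):
    """Annualized Sharpe ratio, assuming a zero risk-free rate."""
    stdev = statistics.stdev(returns)
    if stdev == 0:
        raise ValueError("returns have zero variance")
    return statistics.fmean(returns) / stdev * math.sqrt(periods_per_year)


def looks_overfit(in_sample_returns, out_of_sample_returns, min_ratio=0.60):
    """Flag the strategy when out-of-sample Sharpe < min_ratio * in-sample Sharpe."""
    is_sharpe = sharpe_ratio(in_sample_returns)
    if is_sharpe <= 0:
        raise ValueError("the ratio test only makes sense for a positive in-sample Sharpe")
    return sharpe_ratio(out_of_sample_returns) < min_ratio * is_sharpe


# Hypothetical return series, far too short for real use.
in_sample = [0.02, 0.01, 0.03, 0.00]
out_of_sample = [0.01, -0.01, 0.02, -0.02]
flagged = looks_overfit(in_sample, out_of_sample)
```

With only a handful of observations the Sharpe estimate itself is noisy, so the 60% rule is better read as a screen than as a verdict.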

Quorum is an AI multi-agent trading research system. 200 specialized AI agents evaluate each decision and act only on a 60% consensus threshold, with Monte Carlo stress-testing before any capital is risked.

Paper trading is always free. Quorum lets you test strategies with zero financial risk before considering live use.

Does Quorum guarantee returns?

No. Quorum is a research and decision-support system. Past performance does not guarantee future results, and all trading involves risk of loss. Nothing Quorum produces is financial advice.

How does Quorum make decisions?

Quorum uses consensus rather than single-model prediction: 200 specialized agents must reach a 60% agreement threshold, and candidate positions are Monte Carlo stress-tested before execution.

Backtest Realism — Why Most Backtests Don't Survive Contact with Markets

Failure 1 — Slippage modeled as zero

The most common backtest sin: assuming you fill at the displayed mid-price. Real markets give you the ask if you're buying and the bid if you're selling, sometimes worse on size. Backtests that ignore the bid-ask spread routinely overstate returns by 100–300 basis points per year.

The fix: model slippage as at least the half-spread of the instrument at the trade size you're modeling. For thinly traded instruments at non-trivial size, slippage can be 1–3% per round trip, enough to flip a profitable strategy into a losing one.
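A minimal sketch of a spread-aware fill model follows, under the usual convention that a market buy pays the ask and a market sell receives the bid. The function names and the quotes are hypothetical, and `extra_slippage` is an assumed knob for size-related impact beyond the quoted spread.

```python
def simulated_fill(side, bid, ask, extra_slippage=0.0):
    """Return a pessimistic fill price: buys pay the ask, sells receive the bid."""
    if side == "buy":
        return ask + extra_slippage
    if side == "sell":
        return bid - extra_slippage
    raise ValueError("side must be 'buy' or 'sell'")


def round_trip_cost_bps(bid, ask, extra_slippage=0.0):
    """Cost of buying and immediately selling, in basis points of the mid-price."""
    mid = (bid + ask) / 2.0
    buy_price = simulated_fill("buy", bid, ask, extra_slippage)
    sell_price = simulated_fill("sell", bid, ask, extra_slippage)
    return (buy_price - sell_price) / mid * 1e4


# Hypothetical quote: 99.95 bid, 100.05 ask, so the round trip costs about 10 bps.
cost = round_trip_cost_bps(99.95, 100.05)
```

A mid-price backtest reports this cost as zero on every trade, which is where the overstated returns come from.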

Failure 2 — Survivorship bias

Backtests over historical data often run only on instruments that exist today. Companies that delisted, ETFs that were closed, currencies that were repegged — they're not in the backtest because they're not in the dataset. The backtest sees only winners.

This systematically inflates returns. The fix is sourcing point-in-time datasets that include delisted instruments — more expensive, but the only honest path.
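To show what a point-in-time universe looks like in practice, here is a small sketch. It assumes the dataset provides a listing date and an optional delisting date per symbol; the symbols, dates, and function name are made up for illustration.

```python
from datetime import date


def universe_as_of(listings, as_of):
    """Return symbols that were tradable on `as_of`, including later delistings.

    `listings` is an iterable of (symbol, listed_on, delisted_on) tuples,
    where delisted_on is None for instruments that still trade.
    """
    return sorted(
        symbol
        for symbol, listed_on, delisted_on in listings
        if listed_on <= as_of and (delisted_on is None or as_of < delisted_on)
    )


# Hypothetical listings; BBB was delisted in 2009.
listings = [
    ("AAA", date(2000, 1, 3), None),
    ("BBB", date(2001, 6, 1), date(2009, 3, 2)),
    ("CCC", date(2012, 5, 18), None),
]

universe_2005 = universe_as_of(listings, date(2005, 1, 1))  # ['AAA', 'BBB']
universe_2015 = universe_as_of(listings, date(2015, 1, 1))  # ['AAA', 'CCC']
```

A survivor-only dataset would drop BBB from the 2005 universe, so the backtest would never hold the instrument that later failed.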

Failure 3 — Lookahead bias

The backtest accidentally uses information that wouldn't have been available at the trade time. The classic version: using same-day high/low in a strategy that triggers at the open. Subtle versions include using earnings announcements before they were released, or fundamental data that was revised after the original publication.

The fix: rigorous separation of as-of timestamps. Every data point must be tagged with the moment it became publicly available, and the strategy can only see data with as-of <= now during backtest.
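The as-of rule can be sketched as a filter over timestamped observations. The `Observation` type, its field names, and the earnings figures below are hypothetical; the only part taken from the text is the condition `as_of <= now`.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class Observation:
    name: str             # e.g. "eps"
    value: float
    event_time: datetime  # the period the value describes
    as_of: datetime       # when the value became publicly available


def latest_visible(observations, name, now):
    """Return the most recently published value the strategy could have seen at `now`."""
    candidates = [o for o in observations if o.name == name and o.as_of <= now]
    return max(candidates, key=lambda o: o.as_of, default=None)


# Hypothetical earnings figure, first published in February and revised in May.
history = [
    Observation("eps", 1.10, datetime(2019, 12, 31), datetime(2020, 2, 1)),
    Observation("eps", 1.25, datetime(2019, 12, 31), datetime(2020, 5, 1)),
]

seen_in_march = latest_visible(history, "eps", datetime(2020, 3, 1))
seen_in_june = latest_visible(history, "eps", datetime(2020, 6, 1))
```

In March the strategy sees 1.10, not the revised 1.25. Keying the lookup on `event_time` instead of `as_of` would leak the revision into the past.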

Failure 4 — Adversarial market makers

In a backtest, the strategy fills at modeled prices. In live trading, market makers see the order coming and adjust their quotes. The strategy's edge gets partially front-run before the fill happens.

This is invisible in standard backtests. The only way to test for it is to run the strategy in paper trading against real-time market data with realistic latency — and see whether fills come in at expected prices or worse. Quorum's paper-trading mode does this; many platforms simulate fills at backtest prices, which understates the adversarial drag.
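One way to quantify "expected prices or worse" is to log each paper fill next to the price the backtest model expected and measure the gap. The sketch below assumes such a log exists as (side, expected price, actual price) tuples; the function names and fills are hypothetical and do not describe any platform's actual output.

```python
def fill_shortfall_bps(side, expected_price, actual_price):
    """Signed shortfall in basis points; positive means the fill was worse than expected."""
    if side not in ("buy", "sell"):
        raise ValueError("side must be 'buy' or 'sell'")
    sign = 1.0 if side == "buy" else -1.0
    return sign * (actual_price - expected_price) / expected_price * 1e4


def average_shortfall_bps(fills):
    """Mean shortfall over an iterable of (side, expected_price, actual_price)."""
    shortfalls = [fill_shortfall_bps(*fill) for fill in fills]
    if not shortfalls:
        raise ValueError("no fills to evaluate")
    return sum(shortfalls) / len(shortfalls)


# Hypothetical paper-trading log: both fills came in about 2 bps worse than modeled.
paper_fills = [
    ("buy", 100.00, 100.02),
    ("sell", 50.00, 49.99),
]
drag = average_shortfall_bps(paper_fills)
```

A persistently positive average is the signal to feed back into the backtest's slippage model.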

Failure 5 — Parameter overfit

The strategy was tuned to maximize backtested returns by adjusting parameters (lookback windows, threshold values, holding periods). The tuning fits the strategy to the historical noise rather than to a real edge. Live performance reverts to the underlying signal strength, which is much lower than the tuned backtest suggests.

Detection: out-of-sample testing. Hold out 30%+ of historical data, tune on 70%, evaluate on the held-out 30%. If out-of-sample performance is materially worse than in-sample, the strategy is overfit and won't survive live deployment.
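A chronological 70/30 split might look like the sketch below. It assumes the rows are already sorted oldest to newest and takes the hold-out from the most recent end; the function name and the toy data are illustrative.

```python
def chronological_split(rows, holdout_fraction=0.30):
    """Split time-ordered rows into (tuning, held-out), holding out the newest rows."""
    if not 0.0 < holdout_fraction < 1.0:
        raise ValueError("holdout_fraction must be between 0 and 1")
    n_holdout = max(1, round(len(rows) * holdout_fraction))
    cut = len(rows) - n_holdout
    return rows[:cut], rows[cut:]


# Hypothetical series of 10 daily observations, oldest first.
days = list(range(10))
tune_rows, test_rows = chronological_split(days)
```

The split is by position rather than by random sampling because shuffling time-series rows lets future information leak into the tuning set.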

The realistic backtest checklist

Slippage modeled as at least half-spread, scaled to size
Trading costs included (commissions, exchange fees, regulatory fees)
Point-in-time data including delisted instruments
Strict as-of separation — no lookahead
Out-of-sample validation with held-out test set
Walk-forward testing: re-train periodically on a rolling window rather than training once on all history
Adversarial paper-trading period of at least 30 days against real-time data before any live capital

Strategies that survive all 7 are rare and worth deploying. Strategies that survive 3 of 7 are the norm in retail algo platforms — and explain why most retail algo trading underperforms expectations.
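The walk-forward item in the checklist can be sketched as a generator of rolling train/test windows. The window sizes are arbitrary assumptions, and this rolling-window variant is only one of several walk-forward schemes (an expanding window is another).

```python
def walk_forward_windows(n_obs, train_size, test_size):
    """Yield (train_range, test_range) index pairs that roll forward through time.

    Each test window starts right after its training window, and the whole
    pair then advances by test_size, so test windows never overlap.
    """
    if train_size <= 0 or test_size <= 0:
        raise ValueError("window sizes must be positive")
    start = 0
    while start + train_size + test_size <= n_obs:
        train = range(start, start + train_size)
        test = range(start + train_size, start + train_size + test_size)
        yield train, test
        start += test_size


# Hypothetical setup: 10 observations, train on 4, test on the next 2.
windows = list(walk_forward_windows(10, train_size=4, test_size=2))
```

Here the strategy would be re-tuned three times, each time evaluated only on data that comes after its tuning window.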

Jon Lynch — Founder & CEO, Jon Lynch Financial Group

SDVOSB-certified financial technology operator. Builds production tools for licensed professionals across insurance, lending, trading, and sales.