Common Sources of Error in Trading Backtests

The issues that quietly make a backtest look better than the real thing.

A backtest is a simulation: it estimates how a trading strategy would have done if you'd run it on historical market data. It's a standard step in building a strategy, and it's also a step where things quietly go wrong. There's a long list of well-documented errors that can make a backtest look far better than the strategy would actually perform in live trading. This article walks through several of the common ones and the methods people use to keep them in check. It's educational reference material, not investment advice.

Survivorship bias

Survivorship bias creeps in when a backtest only looks at assets that still exist or remain in an index today. Companies that were delisted, acquired, or dropped from an index during the test period simply aren't in the dataset. And since the companies that disappear often did so because they performed poorly, testing on the survivors alone paints a rosier picture than the full historical record would. The usual fix is a point-in-time dataset, where the assets considered on each historical date match what was actually available on that date.
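As a rough illustration of the point-in-time idea, the sketch below rebuilds the tradable universe for a given date from membership records. The record layout, the `MEMBERSHIPS` data, and the `universe_on` helper are all hypothetical; a real dataset would come from a data vendor and will be shaped differently.

```python
from datetime import date

# Hypothetical membership records: (security_id, date_added, date_removed).
# date_removed is None while the security is still a member.
MEMBERSHIPS = [
    ("SEC_A", date(2005, 1, 3), None),
    ("SEC_B", date(2005, 1, 3), date(2009, 6, 30)),  # dropped during the test period
    ("SEC_C", date(2012, 4, 2), None),               # added later
]

def universe_on(as_of, memberships):
    """Return the IDs that were members on `as_of` (removal date is exclusive)."""
    return sorted(
        sec_id
        for sec_id, added, removed in memberships
        if added <= as_of and (removed is None or as_of < removed)
    )

print(universe_on(date(2008, 1, 2), MEMBERSHIPS))  # ['SEC_A', 'SEC_B']
print(universe_on(date(2015, 1, 2), MEMBERSHIPS))  # ['SEC_A', 'SEC_C']
```

A survivors-only dataset would return SEC_A and SEC_C for both dates, silently dropping SEC_B from 2008 and including SEC_C before it was available.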

Identifier and data continuity issues

Ticker symbols aren't permanent. One can be retired and later handed to a completely different company. If a data pipeline stitches price history together by symbol alone, it can accidentally splice two unrelated companies into a single series — and the results that follow are just wrong. Leaning on a stable identifier, something security-level rather than the display symbol, goes a long way here. Automated checks that flag unusually large single-period price jumps can also surface these cases for a second look.
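The jump check mentioned above might look something like this. The `flag_large_jumps` name and the 50% threshold are illustrative assumptions, not a standard; a sensible threshold depends on the asset class and data frequency.

```python
def flag_large_jumps(prices, threshold=0.5):
    """Return indices i where |prices[i] / prices[i-1] - 1| exceeds threshold."""
    flagged = []
    for i in range(1, len(prices)):
        prev, curr = prices[i - 1], prices[i]
        # Skip non-positive previous prices to avoid dividing by zero.
        if prev > 0 and abs(curr / prev - 1.0) > threshold:
            flagged.append(i)
    return flagged

series = [10.0, 10.2, 10.1, 48.0, 48.5]  # made-up prices
print(flag_large_jumps(series))  # [3]
```

A flag is only a prompt for review. A genuine price move or an unadjusted split will trip the same check as a reused ticker.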

Split and dividend adjustments

Corporate actions like stock splits and dividends reshape historical prices. A split cuts the per-share price without the holder actually losing anything. If your historical prices aren't adjusted for that, a backtest can read the drop as a huge loss (or read the price jump from a reverse split as a huge gain). In the same vein, price-only data that ignores dividends understates the total return of dividend-paying assets. Adjusted price data handles both, but it's worth confirming that the adjustments are current and account for every relevant corporate action.
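To make the split case concrete, here is a minimal sketch of back-adjusting a price series. It assumes splits are given as a mapping from the index of the first post-split price to the split ratio (new shares per old share); the `back_adjust_for_splits` helper is hypothetical, and dividends are deliberately left out.

```python
def back_adjust_for_splits(closes, splits):
    """Divide every close before a split by that split's ratio.

    closes: raw closing prices in date order.
    splits: {index: ratio}, where closes[index] is the first post-split price.
    """
    adjusted = list(closes)
    factor = 1.0
    # Walk backwards so each price is scaled by all splits that came after it.
    for i in range(len(closes) - 1, -1, -1):
        adjusted[i] = closes[i] / factor
        if i in splits:
            factor *= splits[i]
    return adjusted

raw = [100.0, 102.0, 51.5, 52.0]  # made-up prices, 2-for-1 split at index 2
print(back_adjust_for_splits(raw, {2: 2.0}))  # [50.0, 51.0, 51.5, 52.0]
```

On the raw series the step from 102.0 to 51.5 looks like a loss of roughly 50%. On the adjusted series the same step is a gain of about 1%.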

Transaction costs

Live trading carries costs that a backtest can quietly leave out: commissions, the bid-ask spread, slippage between the price you expected and the price you got, and, on larger orders, market impact. A strategy that looks profitable before costs can flip to a loss once they're in, especially if it trades often. The standard practice is to bake a transaction cost assumption into the simulation and then vary that assumption to see how sensitive the results are to it. A strategy that only works at an optimistic cost level deserves suspicion.
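One simple way to model this, sketched below, is to charge a flat cost in basis points on each period's turnover and rerun the numbers across a range of cost levels. The flat-cost model, the `net_return` helper, and the sample figures are all assumptions for illustration; real costs vary with order size and liquidity.

```python
def net_return(gross_returns, turnovers, cost_bps):
    """Compound per-period returns after deducting turnover * cost_bps / 10,000."""
    growth = 1.0
    for r, t in zip(gross_returns, turnovers):
        growth *= 1.0 + r - t * cost_bps / 10_000.0
    return growth - 1.0

gross = [0.002, -0.001, 0.003, 0.001]  # made-up per-period gross returns
turnover = [1.0, 1.0, 1.0, 1.0]        # fraction of the portfolio traded each period

# Sweep the cost assumption to see where the strategy stops being profitable.
for bps in (0, 5, 10, 20):
    print(bps, round(net_return(gross, turnover, bps), 5))
```

With these made-up numbers the toy strategy is positive at zero cost and negative at 20 basis points, which is the kind of sensitivity the sweep is meant to expose.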

Lookahead bias

Lookahead bias happens when a backtest uses information that wouldn't have been available yet when the decision was made. A classic example: generating a signal from a day's closing price and then assuming you could execute at that same close — you couldn't, in practice. Another is using financial data dated to the period it covers rather than the later date it was actually reported. The guard against this is to align every decision with only what was known at that exact moment.
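The closing-price example can be sketched as follows. The toy signal (go long after an up day) and the `lag_signal` helper are assumptions made for illustration; the point is only that a signal computed from today's close should not earn a return until the next period.

```python
def lag_signal(signals, lag=1):
    """Shift signals forward so the position in period t uses the signal from t - lag."""
    return [0] * lag + signals[: len(signals) - lag]

def strategy_returns(positions, returns):
    """Per-period strategy return: position held times the asset's return."""
    return [p * r for p, r in zip(positions, returns)]

closes = [100.0, 101.0, 99.0, 102.0, 103.0]  # made-up closing prices
# returns[t] is the close-to-close return ending at t (0 for the first day).
returns = [0.0] + [closes[i] / closes[i - 1] - 1 for i in range(1, len(closes))]
# Toy signal from each day's close: 1 if the close is above the previous close.
signals = [0] + [1 if closes[i] > closes[i - 1] else 0 for i in range(1, len(closes))]

biased = strategy_returns(signals, returns)              # trades on a close it has not seen yet
honest = strategy_returns(lag_signal(signals), returns)  # acts one period later

print(sum(biased) > sum(honest))  # True
```

The biased version never loses, because it is only ever long on days it already knows went up. That pattern of implausibly clean results is a common symptom of lookahead.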

Calendar and date handling

Markets don't trade every day, and holiday schedules differ by exchange and shift over time. If a backtest assumes a fixed calendar or mishandles holidays, dates can fall out of alignment, and the results start behaving in ways that are hard to explain. Checking the trading calendar against a known exchange schedule — and confirming that consecutive data points really do step forward in time — helps catch these issues early.
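Both checks can be sketched in a few lines. The `check_dates` helper and the expected session list here are hypothetical; in practice the expected sessions would come from the exchange's published calendar.

```python
from datetime import date

def check_dates(dates, expected_sessions):
    """Return (out_of_order, missing, unexpected) for a list of data dates."""
    # Any date that does not move strictly forward from the previous row.
    out_of_order = [dates[i] for i in range(1, len(dates)) if dates[i] <= dates[i - 1]]
    seen, expected = set(dates), set(expected_sessions)
    missing = sorted(expected - seen)      # sessions with no data
    unexpected = sorted(seen - expected)   # data on a non-session day
    return out_of_order, missing, unexpected

# Hypothetical schedule with a market holiday on 4 July.
expected = [date(2023, 7, 3), date(2023, 7, 5), date(2023, 7, 6)]
data = [date(2023, 7, 3), date(2023, 7, 4), date(2023, 7, 6), date(2023, 7, 6)]

out_of_order, missing, unexpected = check_dates(data, expected)
print(out_of_order)  # the repeated 6 July row
print(missing)       # 5 July has no data
print(unexpected)    # 4 July has data on a non-session day
```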

Overfitting

Overfitting is what happens when a strategy is tuned so tightly to historical data that it ends up fitting random noise instead of a pattern that lasts. Give a strategy enough adjustable knobs and you can almost always make it shine on the past while it stumbles on anything new. The common defenses are to keep the number of parameters small, set aside a chunk of data that you never touch during development, and check whether the results hold up when you nudge the parameters slightly.
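Two of those defenses, the untouched holdout and the parameter nudge, can be sketched as below. The helper names are hypothetical and `toy_score` is a made-up stand-in for whatever performance metric a real backtest would compute on the development data.

```python
def split_holdout(series, n_holdout):
    """Chronological split: the last n_holdout points are reserved for a final check."""
    cut = len(series) - n_holdout
    return series[:cut], series[cut:]

def sensitivity(score_fn, best_param, step):
    """Score the chosen parameter and one step on either side of it."""
    return {p: score_fn(p) for p in (best_param - step, best_param, best_param + step)}

def toy_score(lookback):
    # Made-up scores shaped like an isolated peak, for illustration only.
    return 1.5 if lookback == 20 else 0.2

development, holdout = split_holdout(list(range(10)), 3)
print(development, holdout)  # [0, 1, 2, 3, 4, 5, 6] [7, 8, 9]

print(sensitivity(toy_score, 20, 1))  # {19: 0.2, 20: 1.5, 21: 0.2}
```

A score that collapses one step away from the chosen value, as in this toy case, suggests the parameter was fitted to noise. The split is chronological rather than random so that the holdout always lies after the development data.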

Summary

Backtesting is genuinely useful, but the results are only as good as the data and assumptions behind them. Survivorship bias, identifier and adjustment problems, missing transaction costs, lookahead bias, calendar errors, and overfitting can each push a backtest to overstate performance. Working through this list before you trust a result tends to leave you with numbers that look a lot more like what live trading would actually deliver.
