Honest backtesting
Why most backtests lie (and how to spot a curve-fit strategy)
A backtest is a story you tell yourself about the past, and it nearly always has a happy ending. The line climbs. The drawdowns look survivable. Then real money meets a real market and the ending gets rewritten — and that space between the test and the trade is where most accounts quietly bleed out.
Building a good-looking backtest is the easy part. Give anyone enough indicators and enough free afternoons and they can make almost any rule look profitable on almost any chart. Fitting the past is trivial. The hard question — the whole game, really — is whether you've found an edge or just memorised the answer key.
So treat what follows as a field guide to the ways a backtest flatters you, and how to catch each one before it gets expensive. None of this is financial advice and none of it promises a winning system. The goal is narrower and a lot more useful: bin the bad ideas early, while it's cheap.
Most backtests lie. They just lie in predictable ways — and once you know the tells, you stop fooling yourself.
1. Too many knobs, one history
You add a parameter and the numbers improve. You add another and they improve again. An hour later you're holding a strategy with a dozen finely-tuned inputs that nails one slice of history and predicts nothing.
That's overfitting, and the tell is fragility. Take a moving-average length of 20 that works beautifully and try 18, then 22. If the whole thing collapses, you didn't find an edge — you found a coincidence that happened to like the number 20. Real edges sit on a plateau. You can wobble the settings and they keep working.
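The plateau check is easy to sketch. What follows is a minimal illustration, not anyone's production test: the toy rule (long the next bar when the close is above its simple moving average) and the names `sma`, `ma_strategy_return` and `plateau_report` are all assumptions made up for this example.

```python
def sma(values, n):
    """Simple moving average; None until n values are available."""
    out = []
    running = 0.0
    for i, v in enumerate(values):
        running += v
        if i >= n:
            running -= values[i - n]  # drop the value that left the window
        out.append(running / n if i >= n - 1 else None)
    return out


def ma_strategy_return(closes, length):
    """Total return of a toy rule: hold the next bar when close > SMA(length)."""
    avg = sma(closes, length)
    equity = 1.0
    for i in range(len(closes) - 1):
        if avg[i] is not None and closes[i] > avg[i]:
            equity *= closes[i + 1] / closes[i]
    return equity - 1.0


def plateau_report(closes, center, spread=2):
    """Return {length: total_return} for the centre setting and its neighbours."""
    return {
        n: ma_strategy_return(closes, n)
        for n in range(center - spread, center + spread + 1)
    }


if __name__ == "__main__":
    closes = [100 * 1.01 ** i for i in range(60)]  # synthetic, steadily rising
    for length, total in plateau_report(closes, center=20).items():
        print(length, round(total, 4))
```

If 18 through 22 all tell roughly the same story, you are probably standing on a plateau. If only 20 shines, that is the coincidence talking.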
2. Testing on the data you tuned on
If you optimise a strategy and then judge it on the same stretch of history, of course it looks good. You built it to look good on exactly that data. That's not a test. It's a mirror.
Split the history instead. Tune on the first chunk, then run the strategy untouched on a part it has never seen and look only at that result. If it falls apart out-of-sample, the "edge" was living in your tuning the whole time. (Getting this right the first time is half of building a clean test.)
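As a rough sketch of that split: the helper names below (`split_history`, `tune_then_test`) and the 70/30 default are assumptions for illustration, and `score` stands in for whatever backtest metric you already use.

```python
def split_history(bars, in_sample_fraction=0.7):
    """Cut the history once, in time order: earlier part for tuning, later for testing."""
    cut = int(len(bars) * in_sample_fraction)
    return bars[:cut], bars[cut:]


def tune_then_test(bars, candidates, score, in_sample_fraction=0.7):
    """Pick the best candidate on the in-sample part, then score it once out-of-sample.

    `score(bars, params)` is a placeholder for your own backtest metric.
    """
    in_sample, out_of_sample = split_history(bars, in_sample_fraction)
    best = max(candidates, key=lambda params: score(in_sample, params))
    return {
        "params": best,
        "in_sample": score(in_sample, best),
        "out_of_sample": score(out_of_sample, best),  # the only number to trust
    }
```

The design choice that matters is that the out-of-sample score is computed once, after tuning is finished. Peek at it during tuning and it quietly becomes in-sample data.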
3. Using tomorrow's data today
Look-ahead bias is when a strategy quietly leans on information it couldn't have had at the time. It's the most flattering bug in trading, because it produces results that are gorgeous and impossible at once.
The usual suspects: acting on the current bar's close before the bar has actually closed, signals that repaint, fills at prices that weren't reachable in that moment. The test is simple: could you have placed this exact order, at this exact time, knowing only what was knowable then? If your backtest keeps filling you at the dead low right before a rip, be suspicious. Real fills aren't that generous, which is exactly why intrabar fills off finer data matter: they respect the order in which prices were actually reached, not just the convenient ones.
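The close-before-it-closed bug is easiest to see side by side. This is a deliberately tiny sketch with made-up prices and a made-up rule ("trade when the bar closes above its open"); the function names are illustrative only.

```python
def up_bar_signals(opens, closes):
    """Signal is only knowable once the bar has closed: did it finish above its open?"""
    return [c > o for o, c in zip(opens, closes)]


def returns_with_lookahead(opens, closes, signals):
    """BUGGY on purpose: enters at bar i's open using a signal from bar i's close."""
    return [(c - o) / o if s else 0.0 for o, c, s in zip(opens, closes, signals)]


def returns_without_lookahead(opens, closes, signals):
    """Honest version: the signal from bar i-1 is acted on at bar i's open."""
    out = [0.0]  # nothing is known before the first bar closes
    for i in range(1, len(closes)):
        traded = signals[i - 1]
        out.append((closes[i] - opens[i]) / opens[i] if traded else 0.0)
    return out


if __name__ == "__main__":
    opens = [100.0, 102.0, 101.0, 103.0]
    closes = [102.0, 101.0, 103.0, 102.0]
    sig = up_bar_signals(opens, closes)
    print(returns_with_lookahead(opens, closes, sig))     # never a losing bar
    print(returns_without_lookahead(opens, closes, sig))  # same rule, one bar later
```

The cheating version can never lose, because it only ever trades bars it already knows went up. Shifting the signal by one bar is the whole fix, and it is usually the difference between gorgeous and real.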
4. One lucky chart, one lucky year
Test on EUR/USD in 2021 and it sings. Run the same rules on GBP/USD, or on 2018, and they fall over. Guess which chart ends up in the screenshot.
An edge that exists only on one symbol in one period is a fact about that symbol in that period — not about markets. Run your rules across pairs you didn't design them on, and across different moods: trending, chopping, panicking. Past performance on any single chart says very little about the next one.
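One way to make that habit mechanical, sketched under assumptions: `strategy` is any function that takes a price series and returns a total return with the rules frozen, and `design_symbols` is the set of markets you tuned on. Both names are hypothetical.

```python
def cross_market_report(strategy, markets, design_symbols):
    """Run the same frozen rules on every symbol and flag the ones never tuned on."""
    report = {}
    for symbol, closes in markets.items():
        report[symbol] = {
            "return": strategy(closes),
            "holdout": symbol not in design_symbols,
        }
    return report


def holdout_survival(report):
    """Fraction of holdout symbols where the rules made money (None if no holdouts)."""
    holdout = [row["return"] for row in report.values() if row["holdout"]]
    if not holdout:
        return None
    return sum(1 for r in holdout if r > 0) / len(holdout)
```

A survival fraction is a blunt instrument, but it answers the right question: did the rules work anywhere you weren't looking when you wrote them?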
5. Pretending trading is free
Frictionless backtests are fiction. Spread, commission and slippage aren't rounding errors; for anything fast or frequent they are often the entire result.
Turn costs on and watch the equity curve deflate. A surprising number of "money printers" go flat the moment you charge a realistic spread and fill orders where they could actually have filled, rather than where you'd have liked.
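The arithmetic is worth doing once by hand. In this sketch every number is invented for illustration, costs are expressed as a fraction of the position per round trip, and the function names are assumptions.

```python
def apply_costs(gross_returns, spread, commission, slippage):
    """Subtract per-trade round-trip costs (as fractions) from each gross return."""
    cost = spread + commission + slippage
    return [r - cost for r in gross_returns]


def final_equity(trade_returns, start=1.0):
    """Compound a sequence of per-trade returns into a final equity multiple."""
    equity = start
    for r in trade_returns:
        equity *= 1.0 + r
    return equity


if __name__ == "__main__":
    gross = [0.0005] * 200  # a fast system averaging +0.05% per trade before costs
    net = apply_costs(gross, spread=0.0003, commission=0.0001, slippage=0.0001)
    print(round(final_equity(gross), 4))  # looks like a steady climb
    print(round(final_equity(net), 4))    # costs have eaten the whole edge
```

With those made-up figures, a 0.05% average trade against 0.05% of total friction leaves nothing. The smaller the average trade, the more the cost assumptions decide the outcome.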
The point of a backtest isn't to make you feel good. It's to give you an honest estimate of how a set of rules behaved under realistic conditions. If the result needs perfect fills, free trading, or one lucky chart to survive, it isn't a result — it's a wish. (And no backtest, honest or not, can promise future returns.)
6. Trying 500 things and keeping the winner
This is the sneaky one, because every step feels reasonable. You test 500 variations, keep the best, report its numbers. The catch: torture enough random variations and some will look spectacular through pure luck. You didn't discover an edge. You bought 500 lottery tickets and framed the one that hit.
The fix is to account for how hard you searched. That's the job of the Deflated Sharpe Ratio, the stricter cousin of the Probabilistic Sharpe Ratio. The Probabilistic version asks how likely it is that the true Sharpe beats a benchmark, given the length of the track record and the shape of its returns; the Deflated version raises that benchmark according to how many attempts it took to find the result. A Sharpe of 2 from your first idea and a Sharpe of 2 fished out of five hundred are not the same number, even though they read identically on the report.
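For the curious, here is a compact sketch of both ratios, following the formulas published by Bailey and López de Prado as best I can restate them. Treat it as an illustration under assumptions: `sr` is a per-period (not annualised) Sharpe, `kurt` is plain kurtosis (3 for a normal distribution), `sr_std` is the standard deviation of Sharpe estimates across your trials, and the trials are treated as independent. It is not a description of how any particular product computes its score.

```python
import math
from statistics import NormalDist

EULER_GAMMA = 0.5772156649015329  # Euler-Mascheroni constant


def probabilistic_sharpe(sr, benchmark_sr, n_obs, skew=0.0, kurt=3.0):
    """Probability that the true Sharpe exceeds benchmark_sr, given n_obs returns."""
    denom = math.sqrt(1.0 - skew * sr + (kurt - 1.0) / 4.0 * sr * sr)
    z = (sr - benchmark_sr) * math.sqrt(n_obs - 1) / denom
    return NormalDist().cdf(z)


def expected_max_sharpe(n_trials, sr_std):
    """Approximate best Sharpe you would expect from n_trials skill-less attempts."""
    if n_trials < 2:
        return 0.0  # a single attempt involves no selection, so no markdown
    nd = NormalDist()
    return sr_std * (
        (1.0 - EULER_GAMMA) * nd.inv_cdf(1.0 - 1.0 / n_trials)
        + EULER_GAMMA * nd.inv_cdf(1.0 - 1.0 / (n_trials * math.e))
    )


def deflated_sharpe(sr, n_obs, n_trials, sr_std, skew=0.0, kurt=3.0):
    """Probabilistic Sharpe measured against the luck-adjusted benchmark."""
    benchmark = expected_max_sharpe(n_trials, sr_std)
    return probabilistic_sharpe(sr, benchmark, n_obs, skew, kurt)


if __name__ == "__main__":
    # Same observed Sharpe, same track record, different amounts of searching.
    print(deflated_sharpe(sr=0.1, n_obs=253, n_trials=1, sr_std=0.05))
    print(deflated_sharpe(sr=0.1, n_obs=253, n_trials=500, sr_std=0.05))
```

The two calls at the bottom use identical returns. The only thing that changes is the number of attempts, and that alone is enough to turn a confident-looking result into one that luck explains comfortably.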
7. "Optimising" without walk-forward
Optimisation gets a bad name because most people do the reckless version — sweep every parameter across all of history, grab the single best combination, ship it. By construction, that combination is the most overfit point in the entire search.
Walk-forward does it the honest way. You optimise on one window, test on the next window, roll forward, and repeat. Every score you keep comes from data the strategy never trained on. It's about as close as a backtest gets to how you'd actually run the thing, with the future arriving one bar at a time.
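The rolling loop can be sketched in a few lines. The window sizes, the `score(bars, params)` placeholder and the function names here are assumptions for illustration; real walk-forward setups vary in how they size and anchor the windows.

```python
def walk_forward_windows(n_bars, train_size, test_size):
    """Yield (train_range, test_range) index pairs, rolling forward by test_size."""
    start = 0
    while start + train_size + test_size <= n_bars:
        train = range(start, start + train_size)
        test = range(start + train_size, start + train_size + test_size)
        yield train, test
        start += test_size


def walk_forward(bars, candidates, score, train_size, test_size):
    """Optimise on each training window, keep only the score from the window after it."""
    results = []
    for train, test in walk_forward_windows(len(bars), train_size, test_size):
        train_bars = bars[train.start:train.stop]
        test_bars = bars[test.start:test.stop]
        best = max(candidates, key=lambda params: score(train_bars, params))
        results.append((best, score(test_bars, best)))  # out-of-sample only
    return results


if __name__ == "__main__":
    for train, test in walk_forward_windows(n_bars=10, train_size=4, test_size=2):
        print(list(train), "->", list(test))
```

Each test window starts exactly where its training window ends, so no kept score ever comes from bars the optimiser has already seen.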
How to pressure-test a strategy honestly
Every lie above has a defence, and none of them need code. What they need is discipline, plus tooling that makes the honest path the default one. That's the idea behind StrategyNodes: you build a strategy by dragging blocks onto a canvas and wiring them, then put it through the gauntlet a sceptic would.
In practice the defences map cleanly to the lies:
- Out-of-sample and walk-forward so nothing gets judged on the data it was tuned on.
- Monte Carlo — shuffle the trade order, vary the starting bar. If your equity curve only works in one exact sequence, it's brittle, and this shows you the range of outcomes the same rules could plausibly have thrown.
- Cross-market holdout — run the edge on symbols it has never seen. Survival there is real evidence; failure means you found a quirk of one market.
- Real costs and fills — spread, commission, slippage, intrabar M1 fills, plus pyramiding, partial exits and pending orders, so the simulation behaves like trading instead of a spreadsheet.
- A score built to flag overfitting rather than flatter it. The StrategyNodes Score rolls these checks into one 0–100 number and a letter grade, combining the Probabilistic and Deflated Sharpe Ratios, the cross-market holdout, honesty checks, and a hidden benchmark.
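None of the defences above requires writing code, but the Monte Carlo one is simple enough to sketch for readers who want to see the mechanics. This is an illustration with invented trade returns; `max_drawdown` and `shuffled_drawdowns` are hypothetical names, and shuffling assumes the trades are roughly independent of one another.

```python
import random


def max_drawdown(trade_returns):
    """Largest peak-to-trough fall of the compounded equity curve, as a fraction."""
    equity, peak, worst = 1.0, 1.0, 0.0
    for r in trade_returns:
        equity *= 1.0 + r
        peak = max(peak, equity)
        worst = max(worst, 1.0 - equity / peak)
    return worst


def shuffled_drawdowns(trade_returns, runs=1000, seed=0):
    """Max drawdown for many random orderings of the same trades, sorted ascending."""
    rng = random.Random(seed)  # seeded so the experiment is repeatable
    trades = list(trade_returns)
    results = []
    for _ in range(runs):
        rng.shuffle(trades)
        results.append(max_drawdown(trades))
    return sorted(results)


if __name__ == "__main__":
    trades = [0.02, -0.01, 0.03, -0.02, 0.01, -0.015] * 5
    dds = shuffled_drawdowns(trades, runs=1000)
    print("as traded:", round(max_drawdown(trades), 4))
    print("median   :", round(dds[len(dds) // 2], 4))
    print("95th pct :", round(dds[int(len(dds) * 0.95)], 4))
```

The final equity is identical in every shuffle, since multiplication doesn't care about order. What changes is the path, and the worse end of that drawdown range is the one to size your account for.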
If you trade market structure, the order-block, fair-value-gap, liquidity-sweep and supply/demand blocks let you turn those discretionary ideas into rules you can actually test. And when something survives the gauntlet, you can export it to TradingView Pine or MetaTrader MQL5 — the export reproduces what you built, and the code is yours to keep.
None of this makes a strategy profitable, and nothing can; markets move and past results carry no guarantee. What honest testing buys you is the chance to throw the bad ideas out before they bill you, instead of after. Build one free, no signup — and the first strategy worth testing might be one you currently trust a little too much.