The 6 Biggest Portfolio Backtesting Mistakes


Whether you are backtesting a single trading strategy or an entire portfolio, you need to avoid the following mistakes. These six backtesting mistakes are explained here without any fancy mathematical equations, but if you would like to see the math behind them, I invite you to consult the bibliography at the end of the article.


“Backtesting is not a research tool, feature engineering is!” That is one of the best quotes from Marcos López de Prado. So powerful and, of course, so true!

This first mistake is the most obvious one, but also the worst. Please, never use the same dataset for creating trading strategies and adjusting portfolio weights as for backtesting the portfolio.

If you use the same data to train and test your strategies, there is a 99% chance that you will lose money in live trading. Indeed, the market is always evolving, and the best way to avoid suffering too much from that is to always run the portfolio backtest on unseen data; best of all, unseen data covering many different market conditions.

Figure: Timeline showing the split between the in-sample and out-of-sample periods of a trading backtest.

It may be obvious, but I prefer to insist on this point: never train a model only on the latest data and backtest it on the earlier data, as that introduces look-ahead bias.
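As a minimal sketch of this idea (assuming a time-indexed pandas DataFrame of prices; names and data are illustrative), a chronological in-sample/out-of-sample split can look like this:

```python
import numpy as np
import pandas as pd

def chronological_split(data: pd.DataFrame, train_ratio: float = 0.8):
    """Split a time-indexed dataset chronologically: optimize only on the
    in-sample part, backtest only on the unseen out-of-sample part."""
    split_idx = int(len(data) * train_ratio)
    in_sample = data.iloc[:split_idx]
    out_sample = data.iloc[split_idx:]
    return in_sample, out_sample

# Toy example: 1000 days of synthetic prices
dates = pd.date_range("2015-01-01", periods=1000, freq="D")
prices = pd.DataFrame({"close": 100 + np.random.randn(1000).cumsum()}, index=dates)
train, test = chronological_split(prices, 0.8)
```

The key point is that the split is chronological: every training date precedes every test date, so the test set stays genuinely unseen.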



In the previous section, we talked about creating two datasets: one to train your models and your portfolio, and a second one to test them. That is how beginners do portfolio backtesting. Not us! But why is it a bad habit?

If you use only one point to split your train and test datasets, you will have a huge problem: variability. All the performance of your portfolio backtest depends on a single, subjective point. Indeed, you generally choose this point by taking 80% of the data for training and 20% for testing (or 70%/30%), without any other consideration.

Since you have only one period to optimize the parameters of your trading strategies and only one to test them, you can easily obtain a different result by choosing another split point.

Figure: Portfolio backtests of the same strategy on the same data with three different split points between in-sample and out-of-sample; the results differ widely depending on this small choice.

I will let you read the article by Van Belle and Kerr [2012] for more information, but a different split point is likely to lead to opposite conclusions. Moreover, since you want to backtest your strategy on the latest data, you also remove it from the training set, which is suboptimal because recent data can bring a lot of information to your models.

To fix this type of problem, we can use robustness-testing methods such as Walk-Forward Optimization or Combinatorial Purged Cross-Validation, because they do not rely on a single optimization set (both are explained in the next articles).


This one is easy to understand but one of the most damaging, even if you have avoided all the other backtesting mistakes. Let's imagine that over 10 years of portfolio backtesting you have 500 trades, which gives around one trade per week, so quite realistic.

If you do not take the cost of each trade into account, you will have an over-optimistic backtest. Indeed, if we assume the total cost of one trade is 0.01%, the overall cost is not simply 500 × 0.01% = 5% of your initial capital: the costs compound trade after trade, and on a growing portfolio the cash paid in costs grows with the equity.
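To see the compounding effect, here is a quick computation with the numbers from the example above (0.01% cost per trade, 500 trades):

```python
cost_per_trade = 0.0001  # 0.01% of equity paid on each trade
n_trades = 500

# Costs compound multiplicatively instead of simply summing
equity_multiplier = (1 - cost_per_trade) ** n_trades
total_drag = 1 - equity_multiplier
print(f"final equity multiplier: {equity_multiplier:.4f}")
print(f"total cost drag: {total_drag:.2%}")
```

Every individual trade scales the equity by (1 − cost), so the drag applies to whatever the portfolio is worth at that moment, not just to the initial capital.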

Figure: Portfolio value of the same strategy backtested with and without trading costs included.

Slippage, on the other hand, can be a silent killer (especially for strategies on small timeframes). Each time you open or close a trade, there is latency (only milliseconds, but still latency). The price variation during this latency is called slippage; it can be negative or positive.

As it is nearly impossible to obtain a database with reliable slippage figures for your portfolio backtesting, I personally estimate it. There are many advanced techniques to do so, but a good one to begin with is the following:

  1. Each time you want to open or close a position in your portfolio backtest, extract all the ticks in the next 2 seconds.
  2. Enter or exit the position at the most disadvantageous price in that window.
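The two steps above can be sketched as follows (assuming tick prices are available as a pandas Series indexed by timestamp; all names and numbers are illustrative):

```python
import pandas as pd

def worst_case_fill(ticks: pd.Series, signal_time: pd.Timestamp,
                    side: str, window_s: int = 2) -> float:
    """Conservative fill price: look at every tick in the `window_s` seconds
    after the signal and take the most disadvantageous one
    (highest price when buying, lowest when selling)."""
    window = ticks.loc[signal_time : signal_time + pd.Timedelta(seconds=window_s)]
    return window.max() if side == "buy" else window.min()

# Toy tick series around a buy signal at 08:15:17
ticks = pd.Series(
    [100.00, 100.02, 99.98, 100.05, 100.01],
    index=pd.to_datetime([
        "2024-01-02 08:15:17.0", "2024-01-02 08:15:17.5",
        "2024-01-02 08:15:18.0", "2024-01-02 08:15:18.5",
        "2024-01-02 08:15:19.5",
    ]),
)
fill = worst_case_fill(ticks, pd.Timestamp("2024-01-02 08:15:17"), "buy")
```

Using the worst price in the window deliberately biases the backtest against you, which is exactly what you want when you cannot measure real slippage.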


This mistake is not a true one; it is more a consequence of a mistake. If you are not using any robustness testing and you have only one split point (see mistake 1), your portfolio backtest is likely to show amazing results out of pure randomness.

So, here the question is rather “Can I trust my backtest?”. Fortunately, many problems are fixed with the same answer: USE ROBUSTNESS TESTING.

Indeed, if you take 100 subsamples and test your portfolio on them, resample your data, apply sensitivity analysis, walk-forward optimization, Monte Carlo simulations, and so on, and you still have a winning portfolio backtest, then we can say with a high level of confidence that the strategy performed well in the past.
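One of those checks, bootstrapping the trade returns, can be sketched like this (the trade history and sample count are purely illustrative):

```python
import random

def bootstrap_final_returns(trade_returns, n_samples=100, seed=42):
    """Resample the observed trade returns with replacement many times and
    recompute the compounded result, to see how much of the backtest
    performance could be down to the luck of the draw."""
    rng = random.Random(seed)
    results = []
    for _ in range(n_samples):
        sample = rng.choices(trade_returns, k=len(trade_returns))
        equity = 1.0
        for r in sample:
            equity *= 1 + r
        results.append(equity - 1)
    return results

# Toy trade history: mostly small wins, some losses
trades = [0.01, -0.005, 0.02, -0.01, 0.015, -0.02, 0.01, 0.005]
distribution = bootstrap_final_returns(trades, n_samples=100)
worst, best = min(distribution), max(distribution)
```

If a large fraction of the resampled outcomes are losing, the original backtest result says more about one lucky ordering of trades than about the strategy itself.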

Moreover, another obvious solution is to use a decent Minimum Backtest Length (MinBTL). Indeed, we can all find a strategy that is profitable over one week, but can you do so over 5 or 10 years?

In fact, even if you apply all the robustness tests to a 1-year backtest, the backtest is still unreliable because it covers too few market conditions. Keep in mind that the goal of backtesting is to see how your strategy would have worked in the past in as many situations as possible.

Personally, I try to have 10 years of data for my backtests, and when that is not possible, I do not backtest a strategy with less than 5 years of data.


I share a lot about documenting your process on LinkedIn, but here it is about more than simply saving time: for this part, documenting what you are doing is MANDATORY. And for a simple reason: in your opinion, if I test 10,000 trading strategies on the same dataset, how many will have a profitable portfolio backtest? And, more importantly, how many will be profitable in live trading?

The answer is that, in the long run, probably a few of them will be, but on average the strategies with a profitable portfolio backtest will lose money in live trading. Indeed, as I mentioned before, all the tests we run are comparable to statistical tests.

The problem here is that testing several strategies on the same data is not a single test but a succession of different tests on the same dataset. That may sound like nothing, just a more precise mathematical term. Of course, it is much more than that: multiple testing is subject to different laws than single tests.

I promised I would not go into the math behind this, so I will put the associated article, which deals with this problem using p-values, in the bibliography.

For beginners, or those who do not want to use deep mathematical concepts, you can create a discount rate that depends on the number of strategies tested on the same dataset, using simple rules.

Many rules exist to deal with this problem, taking into account the number of trades, the length of the backtest, and so on. But here we try to keep it simple, so I will give you a basic rule. We create a discount rate that we apply to the metrics we want high (return, Sharpe ratio, …) and, as an inflation rate, to the metrics we want low (drawdown, volatility, …). To compute it, we just need the number of trading strategies tested, N.

[math] rate = 1 - 0.95^{N} [/math]

So, if you have tested 20 strategies on the same dataset, you have a discount (or inflation) rate of about 64%. Let's assume the only metric I look at to decide whether to put a strategy in live trading is the Sharpe ratio (SR), and the minimum SR I accept is 0.50. I have a raw SR of 2, so the discounted SR I consider is 2 × (1 − 0.64) = 0.72. In this case, I keep the strategy.
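The rule and the Sharpe-ratio example above come down to a few lines of Python:

```python
def multiple_testing_rate(n_strategies: int) -> float:
    """Discount/inflation rate from the simple rule: rate = 1 - 0.95**N."""
    return 1 - 0.95 ** n_strategies

def adjust_metric(value: float, rate: float, want_high: bool) -> float:
    """Discount the metrics we want high, inflate the metrics we want low."""
    return value * (1 - rate) if want_high else value * (1 + rate)

rate = multiple_testing_rate(20)         # ~0.64 after 20 strategies tested
sharpe = adjust_metric(2.0, 0.64, True)  # 2 * (1 - 0.64) = 0.72
keep = sharpe >= 0.50                    # still above the minimum SR, keep it
```

The more strategies you have tried on the same data, the harsher the haircut, which is exactly the penalty multiple testing deserves.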

Figure: Discounted values (rate = 64%)

Metric                  Raw value    Adjusted value
Sharpe ratio            2            2 × (1 − 0.64) = 0.72
A metric we want high   100%         100% × (1 − 0.64) = 36%
A metric we want low    10%          10% × (1 + 0.64) = 16%

Even if robustness testing allows us, to some extent, to test more strategies on the same dataset, I never test more than 50 strategies on the same data.


If you do not like statistical significance, you are safe here: this is a much more understandable problem. To explain it, let me use an example.

You are living in Europe, so your local currency is the euro.

08:15:17 – you open a buy position on Apple stock.

16:37:11 – you close it after a +11% price variation between the opening and the closing of the position.

What is the return of this position (assuming you used 100% of your capital)?

11%? 15%? Another amount?

With only this information, it is IMPOSSIBLE to compute the return (if your local currency is the euro).

It might even be a losing trade…


Because of currency risk… When you open your position, your broker (generally) or you yourself must convert your euros into dollars, and when you close the position, make the opposite conversion.

So, if the dollar loses 5% against the euro between 08:15:17 and 16:37:11, the return on this position is not 11% but

(1 + VarUSD) × (1 + VarApple) = (1 − 0.05) × (1 + 0.11) ≈ 1.055

So, on this position, you have earned about 5.5% and not 11%.
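The computation above, generalized (a minimal sketch; `fx_return` is the variation of the foreign currency's value expressed in your home currency, a convention chosen here for illustration):

```python
def home_currency_return(asset_return: float, fx_return: float) -> float:
    """Total return in home currency for an unhedged foreign position:
    compound the asset return with the currency move."""
    return (1 + fx_return) * (1 + asset_return) - 1

# Apple up 11%, dollar down 5% versus the euro
r = home_currency_return(0.11, -0.05)  # ~5.5% in euros, not 11%
```

Note that a large enough adverse currency move can turn a winning trade in dollars into a losing one in euros.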

To conclude, we have essentially covered the main portfolio backtesting mistakes here. In the next articles, we will take a deeper look at robustness testing, which is ultimately the heart of good backtesting!

The goal of these first articles is to make you understand that backtesting is much more complex than it looks; in a second step, we will give you the tools to handle it in the best way possible.


👇🏼 Join the newsletter to be informed when the next article in the series is published.

Join Our Newsletter

Be the first to receive our latest quant trading content (personal notes, discount, new articles).

Lucas Inglese

Lucas is a self-taught Quantitative Analyst, holding degrees in Mathematics and Economics from the University of Strasbourg. Embarking on an independent learning journey, he delved deeply into data science and quantitative finance, eventually mastering the disciplines. Lucas has developed numerous bots, sharing his insights and expertise on LinkedIn, where he regularly posts content. His understanding and empathy for beginners in this complex field led him to author several books and create the comprehensive “Alpha Quant Program.” This program offers e-learning videos, monthly projects, and continuous 7-day-a-week support. Through his online courses and publications, Lucas has successfully guided over 67,000 individuals in their pursuit of knowledge in quantitative finance.
