Trading Backtest Explained – 3 Real-Life Examples

illustration of the backtest output (chart, histogram,...) with a big "example" in the center

Understanding how to implement the different backtesting methods is good. But knowing how to use them to extract real insights is much better… That’s why I will analyze 3 trading backtests with you. The goal is simple: by the end, you will understand which strategy you should put into live trading and why!


For the first example, we will go through each step in detail to be sure you understand the process 100%. Remember that this process needs to be done manually, so that you analyze each point yourself. If you automate it, you may end up putting the wrong strategies into live trading.


The first output we will study is the walk-forward optimization results. Here, we will analyze only the performance and the drawdown. The walk-forward optimization also outputs the best parameters to use if you put the strategy into live trading, but we do not need them here.

Here, you need to look at several things that beginners often miss:

  • Yield plateau: it happens when the returns stay stuck around a certain level. For example, the total return hovers around 50% for 2 years. So, you need to imagine that in live trading, this strategy could stay at the same profitability level for 2 years.
  • Continuously increasing drawdown: even if the drawdown stays within the threshold you want, a max drawdown that keeps increasing over time means that the strategy does not fit the current market conditions so well. Of course, you need to be objective: if we go from a max drawdown of 1% to 2% at the end of the sample, it is nothing. But if we go from 1%, then to 3%, then to 10%, you should worry a bit!
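Both checks can also be quantified to support the visual analysis. The snippet below is only a minimal sketch: the function names and the toy equity curve are hypothetical, and it assumes the equity curve is a simple array of portfolio values.

```python
import numpy as np

def drawdown_series(equity):
    """Drawdown at each step, as a negative fraction of the running peak."""
    equity = np.asarray(equity, dtype=float)
    running_max = np.maximum.accumulate(equity)
    return equity / running_max - 1.0

def longest_plateau(equity):
    """Longest run of consecutive steps without a new equity high."""
    equity = np.asarray(equity, dtype=float)
    longest = current = 0
    peak = equity[0]
    for value in equity[1:]:
        if value > peak:
            peak = value      # new high: the plateau is over
            current = 0
        else:
            current += 1      # still at or below the previous high
            longest = max(longest, current)
    return longest

def max_drawdown_by_chunk(equity, n_chunks=4):
    """Worst drawdown observed inside each consecutive chunk of the sample."""
    dd = drawdown_series(equity)
    return [float(chunk.min()) for chunk in np.array_split(dd, n_chunks)]

# Hypothetical toy equity curve (not taken from the backtest in this article)
equity = np.array([100, 105, 103, 108, 107, 104, 106, 110, 99, 101, 112, 111])
plateau = longest_plateau(equity)
chunks = max_drawdown_by_chunk(equity, n_chunks=3)
print(plateau, chunks)
```

If the values returned by `max_drawdown_by_chunk` get worse from one chunk to the next, you are in the "continuously increasing drawdown" case described above.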

Figure: Example of an output from a walk forward backtesting in trading

Backtest of a trading strategy using Python. It shows a clear upward trend in the results
Financial drawdown: a line chart with only values below 0

Here, on the graph, we can see that this strategy has a stable upward trend, which is quite encouraging. We always recover from our drawdown periods. The longest time to recover our capital is around 1 year, which is a bit long, and that’s why combining trading strategies in a portfolio is beneficial. I will give a 9/10 to this result.


The probability of overfitting (PBO) tells us how much we can trust the parameter optimization. The lower the PBO, the higher the odds of obtaining one of the best possible results in live trading with the parameters from the walk-forward optimization.

Figure: Logits distribution

histogram with a threshold highlight: the PBO

Here, we have a PBO around 2%: in roughly 98% of the tested splits, the parameters that were best in-sample still rank in the top half out of sample. That is excellent. When the PBO is lower than 5%, I generally give a 10/10.
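To make the link between the logits histogram and the PBO concrete, here is a minimal sketch of the usual combinatorially symmetric cross-validation procedure. It is not necessarily the exact implementation behind the figure: the function names, the number of blocks, and the synthetic returns are assumptions, and no purging is applied.

```python
from itertools import combinations
import numpy as np

def sharpe(returns):
    """Per-period Sharpe ratio of each column (no annualization)."""
    return returns.mean(axis=0) / returns.std(axis=0, ddof=1)

def pbo_logits(returns, n_blocks=8):
    """One logit per in-sample / out-of-sample split.

    returns: array of shape (n_periods, n_param_sets), one column of
    strategy returns per tested parameter set.
    """
    returns = np.asarray(returns, dtype=float)
    n_sets = returns.shape[1]
    blocks = np.array_split(np.arange(returns.shape[0]), n_blocks)
    logits = []
    for train_ids in combinations(range(n_blocks), n_blocks // 2):
        test_ids = [b for b in range(n_blocks) if b not in train_ids]
        train = returns[np.concatenate([blocks[b] for b in train_ids])]
        test = returns[np.concatenate([blocks[b] for b in test_ids])]
        best = np.argmax(sharpe(train))       # best parameter set in-sample
        oos = sharpe(test)
        # Relative out-of-sample rank of that set, strictly inside (0, 1)
        rank = (oos < oos[best]).sum() + 1
        omega = rank / (n_sets + 1)
        logits.append(np.log(omega / (1 - omega)))
    return np.array(logits)

def pbo(logits):
    """Share of splits where the in-sample winner is at or below the
    out-of-sample median (logit <= 0)."""
    return float((np.asarray(logits) <= 0).mean())

# Synthetic data: 20 parameter sets, only column 3 has a real (exaggerated) edge
rng = np.random.default_rng(0)
returns = rng.normal(0.0, 0.01, size=(1000, 20))
returns[:, 3] += 0.005
logits = pbo_logits(returns)
pbo_value = pbo(logits)
print(len(logits), pbo_value)
```

The histogram of `logits` is what the "Logits distribution" figure represents, and the PBO is simply the part of that distribution at or below zero.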


From the combinatorial purged cross-validation (CPCV), the second metric we have studied together is the distribution of the Sharpe ratio across all the samples tested. The more this distribution is shifted toward high values, the better.

Figure: Sharpe Ratio distribution

histogram with a threshold highlight: probability to obtain a positive Sharpe ratio

Here, we can see several things. The probability of having a positive Sharpe ratio is close to 1. The mean of the distribution is between 2 and 3, and the probability of having a Sharpe ratio higher than 1 is visually around 80%. For all these reasons, it obtains a 9/10. (We will see worse cases in the next sections.)
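Instead of reading these numbers off the histogram, you can compute them directly from the Sharpe ratios of the tested samples. This is a minimal sketch: `summarize_sharpe` is a hypothetical helper and the sample values are made up for illustration, not taken from the figure.

```python
import numpy as np

def summarize_sharpe(sharpe_ratios):
    """Mean and tail probabilities of a Sharpe ratio distribution."""
    sr = np.asarray(sharpe_ratios, dtype=float)
    return {
        "mean": float(sr.mean()),
        "p_positive": float((sr > 0).mean()),   # share of samples with SR > 0
        "p_above_1": float((sr > 1).mean()),    # share of samples with SR > 1
    }

# Hypothetical values standing in for the Sharpe ratios of the tested samples
sample = [2.9, 3.1, 0.4, 2.2, 1.8, 3.4, 0.9, 2.6, 1.5, 2.7]
summary = summarize_sharpe(sample)
print(summary)
```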


We now switch to the Monte Carlo simulations. The goal here is similar to that of the CPCV, but with another method: here, we do not resample the historical data, we simulate new data based on the historical data.

Here, we will check visually whether the simulations give us interesting results after 1 year. It gives us a projection of the risk and the return you can expect 1 year after the strategy is put into live trading.
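As an illustration, here is a minimal sketch of how such paths can be generated. The model is an assumption on my side for the example (i.i.d. normal daily returns fitted on the historical mean and volatility), and `simulate_paths` and the fake history are hypothetical names and data.

```python
import numpy as np

def simulate_paths(hist_returns, n_paths=1000, horizon=252, seed=0):
    """Cumulative return paths built from i.i.d. normal daily returns."""
    hist_returns = np.asarray(hist_returns, dtype=float)
    mu = hist_returns.mean()
    sigma = hist_returns.std(ddof=1)
    rng = np.random.default_rng(seed)
    daily = rng.normal(mu, sigma, size=(n_paths, horizon))
    # Compound the daily returns; result has shape (n_paths, horizon)
    return np.cumprod(1.0 + daily, axis=1) - 1.0

# Fake daily strategy returns standing in for the historical backtest
hist = np.random.default_rng(42).normal(0.0004, 0.005, size=1500)
paths = simulate_paths(hist)
print(paths.shape)
```

Each row of `paths` is one possible year of live trading (252 trading days), expressed as a cumulative return.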

Figure: Monte Carlo simulations

line chart with hundreds of lines following the same trend

Here, we can see a clear upward trend in all the simulations, which is quite encouraging. Let’s summarize all the simulations using shaded areas to get something more readable.

Figure: Monte Carlo Projections

chart with different areas highlighted that give information about future stock prices

Now, we see that the odds of having lost money after 1 year of trading this strategy are around 10%, because the dark green represents the area from the 5th percentile to the 25th percentile (and from the 75th percentile to the 95th for the best results). The median is around 10% per year, which is quite good but not fantastic either. It obtains a 6/10.
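The shaded areas and the odds of losing money can be computed from the simulated paths. The sketch below assumes the paths are stored as cumulative returns with one row per simulation. The helper names and the toy paths are hypothetical, chosen so that the result is easy to check by hand.

```python
import numpy as np

def projection_bands(paths, percentiles=(5, 25, 50, 75, 95)):
    """Percentiles of the cumulative return at each time step."""
    paths = np.asarray(paths, dtype=float)
    return {p: np.percentile(paths, p, axis=0) for p in percentiles}

def probability_of_loss(paths):
    """Share of simulated paths that end the horizon with a negative return."""
    return float((np.asarray(paths, dtype=float)[:, -1] < 0).mean())

# Hypothetical toy paths: 10 simulations, 3 time steps each
paths = np.array([
    [0.01, 0.02, -0.03],
    [0.00, 0.03, 0.02],
    [0.02, 0.05, 0.06],
    [0.01, 0.04, 0.08],
    [0.03, 0.06, 0.10],
    [0.02, 0.07, 0.10],
    [0.04, 0.08, 0.12],
    [0.03, 0.09, 0.15],
    [0.05, 0.10, 0.18],
    [0.06, 0.12, 0.25],
])
bands = projection_bands(paths)
p_loss = probability_of_loss(paths)
print(bands[50][-1], p_loss)
```

In this toy case, 1 path out of 10 ends below zero, so the odds of losing money are 10%, and the median final return is 10%.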


Instead of taking an overview of the distribution, we can focus on specific metrics (Sharpe ratio, Calmar ratio, drawdown…). Here, we will focus on the max drawdown of each simulation. Once you have computed it for all the simulations, you obtain the max drawdown distribution.

Figure: Maximum Drawdown distribution

histogram with a threshold highlight: the drawdown distribution

My risk of ruin threshold is 20%. So, as the worst maximum drawdown is around 4%, it clearly obtains a 10/10.
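Here is a minimal sketch of how the max drawdown distribution and the odds of crossing the risk of ruin threshold can be obtained from simulated paths. It assumes the paths are cumulative returns starting from a capital of 1. The helper names and the toy paths are hypothetical.

```python
import numpy as np

def max_drawdowns(cum_returns):
    """Max drawdown (as a positive fraction) of each simulated path."""
    equity = 1.0 + np.asarray(cum_returns, dtype=float)
    # Prepend the starting capital so a loss from day one counts as a drawdown
    equity = np.hstack([np.ones((equity.shape[0], 1)), equity])
    running_max = np.maximum.accumulate(equity, axis=1)
    return (1.0 - equity / running_max).max(axis=1)

def risk_of_ruin(cum_returns, threshold=0.20):
    """Share of paths whose max drawdown exceeds the threshold."""
    return float((max_drawdowns(cum_returns) > threshold).mean())

# Hypothetical toy paths: 4 simulations, 3 time steps each
toy_paths = np.array([
    [0.05, 0.02, 0.08],
    [-0.10, -0.25, -0.05],
    [0.10, 0.20, 0.30],
    [0.02, -0.01, 0.04],
])
mdd = max_drawdowns(toy_paths)
ruin = risk_of_ruin(toy_paths, threshold=0.20)
print(mdd, ruin)
```

The histogram of `mdd` is the max drawdown distribution, and `ruin` is the share of simulations that go beyond the 20% threshold.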


Now that we have rated all the charts and metrics, we need to aggregate the results. In my opinion, you should only move your strategy to live trading tests if the average rating is higher than or equal to 7.

Moreover, you have a veto power if you find something you really dislike in the backtesting results, even if the average rating is higher than 7: a higher drawdown than you want, too many yield plateaus, not enough trades… These are only a few examples.
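The decision rule can be written in a few lines. This is only a sketch to make the rule explicit, since the ratings themselves must still come from your own manual analysis. The `decide` helper is hypothetical, and the ratings used are the ones given above for the four panels of the overall results figure (taking only these four is an assumption).

```python
def decide(ratings, veto_reasons=(), min_average=7.0):
    """Return (average rating, go-live decision).

    The strategy goes live only if the average rating reaches the
    minimum AND no veto reason has been raised.
    """
    average = sum(ratings.values()) / len(ratings)
    go_live = average >= min_average and not veto_reasons
    return average, go_live

ratings = {
    "walk_forward": 9,
    "sr_distribution": 9,
    "return_simulations": 6,
    "risk_of_ruin": 10,
}
average, go_live = decide(ratings)
# Same ratings, but with a veto raised by the analyst
_, go_live_with_veto = decide(ratings, veto_reasons=["max drawdown too high"])
print(average, go_live, go_live_with_veto)
```

The veto is kept as a list of reasons rather than a simple flag, so that you keep a written trace of why a strategy was rejected.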

Figure: Overall backtesting results (panels: walk forward, SR distribution, return simulations, risk of ruin)

Here, the overall rating of the trading strategy backtest is around 8.5, and I do not see any blocking point that would require a veto. So we can definitely continue with this strategy.


In the first example, we detailed all the steps. Now, I will only give you the results with my analysis.

Figure: Backtesting overview

several charts and histograms representing the output of a backtest

  • Walk-forward: quite good, without any plateau, even if the stability of the returns is not amazing. On the drawdown side, nothing bad except that the max drawdown is a bit too high. I will give a 7/10.
  • PBO: the logits distribution gives us a PBO around 7%, which is acceptable. I will give an 8/10 (because I generally rate a PBO of 10% at 7/10, but that is only my own scale).
  • Sharpe ratio distribution: the odds of having a negative Sharpe ratio are lower than 10%, which is quite good. On the other hand, the mean of the distribution is around 0.80. I give a 7/10.
  • Returns simulations: the median after 252 days is around 5%, which is not so good. We have around a 20% chance of losing money in the first year. It is worth a 6/10.
  • Risk of ruin: the max drawdown distribution is quite good, with a risk of ruin around 5%. It is worth an 8/10.

It gives us an overall rating of 7/10. So, the strategy can be put into live trading, but keep in mind that we need strict risk management, because 7/10 is really the minimum to go live. If you already have many strategies in your portfolio, you can add it, but with a small weight.



In this example, you will see why robustness tests are so important. Indeed, the walk-forward result seems perfect, but the robustness testing tells another story.

Figure: Backtesting overview

several charts and histograms representing the output of a backtest

  • Walk-forward: very good results, stable with a good upward trend. The drawdown is a bit high, but there are no other pain points: 8/10.

  • PBO: the logits distribution gives a PBO around 22%. It means that the odds that the best parameters from the walk-forward optimization are due to randomness are high. 5/10.

  • Sharpe ratio distribution: the odds of having a Sharpe ratio above 1 are around 30%, which is very low. 3/10.

  • Returns simulations: the odds of having lost money after 252 days are higher than 25%. 5/10.

  • Risk of ruin: the lowest max drawdown is around 10% and the worst is around 50%. The odds of hitting our risk of ruin threshold (drawdown higher than 20%) are around 70%. 2/10.



The average rating is clearly lower than 7, but in any case I would use my veto power for several reasons: the Sharpe ratio distribution is too weak and, above all, the max drawdown distribution is too bad.


Feel free to run your own analysis using the same method and post it on the public Quantreo forum.
If you have any questions, feel free to ask them on my public Discord forum or directly in private messages on LinkedIn.


Lucas Inglese

Lucas is a self-taught Quantitative Analyst, holding degrees in Mathematics and Economics from the University of Strasbourg. Embarking on an independent learning journey, he delved deeply into data science and quantitative finance, eventually mastering the disciplines. Lucas has developed numerous bots, sharing his insights and expertise on LinkedIn, where he regularly posts content. His understanding and empathy for beginners in this complex field led him to author several books and create the comprehensive “Alpha Quant Program.” This program offers e-learning videos, monthly projects, and continuous 7-day-a-week support. Through his online courses and publications, Lucas has successfully guided over 67,000 individuals in their pursuit of knowledge in quantitative finance.
