Are you trading the S&P 500, the Nasdaq 100 or the Forex market? If the answer is yes, you should take a look at this article. For many assets around the world, US Treasury bond data can have an important impact. So why not integrate it into your models?
Is the data too hard to find? Do you not know how to use it? Maybe there is another reason (feel free to join our 100% free public community to ask any related questions). We are here to help: in this article, we will explain how to import good-quality historical and real-time US Treasury bond data using an API.
Data is the first step of the trading strategy building process and one of the most time-consuming. It is also one of the most important, so do not skip it!
1. Create your 100% free API key
To import our data, we will use Alpha Vantage (https://www.alphavantage.co), which takes the data from the Federal Reserve Economic Data, FRED (https://fred.stlouisfed.org). First of all, we need to obtain our free API key. Nothing could be simpler: go to the following link (https://www.alphavantage.co/support/#api-key), give your email address and two other basic pieces of information, and you will receive your API key.
2. Import your first US Treasury bonds data
To make our first import, we will take the exact code provided by the Alpha Vantage documentation (https://www.alphavantage.co/documentation/#treasury-yield) and modify it throughout this article, so that you can do the same for any of the other datasets they provide. Let's do our first import with the code below:
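As a minimal sketch of that first request, assuming the `TREASURY_YIELD` function and the `interval`, `maturity` and `apikey` query parameters described in the Alpha Vantage documentation: the helper names `build_url` and `fetch_treasury_yield` are ours, `YOUR_API_KEY` is a placeholder, and the standard-library `urllib` is used here so that nothing extra has to be installed.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

BASE_URL = "https://www.alphavantage.co/query"


def build_url(api_key, interval="monthly", maturity="10year"):
    """Build the query URL for the TREASURY_YIELD endpoint."""
    params = {
        "function": "TREASURY_YIELD",
        "interval": interval,
        "maturity": maturity,
        "apikey": api_key,
    }
    return f"{BASE_URL}?{urlencode(params)}"


def fetch_treasury_yield(api_key, interval="monthly", maturity="10year"):
    """Download the JSON payload and return it as a Python dict."""
    url = build_url(api_key, interval, maturity)
    with urlopen(url, timeout=30) as response:
        return json.loads(response.read().decode("utf-8"))


# Usage (needs network access and a real key):
# data = fetch_treasury_yield("YOUR_API_KEY")
# print(data.keys())
```

The defaults request the 10-year maturity at a monthly interval, which matches the import described in this section.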
We can now take the values associated with the data key, which give us the interest rate data for the 10-year Treasury bonds. We can then transform this list of dictionaries into a pandas DataFrame to get a better view of our data. As we can see in the next figure, we have correctly imported the 10-year maturity data at a monthly interval. Moreover, I checked the data against the FRED dataset and both are the same. For example, the figure shows that, for the 1st of November 2023, the 10-year US Treasury bond interest rate was 4.50%.
Figure: result of a standard import using the Alpha Vantage API in Python
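The conversion described above can be sketched as follows. The payload is a hand-written stand-in with the shape we assume the API returns (a `data` key holding a list of `date`/`value` dictionaries, other metadata keys omitted). Only the 2023-11-01 value comes from this article; the other rows are placeholders.

```python
import pandas as pd

# Stand-in for the dict returned by the API (newest observation first).
# Only the 2023-11-01 value is taken from the article, the rest are placeholders.
payload = {
    "data": [
        {"date": "2023-11-01", "value": "4.50"},
        {"date": "2023-10-01", "value": "4.80"},
        {"date": "2023-09-01", "value": "4.38"},
    ]
}

# The list of dictionaries becomes one row per observation.
df = pd.DataFrame(payload["data"])
print(df)
```

At this stage both columns still hold strings, and the newest row comes first.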
That was our first import, and it was very simple. However, there are several things to take into consideration, which we will handle in the “Preprocess the data (please, read me)” section (datetime index, wrong order, etc.).
3. Modify your US treasury bonds data request (bond maturity, interval import, data frequency, …)
Now, let’s talk about the different options you can modify to adapt your request to your needs. You will find all the customizable parameters in the documentation (https://www.alphavantage.co/documentation/#treasury-yield), but in this post we will only modify the most important ones (in my opinion): the interval and the maturity.
3.1. Modify the data frequency
The advantage of bond interest rates is that we do not need small timeframes (1-minute, 15-minute, …) to incorporate this data in our analysis. Alpha Vantage offers daily, weekly, and monthly intervals. The daily interval is good enough to bring information into our models because we generally use this data to analyze market conditions and trends, not quick reversal movements (which demand small timeframes).
3.2. Modify the US treasury bonds maturity
The second parameter that you can modify, and that you SHOULD modify, is the maturity. Alpha Vantage offers several maturities: 3-month, 2-year, 5-year, 7-year, 10-year and 30-year. That is fewer than in the original database, but enough to bring the information we need into our model. In my experience, the 3-month and 2-year maturities are the most useful because they show the biggest variations, and so carry more potential information. Of course, the data you need depends on the project you are working on.
Let’s import the data for the 3-month maturity US Treasury bonds at a daily frequency with the code below:
Figure: 3-month US Treasury bonds interest rate (daily frequency)
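A minimal sketch of such a request is shown here. The accepted `interval` and `maturity` strings are the ones listed in the Alpha Vantage documentation for `TREASURY_YIELD`. The helper name `treasury_yield_url` and the placeholder key are ours, and the validation step is an addition for illustration.

```python
from urllib.parse import urlencode

# Strings accepted by the TREASURY_YIELD endpoint, per the Alpha Vantage documentation.
INTERVALS = {"daily", "weekly", "monthly"}
MATURITIES = {"3month", "2year", "5year", "7year", "10year", "30year"}


def treasury_yield_url(api_key, interval="monthly", maturity="10year"):
    """Return the request URL, rejecting unsupported parameter values early."""
    if interval not in INTERVALS:
        raise ValueError(f"interval must be one of {sorted(INTERVALS)}")
    if maturity not in MATURITIES:
        raise ValueError(f"maturity must be one of {sorted(MATURITIES)}")
    params = {
        "function": "TREASURY_YIELD",
        "interval": interval,
        "maturity": maturity,
        "apikey": api_key,
    }
    return "https://www.alphavantage.co/query?" + urlencode(params)


# 3-month maturity at a daily frequency.
url = treasury_yield_url("YOUR_API_KEY", interval="daily", maturity="3month")
print(url)
```

Checking the parameters locally avoids spending an API call on a typo such as `3months`.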
4. Preprocess the data (please, read me)
The import was a clear success. It is easy to use, and we can get the data we want in fewer than 10 lines of code, which is very helpful. But for now we have raw data: non-numerical values, not in chronological order, with missing values, … The goal of this section is to convert this raw data into an easy-to-use dataset for your projects.
4.1. Add a DateTime index
First, we need to change the index: convert the dates, which are strings, to a datetime format and set them as the index. This will help a lot later when combining datasets (link to the code at the end of the post).
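This step can be sketched with `pd.to_datetime` and `set_index`. The small DataFrame is a stand-in for the raw import: the "." on 1981-09-07 is the missing value mentioned later in this article, and the other values are placeholders.

```python
import pandas as pd

# Stand-in for the raw import: newest first, everything stored as strings.
# The numeric values are placeholders, not real quotes.
df = pd.DataFrame(
    {
        "date": ["1981-09-08", "1981-09-07", "1981-09-04"],
        "value": ["15.2", ".", "15.0"],
    }
)

# Parse the date strings, then use them as the index.
df["date"] = pd.to_datetime(df["date"], format="%Y-%m-%d")
df = df.set_index("date")
print(df.index)
```

With a `DatetimeIndex`, joins and date-based slicing against other datasets work on real timestamps rather than on text.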
When combining several time series datasets in finance, make sure that you always use the same time zone to avoid look-ahead bias.
Figure: Visual explanation of what we changed in the data (modifying the date column)
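To illustrate the time zone point, here is a small sketch with two hypothetical intraday datasets, one stamped in New York time and one in UTC. Both tables and all their values are made up for the example. The idea is simply to convert everything to one zone before joining.

```python
import pandas as pd

# Hypothetical prices stamped in New York local time (placeholder values).
prices = pd.DataFrame(
    {"close": [100.0, 101.0]},
    index=pd.to_datetime(["2024-01-02 16:00", "2024-01-03 16:00"]).tz_localize(
        "America/New_York"
    ),
)

# Hypothetical series already stamped in UTC (placeholder values).
yields = pd.DataFrame(
    {"value": [5.0, 5.1]},
    index=pd.to_datetime(["2024-01-02 21:00", "2024-01-03 21:00"]).tz_localize("UTC"),
)

# Put both datasets on the same clock before combining them.
prices_utc = prices.tz_convert("UTC")
combined = prices_utc.join(yields, how="inner")
print(combined)
```

In January, 16:00 in New York is 21:00 UTC, so the rows line up once both indexes are in UTC. Comparing the raw clock times without the conversion would pair observations that are five hours apart.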
4.2. Sort your time-series data in chronological order
The second step prevents a simple mistake that can lead to hugely misleading results. When you import data using Alpha Vantage, the latest value always comes first, BUT when you work in finance, trading or machine learning with a time series dataset, you analyze it in chronological order. If you feed the unsorted data to any model from the Alpha Quant Program or any other Python library, you will analyze the past using the future instead of analyzing the future using the past.
Figure: Visual explanation of what we changed in the data (order the data)
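A minimal sketch of the reordering, assuming the dates are already the index as in section 4.1 (the rows are placeholders in the newest-first order the API returns):

```python
import pandas as pd

# Stand-in for the imported data: newest first, placeholder values.
df = pd.DataFrame(
    {"value": ["15.2", ".", "15.0"]},
    index=pd.to_datetime(["1981-09-08", "1981-09-07", "1981-09-04"]),
)
df.index.name = "date"

# Oldest observation first, newest last.
df = df.sort_index()
print(df)
```

A quick `df.index.is_monotonic_increasing` check before training or backtesting is a cheap way to catch data that was left in reverse order.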
4.3. Transform the string values into numerical values
The last step, which is essential to be able to analyze our data, is to transform our string values into numerical values, because all the data comes as strings when you import it using Alpha Vantage. We will not see any visual change except that the missing values go from “.” to NaN, as for the date 1981-09-07.
Figure: Visual explanation of what we changed in the data (from string to numerical values)
Now that our data is properly preprocessed, we can plot it to check that everything looks good. We can see a few runs of consecutive dates with missing values but, apart from that, all seems good.
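One way to sketch this conversion is `pd.to_numeric` with `errors="coerce"`, which turns anything that cannot be parsed as a number (here the ".") into NaN. The values other than the "." on 1981-09-07 are placeholders.

```python
import pandas as pd

# Sorted stand-in data, still stored as strings (placeholder values).
df = pd.DataFrame(
    {"value": ["15.0", ".", "15.2"]},
    index=pd.to_datetime(["1981-09-04", "1981-09-07", "1981-09-08"]),
)
df.index.name = "date"

# Unparseable entries such as "." become NaN instead of raising an error.
df["value"] = pd.to_numeric(df["value"], errors="coerce")
print(df.dtypes)
print(df)
```

Note that `errors="coerce"` silently converts every unparseable entry, so it is worth counting the NaN values afterwards with `df["value"].isna().sum()`.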
To fix this NaN problem, you have several options: check the missing data manually, or predict it using a time series model. Both options have advantages that I will not detail in this post.
Figure: 3-month maturity US Treasury bonds interest rate from 1980 to 2024
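Before choosing an option, it helps to know where the gaps are and how long they run. The sketch below lists the missing dates and the length of each run of consecutive NaN values. The series is a made-up example, not real quotes.

```python
import pandas as pd

nan = float("nan")

# Made-up business-day series with one single gap and one two-day gap.
values = pd.Series(
    [15.0, nan, 15.2, nan, nan, 15.1],
    index=pd.date_range("1981-09-04", periods=6, freq="B"),
    name="value",
)

is_missing = values.isna()
missing_dates = values.index[is_missing]

# A new run starts every time the missing/present flag changes.
run_id = is_missing.ne(is_missing.shift(fill_value=False)).cumsum()

# Keep only the missing rows and count how many fall in each run.
gap_lengths = is_missing[is_missing].groupby(run_id[is_missing]).size()

print(missing_dates)
print(gap_lengths.tolist())
```

Short, isolated gaps are easy to check by hand against FRED, while long runs are the ones where a model-based estimate may be worth the effort.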