In this blog entry, we carry out a portfolio optimisation in the statistical software R from the point of view of a mean-variance investor. The optimisation functions used below (portfolio.spec, add.constraint, add.objective and optimize.portfolio) come from the PortfolioAnalytics package; a few other packages are used to load and prepare the data.
First of all, you should install R. It is also advisable to install the RStudio editor, as it comes with some nice features.
Before opening R, create a new working directory for our experiments. Open your terminal (if you are on a Unix-like system) and create the directory:
mkdir ~/R_portfolio
The command creates the directory /Users/<user>/R_portfolio in your user directory (the exact path of the user directory depends on your system). We will use it as the working directory in R to save files later on. Now open R and switch its working directory to the newly created directory using the command
setwd("~/R_portfolio")
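If you would rather not leave R, for example on a system without a Unix shell, the two steps can be combined. The following is only a sketch: use_workdir is a hypothetical helper name made up for this post, not a function from any package.

```r
# Hypothetical helper (not from any package): create a directory if needed
# and make it the R working directory.
use_workdir <- function(path) {
  path <- path.expand(path)             # resolve "~" to the home directory
  if (!dir.exists(path)) {
    dir.create(path, recursive = TRUE)  # also creates missing parent directories
  }
  setwd(path)                           # switch the working directory
  invisible(getwd())                    # return the new working directory invisibly
}
```

Calling use_workdir("~/R_portfolio") would then replace the mkdir and setwd steps above.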
To start with the portfolio optimisation, we first need daily data for the stocks we want to add to our portfolio. Ticker data from Yahoo Finance can be loaded on the fly directly from R; there is no need to go to the website, download CSV files and load them into R. The R package we use for loading the data is called quantmod.
# load library quantmod to load daily prices from yahoo
library(quantmod)
It lets you access one or more tickers directly, optionally restricted to a date range. For example, to load the ticker data for Apple (AAPL) between 2010-01-01 and today, use the following command:
# get ticker data for Apple (AAPL) from Yahoo Finance between 2010-01-01 and today
getSymbols.yahoo("AAPL", env=globalenv(), from="2010-01-01", to = Sys.Date())
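If you leave out the date range, getSymbols falls back to its default start date, which is 2007-01-01 in the quantmod versions I am aware of. Pass an earlier from date if you need the complete history available on Yahoo Finance. Without a date range the call looks like this:
# get the data for Apple from the default start date until today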
getSymbols("AAPL", src="yahoo")
Either way, you can access the data series using the ticker name (this prints the whole data set to the R console):
# print the loaded data for Apple using the ticker name
AAPL
Looking at the output for the ticker name, you can see that the data set has the six columns AAPL.Open, AAPL.High, AAPL.Low, AAPL.Close, AAPL.Volume and AAPL.Adjusted. The columns are pretty self-explanatory. We are only interested in the adjusted close price. To get only the adjusted close price for Apple, you can use the Ad() function:
# get adjusted close price for Apple
Ad(AAPL)
We want the data not just for a single stock but for all the stocks we want to invest in. As an example, assume we invest in
- Apple (AAPL),
- Alphabet Inc. (GOOG), formerly known as Google,
- JPMorgan Chase & Co. (JPM),
- Tesla Motors, Inc. (TSLA), and
- General Motors Company (GM).
Then we can retrieve the data for all five stocks at once using
ticker <- c("AAPL", "GOOG", "JPM", "TSLA", "GM")
getSymbols(ticker, src="yahoo", return.class="zoo")
You can now access all ticker data as mentioned for Apple above using the ticker names: AAPL, GOOG, JPM, TSLA, and GM.
The function call above also returns the data as objects of class zoo. zoo is another R package for working with time series. It was implicitly loaded when we loaded the quantmod package; nevertheless, we also load it explicitly:
# load library zoo
library(zoo)
To work with the different stock time series simultaneously, we merge the adjusted close prices of the stocks into a single zoo object:
# merge the adjusted close prices of all five stocks into one zoo object
data.raw <- merge.zoo(Ad(AAPL), Ad(GOOG), Ad(JPM), Ad(TSLA), Ad(GM))
Looking at the data contained in data.raw, we can see that the early observations are missing for the last two stocks, i.e. TSLA and GM:
...
2010-06-25 35.28400 236.1043 33.99956 NA NA
2010-06-28 35.49568 235.8046 33.22371 NA NA
2010-06-29 33.89090 226.9035 31.94786 23.89 NA
2010-06-30 33.27703 222.2531 31.55994 23.83 NA
...
2010-11-16 39.89990 291.5689 34.23692 29.67 NA
2010-11-17 39.75569 291.4840 33.86525 29.49 NA
2010-11-18 40.80482 297.9825 34.28013 29.89 31.29770
2010-11-19 40.57991 295.1204 34.06405 30.99 31.36178
...
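Rather than scanning the printout, the first date with complete data can also be located programmatically. The sketch below rebuilds a few of the rows shown above by hand (only three of the five columns, with values copied from the printout) so that it runs without a download. With the real data you would apply the last two steps to data.raw.

```r
library(zoo)

# A few adjusted prices copied from the printout above (AAPL, TSLA, GM only)
AAPL <- zoo(c(39.89990, 39.75569, 40.80482, 40.57991),
            as.Date(c("2010-11-16", "2010-11-17", "2010-11-18", "2010-11-19")))
TSLA <- zoo(c(29.67, 29.49, 29.89, 30.99), index(AAPL))
GM   <- zoo(c(31.29770, 31.36178),
            as.Date(c("2010-11-18", "2010-11-19")))  # GM starts two days later

# merge() aligns the series by date and fills the gaps with NA
prices <- merge(AAPL, TSLA, GM)

# Flag the rows without any NA and take the first of their dates
complete.rows  <- complete.cases(coredata(prices))
first.complete <- index(prices)[complete.rows][1]
first.complete
```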
The earliest date on which we have data for all five stocks is 2010-11-18. We now load the two packages PortfolioAnalytics and PerformanceAnalytics, and calculate arithmetic and logarithmic returns from the time series:
# load libraries PortfolioAnalytics and PerformanceAnalytics
library(PortfolioAnalytics)
library(PerformanceAnalytics)
# calculate arithmetic returns
data.arith <- Return.calculate(data.raw, method="simple")
# calculate logarithmic returns
data.log <- Return.calculate(data.raw, method="compound")
The first date with a return for all five stocks is 2010-11-19. For 2010-11-18 no GM return can be calculated, because there is no GM base price for 2010-11-17. The excerpt below shows the logarithmic returns (data.log):
...
2010-11-18 2.604716e-02 2.204970e-02 0.0121767059 1.347272e-02 NA
2010-11-19 -5.526972e-03 -9.651418e-03 -0.0063235442 3.614063e-02 0.0020452778
2010-11-22 2.138474e-02 6.597773e-04 -0.0231016949 7.489139e-02 -0.0052676974
...
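To see how the two return definitions differ, here is the calculation done by hand for a single day. It is a minimal sketch in base R that uses the two adjusted AAPL prices for 2010-11-18 and 2010-11-19 from the price printout further up; the variable names are made up for this illustration.

```r
# Adjusted AAPL prices for 2010-11-18 and 2010-11-19, copied from the printout
p.prev <- 40.80482
p.curr <- 40.57991

# Arithmetic (simple) return: relative price change
r.arith <- p.curr / p.prev - 1

# Logarithmic (continuously compounded) return: difference of the log prices
r.log <- log(p.curr) - log(p.prev)

# Both values side by side
c(arithmetic = r.arith, logarithmic = r.log)

# The two are linked by r.log = log(1 + r.arith)
all.equal(r.log, log1p(r.arith))
```

The logarithmic value agrees with the AAPL entry for 2010-11-19 in the output above, up to the rounding of the printed prices. The arithmetic return is slightly larger, which is always the case for a non-zero price change.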
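Accordingly, we choose the period from 2010-11-17 to 2013-12-31 as the in-sample period to obtain our portfolio weights, i.e. this is the period for optimisation (calibration). Note that this window still contains 2010-11-17 and 2010-11-18, for which the GM return is NA; start at 2010-11-19 instead if you want complete rows only: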
# filter data.arith for the range from 2010-11-17 to 2013-12-31 which is our in-sample period
data.arith.ins <- data.arith[ index(data.arith) >= as.Date("2010-11-17") & index(data.arith) <= as.Date("2013-12-31") ]
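zoo also offers window() for this kind of date filtering, which you may find easier to read than the logical index. A minimal sketch on a made-up daily series (the values are placeholders, not market data):

```r
library(zoo)

# Placeholder series: one value per calendar day from 2010-11-15 to 2014-01-05
dates <- seq(as.Date("2010-11-15"), as.Date("2014-01-05"), by = "day")
z <- zoo(seq_along(dates), dates)

# Logical indexing, in the same style as the command above
ins.index <- z[index(z) >= as.Date("2010-11-17") & index(z) <= as.Date("2013-12-31")]

# The same selection with window(); both boundaries are inclusive
ins.window <- window(z, start = as.Date("2010-11-17"), end = as.Date("2013-12-31"))

# Both approaches select the same dates
all(index(ins.index) == index(ins.window))
range(index(ins.window))
```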
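Note that we used the arithmetic returns data.arith. You could also use logarithmic returns if you like. Usually, though, you would use arithmetic returns for short- and mid-term investment horizons and logarithmic returns for long-term investment horizons. Arithmetic returns also have the convenient property that the portfolio return is the weighted sum of the individual stock returns, whereas logarithmic returns add up over time rather than across stocks.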
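The difference between the two return types when combining stocks into a portfolio can be checked with a tiny example. The numbers below are made up for illustration (two hypothetical assets A and B over one period); this is just a sketch in base R.

```r
# Made-up one-period simple returns and portfolio weights (illustration only)
r.arith <- c(A = 0.10, B = -0.05)
w       <- c(A = 0.60, B = 0.40)

# Portfolio simple return: weighted sum of the asset simple returns
port.arith <- sum(w * r.arith)

# The same portfolio return derived from the end-of-period values
port.check <- sum(w * (1 + r.arith)) - 1

# Log returns do not aggregate this way across assets
r.log          <- log(1 + r.arith)
port.log.true  <- log(1 + port.arith)  # log return of the actual portfolio
port.log.naive <- sum(w * r.log)       # weighted sum of the asset log returns

c(port.arith = port.arith, port.check = port.check)
c(true = port.log.true, naive = port.log.naive)
```

The first pair of numbers is identical, while the weighted sum of the log returns falls short of the true portfolio log return.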
To obtain the optimal weights from the in-sample data, some preparation is needed: load the timeSeries package and convert our zoo object into a timeSeries object:
# load library timeSeries
library(timeSeries)
# convert zoo into timeSeries
data.arith.ins.ts <- as.timeSeries(data.arith.ins)
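Finally, we can use the PortfolioAnalytics package loaded earlier to obtain the minimum standard deviation (volatility) portfolio from the in-sample data. Note that optimize_method = "ROI" requires the ROI package together with a quadratic programming plugin such as ROI.plugin.quadprog to be installed: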
# use the column names of the return series as asset names
assets <- colnames(data.arith.ins)
# create the portfolio specification and require the weights to sum to 1
portfolio.init <- portfolio.spec(assets)
portfolio.init <- add.constraint(portfolio.init, type = "full_investment")
# calculate minimum std. dev. portfolio
portfolio.minSD <- add.objective(portfolio = portfolio.init, type="risk", name="StdDev")
portfolio.minSD.opt <- optimize.portfolio(data.arith.ins.ts, portfolio = portfolio.minSD, optimize_method = "ROI", trace = TRUE)
# extract the optimal weights from the optimisation result
portfolio.minSD.weights <- portfolio.minSD.opt$weights
If you now type portfolio.minSD.weights into your R console, you can see the weights obtained from the data:
> portfolio.minSD.weights
Ad(AAPL) Ad(GOOG) Ad(JPM) Ad(TSLA) Ad(GM)
0.30965757 0.41682809 0.17623113 0.01959677 0.07768645
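The weights above are the weights of the individual stocks in the minimum standard deviation portfolio, i.e. the minimum-variance portfolio of the mean-variance framework. As a check, we sum up the weights, and they do indeed sum to 1 (up to rounding), as required by the full-investment constraint.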
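With the full-investment constraint as the only constraint, the minimum-variance weights also have a well-known closed form: the inverse covariance matrix applied to a vector of ones, rescaled so that the weights sum to 1. The sketch below applies it to simulated returns for three hypothetical assets (placeholder data, not the Yahoo series). It is meant as a cross-check of the idea, not as a description of what optimize.portfolio does internally.

```r
set.seed(42)

# Simulated daily returns for three hypothetical assets (placeholder data)
n <- 500
R <- cbind(A = rnorm(n, 0, 0.010),
           B = rnorm(n, 0, 0.020),
           C = rnorm(n, 0, 0.015))

Sigma <- cov(R)              # sample covariance matrix of the returns
ones  <- rep(1, ncol(R))

# Closed-form minimum-variance weights under the full-investment constraint
w.raw <- solve(Sigma, ones)  # solves Sigma %*% x = 1, i.e. Sigma^{-1} %*% 1
w.min <- w.raw / sum(w.raw)  # rescale so that the weights sum to 1
names(w.min) <- colnames(R)

# Portfolio variance for a given weight vector
port.var <- function(w) as.numeric(t(w) %*% Sigma %*% w)

w.min
sum(w.min)                                  # full-investment check
port.var(w.min) <= port.var(rep(1/3, 3))    # no worse than equal weights
```

With the real data, sum(portfolio.minSD.weights) performs the same full-investment check on the weights returned by the optimiser.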
You can find the full source code on my GitHub page at https://github.com/danieldinter/R-portfolio-optimization.