In this blog entry, we are going to obtain the optimal portfolio of a mean-variance investor using the statistical software R and the package **fPortfolio**. We will use other packages as well to prepare the data, but **fPortfolio** is the main package used for the portfolio optimisation.

First of all, you should install R. It is also advisable to install the RStudio editor, as it has some nice features included.

Before opening R, create a new working directory to experiment in. If you’re using a Unix system, open your Terminal and create the directory:

mkdir ~/R_portfolio

The command creates the directory `/Users/<user>/R_portfolio` in your user directory (this is the path on macOS; on most Linux systems it is `/home/<user>/R_portfolio`). We will use it as the working directory in R, where files will eventually be saved.

Now, open R and switch its working directory to the newly created directory using the command

setwd("~/R_portfolio")

To start with the portfolio optimization, we first need daily data for the stocks we want to invest in and add to our portfolio. Ticker data from Yahoo Finance can be loaded on the fly directly from R, so there is no need to go to the website, download CSV files and load them into R. The R package we use for loading the data is called *quantmod*.

    # load library quantmod to load daily prices from Yahoo
    library(quantmod)

It lets you access one or more tickers directly, optionally restricted to a date range. For example, to access the ticker data for Apple (AAPL) between 01/01/2010 and today, use the following command:

    # get ticker data for Apple (AAPL) from Yahoo Finance between 01/01/2010 and today
    getSymbols.yahoo("AAPL", env = globalenv(), from = "2010-01-01", to = Sys.Date())

If you want the full history available for Apple on Yahoo Finance, use the following command:

    # get the full data history for Apple available on Yahoo Finance
    getSymbols("AAPL", src = "yahoo")

Either way, you can access the data series using the ticker name (this will print the whole data set on the R console):

    # either way you can access the whole history using the ticker name
    AAPL

Looking at the output for the ticker name, you can see that the data set has the six columns *AAPL.Open*, *AAPL.High*, *AAPL.Low*, *AAPL.Close*, *AAPL.Volume* and *AAPL.Adjusted*. The columns are pretty self-explanatory. We are only interested in the adjusted close price. To get only the adjusted price for Apple, you can use the `Ad()` function:

    # get adjusted close price for Apple
    Ad(AAPL)

We want the data not just for a single stock but for all the stocks we want to invest in. For example, assume we invest in

- Apple (AAPL),
- Alphabet Inc. (GOOG), formerly known as Google,
- JPMorgan Chase & Co. (JPM),
- Tesla Motors, Inc. (TSLA), and
- General Motors Company (GM).

Then we can retrieve the data for all five stocks at once using

    ticker <- c("AAPL", "GOOG", "JPM", "TSLA", "GM")
    getSymbols(ticker, src = "yahoo", return.class = "zoo")

You can now access all ticker data as mentioned for Apple above using the ticker names: AAPL, GOOG, JPM, TSLA, and GM.

The function call above also returns the data as *zoo* class objects. *zoo* is another R package, made for working with time series. The package was implicitly loaded when we loaded the *quantmod* package. However, we also load it explicitly:

    # load library zoo
    library(zoo)

To work with the different stock time series simultaneously, we merge the adjusted close prices of the stocks into a single zoo object:

    # merge the adjusted close prices of the five stocks
    data.raw <- merge.zoo(Ad(AAPL), Ad(GOOG), Ad(JPM), Ad(TSLA), Ad(GM))

Looking at the data contained in `data.raw`, we can see that the early part of the sample has no data for the last two stocks, i.e. TSLA and GM:

    ...
    2010-06-25 35.28400 236.1043 33.99956    NA       NA
    2010-06-28 35.49568 235.8046 33.22371    NA       NA
    2010-06-29 33.89090 226.9035 31.94786 23.89       NA
    2010-06-30 33.27703 222.2531 31.55994 23.83       NA
    ...
    2010-11-16 39.89990 291.5689 34.23692 29.67       NA
    2010-11-17 39.75569 291.4840 33.86525 29.49       NA
    2010-11-18 40.80482 297.9825 34.28013 29.89 31.29770
    2010-11-19 40.57991 295.1204 34.06405 30.99 31.36178
    ...
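The first date with data for every stock can also be located programmatically instead of by scanning the printout. The following is a minimal sketch on made-up prices for two hypothetical series `A` and `B` (not the Yahoo data), assuming only that the *zoo* package is installed:

```r
library(zoo)

# Made-up prices: series B starts trading two days after series A
dates <- as.Date("2010-11-15") + 0:4
A <- zoo(c(10, 11, 12, 13, 14), order.by = dates)
B <- zoo(c(30, 31, 32), order.by = dates[3:5])

# merge() on zoo objects keeps all dates and fills the gaps with NA
prices <- merge(A, B)

# complete.cases() flags the rows that contain no NA at all
is.complete <- complete.cases(coredata(prices))
first.complete <- index(prices)[is.complete][1]
print(first.complete)

# na.omit() drops the incomplete rows altogether
prices.complete <- na.omit(prices)
print(prices.complete)
```

Applied to `data.raw`, the same two steps would report the first row in which all five columns are filled.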

The earliest date on which we have prices for all five stocks is 2010-11-18. We now load the two packages PortfolioAnalytics and PerformanceAnalytics, and calculate arithmetic and logarithmic returns from the time series:

    # load libraries PortfolioAnalytics and PerformanceAnalytics
    library(PortfolioAnalytics)
    library(PerformanceAnalytics)
    # calculate arithmetic returns
    data.arith <- Return.calculate(data.raw, method = "simple")
    # calculate logarithmic returns
    data.log <- Return.calculate(data.raw, method = "compound")

The first date with a return for all five stocks is 2010-11-19. For 2010-11-18 no return can be calculated for GM, as there is no base price for GM on 2010-11-17:

    ...
    2010-11-18  2.604716e-02  2.204970e-02  0.0121767059 1.347272e-02            NA
    2010-11-19 -5.526972e-03 -9.651418e-03 -0.0063235442 3.614063e-02  0.0020452778
    2010-11-22  2.138474e-02  6.597773e-04 -0.0231016949 7.489139e-02 -0.0052676974
    ...

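Accordingly, we choose the period from 2010-11-17 to 2013-12-31 as the in-sample period, i.e. the period used for optimization (calibration) to obtain our portfolio weights. Because GM has no return before 2010-11-19, the first rows of this window contain `NA` for GM; starting at 2010-11-19 or dropping incomplete rows with `na.omit()` would avoid passing missing values to the optimizer. The filter is: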

    # filter data.arith for the range from 2010-11-17 to 2013-12-31, which is our in-sample period
    data.arith.ins <- data.arith[
      index(data.arith) >= as.Date("2010-11-17") &
      index(data.arith) <= as.Date("2013-12-31")
    ]
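As a side note, *zoo* also offers `window()` for this kind of date filter. The sketch below uses a made-up return series and a hypothetical four-day window (not the data from this post) to show that both approaches select the same rows:

```r
library(zoo)

# Made-up daily return series (values are for illustration only)
dates <- as.Date("2013-12-27") + 0:7
returns <- zoo(seq(-0.03, 0.04, by = 0.01), order.by = dates)

# Hypothetical in-sample window
from <- as.Date("2013-12-28")
to   <- as.Date("2013-12-31")

# Logical indexing on the time index, as done in the text
ins.index <- returns[index(returns) >= from & index(returns) <= to]

# window() expresses the same filter more compactly
ins.window <- window(returns, start = from, end = to)

print(ins.window)
```

Which form to use is a matter of taste; the logical index is more flexible, while `window()` is shorter for a plain start/end range.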

Note that we used the arithmetic returns `data.arith`. You could also use logarithmic returns if you prefer. Usually, however, you would use arithmetic returns for short- and mid-term investment horizons and logarithmic returns for long-term investment horizons.
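To make the difference between the two return types concrete, here is a small base-R sketch on made-up prices (the numbers are purely illustrative). It shows that logarithmic returns add up over time, whereas arithmetic returns compound multiplicatively:

```r
# Hypothetical closing prices for five consecutive trading days (made-up numbers)
prices <- c(100, 102, 99, 101, 105)

# Arithmetic (simple) returns: P_t / P_{t-1} - 1
ret.arith <- prices[-1] / prices[-length(prices)] - 1

# Logarithmic returns: log(P_t / P_{t-1})
ret.log <- diff(log(prices))

# Log returns sum to the log of the total price change ...
total.log <- sum(ret.log)

# ... while arithmetic returns have to be compounded
total.arith <- prod(1 + ret.arith) - 1

print(round(cbind(ret.arith, ret.log), 5))
print(c(total.arith = total.arith, from.log = exp(total.log) - 1))
```

Both totals describe the same overall price change; for small daily moves the two return series are also numerically close to each other.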

To obtain the optimal weights from the in-sample data, we need some preparation: load the *timeSeries* package and convert our *zoo* object into a *timeSeries* object:

    # load library timeSeries
    library(timeSeries)
    # convert zoo into timeSeries
    data.arith.ins.ts <- as.timeSeries(data.arith.ins)

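Finally, we can obtain the minimum standard deviation (volatility) portfolio from the in-sample data. Note that the functions called here (`portfolio.spec`, `add.constraint`, `add.objective` and `optimize.portfolio`) come from the *PortfolioAnalytics* package loaded earlier, not from *fPortfolio*: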

    assets <- colnames(data.arith.ins)
    portfolio.init <- portfolio.spec(assets)
    portfolio.init <- add.constraint(portfolio.init, type = "full_investment")
    # calculate minimum std. dev. portfolio
    # (optimize_method = "ROI" needs the ROI package and a quadratic programming plugin such as ROI.plugin.quadprog)
    portfolio.minSD <- add.objective(portfolio = portfolio.init, type = "risk", name = "StdDev")
    portfolio.minSD.opt <- optimize.portfolio(data.arith.ins.ts, portfolio = portfolio.minSD,
                                              optimize_method = "ROI", trace = TRUE)
    portfolio.minSD.weights <- portfolio.minSD.opt$weights

If you now type `portfolio.minSD.weights` into your R console, you can see the weights obtained from the data:

    > portfolio.minSD.weights
      Ad(AAPL)   Ad(GOOG)    Ad(JPM)   Ad(TSLA)     Ad(GM)
    0.30965757 0.41682809 0.17623113 0.01959677 0.07768645

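The weights above are the weights of the individual stocks in the minimum-variance portfolio, i.e. the portfolio with the lowest standard deviation under the full-investment constraint. As a check, summing them (e.g. with `sum(portfolio.minSD.weights)`) indeed gives 1, up to rounding.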
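When the only constraint is full investment (weights sum to 1, short positions allowed), the minimum-variance weights also have a closed form: the inverse covariance matrix applied to a vector of ones, rescaled to sum to 1. The base-R sketch below illustrates this on a made-up covariance matrix for three hypothetical assets; it is meant as a sanity check of the idea, not as a description of what the numerical optimizer does internally:

```r
# Made-up covariance matrix of three hypothetical assets (symmetric, positive definite)
Sigma <- matrix(c(0.040, 0.006, 0.004,
                  0.006, 0.090, 0.010,
                  0.004, 0.010, 0.160),
                nrow = 3, byrow = TRUE,
                dimnames = list(c("A", "B", "C"), c("A", "B", "C")))

ones <- rep(1, nrow(Sigma))

# Closed-form minimum-variance weights: Sigma^-1 1 / (1' Sigma^-1 1)
w.raw <- solve(Sigma, ones)
w <- w.raw / sum(w.raw)

# The weights sum to 1 (full investment)
print(round(w, 4))
print(sum(w))

# Portfolio standard deviation implied by these weights
port.sd <- sqrt(drop(t(w) %*% Sigma %*% w))
print(port.sd)

# For comparison: standard deviation of the equally weighted portfolio
w.eq <- rep(1 / 3, 3)
eq.sd <- sqrt(drop(t(w.eq) %*% Sigma %*% w.eq))
print(eq.sd)
```

With your own data you could replace `Sigma` by the sample covariance matrix of the in-sample returns and compare the result with the optimizer's weights.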