Prerequisites

Before we get to the optimization examples, we will download the price history for a small number of ETFs and show how to calculate the returns, standard deviations, variance matrix and other inputs that we will need for the optimizer.

Prices

For these examples, we will download the daily prices for a handful of ETFs from Yahoo Finance, starting in 2003. Once the daily data is downloaded, we use the to.weekly() function to convert the prices to a weekly series (there is also a to.monthly() function) and then combine the series column by column into a data.frame.

library(xts)
library(quantmod)
## Loading required package: TTR
## Registered S3 method overwritten by 'quantmod':
##   method            from
##   as.zoo.data.frame zoo
## Version 0.4-0 included new data defaults. See ?getSymbols.
library(ggplot2)
## 
## Attaching package: 'ggplot2'
## The following object is masked _by_ '.GlobalEnv':
## 
##     %+%
symbols <- c("IWD","IWF","IWN","IWO","IWP","IWS")
initDate <- "2003-01-01"
go <- try(getSymbols(symbols, from = initDate))
## pausing 1 second between requests for more than 5 symbols
## pausing 1 second between requests for more than 5 symbols
prices <- list()
for(i in 1:length(symbols)) {
  tmp <- Ad(get(symbols[i]))
  tmp <- to.weekly(tmp, indexat="%m/%d/%Y")
  tmp <- tmp$tmp.Close
  names(tmp) <- symbols[i]
  prices[[i]] <- tmp
}
prices <- do.call(cbind, prices)
tail(prices)
##               IWD    IWF    IWN    IWO    IWP   IWS
## 2019-11-08 133.12 166.52 125.43 202.79 145.34 91.98
## 2019-11-15 133.61 169.10 124.46 203.82 148.39 92.31
## 2019-11-22 133.26 168.76 123.18 203.94 148.79 91.80
## 2019-11-29 134.08 171.43 125.12 209.97 151.05 92.56
## 2019-12-06 134.67 171.05 126.22 211.14 150.30 92.88
## 2019-12-11 134.37 170.87 126.28 210.36 149.52 92.83

Returns

Once we have the prices, we can use the lag.xts function to calculate the (weekly) returns. Note the explicit call to lag.xts instead of just lag. A number of different packages have a lag function and they do not all work the same, so it is better to be safe here and explicitly call the one we want to use. We also create a log prices series just for good measure.

prices <- as.xts(prices)
returns <- (prices / lag.xts(prices, k = 1) - 1)
returns[is.na(returns)] <- 0
logprices <- log(prices)
tail(logprices)
##                 IWD      IWF      IWN      IWO      IWP      IWS
## 2019-11-08 4.891251 5.115115 4.831748 5.312171 4.979076 4.521571
## 2019-11-15 4.894925 5.130490 4.823984 5.317237 4.999844 4.525152
## 2019-11-22 4.892302 5.128478 4.813647 5.317826 5.002536 4.519612
## 2019-11-29 4.898437 5.144175 4.829273 5.346965 5.017611 4.527857
## 2019-12-06 4.902827 5.141956 4.838026 5.352521 5.012633 4.531308
## 2019-12-11 4.900597 5.140903 4.838502 5.348820 5.007430 4.530770
tail(returns)
##                     IWD          IWF           IWN           IWO           IWP
## 2019-11-08  0.012088467  0.006223935  0.0083608088  0.0044081180 -0.0002752029
## 2019-11-15  0.003680935  0.015493646 -0.0077334051  0.0050792151  0.0209852971
## 2019-11-22 -0.002619609 -0.002010710 -0.0102844208  0.0005887302  0.0026955590
## 2019-11-29  0.006153437  0.015821273  0.0157493343  0.0295675147  0.0151892607
## 2019-12-06  0.004400328 -0.002216590  0.0087915439  0.0055722151 -0.0049652432
## 2019-12-11 -0.002227690 -0.001052371  0.0004753446 -0.0036942219 -0.0051896140
##                      IWS
## 2019-11-08  0.0068966066
## 2019-11-15  0.0035876820
## 2019-11-22 -0.0055248078
## 2019-11-29  0.0082788124
## 2019-12-06  0.0034572062
## 2019-12-11 -0.0005382752

Expected returns and Standard Deviation

The Greek letter \(\mu\) (mu) is usually used to represent the expected return. Often we don’t have an actual forecast of the return, so we use the naive forecast, which is the historical mean return. The mean return can be easily calculated in R using the apply function on the second dimension (the columns) using the mean function. If one or more observations are NA then the mean function will return NA unless you tell it to remove NAs by setting na.rm = TRUE. Likewise, you can use apply with the standard deviation (sd) function, which is usually represented with the Greek letter \(\sigma\).

The standard deviation of returns is by far the most widely used measure of risk, but there are other options. One option is the downside deviation. Recall the formula for standard deviation is \(\sqrt\sum(x-\overline{x})^{2}\). The downside deviation calculation focuses only on the cases where \(x \lt \overline{x}\). The rational given for this is that it is the negative deviations in returns that are problematic. No on complains about large positive deviations, unless of course, you are short the stock, in which case you could use the upside deviation calculation. For the vast majority of stocks all the deviation calculations are very similar and as stated before, it is probably best to stick to the the regular standard deviation calculation.

mu <- apply(returns, MARGIN=2, FUN=mean, na.rm=TRUE)
sigma <- apply(returns, MARGIN=2, FUN=sd)
sigma2 <- sqrt(apply(returns, MARGIN=2, FUN=var))
library(magrittr)
## 
## Attaching package: 'magrittr'
## The following objects are masked from 'package:testthat':
## 
##     equals, is_less_than, not
sigma3 <- apply(returns, MARGIN=2, FUN=var) %>% sqrt()
rbind(sigma, sigma2, sigma3)
##               IWD       IWF        IWN        IWO        IWP        IWS
## sigma  0.02393006 0.0226108 0.02910158 0.03027824 0.02615142 0.02529454
## sigma2 0.02393006 0.0226108 0.02910158 0.03027824 0.02615142 0.02529454
## sigma3 0.02393006 0.0226108 0.02910158 0.03027824 0.02615142 0.02529454

We show three different ways to calculate the same standard deviation above. The third one is essentially the same as the second, we just use the magrittr package in R which gives us the %>% operator that makes the code easier to read. We will show you a fourth way to calculate standard deviation in the next section.

```

Variance Matrix

There are a couple of different ways to calculate the variance matrix (V). The simplest method is the esimate the sample covariance matrix.

V <- cov(returns, use="complete")
V
##              IWD          IWF          IWN          IWO          IWP
## IWD 0.0005726478 0.0004998992 0.0006301837 0.0006320861 0.0005718628
## IWF 0.0004998992 0.0005112484 0.0005751191 0.0006267318 0.0005690783
## IWN 0.0006301837 0.0005751191 0.0008469020 0.0008359588 0.0006937144
## IWO 0.0006320861 0.0006267318 0.0008359588 0.0009167715 0.0007513819
## IWP 0.0005718628 0.0005690783 0.0006937144 0.0007513819 0.0006838969
## IWS 0.0005869192 0.0005278142 0.0006934758 0.0006948824 0.0006270713
##              IWS
## IWD 0.0005869192
## IWF 0.0005278142
## IWN 0.0006934758
## IWO 0.0006948824
## IWP 0.0006270713
## IWS 0.0006398135
sigma4 <- sqrt(diag(V))
sigma4
##        IWD        IWF        IWN        IWO        IWP        IWS 
## 0.02393006 0.02261080 0.02910158 0.03027824 0.02615142 0.02529454

The cov function calculations the co-variance of the returns between each pair of stocks using the formula \(\sqrt\sum((x-\overline{x})(y-\overline{y})}\). On the diagonal of the matrix \(x\) and \(y\) are the same and the resulting calculation is the variance, so that is why \(\sqrt(diag(V))\) produces the standard deviation, which you can verify matches the prior methods of calculating standard deviation.

Robust methods of handling outliers

To simulate an outlier, we will create a second returns series with only one observation different, but the difference being very large (an outlier).

returns2 <- returns
returns2[60,2]
##                      IWF
## 2004-02-20 -0.0002072267
returns2[60,2] <- 5

Graphically, we can see how changing just one observation changes the correlation relationship of that ETF with the other ETFs.

par(mfrow=c(1,2))
plot(as.dendrogram( hclust(
    as.dist(sqrt(2-cor(returns, use="complete")))
  ), h = .1 ), horiz = TRUE, xlim = c(1.3, 1)
)
plot(as.dendrogram( hclust(
    as.dist(sqrt(2-cor(returns2, use="complete")))
  ), h = .1 ), horiz = TRUE, xlim = c(1.3, 1)
)

V <- cov(returns, use="pairwise")
V2 <- cov(returns2, use="pairwise")
print(V[1:3, 1:3])
##              IWD          IWF          IWN
## IWD 0.0005726478 0.0004998992 0.0006301837
## IWF 0.0004998992 0.0005112484 0.0005751191
## IWN 0.0006301837 0.0005751191 0.0008469020
print(V2[1:3, 1:3])
##              IWD          IWF          IWN
## IWD 0.0005726478 0.0004740635 0.0006301837
## IWF 0.0004740635 0.0287348710 0.0005229648
## IWN 0.0006301837 0.0005229648 0.0008469020

R package corpcor has a function called cov.shrink that shrinks all of the covariances closer to the mean, based on a lambda.var parameter. Using lambda.var of zero returns the (unshrunk) sample covariance matrix and we verify that below. If lambda.var is not specified it is automatically determined. You can see that the shrunk covariance matrix does a good job of minimizing the effect of the outlier.

library(corpcor)
V2.sample <- cov.shrink(returns2, lambda.var = 0)
## Specified shrinkage intensity lambda.var (variance vector): 0 
## 
## Estimating optimal shrinkage intensity lambda (correlation matrix): 0.0099
V2.shrink <- cov.shrink(returns2)
## Estimating optimal shrinkage intensity lambda.var (variance vector): 1 
## 
## Estimating optimal shrinkage intensity lambda (correlation matrix): 0.0099
print(V2[1:3, 1:3])
##              IWD          IWF          IWN
## IWD 0.0005726478 0.0004740635 0.0006301837
## IWF 0.0004740635 0.0287348710 0.0005229648
## IWN 0.0006301837 0.0005229648 0.0008469020
print(V2.sample[1:3, 1:3])
##              IWD         IWF          IWN
## IWD 0.0005726478 0.000469388 0.0006239684
## IWF 0.0004693880 0.028734871 0.0005178070
## IWN 0.0006239684 0.000517807 0.0008469020
print(V2.shrink[1:3, 1:3])
##               IWD           IWF           IWN
## IWD 0.00076539942 0.00008856694 0.00068578875
## IWF 0.00008856694 0.00076539942 0.00008034055
## IWN 0.00068578875 0.00008034055 0.00076539942
print(V[1:3, 1:3])
##              IWD          IWF          IWN
## IWD 0.0005726478 0.0004998992 0.0006301837
## IWF 0.0004998992 0.0005112484 0.0005751191
## IWN 0.0006301837 0.0005751191 0.0008469020
# create exponential weights for a weighted covariance matrix
wgts <- .98^(nrow(returns):1)
V.weighted <- cov.shrink(returns, w=wgts)
## Estimating optimal shrinkage intensity lambda.var (variance vector): 0.7072 
## 
## Estimating optimal shrinkage intensity lambda (correlation matrix): 0.0442
returns2 <- returns
returns2[60,2] <- -100

V2 <- cov(returns2, use="pairwise")
dim(V2)
## [1] 6 6
dim(V2.shrink)
## [1] 6 6
V2.shrink <- cov.shrink(returns2, lambda = 0.99)
## Estimating optimal shrinkage intensity lambda.var (variance vector): 0.9978 
## 
## Specified shrinkage intensity lambda (correlation matrix): 0.99
V2
##              IWD           IWF          IWN          IWO          IWP
## IWD 0.0005726478  0.0010165899 0.0006301837 0.0006320861 0.0005718628
## IWF 0.0010165899 11.3004455531 0.0016181594 0.0018435141 0.0008896807
## IWN 0.0006301837  0.0016181594 0.0008469020 0.0008359588 0.0006937144
## IWO 0.0006320861  0.0018435141 0.0008359588 0.0009167715 0.0007513819
## IWP 0.0005718628  0.0008896807 0.0006937144 0.0007513819 0.0006838969
## IWS 0.0005869192  0.0011926139 0.0006934758 0.0006948824 0.0006270713
##              IWS
## IWD 0.0005869192
## IWF 0.0011926139
## IWN 0.0006934758
## IWO 0.0006948824
## IWP 0.0006270713
## IWS 0.0006398135
V2[1:3, 1:3]
##              IWD          IWF          IWN
## IWD 0.0005726478  0.001016590 0.0006301837
## IWF 0.0010165899 11.300445553 0.0016181594
## IWN 0.0006301837  0.001618159 0.0008469020
V2.shrink[1:3, 1:3]
##                 IWD             IWF             IWN
## IWD 0.0007649722088 0.0000005615255 0.0000069250825
## IWF 0.0000005615255 0.0258098140045 0.0000007352663
## IWN 0.0000069250825 0.0000007352663 0.0007655800610
library(ellipse)
## 
## Attaching package: 'ellipse'
## The following object is masked from 'package:graphics':
## 
##     pairs
plotcorr(cov2cor(V))

Risk and Expected returns

\(\mu\) is supposed to be the expected return. Often we don’t have any actual forcast of the return, so we use the naive forecast, which is the mean return. The mean return can be easily calculated in R using the apply function on the second dimension (the columns) using the mean function. If one or more observations are NA then the mean function will return NA unless you tell it to remove NAs by setting na.rm = TRUE.

mu <- apply(returns, MARGIN=2, FUN=mean, na.rm=TRUE)

and \(\sigma\) (sigma) is the standard deviation (the square root of the variance).

sigma2 <- apply(returns, 2, sd)
sigma <- sqrt(diag(V))
plot_assets <- function(V, mu) {
  rd <- data.frame(mu=mu, sigma=sqrt(diag(V)))
  rd$assets <- row.names(rd)
  
  ggplot(rd, aes(x = sigma, y = mu, color = assets)) +
    geom_point(size = 2) +
    ylab("Expected Return") +
    xlab("Standard Deviation (Risk)") +
    xlim(0, max(sigma)) + 
    ylim(0, 1.2*max(mu)) +
    ggtitle("Expected Returns versus Risk")
    
}
  
plot_assets(V, mu)

Optimization

There are many optimization packages available for R. To start off, we will use the solve.QP package, then move on to show some other optimizers.

Equal-weighted

The portfolio optimization problem is the problem of finding the optimal weights of a portfolio. For example, how much you should invest in each asset to maximize some desirable quantity, say risk-adjusted return.

If you do not have any information about the assets (or, more realistically, any reliable information) you can assign the same weight to each asset: this maximizes the entropy (a measure of diversity) of the weights.

plot_portfolio <- function(w, label, V, mu) {

  require(quadprog)
  n <- length(mu)
  A <- cbind( # Each column is a constraint
    matrix( rep(1,n), nr=n ),
    diag(n)
  )
  b <- c(1, rep(0,n))
  f <- function(u) {
    solve.QP(Dmat = u * V, dvec = mu, Amat = A, bvec = b, meq = 1)$solution
  }
  require(parallel) # Run the computations in parallel
  risk_aversions <- seq(.1, 100, length=ceiling(1e5/length(mu)))
  efw <- mclapply(risk_aversions, f)
  efw <- do.call(cbind, efw)
  x <- sqrt( colSums(efw * (V %*% efw)) )
  y <- t(efw) %*% mu
  
  # Efficient frontier points
  ef <- data.frame(x=x, y=y)
  
  rd <- data.frame(mu=mu, sigma=sqrt(diag(V)))
  rd$assets <- row.names(rd)
  
  port_risk <- sqrt( t(w) %*% V %*% w )
  port_return <- t(w) %*% mu
  
  gg <- ggplot(rd, aes(x = sigma, y = mu, color = assets)) +
    geom_point(size = 2) +
    ylab("Expected Return") +
    xlab("Standard Deviation (Risk)") +
    xlim(0, max(sigma)) + 
    ylim(0, 1.2*max(mu)) +
    ggtitle("Expected Returns versus Risk") +
    geom_segment(data = ef, aes(x = x, xend = dplyr::lead(x), y = y, yend = dplyr::lead(y)), color="grey") + 
    geom_point(aes(x = port_risk, y = port_return), color = "cornflowerblue", size = 6) +
    geom_text(aes(x = port_risk, y = port_return, label = label, color="Portfolio"), 
      position = position_dodge(width = 1), vjust = 1.5,) 
  print(gg)

}

plot_portfolio(w = rep(1/length(mu), length(mu)), label = "Equal weighted", V = V, mu = mu)
## Loading required package: quadprog
## Loading required package: parallel
## Warning: Removed 1 rows containing missing values (geom_segment).

Mean-Variance

The mean-variance portfolio maximizes the risk-adjusted expected return of the portfolio. Usually this takes the form of \(\mu - \lambda * \sigma^{2}\), where * \(\lambda\) is the risk aversion parameter as determined by the investor * \(\mu\) is the expected return (probably proxied by the historical return) * \(\sigma^{2}\) is the risk measure, but other choices are available

We will use the solve.QP function in library quadprog, which requires converting the problem to the following format:

Functin call: solve.QP(Dmat, dvec, Amat, bvec, meq=0) Find w to maximize dvec’ w - 1/2 w’ Dmat w such that Amat’ q >= bvec (the first meq constraints are equalities)

No constraints

n <- length(mu)
Amat <- matrix(nr = n, nc = 0)
bvec <- c()
r <- solve.QP(Dmat = V, dvec = mu, Amat, bvec, meq = 0)
round(r$solution, 4)
## [1] -9.9530  8.7031 -3.6035 -0.3473 -2.7931 12.5276

With absolutely no constraints the weights no not sum to one, rather summing to ’r sum(r$solution)`, the positions are highly levered, and there are many negative weights, implying shorting. We will now add a constraint to requie that the weights sum to one by using the Amat, bvec, and meq parameters as follows: \(Amat' q >= bvec\) We can create equality constraints, but they must be the first meq constraints and the remaining constraints will be inequality constraints. For long only we will only have one constraint, and that is that all the weights sum to exactly one.

Amat <- matrix(rep(1, n), nr = n)
bvec <- 1
print(Amat)
##      [,1]
## [1,]    1
## [2,]    1
## [3,]    1
## [4,]    1
## [5,]    1
## [6,]    1
print(bvec)
## [1] 1

As you can see, we created Amat as a column of all ones, bvec as a single 1, and set meq=1 so the function knows it is an equality constraint.

r <- solve.QP(Dmat=V, dvec=mu, Amat, bvec, meq=1)
sum(r$solution)
## [1] 1
round(r$solution, 2)
## [1] -10.07   2.17  -4.65   2.23  -0.74  12.06

Now the weights sum to one, but the leverage is still too high. Lets modify Amat and bvec as follows:

Amat <- cbind(matrix( rep(1,n), nr=n ), diag(n))
bvec <- c(1, rep(0,n))
print(Amat)
##      [,1] [,2] [,3] [,4] [,5] [,6] [,7]
## [1,]    1    1    0    0    0    0    0
## [2,]    1    0    1    0    0    0    0
## [3,]    1    0    0    1    0    0    0
## [4,]    1    0    0    0    1    0    0
## [5,]    1    0    0    0    0    1    0
## [6,]    1    0    0    0    0    0    1
print(bvec)
## [1] 1 0 0 0 0 0 0

This make this concept concrete, here is how the matrix would look in equation form:

eq <- c("=", rep(">=", n))
print(rbind(Amat, eq, bvec))
##      [,1] [,2] [,3] [,4] [,5] [,6] [,7]
##      "1"  "1"  "0"  "0"  "0"  "0"  "0" 
##      "1"  "0"  "1"  "0"  "0"  "0"  "0" 
##      "1"  "0"  "0"  "1"  "0"  "0"  "0" 
##      "1"  "0"  "0"  "0"  "1"  "0"  "0" 
##      "1"  "0"  "0"  "0"  "0"  "1"  "0" 
##      "1"  "0"  "0"  "0"  "0"  "0"  "1" 
## eq   "="  ">=" ">=" ">=" ">=" ">=" ">="
## bvec "1"  "0"  "0"  "0"  "0"  "0"  "0"

We can move along the efficient frontier (the grey line) by adjusting the risk-aversion parameter \(\lambda\). \(\lambda\)=1 gives a balance between expected return and risk, whereas \(\lambda\) < 1 will shift the optimal portfolio to the right and \(\lambda\) > 1 will shift the optimal portfolio to the left.

\(\lambda\) = 1: Risk-averse investor

r <- solve.QP(Dmat=V, dvec=mu, Amat, bvec, meq=1)
sum(r$solution)
## [1] 1
round(r$solution, 2)
## [1] 0 0 0 0 1 0
plot_portfolio(w=r$solution, label="lambda = 1", V, mu)
## Warning: Removed 1 rows containing missing values (geom_segment).

\(\lambda\) < 1 downplays the risk component and gives more importance to expected return, thus shifting the optimal portfolio to the right along the efficient frontier.

\(\lambda\) < 1: More risk-tolerant investor

r5 <- solve.QP(Dmat = .5 * V, dvec = mu, Amat, bvec, meq = 1)
r1 <- solve.QP(Dmat = .1 * V, dvec = mu, Amat, bvec, meq = 1)
plot_portfolio(w=r5$solution, label="lambda = .5", V, mu)
## Warning: Removed 1 rows containing missing values (geom_segment).
plot_portfolio(w=r1$solution, label="lambda = .1", V, mu)
## Warning: Removed 1 rows containing missing values (geom_segment).

\(\lambda\) > 1 increases the importance of the risk component and while keeping the expected returns the same, thus shifting the optimal portfolio to the left along the efficient frontier.

\(\lambda\) > 1: More risk-averse investor

r3 <- solve.QP(Dmat = 3 * V, dvec = mu, Amat, bvec, meq = 1)
r10 <- solve.QP(Dmat = 10 * V, dvec = mu, Amat, bvec, meq = 1)
plot_portfolio(w=r3$solution, label = "lambda = 3", V, mu)
## Warning: Removed 1 rows containing missing values (geom_segment).
plot_portfolio(w=r10$solution, label = "lambda = 10", V, mu)
## Warning: Removed 1 rows containing missing values (geom_segment).

Minimum Volatilty (Minv-Vol)

The minimum volatility portfolio is achieved by setting \(\mu\) to zero and solving the same optimization problem, so the optimizer finds the portfolio with the lowest risk regardless of the expected return. The minimum volatility portfolio is on the left and the mean variance portfolio is on the right for comparison purposes.

r0 <- solve.QP(Dmat = V, dvec = 0 * mu, Amat, bvec, meq = 1)
r <- solve.QP(Dmat = V, dvec = mu, Amat, bvec, meq = 1)
plot_portfolio(w=r0$solution, label = "Minv-Vol", V, mu)
## Warning: Removed 1 rows containing missing values (geom_segment).
plot_portfolio(w=r$solution, label = "Mean-Variance vol", V, mu)
## Warning: Removed 1 rows containing missing values (geom_segment).

We hope this helps you understand the basics of efficient frontiers and portfolio optimization, and gives you a starting point to optimizing portfolios in R.