This web page discusses the construction of a portfolio of Exchange Traded Funds (ETFs). This is not an academic exercise. Real money is invested in the resulting portfolio (although you should not put your money into this portfolio).
This portfolio is benchmarked against the S&P 500 (symbol ^GSPC). The portfolio construction is aimed at beating this benchmark on risk, return, or both.
I've been working on ETF portfolios for a while. My first attempt at building ETF portfolios was for an independent study project when I was in the Computational Finance and Risk Management Masters program at the University of Washington: A Portfolio of Exchange Traded Funds, September 2012.
I would like to think that my skills in quantitative portfolio construction have advanced since I wrote this original paper. This web page publishes my latest work on ETF portfolio construction.
Exchange Traded Funds (ETFs) are collections of assets, particularly stocks, that trade like stocks but are structured like mutual funds. ETFs have a variety of advantages over mutual funds:
Tax Rates
In most cases ETFs have tax advantages over mutual funds. Trading within a mutual fund may incur short-term tax liability. ETFs are structured in such a way that there is no tax liability for internal trading. Taxes are incurred only on the profits from trading the ETF shares. This means that an ETF could have over 100% turnover in its assets, yet still only be taxed at the long-term capital gains rate if the ETF shares are held longer than one year.
Diversification
There are a vast number of ETFs, with a variety of structures. Constructing a portfolio of ETFs allows the investor to choose high (or low) levels of market diversification.
Asset and Exposure Targeting
The vast array of ETFs allows portfolio components to be finely tuned for market exposure and liquidity. It also allows the investor to incorporate a variety of asset classes.
In practice the first step in portfolio construction is asset choice. In theory the portfolio optimizer can choose the assets but this introduces a number of challenges:
Data collection
The inception date for ETFs varies widely. Many ETFs have been created in the last few years and have a shorter history. Out of 436 ETFs that were selected from the etf.com ETF filter, 217 had an inception date on or before April 4, 2007, the start of the back-test period.
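The inception-date filter described above can be sketched in a few lines. This is an illustration only, written in Python rather than the R used for the paper. The symbols `AAA`, `BBB` and `CCC` and their dates are made up, and the function name `filter_by_inception` is hypothetical. Only the April 4, 2007 cutoff comes from the text.

```python
from datetime import date

# Start of the back-test period, as stated in the text.
BACKTEST_START = date(2007, 4, 4)


def filter_by_inception(inception_dates, start=BACKTEST_START):
    """Return the symbols whose inception date is on or before `start`.

    `inception_dates` maps a ticker symbol to its inception date.
    """
    return sorted(sym for sym, d in inception_dates.items() if d <= start)


# Made-up universe: placeholder symbols and dates, not real ETF data.
universe = {
    "AAA": date(2001, 5, 15),   # long history: kept
    "BBB": date(2007, 4, 4),    # inception exactly on the cutoff: kept
    "CCC": date(2011, 9, 30),   # too young for the back-test: dropped
}

eligible = filter_by_inception(universe)
print(eligible)  # ['AAA', 'BBB']
```

The comparison is inclusive, matching the "on or before" wording above.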
Covariance Matrix Estimation
The covariance matrix, which reflects portfolio risk, is a core component of portfolio optimization.
When the number of potential assets in a portfolio exceeds the number of samples (e.g., time periods), the error in the estimation of the covariance matrix can be so high that it becomes effectively useless (see Bai and Shi).
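One way to see the problem is that a sample covariance matrix built from T periods of returns has rank at most T - 1. With more assets than periods the matrix is singular and cannot be inverted by an optimizer. The following is a minimal pure-Python sketch of this effect, not code from the paper. The return values are made up, and `sample_covariance` and `matrix_rank` are hypothetical helper names.

```python
def sample_covariance(returns):
    """Sample covariance of a T x N list of rows (T periods, N assets)."""
    t = len(returns)
    n = len(returns[0])
    means = [sum(row[j] for row in returns) / t for j in range(n)]
    return [
        [
            sum((row[i] - means[i]) * (row[j] - means[j]) for row in returns)
            / (t - 1)
            for j in range(n)
        ]
        for i in range(n)
    ]


def matrix_rank(matrix, tol=1e-12):
    """Numerical rank via Gaussian elimination with partial pivoting."""
    m = [row[:] for row in matrix]  # work on a copy
    rows, cols = len(m), len(m[0])
    rank = 0
    for col in range(cols):
        if rank == rows:
            break
        pivot = max(range(rank, rows), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < tol:
            continue  # no usable pivot in this column
        m[rank], m[pivot] = m[pivot], m[rank]
        for r in range(rank + 1, rows):
            factor = m[r][col] / m[rank][col]
            for c in range(col, cols):
                m[r][c] -= factor * m[rank][c]
        rank += 1
    return rank


# Made-up returns: 3 periods (rows) for 4 assets (columns).
returns = [
    [0.010, -0.020, 0.015, 0.003],
    [-0.005, 0.012, -0.007, 0.021],
    [0.020, 0.004, -0.011, -0.009],
]

cov = sample_covariance(returns)
rank = matrix_rank(cov)
print(len(cov), rank)  # a 4 x 4 matrix whose rank is less than 4
```

The tolerance `tol` is an assumption chosen for returns of this magnitude. A production implementation would normally rely on a numerical library instead.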
Two approaches were taken to address the issue of covariance matrix error. Assets were filtered to select a more favorable group of assets, and shrinkage was used to reduce the error in the covariance matrix (see Ledoit and Wolf).
The challenge in choosing ETFs is the vast number of choices, as the number of available ETFs has exploded in the last few years.
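The idea behind shrinkage can be illustrated by blending the sample covariance matrix with a simple, well-behaved target. The sketch below is not the estimator used in the paper. It assumes a scaled-identity target (the average variance on the diagonal) and a hand-picked shrinkage intensity `delta`, whereas the Ledoit and Wolf approach estimates the intensity from the data. The function names are hypothetical and the input matrix is made up.

```python
def shrink_covariance(sample_cov, delta):
    """Blend a sample covariance matrix with a scaled-identity target.

    delta = 0 returns the sample covariance unchanged; delta = 1 returns
    the target (average variance on the diagonal, zeros elsewhere).
    """
    if not 0.0 <= delta <= 1.0:
        raise ValueError("delta must be between 0 and 1")
    n = len(sample_cov)
    avg_var = sum(sample_cov[i][i] for i in range(n)) / n
    return [
        [
            delta * (avg_var if i == j else 0.0)
            + (1.0 - delta) * sample_cov[i][j]
            for j in range(n)
        ]
        for i in range(n)
    ]


def det2(m):
    """Determinant of a 2 x 2 matrix."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]


# Made-up singular covariance: two perfectly correlated assets.
singular_cov = [
    [0.04, 0.04],
    [0.04, 0.04],
]

# delta = 0.5 is an arbitrary illustrative choice, not an estimated value.
shrunk = shrink_covariance(singular_cov, delta=0.5)

print(det2(singular_cov))  # 0.0: the sample matrix cannot be inverted
print(det2(shrunk) > 0.0)  # True: the shrunk matrix is invertible
```

Shrinking pulls the off-diagonal entries toward zero while leaving the average variance unchanged. That is why the blended matrix is better conditioned than the raw estimate.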
As a teaser to encourage you to look at the PDF below, I've included a plot from the paper. This plot shows one of the portfolios (blue line) vs. the S&P 500 (symbol ^GSPC) (red line). This portfolio has a considerably better return than the S&P 500 benchmark.
This plot is discussed in a working paper titled Constructing an ETF Portfolio December 2, 2014 (PDF).
This paper is completely reproducible research. It was written using R and Knitr. The paper "source" is a combination of LaTeX and R code. To create the paper, the source is run in RStudio, which executes all of the R code and generates the tables and plots. As a result, all of the code that generated the plots and tables can be directly examined.
This working paper is an informal discussion of the ETF portfolio construction. This portfolio is designed for real investment and is not an academic exercise.
The Knitr source code (containing the LaTeX and R) that generated this PDF can be found here.
The Knitr code uses three supporting files that contain the ETF universe information:
The ETF universe used in the back-test is filtered from the overall ETF universe to remove ETFs that do not have sufficient history. A small R script is used for this:
All market data is downloaded from finance.yahoo.com.
The ETF Handbook: How to Value and Trade Exchange Traded Funds, John Wiley and Sons, 2010
Estimating High Dimensional Covariance Matrices and its Applications by Jushan Bai and Shuzhong Shi, Columbia University Department of Economics Discussion Paper No. 1112-03, August 2011.
This web page is a discussion of investment approaches. The material here does not constitute advice. Make your own decisions and take credit for your own profits and losses.
Ian Kaplan
December 4, 2014