Value Factors Do Not Forecast Returns for S&P 500 Stocks

This web page is associated with a paper titled Value Factors Do Not Forecast Returns for S&P 500 Stocks. The work described in this paper investigates how effective corporate value factors can be in selecting S&P 500 stocks for an investment portfolio. As the title of this web page suggests, the answer is "not very effective".

The PDF for the paper can be found here.

This paper grew out of the thesis presentation for my Master's degree in Computational Finance and Risk Management at the University of Washington. I gave my thesis presentation on November 22, 2013 and was awarded the degree in December 2013. I continued to refine the work that I presented, which resulted in this paper.

This paper is reproducible research. The paper is written using Knitr, which combines R and the typesetting language LaTeX. Using R and RStudio, the PDF for the paper can be regenerated from the document source and data. The code to generate every table and diagram in the document is included in the Knitr source code.

The data used in this paper consists of approximately fifteen years of corporate quarterly report data. Through my Master's program I had access to the Wharton Research Data Service (WRDS). The CRSP/Compustat data sets, which I used in this work, are available from WRDS. Unfortunately, redistribution of this data is prohibited, so I cannot include the data here.

Working with WRDS and the CRSP/Compustat data, and cleaning the data up so that it can be used in historical backtests, is a very time-consuming process. I have tried to document my work with this data so that the data set can be reproduced by anyone with access to WRDS. See The Wharton Research Data Service (WRDS) data set and Factor Model Factors.

Open Source Corporate Value Factor Data
The Quandl site publishes corporate factor (fundamental) data for approximately 15,000 stocks. The fundamental data can be accessed via a Web API. Quandl does not have as much history as WRDS, but for the time period it covers, it is an attractive alternative to the CRSP/Compustat data, which is costly and has missing values.

Knitr and R Source

Diagrams for the paper (OpenOffice, with JPEGs generated via screen capture): /finance/thesis_project/diagrams
Root source directory: http://www.bearcave.com/finance/thesis_project/r_code
1. factor_analysis.Rnw
  This is the Knitr source code for the paper. The document consists of executable R code and LaTeX-formatted text.
2. references.bib
  These are the BibTeX-formatted references that are used in the paper (factor_analysis.Rnw).
3. s_and_p.r
  This R code computes the quarterly S&P 500 constituents from the Compustat data. See Building the S&P 500 Constituents.
4. s_and_p_monthly.r
  The S&P 500 constituent data is available quarterly. This R code is similar to s_and_p.r, but it fills in the months between the quarterly boundaries. This code supports the paper's section on monthly linear models.
5. fix_compustat_data.r
  In most cases the Compustat data must be preprocessed to deal with missing values and other issues. This R code does this preprocessing so that the data can be used in the paper. See The Wharton Research Data Service (WRDS) data set and Factor Model Factors, which discusses the CRSP/Compustat data set and the preprocessing needed to calculate the value factors used in the paper.
6. factor_calc.r
  This R code calculates the value factors from the preprocessed CRSP/Compustat data.
7. monthly_factor_calc.r
  This R code is similar to factor_calc.r, but it fills in factor values using monthly close prices.
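
The idea behind s_and_p.r and s_and_p_monthly.r can be illustrated with a small sketch. This is not the author's R code: it is written in Python, the tickers and membership dates are made up, and it assumes that index membership is available as (ticker, start date, end date) intervals. It also assumes that "filling in the months" means carrying the most recent quarter-end constituent list forward, which may differ from what s_and_p_monthly.r actually does.

```python
import calendar
from datetime import date

# Hypothetical membership records: (ticker, first day in index, last day in
# index). None means the stock is still a member. Not real S&P 500 data.
MEMBERSHIP = [
    ("AAA", date(2000, 1, 1), None),
    ("BBB", date(2000, 1, 1), date(2000, 5, 15)),
    ("CCC", date(2000, 5, 16), None),
]


def month_end(year, month):
    """Return the last calendar day of the given month."""
    return date(year, month, calendar.monthrange(year, month)[1])


def quarter_ends(year):
    """Return the four calendar quarter-end dates of a year."""
    return [month_end(year, m) for m in (3, 6, 9, 12)]


def constituents_on(records, as_of):
    """Return the sorted tickers whose membership interval covers as_of."""
    return sorted(
        ticker
        for ticker, start, end in records
        if start <= as_of and (end is None or as_of <= end)
    )


def fill_months(quarterly):
    """Expand a {quarter end: tickers} map to month ends.

    Assumption: each quarter-end list is carried forward unchanged to the
    two following month ends, until the next quarter end replaces it.
    """
    monthly = {}
    for q_end, members in sorted(quarterly.items()):
        for offset in range(3):
            m = q_end.month + offset
            year = q_end.year + (m - 1) // 12
            month = (m - 1) % 12 + 1
            monthly[month_end(year, month)] = list(members)
    return monthly


quarterly = {q: constituents_on(MEMBERSHIP, q) for q in quarter_ends(2000)}
monthly = fill_months(quarterly)
```

Note that under the carry-forward assumption, a stock that leaves the index in the middle of a quarter (BBB in this made-up data) still appears in the monthly lists until the next quarter boundary.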
Miscellaneous support code
1. fix_interest_rate.r
  This R code cleans up the "risk free" interest rate data downloaded from CRSP/Compustat.
2. factor_distribution.r
  The WRDS quarterly Corporate Factor data has hundreds of values that can be selected for download. Many of these factors are either unpopulated or sparsely populated. By downloading all of the factors and running this code on the result, an analysis of the factor density can be performed. This was critical in understanding how to calculate the value factors used in the paper. The results are displayed on The Wharton Research Data Service (WRDS) data set and Factor Model Factors.
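
The kind of density analysis described for factor_distribution.r can be sketched as follows. This is a minimal Python illustration, not the author's R code. The column names and values are hypothetical, and the 50% threshold for calling a column sparse is an arbitrary choice made for the example.

```python
import math

# Hypothetical quarterly rows. None and NaN both stand for a missing value.
ROWS = [
    {"book_value": 10.0, "net_income": 1.5, "rarely_reported": None},
    {"book_value": 12.0, "net_income": None, "rarely_reported": None},
    {"book_value": 11.0, "net_income": 2.0, "rarely_reported": float("nan")},
    {"book_value": 9.5, "net_income": 1.0, "rarely_reported": 0.3},
]


def is_populated(value):
    """Return True when a value is neither None nor a floating-point NaN."""
    if value is None:
        return False
    return not (isinstance(value, float) and math.isnan(value))


def factor_density(rows):
    """Return {column: fraction of rows in which the column is populated}."""
    if not rows:
        return {}
    n = len(rows)
    return {
        col: sum(1 for row in rows if is_populated(row.get(col))) / n
        for col in rows[0]
    }


def sparse_columns(density, threshold=0.5):
    """Return the sorted columns whose density is below the threshold."""
    return sorted(col for col, frac in density.items() if frac < threshold)


density = factor_density(ROWS)
sparse = sparse_columns(density)
```

Columns that come out sparse are poor inputs for a factor calculation, since most stocks would have no value for them in a given quarter.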

Ian Kaplan
March 2014
