#include <yahooTS.h>
Public Types
| enum | dataKind { badEnum, Open, High, Low, Close, Volume, lastEnum } |
Public Methods
| | yahooTS () |
| | yahooTS (const char *p) |
| const double* | getTS (const char *fileName, double *a, size_t &N, dataKind kind) const |
| | Read a Yahoo equity time series from a file. |
| void | path (const char *p) |
| const char* | path () |
Private Methods
| const char* | getStr_ (char *&line, char *buf, size_t bufSize) const |
| | Copy from the input string until either the end of the string (i.e., the null terminator) is reached or a comma is found. |
| void | parseVals_ (char *line, double *vals, const size_t n) const |
| | Parse a comma separated line of values into a vector of doubles. |
| const double | getValue_ (char *line, const yahooTS::dataKind kind) const |
| | Return one value, selected by kind, from a comma separated data line of a Yahoo historical data file. |
Private Attributes
| const char* | path_ |
The data is downloaded in "spread sheet" format from the historical data page. There is probably some limitation on using this data (e.g., no commercial use and no resale) so use at your own risk.
The format of the file is ASCII. The first line lists the title for each of the fields in the file. The titles and the fields are comma separated.
This class is specific to the data format that was downloaded from Yahoo at the time. More general code could be written to easily account for changing formats. However, I just wanted to extract the data.
The Yahoo data is given to two decimal places, presumably reflecting decimalization. The equity time series are adjusted for splits and dividends, backward in time from the most recent time in the time series. This can cause problems over long periods of time, since at some point a stock that pays dividends will have paid out all of its worth in dividends and the adjusted value will become negative (for this reason, reinvesting the dividends is a better choice).
The format for the data is:
<title line> <time series line>+
(i.e., a title line followed by one or more time series lines).
The title line consists of six comma separated strings (e.g., "Date,Open,High,Low,Close,Volume"). The time series lines have the values suggested in the title. For my current purposes I am not interested in date values, so these are ignored. All values are returned as vectors of doubles, although volume is an unsigned integer value.
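The line format described above can be illustrated with a small stand-alone sketch. This is not the yahooTS code: `parseYahooLine` is a hypothetical helper that uses the C++ standard library instead of the class's C-style parsing, and the date string in the usage note is a made-up sample.

```cpp
#include <exception>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper (not part of yahooTS): split one time series line
// into its numeric fields, skipping the leading date field.
// Returns an empty vector unless exactly five numeric fields
// (Open, High, Low, Close, Volume) follow the date.
std::vector<double> parseYahooLine(const std::string &line)
{
    std::vector<double> vals;
    std::istringstream in(line);
    std::string field;

    // The first field is the date, which is ignored.
    if (!std::getline(in, field, ',')) {
        return vals;
    }

    // Remaining fields: Open, High, Low, Close, Volume.
    while (std::getline(in, field, ',')) {
        try {
            vals.push_back(std::stod(field));
        } catch (const std::exception &) {
            // std::stod throws when no conversion is possible,
            // which is what happens on the title line.
            return std::vector<double>();
        }
    }
    if (vals.size() != 5) {
        vals.clear();
    }
    return vals;
}
```

Calling `parseYahooLine("2-Jan-02,10.50,11.25,10.00,11.00,1200300")` yields five doubles with the volume stored as a double, while the title line yields an empty vector because "Open" is not numeric.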
Definition at line 79 of file yahooTS.h.
      

enum yahooTS::dataKind
Definition at line 86 of file yahooTS.h.
{ badEnum,
  Open,
  High,
  Low,
  Close,
  Volume,
  lastEnum } dataKind;

yahooTS::yahooTS ()
Definition at line 94 of file yahooTS.h.
  {
    path_ = 0;
  };

yahooTS::yahooTS (const char *p)
Definition at line 98 of file yahooTS.h.
: path_(p) {}
  
      
const char* yahooTS::getStr_ (char *&line, char *buf, size_t bufSize) const [private]
Copy from the input string until either the end of the string (i.e., the null terminator) is reached or a comma is found.
 
 Definition at line 56 of file yahooTS.cpp. Referenced by parseVals_(). 
{
  const char *rtnPtr = 0;
  if (line != 0) {
    // charCnt is declared outside the loop because it is used after the loop ends
    size_t charCnt = 0;
    for (; charCnt < bufSize-1 && *line != '\0'; charCnt++) {
      if (*line == ',') {
        line++;
        break;
      }
      else {
        buf[charCnt] = *line++;
      }
    }
    
    buf[charCnt] = '\0';
    if (charCnt > 0)
    {
      rtnPtr = buf;
    }
  }
  return rtnPtr;
} // getStr_

const double* yahooTS::getTS (const char *fileName, double *a, size_t &N, dataKind kind) const
 Read a Yahoo equity time series from a file. Yahoo allows historical equity data to be downloaded in "spread sheet" format. In this format there is a title line, listing the data columns (e.g., date, open, high, low, close and volume). Following the title line are comma separated values. In reading this Yahoo data file, the first line is skipped. The Yahoo data values are listed from most recent to oldest. In the data vector returned, a[0] will be the oldest and a[N-1] will be the most recent. 
 
 Definition at line 208 of file yahooTS.cpp. Referenced by main(). 
{
  const double *rtnPtr = 0;
  char fullPath[512];
  size_t freePath = sizeof( fullPath );
  FILE *fptr;

  // start with an empty string so that strncat is safe when path_ is null
  fullPath[0] = '\0';
  if (path_ != 0) {
    strncpy( fullPath, path_, freePath-1 );
    // strncpy does not null terminate when the source is truncated
    fullPath[ freePath-1 ] = '\0';
    freePath = freePath - strlen( fullPath );
  }
  strncat( fullPath, fileName, freePath-1 );
  fptr = fopen( fullPath, "r" );
  if (fptr != 0) {
    char line[512];
    size_t lineSize = sizeof( line );
    int ix = N-1;

    // the first line is the title line, which is skipped
    if (fgets( line, lineSize, fptr ) != 0) {
      rtnPtr = a;
      // the file is newest first, so fill a[] from the end
      while (fgets( line, lineSize, fptr ) != 0) {
        if (ix >= 0) {
          a[ix] = getValue_( line, kind );
          ix--;
        }
        else {
          break;
        }
      } // while
    }
    else {
      fprintf(stderr, "getTS: title line expected\n");
    }
    // N is set to the number of values actually read
    ix++;
    N = N - ix;
    fclose( fptr );
  }
  else {
    const char *error = strerror( errno );
    fprintf(stderr, "getTS: Error opening %s: %s\n", fullPath, error );
  }

  return rtnPtr;
} // getTS
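
The reversal described for getTS (file order newest first, array order oldest first) can be sketched in isolation. The function below is a hypothetical illustration, not the yahooTS code; it works on an in-memory vector rather than a file. Reading the listing above suggests that when fewer than N values are available, they are stored at the tail of the array while the returned pointer is still a; the sketch mirrors that placement so the behavior is easy to see.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical illustration (not the yahooTS code): copy values that
// arrive newest-first into a[] so that they end up oldest-first.
// n is the capacity of a[]; the return value is the number of values
// stored. When fewer than n values are available they occupy the tail
// of the array, a[n - count] .. a[n - 1].
std::size_t fillOldestFirst(const std::vector<double> &newestFirst,
                            double *a, std::size_t n)
{
    std::size_t count = 0;
    for (std::size_t i = 0; i < newestFirst.size() && count < n; ++i) {
        // The first value read is the most recent, so it goes last.
        a[n - 1 - count] = newestFirst[i];
        ++count;
    }
    return count;
}
```

With a four element array and three input values, the values land in a[1] through a[3] and a[0] is left untouched, so a caller that wants only valid data would start reading at a + (n - count).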

const double yahooTS::getValue_ (char *line, const yahooTS::dataKind kind) const [private]
A data line from a Yahoo historical data file consists of a set of comma separated values:

    date,open,high,low,close,volume
This function is passed a Yahoo data line and a kind value which indicates which value to return. The date is ignored, so the value of kind should be one of: Open, High, Low, Close, Volume.
Definition at line 151 of file yahooTS.cpp. Referenced by getTS().
{
  double retval = 0;

  if (kind > badEnum && kind < lastEnum) {
    const size_t NUM_VALS = 5;
    // zero initialized, since parseVals_ may return before filling every slot
    double vals[ NUM_VALS ] = { 0 };

    parseVals_( line, vals, NUM_VALS );

    // Open is 1 in the enumeration, so subtract one to get the array index
    size_t ix = (size_t)kind - 1;
    if (ix < NUM_VALS) {
      retval = vals[ix];
    }
  }

  return retval;
} // getValue
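
The mapping from a dataKind value to a position in the parsed values can be shown on its own. This is a minimal sketch under the assumption that the enumeration is exactly the one listed for yahooTS; `columnIndex` is a hypothetical helper and does not exist in the class.

```cpp
// Copy of the enumeration listed for yahooTS::dataKind.
enum dataKind { badEnum, Open, High, Low, Close, Volume, lastEnum };

// Hypothetical helper: map a dataKind to its zero-based position among
// the five values that follow the date (Open = 0 ... Volume = 4).
// Returns -1 for the two sentinel values badEnum and lastEnum.
int columnIndex(dataKind kind)
{
    if (kind > badEnum && kind < lastEnum) {
        return static_cast<int>(kind) - 1;
    }
    return -1;
}
```

The sentinels at both ends of the enumeration are what make the range check a pair of comparisons rather than a list of valid cases.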

void yahooTS::parseVals_ (char *line, double *vals, const size_t n) const [private]
Parse a comma separated line of values into a vector of doubles. The comma separated values are: Date,Open,High,Low,Close,Volume. The date value is skipped.
Definition at line 101 of file yahooTS.cpp. Referenced by getValue_().
{
  char buf[128];
  const char *ptr;

  // skip the date
  ptr = getStr_( line, buf, sizeof( buf ) );
  if (ptr == 0) {
    fprintf(stderr, "parseVals: date expected\n" );
    return;
  }

  // get the Open, High, Low, Close and Volume values
  size_t cnt = 0;
  for (dataKind kind = Open; 
       kind <= Volume && cnt < n; 
       kind = (dataKind)((size_t)kind + 1)) {

    ptr = getStr_( line, buf, sizeof( buf ) );
    if (ptr == 0) {
      fprintf(stderr, "parseVals: value expected\n");
      return;
    }

    double v;

    // sscanf returns the number of items converted; v is valid only if it is 1
    if (sscanf( buf, "%lf", &v ) != 1) {
      fprintf(stderr, "parseVals: numeric value expected\n");
      return;
    }
    vals[cnt] = v;
    cnt++;
  }

} // parseVals_

const char* yahooTS::path ()
Definition at line 106 of file yahooTS.h.
{ return path_; }

void yahooTS::path (const char *p)
Definition at line 105 of file yahooTS.h.
{ path_ = p; }
1.2.8.1 written by Dimitri van Heesch,
 © 1997-2001