#include <histogram.h>
Public Methods
    histogram ()
    ~histogram ()
    void calculate (const double *raw_data, const size_t N, bin_vec &binz)
        Calculate a histogram.

Private Methods
    histogram (const histogram &rhs)
    void init_bins (bin_vec &bins, const double min, const double max)
        Initialize the histogram bin array.
The histogram class itself has a default constructor. The number of bins to be used in the histogram is set by constructing a histogram::bin_vec of that size and passing it to calculate().
Each bin is a histo_bin object which contains the start and end values of the bin range and the frequency in the bin. The frequency is the number of values v in the data set for which start <= v < end. The end value of the last bin is extended slightly past the maximum so that the maximum value is also counted.
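The half-open bin range can be illustrated with a minimal sketch. The bin_sketch struct and the in_bin and count_in_bin helpers below are hypothetical stand-ins written for this illustration; they are not part of histogram.h, and the real histo_bin layout is not shown on this page.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for histo_bin (assumed layout, for illustration only).
struct bin_sketch {
    double start;           // inclusive lower bound
    double end;             // exclusive upper bound
    std::size_t frequency;  // number of values counted so far
};

// True when v lies in the half-open range [start, end).
inline bool in_bin(const bin_sketch &b, double v) {
    assert(b.start <= b.end);
    return v >= b.start && v < b.end;
}

// Count the values of data[0..n) that fall inside the bin b.
inline std::size_t count_in_bin(bin_sketch b, const double *data, std::size_t n) {
    for (std::size_t i = 0; i < n; i++) {
        if (in_bin(b, data[i])) {
            b.frequency++;
        }
    }
    return b.frequency;
}
```

With a bin covering 1.0 to 2.0, the value 1.0 is counted and the value 2.0 is not; 2.0 would belong to the next bin.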
Example:
Given an array of 20 double values, calculate the histogram. The data does not need to be sorted: calculate() sorts a private copy and leaves the caller's array unchanged. The histogram::bin_vec constructor allocates a vector of histo_bin objects. The data, the number of elements in the data set, and the vector of histogram bins are passed to histogram::calculate(), which initializes each bin with a start, an end, and a frequency.
    const size_t num_bins = 20;
    histogram::bin_vec binz( num_bins );
    histogram histo;
    // data points to N double values
    histo.calculate( data, N, binz );
    for (size_t i = 0; i < num_bins; i++) {
      size_t freq = static_cast<size_t>( binz[i] );
      printf("%7.4f %2zu\n", binz.start(i), freq );
    }
Note that binz[i] returns the frequency as a double, which is cast to a size_t value.
Definition at line 45 of file histogram.h.
Definition at line 202 of file histogram.h.

    {}

Definition at line 203 of file histogram.h.

    {}
Calculate a histogram.
Definition at line 52 of file histogram.cpp. Referenced by pdf::pdf_stddev().
    void histogram::calculate( const double *raw_data, const size_t N, bin_vec &bins )
    {
      // Sort a private copy so the caller's data is left unchanged
      double *sort_data = new double[ N ];

      for (size_t i = 0; i < N; i++) {
        sort_data[i] = raw_data[i];
      }

      dbl_sort s;

      s.sort( sort_data, N );
      double min = sort_data[0];
      double max = sort_data[N-1];

      size_t num_bins = bins.length();

      init_bins( bins, min, max );

      value_pool pool( sort_data, N );

      size_t bin_ix = 0;

      double val;
      bool more_values = pool.get_val( val );
      double end = bins.end(bin_ix);

      // Walk the sorted values and the bins together in a single pass
      while (bin_ix < num_bins && more_values) {
        if (val < end) {
          bins[bin_ix] = bins[bin_ix] + 1; // increment the frequency
          more_values = pool.get_val( val );
        }
        else {
          bin_ix++;
          end = bins.end(bin_ix);
        }
      } // while

      delete [] sort_data;
    } // calculate
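The single pass over sorted data can be sketched without the library's helper classes. The version below is an illustration under stated assumptions, not the library's implementation: sorted_histogram is a hypothetical name, std::sort and std::vector stand in for dbl_sort, value_pool and bin_vec, and the last bin simply absorbs every remaining value instead of having its end widened. It also guards two cases the listing above does not appear to handle: an empty data set, and a data set in which all values are equal, where the bin width is zero.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical sketch: returns one count per bin for num_bins equal-width
// bins spanning the minimum and maximum of the data.
std::vector<std::size_t> sorted_histogram(std::vector<double> data,
                                          std::size_t num_bins) {
    assert(num_bins > 0);
    std::vector<std::size_t> freq(num_bins, 0);
    if (data.empty()) {
        return freq;  // nothing to count
    }

    // data was passed by value, so the caller's copy is not reordered.
    std::sort(data.begin(), data.end());
    const double min = data.front();
    const double max = data.back();
    const double bin_size = (max - min) / static_cast<double>(num_bins);

    std::size_t bin_ix = 0;
    double end = min + bin_size;
    for (double val : data) {
        // Advance to the bin that contains val. The index never moves past
        // the last bin, which therefore also receives the maximum value.
        while (bin_ix + 1 < num_bins && val >= end) {
            bin_ix++;
            end += bin_size;
        }
        freq[bin_ix]++;
    }
    return freq;
}
```

For example, sorted_histogram({4.0, 1.0, 3.0, 2.0}, 2) places 1.0 and 2.0 in the first bin and 3.0 and 4.0 in the second. Because the values are visited in ascending order, the bin index only ever moves forward, so the counting step is linear in the number of values plus the number of bins.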
Initialize the histogram bin array.
Definition at line 11 of file histogram.cpp. Referenced by calculate().
    void histogram::init_bins( bin_vec &bins, const double min, const double max )
    {
      size_t num_bins = bins.length();
      double range = max - min;
      double bin_size = range / static_cast<double>( num_bins );

      double start = min;
      double end = min + bin_size;

      for (size_t i = 0; i < num_bins; i++) {
        bins[i] = 0;            // initialize frequency to zero
        bins.start(i, start );  // initialize the i'th bin start value
        bins.end(i, end );      // initialize the i'th bin end value
        start = end;
        end = end + bin_size;
      }
      // The frequency in a bin is incremented if a value v is
      // in the range start <= v < end. This is fine until
      // we reach the last bin, which should also get values
      // which are in the range start <= v <= end. So add a
      // small amount to the end value to assure that the
      // end value of the last bin is beyond the value range.
      bins.end(num_bins-1, bins.end(num_bins-1) + (bin_size / 10.0) );
    } // init_bins
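The bin layout produced by this step can be reproduced in a standalone sketch. The make_bin_edges function below is a hypothetical name used only for illustration, and a std::vector of (start, end) pairs stands in for bin_vec; the arithmetic follows the listing above, including the widening of the last bin by one tenth of the bin width.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical sketch: returns (start, end) pairs for num_bins equal-width
// bins over [min, max], with the last end pushed slightly past max.
std::vector<std::pair<double, double>> make_bin_edges(double min, double max,
                                                      std::size_t num_bins) {
    assert(num_bins > 0 && min <= max);
    const double bin_size = (max - min) / static_cast<double>(num_bins);

    std::vector<std::pair<double, double>> edges;
    edges.reserve(num_bins);

    double start = min;
    double end = min + bin_size;
    for (std::size_t i = 0; i < num_bins; i++) {
        edges.emplace_back(start, end);
        start = end;  // each bin starts where the previous one ended
        end += bin_size;
    }

    // Widen the last bin so that a value equal to max satisfies v < end.
    edges.back().second += bin_size / 10.0;
    return edges;
}
```

Note that when min equals max the bin width is zero, so the widening adds nothing and no bin satisfies start <= v < end for that value. A caller relying on this layout would need to treat that case separately.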