In our last mission we used R to plot a trend-line for population growth in Houston, based on historical data from the past century. Depending on which of two methods we used, we arrived at an estimate for the city’s 2010 population of 2,144,531 (based on the 100-year growth trend for the city) or 2,225,125 (based on the steeper growth trend of the past fifty years). Looking now at the official Census count for 2010, it turns out that our guesses were close, but both too high: the actual reported figure for 2010 is 2,099,451.
It would have been surprising to have guessed perfectly based on nothing other than a linear trend, and the fact that we came as close as we did (roughly 2% and 6% too high, respectively) speaks well of this sort of “back of the envelope” projection technique (at least for the case of steady growth). But there was a lot of information contained in those data points that we essentially ignored: our two trendlines were really based on nothing more than a start and an end point.
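For readers who want to check the arithmetic, here is a minimal sketch of that two-point projection. The 1900, 1950 and 2000 census counts are an assumption on my part (they are not quoted in this post); they are the decennial figures as I understand them, and they do reproduce both estimates. The helper name `project_linear` is hypothetical, not something from the earlier mission.

```r
# Two-point linear projection: extend the straight line through
# (year0, pop0) and (year1, pop1) out to a target year.
# NOTE: project_linear is a hypothetical helper, for illustration only.
project_linear <- function(year0, pop0, year1, pop1, target) {
  slope <- (pop1 - pop0) / (year1 - year0)   # average people added per year
  pop1 + slope * (target - year1)
}

# ASSUMED census counts for Houston (not stated in this post).
pop_1900 <- 44633
pop_1950 <- 596163
pop_2000 <- 1953631

# Official 2010 count, as quoted in the text.
actual_2010 <- 2099451

est_100yr <- project_linear(1900, pop_1900, 2000, pop_2000, 2010)
est_50yr  <- project_linear(1950, pop_1950, 2000, pop_2000, 2010)

round(est_100yr)                 # 2144531
round(est_50yr)                  # 2225125

# How far each guess overshoots the official count.
round(est_100yr) - actual_2010   # 45080
round(est_50yr)  - actual_2010   # 125674
```

Only two numbers feed each estimate, which is exactly the limitation: every census count between the endpoints is ignored.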
A more sophisticated set of tools for making projections, which may be able to extract some extra meaning from the variation contained in the data, is provided in R by the excellent forecast package, developed by Rob Hyndman of Monash University in Australia. To access these added functions, you’ll need to install and load it:
> install.packages("forecast")
> library(forecast)
R: an object with
Although R is perfectly happy to help you analyze and plot time series data organized in vectors and dataframes, it actually has a specialized object class for this sort of thing, created with the ts() function. Remember: R is an “object-oriented” language. Every object (a variable, a dataframe, a function, a time series) is associated with a certain class, which helps the language figure out how to manage and interact with it. To find the class of an object, use the class() function:
> a=c(1,2)
> class(a)
[1] "numeric"
> a=TRUE
> class(a)
[1] "logical"
> class(plot)
[1] "function"
> a=ts(1)
> class(a)
[1] "ts"
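To see what the "ts" class adds over a plain vector, here is a small sketch. The values are made up purely for illustration (they are not population data), and the choice of one observation per year starting in 2001 is likewise an assumption; ts(), time() and window() are all part of base R.

```r
# Made-up values, for illustration only (NOT real data).
values <- c(10, 12, 15, 19, 24)

# Wrap the vector in a time series: one observation per year from 2001.
x <- ts(values, start = 2001, frequency = 1)

class(values)   # "numeric"
class(x)        # "ts"

# A ts object carries its own time index...
time(x)         # the years 2001 through 2005

# ...so it can be subset by date rather than by position.
x_recent <- window(x, start = 2003)
as.numeric(x_recent)   # 15 19 24
```

Because the dates travel with the data, functions that understand the "ts" class can label axes and align series without being told the years separately.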