The purpose of this chapter is threefold: (i) to review many basic notions from simple regression (the linear regression model, ordinary least squares [OLS], and the central limit theorem, which in this context means basic inference); (ii) to introduce some more advanced features of R (matrix commands, curve fitting, plotting, and “inquiry” functions); and (iii) to introduce the idea of simulating data.

Real data is very important in statistics, but so is simulated data. Because simulated data has known characteristics, the student/programmer can examine how algorithms, plots, and formulas perform in best- and worst-case scenarios. Simulating data from formulas and models requires the student/programmer to operationalize them, which often leads to a more complete understanding of what the formula or model is “saying.” The ability to simulate data also makes it possible to check conjectures quickly and to produce useful examples and counterexamples. It is the opinion of the author that the ability to *effortlessly and routinely* simulate data is a skill all statisticians should have.

Imagine data produced by the following simple model: *y_{k}* = β

The errors, ϵ_{k}, are normal, have mean zero, have equal spread, and are independent.
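The error assumptions above translate almost line for line into R. The following is a minimal sketch, assuming the model takes the usual simple-regression form y_k = β0 + β1·x_k + ϵ_k. That form, the parameter values, the sample size, and the variable names (`beta0`, `beta1`, `sigma`, `x`, `eps`) are all illustrative assumptions, not values taken from the text.

```r
# Illustrative values only; none of these come from the chapter.
set.seed(1)        # make the simulated draw reproducible
n     <- 100       # number of observations
beta0 <- 2         # assumed intercept
beta1 <- 0.5       # assumed slope
sigma <- 1         # common standard deviation ("equal spread")

x   <- seq(1, 10, length.out = n)        # fixed design points
eps <- rnorm(n, mean = 0, sd = sigma)    # independent normal errors with mean zero
y   <- beta0 + beta1 * x + eps           # assumed form: y_k = beta0 + beta1 * x_k + eps_k

fit <- lm(y ~ x)   # ordinary least squares fit
coef(fit)          # estimates to compare against beta0 and beta1
```

Because the true intercept and slope are known here, the fitted coefficients can be compared directly with them, which is the kind of check that real data does not allow.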

*Note:* A key difference between a traditional statistical problem and a time series problem ...
