Models and Formulas
To statisticians, a model is a concise way to describe a set of data, usually with a mathematical formula. Sometimes, the goal is to build a predictive model: you fit it to training data, then use it to predict values for other data. Other times, the goal is to build a descriptive model that helps you understand the data better.
R has a special notation for describing relationships between variables. Suppose that you are assuming a linear model for a variable y, predicted from the variables x1, x2, ..., xn. (Statisticians usually refer to y as the dependent variable, and x1, x2, ..., xn as the independent variables.) In equation form, this implies a relationship like:
y = c0 + c1*x1 + c2*x2 + ... + cn*xn + ε
where c0, c1, ..., cn are the coefficients to be estimated and ε is an error term.
In R, you would write the relationship as y ~ x1 + x2 + ... + xn, which is a formula object.
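As a minimal sketch of what "formula object" means in practice, the snippet below builds a formula and inspects it. The names y, x1, and x2 are placeholders carried over from the text; they do not need to exist as variables, because a formula is stored unevaluated.

```r
# Build a formula object; y, x1, and x2 are placeholder names
f <- y ~ x1 + x2

class(f)       # "formula"
all.vars(f)    # the variable names used in the formula: "y" "x1" "x2"

# A formula can also be built from a string, which can help when the
# predictor names are only known at run time
g <- as.formula("y ~ x1 + x2")
identical(all.vars(f), all.vars(g))
```

Because the formula only names the variables, the same object can be passed to a modeling function together with whatever data frame supplies those columns.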
So, let’s try to use a linear regression to estimate the relationship between dist and speed in the cars data set. The formula is dist~speed. We’ll use the lm function to estimate the parameters of a linear model. The lm function returns an object of class lm, which we will assign to a variable called cars.lm:
> cars.lm <- lm(formula=dist~speed,data=cars)
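To confirm what lm returned, you can check the object's class and recover the formula it was fitted with. This is a small sketch that refits the same model so it runs on its own; it assumes only the cars data set that ships with base R.

```r
# cars is part of the datasets package that loads with base R
cars.lm <- lm(formula = dist ~ speed, data = cars)

class(cars.lm)            # "lm"
inherits(cars.lm, "lm")   # TRUE
formula(cars.lm)          # dist ~ speed
```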
Now, let’s take a quick look at the results returned:
> cars.lm
Call:
lm(formula = dist ~ speed, data = cars)
Coefficients:
(Intercept)        speed
    -17.579        3.932
As you can see, printing an lm object shows you the original function call (and thus the data set and formula) and the estimated coefficients. For some more information, ...