Software Essentials 29
2.1.9 Factors
In statistics, a ‘factor’ is a categorical variable, that is, a variable whose values are discrete categories
(e.g., ‘red’, ‘green’, ‘blue’, ‘orange’ or ‘animal’, ‘vegetable’, ‘mineral’). The possible categories
are called the levels of the factor.
In R, factors are represented by objects of class "factor". A factor dataset f contains values
f[1], f[2], ..., f[n] which are treated as categorical values.
> col <- c("red", "green", "red", "blue",
           "blue", "green", "red")
> col
[1] "red"   "green" "red"   "blue"  "blue"  "green" "red"
> f <- factor(col)
> f
[1] red   green red   blue  blue  green red
Levels: blue green red
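As a brief sketch of what sets a factor apart from a plain character vector, the following lines inspect the factor built above using standard base R functions. The object g and its list of levels (including ‘orange’, which does not occur in the data) are illustrative assumptions and not part of the original example.

```r
# Rebuild the example objects so that this block runs on its own
col <- c("red", "green", "red", "blue", "blue", "green", "red")
f <- factor(col)

levels(f)      # the possible categories, sorted alphabetically by default
nlevels(f)     # how many levels there are
as.integer(f)  # integer codes: position of each value within levels(f)
table(f)       # number of observations in each level

# Assumption for illustration: declare the levels explicitly,
# including a category ("orange") that is absent from the data
g <- factor(col, levels = c("red", "green", "blue", "orange"))
table(g)       # the unused level is still reported, with a count of zero
```

Here levels(f) returns the three categories present in the data, while table(g) also lists the declared but unused level; a character vector carries no such information about which categories are possible.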
Factors are superficially similar to vectors of character strings, but their conceptual nature and
practical ...