Follow these steps to perform linear regression on SLID data:
- You can use the str function to get an overview of the data:
> str(SLID)
Output:
'data.frame':	7425 obs. of  5 variables:
 $ wages    : num  10.6 11 NA 17.8 NA ...
 $ education: num  15 13.2 16 14 8 16 12 14.5 15 10 ...
 $ age      : int  40 19 49 46 71 50 70 42 31 56 ...
 $ sex      : Factor w/ 2 levels "Female","Male": 2 2 2 2 2 1 1 1 2 1 ...
 $ language : Factor w/ 3 levels "English","French",..: 1 1 3 3 1 1 1 1 1 1 ...
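The str call assumes that SLID is already available in the workspace. As a minimal sketch, assuming the carData package (which provides SLID) is installed, the data can be loaded and checked for missing values as follows. The name na_counts is only illustrative.

```r
# Assumption: the carData package is installed; it ships the SLID data set.
data("SLID", package = "carData")

# Count missing values in each column. The str() output shows NA entries
# in wages, so it helps to know how many rows are affected.
na_counts <- colSums(is.na(SLID))
print(na_counts)
```

Knowing where the missing values are matters later, because lm drops rows with NA in any model variable by default.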
- Next, visualize the variable wages against language, age, education, and sex in a 2 x 2 grid of plots:
> par(mfrow=c(2,2))
> plot(SLID$wages ~ SLID$language)
> plot(SLID$wages ~ SLID$age)
> plot(SLID$wages ~ SLID$education)
> plot(SLID$wages ~ SLID$sex)
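The same four plots can also be written with the data argument of the formula interface, which avoids repeating SLID$ and gives cleaner axis labels. This is a sketch of an equivalent style, not the recipe's own code, and it assumes the carData package is installed. The name old_par is only illustrative.

```r
# Assumption: the carData package is installed; it ships the SLID data set.
data("SLID", package = "carData")

# Arrange plots in a 2 x 2 grid and remember the previous settings.
old_par <- par(mfrow = c(2, 2))

# Factor predictors (language, sex) produce box plots;
# numeric predictors (age, education) produce scatter plots.
plot(wages ~ language, data = SLID)
plot(wages ~ age, data = SLID)
plot(wages ~ education, data = SLID)
plot(wages ~ sex, data = SLID)

# Restore the previous graphics settings.
par(old_par)
```

Restoring the saved settings keeps later plots from being drawn into the 2 x 2 grid by accident.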