Data understanding and preparation

To begin, we will load the dataset named water and examine its structure with the str() function as follows:

    > data(water)
    > str(water)
    'data.frame':   43 obs. of  8 variables:
     $ Year   : int  1948 1949 1950 1951 1952 1953 1954 1955 1956 1957 ...
     $ APMAM  : num  9.13 5.28 4.2 4.6 7.15 9.7 5.02 6.7 10.5 9.1 ...
     $ APSAB  : num  3.58 4.82 3.77 4.46 4.99 5.65 1.45 7.44 5.85 6.13 ...
     $ APSLAKE: num  3.91 5.2 3.67 3.93 4.88 4.91 1.77 6.51 3.38 4.08 ...
     $ OPBPC  : num  4.1 7.55 9.52 11.14 16.34 ...
     $ OPRC   : num  7.43 11.11 12.2 15.15 20.05 ...
     $ OPSLAKE: num  6.47 10.26 11.35 11.13 22.81 ...
     $ BSAAM  : int  54235 67567 66161 68094 107080 67594 65356 67909 92715 70024 ...

Here we have eight variables in all, seven features and one response variable, ...
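The column count reported by str() can be checked on a small scale. The following is a minimal sketch, not the book's own code: it rebuilds only the first five rows from the values printed in the str() output, so it runs without the package that provides water. The names water_head, response, and features are hypothetical, and treating BSAAM (the last column) as the response is an assumption made for illustration.

```r
# First five rows copied from the str() output above; the remaining
# 38 observations are omitted, so this is NOT the full water dataset.
water_head <- data.frame(
  Year    = 1948:1952,
  APMAM   = c(9.13, 5.28, 4.20, 4.60, 7.15),
  APSAB   = c(3.58, 4.82, 3.77, 4.46, 4.99),
  APSLAKE = c(3.91, 5.20, 3.67, 3.93, 4.88),
  OPBPC   = c(4.10, 7.55, 9.52, 11.14, 16.34),
  OPRC    = c(7.43, 11.11, 12.20, 15.15, 20.05),
  OPSLAKE = c(6.47, 10.26, 11.35, 11.13, 22.81),
  BSAAM   = c(54235L, 67567L, 66161L, 68094L, 107080L)
)

# Inspect column types and dimensions, as str(water) does for the full data
str(water_head)
dim(water_head)

# Assumption: BSAAM is the response; every other column is a candidate feature
response <- "BSAAM"
features <- setdiff(names(water_head), response)
length(features)
```

With the full dataset loaded, the same dim() and setdiff() calls can be applied to water directly to confirm how many columns remain once the response is set aside.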
