Output from the str(df) function

The str() function is an important one to run whenever you create a new data object. If you examine the output of str(df) in the log, you will see that the default read.csv call you have just run has defined each variable in the data as either numeric or factor. It is fine to leave the types as they are for now, but we will eventually want to change some of them: Length.of.Stay, for example, is not a factor; it is an integer. Alternatively, we can run read.csv again and specify the exact data type we want for each column by passing a colClasses vector. I usually like to look at a small sample of the dataset first, then determine what the data types should be, and then ...
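
As a rough illustration of the two approaches described above, here is a minimal sketch. The inline sample data and the Patient.ID and Gender columns are assumptions made up for this example; only Length.of.Stay comes from the text. The data is read from a string with the text argument so that the snippet runs without a file on disk.

```r
# Hypothetical sample data; only Length.of.Stay is a column named in the text.
csv_text <- "Patient.ID,Gender,Length.of.Stay
1,F,3
2,M,10
3,F,1"

# Default read: let read.csv guess the types, then inspect them with str().
df <- read.csv(text = csv_text, stringsAsFactors = TRUE)
str(df)

# Approach 1: change the type after the fact.
# Going through as.character() first matters if the column was read as a
# factor, because as.integer() on a factor returns level codes, not values.
df$Length.of.Stay <- as.integer(as.character(df$Length.of.Stay))

# Approach 2: read the data again and state the types up front.
# A named colClasses vector is matched to columns by name.
df2 <- read.csv(
  text = csv_text,
  colClasses = c(
    Patient.ID     = "integer",
    Gender         = "factor",
    Length.of.Stay = "integer"
  )
)
str(df2)
```

With a real file, the text argument would be replaced by the file path. Looking at a small sample first makes it easier to choose sensible colClasses values, since read.csv stops with an error if a column cannot be converted to the class requested for it.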
