Data comes in various forms and levels of complexity. Even listing the major and minor complexities of data preparation is a difficult task, and the forms and complexities of a given data set may or may not be known in advance. It is therefore difficult to provide a standard set of guidelines for teaching data preparation methods.
Complexities arise on various counts, such as file types, files with missing data values, and files with different kinds of attributes. In some cases, the user may be unable to read the data correctly without rewriting the code several times. In Section 3.2, we use the options available in the R function read.table to import data from external files which pose some difficulties. The options can be varied to accommodate problems in the data, to skip a certain number of lines of the file, and so forth. A good practice while learning is to validate the data imported into R and check that it is as expected. It thus helps to inspect the imported data using functions such as View, and such functions will be illustrated in Section 3.4. R functions such as assign are effective in carrying out data manipulation without the need to create new R objects. The use of these ...