To clean and explore the data, follow these steps:
- Imported numeric data often contains special characters such as percentage signs, dollar signs, and commas. These cause R to treat the field as a character field instead of a numeric field. For example, our FINVIZ dataset contains numerous values with percentage signs that must be removed. To do this, we will create a clean_numeric function that strips the unwanted characters using the gsub command and then converts the result to numeric. We will create this function once and then use it multiple times throughout the chapter:
clean_numeric <- function(s){
  s <- gsub("%|\\$|,|\\)|\\(", "", s)
  s <- as.numeric(s)
}
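To see what the function does before applying it to the dataset, it can be tried on a few hand-made strings. The following is only a sketch: the `raw` vector is made-up sample input, not values taken from the FINVIZ file, and the function is repeated so that the snippet runs on its own.

```r
# Same definition as above, repeated so the snippet is self-contained
clean_numeric <- function(s){
  s <- gsub("%|\\$|,|\\)|\\(", "", s)
  s <- as.numeric(s)
}

# Hypothetical sample strings (assumed for illustration, not from the dataset)
raw <- c("12.5%", "$1,234.50", "(3.2%)", "-0.75%")

cleaned <- clean_numeric(raw)
print(cleaned)        # values: 12.5, 1234.5, 3.2, -0.75
print(class(cleaned)) # "numeric"
```

Note that the parentheses are simply removed, so a value written in accounting style such as "(3.2%)" comes back as a positive 3.2. If parentheses are meant to mark negative numbers in your data, that case would need separate handling.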
- Next, we will apply this function to the numeric fields ...