February 2020
Intermediate to advanced
328 pages
8h 19m
English
Before moving on to the model-building part, we need to preprocess the input data. Let's get started:
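data_cleaning <- function(sentence) {
  # Collapse each run of punctuation and spaces into a single space
  sentence = gsub('[[:punct:] ]+', ' ', sentence)
  # Replace any remaining character outside the allowed set with a space
  sentence = gsub("[^[:alnum:]\\-\\.\\s]", " ", sentence)
  # Transliterate accented Latin characters to their ASCII equivalents
  sentence = stringi::stri_trans_general(sentence, "latin-ascii")
  # Convert everything to lowercase
  sentence = tolower(sentence)
  sentence
}

# Apply the cleaning function to every sentence
# (map() is assumed to be purrr::map, loaded earlier)
sentences <- map(sentences, data_cleaning)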
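To see what the cleaning step does, it can help to run it on a few short strings. The following is a minimal sketch, assuming the stringi and purrr packages are installed; the `samples` and `cleaned` names are hypothetical and used only for illustration, and the function is repeated so that the snippet runs on its own.

```r
# Same cleaning function as above, repeated so this snippet is self-contained
data_cleaning <- function(sentence) {
  sentence <- gsub('[[:punct:] ]+', ' ', sentence)
  sentence <- gsub("[^[:alnum:]\\-\\.\\s]", " ", sentence)
  sentence <- stringi::stri_trans_general(sentence, "latin-ascii")
  sentence <- tolower(sentence)
  sentence
}

# Hypothetical sample inputs (not taken from the book's dataset)
samples <- list(
  "Hello, World!!",
  "It's 5 o'clock...",
  "Caf\u00e9 au lait"
)

# purrr::map returns a list with one cleaned string per input
cleaned <- purrr::map(samples, data_cleaning)

print(cleaned[[1]])  # "hello world "
print(cleaned[[2]])  # "it s 5 o clock "
print(cleaned[[3]])
```

Note that trailing punctuation is replaced by a space rather than removed, so cleaned strings can end with a space, and apostrophes split contractions into separate tokens. If that matters for later steps, wrapping the result in base R's `trimws()` is one option.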