November 2019
Intermediate to advanced
346 pages
9h 36m
English
Our initial steps are to import our dataset of fake news and perform basic data munging (steps 1-6), such as converting the target into a numeric type. Next, in step 7, we split the dataset into training and testing sets in preparation for constructing a classifier. Since we are dealing with textual data, we must featurize it. To that end, in steps 8 and 9, we instantiate Tf-Idf vectorizers for the text and fit them. Other NLP approaches may be fruitful here. Continuing to featurize, we extract the numerical features of our DataFrame (steps 10 and 11). Having finished featurizing the dataset, we can now instantiate a basic classifier and fit it on the featurized training data (step 12). In steps 13-15, we repeat the process on the testing set and measure ...
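
The workflow described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the recipe's own code: the tiny in-memory dataset, the column names (`title`, `text`, `exclamations`, `label`), the 75/25 split, and the choice of `LogisticRegression` as the "basic classifier" are all hypothetical stand-ins chosen for the example.

```python
import pandas as pd
from scipy.sparse import csr_matrix, hstack
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Hypothetical toy data standing in for the fake news dataset.
df = pd.DataFrame({
    "title": [
        "Aliens endorse candidate", "Council approves budget",
        "Miracle cure hidden by doctors", "Local school wins award",
        "Celebrity secretly a robot", "Rainfall expected this weekend",
        "Moon landing staged again", "Library extends opening hours",
    ],
    "text": [
        "Shocking sources claim aliens voted", "The city council passed the annual budget",
        "Doctors hate this one secret trick", "Students received a regional science award",
        "Insiders reveal shocking robot secret", "Forecasters expect light rain on Saturday",
        "Secret footage reveals shocking hoax", "The library will open until nine on weekdays",
    ],
    "exclamations": [3, 0, 4, 0, 5, 1, 2, 0],  # assumed numeric feature
    "label": ["fake", "real", "fake", "real", "fake", "real", "fake", "real"],
})

# Data munging: convert the target into a numeric type (1 = fake, 0 = real).
df["label"] = (df["label"] == "fake").astype(int)

# Train-test split, stratified so both classes appear in each part.
train_df, test_df = train_test_split(
    df, test_size=0.25, random_state=0, stratify=df["label"]
)

# One Tf-Idf vectorizer per text column, fitted on the training text only.
title_vec = TfidfVectorizer()
text_vec = TfidfVectorizer()
title_train = title_vec.fit_transform(train_df["title"])
text_train = text_vec.fit_transform(train_df["text"])

# Numerical features taken directly from the DataFrame.
numeric_cols = ["exclamations"]
num_train = csr_matrix(train_df[numeric_cols].to_numpy(dtype=float))

# Combine text and numerical features into a single sparse matrix.
X_train = hstack([title_train, text_train, num_train]).tocsr()
y_train = train_df["label"]

# Assumed "basic classifier": logistic regression.
clf = LogisticRegression(max_iter=1000)
clf.fit(X_train, y_train)

# Repeat the featurization on the testing set, reusing the fitted vectorizers.
X_test = hstack([
    title_vec.transform(test_df["title"]),
    text_vec.transform(test_df["text"]),
    csr_matrix(test_df[numeric_cols].to_numpy(dtype=float)),
]).tocsr()
predictions = clf.predict(X_test)
accuracy = accuracy_score(test_df["label"], predictions)
print(f"Test accuracy: {accuracy:.2f}")
```

In this sketch the vectorizers are fitted with `fit_transform` on the training text and only applied with `transform` to the testing text, so the test features share the training vocabulary and column layout. With a dataset this small the reported accuracy is not meaningful; the point is only the order of operations.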