November 2019
Intermediate to advanced
346 pages
9h 36m
English
In the following steps, you will read in the fake news dataset, preprocess it, and then train a Random Forest classifier to detect fake news:
import pandas as pd

columns = [
    "text",
    "language",
    "thread_title",
    "spam_score",
    "replies_count",
    "participants_count",
    "likes",
    "comments",
    "shares",
    "type",
]
df = pd.read_csv("fake_news_dataset.csv", usecols=columns)
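# Keep only English-language rows, drop rows with missing values,
# then drop the now-constant language column.
df = df[df["language"] == "english"]
df = df.dropna()
df = df.drop("language", axis=1)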
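The filtering steps above can be tried without the CSV file. The following is a minimal sketch on a small in-memory frame: the rows, the reduced column set, and the `preprocess` helper name are invented for illustration and are not part of the dataset or the recipe.

```python
import pandas as pd

# Toy stand-in for fake_news_dataset.csv; every value here is made up.
toy = pd.DataFrame(
    {
        "text": ["breaking story", "noticias falsas", None, "another story"],
        "language": ["english", "spanish", "english", "english"],
        "likes": [10, 3, 7, 0],
        "type": ["fake", "fake", "real", "real"],
    }
)


def preprocess(df):
    """Hypothetical helper wrapping the three cleaning steps."""
    df = df[df["language"] == "english"]  # keep English rows only
    df = df.dropna()                      # drop rows with any missing value
    return df.drop("language", axis=1)    # the column is constant now


clean = preprocess(toy)
print(clean)
```

The Spanish row is removed by the language filter and the row with missing text is removed by `dropna()`, so the order of the steps matters: dropping the `language` column first would make the filter impossible.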
features = 0
feature_map = {}
def add_feature(name): ...