Classifying news articles by topic using a CNN

For this example, we will use a dataset of references to news web pages collected by a news aggregator. The dataset has four categories: science and technology, business, entertainment, and health. The complete Jupyter Notebook for this example can be found at Chapter05/03_example.ipynb in this book's code repository.

We will first load the dataset, shuffle its rows, and look at a sample of the data:

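import pandas as pd

# Load the tab-separated file; it has no header row, so column names are supplied
news_df = pd.read_csv('data/newsCorpora.csv', delimiter='\t', header=None,
                      names=['ID', 'TITLE', 'URL', 'PUBLISHER', 'CATEGORY', 'STORY', 'HOSTNAME', 'TIMESTAMP'])
# Shuffle all rows
news_df = news_df.sample(frac=1.0)
news_df.head(5)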

The dataset is represented in table format as follows:

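As a quick sanity check on the four categories, the following minimal sketch counts the rows per topic. It rests on assumptions that are not stated in the text above: that the CATEGORY column holds the single-letter codes b, t, e, and m, as in the public News Aggregator dataset, and that they map to the topic names shown. The names CATEGORY_NAMES, toy_df, and summarize_categories are hypothetical, and toy_df is a made-up stand-in for news_df so that the sketch runs without the CSV file.

```python
import pandas as pd

# Assumed mapping from single-letter CATEGORY codes to topic names.
# Verify these codes against your own copy of the data.
CATEGORY_NAMES = {
    'b': 'business',
    't': 'science and technology',
    'e': 'entertainment',
    'm': 'health',
}

# Made-up rows standing in for news_df (hypothetical titles, not real data).
toy_df = pd.DataFrame({
    'TITLE': [
        'Hypothetical headline A',
        'Hypothetical headline B',
        'Hypothetical headline C',
        'Hypothetical headline D',
        'Hypothetical headline E',
    ],
    'CATEGORY': ['b', 't', 'e', 'm', 'b'],
})


def summarize_categories(df):
    """Return the number of rows per topic name."""
    names = df['CATEGORY'].map(CATEGORY_NAMES)
    return names.value_counts()


# Shuffle the rows as in the example; the fixed seed only makes runs repeatable.
shuffled = toy_df.sample(frac=1.0, random_state=0)
counts = summarize_categories(shuffled)
print(counts)
```

With the real data, passing news_df to summarize_categories in place of toy_df would show how evenly the four topics are represented, which is useful to know before training a classifier.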