Data Science Bookcamp

by Leonard Apeltsin
November 2021
Beginner to intermediate
704 pages
20h 16m
English
Manning Publications
Content preview from Data Science Bookcamp

Part 3. Case study 3: Tracking disease outbreaks using news headlines

Problem statement

Congratulations! You have just been hired by the American Institute of Health. The Institute monitors disease epidemics in both foreign and domestic lands. A critical component of the monitoring process is analyzing published news data. Each day, the Institute receives hundreds of news headlines describing disease outbreaks in various locations. The news headlines are too numerous to be analyzed by hand.

Your first assignment is as follows: You will process the daily quota of news headlines and extract the locations that are mentioned. You will then cluster the headlines based on their geographic distribution. Finally, you will review the largest clusters within ...
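The first two steps of the assignment (extract locations, then group headlines geographically) can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the book's solution: the headlines are made up, `GAZETTEER` is a hypothetical three-city lookup table with approximate coordinates, and the grouping is a simple distance-threshold linkage chosen here as a stand-in for a real clustering algorithm. The function names and the `max_km` parameter are likewise hypothetical.

```python
import math
import re
from collections import defaultdict

# Hypothetical mini-gazetteer: city name -> (latitude, longitude) in degrees.
# Coordinates are approximate and included for illustration only.
GAZETTEER = {
    "Miami": (25.76, -80.19),
    "Orlando": (28.54, -81.38),
    "Manila": (14.60, 120.98),
}

# Made-up headlines, not real data.
HEADLINES = [
    "Zika outbreak hits Miami",
    "Zika spreads to Orlando",
    "Dengue cases rise in Manila",
    "Flu season arrives early",  # names no known location
]


def extract_location(headline):
    """Return the first gazetteer city named in the headline, or None."""
    for city in GAZETTEER:
        pattern = r"\b" + re.escape(city) + r"\b"
        if re.search(pattern, headline, flags=re.IGNORECASE):
            return city
    return None


def great_circle_km(a, b):
    """Haversine distance in kilometers between two (lat, lon) pairs."""
    lat1, lon1, lat2, lon2 = map(math.radians, (*a, *b))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * math.asin(math.sqrt(h))


def cluster_headlines(headlines, max_km=500.0):
    """Group headlines whose locations chain together within max_km.

    Headlines without a recognized location are skipped.
    Returns a list of headline groups, largest group first.
    """
    located = [(h, extract_location(h)) for h in headlines]
    located = [(h, city) for h, city in located if city is not None]

    # Union-find over headline indices.
    parent = list(range(len(located)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(len(located)):
        for j in range(i + 1, len(located)):
            a = GAZETTEER[located[i][1]]
            b = GAZETTEER[located[j][1]]
            if great_circle_km(a, b) <= max_km:
                parent[find(i)] = find(j)

    groups = defaultdict(list)
    for i, (headline, _) in enumerate(located):
        groups[find(i)].append(headline)
    return sorted(groups.values(), key=len, reverse=True)


clusters = cluster_headlines(HEADLINES)
```

With these inputs, the two Florida headlines fall into one group and the Manila headline forms its own, so reviewing the largest group first surfaces the most geographically concentrated reports. A real pipeline would need a full gazetteer and a more careful treatment of ambiguous place names.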

Publisher Resources

ISBN: 9781617296253