© Akshay Kulkarni and Adarsha Shivananda 2019
Akshay Kulkarni and Adarsha Shivananda, Natural Language Processing Recipes, https://doi.org/10.1007/978-1-4842-4267-4_1

1. Extracting the Data

Akshay Kulkarni and Adarsha Shivananda
Bangalore, Karnataka, India

In this chapter, we cover various sources of text data and ways to extract that data, which can then provide information or insights for businesses.
  • Recipe 1. Text data collection using APIs

  • Recipe 2. Reading a PDF file in Python

  • Recipe 3. Reading a Word document

  • Recipe 4. Reading a JSON object

  • Recipe 5. Reading an HTML page and parsing HTML

  • Recipe 6. Regular expressions

  • Recipe 7. String handling

  • Recipe 8. Web scraping

Introduction

Before getting into the details of the book, let’s see the different possible data sources ...
