November 2016
Beginner to intermediate
687 pages
15h 31m
English
In this chapter, we will cover the following recipes:
This chapter covers parsing specific kinds of data, focusing primarily on dates, times, and HTML. Luckily, there are a number of useful libraries to accomplish this, so we don't have to delve into tricky and overly complicated regular expressions. These libraries can be great complements to NLTK:
dateutil provides datetime parsing and timezone conversion
lxml and BeautifulSoup can parse, clean, and convert HTML
charade and UnicodeDammit ...
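As a rough illustration of the first item, the sketch below parses date strings with dateutil and normalizes them to UTC. The helper name `parse_to_utc` and the choice to treat timezone-less input as UTC are assumptions made for this example, not something the chapter prescribes; only `dateutil.parser.parse` and `dateutil.tz.tzutc` come from the library itself.

```python
from datetime import datetime

from dateutil import parser, tz


def parse_to_utc(text):
    """Parse a free-form date string and return an aware datetime in UTC.

    Hypothetical helper for illustration: strings without any timezone
    information are assumed to already be in UTC.
    """
    dt = parser.parse(text)  # dateutil infers the format from the string
    if dt.tzinfo is None:
        # Assumption: naive timestamps are interpreted as UTC.
        dt = dt.replace(tzinfo=tz.tzutc())
    return dt.astimezone(tz.tzutc())


if __name__ == "__main__":
    # An ISO 8601 string carrying an explicit UTC offset.
    print(parse_to_utc("2016-11-30T10:30:00-05:00"))
    # A human-style string with no timezone information.
    print(parse_to_utc("November 30, 2016 10:30"))
```

No format string is passed in either call, which is the practical advantage over hand-written regular expressions or `strptime` patterns when input formats vary.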