Word and sentence tokenize

So far, we've used the Brown corpus, a data source that has already been processed correctly. Let's see what happens when we process a new text supplied by the user. As a source, I used a passage based on the novel The Adventures of Huckleberry Finn by Mark Twain:

We catched fish, and talked, and we took a swim now and then to keep off sleepiness. It was kind of solemn, drifting down the big still river, laying on our backs looking up at the stars, and we didn't ever feel like talking loud, and it warn't often that we laughed, only a kind of low chuckle. We had mighty good weather, as a general thing, and nothing ever happened to us at all, that night, nor the next, nor the next.

Text data can be split ...
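As a rough illustration of splitting the passage above into sentences and words, here is a minimal sketch that uses only Python's standard library. The helper names `split_sentences` and `split_words` and the regular expressions are assumptions made for this example; they are a simplified stand-in and not the tokenizer used in the book.

```python
import re

# The Huckleberry Finn passage quoted above.
text = (
    "We catched fish, and talked, and we took a swim now and then to keep "
    "off sleepiness. It was kind of solemn, drifting down the big still "
    "river, laying on our backs looking up at the stars, and we didn't ever "
    "feel like talking loud, and it warn't often that we laughed, only a "
    "kind of low chuckle. We had mighty good weather, as a general thing, "
    "and nothing ever happened to us at all, that night, nor the next, nor "
    "the next."
)


def split_sentences(passage):
    """Naive sentence split: break after '.', '!' or '?' followed by whitespace."""
    return [s for s in re.split(r"(?<=[.!?])\s+", passage.strip()) if s]


def split_words(sentence):
    """Naive word split: runs of word characters (internal apostrophes kept)
    or single punctuation marks."""
    return re.findall(r"\w+(?:'\w+)*|[^\w\s]", sentence)


sentences = split_sentences(text)
words = [split_words(s) for s in sentences]

print(len(sentences))   # 3
print(words[0][:5])     # ['We', 'catched', 'fish', ',', 'and']
```

A rule this simple breaks on abbreviations such as "Mr." and keeps contractions like "didn't" as a single token. NLTK's `sent_tokenize` and `word_tokenize` functions in `nltk.tokenize` handle such cases with a trained model and more elaborate rules, so their output on the same passage can differ from this sketch.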
