Hands-On Transfer Learning with Python

by Dipanjan Sarkar, Raghav Bali, Tamoghna Ghosh

August 2018

Intermediate to advanced

438 pages

12h 3m

English

Packt Publishing


Traditional text categorization

Building text categorization algorithms or models involves a set of preprocessing steps and a suitable representation of the textual data as numerical vectors. The general preprocessing steps are as follows:

Sentence splitting: Split a document into a set of sentences.
Tokenization: Split sentences into constituent words.
Stemming or lemmatization: Reduce the word tokens to their base form. For example, words such as playing, played, and plays have one base: play. The base word produced by stemming need not be a word in the dictionary, whereas the root word produced by lemmatization, also known as the lemma, will always be present in the dictionary.
Text cleanup: Case conversion, correcting spellings, and removing stopwords ...
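The steps listed above can be sketched in a few lines of plain Python. This is a minimal illustration, not the book's code: the function names (`split_sentences`, `tokenize`, `naive_stem`, `preprocess`), the tiny stopword set, and the suffix-stripping rule are all assumptions made for the example, and spelling correction is left out. Here case conversion and stopword removal run before stemming, which is one reasonable ordering among several. A real pipeline would more likely use a library stemmer or lemmatizer.

```python
import re

# Tiny illustrative stopword list (an assumption; real lists are much longer).
STOPWORDS = {"a", "an", "the", "is", "are", "and", "in", "on", "of", "to",
             "he", "she", "it"}

def split_sentences(document):
    """Sentence splitting: break after '.', '!' or '?' followed by whitespace."""
    return [s for s in re.split(r"(?<=[.!?])\s+", document.strip()) if s]

def tokenize(sentence):
    """Tokenization: extract runs of letters (and apostrophes) as word tokens."""
    return re.findall(r"[A-Za-z']+", sentence)

def naive_stem(token):
    """Toy stemmer: strip one common suffix if at least 3 characters remain.

    The result need not be a dictionary word.
    """
    for suffix in ("ing", "ed", "s"):
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token

def preprocess(document):
    """Return one list of cleaned, stemmed tokens per sentence."""
    result = []
    for sentence in split_sentences(document):
        tokens = [t.lower() for t in tokenize(sentence)]      # case conversion
        tokens = [t for t in tokens if t not in STOPWORDS]    # stopword removal
        result.append([naive_stem(t) for t in tokens])        # stemming
    return result

if __name__ == "__main__":
    text = "She plays the guitar. He played it and is playing now!"
    print(preprocess(text))
    # → [['play', 'guitar'], ['play', 'play', 'now']]
    print(naive_stem("studies"))
    # → studie
```

Note that `naive_stem("studies")` yields `studie`, which is not a dictionary word. That is the behavior the text attributes to stemming; a lemmatizer would be expected to return `study` instead.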

ISBN: 9781788831307