Machine Learning Solutions

by Jalaj Thanaki
April 2018
Content level: Beginner to intermediate
566 pages
12h 17m
English
Packt Publishing

Feature engineering for the baseline model

For this application, we will use basic statistical feature extraction to generate features from the raw text data. In the NLP domain, we need to convert raw text into a numerical format so that an ML algorithm can be applied to it. Many techniques are available, including indexing, count-based vectorization, and Term Frequency-Inverse Document Frequency (TF-IDF). I have already discussed the concept of TF-IDF in Chapter 4, Generate features using TF-IDF.
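
As a minimal sketch of two of the techniques named above, the following pure-Python example builds count-based vectors and then TF-IDF weights for a toy corpus. The corpus, the variable names, and the unsmoothed weighting tf * log(N / df) are assumptions chosen for illustration and are not taken from the book's code. Libraries commonly apply smoothed or normalized variants of this formula.

```python
import math
from collections import Counter

# Hypothetical toy corpus (an assumption, not the book's dataset).
docs = [
    "the movie was good",
    "the movie was bad",
    "good acting and good story",
]

# Whitespace tokenization keeps the sketch simple.
tokenized = [doc.split() for doc in docs]

# Vocabulary: every distinct term, in alphabetical order.
vocab = sorted({term for tokens in tokenized for term in tokens})


def count_vector(tokens):
    """Count-based vectorization: one raw count per vocabulary term."""
    counts = Counter(tokens)
    return [counts[term] for term in vocab]


count_matrix = [count_vector(tokens) for tokens in tokenized]

# Document frequency: in how many documents each term appears.
n_docs = len(tokenized)
doc_freq = {
    term: sum(1 for tokens in tokenized if term in tokens) for term in vocab
}

# Unsmoothed inverse document frequency: log(N / df).
idf = {term: math.log(n_docs / doc_freq[term]) for term in vocab}


def tfidf_vector(tokens):
    """TF-IDF: (term count / document length) * idf for each vocabulary term."""
    counts = Counter(tokens)
    total = len(tokens)
    return [(counts[term] / total) * idf[term] for term in vocab]


tfidf_matrix = [tfidf_vector(tokens) for tokens in tokenized]

for doc, counts_row, tfidf_row in zip(docs, count_matrix, tfidf_matrix):
    print(doc)
    print("  counts:", counts_row)
    print("  tf-idf:", [round(weight, 3) for weight in tfidf_row])
```

In this sketch, a term that occurs in only one document (such as "bad") receives a higher weight than a term shared by several documents (such as "the"). That is the intuition behind preferring TF-IDF over raw counts.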

Note

Indexing is used mainly for fast data retrieval. In indexing, we assign a unique identification number. This number can be assigned in alphabetical order or based ...
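
A minimal sketch of the alphabetical scheme mentioned in the note, assuming that the items being indexed are words and that identification numbers start at 0. The word list and the variable names are hypothetical. Only the alphabetical ordering is shown.

```python
# Hypothetical sequence of words to index (an assumption for illustration).
words = ["movie", "good", "bad", "movie", "story", "good"]

# Assign each distinct word a unique ID in alphabetical order.
word_to_id = {word: idx for idx, word in enumerate(sorted(set(words)))}

# Reverse mapping, so an ID can be turned back into its word.
id_to_word = {idx: word for word, idx in word_to_id.items()}

# Replace every word in the sequence with its ID.
encoded = [word_to_id[word] for word in words]

print(word_to_id)
print(encoded)
```

Because each word maps to exactly one number, a lookup in either direction is a single dictionary access.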

Publisher Resources

ISBN: 9781788390040