Classifying Text Using Naive Bayes
"Language is a process of free creation; its laws and principles are fixed, but the manner in which the principles of generation are used is free and infinitely varied. Even the interpretation and use of words involves a process of free creation."
– Noam Chomsky

Not all information exists in tables. From Wikipedia to social media, there are billions of written words that we would like our computers to process and mine for information. The subfield of machine learning that deals with textual data goes by names such as Text Mining and Natural Language Processing (NLP). These different names reflect the fact that the field inherits from multiple disciplines. On the one hand, we have computer science ...
