Chapter 1. Introduction
This book is less an instructional manual than a collection of notes, tables, and examples for machine learning. The author created it as an additional resource for training sessions, meant to be distributed as a physical notebook. Participants (who favor the physical characteristics of dead-tree material) could add their own notes and thoughts and keep a valuable reference of curated examples.
We will walk through classification with structured data. Other common machine learning applications include predicting a continuous value (regression), creating clusters, and reducing dimensionality, among others. This book does not discuss deep learning techniques. While those techniques work well for unstructured data, most practitioners recommend the techniques in this book for structured data.
We assume knowledge of and familiarity with Python. Learning how to manipulate data using the pandas library is also useful. We have many examples using pandas, and it is an excellent tool for dealing with structured data. However, some of the indexing operations may be confusing if you are not familiar with NumPy. Full coverage of pandas could be a book in itself.
Libraries Used
This book uses many libraries. This can be a good thing and a bad thing. Some of these libraries may be hard to install or conflict with other library versions. Do not feel like you need to install all of these libraries. Use “JIT installation” and only install the libraries that you want to use as you need them. ...
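One way to practice "JIT installation" is to check which libraries are missing right before you need them and install only those. The following is a minimal sketch, not a method prescribed by the book: the helper names `missing_packages` and `install_hint` are hypothetical, and the list of needed libraries is an illustrative assumption. It uses only the standard library and prints a suggested command rather than installing anything itself.

```python
import importlib.util


def missing_packages(names):
    """Return the top-level module names in `names` that cannot be imported here."""
    return [name for name in names if importlib.util.find_spec(name) is None]


def install_hint(names):
    """Build a pip command for the missing modules, or return None if all are present.

    Assumption: the import name matches the name used for installation.
    That is not always true (for example, `sklearn` is installed as
    `scikit-learn`), so adjust the names by hand where they differ.
    """
    missing = missing_packages(names)
    if not missing:
        return None
    return "python -m pip install " + " ".join(missing)


# Hypothetical example: check only the library needed for the next step.
needed = ["pandas"]
print(install_hint(needed) or "Nothing to install.")
```

Running the suggested command inside a virtual environment keeps a hard-to-install library from conflicting with versions used by other projects.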