Book description
Machine learning has become an integral part of many commercial applications and research projects, but this field is not exclusive to large companies with extensive research teams. If you use Python, even as a beginner, this book will teach you practical ways to build your own machine learning solutions. With all the data available today, machine learning applications are limited only by your imagination.
You’ll learn the steps necessary to create a successful machine learning application with Python and the scikit-learn library. Authors Andreas Müller and Sarah Guido focus on the practical aspects of using machine learning algorithms, rather than the math behind them. Familiarity with the NumPy and matplotlib libraries will help you get even more from this book.
With this book, you’ll learn:
- Fundamental concepts and applications of machine learning
- Advantages and shortcomings of widely used machine learning algorithms
- How to represent data processed by machine learning, including which data aspects to focus on
- Advanced methods for model evaluation and parameter tuning
- The concept of pipelines for chaining models and encapsulating your workflow
- Methods for working with text data, including text-specific processing techniques
- Suggestions for improving your machine learning and data science skills
Table of contents
- Preface
- 1. Introduction
- 2. Supervised Learning
- 3. Unsupervised Learning and Preprocessing
- 4. Representing Data and Engineering Features
- 4.1. Categorical Variables
- 4.2. OneHotEncoder and ColumnTransformer: Categorical Variables with scikit-learn
- 4.3. Convenient ColumnTransformer creation with make_column_transformer
- 4.4. Binning, Discretization, Linear Models, and Trees
- 4.5. Interactions and Polynomials
- 4.6. Univariate Nonlinear Transformations
- 4.7. Automatic Feature Selection
- 4.8. Utilizing Expert Knowledge
- 4.9. Summary and Outlook
- 5. Model Evaluation and Improvement
- 6. Algorithm Chains and Pipelines
- 7. Working with Text Data
- 7.1. Types of Data Represented as Strings
- 7.2. Example Application: Sentiment Analysis of Movie Reviews
- 7.3. Representing Text Data as a Bag of Words
- 7.4. Stopwords
- 7.5. Rescaling the Data with tf–idf
- 7.6. Investigating Model Coefficients
- 7.7. Bag-of-Words with More Than One Word (n-Grams)
- 7.8. Advanced Tokenization, Stemming, and Lemmatization
- 7.9. Topic Modeling and Document Clustering
- 7.10. Summary and Outlook
- 8. Wrapping Up
- Index
Product information
- Title: Introduction to Machine Learning with Python
- Author(s): Andreas Müller, Sarah Guido
- Release date: September 2016
- Publisher(s): O'Reilly Media, Inc.
- ISBN: 9781449369897