Machine Learning Algorithms

Book description

Build a strong foundation for entering the world of Machine Learning and data science with the help of this comprehensive guide.

About This Book

  • Get started in the field of Machine Learning with the help of this solid, concept-rich, yet highly practical guide.
  • Your one-stop solution for everything that matters in mastering the whats and whys of Machine Learning algorithms and their implementation.
  • Get a solid foundation for your entry into Machine Learning by strengthening your roots (algorithms) with this comprehensive guide.

Who This Book Is For

This book is for IT professionals who want to enter the field of data science and are very new to Machine Learning. Familiarity with languages such as R and Python will be invaluable here.

What You Will Learn

  • Acquaint yourself with important elements of Machine Learning
  • Understand the feature selection and feature engineering process
  • Assess performance and error trade-offs for Linear Regression
  • Build a data model and understand how it works by using different types of algorithms
  • Learn to tune the parameters of Support Vector Machines
  • Apply clustering to a dataset
  • Explore the concepts of Natural Language Processing and Recommendation Systems
  • Create an ML architecture from scratch

In Detail

As the amount of data continues to grow at an almost incomprehensible rate, being able to understand and process data is becoming a key differentiator for competitive organizations. Machine learning applications are everywhere, from self-driving cars, spam detection, document search, and trading strategies, to speech recognition. This makes machine learning well-suited to the present-day era of Big Data and Data Science. The main challenge is how to transform data into actionable knowledge.

In this book you will learn the important Machine Learning algorithms that are commonly used in the field of data science. These algorithms can be used for supervised, unsupervised, semi-supervised, and reinforcement learning. Among the well-known algorithms and topics covered in this book are Linear Regression, Logistic Regression, SVM, Naive Bayes, K-Means, Random Forest, TensorFlow, and feature engineering. You will also learn how these algorithms work and how to implement them in practice to solve your problems. This book will also introduce you to Natural Language Processing and Recommendation Systems, which help you run multiple algorithms simultaneously.

On completion of the book you will have mastered selecting Machine Learning algorithms for clustering, classification, or regression based on your problem.

Style and approach

An easy-to-follow, step-by-step guide that will help you get to grips with real-world applications of algorithms for Machine Learning.

Downloading the example code for this book: you can download the example code files for all Packt books you have purchased from your Packt account. If you purchased this book elsewhere, you can register with Packt to obtain the code files.

Table of contents

  1. Preface
    1. What this book covers
    2. What you need for this book
    3. Who this book is for
    4. Conventions
    5. Reader feedback
    6. Customer support
      1. Downloading the example code
      2. Downloading the color images of this book
      3. Errata
      4. Piracy
      5. Questions
  2. A Gentle Introduction to Machine Learning
    1. Introduction - classic and adaptive machines
    2. Only learning matters
      1. Supervised learning
      2. Unsupervised learning
      3. Reinforcement learning
    3. Beyond machine learning - deep learning and bio-inspired adaptive systems
    4. Machine learning and big data
    5. Further reading
    6. Summary
  3. Important Elements in Machine Learning
    1. Data formats
      1. Multiclass strategies
        1. One-vs-all
        2. One-vs-one
    2. Learnability
      1. Underfitting and overfitting
      2. Error measures
      3. PAC learning
    3. Statistical learning approaches
      1. MAP learning
      2. Maximum-likelihood learning
    4. Elements of information theory
    5. References
    6. Summary
  4. Feature Selection and Feature Engineering
    1. scikit-learn toy datasets
    2. Creating training and test sets
    3. Managing categorical data
    4. Managing missing features
    5. Data scaling and normalization
    6. Feature selection and filtering
    7. Principal component analysis
      1. Non-negative matrix factorization
      2. Sparse PCA
      3. Kernel PCA
    8. Atom extraction and dictionary learning
    9. References
    10. Summary
  5. Linear Regression
    1. Linear models
    2. A bidimensional example
    3. Linear regression with scikit-learn and higher dimensionality
      1. Regressor analytic expression
    4. Ridge, Lasso, and ElasticNet
    5. Robust regression with random sample consensus
    6. Polynomial regression
    7. Isotonic regression
    8. References
    9. Summary
  6. Logistic Regression
    1. Linear classification
    2. Logistic regression
    3. Implementation and optimizations
    4. Stochastic gradient descent algorithms
    5. Finding the optimal hyperparameters through grid search
    6. Classification metrics
    7. ROC curve
    8. Summary
  7. Naive Bayes
    1. Bayes' theorem
    2. Naive Bayes classifiers
    3. Naive Bayes in scikit-learn
      1. Bernoulli naive Bayes
      2. Multinomial naive Bayes
      3. Gaussian naive Bayes
    4. References
    5. Summary
  8. Support Vector Machines
    1. Linear support vector machines
    2. scikit-learn implementation
      1. Linear classification
      2. Kernel-based classification
        1. Radial Basis Function
        2. Polynomial kernel
        3. Sigmoid kernel
        4. Custom kernels
      3. Non-linear examples
    3. Controlled support vector machines
    4. Support vector regression
    5. References
    6. Summary
  9. Decision Trees and Ensemble Learning
    1. Binary decision trees
      1. Binary decisions
      2. Impurity measures
        1. Gini impurity index
        2. Cross-entropy impurity index
        3. Misclassification impurity index
      3. Feature importance
    2. Decision tree classification with scikit-learn
    3. Ensemble learning
      1. Random forests
        1. Feature importance in random forests
      2. AdaBoost
      3. Gradient tree boosting
      4. Voting classifier
    4. References
    5. Summary
  10. Clustering Fundamentals
    1. Clustering basics
      1. K-means
        1. Finding the optimal number of clusters
          1. Optimizing the inertia
          2. Silhouette score
          3. Calinski-Harabasz index
          4. Cluster instability
      2. DBSCAN
      3. Spectral clustering
    2. Evaluation methods based on the ground truth
      1. Homogeneity
      2. Completeness
      3. Adjusted Rand index
    3. References
    4. Summary
  11. Hierarchical Clustering
    1. Hierarchical strategies
    2. Agglomerative clustering
      1. Dendrograms
      2. Agglomerative clustering in scikit-learn
      3. Connectivity constraints
    3. References
    4. Summary
  12. Introduction to Recommendation Systems
    1. Naive user-based systems
      1. User-based system implementation with scikit-learn
    2. Content-based systems
    3. Model-free (or memory-based) collaborative filtering
    4. Model-based collaborative filtering
      1. Singular Value Decomposition strategy
      2. Alternating least squares strategy
      3. Alternating least squares with Apache Spark MLlib
    5. References
    6. Summary
  13. Introduction to Natural Language Processing
    1. NLTK and built-in corpora
      1. Corpora examples
    2. The bag-of-words strategy
      1. Tokenizing
        1. Sentence tokenizing
        2. Word tokenizing
      2. Stopword removal
        1. Language detection
      3. Stemming
      4. Vectorizing
        1. Count vectorizing
          1. N-grams
        2. Tf-idf vectorizing
    3. A sample text classifier based on the Reuters corpus
    4. References
    5. Summary
  14. Topic Modeling and Sentiment Analysis in NLP
    1. Topic modeling
      1. Latent semantic analysis
      2. Probabilistic latent semantic analysis
      3. Latent Dirichlet Allocation
    2. Sentiment analysis
      1. VADER sentiment analysis with NLTK
    3. References
    4. Summary
  15. A Brief Introduction to Deep Learning and TensorFlow
    1. Deep learning at a glance
      1. Artificial neural networks
      2. Deep architectures
        1. Fully connected layers
        2. Convolutional layers
        3. Dropout layers
        4. Recurrent neural networks
    2. A brief introduction to TensorFlow
      1. Computing gradients
      2. Logistic regression
      3. Classification with a multi-layer perceptron
      4. Image convolution
    3. A quick glimpse inside Keras
    4. References
    5. Summary
  16. Creating a Machine Learning Architecture
    1. Machine learning architectures
      1. Data collection
      2. Normalization
      3. Dimensionality reduction
      4. Data augmentation
      5. Data conversion
      6. Modeling/Grid search/Cross-validation
      7. Visualization
    2. scikit-learn tools for machine learning architectures
      1. Pipelines
      2. Feature unions
    3. References
    4. Summary

Product information

  • Title: Machine Learning Algorithms
  • Author(s): Giuseppe Bonaccorso
  • Release date: July 2017
  • Publisher(s): Packt Publishing
  • ISBN: 9781785889622