Machine Learning with R Quick Start Guide

Book description

Learn how to use R to apply powerful machine learning methods and gain insight into real-world applications using clustering, logistic regression, random forests, support vector machines, and more.

Key Features

  • Use R 3.5 to implement real-world examples in machine learning
  • Implement key machine learning algorithms to understand how smart models work
  • Create end-to-end machine learning pipelines using modern libraries from the R ecosystem

Book Description

Machine Learning with R Quick Start Guide takes you on a data-driven journey that starts with the very basics of R and machine learning. It gradually builds upon core concepts so you can handle the varied complexities of data and understand each stage of the machine learning pipeline.

The book covers the full workflow, from data collection to Natural Language Processing (NLP). You will implement key machine learning algorithms to understand how they are used to build smart models, working through tasks such as clustering, logistic regression, random forests, and support vector machines. You will also look at more advanced topics such as training neural networks and topic modeling.

By the end of the book, you will be able to apply the concepts of machine learning, deal with data-related problems, and solve them using the powerful yet simple language that is R.

What you will learn

  • Introduce yourself to the basics of machine learning with R 3.5
  • Get to grips with R techniques for cleaning and preparing your data for analysis, and for visualizing your results
  • Learn to build predictive models with the help of various machine learning techniques
  • Use R to visualize data spread across multiple dimensions and extract useful features
  • Use interactive data analysis with R to get insights into data
  • Implement supervised and unsupervised learning, and NLP using R libraries

Who this book is for

This book is for graduate students, aspiring data scientists, and data analysts who wish to enter the field of machine learning and are looking to implement machine learning techniques and methodologies from scratch using R 3.5. A working knowledge of the R programming language is expected.

Table of contents

  1. Title Page
  2. Copyright and Credits
    1. Machine Learning with R Quick Start Guide
  3. About Packt
    1. Why subscribe?
    2. Packt.com
  4. Contributors
    1. About the author
    2. About the reviewer
    3. Packt is searching for authors like you
  5. Preface
    1. Who this book is for
    2. What this book covers
    3. To get the most out of this book
      1. Download the example code files
      2. Download the color images
      3. Conventions used
    4. Get in touch
      1. Reviews
  6. R Fundamentals for Machine Learning
    1. R and RStudio installation
      1. Things to know about R
      2. Using RStudio
        1. RStudio installation
    2. Some basic commands
    3. Objects, special cases, and basic operators in R
      1. Working with objects
      2. Working with vectors
        1. Vector indexing
          1. Functions on vectors
      3. Factor
        1. Factor levels
      4. Strings
        1. String functions
      5. Matrices
        1. Representing matrices
        2. Creating matrices
        3. Accessing elements in a matrix
        4. Matrix functions
      6. Lists
        1. Creating lists
        2. Accessing components and elements in a list
      7. Data frames
        1. Accessing elements in data frames
        2. Functions of data frames
      8. Importing or exporting data
      9. Working with functions
    4. Controlling code flow
    5. All about R packages
      1. Installing packages
      2. Necessary packages
    6. Taking further steps
      1. Background on the financial crisis
    7. Summary
  7. Predicting Failures of Banks - Data Collection
    1. Collecting financial data
      1. Why FDIC?
      2. Listing files
      3. Finding files
      4. Combining results
      5. Removing tables
      6. Knowing your observations
      7. Handling duplications
      8. Operating our problem
    2. Collecting the target variable
    3. Structuring data
    4. Summary
  8. Predicting Failures of Banks - Descriptive Analysis
    1. Data overview
      1. Getting acquainted with our variables
        1. Finding missing values for a variable
      2. Converting the format of the variables
      3. Sampling
        1. Partitioning samples
        2. Checking samples
    2. Implementing descriptive analysis
      1. Dealing with outliers
        1. The winsorization process
          1. Implementing winsorization
      2. Distinguishing single valued variables
      3. Treating missing information
      4. Analyzing the missing value
      5. Understanding the results
    3. Summary
  9. Predicting Failures of Banks - Univariate Analysis
    1. Feature selection algorithm
      1. Feature selection classes
    2. Filter methods
    3. Wrapper methods
      1. Boruta package
    4. Embedded methods
      1. Ridge regression
        1. A limitation of Ridge regression
      2. Lasso
        1. Limitations of Lasso
      3. Elastic net
        1. Drawbacks of elastic net
    5. Dimensionality reduction
      1. Dimensionality reduction technique
    6. Summary
  10. Predicting Failures of Banks - Multivariate Analysis
    1. Logistic regression
    2. Regularized methods
    3. Testing a random forest model
    4. Gradient boosting
    5. Deep learning in neural networks
      1. Designing a neural network
      2. Training a neural network
    6. Support vector machines
      1. Selecting SVM parameters
        1. The SVM kernel parameter
        2. The cost parameter
        3. Gamma parameter
      2. Training an SVM model
    7. Ensembles
      1. Average model
      2. Majority vote
      3. Model of models
    8. Automatic machine learning
      1. Standardizing variables
    9. Summary
  11. Visualizing Economic Problems in the European Union
    1. A general overview of economic problems in countries
      1. Understanding credit ratings
      2. The role of credit rating agencies
      3. The credit rating process
    2. Clustering countries based on macroeconomic imbalances
      1. Data collection
        1. Downloading and viewing the data
        2. Streamlining data
      2. Studying the data
        1. Acquiring the target variable
        2. Acquiring the credit quality
        3. Displaying the credit ratings on a map
        4. Carrying out a descriptive analysis of data
      3. Detecting macroeconomic imbalances
        1. The self-organizing maps technique
        2. Training the SOM
    3. Summary
  12. Sovereign Crisis - NLP and Topic Modeling
    1. Predicting country ratings using macroeconomic information
    2. Implementing decision trees
      1. Ordered logistic regression
    3. Predicting sovereign ratings using European country reports
    4. Summary
  13. Other Books You May Enjoy
    1. Leave a review - let other readers know what you think

Product information

  • Title: Machine Learning with R Quick Start Guide
  • Author(s): Ivan Pastor Sanz
  • Release date: March 2019
  • Publisher(s): Packt Publishing
  • ISBN: 9781838644338