
Machine Learning For Dummies

Name: Machine Learning For Dummies
ISBN: 9781119245513

by John Paul Mueller, Luca Massaron

May 2016

Intermediate to advanced

432 pages

11h 16m

English

For Dummies

Cover
Introduction
About This Book; Foolish Assumptions; Icons Used in This Book; Beyond the Book; Where to Go from Here
Part 1: Introducing How Machines Learn
Chapter 1: Getting the Real Story about AI
Moving beyond the Hype; Dreaming of Electric Sheep; Overcoming AI Fantasies; Considering the Relationship between AI and Machine Learning; Considering AI and Machine Learning Specifications; Defining the Divide between Art and Engineering
Chapter 2: Learning in the Age of Big Data
Defining Big Data; Considering the Sources of Big Data; Specifying the Role of Statistics in Machine Learning; Understanding the Role of Algorithms; Defining What Training Means
Chapter 3: Having a Glance at the Future
Creating Useful Technologies for the Future; Discovering the New Work Opportunities with Machine Learning; Avoiding the Potential Pitfalls of Future Technologies
Part 2: Preparing Your Learning Tools
Chapter 4: Installing an R Distribution
Choosing an R Distribution with Machine Learning in Mind; Installing R on Windows; Installing R on Linux; Installing R on Mac OS X; Downloading the Datasets and Example Code
Chapter 5: Coding in R Using RStudio
Understanding the Basic Data Types; Working with Vectors; Organizing Data Using Lists; Working with Matrices; Interacting with Multiple Dimensions Using Arrays; Creating a Data Frame; Performing Basic Statistical Tasks
Chapter 6: Installing a Python Distribution
Choosing a Python Distribution with Machine Learning in Mind; Installing Python on Linux; Installing Python on Mac OS X; Installing Python on Windows; Downloading the Datasets and Example Code
Chapter 7: Coding in Python Using Anaconda
Working with Numbers and Logic; Creating and Using Strings; Interacting with Dates; Creating and Using Functions; Using Conditional and Loop Statements; Storing Data Using Sets, Lists, and Tuples; Defining Useful Iterators; Indexing Data Using Dictionaries; Storing Code in Modules
Chapter 8: Exploring Other Machine Learning Tools
Meeting the Precursors SAS, Stata, and SPSS; Learning in Academia with Weka; Accessing Complex Algorithms Easily Using LIBSVM; Running As Fast As Light with Vowpal Wabbit; Visualizing with Knime and RapidMiner; Dealing with Massive Data by Using Spark
Part 3: Getting Started with the Math Basics
Chapter 9: Demystifying the Math Behind Machine Learning
Working with Data; Exploring the World of Probabilities; Describing the Use of Statistics
Chapter 10: Descending the Right Curve
Interpreting Learning As Optimization; Exploring Cost Functions; Descending the Error Curve; Updating by Mini-Batch and Online
Chapter 11: Validating Machine Learning
Checking Out-of-Sample Errors; Getting to Know the Limits of Bias; Keeping Model Complexity in Mind; Keeping Solutions Balanced; Training, Validating, and Testing; Resorting to Cross-Validation; Looking for Alternatives in Validation; Optimizing Cross-Validation Choices; Avoiding Sample Bias and Leakage Traps
Chapter 12: Starting with Simple Learners
Discovering the Incredible Perceptron; Growing Greedy Classification Trees; Taking a Probabilistic Turn
Part 4: Learning from Smart and Big Data
Chapter 13: Preprocessing Data
Gathering and Cleaning Data; Repairing Missing Data; Transforming Distributions; Creating Your Own Features; Compressing Data; Delimiting Anomalous Data
Chapter 14: Leveraging Similarity
Measuring Similarity between Vectors; Using Distances to Locate Clusters; Tuning the K-Means Algorithm; Searching for Classification by K-Nearest Neighbors; Leveraging the Correct K Parameter
Chapter 15: Working with Linear Models the Easy Way
Starting to Combine Variables; Mixing Variables of Different Types; Switching to Probabilities; Guessing the Right Features; Learning One Example at a Time
Chapter 16: Hitting Complexity with Neural Networks
Learning and Imitating from Nature; Struggling with Overfitting; Introducing Deep Learning
Chapter 17: Going a Step beyond Using Support Vector Machines
Revisiting the Separation Problem: A New Approach; Explaining the Algorithm; Applying Nonlinearity; Illustrating Hyper-Parameters; Classifying and Estimating with SVM
Chapter 18: Resorting to Ensembles of Learners
Leveraging Decision Trees; Working with Almost Random Guesses; Boosting Smart Predictors; Averaging Different Predictors
Part 5: Applying Learning to Real Problems
Chapter 19: Classifying Images
Working with a Set of Images; Extracting Visual Features; Recognizing Faces Using Eigenfaces; Classifying Images
Chapter 20: Scoring Opinions and Sentiments
Introducing Natural Language Processing; Understanding How Machines Read; Using Scoring and Classification
Chapter 21: Recommending Products and Movies
Realizing the Revolution; Downloading Rating Data; Leveraging SVD
Part 6: The Part of Tens
Chapter 22: Ten Machine Learning Packages to Master
Cloudera Oryx; CUDA-Convnet; ConvNetJS; e1071; gbm; Gensim; glmnet; randomForest; SciPy; XGBoost
Chapter 23: Ten Ways to Improve Your Machine Learning Models
Studying Learning Curves; Using Cross-Validation Correctly; Choosing the Right Error or Score Metric; Searching for the Best Hyper-Parameters; Testing Multiple Models; Averaging Models; Stacking Models; Applying Feature Engineering; Selecting Features and Examples; Looking for More Data
About the Author
Advertisement Page
Connect with Dummies
End User License Agreement

Content preview from Machine Learning For Dummies

Chapter 12

Starting with Simple Learners

IN THIS CHAPTER

Trying a perceptron to separate classes by a line

Recursively partitioning training data with decision trees

Discovering the rules behind playing tennis and surviving the Titanic

Leveraging Bayesian probability to analyze textual data

Beginning with this chapter, the examples start illustrating the basics of how to learn from data. The plan is to touch on some of the simplest learning strategies first, providing some formulas (just those that are essential), intuitions about how they work, and examples in R and Python for experimenting with some of their most typical characteristics. The chapter begins by reviewing the use of the perceptron to separate classes.
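
To make "separating classes by a line" a little more concrete, here is a minimal perceptron sketch in plain Python. It is not the book's own listing: the function names, the toy data, and the epoch and learning-rate values are all assumptions chosen for illustration. It uses the classic perceptron rule, which adjusts the weights only when a training point lands on the wrong side of the current line.

```python
# Minimal perceptron sketch (illustrative only; not taken from the book).
# Function names, toy data, and hyperparameters are assumptions.

def predict(weights, bias, x):
    """Return +1 or -1 depending on which side of the line x falls."""
    activation = sum(w * xi for w, xi in zip(weights, x)) + bias
    return 1 if activation >= 0 else -1

def train_perceptron(samples, labels, epochs=20, learning_rate=1.0):
    """Perceptron rule: update the weights only on misclassified points."""
    weights = [0.0] * len(samples[0])
    bias = 0.0
    for _ in range(epochs):
        errors = 0
        for x, y in zip(samples, labels):
            if predict(weights, bias, x) != y:
                # Nudge the line toward classifying this point correctly.
                weights = [w + learning_rate * y * xi
                           for w, xi in zip(weights, x)]
                bias += learning_rate * y
                errors += 1
        if errors == 0:
            # Every training point is on the correct side; stop early.
            break
    return weights, bias

# Hypothetical, linearly separable toy data: two clusters in the plane.
X = [(1.0, 1.0), (2.0, 1.5), (1.5, 2.0), (4.0, 4.5), (5.0, 4.0), (4.5, 5.0)]
y = [-1, -1, -1, 1, 1, 1]

weights, bias = train_perceptron(X, y)
predictions = [predict(weights, bias, x) for x in X]
print(predictions == y)
```

The early exit relies on the data being linearly separable. When no line separates the classes, this simple rule never settles, so the loop just stops after the given number of epochs.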

At the root of every principal machine learning technique presented in the book is an algorithm based on somewhat interrelated linear combinations, variations of the sample splitting of decision trees, or some kind of Bayesian probabilistic reasoning. This chapter uses classification trees to demonstrate the technique. The only exception is the K-Nearest Neighbors (KNN) algorithm, which is based on analogical reasoning and is treated separately in a chapter devoted to detecting similarity in data (Chapter 14).

Getting a grasp on these basic techniques means being able to deal with more complex learning techniques later and being able to understand (and use) them better. It may appear incredible now, but you can create some of the most effective algorithms ...
