IN THIS CHAPTER
Discovering why many guesses are better than one
Making uncorrelated trees work well together in Random Forests
Learning to map complex target functions piece by piece using boosting
Getting better predictions by averaging models
After discovering so many complex and powerful algorithms, you may be surprised to learn that a combination of simpler machine learning algorithms can often outperform the most sophisticated solutions. Such is the power of ensembles, groups of models made to work together to produce better predictions. The amazing thing about ensembles is that they are made up of algorithms that, taken singly, perform poorly.
Ensembles work much like the collective intelligence of crowds, in which a set of individually wrong answers, when averaged, yields the right one. Sir Francis Galton, the Victorian English statistician known for formulating the idea of correlation, recounted the story of a crowd at a county fair that correctly guessed the weight of an ox once everyone’s answers were averaged. You can find similar examples everywhere and easily recreate the experiment by asking friends to guess the number of sweets in a jar and averaging their answers. The more friends who participate in the game, the more precise the averaged answer becomes.
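You can simulate the sweets-in-a-jar experiment in a few lines of Python. This is a toy sketch, not real data: it assumes each friend's guess is the true count plus unbiased random noise, and the jar size and noise level are made-up numbers chosen for illustration.

```python
import random
import statistics

random.seed(42)  # make the simulation repeatable

TRUE_COUNT = 500   # actual number of sweets in the jar (arbitrary choice)
NOISE = 150        # spread of individual guessing errors (arbitrary choice)

def guess():
    # One friend's guess: the truth plus unbiased Gaussian error.
    return TRUE_COUNT + random.gauss(0, NOISE)

# As the crowd grows, the averaged guess drifts toward the true count.
for n in (5, 50, 5000):
    guesses = [guess() for _ in range(n)]
    avg = statistics.mean(guesses)
    print(f"{n:>5} friends: averaged guess = {avg:7.1f}, "
          f"error = {abs(avg - TRUE_COUNT):6.1f}")
```

Running the loop shows the averaged guess homing in on the true count as the number of participants grows, which is exactly the law-of-large-numbers effect the text describes.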
Luck isn’t what’s behind the result — it’s simply the law of large numbers in action (see more at