Test Driven Machine Learning

Name: Test Driven Machine Learning
Author: Justin Bozonier
ISBN: 9781784399085

November 2015

Intermediate to advanced

190 pages

4h 11m

English

Packt Publishing

Table of Contents
Test-Driven Machine Learning
Credits
About the Author
About the Reviewers
www.PacktPub.com
Support files, eBooks, discount offers, and more
Why subscribe?
Free access for Packt account holders
Preface
What this book covers
What you need for this book
Who this book is for

Conventions
Reader feedback
Customer support
Downloading the example code
Downloading the color images of this book
Errata
Piracy
Questions
1. Introducing Test-Driven Machine Learning
Test-driven development
The TDD cycle
Red
Green
Refactor
Behavior-driven development
Our first test
The anatomy of a test
Given
When
Then
TDD applied to machine learning
Dealing with randomness
Different approaches to validating the improved models
Classification overview
Regression
Clustering
Quantifying the classification models
Summary
2. Perceptively Testing a Perceptron
Getting started
Summary
3. Exploring the Unknown with Multi-armed Bandits
Understanding a bandit
Testing with simulation
Starting from scratch
Simulating real world situations
A randomized probability matching algorithm
A bootstrapping bandit
The problem with straight bootstrapping
Multi-armed bandit throw down
Summary
4. Predicting Values with Regression
Refresher on advanced regression
Regression assumptions
Quantifying model quality
Generating our own data
Building the foundations of our model
Cross-validating our model
Generating data
Summary
5. Making Decisions Black and White with Logistic Regression
Generating logistic data
Measuring model accuracy
Generating a more complex example
Test driving our model
Summary
6. You're So Naïve, Bayes
Gaussian classification by hand
Beginning the development
Summary
7. Optimizing by Choosing a New Algorithm
Upgrading the classifier
Applying our classifier
Upgrading to Random Forest
Summary
8. Exploring scikit-learn Test First
Test-driven design
Planning our journey
Creating a classifier chooser (it needs to run tests to evaluate classifier performance)
Getting choosey
Developing testable documentation
Decision trees
Summary
9. Bringing It All Together
Starting at the highest level
The real world
What we've accomplished
Summary
Index

Content preview from Test Driven Machine Learning

Cross-validating our model

Now before we cheat and look at our answer key, let's see how well this solution does at predicting data it hasn't seen. To do this, I write the following fairly large test:

import pandas

def final_model_cross_validation_test():
    # Score the data the model was fitted on.
    df = pandas.read_csv('./generated_data.csv')
    df['predicted_dependent_var'] = 25.6266 \
        + 2.7083*df['ind_var_a'] \
        - 1.5527*df['ind_var_b'] \
        - 0.3917*df['ind_var_c'] \
        - 0.2006*df['ind_var_e'] \
        + 5.6450*df['ind_var_b'] * df['ind_var_c']
    # Absolute error of each prediction.
    df['diff'] = (df['dependent_var'] - df['predicted_dependent_var']).abs()
    print(df['diff'])
    print('===========')
    # Score the held-out data the model has not seen.
    cv_df = pandas.read_csv('./generated_data_cv.csv')
    cv_df['predicted_dependent_var'] = 25.6266 \
        + 2.7083*cv_df['ind_var_a'] \
        - 1.5527*cv_df['ind_var_b'] \
        - 0.3917*cv_df['ind_var_c'] ...
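
The preview cuts off before the test's assertions, so the sketch below is not a reconstruction of it. It only illustrates the general idea of checking a fixed-coefficient model against data it has not seen: score one data set, score a second independent one, and compare the mean absolute errors. The coefficients are copied from the preview. Everything else is an assumption made for illustration: the helper names (`predict`, `make_rows`, `mean_abs_error`), the sample size, the noise level, and the synthetic rows that stand in for `generated_data.csv` and `generated_data_cv.csv`. It uses only the standard library so that it runs without the CSV files.

```python
import random

COLUMNS = ('ind_var_a', 'ind_var_b', 'ind_var_c', 'ind_var_e')

def predict(row):
    """Apply the fixed coefficients from the preview to one row."""
    return (25.6266
            + 2.7083 * row['ind_var_a']
            - 1.5527 * row['ind_var_b']
            - 0.3917 * row['ind_var_c']
            - 0.2006 * row['ind_var_e']
            + 5.6450 * row['ind_var_b'] * row['ind_var_c'])

def make_rows(n, seed, noise_sd=1.0):
    """Hypothetical stand-in for the CSV files.

    Each row is drawn from the model itself plus Gaussian noise, so the
    model is correct by construction and only the noise remains as error.
    """
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        row = {name: rng.gauss(0.0, 1.0) for name in COLUMNS}
        row['dependent_var'] = predict(row) + rng.gauss(0.0, noise_sd)
        rows.append(row)
    return rows

def mean_abs_error(rows):
    """Average absolute difference between actual and predicted values."""
    return sum(abs(r['dependent_var'] - predict(r)) for r in rows) / len(rows)

# Two independent samples: one plays the fitted data, one the held-out data.
train_rows = make_rows(500, seed=1)
cv_rows = make_rows(500, seed=2)

train_mae = mean_abs_error(train_rows)
cv_mae = mean_abs_error(cv_rows)

print('training MAE: %.3f' % train_mae)
print('held-out MAE: %.3f' % cv_mae)
```

If the held-out error is close to the training error, the model is probably not overfitting. A held-out error that is much larger suggests the coefficients have been tuned to quirks of the training data. In a test, that comparison would normally become an assertion with a tolerance chosen for the problem at hand.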



