September 2015
Beginner to intermediate
608 pages
13h 43m
English
"More is different."
--Philip Warren Anderson
In the previous chapters, we used regression techniques to fit models to data. In Chapter 3, Correlation, for example, we built a linear model that used ordinary least squares and the normal equation to fit a straight line through the athletes' heights and log weights. In Chapter 4, Classification, we used Incanter's optimize namespace to minimize the logistic cost function and build a classifier for the Titanic's passengers. In this chapter, we'll apply similar analyses in a way that's suitable for much larger quantities of data.
We'll be working with a relatively modest dataset of only 100,000 records. This isn't big data (at 100 MB, it will fit comfortably in the memory of ...