May 2017
Intermediate to advanced
310 pages
8h 5m
English
The columns in a data frame are known as its features. The rows are known as records or observations. Now examine the following data matrix. This data will be referenced in subsections so please do take note:
[[ 58. 1. 43.] [ 10. 200. 65.] [ 20. 75. 7.]]
Feature 1, with data 58, 10, 20, has its values lying between 10 and 58. For feature 2, the data lies between 1 and 200. Inconsistent results will be produced if we supply this data to any machine learning algorithm. Ideally, we will need to scale the data to a certain range in order to get consistent results.
Once again, a closer inspection reveals that each feature (or column) lies around different mean values. Therefore, what we would want to do is to align the features ...