September 2017
Beginner to intermediate
304 pages
7h 2m
English
To illustrate clustering techniques, we will use a dataset about delivery drivers. Its first few rows look like this:
$ head fleet_data.csv
Driver_ID,Distance_Feature,Speeding_Feature
3423311935,71.24,28.0
3423313212,52.53,25.0
3423313724,64.54,27.0
3423311373,55.69,22.0
3423310999,54.58,25.0
3423313857,41.91,10.0
3423312432,58.64,20.0
3423311434,52.02,8.0
3423311328,31.25,34.0
The first column, Driver_ID, contains anonymized identifiers for individual drivers. The second and third columns are the attributes that we will use for clustering. The Distance_Feature column is the mean distance driven per day, and Speeding_Feature is the mean percentage of time during which the driver is driving 5+ miles ...
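As a rough illustration of the column layout described above, the following sketch parses the sample rows into typed records and separates the identifier from the two clustering attributes. It uses Python's standard csv module as an assumption for illustration only; the names DriverRecord and parse_fleet are hypothetical, and the sample rows are inlined so the sketch runs without fleet_data.csv.

```python
import csv
import io
from typing import List, NamedTuple, TextIO

# The rows printed by `head fleet_data.csv`, inlined so that this
# sketch does not depend on the file being present.
SAMPLE = """Driver_ID,Distance_Feature,Speeding_Feature
3423311935,71.24,28.0
3423313212,52.53,25.0
3423313724,64.54,27.0
3423311373,55.69,22.0
3423310999,54.58,25.0
3423313857,41.91,10.0
3423312432,58.64,20.0
3423311434,52.02,8.0
3423311328,31.25,34.0
"""


class DriverRecord(NamedTuple):
    """One row of the dataset (hypothetical type, for illustration)."""
    driver_id: str    # anonymized identifier, kept as text
    distance: float   # Distance_Feature
    speeding: float   # Speeding_Feature


def parse_fleet(stream: TextIO) -> List[DriverRecord]:
    """Read CSV rows with a header line into DriverRecord values."""
    reader = csv.DictReader(stream)
    return [
        DriverRecord(
            driver_id=row["Driver_ID"],
            distance=float(row["Distance_Feature"]),
            speeding=float(row["Speeding_Feature"]),
        )
        for row in reader
    ]


records = parse_fleet(io.StringIO(SAMPLE))

# Only the two numeric attributes feed a clustering step; the ID
# is kept aside so clusters can be traced back to drivers later.
features = [(r.distance, r.speeding) for r in records]

print(len(records), "rows parsed; first feature pair:", features[0])
```

To read the real file instead of the inlined sample, the same function could be called as parse_fleet(open("fleet_data.csv", newline="")), assuming the file sits in the working directory.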