To illustrate the concepts in this chapter, we will use the bike sharing dataset. It contains hourly records of the number of bicycle rentals in the Capital Bikeshare system, along with variables describing the date, time, weather, season, and holidays.
Extracting features from the bike sharing dataset
The dataset is available at http://archive.ics.uci.edu/ml/datasets/Bike+Sharing+Dataset.
Click on the Data Folder link, and then download the Bike-Sharing-Dataset.zip file.
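Once the archive is downloaded, a first look at the data might go as follows. This is a minimal sketch, not the book's own code: the helper name load_hourly_records is hypothetical, and it assumes that the archive holds the hourly records in a member named hour.csv and that the ZIP file sits in the current working directory.

```python
import csv
import io
import os
import zipfile


def load_hourly_records(zip_path, member="hour.csv"):
    """Read one CSV member of the downloaded archive into a list of dicts.

    The member name "hour.csv" is an assumption about the archive layout.
    """
    with zipfile.ZipFile(zip_path) as archive:
        with archive.open(member) as raw:
            # ZipFile.open yields bytes, so wrap it for the text-based csv module.
            text = io.TextIOWrapper(raw, encoding="utf-8", newline="")
            # Materialize the rows before the archive is closed.
            return list(csv.DictReader(text))


if __name__ == "__main__":
    path = "Bike-Sharing-Dataset.zip"  # assumed download location
    if os.path.exists(path):
        records = load_hourly_records(path)
        print(len(records), "hourly records")
        print(records[0])  # the first row, keyed by column name
```

Every value comes back as a string, so numeric columns still need to be converted before they can be used as features.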
The bike sharing data was enriched with weather and seasonal data by Hadi Fanaee-T at the University of Porto and used in the following paper: Fanaee-T, Hadi and Gama, Joao, Event labeling combining ensemble detectors and background knowledge, ...