A regression problem

Given some descriptors of a song, the goal of this problem is to predict the year in which the song was produced. This is a regression problem, since the target variable to predict is a number in the range from 1922 to 2011.

For each song, in addition to the year of production, 90 attributes are provided. All of them relate to the timbre: 12 describe the timbre average and 78 describe the timbre covariance. All the features are numerical (integer or floating-point numbers).
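The layout just described (one target plus 12 + 78 = 90 features) can be sketched in a few lines of NumPy. This is only an illustration: the column order (year first, then the averages, then the covariances), the helper name `split_columns`, and the synthetic rows are assumptions, not something the text specifies.

```python
import numpy as np

N_AVG = 12  # timbre average features (count taken from the text)
N_COV = 78  # timbre covariance features (count taken from the text)


def split_columns(data):
    """Split a 2-D array into (year, timbre_avg, timbre_cov).

    Assumption: column 0 holds the year, followed by the 12 averages
    and then the 78 covariances. The text does not state this order.
    """
    data = np.asarray(data)
    expected = 1 + N_AVG + N_COV
    if data.ndim != 2 or data.shape[1] != expected:
        raise ValueError(f"expected a 2-D array with {expected} columns")
    year = data[:, 0]
    timbre_avg = data[:, 1:1 + N_AVG]
    timbre_cov = data[:, 1 + N_AVG:]
    return year, timbre_avg, timbre_cov


# Synthetic stand-in rows, NOT real songs: random years in 1922..2011
# and random feature values, used only to exercise the function.
rng = np.random.default_rng(0)
fake = np.column_stack([
    rng.integers(1922, 2012, size=5),  # upper bound is exclusive
    rng.normal(size=(5, N_AVG + N_COV)),
])
year, timbre_avg, timbre_cov = split_columns(fake)
```

With real data, the same function would be applied to the loaded array, provided the file actually follows the assumed column order.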

The dataset is composed of more than half a million observations. In the competition behind the dataset, the authors tried to achieve the best results using the first 463,715 observations as a training set and the remaining ...
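An ordered split of this kind, where the first rows form the training set and whatever follows is held out, can be sketched as below. The function name `ordered_split` and the tiny demo arrays are hypothetical. Only the training-set size of 463,715 comes from the text, and the size of the remainder is deliberately left unspecified.

```python
import numpy as np

N_TRAIN = 463_715  # training-set size stated in the text


def ordered_split(X, y, n_train=N_TRAIN):
    """Return (X_train, y_train, X_rest, y_rest) without shuffling.

    The first n_train rows go to the training set; every later row
    goes to the remainder, preserving the original order.
    """
    X = np.asarray(X)
    y = np.asarray(y)
    if len(X) != len(y):
        raise ValueError("X and y must have the same number of rows")
    if not 0 < n_train < len(X):
        raise ValueError("n_train must be between 1 and len(X) - 1")
    return X[:n_train], y[:n_train], X[n_train:], y[n_train:]


# Tiny demo with made-up numbers, using a small n_train so it runs
# without the real dataset.
X_demo = np.arange(20).reshape(10, 2)
y_demo = np.arange(10)
X_tr, y_tr, X_rest, y_rest = ordered_split(X_demo, y_demo, n_train=7)
```

Slicing by position rather than sampling at random keeps the split reproducible and makes results comparable with others who use the same boundary.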
