Effective Amazon Machine Learning by Alexis Perrier

Finding open datasets

Many dataset repositories are available online, maintained by local and global public institutions, non-profit organizations, and data-focused start-ups. Below is a short list of open dataset resources that are well suited to predictive analytics. The list is by no means exhaustive.

This Quora thread points to many other interesting data sources: https://www.quora.com/Where-can-I-find-large-datasets-open-to-the-public. You can also ask for specific datasets on Reddit at https://www.reddit.com/r/datasets/.
  • The UCI Machine Learning Repository is a collection maintained by UC Irvine since 1987. It hosts over 300 datasets for classification, clustering, regression, and other ML tasks: https://archive.ics.uci.edu/ml/ ...
