2
Building and Using a Dataset
The data collection and curation process is one of the most important stages in model building. It is also one of the most time-consuming. Typically, data can come from many sources, such as customer records, transaction data, or stock lists. Nowadays, the combination of big data, fast, high-capacity SSDs (to store it), and GPUs (to process it) makes it easier for individuals to collect, store, and process data.
In this chapter, you will learn how to find and access pre-existing, ready-made data sources that can be used to train your model. We will also look at ways to create your own datasets and to transform datasets so that they are useful for your problem, and we will see how ...