Chapter 13. Big Data, Analytics, and Machine Learning Services
In the world of information technology, data is generated in huge volumes. This data can be as simple as the details of all registered users of an online food-ordering application, or it can be the real-time user actions captured in that application. Data generated at such volume is referred to as big data. If your use case is to store this data, you can use the storage solutions discussed in Chapter 10, chosen according to your requirements. This chapter focuses on how to process data at high volume: how can we generate insights from data already in storage, or from live streaming data, by running data analytics or ML models on top of it? For example, we might want to determine the most ordered food item by location, or the restaurant with the highest rating in a particular locality.
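To make the food-ordering example concrete, here is a minimal Python sketch of the two questions mentioned above, computed over a handful of in-memory records. The sample data, the record layout, and the function names are all hypothetical and exist only for illustration; at big-data scale the same grouping logic would run on a distributed processing or query service rather than in a single process.

```python
from collections import Counter, defaultdict

# Hypothetical sample records: (location, restaurant, item, rating).
orders = [
    ("Seattle", "Noodle House", "ramen", 4.5),
    ("Seattle", "Noodle House", "ramen", 4.0),
    ("Seattle", "Pizza Corner", "pizza", 3.5),
    ("Austin", "Taco Stop", "taco", 5.0),
    ("Austin", "Taco Stop", "taco", 4.0),
    ("Austin", "Burger Barn", "burger", 4.8),
]


def most_ordered_item_by_location(records):
    """Return {location: item} for the most frequently ordered item."""
    counts = defaultdict(Counter)
    for location, _restaurant, item, _rating in records:
        counts[location][item] += 1
    return {loc: c.most_common(1)[0][0] for loc, c in counts.items()}


def top_rated_restaurant_by_location(records):
    """Return {location: restaurant} for the highest average rating."""
    ratings = defaultdict(lambda: defaultdict(list))
    for location, restaurant, _item, rating in records:
        ratings[location][restaurant].append(rating)

    def average(values):
        return sum(values) / len(values)

    return {
        location: max(by_restaurant, key=lambda r: average(by_restaurant[r]))
        for location, by_restaurant in ratings.items()
    }


if __name__ == "__main__":
    print(most_ordered_item_by_location(orders))
    print(top_rated_restaurant_by_location(orders))
```

Both functions are simple group-by aggregations (a count and an average per location), which is the general shape of many analytics queries regardless of the engine that executes them.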
The first part of this chapter introduces you to AWS big data, live streaming, and analytics services such as Amazon Elastic MapReduce (EMR), AWS Glue, Amazon Athena, Amazon QuickSight, and Amazon Redshift. The second part explores how you can run ML workloads on the AWS cloud and the services that support them.
AWS Big Data and Analytics
Information is vital to making business decisions and serving our customers better, but the volume of data is growing rapidly, from terabytes to petabytes and beyond. The variety of data is also increasing: data can arrive in any form, whether structured, semi-structured, or unstructured. We require specific tools to store and process big data. Traditional ...