9

Using Geospatial Data with Amazon EMR

In the previous chapter, we learned about machine learning with Amazon SageMaker, a powerful service for creating, testing, and tuning machine learning models. In this chapter, we will learn about Amazon EMR (formerly Amazon Elastic MapReduce), which is essentially a managed Hadoop cluster. Hadoop is a framework for the massively parallel processing of data: it splits a job across many machines and combines their results, which makes it practical to query petabytes of data on commodity hardware. Hadoop is also a community project, made up of many interoperable open source components rather than a single product. Among them are machine learning libraries that run on Hadoop, such as Apache Mahout and Spark's MLlib. In this chapter, we will walk ...
