Learning AWS - Second Edition by Amit Shah, Aurobindo Sarkar

Using AWS Glue and Amazon Athena

In this section, we will use AWS Glue to create a crawler, an ETL job, and a job that runs the KMeans clustering algorithm on the input data.

We use a publicly available dataset on students' knowledge status in a subject. The dataset and its field descriptions can be downloaded from the UCI site: https://archive.ics.uci.edu/ml/datasets/User+Knowledge+Modeling

  1. Log in to the AWS Management Console and go to the Glue console. Click on the Add crawler button.
  2. Specify the Crawler name as User Modeling Data Crawler, then click on the Next button.
  3. In the Add a data store screen, select S3
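
The console steps above could also be scripted. The following is a minimal sketch, not the book's method, assuming boto3 is installed and AWS credentials are configured. The crawler name comes from step 2; the IAM role ARN, database name, S3 path, and region are hypothetical placeholders that do not appear in the text and would need to be replaced with your own values.

```python
def build_crawler_request(name, role_arn, database, s3_path):
    """Assemble the keyword arguments for Glue's create_crawler call."""
    return {
        "Name": name,
        "Role": role_arn,
        "DatabaseName": database,
        # An S3 data store, matching the choice made in the console
        "Targets": {"S3Targets": [{"Path": s3_path}]},
    }


def create_crawler(request, region_name="us-east-1"):
    """Send the request to AWS Glue (needs boto3 and valid credentials)."""
    import boto3  # imported lazily so the builder above works without it

    glue = boto3.client("glue", region_name=region_name)
    glue.create_crawler(**request)


request = build_crawler_request(
    name="User Modeling Data Crawler",
    # Hypothetical values: substitute your own role, database, and bucket
    role_arn="arn:aws:iam::123456789012:role/ExampleGlueCrawlerRole",
    database="user_modeling_db",
    s3_path="s3://example-bucket/user-modeling/",
)

# Uncomment to create the crawler in your account:
# create_crawler(request)
```

Keeping the request builder separate from the AWS call lets you inspect or log the parameters before anything is created in your account.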
