
Effective Amazon Machine Learning by Alexis Perrier


Download the dataset from Redshift

The right way to download data from Redshift is to connect to the database with psql and use the UNLOAD command to dump the results of an SQL query to S3. The following command exports all the tweets to the s3://aml.packt/data/veggies/results/ location, using an IAM role that is attached to the cluster and allowed to write to that bucket:

unload ('select * from tweets') to 's3://aml.packt/data/veggies/results/' iam_role 'arn:aws:iam::0123456789012:role/MyRedshiftRole';
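When the query contains string literals, the single quotes inside it have to be doubled, because UNLOAD takes the whole query as a quoted string. The following is a minimal Python sketch of building the statement with that escaping applied. The helper name build_unload_statement is hypothetical, and the sketch only formats text: it does not connect to Redshift and should only be used with queries we trust.

```python
def build_unload_statement(query, s3_prefix, iam_role):
    """Return a Redshift UNLOAD statement as a string (hypothetical helper)."""
    # UNLOAD receives the query as a quoted string, so every single
    # quote inside the query has to be doubled.
    escaped_query = query.replace("'", "''")
    return (
        f"unload ('{escaped_query}') "
        f"to '{s3_prefix}' "
        f"iam_role '{iam_role}';"
    )


# Rebuild the command shown above from its three parts.
statement = build_unload_statement(
    "select * from tweets",
    "s3://aml.packt/data/veggies/results/",
    "arn:aws:iam::0123456789012:role/MyRedshiftRole",
)
print(statement)
```

The printed statement can be pasted into a psql session connected to the cluster.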

By default, UNLOAD writes the results in parallel as several part files. We can then download these files and aggregate them:

# Download
$ aws s3 cp s3://aml.packt/data/veggies/results/0000_part_00 data/
$ aws s3 cp s3://aml.packt/data/veggies/results/0001_part_00 data/
# Combine
$ cp data/0000_part_00 data/veggie_tweets.tmp
$ cat data/0001_part_00 >> data/veggie_tweets.tmp
...
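The combine step can also be scripted so that it works for any number of part files. The following is a minimal Python sketch, assuming the parts have already been downloaded into a local directory and follow the 0000_part_00 naming shown above. The function name combine_parts and the default filename pattern are our own assumptions, not part of any AWS tool.

```python
import glob
import os
import shutil


def combine_parts(parts_dir, output_path, pattern="*_part_*"):
    """Concatenate every part file found in parts_dir into output_path.

    Parts are appended in sorted filename order, which matches the
    0000_part_00, 0001_part_00, ... naming of the downloaded files.
    Returns the list of part files that were combined.
    """
    part_paths = sorted(glob.glob(os.path.join(parts_dir, pattern)))
    with open(output_path, "wb") as combined:
        for part_path in part_paths:
            with open(part_path, "rb") as part:
                # Stream each part so large files are not loaded into memory.
                shutil.copyfileobj(part, combined)
    return part_paths
```

Calling combine_parts("data", "data/veggie_tweets.tmp") would produce the same file as the cp and cat commands above, since the output filename does not match the part pattern and is therefore never read back in.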
