Creating the datasource

We start by creating the datasource. To see which parameters the command expects, first generate a request skeleton:

$ aws machinelearning create-data-source-from-s3 --generate-cli-skeleton

This generates the following JSON object:

{
    "DataSourceId": "",
    "DataSourceName": "",
    "DataSpec": {
        "DataLocationS3": "",
        "DataRearrangement": "",
        "DataSchema": "",
        "DataSchemaLocationS3": ""
    },
    "ComputeStatistics": true
}
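As a rough sketch of how this skeleton might be filled in and used, the snippet below writes a completed request to a local file. The file name dsrc_example.json, the datasource ID and name, and the S3 paths are all hypothetical placeholders, not values from this project. It assumes the schema is stored in S3, so DataSchemaLocationS3 is supplied instead of an inline DataSchema, and it leaves out the optional DataRearrangement field.

```shell
# Write a filled-in request file. Every value here is a placeholder:
# replace the ID, name, and S3 paths with your own.
cat > dsrc_example.json <<'EOF'
{
    "DataSourceId": "example-datasource-001",
    "DataSourceName": "Example datasource",
    "DataSpec": {
        "DataLocationS3": "s3://example-bucket/data/example.csv",
        "DataSchemaLocationS3": "s3://example-bucket/data/example.csv.schema"
    },
    "ComputeStatistics": true
}
EOF

# Show the request that would be sent.
cat dsrc_example.json

# Submitting the request needs configured AWS credentials, so it is left commented out:
# aws machinelearning create-data-source-from-s3 --cli-input-json file://dsrc_example.json
```

Keeping the request in a file makes it easy to version and reuse; the --cli-input-json option reads the same structure that --generate-cli-skeleton produces.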

The parameters are mostly self-explanatory, and further information can be found in the AWS documentation at

A word on the schema: when creating a datasource from the web interface, you have the possibility ...

