We will start by creating the datasource. Let's first see what parameters are needed by generating the following skeleton:
$ aws machinelearning create-data-source-from-s3 --generate-cli-skeleton
This generates the following JSON object:
{
    "DataSourceId": "",
    "DataSourceName": "",
    "DataSpec": {
        "DataLocationS3": "",
        "DataRearrangement": "",
        "DataSchema": "",
        "DataSchemaLocationS3": ""
    },
    "ComputeStatistics": true
}
The parameters are mostly self-explanatory, and further information can be found in the AWS documentation at http://docs.aws.amazon.com/cli/latest/reference/machinelearning/create-data-source-from-s3.html.
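To make the skeleton more concrete, here is a minimal sketch of how it could be filled in and saved to a file. The bucket name, file paths, datasource ID, datasource name, and output filename are all hypothetical placeholders, not values from the text. The sketch assumes the schema is stored in S3, so it sets DataSchemaLocationS3 and leaves out the inline DataSchema as well as the optional DataRearrangement.

```python
import json

# Every value below is a hypothetical placeholder chosen for illustration.
params = {
    "DataSourceId": "ds-example-001",
    "DataSourceName": "Example datasource",
    "DataSpec": {
        # Location of the input data file in S3 (assumed path).
        "DataLocationS3": "s3://example-bucket/data/train.csv",
        # Location of the schema file in S3 (assumed path); used here
        # instead of passing the schema inline through "DataSchema".
        "DataSchemaLocationS3": "s3://example-bucket/data/train.csv.schema",
    },
    "ComputeStatistics": True,
}


def write_params(path, payload):
    """Serialize the parameters to a JSON file and return its path."""
    with open(path, "w", encoding="utf-8") as fh:
        json.dump(payload, fh, indent=4)
    return path


if __name__ == "__main__":
    out = write_params("dsrc_example.json", params)
    print("Wrote", out)
```

The resulting file could then be passed to the CLI through the --cli-input-json option, for instance:
$ aws machinelearning create-data-source-from-s3 --cli-input-json file://dsrc_example.json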
A word on the schema: when creating a datasource from the web interface, you have the possibility ...