Amazon ML works on comma-separated values (.csv) files, a very simple format where each row is an observation and each column is a variable or attribute. There are, however, a few conditions that should be met:
- The data must be encoded in plain text using a character set, such as ASCII, Unicode, or EBCDIC
- All values must be separated by commas; if a value contains a comma, it should be enclosed in double quotes
- Each observation (row) must be no larger than 100 KB
There are also conditions regarding the end-of-line characters that separate rows. Special care must be taken when using Excel on OS X (Mac), as explained on this page: http://docs.aws.amazon.com/machine-learning/latest/dg/understanding-the-data-format-for-amazon-ml.html. ...
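As a rough illustration of the comma-quoting and row-size conditions listed above, the following Python sketch writes rows with the standard csv module and flags oversized lines. The helper names write_rows and oversized_rows are hypothetical, the 100 KB limit is taken to mean 100 * 1024 bytes, and the "\n" line terminator is only a placeholder assumption; check the linked page for the end-of-line rules that actually apply.

```python
import csv
import io

# Assumption: the per-row limit is interpreted as 100 * 1024 bytes.
MAX_ROW_BYTES = 100 * 1024


def write_rows(rows, lineterminator="\n"):
    """Serialize rows to CSV text.

    QUOTE_MINIMAL wraps a value in double quotes only when it needs it,
    for example when the value itself contains a comma.
    """
    buf = io.StringIO()
    writer = csv.writer(
        buf, quoting=csv.QUOTE_MINIMAL, lineterminator=lineterminator
    )
    writer.writerows(rows)
    return buf.getvalue()


def oversized_rows(csv_text, limit=MAX_ROW_BYTES, encoding="utf-8"):
    """Return the 1-based numbers of lines whose encoded size exceeds limit.

    Simplification: each physical line is treated as one observation,
    so quoted values with embedded line breaks are not handled.
    """
    return [
        number
        for number, line in enumerate(csv_text.splitlines(), start=1)
        if len(line.encode(encoding)) > limit
    ]


if __name__ == "__main__":
    text = write_rows([["id", "name"], [1, "Smith, John"]])
    print(text)
    print(oversized_rows(text))
```

Running a check like this before uploading a file can catch format problems early; it is not a substitute for the validation Amazon ML performs itself.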