Here are some reasons for failure during training, along with their fixes:
- Failure 1 - S3 bucket related issue: Training fails with the error message "No S3 objects are found at the s3://DEMO-ObjectDetection/s3_train_data/ S3 URL given in the input data source. Please ensure that the bucket exists in the selected region (us-east-1), the objects exist under that S3 prefix, and the arn:aws:iam::11111111:role/service-role/AmazonSageMaker-ExecutionRole-xxxxxxx role has s3:ListBucket permissions on the DEMO-ObjectDetection bucket." Alternatively, S3 returns the error message "The specified bucket does not exist". Solution: Change the S3 bucket path, as described previously. Repeat this for each of the train, validation, annotation, and image data files.
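
Before restarting the training job, it can help to confirm that each S3 path actually contains objects. The following is a minimal sketch, not part of the original walkthrough: the helper names `split_s3_uri` and `prefix_has_objects` are hypothetical, and the client passed in is assumed to be a boto3 S3 client (or any object exposing a compatible `list_objects_v2` method).

```python
from urllib.parse import urlparse


def split_s3_uri(uri):
    """Split 's3://bucket/prefix/' into a (bucket, prefix) tuple."""
    parsed = urlparse(uri)
    if parsed.scheme != "s3" or not parsed.netloc:
        raise ValueError(f"Not an S3 URI: {uri!r}")
    # netloc keeps the bucket name's original case; path starts with '/'.
    return parsed.netloc, parsed.path.lstrip("/")


def prefix_has_objects(s3_client, uri):
    """Return True if at least one object exists under the given S3 URI.

    s3_client is assumed to be a boto3 S3 client created in the same
    region as the bucket.
    """
    bucket, prefix = split_s3_uri(uri)
    # MaxKeys=1 keeps the check cheap: one key is enough to prove the
    # prefix is not empty.
    response = s3_client.list_objects_v2(Bucket=bucket, Prefix=prefix, MaxKeys=1)
    return response.get("KeyCount", 0) > 0


# Example usage (assumes boto3 is installed and credentials are configured):
#
#   import boto3
#   s3 = boto3.client("s3", region_name="us-east-1")
#   print(prefix_has_objects(s3, "s3://DEMO-ObjectDetection/s3_train_data/"))
```

A `False` result points to an empty or mistyped prefix. A missing bucket or missing `s3:ListBucket` permission would instead surface as an exception raised by the client call, so the same check can be run once per path (train, validation, annotation, and image data) to see which one is misconfigured.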