Creating the dataset
In this chapter, we will take on the role of the bad guy. We want to create a program that can beat CAPTCHAs, allowing our comment spam program to advertise on someone's website. It should be noted that our CAPTCHAs will be a little easier than those used on the web today, and that spamming isn't a very nice thing to do.
Our CAPTCHAs will be individual English words of four letters only, as shown in the following image:
Our goal will be to create a program that can recover the word from images like this. To do this, we will use four steps:
- Break the image into individual letters.
- Classify each individual letter.
- Recombine the letters ...
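The first three steps listed above can be sketched on a toy example. This is a minimal illustration under stated assumptions, not the chapter's implementation: the 3x5 bitmap font `GLYPHS` and the helpers `render`, `segment`, `classify`, and `solve` are hypothetical names introduced here, the "image" is a plain grid of 0/1 pixels rather than a rendered and distorted picture, segmentation simply splits on blank columns, and the classifier is an exact template match standing in for a trained model.

```python
# Hypothetical 3x5 bitmap "font" covering only a few letters (assumption:
# a real CAPTCHA would be a rendered, distorted image, not a clean grid).
GLYPHS = {
    "A": ["010", "101", "111", "101", "101"],
    "B": ["110", "101", "110", "101", "110"],
    "E": ["111", "100", "110", "100", "111"],
    "T": ["111", "010", "010", "010", "010"],
}


def render(word, gap=1):
    """Draw a word as rows of 0/1 pixels, with `gap` blank columns between letters."""
    rows = []
    for r in range(5):
        row = []
        for i, ch in enumerate(word):
            if i:
                row.extend([0] * gap)
            row.extend(int(p) for p in GLYPHS[ch][r])
        rows.append(row)
    return rows


def segment(image):
    """Step 1: cut the image into letters wherever a column contains no ink."""
    width = len(image[0])
    letters, start = [], None
    # Iterate one position past the right edge so the last letter is closed.
    for x in range(width + 1):
        has_ink = x < width and any(row[x] for row in image)
        if has_ink and start is None:
            start = x
        elif not has_ink and start is not None:
            letters.append([row[start:x] for row in image])
            start = None
    return letters


def classify(letter):
    """Step 2: toy classifier, an exact match against the known glyphs."""
    key = ["".join(str(p) for p in row) for row in letter]
    for ch, glyph in GLYPHS.items():
        if glyph == key:
            return ch
    return "?"  # unknown shape


def solve(image):
    """Step 3: recombine the predicted letters into a word."""
    return "".join(classify(letter) for letter in segment(image))


if __name__ == "__main__":
    captcha = render("BEAT")
    print(solve(captcha))
```

Splitting on blank columns only works when letters never touch or overlap, which is why it suits this clean toy grid; noisier images would need a more robust segmentation step and a classifier that tolerates variation.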