Chapter 10. Seeing AI: Using Azure Machine Learning and Cognitive Services in a Mobile App at Scale

In previous chapters we’ve looked at how to use Azure Machine Learning and Cognitive Services. We’ve shown you how Microsoft runs Cognitive Services so it scales to massive numbers of users. But what does that look like inside a real-world application? What kind of architecture do you need to take advantage of Azure AI services? And how ambitious can you be when it comes to the kind of problems you can solve with AI services?

How about describing, in real time, the world around millions of blind users?

In this chapter, we’re going to look at how the Seeing AI app uses Azure Machine Learning and Cognitive Services to tell blind users what’s in front of them, using a mix of prebuilt and custom models, running locally and in the cloud.

Think of the app as a talking camera: it can read out short sections of text like signs and labels, capture and recognize longer documents like forms and menus, and even read handwriting. It recognizes faces the user knows and describes the people around them; it can also describe what’s in an image or narrate what’s going on, like someone playing football in the park. It can describe a color, emit audio tones to indicate how dark or bright the scene is, beep to help users scan a barcode to get information about the box or tin they’re holding, and identify different bank notes when it’s time to pay for something. Machine learning services on Azure provide ...
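Most of these capabilities map onto Cognitive Services APIs. As a rough illustration of the cloud side of that mix, here is a minimal sketch (not the Seeing AI code itself) that calls the Azure Computer Vision Describe Image endpoint to turn a photo into a caption; the ENDPOINT and KEY values are placeholders for a Computer Vision resource you would provision yourself.

```python
# A minimal sketch (not Seeing AI's own code) of asking Azure Computer Vision
# to describe an image -- the kind of call that backs scene description.
# ENDPOINT and KEY are placeholders for your own Computer Vision resource.
import requests

ENDPOINT = "https://<your-resource>.cognitiveservices.azure.com"
KEY = "<your-computer-vision-key>"

def describe_image(image_path: str) -> str:
    """Send a local photo to the Describe Image API and return the top caption."""
    with open(image_path, "rb") as f:
        image_bytes = f.read()

    response = requests.post(
        f"{ENDPOINT}/vision/v3.2/describe",
        headers={
            "Ocp-Apim-Subscription-Key": KEY,
            "Content-Type": "application/octet-stream",
        },
        params={"maxCandidates": "1", "language": "en"},
        data=image_bytes,
    )
    response.raise_for_status()

    captions = response.json()["description"]["captions"]
    if not captions:
        return "No description available."
    top = captions[0]
    # e.g. "a person kicking a ball in a park (confidence 0.87)"
    return f"{top['text']} (confidence {top['confidence']:.2f})"

if __name__ == "__main__":
    print(describe_image("photo.jpg"))
```

In the app itself, a caption like this would be handed to a text-to-speech layer rather than printed, and, as this chapter goes on to discuss, some interactions run against local models on the device instead of calling the cloud.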
