Chapter 5. Audio

This chapter explores the practical side of implementing audio-related AI features in your Swift apps. Taking a top-down approach, we look at two audio tasks and how to implement them using Swift and various AI tools.

Audio and Practical AI

Here are the two audio-related practical AI tasks that we explore in this chapter:

Speech Recognition

Making a computer understand human words is incredibly useful. You can take dictation or order a computer around.

Sound Classification

Classification is going to crop up repeatedly in this book. We build a sound classifier app that can tell us what animal sound we’re listening to (a brief code sketch follows this list).
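
To give a flavor of where we’re headed, here is a minimal sketch of file-based sound classification using Apple’s SoundAnalysis framework. The AnimalSoundClassifier model name is an assumption, standing in for a sound classifier you might train yourself (for example, with Create ML’s sound classifier template), and error handling is kept to a minimum.

```swift
import Foundation
import CoreML
import SoundAnalysis

// Receives classification results as the analyzer works through the file.
final class ResultsObserver: NSObject, SNResultsObserving {
    func request(_ request: SNRequest, didProduce result: SNResult) {
        guard let result = result as? SNClassificationResult,
              let best = result.classifications.first else { return }
        print("Heard \(best.identifier) (confidence \(best.confidence))")
    }

    func request(_ request: SNRequest, didFailWithError error: Error) {
        print("Sound analysis failed: \(error.localizedDescription)")
    }
}

// Classifies the sounds in an audio file with a Core ML sound classifier.
// AnimalSoundClassifier is a placeholder for a model you train yourself.
func classifyAnimalSounds(in audioFileURL: URL) throws {
    let model = try AnimalSoundClassifier(configuration: MLModelConfiguration()).model
    let request = try SNClassifySoundRequest(mlModel: model)

    let analyzer = try SNAudioFileAnalyzer(url: audioFileURL)
    let observer = ResultsObserver()
    try analyzer.add(request, withObserver: observer)

    // analyze() blocks until the whole file has been processed,
    // so the local observer stays alive for the duration.
    analyzer.analyze()
}
```

Because analyze() here runs synchronously, the local observer lives for the whole analysis; a streaming analyzer fed from a microphone would need the observer managed over a longer lifetime.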

Images might be the trendy hot topic that triggered an explosion of deep learning, machine learning, and artificial intelligence (AI) features in products, and activity classification might be a novel way of using the myriad sensors in a modern iOS device, but sound is one of the real stars of practical applications of machine learning. Almost everyone has used a sound-based machine-learning feature on their mobile device at least once (the music identification service Shazam, for example), even before AI was (yet again) a buzzword.

Task: Speech Recognition

Speech recognition is one of those touchpoints of AI that most people have used at some point or another: whether it’s on a phone call with an irritating phone robot that’s trying to understand your voice, or actively using your computer with assistive and accessibility technologies, speech recognition has been pervasive a lot ...
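
As a preview of the pieces involved, here is a minimal sketch of file-based speech recognition using Apple’s Speech framework. The en-US locale and the audio file URL are assumptions; a shipping app would also need to declare NSSpeechRecognitionUsageDescription in its Info.plist and handle the denied and restricted authorization states properly.

```swift
import Speech

// Transcribes the speech in an audio file and prints the final result.
func transcribe(contentsOf audioFileURL: URL) {
    SFSpeechRecognizer.requestAuthorization { status in
        guard status == .authorized else {
            print("Speech recognition not authorized")
            return
        }

        // The en-US locale is an assumption; use the user's locale in practice.
        guard let recognizer = SFSpeechRecognizer(locale: Locale(identifier: "en-US")),
              recognizer.isAvailable else {
            print("Speech recognizer unavailable")
            return
        }

        let request = SFSpeechURLRecognitionRequest(url: audioFileURL)
        _ = recognizer.recognitionTask(with: request) { result, error in
            if let result = result, result.isFinal {
                print("Transcription: \(result.bestTranscription.formattedString)")
            } else if let error = error {
                print("Recognition failed: \(error.localizedDescription)")
            }
        }
    }
}
```

Live, microphone-driven recognition follows the same shape, swapping the URL request for an SFSpeechAudioBufferRecognitionRequest fed from an AVAudioEngine tap.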
