Chapter 5. Audio
This chapter explores the practical side of implementing audio-related AI features in your Swift apps. Taking a top-down approach, we explore two audio tasks and how to implement them using Swift and various AI tools.
Audio and Practical AI
Here are the two audio-related practical AI tasks that we explore in this chapter:
- Speech Recognition
Making a computer understand human words is incredibly useful. You can take dictation or order a computer around.
- Sound Classification
Classification is going to crop up repeatedly in this book. We build a sound classifier app that can tell us what animal sound we’re listening to.
Images might be the trendy topic that triggered an explosion of deep learning, machine learning, and artificial intelligence (AI) features in products, and activity classification might be a novel way to use the myriad sensors in a modern iOS device, but sound is one of the real stars of practical machine learning. Almost everyone has used a sound-based feature on their mobile device at least once (think of the music identification service Shazam), even before AI was (yet again) a buzzword.
Task: Speech Recognition
Speech recognition is one of those touchpoints of AI that most people have encountered at some point or another: whether on a phone call with an irritating robot that's trying to understand your voice, or while actively using your computer with assistive and accessibility technologies, speech recognition has been pervasive a lot ...
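As a preview of where this task is heading, here is a minimal sketch of transcribing a prerecorded audio file with Apple's Speech framework. It assumes speech-recognition authorization has already been requested and granted, and that `audioFileURL` (a hypothetical name) points at a real recording:

```swift
import Speech

/// A sketch, not a finished implementation: transcribe a prerecorded
/// audio file using Apple's Speech framework. Assumes authorization
/// has been granted via SFSpeechRecognizer.requestAuthorization(_:).
func transcribe(audioFileURL: URL) {
    // The initializer is failable, and availability depends on the
    // locale and on network/on-device support.
    guard let recognizer = SFSpeechRecognizer(locale: Locale(identifier: "en-US")),
          recognizer.isAvailable else {
        print("Speech recognition is not available for this locale")
        return
    }

    let request = SFSpeechURLRecognitionRequest(url: audioFileURL)

    recognizer.recognitionTask(with: request) { result, error in
        if let error = error {
            print("Recognition failed: \(error.localizedDescription)")
            return
        }
        // Partial results arrive as recognition progresses; the
        // completed transcription is flagged with isFinal.
        if let result = result, result.isFinal {
            print(result.bestTranscription.formattedString)
        }
    }
}
```

We return to this API in detail later in the chapter; the point here is just that the heavy lifting, from acoustic model to transcription, is a single framework call.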