8

Diarizing Speech with WhisperX and NVIDIA’s NeMo

Welcome to Chapter 8, where we will explore speaker diarization. Whisper has proven to be a powerful tool for transcribing speech, but a transcript alone does not say who spoke each part. Speaker diarization fills that gap by identifying speech segments and attributing each one to a speaker, which makes it possible to analyze multispeaker conversations. This chapter shows how Whisper can be integrated with current diarization techniques to provide these capabilities.
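To make the idea of attributing speech segments to speakers concrete, here is a minimal sketch in plain Python. It is not the WhisperX or NeMo API. It assumes we already have two hypothetical inputs: transcript segments with timestamps (the kind of output a transcription step produces) and speaker turns with timestamps (the kind of output a diarization step produces). The names `Segment`, `Turn`, and `assign_speakers`, as well as the sample data, are illustrative assumptions. Each segment is given the speaker whose turn overlaps it the most in time.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    """A transcribed span of audio (hypothetical structure)."""
    start: float
    end: float
    text: str


@dataclass
class Turn:
    """A span of audio attributed to one speaker (hypothetical structure)."""
    start: float
    end: float
    speaker: str


def overlap(a_start, a_end, b_start, b_end):
    """Return the duration in seconds shared by two intervals (0.0 if none)."""
    return max(0.0, min(a_end, b_end) - max(a_start, b_start))


def assign_speakers(segments, turns):
    """Label each segment with the speaker whose turn overlaps it the most."""
    labeled = []
    for seg in segments:
        best_speaker, best_overlap = "UNKNOWN", 0.0
        for turn in turns:
            shared = overlap(seg.start, seg.end, turn.start, turn.end)
            if shared > best_overlap:
                best_speaker, best_overlap = turn.speaker, shared
        labeled.append((best_speaker, seg.text))
    return labeled


# Made-up sample data, for illustration only.
segments = [
    Segment(0.0, 2.5, "Hi, thanks for joining."),
    Segment(2.7, 5.0, "Happy to be here."),
    Segment(9.0, 10.0, "(inaudible)"),
]
turns = [
    Turn(0.0, 2.6, "SPEAKER_00"),
    Turn(2.6, 5.2, "SPEAKER_01"),
]

for speaker, text in assign_speakers(segments, turns):
    print(f"{speaker}: {text}")
```

A segment that overlaps no speaker turn keeps the placeholder label `UNKNOWN`. Real pipelines typically refine this with word-level timestamps, but the maximum-overlap rule is enough to show how transcription and diarization outputs can be combined.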

We will start by exploring the evolution of speaker diarization systems, from the limitations of early approaches to the transformative ...

