4

Fine-Tuning Whisper for Domain and Language Specificity

OpenAI’s Whisper is a major advance in automatic speech recognition (ASR), transcribing speech into text with high accuracy. However, as with any machine learning model, its out-of-the-box performance is limited in niche contexts, because it can only recognize vocabulary that is represented in its training data. For example, a newly coined term such as COVID-19 cannot be transcribed reliably until the model has been trained on audio that contains it. Similarly, accurately transcribing the names of key figures and places associated with the Russia–Ukraine conflict requires training data that includes those names.

Thus, to fully tap into this model’s potential, we must customize it for specific situations. This chapter will uncover techniques for adapting Whisper’s ...
