Foreword
In my lab at the University of Rochester, we spent over a decade building AI systems that listen to a patient’s voice and watch their facial movements to detect early signs of Parkinson’s disease and autism, often before a clinician ever sees them. The models we built were sophisticated. The algorithms were sound. But the hardest problem was never the model. It was the data.
We learned this lesson the way most researchers do: painfully. Our early systems would perform beautifully on curated datasets and then fall apart in the real world—not because the neural networks were wrong, but because the data feeding them was incomplete, inconsistent, or stripped of the context that gave it meaning. A voice recording without metadata about the patient’s medication timing was just noise. A facial expression without the conversational context was ambiguous at best, misleading at worst. The signal was always there—the data just wasn’t ready to reveal it.
That experience, repeated across clinical studies, national-scale health AI deployments in Saudi Arabia, and advisory work with the National Academies, has given me a deep conviction: the organizations that will lead in the AI era are not the ones with the most powerful models. They are the ones with the most deliberately architected data.
This is precisely the argument that Navnit, Kien, Srikanth, and Harsha make in AI-Ready Data Blueprints, and they make it with a clarity and practical depth that is rare in technical writing.