Uninformed data

The following technique could be something of a game changer for how most modern data scientists work. While it is common to work with structured and unstructured text, it is far less common to work on raw binary data, largely because of the gap between computer science and data science. Text processing is limited to a standard set of operations that most practitioners will be familiar with: acquiring, parsing, storing, and so on. Instead of restricting ourselves to these operations, we will work directly with audio, transforming and enriching the uninformed signal data into an informed transcription. In doing so, we enable a new type of data pipeline, one that is analogous to teaching a computer to hear the voice in audio files.

A second ...
