Chapter 3. Data Exploration Using RethinkDB

Data exploration is the process of analyzing and cleaning up structured or unstructured data, and it is commonly done before moving on to the actual data analysis. Operations such as removing duplicates and finding values that contain only whitespace can be done at the data exploration stage.
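As a rough illustration of the two operations just mentioned, here is a minimal sketch in plain Python rather than ReQL. The sample records, the field names, and the helper functions `find_blank_fields` and `drop_duplicates` are assumptions made for this example and do not come from RethinkDB.

```python
# Hypothetical sample records; the field names are assumptions for illustration.
records = [
    {"id": 1, "name": "Alice", "city": "New York"},
    {"id": 2, "name": "Bob", "city": "   "},
    {"id": 3, "name": "Alice", "city": "New York"},
    {"id": 4, "name": "", "city": "Boston"},
]


def find_blank_fields(rows):
    """Return (id, field) pairs whose string value is empty or whitespace-only."""
    return [
        (row["id"], field)
        for row in rows
        for field, value in row.items()
        if isinstance(value, str) and value.strip() == ""
    ]


def drop_duplicates(rows, keys):
    """Keep the first row for each distinct combination of the given keys."""
    seen = set()
    unique = []
    for row in rows:
        signature = tuple(row[k] for k in keys)
        if signature not in seen:
            seen.add(signature)
            unique.append(row)
    return unique


blank = find_blank_fields(records)
deduped = drop_duplicates(records, keys=("name", "city"))

print(blank)                        # [(2, 'city'), (4, 'name')]
print([r["id"] for r in deduped])   # [1, 2, 4]
```

Duplicates are judged here on `name` and `city` only, which is a design choice for the sketch; a real dataset would need its own definition of what counts as the same record.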

We can treat data exploration as a preliminary step before costly operations such as running batches and jobs. These operations are expensive in computing terms, and discovering irrelevant data only at that stage would be painful.

Data exploration can be very useful in various scenarios. Suppose you have a large dataset on the DNA diversity of people living in New York, or terabytes of data from NASA about Mars' temperature records. ...
