Rapid data production with a multi-model database

Ingest the data you need in an agile manner.

By Joel Ruisi
February 20, 2018
Clockworks (source: Pixabay)

The demand for data consumption far exceeds the data production capabilities of most organizations. Is this because there is a shortage of data? Absolutely not! Data is everywhere, and it is generated faster than most IT systems can handle. One reason data-centric IT systems fail is that they must deal not only with a large amount of data, but also with a variety of data formats and models.

Traditionally, in order to handle this variety of data, different types of databases get “bolted” together into a single complex architecture. Then, processes for data integration and synchronization are added so that each database is kept up to date with the data it requires to do its specific job. On top of this, we need to add another layer of complexity to accommodate data security, data provenance, failover, redundancy, backup, etc. This polyglot infrastructure also requires a layer for exposing data “views” so that data can be consumed. Because of all this complexity, it becomes difficult to be agile in producing the data that consumers demand.

To deal with the problems inherent in the above architecture, you can use a multi-model database. A multi-model database enables system implementers to rapidly ingest data as is, and to expose data in an agile manner. It achieves this using a single, unified back end. This means that functions like security, data indexing, and synchronization are all handled the same way, even if the data models and structures are completely different. For example, we can ingest JSON data from an external data source, store the data in JSON documents, and expose the data as relational data via SQL queries. This is just one example of the agility that can be achieved with a multi-model database.
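
To make the JSON-to-SQL example concrete, here is a minimal sketch of the idea. It does not use any particular multi-model product's API: Python's built-in sqlite3 and json modules stand in for the unified back end, and the sample records, the documents table, and the json_field helper are all hypothetical names chosen for illustration. The records are stored unchanged as JSON text, and a SQL query projects them into rows and columns only when they are read.

```python
import json
import sqlite3

# Hypothetical records standing in for an external JSON feed.
# The second record carries an extra field; no schema change is needed to store it.
INCOMING = [
    {"id": 1, "name": "widget", "price": 9.5},
    {"id": 2, "name": "gadget", "price": 12.0, "tags": ["new"]},
]

# Relational projection over the stored documents (illustrative column choice).
PRODUCTS_SQL = """
    SELECT json_field(body, 'id')    AS id,
           json_field(body, 'name')  AS name,
           json_field(body, 'price') AS price
    FROM documents
    ORDER BY id
"""


def json_field(doc, key):
    """Return one top-level scalar field of a JSON document, or None if absent."""
    return json.loads(doc).get(key)


def build_store(records):
    """Ingest records as-is: each one is kept as a JSON document in a single column."""
    conn = sqlite3.connect(":memory:")
    # Register the helper so SQL queries can reach inside the stored documents.
    conn.create_function("json_field", 2, json_field)
    conn.execute("CREATE TABLE documents (body TEXT NOT NULL)")
    conn.executemany(
        "INSERT INTO documents (body) VALUES (?)",
        [(json.dumps(record),) for record in records],
    )
    return conn


def query_products(conn):
    """Expose the JSON documents as relational rows via plain SQL."""
    return conn.execute(PRODUCTS_SQL).fetchall()


if __name__ == "__main__":
    store = build_store(INCOMING)
    for row in query_products(store):
        print(row)
```

The point of the sketch is the ordering of the work: ingestion requires no modeling, and the relational shape is decided at query time. A real multi-model database would also apply its indexing and security to both representations, which this stand-in does not attempt.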

Using a multi-model database platform results in a less complex architecture that can take in many types of data models, quickly adapt, and produce the data required by end users. This data-centric approach means that as existing requirements change or new requirements are discovered, data can quickly be brought in and exposed in many different formats with little or no upfront data modeling effort required.

Because the data is already “there,” IT organizations that use a multi-model database platform are seeing rapid time to market, which is really what agility is all about.

This post is a collaboration between O’Reilly and MarkLogic. See our statement of editorial independence.
