Chapter 3. Unified Analytics Architecture

In the last few years, multiple factors have converged, paving the way to the dream of the last decade: a unified analytics architecture, a single architecture that enables the aggregation, analysis, and modeling of the full gamut of a company’s data. This has the potential to revolutionize the development of ML models and the organization and processing of data. The factors that enabled this are far-reaching, but one powerful contributing factor is the universal object storage and elasticity of the cloud.

Prevalence of the Cloud

The cloud has been a core driver of unification in data storage, analytics, modeling, and governance. It abstracted away the complex management of servers and of distributed compute and storage, enabling millions of teams to thrive that otherwise wouldn’t have had the IT support to run these workloads. Most major data storage players (including Snowflake, Cloudera, Yellowbrick, Databricks, and Vertica) have worked tirelessly to support cloud deployments of their stacks, with a few top players offering on-premises, cloud, and hybrid support out of the box for companies with complex and varying requirements.

Cloud analytics does have drawbacks, such as periodic high latency caused by the “noisy neighbor” situation (lots of people using the same cloud network at the same time), but that’s not stopping anyone. Companies have emerged that are powered by these storage mechanisms and are implementing ...
