Managing stateless microservices and stateful data stores

Use DC/OS to bridge the operational silos that form across web application architecture, system administration, and teams.

By Adam Michael Wood

June 6, 2017


Today’s highly scalable web applications are not monolithic code bases, but rather a disaggregated mix of stateless microservices and stateful data stores. This typical architectural pattern of modern web applications solves certain problems of scaling, data management, and development throughput. However, this solution leads to a new problem: siloing. Disaggregated architecture often leads to disaggregated administration and a disaggregated team.

In this article we’ll briefly explore the evolution of web application architecture, look more closely at the problem of siloing, and discuss how DC/OS, an open source application platform, can solve this problem.

A Little Background on Application Architecture

Web application architecture trends have evolved significantly in the last decade. These changes have been driven largely by the need to support three requirements of modern applications:

heavy and variable traffic loads;
increased collection and use of data;
high development velocity.

Meeting these requirements led developers to move away from conventional single-machine applications and to distribute the work across multiple servers. The first common pattern was to separate the database server from the primary application server. Once this pattern was established, it quite naturally informed the architectural patterns that followed. Two servers became three when analytics databases were separated from business databases. Separate object storage for static assets introduced a new component. Indexing and search servers might be added to the mix.

The logical result of this evolution is the typical architecture seen in many contemporary web apps: a mix of stateless, containerized microservices and stateful, high-volume data stores. This approach addresses all three of the requirements above:

Scalability. By definition, stateless services do not need to persist data from session to session. This means they can be replicated on demand, and each replica does not need to be aware of, or coordinate with, other replicas. Additionally, since core application functionality is composed of many different microservices, it is possible to scale out only those services that are heavily used.
Data management. Persisting data in one or more “big data” stores that are optimized for sharding across multiple virtual machines makes it relatively easy to scale up storage as needed. Perhaps more importantly, the wide range of options for persistent data storage allows developers to store each kind of data in a way suited to its purpose.
Development velocity. Microservices and data stores are composable, meaning they can be arranged and recombined into new applications. This makes it easier to develop new applications quickly.
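
The scalability point can be made concrete with a small sketch. It is illustrative only: the names SharedStore, StatelessReplica, and serve are hypothetical and do not come from DC/OS or any library. The sketch assumes a single external store that holds all persistent data, so any replica can answer any request and replicas never talk to each other.

```python
import itertools


class SharedStore:
    """Hypothetical stand-in for an external stateful data store."""

    def __init__(self):
        self._counts = {}

    def increment(self, key):
        self._counts[key] = self._counts.get(key, 0) + 1
        return self._counts[key]


class StatelessReplica:
    """A service replica that keeps no data of its own between requests."""

    def __init__(self, name, store):
        self.name = name
        self._store = store  # a reference to the shared store, not a private copy

    def handle(self, user):
        # All persistent state lives in the store, so any replica can serve any user.
        visits = self._store.increment(user)
        return f"{self.name}: {user} has visited {visits} time(s)"


def serve(requests, replicas):
    """Spread requests over replicas round-robin; replicas never coordinate."""
    rotation = itertools.cycle(replicas)
    return [next(rotation).handle(user) for user in requests]


store = SharedStore()
replicas = [StatelessReplica(f"replica-{i}", store) for i in range(3)]
responses = serve(["ana", "ben", "ana", "ana"], replicas)
for line in responses:
    print(line)
```

Each of the three requests from "ana" lands on a different replica, yet the visit count stays correct because the count lives in the store. Scaling out in this sketch means adding more entries to the replicas list; nothing else changes.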

For all its benefits, though, modern application architecture creates its own set of problems.

Siloing

Each of these architectural components has different infrastructure demands, different deployment methods, different scaling needs, and different maintenance challenges. The development of a diverse set of operational procedures to handle these needs is inevitable. This often leads to siloing, in which separate teams handle these separate operations.

Team siloing is a typical organizational response to handling diverse operational needs. It often makes sense early on, as it allows people to focus on particular problems and develop expertise. In the long run, however, this type of siloing is inefficient, and adds complexity and bureaucratic overhead.

Technology siloing can be just as problematic as team siloing. While separation of concerns and loose coupling are strong benefits of a microservice architecture, the need to understand and interact with the system as a whole eventually arises. Whether for testing, deployment automation, auditing, or analytics, a host of ad hoc administrative procedures accrete to the system. These tend to tighten the coupling between services.

Bridging the Gap

An alternative to ad hoc administrative procedures and team siloing is a single platform that provides the benefits of a disaggregated application architecture with a unified administrative interface. One such platform is DC/OS, an open source Datacenter Operating System that allows you to deploy and manage both containers and big data services on pooled compute resources.

DC/OS composes a set of Linux systems into a single cluster. Applications are composed of multiple packages that run at the DC/OS layer. Developers are able to build their own packages, but a number of packages are already available. These include relational and non-relational databases, file systems, object stores, web servers, message queues, data processing engines, and machine learning tools. Docker is also available as a DC/OS package, providing a scalable container platform for custom or off-the-shelf applications that implement core business functionality.

Master nodes orchestrate work across the cluster, delegating computing tasks and storage needs to agent nodes based on their available resources. Scheduling of work onto agent nodes is managed by the open source Apache Mesos distributed system kernel. Apache Mesos runs a two-level scheduling process. At one level, Mesos tracks the resources available on the agent nodes and offers them to the services running on DC/OS. At the other level, each service's own scheduler decides which of the offered resources to accept and which tasks to run on them.
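
To illustrate the two-level idea, here is a deliberately simplified sketch. It is not the Mesos API: Agent, Task, Framework, and run_offer_cycle are hypothetical names, and the sketch assumes that resources are offered to frameworks in a fixed order, whereas a real cluster applies a fairness policy when deciding who receives an offer.

```python
from dataclasses import dataclass, field


@dataclass
class Agent:
    name: str
    cpus: float
    mem_mb: int


@dataclass
class Task:
    name: str
    cpus: float
    mem_mb: int


@dataclass
class Framework:
    """One level: a service's own scheduler picks tasks that fit an offer."""
    name: str
    pending: list = field(default_factory=list)

    def consider(self, offer):
        accepted = []
        cpus, mem = offer.cpus, offer.mem_mb
        for task in list(self.pending):  # iterate over a copy while removing
            if task.cpus <= cpus and task.mem_mb <= mem:
                cpus -= task.cpus
                mem -= task.mem_mb
                self.pending.remove(task)
                accepted.append(task)
        return accepted


def run_offer_cycle(agents, frameworks):
    """Other level: offer each agent's free resources to every framework in turn."""
    placements = []
    for agent in agents:
        for framework in frameworks:
            for task in framework.consider(agent):
                # Deduct what the framework accepted before the next offer.
                agent.cpus -= task.cpus
                agent.mem_mb -= task.mem_mb
                placements.append((framework.name, task.name, agent.name))
    return placements


agents = [Agent("agent-1", 4.0, 8192), Agent("agent-2", 2.0, 4096)]
web = Framework("web", [Task("api", 1.0, 1024), Task("frontend", 1.0, 512)])
db = Framework("database", [Task("shard-a", 2.0, 4096), Task("shard-b", 2.0, 4096)])
placements = run_offer_cycle(agents, [web, db])
for framework_name, task_name, agent_name in placements:
    print(f"{framework_name}/{task_name} -> {agent_name}")
```

The design point is the split of responsibility: the offer loop knows only about free resources, while each framework knows only about its own pending tasks. Adding capacity in this sketch means appending another Agent to the list.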

New agent or master nodes can be configured and added to the system easily to expand computing capacity. DC/OS clusters can be assembled in a private datacenter, or built on a cloud computing platform such as AWS, Google Cloud, or Microsoft Azure.

Building an application on DC/OS can provide the advantages of a disaggregated, service-oriented development approach while avoiding the siloing tendencies of multiple disparate platforms.

This post is a collaboration between Mesosphere and O’Reilly. See our statement of editorial independence.
