How to bring fast data access to microservice architecture with in-memory data grids

For stack scalability, elasticity at the business logic layer should be matched with elasticity at the caching layer.

By Jagdish Mirani

December 7, 2017

Architecture (source: Pexels)

Ever increasing numbers of people are using mobile and web apps and the average time consumed using them continues to grow. This use comes with expectations of real-time response, even during peak access times.

Modern, cloud-native applications have to stand up to all this application demand. In addition to user-initiated requests, applications respond to requests from other applications or handle data streamed in from sensors and monitors. Lapses in service will add friction to customer touch points, cause big declines in operational efficiency, and make it difficult to take advantage of fleeting sales opportunities during periods of high activity, like Cyber Monday.

What are your options for dealing with demand?

Scaling large monoliths dynamically on demand becomes impractical as these systems grow. Meeting demands like Cyber Monday by scaling up a large clunky deployment of a monolith, and then scaling it back down when the higher capacity is no longer needed, is impractical, because as these systems grow, they become increasingly fragile, severely limiting the scope of what can be changed.

Learn faster. Dig deeper. See farther.

Join the O'Reilly online learning platform. Get a free trial today and find answers on the fly, or master something new and useful.

Learn more

Microservices, on the other hand, can be scaled more dynamically because the scope of the software being scaled is limited only to the microservices that need to be scaled, rather than an entire monolithic application. This is one of the reasons a microservices-based approach emerged as the ideal way to build web-scale applications, and more enterprises are considering this approach for their own custom software development.

Breaking down a big monolithic application into microservices aligns well with the horizontal scalability and availability benefits of distributed computing. This approach allows for scaling each microservice separately, and allocating more computing resource where it is needed, rather than the coarse grained approach of scaling an entire monolith. Also, each microservice can be hosted on the infrastructure that is most appropriate for its workload. For example, if a microservice that works with data that has complex interdependencies, a graph database is likely to be the best choice.

Independent scaling of microservices can also be achieved by adding instances of microservices. Creating instances of large monoliths is operationally cumbersome, and adds resources to the entire app, rather than the specific area of the application where more performance is needed.

Adding instances of microservices solves the problem of scaling the business logic, but it doesn’t do the same for the data access layer. Scaling the business logic becomes ineffective without commensurately scaling the data access layer. This impedance mismatch is like driving a car built for speed on a dirt road. An In-memory Data Grid (IMDG) like GemFire can harmonize this mismatch by having application instances access data from a super fast data access layer.

Scaling the Data Access Layer

Cloud-native architectures need a whole new level of efficiency for how IT capacity is used — as a pool of reusable capacity that can be dynamically aligned for scaling applications and improving performance. The goal is to achieve a type of elasticity that quickly goes from 1 instance of a microservice, to hundreds of instances across hundreds of microservices, and then back down to 1 again. Each microservice can be scaled independently as part of a highly distributed architecture.

One approach to addressing the data requirements for the instance sprawl caused by the need to scale microservices is to share a caching layer across instances of microservices. All instances are identical and they all need the same structural view of their data, so sharing a caching layer across instances makes sense. Since the caching layer is not being shared across microservices (teams) but rather across instances of the same microservice, team autonomy is preserved. All the horizontal scalability and performance benefit of distributed computing are preserved.

An alternate approach to scaling the data layer is to scale the disk-based backing stores underneath the microservices. There are a couple of challenges with this. First, traditional databases scale vertically (up) rather than horizontally (out). Vertical scaling involves the use of more powerful hardware, and hardware options are limited on the high end. So, this approach is limited by the performance and scalability of a single computer. Second, disk based access is inherently much slower than memory. Even solid state disk involves latencies that are multiple times the latency of memory. With these challenges, sufficiently scaling a disk based data layer, if achievable, can involve exorbitant costs.

Supporting the Performance Requirements of Microservices

A more economical option is to work with an in-memory data grid, like Pivotal’s GemFire. GemFire’s performance optimizations are built into the architecture, and these refinements help deliver performance for microservice architectures:

Fast simple lookups (reads) involve the retrieval of a small amount of data, often just a single value, from a large data set. This is a very common access pattern that occurs with high frequency.
MapReduce style processing, executes code to where the data is, rather than moving the data to where the processing has to occur. Moving large volumes of data is detrimental to performance.
Transactions (writes) involve a tradeoff between data consistency and performance. Depending on configuration, GemFire provides the right set of options for users to make this trade-off.
- Session state data is a key enabler of the ability to scale up or down by adding or removing instances. Having state data externally accessible means that it is preserved when instances go down, and new instances can pick up processing where the failed one left off.

Let’s look at each of these optimizations briefly:

Simple lookups (reads) are a common data access pattern for microservices. So optimizing lookup speeds pays dividends because of the sheer frequency with which they occur. Lookups occur quickly in GemFire because of in-memory performance, and how GemFire partitions data for horizontal scale.

GemFire is primarily memory oriented, although it also supports the notion of in-built disk persistence. In-memory processing makes GemFire as much as two orders of magnitude faster than disk-based systems for both read and write operations. GemFire’s low latency, high throughput in-memory cache was built for very fast microservices response time, while supporting massive concurrency.

Modern cloud-native architectures have to be able to scale horizontally, both at the application logic layer and the data layer. With GemFire’s elasticity, memory can be scaled linearly by adding servers as and when needed. Data is partitioned across servers, and the cluster of servers can be scaled up or down as microservice instances are added or removed, and the data is automatically re-partitioned. Lookups by primary key are sent to the appropriate nodes that locally serve relevant partitions of data in parallel, which speeds up response time to a fraction of the time it would take on a single server.

MapReduce style processing, whereby code is executed locally by the appropriate data partitions for parallel processing, makes it possible to quickly process large volumes of data. Data is replicated across servers to facilitate recovery from failures, and for scaling data read operations by leveraging multiple copies of data for reading.

Transactions (writes) typically don’t occur as frequently in caching use cases, but they require some special consideration because they raise data consistency issues. GemFire can process high volume transactions that can be configured to support atomic, consistent, isolated and durable (ACID) transactions, satisfying the consistency requirements of demanding transaction processing applications. Failures are recovered fast, and are handled in ways that maintain data integrity and consistency. Synchronous replication across GemFire redundancy zones hosted on separate physical hardware keeps stale data from being served up and also keeps data consistent.

Session state data relates to each user’s state and application interactions. This data has a lifespan that extends only to the end of each session. The best practice for cloud-native applications is to build microservices that are stateless, with the state saved externally. If any part of the infrastructure underneath the session becomes unavailable, the user interaction can still proceed without losing context – a new instance of the microservice can be onboarded with the user’s context from an external source that maintains state information.

Putting session state information in GemFire has been a common practice with customers. It makes session state data accessible fast, highly scalable and available. Using GemFire for session state caching minimizes the transition time across change boundaries.

The need for performance is not likely to ebb anytime soon. We still have a long way to go with what we can achieve with data. Microservice based architectures provide a flexible and effective way of addressing the performance needs of modern applications. The effectiveness of these architectures, however, require performance and scalability throughout the stack, and in-memory data grids, like GemFire, are a great way of addressing these requirements at the data layer.

This post is a collaboration between O’Reilly and Pivotal. See our statement of editorial independence.

Post topics: Software Architecture