Chapter 4. Looking Forward

The final chapter of this report is your crystal ball: a look ahead at distributed SQL and the other capabilities that distributed computing enables.

Distributed SQL: The Enabler

The distributed mindset shaped the creation of the distributed SQL category, but its influence doesn’t end there. Distributed SQL is a foundation to extrapolate from. As we saw in Chapter 3, businesses are choosing distributed SQL because it gives them value that wasn’t possible before, and they can now extend that value to other offerings.

Technologists are building capabilities into this category that affect your strategic plans. Here’s a short overview of a new distributed SQL capability, to help you plot the course to the next stage of your product vision. We finish this report with an exhortation about the distributed mindset and how to use it to look into the future.

Serverless

Since the mid-2010s, a distributed paradigm known as serverless has risen in popularity. Its most common form is code functions that can be invoked remotely from anywhere, with the distinguishing feature that while they execute on computers, they don’t need to be managed in any way as physical machines (see Table 4-1 for hyperscaler serverless function solutions).

For example, in AWS Lambda a developer can write a utility function in a supported language of their choice and, when the function is invoked, the hosting environment provides everything needed to execute it. If there are no running instances of the function code, one is quickly started to respond to the request. The developer does not need to know or care about details at the machine level, and the hosting environment automatically scales up or down to whatever computing power is needed to respond to multiple invocations of the function.
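To make the idea concrete, here is a minimal sketch of what such a function might look like in Python. The `(event, context)` signature follows the AWS Lambda convention for Python handlers; the `name` field in the event and the shape of the response are assumptions made for illustration, not something a particular service requires.

```python
import json


def lambda_handler(event, context):
    """Entry point that the hosting environment calls once per invocation.

    `event` carries the request payload; `context` carries runtime metadata.
    Nothing here refers to a server, a port, or a process lifecycle.
    """
    # The "name" field is a hypothetical input chosen for this example.
    name = (event or {}).get("name", "world")
    return {
        "statusCode": 200,
        "body": json.dumps({"message": f"Hello, {name}!"}),
    }


if __name__ == "__main__":
    # Local simulation of a single invocation; in the cloud, the platform
    # makes this call and decides how many copies of the function to run.
    print(lambda_handler({"name": "Stockholm"}, None))
```

The code contains only business logic. Starting instances, routing requests, and scaling are left entirely to the platform.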

Table 4-1. Just some of the serverless function compute solutions available in the market today

Vendor                   Serverless function product
Alibaba Cloud            Function Compute
Amazon Web Services      AWS Lambda
Cloudflare               Cloudflare Workers
Google Cloud Platform    Cloud Functions, Firebase Functions
Microsoft Azure          Azure Functions
Netlify                  Netlify Functions

Serverless adds simplicity to the distributed paradigm. A serverless architecture means that the resources are always available. Rather than planning for how many computers to buy or servers to provision, customers are billed based on their consumption of serverless resources: if you use less, you pay less.

Functions aren’t the sole focus for the serverless paradigm. Low-level object storage solutions like AWS S3, Azure Blob Storage, or Google Cloud Platform Cloud Storage also fit. You can probably see where we’re going: if application logic in functions can be serverless, and low-level storage can be serverless, why can’t the database? Distributed SQL already automates a lot of the key capabilities required for serverless.

With some small additions, distributed SQL can become the perfect fit for serverless functions and any application. There is a possibility that, with serverless capabilities embedded into a distributed SQL database, we can completely remove the dependency on operations and reduce a database down to a simple SQL API in the cloud.

In this new world, developers will be able to create models of data and let the database handle scale, replication, and locality requirements. As SQL statements are sent to the database, monitoring processes can automatically handle the necessary physical instance operations. Developers will be able to have an attitude of “I don’t care how it’s accomplished machine-wise, just do the things I ask of my distributed SQL database.” This will naturally shorten time-to-value and greatly reduce the cost, time, and complexity of operating the database.

It’s also the last piece of the puzzle for making many applications truly distributed. Store nonrelational data in object storage solutions that can replicate anywhere, make transactional SQL data distributed, automatically scalable and available, and run application code in functions that execute ephemerally without being managed on dedicated servers. Serverless distributed SQL completes the picture for always-on, autoscaled, fully managed applications.

Cockroach Labs has already introduced a beta serverless option for CockroachDB. It makes CockroachDB serverless by giving the database a multitenant architecture: each tenant gets an isolated SQL layer that runs on top of a core, shared, distributed key-value store, and tenant IDs are automatically included in each query (Figure 4-1). By separating storage from SQL computation, CockroachDB can autoscale its tenant SQL processes using Kubernetes pods, giving customers the ability to quickly scale their database processing power up or down.

Figure 4-1. The CockroachDB approach to multitenancy with serverless distributed SQL
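The separation between per-tenant SQL processing and shared storage can be sketched in a few lines of Python. This is a toy model under stated assumptions, not CockroachDB's implementation: the class names `SharedKVStore` and `TenantSQLLayer`, the `table/primary_key` key layout, and the in-memory dictionary standing in for a distributed store are all hypothetical.

```python
from typing import Dict, List, Optional, Tuple


class SharedKVStore:
    """Stand-in for the shared key-value layer (here, just a dict).

    Every key is qualified by a tenant ID, so tenants share the store
    without sharing data.
    """

    def __init__(self) -> None:
        self._data: Dict[Tuple[int, str], str] = {}

    def put(self, tenant_id: int, key: str, value: str) -> None:
        self._data[(tenant_id, key)] = value

    def get(self, tenant_id: int, key: str) -> Optional[str]:
        return self._data.get((tenant_id, key))

    def scan(self, tenant_id: int, prefix: str = "") -> List[Tuple[str, str]]:
        # Return only this tenant's keys that start with the prefix, in key order.
        return sorted(
            (key, value)
            for (tenant, key), value in self._data.items()
            if tenant == tenant_id and key.startswith(prefix)
        )


class TenantSQLLayer:
    """Hypothetical per-tenant layer that stamps its tenant ID on every request."""

    def __init__(self, tenant_id: int, store: SharedKVStore) -> None:
        self._tenant_id = tenant_id
        self._store = store

    def upsert(self, table: str, primary_key: str, row: str) -> None:
        self._store.put(self._tenant_id, f"{table}/{primary_key}", row)

    def select(self, table: str, primary_key: str) -> Optional[str]:
        return self._store.get(self._tenant_id, f"{table}/{primary_key}")

    def select_all(self, table: str) -> List[str]:
        return [row for _, row in self._store.scan(self._tenant_id, f"{table}/")]


if __name__ == "__main__":
    store = SharedKVStore()
    tenant_a = TenantSQLLayer(1, store)
    tenant_b = TenantSQLLayer(2, store)
    tenant_a.upsert("accounts", "42", "alice")
    tenant_b.upsert("accounts", "42", "bob")
    # Same table and primary key, but each tenant sees only its own row.
    print(tenant_a.select("accounts", "42"), tenant_b.select("accounts", "42"))
```

Because the tenant layer holds no data of its own, instances of it can be started or stopped freely, which is the property that makes the SQL side easy to autoscale.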

The Distributed Mindset: An Exhortation

We’ve discussed the vital parts of the distributed mindset: scale, resilience, and locality. We’ve also discovered the crucial feature list of distributed SQL: ease of scaling, always-on resilience, data locality, SQL, and ACID compliance. And we’ve reviewed the ways distributed SQL cozies up to real-world situations.

We’re just at the beginning. The next five years will shake the foundations of our traditional, nondistributed databases. Here are three short, complementary ideas to help you craft your own vision of that future.

The Future Must Be Physical

Creating software has always been a logic-based activity: make a flowchart, diagram classes, design data relationships. Architect a proof that the application concept will work, and then apply that to a rack of servers in the nearest datacenter. With distributed SQL, you can apply the physical world to all these plans: what do you want the experience of your application to be for people in Stockholm? How do your customers in Mexico City need their regulatory data stored?

The paradigm shift looks like this: create things with an entity relationship diagram in one hand, and a globe in the other. The physical world used to be a map of limitations. Now it’s a map of possibilities.

The Future Must Be Familiar

Distributed SQL owes its success to understanding where it came from. The SQL language is simply the best way to track and understand the data that runs our critical business applications. Vendors in this category understand this and have enabled many ways of retaining the language while massively updating the capabilities. Apart from keeping the SQL language itself, many vendors have made their solutions wire-compatible with PostgreSQL. This ensures that drivers and tools that already understand Postgres will be able to connect and understand distributed solutions.
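As a rough illustration of what wire compatibility buys you, the sketch below connects with `psycopg2`, an ordinary PostgreSQL driver for Python, and runs a plain SQL query. The host name, user, and database are hypothetical placeholders, and port 26257 is assumed here because it is CockroachDB's default SQL port; substitute whatever your vendor documents.

```python
from urllib.parse import quote, urlencode


def build_postgres_url(user: str, host: str, port: int, database: str,
                       sslmode: str = "verify-full") -> str:
    """Build a standard PostgreSQL connection URL.

    Nothing in the URL is specific to a distributed database: the same
    format works for PostgreSQL and for wire-compatible systems.
    """
    query = urlencode({"sslmode": sslmode})
    return f"postgresql://{quote(user)}@{host}:{port}/{quote(database)}?{query}"


def fetch_server_version(url: str) -> str:
    """Connect with a stock PostgreSQL driver and run a plain SQL query."""
    import psycopg2  # third-party driver, imported lazily so the URL helper works without it

    conn = psycopg2.connect(url)
    try:
        with conn.cursor() as cur:
            cur.execute("SELECT version()")
            return cur.fetchone()[0]
    finally:
        conn.close()


if __name__ == "__main__":
    # Hypothetical cluster address; replace with a real one before calling
    # fetch_server_version(url).
    url = build_postgres_url("app_user", "db.example.com", 26257, "bank")
    print(url)
```

The application code does not change when the database behind the URL does, which is the point of staying familiar.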

Architects and leaders, when you choose distributed SQL solutions, keep this in mind: empower technical people with familiar tools. Many object-relational mapping (ORM) tools now have extensions that work with distributed solutions, including Hibernate for Java and the ORM in Django for Python. If you bring people along with you on the journey, they’ll enhance their own skills and come ready to continually shift their solutions to a distributed mindset.

The Future Must Be Fantastic

The distributed mindset applied to software creation will continue to allow developers to dream and build breakthrough applications. Soon, any business can have the audacity to believe they are making something Google-sized, and they will be able to have the distributed computing tools right in front of them to go ahead and try. It will be cheaper and easier every year to make your creation available to the world, without compromising experience.

In a distributed world, you can create something that delights hearts no matter where they beat.
