Chapter 8: Exploring Apache Arrow Flight RPC

Distributed systems have always interested me. A distributed system is like a really good puzzle – immensely satisfying once you figure out how all the pieces fit together to achieve your goal. If you're not familiar with the term, a distributed system is simply one whose components are spread across multiple machines on a network. The idea is to split up the work and coordinate the components so that tasks complete more efficiently. A great example is Apache Spark, which we covered back in Chapter 3, Data Science with Apache Arrow.

The goal of distributed systems is generally to provide a robust, scalable, and reliable conglomeration of components ...
