CHAPTER 6
Messaging
Large-scale data-intensive applications need coordination among their parallel processes to achieve meaningful objectives. Because these applications are deployed across many nodes within a cluster, messaging is the only way to achieve that coordination while they execute.
It is fair to say that when an application runs on a large number of machines, its performance depends on how efficiently it can transfer messages among the distributed processes. Developing messaging libraries is a challenging task and requires years of software engineering to produce efficient, workable, and robust systems. For example, the Message Passing Interface (MPI) standard–based frameworks have celebrated 25 years of development and are still being retooled in response to emerging hardware and applications.
Programming networking hardware involves many intricate details, which makes it impractical for an application developer to write a program directly against hardware features. Multiple layers of software have been developed to facilitate messaging. Some of these layers are built into the operating system, others sit at the firmware level, and many are at the software library level.
Nowadays, computer networks are operating continuously everywhere, transferring petabytes of data that touch every aspect of our lives. There are many types of networks that are designed to work in various settings, including cellular networks, home cable networks, undersea cables that carry data between continents, ...