1 Introduction to Distributed Systems
K. Karthikeyan¹*, S. Hemalatha² and S. Vignesh¹
¹Department of Computer Science and Business Systems, K.S. Rangasamy College of Technology, Tiruchengode, India
²Department of Computer Science and Design, Kongu Engineering College, Perundurai, India
Abstract
A distributed system is a collection of independent computers, linked by a network, that cooperate to accomplish a shared objective. Unlike conventional centralized systems, which assign all work to one machine, distributed systems can offer greater performance, fault tolerance, and scalability. Training complex machine learning models on large datasets is computationally intensive, so distributed machine learning (ML) systems partition the data and workload across multiple nodes to parallelize and speed up training. However, exchanging data and synchronizing model parameters between nodes incurs communication costs that can dominate system performance, especially with large datasets or complex models. The goal of communication-efficient distributed machine learning is therefore to train models across several nodes while minimizing this communication overhead. Blockchain technology is inherently based on distributed systems principles. It decentralizes the storage and ...
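To make the trade-off between parallel computation and parameter synchronization concrete, the following is a minimal single-process sketch of synchronous data-parallel training, not a method taken from the chapter. The model (a one-variable linear fit), the function names `partition`, `local_gradient`, and `train`, and the cost measure (number of floating-point values exchanged per round) are all illustrative assumptions. Each simulated worker holds one shard of the data, computes a local gradient, and a coordinator averages the gradients and broadcasts the updated parameters.

```python
from typing import List, Tuple

Sample = Tuple[float, float]  # (x, y) pair; hypothetical toy data format


def partition(data: List[Sample], num_workers: int) -> List[List[Sample]]:
    """Split the dataset into one shard per worker (round-robin)."""
    return [data[i::num_workers] for i in range(num_workers)]


def local_gradient(w: float, b: float, shard: List[Sample]) -> Tuple[float, float]:
    """Gradient of mean squared error for y ~ w*x + b on a single shard."""
    n = len(shard)
    grad_w = sum(2.0 * (w * x + b - y) * x for x, y in shard) / n
    grad_b = sum(2.0 * (w * x + b - y) for x, y in shard) / n
    return grad_w, grad_b


def train(data: List[Sample], num_workers: int = 4, rounds: int = 500,
          lr: float = 0.5) -> Tuple[float, float, int]:
    """Simulate synchronous data-parallel gradient descent.

    Returns the fitted parameters and the number of floats that would have
    crossed the network (an assumed, simplified cost model).
    """
    shards = partition(data, num_workers)
    w, b = 0.0, 0.0
    floats_sent = 0
    for _ in range(rounds):
        # In a real system each worker would run this step in parallel.
        grads = [local_gradient(w, b, shard) for shard in shards]
        floats_sent += num_workers * 2  # workers send (grad_w, grad_b) up
        # Coordinator averages the gradients and updates the shared model.
        avg_w = sum(g[0] for g in grads) / num_workers
        avg_b = sum(g[1] for g in grads) / num_workers
        w -= lr * avg_w
        b -= lr * avg_b
        floats_sent += num_workers * 2  # coordinator broadcasts (w, b) down
    return w, b, floats_sent


# Toy dataset generated from y = 3x + 1 (an assumption for illustration).
toy_data = [(i / 20.0, 3.0 * (i / 20.0) + 1.0) for i in range(20)]
w_fit, b_fit, cost = train(toy_data)
print(w_fit, b_fit, cost)
```

In this sketch the communication counter grows with the number of workers, the number of rounds, and the model size (two parameters here), regardless of how little computation each round involves. For a model with millions of parameters the same pattern would move millions of values per worker per round, which is why synchronization can dominate training time.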