O'Reilly logo

Apache Spark Graph Processing by Rindra Ramamonjison

Stay ahead with the world's most comprehensive technology and business learning platform.

With Safari, you learn the way you learn best. Get unlimited access to videos, live online training, learning paths, books, tutorials, and more.

Start Free Trial

No credit card required

Summary

In this chapter, we learned about the different ways to visualize and analyze graphs in Spark. We studied the connectedness of different networks by looking at their degree distribution, finding their connected components, and by calculating their cluster coefficients. In addition, we also learned how to visualize graph data using GraphStream. After this, we showed how the PageRank algorithm can be used to rank node importance in different networks. This chapter also showed us how to use SBT to build a Spark program that relies on third-party libraries.

Throughout this chapter, we have also studied how the basic Spark RDD operations can be used to transform, join, and filter collections of graph vertices and edges. In the next chapter, ...

With Safari, you learn the way you learn best. Get unlimited access to videos, live online training, learning paths, books, interactive tutorials, and more.

Start Free Trial

No credit card required