Tuning Giraph

This chapter covers

  • Key performance factors and bottlenecks in Giraph
  • Optimal setups of the Hadoop cluster hosting Giraph
  • Designing and implementing ad hoc data structures optimized for your algorithms
  • Spilling excessing data to disk when necessary
  • The different Giraph parameters and knobs

So far, you have seen how you can use Giraph to compute graph analytics on large graphs across hundreds of machines. You have been presented with Giraph’s architecture and the programming model that allows you to write programs that scale. It is neat to write programs with the vertex-centric programming model, without worrying about ...

Get Practical Graph Analytics with Apache Giraph now with the O’Reilly learning platform.

O’Reilly members experience live online training, plus books, videos, and digital content from nearly 200 publishers.