Book description
Practical Graph Analytics with Apache Giraph helps you build data mining and machine learning applications using the Apache Foundation’s Giraph framework for graph processing. This is the same framework used by Facebook, Google, and other social media analytics operations to derive business value from vast amounts of interconnected data points.
Graphs arise in a wealth of data scenarios and describe the connections that form naturally in both the digital and real worlds. Examples of such connections abound in online social networks such as Facebook and Twitter and among users who rate movies on services like Netflix and Amazon Prime, and they are useful even in biological networks studied in scientific research. Whether in business or science, viewing data as connected adds value by increasing the amount of information that can be drawn from that data and put to use in generating new revenue or scientific opportunities.
Apache Giraph offers a simple yet flexible programming model targeted to graph algorithms and designed to scale easily to accommodate massive amounts of data. Originally developed at Yahoo!, Giraph is now a top-level project at the Apache Foundation, and it enlists contributors from companies such as Facebook, LinkedIn, and Twitter. Practical Graph Analytics with Apache Giraph brings Apache Giraph to you, showing how to harness the power of graph processing for your own data by building sophisticated graph analytics applications with the very same framework relied upon by some of the largest players in the industry today.
Table of contents
- Cover
- Title
- Copyright
- Contents at a Glance
- Contents
- About the Authors
- About the Technical Reviewer
- Introduction
- Annotation Conventions
- Part I: Giraph Building Blocks
- Chapter 1: Introducing Giraph
- Chapter 2: Modeling Graph Processing Use Cases
- Chapter 3: The Giraph Programming Model
- Chapter 4: Giraph Algorithmic Building Blocks
- Part II: Giraph Overview
- Chapter 5: Working with Giraph
- Chapter 6: Giraph Architecture
- Chapter 7: Graph IO Formats
- Chapter 8: Beyond the Basic API
- Part III: Advanced Topics
- Chapter 9: Exposing Parallelism in Giraph
- Chapter 10: Advanced IO
- Chapter 11: Tuning Giraph
- Chapter 12: Giraph in the Cloud
- A Quick Introduction to Cloud Computing
- Giraph on the Amazon Web Services Cloud
- Before You Begin
- Creating Your First Cluster on the Amazon Cloud
- The Building Blocks of an EMR Cluster
- The Composition of an EMR Cluster: Instance Groups
- Deploying Giraph Applications onto an EMR Cluster
- EMR Cluster Data Processing Steps
- When Things Go Wrong: Debugging EMR Clusters
- Where’s My Stuff? Data Migration to and from EMR Clusters
- Putting It All Together: Ephemeral Graph Processing EMR Clusters
- Getting the Most Bang for the Buck: Amazon EMR Spot Instances
- One Size Doesn’t Fit All: Fine-Tuning Your EMR Clusters
- Summary
- Appendix A: Install and Configure Giraph and Hadoop
- System Requirements
- Hadoop Installation
- Giraph Installation
- Installing the Binary Release of Giraph
- Installing Giraph As Part of a Packaged Hadoop Distribution
- Installing Giraph by Building from Source Code
- Fundamentals of Hadoop and Hadoop Ecosystem Projects Configuration
- Configuring Giraph
- Configuring Hadoop
- Configuring Hadoop in Pseudo-Distributed Mode
- Summary
- Index
Product information
- Title: Practical Graph Analytics with Apache Giraph
- Author(s):
- Release date: October 2015
- Publisher(s): Apress
- ISBN: 9781484212516