Graph Databases in Action

Book description

Relationships in data often look far more like a web than an orderly set of rows and columns. Graph databases shine when it comes to revealing valuable insights within complex, interconnected data such as demographics, financial records, or computer networks. In Graph Databases in Action, experts Dave Bechberger and Josh Perryman illuminate the design and implementation of graph databases in real-world applications. You'll learn how to choose the right database solutions for your tasks, and how to use your new knowledge to build agile, flexible, and high-performing graph-powered applications!

About the Technology
Isolated data is a thing of the past! Now, data is connected, and graph databases—like Amazon Neptune, Microsoft Cosmos DB, and Neo4j—are the essential tools of this new reality. Graph databases represent relationships naturally, speeding the discovery of insights and driving business value.

About the Book
Graph Databases in Action introduces you to graph database concepts by comparing them with relational database constructs. You'll learn just enough theory to get started, then progress to hands-on development. Discover use cases involving social networking, recommendation engines, and personalization.

What's Inside
  • Graph databases vs. relational databases
  • Systematic graph data modeling
  • Querying and navigating a graph
  • Graph patterns
  • Pitfalls and anti-patterns


About the Reader
For software developers. No experience with graph databases required.

About the Authors
Dave Bechberger and Josh Perryman have decades of experience building complex data-driven systems and have worked with graph databases since 2014.

Quotes
A comprehensive overview of graph databases and how to build them using Apache tools.
- Richard Vaughan, Purple Monkey Collective

A well-written and thorough introduction to the topic of graph databases.
- Luis Moux, EMO

A great guide in your journey towards graph databases and exploiting the new possibilities for data processing.
- Mladen Knežić, CROZ

A great introduction to graph databases and how you should approach designing systems that leverage graph databases.
- Ron Sher, Intuit

Table of contents

  1. Graph Databases in Action
  2. Copyright
  3. contents
  4. front matter
    1. foreword
    2. preface
    3. acknowledgments
    4. about this book
      1. Who should read this book
      2. How this book is organized: A roadmap
      3. About the code
      4. About the technologies
      5. liveBook discussion forum
    5. about the authors
    6. about the cover illustration
  5. Part 1. Getting started with graph databases
  6. 1 Introduction to graphs
    1. 1.1 What is a graph?
      1. 1.1.1 What is a graph database?
      2. 1.1.2 Comparison with other types of databases
      3. 1.1.3 Why can’t I use SQL?
    2. 1.2 Is my problem a graph problem?
      1. 1.2.1 Explore the questions
      2. 1.2.2 I’m still confused. . . . Is this a graph problem?
    3. Summary
  7. 2 Graph data modeling
    1. 2.1 The data modeling process
      1. 2.1.1 Data modeling terms
      2. 2.1.2 Four-step process for data modeling
    2. 2.2 Understand the problem
      1. 2.2.1 Domain and scope questions
      2. 2.2.2 Business entity questions
      3. 2.2.3 Functionality questions
    3. 2.3 Developing the whiteboard model
      1. 2.3.1 Identifying and grouping entities
      2. 2.3.2 Identifying relationships between entities
    4. 2.4 Constructing the logical data model
      1. 2.4.1 Translating entities to vertices
      2. 2.4.2 Translating relationships to edges
      3. 2.4.3 Finding and assigning properties
    5. 2.5 Checking our model
    6. Summary
  8. 3 Running basic and recursive traversals
    1. 3.1 Setting up your environment
      1. 3.1.1 Starting the Gremlin Server
      2. 3.1.2 Starting the Gremlin Console, connecting to the Gremlin Server, and loading the data
    2. 3.2 Traversing a graph
      1. 3.2.1 Using a logical data model (schema) to plan traversals
      2. 3.2.2 Planning the steps through the graph data
      3. 3.2.3 Fundamental concepts of traversing a graph
      4. 3.2.4 Writing traversals in Gremlin
      5. 3.2.5 Retrieving properties with values steps
    3. 3.3 Recursive traversals
      1. 3.3.1 Using recursive logic
      2. 3.3.2 Writing recursive traversals in Gremlin
    4. Summary
  9. 4 Pathfinding traversals and mutating graphs
    1. 4.1 Mutating a graph
      1. 4.1.1 Creating vertices and edges
      2. 4.1.2 Removing data from our graph
      3. 4.1.3 Updating a graph
      4. 4.1.4 Extending our graph
    2. 4.2 Paths
      1. 4.2.1 Cycles in graphs
      2. 4.2.2 Finding the simple path
    3. 4.3 Traversing and filtering edges
      1. 4.3.1 Introducing the E and V steps for traversing edges
      2. 4.3.2 Filtering with edge properties
      3. 4.3.3 Include edges in path results
      4. 4.3.4 Performant edge counts and denormalization
    4. Summary
  10. 5 Formatting results
    1. 5.1 Review of values steps
    2. 5.2 Constructing our result payload
      1. 5.2.1 Applying aliases in Gremlin
      2. 5.2.2 Projecting results instead of aliasing
    3. 5.3 Organizing our results
      1. 5.3.1 Ordering results returned from a graph traversal
      2. 5.3.2 Grouping results returned from a graph traversal
      3. 5.3.3 Limiting results
    4. 5.4 Combining steps into complex traversals
    5. Summary
  11. 6 Developing an application
    1. 6.1 Starting the project
      1. 6.1.1 Selecting our tools
      2. 6.1.2 Setting up the project
      3. 6.1.3 Obtaining a driver
      4. 6.1.4 Preparing the database server instance
    2. 6.2 Connecting to our database
      1. 6.2.1 Building the cluster configuration
      2. 6.2.2 Setting up the GraphTraversalSource
    3. 6.3 Retrieving data
      1. 6.3.1 Retrieving a vertex
      2. 6.3.2 Using Gremlin language variants (GLVs)
      3. 6.3.3 Adding terminal steps
      4. 6.3.4 Creating the Java method in our application
    4. 6.4 Adding, modifying, and deleting data
      1. 6.4.1 Adding vertices
      2. 6.4.2 Adding edges
      3. 6.4.3 Updating properties
      4. 6.4.4 Deleting elements
    5. 6.5 Translating our list and path traversals
      1. 6.5.1 Getting a list of results
      2. 6.5.2 Implementing recursive traversals
      3. 6.5.3 Implementing paths
    6. Summary
  12. Part 2. Building on graph databases
  13. 7 Advanced data modeling techniques
    1. 7.1 Reviewing our current data models
    2. 7.2 Extending our logical data model
    3. 7.3 Translating entities to vertices
      1. 7.3.1 Using generic labels
      2. 7.3.2 Denormalizing graph data
      3. 7.3.3 Translating relationships to edges
      4. 7.3.4 Finding and assigning properties
      5. 7.3.5 Moving properties to edges
      6. 7.3.6 Checking our model
    4. 7.4 Extending our data model for personalization
    5. 7.5 Comparing the results
    6. Summary
  14. 8 Building traversals using known walks
    1. 8.1 Preparing to develop our traversals
      1. 8.1.1 Identifying the required elements
      2. 8.1.2 Selecting a starting place
      3. 8.1.3 Setting up test data
    2. 8.2 Writing our first traversal
      1. 8.2.1 Designing our traversal
      2. 8.2.2 Developing the traversal code
    3. 8.3 Pagination and graph databases
    4. 8.4 Recommending the highest-rated restaurants
      1. 8.4.1 Designing our traversal
      2. 8.4.2 Developing the traversal code
    5. 8.5 Writing the last recommendation engine traversal
      1. 8.5.1 Designing our traversal
      2. 8.5.2 Adding this traversal to our application
    6. Summary
  15. 9 Working with subgraphs
    1. 9.1 Working with subgraphs
      1. 9.1.1 Extracting a subgraph
      2. 9.1.2 Traversing a subgraph
    2. 9.2 Building a subgraph for personalization
    3. 9.3 Building the traversal
      1. 9.3.1 Reversing the traversing direction
      2. 9.3.2 Evaluating the individualized results of the subgraph
    4. 9.4 Implementing a subgraph with a remote connection
      1. 9.4.1 Connecting with TinkerPop’s Client class
      2. 9.4.2 Adding this traversal to our application
    5. Summary
  16. Part 3. Moving beyond the basics
  17. 10 Performance, pitfalls, and anti-patterns
    1. 10.1 Slow-performing traversals
      1. 10.1.1 Explaining our traversal
      2. 10.1.2 Profiling our traversal
      3. 10.1.3 Indexes
    2. 10.2 Dealing with supernodes
      1. 10.2.1 It’s about instance data
      2. 10.2.2 It’s about the database
      3. 10.2.3 What makes a supernode?
      4. 10.2.4 Monitoring for supernodes
      5. 10.2.5 What to do if you have a supernode
    3. 10.3 Application anti-patterns
      1. 10.3.1 Using graphs for non-graph use cases
      2. 10.3.2 Dirty data
      3. 10.3.3 Lack of adequate testing
    4. 10.4 Traversal anti-patterns
      1. 10.4.1 Not using parameterized traversals
      2. 10.4.2 Using unlabeled filtering steps
    5. Summary
  18. 11 What’s next: Graph analytics, machine learning, and resources
    1. 11.1 Graph analytics
      1. 11.1.1 Pathfinding
      2. 11.1.2 Centrality
      3. 11.1.3 Community detection
      4. 11.1.4 Graphs and machine learning
      5. 11.1.5 Additional resources
    2. 11.2 Final thoughts
    3. Summary
  19. appendix. Apache TinkerPop installation and overview
    1. A.1 Overview
      1. A.1.1 Gremlin traversal language
      2. A.1.2 TinkerGraph
      3. A.1.3 Gremlin Console
      4. A.1.4 Gremlin Language Variants (GLVs)
      5. A.1.5 Gremlin Server
      6. A.1.6 Documentation
    2. A.2 Installation
      1. A.2.1 Installing and verifying the Java Runtime
      2. A.2.2 Installing Gremlin Console
      3. A.2.3 Installing Gremlin Server
      4. A.2.4 Configuring the Gremlin Console to connect to the Gremlin Server
      5. A.2.5 Gremlin Console command modes: Local versus remote
      6. A.2.6 Using the Gremlin Console
  20. index

Product information

  • Title: Graph Databases in Action
  • Author(s): Dave Bechberger, Josh Perryman
  • Release date: November 2020
  • Publisher(s): Manning Publications
  • ISBN: 9781617296376