Learning Neo4j 3.x - Second Edition

Book description

Run blazingly fast queries on complex graph datasets with the power of the Neo4j graph database

About This Book

  • Get acquainted with graph database systems and apply them in real-world use cases
  • Use the Cypher query language, APOC, and other Neo4j extensions to derive meaningful analysis from complex data sets
  • A practical guide filled with ready-to-use examples on querying, graph processing, and visualizing information to build smarter spatial applications

Who This Book Is For

This book is for developers who want an alternative way to store and process data within their applications. No previous graph database experience is required; however, some basic database knowledge will help you understand the concepts more easily.

What You Will Learn

  • Understand graph theory and graph databases, and their advantages over traditional databases
  • Install Neo4j, model data, and learn the most common practices for traversing data
  • Learn the Cypher query language and tailor-made procedures to analyze and derive meaningful representations of data
  • Improve graph techniques with the help of precise procedures in the APOC library
  • Use advanced Neo4j extensions and plugins for performance optimization
  • Understand how Neo4j's new security features and clustering architecture are used for large-scale deployments

In Detail

Neo4j is a graph database that allows traversing huge amounts of data with ease. This book aims to get you started with it quickly.

Starting with a brief introduction to graph theory, this book shows you the advantages of using graph databases, along with data modeling techniques for them. You'll gain practical, hands-on experience with commonly used and lesser-known features for updating the graph store with Neo4j's Cypher query language. You'll also learn to create procedures using APOC and to extend Neo4j's functionality, enabling integration, algorithmic analysis, and other advanced spatial operations on data.

Over the course of the book, you will come across implementation examples covering the latest updates in Neo4j, such as in-graph indexes, scaling, performance improvements, visualization, data refactoring techniques, security enhancements, and much more. By the end of the book, you'll have gained the skills to design and implement modern spatial applications, from graphing data to unraveling business capabilities with the help of real-world use cases.

Style and approach

A step-by-step approach to adopting Neo4j, the world's leading graph database. This book includes a lot of background information, helps you grasp the fundamental concepts behind this radical new way of dealing with connected data, and gives you many examples of use cases and environments where a graph database would be a great fit.

Table of contents

  1. Preface
    1. What this book covers
    2. What you need for this book
    3. Who this book is for
    4. Conventions
    5. Reader feedback
    6. Customer support
      1. Downloading the example code
      2. Errata
      3. Piracy
      4. Questions
  2. Graph Theory and Databases
    1. Introducing Neo4j 3.x and a history of graphs
    2. Definition and usage of the graph theory
      1. Social studies
      2. Biological studies
      3. Computer science
      4. Flow problems
      5. Route problems
      6. Web search
    3. Background
      1. Navigational databases
      2. Relational databases
      3. NoSQL databases
        1. Key-value stores
        2. Column-family stores
        3. Document stores
        4. Graph databases
    4. The Property Graph model of graph databases
      1. Node labels
      2. Relationship types
    5. Why use graph databases, or not
      1. Why use a graph database?
        1. Complex queries
        2. In-the-clickstream queries on live data
        3. Pathfinding queries
      2. When not to use a graph database and what to use instead
        1. Large set-oriented queries
        2. Graph global operations
        3. Simple aggregate-oriented queries
    6. Test questions
    7. Summary
  3. Getting Started with Neo4j
    1. Key concepts and characteristics of Neo4j
      1. Built for graphs from the ground up
      2. Transactional ACID-compliant database
      3. Made for online transaction processing
      4. Designed for scalability
      5. A declarative query language - Cypher
    2. Sweet spot use cases of Neo4j
      1. Complex join-intensive queries
        1. Pathfinding queries
      2. Committed to open source
    3. The features
      1. The support
    4. The license conditions
    5. Installing Neo4j
      1. Installing Neo4j on Windows
      2. Installing Neo4j on Mac or Linux
    6. Using Neo4j in a cloud environment
    7. Sandbox
    8. Using Neo4j in a Docker container
      1. Installing Docker
        1. Preparing the filesystem
        2. Running Neo4j in a Docker container
    9. Test questions
    10. Summary
  4. Modeling Data for Neo4j
    1. The four fundamental data constructs
    2. How to start modeling for graph databases
      1. What we know – ER diagrams and relational schemas
      2. Introducing complexity through join tables
    3. A graph model – a simple, high-fidelity model of reality
    4. Graph modeling – best practices and pitfalls
      1. Graph modeling best practices
        1. Designing for query-ability
        2. Aligning relationships with use cases
        3. Looking for n-ary relationships
        4. Granulate nodes
        5. Using in-graph indexes when appropriate
      2. Graph database modeling pitfalls
        1. Using rich properties
        2. Node representing multiple concepts
        3. Unconnected graphs
        4. The dense node pattern
    5. Test questions
    6. Summary
  5. Getting Started with Cypher
    1. Writing the Cypher syntax
    2. Key attributes of Cypher
    3. Being crude with the data
      1. Create data
      2. Read data
      3. Update data
      4. Delete data
    4. Key operative words in Cypher
    5. Syntax norms
    6. More that you need to know
      1. With a little help from my friends
    7. The Cypher refcard
    8. The openCypher project
    9. Summary
  6. Awesome Procedures on Cypher - APOC
    1. Installing APOC
      1. On a hardware server
      2. On a Docker container
    2. Verifying APOC installation
    3. Functions and procedures
    4. My preferred usages
      1. A little help from a friend
      2. Graph overview
    5. Several key usages
      1. Setup
      2. Random graph generators
      3. PageRank
      4. Timeboxed execution of Cypher statements
      5. Linking of a collection of nodes
      6. There's more in APOC
    6. Test questions
    7. Summary
  7. Extending Cypher
    1. Building an extension project
      1. Creating a function
      2. Creating a procedure
    2. Custom aggregators
    3. Unmanaged extensions
      1. HTTP and JAX-RS refreshers
        1. Registering
        2. Accessing
      2. Streaming JSON responses
    4. Summary
  8. Query Performance Tuning
    1. Explain and profile instructions
      1. A query plan
      2. Operators
    2. Indexes
      1. Force index usage
      2. Force label usage
    3. Rules of thumb
      1. Explain all the queries
      2. Rows
      3. Do not overconsume
      4. Cartesian or not?
      5. Simplicity
    4. Summary
  9. Importing Data into Neo4j
    1. LOAD CSV
      1. Scaling the import
    2. Importing from a JSON source
    3. Importing from a JDBC source
      1. Test setup
      2. Importing all the systems
    4. Importing from an XML source
    5. Summary
  10. Going Spatial
    1. What is spatial?
      1. Refresher
      2. Not faulty towers
    2. What is so spatial then?
      1. Neo4j's spatial features
      2. APOC spatial features
        1. Geocoding
          1. Setting up OSM as provider
          2. Setting up Google as provider
    3. Neo4j spatial
      1. Online demo
      2. Features
      3. Importing OpenStreetMap data
        1. Large OSM Imports
          1. Easy way
          2. The tougher way to import data
      4. Restroom please
        1. Understanding WKT and BBOX
    4. Removing all the geo data
    5. Summary
  11. Security
    1. Authentication and authorization
    2. Roles
      1. Other roles
    3. Users management
    4. Linking Neo4j to an LDAP directory
      1. Starting the directory
    5. Configuring Neo4j to use LDAP
    6. Test questions
    7. Summary
  12. Visualizations for Neo4j
    1. The power of graph visualizations
      1. Why graph visualizations matter!
        1. Interacting with data visually
        2. Looking for patterns
        3. Spot what's important
      2. The basic principles of graph visualization
    2. Open source visualization libraries
      1. D3.js
      2. GraphViz
      3. Sigma.js
      4. Vivagraph.js
      5. yWorks
      6. Integrating visualization libraries in your application
      7. Visualization solutions
        1. Gephi
        2. Keylines
          1. Keylines graph visualization
        3. Linkurio.us
        4. Neo4j Browser
        5. Tom Sawyer Software for graph visualization
    3. Closing remarks on visualizations - pitfalls and issues
      1. The fireworks effect
      2. The loading effect
    4. Cytoscape example
      1. Source code
    5. Questions and answers
    6. Summary
  13. Data Refactoring with Neo4j
    1. Preliminary step
    2. Simple changes
      1. Renaming
      2. Adding data
        1. Adding data with a default value
        2. Adding data with specific values
        3. Checking our values
      3. Removing data
    3. Great changes
      1. Know your model
      2. Refactoring tools
      3. Property to label
      4. Property to node
      5. Related node to label
      6. Merging nodes
      7. Relations
    4. Consequences
    5. Summary
  14. Clustering
    1. Why set up a cluster?
    2. Concepts
      1. Core servers
      2. Read replica servers
      3. High throughput
      4. Data redundancy
      5. High availability
      6. Bolt
    3. Building a cluster
      1. The core servers
      2. The read replicas
      3. The bolt+routing protocol
    4. Disaster recovery
    5. Summary
  15. Use Case Example - Recommendations
    1. Recommender systems dissected
    2. Using a graph model for recommendations
    3. Specific query examples for recommendations
      1. Recommendations based on product purchases
      2. Recommendations based on brand loyalty
      3. Recommendations based on social ties
      4. Bringing it all together - compound recommendations
    4. Business variations on recommendations
    5. Fraud detection systems
    6. Access control systems
    7. Social networking systems
    8. Questions and answers
    9. Summary
  16. Use Case Example - Impact Analysis and Simulation
    1. Impact analysis systems dissected
      1. Impact analysis in business process management
      2. Modeling your business as a graph
        1. Which applications are used in which buildings?
        2. Which buildings are affected if something happens to Appl_9?
        3. What business processes with an RTO of 0-2 hours would be affected by a fire at location Loc_100?
    2. Impact simulation in a cost calculation environment
      1. Modeling your product hierarchy as a graph
      2. Working with a product hierarchy graph
        1. Calculating the price based on a full sweep of the tree
        2. Calculating the price based on intermediate pricing
        3. Impact simulation on product hierarchy
    3. Questions and answers
    4. Summary
  17. Tips and Tricks
    1. Reset password
      1. Check for other hosts
      2. Getting the first line of a CSV file
    2. Enabling SSH on a Raspberry Pi
    3. Creating guides for the Neo4j browser
    4. Data backup and restore
      1. Community version
      2. Enterprise version
    5. Tools
      1. Cypher-shell
      2. Data integration tools
      3. Modeling tools
        1. Arrows
        2. OmniGraffle
    6. Community projects
    7. Online documentation
    8. Community
    9. More proverbs

Product information

  • Title: Learning Neo4j 3.x - Second Edition
  • Author(s): Jérôme Baton, Rik Van Bruggen
  • Release date: October 2017
  • Publisher(s): Packt Publishing
  • ISBN: 9781786466143