8

Building a GDS Pipeline for Node Classification Model Training

Classifying observations into categories is a classic machine learning (ML) task. As we learned in the preceding chapters, existing ML models such as decision trees can classify a graph’s nodes, with the graph structure supplying extra features that bring more knowledge into the model. In this chapter, we will discover another key feature of the Neo4j GDS library: pipelines. A pipeline lets you configure and train an ML model, then use it to make predictions on unseen nodes. You can do all of this from within Neo4j, without adding another library such as scikit-learn to the tech stack.
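To make the configure, train, and predict sequence concrete, here is a minimal sketch that only assembles the Cypher calls such a pipeline typically involves, in execution order. It does not connect to a database. The procedure names follow the `gds.beta.pipeline.nodeClassification` namespace of GDS 2.x and may differ in other versions; the graph, pipeline, model, and property names are hypothetical placeholders, not taken from the book's dataset.

```python
def node_classification_statements(graph_name, pipeline_name, model_name, target_property):
    """Return, in execution order, Cypher calls for a GDS node classification pipeline.

    Assumption: GDS 2.x procedure names (beta namespace). All argument values
    passed by the caller are illustrative placeholders.
    """
    ns = "gds.beta.pipeline.nodeClassification"
    return [
        # 1. Create an empty, named pipeline in the pipeline catalog.
        f"CALL {ns}.create('{pipeline_name}')",
        # 2. Add a graph algorithm step; its result is stored as a node property.
        f"CALL {ns}.addNodeProperty('{pipeline_name}', 'degree', {{mutateProperty: 'degree'}})",
        # 3. Declare which node properties the model uses as features.
        f"CALL {ns}.selectFeatures('{pipeline_name}', ['degree'])",
        # 4. Register a model candidate (here, logistic regression).
        f"CALL {ns}.addLogisticRegression('{pipeline_name}', {{maxEpochs: 100}})",
        # 5. Train on a projected graph and store the winning model under model_name.
        f"CALL {ns}.train('{graph_name}', {{pipeline: '{pipeline_name}', "
        f"modelName: '{model_name}', targetProperty: '{target_property}', "
        f"metrics: ['F1_WEIGHTED'], randomSeed: 42}})",
        # 6. Stream predictions from the trained model for nodes of a projected graph.
        f"CALL {ns}.predict.stream('{graph_name}', {{modelName: '{model_name}'}}) "
        f"YIELD nodeId, predictedClass",
    ]


if __name__ == "__main__":
    # Hypothetical names, for illustration only.
    for statement in node_classification_statements(
        "myGraph", "myPipeline", "myModel", "category"
    ):
        print(statement)
```

Each string could then be run against Neo4j, for example through the official Python driver or Neo4j Browser. Keeping the steps as an ordered list makes the point that every stage of the workflow is a procedure call executed inside the database.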

Also, we are going to work on the Netflix dataset we created earlier in this book ...

Get Graph Data Science with Neo4j now with the O’Reilly learning platform.
