Chapter 11. Entity Resolution Revisited

This chapter uses entity resolution for a streaming video service as an example of unsupervised machine learning with graph algorithms. After completing this chapter, you should be able to:

  • Name the categories of graph algorithms that are appropriate for entity resolution as unsupervised learning

  • List three different approaches for assessing the similarity of entities

  • Understand how parameterized weights can adapt entity resolution to be a supervised learning task

  • Interpret a simple GSQL FROM clause and have a general understanding of ACCUM semantics

  • Set up and run a TigerGraph Cloud Starter Kit using GraphStudio

Problem: Identify Real-World Users and Their Tastes

The streaming video on demand (SVoD) market is big business. Accurate estimates of the global market size are hard to come by, but the most conservative estimate may be $50 billion in 2020,[1] with annual growth rates ranging from 11%[2] to 21%[3] for the next five years or so. Movie studios, television networks, communication networks, and tech giants have been merging and reinventing themselves, in hopes of becoming a leader in the new preferred format for entertainment consumption: on-demand digital entertainment, on any video-capable device.

To succeed, SVoD providers need to have the content to attract and retain many millions of subscribers. Traditional video technology (movie theaters and broadcast television) limited the provider to offering only one program at a ...
