Join reordering

Another way to improve query performance is join reordering.

Consider the following query structure, with a binary relationship between vertices A and B, and incoming and outgoing relations (edges) between B, C, D, and E:

During the graph-relational translation, a naive query planner will create a plan similar to the following, where resolving each edge between two vertices leads to the creation of one join operation in the execution plan:
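As a rough sketch of that naive translation, the following Python snippet folds a list of edges into a nested plan, adding one join per edge. The tuple-based plan representation and the helper names (`left_deep_plan`, `count_joins`, `is_left_deep`) are hypothetical and chosen only for illustration; they are not how Spark or GraphFrames represent plans internally.

```python
from functools import reduce

# Hypothetical pattern from the text: A-B, plus edges between B and C, D, E.
edges = [("A", "B"), ("B", "C"), ("B", "D"), ("B", "E")]


def left_deep_plan(start, edges):
    """Fold the edges left to right; every edge adds one join on top of the plan so far."""
    return reduce(
        lambda plan, edge: ("join", plan, ("edge",) + edge),
        edges,
        ("scan", start),
    )


def count_joins(plan):
    """Count the join nodes in a plan tree."""
    if plan[0] != "join":
        return 0
    return 1 + count_joins(plan[1]) + count_joins(plan[2])


def is_left_deep(plan):
    """A plan is left-deep when the right input of every join is a base relation."""
    if plan[0] != "join":
        return True
    return plan[2][0] != "join" and is_left_deep(plan[1])


plan = left_deep_plan("A", edges)
print(plan)
```

Because each join takes the whole previous plan as its left input and a single edge relation as its right input, the tree only ever grows on the left side.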

This is known as a left-deep plan, which is not optimal ...
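To get a feel for why join order matters, here is a minimal sketch using a toy cost model. Everything in it is an assumption made for illustration: the row counts, the single fixed selectivity, and the cost function (the sum of intermediate result sizes). A real optimizer relies on collected statistics and a far richer model.

```python
from itertools import permutations

# Hypothetical row counts per edge relation; made up for illustration.
cardinality = {"A-B": 1000, "B-C": 10, "B-D": 500, "B-E": 50}

# Assumed fraction of row pairs that survive each join.
SELECTIVITY = 0.01


def plan_cost(order):
    """Sum of intermediate result sizes when joining the relations in the given order."""
    size = cardinality[order[0]]
    cost = 0.0
    for name in order[1:]:
        size = size * cardinality[name] * SELECTIVITY
        cost += size
    return cost


# The naive order simply follows the order in which the edges were written.
naive = ("A-B", "B-C", "B-D", "B-E")

# Exhaustive search is fine for four relations; it does not scale to many.
best = min(permutations(cardinality), key=plan_cost)

print("naive:", naive, plan_cost(naive))
print("best: ", best, plan_cost(best))
```

All four relations share vertex B, so every ordering is a valid sequence of joins in this toy setup. Under these assumptions, starting with the smallest relations keeps the intermediate results small, which is the intuition behind reordering joins.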
