July 2017
The join type you choose directly affects the performance of the join. Joins require data to be shuffled between executors so that rows with matching keys can be processed together, so both the type of each join and the order in which the joins run need to be considered.
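To make the cost of shuffling concrete, here is a minimal sketch in plain Python rather than Spark's own API. It assumes integer keys, three partitions, and simple modulo partitioning; `NUM_PARTITIONS`, `target_partition`, and `shuffle` are hypothetical names invented for this illustration. It counts how many rows have to change partition before rows with equal keys sit together.

```python
from collections import defaultdict

NUM_PARTITIONS = 3  # hypothetical number of partitions (assumption for this sketch)


def target_partition(key: int) -> int:
    """Partition that must hold every row with this key (simple modulo scheme)."""
    return key % NUM_PARTITIONS


def shuffle(partitions):
    """Redistribute (key, value) rows so that equal keys are co-located.

    Returns the new partitions and the number of rows that changed partition.
    """
    shuffled = defaultdict(list)
    moved = 0
    for current, rows in enumerate(partitions):
        for key, value in rows:
            dest = target_partition(key)
            shuffled[dest].append((key, value))
            if dest != current:
                moved += 1  # this row would travel to another partition
    return [shuffled[p] for p in range(NUM_PARTITIONS)], moved


# Rows as they happen to be laid out before the join (arbitrary placement).
left = [[(1, "a"), (2, "b")], [(3, "c"), (4, "d")], [(5, "e"), (6, "f")]]
left_shuffled, left_moved = shuffle(left)
print(left_moved, "of 6 rows moved")
```

In this toy layout four of the six rows move. Each additional join whose inputs are not already partitioned on the join key can trigger another redistribution of this kind, which is why the order of the joins matters.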
Use the following table as a reference when writing join code:
| Join type | Performance considerations and tips |
| inner | An inner join requires the left and right tables to share a common join column. If keys are duplicated on either the left or the right side, the join quickly blows up into something resembling a Cartesian join and takes far longer to complete than a join designed to minimize duplicate keys. |
| cross | Cross Join ... |
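The duplicate-key warning for the inner join can be seen in a small sketch. This is plain Python, not Spark code, and `inner_join` is a hypothetical helper written only for this illustration. For each key, the output contains one row per pairing of a left row with a right row, so duplicates on both sides multiply.

```python
from collections import defaultdict


def inner_join(left, right):
    """Hash-based inner join on the first tuple element.

    Returns (key, left_value, right_value) rows.
    """
    right_by_key = defaultdict(list)
    for key, value in right:
        right_by_key[key].append(value)
    # Every left row is paired with every right row that has the same key.
    return [(key, lv, rv) for key, lv in left for rv in right_by_key.get(key, [])]


# Unique keys on both sides: one output row per key.
unique_left = [(1, "a"), (2, "b"), (3, "c")]
unique_right = [(1, "x"), (2, "y"), (3, "z")]

# The same key repeated three times on each side.
dup_left = [(1, "a1"), (1, "a2"), (1, "a3")]
dup_right = [(1, "x1"), (1, "x2"), (1, "x3")]

unique_rows = inner_join(unique_left, unique_right)
dup_rows = inner_join(dup_left, dup_right)
print(len(unique_rows), len(dup_rows))
```

Both inputs have three rows per side, but the duplicated-key case produces nine output rows instead of three, which is the Cartesian-style growth described in the table.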