Appendix M. Reference for joins

This appendix contains the reference material for joins. It walks you through the various types of joins with short examples, so you can quickly pick the right one (or left one, if my editors allow me a little joke) based on your needs.
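To make the idea of "picking the right join" concrete before the labs, here is a minimal sketch in plain Java, without Spark, of the difference between an inner join and a left (outer) join. The class name JoinSketch, the helper methods, and the sample data are all hypothetical and are not part of the book's labs. The sketch also assumes each key appears at most once per side, which is a simplification compared to real dataframes.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only: plain Java collections, not the Spark API.
public class JoinSketch {

  /** Inner join: keeps only the keys that exist on both sides. */
  public static List<String> innerJoin(
      Map<Integer, String> left, Map<Integer, String> right) {
    List<String> rows = new ArrayList<>();
    for (Map.Entry<Integer, String> l : left.entrySet()) {
      if (right.containsKey(l.getKey())) {
        rows.add(l.getKey() + "," + l.getValue() + "," + right.get(l.getKey()));
      }
    }
    return rows;
  }

  /** Left join: keeps every left row; the right value is null when unmatched. */
  public static List<String> leftJoin(
      Map<Integer, String> left, Map<Integer, String> right) {
    List<String> rows = new ArrayList<>();
    for (Map.Entry<Integer, String> l : left.entrySet()) {
      // Map.get() returns null for a missing key, which prints as "null".
      rows.add(l.getKey() + "," + l.getValue() + "," + right.get(l.getKey()));
    }
    return rows;
  }

  public static void main(String[] args) {
    // Made-up sample data: id -> name on the left, id -> city on the right.
    Map<Integer, String> people = new LinkedHashMap<>();
    people.put(1, "Alice");
    people.put(2, "Bob");
    people.put(3, "Carol");

    Map<Integer, String> cities = new LinkedHashMap<>();
    cities.put(1, "Paris");
    cities.put(3, "Durham");

    System.out.println("inner: " + innerJoin(people, cities));
    System.out.println("left:  " + leftJoin(people, cities));
  }
}
```

In this sketch, the inner join drops Bob because he has no matching city, while the left join keeps him with an empty (null) city. That trade-off, dropping unmatched rows versus keeping them with missing values, is usually the first question to settle when choosing a join type.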

The labs in this appendix are based on chapter 12, where you learned more about transformations. Two labs explore joins: labs #940 and #941.

This appendix does not mention the union() and unionByName() methods, which can be used to combine (union) dataframes together. Those methods are used in chapters 3, 15, and 17.

Lab Examples from this chapter are available on GitHub at https://github.com/jgperrin/net.jgp.books.spark.ch12.

M.1 Setting up the decorum ...
