July 2017
Intermediate to advanced
796 pages
18h 55m
English
A left anti join returns only those rows of statesPopulationDF for which there is no corresponding row in statesTaxRatesDF. Only the columns of statesPopulationDF appear in the result.

Join the two datasets by the State column as follows:
val joinDF = statesPopulationDF.join(statesTaxRatesDF, statesPopulationDF("State") === statesTaxRatesDF("State"), "leftanti")

%sql
val joinDF = spark.sql("SELECT * FROM statesPopulationDF LEFT ANTI JOIN statesTaxRatesDF ON statesPopulationDF.State = statesTaxRatesDF.State")

scala> joinDF.count
res22: Long = 28

scala> joinDF.show(5)
+--------+----+----------+
|   State|Year|Population|
+--------+----+----------+
|  Alaska|2010|    714031|
|Delaware|2010|    899816|
...
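
To make the semantics concrete without a Spark session, the following is a minimal sketch of a left anti join over plain Scala collections. It is not the Spark API: the case classes PopulationRow and TaxRateRow, the helper leftAnti, and every row value are hypothetical stand-ins chosen for illustration, not the book's data.

```scala
// Hypothetical row types standing in for the two DataFrames.
final case class PopulationRow(state: String, year: Int, population: Long)
final case class TaxRateRow(state: String, taxRate: Double)

object LeftAntiJoinSketch {
  // Left anti join on the state key: keep a left row only when
  // no right row has the same state. Only left-side fields are returned.
  def leftAnti(left: Seq[PopulationRow], right: Seq[TaxRateRow]): Seq[PopulationRow] = {
    val rightKeys = right.map(_.state).toSet
    left.filterNot(row => rightKeys.contains(row.state))
  }

  // Made-up sample data (placeholder state names and numbers).
  val population: Seq[PopulationRow] = Seq(
    PopulationRow("StateA", 2010, 100L),
    PopulationRow("StateB", 2010, 200L),
    PopulationRow("StateC", 2010, 300L)
  )
  val taxRates: Seq[TaxRateRow] = Seq(TaxRateRow("StateB", 7.5))

  // StateB has a tax-rate row, so it is dropped; StateA and StateC remain.
  val result: Seq[PopulationRow] = leftAnti(population, taxRates)
}
```

Collecting the right-hand keys into a Set first keeps each membership check cheap, and it mirrors the idea that an anti join only needs to know whether a match exists, not what the matching row contains.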