July 2017
Intermediate to advanced
796 pages
18h 55m
English
An inner join returns only the rows whose State value is non-NULL and matches in both statesPopulationDF and statesTaxRatesDF. Each output row combines the columns of both datasets.

Join the two datasets by the state column as follows:
val joinDF = statesPopulationDF.join(statesTaxRatesDF, statesPopulationDF("State") === statesTaxRatesDF("State"), "inner")

%sql
val joinDF = spark.sql("SELECT * FROM statesPopulationDF INNER JOIN statesTaxRatesDF ON statesPopulationDF.State = statesTaxRatesDF.State")

scala> joinDF.count
res22: Long = 329

scala> joinDF.show
+--------------------+----+----------+--------------------+-------+
|               State|Year|Population|               State|TaxRate|
+--------------------+----+----------+--------------------+-------+
...
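To see how NULL keys and unmatched keys behave in an inner join, the following is a minimal sketch that runs against a local SparkSession. The sample rows, the object name InnerJoinSketch, and the helper sampleJoin are hypothetical and are not taken from the datasets used above; only the join call mirrors the one in the text.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}

object InnerJoinSketch {

  // Builds two tiny DataFrames (hypothetical sample data) and inner-joins them on State.
  def sampleJoin(spark: SparkSession): DataFrame = {
    import spark.implicits._

    val population = Seq(
      (Option("Alabama"), 2010, 4785492L),
      (Option("Alaska"), 2010, 714031L),      // no matching tax-rate row
      (Option.empty[String], 2010, 0L)        // NULL State
    ).toDF("State", "Year", "Population")

    val taxRates = Seq(
      (Option("Alabama"), 4.0),
      (Option("Texas"), 6.25),                // no matching population row
      (Option.empty[String], 0.0)             // NULL State
    ).toDF("State", "TaxRate")

    // === is SQL equality: NULL === NULL evaluates to NULL, so NULL keys never match.
    population.join(taxRates, population("State") === taxRates("State"), "inner")
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("InnerJoinSketch")
      .master("local[*]")
      .getOrCreate()

    val joinDF = sampleJoin(spark)
    joinDF.show()
    println(joinDF.count()) // 1: only the Alabama row has a non-NULL match on both sides

    spark.stop()
  }
}
```

The two NULL-keyed rows are dropped even though both sides contain one, because the join condition uses ordinary equality. If NULL keys should match each other, Spark's null-safe equality operator `<=>` can be used in place of `===`.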