July 2017
796 pages
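A cross join matches every row from the left dataset with every row from the right dataset, generating a Cartesian product. The result therefore contains as many rows as the left row count multiplied by the right row count.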
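To make the Cartesian-product behavior concrete, here is a minimal plain-Scala sketch that does not use Spark at all. The `CrossJoinSketch` object, the case classes, and the `StateA`/`StateB`/`StateC` rows are hypothetical illustration data, not the book's datasets; the sketch only mirrors what a cross join does conceptually.

```scala
// Plain-Scala sketch of cross-join semantics; no Spark required.
// All names and rows below are hypothetical illustration data.
object CrossJoinSketch {
  final case class PopulationRow(state: String, year: Int, population: Long)
  final case class TaxRateRow(state: String, taxRate: Double)

  // Pair every left row with every right row; no join key is consulted.
  def crossJoin[A, B](left: Seq[A], right: Seq[B]): Seq[(A, B)] =
    for (l <- left; r <- right) yield (l, r)

  val populations: Seq[PopulationRow] = Seq(
    PopulationRow("StateA", 2010, 100L),
    PopulationRow("StateB", 2010, 200L),
    PopulationRow("StateC", 2010, 300L)
  )

  val taxRates: Seq[TaxRateRow] = Seq(
    TaxRateRow("StateA", 4.0),
    TaxRateRow("StateB", 5.5)
  )

  def main(args: Array[String]): Unit = {
    val joined = crossJoin(populations, taxRates)
    // The result size is always left.size * right.size.
    println(joined.size == populations.size * taxRates.size)
    // Rows whose state values differ are paired too, unlike an inner join.
    println(joined.exists { case (p, t) => p.state != t.state })
  }
}
```

Because the output grows multiplicatively, a cross join of two large datasets can become very expensive, which is worth keeping in mind before running one on real data.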

Cross join the two datasets as follows. No join condition is specified, so the State column is not used for matching:
scala> val joinDF=statesPopulationDF.crossJoin(statesTaxRatesDF)
joinDF: org.apache.spark.sql.DataFrame = [State: string, Year: int ... 3 more fields]

%sql
val joinDF = spark.sql("SELECT * FROM statesPopulationDF CROSS JOIN statesTaxRatesDF")

scala> joinDF.count
res46: Long = 16450

scala> joinDF.show(10)
+-------+----+----------+-----------+-------+
|  State|Year|Population|      State|TaxRate|
+-------+----+----------+-----------+-------+
|Alabama|2010|   4785492|    Alabama|    4.0|
|Alabama|2010| ...