January 2019
Beginner to intermediate
154 pages
4h 31m
English
The union() transformation takes another RDD as input and produces a new RDD containing the elements of both RDDs. Let's create two RDDs, one with the numbers 1 to 5 and another with the numbers 5 to 10, and then concatenate them to get a new RDD with the numbers 1 to 10. Because union() does not remove duplicates, the number 5, which is present in both inputs, appears twice in the result:
#Python
firstRDD = spark.sparkContext.parallelize(range(1,6))
secondRDD = spark.sparkContext.parallelize(range(5,11))
firstRDD.union(secondRDD).collect()
The following code performs the same operation in Scala:
//scala
val firstRDD = spark.sparkContext.parallelize(1 to 5)
val secondRDD = spark.sparkContext.parallelize(5 to 10)
firstRDD.union(secondRDD).collect()
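To see the duplicate handling without a running Spark session, the following plain-Python sketch models the result. It is only an illustration: union_model and distinct_model are hypothetical helper names, not Spark APIs, and the lists stand in for the values that collect() would return. In Spark itself, chaining distinct() after union() is the usual way to drop the repeated 5.

```python
# Plain-Python model of RDD.union() semantics; no Spark required.
# union_model and distinct_model are hypothetical helpers, not Spark APIs.

def union_model(first, second):
    """Concatenate two collections and keep duplicates, as RDD.union() does."""
    return list(first) + list(second)

def distinct_model(items):
    """Drop duplicates. Sorted here for a deterministic result;
    Spark's distinct() makes no ordering guarantee."""
    return sorted(set(items))

first = range(1, 6)    # 1..5, mirrors firstRDD
second = range(5, 11)  # 5..10, mirrors secondRDD

combined = union_model(first, second)
print(combined)                  # [1, 2, 3, 4, 5, 5, 6, 7, 8, 9, 10]
print(distinct_model(combined))  # [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
```

The combined result has 11 elements rather than 10, which is the quickest check that the overlap was kept.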