Method 1: Using Scala JUnit test

Suppose you have written an application in Scala that can tell you how many distinct words there are in a document or text file, as follows:

package com.chapter16.SparkTesting

import org.apache.spark._
import org.apache.spark.sql.SparkSession

class wordCounterTestDemo {
  // Create a local SparkSession for this demo
  val spark = SparkSession
    .builder
    .master("local[*]")
    .config("spark.sql.warehouse.dir", "E:/Exp/")
    .appName("OneVsRestExample")
    .getOrCreate()

  // Returns the number of distinct space-separated words in the given file
  def myWordCounter(fileName: String): Long = {
    val input = spark.sparkContext.textFile(fileName)
    val counts = input.flatMap(_.split(" ")).distinct()
    val counter = counts.count()
    counter
  }
}

The preceding code simply parses a text file and performs a flatMap operation, splitting each line into words on spaces. It then applies distinct() to drop duplicate words and uses count() to return the number of unique words in the file.
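To exercise this method with JUnit, you can write a plain JUnit 4 test class that instantiates wordCounterTestDemo and asserts on the value returned by myWordCounter. The following is a minimal sketch; the file path and the expected count of 214 are illustrative assumptions, so substitute a local file whose distinct word count you know:

package com.chapter16.SparkTesting

import org.junit.Test
import org.junit.Assert.assertEquals

class wordCounterTest {
  // Reuses the class under test, which creates its own local SparkSession
  val wordCounter = new wordCounterTestDemo()

  @Test
  def shouldCountDistinctWords(): Unit = {
    // Assumed path and expected value; replace with a file you control
    val counter = wordCounter.myWordCounter("C:/data/words.txt")
    assertEquals(214L, counter)
  }
}

In a larger suite, you would typically also stop the SparkSession in an @After method so that each test run releases its resources.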
