A sample application

To better understand the life cycle of a Spark application, let's create a sample application and walk through its execution step by step. The following example shows the content of the data file that we will use in our application. The sale.csv file stores information, such as PRODUCT_CODE, COUNTRY_CODE, and the order AMOUNT, for each ORDER_ID:

$ cat sale.csv
ORDER_ID,PRODUCT_CODE,COUNTRY_CODE,AMOUNT
1,PC_01,USA,200.00
2,PC_01,USA,46.34
3,PC_04,USA,123.54
4,PC_02,IND,99.76
5,PC_44,IND,245.00
6,PC_02,AUS,654.21
7,PC_03,USA,75.00
8,PC_01,SPN,355.00
9,PC_03,USA,34.02
10,PC_03,USA,567.07

We shall now create a sample application using the Python API to find the total sales amount by country and sort the results in descending order by the total ...
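Before turning to Spark, it can help to see what result the application should produce. The following is a minimal sketch in plain Python, not Spark code and not the book's implementation: it only reproduces the same aggregation (sum AMOUNT per COUNTRY_CODE, then sort by the total in descending order) on the sale.csv rows listed above. The names SALE_CSV and total_sales_by_country are hypothetical and chosen for illustration, and the file content is inlined as a string so the sketch runs on its own.

```python
import csv
import io
from collections import defaultdict
from decimal import Decimal

# Contents of sale.csv as listed above, inlined so the sketch is self-contained.
# SALE_CSV is a hypothetical name used only in this illustration.
SALE_CSV = """\
ORDER_ID,PRODUCT_CODE,COUNTRY_CODE,AMOUNT
1,PC_01,USA,200.00
2,PC_01,USA,46.34
3,PC_04,USA,123.54
4,PC_02,IND,99.76
5,PC_44,IND,245.00
6,PC_02,AUS,654.21
7,PC_03,USA,75.00
8,PC_01,SPN,355.00
9,PC_03,USA,34.02
10,PC_03,USA,567.07
"""


def total_sales_by_country(csv_text):
    """Return (country, total) pairs sorted by total, largest first."""
    # Decimal keeps the currency arithmetic exact; Decimal() starts each sum at 0.
    totals = defaultdict(Decimal)
    for row in csv.DictReader(io.StringIO(csv_text)):
        totals[row["COUNTRY_CODE"]] += Decimal(row["AMOUNT"])
    # Sort on the summed amount, in descending order.
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    for country, total in total_sales_by_country(SALE_CSV):
        print(country, total)
```

Run against the ten rows above, this prints USA 1045.97, AUS 654.21, SPN 355.00, and IND 344.76, in that order. A Spark version of the application should arrive at the same totals and ordering.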
