Developing a machine learning application

In this section, we will present a machine learning example for textual analysis. Refer to Chapter 6, Using Spark SQL in Machine Learning Applications, for more details about the machine learning code presented in this section.

The dataset used in the following example contains 1,080 documents of free-text business descriptions of Brazilian companies, categorized into a subset of nine categories. You can download this dataset from

scala> val inRDD = spark.sparkContext.textFile("file:///Users/aurobindosarkar/Downloads/")
scala> val rowRDD = inRDD.map(_.split(",")).map(attributes => Row(attributes(0).toDouble, attributes(1).toDouble, attributes(2).toDouble, ...
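The per-line parsing step in the snippet above (split on commas, then convert each field with toDouble) can be tried outside Spark. The following is a minimal sketch in plain Scala, not the book's code: the object name ParseLineSketch, the helper parseLine, and the expectedFields parameter are hypothetical names introduced for illustration, and the sample lines are made up rather than taken from the dataset. Unlike a bare toDouble call, this version returns None for a malformed line instead of throwing.

```scala
import scala.util.Try

// Hypothetical helper (not from the book): parse one comma-separated line
// into numeric fields, returning None when the line is malformed.
object ParseLineSketch {

  def parseLine(line: String, expectedFields: Int): Option[Vector[Double]] = {
    // A limit of -1 keeps trailing empty fields, so the length check sees them.
    val fields = line.split(",", -1).map(_.trim).toVector
    if (fields.length != expectedFields) None
    else Try(fields.map(_.toDouble)).toOption // None if any field is non-numeric
  }

  def main(args: Array[String]): Unit = {
    // Made-up sample lines: one valid, one too short, one non-numeric.
    val sample = Seq("1,0,3", "2,5", "4,x,6")
    sample.foreach(l => println(s"$l -> ${parseLine(l, 3)}"))
  }
}
```

In a Spark job, a function of this shape could be applied to each element of inRDD before the Row objects are built, so that a single bad line does not fail the whole task. Whether silently dropping such lines is acceptable depends on the data.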
