Developing a machine learning application

In this section, we will present a machine learning example for textual analysis. Refer to Chapter 6, Using Spark SQL in Machine Learning Applications, for more details about the machine learning code presented in this section.

The Dataset used in the following example contains 1,080 documents of free-text business descriptions of Brazilian companies, classified into nine categories. You can download this Dataset from https://archive.ics.uci.edu/ml/datasets/CNAE-9.

scala> val inRDD = spark.sparkContext.textFile("file:///Users/aurobindosarkar/Downloads/CNAE-9.data")
scala> val rowRDD = inRDD.map(_.split(",")).map(attributes => Row(attributes(0).toDouble, attributes(1).toDouble, attributes(2).toDouble, ...
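The snippet above converts each comma-separated attribute to a Double one position at a time. As a minimal sketch of the same idea written generically, the following plain Scala helper converts a whole line in one pass. The name `parseLine` and the sample line are hypothetical and not part of the original example, and the sketch assumes every field in a line is numeric.

```scala
// Hypothetical helper (not from the original text): converts one
// comma-separated line into a sequence of Doubles.
// Assumes every field is numeric; a non-numeric field would throw
// a NumberFormatException.
def parseLine(line: String): Seq[Double] =
  line.split(",").map(_.trim.toDouble).toSeq

// Made-up sample line for illustration only; not taken from CNAE-9.data.
val sample = "1,0,2,0,3"
val values = parseLine(sample)

// In spark-shell, the same helper could be applied to the RDD of lines,
// wrapping each parsed sequence in a Row:
//   import org.apache.spark.sql.Row
//   val rowRDD = inRDD.map(line => Row.fromSeq(parseLine(line)))
```

Building each Row from a sequence avoids spelling out every attribute index by hand, which may be convenient when a file has many columns.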
