Create the FraudDetection trait

In an empty FraudDetectionPipeline.scala file, add the following imports. They provide logging, feature vector creation, and the DataFrame and SparkSession types, respectively:

import org.apache.log4j.{Level, Logger}
import org.apache.spark.ml.linalg.Vectors
import org.apache.spark.sql.{DataFrame, SparkSession}

This trait is central to the pipeline: it holds a method for SparkSession creation along with other shared code. Classes that extend this trait can share one instance of a SparkSession:

trait FraudDetectionWrapper {

Next, we need the path to the training dataset, which is used for cross-validation and is crucial to our classification:

val trainSetFileName = "training.csv"
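To make the idea of a shared session concrete, here is a minimal sketch of how a trait can expose one SparkSession to everything that mixes it in. It is not the book's implementation: the trait name SessionWrapperSketch, the objects StageA and StageB, the local[*] master, and the application name are all illustrative assumptions, and the logging call assumes the log4j 1.x API that the imports above refer to. Only the trainSetFileName value comes from the text.

```scala
import org.apache.log4j.{Level, Logger}
import org.apache.spark.sql.SparkSession

// Hypothetical trait name; the book's own trait body is not reproduced here.
trait SessionWrapperSketch {

  // File name taken from the text above.
  val trainSetFileName = "training.csv"

  // Assumption: quiet Spark's internal logging (log4j 1.x style API).
  Logger.getLogger("org").setLevel(Level.ERROR)

  // Assumptions: a local master and an illustrative application name.
  // getOrCreate() hands back an already running session when one exists,
  // so objects mixing in this trait end up using the same session.
  lazy val session: SparkSession = SparkSession.builder()
    .master("local[*]")
    .appName("fraud-detection-sketch")
    .getOrCreate()
}

// Two hypothetical pipeline stages that mix in the trait.
object StageA extends SessionWrapperSketch
object StageB extends SessionWrapperSketch

object SketchApp extends App {
  // Reference equality check: are both stages holding the same session?
  println(StageA.session eq StageB.session)
  StageA.session.stop()
}
```

Declaring the session as a lazy val defers creation until a stage first needs it, and relying on getOrCreate() avoids starting a second session in the same JVM.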

The entry point to programming ...
