How to do it...

  1. We downloaded the prepared data file in LIBSVM format from: https://www.csie.ntu.edu.tw/~cjlin/libsvmtools/datasets/multiclass/glass.scale

The dataset contains 214 rows and 11 attributes: an ID number, nine measurements, and the type of glass.
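For orientation, each line of a LIBSVM file holds a label followed by sparse `index:value` pairs with 1-based feature indices. The following is a minimal Python sketch of reading one such line. The `parse_libsvm_line` helper and the sample line are hypothetical illustrations, not part of the recipe and not Spark's own loader.

```python
def parse_libsvm_line(line):
    """Parse one LIBSVM line into (label, {feature_index: value})."""
    parts = line.split()
    label = float(parts[0])          # first token is the class label
    features = {}
    for pair in parts[1:]:
        index, value = pair.split(":", 1)
        features[int(index)] = float(value)  # indices are 1-based
    return label, features


# Hypothetical sample line (made-up values, not copied from glass.scale)
sample = "1 1:-0.5 2:0.25 4:1.0"
label, features = parse_libsvm_line(sample)
print(label, features)
```

Features that are absent from a line (index 3 in the sample) are treated as zero by convention, which is why a dictionary is used here rather than a dense list.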

  2. The original dataset and data dictionary are also available at the UCI website: http://archive.ics.uci.edu/ml/datasets/Glass+Identification
    • ID number: 1 to 214
    • RI: Refractive index
    • Na: Sodium (unit measurement: weight percent in corresponding oxide, as are attributes 4-10)
    • Mg: Magnesium
    • Al: Aluminum
    • Si: Silicon
    • K: Potassium
    • Ca: Calcium
    • Ba: Barium
    • Fe: Iron

Type of glass: We will find our class attributes, or clusters, using BisectingKMeans():

  • building_windows_float_processed
  • building_windows_non_float_processed
  • vehicle_windows_float_processed
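
To make the clustering idea concrete, here is a minimal plain-Python sketch of the general bisecting k-means technique: start with one cluster and repeatedly split a cluster in two with 2-means until `k` clusters exist. This is not Spark's BisectingKMeans() implementation. The function names, the toy points, and the choice to always split the cluster with the largest squared error are assumptions made for illustration.

```python
import random


def mean(points):
    """Component-wise mean of a non-empty list of points."""
    n = len(points)
    return [sum(p[d] for p in points) / n for d in range(len(points[0]))]


def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def sse(points):
    """Sum of squared distances from each point to the cluster mean."""
    center = mean(points)
    return sum(sq_dist(p, center) for p in points)


def two_means(points, rng, iterations=20):
    """Split one cluster into two groups with plain 2-means."""
    centers = rng.sample(points, 2)
    groups = ([], [])
    for _ in range(iterations):
        groups = ([], [])
        for p in points:
            nearest = 0 if sq_dist(p, centers[0]) <= sq_dist(p, centers[1]) else 1
            groups[nearest].append(p)
        if not groups[0] or not groups[1]:
            break  # degenerate split, e.g. all points identical
        centers = [mean(groups[0]), mean(groups[1])]
    return groups


def bisecting_kmeans(points, k, seed=0):
    """Return up to k clusters by repeatedly bisecting the worst cluster."""
    rng = random.Random(seed)
    clusters = [list(points)]
    while len(clusters) < k:
        # Assumption: split the cluster with the largest squared error
        worst = max(clusters, key=sse)
        if len(worst) < 2:
            break
        left, right = two_means(worst, rng)
        if not left or not right:
            break  # the cluster cannot be split any further
        clusters.remove(worst)
        clusters.extend([left, right])
    return clusters


# Toy data (hypothetical): two well-separated groups of 2-D points
points = [[0, 0], [0, 1], [1, 0], [10, 10], [10, 11], [11, 10]]
for cluster in bisecting_kmeans(points, k=2):
    print(sorted(cluster))
```

Because the procedure is top-down, it yields a hierarchy of splits, unlike standard k-means, which assigns all `k` centers at once.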
