Java Deep Learning Projects

by Md. Rezaul Karim
June 2018
Intermediate to advanced
436 pages
10h 33m
English
Packt Publishing

Dataset preparation for training

Since we do not have any unlabeled data, I will randomly select some samples for testing. Note also that the features and labels come in two separate files. We therefore perform the necessary preprocessing and then join them, so that the preprocessed data contains features and labels together.

The rest of the samples will be used for training. Finally, we'll save the training and test sets in separate CSV files to be used later on. First, let's load the samples and look at the statistics. We use Spark's read() method, specifying the necessary options and format:

Dataset<Row> data = spark.read()
        .option("maxColumns", 25000)
        .format("com.databricks.spark.csv")
        .option("header", ...
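
The preparation steps described above (join features with labels, hold out a random sample for testing, train on the rest) can be sketched in plain Java, independent of Spark. This is only an illustration under stated assumptions, not the book's code: the class name `TrainTestSplitSketch`, the helpers `joinFeaturesAndLabels` and `randomSplit`, the 25% test fraction, the seed, and the toy rows are all hypothetical, and the join assumes that row i of the feature file corresponds to row i of the label file.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Random;

// Hypothetical sketch: names, fractions, and data are illustrative assumptions.
class TrainTestSplitSketch {

    /**
     * Joins each feature row with the label at the same index,
     * appending the label as the last CSV column.
     * Assumes both lists are in the same row order.
     */
    public static List<String> joinFeaturesAndLabels(List<String> featureRows, List<String> labels) {
        if (featureRows.size() != labels.size()) {
            throw new IllegalArgumentException("features and labels must have the same number of rows");
        }
        List<String> joined = new ArrayList<>();
        for (int i = 0; i < featureRows.size(); i++) {
            joined.add(featureRows.get(i) + "," + labels.get(i));
        }
        return joined;
    }

    /**
     * Shuffles a copy of the rows with a fixed seed and returns
     * a two-element list: index 0 is the training set, index 1 the test set.
     */
    public static List<List<String>> randomSplit(List<String> rows, double testFraction, long seed) {
        List<String> shuffled = new ArrayList<>(rows);
        Collections.shuffle(shuffled, new Random(seed));
        int testSize = (int) Math.round(rows.size() * testFraction);
        List<String> test = new ArrayList<>(shuffled.subList(0, testSize));
        List<String> train = new ArrayList<>(shuffled.subList(testSize, shuffled.size()));
        List<List<String>> result = new ArrayList<>();
        result.add(train);
        result.add(test);
        return result;
    }

    public static void main(String[] args) {
        // Toy stand-ins for the two separate input files.
        List<String> features = Arrays.asList("0.1,0.2", "0.3,0.4", "0.5,0.6", "0.7,0.8");
        List<String> labels = Arrays.asList("A", "B", "A", "B");

        List<String> joined = joinFeaturesAndLabels(features, labels);
        List<List<String>> split = randomSplit(joined, 0.25, 12345L);

        System.out.println("train: " + split.get(0));
        System.out.println("test:  " + split.get(1));
    }
}
```

Fixing the seed keeps the split reproducible between runs, which matters when the two sets are saved to CSV and reused later. With Spark, the same idea would normally be expressed on the Dataset itself rather than on in-memory lists.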


Publisher Resources

ISBN: 9781788997454