Dataset preparation for training

Since we do not have any unlabeled data, we will randomly select some samples for testing. One more thing to note is that the features and the labels come in two separate files. Therefore, we perform the necessary preprocessing first and then join the two, so that our preprocessed data has the features and the labels together.
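The preprocessing code itself is not shown at this point, but a minimal sketch of the join step could look like the following. The file names (data/features.csv and data/labels.csv), the shared sample identifier column id, and the SparkSession setup are assumptions for illustration, not the actual values used in the project:

import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;

SparkSession spark = SparkSession.builder()
        .appName("DatasetPreparation")
        .master("local[*]")
        .getOrCreate();

// Read the feature file and the label file separately (hypothetical paths).
Dataset<Row> features = spark.read()
        .option("header", "true")
        .option("inferSchema", "true")
        .csv("data/features.csv");

Dataset<Row> labels = spark.read()
        .option("header", "true")
        .option("inferSchema", "true")
        .csv("data/labels.csv");

// Join on the shared sample identifier so that every row carries
// both its feature columns and its label.
Dataset<Row> joined = features.join(labels, "id");

An inner join on the identifier keeps only the samples that appear in both files, which is what we want before splitting the data.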

The remaining samples will then be used for training. Finally, we will save the training and test sets in separate CSV files to be used later on. First, let's load the samples and look at the statistics. Note that we use Spark's read() method, but we also specify the necessary options and the format:

Dataset<Row> data = spark.read()
        .option("maxColumns", 25000)
        .format("com.databricks.spark.csv")
        .option("header", ...
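Once the features and labels have been joined into a single Dataset (called data here), the split-and-save steps described above can be sketched as follows. The 80/20 ratio, the seed, and the output paths are placeholders rather than values taken from the project:

// Randomly split into training and test sets, then persist both as CSV.
Dataset<Row>[] splits = data.randomSplit(new double[]{0.8, 0.2}, 12345L);
Dataset<Row> training = splits[0];
Dataset<Row> test = splits[1];

// Write each split with a header row so it can be reloaded later.
training.write().option("header", "true").csv("output/training");
test.write().option("header", "true").csv("output/test");

Note that the CSV writer produces a directory of part files for each split; calling coalesce(1) on a Dataset before writing yields a single CSV file if that is preferred.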
