Preparing the training and test sets

One of the most important parts of the data science pipeline, after data collection (which was in a sense outsourced—we use data collected by others) is data preprocessing, that is, clearing the dataset and transforming it to suit our needs.

So, our goal is to predict the direction of price change from the actual price in dollars over time. To do that, we define variables such as file, symbol, batchSize, splitRatio, and epochs. You can see the explanation of each variable in the inline comments within this code:

// StockPricePrediction.javaString file = "data/prices-split-adjusted.csv";String symbol = "GRMN"; // stock nameint batchSize = 128; // mini-batch sizedouble splitRatio = 0.8; // 80% for training, ...

Get Java Deep Learning Projects now with the O’Reilly learning platform.

O’Reilly members experience books, live events, courses curated by job role, and more from O’Reilly and nearly 200 top publishers.