Java Deep Learning Cookbook

by Rahul Raj
November 2019
Intermediate to advanced
304 pages
8h 40m
English
Packt Publishing

Tokenizing data and training the model

Word2Vec models are built from tokens, so the text must be tokenized before training. The context of a sentence (document) is determined by the words in it. Word2Vec takes words rather than whole sentences (documents) as input, so we need to break each sentence into atomic units, creating a new token each time whitespace is encountered. DL4J provides a tokenizer factory, TokenizerFactory, which creates a tokenizer for a given string. In this recipe, we will tokenize the text data and train the Word2Vec model on it.
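
To make the idea of whitespace tokenization concrete, here is a minimal sketch in plain Java. It is not DL4J's implementation: the class WhitespaceTokenizerSketch and its tokenize method are hypothetical names introduced only for illustration, and the lowercasing and punctuation stripping are assumptions about what a typical preprocessing step might do.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Illustrative only: a hypothetical stand-in for what a tokenizer does,
// not the DL4J TokenizerFactory itself.
public class WhitespaceTokenizerSketch {

    // Splits a sentence on runs of whitespace, then normalizes each piece
    // (assumed preprocessing: lowercase, drop non-letter/non-digit characters).
    public static List<String> tokenize(String sentence) {
        List<String> tokens = new ArrayList<>();
        for (String raw : sentence.trim().split("\\s+")) {
            String token = raw.toLowerCase(Locale.ROOT)
                              .replaceAll("[^\\p{L}\\p{N}]", "");
            if (!token.isEmpty()) {
                tokens.add(token); // skip pieces that were pure punctuation
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        // prints [dl4j, has, a, tokenizer, factory]
        System.out.println(tokenize("DL4J has a tokenizer factory."));
    }
}
```

In DL4J itself, the equivalent step would typically go through a TokenizerFactory implementation such as DefaultTokenizerFactory, whose create(String) method returns a Tokenizer for the given text; the resulting factory is then handed to the Word2Vec builder before training.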

ISBN: 9781788995207