Machine Learning in Java - Second Edition

by AshishSingh Bhatia, Bostjan Kaluza
November 2018
Intermediate to advanced
300 pages
7h 42m
English
Packt Publishing

Importing from file

Another option for loading documents is the cc.mallet.pipe.iterator.CsvIterator class, constructed with CsvIterator(Reader, Pattern, int, int, int). It assumes that all of the documents are in a single file and returns one instance per line, with the fields of each line extracted by a regular expression. The constructor takes the following arguments:

  • Reader: This is the object that specifies how to read from the file
  • Pattern: This is a regular expression that extracts three groups from each line: data, target label, and document name
  • int, int, int: These are the indexes of the data, target, and name groups as they appear in the regular expression

Consider a text document in the following format, specifying the document name, category, and content:

AP881218 local-news A 16-year-old student ...
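
As a minimal sketch of how the regular expression and the three group indexes relate, the following example applies a pattern to the sample line using only java.util.regex. The pattern itself, the class name CsvLineDemo, and the parse helper are assumptions made for illustration and are not taken from the book; the closing comment shows how the same pattern and indexes might be passed to the CsvIterator constructor described above, with fileReader standing in for whatever Reader is in use.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Illustrative only: CsvLineDemo and parse(...) are hypothetical names.
final class CsvLineDemo {

    // Assumed pattern: group 1 = document name, group 2 = label, group 3 = content.
    // \S+ is used rather than \w+ so that labels such as "local-news" match.
    static final Pattern LINE = Pattern.compile("^(\\S+)\\s+(\\S+)\\s+(.*)$");

    // Group indexes in the order CsvIterator expects them: data, target, name.
    static final int DATA_GROUP = 3;
    static final int TARGET_GROUP = 2;
    static final int NAME_GROUP = 1;

    // Returns {data, target, name} for a single line of the input file.
    static String[] parse(String line) {
        Matcher m = LINE.matcher(line);
        if (!m.matches()) {
            throw new IllegalArgumentException("Line does not match pattern: " + line);
        }
        return new String[] {
            m.group(DATA_GROUP), m.group(TARGET_GROUP), m.group(NAME_GROUP)
        };
    }

    public static void main(String[] args) {
        String[] fields = parse("AP881218 local-news A 16-year-old student ...");
        System.out.println("data   = " + fields[0]);
        System.out.println("target = " + fields[1]);
        System.out.println("name   = " + fields[2]);

        // With MALLET on the classpath, the same pattern and indexes could be
        // handed to the constructor named in the text (fileReader is assumed):
        //   new CsvIterator(fileReader, LINE, DATA_GROUP, TARGET_GROUP, NAME_GROUP);
    }
}
```

Keeping the group indexes as named constants makes it harder to swap the data and name positions by accident, since the constructor takes three plain int arguments.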

ISBN: 9781788474399