April 2019
Beginner to intermediate
386 pages
11h 20m
English
Once the input stream was created for the training file, an object stream of DocumentSample instances was created. It used an instance of the PlainTextByLineStream class, which read in the data from the file where each line was converted into a string:
ObjectStream<String> objectStream = new PlainTextByLineStream(dataInputStream, StandardCharsets.UTF_8);ObjectStream<DocumentSample> documentSampleStream = new DocumentSampleStream(objectStream);
The train method performed the actual training. Its first argument used an abbreviation for the language used, which is English. The second argument was the DocumentSample stream:
DoccatModel documentCategorizationModel = DocumentCategorizerME.train("en", documentSampleStream);
An instance ...
Read now
Unlock full access