Natural Language Processing with Java Cookbook

Book description

A problem-solution guide to tackling various NLP tasks using Java open source libraries and cloud-based solutions

Key Features

  • Perform simple-to-complex NLP text processing tasks using modern Java libraries
  • Extract relationships between different text complexities using a problem-solution approach
  • Utilize cloud-based APIs to perform machine translation operations

Book Description

Natural Language Processing (NLP) has become one of the prime technologies for processing very large amounts of unstructured data from disparate information sources. This book includes a wide set of recipes and quick methods that solve challenges in text syntax, semantics, and speech tasks.

At the beginning of the book, you'll learn important NLP techniques, such as identifying parts of speech, tagging words, and analyzing word semantics. You will learn how to perform lexical analysis and use machine learning techniques to speed up NLP operations. With independent recipes, you will explore techniques for customizing your existing NLP engines/models using Java libraries such as OpenNLP and the Stanford NLP library. You will also learn how to use NLP features from cloud-based services, including Google and Amazon Web Services (AWS). You will master core tasks, such as stemming, lemmatization, part-of-speech tagging, and named entity recognition. You will also learn about sentiment analysis, semantic text similarity, language identification, machine translation, and text summarization.

By the end of this book, you will be ready to work as an NLP professional, using a problem-solution approach to analyze any sort of text, sentences, or words.

What you will learn

  • Explore how to use tokenizers in NLP
  • Implement NLP techniques in machine learning and deep learning applications
  • Identify sentences within the text and learn how to train specialized NER models
  • Learn how to classify documents and perform sentiment analysis
  • Find semantic similarities between text elements and extract text from a variety of sources
  • Preprocess text from a variety of data sources
  • Learn how to identify and translate languages

Who this book is for

This book is for data scientists, NLP engineers, and machine learning developers who want to build linguistic applications faster using popular libraries on the JVM. This book will help you build real-world NLP applications using a recipe-based approach. Prior knowledge of Natural Language Processing basics and Java programming is expected.

Table of contents

  1. Title Page
  2. Copyright and Credits
    1. Natural Language Processing with Java Cookbook
  3. About Packt
    1. Why subscribe?
    2. Packt.com
  4. Contributors
    1. About the author
    2. About the reviewer
    3. Packt is searching for authors like you
  5. Preface
    1. Who this book is for
    2. What this book covers
    3. To get the most out of this book
      1. Download the example code files
      2. Conventions used
    4. Sections
      1. Getting ready
      2. How to do it…
      3. How it works…
      4. There's more…
      5. See also
    5. Get in touch
      1. Reviews
  6. Preparing Text for Analysis and Tokenization
    1. Technical requirements
    2. Tokenization using the Java SDK
      1. Getting ready
      2. How to do it...
      3. How it works...
    3. Tokenization using OpenNLP
      1. Getting ready
      2. How to do it...
      3. How it works...
      4. See also
    4. Tokenization using maximum entropy
      1. Getting ready
      2. How to do it...
      3. How it works...
      4. See also
    5. Training a neural network tokenizer for specialized text
      1. Getting ready
      2. How to do it...
      3. How it works...
      4. There's more...
      5. See also
    6. Identifying the stem of a word
      1. Getting ready
      2. How to do it...
      3. How it works...
      4. See also
    7. Training an OpenNLP lemmatization model
      1. Getting ready
      2. How to do it...
      3. How it works...
      4. There's more...
      5. See also
    8. Determining the lexical meaning of a word using OpenNLP
      1. Getting ready
      2. How to do it...
      3. How it works...
      4. See also
    9. Removing stop words using LingPipe
      1. Getting ready
      2. How to do it...
      3. How it works...
      4. See also
  7. Isolating Sentences within a Document
    1. Technical requirements
    2. Finding sentences using the Java core API
      1. Getting ready
      2. How to do it...
      3. How it works...
      4. See also
    3. Performing SBD using the BreakIterator class
      1. Getting ready
      2. How to do it...
      3. How it works...
      4. There's more...
      5. See also
    4. Using OpenNLP to perform SBD
      1. Getting ready
      2. How to do it...
      3. How it works...
      4. There's more...
      5. See also
    5. Using the Stanford NLP API to perform SBD
      1. Getting ready
      2. How to do it...
      3. How it works...
      4. There's more...
      5. See also
    6. Using the LingPipe and chunking to perform SBD
      1. Getting ready
      2. How to do it...
      3. How it works...
      4. There's more...
      5. See also
    7. Performing SBD on specialized text
      1. Getting ready
      2. How to do it...
      3. How it works...
      4. See also
    8. Training a neural network to perform SBD with specialized text
      1. Getting ready
      2. How to do it...
      3. How it works...
      4. See also
  8. Performing Name Entity Recognition
    1. Technical requirements
    2. Using regular expressions to find entities
      1. Getting ready
      2. How to do it...
      3. How it works...
      4. There's more...
      5. See also
    3. Using chunks with regular expressions to identify entities
      1. Getting ready
      2. How to do it...
      3. How it works...
      4. There's more...
      5. See also
    4. Using OpenNLP to find entities in text
      1. Getting ready
      2. How to do it...
      3. How it works...
      4. There's more...
      5. See also
    5. Isolating multiple entities types
      1. Getting ready
      2. How to do it...
      3. How it works...
      4. See also
    6. Using a CRF model to find entities in a document
      1. Getting ready
      2. How to do it...
      3. How it works...
      4. There's more...
      5. See also
    7. Using a chunker to find entities
      1. Getting ready
      2. How to do it...
      3. How it works...
      4. See also
    8. Training a specialized NER model
      1. Getting ready
      2. How to do it...
      3. How it works...
      4. See also
  9. Detecting POS Using Neural Networks
    1. Technical requirements
    2. Finding POS using tagging
      1. Getting ready
      2. How to do it...
      3. How it works...
      4. There's more...
      5. See also
    3. Using a chunker to find POS
      1. Getting ready
      2. How to do it...
      3. How it works...
      4. There's more...
      5. See also
    4. Using a tag dictionary
      1. Getting ready
      2. How to do it...
      3. How it works...
      4. There's more...
      5. See also
    5. Finding POS using the Penn Treebank
      1. Getting ready
      2. How to do it...
      3. How it works...
      4. There's more...
      5. See also
    6. Finding POS from textese
      1. Getting ready
      2. How to do it...
      3. How it works...
      4. There's more...
      5. See also
    7. Using a pipeline to perform tagging
      1. Getting ready
      2. How to do it...
      3. How it works...
      4. See also
    8. Using a hidden Markov model to perform POS
      1. Getting ready
      2. How to do it...
      3. How it works...
      4. There's more...
      5. See also
    9. Training a specialized POS model
      1. Getting ready
      2. How to do it...
      3. How it works...
      4. See also
  10. Performing Text Classification
    1. Technical requirements
    2. Training a maximum entropy model for text classification
      1. Getting ready
      2. How to do it...
      3. How it works...
      4. See also
    3. Classifying documents using a maximum entropy model
      1. Getting ready
      2. How to do it...
      3. How it works...
      4. There's more...
      5. See also
    4. Classifying documents using the Stanford API
      1. Getting ready
      2. How to do it...
      3. How it works...
      4. There's more...
      5. See also
    5. Training a model to classify text using LingPipe
      1. Getting ready
      2. How to do it...
      3. How it works...
      4. See also
    6. Using LingPipe to classify text
      1. Getting ready
      2. How to do it...
      3. How it works...
      4. There's more...
      5. See also
    7. Detecting spam
      1. Getting ready
      2. How to do it...
      3. How it works...
      4. There's more...
      5. See also
    8. Performing sentiment analysis on reviews
      1. Getting ready
      2. How to do it...
      3. How it works...
      4. There's more...
      5. See also
  11. Finding Relationships within Text
    1. Technical requirements
    2. Displaying parse trees graphically
      1. Getting ready
      2. How to do it...
      3. How it works...
      4. There's more...
      5. See also
    3. Using probabilistic context-free grammar to parse text
      1. Getting ready
      2. How to do it...
      3. How it works...
      4. There's more...
      5. See also
    4. Using OpenNLP to generate a parse tree
      1. Getting ready
      2. How to do it...
      3. How it works...
      4. There's more...
      5. See also
    5. Using the Google NLP API to parse text
      1. Getting ready
      2. How to do it...
      3. How it works...
      4. There's more...
      5. See also
    6. Identifying parent-child relationships in text
      1. Getting ready
      2. How to do it...
      3. How it works...
      4. There's more...
      5. See also
    7. Finding co-references in a sentence
      1. Getting ready
      2. How to do it...
      3. How it works...
      4. There's more...
      5. See also
  12. Language Identification and Translation
    1. Technical requirements
    2. Detecting the natural language in use using LingPipe
      1. Getting ready
      2. How to do it…
      3. How it works…
      4. There's more…
      5. See also
    3. Discovering supported languages using the Google API
      1. Getting ready
      2. How to do it…
      3. How it works…
      4. See also
    4. Detecting the natural language in use using the Google API
      1. Getting ready
      2. How to do it…
      3. How it works…
      4. There's more…
      5. See also
    5. Language translation using Google
      1. Getting ready
      2. How to do it…
      3. How it works…
      4. There's more…
      5. See also
    6. Language detection and translation using Amazon AWS
      1. Getting ready
      2. How to do it…
      3. How it works…
      4. There's more…
      5. See also
    7. Converting text to speech using the Google Cloud Text-to-Speech API
      1. Getting ready
      2. How to do it…
      3. How it works…
      4. See also
    8. Converting speech to text using the Google Cloud Speech-to-Text API
      1. Getting ready
      2. How to do it…
      3. How it works…
      4. There's more…
      5. See also
  13. Identifying Semantic Similarities within Text
    1. Technical requirements
    2. Finding the cosine similarity of the text
      1. Getting ready
      2. How to do it...
      3. How it works...
      4. There's more...
      5. See also
    3. Finding the distance between text
      1. Getting ready
      2. How to do it...
      3. How it works...
      4. See also
    4. Finding differences between plaintext instances
      1. Getting ready
      2. How to do it...
      3. How it works...
      4. There's more...
      5. See also
    5. Finding hyponyms and antonyms
      1. Getting ready
      2. How to do it...
      3. How it works...
      4. There's more...
      5. See also
  14. Common Text Processing and Generation Tasks
    1. Technical requirements
    2. Generating random numbers
      1. Getting ready
      2. How to do it…
      3. How it works…
      4. There's more…
      5. See also
    3. Spell-checking using the LanguageTool API
      1. Getting ready
      2. How to do it…
      3. How it works…
      4. See also
    4. Checking grammar using the LanguageTool API
      1. Getting ready
      2. How to do it…
      3. How it works…
      4. See also
    5. Summarizing text in a document
      1. Getting ready
      2. How to do it…
      3. How it works…
      4. There's more...
      5. See also
    6. Creating, inverting, and using dictionaries
      1. Getting ready
      2. How to do it…
      3. How it works…
      4. There's more…
      5. See also
  15. Extracting Data for Use in NLP Analysis
    1. Technical requirements
    2. Connecting to an HTML page
      1. Getting ready
      2. How to do it…
      3. How it works…
      4. There's more…
      5. See also
    3. Extracting text and metadata from an HTML page
      1. Getting ready
      2. How to do it…
      3. How it works…
      4. There's more…
      5. See also
    4. Extracting text from a PDF document
      1. Getting ready
      2. How to do it…
      3. How it works…
      4. There's more…
      5. See also
    5. Extracting metadata from a PDF document
      1. Getting ready
      2. How to do it…
      3. How it works…
      4. There's more…
      5. See also
    6. Extracting text from a Word document
      1. Getting ready
      2. How to do it…
      3. How it works…
      4. There's more…
      5. See also
    7. Extracting metadata from a Word document
      1. Getting ready
      2. How to do it…
      3. How it works…
      4. There's more…
      5. See also
    8. Extracting text from a spreadsheet
      1. Getting ready
      2. How to do it…
      3. How it works…
      4. There's more…
      5. See also
    9. Extracting metadata from a spreadsheet
      1. Getting ready
      2. How to do it…
      3. How it works…
      4. See also
  16. Creating a Chatbot
    1. Technical requirements
    2. Creating a simple chatbot using AWS
      1. Getting ready
      2. How to do it…
      3. How it works…
      4. See also
    3. Creating a bot using AWS Toolkit for Eclipse
      1. Getting ready
      2. How to do it…
      3. How it works…
      4. See also
    4. Creating a Lambda function
      1. Getting ready
      2. How to do it…
      3. How it works…
      4. See also
    5. Uploading the Lambda function
      1. Getting ready
      2. How to do it…
      3. How it works…
      4. See also
    6. Executing a Lambda function from Eclipse
      1. Getting ready
      2. How to do it…
      3. How it works…
      4. See also
  17. Installation and Configuration
    1. Technical requirements
    2. Getting ready to use the Google Cloud Platform
      1. Getting ready
      2. How to do it…
      3. How it works…
      4. See also
    3. Configuring Eclipse to use the Google Cloud Platform
      1. Getting ready
      2. How to do it…
      3. How it works…
      4. See also
    4. Getting ready to use Amazon Web Services
      1. Getting ready
      2. How to do it…
      3. How it works…
      4. See also
    5. Configuring Eclipse to use Amazon Web Services
      1. Getting ready
      2. How to do it…
      3. How it works…
      4. See also
  18. Other Books You May Enjoy
    1. Leave a review - let other readers know what you think

Product information

  • Title: Natural Language Processing with Java Cookbook
  • Author(s): Richard M. Reese
  • Release date: April 2019
  • Publisher(s): Packt Publishing
  • ISBN: 9781789801156