Book description
A problem-solution guide to tackling various NLP tasks using Java open source libraries and cloud-based solutions
Key Features
- Perform simple-to-complex NLP text processing tasks using modern Java libraries
- Extract relationships between different text complexities using a problem-solution approach
- Utilize cloud-based APIs to perform machine translation operations
Book Description
Natural Language Processing (NLP) has become one of the prime technologies for processing very large amounts of unstructured data from disparate information sources. This book includes a wide set of recipes and quick methods that solve challenges in text syntax, semantics, and speech tasks.
At the beginning of the book, you'll learn important NLP techniques, such as identifying parts of speech, tagging words, and analyzing word semantics. You will learn how to perform lexical analysis and use machine learning techniques to speed up NLP operations. With independent recipes, you will explore techniques for customizing your existing NLP engines and models using Java libraries such as OpenNLP and the Stanford NLP library. You will also learn how to use NLP features from cloud-based services, including Google and Amazon Web Services (AWS). You will master core tasks, such as stemming, lemmatization, part-of-speech tagging, and named entity recognition. You will also learn about sentiment analysis, semantic text similarity, language identification, machine translation, and text summarization.
By the end of this book, you will be ready to work as an NLP professional, using a problem-solution approach to analyze text at the document, sentence, and word level.
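To give a flavor of two of the core tasks named above, tokenization and sentence boundary detection, here is a minimal sketch that uses only the JDK's java.text.BreakIterator class. It is not taken from the book; the class name BreakIteratorSketch, the method names, and the sample sentence are illustrative assumptions.

```java
import java.text.BreakIterator;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical illustration class; not code from the book.
public class BreakIteratorSketch {

    // Splits text into sentences using the JDK's rule-based sentence iterator.
    public static List<String> sentences(String text) {
        return segments(BreakIterator.getSentenceInstance(Locale.US), text, false);
    }

    // Splits text into word tokens, dropping whitespace and punctuation segments.
    public static List<String> words(String text) {
        return segments(BreakIterator.getWordInstance(Locale.US), text, true);
    }

    // Walks the boundaries reported by the iterator and collects the trimmed pieces.
    private static List<String> segments(BreakIterator it, String text, boolean tokensOnly) {
        List<String> out = new ArrayList<>();
        it.setText(text);
        int start = it.first();
        for (int end = it.next(); end != BreakIterator.DONE; start = end, end = it.next()) {
            String piece = text.substring(start, end).trim();
            if (piece.isEmpty()) {
                continue; // pure whitespace between tokens
            }
            if (tokensOnly && !Character.isLetterOrDigit(piece.codePointAt(0))) {
                continue; // punctuation-only segment
            }
            out.add(piece);
        }
        return out;
    }

    public static void main(String[] args) {
        String text = "The cat sat on the mat. It purred loudly.";
        System.out.println(sentences(text));
        System.out.println(words(text));
    }
}
```

BreakIterator is rule-based, so it can mis-split text with abbreviations or domain-specific punctuation; trainable models from libraries such as OpenNLP are the usual next step for specialized text.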
What you will learn
- Explore how to use tokenizers in NLP
- Implement NLP techniques in machine learning and deep learning applications
- Identify sentences within the text and learn how to train specialized NER models
- Learn how to classify documents and perform sentiment analysis
- Find semantic similarities between text elements and extract text from a variety of sources
- Preprocess text from a variety of data sources
- Learn how to identify and translate languages
Who this book is for
This book is for data scientists, NLP engineers, and machine learning developers who want to build linguistic applications faster using popular libraries on the JVM. This book will help you build real-world NLP applications using a recipe-based approach. Prior knowledge of Natural Language Processing basics and Java programming is expected.
Table of contents
- Title Page
- Copyright and Credits
- About Packt
- Contributors
- Preface
- Preparing Text for Analysis and Tokenization
  - Technical requirements
  - Tokenization using the Java SDK
  - Tokenization using OpenNLP
  - Tokenization using maximum entropy
  - Training a neural network tokenizer for specialized text
  - Identifying the stem of a word
  - Training an OpenNLP lemmatization model
  - Determining the lexical meaning of a word using OpenNLP
  - Removing stop words using LingPipe
- Isolating Sentences within a Document
  - Technical requirements
  - Finding sentences using the Java core API
  - Performing SBD using the BreakIterator class
  - Using OpenNLP to perform SBD
  - Using the Stanford NLP API to perform SBD
  - Using the LingPipe and chunking to perform SBD
  - Performing SBD on specialized text
  - Training a neural network to perform SBD with specialized text
- Performing Name Entity Recognition
  - Technical requirements
  - Using regular expressions to find entities
  - Using chunks with regular expressions to identify entities
  - Using OpenNLP to find entities in text
  - Isolating multiple entities types
  - Using a CRF model to find entities in a document
  - Using a chunker to find entities
  - Training a specialized NER model
- Detecting POS Using Neural Networks
- Performing Text Classification
  - Technical requirements
  - Training a maximum entropy model for text classification
  - Classifying documents using a maximum entropy model
  - Classifying documents using the Stanford API
  - Training a model to classify text using LingPipe
  - Using LingPipe to classify text
  - Detecting spam
  - Performing sentiment analysis on reviews
- Finding Relationships within Text
- Language Identification and Translation
  - Technical requirements
  - Detecting the natural language in use using LingPipe
  - Discovering supported languages using the Google API
  - Detecting the natural language in use using the Google API
  - Language translation using Google
  - Language detection and translation using Amazon AWS
  - Converting text to speech using the Google Cloud Text-to-Speech API
  - Converting speech to text using the Google Cloud Speech-to-Text API
- Identifying Semantic Similarities within Text
- Common Text Processing and Generation Tasks
- Extracting Data for Use in NLP Analysis
  - Technical requirements
  - Connecting to an HTML page
  - Extracting text and metadata from an HTML page
  - Extracting text from a PDF document
  - Extracting metadata from a PDF document
  - Extracting text from a Word document
  - Extracting metadata from a Word document
  - Extracting text from a spreadsheet
  - Extracting metadata from a spreadsheet
- Creating a Chatbot
- Installation and Configuration
- Other Books You May Enjoy
Product information
- Title: Natural Language Processing with Java Cookbook
- Author(s):
- Release date: April 2019
- Publisher(s): Packt Publishing
- ISBN: 9781789801156