Natural Language Processing with TensorFlow

Name: Natural Language Processing with TensorFlow
Author: Thushan Ganegedara
ISBN: 9781788478311

by Thushan Ganegedara

May 2018

Intermediate to advanced

472 pages

11h 27m

English

Packt Publishing


Natural Language Processing with TensorFlow
Table of Contents
Natural Language Processing with TensorFlow
Why subscribe?
PacktPub.com
Contributors
About the author
About the reviewers
Packt is searching for authors like you
Preface
Who this book is for
What this book covers
To get the most out of this book
Download the example code files
Download the color images
Conventions used
Get in touch
Reviews
1. Introduction to Natural Language Processing
What is Natural Language Processing?
Tasks of Natural Language Processing
The traditional approach to Natural Language Processing
Understanding the traditional approach
Example – generating football game summaries
Drawbacks of the traditional approach
The deep learning approach to Natural Language Processing
History of deep learning
The current state of deep learning and NLP
Understanding a simple deep model – a Fully-Connected Neural Network
The roadmap – beyond this chapter
Introduction to the technical tools
Description of the tools
Installing Python and scikit-learn
Installing Jupyter Notebook
Installing TensorFlow
Summary
2. Understanding TensorFlow
What is TensorFlow?
Getting started with TensorFlow
TensorFlow client in detail
TensorFlow architecture – what happens when you execute the client?
Cafe Le TensorFlow – understanding TensorFlow with an analogy
Inputs, variables, outputs, and operations
Defining inputs in TensorFlow
Feeding data with Python code
Preloading and storing data as tensors
Building an input pipeline
Defining variables in TensorFlow
Defining TensorFlow outputs
Defining TensorFlow operations
Comparison operations
Mathematical operations
Scatter and gather operations
Neural network-related operations
Nonlinear activations used by neural networks
The convolution operation
The pooling operation
Defining loss
Optimization of neural networks
The control flow operations
Reusing variables with scoping
Implementing our first neural network
Preparing the data
Defining the TensorFlow graph
Running the neural network
Summary
3. Word2vec – Learning Word Embeddings
What is a word representation or meaning?
Classical approaches to learning word representation
WordNet – using an external lexical knowledge base for learning word representations
Tour of WordNet
Problems with WordNet
One-hot encoded representation
The TF-IDF method
Co-occurrence matrix
Word2vec – a neural network-based approach to learning word representation
Exercise: is queen = king – he + she?
Designing a loss function for learning word embeddings
The skip-gram algorithm
From raw text to structured data
Learning the word embeddings with a neural network
Formulating a practical loss function
Efficiently approximating the loss function
Negative sampling of the softmax layer
Hierarchical softmax
Learning the hierarchy
Optimizing the learning model
Implementing skip-gram with TensorFlow
The Continuous Bag-of-Words algorithm
Implementing CBOW in TensorFlow
Summary
4. Advanced Word2vec
The original skip-gram algorithm
Implementing the original skip-gram algorithm
Comparing the original skip-gram with the improved skip-gram
Comparing skip-gram with CBOW
Performance comparison
Which is the winner, skip-gram or CBOW?
Extensions to the word embeddings algorithms
Using the unigram distribution for negative sampling
Implementing unigram-based negative sampling
Subsampling – probabilistically ignoring the common words
Implementing subsampling
Comparing the CBOW and its extensions
More recent algorithms extending skip-gram and CBOW
A limitation of the skip-gram algorithm
The structured skip-gram algorithm
The loss function
The continuous window model
GloVe – Global Vectors representation
Understanding GloVe
Implementing GloVe
Document classification with Word2vec
Dataset
Classifying documents with word embeddings
Implementation – learning word embeddings
Implementation – word embeddings to document embeddings
Document clustering and t-SNE visualization of embedded documents
Inspecting several outliers
Implementation – clustering/classification of documents with K-means
Summary
5. Sentence Classification with Convolutional Neural Networks
Introducing Convolution Neural Networks
CNN fundamentals
The power of Convolution Neural Networks
Understanding Convolution Neural Networks
Convolution operation
Standard convolution operation
Convolving with stride
Convolving with padding
Transposed convolution
Pooling operation
Max pooling
Max pooling with stride
Average pooling
Fully connected layers
Putting everything together
Exercise – image classification on MNIST with CNN
About the data
Implementing the CNN
Analyzing the predictions produced with a CNN
Using CNNs for sentence classification
CNN structure
Data transformation
The convolution operation
Pooling over time
Implementation – sentence classification with CNNs
Summary
6. Recurrent Neural Networks
Understanding Recurrent Neural Networks
The problem with feed-forward neural networks
Modeling with Recurrent Neural Networks
Technical description of a Recurrent Neural Network
Backpropagation Through Time
How backpropagation works
Why we cannot use BP directly for RNNs
Backpropagation Through Time – training RNNs
Truncated BPTT – training RNNs efficiently
Limitations of BPTT – vanishing and exploding gradients
Applications of RNNs
One-to-one RNNs
One-to-many RNNs
Many-to-one RNNs
Many-to-many RNNs
Generating text with RNNs
Defining hyperparameters
Unrolling the inputs over time for Truncated BPTT
Defining the validation dataset
Defining weights and biases
Defining state persisting variables
Calculating the hidden states and outputs with unrolled inputs
Calculating the loss
Resetting state at the beginning of a new segment of text
Calculating validation output
Calculating gradients and optimizing
Outputting a freshly generated chunk of text
Evaluating text results output from the RNN
Perplexity – measuring the quality of the text result
Recurrent Neural Networks with Context Features – RNNs with longer memory
Technical description of the RNN-CF
Implementing the RNN-CF
Defining the RNN-CF hyperparameters
Defining input and output placeholders
Defining weights of the RNN-CF
Variables and operations for maintaining hidden and context states
Calculating output
Calculating the loss
Calculating validation output
Computing test output
Computing the gradients and optimizing
Text generated with the RNN-CF
Summary
7. Long Short-Term Memory Networks
Understanding Long Short-Term Memory Networks
What is an LSTM?
LSTMs in more detail
How LSTMs differ from standard RNNs
How LSTMs solve the vanishing gradient problem
Improving LSTMs
Greedy sampling
Beam search
Using word vectors
Bidirectional LSTMs (BiLSTM)
Other variants of LSTMs
Peephole connections
Gated Recurrent Units
Summary
8. Applications of LSTM – Generating Text
Our data
About the dataset
Preprocessing data
Implementing an LSTM
Defining hyperparameters
Defining parameters
Defining an LSTM cell and its operations
Defining inputs and labels
Defining sequential calculations required to process sequential data
Defining the optimizer
Decaying learning rate over time
Making predictions
Calculating perplexity (loss)
Resetting states
Greedy sampling to break unimodality
Generating new text
Example generated text
Comparing LSTMs to LSTMs with peephole connections and GRUs
Standard LSTM
Review
Example generated text
Gated Recurrent Units (GRUs)
Review
The code
Example generated text
LSTMs with peepholes
Review
The code
Example generated text
Training and validation perplexities over time
Improving LSTMs – beam search
Implementing beam search
Examples generated with beam search
Improving LSTMs – generating text with words instead of n-grams
The curse of dimensionality
Word2vec to the rescue
Generating text with Word2vec
Examples generated with LSTM-Word2vec and beam search
Perplexity over time
Using the TensorFlow RNN API
Summary
9. Applications of LSTM – Image Caption Generation
Getting to know the data
ILSVRC ImageNet dataset
The MS-COCO dataset
The machine learning pipeline for image caption generation
Extracting image features with CNNs
Implementation – loading weights and inferencing with VGG-
Building and updating variables
Preprocessing inputs
Inferring VGG-16
Extracting vectorized representations of images
Predicting class probabilities with VGG-16
Learning word embeddings
Preparing captions for feeding into LSTMs
Generating data for LSTMs
Defining the LSTM
Evaluating the results quantitatively
BLEU
ROUGE
METEOR
CIDEr
BLEU-4 over time for our model
Captions generated for test images
Using TensorFlow RNN API with pretrained GloVe word vectors
Loading GloVe word vectors
Cleaning data
Using pretrained embeddings with TensorFlow RNN API
Defining the pretrained embedding layer and the adaptation layer
Defining the LSTM cell and softmax layer
Defining inputs and outputs
Processing images and text differently
Defining the LSTM output calculation
Defining the logits and predictions
Defining the sequence loss
Defining the optimizer
Summary
10. Sequence-to-Sequence Learning – Neural Machine Translation
Machine translation
A brief historical tour of machine translation
Rule-based translation
Statistical Machine Translation (SMT)
Neural Machine Translation (NMT)
Understanding Neural Machine Translation
Intuition behind NMT
NMT architecture
The embedding layer
The encoder
The context vector
The decoder
Preparing data for the NMT system
At training time
Reversing the source sentence
At testing time
Training the NMT
Inference with NMT
The BLEU score – evaluating the machine translation systems
Modified precision
Brevity penalty
The final BLEU score
Implementing an NMT from scratch – a German to English translator
Introduction to data
Preprocessing data
Learning word embeddings
Defining the encoder and the decoder
Defining the end-to-end output calculation
Some translation results
Training an NMT jointly with word embeddings
Maximizing matchings between the dataset vocabulary and the pretrained embeddings
Defining the embeddings layer as a TensorFlow variable
Improving NMTs
Teacher forcing
Deep LSTMs
Attention
Breaking the context vector bottleneck
The attention mechanism in detail
Implementing the attention mechanism
Defining weights
Computing attention
Some translation results – NMT with attention
Visualizing attention for source and target sentences
Other applications of Seq2Seq models – chatbots
Training a chatbot
Evaluating chatbots – Turing test
Summary
11. Current Trends and the Future of Natural Language Processing
Current trends in NLP
Word embeddings
Region embedding
Input representation
Learning region embeddings
Implementation – region embeddings
Classification accuracy
Probabilistic word embedding
Ensemble embedding
Topic embedding
Neural Machine Translation (NMT)
Improving the attention mechanism
Hybrid MT models
Penetration into other research fields
Combining NLP with computer vision
Visual Question Answering (VQA)
Caption generation for images with attention
Reinforcement learning
Teaching agents to communicate using their own language
Dialogue agents with reinforcement learning
Generative Adversarial Networks for NLP
Towards Artificial General Intelligence
One Model to Learn Them All
A joint many-task model – growing a neural network for multiple NLP tasks
First level – word-based tasks
Second level – syntactic tasks
Third level – semantic-level tasks
NLP for social media
Detecting rumors in social media
Detecting emotions in social media
Analyzing political framing in tweets
New tasks emerging
Detecting sarcasm
Language grounding
Skimming text with LSTMs
Newer machine learning models
Phased LSTM
Dilated Recurrent Neural Networks (DRNNs)
Summary
References
A. Mathematical Foundations and Advanced TensorFlow
Basic data structures
Scalar
Vectors
Matrices
Indexing of a matrix
Special types of matrices
Identity matrix
Diagonal matrix
Tensors
Tensor/matrix operations
Transpose
Multiplication
Element-wise multiplication
Inverse
Finding the matrix inverse – Singular Value Decomposition (SVD)
Norms
Determinant
Probability
Random variables
Discrete random variables
Continuous random variables
The probability mass/density function
Conditional probability
Joint probability
Marginal probability
Bayes' rule
Introduction to Keras
Introduction to the TensorFlow seq2seq library
Defining embeddings for the encoder and decoder
Defining the encoder
Defining the decoder
Visualizing word embeddings with TensorBoard
Starting TensorBoard
Saving word embeddings and visualizing via TensorBoard
Summary
Index

Overview

Natural Language Processing with TensorFlow walks you through the principles of natural language processing (NLP) using TensorFlow, a widely used deep learning framework. With this book, you will learn how to manage, analyze, and process text data and build advanced models that perform translation, summarization, classification, and more.

What this Book will help me do

Understand the fundamentals of natural language processing and tensor-based algorithms.
Learn to implement effective word embeddings using Word2Vec and similar techniques.
Master the application of CNNs and RNNs for NLP tasks like sentence classification and language modeling.
Develop skills for creating high-performance LSTMs for text generation and other advanced tasks.
Apply advanced TensorFlow features to real-world NLP tasks, including neural machine translation.

Author(s)

Saad and Thushan Ganegedara bring years of experience in deep learning and natural language processing to their writing. They strive to explain complex technologies with clarity and applicability, ensuring readers not only understand concepts but can apply them in practice. Their approach combines deep technical insight with examples reflecting real-world challenges.

Who is it for?

This book is designed for software developers and data scientists who are familiar with Python and wish to delve into NLP with TensorFlow. It aims to bridge theoretical concepts and practical applications, making it well suited to readers who have a basic understanding of machine learning and want to expand into NLP. Beginners in TensorFlow are welcome, as the book introduces the key aspects of the library that are essential for NLP workflows.


