Natural Language Processing in Action

Book description

Natural Language Processing in Action is your guide to creating machines that understand human language, using Python and its ecosystem of packages dedicated to NLP and AI.



About the Technology

Recent advances in deep learning empower applications to understand text and speech with extreme accuracy. The result? Chatbots that can imitate real people, meaningful resume-to-job matches, superb predictive search, and automatically generated document summaries—all at a low cost. New techniques, along with accessible tools like Keras and TensorFlow, make professional-quality NLP easier than ever before.

About the Book

Natural Language Processing in Action is your guide to building machines that can read and interpret human language. In it, you’ll use readily available Python packages to capture the meaning in text and react accordingly. The book expands traditional NLP approaches to include neural networks, modern deep learning algorithms, and generative techniques as you tackle real-world problems like extracting dates and names, composing text, and answering free-form questions.



What's Inside

  • Some sentences in this book were written by NLP! Can you guess which ones?
  • Working with Keras, TensorFlow, gensim, and scikit-learn (see the brief sketch after this list)
  • Rule-based and data-based NLP
  • Scalable pipelines
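
To give a flavor of the data-based side of NLP with one of the packages listed above, here is a minimal sketch (not taken from the book) that builds TF-IDF vectors with scikit-learn, the topic of chapter 3, and compares short documents with cosine similarity. The example sentences are invented purely for illustration.

    # Minimal sketch, assuming scikit-learn is installed: TF-IDF vectors
    # plus cosine similarity over a few toy sentences (invented here).
    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.metrics.pairwise import cosine_similarity

    docs = [
        "The chatbot answers questions about NLP.",
        "This chatbot answers free-form questions.",
        "Ski slopes are steep in winter.",
    ]

    vectorizer = TfidfVectorizer()          # tokenizes and weights terms by TF-IDF
    tfidf = vectorizer.fit_transform(docs)  # sparse matrix: one row per document

    # Similarity of the first document to the other two; the second chatbot
    # sentence should score higher than the unrelated skiing sentence.
    print(cosine_similarity(tfidf[0], tfidf[1:]))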


About the Reader

This book requires a basic understanding of deep learning and intermediate Python skills.



About the Authors

Hobson Lane, Cole Howard, and Hannes Max Hapke are experienced NLP engineers who use these techniques in production.



Quotes
Learn both the theory and practical skills needed to go beyond merely understanding the inner workings of NLP, and start creating your own algorithms or models.
- From the Foreword by Dr. Arwen Griffioen, Zendesk

Provides a great overview of current NLP tools in Python. I’ll definitely be keeping this book on hand for my own NLP work. Highly recommended!
- Tony Mullen, Northeastern University–Seattle

An intuitive guide to get you started with NLP. The book is full of programming examples that help you learn in a very pragmatic way.
- Tommaso Teofili, Adobe Systems

Publisher resources

View/Submit Errata

Table of contents

  1. Inside front cover
  2. Natural Language Processing in Action
  3. Copyright
  4. Brief Table of Contents
  5. Table of Contents
  6. Front matter
    1. Foreword
    2. Preface
    3. Acknowledgments
    4. Hobson Lane
    5. Hannes Max Hapke
    6. Cole Howard
    7. About this Book
    8. Roadmap
    9. About this book
    10. About the code
    11. liveBook discussion forum
    12. About the Authors
    13. About the cover illustration
  7. Part 1. Wordy machines
  8. 1 Packets of thought (NLP overview)
    1. 1.1 Natural language vs. programming language
    2. 1.2 The magic
      1. 1.2.1 Machines that converse
      2. 1.2.2 The math
    3. 1.3 Practical applications
    4. 1.4 Language through a computer’s “eyes”
      1. 1.4.1 The language of locks
      2. 1.4.2 Regular expressions
      3. 1.4.3 A simple chatbot
      4. 1.4.4 Another way
    5. 1.5 A brief overflight of hyperspace
    6. 1.6 Word order and grammar
    7. 1.7 A chatbot natural language pipeline
    8. 1.8 Processing in depth
    9. 1.9 Natural language IQ
    10. Summary
  9. 2 Build your vocabulary (word tokenization)
    1. 2.1 Challenges (a preview of stemming)
    2. 2.2 Building your vocabulary with a tokenizer
      1. 2.2.1 Dot product
      2. 2.2.2 Measuring bag-of-words overlap
      3. 2.2.3 A token improvement
      4. 2.2.4 Extending your vocabulary with n-grams
      5. 2.2.5 Normalizing your vocabulary
    3. 2.3 Sentiment
      1. 2.3.1 VADER—A rule-based sentiment analyzer
      2. 2.3.2 Naive Bayes
    4. Summary
  10. 3 Math with words (TF-IDF vectors)
    1. 3.1 Bag of words
    2. 3.2 Vectorizing
      1. 3.2.1 Vector spaces
    3. 3.3 Zipf’s Law
    4. 3.4 Topic modeling
      1. 3.4.1 Return of Zipf
      2. 3.4.2 Relevance ranking
      3. 3.4.3 Tools
      4. 3.4.4 Alternatives
      5. 3.4.5 Okapi BM25
      6. 3.4.6 What’s next
    5. Summary
  11. 4 Finding meaning in word counts (semantic analysis)
    1. 4.1 From word counts to topic scores
      1. 4.1.1 TF-IDF vectors and lemmatization
      2. 4.1.2 Topic vectors
      3. 4.1.3 Thought experiment
      4. 4.1.4 An algorithm for scoring topics
      5. 4.1.5 An LDA classifier
    2. 4.2 Latent semantic analysis
      1. 4.2.1 Your thought experiment made real
    3. 4.3 Singular value decomposition
      1. 4.3.1 U—left singular vectors
      2. 4.3.2 S—singular values
      3. 4.3.3 Vᵀ—right singular vectors
      4. 4.3.4 SVD matrix orientation
      5. 4.3.5 Truncating the topics
    4. 4.4 Principal component analysis
      1. 4.4.1 PCA on 3D vectors
      2. 4.4.2 Stop horsing around and get back to NLP
      3. 4.4.3 Using PCA for SMS message semantic analysis
      4. 4.4.4 Using truncated SVD for SMS message semantic analysis
      5. 4.4.5 How well does LSA work for spam classification?
    5. 4.5 Latent Dirichlet allocation (LDiA)
      1. 4.5.1 The LDiA idea
      2. 4.5.2 LDiA topic model for SMS messages
      3. 4.5.3 LDiA + LDA = spam classifier
      4. 4.5.4 A fairer comparison: 32 LDiA topics
    6. 4.6 Distance and similarity
    7. 4.7 Steering with feedback
      1. 4.7.1 Linear discriminant analysis
    8. 4.8 Topic vector power
      1. 4.8.1 Semantic search
      2. 4.8.2 Improvements
    9. Summary
  12. Part 2. Deeper learning (neural networks)
  13. 5 Baby steps with neural networks (perceptrons and backpropagation)
    1. 5.1 Neural networks, the ingredient list
      1. 5.1.1 Perceptron
      2. 5.1.2 A numerical perceptron
      3. 5.1.3 Detour through bias
      4. 5.1.4 Let’s go skiing—the error surface
      5. 5.1.5 Off the chair lift, onto the slope
      6. 5.1.6 Let’s shake things up a bit
      7. 5.1.7 Keras: Neural networks in Python
      8. 5.1.8 Onward and deepward
      9. 5.1.9 Normalization: input with style
    2. Summary
  14. 6 Reasoning with word vectors (Word2vec)
    1. 6.1 Semantic queries and analogies
      1. 6.1.1 Analogy questions
    2. 6.2 Word vectors
      1. 6.2.1 Vector-oriented reasoning
      2. 6.2.2 How to compute Word2vec representations
      3. 6.2.3 How to use the gensim.word2vec module
      4. 6.2.4 How to generate your own word vector representations
      5. 6.2.5 Word2vec vs. GloVe (Global Vectors)
      6. 6.2.6 fastText
      7. 6.2.7 Word2vec vs. LSA
      8. 6.2.8 Visualizing word relationships
      9. 6.2.9 Unnatural words
      10. 6.2.10 Document similarity with Doc2vec
    3. Summary
  15. 7 Getting words in order with convolutional neural networks (CNNs)
    1. 7.1 Learning meaning
    2. 7.2 Toolkit
    3. 7.3 Convolutional neural nets
      1. 7.3.1 Building blocks
      2. 7.3.2 Step size (stride)
      3. 7.3.3 Filter composition
      4. 7.3.4 Padding
      5. 7.3.5 Learning
    4. 7.4 Narrow windows indeed
      1. 7.4.1 Implementation in Keras: prepping the data
      2. 7.4.2 Convolutional neural network architecture
      3. 7.4.3 Pooling
      4. 7.4.4 Dropout
      5. 7.4.5 The cherry on the sundae
      6. 7.4.6 Let’s get to learning (training)
      7. 7.4.7 Using the model in a pipeline
      8. 7.4.8 Where do you go from here?
    5. Summary
  16. 8 Loopy (recurrent) neural networks (RNNs)
    1. 8.1 Remembering with recurrent networks
      1. 8.1.1 Backpropagation through time
      2. 8.1.2 When do we update what?
      3. 8.1.3 Recap
      4. 8.1.4 There’s always a catch
      5. 8.1.5 Recurrent neural net with Keras
    2. 8.2 Putting things together
    3. 8.3 Let’s get to learning our past selves
    4. 8.4 Hyperparameters
    5. 8.5 Predicting
      1. 8.5.1 Statefulness
      2. 8.5.2 Two-way street
      3. 8.5.3 What is this thing?
    6. Summary
  17. 9 Improving retention with long short-term memory networks
    1. 9.1 LSTM
      1. 9.1.1 Backpropagation through time
      2. 9.1.2 Where does the rubber hit the road?
      3. 9.1.3 Dirty data
      4. 9.1.4 Back to the dirty data
      5. 9.1.5 Words are hard. Letters are easier.
      6. 9.1.6 My turn to chat
      7. 9.1.7 My turn to speak more clearly
      8. 9.1.8 Learned how to say, but not yet what
      9. 9.1.9 Other kinds of memory
      10. 9.1.10 Going deeper
    2. Summary
  18. 10 Sequence-to-sequence models and attention
    1. 10.1 Encoder-decoder architecture
      1. 10.1.1 Decoding thought
      2. 10.1.2 Look familiar?
      3. 10.1.3 Sequence-to-sequence conversation
      4. 10.1.4 LSTM review
    2. 10.2 Assembling a sequence-to-sequence pipeline
      1. 10.2.1 Preparing your dataset for the sequence-to-sequence training
      2. 10.2.2 Sequence-to-sequence model in Keras
      3. 10.2.3 Sequence encoder
      4. 10.2.4 Thought decoder
      5. 10.2.5 Assembling the sequence-to-sequence network
    3. 10.3 Training the sequence-to-sequence network
      1. 10.3.1 Generate output sequences
    4. 10.4 Building a chatbot using sequence-to-sequence networks
      1. 10.4.1 Preparing the corpus for your training
      2. 10.4.2 Building your character dictionary
      3. 10.4.3 Generate one-hot encoded training sets
      4. 10.4.4 Train your sequence-to-sequence chatbot
      5. 10.4.5 Assemble the model for sequence generation
      6. 10.4.6 Predicting a sequence
      7. 10.4.7 Generating a response
      8. 10.4.8 Converse with your chatbot
    5. 10.5 Enhancements
      1. 10.5.1 Reduce training complexity with bucketing
      2. 10.5.2 Paying attention
    6. 10.6 In the real world
    7. Summary
  19. Part 3. Getting real (real-world NLP challenges)
  20. 11 Information extraction (named entity extraction and question answering)
    1. 11.1 Named entities and relations
      1. 11.1.1 A knowledge base
      2. 11.1.2 Information extraction
    2. 11.2 Regular patterns
      1. 11.2.1 Regular expressions
      2. 11.2.2 Information extraction as ML feature extraction
    3. 11.3 Information worth extracting
      1. 11.3.1 Extracting GPS locations
      2. 11.3.2 Extracting dates
    4. 11.4 Extracting relationships (relations)
      1. 11.4.1 Part-of-speech (POS) tagging
      2. 11.4.2 Entity name normalization
      3. 11.4.3 Relation normalization and extraction
      4. 11.4.4 Word patterns
      5. 11.4.5 Segmentation
      6. 11.4.6 Why won’t split('.!?') work?
      7. 11.4.7 Sentence segmentation with regular expressions
    5. 11.5 In the real world
    6. Summary
  21. 12 Getting chatty (dialog engines)
    1. 12.1 Language skill
      1. 12.1.1 Modern approaches
      2. 12.1.2 A hybrid approach
    2. 12.2 Pattern-matching approach
      1. 12.2.1 A pattern-matching chatbot with AIML
      2. 12.2.2 A network view of pattern matching
    3. 12.3 Grounding
    4. 12.4 Retrieval (search)
      1. 12.4.1 The context challenge
      2. 12.4.2 Example retrieval-based chatbot
      3. 12.4.3 A search-based chatbot
    5. 12.5 Generative models
      1. 12.5.1 Chat about NLPIA
      2. 12.5.2 Pros and cons of each approach
    6. 12.6 Four-wheel drive
      1. 12.6.1 The Will to succeed
    7. 12.7 Design process
    8. 12.8 Trickery
      1. 12.8.1 Ask questions with predictable answers
      2. 12.8.2 Be entertaining
      3. 12.8.3 When all else fails, search
      4. 12.8.4 Being popular
      5. 12.8.5 Be a connector
      6. 12.8.6 Getting emotional
    9. 12.9 In the real world
    10. Summary
  22. 13 Scaling up (optimization, parallelization, and batch processing)
    1. 13.1 Too much of a good thing (data)
    2. 13.2 Optimizing NLP algorithms
      1. 13.2.1 Indexing
      2. 13.2.2 Advanced indexing
      3. 13.2.3 Advanced indexing with Annoy
      4. 13.2.4 Why use approximate indexes at all?
      5. 13.2.5 An indexing workaround: discretizing
    3. 13.3 Constant RAM algorithms
      1. 13.3.1 Gensim
      2. 13.3.2 Graph computing
    4. 13.4 Parallelizing your NLP computations
      1. 13.4.1 Training NLP models on GPUs
      2. 13.4.2 Renting vs. buying
      3. 13.4.3 GPU rental options
      4. 13.4.4 Tensor processing units
    5. 13.5 Reducing the memory footprint during model training
    6. 13.6 Gaining model insights with TensorBoard
      1. 13.6.1 How to visualize word embeddings
    7. Summary
  23. Appendix A. Your NLP tools
    1. A.1 Anaconda3
    2. A.2 Install NLPIA
    3. A.3 IDE
    4. A.4 Ubuntu package manager
    5. A.5 Mac
      1. A.5.1 A Mac package manager
      2. A.5.2 Some packages
      3. A.5.3 Tuneups
    6. A.6 Windows
      1. A.6.1 Get Virtual
    7. A.7 NLPIA automagic
  24. Appendix B. Playful Python and regular expressions
    1. B.1 Working with strings
      1. B.1.1 String types (str and bytes)
      2. B.1.2 Templates in Python (.format())
    2. B.2 Mapping in Python (dict and OrderedDict)
    3. B.3 Regular expressions
      1. B.3.1 |—OR
      2. B.3.2 ()—Groups
      3. B.3.3 []—Character classes
    4. B.4 Style
    5. B.5 Mastery
  25. Appendix C. Vectors and matrices (linear algebra fundamentals)
    1. C.1 Vectors
      1. C.1.1 Distances
  26. Appendix D. Machine learning tools and techniques
    1. D.1 Data selection and avoiding bias
    2. D.2 How fit is fit?
    3. D.3 Knowing is half the battle
    4. D.4 Cross-fit training
    5. D.5 Holding your model back
      1. D.5.1 Regularization
      2. D.5.2 Dropout
      3. D.5.3 Batch normalization
    6. D.6 Imbalanced training sets
      1. D.6.1 Oversampling
      2. D.6.2 Undersampling
      3. D.6.3 Augmenting your data
    7. D.7 Performance metrics
      1. D.7.1 Measuring classifier performance
      2. D.7.2 Measuring regressor performance
    8. D.8 Pro tips
  27. Appendix E. Setting up your AWS GPU
    1. E.1 Steps to create your AWS GPU instance
      1. E.1.1 Cost control
  28. Appendix F. Locality sensitive hashing
    1. F.1 High-dimensional vectors are different
      1. F.1.1 Vector space indexes and hashes
      2. F.1.2 High-dimensional thinking
    2. F.2 High-dimensional indexing
      1. F.2.1 Locality sensitive hashing
      2. F.2.2 Approximate nearest neighbors
    3. F.3 “Like” prediction
  29. Resources
    1. Applications and project ideas
    2. Courses and tutorials
    3. Tools and packages
    4. Research papers and talks
      1. Vector space models and semantic search
      2. Finance
      3. Question answering systems
      4. Deep learning
      5. LSTMs and RNNs
    5. Competitions and awards
    6. Datasets
    7. Search engines
      1. Search algorithms
      2. Open source search engines
      3. Open source full-text indexers
      4. Manipulative search engines
      5. Less manipulative search engines
      6. Distributed search engines
  30. Glossary
    1. Acronyms
    2. Terms
  31. Index
  32. List of Figures
  33. List of Tables
  34. List of Listings

Product information

  • Title: Natural Language Processing in Action
  • Author(s): Cole Howard, Hobson Lane, Hannes Hapke
  • Release date: April 2019
  • Publisher(s): Manning Publications
  • ISBN: 9781617294631