11 Deep learning for text

This chapter covers

  • Preprocessing text data for machine learning applications
  • Bag-of-words approaches and sequence-modeling approaches for text processing
  • The Transformer architecture
  • Sequence-to-sequence learning

11.1 Natural language processing: The bird’s-eye view

In computer science, we refer to human languages, like English or Mandarin, as “natural” languages, to distinguish them from languages that were designed for machines, like Assembly, LISP, or XML. Every machine language was designed: its starting point was a human engineer writing down a set of formal rules to describe what statements you could make in that language and what they meant. Rules came first, and people started using the language only once ...

