O'Reilly Live Online Training

Learning Regular Expressions

Unlock the power of text processing using grep, JavaScript, and Python

James Lee

Regular expressions are patterns that are used to find text, which can then be manipulated programmatically. They’ve been around for decades, and nearly every programming language makes use of them since they’re so handy for searching and manipulating text. For example, say you need to find and replace a specific text string across thousands of files or search a single file for that one specific string, but you can't remember if it was written in uppercase or lowercase (or both). “Regex” gives you the flexibility and power to do these kinds of complex, nuanced search-and-replaces. But regex has a reputation for being hard to learn.

Expert James Lee walks you through the seemingly arcane syntax of regular expressions, illustrated through short applied examples. You’ll discover the features of regular expressions (and how to use them), using grep. You’ll also get a taste of how regex works in common programming languages like Python and JavaScript, so you can apply what you learn in your daily programming and operations tasks.

What you'll learn and how you can apply it

By the end of this live online course, you’ll understand:

  • What regular expressions are and the value they provide
  • Regular expression syntax and rules
  • How to read and write regular expressions

And you’ll be able to:

  • Use Unix tools (grep, egrep, sed, awk) that use regular expressions
  • Read and write regular expressions
  • Test and validate regular expressions
  • Use regular expressions in programming languages such as JavaScript and Python

This training course is for you because...

  • You’re a programmer, engineer, or other professional who is comfortable working in a command-line shell environment (no programming experience required).
  • You need tools to locate, parse, and replace text.
  • You want to be able to search and replace text to solve day-to-day problems.

Prerequisites

  • A working knowledge of the Linux command line (cd, ls, cat) and a text editor (either terminal or GUI based—e.g., vi, emacs, or nano)

Recommended preparation:

  • A machine with Python and Node.js (for running JavaScript) installed (useful but not required)

About your instructor

  • In the early 1990s, James Lee installed Red Hat on an unused piece of hardware he found in the closet and hasn't looked back since. James uses Linux both personally and professionally and is particularly happy that, over his career in technology, he's never had to use Windows. He’s worked with many Linux distributions, including Red Hat, CentOS, Scientific Linux, Debian, and Ubuntu, and recently booted Raspbian on a Raspberry Pi 3. Nowadays, he does most of his development work on a MacBook Pro but spends more time in Darwin than macOS—often with multiple active SSH sessions to various Linux servers. James is also an open source advocate and instructor; he’s delivered countless training courses on open source products such as Linux, Perl, and Python.

Schedule

The timeframes are estimates and may vary depending on how the class progresses.

Introduction to regular expressions and grep/egrep (20 minutes)

  • Lecture: What regular expressions are and why they are important; grep, egrep, and the difference between them; sed and awk; examples using grep (case-insensitive matching, reading patterns from a file, displaying lines that do not match); online tools
  • Q&A
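
To give a feel for the grep behaviors listed above, here is a rough Python sketch that imitates them. The `grep` helper and the sample lines are hypothetical, written only for illustration. The `ignore_case` argument plays the role of `grep -i`, `invert` plays the role of `grep -v`, and passing several patterns loosely mirrors `grep -f FILE`.

```python
import re

def grep(patterns, lines, ignore_case=False, invert=False):
    """Hypothetical helper: return lines matching any pattern, roughly like grep."""
    flags = re.IGNORECASE if ignore_case else 0
    compiled = [re.compile(p, flags) for p in patterns]
    result = []
    for line in lines:
        matched = any(c.search(line) for c in compiled)
        # With invert=True, keep only the lines that did NOT match.
        if matched != invert:
            result.append(line)
    return result

# Made-up sample input.
lines = ["Error: disk full", "all good", "ERROR: timeout", "warning: slow"]

errors = grep(["error"], lines, ignore_case=True)                  # like grep -i
not_errors = grep(["error"], lines, ignore_case=True, invert=True) # like grep -i -v
either = grep(["error", "warning"], lines, ignore_case=True)       # several patterns
```

Without `ignore_case=True`, the pattern `error` would match none of the sample lines, because each one spells the word with at least one capital letter.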

Basic syntax (25 minutes)

  • Lecture: Simple regular expressions—normal characters and “.”; bounding at beginning and end of the line—“^” and “$”; matching special characters with the backslash; regular expression rule #1—the match that begins earliest wins
  • Hands-on exercise: Work with basic syntax
  • Q&A
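
The basic syntax topics above can be sketched in a few lines of Python using the standard `re` module. The sample strings are made up for illustration, and the course itself may use different examples (and grep rather than Python).

```python
import re

samples = ["cat", "concatenate", "3.14", "3x14"]

# "." matches any single character (except a newline by default).
dot_matches = [s for s in samples if re.search(r"c.t", s)]

# "^" and "$" anchor the pattern to the start and end of the string.
anchored = [s for s in samples if re.search(r"^cat$", s)]

# A backslash makes a special character literal: "\." matches only a dot.
literal_dot = [s for s in samples if re.search(r"3\.14", s)]
any_char = [s for s in samples if re.search(r"3.14", s)]

# Rule #1: the match that begins earliest wins. "cat" is listed first in
# the pattern, but "con" starts earlier in the text, so "con" is found.
first = re.search(r"cat|con", "concatenate")
first_text = first.group()
first_start = first.start()
```

Here `any_char` also picks up "3x14", which is why escaping the dot matters when a literal period is intended.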

Character classes (25 minutes)

  • Lecture: Character class syntax (in a class and not in a class); predefined character classes (\d, \w, etc.); POSIX classes ([[:alpha:]], etc.)
  • Hands-on exercise: Work with character classes
  • Q&A
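
As a small illustration of character classes, the following Python sketch uses made-up input strings. Python's `re` module does not support POSIX bracket classes such as `[[:alpha:]]` (tools like grep do), so the last line uses an explicit ASCII range as a stand-in.

```python
import re

# A class matches one character from the set inside the brackets.
vowels = re.findall(r"[aeiou]", "regex")

# A leading "^" inside the brackets negates the class.
non_digits = re.findall(r"[^0-9]+", "a1b22c")

# Predefined classes: \d is a digit, \w is a "word" character
# (letters, digits, and underscore).
digits = re.findall(r"\d+", "room 101, floor 3")
words = re.findall(r"\w+", "unit_7B is ready")

# Stand-in for the POSIX class [[:alpha:]] (ASCII letters only).
letters = re.findall(r"[A-Za-z]+", "abc123def")
```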

Break (10 minutes)

Quantifiers, boundaries, and OR-ing (30 minutes)

  • Lecture: Quantifier syntax—specifying how many of something; regular expression rule #2—quantifiers are greedy; boundary syntax—\b, etc.; OR syntax
  • Hands-on exercise: Work with quantifiers, boundaries, and OR-ing
  • Q&A
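
A minimal Python sketch of the ideas in this module, using invented sample text. It shows a counted quantifier, greedy matching (rule #2), the `\b` word boundary, and alternation with `|`.

```python
import re

# {2,4} means "between two and four of the preceding item".
# Greedy matching takes as many digits as allowed, so "12345" yields "1234".
counted = re.findall(r"\d{2,4}", "7 42 12345")

# Rule #2: quantifiers are greedy. ".*" runs to the LAST ">" it can reach.
greedy = re.search(r"<.*>", "<b>bold</b>").group()

# \b matches at a word boundary, so "cat" inside "concat" or "cats" is skipped.
whole_words = re.findall(r"\bcat\b", "cat concat cats cat.")

# "|" means OR: match either alternative.
tools = re.findall(r"sed|awk", "sed and awk, not grep")
```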

Capturing and replacing (25 minutes)

  • Lecture: Capturing syntax (text extraction)—() and \1, etc.; replacing text; turning off capture—(?:)
  • Hands-on exercise: Work with capturing and replacing
  • Q&A
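
Capturing and replacing might look like the following in Python. This is a sketch with made-up input; in Python, `\1` refers to the first captured group both inside a pattern and in the replacement string passed to `re.sub`.

```python
import re

# Parentheses capture the text matched by each part of the pattern.
m = re.search(r"(\d{4})-(\d{2})-(\d{2})", "released on 2024-05-01")
year, month, day = m.groups()

# \1 inside the pattern refers back to what group 1 matched,
# which makes it possible to find a doubled word.
doubled = re.search(r"\b(\w+) \1\b", "it is is fine").group(1)

# In the replacement, \1 and \2 insert the captured text.
swapped = re.sub(r"(\w+), (\w+)", r"\2 \1", "Lee, James")

# (?:...) groups without capturing, so only the name is returned.
names = re.findall(r"(?:Mr|Ms)\. (\w+)", "Mr. Lee and Ms. Smith")
```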

Break (10 minutes)

Lazy quantifiers (20 minutes)

  • Lecture: Lazy quantifier syntax
  • Hands-on exercise: Work with lazy quantifiers
  • Q&A
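
The difference between greedy and lazy quantifiers is easiest to see side by side. The sketch below uses an invented HTML fragment; adding `?` after a quantifier makes it match as little as possible.

```python
import re

html = "<b>one</b> and <b>two</b>"

# Greedy: ".*" grabs as much as it can, so both tags end up in one match.
greedy = re.findall(r"<b>.*</b>", html)

# Lazy: ".*?" stops at the first "</b>" it can, giving one match per tag.
lazy = re.findall(r"<b>.*?</b>", html)

# Combined with a capture group, the lazy form extracts just the contents.
contents = re.findall(r"<b>(.*?)</b>", html)
```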

Inline modifiers and lookarounds (25 minutes)

  • Lecture: Inline modifier syntax—(?i), etc.; lookarounds—(?=), etc.
  • Hands-on exercise: Work with inline modifiers and lookarounds
  • Q&A
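
Here is a short Python sketch of an inline modifier and the lookaround forms, using made-up text. Lookarounds check what comes before or after a position without including it in the match.

```python
import re

# (?i) at the start of the pattern turns on case-insensitive matching.
any_case = re.findall(r"(?i)regex", "Regex, REGEX, and regex")

# Lookahead (?=...): digits only if " USD" follows; " USD" is not consumed.
usd_amounts = re.findall(r"\d+(?= USD)", "10 USD, 20 EUR, 30 USD")

# Lookbehind (?<=...): digits only if "$" comes right before them.
dollar_values = re.findall(r"(?<=\$)\d+", "cost $15, qty 3")

# Negative lookbehind (?<!...): whole numbers NOT preceded by "$".
plain_numbers = re.findall(r"(?<!\$)\b\d+", "cost $15, qty 3")
```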

Break (10 minutes)

Practical and efficient regular expressions (30 minutes)

  • Lecture: Practical regular expressions—writing regexes that solve real problems; efficient regular expressions—writing regexes that use resources, such as memory, efficiently
  • Hands-on exercise: Work with practical and efficient regular expressions
  • Q&A
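
One common way to keep a practical regex readable and reasonably cheap is to compile it once and prefer specific character classes over a broad `.*`. The sketch below assumes a hypothetical log format invented for this example; the `LOG_LINE` pattern and `parse` helper are illustrative names, not course material.

```python
import re

# Hypothetical log format: host [timestamp] "request" status
# Negated classes such as [^\]]+ and [^"]* say exactly where each field
# ends, which limits how much backtracking the engine can do.
LOG_LINE = re.compile(r'^(\S+) \[([^\]]+)\] "([^"]*)" (\d{3})$')

def parse(line):
    """Return (host, timestamp, request, status), or None if no match."""
    m = LOG_LINE.match(line)
    return m.groups() if m else None

sample = '10.0.0.1 [01/May/2024:10:00:00] "GET /index.html" 200'
parsed = parse(sample)
rejected = parse("not a log line")
```

Compiling the pattern once lets the same pattern object be reused for every line, which is convenient when processing a large file.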

Other tools and programming languages (30 minutes)

  • Lecture: sed and awk; JavaScript; Python
  • Hands-on exercise: Work with other tools and programming languages
  • Q&A
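
For a rough sense of how sed-style and awk-style tasks carry over to a programming language, here is a Python sketch with invented data. The first part approximates `sed 's/colou?r/color/g'` in an extended-regex mode; the second approximates `awk '$3 ~ /^5[0-9][0-9]$/ {print $1}'`.

```python
import re

# sed-like: replace every occurrence of "colour" or "color" with "color".
normalized = re.sub(r"colou?r", "color", "colour and color")

# awk-like: split each row into whitespace-separated fields, test the
# third field against a regex, and keep the first field of matching rows.
rows = ["a.html GET 200", "b.html GET 500", "c.html POST 503"]
failing = [
    fields[0]
    for fields in (row.split() for row in rows)
    if re.fullmatch(r"5\d\d", fields[2])
]
```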