8 Parsing and Corruption

Like the corruption of a Jedi, the corruption of memory proceeds in stages. A seed is planted, it grows, and eventually, a Sith tries to harvest it. The seed may be as small as a single bit, and the reward harvested is often the ability to run code of the attacker’s choosing.

Input corrupts, and unconstrained input corrupts deviously. Input corrupts because it is the source, the carrier, the medium whose message is LULZ. Almost all attacks are inputs. But useful programs must process input, and interesting programs, those that surprise, delight, or even merely serve us, take complex input. That input is sometimes deviously and cunningly designed to have specific and detrimental effects. To be clear, the surprise is to the programmer who wrote the code, not the one who crafts the input.

This chapter will look at memory corruption, which happens frequently when parsing input; it is a step on the way to exploitation but is not synonymous with it. Memory can also become corrupt accidentally, through ordinary bugs or even cosmic rays. Such accidents usually lead to a crash or uselessly weird behavior.

After we look at corruption and the threats to parsers, we’ll consider defenses, including input validation in its many flavors, memory safety tools that seek to limit and constrain corruption, and then robust defensive patterns, including Recognizer, Single Parser, and safer language design. The Recognizer pattern concentrates all parsing in a Recognizer, which hands it off to the rest ...
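The description of the Recognizer pattern above is cut off, so what follows is only a minimal sketch of the general idea as stated: one function does all the parsing, accepts or rejects the whole input, and hands a validated value to everything else. The wire format (one kind byte, a two-byte big-endian length, then the payload), the names `recognize`, `Message`, `RejectedInput`, and `handle`, and the `MAX_PAYLOAD` limit are all assumptions made for illustration, not anything taken from the text.

```python
from dataclasses import dataclass


class RejectedInput(ValueError):
    """Raised when input does not match the expected format exactly."""


@dataclass(frozen=True)
class Message:
    # Hypothetical record: a one-byte kind plus a payload.
    kind: int
    payload: bytes


MAX_PAYLOAD = 1024  # assumed limit for this sketch


def recognize(raw: bytes) -> Message:
    """Accept or reject the whole input before any other code sees it.

    Assumed wire format: 1 byte kind, 2 bytes big-endian length, payload.
    """
    if len(raw) < 3:
        raise RejectedInput("header too short")
    kind = raw[0]
    declared = int.from_bytes(raw[1:3], "big")
    if kind not in (1, 2):
        raise RejectedInput("unknown kind")
    if declared > MAX_PAYLOAD:
        raise RejectedInput("declared length exceeds limit")
    # The attacker controls the length field, so it is checked against
    # the bytes actually received rather than trusted.
    if len(raw) - 3 != declared:
        raise RejectedInput("declared length does not match actual length")
    return Message(kind=kind, payload=raw[3:])


def handle(msg: Message) -> str:
    # Code past the recognizer works only with validated Message objects.
    return f"kind={msg.kind} bytes={len(msg.payload)}"
```

In this sketch, `handle(recognize(b"\x01\x00\x02hi"))` succeeds, while input whose declared length disagrees with its actual length raises `RejectedInput` before any other code touches it. Python is itself memory safe; in a language without bounds checking, that same mismatch is the kind of seed that can grow into corruption.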
