Chapter 4. Malware Analysis
When the air-gapped nuclear centrifuges in Iran’s Natanz uranium enrichment facility inexplicably ceased to function in 2010, no one knew for sure who was responsible. The culprit, the Stuxnet worm, was one of the most sensational successes of international cyber warfare, and a game-changing demonstration of the far-reaching destructive capabilities of malicious computer software. This piece of malware propagated itself indiscriminately around the world, unleashing its payload only when it detected the specific make of industrial computer system that the target used. Stuxnet reportedly ended up on tens of thousands of Windows machines, where it lay dormant, while destroying one-fifth of Iran’s nuclear centrifuges and thereby achieving its alleged goal of obstructing the state’s weapons program.
Malware analysis is the study of the functionality, purpose, origin, and potential impact of malicious software. This task has traditionally been highly manual and laborious, requiring analysts with expert knowledge of software internals and reverse engineering. Data science and machine learning have shown promise in automating certain parts of malware analysis, but these methods still rely heavily on extracting meaningful features from the data, a nontrivial task that continues to require practitioners with specialized skill sets.
In this chapter, we do not focus on statistical learning methods.1 Instead, we discuss one of the most important but often underemphasized ...