Practical Binary Analysis

Book Description

As malware increasingly obfuscates itself and applies anti-analysis techniques to thwart inspection, we need more sophisticated methods to raise the dark curtain designed to keep us out. Binary analysis can help. The goal of all binary analysis is to determine (and possibly modify) the true properties of binary programs, so that we understand what they really do rather than what we think they should do. While reverse engineering and disassembly are critical first steps in many forms of binary analysis, there is much more to be learned.

This hands-on guide teaches you how to tackle the fascinating but challenging topics of binary analysis and instrumentation, and helps you become proficient in an area typically mastered only by a small group of expert hackers. It will take you from basic concepts to state-of-the-art methods as you dig into topics like code injection, disassembly, dynamic taint analysis, and binary instrumentation. Written for security engineers, hackers, and those with a basic working knowledge of C/C++ and x86-64, Practical Binary Analysis will teach you in depth how binary programs work and help you acquire the tools and techniques needed to gain more control over and insight into binary programs.

Once you’ve completed an introduction to basic binary formats, you’ll learn how to analyze binaries using the GNU/Linux binary analysis toolchain and techniques like disassembly and code injection. You’ll then go on to implement profiling tools with Pin, build your own dynamic taint analysis tools with libdft, and build symbolic execution tools using Triton. You’ll learn how to:

• Parse ELF and PE binaries and build a binary loader with libbfd
• Use data-flow analysis techniques like program tracing, slicing, and reaching definitions analysis to reason about the runtime flow of your programs
• Modify ELF binaries with techniques like parasitic code injection and hex editing
• Build custom disassembly tools with Capstone
• Use binary instrumentation to circumvent anti-analysis tricks commonly used by malware
• Apply taint analysis to detect control hijacking and data leak attacks
• Use symbolic execution to build automatic exploitation tools

With exercises at the end of each chapter to help solidify your skills, you’ll go from understanding basic assembly to performing some of the most sophisticated forms of binary analysis and instrumentation. Practical Binary Analysis gives you what you need to work effectively with binary programs and to move from a basic understanding to expert-level proficiency.

Table of Contents

  1. Cover Page
  2. Title Page
  3. Copyright Page
  4. Dedication
  5. About the Author
  6. BRIEF CONTENTS
  7. CONTENTS IN DETAIL
  8. FOREWORD
  9. PREFACE
  10. ACKNOWLEDGMENTS
  11. INTRODUCTION
    1. What Is Binary Analysis, and Why Do You Need It?
    2. What Makes Binary Analysis Challenging?
    3. Who Should Read This Book?
    4. What’s in This Book?
    5. How to Use This Book
  12. PART I: BINARY FORMATS
  13. 1 ANATOMY OF A BINARY
    1. 1.1 The C Compilation Process
    2. 1.2 Symbols and Stripped Binaries
    3. 1.3 Disassembling a Binary
    4. 1.4 Loading and Executing a Binary
    5. 1.5 Summary
    6. Exercises
  14. 2 THE ELF FORMAT
    1. 2.1 The Executable Header
    2. 2.2 Section Headers
    3. 2.3 Sections
    4. 2.4 Program Headers
    5. 2.5 Summary
    6. Exercises
  15. 3 THE PE FORMAT: A BRIEF INTRODUCTION
    1. 3.1 The MS-DOS Header and MS-DOS Stub
    2. 3.2 The PE Signature, File Header, and Optional Header
    3. 3.3 The Section Header Table
    4. 3.4 Sections
    5. 3.5 Summary
    6. Exercises
  16. 4 BUILDING A BINARY LOADER USING LIBBFD
    1. 4.1 What Is libbfd?
    2. 4.2 A Simple Binary-Loading Interface
    3. 4.3 Implementing the Binary Loader
    4. 4.4 Testing the Binary Loader
    5. 4.5 Summary
    6. Exercises
  17. PART II: BINARY ANALYSIS FUNDAMENTALS
  18. 5 BASIC BINARY ANALYSIS IN LINUX
    1. 5.1 Resolving Identity Crises Using file
    2. 5.2 Using ldd to Explore Dependencies
    3. 5.3 Viewing File Contents with xxd
    4. 5.4 Parsing the Extracted ELF with readelf
    5. 5.5 Parsing Symbols with nm
    6. 5.6 Looking for Hints with strings
    7. 5.7 Tracing System Calls and Library Calls with strace and ltrace
    8. 5.8 Examining Instruction-Level Behavior Using objdump
    9. 5.9 Dumping a Dynamic String Buffer Using gdb
    10. 5.10 Summary
    11. Exercise
  19. 6 DISASSEMBLY AND BINARY ANALYSIS FUNDAMENTALS
    1. 6.1 Static Disassembly
    2. 6.2 Dynamic Disassembly
    3. 6.3 Structuring Disassembled Code and Data
    4. 6.4 Fundamental Analysis Methods
    5. 6.5 Effects of Compiler Settings on Disassembly
    6. 6.6 Summary
    7. Exercises
  20. 7 SIMPLE CODE INJECTION TECHNIQUES FOR ELF
    1. 7.1 Bare-Metal Binary Modification Using Hex Editing
    2. 7.2 Modifying Shared Library Behavior Using LD_PRELOAD
    3. 7.3 Injecting a Code Section
    4. 7.4 Calling Injected Code
    5. 7.5 Summary
    6. Exercises
  21. PART III: ADVANCED BINARY ANALYSIS
  22. 8 CUSTOMIZING DISASSEMBLY
    1. 8.1 Why Write a Custom Disassembly Pass?
    2. 8.2 Introduction to Capstone
    3. 8.3 Implementing a ROP Gadget Scanner
    4. 8.4 Summary
    5. Exercises
  23. 9 BINARY INSTRUMENTATION
    1. 9.1 What Is Binary Instrumentation?
    2. 9.2 Static Binary Instrumentation
    3. 9.3 Dynamic Binary Instrumentation
    4. 9.4 Profiling with Pin
    5. 9.5 Automatic Binary Unpacking with Pin
    6. 9.6 Summary
    7. Exercises
  24. 10 PRINCIPLES OF DYNAMIC TAINT ANALYSIS
    1. 10.1 What Is DTA?
    2. 10.2 DTA in Three Steps: Taint Sources, Taint Sinks, and Taint Propagation
    3. 10.3 Using DTA to Detect the Heartbleed Bug
    4. 10.4 DTA Design Factors: Taint Granularity, Taint Colors, and Taint Policies
    5. 10.5 Summary
    6. Exercise
  25. 11 PRACTICAL DYNAMIC TAINT ANALYSIS WITH LIBDFT
    1. 11.1 Introducing libdft
    2. 11.2 Using DTA to Detect Remote Control-Hijacking
    3. 11.3 Circumventing DTA with Implicit Flows
    4. 11.4 A DTA-Based Data Exfiltration Detector
    5. 11.5 Summary
    6. Exercise
  26. 12 PRINCIPLES OF SYMBOLIC EXECUTION
    1. 12.1 An Overview of Symbolic Execution
    2. 12.2 Constraint Solving with Z3
    3. 12.3 Summary
    4. Exercises
  27. 13 PRACTICAL SYMBOLIC EXECUTION WITH TRITON
    1. 13.1 Introduction to Triton
    2. 13.2 Maintaining Symbolic State with Abstract Syntax Trees
    3. 13.3 Backward Slicing with Triton
    4. 13.4 Using Triton to Increase Code Coverage
    5. 13.5 Automatically Exploiting a Vulnerability
    6. 13.6 Summary
    7. Exercise
  28. PART IV: APPENDIXES
  29. A A CRASH COURSE ON X86 ASSEMBLY
    1. A.1 Layout of an Assembly Program
    2. A.2 Structure of an x86 Instruction
    3. A.3 Common x86 Instructions
    4. A.4 Common Code Constructs in Assembly
  30. B IMPLEMENTING PT_NOTE OVERWRITING USING LIBELF
    1. B.1 Required Headers
    2. B.2 Data Structures Used in elfinject
    3. B.3 Initializing libelf
    4. B.4 Getting the Executable Header
    5. B.5 Finding the PT_NOTE Segment
    6. B.6 Injecting the Code Bytes
    7. B.7 Aligning the Load Address for the Injected Section
    8. B.8 Overwriting the .note.ABI-tag Section Header
    9. B.9 Setting the Name of the Injected Section
    10. B.10 Overwriting the PT_NOTE Program Header
    11. B.11 Modifying the Entry Point
  31. C LIST OF BINARY ANALYSIS TOOLS
    1. C.1 Disassemblers
    2. C.2 Debuggers
    3. C.3 Disassembly Frameworks
    4. C.4 Binary Analysis Frameworks
  32. D FURTHER READING
    1. D.1 Standards and References
    2. D.2 Papers and Articles
    3. D.3 Books
  33. INDEX