Learn LLVM 12

Book description

Learn how to build and use all parts of real-world compilers, including the frontend, the optimization pipeline, and a new backend, by leveraging the power of the LLVM core libraries

Key Features

  • Get to grips with using LLVM libraries effectively, step by step
  • Understand the high-level design of the LLVM compiler and apply the same principles to your own compiler
  • Use compiler-based tools to improve the quality of code in C++ projects

Book Description

LLVM was built to bridge the gap between compiler textbooks and actual compiler development. It provides a modular codebase and advanced tools that help developers build compilers easily. This book is a practical introduction to LLVM that gradually guides you through the more complex scenarios you will meet when building and working with compilers.

You’ll start by configuring, building, and installing LLVM libraries, tools, and external projects. Next, the book will introduce you to the LLVM design and how it works in practice during each LLVM compiler stage: frontend, optimizer, and backend. Using a subset of a real programming language as an example, you will then learn how to develop a frontend and generate LLVM IR, hand it over to the optimization pipeline, and generate machine code from it. Later chapters will show you how to extend LLVM with a new pass and how instruction selection in LLVM works. You’ll also examine the issues around just-in-time (JIT) compilation and the current state of JIT support in LLVM, before finally learning how to develop a new backend for LLVM.

By the end of this LLVM book, you will have gained real-world experience in working with the LLVM compiler development framework with the help of hands-on examples and source code snippets.

What you will learn

  • Configure, compile, and install the LLVM framework
  • Understand how the LLVM source is organized
  • Discover what you need to do to use LLVM in your own projects
  • Explore how a compiler is structured, and implement a tiny compiler
  • Generate LLVM IR for common source language constructs
  • Set up an optimization pipeline and tailor it for your own needs
  • Extend LLVM with transformation passes and clang tooling
  • Add new machine instructions and a complete backend

Who this book is for

This book is for compiler developers, enthusiasts, and engineers who are new to LLVM and are interested in learning about the LLVM framework. It is also useful for C++ software engineers looking to use compiler-based tools for code analysis and improvement, as well as casual users of LLVM libraries who want to gain more knowledge of LLVM essentials. Intermediate-level experience with C++ programming is required to understand the concepts covered in this book.

Table of contents

  1. Learn LLVM 12
  2. Contributors
  3. About the author
  4. About the reviewer
  5. Preface
    1. Who this book is for
    2. What this book covers
    3. To get the most out of this book
    4. Download the example code files
    5. Code in Action
    6. Download the color images
    7. Conventions used
    8. Get in touch
    9. Reviews
  6. Section 1 – The Basics of Compiler Construction with LLVM
  7. Chapter 1: Installing LLVM
    1. Getting the prerequisites ready
      1. Ubuntu
      2. Fedora and RedHat
      3. FreeBSD
      4. OS X
      5. Windows
      6. Configuring Git
    2. Building with CMake
      1. Cloning the repository
      2. Creating a build directory
      3. Generating the build system files
    3. Customizing the build process
      1. Variables defined by CMake
      2. Variables defined by LLVM
    4. Summary
  8. Chapter 2: Touring the LLVM Source
    1. Technical requirements
    2. Contents of the LLVM mono repository
      1. LLVM core libraries and additions
      2. Compilers and tools
      3. Runtime libraries
    3. Layout of an LLVM project
    4. Creating your own project using LLVM libraries
      1. Creating the directory structure
      2. Adding the CMake files
      3. Adding the C++ source files
      4. Compiling the tinylang application
    5. Targeting a different CPU architecture
    6. Summary
  9. Chapter 3: The Structure of a Compiler
    1. Technical requirements
    2. Building blocks of a compiler
    3. An arithmetic expression language
      1. Formalism for specifying the syntax of a programming language
      2. How grammar helps the compiler writer
    4. Lexical analysis
      1. A handwritten lexer
    5. Syntactical analysis
      1. A handwritten parser
      2. The abstract syntax tree
    6. Semantic analysis
    7. Generating code with the LLVM backend
      1. Textual representation of the LLVM IR
      2. Generating the IR from the AST
      3. The missing pieces – the driver and the runtime library
    8. Summary
  10. Section 2 – From Source to Machine Code Generation
  11. Chapter 4: Turning the Source File into an Abstract Syntax Tree
    1. Technical requirements
    2. Defining a real programming language
    3. Creating the project layout
    4. Managing source files and user messages
    5. Structuring the lexer
    6. Constructing a recursive descent parser
    7. Generating a parser and lexer with bison and flex
    8. Performing semantic analysis
      1. Handling the scope of names
      2. Using LLVM-style RTTI for the AST
      3. Creating the semantic analyzer
    9. Summary
  12. Chapter 5: Basics of IR Code Generation
    1. Technical requirements
    2. Generating IR from the AST
      1. Understanding the IR code
      2. Knowing the load-and-store approach
      3. Mapping the control flow to basic blocks
    3. Using AST numbering to generate IR code in SSA form
      1. Defining the data structure to hold values
      2. Reading and writing values local to a basic block
      3. Searching the predecessor blocks for a value
      4. Optimizing the generated phi instructions
      5. Sealing a block
      6. Creating IR code for expressions
      7. Emitting the IR code for a function
      8. Controlling visibility with linkage and name mangling
      9. Converting types from an AST description to LLVM types
      10. Creating the LLVM IR function
      11. Emitting the function body
    4. Setting up the module and the driver
      1. Wrapping everything in the code generator
      2. Initializing the target machine class
      3. Emitting assembler text and object code
    5. Summary
  13. Chapter 6: IR Generation for High-Level Language Constructs
    1. Technical requirements
    2. Working with arrays, structs, and pointers
    3. Getting the application binary interface right
    4. Creating IR code for classes and virtual functions
      1. Implementing single inheritance
      2. Extending single inheritance with interfaces
      3. Adding support for multiple inheritance
    5. Summary
  14. Chapter 7: Advanced IR Generation
    1. Technical requirements
    2. Throwing and catching exceptions
      1. Raising an exception
      2. Catching an exception
      3. Integrating the exception-handling code into the application
    3. Generating metadata for type-based alias analysis
      1. Understanding the need for additional metadata
      2. Adding TBAA metadata to tinylang
    4. Adding debug metadata
      1. Understanding the general structure of debug metadata
      2. Tracking variables and their values
      3. Adding line numbers
      4. Adding debug support to tinylang
    5. Summary
  15. Chapter 8: Optimizing IR
    1. Technical requirements
    2. Introducing the LLVM Pass manager
    3. Implementing a Pass using the new Pass manager
      1. Adding a Pass to the LLVM source tree
      2. Adding a new Pass as a plugin
    4. Adapting a Pass for use with the old Pass manager
    5. Adding an optimization pipeline to your compiler
      1. Creating an optimization pipeline with the new Pass manager
      2. Extending the Pass pipeline
    6. Summary
  16. Section 3 – Taking LLVM to the Next Level
  17. Chapter 9: Instruction Selection
    1. Technical requirements
    2. Understanding the LLVM target backend structure
    3. Using MIR to test and debug the backend
    4. How instruction selection works
      1. Specifying the target description in the TableGen language
      2. Instruction selection with the selection DAG
      3. Fast instruction selection – FastISel
      4. The new global instruction selection – GlobalISel
    5. Supporting new machine instructions
      1. Adding a new instruction to the assembler and code generation
      2. Testing the new instruction
    6. Summary
  18. Chapter 10: JIT Compilation
    1. Technical requirements
    2. Getting an overview of LLVM's JIT implementation and use cases
    3. Using JIT compilation for direct execution
      1. Exploring the lli tool
      2. Implementing our own JIT compiler with LLJIT
      3. Building a JIT compiler class from scratch
    4. Utilizing a JIT compiler for code evaluation
      1. Identifying the language semantics
    5. Summary
  19. Chapter 11: Debugging Using LLVM Tools
    1. Technical requirements
    2. Instrumenting an application with sanitizers
      1. Detecting memory access problems with the address sanitizer
      2. Finding uninitialized memory access with the memory sanitizer
      3. Pointing out data races with the thread sanitizer
    3. Finding bugs with libFuzzer
      1. Limitations and alternatives
    4. Performance profiling with XRay
    5. Checking the source with the Clang Static Analyzer
      1. Adding a new checker to the Clang Static Analyzer
    6. Creating your own Clang-based tool
    7. Summary
  20. Chapter 12: Create Your Own Backend
    1. Technical requirements
    2. Setting the stage for a new backend
    3. Adding the new architecture to the Triple class
    4. Extending the ELF file format definition in LLVM
    5. Creating the target description
      1. Implementing the top-level file of the target description
      2. Adding the register definition
      3. Defining the calling convention
      4. Creating the scheduling model
      5. Defining the instruction formats and the instruction information
    6. Implementing the DAG instruction selection classes
      1. Initializing the target machine
      2. Adding the selection DAG implementation
      3. Supporting target-specific operations
      4. Configuring the target lowering
    7. Generating assembler instructions
    8. Emitting machine code
    9. Adding support for disassembling
    10. Piecing it all together
    11. Summary
    12. Why subscribe?
  21. Other Books You May Enjoy
    1. Packt is searching for authors like you
    2. Leave a review - let other readers know what you think

Product information

  • Title: Learn LLVM 12
  • Author(s): Kai Nacke
  • Release date: May 2021
  • Publisher(s): Packt Publishing
  • ISBN: 9781839213502