Hands-On System Programming with Linux

Get up and running with system programming concepts in Linux

Key Features

  • Acquire insight on Linux system architecture and its programming interfaces
  • Get to grips with core concepts such as process management, signaling, and pthreads
  • Packed with industry best practices and dozens of code examples

Book Description

The Linux OS and its embedded and server applications are critical components of today's software infrastructure in a decentralized, networked universe. The industry's demand for proficient Linux developers is only rising with time. Hands-On System Programming with Linux gives you a solid theoretical base and practical, industry-relevant descriptions, and covers the Linux system programming domain. It delves into the art and science of Linux application programming: system architecture, process memory and management, signaling, timers, pthreads, and file I/O.

This book goes beyond the "use API X to do Y" approach; it explains the concepts and theories required to understand programming interfaces, the design decisions and rationale behind them, and the tradeoffs experienced developers make when using them. Troubleshooting tips and techniques are included in the concluding chapter.

By the end of this book, you will have gained essential conceptual design knowledge and hands-on experience working with Linux system programming interfaces.

What you will learn

  • Explore the theoretical underpinnings of Linux system architecture
  • Understand why modern OSes use virtual memory and dynamic memory APIs
  • Get to grips with dynamic memory issues and effectively debug them
  • Learn key concepts and powerful system APIs related to process management
  • Effectively perform file I/O and use signaling and timers
  • Deeply understand multithreading concepts, pthreads APIs, synchronization and scheduling

Who this book is for

Hands-On System Programming with Linux is for Linux system engineers, programmers, or anyone who wants to go beyond using an API set to understanding the theoretical underpinnings and concepts behind powerful Linux system programming APIs. To get the most out of this book, you should be familiar with Linux at the user level: logging in, using the shell via the command-line interface, and using tools such as find, grep, and sort. Working knowledge of the C programming language is required. No prior experience with Linux systems programming is assumed.

Downloading the example code for this book

You can download the example code files for all Packt books you have purchased from your account at http://www.PacktPub.com. If you purchased this book elsewhere, you can visit http://www.PacktPub.com/support and register to have the files e-mailed directly to you.

Table of Contents

  1. Title Page
  2. Copyright and Credits
    1. Hands-On System Programming with Linux
  3. Packt Upsell
    1. Why subscribe?
    2. Packt.com
  4. Contributors
    1. About the author
    2. About the reviewer
    3. Packt is searching for authors like you
  5. Preface
    1. Who this book is for
    2. What this book covers
    3. To get the most out of this book
      1. Download the example code files
      2. Download the color images
      3. Conventions used
    4. Get in touch
      1. Reviews
  6. Linux System Architecture
    1. Technical requirements
    2. Linux and the Unix operating system
    3. The Unix philosophy in a nutshell
      1. Everything is a process – if it's not a process, it's a file
      2. One tool to do one task
      3. Three standard I/O channels
        1. Word count 
        2. cat
      4. Combine tools seamlessly
      5. Plain text preferred
      6. CLI, not GUI
      7. Modular, designed to be repurposed by others
      8. Provide mechanisms, not policies
        1. Pseudocode
    4. Linux system architecture
      1. Preliminaries
        1. The ABI
        2. Accessing a register's content via inline assembly
        3. Accessing a control register's content via inline assembly
        4. CPU privilege levels
          1. Privilege levels or rings on the x86
      2. Linux architecture
        1. Libraries
        2. System calls
        3. Linux – a monolithic OS
          1. What does that mean?
    5. Execution contexts within the kernel
      1. Process context
      2. Interrupt context
    6. Summary
  7. Virtual Memory
    1. Technical requirements
    2. Virtual memory
      1. No VM – the problem
        1. Objective
      2. Virtual memory
        1. Addressing 1 – the simplistic flawed approach
        2. Addressing 2 – paging in brief
          1. Paging tables – simplified
          2. Indirection
          3. Address-translation
      3. Benefits of using VM
        1. Process-isolation
        2. The programmer need not worry about physical memory
        3. Memory-region protection
        4. SIDEBAR :: Testing the memcpy() C program
    3. Process memory layout
      1. Segments or mappings
        1. Text segment
        2. Data segments
        3. Library segments
        4. Stack segment
          1. What is stack memory?
          2. Why a process stack?
          3. Peeking at the stack
      2. Advanced – the VM split
    4. Summary
  8. Resource Limits
    1. Resource limits
    2. Granularity of resource limits
      1. Resource types
        1. Available resource limits
    3. Hard and soft limits
      1. Querying and changing resource limit values
        1. Caveats
        2. A quick note on the prlimit utility
          1. Using prlimit(1) – examples
      2. API interfaces
        1. Code examples
      3. Permanence
    4. Summary
  9. Dynamic Memory Allocation
    1. The glibc malloc(3) API family
      1. The malloc(3) API
        1. malloc(3) – some FAQs
        2. malloc(3) – a quick summary
      2. The free API
        1. free – a quick summary
      3. The calloc API
      4. The realloc API
        1. The realloc(3) – corner cases
        2. The reallocarray API
    2. Beyond the basics
      1. The program break
      2. Using the sbrk() API
      3. How malloc(3) really behaves
        1. Code example – malloc(3) and the program break
          1. Scenario 1 – default options
          2. Scenario 2 – showing malloc statistics
          3. Scenario 3 – large allocations option
        2. Where does freed memory go?
      4. Advanced features
        1. Demand-paging
          1. Resident or not?
        2. Locking memory
          1. Limits and privileges
          2. Locking all pages
        3. Memory protection
          1. Memory protection – a code example
        4. An Aside – LSM logs, Ftrace
          1. LSM logs
          2. Ftrace
          3. An experiment – running the memprot program on an ARM-32
          4. Memory protection keys – a brief note
        5. Using alloca to allocate automatic memory
    3. Summary
  10. Linux Memory Issues
    1. Common memory issues
      1. Incorrect memory accesses
        1. Accessing and/or using uninitialized variables
          1. Test case 1: Uninitialized memory access
        2. Out-of-bounds memory accesses
          1. Test case 2
          2. Test case 3
          3. Test case 4
          4. Test case 5
          5. Test case 6
          6. Test case 7
        3. Use-after-free/Use-after-return bugs
          1. Test case 8
          2. Test case 9
          3. Test case 10
      2. Leakage
        1. Test case 11
        2. Test case 12
        3. Test case 13 
          1. Test case 13.1
          2. Test case 13.2
          3. Test case 13.3
      3. Undefined behavior
      4. Fragmentation
      5. Miscellaneous
    2. Summary
  11. Debugging Tools for Memory Issues
    1. Tool types
      1. Valgrind
        1. Using Valgrind's Memcheck tool
        2. Valgrind summary table
        3. Valgrind pros and cons – a quick summary
      2. Sanitizer tools
        1. Sanitizer toolset
        2. Building programs for use with ASan
        3. Running the test cases with ASan
        4. AddressSanitizer (ASan) summary table
        5. AddressSanitizer pros and cons – a quick summary
      3. Glibc mallopt
        1. Malloc options via the environment 
    2. Some key points
      1. Code coverage while testing
      2. What is the modern C/C++ developer to do?
      3. A mention of the malloc API helpers
    3. Summary
  12. Process Credentials
    1. The traditional Unix permissions model
      1. Permissions at the user level
      2. How the Unix permission model works
        1. Determining the access category
      3. Real and effective IDs
        1. A puzzle – how can a regular user change their password?
        2. The setuid and setgid special permission bits
          1. Setting the setuid and setgid bits with chmod
          2. Hacking attempt 1
      4. System calls
        1. Querying the process credentials
          1. Code example
          2. Sudo – how it works
          3. What is a saved-set ID?
        2. Setting the process credentials
          1. Hacking attempt 2
        3. An aside – a script to identify setuid-root and setgid installed programs
          1. setgid example – wall
          2. Giving up privileges
          3. Saved-set UID – a quick demo
          4. The setres[u|g]id(2) system calls
      5. Important security notes
    2. Summary
  13. Process Capabilities
    1. The modern POSIX capabilities model
      1. Motivation
      2. POSIX capabilities
      3. Capabilities – some gory details
        1. OS support
          1. Viewing process capabilities via procfs
        2. Thread capability sets
        3. File capability sets
      4. Embedding capabilities into a program binary
        1. Capability-dumb binaries
          1. Getcap and similar utilities
          2. Wireshark – a case in point
        2. Setting capabilities programmatically
    2. Miscellaneous
      1. How ls displays different binaries
      2. Permission models layering
      3. Security tips
        1. FYI – under the hood, at the level of the Kernel
    3. Summary
  14. Process Execution
    1. Technical requirements
    2. Process execution
      1. Converting a program to a process
      2. The exec Unix axiom
        1. Key points during an exec operation
        2. Testing the exec axiom
          1. Experiment 1 – on the CLI, no frills
          2. Experiment 2 – on the CLI, again
        3. The point of no return
      3. Family time – the exec family APIs
        1. The wrong way
          1. Error handling and the exec
          2. Passing a zero as an argument
          3. Specifying the name of the successor
        2. The remaining exec family APIs
          1. The execlp API
          2. The execle API
          3. The execv API
        3. Exec at the OS level
        4. Summary table – exec family of APIs
        5. Code example
    3. Summary
  15. Process Creation
    1. Process creation
      1. How fork works
      2. Using the fork system call
        1. Fork rule #1
        2. Fork rule #2 – the return
        3. Fork rule #3
          1. Atomic execution?
        4. Fork rule #4 – data
        5. Fork rule #5 – racing
        6. The process and open files
          1. Fork rule #6 – open files
          2. Open files and security
        7. Malloc and the fork
          1. COW in a nutshell
      3. Waiting and our simpsh project
        1. The Unix fork-exec semantic
          1. The need to wait
        2. Performing the wait
          1. Defeating the race after fork
          2. Putting it together – our simpsh project
          3. The wait API – details
        3. The scenarios of wait
          1. Wait scenario #1
          2. Wait scenario #2
          3. Fork bombs and creating more than one child
          4. Wait scenario #3
        4. Variations on the wait – APIs
          1. The waitpid(2)
          2. The waitid(2)
          3. The actual system call
        5. A note on the vfork
      4. More Unix weirdness
        1. Orphans
        2. Zombies
          1. Fork rule #7
      5. The rules of fork – a summary
    2. Summary
  16. Signaling - Part I
    1. Why signals?
      1. The signal mechanism in brief
    2. Available signals
      1. The standard or Unix signals
    3. Handling signals
      1. Using the sigaction system call to trap signals
        1. Sidebar – the feature test macros
        2. The sigaction structure
        3. Masking signals
          1. Signal masking with the sigprocmask API
          2. Querying the signal mask
        4. Sidebar – signal handling within the OS – polling not interrupts
      2. Reentrant safety and signalling
        1. Reentrant functions
          1. Async-signal-safe functions
        2. Alternate ways to be safe within a signal handler
          1. Signal-safe atomic integers
      3. Powerful sigaction flags
        1. Zombies not invited
          1. No zombies! – the classic way
          2. No zombies! – the modern way
        2. The SA_NOCLDSTOP flag
        3. Interrupted system calls and how to fix them with the SA_RESTART
        4. The once only SA_RESETHAND flag
        5. To defer or not? Working with SA_NODEFER
          1. Signal behavior when masked
          2. Case 1 – default – SA_NODEFER bit cleared
          3. Case 2 – SA_NODEFER bit set
          4. Running of case 1 – SA_NODEFER bit cleared [default]
          5. Running of case 2 – SA_NODEFER bit set
        6. Using an alternate signal stack
          1. Implementation to handle high-volume signals with an alternate signal stack
          2. Case 1 – very small (100 KB) alternate signal stack
          3. Case 2 – a large (16 MB) alternate signal stack
        7. Different approaches to handling signals at high volume
    4. Summary
  17. Signaling - Part II
    1. Gracefully handling process crashes
      1. Detailing information with the SA_SIGINFO
        1. The siginfo_t structure
        2. Getting system-level details when a process crashes
          1. Trapping and extracting information from a crash
          2. Register dumping
          3. Finding the crash location in source code
    2. Signaling – caveats and gotchas
      1. Handling errno gracefully
        1. What does errno do?
        2. The errno race
        3. Fixing the errno race
      2. Sleeping correctly
        1. The nanosleep system call
    3. Real-time signals
      1. Differences from standard signals
        1. Real time signals and priority
    4. Sending signals
      1. Just kill 'em
        1. Killing yourself with a raise
        2. Agent 00 – permission to kill
        3. Are you there?
      2. Signaling as IPC
        1. Crude IPC
        2. Better IPC – sending a data item
          1. Sidebar – LTTng
    5. Alternative signal-handling techniques
      1. Synchronously waiting for signals
        1. Pause, please
          1. Waiting forever or until a signal arrives
        2. Synchronously blocking for signals via the sigwait* APIs
          1. The sigwait library API
          2. The sigwaitinfo and the sigtimedwait system calls
        3. The signalfd(2) API
    6. Summary
  18. Timers
    1. Older interfaces
      1. The good ol' alarm clock
        1. Alarm API – the downer
      2. Interval timers
        1. A simple CLI digital clock
          1. Obtaining the current time
          2. Trial runs
        2. A word on using the profiling timers
    2. The newer POSIX (interval) timers mechanism
      1. Typical application workflow
        1. Creating and using a POSIX (interval) timer
          1. The arms race – arming and disarming a POSIX timer
          2. Querying the timer
          3. Example code snippet showing the workflow
          4. Figuring the overrun
      2. POSIX interval timers – example programs
        1. The reaction-time game
          1. How fast is fast?
          2. Our react game – how it works
          3. React – trial runs
          4. The react game – code view
        2. The run:walk interval timer application
          1. A few trial runs
          2. The low-level design and code
        3. Timer lookup via proc
    3. A quick mention
      1. Timers via file descriptors
      2. A quick note on watchdog timers
    4. Summary
  19. Multithreading with Pthreads Part I - Essentials
    1. Multithreading concepts
      1. What exactly is a thread?
        1. Resource sharing
      2. Multiprocess versus multithreaded
        1. Example 1 – creation/destruction – process/thread
          1. The multithreading model
        2. Example 2 – matrix multiplication – process/thread
        3. Example 3 – kernel build
          1. On a VM with 1 GB RAM, two CPU cores and parallelized make -j4
          2. On a VM with 1 GB RAM, one CPU core and sequential make -j1
      3. Motivation – why threads?
        1. Design motivation
          1. Taking advantage of potential parallelism
          2. Logical separation
          3. Overlapping CPU with I/O
          4. Manager-worker model
          5. IPC becoming simple(r)
        2. Performance motivation
          1. Creation and destruction
          2. Automatically taking advantage of modern hardware
          3. Resource sharing
          4. Context switching
      4. A brief history of threading
        1. POSIX threads
        2. Pthreads and Linux
    2. Thread management – the essential pthread APIs
      1. Thread creation
      2. Termination
        1. The return of the ghost
        2. So many ways to die
      3. How many threads is too many?
        1. How many threads can you create?
          1. Code example – creating any number of threads
        2. How many threads should one create?
      4. Thread attributes
        1. Code example – querying the default thread attributes
      5. Joining
        1. The thread model join and the process model wait
        2. Checking for life, timing out
        3. Join or not?
      6. Parameter passing
        1. Passing a structure as a parameter
        2. Thread parameters – what not to do
      7. Thread stacks
        1. Get and set thread stack size
        2. Stack location
        3. Stack guards
    3. Summary
  20. Multithreading with Pthreads Part II - Synchronization
    1. The racing problem
      1. Concurrency and atomicity
        1. The pedagogical bank account example
        2. Critical sections
    2. Locking concepts
      1. Is it atomic?
        1. Dirty reads
      2. Locking guidelines
        1. Locking granularity
      3. Deadlock and its avoidance
        1. Common deadlock types
          1. Self deadlock (relock)
          2. The ABBA deadlock
        2. Avoiding deadlock
    3. Using the pthread APIs for synchronization
      1. The mutex lock
        1. Seeing the race
        2. Mutex attributes
          1. Mutex types
          2. The robust mutex attribute
          3. IPC, threads, and the process-shared mutex
        3. Priority inversion, watchdogs, and Mars
          1. Priority inversion
          2. Watchdog timer in brief
          3. The Mars Pathfinder mission in brief
          4. Priority inheritance – avoiding priority inversion
          5. Summary of mutex attribute usage
        4. Mutex locking – additional variants
          1. Timing out on a mutex lock attempt
          2. Busy-waiting (non-blocking variant) for the lock
          3. The reader-writer mutex lock
          4. The spinlock variant
        5. A few more mutex usage guidelines
          1. Is the mutex locked?
      2. Condition variables
        1. No CV – the naive approach
        2. Using the condition variable
        3. A simple CV usage demo application
        4. CV broadcast wakeup
    4. Summary
  21. Multithreading with Pthreads Part III
    1. Thread safety
      1. Making code thread-safe
        1. Reentrant-safe versus thread-safe
        2. Summary table – approaches to making functions thread-safe
        3. Thread safety via mutex locks
        4. Thread safety via function refactoring
        5. The standard C library and thread safety
          1. List of APIs not required to be thread-safe
          2. Refactoring glibc APIs from foo to foo_r
          3. Some glibc foo and foo_r APIs
        6. Thread safety via TLS 
        7. Thread safety via TSD 
    2. Thread cancelation and cleanup
      1. Canceling a thread
        1. The thread cancelation framework
          1. The cancelability state
          2. The cancelability type
          3. Canceling a thread – a code example
      2. Cleaning up at thread exit
        1. Thread cleanup – code example
    3. Threads and signaling
      1. The issue
      2. The POSIX solution to handling signals on MT
      3. Code example – handling signals in an MT app
    4. Threads vs processes – look again
      1. The multiprocess vs the multithreading model – pros of the MT model
      2. The multiprocess vs the multithreading model – cons of the MT model
    5. Pthreads – a few random tips and FAQs
      1. Pthreads – some FAQs
      2. Debugging multithreaded (pthreads) applications with GDB
    6. Summary
  22. CPU Scheduling on Linux
    1. The Linux OS and the POSIX scheduling model
      1. The Linux process state machine
        1. The sleep states
      2. What is real time?
        1. Types of real time
      3. Scheduling policies
        1. Peeking at the scheduling policy and priority
        2. The nice value
        3. CPU affinity
    2. Exploiting Linux's soft real-time capabilities
      1. Scheduling policy and priority APIs
        1. Code example – setting a thread scheduling policy and priority
        2. Soft real-time – additional considerations
    3. RTL – Linux as an RTOS
    4. Summary
  23. Advanced File I/O
    1. I/O performance recommendations
      1. The kernel page cache
        1. Giving hints to the kernel on file I/O patterns
          1. Via the posix_fadvise(2) API
          2. Via the readahead(2) API
      2. MT app file I/O with the pread, pwrite APIs
      3. Scatter-gather I/O
        1. Discontiguous data file – traditional approach
        2. Discontiguous data file – the SG-I/O approach
        3. SG-I/O variations
      4. File I/O via memory mapping
        1. The Linux I/O code path in brief
        2. Memory mapping a file for I/O
          1. File and anonymous mappings
          2. The mmap advantage
          3. Code example
          4. Memory mapping – additional points
      5. DIO and AIO
        1. Direct I/O (DIO)
        2. Asynchronous I/O (AIO)
        3. I/O technologies – a quick comparison
      6. Multiplexing or async blocking I/O – a quick note
      7. I/O – miscellaneous
        1. Linux's inotify framework
        2. I/O schedulers
        3. Ensuring sufficient disk space
        4. Utilities for I/O monitoring, analysis, and bandwidth control
    2. Summary
  24. Troubleshooting and Best Practices
    1. Troubleshooting tools
      1. perf
      2. Tracing tools
      3. The Linux proc filesystem
    2. Best practices
      1. The empirical approach
      2. Software engineering wisdom in a nutshell
      3. Programming
        1. A programmer’s checklist – seven rules
        2. Better testing
        3. Using the Linux kernel's control groups
    3. Summary
  25. Other Books You May Enjoy
    1. Leave a review - let other readers know what you think