Chapter 1. Systems Programmers Can Have Nice Things

In certain contexts—for example the context Rust is targeting—being 10x or even 2x faster than the competition is a make-or-break thing. It decides the fate of a system in the market, as much as it would in the hardware market.

Graydon Hoare

All computers are now parallel...
Parallel programming is programming.

Michael McCool et al., Structured Parallel Programming

TrueType parser flaw used by nation-state attacker for surveillance; all software is security-sensitive.

Andy Wingo

We chose to open our book with the three quotes above for a reason. But let’s start with a mystery. What does the following C program do?

int main(int argc, char **argv) {
  unsigned long a[1];
  a[3] = 0x7ffff7b36cebUL;
  return 0;
}

On Jim’s laptop this morning, this program printed:

undef: Error: .netrc file is readable by others.
undef: Remove password or make file unreadable by others.

Then it crashed. If you try it on your machine, it may do something else. What’s going on here?

The program is flawed. The array a is only one element long, so using a[3] is, according to the C programming language standard, undefined behavior:

Behavior, upon use of a nonportable or erroneous program construct or of erroneous data, for which this International Standard imposes no requirements

Undefined behavior doesn’t just have an unpredictable result: the standard explicitly permits the program to do anything at all. In our case, storing this particular value in the fourth element of this particular array happens to corrupt the function call stack such that returning from the main function, instead of exiting the program gracefully as it should, jumps into the midst of code from the standard C library for retrieving a password from a file in the user’s home directory. It doesn’t go well.

C and C++ have hundreds of rules for avoiding undefined behavior. They’re mostly common sense: don’t access memory you shouldn’t, don’t let arithmetic operations overflow, don’t divide by zero, and so on. But the compiler does not enforce these rules; it has no obligation to detect even blatant violations. Indeed, the preceding program compiles without errors or warnings. The responsibility for avoiding undefined behavior falls entirely on you, the programmer.

Empirically speaking, we programmers do not have a great track record in this regard. While a student at the University of Utah, researcher Peng Li modified C and C++ compilers to make the programs they translated report whether they executed certain forms of undefined behavior. He found that nearly all programs do, including those from well-respected projects that hold their code to high standards. Assuming that you can avoid undefined behavior in C and C++ is like assuming you can win a game of chess simply because you know the rules.

The occasional strange message or crash may be a quality issue, but inadvertent undefined behavior has also been a major cause of security flaws since the 1988 Morris Worm used a variation of the technique shown earlier to propagate from one computer to another on the early Internet.

So C and C++ put programmers in an awkward position: those languages are the industry standards for systems programming, but the demands they place on programmers all but guarantee a steady stream of crashes and security problems. Answering our mystery just raises a bigger question: can’t we do any better?

Rust Shoulders the Load for You

Our answer is framed by our three opening quotes. The third quote refers to reports that Stuxnet, a computer worm found breaking into industrial control equipment in 2010, gained control of the victims’ computers using, among many other techniques, undefined behavior in code that parsed TrueType fonts embedded in word processing documents. It’s a safe bet that the authors of that code were not expecting it to be used this way, illustrating that it’s not just operating systems and servers that need to worry about security: any software that might handle data from an untrusted source could be the target of an exploit.

The Rust language makes you a simple promise: if your program passes the compiler’s checks, it is free of undefined behavior. Dangling pointers, double-frees, and null pointer dereferences are all caught at compile time. Array references are secured with a mix of compile-time and run-time checks, so there are no buffer overruns: the Rust equivalent of our unfortunate C program exits safely with an error message.

Further, Rust aims to be both safe and pleasant to use. In order to make stronger guarantees about your program’s behavior, Rust imposes more restrictions on your code than C and C++ do, and these restrictions take practice and experience to get used to. But the language overall is flexible and expressive. This is attested to by the breadth of code written in Rust and the range of application areas to which it is being applied.

In our experience, being able to trust the language to catch more mistakes encourages us to try more ambitious projects. Modifying large, complex programs is less risky when you know that issues of memory management and pointer validity are taken care of. And debugging is much simpler when the potential consequences of a bug don’t include corrupting unrelated parts of your program.

Of course, there are still plenty of bugs that Rust cannot detect. But in practice, taking undefined behavior off the table substantially changes the character of development for the better.

Parallel Programming Is Tamed

Concurrency is notoriously difficult to use correctly in C and C++. Developers usually turn to concurrency only when single-threaded code has proven unable to achieve the performance they need. But the second opening quote argues that parallelism is too important to modern machines to treat as a method of last resort.

As it turns out, the same restrictions that ensure memory safety in Rust also ensure that Rust programs are free of data races. You can share data freely between threads, as long as it isn’t changing. Data that does change can only be accessed using synchronization primitives. All the traditional concurrency tools are available: mutexes, condition variables, channels, atomics, and so on. Rust simply checks that you’re using them properly.

This makes Rust an excellent language for exploiting the abilities of modern multi-core machines. The Rust ecosystem offers libraries that go beyond the usual concurrency primitives and help you distribute complex loads evenly across pools of processors, use lock-free synchronization mechanisms like Read-Copy-Update, and more.

And Yet Rust Is Still Fast

This, finally, is our first opening quote. Rust shares the ambitions Bjarne Stroustrup articulates for C‍++ in his paper “Abstraction and the C++ Machine Model”:

In general, C++ implementations obey the zero-overhead principle: What you don’t use, you don’t pay for. And further: What you do use, you couldn’t hand code any better.

Systems programming is often concerned with pushing the machine to its limits. For video games, the entire machine should be devoted to creating the best experience for the player. For web browsers, the efficiency of the browser sets the ceiling on what content authors can do. Within the machine’s inherent limitations, as much memory and processor attention as possible must be left to the content itself. The same principle applies to operating systems: the kernel should make the machine’s resources available to user programs, not consume them itself.

But when we say Rust is “fast,” what does that really mean? One can write slow code in any general-purpose language. It would be more precise to say that, if you are ready to make the investment to design your program to make the best use of the underlying machine’s capabilities, Rust supports you in that effort. The language is designed with efficient defaults and gives you the ability to control how memory gets used and how the processor’s attention is spent.

Rust Makes Collaboration Easier

We hid a fourth quote in the title of this chapter: “Systems programmers can have nice things.” This refers to Rust’s support for code sharing and reuse.

Rust’s package manager and build tool, Cargo, makes it easy to use libraries published by others on Rust’s public package repository, the crates.io website. You simply add the library’s name and required version number to a file, and Cargo takes care of downloading the library, together with whatever other libraries it uses in turn, and linking the whole lot together. You can think of Cargo as Rust’s answer to NPM or RubyGems, with an emphasis on sound version management and reproducible builds. There are popular Rust libraries providing everything from off-the-shelf serialization to HTTP clients and servers and modern graphics APIs.

Going further, the language itself is also designed to support collaboration: Rust’s traits and generics let you create libraries with flexible interfaces so that they can serve in many different contexts. And Rust’s standard library provides a core set of fundamental types that establish shared conventions for common cases, making different libraries easier to use together.

The next chapter aims to make concrete the broad claims we’ve made in this chapter, with a tour of several small Rust programs that show off the language’s strengths.

Get Programming Rust, 2nd Edition now with the O’Reilly learning platform.

O’Reilly members experience books, live events, courses curated by job role, and more from O’Reilly and nearly 200 top publishers.