Chapter 1. Systems Programmers Can Have Nice Things
In certain contexts, for example the context Rust is targeting, being 10x or even 2x faster than the competition is a make-or-break thing. It decides the fate of a system in the market, as much as it would in the hardware market.
All computers are now parallel...
Parallel programming is programming.
Michael McCool et al., Structured Parallel Programming
TrueType parser flaw used by nation-state attacker for surveillance; all software is security-sensitive.
We chose to open our book with the three quotes above for a reason. But let's start with a mystery. What does the following C program do?
int main(int argc, char **argv) {
    unsigned long a[1];
    a[3] = 0x7ffff7b36cebUL;
    return 0;
}
On Jim's laptop this morning, this program printed:
undef: Error: .netrc file is readable by others.
undef: Remove password or make file unreadable by others.
Then it crashed. If you try it on your machine, it may do something else. What's going on here?
The program is flawed. The array a is only one element long, so using a[3] is, according to the C programming language standard, undefined behavior:
Behavior, upon use of a nonportable or erroneous program construct or of erroneous data, for which this International Standard imposes no requirements
Undefined behavior doesn't just have an unpredictable result: the standard explicitly permits the program to do anything at all. In our case, storing this particular value in the fourth element of this particular array happens to corrupt the function call stack such that returning from the main function, instead of exiting the program gracefully as it should, jumps into the midst of code from the standard C library for retrieving a password from a file in the user's home directory. It doesn't go well.
C and C++ have hundreds of rules for avoiding undefined behavior. They're mostly common sense: don't access memory you shouldn't, don't let arithmetic operations overflow, don't divide by zero, and so on. But the compiler does not enforce these rules; it has no obligation to detect even blatant violations. Indeed, the preceding program compiles without errors or warnings. The responsibility for avoiding undefined behavior falls entirely on you, the programmer.
Empirically speaking, we programmers do not have a great track record in this regard. While a student at the University of Utah, researcher Peng Li modified C and C++ compilers to make the programs they translated report whether they executed certain forms of undefined behavior. He found that nearly all programs do, including those from well-respected projects that hold their code to high standards. Assuming that you can avoid undefined behavior in C and C++ is like assuming you can win a game of chess simply because you know the rules.
The occasional strange message or crash may be a quality issue, but inadvertent undefined behavior has also been a major cause of security flaws since the 1988 Morris Worm used a variation of the technique shown earlier to propagate from one computer to another on the early Internet.
So C and C++ put programmers in an awkward position: those languages are the industry standards for systems programming, but the demands they place on programmers all but guarantee a steady stream of crashes and security problems. Answering our mystery just raises a bigger question: can't we do any better?
Rust Shoulders the Load for You
Our answer is framed by our three opening quotes. The third quote refers to reports that Stuxnet, a computer worm found breaking into industrial control equipment in 2010, gained control of the victims' computers using, among many other techniques, undefined behavior in code that parsed TrueType fonts embedded in word processing documents. It's a safe bet that the authors of that code were not expecting it to be used this way, illustrating that it's not just operating systems and servers that need to worry about security: any software that might handle data from an untrusted source could be the target of an exploit.
The Rust language makes you a simple promise: if your program passes the compiler's checks, it is free of undefined behavior. Dangling pointers, double-frees, and null pointer dereferences are all caught at compile time. Array references are secured with a mix of compile-time and run-time checks, so there are no buffer overruns: the Rust equivalent of our unfortunate C program exits safely with an error message.
Further, Rust aims to be both safe and pleasant to use. In order to make stronger guarantees about your program's behavior, Rust imposes more restrictions on your code than C and C++ do, and these restrictions take practice and experience to get used to. But the language overall is flexible and expressive. This is attested to by the breadth of code written in Rust and the range of application areas to which it is being applied.
In our experience, being able to trust the language to catch more mistakes encourages us to try more ambitious projects. Modifying large, complex programs is less risky when you know that issues of memory management and pointer validity are taken care of. And debugging is much simpler when the potential consequences of a bug don't include corrupting unrelated parts of your program.
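To make the array-bounds claim concrete, here is a minimal sketch of what a Rust counterpart to the C program might look like. It is our own illustration, not a listing from a later chapter: the helper names store_checked and plain_store_panics are hypothetical, and we pass the index in as a run-time value because current compilers typically reject a literal out-of-range index into a fixed-size array before the program ever runs.

```rust
use std::panic;

/// Hypothetical helper: store `value` at `index`, returning an error
/// instead of writing outside the slice.
fn store_checked(a: &mut [u64], index: usize, value: u64) -> Result<(), String> {
    let len = a.len();
    match a.get_mut(index) {
        Some(slot) => {
            *slot = value;
            Ok(())
        }
        None => Err(format!("index {} out of bounds for length {}", index, len)),
    }
}

/// Hypothetical helper: reports whether plain indexing with `index`
/// panics on a one-element array. The panic is caught so we can observe it.
fn plain_store_panics(index: usize) -> bool {
    panic::catch_unwind(|| {
        let mut a = [0u64; 1];
        a[index] = 0x7fff_f7b3_6ceb; // bounds-checked at run time
        a[0]
    })
    .is_err()
}

fn main() {
    let mut a = [0u64; 1];
    // Out of range: reported as an error, and the array is left untouched.
    println!("{:?}", store_checked(&mut a, 3, 0x7fff_f7b3_6ceb));
    // In range: the store succeeds.
    println!("{:?}", store_checked(&mut a, 0, 1));
    // Plain indexing out of range panics rather than corrupting the stack.
    println!("a[3] panics: {}", plain_store_panics(3));
}
```

Either way, the out-of-range store never reaches memory outside the array: the checked accessor returns an error you can handle, and plain indexing stops the program with a panic message.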
Of course, there are still plenty of bugs that Rust cannot detect. But in practice, taking undefined behavior off the table substantially changes the character of development for the better.
Parallel Programming Is Tamed
Concurrency is notoriously difficult to use correctly in C and C++. Developers usually turn to concurrency only when single-threaded code has proven unable to achieve the performance they need. But the second opening quote argues that parallelism is too important to modern machines to treat as a method of last resort.
As it turns out, the same restrictions that ensure memory safety in Rust also ensure that Rust programs are free of data races. You can share data freely between threads, as long as it isn't changing. Data that does change can only be accessed using synchronization primitives. All the traditional concurrency tools are available: mutexes, condition variables, channels, atomics, and so on. Rust simply checks that you're using them properly.
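As a rough illustration of these two rules, the sketch below shares a read-only vector among several threads through Arc and funnels the one piece of changing data, a running total, through a Mutex. The function name parallel_sum and the chunking scheme are our own assumptions for the example; only the standard library's thread, Arc, and Mutex are used.

```rust
use std::sync::{Arc, Mutex};
use std::thread;

/// Hypothetical example: sum `data` using `workers` threads.
/// The data is shared read-only; the total is guarded by a mutex.
fn parallel_sum(data: Vec<u64>, workers: usize) -> u64 {
    let workers = workers.max(1);
    let data = Arc::new(data); // unchanging data: shared freely
    let total = Arc::new(Mutex::new(0u64)); // changing data: needs a lock
    let chunk = ((data.len() + workers - 1) / workers).max(1);

    let handles: Vec<_> = (0..workers)
        .map(|w| {
            let data = Arc::clone(&data);
            let total = Arc::clone(&total);
            thread::spawn(move || {
                // Each worker reads its own slice of the shared data.
                let start = (w * chunk).min(data.len());
                let end = (start + chunk).min(data.len());
                let partial: u64 = data[start..end].iter().sum();
                // The only write goes through the mutex.
                *total.lock().unwrap() += partial;
            })
        })
        .collect();

    for handle in handles {
        handle.join().unwrap();
    }
    let result = *total.lock().unwrap();
    result
}

fn main() {
    let data: Vec<u64> = (1..=100).collect();
    println!("sum = {}", parallel_sum(data, 4));
}
```

If you try to update the total from the worker threads without the Mutex, for example by capturing a plain mutable integer, the program should fail to compile rather than race at run time.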
This makes Rust an excellent language for exploiting the abilities of modern multi-core machines. The Rust ecosystem offers libraries that go beyond the usual concurrency primitives and help you distribute complex loads evenly across pools of processors, use lock-free synchronization mechanisms like Read-Copy-Update, and more.
And Yet Rust Is Still Fast
This, finally, is our first opening quote. Rust shares the ambitions Bjarne Stroustrup articulates for C++ in his paper "Abstraction and the C++ Machine Model":
In general, C++ implementations obey the zero-overhead principle: What you don't use, you don't pay for. And further: What you do use, you couldn't hand code any better.
Systems programming is often concerned with pushing the machine to its limits. For video games, the entire machine should be devoted to creating the best experience for the player. For web browsers, the efficiency of the browser sets the ceiling on what content authors can do. Within the machine's inherent limitations, as much memory and processor attention as possible must be left to the content itself. The same principle applies to operating systems: the kernel should make the machine's resources available to user programs, not consume them itself.
But when we say Rust is "fast," what does that really mean? One can write slow code in any general-purpose language. It would be more precise to say that, if you are ready to make the investment to design your program to make the best use of the underlying machine's capabilities, Rust supports you in that effort. The language is designed with efficient defaults and gives you the ability to control how memory gets used and how the processor's attention is spent.
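One small way to see the zero-overhead ambition is to write the same computation twice: once with iterator adapters and once as a hand-written loop. The two functions below are hypothetical examples of our own. They are guaranteed to produce the same result; whether an optimizing build turns them into comparable machine code is an expectation you would want to confirm with a benchmark, not something this sketch proves.

```rust
/// Sum of the squares of the even elements, written with iterator adapters.
fn sum_even_squares_iter(xs: &[u32]) -> u64 {
    xs.iter()
        .filter(|&&x| x % 2 == 0)
        .map(|&x| u64::from(x) * u64::from(x))
        .sum()
}

/// The same computation as an explicit index loop.
fn sum_even_squares_loop(xs: &[u32]) -> u64 {
    let mut total = 0u64;
    let mut i = 0;
    while i < xs.len() {
        let x = xs[i];
        if x % 2 == 0 {
            total += u64::from(x) * u64::from(x);
        }
        i += 1;
    }
    total
}

fn main() {
    let xs = [1, 2, 3, 4];
    println!("iterators: {}", sum_even_squares_iter(&xs));
    println!("loop:      {}", sum_even_squares_loop(&xs));
}
```

The iterator version names no intermediate collection and allocates nothing, which is the kind of efficient default the paragraph above has in mind.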
Rust Makes Collaboration Easier
We hid a fourth quote in the title of this chapter: "Systems programmers can have nice things." This refers to Rust's support for code sharing and reuse.
Rust's package manager and build tool, Cargo, makes it easy to use libraries published by others on Rust's public package repository, the crates.io website. You simply add the library's name and required version number to a file, and Cargo takes care of downloading the library, together with whatever other libraries it uses in turn, and linking the whole lot together. You can think of Cargo as Rust's answer to NPM or RubyGems, with an emphasis on sound version management and reproducible builds. There are popular Rust libraries providing everything from off-the-shelf serialization to HTTP clients and servers and modern graphics APIs.
Going further, the language itself is also designed to support collaboration: Rust's traits and generics let you create libraries with flexible interfaces so that they can serve in many different contexts. And Rust's standard library provides a core set of fundamental types that establish shared conventions for common cases, making different libraries easier to use together.
The next chapter aims to make concrete the broad claims we've made in this chapter, with a tour of several small Rust programs that show off the language's strengths.
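The file in question is the package's Cargo.toml manifest. As a sketch, a manifest that depends on one library might look like the following; the package name hello-nice-things is a placeholder we made up, and serde stands in here as one example of a serialization library published on crates.io.

```toml
[package]
name = "hello-nice-things"   # hypothetical package name
version = "0.1.0"
edition = "2021"

[dependencies]
# Library name on the left, required version on the right.
serde = "1.0"
```

With an entry like this in place, building the package should prompt Cargo to fetch the library and anything it depends on in turn.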
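To give a flavor of what a flexible interface means here, the sketch below defines a small trait and two generic functions. All of the names (Area, Rect, Square, total_area, largest) are hypothetical and exist only for this illustration; PartialOrd, Copy, and Option come from the standard library.

```rust
/// Hypothetical trait: anything that can report its area.
trait Area {
    fn area(&self) -> f64;
}

struct Rect {
    w: f64,
    h: f64,
}

struct Square {
    side: f64,
}

impl Area for Rect {
    fn area(&self) -> f64 {
        self.w * self.h
    }
}

impl Area for Square {
    fn area(&self) -> f64 {
        self.side * self.side
    }
}

/// Works for a slice of any type that implements `Area`,
/// including types defined in someone else's library.
fn total_area<T: Area>(shapes: &[T]) -> f64 {
    shapes.iter().map(|s| s.area()).sum()
}

/// Bounded by standard-library traits, and returning the standard
/// `Option` type so callers handle the empty case in the usual way.
fn largest<T: PartialOrd + Copy>(items: &[T]) -> Option<T> {
    let mut iter = items.iter().copied();
    let first = iter.next()?;
    Some(iter.fold(first, |best, x| if x > best { x } else { best }))
}

fn main() {
    let rects = [Rect { w: 2.0, h: 3.0 }, Rect { w: 1.0, h: 1.0 }];
    println!("rect area:   {}", total_area(&rects));
    println!("square area: {}", total_area(&[Square { side: 2.0 }]));
    println!("largest int:  {:?}", largest(&[3, 9, 2]));
    println!("largest char: {:?}", largest(&['a', 'z', 'm']));
}
```

Because largest returns an Option, code from unrelated libraries can consume its result without agreeing on a custom "no value" convention first.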