Chapter 1. Best Practices

We do not all have to write like Faulkner, or program like Dijkstra. I will gladly tell people what my programming style is, and I will even tell them where I think their own style is unclear or makes me jump through mental hoops.
But I do this as a fellow programmer, not as the Perl god … stylistic limits should be self-imposed, or at most policed by consensus among your buddies.

—Larry Wall, Natural Language Principles in Perl

Code matters. Analysis, design, decomposition, algorithms, data structures, and control flow mean nothing until they are made real, given form and power in the statements of some programming language. It is code that allows abstractions and ideas to control the physical world, that enables mathematical procedures to govern real-world processes, that converts data into information and information into knowledge.

Code matters. So the way in which you code matters too. Every programmer has a unique approach to writing software; a unique coding style. Programmers' styles are based on their earliest experiences in programming—the linguistic idiosyncrasies of their first languages, the way in which code was presented in their initial textbooks, and the stylistic prejudices of their early instructors. That style will develop and change as the programmer's experience and skills increase. Indeed, most programmers' style is really just a collection of coding habits that have evolved in response to the opportunities and pressures they have experienced throughout their careers.

Just as in natural evolution, those opportunities and pressures may lead to a coding style that is fit, strong, and well-adapted to the programmer's needs. Or they may lead to a coding style that is nasty, brutish, and underthought. But what they most often lead to is something even worse: Intuitive Programmer Syndrome.

Many programmers code by instinct. They aren't conscious of the hundreds of choices they make every time they code: how they format their source, the names they use for variables, the kinds of loops they use (while vs for vs map), whether to put that extra semicolon at the end of the block, whether to grep with a regex or a block, where and when to put comments, whether to use an object-oriented or procedural approach, how to explain their programs in their documentation, whether to return undef or throw an exception on failure, how to decompose the different components of a system into subroutines, how to bundle those subroutines into modules, how to interact with the program's user.

Developers are usually focused entirely on the problems they're solving, the solutions they're creating, and the algorithms they're implementing. So when it comes to choosing a variable name, they use the first one that comes to mind[1]; when it comes to using a loop they use the one they always use[2]; and when it comes to that trailing semicolon, well, sometimes they do and sometimes they don't. Just as the spirit moves them.

In The Importance of Being Earnest, Oscar Wilde captures the nature of the Intuitive Programmer perfectly:

Lady Bracknell:

Good afternoon, dear Algernon, I hope you are behaving very well.

Mr Moncrieff:

I'm feeling very well, Aunt Augusta.

Lady Bracknell:

That isn't the same thing at all.

In fact in my experience the two things rarely go together.

And so it is with many programmers. They write their code in the way that seems natural, that happens intuitively, and that feels good.

Unfortunately, if you're earnest about your profession, comfort isn't enough. "Behaving very well" may seem stuffy and conventional and uncreative and completely at odds with the whole outlaw hacker ethos, but it has one important advantage: it works. Good social manners help societies run smoothly; good programming manners help programs—and programming teams—do the same.

Rules, conventions, standards, and practices help programmers communicate and coordinate with one another. They provide a uniform and predictable framework for thinking about problems, and a common language for expressing solutions. This is especially critical in Perl, where the language itself is deliberately designed to offer many ways to accomplish the same task, and consequently supports many incompatible dialects in which to express any solution.

The goal of this book is to help you to develop a conscious programming style: to train yourself—and your team—to do things consistently in a way you've decided is correct, rather than in whatever way feels good at the time. Or, if you prefer your metaphors more Eastern than Edwardian: to help you move beyond the illusion of the sensual programming life, and become stylistically enlightened.

Three Goals

A good coding style is one that reduces the costs of your software project. There are three main ways in which a coding style can do that: by producing applications that are more robust, by supporting implementations that are more efficient, and by creating source code that is easier to maintain.

Robustness

When deciding how you will write code, choose a style that is likely to reduce the number of bugs in your programs. There are several ways that your coding style can do that:

  • A coding style can minimize the chance of introducing errors in the first place. For example, appending _ref to the name of every variable that stores a reference (see Chapter 3) makes it harder to accidentally write $array_ref[$n] instead of $array_ref->[$n], because anything except an arrow after _ref will soon come to look wrong.

  • A coding style can make it easy to detect incorrect edge cases, where bugs often hide. For example, constructing a regular expression from a table (see Chapter 12) can prevent that regex from ever matching a value that the table doesn't cover, or from failing to match a value that it does.

  • A coding style can help you avoid constructs that don't scale well. For example, avoiding a cascaded if-elsif-elsif-elsif-… in favour of table look-ups (see Chapter 6) can ensure that the cost of any selection statement stays nearly constant, rather than growing linearly with the number of alternatives.

  • A coding style can improve how code handles failure. For example, mandating a standard interface for I/O prompting (see Chapter 10) can encourage developers to habitually verify terminal input, rather than simply assuming it will always be correct.

  • A coding style can improve how code reports failure. For example, a rule that every failure must throw an exception, rather than returning an undef (see Chapter 13), will ensure that errors cannot be quietly ignored or accidentally propagated into unrelated code.

  • A coding style can improve the structure of code. For example, a prohibition against reusing code via cutting-and-pasting (see Chapter 17) can force developers to abstract program components into subroutines and then aggregate those subroutines into modules.
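Several of these practices can be seen working together in a few lines. The following is only a minimal sketch, not code taken from the later chapters: the %SHIPPING_COST_FOR table, the shipping_cost() subroutine, and the shipping methods themselves are hypothetical names invented for illustration. It combines a table look-up in place of a cascaded if-elsif, an exception in place of a returned undef, and a _ref suffix on a variable that stores a reference.

```perl
use strict;
use warnings;
use Carp qw( croak );

# Hypothetical look-up table: one hash access replaces a cascaded if-elsif chain
my %SHIPPING_COST_FOR = (
    standard  => 5,
    express   => 12,
    overnight => 25,
);

# Throw an exception on failure, rather than returning undef
sub shipping_cost {
    my ($method) = @_;

    croak "Unknown shipping method: $method"
        if !exists $SHIPPING_COST_FOR{$method};

    return $SHIPPING_COST_FOR{$method};
}

# The _ref suffix is a visual cue that an arrow has to follow the name
my $methods_ref = [ 'standard', 'overnight' ];
my $total = shipping_cost( $methods_ref->[0] )
          + shipping_cost( $methods_ref->[1] );

print "Total: $total\n";

# A failure can't be quietly ignored; it has to be caught explicitly
my $cost = eval { shipping_cost('teleport') };
print "Caught: $@" if !defined $cost;
```

Adding a new shipping method here means adding one line to the table, and the look-up cost should stay roughly the same however many entries the table holds.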

Efficiency

Of course, it doesn't matter how bug-free or error-tolerant your code is if it takes a week to predict tomorrow's weather, an hour to execute someone's stock trade, or even just one full second to deploy the airbags. Correctness is vital, but so is efficiency.

Efficient code doesn't have to be fragile, complex, or hard to maintain. Coding for efficiency is often simply a matter of working with Perl's strengths and avoiding its weaknesses. For example, reading an entire file of text (possibly gigabytes of it) into a variable just to change each occurrence of 'C#' to 'D-flat' is vastly slower than reading and changing the data line-by-line (see Chapter 10). On the other hand, when you do need to read an entire file into your program, then doing so line-by-line becomes woefully inefficient.
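The difference between the two approaches might look something like the sketch below. It is an illustration under stated assumptions rather than the technique as Chapter 10 presents it: the sample text and the temporary files (created with the core File::Temp module) are invented so that the example is self-contained.

```perl
use strict;
use warnings;
use File::Temp qw( tempfile );

# Create a small sample input file (a stand-in for real, much larger data)
my ($sample_fh, $in_filename) = tempfile( UNLINK => 1 );
print {$sample_fh} "Play C# then C#\n", "Hold G\n", "End on C#\n";
close $sample_fh or die "Can't close '$in_filename': $!";

# Line-by-line: only one line of the file is ever held in memory
my ($out_fh, $out_filename) = tempfile( UNLINK => 1 );
open my $in_fh, '<', $in_filename
    or die "Can't open '$in_filename': $!";
while ( my $line = <$in_fh> ) {
    $line =~ s/C[#]/D-flat/gxms;
    print {$out_fh} $line;
}
close $in_fh  or die "Can't close '$in_filename': $!";
close $out_fh or die "Can't close '$out_filename': $!";

# Slurping: when the whole file really is needed, read it in one operation
open my $slurp_fh, '<', $out_filename
    or die "Can't open '$out_filename': $!";
my $entire_text = do { local $/; <$slurp_fh> };
close $slurp_fh or die "Can't close '$out_filename': $!";

print $entire_text;
```

The '#' is placed inside a character class because, under the /x flag, an unescaped '#' in the pattern would begin a comment.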

Efficiency can be a particularly thorny goal, though. Changes in Perl's implementation from version to version, and platform-specific differences within the same version, can change the relative efficiency of particular constructs. So whenever you're choosing between two possible solutions on the basis of efficiency, it's critical to benchmark each candidate on the actual platform on which you'll be deploying code, using real data (see Chapter 19).
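One way to run such a comparison is with the core Benchmark module. The sketch below is not the procedure from Chapter 19: the two candidate subroutines, the @words data, and the iteration count are all hypothetical stand-ins. A meaningful benchmark would substitute the real candidates, real data, and the real deployment platform.

```perl
use strict;
use warnings;
use Benchmark qw( timethese cmpthese );

# Stand-in data; a real benchmark should use representative production data
my @words = map { "word$_" } 1 .. 1000;

# Candidate 1 (hypothetical): build the string with the built-in join
sub join_words {
    return join q{ }, @words;
}

# Candidate 2 (hypothetical): build the same string by repeated concatenation
sub concat_words {
    my $text = q{};
    for my $word (@words) {
        $text .= ( $text eq q{} ? q{} : q{ } ) . $word;
    }
    return $text;
}

# Check that the candidates agree before comparing their speed
die "Candidates produce different results\n"
    if join_words() ne concat_words();

# Time each candidate quietly, then print a comparison table
my $results_ref = timethese(
    2_000,
    {
        join_builtin => \&join_words,
        concat_loop  => \&concat_words,
    },
    'none',
);
cmpthese($results_ref);
```

The numbers in the resulting table will differ from machine to machine and from one Perl version to the next, which is exactly why the comparison needs to be run where the code will actually be deployed.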

Maintainability

You will typically spend at least four times longer maintaining code than you spent writing it[3]. So it makes sense to optimize your programming style for readability, not writability. Better yet, try to optimize for comprehensibility: easy-to-read and easy-to-understand aren't necessarily the same thing.

When you're developing a particular code suite over a long period of time, you eventually find yourself "in the zone". In that state, you seem to have the design and the control flow and the data structures and the naming conventions and the modular decomposition and every other aspect of the program constantly at your mental fingertips. You understand the code in a profound way. It's easy to "see" problems directly and locate bugs quickly, sometimes without even quite knowing how you knew. You truly grok the source.

Six months later, the code might just as well have been written by someone else[4]. You've moved on from it, forgotten the clever intricacies of the design, lost the implicit understanding of the control and data flows. And you have no idea why that critical variable was named $nxt_eTofF_trig, what it stores, what that value is used for, or how it might be implicated in this newly discovered bug.

By far the easiest way to fix that bug is to get yourself back into the zone: to recover the detailed mental model you had when you first wrote it. That means that to build software that's easy to maintain, you need to build software that's easy to re-grok. And to do that, you need to preserve as much of your mental model of the code as you can, in some medium more permanent and reliable than mere neurons. You need to encode your understanding in your documentation and, if possible, in the source itself.

Having a consistent and coherent approach to coding can help. Consistent coding habits allow you to carry part of your mental model through every project, and to stay at least partially in the same mindset every time you write code. Having an entire team with consistent coding habits extends those benefits much further, making it easier for someone else to reconstruct your intentions and your understanding, because your code looks and works the same as theirs.

This Book

To help you develop that consistent and coherent approach, the following 18 chapters explore a coordinated set of coding practices that have been specifically designed to enhance the robustness, efficiency, and maintainability of Perl code.

Each piece of advice is framed as a single imperative sentence—a "Thou shalt …" or a "Thou shalt not …", presented like this:

Always code as if the guy who ends up maintaining your code will be a violent psychopath who knows where you live.

Each such admonition is followed by a detailed explanation of the rule, explaining how and when it applies. Every recommendation also includes a summary of the reasoning behind the prescription or proscription, usually in terms of how it can improve the reliability, performance, or comprehensibility of your code.

Almost every guideline also includes at least one example of code that conforms to the rule (set in constant-width bold) as well as counterexamples that break it (set in constant-width regular). These code fragments aim to demonstrate the advantages of following the suggested practice, and the problems that can occur if you don't. All of these examples are also available for you to download and reuse from http://www.oreilly.com/catalog/perlbp.

The guidelines are organized by topic, not by significance. For example, some readers will wonder why use strict and use warnings aren't mentioned on page 1. But if you've already seen the light on those two, they don't need to be on page 1. And if you haven't seen the light yet, Chapter 18 is soon enough. By then you'll have discovered several hundred ways in which code can go horribly wrong, and will be better able to appreciate these two ways in which Perl can help your code go right.

Other readers may object to "trivial" code layout recommendations appearing so early in the book. But if you've ever had to write code as part of a group, you'll know that layout is where most of the arguments start. Code layout is the medium in which all other coding practices are practised, so the sooner everyone can admit that code layout is trivial, set aside their "religious" convictions, and agree on a coherent coding style, the sooner your team can start getting useful work done.

As you consider these pieces of advice, think about each of them in the context of the type of coding you typically do. Question your current practice in the particular area being discussed, and compare it against the recommended approach. Evaluate the robustness, efficiency, and maintainability of your current coding habits and consider whether a change is justified.

But remember that each piece of advice is a guideline: a fellow programmer gladly telling you what his programming style is, and where he thinks other styles are unclear or make him jump through mental hoops. Whether or not you agree with all of them doesn't matter. What matters is that you become aware of the coding issues these guidelines address, think through the arguments made in their favour, assess the benefits and costs of changing your current practices, and then consciously decide whether to adopt the solutions offered here.

Then consider whether they will work for everyone else on your project as well. Coding is (usually) a collaborative effort; developing and adopting a team coding style is too. Mainly because a team coding standard will stay adopted only if every member of your team is willing to sign off on it, support it, use it, and encourage other team members to follow it as well.

Use this book as a starting point for your discussions. Negotiate a style that suits you all. Perhaps everyone will eventually agree that—although their personal style is self-evidently superior to anything else imaginable—they are nevertheless graciously willing to abide by the style suggested here as a reasonable compromise. Or perhaps someone will point out that particular recommendations just aren't appropriate for your circumstances, and suggest something that would work better.

Be careful, though. It's amazing how many arguments about coding practice are actually just rationalizations: carefully constructed excuses that ultimately boil down to either "It's not my present habit!" or "It's just too much effort to change!" Not changing your current practices can be a valid choice, but not for either of those reasons.

Keep in mind that the goal of any coding style is to reduce your development costs by increasing the maintainability, robustness, and efficiency of your code. Be wary of any argument—either for or against change—that doesn't directly address at least one of those issues.

Rehabiting

People cling to their current coding habits even when those habits are manifestly making their code buggy, slow, and incomprehensible to others. They cling to those habits because it's easier to live with their deficiencies than it is to fix them. Not thinking about how you code requires no effort. That's the whole point of a habit. It's a skill that has been compiled down from a cerebral process and then burnt into muscle memory; a microcoded reflex that your fingers can perform without your conscious control.

For example, if you're an aficionado of the BSD style of bracketing (see Chapter 2), then it's likely that your fingers can type Closingparen-Return-Openingcurly-Return-Tab without your ever needing to think about it—which makes it especially hard if your development team decides to adopt K&R bracketing instead, because now you have to type Closingparen-Return-Openingcurly-Return-dammit!-Backspace-Backspace-Backspace-Space-Openingcurly-Return-Tab for a couple of months until your fingers learn the new sequence.

Likewise, if you're used to writing Perl like this:

@tcmd= grep /^.*;$/ => @cmd;

then abiding by the guidelines in this book and writing this instead:

@terminated_commands
   = grep { m/ \A [^\n]* ; \n? \z /xms } @raw_commands;

will be deeply onerous. At least, it will be at first, until you break your existing habits and develop new ones.
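The longer version is not doing anything different; it is only spelling out what the shorter one leaves implicit. As a quick check, the sketch below runs both statements over a few made-up command strings (the contents of @raw_commands are hypothetical test data) and compares what each one selects.

```perl
use strict;
use warnings;

# Hypothetical test data, chosen to probe the edge cases of both patterns
my @raw_commands = (
    "print;",            # terminated, no trailing newline
    "print;\n",          # terminated, with a trailing newline
    "no terminator",     # no semicolon at all
    "two;\nlines;",      # semicolons, but an embedded newline
    ";\n\n",             # semicolon followed by two newlines
);

# The habitual version: terse, with its anchoring behaviour left implicit
my @old_style = grep /^.*;$/ => @raw_commands;

# The version that follows the guidelines: every assumption is made explicit
my @new_style = grep { m/ \A [^\n]* ; \n? \z /xms } @raw_commands;

print scalar(@old_style), " selected by the old style\n";
print scalar(@new_style), " selected by the new style\n";
print "Same selections\n" if "@old_style" eq "@new_style";
```

In the second pattern, \A and \z say explicitly where the match has to start and end, [^\n]* states that the command can't span lines, and \n? admits the single optional trailing newline that the $ in the first pattern quietly allows.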

But that's the great thing about programming habits: they're incredibly easy to change. All you have to do is consciously practise things the new way for long enough, and eventually your coding habits will automatically re-formulate themselves around that new behaviour.

So, if you decide to adopt the recommendations in the following chapters, try to adopt them zealously. See how often you can catch yourself (or others in your team) breaking the new rules. Stop letting your fingers do the programming. Recorrect each old habit the instant you notice yourself backsliding. Be strict with your hands. Rather than letting them type what feels good, force them to type what works well.

Soon enough you'll find yourself typing Closingparen-Space-Openingcurly-Return-Tab, and g-r-e-p-Space-Openingcurly-Space, and Closingslash-x-m-s, all without even thinking about it. At which point, having reprogrammed your intuitions correctly, you will once again be able to program "correctly" … by intuition.



[1] Often whatever is short, vaguely relevant, and easy to spell: $value, @data, $next, %tmp, $obj, $key, @nums, %opt, $arg, $foo, $in, %to, $fh, $x, $y, @q, and so on.

[2] The three-part C-style for loop: "It's so flexible! What more do you need?"

[3] The observation that maintenance costs tend to outweigh initial development costs about 4-to-1 is often referred to as Boehm's Law. The predominance of maintenance over development has been repeatedly observed in real-world studies over the past three decades, though the actual cost ratio varies from about 2-to-1 to well over 10-to-1.

[4] That's Eagleson's Law. Other experts bitterly assert that the critical interval is closer to three weeks.
