Refactoring is not changing code.
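Okay, yes, it is, but there’s more to it. Refactoring is a type of changing code, but it has one major constraint that makes “changing code” an imprecise way to describe it: you don’t change the behavior of the code. Two immediate questions should come to mind: how do you guarantee that the behavior does not change, and what is the point of changing the code if the behavior stays the same?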
The shorter answer, for us, is using tests and version control.
Another approach, supported by William Opdyke, whose thesis is the foundational work on refactoring, stresses using automated tools that are responsible for changing the code as well as guaranteeing safety before doing so. Professional coders might find that removing the human element limits the types of changes that can be made, as the number of changes that can be guaranteed as “safe” is confined to the functionality of the tools.
Fowler’s approach pulls away from automation, while at the same time stressing the “mechanics” of the refactoring: steps of altering code that minimize unsafe states.
If we relied on an “Opdykian,” automated approach for this book, the tooling would hold us back significantly. And we’re straying from Fowler’s emphasis on mechanics (step-by-step processes) as well. The reason is that when a given refactoring is backed up by tests, verifying the success of our changes should be straightforward. And when we fail to execute a refactoring properly, version control (we’ll be using Git) should give us an easy way to simply “roll back” to the state of the code beforehand.
Any form of “changing code” carries significant risk to your codebase if you cannot easily revert it to a previous, safe version. If you don’t have versioned backups of the codebase that you plan on refactoring, put this book down and don’t pick it up again until you have your code under version control.
Admittedly, the approach of this book might seem reactive and cavalier in comparison to the earlier paths of automation and mechanics. However, the process employed in this book is the “red” (failure state of a test), “green” (passing state of a test), “refactor” cycle, with an eye on rolling back quickly if things go wrong. It is based upon how quality-focused teams operate with tools that are popular among them. Perhaps later, automated refactoring will catch up with Fowler’s extensive catalog of refactorings, as well as all that are presented in this book, but I wouldn’t count on it happening soon.
We could instead write a function that accomplishes the same goal in a slightly different way:
And either of these will work fine for many applications. Any tests that we used for the byTwo function would basically just be a mapping between an input number, and an output number that is twice the value. But most of the time, we are more interested in the results, rather than whether the << operator is used. We can think of this as an implementation detail. Although you could think of implementation details like this as behavior, it is behavior that is insignificant if all we care about is the input and output of a function.
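For reference, the two versions of byTwo discussed here can be sketched as follows. This is a reconstruction from the surrounding description (multiplication versus the << operator), not a listing taken from the text, and the distinct names byTwoMultiply and byTwoShift are assumptions used only so that both versions can live in one file.

```javascript
// Version 1 (assumed name): doubles by multiplying.
function byTwoMultiply(number) {
  return number * 2;
}

// Version 2 (assumed name): doubles by shifting the bits left by one place.
function byTwoShift(number) {
  return number << 1;
}

// For small integers, the two are indistinguishable from the outside.
console.log(byTwoMultiply(21)); // 42
console.log(byTwoShift(21)); // 42
```

From the caller's point of view, only the mapping from input to output is visible, which is why the choice of operator counts as an implementation detail.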
If we happened to use the second version of byTwo for some reason, we might find that it breaks when our number argument gets too large (try it with a trillion: 1000000000000 << 1). Does this mean we suddenly care about this implementation detail?
No. We care that our output is broken. This means that our test suite needs to include more cases than we initially thought. And we can happily swap this implementation out for one that satisfies all of our test cases; whether that is return number * 2 or return number + number is not our main concern.
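One way to picture this is a small table of input and output pairs that any candidate implementation must satisfy. The following is a minimal sketch: the cases array, the failingCases helper, and the three implementation names are all assumptions made for illustration, and a real project would more likely use a test framework.

```javascript
// Hypothetical test cases: each pair is [input, expected output].
const cases = [
  [0, 0],
  [2, 4],
  [-3, -6],
  [1000000000000, 2000000000000], // the large input added after the breakage
];

// Returns the cases for which an implementation gives the wrong output.
function failingCases(implementation) {
  return cases.filter(([input, expected]) => implementation(input) !== expected);
}

// Three candidate implementations (names are illustrative).
const byShift = (number) => number << 1;
const byMultiplication = (number) => number * 2;
const byAddition = (number) => number + number;

console.log(failingCases(byShift).length); // 1 (only the trillion case fails)
console.log(failingCases(byMultiplication).length); // 0
console.log(failingCases(byAddition).length); // 0
```

The shift version fails only on the large input because << operates on 32-bit integers. The other two pass every case, so either one is an acceptable replacement as far as these tests are concerned.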
We changed the implementation details, but doubling our number is the behavior we care about. What we care about is also what we test (either manually or in an automated way). Testing the specifics is not only unnecessary in many cases, but it also will result in a codebase that we can’t refactor as freely.
The degree to which we specify and test our code is literally the effort that demonstrates our care for its behavior. Not having a test, a manual procedure for execution, or at least a description of how it should work means that the code is basically unverifiable.
Let’s assume that the following code has no supporting tests, documentation, or business processes that are described through it:
Do we care if the behavior changes? Actually, yes! This function could be holding a lot of things together. Just because we can’t understand it doesn’t make it less important. However, it does make it much more dangerous.
But in the context of refactoring, we don’t care if this behavior changes yet, because we won’t be refactoring it. When we have any code that lacks tests (or at least a documented way of executing it), we do not want to change this code. We can’t refactor it, because we won’t be able to verify that the behavior doesn’t change. Later in the book, we’ll cover creating “characterization tests” to deal with untested code.
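As a rough preview of that idea, a characterization test records what the code does today and then guards against accidental change. The sketch below makes several assumptions: legacyLabel is an invented stand-in for some untested function, and the expected values were obtained by running it and copying the results, not from any specification.

```javascript
// Hypothetical untested function whose intent we do not fully understand.
function legacyLabel(value) {
  return String(value).trim().toUpperCase().slice(0, 3);
}

// Observed behavior, copied from actually running the function.
// Each pair is [input, output seen today].
const characterization = [
  [" abcdef ", "ABC"],
  [42, "42"],
  ["", ""],
  [null, "NUL"], // surprising, but it is what the code currently does
];

// True when a function reproduces every recorded output.
function matchesRecordedBehavior(fn) {
  return characterization.every(([input, seen]) => fn(input) === seen);
}

console.log(matchesRecordedBehavior(legacyLabel)); // true
```

Once a check like this exists, the function is no longer unverifiable, and changing its internals becomes something we can attempt with a safety net.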
This situation isn’t confined to legacy code either. It’s also impossible to refactor new code without tests, whether they are automated or manual.
As far as refactoring goes, we don’t initially care about performance. As with our doubling function a couple of sections back, we care about our inputs delivering expected outputs. Most of the time, we can lean on the tools given to us and our first guess at an implementation will be good enough. The mantra here is to “write for humans first.”
Good enough for a first implementation means that we’re able to, in a reasonable amount of time, determine that the inputs yield expected outputs. If our implementation does not allow for that because it takes too long, then we need to change the implementation. But by that time, we should have tests in place to verify inputs and outputs. When we have tests in place, then we have enough confidence to refactor our code and change the implementation. If we don’t have those tests in place, we are putting behavior we really care about (the inputs and outputs) at risk.
Although performance testing is part of what is called nonfunctional testing, and generally not the focus of refactoring, we can prioritize performance characteristics (and other “nonfunctional” aspects of the code like usability) by making them falsifiable, just like the program’s correctness. In other words, we can test for performance.
Performance is privileged among nonfunctional aspects of a codebase in that it is relatively easy to elevate to a similar standard of correctness. By “benchmarking” our code, we can create tests that fail when performance (e.g., of a function) is too slow, and pass when performance is acceptable. We do this by making the running of the function (or other process) itself the “input,” and designating the time (or other resource) taken as the “output.”
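A benchmark of this kind can be sketched in a few lines. Everything named here is an assumption for illustration: the timeIt helper, the iteration count, and especially the BUDGET_MS threshold, which a real team would choose based on its own requirements and hardware.

```javascript
// Function under test (assumed for illustration).
function byTwo(number) {
  return number * 2;
}

// Runs fn the given number of times and returns the elapsed milliseconds.
// The run itself is the "input"; the time taken is the "output."
function timeIt(fn, iterations) {
  const start = Date.now();
  for (let i = 0; i < iterations; i++) {
    fn(i);
  }
  return Date.now() - start;
}

// Hypothetical performance budget for 100,000 calls.
const BUDGET_MS = 1000;
const elapsed = timeIt(byTwo, 100000);
const withinBudget = elapsed <= BUDGET_MS;

console.log(withinBudget ? "pass" : "fail");
```

Timing results vary between machines and runs, so a budget like this usually needs generous headroom to avoid tests that fail intermittently.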
However, until the performance is under some verifiable testing structure of this format, it would not be considered “behavior” that we are concerned about changing or not changing. If we have functional tests in place, we can adjust our implementations freely until we decide on some standard of performance. At that point, our test suite grows to encompass performance characteristics.
So in the end, caring about performance (and other nonfunctional aspects) is a secondary concern until we decide to create expectations and tests around it.
The point is to improve quality while preserving behavior. This is not to say that fixing bugs in broken code and creating new features (writing new code) are not important. In fact, these two types of tasks are tied more closely to business objectives, and are likely to receive much more direct attention from project/product managers than concerns about the quality of the codebase. However, those actions are both about changing behavior and therefore are distinct from refactoring.
We now have two more items to address. First, why is quality important, in the context of “getting things done”? And second, what is quality, and how does refactoring contribute to it?
It may seem as though everyone and every project operates on a simple spectrum between quality and getting things done. On one end, you have a “beautiful” codebase that doesn’t do anything of value. On the other end, you have a codebase that tries to support many features, but is full of bugs and half-completed ideas.
A metaphor that has gained popularity in the last 20 years is that of technical debt. Describing things this way puts code into a pseudofinancial lingo that noncoders can understand easily, and facilitates a more nuanced conversation about how quickly tasks can and should be done.
The aforementioned spectrum of quality to speed is accurate to a degree. On small projects, visible and addressable technical debt may be acceptable. As a project grows, however, quality becomes increasingly important.
There are metrics like code/test coverage, complexity, numbers of arguments, and length of a file. There are tools to monitor for syntax errors and style guide violations. Some languages go as far as to eliminate the possibility of certain styles of code being written.
There is no one grand metric for quality. For the purposes of this book, quality code is code that works properly and is able to be extended easily. Flowing from that, our tactical concerns are to write tests for code, and write code that is easily testable. Here I not so humbly introduce the EVAN principles of code quality:
Feel free to make up your own “principles of software quality” with your own name.
In the context of refactoring, quality is the goal.
Because you solve such a wide range of problems in software and have so many tools at your disposal, your first guess is rarely optimal. To demand of yourself only to write the best solutions (and never revisit them) is completely impractical.
With refactoring, you write your best guess for the code and the test (although not in that order if you’re doing test-driven development, or TDD; see Chapter 4). Then, your tests ensure that as you change the details of the code, the overall behavior (inputs and outputs of the test, aka the interface) remains the same. With the freedom that provides, you can change your code, approaching whatever version of quality (possibly including performance and other nonfunctional characteristics) and whatever forms of abstraction you see fit. Aside from the benefit of being able to improve your code gradually, one significant additional perk of practicing refactoring is that you will learn to be less wrong the first time around: not by insisting on it up front, but by having experience with transforming bad code into good code.
So we use refactoring to safely change code (but not behavior), in order to improve quality. You may rightfully be wondering what this looks like in action. This is what is covered in the rest of the book; however, we have a few chapters of background to get through before that promise can be delivered upon.
In Chapters 6 and 7, we look at general refactoring techniques. Following that, we look at refactoring object-oriented code with hierarchies in Chapter 8 and patterns in Chapter 9. Then we finish with asynchronous refactoring (Chapter 10), and refactoring through functional programming (Chapter 11).
Although for most of this book we commit to refactoring as a process for improving code, it is not the only purpose. Refactoring also helps build confidence in coding generally, as well as familiarity with what you are working on.
You’re more important than your code. Break it. Delete it all. Change everything you want. Folds had the money for a new piano. You have version control. What happens between you and your editor is no one else’s business, and you’re working in the cheapest, most flexible, and most durable medium of all time.
By all means, refactor your code to improve it when it suits you. My guess is that will happen frequently. But if you want to delete something you don’t like, or just want to break it or tear it apart to see how it works, go for it. You will learn a lot by writing tests and moving in small steps, but that’s not always the easiest or most fun and liberating path to exploration.
And the list goes on. For existing code, any changes made to the interface (aka behavior) should break tests. If they don’t, that indicates poor coverage. However, changes to the underlying details of implementation should not break tests.
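The difference between those two kinds of change can be sketched with a test that checks behavior next to one that checks implementation details. All names below are assumptions for illustration, and inspecting a function's source text is shown only as an example of over-specification, not as a recommended practice.

```javascript
// A behavior test: only inputs and outputs matter.
const behaviorTest = (fn) => fn(4) === 8;

// An over-specified test: it insists on a particular operator in the source.
const overSpecifiedTest = (fn) => fn.toString().includes("*");

const original = (number) => number * 2;
const refactored = (number) => number + number; // same behavior, new details
const changedInterface = (number) => String(number * 2); // now returns a string

console.log(behaviorTest(original)); // true
console.log(behaviorTest(refactored)); // true: the refactoring is safe
console.log(behaviorTest(changedInterface)); // false: the interface change is caught
console.log(overSpecifiedTest(refactored)); // false: breaks on a harmless change
```

The behavior test fails exactly when the interface changes, while the over-specified test fails on a change that should have been free to make.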
Additionally, any code changes made without tests in place (or at least a commitment to manually test the code) cannot be guaranteed to preserve behavior, and therefore are not refactoring, just changing code.
Initially for this book, we considered designating “refactoring,” as it is colloquially used to mean “changing code,” by a lowercase r, and reserving the capitalized version for our more specific definition (confidently restructuring code in a way that preserves behavior). Because this is cumbersome and we never mean lowercase “refactoring” in the context of this book, we decided not to use this distinction. However, when you hear someone say “refactoring,” it is worth pausing to consider whether they mean “Refactoring” or “refactoring” (i.e., restructuring or just changing code).
Hopefully, this chapter has helped to reveal what refactoring is, or at least provided some examples of what it is not.
Frameworks can’t save us from our quality issues. jQuery didn’t save us, and neither will ESNext, Ramda, Sanctuary, Immutable.js, React, Elm, or whatever comes next. Reducing and organizing code is useful, but through the rest of this book, you will be developing a process to make improvements that don’t involve a cycle of suffering with poor quality followed by investing an unknowable amount of time to rebuild it in the “Framework of the Month,” followed by more suffering in that framework, followed by rebuilding, and so on.