Prefactoring

By Ken Pugh


Table of Contents

Chapter One: Introduction to Prefactoring
WE START WITH AN INTRODUCTION TO THE FACETS OF PREFACTORING AND DISCUSS HOW IT RELATES TO ITS NAMESAKE, REFACTORING. We explain that what you get out of prefactoring depends upon your point of view and the context in which you develop. We introduce guidelines that represent suggestions of good practices appropriate to the development context.
Additional content appearing in this section has been removed.
Purchase this book now or read it online at Safari to get the whole thing!
What Is Prefactoring?
Refactoring is the practice of altering code to improve its internal structure without changing its external behavior. Prefactoring uses the insights you have gleaned from your experience, as well as the experience of others, in developing software. The expertise gained in refactoring is part of that experience.
I have condensed my ideas and the ideas I have heard from many developers over many years into the prefactoring guidelines we will explore in this book. Take them as a starting point to developing your own guidelines. Many guidelines relate to basic design principles, but they are expressed in different fashions. Other guidelines revolve around the concepts of Extreme Abstraction, Extreme Separation, and Extreme Readability. I will talk about those concepts later in this chapter.
Another facet of prefactoring is a concentration on interfaces. By considering interfaces (what components can do for you, instead of how they work), you further the goal of abstraction. Refactoring is also concerned with interfaces: they define the external behavior that must stay the same while you alter the internal implementation.
Applying the guidelines in this book does not guarantee that you will never need to refactor your design or code. You might decrease the amount of refactoring that is required. Can you foresee everything? No. Are the decisions you make today final? No. It is practically impossible to think of everything or know everything in the beginning of a project. You will learn more things as a project goes along. However, you can use your experience and the experiences of others to guide you in a certain direction. You can make decisions today that might minimize changes tomorrow.
The Three Extremes
Abstraction, separation of concerns, and readability underlie many of the guidelines. These notions parallel some of the ideas in Extreme Programming. If abstraction is good, Extreme Abstraction is better; if separation of concerns is good, Extreme Separation is better; and if readability is good, Extreme Readability is better. Many of the guidelines present an extreme position, so you can differentiate it from your current practices. You might wind up finding your own in-between position that balances the tradeoffs in a manner appropriate to your situation.
Abstraction is one of the key principles in an object-oriented system. You specify operations without specifying the details of how those operations will be implemented—the "what" and not the "how." On one level, a system can be described with enough abstraction that either a manual, computer-aided, or automated procedure could implement it. However, sometimes a system is described so abstractly that you cannot imagine how it will operate until you can see a concrete realization, such as a prototype.
The flow of this book parallels abstraction. Operations and interfaces are stated in a language-insensitive manner, using only those facets such as classes, interfaces, and exceptions that are common to all object-oriented languages. Pseudocode is used to present the sequence and logic that an implementation might encompass. To demonstrate that the abstractions can become reality, code is shown after the interfaces have been defined on an abstract level.
As an example of Extreme Abstraction, one guideline suggests that concepts never be described with primitives (e.g., int or double).
Separation of concerns deals with splitting responsibilities between different classes, different methods, and different variables. As we will see in the sample system, a typical class is the Customer class. One can assign to this class any method that deals with the
The Guidelines Explored
Our goal as developers is to make understandable, readable, and maintainable code. The guidelines in this book are designed to help you reach this goal. The guidelines presented in this book do not represent best practices. "Best" can be determined only in the context in which you are currently developing a system. However, the guidelines do represent suggestions for creating good practices appropriate to your context.
Many of the guidelines are different manifestations of the same basic principles. The underlying principles have tradeoffs in their application, which also appear in the derived guidelines. For example, applying the principle of separation of concerns usually creates more classes and more methods. Consistency might increase the amount of code, but it gives systems that do similar things the same structure, which reduces how much a developer has to learn. A concentration on interfaces and delegation increases the number of delegating methods.
One rule exists: nothing works everywhere, and hence, you must judge whether a particular practice is appropriate for your application. You need to apply principles in context. The decision whether to use a particular principle or practice depends on the situation in which it is employed. When you try to apply the same principle or style to everything, you can create waste or confusion. To require vast documentation on a program that is to be used only as a transition to another program is wasteful. Failure to document fully the program in a cardiac pacemaker might be fatal. Similarly, some programs, such as pacemakers and avionics, need to deal with lots of error handling. Other programs can have simpler error processing, such as a browser that needs to display an error message only if it does not get a reply from the server.
The number of people involved in a system affects how you develop the system. I spend much less time worrying about prefactoring on simple scripts that only I use. If someone else is going to run the scripts, I spend more time dealing with issues such as input validation and meaningful error messages.
The Context for This Book
As I mentioned in the Preface, the guidelines in this book are presented in the context of the development of a system for a CD rental store. This development follows an agile process, which is a composite of iterative and incremental processes. Along the way, we encounter situations that are typical in the creation of a system, especially interactions with clients. We start with a set of requirements that are expressed in use cases. Complete details of each use case are not filled in until each use case is being implemented.
A preliminary analysis of the entire system is performed and an overall architecture is formed. The overall architecture is the big picture. It does not contain every detail of every class. After filling in the details for the first set of requirements, the solution is designed in detail and implemented. For waterfall programmers, it might seem as though we are coding too early. For extreme programmers, this might seem like overanalysis. I prefer such an intermediate approach. It is important to deliver a working system to the user early in the development cycle, but the system should fit into the overall solution to the problem.
Chapter Two: The System in So Many Words
WE MEET SAM, THE CLIENT, FOR WHOM WE ARE DEVELOPING A SYSTEM. Tim, a co-developer, and I interact with Sam to get an overall view of what he wants the system to do through use cases and prototypes. We work together to determine a common vocabulary to describe the system's requirements.
Meet Sam
Systems are not developed in a vacuum. They are created to meet an organization's needs. The client for whom a system is developed is the source of the requirements for the system and is the final decider of whether a system meets those requirements. Sam, the client, represents a composite of clients for whom I have developed systems over the years.
Sam owns the business CD Rental and Lawn Mower Repair. He started out with lawn mower repair and discovered that people who use lawn mowers like to listen to CDs and they prefer listening to a different CD each time they mow. Therefore, Sam came up with the idea of renting CDs. The service started out as a whim, but it has grown dramatically.
Sam contacted me about creating a system for keeping track of rentals in his store. His current system of using cards similar to library cards works, but it is unable to provide him the reports he feels his growing business requires.
Currently Sam has only one store. Since business is booming, he is considering opening several more stores. He wants us to design the system not only so it works in his store today, but so he can change it easily to accommodate multiple stores tomorrow.
Tim introduced me to Sam. Tim studied computer science in college and worked summers at Sam's CD Rental and Lawn Mower Repair. He has been working as a programmer for five years, the last couple of years with me. He is back in school getting a master's degree. We still work together, but mostly remotely. He takes courses and does some teaching, so he is often unavailable during the day, when I am talking with Sam.
Tim represents an amalgamation of programmers with whom I have worked. We work together on approaches to solutions, but usually work separately on code due to the remoteness factor. Because of our physical separation, code readability is extremely important.
Sam came up with some features that he wants to incorporate into his system. They are based on what he already does with his index cards, as well as additional ideas that he developed in his head. He listed them on a sheet of paper:
Reinvention Avoidance
Once Tim and I understand Sam's system requirements, our first step is to determine whether an existing program provides the features that we need. There is no sense in re-creating the wheel if an existing wheel works the way we want. Our goal as developers is to solve the client's problem, not to just write code.
Sam had searched for a commercial program and did not find anything. It appears that he is in a unique business, so nothing has been written, which is not surprising.
We suggested to him that the process of renting a CD is similar to the process of renting a videotape or DVD. He could purchase one of those programs and it would already have many of the features that he wanted. He decided that he would rather have his own custom program instead of dealing with the terminology and handling differences among CDs and DVDs. We recommended that if he decides to expand into selling CDs, we should investigate retail sales systems. A lot of functionality already exists in those systems that should not be re-created. If a preexisting solution fits into the overall system, at least that part of the wheel need not be recreated.
Since Sam wants us to develop a custom system, Tim and I start to analyze the problem. We need to outline the concepts involved in the problem and clarify our understanding of what needs to be solved.
What's in a Name?
Names are important, not just for the code but also for requirements and analysis. If you don't know what you're talking about, it's hard to design for it.
Sam described how he wants to keep track of the CDs. He also desired a catalog of all the CDs that he has for rent.
"So, what is a CD?" I asked Sam.
He paused for a moment and looked at me with a questioning expression on his face. He must have thought I was crazy. "You know, one of those round things you put in a CD player," he said.
"So, when you said you want a CD catalog, do you mean you want an entry in it for every round thing you have in your store?" I asked.
He paused again. "No, I want only one for each title, regardless of how many copies I have in the store."
I suggested, "So, let's decide to use two terms, one for the CD title and one for the CD copy. This way we minimize the opportunity for misunderstanding. What do you want to call each thing?"
"Now I see what you mean," he replied. "What do you suggest?"
I replied, "Let's call the title a CDRelease, and the other a CDDisc. We could use the name CDTitle, but that would start to get confusing when we talk about the title of a CDTitle. To clarify what we mean even further, we can describe each term with a sentence:
"Now is it possible that a CD which a customer would be looking for would be related to two different UPCs?" I asked.
"It's possible," he said. "But I don't think we need to worry about that. One would usually have the term rerelease in its title."
"We can always revisit this question if things change," I said. "Let's alter your requirements and the use cases to utilize these terms."
At this point, Sam and I came up with the following list of modified requirements:
  • Keep track of where each CDDisc is, both when it is in the store and when someone has rented it (including who has rented it).
Splitters Versus Lumpers
If the world were perfect, you would have exactly one unique name for each concept in a system. In this imperfect world, having two concepts with the same name leads to confusion. In Sam's case, the term CD was applied to both a CDRelease and a CDDisc. Separating the two concepts with two names clarified the requirements.
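The split between the two names can be sketched in code. The following is only an illustration under stated assumptions: the fields (upc, title, physicalId) and the CDCatalog class are hypothetical, chosen to show that the catalog lists each CDRelease once while any number of CDDiscs refer to it.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// One entry per title. The upc and title fields are assumptions for this sketch.
final class CDRelease {
    private final String upc;
    private final String title;

    CDRelease(String upc, String title) {
        this.upc = upc;
        this.title = title;
    }

    String upc() { return upc; }
    String title() { return title; }
}

// One physical copy of a release. The physicalId field is an assumption.
final class CDDisc {
    private final String physicalId;
    private final CDRelease release;

    CDDisc(String physicalId, CDRelease release) {
        this.physicalId = physicalId;
        this.release = release;
    }

    String physicalId() { return physicalId; }
    CDRelease release() { return release; }
}

// Hypothetical catalog: releases are listed once, discs are grouped per release.
final class CDCatalog {
    private final Map<String, CDRelease> releasesByUpc = new LinkedHashMap<>();
    private final Map<String, List<CDDisc>> discsByUpc = new LinkedHashMap<>();

    void addDisc(CDDisc disc) {
        CDRelease release = disc.release();
        releasesByUpc.putIfAbsent(release.upc(), release);
        discsByUpc.computeIfAbsent(release.upc(), key -> new ArrayList<>()).add(disc);
    }

    // Number of catalog entries, regardless of how many copies exist.
    int releaseCount() { return releasesByUpc.size(); }

    // Number of physical copies held for one release.
    int copiesOf(CDRelease release) {
        List<CDDisc> discs = discsByUpc.get(release.upc());
        return discs == null ? 0 : discs.size();
    }
}
```

Adding two CDDiscs of the same CDRelease yields one catalog entry and a copy count of two, which matches the answer Sam gave when asked what the catalog should contain.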
Using two different names for a single idea can also be confusing, albeit less so than two ideas with a single name. Referring to a physical CD as both a CDDisc and a CDPhysical might be justified for political reasons. ("This department calls it this and that department calls it that.") Sam referred to the act of renting a CD as both renting a CD and checking out a CD. If these two terms really encompass the same operation, the duality of reference can be annoying, but might not be confusing.
Sometimes it is hard to determine whether you have two independent concepts or one. Try making up a one-line definition for a name. If it is difficult to create a simple definition, go ahead and use two names. Later on, if you find that the distinction was meaningless, you can always declare the two names to be synonyms. Suppose that Sam and I came up with the terms CDAlbum and CDRelease. We might distinguish them by stating that a CDAlbum is a collection of songs with a title given to the set, and a CDRelease is a collection of songs that was released on a single CDDisc.
The conversion from one style of architecture, design, or coding to another is not necessarily symmetrical. Suppose that a single name has been used to denote two ideas. Later you decide that you need to replace that name with appropriate names for each idea. You need to examine each usage of the term carefully to determine which of the two concepts it represents. On the other hand, suppose that you have used two different names for a single concept. If you want to combine those into a single name, you can do a simple global replacement.
For example, suppose we have a class called Message, which represents messages displayed to the user. We think at the beginning that these messages are going to behave differently, so we divide them into
Clumping
When Sam described his customers in detail, he mentioned that he needed to keep track of each customer's home address, including street, city, state, and Zip Code, as well as credit card billing address, including street, city, state, and Zip Code.
I asked him, "Do both of those addresses contain the same information?"
He replied affirmatively.
I said, "Then let's just describe the combination as an Address. That way, you don't have to keep mentioning all the parts unless there is something different about them."
"OK," he answered.
We clumped the data into a class, as follows:
    class Address
        {
        String line1;
        String line2;
        String city;
        String state;
        String zip;
        }
At this point, we simply clump the related data, even though we have not assigned any behavior to the class. This data object helps in abstraction and in cutting down parameter lists. Even though the class contains only data at this point, we might be able to assign responsibility to it later on.
Clumping and lumping look similar, but they have distinctly different meanings. Clumping involves combining a set of attributes into a single named concept. The attributes should form a cohesive whole. Lumping involves using a single name for two different concepts. Clumping is an abstraction technique, which makes for an efficient description of a set of data. Lumping can hide relevant distinctions between concepts.
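To show how a clumped data object cuts down a parameter list, here is a minimal sketch. The Address fields follow the class shown earlier in this section; the constructor, the Customer class, and its name field are assumptions added for illustration only.

```java
// Clumped data object; field names follow the Address class in this section.
final class Address {
    final String line1;
    final String line2;
    final String city;
    final String state;
    final String zip;

    Address(String line1, String line2, String city, String state, String zip) {
        this.line1 = line1;
        this.line2 = line2;
        this.city = city;
        this.state = state;
        this.zip = zip;
    }
}

// Hypothetical Customer: two Address parameters replace ten separate strings.
final class Customer {
    private final String name;
    private final Address homeAddress;
    private final Address billingAddress;

    Customer(String name, Address homeAddress, Address billingAddress) {
        this.name = name;
        this.homeAddress = homeAddress;
        this.billingAddress = billingAddress;
    }

    String name() { return name; }
    Address homeAddress() { return homeAddress; }
    Address billingAddress() { return billingAddress; }
}
```

Without the clump, the same constructor would need a name plus five strings for each address, and nothing would stop a caller from passing the billing city where the home city belongs.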
Abstracting
In creating a description of a use case or a model of a possible class, avoid using primitive data types. Pretend that ints or doubles do not exist. Almost every type of number can be described with an abstract data type (ADT). Items are priced in Dollars (or CurrencyUnits, if you are globally oriented). The number of physical copies of an item in an inventory is a Count. The discount that a good customer receives is denoted with a Percentage. The size of a CDDisc is expressed as a Length (or LengthInMeters or LengthInInches if you are going to be sending a satellite into space). The time for a single song on a CDRelease could be stored in a TimePeriod.
Using an ADT places the focus on what can be done with the type, not on how the type is represented. An ADT shows what you intend to do with the variable. You can declare the variable as a primitive data type and name the variable to reflect that intent. However, a variable declared as an abstract data type can have built-in validation, whereas a variable declared as a primitive cannot.
Each ADT needs a related description. For example, a Count represents a number of items. A Count can be zero or positive. If a Count is negative, it represents an invalid count. Declaring a variable as a Count conveys this information. You can create variations of Count. You may have a CountWithLimit data type with a maximum count that, if exceeded, would signal an error.
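The Count description above can be sketched as follows. This is one possible reading, not a definitive implementation: rejecting a negative value in the constructor, the plus method, and the use of IllegalArgumentException are all assumptions made for the example.

```java
// Hypothetical ADT: a number of items that is zero or positive.
class Count {
    private final int value;

    Count(int value) {
        // Assumption: an invalid (negative) count is rejected at construction.
        if (value < 0) {
            throw new IllegalArgumentException("Count cannot be negative: " + value);
        }
        this.value = value;
    }

    int value() { return value; }

    // Adding two counts yields a new Count; addExact guards against int overflow.
    Count plus(Count other) {
        return new Count(Math.addExact(value, other.value));
    }
}

// Variation with a maximum; exceeding the limit signals an error.
final class CountWithLimit extends Count {
    private final int limit;

    CountWithLimit(int value, int limit) {
        super(value);
        if (value > limit) {
            throw new IllegalArgumentException(
                "Count " + value + " exceeds limit " + limit);
        }
        this.limit = limit;
    }

    int limit() { return limit; }
}
```

A variable declared as Count carries its validity rule with it, so code that receives one does not need to recheck for negative values.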
You can place limits on many different data types. For example, Ages (of humans) can range between 0 and 150 years, SpeedLimits (for automobiles) between 5 and 80 mph, and Elevations (for normal flying) between 0 and 60,000 feet. All these types can be represented by an int or a double, but that is an implementation issue, not an abstraction issue.
Abstract types can contain more than just validation. A price can be represented in Dollars. The string representation of a Dollar differs from the string representation of a double. A string for Dollar
Prototypes Are Worth a Thousand Words
It is often said that a picture is worth a thousand words. A prototype is like a picture. A user interface described in text is often harder for the customer to visualize than the same interface described with a diagram or picture. Use cases can provide excellent textual descriptions. A prototype (or screen mockup) gives a more concrete perspective on a program's intended operation. The prototype can spark feedback from the client in both the program's operation and in missing requirements.
One of the dangers of making a perfect-looking GUI for a prototype is that the interface represents the program to the user. If the interface is complete, the user might expect that the system is almost complete. Some user interface experts suggest that interfaces be designed using whiteboards or Post-it notes. If you are programming in Java, you can use the Napkin Look and Feel (http://napkinlaf.sourceforge.net/). Tim and I created a rough-draft prototype of the screens for the use cases we worked on with Sam (Figure 2-1). We went over it with Sam. The cases are simple, so he had no changes in its interface. He did note that the buttons should use a large font so that he could read them without his glasses.
Figure 2-1: Rental screens
Chapter Three: General Development Issues
BEFORE WE ANALYZE SAM'S SYSTEM IN DETAIL, we look at some general system development issues. These issues relate to all forms of software development, not just object-oriented design. We examine the big picture, interface contracts, communicating with code, simplicity, dealing with errors, and the spreadsheet conundrum.
Start with the Big Picture
The big picture refers to the broad perspective of a system in development. The big picture includes the system's overall architecture and business purpose.
Most successful systems have a single vision of the architecture. The vision can come from group consensus or from a single respected individual. Design decisions within a system should be consistent with that architecture.
As Tim and I develop Sam's system, we will keep in mind that its ultimate goal is to function as a multistore system. As the individual pieces of the initial system are developed, the choice between the various design approaches will be affected by that business purpose.
Sam's system is being created in an entirely new environment. Any components that we create (classes, display widgets, etc.) are going to be used in that context. If we attempt to develop components in a vacuum (e.g., without reference to their use), we might have a lot of vacuuming to do when we are finished.
For example, we are developing a new Customer class. Its purpose and interface are driven by its representation as someone to whom Sam rents a CD. An attempt to make the class more general (e.g., so that it can represent a purchaser of CDDiscs) not only would be unnecessary, but also would complicate its required purpose.
On the other hand, much software development occurs within an existing environment, which represents the even "bigger picture." The environment might consist of the entire enterprise, a single division, or a single department. Gaining knowledge of that environment before creating your own system helps save development time. The environment might have components that you can use in your system. It might have established frameworks that will make your system structure consistent with other systems in the environment.
For example, the bigger picture might already contain a Customer class. If that class represents the concept that you want to use in the new system, to create another would be unnecessary duplication. This "bigger picture" also determines how you can develop components. If there were no existing
Interface Contracts
A system is made up of interfaces that interact with each other. The interfaces can be implemented by object-oriented code or by procedural code. Yukihiro Matsumoto, the inventor of Ruby, suggests that the interface is everything to the user. I would add that the interface is everything to the developer.
Bertrand Meyer introduced the concept of Design by Contract in Object-Oriented Software Construction (Prentice Hall PTR, 2000). An interface has a contract with the user of that interface. The contract consists of preconditions and postconditions for every method in the interface. Preconditions are assertions that must be true when the method is called so that the method can perform its operations. Postconditions are assertions that should be true when the method finishes.
For example, when you're writing to a file, a precondition is that the file must be opened, and a postcondition is that the file length has changed (if you're appending to the end of the file). A precondition can also apply to the value of an argument in a call; for instance, it must be between a given set of values. The called method should ensure that the postconditions are met. Otherwise, it has not performed its job properly. The questions are who checks the preconditions and what should result if a precondition is not met.
Some designers feel that it is the calling routine's responsibility to make sure the preconditions are satisfied. For example, the calling routine should ensure that parameters are within the ranges the preconditions specify. If they are not, the called method has no responsibility to perform its contractual obligation.
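The file example can be sketched with an in-memory stand-in. Everything here is hypothetical (the AppendOnlyFile class, its methods, and the choice of exceptions), and the sketch shows only one of the possible policies: the called method checks its own preconditions instead of trusting the caller.

```java
// Hypothetical in-memory stand-in for a file that is appended to.
final class AppendOnlyFile {
    private final StringBuilder contents = new StringBuilder();
    private boolean isOpen;

    void open() { isOpen = true; }
    void close() { isOpen = false; }
    int length() { return contents.length(); }

    void append(String text) {
        // Preconditions: the file is open and there is something to write.
        // Assumption: the called method checks them and refuses to proceed.
        if (!isOpen) {
            throw new IllegalStateException("file must be open before appending");
        }
        if (text == null) {
            throw new IllegalArgumentException("text must not be null");
        }

        int lengthBefore = contents.length();
        contents.append(text);

        // Postcondition: the length grew by exactly the amount appended.
        if (contents.length() != lengthBefore + text.length()) {
            throw new AssertionError("postcondition violated: unexpected length");
        }
    }
}
```

Under the alternative policy described above, the two precondition checks would move to the calling routine, and append would be free to assume they hold.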
Validation
You should validate all input data before transferring it to internal processing. Violation of this rule has caused numerous web sites to be subject to attacks such as Structured Query Language (SQL) injection.
You should convert input to its corresponding abstract data type as it comes into the system. For example, you should convert an input field that represents a dollar amount into a variable of type Dollar. Failures of conversion (such as an amount that has three decimal digits) can be reported back to the user immediately. There is no sense in processing an invalid Dollar.
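A minimal sketch of such a conversion is shown here. The text names the `Dollar` type but not its interface, so the method names, the storage in cents, and the rules (non-negative amounts, a period as decimal separator, at most two decimal digits) are all assumptions.

```java
import java.math.BigDecimal;

// Hypothetical Dollar type; its interface is assumed for this sketch.
final class Dollar {
    private final long cents;

    private Dollar(long cents) { this.cents = cents; }

    // Converts raw input text to a Dollar. Anything other than a non-negative
    // amount with at most two decimal digits is rejected at the boundary.
    static Dollar from_string(String input) {
        if (input == null || !input.trim().matches("\\d{1,15}(\\.\\d{1,2})?"))
            throw new IllegalArgumentException("Not a valid dollar amount: " + input);
        BigDecimal amount = new BigDecimal(input.trim());
        return new Dollar(amount.movePointRight(2).longValueExact());
    }

    long to_cents() { return cents; }
}
```

Because the constructor is private, the only way to obtain a `Dollar` is through the validating conversion, so internal code never has to recheck the amount.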
All identifiers should contain self-validating values that can prevent most common entry errors. For example, if a PhysicalID was used to identify each CDDisc, it should contain a check-digit or other error-detection mechanism. Common typing errors can be caught at input, instead of being passed along as erroneous data.
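One possible self-validating identifier is sketched below. The text does not say which error-detection mechanism to use; this sketch assumes an all-digit ID whose final digit is a Luhn check digit, a scheme that catches any single mistyped digit and most swaps of adjacent digits. The `PhysicalID` interface shown is hypothetical.

```java
// Hypothetical PhysicalID whose last digit is a Luhn check digit (an assumed scheme).
final class PhysicalID {
    private final String digits;

    private PhysicalID(String digits) { this.digits = digits; }

    // Computes the Luhn check digit for a string of decimal digits.
    static int check_digit_for(String payload) {
        int sum = 0;
        boolean double_it = true;   // the digit next to the check digit is doubled
        for (int i = payload.length() - 1; i >= 0; i--) {
            int digit = payload.charAt(i) - '0';
            if (double_it) {
                digit *= 2;
                if (digit > 9) digit -= 9;
            }
            sum += digit;
            double_it = !double_it;
        }
        return (10 - sum % 10) % 10;
    }

    // Accepts the input only if its last digit matches the computed check digit.
    static PhysicalID from_string(String input) {
        if (input == null || !input.matches("\\d{2,}"))
            throw new IllegalArgumentException("ID must consist of at least two digits");
        String payload = input.substring(0, input.length() - 1);
        int actual = input.charAt(input.length() - 1) - '0';
        if (actual != check_digit_for(payload))
            throw new IllegalArgumentException("Check digit mismatch in ID: " + input);
        return new PhysicalID(input);
    }

    String to_string() { return digits; }
}
```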
The most stringent rule is that the value of every parameter to every method should be subject to validation. The rule can be relaxed if the caller of the method (or its caller) has performed the validation. If an attribute is set from a method called from the outside world, the setting method should check for validity. If the value for the attribute is read from a configuration file, it should be checked. If the attribute is set by data read from a database and that data was placed there by the system, the checking needed is minimized. The data should have been stored only if it underwent validity checking. The paranoid developer might still want to check it.
Paranoia is not necessarily a bad thing, unless it hinders code interpretation or performance. Some error-handling routines might never be invoked. That is not necessarily a bad thing. It is better for a doctor to double-check the blood types of a heart donor and its recipient before doing a heart transplant than it is for her not to double-check. If the double-check never fails because a mismatch never occurs, the error procedure ("Get another heart") is never executed.
Code Communicates
A primary purpose of code is communication. You are communicating with the computer, telling it what operations to perform. However, you are also communicating with the reader, showing him the steps you are taking to complete an operation. The most important issues to communicate are what you are doing and why you are doing it, not how you are doing it.
Code should communicate its purpose and your intention. On at least one level, code should read like a book, albeit one with stilted syntax. The client should be able to read the code to see if it follows the logic that he expressed in the requirements. The details of the implementation should be relegated to deeper levels.
When reading code, it is often difficult to determine intent. If the writer did not take particular care to make his intent clear, it might be buried in the details. The "micromanagement" of details can hide the flow of the logic.
For example:
    if (a_customer.has_late_rental())
        a_customer.suspend_rental_privilege();
This code does not show how the suspension of rental privileges is recorded. It states why the person is being suspended (because of a late rental). The client should be able to follow the logic involved, without getting buried in implementation details.
Code readability is measured by whether someone else can read it. The pair programming practice of Extreme Programming ensures that two people can read the code. If you are solo-programming, Nitin Narayan, a reviewer, suggests, "Readability of code is best verified when tested by two or three programmers who read the code and explain what it does to the actual coder. I call this the code readability test. The coder can then change his code to make it more readable based on the feedback he gets from the code readability testers."
Implicitness makes programs shorter, but can make them less readable by programmers who are less experienced in the language or who use many languages simultaneously. Implicitness requires that you make assumptions about the programmer's knowledge. Explicitness requires more writing, but uncertainty as to interpretation is minimized.
Consistency Is Simplicity
Consistency is a form of simplicity. Consistency makes it easy to deal with the world. Can you imagine using your cell phone if every time you turned it on, another revision of the user interface was downloaded to it? However, too much consistency can inhibit creativity. You should apply consistency with a purpose. "A foolish consistency is the hobgoblin of little minds" (Ralph Waldo Emerson).
Consistency is not just following code style conventions. It is doing similar things in a similar manner, unless there is a good reason to change. If you are going to use exceptions, come up with guidelines for what types of events are exceptions and what types are not. For example, should the failure to find a customer with a particular name be an exception or an expected condition? Is it the caller's or the callee's responsibility to check contracts? Will a dozen callers contain the same checking code that could be stored once only, in a single callee?
The development environment can provide consistency. For example, Borland JBuilder has a command that creates a listener for events. A separate listening class is created for each event, along with a separate method that is used to code the response to the event. This approach adds a level of indirection. However, once you're familiar with the approach, it is easy to determine which methods are event handlers.
A Prefactoring Attitude
As you write your code, be conscious of the way you are writing it. Use a "prefactoring" editor attitude. It is OK to cut a block of code and paste it elsewhere. That usually means you're moving it to a better position: into a separate method, as detailed in Martin Fowler's Extract Method. If you find yourself copying and pasting a block of code, stop and analyze what you are copying. Should you place that code into a method? If so, why not place it there now, to prevent code duplication before it occurs?
Are you using the code you are copying as a template? Then why not create a template? A source code template is good for creating a common pattern in the layout of your classes. For example, if you decide that all classes should have to_string() and from_string() methods, set up a source code base that includes those methods. Whenever a developer creates a new class, she can copy that template into the new class source. This is "the exception to the rule" of copying and pasting more than a single line. In this case, there is a justifiable reason for the copy and paste: interface consistency. You can create a "wizard" to perform automatic text replacements. If your integrated development environment (IDE) supports an "implement interface" command, this template should be an interface. The act of implementing it creates the skeleton code for the methods of the implementing class.
When you find yourself writing a comment for a section of code, ask yourself why you are commenting it. Are you describing how you are implementing a particular algorithm within the method? If so, the section of code should probably be its own method. For example:
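    int [] array;
    int odd_number;
    // Find the first odd number in the array
    // (start at index 0 so that the first element is examined)
    for (int m=0; m < array.length; m++)
        {
        // != 0 also recognizes negative odd numbers
        if (array[m] % 2  != 0)
            {
            odd_number = array[m];
            break;
            }
        }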
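One way the commented loop above might become its own method is sketched here. The method name `find_first_odd_number`, the enclosing class, and the use of `OptionalInt` to report that no odd number exists are assumptions, not the book's own version.

```java
import java.util.OptionalInt;

class OddNumbers {
    // The method name now says what the comment used to say.
    static OptionalInt find_first_odd_number(int[] array) {
        for (int value : array) {
            if (value % 2 != 0)
                return OptionalInt.of(value);
        }
        return OptionalInt.empty();   // the array holds no odd number
    }
}
```

The calling code then reads `OddNumbers.find_first_odd_number(array)` and needs no comment to explain its purpose.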
Don't Repeat Yourself
The "Adopt a Prefactoring Attitude" guideline is a specialization of the "Don't Repeat Yourself" (DRY) principle of Andrew Hunt and David Thomas (The Pragmatic Programmer: From Journeyman to Master, Addison-Wesley Professional, 1999). The concept is that information should have one authoritative source. If information is needed in multiple ways, a transformation process converts it from the single source into the other formats. By doing so, information needs to be changed only in one place. Dave Thomas says, "The idea is to try to plan ahead to prevent duplication, rather than to waste time removing stuff you've already done."
For example, an XML description of a data table can be transformed into SQL commands to create the table, as well as language-specific classes to access the table. Changes in the organization of the data table need to be made only in one place.
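The single-source idea can be sketched with the JDK's built-in XML parser. The XML layout (a `table` element with `column` children carrying `name` and `type` attributes), the class `TableGenerator`, and the simplistic SQL-to-Java type mapping are all invented for this illustration.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical generator: one XML description, two derived artifacts.
class TableGenerator {
    private final String table_name;
    private final List<String[]> columns = new ArrayList<>();   // each entry is {name, SQL type}

    TableGenerator(String xml) {
        Element table;
        try {
            table = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)))
                .getDocumentElement();
        } catch (Exception e) {
            throw new IllegalArgumentException("Malformed table description", e);
        }
        table_name = table.getAttribute("name");
        NodeList nodes = table.getElementsByTagName("column");
        for (int i = 0; i < nodes.getLength(); i++) {
            Element column = (Element) nodes.item(i);
            columns.add(new String[] { column.getAttribute("name"), column.getAttribute("type") });
        }
    }

    // Derives the SQL command that creates the table.
    String to_sql() {
        List<String> parts = new ArrayList<>();
        for (String[] column : columns)
            parts.add(column[0] + " " + column[1]);
        return "CREATE TABLE " + table_name + " (" + String.join(", ", parts) + ");";
    }

    // Derives the source of a class with one field per column (assumed mapping:
    // INTEGER becomes int, everything else becomes String).
    String to_java_class() {
        StringBuilder source = new StringBuilder("class " + table_name + " {\n");
        for (String[] column : columns) {
            String java_type = column[1].equals("INTEGER") ? "int" : "String";
            source.append("    ").append(java_type).append(" ").append(column[0]).append(";\n");
        }
        return source.append("}\n").toString();
    }
}
```

Adding a column to the XML changes both the SQL and the generated class; neither derived form is edited by hand.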
Documentation of Assumptions and Decisions
Keeping a journal is one way to learn from experience. In the journal, you can document the assumptions you made (e.g., the network is reliable) and the reasoning behind your design decisions. The journal can be separate from or part of the code source.
When you are faced with a design decision, you must have at least two alternatives. If you have only one option, you really have no decision to make. You might employ one of the guidelines in this book to aid you in your decision making. As you make decisions, document why you made them, especially for the more important ones. Later on, you can analyze those decisions and examine how your reasoning and assumptions worked out.
The requirements outline what functionality your system needs to provide. The code itself says how you are technically providing that functionality. However, the code does not document why you chose that particular technical approach. The journal provides the "why." For example:
"Sam stated that he never buys any CDRelease that has more than one physical CD. Therefore, there is only one physical CD that corresponds to a CDDisc and therefore only one ID associated with each CDDisc."
Suppose that later on, someone brings up the issue of CDReleases that contain more than one physical CD. You can examine the documented assumption and see whether it still holds true. If a multi-CD album is always rented as a whole package, the single-ID assumption can still be true. If the album is rented as individual CDs, the assumption is now false, and the design will need to undergo modification.
Dealing with Deviations and Errors
Dealing with error conditions is probably the hardest part of the development effort. Errors fall into at least two categories: conditions that arise in the normal operation of the program, and failures in the environment in which the program is operating.
I prefer the term deviation for an error that occurs during normal processing. A deviation is a departure from straightforward processing that can occur during normal program operation. Most use case logic deals with straightforward logic. The user does this, the system responds with that. In the normal course of processing, the system sometimes needs to deviate from this straightforwardness.
For example, it is possible that a CustomerID is entered that does not equal any of the IDs in the set of Customers. This could occur because the CustomerID was input incorrectly or the Customer was deleted because the customer had not rented for several years. If the collection of customers is kept on a server, other possible causes include a network failure or server failure.
The first set of causes for a CustomerID not being found are deviations that can occur during normal processing. A correction mechanism can be suggested to the user (e.g., reenter the ID), though user action might not solve the problem. The second set of causes (network or server failure) are errors, not deviations. They should not occur during normal operation. However, if the server or network were known to be unreliable, they could be handled as deviations.
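The distinction between the two sets of causes can be made visible in an interface. In this sketch, an unknown ID is reported as an ordinary return value and an unreachable server as an exception. All names are hypothetical, and a boolean flag stands in for a real network check.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical exception for failures in the environment.
class ServerUnavailableException extends RuntimeException {
    ServerUnavailableException(String message) { super(message); }
}

// Hypothetical customer collection; IDs and names are plain strings for brevity.
class Customers {
    private final Map<String, String> names_by_id = new HashMap<>();
    private boolean server_reachable = true;   // stand-in for a real network check

    void add(String customer_id, String name) { names_by_id.put(customer_id, name); }
    void set_server_reachable(boolean reachable) { server_reachable = reachable; }

    // Deviation: an unknown ID is an expected outcome, reported as an empty Optional.
    // Error: an unreachable server is an environment failure, signaled by an exception.
    Optional<String> find_name(String customer_id) {
        if (!server_reachable)
            throw new ServerUnavailableException("Customer server cannot be reached");
        return Optional.ofNullable(names_by_id.get(customer_id));
    }
}
```

A caller that receives an empty `Optional` can ask the user to reenter the ID. If the network were known to be unreliable, the failure could be folded into the return value as well and treated as a deviation.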
Deviations should be dealt with at an appropriate level. The methods closest to where the deviation occurs often have the most information regarding what actions the user can take. If opening a nonexistent file signals an error, the caller of the open method usually knows the file's purpose and can add information regarding what might occur in the absence of that file. For example, suppose the file the method was opening was a configuration file. If the configuration file is nonexistent, the method might choose to use default settings. If the configuration file is absolutely required by the program, the method can signal an error.
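The configuration file case might look like the following sketch. The class `Configuration`, the `required` flag, and the `late_fee` default are hypothetical; the point is that the caller of the open operation decides whether a missing file is a deviation or an error.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Properties;

class Configuration {
    // Loads settings from a properties file. A missing file is a deviation
    // when the file is optional and an error when it is required.
    static Properties load(Path file, boolean required) {
        Properties settings = new Properties();
        settings.setProperty("late_fee", "1.00");   // hypothetical default setting
        if (!Files.exists(file)) {
            if (required)
                throw new IllegalStateException("Required configuration file is missing: " + file);
            return settings;   // optional file: fall back to the defaults
        }
        try (InputStream in = Files.newInputStream(file)) {
            settings.load(in);   // values in the file override the defaults
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return settings;
    }
}
```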
Speeding
Jerry Weinberg tells the story of two groups, both working on solving the same problem. Jerry's group created a solution that worked correctly but was slower than the other group's. The other group developed a solution that was fast but did not work for all input values.
The other group leader referred to the differences by belittling the correct solution for running so slowly. Jerry replied by asking when the other group would have a usable solution.
It is usually easier to transform a correctly designed "slow" system into a fast enough system than it is to alter an incorrect fast system. Even if the slow system cannot be transformed, it can be used as a reference platform for functionality tests.
Do not waste time making assumptions about performance. Use a profiler to measure performance so that you can focus on a handful of key bottlenecks. Jim Batterson, a fellow consultant, tells the story:
I know that when I was really concerned about efficiency, the best thing I had was a monitoring tool that would tell me where I was spending my time. It was always true that I was spending about 90% of my time on about 10 lines of code, or one little subroutine or one read to a file that was done over and over again. You could optimize the hell out of the rest of the system and never get more than a 10% improvement, or you could optimize that one part and make that baby fly.
Once you have determined the location of the bottleneck, you can create a solution. Selecting a different algorithm often yields the greatest performance gains. For example, a quicksort algorithm works better than a bubble sort most of the time.
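A crude way to measure rather than assume is sketched below. It times a hand-written bubble sort against `Arrays.sort`, which the JDK documents as a dual-pivot quicksort for primitive arrays. The class and method names are hypothetical, and a single `System.nanoTime` reading is only a rough stand-in for a real profiler.

```java
import java.util.Arrays;
import java.util.Random;

class SortTiming {
    // Straightforward bubble sort, included only as the slow alternative.
    static void bubble_sort(int[] values) {
        for (int end = values.length - 1; end > 0; end--)
            for (int i = 0; i < end; i++)
                if (values[i] > values[i + 1]) {
                    int swap = values[i];
                    values[i] = values[i + 1];
                    values[i + 1] = swap;
                }
    }

    // Returns the elapsed nanoseconds for one run of the task.
    static long time_nanos(Runnable task) {
        long start = System.nanoTime();
        task.run();
        return System.nanoTime() - start;
    }

    public static void main(String[] args) {
        int[] data = new Random(42).ints(20_000, 0, 1_000_000).toArray();
        int[] for_bubble = data.clone();
        int[] for_library = data.clone();
        System.out.println("bubble sort: " + time_nanos(() -> bubble_sort(for_bubble)) + " ns");
        System.out.println("Arrays.sort: " + time_nanos(() -> Arrays.sort(for_library)) + " ns");
    }
}
```

The numbers vary by machine and by run, which is exactly why they should be measured on the system in question instead of being guessed.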
With object-oriented programs, high levels of abstraction can be the cause of bottlenecks. Fewer layers of abstraction decrease the number of method calls and thus the calling overhead. Sometimes more tightly coupled objects can eliminate overhead (e.g., sending messages in internal format between computer systems, instead of in a standard text format).
The Spreadsheet Conundrum
The spreadsheet is an analogy for many design decisions you make during development. Consider the data on a spreadsheet, as shown in Figure 3-1. If you were to store the data in a linear manner in a file, you would need to decide whether to store the data by row or by column. Perhaps storing it by row seems most natural. What if programs that require the data in column order access it later? Row order makes that access inefficient.
Figure 3-1: Spreadsheet of CDDiscs and days
If you knew that future programs were going to use column order, you should have considered that in your initial code. However, if you cannot reasonably foretell in what order data will be accessed, you should not worry too much now about how it should be stored. Just document your assumptions and later on, if you have to change your approach, you will at least know why you did it the other way.
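The two linear layouts can be sketched as follows. The class `Grid` and its methods are hypothetical, and the grid is assumed to be rectangular and non-empty. The index methods show why reading a column from row-ordered data means jumping through the file instead of reading it sequentially.

```java
class Grid {
    // Stores the grid linearly, one complete row after another.
    static int[] store_by_row(int[][] grid) {
        int rows = grid.length, columns = grid[0].length;
        int[] linear = new int[rows * columns];
        for (int r = 0; r < rows; r++)
            for (int c = 0; c < columns; c++)
                linear[row_order_index(r, c, columns)] = grid[r][c];
        return linear;
    }

    // Stores the grid linearly, one complete column after another.
    static int[] store_by_column(int[][] grid) {
        int rows = grid.length, columns = grid[0].length;
        int[] linear = new int[rows * columns];
        for (int r = 0; r < rows; r++)
            for (int c = 0; c < columns; c++)
                linear[column_order_index(r, c, rows)] = grid[r][c];
        return linear;
    }

    // Position of a cell in row order: neighbors in a row are adjacent,
    // neighbors in a column are column_count positions apart.
    static int row_order_index(int row, int column, int column_count) {
        return row * column_count + column;
    }

    // Position of a cell in column order: neighbors in a column are adjacent,
    // neighbors in a row are row_count positions apart.
    static int column_order_index(int row, int column, int row_count) {
        return column * row_count + row;
    }
}
```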
Many facets of programs parallel the spreadsheet. For example, string resources and languages form a spreadsheet such as that shown in Figure 3-2.
Figure 3-2: Spreadsheet of resources and languages
Typically the data in Figure 3-2 is stored with strings stored sequentially for each language. If you will be adding more languages, having the data stored in that manner makes sense. However, if you are always adding more resources, but never adding more languages, it could be more efficient to store the strings sequentially by resource.
This spreadsheet conundrum is reflected in the organization of graphics packages. A package can be organized in two ways, which correspond to the rows and columns of Figure 3-3.
Tools Are Tools—Use Them Wisely
Many software development tools are available. They range from requirements documentation tools to modeling tools to integrated development environments (IDEs). Tools are wonderful. They can automate many processes and ensure consistency and integrity.
Particular features of IDEs or frameworks can influence how you develop your system. An IDE such as Microsoft Visual Studio makes it easy to develop handlers for graphical user interface (GUI) events. With a couple of mouse clicks, you can set up a function that is called when a button is clicked or when text is entered into an edit box. The IDE strongly suggests a stylistic pattern for handling and naming the functions. You are welcome to override that pattern if you have strong feelings concerning your own style. However, it is often easier to accept rather than to fight. It will add consistency to your programs. Other developers working on your system might not be as adamant about the style and might be much more willing to accept the default. So the default becomes the easiest form of consistent code.
An old adage says, "Use the right tool for the job." When programming particular aspects of a system, usually certain tools are designed to perform specific jobs easily. For example, Perl performs string processing; XSLT transforms XML from one form into another; Crystal Reports creates reports from databases. If you are familiar with the appropriate tool, it is quicker to code the operation with that tool. If you are not familiar with the tool and if the aspect of the program that requires the tool is relatively small, coding using the main development language might make more sense. For example, if you have a number of string manipulations, writing those manipulations in Perl makes sense. If you have only a few, code them in Java (if that's what you are using). Not only do you save tool-learning time, but also, the maintainers after you will not be required to know yet another language.
Chapter Four: Getting the Big Picture