Chapter 1. When to Rewrite an Application

This is a report about migrating frontend interfaces for web applications, so it might seem odd that it starts with a warning. Before going deep into the how of this guide, we encourage you to thoroughly consider whether a migration of your frontend interface is really something you need to do. There are many cases where your current application will be serving your needs well and should probably stay the way it is. This is a practical guide, and our first piece of practical advice is this: if it ain’t broke, don’t fix it.

Since you’ve picked up this report, you probably do have hankerings for fixing a codebase riddled with technical debt and rewriting it to something beautiful, clean, and productive. We won’t argue that your codebase is perfect. Honestly, if your business is successful, the chances of museum-worthy code should be vanishingly small. We know vast swathes of that application are probably too horrible to think about, that you need to triple your estimate whenever some hapless engineer needs to tweak something in the dragon zone, or worse, that there are areas that your team is too afraid to touch at all. Given the challenges your team is facing, when should you choose to live with them and set aside the migration guide you hope will make everything better? The following sections will help you decide.

When Your Existing Application Is Likely the Best Choice

There are three main scenarios where working within your existing application and not rewriting it is likely your best choice:

  • You are still finding product/market fit and your product needs are changing rapidly.
  • Your product is mature and won’t see much more development.
  • Regardless of product needs, your development team is still productive.

It can be tricky to identify which of these situations are applicable to you. The following sections give an overview of the most recognizable attributes of each situation, so you can identify the scenarios where keeping your existing application is probably a better choice than rewriting.

You Are Still Finding Product/Market Fit

If your business model and your product offering are changing rapidly, then now is not the time to undertake a migration to a new technology stack, unless you are so burdened by technical debt that your development speed is putting the business at risk.

Product/market fit describes the situation where you have found a sustainable business niche for your product; your core offering is no longer changing. A product that has found market fit is seeing sustained user and/or revenue growth.

If the following descriptions sound familiar, you are probably still finding product/market fit:

  • Your userbase is still small and most of your time is spent adjusting how your product works.
  • Your team is still small and lean: you can still whip together a new feature in a day and get it in front of users.
  • Your team has never woken up at night to try to scale up servers in a panic.

If you’re not yet at product/market fit, what matters most is fast prototyping so that you can get to a sustainable business niche faster. If you can still iterate quickly, then it makes more sense for you to focus on prototyping, and not spend time investing in a future you have not yet secured. First focus on making the immediate changes needed to get to the point where scale is a problem.

Atomic design (the concept of “building web applications like Lego”) promises a solid, reliable base from which you can iterate fast on your UI. That is a huge temptation for an early-stage team optimistically anticipating the need to scale up once their product is wildly successful.

We strongly caution against sacrificing critical short-run speed in favor of investing in a long-run future that your startup may, as a result of this misplaced investment, never see. Provided you are not in an extreme situation (and in an early stage company, with a newer codebase, extreme cases of technical debt tend to be rare) you should probably just muddle through it.

The Product Is Mature and Not Being Actively Developed

One of the main benefits of this migration strategy, and of using atomic design, is that it helps larger teams move more quickly and effectively. If your product is mature and your customers don’t want to see radical innovation that disrupts their established workflows, your business objective may be to maintain that product for the duration of its life without adding major new features.

In this case, a technical migration to a more productive codebase has relatively little upside. If there won’t be many new features to build, there isn’t much of a return on setting things up so that you can build them faster. Similar to the case of a product still finding market fit, the benefit of lining up your product and technical roadmaps is greatly diminished when your product roadmap has reached the end of the line.

The Engineering Team Is Still Productive

A major practical benefit of atomic design is that it allows larger teams (say, more than three engineers) to work together on shared code productively. Past a certain size, coordination challenges, or a slowing velocity of development as technical debt and complexity rise, start to reduce productivity. If your team is still working productively, then migration solves a problem you don’t have.

Small teams are often still very productive. If it’s just a couple of engineers each assigned to fairly isolated parts of a codebase, there’s much less need to coordinate work, and people don’t experience lengthy discussions and back-and-forth whenever a change that one team or person wants to make affects others. Software engineers are more likely to be working with code that they wrote, where they understand the decision history, and know how to navigate the more complex or confusing areas.

This is the typical stage where the tenets of The Mythical Man-Month¹ do not yet apply. A self-test for this: did the last software developer to join your team quickly and definitely increase the output of the team as a whole? If they did, and your small productive team is not likely to grow rapidly in the near future, a migration like the one outlined here is likely a premature optimization. If, on the other hand, you feel like you’re rapidly getting to the point where adding more people will run you into trouble, read on.

When a Rewrite Makes Sense

You’re no longer trying out prototypes, getting them in front of users, tweaking and iterating, testing again, or trying a different angle. You’ve done all that and you know exactly who your product serves and what it does. At this stage, your codebase is knuckling down for the long haul.

Your problem is not figuring out what it is or how to make it work, but how to make it scale. How can you grow your team? You need to onboard 10 new developers into this codebase this month. How can you keep them all from tripping over each other? How can you prevent regressions when the spaghetti code of your heavily revised and iterated prototype is now hitting some serious production traffic, and any one tweak here causes something presumably unrelated to break?

For frontend code, this problem is rife, as even the best-intentioned prototype often quickly ends up a stateful mess. In this rewrite guide, the scope we’re considering is a partial rewrite of the frontend interface of your application. While the same decision-making guidelines can be applied to other areas of the codebase, our focus is on when and how to rewrite the frontend provided that the data access and core business logic won’t need to change.

Declining Marginal Productivity of an Additional Engineer

The Mythical Man-Month describes the apparent paradox where adding an additional software engineer to a late project will tend to make that project even later. A 12-month project for one developer is not the same as a one-month project for 12 developers, since it is a myth that men (software developers are unfortunately referred to throughout the text as men) and months are interchangeable.

The reason “developer months” as a concept doesn’t work out in reality is the declining marginal productivity of each additional engineer. Going from no developers at all to having one developer brings about a huge gain in productivity. Adding a second and third person increases the overall output again, but not as much as going from nobody to somebody did. Eventually, a point comes where too many people working on a shared codebase leads to very little additional benefit. It’s even possible that adding more developers reduces the overall output of the whole, as the challenges of coordinating and onboarding more people in the same codebase rise. This is often referred to as Brooks’s law, after the author.
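One simplified way to picture why coordination cost rises faster than headcount is to count the pairs of people who might need to talk to each other, which grows as n(n-1)/2 for a team of n. The sketch below is only an illustration of that arithmetic, not a measurement of any real team; the function name `coordination_paths` and the chosen team sizes are assumptions made for this example.

```python
def coordination_paths(team_size: int) -> int:
    """Count the distinct pairs of people in a team of the given size.

    This is a simplified model: it assumes every pair of engineers may
    need to coordinate, which overstates the cost for well-isolated work.
    """
    if team_size < 0:
        raise ValueError("team_size must not be negative")
    return team_size * (team_size - 1) // 2


# Illustrative team sizes only (hypothetical, chosen for this example).
for size in (1, 2, 3, 6, 11):
    print(f"{size:>2} engineers -> {coordination_paths(size):>2} possible pairs")
```

Under this model, a team of 3 has 3 possible pairs, while a team of 11 has 55: the team is less than four times larger, but the number of potential conversations is more than 18 times larger.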

When the team becomes bigger, the issue of who should do what becomes a challenge and even the leanest methodology starts to take up a lot of time when it needs to be applied to many engineers on the same project. Once the team is at a scale where most people are reading and working with code they haven’t written themselves, understanding why things were done a specific way requires more back-and-forth. There is more discussion, and perhaps a need to consult documentation or Git history.

Lastly, how things should be done ends up taking more discussion. Different types of discussions work best with different group sizes, but research on group size indicates that conversations are rarely at their most effective with more than six participants. When the number of people who are affected by and want to participate in a discussion exceeds six, discussions take longer and the solutions found tend to be worse. All this is typically solved through adding more structure and process, which in itself takes time to agree on, makes onboarding lengthier, and needs to be revised as the team grows.

It can be difficult to separate the challenge of technical debt from a larger team struggling to coordinate. It takes time for teams to get bigger, and it also takes time for technical debt to build up. We won’t focus on making the separation between “problems caused by a growing team” and “problems owing to technical debt” because these two situations are co-occurring. There’s really no need to figure out how much of either problem you have. Chances are, it’s a mix of both, and migrating with atomic design can still be a helpful option for your team.

So far, this description has been anecdotal. You might find yourself nodding along as you’ve intuitively noticed a sense of lethargy come over your team or engineering organization as you’ve grown. But how do you know if it is really true? The past often is romanticized, and in startup culture the words “lean” and “scrappy” get thrown around as inherent qualities every would-be unicorn must embody, so if you’re in the tech startup world there’s a good chance your company likes to say you are (or were) lean and scrappy. How do you know whether your sense that things have degraded is true, or just a misleading memory of a past that never was?

This is the question we asked ourselves at Buffer. We had a frontend web team of 11 developers all working in a shared monolithic repository, trying to wrangle an unyielding knot of Backbone.js code with some newer React.js views. It felt like everything took much longer than it should have, and much longer than it used to take. Engineers were frustrated and feeling unproductive, and our product team was routinely baffled by relatively small features that took months to ship. There was a clear signal that the team was not set up in a productive way and we were running into classic coordination challenges, but it’s hard to make the leap to “We need to rewrite!” based on only qualitative evidence, passionately expressed as that may be. We wanted data.

To get data on productivity over time, we used GitHub data on commits over time. This has the flaw of using “number of commits” as the only productivity measure. While it’s slightly better than lines of code, we recognize that the number of commits is an inherently flawed way to view developer productivity: it’s the impact of those commits that matters. Since determining the impact of a commit from the commit itself is beyond the scope of this report, we’re taking number of commits as a proxy for how productive our team is, acknowledging the limitations of this approach.
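As a rough sketch of how such a measure could be computed from a local Git history (this is not the script used for the figures in this report), the example below counts commits and distinct authors per month. The helper names `read_git_log` and `commits_per_author`, the monthly grouping, the exclusion of merge commits, and the sample author names are all assumptions made for illustration.

```python
import subprocess
from collections import defaultdict


def read_git_log(repo_path="."):
    """Return one 'YYYY-MM<TAB>author name' line per commit.

    Assumptions of this sketch: merge commits are excluded, and authors
    are identified by name, so one person committing under two names is
    counted as two authors.
    """
    result = subprocess.run(
        ["git", "-C", repo_path, "log", "--no-merges",
         "--date=format:%Y-%m", "--pretty=format:%ad%x09%an"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout.splitlines()


def commits_per_author(log_lines):
    """Map each month to (commits, distinct authors, commits per author)."""
    commits = defaultdict(int)
    authors = defaultdict(set)
    for line in log_lines:
        if not line.strip():
            continue  # skip blank lines
        month, author = line.split("\t", 1)
        commits[month] += 1
        authors[month].add(author)
    return {
        month: (commits[month], len(authors[month]),
                commits[month] / len(authors[month]))
        for month in sorted(commits)
    }


# Hypothetical sample input in the same format read_git_log() produces.
sample_lines = [
    "2014-01\talice", "2014-01\talice", "2014-01\tbob",
    "2014-02\talice", "2014-02\tbob", "2014-02\tcarol",
]
for month, (n_commits, n_authors, ratio) in commits_per_author(sample_lines).items():
    print(month, n_commits, n_authors, round(ratio, 2))
```

To try it on a real repository, pass the output of `read_git_log("path/to/repo")` to `commits_per_author` instead of the sample lines. Grouping by month is a design choice; quarters or years would smooth out noise in small teams.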

Figure 1-1 shows the commits per author over time for the monolithic repository in which coding started to feel like swimming in syrup.

Figure 1-1. Commits per author over time

You can see commits per author trend down over time quite clearly: each author has pushed fewer commits than they did in the past. Since this compares people to their own past selves, each person remains at worst constant in skill. More likely, developers are becoming more skilled and experienced over time, but despite their learning, they are still committing less than they used to. 2015 and 2016 were both years the team grew rapidly. If you look at only 2014, you’d see a dip and rise, as a small team initially struggles, but then successfully onboards a few new hires and productivity rises again. In 2015 and 2016, we see a steady decline as the team grows, with no compensating rise after onboarding was complete.

Figure 1-2 shows commits per author relative to the number of authors. It’s notable that in this chart, as the number of authors rises, the number of commits per author falls.

There is a clear inverse relationship between how many developers work in a shared repository, and the productivity of each developer. Seeing commits per person decline over time, even as people realistically are likely gaining in skill over time, indicates that the environment is making developers decreasingly productive. This situation points to an external force holding our team back: an environment that had a rising cost of change.

Figure 1-2. Commits per author relative to the number of authors
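If you want a number to go with a chart like this for your own repository, one option is the Pearson correlation between the number of active authors and the commits per author in each period. The sketch below uses made-up data points, not Buffer’s figures, and the function name `pearson` is an assumption of this example; a strongly negative value is consistent with an inverse relationship but does not by itself show what causes it.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length, at least 2 points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sum((x - mean_x) ** 2 for x in xs)
    spread_y = sum((y - mean_y) ** 2 for y in ys)
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return covariance / sqrt(spread_x * spread_y)


# Hypothetical data: one point per period (NOT taken from the figures here).
authors_per_period = [3, 5, 8, 11]
commits_per_author_per_period = [40.0, 31.0, 22.0, 15.0]

r = pearson(authors_per_period, commits_per_author_per_period)
print(f"correlation between team size and commits per author: {r:.2f}")
```

A value close to -1 means commits per author fall almost in lockstep as authors are added; a value near 0 means team size explains little of the variation.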

The script to generate your own marginal productivity graphs from your Git history is available online.² If your team has gotten into this situation, and you still have a lot of development work ahead, a reset to a more productive environment makes sense.

In the next chapter, we’ll cover two different approaches for rewriting a codebase that has declining productivity, and show you how to do this in a low-risk and nondisruptive way.

1 Frederick P. Brooks Jr., The Mythical Man-Month: Essays on Software Engineering, Anniversary Edition (2nd ed.), Addison-Wesley Professional, 1995.

2 Scripts used to generate these graphs: http://bit.ly/2svVlLg.
