Chapter 4. Measure and Reduce Your Code’s Carbon Footprint

A wise person at Google once told me, “The internet is a building.” The massive data centers that make up the internet and store its zettabytes of data1 require electricity, water, and other resources to run and cool their servers. These resources come at a cost to our planet. Anything you do online — texting a friend, searching the web, streaming your favorite music, watching a YouTube video, or just leaving your browser open overnight — ultimately translates to emissions of greenhouse gases such as carbon dioxide that alter the Earth’s climate. It’s easy to forget this when you’re busy blasting space aliens or shopping online for bunny slippers.

Likewise, if you’re a software engineer, every piece of code you write has an associated impact on carbon emissions, called its carbon footprint. If your code is a small script that runs on one desktop computer, its carbon footprint may be tiny. If your code runs on millions of CPUs in a data center that gets its electricity from burning coal, its carbon footprint might be more significant. If your code is a smartphone app, its carbon footprint also encompasses the electricity to charge the phones that run the app each day. Your decisions about software design, in other words, are also decisions about carbon in our warming world. This chapter will help you understand the energy use and carbon footprint of your code and reduce them, a practice sometimes called “Green IT.” If you enjoy tweaking the performance of your code to run faster and consume fewer resources, you’ll find carbon tweaking quite similar.

Another way to address climate change is to join software projects that tackle specific challenges related to climate change. Within Google, for example, about 80 such projects are running at press time, some official and others staffed by passionate volunteers. Some of these projects are public and you can try them yourself, like the Environmental Insights Explorer, which helps cities measure their CO2 emissions, and Project Sunroof, which drives the spread of solar panels. Others are purely internal, like the data center cooling system described in the case study later in this chapter.

We’ll discuss:

  • The technology industry’s contribution to global greenhouse gas emissions

  • How to measure the carbon footprint of your code

  • Steps you can take to reduce your code’s carbon footprint

  • Contributing to climate-related software projects

Measuring Carbon Emissions

The entire world emits about 40–50 billion tons of CO2 per year, and the internet’s carbon footprint is about 1.0–1.5% of that number. This footprint primarily reflects the power and cooling needed for computer hardware.


Researchers have presented a variety of estimates on data center CO2 emissions. When it comes to training AI models, a widely publicized paper in 2021 concluded that the energy use for these activities was massive and unsustainable. A follow-up paper, however, found these conclusions relied on unrealistic estimates that were “100–100,000x higher than real carbon footprints.” I encourage you to read both papers and come to your own conclusions. In general, Google has found empirically that clever choices in the training infrastructure can “reduce the carbon footprint up to ~100–1000X.”

Note

If you’re an AI researcher who trains machine learning models, please measure the energy use and carbon footprint of your systems (starting with the tips in this chapter) and publish them in your research papers. Sharing your data will help other researchers better estimate the overall toll of AI on data centers.

In any event, as responsible software engineers, we should try to reduce the carbon footprint of the code we deploy to data centers. It’s the right thing to do.


To do that, we’ll need to review some basic facts about power.

Principles of Power


Think about how you charge your cell phone. You plug a charger into a wall outlet or a USB port, and the charger draws some amount of electrical power, measured in watts (abbreviated W). A typical phone charger might draw 5W of power. A power supply for a laptop might be 100W, a desktop computer might be in the range of 700–1000W, and a hair dryer could draw 1500W. The biggest power draws in modern homes are for heating and cooling, where appliances might run up to 5000W. As the numbers get larger, we divide watts by 1000 to get kilowatts (kW). A 5000W air conditioner draws 5 kW.

Now take a look at your home’s electric bill. It measures your energy usage in kilowatt hours (kWh). A kilowatt hour is simply 1,000 watts used continuously for one hour. Charging your 5-watt phone (0.005 kW) continuously for two hours is 0.01 kWh. Running your 5 kW air conditioner for two hours is 10 kWh. The average American house uses about 886 kWh per month.


Utility companies that produce electricity generally also emit CO2 unless they’re fueled by solar, wind, or other sustainable sources. We talk about carbon emissions in units called Carbon Dioxide Equivalent, or CO2e. It’s measured in metric tons. For reference, one metric ton of CO2e is roughly the emissions from driving 2,500 miles (4000 km) in a gasoline-powered car.

To calculate CO2e for code running continuously in a data center, you’ll need two numbers:

  1. The kWh consumed as your code runs in your data center

  2. A value for average carbon emissions per kWh

The first number, kWh, is the product of three values:

(# hours your code runs) × (# processors running your code) × (average power per processor)

So, if your code runs continuously in a data center for one year (8,760 hours) on 2 processors, and each processor uses (say) 95 watts on average, your kWh for processing would be (8760 x 2 x 95)/1000, which is about 1,664 kWh. Note that average power per processor can be challenging to calculate; see the sidebar “Average Power” for details.

This calculation is pretty rough and doesn’t take into account several important factors. Your code might not run 100% of the time. If that’s the case, multiply the result by the percentage of time your code runs. (You could even factor in the power used when processors are on but idle.) You can also refine the calculation by including the power used by RAM, storage media, and other server components.2 Finally, new cloud computing design patterns, such as serverless computing and function-as-a-service (FaaS), hold the promise to reduce energy use by clever allocation of resources behind the scenes.

The second number we need, average carbon emissions per kWh, is built into the Greenhouse Gas Equivalencies Calculator provided by the U.S. Environmental Protection Agency (EPA). Just plug in the kWh you calculated earlier, as in Figure 4-1, and see that the CO2e of 1,664 kWh is 0.72 metric tons. Those emissions are roughly equivalent to driving a car 1,845 miles (2,969 km), which is about the distance from San Francisco, California, to Kansas City, Missouri, or a round trip between Paris and Rome. Keep in mind that these emissions calculations are based on U.S. averages and would vary per data center.
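To make the arithmetic concrete, here’s a small Python sketch of the whole calculation. The function and its defaults are my own illustration: the factor of 0.000433 metric tons of CO2e per kWh is simply the U.S.-average value implied by the chapter’s example (0.72 tons for 1,664 kWh), and the 20% overhead multiplier comes from the informal estimate in footnote 2. Your data center’s actual numbers will differ.

```python
# Back-of-the-envelope CO2e estimate for code running in a data center.
# The default emission factor (~0.000433 metric tons CO2e per kWh) is a
# rough U.S. grid average; real grids vary widely.

def annual_co2e_tons(hours, processors, avg_watts,
                     utilization=1.0, overhead=1.2,
                     tons_per_kwh=0.000433):
    """Estimate metric tons of CO2e for a workload.

    utilization: fraction of the time the code actually runs
    overhead:    multiplier for RAM, storage, and other components (~20%)
    """
    kwh = hours * processors * avg_watts * utilization / 1000
    return kwh * overhead * tons_per_kwh

# The chapter's example: one year (8,760 hours) on 2 processors at
# 95 W each, with no utilization or overhead adjustments.
kwh = 8760 * 2 * 95 / 1000                            # about 1,664 kWh
print(annual_co2e_tons(8760, 2, 95, overhead=1.0))    # about 0.72 metric tons
```

Adjusting `utilization` and `overhead` lets you refine the rough estimate as described above.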

Figure 4-1. The EPA’s Greenhouse Gas Equivalencies Calculator.

Let’s up the game. A large data center may have millions of servers arranged in racks, drawing hundreds of millions of watts of power. These numbers are large enough that we stop talking about kilowatts and introduce megawatts (MW), or millions of watts. The greenhouse gas emissions from 1 megawatt hour (MWh) of energy use are roughly equivalent to driving 1,100 miles in a gasoline car or burning 485 pounds of coal. Draw just 1 MW continuously for a year (365 x 24 = 8,760 hours), and now we’re talking about 8,760 megawatt hours, or 3,800 metric tons of CO2e. That’s a year’s worth of emissions from 843 cars, which is a tiny fraction of the world’s 1+ billion cars, but it’s not zero either. And that’s just one year in one data center. I’m assuming here that the data center generates its power from fossil fuels, however, which might not be the case. I’ll discuss this more later.


And so far I’ve been measuring only one source of emissions. Let’s expand on that.

Beyond Direct Carbon Emissions


People who are deep into the research on greenhouse gas emissions separate them into three categories or scopes:

Scope 1

Direct emissions to the environment from stuff you own. An example is the emissions from burning fuel in your car, or emissions from your air conditioner refrigerant leaking (which has higher global warming potential than plain old CO2, so please fix any leaks pronto!).

Scope 2

Emissions from energy that you purchase to run your stuff. These emissions are associated with generating the energy you use for everything from running your lights to charging your phone. If you’re a company, your scope 2 emissions are related to the electricity you purchase to run your business, whether it’s for a small office or a giant data center.

Scope 3

Everything else. This includes emissions associated with your supply chain. If you make and sell stuff, this also includes the emissions associated with using and transporting your products. For most businesses, their scope 3 emissions likely dwarf their scope 1 and 2 emissions. Google’s 2023 environmental report, for example, notes that “Scope 3 emissions represent 75% of our carbon footprint.”

So, hey, do you make phones? Then the electricity someone uses to charge their phone is in your scope 3 and also in their scope 2. Did you just buy a server? The electricity and direct emissions associated with making that server are in your scope 3, and in the supply chain’s scope 1 and 2. Do you drive to work? The emissions associated with your car burning gas are in your scope 1 and the fuel company’s scope 3. If this all seems complicated, you’re not alone in thinking that, but the goal is to help businesses understand their direct and indirect emissions so they can focus on reducing both.

Controlling Your Code’s Carbon Footprint

Electricity-related emissions from data centers are just part of the total CO2e associated with software, but they offer a great opportunity for carbon savings. As a software engineer, you can help out by changing your code and managing your servers to use less energy or cleaner energy. Let’s discuss three primary areas where you may have control over your systems’ emissions:

  • Processor usage: How many processors you use, and how much time you use them for

  • Location: Which data centers run your code

  • Time of day: Which hours of the day your code runs

Controlling Processor Usage

If your code runs in a data center, it probably runs on multiple processors. But how many does it need? This number is generally under your (or your team’s) control, and if you reduce it, you can reduce CO2e. Here are three general approaches you could take, which I’ll discuss in turn:

  • Code optimizations, such as more efficient algorithms

  • More efficient allocation of computing resources (CPUs, disk, memory)

  • Running code on Tensor Processing Units (TPUs) and Graphics Processing Units (GPUs) when appropriate

Optimizations in your code can change your carbon footprint. We know this first-hand at Google. Even tiny reductions in processing, like copying strings more quickly in memory, can produce huge gains in energy efficiency when scaled across Google’s massive operations. Googlers who create the most impactful optimizations even receive awards, called Perfy Awards. (No cash, just bragging rights.)


A great way to optimize your code is to focus your attention on the most expensive functions: those that use the most CPU time and/or are called most often. To locate these potential performance hogs, use a code profiler: a tool that runs your code and computes statistics about function calls. The sidebar “Mini Case Study: Profiling an Application” presents an example. Once you’ve isolated those functions, try to optimize them. This could mean changing an algorithm or porting those functions to a higher-performance language. At Google, for example, when we write Python applications, we’ve been known to rewrite the frequently executed functions in C++ for greater performance.
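If you work in Python, the standard library’s cProfile module is one way to gather these statistics. Here’s a minimal sketch; `slow_function` and `workload` are made-up stand-ins for your real code.

```python
# A minimal profiling sketch using Python's built-in cProfile module.
import cProfile
import io
import pstats

def slow_function(n):
    # Deliberately inefficient: repeated string concatenation.
    s = ""
    for i in range(n):
        s += str(i)
    return s

def workload():
    for _ in range(100):
        slow_function(1000)

profiler = cProfile.Profile()
profiler.enable()
workload()
profiler.disable()

# Report the ten most expensive functions by cumulative time.
stream = io.StringIO()
stats = pstats.Stats(profiler, stream=stream)
stats.sort_stats("cumulative").print_stats(10)
print(stream.getvalue())
```

The report will show `slow_function` dominating the runtime, which tells you exactly where an optimization (such as `"".join()` instead of repeated concatenation) would pay off.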

Warning

Be aware of the tradeoff between optimizing your code and keeping it easy to read and understand. Too much optimization can produce convoluted code and lead to more and tougher bugs, wasting developer time and increasing maintenance costs.

Besides code optimization, look for opportunities to reduce the computing resources that are allocated to your application, such as processors, memory, and disk space. This technique is called capacity management.

Suppose your web application receives double its usual traffic on holidays. For this sort of seasonal application, you might be tempted to allocate CPUs to handle peak capacity and leave them in place all the time, just in case of load spikes. On non-holidays, which means most of the year, half your fleet of CPUs may be sitting idle, wasting energy and producing unnecessary CO2e. A good cloud provider will offer an API or dashboard to measure these sorts of wasted resources, such as the idle time of your CPUs, unused RAM, and unallocated disk space. (How many years’ worth of log files are you keeping? How many do you really need?) Use that data to manage your applications’ capacity thoughtfully.
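As a sketch of the idea, here’s a toy utilization check in Python. The fleet names, thresholds, and utilization figures are all invented for illustration; a real system would pull this data from your cloud provider’s monitoring API rather than a hardcoded dictionary.

```python
# Toy capacity-management check: flag fleets whose average CPU
# utilization is far below a healthy target level.

def overprovisioned(avg_utilization, target=0.60):
    """True if average utilization is less than half the target level."""
    return avg_utilization < target / 2

# Invented example data: fleet name -> average CPU utilization.
fleets = {
    "checkout-service": 0.55,
    "holiday-promo":    0.12,   # sized for peak, idle most of the year
    "search-frontend":  0.48,
}

for name, util in fleets.items():
    if overprovisioned(util):
        print(f"{name}: {util:.0%} average utilization; "
              "consider shrinking or autoscaling this fleet")
```

In this sketch, only the seasonally sized `holiday-promo` fleet gets flagged for right-sizing.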

Beyond code optimizations and capacity management, keep your eyes open for CPU processing that would be more efficient on other kinds of processors, such as a Graphics Processing Unit (GPU) or an optimized AI chip such as a Tensor Processing Unit (TPU). GPUs tend to use more energy than CPUs, but they’re much faster for certain tasks, like image and video processing, so their total energy consumption may be less overall than a CPU’s. TPUs, on the other hand, are much more efficient in energy and speed compared to CPUs. They’re optimized for training machine learning models. (Don’t train an ML model on CPUs if you can help it!) Your applications can achieve huge savings in speed, energy, and CO2e with the right processing units.

What about coding for performance?


Optimizations that reduce CPU cycles, RAM, disk space, and network traffic generally also reduce CO2e. Go for it!

Optimizations that just make your code run in less time, however, may or may not decrease CO2e. It depends on how you reduced the runtime. Did you write a more clever algorithm and run it on the same hardware? Then yes, that’s likely to reduce CO2e by reducing CPU cycles. Did you speed things up by adding more servers? Then your CO2e could increase because it’s running on more physical CPUs, even if it occupies them for less time. The calculation gets more complicated as CPUs gain many more cores, because computing with many cores in a single chip is typically very energy efficient.
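Here’s a toy calculation in Python that makes the distinction concrete. The wattages and runtimes are invented, but the arithmetic shows why a faster wall clock doesn’t guarantee lower energy use.

```python
# Why "faster" isn't always "greener": total energy for the same job
# run three ways. All power and time figures are illustrative.

def energy_kwh(servers, watts_per_server, hours):
    return servers * watts_per_server * hours / 1000

baseline     = energy_kwh(servers=2, watts_per_server=300, hours=10)  # 6.0 kWh
better_algo  = energy_kwh(servers=2, watts_per_server=300, hours=5)   # 3.0 kWh
more_servers = energy_kwh(servers=8, watts_per_server=300, hours=3)   # 7.2 kWh

print(baseline, better_algo, more_servers)
```

A better algorithm on the same hardware halves both runtime and energy, while throwing four times the servers at the job finishes sooner but burns more total energy (and thus more CO2e, all else being equal).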

Another decent measure of CO2e reduction, if you host your code with a third-party cloud provider, is cost. Look at the invoice you receive and pay each billing period. Higher cost generally means higher resource use and therefore higher CO2e. If you optimize your code and your cost drops, you’ve probably reduced CO2e too. Finally, check if your cloud provider has a tool for estimating CO2e, such as Google Cloud’s Carbon Footprint tool.

Controlling the Code’s Location


Running code in the cloud is, on average, more energy efficient than running it on premises, but all providers are not the same. Specifically, where does your provider get their electricity? Their local electrical grid might produce electricity by burning fossil fuels, which is a carbon-intensive technique, or it might include renewable energy sources, like hydroelectric, solar, and wind, with very low or no carbon emissions. The greener a data center, the more intensively you can use their computing resources with less environmental impact.

Tip

Check out app.electricitymaps.com to display the climate impact of electricity in specific regions of the world. Your cloud provider may also publish summary statistics for its data centers or regions, such as Google Cloud’s carbon data across GCP regions.

Besides CO2e, cloud providers may publish other metrics as points of comparison, which may help you to choose a relatively green provider. Here are a few from a Google research paper:

  • Carbon intensity: Tons of CO2e per megawatt hour. This tells you how clean a data center is, that is, how low or high their emissions are. Lower values are greener.

  • Power usage effectiveness (PUE): the center’s total energy use divided by the amount consumed by its computing hardware. This tells you how efficient a data center is with its energy. When PUE equals 1.1, for example (which happens to be Google’s average PUE in its data centers at press time), it means for every megawatt consumed by computing hardware, another 10% is spent elsewhere to operate the data center. Lower values are greener.

  • Carbon-free energy percentage (%CFE): The percentage of energy used by the data center that comes from carbon-free sources such as wind and solar power. This tells you how green a data center is. Higher values are greener. One hundred percent would mean zero CO2 emissions.
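To see how carbon intensity and PUE combine, here’s a small Python sketch comparing two hypothetical regions. All of the figures are invented for illustration; real values would come from your provider’s published statistics.

```python
# Comparing invented data center regions on carbon intensity,
# PUE, and carbon-free energy percentage.

def effective_tons_per_mwh(carbon_intensity, pue):
    """CO2e attributable to 1 MWh of *computing* load, once the
    data center's overhead (captured by PUE) is included."""
    return carbon_intensity * pue

regions = {
    # name: (tons CO2e per MWh, PUE, % carbon-free energy)
    "region-a": (0.45, 1.6, 30),
    "region-b": (0.10, 1.1, 85),
}

for name, (ci, pue, cfe) in regions.items():
    print(f"{name}: {effective_tons_per_mwh(ci, pue):.3f} tCO2e per MWh "
          f"of compute, {cfe}% carbon-free")
```

Lower carbon intensity, lower PUE, and higher %CFE all point the same way: in this made-up comparison, region-b is the greener place to run your code.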

Even within a single provider, its data centers may vary quite a bit when it comes to sustainability. For example, the cloud services offered by Amazon Web Services, Google Cloud, and Microsoft Azure are hosted in dozens of regions around the world, spanning dozens of countries with different norms, practices, and laws regarding the environment. If your cloud provider supports it, choose to run your code in their greener data centers. If your provider doesn’t publish enough information on this topic, ask for it. Let them know, as a paying customer, that the data is important for your decisions.


These terms are a hot topic, and experts are still debating how and when to use them. Nevertheless, let’s take a crack at their possible meanings at a high level.

Carbon neutral often means that the provider is buying credits, called carbon offsets, that reduce or prevent emissions globally. The reduced emissions may not necessarily be their own. If the total credits they purchase equal their own carbon footprint, they may say they are carbon neutral. 100% renewable energy usually means the provider is purchasing renewable energy in large enough quantities to match their annual electricity use, but they also may still have some carbon emissions. Finally, 24/7 carbon-free generally means the provider has addressed their emissions by sourcing clean energy for every hour of every day, in every grid where they operate. As an example, as of 2022, Google has matched 100% of the electricity consumption of its global operations with purchases of renewable energy annually since 2017. Google has also set a goal to address its scope 2 emissions associated with operational electricity use by 2030, by running on 24/7 carbon-free energy on every grid where it operates (making them 24/7 carbon-free for scope 2).

Optimizing For Time of Day


Data centers may use different sources of power at different times of day. You may be able to reduce your CO2e by running your code at times when low-carbon power is plentiful. This is particularly true for relatively large jobs, like training a machine learning model, which can take days or weeks and might not be time-sensitive. Schedule these jobs to run at low emissions times for the local grid.
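Here’s a sketch of what naive time-of-day scheduling could look like in Python. The hourly forecast values are invented; a real implementation would pull grid carbon-intensity forecasts from a service such as Electricity Maps or your cloud provider.

```python
# Naive time-of-day scheduling: given hourly carbon-intensity
# forecasts for the local grid, pick the cleanest window to start
# a flexible batch job.

def greenest_start_hour(hourly_intensity, job_hours):
    """Return the start hour minimizing total forecast intensity."""
    best_start, best_total = 0, float("inf")
    for start in range(len(hourly_intensity) - job_hours + 1):
        total = sum(hourly_intensity[start:start + job_hours])
        if total < best_total:
            best_start, best_total = start, total
    return best_start

# 24 invented hourly forecasts in grams CO2e per kWh; abundant solar
# power makes midday the cleanest stretch on this grid.
forecast = [500] * 9 + [300, 200, 150, 150, 200, 300] + [500] * 9

print(greenest_start_hour(forecast, job_hours=4))  # hour 10
```

On this made-up grid, a four-hour job is cleanest when started at hour 10, squarely in the solar-heavy midday window.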

One caveat: as technology improves, data centers are becoming smarter at scheduling jobs to run efficiently. Within Google, for example, software called the Carbon-Intelligent Computing System automatically delays certain compute tasks to run during less carbon-intensive times of day, based on predictions about the electrical grid. Your cloud provider might already be taking into account the time of day to reduce their costs. Ask them if your own time-of-day optimizations would be helpful or if it’s better to let their scheduling algorithms choose when to run your non-time-sensitive code.

Getting Involved


Even if we reduce or eliminate data center emissions from our own code, the world still has plenty of other climate change-related problems to solve. And by and large, technologists like us love solving problems. So what else can we do?

A great start is to join climate change-related projects where your technology skills can help. In the world of open source, GitHub maintains a list of climate change projects, and last time I looked there were over 1,000 projects listed. Also check out the Green Software Foundation, whose mission is “to reduce the total change in global carbon emissions associated with software.”

Closer to home, if you’re lucky, your employer might have an interest in such projects, especially if you work in Big Tech. Google, for example, has a sizable community of employees, called Anthropocene, who are passionate about mitigating climate change and its effects. Twice a year we hold a “climate fair” and invite all Googlers to learn about dozens of climate-related projects and join as volunteers. I’ve had the privilege of working on two such projects. The first is Project Sunroof, which calculates the costs and energy savings of rooftop solar panels at a given street address. It uses Google Maps APIs to render the intensity of sunlight on rooftops. I started out as a technical writer, updating Sunroof documentation, but soon transitioned to a related project, the Environmental Insights Explorer. This project provides greenhouse gas data to city governments so they can make informed decisions to reduce emissions, and I had the opportunity to write software on a small and enthusiastic team for a few hours a week.

If your company doesn’t have anything similar to Anthropocene, consider starting a special interest group yourself (if it makes sense, given your role and the company’s size and mission). It’s been my experience that most engineers “get it” about climate change and want to help; all they need is an opportunity. Also, talk to your management about reducing carbon or steering its business toward greener data centers. Saving energy usually means saving money.

If you happen to work for a cloud provider or in a data center, check if you’re publicizing your green statistics, such as carbon intensity, power usage effectiveness (PUE), and carbon-free energy percentage (%CFE). If not, consider whether it would benefit your business to provide these statistics to customers. Greener operation can be a competitive edge.

Case Study: Cooling a Data Center with AI

As mentioned earlier in the chapter, artificial intelligence (AI) programming is sometimes blamed for extreme processing loads in data centers. The following case study flips that story around to show how AI can be applied responsibly to save energy.

In the mid-2010s, Google was using lots of energy to cool its data centers. Curious Googlers began to wonder: could we save energy by using AI to control the temperature of our data centers? Not long before, the Google division DeepMind had created an AI, AlphaGo, that could beat human players at the board game Go. Could Google apply similar machine learning technology to beat humans at the game of climate control?

Data center cooling works as follows (or at least it did at the time). Chilled water is pumped into the data center to cool the building. The water warms in the process and is pumped to a chiller, which extracts the heat from the water and vents it elsewhere, and the process repeats.

Other companies had previously tried and failed to use AI to optimize the cost of cooling hardware. This time was different, however, because of the AlphaGo team’s expertise in a branch of machine learning called deep reinforcement learning.

Intuitively, reinforcement learning (RL) is like learning by trial and error. Good decisions by the AI are rewarded, and poor ones are penalized, according to some mathematical function. If all goes well, the AI converges on a set of rules of behavior that maximizes reward. This technique is popular for AI that plays games and solves puzzles. Little by little, the system moves its game pieces incrementally closer to a win or a solution, and it learns which moves work well in which states.

Deep reinforcement learning is a variety of RL. It begins by creating a model of the world (in this case, a model of the data center) by examining gobs of real-world data and applying supervised machine learning.3 Once the model is ready, the RL system is told to begin in some state, and it makes its first prediction. A software agent looks at the prediction and takes action, moving the model to a different state that, ideally, improves upon the previous state. This process repeats, the system progresses from state to state, and the agent gradually learns which actions work best in which states. The best outcome is for the RL system to reach a stable state or set point, such as an ideal temperature range for data center machinery, and to maintain that set point over time.
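To make the trial-and-error idea concrete, here’s a toy tabular Q-learning sketch in Python. It is vastly simpler than Google’s system: the “world” is a one-dimensional temperature that the agent nudges up or down, and the reward is highest inside an invented 18–22 degree band. But it shows an RL agent converging on a set-point band through rewards and penalties alone.

```python
# Toy reinforcement learning: an agent learns to hold a "temperature"
# inside a target band. Purely illustrative; real data center control
# uses deep RL over a learned model, not a tiny Q-table like this.
import random

random.seed(0)
actions = [-1, 0, 1]                 # cool, hold, heat (degrees per step)
q = {}                               # Q-table: (temperature, action) -> value
ALPHA, GAMMA, EPSILON = 0.1, 0.9, 0.2

def reward(temp):
    return 1.0 if 18 <= temp <= 22 else -1.0

def best_action(temp):
    return max(actions, key=lambda a: q.get((temp, a), 0.0))

for episode in range(4000):
    temp = random.randint(10, 30)    # random starts cover every state
    for _ in range(25):
        # Epsilon-greedy: mostly exploit, sometimes explore.
        a = random.choice(actions) if random.random() < EPSILON else best_action(temp)
        new_temp = max(10, min(30, temp + a))
        r = reward(new_temp)
        # Standard Q-learning update.
        old = q.get((temp, a), 0.0)
        future = max(q.get((new_temp, b), 0.0) for b in actions)
        q[(temp, a)] = old + ALPHA * (r + GAMMA * future - old)
        temp = new_temp

# The learned greedy policy cools when too hot and heats when too cold.
print(best_action(25), best_action(15))
```

Notice that the reward function alone shapes the behavior: nothing in the code says “cool down when hot,” yet the agent learns exactly that by being rewarded only inside the band.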

This process is tricky. If you’re not careful, the RL system can produce trivial results. The agent could learn, for example, that the ideal state is to shut off cooling altogether — producing unbeatable energy savings while the chips melt. The system also needs to compensate for real-world constraints. Turning chillers on and off frequently is expensive and can wear out the machinery.

To tackle these challenges, Google assembled a team including, at first, software engineers, AlphaGo research scientists, and mechanical engineers who design data centers. The team created a very simple, low-budget prototype, all duct tape and bubble gum. The AI system would issue predictions of set points for the cooling system, and the team would email the predictions to real live humans in the data center. If the system’s recommendation seemed reasonable, the humans would implement it by tweaking the cooling controls.

The cheap prototype worked pretty well, so in the next phase of the project, the team traveled to the data center and tested the system live. By co-locating with the data center personnel, the engineers learned about complex interactions among machines in the data center, which were not apparent from the data and which their model did not cover. (Recall from Chapter 3 that production systems can be way more complex than development and test systems.) As a result, the team learned a ton, saved time, and avoided all kinds of hiccups. In the end, the system reduced the energy used for data center cooling by up to 40%. This was a significant energy saving.

Once the system seemed to be stable and operating well, the team asked: what would happen if full control was turned over to the AI? (Cue dramatic music.) The team couldn’t be sure, so they carefully built a safety mechanism. This mechanism was based on constraints defined by Google’s data center operators. The software engineers programmed the safety mechanism to not trust the AI, and the AI didn’t trust the safety mechanism either, a setup known as a mutual distrust model. If the two systems disagreed about a recommendation, they would throw it away. Plus, at all times, human operators could override the AI.
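Here’s a sketch of the constraint-checking side of that idea in Python. The limits, names, and numbers are all invented; the point is that the safety check is coded independently of the AI and enforces only the operators’ hard limits.

```python
# Sketch of an independent safety check: accept the AI's recommendation
# only if a separately coded rule set agrees it is safe. All limits
# and values here are invented for illustration.

SAFE_SETPOINT_RANGE = (15.0, 27.0)   # degrees C, per operator constraints
MAX_CHANGE_PER_STEP = 2.0            # avoid thrashing the chillers

def verify(recommended, current):
    """Does NOT trust the AI's reasoning, only the operators' limits."""
    low, high = SAFE_SETPOINT_RANGE
    in_range = low <= recommended <= high
    gradual = abs(recommended - current) <= MAX_CHANGE_PER_STEP
    return in_range and gradual

def apply_setpoint(recommended, current):
    if verify(recommended, current):
        return recommended           # act on the recommendation
    return current                   # disagreement? discard it

print(apply_setpoint(21.0, 22.0))   # accepted: a small, in-range change
print(apply_setpoint(5.0, 22.0))    # rejected: out of range, kept at 22.0
```

In the same spirit as the mutual distrust model, neither side is authoritative: a recommendation takes effect only when the AI and the independent check agree, and the system otherwise keeps its current state.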

The resulting AI-controlled cooling system worked well enough for Google to implement it across a number of its data centers. In fact, it was one of the first major successes of reinforcement learning deployed in the real world. Over time, the system has delivered significant, consistent energy savings.

Let’s review why the data center cooling solution is a great example of responsible software engineering:

  • A responsible goal. Reduced cooling costs translate into reduced carbon emissions.

  • Cross-functional collaboration. The software engineers worked closely with research scientists from AlphaGo and the boots-on-the-ground workers in the data center. This arrangement simplified collaboration and built trust among professionals with diverse skill sets.

  • Focus on safety. The team kept a human expert in the loop to carry out, and later to confirm, the AI’s commands. Eventually, control was handed to the AI, but only after creating a mutual distrust model that would discard potentially risky recommendations.

Following Google’s lead, various companies today offer AI-based products for climate control in data centers. The problem is not fully solved, however, because solutions do not scale well from one data center to another. Many data centers have unique setups: they contain different equipment, and they generate and collect different data in different formats. As one Google engineer told me, “The biggest roadblock to mitigating climate change with artificial intelligence is data standardization.” The AI itself is relatively well-understood at this point. What’s harder is getting the data in the right format at a large scale. This is an area of active research.

Summary

Each of us can reduce our carbon footprint in all sorts of ways, like using less gasoline and electricity. People with software development skills can do even more. Start by measuring your code’s carbon footprint. If you don’t have enough data to measure effectively, advocate for your data centers to provide what you need. Once you understand your footprint, you can work to reduce it. And if you have time, consider getting involved in coding projects to mitigate climate change. The work can be deeply satisfying, and the ultimate goal is critical for our planet.

1 One zettabyte is 10²¹ bytes, or 1 billion terabytes.

2 An informal, back-of-the-envelope analysis by my team suggests that non-CPU components may increase power usage by about 20%.

3 Technically this is model-based RL. Not all RL systems use a model.
