Chapter 4. Data Parsimony
“Data is the new oil” was a common idiom in the early 2010s, used in the context of generating value from digital data. It also, unintentionally, captures the growing carbon footprint of storing and processing vast amounts of data. Lifecycle emissions for hard drive storage are estimated at anywhere between 2 and 20 kg CO2e per terabyte per year, as Figure 4-1 illustrates for commonly used storage devices.
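These per-terabyte figures make it easy to bound the storage footprint of a given dataset. The following sketch (illustrative only; the function name and the 100 TB corpus are our own assumptions, while the 2–20 kg CO2e per terabyte-year range comes from the estimate above) computes the low and high ends of that bound:

```python
# Bound the annual storage emissions of a dataset using the
# 2-20 kg CO2e per terabyte-year lifecycle range cited above
# for hard drive storage.

LOW_KG_PER_TB_YEAR = 2.0    # optimistic lifecycle estimate
HIGH_KG_PER_TB_YEAR = 20.0  # pessimistic lifecycle estimate

def storage_footprint_kg(terabytes: float, years: float = 1.0) -> tuple[float, float]:
    """Return (low, high) estimated emissions in kg CO2e for storing
    `terabytes` of data on hard drives for `years` years."""
    return (terabytes * years * LOW_KG_PER_TB_YEAR,
            terabytes * years * HIGH_KG_PER_TB_YEAR)

# Example: a hypothetical 100 TB training corpus retained for 3 years.
low, high = storage_footprint_kg(100, years=3)
print(f"{low:.0f}-{high:.0f} kg CO2e")  # prints "600-6000 kg CO2e"
```

Even this rough bound shows why trimming a corpus before long-term retention is worthwhile: halving the stored data halves the storage footprint across the entire range of estimates.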
Figure 4-1. Typical GHG emissions across the lifecycle of storage devices. (Source: Seagate Sustainability Report.)
Large-scale computation on massive amounts of data has been essential to progress in AI model development, with the most recent LLMs trained on datasets of more than 15 trillion data points (tokens).1 Not all of the data used to train ML models is informative, however. Uninformative or duplicate data contributes to the AI waste introduced in Chapter 3. Reducing the amount of data used can considerably lower the energy consumption and carbon footprint of selecting and developing AI models.
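One of the simplest ways to shrink a dataset without losing information is to remove exact (or near-exact) duplicates before training. As a minimal sketch, not a method from this book, here is hash-based exact deduplication of a text corpus; the light normalization (stripping whitespace, lowercasing) is an assumption about what counts as a duplicate:

```python
import hashlib

def deduplicate(samples: list[str]) -> list[str]:
    """Drop exact duplicates, keeping the first occurrence of each sample.

    Samples are compared via a SHA-256 hash of their normalized text
    (whitespace-stripped, lowercased), so trivially repeated records
    are removed without holding every full string in a comparison set.
    """
    seen: set[str] = set()
    unique: list[str] = []
    for text in samples:
        digest = hashlib.sha256(text.strip().lower().encode("utf-8")).hexdigest()
        if digest not in seen:
            seen.add(digest)
            unique.append(text)
    return unique

corpus = ["The cat sat.", "the cat sat.  ", "A dog ran."]
print(deduplicate(corpus))  # prints "['The cat sat.', 'A dog ran.']"
```

Exact deduplication only scratches the surface; the data-parsimony methods discussed in this chapter go further by identifying which of the remaining, non-duplicate points are actually informative.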
In this chapter, we introduce methods for identifying informative data points and extracting useful information from them. The chapter offers a paradigm for developing DL models while reducing AI waste from a data perspective, which we refer to as data parsimony. It ...