How AI’s Energy Appetite, Greenwashing, and Smarter Hardware Are Shaping the Climate Impact of Tech

The AI Emissions Crisis: What I’ve Learned About Tech’s Hidden Climate Cost

When I first started digging into AI’s environmental impact six months ago, I expected some sobering stats. What I didn’t anticipate was realizing my own complicity in this.

As someone who’s been shipping code to staging and production on cloud platforms (Google Cloud), optimizing product roadmaps and features (Notion), and designing UX/UI prototypes and mockups in cloud-based design tools (Figma)… I’ve been part of the problem. We all are.

Let me walk you through my journey from oblivious full-stack engineer and casual UX designer to carbon-conscious senior product manager.

The Invisible Energy Gulpers

Here’s what’s been keeping me up at night lately:

  • Training GPT-4 consumed more energy than my hometown used last year[50][51]
  • Your daily ChatGPT habit has the carbon footprint of 3,000 Google searches [52]
  • Data centers will double their electricity use by 2026, with AI driving 40% of that growth[53]

Major cloud providers’ emissions jumped 150% since 2020[11][17]. That’s like adding 10 million gas guzzlers to our roads annually. And before you ask—no, carbon offsets don’t make this magically disappear.

Why “100% Renewable” Claims Are Half-Truths (and Full Frustrations)

During my PM days, I loved slapping “powered by 100% renewables” on slide decks. Here’s what I wish I’d known:

The REC Shell Game

Renewable Energy Certificates (RECs) let companies claim green cred without actually using clean power. Think of it like buying carbon offsets for your private jet—technically “neutral,” practically pointless.

  • Amazon’s 2022 REC sleight-of-hand: 52% of their “renewable” claims came from buying certificates for existing solar/wind projects that were already powering other grids. Without this accounting trick, their emissions would’ve been 8.5M tons higher[30].
  • Microsoft’s Dublin gas plant: To power AI data centers, they built a 170MW gas facility that runs 8 hours daily—plus 150 diesel backups. Their “100% renewable” claim? Works great… if you ignore nights and cloudy days.

Case Studies in Creative Accounting

  1. Meta’s Nebraska Problem
    Their new data center forced locals to delay retiring a coal plant. Result? 400k tons of extra CO₂ annually—all while Meta touted REC-driven “carbon neutrality.”

  2. Google’s Time-Shifting Dance
    Their “24/7 clean energy” PPAs sound great until you realize they’re buying solar credits for noon power… while drawing from coal-heavy grids at midnight. It’s like claiming you only eat organic because you bought a CSA share—even though you raid 7-Eleven nightly.

  3. The PJM Experiment
    Google’s working with grid operators to use AI for aligning compute with renewable availability[11]. Early tests cut emissions 42% by syncing workloads with Texas wind patterns. But adoption’s stuck in pilot purgatory.

The Grid Reality Check

Here’s what “100% renewable” really means at 3AM:

| Company | Nighttime Grid Mix (Sample Region) | Backup Power Source |
|---|---|---|
| AWS Virginia | 34% coal, 22% gas | Diesel generators |
| Microsoft Dublin | 58% gas peakers | 150 diesel units |
| Meta Nebraska | 63% coal (delayed retirement) | N/A |

A 2022 Nature study found that REC-driven “emission reductions” are 42% fictional[30]. Companies are essentially paying to rename pollution, not prevent it.

What Actually Works (But Nobody Wants to Do)

  • PPAs with additionality: Building new solar/wind farms instead of renting existing ones (like Google’s Nevada geothermal deal[6])
  • Granular time matching: Hourly tracking vs. annual averages (requires painful grid integration)
  • Transparent reporting: Admit when you’re using gas peakers instead of hiding behind REC math
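To see why granular time matching changes the picture, here’s a toy calculation. The hourly profiles are made up for illustration; the point is that annual totals hide the 3AM deficit:

```python
# Toy comparison: annual-average vs hourly renewable matching.
# Profiles are invented, not real grid data.

consumption = [100, 100, 100, 100]   # MWh used in each hour of a toy "year"
renewables  = [250,  80,  40,  30]   # MWh of contracted solar in each hour

# Annual matching compares totals only: 400 MWh generated vs 400 MWh used,
# so the slide deck says "100% renewable".
annual_pct = min(sum(renewables) / sum(consumption), 1.0) * 100

# Hourly matching: the noon surplus can't power the 3AM servers, so only
# the overlap in each hour counts.
hourly_matched = sum(min(c, r) for c, r in zip(consumption, renewables))
hourly_pct = hourly_matched / sum(consumption) * 100

print(f"Annual matching: {annual_pct:.0f}%")
print(f"Hourly matching: {hourly_pct:.0f}%")
```

Same contracts, same totals, but hourly accounting reveals the fossil-powered hours that annual averages erase.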

“If your sustainability report doesn’t make you look bad, you’re lying.”

Cooling Wars: Silicon Valley’s New Arms Race

Modern AI chips put out more heat per square foot than a nuclear reactor[6]. Here’s how we’re (not) handling it:

| Cooling Method | Energy Savings | My Personal Verdict |
|---|---|---|
| Liquid Immersion | 40% | Smells like mineral oil, works like magic |
| Direct-to-Chip | 35% | Great until leaks happen |
| Good Ol’ AC | 0% | Basically climate arson |

The irony? We’re using AI to optimize cooling… which requires more AI. It’s an ouroboros with GPUs.

How We Can Dig Ourselves Out

After some research and awkward convos with ex-colleagues, I came up with a small list:

Make Models Leaner Than My Morning Coffee

Is milk good for me? I don’t know. I spent 45 days in Vietnam, where locals taught me the pleasure of a cold robusta brew over ice with a touch of plum at the bottom of the glass: no milk involved, just water. I’m a flat white advocate, so it’s crazy for me to tell you I loved that cold brew.

  • Pruning: Chopping redundant neural connections (like removing unused code)[7][15]
  • Quantization: Using 8-bit math instead of 32-bit (analogous to compressing JPEGs)[8][16]
  • Tiny Models: Specialized AIs that don’t try to solve everything (RIP my full-stack ego)

Pro Tip: The PyTorch pruning tutorial[18] is surprisingly approachable. Saved 30% inference costs on my side project.
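To make the first two tricks concrete, here’s a framework-free sketch of magnitude pruning and int8 quantization on a toy weight list. Real projects would use `torch.nn.utils.prune` and PyTorch’s quantization tooling; the helper names below are mine:

```python
# Hedged sketch: magnitude pruning + symmetric int8 quantization,
# on plain Python floats instead of real tensors.

def prune(weights, fraction):
    """Zero out the smallest-magnitude weights (magnitude pruning)."""
    k = int(len(weights) * fraction)
    threshold = sorted(abs(w) for w in weights)[k]
    return [0.0 if abs(w) < threshold else w for w in weights]

def quantize_int8(weights):
    """Map floats to int8 values plus a single float scale factor."""
    scale = max(abs(w) for w in weights) / 127 or 1.0
    return [round(w / scale) for w in weights], scale

weights = [0.9, -0.02, 0.4, 0.001, -0.7, 0.05]
pruned = prune(weights, 0.5)          # half the weights become zero
q, scale = quantize_int8(pruned)      # 8-bit ints + one float scale
restored = [v * scale for v in q]     # cheap dequantized approximation
```

Pruning buys you sparsity (zeros you can skip at inference time); quantization shrinks each remaining weight from 32 bits to 8, which is where the 4x memory and bandwidth savings come from.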

Time-Shifting Compute Like a Night Owl

I remember my grandmother telling me not to use the washing machine during the day because water and electricity would cost more money. It’s quite similar thinking here: we can train models during off-peak, low-carbon times. Here’s an example of scheduling model training with the Carbon-Aware SDK[12]. A simple condition like this could save some CO₂ emissions:

```python
from carbon_aware import scheduler

def train_model():
    if scheduler.is_low_carbon_window():
        ...  # crunch numbers: run the training step now
    else:
        ...  # chill until renewables spike, then check again
```

Caching Prompts

This is one you’ll hate, because it puts us all in the same basket, and as humans we don’t like that. We like to think we’re special, that we don’t prompt the same things as our neighbours. But we do.

When people interact with AI models, they often ask very similar or even identical questions—think “What’s the weather in Paris?” or “Summarize this article.” Just as web search engines cache popular queries to save on computation, modern AI systems employ prompt caching to optimize repeated requests. If a prompt (or its static prefix) has been processed recently, the model can skip re-computing the same context and retrieve a cached internal state or even a cached response, dramatically reducing both latency and energy usage. This is especially effective because many users tend to ask overlapping questions, much like how trending searches are cached by search engines.

For example, OpenAI and Anthropic both use prompt caching to store the processed state of common instructions or system prompts; when a new user issues a prompt with the same prefix, the system fast-forwards to the new part of the request, saving time and compute[55].

The cache typically persists for a few minutes, so if many users prompt the model with similar queries in a short period, the efficiency gains can be significant. As AI adoption grows and prompt patterns converge, prompt caching becomes an increasingly important technique for scaling LLMs sustainably and affordably.
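A minimal sketch of the idea: a time-limited cache in front of the model, keyed on the prompt. Production systems cache the model’s internal state for shared prefixes rather than whole responses, and the class and function names here are mine, not any provider’s API:

```python
# Toy prompt cache with a TTL, standing in for the provider-side
# prefix caches described above.

import time

class PromptCache:
    def __init__(self, ttl_seconds=300):
        self.ttl = ttl_seconds
        self.store = {}  # prompt -> (response, timestamp)

    def get(self, prompt):
        hit = self.store.get(prompt)
        if hit and time.time() - hit[1] < self.ttl:
            return hit[0]  # cache hit: no inference needed
        return None

    def put(self, prompt, response):
        self.store[prompt] = (response, time.time())

def answer(prompt, cache, model):
    cached = cache.get(prompt)
    if cached is not None:
        return cached
    response = model(prompt)   # expensive inference happens only on a miss
    cache.put(prompt, response)
    return response
```

The second identical prompt within the TTL never touches the model, which is exactly where the latency and energy savings come from.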

Hardware That Doesn’t Suck (And Actually Helps)

Through late-night datasheet deep dives and conversations with chip architects, I’ve compiled this arsenal of energy-conscious silicon:

The Efficiency Revolutionaries

  1. AC-Transformer Chips
    Hong Kong’s ACCESS lab built this game-changer that slashes Transformer model computations by 45x through hardware-friendly attention mechanisms[38]. Imagine running ChatGPT locally on your phone without melting it – that’s their goal.

  2. MAVERIC’s Robotic Brain
    UC Berkeley’s 16nm monster combines RISC-V cores with 13 AI accelerators, hitting 8 TOPS/W while drawing just 20mW[39]. Perfect for warehouse bots needing all-day runtime.

  3. Neuromorphic Night Owls
    IBM’s NorthPole chip mimics brain structure to achieve 5x better efficiency than GPUs on sparse data tasks[40]. Think surveillance cameras that only wake up when detecting anomalies.

The Photonic Vanguard

Taichi’s Light Speed AI
Tsinghua University’s photonic chiplet hits 160 TOPS/W – enough to run complex AI art generation at 1/100th the power of Nvidia GPUs[41]. Their secret? Using light instead of electrons for matrix math.

The Underdog Contenders

  1. ARMv9’s Edge Crusade
The new Cortex-A320 brings desktop-class AI to microcontrollers, running 1B-parameter models on **under 5W**[42]. Perfect for smart sensors needing local NLP.

  2. Habana’s Dark Horse
    Intel’s Gaudi2 triples performance/watt vs previous gen while handling 96GB HBM3 memory[43]. AWS uses these for cost-efficient LLM inference.

  3. Axelera’s Vision Maestro
    Their M.2 accelerator delivers 214 TOPS for video analytics while sipping power[44]. I’ve seen these crunch 3,200 FPS on ResNet-50 – insane for edge CCTV systems.

The Efficiency Extremists

  1. UESTC’s Whisper Chips
    These 2μJ/operation marvels enable voice control in noisy factories[45]. 95% accuracy even with machinery blaring – game-changer for hands-free manufacturing.

  2. Graphcore’s Memory Ninjas
    IPU-M2000 processes irregular data patterns 69.3% more efficiently than GPUs by eliminating memory bottlenecks[46][16]. Ideal for recommendation systems with sparse user data.

What You Can Do Today…

For Coders:

  1. Add carbon checks to CI/CD pipelines.
  2. Try codecarbon tracker - it’s like ESLint for emissions.
  3. Advocate for efficiency metrics in sprint planning (your PO might hate you, but you can blame me).
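Here’s what a carbon check in a pipeline could look like, as a hypothetical sketch: the budget value and `check_carbon_budget` name are mine, not from codecarbon or any standard tool, and in a real pipeline the measurement would come from codecarbon’s output rather than a hardcoded number:

```python
# Hypothetical CI gate: fail the pipeline when a job's measured emissions
# exceed a budget, the same way you'd fail on a coverage drop.

MAX_GRAMS_CO2 = 500.0  # per-pipeline emissions budget, tune per workload

def check_carbon_budget(measured_grams, budget=MAX_GRAMS_CO2):
    """Return 0 (pass) or 1 (fail), like any other CI status check."""
    if measured_grams > budget:
        print(f"FAIL: {measured_grams:.0f} gCO2 exceeds budget of {budget:.0f} gCO2")
        return 1
    print(f"OK: {measured_grams:.0f} gCO2 within budget")
    return 0

# In a real pipeline you'd read the measurement from codecarbon's
# emissions.csv and pass the result to sys.exit().
status = check_carbon_budget(320.0)
```

Treating emissions as a first-class build metric is what makes the “budget carbon” advice below actionable for engineers.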

For PMs:

  • Kill vanity features: Does anyone need real-time AI-generated cat emojis? Focus on what’s important to deliver; cut down the AI fluff.
  • Budget carbon: Treat emissions like AWS costs—you might even reduce costs by being more aware of these.

For Everyone Else:

  • Ask vendors/suppliers:
    • “What’s your gCO2/query?” “What’s your CO2e/cost?”
“Have you calculated the CO2 emissions (CO2e) of each of your products? Have you even heard of a Bill of Materials?”
    • “Do you use a simple tool like Green Effort to calculate and be aware of your scope 1, 2, 3 carbon footprint[49]?”
  • Support the AI Carbon Disclosure Act (yes, that’s a real bill in the US now)[19].
  • Be more conscious about which model you use:
    • Understand the AI models you have access to.
    • Switch to a lighter model when you can (GPT-4o-mini over o1).

The Road Ahead

We’re at a weird crossroads. The same week I read that AI could consume 21% of global electricity by 2030[9], I ran out of my Copilot quota trying to solve a coding issue (yes, I fell into one of those infernal vortexes where the LLM outputs the same thing over and over every 4–5 prompts). The tools that got us into this mess might help get us out, if we’re smart about how we use them; the problem is we’re not.

What gives me hope:

  • Google’s TPU v5 being 60% more efficient than standard GPUs[14]
  • Photoroom slashing emissions 87% without losing accuracy[10]
  • Open-source tools like Clover[4] making carbon-aware computing accessible
  • Green Effort helping every company—regardless of size or location—measure their carbon footprint effortlessly[49]

The lesson? Sustainability isn’t about sacrifice—it’s about better engineering. And maybe, just maybe, we can build an AI future that doesn’t roast the planet.

One more for that road:

“People are often curious about how much energy a ChatGPT query uses; the average query uses about 0.34 watt-hours, about what an oven would use in a little over one second, or a high-efficiency lightbulb would use in a couple of minutes. It also uses about 0.000085 gallons of water; roughly one fifteenth of a teaspoon.” – Sam Altman, CEO, OpenAI[54]

References

  1. https://accesspartnership.com/12-key-principles-for-sustainable-ai/
  2. https://www.sustamize.com/blog/6-ways-ai-can-help-reduce-carbon-emissions
  3. https://focalx.ai/ai/ai-energy-efficiency/
  4. https://arxiv.org/pdf/2304.09781.pdf
  5. https://www.51tocarbonzero.com/comparing-carbon-reporting-frameworks-secr-vs-other-global-standards/
  6. https://www.ufispace.com/company/blog/ai-workloads
  7. https://www.datature.io/blog/a-comprehensive-guide-to-neural-network-model-pruning
  8. https://www.byteplus.com/en/topic/519709
  9. https://greensoftware.foundation/articles/green-ai-position-paper/
  10. https://www.datacamp.com/blog/sustainable-ai
  11. https://www.equans.com/news/how-artificial-intelligence-is-driving-decarbonisation
  12. https://github.com/ksm26/Carbon-Aware-Computing-for-GenAI-Developers
  13. https://www.climafix.in/ref/cis/innovation/ai-powered-carbon-accounting-and-reporting-platforms/
  14. https://cloudian.com/guides/ai-infrastructure/6-types-of-ai-workloads-challenges-and-critical-best-practices/
  15. https://milvus.io/ai-quick-reference/what-is-model-pruning-in-neural-networks
  16. https://nicsefc.ee.tsinghua.edu.cn/%2Fnics_file%2Fpdf%2Fpublications%2F2020%2FASPDAC20_294.pdf
  17. https://www.michalsons.com/blog/sustainable-ai-five-strategies-for-future-conscious-organisations/64663
  18. https://www.ultralytics.com/glossary/model-pruning
  19. https://www.congress.gov/bill/118th-congress/house-bill/3831/text
  20. https://www.redk.net/5-strategies-to-leverage-ai-for-the-green-market-revolution/
  21. https://www.zendesk.hk/blog/ai-sustainability/
  22. https://www.innovationnewsnetwork.com/ai-energy-consumption-can-we-produce-efficient-models/45920/
  23. https://www.deeplearning.ai/short-courses/carbon-aware-computing-for-genai-developers/
  24. https://www.coursera.org/projects/carbon-aware-computing-for-genai-developers
  25. https://www.climatiq.io/blog/a-guide-to-carbon-aware-computing
  26. https://news.microsoft.com/de-ch/2023/01/10/carbon-aware-computing-whitepaper/
  27. https://www.byteplus.com/en/topic/520249
  28. https://annalsmcs.org/index.php/amcs/article/view/428
  29. https://www.cecam.org/workshop-details/greening-ai-with-software-engineering-1358
  30. https://thecooldown.com/2024/amazon-rec-greenwashing
  31. https://techerati.com/2023/12/14/microsoft-dublin-gas-plant
  32. https://reccessary.com/2024/07/unbundled-recs-greenwashing
  33. https://nature.com/articles/10.1038/s41586-022-04545-9
  34. https://technologyreview.com/2024/01/15/google-vs-amazon-renewable-claims
  35. https://asyousow.org/2024/coal-delay-meta-data-center
  36. https://itnews.com.au/2014/05/01/aws-transparency-failures
  37. https://theverge.com/2022/11/10/rec-myths-greenwashing
  38. https://www.innovationhub.hk/article/ac-transformer-high-energy-efficient-ai-accelerator-chip
  39. https://www.eenewseurope.com/en/low-power-ai-chip-for-robotics/
  40. https://arxiv.org/pdf/2310.00256.pdf
  41. https://www.photonics.com/Articles/Photonic_Chip_Enables_160_TOPS_W_Artificial/a69910
  42. https://www.eenewseurope.com/en/first-armv9-processor-for-edge-ai-is-fundamental-shift/
  43. https://www.techinsights.com/blog/habana-gaudi2-triples-performance
  44. https://axelera.ai/ai-accelerators
  45. https://www.techexplorist.com/world-most-energy-efficient-ai-chips-record-breaking-performance/82612/
  46. https://research.spec.org/icpe_proceedings/2022/companion/p145.pdf
  47. https://www.zmescience.com/science/news-science/how-much-energy-chatgpt-query-uses/
  48. https://epoch.ai/gradient-updates/how-much-energy-does-chatgpt-use
  49. https://greeneffort.com
  50. https://towardsdatascience.com/the-carbon-footprint-of-gpt-4-d6c676eb21ae/
  51. https://archive.md/2RQ8X
  52. https://www.sustainabilitybynumbers.com/p/carbon-footprint-chatgpt
  53. https://www.iea.org/news/ai-is-set-to-drive-surging-electricity-demand-from-data-centres-while-offering-the-potential-to-transform-how-the-energy-sector-works
  54. https://blog.samaltman.com/the-gentle-singularity
  55. https://apidog.com/blog/what-is-prompt-caching/

https://fry.pm/the-ai-carbon-footprint-crisis/
Author: Fry