AI Energy Research
Empirical benchmarking of energy consumption and carbon emissions across frontier LLMs on 5 MMLU domains. Establishes a per-query carbon cost framework with a 4.3× energy differential between model endpoints and 12.1% cross-domain bias in energy allocation. Foundation for the CCI service integrated into the ClinicalSearch pipeline.
AI systems consume energy at query time, but this cost is invisible by default. Organizations running AI pipelines make model selection decisions based on capability and price, rarely on carbon. This paper establishes a per-query carbon cost framework for frontier LLMs, benchmarked across 5 MMLU domains. The core finding is a 4.3× energy differential between model endpoints, which means that model selection is a carbon decision as much as a capability decision. A 12.1% cross-domain bias in energy allocation shows that domain context, not just model choice, affects carbon cost. The CCI metric is now integrated as a live service in the ClinicalSearch pipeline, where it reports real-time carbon cost per query alongside clinical evidence retrieval.
What is the per-query carbon cost differential across frontier LLMs on standardized benchmarks, and can a general-purpose CCI framework enable carbon-aware model selection in production AI pipelines?
Empirical benchmarking across frontier LLMs on 5 MMLU domain subsets; per-query energy measurement using inference provider APIs and hardware-level monitoring; carbon intensity calculation using regional grid carbon factors; cross-domain bias analysis across subject areas; CCI metric derivation as a composable service for downstream pipeline integration; validation against GPT-4 as benchmark baseline.
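The measurement-to-carbon step described above can be illustrated with a minimal sketch. It assumes per-query energy is already available in watt-hours and that a regional grid factor is given in grams CO2e per kWh. The names `QueryMeasurement`, `carbon_g_per_query`, `mean_energy_by_model`, and `energy_differential`, the model labels, and all numeric values are hypothetical and are not taken from the paper. The paper's actual CCI derivation and bias calculation are not reproduced here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class QueryMeasurement:
    model: str        # hypothetical endpoint label
    domain: str       # MMLU domain subset label
    energy_wh: float  # measured energy for one query, in watt-hours


def carbon_g_per_query(energy_wh: float, grid_g_per_kwh: float) -> float:
    """Convert per-query energy (Wh) to grams CO2e using a regional grid factor."""
    return energy_wh / 1000.0 * grid_g_per_kwh


def mean_energy_by_model(measurements):
    """Average per-query energy (Wh) for each model across all domains."""
    totals = {}
    for m in measurements:
        running_sum, count = totals.get(m.model, (0.0, 0))
        totals[m.model] = (running_sum + m.energy_wh, count + 1)
    return {model: s / n for model, (s, n) in totals.items()}


def energy_differential(means):
    """Ratio of the most to the least energy-intensive model."""
    return max(means.values()) / min(means.values())


# Illustrative numbers only (assumed, not measured results).
measurements = [
    QueryMeasurement("model_a", "law", 0.50),
    QueryMeasurement("model_a", "medicine", 0.75),
    QueryMeasurement("model_b", "law", 2.00),
    QueryMeasurement("model_b", "medicine", 3.00),
]

means = mean_energy_by_model(measurements)
ratio = energy_differential(means)
carbon = {
    model: carbon_g_per_query(wh, grid_g_per_kwh=400.0)
    for model, wh in means.items()
}
```

In a sketch like this, a carbon-aware selector would compare `carbon[model]` across candidate endpoints that meet a capability threshold and prefer the lower value. The grid factor would vary by the region hosting each endpoint.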
PUBLISHED
Energies · MDPI · 19(3), 642
DOI: 10.3390/en19030642
Suggested citation
Kaur, R., Kundu, T., Park, K. M., & Pinsky, E. (2026)
Team
Lead Researcher
Rashanjot Kaur
Designed benchmarking framework, CCI metric derivation, cross-domain bias analysis, and pipeline integration architecture.
Skills: LLM Benchmarking, Energy Measurement, Carbon Modeling, Research Design
Co-author
Kundu T.
Contributed to experimental design and results analysis.
Skills: ML Research, Empirical Benchmarking
Faculty Advisor / Co-author
Prof. Kathleen Park
Supervised research direction and academic positioning.
Skills: Operations Research, Sustainability, Academic Mentorship
Faculty Advisor / Co-author
Prof. Eugene Pinsky
Supervised methodology and paper submission.
Skills: Computer Science, AI Research, Academic Mentorship