
# 👋 Hi, I'm Hongping Zhang

**Independent AI Researcher | Energy Efficiency & Sustainable Computing**


## 🎯 Core Assets

| Asset | Type | Impact | Link |
| --- | --- | --- | --- |
| 🤗 HuggingFace Optimum Integration | Official Documentation | Trusted by thousands of HF developers | View Docs → |
| 📊 Complete Energy Dataset | Research Benchmark | 360+ configurations, 5 precision methods | Explore Data → |
| 🦾 EcoCompute AI Assistant | Interactive Tool | Conversational energy advisor on ClawHub | Try EcoCompute → |
| 🏛️ MLCommons Power WG Discussion | Industry Recognition | Invited to contribute to MLPerf power measurement standards | View Discussion → |

## 🔬 Core Discovery

> Quantization only saves energy for models larger than 3.2–4.6B parameters.
> For smaller models, FP16 is actually more energy-efficient.
> Measured on RTX 4090D, RTX 5090, and A800 with NVML power sampling.

This finding challenges the default assumption that "quantize everything = green." Our benchmark data is open and reproducible.
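For readers who want to reproduce this kind of measurement, here is a minimal sketch of NVML-based power sampling. It uses the `pynvml` bindings (`pip install nvidia-ml-py`); the 100 ms sampling interval and trapezoidal integration are illustrative assumptions, not necessarily the exact setup behind the numbers above.

```python
import time


def integrate_power(samples):
    """Trapezoidal integration of (time_s, power_w) samples -> energy in joules."""
    energy = 0.0
    for (t0, p0), (t1, p1) in zip(samples, samples[1:]):
        energy += 0.5 * (p0 + p1) * (t1 - t0)
    return energy


def nvml_power_samples(duration_s, interval_s=0.1, gpu_index=0):
    """Collect (timestamp, watts) pairs from NVML for `duration_s` seconds."""
    import pynvml  # requires an NVIDIA GPU and the nvidia-ml-py package

    pynvml.nvmlInit()
    try:
        handle = pynvml.nvmlDeviceGetHandleByIndex(gpu_index)
        t_end = time.monotonic() + duration_s
        samples = []
        while time.monotonic() < t_end:
            watts = pynvml.nvmlDeviceGetPowerUsage(handle) / 1000.0  # mW -> W
            samples.append((time.monotonic(), watts))
            time.sleep(interval_s)
        return samples
    finally:
        pynvml.nvmlShutdown()
```

Run the sampler in a background thread while the inference workload executes, then pass the samples to `integrate_power` to get joules per run.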

**Key Findings:**

- NF4 crossover: 3.2–3.9B parameters (hardware-dependent)
- INT8 crossover: 4.0–4.6B parameters (hardware-dependent)
- Below threshold: quantization adds 25–55% energy overhead
- Above threshold: quantization saves 15–23% energy
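The crossover rule above can be encoded as a quick pre-flight check. This is a hypothetical helper, not part of the EcoCompute tooling; the thresholds come from the findings listed here, and the overhead/savings figures are the published ranges.

```python
# Crossover thresholds in billions of parameters, per the findings above.
# Each range is hardware-dependent (RTX 4090D / RTX 5090 / A800 spread).
CROSSOVER_B_PARAMS = {
    "nf4": (3.2, 3.9),
    "int8": (4.0, 4.6),
}


def quantization_verdict(params_billions: float, method: str) -> str:
    """Rough verdict on whether quantizing a model is likely to save energy."""
    lo, hi = CROSSOVER_B_PARAMS[method]
    if params_billions < lo:
        return "likely wastes energy (25-55% overhead vs FP16)"
    if params_billions > hi:
        return "likely saves energy (15-23% savings vs FP16)"
    return "inside the hardware-dependent crossover zone; benchmark your own GPU"


print(quantization_verdict(1.3, "nf4"))   # small model: overhead expected
print(quantization_verdict(7.0, "int8"))  # large model: savings expected
```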

## 🚀 Try It Now

- 🌐 **Live Demo**: EcoCompute Dashboard →
- 📊 **What it does**: Compare AI models by Accuracy × Cost × Carbon in one dashboard
- ⚡ **Data source**: Real GPU benchmarks (PyTorch 2.10 + CUDA 12.8, 10 runs per config)

## 📈 Recognition & Impact

| Achievement | Details |
| --- | --- |
| 🤗 HuggingFace Official | Quantization energy findings integrated into Optimum documentation |
| 🏛️ MLCommons Invited | Contributing to MLPerf Power Working Group on quantization energy metrics |
| 📊 Open Dataset | 360 configurations: 270 analyzed + 90 FP8 reserved for future work |
| 🌍 Zenodo Archive | Permanent DOI: 10.5281/zenodo.18900289 |
| 📝 Research Paper | "When Does Quantization Save Energy?" (arXiv submission in progress) |

## 🎯 2026 Roadmap

- ✅ **HuggingFace Integration**: official documentation published
- ✅ **MLCommons Engagement**: invited to the Power Working Group
- 🔄 **arXiv Publication**: seeking endorsement for cs.LG submission
- 🛡️ **VS Code Extension**: real-time energy linting before code merges
- 🤝 **Enterprise Pilots**: seeking design partners for carbon-aware CI/CD

## 💚 How You Can Help

I'm looking for design partners, early adopters, arXiv endorsers, and grant sponsors to take EcoCompute from research to production.

| Action | Link |
| --- | --- |
| ⭐ Star the repo | quantization-energy-crossover |
| 🌐 Try the demo | Live Dashboard → |
| 📧 arXiv endorsement | Email me → |
| 🤝 Become a design partner | Email me → |
| 💼 Invest / grant | Email me → |

## 📚 Key Publications & Resources

🌍 *Making AI development more sustainable, one model at a time.*
