POME: Post Optimization Model Edit via Muon-style Projection

Introduction

We introduce Post-Optimization Model Edit (POME), a new algorithm that enhances the performance of fine-tuned large language models using only their pretrained and fine-tuned checkpoints, without requiring extra data or further optimization. The core idea is to apply a muon-style projection to $\Delta W$, the difference between the fine-tuned and pretrained weights. This projection uses truncated singular value decomposition (SVD) to equalize the influence of dominant update directions and prune small singular values, which often represent noise. As a simple post-processing step, POME is completely decoupled from the training pipeline. It requires zero modifications and imposes no overhead, making it universally compatible with any optimizer or distributed framework.
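The projection described above can be sketched in a few lines. Note this is an illustrative sketch, not the implementation in `pome.py`: the function name `pome_edit` and the exact way singular values are equalized and rescaled are our assumptions, with parameter names chosen to mirror the `--alpha` and `--truncation` flags used below.

```python
import numpy as np

def pome_edit(w_pre, w_ft, alpha=1.0, truncation=0.5):
    """Illustrative muon-style projection of the fine-tuning delta.

    Keeps the top fraction of singular directions (pruning the small,
    noisy singular values) and equalizes the surviving directions
    before adding the edited delta back to the pretrained weights.
    """
    delta = w_ft - w_pre                          # Delta W between checkpoints
    u, s, vh = np.linalg.svd(delta, full_matrices=False)
    k = max(1, int(truncation * s.size))          # truncate small singular values
    s_eq = np.full(k, s[:k].mean())               # equalize dominant directions
    delta_proj = (u[:, :k] * s_eq) @ vh[:k]       # rank-k projected update
    return w_pre + alpha * delta_proj             # rescale and merge
```

Because this runs on the weight difference alone, it needs no data, gradients, or optimizer state, which is what makes POME a pure post-processing step.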

Latest News šŸ”„

  • [2025/10] šŸ”„ We propose POME [arxiv] [HuggingFace], a simple and efficient method for post-optimization model editing.

1. MetaMathQA

Prerequisites:

  • Python >= 3.10
  • PyTorch >= 2.5.1
  • CUDA >= 11.6

We strongly recommend using Anaconda to create a new environment (Python >= 3.10) to run our examples.
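For example, a fresh environment could be created as follows (the environment name `pome` is our choice, and the PyTorch pin simply mirrors the prerequisites above):

```shell
# Create and activate a fresh Python 3.10 environment
conda create -n pome python=3.10 -y
conda activate pome

# Install a PyTorch build matching your CUDA toolkit
pip install "torch>=2.5.1"
```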

Clone POME and install the required packages:

git clone https://github.com/NUS-HPC-AI-Lab/POME.git
cd POME/metamath
pip install -r requirements.txt

POME:

# For LLaMA2-7B
BASE_MODEL='meta-llama/Llama-2-7b-hf'
MODEL="your_model_path"
OUTPUT="output_path"
python pome.py --base_model $BASE_MODEL --model $MODEL --output_path $OUTPUT --alpha 2.6 --truncation 0.5 --layer up_proj

# For LLaMA3-8B
BASE_MODEL='meta-llama/Meta-Llama-3-8B'
MODEL="your_model_path"
OUTPUT="output_path"
python pome.py --base_model $BASE_MODEL --model $MODEL --output_path $OUTPUT --alpha 1.5 --truncation 0.5 --layer up_proj

# For Gemma2-9B
BASE_MODEL='google/gemma-2-9b'
MODEL="your_model_path"
OUTPUT="output_path"
python pome.py --base_model $BASE_MODEL --model $MODEL --output_path $OUTPUT --alpha 1.2 --truncation 0.5 --layer up_proj

Evaluation on GSM8K and MATH:

cd metamath
bash eval_gsm8k.sh --model $OUTPUT --data_file ./data/test/GSM8K_test.jsonl
bash eval_math.sh --model $OUTPUT --data_file ./data/test/MATH_test.jsonl

You can also directly download the models from Hugging Face and then evaluate them:

| Model | MetaMathQA | MetaMathQA+POME |
| --- | --- | --- |
| LLaMA2-7B | šŸ¤— HuggingFace | šŸ¤— HuggingFace |
| LLaMA3-8B | šŸ¤— HuggingFace | šŸ¤— HuggingFace |
| Gemma2-9B | šŸ¤— HuggingFace | šŸ¤— HuggingFace |

Results:

| Model | Task | Adam | +POME |
| --- | --- | --- | --- |
| LLaMA2-7B | GSM8K | 67.2 | 69.7 |
| LLaMA2-7B | MATH | 19.4 | 19.7 |
| LLaMA3-8B | GSM8K | 80.3 | 81.4 |
| LLaMA3-8B | MATH | 31.5 | 32.7 |
| Gemma2-9B | GSM8K | 82.2 | 83.3 |
| Gemma2-9B | MATH | 36.1 | 37.3 |

2. Code Generation

Coming soon.

Citation

@misc{liu2025pomepostoptimizationmodel,
      title={POME: Post Optimization Model Edit via Muon-style Projection}, 
      author={Yong Liu and Di Fu and Yang Luo and Zirui Zhu and Minhao Cheng and Cho-Jui Hsieh and Yang You},
      year={2025},
      eprint={2510.06627},
      archivePrefix={arXiv},
      primaryClass={cs.LG},
      url={https://arxiv.org/abs/2510.06627}, 
}
