
# Breaking Block Boundaries: Anchor-based History-stable Decoding for Diffusion Large Language Models


This is the official implementation of AHD (Anchor-based History-stable Decoding), a training-free, plug-and-play dynamic decoding strategy for Diffusion Large Language Models (dLLMs).
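For context on the "block boundaries" in the title: standard dLLM decoding (the Semi-AR baseline in this repo) unmasks tokens block by block, finishing each block before the next one starts. Below is a toy, model-free sketch of that blockwise unmasking; random numbers stand in for model confidences, and the function names are illustrative, not from this codebase. It shows the hard block boundaries that AHD is designed to relax, not AHD itself.

```python
import random

MASK = "<mask>"

def semi_ar_decode(gen_length=8, block_length=4):
    """Toy semi-autoregressive unmasking: each block must be fully
    decoded before decoding moves past its boundary."""
    seq = [MASK] * gen_length
    for start in range(0, gen_length, block_length):
        block = list(range(start, min(start + block_length, gen_length)))
        while any(seq[i] == MASK for i in block):
            masked = [i for i in block if seq[i] == MASK]
            # A real dLLM ranks positions by model confidence;
            # random scores stand in for that here.
            ranked = sorted(masked, key=lambda _: random.random())
            for i in ranked[: max(1, len(masked) // 2)]:
                seq[i] = f"tok{i}"  # commit ("unmask") the chosen positions
        # Hard block boundary: every token in this block is now frozen.
    return seq

print(semi_ar_decode())
```

AHD replaces this rigid per-block schedule with a training-free dynamic one, so decoding is not forced to stall at a boundary while low-confidence positions inside a block remain unresolved.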


## 📁 Project Structure

```
open-dLLM-compress/
├── llada/                  # AHD on LLaDA-8B-Instruct
│   ├── generate_AHD_acc.py # AHD decoding implementation
│   ├── generate.py         # Baseline (Semi-AR) decoding
│   ├── eval_llada.py       # Evaluation harness wrapper
│   ├── eval_*.sh           # Evaluation scripts for each benchmark
│   └── model/              # LLaDA model definition
├── llada1.5/               # AHD on LLaDA-1.5 (same structure as llada/)
├── MMADA/                  # AHD on MMaDA (vision-language)
│   ├── models/             # MMaDA model with AHD integration
│   ├── scripts/            # Evaluation scripts
│   ├── lmms_eval/          # lmms-eval framework
│   └── generate_demo.py    # Quick demo
├── DIFFA/                  # AHD on DIFFA (audio-language)
│   ├── src/                # DIFFA model and AHD audio decoding
│   ├── inference_voicebench.py
│   └── voicebench/         # VoiceBench evaluation
├── assets/                 # Figures
├── LICENSE
└── README.md
```

## 🧪 Evaluation of AHD on LLaDA & LLaDA-1.5

### ⚙️ Models

| Model Name | Hugging Face Repo | Local Path |
|---|---|---|
| LLaDA-8B-Instruct | GSAI-ML/LLaDA-8B-Instruct | ./Models/LLaDA-8B-Instruct/ |
| LLaDA-1.5 | GSAI-ML/LLaDA-1.5 | ./Models/LLaDA-1.5/ |

### 📦 Dependencies

```shell
cd llada  # or cd llada1.5
conda create -n llada python=3.12
conda activate llada
pip install -r requirements.txt
```

### 🔧 Quick Demo

Please make sure to set the correct model path in `generate_AHD_acc.py`.

```shell
python generate_AHD_acc.py
```

### 🔨 Evaluation

Supported Benchmarks:

| Benchmark | Script | Few-shot |
|---|---|---|
| BBH | eval_bbh.sh | 3 |
| MMLU-Pro | eval_mmlu_pro.sh | 0 |
| HumanEval | eval_humaneval.sh | 0 |
| MBPP | eval_mbpp.sh | 3 |
| MATH | eval_math.sh | 3 |
| ASDiv | eval_asdiv.sh | 0 |
| TruthfulQA | eval_truthqa.sh | 0 |

```shell
sh eval_bbh.sh
sh eval_mmlu_pro.sh
sh eval_humaneval.sh
sh eval_mbpp.sh
sh eval_math.sh
sh eval_asdiv.sh
sh eval_truthqa.sh
```

> [!TIP]
> Each script contains both the Baseline and AHD settings. You can configure `length`, `block_length`, `num_fewshot`, and the AHD-specific hyperparameters (`kl_threshold_AHD`, `history_length_AHD`, etc.) directly in the scripts.
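Assuming the scripts expose these settings as plain shell variables (the exact variable names and defaults may differ; check each `eval_*.sh` in this repo), a configuration block might look like this. All values below are illustrative placeholders, not the repo's defaults.

```shell
# Illustrative values only -- consult eval_*.sh for the actual defaults.
length=256              # generation length
block_length=32         # decoding block size
num_fewshot=3           # few-shot examples per task
kl_threshold_AHD=0.01   # AHD-specific: anchor stability threshold (placeholder)
history_length_AHD=4    # AHD-specific: history window (placeholder)

echo "block_length=${block_length}, few-shot=${num_fewshot}"
```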

> [!NOTE]
> HumanEval requires post-processing:
>
> ```shell
> python postprocess_code.py {samples_xxx.jsonl}
> ```

## 🧪 Evaluation of AHD on MMADA

### ⚙️ Models

| Model Name | Hugging Face Repo | Local Path |
|---|---|---|
| MMaDA-8B-MixCoT | Gen-Verse/MMaDA-8B-MixCoT | ./Models/MMaDA-8B-MixCoT/ |

### 📦 Dependencies

```shell
cd MMADA
conda create -n mmada python=3.11
conda activate mmada
pip install -r requirements.txt
cd lmms_eval
uv pip install -e .
```

### 🔧 Quick Demo

Please make sure to set the correct model path in `generate_demo.py`.

```shell
python generate_demo.py
```

### 🔑 Environment Variable Configuration

Some evaluation tasks use an LLM as a judge (e.g., GPT). Please configure the following environment variables before running evaluation:

```shell
export OPENAI_API_KEY="YOUR_OPENAI_API_KEY"
export API_TYPE="openai"
export OPENAI_API_URL="https://api.openai.com/v1/chat/completions"
```

### 🔨 Evaluation

Supported Benchmarks:

| Benchmark | Task Name |
|---|---|
| MathVista-mini | mathvista_testmini_mmada |
| MathVision | mathvision_test_mmada |
| ScienceQA-Img | scienceqa_img_mmada |
| GQA | gqa |
| MME | mme |

```shell
cd ..
bash scripts/eval_baseline.sh
bash scripts/eval_AHD.sh
```

> [!TIP]
> You can configure the following hyperparameters in the scripts above: `GEN_LENGTH`, `DIFF_STEP`, `BLOCK_LENGTH`, `NGPU`.
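Assuming these appear as shell variables in `scripts/eval_baseline.sh` / `scripts/eval_AHD.sh`, a hypothetical configuration block follows. The values are placeholders, not the paper's settings; consult the scripts for the actual defaults.

```shell
# Placeholder values -- not the paper's settings.
GEN_LENGTH=1024   # total tokens to generate
DIFF_STEP=256     # number of diffusion steps
BLOCK_LENGTH=128  # decoding block size
NGPU=4            # GPUs used for evaluation

echo "Generating ${GEN_LENGTH} tokens in blocks of ${BLOCK_LENGTH} on ${NGPU} GPU(s)"
```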

> [!NOTE]
> The default LLM-judge model used in this paper is `gpt-4.1-mini`.


## 🧪 Evaluation of AHD on DIFFA

### ⚙️ Models

| Model Name | Hugging Face Repo | Local Path |
|---|---|---|
| Whisper-Small | openai/whisper-small | ./DIFFA/whisper/ |
| DIFFA | zhoujiaming777/DIFFA | ./DIFFA/checkpoint-diffa/ |
| LLaDA-8B-Instruct | GSAI-ML/LLaDA-8B-Instruct | ./DIFFA/LLaDA-8B-Instruct/ |

### 📦 Dependencies

```shell
cd DIFFA
conda create -n diffa python=3.10
conda activate diffa
pip install -r requirements.txt
```

### 🔍 Inference

```shell
python inference_voicebench.py \
    --model_path path/to/DIFFA/checkpoint-diffa \
    --whisper_path path/to/DIFFA/whisper \
    --llm_path path/to/DIFFA/LLaDA-8B-Instruct \
    --data openbookqa \
    --generation_method AHD
```

> [!TIP]
> - Datasets: `openbookqa`, `bbh`, `alpacaeval`, `wildvoice`, `commoneval`
> - Methods: `Vanilla`, `AHD`
> - Key arguments: `--steps`, `--block_length`, `--max_new_tokens`
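Combining the inference command with the key decoding arguments might look like the following. The flag names come from the tip above; the numeric values are illustrative placeholders, and the paths must point at your local checkpoints (check `inference_voicebench.py` for the actual defaults):

```shell
# Illustrative invocation -- values are placeholders, paths are local checkpoints.
python inference_voicebench.py \
    --model_path path/to/DIFFA/checkpoint-diffa \
    --whisper_path path/to/DIFFA/whisper \
    --llm_path path/to/DIFFA/LLaDA-8B-Instruct \
    --data bbh \
    --generation_method AHD \
    --steps 64 \
    --block_length 32 \
    --max_new_tokens 256
```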

### 🔨 Evaluation

Follow the evaluation method from VoiceBench:

```shell
cd voicebench
python evaluate.py --src_file {result.jsonl} --evaluator xx
```

## 📑 Todo List

- Re-architect the codebase
- Support multi-batch-size decoding
- Support implementations of other methods

## 🎓 Citation

If you find this work helpful for your research, please consider citing:

```bibtex
@article{zou2026ahd,
      title={Breaking Block Boundaries: Anchor-based History-stable Decoding for Diffusion Large Language Models},
      author={Shun Zou and Yong Wang and Zehui Chen and Lin Chen and Chongyang Tao and Feng Zhao and Xiangxiang Chu},
      journal={arXiv preprint arXiv:2604.08964},
      year={2026}
}
```

## 🙏 Acknowledgement

We would like to thank the authors of the related open-source projects for their excellent work and contributions.

## About

[ACL 2026] Breaking Block Boundaries: Anchor-based History-stable Decoding for Diffusion Large Language Models
