This repository contains the code for the experiments presented in the paper "Beyond Masked and Unmasked: Discrete Diffusion Models via Partial Masking".
- 🐳 Docker environments for easy installation
- 🤗 Pretrained weights for inference and evaluation
- 📉 Weights and Biases logs for enhanced reproducibility
- 🔬 Code for all experiments in our paper:
  - Toy experiments on synthetic data
  - Text generation on OpenWebText
  - Image generation on CIFAR-10 & ImageNet-32
- ✏️ [May 7, 2026] Released a corrected implementation of the perplexity evaluation (see mdm-prime/text).
- 📓 [May 1, 2026] Released errata note. The current perplexity evaluation is incorrect.
- 📅 [Mar 17, 2026] Released MDM-Prime-v2. Check out the implementation in mdm-prime/text.
- 🎉 [Sep 18, 2025] Our paper has been accepted to NeurIPS 2025.
- Dataset: 2D Synthetic Dataset
  - Folder: mdm-prime/toy
- Dataset: OpenWebText (OWT)
  - Folder: mdm-prime/text
- Dataset: CIFAR-10, ImageNet-32
  - Folder: mdm-prime/image
We identified an error in the perplexity evaluation results in our paper. Please see our errata note for more details.
The following results are unaffected and the code can still be used to reproduce them:
- Claims about idle steps (Fig. 1)
- Sample quality comparisons (Tables 3 and 4)
The perplexity results in Tables 1 and 2 do not represent a real improvement. Please see our corrected results in mdm-prime/text.
We apologize for any inconvenience this may cause.
This implementation builds on the following repositories:
- kuleshov-group/mdlm (at commit 3ecb6dc), licensed under the Apache-2.0 license.
- facebookresearch/flow_matching (at commit c056dd6), licensed under the CC BY-NC 4.0 license.
Further changes based on this repository are licensed under the Apache-2.0 and CC BY-NC 4.0 licenses.
If you find this code implementation useful, please consider citing our paper.
@article{chao2026dependency,
title = {{Dependency Breaks Validity of Loss Functions in Masked Diffusion Models}},
author = {Chao, Chen-Hao and Xu, Minkai and Geffner, Tomas and Vahdat, Arash and Krishnan, Rahul G.},
journal = {chen-hao-chao.github.io},
year = {2026}
}
@inproceedings{chao2025mdmprime,
title = {{Beyond Masked and Unmasked: Discrete Diffusion Models via Partial Masking}},
author = {Chen-Hao Chao and Wei-Fang Sun and Hanwen Liang and Chun-Yi Lee and Rahul G. Krishnan},
booktitle = {Proceedings of the Conference on Neural Information Processing Systems (NeurIPS)},
year = {2025},
}


