Tested on Windows 11 with an RTX 5090 (32 GB VRAM) and NVIDIA driver 576.52.
Follow the guide carefully step by step:
1) Download the Bagel repository (let's assume it is the "Bagel" folder on your PC) and remove the lines
flash_attn==2.5.8, torch==2.5.1 and torchvision==0.20.1 from the "requirements.txt" file inside the "Bagel" folder
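If you would rather not edit the file by hand, step 1 can be scripted. This is only a sketch: the helper name strip_pins is made up for illustration, and it assumes the three packages appear one per line in "requirements.txt" as plain name==version pins.

```python
from pathlib import Path

# Package names whose pinned lines step 1 removes from requirements.txt.
PINS_TO_DROP = {"flash_attn", "torch", "torchvision"}


def strip_pins(text: str) -> str:
    """Return the requirements text without the lines for PINS_TO_DROP."""
    kept = []
    for line in text.splitlines():
        # Package name is whatever comes before '=='; treat '-' and '_' alike.
        name = line.split("==")[0].strip().lower().replace("-", "_")
        if name not in PINS_TO_DROP:
            kept.append(line)
    return "\n".join(kept) + "\n"


if __name__ == "__main__":
    req = Path("requirements.txt")  # run this from inside the "Bagel" folder
    if req.exists():
        req.write_text(strip_pins(req.read_text()))
```

Only exact name matches are dropped, so a line such as torchaudio would be left alone.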
2) Download everything from https://huggingface.co/ByteDance-Seed/BAGEL-7B-MoT/tree/main
The main file is "ema.safetensors" (29.2 GB).
Put all files into "Bagel\models\BAGEL-7B-MoT", creating any missing subfolders.
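Step 2 can also be done from Python instead of the browser. The sketch below assumes the huggingface_hub package is installed (pip install huggingface_hub), which the guide itself does not require; the function names are made up for illustration, and the target folder follows the layout described above.

```python
from pathlib import Path

REPO_ID = "ByteDance-Seed/BAGEL-7B-MoT"


def model_dir(bagel_root: str) -> Path:
    """Folder the guide expects the model files in, relative to the repo root."""
    return Path(bagel_root) / "models" / "BAGEL-7B-MoT"


def download_model(bagel_root: str) -> Path:
    """Download every file of the model repo into model_dir(bagel_root)."""
    # Imported here so the path helpers work without the package installed.
    from huggingface_hub import snapshot_download

    target = model_dir(bagel_root)
    target.mkdir(parents=True, exist_ok=True)
    snapshot_download(repo_id=REPO_ID, local_dir=str(target))
    return target


def has_main_weights(bagel_root: str) -> bool:
    """True if the main weights file from step 2 is already in place."""
    return (model_dir(bagel_root) / "ema.safetensors").is_file()
```

Calling download_model("Bagel") from the folder that contains "Bagel" should fetch the files, and has_main_weights("Bagel") gives a quick check afterwards.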
3) Install Miniconda. Start "Anaconda Prompt" from the Start menu; all of the following commands must be typed there. Create and activate the environment:
conda create -n BagelEnv python=3.10
conda activate BagelEnv
All of the following commands must be run with the BagelEnv environment active.
4) Install CUDA Toolkit 12.8 and cuDNN into the BagelEnv Miniconda environment rather than system-wide, to avoid conflicts with
other projects:
conda install cuda=12.8 cudnn -c nvidia -c conda-forge
5) pip install torch==2.7.0+cu128 torchvision torchaudio --index-url https://download.pytorch.org/whl/cu128
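Before moving on, it may be worth confirming that the installed PyTorch build actually sees the GPU. A minimal check, assuming it is run inside the active BagelEnv environment (the function name describe_torch is made up for illustration):

```python
def describe_torch() -> dict:
    """Collect basic facts about the installed PyTorch build, if any."""
    try:
        import torch
    except ImportError:
        return {"installed": False}
    info = {
        "installed": True,
        "torch": torch.__version__,         # version string of the installed build
        "cuda_build": torch.version.cuda,   # CUDA version the build was compiled for
        "cuda_available": torch.cuda.is_available(),
    }
    if info["cuda_available"]:
        info["gpu"] = torch.cuda.get_device_name(0)
    return info


if __name__ == "__main__":
    for key, value in describe_torch().items():
        print(f"{key}: {value}")
```

If cuda_available comes back False, the later steps are unlikely to work, so it is cheaper to find out here than after loading the model.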
6) pip install -r requirements.txt
7) a) Download "flash_attn-2.7.4.post1%2Bcu128torch2.8.0cxx11abiTRUE-cp310-cp310-win_amd64.whl" from https://huggingface.co/lldacing/flash-attention-windows-wheel/tree/f1d19914b4b710823709e11e2a88d052216a110b
pip install flash_attn-2.7.4.post1%2Bcu128torch2.8.0cxx11abiTRUE-cp310-cp310-win_amd64.whl
This worked for me even though the wheel is built for Torch 2.8.0 and we installed 2.7.0.
OR
b) Later @petermg released a wheel for Torch 2.7.0, so you can download that one instead:
"flash_attn-2.7.4.post1+cu128.torch270-cp310-cp310-win_amd64.whl" from https://github.com/petermg/flash_attn_windows/releases/tag/01
pip install flash_attn-2.7.4.post1+cu128.torch270-cp310-cp310-win_amd64.whl
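The wheel file names in step 7 encode which Python and platform they target, which is why the environment was created with python=3.10. As a rough illustration (the helper names are made up, and this only reads the file name, it does not check the Torch or CUDA build), the tags can be pulled out like this:

```python
import sys
from urllib.parse import unquote


def wheel_tags(filename: str) -> dict:
    """Split a wheel file name into name, version and compatibility tags."""
    # "%2B" is a URL-encoded "+", so decode it before splitting.
    stem = unquote(filename)
    if stem.endswith(".whl"):
        stem = stem[: -len(".whl")]
    parts = stem.split("-")
    # A wheel name ends with: python tag, ABI tag, platform tag.
    python_tag, abi_tag, platform_tag = parts[-3:]
    return {
        "name": parts[0],
        "version": parts[1],
        "python": python_tag,
        "abi": abi_tag,
        "platform": platform_tag,
    }


def matches_interpreter(filename: str) -> bool:
    """True if the wheel's python tag matches the running CPython version."""
    current = f"cp{sys.version_info.major}{sys.version_info.minor}"
    return wheel_tags(filename)["python"] == current


if __name__ == "__main__":
    name = "flash_attn-2.7.4.post1+cu128.torch270-cp310-cp310-win_amd64.whl"
    print(wheel_tags(name), matches_interpreter(name))
```

Both wheels from step 7 carry the cp310 and win_amd64 tags, so they are meant for CPython 3.10 on 64-bit Windows.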
8) pip install gradio
9) To start Bagel, still inside "Anaconda Prompt" and with the BagelEnv environment active, change to the "Bagel"
repository folder you downloaded in step 1), then run:
python app.py
I got the following output in the console:
The safetensors archive passed at models/BAGEL-7B-MoT\ema.safetensors does not contain metadata. Make sure to save your model with the save_pretrained method. Defaulting to 'pt' metadata.
Then the browser opened and the Bagel UI appeared; everything works fine. CPU usage is low, but the 5090 is
under 95-100% load during image generation, editing and understanding. VRAM is almost full, with about 30-30.5 GB used.
Image generation and editing take around 55-65 seconds; image understanding takes about 0.5-5 seconds depending on
image size.
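To compare VRAM usage on your own card, something like the following can be run in a second "Anaconda Prompt" with BagelEnv active while Bagel is busy. It is only a sketch: the function names are made up, it relies on torch.cuda.mem_get_info(), and it reports device-wide usage in GiB, which may not match the tool or the unit behind the numbers above.

```python
def bytes_to_gib(n: int) -> float:
    """Convert a byte count to GiB (1 GiB = 1024**3 bytes)."""
    return n / 1024**3


def vram_usage():
    """Return (used_gib, total_gib) for the current GPU, or None if unavailable."""
    try:
        import torch
    except ImportError:
        return None
    if not torch.cuda.is_available():
        return None
    # mem_get_info() returns (free_bytes, total_bytes) for the current device.
    free, total = torch.cuda.mem_get_info()
    return bytes_to_gib(total - free), bytes_to_gib(total)


if __name__ == "__main__":
    usage = vram_usage()
    if usage is None:
        print("No CUDA device visible to PyTorch")
    else:
        used, total = usage
        print(f"VRAM used: {used:.1f} of {total:.1f} GiB")
```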