Issue running the model on a multi-GPU machine

Hey folks

I'm running into an issue with the [HunyuanVideo-I2V](https://github.com/Tencent-Hunyuan/HunyuanVideo-I2V) project. I'm using a `g5.12xlarge` AWS instance (4 GPUs, 24 GB VRAM each), but when I run `sample_image2video.py`, I get a `torch.cuda.OutOfMemoryError` during model loading.


<img width="1600" height="700" alt="Image" src="https://github.com/user-attachments/assets/64f8857b-5257-4e04-bf99-e47f0c492b39" />


Here's the key part of the error:

```
CUDA out of memory. Tried to allocate 90.00 MiB. GPU has a total capacity of 21.99 GiB of which 87.38 MiB is free...
```

I already have `PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True` set, but the issue persists.
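To help narrow this down, here is a minimal sketch for checking how much memory each GPU reports before the model is loaded. It assumes PyTorch with CUDA is installed and uses `torch.cuda.mem_get_info`; the helper names `to_gib` and `report_gpu_memory` are made up for illustration and are not part of the HunyuanVideo-I2V code.

```python
import importlib.util

GIB = 1024 ** 3  # bytes per GiB, the unit used in the CUDA OOM message


def to_gib(num_bytes: int) -> float:
    """Convert a byte count to GiB."""
    return num_bytes / GIB


def report_gpu_memory() -> list[str]:
    """Return one line per visible GPU with its free and total memory.

    Returns an empty list if PyTorch is not installed or CUDA is unavailable.
    """
    if importlib.util.find_spec("torch") is None:
        return []
    import torch

    if not torch.cuda.is_available():
        return []

    lines = []
    for idx in range(torch.cuda.device_count()):
        # mem_get_info returns (free_bytes, total_bytes) for the device
        free, total = torch.cuda.mem_get_info(idx)
        name = torch.cuda.get_device_name(idx)
        lines.append(
            f"cuda:{idx} {name}: {to_gib(free):.2f} GiB free of {to_gib(total):.2f} GiB"
        )
    return lines


if __name__ == "__main__":
    for line in report_gpu_memory():
        print(line)
```

Running this right before `torchrun` shows whether another process is already holding memory on any of the four GPUs.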

Command I am running:
```
ALLOW_RESIZE_FOR_SP=1 torchrun --nproc_per_node=4 \
    sample_image2video.py \
    --model HYVideo-T/2 \
    --prompt "An Asian man with short hair in black tactical uniform and white clothes waves a firework stick." \
    --i2v-mode \
    --i2v-image-path ./assets/demo/i2v/imgs/0.jpg \
    --i2v-resolution 360p \
    --i2v-stability \
    --infer-steps 1 \
    --video-length 5 \
    --flow-reverse \
    --flow-shift 7.0 \
    --seed 0 \
    --embedded-cfg-scale 6.0 \
    --save-path ./results \
    --ulysses-degree 2 \
    --ring-degree 2

```
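As a sanity check on the parallel flags in the command above, here is a small sketch. It assumes (I have not confirmed this against the project's code) that `--ulysses-degree` multiplied by `--ring-degree` must equal `--nproc_per_node`; the function name `degrees_match` is hypothetical.

```python
def degrees_match(nproc_per_node: int, ulysses_degree: int, ring_degree: int) -> bool:
    """Check the assumed constraint: ulysses_degree * ring_degree == nproc_per_node."""
    return ulysses_degree * ring_degree == nproc_per_node


if __name__ == "__main__":
    # Values taken from the command above: 4 processes, ulysses=2, ring=2.
    print(degrees_match(nproc_per_node=4, ulysses_degree=2, ring_degree=2))
```

With the values from my command the check passes, so under that assumption the degree settings themselves do not look like the cause of the out-of-memory error.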

Thanks in advance!
