I noticed that this line is commented out before the LoraConfig is applied to the transformer:
transformer.requires_grad_(False)
Doesn't leaving the base weights unfrozen go against the original intention of the detail expert, namely fine-tuning the semantic expert?
The total number of trainable parameters is about 1.2 billion.
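For reference, here is a minimal sketch of the setup as I understand it, assuming the Hugging Face peft API; the dummy model, rank, and target module names are illustrative stand-ins for the real transformer. With the freeze commented out, counting parameters shows the full base model training alongside the adapters:

```python
import torch
from peft import LoraConfig, get_peft_model

# Dummy stand-in; in the real script this is the diffusion transformer.
transformer = torch.nn.Sequential(torch.nn.Linear(64, 64))

# With this freeze commented out (as in the repo), every base weight keeps
# requires_grad=True, so the whole transformer trains alongside the adapters.
# transformer.requires_grad_(False)

# Rank and target module names are illustrative, not the repo's values.
lora_config = LoraConfig(r=16, lora_alpha=16, target_modules=["0"])
transformer = get_peft_model(transformer, lora_config)

# Verify what is actually being updated.
trainable = sum(p.numel() for p in transformer.parameters() if p.requires_grad)
total = sum(p.numel() for p in transformer.parameters())
print(f"trainable: {trainable} / {total}")
```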
By the way, I'm trying to train the detail expert but ran into an OOM error; I traced the problem to the computation of gan_g_loss. I was wondering if you could give me some suggestions. Looking forward to your reply.
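In case it helps, here is a sketch of the kind of mitigation I have been considering for the generator step; the discriminator, its input size, and the non-saturating loss form are my assumptions, not necessarily how gan_g_loss is defined in the repo:

```python
import torch
import torch.nn.functional as F
from torch.utils.checkpoint import checkpoint

# Dummy stand-ins so the sketch runs; in the real script these would be the
# detail expert's output and the GAN discriminator.
discriminator = torch.nn.Sequential(torch.nn.Linear(64, 1))
generator_out = torch.randn(4, 64, requires_grad=True)

# 1) Freeze the discriminator for the generator step: no gradients are
#    accumulated for its weights, only for the generator's output.
discriminator.requires_grad_(False)

# 2) Checkpoint the discriminator forward so its activations are recomputed
#    during backward instead of being held in memory.
fake_logits = checkpoint(discriminator, generator_out, use_reentrant=False)

# Non-saturating generator loss (one common form of a GAN generator loss).
gan_g_loss = F.softplus(-fake_logits).mean()
gan_g_loss.backward()
```

Would freezing the discriminator plus checkpointing its forward pass like this be a reasonable way to bring the memory down, or is there a recommended fix?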