Commit 50f68ce

Merge pull request #258 from TianQi-777/patch-1
Update README_zh.md
2 parents e0b4904 + a4eac81

1 file changed, +13 −13 lines

README_zh.md (13 additions, 13 deletions)
@@ -3,13 +3,13 @@
 [English](./README.md)
 
 <p align="center">
-  <img src="https://raw.githubusercontent.com/Tencent/HunyuanVideo/refs/heads/main/assets/logo.png" height=100>
+  <img src="https://raw.githubusercontent.com/Tencent-Hunyuan/HunyuanVideo/refs/heads/main/assets/logo.png" height=100>
 </p>
 
 # HunyuanVideo: A Systematic Framework For Large Video Generation Model
 
 <div align="center">
-  <a href="https://github.com/Tencent/HunyuanVideo"><img src="https://img.shields.io/static/v1?label=HunyuanVideo Code&message=Github&color=blue"></a> &ensp;
+  <a href="https://github.com/Tencent-Hunyuan/HunyuanVideo"><img src="https://img.shields.io/static/v1?label=HunyuanVideo Code&message=Github&color=blue"></a> &ensp;
   <a href="https://aivideo.hunyuan.tencent.com"><img src="https://img.shields.io/static/v1?label=Project%20Page&message=Web&color=green"></a> &ensp;
   <a href="https://video.hunyuan.tencent.com"><img src="https://img.shields.io/static/v1?label=Playground&message=Web&color=green"></a>
 </div>
@@ -43,8 +43,8 @@
 
 ## 🔥🔥🔥 News!!
 
-* 2025-03-06: 🌅 Open-sourced [HunyuanVideo-I2V](https://github.com/Tencent/HunyuanVideo-I2V), supporting high-quality image-to-video generation.
-* 2025-01-13: 📈 Open-sourced the Penguin Video [benchmark](https://github.com/Tencent/HunyuanVideo/blob/main/assets/PenguinVideoBenchmark.csv)
+* 2025-03-06: 🌅 Open-sourced [HunyuanVideo-I2V](https://github.com/Tencent-Hunyuan/HunyuanVideo-I2V), supporting high-quality image-to-video generation.
+* 2025-01-13: 📈 Open-sourced the Penguin Video [benchmark](https://github.com/Tencent-Hunyuan/HunyuanVideo/blob/main/assets/PenguinVideoBenchmark.csv)
 * 2024-12-18: 🏃‍♂️ Open-sourced the HunyuanVideo [FP8 model weights](https://huggingface.co/tencent/HunyuanVideo/blob/main/hunyuan-video-t2v-720p/transformers/mp_rank_00_model_states_fp8.pt) to save more GPU memory.
 * 2024-12-17: 🤗 HunyuanVideo has been integrated into [Diffusers](https://huggingface.co/docs/diffusers/main/api/pipelines/hunyuan_video).
 * 2024-12-03: 🚀 Open-sourced the HunyuanVideo multi-GPU parallel inference code, powered by [xDiT](https://github.com/xdit-project/xDiT).
@@ -94,7 +94,7 @@
 - [x] FP8 quantized version
 - [x] Penguin Video benchmark
 - [x] ComfyUI
-- [HunyuanVideo (image-to-video model)](https://github.com/Tencent/HunyuanVideo-I2V)
+- [HunyuanVideo (image-to-video model)](https://github.com/Tencent-Hunyuan/HunyuanVideo-I2V)
 - [x] Inference code
 - [x] Model weights
 
@@ -147,7 +147,7 @@ HunyuanVideo is a brand-new open-source large video generation model, with performance comparable to leading
 
 HunyuanVideo is a latent-space model. During training, it uses a 3D VAE to compress features along the temporal and spatial dimensions. A text prompt is encoded by a large language model and fed in as a condition, guiding the model to denoise Gaussian noise over multiple steps into a latent representation of the video. Finally, at inference time, the 3D VAE decoder decodes this latent representation into a video.
 <p align="center">
-  <img src="https://raw.githubusercontent.com/Tencent/HunyuanVideo/refs/heads/main/assets/overall.png" height=300>
+  <img src="https://raw.githubusercontent.com/Tencent-Hunyuan/HunyuanVideo/refs/heads/main/assets/overall.png" height=300>
 </p>
 
 
@@ -157,7 +157,7 @@ HunyuanVideo is a latent-space model; during training it uses a 3D VAE to compress temporal
 
 HunyuanVideo adopts a Transformer with Full Attention for video generation. Specifically, we use a "dual-stream to single-stream" hybrid model design. In the dual-stream phase, video and text tokens are processed independently through parallel Transformer blocks, so each modality can learn its own modulation mechanism without interfering with the other. In the single-stream phase, the video and text tokens are concatenated and fed into subsequent Transformer blocks for effective multimodal information fusion. This design captures the complex interactions between visual and semantic information and improves overall model performance.
 <p align="center">
-  <img src="https://raw.githubusercontent.com/Tencent/HunyuanVideo/refs/heads/main/assets/backbone.png" height=350>
+  <img src="https://raw.githubusercontent.com/Tencent-Hunyuan/HunyuanVideo/refs/heads/main/assets/backbone.png" height=350>
 </p>
 
 ### **MLLM Text Encoder**
@@ -168,13 +168,13 @@ HunyuanVideo adopts a Transformer with Full Attention for video generation
 
 Since the MLLM is based on causal attention while T5-XXL uses bidirectional attention, which provides better text guidance for diffusion models, we introduce an extra token refiner to enhance the text features.
 <p align="center">
-  <img src="https://raw.githubusercontent.com/Tencent/HunyuanVideo/refs/heads/main/assets/text_encoder.png" height=275>
+  <img src="https://raw.githubusercontent.com/Tencent-Hunyuan/HunyuanVideo/refs/heads/main/assets/text_encoder.png" height=275>
 </p>
 
 ### **3D VAE**
 Our VAE uses CausalConv3D as the encoder and decoder of HunyuanVideo to compress the temporal and spatial dimensions of videos: 4x compression in time, 8x in space, down to 16 latent channels. This significantly reduces the number of tokens for the subsequent Transformer model and allows us to train the video generation model at the original resolution and frame rate.
 <p align="center">
-  <img src="https://raw.githubusercontent.com/Tencent/HunyuanVideo/refs/heads/main/assets/3dvae.png" height=150>
+  <img src="https://raw.githubusercontent.com/Tencent-Hunyuan/HunyuanVideo/refs/heads/main/assets/3dvae.png" height=150>
 </p>
 
 ### **Prompt Rewriting**
@@ -494,10 +494,10 @@ The open-sourcing of HunyuanVideo would not be possible without many open-source works; here we especially thank [SD
 
 ## Star History
 
-<a href="https://star-history.com/#Tencent/HunyuanVideo&Date">
+<a href="https://star-history.com/#Tencent-Hunyuan/HunyuanVideo&Date">
  <picture>
-  <source media="(prefers-color-scheme: dark)" srcset="https://api.star-history.com/svg?repos=Tencent/HunyuanVideo&type=Date&theme=dark" />
-  <source media="(prefers-color-scheme: light)" srcset="https://api.star-history.com/svg?repos=Tencent/HunyuanVideo&type=Date" />
-  <img alt="Star History Chart" src="https://api.star-history.com/svg?repos=Tencent/HunyuanVideo&type=Date" />
+  <source media="(prefers-color-scheme: dark)" srcset="https://api.star-history.com/svg?repos=Tencent-Hunyuan/HunyuanVideo&type=Date&theme=dark" />
+  <source media="(prefers-color-scheme: light)" srcset="https://api.star-history.com/svg?repos=Tencent-Hunyuan/HunyuanVideo&type=Date" />
+  <img alt="Star History Chart" src="https://api.star-history.com/svg?repos=Tencent-Hunyuan/HunyuanVideo&type=Date" />
 </picture>
 </a>
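The 3D VAE paragraph in the diff above states 4x temporal and 8x spatial compression into 16 latent channels. As a rough illustration of why that matters for the downstream Transformer, here is a minimal sketch (not code from the repository; `latent_shape` and the example dimensions are hypothetical, and plain floor division ignores the first-frame handling a causal 3D VAE typically applies):

```python
# Sketch of the token-count arithmetic implied by the 3D VAE description:
# 4x temporal compression, 8x spatial compression, 16 latent channels.

def latent_shape(frames, height, width, t_ratio=4, s_ratio=8, channels=16):
    """Return the approximate latent shape (C, T, H, W) after VAE encoding."""
    return (channels, frames // t_ratio, height // s_ratio, width // s_ratio)

# Hypothetical example: a 128-frame 720x1280 clip.
c, t, h, w = latent_shape(frames=128, height=720, width=1280)
tokens = t * h * w  # latent positions the Transformer must attend over
print((c, t, h, w), tokens)
```

Without the VAE, the same clip would contribute 4 * 8 * 8 = 256 times as many latent positions, which is why the compression is what makes training at the original resolution and frame rate feasible.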
