Commit ce58b66

Merge branch 'main' into feat/mmrarebench-hf
2 parents 2eb26bf + 2d5b16c commit ce58b66

53 files changed

Lines changed: 4593 additions & 994 deletions

README.md

Lines changed: 2 additions & 0 deletions
@@ -33,6 +33,8 @@ English | [简体中文](/docs/zh-CN/README_zh-CN.md) | [日本語](/docs/ja/REA
 - **[2025-08-04]** In [PR 1175](https://github.com/open-compass/VLMEvalKit/pull/1175), we refine the `can_infer_option` and `can_infer_text`, which increasingly route the evaluation to LLM choice extractors and empirically leads to slight performance improvement for MCQ benchmarks.
 
 ## 🆕 News
+
+- **[2026-04-08]** Supported [**Video-MME-v2**](https://github.com/MME-Benchmarks/Video-MME-v2). Video-MME-v2 is an authoritative benchmark towards the next stage in video understanding evaluation. 🔥🔥🔥
 - **[2025-07-07]** Supported [**SeePhys**](https://seephys.github.io/), which is a full spectrum multimodal benchmark for evaluating physics reasoning across different knowledge levels. thanks to [**Quinn777**](https://github.com/Quinn777) 🔥🔥🔥
 - **[2025-07-02]** Supported [**OvisU1**](https://huggingface.co/AIDC-AI/Ovis-U1-3B), thanks to [**liyang-7**](https://github.com/liyang-7) 🔥🔥🔥
 - **[2025-06-16]** Supported [**PhyX**](https://phyx-bench.github.io/), a benchmark aiming to assess capacity for physics-grounded reasoning in visual scenarios. 🔥🔥🔥
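
The 2025-08-04 context line in the README hunk above describes answer matching that hands more cases to an LLM choice extractor. A minimal sketch of that general pattern follows, assuming a strict rule that accepts an answer only when exactly one option letter appears in it. The names `infer_option_by_rule`, `extract_choice`, and `llm_extractor` are hypothetical and are not taken from VLMEvalKit; this is not the project's implementation of `can_infer_option` or `can_infer_text`.

```python
import re
from typing import Callable, Dict, Optional


def infer_option_by_rule(answer: str, choices: Dict[str, str]) -> Optional[str]:
    """Return an option letter only if exactly one appears as a token (hypothetical rule)."""
    # Split on non-alphanumeric characters so "(B)" and "B." both yield the token "B".
    tokens = set(re.split(r"[^A-Za-z0-9]+", answer))
    hits = [letter for letter in choices if letter in tokens]
    # Zero or several candidate letters count as ambiguous, so nothing is inferred.
    return hits[0] if len(hits) == 1 else None


def extract_choice(
    answer: str,
    choices: Dict[str, str],
    llm_extractor: Callable[[str, Dict[str, str]], str],
) -> str:
    """Try the strict rule first; fall back to the (assumed) LLM extractor otherwise."""
    rule_hit = infer_option_by_rule(answer, choices)
    if rule_hit is not None:
        return rule_hit
    return llm_extractor(answer, choices)
```

The stricter the rule, the more answers reach `llm_extractor`, which matches the trade-off the news entry mentions: fewer rule-based guesses in exchange for more extractor calls.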

requirements.txt

Lines changed: 4 additions & 3 deletions
@@ -4,26 +4,28 @@ antlr4-python3-runtime==4.11.1
 apted>=1.0.3
 bert_score
 cairosvg
+cd-fvd
 colormath>=3.0.0
 datasets
 decord>=0.6.0
 distance>=0.1.3
 dotenv
 editdistance>=0.8.1
 einops
-# for gemini api
 google-genai
 gradio
 huggingface_hub
 imageio
 ipdb
 jieba>=0.42.1
 json_repair
+latex2sympy2-extended
 levenshtein>=0.27.1
 lpips
 lxml>=6.0.2
 math-verify
 matplotlib
+nest_asyncio
 nltk
 num2words
 numpy
@@ -43,9 +45,9 @@ python-dotenv
 qwen_vl_utils
 requests
 rich
+rouge
 scikit-image
 scikit-learn
-# For UniSVG
 sentence_transformers
 sentencepiece
 setuptools
@@ -57,7 +59,6 @@ tiktoken
 timeout-decorator
 timm
 torch
-# For SArena
 torchmetrics
 torchvision
 tqdm
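
The requirements.txt diff adds four packages: `cd-fvd`, `latex2sympy2-extended`, `nest_asyncio`, and `rouge`. A small sketch for checking whether they are present in the current environment follows, using only the standard-library `importlib.metadata`. The helper name `installed_versions` and the `ADDED_REQUIREMENTS` list are illustrative and not part of the repository; the sketch also assumes the distribution names match the requirement names as written in the diff.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Dict, Iterable, Optional

# Distribution names copied from the "+" lines of the requirements.txt diff.
ADDED_REQUIREMENTS = ["cd-fvd", "latex2sympy2-extended", "nest_asyncio", "rouge"]


def installed_versions(names: Iterable[str]) -> Dict[str, Optional[str]]:
    """Map each distribution name to its installed version, or None if absent."""
    result: Dict[str, Optional[str]] = {}
    for name in names:
        try:
            result[name] = version(name)
        except PackageNotFoundError:
            # The distribution is not installed in this environment.
            result[name] = None
    return result


if __name__ == "__main__":
    for name, ver in installed_versions(ADDED_REQUIREMENTS).items():
        print(f"{name}: {ver or 'not installed'}")
```

Running it after merging the branch gives a quick view of which of the new dependencies still need to be installed, for example with `pip install -r requirements.txt`.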