#12 [enhance] ICDAR 테이블 파싱 TEDS 병렬 계산 최적화#13
Conversation
- Defer TEDS/OCR scoring from process_results to aggregation stage - Use ProcessPoolExecutor with spawn context to avoid CUDA fork issues - Add picklable worker functions in teds.py for multiprocessing - Replace YAML aggregation: mean with custom parallel aggregation functions - Support TEDS_NUM_WORKERS env var for worker count control
Summary by CodeRabbit릴리스 노트
개요ICDAR 테이블 파싱 작업의 메트릭 집계를 순차 처리에서 멀티프로세싱 기반 병렬 처리로 리팩토링했습니다. YAML 설정에서 정적 "mean" 값을 동적 함수 참조로 변경하고, 병렬 워커 함수 및 집계 함수를 추가하여 평가 계산을 지연시킵니다. 변경 사항
시퀀스 다이어그램sequenceDiagram
participant Config as YAML Configuration
participant Process as Results Processing
participant Agg as Aggregation Functions
participant Worker as Parallel Workers
participant Metric as Metric Engines<br/>(TEDS, OCR)
Config->>Process: References aggregation functions
Process->>Agg: Defers metric computation,<br/>returns data tuples
Agg->>Agg: _parallel_compute()
Agg->>Worker: Spawn worker processes<br/>with compute_* functions
Worker->>Metric: Execute TEDS/OCR evaluation
Metric-->>Worker: Metric scores
Worker-->>Agg: Results with error handling
Agg-->>Agg: Collect & compute mean
Agg-->>Config: Return aggregated score
예상 코드 리뷰 노력🎯 3 (중간 난이도) | ⏱️ ~25분 🚥 Pre-merge checks | ✅ 1 | ❌ 2❌ Failed checks (1 warning, 1 inconclusive)
✅ Passed checks (1 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches
🧪 Generate unit tests (beta)
Comment |
There was a problem hiding this comment.
Actionable comments posted: 3
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@lmms_eval/tasks/icdar_table_parsing/teds.py`:
- Around line 217-218: The tuple unpacking (e.g., pred_html, gt_html, _) is done
outside the try block in compute_teds_full (and the sibling functions
compute_teds_structure and compute_teds_content), so malformed inputs raise
before the exception handler and a worker can't fallback to the 0.0 return; move
the unpacking into the try block and catch Exception around it, and on any
exception return 0.0 to preserve the worker fallback path while keeping the
existing scoring logic unchanged.
In `@lmms_eval/tasks/icdar_table_parsing/utils.py`:
- Line 18: The module-level _NUM_WORKERS assignment currently calls
int(os.environ.get("TEDS_NUM_WORKERS", min(os.cpu_count() or 1, 64))) which
raises ValueError for non-integer env values (e.g., "auto") and bypasses the
intended 64 max; change it to read TEDS_NUM_WORKERS into a variable, try to
parse it to int in a try/except (or validate numeric and positive), and on
failure fall back to default = min(os.cpu_count() or 1, 64); finally clamp the
resulting number to a minimum of 1 and a maximum of 64 and assign that to
_NUM_WORKERS (referencing the TEDS_NUM_WORKERS env var and the _NUM_WORKERS
symbol).
- Line 8: Replace the direct Loguru import ("from loguru import logger") with
the project standard logger by importing eval_logger from lmms_eval.utils and
aliasing it to logger (e.g., from lmms_eval.utils import eval_logger as logger),
then update all existing logger.* calls in this module (the places currently
using logger at the top-level import and the logger.info/error/debug calls
throughout the file) to use that eval_logger alias so the module uses the
centralized eval_logger instead of loguru directly.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: d8aff134-d3ce-4bf1-8085-2f94ed5375b8
📒 Files selected for processing (3)
lmms_eval/tasks/icdar_table_parsing/icdar_table_parsing.yamllmms_eval/tasks/icdar_table_parsing/teds.pylmms_eval/tasks/icdar_table_parsing/utils.py
| def compute_teds_full(args: tuple) -> float: | ||
| pred_html, gt_html, _ = args |
There was a problem hiding this comment.
인자 언패킹을 try 내부로 옮겨 워커 fallback을 보장해 주세요
Line 218, Line 229, Line 240에서 언패킹이 try 밖에 있어 입력 형식이 깨지면 0.0 반환 경로를 타지 못합니다. 단일 워커 경로에서는 집계가 중단될 수 있습니다.
제안 패치
-def compute_teds_full(args: tuple) -> float:
- pred_html, gt_html, _ = args
- try:
+def compute_teds_full(args: tuple) -> float:
+ try:
+ pred_html, gt_html, _ = args
return TEDS(structure_only=False).evaluate(pred_html, gt_html)
except Exception:
import traceback
traceback.print_exc()
return 0.0
def compute_teds_struct(args: tuple) -> float:
- pred_html, gt_html, _ = args
try:
+ pred_html, gt_html, _ = args
return TEDS(structure_only=True).evaluate(pred_html, gt_html)
except Exception:
import traceback
traceback.print_exc()
return 0.0
def compute_ocr(args: tuple) -> float:
- pred_html, gt_html, _ = args
try:
+ pred_html, gt_html, _ = args
return OCRMetric().evaluate(pred_html, gt_html)
except Exception:
import traceback
traceback.print_exc()
return 0.0Also applies to: 228-229, 239-240
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@lmms_eval/tasks/icdar_table_parsing/teds.py` around lines 217 - 218, The
tuple unpacking (e.g., pred_html, gt_html, _) is done outside the try block in
compute_teds_full (and the sibling functions compute_teds_structure and
compute_teds_content), so malformed inputs raise before the exception handler
and a worker can't fallback to the 0.0 return; move the unpacking into the try
block and catch Exception around it, and on any exception return 0.0 to preserve
the worker fallback path while keeping the existing scoring logic unchanged.
|
|
||
| from lmms_eval.tasks._task_utils.file_utils import generate_submission_file | ||
| from lmms_eval.tasks.icdar_table_parsing.teds import TEDS, OCRMetric | ||
| from loguru import logger |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
# 파일 존재 확인 및 내용 확인
find . -name "utils.py" -path "*/icdar_table_parsing/*" -type fRepository: ThakiCloud/lmms-eval
Length of output: 110
🏁 Script executed:
# 파일 내용 전체 확인 (라인 수 먼저 확인)
if [ -f "lmms_eval/tasks/icdar_table_parsing/utils.py" ]; then
wc -l "lmms_eval/tasks/icdar_table_parsing/utils.py"
fiRepository: ThakiCloud/lmms-eval
Length of output: 112
🏁 Script executed:
# eval_logger 정의 확인
if [ -f "lmms_eval/utils.py" ]; then
rg -n "eval_logger" "lmms_eval/utils.py" | head -20
fiRepository: ThakiCloud/lmms-eval
Length of output: 308
🏁 Script executed:
# 저장소 내 eval_logger 사용 패턴 확인
rg -n "from lmms_eval\.utils import eval_logger" --type=py | head -20Repository: ThakiCloud/lmms-eval
Length of output: 301
🏁 Script executed:
# icdar_table_parsing/utils.py 파일 내용 확인
cat -n "lmms_eval/tasks/icdar_table_parsing/utils.py" | head -60Repository: ThakiCloud/lmms-eval
Length of output: 2543
🏁 Script executed:
# 로깅 사용 부분 확인 (라인 40-50, 140-170)
cat -n "lmms_eval/tasks/icdar_table_parsing/utils.py" | sed -n '1,20p; 35,50p; 140,166p'Repository: ThakiCloud/lmms-eval
Length of output: 3140
프로젝트 표준 eval_logger로 로깅 통합
loguru를 직접 import하지 말고, lmms_eval.utils에서 제공하는 eval_logger를 사용해 주세요. 라인 8, 43, 148, 156, 164에서 로깅 코드를 수정해야 합니다.
제안 패치
-from loguru import logger
+from lmms_eval.utils import eval_logger
@@
- logger.warning("Worker failed for item {}: {}", idx, e)
+ eval_logger.warning(f"Worker failed for item {idx}: {e}")
@@
- logger.info("Computing TEDS for {} samples with {} workers", len(items), workers)
+ eval_logger.info(f"Computing TEDS for {len(items)} samples with {workers} workers")
@@
- logger.info("Computing TEDS-struct for {} samples with {} workers", len(items), workers)
+ eval_logger.info(f"Computing TEDS-struct for {len(items)} samples with {workers} workers")
@@
- logger.info("Computing OCR metric for {} samples with {} workers", len(items), workers)
+ eval_logger.info(f"Computing OCR metric for {len(items)} samples with {workers} workers")🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@lmms_eval/tasks/icdar_table_parsing/utils.py` at line 8, Replace the direct
Loguru import ("from loguru import logger") with the project standard logger by
importing eval_logger from lmms_eval.utils and aliasing it to logger (e.g., from
lmms_eval.utils import eval_logger as logger), then update all existing logger.*
calls in this module (the places currently using logger at the top-level import
and the logger.info/error/debug calls throughout the file) to use that
eval_logger alias so the module uses the centralized eval_logger instead of
loguru directly.
|
|
||
| TABLE_PARSING_PROMPT = "Convert the table in the image to HTML format. Output only the HTML table code, starting with <table> and ending with </table>. Do not include any explanation." | ||
|
|
||
| _NUM_WORKERS = int(os.environ.get("TEDS_NUM_WORKERS", min(os.cpu_count() or 1, 64))) |
There was a problem hiding this comment.
TEDS_NUM_WORKERS 파싱이 import 실패와 상한 우회를 유발할 수 있습니다
Line 18은 비정상 값(예: TEDS_NUM_WORKERS=auto)에서 ValueError로 모듈 import가 실패하고, 환경변수 값에는 최대 64 제한도 적용되지 않습니다. PR 목표(최대 64)와 동작이 어긋납니다.
제안 패치
-_NUM_WORKERS = int(os.environ.get("TEDS_NUM_WORKERS", min(os.cpu_count() or 1, 64)))
+def _resolve_num_workers() -> int:
+ default_workers = min(os.cpu_count() or 1, 64)
+ raw = os.environ.get("TEDS_NUM_WORKERS")
+ if raw is None:
+ return default_workers
+ try:
+ parsed = int(raw)
+ except ValueError:
+ return default_workers
+ return max(1, min(parsed, 64))
+
+
+_NUM_WORKERS = _resolve_num_workers()🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@lmms_eval/tasks/icdar_table_parsing/utils.py` at line 18, The module-level
_NUM_WORKERS assignment currently calls int(os.environ.get("TEDS_NUM_WORKERS",
min(os.cpu_count() or 1, 64))) which raises ValueError for non-integer env
values (e.g., "auto") and bypasses the intended 64 max; change it to read
TEDS_NUM_WORKERS into a variable, try to parse it to int in a try/except (or
validate numeric and positive), and on failure fall back to default =
min(os.cpu_count() or 1, 64); finally clamp the resulting number to a minimum of
1 and a maximum of 64 and assign that to _NUM_WORKERS (referencing the
TEDS_NUM_WORKERS env var and the _NUM_WORKERS symbol).
Issue?
#12
Changes?
process_results에서 개별 TEDS/OCR 점수 계산을 제거하고, aggregation 단계에서 전체 샘플을 일괄 병렬 처리로 변경compute_teds_full,compute_teds_struct,compute_ocr)를teds.py모듈에 분리 정의aggregation: mean을 커스텀 aggregation 함수(teds_aggregate,teds_struct_aggregate,ocr_aggregate)로 교체TEDS_NUM_WORKERS환경변수로 워커 수 조절 가능 (기본값: CPU 코어 수, 최대 64)Why we need?
Test?
python -m lmms_eval --model qwen2_5_vl --tasks icdar_table_parsing --limit 8 --batch_size 1로 정상 동작 확인CC (Optional)
Anything else? (Optional)