[feat] Add fastvideo-port agent skill with Cosmos 2.5 lessons #1241
Mister-Raggs wants to merge 4 commits into hao-ai-lab:main
Conversation
Skills:
- .agents/skills/fastvideo-port/SKILL.md: 9-phase port workflow with credentials checklist, HF repo timing fix, alignment settings table, and daVinci-MagiHuman as worked example
- recon.py: GitHub API recon -> structured JSON without cloning
- auto_mapper.py: first-pass param_names_mapping via string similarity
- alignment_test.py: per-component numerical parity template (atol=1e-4)
- ssim_test.py: SSIM regression test template vs official reference video
- eval_harness.py: diffs agent port vs ground truth, emits lesson candidates

Lessons from the Cosmos 2.5 port (8, generalised for any FastVideo port):
- ReplicatedLinear required for LoRA injection (critical)
- fp32/bf16 dtype mismatch at projection layers (critical)
- Hardcoded T5-XXL shape in utils.py breaks non-Wan encoders (important)
- Preprocessing entrypoint not always ported to the new system (important)
- Multimodal VLM processor needs inner tokenizer unwrap (important)
- bf16 embeddings need .float() before .numpy() (important)
- Scheduler shift overwritten by base class train() (important)
- AutoTokenizer fails on multimodal processors (important)
- add check_prereqs.sh: validates GPU, env vars, HF tokens, gating status, weights on disk, and pre-commit before any port work starts
- add a Recommended patterns table: opinionated defaults for training pipeline subclassing, attention layers, ReplicatedLinear, port order, and dtype strategy, with explicit reasoning for each
- add a Hard stop conditions table: 6 observable failure signals that indicate a root-cause problem; the agent must stop and fix before continuing rather than debugging symptoms further downstream
- wire check_prereqs.sh into Step 0 as the first action
- generalise two lesson Prevention sections (autotokenizer, lora)
Code Review
This pull request introduces the 'fastvideo-port' skill, which establishes a standardized workflow and a suite of utility scripts for porting video generation models. The contribution includes documented lessons from previous ports, a comprehensive skill guide, and tools for reconnaissance, weight mapping, and numerical alignment. Review feedback identified a critical command injection vulnerability in the prerequisite script, logic errors in the reconnaissance script regarding URL parsing and branch detection, and fragile parsing logic in the evaluation harness.
```bash
GATED=$(python - <<PY 2>/dev/null || echo "error"
from huggingface_hub import model_info
try:
    info = model_info("$hf_id", token="${HF_TOKEN:-None}")
    gated = getattr(info, "gated", False)
    print("gated" if gated else "open")
except Exception as e:
    print(f"error: {e}")
PY
)
```
The embedded Python script is vulnerable to command injection because it uses shell variable expansion ($hf_id) directly inside a Python string literal. A maliciously crafted model ID could execute arbitrary Python code. Additionally, passing the string "None" as a token will cause authentication failures when HF_TOKEN is unset. It is safer to pass these values as environment variables and access them via os.environ in Python, while using a quoted heredoc ('PY') to prevent shell expansion.
Suggested change:

```diff
-GATED=$(python - <<PY 2>/dev/null || echo "error"
-from huggingface_hub import model_info
-try:
-    info = model_info("$hf_id", token="${HF_TOKEN:-None}")
-    gated = getattr(info, "gated", False)
-    print("gated" if gated else "open")
-except Exception as e:
-    print(f"error: {e}")
-PY
+GATED=$(HF_ID="$hf_id" python - <<'PY' 2>/dev/null || echo "error"
+import os
+from huggingface_hub import model_info
+try:
+    hf_id = os.environ.get("HF_ID")
+    token = os.environ.get("HF_TOKEN")
+    info = model_info(hf_id, token=token)
+    gated = getattr(info, "gated", False)
+    print("gated" if gated else "open")
+except Exception as e:
+    print(f"error: {e}")
+PY
```
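As a minimal, self-contained illustration of why the quoted heredoc plus environment variable matters (the "model id" below is fabricated): data passed through the environment comes back verbatim instead of being interpreted as program text.

```python
import os
import subprocess
import sys

# Hypothetical malicious "model id": spliced into Python source by an
# unquoted heredoc it would execute; passed through the environment it
# stays inert data.
hf_id = 'x"); __import__("os").system("echo pwned"); ("'

out = subprocess.run(
    [sys.executable, "-c", "import os; print(os.environ['HF_ID'])"],
    env={**os.environ, "HF_ID": hf_id},
    capture_output=True, text=True, check=True,
).stdout.strip()

print(out)  # the payload comes back verbatim, never executed
```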
```python
except Exception:
    # Try HEAD branch
    try:
        url2 = url.replace(f"/{ref}/", "/HEAD/")
```
The fallback to /HEAD/ in the raw.githubusercontent.com URL is invalid and will result in a 404 error. This service does not resolve HEAD as a symbolic reference for the default branch; it requires a specific branch name (e.g., main or master). To support repositories with different default branch names, you should query the repository metadata via the GitHub API to determine the correct default_branch before attempting to fetch raw files.
```python
m = re.search(r"github\.com[:/]([^/]+/[^/\s\.]+)", url)
if not m:
    raise ValueError(f"Cannot parse GitHub URL: {url}")
return m.group(1).rstrip("/")
```
The regular expression used to parse the repository URL incorrectly excludes dots (\.), which are valid characters in GitHub repository names. This will cause the script to truncate names like owner/repo.name to owner/repo. A more robust approach would be to match all non-whitespace and non-separator characters and then explicitly handle the optional .git suffix.
Suggested change:

```diff
-m = re.search(r"github\.com[:/]([^/]+/[^/\s\.]+)", url)
-if not m:
-    raise ValueError(f"Cannot parse GitHub URL: {url}")
-return m.group(1).rstrip("/")
+m = re.search(r"github\.com[:/]([^/\s?#]+/[^/\s?#]+)", url)
+if not m:
+    raise ValueError(f"Cannot parse GitHub URL: {url}")
+repo = m.group(1)
+return repo.removesuffix(".git").rstrip("/")
```
```python
if "cross_attn" in source or "cross_attention" in source:
    if "self_attn" in source or "self_attention" in source:
        result["attention_type"] = "self+cross"
    else:
        result["attention_type"] = "cross_attn"
elif "self_attn" in source or "SelfAttention" in source:
    result["attention_type"] = "self_attn"
```
The heuristics for detecting attention types are case-sensitive and may miss implementations using CamelCase (e.g., CrossAttention). Converting the source to lowercase before performing these checks would improve reliability.
Suggested change:

```diff
-if "cross_attn" in source or "cross_attention" in source:
-    if "self_attn" in source or "self_attention" in source:
-        result["attention_type"] = "self+cross"
-    else:
-        result["attention_type"] = "cross_attn"
-elif "self_attn" in source or "SelfAttention" in source:
-    result["attention_type"] = "self_attn"
+source_lc = source.lower()
+if "cross_attn" in source_lc or "cross_attention" in source_lc:
+    if "self_attn" in source_lc or "self_attention" in source_lc:
+        result["attention_type"] = "self+cross"
+    else:
+        result["attention_type"] = "cross_attn"
+elif "self_attn" in source_lc or "selfattention" in source_lc:
+    result["attention_type"] = "self_attn"
```
```python
        "what_happened": f"Agent implementation differs from ground truth in `{current_file}`.",
        "diff_preview": "\n".join(current_hunk[:20]),
    })
current_file = line.split(" b/")[-1] if " b/" in line else line
```
Pull request overview
Adds a new .agents/skills/fastvideo-port skill intended to automate recurring/mechanical parts of porting new diffusion video models into FastVideo, extracted/generalized from the Cosmos 2.5 port workflow.
Changes:
- Introduces a 9-phase porting workflow document (SKILL.md) plus templates/scripts for prereq checks, recon, weight key auto-mapping, alignment testing, SSIM testing, and an eval harness.
- Adds a prereqs checker (check_prereqs.sh) and GitHub recon script (recon.py) to reduce setup friction before porting.
- Adds eight "lessons learned" markdowns from the Cosmos 2.5 port to guide future ports.
Reviewed changes
Copilot reviewed 16 out of 16 changed files in this pull request and generated 13 comments.
| File | Description |
|---|---|
| `.agents/skills/index.jsonl` | Registers the new fastvideo-port skill in the skill index. |
| `.agents/skills/fastvideo-port/SKILL.md` | Defines the end-to-end port workflow, checklists, and references. |
| `.agents/skills/fastvideo-port/scripts/check_prereqs.sh` | Adds a pre-port environment/credential/weights sanity check. |
| `.agents/skills/fastvideo-port/scripts/recon.py` | Adds GitHub API-based repo "recon" to generate structured JSON guidance. |
| `.agents/skills/fastvideo-port/scripts/auto_mapper.py` | Adds a state-dict key similarity tool to draft param_names_mapping rules. |
| `.agents/skills/fastvideo-port/scripts/alignment_test.py` | Provides a parity/alignment test template for component-level numerical checks. |
| `.agents/skills/fastvideo-port/scripts/ssim_test.py` | Provides an SSIM regression test template to gate end-to-end parity. |
| `.agents/skills/fastvideo-port/scripts/eval_harness.py` | Adds a harness to compare agent-generated ports against a ground-truth port and draft lesson candidates. |
| `.agents/lessons/2026-04-09_scheduler-shift-not-wired-to-training.md` | Documents a training scheduler shift pitfall from Cosmos 2.5. |
| `.agents/lessons/2026-04-09_preprocessing-entrypoint-not-always-ported.md` | Documents the preprocessing entrypoint mismatch pitfall. |
| `.agents/lessons/2026-04-09_multimodal-processor-needs-inner-tokenizer.md` | Documents the multimodal processor tokenizer unwrapping requirement. |
| `.agents/lessons/2026-04-09_lora-requires-replicated-linear.md` | Documents the LoRA injection requirement for ReplicatedLinear. |
| `.agents/lessons/2026-04-09_hardcoded-text-encoder-shape-in-utils.md` | Documents the hardcoded text-encoder shape pitfall in dataset utils. |
| `.agents/lessons/2026-04-09_fp32-bf16-dtype-mismatch-at-projection.md` | Documents the fp32/bf16 projection dtype mismatch pitfall. |
| `.agents/lessons/2026-04-09_bf16-embeddings-need-float-before-numpy.md` | Documents the bf16-to-numpy conversion pitfall in preprocessing. |
| `.agents/lessons/2026-04-09_autotokenizer-fails-on-multimodal-processor.md` | Documents the AutoTokenizer failure mode on multimodal processors. |
```python
    else:
        captures.append(re.escape(op))
        replacements.append(fp)
pat = r"^" + r"\." .join(captures) + r"$"
```
The stray whitespace in `r"\." .join(captures)` reads like a typo: Python does parse it (whitespace is allowed before the attribute dot), but it obscures that this is a method call on the string literal and invites confusion under pre-commit/ruff review. Remove the space so it reads as a normal method call.
Suggested change:

```diff
-pat = r"^" + r"\." .join(captures) + r"$"
+pat = r"^" + r"\.".join(captures) + r"$"
```
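For reference, the corrected line builds an anchored regex over a dotted parameter name; a tiny sketch with invented key parts:

```python
import re

# Each literal part is escaped, then the parts are rejoined with a
# literal dot and anchored at both ends, so only the exact dotted key
# matches.
captures = [re.escape(p) for p in ["blocks", "0", "attn", "to_q"]]
pat = r"^" + r"\.".join(captures) + r"$"

print(pat)  # ^blocks\.0\.attn\.to_q$
```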
```python
    return run(["git", "diff", f"{base}...{head}", "--", "fastvideo/", "examples/", "tests/"],
               cwd=repo)


def get_changed_files(repo: str, base: str, head: str) -> list[str]:
    out = run(["git", "diff", "--name-only", f"{base}...{head}"], cwd=repo)
```
git diff {base}...{head} uses the three-dot form (merge-base to head), which can hide changes that exist only on base (agent branch). For an eval harness that compares agent vs ground truth, use a two-dot diff (or generate diffs in both directions) so discrepancies aren’t missed.
Suggested change:

```diff
-    return run(["git", "diff", f"{base}...{head}", "--", "fastvideo/", "examples/", "tests/"],
-               cwd=repo)
+    return run(["git", "diff", f"{base}..{head}", "--", "fastvideo/", "examples/", "tests/"],
+               cwd=repo)


 def get_changed_files(repo: str, base: str, head: str) -> list[str]:
-    out = run(["git", "diff", "--name-only", f"{base}...{head}"], cwd=repo)
+    out = run(["git", "diff", "--name-only", f"{base}..{head}"], cwd=repo)
```
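The difference is easy to demonstrate in a throwaway repo (requires a git binary; branch names here are invented): a change that exists only on the ground-truth branch shows up in the two-dot diff but not in the three-dot one.

```python
import os
import subprocess
import tempfile


def git(*args: str, cwd: str) -> str:
    return subprocess.run(["git", *args], cwd=cwd, check=True,
                          capture_output=True, text=True).stdout


with tempfile.TemporaryDirectory() as repo:
    git("init", "-q", cwd=repo)
    git("config", "user.email", "demo@example.com", cwd=repo)
    git("config", "user.name", "demo", cwd=repo)
    with open(os.path.join(repo, "a.txt"), "w") as f:
        f.write("base\n")
    git("add", ".", cwd=repo)
    git("commit", "-qm", "base", cwd=repo)

    # Ground-truth branch edits a.txt.
    git("checkout", "-qb", "truth", cwd=repo)
    with open(os.path.join(repo, "a.txt"), "w") as f:
        f.write("truth version\n")
    git("commit", "-aqm", "truth change", cwd=repo)

    # Agent branch forks from the base commit and only adds b.txt.
    git("checkout", "-qb", "agent", "HEAD~1", cwd=repo)
    with open(os.path.join(repo, "b.txt"), "w") as f:
        f.write("agent file\n")
    git("add", ".", cwd=repo)
    git("commit", "-qm", "agent change", cwd=repo)

    three_dot = git("diff", "truth...agent", cwd=repo)  # merge-base..agent
    two_dot = git("diff", "truth..agent", cwd=repo)     # truth tip vs agent tip

    # The truth-only edit to a.txt is invisible to the three-dot form.
    print("a.txt in three-dot:", "a.txt" in three_dot)  # False
    print("a.txt in two-dot:", "a.txt" in two_dot)      # True
```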
```python
import re
import sys
from urllib.request import urlopen, Request
from urllib.error import HTTPError
```
Unused import HTTPError will fail ruff (F401) since .agents/ is linted by pre-commit. Remove it or catch HTTPError explicitly in the request helpers.
Suggested change:

```diff
-from urllib.error import HTTPError
```
```python
import os
from pathlib import Path

import numpy as np
import pytest
```
os and numpy are currently unused (the only np usage is in commented code), which will fail ruff (F401) under pre-commit. Remove the unused imports, or restructure the template so the imports are actually referenced.
```bash
# Hard failures (exit 1): missing GPU, uv not installed, FastVideo not installed
# Warnings (continue): missing tokens, weights not downloaded yet
```
Header comment says missing uv is a hard failure, but the script only issues a warning when uv is absent. Either change the comment to match behavior or make missing uv increment FAIL and exit non-zero.
Suggested change:

```diff
-# Hard failures (exit 1): missing GPU, uv not installed, FastVideo not installed
-# Warnings (continue): missing tokens, weights not downloaded yet
+# Hard failures (exit 1): missing GPU, FastVideo not installed
+# Warnings (continue): uv not installed, missing tokens, weights not downloaded yet
```
```python
for line in diff.splitlines():
    if line.startswith("diff --git"):
        if current_file and current_hunk:
            hunk_text = "\n".join(current_hunk)
            if len(current_hunk) > 5:  # ignore trivial diffs
```
extract_lesson_candidates() never emits a candidate for the final file in the diff because it only flushes current_hunk when it sees the next diff --git header. Add a flush after the loop to process the last accumulated hunk.
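A standalone sketch of the fix (the helper name is mine), which also parses the `diff --git` header with a regex rather than splitting on `" b/"`:

```python
import re


def extract_hunks(diff: str) -> dict[str, str]:
    """Group diff body lines by file; flush after the loop so the final
    file's hunk is not dropped."""
    hunks: dict[str, str] = {}
    current_file, current = "", []

    def flush():
        if current_file and current:
            hunks[current_file] = "\n".join(current)

    for line in diff.splitlines():
        if line.startswith("diff --git"):
            flush()
            m = re.search(r" b/(\S+)$", line)
            current_file = m.group(1) if m else line
            current = []
        else:
            current.append(line)
    flush()  # emit the last accumulated hunk
    return hunks
```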
```python
    return candidates


def format_lesson_md(candidate: dict, model_name: str) -> str:
```
format_lesson_md() is annotated as returning str, but it actually returns a (filename, body) tuple. This will fail mypy; update the return type annotation to a tuple type.
Suggested change:

```diff
-def format_lesson_md(candidate: dict, model_name: str) -> str:
+def format_lesson_md(candidate: dict, model_name: str) -> tuple[str, str]:
```
```python
m = re.search(r"github\.com[:/]([^/]+/[^/\s\.]+)", url)
if not m:
    raise ValueError(f"Cannot parse GitHub URL: {url}")
return m.group(1).rstrip("/")
```
parse_repo_url() currently disallows . in repo names ([^/\s\.]+), so URLs like https://github.com/org/my.repo will be parsed incorrectly. Consider allowing dots in the repo name and separately stripping a trailing .git if present.
Suggested change:

```diff
-m = re.search(r"github\.com[:/]([^/]+/[^/\s\.]+)", url)
-if not m:
-    raise ValueError(f"Cannot parse GitHub URL: {url}")
-return m.group(1).rstrip("/")
+m = re.search(r"github\.com[:/]([^/]+/[^/\s]+)", url)
+if not m:
+    raise ValueError(f"Cannot parse GitHub URL: {url}")
+repo = m.group(1).rstrip("/")
+if repo.endswith(".git"):
+    repo = repo[:-4]
+return repo
```
```python
def main():
    parser = argparse.ArgumentParser(description="Recon a GitHub repo for FastVideo porting")
    parser.add_argument("repo_url", help="GitHub URL, e.g. https://github.com/org/repo")
    parser.add_argument("--output", "-o", default=None, help="JSON output file (default: stdout)")
    parser.add_argument("--token", default=None, help="GitHub personal access token (optional)")
```
The docs/check_prereqs.sh flow expects GITHUB_TOKEN to be set, but recon.py only accepts --token and doesn’t default to the environment variable. Consider defaulting --token to os.environ.get('GITHUB_TOKEN') (or checking it when --token is omitted) so the advertised workflow works without extra flags.
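A sketch of the suggested default (argument wiring only; the surrounding recon.py code is not reproduced, and the token value below is a placeholder):

```python
import argparse
import os

# Demo value only; in real use the caller exports GITHUB_TOKEN beforehand.
os.environ.setdefault("GITHUB_TOKEN", "ghp_example")

parser = argparse.ArgumentParser(description="Recon a GitHub repo")
parser.add_argument(
    "--token",
    default=os.environ.get("GITHUB_TOKEN"),
    help="GitHub personal access token (defaults to $GITHUB_TOKEN)",
)

args = parser.parse_args([])  # no --token flag: falls back to the env var
print(args.token)
```

Note that the default is captured when `add_argument` runs, so the environment variable must be set before the parser is built.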
```markdown
- FastVideo repo cloned; `uv pip install -e .[dev]` complete.
- `official_weights/<model_name>/` populated (download manually — large files
  time out in agents; see Step 0).
- Official repo cloned under `FastVideo/<model_name>/` (Step 0).
```
Prereqs say the official repo should be cloned under FastVideo/<model_name>/, but Step 0’s command clones it into <model_name>/ in the current directory. Please make these paths consistent so users/agents don’t put the repo in two different places.
Suggested change:

```diff
-- Official repo cloned under `FastVideo/<model_name>/` (Step 0).
+- Official repo cloned locally in the location specified in Step 0.
```
- check_prereqs.sh: fix command injection in the HF gating check (use an env var plus a quoted heredoc so $hf_id is never expanded inside Python)
- recon.py: fix the /HEAD/ fallback (query the API for default_branch instead); fix the URL regex to allow dots in repo names (.git suffix stripped); lowercase the source before the attention-type heuristic checks
- eval_harness.py: use a regex to parse the "diff --git" header instead of the fragile split on " b/"
Purpose
Adds a .agents/ skill for porting new diffusion models to FastVideo automatically. Extracted from the Cosmos 2.5 port (PR #1227) and generalised to work for any future model. Goal is >90% port work automation so an agent can handle the mechanical phases without hand-holding.
Changes
- Added .agents/skills/fastvideo-port/SKILL.md: 9-phase port workflow (recon → model impl → alignment → pipeline → SSIM → training), with recommended patterns table, hard stop conditions, credentials checklist, and alignment settings checklist
- Added scripts/check_prereqs.sh: validates GPU/VRAM, attention backend, HF tokens, gating status, and weights on disk before starting a port
- Added scripts/recon.py: GitHub API recon (no clone needed); outputs structured JSON with architecture, components, and HF repo status
- Added scripts/auto_mapper.py: first-pass param_names_mapping via string similarity between official and FastVideo state dicts
- Added scripts/alignment_test.py: per-component numerical parity template (atol=1e-4, SEED=42)
- Added scripts/ssim_test.py: SSIM regression test template vs reference video (gate: mean ≥ 0.85)
- Added scripts/eval_harness.py: diffs agent port branch vs ground truth, outputs lesson candidates
- Added 8 lessons in .agents/lessons/ from the Cosmos 2.5 port covering critical failure patterns
Test Plan
Validate the prereqs script runs without error:

```bash
bash .agents/skills/fastvideo-port/scripts/check_prereqs.sh --model test-model
```

Validate recon.py produces valid JSON against a public repo:

```bash
python .agents/skills/fastvideo-port/scripts/recon.py \
    --repo GAIR-NLP/daVinci-MagiHuman --output /tmp/davinci_recon.json
python -c "import json; d=json.load(open('/tmp/davinci_recon.json')); print(list(d.keys()))"
```
Test Results
Test output
check_prereqs.sh (no GPU on the dev machine, so the GPU check is an expected FAIL):

```text
=================================================
FastVideo Port — Prerequisites Check
[ Environment ]
[OK] fastvideo importable
[OK] uv available
[ GPU ]
[FAIL] No CUDA GPU detected — alignment tests and training require GPU
=================================================
PASS: 2 WARN: 0 FAIL: 1
```

The script correctly exits 1 on the hard failure.