Installation Guide

This project uses uv for dependency management so that the environment is reproducible.

Prerequisites

Python 3.11
CUDA 12.1+ (recommended)
Linux x86_64
uv >= 0.9.0
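
The prerequisites above can be checked before installing anything. The following is a minimal sketch using only the standard library; the function names `parse_uv_version` and `check_prerequisites` are illustrative and not part of this project. It assumes `uv --version` prints a line of the form `uv X.Y.Z ...`, and it treats the presence of `nvcc` on `PATH` as a rough proxy for a CUDA toolkit install, without checking the CUDA version.

```python
import platform
import re
import shutil
import subprocess
import sys


def parse_uv_version(output: str) -> tuple:
    """Extract (major, minor, patch) from `uv --version` output such as 'uv 0.9.2 (...)'."""
    match = re.search(r"(\d+)\.(\d+)\.(\d+)", output)
    if match is None:
        raise ValueError(f"no version found in {output!r}")
    return tuple(int(part) for part in match.groups())


def check_prerequisites() -> dict:
    """Return a pass/fail flag for each prerequisite in the list above."""
    results = {
        # This checks the interpreter running the script, not the one uv will pick.
        "python_3_11": sys.version_info[:2] == (3, 11),
        "linux_x86_64": platform.system() == "Linux" and platform.machine() == "x86_64",
        "uv>=0.9.0": False,
        # CUDA is recommended rather than required, so this is informational.
        "nvcc_on_path": shutil.which("nvcc") is not None,
    }
    if shutil.which("uv") is not None:
        proc = subprocess.run(["uv", "--version"], capture_output=True, text=True, check=False)
        try:
            results["uv>=0.9.0"] = parse_uv_version(proc.stdout) >= (0, 9, 0)
        except ValueError:
            pass  # unexpected output format: leave the flag as False
    return results


if __name__ == "__main__":
    for name, ok in check_prerequisites().items():
        print(f"{'OK     ' if ok else 'MISSING'} {name}")
```

A failed `python_3_11` check is not necessarily a problem, since `uv venv --python 3.11` selects its own interpreter.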

Installing uv

# Option 1: install with pip
pip install uv

# Option 2: official install script
curl -LsSf https://astral.sh/uv/install.sh | sh

Quick Install

# Clone the project
git clone <repo_url> sam3d_pipeline
cd sam3d_pipeline

# Create a virtual environment and install the base dependencies
uv venv --python 3.11
source .venv/bin/activate
uv sync

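Module Dependencies

Each module has its own dependencies; install only the ones you need.

1. GroundingDINO (object detection)

# ⚠️ Must be installed from source, and --no-build-isolation is required
uv pip install --no-build-isolation "groundingdino @ git+https://github.com/IDEA-Research/GroundingDINO.git"

# Download the checkpoint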
mkdir -p src/modules/grounding_dino/checkpoints
wget -P src/modules/grounding_dino/checkpoints/ \
    https://github.com/IDEA-Research/GroundingDINO/releases/download/v0.1.0-alpha2/groundingdino_swinb_cogcoor.pth

2. SAM (single-frame segmentation)

# Install segment-anything
uv pip install segment-anything

# Download the checkpoint
mkdir -p src/modules/sam/checkpoints
wget -P src/modules/sam/checkpoints/ \
    https://dl.fbaipublicfiles.com/segment_anything/sam_vit_h_4b8939.pth

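3. SAM2 (video tracking) ⭐ core module

# ⚠️ Must be installed from source; the package name is sam-2 (with a hyphen)
uv pip install --no-build-isolation "sam-2 @ git+https://github.com/facebookresearch/sam2.git"

# Download the checkpoint (the SAM 2.1 weights are saved locally as sam2_hiera_large.pt)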
mkdir -p src/modules/sam2/checkpoints
wget -O src/modules/sam2/checkpoints/sam2_hiera_large.pt \
    https://dl.fbaipublicfiles.com/segment_anything_2/092824/sam2.1_hiera_large.pt

4. RAM (tag recognition)

# ⚠️ Must be installed from source
uv pip install "ram @ git+https://github.com/xinyu1205/recognize-anything.git"

# Download the checkpoint (from HuggingFace)
mkdir -p src/modules/ram/checkpoints
# Manual download required: https://huggingface.co/xinyu1205/recognize-anything-plus-model
# File: ram_swin_large_14m.pth (~2GB)

5. MASt3R (3D reconstruction)

# Clone the MASt3R repository
git clone --recursive https://github.com/naver/mast3r third_party/mast3r

# Install
uv pip install -e third_party/mast3r

# Download the checkpoint (from HuggingFace)
mkdir -p src/modules/mast3r/checkpoints
# huggingface-cli download naver/MASt3R --local-dir src/modules/mast3r/checkpoints

6. SAM3 (text-prompted segmentation)

# Requires the SAM3 model, usually downloaded from HuggingFace
mkdir -p src/modules/sam3/checkpoints/sam3
# Download the model files into this directory

7. SAM3D Objects (single-image 3D)

# Clone the code
git clone https://github.com/facebookresearch/sam3d-objects third_party/sam3d-objects

# ⚠️ Install the special dependencies (compilation required)
uv pip install spconv-cu121

# PyTorch3D (must be compiled; this can be slow)
uv pip install --no-build-isolation "pytorch3d @ git+https://github.com/facebookresearch/pytorch3d.git"

# Download the checkpoints
# Download the complete checkpoint directory from HuggingFace into src/modules/sam3d_objects/checkpoints/

8. Kinematics

# Already included in the base dependencies
# pytorch-kinematics is installed automatically

# Optional: nvdiffrast (differentiable rendering)
uv pip install --no-build-isolation "nvdiffrast @ git+https://github.com/NVlabs/nvdiffrast.git"

One-Step Install Script

#!/bin/bash
# install_all.sh - install the dependencies for all modules

set -e

echo "=== Installing base dependencies ==="
uv sync

echo "=== Installing GroundingDINO ==="
uv pip install --no-build-isolation "groundingdino @ git+https://github.com/IDEA-Research/GroundingDINO.git"

echo "=== Installing SAM ==="
uv pip install segment-anything

echo "=== Installing SAM2 ==="
uv pip install --no-build-isolation "sam-2 @ git+https://github.com/facebookresearch/sam2.git"

echo "=== Installing RAM ==="
uv pip install --no-build-isolation "ram @ git+https://github.com/xinyu1205/recognize-anything.git"

echo "=== Done ==="
echo "Please download the required checkpoint files manually"
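
The script leaves checkpoint downloads to the user. For the three checkpoints that this guide fetches with wget, the download step could be sketched in Python as follows. The URLs and destination paths are copied from the module sections above; the names `DIRECT_DOWNLOADS` and `download_missing` are illustrative and not part of the project. Checkpoints hosted on HuggingFace are not covered.

```python
from pathlib import Path
from urllib.request import urlretrieve

# Destination path (relative to the project root) -> download URL, as listed in this guide.
DIRECT_DOWNLOADS = {
    "src/modules/grounding_dino/checkpoints/groundingdino_swinb_cogcoor.pth":
        "https://github.com/IDEA-Research/GroundingDINO/releases/download/"
        "v0.1.0-alpha2/groundingdino_swinb_cogcoor.pth",
    "src/modules/sam/checkpoints/sam_vit_h_4b8939.pth":
        "https://dl.fbaipublicfiles.com/segment_anything/sam_vit_h_4b8939.pth",
    # The SAM 2.1 weights are stored under the local name sam2_hiera_large.pt.
    "src/modules/sam2/checkpoints/sam2_hiera_large.pt":
        "https://dl.fbaipublicfiles.com/segment_anything_2/092824/sam2.1_hiera_large.pt",
}


def download_missing(root=".", fetch=urlretrieve):
    """Download every listed checkpoint that is not on disk yet; return the paths fetched.

    `fetch(url, destination)` defaults to urllib's urlretrieve and can be swapped
    for another downloader (or a stub when testing).
    """
    fetched = []
    for relative, url in DIRECT_DOWNLOADS.items():
        dest = Path(root) / relative
        if dest.exists():
            continue  # already downloaded, skip
        dest.parent.mkdir(parents=True, exist_ok=True)
        fetch(url, str(dest))
        fetched.append(dest)
    return fetched
```

Calling `download_missing()` from the project root fetches only the files that are absent, so it is safe to re-run after an interrupted install. A partially written file would be treated as present, so delete any incomplete download first.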

Checkpoint Download Summary

| Module | File | Size | Source |
| --- | --- | --- | --- |
| GroundingDINO | `groundingdino_swinb_cogcoor.pth` | ~700MB | GitHub Release |
| SAM | `sam_vit_h_4b8939.pth` | ~2.5GB | Meta AI |
| SAM2 | `sam2_hiera_large.pt` | ~900MB | Meta AI |
| RAM | `ram_swin_large_14m.pth` | ~2GB | HuggingFace |
| MASt3R | `MASt3R_ViTLarge_*.pth` | ~1.5GB | HuggingFace |
| SAM3D | multiple `.ckpt` files | ~4GB | HuggingFace |
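
To see which of the checkpoints in the summary are still missing, a small check along these lines may help. It is a sketch under assumptions: the directories are the ones created with `mkdir -p` in the module sections, the file patterns come from the summary, and the search is recursive because a HuggingFace download may place files in subdirectories. `EXPECTED` and `missing_checkpoints` are illustrative names.

```python
from pathlib import Path

# Module -> (checkpoint directory, file name or glob pattern), taken from this guide.
EXPECTED = {
    "GroundingDINO": ("src/modules/grounding_dino/checkpoints", "groundingdino_swinb_cogcoor.pth"),
    "SAM": ("src/modules/sam/checkpoints", "sam_vit_h_4b8939.pth"),
    "SAM2": ("src/modules/sam2/checkpoints", "sam2_hiera_large.pt"),
    "RAM": ("src/modules/ram/checkpoints", "ram_swin_large_14m.pth"),
    "MASt3R": ("src/modules/mast3r/checkpoints", "MASt3R_ViTLarge_*.pth"),
    "SAM3D": ("src/modules/sam3d_objects/checkpoints", "*.ckpt"),
}


def missing_checkpoints(root="."):
    """Return the modules whose checkpoint file(s) cannot be found under `root`."""
    missing = []
    for module, (directory, pattern) in EXPECTED.items():
        folder = Path(root) / directory
        # rglob also matches files nested in subdirectories of the checkpoint folder.
        found = folder.is_dir() and any(folder.rglob(pattern))
        if not found:
            missing.append(module)
    return missing


if __name__ == "__main__":
    absent = missing_checkpoints()
    print("All checkpoints found" if not absent else f"Missing: {', '.join(absent)}")
```

The check only confirms that a matching file exists; it does not verify file size or integrity.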

Verifying the Installation

# Activate the environment
source .venv/bin/activate

# Test the imports
python -c "
from src.modules.grounding_dino import GroundingDINO
from src.modules.sam2 import SAM2VideoTracker
print('✓ Modules imported successfully')
"

# Run the test
python tools/test_robot_segmentation.py --video Semiff/data/example_01/video.mp4 --output output/test
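
The import test above loads the project's own wrappers, which pulls in the heavy dependencies. A lighter check that only asks whether each third-party package is installed, without importing it, might look like the sketch below. The import names in `MODULE_IMPORTS` are assumptions based on how these packages are usually imported, and `installed_status` is an illustrative helper, not part of the project.

```python
from importlib.util import find_spec

# Label -> top-level import name (assumed; adjust if a package is imported differently).
MODULE_IMPORTS = {
    "GroundingDINO": "groundingdino",
    "SAM": "segment_anything",
    "SAM2": "sam2",
    "RAM": "ram",
    "PyTorch3D": "pytorch3d",
    "spconv": "spconv",
    "Kinematics": "pytorch_kinematics",
}


def installed_status(imports):
    """Map each label to True if its top-level package can be located, else False."""
    return {label: find_spec(name) is not None for label, name in imports.items()}


if __name__ == "__main__":
    for label, ok in installed_status(MODULE_IMPORTS).items():
        print(f"{'✓' if ok else '✗'} {label}")
```

`find_spec` only locates the package, so a package that is installed but fails at import time (for example because of a CUDA mismatch) still shows as present.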

FAQ

Q: PyTorch3D fails to compile

# Make sure the CUDA environment is set up correctly
export CUDA_HOME=/usr/local/cuda
export PATH=$CUDA_HOME/bin:$PATH

# Use --no-build-isolation
uv pip install --no-build-isolation "pytorch3d @ git+https://github.com/facebookresearch/pytorch3d.git"
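
Before retrying a failed build, it can help to confirm which CUDA toolkit `CUDA_HOME` actually points to. The sketch below runs `nvcc --version` and extracts the release number; it assumes the output contains a phrase of the form `release 12.1`, and the helper names are illustrative.

```python
import os
import re
import shutil
import subprocess
from typing import Optional, Tuple


def parse_nvcc_release(output: str) -> Optional[Tuple[int, int]]:
    """Extract (major, minor) from `nvcc --version` output, or None if absent."""
    match = re.search(r"release (\d+)\.(\d+)", output)
    return (int(match.group(1)), int(match.group(2))) if match else None


def cuda_toolkit_release() -> Optional[Tuple[int, int]]:
    """Return the release of the nvcc under CUDA_HOME, falling back to nvcc on PATH."""
    cuda_home = os.environ.get("CUDA_HOME", "/usr/local/cuda")
    nvcc = os.path.join(cuda_home, "bin", "nvcc")
    if not os.path.exists(nvcc):
        nvcc = shutil.which("nvcc")
    if not nvcc:
        return None  # no CUDA compiler found
    proc = subprocess.run([nvcc, "--version"], capture_output=True, text=True, check=False)
    return parse_nvcc_release(proc.stdout)


if __name__ == "__main__":
    release = cuda_toolkit_release()
    print("nvcc not found" if release is None else f"CUDA toolkit release {release[0]}.{release[1]}")
```

If the reported release differs from the CUDA version the installed PyTorch build targets, that mismatch is a likely cause of extension build failures.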

Q: spconv fails to install

# Use a prebuilt wheel
uv pip install spconv-cu121
# or
uv pip install spconv-cu120

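Q: GroundingDINO compilation errors

# Make sure the build tools are installed
sudo apt install build-essential

# Set the environment variable, then reinstall with --no-build-isolation
export CUDA_HOME=/usr/local/cuda
uv pip install --no-build-isolation "groundingdino @ git+https://github.com/IDEA-Research/GroundingDINO.git"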

Q: Out of GPU memory

Some models need a large amount of GPU memory:

SAM2 Large: ~8GB
SAM3D: ~16GB
Full pipeline: ~24GB

An RTX 3090/4090 or a higher-end GPU is recommended.
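
The figures above can be compared against the installed GPU. This sketch reads the total memory of the first GPU from `nvidia-smi` and lists which workloads should fit. The thresholds are the approximate values quoted above, and the 0.5 GB slack is an assumption added here because a nominal 24 GB card typically reports slightly less than 24 GiB. The function names are illustrative.

```python
import shutil
import subprocess

# Approximate requirements in GB, as quoted in this FAQ entry.
VRAM_REQUIRED_GB = {"SAM2 Large": 8, "SAM3D": 16, "Full pipeline": 24}


def workloads_that_fit(total_gb, slack_gb=0.5):
    """Return the workloads whose approximate requirement is within total_gb (plus slack)."""
    return [name for name, need in VRAM_REQUIRED_GB.items() if total_gb + slack_gb >= need]


def total_vram_gb():
    """Total memory of the first GPU in GiB, or None if nvidia-smi is unavailable."""
    if shutil.which("nvidia-smi") is None:
        return None
    proc = subprocess.run(
        ["nvidia-smi", "--query-gpu=memory.total", "--format=csv,noheader,nounits"],
        capture_output=True, text=True, check=False,
    )
    lines = proc.stdout.strip().splitlines()
    if proc.returncode != 0 or not lines:
        return None
    try:
        return float(lines[0]) / 1024  # nvidia-smi reports MiB
    except ValueError:
        return None


if __name__ == "__main__":
    total = total_vram_gb()
    if total is None:
        print("No NVIDIA GPU detected")
    else:
        print(f"{total:.1f} GiB total; should fit: {workloads_that_fit(total) or 'nothing listed'}")
```

Total memory is an upper bound: other processes on the same GPU reduce what is actually available.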

Exporting the Environment

# Export the current environment (for reproducibility)
uv pip freeze > requirements.lock.txt

# Restore from the lock file
uv pip install -r requirements.lock.txt
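
When an environment restored from the lock file behaves differently, comparing two freeze outputs shows which pins changed. The sketch below assumes the usual freeze line formats, `name==version` and `name @ url`, and skips editable (`-e`) entries and comments. `parse_freeze` and `diff_pins` are illustrative helpers.

```python
def parse_freeze(text):
    """Parse freeze output into {package name: pinned version or URL}."""
    pins = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith(("#", "-e ")):
            continue  # skip blanks, comments, and editable installs
        if " @ " in line:
            name, ref = line.split(" @ ", 1)  # direct-URL or VCS install
        elif "==" in line:
            name, ref = line.split("==", 1)
        else:
            continue  # unrecognized format
        pins[name.strip().lower()] = ref.strip()
    return pins


def diff_pins(old_text, new_text):
    """Return {name: (old pin, new pin)} for every package whose pin differs."""
    old, new = parse_freeze(old_text), parse_freeze(new_text)
    return {
        name: (old.get(name), new.get(name))
        for name in sorted(old.keys() | new.keys())
        if old.get(name) != new.get(name)
    }
```

For example, `diff_pins(open("requirements.lock.txt").read(), current_freeze_output)` returns an empty dict when the two environments pin the same packages.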
