This repository contains tools and scripts for pairing aerial (UAV) and ground/street images using camera extrinsics, building datasets for image matching and running reconstruction demos.
The following sections document common workflows used in this project.
- Prepare transforms.json files (camera intrinsics & extrinsics) for aerial and ground sequences.
- Use helper scripts to find nadir / oblique frames or to pair UAV↔ground images.
- Run parse_dataset.py to build custom ground-truth pairs and dataset lists.
- Run matching code for different models, and finally run the demo reconstruction scripts.
These are the concrete commands used in the project to prepare datasets and pairs. Update paths and thresholds to your needs.
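For orientation on what the tilt thresholds in the commands below mean, here is a minimal sketch of measuring camera tilt from a camera-to-world matrix. It is not taken from scripts/find_vertical.py or scripts/find_oblique.py. It assumes the OpenGL/NeRF convention (the camera looks along its local -Z axis) and a world frame whose Z axis points up. The function names and default thresholds are hypothetical and only mirror the example values used in the commands.

```python
import math


def tilt_from_nadir_deg(c2w, world_down=(0.0, 0.0, -1.0)):
    """Angle in degrees between the camera optical axis and straight down.

    Assumptions (check them against your data): c2w is a 3x4 or 4x4
    camera-to-world matrix given as nested lists, the camera looks along
    its local -Z axis, and world_down is a unit vector.
    """
    # Optical axis in world coordinates: minus the third rotation column.
    axis = [-c2w[row][2] for row in range(3)]
    norm = math.sqrt(sum(a * a for a in axis))
    cos_t = sum(a * d for a, d in zip(axis, world_down)) / norm
    cos_t = max(-1.0, min(1.0, cos_t))  # guard against rounding error
    return math.degrees(math.acos(cos_t))


def classify(tilt_deg, nadir_thresh=10.0, min_angle=10.0, max_angle=60.0):
    """Label a frame as nadir, oblique, or other from its tilt angle."""
    if tilt_deg <= nadir_thresh:
        return "nadir"
    if min_angle <= tilt_deg <= max_angle:
        return "oblique"
    return "other"


if __name__ == "__main__":
    # A camera pitched 45 degrees away from straight down (rotation about X).
    c, s = math.cos(math.radians(45)), math.sin(math.radians(45))
    c2w = [[1, 0, 0, 0], [0, c, -s, 0], [0, s, c, 0], [0, 0, 0, 1]]
    print(classify(tilt_from_nadir_deg(c2w)))
```

If your transforms.json uses a different axis convention (for example a camera that looks along +Z), the sign of the optical axis changes and the thresholds need to be interpreted accordingly.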
- Find nadir (near-vertical) images:
python scripts/find_vertical.py \
--json /working/HorizonGS_real_dataset/road/transforms.json \
--threshold 10 \
--out /working/nadir_list.txt
- Find oblique images (by tilt angle range):
python scripts/find_oblique.py \
--json /working/HorizonGS_real_dataset/road/transforms.json \
--min-angle 10 --max-angle 60 \
--out /working/oblique_list.txt
- Pair aerial (UAV) and ground images using extrinsics (example):
python scripts/pair_uav_ground.py \
--uav-json /working/HorizonGS_real_dataset/park/transforms.json \
--ground-json /working/HorizonGS_real_dataset/park/transforms.json \
--single-json /working/HorizonGS_real_dataset/road/transforms.json \
--distance-thresh 80 \
--angle-thresh 90 \
--iou-thresh 0.02 \
--out /working/pairs_uav_ground.csv \
--visualize /working/pair_viz \
--ground-z 0.0
Notes:
- The script supports two modes: (1) separate --uav-json and --ground-json files; (2) a single --single-json file containing both frame types. With --single-json, the script uses --uav-substr and --ground-substr to identify which frames are aerial and which are ground (defaults: aerial and street).
- In --single-json mode the CLI still requires values for --uav-json and --ground-json because of how the argument parser is set up; you can pass placeholder values (e.g. --uav-json 1 --ground-json 1). This behavior can be changed if desired.
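As an illustration of the --single-json mode, the following sketch splits frames into aerial and ground groups by substring. It is not the code in scripts/pair_uav_ground.py. It assumes a NeRF-style transforms.json with a top-level "frames" list whose entries carry a "file_path" string. The function name split_frames and the sample paths are hypothetical.

```python
def split_frames(transforms, uav_substr="aerial", ground_substr="street"):
    """Split frames into (uav, ground) lists by substring match on file_path.

    Assumption: `transforms` is the parsed JSON dict and each frame has a
    "file_path" key. Frames that match neither substring are skipped.
    """
    uav, ground = [], []
    for frame in transforms.get("frames", []):
        path = frame.get("file_path", "")
        if uav_substr in path:
            uav.append(frame)
        elif ground_substr in path:
            ground.append(frame)
    return uav, ground


if __name__ == "__main__":
    # Hypothetical sample; in practice, load the file with json.load().
    sample = {
        "frames": [
            {"file_path": "images/aerial/0001.jpg"},
            {"file_path": "images/street/0001.jpg"},
            {"file_path": "images/street/0002.jpg"},
        ]
    }
    uav_frames, ground_frames = split_frames(sample)
    print(len(uav_frames), len(ground_frames))
```

If the key names in your transforms.json differ, adjust the lookups; the substring defaults match those documented above.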
After pairing, run the dataset parsing script to create final lists and ground-truth pairs used for evaluation:
python parse_dataset.py \
--sparse_dir /working/HorizonGS_real_dataset/road/sparse/0 \
--output /working/HorizonGS_real_dataset/road/air2ground_pairs_gt.txt \
--mode custom \
--csv_pairs /working/pairs_uav_ground.csv
Adjust arguments as needed for your data layout.
The repository contains multiple matchers. Example model entry points:
- SuperGlue pretrained network: SuperGluePretrainedNetwork/match_pairs.py
- MASt3R matcher: mast3r/mast3r_image_matching.py
- VGGT matcher: vggt/match_pairs_vggt.py
Each module has its own argument parser and dependencies. Check the top of each script and the model folders for a README or requirements file specific to that matcher.
After matching and filtering inliers, you can run the reconstruction demos:
- MASt3R demo: mast3r/demo.py
- VGGT demo (Gradio): vggt/demo_gradio.py
Follow the comments in each demo for dataset and configuration options.
While diagnosing pairing issues, we noticed that filtering by the angle between the two optical axes can pass false-positive pairs: two cameras may point in similar directions without looking toward each other. To improve pairing quality, scripts/pair_uav_ground.py now uses an improved angle criterion. For a candidate pair (UAV, ground):
- Compute the vector from the UAV center to the ground center, and the angle between the UAV optical axis and that vector (how directly the UAV looks toward the ground camera).
- Compute the vector from the ground center to the UAV center, and the angle between the ground optical axis and that vector.
- Take the maximum of these two angles as the pair angle.
- If the pair angle exceeds --angle-thresh, reject the pair.
This approach reduces cases where distant images with similar camera orientations are incorrectly considered as facing each other.