Thank you very much for your work!
I want to reconstruct a complete scene iteratively, but I am running into a problem with camera_poses.
My first inference used [00.jpg, 01.jpg, 02.jpg, 03.jpg] and output [00_pose, 01_pose, 02_pose, 03_pose].
For my second inference I used Multi Modal Inference with [02.jpg+02_pose, 03.jpg+03_pose, 04.jpg, 05.jpg]. I expected the output for [02.jpg+02_pose] to be 02_pose, so that the second result could be merged directly into the result of the first inference.
Instead, the output pose for [02.jpg+02_pose] in the second inference seems to be reset to the identity matrix, which means this frame is taken as the origin of the world coordinate system.
Is this the only output the model allows?
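
If the identity output is expected, one possible workaround is to re-align the second run after inference. Below is a minimal sketch of that idea, not the model's own API. It assumes the poses are 4x4 camera-to-world matrices and that both runs share the same scale (which may not hold in practice); `realign_poses` and its parameter names are hypothetical.

```python
import numpy as np

def realign_poses(anchor_pose_first_run, poses_second_run, anchor_index=0):
    """Re-express second-run poses in the first run's world frame.

    Assumption: every pose is a 4x4 camera-to-world matrix.
    anchor_pose_first_run: pose of the shared frame (e.g. 02_pose) from run 1.
    poses_second_run: list of poses from run 2; the shared frame sits at
    anchor_index and is typically the identity matrix.
    """
    anchor_second = np.asarray(poses_second_run[anchor_index], dtype=float)
    # Rigid transform that maps run-2 world coordinates to run-1 world coordinates.
    world2_to_world1 = np.asarray(anchor_pose_first_run, dtype=float) @ np.linalg.inv(anchor_second)
    return [world2_to_world1 @ np.asarray(p, dtype=float) for p in poses_second_run]
```

With this, the re-aligned pose of 02.jpg equals 02_pose from the first inference, and the poses of 04.jpg and 05.jpg are expressed in the same world frame. Any scale difference between the two runs would still need to be handled separately.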