Mismatch between released ASE metadata split and public ASE dataset scene count #155

@seanmu

I downloaded the official training metadata from the Pre-computed Training Metadata section in the data processing README, and I am trying to use it to finetune the released pretrained weights.

However, I noticed a mismatch for the ASE dataset:

- The ASE dataset I downloaded contains 100,000 scenes.
- The released metadata contains:
  - 98,696 scenes in `ase_scene_list_train.npy`
  - 5,194 scenes in `ase_scene_list_val.npy`

This gives a total of 103,890 scenes, which is 3,890 more than the 100,000 scenes in the dataset I downloaded. I also checked that the train and val scene lists do not overlap, so the combined count is indeed 103,890.
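A count and overlap check of this kind can be sketched as follows. This is a minimal sketch, not the repository's own tooling: it assumes each `.npy` file stores a flat array of scene identifiers, and the helper names `load_scene_list` and `summarize_splits` are hypothetical.

```python
def load_scene_list(path):
    """Load a scene list from a .npy file as a list of strings.

    Assumption: the file holds a flat array of scene identifiers.
    allow_pickle=True is only needed if the array has object dtype;
    use it only on files from a trusted source.
    """
    import numpy as np  # imported lazily so the pure-Python helper below has no dependency

    return [str(s) for s in np.load(path, allow_pickle=True).ravel().tolist()]


def summarize_splits(train_ids, val_ids):
    """Count unique scenes per split, their overlap, and the combined total."""
    train_list, val_list = list(train_ids), list(val_ids)
    train, val = set(train_list), set(val_list)
    return {
        "train": len(train),
        "val": len(val),
        "overlap": len(train & val),      # scenes present in both splits
        "combined": len(train | val),     # unique scenes across both splits
        "train_duplicates": len(train_list) - len(train),
        "val_duplicates": len(val_list) - len(val),
    }
```

With the two released files in the working directory, `summarize_splits(load_scene_list("ase_scene_list_train.npy"), load_scene_list("ase_scene_list_val.npy"))` would report the per-split counts, and an `overlap` of 0 means the combined count is simply the sum of the two.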

Could you clarify why these numbers do not match?
Specifically, I would like to understand whether:

- the released metadata was generated from a different version of ASE,
- it corresponds to an internal ASE split rather than the public 100K release,
- or the metadata was built from a processed subset/superset whose scene definition differs from the public ASE dataset.
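One way to narrow down which of these cases applies is to diff the scene IDs in the metadata against the scenes actually present on disk. The sketch below is illustrative only: it assumes the downloaded dataset has one directory per scene named by its scene ID, which may not match the real layout, and `list_scene_dirs` and `compare_with_dataset` are hypothetical helpers.

```python
from pathlib import Path


def list_scene_dirs(root):
    """Return the sorted names of the immediate subdirectories of `root`.

    Assumption: each subdirectory corresponds to exactly one scene.
    """
    return sorted(p.name for p in Path(root).iterdir() if p.is_dir())


def compare_with_dataset(metadata_ids, dataset_ids):
    """Report which scene IDs appear only in the metadata or only on disk.

    IDs are compared as strings, so 5 and "5" match, but differently
    zero-padded IDs such as "5" and "00005" do not.
    """
    meta = {str(s) for s in metadata_ids}
    data = {str(s) for s in dataset_ids}
    return {
        "only_in_metadata": sorted(meta - data),
        "only_in_dataset": sorted(data - meta),
        "shared": len(meta & data),
    }
```

If `only_in_metadata` is non-empty, the metadata references scenes that the downloaded release does not contain, which would point toward a different version or a differing scene definition rather than a simple subset.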

I want to make sure I am using the correct ASE data and metadata setup for finetuning the official pretrained weights.

Thanks.
