Mismatch between released ASE metadata split and public ASE dataset scene count

I downloaded the official training metadata from the Pre-computed Training Metadata section in the data processing README, and I am trying to use it to finetune the released pretrained weights.

However, I noticed a mismatch for the ASE dataset:

- The ASE dataset I downloaded contains 100,000 scenes.
- The released metadata contains:
  - 98,696 scenes in ase_scene_list_train.npy
  - 5,194 scenes in ase_scene_list_val.npy

This gives a total of 103,890 scenes. I also checked that the train and val scene lists do not overlap, so the combined count is indeed 103,890.
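
A minimal sketch of this kind of check is below. It assumes each .npy file holds a flat array of scene IDs and that the downloaded dataset has one sub-directory per scene; both are assumptions about the layout, and the helper names and the path in the usage comment are hypothetical.

```python
from pathlib import Path


def load_scene_ids(npy_path):
    """Load scene IDs from a .npy file as strings (assumes a flat array of IDs)."""
    import numpy as np  # imported lazily so compare_scene_lists needs no NumPy

    # allow_pickle=True is only needed if the array has object dtype (assumption).
    arr = np.load(npy_path, allow_pickle=True)
    return [str(x) for x in np.atleast_1d(arr).ravel().tolist()]


def list_downloaded_scenes(ase_root):
    """Assumed layout: one sub-directory per scene under ase_root."""
    return [p.name for p in Path(ase_root).iterdir() if p.is_dir()]


def compare_scene_lists(train_ids, val_ids, downloaded_ids):
    """Summarize counts, duplicates, overlap, and differences between the lists."""
    train_list = [str(x) for x in train_ids]
    val_list = [str(x) for x in val_ids]
    train, val = set(train_list), set(val_list)
    local = {str(x) for x in downloaded_ids}
    metadata = train | val
    return {
        "n_train": len(train),
        "n_val": len(val),
        "train_duplicates": len(train_list) - len(train),
        "val_duplicates": len(val_list) - len(val),
        "n_overlap": len(train & val),
        "n_metadata_total": len(metadata),
        "n_downloaded": len(local),
        "missing_locally": sorted(metadata - local),
        "not_in_metadata": sorted(local - metadata),
    }


# Usage (hypothetical dataset path):
# report = compare_scene_lists(
#     load_scene_ids("ase_scene_list_train.npy"),
#     load_scene_ids("ase_scene_list_val.npy"),
#     list_downloaded_scenes("/path/to/ase"),
# )
# print({k: v for k, v in report.items() if k.startswith("n_")})
```

The IDs are compared as strings, so a formatting difference between the metadata and the directory names (for example zero-padding) would show up as entries in both missing_locally and not_in_metadata rather than as a real difference in scene count.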

Could you clarify why these numbers do not match?
Specifically, I would like to understand whether:

- the released metadata was generated from a different version of ASE,
- it corresponds to an internal ASE split rather than the public 100K release,
- or the metadata was built from a processed subset/superset whose scene definition differs from the public ASE dataset.

I want to make sure I am using the correct ASE data and metadata setup for finetuning the official pretrained weights.

Thanks.
