Questions about sampling rate of dataset and VAE

Thank you for sharing your work on TangoFlux. I have a couple of questions:

1. As far as I know, the WavCaps and AudioCaps datasets used for training have lower sampling rates than 44.1 kHz. Did you upsample these datasets for training, or is the high-frequency range in the generated audio intentionally left empty?

2. Is there a specific reason for choosing Stable Audio Open's VAE for compression?

I appreciate your contributions and look forward to your clarification. 
