Metadata Tabulate Problematic with Per Sequence Data #35

@Oddant1

Description

Bug Description
Metadata Tabulate churns for extended periods of time on data with many rows/records (e.g. ErrorCorrectionDetails), and the resulting .qzv is unapproachably long.

Steps to reproduce the behavior

  1. Run qiime metadata tabulate on a significantly sized ErrorCorrectionDetails .qza
  2. Wait for the heat death of the universe
  3. Get nothing

Expected behavior
Warn the user that what they're about to do has a significant probability of taking a long time, failing, or both. When the process does fail, tell the user that it failed, how it failed, and why, to the best of our ability.
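The pre-flight warning described above could look something like the following minimal sketch. The function name and the row threshold are assumptions for illustration only; the real cutoff would need benchmarking against typical ErrorCorrectionDetails sizes.

```python
import warnings

# Hypothetical threshold above which tabulation is expected to be slow;
# not a value QIIME 2 actually ships.
LARGE_METADATA_ROWS = 1_000_000

def warn_if_large(n_rows: int, threshold: int = LARGE_METADATA_ROWS) -> None:
    """Warn before tabulating metadata with very many rows/records."""
    if n_rows > threshold:
        warnings.warn(
            f"Tabulating {n_rows:,} rows may take a very long time, fail, "
            "or produce an unusably large .qzv.",
            UserWarning,
        )
```

A check like this could run before any rendering work starts, so the user hears about the risk while cancelling is still cheap.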

Computation Environment
See forum x-ref

  • OS: on an HPC, so presumably some form of Linux
  • 64 GB of RAM available to the compute node
  • Unknown version of QIIME 2

Questions

  1. What file size is large enough for us to warn them about the probability of failure? Or should we just say "Producing .qzv files from this form of .qza can take a significant amount of time and compute power with a high risk of failure" or something similar regardless of any other factors?
  2. Can we do this without negatively impacting backwards compatibility for users?
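On question 2, one point in favor of a plain warning: Python warnings go to stderr and don't change an action's return value or exit status, so a purely additive warning shouldn't break existing programmatic callers. A minimal sketch, where `tabulate_stub` is a hypothetical stand-in for the real action, not QIIME 2 code:

```python
import warnings

def tabulate_stub(n_rows: int) -> str:
    """Stand-in for the tabulate action: the result is identical whether
    or not the size warning fires, so existing callers are unaffected."""
    if n_rows > 1_000_000:  # hypothetical threshold
        warnings.warn("large input; this may take a long time", UserWarning)
    return "visualization"

# A caller that suppresses warnings still gets the same return value:
with warnings.catch_warnings():
    warnings.simplefilter("ignore")
    assert tabulate_stub(2_000_000) == "visualization"
```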

References

  1. https://forum.qiime2.org/t/metadata-tabulate-fails-on-demux-error-correction-details-without-error-message/11824/4
  2. New/Improved visualization for Golay statistics q2-demux#105
