Bash parameters prefixed in front of commands by command_prepare are insufficient, leading to silent errors #26

@alexiswl

Description

From some of the errors in #23, I noticed that bolt continues to run commands despite previous errors.

In the snippet below, it looks like bcftools merge is called and, when it returns only its usage text, the next command, bcftools sort, is run anyway.

Details
      "2025-12-08T03:45:58.053232052Z 2025-12-08 03:45:58,052 - bolt.util - INFO - Running bcftools merge...",
        "2025-12-08T03:45:58.053474427Z 2025-12-08 03:45:58,053 - bolt.util - INFO - Executing command: bcftools merge \\",
        "2025-12-08T03:45:58.053512058Z           -m all \\",
        "2025-12-08T03:45:58.053534348Z           -Oz \\",
        "2025-12-08T03:45:58.053553688Z           -o output/pcgr/nosampleset.pcgr.grch38.unsorted.vcf.gz \\",
        "2025-12-08T03:45:58.053573359Z           output/pcgr/pcgr_4/nosampleset.pcgr.grch38.pass.vcf.gz",
        "2025-12-08T03:45:58.062109158Z 2025-12-08 03:45:58,061 - bolt.util - INFO - ",
        "2025-12-08T03:45:58.062187770Z 2025-12-08 03:45:58,062 - bolt.util - INFO - About:   Merge multiple VCF/BCF files from non-overlapping sample sets to create one multi-sample file.",
        "2025-12-08T03:45:58.062278831Z 2025-12-08 03:45:58,062 - bolt.util - INFO - Note that only records from different files can be merged, never from the same file. For",
        "2025-12-08T03:45:58.062366943Z 2025-12-08 03:45:58,062 - bolt.util - INFO - \"vertical\" merge take a look at \"bcftools norm\" instead.",
        "2025-12-08T03:45:58.062465855Z 2025-12-08 03:45:58,062 - bolt.util - INFO - Usage:   bcftools merge [options] <A.vcf.gz> <B.vcf.gz> [...]",
        "2025-12-08T03:45:58.062544646Z 2025-12-08 03:45:58,062 - bolt.util - INFO - ",
        "2025-12-08T03:45:58.062638178Z 2025-12-08 03:45:58,062 - bolt.util - INFO - Options:",
        "2025-12-08T03:45:58.062751910Z 2025-12-08 03:45:58,062 - bolt.util - INFO - --force-samples               Resolve duplicate sample names",
        "2025-12-08T03:45:58.062831162Z 2025-12-08 03:45:58,062 - bolt.util - INFO - --print-header                Print only the merged header and exit",
        "2025-12-08T03:45:58.062936284Z 2025-12-08 03:45:58,062 - bolt.util - INFO - --use-header FILE             Use the provided header",
        "2025-12-08T03:45:58.063035006Z 2025-12-08 03:45:58,062 - bolt.util - INFO - -0  --missing-to-ref              Assume genotypes at missing sites are 0/0",
        "2025-12-08T03:45:58.063149228Z 2025-12-08 03:45:58,063 - bolt.util - INFO - -f, --apply-filters LIST          Require at least one of the listed FILTER strings (e.g. \"PASS,.\")",
        "2025-12-08T03:45:58.063248850Z 2025-12-08 03:45:58,063 - bolt.util - INFO - -F, --filter-logic x|+            Remove filters if some input is PASS (\"x\"), or apply all filters (\"+\") [+]",
        "2025-12-08T03:45:58.063348582Z 2025-12-08 03:45:58,063 - bolt.util - INFO - -g, --gvcf -|REF.FA               Merge gVCF blocks, INFO/END tag is expected. Implies -i QS:sum,MinDP:min,I16:sum,IDV:max,IMF:max",
        "2025-12-08T03:45:58.063471664Z 2025-12-08 03:45:58,063 - bolt.util - INFO - -i, --info-rules TAG:METHOD,..    Rules for merging INFO fields (method is one of sum,avg,min,max,join) or \"-\" to turn off the default [DP:sum,DP4:sum]",
        "2025-12-08T03:45:58.063585266Z 2025-12-08 03:45:58,063 - bolt.util - INFO - -l, --file-list FILE              Read file names from the file",
        "2025-12-08T03:45:58.063685968Z 2025-12-08 03:45:58,063 - bolt.util - INFO - -L, --local-alleles INT           EXPERIMENTAL: if more than <int> ALT alleles are encountered, drop FMT/PL and output LAA+LPL instead; 0=unlimited [0]",
        "2025-12-08T03:45:58.063772159Z 2025-12-08 03:45:58,063 - bolt.util - INFO - -m, --merge STRING                Allow multiallelic records for <snps|indels|both|snp-ins-del|all|none|id>, see man page for details [both]",
        "2025-12-08T03:45:58.063877521Z 2025-12-08 03:45:58,063 - bolt.util - INFO - --no-index                    Merge unindexed files, the same chromosomal order is required and -r/-R are not allowed",
        "2025-12-08T03:45:58.063972873Z 2025-12-08 03:45:58,063 - bolt.util - INFO - --no-version                  Do not append version and command line to the header",
        "2025-12-08T03:45:58.064075205Z 2025-12-08 03:45:58,063 - bolt.util - INFO - -o, --output FILE                 Write output to a file [standard output]",
        "2025-12-08T03:45:58.064164837Z 2025-12-08 03:45:58,064 - bolt.util - INFO - -O, --output-type u|b|v|z[0-9]    u/b: un/compressed BCF, v/z: un/compressed VCF, 0-9: compression level [v]",
        "2025-12-08T03:45:58.064266949Z 2025-12-08 03:45:58,064 - bolt.util - INFO - -r, --regions REGION              Restrict to comma-separated list of regions",
        "2025-12-08T03:45:58.064364401Z 2025-12-08 03:45:58,064 - bolt.util - INFO - -R, --regions-file FILE           Restrict to regions listed in a file",
        "2025-12-08T03:45:58.064513103Z 2025-12-08 03:45:58,064 - bolt.util - INFO - --regions-overlap 0|1|2       Include if POS in the region (0), record overlaps (1), variant overlaps (2) [1]",
        "2025-12-08T03:45:58.064615915Z 2025-12-08 03:45:58,064 - bolt.util - INFO - --threads INT                 Use multithreading with <int> worker threads [0]",
        "2025-12-08T03:45:58.064719867Z 2025-12-08 03:45:58,064 - bolt.util - INFO - ",
        "2025-12-08T03:45:58.065711436Z 2025-12-08 03:45:58,065 - bolt.util - INFO - Merged VCF written to: output/pcgr/nosampleset.pcgr.grch38.unsorted.vcf.gz",
        "2025-12-08T03:45:58.065780207Z 2025-12-08 03:45:58,065 - bolt.util - INFO - Sorting merged VCF file...",
        "2025-12-08T03:45:58.065849648Z 2025-12-08 03:45:58,065 - bolt.util - INFO - Executing command: bcftools sort \\",
        "2025-12-08T03:45:58.065862558Z           -Oz \\",
        "2025-12-08T03:45:58.065869138Z           -o output/pcgr/nosampleset.pcgr.grch38.vcf.gz \\",
        "2025-12-08T03:45:58.065874879Z           output/pcgr/nosampleset.pcgr.grch38.unsorted.vcf.gz",

There are multiple commands likely affected by this, for example this command

When this code chunk is passed to util.execute_command, the command first goes through command_prepare, which prepends set -o pipefail to it. However, this means that neither set -o errexit (aka set -e) nor set -o nounset (aka set -u) is set, so a failing command does not stop the script and unset variables expand silently to empty strings.
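To illustrate the difference, here is a minimal sketch. It does not use bolt's actual code: `command_prepare_strict` and `run_script` are hypothetical stand-ins, and I am assuming the prepared command is run through bash with `shell=True`. The same two-line script is run once with only `pipefail` and once with `errexit`, `nounset` and `pipefail`.

```python
import subprocess

# Hypothetical stand-ins; bolt's real command_prepare may look different.
LENIENT_PREFIX = "set -o pipefail"
STRICT_PREFIX = "set -o errexit -o nounset -o pipefail"


def command_prepare_strict(command: str, prefix: str = STRICT_PREFIX) -> str:
    """Prepend shell options so the script's behaviour on errors is explicit."""
    return f"{prefix}\n{command}"


def run_script(command: str, prefix: str) -> subprocess.CompletedProcess:
    # bash is requested explicitly because pipefail is not guaranteed in /bin/sh
    return subprocess.run(
        command_prepare_strict(command, prefix),
        shell=True,
        executable="/bin/bash",
        capture_output=True,
        text=True,
    )


# The first line fails; the second line stands in for the follow-up command.
script = "false\necho 'next command ran'"

lenient = run_script(script, LENIENT_PREFIX)
strict = run_script(script, STRICT_PREFIX)

# With pipefail only, the failure is swallowed and the script exits 0.
print("lenient:", lenient.returncode, repr(lenient.stdout))
# With errexit, the script stops at the first failing line.
print("strict: ", strict.returncode, repr(strict.stdout))
```

Even with the strict prefix, the caller still has to act on the non-zero return code (for example with `check=True`), otherwise the Python side carries on regardless.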

While I understand the benefits of shell=True in our subprocess.run call (it allows pipelines), it is not recommended, as it can lead to shell injection vulnerabilities.
If you wish to continue using it, I would highly recommend first quoting the input variables with shlex.quote.
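As a rough sketch of what quoting could look like: `build_sort_command` below is a hypothetical helper, not something that exists in bolt, and the file names are made up. It only builds the command string, so nothing is executed here.

```python
import shlex


def build_sort_command(input_vcf: str, output_vcf: str) -> str:
    """Build a bcftools sort command line with every variable shell-quoted.

    Hypothetical helper for illustration only.
    """
    return (
        "bcftools sort -Oz "
        f"-o {shlex.quote(output_vcf)} "
        f"{shlex.quote(input_vcf)}"
    )


# An input containing shell metacharacters, as an attacker (or an odd file name) might supply.
suspicious_input = "sample.vcf.gz; echo injected"

command = build_sort_command(suspicious_input, "out.vcf.gz")
print(command)

# Splitting the string the way a POSIX shell would shows that the suspicious
# input stays a single argument rather than becoming a second command.
print(shlex.split(command))
```

For commands that do not need a pipeline, passing an argument list to subprocess.run without shell=True avoids the problem altogether.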

Labels

bug: Something isn't working
