
hostremoval_input_fastq has insufficient memory #789

@ivelsko

Description of the bug

hostremoval_input_fastq fails on larger samples because it runs out of memory.

Steps to reproduce

Steps to reproduce the behaviour:

  1. Command line:
nextflow run nf-core/eager \
-r 2.3.5 \
-profile eva,archgen,big_data \
--outdir /mnt/archgen/microbiome_calculus/abpCapture/03-preprocessing/set1_set3/eager2 \
-work-dir /mnt/archgen/microbiome_calculus/abpCapture/03-preprocessing/set1_set3/work \
--input /mnt/archgen/microbiome_calculus/abpCapture/03-preprocessing/set1_set3/abpCap_set1_set3_eager_input.tsv \
--complexity_filter_poly_g \
--fasta /mnt/archgen/Reference_Genomes/Human/HG19/hg19_complete.fasta \
--seq_dict /mnt/projects1/Reference_Genomes/Human/HG19/hg19_complete.dict \
--bwa_index /mnt/archgen/Reference_Genomes/Human/HG19/ \
--bwaalnn 0.02 \
--bwaalnl 1024 \
--hostremoval_input_fastq \
--hostremoval_mode remove \
--run_bam_filtering \
--bam_unmapped_type fastq \
--skip_damage_calculation \
--skip_qualimap \
--email irina_marie_velsko@eva.mpg.de \
-name abpCap_set13 \
-with-tower
  2. See error:
Error executing process > 'hostremoval_input_fastq (BSH001.A0101.SG1)'

Caused by:
  Process `hostremoval_input_fastq (BSH001.A0101.SG1)` terminated with an error exit status (1)

Command executed:

  samtools index BSH001.A0101.SG1_PE.mapped.bam
  extract_map_reads.py BSH001.A0101.SG1_PE.mapped.bam BSH001.A0101.SG1_R1_lanemerged.fq.gz -rev BSH001.A0101.SG1_R2_lanemerged.fq.gz -m  remove -of BSH001.A0101.SG1_PE.mapped.hostremoved.fwd.fq.gz -or BSH001.A0101.SG1_PE.mapped.hostremoved.rev.fq.gz -p 1

Command exit status:
  1

Command output:
  - Extracting mapped reads from BSH001.A0101.SG1_PE.mapped.bam
  - Parsing forward fq file BSH001.A0101.SG1_R1_lanemerged.fq.gz

Command error:
  Traceback (most recent call last):
    File "/home/irina_marie_velsko/.nextflow/assets/nf-core/eager/bin/extract_map_reads.py", line 270, in <module>
    File "/home/irina_marie_velsko/.nextflow/assets/nf-core/eager/bin/extract_map_reads.py", line 147, in parse_fq
    File "/home/irina_marie_velsko/.nextflow/assets/nf-core/eager/bin/extract_map_reads.py", line 120, in get_fq_reads
    File "/opt/conda/envs/nf-core-eager-2.3.5/lib/python3.7/site-packages/Bio/SeqIO/QualityIO.py", line 933, in FastqGeneralIterator
      seq_string = handle_readline().rstrip()
    File "/opt/conda/envs/nf-core-eager-2.3.5/lib/python3.7/site-packages/xopen/__init__.py", line 268, in readline
      return self._file.readline(*args)
    File "/opt/conda/envs/nf-core-eager-2.3.5/lib/python3.7/codecs.py", line 322, in decode
      (result, consumed) = self._buffer_decode(data, self.errors, final)
  MemoryError

Work dir:
  /mnt/archgen/microbiome_calculus/abpCapture/03-preprocessing/set1_set3/work/e4/663badafbd377d9291bdb211a98525

Tip: when you have fixed the problem you can continue the execution adding the option `-resume` to the run command line
-[nf-core/eager] Pipeline completed with errors-

Expected behaviour

There should be enough memory for the host-mapped reads to be removed from the input files and the resulting host_removed fastq files to be written for both forward and reverse reads.
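
The traceback ends inside get_fq_reads while the forward FASTQ is being parsed, which suggests (an assumption, not confirmed against the script source) that reads are accumulated in memory before being written. A minimal sketch of a streaming alternative is shown here: it holds only the set of host-mapped read names and writes each surviving record as soon as it is read. The function names `read_fastq` and `remove_mapped` are hypothetical and are not part of extract_map_reads.py. The sketch also assumes plain four-line FASTQ records and that `mapped_names` was collected from the BAM file beforehand.

```python
import gzip


def read_fastq(handle):
    """Yield (header, sequence, quality) one record at a time.

    Assumes four-line FASTQ records (no wrapped sequence lines).
    """
    while True:
        header = handle.readline().rstrip("\n")
        if not header:
            return  # end of file
        seq = handle.readline().rstrip("\n")
        handle.readline()  # skip the '+' separator line
        qual = handle.readline().rstrip("\n")
        yield header, seq, qual


def remove_mapped(fq_in, fq_out, mapped_names):
    """Copy fq_in to fq_out, skipping reads whose name is in mapped_names.

    mapped_names is assumed to be a set of read names already extracted
    from the host-mapped BAM. Returns the number of reads written.
    """
    kept = 0
    with gzip.open(fq_in, "rt") as src, gzip.open(fq_out, "wt") as dst:
        for header, seq, qual in read_fastq(src):
            # Read name: header without the leading '@', up to the first space.
            name = header[1:].split()[0]
            if name in mapped_names:
                continue
            dst.write(f"{header}\n{seq}\n+\n{qual}\n")
            kept += 1
    return kept
```

With this approach memory use would scale with the number of host-mapped reads rather than with the size of the input FASTQ. A call would look like `remove_mapped("in_R1.fq.gz", "out_R1.fq.gz", mapped_names)`, run once for the forward file and once for the reverse file.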

Log files

Have you provided the following extra information/files:

  • The command used to run the pipeline
  • The .nextflow.log file
  • The exact error: see above

System

  • Hardware: HPC
  • Executor: sge
  • OS: Linux
  • Version: Ubuntu 20.04.3 LTS

Nextflow Installation

  • Version: 20.10.0 build 5430

Container engine

  • Engine: Singularity
  • version:
  • Image tag: nfcore/eager:2.3.5

Additional context

I tried to increase the memory from 32GB by adjusting the lines

#$ -l h_rss=184320M,mem_free=184320M
#$ -S /bin/bash -j y -o output.log -l h_vmem=180G,virtual_free=180G

in the .command.run file. The job then ran with 180 GB as written above, but the qacct record reports a maxvmem of 120.745 GB.
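
As a quick check on these figures: SGE treats an upper-case M suffix as 1024² bytes and G as 1024³ bytes, so 184320M and 180G describe the same limit. The small sketch below does that conversion and compares the request with the peak reported by qacct. The helper name `sge_to_bytes` is hypothetical, and the maxvmem figure is assumed to be in the same binary gigabytes.

```python
# Multipliers for upper-case SGE memory suffixes (binary units).
SGE_UNITS = {"K": 1024, "M": 1024**2, "G": 1024**3}


def sge_to_bytes(value):
    """Convert an SGE memory string such as '184320M' to bytes."""
    number, suffix = value[:-1], value[-1]
    return int(float(number) * SGE_UNITS[suffix])


requested_gb = sge_to_bytes("184320M") / 1024**3  # h_rss / mem_free request
peak_gb = 120.745                                 # maxvmem from qacct

print(f"requested: {requested_gb:.0f} GB")
print(f"peak used: {peak_gb} GB")
print(f"headroom:  {requested_gb - peak_gb:.3f} GB")
```

On these numbers the peak is far above the original 32 GB and well below the 180 GB that was requested.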

Labels

DSL2, bug (Something isn't working), needs upstream fix (Needs a fix in the upstream tool project)