@@ -302,7 +302,7 @@ The output files are saved in the `../results/fastp/` directory.
302302``` bash
303303fastp \
304304 --in1 ../data/subsampled/ERR5766177_PE.mapped.hostremoved.fwd.fq_subsample_1000000.fastq.gz \
305- --in2 ../data/subsampled/ERR5766177_PE.mapped.hostremoved.fwd.fq_subsample_1000000.fastq.gz \
305+ --in2 ../data/subsampled/ERR5766177_PE.mapped.hostremoved.rev.fq_subsample_1000000.fastq.gz \
306306 --merge \
307307 --merged_out ../results/fastp/ERR5766177.merged.fastq.gz \
308308 --include_unmerged \
@@ -316,54 +316,54 @@ fastp \
316316 total bases: 101000000
317317 Q20 bases: 99440729(98.4562%)
318318 Q30 bases: 94683150(93.7457%)
319-
319+ Q40 bases: 27968326(27.6914%)
320+
320321 Read2 before filtering:
321322 total reads: 1000000
322323 total bases: 101000000
323- Q20 bases: 99440729(98.4562%)
324- Q30 bases: 94683150(93.7457%)
325-
324+ Q20 bases: 96103171(95.1517%)
325+ Q30 bases: 89042465(88.1609%)
326+ Q40 bases: 24849295(24.6033%)
327+
326328 Merged and filtered:
327- total reads: 1994070
328- total bases: 201397311
329- Q20 bases: 198330392(98.4772%)
330- Q30 bases: 188843169(93.7665%)
331-
329+ total reads: 1312040
330+ total bases: 122538903
331+ Q20 bases: 119762428(97.7342%)
332+ Q30 bases: 113200374(92.3791%)
333+ Q40 bases: 35574405(29.0311%)
334+
332335 Filtering result:
333- reads passed filter: 1999252
334- reads failed due to low quality: 728
335- reads failed due to too many N: 20
336+ reads passed filter: 1985074
337+ reads failed due to low quality: 14419
338+ reads failed due to too many N: 507
336339 reads failed due to too short: 0
337- reads with adapter trimmed: 282
338- bases trimmed due to adapters: 18654
339- reads corrected by overlap analysis: 0
340- bases corrected by overlap analysis: 0
341-
342- Duplication rate: 0.2479 %
343-
344- Insert size peak (evaluated by paired-end reads): 31
345-
346- Read pairs merged: 228
347- % of original read pairs: 0.0228 %
348- % in reads after filtering: 0.0114339 %
349-
350-
340+ reads with adapter trimmed: 889290
341+ bases trimmed due to adapters: 34036630
342+ reads corrected by overlap analysis: 26668
343+ bases corrected by overlap analysis: 36019
344+
345+ Duplication rate: 0.0192 %
346+
347+ Insert size peak (evaluated by paired-end reads): 43
348+
349+ Read pairs merged: 672964
350+ % of original read pairs: 67.2964 %
351+ % in reads after filtering: 51.2914 %
352+
353+
351354 JSON report: ../results/fastp/ERR5766177.fastp.json
352355 HTML report: ../results/fastp/ERR5766177.fastp.html
353-
354- fastp --in1 ../data/subsampled/ERR5766177_PE.mapped.hostremoved.fwd.fq_subsample_1000000.fastq.gz \
355- --in2 ../data/subsampled/ERR5766177_PE.mapped.hostremoved.fwd.fq_subsample_1000000.fastq.gz --merge \
356- --merged_out ../results/fastp/ERR5766177.merged.fastq.gz --include_unmerged --dedup \
357- --json ../results/fastp/ERR5766177.fastp.json --html ../results/fastp/ERR5766177.fastp.html
358- fastp v0.23.2, time used: 11 seconds
356+
357+ fastp --in1 ../data/subsampled/ERR5766177_PE.mapped.hostremoved.fwd.fq_subsample_1000000.fastq.gz --in2 ../data/subsampled/ERR5766177_PE.mapped.hostremoved.rev.fq_subsample_1000000.fastq.gz --merge --merged_out ../results/fastp/ERR5766177.merged.fastq.gz --include_unmerged --dedup --json ../results/fastp/ERR5766177.fastp.json --html ../results/fastp/ERR5766177.fastp.html
358+ fastp v1.0.1, time used: 8 seconds
359359:::
360360
361361::: {.callout-tip title="Question" appearance="simple"}
362362What do you think of the number of read pairs that were merged?
363363:::
364364
365365::: {.callout-note collapse="true" title="Answer"}
366- Here, only 228 read pairs were merged.
366+ Here, 672964 read pairs (67.2964 % of the original read pairs) were merged.
367367This is due to the read length of 100 bp and the length of the DNA fragments.
368368If you used fewer cycles and had shorter DNA fragments, you would expect this number to go up.
369369:::
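
As a quick sanity check on the merge statistics in the updated log, the two percentages can be recomputed from the counts fastp prints. This is a minimal sketch: the three numbers are copied from the log above, while the variable names and the `pct` helper are hypothetical and are not part of fastp.

```bash
# Counts copied from the fastp log above (new side of the diff).
original_pairs=1000000          # "total reads" of Read1 before filtering
merged_pairs=672964             # "Read pairs merged"
reads_after_filtering=1312040   # "total reads" under "Merged and filtered"

# Hypothetical helper: percentage of part in whole, printed with 4 decimals.
pct() {
  awk -v part="$1" -v whole="$2" 'BEGIN { printf "%.4f", 100 * part / whole }'
}

pct_of_original=$(pct "$merged_pairs" "$original_pairs")
pct_after_filtering=$(pct "$merged_pairs" "$reads_after_filtering")

echo "Read pairs merged: ${merged_pairs}"
echo "% of original read pairs: ${pct_of_original} %"
echo "% in reads after filtering: ${pct_after_filtering} %"
```

Both values should match the corresponding lines of the fastp summary, which makes it easier to see that roughly two thirds of the input pairs overlapped enough to be merged.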
@@ -1697,4 +1697,4 @@ There are many different ways to normalise sequencing data, but this out of the
16971697Just to name a few, the most commonly used are RLE, TSS, rarefaction, CLR, or GMPR.
16981698:::
16991699
1700- ## References
1700+ ## References