All notable changes to this project will be documented in this file.
The format is based on Keep a Changelog and this project adheres to Semantic Versioning.
- Make the ordering in the reports consistent --> RG, Order, Family
- Add a step in the analysis of deamination patterns: bam_deam_positions
- Create a table that shows substitution rates for the first 10 and last 10 positions of each alignment.
- Create a table that shows the 95% confidence intervals for the first and last 10 positions of each alignment.
- The tables are saved to the stats/ directory under {RG}_04_deamination_positions/
- Add the --ancientness_threshold flag to specify the threshold used for reporting ancientness levels ('++' or '+')
- Fix a minor bug when re-running a quicksand analysis with the --rerun flag. The bedfilter-stats were duplicated from the respective 'best' entry and not set to '-'
- #14: Output stats for all mappings in stats/SAMPLE_01_mapped.tsv and stats/SAMPLE_02_deduped.tsv. Only the 'best' mappings were shown before.
- Publish the "KrakenUniq parsed report" in the 'stats' directory
- Add an R-friendly version of the final summary report (column names w/o special characters, R_final_report.tsv)
- In the final_report, replace the SpeciesKmers name with Kmers and include the "best" and "(family)" level stats. E.g. "4 (129)"
- For the KmerCoverage and KmerDupRate columns, also combine "best" and "(family)"
- Add a --fixed_bedfiltering flag to run dustmasking and bedfiltering also for fixed references (off by default)
- Fix config-error for profile docker, causing images with ENTRYPOINT statement to fail
- Remove the default read filters in the samtools coverage function. (This caused results with ReadsDeduped > 0 but CoveredBP 0).
- Fix a bug in the final_report.tsv report generation that kept duplicated families (from Kraken) with MappedReads 0 in the report.
- Update default values in the report to match the format of the final_report.tsv (e.g. 0.0 instead of 0)
- Add a --doublestranded flag to adjust damage pattern analysis to the ones observed in data created from double stranded libraries.
- Changes the bam_deam_stats.py and the mask_deamination.py scripts to look at 3' G to A substitutions instead of the C to T changes as done before.
- Nothing changes for default runs
- Fix a bug in the processing of paired fastq-files.
- removes the "1" flag from the resulting bam files (make sure to merge your reads before quicksand!)
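
The --doublestranded entry above changes which 3' substitution is counted as damage. The following is a minimal Python sketch of that rule, not the code in bam_deam_stats.py. The function name and its arguments are hypothetical, it assumes the reference and read bases at both alignment ends are already known, and it ignores strand orientation.

```python
def is_deaminated(ref_5p, read_5p, ref_3p, read_3p, doublestranded=False):
    """Return True if either terminal position shows a damage-like substitution.

    Hypothetical helper for illustration only:
    - default (single-stranded libraries): C-to-T at the 5' or the 3' end
    - doublestranded=True: C-to-T at the 5' end, G-to-A at the 3' end
    """
    five_prime = (ref_5p, read_5p) == ("C", "T")
    # The 3' signal depends on the library type
    three_prime_pair = ("G", "A") if doublestranded else ("C", "T")
    three_prime = (ref_3p, read_3p) == three_prime_pair
    return five_prime or three_prime
```

With the default settings a 3' G to A change is ignored; it only counts once doublestranded is set to True.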
This version adds 4 columns to the end of the final_report.tsv file and adds an additional file filtered_report_{n}p_{m}b.tsv to the output directory. This file should serve as a quick look at the final_report and should not be treated as the final output file.
This version alters the final_report.tsv file and adds an additional file filtered_report_{n}p_{m}b.tsv, which is a filtered version of the final_report based on two newly introduced filter flags:
- --reportfilter_percentage sets the filter threshold for the FamPercentage column
- --reportfilter_breadth sets the filter threshold for the ProportionExpectedBreadth column
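
To illustrate what the two report filters do, here is a minimal Python sketch. It assumes that a row is kept only when both columns reach their threshold (inclusive) and that both columns hold plain numbers; the actual comparison in quicksand may differ. The function name and the example rows are made up for illustration.

```python
import csv
import io


def filter_report(rows, min_percentage, min_breadth):
    """Keep rows that reach both thresholds (assumed: inclusive, logical AND)."""
    kept = []
    for row in rows:
        if (float(row["FamPercentage"]) >= min_percentage
                and float(row["ProportionExpectedBreadth"]) >= min_breadth):
            kept.append(row)
    return kept


# Tiny made-up report, only to show the shape of the data
example = (
    "Family\tFamPercentage\tProportionExpectedBreadth\n"
    "Hominidae\t12.5\t0.9\n"
    "Bovidae\t0.3\t0.8\n"
    "Suidae\t4.0\t0.2\n"
)
rows = list(csv.DictReader(io.StringIO(example), delimiter="\t"))
filtered = filter_report(rows, min_percentage=0.5, min_breadth=0.5)
print([r["Family"] for r in filtered])
```

In this example only the first row passes, because the second fails the percentage threshold and the third fails the breadth threshold.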
Within the workflow, quicksand now takes additional information from the samtools coverage command, which analyzes the deduplicated reads. This additional information is (or is used to calculate):
- Depth of Coverage
- Breadth of Coverage
- Expected Breadth of Coverage, based on the inStrain documentation
- Proportion of Expected Breadth, based on the inStrain documentation
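
The relation between the last two values can be sketched in Python. The sketch assumes the formula given in the inStrain documentation as we recall it, expected breadth = 1 - e^(-0.883 * depth), and it assumes that the proportion is the observed breadth divided by the expected breadth. The function names are hypothetical; please check the constant against the inStrain documentation before relying on it.

```python
import math


def expected_breadth(depth):
    """Expected breadth for a given mean depth (assumed inStrain formula)."""
    return 1.0 - math.exp(-0.883 * depth)


def proportion_expected_breadth(breadth, depth):
    """Observed breadth relative to the expected breadth (assumed definition)."""
    expected = expected_breadth(depth)
    if expected == 0:
        # No coverage at all: avoid a division by zero
        return 0.0
    return breadth / expected
```

A proportion close to 1 means that the reads are spread over the reference about as evenly as expected for that depth, while a low value points to reads piling up in a few regions.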
This is a rewrite of the v1.6.1 pipeline in the DSL2 syntax of Nextflow
to account for Nextflow versions >22.10.
While the code was restructured, the flags, features and outputs remain the same as in v1.6.1,
making these versions (almost) fully compatible. See the changes below.
- instead of FamKmers now report SpeciesKmers and the respective kmer-stats to better compare assignments. Before, families with many species always had lower kmer-stats
- for each genome in the fixed-references file, run the full pipeline. They are no longer reduced to 1 reference per family as in v1.6.1
- parse the taxonomy directly from the DB, no need to add an additional file!
- remove the test-data and the -profile test option, as it was no longer working
This is a minor update to the final_report created
- Added two columns to the end of the final_report.
- MeanFragmentLength: The mean fragment length of all the DNA molecules in the bedfiltered or deduped bamfile
- MeanFragmentLength(3term): The mean fragment length of all deaminated DNA molecules in the bedfiltered or deduped bamfile
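
As a rough illustration of the two columns, the following Python sketch computes both means from a list of fragments. The input format, a list of (length, is_deaminated) tuples, and the function name are hypothetical; quicksand itself reads these values from the bam file.

```python
from statistics import mean


def mean_fragment_lengths(fragments):
    """Return (mean length of all fragments, mean length of deaminated ones).

    fragments: iterable of (length, is_deaminated) tuples (hypothetical input).
    A mean is None when there is no fragment to average.
    """
    fragments = list(fragments)
    all_lengths = [length for length, _ in fragments]
    deam_lengths = [length for length, deaminated in fragments if deaminated]
    mean_all = mean(all_lengths) if all_lengths else None
    mean_deam = mean(deam_lengths) if deam_lengths else None
    return mean_all, mean_deam
```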