4 changes: 4 additions & 0 deletions CITATIONS.md
@@ -158,6 +158,10 @@

> Peltzer, A., Jäger, G., Herbig, A., Seitz, A., Kniep, C., Krause, J., & Nieselt, K. (2016). EAGER: efficient ancient genome reconstruction. Genome Biology, 17(1), 1–14. doi: [10.1186/s13059-016-0918-z](https://doi.org/10.1186/s13059-016-0918-z)

- [MultiVCFAnalyzer](https://doi.org/10.1038/nature13591)

> Bos, K. I., Harkins, K. M., Herbig, A., et al. (2014). Pre-Columbian mycobacterial genomes reveal seals as a source of New World human tuberculosis. Nature, 514, 494–497. doi: [10.1038/nature13591](https://doi.org/10.1038/nature13591)

## Software packaging/containerisation tools

- [Anaconda](https://anaconda.com)
28 changes: 28 additions & 0 deletions assets/schema_fasta.json
@@ -132,6 +132,34 @@
"pattern": "^\\S+\\.vcf$",
"exists": true,
"errorMessage": "SNP annotation files for GATK must not contain any spaces and have file extension '.vcf'."
},
"consensus_multivcfanalyzer_additional_vcf_files": {
"type": "string",
"format": "directory-path",
"pattern": "^\\S+\\.$",
"exists": true,
"errorMessage": "The directory containing the additional vcf files for multivcfanalyzer must not contain any spaces and have file extensions ''. Wildcards allowed."
},
"consensus_multivcfanalyzer_reference_gff_annotations": {
"type": "string",
"format": "file-path",
"pattern": "^\\S+\\.gff(\\.gz)?$",
"exists": true,
"errorMessage": "The annotation file must not contain any spaces and must have file extension '.gff' or '.gff.gz'."
},
"consensus_multivcfanalyzer_reference_gff_exclude": {
"type": "string",
"format": "file-path",
"pattern": "^\\S+\\.gff(\\.gz)?$",
"exists": true,
"errorMessage": "The annotation file must not contain any spaces and must have file extension '.gff' or '.gff.gz'."
},
"consensus_multivcfanalyzer_reference_snpeff_results": {
"type": "string",
"format": "file-path",
"pattern": "^\\S+\\.txt$",
"exists": true,
"errorMessage": "The file containing the results from the SnpEff analysis must not contain any spaces and must have file extension '.txt'."
}
},
"required": ["reference_name", "fasta"],
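The `pattern` entries added in this hunk are easy to check by hand. The following is a minimal Python sketch, not part of the PR, that applies three of the added patterns (with the JSON escaping resolved) to a few example values. The file paths are hypothetical, and Python's `re` module is only an approximation of the regex dialect the schema validator uses.

```python
import re

# Patterns copied from the schema hunk above, with JSON escapes resolved.
PATTERNS = {
    "consensus_multivcfanalyzer_reference_gff_annotations": r"^\S+\.gff(\.gz)?$",
    "consensus_multivcfanalyzer_reference_gff_exclude": r"^\S+\.gff(\.gz)?$",
    "consensus_multivcfanalyzer_reference_snpeff_results": r"^\S+\.txt$",
}


def matches(field: str, value: str) -> bool:
    """Return True if `value` satisfies the schema pattern for `field`."""
    return re.search(PATTERNS[field], value) is not None


# Hypothetical paths, for illustration only.
EXAMPLES = [
    ("consensus_multivcfanalyzer_reference_gff_annotations", "ref/genes.gff"),
    ("consensus_multivcfanalyzer_reference_gff_annotations", "ref/genes.gff.gz"),
    ("consensus_multivcfanalyzer_reference_gff_exclude", "ref/my genes.gff"),
    ("consensus_multivcfanalyzer_reference_snpeff_results", "ref/snpeff.txt.gz"),
]

for field, value in EXAMPLES:
    verdict = "accepted" if matches(field, value) else "rejected"
    print(f"{field}: {value!r} -> {verdict}")
```

Both GFF patterns accept gzipped input, while the SnpEff results pattern accepts only an uncompressed `.txt` file.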
11 changes: 11 additions & 0 deletions conf/modules.config
@@ -1642,4 +1642,15 @@ process {
]
]
}

withName: MULTIVCFANALYZER {
tag = { "${meta.reference}" }
ext.prefix = { "multivcfanalyzer_${meta.reference}" }
publishDir = [
path: { "${params.outdir}/consensus_sequence/" },
mode: params.publish_dir_mode,
enabled: true,
pattern: '*.{fasta.gz,tsv,txt}'
]
}
}
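To see which output files the `publishDir` pattern `'*.{fasta.gz,tsv,txt}'` would select, the sketch below approximates it in Python. It assumes the pattern behaves like a standard brace glob applied to the bare file name; because `fnmatch` has no brace support, the alternatives are expanded by hand. The file names come from the output list in the `docs/output.md` hunk of this PR.

```python
from fnmatch import fnmatchcase

# Alternatives of '*.{fasta.gz,tsv,txt}', expanded by hand (assumption:
# the pattern is a plain brace glob matched against the file name).
EXPANDED_PATTERNS = ["*.fasta.gz", "*.tsv", "*.txt"]


def is_published(filename: str) -> bool:
    """Return True if `filename` matches any of the expanded glob patterns."""
    return any(fnmatchcase(filename, pattern) for pattern in EXPANDED_PATTERNS)


# File names taken from the output list in docs/output.md.
DOCUMENTED_OUTPUTS = [
    "fullAlignment.fasta.gz",
    "info.txt",
    "snpStatistics.tsv",
    "MultiVCFAnalyzer.json",
    "versions.yml",
]

for name in DOCUMENTED_OUTPUTS:
    status = "published" if is_published(name) else "not matched"
    print(f"{name}: {status}")
```

Under these assumptions, the `.json` and `.yml` files are not matched by the pattern, which may be worth comparing against the documented output list.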
23 changes: 23 additions & 0 deletions docs/output.md
@@ -776,3 +776,26 @@ When using pileupCaller for genotyping, single-stranded and double-stranded libr
</details>

[ANGSD](http://www.popgen.dk/angsd/index.php/ANGSD) is a software for analyzing next generation sequencing data. It can estimate genotype likelihoods and allele frequencies from next-generation sequencing data. The output provided is a bgzipped genotype likelihood file, containing likelihoods across all samples per reference. Users can specify the model used for genotype likelihood estimation, as well as the output format. For more information on the available options, see the [ANGSD](https://www.popgen.dk/angsd/index.php/Genotype_Likelihoods).

### Consensus Sequence Generation

#### MultiVCFAnalyzer

<details markdown="1">
<summary>Output files</summary>

- `consensus_sequence/`

- `fullAlignment.fasta.gz`: A FASTA file containing the alignment of all positions contained in the VCF files, i.e. including reference calls.
- `info.txt`: File containing information about the run, e.g. the parameters used.
- `snpAlignment.fasta.gz`: A FASTA file containing the alignment of only the SNP positions that are variable among the samples.
- `snpAlignmentIncludingRefGenome.fasta.gz`: A FASTA file containing the alignment of only the SNP positions that are variable between samples, including the reference genome.
- `snpStatistics.tsv`: Table containing some basic statistics about the SNP calls of each sample.
- `snpTable.tsv`: Basic SNP table of combined positions taken from each VCF file. If SnpEff results are provided, it will also contain a column on the effect of each detected SNP.
- `snpTableForSnpEff.tsv`: Input file for SnpEff analysis.
- `snpTableWithUncertaintyCalls.tsv`: Basic SNP table of combined positions taken from each VCF file, but with lower case characters indicating uncertain calls.
- `structureGenotypes.tsv`: Input file for STRUCTURE.
- `structureGenotypes_noMissingData-Columns.tsv`: Alternate input file for STRUCTURE.
- `MultiVCFAnalyzer.json`: Summary statistics in MultiQC JSON format.
- `versions.yml`: File containing software versions.

</details>

[MultiVCFAnalyzer](https://github.com/alexherbig/MultiVCFAnalyzer) is a SNP alignment generation tool that allows further evaluation and filtering of SNP calls made by the GATK UnifiedGenotyper. More specifically, it takes one or more VCF files as well as a reference genome, allows filtering of SNPs via a variety of metrics, and produces a FASTA file with each sample as an entry containing 'consensus calls' at each position.
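
As a quick sanity check on the alignment outputs described above, the following Python sketch reads a gzipped FASTA file and reports the sequence length per entry. It uses only the standard library and is an illustration under stated assumptions, not part of the pipeline. The helper names `read_fasta_gz` and `alignment_lengths` are hypothetical.

```python
import gzip


def read_fasta_gz(path):
    """Parse a gzipped FASTA file into a dict of {header: sequence}."""
    records = {}
    name = None
    with gzip.open(path, "rt") as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue  # skip blank lines
            if line.startswith(">"):
                name = line[1:]
                records[name] = []
            elif name is not None:
                records[name].append(line)
    # Join multi-line sequences into one string per entry.
    return {entry: "".join(parts) for entry, parts in records.items()}


def alignment_lengths(path):
    """Return {header: sequence length} for every entry in the file."""
    return {entry: len(seq) for entry, seq in read_fasta_gz(path).items()}
```

For example, `alignment_lengths("results/consensus_sequence/snpAlignment.fasta.gz")` (a hypothetical path that depends on the chosen output directory) should return one entry per sample. In an alignment, all entries would be expected to have the same length.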
5 changes: 5 additions & 0 deletions modules.json
@@ -225,6 +225,11 @@
"git_sha": "f0719ae309075ae4a291533883847c3f7c441dad",
"installed_by": ["modules"]
},
"multivcfanalyzer": {
"branch": "master",
"git_sha": "b34fb117e397e72a070cde8adf21b758670c90f5",
"installed_by": ["modules"]
},
"picard/createsequencedictionary": {
"branch": "master",
"git_sha": "20b0918591d4ba20047d7e13e5094bcceba81447",
7 changes: 7 additions & 0 deletions modules/nf-core/multivcfanalyzer/environment.yml

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

100 changes: 100 additions & 0 deletions modules/nf-core/multivcfanalyzer/main.nf

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

120 changes: 120 additions & 0 deletions modules/nf-core/multivcfanalyzer/meta.yml

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.
