Merged
38 commits
b0acaa1
more time for no_output_timeout
maxulysse May 14, 2019
1e43ca4
https for links
maxulysse May 14, 2019
f4e1f46
update usage.md + change --strelkaBP to -noStrelkaBP
maxulysse May 14, 2019
8140a38
switch for ensemblvep 96
maxulysse May 15, 2019
366e72f
update docs
maxulysse May 15, 2019
eee21fc
add docs about annotation
maxulysse May 15, 2019
ab14770
add docs about containers
maxulysse May 15, 2019
b2cd43d
add information about input
maxulysse May 15, 2019
70b784f
improve tests
maxulysse May 15, 2019
3166c7e
clean up tests
maxulysse May 15, 2019
dad929f
update docs on usage
maxulysse May 15, 2019
3862d27
add ASCAT
maxulysse May 15, 2019
7a66c0c
improve QC reports
maxulysse May 15, 2019
ca2d4dc
update docs
maxulysse May 16, 2019
cc1246e
update process names
maxulysse May 16, 2019
5a30948
switch back to vep version 95
maxulysse May 16, 2019
30f75a1
use https file for multiple
maxulysse May 16, 2019
c29ef66
fix tests
maxulysse May 16, 2019
3478c7a
update README
maxulysse May 16, 2019
dc3541e
improve docs
maxulysse May 16, 2019
13d232f
improve Jenkins tests
maxulysse May 16, 2019
6c86ec8
refactoring + parallelisation of ApplyBQSR
maxulysse May 16, 2019
efee7d7
extend no_output_timeout
maxulysse May 16, 2019
f053341
specify tag 1.6 for nfcore/base
maxulysse May 17, 2019
cc4121a
add docs about monochrome_logs
maxulysse May 17, 2019
113ca64
update script
maxulysse May 17, 2019
e0a64b0
update tests
maxulysse May 17, 2019
64c8d45
add --monochrome logs for tests
maxulysse May 17, 2019
74ee480
fix snpEff output
maxulysse May 17, 2019
be2f03b
code polishing - lowercase
maxulysse May 17, 2019
3789649
add some docs about QC tools
maxulysse May 17, 2019
268b737
code polishing - refactoring
maxulysse May 17, 2019
30ff4c7
fix TSV and concatVCF
maxulysse May 17, 2019
e8c5d87
fix cross for somatic pair
maxulysse May 18, 2019
e081b01
improve tests
maxulysse May 21, 2019
0338f72
code polishing
maxulysse May 21, 2019
7f4ea8f
Update docs/usage.md
maxulysse May 21, 2019
1057b63
Update docs/usage.md
maxulysse May 21, 2019
6 changes: 3 additions & 3 deletions .circleci/config.yml
@@ -60,7 +60,7 @@ jobs:
- setup_remote_docker
- run:
command: docker build -t nfcore/sarekvep:dev.${GENOME} containers/vep/. --build-arg GENOME=${GENOME} --build-arg SPECIES=${SPECIES} --build-arg VEP_VERSION=${VEP_VERSION}
- no_output_timeout: 1.5h
+ no_output_timeout: 3h
- run:
command: echo "$DOCKERHUB_PASS" | docker login -u "$DOCKERHUB_USERNAME" --password-stdin ; docker push nfcore/sarekvep:dev.${GENOME}

@@ -76,7 +76,7 @@ jobs:
- setup_remote_docker
- run:
command: docker build -t nfcore/sarekvep:dev.${GENOME} containers/vep/. --build-arg GENOME=${GENOME} --build-arg SPECIES=${SPECIES} --build-arg VEP_VERSION=${VEP_VERSION}
- no_output_timeout: 1.5h
+ no_output_timeout: 3h
- run:
command: echo "$DOCKERHUB_PASS" | docker login -u "$DOCKERHUB_USERNAME" --password-stdin ; docker push nfcore/sarekvep:dev.${GENOME}

@@ -92,7 +92,7 @@ jobs:
- setup_remote_docker
- run:
command: docker build -t nfcore/sarekvep:dev.${GENOME} containers/vep/. --build-arg GENOME=${GENOME} --build-arg SPECIES=${SPECIES} --build-arg VEP_VERSION=${VEP_VERSION}
- no_output_timeout: 30m
+ no_output_timeout: 1h
- run:
command: echo "$DOCKERHUB_PASS" | docker login -u "$DOCKERHUB_USERNAME" --password-stdin ; docker push nfcore/sarekvep:dev.${GENOME}

4 changes: 2 additions & 2 deletions .travis.yml
@@ -39,8 +39,8 @@ install:

# Build references if needed
before_script:
- - "${TRAVIS_BUILD_DIR}/bin/build_reference.sh --test $TEST --build"
+ - "${TRAVIS_BUILD_DIR}/bin/build_reference.sh --test $TEST --verbose"

# Actual tests
script:
- - "${TRAVIS_BUILD_DIR}/bin/run_tests.sh --test $TEST"
+ - "${TRAVIS_BUILD_DIR}/bin/run_tests.sh --test $TEST --verbose"
2 changes: 1 addition & 1 deletion Dockerfile
@@ -1,4 +1,4 @@
- FROM nfcore/base
+ FROM nfcore/base:1.6
LABEL authors="Maxime Garcia" \
description="Docker image containing all requirements for nf-core/sarek pipeline"

29 changes: 13 additions & 16 deletions Jenkinsfile
@@ -6,46 +6,43 @@ pipeline {
}

stages {
- stage('Setup environment') {
+ stage('Docker setup') {
steps {
- sh "./bin/download_docker.sh -t ALL"
+ sh "./bin/download_docker.sh"
}
}
- stage('Build') {
+ stage('Build references') {
steps {
- sh "rm -rf data"
- sh "./bin/build_reference.sh --test ALL --build"
- sh "rm -rf work/ references/pipeline_info .nextflow*"
+ sh "rm -rf references/"
+ sh "./bin/build_reference.sh"
}
}
- stage('Somatic') {
+ stage('Germline') {
steps {
- sh "./bin/run_tests.sh --test SOMATIC"
- sh "rm -rf work/ .nextflow* results/"
+ sh "rm -rf data/"
+ sh "git clone --single-branch --branch sarek https://github.com/nf-core/test-datasets.git data"
+ sh "./bin/run_tests.sh --test GERMLINE"
+ sh "rm -rf data/"
}
}
- stage('Germline') {
+ stage('Somatic') {
steps {
- sh "./bin/run_tests.sh --test GERMLINE"
- sh "rm -rf work/ .nextflow* results/"
+ sh "./bin/run_tests.sh --test SOMATIC"
}
}
- stage('targeted') {
+ stage('Targeted') {
steps {
sh "./bin/run_tests.sh --test TARGETED"
- sh "rm -rf work/ .nextflow* results/"
}
}
stage('Annotation') {
steps {
sh "./bin/run_tests.sh --test ANNOTATEALL"
- sh "rm -rf work/ .nextflow* results/"
}
}
stage('Multiple') {
steps {
sh "./bin/run_tests.sh --test MULTIPLE"
- sh "rm -rf work/ .nextflow* results/"
}
}
}
71 changes: 10 additions & 61 deletions README.md
@@ -11,7 +11,7 @@
[![Travis build status][travis-badge]](https://travis-ci.com/nf-core/sarek/)
[![CircleCi build status][circleci-badge]](https://circleci.com/gh/nf-core/sarek/)

- [![Install with bioconda][bioconda-badge]](http://bioconda.github.io/)
+ [![Install with bioconda][bioconda-badge]](https://bioconda.github.io/)
[![Docker Container available][docker-sarek-badge]](https://hub.docker.com/r/nfcore/sarek/)
[![Install with Singularity][singularity-badge]](https://www.sylabs.io/docs/)

@@ -26,8 +26,8 @@ Sarek is a workflow designed to run analyses on whole genome or targeted sequenc
It's built using [Nextflow](https://www.nextflow.io),
a domain specific language for workflow building,
across multiple compute infrastructures in a very portable manner.
- Software dependencies are handled using [Docker](https://www.docker.com) or [Singularity](https://www.sylabs.io/singularity/) - container technologies that provide excellent reproducibility and ease of use.
- Thus making installation trivial and results highly reproducible
+ Software dependencies are handled using [Conda](https://conda.io/), [Docker](https://www.docker.com) or [Singularity](https://www.sylabs.io/singularity/) - environment/container technologies that provide excellent reproducibility and ease of use.
+ Thus making installation trivial and results highly reproducible.

It is listed on the [Elixir - Tools and Data Services Registry](https://bio.tools/Sarek), [Dockstore](https://dockstore.org/workflows/github.com/SciLifeLab/Sarek/) and [omicX - Bioinformatics tools](https://omictools.com/sarek-tool).

@@ -39,73 +39,22 @@ The nf-core/sarek pipeline comes with documentation about the pipeline, found in
* [Local installation](https://nf-co.re/usage/local_installation)
* [Adding your own system config](https://nf-co.re/usage/adding_own_config)
* [Reference genomes](https://nf-co.re/usage/reference_genomes)
* [Extra documentation on reference](docs/reference.md)
3. [Running the pipeline](docs/usage.md)
* [Tests documentation](docs/TESTS.md)
* [Configuration and profiles documentation](docs/CONFIG.md)
* [Input files documentation](docs/input.md)
* [Extra documentation on variant calling](docs/variantcalling.md)
* [Documentation about containers](docs/containers.md)
* [Extra documentation for targeted sequencing](docs/targetseq.md)

* [Intervals documentation](docs/INTERVALS.md)
* [Running the pipeline](docs/USAGE.md)
* [Running the pipeline using Conda](docs/CONDA.md)
* [Command line parameters](docs/PARAMETERS.md)
* [Examples](docs/USE_CASES.md)
* [Input files documentation](docs/INPUT.md)
* [Processes documentation](docs/PROCESS.md)
* [Documentation about containers](docs/CONTAINERS.md)
4. [Output and how to interpret the results](docs/output.md)
* [Complementary information about ASCAT](docs/ASCAT.md)
* [Complementary information about annotations](docs/ANNOTATION.md)
* [Output documentation structure](docs/OUTPUT.md)
* [Extra documentation on annotation](docs/annotation.md)
5. [Troubleshooting](https://nf-co.re/usage/troubleshooting)

## Workflow steps

Sarek is built with several workflow scripts.
A wrapper script contained within the repository makes it easy to run the different workflow scripts as a single job.
To test your installation, follow the [tests documentation.](docs/TESTS.md)

Raw FastQ files or BAM files (unmapped, aligned or recalibrated) can be used as inputs.
You can choose which variant callers to use, plus the pipeline is capable of accommodating additional variant calling software or CNV callers if required.

The worflow steps and tools used are as follows:

1. **Preprocessing** _(based on [GATK best practices](https://software.broadinstitute.org/gatk/best-practices/))_
* Map reads to Reference
* [BWA](http://bio-bwa.sourceforge.net/)
* Mark Duplicates
* [GATK MarkDuplicates](https://github.com/broadinstitute/gatk)
* Base (Quality Score) Recalibration
* [GATK BaseRecalibrator](https://github.com/broadinstitute/gatk)
* [GATK ApplyBQSR](https://github.com/broadinstitute/gatk)
2. **Germline variant calling**
* SNVs and small indels
* [GATK HaplotypeCaller](https://github.com/broadinstitute/gatk)
* [Strelka2](https://github.com/Illumina/strelka)
* Structural variants
* [Manta](https://github.com/Illumina/manta)
3. **Somatic variant calling**
* SNVs and small indels
* [MuTect2](https://github.com/broadinstitute/gatk)
* [Freebayes](https://github.com/ekg/freebayes)
* [Strelka2](https://github.com/Illumina/strelka)
* Structural variants
* [Manta](https://github.com/Illumina/manta)
* Sample heterogeneity, ploidy and CNVs
* [ASCAT](https://github.com/Crick-CancerGenomics/ascat)
4. **Annotation**
* Variant annotation
* [SnpEff](http://snpeff.sourceforge.net/)
* [VEP (Variant Effect Predictor)](https://www.ensembl.org/info/docs/tools/vep/index.html)
5. **QC and Reporting**
* QC
* [FastQC](https://www.bioinformatics.babraham.ac.uk/projects/fastqc/)
* [Qualimap bamqc](http://qualimap.bioinfo.cipf.es/doc_html/command_line.html)
* [samtools stats](https://www.htslib.org/doc/samtools.html)
* [GATK MarkDuplicates](https://github.com/broadinstitute/gatk)
* [bcftools stats](http://www.htslib.org/doc/bcftools.html)
* [VCFtools](https://vcftools.github.io/index.html)
* [SnpEff](http://snpeff.sourceforge.net/)
* Reporting
* [MultiQC](http://multiqc.info)

## Credits

Sarek was developed at the [National Genomics Infastructure][ngi-link] and [National Bioinformatics Infastructure Sweden][nbis-link] which are both platforms at [SciLifeLab][scilifelab-link], with the support of [The Swedish Childhood Tumor Biobank (Barntumörbanken)][btb-link].