This repository was archived by the owner on Jan 27, 2020. It is now read-only.
From our repo, get the [`intervals` list file](https://raw.githubusercontent.com/SciLifeLab/Sarek/master/repeats/wgs_calling_regions.grch37.list). More information about this file is in the [intervals documentation](INTERVALS.md).

How to generate the Loci file used in the ASCAT process is described [here](https://github.com/SciLifeLab/Sarek/blob/master/docs/ASCAT.md).

-You can create your own cosmic reference for any human reference as specified below.
+You can create your own COSMIC reference for any human reference as specified below in the COSMIC files section.

-### COSMIC files
+## GRCh38
+
+Use `--genome GRCh38` to map against GRCh38. If you are not on UPPMAX, first adjust the settings in `genomes.config` to your needs.
+
+To get the needed files, download the GATK bundle for GRCh38 from [ftp://gsapubftp-anonymous@ftp.broadinstitute.org/bundle/hg38/](ftp://gsapubftp-anonymous@ftp.broadinstitute.org/bundle/hg38/). You can also download the required files from the [Google Cloud mirror](https://console.cloud.google.com/storage/browser/genomics-public-data/resources/broad/hg38/v0).
+
+The MD5SUM of `Homo_sapiens_assembly38.fasta` included in that bundle is 7ff134953dcca8c8997453bbb80b6b5e.
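A minimal sketch of checking a downloaded reference against the checksum quoted above. It assumes GNU coreutils `md5sum` is available (macOS ships `md5` instead), and `verify_md5` is a hypothetical helper name, not part of Sarek:

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of Sarek): succeed only if FILE's MD5 digest
# equals EXPECTED. Assumes GNU coreutils `md5sum` is on the PATH.
verify_md5() {
    local file="$1" expected="$2" actual
    # Reading from stdin keeps the file name out of md5sum's output.
    actual="$(md5sum < "$file" | cut -d' ' -f1)"
    [ "$actual" = "$expected" ]
}

# Usage (after decompressing the reference):
#   verify_md5 Homo_sapiens_assembly38.fasta 7ff134953dcca8c8997453bbb80b6b5e \
#       && echo "checksum OK" || echo "checksum mismatch"
```

A missing or unreadable file yields an empty digest, so the comparison fails rather than passing silently.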
+
+If you download the data from the FTP server's `beta/` directory, which seems to be an older version of the bundle, only `Homo_sapiens_assembly38.known_indels.vcf` is needed. You can also omit the `dbsnp_138_` and `dbsnp_144` files, as we use `dbsnp_146`; the old ones also use the wrong chromosome naming convention. The Google Cloud mirror has all data in the `v0` directory, but requires you to remove the `resources_broad_hg38_v0_` prefix from all file names.
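The prefix removal mentioned for the Google Cloud mirror could be scripted roughly as follows. This is a sketch only: `strip_prefix` is a hypothetical helper name, and it assumes the downloaded files sit together in one directory.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of Sarek): strip a leading PREFIX from every
# matching file name in DIR. PREFIX defaults to the Google Cloud mirror prefix.
strip_prefix() {
    local dir="$1" prefix="${2:-resources_broad_hg38_v0_}" path name
    for path in "$dir"/"$prefix"*; do
        [ -e "$path" ] || continue        # the glob matched nothing
        name="${path##*/}"                # file name without the directory
        name="${name#"$prefix"}"          # file name without the prefix
        [ -n "$name" ] || continue        # skip a file named exactly PREFIX
        mv -n -- "$path" "$dir/$name"     # -n: never overwrite an existing file
    done
}

# Usage: strip_prefix /path/to/downloaded/bundle
```

Files that do not carry the prefix are left untouched, so running it twice is harmless.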
+If you just downloaded the `Homo_sapiens_assembly38.fasta.gz` file, you would need to do:
+
+```
+gunzip Homo_sapiens_assembly38.fasta.gz
+bwa index -6 Homo_sapiens_assembly38.fasta
+```
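After indexing, it can be worth confirming that the index files actually exist before starting a run. The sketch below assumes the `<in.fasta>.64.*` naming that the `-6` option of `bwa index` is documented to produce, and the usual five BWA index extensions; `missing_bwa_index` is a hypothetical helper name, not part of Sarek or BWA.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of Sarek): print every BWA index file that is
# missing or empty for the reference FASTA; succeed only if none is missing.
# Assumes the `.64.` naming that `bwa index -6` is documented to produce.
missing_bwa_index() {
    local fasta="$1" ext missing=0
    for ext in amb ann bwt pac sa; do
        if [ ! -s "$fasta.64.$ext" ]; then
            echo "$fasta.64.$ext"
            missing=1
        fi
    done
    [ "$missing" -eq 0 ]
}

# Usage:
#   missing_bwa_index Homo_sapiens_assembly38.fasta || echo "index incomplete"
```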
+
+How to generate the Loci file used in the ASCAT process is described [here](https://github.com/SciLifeLab/Sarek/blob/master/docs/ASCAT.md).

-To annotate with COSMIC variants during MuTect1/2 Variant Calling you need to create a compatible VCF file.
+You can create your own COSMIC reference for any human reference as specified below in the COSMIC files section.
+
+## COSMIC files
+
+To annotate with COSMIC variants during MuTect1/2 Variant Calling you need to create a compatible VCF file.
Download the coding and non-coding VCF files from [COSMIC](http://cancer.sanger.ac.uk/cosmic/download) and
process them with the [Create\_Cosmic.sh](https://github.com/SciLifeLab/Sarek/tree/master/scripts/Create_Cosmic.sh)
-script. The script requires a fasta index `.fai`, of the reference file you are using.
+script for either GRCh37 or GRCh38. The script requires a fasta index (`.fai`) of the reference file you are using.

Example:

@@ -47,23 +84,6 @@ To index the resulting VCF file use [igvtools](https://software.broadinstitute.o
igvtools index <cosmicvxx.vcf>
```

-## GRCh38
-
-Use `--genome GRCh38` to map against GRCh38. Before doing so and if you are not on UPPMAX, you need to adjust the settings in `genomes.config` to your needs.
-
-To get the needed files, download the GATK bundle for GRCh38 from [ftp://gsapubftp-anonymous@ftp.broadinstitute.org/bundle/hg38/](ftp://gsapubftp-anonymous@ftp.broadinstitute.org/bundle/hg38/).
-
-The MD5SUM of `Homo_sapiens_assembly38.fasta` included in that file is 7ff134953dcca8c8997453bbb80b6b5e.
-
-From the `beta/` directory, which seems to be an older version of the bundle, only `Homo_sapiens_assembly38.known_indels.vcf` is needed. Also, you can omit `dbsnp_138_` and `dbsnp_144` files as we use `dbsnp_146`. The old ones also use the wrong chromosome naming convention.
-
-Afterwards, the following needs to be done:
-
-```
-gunzip Homo_sapiens_assembly38.fasta.gz
-bwa index -6 Homo_sapiens_assembly38.fasta
-```
-
## smallGRCh37

Use `--genome smallGRCh37` to map against a small reference genome based on GRCh37. `smallGRCh37` is the default genome for the testing profile (`-profile testing`).