FindDNAFusion is a combinatorial pipeline for detecting cancer-associated gene fusions in next-generation DNA sequencing data. It integrates three optimized software tools, JULI, FACTERA, and GeneFuse, to make accurate gene fusion calls in clinical DNA-NGS panels. It also parses the output of each tool, filters out common tool-specific artifactual calls, selects reportable fusions according to established criteria, and annotates the selected fusions using visualization applications. The pipeline can be used as an accurate and efficient tool for detecting somatic fusions in DNA-NGS panels with intron-tiled bait probes when RNA is not available.
Requires a high-performance Linux computer.
git clone https://github.com/jml-bioinfo/FindDNAFusion.git
cd FindDNAFusion/database
wget https://hgdownload.soe.ucsc.edu/goldenPath/hg19/bigZips/hg19.fa.gz
gunzip -c hg19.fa.gz > hg19.fasta
bwa index -a bwtsw hg19.fasta
samtools faidx hg19.fasta
wget http://hgdownload.cse.ucsc.edu/goldenpath/hg19/bigZips/hg19.2bit
wget https://ftp.ebi.ac.uk/pub/databases/gencode/Gencode_human/release_19/gencode.v19.annotation.gtf.gz
gunzip gencode.v19.annotation.gtf.gz
mv gencode.v19.annotation.gtf GENCODE19.gtf
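Because the pipeline expects the reference FASTA to sit next to its index files, it can help to confirm that the setup commands produced everything before the first run. The following is a minimal sketch, not part of FindDNAFusion: `check_reference_db` is a hypothetical helper, and the list of files it looks for is an assumption based on the commands above (the five files `bwa index` writes, the `.fai` file from `samtools faidx`, plus `hg19.2bit` and `GENCODE19.gtf` in the same directory).

```shell
#!/usr/bin/env bash
# Hypothetical helper (not shipped with FindDNAFusion): report reference
# files that are missing or empty. Argument: the FASTA path later given to -r.
check_reference_db() {
    local fasta="$1"
    local dir
    dir="$(dirname "$fasta")"
    local missing=0
    local f
    # The FASTA itself, the samtools index (.fai), the five "bwa index"
    # outputs, and the two annotation files downloaded above.
    for f in "$fasta" "$fasta.fai" "$fasta.amb" "$fasta.ann" \
             "$fasta.bwt" "$fasta.pac" "$fasta.sa" \
             "$dir/hg19.2bit" "$dir/GENCODE19.gtf"; do
        if [ ! -s "$f" ]; then
            echo "missing or empty: $f" >&2
            missing=1
        fi
    done
    # Return 0 only when every expected file is present and non-empty.
    return "$missing"
}
```

For example, `check_reference_db database/hg19.fasta && echo "reference files look complete"` prints the confirmation only when nothing is missing. The check looks at file presence and size only; it does not verify that the index was built from the same FASTA.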
- JULI v0.1.6.2 from https://github.com/sgilab/JuLI
- FACTERA 1.4.4 from http://factera.stanford.edu
- GeneFuse 0.8.0 from https://github.com/OpenGene/GeneFuse
- BWA 0.7.17-r1188 or newer
- SAMtools 1.10 or newer
- Perl modules Getopt::Long, Cwd, and POSIX
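A quick way to see which of the command-line dependencies and Perl modules are available is sketched below. `missing_commands` and `missing_perl_modules` are hypothetical helpers, not part of FindDNAFusion. They only test for presence, not for the versions listed above, and they leave out JULI, FACTERA, and GeneFuse because how those are invoked depends on how each one was installed.

```shell
#!/usr/bin/env bash
# Hypothetical helpers (not shipped with FindDNAFusion).

# Print the name of every command in the argument list that is not on PATH.
missing_commands() {
    local cmd
    for cmd in "$@"; do
        command -v "$cmd" >/dev/null 2>&1 || echo "$cmd"
    done
}

# Print the name of every Perl module in the argument list that fails to load.
# If perl itself is absent, every module is reported as missing.
missing_perl_modules() {
    local mod
    for mod in "$@"; do
        perl -M"$mod" -e 1 >/dev/null 2>&1 || echo "$mod"
    done
}
```

Running `missing_commands bwa samtools perl` followed by `missing_perl_modules Getopt::Long Cwd POSIX` prints nothing when everything is found. Versions can then be checked by hand, for instance with `samtools --version`.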
Provide the following arguments to run FindDNAFusion:
- sequence directory containing the raw sequence FASTQ files, specified by -i
- BED file giving the positions of the targeted intron regions associated with gene fusions, specified by -p
- reference genome FASTA file, which must be in the same directory as the files generated by "bwa index", specified by -r (this release is limited to hg19)
- output directory, specified by -o (optional)
- number of CPUs per sample to be used, specified by -c (optional)
#for example
./FindDNAFusion -i /ion/LNGS-new/RUN163/raw_seq_dir -p database/example-targeted-intron-regions.bed -r database/hg19.fasta -c 16 -o output_dir &
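Since a run like the one above is started in the background, a malformed BED file may only surface later in the logs. The sketch below checks the file given to -p beforehand. `validate_bed` is a hypothetical helper, not part of FindDNAFusion. It assumes a tab-delimited file whose first three columns follow the standard BED layout (chromosome, start, end); any further columns the pipeline may need are not checked, and the example BED file in the repository remains the authority on the expected format.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not shipped with FindDNAFusion): basic sanity check
# of a tab-delimited BED file. Exits non-zero if any data line is malformed.
validate_bed() {
    awk -F'\t' '
        # Skip comment, track, and browser lines, and blank lines.
        /^(#|track|browser)/ || NF == 0 { next }
        NF < 3 {
            printf "line %d: fewer than 3 tab-separated columns\n", NR
            bad = 1; next
        }
        $2 !~ /^[0-9]+$/ || $3 !~ /^[0-9]+$/ {
            printf "line %d: start or end is not a non-negative integer\n", NR
            bad = 1; next
        }
        $2 + 0 >= $3 + 0 {
            printf "line %d: start is not less than end\n", NR
            bad = 1
        }
        END { exit bad }
    ' "$1"
}
```

For example, `validate_bed database/example-targeted-intron-regions.bed && echo "BED looks well-formed"` prints the confirmation only when every data line passes; otherwise the offending line numbers are listed.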