FindDNAFusion

FindDNAFusion is a combinatorial pipeline for the detection of cancer-associated gene fusions in next-generation DNA sequencing data. It integrates three optimized software tools (JULI, FACTERA and GeneFuse) to make accurate gene fusion calls in clinical DNA-NGS panels. It also parses the output of each tool, filters out common tool-specific artifactual calls, selects reportable fusions according to established criteria, and annotates the selected fusions using visualization applications. The pipeline is intended as an accurate and efficient tool for detecting somatic fusions in DNA-NGS panels with intron-tiled bait probes when RNA is not available.

Install & Set up

Requires a high-performance Linux computer.

get source

git clone https://github.com/jml-bioinfo/FindDNAFusion.git

download reference data and create index

cd FindDNAFusion/database

wget https://hgdownload.soe.ucsc.edu/goldenPath/hg19/bigZips/hg19.fa.gz

gunzip hg19.fa.gz

bwa index -a bwtsw hg19.fa

samtools faidx hg19.fa

download other required big data files

wget http://hgdownload.cse.ucsc.edu/goldenpath/hg19/bigZips/hg19.2bit

wget https://ftp.ebi.ac.uk/pub/databases/gencode/Gencode_human/release_19/gencode.v19.annotation.gtf.gz

gunzip gencode.v19.annotation.gtf.gz

mv gencode.v19.annotation.gtf GENCODE19.gtf
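After the downloads and indexing finish, it can help to confirm that every expected file is in place before starting a run. The sketch below is not part of FindDNAFusion: report_missing is a hypothetical helper name, and the file list in the example call assumes the FASTA keeps the name hg19.fa, so that "bwa index" writes hg19.fa.amb, .ann, .bwt, .pac and .sa next to it and "samtools faidx" writes hg19.fa.fai.

```shell
# Hypothetical helper, not shipped with FindDNAFusion.
# Prints each expected file that is absent or empty in the given directory.
# Returns 0 when everything is present, 1 otherwise.
report_missing() {
    dir=$1
    shift
    status=0
    for name in "$@"; do
        # -s is true only for files that exist and are non-empty,
        # so an interrupted download that left a 0-byte file is reported too.
        if [ ! -s "$dir/$name" ]; then
            echo "missing: $dir/$name"
            status=1
        fi
    done
    return "$status"
}

# Example call from the FindDNAFusion directory (file names are assumptions):
#   report_missing database hg19.fa hg19.fa.fai hg19.fa.amb hg19.fa.ann \
#       hg19.fa.bwt hg19.fa.pac hg19.fa.sa hg19.2bit GENCODE19.gtf
```

The check only looks at presence and size; it does not verify that a file is complete or uncorrupted.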

install software dependencies

  1. JULI v0.1.6.2 from https://github.com/sgilab/JuLI
  2. FACTERA 1.4.4 from http://factera.stanford.edu
  3. GeneFuse 0.8.0 from https://github.com/OpenGene/GeneFuse
  4. BWA 0.7.17-r1188 or newer
  5. SAMtools 1.10 or newer
  6. Perl modules: Getopt::Long, Cwd, and POSIX
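A quick way to see whether the command-line dependencies are reachable is to look them up on PATH and to ask Perl to load each module. This is a minimal sketch, not something the repository provides: missing_commands and missing_perl_modules are hypothetical helper names, and only bwa, samtools and perl are checked because how FindDNAFusion locates JULI, FACTERA and GeneFuse is not stated here.

```shell
# Hypothetical helpers, not shipped with FindDNAFusion.

# Print every command from the argument list that is not on PATH.
# Returns 0 when all are found, 1 otherwise.
missing_commands() {
    status=0
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "not found: $tool"
            status=1
        fi
    done
    return "$status"
}

# Print every Perl module from the argument list that perl cannot load.
# Returns 0 when all load, 1 otherwise (including when perl itself is absent).
missing_perl_modules() {
    status=0
    for module in "$@"; do
        if ! perl -M"$module" -e 1 >/dev/null 2>&1; then
            echo "cannot load: $module"
            status=1
        fi
    done
    return "$status"
}

# Example calls:
#   missing_commands bwa samtools perl
#   missing_perl_modules Getopt::Long Cwd POSIX
```

These helpers do not compare version numbers; "samtools --version" prints the installed SAMtools version for a manual check against the minimum listed above.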

usage

Provide the following arguments to run FindDNAFusion:

  1. sequence directory containing the raw sequence FASTQ files, specified by -i
  2. BED file with the positions of the targeted intron regions associated with gene fusions, specified by -p
  3. reference genome in FASTA format, which must be in the same directory as the files generated by "bwa index", specified by -r (this release supports only hg19)
  4. output directory, specified by -o (optional)
  5. number of CPUs to use per sample, specified by -c (optional)

#for example

./FindDNAFusion -i /ion/LNGS-new/RUN163/raw_seq_dir -p database/example-targeted-intron-regions.bed -r database/hg19.fa -c 16 -o output_dir &
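Because the run is started in the background, a malformed BED file passed to -p may only show up later in the logs. The sketch below is a hedged pre-flight check and not part of the pipeline: check_bed is a hypothetical helper that tests only the three mandatory BED columns (chromosome, start, end, with start smaller than end). Any additional columns FindDNAFusion may expect are not examined.

```shell
# Hypothetical helper, not shipped with FindDNAFusion.
# Prints every BED line whose first three columns are not
# "name, non-negative integer start, larger integer end".
# Returns 0 when all lines pass, 1 otherwise.
check_bed() {
    awk '
        # Skip blank lines and the optional BED header/comment lines.
        NF == 0 { next }
        /^(#|track|browser)/ { next }
        NF < 3 || $2 !~ /^[0-9]+$/ || $3 !~ /^[0-9]+$/ || $2 + 0 >= $3 + 0 {
            printf "line %d: %s\n", NR, $0
            bad = 1
        }
        END { exit (bad ? 1 : 0) }
    ' "$1"
}

# Example call:
#   check_bed database/example-targeted-intron-regions.bed
```

An empty result with exit status 0 means that every data line has a valid interval.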

Contact & feedback

Xiaokang.Pan@osumc.edu
