Hackathon team: gene expression analysis for Covid-19 Virtual Biohackathon (vBH)
https://github.com/virtual-biohackathons/covid-19-bh20
https://github.com/virtual-biohackathons/covid-19-bh20/wiki/GeneExpression
We want to perform RNA-Seq analyses on published datasets to better understand the interaction between the human host and the virus.
Fig.1 - We want to focus on already known genes (such as ACE2 and TMPRSS2) as well as on new candidate genes that may play a specific or a general role in the interaction between host and virus. To this end, we will perform extensive RNA-Seq analyses as described in the workflow section below.
Biological: Perform a global RNA-Seq analysis of SARS-CoV-2 infected datasets to search for new candidate genes to test experimentally
Methodological: Create a packaged, reproducible pipeline in Docker to help scientists easily process their RNA-Seq data, and to let us rerun the analysis whenever a new dataset comes out
- Check literature to select interesting genes/datasets to study
- Download RNA-Seq datasets from SRA/GEO
- Pipeline to clean reads
- Map against viral genome (+ viral DBs)
- Map against human genome (genes and isoforms pipelines)
- Check shared reads between both genomes
- Map reads against transposable elements
- Check the existence of chimeric reads
- Perform differential expression analysis on genes, transcripts and TEs
- Functional enrichment analysis
- SNP/splicing analysis on risk factor datasets for selected genes
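
As an illustration of the "Check shared reads between both genomes" step, here is a minimal Python sketch, not the team's actual implementation. It assumes each mapping run produced SAM-formatted text and treats a read as mapped when the unmapped bit (0x4) of its FLAG field is not set. The function names, the toy records and the reference names (`chr1`, `chrX`, `virus`) are hypothetical and only serve the example.

```python
from typing import Iterable, Set

UNMAPPED_FLAG = 0x4  # SAM FLAG bit meaning "segment unmapped"


def mapped_read_names(sam_lines: Iterable[str]) -> Set[str]:
    """Return the names of reads that have at least one reported alignment."""
    names = set()
    for line in sam_lines:
        # Skip blank lines and SAM header lines (which start with '@').
        if not line.strip() or line.startswith("@"):
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 2:
            continue  # not a valid alignment record
        qname, flag = fields[0], int(fields[1])
        if not flag & UNMAPPED_FLAG:
            names.add(qname)
    return names


def shared_reads(human_sam: Iterable[str], virus_sam: Iterable[str]) -> Set[str]:
    """Reads that map to both the human and the viral reference."""
    return mapped_read_names(human_sam) & mapped_read_names(virus_sam)


# Toy records (hypothetical data): the same three reads mapped to each reference.
human_sam = [
    "@HD\tVN:1.6",
    "read1\t0\tchr1\t100\t60\t4M\t*\t0\t0\tACGT\tFFFF",
    "read2\t4\t*\t0\t0\t*\t*\t0\t0\tACGT\tFFFF",
    "read3\t16\tchrX\t500\t60\t4M\t*\t0\t0\tACGT\tFFFF",
]
virus_sam = [
    "read1\t4\t*\t0\t0\t*\t*\t0\t0\tACGT\tFFFF",
    "read2\t0\tvirus\t10\t60\t4M\t*\t0\t0\tACGT\tFFFF",
    "read3\t0\tvirus\t250\t60\t4M\t*\t0\t0\tACGT\tFFFF",
]

print(sorted(shared_reads(human_sam, virus_sam)))  # ['read3']
```

On real data the read names would more likely be extracted from BAM files (for example with samtools) than from in-memory lists. Reads found in both sets are candidates to inspect before quantification, since they may reflect low-complexity sequence or chimeric fragments.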
- FastQC (https://github.com/s-andrews/FastQC)
- Fastq-screen (https://www.bioinformatics.babraham.ac.uk/projects/fastq_screen/)
- Trimmomatic (http://www.usadellab.org/cms/?page=trimmomatic)
- STAR (https://github.com/alexdobin/STAR)
- Bowtie2 (http://bowtie-bio.sourceforge.net/bowtie2/index.shtml)
- TEtools (https://github.com/l-modolo/TEtools)
- TEtranscripts (http://hammelllab.labsites.cshl.edu/software)
- LIONS (www.github.com/ababaian/LIONS)
- samtools / picard (http://samtools.sourceforge.net/; https://broadinstitute.github.io/picard/)
- featureCounts (http://subread.sourceforge.net)
- MultiQC (https://github.com/ewels/MultiQC)
- KisSplice (http://kissplice.prabi.fr/)
ggplot2
install.packages("ggplot2")
if (!requireNamespace("BiocManager", quietly = TRUE))
    install.packages("BiocManager")
DESeq2
BiocManager::install("DESeq2")
edgeR
BiocManager::install("edgeR")
Limma-Voom
BiocManager::install("limma")
SARTools
install.packages("devtools")
library(devtools)
install_github("PF2-pasteur-fr/SARTools", build_opts="--no-resave-data")
- SARS-MERS: https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE56192
- COVID: https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE147507
- Murine coronavirus (M-CoV): https://www.ebi.ac.uk/arrayexpress/experiments/E-MTAB-4111/
