tkinley/genome-mapping-analysis

Genome Mapping Analysis

This repository contains scripts for analyzing genome mapping data from SAM files. The scripts extract mapping information, summarize mapped reads across genes, and identify multi-mapped reads.

Structure

  • Scripts/: Contains the Python scripts for data processing and analysis.
  • data/: Directory to store input SAM files.
  • results/: Directory to store output CSV files.

Scripts

1. Multi-Mapper Analysis

The multi_mapper_analysis.py script identifies and counts multi-mapped and unique-mapped reads in SAM files.
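As a rough illustration of what counting multi-mapped versus uniquely mapped reads can involve, the sketch below parses SAM text using only the standard library. It is not the code in multi_mapper_analysis.py. The function name `count_mapping_classes` is hypothetical, and the sketch assumes single-end reads and treats a read as multi-mapped when its `NH:i` tag is greater than 1 or when it has more than one mapped alignment record.

```python
from collections import Counter

UNMAPPED_FLAG = 0x4  # SAM flag bit: segment unmapped


def count_mapping_classes(sam_lines):
    """Return (unique, multi) read counts from an iterable of SAM lines.

    Assumption: single-end reads. A read counts as multi-mapped if any of
    its alignments carries NH:i:N with N > 1, or if it has more than one
    mapped alignment record.
    """
    hits = Counter()   # read name -> number of mapped alignment records
    nh_multi = set()   # read names whose NH tag reports more than one hit
    for line in sam_lines:
        if not line.strip() or line.startswith("@"):
            continue  # skip blank lines and header lines
        fields = line.rstrip("\n").split("\t")
        name, flag = fields[0], int(fields[1])
        if flag & UNMAPPED_FLAG:
            continue  # unmapped reads belong to neither class
        hits[name] += 1
        for tag in fields[11:]:  # optional fields start at column 12
            if tag.startswith("NH:i:") and int(tag[5:]) > 1:
                nh_multi.add(name)
    multi = sum(1 for name, n in hits.items() if n > 1 or name in nh_multi)
    return len(hits) - multi, multi


example = [
    "@HD\tVN:1.6",
    "r1\t0\tgeneA\t10\t60\t4M\t*\t0\t0\tACGT\tIIII\tNH:i:1",
    "r2\t0\tgeneA\t20\t1\t4M\t*\t0\t0\tACGT\tIIII\tNH:i:2",
    "r2\t256\tgeneB\t5\t1\t4M\t*\t0\t0\tACGT\tIIII\tNH:i:2",
    "r3\t4\t*\t0\t0\t*\t*\t0\t0\tACGT\tIIII",
]
unique, multi = count_mapping_classes(example)
print(unique, multi)
```

Not every aligner writes the `NH` tag, which is why the sketch also falls back to counting alignment records per read name.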

2. Unmapped Analysis

The unmapped_analysis_script.py script merges data from multiple SAM files, summarizes mapped reads per gene, and filters genes using read-count thresholds.
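The merge, per-gene summary, and threshold filter described above might look roughly like the following standard-library sketch. It is not taken from unmapped_analysis_script.py. The names `count_reads_per_gene`, `merge_and_filter`, and `min_reads` are hypothetical, the reference name column (RNAME) is assumed to hold the gene name, and the threshold is assumed to apply to the total count across samples.

```python
from collections import Counter


def count_reads_per_gene(sam_lines):
    """Count mapped alignment records per reference name (assumed to be a gene)."""
    counts = Counter()
    for line in sam_lines:
        if not line.strip() or line.startswith("@"):
            continue  # skip blank lines and header lines
        fields = line.rstrip("\n").split("\t")
        flag, ref = int(fields[1]), fields[2]
        if flag & 0x4 or ref == "*":
            continue  # skip unmapped records
        counts[ref] += 1
    return counts


def merge_and_filter(per_sample, min_reads=1):
    """Merge {sample: Counter} into {gene: {sample: count}}.

    Genes whose total count across all samples is below min_reads are dropped.
    """
    genes = sorted(set().union(*per_sample.values()))
    table = {}
    for gene in genes:
        row = {sample: counts.get(gene, 0) for sample, counts in per_sample.items()}
        if sum(row.values()) >= min_reads:
            table[gene] = row
    return table


sample_a = [
    "r1\t0\tgeneA\t10\t60\t4M\t*\t0\t0\tACGT\tIIII",
    "r2\t0\tgeneA\t30\t60\t4M\t*\t0\t0\tACGT\tIIII",
    "r3\t0\tgeneB\t5\t60\t4M\t*\t0\t0\tACGT\tIIII",
]
sample_b = [
    "r4\t0\tgeneA\t12\t60\t4M\t*\t0\t0\tACGT\tIIII",
    "r5\t4\t*\t0\t0\t*\t*\t0\t0\tACGT\tIIII",
]
per_sample = {
    "a": count_reads_per_gene(sample_a),
    "b": count_reads_per_gene(sample_b),
}
merged = merge_and_filter(per_sample, min_reads=2)
print(merged)
```

With pandas, which the Requirements section lists, the same table could be held in a DataFrame with genes as rows and samples as columns. Plain dictionaries are used here only to keep the sketch dependency-free.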

Usage

  1. Install the required packages:

    pip install -r requirements.txt
  2. Run the multi-mapper analysis script:

    python Scripts/multi_mapper_analysis.py
  3. Run the unmapped analysis script:

    python Scripts/unmapped_analysis_script.py

Requirements

  • pandas
  • pysam

Example Data

Place your SAM files in the data/ directory.
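For orientation, here is a minimal sketch of how a script could pick up SAM files from data/ and write a CSV into results/. The helper names `list_sam_files` and `write_rows_csv` and the `.sam` file extension are assumptions for illustration. The repository's scripts may locate inputs and name outputs differently.

```python
import csv
from pathlib import Path


def list_sam_files(data_dir="data"):
    """Return the .sam files directly inside data_dir, sorted by name."""
    return sorted(Path(data_dir).glob("*.sam"))


def write_rows_csv(rows, out_path, fieldnames):
    """Write a list of dict rows to out_path, creating parent directories."""
    out_path = Path(out_path)
    out_path.parent.mkdir(parents=True, exist_ok=True)
    with out_path.open("w", newline="") as handle:
        writer = csv.DictWriter(handle, fieldnames=fieldnames)
        writer.writeheader()
        writer.writerows(rows)
    return out_path
```

A call such as `write_rows_csv(rows, "results/summary.csv", ["gene", "count"])` would create results/ if it does not exist. The file name `summary.csv` is only a placeholder.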

About

This repository contains Python scripts for analyzing genomic data from SAM files, focusing on mapping reads and filtering genes based on mapping quality and coverage.
