This repository contains scripts for analyzing genome mapping data from SAM files. The scripts extract mapping information, summarize mapped reads across genes, and identify multi-mapped reads.
- Scripts/: Contains the Python scripts for data processing and analysis.
- data/: Directory to store input SAM files.
- results/: Directory to store output CSV files.
The multi_mapper_analysis.py script identifies and counts multi-mapped and unique-mapped reads in SAM files.
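As a rough illustration of the idea, and not the script's actual code, the sketch below counts unique- and multi-mapped reads from plain SAM text lines using only the standard library. The function name `classify_reads` is hypothetical. It assumes single-end reads and treats a read as multi-mapped when its name appears on more than one mapped alignment record; supplementary alignments would also be counted under this simple rule.

```python
from collections import Counter


def classify_reads(sam_lines):
    """Count unique- and multi-mapped reads from SAM text lines.

    Assumption: single-end data, where each extra mapped record for the
    same read name (QNAME) represents an additional mapping location.
    """
    hits = Counter()
    for line in sam_lines:
        # Skip blank lines and header lines, which start with "@".
        if not line.strip() or line.startswith("@"):
            continue
        fields = line.rstrip("\n").split("\t")
        qname, flag = fields[0], int(fields[1])
        if flag & 0x4:  # FLAG bit 0x4 marks an unmapped read
            continue
        hits[qname] += 1
    unique = sum(1 for n in hits.values() if n == 1)
    multi = sum(1 for n in hits.values() if n > 1)
    return {"unique": unique, "multi": multi}


# Made-up records for illustration only.
example = [
    "@HD\tVN:1.6",
    "r1\t0\tgeneA\t10\t255\t5M\t*\t0\t0\tACGTA\tIIIII",
    "r2\t0\tgeneA\t20\t1\t5M\t*\t0\t0\tACGTA\tIIIII",
    "r2\t256\tgeneB\t7\t1\t5M\t*\t0\t0\tACGTA\tIIIII",
    "r3\t4\t*\t0\t0\t*\t*\t0\t0\tACGTA\tIIIII",
]
counts = classify_reads(example)
print(counts)
```

Some aligners also report the number of hits in the optional `NH` tag, which could be used in place of counting records when it is present.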
The unmapped_analysis_script.py script merges data from multiple SAM files, summarizes mapped reads per gene, and filters genes using read-count thresholds.
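The merge, summarize, and filter steps can be sketched as follows. This is a minimal standard-library illustration under stated assumptions, not the script itself, which may well use pandas. It assumes that the reference name (column 3 of each SAM record) is the gene name; the function names `gene_counts` and `merge_and_filter` and the `min_reads` parameter are hypothetical.

```python
from collections import Counter


def gene_counts(sam_lines):
    """Count mapped reads per reference name (assumed to be a gene)."""
    counts = Counter()
    for line in sam_lines:
        # Skip blank lines and header lines, which start with "@".
        if not line.strip() or line.startswith("@"):
            continue
        fields = line.rstrip("\n").split("\t")
        flag, rname = int(fields[1]), fields[2]
        if flag & 0x4 or rname == "*":  # unmapped or no reference
            continue
        counts[rname] += 1
    return counts


def merge_and_filter(per_sample, min_reads):
    """Sum per-gene counts across samples and keep genes with at
    least min_reads mapped reads in total (hypothetical threshold)."""
    totals = Counter()
    for counts in per_sample.values():
        totals.update(counts)
    return {gene: n for gene, n in totals.items() if n >= min_reads}


# Made-up records standing in for two SAM files.
sample_a = [
    "r1\t0\tgeneA\t10\t60\t5M\t*\t0\t0\tACGTA\tIIIII",
    "r2\t0\tgeneA\t15\t60\t5M\t*\t0\t0\tACGTA\tIIIII",
    "r3\t0\tgeneB\t30\t60\t5M\t*\t0\t0\tACGTA\tIIIII",
]
sample_b = [
    "r4\t16\tgeneA\t12\t60\t5M\t*\t0\t0\tACGTA\tIIIII",
    "r5\t4\t*\t0\t0\t*\t*\t0\t0\tACGTA\tIIIII",
]
per_sample = {"a": gene_counts(sample_a), "b": gene_counts(sample_b)}
kept = merge_and_filter(per_sample, min_reads=2)
print(kept)
```

Filtering on the summed count is only one possible choice; a per-sample threshold would be an equally plausible reading of "read-count thresholds".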
- Install the required packages:
  pip install -r requirements.txt
- Run the multi-mapper analysis script:
  python Scripts/multi_mapper_analysis.py
- Run the unmapped analysis script:
  python Scripts/unmapped_analysis_script.py
- pandas
- pysam
Place your SAM files in the data/ directory.