UTR_maker

Extract 5' and 3' UTRs from annotated cDNA (mRNA) transcript models.

Tutorial

This tutorial demonstrates how to use the UTR Maker script "utr_maker.py" to analyze GenBank files and extract 5' and 3' UTR regions from mRNA transcript models. We'll use two example files: MZ242719.txt and MZ242720.txt.

Prerequisites

Before using the script, ensure you have the following installed:

Python 3.6 or later
Biopython library (pip install biopython)
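
Both prerequisites can be checked before running the script. The sketch below is not part of utr_maker.py; the helper names has_module and check_prerequisites are made up for illustration. It relies only on the fact that Biopython is imported as the top-level package "Bio".

```python
import importlib.util
import sys


def has_module(name):
    """Return True if a top-level module with this name can be found."""
    return importlib.util.find_spec(name) is not None


def check_prerequisites():
    """Return a list of human-readable problems; an empty list means none found."""
    problems = []
    if sys.version_info < (3, 6):
        problems.append("Python 3.6 or later is required")
    # Biopython installs under the package name "Bio".
    if not has_module("Bio"):
        problems.append("Biopython is missing (pip install biopython)")
    return problems


if __name__ == "__main__":
    issues = check_prerequisites()
    for issue in issues:
        print(issue)
    if not issues:
        print("All prerequisites found")
```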

Script Overview

The UTR Maker script provides a class-based approach to:

Read GenBank files
Identify locus segments
Find coding sequence boundaries
Extract 5' and 3' UTRs
Save the results
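
The core of these steps can be sketched on plain Python data, without Biopython. This is an illustration of the idea only, not the script's actual implementation: the function names, the (type, start, end) feature tuples, and the toy sequence are all assumptions, and positions are taken to be 1-based and inclusive as written in GenBank feature tables.

```python
def coding_span(features):
    """Return (first CDS start, last CDS end), both 1-based and inclusive."""
    cds = [(start, end) for kind, start, end in features if kind == "CDS"]
    if not cds:
        raise ValueError("no CDS features found")
    return min(start for start, _ in cds), max(end for _, end in cds)


def split_utrs(sequence, features):
    """Return the sequence before the first CDS and after the last CDS."""
    first_start, last_end = coding_span(features)
    five_prime_utr = sequence[: first_start - 1]  # bases before the first CDS base
    three_prime_utr = sequence[last_end:]         # bases after the last CDS base
    return five_prime_utr, three_prime_utr


# Toy transcript (made-up data): 5 nt leader, two CDS features, 4 nt trailer.
sequence = "GGGAC" + "ATGAAATAG" + "CC" + "ATGCCCTGA" + "TTTT"
features = [("misc_feature", 1, 5), ("CDS", 6, 14), ("CDS", 17, 25)]

five_prime_utr, three_prime_utr = split_utrs(sequence, features)
print("5' UTR:", five_prime_utr)
print("3' UTR:", three_prime_utr)
```

Taking the minimum start and maximum end over all CDS features mirrors the "first CDS start and last CDS stop" rule described in this README; sequence between two CDS features is not treated as UTR in this sketch.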

Installation

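Save the script as utr_maker.py. The repository copy is named utr-maker.py; the underscore is needed because a hyphenated file name cannot be imported as a Python module.
Place your GenBank files in the same directory as the script, or pass the full path when creating the UTRMaker instance.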

Basic Usage

from utr_maker import UTRMaker

# Create UTRMaker instance
maker = UTRMaker("MZ242719.txt")

# Save UTRs to files
maker.save_utrs("MZ242719")

# Get detailed information
details = maker.get_utr_details()

Example Analysis

Let's analyze our two example files:

Example 1: MZ242719.txt

This file contains a complete HIV-1 genome. The UTRs are structured as follows:

5' UTR components:

Locus segment 1 (positions 1-290): Minimal 5' UTR
Part of segment 2 before gag starts (positions 291-336)
Total length: 336 bp

3' UTR components:

Sequence after the nef gene (positions 8955-9173)
Total length: 219 bp

maker = UTRMaker("MZ242719.txt")
details = maker.get_utr_details()
print(f"5' UTR length: {details['5_utr_length']} bp")
print(f"3' UTR length: {details['3_utr_length']} bp")

Example 2: MZ242720.txt

This file represents a different transcript model. The UTRs are structured as:

5' UTR components:

Locus segment 1 (positions 1-290): Minimal 5' UTR
Locus segment 9 (positions 291-359): tat/rev segment
Part of segment 10 before coding starts (positions 360-375)
Total length: 375 bp

3' UTR components:

Sequences after the last coding sequence (positions 3723-3941)
Total length: 219 bp

maker = UTRMaker("MZ242720.txt")
details = maker.get_utr_details()
print(f"5' UTR length: {details['5_utr_length']} bp")
print(f"3' UTR length: {details['3_utr_length']} bp")
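
The component ranges quoted in both examples can be cross-checked with a few lines of arithmetic. The ranges are read as 1-based and inclusive, which is the reading under which they add up to the stated totals; the helper name span_length is made up for this sketch.

```python
def span_length(start, end):
    """Length of a 1-based inclusive range."""
    return end - start + 1


# 5' UTR components as listed in the two examples.
five_prime = {
    "MZ242719": [(1, 290), (291, 336)],
    "MZ242720": [(1, 290), (291, 359), (360, 375)],
}
# 3' UTR ranges as listed in the two examples.
three_prime = {
    "MZ242719": (8955, 9173),
    "MZ242720": (3723, 3941),
}

totals = {}
for name, parts in five_prime.items():
    five_total = sum(span_length(start, end) for start, end in parts)
    three_total = span_length(*three_prime[name])
    totals[name] = (five_total, three_total)
    print(f"{name}: 5' UTR {five_total} bp, 3' UTR {three_total} bp")
```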

Output Files

For each input file, the script generates two FASTA files:

{filename}_5UTR.fasta: Contains the 5' UTR sequence
{filename}_3UTR.fasta: Contains the 3' UTR sequence
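
For readers who want to produce files with the same naming pattern from their own code, here is a minimal standard-library sketch. The helper names, the header text, and the 60-character line width are assumptions; the script's own output may differ in those details.

```python
import tempfile
from pathlib import Path


def utr_output_paths(prefix, directory="."):
    """Build the two output paths from the naming pattern described above."""
    base = Path(directory)
    return base / f"{prefix}_5UTR.fasta", base / f"{prefix}_3UTR.fasta"


def write_fasta(path, header, sequence, width=60):
    """Write one record: a '>' header line, then the sequence wrapped at `width`."""
    lines = [sequence[i:i + width] for i in range(0, len(sequence), width)]
    Path(path).write_text(f">{header}\n" + "\n".join(lines) + "\n")


# Demonstration with toy sequences, written to a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    five_path, three_path = utr_output_paths("MZ242719", tmp)
    write_fasta(five_path, "MZ242719 5' UTR (toy sequence)", "ACGT" * 20)
    write_fasta(three_path, "MZ242719 3' UTR (toy sequence)", "TTGCA" * 5)
    print(five_path.name, three_path.name)
    print(five_path.read_text())
```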

Advanced Features

The script provides several utility methods:

find_segment_boundaries(): Get all locus segment positions
find_coding_boundaries(): Find first CDS start and last CDS stop
get_utr_details(): Get comprehensive information about UTRs and segments
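
Combining segment positions with the first CDS start gives the kind of breakdown shown in the examples (whole segments, then part of the segment in which coding begins). The sketch below assumes segments are available as (name, start, end) tuples with 1-based inclusive positions; the actual return types of find_segment_boundaries() and find_coding_boundaries() are not documented here, so treat the data shapes, the function name, and the toy coordinates as hypothetical.

```python
def five_prime_components(segments, first_cds_start):
    """List the segment pieces that lie upstream of the first CDS start.

    Each result is (name, start, end, is_partial), where is_partial marks a
    segment that was clipped because coding starts inside it.
    """
    pieces = []
    for name, start, end in sorted(segments, key=lambda seg: seg[1]):
        if start >= first_cds_start:
            break  # this segment and all later ones start inside or after the CDS
        clipped_end = min(end, first_cds_start - 1)
        pieces.append((name, start, clipped_end, clipped_end < end))
    return pieces


# Toy coordinates (made up): coding starts at position 16, inside segment 2.
segments = [("segment 1", 1, 10), ("segment 2", 11, 30), ("segment 3", 31, 40)]
components = five_prime_components(segments, first_cds_start=16)

for name, start, end, is_partial in components:
    label = "part of " if is_partial else ""
    print(f"{label}{name}: positions {start}-{end}")
```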

Error Handling

The script includes basic error handling for:

Missing files
Invalid GenBank format
Missing coding sequences
Missing segment annotations

Notes

The script assumes GenBank format input files
UTR regions are determined based on coding sequence boundaries
Locus segments are identified from misc_feature annotations
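Positions handled by the script are 0-based following Python convention; the ranges quoted in the examples above are 1-based and inclusive (for example, 1-290 covers 290 bp)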
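
GenBank flat files write coordinates as 1-based and inclusive, while Python slices (and Biopython feature locations) are 0-based and end-exclusive, so converting between the two is a common source of off-by-one errors. A small sketch of the conversion, using hypothetical helper names that are not part of utr_maker.py:

```python
def to_python_slice(start, end):
    """Convert a 1-based inclusive range to a 0-based, end-exclusive slice."""
    return slice(start - 1, end)


def to_one_based(py_start, py_stop):
    """Convert a 0-based, end-exclusive range back to 1-based inclusive."""
    return py_start + 1, py_stop


toy = "ACGTACGT"
print(toy[to_python_slice(3, 5)])   # bases 3 to 5 of the toy string

window = to_python_slice(1, 290)
print(window.stop - window.start)   # number of bases covered by 1-290

print(to_one_based(0, 290))         # back to the 1-based inclusive form
```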

Troubleshooting

If you encounter issues:

Verify input file format
Check file permissions
Ensure Biopython is properly installed
Verify CDS annotations exist in the GenBank file
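
Several of these checks can be automated with a rough pre-flight function. This is a heuristic text scan using only the standard library, not a real GenBank parser and not something utr_maker.py provides; the function name diagnose_genbank is made up. It assumes a single-record file whose first non-empty line starts with LOCUS and whose feature keys are indented.

```python
import os


def diagnose_genbank(path):
    """Return a list of likely problems with a GenBank file (empty if none found)."""
    if not os.path.isfile(path):
        return [f"{path}: file not found"]
    if not os.access(path, os.R_OK):
        return [f"{path}: not readable, check file permissions"]

    with open(path, encoding="utf-8", errors="replace") as handle:
        lines = handle.read().splitlines()

    problems = []
    non_empty = [line for line in lines if line.strip()]
    if not non_empty or not non_empty[0].startswith("LOCUS"):
        problems.append("first non-empty line does not start with LOCUS")
    # Feature keys are indented in the feature table, e.g. "     CDS   6..14".
    has_cds = any(
        line.startswith(" ") and line.split()[:1] == ["CDS"] for line in lines
    )
    if not has_cds:
        problems.append("no CDS feature found")
    return problems


if __name__ == "__main__":
    for name in ("MZ242719.txt", "MZ242720.txt"):
        print(name, diagnose_genbank(name) or "looks OK")
```

An empty result only means the quick checks passed; parsing the file with Biopython remains the real test of validity.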

For more complex analyses or custom modifications, refer to the script's source code and the Biopython documentation.
