9 changes: 4 additions & 5 deletions .github/workflows/build.yaml
@@ -47,16 +47,15 @@ jobs:
  uses: actions/checkout@v3
  - name: Setup conda
  id: setup_conda
- uses: conda-incubator/setup-miniconda@v2
+ uses: conda-incubator/setup-miniconda@v3
  with:
- python-version: 3.9
- miniforge-variant: Mambaforge-pypy3
  channels: umccr,bioconda,conda-forge,defaults
  channel-priority: true
+ python-version: 3.9
  - name: Prepare env
  id: prepare_env
  run: |
- mamba install boa anaconda-client bump2version
+ conda install bump2version conda-build anaconda-client
  # When on development branch (inferred from bump_version value), append '-dev' to package name and set
  # commit-specific version
  - name: Set package name and version
@@ -69,7 +68,7 @@ jobs:
  --no-commit
  - name: Build and upload conda package
  run: |
- conda mambabuild \
+ conda build \
  --token "${{ secrets.ANACONDA_TOKEN }}" \
  conda/bactabolize/

4 changes: 1 addition & 3 deletions .github/workflows/lint.yaml
@@ -9,10 +9,8 @@ jobs:
  shell: bash -l {0}
  steps:
  - uses: actions/checkout@v2
- - uses: conda-incubator/setup-miniconda@v2
+ - uses: conda-incubator/setup-miniconda@v3
  with:
- python-version: 3.9
- miniforge-variant: Mambaforge-pypy3
  environment-file: requirements-dev.yaml
  - name: execute_precommit_hooks
  run: |
4 changes: 1 addition & 3 deletions .github/workflows/test.yaml
@@ -9,10 +9,8 @@ jobs:
  shell: bash -l {0}
  steps:
  - uses: actions/checkout@v2
- - uses: conda-incubator/setup-miniconda@v2
+ - uses: conda-incubator/setup-miniconda@v3
  with:
- python-version: 3.9
- miniforge-variant: Mambaforge-pypy3
  environment-file: requirements-dev.yaml
  - name: prepare_environment
  run: |
2 changes: 1 addition & 1 deletion .mdlrc
@@ -1,2 +1,2 @@
- rules "~MD024"
+ rules "~MD024,~MD036"
  style "#{File.dirname(__FILE__)}/.mdl_style.rb"
2 changes: 1 addition & 1 deletion .pylintrc
@@ -1,5 +1,5 @@
  [MESSAGE CONTROL]
- disable=logging-fstring-interpolation,duplicate-code,missing-module-docstring,use-dict-literal,use-list-literal,invalid-name,too-many-locals,too-few-public-methods,unspecified-encoding,missing-function-docstring,consider-using-f-string,missing-class-docstring,subprocess-run-check,too-many-arguments
+ disable=logging-fstring-interpolation,duplicate-code,missing-module-docstring,use-dict-literal,use-list-literal,invalid-name,too-many-locals,too-few-public-methods,unspecified-encoding,missing-function-docstring,consider-using-f-string,missing-class-docstring,subprocess-run-check,too-many-arguments,import-error

  [FORMAT]
  max-line-length=120
7 changes: 4 additions & 3 deletions README.md
@@ -3,11 +3,13 @@
  A high-throughput genome-scale metabolic reconstruction and growth simulation pipeline.

  ## How to run
+
  **Install and quick start [here](https://github.com/kelwyres/Bactabolize/wiki/1.-Quick-start)**

  **Visit the [wiki](https://github.com/kelwyres/Bactabolize/wiki) to find out more!**

  ## Description
+
  Bactabolize is designed for rapid generation of strain-specific metabolic reconstructions from bacterial genome data
  using the approach described in [Norsigian et al. Nature Protocols
  2020](https://www.nature.com/articles/s41596-019-0254-3). It leverages the [COBRApy
@@ -29,8 +31,7 @@ can be performed under a variety of growth conditions and mediums.
  Bactabolize is freely available under a [GNU General Public License v3.0](https://www.gnu.org/licenses/gpl-3.0.en.html).
  Please cite the following papers if you make use of Bactabolize:

- * Vezina B. / Watts S.C. et al. 'Bactabolize: A tool for high-throughput generation of bacterial strain-specific metabolic models'. eLife (2023).
- [https://doi.org/10.7554/eLife.87406.3](https://doi.org/10.7554/eLife.87406.3)
+ * Vezina B. / Watts S.C. et al. 'Bactabolize: A tool for high-throughput generation of bacterial strain-specific
+ metabolic models'. eLife (2023). [https://doi.org/10.7554/eLife.87406.3](https://doi.org/10.7554/eLife.87406.3)
  * Ebrahim, A., Lerman, J.A., Palsson, B.O. et al. 'COBRApy: COnstraints-Based Reconstruction and Analysis for Python'. BMC
  Syst Biol 7, 74 (2013). <https://doi.org/10.1186/1752-0509-7-74>
-
2 changes: 2 additions & 0 deletions bactabolize/alignment.py
@@ -38,6 +38,7 @@ def __str__(self):


  def run_blastp(query_fp, subject_fp):
+ # pylint: disable=no-else-return
  # Create a database
  create_blast_database(subject_fp, 'prot')
  # Run alignment
@@ -51,6 +52,7 @@ def run_blastp(query_fp, subject_fp):


  def run_blastn(query_fp, subject_fp):
+ # pylint: disable=no-else-return
  # Create database
  create_blast_database(subject_fp, 'nucl')
  # Run alignment
15 changes: 9 additions & 6 deletions bactabolize/annotate.py
@@ -33,6 +33,8 @@ def run(assembly_fp, output_fp):
  elif assembly_filetype == 'fasta':
  assembly_genbank_fp = None
  assembly_fasta_fp = assembly_fp
+ else:
+ assert False
  print('========================================')
  prodigal_data = run_prodigal(assembly_fasta_fp)
  prodigal_orfs = parse_prodigal_output(prodigal_data)
@@ -114,6 +116,7 @@ def get_qual_note(partial_str):


  def match_existing_orfs_updated_annotations(new_fp, existing_fp, overlap_min=0.80):
+ # pylint: disable=too-many-branches
  # Get features and create list of start and end objects for each
  features_new = collect_all_features(new_fp)
  features_existing = collect_all_features(existing_fp)
@@ -126,17 +129,17 @@ def match_existing_orfs_updated_annotations(new_fp, existing_fp, overlap_min=0.8
  # Find overlaps
  positions = contig_positions_new[contig] + contig_positions_existing[contig]
  features_matched = discover_overlaps(positions, overlap_min)
-
+
  # Discover those not matched using location comparison
  features_matched_new = [f[0] for f in features_matched]
  features_matched_existing = [f[1] for f in features_matched]
-
+
  # Find unmatched features by comparing locations
  new_unmatched = []
  for feature in features_new[contig]:
  if not any(f.location == feature.location for f in features_matched_new):
  new_unmatched.append(feature)
-
+
  existing_unmatched = []
  for feature in features_existing[contig]:
  if not any(f.location == feature.location for f in features_matched_existing):
@@ -151,7 +154,7 @@ def match_existing_orfs_updated_annotations(new_fp, existing_fp, overlap_min=0.8
  continue
  feature_new.qualifiers[qual] = feature_existing.qualifiers[qual]
  features_updated.append(feature_new)
-
+
  # Add existing ORFs that had no match
  features_updated.extend(new_unmatched)
  features_updated.extend(existing_unmatched)
@@ -163,7 +166,7 @@ def match_existing_orfs_updated_annotations(new_fp, existing_fp, overlap_min=0.8
  print(f'\t{len(existing_unmatched)} existing features unmatched')
  print(f'\t{len(new_unmatched)} re-annotated features unmatched')
  print(f'\t{len(features_updated)} total features')
-
+
  # Update new genbank with new feature set
  update_genbank_annotations(new_fp, contig_features_updated)

@@ -214,7 +217,7 @@ def discover_overlaps(positions, overlap_min):
  continue
  # Check if this pair is already matched by comparing locations
  already_matched = any(
- fn.location == feature_new.location and fe.location == feature_existing.location
+ fn.location == feature_new.location and fe.location == feature_existing.location
  for fn, fe in features_matched
  )
  if already_matched:
4 changes: 2 additions & 2 deletions bactabolize/draft_model.py
@@ -56,13 +56,13 @@ def run(config):
  original_genes.append(gene.id)
  model_draft.notes['Original_Genes'] = original_genes

- # Same gene dictionary of reference model and genome annotations to csv
+ # Same gene dictionary of reference model and genome annotations to csv
  gene_dict_fp = config.output_fp.parent / f'{config.output_fp.stem}_gene_dictionary.csv'
  with open(gene_dict_fp, 'w') as csv_file:
  writer = csv.writer(csv_file)
  for key, value in isolate_orthologs.items():
  writer.writerow([key, value])
-
+
  # Mutate a copy of the model and rename genes
  cobra.manipulation.modify.rename_genes(model_draft, isolate_orthologs)

3 changes: 3 additions & 0 deletions bactabolize/model_fba.py
@@ -21,6 +21,9 @@ def run(config):
  model = cobra.io.load_json_model(fh)
  elif config.model_fp.suffix == '.xml':
  model = read_sbml_model(fh)
+ else:
+ assert False
+
  if config.fba_spec_fp:
  spec = parse_spec(config.fba_spec_fp)
  elif config.fba_spec_name:
4 changes: 4 additions & 0 deletions bactabolize/patch_model.py
@@ -20,12 +20,16 @@ def run(config):
  model_draft = cobra.io.load_json_model(fh)
  elif config.draft_model_fp.suffix == '.xml':
  model_draft = read_sbml_model(fh)
+ else:
+ assert False

  with config.ref_model_fp.open('r') as fh:
  if config.ref_model_fp.suffix == '.json':
  model_ref = cobra.io.load_json_model(fh)
  elif config.ref_model_fp.suffix == '.xml':
  model_ref = read_sbml_model(fh)
+ else:
+ assert False

  patch = parse_patch(config.patch_fp, model_draft.id)
  # Apply patch
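
Review note on the `else: assert False` branches added in `model_fba.py` and `patch_model.py`: they ensure the model variable is always bound before use, but `assert` statements are skipped when Python runs with `-O`, and a bare `AssertionError` does not tell the user which suffix was rejected. A minimal sketch of one alternative follows. The helper name `load_model_by_suffix` and its `loaders` parameter are hypothetical and not part of this PR; only the suffix-dispatch idea comes from the diff.

```python
from pathlib import Path
from typing import IO, Any, Callable, Dict


def load_model_by_suffix(model_fp: Path, loaders: Dict[str, Callable[[IO[str]], Any]]) -> Any:
    """Open model_fp and pass the handle to the loader registered for its suffix.

    Hypothetical helper: raises ValueError for an unknown suffix instead of
    relying on `assert False`, so the check survives `python -O`.
    """
    try:
        loader = loaders[model_fp.suffix]
    except KeyError:
        supported = ', '.join(sorted(loaders))
        raise ValueError(
            f'unsupported model format {model_fp.suffix!r}; expected one of: {supported}'
        ) from None
    # Only open the file once a loader is known to exist
    with model_fp.open('r') as fh:
        return loader(fh)
```

In the code touched by this PR, the mapping would presumably be `{'.json': cobra.io.load_json_model, '.xml': read_sbml_model}`, matching the two branches in the diff. Whether the project would prefer this over the assertion is a judgment call for the maintainers.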