Commit ba9ed15

changes required by @jfy133 for PR #355
1 parent 7ea2790 commit ba9ed15

6 files changed

Lines changed: 29 additions & 28 deletions

README.md

Lines changed: 1 addition & 1 deletion
@@ -68,7 +68,7 @@ Additional functionality contained by the pipeline currently includes:

 * Taxonomic binner with alignment (`MALT`)
 * Taxonomic binner without alignment (`Kraken2`)
-* aDNA characteristic screening of taxonomically binned data (`MaltExtract`)
+* aDNA characteristic screening of taxonomically binned data from MALT (`MaltExtract`)

 ## Quick Start

bin/kraken_parse.py

Lines changed: 1 addition & 1 deletion
@@ -44,7 +44,7 @@ def parse_kraken(infile, countlim):
 '''
 INPUT:
 infile (str): path to kraken report file
-countlim (int): lower count threshold to report hit
+countlim (int): lowest count threshold to report hit
 OUTPUT:
 resdict (dict): key=taxid, value=readCount

bin/merge_kraken_res.py

Lines changed: 3 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -15,8 +15,8 @@ def _get_args():
1515
parser.add_argument(
1616
'-o',
1717
dest="output",
18-
default=None,
19-
help="Output file. Default = sources.csv")
18+
default="kraken_count_table.csv",
19+
help="Output file. Default = kraken_count_table.csv")
2020

2121
args = parser.parse_args()
2222

@@ -55,5 +55,5 @@ def write_csv(pd_dataframe, outfile):
5555
OUTFILE = _get_args()
5656
all_csv = get_csv()
5757
resdf = merge_csv(all_csv)
58-
write_csv(resdf, "kraken_otu_table.csv")
58+
write_csv(resdf, OUTFILE)
5959
print(resdf)
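
Taken together, the two hunks in bin/merge_kraken_res.py make the `-o` option take effect: before this commit the output name was hard-coded in the `write_csv` call, so any value passed with `-o` was ignored. The snippet below is a minimal, self-contained sketch of that pattern only. `get_output_path` is a hypothetical helper written for illustration; the full body of the real `_get_args()` is not shown in this diff and may differ.

```python
import argparse


def get_output_path(argv=None):
    """Return the value of -o, falling back to the default set in this commit.

    Hypothetical stand-in for the script's _get_args(); only the '-o'
    argument shown in the diff is reproduced here.
    """
    parser = argparse.ArgumentParser(description="Sketch of the -o handling")
    parser.add_argument(
        "-o",
        dest="output",
        default="kraken_count_table.csv",
        help="Output file. Default = kraken_count_table.csv")
    args = parser.parse_args(argv)
    return args.output


# No -o given: the default file name is used.
print(get_output_path([]))                      # kraken_count_table.csv
# -o given: the caller's file name is used instead of a hard-coded one.
print(get_output_path(["-o", "my_table.csv"]))  # my_table.csv
```

Passing an explicit argument list to `parse_args` keeps the sketch testable without touching `sys.argv`; the script itself calls `parser.parse_args()` with no arguments.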

docs/usage.md

Lines changed: 22 additions & 20 deletions
@@ -919,6 +919,8 @@ For malt, this only applies when `--malt_min_support_mode` is set to 'reads'. De

 Specify the path to the _directory_ containing your taxonomic classifer's database (malt or kraken).

+For Kraken2, it can be either the path to the _directory_ or the path to the `.tar.gz` compressed directory of the Kraken2 database.
+
 #### `--percent_identity`

 Specify the minimum percent identity (or similarity) a squence must have to the reference for it to be retained. Default is 85
@@ -928,13 +930,13 @@ Specify the minimum percent identity (or similarity) a squence must have to the
 Use this to run the program in 'BlastN', 'BlastP', 'BlastX' modes to align DNA and DNA, protein and protein, or DNA reads against protein references respectively.
 respectively. Ensure your database matches the mode. Check the [MALT manual](http://ab.inf.uni-tuebingen.de/data/software/malt/download/manual.pdf) for more details. Default: 'BlastN'

-Only when `--metagenomic_tool malt`
+Only when `--metagenomic_tool malt` is also supplied is also supplied

 #### `--malt_alignment_mode`

 Specify what alignment algorithm to use. Options are 'Local' or 'SemiGlobal'. Local is a BLAST like alignment, but is much slower. Semi-global alignment aligns reads end-to-end. Default: 'SemiGlobal'

-Only when `--metagenomic_tool malt`
+Only when `--metagenomic_tool malt` is also supplied


 #### `--malt_top_percent`
@@ -943,113 +945,113 @@ Specify the top percent value of the LCA algorthim. From the [MALT manual](http:
 read, only those matches are used for taxonomic placement whose bit disjointScore is within
 10% of the best disjointScore for that read.". Default: 1.

-Only when `--metagenomic_tool malt`
+Only when `--metagenomic_tool malt` is also supplied


 #### `--malt_min_support_mode`

 Specify whether to use a percentage, or raw number of reads as the value used to decide the minimum support a taxon requires to be retained.

-Only when `--metagenomic_tool malt`
+Only when `--metagenomic_tool malt` is also supplied


 #### `--malt_min_support_percent`

 Specify the minimum number of reads (as a percentage of all assigned reads) a given taxon is required to have to be retained as a positive 'hit' in the RMA6 file. This only applies when `--malt_min_support_mode` is set to 'percent'. Default 0.01.

-Only when `--metagenomic_tool malt`
+Only when `--metagenomic_tool malt` is also supplied

 #### `--malt_max_queries`

 Specify the maximum number of alignments a read can have. All further alignments are discarded. Default: 100

-Only when `--metagenomic_tool malt`
+Only when `--metagenomic_tool malt` is also supplied

 #### `--malt_memory_mode`

 How to load the database into memory. Options are 'load', 'page' or 'map'. 'load' directly loads the entire database into memory prior seed look up, this is slow but compatible with all servers/file systems. 'page' and 'map' perform a sort of 'chunked' database loading, allow seed look up prior entire database loading. Note that Page and Map modes do not work properly not with many remote filesystems such as GPFS. Default is 'load'.

-Only when `--metagenomic_tool malt`
+Only when `--metagenomic_tool malt` is also supplied

 #### `--run_maltextract`

 Turn on MaltExtract for MALT aDNA characteristics authentication of metagenomic output from MALT.

 More can be seen in the [MaltExtract documentation](https://github.com/rhuebler/)

-Only when `--metagenomic_tool malt`
+Only when `--metagenomic_tool malt` is also supplied

 #### `maltextract_taxon_list`

 Path to a `.txt` file with taxa of interest you wish to assess for aDNA characteristics. In `.txt` file should be one taxon per row, and the taxon should be in a valid [NCBI taxonomy](https://www.ncbi.nlm.nih.gov/taxonomy) name format.

-Only when `--metagenomic_tool malt`
+Only when `--metagenomic_tool malt` is also supplied

 #### `maltextract_ncbifiles`

 Path to directory containing containing the NCBI resource tree and taxonomy table files (ncbi.tre and ncbi.map; avaliable at the [HOPS repository](https://github.com/rhuebler/HOPS/Resources)).

-Only when `--metagenomic_tool malt`
+Only when `--metagenomic_tool malt` is also supplied

 #### `maltextract_filter`

 Specify which MaltExtract filter to use. This is used to specify what types of characteristics to scan for. The default will output statistics on all alignments, and then a second set with just reads with one C to T mismatch in the first 5 bases. Further details on other parameters can be seen in the [HOPS documentation](https://github.com/rhuebler/HOPS/#maltextract-parameters). Options: 'def_anc', 'ancient', 'default', 'crawl', 'scan', 'srna', 'assignment'. Default: 'def_anc'.

-Only when `--metagenomic_tool malt`
+Only when `--metagenomic_tool malt` is also supplied

 #### `maltextract_toppercent`

 Specify percent of top alignments for each read to be considered for each node. Default: 0.01.

-Only when `--metagenomic_tool malt`
+Only when `--metagenomic_tool malt` is also supplied

 #### `maltextract_destackingoff`

 Turn off destacking. If left on, a read that overlap with another read will be removed (leaving a depth coverage of 1).

-Only when `--metagenomic_tool malt`
+Only when `--metagenomic_tool malt` is also supplied

 #### `maltextract_downsamplingoff`

 Turn off downsampling. By default, downsampling is on and will randomly select 10,000 reads if the number of reads on a node exceeds this number. This is to speed up processing, under the assumption at 10,000 reads the species is a 'true positive'.

-Only when `--metagenomic_tool malt`
+Only when `--metagenomic_tool malt` is also supplied

 #### `maltextract_duplicateremovaloff`

 Turn off duplicate removal. By default, reads that are an exact copy (i.e. same start, stop coordinate and exact sequence match) will be removed as it is considered a PCR duplicate.

-Only when `--metagenomic_tool malt`
+Only when `--metagenomic_tool malt` is also supplied

 #### `maltextract_matches`

 Export alignments of hits for each node in BLAST format. By default turned off.

-Only when `--metagenomic_tool malt`
+Only when `--metagenomic_tool malt` is also supplied

 #### `maltextract_megansummary`

 Export 'minimal' summary files (i.e. without alignments) that can be loaded into [MEGAN6](https://doi.org/10.1371/journal.pcbi.1004957). By default turned off.

-Only when `--metagenomic_tool malt`
+Only when `--metagenomic_tool malt` is also supplied

 #### `maltextract_percentidentity`

 Minimum percent identity alignments are required to have to be reported. Higher values allows fewer mismatches between read and reference sequence, but therefore will provide greater confidence in the hit. Lower values allow more mismatches, which can account for damage and divergence of a related strain/species to the reference. Recommended to set same as MALT parameter or higher. Default: 85.0.

-Only when `--metagenomic_tool malt`
+Only when `--metagenomic_tool malt` is also supplied

 #### `maltextract_topalignment`

 Use the best alignment of each read for every statistic, except for those concerning read distribution and coverage. Default: off.

-Only when `--metagenomic_tool malt`
+Only when `--metagenomic_tool malt` is also supplied

 #### `maltextract_singlestranded`

 Switch damage patterns to single-stranded mode. Default: off.

-Only when `--metagenomic_tool malt`
+Only when `--metagenomic_tool malt` is also supplied

 ## Clean up

main.nf

Lines changed: 2 additions & 2 deletions
@@ -2208,10 +2208,10 @@ process kraken_merge {
 file(csv_count) from ch_kraken_parsed.collect()

 output:
-file('kraken_otu_table.csv') into kraken_merged
+file('kraken_count_table.csv') into kraken_merged

 script:
-out = "kraken_otu_table.csv"
+out = "kraken_count_table.csv"
 """
 merge_kraken_res.py -o $out
 """

nextflow.config

Lines changed: 0 additions & 1 deletion
@@ -173,7 +173,6 @@ params {
 malt_top_percent = 1
 malt_min_support_mode = 'percent'
 malt_min_support_percent = 0.01
-metagenomic_min_support_reads = 1
 malt_max_queries = 100
 malt_memory_mode = 'load'
 malt_weighted_lca = false
