- Add a first version of Sima de los Huesos and Denisovan diagnostic positions to the tree XML
- Add an upper hierarchy (NA'SIMA'DEN) for the common positions of Neanderthal, Sima and Denisova
- Add a first version of a final report
- Add stats and output-files for Deduped and Deaminated sequences separately
- Remove the `min_support` and `max_gaps` flags, as they are not helpful in mixed samples (that was the original intention...)
- add Neanderthal haplogroups to the phylotree XML file, as published in Andreeva et al. 2022
- Update LICENSE to include the MIT LICENSE of the Institute of Genetic Epidemiology, Innsbruck
- Include the `--penalty_plus` flag (default: 2) to display nodes with higher penalty than the lowest.
- add `TotalMismatch` column, indicating the total number of mismatches in the branch support between covered and supported positions
- add `SumOfGaps` column that shows the accumulated number of gaps in this branch
- add `DistanceToBest` column that shows the difference of supported branch positions to the maximum number of supported positions in any branch
- update the `Penalty` column to show `SumOfGaps` + `TotalMismatch` + `DistanceToBest`
- Rename (and add) output tree-tables
  - rename the full table from `all_groups` to `raw`
  - add a table that shows the path to the nodes with the best two (minimum) penalty scores (`best`)
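The way the penalty columns combine can be sketched as follows. This is a minimal illustration, not the tool's actual code: the `Node` class, the `select_nodes` helper, and the sample values are hypothetical, and it assumes that `--penalty_plus` keeps every node whose penalty is at most the lowest penalty plus the given value.

```python
from dataclasses import dataclass


@dataclass
class Node:
    """Hypothetical stand-in for one row of the output tree-table."""
    name: str
    sum_of_gaps: int
    total_mismatch: int
    distance_to_best: int

    @property
    def penalty(self) -> int:
        # Penalty = SumOfGaps + TotalMismatch + DistanceToBest
        return self.sum_of_gaps + self.total_mismatch + self.distance_to_best


def select_nodes(nodes, penalty_plus=2):
    """Keep nodes within `penalty_plus` of the lowest penalty (assumed semantics)."""
    if not nodes:
        return []
    lowest = min(node.penalty for node in nodes)
    return [node for node in nodes if node.penalty <= lowest + penalty_plus]


# Made-up example rows, for illustration only.
nodes = [
    Node("node_a", sum_of_gaps=0, total_mismatch=1, distance_to_best=0),
    Node("node_b", sum_of_gaps=1, total_mismatch=1, distance_to_best=1),
    Node("node_c", sum_of_gaps=2, total_mismatch=2, distance_to_best=2),
]
selected = [node.name for node in select_nodes(nodes)]
```

With these made-up rows the penalties are 1, 3, and 6, so the default of 2 keeps `node_a` and `node_b`.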
- Add the `RequiredGaps` column in the summary that, looking back, counts how many intermediate nodes were skipped to get there (different to the `Penalty` column that looks forward to the closest child)
- introduce the `--max_gaps` and `--min_support` flags to filter the tree based on the BranchSupport and the RequiredGaps columns
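A filter on these two columns might look like the sketch below. It is an assumption-laden illustration rather than the tool's implementation: the `filter_tree` helper, the dictionary row layout, and the sample values are hypothetical, BranchSupport is assumed to be a percentage, and the thresholds of 70 and 3 are borrowed from the output-file names listed in this changelog rather than from documented flag defaults.

```python
def filter_tree(rows, min_support=70.0, max_gaps=3):
    """Keep rows with enough branch support and few enough skipped nodes.

    Assumes each row is a dict with 'BranchSupport' (percent) and
    'RequiredGaps' (count of skipped intermediate nodes).
    """
    return [
        row
        for row in rows
        if row["BranchSupport"] >= min_support and row["RequiredGaps"] <= max_gaps
    ]


# Made-up rows, for illustration only.
rows = [
    {"Node": "node_a", "BranchSupport": 85.0, "RequiredGaps": 0},
    {"Node": "node_b", "BranchSupport": 72.5, "RequiredGaps": 4},
    {"Node": "node_c", "BranchSupport": 40.0, "RequiredGaps": 1},
]
kept = [row["Node"] for row in filter_tree(rows)]
```

Here only `node_a` passes both thresholds: `node_b` needs too many gaps and `node_c` has too little support.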
- Reformat parse_pylotree code (more readable and maintainable)
- Use BranchSupport as a new filter-metric
- Update the output-files (All Nodes, 70% Support, 70% Support + Max 3 Gaps)
- Add MIT LICENSE
- Count a position marked with ! (re-mutation) as 'found' only if the same position was not requested before. This is necessary because such positions hit tip-positions far too often and accumulate hits in large branches.
- e.g. 16311T in L3 -> 16311T! in U1b3 (not found)
- e.g. 16311T in L3 -> 16311A! in U5? (found)
- Second output (groups with coverage) now clips all nodes that have a penalty >= 3
- this can make the second file less useful in some instances, but much more useful in many more :)
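The re-mutation rule from the two 16311 examples above can be sketched as follows. This is one reading of those examples, not the tool's code: the `may_count_as_found` helper is hypothetical, and it assumes that "the same position" means the same position and base (so `16311T!` is blocked by an earlier `16311T`, while `16311A!` is not).

```python
def may_count_as_found(variant, requested_before):
    """Decide whether a variant is eligible to be counted as 'found'.

    `variant` is a string such as '16311T' or '16311T!';
    `requested_before` is the set of variants already requested
    further up the path (assumed bookkeeping, not the tool's own).
    """
    if not variant.endswith("!"):
        # Ordinary positions are always eligible.
        return True
    # A re-mutation is eligible only if its plain form was not requested before.
    return variant.rstrip("!") not in requested_before


requested = {"16311T"}  # e.g. requested in L3
blocked = may_count_as_found("16311T!", requested)  # the U1b3 case
allowed = may_count_as_found("16311A!", requested)  # the U5 case
```

Under this reading, `blocked` is `False` and `allowed` is `True`, matching the "not found" and "found" outcomes in the examples.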