The guidance for the DataHarmonizer template's 'original_mutation_description' field says that any string may be entered.
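At the moment, addfunctions2gvf.py relies on the contents of 'original_mutation_description' exactly matching the names derived from the VCF. Because the merge is a left join on the GVF, any annotation row whose description does not match is silently left out of the result, which could result in data loss: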
merged_df = pd.merge(gvf, df, on=['original_mutation_description', 'protein_symbol'], how='left') #, 'alias'
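A minimal sketch of how the mismatch could be surfaced, assuming toy stand-ins for `gvf` and `df`. The mutation names, the free-text description, and the `annotation` column are hypothetical and not taken from the pipeline. It uses the `indicator=True` option of `pd.merge` to flag rows that found no partner:

```python
import pandas as pd

keys = ["original_mutation_description", "protein_symbol"]

# Hypothetical GVF rows, with names derived from the VCF.
gvf = pd.DataFrame({
    "original_mutation_description": ["p.D614G", "p.N501Y"],
    "protein_symbol": ["S", "S"],
})

# Hypothetical annotation rows; the second description is free text,
# which the template guidance allows.
df = pd.DataFrame({
    "original_mutation_description": ["p.D614G", "N501Y (spike)"],
    "protein_symbol": ["S", "S"],
    "annotation": ["effect A", "effect B"],  # hypothetical column
})

# Same merge as in addfunctions2gvf.py, plus an indicator column.
merged_df = pd.merge(gvf, df, on=keys, how="left", indicator=True)

# GVF rows that received no annotation (their annotation is NaN).
unmatched_gvf = merged_df[merged_df["_merge"] == "left_only"]

# Annotation rows that were never attached to any GVF row.
check = pd.merge(df, gvf, on=keys, how="left", indicator=True)
dropped_annotations = check[check["_merge"] == "left_only"]

print(unmatched_gvf[keys])
print(dropped_annotations)
```

Reporting `dropped_annotations` (for example as a warning or a log file) would at least make the loss visible, even if the matching logic itself stays unchanged.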