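So apparently there are quite a few issues with Python 3 that make reading input files difficult.
a.) UTF-8/Latin-1 file encodings are hard to handle in Python 3, especially when the behaviour has to stay consistent with how Python 2 handled input automatically. Python 2 passed the raw bytes through undecoded, whereas Python 3 decodes text files on read and raises a UnicodeDecodeError on bytes that are invalid in the chosen encoding.
See for details: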
https://stackoverflow.com/questions/12468179/unicodedecodeerror-utf8-codec-cant-decode-byte-0x9c
b.) I guess the easiest way would be to add pysam (installable via pip, conda, etc.) as a dependency and then rely on it as the reading/writing library for SAM/BAM compatibility. One could even activate automatic MD tagging, which would make the process easier for users too.
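
For point a.), here is a minimal sketch of one way to read text input in Python 3 without tripping over mixed encodings. It assumes the inputs are either UTF-8 or Latin-1; the helper name `read_text_lines` and the fallback order are just for illustration and are not taken from the existing code. UTF-8 is tried first, and Latin-1 acts as a catch-all because it maps every byte to a character and so never raises.

```python
import os
import tempfile


def read_text_lines(path, encodings=("utf-8", "latin-1")):
    """Return the lines of `path`, trying each encoding in order.

    Hypothetical helper: the fallback order is an assumption, not
    something the existing code base prescribes.
    """
    for enc in encodings:
        try:
            with open(path, encoding=enc) as handle:
                # Decoding happens while iterating, so keep it inside the try.
                return [line.rstrip("\n") for line in handle]
        except UnicodeDecodeError:
            continue  # fall through to the next candidate encoding
    raise ValueError("could not decode %s with any of %r" % (path, encodings))


# Demo: byte 0x9c is invalid as a UTF-8 start byte, as in the linked question.
fd, demo_path = tempfile.mkstemp(suffix=".txt")
os.write(fd, b"read1\t\x9c\n")
os.close(fd)

lines = read_text_lines(demo_path)
os.remove(demo_path)
```

The file above fails to decode as UTF-8, so the helper falls back to Latin-1 and returns the line with the byte mapped to the code point U+009C. Whether a silent fallback is acceptable or whether a warning should be logged is a design choice left open here.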