@glennhickey
Contributor

MAF q-lines (news to me) are defined here

 Lines starting with "q" -- information about the quality of each aligned base for the species

 s hg18.chr1                  32741 26 + 247249719 TTTTTGAAAAACAAACAACAAGTTGG
 s panTro2.chrUn            9697231 26 +  58616431 TTTTTGAAAAACAAACAACAAGTTGG
 q panTro2.chrUn                                   99999999999999999999999999
 s dasNov1.scaffold_179265     1474  7 +      4584 TT----------AAGCA---------
 q dasNov1.scaffold_179265                         99----------32239--------- 

This PR adds support for them in TAF, using column tags with key "q" whose value is a string of ASCII Phred characters (min = 0 = '!', max = 93 = '~'). MAF cannot represent this full range, since it supports only 10 distinct values.
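To make the two value ranges concrete, here is a minimal Python sketch of the ASCII Phred mapping described above (offset 33, so 0 maps to '!' and 93 maps to '~'). The function names are hypothetical and are not taken from the TAF code. The `maf_digit` helper follows the rule given in UCSC's MAF format description, min(floor(Q/5), 9); how this PR maps a MAF digit back to a Phred value is not stated here, so that direction is left out.

```python
PHRED_OFFSET = 33  # '!' is ASCII 33, so score 0 encodes as '!'
MAX_PHRED = 93     # '~' is ASCII 126, so score 93 encodes as '~'


def phred_to_ascii(scores):
    """Encode integer Phred scores (0..93) as one printable character each."""
    for score in scores:
        if not 0 <= score <= MAX_PHRED:
            raise ValueError(f"Phred score out of range: {score}")
    return "".join(chr(score + PHRED_OFFSET) for score in scores)


def ascii_to_phred(text):
    """Decode an ASCII Phred string back into integer scores."""
    return [ord(char) - PHRED_OFFSET for char in text]


def maf_digit(score):
    """Collapse a Phred score to a MAF q-line digit (0..9), per the UCSC rule."""
    return min(score // 5, 9)
```

With this mapping, 94 distinct scores collapse to 10 MAF digits, which is why a round trip through MAF loses resolution.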

Also, since everything needs to be transposed all the time, it assumes there's a score for every position. If a row has no scores, the maximum score is used as the default.
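The transposition and default can be sketched roughly as follows. This is an illustration only, not the PR's implementation: the row layout, the `column_qualities` name, and the choice to skip gap positions are all assumptions made for the example.

```python
from typing import List, Optional, Tuple

MAX_PHRED = 93  # default used when a row carries no quality scores

# Hypothetical row layout: (aligned bases, per-column scores or None).
# Scores are indexed by alignment column; entries at gap columns are ignored.
Row = Tuple[str, Optional[List[int]]]


def column_qualities(rows: List[Row]) -> List[List[int]]:
    """Transpose row-wise qualities into one list of scores per column."""
    if not rows:
        return []
    width = len(rows[0][0])
    columns = []
    for i in range(width):
        scores = []
        for bases, quals in rows:
            if bases[i] == "-":
                continue  # assumption: gaps carry no score
            # Rows without a q-line fall back to the maximum score.
            scores.append(MAX_PHRED if quals is None else quals[i])
        columns.append(scores)
    return columns


rows = [
    ("TT-A", None),              # no q-line for this row
    ("TTCA", [40, 40, 30, 20]),  # explicit scores
]
per_column = column_qualities(rows)
```

Here the first row has no q-line, so each of its non-gap bases contributes 93 to its column.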

@glennhickey
Contributor Author

The ASCII Phred values weren't working in practice because they can include ':' (the character for a score of 25), which is reserved in the TAF tag format. So I've switched to vg's base64 encoding of the raw 1-byte quality values.
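The collision and the workaround can be illustrated with a short Python sketch. It uses Python's standard `base64` module as a stand-in; vg's encoder may differ in alphabet or padding, so treat the exact output characters as an assumption. The function names are hypothetical.

```python
import base64

PHRED_OFFSET = 33  # ASCII Phred offset: score 0 is '!'


def phred_to_ascii(scores):
    """ASCII Phred encoding, one character per score."""
    return "".join(chr(score + PHRED_OFFSET) for score in scores)


def encode_quals_base64(scores):
    """Pack scores as raw bytes (one byte each), then base64-encode them."""
    return base64.b64encode(bytes(scores)).decode("ascii")


def decode_quals_base64(text):
    """Recover the integer scores from a base64 string."""
    return list(base64.b64decode(text))


scores = [40, 25, 93, 0]
ascii_form = phred_to_ascii(scores)        # contains ':' because 25 + 33 == 58
base64_form = encode_quals_base64(scores)  # alphabet is A-Z, a-z, 0-9, '+', '/', '='
```

Because the standard base64 alphabet has no ':', the encoded value can sit inside a key:value tag without escaping, at the cost of no longer being human-readable.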
