Indeed, it's UTF-8, which is a bit of a Yikes™, but compatible with the closest thing we have to an upstream:

$ echo '好' | ./cmark --sourcepos
<p data-sourcepos="1:1-1:3">好</p>

PR to correct the docs happily accepted.
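
To make the 1:1-1:3 span from the cmark output concrete, here is a minimal sketch in plain Rust (standard library only) comparing a column span counted in UTF-8 bytes with one counted in Unicode characters. The function names `byte_column_span` and `char_column_span` are made up for illustration and are not part of comrak or cmark; the sketch also assumes a non-empty, single-line input.

```rust
/// 1-based, inclusive column span of a whole line when columns count UTF-8 bytes.
/// (Hypothetical helper for illustration; assumes `line` is non-empty.)
fn byte_column_span(line: &str) -> (usize, usize) {
    // `str::len` returns the length in bytes, not in characters.
    (1, line.len())
}

/// The same span when columns count Unicode scalar values (Rust `char`s).
/// (Hypothetical helper for illustration; assumes `line` is non-empty.)
fn char_column_span(line: &str) -> (usize, usize) {
    (1, line.chars().count())
}
```

With these helpers, `byte_column_span("好")` yields `(1, 3)`, matching the sourcepos shown above, because 好 occupies three bytes in UTF-8, while `char_column_span("好")` yields `(1, 1)`.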
@saecki I created issue #777 to add a new parsing option that will transform the UTF-8 byte-based columns into Unicode character-based columns. I've also already created an implementation for that: #779. Feel free to take a look 🙂
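
For readers who need character-based columns in the meantime, the conversion can be sketched in plain Rust. This is only an illustration under stated assumptions, not the implementation from #779: `byte_col_to_char_col` is a hypothetical helper, it works on a single line of text that the caller already has, and it treats a "character" as a Unicode scalar value (a Rust `char`), not a grapheme cluster.

```rust
/// Converts a 1-based UTF-8 byte column into a 1-based character column.
///
/// Hypothetical helper for illustration only. A byte column that falls in the
/// middle of a multi-byte character (as an inclusive end column does) maps to
/// the column of the character containing that byte.
/// Returns `None` if `byte_col` is 0 or lies past the end of `line`.
fn byte_col_to_char_col(line: &str, byte_col: usize) -> Option<usize> {
    // Columns are 1-based; byte offsets are 0-based.
    let offset = byte_col.checked_sub(1)?;
    if offset >= line.len() {
        return None;
    }
    // Count the characters that start at or before the byte offset.
    Some(line.char_indices().take_while(|&(i, _)| i <= offset).count())
}
```

For the line `a好b`, byte columns 2, 3, and 4 all belong to 好 and map to character column 2, while byte column 5 (the `b`) maps to character column 3.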
The docs of LineColumn::column say:

This made me assume it was the character-based offset + 1 (essentially as if encoded in UTF-32).

But if a Sourcepos contains UTF-8 characters that are longer than 1 byte, such as ö or 介, this doesn't hold true.

So what encoding is LineColumn::column using? If this isn't already documented somewhere, it would be great if the documentation would mention that.