Skip to content

feat: Add parse option for char-based columns in Sourcepos#779

Merged
kivikakk merged 3 commits intokivikakk:mainfrom
Martin005:sourcepos_chars
Mar 29, 2026
Merged

feat: Add parse option for char-based columns in Sourcepos#779
kivikakk merged 3 commits intokivikakk:mainfrom
Martin005:sourcepos_chars

Conversation

@Martin005
Copy link
Copy Markdown
Contributor

This PR introduces a new parsing option to report source position columns as Unicode character counts instead of UTF-8 byte offsets.

I created a lot of tests for different node values / extensions, but the tests are not exhaustive. Please let me know if I should cover more things 🙂

Closes #777

@codspeed-hq
Copy link
Copy Markdown

codspeed-hq bot commented Mar 24, 2026

Merging this PR will not alter performance

✅ 1 untouched benchmark


Comparing Martin005:sourcepos_chars (583d092) with main (0d4a9ca)1

Open in CodSpeed

Footnotes

  1. No successful run was found on main (583d092) during the generation of this report, so 0d4a9ca was used instead as the comparison base. There might be some changes unrelated to this pull request in this report.

@Martin005 Martin005 changed the title feat: Add support for char-based columns in Sourcepos feat: Add parse option for char-based columns in Sourcepos Mar 24, 2026
Copy link
Copy Markdown
Owner

@kivikakk kivikakk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you, this looks great! 🤍

if lc.column == 0 {
return;
}
if let Some(line) = lines.get(lc.line.wrapping_sub(1)) {
Copy link
Copy Markdown
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

wrapping_sub cannot produce a useful result if lc.line is zero here (18446744073709551615); it would make more sense to give an underflow error than an out-of-bounds one.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, that's right, thanks for noticing. Should I cover that in another PR? 🙂

Copy link
Copy Markdown
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I generally consider such non-blocking comments as: you can feel free to fix if you like, whether in its own PR or just whatever one you happen to do next; and maybe I will if I happen to write a PR myself sometime soon and remember :)

@kivikakk kivikakk enabled auto-merge March 29, 2026 00:43
@kivikakk kivikakk merged commit 79cd65d into kivikakk:main Mar 29, 2026
26 checks passed
@kivikakk
Copy link
Copy Markdown
Owner

I created a lot of tests for different node values / extensions, but the tests are not exhaustive. Please let me know if I should cover more things 🙂

Your tests are always super exhaustive to my eyes; I really appreciate your efforts!

@Martin005 Martin005 deleted the sourcepos_chars branch March 29, 2026 06:53
@Martin005
Copy link
Copy Markdown
Contributor Author

Martin005 commented Mar 29, 2026

I created a lot of tests for different node values / extensions, but the tests are not exhaustive. Please let me know if I should cover more things 🙂

Your tests are always super exhaustive to my eyes; I really appreciate your efforts!

By not being exhaustive, I meant that the test cases don't test every single node value / extension. And there isn't a case for a long Markdown document with multiple node values.
But I don't think there should be a situation where the sourcepos_chars gives an incorrect value as the implementation is very straightforward (famous last words 😀)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

(Proposal) Parse option to make sourcepos column char-based

2 participants