Skip to content

sort: add locale-aware month parsing using ICU#9722

Open
ChrisDryden wants to merge 4 commits into
uutils:mainfrom
ChrisDryden:sort-locale-month
Open

sort: add locale-aware month parsing using ICU#9722
ChrisDryden wants to merge 4 commits into
uutils:mainfrom
ChrisDryden:sort-locale-month

Conversation

@ChrisDryden

@ChrisDryden ChrisDryden commented Dec 19, 2025

Copy link
Copy Markdown
Collaborator

This PR is to address the lack of locale information when using the -M option in sort since all of the months were hardcoded to the english months. This PR introduces a new dependency icu_datetime to be able to get all of the month locale information.

Originally when trying to fix the associated GNU test it wasn't working because nl was removing non-utf8 characters. Now that the nl issue was addressed this should fix that GNU test.

When implementing this, choices were made that were maybe not the most efficient, but more readable and I think thats the better choice for this type of solution because there will only be a maximum of 12 comparisons when implementing this type of sorting.

@ChrisDryden

Copy link
Copy Markdown
Collaborator Author

Going to have to wait and rebase until this PR is in: #9720

@sylvestre

Copy link
Copy Markdown
Contributor

i will have a look once the jobs are green but the approach looks perfect!

Comment thread src/uucore/src/lib/features/i18n/month.rs
Comment thread src/uucore/src/lib/features/i18n/month.rs
@ChrisDryden ChrisDryden force-pushed the sort-locale-month branch 3 times, most recently from f6801e0 to efae246 Compare December 19, 2025 22:09
@ChrisDryden

Copy link
Copy Markdown
Collaborator Author

Had to revert the string lossy change and add a comment, the string lossy modifies the non utf characters

@github-actions

Copy link
Copy Markdown

GNU testsuite comparison:

Skipping an intermittent issue tests/tail/overlay-headers (passes in this run but fails in the 'main' branch)

@ChrisDryden

Copy link
Copy Markdown
Collaborator Author

I think another one of the utilities is lossy converting the non utf, investigating now

@ChrisDryden

Copy link
Copy Markdown
Collaborator Author

Oh the CI is missing the Japan locale

@ChrisDryden ChrisDryden marked this pull request as ready for review December 19, 2025 23:37
@ChrisDryden

Copy link
Copy Markdown
Collaborator Author

It passed! it was the Japanese locale missing causing the failure

@github-actions

Copy link
Copy Markdown

GNU testsuite comparison:

Congrats! The gnu test tests/sort/sort-month is no longer failing!
Congrats! The gnu test tests/tail/inotify-dir-recreate is now passing!

@RenjiSann

Copy link
Copy Markdown
Collaborator

Look really good ! Thank you for taking over the month sorting :D

One thing we should be mindful of is the fact that GNU's month sorting exclusively matches month names against the abbreviated month name (see this discussion I had with ICU guys), so it's not bothering with trimming.

Now, this may be a limitation of GNU, and we might not want to stick with it

@ChrisDryden

Copy link
Copy Markdown
Collaborator Author

Sounds like the ICU data abbreviations don't always match the GNU abbreviations, the one specifically that would fail this is the french april. I don't see how we could do this without the non-exclusive matching

@ChrisDryden

Copy link
Copy Markdown
Collaborator Author

@RenjiSann is there a different pathway we should take for this then? Was wondering if you had any thoughts on what the next steps should be?

@github-actions

github-actions Bot commented Jan 5, 2026

Copy link
Copy Markdown

GNU testsuite comparison:

Congrats! The gnu test tests/sort/sort-month is no longer failing!

@RenjiSann

Copy link
Copy Markdown
Collaborator

I assumed that would be a known incompatibility, since I can't come with a better solution

@sylvestre

Copy link
Copy Markdown
Contributor

could you please fix the conflicts ? :)

@github-actions

github-actions Bot commented Jan 9, 2026

Copy link
Copy Markdown

GNU testsuite comparison:

Skipping an intermittent issue tests/shuf/shuf-reservoir (passes in this run but fails in the 'main' branch)
Skipping an intermittent issue tests/sort/sort-stale-thread-mem (passes in this run but fails in the 'main' branch)
Skipping an intermittent issue tests/timeout/timeout (passes in this run but fails in the 'main' branch)
Congrats! The gnu test tests/sort/sort-month is no longer failing!

@sylvestre

Copy link
Copy Markdown
Contributor

fails with:


error[E0425]: cannot find function `sem_init` in crate `nix::libc`
    --> /home/runner/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/ctrlc-3.5.1/src/platform/unix/mod.rs:19:20
     |
  19 |         nix::libc::sem_init(&mut SEMAPHORE as *mut _, SEM_THREAD_SHARED, 0);
     |                    ^^^^^^^^ help: a function with a similar name exists: `res_init`
     |
    ::: /home/runner/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/libc-0.2.180/src/unix/mod.rs:1449:5
     |
1449 |     pub fn res_init() -> c_int;
     |     --------------------------- similarly named function `res_init` defined here

For more information about this error, try `rustc --explain E0425`.
error: could not compile `ctrlc` (lib) due to 1 previous error
warning: build failed, waiting for other jobs to finish...


@ChrisDryden

Copy link
Copy Markdown
Collaborator Author

Looks like I updated that dependency when running cargo update and that ctrlc version breaks redox, I fixed it to the same version as upstream

@github-actions

github-actions Bot commented Jan 9, 2026

Copy link
Copy Markdown

GNU testsuite comparison:

Skipping an intermittent issue tests/timeout/timeout (passes in this run but fails in the 'main' branch)
Congrats! The gnu test tests/sort/sort-month is no longer failing!

@github-actions

Copy link
Copy Markdown

GNU testsuite comparison:

Congrats! The gnu test tests/sort/sort-month is no longer failing!

@sylvestre

Copy link
Copy Markdown
Contributor

cargo deny fails
(just add an ignore)

@github-actions

Copy link
Copy Markdown

GNU testsuite comparison:

Skipping an intermittent issue tests/tty/tty-eof (passes in this run but fails in the 'main' branch)
Congrats! The gnu test tests/sort/sort-month is no longer failing!

@sylvestre

Copy link
Copy Markdown
Contributor

sorry, it needs rebasing

@github-actions

Copy link
Copy Markdown

GNU testsuite comparison:

GNU test failed: tests/sort/sort-debug-keys. tests/sort/sort-debug-keys is passing on 'main'. Maybe you have to rebase?
Congrats! The gnu test tests/sort/sort-month is no longer failing!

@ChrisDryden

Copy link
Copy Markdown
Collaborator Author

Whoops looks like something went wrong in the rebase

@codspeed-hq

codspeed-hq Bot commented Jan 20, 2026

Copy link
Copy Markdown

Merging this PR will not alter performance

✅ 283 untouched benchmarks
⏩ 38 skipped benchmarks1


Comparing ChrisDryden:sort-locale-month (8fa6212) with main (7b0e7e4)

Open in CodSpeed

Footnotes

  1. 38 benchmarks were skipped, so the baseline results were used instead. If they were deleted from the codebase, click here and archive them to remove them from the performance reports.

@github-actions

Copy link
Copy Markdown

GNU testsuite comparison:

GNU test failed: tests/sort/sort-debug-keys. tests/sort/sort-debug-keys is passing on 'main'. Maybe you have to rebase?
Congrats! The gnu test tests/sort/sort-month is no longer failing!

@ChrisDryden ChrisDryden force-pushed the sort-locale-month branch 3 times, most recently from 50eb22b to 63aec75 Compare January 26, 2026 17:07
@github-actions

Copy link
Copy Markdown

GNU testsuite comparison:

Congrats! The gnu test tests/sort/sort-month is no longer failing!

@ChrisDryden

Copy link
Copy Markdown
Collaborator Author

Amazing to think that this is the last sort gnu test, will work on rebasing this today

@ChrisDryden

Copy link
Copy Markdown
Collaborator Author

The sort-months gnu test didnt put the comment during the outage but its still passing.

@github-actions

github-actions Bot commented Feb 3, 2026

Copy link
Copy Markdown

GNU testsuite comparison:

Skip an intermittent issue tests/tail/follow-name (fails in this run but passes in the 'main' branch)
Skip an intermittent issue tests/tail/inotify-dir-recreate (fails in this run but passes in the 'main' branch)
Skipping an intermittent issue tests/shuf/shuf-reservoir (passes in this run but fails in the 'main' branch)
Skipping an intermittent issue tests/sort/sort-stale-thread-mem (passes in this run but fails in the 'main' branch)
Congrats! The gnu test tests/sort/sort-month is no longer failing!

@github-actions

Copy link
Copy Markdown

GNU testsuite comparison:

Congrats! The gnu test tests/pr/bounded-memory is no longer failing!
Congrats! The gnu test tests/sort/sort-month is no longer failing!
Note: The gnu test tests/expand/bounded-memory is now being skipped but was previously passing.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants