Skip to content

migrate jiff-icu to ICU4X 2.0 (beta 2), add Custom trait for hooking localization into Jiff's fmt::strtime APIs, optimize strftime#338

Merged
BurntSushi merged 8 commits intomasterfrom
ag/icu2
Apr 28, 2025
Merged

migrate jiff-icu to ICU4X 2.0 (beta 2), add Custom trait for hooking localization into Jiff's fmt::strtime APIs, optimize strftime#338
BurntSushi merged 8 commits intomasterfrom
ag/icu2

Conversation

@BurntSushi
Copy link
Copy Markdown
Owner

@BurntSushi BurntSushi commented Apr 25, 2025

The ICU4X project seems to be slowly working toward a 2.0 release. It
is currently on beta.2. There are some nice improvements here, in
particular, the ability to localize zoned datetimes.

This PR migrates to ICU4X 2.0 (first commit) and then adds support for
conversions between time zones, offsets and zoned datetimes (second
commit). I've also added a basic example demonstrating how to format a
jiff::Zoned value.

I've also added support for %c, %r, %X and %x to jiff::fmt::strtime.
Since I have no plans for Jiff to do localization, and since these specifiers
are commonly implemented via localization, I've added some
infrastructure to jiff::fmt::strtime that enables overriding the behavior
of these specifiers. Specifically, in Biff (not yet released), I've used this
trait with ICU4X to provide localized datetime formatting.

I'm somewhat conflicted about adding a hook for localization into
strtime-style APIs. In particular, the ideal way to do localization is
just by using ICU4X directly. By doing it through strftime, you're
somewhat limiting yourself quite a bit. But it can be quite
convenient, and it can be important if one is trying to build or
integrate with legacy APIs like POSIX. To mitigate this, I've added
documentation warning folks against using this functionality, and
instead pointing them toward ICU4X.

I also did some work optimizing strftime in this PR, spurred by
uutils/coreutils#7852.

Here's the improvement from this PR for Jiff:

$ critcmp base change1 -f jiff
group                          base                                   change1
-----                          ----                                   -------
parse/strptime/oneshot/jiff    1.01     59.9±0.88ns        ? ?/sec    1.00     59.2±0.71ns        ? ?/sec
print/strftime/oneshot/jiff    1.74    174.7±1.45ns        ? ?/sec    1.00    100.6±0.39ns        ? ?/sec

And relative to chrono and time (a little tricky to read this output since we distinguish between prebuilt and oneshot, and it is appropriate to compare Jiff's oneshot with Chrono's prebuilt, since Jiff doesn't have a prebuilt API):

$ critcmp change1 -g '(.*)/(?:humantime|jiff|chrono|time)$'
group                      change1//chrono                        change1//jiff                          change1//time
-----                      ---------------                        -------------                          -------------
parse/strptime/oneshot     2.86    169.2±2.63ns        ? ?/sec    1.00     59.2±0.71ns        ? ?/sec
parse/strptime/prebuilt    1.00     81.6±0.93ns        ? ?/sec                                           1.36    110.9±0.85ns        ? ?/sec
print/strftime/oneshot     2.13    214.7±3.63ns        ? ?/sec    1.00    100.6±0.39ns        ? ?/sec
print/strftime/prebuilt    1.00     89.3±0.77ns        ? ?/sec                                           1.14    101.7±0.87ns        ? ?/sec

This basically does what is necessary to get everything compiled and
tests passing.

We'll add on time zone and offset stuff in a subsequent commit.

Note that we now depend on `icu_calendar` and `icu_time` directly, with
the latter being optional (but enabled by default). In particular, one
can do useful things with just conversions to dates with `icu_calendar`.
But I expect most folks will want both.
That API was added, so this is really no longer needed.
I've finally relented and exposes a very restricted way
to build an `Error` value.

I was motivated to do this because there are some traits
in Jiff that use `jiff::Error` in their return types. Without
a constructor like this, it isn't possible for implementors
of that trait to provide their own error values.
@BurntSushi BurntSushi changed the title migrate jiff-icu to ICU4X 2.0 (beta 2) migrate jiff-icu to ICU4X 2.0 (beta 2), add Custom trait for hooking localization into Jiff's fmt::strtime APIs Apr 27, 2025
Specifically, this permits customizing the %c, %r, %X and %x
conversion specifiers. (Which are also newly added in this commit.)

We also include a default behavior for these specifiers
(meant to match Unicode's `und` locale) as well as an opt-in
POSIX behavior (meant to match POSIX's `C` locale).

Some minor refactoring is also included here. It was too annoying
to split into its own commit.
@BurntSushi BurntSushi changed the title migrate jiff-icu to ICU4X 2.0 (beta 2), add Custom trait for hooking localization into Jiff's fmt::strtime APIs migrate jiff-icu to ICU4X 2.0 (beta 2), add Custom trait for hooking localization into Jiff's fmt::strtime APIs Apr 28, 2025
This actually never got any optimization attention, and
it looks like its perf matters to coreutils. So let's
take a look at it!

Ref uutils/coreutils#7852
@BurntSushi BurntSushi changed the title migrate jiff-icu to ICU4X 2.0 (beta 2), add Custom trait for hooking localization into Jiff's fmt::strtime APIs migrate jiff-icu to ICU4X 2.0 (beta 2), add Custom trait for hooking localization into Jiff's fmt::strtime APIs, optimize strftime Apr 28, 2025
This is mostly just using `#[inline]` in variouos spots and
specializing common code paths.

The most interesting change is that a `BrokenDownTime` now
contains a `Option<TimeZone>` instead of a time zone
abbreviation. It turns out that getting the time zone
abbreviation for a fixed offset time zone does some work
up-front to format the string. By just storing the time
zone, we can defer that work until we know we actually
need it.
@BurntSushi BurntSushi merged commit f21740e into master Apr 28, 2025
34 checks passed
@BurntSushi BurntSushi deleted the ag/icu2 branch April 28, 2025 16:12
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant