Fix broken log setting by jhiemstrawisc · Pull Request #3062 · PelicanPlatform/pelican

jhiemstrawisc · 2026-02-02T20:09:56Z

PR #2897 introduced a few new helper functions that should be used in most cases when getting/setting log levels. These new helpers were needed because the approach in that PR was to crank the internal log level to debug regardless of the configured value, and to then filter which messages were actually printed via logging hooks. The key is that log.GetLevel() is now permanently locked to Debug, so to get the log level that actually corresponds to what users will see, you now need to use config.GetEffectiveLogLevel(). The same goes for setting levels, which should go through config.SetLogging().

However, while that PR introduced this requirement, it didn't actually follow it in the new code it wrote, let alone go through the repo and clean up all the old log.{Get,Set}Level() calls, leading to a variety of bugs.

I've gone through every instance of .{Get,Set}Level() from the logrus library to see whether they should be updated with the helper functions, and in most cases the answer was "yes."

However, I realized I could keep pulling and pulling on the thread, so here are a few things I noticed but decided not to clean up:

- The logging API introduced in #2897 (Add runtime log level management API with automatic restoration) used to give the perma-locked "Debug" level no matter how your service was configured because it used log.GetLevel() instead of the new helpers. I fixed the API and expected to see a test fail because the API now actually returns the correct logging level. However, no tests failed, which indicates this either isn't currently tested or isn't tested correctly.
- Some comments in the code around log.SetLevel() seemed to contradict what the code was actually doing, e.g. this comment says we shouldn't call log.SetLevel() directly despite the fact that it's called three lines later.
- The pelican server set-logging-level command does not follow the logging hierarchy inheritance protocol discussed in the docs, e.g. here. In some ways this might be an accidental feature because it means that setting Logging.Level to something like debug doesn't cause xrootd to restart with the updated logging levels.
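For readers unfamiliar with the inheritance idea mentioned in the last item, here is a minimal sketch of one way such a rule could work. It assumes the rule is simply "a sub-level that is not explicitly set inherits Logging.Level"; the `resolveLevel` helper, the map-based config, and the `Logging.Example.Sub` key are hypothetical and are not Pelican's actual implementation.

```go
package main

import "fmt"

// globalKey is the top-level logging parameter named in the PR discussion.
const globalKey = "Logging.Level"

// resolveLevel is a hypothetical helper: it returns the explicitly configured
// value for key when there is one, and otherwise falls back to the global level.
func resolveLevel(cfg map[string]string, key string) string {
	if v, ok := cfg[key]; ok && v != "" {
		return v
	}
	return cfg[globalKey]
}

func main() {
	cfg := map[string]string{
		"Logging.Level":      "debug",
		"Logging.Origin.Xrd": "error", // explicitly set, so it does not inherit
	}
	fmt.Println(resolveLevel(cfg, "Logging.Origin.Xrd"))  // explicit value
	fmt.Println(resolveLevel(cfg, "Logging.Example.Sub")) // hypothetical unset key, inherits
}
```

Under this assumption, changing only Logging.Level would change the resolved value of every sub-level that was left unset, which is the behavior the set-logging-level command currently skips.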

While working on this, I also noticed that in the last ~month, some unit test started littering my local repo's xrootd/ directory with files like authfile-cache-generated and copied-tls-creds.crt.tmp about 50% of the time, but only when I run all the unit tests with go test ./.... I spent a solid 45 minutes trying to track it down because this really bugs me, but I couldn't find the source and wasn't willing to start git bisecting.

These are all small pet peeves I noticed while I was poking around other things related to logging.

While I was looking into the other logging bugs I fix in the encompassing PR, I noticed that drift in our default log levels had rendered a few of these comments out of date. It also looks like I forgot to give a Lotman logging directive a default.

…base The PR that introduced these config functions didn't actually go on to clean up _all_ the code that was getting/setting log levels the old way. In fact, that PR continued to add additional `log.GetLevel()` calls even after introducing the helper functions, and the places where this was done didn't work as expected (e.g. the log web UI API). When I fixed the log API so it didn't always return "debug", I was alarmed to see the tests continued passing, but I decided to stop pulling on the thread there or this is bound to spiral. I went through and examined case by case whether each invocation of `log.{Get,Set}Level` should be modified, and in most cases the answer was yes.

jhiemstrawisc · 2026-02-02T20:23:13Z

I started adding a bit of the evidence from my testing to demonstrate that this diff fixed the original bug pointed out by @williamnswanson (sub logging levels like Logging.Origin.Xrd were always locked on "debug"), and in the process found another bug that prevents setting any of the sub logging levels to trace. I'll keep working on it.

turetske

One comment that doesn't seem to track; other than that, seems reasonable. But I also want to wait to see what the GitHub actions do as well as test locally.

Previously, `GetEffectiveLogLevel()` was trying to deduce the log level in a really convoluted way by getting the last level in the hook that wasn't found in the overall set of valid levels. However, when you provide `trace` as the configured log level, _all_ values are found in the hook, triggering the fallthrough to `log.GetLevel()`, which is always "debug". Since the log level enum starts at 0 and is just panic (0), fatal (1), error (2), warning (3), info (4), debug (5), trace (6), it's much easier to find the configured logging level by... taking a max.
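The "take a max" idea can be sketched as follows. This is an illustration only, not the code from the commit: `Level` is a local stand-in that mirrors the enum ordering quoted above so the example runs without external dependencies, and `maxLevel` is a hypothetical helper name.

```go
package main

import "fmt"

// Level is a local stand-in mirroring the ordering described in the commit
// message: panic (0) through trace (6).
type Level uint32

const (
	PanicLevel Level = iota // 0
	FatalLevel              // 1
	ErrorLevel              // 2
	WarnLevel               // 3
	InfoLevel               // 4
	DebugLevel              // 5
	TraceLevel              // 6
)

// maxLevel returns the most verbose level in the set a filtering hook is
// registered for. Because verbosity increases with the enum value, the
// maximum is the configured level. ok is false for an empty set.
func maxLevel(levels []Level) (level Level, ok bool) {
	if len(levels) == 0 {
		return PanicLevel, false
	}
	highest := levels[0]
	for _, l := range levels[1:] {
		if l > highest {
			highest = l
		}
	}
	return highest, true
}

func main() {
	// A hook configured for "trace" is registered for every level, which is
	// the case that tripped up the old deduction logic.
	all := []Level{PanicLevel, FatalLevel, ErrorLevel, WarnLevel, InfoLevel, DebugLevel, TraceLevel}
	lvl, _ := maxLevel(all)
	fmt.Println(lvl == TraceLevel)

	// A hook configured for "info" stops at info.
	upToInfo := []Level{PanicLevel, FatalLevel, ErrorLevel, WarnLevel, InfoLevel}
	lvl, _ = maxLevel(upToInfo)
	fmt.Println(lvl == InfoLevel)
}
```

Unlike the "first level missing from the hook" approach, the max has no special case when every level is present, so no fallthrough to `log.GetLevel()` is needed.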

…gging' These two functions called 'log.SetLevel(log.DebugLevel)' for the underlying logrus object because we want to capture _all_ logged messages and then filter based on the effective log level. However, this neglected the fact that we do in fact use trace logging in Pelican outside of XRootD! As these two functions were originally implemented, they effectively disabled all trace logging. The fix is to enable trace logging under the hood and then rely on filtering to choose which get displayed.
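To illustrate why the internal level matters, here is a small two-stage model of the behavior described above. It is a sketch under assumptions: `filteringLogger` and its fields are hypothetical stand-ins rather than Pelican or logrus types, with `internal` playing the role of the level passed to `log.SetLevel` and `effective` playing the role of the user-configured level applied by the filter.

```go
package main

import "fmt"

// Level mirrors the enum ordering used by the logging library (panic=0 ... trace=6).
type Level uint32

const (
	PanicLevel Level = iota
	FatalLevel
	ErrorLevel
	WarnLevel
	InfoLevel
	DebugLevel
	TraceLevel
)

// filteringLogger is a hypothetical model of "capture everything, then filter".
type filteringLogger struct {
	internal  Level    // what the underlying logger accepts at all
	effective Level    // what the user configured; applied as a filter
	out       []string // messages that would actually be printed
}

func (l *filteringLogger) log(level Level, msg string) {
	// Stage 1: the underlying logger drops anything more verbose than its
	// internal level before any filter gets to see it.
	if level > l.internal {
		return
	}
	// Stage 2: the filter keeps only messages at or below the effective level.
	if level > l.effective {
		return
	}
	l.out = append(l.out, msg)
}

func main() {
	// Internal level pinned to debug: trace messages never reach the filter,
	// even though the user asked for trace.
	pinned := &filteringLogger{internal: DebugLevel, effective: TraceLevel}
	pinned.log(TraceLevel, "trace message")
	fmt.Println(len(pinned.out))

	// Internal level raised to trace: the filter alone decides what is shown.
	raised := &filteringLogger{internal: TraceLevel, effective: TraceLevel}
	raised.log(TraceLevel, "trace message")
	fmt.Println(len(raised.out))
}
```

In this model, raising the internal level does not make quieter configurations noisier, because stage 2 still drops anything above the effective level.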

…bal 'trace'

jhiemstrawisc · 2026-02-02T22:00:15Z

As it turns out, #2897 also completely disabled "trace"-level logging in the Pelican process. There were actually two additional, related bugs here:

- GetEffectiveLogLevel wasn't working correctly with "trace".
- The SetLogging() and initFilterLogging() methods set the global logrus level to "debug" internally, which prevented any trace logging.

Combined, you could neither tell when something was set to trace nor produce trace logs when they appeared in the code. Addressed in the latest commits.

turetske

One minor comment that seems like it was missed. Otherwise I tested and it seems to have fixed the problem. I'm not approving because I'm waiting for GitHub actions to be working again.

jhiemstrawisc · 2026-02-03T14:43:07Z

As a heads up, it looks like e2e_fed_tests/logging_level_test.go::TestCLILoggingLevelChanges might be the latest new flaky test. In particular, it was failing whenever I ran locally because my testing environment set a timeout of 30s, but the test actually builds the Pelican binary from scratch so it was taking closer to a minute to run. I think it's failing in CI sometimes for a similar reason.

I think it's probably bad practice to be building the binary as part of these CI tests because of the overhead, and because we don't have to do that when we come through the cmd package. In cmd/ we run the commands directly using cobra. Maybe having the ability to access these commands in e2e_fed_tests is a good enough reason to export the cobra primitives?

turetske

LGTM!

jhiemstrawisc added 3 commits February 2, 2026 18:29

Clean up pet peeves I noticed while fixing logging bug

5bbfc9c

jhiemstrawisc added this to the v7.23 milestone Feb 2, 2026

jhiemstrawisc requested a review from turetske February 2, 2026 20:09

jhiemstrawisc added the bug and configuration labels Feb 2, 2026

turetske requested changes Feb 2, 2026

Comment thread config/config.go Outdated

jhiemstrawisc added 4 commits February 2, 2026 21:35

Fix additional instances of 'log.GetLevel' that broke after using global 'trace'

2382eb7

Address forgotten comment update

02de223

jhiemstrawisc requested a review from turetske February 2, 2026 21:56

turetske requested changes Feb 2, 2026

Comment thread config/logging.go Outdated

jhiemstrawisc added 2 commits February 3, 2026 14:14

Address review feedback -- update code comment

b9b3a7b

Check errors coming from 'param.Set()'

21b05cb

jhiemstrawisc linked an issue Feb 3, 2026 that may be closed by this pull request

Logging is not inheriting the values from Logging.Level #3036

Closed

Check error on additional 'param.Set'

4ab2d03

jhiemstrawisc requested a review from turetske February 3, 2026 17:46

turetske approved these changes Feb 3, 2026

turetske merged commit 83440e9 into PelicanPlatform:main Feb 3, 2026
30 of 32 checks passed

turetske merged 10 commits into PelicanPlatform:main from jhiemstrawisc:issue-3036
