
Fix param accessor usage in SetServerDefaults causing port 0 in URLs #3042

Merged
jhiemstrawisc merged 3 commits into main from copilot/reset-config-parameters-defaults
Feb 2, 2026

Copilot AI commented Jan 28, 2026

Contributor

This writeup is generated by Copilot and heavily edited by @h2zh.

This PR traces back to a duplicate SetServerDefaults call introduced in #2869. That call was added to get TestWriteOriginScitokensConfig passing after the test was tripped up by the "port 0" problem. It was only a temporary bypass, so this PR removes it now that the permanent fix is in place.

Problem: SetServerDefaults was using param.Origin_Port.GetInt() and param.Cache_Port.GetInt() to construct URLs, but these read from the atomic config which hasn't been refreshed yet at that point. This caused port values to be 0 instead of the default 8443, breaking TestWriteOriginScitokensConfig.

Why atomic config is stale
In InitServer, the viper instance and the atomic config are synced in two steps:
Step 1: SetServerDefaults(viper.GetViper()) - This sets defaults in the viper instance (like Origin.Port = 8443), but does NOT update the atomic config yet.
Step 2: param.Refresh() - This is what updates the atomic config by copying values from viper into the atomic pointer.
Inside SetServerDefaults, if you call param.Origin_Port.GetInt(), you're reading from the atomic config, which hasn't been refreshed yet (that happens in Step 2, after SetServerDefaults completes).
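
The ordering problem above can be reproduced in a small standalone sketch. It does not use Pelican or viper: the `config`, `snapshot`, `setDefaults`, `refresh`, and `snapshotPort` names are hypothetical stand-ins, with a plain map playing the role of the viper instance and an atomic pointer playing the role of the atomic config.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// snapshot is a hypothetical stand-in for the atomic config: an immutable copy
// of settings published through an atomic pointer.
type snapshot struct {
	OriginPort int
}

// config pairs a mutable store (standing in for the viper instance) with the
// published snapshot (standing in for the atomic config).
type config struct {
	store map[string]int
	snap  atomic.Pointer[snapshot]
}

func newConfig() *config {
	return &config{store: map[string]int{}}
}

// setDefaults plays the role of Step 1: it writes only to the mutable store.
// It returns the port as seen by each source at that moment.
func (c *config) setDefaults() (fromSnapshot, fromStore int) {
	c.store["Origin.Port"] = 8443
	return c.snapshotPort(), c.store["Origin.Port"]
}

// refresh plays the role of Step 2: it copies the store into a new snapshot.
func (c *config) refresh() {
	c.snap.Store(&snapshot{OriginPort: c.store["Origin.Port"]})
}

// snapshotPort reads the published snapshot, or 0 if none has been published.
func (c *config) snapshotPort() int {
	if s := c.snap.Load(); s != nil {
		return s.OriginPort
	}
	return 0
}

func main() {
	c := newConfig()
	stale, fresh := c.setDefaults()
	fmt.Println(stale, fresh) // prints "0 8443"
	c.refresh()
	fmt.Println(c.snapshotPort()) // prints "8443"
}
```

Until `refresh` runs, the snapshot read yields the zero value, which is the same shape as the port 0 seen in the generated URLs.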

Changes

  • Fix param accessor usage: Replace param.Origin_Port.GetInt() and param.Cache_Port.GetInt() with v.GetInt(param.Origin_Port.GetName()) and v.GetInt(param.Cache_Port.GetName()) to read from the viper instance being configured

  • Remove duplicate call: Remove redundant SetServerDefaults call at line 1946 in InitServer

This aligns with the function's documented guideline:

"you SHOULD NOT do a param.<some param>.Get*() here as part of the logic for setting defaults on the passed v because you'll be operating on two different config structs!"

```go
// Before: reads from atomic config (returns 0)
if param.Origin_Port.GetInt() != 443 {
    v.SetDefault(param.Origin_Url.GetName(), fmt.Sprintf("https://%v:%v", ..., param.Origin_Port.GetInt()))
}

// After: reads from viper instance (returns 8443)
originPort := v.GetInt(param.Origin_Port.GetName())
if originPort != 443 {
    v.SetDefault(param.Origin_Url.GetName(), fmt.Sprintf("https://%v:%v", ..., originPort))
}
```
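
For readers without the Pelican source at hand, here is a minimal runnable sketch of the "after" pattern. The `store` type, `setURLDefault`, the key strings, and the hostname are all hypothetical and only imitate the shape of the calls; the behavior for port 443 is an assumption, since the snippet above does not show that branch.

```go
package main

import "fmt"

// store is a hypothetical, minimal stand-in for the viper instance passed to
// SetServerDefaults. It is not viper's API.
type store struct {
	defaults map[string]any
}

func newStore() *store { return &store{defaults: map[string]any{}} }

// SetDefault records a default value for key.
func (s *store) SetDefault(key string, value any) { s.defaults[key] = value }

// GetInt returns the int stored under key, or 0 if absent or not an int.
func (s *store) GetInt(key string) int {
	n, _ := s.defaults[key].(int)
	return n
}

// GetString returns the string stored under key, or "" if absent or not a string.
func (s *store) GetString(key string) string {
	str, _ := s.defaults[key].(string)
	return str
}

// setURLDefault derives the URL default from the port held in the same store
// it writes to, so no second config source is consulted.
func setURLDefault(v *store, hostname string) {
	port := v.GetInt("Origin.Port")
	if port != 443 {
		v.SetDefault("Origin.Url", fmt.Sprintf("https://%v:%v", hostname, port))
	} else {
		// Assumption: the standard HTTPS port is left out of the URL.
		v.SetDefault("Origin.Url", fmt.Sprintf("https://%v", hostname))
	}
}

func main() {
	v := newStore()
	v.SetDefault("Origin.Port", 8443)
	setURLDefault(v, "origin.example.com")
	fmt.Println(v.GetString("Origin.Url")) // prints "https://origin.example.com:8443"
}
```

Passing the store in as a parameter keeps the read and the write on one config object, which is what the function's guideline asks for.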


Original prompt

@copilot Could you explain why you added this block of code around line 2044 in config/config.go? Resetting all config parameters back to their defaults overrides every config parameter set between the first call of SetServerDefaults on line 1611 and this block of code in the InitServer function.

	// Ensure server defaults are applied before any federation discovery; several
	// tests rely on URL defaults being derived from Server.ExternalWebUrl.
	if err := SetServerDefaults(viper.GetViper()); err != nil {
		logging.FlushLogs(true)
		return err
	}

Copilot AI changed the title from "[WIP] Reset all config parameters to defaults" to "Remove duplicate SetServerDefaults call causing orphan temp directories" Jan 28, 2026
Copilot AI requested a review from h2zh January 28, 2026 17:45
Copilot AI changed the title from "Remove duplicate SetServerDefaults call causing orphan temp directories" to "Fix param accessor usage in SetServerDefaults causing port 0 in URLs" Jan 28, 2026
@jhiemstrawisc

Member

Just to keep track of the impact AI-generated code is having on repo maintainability, I suspect this is another regression coming from a co-pilot feature that was merged without a second human review:
#2900

@h2zh h2zh force-pushed the copilot/reset-config-parameters-defaults branch from b82c89d to 1f48694 Compare January 28, 2026 19:57
@h2zh h2zh changed the base branch from copilot/allow-mutable-configuration to main January 28, 2026 19:59
@h2zh h2zh force-pushed the copilot/reset-config-parameters-defaults branch from 1f48694 to be9fbc5 Compare January 28, 2026 20:20
Copilot AI and others added 3 commits January 28, 2026 21:00
- SetDefault only supplies a fallback value that is used when nothing of higher priority is set. The first call at line 1497 had already set every default, so the second call was redundant for configuration purposes.
- This duplicate SetServerDefaults call was introduced in PR #2869 (Allow configuration to be mutable at runtime). It was added to get TestWriteOriginScitokensConfig passing after the test was tripped up by the "port 0" problem. It was only a temporary bypass, so this PR removes it now that the permanent fix is in place.
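
The redundancy argument in the commit message can be illustrated with a simplified model of layered config lookup. This is a sketch under stated assumptions, not viper's implementation: the `layered` type, `applyDefaults`, and the port values are hypothetical, and the model has only two layers (explicit overrides and defaults).

```go
package main

import "fmt"

// layered is a hypothetical two-layer config: explicit overrides win over
// defaults on lookup.
type layered struct {
	defaults  map[string]int
	overrides map[string]int
}

func newLayered() *layered {
	return &layered{defaults: map[string]int{}, overrides: map[string]int{}}
}

// SetDefault records a fallback value for key.
func (l *layered) SetDefault(key string, v int) { l.defaults[key] = v }

// Set records an explicit value for key that takes priority over the default.
func (l *layered) Set(key string, v int) { l.overrides[key] = v }

// Get returns the override if present, otherwise the default.
func (l *layered) Get(key string) int {
	if v, ok := l.overrides[key]; ok {
		return v
	}
	return l.defaults[key]
}

// applyDefaults stands in for a defaults-setting function that may be called
// more than once.
func applyDefaults(l *layered) {
	l.SetDefault("Origin.Port", 8443)
	l.SetDefault("Cache.Port", 8443)
}

func main() {
	cfg := newLayered()
	applyDefaults(cfg)
	cfg.Set("Cache.Port", 9000) // explicit setting made after the first call

	before := [2]int{cfg.Get("Origin.Port"), cfg.Get("Cache.Port")}
	applyDefaults(cfg) // second, redundant call
	after := [2]int{cfg.Get("Origin.Port"), cfg.Get("Cache.Port")}

	fmt.Println(before == after, after) // prints "true [8443 9000]"
}
```

In this model a second call with the same default values leaves every effective value unchanged, including the explicit override.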
@h2zh h2zh force-pushed the copilot/reset-config-parameters-defaults branch from be9fbc5 to 67df10c Compare January 28, 2026 21:00
@h2zh h2zh added the bug Something isn't working label Jan 28, 2026
@h2zh h2zh added this to the v7.24 milestone Jan 28, 2026
@h2zh h2zh added internal Internal code improvements, not user-facing cache Issue relating to the cache component origin Issue relating to the origin component director Issue relating to the director component registry Issue relating to the registry component labels Jan 28, 2026
@h2zh

h2zh commented Jan 28, 2026

Contributor

@jhiemstrawisc This PR fixes a regression introduced by #2869

That PR calls SetServerDefaults twice to bypass the "port 0" issue in TestWriteOriginScitokensConfig, which is a roundabout way to solve the problem and doesn't fix the root cause (see the related issue and this comment)

This port 0 problem is caused by using the atomic config before it gets refreshed, which is a different scenario than the "port number race condition" problem in concurrent tests.

Could you provide a second review for this PR?

@h2zh h2zh marked this pull request as ready for review January 28, 2026 21:26
@jhiemstrawisc jhiemstrawisc self-requested a review January 29, 2026 14:07
@jhiemstrawisc

Member

@h2zh thanks for the poke, I'll take a look as soon as I'm able!

@jhiemstrawisc jhiemstrawisc left a comment

Member

Looks good to me! In fact, I checked git blame and it looks like I was both the one who added the comment for SetServerDefaults() that explains nothing in the function should do a param.Get, and also the one who did exactly that in this section of code you're fixing 🤓 Thanks for cleaning it up for me!

@jhiemstrawisc

Member

To be clear, I still think the AI did the wrong thing here, even though I own part of the original bug. Rather than read the comments at the top of the function to fix the bug as @h2zh does here, the AI found a workaround to get tests to pass.

@jhiemstrawisc jhiemstrawisc merged commit 1c805fe into main Feb 2, 2026
38 of 40 checks passed
jhiemstrawisc added a commit to jhiemstrawisc/pelican that referenced this pull request Feb 5, 2026
The warning says nothing in the function should use 'viper.Get*'
or 'param.<blah>.Get*' and it gives a good reason. Unfortunately,
we were still doing exactly that! It turns out it wasn't actually
too much of a problem until bugs compounded between PR PelicanPlatform#2869 and
PR PelicanPlatform#3042, at which point it broke configuration.
@jhiemstrawisc jhiemstrawisc added the create-patch Patch this into multiple versions of Pelican label Feb 6, 2026
turetske pushed a commit that referenced this pull request Feb 6, 2026
The warning says nothing in the function should use 'viper.Get*'
or 'param.<blah>.Get*' and it gives a good reason. Unfortunately,
we were still doing exactly that! It turns out it wasn't actually
too much of a problem until bugs compounded between PR #2869 and
PR #3042, at which point it broke configuration.
Labels

bug Something isn't working cache Issue relating to the cache component create-patch Patch this into multiple versions of Pelican director Issue relating to the director component internal Internal code improvements, not user-facing origin Issue relating to the origin component registry Issue relating to the registry component

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Correct param access in InitServer to pass test, rather than calling SetServerDefaults twice

3 participants