(2.14) [ADDED] Config reload: add/remove remote leafnodes#7937
(2.14) [ADDED] Config reload: add/remove remote leafnodes#7937
Conversation
|
Claude finished @kozlovic's task in 5m 54s —— View job PR Review: (2.14) [ADDED] Config reload: add/remove remote leafnodes
SummaryThis is a well-structured PR that replaces the old One Real Bug
When every existing remote is absent from the new config and no TLS/compression changes are made, the loop leaves both The fix is a one-liner: track a A test case — starting with one remote, reloading with zero remotes, and asserting Other Notes
|
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 034dbc9fdc
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
|
@codex review |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 2f962fb8c9
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
…afnodes Thorough review covering architecture, concurrency, test coverage, and correctness of the slice-to-map migration for leafRemoteCfgs. https://claude.ai/code/session_015Xsf5CaGfyCh8FoH6x8fpV
b26bc3b to
30b53bd
Compare
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 0dcdde481f
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
Tests cover gaps identified in the review: - Simultaneous add+remove of remotes in a single reload - Re-enabling a previously disabled remote - Data path verification (pub/sub) after adding a remote via reload - Multiple sequential reloads (add, add more, remove, re-add) - Same-account different-URL remote management - Data path after remove+re-add cycle - Disabling one of multiple remotes without affecting others - Connection stability on no-op reload (CID preservation) - Account-scoped remotes with data isolation verification https://claude.ai/code/session_015Xsf5CaGfyCh8FoH6x8fpV
Implementation changes (from nats-io#7937): - Change leafRemoteCfgs from slice to map for O(1) add/remove - Add RemoteLeafOpts.matches() for identity-based remote matching - Add CompressionOpts.equals() method (replaces compressOptsEqual) - Rewrite leafNodeOption.Apply() to support add/remove/update remotes - Add getLeafNodeOptionsChanges() for clean diff detection - Add checkConfigsEqual() for reflection-based config comparison - Remove updateRemoteLeafNodesTLSConfig (absorbed into Apply) - Remove copyRemoteLNConfigForReloadCompare (replaced by checkConfigsEqual) - Cancel migrate timer when remote removed from config - Update remoteLeafNodeStillValid to use matches() Test fixes for slice-to-map migration in: - jetstream_cluster_2_test.go, jetstream_helpers_test.go, jetstream_leafnode_test.go, leafnode_test.go, opts_test.go New integration tests in reload_leafnode_test.go: - Fix data path verification (hub subscription count baseline) - Fix account-scoped test (hub leafnode authorization) https://claude.ai/code/session_015Xsf5CaGfyCh8FoH6x8fpV
|
I'm thinking that it would be useful to have an optional |
|
@wallyqs When I first started working on this feature we had a quick discussion with @neilalexander and I said that we would probably need a If we decide to go with that, let me know and I can try to add it. |
3484571 to
106e74a
Compare
|
I did a rebase from main to take in the fixes to the JS data race that I was experiencing in tests. |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 106e74a1f2
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
106e74a to
c329ca8
Compare
|
@neilalexander @wallyqs I need to make sure that I check that a config can't have multiple remotes with same "ID" (account/credentials/URL), which I am not checking at the moment. Regardless of config reload, we should not accept such config (the hub will reject as duplicates, but the leaf would keep trying to connect). This also would make the detection of a remote difficult. So let me work on that a bit and when I am ready, I will switch to "ready for review" |
|
I like that idea of checking them at config time, thanks Ivan |
cb33476 to
8734236
Compare
|
@wallyqs @neilalexander @derekcollison I have made quite a bit of changes, so I have squashed and force-pushed. We now detect (and reject) duplicate remote configurations. I also had to handle cases that could lead to duplicates during config reload (to be fair very unlikely to have happened unless intentional - add/remove/add or enable/disable/enable in very quick sequence). The PR currently does not have the public |
|
@wallyqs Is it possible to ask Claude to redo the review? |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 873423629e
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: a2c01c0832
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: ec2af5c5d6
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
…reload Tests added: - TestCheckConfigsEqual: unit test for the reflection-based config comparison helper - TestLeafNodeCfgLifecycleMethods: unit tests for stillValid, markAsRemoved, setConnectInProgress, notifyQuitChannel, and isConnectInProgress - TestConfigReloadLeafNodeDisableThenEnable: integration test for full disable -> enable cycle verifying connection is re-established - TestConfigReloadLeafNodeRemovedConfigCleanup: verifies rmLeafRemoteCfgs map is cleaned up after remote removal - TestConfigReloadAddRemoteLeafNodeMessageFlow: end-to-end test verifying messages flow bidirectionally after adding a remote via config reload - TestConfigReloadLeafNodeRemoveAndReAdd: tests remove -> re-add cycle works correctly including cleanup of intermediate state https://claude.ai/code/session_019n3SMwZZhfRYX4y6URkEbK
…reload Tests added: - TestCheckConfigsEqual: unit test for the reflection-based config comparison helper - TestLeafNodeCfgLifecycleMethods: unit tests for stillValid, markAsRemoved, setConnectInProgress, notifyQuitChannel, and isConnectInProgress - TestConfigReloadLeafNodeDisableThenEnable: integration test for full disable -> enable cycle verifying connection is re-established - TestConfigReloadLeafNodeRemovedConfigCleanup: verifies rmLeafRemoteCfgs map is cleaned up after remote removal - TestConfigReloadAddRemoteLeafNodeMessageFlow: end-to-end test verifying messages flow bidirectionally after adding a remote via config reload - TestConfigReloadLeafNodeRemoveAndReAdd: tests remove -> re-add cycle works correctly including cleanup of intermediate state https://claude.ai/code/session_019n3SMwZZhfRYX4y6URkEbK
Added REVIEW.md with detailed review covering: - Strengths: identity scheme, lifecycle management, checkConfigsEqual helper, retry logic, addLeafNodeConnection race fix - Issues: lock ordering change in removeLeafNodeConnection needs scrutiny, DO_REMOTES goto pattern vs idiomatic for-loop - Minor nits: redundant RemoteLeafOpts qualifier, comment typo Added 6 tests covering identified gaps: - TestCheckConfigsEqual: unit test for reflection-based config comparison - TestLeafNodeCfgLifecycleMethods: lifecycle method unit tests - TestConfigReloadLeafNodeDisableThenEnable: disable/enable cycle - TestConfigReloadLeafNodeRemovedConfigCleanup: rmLeafRemoteCfgs cleanup - TestConfigReloadAddRemoteLeafNodeMessageFlow: e2e message flow after reload - TestConfigReloadLeafNodeRemoveAndReAdd: remove/re-add cycle https://claude.ai/code/session_019n3SMwZZhfRYX4y6URkEbK
1725b5a to
d548893
Compare
Added REVIEW.md with detailed review covering: - Strengths: identity scheme, lifecycle management, checkConfigsEqual helper, retry logic, addLeafNodeConnection race fix - Issues: lock ordering change in removeLeafNodeConnection needs scrutiny, DO_REMOTES goto pattern vs idiomatic for-loop - Minor nits: redundant RemoteLeafOpts qualifier, comment typo Added 6 tests covering identified gaps: - TestCheckConfigsEqual: unit test for reflection-based config comparison - TestLeafNodeCfgLifecycleMethods: lifecycle method unit tests - TestConfigReloadLeafNodeDisableThenEnable: disable/enable cycle - TestConfigReloadLeafNodeRemovedConfigCleanup: rmLeafRemoteCfgs cleanup - TestConfigReloadAddRemoteLeafNodeMessageFlow: e2e message flow after reload - TestConfigReloadLeafNodeRemoveAndReAdd: remove/re-add cycle https://claude.ai/code/session_019n3SMwZZhfRYX4y6URkEbK
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: d548893e6b
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
b6b6f60 to
be2f20e
Compare
|
@claude, brief re-review please. |
|
Claude finished @kozlovic's task in 5m 6s —— View job Re-review of PR #7937 — Config reload: add/remove remote leafnodes
SummaryThis PR has matured significantly since the first review. All issues raised in the first pass have been addressed, and several new edge cases found by Codex in subsequent rounds have also been handled. The code is now in good shape. Previous Issues — All Resolved ✅
New Code — Looks Good
Lock ordering in
Duplicate detection layering — Minor Observation
VerdictAll previously raised bugs are fixed. The new code for |
Reflects upstream changes: goto→for loop refactor, NKey identity matching, LocalAccount semantics fix, safeName() in error messages, and adds reload blocking window analysis. https://claude.ai/code/session_019n3SMwZZhfRYX4y6URkEbK
|
Addressing Claude comment: If a network issue occurs causing the readLoop for instance to call closeConnection(), which then calls remoteLeafNodeConnection, in there, we are - under the server lock - calling setConnectInProgress(true) which indeed drains the channel and set the boolean to true, then releases the server lock. Only then the reload Apply() can occur, which would call I am not seeing the events that could lead to the quitCh being missed. @claude re-review with this additional information. |
|
Claude finished @kozlovic's task in 4m 42s —— View job Re-review:
|
The configuration reload now supports adding and/or removing remote leafnodes. The changes allowed are the same than what was previously supported, that is: TLS, "TLS first", compression and enabled/disabled. Note that the DenyImports/DenyExports list in the leafnode remote configuration was previously ignored because it was changed at runtime, but with the internal changes made to the diff process, and the fact that they are no longer changed at runtime, we no longer ignore them, which means that if they are changed, the configuration reload will now properly reject the changes. A remote is identified with the combination of its URLs list, local account and credentials file name. This is what is used by the server to detect changes for the remote leafnodes list. The list of URLs can contain username/password information, so a redacted version of the URLs list is used, so that if the name appears in the log, it will not show sensitive information. The changes ensure that if a remote is quickly changed from enabled to disabled to enabled again (and same for added and removed) we don't end-up with duplicate leafnode connections. Unrelated: fixed some tests and flappers. Co-authored-by: Waldemar Quevedo <wally@nats.io> Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
Unrelated, replace manual set of flag in gateway.go with use of c.setNoReconnect() function. Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
Also, during configuration parsing, the duplicate detection will work but possibly before non normalized URLs (say without default leafnode port). But NewServer() will call validateOptions which will re-check for duplicates (after normalization). These changes have removed the caching of the remote identity in the RemoteLeafOpts object. Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
- Replaced DO_REMOTES goto with a for loop - Fixed typo in test - Use `lrc.name()` instead of `lrc.RemoteLeafOpts.name()` Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
be2f20e to
ccd14ad
Compare
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: ccd14ad101
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
|
@neilalexander I did a rebase from main and fixed a conflict in the |
|
Thanks, I will take a look soon! |
The configuration reload now supports adding and/or removing remote leafnodes. A remote is identified with the combination of its URLs list, local account and credentials file name. This is what is used by the server to detect changes for the remote leafnodes list.
Signed-off-by: Ivan Kozlovic ivan@synadia.com