Anycast support for the Pelican cache #3510
Draft
bbockelm wants to merge 3 commits into
Allow a V2 (persistent) cache to participate in TCP anycast: it peers with a BGP router via an embedded pure-Go GoBGP speaker and advertises the configured anycast net blocks, but only while the cache is healthy and is serving a host certificate with the expected anycast hostname as a SAN (verified by a TLS probe to the cache's own external URL, not the anycast name, to avoid probing a different cache).

Service selection is director-preferred by default: clients use the director's geo/load/health-aware choice and only fall back to the anycast endpoint when the director is unreachable. Client.PreferAnycast opts a well-covered site into contacting the anycast endpoint first (downloads) and routing write-through uploads to it.

Because the anycast endpoint is itself a cache, a 403 from it now returns the same X-Pelican-* token-hint headers the director would, so the client can acquire the right token and retry. The header builders and namespace longest-prefix match are lifted into server_structs and shared by the director and cache; the cache advertises itself as the collections-url so listings flow through it. These header improvements are not gated on anycast and also help a forced-cache transfer during a director outage.

The cache continues to advertise its unique URL to the director; the shared anycast address is published federation-wide in the discovery doc via Director.AnycastUrl and announced via BGP.

Includes unit tests plus a two-instance GoBGP integration test (with bind-with-retry to avoid a free-port race) and a client write-through test.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
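For reviewers unfamiliar with the namespace longest-prefix match mentioned in this commit, here is a minimal sketch of the idea. It is not the code lifted into server_structs: the `Namespace` type, its fields, and the function names are hypothetical and only illustrate matching a request path against namespace prefixes on path-component boundaries.

```go
package main

import (
	"fmt"
	"strings"
)

// Namespace is a hypothetical stand-in for a federation namespace record;
// the real type in server_structs may look quite different.
type Namespace struct {
	Prefix       string
	RequireToken bool
}

// hasPathPrefix reports whether path lies under prefix on a path-component
// boundary: "/data" matches "/data" and "/data/x" but not "/database".
// prefix is expected to have no trailing slash ("" stands for the root).
func hasPathPrefix(path, prefix string) bool {
	if !strings.HasPrefix(path, prefix) {
		return false
	}
	return len(path) == len(prefix) || path[len(prefix)] == '/'
}

// longestPrefixMatch returns the namespace with the longest prefix that
// contains path, or false if no namespace matches.
func longestPrefixMatch(path string, namespaces []Namespace) (Namespace, bool) {
	var best Namespace
	bestLen := -1
	for _, ns := range namespaces {
		p := strings.TrimSuffix(ns.Prefix, "/")
		if hasPathPrefix(path, p) && len(p) > bestLen {
			best, bestLen = ns, len(p)
		}
	}
	return best, bestLen >= 0
}

func main() {
	nss := []Namespace{
		{Prefix: "/data", RequireToken: false},
		{Prefix: "/data/private", RequireToken: true},
	}
	ns, ok := longestPrefixMatch("/data/private/file.txt", nss)
	fmt.Println(ns.Prefix, ns.RequireToken, ok)
}
```

Matching on component boundaries matters here because the result decides which token hints are returned; a plain string-prefix test would wrongly place `/database/x` under `/data`.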
TCP anycast only works if the kernel accepts packets destined to the
anycast IP, which requires the address to be present on a local network
device. Add an option for the cache to add/remove the anycast service
address(es) (IPv4 and/or IPv6) on the relevant interface via netlink
(pure Go, github.com/vishvananda/netlink; Linux only).
New Cache.Anycast parameters:
- Addresses: the anycast service IP(s) to bind locally (bare IP implies
  a /32 or /128 host route), distinct from Routes (the BGP-advertised
  net blocks).
- Device: the interface to manage; auto-detected by asking the kernel
  which device routes to the director when left empty.
- AddressManagement: on/off/auto (default auto). "auto" adds an address
  only if absent at startup and removes (on shutdown) only what Pelican
  added; "on" always adds and always removes; "off" never touches them.
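The AddressManagement semantics above can be written down as a small decision matrix. The following is a sketch under stated assumptions, not the code in this branch: the `Mode` type and the function names are hypothetical, and treating an empty string as "auto" is an assumption based on auto being the documented default.

```go
package main

import (
	"fmt"
	"strings"
)

// Mode is a hypothetical enum for Cache.Anycast.AddressManagement.
type Mode int

const (
	ModeAuto Mode = iota
	ModeOn
	ModeOff
)

// parseMode maps the configuration string to a Mode. An empty value is
// treated as "auto" (assumption: auto is the documented default).
func parseMode(s string) (Mode, error) {
	switch strings.ToLower(strings.TrimSpace(s)) {
	case "", "auto":
		return ModeAuto, nil
	case "on":
		return ModeOn, nil
	case "off":
		return ModeOff, nil
	}
	return ModeAuto, fmt.Errorf("unknown address management mode %q", s)
}

// shouldAdd decides at startup whether to add an address, given whether it
// is already present on the device.
func shouldAdd(m Mode, present bool) bool {
	switch m {
	case ModeOn:
		return true // "on" always adds
	case ModeAuto:
		return !present // "auto" adds only if absent
	default:
		return false // "off" never touches addresses
	}
}

// shouldRemove decides at shutdown whether to remove an address, given
// whether this process was the one that added it.
func shouldRemove(m Mode, addedByUs bool) bool {
	switch m {
	case ModeOn:
		return true // "on" always removes
	case ModeAuto:
		return addedByUs // "auto" removes only what it added
	default:
		return false
	}
}

func main() {
	for _, s := range []string{"auto", "on", "off"} {
		m, _ := parseMode(s)
		fmt.Printf("%-4s add(present)=%v add(absent)=%v remove(ours)=%v remove(foreign)=%v\n",
			s, shouldAdd(m, true), shouldAdd(m, false), shouldRemove(m, true), shouldRemove(m, false))
	}
}
```

Keeping the decisions as pure functions, separate from the netlink calls, is what allows them to be tested on any platform.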
Addresses are bound before BGP starts (so the kernel accepts traffic
before routes draw it) and removed on shutdown after routes are
withdrawn. Non-Linux builds get a stub that no-ops when management is
off and errors if it is actually requested, keeping cross-platform
builds working.
Tests: cross-platform pure-logic (mode parsing, add/remove decision
matrix, address normalization) plus Linux netlink tests exercising real
add/remove on the loopback device, which skip when the process lacks
CAP_NET_ADMIN.
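As an illustration of the address normalization these tests cover (a bare IP implies a /32 or /128), here is a minimal sketch using the standard library's net/netip. The function name `normalizeAddress` is hypothetical, and the branch may well normalize differently, for example with the types from the netlink package.

```go
package main

import (
	"fmt"
	"net/netip"
	"strings"
)

// normalizeAddress turns a configured address into a prefix. Input that
// already carries a prefix length is parsed as-is; a bare IP becomes a host
// prefix: /32 for IPv4, /128 for IPv6.
func normalizeAddress(s string) (netip.Prefix, error) {
	s = strings.TrimSpace(s)
	if strings.Contains(s, "/") {
		return netip.ParsePrefix(s)
	}
	addr, err := netip.ParseAddr(s)
	if err != nil {
		return netip.Prefix{}, err
	}
	// BitLen is 32 for IPv4 and 128 for IPv6.
	return netip.PrefixFrom(addr, addr.BitLen()), nil
}

func main() {
	for _, s := range []string{"192.0.2.10", "2001:db8::1", "192.0.2.10/24", "not-an-ip"} {
		p, err := normalizeAddress(s)
		if err != nil {
			fmt.Printf("%q: error: %v\n", s, err)
			continue
		}
		fmt.Printf("%q -> %s\n", s, p)
	}
}
```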
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
This branch adds support for TCP anycast for the cache.
If enabled, the cache will contact its local router via BGP and advertise its availability to serve an anycast route.
If the client contacts the cache (whether in anycast mode or not), the cache will respond with information about what token is required to access the namespace: this allows the client to work with the cache directly and not contact the director.
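A rough sketch of the client side of this exchange: on a 403, collect the X-Pelican-* response headers so they can drive token acquisition before a retry. This is not the client code in this branch. The function name `tokenHints` is hypothetical, and the header name `X-Pelican-Example` and its value are placeholders, since the exact header names and formats are not spelled out here.

```go
package main

import (
	"fmt"
	"net/http"
	"strings"
)

// tokenHints returns the X-Pelican-* headers from a 403 response. For any
// other status it returns an empty set, since no token hint is expected.
// Header keys are assumed to be in Go's canonical form, which is how
// net/http stores them for responses it has parsed.
func tokenHints(status int, header http.Header) http.Header {
	hints := http.Header{}
	if status != http.StatusForbidden {
		return hints
	}
	for name, values := range header {
		if strings.HasPrefix(name, "X-Pelican-") {
			hints[name] = values
		}
	}
	return hints
}

func main() {
	// A hand-built response standing in for what a cache might return.
	resp := &http.Response{StatusCode: http.StatusForbidden, Header: http.Header{}}
	resp.Header.Set("Content-Type", "text/plain")
	resp.Header.Set("X-Pelican-Example", "placeholder-value") // placeholder name and value

	hints := tokenHints(resp.StatusCode, resp.Header)
	if len(hints) == 0 {
		fmt.Println("no token hints; cannot retry with a better token")
		return
	}
	for name, values := range hints {
		fmt.Println(name, values)
	}
}
```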
The idea of anycast is a Big Change. Submitting as a draft to get early CI feedback.