A filter or fadeIn extension to shadow traffic first before directing live traffic to new pods #3863
Description
Is your feature request related to a problem? Please describe.
When (auto)scaling out, the Product Read API's new pods always serve requests with higher-than-usual latencies, negatively contributing to the overall API latency.
The API is a Tier 1 service, and latency is critical there.
Until now we have been using the fadeIn filter to extend the time it takes new pods to fully enter service. Still, even if gradually, they receive live traffic, which they serve slower because their caches need to be built up first. It is also not entirely clear that fadeIn gradually directs traffic to pods newly added by the autoscaler (if not, we would need that as a feature of its own in the first place).
Pre-warming the new pods is quite a challenge because PrAPI relies on consistent-hash balancing: we cannot tell which SKUs the new pods will serve until they enter service (although Skipper "knows" and could use that to the advantage of the proposed feature).
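For reference, the current setup combines fadeIn with the consistent-hash load balancer roughly like this (the route name, fade duration, and backend addresses are placeholders, not our production values):

```
prapi: *
  -> fadeIn("3m")
  -> <consistentHash, "http://10.2.0.1:8080", "http://10.2.0.2:8080">;
```

With this route, a freshly added backend endpoint only gradually receives its full consistent-hash share over the fade duration, but every request it does receive is live traffic.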
Originally described in the Post Mortem document.
Describe the solution you would like
Introduce a new filter (shadowIn?) or extend the existing fadeIn filter with the ability to shadow traffic to the new pods first, and only direct live traffic after 100% of the shadow traffic is routed.
Important: the shadowed requests should be sent in a fire-and-forget manner, so that the shadowed request's processing time doesn't affect the actual one.
Such a filter (or an extension to fadeIn) would be useful for any application/service/API that needs a warm-up before serving traffic. It would offload the pre-initialization code from the service itself (usually non-trivial to implement, and it would need to be duplicated in every service benefitting from it) into Skipper, i.e. "if your service needs pre-warming, use shadow fadeIn". The filter could be configured as follows (defaults after the "=" sign, to keep the filter backwards-compatible):
```
fadeIn(
  duration,
  fade-in curve exponent = 1,
  do shadow before directing traffic = false,
  duration after reaching 100% shadow traffic before directing live traffic = 0s
)
```
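A route using the proposed extension might then look like this (hypothetical arguments, assuming the signature above; the third argument enables shadowing and the fourth holds live traffic back for 30s after shadow traffic reaches 100%):

```
prapi: *
  -> fadeIn("3m", 1.5, true, "30s")
  -> <consistentHash, "http://10.2.0.1:8080", "http://10.2.0.2:8080">;
```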
Describe alternatives you've considered (optional)
At the moment we use fadeIn, but because it directs live traffic (gradually, which is better than all-at-once, although we observed it doesn't work as expected for pods newly added by the autoscaler), the initial requests are served with higher-than-expected latencies. Ideally the new pods would receive shadow traffic first.