Fix bad deferred logcontext handling #19180
Conversation
    deferred=stop_cancellation(notify_deferred),
    timeout=delay_ms / 1000,
    clock=self._clock,
)
@@ -0,0 +1 @@
Fix bad deferred logcontext handling across the codebase.
Based on how often this comes up and how easy it is to get wrong, ideally we'd have some lint for this. But not for now ⏩
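To illustrate the lint idea: a hypothetical, very naive sketch of what such a check could look like. The name naive_lint and its one-line-lookback heuristic are made up for illustration; a real lint would need proper AST analysis rather than regex matching.

```python
import re

# Flag raw Deferred .callback/.errback/.cancel calls whose preceding
# line does not open a `with PreserveLoggingContext():` block.
FIRE = re.compile(r"\.(callback|errback|cancel)\(")

def naive_lint(source: str) -> list:
    flagged = []
    lines = source.splitlines()
    for i, line in enumerate(lines):
        if FIRE.search(line):
            prev = lines[i - 1] if i > 0 else ""
            if "PreserveLoggingContext" not in prev:
                flagged.append(i + 1)  # 1-indexed line numbers
    return flagged

bad = "d.errback(failure)\n"
good = "with PreserveLoggingContext():\n    d.errback(failure)\n"
print(naive_lint(bad), naive_lint(good))  # [1] []
```

This would obviously miss multi-line cases and false-positive on deferreds fired deliberately in the current context, which is part of why "not for now" is reasonable.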
with PreserveLoggingContext():
    self.deferred.errback(BodyExceededMaxSize())
While I painted with a wide brush, I think this one is necessary because the deferred is being passed in and could have an arbitrary callback chain attached that could change the logcontext.
In practice, I don't think these deferreds really had callback chains to worry about, which is probably why we didn't experience problems in this area before.
The same applies to a bunch of the Twisted HTTP machinery-type code here.
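A toy sketch (not Synapse's real implementation) of why firing a passed-in deferred under PreserveLoggingContext matters. The string-valued current_context, the PreserveLoggingContext class, and ToyDeferred below are simplified stand-ins for Synapse's thread-local LoggingContext machinery in synapse.logging.context and Twisted's Deferred.

```python
current_context = "sentinel"  # the context the reactor normally runs in

class PreserveLoggingContext:
    """Toy: reset to the sentinel context for the duration of the block."""
    def __enter__(self):
        global current_context
        self._old = current_context
        current_context = "sentinel"
    def __exit__(self, *exc):
        global current_context
        current_context = self._old
        return False

class ToyDeferred:
    """Minimal stand-in for twisted.internet.defer.Deferred."""
    def __init__(self):
        self.callbacks = []
    def addCallback(self, f):
        self.callbacks.append(f)
    def callback(self, result):
        # Callbacks run in whatever logcontext is current when we fire.
        for f in self.callbacks:
            f(result)

seen = []
d = ToyDeferred()
# The caller handed us a deferred with an arbitrary chain attached;
# this one just records which logcontext it ran in.
d.addCallback(lambda r: seen.append(current_context))

current_context = "request-1"  # we're inside some request's logcontext

# Firing d.callback(None) bare would run the chain inside "request-1",
# leaking our context into unrelated code. Resetting first avoids that:
with PreserveLoggingContext():
    d.callback(None)

print(seen, current_context)  # chain ran in "sentinel"; we kept "request-1"
```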
    deferred=stop_cancellation(notify_deferred),
    timeout=delay_ms / 1000,
    clock=self._clock,
)
There is a bunch of smoke here suggesting that fixing AwakenableSleeper may fix #19165.
PerDestinationQueue (federation_transaction_transmission_loop in the logs) may be intertwined with MatrixFederationHttpClient, which uses AwakenableSleeper (not sure, I'm not following the layers all that concretely).
with PreserveLoggingContext():
    new_d.errback(defer.TimeoutError("Timed out after %gs" % (timeout,)))
Seems necessary since we return new_d and callers can chain whatever they like on top of it.
Usually there are chained callbacks that change the logcontext, because of this pattern: await make_deferred_yieldable(timeout_deferred(...)) (make_deferred_yieldable changes the logcontext).
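A toy sketch of why that pattern means the timeout errback must be fired under PreserveLoggingContext. The make_deferred_yieldable below is a simplified stand-in (the real one lives in synapse.logging.context): it drops to the sentinel context while the deferred is pending and chains a callback that restores the awaiting code's context when it fires — so whoever fires the deferred gets their own context clobbered unless they reset afterwards.

```python
current_context = "sentinel"  # the context the reactor normally runs in

class PreserveLoggingContext:
    """Toy: reset to the sentinel context for the duration of the block."""
    def __enter__(self):
        global current_context
        self._old = current_context
        current_context = "sentinel"
    def __exit__(self, *exc):
        global current_context
        current_context = self._old
        return False

class ToyDeferred:
    """Minimal stand-in for twisted.internet.defer.Deferred."""
    def __init__(self):
        self.callbacks = []
    def addCallback(self, f):
        self.callbacks.append(f)
    def callback(self, result):
        for f in self.callbacks:
            result = f(result)

def make_deferred_yieldable(d):
    """Toy: wait in the sentinel; restore our context when d fires."""
    global current_context
    ctx = current_context
    current_context = "sentinel"
    def restore(result):
        global current_context
        current_context = ctx
        return result
    d.addCallback(restore)
    return d

# Request code awaits the deferred:
current_context = "request-1"
d = ToyDeferred()
make_deferred_yieldable(d)
assert current_context == "sentinel"  # we yield in the sentinel

# Later, "reactor" code fires the deferred (think the timeout firing
# errback on new_d). The chained restore callback switches into
# "request-1"; without PreserveLoggingContext, the reactor would be
# left stuck in that context afterwards.
with PreserveLoggingContext():
    d.callback(None)

print(current_context)  # back to "sentinel": no leak
```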
devonh left a comment
Everything in the PR is in line with the previous logcontext fixes, including following patterns similar to #19146.
It would be nice to have a linter in the future to catch raw deferred handling like this, as you mentioned.
Hopefully these changes cut down on some of the leaked logcontexts going forward. We'll just have to merge this and run it in the wild for a while to see whether it makes any difference in practice.
Thanks for the review @devonh 🐪
Fix bad deferred logcontext handling
These aren't something I've personally experienced, but I went around the codebase looking for all of the Deferred .callback, .errback, and .cancel calls and wrapped them with PreserveLoggingContext(). This spawned from wanting to solve #19165, but it's unconfirmed whether this has any effect.
To explain the fix, see the Deferred callbacks section of our logcontext docs for more info (specifically, these changes use solution 2).
Heads-up, I wrote the docs too so it's my assumptions/understanding all the way down. Apply your own scrutiny.