Conversation
This pull request tries to reduce the tail latency of publishing to a stream. Without the optimization in this pull request, we see the following results:
Under low load (a single client publishing one message at a time), latency ranges from ~61 microseconds up to ~29 milliseconds for unlucky messages. Under high load (a single client publishing 500 messages at a time), latency goes from ~1ms all the way to ~46ms. One cause of the high tail latency is the periodic snapshotting of the stream. Taking a stream snapshot involves writing a file and syncing it to disk, and doing so safely requires more than one call to fsync (710407c). With the changes in this pull request, tail latency improves significantly:
I used the "lats" benchmark client: https://github.com/synadia-labs/lats
force-pushed from 6e05f0f to e32926e
force-pushed from e32926e to 2f79166
Claude finished @neilalexander's task in 4m 0s

Code Review: Asynchronous Stream Snapshots

Summary

This PR moves the file write and Raft log compaction of stream snapshots into a separate goroutine.

Observations

Concurrency and mutex usage: looks correct.

Behavior change: in the old code,

```go
// old
} else if err != errNoSnapAvailable && err != errNodeClosed && err != errCatchupsRunning {
    s.RateLimitWarnf(...)
    failedSnapshots++
}
```

In the new code,

```go
// new (server/jetstream_cluster.go ~L3020)
if err != errNoSnapAvailable && err != errNodeClosed {
    s.Warnf("Failed to install snapshot for '%s > %s' [%s]: %v", ...)
    failedSnapshots++
}
```

This means that while a catchup is in progress, each snapshot tick will log a warning and increment failedSnapshots.

Shutdown with an in-flight async snapshot

Double lock/unlock in the timer case (minor)

```go
// server/jetstream_cluster.go ~L3366
case <-t.C:
    snapMu.Lock()
    forceIfFailed := failedSnapshots > 0
    snapMu.Unlock()
    doSnapshot(forceIfFailed) // acquires snapMu again
```

Verdict

The implementation is correct and safe. The only item worth a second look is the errCatchupsRunning behavior change noted above.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 2f79166212
server/jetstream_cluster.go (Outdated)

```go
if err != errNoSnapAvailable && err != errNodeClosed {
    s.Warnf("Failed to install snapshot for '%s > %s' [%s]: %v",
```
Ignore catchup-in-progress on snapshot checkpoint creation
When CreateSnapshotCheckpoint fails because a peer catchup is active (errCatchupsRunning from raft.createSnapshotCheckpointLocked), this branch treats it as a hard failure, increments failedSnapshots, and shortens the compaction ticker. On busy clusters this turns normal catchup behavior into repeated failure accounting and eventually forced snapshots, which is a behavioral regression from the previous stream path that explicitly ignored errCatchupsRunning and can trigger unnecessary blocking snapshots/log churn.
This would previously happen on the n.InstallSnapshot() call. This PR "splits" that call into n.CreateSnapshotCheckpoint() followed by InstallSnapshot on that checkpoint.
needs rebase?
force-pushed from 2f79166 to 5885726
done
Commit 0033a15 extended Raft's interface with the ability to create snapshot checkpoints, which can be used to install snapshots asynchronously. This commit reuses the same interface to make stream snapshots asynchronous: writing of the snapshot file and Raft's log compaction steps are done in a separate goroutine.

Signed-off-by: Daniele Sciascia <daniele@nats.io>
force-pushed from 5885726 to e800634
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: e8006345c5
```go
// to force it through. We might have been catching up a peer for a long period, and this
// protects our log size from growing indefinitely.
forceSnapshot := failedSnapshots > 4
c, err := n.CreateSnapshotCheckpoint(forceSnapshot)
```
Re-check catchup state before installing checkpoint
CreateSnapshotCheckpoint is invoked before flushAllPending() and stateSnapshot(), but the Raft catchup guard (len(n.progress) == 0) is only enforced at checkpoint creation time, not in checkpoint.InstallSnapshot. If a follower catchup starts in that window, this path will still install/compact a non-forced snapshot, which bypasses the previous safety behavior and can interrupt active catchups with avoidable retry/snapshot churn.