93 changes: 48 additions & 45 deletions pubsub/v2/doc.go

messages, hiding the details of the underlying server RPCs.
Pub/Sub is a many-to-many, asynchronous messaging system that decouples senders
and receivers.

If you are migrating from the v1 library, please read over the [migration guide].

More information about Pub/Sub is available at the [product documentation page].

See the [main Google Cloud Go package] for authentication, timeouts,
connection pooling and similar aspects of this package.

# Publishing
been called unless exactly once delivery is enabled. Applications should be aware
of these deliveries.

Note: This uses pubsub's streaming pull feature. This feature has properties that
may be surprising. Please refer to the [Streaming Pull API] for more details on
how streaming pull behaves.
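As a sketch of the callback-based receive flow described above, assuming a v2 client and an existing subscription (the project and subscription IDs here are hypothetical, and the `Subscriber` accessor is inferred from this document):

```go
package main

import (
	"context"
	"log"

	"cloud.google.com/go/pubsub/v2"
)

func main() {
	ctx := context.Background()
	client, err := pubsub.NewClient(ctx, "my-project") // hypothetical project ID
	if err != nil {
		log.Fatal(err)
	}
	defer client.Close()

	sub := client.Subscriber("my-sub") // hypothetical subscription ID

	// Receive blocks until ctx is done or an unrecoverable error occurs.
	// The callback is invoked concurrently from multiple goroutines.
	err = sub.Receive(ctx, func(ctx context.Context, m *pubsub.Message) {
		log.Printf("got message: %s", m.Data)
		m.Ack() // or m.Nack() to request redelivery
	})
	if err != nil {
		log.Fatal(err)
	}
}
```

Because a message may occasionally be redelivered even after Ack (unless exactly once delivery is enabled), the callback should be idempotent.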

# Emulator

AckDeadline for the MaxExtension value.

# Fine Tuning PubSub Receive Performance

This section describes how to adjust [ReceiveSettings] for best performance.



Subscriber.ReceiveSettings.MaxExtension

This is the maximum amount of time that the client will extend a message's deadline.
This value should be set to the maximum expected processing time, plus some
buffer. The higher the extension allowed, the longer it takes before the server considers
messages lost and re-sends them to some other, healthy instance of your application.

Subscriber.ReceiveSettings.MaxDurationPerAckExtension

This is the maximum amount of time to extend each message's deadline per
ModifyAckDeadline RPC. Normally, the deadline is determined by the 99th percentile
of time to extend a message's deadline on a per-RPC basis, you can decrease the amount
of time before message redelivery when errors occur. However, the downside is that more
ModifyAckDeadline RPCs will be sent.

Subscriber.ReceiveSettings.MinDurationPerAckExtension

This is the minimum amount of time to extend each message's deadline per
ModifyAckDeadline RPC. This is the complement setting of MaxDurationPerAckExtension and
low, it may be better to issue fewer ModifyAckDeadline RPCs rather than every
effectively removes the automatic derivation of deadlines and fixes it to the value
you wish to extend your messages' deadlines by each time.
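As an illustrative sketch of the deadline-extension settings discussed above (the subscription ID and duration values are hypothetical; the `Subscriber` accessor and ReceiveSettings field names are taken from this document):

```go
package main

import (
	"time"

	"cloud.google.com/go/pubsub/v2"
)

// tuneAckExtensions configures deadline extension on a Subscriber
// before Receive is called. The durations are illustrative only.
func tuneAckExtensions(client *pubsub.Client) *pubsub.Subscriber {
	sub := client.Subscriber("my-sub") // hypothetical subscription ID

	// Upper bound on total deadline extension per message: expected
	// worst-case processing time, plus some buffer.
	sub.ReceiveSettings.MaxExtension = 10 * time.Minute

	// Bounds on how far each ModifyAckDeadline RPC extends the deadline.
	sub.ReceiveSettings.MaxDurationPerAckExtension = 60 * time.Second
	sub.ReceiveSettings.MinDurationPerAckExtension = 10 * time.Second
	return sub
}
```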

Subscriber.ReceiveSettings.MaxOutstandingMessages

This is the maximum number of messages that are to be processed by the callback
function at a time. Once this limit is reached, the client waits for messages
This value is set by default to a fairly conservatively low number. We strongly
encourage setting this number as high as memory allows, since a low setting will
artificially rate limit reception. Setting this value to -1 causes it to be unbounded.

Subscriber.ReceiveSettings.MaxOutstandingBytes

This is the maximum number of bytes (message size) that are to be processed by
the callback function at a time. Once this limit is reached, the client waits
For example, if the client sets MaxOutstandingBytes to 50 KiB, but receives
a batch of messages totaling 100 KiB, there will be a temporary overflow of
message byte size until messages are acked.

Similar to MaxOutstandingMessages, you can set this higher to maximize
processing throughput. Setting this value to -1 causes it to be unbounded.
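A minimal sketch of the two flow-control limits above (the subscription ID and limit values are hypothetical; field names are taken from this document):

```go
package main

import "cloud.google.com/go/pubsub/v2"

// configureFlowControl caps how much work is outstanding at once.
func configureFlowControl(client *pubsub.Client) *pubsub.Subscriber {
	sub := client.Subscriber("my-sub") // hypothetical subscription ID

	// Hold at most 1000 messages and 1 GiB of message data in flight;
	// setting either value to -1 would leave that limit unbounded.
	sub.ReceiveSettings.MaxOutstandingMessages = 1000
	sub.ReceiveSettings.MaxOutstandingBytes = 1 << 30
	return sub
}
```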

Subscriber.ReceiveSettings.NumGoroutines

This is the number of goroutines spawned to receive messages from the Pubsub server.
Each goroutine opens a StreamingPull stream, so this also directly sets the number of
open StreamingPull streams.

According to the [Resource Limits] page, each stream can handle about 10 MB/s of messages.
Leaving this value as the default would be good for most use cases.

If you increase this value beyond 1, please set `EnablePerStreamFlowControl` to true.
This helps align the server-side flow control with what is available locally. For more
explanation of this issue, see [this issue].
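As a sketch of the recommendation above, assuming `EnablePerStreamFlowControl` is a ReceiveSettings field as the text suggests (the subscription ID is hypothetical, and availability of the field may depend on the library version):

```go
package main

import "cloud.google.com/go/pubsub/v2"

// configureStreams opens two StreamingPull streams and enables
// per-stream flow control so the server-side limits better match
// the local ones when NumGoroutines > 1.
func configureStreams(client *pubsub.Client) *pubsub.Subscriber {
	sub := client.Subscriber("my-sub") // hypothetical subscription ID
	sub.ReceiveSettings.NumGoroutines = 2
	sub.ReceiveSettings.EnablePerStreamFlowControl = true
	return sub
}
```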


Going above 100 streams can lead to increasingly poor behavior, such as acks/modacks not
succeeding in a reasonable amount of time, leading to message expiration. In these cases,
we recommend horizontally scaling by increasing the number of subscriber client applications.
By default, the number of connections in the gRPC connection pool is min(4, GOMAXPROCS).
Each connection supports up to 100 streams, so with 4 or more CPU cores the default
setting allows a maximum of 400 streams, which is already excessive for most use cases.
If you want to change the limits on the number of streams, you can change the number of connections
in the gRPC connection pool as shown below:

	opts := []option.ClientOption{
		option.WithGRPCConnectionPool(2),
	}
	client, err := pubsub.NewClient(ctx, projID, opts...)

This [medium post] describes tuning Pub/Sub performance in more detail.

# General tips

Each application should use a single PubSub client instead of creating many.
In addition, when publishing to a single topic, a publisher should be instantiated
once and reused to take advantage of flow control and batching capabilities.
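A sketch of the single-client, single-publisher pattern described above (the project and topic IDs are hypothetical, and the `Publisher` accessor is inferred from this document):

```go
package main

import (
	"context"
	"log"

	"cloud.google.com/go/pubsub/v2"
)

func main() {
	ctx := context.Background()

	// Create the client and publisher once, at startup, and reuse them.
	client, err := pubsub.NewClient(ctx, "my-project") // hypothetical project ID
	if err != nil {
		log.Fatal(err)
	}
	defer client.Close()
	publisher := client.Publisher("my-topic") // hypothetical topic ID

	// Publish through the same publisher so the client can batch
	// messages and apply flow control across all of them; collect
	// results first, then wait, to avoid serializing publishes.
	var results []*pubsub.PublishResult
	for i := 0; i < 10; i++ {
		results = append(results, publisher.Publish(ctx, &pubsub.Message{Data: []byte("payload")}))
	}
	for _, res := range results {
		if _, err := res.Get(ctx); err != nil {
			log.Printf("publish failed: %v", err)
		}
	}
}
```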

[product documentation page]: https://cloud.google.com/pubsub/docs
[Streaming Pull API]: https://docs.cloud.google.com/pubsub/docs/pull#streamingpull-api
[main Google Cloud Go package]: https://pkg.go.dev/cloud.google.com/go
[migration guide]: https://github.com/googleapis/google-cloud-go/blob/main/pubsub/MIGRATING.md
[medium post]: https://medium.com/google-cloud/pub-sub-flow-control-batching-9ba9a75bce3b
[this issue]: https://issuetracker.google.com/352592079
[Resource Limits]: https://docs.cloud.google.com/pubsub/quotas#resource_limits
*/
package pubsub // import "cloud.google.com/go/pubsub/v2"