Document --max-shard-size-bytes support for shards larger than 80 GiB#12204
sumobrian wants to merge 1 commit into opensearch-project:main
Conversation
Thank you for submitting your PR. The PR states are In progress (or Draft) -> Tech review -> Doc review -> Merged. Before you submit your PR for doc review, make sure the content is technically accurate. If you need help finding a tech reviewer, tag a maintainer. When you're ready for doc review, tag the assignee of this PR. The doc reviewer may push edits to the PR directly or leave comments and editorial suggestions for you to address (let us know in a comment if you have a preference).
```diff
  - `include_global_state: true` – Ensures that global cluster state is included.
  - `compress: false` – Disables metadata compression, which is required for compatibility with RFS.
- - Shards of up to **80 GiB** are supported by default. Larger shard sizes can be configured, **except in AWS GovCloud (US)**, where 80 GiB is the maximum.
+ - Shards of up to **80 GiB** are supported by default. Larger shard sizes can be configured. For details, see [Backfill migration using RFS]({{site.url}}{{site.baseurl}}/migration-assistant/migration-phases/deploy/configuration-options/#backfill-migration-using-rfs). **In AWS GovCloud (US)**, 80 GiB is the maximum supported shard size.
```
How about: "In AWS GovCloud (US) with the ECS deployment, 80 GiB is the maximum supported shard size"?
This is a good callout, but I don’t want to fragment the documentation down to individual leaf nodes. Instead, the documentation should clearly direct users: for ECS, go here; for EKS, go here.
```diff
+ - Shards of up to **80 GiB** are supported by default. Larger shard sizes can be configured. For details, see [Backfill migration using RFS]({{site.url}}{{site.baseurl}}/migration-assistant/migration-phases/deploy/configuration-options/#backfill-migration-using-rfs). **In AWS GovCloud (US)**, 80 GiB is the maximum supported shard size.
```
The cross-link anchor should change to `#configuring-large-shard-support`.
By default, RFS supports shards of up to **80 GiB**. To migrate larger shards, pass the `--max-shard-size-bytes` flag through `reindexFromSnapshotExtraArgs`. For example, to support shards up to 200 GiB:

```json
"reindexFromSnapshotExtraArgs": "--max-shard-size-bytes 200000000000"
```

The RFS code uses binary GiB internally (`80 * 1024 * 1024 * 1024L`), so the flag value should also use binary:

```json
"reindexFromSnapshotExtraArgs": "--max-shard-size-bytes 214748364800"
```
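The binary-versus-decimal distinction raised above is easy to get wrong. The sketch below is my own illustration (not part of the docs change) of computing flag values in binary GiB:

```python
GIB = 1024 ** 3  # binary gibibyte, matching RFS's internal 80 * 1024 * 1024 * 1024L default

def max_shard_size_arg(gib: int) -> str:
    """Build the flag value for a shard-size limit of `gib` binary GiB."""
    return f"--max-shard-size-bytes {gib * GIB}"

print(max_shard_size_arg(80))   # default: --max-shard-size-bytes 85899345920
print(max_shard_size_arg(200))  # 200 GiB: --max-shard-size-bytes 214748364800
```

Note that the decimal value `200000000000` corresponds to only about 186 binary GiB, so a 190 GiB shard would still be over the limit.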
Ensure that your worker nodes have sufficient local disk space, because RFS requires approximately **2x the shard size** in local storage to unpack and process the Lucene index. For more information about available RFS arguments, see the [DocumentsFromSnapshotMigration README](https://github.com/opensearch-project/opensearch-migrations/blob/main/DocumentsFromSnapshotMigration/README.md#arguments).

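As a rough back-of-the-envelope sketch of that guidance (my own illustration, using the approximate 2x factor quoted above, not an official sizing formula):

```python
GIB = 1024 ** 3  # binary gibibyte

def approx_local_disk_needed(shard_size_bytes: int, factor: float = 2.0) -> int:
    """Approximate local disk needed to unpack and process one shard's Lucene index."""
    return int(shard_size_bytes * factor)

# A 200 GiB shard would need roughly 400 GiB of free local storage on the worker.
print(approx_local_disk_needed(200 * GIB) // GIB)  # -> 400
```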
I think we are missing a {% include copy.html %} block here
### Shards appear stuck with no errors

If `console backfill status --deep-check` shows shards that remain in progress indefinitely with no errors in the logs, the shard may exceed the default **80 GiB** size limit. Shards larger than this limit are silently rejected by RFS workers and will never complete. To resolve this, increase the `--max-shard-size-bytes` value in your deployment configuration. For details, see [Configuring large shard support]({{site.url}}{{site.baseurl}}/migration-assistant/migration-phases/deploy/configuration-options/#configuring-large-shard-support).
"silently rejected" seems too harsh here. How about : "are skipped by RFS workers without surfacing an error in the backfill status output"
- Update is-migration-assistant-right-for-you.md to note larger shards can be configured and link to configuration options
- Add 'Configuring large shard support' section to configuration-options.md with `--max-shard-size-bytes` usage and disk space requirements
- Add troubleshooting entry to backfill.md for shards that appear stuck due to exceeding the default size limit

Signed-off-by: Brian Presley <bjpres@amazon.com>
Description
Documents how to configure RFS to migrate shards larger than the default 80 GiB limit using the `--max-shard-size-bytes` flag.

Changes

- `reindexFromSnapshotExtraArgs` example and disk space guidance.

Context
Users migrating clusters with shards exceeding 80 GiB encounter a situation where the shard appears stuck indefinitely, with no errors in the backfill status output. The RFS worker silently rejects the shard due to the default `--max-shard-size-bytes` limit (80 GiB), and the work item is repeatedly acquired and failed without progress. This documentation update makes the configuration path discoverable.

Issues Resolved
N/A
Check List
Commits are signed per the DCO using `--signoff`.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following the Developer Certificate of Origin and signing off your commits, please check here.