Update text analyzer documentation clarifying the example #12229

Open
kolchfa-aws wants to merge 1 commit into main from analyzer-update-22


Conversation

@kolchfa-aws
Collaborator

Corrects the documentation for the edge_ngram and ngram tokenizers, which previously suggested that both tokenizers parse strings into words by default. In fact, both tokenizers treat the entire input as a single token unless configured with token_chars. Updated the descriptions to explain this default behavior and to show that configuration is required for word-based parsing. Updated the edge_ngram example to demonstrate the configured behavior.
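
To illustrate the distinction described above, here is a minimal Python sketch that models edge n-gram generation with and without token_chars. It is a simplified stand-in and not the tokenizer's actual implementation: the function name edge_ngrams is hypothetical, only the letter and digit character classes are modeled, and the defaults (min_gram 1, max_gram 2, empty token_chars) are assumed to match the tokenizer's documented defaults.

```python
def edge_ngrams(text, min_gram=1, max_gram=2, token_chars=()):
    """Simplified model of edge n-gram tokenization (illustrative only)."""
    # Hypothetical mapping: only two character classes are modeled here.
    classes = {"letter": str.isalpha, "digit": str.isdigit}
    predicates = [classes[name] for name in token_chars]

    if not predicates:
        # No token_chars: the entire input is treated as a single token.
        words = [text]
    else:
        # Split on any character that belongs to none of the listed classes.
        words, current = [], ""
        for ch in text:
            if any(pred(ch) for pred in predicates):
                current += ch
            elif current:
                words.append(current)
                current = ""
        if current:
            words.append(current)

    # Emit prefixes of each token, from min_gram up to max_gram characters.
    grams = []
    for word in words:
        for n in range(min_gram, min(max_gram, len(word)) + 1):
            grams.append(word[:n])
    return grams


default_grams = edge_ngrams("Quick fox")
configured_grams = edge_ngrams("Quick fox", min_gram=2, max_gram=4,
                               token_chars=("letter",))
print(default_grams)
print(configured_grams)
```

In this model, the default call produces prefixes of the whole string only, while the call with token_chars set to letter produces prefixes for each word separately.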

Checklist

  • By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license and subject to the Developer Certificate of Origin.
    For more information on following the Developer Certificate of Origin and signing off your commits, please check here.

Signed-off-by: Fanit Kolchina <kolchfa@amazon.com>
@github-actions

Thank you for submitting your PR. The PR states are In progress (or Draft) -> Tech review -> Doc review -> Merged.

Before you submit your PR for doc review, make sure the content is technically accurate. If you need help finding a tech reviewer, tag a maintainer.

When you're ready for doc review, tag the assignee of this PR. The doc reviewer may push edits to the PR directly or leave comments and editorial suggestions for you to address (let us know in a comment if you have a preference).
