Add BOM for UTF-8 to improve support for non-english characters#22339

Merged
thijsoo merged 2 commits into release/25.4 from
2326-253-issue-with-escaping-special-characters-in-llms-file-when-the-file-is-opened-in-the-browser
Jun 11, 2025

Contributor

@leonidasmi leonidasmi commented Jun 11, 2025

Context

Summary

This PR can be summarized in the following changelog entry:

  • Improves support for non-English characters in llms.txt, for servers that don't serve .txt files in UTF-8.
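For reference, the UTF-8 byte order mark (BOM) is the three-byte sequence EF BB BF at the very start of a file; browsers that see it decode the file as UTF-8 even when the server does not declare a charset. A minimal sketch of the idea follows. `prepend_utf8_bom()` is a hypothetical helper written for illustration, not the plugin's actual function:

```php
<?php
// Hypothetical helper for illustration only; not part of Yoast SEO.
function prepend_utf8_bom( string $content ): string {
    $bom = "\xEF\xBB\xBF"; // The UTF-8 byte order mark.

    // Do not add the BOM a second time if it is already present.
    if ( strncmp( $content, $bom, 3 ) === 0 ) {
        return $content;
    }

    return $bom . $content;
}

$with_bom = prepend_utf8_bom( "# Ελληνικα\n" );

// Print the first three bytes as hex.
echo strtoupper( bin2hex( substr( $with_bom, 0, 3 ) ) ), "\n"; // EFBBBF
```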

Relevant technical choices:

Test instructions

Test instructions for the acceptance test before the PR gets merged

This PR can be acceptance tested by following these steps:

  • Make sure your llms.txt file contains a non-English character
    • For example, add ελληνικα to your Site Title
  • Generate your llms.txt file and then view it in the browser
  • You'll see the # Ελληνικα: part rendered properly and not garbled
  • Add the following snippet, which removes the BOM prefix from the llms.txt file:
// Return an empty prefix so that no BOM is prepended to llms.txt.
add_filter( 'wpseo_llmstxt_encoding_prefix', 'custom_llmstxt_encoding_prefix', 10 );
function custom_llmstxt_encoding_prefix() {
    return '';
}
  • Re-generate the file and view it in the browser again
  • This time, you'll see that part as garbled: # Ελληνικα:
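Because rendering depends on the browser and the server, it may help to confirm the two states by inspecting the first bytes of the generated file as well. The following is a sketch only: `has_utf8_bom()` is a hypothetical helper, and the two strings are stand-ins for the file contents before and after adding the filter.

```php
<?php
// Hypothetical helper for illustration only; not part of Yoast SEO.
function has_utf8_bom( string $contents ): bool {
    // True when the contents start with the bytes EF BB BF.
    return strncmp( $contents, "\xEF\xBB\xBF", 3 ) === 0;
}

// Stand-ins for the generated file: default output vs. output with the
// filter above returning an empty prefix.
$default_output  = "\xEF\xBB\xBF# Ελληνικα\n";
$filtered_output = "# Ελληνικα\n";

var_dump( has_utf8_bom( $default_output ) );  // bool(true)
var_dump( has_utf8_bom( $filtered_output ) ); // bool(false)
```

In a real check, the string would come from reading the site's generated llms.txt (for example with `file_get_contents()`); the file location is site-specific and is not shown here.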

Relevant test scenarios

  • Changes should be tested with the browser console open
  • Changes should be tested on different posts/pages/taxonomies/custom post types/custom taxonomies
  • Changes should be tested on different editors (Default Block/Gutenberg/Classic/Elementor/other)
  • Changes should be tested on different browsers
  • Changes should be tested on multisite

Test instructions for QA when the code is in the RC

  • QA should use the same steps as above.

Impact check

This PR affects the following parts of the plugin, which may require extra testing:

  • Check, with this RC and with the released versions, that you get the same content when there are no non-English characters in the llms.txt file.

UI changes

  • This PR changes the UI in the plugin. I have added the 'UI change' label to this PR.

Other environments

  • This PR also affects Shopify. I have added a changelog entry starting with [shopify-seo], added test instructions for Shopify and attached the Shopify label to this PR.

Documentation

  • I have written documentation for this change. For example, comments in the Relevant technical choices, comments in the code, documentation on Confluence / shared Google Drive / Yoast developer portal, or other.

Quality assurance

  • I have tested this code to the best of my abilities.
  • During testing, I had activated all plugins that Yoast SEO provides integrations for.
  • I have added unit tests to verify the code works as intended.
  • If any part of the code is behind a feature flag, my test instructions also cover cases where the feature flag is switched off.
  • I have written this PR in accordance with my team's definition of done.
  • I have checked that the base branch is correctly set.

Innovation

  • No innovation project is applicable for this PR.
  • This PR falls under an innovation project. I have attached the innovation label.
  • I have added my hours to the WBSO document.

Fixes #

@leonidasmi leonidasmi added the changelog: enhancement Needs to be included in the 'Enhancements' category in the changelog label Jun 11, 2025
@leonidasmi leonidasmi marked this pull request as ready for review June 11, 2025 09:17
@leonidasmi leonidasmi added this to the 25.4 milestone Jun 11, 2025

coveralls commented Jun 11, 2025

Pull Request Test Coverage Report for Build b158b26bd4504ab7f9d7f7222a4a81a3654ed313

Details

  • 4 of 4 (100.0%) changed or added relevant lines in 1 file are covered.
  • No unchanged relevant lines lost coverage.
  • Overall coverage increased (+0.003%) to 53.158%

Totals Coverage Status
Change from base Build 277ef9ccecf3d7c24d6e10a2b5501997e9f2d89b: 0.003%
Covered Lines: 29826
Relevant Lines: 57038

💛 - Coveralls

Contributor

@thijsoo thijsoo left a comment

CR + ACC 👍

@thijsoo thijsoo merged commit f9c1a61 into release/25.4 Jun 11, 2025
26 checks passed
@thijsoo thijsoo deleted the 2326-253-issue-with-escaping-special-characters-in-llms-file-when-the-file-is-opened-in-the-browser branch June 11, 2025 09:59
Labels

changelog: enhancement Needs to be included in the 'Enhancements' category in the changelog

3 participants