Add content in the llms.txt file #22257

Merged
thijsoo merged 14 commits into feature/llms-txt from 606-add-content-in-the-llmstxt-file
May 15, 2025

@leonidasmi

@leonidasmi leonidasmi commented May 13, 2025

Context

Summary

This PR can be summarized in the following changelog entry:

  • Adds content in the llms.txt file.

Relevant technical choices:

  • Here are the inclusion rules for the post and term lists.

Test instructions

Test instructions for the acceptance test before the PR gets merged

This PR can be acceptance tested by following these steps:

  • Turn on the LLMs.txt feature
  • Check the generated llms.txt file and confirm that it looks like this:
  • Now, for each section separately:

Intro:

  • Confirm that the intro section is like this:
Generated by Yoast SEO, this is an llms.txt file, meant for consumption by LLMs.

This is the [sitemap](http://sitekit.local/sitemap_index.xml) of this website.
  • Disable our sitemaps, regenerate the file and confirm that the txt file points to the WP sitemap: This is the [sitemap](http://sitekit.local/wp-sitemap.xml) of this website.
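
The two intro variants above can be sketched as follows. This is only an illustration of the expected output, not the plugin's implementation; the function name `sketch_llms_intro` and its parameters are hypothetical, and the two sitemap paths are the ones shown in the steps above.

```php
<?php
// Hypothetical helper for illustration; not taken from the Yoast SEO code base.
function sketch_llms_intro( string $home_url, bool $yoast_sitemaps_enabled ): string {
	// Yoast sitemap index when our sitemaps are on, WP core sitemap otherwise.
	$path        = $yoast_sitemaps_enabled ? 'sitemap_index.xml' : 'wp-sitemap.xml';
	$sitemap_url = rtrim( $home_url, '/' ) . '/' . $path;

	return "Generated by Yoast SEO, this is an llms.txt file, meant for consumption by LLMs.\n\n"
		. "This is the [sitemap]({$sitemap_url}) of this website.";
}

// Example: the intro expected after disabling the Yoast sitemaps.
echo sketch_llms_intro( 'http://sitekit.local', false ), "\n";
```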

Title:

  • If your Site Title is test1 and your Tagline is test2, confirm that the title section is like this:
# test1: test2
  • If there's no Tagline (or if the Tagline is the default "Just another WordPress site"), confirm that the title section is like this:
# test1
  • If there's no Site Title, confirm that the title section is like this:
# test2
  • If there's no Site Title or Tagline, confirm that there's no section at all
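
The title cases above can be summarized in a small sketch. It assumes the function name `sketch_llms_title` (hypothetical) and treats the default tagline as if it were empty in every case, which the steps above only state explicitly for the case where a Site Title is set.

```php
<?php
// Hypothetical helper for illustration; not the plugin's actual code.
function sketch_llms_title( string $site_title, string $tagline ): string {
	$site_title = trim( $site_title );
	$tagline    = trim( $tagline );

	// Assumption: the default WordPress tagline counts as "no tagline".
	if ( $tagline === 'Just another WordPress site' ) {
		$tagline = '';
	}

	// Keep only the non-empty parts, in "title: tagline" order.
	$parts = array_filter(
		[ $site_title, $tagline ],
		static fn( string $part ): bool => $part !== ''
	);

	// No Site Title and no Tagline: no title section at all.
	if ( $parts === [] ) {
		return '';
	}

	return '# ' . implode( ': ', $parts );
}

echo sketch_llms_title( 'test1', 'test2' ), "\n";
```

An empty return value stands for "no section", so a caller can skip the section without special-casing the inputs.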

Description:

  • In Settings>Reading>Home Page set a page as the home page. Then edit that page and set a meta description
  • Generate the txt file and confirm the description section contains the meta description that's set for that page:
> Custom meta description, set in the page
  • The above should also hold on a site where indexables are disabled and have not been created.
  • Delete the meta description from the page, regenerate the txt file and confirm that there's no description section
  • In Settings>Reading>Home Page set the latest posts as the home page. Also, go to Yoast SEO->Settings->Homepage and give a meta description there
  • Generate the txt file and confirm the description section contains the meta description that's set in the yoast settings:
> Custom meta description, set in the Yoast settings
  • In the settings, delete the meta description or only set the Tagline replacement variable. Regenerate the txt file and confirm that there's no description section

Post Lists:

  • Have at least 6 posts and 6 pages and generate the .txt file.
  • Confirm that you'll get the following sections:
## Posts
- [Blog post 1](https://example.com/xyz)
- [Blog post 2](https://example.com/xyz)
- [Blog post 3](https://example.com/xyz)
- [Blog post 4](https://example.com/xyz)
- [Blog post 5](https://example.com/xyz)


## Pages
- [About us](https://example.com/xyz)
- [Contact](https://example.com/xyz)
- [Terms](https://example.com/xyz)
- [Privacy policy](https://example.com/xyz)
- [Page 1](https://example.com/xyz)
- [Page 2](https://example.com/xyz)
  • The lists are ordered by modified date, meaning we display the 5 most recently updated posts/pages/CPTs.
  • If you have no posts or pages, confirm that the sections are not displayed
  • Set a couple of those posts to draft, private or password-protected
    • Regenerate the txt file and confirm that those posts are no longer in the txt file
  • Have fewer than 5 public posts and make one of them older than 1 year
    • Regenerate the txt file and confirm that that post is no longer in the txt file
    • This rule applies only to posts, not to any other post type.
  • Go to Yoast SEO->Settings->Content Types->Pages and turn off Show pages in search results
    • Regenerate the file and confirm that the pages are no longer there
    • Revert the setting back to what it was to continue testing
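  • Add the following snippet to exclude pages from indexables:
add_filter( 'wpseo_indexable_excluded_post_types', 'exclude_page' );
// Adds 'page' to the post types excluded from indexables, keeping existing exclusions.
function exclude_page( $excluded_post_types ) {
	$excluded_post_types[] = 'page';
	return $excluded_post_types;
}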
  • Regenerate the file and confirm that the pages are no longer there
  • Add the following snippet, to create a new CPT:
function register_blog_post_type() {
    $args = [
        'label'       => \__( 'Alternative posts', 'yoast-test-helper' ),
        'labels'      => [
            'name'          => \__( 'Alternative posts', 'yoast-test-helper' ),
            'singular_name' => \__( 'Alternative post', 'yoast-test-helper' ),
        ],
        'public'      => true,
        'has_archive' => true,
        'taxonomies'  => [ 'alternative-genre' ],
    ];

    register_post_type( 'blog-post', $args );
}
add_action( 'init', 'register_blog_post_type' );
  • Create a couple of posts for that CPT, regenerate the .txt file and confirm that you get an additional section:
## Alternative posts
- [Alternative Post 1](https://example.com/alternative-post-1/)
- [Alternative Post 2](https://example.com/alternative-post-2/)
  • In the snippet above, turn the CPT into a non-public one: 'public' => false,
    • Regenerate the txt file and confirm that the Alternative posts section is not displayed
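
The list rules tested in this section (public posts only, posts older than one year dropped, ordering by modified date, at most 5 items, no section when nothing qualifies) can be sketched on plain arrays. The function name, the record shape and the `$apply_age_limit` flag are hypothetical, and the sketch assumes the one-year rule looks at the modified date, which the steps above do not specify.

```php
<?php
// Hypothetical sketch of the post list rules; not the plugin's actual code.
// Each record: [ 'title', 'url', 'status', 'password', 'modified' (Unix timestamp) ].
function sketch_post_list( string $heading, array $posts, int $now, bool $apply_age_limit, int $limit = 5 ): string {
	$one_year_ago = $now - ( 365 * 24 * 60 * 60 );

	$eligible = array_filter(
		$posts,
		static function ( array $post ) use ( $apply_age_limit, $one_year_ago ): bool {
			// Drafts, private posts and password-protected posts are left out.
			if ( $post['status'] !== 'publish' || $post['password'] !== '' ) {
				return false;
			}
			// Assumption: "older than 1 year" is measured on the modified date.
			return ! $apply_age_limit || $post['modified'] >= $one_year_ago;
		}
	);

	// Most recently modified first.
	usort( $eligible, static fn( array $a, array $b ): int => $b['modified'] <=> $a['modified'] );

	$lines = array_map(
		static fn( array $post ): string => sprintf( '- [%s](%s)', $post['title'], $post['url'] ),
		array_slice( $eligible, 0, $limit )
	);

	// No qualifying items: no section.
	return $lines === [] ? '' : "## {$heading}\n" . implode( "\n", $lines );
}
```

Passing `true` for `$apply_age_limit` mirrors the Posts section, and `false` mirrors pages and CPTs, where the age rule does not apply.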

Term Lists:

  • For categories and tags, have at least 6 of each and assign posts to them
    • Generate the txt file and confirm that you get a list of the ones that have the most posts
  • Have only 3 categories with posts assigned to them
    • Generate the txt file and confirm that you only get those 3 categories and no empty ones
  • Add the following snippet to exclude tags from indexables:
add_filter( 'wpseo_indexable_excluded_taxonomies', 'exclude_tags' );
// Excludes the post_tag taxonomy from indexables.
function exclude_tags() {
	return ['post_tag'];
}
  • Regenerate the txt file and confirm that tags are no longer there
  • Add the following snippet to create a custom taxonomy for the CPT that we created above:
function register_genre_taxonomy() {
    $args = array(
        'label'        => 'Alternative genres',
        'public'       => true,
        'hierarchical' => true, // true if it's like categories, false for tags
    );

    // Attach the taxonomy to the 'blog-post' CPT registered earlier.
    register_taxonomy('alternative-genre', array('blog-post'), $args);
}
add_action('init', 'register_genre_taxonomy');
  • Also create a couple of alternative posts and assign them to that custom taxonomy
    • Regenerate the txt file and confirm that you get the terms that have the most CPT posts.
  • Now tweak that snippet to make the taxonomy a private one: 'public' => false,
    • Regenerate the txt file and confirm that that taxonomy is no longer there
  • Go to Yoast SEO->Settings->Categories & tags and turn off the Show categories in search results toggle for some of the taxonomies
    • Regenerate the txt file and confirm that those taxonomies are no longer in there.
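
The term list expectations above (only terms that have posts, ordered by how many posts they have) can be sketched in the same spirit. The function name and record shape are hypothetical, and the limit of 5 is an assumption carried over from the post lists; the steps above only say to start with at least 6 terms.

```php
<?php
// Hypothetical sketch of the term list rules; not the plugin's actual code.
// Each record: [ 'name', 'url', 'count' (number of assigned posts) ].
function sketch_term_list( string $heading, array $terms, int $limit = 5 ): string {
	// Empty terms are never listed.
	$with_posts = array_filter( $terms, static fn( array $term ): bool => $term['count'] > 0 );

	// Terms with the most posts first.
	usort( $with_posts, static fn( array $a, array $b ): int => $b['count'] <=> $a['count'] );

	$lines = array_map(
		static fn( array $term ): string => sprintf( '- [%s](%s)', $term['name'], $term['url'] ),
		array_slice( $with_posts, 0, $limit )
	);

	// No qualifying terms: no section.
	return $lines === [] ? '' : "## {$heading}\n" . implode( "\n", $lines );
}
```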

Relevant test scenarios

  • Changes should be tested with the browser console open
  • Changes should be tested on different posts/pages/taxonomies/custom post types/custom taxonomies
  • Changes should be tested on different editors (Default Block/Gutenberg/Classic/Elementor/other)
  • Changes should be tested on different browsers
  • Changes should be tested on multisite

Test instructions for QA when the code is in the RC

  • QA should use the same steps as above.

QA can test this PR by following these steps:

Impact check

This PR affects the following parts of the plugin, which may require extra testing:

UI changes

  • This PR changes the UI in the plugin. I have added the 'UI change' label to this PR.

Other environments

  • This PR also affects Shopify. I have added a changelog entry starting with [shopify-seo], added test instructions for Shopify and attached the Shopify label to this PR.

Documentation

  • I have written documentation for this change. For example, comments in the Relevant technical choices, comments in the code, documentation on Confluence / shared Google Drive / Yoast developer portal, or other.

Quality assurance

  • I have tested this code to the best of my abilities.
  • During testing, I had activated all plugins that Yoast SEO provides integrations for.
  • I have added unit tests to verify the code works as intended.
  • If any part of the code is behind a feature flag, my test instructions also cover cases where the feature flag is switched off.
  • I have written this PR in accordance with my team's definition of done.
  • I have checked that the base branch is correctly set.

Innovation

  • No innovation project is applicable for this PR.
  • This PR falls under an innovation project. I have attached the innovation label.
  • I have added my hours to the WBSO document.

Fixes https://github.com/Yoast/reserved-tasks/issues/564

@leonidasmi leonidasmi changed the base branch from feature/llms-txt to llms-txt/create-file May 13, 2025 08:10
@leonidasmi leonidasmi added the changelog: non-user-facing label (Needs to be included in the 'Non-userfacing' category in the changelog) May 13, 2025
@leonidasmi leonidasmi added this to the feature/llms-txt milestone May 13, 2025
@leonidasmi leonidasmi marked this pull request as ready for review May 14, 2025 06:54

@thijsoo thijsoo left a comment

Some suggestions

Comment thread src/llms-txt/domain/markdown/items/link.php Outdated
Comment thread src/llms-txt/domain/markdown/markdown-bucket.php Outdated
Comment thread src/llms-txt/domain/markdown/sections/title.php Outdated
Base automatically changed from llms-txt/create-file to feature/llms-txt May 15, 2025 07:52

@thijsoo thijsoo left a comment

LGTM 👍

@thijsoo thijsoo merged commit 8422494 into feature/llms-txt May 15, 2025
35 of 38 checks passed
@thijsoo thijsoo deleted the 606-add-content-in-the-llmstxt-file branch May 15, 2025 07:56