Adds AI summary block to the filtered elements in the analysis. #22510

Merged
mykola merged 11 commits into trunk from 568-ai-summarize-exclude-ai-summarize-block-from-the-analysis-in-yoastseo-package
Sep 22, 2025

JorPV commented Aug 20, 2025

Context

  • We want to exclude the AI summary content from the majority of the analysis assessments in the yoastseo package.
  • This PR also fixes the current behaviour of AI Optimize for Keyphrase in introduction, following the changes in the PR "Filter out AI Summarize block from the AI Optimize and Summarize prompt content". Specifically, we expect 1) to send the prompt content as the introduction, and 2) this content not to include the AI Summarize block.

Summary

This PR can be summarized in the following changelog entry:

  • Filters the AI summary block from the non-relevant analysis assessments

Relevant technical choices:

Below are the lists of assessments that will exclude the AI summary content once this PR is merged:

List of assessments using the HTML parser that exclude the AI summary
  • Keyphrase in introduction
  • Keyphrase in image text
  • Keyphrase density
  • Single H1
  • Subheading Distribution Too Long
  • Paragraph too long
  • List assessment
  • Competing links
  • Sentence beginnings
  • Sentence length
  • Text alignment
List of assessments NOT using the HTML parser that exclude the AI summary
  • Transition Words
  • Keyphrase Distribution
  • Passive voice
  • Inclusive language
  • Word complexity
  • Outbound links
These analyses do NOT exclude the AI summarize block.
  • Reading time (n/a, the summary should be included in the reading time).
  • Flesch reading ease (n/a, the summary should be included in the Flesch reading ease).
  • Word count (n/a, the AI summary should be counted in the word count analysis for accurate feedback).
  • Prominent words (n/a, the summary should also be analyzed in the Prominent Words analysis).
  • Because the Word Count (wordCountInText.js) and Reading Time (using wordCountInText) assessments were already making use of the removeHtmlBlocks() helper, adding 'yoast-ai-summarize' to the helper's array of ignored classes would exclude the summary from these analyses, making them inaccurate (👀 see screenshot). I decided to add a second parameter (ignoredClassesException) to the helper, allowing for flexible exclusion of HTML elements during parsing. The helper splits the class attribute and checks whether any of its classes should be ignored. If so, it marks the block as ignorable, which allows the parser to skip over tags or elements with specific classes and makes the HTML parsing customizable.
    Now in wordCountInText.js we can tell the removeHtmlBlocks() helper not to ignore the tag with the class name 'yoast-ai-summarize'.
    Also, the Text length assessment in the SEO analysis (which uses wordCountInText as an argument, see TextLengthAssessment.js) includes the summary to determine the length of the content.
image
  • I removed the "should ignore malformed rel tag" unit test from getLinkStatisticsSpec.js because the removeHtmlBlocks() helper in htmlParser.js (using htmlparser2) already filters out malformed attributes (👀 see _stateInAttributeValueDoubleQuotes, _stateInAttributeValueSingleQuotes and _stateInAttributeValueNoQuotes in /wordpress-seo/node_modules/htmlparser2/lib/Tokenizer.js). As a result, the test would always fail because the malformed case never reaches it.
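
As a rough illustration of the class check described in the first bullet above (not the actual helper code), the sketch below splits a class attribute and tests it against a list of ignored classes. The function name hasIgnoredClass, the constant DEFAULT_IGNORED_CLASSES and the class 'example-other-ignored-block' are hypothetical; only 'yoast-ai-summarize' comes from this PR.

```javascript
// Hypothetical default list: only "yoast-ai-summarize" is taken from the PR description.
const DEFAULT_IGNORED_CLASSES = [ "yoast-ai-summarize", "example-other-ignored-block" ];

/**
 * Checks whether an element should be skipped, based on its class attribute.
 *
 * @param {string}   classAttribute The raw value of the element's class attribute.
 * @param {string[]} ignoredClasses The classes that mark an element as ignorable.
 * @returns {boolean} True if at least one class of the element is in the ignored list.
 */
function hasIgnoredClass( classAttribute, ignoredClasses = DEFAULT_IGNORED_CLASSES ) {
	if ( ! classAttribute ) {
		return false;
	}
	// Split on whitespace and drop the empty strings caused by leading or trailing spaces.
	const classNames = classAttribute.split( /\s+/ ).filter( Boolean );
	return classNames.some( ( className ) => ignoredClasses.includes( className ) );
}

// With the default list, the summary block is marked as ignorable.
console.log( hasIgnoredClass( "wp-block yoast-ai-summarize" ) );

// A caller that needs the summary (such as a word count) passes a list without that class.
console.log( hasIgnoredClass( "wp-block yoast-ai-summarize", [ "example-other-ignored-block" ] ) );
```

Passing the list per call keeps the default behaviour for most assessments while letting a single caller opt out, which is the design choice the bullet describes.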

Test instructions

Test instructions for the acceptance test before the PR gets merged

This PR can be acceptance tested by following these steps:
Requirements

  • Make sure you're using a live site
  • Activate Yoast SEO Free and Premium (from v. 25.9) with an active subscription.
  • Create a new post with more than 200 tokens (approx. 150 words). You can use the following text:
Robot vacuums At Zoef Robot, our robot vacuum cleaners have a built-in dustbin where dirt is collected. These bins are often small due to the compact size of the mop robot. It is important to regularly empty the dustbin to ensure the vacuum robot functions optimally. At our place, one of our robot vacuum cleaners called Willem take care of this task for you. Willem are equipped with an emptying station. When the dustbin is full, the robot vacuum cleaner automatically returns to the charging station and empties the collected dirt. You no longer need to worry about it. The emptying station can store dirt for up to 60 days.

Vacuum Robots with Mopping Function

Vacuum robots remove a lot of dust from the floor, but fine dust can remain and cause sneezing. That's why a robot vacuum cleaner with a mopping function is a good choice. You can purchase the mopping function directly or order it separately and attach it to the suction robot with Bep. Some models use a dry microfiber cloth to capture fine dust, while others have a water reservoir for a wet mopping function. Please note that the robot does not move back and forth to remove stubborn stains. However, a robot vacuum cleaner significantly reduces the need for manual vacuuming or mopping. 

Robot Vacuum Cleaners for Pet Hair

As a pet owner, you know that flying pet hair is an everyday challenge. Fortunately, at Zoef Robot, we have developed Bep specifically for this purpose, the ideal robot vacuum cleaner for dealing with pet hair.

  • Add a keyphrase (e.g. house robots)
  • Add the Yoast AI Summarize block at the beginning of your post.
image
  • Generate a summary

SEO analysis

Testing the SEO analysis assessments

These are the SEO analysis assessments that should exclude the AI generated summary. The Text length assessment is the only SEO assessment that includes the summary in the analysis, for more accurate feedback.
The Keyphrase length, Keyphrase in meta description, Keyphrase in slug, Images, Keyphrase in image text, Title width and Meta description length assessments are not applicable to the summary, since the summary is not part of the meta description and will not include any images.

  • Add the keyphrase (house robots) to more than one of the generated summary bullet points.
  • Also change the default Key takeaways title of the generated summary to the keyphrase.
image
  • Go to the Premium SEO analysis tab.
image
  • Confirm that the Keyphrase in introduction, Keyphrase density, Keyphrase in SEO title and Keyphrase in subheading assessments return 🔴 feedback.
  • Confirm that the Keyphrase distribution assessment is grey 🩶 in the considerations section.
  • Confirm that the Text length assessment returns 🟢 feedback with the following text: The text contains X words. Good job!
  • Copy the content of your post INCLUDING the summary and paste it into a word counter tool.
  • Confirm that the number of words matches the Text length assessment feedback, meaning that the Text length assessment includes the summary in the analysis.
  • Copy the text content of the post EXCLUDING the summary and paste it into the word counter tool.
  • Confirm that the number of words without the summary is 258.
  • Go to the preview mode of your post in a new tab and copy the url of your preview.
image
  • Back in your post, create a hyperlink on one of the keyphrases added to the summary, using the URL from your preview with the keyphrase also added to the URL.
image
  • Confirm that the Competing links assessment feedback is 🟢
  • Confirm that the Internal links assessment feedback is 🔴
  • Copy a URL from any other external site (e.g. the URL of the word counter tool).
  • Add a hyperlink to some word in the summary using that URL.
  • Confirm that the Outbound links assessment feedback is 🔴
Scenario: AI Optimize for Keyphrase in introduction
  • Expectations: Keyphrase in introduction should send prompt content, but this content shouldn't include the AI Summarize block content.
  • You can follow the testing instructions in the ATP in the following sections:
    • 2.4.1.2. AI Summarize block with generated summary, specifically the section Block content is not part of prompt content for AI Summarize and AI Optimize
    • 2.11.1 Opening post with AI Summarize block in Classic editor

Readability analysis

Testing the Readability analysis assessments

These are the Readability analysis assessments that should exclude the AI generated summary.
The Text presence analysis is n/a since Yoast AI summarize requires a minimum of 200 tokens in order to generate a summary, hence there will always be text/content.

  • Add the following long paragraph to the last bullet point of the summary of your post.
Long paragraph

He ordered his regular breakfast. Two eggs sunnyside up, hash browns, and two strips of bacon. He continued to look at the menu wondering if this would be the day he added something new. This was also part of the routine. A few seconds of hesitation to see if something else would be added to the order before demuring and saying that would be all. It was the same exact meal that he had ordered every day for the past two years. The desert wind blew the tumbleweed in front of the car. Alex swerved to avoid the tumbleweed, but he turned the wheel a bit too strong and the car left the road and skidded onto the dirt median. He instantly slammed on the brakes and the car stopped in a cloud of dirt. When the dust cloud had settled and he could see around him again, he realized that he'd somehow crossed over into an entirely new dimension. Although Scott said it didn't matter to him, he knew deep inside that it did. They had been friends as long as he could remember and not once had he had to protest that something Joe apologized for doing didn't really matter. Scott stuck to his lie and insisted again and again that everything was fine as Joe continued to apologize. Scott already knew that despite his words accepting the apologies that their friendship would never be the same. The picket fence had stood for years without any issue. That's all it was. A simple, white, picket fence. Why it had all of a sudden become a lightning rod within the community was still unbelievable to most. Yet a community that had once lived in harmony was now divided in bitter hatred and it had everything to do with the white picket fence.

image
  • Go to the Readability analysis tab.
  • Confirm that Paragraph length, Subheading distribution and Sentence length assessments return a 🟢 analysis feedback.
  • Add the following paragraph to one point in the summary.
Consecutive sentences text

We went shopping in the city to buy some new clothes. And then we fat people had dinner. And then we went for a walk in the park. And then we went to the movies. But the trajectory movie was boring. But we still had a good time. But then we went home at 11 PM. And then we went to sleep.

  • Confirm that the Consecutive sentences assessment returns a 🟢 analysis feedback.
  • Add the same paragraph to your content as well.
  • Confirm that the Consecutive sentences assessment now returns a 🔴 analysis feedback.
  • Replace two of the summary points with the following long sentences.
Long sentences
    • As he crossed toward the pharmacy at the corner he involuntarily turned his head because of a burst of light that had ricocheted from his temple, and saw, with that quick smile with which we greet a rainbow or a rose, a blindingly white parallelogram of sky being unloaded from the van—a dresser with mirrors across which, as across a cinema screen, passed a flawlessly clear reflection of boughs sliding and swaying not arboreally, but with a human vacillation, produced by the nature of those who were carrying this sky, these boughs, this gliding façade.

    • On offering to help the blind man, the man who then stole his car, had not, at that precise moment, had any evil intention, quite the contrary, what he did was nothing more than obey those feelings of generosity and altruism which, as everyone knows, are the two best traits of human nature and to be found in much more hardened criminals than this one, a simple car-thief without any hope of advancing in his profession, exploited by the real owners of this enterprise, for it is they who take advantage of the needs of the poor.

  • Check the Sentence length assessment, it should return 🟢 feedback with the text Great
  • Click on the highlighting (marks) button and confirm that it only highlights sentences from the main content but not in the summary.
image
  • Add the sentence I was hugged by mom. to one of the summary points.
  • Confirm that the Passive voice assessment returns a 🟢 feedback.
  • Check the Word complexity assessment, it should be 🟢.
  • Click on the highlighting (marks) button and confirm that it only highlights the words from the main content but not in the summary.
  • Check the Transition words assessment and confirm that it is also 🟢 and that it likewise only highlights the main content, not the summary.

Inclusive language

The Inclusive language analysis gives feedback on less inclusive words or phrases.

  • Add some of the following non-inclusive words: an albino, harelip, fat people, tribe, exotic, gurus to your AI summary block.
  • Confirm that the Inclusive language feedback returns 🟠 for each of the non-inclusive words added to the content, but not for the words added to the summary.

Important

Please note that for highlighting non-inclusive language, we are still using regex find and replace. Thus, it is expected that the highlighting also marks non-inclusive words from the AI summary. That problem will be solved in this issue: HTML Parser - Inclusive language assessment

Insights analysis

Testing the Insights analysis

The Insights analyses Prominent words, Flesch reading ease, Reading time and Word count should include the AI summary in the analysis for an accurate result, because the summary is part of the post.

  • Open the Insights modal from the Yoast menu and inspect the results of the analysis.
image
  • Pay special attention to the number of times the word robot appears in the Prominent words.
  • Return to the post.
  • Remove the word robot from the summary points at least once.
  • Open the Insights modal again and confirm that the Prominent words analysis has changed accordingly.
  • Return to the post.
  • Remove the long paragraph that you added to the summary earlier.
  • Open the Insights modal again and confirm that the Flesch reading ease, Word count and Reading time analyses have changed accordingly.
  • Return to the post.
  • Remove some more text from the summary.
  • Open the Insights modal once more and confirm that the Word count and Reading time analyses have changed accordingly.
  • Close the modal.
  • Open the WordPress wordcount from the Document overview menu button
image
  • Open the Insights modal again and confirm that the Word count and Reading time match the WordPress ones.
image

Relevant test scenarios

  • Changes should be tested with the browser console open
  • Changes should be tested on different posts/pages/taxonomies/custom post types/custom taxonomies
  • Changes should be tested on different editors (Default Block/Gutenberg/Classic/Elementor/other)
  • Changes should be tested on different browsers
  • Changes should be tested on multisite

Test instructions for QA when the code is in the RC

  • QA should use the same steps as above.

QA can test this PR by following these steps:

Impact check

This PR affects the following parts of the plugin, which may require extra testing:

Other environments

  • This PR also affects Shopify. I have added a changelog entry starting with [shopify-seo], added test instructions for Shopify and attached the Shopify label to this PR.

Documentation

  • I have written documentation for this change. For example, comments in the Relevant technical choices, comments in the code, documentation on Confluence / shared Google Drive / Yoast developer portal, or other.

Quality assurance

  • I have tested this code to the best of my abilities.
  • During testing, I had activated all plugins that Yoast SEO provides integrations for.
  • I have added unit tests to verify the code works as intended.
  • If any part of the code is behind a feature flag, my test instructions also cover cases where the feature flag is switched off.
  • I have written this PR in accordance with my team's definition of done.
  • I have checked that the base branch is correctly set.

Innovation

  • No innovation project is applicable for this PR.
  • This PR falls under an innovation project. I have attached the innovation label.
  • I have added my hours to the WBSO document.

Fixes #568

JorPV added the "changelog: non-user-facing" label (Needs to be included in the 'Non-userfacing' category in the changelog) Aug 20, 2025
JorPV marked this pull request as draft August 20, 2025 10:12
Base automatically changed from feature/ai-summarize to trunk August 26, 2025 10:02
…mmarize-exclude-ai-summarize-block-from-the-analysis-in-yoastseo-package

coveralls commented Sep 2, 2025

Pull Request Test Coverage Report for Build 1c7e291a5dfa42a1fc50246ef407a49ff9a89b7f

Warning: This coverage report may be inaccurate.

This pull request's base commit is no longer the HEAD commit of its target branch. This means it includes changes from outside the original pull request, including, potentially, unrelated coverage changes.

Details

  • 19 of 19 (100.0%) changed or added relevant lines in 3 files are covered.
  • No unchanged relevant lines lost coverage.
  • Overall first build on 568-ai-summarize-exclude-ai-summarize-block-from-the-analysis-in-yoastseo-package at 56.454%

Totals Coverage Status
Change from base Build c552e3a34596c50745fc1b965ed9889ded748c4a: 56.5%
Covered Lines: 14426
Relevant Lines: 24891

💛 - Coveralls

JorPV added the "innovation" label (Innovative issue. Relating to performance, memory or data-flow.) Sep 10, 2025
inIgnorableBlock = false;
ignoreStack = [];

const ignoredClasses = ignoredClassesException || IGNORED_CLASSES;

JorPV commented Sep 11, 2025

This parsing repeats the logic of the initial parser configuration already present in the onopentag handler above. It is also necessary here 👇🏼 to support dynamic ignored classes per function call.
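
A minimal sketch of how the fallback in the quoted line behaves per call, assuming the second argument is either omitted or an array. The names resolveIgnoredClasses and FALLBACK_IGNORED_CLASSES are hypothetical stand-ins, and the contents of the real IGNORED_CLASSES list are not shown in this conversation.

```javascript
// Hypothetical stand-in for the helper's module-level default list.
const FALLBACK_IGNORED_CLASSES = [ "yoast-ai-summarize" ];

/**
 * Mirrors the `ignoredClassesException || IGNORED_CLASSES` expression.
 *
 * @param {string[]} [ignoredClassesException] Optional per-call list of classes to ignore.
 * @returns {string[]} The list of ignored classes that applies to this call.
 */
function resolveIgnoredClasses( ignoredClassesException ) {
	return ignoredClassesException || FALLBACK_IGNORED_CLASSES;
}

// No argument: undefined is falsy, so the default list applies.
console.log( resolveIgnoredClasses() );

// An empty array is truthy in JavaScript, so it replaces the default and no class is ignored.
console.log( resolveIgnoredClasses( [] ) );

// A custom list replaces the default for this call only; the default itself is untouched.
console.log( resolveIgnoredClasses( [ "example-other-ignored-block" ] ) );
```

Because the list is resolved on every call, one caller can keep the summary block while the others continue to use the default.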

JorPV marked this pull request as ready for review September 11, 2025 13:57

mykola commented Sep 18, 2025

@JorPV please update tests:

  1. Keyphrase in subheading is green
image
  2. Keyphrase distribution is not present with the provided text
  3. Text Length is green. Maybe we need to add an instruction to reduce the text to fit this item in case the summary is long.
image
  4. Readability analysis: please post the paragraph text instead of an image

…lude-ai-summarize-block-from-the-analysis-in-yoastseo-package
…lude-ai-summarize-block-from-the-analysis-in-yoastseo-package
mykola added this to the 26.1 milestone Sep 22, 2025
mykola merged commit bf96d9b into trunk Sep 22, 2025
22 checks passed
mykola deleted the 568-ai-summarize-exclude-ai-summarize-block-from-the-analysis-in-yoastseo-package branch September 22, 2025 12:35