Skip to content

PC-965 Fixes highlighting when inline HTML tags are present in Classic editor and in Shopify#19162

Merged
marinakoleva merged 26 commits intotrunkfrom
PC-965-yoast-markers-break-html-of-content
Dec 7, 2022
Merged

PC-965 Fixes highlighting when inline HTML tags are present in Classic editor and in Shopify#19162
marinakoleva merged 26 commits intotrunkfrom
PC-965-yoast-markers-break-html-of-content

Conversation

@FAMarfuaty
Copy link
Copy Markdown
Contributor

@FAMarfuaty FAMarfuaty commented Nov 10, 2022

Context

  • In WordPress, this PR fixes the bug where the highlighting didn't work on text that contains links.
  • In Shopify, this bug affects the functionality of the highlighting of words. Adding the yoastmark inside an anchor tag' attribute, an action that we do when we mark words, breaks the HTML. Hence, the affected assessments are:
    • Keyword density
    • Keyphrase distribution
    • Word complexity

Summary

This PR can be summarized in the following changelog entry:

  • Fixes a bug where the highlighting feature in the Classic editor would not work when inline HTML tags were present.
  • [shopify-seo] Fixes a bug where the yoastmark tags broke the HTML when applied to inline HTML attributes.
  • [yoastseo] Excludes applying yoastmark to anchor tag attributes.

Relevant technical choices:

  • The solution of this PR is only applied to anchor HTML tags, since the reported bug is related only to these tags.
  • I tried testing with other HTML tags that can include URL in their attribute, and so far I don't see any problem with those.
    • Other tags that I tried: blockquote, embed, img, link
  • Unrelated to the goal of this PR, all mentions of topic in the markWordsInSentences file were changed to words. Topic is an umbrella term for keyphrase and synonyms that we use in the code. But the functionality in this file is also used to mark words in the word complexity assessment, so the using the word 'topic' is no longer accurate there.

Test instructions

Test instructions for the acceptance test before the PR gets merged

This PR can be acceptance tested by following these steps:

WordPress

  • Install and activate Yoast SEO
  • Set the site language to English
  • Create a post in Classic editor and add this text:
    catnip.txt

NOTE: Add the text in Text editor mode in Classic editor, and in Code editor mode in Block editor

Keyword density assessment
  • Set the word catnip as the focus keyphrase
  • Confirm that the keyphrase density assessment detects 8 occurrences of the keyphrase in the text
  • Embed this link below to the phrase "alternative plants exist" (the link contains the focus keyphrase):

https://en.wikipedia.org/wiki/Catnip#Felines_not_affected_by_catnip

  • Confirm that the keyword density assessment still detects 8 occurrences of the keyphrase in the text
  • Click the eye icon of keyword density assessment
  • Confirm that the "alternative plants exist" is not highlighted
  • Check the anchor link of the phrase and confirm that the href value doesn't contain yoast mark tags, e.g. "<yoastmark class='yoast-text-mark'>" or "</yoastmark>"
  • Embed this link below to the phrase "catnip flowers" (the link contains the focus keyphrase)

https://hort.extension.wisc.edu/articles/catnip-nepeta-cataria/

  • Confirm that the keyword density assessment still detects 8 occurrences of the keyphrase in the text
  • Click the eye icon of keyword density assessment
  • Confirm that the "catnip" in "catnip flowers" is highlighted
  • Check the anchor link of the phrase and confirm that the href value doesn't contain yoast mark tags, e.g. "<yoastmark class='yoast-text-mark'>" or "</yoastmark>"
Keyphrase distribution assessment
  • Install and activate Yoast SEO premium
    • If building the plugin, run composer require yoast/wordpress-seo:dev-PC-965-yoast-markers-break-html-of-content@dev before building it
  • Go to the previous post
  • Set the word catnip as the focus keyphrase
  • Embed this link below to the phrase "alternative plants exist" (the link contains the focus keyphrase):

https://en.wikipedia.org/wiki/Catnip#Felines_not_affected_by_catnip

  • Click the eye icon of keyphrase distribution assessment
  • Confirm that the "alternative plants exist" is not highlighted
  • Check the anchor link of the phrase and confirm that the href value doesn't contain yoast mark tags, e.g. "<yoastmark class='yoast-text-mark'>" or "</yoastmark>"
  • Embed this link below to the phrase "catnip flowers":

https://hort.extension.wisc.edu/articles/catnip-nepeta-cataria/

  • Click the eye icon of keyphrase distribution assessment
  • Confirm that the "catnip" in "catnip flowers" is highlighted
  • Check the anchor link of the phrase and confirm that the href value doesn't contain yoast mark tags, e.g. "<yoastmark class='yoast-text-mark'>" or "</yoastmark>"
Word complexity assessment
  • Install and activate Yoast SEO premium
  • Go to the previous post
  • Go to the Readability analysis tab
  • Embed this link below to the word "hereditary" in the text:

https://en.wikipedia.org/wiki/hereditary-defects

  • Click the eye icon on the Word complexity assessment
  • Confirm that the word "hereditary" is highlighted
  • Check the anchor link of the phrase and confirm that the href value doesn't contain yoast mark tags, e.g. "<yoastmark class='yoast-text-mark'>" or "</yoastmark>"
Sentence length, Passive voice, and Transition words assessment
  • Go to the previous post
  • Add this sentence to the post:

However, the compounds were found to repel <a href="https://en.wikipedia.org/wiki/Mosquito" rel="nofollow">mosquitos</a>, and it is hypothesized that rubbing against the plants provides the cats with a chemical coat that protects them against mosquito bites.

NOTE: Add the text in Text editor mode in Classic editor, and in Code editor mode in Block editor

  • Click on the eye icon of the sentence length assessment
  • Confirm that the sentence above is highlighted
  • Click on the eye icon of the passive sentence assessment
  • Confirm that the sentence above is highlighted
  • Click on the eye icon of the transition words assessments
  • Confirm that the sentence above is highlighted
Consecutive sentences assessment
  • (Assessment not available for E-commerce content types)
  • Go to the previous post
  • Add these three sentences containing anchor links that start with the same words:

<a href="https://en.wikipedia.org/wiki/Cat">Cats</a> detect nepetalactone through their <a href="https://en.wikipedia.org/wiki/Olfactory_epithelium">olfactory epithelium</a>, not through their vomeronasal organ. <a href="https://en.wikipedia.org/wiki/Cat">Cats</a> detect nepetalactone through their <a href="https://en.wikipedia.org/wiki/Olfactory_epithelium">olfactory epithelium</a>, not through their vomeronasal organ. <a href="https://en.wikipedia.org/wiki/Cat">Cats</a> detect nepetalactone through their <a href="https://en.wikipedia.org/wiki/Olfactory_epithelium">olfactory epithelium</a>, not through their vomeronasal organ.

NOTE: Add the text in Text editor mode in Classic editor, and in Code editor mode in Block editor

  • Click on the eye icon of the consecutive sentences assessment
  • Confirm that the three sentences above are highlighted
Paragraph length assessment
  • Go to the previous post
  • Make one of the paragraphs that contain links longer than 150 words
  • Click on the paragraph length assessment
  • Confirm that the long paragraph that contains links is highlighted
Inclusive language analysis
  • Go to the previous post
  • In one of the sentences that contain links, add a non-inclusive word like, "seniors", "policemen" etc.
  • Click on the eye icon of the feedback
  • Confirm that sentence that contains the non-inclusive word is highlighted

Upgrade routine

  • Install and activate the previous version of Yoast SEO

  • Set the site language to English

  • Create a post in Classic editor and add this text:
    catnip.txt
    NOTE: Add the text in Text editor mode in Classic editor

  • Set "catnip" as the focus keyphrase

  • Embed this link below to the phrase "catnip flowers":

https://hort.extension.wisc.edu/articles/catnip-nepeta-cataria/

  • Click the eye icon of keyphrase distribution assessment or keyphrase density
  • Confirm that the "catnip" in "catnip flower" is NOT highlighted
  • Upgrade Yoast SEO version to the one that includes this fix
  • Go to the previous post
  • Click the eye icon of keyphrase distribution assessment or keyphrase density
  • Confirm that the "catnip" in "catnip flower" is highlighted

Test in Shopify

  • Install and activate Yoast SEO for Shopify

    • If building the app:
      • Build the app from main in shopify-seo
      • Link the wordpress-seo before building
  • Set the store language to English

  • Create a product and add this text:
    catnip.txt

  • NOTE: the test instruction below should be repeated in all content types in Shopify

Keyword density and Keyphrase distribution assessment
  • Set the word catnip as the focus keyphrase
  • Confirm that the keyword density assessment detects 8 occurrences of the keyphrase in the text
  • Embed this link below to the phrase "alternative plants exist":

https://en.wikipedia.org/wiki/Catnip#Felines_not_affected_by_catnip

  • Confirm that the keyword density assessment still detects 8 occurrences of the keyphrase in the text
  • Click the eye icon of keyword density assessment
  • Confirm that the "alternative plants exist" is not highlighted
  • Check the anchor link of the phrase and confirm that the href value doesn't contain yoast mark tags, e.g. "<yoastmark class='yoast-text-mark'>" or "</yoastmark>"
  • Click the eye icon of keyphrase distribution assessment
  • Confirm that the "alternative plants exist" is not highlighted
  • Check the anchor link of the phrase and confirm that the href value doesn't contain yoast mark tags, e.g. "<yoastmark class='yoast-text-mark'>" or "</yoastmark>"
  • Embed this link below to the phrase "catnip flowers":

https://hort.extension.wisc.edu/articles/catnip-nepeta-cataria/

  • Confirm that the keyword density assessment still detects 8 occurrences of the keyphrase in the text

  • Click the eye icon of keyword density assessment

  • Confirm that the "catnip" in "catnip flower" is highlighted

  • Check the anchor link of the phrase and confirm that the href value doesn't contain yoast mark tags, e.g. "<yoastmark class='yoast-text-mark'>" or "</yoastmark>"

  • Click the eye icon of keyphrase distribution assessment

  • Confirm that the "catnip" in "catnip flowers" is highlighted

  • Check the anchor link of the phrase and confirm that the href value doesn't contain yoast mark tags, e.g. "<yoastmark class='yoast-text-mark'>" or "</yoastmark>"

Word complexity assessment
  • Go to the Readability analysis tab
  • Embed this link below to the word "hereditary" in the text:

https://en.wikipedia.org/wiki/hereditary-defects

  • Click the eye icon on the Word complexity assessment
  • Confirm that the word "hereditary" is highlighted
  • Check the anchor link of the phrase and confirm that the href value doesn't contain yoast mark tags, e.g. "<yoastmark class='yoast-text-mark'>" or "</yoastmark>"

Relevant test scenarios

  • Changes should be tested with the browser console open
  • Changes should be tested on different posts/pages/taxonomies/custom post types/custom taxonomies/Woo product pages
    • When testing in WooCommerce:
      • Install and activate WooCommerce
      • Install and activate Yoast SEO for WooCommerce
        • If building the plugin, run composer require yoast/wordpress-seo:dev-PC-965-yoast-markers-break-html-of-content@dev --dev before building it
  • Changes should be tested on different editors (Block/Classic/Elementor/other)
    • Fully test in Classic editor
    • Smoke test in Block editor and Gutenberg editor
      • The bugfix is introduced for Classic editor. Hence smoke testing for Block editor and Gutenberg editor should be enough
    • Testing in Elementor is not necessary since we disable the highlighting functionality there
  • Changes should be tested on different browsers
  • Changes should be tested on multisite

Test instructions for QA when the code is in the RC

  • QA should use the same steps as above.

QA can test this PR by following these steps:

Impact check

This PR affects the following parts of the plugin, which may require extra testing:

  • WordPress:
    • This PR affects the highlighting functionality of all assessments. Also smoke test the highlighting of sentences, paragraphs, and words without links
    • This solution of this PR is only applied to anchor HTML tags, since the reported bug is related to only that tags.
    • But it's also good to check that the yoastmark doesn't break other HTML tags that can include URL as one of the attributes, like blockquote, img, embed, iframe .
  • Shopify:
    • Smoke test the highlighting functionality of other assessments that highlight sentences:
      • Sentence length
      • Transition words
      • Passive voice
      • Consecutive sentences
      • Paragraph length
    • This solution of this PR is only applied to anchor HTML tags, since the reported bug is related to only that tags.
    • But it's also good to check that the yoastmark doesn't break other HTML tags that can include URL as one of the attributes, like blockquote, img, embed, iframe .
  • In order to check if these other HTML tags are broken you need to right click on the element in the Text editor, choose Inspect and then check the element's HTML in the Elements tab of the Console.
  • You can use these examples:
<blockquote cite="http://www.worldwildlife.org/who/index.html">
For 50 years, WWF has been protecting the future of nature. The world's leading conservation organization, WWF works in 100 countries and is supported by 1.2 million members in the United States and close to 5 million globally.
</blockquote>

<img src="https://www.allaboutbirds.org/guide/assets/photo/308074031-480px.jpg" alt="Photos and Videos for Rock Pigeon, All About Birds, Cornell ..." />
<iframe src="demo_iframe.htm" height="200" width="300" title="Iframe Example"></iframe>

UI changes

  • This PR changes the UI in the plugin. I have added the 'UI change' label to this PR.

Other environments

  • This PR also affects Shopify. I have added a changelog entry starting with [shopify-seo], added test instructions for Shopify and attached the Shopify label to this PR.

Documentation

  • I have written documentation for this change.

Quality assurance

  • I have tested this code to the best of my abilities
  • I have added unit tests to verify the code works as intended
  • If any part of the code is behind a feature flag, my test instructions also cover cases where the feature flag is switched off.
  • I have written this PR in accordance with my team's definition of done.

Fixes https://yoast.atlassian.net/browse/PC-965 and https://yoast.atlassian.net/browse/IM-2081

@FAMarfuaty FAMarfuaty added the changelog: bugfix Needs to be included in the 'Bugfixes' category in the changelog label Nov 10, 2022
@FAMarfuaty FAMarfuaty marked this pull request as ready for review November 11, 2022 13:28
@FAMarfuaty FAMarfuaty added the Shopify This PR impacts Shopify. label Nov 11, 2022
Comment on lines +55 to +66
/**
* In 'markup', we apply the markings also inside the anchor's attribute if there is a match, on top of
* marking the anchor's text.
* The step below is to replace the incorrectly marked anchors with the marked anchors that we want:
* where the markings are only applied in the anchor's text.
*/
if ( anchors.length > 0 ) {
const markupAnchors = getAnchorsFromText( markup );
for ( let i = 0; i < markupAnchors.length; i++ ) {
markup = markup.replace( markupAnchors[ i ], markedAnchors[ i ] );
}
}
Copy link
Copy Markdown
Contributor Author

@FAMarfuaty FAMarfuaty Nov 14, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The approach is as follows:

  • Get the anchors from the sentence
  • For every anchor, apply the markings only to the anchor text. Replace the unmarked anchor in the sentence with the marked anchor:
    • Retrieve the anchor text
    • Apply the marking to the anchor text
    • Replace the original anchor text with the marked anchor text
  • If there is an anchor found in the sentence:
    • The incorrectly marked anchor will be replaced with the correctly marked anchor (only marked anchor text, excluding the attributes), retrieved from getMarkedAnchors()

This approach works, but I have to admit that it's a little bit convoluted.
Let me know if you have suggestions on how to improve the approach :)

Copy link
Copy Markdown
Contributor Author

@FAMarfuaty FAMarfuaty Nov 18, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@agnieszkaszuba 's suggestion for improvement for this approach:

  • Before running the functionality in addMarkSingleWord, get all the <a> tags (so only the opening tags) from the text and save them in an array (you can use this part of the regex from getAnchorsFromText: <a[\s]+(?:[^>]+)>)
  • Apply marks to the text (addMarkSingleWord)
  • Replace the <a> tags in the marked text with the original <a> tags

The advantage of this approach is that we don’t need to mark the text between alt tags twice, like it’s currently done in addMarkSingleWord and in getMarkedAnchors.

Copy link
Copy Markdown
Contributor Author

@FAMarfuaty FAMarfuaty Nov 18, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We tried to implement the above suggestion. However, the third step "Replace the tags in the marked text with the original tags" proved to be tricky. And we couldn't formulate a regex that can detect anchor opening tag that includes the <yoastmark>.

Hence, we decided to use the current approach.

Comment on lines +40 to +42
mark._properties.marked = languageProcessing.replaceSingleQuotesInTags( mark._properties.marked );
mark._properties.original = languageProcessing.replaceSingleQuotesInTags( mark._properties.original );

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Classic editor uses double quotes for HTML attribute values. However, in yoastseo, we use single quotes for the attribute values when we create the marked object. As the result, the replacement did not work, as the marks passed by yoastseo did not match anything in the original text.
This step is replacing the single quotes in the marked object output by yoastseo with double quotes. This way, we make sure that the replacement can find a match between the original text of the marked object and the text in the page.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In yoastseo, single quotes are used because Block editor uses single quotes for HTML tag attributes.

agnieszkaszuba and others added 4 commits November 16, 2022 16:37
'topic' is an umbrella term for keyphrase and synonyms that we use in the code. But the functionality in this file is also used to mark words in the word complexity assessment, so the using the word 'topic' is no longer accurate here
…ast-markers-break-html-of-content

# Conflicts:
#	packages/js/src/decorator/tinyMCE.js
@agnieszkaszuba
Copy link
Copy Markdown
Contributor

CR: 💯

@iolandasequino
Copy link
Copy Markdown
Contributor

CR: My most sincere compliments for accomplishing such a tricky change! 👏
Not only for finding a solution to the problem, but also for documenting it so clearly and thoroughly. I added some edits to the parts that were most difficult for me to understand.
This is very educational and will be helpful whenever we need to edit this code. Great job! 🤩

@FAMarfuaty FAMarfuaty force-pushed the PC-965-yoast-markers-break-html-of-content branch from 12d81ce to deef450 Compare December 2, 2022 12:21
@hdvos
Copy link
Copy Markdown
Contributor

hdvos commented Dec 5, 2022

During acceptance testing, a bug was found. See video.
This bug that was solved within this issue. The solution also gave opportunities for other optimizations of the code that was written previously. So some refactoring was done. on previously written after solving the bug. So all needs to be re-acceptance-tested.

Screen.Recording.2022-11-23.at.15.02.55.mov

@marinakoleva
Copy link
Copy Markdown
Contributor

Acceptance testing passed with flying colours 🎉

@marinakoleva marinakoleva added this to the 19.13 milestone Dec 7, 2022
@marinakoleva marinakoleva merged commit e16abc2 into trunk Dec 7, 2022
@marinakoleva marinakoleva deleted the PC-965-yoast-markers-break-html-of-content branch December 7, 2022 13:45
@marinakoleva marinakoleva changed the title PC-965 Fixes highlights when inline HTML tags are present in Classic editor PC-965 Fixes highlighting when inline HTML tags are present in Classic editor Dec 7, 2022
@marinakoleva marinakoleva changed the title PC-965 Fixes highlighting when inline HTML tags are present in Classic editor PC-965 Fixes highlighting when inline HTML tags are present in Classic editor and in Shopify Dec 7, 2022
@enricobattocchi enricobattocchi modified the milestones: 19.13, 19.14 Dec 13, 2022
mhkuu added a commit that referenced this pull request Dec 21, 2022
…ak-html-of-content"

This reverts commit e16abc2, reversing
changes made to db44faa.
@manuelaugustin manuelaugustin added changelog: reverted Needs to be excluded from all categories in the changelog, should be used for reverted PR and removed changelog: bugfix Needs to be included in the 'Bugfixes' category in the changelog labels Dec 21, 2022
@hdvos hdvos restored the PC-965-yoast-markers-break-html-of-content branch January 10, 2023 14:52
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

changelog: reverted Needs to be excluded from all categories in the changelog, should be used for reverted PR Shopify This PR impacts Shopify.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

8 participants