Skip to content

[IM] 19477 yoast markers break html of content in classic editor and shopify#19648

Closed
hdvos wants to merge 30 commits intofeature/lingo-fixesfrom
19477-attempt-3-yoast-markers-break-html-of-content-in-shopify-and-classic
Closed

[IM] 19477 yoast markers break html of content in classic editor and shopify#19648
hdvos wants to merge 30 commits intofeature/lingo-fixesfrom
19477-attempt-3-yoast-markers-break-html-of-content-in-shopify-and-classic

Conversation

@hdvos
Copy link
Copy Markdown
Contributor

@hdvos hdvos commented Jan 17, 2023

Context

Summary

This PR can be summarized in the following changelog entry:

  • Fixes a bug where the highlighting feature in the Classic editor would not work when inline HTML tags were present.
  • [shopify-seo] Fixes a bug where the yoastmark tags broke the HTML when applied to inline HTML attributes.
  • [yoastseo] Excludes applying yoastmark to anchor tag attributes.

Relevant technical choices:

  • in getSentencesFromTokens, a check was added. Previously, if there was an opening tag on the beginning of the text, and a closing tag at the end, both would be removed, regardless of whether they belonged together. In this PR a check is added that does a rudimentary (and not watertight) check whether they belong together. Once we have the html parser, this should be replaced.

Test instructions

Test instructions for the acceptance test before the PR gets merged

This PR can be acceptance tested by following these steps:

NOTE: while testing, you might note that there are different results for the same text between post and page in default editor. This is reproducible on trunk. And this IM issue was created for it

This testing plan is quite extensive. The reason is that this PR touches A lot of different marking

WordPress

  • Install and activate Yoast SEO
  • Set the site language to English
  • Create a post in Classic editor and add this text:
    catnip.txt

NOTE: Add the text in Text editor mode in Classic editor, and in Code editor mode in Block editor

Keyword density assessment
  • Set the word catnip as the focus keyphrase
  • Confirm that the keyphrase density assessment detects 8 occurrences of the keyphrase in the text
  • Embed this link below to the phrase "alternative plants exist" (the link contains the focus keyphrase):

https://en.wikipedia.org/wiki/Catnip#Felines_not_affected_by_catnip

  • Confirm that the keyword density assessment still detects 8 occurrences of the keyphrase in the text
  • Click the eye icon of keyword density assessment
  • Confirm that the "alternative plants exist" is not highlighted
  • Check the anchor link of the phrase and confirm that the href value doesn't contain yoast mark tags, e.g. "<yoastmark class='yoast-text-mark'>" or "</yoastmark>"
  • Embed this link below to the phrase "catnip flowers" (the link contains the focus keyphrase)

https://hort.extension.wisc.edu/articles/catnip-nepeta-cataria/

  • Confirm that the keyword density assessment still detects 8 occurrences of the keyphrase in the text
  • Click the eye icon of keyword density assessment
  • Confirm that the "catnip" in "catnip flowers" is highlighted
  • Check the anchor link of the phrase and confirm that the href value doesn't contain yoast mark tags, e.g. "<yoastmark class='yoast-text-mark'>" or "</yoastmark>"
Keyphrase distribution assessment
  • Install and activate Yoast SEO premium
    • When building the Premium plugin, run composer require yoast/wordpress-seo:dev-19477-attempt-3-yoast-markers-break-html-of-content-in-shopify-and-classic@dev before building it
  • Go to the previous post
  • Set the word catnip as the focus keyphrase
  • Embed this link below to the phrase "alternative plants exist" (the link contains the focus keyphrase):

https://en.wikipedia.org/wiki/Catnip#Felines_not_affected_by_catnip

  • Click the eye icon of keyphrase distribution assessment
  • Confirm that the "alternative plants exist" is not highlighted
  • Check the anchor link of the phrase and confirm that the href value doesn't contain yoast mark tags, e.g. "<yoastmark class='yoast-text-mark'>" or "</yoastmark>"
  • Embed this link below to the phrase "catnip flowers":

https://hort.extension.wisc.edu/articles/catnip-nepeta-cataria/

  • Click the eye icon of keyphrase distribution assessment
  • Confirm that the "catnip" in "catnip flowers" is highlighted
  • Check the anchor link of the phrase and confirm that the href value doesn't contain yoast mark tags, e.g. "<yoastmark class='yoast-text-mark'>" or "</yoastmark>"
Synonyms:
  • add catmint as a keyphrase synonym.
  • toggle the highlighting button for keyphrase density
  • Make sure all occurrences of catnip are marked
  • toggle the highlighting button for the keyphrase distribution
  • Make sure all occurrences of catnip as well as all occurrences of catmint are marked
  • embed the url below to one of the occurrences of catmint

https://en.wikipedia.org/wiki/Catmint

  • Highlight keyphrase distribution
  • Make sure that all occurrences of catmint are properly highlighted.
  • Undo the highlighting
  • Make sure that all occurrences of catmint are still there. (there should be 3 occurrences)
Word complexity assessment
  • Install and activate Yoast SEO premium
  • Go to the previous post
  • Go to the Readability analysis tab
  • Embed this link below to the word "hereditary" in the text:

https://en.wikipedia.org/wiki/hereditary-defects

  • Click the eye icon on the Word complexity assessment
  • Confirm that the word "hereditary" is highlighted
  • Check the anchor link of the phrase and confirm that the href value doesn't contain yoast mark tags, e.g. "<yoastmark class='yoast-text-mark'>" or "</yoastmark>"
Sentence length, Passive voice, and Transition words assessment
  • Go to the previous post
  • Add this sentence to the post (in text editing mode as opposed to visual editing):

However, the compounds were found to repel <a href="https://en.wikipedia.org/wiki/Mosquito" rel="nofollow">mosquitos</a>, and it is hypothesized that rubbing against the plants provides the cats with a chemical coat that protects them against mosquito bites.

NOTE: Add the text in Text editor mode in Classic editor, and in Code editor mode in Block editor

  • Click on the eye icon of the sentence length assessment
  • Confirm that the sentence above is highlighted
  • Click on the eye icon of the passive sentence assessment
  • Confirm that the sentence above is highlighted
  • Click on the eye icon of the transition words assessments
  • Confirm that the sentence above is highlighted
Consecutive sentences assessment
  • (Assessment not available for E-commerce content types)
  • Go to the previous post
  • Add these three sentences containing anchor links that start with the same words:

<a href="https://en.wikipedia.org/wiki/Cat">Cats</a> detect nepetalactone through their <a href="https://en.wikipedia.org/wiki/Olfactory_epithelium">olfactory epithelium</a>, not through their vomeronasal organ. <a href="https://en.wikipedia.org/wiki/Cat">Cats</a> detect nepetalactone through their <a href="https://en.wikipedia.org/wiki/Olfactory_epithelium">olfactory epithelium</a>, not through their vomeronasal organ. <a href="https://en.wikipedia.org/wiki/Cat">Cats</a> detect nepetalactone through their <a href="https://en.wikipedia.org/wiki/Olfactory_epithelium">olfactory epithelium</a>, not through their vomeronasal organ.

NOTE: Add the text in Text editor mode in Classic editor, and in Code editor mode in Block editor

  • Click on the eye icon of the consecutive sentences assessment
  • Confirm that the three sentences above are highlighted
Paragraph length assessment
  • Go to the previous post
  • Make one of the paragraphs that contain links longer than 150 words
  • Click on the paragraph length assessment
  • Confirm that the long paragraph that contains links is highlighted
Paragraph length assessment
  • Combine the 2nd, 3rd and 4th paragraph to 1 paragraph.
  • Activate highlighting for paragraph length assessment
  • Make sure the newly combined paragraph is highlighted.
Inclusive language analysis
  • Go to the previous post
  • In one of the sentences that contain links, add a non-inclusive word like, "seniors", "policemen" etc.
  • Click on the eye icon of the feedback
  • Confirm that sentence that contains the non-inclusive word is highlighted

Upgrade routine

  • Install and activate the previous version of Yoast SEO

  • Set the site language to English

  • Create a post in Classic editor and add this text:
    catnip.txt
    NOTE: Add the text in Text editor mode in Classic editor

  • Set "catnip" as the focus keyphrase

  • Embed this link below to the phrase "catnip flowers":

https://hort.extension.wisc.edu/articles/catnip-nepeta-cataria/

  • Click the eye icon of keyphrase distribution assessment or keyphrase density
  • Confirm that the "catnip" in "catnip flower" is NOT highlighted
  • Upgrade Yoast SEO version to the one that includes this fix
  • Go to the previous post
  • Click the eye icon of keyphrase distribution assessment or keyphrase density
  • Confirm that the "catnip" in "catnip flower" is highlighted
Smoke test Advanced custom fields
Smoke classic editor block in block editor
  • Install and activate the classic editor.
  • Create a post and add text and save the post.
  • Deactivate the classic editor.
  • Open the post in the block editor or gutenberg. Do NOT try block recovery.
  • Smoke test all highlighting.
  • Do block recovery.
  • Smoke test all highlighting
  • Add a single /classic block (create a new block and type /classic), and add some text to it that contains the words catnip and catmint.
  • Make sure that with keyphrase distribution all occurrences of catnip and catmint are highlighted.
  • Make sure that with keyphrase density all occurrences of catnip are highlighted.

Test in Shopify

  • Install and activate Yoast SEO for Shopify

    • If building the app:
      • Build the app from main in shopify-seo
      • Link the wordpress-seo before building
  • Set the store language to English

  • Create a product and add this text:
    catnip.txt

  • NOTE: the test instruction below should be repeated in all content types in Shopify

Keyword density and Keyphrase distribution assessment
  • Set the word catnip as the focus keyphrase
  • Confirm that the keyword density assessment detects 8 occurrences of the keyphrase in the text
  • Embed this link below to the phrase "alternative plants exist":

https://en.wikipedia.org/wiki/Catnip#Felines_not_affected_by_catnip

  • Confirm that the keyword density assessment still detects 8 occurrences of the keyphrase in the text
  • Click the eye icon of keyword density assessment
  • Confirm that the "alternative plants exist" is not highlighted
  • Check the anchor link of the phrase and confirm that the href value doesn't contain yoast mark tags, e.g. "<yoastmark class='yoast-text-mark'>" or "</yoastmark>"
  • Click the eye icon of keyphrase distribution assessment
  • Confirm that the "alternative plants exist" is not highlighted
  • Check the anchor link of the phrase and confirm that the href value doesn't contain yoast mark tags, e.g. "<yoastmark class='yoast-text-mark'>" or "</yoastmark>"
  • Embed this link below to the phrase "catnip flowers":

https://hort.extension.wisc.edu/articles/catnip-nepeta-cataria/

  • Confirm that the keyword density assessment still detects 8 occurrences of the keyphrase in the text

  • Click the eye icon of keyword density assessment

  • Confirm that the "catnip" in "catnip flower" is highlighted

  • Check the anchor link of the phrase and confirm that the href value doesn't contain yoast mark tags, e.g. "<yoastmark class='yoast-text-mark'>" or "</yoastmark>"

  • Click the eye icon of keyphrase distribution assessment

  • Confirm that the "catnip" in "catnip flowers" is highlighted

  • Check the anchor link of the phrase and confirm that the href value doesn't contain yoast mark tags, e.g. "<yoastmark class='yoast-text-mark'>" or "</yoastmark>"

Word complexity assessment
  • Go to the Readability analysis tab
  • Embed this link below to the word "hereditary" in the text:

https://en.wikipedia.org/wiki/hereditary-defects

  • Click the eye icon on the Word complexity assessment
  • Confirm that the word "hereditary" is highlighted
  • Check the anchor link of the phrase and confirm that the href value doesn't contain yoast mark tags, e.g. "<yoastmark class='yoast-text-mark'>" or "</yoastmark>"
(Smoke) Test impact by sentence tokenizer
  • Add the following sentence as a separate paragraph (use text editing mode (as opposed to visual editing mode)): <i>The cat</i> was greeted by <i>the dog</i>.
  • Toggle highlighting for passive voice analysis. Make sure that the above-mentioned sentence is properly highlighted.
    • properly =
    • The sentence is fully highlighted.
    • The previous sentence and past sentence are not highlighted (unless they are passive voice as well)
    • When highlighting 'the cat' and 'the dog' are still in italics.
    • After undoing the highlighting 'the cat' and 'the dog' are still in italics.
  • Add the following sentence as a separate paragraph (use text editing mode (as opposed to visual editing mode)): <div>The cat was greeted by the dog</div>.
  • Toggle highlighting for passive voice analysis. Make sure that the above-mentioned sentence is properly highlighted.
  • Add the word midget to a few random sentences in the post. This should trigger the inclusive language assessment for midget.
  • Toggle highlighting for midget.
  • Make sure that all sentences containing midget are properly highlighted.

Relevant test scenarios

  • Changes should be tested with the browser console open
  • Changes should be tested on different posts/pages/taxonomies/custom post types/custom taxonomies
  • Changes should be tested on different editors (Block/Classic/Elementor/other)
  • Changes should be tested on different browsers
  • Changes should be tested on multisite

Test instructions for QA when the code is in the RC

  • QA should use the same steps as above.

QA can test this PR by following these steps:

Impact check

This PR affects the following parts of the plugin, which may require extra testing:

  • This PR impacts all assessments that use highlighting. Tests are added for this.
  • This PR impacts classic editor and shopify. Block editor and Elementor need to be smoke tested.
  • This PR impacts the sentence tokenizer. An acceptance testing scenario was added to cover this.

UI changes

  • This PR changes the UI in the plugin. I have added the 'UI change' label to this PR.

Other environments

  • This PR also affects Shopify. I have added a changelog entry starting with [shopify-seo], added test instructions for Shopify and attached the Shopify label to this PR.

Documentation

  • I have written documentation for this change.

Quality assurance

  • I have tested this code to the best of my abilities
  • I have added unit tests to verify the code works as intended
  • If any part of the code is behind a feature flag, my test instructions also cover cases where the feature flag is switched off.
  • I have written this PR in accordance with my team's definition of done.

Innovation

  • No innovation project is applicable for this PR.
  • This PR falls under an innovation project. I have attached the innovation label and noted the work hours.

Fixes #19477

@FAMarfuaty FAMarfuaty changed the base branch from trunk to feature/lingo-fixes January 17, 2023 12:42
hdvos added 13 commits January 17, 2023 13:47
# Conflicts:
#	packages/yoastseo/spec/languageProcessing/helpers/sentence/SentenceTokenizerSpec.js
# Conflicts:
#	packages/yoastseo/spec/languageProcessing/helpers/sentence/SentenceTokenizerSpec.js
#	packages/yoastseo/spec/scoring/productPages/fullTextTests/testTexts/en/englishPaper2.js
@hdvos hdvos added this to the feature/lingo-fixes milestone Jan 17, 2023
@hdvos hdvos marked this pull request as ready for review January 17, 2023 15:29
@hdvos hdvos requested a review from a team as a code owner January 17, 2023 15:29
@hdvos hdvos linked an issue Jan 17, 2023 that may be closed by this pull request
Comment thread packages/js/src/decorator/gutenberg.js Outdated
// Note: if isMaybeSingleQuoted is true, this does nessarily mean that the start mark is single quoted.
// It could also be that the start mark doesn't occur at all in startMarkIndex.
// In that case, startMarkIndex will be -1 during later tests.
const isMaybeSingleQuoted = startMarkIndex >= 0;
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Question for CR: not sure what would be a good name for isMaybeSingleQuoted. Suggestions would be welcome.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think isStartMarkSingleQuoted would be more appropriate? because when it's true, we know for sure that it is single quoted (so not a "maybe" anymore)

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The reason I went with Maybe is because of the following note
// Note: if isMaybeSingleQuoted is true, this does necessary mean that the start mark is single quoted.

if startMarkIndex >= 0, this means that the doublequoted start mark is not present. This could either mean that the start mark is single quoted, or that no start mark occurs.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, I think I misunderstood the comment then.
So, there are three cases: no start mark, single quote, and double quote. startMarkIndex >= 0 is true for single quote and false for double quote, but is it supposed to be true or false for no start mark occurs?
But, if it is also true for no start mark, wouldn't it be appropriate to call the variable isNotDoubleQuoted?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good suggestion. Will apply.

Comment thread packages/yoastseo/src/languageProcessing/helpers/sentence/SentenceTokenizer.js Outdated
Comment thread packages/yoastseo/src/languageProcessing/helpers/sentence/SentenceTokenizer.js Outdated
@hdvos hdvos added changelog: bugfix Needs to be included in the 'Bugfixes' category in the changelog Shopify This PR impacts Shopify. labels Jan 19, 2023
Copy link
Copy Markdown
Contributor

@FAMarfuaty FAMarfuaty left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's really great that you also found a solution for the latest issue found, and a neat solution at that! And the code looks great in general. I really appreciate the efforts you were pouring into this PR 🙌🏽

I have some minor suggestions below and also some answers to your questions 😄

Comment thread packages/js/src/decorator/gutenberg.js Outdated
// Note: if isMaybeSingleQuoted is true, this does nessarily mean that the start mark is single quoted.
// It could also be that the start mark doesn't occur at all in startMarkIndex.
// In that case, startMarkIndex will be -1 during later tests.
const isMaybeSingleQuoted = startMarkIndex >= 0;
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think isStartMarkSingleQuoted would be more appropriate? because when it's true, we know for sure that it is single quoted (so not a "maybe" anymore)

Comment thread packages/js/src/decorator/gutenberg.js Outdated
Comment thread packages/js/src/decorator/gutenberg.js Outdated
Comment thread packages/js/src/decorator/tinyMCE.js Outdated
Comment thread packages/yoastseo/spec/languageProcessing/helpers/html/normalizeHTMLSpec.js Outdated
Comment thread packages/yoastseo/spec/languageProcessing/helpers/word/markWordsInSentenceSpec.js Outdated
Comment thread packages/yoastseo/src/languageProcessing/helpers/sentence/SentenceTokenizer.js Outdated
Comment thread packages/yoastseo/src/languageProcessing/helpers/sentence/SentenceTokenizer.js Outdated
Comment thread packages/yoastseo/src/languageProcessing/helpers/sentence/SentenceTokenizer.js Outdated
hdvos and others added 3 commits January 20, 2023 11:47
…enceTokenizer.js

Co-authored-by: Aida Marfuaty <48715883+FAMarfuaty@users.noreply.github.com>
Co-authored-by: Aida Marfuaty <48715883+FAMarfuaty@users.noreply.github.com>
@hdvos hdvos changed the title 19477 attempt 3 yoast markers break html of content in shopify and classic [IM] 19477 yoast markers break html of content in classic editor and shopify Jan 20, 2023
@FAMarfuaty
Copy link
Copy Markdown
Contributor

This PR will be closed in favor of this PR #19688. This is because this PR contains changes from feature/lingo-fixes that we don't want to introduce to trunk just yet.

@FAMarfuaty FAMarfuaty closed this Jan 23, 2023
@mhkuu mhkuu deleted the 19477-attempt-3-yoast-markers-break-html-of-content-in-shopify-and-classic branch February 10, 2023 08:09
@mhkuu mhkuu removed this from the feature/lingo-fixes milestone Feb 10, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

changelog: bugfix Needs to be included in the 'Bugfixes' category in the changelog Shopify This PR impacts Shopify.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Yoast markers break HTML of content in Shopify

3 participants