Inaccurate analysis when keyphrase contains a period #19691

@mmikhan

Reporter: @amboutwe
From: https://yoast.atlassian.net/browse/IM-2011

Please give us a description of what happened

A period (.) is used to determine where a sentence ends, so if the keyphrase itself contains a period, the analysis report is inaccurate.

The customer set the keyphrase to 11. SSW (German), an abbreviated way of saying "11th week of pregnancy". In the copy below, the keyphrase is not recognized:

Das zweite Trimester steht fast vor der Tür. Damit naht auch Besserung, was einige Deiner Symptome betrifft. In der 11. SSW hast Du Dich bestimmt schon über sämtliche Themen informiert. Dabei hast Du sicherlich bemerkt, dass es trotz aller Recherche ständig noch etwas Neues zu entdecken gibt.

This issue is related to Yoast/YoastSEO.js#745, where the sentence tokenizer incorrectly splits at an ordinal number, which breaks keyphrase recognition.
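
To illustrate the failure mode, here is a minimal sketch in Python. It is not the YoastSEO.js tokenizer; `naive_split_sentences` and `keyphrase_in_any_sentence` are hypothetical helpers that assume a sentence ends at any period followed by whitespace, and the sample text is shortened from the customer's copy.

```python
import re

def naive_split_sentences(text):
    # Hypothetical naive tokenizer (an assumption, not Yoast's code):
    # split after '.', '!' or '?' when followed by whitespace.
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]

def keyphrase_in_any_sentence(keyphrase, sentences):
    # Case-insensitive check of whether a single sentence contains the keyphrase.
    return any(keyphrase.lower() in s.lower() for s in sentences)

text = "Damit naht auch Besserung. In der 11. SSW hast Du Dich bestimmt schon informiert."
sentences = naive_split_sentences(text)

# The ordinal "11." is treated as a sentence end, so "11. SSW" is cut in two.
print(sentences)
print(keyphrase_in_any_sentence("11. SSW", sentences))
```

Because the split lands between "11." and "SSW", no single sentence contains the full keyphrase, even though the unsplit text does.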

  • Also, note this discussion on Slack.
  • We’re going to implement this for German only. There are also false positives, like It happened in 2000., so it’s probably best to check only for numbers with 2-3 digits. We will do this in LR-10 Improves sentence recognition for German with ordinal numbers #18560.
  • Check whether LR-10 Improves sentence recognition for German with ordinal numbers #18560 resolves this issue.
  • Check whether there are other languages that use a dot for ordinal numbers (instead of morphology, like 11th). See Ordinal indicator. If there are other languages, create issues to tackle these.
  • In this issue, let’s also look at all other linked observations in the GitHub issues (and in HelpScout) and see if they have been resolved (I’m pretty sure some are).
    • If they have been solved, ping IM to close the GitHub issue with a comment stating which version should have resolved the issue.
    • If they have not been solved, please create a separate issue.
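
The digit-count heuristic discussed above can be sketched as follows. This is only an illustration in Python, not the change made in #18560: `split_sentences_german` is a hypothetical function, and the default bounds of 1 to 3 digits are an assumption (the discussion suggests 2-3; pass `min_digits=2` to match it).

```python
import re

def split_sentences_german(text, min_digits=1, max_digits=3):
    # Hypothetical sketch: split naively, then re-join a piece with the
    # previous one when that previous piece ends in a short number plus
    # a period, which is treated as a German ordinal ("11.").
    pieces = [p for p in re.split(r"(?<=[.!?])\s+", text.strip()) if p]
    # (?<!\d) keeps longer numbers such as "2000." from matching on their tail.
    ordinal_end = re.compile(r"(?<!\d)\d{%d,%d}\.$" % (min_digits, max_digits))
    sentences = []
    for piece in pieces:
        if sentences and ordinal_end.search(sentences[-1]):
            # Re-join with a single space (original whitespace is not preserved).
            sentences[-1] = sentences[-1] + " " + piece
        else:
            sentences.append(piece)
    return sentences

sample = ("In der 11. SSW hast Du Dich informiert. "
          "It happened in 2000. Dabei hast Du etwas bemerkt.")
result = split_sentences_german(sample)
print(result)
```

With this sample, "11. SSW" stays inside one sentence while "2000." still ends its sentence. The heuristic remains imperfect: a sentence that genuinely ends in a short number (for example an age) would be merged with the next one, which is one reason to restrict it to German and to a narrow digit range.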

Observations from the linked GitHub issues and HelpScout conversations:

Screenshots, screen recording, code snippet

ScreenRecorder-error-on.turkish-wordpress.1.mp4

Used versions

  • WordPress version: 5.6
