fix: return only unique Author ID <=> Post Author (ID) pairings. #22065

Merged
pls78 merged 1 commit into Yoast:trunk from eddiesshop:fix/reassigned_authors_query on Apr 4, 2025

Conversation

@eddiesshop (Contributor) commented Feb 20, 2025

Because of the join across the wp_posts table, there could potentially be many posts where an author has been updated. Currently, each of those rows is returned, but most of that data is redundant because the SELECT statement only grabs author_id <=> post_author pairings. For large data sets, this leads to a heavily inflated list of duplicated author_id <=> post_author (ID) pairings.

With the GROUP BY statement, we ensure that only unique author_id <=> post_author (ID) pairings are returned, leading to a much more efficient operation (from a code standpoint).

Context

See this issue: #22064

This PR makes the update_indexables_author_to_reassigned operation, and therefore the wp yoast cleanup command, much more efficient by processing only the unique (Old) Author ID <=> (New) Post Author (ID) pairings that need to be updated.
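To illustrate why unique pairings are enough, here is a minimal Python sketch using an in-memory SQLite database. The two tables are simplified stand-ins for wp_posts and wp_yoast_indexable, and `build_db` and `reassign_authors` are hypothetical helpers, not the plugin's PHP code. The sketch also assumes that one UPDATE per pairing is sufficient; the PR does not show how the plugin applies the pairings.

```python
import sqlite3

# Same shape as the query in this PR: one row per unique (old, new) pairing.
PAIRINGS_SQL = """
    SELECT i.author_id, p.post_author
    FROM wp_yoast_indexable AS i
    JOIN wp_posts AS p ON i.object_id = p.id
    WHERE i.object_type = 'post'
      AND i.author_id <> p.post_author
    GROUP BY i.author_id, p.post_author
    ORDER BY i.author_id
"""


def build_db(posts):
    """Create simplified tables from (post_id, indexed_author, current_author) tuples."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE wp_posts (id INTEGER PRIMARY KEY, post_author INTEGER)")
    db.execute(
        "CREATE TABLE wp_yoast_indexable "
        "(object_id INTEGER, object_type TEXT, author_id INTEGER)"
    )
    db.executemany(
        "INSERT INTO wp_posts VALUES (?, ?)",
        [(post_id, current) for post_id, _, current in posts],
    )
    db.executemany(
        "INSERT INTO wp_yoast_indexable VALUES (?, 'post', ?)",
        [(post_id, indexed) for post_id, indexed, _ in posts],
    )
    return db


def reassign_authors(db):
    """Apply one UPDATE per unique pairing (assumed strategy); return the pairing count."""
    pairings = db.execute(PAIRINGS_SQL).fetchall()
    for old_author, new_author in pairings:
        db.execute(
            """
            UPDATE wp_yoast_indexable
            SET author_id = ?
            WHERE object_type = 'post'
              AND author_id = ?
              AND object_id IN (SELECT id FROM wp_posts WHERE post_author = ?)
            """,
            (new_author, old_author, new_author),
        )
    return len(pairings)


if __name__ == "__main__":
    # Three posts moved from author 1 to 2, one from 3 to 4, one unchanged.
    demo = build_db([(1, 1, 2), (2, 1, 2), (3, 1, 2), (4, 3, 4), (5, 5, 5)])
    print(reassign_authors(demo))  # number of UPDATE statements issued
```

The number of UPDATE statements depends on how many distinct pairings exist, not on how many posts were reassigned, which is the saving this PR targets.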

Summary

This PR can be summarized in the following changelog entry:
changelog: enhancement

  • Fixes an issue where running the wp yoast cleanup CLI command would hang when it reaches the update_indexables_author_to_reassigned step (for very large data sets). Props to eddiesshop.

Relevant technical choices:

Test instructions

Test instructions for the acceptance test before the PR gets merged

This PR can be acceptance tested by following these steps:

  1. Create an arbitrarily large number of posts and ensure they are assigned to the same author.
  2. Run wp yoast index
  3. Create a new author in the DB.
  4. Assign all posts to this new author.
  5. Run wp yoast cleanup. Notice that the command hangs at the update_indexables_author_to_reassigned step.
  6. Run the following query: SELECT wp_yoast_indexable.author_id, wp_posts.post_author FROM wp_yoast_indexable JOIN wp_posts on wp_yoast_indexable.object_id = wp_posts.id WHERE object_type='post' AND wp_yoast_indexable.author_id <> wp_posts.post_author ORDER BY wp_yoast_indexable.author_id. Notice that this query returns one row for every post that you created.
  7. Now run the following query: SELECT wp_yoast_indexable.author_id, wp_posts.post_author FROM wp_yoast_indexable JOIN wp_posts on wp_yoast_indexable.object_id = wp_posts.id WHERE object_type='post' AND wp_yoast_indexable.author_id <> wp_posts.post_author GROUP BY wp_yoast_indexable.author_id, wp_posts.post_author ORDER BY wp_yoast_indexable.author_id. Notice that only 1 row is returned: the single pairing of the Old Author ID with the New Post Author ID that replaces it.
  8. Checkout this branch.
  9. Run wp yoast cleanup. Notice that the command runs without hanging.
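The two queries in steps 6 and 7 can also be compared without a WordPress install. The following is a rough sketch in Python with an in-memory SQLite database. The table definitions are simplified assumptions that keep only the columns the queries touch, and `run_scenario`, `POST_COUNT`, `OLD_AUTHOR` and `NEW_AUTHOR` are hypothetical names used for illustration. It does not exercise wp yoast index or wp yoast cleanup themselves.

```python
import sqlite3

POST_COUNT = 1000  # "arbitrarily large" number of posts for the scenario
OLD_AUTHOR, NEW_AUTHOR = 1, 2

# The query from the test steps; {group_by} is filled in only for step 7.
QUERY_TEMPLATE = """
    SELECT wp_yoast_indexable.author_id, wp_posts.post_author
    FROM wp_yoast_indexable
    JOIN wp_posts ON wp_yoast_indexable.object_id = wp_posts.id
    WHERE object_type = 'post'
      AND wp_yoast_indexable.author_id <> wp_posts.post_author
    {group_by}
    ORDER BY wp_yoast_indexable.author_id
"""


def run_scenario(post_count=POST_COUNT):
    """Return (rows without GROUP BY, rows with GROUP BY) after reassigning all posts."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE wp_posts (id INTEGER PRIMARY KEY, post_author INTEGER)")
    db.execute(
        "CREATE TABLE wp_yoast_indexable "
        "(object_id INTEGER, object_type TEXT, author_id INTEGER)"
    )

    # Steps 1-2: create the posts and an indexable row for each, same author.
    post_ids = range(1, post_count + 1)
    db.executemany(
        "INSERT INTO wp_posts VALUES (?, ?)",
        ((post_id, OLD_AUTHOR) for post_id in post_ids),
    )
    db.executemany(
        "INSERT INTO wp_yoast_indexable VALUES (?, 'post', ?)",
        ((post_id, OLD_AUTHOR) for post_id in post_ids),
    )

    # Steps 3-4: assign every post to the new author; indexables are now stale.
    db.execute("UPDATE wp_posts SET post_author = ?", (NEW_AUTHOR,))

    # Step 6: query without GROUP BY.
    ungrouped = db.execute(QUERY_TEMPLATE.format(group_by="")).fetchall()
    # Step 7: query with GROUP BY on the pairing.
    grouped = db.execute(
        QUERY_TEMPLATE.format(
            group_by="GROUP BY wp_yoast_indexable.author_id, wp_posts.post_author"
        )
    ).fetchall()
    return ungrouped, grouped


if __name__ == "__main__":
    ungrouped, grouped = run_scenario()
    print(len(ungrouped), len(grouped))
```

In this setup the first result has one row per post, while the second collapses to the single old-to-new author pairing.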

Relevant test scenarios

  • Changes should be tested with the browser console open
  • Changes should be tested on different posts/pages/taxonomies/custom post types/custom taxonomies
  • Changes should be tested on different editors (Default Block/Gutenberg/Classic/Elementor/other)
  • Changes should be tested on different browsers
  • Changes should be tested on multisite
    Please see test steps above for reasoning on selecting second choice.

Test instructions for QA when the code is in the RC

N/A

  • QA should use the same steps as above.

QA can test this PR by following these steps:
N/A

Impact check

This PR affects the following parts of the plugin, which may require extra testing:
N/A

UI changes

  • This PR changes the UI in the plugin. I have added the 'UI change' label to this PR.

Other environments

  • This PR also affects Shopify. I have added a changelog entry starting with [shopify-seo], added test instructions for Shopify and attached the Shopify label to this PR.

Documentation

  • I have written documentation for this change. For example, comments in the Relevant technical choices, comments in the code, documentation on Confluence / shared Google Drive / Yoast developer portal, or other.

Quality assurance

  • I have tested this code to the best of my abilities.
  • During testing, I had activated all plugins that Yoast SEO provides integrations for.
  • I have added unit tests to verify the code works as intended.
  • If any part of the code is behind a feature flag, my test instructions also cover cases where the feature flag is switched off.
  • I have written this PR in accordance with my team's definition of done.
  • I have checked that the base branch is correctly set.

Innovation

  • No innovation project is applicable for this PR.
  • This PR falls under an innovation project. I have attached the innovation label.
  • I have added my hours to the WBSO document.

Fixes #22064

@enricobattocchi (Member) commented

Hey @eddiesshop, thanks for the suggestion! We'll try to schedule a review and test in the upcoming weeks (we can't commit to a date since we are currently working on a large project).

@pls78 pls78 added the changelog: enhancement label Apr 4, 2025
@pls78 pls78 added this to the 25.0 milestone Apr 4, 2025
@pls78 (Member) left a comment

CR && Acc: ✅

@pls78 pls78 merged commit 1f6f07a into Yoast:trunk Apr 4, 2025
@eddiesshop eddiesshop deleted the fix/reassigned_authors_query branch April 22, 2025 14:39

Labels

changelog: enhancement, community-patch

Development

Successfully merging this pull request may close these issues.

update_indexables_author_to_reassigned frequently hangs due to inefficient query

4 participants