Skip to content

Rails 6: Patch ActiveRecord::Base.clear_query_caches_for_current_thread #61

Merged
bquorning merged 1 commit intomasterfrom
testdouble/connection-sharing-db-switching
Feb 9, 2021
Merged

Rails 6: Patch ActiveRecord::Base.clear_query_caches_for_current_thread #61
bquorning merged 1 commit intomasterfrom
testdouble/connection-sharing-db-switching

Conversation

@zdennis
Copy link
Copy Markdown
Contributor

@zdennis zdennis commented Apr 21, 2020

What does this PR do?

This PR patches ActiveRecord::Base.clear_query_caches_for_current_thread to restore the connection's database after it is done. Otherwise it's possible for a Mysql2Adapter instance's database to be changed out from under it during an operation (e.g. insert, update, delete, truncate, etc).

Alternate to PR #60

This PR is an alternate to PR #60.

That PR resolves the general issue but loses out on an optimization to reduce the number of DB connections needed. This PR keeps the optimization, but leave the risk that this issue could show up again if new code paths are introduced to ActiveRecord or in gems that exercise the problematic code path.

Illustrating the issue in an application

Consider using this gem and active_record_shards gem. You may have this set up:

class UnsharedModel < ::ActiveRecord::Base
  not_sharded
end

ActiveRecord::Base.default_shard = 1
ActiveRecord::Base.connection.cache do
  UnshardedModel.create!(name: "Soren")
end

Context of this issue

This issue presented itself with a combination of using active_record_shards and active_record_host_pool. It was representing the sharded and unsharded database connections within the same connection pool. This problem did not manifest itself until Rails 6 when cache clearing for cache dirtying operations—e.g. insert, update, truncate, delete, etc—was updated to loop over the app's connection handlers, connection pools to search for currently active connections.

This happens mid-operation (inside of ActiveRecord::ConnectionAdapters::Mysql2Adapter instance) just before the operation is carried out. The result was that referencing the connection was changing the currently selected database and it was not being restored. This led to issues where the operation would try to be carried out on a different database that did not have the same tables/columns.

It is important to note that this issue existed in active_record_host_pool for years, but no code paths exercised it. The assumption that connections would not be re-referenced from within an operation on a connection was safe for years until ActiveRecord 6.

Why this fix?

The active_record_host_pool library is optimized to reduce the number of database connections by pooling them by host/port/socket/user/slave. This means that different databases on the same host will use the same connection pool. This requires that this library provide a mechanism for selecting the correct database when executing queries.

This fix maintains that optimization and fixes the currently known problematic code path introduced in ActiveRecord 6.

Benefits of this change

Preserves host pool optimization.

Consequences of this change

This does not fix the issue, it just fixes the one spot where we know ActiveRecord 6.0 can expose it.

References

The problematic line that causes the database to switch is:

@cx._host_pool_current_database = @database

The change in Rails 6 that exercised a code path to expose this surface the issue: https://github.com/rails/rails/pull/35089/files#diff-193e6e4a86bf5d360a5001bce59334cbR180-R186

For an alternate solution see: #60

Comment thread test/test_arhp.rb
# update, delete/destroy, truncate, etc.)
ActiveRecord::Base.connection_handlers = {
ActiveRecord::Base.writing_role => ActiveRecord::Base.default_connection_handler
}
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is to simulate https://github.com/rails/rails/blob/6-0-stable/activerecord/lib/active_record/railtie.rb#L199. This is necessary to create the environment where the underlying database can be switched during a dirtying operation (e.g. insert, update, etc).

@zdennis zdennis force-pushed the testdouble/connection-sharing-db-switching branch from 9a876f9 to fd17cfd Compare April 22, 2020 14:40
@zdennis
Copy link
Copy Markdown
Contributor Author

zdennis commented Apr 22, 2020

I noticed I had left in some yaml anchors in the test database.yml file so I removed so, amended, and re-pushed this up.

Copy link
Copy Markdown
Contributor

@mriddle mriddle left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is great, thank you for all of the context 🙏 A few small comments but it looks good.

We're probably going to see more of this in the newer Rails versions so any effort we're investing should be invested upstream where possible but for now, this will get things working again 💖

Comment thread lib/active_record_host_pool/connection_handling.rb Outdated
Comment thread lib/active_record_host_pool/connection_handling.rb Outdated
Comment thread test/test_arhp.rb
Comment thread test/test_arhp.rb Outdated
Comment thread test/test_arhp.rb Outdated
Comment thread lib/active_record_host_pool/connection_handling.rb Outdated
@zdennis zdennis force-pushed the testdouble/connection-sharing-db-switching branch from b64ca48 to c0fa370 Compare April 30, 2020 20:02
@zdennis
Copy link
Copy Markdown
Contributor Author

zdennis commented Apr 30, 2020

@mriddle / @bquorning , thanks for taking the time to review. I've pushed up a number changes based on your feedback. Take a look when you get a chance. Thanks!

@zdennis zdennis requested review from bquorning and mriddle April 30, 2020 20:25
@mtsolakiszen
Copy link
Copy Markdown

@zdennis I only wish I had found this PR sooner after spending the better part of the day looking into this while trying to upgrade our project to Rails 6 😄
Are there any updates on this PR or any workarounds that can be suggested short of monkey patching?

Copy link
Copy Markdown
Member

@bquorning bquorning left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just ran into the same problem on a different application. Oh, why didn’t I review this PR sooner 🤦🏼😃

Comment thread test/test_arhp.rb Outdated
@bquorning
Copy link
Copy Markdown
Member

@mriddle If this PR looks ok with you, I think we can do a patch release tomorrow.

…tore underlying database

Otherwise it's possible for a Mysql2Adapter instance's database to be changed out from under it during an operation (e.g. insert, update, delete, truncate, etc).

CONTEXT OF THE ISSUE

This issue presented itself with a combination of using active_record_shards and active_record_host_pool. It was representing the sharded and unsharded database connections within the same connection pool. This problem did not manifest itself until Rails 6 when cache clearing for cache dirtying operations—e.g. insert, update, truncate, delete, etc—was updated to loop over the app's connection handlers, connection pools to search for currently active connections.

This happens mid-operation (inside of ActiveRecord::ConnectionAdapters::Mysql2Adapter instance) just before the operation is carried out. The result was that referencing the connection was changing the currently selected database and it was not being restored. This led to issues where the operation would try to be carried out on a different database that did not have the same tables/columns.

It is important to note that this issue existed in active_record_host_pool for years, but no code paths exercised it. The assumption that connections would not be re-referenced from within an operation on a connection was safe for years until ActiveRecord 6.

WHY THIS FIX

The active_record_host_pool library is optimized to reduce the number of database connections by pooling them by host/port/socket/user/slave. This means that different databases on the same host will use the same connection pool. This requires that this library provide a mechanism for selecting the correct database when executing queries.

This fix maintains that optimization and fixes the currently known problematic code path introduced in ActiveRecord 6.

BENEFITS OF THIS CHANGE

Preserves host pool optimization.

CONSEQUENCES OF THIS CHANGE

This does not fix the issue, it just fixes the one spot where we know ActiveRecord 6.0 can expose it.

REFERENCES

The problematic line that causes the database to switch is: https://github.com/zendesk/active_record_host_pool/blob/981fdeb8dfa52f390bd91471adc19049283c6fb8/lib/active_record_host_pool/connection_proxy.rb#L16

The change in Rails 6 that exercised a code path to expose this surface the issue: https://github.com/rails/rails/pull/35089/files#diff-193e6e4a86bf5d360a5001bce59334cbR180-R186

For an alternate solution see: #60
@bquorning bquorning force-pushed the testdouble/connection-sharing-db-switching branch from baca215 to dee387c Compare February 9, 2021 08:08
@bquorning bquorning merged commit a6ca763 into master Feb 9, 2021
@bquorning bquorning deleted the testdouble/connection-sharing-db-switching branch February 9, 2021 08:11
@bquorning bquorning mentioned this pull request Feb 9, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants