Skip to content

Improve unaccent handling#76436

Closed
xmo-odoo wants to merge 3 commits into
odoo:masterfrom
odoo-dev:master-unaccent-pocalypse-xmo
Closed

Improve unaccent handling#76436
xmo-odoo wants to merge 3 commits into
odoo:masterfrom
odoo-dev:master-unaccent-pocalypse-xmo

Conversation

@xmo-odoo

Copy link
Copy Markdown
Collaborator

Covers tasks 2627454 (allow opting out of unaccent for indexing / performance reasons) and 2634851 (natively support psycopg2.sql in the unaccent wrapper).

Task 2551518 (unaccent indexes) is ignored, as it's not clear whether that should be handled in the ORM (with workarounds for unaccent not being IMMUTABLE) or just documented as a danger (generally speaking behaviour with respect to unaccent seems completely undocumented currently so...).

@robodoo

robodoo commented Sep 13, 2021

Copy link
Copy Markdown
Contributor

Pull request status dashboard

@C3POdoo C3POdoo requested review from a team September 13, 2021 13:53
@C3POdoo C3POdoo added the RD research & development, internal work label Sep 13, 2021
@xmo-odoo xmo-odoo force-pushed the master-unaccent-pocalypse-xmo branch from 94438b8 to 72d39de Compare September 13, 2021 14:00
@C3POdoo C3POdoo requested a review from a team September 13, 2021 14:09

@rco-odoo rco-odoo left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Have you been able to confirm the issue, i.e., the index on parent_path not being used when unaccent is enabled?

Comment thread odoo/osv/expression.py Outdated
`website` added its own sql-compatible version of
`get_unaccent_wrapper` in order to use `psycopg2.sql` for safety.

That is very sad.

Add a condition in the "standard" unaccent wrapper which checks
whether the input type is a `Composable`, and in that case return an
`SQL`. This increases the cost of the wrapper a tad but probably not
to a really sensible amount.

Task 2634851
For searches, Odoo uses `unaccent` if it's available. On some
technical fields this is completely unnecessary and precludes the use
of indexes.

This PR provides:

* an opt-out (`unaccent = False`) on `String` and `Text` fields
* a warning if `unaccent` is enabled on *parent_path* fields as their
  performance can be rather critical and not using the index is quite
  an issue (note: the check that `parent_path` fields have been moved
  outside of the check for their existence as we want to check that
  the field is declared and correctly configured in all cases,
  probably)

Task 2627454
@xmo-odoo

xmo-odoo commented Sep 20, 2021

Copy link
Copy Markdown
Collaborator Author

Have you been able to confirm the issue, i.e., the index on parent_path not being used when unaccent is enabled?

I'd kinda assumed it had been checked, but after duplicating ir.ui.menu a few times (so it'd have enough records postgres would consider the index worth using):

# select count(*) from ir_ui_menu;
 count 
-------
  7712
(1 row)

Fetching the children of a random record (those are the queries generated by env['ir.ui.menu'].search([('id', 'child_of', some_id)])):

test=# explain analyze SELECT "ir_ui_menu".id FROM "ir_ui_menu" WHERE (("ir_ui_menu"."active" = true) AND ("ir_ui_menu"."parent_path"::text like '121/124/%')) ORDER BY  "ir_ui_menu"."sequence" ,"ir_ui_menu"."id";

 Sort  (cost=107.08..107.63 rows=223 width=8) (actual time=0.659..0.702 rows=224 loops=1)
   Sort Key: sequence, id
   Sort Method: quicksort  Memory: 35kB
   ->  Bitmap Heap Scan on ir_ui_menu  (cost=6.58..98.38 rows=223 width=8) (actual time=0.086..0.488 rows=224 loops=1)
         Filter: (active AND ((parent_path)::text ~~ '121/124/%'::text))
         Heap Blocks: exact=61
         ->  Bitmap Index Scan on ir_ui_menu_parent_path_index  (cost=0.00..6.52 rows=224 width=0) (actual time=0.042..0.043>
               Index Cond: (((parent_path)::text >= '121/124/'::text) AND ((parent_path)::text < '121/1240'::text))
 Planning Time: 0.628 ms
 Execution Time: 0.815 ms

adding unaccent to the mix:

test=# explain analyze SELECT "ir_ui_menu".id FROM "ir_ui_menu" WHERE (("ir_ui_menu"."active" = true) AND (unaccent("ir_ui_menu"."parent_path"::text) like unaccent('121/124/%'))) ORDER BY  "ir_ui_menu"."sequence" ,"ir_ui_menu"."id";

 Sort  (cost=224.96..225.05 rows=38 width=8) (actual time=8.868..8.882 rows=224 loops=1)
   Sort Key: sequence, id
   Sort Method: quicksort  Memory: 35kB
   ->  Seq Scan on ir_ui_menu  (cost=0.00..223.96 rows=38 width=8) (actual time=0.585..8.780 rows=224 loops=1)
         Filter: (active AND (unaccent((parent_path)::text) ~~ unaccent('121/124/%'::text)))
         Rows Removed by Filter: 7488
 Planning Time: 0.239 ms
 Execution Time: 8.940 ms

@xmo-odoo xmo-odoo force-pushed the master-unaccent-pocalypse-xmo branch from 72d39de to 803aa37 Compare September 20, 2021 10:20
@xmo-odoo

Copy link
Copy Markdown
Collaborator Author

@robodoo rebase-ff r+

@robodoo

robodoo commented Sep 20, 2021

Copy link
Copy Markdown
Contributor

Merge method set to rebase and fast-forward

@robodoo

robodoo commented Sep 20, 2021

Copy link
Copy Markdown
Contributor

@xmo-odoo, you may want to rebuild or fix this PR as it has failed CI.

@xmo-odoo

Copy link
Copy Markdown
Collaborator Author

@xmo-odoo, you may want to rebuild or fix this PR as it has failed CI.

agree 2

@robodoo

robodoo commented Sep 20, 2021

Copy link
Copy Markdown
Contributor

Linked pull request(s) odoo/enterprise#20822 not ready. Linked PRs are not staged until all of them are ready.

robodoo pushed a commit that referenced this pull request Sep 20, 2021
`website` added its own sql-compatible version of
`get_unaccent_wrapper` in order to use `psycopg2.sql` for safety.

That is very sad.

Add a condition in the "standard" unaccent wrapper which checks
whether the input type is a `Composable`, and in that case return an
`SQL`. This increases the cost of the wrapper a tad but probably not
to a really sensible amount.

Task 2634851

Part-of: #76436
robodoo pushed a commit that referenced this pull request Sep 20, 2021
For searches, Odoo uses `unaccent` if it's available. On some
technical fields this is completely unnecessary and precludes the use
of indexes.

This PR provides:

* an opt-out (`unaccent = False`) on `String` and `Text` fields
* a warning if `unaccent` is enabled on *parent_path* fields as their
  performance can be rather critical and not using the index is quite
  an issue (note: the check that `parent_path` fields have been moved
  outside of the check for their existence as we want to check that
  the field is declared and correctly configured in all cases,
  probably)

Task 2627454

Part-of: #76436
robodoo pushed a commit that referenced this pull request Sep 20, 2021
closes #76436

Related: odoo/enterprise#20822
Signed-off-by: Xavier Morel (xmo) <xmo@odoo.com>
@robodoo robodoo added the 15.1 label Sep 20, 2021
@robodoo robodoo closed this Sep 20, 2021
@robodoo robodoo temporarily deployed to merge September 20, 2021 14:27 Inactive
@fw-bot fw-bot deleted the master-unaccent-pocalypse-xmo branch October 4, 2021 14:47
rco-odoo pushed a commit to odoo-dev/odoo that referenced this pull request Nov 2, 2023
In odoo#76436 we added `unaccent=False` on
field definitions to prevent the ORM (expression.py) from using
`unaccent` for matching domain operators ('(not) (=)(i)like').
However, since odoo#136007, the domain
operators '=like', 'like' and 'not like' no longer use `unaccent`.

We can therefore prevent the use of `unaccent` by using the right
operator ('=like', 'like', 'not like') instead of overriding the
behavior in the field itself.  This makes the `unaccent` parameter on
`Field` less relevant.

In order to reduce field complexity and keep a generic behavior, we
remove this parameter from `Field` and change the usage where
necessary:

 - Remove `unaccent=False` from all `parent_path` fields.  These fields
   are only searched with operator `=like` (see the implementation of
   `child_of` in expression.py).

 - Remove `unaccent=False` from `res.partner` fields `phone` and
   `mobile`.  They were introduced to simplify the creation of indexes
   (in odoo#91788) for the voip module,
   which uses `ilike` on these fields.  We replace `ilike` with `like`
   to have the same behavior as before (see Enterprise PR).

 - Remove `unaccent=False` from fields `ir.attachment.store_fname` and
   `crm.lead.email_domain_criterion`.  They were already useless (no
   search with '...like' operators).

There is a small behavior change: all the mentioned fields will be
searched with `unaccent` from the webclient, since the latter always
uses operator `like` by default.
robodoo pushed a commit that referenced this pull request Nov 2, 2023
In #76436 we added `unaccent=False` on
field definitions to prevent the ORM (expression.py) from using
`unaccent` for matching domain operators ('(not) (=)(i)like').
However, since #136007, the domain
operators '=like', 'like' and 'not like' no longer use `unaccent`.

We can therefore prevent the use of `unaccent` by using the right
operator ('=like', 'like', 'not like') instead of overriding the
behavior in the field itself.  This makes the `unaccent` parameter on
`Field` less relevant.

In order to reduce field complexity and keep a generic behavior, we
remove this parameter from `Field` and change the usage where
necessary:

 - Remove `unaccent=False` from all `parent_path` fields.  These fields
   are only searched with operator `=like` (see the implementation of
   `child_of` in expression.py).

 - Remove `unaccent=False` from `res.partner` fields `phone` and
   `mobile`.  They were introduced to simplify the creation of indexes
   (in #91788) for the voip module,
   which uses `ilike` on these fields.  We replace `ilike` with `like`
   to have the same behavior as before (see Enterprise PR).

 - Remove `unaccent=False` from fields `ir.attachment.store_fname` and
   `crm.lead.email_domain_criterion`.  They were already useless (no
   search with '...like' operators).

There is a small behavior change: all the mentioned fields will be
searched with `unaccent` from the webclient, since the latter always
uses operator `like` by default.

closes #139568

Related: odoo/enterprise#49448
Signed-off-by: Raphael Collet <rco@odoo.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

15.1 RD research & development, internal work

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants