`aiida_profile_clean` causes flaky test failures under pytest-xdist #7347

Might be slop, but I am just documenting all flaky CI behavior. This one does not have an easy fix.

https://github.com/aiidateam/aiida-core/actions/runs/25106933128/job/73570476035?pr=7284

```
_______________________ ERROR at setup of test_get_by_id _______________________
[gw0] linux -- Python 3.10.20 /home/runner/work/aiida-core/aiida-core/.venv/bin/python3
src/aiida/tools/pytest_fixtures/orm.py:100: in factory
    computer = Computer.collection.get(
src/aiida/orm/entities.py:143: in get
    return res.one()[0]
src/aiida/orm/querybuilder.py:1152: in one
    raise NotExistent('No result was found')
E   aiida.common.exceptions.NotExistent: No result was found

During handling of the above exception, another exception occurred:
.venv/lib/python3.10/site-packages/sqlalchemy/engine/base.py:1965: in _exec_single_context
    self.dialect.do_execute(
.venv/lib/python3.10/site-packages/sqlalchemy/engine/default.py:921: in do_execute
    cursor.execute(statement, parameters)
E   sqlite3.IntegrityError: UNIQUE constraint failed: db_dbcomputer.label

The above exception was the direct cause of the following exception:
src/aiida/storage/psql_dos/orm/utils.py:119: in save
    self.session.commit()
.venv/lib/python3.10/site-packages/sqlalchemy/orm/session.py:1923: in commit
    trans.commit(_to_root=True)
<string>:2: in commit
    ???
.venv/lib/python3.10/site-packages/sqlalchemy/orm/state_changes.py:139: in _go
    ret_value = fn(self, *arg, **kw)
.venv/lib/python3.10/site-packages/sqlalchemy/orm/session.py:1239: in commit
    self._prepare_impl()
<string>:2: in _prepare_impl
    ???
.venv/lib/python3.10/site-packages/sqlalchemy/orm/state_changes.py:139: in _go
    ret_value = fn(self, *arg, **kw)
.venv/lib/python3.10/site-packages/sqlalchemy/orm/session.py:1214: in _prepare_impl
    self.session.flush()
.venv/lib/python3.10/site-packages/sqlalchemy/orm/session.py:4179: in flush
    self._flush(objects)
.venv/lib/python3.10/site-packages/sqlalchemy/orm/session.py:4314: in _flush
    with util.safe_reraise():
.venv/lib/python3.10/site-packages/sqlalchemy/util/langhelpers.py:147: in __exit__
    raise exc_value.with_traceback(exc_tb)
.venv/lib/python3.10/site-packages/sqlalchemy/orm/session.py:4275: in _flush
    flush_context.execute()
.venv/lib/python3.10/site-packages/sqlalchemy/orm/unitofwork.py:466: in execute
    rec.execute(self)
.venv/lib/python3.10/site-packages/sqlalchemy/orm/unitofwork.py:642: in execute
    util.preloaded.orm_persistence.save_obj(
.venv/lib/python3.10/site-packages/sqlalchemy/orm/persistence.py:93: in save_obj
    _emit_insert_statements(
.venv/lib/python3.10/site-packages/sqlalchemy/orm/persistence.py:1226: in _emit_insert_statements
    result = connection.execute(
.venv/lib/python3.10/site-packages/sqlalchemy/engine/base.py:1412: in execute
    return meth(
.venv/lib/python3.10/site-packages/sqlalchemy/sql/elements.py:515: in _execute_on_connection
    return connection._execute_clauseelement(
.venv/lib/python3.10/site-packages/sqlalchemy/engine/base.py:1635: in _execute_clauseelement
    ret = self._execute_context(
.venv/lib/python3.10/site-packages/sqlalchemy/engine/base.py:1844: in _execute_context
    return self._exec_single_context(
.venv/lib/python3.10/site-packages/sqlalchemy/engine/base.py:1984: in _exec_single_context
    self._handle_dbapi_exception(
.venv/lib/python3.10/site-packages/sqlalchemy/engine/base.py:2339: in _handle_dbapi_exception
    raise sqlalchemy_exception.with_traceback(exc_info[2]) from e
.venv/lib/python3.10/site-packages/sqlalchemy/engine/base.py:1965: in _exec_single_context
    self.dialect.do_execute(
.venv/lib/python3.10/site-packages/sqlalchemy/engine/default.py:921: in do_execute
    cursor.execute(statement, parameters)
E   sqlalchemy.exc.IntegrityError: (sqlite3.IntegrityError) UNIQUE constraint failed: db_dbcomputer.label
E   [SQL: INSERT INTO db_dbcomputer (uuid, label, hostname, description, scheduler_type, transport_type, metadata) VALUES (?, ?, ?, ?, ?, ?, ?)]
E   [parameters: ('aa6d37d0-edd5-4661-a4ed-97e5bd90271a', 'localhost', 'localhost', '', 'core.direct', 'core.local', '{"workdir": "/tmp/pytest-of-runner/pytest-0/popen-gw0/test_get_by_id0"}')]
E   (Background on this error at: https://sqlalche.me/e/20/gkpj)
```

and here
https://github.com/aiidateam/aiida-core/actions/runs/25108262868/job/73586466642?pr=7284
```
...................[gw0] node down: Not properly terminated
F
_____________________ tests/storage/psql_dos/test_query.py _____________________
[gw0] linux -- Python 3.10.20 /home/runner/work/aiida-core/aiida-core/.venv/bin/python3
worker 'gw0' crashed while running 'tests/storage/psql_dos/test_query.py::test_qb_ordering_limits_offsets_sqla'

replacing crashed worker gw0
```

and also the REST API error might be related to this
https://github.com/aiidateam/aiida-core/actions/runs/25117114340/job/73607205380
```
F
_________________________ tests/restapi/test_routes.py _________________________
[gw0] linux -- Python 3.14.4 /home/runner/work/aiida-core/aiida-core/.venv/bin/python3
worker 'gw0' crashed while running 'tests/restapi/test_routes.py::TestRestApi::test_computers_orderby_schedulertype_desc'

replacing crashed worker gw0
```

> `init_profile` is an `autouse=True` fixture that uses `aiida_profile_clean`: every test method in `TestRestApi` triggers `reset_storage()` on the shared database. Under xdist, this crashes workers.


## Summary

Several tests fail intermittently under `pytest-xdist` due to a race condition when `aiida_profile_clean` resets the shared database while other workers are actively using it. This manifests as both `IntegrityError` failures and outright worker crashes.

Observed in: https://github.com/aiidateam/aiida-core/actions/runs/25106933128/job/73570476092?pr=7284

## Failure Category 1: UNIQUE constraint violation (7 tests)

**Affected tests** (all in `tests/cmdline/params/types/test_code.py`):
- `test_get_by_id`
- `test_get_by_uuid`
- `test_get_by_label`
- `test_get_by_fullname`
- `test_ambiguous_label_pk`
- `test_ambiguous_label_uuid`
- `test_entry_point_validation`

**Affected jobs:** `test minimum reqs (3.10, sqlite)`, `tests-presto`

**Error:**

```
src/aiida/tools/pytest_fixtures/orm.py:100: in factory
    computer = Computer.collection.get(
E   aiida.common.exceptions.NotExistent: No result was found

During handling of the above exception, another exception occurred:

E   aiida.common.exceptions.IntegrityError: (sqlite3.IntegrityError) UNIQUE constraint failed: db_dbcomputer.label
E   [SQL: INSERT INTO db_dbcomputer ... VALUES (..., 'localhost', ...)]
```

**Root cause:**

The `aiida_profile` fixture is session-scoped, so all xdist workers share the same database. `test_shell_complete` (line 127) uses `aiida_profile_clean`, which calls `reset_storage()` and wipes the entire database. This creates a race condition in the `aiida_computer` fixture (`src/aiida/tools/pytest_fixtures/orm.py:100-111`):

1. Worker A resets the database via `aiida_profile_clean`
2. Worker A calls `Computer.collection.get(label='localhost', ...)` -> `NotExistent`
3. Worker B (or another fixture in the same worker) also calls `.get(...)` -> `NotExistent`
4. Worker A creates and stores the computer -> succeeds
5. Worker B tries to create the same computer -> `IntegrityError` on the label unique constraint
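
The interleaving above can be reproduced outside AiiDA. The following is a minimal sketch using only the standard-library `sqlite3` module. The `computer` table, the two connections standing in for xdist workers, and the names `label_exists` and `reproduce_race` are illustrative assumptions, not AiiDA code.

```python
import os
import sqlite3
import tempfile


def label_exists(conn: sqlite3.Connection, label: str) -> bool:
    """Return True if a row with this label is already stored."""
    row = conn.execute("SELECT 1 FROM computer WHERE label = ?", (label,)).fetchone()
    return row is not None


def reproduce_race() -> str:
    """Interleave two check-then-insert sequences on one shared SQLite file."""
    path = os.path.join(tempfile.mkdtemp(), "shared.db")
    worker_a = sqlite3.connect(path)  # stands in for xdist worker A
    worker_b = sqlite3.connect(path)  # stands in for xdist worker B
    try:
        # Hypothetical stand-in for db_dbcomputer with its unique label column.
        worker_a.execute("CREATE TABLE computer (label TEXT UNIQUE)")
        worker_a.commit()

        # Steps 2 and 3: both workers look up the label and find nothing.
        a_missing = not label_exists(worker_a, "localhost")
        b_missing = not label_exists(worker_b, "localhost")

        # Step 4: worker A stores the computer first and succeeds.
        if a_missing:
            worker_a.execute("INSERT INTO computer (label) VALUES (?)", ("localhost",))
            worker_a.commit()

        # Step 5: worker B acts on its stale lookup and hits the constraint.
        if b_missing:
            try:
                worker_b.execute("INSERT INTO computer (label) VALUES (?)", ("localhost",))
                worker_b.commit()
            except sqlite3.IntegrityError as exc:
                return str(exc)
        return "no error"
    finally:
        worker_a.close()
        worker_b.close()


print(reproduce_race())
```

Worker B's insert raises `sqlite3.IntegrityError` on the unique label, the same class of error seen in the CI log. The sketch only models steps 2 to 5; the preceding `reset_storage()` call is what empties the table so that both lookups miss.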

## Failure Category 2: Worker crashes (5 tests)

In the same CI run, `gw0` crashed ("node down: Not properly terminated") during five different tests across five jobs, with no Python traceback:

| Job | Test running when worker crashed |
|-----|----------------------------------|
| `tests-presto` | `tests/tools/archive/orm/test_comments.py::test_exclude_comments_flag` |
| `tests (3.14, psql, rmq)` | `tests/restapi/test_routes.py::TestRestApi::test_computers_orderby_schedulertype_desc` |
| `tests (3.10, psql, zmq)` | `tests/storage/psql_dos/test_backend.py::test_get_info` |
| `tests (3.14, psql, zmq)` | `tests/orm/implementation/test_nodes.py::TestBackendNode::test_clear_attributes` |
| `tests (3.10, psql, rmq)` | `tests/restapi/test_identifiers.py::test_full_type_unregistered[WorkFunctionNode]` |

These are likely caused by the same `reset_storage()` race corrupting database state mid-operation in another worker, leading to segfaults or fatal errors in the storage/ORM layer.

## Possible fixes

1. **Give each xdist worker its own profile/database** -- eliminates all shared-state races but increases resource usage.
2. **Avoid `aiida_profile_clean` in xdist runs** -- use unique labels/data per test instead of wiping the database.
3. **Handle `IntegrityError` in `aiida_computer` fixture** -- catch the error and retry with a `.get()` call:
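   ```python
   from aiida.common.exceptions import IntegrityError

   try:
       computer = Computer(label=label, ...).store()
   except IntegrityError:
       # Another worker stored the computer first; fetch that one instead.
       computer = Computer.collection.get(label=label)
   ```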
   This only addresses Category 1 and doesn't fix the worker crashes.

Option 1 is the most robust since it eliminates the root cause for both failure categories.
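
For option 1, one possible building block is a per-worker name derived from the `PYTEST_XDIST_WORKER` environment variable, which pytest-xdist sets in each worker process (for example `gw0`). The sketch below is only an assumption about how such a name could be computed. The helper `worker_scoped_name` and the base name `test_profile` are hypothetical, and wiring the result into the AiiDA profile fixtures is not shown.

```python
import os
from typing import Mapping, Optional


def worker_scoped_name(base: str, environ: Optional[Mapping[str, str]] = None) -> str:
    """Return a profile/database name that is unique per xdist worker."""
    env = os.environ if environ is None else environ
    # pytest-xdist sets PYTEST_XDIST_WORKER (e.g. "gw0") inside worker processes;
    # it is absent when the tests run without xdist.
    worker = env.get("PYTEST_XDIST_WORKER")
    return f"{base}_{worker}" if worker else base


print(worker_scoped_name("test_profile", {"PYTEST_XDIST_WORKER": "gw1"}))
print(worker_scoped_name("test_profile", {}))
```

With a name like this, a `reset_storage()` call in one worker would only touch that worker's own database, at the cost of one profile per worker.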