Might be slop but I am just documenting all flaky CI behavior. This one does not have an easy fix
https://github.com/aiidateam/aiida-core/actions/runs/25106933128/job/73570476035?pr=7284
_______________________ ERROR at setup of test_get_by_id _______________________
[gw0] linux -- Python 3.10.20 /home/runner/work/aiida-core/aiida-core/.venv/bin/python3
src/aiida/tools/pytest_fixtures/orm.py:100: in factory
computer = Computer.collection.get(
src/aiida/orm/entities.py:143: in get
return res.one()[0]
src/aiida/orm/querybuilder.py:1152: in one
raise NotExistent('No result was found')
E aiida.common.exceptions.NotExistent: No result was found
During handling of the above exception, another exception occurred:
.venv/lib/python3.10/site-packages/sqlalchemy/engine/base.py:1965: in _exec_single_context
self.dialect.do_execute(
.venv/lib/python3.10/site-packages/sqlalchemy/engine/default.py:921: in do_execute
cursor.execute(statement, parameters)
E sqlite3.IntegrityError: UNIQUE constraint failed: db_dbcomputer.label
The above exception was the direct cause of the following exception:
src/aiida/storage/psql_dos/orm/utils.py:119: in save
self.session.commit()
.venv/lib/python3.10/site-packages/sqlalchemy/orm/session.py:1923: in commit
trans.commit(_to_root=True)
<string>:2: in commit
???
.venv/lib/python3.10/site-packages/sqlalchemy/orm/state_changes.py:139: in _go
ret_value = fn(self, *arg, **kw)
.venv/lib/python3.10/site-packages/sqlalchemy/orm/session.py:1239: in commit
self._prepare_impl()
<string>:2: in _prepare_impl
???
.venv/lib/python3.10/site-packages/sqlalchemy/orm/state_changes.py:139: in _go
ret_value = fn(self, *arg, **kw)
.venv/lib/python3.10/site-packages/sqlalchemy/orm/session.py:1214: in _prepare_impl
self.session.flush()
.venv/lib/python3.10/site-packages/sqlalchemy/orm/session.py:4179: in flush
self._flush(objects)
.venv/lib/python3.10/site-packages/sqlalchemy/orm/session.py:4314: in _flush
with util.safe_reraise():
.venv/lib/python3.10/site-packages/sqlalchemy/util/langhelpers.py:147: in __exit__
raise exc_value.with_traceback(exc_tb)
.venv/lib/python3.10/site-packages/sqlalchemy/orm/session.py:4275: in _flush
flush_context.execute()
.venv/lib/python3.10/site-packages/sqlalchemy/orm/unitofwork.py:466: in execute
rec.execute(self)
.venv/lib/python3.10/site-packages/sqlalchemy/orm/unitofwork.py:642: in execute
util.preloaded.orm_persistence.save_obj(
.venv/lib/python3.10/site-packages/sqlalchemy/orm/persistence.py:93: in save_obj
_emit_insert_statements(
.venv/lib/python3.10/site-packages/sqlalchemy/orm/persistence.py:1226: in _emit_insert_statements
result = connection.execute(
.venv/lib/python3.10/site-packages/sqlalchemy/engine/base.py:1412: in execute
return meth(
.venv/lib/python3.10/site-packages/sqlalchemy/sql/elements.py:515: in _execute_on_connection
return connection._execute_clauseelement(
.venv/lib/python3.10/site-packages/sqlalchemy/engine/base.py:1635: in _execute_clauseelement
ret = self._execute_context(
.venv/lib/python3.10/site-packages/sqlalchemy/engine/base.py:1844: in _execute_context
return self._exec_single_context(
.venv/lib/python3.10/site-packages/sqlalchemy/engine/base.py:1984: in _exec_single_context
self._handle_dbapi_exception(
.venv/lib/python3.10/site-packages/sqlalchemy/engine/base.py:2339: in _handle_dbapi_exception
raise sqlalchemy_exception.with_traceback(exc_info[2]) from e
.venv/lib/python3.10/site-packages/sqlalchemy/engine/base.py:1965: in _exec_single_context
self.dialect.do_execute(
.venv/lib/python3.10/site-packages/sqlalchemy/engine/default.py:921: in do_execute
cursor.execute(statement, parameters)
E sqlalchemy.exc.IntegrityError: (sqlite3.IntegrityError) UNIQUE constraint failed: db_dbcomputer.label
E [SQL: INSERT INTO db_dbcomputer (uuid, label, hostname, description, scheduler_type, transport_type, metadata) VALUES (?, ?, ?, ?, ?, ?, ?)]
E [parameters: ('aa6d37d0-edd5-4661-a4ed-97e5bd90271a', 'localhost', 'localhost', '', 'core.direct', 'core.local', '{"workdir": "/tmp/pytest-of-runner/pytest-0/popen-gw0/test_get_by_id0"}')]
E (Background on this error at: https://sqlalche.me/e/20/gkpj)
and here
https://github.com/aiidateam/aiida-core/actions/runs/25108262868/job/73586466642?pr=7284
...................[gw0] node down: Not properly terminated
F
_____________________ tests/storage/psql_dos/test_query.py _____________________
[gw0] linux -- Python 3.10.20 /home/runner/work/aiida-core/aiida-core/.venv/bin/python3
worker 'gw0' crashed while running 'tests/storage/psql_dos/test_query.py::test_qb_ordering_limits_offsets_sqla'
replacing crashed worker gw0
and also the REST API error might be related to this
https://github.com/aiidateam/aiida-core/actions/runs/25117114340/job/73607205380
F
_________________________ tests/restapi/test_routes.py _________________________
[gw0] linux -- Python 3.14.4 /home/runner/work/aiida-core/aiida-core/.venv/bin/python3
worker 'gw0' crashed while running 'tests/restapi/test_routes.py::TestRestApi::test_computers_orderby_schedulertype_desc'
replacing crashed worker gw0
init_profile is an autouse=True fixture that uses aiida_profile_clean, so every test method in TestRestApi triggers reset_storage() on the shared database. Under xdist, this crashes workers.
Summary
Several tests fail intermittently under pytest-xdist due to a race condition when aiida_profile_clean resets the shared database while other workers are actively using it. This manifests as both IntegrityError failures and outright worker crashes.
Observed in: https://github.com/aiidateam/aiida-core/actions/runs/25106933128/job/73570476092?pr=7284
Failure Category 1: UNIQUE constraint violation (7 tests)
Affected tests (all in tests/cmdline/params/types/test_code.py):
test_get_by_id
test_get_by_uuid
test_get_by_label
test_get_by_fullname
test_ambiguous_label_pk
test_ambiguous_label_uuid
test_entry_point_validation
Affected jobs: test minimum reqs (3.10, sqlite), tests-presto
Error:
src/aiida/tools/pytest_fixtures/orm.py:100: in factory
computer = Computer.collection.get(
E aiida.common.exceptions.NotExistent: No result was found
During handling of the above exception, another exception occurred:
E aiida.common.exceptions.IntegrityError: (sqlite3.IntegrityError) UNIQUE constraint failed: db_dbcomputer.label
E [SQL: INSERT INTO db_dbcomputer ... VALUES (..., 'localhost', ...)]
Root cause:
The aiida_profile fixture is session-scoped, so all xdist workers share the same database. test_shell_complete (line 127) uses aiida_profile_clean, which calls reset_storage() and wipes the entire database. This creates a race condition in the aiida_computer fixture (src/aiida/tools/pytest_fixtures/orm.py:100-111):
- Worker A resets the database via aiida_profile_clean
- Worker A calls Computer.collection.get(label='localhost', ...) -> NotExistent
- Worker B (or another fixture in the same worker) also calls .get(...) -> NotExistent
- Worker A creates and stores the computer -> succeeds
- Worker B tries to create the same computer -> IntegrityError on the label unique constraint
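The interleaving above can be reproduced in isolation. The following is a minimal sketch using only the standard-library sqlite3 module, not AiiDA's ORM: the table `computer` and the helpers `exists` and `create` are hypothetical stand-ins for db_dbcomputer and the fixture's get/store calls, and the two "workers" are simulated by ordering the calls by hand rather than by real processes.

```python
import sqlite3

# Hypothetical stand-in for db_dbcomputer: only the UNIQUE label matters here.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE computer (id INTEGER PRIMARY KEY, label TEXT UNIQUE)")


def exists(label):
    """Stand-in for Computer.collection.get(): True if the label is stored."""
    row = conn.execute("SELECT 1 FROM computer WHERE label = ?", (label,)).fetchone()
    return row is not None


def create(label):
    """Stand-in for Computer(...).store(): insert a row with this label."""
    conn.execute("INSERT INTO computer (label) VALUES (?)", (label,))
    conn.commit()


# Both workers check first, and both see an empty table (it was just reset).
worker_a_found = exists("localhost")
worker_b_found = exists("localhost")

# Worker A stores the computer; this succeeds.
if not worker_a_found:
    create("localhost")

# Worker B acts on its stale check result and hits the UNIQUE constraint.
error = None
if not worker_b_found:
    try:
        create("localhost")
    except sqlite3.IntegrityError as exc:
        error = str(exc)

print(error)
```

The point of the sketch is that the check and the insert are two separate steps, so any other writer (or a reset) that runs between them invalidates the result of the check.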
Failure Category 2: Worker crashes (5 tests)
In the same CI run, gw0 crashed ("node down: Not properly terminated") during five different tests across five jobs, with no Python traceback:
| Job | Test running when worker crashed |
| --- | --- |
| tests-presto | tests/tools/archive/orm/test_comments.py::test_exclude_comments_flag |
| tests (3.14, psql, rmq) | tests/restapi/test_routes.py::TestRestApi::test_computers_orderby_schedulertype_desc |
| tests (3.10, psql, zmq) | tests/storage/psql_dos/test_backend.py::test_get_info |
| tests (3.14, psql, zmq) | tests/orm/implementation/test_nodes.py::TestBackendNode::test_clear_attributes |
| tests (3.10, psql, rmq) | tests/restapi/test_identifiers.py::test_full_type_unregistered[WorkFunctionNode] |
These are likely caused by the same reset_storage() race corrupting database state mid-operation in another worker, leading to segfaults or fatal errors in the storage/ORM layer.
Possible fixes
1. Give each xdist worker its own profile/database -- eliminates all shared-state races but increases resource usage.
2. Avoid aiida_profile_clean in xdist runs -- use unique labels/data per test instead of wiping the database.
3. Handle IntegrityError in aiida_computer fixture -- catch the error and retry with a .get() call:

       try:
           computer = Computer(label=label, ...).store()
       except IntegrityError:
           computer = Computer.collection.get(label=label)

   This only addresses Category 1 and doesn't fix the worker crashes.
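A runnable illustration of the catch-and-retry pattern, sketched with the standard-library sqlite3 module rather than AiiDA's ORM. The `computer` table and the `get_or_create` helper are hypothetical; the real fixture would presumably catch the exception raised by store() and fall back to Computer.collection.get, as in the snippet above.

```python
import sqlite3

# Hypothetical table with the same kind of UNIQUE constraint as db_dbcomputer.label.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE computer (id INTEGER PRIMARY KEY, label TEXT UNIQUE)")


def get_or_create(label):
    """Insert the label, or fetch the existing row if another caller won the race."""
    try:
        cursor = conn.execute("INSERT INTO computer (label) VALUES (?)", (label,))
        conn.commit()
        return cursor.lastrowid
    except sqlite3.IntegrityError:
        # The label is already stored: discard our insert and reuse the existing row.
        conn.rollback()
        row = conn.execute(
            "SELECT id FROM computer WHERE label = ?", (label,)
        ).fetchone()
        return row[0]


first = get_or_create("localhost")
second = get_or_create("localhost")  # takes the except branch instead of raising
row_count = conn.execute("SELECT COUNT(*) FROM computer").fetchone()[0]
```

If another worker resets the database between the failed insert and the follow-up query, the lookup can still come back empty, so this narrows the race window without closing it.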
Option 1 is the most robust since it eliminates the root cause for both failure categories.
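As a rough sketch of option 1: pytest-xdist exports the worker id (for example gw0) in the PYTEST_XDIST_WORKER environment variable of each worker process, so a per-worker database or profile name can be derived from it. The helper `worker_database_name` and the `test_aiida` prefix are hypothetical; how the name would be wired into the aiida_profile fixture is not shown and depends on how that fixture is configured.

```python
import os


def worker_database_name(base="test_aiida", environ=None):
    """Return a database name unique to the current pytest-xdist worker.

    Falls back to ``base`` when the variable is unset, i.e. when the tests
    run without xdist.
    """
    environ = os.environ if environ is None else environ
    worker_id = environ.get("PYTEST_XDIST_WORKER")
    return f"{base}_{worker_id}" if worker_id else base


print(worker_database_name(environ={"PYTEST_XDIST_WORKER": "gw0"}))  # test_aiida_gw0
print(worker_database_name(environ={}))  # test_aiida
```

With one database per worker, a reset_storage() call in one worker can no longer wipe rows that another worker is reading or writing.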