Skip to content

Fix unoptimized wheels#152

Merged
rhpvorderman merged 1 commit intomainfrom
wheelfix
May 27, 2025
Merged

Fix unoptimized wheels#152
rhpvorderman merged 1 commit intomainfrom
wheelfix

Conversation

@marcelm
Copy link
Copy Markdown
Owner

@marcelm marcelm commented May 27, 2025

setuptools starting from version 75.7.0 changed how CFLAGS work: They no longer are added to the CFLAGS Python was compiled with, but replace them.

With this change, our CFLAGS of -g0 -DNDEBUG get used directly and the wheels are compiled without proper optimizations because -O2 is missing.

Within cibuildwheel, the Python compiler options that we "lose" by this change are:

-fno-strict-overflow -Wsign-compare -DNDEBUG -g -O3 -Wall

Of these, we really only require -O3 and -DNDEBUG, so the easiest fix should be to just add -O3 to the CFLAGS that we provide.

Found by @rhpvorderman in marcelm/cutadapt#843

setuptools starting from version 75.7.0 changed how CFLAGS work: They no
longer are added to the CFLAGS Python was compiled with, but replace them.

With this change, our CFLAGS of `-g0 -DNDEBUG` get used directly and the
wheels are compiled without proper optimizations because `-O2` is missing.

Within cibuildwheel, the Python compiler options are:

    -fno-strict-overflow -Wsign-compare -DNDEBUG -g -O3 -Wall

Of these, we really only require `-O3` and `-DNDEBUG`, so the easiest fix
should be to just add `-O3` to the CFLAGS that we provide.

Found by @rhpvorderman in marcelm/cutadapt#843
@rhpvorderman
Copy link
Copy Markdown
Collaborator

I checked all of the included pythons:

./cpython-3.11.12/bin/python
-Wsign-compare -DNDEBUG -g -fwrapv -O3 -Wall
./cpython-3.13.3/bin/python
-fno-strict-overflow -Wsign-compare -DNDEBUG -g -O3 -Wall
./cpython-3.10.17/bin/python
-Wno-unused-result -Wsign-compare -DNDEBUG -g -fwrapv -O3 -Wall
./cpython-3.12.10/bin/python
-fno-strict-overflow -Wsign-compare -DNDEBUG -g -O3 -Wall
./cpython-3.14.0b1-nogil/bin/python
-fno-strict-overflow -Wsign-compare -DNDEBUG -g -O3 -Wall
./cpython-3.8.20/bin/python
-Wno-unused-result -Wsign-compare -DNDEBUG -g -fwrapv -O3 -Wall
./cpython-3.13.3-nogil/bin/python
-fno-strict-overflow -Wsign-compare -DNDEBUG -g -O3 -Wall
./pp311-pypy311_pp73/bin/python
-DNDEBUG -O2
./cpython-3.14.0b1/bin/python
-fno-strict-overflow -Wsign-compare -DNDEBUG -g -O3 -Wall
./pp310-pypy310_pp73/bin/python
-DNDEBUG -O2
./cpython-3.9.22/bin/python
-Wno-unused-result -Wsign-compare -DNDEBUG -g -fwrapv -O3 -Wall
./pipx/venvs/uv/bin/python
-fno-strict-overflow -Wsign-compare -DNDEBUG -g -O3 -Wall
./pipx/venvs/patchelf/bin/python
-fno-strict-overflow -Wsign-compare -DNDEBUG -g -O3 -Wall
./pipx/venvs/swig/lib/python3.12/site-packages/swig/data/share/swig/4.3.1/python
bash: ./pipx/venvs/swig/lib/python3.12/site-packages/swig/data/share/swig/4.3.1/python: Is a directory
./pipx/venvs/swig/bin/python
-fno-strict-overflow -Wsign-compare -DNDEBUG -g -O3 -Wall
./pipx/venvs/auditwheel/bin/python
-fno-strict-overflow -Wsign-compare -DNDEBUG -g -O3 -Wall
./pipx/venvs/cmake/bin/python
-fno-strict-overflow -Wsign-compare -DNDEBUG -g -O3 -Wall
./pipx/shared/bin/python
-fno-strict-overflow -Wsign-compare -DNDEBUG -g -O3 -Wall
./tools/bin/python
-fno-strict-overflow -Wsign-compare -DNDEBUG -g -O3 -Wall

Some of them allso use -fwrapv (guaranteeing that signed integers always wrap) but I don't think that is very relevant for our use cases. (In fact, I would set -fstrict-overflow, signed overflows are undefined, so they don't happen).

@marcelm
Copy link
Copy Markdown
Owner Author

marcelm commented May 27, 2025

Agreed, and thanks for checking! So ok to merge?

@rhpvorderman rhpvorderman merged commit 7998d91 into main May 27, 2025
25 checks passed
@rhpvorderman rhpvorderman deleted the wheelfix branch May 27, 2025 07:42
@rhpvorderman
Copy link
Copy Markdown
Collaborator

Yes! And thanks for looking at the CFLAGS. I will update sequali again. For now 1.0.1 with the "big wheels" is fine, but next release I want to make sure that the debug information is gone.

I will have the shakespearian dialogue with myself "To -O3 or not -O3" and decide whether I shall use -O3 or -O2 by default. O3 has better autovectorization, but is sometimes too greedy. I think I will go that route as greedy algorithms are usually quite favourable for our data (150 bp strings at least).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants