What's the problem this feature will solve?
By default, the wheel platform detection pip does on Linux isn’t very advanced: it produces a platform tag like linux_x86_64, which is the same across nearly all Linux installations.
For packages with compiled bits that link against system-installed shared objects (.so files), it’s necessary to use different wheels for systems with different versions of the shared libraries. For example, a wheel built on Ubuntu 14.04 will not necessarily work on Ubuntu 16.04.
Being able to specify a custom platform name would allow building the same package version on different systems to produce two wheels with unique filenames that can be served from the same PyPI registry. For example, you could produce two wheels with the (already valid) filenames:
numpy-1.9.2-cp36-cp36m-linux_ubuntu_14_04_x86_64.whl
numpy-1.9.2-cp36-cp36m-linux_ubuntu_16_04_x86_64.whl
These wheels can be distributed to package consumers (e.g. internally inside a company), who get the benefits of quick wheel installation (no compile-on-install) and the possibility to work from several different platforms.
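To make the role of the platform tag concrete, here is a minimal sketch of how a wheel filename breaks down into its dash-separated components (name, version, optional build tag, Python tag, ABI tag, platform tag). The helper name `parse_wheel_filename` is hypothetical and this is not pip's own parser; it only illustrates that the two example filenames differ in nothing but the platform tag.

```python
def parse_wheel_filename(filename):
    """Split a wheel filename into its dash-separated components.

    Hypothetical helper for illustration only; pip has its own parser.
    """
    suffix = ".whl"
    if not filename.endswith(suffix):
        raise ValueError(f"not a wheel filename: {filename!r}")
    parts = filename[: -len(suffix)].split("-")
    if len(parts) == 5:
        # No build tag present.
        name, version, python_tag, abi_tag, platform_tag = parts
        build_tag = None
    elif len(parts) == 6:
        # Optional build tag sits between the version and the Python tag.
        name, version, build_tag, python_tag, abi_tag, platform_tag = parts
    else:
        raise ValueError(f"unexpected number of components in {filename!r}")
    return {
        "name": name,
        "version": version,
        "build": build_tag,
        "python": python_tag,
        "abi": abi_tag,
        "platform": platform_tag,
    }


if __name__ == "__main__":
    for wheel in (
        "numpy-1.9.2-cp36-cp36m-linux_ubuntu_14_04_x86_64.whl",
        "numpy-1.9.2-cp36-cp36m-linux_ubuntu_16_04_x86_64.whl",
    ):
        print(parse_wheel_filename(wheel)["platform"])
```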
Describe the solution you'd like
We would like to add a --platform flag to the pip install and pip wheel commands, with identical behavior to the existing --platform flag on the pip download command. This change would allow building and installing wheels with a custom, user-provided platform name.
At $COMPANY, we’ve been building wheels with a custom platform name (just like the numpy example above) for several years. We upload these wheels to our internal PyPI registry, and install them using pip-custom-platform, a wrapper around pip that adds the --platform flag being requested in this issue.
Because we have hundreds of different internal Python projects, upgrading from one OS version to another is a long process during which we need to support development (and wheels!) on multiple OSes.
Overall, we’ve been very happy with the custom platform approach. The burden on projects and friction for developers is very low, and conceptually it fits in very nicely with the “platform” concept of a wheel (similar to how macOS wheels specify versions in the wheel’s platform tag). The downside for us right now, of course, is that we have to use a monkey-patched version of pip instead of the real thing.
Others have also started using pip-custom-platform for their own use cases, such as specifying custom platform names when building packages for AWS Lambda.
The scope of the proposed change to pip is identical to the --platform flag for pip download: it’s just a flag that allows users to manually specify a platform to use. pip-custom-platform does have functionality to automatically generate platform names based on the Linux distribution, but we think it’s better to leave this complexity out of pip, and instead have users construct the platform name themselves and either pass in this flag manually or as an environment variable (possibly set at the system level by a system administrator, e.g. in /etc/environment).
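As a rough illustration of what "construct the platform name themselves" could look like outside of pip, here is a minimal sketch that derives a name in the style of the numpy example from os-release data. Everything here is an assumption for illustration: the helper names `read_os_release` and `custom_platform_tag` are hypothetical, and the naming scheme is just one possible convention, not necessarily the one pip-custom-platform uses.

```python
import re


def read_os_release(text):
    """Parse the KEY=value lines of an os-release file into a dict."""
    fields = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Values may be wrapped in double quotes, e.g. VERSION_ID="14.04".
        fields[key] = value.strip().strip('"')
    return fields


def custom_platform_tag(distro_id, distro_version, machine):
    """Build a tag such as linux_ubuntu_14_04_x86_64 (assumed naming scheme)."""
    raw = f"linux_{distro_id}_{distro_version}_{machine}"
    # Collapse every run of non-alphanumeric characters into one underscore.
    return re.sub(r"[^A-Za-z0-9]+", "_", raw).lower()


if __name__ == "__main__":
    import platform

    with open("/etc/os-release") as f:
        info = read_os_release(f.read())
    print(custom_platform_tag(info["ID"], info["VERSION_ID"], platform.machine()))
```

The printed value could then be passed to the flag directly, or exported once per machine. pip generally maps long options to PIP_<OPTION> environment variables, so PIP_PLATFORM would be the likely variable name, but that is an assumption until the flag exists on these commands.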
Alternative solutions
We considered several potential alternative solutions to this problem:
- Abandon wheels entirely and compile on installation. This works but makes installation unrealistically slow (20+ minutes for some projects) and requires installing build tooling and library headers (gcc, fortran, libxxx-dev, cmake, etc.) everywhere.
- Run one PyPI registry per distribution, where each serves wheels that carry the linux_x86_64 platform tag but have been compiled on the corresponding platform. This could work, but is pretty unwieldy: every project would need some hackery in its build scripts to select the correct server, plus we’d have to run a ton of these registries (we currently support 3 Ubuntu versions, plus about half a dozen other platforms which are used in specialty cases but are still important to support, e.g. the Amazon Linux AMIs used for EMR). Additionally, if you were to accidentally use the wrong PyPI server and download wheels built for the wrong platform, pip would happily install them without complaint and you wouldn’t notice until you got runtime errors in your code.
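The last point in the list above can be sketched in a few lines. This is a deliberately simplified model, assuming exact matching on the platform tag (real pip checks a wheel against an ordered list of supported tags), and `select_wheels` is a hypothetical helper. It shows why distinct platform names catch a mismatch that identical linux_x86_64 tags cannot.

```python
def select_wheels(filenames, target_platform):
    """Return the wheels whose platform tag matches target_platform.

    Simplified model for illustration: exact match on the last
    dash-separated component of the filename.
    """
    selected = []
    for filename in filenames:
        platform_field = filename[: -len(".whl")].split("-")[-1]
        # A filename may list several platform tags separated by dots.
        if target_platform in platform_field.split("."):
            selected.append(filename)
    return selected


if __name__ == "__main__":
    # Custom names: only the wheel built for this OS is eligible.
    custom = [
        "numpy-1.9.2-cp36-cp36m-linux_ubuntu_14_04_x86_64.whl",
        "numpy-1.9.2-cp36-cp36m-linux_ubuntu_16_04_x86_64.whl",
    ]
    print(select_wheels(custom, "linux_ubuntu_16_04_x86_64"))

    # Generic names: a wheel from the wrong registry is indistinguishable.
    generic = ["numpy-1.9.2-cp36-cp36m-linux_x86_64.whl"]
    print(select_wheels(generic, "linux_x86_64"))
```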
Additional context
Why can’t we use manylinux wheels?
manylinux wheels are a method of providing a single built wheel that works across most Linux installations. In general they do this job pretty well, but unfortunately we’re unable to use them for many packages due to the security concerns associated with them.
Specifically, for packages which need to depend on C libraries outside of the manylinux1 profile (for example, anything that depends on libssl, or almost all of the scientific Python libraries), the choices for producing a manylinux wheel are to either statically link in the dependencies, or to bundle the .so files in the wheel. In both of these cases, it is very difficult to roll out security updates across hundreds of different services, as it may involve patching and rebuilding the affected wheels, then building and deploying all of the services that consume them.
By contrast, rolling out security updates to shared libraries when services don’t bundle or statically link them is typically as easy as your distro’s equivalent of apt-get upgrade to pull in the latest patched version.