493 lines
23 KiB
ReStructuredText
493 lines
23 KiB
ReStructuredText
PEP: 600
|
||
Title: Future 'manylinux' Platform Tags for Portable Linux Built Distributions
|
||
Version: $Revision$
|
||
Last-Modified: $Date$
|
||
Author: Nathaniel J. Smith <njs@pobox.com>
|
||
Thomas Kluyver <thomas@kluyver.me.uk>
|
||
Sponsor: Paul Moore <p.f.moore@gmail.com>
|
||
BDFL-Delegate: Paul Moore <p.f.moore@gmail.com>
|
||
Discussions-To: Discourse https://discuss.python.org/t/the-next-manylinux-specification/1043
|
||
Status: Accepted
|
||
Type: Informational
|
||
Content-Type: text/x-rst
|
||
Created: 3-May-2019
|
||
Post-History: 3-May-2019
|
||
Resolution: https://discuss.python.org/t/pep-600-future-manylinux-platform-tags-for-portable-linux-built-distributions/2414/27
|
||
|
||
Abstract
|
||
========
|
||
|
||
This PEP proposes a scheme for new 'manylinux' wheel tags to be
|
||
defined without requiring a PEP for every specific tag, similar to how
|
||
Windows and macOS tags already work. This will allow package
|
||
maintainers to take advantage of new tags more quickly, while making
|
||
better use of limited volunteer time.
|
||
|
||
Non-goals include: handling non-glibc-based platforms; integrating
|
||
with external package managers or handling external dependencies such
|
||
as CUDA; making manylinux tags more sophisticated than their
|
||
Windows/macOS equivalents; doing anything besides taking our existing
|
||
tried-and-tested approach and streamlining it. These are important
|
||
issues and other PEPs may address them in the future, but for this PEP
|
||
they're out of scope.
|
||
|
||
|
||
Rationale
|
||
=========
|
||
|
||
Python users appreciate it when PyPI has pre-compiled packages for
|
||
their platform, because it makes installation fast and simple. But
|
||
distributing pre-compiled binaries on Linux is challenging because of
|
||
the diversity of Linux-based platforms. For example, Debian, Android,
|
||
and Alpine all use the Linux kernel, but with radically different
|
||
userspace libraries, which makes it difficult or impossible to create
|
||
a single wheel that works on all three. This complexity has caused
|
||
many previous discussions of Linux wheels to stall out.
|
||
|
||
The "manylinux" project succeeded by adopting a strategy of ruthless
|
||
pragmatism. We chose a large but tractable set of Linux platforms –
|
||
specifically, mainstream glibc-based distributions like Debian,
|
||
OpenSuSE, Ubuntu, RHEL, etc. – and then we did whatever it takes to
|
||
make wheels that work across all these platforms.
|
||
|
||
This approach requires many compromises. Manylinux wheels can only
|
||
rely on external libraries that maintain a consistent ABI and are
|
||
universally available across all these distributions, which in
|
||
practice restricts them to a small set of core libraries like glibc
|
||
and a few others. Wheels have to be built on carefully-chosen
|
||
platforms of the oldest possible vintage, using a Python that is
|
||
itself built in a carefully-chosen configuration. Other shared library
|
||
dependencies have to be bundled into the wheel, which requires a
|
||
complex process to avoid collisions between unrelated wheels. And
|
||
finally, the details of these requirements change over time, as new
|
||
distro versions are released, and old ones fall out of use.
|
||
|
||
It turns out that these requirements are not too onerous: they're
|
||
essentially equivalent to what you have to do to ship Windows or macOS
|
||
wheels, and the manylinux approach has achieved substantial uptake
|
||
among both package maintainers and end-users. But any manylinux PEP
|
||
needs some way to address these complexities.
|
||
|
||
In previous manylinux PEPs (:pep:`513`, :pep:`571`), we've done this
|
||
by attempting to write down in the PEP the exact set of libraries,
|
||
symbol versions, Python configuration, etc. that we believed would
|
||
lead to wheels that work on all mainstream glibc-based Linux systems.
|
||
But this created several problems:
|
||
|
||
First, PEPs are generally supposed to be normative references: if
|
||
software doesn't match the PEP, then we fix the software. But in this
|
||
case, the PEPs are attempting to describe Linux distributions, which
|
||
are a moving target, and do not consider our PEPs to constrain their
|
||
behavior. This means that we've been taking on an unbounded commitment
|
||
to keep updating every manylinux PEP whenever the Linux distro
|
||
landscape changes. This is a substantial commitment for unfunded
|
||
volunteers to take on, and it's not clear that this work produces
|
||
value for our users.
|
||
|
||
And second, every time we move manylinux forward to a newer range of
|
||
supported platforms, or add support for a new architecture, we have to
|
||
go through a fairly elaborate process: writing a new PEP, updating the
|
||
PyPI and pip codebases to recognize the new tag, waiting for the new
|
||
pip to percolate to users, etc. None of this happens on Windows/macOS;
|
||
it's only a tax on Linux maintainers. This slows deployment of new
|
||
manylinux versions, and consumes part of our community's limited PEP
|
||
review bandwidth, thus slowing progress of the Python packaging
|
||
ecosystem as a whole. This is especially problematic for less-popular
|
||
architectures, who have less volunteer resources to overcome these
|
||
barriers.
|
||
|
||
How can we fix it?
|
||
|
||
A manylinux PEP has to address three main audiences:
|
||
|
||
- **Package installers**, like pip, need to be able to determine which
|
||
wheel tags are compatible with the system they find themselves
|
||
running on. This requires some automated process to introspect the
|
||
system and match it up with wheel tags.
|
||
|
||
- **Package indexes**, like PyPI, need to be able to validate which
|
||
wheel tags are valid. Generally, this just requires something like a
|
||
list of valid tags, or regex they match, with no need to know
|
||
anything about the actual semantics for individual tags. (But see
|
||
the discussion of upload verification below.)
|
||
|
||
- **Package maintainers** need to be able to build wheels that meet
|
||
the requirements for a given wheel tag.
|
||
|
||
Here's the key insight behind this new PEP: it's crucial that
|
||
different **package installers** and **package indexes** all agree on
|
||
which manylinux tags are valid and which systems they install on, so
|
||
we need a PEP to specify these – but, these are straightforward, and
|
||
don't really change between manylinux versions. The complicated part
|
||
that keeps changing is the process of actually **building the wheels**
|
||
– but, if there are multiple competing build environments, it *doesn't
|
||
matter* whether they use exactly the same rules as each other, as long
|
||
as they all produce wheels that work on end-user systems. Therefore,
|
||
we don't need an interoperability standard for building wheels, so we
|
||
don't need to write the details into a PEP.
|
||
|
||
To further convince ourselves that this approach will work, let's look
|
||
again at how we handle wheels on Windows and macOS: the PEPs describe
|
||
which tags are valid, and which systems they're supposed to work on,
|
||
but not how to actually build wheels for those platforms. And in
|
||
practice, if you want to distribute Windows or macOS wheels, you might
|
||
have to jump through some complicated and poorly documented hoops in
|
||
order to bundle dependencies, target the right range of OS versions,
|
||
etc. But the system works, and the way to improve it is to write
|
||
better docs and build better tooling; no-one thinks that the way to
|
||
make Windows wheels work better is to publish a PEP describing
|
||
which symbols we think Microsoft should be including in their
|
||
libraries and how their linker ought to work. This PEP extends that
|
||
philosophy to manylinux as well.
|
||
|
||
|
||
Specification
|
||
=============
|
||
|
||
Core definition
|
||
---------------
|
||
|
||
Tags using the new scheme will look like::
|
||
|
||
manylinux_2_17_x86_64
|
||
|
||
Or more generally::
|
||
|
||
manylinux_${GLIBCMAJOR}_${GLIBCMINOR}_${ARCH}
|
||
|
||
This tag is a promise: the wheel's creator promises that the wheel
|
||
will work on any mainstream Linux distro that uses glibc version
|
||
``${GLIBCMAJOR}.${GLIBCMINOR}`` or later, and where the ``${ARCH}``
|
||
matches the return value from ``distutils.util.get_platform()``. (For
|
||
more detail about architecture tags, see :pep:`425`.)
|
||
|
||
If a user installs this wheel into an environment that matches these
|
||
requirements and it doesn't work, then that wheel does not comply with
|
||
this specification. This should be considered a bug in the wheel, and
|
||
it's the wheel creator's responsibility to look for a fix (possibly
|
||
with the help of the broader community).
|
||
|
||
The word "mainstream" is intentionally somewhat vague, and should be
|
||
interpreted expansively. The goal is to rule out weird homebrew Linux
|
||
systems; generally any distro you've actually heard of should be
|
||
considered "mainstream". We also provide a way for maintainers of
|
||
"weird" distros to manually override this check, though based on
|
||
experience with previous manylinux PEPs, we don't expect this feature
|
||
to see much use.
|
||
|
||
And finally, compliant wheels are required to "play well with others",
|
||
i.e., installing a manylinux wheel must not cause other unrelated
|
||
packages to break.
|
||
|
||
Any method of producing wheels which meets these criteria is
|
||
acceptable. However, in practice we expect that the auditwheel project
|
||
will maintain an up-to-date set of tools and build images for
|
||
producing manylinux wheels, as well as documentation about how they
|
||
work and how to use them, and that most maintainers will want to use
|
||
those. For the latest information on building manylinux wheels,
|
||
including recommendations about which build images to use, see
|
||
https://packaging.python.org.
|
||
|
||
Since these requirements are fairly high-level, here are some examples
|
||
of how they play out in specific situations:
|
||
|
||
Example: if a wheel is tagged as ``manylinux_2_17_x86_64``, but it
|
||
uses symbols that were only added in glibc 2.18, then that wheel won't
|
||
work on systems with glibc 2.17. Therefore, we can conclude that this
|
||
wheel is in violation of this specification.
|
||
|
||
Example: Until ~2017, all major Linux distros included
|
||
``libncursesw.so.5`` as part of their default install. Until that
|
||
date, a wheel that linked to ``libncursesw.so.5`` was compliant with
|
||
this specification. Then, distros started switching to ncurses 6,
|
||
which has a different name and incompatible ABI, and stopped
|
||
installing ``libncursesw.so.5`` by default. So after that date, a
|
||
wheel that links to ``libncursesw.so.5`` was no longer compliant with
|
||
this specification.
|
||
|
||
Example: The Linux ELF linker places all shared library SONAMEs into a
|
||
single process-global namespace. If independent wheels used the same
|
||
SONAME for their bundled libraries, they might end up colliding and
|
||
using the wrong library version, which would violate the "play well
|
||
with others" rule. Therefore, this specification requires that wheels
|
||
use globally-unique names for all bundled libraries. (Auditwheel
|
||
currently accomplishes this by renaming all bundled libraries to
|
||
include a globally-unique hash.)
|
||
|
||
Example: we've observed certain wheels using C++ in ways that
|
||
`interfere with other packages
|
||
<https://github.com/apache/arrow/pull/2210>`__ via an unclear
|
||
mechanism. This is also a violation of the "play well with others"
|
||
rule, so those wheels aren't compliant with this specification.
|
||
|
||
Example: The imaginary architecture LEG v7 has both big-endian and
|
||
little-endian variants. Big-endian binaries require a big-endian
|
||
system, and little-endian binaries require a little-endian system. But
|
||
unfortunately, it's discovered that due to a bug in :pep:`425`, both
|
||
variants use the same architecture tag, ``legv7``. This makes it
|
||
impossible to create a compliant ``manylinux_2_17_legv7`` wheel: no
|
||
matter what we do, it will crash on some user's systems. So, we write
|
||
a new PEP defining architecture tags ``legv7le`` and ``legv7be``; now
|
||
we can ship manylinux LEG v7 wheels.
|
||
|
||
Example: There's also a LEG v8. It also has big-endian and
|
||
little-endian variants. But fortunately, it turns out that :pep:`425`
|
||
already does the right thing LEG v8, so LEG v8 enthusiasts can start
|
||
shipping ``manylinux_2_17_legv8le`` and ``manylinux_2_17_legv8be``
|
||
wheels immediately once this PEP is implemented, even though the
|
||
authors of this PEP don't know anything at all about LEG v8.
|
||
|
||
|
||
Legacy manylinux tags
|
||
---------------------
|
||
|
||
The existing manylinux tags are redefined as aliases for new-style
|
||
tags:
|
||
|
||
- ``manylinux1_x86_64`` is now an alias for ``manylinux_2_5_x86_64``
|
||
- ``manylinux1_i686`` is now an alias for ``manylinux_2_5_i686``
|
||
- ``manylinux2010_x86_64`` is now an alias for ``manylinux_2_12_x86_64``
|
||
- ``manylinux2010_i686`` is now an alias for ``manylinux_2_12_i686``
|
||
|
||
This redefinition is largely a no-op, but does affect a few things:
|
||
|
||
- Previously, we had an open-ended and growing commitment to keep
|
||
updating every manylinux PEP whenever a new Linux distro was
|
||
released, for the rest of time. By making this PEP normative for the
|
||
older tags, that obligation goes away. When this PEP is accepted,
|
||
the previous manylinux PEPs will receive a final update noting that
|
||
they are no longer maintained and referring to this PEP.
|
||
|
||
- The "play well with others" rule was always intended, but previous
|
||
PEPs didn't state it explicitly; now it's explicit.
|
||
|
||
- Previous PEPs assumed that glibc 3.x might be incompatible with
|
||
glibc 2.x, so we checked for compatibility between a system and a
|
||
tag using logic like::
|
||
|
||
sys_major == tag_major and sys_minor >= tag_minor
|
||
|
||
Recently the glibc maintainers `advised us
|
||
<https://sourceware.org/bugzilla/show_bug.cgi?id=24636>`__ that we
|
||
should assume that glibc will maintain backwards-compatibility
|
||
indefinitely, even if they bump the major version number. So the new
|
||
check for compatibility is::
|
||
|
||
(sys_major, sys_minor) >= (tag_major, tag_minor)
|
||
|
||
|
||
Package installers
|
||
------------------
|
||
|
||
Generally, package installers should install manylinux wheels on
|
||
systems that have an appropriate glibc and architecture, and not
|
||
otherwise. If there are multiple compatible manylinux wheels
|
||
available, then the wheel with the highest glibc version should be
|
||
preferred, in order to take advantage of newer compilers and glibc
|
||
features.
|
||
|
||
In addition, we follow previous specifications, and allow for Python
|
||
distributors to manually override this check by adding a
|
||
``_manylinux`` module to their standard library. If this package is
|
||
importable, and if it defines a function called
|
||
``manylinux_compatible``, then package installers should call this
|
||
function, passing in the major version, minor version, and
|
||
architecture from the manylinux tag, and it will either return a
|
||
boolean saying whether wheels with the given tag should be considered
|
||
compatible with the current system, or else ``None`` to indicate that
|
||
the default logic should be used.
|
||
|
||
For compatibility with previous specifications, if the tag is
|
||
``manylinux1`` or ``manylinux_2_5`` exactly, then we also check the
|
||
module for a boolean attribute ``manylinux1_compatible``, and if the
|
||
tag version is ``manylinux2010`` or ``manylinux_2_12`` exactly, then
|
||
we also check the module for a boolean attribute
|
||
``manylinux2010_compatible``. If both the new and old attributes are
|
||
defined, then ``manylinux_compatible`` takes precedence.
|
||
|
||
Here's some example code. You don't have to actually use this code,
|
||
but you can use it for reference if you have questions about the exact
|
||
semantics::
|
||
|
||
LEGACY_ALIASES = {
|
||
"manylinux1_x86_64": "manylinux_2_5_x86_64",
|
||
"manylinux1_i686": "manylinux_2_5_i686",
|
||
"manylinux2010_x86_64": "manylinux_2_12_x86_64",
|
||
"manylinux2010_i686": "manylinux_2_12_i686",
|
||
}
|
||
|
||
def manylinux_tag_is_compatible_with_this_system(tag):
|
||
# Normalize and parse the tag
|
||
tag = LEGACY_ALIASES.get(tag, tag)
|
||
m = re.match("manylinux_([0-9]+)_([0-9]+)_(.*)", tag)
|
||
if not m:
|
||
return False
|
||
tag_major_str, tag_minor_str, tag_arch = m.groups()
|
||
tag_major = int(tag_major_str)
|
||
tag_minor = int(tag_minor_str)
|
||
|
||
if not system_uses_glibc():
|
||
return False
|
||
sys_major, sys_minor = get_system_glibc_version()
|
||
if (sys_major, sys_minor) < (tag_major, tag_minor):
|
||
return False
|
||
sys_arch = get_system_arch()
|
||
if sys_arch != tag_arch:
|
||
return False
|
||
|
||
# Check for manual override
|
||
try:
|
||
import _manylinux
|
||
except ImportError:
|
||
pass
|
||
else:
|
||
if hasattr(_manylinux, "manylinux_compatible"):
|
||
result = _manylinux.manylinux_compatible(
|
||
tag_major, tag_minor, tag_arch,
|
||
)
|
||
if result is not None:
|
||
return bool(result)
|
||
else:
|
||
if (tag_major, tag_minor) == (2, 5):
|
||
if hasattr(_manylinux, "manylinux1_compatible"):
|
||
return bool(_manylinux.manylinux1_compatible)
|
||
if (tag_major, tag_minor) == (2, 12):
|
||
if hasattr(_manylinux, "manylinux2010_compatible"):
|
||
return bool(_manylinux.manylinux2010_compatible)
|
||
|
||
return True
|
||
|
||
|
||
Package indexes
|
||
---------------
|
||
|
||
The exact set of wheel tags accepted by PyPI, or any package index, is
|
||
a policy question, and up to the maintainers of that index. But, we
|
||
recommend that package indexes accept any wheels whose platform tag
|
||
matches the following regexes:
|
||
|
||
- ``manylinux1_(x86_64|i686)``
|
||
- ``manylinux2010_(x86_64|i686)``
|
||
- ``manylinux_[0-9]+_[0-9]+_(.*)``
|
||
|
||
Package indexes may impose additional requirements; for example, they
|
||
might audit uploaded wheels and reject those that contain known
|
||
problems, such as a ``manylinux_2_17`` wheel that references symbols
|
||
from later glibc versions, or dependencies on external libraries that
|
||
are known not to exist on all systems. Or a package index might decide
|
||
to be conservative and reject wheels tagged ``manylinux_2_999``, on
|
||
the grounds that no-one knows what the Linux distro landscape will
|
||
look like when glibc 2.999 is released. We leave the details of any
|
||
such checks to the discretion of the package index maintainers.
|
||
|
||
|
||
Rejected alternatives
|
||
=====================
|
||
|
||
**Continuing the manylinux20XX series**: As discussed above, this
|
||
leads to much more effort-intensive, slower, and more complex rollouts
|
||
of new versions. And while there are two places where it seems at
|
||
first to have some compensating benefits, if you look more closely
|
||
this turns out not to be the case.
|
||
|
||
First, this forces us to produce human-readable descriptions of how
|
||
Linux distros work, in the text of the PEP. But this is less valuable
|
||
than it might seem at first, and can actually be handled better by the
|
||
new "perennial" approach anyway.
|
||
|
||
If you're trying to build wheels, the main thing you need is a
|
||
tutorial on how to use the build images and tooling around them. If
|
||
you're trying to add support for a new build profile or create a
|
||
competitor to auditwheel, then your best resources will be the
|
||
auditwheel source code and issue tracker, which are always going to be
|
||
more detailed, precise, and reliable than a summary spec written in
|
||
English and without tests. Documentation like the old manylinux20XX
|
||
PEPs does add value! But in both cases, it's primarily as a secondary
|
||
reference to provide overview and context.
|
||
|
||
And furthermore, the PEP process is poorly suited to maintaining this
|
||
kind of reference documentation – there's a reason we don't keep the
|
||
pip user manual in the PEPs repository! The auditwheel maintainers are
|
||
the best situated to understand what kinds of documentation are useful
|
||
to their users, and to maintain that documentation over time. For
|
||
example, there's substantial overlap between the different manylinux
|
||
versions, and the PEP process currently forces us to handle this by
|
||
copy-pasting everything between a growing list of documents; instead,
|
||
the auditwheel maintainers might choose to factor out the common parts
|
||
into a single piece of shared documentation.
|
||
|
||
A related concern was that with the perennial approach, it may become
|
||
harder for package maintainers to decide which build profile to
|
||
target: instead of having to pick between ``manylinux1``,
|
||
``manylinux2010``, ``manylinux2014``, ..., they now have a wider array
|
||
of options like ``manylinux_2_5``, ``manylinux_2_6``, ...,
|
||
``manylinux_2_20``, ... But again, we don't believe this will be a
|
||
problem in practice. In either system, most package maintainers won't
|
||
be starting by reading PEPs and trying to implement them from scratch.
|
||
If you're a particularly expert and ambitious package maintainer who
|
||
needs to target a new version or new architecture, the perennial
|
||
approach gives you additional flexibility. But for regular everyday
|
||
maintainers, we expect they'll start from a tutorial like
|
||
packaging.python.org, and by choosing from existing build images. A
|
||
tutorial can just as easily recommend ``manylinux_2_17`` as it can
|
||
recommend ``manylinux2014``, and we expect the actual set of
|
||
pre-provided build images to be identical in both cases. And again, by
|
||
maintaining this documentation in the right place, instead of trying
|
||
to do it PEPs repository, we expect that we'll end up with
|
||
documentation that's higher-quality and more fitted to purpose.
|
||
|
||
Finally, some participants have pointed out that it's very nice to be
|
||
able to look at a wheel and tell definitively whether it meets the
|
||
requirements of the spec. With the new "perennial" approach, we can
|
||
never say with 100% certainty that a wheel does meet the spec, because
|
||
that depends on the Linux distros. As engineers we have a
|
||
well-justified dislike for that kind of uncertainty.
|
||
|
||
However: as demonstrated by the examples above, we can still tell
|
||
definitively when a wheel *doesn't* meet the spec, which turns out to
|
||
be what's important in practice. And, in practice, with the
|
||
manylinux20XX approach, whenever distros change, we actually change
|
||
the spec; it takes a bit longer. So even if a wheel was compliant
|
||
today, it might be become non-compliant tomorrow. This is frustrating,
|
||
but unfortunately this uncertainty is unavoidable if what you care
|
||
about is distributing working wheels to users.
|
||
|
||
So even on these points where the old approach initially seems to have
|
||
advantages, we expect the new approach to actually do as well or
|
||
better.
|
||
|
||
**Switching to perennial tags, but continuing to write a PEP for each
|
||
version**: This was proposed as a kind of hybrid, to try to get some
|
||
of the advantages of the perennial tagging system – like easier
|
||
rollouts of new versions – while keeping the advantages of the
|
||
manylinux20XX scheme, like forcing us to write documentation about
|
||
Linux distros, simplifying options for package maintainers, and being
|
||
able to definitively tell when a wheel meets the spec. But as
|
||
discussed above, on a closer look, it turns out that these advantages
|
||
are largely illusory. And this also inherits significant
|
||
*dis*\advantages from the manylinux20XX scheme, like creating
|
||
indefinite obligations to update a growing list of copy-pasted PEPs.
|
||
|
||
**Making auditwheel normative**: Another possibility that was
|
||
considered was to make auditwheel the normative reference on the
|
||
definition of manylinux, i.e., a wheel would be compliant if and only
|
||
if ``auditwheel check`` completed without errors. This was rejected
|
||
because the point of packaging PEPs is to define interoperability
|
||
between tools, not to bless specific tools.
|
||
|
||
**Adding extra words to the tag string**: Another proposal we
|
||
considered was to add extra words to the wheel tag, e.g.
|
||
``manylinux_glibc_2_17`` instead of ``manylinux_2_17``. The motivation
|
||
would be to leave the door open to other kinds of versioning
|
||
heuristics in the future – for example, we could have
|
||
``manylinux_glibc_$VERSION`` and ``manylinux_alpine_$VERSION``.
|
||
|
||
But "manylinux" has always been a synonym for "broad compatibility
|
||
with mainstream glibc-based distros"; reusing it for unrelated build
|
||
profiles like alpine is more confusing than helpful. Also, some early
|
||
reviewers who aren't steeped in the details of packaging found the
|
||
word ``glibc`` actively misleading, jumping to the conclusion that it
|
||
meant they needed a system with *exactly* that glibc version. And tags
|
||
like ``manylinux_$VERSION`` and ``alpine_$VERSION`` also have the
|
||
advantages of parsimony and directness. So we'll go with that.
|