PEP 513: Update from Robert

Nick Coghlan 2016-01-27 21:37:05 +10:00
parent 3e7419f50e
commit f97c710fe7
1 changed files with 225 additions and 87 deletions

@ -9,7 +9,7 @@ Status: Draft
Type: Informational
Content-Type: text/x-rst
Created: 19-Jan-2016
Post-History: 19-Jan-2016
Post-History: 19-Jan-2016, 25-Jan-2016
Abstract
@ -17,9 +17,9 @@ Abstract
This PEP proposes the creation of a new platform tag for Python package built
distributions, such as wheels, called ``manylinux1_{x86_64,i386}`` with
external dependencies limited restricted to a standardized subset of
external dependencies limited to a standardized, restricted subset of
the Linux kernel and core userspace ABI. It proposes that PyPI support
uploading and distributing Wheels with this platform tag, and that ``pip``
uploading and distributing wheels with this platform tag, and that ``pip``
support downloading and installing these packages on compatible platforms.
@ -27,17 +27,17 @@ Rationale
=========
Currently, distribution of binary Python extensions for Windows and OS X is
straightforward. Developers and packagers build wheels, which are assigned
platform tags such as ``win32`` or ``macosx_10_6_intel``, and upload these
wheels to PyPI. Users can download and install these wheels using tools such
as ``pip``.
straightforward. Developers and packagers build wheels [1]_ [2]_, which are
assigned platform tags such as ``win32`` or ``macosx_10_6_intel``, and upload
these wheels to PyPI. Users can download and install these wheels using tools
such as ``pip``.
For Linux, the situation is much more delicate. In general, compiled Python
extension modules built on one Linux distribution will not work on other Linux
distributions, or even on the same Linux distribution with different system
libraries installed.
distributions, or even on different machines running the same Linux
distribution with different system libraries installed.
Build tools using PEP 425 platform tags [1]_ do not track information about the
Build tools using PEP 425 platform tags [3]_ do not track information about the
particular Linux distribution or installed system libraries, and instead assign
all wheels the too-vague ``linux_i386`` or ``linux_x86_64`` tags. Because of
this ambiguity, there is no expectation that ``linux``-tagged built
@ -54,8 +54,8 @@ in practice, is compatible enough that packages conforming to this standard
will work on *many* linux systems, including essentially all of the desktop
and server distributions in common use. We know this because there are
companies who have been distributing such widely-portable pre-compiled Python
extension modules for Linux -- e.g. Enthought with Canopy [2]_ and Continuum
Analytics with Anaconda [3]_.
extension modules for Linux -- e.g. Enthought with Canopy [4]_ and Continuum
Analytics with Anaconda [5]_.
Building on the compatibility lessons learned from these companies, we thus
define a baseline ``manylinux1`` platform tag for use by binary Python
@ -93,23 +93,23 @@ libraries.
Versioning of Core Shared Libraries
-----------------------------------
Even if author or maintainers of a Python extension module with to use no
Even if the developers of a Python extension module wish to use no
external shared libraries, the modules will generally have a dynamic runtime
dependency on the GNU C library, ``glibc``. While it is possible, statically
linking ``glibc`` is usually a bad idea because of bloat, and because certain
important C functions like ``dlopen()`` cannot be called from code that
statically links ``glibc``. A runtime shared library dependency on a
system-provided ``glibc`` is unavoidable in practice.
linking ``glibc`` is usually a bad idea because certain important C functions
like ``dlopen()`` cannot be called from code that statically links ``glibc``. A
runtime shared library dependency on a system-provided ``glibc`` is unavoidable
in practice.
The maintainers of the GNU C library follow a strict symbol versioning scheme
for backward compatibility. This ensures that binaries compiled against an older
version of ``glibc`` can run on systems that have a newer ``glibc``. The
opposite is generally not true -- binaries compiled on newer Linux
distributions tend to rely upon versioned functions in glibc that are not
distributions tend to rely upon versioned functions in ``glibc`` that are not
available on older systems.
This generally prevents built distributions compiled on the latest Linux
distributions from being portable.
This generally prevents wheels compiled on the latest Linux distributions
from being portable.
The ``manylinux1`` policy
@ -117,16 +117,17 @@ The ``manylinux1`` policy
For these reasons, to achieve broad portability, Python wheels
* should depend only on an extremely limited set of external shared
libraries; and
* should depend only on ``old`` symbol versions in those external shared
libraries.
* should depend only on an extremely limited set of external shared
libraries; and
* should depend only on "old" symbol versions in those external shared
libraries; and
* should depend only on a widely-compatible kernel ABI.
The ``manylinux1`` policy thus encompasses a standard for what the
permitted external shared libraries a wheel may depend on, and the maximum
depended-upon symbol versions therein.
The permitted external shared libraries are: ::
To be eligible for the ``manylinux1`` platform tag, a Python wheel must
therefore both (a) contain binary executables and compiled code that links
*only* to libraries (other than the appropriate ``libpython`` library, which is
always a permitted dependency consistent with the PEP 425 ABI tag) with SONAMEs
included in the following list: ::
libpanelw.so.5
libncursesw.so.5
@ -150,6 +151,13 @@ The permitted external shared libraries are: ::
libgthread-2.0.so.0
libglib-2.0.so.0
and (b) work on a stock CentOS 5.11 [6]_ system that contains the system
package manager's provided versions of these libraries.
Because CentOS 5 is only available for x86_64 and i386 architectures,
these are the only architectures currently supported by the ``manylinux1``
policy.
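Requirement (a) can be illustrated with a minimal sketch, which is not part of
the PEP or of any tool it references. The set below holds only the four SONAMEs
visible in this hunk (the full list is longer), and the helper names
``is_libpython`` and ``disallowed_sonames`` are hypothetical:

```python
# Excerpt only: the four SONAMEs visible in this diff hunk. The complete
# permitted list in the PEP is longer and is intentionally not guessed here.
PERMITTED_SONAMES_EXCERPT = frozenset({
    "libpanelw.so.5",
    "libncursesw.so.5",
    "libgthread-2.0.so.0",
    "libglib-2.0.so.0",
})


def is_libpython(soname):
    """libpython is always a permitted dependency under the policy.

    Matching on the "libpython" prefix is an assumption of this sketch.
    """
    return soname.startswith("libpython")


def disallowed_sonames(needed, permitted=PERMITTED_SONAMES_EXCERPT):
    """Return, sorted, the SONAMEs in `needed` that are not permitted."""
    return sorted(
        soname for soname in set(needed)
        if soname not in permitted and not is_libpython(soname)
    )


if __name__ == "__main__":
    # Hypothetical dependency list of one extension module.
    needed = ["libglib-2.0.so.0", "libpython2.7.so.1.0", "libssl.so.10"]
    print(disallowed_sonames(needed))
```

An empty result means that, with respect to this excerpt, the module links only
to permitted libraries; requirement (b) still has to be checked separately.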
On Debian-based systems, these libraries are provided by the packages ::
libncurses5 libgcc1 libstdc++6 libc6 libx11-6 libxext6
@ -161,72 +169,187 @@ On RPM-based systems, these libraries are provided by the packages ::
libICE libSM mesa-libGL glib2
This list was compiled by checking the external shared library dependencies of
the Canopy [1]_ and Anaconda [2]_ distributions, which both include a wide array
the Canopy [4]_ and Anaconda [5]_ distributions, which both include a wide array
of the most popular Python modules and have been confirmed in practice to work
across a wide swath of Linux systems in the wild.
For dependencies on externally-provided versioned symbols in the above shared
libraries, the following symbol versions are permitted: ::
Many of the permitted system libraries listed above use symbol versioning
schemes for backward compatibility. The latest symbol versions provided with
the CentOS 5.11 versions of these libraries are: ::
GLIBC_2.5
CXXABI_3.4.8
GLIBCXX_3.4.9
GCC_4.2.0
Therefore, as a consequence of requirement (b), any wheel that depends on
versioned symbols from the above shared libraries may depend only on symbols
with the following versions: ::
GLIBC <= 2.5
CXXABI <= 3.4.8
GLIBCXX <= 3.4.9
GCC <= 4.2.0
These symbol versions were determined by inspecting the latest symbol version
provided in the libraries distributed with CentOS 5, a Linux distribution
released in April 2007. In practice, this means that Python wheels which conform
to this policy should function on almost any linux distribution released after
this date.
These recommendations are the outcome of the relevant discussions in January
2016 [7]_, [8]_.
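The symbol version limits listed above can be checked mechanically. The
following is a minimal sketch under stated assumptions, not the method used by
any particular tool: the function name ``symbol_version_allowed`` is
hypothetical, and tags that do not belong to one of the four families (or that
carry no numeric version) are treated as outside the scope of the check.

```python
import re

# Maximum symbol versions taken from the list above (CentOS 5.11).
MAX_SYMBOL_VERSIONS = {
    "GLIBC": (2, 5),
    "CXXABI": (3, 4, 8),
    "GLIBCXX": (3, 4, 9),
    "GCC": (4, 2, 0),
}

# Matches tags such as "GLIBC_2.2.5" or "GLIBCXX_3.4.9".
_VERSIONED_TAG = re.compile(r"^([A-Z]+)_(\d+(?:\.\d+)*)$")


def symbol_version_allowed(tag):
    """Return True if `tag` does not exceed the policy maximum for its family."""
    match = _VERSIONED_TAG.match(tag)
    if match is None:
        return True  # no numeric version: outside the scope of this sketch
    family, version = match.group(1), match.group(2)
    limit = MAX_SYMBOL_VERSIONS.get(family)
    if limit is None:
        return True  # family not covered by the policy list
    # Compare numerically, component by component, so 3.4.10 > 3.4.9.
    return tuple(int(part) for part in version.split(".")) <= limit


if __name__ == "__main__":
    for tag in ("GLIBC_2.2.5", "GLIBC_2.14", "GLIBCXX_3.4.9", "GLIBCXX_3.4.10"):
        print(tag, symbol_version_allowed(tag))
```

Versions are compared as integer tuples rather than strings, because a string
comparison would wrongly order ``3.4.10`` before ``3.4.9``.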
Note that in our recommendations below, we do not suggest that ``pip``
or PyPI should attempt to check for and enforce the details of this
policy (just as they don't check for and enforce the details of
existing platform tags like ``win32``). The text above is provided (a)
as advice to package builders, and (b) as a method for allocating
blame if a given wheel doesn't work on some system: if it satisfies
the policy above, then this is a bug in the spec or the installation
tool; if it does not satisfy the policy above, then it's a bug in the
wheel. One useful consequence of this approach is that it leaves open
the possibility of further updates and tweaks as we gain more
experience, e.g., we could have a "manylinux 1.1" policy which targets
the same systems and uses the same ``manylinux1`` platform tag (and
thus requires no further changes to ``pip`` or PyPI), but that adjusts
the list above to remove libraries that have turned out to be
problematic or add libraries that have turned out to be safe.
Compilation and Tooling
=======================
Compilation of Compliant Wheels
===============================
The way glibc, libgcc, and libstdc++ manage their symbol versioning
means that in practice, the compiler toolchains that most developers
use to do their daily work are incapable of building
``manylinux1``-compliant wheels. Therefore we do not attempt to change
the default behavior of ``pip wheel`` / ``bdist_wheel``: they will
continue to generate regular ``linux_*`` platform tags, and developers
who wish to use them to generate ``manylinux1``-tagged wheels will
have to change the tag as a second post-processing step.
To support the compilation of wheels meeting the ``manylinux1`` standard, we
provide initial drafts of two tools.
The first is a Docker image based on CentOS 5.11, which is recommended as an
easy to use self-contained build box for compiling ``manylinux1`` wheels [4]_.
Compiling on a more recently-released linux distribution will generally
Docker Image
------------
The first tool is a Docker image based on CentOS 5.11, which is recommended as
an easy to use self-contained build box for compiling ``manylinux1`` wheels
[9]_. Compiling on a more recently-released linux distribution will generally
introduce dependencies on too-new versioned symbols. The image comes with a
full compiler suite installed (``gcc``, ``g++``, and ``gfortran`` 4.8.2) as
well as the latest releases of Python and pip.
well as the latest releases of Python and ``pip``.
The second tool is a command line executable called ``auditwheel`` [5]_. First,
it inspects all of the ELF files inside a wheel to check for dependencies on
versioned symbols or external shared libraries, and verifies conformance with
the ``manylinux1`` policy. This includes the ability to add the new platform
tag to conforming wheels.
Auditwheel
----------
In addition, ``auditwheel`` has the ability to automatically modify wheels that
depend on external shared libraries by copying those shared libraries from
the system into the wheel itself, and modifying the appropriate RPATH entries
such that these libraries will be picked up at runtime. This accomplishes a
similar result as if the libraries had been statically linked without requiring
changes to the build system.
The second tool is a command line executable called ``auditwheel`` [10]_ that
may aid package maintainers in dealing with third-party external
dependencies.
There are at least three methods for building wheels that use third-party
external libraries in a way that meets the above policy.
1. The third-party libraries can be statically linked.
2. The third-party shared libraries can be distributed in
separate packages on PyPI which are depended upon by the wheel.
3. The third-party shared libraries can be bundled inside the wheel
libraries, linked with a relative path.
All of these are valid options which may be effectively used by different
packages and communities. Statically linking generally requires
package-specific modifications to the build system, and distributing
third-party dependencies on PyPI may require some coordination of the
community of users of the package.
As an often-automatic alternative to these options, we introduce ``auditwheel``.
The tool inspects all of the ELF files inside a wheel to check for
dependencies on versioned symbols or external shared libraries, and verifies
conformance with the ``manylinux1`` policy. This includes the ability to add
the new platform tag to conforming wheels. More importantly, ``auditwheel`` has
the ability to automatically modify wheels that depend on external shared
libraries by copying those shared libraries from the system into the wheel
itself, and modifying the appropriate ``RPATH`` entries such that these
libraries will be picked up at runtime. This accomplishes a similar result as
if the libraries had been statically linked without requiring changes to the
build system. Packagers are advised that bundling, like static linking, may
implicate copyright concerns.
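The first step described above, locating the ELF files inside a wheel, can be
sketched with the standard library alone. This is an illustration only and is
not ``auditwheel``'s implementation; the function name ``elf_members`` is
hypothetical. It relies on two facts: a wheel is a zip archive, and every ELF
file begins with the four bytes ``\x7fELF``.

```python
import io
import zipfile

ELF_MAGIC = b"\x7fELF"


def elf_members(wheel):
    """Return the names of archive members that start with the ELF magic.

    `wheel` may be a filesystem path or a binary file-like object.
    """
    names = []
    with zipfile.ZipFile(wheel) as archive:
        for info in archive.infolist():
            with archive.open(info) as member:
                if member.read(4) == ELF_MAGIC:
                    names.append(info.filename)
    return names


if __name__ == "__main__":
    # Build a tiny in-memory archive standing in for a wheel.
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("pkg/__init__.py", b"")
        archive.writestr("pkg/_ext.so", ELF_MAGIC + b"\x02\x01\x01")
    print(elf_members(buffer))
```

A real audit would go on to read each file's dynamic section for its required
libraries and versioned symbols, which needs an ELF parser and is beyond this
sketch.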
Bundled Wheels on Linux
=======================
While we acknowledge many approaches for dealing with third-party library
dependencies within ``manylinux1`` wheels, we recognize that the ``manylinux1``
policy encourages bundling external dependencies, a practice
which runs counter to the package management policies of many linux
distributions' system package managers [11]_, [12]_. The primary purpose of
this bundling is cross-distro compatibility. Furthermore, ``manylinux1`` wheels on PyPI
occupy a different niche than the Python packages available through the
system package manager.
The decision in this PEP to encourage departure from general Linux distribution
unbundling policies is informed by the following concerns:
1. In these days of automated continuous integration and deployment
pipelines, publishing new versions and updating dependencies is easier
than it was when those policies were defined.
2. ``pip`` users remain free to use the ``"--no-binary"`` option if they want
to force local builds rather than using pre-built wheel files.
3. Modern container-based deployment and "immutable infrastructure"
models, which are increasingly popular, involve substantial bundling at
the application layer anyway.
4. Distribution of bundled wheels through PyPI is currently the norm for
Windows and OS X.
5. This PEP doesn't rule out the idea of offering more targeted binaries for
particular Linux distributions in the future.
The model described in this PEP is best suited to cross-platform
Python packages, because it means they can reuse much of the
work that they're already doing to make static Windows and OS X wheels. We
recognize that it is less optimal for Linux-specific packages that might
prefer to interact more closely with Linux's unique package management
functionality and only care about targeting a small set of particular distros.
Security Implications
---------------------
One of the advantages of dependencies on centralized libraries in Linux is
that bugfixes and security updates can be deployed system-wide, and
applications which depend on these libraries will automatically feel the
effects of these patches when the underlying libraries are updated. This can
be particularly important for security updates in packages engaged in
communication across the network or cryptography.
``manylinux1`` wheels distributed through PyPI that bundle security-critical
libraries like OpenSSL will thus assume responsibility for prompt updates in
response to disclosed vulnerabilities and patches. This closely parallels the
security implications of the distribution of binary wheels on Windows that,
because the platform lacks a system package manager, generally bundle their
dependencies. In particular, because it lacks a stable ABI, OpenSSL cannot be
included in the ``manylinux1`` profile.
Neither of these tools are necessary to build wheels which conform with the
``manylinux1`` policy. Similar results can usually be achieved by statically
linking external dependencies and/or using certain inline assembly constructs
to instruct the linker to prefer older symbol versions, however these tricks
can be quite esoteric.
Platform Detection for Installers
=================================
Because the ``manylinux1`` profile is already known to work for the many
thousands of users of popular commercial Python distributions, we suggest that
installation tools like ``pip`` should error on the side of assuming that a
system *is* compatible, unless there is specific reason to think otherwise.
Above, we defined what it means for a *wheel* to be
``manylinux1``-compatible. Here we discuss what it means for a *Python
installation* to be ``manylinux1``-compatible. In particular, this is
important for tools like ``pip`` to know when deciding whether or not
they should consider ``manylinux1``-tagged wheels for installation.
Because the ``manylinux1`` profile is already known to work for the
many thousands of users of popular commercial Python distributions, we
suggest that installation tools should err on the side of assuming
that a system *is* compatible, unless there is specific reason to
think otherwise.
We know of three main sources of potential incompatibility that are likely to
arise in practice:
* A linux distribution that is too old (e.g. RHEL 4)
* A linux distribution that does not use glibc (e.g. Alpine Linux, which is
based on musl libc, or Android)
* A linux distribution that does not use ``glibc`` (e.g. Alpine Linux, which is
based on musl ``libc``, or Android)
* Eventually, in the future, there may exist distributions that break
compatibility with this profile
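The first two sources of incompatibility can be probed at runtime. The sketch
below is one possible approach under stated assumptions, not necessarily the
logic proposed for ``pip``. It calls ``gnu_get_libc_version()``, a function
exported by ``glibc``, through ``ctypes``; the helper names are hypothetical,
and both the 2.5 threshold (taken from the ``GLIBC_2.5`` symbol version above)
and the requirement that the major version match are assumptions of this
sketch.

```python
import ctypes
import re


def glibc_version_string():
    """Return the running glibc version as a string, or None if not glibc."""
    try:
        # CDLL(None) gives a handle to the symbols already loaded into this
        # process; on non-glibc systems the lookup below fails.
        process = ctypes.CDLL(None)
        gnu_get_libc_version = process.gnu_get_libc_version
    except (OSError, AttributeError, TypeError):
        return None
    gnu_get_libc_version.restype = ctypes.c_char_p
    version = gnu_get_libc_version()
    return version.decode("ascii") if isinstance(version, bytes) else version


def version_at_least(version, required_major, minimum_minor):
    """Check a "major.minor[...]" string against a minimum minor version."""
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        return False
    major, minor = int(match.group(1)), int(match.group(2))
    return major == required_major and minor >= minimum_minor


def looks_manylinux1_compatible():
    """Return True if this process runs on glibc 2.5 or newer (2.x series)."""
    version = glibc_version_string()
    return version is not None and version_at_least(version, 2, 5)


if __name__ == "__main__":
    print(glibc_version_string(), looks_manylinux1_compatible())
```

Querying the loaded C library directly avoids guessing from the distribution
name, so a too-old ``glibc`` and a non-``glibc`` system are both reported as
incompatible.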
@ -283,23 +406,14 @@ The proposed logic for ``pip`` or related tools, then, is:
wheels.
Security Implications
=====================
PyPI Support
============
One of the advantages of dependencies on centralized libraries in Linux is
that bugfixes and security updates can be deployed system-wide, and
applications which depend on on these libraries will automatically feel the
effects of these patches when the underlying libraries are updated. This can
be particularly important for security updates in packages communication
across the network or cryptography.
``manylinux1`` wheels distributed through PyPI that bundle security-critical
libraries like OpenSSL will thus assume responsibility for prompt updates in
response disclosed vulnerabilities and patches. This closely parallels the
security implications of the distribution of binary wheels on Windows that,
because the platform lacks a system package manager, generally bundle their
dependencies. In particular, because its lacks a stable ABI, OpenSSL cannot be
included in the ``manylinux1`` profile.
PyPI should permit wheels containing the ``manylinux1`` platform tag to be
uploaded. PyPI should not attempt to formally verify that wheels containing
the ``manylinux1`` platform tag adhere to the ``manylinux1`` policy described
in this document. This verification task should be left to other tools, like
``auditwheel``, that are developed separately.
Rejected Alternatives
@ -317,19 +431,44 @@ and build multiple wheels in order to cover all the common Linux distributions.
Therefore we consider such proposals to be out-of-scope for this PEP.
Future updates
==============
We anticipate that at some point in the future there will be a
``manylinux2`` specifying a more modern baseline environment (perhaps
based on CentOS 6), and someday a ``manylinux3`` and so forth, but we
defer specifying these until we have more experience with the initial
``manylinux1`` proposal.
References
==========
.. [1] PEP 425 -- Compatibility Tags for Built Distributions
.. [1] PEP 0427 -- The Wheel Binary Package Format 1.0
(https://www.python.org/dev/peps/pep-0427/)
.. [2] PEP 0491 -- The Wheel Binary Package Format 1.9
(https://www.python.org/dev/peps/pep-0491/)
.. [3] PEP 425 -- Compatibility Tags for Built Distributions
(https://www.python.org/dev/peps/pep-0425/)
.. [2] Enthought Canopy Python Distribution
.. [4] Enthought Canopy Python Distribution
(https://store.enthought.com/downloads/)
.. [3] Continuum Analytics Anaconda Python Distribution
.. [5] Continuum Analytics Anaconda Python Distribution
(https://www.continuum.io/downloads)
.. [4] manylinux1 docker image
.. [6] CentOS 5.11 Release Notes
(https://wiki.centos.org/Manuals/ReleaseNotes/CentOS5.11)
.. [7] manylinux-discuss mailing list discussion
(https://groups.google.com/forum/#!topic/manylinux-discuss/-4l3rrjfr9U)
.. [8] distutils-sig discussion
(https://mail.python.org/pipermail/distutils-sig/2016-January/027997.html)
.. [9] manylinux1 docker image
(https://quay.io/repository/manylinux/manylinux)
.. [5] auditwheel
.. [10] auditwheel tool
(https://pypi.python.org/pypi/auditwheel)
.. [11] Fedora Bundled Software Policy
(https://fedoraproject.org/wiki/Bundled_Software_policy)
.. [12] Debian Policy Manual -- 4.13: Convenience copies of code
(https://www.debian.org/doc/debian-policy/ch-source.html#s-embeddedfiles)
Copyright
=========
@ -337,7 +476,6 @@ Copyright
This document has been placed into the public domain.
..
Local Variables:
mode: indented-text
indent-tabs-mode: nil