From a5f87a7b821b22580ce48bf08ab1d508ec67a8cc Mon Sep 17 00:00:00 2001 From: Mark Williams Date: Mon, 5 Feb 2018 23:57:12 -0800 Subject: [PATCH] PEP 571: Updated version of the manylinux ABI (GH-565) manylinux1 is getting old enough now to start making things difficult (specifically around network security), so it's time for a refresh to a slightly newer baseline. --- pep-0571.rst | 343 +++++++++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 343 insertions(+) create mode 100644 pep-0571.rst diff --git a/pep-0571.rst b/pep-0571.rst new file mode 100644 index 000000000..1937d58ea --- /dev/null +++ b/pep-0571.rst @@ -0,0 +1,343 @@ +PEP: 571 +Title: The manylinux2 Platform Tag +Version: $Revision$ +Last-Modified: $Date$ +Author: Mark Williams +BDFL-Delegate: Nick Coghlan +Discussions-To: Distutils SIG +Status: Active +Type: Informational +Content-Type: text/x-rst +Created: +Post-History: +Resolution: + + +Abstract +======== + +This PEP proposes the creation of a ``manylinux2`` platform tag to +succeed the ``manylinux1`` tag introduced by PEP 513 [1]_. It also +proposes that PyPI and ``pip`` both be updated to support uploading, +downloading, and installing ``manylinux2`` distributions on compatible +platforms. + +Rationale +========= + +True to its name, the ``manylinux1`` platform tag has made the +installation of binary extension modules a reality on many Linux +systems. Libraries like ``cryptography`` [2]_ and ``numpy`` [3]_ are +more accessible to Python developers now that their installation on +common architectures does not depend on fragile development +environments and build toolchains. + +``manylinux1`` wheels achieve their portability by allowing the +extension modules they contain to link against only a small set of +system-level shared libraries that export versioned symbols old enough +to benefit from backwards-compatibility policies. 
Extension modules +in a ``manylinux1`` wheel that rely on ``glibc``, for example, must be +built against version 2.5 or earlier; they may then be run on systems +that provide more recent ``glibc`` versions that still export the +required symbols at version 2.5. + +PEP 513 drew its whitelisted shared libraries and their symbol +versions from CentOS 5.11, which was the oldest supported CentOS +release at the time of its writing. Unfortunately, CentOS 5.11 +reached its end-of-life on March 31st, 2017 with a clear warning +against its continued use. [4]_ No further updates, such as security +patches, will be made available. This means that its packages will +remain at obsolete versions that hamper the efforts of Python software +packagers who use the ``manylinux1`` Docker image. + +CentOS 6 is now the oldest supported CentOS release, and will receive +maintenance updates through November 30th, 2020. [5]_ We propose that +a new PEP 425-style [6]_ platform tag called ``manylinux2`` be derived +from CentOS 6 and that the ``manylinux`` toolchain, PyPI, and ``pip`` +be updated to support it. + + +The ``manylinux2`` policy +========================= + +The following criteria determine a ``linux`` wheel's eligibility for +the ``manylinux2`` tag: + +1. The wheel may only contain binary executables and shared objects + compiled for one of the two architectures supported by CentOS 6: + x86_64 or i686. [5]_ +2.
The wheel's binary executables or shared objects may not link + against externally-provided libraries except those in the following + whitelist: :: + + libgcc_s.so.1 + libstdc++.so.6 + libm.so.6 + libdl.so.2 + librt.so.1 + libcrypt.so.1 + libc.so.6 + libnsl.so.1 + libutil.so.1 + libpthread.so.0 + libresolv.so.2 + libX11.so.6 + libXext.so.6 + libXrender.so.1 + libICE.so.6 + libSM.so.6 + libGL.so.1 + libgobject-2.0.so.0 + libgthread-2.0.so.0 + libglib-2.0.so.0 + + This list is identical to the externally-provided libraries + whitelisted for ``manylinux1``, minus ``libncursesw.so.5`` and + ``libpanelw.so.5``. [7]_ ``libpythonX.Y`` remains ineligible for + inclusion for the same reasons outlined in PEP 513. + + On Debian-based systems, these libraries are provided by the packages: + + ============ ======================================================= + Package Libraries + ============ ======================================================= + libc6 libdl.so.2, libresolv.so.2, librt.so.1, libc.so.6, + libpthread.so.0, libm.so.6, libutil.so.1, libcrypt.so.1, + libnsl.so.1 + libgcc1 libgcc_s.so.1 + libgl1 libGL.so.1 + libglib2.0-0 libgobject-2.0.so.0, libgthread-2.0.so.0, libglib-2.0.so.0 + libice6 libICE.so.6 + libsm6 libSM.so.6 + libstdc++6 libstdc++.so.6 + libx11-6 libX11.so.6 + libxext6 libXext.so.6 + libxrender1 libXrender.so.1 + ============ ======================================================= + + On RPM-based systems, they are provided by these packages: + + ============ ======================================================= + Package Libraries + ============ ======================================================= + glib2 libglib-2.0.so.0, libgthread-2.0.so.0, libgobject-2.0.so.0 + glibc libresolv.so.2, libutil.so.1, libnsl.so.1, librt.so.1, + libcrypt.so.1, libpthread.so.0, libdl.so.2, libm.so.6, + libc.so.6 + libICE libICE.so.6 + libX11 libX11.so.6 + libXext libXext.so.6 + libXrender libXrender.so.1 + libgcc libgcc_s.so.1 + libstdc++ libstdc++.so.6 + mesa
libGL.so.1 + ============ ======================================================= + +3. If the wheel contains binary executables or shared objects linked + against any whitelisted libraries that also export versioned + symbols, they may only depend on the following maximum versions:: + + GLIBC_2.12 + CXXABI_1.3.3 + GLIBCXX_3.4.13 + GCC_4.3.0 + + As an example, ``manylinux2`` wheels may include binary artifacts + that require ``glibc`` symbols at version ``GLIBC_2.4``, because + this is an earlier version than the maximum of ``GLIBC_2.12``. +4. If a wheel is built for any version of CPython 2 or CPython + versions 3.0 up to and including 3.2, it *must* include a CPython + ABI tag indicating its Unicode ABI. A ``manylinux2`` wheel built + against Python 2, then, must include either the ``cp27mu`` tag + indicating it was built against an interpreter with the UCS-4 ABI + or the ``cp27m`` tag indicating an interpreter with the UCS-2 + ABI. [8]_ [9]_ +5. A wheel *must not* require the ``PyFPE_jbuf`` symbol. This is + achieved by building it against a Python compiled *without* the + ``--with-fpectl`` ``configure`` flag. + +Compilation of Compliant Wheels +=============================== + +As with ``manylinux1``, the ``auditwheel`` tool adds ``manylinux2`` +platform tags to ``linux`` wheels built by ``pip wheel`` or +``bdist_wheel`` in a ``manylinux2`` Docker container. + +Docker Images +------------- + +``manylinux2`` Docker images based on CentOS 6 x86_64 and i686 are +provided for building binary ``linux`` wheels that can reliably be +converted to ``manylinux2`` wheels. [10]_ These images come with a +full compiler suite installed (``gcc``, ``g++``, and ``gfortran`` +4.8.2) as well as the latest releases of Python and ``pip``. + +Compatibility with kernels that lack ``vsyscall`` +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +A Docker container assumes that its userland is compatible with its +host's kernel.
Unfortunately, an increasingly common kernel +configuration breaks this assumption for x86_64 CentOS 6 Docker +images. + +Versions 2.14 and earlier of ``glibc`` require that the kernel provide an +archaic system call optimization known as ``vsyscall`` on x86_64. [11]_ +To effect the optimization, the kernel maps a read-only page of +frequently-called system calls -- most notably ``time(2)`` -- into +each process at a fixed memory location. ``glibc`` then invokes these +system calls by dereferencing a function pointer to the appropriate +offset into the ``vsyscall`` page and calling it. This avoids the +overhead associated with invoking the kernel that affects normal +system call invocation. ``vsyscall`` has long been deprecated in +favor of an equivalent mechanism known as vDSO, or "virtual dynamic +shared object", in which the kernel instead maps a relocatable virtual +shared object containing the optimized system calls into each +process. [12]_ + +The ``vsyscall`` page has serious security implications because it +does not participate in address space layout randomization (ASLR). +Its predictable location and contents make it a useful source of +gadgets used in return-oriented programming attacks. [13]_ At the same +time, its elimination breaks the x86_64 ABI, because ``glibc`` +versions that depend on ``vsyscall`` suffer from segmentation faults +when attempting to dereference a system call pointer into a +non-existent page. As a compromise, Linux 3.1 implemented an +"emulated" ``vsyscall`` that reduced the executable code, and thus the +material for ROP gadgets, mapped into the process. [14]_ +``vsyscall=emulate`` has been the default configuration in most +distributions' kernels for many years. + +Unfortunately, ``vsyscall`` emulation still exposes predictable code +at a reliable memory location, and continues to be useful for +return-oriented programming.
[15]_ Because most distributions have now +upgraded to ``glibc`` versions that do not depend on ``vsyscall``, +they are beginning to ship kernels that do not support ``vsyscall`` at +all. [16]_ + +CentOS 5.11 and 6 both include versions of ``glibc`` that depend on +the ``vsyscall`` page (2.5 and 2.12.2 respectively), so containers +based on either cannot run under kernels provided with many +distributions' upcoming releases. [17]_ Continuum Analytics faces a +related problem with its conda software suite, and as they point out, +this will pose a significant obstacle to using these tools in hosted +services. [18]_ If Travis CI, for example, begins running jobs under +a kernel that does not provide the ``vsyscall`` interface, Python +packagers will not be able to use our Docker images there to build +``manylinux`` wheels. [19]_ + +We have derived a patch from the ``glibc`` git repository that +backports the removal of all dependencies on ``vsyscall`` to the +version of ``glibc`` included with our ``manylinux2`` image. [20]_ +Rebuilding ``glibc``, and thus building the ``manylinux2`` image itself, +still requires a host kernel that provides the ``vsyscall`` mechanism, +but the resulting image can run both on hosts that provide it and +on those that do not. Because the ``vsyscall`` interface is an +optimization that is only applied to running processes, the +``manylinux2`` wheels built with this modified image should be +identical to those built on an unmodified CentOS 6 system. Also, the +``vsyscall`` problem applies only to x86_64; it is not part of the +i686 ABI. + +Auditwheel +---------- + +The ``auditwheel`` tool has also been updated to produce +``manylinux2`` wheels. [21]_ Its behavior and purpose are otherwise +unchanged from PEP 513. + + +Platform Detection for Installers +================================= + +Platforms may define a ``manylinux2_compatible`` boolean attribute on +the ``_manylinux`` module described in PEP 513.
A platform is +considered incompatible with ``manylinux2`` if the attribute is +``False``. + + +Backwards compatibility with ``manylinux1`` wheels +================================================== + +As explained in PEP 513, the specified symbol versions for +``manylinux1`` whitelisted libraries constitute an *upper bound*. The +same is true for the symbol versions defined for ``manylinux2`` in +this PEP. As a result, ``manylinux1`` wheels are considered +``manylinux2`` wheels. A ``pip`` that recognizes the ``manylinux2`` +platform tag will thus install ``manylinux1`` wheels for +``manylinux2`` platforms -- even when that platform is explicitly set -- when no +``manylinux2`` wheels are available. [22]_ + +PyPI Support +============ + +PyPI should permit wheels containing the ``manylinux2`` platform tag +to be uploaded in the same way that it permits ``manylinux1``. It +should not attempt to verify the compatibility of ``manylinux2`` +wheels. + + +References +========== + +.. [1] PEP 513 -- A Platform Tag for Portable Linux Built Distributions + (https://www.python.org/dev/peps/pep-0513/) +.. [2] pyca/cryptography + (https://cryptography.io/) +.. [3] numpy + (https://numpy.org) +.. [4] CentOS 5.11 EOL announcement + (https://lists.centos.org/pipermail/centos-announce/2017-April/022350.html) +.. [5] CentOS Product Specifications + (https://web.archive.org/web/20180108090257/https://wiki.centos.org/About/Product) +.. [6] PEP 425 -- Compatibility Tags for Built Distributions + (https://www.python.org/dev/peps/pep-0425/) +.. [7] ncurses 5 -> 6 transition means we probably need to drop some + libraries from the manylinux whitelist + (https://github.com/pypa/manylinux/issues/94) +.. [8] PEP 3149 + (https://www.python.org/dev/peps/pep-3149/) +.. [9] SOABI support for Python 2.X and PyPy + (https://github.com/pypa/pip/pull/3075) +.. [10] manylinux2 Docker images + (https://hub.docker.com/r/markrwilliams/manylinux2/) +.. [11] On vsyscalls and the vDSO + (https://lwn.net/Articles/446528/) +..
[12] vdso(7) + (http://man7.org/linux/man-pages/man7/vdso.7.html) +.. [13] Framing Signals -- A Return to Portable Shellcode + (http://www.cs.vu.nl/~herbertb/papers/srop_sp14.pdf) +.. [14] ChangeLog-3.1 + (https://www.kernel.org/pub/linux/kernel/v3.x/ChangeLog-3.1) +.. [15] Project Zero: Three bypasses and a fix for one of Flash's Vector.<*> mitigations + (https://googleprojectzero.blogspot.com/2015/08/three-bypasses-and-fix-for-one-of.html) +.. [16] linux: activate CONFIG_LEGACY_VSYSCALL_NONE ? + (https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=852620) +.. [17] [Wheel-builders] Heads-up re: new kernel configurations breaking the manylinux docker image + (https://mail.python.org/pipermail/wheel-builders/2016-December/000239.html) +.. [18] Due to glibc 2.12 limitation, static executables that use + time(), cpuinfo() and maybe a few others cannot be run on systems + that do not support or use `vsyscall=emulate` + (https://github.com/ContinuumIO/anaconda-issues/issues/8203) +.. [19] Travis CI + (https://travis-ci.org/) +.. [20] remove-vsyscall.patch + (https://github.com/markrwilliams/manylinux/commit/e9493d55471d153089df3aafca8cfbcb50fa8093#diff-3eda4130bdba562657f3ec7c1b3f5720) +.. [21] auditwheel manylinux2 branch + (https://github.com/markrwilliams/auditwheel/tree/manylinux2) +.. [22] pip manylinux2 branch + (https://github.com/markrwilliams/pip/commits/manylinux2) + + +Copyright +========= + +This document has been placed into the public domain. + +.. + Local Variables: + mode: indented-text + indent-tabs-mode: nil + sentence-end-double-space: t + fill-column: 70 + coding: utf-8 + End: