2023-04-07 03:30:10 -04:00
|
|
|
|
PEP: 711
|
|
|
|
|
Title: PyBI: a standard format for distributing Python Binaries
|
|
|
|
|
Author: Nathaniel J. Smith <njs@pobox.com>
|
|
|
|
|
PEP-Delegate: TODO
|
2023-04-07 03:46:47 -04:00
|
|
|
|
Discussions-To: https://discuss.python.org/t/pep-711-pybi-a-standard-format-for-distributing-python-binaries/25547
|
2023-04-07 03:30:10 -04:00
|
|
|
|
Status: Draft
|
|
|
|
|
Type: Standards Track
|
|
|
|
|
Topic: Packaging
|
|
|
|
|
Content-Type: text/x-rst
|
|
|
|
|
Created: 06-Apr-2023
|
2023-04-07 03:46:47 -04:00
|
|
|
|
Post-History: `06-Apr-2023 <https://discuss.python.org/t/pep-711-pybi-a-standard-format-for-distributing-python-binaries/25547>`__
|
2023-04-07 03:30:10 -04:00
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Abstract
|
|
|
|
|
========
|
|
|
|
|
|
|
|
|
|
“Like wheels, but instead of a pre-built python package, it’s a
|
|
|
|
|
pre-built python interpreter”
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Motivation
|
|
|
|
|
==========
|
|
|
|
|
|
|
|
|
|
End goal: Pypi.org has pre-built packages for all Python versions on all
|
|
|
|
|
popular platforms, so automated tools can easily grab any of them and
|
|
|
|
|
set it up. It becomes quick and easy to try Python prereleases, pin
|
|
|
|
|
Python versions in CI, make a temporary environment to reproduce a bug
|
|
|
|
|
report that only happens on a specific Python point release, etc.
|
|
|
|
|
|
|
|
|
|
First step (this PEP): define a standard packaging file format to hold pre-built
|
|
|
|
|
Python interpreters, that reuses existing Python packaging standards as much as
|
|
|
|
|
possible.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Examples
|
|
|
|
|
========
|
|
|
|
|
|
|
|
|
|
Example pybi builds are available at `pybi.vorpus.org
|
|
|
|
|
<https://pybi.vorpus.org>`__. They're zip files, so you can unpack them and poke
|
|
|
|
|
around inside if you want to get a feel for how they're laid out.
|
|
|
|
|
|
|
|
|
|
You can also look at the `tooling I used to create them
|
|
|
|
|
<https://github.com/njsmith/pybi-tools>`__.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Specification
|
|
|
|
|
=============
|
|
|
|
|
|
|
|
|
|
Filename
|
|
|
|
|
--------
|
|
|
|
|
|
|
|
|
|
Filename: ``{distribution}-{version}[-{build tag}]-{platform tag}.pybi``
|
|
|
|
|
|
|
|
|
|
This matches the wheel file format defined in :pep:`427`, except dropping the
|
|
|
|
|
``{python tag}`` and ``{abi tag}`` and changing the extension from ``.whl`` →
|
|
|
|
|
``.pybi``.
|
|
|
|
|
|
|
|
|
|
For example:
|
|
|
|
|
|
|
|
|
|
- ``cpython-3.9.3-manylinux_2014.pybi``
|
|
|
|
|
- ``cpython-3.10b2-win_amd64.pybi``
|
|
|
|
|
|
|
|
|
|
Just like for wheels, if a pybi supports multiple platforms, you can
|
|
|
|
|
separate them by dots to make a “compressed tag set”:
|
|
|
|
|
|
|
|
|
|
- ``cpython-3.9.5-macosx_11_0_x86_64.macosx_11_0_arm64.pybi``
|
|
|
|
|
|
|
|
|
|
(Though in practice this probably won’t be used much, e.g. the above
|
|
|
|
|
filename is more idiomatically written as
|
|
|
|
|
``cpython-3.9.5-macosx_11_0_universal2.pybi``.)
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
File contents
|
|
|
|
|
-------------
|
|
|
|
|
|
|
|
|
|
A ``.pybi`` file is a zip file, that can be unpacked directly into an
|
|
|
|
|
arbitrary location and then used as a self-contained Python environment.
|
|
|
|
|
There’s no ``.data`` directory or install scheme keys, because the
|
|
|
|
|
Python environment knows which install scheme it’s using, so it can just
|
|
|
|
|
put things in the right places to start with.
|
|
|
|
|
|
|
|
|
|
The “arbitrary location” part is important: the pybi can’t contain any
|
|
|
|
|
hardcoded absolute paths. In particular, any preinstalled scripts MUST
|
|
|
|
|
NOT embed absolute paths in their shebang lines.
|
|
|
|
|
|
|
|
|
|
Similar to wheels’ ``<package>-<version>.dist-info`` directory, the pybi archive
|
|
|
|
|
must contain a top-level directory named ``pybi-info/``. (Rationale: calling it
|
|
|
|
|
``pybi-info`` instead ``dist-info`` makes sure that tools don’t get confused
|
|
|
|
|
about which kind of metadata they’re looking at; leaving off the
|
|
|
|
|
``{name}-{version}`` part is fine because only one pybi can be installed into a
|
|
|
|
|
given directory.) The ``pybi-info/`` directory contains at least the following
|
|
|
|
|
files:
|
|
|
|
|
|
|
|
|
|
- ``.../PYBI``: metadata about the archive itself, in the same
|
|
|
|
|
RFC822-ish format as ``METADATA`` and ``WHEEL`` files:
|
|
|
|
|
|
|
|
|
|
::
|
|
|
|
|
|
|
|
|
|
Pybi-Version: 1.0
|
|
|
|
|
Generator: {name} {version}
|
|
|
|
|
Tag: {platform tag}
|
|
|
|
|
Tag: {another platform tag}
|
|
|
|
|
Tag: {...and so on...}
|
|
|
|
|
Build: 1 # optional
|
|
|
|
|
|
|
|
|
|
- ``.../RECORD``: same as in wheels, except see the note about
|
|
|
|
|
symlinks, below.
|
|
|
|
|
|
|
|
|
|
- ``.../METADATA``: In the same format as described in the current core
|
|
|
|
|
metadata spec, except that the following keys are forbidden because
|
|
|
|
|
they don’t make sense:
|
|
|
|
|
|
|
|
|
|
- ``Requires-Dist``
|
|
|
|
|
- ``Provides-Extra``
|
|
|
|
|
- ``Requires-Python``
|
|
|
|
|
|
|
|
|
|
And also there are some new, required keys described below.
|
|
|
|
|
|
|
|
|
|
Pybi-specific core metadata
|
|
|
|
|
~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
|
|
|
|
|
|
|
|
|
Here's an example of the new ``METADATA`` fields, before we give the full details::
|
|
|
|
|
|
|
|
|
|
Pybi-Environment-Marker-Variables: {"implementation_name": "cpython", "implementation_version": "3.10.8", "os_name": "posix", "platform_machine": "x86_64", "platform_system": "Linux", "python_full_version": "3.10.8", "platform_python_implementation": "CPython", "python_version": "3.10", "sys_platform": "linux"}
|
|
|
|
|
Pybi-Paths: {"stdlib": "lib/python3.10", "platstdlib": "lib/python3.10", "purelib": "lib/python3.10/site-packages", "platlib": "lib/python3.10/site-packages", "include": "include/python3.10", "platinclude": "include/python3.10", "scripts": "bin", "data": "."}
|
|
|
|
|
Pybi-Wheel-Tag: cp310-cp310-PLATFORM
|
|
|
|
|
Pybi-Wheel-Tag: cp310-abi3-PLATFORM
|
|
|
|
|
Pybi-Wheel-Tag: cp310-none-PLATFORM
|
|
|
|
|
Pybi-Wheel-Tag: cp39-abi3-PLATFORM
|
|
|
|
|
Pybi-Wheel-Tag: cp38-abi3-PLATFORM
|
|
|
|
|
Pybi-Wheel-Tag: cp37-abi3-PLATFORM
|
|
|
|
|
Pybi-Wheel-Tag: cp36-abi3-PLATFORM
|
|
|
|
|
Pybi-Wheel-Tag: cp35-abi3-PLATFORM
|
|
|
|
|
Pybi-Wheel-Tag: cp34-abi3-PLATFORM
|
|
|
|
|
Pybi-Wheel-Tag: cp33-abi3-PLATFORM
|
|
|
|
|
Pybi-Wheel-Tag: cp32-abi3-PLATFORM
|
|
|
|
|
Pybi-Wheel-Tag: py310-none-PLATFORM
|
|
|
|
|
Pybi-Wheel-Tag: py3-none-PLATFORM
|
|
|
|
|
Pybi-Wheel-Tag: py39-none-PLATFORM
|
|
|
|
|
Pybi-Wheel-Tag: py38-none-PLATFORM
|
|
|
|
|
Pybi-Wheel-Tag: py37-none-PLATFORM
|
|
|
|
|
Pybi-Wheel-Tag: py36-none-PLATFORM
|
|
|
|
|
Pybi-Wheel-Tag: py35-none-PLATFORM
|
|
|
|
|
Pybi-Wheel-Tag: py34-none-PLATFORM
|
|
|
|
|
Pybi-Wheel-Tag: py33-none-PLATFORM
|
|
|
|
|
Pybi-Wheel-Tag: py32-none-PLATFORM
|
|
|
|
|
Pybi-Wheel-Tag: py31-none-PLATFORM
|
|
|
|
|
Pybi-Wheel-Tag: py30-none-PLATFORM
|
|
|
|
|
Pybi-Wheel-Tag: py310-none-any
|
|
|
|
|
Pybi-Wheel-Tag: py3-none-any
|
|
|
|
|
Pybi-Wheel-Tag: py39-none-any
|
|
|
|
|
Pybi-Wheel-Tag: py38-none-any
|
|
|
|
|
Pybi-Wheel-Tag: py37-none-any
|
|
|
|
|
Pybi-Wheel-Tag: py36-none-any
|
|
|
|
|
Pybi-Wheel-Tag: py35-none-any
|
|
|
|
|
Pybi-Wheel-Tag: py34-none-any
|
|
|
|
|
Pybi-Wheel-Tag: py33-none-any
|
|
|
|
|
Pybi-Wheel-Tag: py32-none-any
|
|
|
|
|
Pybi-Wheel-Tag: py31-none-any
|
|
|
|
|
Pybi-Wheel-Tag: py30-none-any
|
|
|
|
|
|
|
|
|
|
Specification:
|
|
|
|
|
|
|
|
|
|
- ``Pybi-Environment-Marker-Variables``: The value of all PEP 508
|
|
|
|
|
environment marker variables that are static across installs of this
|
|
|
|
|
Pybi, as a JSON dict. So for example:
|
|
|
|
|
|
|
|
|
|
- ``python_version`` will always be present, because a Python 3.10 package
|
|
|
|
|
always has ``python_version == "3.10"``.
|
|
|
|
|
|
|
|
|
|
- ``platform_version`` will generally not be present, because it gives
|
|
|
|
|
detailed information about the OS where Python is running, for example::
|
|
|
|
|
|
|
|
|
|
#60-Ubuntu SMP Thu May 6 07:46:32 UTC 2021
|
|
|
|
|
|
|
|
|
|
``platform_release`` has similar issues.
|
|
|
|
|
|
|
|
|
|
- ``platform_machine`` will *usually* be present, except for macOS universal2
|
|
|
|
|
pybis: these can potentially be run in either x86-64 or arm64 mode, and we
|
|
|
|
|
don't know which until the interpreter is actually invoked, so we can't
|
|
|
|
|
record it in static metadata.
|
|
|
|
|
|
|
|
|
|
**Rationale:** In many cases, this should allow a resolver running on Linux
|
|
|
|
|
to compute package pins for a Python environment on Windows, or vice-versa,
|
|
|
|
|
so long as the resolver has access to the target platform’s .pybi file. (Note
|
|
|
|
|
that ``Requires-Python`` constraints can be checked by using the
|
|
|
|
|
``python_full_version`` value.) While we have to leave out a few keys
|
|
|
|
|
sometimes, they're either fairly useless (``platform_version``,
|
|
|
|
|
``platform_release``) or can be reconstructed by the resolver
|
|
|
|
|
(``platform_machine``).
|
|
|
|
|
|
|
|
|
|
The markers are also just generally useful information to have
|
|
|
|
|
accessible. For example, if you have a ``pypy3-7.3.2`` pybi, and you
|
|
|
|
|
want to know what version of the Python language that supports, then
|
|
|
|
|
that’s recorded in the ``python_version`` marker.
|
|
|
|
|
|
|
|
|
|
(Note: we may want to deprecate/remove ``platform_version`` and
|
|
|
|
|
``platform_release``? They're problematic and I can't figure out any cases
|
|
|
|
|
where they're useful. But that's out of scope of this particular PEP.)
|
|
|
|
|
|
|
|
|
|
- ``Pybi-Paths``: The install paths needed to install wheels (same keys
|
|
|
|
|
as ``sysconfig.get_paths()``), as relative paths starting at the root
|
|
|
|
|
of the zip file, as a JSON dict.
|
|
|
|
|
|
|
|
|
|
These paths MUST be written in Unix format, using forward slashes as
|
|
|
|
|
a separator, not backslashes.
|
|
|
|
|
|
|
|
|
|
It must be possible to invoke the Python interpreter by running
|
|
|
|
|
``{paths["scripts"]}/python``. If there are alternative interpreter
|
|
|
|
|
entry points (e.g. ``pythonw`` for Windows GUI apps), then they
|
|
|
|
|
should also be in that directory under their conventional names, with
|
|
|
|
|
no version number attached. (You can *also* have a ``python3.11``
|
|
|
|
|
symlink if you want; there’s no rule against that. It’s just that
|
|
|
|
|
``python`` has to exist and work.)
|
|
|
|
|
|
|
|
|
|
**Rationale:** ``Pybi-Paths`` and ``Pybi-Wheel-Tag``\ s (see below) are
|
|
|
|
|
together enough to let an installer choose wheels and install them into an
|
|
|
|
|
unpacked pybi environment, without invoking Python. Besides, we need to write
|
|
|
|
|
down the interpreter location somewhere, so it’s two birds with one stone.
|
|
|
|
|
|
|
|
|
|
- ``Pybi-Wheel-Tag``: The wheel tags supported by this interpreter, in
|
|
|
|
|
preference order (most-preferred first, least-preferred last), except
|
|
|
|
|
that the special platform tag ``PLATFORM`` should replace any
|
|
|
|
|
platform tags that depend on the final installation system.
|
|
|
|
|
|
|
|
|
|
**Discussion:** It would be nice™ if installers could compute a pybi’s
|
|
|
|
|
corresponding wheel tags ahead of time, so that they could install
|
|
|
|
|
wheels into the unpacked pybi without needing to actually invoke the
|
|
|
|
|
python interpreter to query its tags – both for efficiency and to
|
|
|
|
|
allow for more exotic use cases like setting up a Windows environment
|
|
|
|
|
from a Linux host.
|
|
|
|
|
|
|
|
|
|
But unfortunately, it’s impossible to compute the full set of
|
|
|
|
|
platform tags supported by a Python installation ahead of time,
|
|
|
|
|
because they can depend on the final system:
|
|
|
|
|
|
|
|
|
|
- A pybi tagged ``manylinux_2_12_x86_64`` can always use wheels
|
|
|
|
|
tagged as ``manylinux_2_12_x86_64``. It also *might* be able to
|
|
|
|
|
use wheels tagged ``manylinux_2_17_x86_64``, but only if the final
|
|
|
|
|
installation system has glibc 2.17+.
|
|
|
|
|
|
|
|
|
|
- A pybi tagged ``macosx_11_0_universal2`` (= x86-64 + arm64 support
|
|
|
|
|
in the same binary) might be able to use wheels tagged as
|
|
|
|
|
``macosx_11_0_arm64``, but only if it’s installed on an “Apple
|
|
|
|
|
Silicon” machine and running in arm64 mode.
|
|
|
|
|
|
|
|
|
|
In these two cases, an installation tool can still work out the
|
|
|
|
|
appropriate set of wheel tags by computing the local platform tags,
|
|
|
|
|
taking the wheel tag templates from ``Pybi-Wheel-Tag``, and swapping
|
|
|
|
|
in the actual supported platforms in place of the magic ``PLATFORM``
|
|
|
|
|
string.
|
|
|
|
|
|
|
|
|
|
However, there are other cases that are even more complicated:
|
|
|
|
|
|
|
|
|
|
- You can (usually) run both 32- and 64-bit apps on 64-bit Windows. So a pybi
|
|
|
|
|
installer might compute the set of allowable pybi tags on the current
|
|
|
|
|
platform as [``win32``, ``win_amd64``]. But you can’t then just take that
|
|
|
|
|
set and swap it into the pybi’s wheel tag template or you get nonsense:
|
|
|
|
|
|
|
|
|
|
::
|
|
|
|
|
|
|
|
|
|
[
|
|
|
|
|
"cp39-cp39-win32",
|
|
|
|
|
"cp39-cp39-win_amd64",
|
|
|
|
|
"cp39-abi3-win32",
|
|
|
|
|
"cp39-abi3-win_amd64",
|
|
|
|
|
...
|
|
|
|
|
]
|
|
|
|
|
|
|
|
|
|
To handle this, the installer needs to somehow understand that a
|
|
|
|
|
``manylinux_2_12_x86_64`` pybi can use a ``manylinux_2_17_x86_64`` wheel
|
|
|
|
|
as long as those are both valid tags on the current machine, but a
|
|
|
|
|
``win32`` pybi *can’t* use a ``win_amd64`` wheel, even if those are both
|
|
|
|
|
valid tags on the current machine.
|
|
|
|
|
|
|
|
|
|
- A pybi tagged ``macosx_11_0_universal2`` might be able to use
|
|
|
|
|
wheels tagged as ``macosx_11_0_x86_64``, but only if it’s
|
|
|
|
|
installed on an x86-64 machine *or* it’s installed on an ARM
|
|
|
|
|
machine *and* the interpreter is invoked with the magic
|
|
|
|
|
incantation that tells macOS to run a binary in x86-64 mode. So
|
|
|
|
|
how the installer plans to invoke the pybi matters too!
|
|
|
|
|
|
|
|
|
|
So actually using ``Pybi-Wheel-Tag`` values is less trivial than it
|
|
|
|
|
might seem, and they’re probably only useful with fairly
|
|
|
|
|
sophisticated tooling. But, smart pybi installers will already have
|
|
|
|
|
to understand a lot of these platform compatibility issues in order
|
|
|
|
|
to select a working pybi, and for the cross-platform
|
|
|
|
|
pinning/environment building case, users can potentially provide
|
|
|
|
|
whatever information is needed to disambiguate exactly what platform
|
|
|
|
|
they’re targeting. So, it’s still useful enough to include in the PyBI
|
|
|
|
|
metadata -- tools that don't find it useful can simply ignore it.
|
|
|
|
|
|
|
|
|
|
You can probably generate these metadata values by running this script on the
|
|
|
|
|
built interpreter:
|
|
|
|
|
|
|
|
|
|
.. code:: python
|
|
|
|
|
|
|
|
|
|
import packaging.markers
|
|
|
|
|
import packaging.tags
|
|
|
|
|
import sysconfig
|
|
|
|
|
import os.path
|
|
|
|
|
import json
|
|
|
|
|
import sys
|
|
|
|
|
|
|
|
|
|
marker_vars = packaging.markers.default_environment()
|
|
|
|
|
# Delete any keys that depend on the final installation
|
|
|
|
|
del marker_vars["platform_release"]
|
|
|
|
|
del marker_vars["platform_version"]
|
|
|
|
|
# Darwin binaries are often multi-arch, so play it safe and
|
|
|
|
|
# delete the architecture marker. (Better would be to only
|
|
|
|
|
# do this if the pybi actually is multi-arch.)
|
|
|
|
|
if marker_vars["sys_platform"] == "darwin":
|
|
|
|
|
del marker_vars["platform_machine"]
|
|
|
|
|
|
|
|
|
|
# Copied and tweaked version of packaging.tags.sys_tags
|
|
|
|
|
tags = []
|
|
|
|
|
interp_name = packaging.tags.interpreter_name()
|
|
|
|
|
if interp_name == "cp":
|
|
|
|
|
tags += list(packaging.tags.cpython_tags(platforms=["xyzzy"]))
|
|
|
|
|
else:
|
|
|
|
|
tags += list(packaging.tags.generic_tags(platforms=["xyzzy"]))
|
|
|
|
|
|
|
|
|
|
tags += list(packaging.tags.compatible_tags(platforms=["xyzzy"]))
|
|
|
|
|
|
|
|
|
|
# Gross hack: packaging.tags normalizes platforms by lowercasing them,
|
|
|
|
|
# so we generate the tags with a unique string and then replace it
|
|
|
|
|
# with our special uppercase placeholder.
|
|
|
|
|
str_tags = [str(t).replace("xyzzy", "PLATFORM") for t in tags]
|
|
|
|
|
|
|
|
|
|
(base_path,) = sysconfig.get_config_vars("installed_base")
|
|
|
|
|
# For some reason, macOS framework builds report their
|
|
|
|
|
# installed_base as a directory deep inside the framework.
|
|
|
|
|
while "Python.framework" in base_path:
|
|
|
|
|
base_path = os.path.dirname(base_path)
|
|
|
|
|
paths = {key: os.path.relpath(path, base_path).replace("\\", "/") for (key, path) in sysconfig.get_paths().items()}
|
|
|
|
|
|
|
|
|
|
json.dump({"marker_vars": marker_vars, "tags": str_tags, "paths": paths}, sys.stdout)
|
|
|
|
|
|
|
|
|
|
This emits a JSON dict on stdout with separate entries for each set of
|
|
|
|
|
pybi-specific tags.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Symlinks
|
|
|
|
|
--------
|
|
|
|
|
|
|
|
|
|
Currently, symlinks are used by default in all Unix Python installs (e.g.,
|
|
|
|
|
``bin/python3 -> bin/python3.9``). And furthermore, symlinks are *required* to
|
|
|
|
|
store macOS framework builds in ``.pybi`` files. So, unlike wheel files, we
|
|
|
|
|
absolutely have to support symlinks in ``.pybi`` files for them to be useful at
|
|
|
|
|
all.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Representing symlinks in zip files
|
|
|
|
|
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
|
|
|
|
|
|
|
|
|
The de-facto standard for representing symlinks in zip files is the
|
|
|
|
|
Info-Zip symlink extension, which works as follows:
|
|
|
|
|
|
|
|
|
|
- The symlink’s target path is stored as if it were the file contents
|
|
|
|
|
- The top 4 bits of the Unix permissions field are set to ``0xa``,
|
|
|
|
|
i.e.: ``permissions & 0xf000 == 0xa000``
|
|
|
|
|
- The Unix permissions field, in turn, is stored as the top 16 bits of
|
|
|
|
|
the “external attributes” field.
|
|
|
|
|
|
|
|
|
|
So if using Python’s ``zipfile`` module, you can check whether a
|
|
|
|
|
``ZipInfo`` represents a symlink by doing:
|
|
|
|
|
|
|
|
|
|
.. code:: python
|
|
|
|
|
|
|
|
|
|
(zip_info.external_attr >> 16) & 0xf000 == 0xa000
|
|
|
|
|
|
|
|
|
|
Or if using Rust’s ``zip`` crate, the equivalent check is:
|
|
|
|
|
|
|
|
|
|
.. code:: rust
|
|
|
|
|
|
|
|
|
|
fn is_symlink(zip_file: &zip::ZipFile) -> bool {
|
|
|
|
|
match zip_file.unix_mode() {
|
|
|
|
|
Some(mode) => mode & 0xf000 == 0xa000,
|
|
|
|
|
None => false,
|
|
|
|
|
}
|
|
|
|
|
}
|
|
|
|
|
|
|
|
|
|
If you’re on Unix, your ``zip`` and ``unzip`` commands probably understands this
|
|
|
|
|
format already.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Representing symlinks in RECORD files
|
|
|
|
|
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
|
|
|
|
|
|
|
|
|
Normally, a ``RECORD`` file lists each file + its hash + its length:
|
|
|
|
|
|
|
|
|
|
.. code:: text
|
|
|
|
|
|
|
|
|
|
my/favorite/file,sha256=...,12345
|
|
|
|
|
|
|
|
|
|
For symlinks, we instead write:
|
|
|
|
|
|
|
|
|
|
.. code:: text
|
|
|
|
|
|
|
|
|
|
name/of/symlink,symlink=path/to/symlink/target,
|
|
|
|
|
|
|
|
|
|
That is: we use a special “hash function” called ``symlink``, and then
|
|
|
|
|
store the actual symlink target as the “hash value”. And the length is
|
|
|
|
|
left empty.
|
|
|
|
|
|
|
|
|
|
**Rationale:** we’re already committed to the ``RECORD`` file containing a
|
|
|
|
|
redundant check on everything in the main archive, so for symlinks we at least
|
|
|
|
|
need to store some kind of hash, plus some kind of flag to indicate that this is
|
|
|
|
|
a symlink. Given that symlink target strings are roughly the same size as a
|
|
|
|
|
hash, we might as well store them directly. This also makes the symlink
|
|
|
|
|
information easier to access for tools that don’t understand the Info-Zip
|
|
|
|
|
symlink extension, and makes it possible to losslessly unpack and repack a Unix
|
|
|
|
|
pybi on a Windows system, which someone might find handy at some point.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Storing symlinks in ``pybi`` files
|
|
|
|
|
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
|
|
|
|
|
|
|
|
|
When a pybi creator stores a symlink, they MUST use both of the
|
|
|
|
|
mechanisms defined above: storing it in the zip archive directly using
|
|
|
|
|
the Info-Zip representation, and also recording it in the ``RECORD``
|
|
|
|
|
file.
|
|
|
|
|
|
|
|
|
|
Pybi consumers SHOULD validate that the symlinks in the archive and
|
|
|
|
|
``RECORD`` file are consistent with each other.
|
|
|
|
|
|
|
|
|
|
We also considered using *only* the ``RECORD`` file to store symlinks,
|
|
|
|
|
but then the vanilla ``unzip`` tool wouldn’t be able to unpack them, and
|
|
|
|
|
that would make it hard to install a pybi from a shell script.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Limitations
|
|
|
|
|
~~~~~~~~~~~
|
|
|
|
|
|
|
|
|
|
Symlinks enable a lot of potential messiness. To keep things under
|
|
|
|
|
control, we impose the following restrictions:
|
|
|
|
|
|
|
|
|
|
- Symlinks MUST NOT be used in ``.pybi``\ s targeting Windows, or other
|
|
|
|
|
platforms that are missing first-class symlink support.
|
|
|
|
|
|
|
|
|
|
- Symlinks MUST NOT be used inside the ``pybi-info`` directory.
|
|
|
|
|
(Rationale: there’s no need, and it makes things simpler for
|
|
|
|
|
resolvers that need to extract info from ``pybi-info`` without
|
|
|
|
|
unpacking the whole archive.)
|
|
|
|
|
|
|
|
|
|
- Symlink targets MUST be relative paths, and MUST be inside the pybi
|
|
|
|
|
directory.
|
|
|
|
|
|
|
|
|
|
- If ``A/B/...`` is recorded as a symlink in the archive, then there
|
|
|
|
|
MUST NOT be any other entries in the archive named like
|
|
|
|
|
``A/B/.../C``.
|
|
|
|
|
|
|
|
|
|
For example, if an archive has a symlink ``foo -> bar``, and then
|
|
|
|
|
later in the archive there’s a regular file named ``foo/blah.py``,
|
|
|
|
|
then a naive unpacker could potentially end up writing a file called
|
|
|
|
|
``bar/blah.py``. Don’t be naive.
|
|
|
|
|
|
|
|
|
|
Unpackers MUST verify that these rules are followed, because without
|
|
|
|
|
them attackers could create evil symlinks like ``foo -> /etc/passwd`` or
|
|
|
|
|
``foo -> ../../../../../etc`` + ``foo/passwd -> ...`` and cause havoc.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Non-normative comments
|
|
|
|
|
======================
|
|
|
|
|
|
|
|
|
|
Why not just use conda?
|
|
|
|
|
-----------------------
|
|
|
|
|
|
|
|
|
|
This isn't really in the scope of this PEP, but since conda is a popular way to
|
|
|
|
|
distribute binary Python interpreters, it's a natural question.
|
|
|
|
|
|
|
|
|
|
The simple answer is: conda is great! But, there are lots of python users who
|
|
|
|
|
aren't conda users, and they deserve nice things too. This PEP just gives them
|
|
|
|
|
another option.
|
|
|
|
|
|
|
|
|
|
The deeper answer is: the maintainers who upload packages to PyPI are the
|
|
|
|
|
backbone of the Python ecosystem. They're the first audience for Python
|
|
|
|
|
packaging tools. And one thing they want is to upload a package once, and have
|
|
|
|
|
it be accessible across all the different ways Python is deployed: in Debian and
|
|
|
|
|
Fedora and Homebrew and FreeBSD, in Conda environments, in big companies'
|
|
|
|
|
monorepos, in Nix, in Blender plugins, in RenPy games, ..... you get the idea.
|
|
|
|
|
|
|
|
|
|
All of these environments have their own tooling and strategies for managing
|
|
|
|
|
packages and dependencies. So what's special about PyPI and wheels is that
|
|
|
|
|
they're designed to describe dependencies in a *standard, abstract way*, that
|
|
|
|
|
all these downstream systems can consume and convert into their local
|
|
|
|
|
conventions. That's why package maintainers use Python-specific metadata and
|
|
|
|
|
upload to PyPI: because it lets them address all of those systems
|
|
|
|
|
simultaneously. Every time you build a Python package for conda, there's an
|
|
|
|
|
intermediate wheel that's generated, because wheels are the common language that
|
|
|
|
|
Python package build systems and conda can use to talk to each other.
|
|
|
|
|
|
|
|
|
|
But then, if you're a maintainer releasing an sdist+wheels, then you naturally
|
|
|
|
|
want to test what you're releasing, which may depend on arbitrary PyPI packages
|
|
|
|
|
and versions. So you need tools that build Python environments directly from
|
|
|
|
|
PyPI, and conda is fundamentally not designed to do that. So conda and pip are
|
|
|
|
|
both necessary for different cases, and this proposal happens to be targeting
|
|
|
|
|
the pip side of that equation.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Sdists (or not)
|
|
|
|
|
---------------
|
|
|
|
|
|
|
|
|
|
It might be cool to have an “sdist” equivalent for pybis, i.e., some
|
|
|
|
|
kind of format for a Python source release that’s structured-enough to
|
|
|
|
|
let tools automatically fetch and build it into a pybi, for platforms
|
|
|
|
|
where prebuilt pybis aren’t available. But, this isn’t necessary for the
|
|
|
|
|
MVP and opens a can of worms, so let’s worry about it later.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
What packages should be bundled inside a pybi?
|
|
|
|
|
----------------------------------------------
|
|
|
|
|
|
|
|
|
|
Pybi builders have the power to pick and choose what exactly goes inside. For
|
|
|
|
|
example, you could include some preinstalled packages in the pybi’s
|
|
|
|
|
``site-packages`` directory, or prune out bits of the stdlib that you don’t
|
|
|
|
|
want. We can’t stop you! Though if you do preinstall packages, then it's
|
|
|
|
|
strongly recommended to also include the correct metadata (``.dist-info`` etc.),
|
|
|
|
|
so that it’s possible for Pip or other tools to understand out what’s going on.
|
|
|
|
|
|
|
|
|
|
For my prototype “general purpose” pybi’s, what I chose is:
|
|
|
|
|
|
|
|
|
|
- Make sure ``site-packages`` is *empty*.
|
|
|
|
|
|
|
|
|
|
**Rationale:** for traditional standalone python installers that are targeted
|
|
|
|
|
at end-users, you probably want to include at least ``pip``, to avoid
|
|
|
|
|
bootstrapping issues (:pep:`453`). But pybis are different: they’re designed
|
|
|
|
|
to be installed by “smart” tooling, that consume the pybi as part of some
|
|
|
|
|
kind of larger automated deployment process. It’s easier for these installers
|
|
|
|
|
to start from a blank slate and then add whatever they need, than for them to
|
|
|
|
|
start with some preinstalled packages that they may or may not want. (And
|
|
|
|
|
besides, you can still run ``python -m ensurepip``.)
|
|
|
|
|
|
|
|
|
|
- Include the full stdlib, *except* for ``test``.
|
|
|
|
|
|
|
|
|
|
**Rationale:** the top-level ``test`` module contains CPython’s own test
|
|
|
|
|
suite. It’s huge (CPython without ``test`` is ~37 MB, then ``test``
|
|
|
|
|
adds another ~25 MB on top of that!), and essentially never used by
|
|
|
|
|
regular user code. Also, as precedent, the official nuget packages,
|
|
|
|
|
the official manylinux images, and multiple Linux distributions all
|
|
|
|
|
leave it out, and this hasn’t caused any major problems.
|
|
|
|
|
|
|
|
|
|
So this seems like the best way to balance broad compatibility with
|
|
|
|
|
reasonable download/install sizes.
|
|
|
|
|
|
|
|
|
|
- I’m not shipping any ``.pyc`` files. They take up space in the
|
|
|
|
|
download, can be generated on the final system at minimal cost, and
|
|
|
|
|
dropping them removes a source of location-dependence. (``.pyc``
|
|
|
|
|
files store the absolute path of the corresponding ``.py`` file and
|
|
|
|
|
include it in tracebacks; but, pybis are relocatable, so the correct
|
|
|
|
|
path isn’t known until after install.)
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Backwards Compatibility
|
|
|
|
|
=======================
|
|
|
|
|
|
|
|
|
|
No backwards compatibility considerations.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Security Implications
|
|
|
|
|
=====================
|
|
|
|
|
|
|
|
|
|
No security implications, beyond the fact that anyone who takes it upon
|
|
|
|
|
themselves to distribute binaries has to come up with a plan to manage their
|
|
|
|
|
security (e.g., whether they roll a new build after an OpenSSL CVE drops). But
|
|
|
|
|
collectively, we core Python folks are already maintaining binary builds for all
|
|
|
|
|
major platforms (macOS + Windows through python.org, and Linux builds through
|
|
|
|
|
the official manylinux image), so even if we do start releasing official CPython
|
|
|
|
|
builds on PyPI it doesn't really raise any new security issues.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
How to Teach This
|
|
|
|
|
=================
|
|
|
|
|
|
|
|
|
|
This isn't targeted at end-users; their experience will simply be that e.g.
|
|
|
|
|
their pyenv or tox invocation magically gets faster and more reliable (if those
|
|
|
|
|
projects' maintainers decide to take advantage of this PEP).
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Copyright
|
|
|
|
|
=========
|
|
|
|
|
|
|
|
|
|
This document is placed in the public domain or under the
|
|
|
|
|
CC0-1.0-Universal license, whichever is more permissive.
|