PEP 777: How to Re-invent the Wheel (#4036)
Co-authored-by: Jelle Zijlstra <jelle.zijlstra@gmail.com>
Co-authored-by: Hugo van Kemenade <1324225+hugovk@users.noreply.github.com>
Co-authored-by: Carol Willing <carolcode@willingconsulting.com>

parent 3b1b3d8ba3
commit fd8070858f

@ -640,6 +640,8 @@ peps/pep-0759.rst @warsaw
peps/pep-0760.rst @pablogsal @brettcannon
peps/pep-0761.rst @sethmlarson @hugovk
# ...
peps/pep-0777.rst @warsaw
# ...
peps/pep-0789.rst @njsmith
# ...
peps/pep-0801.rst @warsaw

@ -0,0 +1,303 @@
PEP: 777
Title: How to Re-invent the Wheel
Author: Ethan Smith <ethan@ethanhs.me>
Sponsor: Barry Warsaw <barry@python.org>
PEP-Delegate: Paul Moore <p.f.moore@gmail.com>
Status: Draft
Type: Standards Track
Topic: Packaging
Created: 09-Oct-2024
Post-History:

Abstract
========

The current :pep:`wheel 1.0 specification <427>` was written over a decade ago,
and has been extremely robust to changes in the Python packaging ecosystem.
Previous efforts to improve the wheel specification
:pep:`were deferred <491#pep-deferral>` to focus on other packaging
specifications. Meanwhile, the use of wheels has changed dramatically in the
last decade. There have been many requests for new wheel features over the
years; however, a fundamental obstacle to evolving the wheel specification has
been that there is no defined process for how to handle adding
backwards-incompatible features to wheels. Therefore, to enable other PEPs to
describe new enhancements to the wheel specification, **this PEP prescribes**
**compatibility requirements on future wheel revisions**. This PEP does *not*
specify a new wheel revision. The specification of a new wheel format
(“Wheel 2.0”) is left to a future PEP.

Rationale
=========

Currently, wheel specification changes that require new installer behavior are
backwards incompatible and require a major version increase in
the wheel metadata format. An increase of the wheel major version has yet to
happen, partially because such a change has the potential to be
catastrophically disruptive. Per
`the wheel specification <https://packaging.python.org/en/latest/specifications/binary-distribution-format/#installing-a-wheel-distribution-1-0-py32-none-any-whl>`_,
any installer that does not support the new major version must abort at install
time. This means that if the major version were to be incremented without
further planning, many users would see installation failures as older
installers reject new wheels
uploaded to public package indices like the Python Package Index (PyPI). It is
critically important to carefully plan the interactions between build tools,
package indices, and package installers to avoid incompatibility issues,
especially considering the long tail of users who are slow to update their
installers.

The backward compatibility concerns have prevented valuable improvements
to the wheel file format, such as
`better compression <https://discuss.python.org/t/improving-wheel-compression-by-nesting-data-as-a-second-zip/1747>`_,
`wheel data format improvements <https://discuss.python.org/t/should-there-be-a-new-standard-for-installing-arbitrary-data-files/7853/7>`_,
`better information about what is included in a wheel <https://discuss.python.org/t/record-the-top-level-names-of-a-wheel-in-metadata/29494>`_,
and `JSON formatted metadata in the ".dist-info" folder <https://discuss.python.org/t/is-was-there-a-goal-with-pep-566s-json-encoding-section/12324/3>`_.

This PEP describes constraints and behavior for new wheel revisions to preserve
stability for existing tools that do not support a new major version of the
wheel format.
This ensures that backwards incompatible changes to the wheel specification
will only affect users and tools that are properly set up to use the newer
wheels. With a clear path for evolving the wheel specification, future PEPs
will be able to improve the wheel format without needing to re-define a
completely new compatibility story.

Specification
=============

Add Wheel-Version Metadata Field to Core Metadata
-------------------------------------------------

Currently, the :pep:`wheel 1.0 specification <427>` (PEP 427) specifies that
wheel files
must contain a ``WHEEL`` metadata file that contains the version of the wheel
specification that the file conforms to. PEP 427 stipulates that installers
MUST warn on installation of a wheel with a minor version greater than supported,
and MUST abort on installation of wheels with a major version that is greater than
what the installer supports. This ensures that users do not get invalid
installations from wheels that installers cannot properly install.

However, resolvers do not currently exclude wheels with an incompatible wheel
version. There is also currently no way for a resolver to check a wheel's
version without downloading the wheel directly. To make wheel version filtering
easy for resolvers, the wheel version **MUST** be included in the relevant
metadata file (currently ``METADATA``). This will allow resolvers to efficiently
check the wheel version using the :pep:`658` metadata API without needing to
download and inspect the ``.dist-info/WHEEL`` file.

To accomplish this, a new core metadata field is introduced called
``Wheel-Version``. While this field is optional for metadata included in a
wheel of major version 1, it is a mandatory field for metadata in wheels of major
version 2 or higher. This enforces that future revisions of the wheel
specification can rely on resolvers skipping incompatible wheels by checking
the ``Wheel-Version`` field.

The ``Wheel-Version`` field in the metadata file shall contain the exact same
entry as the
``Wheel-Version`` entry in the ``WHEEL`` file, or any future replacement file
defining metadata about the wheel file. Installers **MUST** verify that these
entries match when installing a wheel. If ``Wheel-Version`` is absent from the
metadata file, then the implied major version of the wheel is 1.

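As a non-normative illustration of the consistency rule above, the following
minimal sketch parses both files with the standard library's ``email.parser``
and compares the two entries. The function names are hypothetical and do not
come from any existing tool, and the sketch assumes a simple ``major.minor``
version string.

```python
from email.parser import HeaderParser
from typing import Optional, Tuple


def read_wheel_version(text: str) -> Optional[str]:
    """Return the Wheel-Version entry from RFC 822-style text, if present."""
    value = HeaderParser().parsestr(text).get("Wheel-Version")
    return value.strip() if value is not None else None


def parse_version(value: str) -> Tuple[int, int]:
    """Split an assumed 'major.minor' string into integers."""
    major, minor = value.split(".")
    return int(major), int(minor)


def check_wheel_versions(metadata_text: str, wheel_text: str) -> Tuple[int, int]:
    """Verify that METADATA and WHEEL agree; return (major, minor)."""
    wheel_value = read_wheel_version(wheel_text)
    if wheel_value is None:
        raise ValueError("WHEEL has no Wheel-Version entry")
    major, minor = parse_version(wheel_value)

    metadata_value = read_wheel_version(metadata_text)
    if metadata_value is None:
        # Absent from METADATA: the implied major version is 1, so a
        # WHEEL file claiming a higher major version is inconsistent.
        if major != 1:
            raise ValueError("Wheel-Version is mandatory in METADATA for major version 2+")
        return major, minor

    if metadata_value != wheel_value:
        raise ValueError(
            f"Wheel-Version mismatch: METADATA has {metadata_value!r}, "
            f"WHEEL has {wheel_value!r}"
        )
    return major, minor
```

An installer following this sketch would call ``check_wheel_versions`` before
unpacking and abort on ``ValueError``.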
Resolver Behavior Regarding ``Wheel-Version``
---------------------------------------------

Resolvers, in the process of selecting a wheel to install, **MUST** check a
candidate wheel's ``Wheel-Version``, and ignore incompatible wheel files.
Without ignoring these files, older installers might select a wheel that uses
an unsupported wheel version for that installer, and force the installer to
abort per :pep:`427`. By skipping incompatible wheel files, users will not see
installation errors when a project adopts a new wheel major version. As already
specified in PEP 427, installers **MUST** abort if a user tries to directly
install a wheel that is incompatible. If, in the process of resolving packages
found in multiple indices, a resolver comes across two wheels of the same
distribution and version, it should prioritize the wheel with the highest
compatible wheel version.

While the above protects users from unexpected breakages, users may miss a new
release of a distribution if their installer does not support the wheel version
used in the release. Imagine in the future that a package publishes 3.0 wheel
files. Downstream users won't see that there is a new release available if
their installers only support 2.x wheels. Therefore, installers **SHOULD** emit
a warning if, in the process of resolving packages, they come across an
incompatible wheel and skip it.

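The filtering behavior described above could be sketched roughly as follows.
This is an illustration only: the ``Candidate`` class, the
``SUPPORTED_MAJOR`` constant, and the choice to treat a missing field as
version ``1.0`` are assumptions made for the example, not part of this PEP or
of any existing resolver.

```python
import warnings
from dataclasses import dataclass
from typing import List, Optional, Tuple

# Hypothetical: the highest wheel major version this installer understands.
SUPPORTED_MAJOR = 2


@dataclass
class Candidate:
    filename: str
    # Value of Wheel-Version from the metadata file, or None if absent.
    wheel_version: Optional[str]


def version_of(candidate: Candidate) -> Tuple[int, int]:
    """Return (major, minor); an absent field implies major version 1."""
    if candidate.wheel_version is None:
        return (1, 0)  # assumption: minor version is unknown, use 0
    major, minor = candidate.wheel_version.split(".")
    return (int(major), int(minor))


def compatible_candidates(
    candidates: List[Candidate], supported_major: int = SUPPORTED_MAJOR
) -> List[Candidate]:
    """Drop incompatible wheels (with a warning) and sort the rest so the
    highest compatible wheel version comes first."""
    kept = []
    for candidate in candidates:
        major, _ = version_of(candidate)
        if major > supported_major:
            warnings.warn(
                f"skipping {candidate.filename}: wheel major version "
                f"{major} is not supported by this installer",
                stacklevel=2,
            )
            continue
        kept.append(candidate)
    return sorted(kept, key=version_of, reverse=True)
```

The warning keeps users informed that a newer release exists even though the
resolver skipped it.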
First Major Version Bump Must Change File Extension
---------------------------------------------------

Unfortunately, existing resolvers do not check the compatibility of wheels
before selecting them as installation candidates. Until a majority of users
update to installers that properly check for wheel compatibility, it is unsafe
to allow publishing wheels of a new major version that existing resolvers might
select. It could take upwards of four years before the majority of users are on
updated resolvers, based on current data about PyPI installer usage (see the
:ref:`777-pypi-download-analysis` for
details). To allow for experimentation and faster adoption of 2.0 wheels,
this PEP proposes a one-time change to the file extension of the
wheel file format, from ``.whl`` to ``.whlx``. This resolves the initial
transition issue of 2.0 wheels breaking users on existing installers that do
not implement ``Wheel-Version`` checks. By using a different file extension,
2.0 wheels can immediately be uploaded to PyPI, and users will be able to
experiment with the new features right away. Users on older installers will
simply ignore these new files.

One rejected alternative would be to keep the ``.whl`` extension, but delay the
publishing of wheel 2.0 to PyPI. For more on that, please see Rejected Ideas.

Recommended Build Backend Behavior with New Wheel Formats
---------------------------------------------------------

Build backends are recommended to generate the most compatible wheel based on
features a project uses. For example, if a wheel does not use symbolic links,
and such a feature was introduced in wheel 5.0, the build backend could
generate a wheel of version 4.0. On the other hand, some features will want to
be adopted by default. For example, if wheel 3.0 introduces better compression,
the build backend may wish to enable this feature by default to improve the
wheel size and download performance.

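One way a build backend might pick a wheel version is sketched below. The
feature names and the versions in which they appear are purely hypothetical,
taken from the examples in the paragraph above; no such wheel versions exist
today.

```python
from typing import Iterable

# Hypothetical mapping: the wheel major version that first supports a feature.
FEATURE_INTRODUCED_IN = {
    "better_compression": 3,
    "symlinks": 5,
}

# Hypothetical: features the backend turns on even if the project
# does not ask for them.
DEFAULT_FEATURES = frozenset({"better_compression"})


def choose_wheel_major(project_features: Iterable[str]) -> int:
    """Return the lowest wheel major version that supports every feature
    the project uses plus the backend's default-on features."""
    needed = DEFAULT_FEATURES | set(project_features)
    return max((FEATURE_INTRODUCED_IN[name] for name in needed), default=1)
```

Under these assumptions, a project that does not use symbolic links is built
at the version required by the default features only, while a project that
does use them is built as a version 5 wheel.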
Limitations on Future Wheel Revisions
-------------------------------------

While it is difficult to know what future features may be planned for the wheel
format, it is important that certain compatibility promises are maintained.

Wheel files, when installed, **MUST** stay compatible with the Python standard
library's ``importlib.metadata`` for all supported CPython versions. For
example, replacing ``.dist-info/METADATA`` with a JSON formatted metadata file
MUST be a multi-major version migration with one version introducing the new
JSON file alongside the existing email header format, and another future
version removing the email header format metadata file. The version to remove
``.dist-info/METADATA`` also **MUST** be adopted only after the last CPython
release that lacked support for the new file reaches end of life. This ensures
that code using ``importlib.metadata`` will not break with wheel major version
revisions.

Wheel files **MUST** remain ZIP format files as the outer container format.
Additionally, the ``.dist-info`` metadata directory **MUST** be placed at the
root of the archive without any compression, so that unpacking the wheel file
produces a normal ``.dist-info`` directory holding any metadata for the wheel.
Future wheel revisions **MAY** modify the layout, compression, and other
attributes about non-metadata components of a wheel such as data and code. This
assures that future wheel revisions remain compatible with tools operating on
package metadata, while allowing for improvements to code storage in the wheel,
such as adopting compression.

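Because the outer container stays a ZIP file with ``.dist-info`` at its root,
a metadata-only tool can keep working without understanding the rest of the
archive. The following is a minimal sketch of that idea using only the
standard library; the helper names and the demo archive contents (including
the ``Wheel-Version: 2.0`` entry) are invented for illustration.

```python
import io
import zipfile
from email.message import Message
from email.parser import HeaderParser


def read_core_metadata(wheel_bytes: bytes) -> Message:
    """Read the root-level .dist-info/METADATA from a wheel-like ZIP."""
    with zipfile.ZipFile(io.BytesIO(wheel_bytes)) as archive:
        candidates = [
            name
            for name in archive.namelist()
            # Exactly one "/" means the .dist-info directory is at the root.
            if name.count("/") == 1 and name.endswith(".dist-info/METADATA")
        ]
        if len(candidates) != 1:
            raise ValueError("expected exactly one root-level .dist-info/METADATA")
        text = archive.read(candidates[0]).decode("utf-8")
    return HeaderParser().parsestr(text)


def build_demo_wheel() -> bytes:
    """Build a tiny in-memory archive that stands in for a wheel."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("demo/__init__.py", "")
        archive.writestr(
            "demo-1.0.dist-info/METADATA",
            "Metadata-Version: 2.1\nName: demo\nVersion: 1.0\nWheel-Version: 2.0\n",
        )
    return buffer.getvalue()


metadata = read_core_metadata(build_demo_wheel())
print(metadata["Name"], metadata["Wheel-Version"])
```

Nothing in ``read_core_metadata`` depends on how the code and data files are
laid out, which is the property the requirements above are meant to protect.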
Package tooling **MUST NOT** assume that the contents and format of the wheel
file will remain the same for future wheel major versions beyond the
limitations above about metadata folder contents and outer container format.
For example, newer wheel major versions may add or remove filename components,
such as the build tag or the platform tag. Therefore it is incumbent upon
tooling to check the metadata for the ``Wheel-Version`` before attempting to
install a wheel.

Finally, future wheel revisions **MUST NOT** use any compression format that is
not available in the standard library of at least the latest CPython release.
Wheels generated
using any new compression format should be tagged as requiring at least the
first released version of CPython to support the new compression format,
regardless of the Python API compatibility of the code within the wheel.

Backwards Compatibility
=======================

Backwards compatibility is an incredibly important issue for evolving the wheel
format. If adopting a new wheel revision is painful for downstream users,
package creators will hesitate to adopt the new standards, and users will be
stuck with failed CI pipelines and other installation woes.

Several choices in the above specification are made so that the adoption of a
new feature is less painful. For example, today wheels of an incompatible major
version are still selected by pip as installation candidates, which causes
installer failures if a project starts publishing 2.0 wheels. To avoid this
issue, this PEP requires resolvers to filter out wheels with major versions or
features incompatible with the installer.

This PEP also defines constraints on future wheel revisions, with the goal of
maintaining compatibility with CPython, but allowing evolution of wheel
contents. Wheel revisions shouldn't cause package installations to break on
older CPython revisions, as not only would it be frustrating, it would be
incredibly hard to debug for users.

The main compatibility limitation of this PEP is for projects that start
publishing solely new wheels alongside a source distribution. If a user on an
older installer tries to install the package, it will fall back to the source
distribution, because the resolver will skip all newer wheels. Users are often
poorly set up to build projects from source, so this could lead to some failed
builds users would not see otherwise. There are several approaches to resolving
this issue, such as allowing dual-publishing for the initial migration, or
marking source distributions as not intended to be built.

Rejected Ideas
==============

The Wheel Format is Perfect and Does not Need to be Changed
-----------------------------------------------------------

The wheel format has been around for over 10 years, and in that time, Python
packages have changed a lot. It is much more common for packages to include
Rust or C extension modules, increasing the size of packages. Better
compression, such as lzma or zstd, could save a lot of time and bandwidth for
PyPI and its users. Compatibility tags cannot express the wide variety of
hardware used to accelerate Python code today, nor encode shared library
compatibility information. In order to address these issues, evolution of the
wheel package format is necessary.

Wheel Format Changes Should be Tied to CPython Releases
-------------------------------------------------------

I do not believe that tying wheel revisions to CPython
releases is beneficial. The main benefit of doing so is to make adoption of new
wheels predictable: users with the latest CPython get the latest package
format! This choice has several issues, however. First, tying the new format
to the latest CPython makes adoption much slower. Users on LTS versions of
Linux with older Python installations are free to update their pip in a virtual
environment, but cannot update the version of Python as easily. While some
changes to the wheel format must necessarily be tied to CPython changes, such
as adding new compression formats or changing the metadata format, many changes
do not need to be tied to the Python version, such as symlinks, enhanced
compatibility tags, and new formats that use existing compression formats in
the standard library. Additionally, wheels are used across multiple different
language implementations, which lag behind the CPython version. It seems unfair
to prevent their users from using a feature due to the Python version. Lastly,
while this PEP does not suggest tying the wheel version to CPython releases, a
future PEP may still do so at any time, so this choice does not need to be made
in this PEP.

Keep Using ``.whl`` as the File Extension
-----------------------------------------

While keeping the extension ``.whl`` is appealing for many reasons, it presents
several problems that are difficult to surmount. First, current installers
would still pick a new wheel and fail to install the package. Furthermore,
the file name of a wheel would not be able to change without breaking existing
installers that expect a set wheel file name format. While the current filename
specification for wheels is sufficient for current usage, the optional
build tag in the middle of the file name makes any extensions ambiguous (e.g.
``foo-0.3-py3-none-any-fancy_new_tag.whl`` would parse as the build tag being
``py3``). This limits changes to information stored in the wheel file name.

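The ambiguity can be seen with a deliberately simplified positional parser,
sketched below. It is not how any particular installer is implemented; a
stricter parser would likely reject the name outright, since the current
specification requires build tags to start with a digit. Either way, the
extra component cannot be read as a new trailing tag.

```python
from typing import Dict, Optional


def parse_wheel_filename(filename: str) -> Dict[str, Optional[str]]:
    """Split a wheel file name into its dash-separated components.

    Five components mean there is no build tag; six mean the third
    component is the build tag.
    """
    stem, _, _extension = filename.rpartition(".")
    parts = stem.split("-")
    if len(parts) == 5:
        name, version, python, abi, platform = parts
        build = None
    elif len(parts) == 6:
        name, version, build, python, abi, platform = parts
    else:
        raise ValueError(f"unexpected number of components in {filename!r}")
    return {
        "name": name,
        "version": version,
        "build": build,
        "python": python,
        "abi": abi,
        "platform": platform,
    }


parsed = parse_wheel_filename("foo-0.3-py3-none-any-fancy_new_tag.whl")
print(parsed["build"], parsed["python"], parsed["abi"], parsed["platform"])
# prints: py3 none any fancy_new_tag
```

Every tag shifts one position to the right, so ``py3`` lands in the build tag
slot and ``fancy_new_tag`` is taken to be the platform tag.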
Discussion Topics
=================

Should Indices Support Dual-publishing for the First Migration?
---------------------------------------------------------------

Since ``.whl`` and ``.whlx`` files have different file names, they could be
uploaded side-by-side to package indices like PyPI. This has some nice
benefits, like dual-support for older and newer installers: users who upgrade
get the latest features, while users who don't upgrade can still install the
latest version of a package.

There are many complications, however. Should we allow wheel 2 uploads to
existing wheel 1-only releases? Should we put any requirements on the
side-by-side wheels, such as:

.. admonition:: Constraints on dual-published wheels

   A given index may contain identical-content wheels with different wheel
   versions, and installers should prefer the newest-available wheel format,
   with all other factors held constant.

Should we only allow uploading both formats together, using :pep:`694` to make
the dual-publishing "atomic"?

Acknowledgements
================

The author of this PEP is greatly indebted to the incredibly valuable review,
advice, and feedback of Barry Warsaw and Michael Sarahan.

Copyright
=========

This document is placed in the public domain or under the
CC0-1.0-Universal license, whichever is more permissive.

Binary file not shown.
After Width: | Height: | Size: 117 KiB |
Binary file not shown.
After Width: | Height: | Size: 126 KiB |
|
@ -0,0 +1,78 @@
:orphan:

.. _777-pypi-download-analysis:

Appendix: Analysis of Installer Usage on PyPI
=============================================

.. note::

   This analysis is not perfect. While it uses the best available data,
   mirrors, caches used by enterprises, and other confounding factors
   could affect the numbers in this analysis. Consider the numbers as trends
   rather than concrete reliable figures.

One pertinent question to :pep:`777` is how frequently Python users update their
installer. If users update quite frequently, compatibility concerns are not as
important; users will be up-to-date by the time new features get added. On the
other hand, if users are frequently using older installers, then incompatible
wheels on PyPI would have a much wider impact. To figure out the relative share
of up-to-date vs outdated installers, we can use PyPI download statistics.

PyPI publishes a `BigQuery dataset <https://console.cloud.google.com/marketplace/product/gcp-public-data-pypi/pypi>`_,
which contains information about each download PyPI receives, including
installer name and version when available. The following query was used to
collect the data for this analysis:

.. code-block:: sql

   #standardSQL
   SELECT
     details.installer.name as installer_name,
     details.installer.version as installer_version,
     COUNT(*) as num_downloads,
   FROM `bigquery-public-data.pypi.file_downloads`
   WHERE
     -- Only query the last 6 months of data
     DATE(timestamp)
       BETWEEN DATE_TRUNC(DATE_SUB(CURRENT_DATE(), INTERVAL 6 MONTH), MONTH)
         AND CURRENT_DATE()
   GROUP BY `installer_name`, `installer_version`
   ORDER BY `num_downloads` DESC

With the raw data available, we can start investigating how up-to-date
installers that download packages from PyPI are. The below chart shows the
breakdown by installer name of all downloads on PyPI for the six month period
from March 10, 2024 to September 10, 2024.

.. image:: appendix-dl-by-installer.png
   :class: invert-in-dark-mode
   :width: 600
   :alt: A pie chart breaking down PyPI downloads by installer. pip makes up
         87.5%, uv makes up 4.8%, poetry makes up 3.0%, requests makes up 1.6%,
         and "null" makes up 2.1%.

As can be seen above, pip is the most popular installer in this time frame.
For simplicity's sake, this analysis will focus on pip installations when
considering how up-to-date installers are. pip has existed for a long
time, so analyzing the version of pip used to download packages should
provide an idea of how frequently users update their installers. Below is a
chart breaking down installations in PyPI over the same six month period, now
grouped by pip installer major version. pip uses calendar versioning, so
an installation from pip 20.x means that the user has not updated their pip
in four years.

.. image:: appendix-dl-by-pip-version.png
   :class: invert-in-dark-mode
   :width: 600
   :alt: A pie chart breaking down PyPI downloads by pip major version. 24.x
         makes up 47.7%, 23.x makes up 19.9%, 22.x makes up 10.5%, 21.x makes up
         13.9%, 20.x makes up 5.4%, and 9.x makes up 1.9%.

Over two thirds of users currently run pip from this year or last. However,
about 7% are on a version that is at least four years old(!). This indicates that
there is a long tail of users who do not regularly update their installers.

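For reference, the two figures quoted above follow directly from the
percentages in the chart's description. The short calculation below simply
adds them up; the variable names are chosen for this example.

```python
# Share of PyPI downloads by pip major version, in percent (from the chart).
pip_share = {
    "24.x": 47.7,
    "23.x": 19.9,
    "22.x": 10.5,
    "21.x": 13.9,
    "20.x": 5.4,
    "9.x": 1.9,
}

# pip released this year or last (calendar versions 24.x and 23.x).
recent = round(pip_share["24.x"] + pip_share["23.x"], 1)

# pip that is at least four years old (20.x and the much older 9.x).
outdated = round(pip_share["20.x"] + pip_share["9.x"], 1)

print(recent, outdated)  # prints: 67.6 7.3
```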
Coming back to the initial question for PEP 777, it appears that caution should
be taken when publishing wheels with major version 2 to PyPI, as they are
likely to cause issues with a small but significant proportion of users who do
not regularly update their pip.