2016-08-23 12:40:01 -04:00
|
|
|
|
PEP: 527
|
|
|
|
|
Title: Removing Un(der)used file types/extensions on PyPI
|
|
|
|
|
Version: $Revision$
|
|
|
|
|
Last-Modified: $Date$
|
|
|
|
|
Author: Donald Stufft <donald@stufft.io>
|
2016-08-24 16:45:14 -04:00
|
|
|
|
BDFL-Delegate: Nick Coghlan <ncoghlan@gmail.com>
|
2016-08-23 12:40:01 -04:00
|
|
|
|
Discussions-To: distutils-sig@python.org
|
2020-06-17 04:54:30 -04:00
|
|
|
|
Status: Final
|
2016-08-23 12:40:01 -04:00
|
|
|
|
Type: Process
|
|
|
|
|
Content-Type: text/x-rst
|
|
|
|
|
Created: 23-Aug-2016
|
|
|
|
|
Post-History: 23-Aug-2016
|
2016-09-02 12:39:05 -04:00
|
|
|
|
Resolution: https://mail.python.org/pipermail/distutils-sig/2016-September/029624.html
|
2016-08-23 12:40:01 -04:00
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Abstract
|
|
|
|
|
========
|
|
|
|
|
|
|
|
|
|
This PEP recommends deprecating, and ultimately removing, support for uploading
|
|
|
|
|
certain unused or under used file types and extensions to PyPI. In particular
|
|
|
|
|
it recommends disallowing further uploads of any files of the types
|
|
|
|
|
``bdist_dumb``, ``bdist_rpm``, ``bdist_dmg``, ``bdist_msi``, and
|
|
|
|
|
``bdist_wininst``, leaving PyPI to only accept new uploads of the ``sdist``,
|
|
|
|
|
``bdist_wheel``, and ``bdist_egg`` file types.
|
|
|
|
|
|
|
|
|
|
In addition, this PEP proposes removing support for new uploads of sdists using
|
2016-08-24 16:45:14 -04:00
|
|
|
|
the ``.tar``, ``.tar.bz2``, ``.tar.xz``, ``.tar.Z``, ``.tgz``, ``.tbz``, and
|
|
|
|
|
any other extension besides ``.tar.gz`` and ``.zip``.
|
2016-08-23 12:40:01 -04:00
|
|
|
|
|
2016-08-25 07:57:33 -04:00
|
|
|
|
Finally, this PEP also proposes limiting the number of allowed sdist uploads
|
|
|
|
|
for each individual release of a project on PyPI to one instead of one for each
|
|
|
|
|
allowed extension.
|
|
|
|
|
|
2016-08-23 12:40:01 -04:00
|
|
|
|
|
|
|
|
|
Rationale
|
|
|
|
|
=========
|
|
|
|
|
|
|
|
|
|
File Formats
|
|
|
|
|
------------
|
|
|
|
|
|
|
|
|
|
Currently PyPI supports the following file types:
|
|
|
|
|
|
|
|
|
|
* ``sdist``
|
|
|
|
|
* ``bdist_wheel``
|
|
|
|
|
* ``bdist_egg``
|
|
|
|
|
* ``bdist_wininst``
|
|
|
|
|
* ``bdist_msi``
|
|
|
|
|
* ``bdist_dmg``
|
|
|
|
|
* ``bdist_rpm``
|
|
|
|
|
* ``bdist_dumb``
|
|
|
|
|
|
|
|
|
|
However, these different types of files have varying amounts of usefulness or
|
|
|
|
|
general use in the ecosystem. Continuing to support them adds a maintenance
|
|
|
|
|
burden on PyPI as well as tool authors and incurs a cost in both bandwidth and
|
|
|
|
|
disk space not only on PyPI itself, but also on any mirrors of PyPI.
|
|
|
|
|
|
2016-08-24 16:45:14 -04:00
|
|
|
|
|
|
|
|
|
Python packaging is a multi-level ecosystem where PyPI is primarily suited and
|
|
|
|
|
used to distribute virtual environment compatible packages directly from their
|
|
|
|
|
respective project owners. These packages are then consumed either directly
|
|
|
|
|
by end-users, or by downstream distributors that take these packages and turn
|
|
|
|
|
them into their respective system level packages (such as RPM, deb, MSI, etc).
|
|
|
|
|
|
|
|
|
|
While PyPI itself only directly works with these Python specific but platform
|
|
|
|
|
agnostic packages, we encourage community-driven and commercial conversions of
|
|
|
|
|
these packages to downstream formats for particular target environments, like:
|
|
|
|
|
|
|
|
|
|
* The conda cross-platform data analysis ecosystem (conda-forge)
|
|
|
|
|
* The deb based Linux ecosystem (Debian, Ubuntu, etc)
|
|
|
|
|
* The RPM based Linux ecosystem (Fedora, openSuSE, Mageia, etc)
|
|
|
|
|
* The homebrew, MacPorts and fink ecosystems for Mac OS X
|
2018-09-21 17:28:48 -04:00
|
|
|
|
* The Windows Package Management ecosystem (NuGet, Chocolatey, etc)
|
2016-08-24 16:45:14 -04:00
|
|
|
|
* 3rd party creation of Windows MSIs and installers (e.g. Christoph Gohlke's
|
|
|
|
|
work at http://www.lfd.uci.edu/~gohlke/pythonlibs/ )
|
|
|
|
|
* other commercial redistribution formats (ActiveState's PyPM, Enthought
|
|
|
|
|
Canopy, etc)
|
|
|
|
|
* other open source community redistribution formats (Nix, Gentoo, Arch, \*BSD,
|
|
|
|
|
etc)
|
|
|
|
|
|
|
|
|
|
It is the belief of this PEP that the entire ecosystem is best supported by
|
|
|
|
|
keeping PyPI focused on the platform agnostic formats, where the limited amount
|
|
|
|
|
of time by volunteers can be best used instead of spreading the available time
|
|
|
|
|
out amongst several platforms. Further more, this PEP believes that the people
|
|
|
|
|
best positioned to provide well integrated packages for a particular platform
|
|
|
|
|
are people focused on that platform, and not across all possible platforms.
|
|
|
|
|
|
|
|
|
|
|
2016-08-23 12:40:01 -04:00
|
|
|
|
bdist_dumb
|
|
|
|
|
~~~~~~~~~~
|
|
|
|
|
|
|
|
|
|
As it's name implies, ``bdist_dumb`` is not a very complex format, however it
|
|
|
|
|
is so simple as to be worthless for actual usage.
|
|
|
|
|
|
|
|
|
|
For instance, if you're using something like pyenv on macOS and you're building
|
|
|
|
|
a library using Python 3.5, then ``bdist_dumb`` will produce a ``.tar.gz`` file
|
|
|
|
|
named something like ``exampleproject-1.0.macosx-10.11-x86_64.tar.gz``. Right
|
|
|
|
|
off the bat this file name is somewhat difficult to differentiate from an
|
|
|
|
|
``sdist`` since they both use the same file extension (and with the legacy pre
|
|
|
|
|
PEP 440 versions, ``1.0-macosx-10.11-x86_64`` is a valid, although quite silly,
|
|
|
|
|
version number). However, once you open up the created ``.tar.gz``, you'd find
|
|
|
|
|
that there is no metadata inside that could be used for things like dependency
|
|
|
|
|
discovery and in fact, it is quite simply a tarball containing hardcoded paths
|
|
|
|
|
to wherever files would have been installed on the computer creating the
|
|
|
|
|
``bdist_dumb``. Going back to our pyenv on macOS example, this means that if I
|
|
|
|
|
created it, it would contain files like:
|
|
|
|
|
|
|
|
|
|
``Users/dstufft/.pyenv/versions/3.5.2/lib/python3.5/site-packages/example.py``
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
bdist_rpm
|
|
|
|
|
~~~~~~~~~
|
|
|
|
|
|
|
|
|
|
The ``bdist_rpm`` format on PyPI allows people to upload ``.rpm`` files for
|
|
|
|
|
end users to manually download by hand and then manually install by hand.
|
|
|
|
|
However, the common usage of ``rpm`` is with a specially designed repository
|
|
|
|
|
that allows automatic installation of dependencies, upgrades, etc which PyPI
|
|
|
|
|
does not provide. Thus, it is a type of file that is barely being used on PyPI
|
|
|
|
|
with only ~460 files of this type having been uploaded to PyPI (out a total of
|
|
|
|
|
662,544).
|
|
|
|
|
|
|
|
|
|
In addition, services like `COPR <https://copr.fedorainfracloud.org/>`_ provide
|
|
|
|
|
a better supported mechanism for publishing and using RPM files than we're ever
|
|
|
|
|
likely to get on PyPI.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
bdist_dmg, bdist_msi, and bdist_wininst
|
|
|
|
|
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
|
|
|
|
|
|
|
|
|
The ``bdist_dmg``, ``bdist_msi``, and ``bdist_winist`` formats are similar in
|
|
|
|
|
that they are an OS specific installer that will only install a library into an
|
|
|
|
|
environment and are not designed for real user facing installs of applications
|
|
|
|
|
(which would require things like bundling a Python interpreter and the like).
|
|
|
|
|
|
|
|
|
|
Out of these three, the usage for ``bdist_dmg`` and ``bdist_msi`` is very low,
|
|
|
|
|
with only ~500 ``bdist_msi`` files and ~50 ``bdist_dmg`` files having been
|
|
|
|
|
uploaded to PyPI. The ``bdist_wininst`` format has more use, with ~14,000 files
|
|
|
|
|
having ever been uploaded to PyPI.
|
|
|
|
|
|
|
|
|
|
It's quite easy to look at the low usage of ``bdist_dmg`` and ``bdist_msi`` and
|
|
|
|
|
conclude that removing them will be fairly low impact, however
|
|
|
|
|
``bdist_wininst`` has several orders of magnitude more usage. This is somewhat
|
|
|
|
|
misleading though, because although it has more people *uploading* those files
|
|
|
|
|
the actual usage of those uploaded files is fairly low. Taking a look at the
|
|
|
|
|
previous 30 days, we can see that 90% of all downloads of ``bdist_winist``
|
|
|
|
|
files from PyPI were generated by the mirroring infrastructure and 7% of them
|
|
|
|
|
were generated by setuptools (which can currently be better covered by
|
|
|
|
|
``bdist_egg`` files).
|
|
|
|
|
|
|
|
|
|
Given the small number of files uploaded for ``bdist_dmg`` and ``bdist_msi``
|
|
|
|
|
and that ``bdist_wininst`` is largely existing to either consume bandwidth and
|
|
|
|
|
disk space via the mirroring infrastructure *or* could be trivially replaced
|
|
|
|
|
with ``bdist_egg``, this PEP proposes to include these three formats in the
|
|
|
|
|
list of those to be disallowed.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
File Extensions
|
|
|
|
|
---------------
|
|
|
|
|
|
2017-01-17 12:35:47 -05:00
|
|
|
|
Currently ``sdist`` supports a wide variety of file extensions like ``.tar.gz``,
|
2016-08-23 12:40:01 -04:00
|
|
|
|
``.tar``, ``.tar.bz2``, ``.tar.xz``, ``.zip``, ``.tar.Z``, ``.tgz``, and
|
|
|
|
|
``.tbz``. However, of those the only extensions which get anything more than
|
2019-06-25 00:58:50 -04:00
|
|
|
|
negligible usage is ``.tar.gz`` with 444,338 sdists currently, ``.zip`` with
|
2016-08-23 12:40:01 -04:00
|
|
|
|
58,774 sdists currently, and ``.tar.bz2`` with 3,265 sdists currently.
|
|
|
|
|
|
|
|
|
|
Having multiple formats accepted requires tooling both within PyPI and outside
|
|
|
|
|
of PyPI to handle all of the various extensions that *might* be used (even if
|
|
|
|
|
nobody is currently using them). This doesn't only affect PyPI, but ripples out
|
|
|
|
|
throughout the ecosystem. In addition, the different formats all have different
|
|
|
|
|
requirements for what optional C libraries Python was linked against and
|
|
|
|
|
different requirements for what versions of Python they support. In addition,
|
|
|
|
|
multiple formats also create a weird situation where there may be two
|
|
|
|
|
``sdist`` files for a particular project/release with subtly different content.
|
|
|
|
|
|
|
|
|
|
It's easy to advocate that anything outside of ``.tar.gz``, ``.zip``, and
|
|
|
|
|
``.tar.bz2`` should be disallowed. Outside of a tiny handful, nobody has
|
|
|
|
|
actively been uploading these other types of files in the ~15 years of PyPI's
|
|
|
|
|
existence so they've obviously not been particularly useful. In addition, while
|
|
|
|
|
``.tar.xz`` is theoretically a nicer format than the other ``.tar.*`` formats
|
|
|
|
|
due to the better compression ratio achieved by LZMA, it is only available in
|
|
|
|
|
Python 3.3+ and has an optional dependency on the lzma C library.
|
|
|
|
|
|
|
|
|
|
Looking at the three extensions we *do* have in current use, it's also fairly
|
|
|
|
|
easy to conclude that ``.tar.bz2`` can be disallowed as well. It has a fairly
|
|
|
|
|
small number of files ever uploaded with it and it requires an additional
|
|
|
|
|
optional C library to handle the bzip2 compression.
|
|
|
|
|
|
|
|
|
|
Finally we get down to ``.tar.gz`` and ``.zip``. Looking at the pure numbers
|
|
|
|
|
for these two, we can see that ``.tar.gz`` is by far the most uploaded format,
|
|
|
|
|
with 444,338 total uploaded compared to ``.zip``'s 58,774 and on POSIX
|
|
|
|
|
operating systems ``.tar.gz`` is also the default produced by all currently
|
|
|
|
|
released versions of Python and setuptools. In addition, these two file types
|
|
|
|
|
both use the same C library (``zlib``) which is also required for
|
|
|
|
|
``bdist_wheel`` and ``bdist_egg``. The two wrinkles with deciding between
|
|
|
|
|
``.tar.gz`` and ``.zip`` is that while on POSIX operating systems ``.tar.gz``
|
|
|
|
|
is the default, on Windows ``.zip`` is the default and the ``bdist_wheel``
|
|
|
|
|
format also uses zip.
|
|
|
|
|
|
2016-08-24 16:45:14 -04:00
|
|
|
|
Instead of trying to standardize on either ``.tar.gz`` or ``.zip``, this PEP
|
|
|
|
|
proposes that we allow *either* ``.tar.gz`` or ``.zip`` for sdists.
|
2016-08-23 12:40:01 -04:00
|
|
|
|
|
2016-08-25 07:57:33 -04:00
|
|
|
|
|
|
|
|
|
Limiting number of sdists per release
|
|
|
|
|
-------------------------------------
|
|
|
|
|
|
|
|
|
|
A sdist on PyPI should be a single source of truth for a particular release of
|
|
|
|
|
software. However, currently PyPI allows you to upload one sdist for each of
|
|
|
|
|
the sdist file extensions it allows. Currently this allows something like 10
|
|
|
|
|
different sdists for a project, but even with this PEP it allows two different
|
|
|
|
|
sources of truth for a single version. Having multiple sdists often times can
|
|
|
|
|
account for strange bugs that only expose themselves based on which sdist that
|
|
|
|
|
the person used.
|
|
|
|
|
|
|
|
|
|
To resolve this, this PEP proposes to allow one, and only one, sdist per
|
|
|
|
|
release of a project.
|
|
|
|
|
|
|
|
|
|
|
2016-08-23 12:40:01 -04:00
|
|
|
|
Removal Process
|
|
|
|
|
===============
|
|
|
|
|
|
|
|
|
|
This PEP does **NOT** propose removing any existing files from PyPI, only
|
|
|
|
|
disallowing new ones from being uploaded. This restriction will be phased in on
|
|
|
|
|
a per-project basis to allow projects to adjust to the new restrictions where
|
|
|
|
|
applicable.
|
|
|
|
|
|
|
|
|
|
First, any *existing* projects will be flagged to allow legacy file types to be
|
|
|
|
|
uploaded, and any project without that flag (i.e. new projects) will not be
|
2016-08-24 16:45:14 -04:00
|
|
|
|
able to upload anything but ``sdist`` with a ``.tar.gz`` or ``.zip`` extension,
|
2016-08-23 12:40:01 -04:00
|
|
|
|
``bdist_wheel``, and ``bdist_egg``. Then, any existing projects that have never
|
|
|
|
|
uploaded a file that requires the legacy file type flag will have that flag
|
|
|
|
|
removed, also making them fall under the new restrictions. Finally, an email
|
|
|
|
|
will be generated to the maintainers of all projects still given the legacy
|
|
|
|
|
flag, which will inform them of the upcoming new restrictions on uploads and
|
|
|
|
|
tell them that these restrictions will be applied to future uploads to their
|
2016-08-24 16:45:14 -04:00
|
|
|
|
projects starting in 1 month. Finally, after 1 month all projects will have the
|
|
|
|
|
legacy file type flag removed, and support for uploading these types of files
|
|
|
|
|
will cease to exist on PyPI.
|
2016-08-23 12:40:01 -04:00
|
|
|
|
|
|
|
|
|
This plan should provide minimal disruption since it does not remove any
|
|
|
|
|
existing files, and the types of files it does prevent from being uploaded are
|
|
|
|
|
either not particularly useful (or used) types of files *or* they can continue
|
|
|
|
|
to upload a similar type of file with a slight change to their process.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Copyright
|
|
|
|
|
=========
|
|
|
|
|
|
|
|
|
|
This document has been placed in the public domain.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
..
|
|
|
|
|
Local Variables:
|
|
|
|
|
mode: indented-text
|
|
|
|
|
indent-tabs-mode: nil
|
|
|
|
|
sentence-end-double-space: t
|
|
|
|
|
fill-column: 70
|
|
|
|
|
coding: utf-8
|
|
|
|
|
End:
|