python-peps/peps/pep-0763.rst

402 lines
16 KiB
ReStructuredText
Raw Blame History

This file contains ambiguous Unicode characters

This file contains Unicode characters that might be confused with other characters. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.

PEP: 763
Title: Limiting deletions on PyPI
Author: William Woodruff <william@yossarian.net>,
Alexis Challande <alexis.challande@trailofbits.com>
Sponsor: Donald Stufft <donald@stufft.io>
PEP-Delegate: Donald Stufft <donald@stufft.io>
Discussions-To: https://discuss.python.org/t/69487
Status: Draft
Type: Standards Track
Topic: Packaging
Created: 24-Oct-2024
Post-History: `09-Jul-2022 <https://discuss.python.org/t/17227>`__,
`01-Oct-2024 <https://discuss.python.org/t/66351>`__,
`28-Oct-2024 <https://discuss.python.org/t/69487>`__
Abstract
========
We propose limiting when users can delete files, releases, and projects
from PyPI. A project, release, or file may only be deleted within 72 hours
of when it is uploaded to the index. From this point, users may only use
the "yank" mechanism specified by :pep:`592`.
An exception to this restriction is made for releases and files that are
marked with :ref:`pre-release specifiers <packaging:version-specifiers>`,
which will remain deletable at any time.
The PyPI administrators will retain the ability to delete files, releases,
and projects at any time, for example for moderation or security purposes.
Rationale and Motivation
========================
As observed in :pep:`592`, user-level deletion of projects on PyPI
enables a catch-22 situation of dependency breakage:
Whenever a project detects that a particular release on PyPI might be
broken, they oftentimes will want to prevent further users from
inadvertently using that version. However, the obvious solution of
deleting the existing file from a repository will break users who have
pinned to a specific version of the project.
This leaves projects in a catch-22 situation where new projects may be pulling
down this known broken version, but if they do anything to prevent that theyll
break projects that are already using it.
On a technical level, the problem of deletion is mitigated by
"yanking," also specified in :pep:`592`. However, deletions continue to be
allowed on PyPI, and have caused multiple notable disruptions to the Python
ecosystem over the interceding years:
* July 2022: `atomicwrites <https://pypi.org/project/atomicwrites/>`_
was `deleted by its maintainer <https://github.com/untitaker/python-atomicwrites/issues/61>`_
in an attempt to remove the project's "critical" designation, without the
maintainer realizing that project deletion would also delete all previously
uploaded releases.
The project was subsequently restored with the maintainer's consent,
but at the cost of manual administrator action and extensive downstream
breakage to projects like `pytest <https://github.com/pytest-dev/pytest/issues/10114>`_.
As of October 2024, atomicwrites is archived but still has
around `4.5 million monthly downloads from PyPI <https://pypistats.org/packages/atomicwrites>`_.
* April 2023: `codecov <https://pypi.org/project/codecov/>`_ was deleted by
its maintainers after a long deprecation period. This caused extensive
breakage for many of Codecov's CI/CD users, who were unaware of the
deprecation period due to limited observability of deprecation warnings
within CI/CD logs.
The project was
`subsequently re-created <https://about.codecov.io/blog/message-regarding-the-pypi-package/>`_
by its maintainers, with a new release published to compensate for the deleted releases
(which were not restored), meaning that any pinned installations remained
broken. As of October 2024, this single release remains the only release on
PyPI and has around
`1.5 million monthly downloads <https://pypistats.org/packages/codecov>`_.
* June 2023: `python-sonarqube-api <https://pypi.org/project/python-sonarqube-api/>`_
deleted all released releases prior to 2.0.2.
The project's maintainer subsequently
`deleted conversations <https://discuss.python.org/t/stop-allowing-deleting-things-from-pypi/17227/114>`_
and force-pushed over the tag history for ``python-sonarqube-api``'s source
repository, impeding efforts by users to compare changes between
releases.
* June 2024: `PySimpleGUI <https://pypi.org/project/PySimpleGUI/>`_ changed
licenses and deleted
`nearly all previous releases <https://discuss.python.org/t/48790/27>`_.
This resulted in widespread disruption for users, who (prior to the
relicensing) were downloading PySimpleGUI approximately 25,000 times a day.
In addition to their disruptive effect on downstreams, deletions
also have detrimental effects on PyPI's sustainability as well as the overall
security of the ecosystem:
* Deletions increase support workload for PyPI's administrators and
moderators, as users mistakenly file support requests believing that PyPI
is broken, or that the administrators themselves have removed the
project.
* Deletions impair external (meaning end-user) incident response and analysis,
making it difficult to distinguish "good faith" maintainer behavior from
a malicious actor attempting the cover their tracks.
The Python ecosystem is continuing to grow,
meaning that future deletions of projects can be reasonably assumed to
be *just, as if not more,* disruptive than the deletions sampled above.
Given all of the above, this PEP concludes that deletions now present a greater
risk and detriment to the Python ecosystem than a benefit.
In addition to these technical arguments, there is also precedent
from other packaging ecosystems for limiting the ability of users to delete
projects and their constituent releases. This precedent is documented in
:ref:`Appendix A <pep763-appendix-a>`.
Specification
=============
There are three different types of deletable objects:
1. **Files**, which are individual project distributions (such as source
distributions or wheels).
Example: ``requests-2.32.3-py3-none-any.whl``.
2. **Releases**, which contain one or more files that share the same version
number.
Example: `requests v2.32.3 <https://pypi.org/project/requests/2.32.3/>`_.
3. **Projects**, which contain one or more releases.
Example: `requests <https://pypi.org/project/requests>`_.
Deletion eligibility rules
--------------------------
This PEP proposes the following *deletion eligibility rules*:
* A **file** is deletable if and only if it was uploaded to
PyPI less than 72 hours from the current time, **or** if it
has a :ref:`pre-release specifier <packaging:version-specifiers>`.
* A **release** is deletable if and only if all of its
contained files are deletable.
* A **project** is deletable if and only if all of its releases are deletable.
These rules allow new projects to be
deleted entirely, and allow old projects to delete new files or releases,
but do not allow old projects to delete old files or releases.
Implementation
==============
This PEP's implementation primarily concerns aspects of PyPI that are not
standardized or subject to standardization, such as the web interface and
signed-in user operations. As a result, this section describes its
implementation in behavioral terms.
Changes
-------
* Per the eligibility rules above, PyPI will reject web interface requests
(using an appropriate HTTP response code of its choosing) for
file, release, or project deletion if the respective object is not
eligible for deletion.
* PyPI will amend its web interface to indicate a file/release/project's
deletion ineligibility, e.g. by styling the relevant UI elements as "inactive"
and making relevant bottoms/forms unclickable.
Security Implications
=====================
This PEP does not identify negative security implications associated with the
proposed approach.
This PEP identifies one minor positive security implication: by restricting
user-controlled deletions, this PEP makes it more difficult for a malicious
actor to cover their tracks by deleting malware from the index. This is
particularly useful for external (i.e. non-PyPI administrator) triage and
incident response, where the defending party needs easy access to malware
samples to develop indicators of compromise.
How To Teach This
=================
This PEP suggests at least two pieces of public-facing material to help
the larger Python packaging community (and its downstream consumers)
understand its changes:
* An announcement post on the `PyPI blog <https://blog.pypi.org>`_ explaining
the nature of the PEP, its motivations, and its behavioral implications for
PyPI.
* An announcement banner on PyPI itself, linking to the above.
* Updates to the `PyPI user documentation <https://docs.pypi.org/>`_ explaining
the difference between deletion and yanking and the limited conditions under
which the former can still be initiated by package owners.
Rejected Ideas
==============
Conditioning deletion on dependency relationships
-------------------------------------------------
An alternative to time-based deletion windows is deletion eligibility based on
downstream dependents. For example, a release could be considered deletable
if and only if it has fewer than ``N`` downstream dependents on PyPI,
where ``N`` could be as low as 1.
This idea is appealing since it directly links deletion eligibility to
disruptiveness. `npm <https://www.npmjs.com/>`_ uses it and
conditions project removal on the absence of any downstream dependencies
known to the index.
Despite its appeal, this PEP identifies several disadvantages and technical
limitations that make dependency-conditioned deletion not appropriate
for PyPI:
1. *PyPI is not aware of dependency relationships.* In Python packaging,
both project builds *and* metadata generation are frequently dynamic
operations, involving arbitrary project-specified code. This is typified
by source distributions containing ``setup.py`` scripts, where the execution
of ``setup.py`` is responsible for computing the set of dependencies
encoded in the project's metadata.
This is in marked contrast to ecosystems like npm and Rust's
`crates <https://crates.io/>`_, where project *builds* can be dynamic but
the project's metadata itself is static.
As a result of this, `PyPI doesn't know your project's dependencies
<https://dustingram.com/articles/2018/03/05/why-pypi-doesnt-know-dependencies/>`_,
and is architecturally incapable of knowing them without either running
arbitrary code (a significant security risk) or performing a long-tail
deprecation of ``setup.py``-based builds in favor of :pep:`517` and
:pep:`621`-style static metadata.
2. *Results in an unintuitive permissions model.* Dependency-conditioned
deletion results in a "reversed" power relationship, where anybody
who introduces a dependency on a project can prevent that project from
being deleted.
This is reasonable on face value, but can be abused to produce unexpected
and undesirable (in the context of enabling some deletions) outcomes.
A notable example of this is npm's
`everything package <https://www.npmjs.com/package/everything>`_, which
depends on every public package on npm (as of 30 Dec 2023) and thereby
prevents their deletion.
Conditioning deletion on download count
---------------------------------------
Another alternative to time-based deletion windows is to delete based on the
number of downloads. For example, a release could be considered deletable if
and only if it has fewer than ``N`` downloads during the last period.
While presenting advantages by tying a project deletion possibility to its
usage, this PEP identifies several limitations to this approach:
1. *Ecosystem diversity.* The Python ecosystem includes projects with widely
varying usage patterns. A fixed download threshold would not adequately account
for niche but critical projects with naturally low download counts.
2. *Time sensitivity.* Download counts do not necessarily reflect a project's
current status or importance. A previously popular project might have low
recent downloads but still be crucial for maintaining older systems.
3. *Technical complexity.* Accessing the download count of a project within
PyPI is not straightforward, and there is limited possibility to gather a
project's download statistics from mirrors or other distributions systems.
.. _pep763-appendix-a:
Appendix A: Precedent in other ecosystems
=========================================
The following is a table of support for deletion in different packaging
ecosystems. An ecosystem is considered to **not** support deletion
if it restrict's a user's ability to perform deletions in a manner similar
to this PEP.
An earlier version of this table, showing only deletion, was
compiled by Donald Stufft and others on the Python discussion forum in
`July 2022 <https://discuss.python.org/t/17227/59>`__.
.. list-table::
:header-rows: 1
* - Ecosystem (Index)
- Deletion
- Yanking
- Notes
* - Python (PyPI)
- ✅ [#f1]_
- ✅ [#f2]_
- Deletion currently completely unrestricted.
* - Rust (crates.io)
- ❌
- ✅ [#f3]_
- Deletion by users not allowed at all.
* - JavaScript (npm)
- ❌ [#f4]_
- ✅ [#f5]_
- Deletion is limited by criteria similar to this PEP.
* - Ruby (RubyGems)
- ✅ [#f6]_
- ❌
- RubyGems calls deletion "yanking." Yanking in PyPI's terms is not supported at all.
* - Java (Maven Central)
- ❌ [#f7]_
- ❌
- Deletion by users not allowed at all.
* - PHP (Packagist)
- ❌ [#f8]_
- ❌
- Deletion restricted after an undocumented number of installs.
* - .NET (NuGet)
- ❌ [#f9]_
- ✅ [#f10]_
- NuGet calls yanking "unlisting."
* - Elixir (Hex)
- ❌ [#f11]_
- ✅ [#f11]_
- Hex calls yanking "retiring."
* - R (CRAN)
- ❌ [#f12]_
- ✅ [#f12]_
- Deletion is limited to within 24 hours of initial release or
60 minutes for subsequent versions. CRAN calls yanking "archiving."
* - Perl (CPAN)
- ✅
- ❌
- Yanking is not supported at all. Deletion seemingly encouraged,
at least as of 2021 [#f13]_.
* - Lua (LuaRocks)
- ✅ [#f14]_
- ✅ [#f14]_
- LuaRocks calls yanking "archiving."
* - Haskell (Hackage)
- ❌ [#f15]_
- ✅ [#f16]_
- Hackage calls yanking "deprecating."
* - OCaml (OPAM)
- ❌ [#f17]_
- ✅ [#f17]_
- Deletion is allowed if it occurs "reasonably soon" after inclusion.
Yanking is *de facto* supported by the ``available: false`` marker, which
effectively disables resolution.
The following trends are present:
* A strong majority of indices **do not** support deletion (9 vs. 4)
* A strong majority of indices **do** support yanking (9 vs. 4)
* An overwhelming majority of indices support one or the other or neither,
but **not** both (11 vs. 2)
* PyPI and LuaRocks are notable outliers in supporting **both** deletion and
yanking.
Footnotes
=========
.. [#f1] https://pypi.org/help/#deletion
.. [#f2] https://pypi.org/help/#yanked
.. [#f3] https://doc.rust-lang.org/cargo/commands/cargo-yank.html
.. [#f4] https://docs.npmjs.com/unpublishing-packages-from-the-registry
.. [#f5] https://docs.npmjs.com/deprecating-and-undeprecating-packages-or-package-versions
.. [#f6] https://guides.rubygems.org/removing-a-published-gem/
.. [#f7] https://central.sonatype.org/faq/can-i-change-a-component/
.. [#f8] https://github.com/composer/packagist/issues/875
.. [#f9] https://learn.microsoft.com/en-us/nuget/nuget-org/policies/deleting-packages
.. [#f10] https://learn.microsoft.com/en-us/nuget/nuget-org/policies/deleting-packages#unlisting-a-package
.. [#f11] https://hex.pm/docs/faq#can-packages-be-removed-from-the-repository
.. [#f12] https://cran.r-project.org/web/packages/policies.html
.. [#f13] https://neilb.org/2021/05/10/delete-your-old-releases.html
.. [#f14] https://luarocks.org/changes
.. [#f15] https://hackage.haskell.org/upload
.. [#f16] https://hackage.haskell.org/packages/deprecated
.. [#f17] https://github.com/ocaml/opam-repository/wiki/Policies#1-removal-of-packages-should-be-avoided
Copyright
=========
This document is placed in the public domain or under the CC0-1.0-Universal
license, whichever is more permissive.