PEP 752: Package repository namespaces (#3903)
Co-authored-by: Barry Warsaw <barry@python.org>
parent 5d475023e5
commit 9247c9872c
|
@@ -631,6 +631,7 @@ peps/pep-0749.rst @JelleZijlstra
|
||||||
# ...
|
# ...
|
||||||
peps/pep-0750.rst @gvanrossum @lysnikolaou
|
peps/pep-0750.rst @gvanrossum @lysnikolaou
|
||||||
peps/pep-0751.rst @brettcannon
|
peps/pep-0751.rst @brettcannon
|
||||||
|
peps/pep-0752.rst @warsaw
|
||||||
# ...
|
# ...
|
||||||
# peps/pep-0754.rst
|
# peps/pep-0754.rst
|
||||||
# ...
|
# ...
|
||||||
|
|
|
@@ -0,0 +1,429 @@
|
||||||
|
PEP: 752
|
||||||
|
Title: Package repository namespaces
|
||||||
|
Author: Ofek Lev <ofekmeister@gmail.com>
|
||||||
|
Sponsor: Barry Warsaw <barry@python.org>
|
||||||
|
PEP-Delegate: Donald Stufft <donald@stufft.io>
|
||||||
|
Status: Draft
|
||||||
|
Type: Standards Track
|
||||||
|
Topic: Packaging
|
||||||
|
Created: 13-Aug-2024
|
||||||
|
|
||||||
|
Abstract
|
||||||
|
========
|
||||||
|
|
||||||
|
This PEP specifies a way for organizations to reserve package name prefixes
|
||||||
|
for future uploads.
|
||||||
|
|
||||||
|
"Namespaces are one honking great idea -- let's do more of
|
||||||
|
those!" - :pep:`20`
|
||||||
|
|
||||||
|
Motivation
|
||||||
|
==========
|
||||||
|
|
||||||
|
The current ecosystem lacks a way for projects with many packages to signal a
|
||||||
|
verified pattern of ownership. Some examples:
|
||||||
|
|
||||||
|
* `Typeshed <https://github.com/python/typeshed>`__ is a community effort to
|
||||||
|
maintain type stubs for various packages. The stub packages they maintain
|
||||||
|
mirror the package name they target and are prefixed by ``types-``. For
|
||||||
|
example, the package ``requests`` has a stub that users would depend on
|
||||||
|
called ``types-requests``.
|
||||||
|
* Major cloud providers like Amazon, Google and Microsoft have a common prefix
|
||||||
|
for each feature's corresponding package [1]_. For example, most of Google's
|
||||||
|
packages are prefixed by ``google-cloud-`` e.g. ``google-cloud-compute`` for
|
||||||
|
`using virtual machines <https://cloud.google.com/products/compute>`__.
|
||||||
|
* Many projects [2]_ support a model where some packages are officially
|
||||||
|
maintained and third-party developers are encouraged to participate by
|
||||||
|
creating their own. For example, `Datadog <https://www.datadoghq.com>`__
|
||||||
|
offers observability as a service for organizations at any scale. The
|
||||||
|
`Datadog Agent <https://docs.datadoghq.com/agent/>`__ ships out-of-the-box
|
||||||
|
with
|
||||||
|
`official integrations <https://github.com/DataDog/integrations-core>`__
|
||||||
|
for many products, like various databases and web servers, which are
|
||||||
|
distributed as Python packages that are prefixed by ``datadog-``. There is
|
||||||
|
support for creating `third-party integrations`__ which customers may run.
|
||||||
|
|
||||||
|
__ https://docs.datadoghq.com/developers/integrations/agent_integration/
|
||||||
|
|
||||||
|
Such projects are uniquely vulnerable to attacks stemming from malicious actors
|
||||||
|
squatting anticipated package names. For example, say a new product is released
|
||||||
|
for which monitoring would be valuable. It would be reasonable to assume that
|
||||||
|
Datadog would eventually support it as an official integration. It takes a
|
||||||
|
nontrivial amount of time to deliver such an integration due to roadmap
|
||||||
|
prioritization and the time required for implementation. It would be impossible
|
||||||
|
to reserve the name of every potential package, so in the interim an attacker
|
||||||
|
may create a legitimate-appearing package which would execute malicious code at
|
||||||
|
runtime. Not only are users more likely to install such packages but doing so
|
||||||
|
taints the perception of the entire project.
|
||||||
|
|
||||||
|
Namespacing would also drastically reduce the incidence of
|
||||||
|
`typosquatting <https://en.wikipedia.org/wiki/Typosquatting>`__
|
||||||
|
because typos would have to be in the prefix itself, which is
|
||||||
|
`normalized <naming_>`_ and likely to be a short, well-known identifier like
|
||||||
|
``aws-``.
|
||||||
|
|
||||||
|
Rationale
|
||||||
|
=========
|
||||||
|
|
||||||
|
Tolerance for Disruption
|
||||||
|
------------------------
|
||||||
|
|
||||||
|
Other package ecosystems have generally solved this problem by taking one of
|
||||||
|
two approaches: either minimizing or maximizing backwards compatibility.
|
||||||
|
|
||||||
|
* `NPM <https://www.npmjs.com>`__ has the concept of
|
||||||
|
`scoped packages <https://docs.npmjs.com/about-scopes>`__ which were
|
||||||
|
`introduced`__ primarily to combat a dearth of good available
|
||||||
|
package names (whether a real or perceived phenomenon). When a user or
|
||||||
|
organization signs up they are given a scope that matches their name. For
|
||||||
|
example, the
|
||||||
|
`package <https://www.npmjs.com/package/@google-cloud/storage>`__ for using
|
||||||
|
Google Cloud Storage is ``@google-cloud/storage`` where ``@google-cloud/`` is
|
||||||
|
the scope. Regular user accounts (non-organization) may publish `unscoped`__
|
||||||
|
packages for public use.
|
||||||
|
This approach has the least backwards compatibility because every
|
||||||
|
installer and tool has to be modified to account for scopes.
|
||||||
|
* `NuGet <https://www.nuget.org>`__ has the concept of
|
||||||
|
`package ID prefix reservation`__ which was
|
||||||
|
`introduced`__ primarily to satisfy users wishing to know where a package
|
||||||
|
came from. A package name prefix may be reserved for use by one or more
|
||||||
|
owners. Every reserved package has a special indication
|
||||||
|
`on its page <https://www.nuget.org/packages/Google.Cloud.Storage.V1>`__ to
|
||||||
|
communicate this. After reservation, any upload with a reserved prefix will
|
||||||
|
fail if the user is not an owner of the prefix. Existing packages that have a
|
||||||
|
prefix that is owned may continue to release as usual. This approach has the
|
||||||
|
most backwards compatibility because only modifications to
|
||||||
|
indices like PyPI are required and installers do not need to change.
|
||||||
|
|
||||||
|
__ https://blog.npmjs.org/post/116936804365/solving-npms-hard-problem-naming-packages
|
||||||
|
__ https://docs.npmjs.com/package-scope-access-level-and-visibility
|
||||||
|
__ https://learn.microsoft.com/en-us/nuget/nuget-org/id-prefix-reservation
|
||||||
|
__ https://devblogs.microsoft.com/nuget/Package-identity-and-trust/
|
||||||
|
|
||||||
|
This PEP specifies the NuGet approach of authorized reservation across a flat
|
||||||
|
namespace for the following reasons:
|
||||||
|
|
||||||
|
* Causing churn for the community is a hard blocker.
|
||||||
|
* The NPM approach has the potential to cause confusion for users if we allow
|
||||||
|
unscoped names. Our community has chosen to normalize separator characters
|
||||||
|
and so ``@aws/s3`` would likely be confused with ``@aws-s3``.
|
||||||
|
|
||||||
|
Approval Process
|
||||||
|
----------------
|
||||||
|
|
||||||
|
PyPI has been understaffed, receiving the first `dedicated specialist`__ in
|
||||||
|
July 2024. Because of limited resources, user support has been lacking for
|
||||||
|
`package name claims <https://discuss.python.org/t/27436/19>`__,
|
||||||
|
`organization requests <https://discuss.python.org/t/33764/15>`__,
|
||||||
|
`storage limit increases <https://discuss.python.org/t/54035>`__,
|
||||||
|
and even `account recovery <https://discuss.python.org/t/43422/122>`__.
|
||||||
|
|
||||||
|
__ https://pyfound.blogspot.com/2024/07/announcing-our-new-pypi-support.html
|
||||||
|
|
||||||
|
The `default policy <grant-approval-criteria_>`_ of only allowing
|
||||||
|
`corporate organizations <corp-orgs_>`_ to reserve namespaces (except in
|
||||||
|
specific scenarios) provides the following benefits:
|
||||||
|
|
||||||
|
* PyPI would have a constant source of funding for support specialists,
|
||||||
|
infrastructure maintenance and new features.
|
||||||
|
* Although each application would require independent review, less human
|
||||||
|
feedback would be required because the process to approve a paid organization
|
||||||
|
already bestows a certain amount of trust.
|
||||||
|
|
||||||
|
Specification
|
||||||
|
=============
|
||||||
|
|
||||||
|
`Organizations <orgs_>`_ (NOT regular users) MAY reserve one or more
|
||||||
|
namespaces. Such reservations neither confer ownership nor grant special
|
||||||
|
privileges to existing packages.
|
||||||
|
|
||||||
|
.. _naming:
|
||||||
|
|
||||||
|
Naming
|
||||||
|
------
|
||||||
|
|
||||||
|
A namespace MUST be a `valid`__ project name and `normalized`__ internally e.g.
|
||||||
|
``foo.bar`` would become ``foo-bar``. The user-facing namespace (e.g. in UI
|
||||||
|
tooltips) MUST preserve the original pre-normalized text as defined during
|
||||||
|
reservation.
|
||||||
|
|
||||||
|
__ https://packaging.python.org/en/latest/specifications/name-normalization/#name-format
|
||||||
|
__ https://packaging.python.org/en/latest/specifications/name-normalization/#name-normalization
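As a minimal sketch of the normalization step, the following uses the regular expression given in the name normalization specification. The ``reserve`` helper and the shape of the record it returns are hypothetical and only illustrate keeping the original text for display alongside the normalized form used internally:

```python
import re


def normalize(name: str) -> str:
    """Collapse runs of '-', '_' and '.' into one hyphen and lowercase."""
    return re.sub(r"[-_.]+", "-", name).lower()


def reserve(display_name: str) -> dict:
    """Hypothetical reservation record: the original text is preserved for
    the UI, the normalized text is used for matching."""
    return {"display": display_name, "normalized": normalize(display_name)}


record = reserve("foo.bar")
print(record["display"], record["normalized"])
```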
|
||||||
|
|
||||||
|
Grant Semantics
|
||||||
|
---------------
|
||||||
|
|
||||||
|
A namespace grant bestows ownership over the following:
|
||||||
|
|
||||||
|
1. A package matching the namespace itself such as the placeholder package
|
||||||
|
`microsoft <https://pypi.org/project/microsoft/>`__.
|
||||||
|
2. Packages that start with the namespace followed by a hyphen. For example,
|
||||||
|
the namespace ``foo`` would match the package ``foo-bar`` but not the
|
||||||
|
package ``foobar``.
|
||||||
|
|
||||||
|
Package name matching acts upon the `normalized <naming_>`_ namespace.
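The two matching rules above can be sketched as follows. This is an illustration rather than a reference implementation: the function name ``matches_namespace`` is hypothetical, and normalizing the package name before comparison is an assumption based on how indices already treat project names.

```python
import re


def normalize(name: str) -> str:
    """Name normalization as defined by the packaging specification."""
    return re.sub(r"[-_.]+", "-", name).lower()


def matches_namespace(package: str, namespace: str) -> bool:
    """Return True if the package falls under the namespace grant."""
    pkg = normalize(package)  # assumption: package names are normalized too
    ns = normalize(namespace)
    # Rule 1: the package is the namespace itself.
    # Rule 2: the package starts with the namespace followed by a hyphen.
    return pkg == ns or pkg.startswith(ns + "-")


print(matches_namespace("foo-bar", "foo"))
print(matches_namespace("foobar", "foo"))
```

Requiring the trailing hyphen is what keeps ``foobar`` outside the ``foo`` namespace.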
|
||||||
|
|
||||||
|
Namespaces are per-repository and MUST NOT be shared between repositories.
|
||||||
|
|
||||||
|
Grant Types
|
||||||
|
-----------
|
||||||
|
|
||||||
|
There are two types of grants.
|
||||||
|
|
||||||
|
.. _root-grant:
|
||||||
|
|
||||||
|
Root Grant
|
||||||
|
''''''''''
|
||||||
|
|
||||||
|
Only `organizations <orgs_>`_ have the ability to submit requests for namespace
|
||||||
|
grants. An organization gets a root grant for every accepted request. This
|
||||||
|
grant may produce any number of `child grants <child-grant_>`_.
|
||||||
|
|
||||||
|
.. _child-grant:
|
||||||
|
|
||||||
|
Child Grant
|
||||||
|
'''''''''''
|
||||||
|
|
||||||
|
A child grant is created by the owner of a `root grant <root-grant_>`_. The
|
||||||
|
child namespace MUST be prefixed by the root grant namespace followed by a
|
||||||
|
hyphen. For example, ``google-cloud`` would be a valid child of the root
|
||||||
|
namespace ``google``.
|
||||||
|
|
||||||
|
Child grants cannot have their own child grants.
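A sketch of how a repository might enforce these two constraints is shown below. The ``Grant`` class and ``create_child`` function are hypothetical names introduced for illustration; the PEP does not prescribe a data model.

```python
import re
from dataclasses import dataclass
from typing import Optional


def normalize(name: str) -> str:
    """Name normalization as defined by the packaging specification."""
    return re.sub(r"[-_.]+", "-", name).lower()


@dataclass(frozen=True)
class Grant:
    namespace: str                    # stored in normalized form
    parent: Optional["Grant"] = None  # None for a root grant


def create_child(parent: Grant, child_namespace: str) -> Grant:
    """Create a child grant, rejecting nesting and invalid prefixes."""
    if parent.parent is not None:
        raise ValueError("child grants cannot have their own child grants")
    child = normalize(child_namespace)
    prefix = parent.namespace + "-"
    # The child must be the root namespace, a hyphen, and something more.
    if not child.startswith(prefix) or child == prefix:
        raise ValueError(f"{child!r} is not prefixed by {prefix!r}")
    return Grant(namespace=child, parent=parent)


root = Grant("google")
child = create_child(root, "google-cloud")
print(child.namespace)
```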
|
||||||
|
|
||||||
|
.. _grant-ownership:
|
||||||
|
|
||||||
|
Grant Ownership
|
||||||
|
---------------
|
||||||
|
|
||||||
|
The owner of a grant may allow any number of other organizations to use the
|
||||||
|
grant. For each permitted organization, the grant behaves as if that organization owned it. The owner
|
||||||
|
may revoke this permission at any time.
|
||||||
|
|
||||||
|
The owner may transfer ownership to another organization. If the organization
|
||||||
|
is a corporate organization, the transfer target must also be one. Settings for
|
||||||
|
permitted organizations are transferred as well.
|
||||||
|
|
||||||
|
.. _uploads:
|
||||||
|
|
||||||
|
Uploads
|
||||||
|
-------
|
||||||
|
|
||||||
|
If the following criteria are all true for a given upload:
|
||||||
|
|
||||||
|
1. The package does not yet exist.
|
||||||
|
2. The name matches a reserved namespace.
|
||||||
|
3. The user is not authorized to use the namespace by the owner of the
|
||||||
|
namespace.
|
||||||
|
|
||||||
|
Then the upload MUST fail with a 403 HTTP status code.
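The rule above can be expressed as a single predicate. In this sketch the function name and its boolean inputs are hypothetical, and the ``200`` returned for accepted uploads is an assumption, since this section only specifies the failure status:

```python
def upload_status(
    package_exists: bool,
    matches_reserved_namespace: bool,
    user_authorized: bool,
) -> int:
    """Return the HTTP status a repository would answer an upload with."""
    # All three criteria must hold for the upload to be rejected.
    if (
        not package_exists
        and matches_reserved_namespace
        and not user_authorized
    ):
        return 403
    return 200  # assumption: any non-rejected upload proceeds normally


# A new package under a reserved namespace from an unauthorized user.
print(upload_status(False, True, False))
# An existing package keeps releasing even if its prefix is now reserved.
print(upload_status(True, True, False))
```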
|
||||||
|
|
||||||
|
.. _user-interface:
|
||||||
|
|
||||||
|
User Interface
|
||||||
|
--------------
|
||||||
|
|
||||||
|
Every page for a particular release
|
||||||
|
(`example <https://pypi.org/project/google-cloud-compute/1.19.2/>`__)
|
||||||
|
that both matches an active namespace grant and is tied to an
|
||||||
|
`owner <grant-ownership_>`_
|
||||||
|
MUST receive a special indicator that signifies this tie.
|
||||||
|
|
||||||
|
The UI also MUST indicate what the prefix is (NuGet does not do this) and this
|
||||||
|
value MUST match the ``namespace`` key in the `API <repository-metadata_>`_.
|
||||||
|
|
||||||
|
Repositories SHOULD have a dedicated page that enumerates every active
|
||||||
|
namespace grant and which organization(s) own it.
|
||||||
|
|
||||||
|
.. _public-namespaces:
|
||||||
|
|
||||||
|
Public Namespaces
|
||||||
|
-----------------
|
||||||
|
|
||||||
|
The owner of a grant may choose to allow others the ability to release new
|
||||||
|
packages with the associated namespace. Doing so MUST allow
|
||||||
|
`uploads <uploads_>`_ for new packages matching the namespace from any user
|
||||||
|
but such releases MUST NOT have the `visual indicator <user-interface_>`_.
|
||||||
|
|
||||||
|
It is possible for the `owner <grant-ownership_>`_ of a namespace to both make
|
||||||
|
it public and allow other organizations to use it. In this case, the permitted
|
||||||
|
organizations have no special permissions and are essentially only public.
|
||||||
|
|
||||||
|
Root grants given to `community projects <grant-approval-criteria_>`_ SHALL
|
||||||
|
always be public.
|
||||||
|
|
||||||
|
.. _repository-metadata:
|
||||||
|
|
||||||
|
Repository Metadata
|
||||||
|
-------------------
|
||||||
|
|
||||||
|
To allow installers and other tooling insight into this metadata for a given
|
||||||
|
artifact upload of a namespaced package, the :pep:`JSON API <691>` MUST include
|
||||||
|
the following keys:
|
||||||
|
|
||||||
|
* ``namespace``: This is the associated `normalized <naming_>`_
|
||||||
|
namespace e.g. ``foo-bar``. If the namespace matches a child grant and the
|
||||||
|
user happens to be authorized for both the child and the root grant, this
|
||||||
|
MUST be the namespace associated with the child grant.
|
||||||
|
* ``owner``: This is the organization with which the user is associated and
|
||||||
|
owner of the grant. If the namespace is `public <public-namespaces_>`_ and
|
||||||
|
the user is not part of a `permitted <grant-ownership_>`_ organization, this
|
||||||
|
key MUST be set to ``__public__``. This is useful for tools that wish to make
|
||||||
|
a distinction between official and community packages.
|
||||||
|
|
||||||
|
The `Simple API`__ MAY include the aforementioned keys as attributes, for
|
||||||
|
example:
|
||||||
|
|
||||||
|
__ https://packaging.python.org/en/latest/specifications/simple-repository-api/#base-html-api
|
||||||
|
|
||||||
|
.. code-block:: html
|
||||||
|
|
||||||
|
<a href="..." namespace="foo-bar" owner="org1">...</a>
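The selection rules for the two keys can be sketched as below. Everything here is illustrative: ``namespace_metadata`` and its parameters are hypothetical, package and namespace names are assumed to be already normalized, and where exactly the keys live inside the JSON response is not shown.

```python
from typing import Dict, FrozenSet, Optional

PUBLIC_OWNER = "__public__"


def namespace_metadata(
    package: str,
    authorized: Dict[str, str],
    public: FrozenSet[str] = frozenset(),
) -> Optional[Dict[str, str]]:
    """Pick the ``namespace`` and ``owner`` values for an upload.

    ``authorized`` maps each namespace the uploader may use to the owning
    organization; ``public`` holds namespaces open to any user.
    """

    def matches(ns: str) -> bool:
        return package == ns or package.startswith(ns + "-")

    hits = [ns for ns in authorized if matches(ns)]
    if hits:
        # A child namespace is always longer than its root, so the longest
        # match prefers the child grant over the root grant.
        ns = max(hits, key=len)
        return {"namespace": ns, "owner": authorized[ns]}
    public_hits = [ns for ns in public if matches(ns)]
    if public_hits:
        ns = max(public_hits, key=len)
        return {"namespace": ns, "owner": PUBLIC_OWNER}
    return None  # not a namespaced package


grants = {"google": "org1", "google-cloud": "org1"}
print(namespace_metadata("google-cloud-compute", grants))
```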
|
||||||
|
|
||||||
|
Grant Removal
|
||||||
|
-------------
|
||||||
|
|
||||||
|
If a grant is shared with other organizations, the owner organization MUST
|
||||||
|
initiate a transfer as a prerequisite for organization deletion.
|
||||||
|
|
||||||
|
If a grant is not shared, the owner may unclaim the namespace in either of the
|
||||||
|
following circumstances:
|
||||||
|
|
||||||
|
* The organization manually removes itself as the owner.
|
||||||
|
* The organization is deleted.
|
||||||
|
|
||||||
|
When a reserved namespace becomes unclaimed, repositories:
|
||||||
|
|
||||||
|
1. MUST remove the `visual indicator <user-interface_>`_
|
||||||
|
2. MUST NOT modify past `release metadata <repository-metadata_>`_
|
||||||
|
|
||||||
|
Grant Applications
|
||||||
|
------------------
|
||||||
|
|
||||||
|
Submission
|
||||||
|
''''''''''
|
||||||
|
|
||||||
|
Only `organizations <orgs_>`_ have access to the page for submitting grant
|
||||||
|
applications. Reviews of applications from `corporate organizations <corp-orgs_>`_
|
||||||
|
are prioritized.
|
||||||
|
|
||||||
|
.. _grant-approval-criteria:
|
||||||
|
|
||||||
|
Approval Criteria
|
||||||
|
'''''''''''''''''
|
||||||
|
|
||||||
|
1. The namespace MUST NOT be something common like ``tool`` or ``apps``.
|
||||||
|
2. The namespace SHOULD be greater than three characters.
|
||||||
|
3. The namespace SHOULD properly and clearly identify the reservation owner.
|
||||||
|
4. The organization SHOULD be actively using the namespace.
|
||||||
|
5. There SHOULD be evidence that *not* reserving the namespace may cause
|
||||||
|
ambiguity, confusion, or other harm to the community.
|
||||||
|
|
||||||
|
Organizations that are not `corporate organizations <corp-orgs_>`_ MUST
|
||||||
|
represent one of the following:
|
||||||
|
|
||||||
|
* Large, popular open-source projects with many packages [2]_
|
||||||
|
* Universities that actively publish packages
|
||||||
|
* Government organizations that actively publish packages
|
||||||
|
* NPOs/NGOs that actively publish packages like
|
||||||
|
`Our World in Data <https://github.com/owid>`__
|
||||||
|
|
||||||
|
Backwards Compatibility
|
||||||
|
=======================
|
||||||
|
|
||||||
|
There are no intrinsic concerns because there is still a flat namespace and
|
||||||
|
installers need no modification. Additionally, many projects have already
|
||||||
|
chosen to signal a shared purpose with a prefix like `typeshed has done`__.
|
||||||
|
|
||||||
|
__ https://github.com/python/typeshed/issues/2491#issuecomment-578456045
|
||||||
|
|
||||||
|
Security Implications
|
||||||
|
=====================
|
||||||
|
|
||||||
|
* Although users will no longer see the visual indicator when a namespace
|
||||||
|
becomes unclaimed, external consumers of metadata may have difficulty
|
||||||
|
scraping the user-facing
|
||||||
|
`enumeration <user-interface_>`_ of grants to verify current ownership.
|
||||||
|
* There is an opportunity to build on top of :pep:`740` and :pep:`480` so that
|
||||||
|
one could prove cryptographically that a specific release came from an owner
|
||||||
|
of the associated namespace. This PEP makes no effort to describe how this
|
||||||
|
will happen beyond noting that such work is planned for the future.
|
||||||
|
|
||||||
|
How to Teach This
|
||||||
|
=================
|
||||||
|
|
||||||
|
For organizations, we will document how to reserve namespaces, what the
|
||||||
|
benefits are and pricing.
|
||||||
|
|
||||||
|
For consumers of packages we will document the indicator on release pages, how
|
||||||
|
metadata is exposed in the `API <repository-metadata_>`_ and, potentially in the
future, tooling that uses namespaces to provide extra
|
||||||
|
security guarantees during installation.
|
||||||
|
|
||||||
|
Reference Implementation
|
||||||
|
========================
|
||||||
|
|
||||||
|
None at this time.
|
||||||
|
|
||||||
|
Rejected Ideas
|
||||||
|
==============
|
||||||
|
|
||||||
|
Allow Non-Public Namespaces for Community Projects
|
||||||
|
--------------------------------------------------
|
||||||
|
|
||||||
|
This PEP enforces that the discretionary namespace grants for community
|
||||||
|
projects are `public <public-namespaces_>`_. This is almost always desired by
|
||||||
|
such projects and prevents the following situations:
|
||||||
|
|
||||||
|
* A perceived reduction in openness of community projects, for example if a
|
||||||
|
project was taken over by a business entity there may be a desire for it to
|
||||||
|
prevent the creation of new packages matching the namespace.
|
||||||
|
* When an existing community project with plugins (such as MkDocs) chooses to
|
||||||
|
reserve a namespace, future plugins that are officially adopted would have to
|
||||||
|
change their name. This would cause a massive disruption to users and reset
|
||||||
|
usage statistics. The workaround is to have a new package that is advertised
|
||||||
|
which would depend on the real package but this is suboptimal.
|
||||||
|
|
||||||
|
Open Issues
|
||||||
|
===========
|
||||||
|
|
||||||
|
None at this time.
|
||||||
|
|
||||||
|
Footnotes
|
||||||
|
=========
|
||||||
|
|
||||||
|
.. [1] The following shows the package prefixes for the major cloud providers:
|
||||||
|
|
||||||
|
- Amazon: `aws-cdk- <https://docs.aws.amazon.com/cdk/api/v2/python/>`__
|
||||||
|
- Google: `google-cloud- <https://github.com/googleapis/google-cloud-python/tree/main/packages>`__
|
||||||
|
and others based on ``google-``
|
||||||
|
- Microsoft: `azure- <https://github.com/Azure/azure-sdk-for-python/tree/main/sdk>`__
|
||||||
|
|
||||||
|
.. [2] Some examples of projects that have many packages with a common prefix:
|
||||||
|
|
||||||
|
- `MkDocs <https://github.com/mkdocs/mkdocs>`__ is a documentation framework
|
||||||
|
based on Markdown files. They have the concept of
|
||||||
|
`plugins <https://www.mkdocs.org/dev-guide/plugins/>`__ which may be
|
||||||
|
developed by anyone and by convention are prefixed by ``mkdocs-``.
|
||||||
|
- `Project Jupyter <https://jupyter.org>`__ is devoted to the development of
|
||||||
|
tooling for sharing interactive documents. They support `extensions`__
|
||||||
|
which in most cases (and in all cases for officially maintained extensions)
|
||||||
|
are prefixed by ``jupyter-``.
|
||||||
|
- `OpenTelemetry <https://opentelemetry.io>`__ is an open standard for
|
||||||
|
observability with `official packages`__ for the core APIs and SDK with
|
||||||
|
`third-party packages`__ to collect data from various sources. All
|
||||||
|
packages are prefixed by ``opentelemetry-`` with child prefixes in the
|
||||||
|
form ``opentelemetry-<component>-<name>-``.
|
||||||
|
|
||||||
|
__ https://jupyterlab.readthedocs.io/en/stable/user/extensions.html
|
||||||
|
__ https://github.com/open-telemetry/opentelemetry-python
|
||||||
|
__ https://github.com/open-telemetry/opentelemetry-python-contrib
|
||||||
|
|
||||||
|
.. _orgs: https://blog.pypi.org/posts/2023-04-23-introducing-pypi-organizations/
|
||||||
|
.. _corp-orgs: https://docs.pypi.org/organization-accounts/pricing-and-payments/#corporate-organizations
|
||||||
|
|
||||||
|
Copyright
|
||||||
|
=========
|
||||||
|
|
||||||
|
This document is placed in the public domain or under the
|
||||||
|
CC0-1.0-Universal license, whichever is more permissive.
|