PEP 752: Package repository namespaces (#3903)
PEP 752: Package repository namespaces --------- Co-authored-by: Barry Warsaw <barry@python.org>
This commit is contained in:
parent
5d475023e5
commit
9247c9872c
|
@ -631,6 +631,7 @@ peps/pep-0749.rst @JelleZijlstra
|
|||
# ...
|
||||
peps/pep-0750.rst @gvanrossum @lysnikolaou
|
||||
peps/pep-0751.rst @brettcannon
|
||||
peps/pep-0752.rst @warsaw
|
||||
# ...
|
||||
# peps/pep-0754.rst
|
||||
# ...
|
||||
|
|
|
@ -0,0 +1,429 @@
|
|||
PEP: 752
|
||||
Title: Package repository namespaces
|
||||
Author: Ofek Lev <ofekmeister@gmail.com>
|
||||
Sponsor: Barry Warsaw <barry@python.org>
|
||||
PEP-Delegate: Donald Stufft <donald@stufft.io>
|
||||
Status: Draft
|
||||
Type: Standards Track
|
||||
Topic: Packaging
|
||||
Created: 13-Aug-2024
|
||||
|
||||
Abstract
|
||||
========
|
||||
|
||||
This PEP specifies a way for organizations to reserve package name prefixes
|
||||
for future uploads.
|
||||
|
||||
"Namespaces are one honking great idea -- let's do more of
|
||||
those!" - :pep:`20`
|
||||
|
||||
Motivation
|
||||
==========
|
||||
|
||||
The current ecosystem lacks a way for projects with many packages to signal a
|
||||
verified pattern of ownership. Some examples:
|
||||
|
||||
* `Typeshed <https://github.com/python/typeshed>`__ is a community effort to
|
||||
maintain type stubs for various packages. The stub packages they maintain
|
||||
mirror the package name they target and are prefixed by ``types-``. For
|
||||
example, the package ``requests`` has a stub that users would depend on
|
||||
called ``types-requests``.
|
||||
* Major cloud providers like Amazon, Google and Microsoft have a common prefix
|
||||
for each feature's corresponding package [1]_. For example, most of Google's
|
||||
packages are prefixed by ``google-cloud-`` e.g. ``google-cloud-compute`` for
|
||||
`using virtual machines <https://cloud.google.com/products/compute>`__.
|
||||
* Many projects [2]_ support a model where some packages are officially
|
||||
maintained and third-party developers are encouraged to participate by
|
||||
creating their own. For example, `Datadog <https://www.datadoghq.com>`__
|
||||
offers observability as a service for organizations at any scale. The
|
||||
`Datadog Agent <https://docs.datadoghq.com/agent/>`__ ships out-of-the-box
|
||||
with
|
||||
`official integrations <https://github.com/DataDog/integrations-core>`__
|
||||
for many products, like various databases and web servers, which are
|
||||
distributed as Python packages that are prefixed by ``datadog-``. There is
|
||||
support for creating `third-party integrations`__ which customers may run.
|
||||
|
||||
__ https://docs.datadoghq.com/developers/integrations/agent_integration/
|
||||
|
||||
Such projects are uniquely vulnerable to attacks stemming from malicious actors
|
||||
squatting anticipated package names. For example, say a new product is released
|
||||
for which monitoring would be valuable. It would be reasonable to assume that
|
||||
Datadog would eventually support it as an official integration. It takes a
|
||||
nontrivial amount of time to deliver such an integration due to roadmap
|
||||
prioritization and the time required for implementation. It would be impossible
|
||||
to reserve the name of every potential package so in the interim an attacker
|
||||
may create a legitimate-appearing package which would execute malicious code at
|
||||
runtime. Not only are users more likely to install such packages but doing so
|
||||
taints the perception of the entire project.
|
||||
|
||||
Namespacing also would drastically reduce the incidence of
|
||||
`typosquatting <https://en.wikipedia.org/wiki/Typosquatting>`__
|
||||
because typos would have to be in the prefix itself which is
|
||||
`normalized <naming_>`_ and likely to be a short, well-known identifier like
|
||||
``aws-``.
|
||||
|
||||
Rationale
|
||||
=========
|
||||
|
||||
Tolerance for Disruption
|
||||
------------------------
|
||||
|
||||
Other package ecosystems have generally solved this problem by taking one of
|
||||
two approaches: either minimizing or maximizing backwards compatibility.
|
||||
|
||||
* `NPM <https://www.npmjs.com>`__ has the concept of
|
||||
`scoped packages <https://docs.npmjs.com/about-scopes>`__ which were
|
||||
`introduced`__ primarily to combat there being a dearth of available good
|
||||
package names (whether a real or perceived phenomenon). When a user or
|
||||
organization signs up they are given a scope that matches their name. For
|
||||
example, the
|
||||
`package <https://www.npmjs.com/package/@google-cloud/storage>`__ for using
|
||||
Google Cloud Storage is ``@google-cloud/storage`` where ``@google-cloud/`` is
|
||||
the scope. Regular user accounts (non-organization) may publish `unscoped`__
|
||||
packages for public use.
|
||||
This approach has the lowest amount of backwards compatibility because every
|
||||
installer and tool has to be modified to account for scopes.
|
||||
* `NuGet <https://www.nuget.org>`__ has the concept of
|
||||
`package ID prefix reservation`__ which was
|
||||
`introduced`__ primarily to satisfy users wishing to know where a package
|
||||
came from. A package name prefix may be reserved for use by one or more
|
||||
owners. Every reserved package has a special indication
|
||||
`on its page <https://www.nuget.org/packages/Google.Cloud.Storage.V1>`__ to
|
||||
communicate this. After reservation, any upload with a reserved prefix will
|
||||
fail if the user is not an owner of the prefix. Existing packages that have a
|
||||
prefix that is owned may continue to release as usual. This approach has the
|
||||
highest amount of backwards compatibility because only modifications to
|
||||
indices like PyPI are required and installers do not need to change.
|
||||
|
||||
__ https://blog.npmjs.org/post/116936804365/solving-npms-hard-problem-naming-packages
|
||||
__ https://docs.npmjs.com/package-scope-access-level-and-visibility
|
||||
__ https://learn.microsoft.com/en-us/nuget/nuget-org/id-prefix-reservation
|
||||
__ https://devblogs.microsoft.com/nuget/Package-identity-and-trust/
|
||||
|
||||
This PEP specifies the NuGet approach of authorized reservation across a flat
|
||||
namespace for the following reasons:
|
||||
|
||||
* Causing churn for the community is a hard blocker.
|
||||
* The NPM approach has the potential to cause confusion for users if we allow
|
||||
unscoped names. Our community has chosen to normalize separator characters
|
||||
and so ``@aws/s3`` would likely be confused with ``@aws-s3``.
|
||||
|
||||
Approval Process
|
||||
----------------
|
||||
|
||||
PyPI has been understaffed, receiving the first `dedicated specialist`__ in
|
||||
July 2024. Due to lack of resources, user support has been lacking for
|
||||
`package name claims <https://discuss.python.org/t/27436/19>`__,
|
||||
`organization requests <https://discuss.python.org/t/33764/15>`__,
|
||||
`storage limit increases <https://discuss.python.org/t/54035>`__,
|
||||
and even `account recovery <https://discuss.python.org/t/43422/122>`__.
|
||||
|
||||
__ https://pyfound.blogspot.com/2024/07/announcing-our-new-pypi-support.html
|
||||
|
||||
The `default policy <grant-approval-criteria_>`_ of only allowing
|
||||
`corporate organizations <corp-orgs_>`_ to reserve namespaces (except in
|
||||
specific scenarios) provides the following benefits:
|
||||
|
||||
* PyPI would have a constant source of funding for support specialists,
|
||||
infrastructure maintenance and new features.
|
||||
* Although each application would require independent review, less human
|
||||
feedback would be required because the process to approve a paid organization
|
||||
already bestows a certain amount of trust.
|
||||
|
||||
Specification
|
||||
=============
|
||||
|
||||
`Organizations <orgs_>`_ (NOT regular users) MAY reserve one or more
|
||||
namespaces. Such reservations neither confer ownership nor grant special
|
||||
privileges to existing packages.
|
||||
|
||||
.. _naming:
|
||||
|
||||
Naming
|
||||
------
|
||||
|
||||
A namespace MUST be a `valid`__ project name and `normalized`__ internally e.g.
|
||||
``foo.bar`` would become ``foo-bar``. The user facing namespace (e.g. in UI
|
||||
tooltips) MUST preserve the original pre-normalized text as defined during
|
||||
reservation.
|
||||
|
||||
__ https://packaging.python.org/en/latest/specifications/name-normalization/#name-format
|
||||
__ https://packaging.python.org/en/latest/specifications/name-normalization/#name-normalization
|
||||
|
||||
Grant Semantics
|
||||
---------------
|
||||
|
||||
A namespace grant bestows ownership over the following:
|
||||
|
||||
1. A package matching the namespace itself such as the placeholder package
|
||||
`microsoft <https://pypi.org/project/microsoft/>`__.
|
||||
2. Packages that start with the namespace followed by a hyphen. For example,
|
||||
the namespace ``foo`` would match the package ``foo-bar`` but not the
|
||||
package ``foobar``.
|
||||
|
||||
Package name matching acts upon the `normalized <naming_>`_ namespace.
|
||||
|
||||
Namespaces are per-repository and MUST NOT be shared between repositories.
|
||||
|
||||
Grant Types
|
||||
-----------
|
||||
|
||||
There are two types of grants.
|
||||
|
||||
.. _root-grant:
|
||||
|
||||
Root Grant
|
||||
''''''''''
|
||||
|
||||
Only `organizations <orgs_>`_ have the ability to submit requests for namespace
|
||||
grants. An organization gets a root grant for every accepted request. This
|
||||
grant may produce any number of `child grants <child-grant_>`_.
|
||||
|
||||
.. _child-grant:
|
||||
|
||||
Child Grant
|
||||
'''''''''''
|
||||
|
||||
A child grant is created by the owner of a `root grant <root-grant_>`_. The
|
||||
child namespace MUST be prefixed by the root grant namespace followed by a
|
||||
hyphen. For example, ``google-cloud`` would be a valid child of the root
|
||||
namespace ``google``.
|
||||
|
||||
Child grants cannot have their own child grants.
|
||||
|
||||
.. _grant-ownership:
|
||||
|
||||
Grant Ownership
|
||||
---------------
|
||||
|
||||
The owner of a grant may allow any number of other organizations to use the
|
||||
grant. The grants behave as if they were owned by the organization. The owner
|
||||
may revoke this permission at any time.
|
||||
|
||||
The owner may transfer ownership to another organization. If the organization
|
||||
is a corporate organization, the target for transfer must also be. Settings for
|
||||
permitted organizations are transferred as well.
|
||||
|
||||
.. _uploads:
|
||||
|
||||
Uploads
|
||||
-------
|
||||
|
||||
If the following criteria are all true for a given upload:
|
||||
|
||||
1. The package does not yet exist.
|
||||
2. The name matches a reserved namespace.
|
||||
3. The user is not authorized to use the namespace by the owner of the
|
||||
namespace.
|
||||
|
||||
Then the upload MUST fail with a 403 HTTP status code.
|
||||
|
||||
.. _user-interface:
|
||||
|
||||
User Interface
|
||||
--------------
|
||||
|
||||
Every page for a particular release
|
||||
(`example <https://pypi.org/project/google-cloud-compute/1.19.2/>`__)
|
||||
that both matches an active namespace grant and is tied to an
|
||||
`owner <grant-ownership_>`_
|
||||
MUST receive a special indicator that signifies this tie.
|
||||
|
||||
The UI also MUST indicate what the prefix is (NuGet does not do this) and this
|
||||
value MUST match the ``namespace`` key in the `API <repository-metadata_>`_.
|
||||
|
||||
Repositories SHOULD have a dedicated page that enumerates every active
|
||||
namespace grant and which organization(s) own it.
|
||||
|
||||
.. _public-namespaces:
|
||||
|
||||
Public Namespaces
|
||||
-----------------
|
||||
|
||||
The owner of a grant may choose to allow others the ability to release new
|
||||
packages with the associated namespace. Doing so MUST allow
|
||||
`uploads <uploads_>`_ for new packages matching the namespace from any user
|
||||
but such releases MUST NOT have the `visual indicator <user-interface_>`_.
|
||||
|
||||
It is possible for the `owner <grant-ownership_>`_ of a namespace to both make
|
||||
it public and allow other organizations to use it. In this case, the permitted
|
||||
organizations have no special permissions and are essentially only public.
|
||||
|
||||
Root grants given to `community projects <grant-approval-criteria_>`_ SHALL
|
||||
always be public.
|
||||
|
||||
.. _repository-metadata:
|
||||
|
||||
Repository Metadata
|
||||
-------------------
|
||||
|
||||
To allow installers and other tooling insight into this metadata for a given
|
||||
artifact upload of a namespaced package, the :pep:`JSON API <691>` MUST include
|
||||
the following keys:
|
||||
|
||||
* ``namespace``: This is the associated `normalized <naming_>`_
|
||||
namespace e.g. ``foo-bar``. If the namespace matches a child grant and the
|
||||
user happens to be authorized for both the child and the root grant, this
|
||||
MUST be the namespace associated with the child grant.
|
||||
* ``owner``: This is the organization with which the user is associated and
|
||||
owner of the grant. If the namespace is `public <public-namespaces_>`_ and
|
||||
the user is not part of a `permitted <grant-ownership_>`_ organization, this
|
||||
key MUST be set to ``__public__``. This is useful for tools that wish to make
|
||||
a distinction between official and community packages.
|
||||
|
||||
The `Simple API`__ MAY include the aforementioned keys as attributes, for
|
||||
example:
|
||||
|
||||
__ https://packaging.python.org/en/latest/specifications/simple-repository-api/#base-html-api
|
||||
|
||||
.. code-block:: html
|
||||
|
||||
<a href="..." namespace="foo-bar" owner="org1">...</a>
|
||||
|
||||
Grant Removal
|
||||
-------------
|
||||
|
||||
If a grant is shared with other organizations, the owner organization MUST
|
||||
initiate a transfer as a prerequisite for organization deletion.
|
||||
|
||||
If a grant is not shared, the owner may unclaim the namespace in either of the
|
||||
following circumstances:
|
||||
|
||||
* The organization manually removes themselves as the owner.
|
||||
* The organization is deleted.
|
||||
|
||||
When a reserved namespace becomes unclaimed, repositories:
|
||||
|
||||
1. MUST remove the `visual indicator <user-interface_>`_
|
||||
2. MUST NOT modify past `release metadata <repository-metadata_>`_
|
||||
|
||||
Grant Applications
|
||||
------------------
|
||||
|
||||
Submission
|
||||
''''''''''
|
||||
|
||||
Only `organizations <orgs_>`_ have access to the page for submitting grant
|
||||
applications. Reviews of `corporate organizations <corp-orgs_>`_ applications
|
||||
are prioritized.
|
||||
|
||||
.. _grant-approval-criteria:
|
||||
|
||||
Approval Criteria
|
||||
'''''''''''''''''
|
||||
|
||||
1. The namespace MUST NOT be something common like ``tool`` or ``apps``.
|
||||
2. The namespace SHOULD be greater than three characters.
|
||||
3. The namespace SHOULD properly and clearly identify the reservation owner.
|
||||
4. The organization SHOULD be actively using the namespace.
|
||||
5. There SHOULD be evidence that *not* reserving the namespace may cause
|
||||
ambiguity, confusion, or other harm to the community.
|
||||
|
||||
Organizations that are not `corporate organizations <corp-orgs_>`_ MUST
|
||||
represent one of the following:
|
||||
|
||||
* Large, popular open-source projects with many packages [2]_
|
||||
* Universities that actively publish packages
|
||||
* Government organizations that actively publish packages
|
||||
* NPOs/NGOs that actively publish packages like
|
||||
`Our World in Data <https://github.com/owid>`__
|
||||
|
||||
Backwards Compatibility
|
||||
=======================
|
||||
|
||||
There are no intrinsic concerns because there is still a flat namespace and
|
||||
installers need no modification. Additionally, many projects have already
|
||||
chosen to signal a shared purpose with a prefix like `typeshed has done`__.
|
||||
|
||||
__ https://github.com/python/typeshed/issues/2491#issuecomment-578456045
|
||||
|
||||
Security Implications
|
||||
=====================
|
||||
|
||||
* Although users will no longer see the visual indicator when a namespace
|
||||
becomes unclaimed, external consumers of metadata may have difficulty
|
||||
scraping the user facing
|
||||
`enumeration <user-interface_>`_ of grants to verify current ownership.
|
||||
* There is an opportunity to build on top of :pep:`740` and :pep:`480` so that
|
||||
one could prove cryptographically that a specific release came from an owner
|
||||
of the associated namespace. This PEP makes no effort to describe how this
|
||||
will happen other than that work is planned for the future.
|
||||
|
||||
How to Teach This
|
||||
=================
|
||||
|
||||
For organizations, we will document how to reserve namespaces, what the
|
||||
benefits are and pricing.
|
||||
|
||||
For consumers of packages we will document the indicator on release pages, how
|
||||
metadata is exposed in the `API <repository-metadata_>`_ and potentially in
|
||||
future note tooling that supports utilizing namespaces to provide extra
|
||||
security guarantees during installation.
|
||||
|
||||
Reference Implementation
|
||||
========================
|
||||
|
||||
None at this time.
|
||||
|
||||
Rejected Ideas
|
||||
==============
|
||||
|
||||
Allow Non-Public Namespaces for Community Projects
|
||||
--------------------------------------------------
|
||||
|
||||
This PEP enforces that the discretionary namespace grants for community
|
||||
projects are `public <public-namespaces_>`_. This is almost always desired by
|
||||
such projects and prevents the following situations:
|
||||
|
||||
* A perceived reduction in openness of community projects, for example if a
|
||||
project was taken over by a business entity there may be a desire for it to
|
||||
prevent the creation of new packages matching the namespace.
|
||||
* When an existing community project with plugins (such as MkDocs) chooses to
|
||||
reserve a namespace, future plugins that are officially adopted would have to
|
||||
change their name. This would cause a massive disruption to users and reset
|
||||
usage statistics. The workaround is to have a new package that is advertised
|
||||
which would depend on the real package but this is suboptimal.
|
||||
|
||||
Open Issues
|
||||
===========
|
||||
|
||||
None at this time.
|
||||
|
||||
Footnotes
|
||||
=========
|
||||
|
||||
.. [1] The following shows the package prefixes for the major cloud providers:
|
||||
|
||||
- Amazon: `aws-cdk- <https://docs.aws.amazon.com/cdk/api/v2/python/>`__
|
||||
- Google: `google-cloud- <https://github.com/googleapis/google-cloud-python/tree/main/packages>`__
|
||||
and others based on ``google-``
|
||||
- Microsoft: `azure- <https://github.com/Azure/azure-sdk-for-python/tree/main/sdk>`__
|
||||
|
||||
.. [2] Some examples of projects that have many packages with a common prefix:
|
||||
|
||||
- `MkDocs <https://github.com/mkdocs/mkdocs>`__ is a documentation framework
|
||||
based on Markdown files. They have the concept of
|
||||
`plugins <https://www.mkdocs.org/dev-guide/plugins/>`__ which may be
|
||||
developed by anyone and by convention are prefixed by ``mkdocs-``.
|
||||
- `Project Jupyter <https://jupyter.org>`__ is devoted to the development of
|
||||
tooling for sharing interactive documents. They support `extensions`__
|
||||
which in most cases (and in all cases for officially maintained extensions)
|
||||
are prefixed by ``jupyter-``.
|
||||
- `OpenTelemetry <https://opentelemetry.io>`__ is an open standard for
|
||||
observability with `official packages`__ for the core APIs and SDK with
|
||||
`third-party packages`__ to collect data from various sources. All
|
||||
packages are prefixed by ``opentelemetry-`` with child prefixes in the
|
||||
form ``opentelemetry-<component>-<name>-``.
|
||||
|
||||
__ https://jupyterlab.readthedocs.io/en/stable/user/extensions.html
|
||||
__ https://github.com/open-telemetry/opentelemetry-python
|
||||
__ https://github.com/open-telemetry/opentelemetry-python-contrib
|
||||
|
||||
.. _orgs: https://blog.pypi.org/posts/2023-04-23-introducing-pypi-organizations/
|
||||
.. _corp-orgs: https://docs.pypi.org/organization-accounts/pricing-and-payments/#corporate-organizations
|
||||
|
||||
Copyright
|
||||
=========
|
||||
|
||||
This document is placed in the public domain or under the
|
||||
CC0-1.0-Universal license, whichever is more permissive.
|
Loading…
Reference in New Issue