PEP 752: Address feedback, round 1 (#3916)

* PEP 752: Address feedback, round 1

* address feedback
Ofek Lev 2024-08-21 15:08:50 -04:00 committed by GitHub
parent 9765fa989f
commit 05bf9c5ef1
1 changed files with 106 additions and 43 deletions


@ -47,16 +47,23 @@ verified pattern of ownership. Some examples:
__ https://docs.datadoghq.com/developers/integrations/agent_integration/
Such projects are uniquely vulnerable to attacks stemming from malicious actors
squatting anticipated package names. For example, say a new product is released
for which monitoring would be valuable. It would be reasonable to assume that
Datadog would eventually support it as an official integration. It takes a
nontrivial amount of time to deliver such an integration due to roadmap
prioritization and the time required for implementation. It would be impossible
to reserve the name of every potential package so in the interim an attacker
may create a legitimate-appearing package which would execute malicious code at
runtime. Not only are users more likely to install such packages but doing so
taints the perception of the entire project.
Such projects are uniquely vulnerable to `dependency confusion`__ attacks.
For example, say a new product is released for which monitoring would be
valuable. It would be reasonable to assume that Datadog would eventually
support it as an official integration. It takes a nontrivial amount of time to
deliver such an integration due to roadmap prioritization and the time required
for implementation. It would be impossible to reserve the name of every
potential package, so in the interim an attacker may create a package that
appears legitimate but executes malicious code at runtime. Not only are
users more likely to install such packages, but doing so taints the perception
of the entire project.
__ https://www.activestate.com/resources/quick-reads/dependency-confusion/
Although :pep:`708` attempts to address this attack vector, it is specifically
about the case of multiple repositories being considered during dependency
resolution and does not offer any protection to the aforementioned use cases.
Namespacing also would drastically reduce the incidence of
`typosquatting <https://en.wikipedia.org/wiki/Typosquatting>`__
@ -137,7 +144,7 @@ Specification
`Organizations <orgs_>`_ (NOT regular users) MAY reserve one or more
namespaces. Such reservations neither confer ownership nor grant special
privileges to existing packages.
privileges to existing projects.
.. _naming:
@ -157,15 +164,16 @@ Grant Semantics
A namespace grant bestows ownership over the following:
1. A package matching the namespace itself such as the placeholder package
1. A project matching the namespace itself such as the placeholder package
`microsoft <https://pypi.org/project/microsoft/>`__.
2. Packages that start with the namespace followed by a hyphen. For example,
the namespace ``foo`` would match the package ``foo-bar`` but not the
package ``foobar``.
2. Projects that start with the namespace followed by a hyphen. For example,
the namespace ``foo`` would match the normalized project name ``foo-bar``
but not the project name ``foobar``.
Package name matching acts upon the `normalized <naming_>`_ namespace.
Namespaces are per-repository and MUST NOT be shared between repositories.
Namespaces are per-package repository and SHALL NOT be shared between
repositories.
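
A minimal sketch of the matching rule described above, assuming the standard PyPA name normalization (runs of ``-``, ``_`` and ``.`` collapse to a single hyphen, then lowercase). The function names ``normalize`` and ``matches_namespace`` are hypothetical and are not part of the PEP:

```python
import re


def normalize(name: str) -> str:
    """Normalize a project name: collapse runs of -, _ and . into one hyphen, lowercase."""
    return re.sub(r"[-_.]+", "-", name).lower()


def matches_namespace(project_name: str, namespace: str) -> bool:
    """Return True if the project is the namespace itself or starts with it plus a hyphen."""
    name = normalize(project_name)
    ns = normalize(namespace)
    return name == ns or name.startswith(ns + "-")


print(matches_namespace("foo-bar", "foo"))   # namespace followed by a hyphen
print(matches_namespace("foobar", "foo"))    # no hyphen after the namespace
```

Because both sides are normalized first, a name such as ``Foo_Bar`` is treated the same as ``foo-bar`` in this sketch.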
Grant Types
-----------
@ -213,9 +221,9 @@ Uploads
If the following criteria are all true for a given upload:
1. The package does not yet exist.
1. The project does not yet exist.
2. The name matches a reserved namespace.
3. The user is not authorized to use the namespace by the owner of the
3. The project is not owned by an organization with an active grant for the
namespace.
Then the upload MUST fail with a 403 HTTP status code.
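
The three criteria can be read as a simple conjunction. The following is an illustrative sketch only; the function name and its boolean parameters are hypothetical, and it deliberately does not model public namespaces or any other repository-side checks:

```python
def must_reject_upload(
    project_exists: bool,
    name_matches_reserved_namespace: bool,
    owner_has_active_grant: bool,
) -> bool:
    """Return True when all three criteria hold, i.e. the upload must fail with 403."""
    return (
        not project_exists
        and name_matches_reserved_namespace
        and not owner_has_active_grant
    )


# A new project in a reserved namespace, uploaded without a grant, is rejected.
status = 403 if must_reject_upload(False, True, False) else 200
print(status)
```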
@ -243,8 +251,8 @@ Public Namespaces
-----------------
The owner of a grant may choose to allow others the ability to release new
packages with the associated namespace. Doing so MUST allow
`uploads <uploads_>`_ for new packages matching the namespace from any user
projects with the associated namespace. Doing so MUST allow
`uploads <uploads_>`_ for new projects matching the namespace from any user
but such releases MUST NOT have the `visual indicator <user-interface_>`_.
It is possible for the `owner <grant-ownership_>`_ of a namespace to both make
@ -254,33 +262,41 @@ organizations have no special permissions and are essentially only public.
Root grants given to `community projects <grant-approval-criteria_>`_ SHALL
always be public.
When a `child grant <child-grant_>`_ is created, its public status SHALL be
inherited from the `root grant <root-grant_>`_. Owners of child grants MAY
make them public at any time. If a grant is public, it MUST NOT be made private
unless the owner of the grant is the owner of every project that matches the
namespace.
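
As a rough illustration of the last rule, a repository could check ownership of every matching project before allowing a public grant to become private. The helper below is a hypothetical sketch that assumes owners are represented as plain strings:

```python
def may_make_private(grant_owner: str, matching_project_owners: list[str]) -> bool:
    """A public grant may become private only if its owner owns every matching project."""
    return all(owner == grant_owner for owner in matching_project_owners)


print(may_make_private("org1", ["org1", "org1"]))
print(may_make_private("org1", ["org1", "some-user"]))
```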
.. _repository-metadata:
Repository Metadata
-------------------
To allow installers and other tooling insight into this metadata for a given
artifact upload of a namespaced package, the :pep:`JSON API <691>` MUST include
the following keys:
To allow installers and other tooling insight into this project-level metadata
of a namespaced project, the :pep:`JSON API <691>` version will be incremented
and support new keys for the project endpoint.
* ``namespace``: This is the associated `normalized <naming_>`_
namespace e.g. ``foo-bar``. If the namespace matches a child grant and the
user happens to be authorized for both the child and the root grant, this
MUST be the namespace associated with the child grant.
* ``owner``: This is the organization with which the user is associated and
owner of the grant. If the namespace is `public <public-namespaces_>`_ and
the user is not part of a `permitted <grant-ownership_>`_ organization, this
key MUST be set to ``__public__``. This is useful for tools that wish to make
a distinction between official and community packages.
The ``owner`` key SHOULD be added and refer to the owner of the project,
whether an organization or a user.
The `Simple API`__ MAY include the aforementioned keys as attributes, for
example:
The ``namespace`` key MAY be added and MUST be ``null`` if the project does not
match an active namespace grant. If the project does match a namespace grant,
the value MUST be a mapping with the following keys:
__ https://packaging.python.org/en/latest/specifications/simple-repository-api/#base-html-api
* ``name``: This is the associated `normalized <naming_>`_ namespace e.g.
``foo-bar``. If the owner of the project owns multiple matching grants then
this MUST be the namespace with the greatest number of characters. For example,
if the project name matched both ``foo-bar`` and ``foo-bar-baz`` then this
key would be the latter.
* ``owners``: This is an array of organizations that
`own <grant-ownership_>`_ the grant. This is useful for tools that wish to
make a distinction between official and community packages by checking if
the array contains the project ``owner``.
* ``public``: This is a boolean indicating whether the namespace is
`public <public-namespaces_>`_.
.. code-block:: html
<a href="..." namespace="foo-bar" owner="org1">...</a>
The presence of the ``namespace`` key indicates support for this PEP.
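
The sketch below shows one way a tool might use these keys, under stated assumptions: grants are modeled as plain dictionaries with the ``name``, ``owners`` and ``public`` keys described above, and the helpers ``select_namespace`` and ``is_official`` are hypothetical names rather than part of any API. It picks the longest matching namespace and then checks whether the project ``owner`` appears in ``owners``:

```python
from typing import Optional


def select_namespace(normalized_project: str, grants: list[dict]) -> Optional[dict]:
    """Return the matching grant with the longest name, or None if nothing matches."""
    matching = [
        g for g in grants
        if normalized_project == g["name"]
        or normalized_project.startswith(g["name"] + "-")
    ]
    if not matching:
        return None
    return max(matching, key=lambda g: len(g["name"]))


def is_official(project: dict) -> bool:
    """Treat a project as official if its owner is among the namespace owners."""
    namespace = project.get("namespace")
    return namespace is not None and project["owner"] in namespace["owners"]


# Hypothetical grants; both match the project name, the longer one wins.
grants = [
    {"name": "foo-bar", "owners": ["org1"], "public": True},
    {"name": "foo-bar-baz", "owners": ["org1"], "public": False},
]
project = {
    "owner": "org1",
    "namespace": select_namespace("foo-bar-baz-qux", grants),
}
print(project["namespace"]["name"])
print(is_official(project))
```

A project whose ``namespace`` value is ``null`` (``None`` here) is never reported as official by this check.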
Grant Removal
-------------
@ -297,7 +313,7 @@ following circumstances:
When a reserved namespace becomes unclaimed, repositories:
1. MUST remove the `visual indicator <user-interface_>`_
2. MUST NOT modify past `release metadata <repository-metadata_>`_
2. MUST remove the ``namespace`` key in the `API <repository-metadata_>`_
Grant Applications
------------------
@ -370,6 +386,51 @@ None at this time.
Rejected Ideas
==============
Organization Scoping
--------------------
The primary motivation for this PEP is to reduce dependency confusion attacks,
and NPM-style scoping with an allowance of the legacy flat namespace would
increase the risk. If documentation instructed a user to install ``bar`` in the
namespace ``foo`` then the user must be careful to install ``@foo/bar`` and not
``foo-bar``, or vice versa. The Python packaging ecosystem has normalization
rules for names in order to maximize the ease of communication and this would
be a regression.
The runtime environment of Python is also not conducive to scoping. Whereas
multiple versions of the same JavaScript package may coexist, Python only
allows a single global namespace. Barring major changes to the language itself,
this is nearly impossible to change. Additionally, users have come to expect
that the package name is usually the same as what they would import and
eliminating the flat namespace would do away with that convention.
Scoping would be particularly affected by organization changes which are bound
to happen over time. An organization may change their name due to internal
shuffling, an acquisition, or any other reason. Whenever this happens, every
project they own would in effect be renamed, which would cause frequent and
unnecessary confusion for users.
Users have come to expect that package names may be typed without worry of
conflicting shell syntax and any namespace solution would pose challenges:
* Copying NPM's syntax (e.g. ``@foo/bar``) would alienate a large number of
Windows users because the ``@`` character is considered special in
`PowerShell`__.
* Starting names with a ``/`` would conflict with the common installer
capability of accepting paths without URI ``file://`` syntax.
* Starting names with a ``//`` like Bazel
`target patterns <https://bazel.build/run/build#specifying-build-targets>`__
would be confusing to users because the current normalization standard
eliminates consecutive separator characters.
__ https://learn.microsoft.com/en-us/powershell/scripting/lang-spec/chapter-07?view=powershell-7.4#717--operator
Finally, the disruption to the community would be massive because it would
require an update from every package manager, security scanner, IDE, etc. New
packages released with the scoping would be incompatible with older tools and
would cause confusion for users along with frustration from maintainers having
to triage such complaints.
Allow Non-Public Namespaces for Community Projects
--------------------------------------------------
@ -404,9 +465,11 @@ Footnotes
.. [2] Some examples of projects that have many packages with a common prefix:
- `Django <https://www.djangoproject.com>`__ is one of the most widely used
frameworks in existence. They have the concept of `middleware`__ which
allows for third-party packages to modify the request/response cycle.
These packages are by convention prefixed by ``django-``.
web frameworks in existence. They have the concept of `reusable apps`__,
which are commonly installed via
`third-party packages <https://djangopackages.org>`__ that implement a
subset of functionality to extend Django-based websites. These packages
are by convention prefixed by ``django-`` or ``dj-``.
- `Project Jupyter <https://jupyter.org>`__ is devoted to the development of
tooling for sharing interactive documents. They support `extensions`__
which in most cases (and in all cases for officially maintained
@ -434,7 +497,7 @@ Footnotes
They have the concept of `plugins`__, and also `providers`__ which are
prefixed by ``apache-airflow-providers-``.
__ https://docs.djangoproject.com/en/5.1/topics/http/middleware/
__ https://docs.djangoproject.com/en/5.1/intro/reusable-apps/
__ https://jupyterlab.readthedocs.io/en/stable/user/extensions.html
__ https://docs.pytest.org/en/stable/how-to/writing_plugins.html
__ https://www.sphinx-doc.org/en/master/usage/extensions/index.html