PEP 735: Dependency Groups in pyproject.toml (#3541)

Co-authored-by: James Webber <jamestwebber@users.noreply.github.com>
Co-authored-by: Hugo van Kemenade <hugovk@users.noreply.github.com>
Co-authored-by: Brett Cannon <brett@python.org>
Co-authored-by: Jelle Zijlstra <jelle.zijlstra@gmail.com>
Co-authored-by: Thomas Grainger <tagrain@gmail.com>
Co-authored-by: chrysle <fritzihab@posteo.de>
This commit is contained in:
Stephen Rosen 2023-11-28 19:15:22 -06:00 committed by GitHub
parent 66d1fc80bd
commit 1a089b850e
No known key found for this signature in database
GPG Key ID: 4AEE18F83AFDEB23
2 changed files with 653 additions and 0 deletions

1
.github/CODEOWNERS vendored
View File

@ -612,6 +612,7 @@ peps/pep-0731.rst @gvanrossum @encukou @vstinner @zooba @iritkatriel
peps/pep-0732.rst @Mariatta
peps/pep-0733.rst @encukou @vstinner @zooba @iritkatriel
peps/pep-0734.rst @ericsnowcurrently
peps/pep-0735.rst @brettcannon
# ...
# peps/pep-0754.rst
# ...

652
peps/pep-0735.rst Normal file
View File

@ -0,0 +1,652 @@
PEP: 735
Title: Dependency Groups in pyproject.toml
Author: Stephen Rosen <sirosen0@gmail.com>
Sponsor: Brett Cannon <brett@python.org>
PEP-Delegate: Paul Moore <p.f.moore@gmail.com>
Discussions-To: https://discuss.python.org/t/39233
Status: Draft
Type: Standards Track
Topic: Packaging
Created: 20-Nov-2023
Post-History: `14-Nov-2023 <https://discuss.python.org/t/29684>`__, `20-Nov-2023 <https://discuss.python.org/t/39233>`__
Abstract
========
This PEP specifies a mechanism for storing package requirements in
``pyproject.toml`` files such that they are not included in any built distribution of
the project.
This is suitable for creating named groups of dependencies, similar to
``requirements.txt`` files, which launchers, IDEs, and other tools can find and
identify by name.
The feature defined here is referred to as "Dependency Groups".
Motivation
==========
There are two major use cases for which the Python community has no
standardized answer:
* How should development dependencies be defined for packages?
* How should dependencies be defined for projects which do not build
distributions (non-package projects)?
In the absence of any standard, two known workflow tools, PDM and Poetry, have
defined their own solutions for Dependency Groups to address the first of these
two needs. Their definitions are very similar to one another, although PDM
structures them much more similarly to
`extras <https://packaging.python.org/en/latest/specifications/dependency-specifiers/#extras>`__
than Poetry does.
Neither of them addresses the needs of non-package projects, and neither can be
referenced natively from other tools like ``tox``, ``nox``, or ``pip``.
There are two pre-existing solutions which are similar to this proposal: the
use of ``requirements.txt`` files and package extras.
Regarding ``requirements.txt``, many projects may define one or more of these files,
and may arrange them either at the project root (e.g. ``requirements.txt`` and
``test-requirements.txt``) or else in a directory (e.g.
``requirements/base.txt`` and ``requirements/test.txt``). However, there are
two major issues with the use of requirements files in this way:
* There is no standardized naming convention such that tools can discover or
use these files by name.
* ``requirements.txt`` files are *not standardized*, but instead provide
options to ``pip``.
Therefore, their use is not portable to any new
installer or tool which wishes to process them without relying upon ``pip``.
Additionally, ``requirements.txt`` files may be viewed as a heavyweight
solution for very small dependency sets of only one or two items, and a terser
declaration will be beneficial to projects with a number of small groups of
dependencies.
Regarding extras, the use of extras to define development dependencies is a
widespread practice, but it has two major downsides:
* An extra is defined as an optional part of a package which may specify
additional dependencies.
This means that it cannot be installed without installing all of the package
dependencies, and the project must be defined as a package.
* Many developers view their extras as part of the public interface for their
package. Because these are published data, package developers often are
concerned about ensuring that their development extras are not confused with
user-facing extras. Therefore, their development needs are not appropriate to
publish in the manner of extras.
Providing a standardized place to store dependency data which matches the
typical use cases of ``requirements.txt`` files will allow for better
cross-compatibility between tools and will onboard beginning users into
``pyproject.toml`` data more quickly (before they learn about the
``[build-system]`` table, for example). It will also resolve the long-standing
ambiguity and tension within the Python community about whether or not extras
should be used to declare development dependencies.
Supported Project Types
-----------------------
This PEP aims to serve the needs of two distinct project types:
* libraries and other packages which want to define development dependencies
* projects which do not build distributions which want a standardized way to
declare their dependency data
The Python packaging toolchain is oriented towards the production of wheel and
sdist files, containing project source code for installation. However, not all
projects are designed or intended for this mode of distribution. Primary
examples include webapps (which often run from source), and data science
projects (which may start as collections of scripts and only later evolve
package structures). This PEP seeks to offer these projects a way to specify
their package dependencies in a way which is not constrained by the
requirements of distribution building -- for example, a data science project
does not need to declare a version or authors in order for these data to be
valid, nor does it need to define an installable package source tree.
Simultaneously, this PEP should benefit projects which *are* intended to build
distributions.
Because Dependency Groups, as specified in this PEP, are explicitly defined to
not be included in any built distribution, two common use cases are addressed
by this addition:
* the use of ``extras``, as discussed above
* the use of ``requirements.txt`` files for distributed libraries and
applications, which takes a similar form to their use in
non-distribution-building projects
Rationale
=========
This PEP defines the storage of requirements data in lists within a
``[dependency-groups]`` table.
This name was chosen to match the canonical name of the feature
("Dependency Groups").
This format should be as simple and learnable as possible, having a format
very similar to existing ``requirements.txt`` files for many cases. Each list
in ``[dependency-groups]`` is defined as a list of package specifiers. However,
there are a number of use cases which require data which cannot be expressed in
PEP 508 package specifiers. Therefore, this PEP also defines an object-format
for package specification, capable of defining these additional data.
The format does not support many forms of non-package data from
``requirements.txt`` files, such as ``pip`` options for package servers or
hashes. Including them in the initial stage complicates this proposal and it is
not clear that the majority of use-cases require them. Future expansions to
this specification may extend the object format for dependencies in
``[dependency-groups]`` values. Because this specification provides for an
object format for the data, it is possible for these to be added in future
specifications.
The following use cases are considered important targets for this PEP:
* A web application (e.g. a Django or Flask app) which does not build a
distribution, but bundles and ships its source to a deployment toolchain. The
app has runtime dependencies, testing dependencies, and development
environment dependencies. All of these dependency sets should be declared
independently, and none of these runtime contexts should require that the
project be built into a wheel.
* A library which builds a distribution, but has a variety of testing and
development environments it wishes to declare. Some of these rely on the
library's own dependencies (``test`` and ``typing`` environments) while
others do not (``docs`` and ``lint`` environments).
* A data science project which consists of multiple scripts that depend on the same suite
of core libraries (e.g. ``numpy``, ``pandas``, ``matplotlib``), and which
also has additional libraries, sometimes conflicting, for specific scripts
(e.g. ``scikit-learn==1.3.2`` for one script and ``scikit-learn==0.24.2`` for
another).
* *Input data* to a lockfile generator. Because there is no standardized
lockfile format, it is still the prerogative of tools like ``poetry`` and
``pip-compile`` to describe their own formats. However, it should be possible
for the data from this PEP to be used as an input to these tools.
Furthermore, tools may define their own structures and conventions such that
the generated lockfiles can be referenced by the names of their originating
Dependency Groups in ``pyproject.toml``.
* *Input data* to a tox, Nox, or Hatch environment, as can
currently be achieved, for example, with ``deps = -r requirements.txt`` in
``tox.ini``. These tools will need to add additional options for processing
Dependency Groups.
* Embeddable data for ``pyproject.toml`` within a script, as in :pep:`723`,
which defines embedded ``pyproject.toml`` data within scripts. This
PEP does not define exactly how :pep:`723` should be modified, but being
consumable by that interface is a stated goal.
* IDE discovery of requirements data. For example, VS Code could look for a dependency
group named ``test`` to use when running tests.
Note that this PEP does not reserve any names for specific use cases. It is
considered a problem for downstream standards and conventions to define
well-known names for certain needs, such as ``test`` or ``docs``.
Regarding Poetry and PDM Dependency Groups
------------------------------------------
Poetry and PDM already offer a feature which each calls "Dependency Groups",
but using non-standard data belonging to the ``poetry`` and ``pdm`` tools.
(PDM also uses extras for some Dependency Groups, and overlaps the notion
heavily with extras.)
This PEP is not guaranteed to be a perfectly substitutable solution for the
same problem space for each tool. However, the ideas are extremely similar, and
it should be possible for Poetry and PDM to support at least some
PEP-735-standardized Dependency Group configurations using their own Dependency
Group nomenclature.
A level of interoperability with Poetry and PDM is a goal of this PEP, but
certain features and behaviors defined here may not be supported by Poetry and
PDM. Matching the existing Poetry and PDM *semantics* for Dependency Groups is
a non-goal.
Dependency Groups are not Hidden Extras
---------------------------------------
One could be forgiven for thinking that Dependency Groups are just extras which
go unpublished.
However, there are two major features which distinguish them from
extras:
* they support non-package projects
* installation of a Dependency Group does not imply installation of a package's
dependencies (or the package itself)
Object Specification Does Not Allow for "PEP 508 Decomposition"
---------------------------------------------------------------
Poetry and PDM both allow for their object formats for dependency data to be
used to decompose a PEP 508 specification into parts. For example, a dependency
on ``requests==2.21.0`` could be written under these tools in a form like
``{name = "requests", version = "==2.21.0"}``.
This spec does not allow for such a structure. There is only one way to write
down a PEP 508 dependency, as a string. This has two positive effects:
* There are fewer ways of writing identical data
* It is not possible, by design, to combine a PEP 508 specifier with other
key-value pairs in a dependency specifier
Specification
=============
This PEP defines a new section (table) in ``pyproject.toml`` files named
``dependency-groups``. The ``dependency-groups`` table contains an arbitrary
number of user-defined keys, each of which has, as its value, a list of
requirements specifiers (defined below). These keys must match the following
regular expression: ``[a-z0-9][a-z0-9-]*[a-z0-9]``. Meaning that they must be
all lower-case alphanumerics, with ``-`` allowed only in the middle, and at
least two characters long. These requirements are chosen so that the
normalization rules used for PyPI package names are unnecessary as the names
are already normalized.
Requirements specifiers will use a definition based on standardized
`Dependency Specifiers <https://packaging.python.org/en/latest/specifications/dependency-specifiers/>`__
introduced in :pep:`508`.
This PEP also proposes an object format to define a dependency, called a
"Dependency Object Specifier" (defined below). The elements in
``[dependency-groups]`` lists must either be strings, in which case they must
be valid Dependency Specifiers (PEP 508) or else Dependency Object Specifiers.
Dependency Object Specifiers
----------------------------
Dependency Object Specifiers are objects (tables in TOML) which define a set of
dependencies. They do not necessarily refer to a single package, as will become
clear from the definitions below.
There are two forms of Dependency Object Specifiers: "Dependency Group Includes"
and "Path Dependencies".
Dependency Group Include
''''''''''''''''''''''''
A Dependency Group Include includes the dependencies of another Dependency
Group in the current Dependency Group.
An include is defined as an object with exactly one key, ``"include"``, whose
value is a string, the name of another Dependency Group.
For example, ``{include = "test"}`` is an include which expands to the
contents of the ``test`` Dependency Group.
Path Dependency
'''''''''''''''
A Path Dependency is a dependency on a package found via a local filesystem
path OR on dependencies of a package found via a local filesystem path.
It contains the following keys, with associated types:
* ``path``: a string, required
* ``extras``: a list of strings, optional
* ``editable``: a boolean, optional
* ``only-deps``: a boolean, optional
For a simple example, ``{path = ".", editable = true, extras = ["mysql"]}`` is
a Path Dependency on the current project including its ``mysql`` extra. In
``requirements.txt`` files, a similar idea can be expressed as ``-e '.[mysql]'``.
``path``
~~~~~~~~
The ``path`` must refer to the path to a built distribution (wheel, sdist, or
any future file format) or a directory containing python package source code.
If the ``path`` is relative, it is relative to the directory containing
``pyproject.toml``.
If the path refers to a built distribution, that package should be installed
when the Dependency Group is installed.
If the path refers to a directory, the directory SHOULD be a valid Python
package. Implementations MAY refuse to process ``path`` directives which are
not packages, even when ``only-deps`` is specified (see below for how
``only-deps`` would potentially make it possible to process non-package paths).
``extras``
~~~~~~~~~~
The ``extras`` key is a list of strings, each of which is the name of an extra
which should be included in the package installation.
``editable``
~~~~~~~~~~~~
If ``editable`` is ``true``, implementations installing the Dependency Group
SHOULD install the dependency in editable mode. If it is ``false``, they SHOULD
install it in non-editable mode. If ``editable`` is absent, no default behavior
is specified and implementations may choose their preferred default.
Implementations MAY provide options to users to configure or override this behavior.
For example, a tool may have an option ``--never-editable`` which always treats
``editable`` as ``false``.
However, implementations SHOULD prefer to use the ``editable`` value if it is
present.
If ``editable`` is specified on a Path Dependency which refers to a built
distribution, implementations MUST treat this as an error.
``only-deps``
~~~~~~~~~~~~~
If ``only-deps`` is ``true``, implementations MUST NOT install the package
at the specified ``path``. Instead, they should only install the dependencies
of that package. This may still require building a package from a source tree
in order to discover ``dynamic`` dependency data.
If ``only-deps`` is ``false`` or absent, implementations should install the
package at the specified ``path``.
It is possible to specify ``only-deps`` on a Path Dependency which does not
refer to a valid python package and for tools to, at least in theory, process
such a dependency successfully. For example, a ``pyproject.toml`` file
containing only ``[project.dependencies]`` and none of the other required keys
in the ``[project]`` table could be supported. Implementations SHOULD NOT
support such structures and SHOULD fail if a Path Dependency refers to a python
project which is not a package.
If ``only-deps`` is ``true`` and ``extras`` are specified, implementations
should install the ``extras`` as well as all of the non-optional dependencies
of the package.
Handling of Multiple Path Dependencies Referring to the Same Path
'''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''
It is possible for resolution of a Dependency Group to refer to the same
path multiple times (for example, via two combined includes) using distinct
Path Dependencies.
If the ``path`` values are distinct, but refer to the same concrete file or
directory on the filesystem, handling behavior is unspecified. Implementations
MAY normalize these paths to a single value.
If the ``path`` values are identical, implementations MUST treat the result as
a singular Path Dependency following the rules below:
- The ``extras`` keys are concatenated into a single list
- If ``editable`` is always ``true`` or always ``false``, it is treated as
such. Otherwise, the ``editable`` key is treated as though it were absent.
- If the ``only-deps`` key is present, a ``false`` or absent value takes
priority over any ``true`` values. In other words, implementations respect
``only-deps`` if it is ``true`` for all instances of the same ``path``.
Otherwise, they treat it as ``false``.
Example Dependency Groups Table
-------------------------------
The following is an example of a partial ``pyproject.toml`` which uses this to
define four Dependency Groups: ``test``, ``docs``, ``typing``, and
``typing-test``:
.. code:: toml
[dependency-groups]
test = ["pytest", "coverage", {path = ".", editable = true}]
docs = ["sphinx", "sphinx-rtd-theme"]
typing = ["mypy", "types-requests", {path = ".", extras = ["types"], only-deps = true}]
typing-test = [{include = "typing"}, {include = "test"}, "useful-types"]
[project.optional-dependencies]
types = ["typing-extensions"]
Note how ``test`` and ``typing`` refer to the current package while ``docs``
does not. This reflects the ability of Dependency Groups to be used in the same
manner as extras, adding to dependencies, or completely independently.
``typing-test`` is defined as a union of two existing groups, plus an
additional package. ``typing`` includes an extra, ``types``, and this is
included by extension under ``typing-test``.
Under ``typing-test`` implementations may choose whether or not to use an
editable installation of the current package or not, but they MUST treat
``only-deps`` as ``false``.
Package Building
----------------
Build backends MUST NOT include Dependency Group data in built distributions.
Use of Dependency Groups
------------------------
Tools which support Dependency Groups are expected to provide new options and
interfaces to allow users to install from Dependency Groups.
No syntax is defined for expressing the Dependency Group of a package, for two
reasons:
* it would not be valid to refer to the Dependency Groups of a third-party
package from PyPI (because the data is defined to be unpublished)
* there is not guaranteed to be a current package for Dependency Groups -- part
of their purpose is to support non-package projects
For example, a possible pip interface for installing Dependency Groups
would be:
.. code:: shell
pip install --dependency-groups=test,typing
Note that this is only an example. This PEP does not declare any requirements
for how tools support the installation of Dependency Groups.
Reference Implementation
========================
There is currently no reference implementation/consumer of this specification.
Backwards Compatibility
=======================
At time of writing, the ``dependency-groups`` namespace within a
``pyproject.toml`` file is unused. Since the top-level namespace is
reserved for use only by standards specified at packaging.python.org,
there should be no direct backwards compatibility concerns.
Security Implications
=====================
This PEP introduces new syntaxes and data formats for specifying dependency
information in projects. However, it does not introduce newly specified
mechanisms for handling or resolving dependencies.
It therefore does not carry security concerns other than those inherent in any
tools which may already be used to install dependencies -- i.e. malicious
dependencies may be specified here, just as they may be specified in
``requirements.txt`` files.
How to Teach This
=================
This feature should be referred to by its canonical name, "Dependency Groups".
The basic form of usage should be taught as a variant on typical
``requirements.txt`` data. Standard dependency specifiers (:pep:`508`) can be
added to a named list. Rather than asking pip to install from a
``requirements.txt`` file, either pip or a relevant workflow tool will install
from a named Dependency Group.
For new Python users, they may be taught directly to create a section in
``pyproject.toml`` containing their Dependency Groups, similarly to how they
are currently taught to use ``requirements.txt`` files.
This also allows new Python users to learn about ``pyproject.toml`` files
without needing to learn about package building.
A ``pyproject.toml`` file with only ``[dependency-groups]`` and no other tables
is valid.
For both new and experienced users, the object style used in Dependency Object
Specifiers will need to be explained. Support for
``{path = ".", editable = true}`` and
``{path = ".", editable = true, extras = ["extra"]}`` should be taught
similarly to teaching ``pip install -e .`` and ``pip install -e '.[extra]'``
-- it intentionally mirrors the effects of those commands.
Support for inclusion of one Dependency Group in another can be
taught as a homologue for one requirements file including another using ``-r``.
Rejected Ideas
==============
Why not define each Dependency Group as a table?
------------------------------------------------
If our goal is to allow for future expansion, then defining each Dependency
Group as a subtable, thus enabling us to attach future keys to each group,
allows for the greatest future flexibility.
However, it also makes the structure nested more deeply, and therefore harder
to teach and learn. One of the goals of this PEP is to be an easy replacement
for many ``requirements.txt`` use-cases.
Why not define a special string syntax to extend Dependency Specifiers?
-----------------------------------------------------------------------
Earlier drafts of this specification defined syntactic forms for Dependency
Group Includes and Path Dependencies.
However, there were three major issues with this approach:
* it complicates the string syntax which must be taught, beyond PEP 508
* the resulting strings would always need to be disambiguated from PEP 508
specifiers, complicating implementations
* support for the matrix of Path Dependency requirements (``editable``, ``only-deps``,
``extras``) would require a complex syntax whose design was unclear
Why not restrict dependencies to PEP 508 only?
----------------------------------------------
There are known use cases for:
* including one Dependency Group in another
* including the current package (if the project is a package)
* including the current package with extras (if the project is a package)
* installing ``project.dependencies`` from the current project but not
installing the current project (as can be done with
``{path = ".", only-deps = true}``)
* specifying that an installation be done in editable mode
These are not satisfiable without some expansion of syntax beyond what is
possible with existing Dependency Specifiers (:pep:`508`).
Why is the table not named ``[run]``, ``[project.dependency-groups]``, ...?
---------------------------------------------------------------------------
There are many possible names for this concept.
It will have to live alongside the already existing ``[project.dependencies]``
and ``[project.optional-dependencies]`` tables, and possibly a new
``[external]`` dependency table as well (at time of writing, :pep:`725`, which
defines the ``[external]`` table, is in progress).
``[run]`` was a leading proposal in earlier discussions, but its proposed usage
centered around a single set of runtime dependencies. This PEP explicitly
outlines multiple groups of dependencies, which makes ``[run]`` a less
appropriate fit -- this is not just dependency data for a specific runtime
context, but for multiple contexts.
``[project.dependency-groups]`` would be ideal, but has major downsides for
non-package projects. ``[project]`` requires several keys to be defined, such
as ``name`` and ``version``. Using this name would either require redefining
the ``[project]`` table to allow for these keys to be absent, or else would
impose a requirement on non-package projects to define and use these keys. By
extension, it would effectively require any non-package project allow itself to
be treated as a package.
Why is pip's planned implementation of ``--only-deps`` not sufficient?
----------------------------------------------------------------------
pip currently has a feature on the roadmap to add an
`--only-deps flag <https://github.com/pypa/pip/issues/11440>`_.
This flag is intended to allow users to install package dependencies and extras
without installing the current package.
It does not address the needs of non-package projects, nor does it allow for
the installation of an extra without the package dependencies.
Therefore, while it may be a useful feature for pip to pursue, it does not
address the same use-cases addressed here.
Why isn't <environment manager> a solution?
-------------------------------------------
Existing environment managers like tox, Nox, and Hatch already have
the ability to list inlined dependencies as part of their configuration data.
This meets many development dependency needs, and clearly associates dependency
groups with relevant tasks which can be run.
These mechanisms are *good* but they are not *sufficient*.
First, they do not address the needs of non-package projects.
Second, there is no standard for other tools to use to access these data. This
has impacts on high-level tools like IDEs and Dependabot, which cannot support
deep integration with these Dependency Groups. (For example, at time of writing
Dependabot will not flag dependencies which are pinned in ``tox.ini`` files.)
Open Issues
===========
Documenting Distinctions from Poetry and PDM Object Formats
-----------------------------------------------------------
The object format here is similar to the ones used by Poetry and PDM, but not
identical.
The differences should be captured in Appendix B, as a part of documenting Poetry
and PDM behaviors and data formats.
Editable Installation May or May Not Be In Scope
------------------------------------------------
The ``editable`` field formalizes a behavior which is supported by setuptools
and pip, but which has three major issues:
- it combines information about the existence of a dependency with information
about *how* that dependency should be installed
- editable installs, as implemented today, do not provide for seamless updates
to package metadata
- there is no specification for the behavior of an editable install -- this is
an implementation-specific feature
Therefore, ``editable`` may need to be removed.
Such a removal must account for how existing tools and workflows which rely on
editable installs will be impacted.
Should ``include`` accept a list?
---------------------------------
This would enable more compact includes of multiple other Dependency Groups, at
the cost of a minor complication to the specification.
Footnotes
=========
Appendix A: Prior Art in Non-Python Languages
=============================================
TODO
Appendix B: Prior Art in Python
===============================
TODO
Appendix C: Use Cases
=====================
TODO
Copyright
=========
This document is placed in the public domain or under the
CC0-1.0-Universal license, whichever is more permissive.