From 3e4a83d1303ed807eb9d8ab96ecbad8ec1d3d422 Mon Sep 17 00:00:00 2001
From: Ralf Gommers
Date: Wed, 6 Dec 2023 21:48:29 +0100
Subject: [PATCH] PEP 725: version 2, addressing review comments on Discourse to date (#3546)

---
 peps/pep-0725.rst | 147 ++++++++++++++++++++++++++++++++++++++++------
 1 file changed, 130 insertions(+), 17 deletions(-)

diff --git a/peps/pep-0725.rst b/peps/pep-0725.rst
index 172afbd86..13fc866d6 100644
--- a/peps/pep-0725.rst
+++ b/peps/pep-0725.rst
@@ -18,6 +18,18 @@ This PEP specifies how to write a project's external, or non-PyPI, build and
 runtime dependencies in a ``pyproject.toml`` file for packaging-related tools
 to consume.

+This PEP proposes to add an ``[external]`` table to ``pyproject.toml`` with
+three keys: "build-requires", "host-requires" and "dependencies". These
+are for specifying three types of dependencies:
+
+1. ``build-requires``, build tools to run on the build machine
+2. ``host-requires``, build dependencies needed for the host machine but also needed at build time.
+3. ``dependencies``, needed at runtime on the host machine but not needed at build time.
+
+Cross compilation is taken into account by distinguishing build and host dependencies.
+Optional build-time and runtime dependencies are supported too, in a manner analogous
+to how that is supported in the ``[project]`` table.
+

 Motivation
 ==========
@@ -36,13 +48,13 @@ this PEP are to:
    information.

 Packaging ecosystems like Linux distros, Conda, Homebrew, Spack, and Nix need
-full sets of dependencies for Python packages, and have tools like pyp2rpm_
+full sets of dependencies for Python packages, and have tools like pyp2spec_
 (Fedora), Grayskull_ (Conda), and dh_python_ (Debian) which attempt to
-automatically generate dependency metadata from the metadata in
+automatically generate dependency metadata for their own package managers from the metadata in
 upstream Python packages. External dependencies are currently handled manually,
 because there is no metadata for this in ``pyproject.toml`` or any other
 standard location. Enabling automating this conversion is a key benefit of
-this PEP, making packaging Python easier and more reliable. In addition, the
+this PEP, making packaging Python packages for distros easier and more reliable. In addition, the
 authors envision other types of tools making use of this information, e.g.,
 dependency analysis tools like Repology_, Dependabot_ and libraries.io_.
 Software bill of materials (SBOM) generation tools may also be able to use this
@@ -100,7 +112,7 @@ Cross compilation
 Cross compilation is not yet (as of August 2023) well-supported by stdlib
 modules and ``pyproject.toml`` metadata. It is however important when
 translating external dependencies to those of other packaging systems (with
-tools like ``pyp2rpm``). Introducing support for cross compilation immediately
+tools like ``pyp2spec``). Introducing support for cross compilation immediately
 in this PEP is much easier than extending ``[external]`` in the future, hence
 the authors choose to include this now.

@@ -204,9 +216,9 @@ Virtual package specification

 There is no ready-made support for virtual packages in PURL or another
 standard. There are a relatively limited number of such dependencies though,
-and adoption a scheme similar to PURL but with the ``virtual:`` rather than
+and adopting a scheme similar to PURL but with the ``virtual:`` rather than
 ``pkg:`` scheme seems like it will be understandable and map well to Linux
-distros with virtual packages and the likes of Conda and Spack.
+distros with virtual packages and to the likes of Conda and Spack.

 The two known virtual package types are ``compiler`` and ``interface``.

@@ -262,6 +274,39 @@ allow a version of a dependency for a wheel that isn't allowed for an sdist, nor
 contain new dependencies that are not listed in the sdist's metadata at all.


+Canonical names of dependencies and ``-dev(el)`` split packages
+'''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''
+
+It is fairly common for distros to split a package into two or more packages.
+In particular, runtime components are often separately installable from
+development components (headers, pkg-config and CMake files, etc.). The latter
+then typically has a name with ``-dev`` or ``-devel`` appended to the
+project/library name. This split is the responsibility of each distro to
+maintain, and should not be reflected in the ``[external]`` table. It is not
+possible to specify this in a reasonable way that works across distros, hence
+only the canonical name should be used in ``[external]``.
+
+The intended meaning of using a PURL or virtual dependency is "the full package
+with the name specified". It will depend on the context in which the metadata
+is used whether the split is relevant. For example, if ``libffi`` is a host
+dependency and a tool wants to prepare an environment for building a wheel,
+then if a distro has split off the headers for ``libffi`` into a
+``libffi-devel`` package then the tool has to install both ``libffi`` and
+``libffi-devel``.
+
+Python development headers
+''''''''''''''''''''''''''
+
+Python headers and other build support files may also be split. This is the
+same situation as in the section above (because Python is simply a regular
+package in distros). *However*, a ``python-dev|devel`` dependency is special because
+in ``pyproject.toml`` Python itself is an implicit rather than an explicit
+dependency. Hence a choice needs to be made here - add ``python-dev`` implicitly,
+or make each package author add it explicitly under ``[external]``. For
+consistency between Python dependencies and external dependencies, we choose to
+add it implicitly. Python development headers must be assumed to be necessary
+when an ``[external]`` table contains one or more compiler packages.
+
 Specification
 =============

@@ -324,7 +369,7 @@ strings of the arrays MUST be valid PURL_ strings.
   with values of arrays of PURL_ strings (``optional-dependencies``)
 - `Core metadata`_: ``Requires-External``, N/A

-The (optional) dependencies of the project.
+The (optional) runtime dependencies of the project.

 For ``dependencies``, it is a key whose value is an array of strings. Each
 string represents a dependency of the project and MUST be formatted as either a
@@ -347,10 +392,13 @@ cryptography 39.0:

     [external]
     build-requires = [
+      "virtual:compiler/c",
       "virtual:compiler/rust",
+      "pkg:generic/pkg-config",
     ]
     host-requires = [
       "pkg:generic/openssl",
+      "pkg:generic/libffi",
     ]

 SciPy 1.10:
@@ -363,19 +411,14 @@ SciPy 1.10:
       "virtual:compiler/cpp",
       "virtual:compiler/fortran",
       "pkg:generic/ninja",
+      "pkg:generic/pkg-config",
     ]
     host-requires = [
       "virtual:interface/blas",
       "virtual:interface/lapack",  # >=3.7.1 (can't express version ranges with PURL yet)
     ]

-    [external.optional-host-requires]
-    dependency_detection = [
-      "pkg:generic/pkg-config",
-      "pkg:generic/cmake",
-    ]
-
-pygraphviz 1.10:
+Pillow 10.1.0:

 .. code:: toml

@@ -384,9 +427,24 @@ pygraphviz 1.10:
       "virtual:compiler/c",
     ]
     host-requires = [
-      "pkg:generic/graphviz",
+      "pkg:generic/libjpeg",
+      "pkg:generic/zlib",
     ]

+    [external.optional-host-requires]
+    extra = [
+      "pkg:generic/lcms2",
+      "pkg:generic/freetype",
+      "pkg:generic/libimagequant",
+      "pkg:generic/libraqm",
+      "pkg:generic/libtiff",
+      "pkg:generic/libxcb",
+      "pkg:generic/libwebp",
+      "pkg:generic/openjpeg",  # add >=2.0 once we have version specifiers
+      "pkg:generic/tk",
+    ]
+
+
 NAVis 1.4.0:

 .. code:: toml
@@ -480,7 +538,22 @@ information about that in its documentation, as will tools like ``auditwheel``.
 Reference Implementation
 ========================

-There is no reference implementation at this time.
+This PEP contains a metadata specification, rather than a code feature - hence
+there will not be code implementing the metadata spec as a whole. However,
+there are parts that do have a reference implementation:
+
+1. The ``[external]`` table has to be valid TOML and therefore can be loaded
+   with ``tomllib``.
+2. The PURL specification, as a key part of this spec, has a Python package
+   with a reference implementation for constructing and parsing PURLs:
+   `packageurl-python`_.
+
+There are multiple possible consumers and use cases of this metadata, once
+that metadata gets added to Python packages. Tested metadata for all of the
+top 150 most-downloaded packages from PyPI with published platform-specific
+wheels can be found in `rgommers/external-deps-build`_. This metadata has
+been validated by using it to build wheels from sdists patched with that
+metadata in clean Docker containers.


 Rejected Ideas
@@ -516,6 +589,43 @@ Support in PURL for version expressions and ranges is still pending. The pull
 request at `vers implementation for PURL`_ seems close to being merged, at
 which point this PEP could adopt it.

+Versioning of virtual dependencies
+----------------------------------
+
+Once PURL supports version expressions, virtual dependencies can be versioned
+with the same syntax. It must be better specified however what the version
+scheme is, because this is not as clear for virtual dependencies as it is for
+PURLs (e.g., there can be multiple implementations, and abstract interfaces may
+not be unambiguously versioned). E.g.:
+
+- OpenMP: has regular ``MAJOR.MINOR`` versions of its standard, so would look
+  like ``>=4.5``.
+- BLAS/LAPACK: should use the versioning used by `Reference LAPACK`_, which
+  defines what the standard APIs are. Uses ``MAJOR.MINOR.MICRO``, so would look
+  like ``>=3.10.0``.
+- Compilers: these implement language standards. For C, C++ and Fortran these
+  are versioned by year. In order for versions to sort correctly, we choose to
+  use the full year (four digits). So "at least C99" would be ``>=1999``, and
+  selecting C++14 or Fortran 77 would be ``==2014`` or ``==1977`` respectively.
+  Other languages may use different versioning schemes. These should be
+  described somewhere before they are used in ``pyproject.toml``.
+
+A logistical challenge is where to describe the versioning - given that this
+will evolve over time, this PEP itself is not the right location for it.
+Instead, this PEP should point at that (to be created) location.
+
+Who defines canonical names and canonical package structure?
+------------------------------------------------------------
+
+Similar to the logistics around versioning is the question of which names are
+allowed and where they are described, and of who controls that description
+and is responsible for maintaining it. Our tentative answer is: there
+should be a central list for virtual dependencies and ``pkg:generic`` PURLs,
+maintained as a PyPA project. See
+https://discuss.python.org/t/pep-725-specifying-external-dependencies-in-pyproject-toml/31888/62.
+TODO: once that list/project is prototyped, include it in the PEP and close
+this open issue.
+
 Syntax for virtual dependencies
 -------------------------------

@@ -572,9 +682,10 @@ CC0-1.0-Universal license, whichever is more permissive.
 .. _setuptools metadata: https://setuptools.readthedocs.io/en/latest/setuptools.html#metadata
 .. _SPDX: https://spdx.dev/
 .. _PURL: https://github.com/package-url/purl-spec/
+.. _packageurl-python: https://pypi.org/project/packageurl-python/
 .. _vers: https://github.com/package-url/purl-spec/blob/version-range-spec/VERSION-RANGE-SPEC.rst
 .. _vers implementation for PURL: https://github.com/package-url/purl-spec/pull/139
-.. _pyp2rpm: https://github.com/fedora-python/pyp2rpm
+.. _pyp2spec: https://github.com/befeleme/pyp2spec
 .. _Grayskull: https://github.com/conda/grayskull
 .. _dh_python: https://www.debian.org/doc/packaging-manuals/python-policy/index.html#dh-python
 .. _Repology: https://repology.org/
@@ -585,3 +696,5 @@ CC0-1.0-Universal license, whichever is more permissive.
 .. _auditwheel: https://github.com/pypa/auditwheel
 .. _delocate: https://github.com/matthew-brett/delocate
 .. _delvewheel: https://github.com/adang1345/delvewheel
+.. _rgommers/external-deps-build: https://github.com/rgommers/external-deps-build
+.. _Reference LAPACK: https://github.com/Reference-LAPACK/lapack