PEP: 643
Title: Metadata for Package Source Distributions
Author: Paul Moore <p.f.moore@gmail.com>
BDFL-Delegate: Paul Ganssle <paul@ganssle.io>
Discussions-To: https://discuss.python.org/t/pep-643-metadata-for-package-source-distributions/5577
Status: Draft
Type: Standards Track
Content-Type: text/x-rst
Created: 24-Oct-2020
Post-History: 24-Oct-2020, 01-Nov-2020, 02-Nov-2020

Abstract
========
Python package metadata is stored in the distribution file in a standard
format, defined in the `Core Metadata Specification`_. However, for
source distributions, while the format of the data is defined, there has
traditionally been a lot of inconsistency in what data is recorded in
the sdist. See `here
<https://discuss.python.org/t/why-isnt-source-distribution-metadata-trustworthy-can-we-make-it-so/2620>`_
for a discussion of this issue.

As a result, metadata consumers are unable to rely on the data available
from source distributions, and need to use the (costly) :pep:`517` build
mechanisms to extract metadata.

This PEP defines a standard that allows build backends to reliably store
package metadata in the source distribution, while still retaining the
necessary flexibility to handle metadata fields that have to be calculated
at build time. It further defines a set of metadata values that must be
fixed when the sdist is created, ensuring that consumers have a minimum
"core" of metadata they can be sure is available.

Motivation
==========
There are a number of issues with the way that metadata is currently
stored in source distributions:

* The details of how to store metadata, while standardised, are not
  easy to find.
* The specification requires an old metadata version, and has not been
  updated in line with changes to the core metadata spec.
* There is no way in the spec to distinguish between "this field has been
  omitted because its value will not be known until build time" and "this
  field does not have a value".
* The core metadata specification allows most fields to be optional,
  meaning that the previous issue affects nearly every metadata field.

This PEP proposes an update to the metadata specification to allow
recording of fields which are expected to be "filled in later", and
updates the sdist specification to clarify that backends should record
sdist metadata using that version of the spec (or later). It restricts
which fields can be "filled in later", so that a core set of metadata is
available, and reliable, when read from a sdist.

Rationale
=========
:pep:`621` proposes a mechanism for users to specify metadata in
``pyproject.toml``. As part of that mechanism, a way was needed to say
that a particular field is defined dynamically by the backend. During
discussions on the PEP, it became clear that the same type of mechanism
would address the issue of distinguishing between "not known yet" and
"definitely has no value" in sdists. This PEP defines the ``Dynamic``
metadata field by analogy with the ``dynamic`` field in :pep:`621`.

Specification
=============
This PEP defines the relationship between metadata values specified in
a sdist, and the corresponding values in wheels built from that sdist.
It requires build backends to clearly mark any fields which will *not*
simply be copied unchanged from the sdist to the wheel.

In addition, this PEP makes the `PyPA Specifications`_ document the
canonical location for the specification of the sdist format (collecting
the information in :pep:`517` and in this PEP).

A new field, ``Dynamic``, will be added to the `Core Metadata Specification`_.
This field will be multiple use, and will be allowed to contain the name
of another core metadata field. The ``Dynamic`` metadata item is only
allowed in source distribution metadata.

If tools that read metadata encounter the ``Dynamic`` field anywhere except
in a sdist, they MAY fail with an error. If they do not fail, they MUST
ignore the field, and MAY report a warning.

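
As a rough illustration of how a consumer might read the ``Dynamic`` field,
the sketch below parses metadata text with the standard library's
``email.parser`` module. The function name ``dynamic_fields``, the
``is_sdist`` flag and the sample metadata are hypothetical and not part of
this specification. Lower-casing the field names is an assumption made only
to simplify comparison. Outside a sdist the sketch takes the "ignore"
option; failing with an error would be equally permitted.

```python
from email.parser import HeaderParser

# Hypothetical sdist metadata (PKG-INFO) used only for illustration.
SDIST_PKG_INFO = """\
Metadata-Version: 2.2
Name: example-project
Version: 1.0
Dynamic: Requires-Dist
Dynamic: Provides-Extra
"""


def dynamic_fields(metadata_text, is_sdist):
    """Return the set of field names marked as Dynamic, lower-cased.

    When the metadata does not come from a sdist, the Dynamic field is
    ignored and an empty set is returned.
    """
    message = HeaderParser().parsestr(metadata_text)
    # Dynamic is a multiple-use field, so collect every occurrence.
    names = message.get_all("Dynamic") or []
    if not is_sdist:
        return set()
    return {name.strip().lower() for name in names}


print(sorted(dynamic_fields(SDIST_PKG_INFO, is_sdist=True)))
```

A tool that prefers the stricter behaviour could raise an exception in the
``not is_sdist`` branch whenever ``names`` is non-empty.
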
If a field is marked as ``Dynamic``, there is no restriction placed on
its value in a wheel built from the sdist. A field which is marked as
``Dynamic`` MUST NOT have an explicit value in the sdist.

If a field is *not* marked as ``Dynamic``, then the value of the field
in any wheel built from the sdist MUST match the value in the sdist.
If the field is not in the sdist, and not marked as ``Dynamic``, then it
MUST NOT be present in the wheel.

Build backends MUST ensure that these rules are followed, and MUST
report an error if they are unable to do so.

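
The rules above could be checked along the following lines. This is a
minimal sketch, not a reference implementation: the function name
``check_wheel_against_sdist`` is hypothetical, only header fields are
compared (a description stored in the message body is not examined), and
treating field names as case-insensitive and the order of multiple-use
fields as insignificant are assumptions of the sketch rather than
requirements stated here.

```python
from email.parser import HeaderParser


def check_wheel_against_sdist(sdist_text, wheel_text):
    """Return a list of rule violations (empty if none were found)."""
    sdist = HeaderParser().parsestr(sdist_text)
    wheel = HeaderParser().parsestr(wheel_text)
    # Assumption: field names are compared case-insensitively.
    dynamic = {n.strip().lower() for n in sdist.get_all("Dynamic") or []}

    names = {k.lower() for k in sdist.keys()} | {k.lower() for k in wheel.keys()}
    names.discard("dynamic")  # Dynamic itself is only allowed in the sdist

    problems = []
    for name in sorted(names):
        # Assumption: the order of multiple-use fields is not significant.
        sdist_values = sorted(sdist.get_all(name) or [])
        wheel_values = sorted(wheel.get_all(name) or [])
        if name in dynamic:
            if sdist_values:
                problems.append(f"{name}: marked Dynamic but has a value in the sdist")
            continue  # no restriction is placed on the wheel value
        if sdist_values != wheel_values:
            # Covers both a changed value and a field that is present in
            # the wheel but absent from the sdist.
            problems.append(f"{name}: sdist has {sdist_values}, wheel has {wheel_values}")
    return problems
```

A backend following this approach would report an error whenever the
returned list is non-empty.
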
The following fields are the only ones allowed to be marked as ``Dynamic``:

* ``Platform``
* ``Supported-Platform``
* ``Requires-Dist``
* ``Requires-External``
* ``Provides-Extra``
* ``Provides-Dist``
* ``Obsoletes-Dist``

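
A tool could enforce this list with a simple membership test. The sketch
below is illustrative only: the names ``ALLOWED_DYNAMIC`` and
``invalid_dynamic_fields`` are hypothetical, and the case-insensitive
comparison is an assumption of the sketch.

```python
# The fields listed above, lower-cased for comparison (an assumption of
# this sketch, not a rule taken from the specification text).
ALLOWED_DYNAMIC = frozenset({
    "platform",
    "supported-platform",
    "requires-dist",
    "requires-external",
    "provides-extra",
    "provides-dist",
    "obsoletes-dist",
})


def invalid_dynamic_fields(dynamic_names):
    """Return the names that may not be marked as Dynamic, sorted."""
    return sorted(
        name for name in dynamic_names
        if name.strip().lower() not in ALLOWED_DYNAMIC
    )


print(invalid_dynamic_fields(["Requires-Dist", "Version"]))
```
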
As it adds a new metadata field, this PEP updates the core metadata
format to version 2.2.

Source distributions MUST use the latest version of the core metadata
specification (which will be version 2.2 or later).

Build backends SHOULD encourage projects to avoid using ``Dynamic``,
preferring to use environment markers on static values to adapt to
details of the install location.

Backwards Compatibility
=======================
As this proposal increments the core metadata version, it is compatible
with existing sdists, which will use an older metadata version. Tools
can determine whether a sdist conforms to this PEP by checking the
metadata version.
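
For example, a consumer might gate its handling of sdist metadata on the
``Metadata-Version`` field as sketched below. The function name
``conforms_to_this_pep`` is hypothetical, and treating a missing or
unparseable version as non-conforming is a design choice of the sketch,
not something this PEP mandates.

```python
from email.parser import HeaderParser


def conforms_to_this_pep(pkg_info_text):
    """Return True if the metadata declares version 2.2 or later."""
    message = HeaderParser().parsestr(pkg_info_text)
    raw = message.get("Metadata-Version")
    if raw is None:
        return False
    try:
        # Expect exactly "major.minor"; anything else is rejected.
        major, minor = (int(part) for part in raw.strip().split("."))
    except ValueError:
        return False
    return (major, minor) >= (2, 2)


print(conforms_to_this_pep("Metadata-Version: 2.1\nName: demo\nVersion: 1.0\n"))
```

Comparing the version as a tuple of integers avoids the pitfalls of string
comparison, where ``"2.10"`` would sort before ``"2.2"``.
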

Security Implications
=====================

As this specification is purely for the storage of data that is intended
to be publicly available, there are no security implications.

How to Teach This
=================
This is a data storage format for project metadata, and so will not
typically be visible to end users. There is therefore no need to teach
users how to use the format. Developers wanting to reference the
metadata will be able to find the details in the `PyPA Specifications`_.

Rejected Ideas
==============
1. Rather than marking fields as ``Dynamic``, fields should be assumed
   to be dynamic unless explicitly marked as ``Static``.

   This is logically equivalent to the current proposal, but it implies
   that fields being dynamic is the norm. Packaging tools can be much
   more efficient in the presence of metadata that is known to be static,
   so the PEP chooses to make dynamic fields the exception, and require
   backends to "opt in" to making a field dynamic.

2. Rather than having a ``Dynamic`` field, add a special value that
   indicates that a field is "not yet defined".

   Again, this is logically equivalent to the current proposal. It makes
   "being dynamic" an explicit choice, but requires a special value. As
   some fields can contain arbitrary text, choosing such a value is
   somewhat awkward (although likely not a problem in practice). There
   does not seem to be enough benefit to this approach to make it worth
   using instead of the proposed mechanism.

3. Allow ``Requires-Python`` to be ``Dynamic``, as it cannot include
   environment markers to tailor the requirement to the target environment.

   Currently, no projects on PyPI have a ``Requires-Python`` value that varies
   between different wheels for the same version, so there is no practical
   need for this flexibility at present. If a genuine use case is identified
   later, the specification can be changed to allow ``Requires-Python`` to be
   dynamic at that time.

4. Allow ``Dynamic`` to be used in wheels and/or installed distributions.

   There is no obvious value to allowing this, and it seems like it is simply
   adding complexity for no real reason. Allowing this could be done in a
   follow-up proposal if there turned out to be sufficient benefit.

5. Allow a field to be marked as ``Dynamic``, but *also* have a value in the
   sdist metadata.

   There appears to be no use case for allowing this. If a use case is
   identified in the future, the specification can be updated at that time.

Open Issues
===========
None

References
==========
.. _Core Metadata Specification: https://packaging.python.org/specifications/core-metadata/
.. _PyPA Specifications: https://packaging.python.org/specifications/

Copyright
=========
This document is placed in the public domain or under the
CC0-1.0-Universal license, whichever is more permissive.