PEP 643: Metadata for Package Source Distributions (GH-1683)

This commit is contained in:
Paul Moore 2020-10-26 17:25:41 +00:00 committed by GitHub
parent 0265bd00d5
commit 5f8209da52
No known key found for this signature in database
GPG Key ID: 4AEE18F83AFDEB23
1 changed files with 210 additions and 0 deletions

210
pep-0643.rst Normal file
View File

@ -0,0 +1,210 @@
PEP: 643
Title: Metadata for Package Source Distributions
Author: Paul Moore <p.f.moore@gmail.com>
BDFL-Delegate: Paul Ganssle <paul@ganssle.io>
Discussions-To: Discourse, thread to be created.
Status: Draft
Type: Standards Track
Content-Type: text/x-rst
Created: 24-Oct-2020
Post-History: 24-Oct-2020
Abstract
========
Python package metadata is stored in the distribution file in a standard
format, defined in the `Core Metadata Specification`_. However, for
source distributions, while the format of the data is defined, there has
traditionally been a lot of inconsistency in what data is recorded in
the sdist. See `here
<https://discuss.python.org/t/why-isnt-source-distribution-metadata-trustworthy-can-we-make-it-so/2620>`_
for a discussion of this issue.
As a result, metadata consumers are unable to rely on the data available
from source distributions, and need to use the (costly) :pep:`517` build
mechanisms to extract medatata.
This PEP defines a standard that allows build backends to reliably store
package metadata in the source distribution, while still retaining the
necessary flexibility to handle metadata fields that have to be calculated
at build time. It further defines a set of metadata values that must be
fixed when the sdist is created, ensuring that consumers have a minimum
"core" of metadata they can be sure is available.
Motivation
==========
There are a number of issues with the way that metadata is currently
stored in source distributions:
* The details of how to store metadata, while standardised, are not
easy to find.
* The specification requires an old metadata version, and has not been
updated in line with changes to the core metadata spec.
* There is no way in the spec to distinguish between "this field has been
omitted because its value will not be known until build time" and "this
field does not have a value".
* The core metadata specification allows most fields to be optional,
meaning that the previous issue affects nearly every metadata field.
This PEP proposes an update to the metadata specification to allow
recording of fields which are expected to be "filled in later", and
updates the sdist specification to clarify that backends should record
sdist metadata using that version of the spec (or later). It restricts
which fields can be "filled in later", so that a core set of metadata is
available, and reliable, when read from a sdist.
Rationale
=========
:pep:`621` proposes a mechanism for users to specify metadata in
``pyproject.toml``. As part of that mechanism, a way was needed to say
that a particular field is defined dynamically by the backend. During
discussions on the PEP, it became clear that the same type of mechanism
would address the issue of distinguishing between "not known yet" and
"definitely has no value" in sdists. This PEP defines the ``Dynamic``
metadata field by analogy with the ``dynamic`` field in :pep:`621`.
Specification
=============
This PEP defines the relationship between metadata values specified in
a sdist, and the corresponding values in wheels built from that sdist.
It requires build backends to clearly mark any fields which will *not*
simply be copied unchanged from the sdist to the wheel.
A new field, ``Dynamic``, will be added to the `Core Metadata Specification`_.
This field will be multiple use, and will be allowed to contain the name
of another core metadata field. The ``Dynamic`` metadata item is only
allowed in source distribution metadata.
If a field is marked as ``Dynamic``, there is no restriction placed on
its value in a wheel built from the sdist. A field which is marked as
``Dynamic``, MUST NOT have an explicit value in the sdist.
If a field is *not* marked as ``Dynamic``, then the value of the field
in any wheel built from the sdist MUST match the value in the sdist.
If the field is not in the sdist, and not marked as ``Dynamic``, then it
MUST NOT be present in the wheel.
Build backends MUST ensure that these rules are followed, and MUST
report an error if they are unable to do so.
The following fields MAY NOT be marked as ``Dynamic``:
* ``Name``
* ``Version``
* ``Summary``
* ``Description``
* ``Requires-Python``
* ``License``
* ``Author``
* ``Author-email``
* ``Maintainer``
* ``Maintainer-email``
* ``Keywords``
* ``Classifier``
* ``Project-URL``
As it adds a new metadata field, this PEP updates the core metadata
format to version 2.2.
Source distributions MUST use the latest version of the core metadata
specification (which will be version 2.2 or later).
Backwards Compatibility
=======================
As this proposal increments the core metadata version, it is compatible
with existing sdists, which will use an older metadata version. Tools
can determine whether a sdist conforms to this PEP by checking the
metadata version.
Security Implications
=====================
As this specification is purely for the storage of data that is intended
to be publicly available, there are no security implications.
How to Teach This
=================
This is a data storage format for project metadata, and so will not
typically be visible to end users. There is therefore no need to teach
users how to use the format. Developers wanting to reference the
metadata will be able to find the details in the `PyPA Specifications`_.
Rejected Ideas
==============
1. Rather than marking fields as ``Dynamic``, fields should be assumed
to be dynamic unless explicitly marked as ``Static``.
This is logically equivalent to the current proposal, but it implies
that fields being dynamic is the norm. Packaging tools can be much
more efficient in the presence of metadata that is known to be static,
so the PEP chooses to make dynamic fields the exception, and require
backends to "opt in" to making a field dynamic.
2. Rather than having a ``Dynamic`` field, add a special value that
indicates that a field is "not yet defined".
Again, this is logically equivalent to the current proposal. It makes
"being dynamic" an explicit choice, but requires a special value. As
some fields can contain arbitrary text, choosing a such a value is
somewhat awkward (although likely not a problem in practice). There
does not seem to be enough benefit to this approach to make it worth
using instead of the proposed mechanism.
Open Issues
===========
1. Should we allow ``Dynamic`` to be used in wheels and/or installed
distributions?
``Dynamic`` has no obvious meaning in either of these situations, and
the PEP therefore disallows it. However, backends may find it useful
to simply copy the field across, and it may have some usefulness in
recording "other wheels built from the source this came from may have
different values". However, the value seems marginal, and the added
complexity involved in explaining the feature does not seem worth it.
Allowing this could be done in a follow-up proposal if there turned
out to be sufficient benefit.
2. If a field is marked as ``Dynamic``, but has a value in the sdist
metadata, how should that be interpreted?
The simplest answer is to just not allow dynamic fields to have a
value in the sdist at all. For now, this is what the PEP proposes.
But is there benefit in having a value which tools can take as a
"hint" for what the value in the wheel will be?
3. Should this PEP change the canonical source for the sdist
specification to the `PyPA Specifications`_ document?
It would be beneficial to collect all of the details of the sdist
format in one place. However, distribution formats are not currently
collected there, and making the move would extend the impact of this
PEP significantly.
References
==========
.. _Core Metadata Specification: https://packaging.python.org/specifications/core-metadata/
.. _PyPA Specifications: https://packaging.python.org/specifications/
Copyright
=========
This document is placed in the public domain or under the
CC0-1.0-Universal license, whichever is more permissive.