PEP: 643 Title: Metadata for Package Source Distributions Author: Paul Moore BDFL-Delegate: Paul Ganssle Discussions-To: https://discuss.python.org/t/pep-643-metadata-for-package-source-distributions/5577 Status: Draft Type: Standards Track Content-Type: text/x-rst Created: 24-Oct-2020 Post-History: 24-Oct-2020 Abstract ======== Python package metadata is stored in the distribution file in a standard format, defined in the `Core Metadata Specification`_. However, for source distributions, while the format of the data is defined, there has traditionally been a lot of inconsistency in what data is recorded in the sdist. See `here `_ for a discussion of this issue. As a result, metadata consumers are unable to rely on the data available from source distributions, and need to use the (costly) :pep:`517` build mechanisms to extract medatata. This PEP defines a standard that allows build backends to reliably store package metadata in the source distribution, while still retaining the necessary flexibility to handle metadata fields that have to be calculated at build time. It further defines a set of metadata values that must be fixed when the sdist is created, ensuring that consumers have a minimum "core" of metadata they can be sure is available. Motivation ========== There are a number of issues with the way that metadata is currently stored in source distributions: * The details of how to store metadata, while standardised, are not easy to find. * The specification requires an old metadata version, and has not been updated in line with changes to the core metadata spec. * There is no way in the spec to distinguish between "this field has been omitted because its value will not be known until build time" and "this field does not have a value". * The core metadata specification allows most fields to be optional, meaning that the previous issue affects nearly every metadata field. This PEP proposes an update to the metadata specification to allow recording of fields which are expected to be "filled in later", and updates the sdist specification to clarify that backends should record sdist metadata using that version of the spec (or later). It restricts which fields can be "filled in later", so that a core set of metadata is available, and reliable, when read from a sdist. Rationale ========= :pep:`621` proposes a mechanism for users to specify metadata in ``pyproject.toml``. As part of that mechanism, a way was needed to say that a particular field is defined dynamically by the backend. During discussions on the PEP, it became clear that the same type of mechanism would address the issue of distinguishing between "not known yet" and "definitely has no value" in sdists. This PEP defines the ``Dynamic`` metadata field by analogy with the ``dynamic`` field in :pep:`621`. Specification ============= This PEP defines the relationship between metadata values specified in a sdist, and the corresponding values in wheels built from that sdist. It requires build backends to clearly mark any fields which will *not* simply be copied unchanged from the sdist to the wheel. A new field, ``Dynamic``, will be added to the `Core Metadata Specification`_. This field will be multiple use, and will be allowed to contain the name of another core metadata field. The ``Dynamic`` metadata item is only allowed in source distribution metadata. If a field is marked as ``Dynamic``, there is no restriction placed on its value in a wheel built from the sdist. A field which is marked as ``Dynamic``, MUST NOT have an explicit value in the sdist. If a field is *not* marked as ``Dynamic``, then the value of the field in any wheel built from the sdist MUST match the value in the sdist. If the field is not in the sdist, and not marked as ``Dynamic``, then it MUST NOT be present in the wheel. Build backends MUST ensure that these rules are followed, and MUST report an error if they are unable to do so. The following fields MAY NOT be marked as ``Dynamic``: * ``Name`` * ``Version`` * ``Summary`` * ``Description`` * ``Requires-Python`` * ``License`` * ``Author`` * ``Author-email`` * ``Maintainer`` * ``Maintainer-email`` * ``Keywords`` * ``Classifier`` * ``Project-URL`` As it adds a new metadata field, this PEP updates the core metadata format to version 2.2. Source distributions MUST use the latest version of the core metadata specification (which will be version 2.2 or later). The ``Requires-Python`` field for a project may vary by target platform, but is not allowed to be declared as ``Dynamic`` in the sdist metadata. To handle this situation, build backends MUST use environment markers on the ``Requires-Python`` field to allow that metadata to remain common across the sdist and all wheel archives, rather than generating platform dependent ``Requires-Python`` metadata as part of the wheel build process. Build backends SHOULD also use this approach for other metadata fields that may vary by target platform (e.g. dependency declarations). Backwards Compatibility ======================= As this proposal increments the core metadata version, it is compatible with existing sdists, which will use an older metadata version. Tools can determine whether a sdist conforms to this PEP by checking the metadata version. Security Implications ===================== As this specification is purely for the storage of data that is intended to be publicly available, there are no security implications. How to Teach This ================= This is a data storage format for project metadata, and so will not typically be visible to end users. There is therefore no need to teach users how to use the format. Developers wanting to reference the metadata will be able to find the details in the `PyPA Specifications`_. Rejected Ideas ============== 1. Rather than marking fields as ``Dynamic``, fields should be assumed to be dynamic unless explicitly marked as ``Static``. This is logically equivalent to the current proposal, but it implies that fields being dynamic is the norm. Packaging tools can be much more efficient in the presence of metadata that is known to be static, so the PEP chooses to make dynamic fields the exception, and require backends to "opt in" to making a field dynamic. 2. Rather than having a ``Dynamic`` field, add a special value that indicates that a field is "not yet defined". Again, this is logically equivalent to the current proposal. It makes "being dynamic" an explicit choice, but requires a special value. As some fields can contain arbitrary text, choosing a such a value is somewhat awkward (although likely not a problem in practice). There does not seem to be enough benefit to this approach to make it worth using instead of the proposed mechanism. Open Issues =========== 1. Should we allow ``Dynamic`` to be used in wheels and/or installed distributions? ``Dynamic`` has no obvious meaning in either of these situations, and the PEP therefore disallows it. However, backends may find it useful to simply copy the field across, and it may have some usefulness in recording "other wheels built from the source this came from may have different values". However, the value seems marginal, and the added complexity involved in explaining the feature does not seem worth it. Allowing this could be done in a follow-up proposal if there turned out to be sufficient benefit. 2. If a field is marked as ``Dynamic``, but has a value in the sdist metadata, how should that be interpreted? The simplest answer is to just not allow dynamic fields to have a value in the sdist at all. For now, this is what the PEP proposes. But is there benefit in having a value which tools can take as a "hint" for what the value in the wheel will be? 3. Should this PEP change the canonical source for the sdist specification to the `PyPA Specifications`_ document? It would be beneficial to collect all of the details of the sdist format in one place. However, distribution formats are not currently collected there, and making the move would extend the impact of this PEP significantly. References ========== .. _Core Metadata Specification: https://packaging.python.org/specifications/core-metadata/ .. _PyPA Specifications: https://packaging.python.org/specifications/ Copyright ========= This document is placed in the public domain or under the CC0-1.0-Universal license, whichever is more permissive.