PEP: 643 Title: Metadata for Package Source Distributions Author: Paul Moore BDFL-Delegate: Paul Ganssle Discussions-To: https://discuss.python.org/t/pep-643-metadata-for-package-source-distributions/5577 Status: Final Type: Standards Track Content-Type: text/x-rst Created: 24-Oct-2020 Post-History: 24-Oct-2020, 01-Nov-2020, 02-Nov-2020, 14-Nov-2020 Resolution: https://discuss.python.org/t/pep-643-metadata-for-package-source-distributions/5577/53 Abstract ======== Python package metadata is stored in the distribution file in a standard format, defined in the `Core Metadata Specification`_. However, for source distributions, while the format of the data is defined, there has traditionally been a lot of inconsistency in what data is recorded in the source distribution. See `here `_ for a discussion of this issue. As a result, metadata consumers are unable to rely on the data available from source distributions, and need to use the (costly) :pep:`517` build mechanisms to extract medatata. This PEP defines a standard that allows build backends to reliably store package metadata in the source distribution, while still retaining the necessary flexibility to handle metadata fields that have to be calculated at build time. Motivation ========== There are a number of issues with the way that metadata is currently stored in source distributions: * The details of how to store metadata, while standardised, are not easy to find. * The specification requires an old metadata version, and has not been updated in line with changes to the core metadata spec. * There is no way in the spec to distinguish between "this field has been omitted because its value will not be known until build time" and "this field does not have a value". * The core metadata specification allows most fields to be optional, meaning that the previous issue affects nearly every metadata field. This PEP proposes an update to the metadata specification to allow recording of fields which are expected to be "filled in later", and updates the source distribution specification to clarify that backends should record sdist metadata using that version of the spec (or later). Rationale ========= This PEP allows projects to define source distribution metadata values as being "dynamic". In this context, saying that a field is "dynamic" means that the value has not been fixed at the time that the source distribution was generated. Dynamic values will be supplied by the build backend at the time when the wheel is generated, and could depend on details of the build environment. :pep:`621` has a similar concept, of "dynamic" values that will be "filled in later", and so we choose to use the same term here by analogy. Specification ============= This PEP defines the relationship between metadata values specified in a source distribution, and the corresponding values in wheels built from it. It requires build backends to clearly mark any fields which will *not* simply be copied unchanged from the sdist to the wheel. In addition, this PEP makes the `PyPA Specifications`_ document the canonical location for the specification of the source distribution format (collecting the information in :pep:`517` and in this PEP). A new field, ``Dynamic``, will be added to the `Core Metadata Specification`_. This field will be multiple use, and will be allowed to contain the name of another core metadata field. When found in the metadata of a source distribution, the following rules apply: 1. If a field is *not* marked as ``Dynamic``, then the value of the field in any wheel built from the sdist MUST match the value in the sdist. If the field is not in the sdist, and not marked as ``Dynamic``, then it MUST NOT be present in the wheel. 2. If a field is marked as ``Dynamic``, it may contain any valid value in a wheel built from the sdist (including not being present at all). 3. Backends MUST NOT mark a field as ``Dynamic`` if they can determine that it was generated from data that will not change at build time. Backends MAY record the value they calculated for a field they mark as ``Dynamic`` in a source distribution. Consumers, however, MUST NOT treat this value as canonical, but MAY use it as an hint about what the final value in a wheel could be. In any context other than a source distribution, if a field is marked as ``Dynamic``, that indicates that the value was generated at wheel build time and may not match the value in the sdist (or in other builds of the project). Backends are not required to record this information, though, and consumers MUST NOT assume that the lack of a ``Dynamic`` marking has any significance, except in a source distribution. The fields ``Name`` and ``Version`` MUST NOT be marked as ``Dynamic``. As it adds a new metadata field, this PEP updates the core metadata format to version 2.2. Source distributions SHOULD use the latest version of the core metadata specification that was available when they were created. Build backends are strongly encouraged to only mark fields as ``Dynamic`` when absolutely necessary, and to encourage projects to avoid backend features that require the use of ``Dynamic``. Projects should prefer to use environment markers on static values to adapt to details of the install location. Backwards Compatibility ======================= As this proposal increments the core metadata version, it is compatible with existing source distributions, which will use an older metadata version. Tools can determine whether a source distribution conforms to this PEP by checking the metadata version. Security Implications ===================== As this specification is purely for the storage of data that is intended to be publicly available, there are no security implications. How to Teach This ================= This is a data storage format for project metadata, and so will not typically be visible to end users. There is therefore no need to teach users how to use the format. Developers wanting to reference the metadata will be able to find the details in the `PyPA Specifications`_. Rejected Ideas ============== 1. Rather than marking fields as ``Dynamic``, fields should be assumed to be dynamic unless explicitly marked as ``Static``. This is logically equivalent to the current proposal, but it implies that fields being dynamic is the norm. Packaging tools can be much more efficient in the presence of metadata that is known to be static, so the PEP chooses to make dynamic fields the exception, and require backends to "opt in" to making a field dynamic. In addition, if dynamic is the default, then in future, as more and more metadata becomes static, metadata files will include an increasing number of ``Static`` declarations. 2. Rather than having a ``Dynamic`` field, add a special value that indicates that a field is "not yet defined". Again, this is logically equivalent to the current proposal. It makes "being dynamic" an explicit choice, but requires a special value. As some fields can contain arbitrary text, choosing a such a value is somewhat awkward (although likely not a problem in practice). There does not seem to be enough benefit to this approach to make it worth using instead of the proposed mechanism. 3. Special handling of ``Requires-Python``. Early drafts of the PEP needed special discussion of ``Requires-Python``, because the lack of environment markers for this field meant that it might be difficult to require it to be static. The final form of the PEP no longer needs this, as the idea of a whitelist of fields allowed to be dynamic was dropped. 4. Restrict the use of ``Dynamic`` to a minimal "white list" of permitted fields. This approach was likely to prove extremely difficult for setuptools to implement in a backward compatible way, due to the dynamic nature of the setuptools interface. Instead, the proposal now allows most fields to be dynamic, but encourages backends to avoid dynamic values unless essential. Open Issues =========== None References ========== .. _Core Metadata Specification: https://packaging.python.org/specifications/core-metadata/ .. _PyPA Specifications: https://packaging.python.org/specifications/ Copyright ========= This document is placed in the public domain or under the CC0-1.0-Universal license, whichever is more permissive.