PEP: 426
Title: Metadata for Python Software Packages 2.0
Author: Alyssa Coghlan <ncoghlan@gmail.com>,
Daniel Holth <dholth@gmail.com>,
Donald Stufft <donald@stufft.io>
BDFL-Delegate: Donald Stufft <donald@stufft.io>
Discussions-To: distutils-sig@python.org
Status: Withdrawn
Type: Informational
Topic: Packaging
Requires: 440, 508, 518
Created: 30-Aug-2012
Post-History: 14-Nov-2012, 05-Feb-2013, 07-Feb-2013, 09-Feb-2013,
27-May-2013, 20-Jun-2013, 23-Jun-2013, 14-Jul-2013,
21-Dec-2013
Replaces: 345
.. withdrawn::
The ground-up metadata redesign proposed in this PEP has been withdrawn in
favour of the more modest proposal in :pep:`566`, which retains the basic
Key:Value format of previous metadata versions, but also defines a standardised
mechanism for translating that format to nested JSON-compatible data structures.
Some of the ideas in this PEP (or the related :pep:`459`) may still be considered
as part of later proposals, but they will be handled in a more incremental
fashion, rather than as a single large proposed change with no feasible
migration plan.
Abstract
========
This PEP describes a mechanism for publishing and exchanging metadata
related to Python distributions. It includes specifics of the field names,
and their semantics and usage.
This document specifies the never-released version 2.0 of the metadata format.
Version 1.0 is specified in :pep:`241`.
Version 1.1 is specified in :pep:`314`.
Version 1.2 is specified in :pep:`345`.
Version 2.0 of the metadata format proposed migrating from directly defining a
custom key-value file format to instead defining a JSON-compatible in-memory
representation that may be used to define metadata representation in other
contexts (such as API and archive format definitions).
This version also defines a formal extension mechanism, allowing new
fields to be added for particular purposes without requiring updates to
the core metadata format.
Note on PEP History
===================
This PEP was initially deferred for an extended period, from December 2013
through to March 2017, as distutils-sig worked through a number of other
changes. These changes included:
* defining a binary compatibility tagging format in :pep:`425`
* defining a binary archive format (``wheel``) in :pep:`427`
* explicitly defining versioning and version comparison in :pep:`440`
* explicitly defining the PyPI "simple" API in :pep:`503`
* explicitly defining dependency specifiers and the extras system in :pep:`508`
* declaring static build system dependencies (``pyproject.toml``) in :pep:`518`
* migrating PyPI hosting to Rackspace, and placing it behind the Fastly CDN
* shipping ``pip`` with CPython by default in :pep:`453`, and backporting that
addition to Python 2.7 in :pep:`477`
* establishing `packaging.python.org`_ as the common access point for Python
packaging ecosystem documentation
* migrating to using the `specifications`_ section of packaging.python.org
as the central location for tracking packaging related PEPs
The time spent pursuing these changes provided additional perspective on which
metadata format changes were genuinely desirable, and which could be omitted
from the revised specification as merely being "change for change's sake".
It also allowed a number of features that aren't critical to the core activity
of publishing and distributing software to be moved out to :pep:`459`, a separate
proposal for a number of standard metadata extensions that provide additional
optional information about a release.
As of September 2017, it was deferred again, on the grounds that
it doesn't actually help solve any particularly pressing problems:
- JSON representation would be better handled through defining a
transformation of the existing metadata 1.2 fields
- clarification of the additional fields defined in the past few
years and related changes to the spec management process would
be better covered in a `minor spec version update`_
.. _packaging.python.org: https://packaging.python.org/
.. _specifications: https://packaging.python.org/specifications/
.. _minor spec version update: https://mail.python.org/pipermail/distutils-sig/2017-September/031465.html
Finally, the PEP was withdrawn in February 2018 in favour of :pep:`566` (which
pursues that more incremental strategy).
Purpose
=======
The purpose of this PEP is to define a common metadata interchange format
for communication between software publication tools and software integration
tools in the Python ecosystem. One key aim is to support full dependency
analysis in that ecosystem without requiring the execution of arbitrary
Python code by those doing the analysis. Another aim is to encourage good
software distribution practices by default, while continuing to support the
current practices of almost all existing users of the Python Package Index
(both publishers and integrators). A final aim is to support an upgrade
path from the metadata formats currently in use that is transparent to end
users.
The design draws on the Python community's nearly 20 years of experience with
distutils based software distribution, and incorporates ideas and concepts
from other distribution systems, including Python's setuptools, pip and
other projects, Ruby's gems, Perl's CPAN, Node.js's npm, PHP's composer
and Linux packaging systems such as RPM and APT.
While the specifics of this format are aimed at the Python ecosystem, some
of the ideas may also be useful in the future evolution of other dependency
management ecosystems.
Development, Distribution and Deployment of Python Software
===========================================================
The metadata design in this PEP is based on a particular conceptual model
of the software development and distribution process. This model consists of
the following phases:
* Software development: this phase involves working with a source checkout
for a particular application to add features and fix bugs. It is
expected that developers in this phase will need to be able to build the
software, run the software's automated test suite, run project specific
utility scripts and publish the software.
* Software publication: this phase involves taking the developed software
and making it available for use by software integrators. This includes
creating the descriptive metadata defined in this PEP, as well as making
the software available (typically by uploading it to an index server).
* Software integration: this phase involves taking published software
components and combining them into a coherent, integrated system. This
may be done directly using Python specific cross-platform tools, or it may
be handled through conversion to development language neutral platform
specific packaging systems.
* Software deployment: this phase involves taking integrated software
components and deploying them on to the target system where the software
will actually execute.
The publication and integration phases are collectively referred to as
the distribution phase, and the individual software components distributed
in that phase are formally referred to as "distribution packages", but are more
colloquially known as just "packages" (relying on context to disambiguate them
from the "module with submodules" kind of Python package).
The exact details of these phases will vary greatly for particular use cases.
Deploying a web application to a public Platform-as-a-Service provider,
publishing a new release of a web framework or scientific library,
creating an integrated Linux distribution, or upgrading a custom application
running in a secure enclave are all situations this metadata design should
be able to handle.
The complexity of the metadata described in this PEP thus arises directly
from the actual complexities associated with software development,
distribution and deployment in a wide range of scenarios.
Supporting definitions
----------------------
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
document are to be interpreted as described in :rfc:`2119`.
"Projects" are software components that are made available for integration.
Projects include Python libraries, frameworks, scripts, plugins,
applications, collections of data or other resources, and various
combinations thereof. Public Python projects are typically registered on
the `Python Package Index`_.
"Releases" are uniquely identified snapshots of a project.
"Distribution packages" are the packaged files which are used to publish
and distribute a release.
Depending on context, "package" may refer to either a distribution, or
to an importable Python module that has a ``__path__`` attribute and hence
may also have importable submodules.
"Source archive" and "VCS checkout" both refer to the raw source code for
a release, prior to creation of an sdist or binary archive.
An "sdist" is a publication format providing the distribution metadata and
any source files that are essential to creating a binary archive for
the distribution. Creating a binary archive from an sdist requires that
the appropriate build tools be available on the system.
"Binary archives" only require that prebuilt files be moved to the correct
location on the target system. As Python is a dynamically bound
cross-platform language, many so-called "binary" archives will contain only
pure Python source code.
"Contributors" are individuals and organizations that work together to
develop a software component.
"Publishers" are individuals and organizations that make software components
available for integration (typically by uploading distributions to an
index server).
"Integrators" are individuals and organizations that incorporate published
distributions as components of an application or larger system.
"Build tools" are automated tools intended to run on development systems,
producing source and binary distribution archives. Build tools may also be
invoked by integration tools in order to build software distributed as
sdists rather than prebuilt binary archives.
"Index servers" are active distribution registries which publish version and
dependency metadata and place constraints on the permitted metadata.
"Public index servers" are index servers which allow distribution uploads
from untrusted third parties. The `Python Package Index`_ is a public index
server.
"Publication tools" are automated tools intended to run on development
systems and upload source and binary distribution archives to index servers.
"Integration tools" are automated tools that consume the metadata and
distribution archives published by an index server or other designated
source, and make use of them in some fashion, such as installing them or
converting them to a platform specific packaging format.
"Installation tools" are integration tools specifically intended to run on
deployment targets, consuming source and binary distribution archives from
an index server or other designated location and deploying them to the target
system.
"Automated tools" is a collective term covering build tools, index servers,
publication tools, integration tools and any other software that produces
or consumes distribution version and dependency metadata.
"Legacy metadata" refers to earlier versions of this metadata specification,
along with the supporting metadata file formats defined by the
``setuptools`` project.
"Distro" is used as the preferred term for Linux distributions, to help
avoid confusion with the Python-specific use of the term "distribution
package".
"Qualified name" is a dotted Python identifier. For imported modules and
packages, the qualified name is available as the ``__name__`` attribute,
while for functions and classes it is available as the ``__qualname__``
attribute.
A "fully qualified name" uniquely locates an object in the Python module
namespace. For imported modules and packages, it is the same as the
qualified name. For other Python objects, the fully qualified name consists
of the qualified name of the containing module or package, a colon (``:``)
and the qualified name of the object relative to the containing module or
package.
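As a minimal sketch of the ``module:qualname`` notation described above, the
following helper resolves a fully qualified name to the object it locates.
The function name ``resolve_fully_qualified_name`` is hypothetical and not
part of this PEP. The sketch assumes the name comes from a trusted source,
since importing a module executes its code.

```python
import importlib
from functools import reduce


def resolve_fully_qualified_name(name):
    """Return the object located by ``module`` or ``module:qualname``."""
    module_name, colon, qualname = name.partition(":")
    module = importlib.import_module(module_name)
    if not colon:
        # Modules and packages: the fully qualified name is the qualified name.
        return module
    # Other objects: walk the dotted qualified name relative to the module.
    return reduce(getattr, qualname.split("."), module)
```

For instance, ``resolve_fully_qualified_name("json:JSONDecoder.decode")``
imports ``json`` and then looks up ``JSONDecoder`` and ``decode`` in turn.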
A "prefixed name" starts with a qualified name, but is not necessarily a
qualified name - it may contain additional dot separated segments which are
not valid identifiers.
Integration and deployment of distributions
-------------------------------------------
The primary purpose of the distribution metadata is to support integration
and deployment of distributions as part of larger applications and systems.
Integration and deployment can in turn be broken down into further substeps.
* Build: the build step is the process of turning a VCS checkout, source
archive or sdist into a binary archive. Dependencies must be available
in order to build and create a binary archive of the distribution
(including any documentation that is installed on target systems).
* Installation: the installation step involves getting the distribution
and all of its runtime dependencies onto the target system. In this
step, the distribution may already be on the system (when upgrading or
reinstalling) or else it may be a completely new installation.
* Runtime: this is normal usage of a distribution after it has been
installed on the target system.
These three steps may all occur directly on the target system. Alternatively
the build step may be separated out by using binary archives provided by the
publisher of the distribution, or by creating the binary archives on a
separate system prior to deployment. The advantage of the latter approach
is that it minimizes the dependencies that need to be installed on
deployment targets (as the build dependencies will be needed only on the
build systems).
The published metadata for distribution packages SHOULD allow integrators, with
the aid of build and integration tools, to:
* obtain the original source code that was used to create a distribution
* identify and retrieve the dependencies (if any) required to use a
distribution
* identify and retrieve the dependencies (if any) required to build a
distribution from source
* identify and retrieve the dependencies (if any) required to run a
distribution's test suite
Development and publication of distributions
--------------------------------------------
The secondary purpose of the distribution metadata is to support effective
collaboration amongst software contributors and publishers during the
development phase.
The published metadata for distributions SHOULD allow contributors
and publishers, with the aid of build and publication tools, to:
* perform all the same activities needed to effectively integrate and
deploy the distribution
* identify and retrieve the additional dependencies needed to develop and
publish the distribution
* specify the dependencies (if any) required to use the distribution
* specify the dependencies (if any) required to build the distribution
from source
* specify the dependencies (if any) required to run the distribution's
test suite
* specify the additional dependencies (if any) required to develop and
publish the distribution
Metadata format
===============
The format defined in this PEP is an in-memory representation of Python
distribution metadata as a string-keyed dictionary. Permitted values for
individual entries are strings, lists of strings, and additional
nested string-keyed dictionaries.
Except where otherwise noted, dictionary keys in distribution metadata MUST
be valid Python identifiers in order to support attribute based metadata
access APIs.
The individual field descriptions show examples of the key name and value
as they would be serialised as part of a JSON mapping.
Unless otherwise indicated, the fields identified as core metadata are required.
Automated tools MUST NOT accept distributions with missing core metadata as
valid Python distributions.
All other fields are optional. Automated tools MUST operate correctly
if a distribution does not provide them, except for those operations
which specifically require the omitted fields.
Automated tools MUST NOT insert dummy data for missing fields. If a valid
value is not provided for a required field then the metadata and the
associated distribution MUST be rejected as invalid. If a valid value
is not provided for an optional field, that field MUST be omitted entirely.
Automated tools MAY automatically derive valid values from other
information sources (such as a version control system).
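The rules above on required and optional fields might be checked along the
following lines. This is a sketch only: the helper name ``check_fields`` is
hypothetical, the set of required fields is assumed to be the core metadata
fields other than ``generator``, and treating empty values as "no valid
value" is an interpretation rather than something this PEP states.

```python
# Assumption: the required fields are the core metadata fields, excluding
# "generator" (which a manually produced file would omit).
REQUIRED_FIELDS = ("metadata_version", "name", "version", "summary")

# Assumption: these placeholder values count as "no valid value provided".
EMPTY_VALUES = ("", None, [], {})


def check_fields(metadata):
    """Return a list of problems; an empty list means the checks passed."""
    problems = []
    for field in REQUIRED_FIELDS:
        if metadata.get(field) in EMPTY_VALUES:
            problems.append(f"missing required field: {field}")
    for field, value in metadata.items():
        if field not in REQUIRED_FIELDS and value in EMPTY_VALUES:
            # Optional fields without a valid value must be omitted entirely,
            # not filled with dummy data.
            problems.append(f"empty optional field should be omitted: {field}")
    return problems
```

A tool following this approach would reject the metadata whenever the
returned list is non-empty, rather than inserting dummy data.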
Automated tools, especially public index servers, MAY impose additional
length restrictions on metadata beyond those enumerated in this PEP. Such
limits SHOULD be imposed where necessary to protect the integrity of a
service, based on the available resources and the service provider's
judgment of reasonable metadata capacity requirements.
Metadata files
--------------
The information defined in this PEP is serialised to ``pysdist.json``
files for some use cases. These are files containing UTF-8 encoded JSON
metadata.
Each metadata file consists of a single serialised mapping, with fields as
described in this PEP. When serialising metadata, automated tools SHOULD
lexically sort any keys and list elements in order to simplify reviews
of any changes.
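One possible way to produce such a file with the standard library is
sketched below. The function names are hypothetical. Lists that contain
anything other than strings (for example, lists of mappings) are left in
their original order, which is an assumption of this sketch rather than a
rule from this PEP.

```python
import json


def sort_lists(value):
    """Recursively sort lists of strings; leave other lists in place."""
    if isinstance(value, dict):
        return {key: sort_lists(item) for key, item in value.items()}
    if isinstance(value, list):
        items = [sort_lists(item) for item in value]
        if all(isinstance(item, str) for item in items):
            return sorted(items)
        return items
    return value


def serialise_metadata(metadata):
    """Serialise a metadata mapping as UTF-8 encoded JSON with sorted keys."""
    text = json.dumps(sort_lists(metadata), sort_keys=True, indent=2,
                      ensure_ascii=False)
    return text.encode("utf-8")
```

Sorting keys and list elements keeps the output stable between runs, so a
diff of two metadata files shows only genuine changes.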
There are expected to be three standard locations for these metadata files:
* as a ``{distribution}-{version}.dist-info/pysdist.json`` file in an
``sdist`` source distribution archive
* as a ``{distribution}-{version}.dist-info/pysdist.json`` file in a ``wheel``
binary distribution archive
* as a ``{distribution}-{version}.dist-info/pysdist.json`` file in a local
Python installation database
This file is expected to be identical in all three locations - it is
generated when creating a source archive or binary archive from a source
tree, and then preserved unchanged on installation, or when building a
binary archive from a source archive.
.. note::
These locations are to be confirmed, since they depend on the definition
of sdist 2.0 and the revised installation database standard. There will
also be a wheel 1.1 format update after this PEP is approved that
mandates provision of 2.0+ metadata.
Note that these metadata files MAY be processed even if the version of the
containing location is too low to indicate that they are valid. Specifically,
unversioned ``sdist`` archives, unversioned installation database directories
and version 1.0 of the ``wheel`` specification may still provide
``pysdist.json`` files.
.. note::
Until this specification is formally marked as Active, it is recommended
that tools following the draft format use an alternative filename like
``metadata.json`` or ``pep426-20131213.json`` to avoid colliding with
the eventually standardised files.
Other tools involved in Python distribution MAY also use this format.
Note that these metadata files are generated by build tools based on other
input formats (such as ``setup.py`` and ``pyproject.toml``) rather than being
used directly as a data input format. Generating the metadata as part of the
publication process also helps to deal with version specific fields (including
the source URL and the version field itself).
For backwards compatibility with older installation tools, metadata 2.0
files MAY be distributed alongside legacy metadata.
Index servers MAY allow distributions to be uploaded and installation tools
MAY allow distributions to be installed with only legacy metadata.
Automated tools MAY attempt to automatically translate legacy metadata to
the format described in this PEP. Advice for doing so effectively is given
in Appendix A.
Metadata validation
-------------------
A `jsonschema <https://pypi.org/project/jsonschema/>`__ description of
the distribution metadata is `available
<https://hg.python.org/peps/file/default/pep-0426/pydist-schema.json>`__.
This schema does NOT currently handle validation of some of the more complex
string fields (instead treating them as opaque strings).
Except where otherwise noted, all URL fields in the metadata MUST comply
with :rfc:`3986`.
.. note::
The current version of the schema file covers the previous draft of the
PEP, and has not yet been updated for the split into the essential
dependency resolution metadata and multiple standard extensions, and nor
has it been updated for the various other differences between the current
draft and the earlier drafts.
Core metadata
=============
This section specifies the core metadata fields that are required for every
Python distribution.
Publication tools MUST ensure at least these fields are present when
publishing a distribution.
Index servers MUST ensure at least these fields are present in the metadata
when distributions are uploaded.
Installation tools MUST refuse to install distributions with one or more
of these fields missing by default, but MAY allow users to force such an
installation to occur.
Metadata version
----------------
Version of the file format; ``"2.0"`` is the only legal value.
Automated tools consuming metadata SHOULD warn if ``metadata_version`` is
greater than the highest version they support, and MUST fail if
``metadata_version`` has a greater major version than the highest
version they support (as described in :pep:`440`, the major version is the
value before the first dot).
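The warn-or-fail rule above could be implemented roughly as follows. This is
a sketch under stated assumptions: the names ``SUPPORTED_VERSION`` and
``check_metadata_version`` are hypothetical, and the version string is
assumed to have exactly the ``major.minor`` form.

```python
import warnings

# Assumption: the highest metadata version this hypothetical tool supports.
SUPPORTED_VERSION = (2, 0)


def check_metadata_version(text, supported=SUPPORTED_VERSION):
    """Fail on a newer major version, warn on a newer minor version."""
    # Raises ValueError if the text is not of the form "major.minor".
    major, minor = text.split(".")
    found = (int(major), int(minor))
    if found[0] > supported[0]:
        raise ValueError(f"unsupported metadata major version: {text}")
    if found > supported:
        warnings.warn(f"metadata version {text} is newer than supported")
    return found
```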
For broader compatibility, build tools MAY choose to produce
distribution metadata using the lowest metadata version that includes
all of the needed fields.
Example::
"metadata_version": "2.0"
Generator
---------
Name (and optional version) of the program that generated the file,
if any. A manually produced file would omit this field.
Examples::
"generator": "flit"
"generator": "setuptools (34.3.1)"
Name
----
The name of the distribution, as defined in :pep:`508`.
As distribution names are used as part of URLs, filenames, command line
parameters and must also interoperate with other packaging systems, the
permitted characters are constrained to:
* ASCII letters (``[a-zA-Z]``)
* ASCII digits (``[0-9]``)
* underscores (``_``)
* hyphens (``-``)
* periods (``.``)
Distribution names MUST start and end with an ASCII letter or digit.
Automated tools MUST reject non-compliant names. A regular expression to
enforce these constraints (when run with ``re.IGNORECASE``) is::
^([A-Z0-9]|[A-Z0-9][A-Z0-9._-]*[A-Z0-9])$
All comparisons of distribution names MUST be case insensitive, and MUST
consider hyphens and underscores to be equivalent.
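A minimal sketch of the validation and comparison rules follows. The helper
names are hypothetical. ``fullmatch`` is used so that a trailing newline is
not accepted, and periods are not folded together with hyphens and
underscores because the comparison rule above names only those two
characters.

```python
import re

NAME_RE = re.compile(r"^([A-Z0-9]|[A-Z0-9][A-Z0-9._-]*[A-Z0-9])$",
                     re.IGNORECASE)


def is_valid_name(name):
    """Check a distribution name against the permitted characters."""
    return NAME_RE.fullmatch(name) is not None


def comparison_key(name):
    """Fold case and treat underscores as hyphens for comparisons."""
    return name.lower().replace("_", "-")


def same_distribution(first, second):
    """Compare two distribution names using the rules above."""
    return comparison_key(first) == comparison_key(second)
```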
Index servers MAY consider "confusable" characters (as defined by the
Unicode Consortium in `TR39: Unicode Security Mechanisms <TR39_>`_) to be
equivalent.
Index servers that permit arbitrary distribution name registrations from
untrusted sources SHOULD consider confusable characters to be equivalent
when registering new distributions (and hence reject them as duplicates).
Integration tools MUST NOT silently accept a confusable alternate
spelling as matching a requested distribution name.
At time of writing, the characters in the ASCII subset designated as
confusables by the Unicode Consortium are:
* ``1`` (DIGIT ONE), ``l`` (LATIN SMALL LETTER L), and ``I`` (LATIN CAPITAL
LETTER I)
* ``0`` (DIGIT ZERO), and ``O`` (LATIN CAPITAL LETTER O)
Example::
"name": "ComfyChair"
Version
-------
The distribution's public or local version identifier, as defined in :pep:`440`.
Version identifiers are designed for consumption by automated tools and
support a variety of flexible version specification mechanisms (see :pep:`440`
for details).
Version identifiers MUST comply with the format defined in :pep:`440`.
Version identifiers MUST be unique within each project.
Index servers MAY place restrictions on the use of local version identifiers
as described in :pep:`440`.
Example::
"version": "1.0a2"
Summary
-------
A short summary of what the distribution does.
This field SHOULD contain fewer than 512 characters and MUST contain fewer
than 2048.
This field SHOULD NOT contain any line breaks.
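These limits translate directly into a small check. In this sketch the
helper name ``check_summary`` is hypothetical; MUST violations are reported
as errors and SHOULD violations as warnings.

```python
def check_summary(summary):
    """Return (errors, warnings) for a summary string."""
    errors, warnings = [], []
    if len(summary) >= 2048:
        errors.append("summary must contain fewer than 2048 characters")
    elif len(summary) >= 512:
        warnings.append("summary should contain fewer than 512 characters")
    if "\n" in summary or "\r" in summary:
        warnings.append("summary should not contain line breaks")
    return errors, warnings
```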
A more complete description SHOULD be included as a separate file in the
sdist for the distribution. Refer to the ``python-details`` extension in
:pep:`459` for more information.
Example::
"summary": "A module that is more fiendish than soft cushions."
Source code metadata
====================
This section specifies fields that provide identifying details for the
source code used to produce this distribution.
All of these fields are optional. Automated tools MUST operate correctly if
a distribution does not provide them, including failing cleanly when an
operation depending on one of these fields is requested.
Source labels
-------------
Source labels are text strings with minimal defined semantics. They are
intended to allow the original source code to be unambiguously identified,
even if an integrator has applied additional local modifications to a
particular distribution.
To ensure source labels can be readily incorporated as part of file names
and URLs, and to avoid formatting inconsistencies in hexadecimal hash
representations they MUST be limited to the following set of permitted
characters:
* Lowercase ASCII letters (``[a-z]``)
* ASCII digits (``[0-9]``)
* underscores (``_``)
* hyphens (``-``)
* periods (``.``)
* plus signs (``+``)
Source labels MUST start and end with an ASCII letter or digit.
A regular expression to enforce these constraints (when run with
``re.IGNORECASE``) is::
^([A-Z0-9]|[A-Z0-9][A-Z0-9._+-]*[A-Z0-9])$
A source label for a project MUST NOT match any defined version for that
project. This restriction ensures that there is no ambiguity between version
identifiers and source labels.
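The source label rules could be checked as sketched below. Several points
are assumptions of this sketch: the helper name is hypothetical, the pattern
follows the permitted character list and so accepts lowercase letters only,
the hyphen is placed last in the character class so that Python's ``re``
module reads it literally, and the comparison against known versions is
plain string equality with no :pep:`440` normalisation.

```python
import re

# Hyphen placed last so it is a literal character, not a range operator.
SOURCE_LABEL_RE = re.compile(r"[a-z0-9]|[a-z0-9][a-z0-9._+-]*[a-z0-9]")


def check_source_label(label, project_versions=()):
    """Return True if the label is well formed and is not a known version."""
    if SOURCE_LABEL_RE.fullmatch(label) is None:
        return False
    # Assumption: versions are compared as plain strings.
    return label not in project_versions
```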
Examples::
"source_label": "1.0.0-alpha.1"
"source_label": "1.3.7+build.11.e0f985a"
"source_label": "v1.8.1.301.ga0df26f"
"source_label": "2013.02.17.dev123"
Source URL
----------
A string containing a full URL where the source for this specific version of
the distribution can be downloaded.
Source URLs MUST be unique within each project. This means that the URL
can't be something like ``"https://github.com/pypa/pip/archive/main.zip"``,
but instead must be ``"https://github.com/pypa/pip/archive/1.3.1.zip"``.
The source URL MUST reference either a source archive or a tag or specific
commit in an online version control system that permits creation of a
suitable VCS checkout. It is intended primarily for integrators that
wish to recreate the distribution from the original source form.
All source URL references SHOULD specify a secure transport mechanism
(such as ``https``) AND include an expected hash value in the URL for
verification purposes. If a source URL is specified without any hash
information, with hash information that the tool doesn't understand, or
with a selected hash algorithm that the tool considers too weak to trust,
automated tools SHOULD at least emit a warning and MAY refuse to rely on
the URL. If such a source URL also uses an insecure transport, automated
tools SHOULD NOT rely on the URL.
For source archive references, an expected hash value may be specified by
including a ``<hash-algorithm>=<expected-hash>`` entry as part of the URL
fragment.
As of 2017, it is RECOMMENDED that ``'sha256'`` hashes be used for source
URLs, as this hash is not yet known to be vulnerable to generation of
malicious collisions, while also being widely available on client systems.
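For source archive references, the checks above might look like the
following sketch. The helper names and the ``TRUSTED_ALGORITHMS`` policy are
hypothetical, ``https`` is assumed to be the only transport treated as
secure, and version control references are deliberately not handled here.

```python
import hashlib
from urllib.parse import urlsplit

# Assumption: this hypothetical tool trusts only these SHA-2 variants.
TRUSTED_ALGORITHMS = {"sha256", "sha384", "sha512"}


def check_archive_url(url):
    """Return a list of problems found in a source archive URL."""
    parts = urlsplit(url)
    algorithm, equals, expected = parts.fragment.partition("=")
    problems = []
    if not equals or not expected:
        problems.append("no hash information in the URL fragment")
    elif algorithm not in hashlib.algorithms_available:
        problems.append(f"hash algorithm not understood: {algorithm}")
    elif algorithm not in TRUSTED_ALGORITHMS:
        problems.append(f"hash algorithm too weak to trust: {algorithm}")
    if parts.scheme != "https":
        problems.append(f"insecure transport: {parts.scheme}")
    return problems


def archive_matches(url, data):
    """Compare downloaded archive bytes with the hash in the URL fragment."""
    algorithm, _, expected = urlsplit(url).fragment.partition("=")
    return hashlib.new(algorithm, data).hexdigest() == expected.lower()
```

A tool could emit a warning for each reported problem, and refuse to rely on
the URL when both a hash problem and an insecure transport are reported.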
For version control references, the ``VCS+protocol`` scheme SHOULD be
used to identify both the version control system and the secure transport,
and a version control system with hash based commit identifiers SHOULD be
used. Automated tools MAY omit warnings about missing hashes for version
control systems that do not provide hash based commit identifiers.
To handle version control systems that do not support including commit or
tag references directly in the URL, that information may be appended to the
end of the URL using the ``@<commit-hash>`` or the ``@<tag>#<commit-hash>``
notation.
.. note::
This isn't *quite* the same as the existing VCS reference notation
supported by pip. Firstly, the distribution name is a separate field rather
than embedded as part of the URL. Secondly, the commit hash is included
even when retrieving based on a tag, in order to meet the requirement
above that *every* link should include a hash to make things harder to
forge (creating a malicious repo with a particular tag is easy, creating
one with a specific *hash*, less so).
Examples::
"source_url": "https://github.com/pypa/pip/archive/1.3.1.zip#sha256=2dc6b5a470a1bde68946f263f1af1515a2574a150a30d6ce02c6ff742fcc0db8"
"source_url": "git+https://github.com/pypa/pip.git@1.3.1#7921be1537eac1e97bc40179a57f0349c2aee67d"
"source_url": "git+https://github.com/pypa/pip.git@7921be1537eac1e97bc40179a57f0349c2aee67d"
Semantic dependencies
=====================
Dependency metadata allows published projects to make use of functionality
provided by other published projects, without needing to bundle copies of
particular releases of those projects.
Semantic dependencies allow publishers to indicate not only which other
projects are needed, but also *why* they're needed. This additional
information allows integrators to install just the dependencies they need
for specific activities, making it easier to minimise installation
footprints in constrained environments (regardless of the reasons for
those constraints).
By default, dependency declarations are assumed to be for
"runtime dependencies": other releases that are needed to actually use the
published release.
There are also four different kinds of optional dependency that releases may
declare:
* ``test`` dependencies: other releases that are needed to run the
automated test suite for this release, but are not needed just to
use it (e.g. ``nose2`` or ``pytest``)
* ``build`` dependencies: other releases that are needed to build
a deployable binary version of this release from source
(e.g. ``flit`` or ``setuptools``)
* ``doc`` dependencies: other releases that are needed to build the
documentation for this distribution (e.g. the ``sphinx`` build tool)
* ``dev`` dependencies: other releases that are needed when working on this
distribution, but do not fit into exactly one of the other optional
dependency categories (e.g. ``pylint``, ``flake8``). ``dev`` dependencies
are also effectively considered as combined ``test``, ``build``, and ``doc``
dependencies, without needing to be listed three times
These optional categories are known as
`Extras <Extras (optional dependencies)_>`_. In addition to the four
standard categories, projects may also declare their own custom categories
in the `Extras`_ field.
There are also two standard extra categories that imply dependencies on
other extras:
* ``alldev``: implies the ``test``, ``build``, ``doc``, ``dev`` extras
* ``all``: if not otherwise defined, implies all declared extras
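The two implied extras can be expanded with a few lines of code. In this
sketch the function name ``expand_extras`` is hypothetical, the caller
decides which names count as "declared", and the case where a project
defines ``all`` itself is not handled.

```python
def expand_extras(requested, declared=()):
    """Expand ``alldev`` and ``all`` into the extras they imply."""
    expanded = set()
    for extra in requested:
        if extra == "alldev":
            expanded.update(("test", "build", "doc", "dev"))
        elif extra == "all":
            # Assumption: ``declared`` holds every extra the project declares.
            expanded.update(declared)
        else:
            expanded.add(extra)
    return expanded
```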
Dependency management is heavily dependent on the version identification
and specification scheme defined in :pep:`440` and the dependency specification,
extra, and environment marker schemes defined in :pep:`508`.
All of these fields are optional. Automated tools MUST operate correctly if
a distribution does not provide them, by assuming that a missing field
indicates "Not applicable for this distribution".
Mapping dependencies to development and distribution activities
---------------------------------------------------------------
The different categories of dependency are based on the various distribution
and development activities identified above, and govern which dependencies
should be installed for the specified activities:
* Required runtime dependencies:
* unconditional dependencies
* Required build dependencies:
* the ``build`` extra
* the ``dev`` extra
* If running the distribution's test suite as part of the build process,
also install the unconditional dependencies and ``test`` extra
* Required development and publication dependencies:
* unconditional dependencies
* the ``test`` extra
* the ``build`` extra
* the ``doc`` extra
* the ``dev`` extra
The notation described in `Extras (optional dependencies)`_ SHOULD be used
to determine exactly what gets installed for various operations.
Installation tools SHOULD report an error if dependencies cannot be
satisfied, MUST at least emit a warning, and MAY allow the user to force
the installation to proceed regardless.
See Appendix B for an overview of mapping these dependencies to an RPM
spec file.
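The activity mapping in this section can be sketched as a small lookup table. This is only an illustration: the activity labels (``runtime``, ``build``, ``develop``) and the helper ``extras_for_activity`` are hypothetical names, and only the extra names themselves come from this PEP.

```python
# Hypothetical activity labels; only the extra names come from the PEP.
ACTIVITY_EXTRAS = {
    "runtime": set(),                          # unconditional dependencies only
    "build": {"build", "dev"},
    "develop": {"test", "build", "doc", "dev"},
}


def extras_for_activity(activity, run_tests_during_build=False):
    """Return (needs_unconditional, extras) for the given activity."""
    extras = set(ACTIVITY_EXTRAS[activity])
    # Runtime and development always need the unconditional dependencies.
    needs_unconditional = activity in ("runtime", "develop")
    if activity == "build" and run_tests_during_build:
        # Running the test suite during a build also pulls in the
        # unconditional dependencies and the ``test`` extra.
        needs_unconditional = True
        extras.add("test")
    return needs_unconditional, extras
```

A build tool following this sketch would call ``extras_for_activity("build", run_tests_during_build=True)`` before running a test-enabled build.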
Extras
------
A list of optional sets of dependencies that may be used to define
conditional dependencies in dependency fields. See
`Extras (optional dependencies)`_ for details.
The names of extras MUST abide by the same restrictions as those for
distribution names.
The following extra names are available by default and MUST NOT be
declared explicitly in this field:
* ``all``
* ``alldev``
* ``build``
* ``dev``
* ``doc``
* ``test``
Example::
"extras": ["warmup", "tea"]
Dependencies
------------
A list of release requirements needed to actually run this release.
Public index servers MAY prohibit strict version matching clauses or direct
references in this field.
Example::
"dependencies": [
{
"requires": ["SciPy", "PasteDeploy", "zope.interface > 3.5.0"]
},
{
"requires": ["pywin32 > 1.0"],
"environment": "sys_platform == 'win32'"
},
{
"requires": ["SoftCushions"],
"extra": "warmup"
}
]
While many dependencies will be needed to use a project release at all, others
are needed only on particular platforms or only when particular optional
features of the release are needed.
To handle this, release dependency specifiers are mappings with the following
subfields:
* ``requires``: a list of requirements needed to satisfy the dependency
* ``extra``: the name of a set of optional dependencies that are requested
and installed together. See `Extras (optional dependencies)`_ for details
* ``environment``: an environment marker defining the environment that
needs these dependencies. The syntax and capabilities of environment
markers are defined in :pep:`508`
Individual entries in the ``requires`` lists are strings using the dependency
declaration format defined in :pep:`508`, with the exception that environment
markers MUST NOT be included in the individual dependency declarations, and
are instead supplied in the separate ``environment`` field.
``requires`` is the only required subfield. When it is the only subfield, the
dependencies are said to be *unconditional*. If ``extra`` or ``environment``
is specified, then the dependencies are *conditional*.
All three fields may be supplied, indicating that the dependencies are
needed only when the named extra is requested in a particular environment.
Automated tools MUST combine related dependency specifiers (those with
common values for ``extra`` and ``environment``) into a single specifier
listing multiple requirements when serialising metadata.
Despite this required normalisation, the same extra name or environment
marker MAY appear in multiple conditional dependencies. This may happen,
for example, if an extra itself only needs some of its dependencies in
specific environments. It is only the combination of extras and environment
markers that is required to be unique in a list of dependency specifiers.
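As a rough sketch of the normalisation described above, specifiers can be grouped by their ``(extra, environment)`` pair. The function name ``combine_specifiers`` is hypothetical, and dropping duplicate requirement strings is an assumption made here, not something this PEP requires.

```python
def combine_specifiers(specifiers):
    """Merge dependency specifiers that share ``extra`` and ``environment``."""
    merged = {}  # (extra, environment) -> combined specifier, in first-seen order
    for spec in specifiers:
        key = (spec.get("extra"), spec.get("environment"))
        if key not in merged:
            combined = {"requires": []}
            # Copy only the conditions that are actually present.
            for field in ("extra", "environment"):
                if field in spec:
                    combined[field] = spec[field]
            merged[key] = combined
        for requirement in spec["requires"]:
            # Assumption: repeated requirement strings are listed once.
            if requirement not in merged[key]["requires"]:
                merged[key]["requires"].append(requirement)
    return list(merged.values())
```

Note that the same extra can still appear in several of the returned specifiers, as long as each one carries a different environment marker.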
Aside from the six standard extra categories, any extras referenced from a
dependency specifier MUST be named in the `Extras`_ field for this distribution.
This helps avoid typographical errors and also makes it straightforward to
identify the available extras without scanning the full set of dependencies.
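A validation tool could check this rule along the following lines. The helper name ``undeclared_extras`` is hypothetical, and the metadata is assumed to be already loaded as a Python dictionary.

```python
STANDARD_EXTRAS = {"all", "alldev", "build", "dev", "doc", "test"}


def undeclared_extras(metadata):
    """Return extras used by dependency specifiers but never declared."""
    declared = set(metadata.get("extras", []))
    used = {
        spec["extra"]
        for spec in metadata.get("dependencies", [])
        if "extra" in spec
    }
    # Standard extras are always available and need no declaration.
    return sorted(used - declared - STANDARD_EXTRAS)
```

A non-empty result usually points at a typographical error such as ``tee`` in place of ``tea``.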
To reuse an extra definition as part of another extra, project releases MAY
declare dependencies on themselves. To avoid infinite recursion in these cases,
automated tools MUST special case dependencies from a project back onto itself.
Metadata Extensions
===================
Extensions to the metadata MAY be present in a mapping under the
``extensions`` key. The keys MUST be valid prefixed names, while
the values MUST themselves be nested mappings.
Two key names are reserved and MUST NOT be used by extensions, except as
described below:
* ``extension_version``
* ``installer_must_handle``
The following example shows the ``python.details`` and ``python.commands``
standard extensions from :pep:`459`::
"extensions" : {
"python.details": {
"license": "GPL version 3, excluding DRM provisions",
"keywords": [
"comfy", "chair", "cushions", "too silly", "monty python"
],
"classifiers": [
"Development Status :: 4 - Beta",
"Environment :: Console (Text Based)",
"License :: OSI Approved :: GNU General Public License v3 (GPLv3)"
],
"document_names": {
"description": "README.rst",
"license": "LICENSE.rst",
"changelog": "NEWS"
}
},
"python.commands": {
"wrap_console": [{"chair": "chair:run_cli"}],
"wrap_gui": [{"chair-gui": "chair:run_gui"}],
"prebuilt": ["reduniforms"]
}
}
Extension names are defined by distributions that will then make use of
the additional published metadata in some way.
To reduce the chance of name conflicts, extension names SHOULD use a
prefix that corresponds to a module name in the distribution that defines
the meaning of the extension. This practice will also make it easier to
find authoritative documentation for metadata extensions.
Metadata extensions allow development tools to record information in the
metadata that may be useful during later phases of distribution, but is
not essential for dependency resolution or building the software.
Extension versioning
--------------------
Extensions MUST be versioned, using the ``extension_version`` key.
However, if this key is omitted, then the implied version is ``1.0``.
Automated tools consuming extension metadata SHOULD warn if
``extension_version`` is greater than the highest version they support,
and MUST fail if ``extension_version`` has a greater major version than
the highest version they support (as described in :pep:`440`, the major
version is the value before the first dot).
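The version rules above might be applied as in the following sketch. It assumes that extension versions are simple dotted integers such as ``1.0`` or ``2.3`` (full :pep:`440` parsing is out of scope here), and the function name and the ``"ok"``/``"warn"``/``"fail"`` labels are hypothetical.

```python
def extension_version_status(extension, supported="1.0"):
    """Classify an extension mapping as "ok", "warn" or "fail"."""
    # A missing key implies version 1.0.
    declared = extension.get("extension_version", "1.0")
    declared_parts = tuple(int(part) for part in declared.split("."))
    supported_parts = tuple(int(part) for part in supported.split("."))
    if declared_parts[0] > supported_parts[0]:
        return "fail"   # newer major version: the tool MUST fail
    if declared_parts > supported_parts:
        return "warn"   # newer minor version: the tool SHOULD warn
    return "ok"
```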
For broader compatibility, build tools MAY choose to produce
extension metadata using the lowest metadata version that includes
all of the needed fields.
Required extension handling
---------------------------
A project may consider correct handling of some extensions to be essential
to correct installation of the software. This is indicated by setting the
``installer_must_handle`` field to ``true``. Setting it to ``false`` or
omitting it altogether indicates that processing the extension when
installing the distribution is not considered mandatory by the developers.
Installation tools MUST fail if ``installer_must_handle`` is set to ``true``
for an extension and the tool does not have any ability to process that
particular extension (whether directly or through a tool-specific plugin
system).
If an installation tool encounters a required extension it doesn't
understand when attempting to install from a wheel archive, it MAY fall
back on attempting to install from source rather than failing entirely.
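An installer could detect this situation roughly as follows. The helper name and the ``supported_extensions`` parameter are hypothetical, and ``chair.hooks`` in the usage note is an invented extension name used purely for illustration.

```python
def unhandled_required_extensions(metadata, supported_extensions):
    """Names of extensions the installer must handle but cannot."""
    extensions = metadata.get("extensions", {})
    return sorted(
        name
        for name, body in extensions.items()
        # A missing or false flag means handling is optional.
        if body.get("installer_must_handle", False) is True
        and name not in supported_extensions
    )
```

For metadata containing ``"chair.hooks": {"installer_must_handle": true}``, a tool that only supports ``python.details`` would get ``["chair.hooks"]`` back and would then either fail or, for a wheel, fall back to installing from source.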
Extras (optional dependencies)
==============================
As defined in :pep:`508`, extras are additional dependencies that enable an
optional aspect of a project release, often corresponding to a ``try: import
optional_dependency ...`` block in the code. They are also used to indicate
semantic dependencies for activities other than normal runtime use (such as
testing, building, or working on the component).
To support the use of the release with or without the optional dependencies,
they are listed separately from the release's core runtime dependencies
and must be requested explicitly, either in the dependency specifications of
another project, or else when issuing a command to an installation tool.
Example of a distribution with optional dependencies::
"name": "ComfyChair",
"extras": ["warmup"],
"dependencies": [
{
"requires": ["SoftCushions"],
"extra": "warmup"
},
{
"requires": ["cython"],
"extra": "build"
}
]
Other distributions require the additional dependencies by placing the
relevant extra names inside square brackets after the distribution name when
specifying the dependency. Multiple extras from a dependency can be requested
by separating the extra names with commas inside the square brackets.
If the standard ``all`` extra has no explicitly declared entries, then
integration tools SHOULD implicitly define it as a dependency on all of the
extras explicitly declared by the project.
If the standard ``alldev`` extra has no explicitly declared entries, then
integration tools SHOULD implicitly define it as a dependency on the standard
``test``, ``build``, ``doc``, and ``dev`` extras.
The full set of dependency requirements is then based on the unconditional
dependencies, along with those of any requested extras.
Dependency examples (showing just the ``requires`` subfield)::
"requires": ["ComfyChair"]
-> requires ``ComfyChair`` only
"requires": ["ComfyChair[warmup]"]
-> requires ``ComfyChair`` and ``SoftCushions``
"requires": ["ComfyChair[all]"]
-> requires ``ComfyChair`` and ``SoftCushions``, but will also
pick up any new extras defined in later versions
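The examples above can be reproduced with a short sketch. The function names are hypothetical, environment markers are ignored to keep the example small, and the metadata is assumed to be already loaded as a Python dictionary.

```python
DEV_EXTRAS = ("test", "build", "doc", "dev")


def expand_extras(metadata, requested):
    """Expand implicit ``all`` / ``alldev`` into concrete extra names."""
    explicit = {
        spec["extra"]
        for spec in metadata.get("dependencies", [])
        if "extra" in spec
    }
    expanded = set()
    for extra in requested:
        if extra == "all" and "all" not in explicit:
            expanded.update(metadata.get("extras", []))
        elif extra == "alldev" and "alldev" not in explicit:
            expanded.update(DEV_EXTRAS)
        else:
            expanded.add(extra)
    return expanded


def requirements_for(metadata, requested=()):
    """Unconditional requirements plus those of the requested extras."""
    extras = expand_extras(metadata, requested)
    result = []
    for spec in metadata.get("dependencies", []):
        # Environment markers are deliberately ignored in this sketch.
        if "extra" not in spec or spec["extra"] in extras:
            result.extend(spec["requires"])
    return result
```

With the ``ComfyChair`` metadata shown earlier, ``requirements_for(comfy, ["warmup"])`` yields ``["SoftCushions"]``, while requesting ``alldev`` yields ``["cython"]`` through the ``build`` extra.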
Updating the metadata specification
===================================
The metadata specification may be updated with clarifications without
requiring a new PEP or a change to the metadata version.
Changing the meaning of existing fields or adding new features (other than
through the extension mechanism) requires a new metadata version defined in
a new PEP.
Appendix A: Conversion notes for legacy metadata
================================================
The reference implementations for converting from legacy metadata to
metadata 2.0 are:
* the `wheel project <https://bitbucket.org/dholth/wheel/overview>`__, which
adds the ``bdist_wheel`` command to ``setuptools``
* the `Warehouse project <https://github.com/dstufft/warehouse>`__, which
will eventually be migrated to the Python Packaging Authority as the next
generation Python Package Index implementation
* the `distlib project <https://bitbucket.org/pypa/distlib/>`__ which is
derived from the core packaging infrastructure created for the
``distutils2`` project
.. note::
These tools have yet to be updated for the switch to standard extensions
for several fields.
While it is expected that there may be some edge cases where manual
intervention is needed for clean conversion, the specification has been
designed to allow fully automated conversion of almost all projects on
PyPI.
Metadata conversion (especially on the part of the index server) is a
necessary step to allow installation and analysis tools to start
benefiting from the new metadata format, without having to wait for
developers to upgrade to newer build systems.
Appendix B: Mapping dependency declarations to an RPM SPEC file
===============================================================
As an example of mapping this PEP to Linux distro packages, assume an
example project without any extras defined is split into 2 RPMs
in a SPEC file: ``example`` and ``example-devel``.
The unconditional dependencies would be mapped to the Requires dependencies
for the "example" RPM (a mapping from environment markers relevant to Linux
to SPEC file conditions would also allow those to be handled correctly).
The ``build`` and ``dev`` extra dependencies would be mapped to the
BuildRequires dependencies for the "example" RPM. Depending on how the
``%check`` section in the RPM was defined, the ``test`` extra may also be
mapped to the BuildRequires declaration for the RPM.
All defined dependencies relevant to Linux in the ``dev``, ``test``, ``build``,
and ``doc`` extras would become Requires dependencies for the "example-devel"
RPM.
A documentation toolchain dependency like Sphinx would either go in the
``build`` extra (for example, if man pages were included in the
built distribution) or in the ``doc`` extra (for example, if the
documentation is published solely through Read the Docs or the
project website). This would be enough to allow an automated converter
to map it to an appropriate dependency in the spec file.
If the project did define any extras, those could be mapped to additional
virtual RPMs with appropriate BuildRequires and Requires entries based on
the details of the dependency specifications. Alternatively, they could
be mapped to other system package manager features (such as weak dependencies).
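For the simple case of a project without extras of its own, the grouping described in this appendix could be sketched as below. The function name and the dictionary keys are hypothetical, the ``test`` extra is left out of BuildRequires (as if ``%check`` did not run the tests), and translating Python requirement strings into RPM package names is not attempted.

```python
def rpm_dependency_groups(metadata):
    """Group requirements by the RPM field they would map to."""
    groups = {
        "example: Requires": [],
        "example: BuildRequires": [],
        "example-devel: Requires": [],
    }
    for spec in metadata.get("dependencies", []):
        extra = spec.get("extra")
        if extra is None:
            # Unconditional dependencies are needed at runtime.
            groups["example: Requires"].extend(spec["requires"])
        if extra in ("build", "dev"):
            groups["example: BuildRequires"].extend(spec["requires"])
        if extra in ("dev", "test", "build", "doc"):
            groups["example-devel: Requires"].extend(spec["requires"])
    return groups
```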
The metadata extension format should also provide a way for distribution
specific hints to be included in the upstream project metadata without needing
to manually duplicate any of the upstream metadata in a distribution specific
format.
Appendix C: Summary of differences from PEP 345
===============================================
* Metadata-Version is now 2.0, with semantics specified for handling
version changes
* The increasingly complex ad hoc "Key: Value" format has been replaced by
a more structured JSON compatible format that is easily represented as
Python dictionaries, strings, and lists
* Most fields are now optional and filling in dummy data for omitted fields
is explicitly disallowed
* Explicit permission for in-place clarifications without releasing a new
version of the specification
* The PEP now attempts to provide more of an explanation of *why* the fields
exist and how they are intended to be used, rather than being a simple
description of the permitted contents
* Changed the version scheme to be based on :pep:`440` rather than :pep:`386`
* Added the source label mechanism as described in :pep:`440`
* Formally defined dependency declarations, extras, and environment markers
in :pep:`508`
* Support for different kinds of dependencies through additional reserved
extra names
* Updated obsolescence mechanism
* A well-defined metadata extension mechanism, and migration of any fields
not needed for dependency resolution to standard extensions
* With all due respect to Charles Schulz and Peanuts, many of the examples
have been updated to be more thematically appropriate for Python ;)
The rationale for major changes is given in the following sections.
Metadata-Version semantics
--------------------------
The semantics of major and minor version increments are now specified,
and follow the same model as the format version semantics specified for
the wheel format in :pep:`427`: minor version increments must behave
reasonably when processed by a tool that only understands earlier metadata
versions with the same major version, while major version increments
may include changes that are not compatible with existing tools.
The major version number of the specification has been incremented
accordingly, as :pep:`426` metadata obviously cannot be
interpreted in accordance with earlier metadata specifications.
Whenever the major version number of the specification is incremented, it
is expected that deployment will take some time, as either metadata
consuming tools must be updated before other tools can safely start
producing the new format, or else the sdist and wheel formats, along with
the installation database definition, will need to be updated to support
provision of multiple versions of the metadata in parallel.
Existing tools won't abide by this guideline until they're updated to
support the new metadata standard, so the new semantics will first take
effect for a hypothetical 2.x -> 3.0 transition. For the 1.x -> 2.x
transition, we will use the approach where tools continue to produce the
existing supplementary files (such as ``entry_points.txt``) in addition
to any equivalents specified using the new features of the standard
metadata format (including the formal extension mechanism).
Switching to a JSON compatible format
-------------------------------------
The old "Key:Value" format was becoming increasingly limiting, with various
complexities like parsers needing to know which fields were permitted to
occur more than once, which fields supported the environment marker
syntax (with an optional ``";"`` to separate the value from the marker) and
eventually even the option to embed arbitrary JSON inside particular
subfields.
The old serialisation format also wasn't amenable to easy conversion to
standard Python data structures for use in any new install hook APIs, or
in future extensions to the runtime importer APIs to allow them to provide
information for inclusion in the installation database.
Accordingly, we've taken the step of switching to a JSON-compatible metadata
format. This works better for APIs and is much easier for tools to parse and
generate correctly. Changing the name of the metadata file also makes it
easy to distribute 1.x and 2.x metadata in parallel, greatly simplifying
several aspects of the migration to the new metadata format.
The specific choice of ``pydist.json`` as the preferred file name relates
to the fact that the metadata described in these files applies to the
distribution as a whole, rather than to any particular build. Additional
metadata formats may be defined in the future to hold information that can
only be determined after building a binary distribution for a particular
target environment.
Changing the version scheme
---------------------------
See :pep:`440` for a detailed rationale for the various changes made to the
versioning scheme.
Source labels
-------------
The new source label support is intended to make it clearer that the
constraints on public version identifiers are there primarily to aid in
the creation of reliable automated dependency analysis tools. Projects
are free to use whatever versioning scheme they like internally, so long
as they are able to translate it to something the dependency analysis tools
will understand.
Source labels also make it straightforward to record specific details of a
version, like a hash or tag name that allows the release to be reconstructed
from the project version control system.
Support for optional dependencies for distributions
---------------------------------------------------
The new extras system allows distributions to declare optional
behaviour, and to use the dependency fields to indicate when
particular dependencies are needed only to support that behaviour. It is
derived from the equivalent system that is already in widespread use as
part of ``setuptools`` and allows that aspect of the legacy ``setuptools``
metadata to be accurately represented in the new metadata format.
The additions to the extras syntax relative to setuptools are defined to
make it easier to express the various possible combinations of dependencies,
in particular those associated with build systems (with optional support
for running the test suite) and development systems.
Support for different kinds of semantic dependencies
----------------------------------------------------
The separation of the five different kinds of dependency through the Extras
system allows a project to optionally indicate whether a dependency is needed
specifically to develop, build, test, document, or use the distribution.
The advantage of having these distinctions supported in the upstream Python
specific metadata is that even if a project doesn't care about these
distinctions themselves, they may be more amenable to patches from
downstream redistributors that separate the fields appropriately. Over time,
this should allow much greater control over where and when particular
dependencies end up being installed.
Support for metadata extensions
-------------------------------
The new extension mechanism effectively allows sections of the metadata
namespace to be delegated to other projects, while preserving a
standard overall metadata format for ease of processing by
distribution tools that do not support a particular extension.
It also works well in combination with the new ``build`` extra
to allow a distribution to depend on tools which *do* know how to handle
the chosen extension, and the new extras mechanism in general, allowing
support for particular extensions to be provided as optional features.
Possible future uses for extensions include declaration of plugins for
other projects and hints for automatic conversion to Linux system
packages.
The ability to declare an extension as required is included primarily to
allow the definition of the metadata hooks extension to be deferred until
some time after the initial adoption of the metadata 2.0 specification. If
a release needs a ``postinstall`` hook to run in order to complete
the installation successfully, then earlier versions of tools should fall
back to installing from source rather than installing from a wheel file and
then failing to run the expected postinstall hook.
Appendix D: Deferred features
=============================
Several potentially useful features have been deliberately deferred in
order to better prioritise our efforts in migrating to the new metadata
standard. These all reflect information that may be nice to have in the
new metadata, but which can be readily added through metadata extensions or
in metadata 2.1 without breaking any use cases already supported by metadata
2.0.
Once the ``pypi``, ``setuptools``, ``pip``, ``wheel`` and ``distlib``
projects support creation and consumption of metadata 2.0, then we may
revisit the creation of metadata 2.1 with some or all of these additional
features.
Standard extensions
-------------------
Some of the information provided by the legacy metadata system has been
moved out to standard extensions defined in :pep:`459`.
This allows publication of the core dependency metadata in a more readily
consumable format to proceed even before the full details of those extensions
have been resolved.
Improved handling of project obsolescence, renames and mergers
--------------------------------------------------------------
Earlier drafts of this PEP included new ``Provides`` and ``Obsoleted-By``
fields for more robust automated notifications and tracking of project
obsolescence, renames and mergers.
This isn't an essential feature of a dependency management system, and has
been deferred indefinitely as a possible future metadata extension.
MIME type registration
----------------------
At some point after acceptance of the PEP, we may submit the
following MIME type registration request to IANA:
* ``application/vnd.python.pydist+json``
It's even possible we may be able to just register the ``vnd.python``
namespace under the banner of the PSF rather than having to register
the individual subformats.
String methods in environment markers
-------------------------------------
Supporting at least ".startswith" and ".endswith" string methods in
environment markers would allow some conditions to be written more
naturally. For example, ``"sys.platform.startswith('win')"`` is a
somewhat more intuitive way to mark Windows specific dependencies:
``"'win' in sys.platform"`` is incorrect thanks to ``cygwin``, and an
equality check against ``win32`` looks more than a little strange given
that 64-bit Windows still reports that value.
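The difference between the two tests is easy to see in ordinary Python. The snippet below is plain Python rather than environment marker syntax, and the list of platform strings is just a sample of values ``sys.platform`` can take.

```python
# A sample of values that ``sys.platform`` can take.
platforms = ["win32", "cygwin", "darwin", "linux"]

# Substring test: also matches cygwin and darwin.
substring_matches = [p for p in platforms if "win" in p]

# Prefix test: matches only native Windows.
prefix_matches = [p for p in platforms if p.startswith("win")]

print(substring_matches)  # ['win32', 'cygwin', 'darwin']
print(prefix_matches)     # ['win32']
```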
Appendix E: Rejected features
=============================
The following features have been explicitly considered and rejected as
introducing too much additional complexity for too small a gain in
expressiveness.
Separate lists for conditional and unconditional dependencies
-------------------------------------------------------------
Earlier versions of this PEP used separate lists for conditional and
unconditional dependencies. This turned out to be annoying to handle in
automated tools and removing it also made the PEP and metadata schema
substantially shorter, suggesting it was actually harder to explain as well.
Separate lists for semantic dependencies
----------------------------------------
Earlier versions of this PEP used separate fields rather than the extras
system for test, build, documentation, and development dependencies. This
turned out to be annoying to handle in automated tools and removing it also
made the PEP and metadata schema substantially shorter, suggesting it was
actually harder to explain as well.
Introducing friction for overly precise dependency declarations
---------------------------------------------------------------
Earlier versions of this PEP attempted to introduce friction into the
inappropriate use of overly strict dependency declarations in published
releases. Discussion on distutils-sig came to the conclusion that this wasn't
a serious enough problem to tackle directly at the interoperability
specification layer, and if it does become a problem in the future,
it would be better tackled at the point where projects are uploaded to
the public Python Package Index.
Disallowing underscores in distribution names
---------------------------------------------
Debian doesn't actually permit underscores in names, but that seems
unduly restrictive for this spec given the common practice of using
valid Python identifiers as Python distribution names. A Debian side
policy of converting underscores to hyphens seems easy enough to
implement (and the requirement to consider hyphens and underscores as
equivalent ensures that doing so won't introduce any name conflicts).
Allowing the use of Unicode in distribution names
-------------------------------------------------
This PEP deliberately avoids following Python 3 down the path of arbitrary
Unicode identifiers, as the security implications of doing so are
substantially worse in the software distribution use case (it opens
up far more interesting attack vectors than mere code obfuscation).
In addition, the existing tools really only work properly if you restrict
names to ASCII and changing that would require a *lot* of work for all
the automated tools in the chain.
It may be reasonable to revisit this question at some point in the (distant)
future, but setting up a more reliable software distribution system is
challenging enough without adding more general Unicode identifier support
into the mix.
Depending on source labels
--------------------------
There is no mechanism to express a dependency on a source label - they
are included in the metadata for internal project reference only. Instead,
dependencies must be expressed in terms of either public versions or else
direct URL references.
Alternative dependencies
------------------------
An earlier draft of this PEP considered allowing lists in place of the
usual strings in dependency specifications to indicate that there are
multiple ways to satisfy a dependency.
If at least one of the individual dependencies was already available, then
the entire dependency would be considered satisfied, otherwise the first
entry would be added to the dependency set.
Alternative dependency specification example::
["Pillow", "PIL"]
["mysql", "psycopg2 >= 4", "sqlite3"]
However, neither of the given examples is particularly compelling,
since Pillow/PIL style forks aren't common, and the database driver use
case would arguably be better served by a SQLAlchemy defined "supported
database driver" metadata extension where a project depends on SQLAlchemy,
and then declares in the extension which database drivers are checked for
compatibility by the upstream project.
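For reference, the rejected semantics amount to something like the sketch below. The function name is hypothetical, ``installed`` is assumed to be a set of distribution names, and version constraints such as ``>= 4`` are ignored when checking availability.

```python
def resolve_alternative(alternatives, installed):
    """Return the requirement to add, or None if already satisfied."""
    # Take the distribution name as the first whitespace-separated token.
    names = [alternative.split()[0] for alternative in alternatives]
    if any(name in installed for name in names):
        return None            # one alternative is already available
    return alternatives[0]     # otherwise fall back to the first entry
```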
Compatible release comparisons in environment markers
-----------------------------------------------------
:pep:`440` defines a rich syntax for version comparisons that could
potentially be useful with ``python_version`` and ``python_full_version``
in environment markers. However, allowing the full syntax would mean
environment markers are no longer a Python subset, while allowing
only some of the comparisons would introduce yet another special case
to handle.
Given that environment markers are only used in cases where a higher level
"or" is implied by the metadata structure, it seems easier to require the
use of multiple comparisons against specific Python versions for the rare
cases where this would be useful.
Conditional provides
--------------------
Under the revised metadata design, conditional "provides" based on runtime
features or the environment would go in a separate "may_provide" field.
However, it isn't clear there's any use case for doing that, so the idea
is rejected unless someone can present a compelling use case (and even then
the idea won't be reconsidered until metadata 2.1 at the earliest).
References
==========
This document specifies version 2.0 of the metadata format.
Version 1.0 is specified in :pep:`241`.
Version 1.1 is specified in :pep:`314`.
Version 1.2 is specified in :pep:`345`.
The initial attempt at a standardised version scheme, along with the
justifications for needing such a standard can be found in :pep:`386`.
* `reStructuredText markup
<https://docutils.sourceforge.io/>`__
.. _Python Package Index: https://pypi.org/
.. _TR39: https://www.unicode.org/reports/tr39/tr39-1.html#Confusable_Detection
Copyright
=========
This document has been placed in the public domain.