PEP: 621 Title: Storing project metadata in pyproject.toml Author: Brett Cannon , Dustin Ingram , Paul Ganssle , Paul Moore , Pradyun Gedam , Sébastien Eustace , Thomas Kluyver , Tzu-Ping Chung Discussions-To: https://discuss.python.org/t/pep-621-storing-project-metadata-in-pyproject-toml/4513 Status: Draft Type: Standards Track Content-Type: text/x-rst Created: 22-Jun-2020 Post-History: 22-Jun-2020 Abstract ======== This PEP specifies how to write a project's `core metadata`_ in a ``pyproject.toml`` file for packaging-related tools to consume. Motivation ========== The key motivators of this PEP are: - Encourage users to specify core metadata statically for speed, ease of specification, deterministic consumption by build back-ends, and ease analysis of source checkouts - Provide a tool-agnostic way of specifying the metadata for ease of learning and transitioning between build back-ends - Allow for more code sharing between build back-ends for the "boring parts" of a project's metadata This PEP does **not** attempt to standardize all possible metadata required by a build back-end, only the metadata covered by the `core metadata`_ specification which are very common across projects and would stand to benefit from being static and consistently specified. This means build back-ends are still free and able to innovate around patterns like how to specify the files to include in a wheel. There is also an included escape hatch for users and build back-ends to use when they choose to partially opt-out of this PEP (compared to opting-out of this PEP entirely, which is also possible). This PEP is also not trying to change the underlying `core metadata`_ in any way. Such considerations should be done in a separate PEP which may lead to changes or additions to what this PEP specifies. Finally, this PEP is meant for users to specify metadata for build back-ends or those doing analysis on a source checkout. Once a build back-end has produced an artifact, then the metadata contained in the artifact that the build back-end produced should be considered canonical and overriding what this PEP specifies. In the eyes of this PEP, a source distribution is considered a build artifact, thus people should not read the metadata specified in this PEP as the canonical metadata in a source distribution. Rationale ========= The design guidelines the authors of this PEP followed were: - Define as much of the `core metadata`_ as reasonable - Define the metadata statically with an escape hatch for those who want to define it dynamically - Use familiar names where it makes sense, but be willing to use more modern terminology - Try to be ergonomic within a TOML file instead of mirroring how tools specify metadata at a low-level - Learn from other build back-ends in the packaging ecosystem which have used TOML for their metadata - Don't try to standardize things which lack a pre-existing standard at a lower-level - *When* metadata is specified using this PEP then it is considered canonical, but that any and all metadata can be considered *optional* (`core metadata`_ has its own requirements of what data must be provided *somehow*) Specification ============= When specifying project metadata, tools MUST adhere and honour the metadata as specified in this PEP. If metadata is improperly specified then tools MUST raise an error to notify the user about their mistake. Details ------- Table name '''''''''' Tools MUST specify fields defined by this PEP in a table named ``[project]``. No tools may add fields to this table which are not defined by this PEP. For tools wishing to store their own settings in ``pyproject.toml``, they may use the ``[tool]`` table as defined in :pep:`518`. The lack of a ``[project]`` table implicitly means the build back-end will dynamically provide all fields. ``name`` '''''''' - Format: string - `Core metadata`_: ``Name`` (`link `__) - Synonyms - Flit_: ``module``/``dist-name`` (`link `__) - Poetry_: ``name`` (`link `__) - Setuptools_: ``name`` (`link `__) The name of the project. Tools MUST require users to statically define this field. Tools SHOULD normalize this name, as specified by :pep:`503`, as soon as it is read for internal consistency. ``version`` ''''''''''' - Format: string - `Core metadata`_: ``Version`` (`link `__) - Synonyms - Flit_: N/A (read from a ``__version__`` attribute) (`link `__) - Poetry_: ``version`` (`link `__) - Setuptools_: ``version`` (`link `__) The version of the project as supported by :pep:`440`. Users SHOULD prefer to specify already-normalized versions. ``description`` ''''''''''''''' - Format: string - `Core metadata`_: ``Summary`` (`link `__) - Synonyms - Flit_: N/A - Poetry_: ``description`` (`link `__) - Setuptools_: ``description`` (`link `__) The summary description of the project. ``readme`` '''''''''' - Format: String or table - `Core metadata`_: ``Description`` (`link `__) - Synonyms - Flit_: ``description-file`` (`link `__) - Poetry_: ``readme`` (`link `__) - Setuptools_: ``long_description`` (`link `__) The full description of the project (i.e. the README). The field accepts either a string or a table. If it is a string then it is the relative path to a text file containing the full description. Tools MUST assume the file's encoding as UTF-8. If the file path ends in a case-insensitive ``.md`` suffix, then tools MUST assume the content-type is ``text/markdown``. If the file path ends in a case-insensitive ``.rst``, then tools MUST assume the content-type is ``text/x-rst``. If a tool recognizes more extensions than this PEP, they MAY infer the content-type for the user without specifying this field as ``dynamic``. For all unrecognized suffixes when a content-type is not provided, tools MUST raise an error. The ``readme`` field may also take a table. The ``file`` key has a string value representing a relative path to a file containing the full description. The ``text`` key has a string value which is the full description. These keys are mutually-exclusive, thus tools MUST raise an error if the metadata specifies both keys. The table also has a ``content-type`` field which takes a string specifying the content-type of the full description. A tool MUST raise an error if the metadata does not specify this field in the table. If the metadata does not specify the ``charset`` parameter, then it is assumed to be UTF-8. Tools MAY support other encodings if they choose to. Tools MAY support alternative content-types which they can transform to a content-type as supported by the `core metadata`_. Otherwise tools MUST raise an error for unsupported content-types. ``requires-python`` ''''''''''''''''''' - Format: string - `Core metadata`_: ``Requires-Python`` (`link `__) - Synonyms - Flit_: ``requires-python`` (`link `__) - Poetry_: As a ``python`` dependency in the ``[tool.poetry.dependencies]`` table (`link `__) - Setuptools_: ``python_requires`` (`link `__) The Python version requirements of the project. Build back-ends MAY try to backfill appropriate ``Programming Language :: Python`` `trove classifiers`_ based on what the user specified for this field. ``license`` ''''''''''' - Format: Table - `Core metadata`_: ``License`` (`link `__) - Synonyms - Flit_: ``license`` (`link `__) - Poetry_: ``license`` (`link `__) - Setuptools_: ``license``, ``license_file``, ``license_files`` (`link `__) The table may have one of two keys. The ``file`` key has a string value that is a relative file path to the file which contains the license for the project. Tools MUST assume the file's encoding is UTF-8. The ``text`` key has a string value which is the license of the project. These keys are mutually exclusive, so a tool MUST raise an error if the metadata specifies both keys. A practical string value for the ``license`` key has been purposefully left out to allow for a future PEP to specify support for SPDX_ expressions. If such support comes to fruition and a tool can unambiguously identify the license specified, then the tool MAY fill in the appropriate trove classifier. ``authors``/``maintainers`` ''''''''''''''''''''''''''' - Format: Array of inline tables with string keys and values - `Core metadata`_: ``Author``/``Author-email``/``Maintainer``/``Maintainer-email`` (`link `__) - Synonyms - Flit_: ``author``/``author-email``/``maintainer``/``maintainer-email`` (`link `__) - Poetry_: ``authors``/``maintainers`` (`link `__) - Setuptools_: ``author``/``author_email``/``maintainer``/``maintainer_email`` (`link `__) The people or organizations considered to be the "authors" of the project. The exact meaning is open to interpretation — it may list the original or primary authors, current maintainers, or owners of the package. The "maintainers" field is similar to "authors" in that its exact meaning is open to interpretation. These fields accept an array of tables with 2 keys: ``name`` and ``email``. Both values must be strings. The ``name`` value MUST be a valid email name (i.e. whatever can be put as a name, before an email, in `RFC #822`_) and not contain commas. The ``email`` value MUST be a valid email address. Both keys are optional. Using the data to fill in `core metadata`_ is as follows: 1. If only ``name`` is provided, the value goes in ``Author``/``Maintainer`` as appropriate. 2. If only ``email`` is provided, the value goes in ``Author-email``/``Maintainer-email`` as appropriate. 3. If both ``email`` and ``name`` are provided, the value goes in ``Author-email``/``Maintainer-email`` as appropriate, with the format ``{name} <{email}>``. ``keywords`` '''''''''''' - Format: array of strings - `Core metadata`_: ``Keywords`` (`link `__) - Synonyms - Flit_: ``keywords`` (`link `__) - Poetry_: ``keywords`` (`link `_) - Setuptools_: ``keywords`` (`link `__) The keywords for the project. ``classifiers`` ''''''''''''''' - Format: array of strings - `Core metadata`_: ``Classifier`` (`link `__) - Synonyms - Flit_: ``classifiers`` (`link `__) - Poetry_: ``classifiers`` (`link `__) - Setuptools_: ``classifiers`` (`link `__) `Trove classifiers`_ which apply to the project. Build back-ends MAY automatically fill in extra trove classifiers if the back-end can deduce the classifiers from the provided metadata. ``urls`` '''''''' - Format: Table, with keys and values of strings - `Core metadata`_: ``Project-URL`` (`link `__) - Synonyms - Flit_: ``[tool.flit.metadata.urls]`` table (`link `__) - Poetry_: ``[tool.poetry.urls]`` table (`link `__) - Setuptools_: ``project_urls`` (`link `__) A table of URLs where the key is the URL label and the value is the URL itself. Entry points '''''''''''' - Format: Table (``[project.scripts]``, ``[project.gui-scripts]``, and ``[project.entry-points]``) - `Core metadata`_: N/A; `Entry point specification `_ - Synonyms - Flit_: ``[tool.flit.scripts]`` table for console scripts, ``[tool.flit.entrypoints]`` for the rest (`link `__) - Poetry_: ``[tool.poetry.scripts]`` table for console scripts (`link `__) - Setuptools_: ``entry_points`` (`link `__) There are three tables related to entry points. The ``[project.scripts]`` table corresponds to ``console_scripts`` group. The key of the table is the name of the entry point and the value is the object reference. The ``[project.gui-scripts]`` table corresponds to the ``gui_scripts`` group. Its format is the same as ``[project.scripts]``. The ``[project.entry-points]`` table is a collection of tables. Each sub-table's name is an entry point group. The key and value semantics are the same as ``[project.scripts]``. Users MUST NOT create nested sub-tables but instead keep the entry point groups to only one level deep. Build back-ends MUST raise an error if the metadata defines a ``[project.entry-points.console_scripts]`` or ``[project.entry-points.gui_scripts]`` table, as they would be ambiguous in the face of ``[project.scripts]`` and ``[project.gui-scripts]``, respectively. ``dependencies``/``optional-dependencies`` '''''''''''''''''''''''''''''''''''''''''' - Format: TBD - `Core metadata`_: ``Requires-Dist`` (`link `__) - Synonyms - Flit_: ``requires`` for required dependencies, ``requires-extra`` for optional dependencies (`link `__) - Poetry_: ``[tool.poetry.dependencies]`` for dependencies (both required and for development), ``[tool.poetry.extras]`` for optional dependencies (`link `__) - Setuptools_: ``install_requires`` for required dependencies, ``extras_require`` for optional dependencies (`link `__) See the open issue on `How to specify dependencies?`_ for a discussion of the options of how to specify a project's dependencies. ``dynamic`` ''''''''''' - Format: Array of strings - `Core metadata`_: N/A - No synonyms Specifies which fields listed by this PEP were intentionally unspecified so another tool can/will provide such metadata dynamically. This clearly delineates which metadata is purposefully unspecified and expected to stay unspecified compared to being provided via tooling later on. - A build back-end MUST honour statically-specified metadata (which means the metadata did not list the field in ``dynamic``). - A build back-end MUST raise an error if the metadata specifies the ``name`` in ``dynamic``. - If the `core metadata`_ specification lists a field as "Required", then the metadata MUST specify the field statically or list it in ``dynamic`` (build back-ends MUST raise an error otherwise, i.e. a required field is in no way listed in a ``pyproject.toml`` file). - If the `core metadata`_ specification lists a field as "Optional", the metadata MAY list it in ``dynamic`` if the expectation is a build back-end will provide the data for the field later. - Build back-ends MUST raise an error if the metadata specifies a field statically as well as being listed in ``dynamic``. - If the metadata does not list a field in ``dynamic``, then a build back-end CANNOT fill in the requisite metadata on behalf of the user (i.e. ``dynamic`` is the only way to allow a tool to fill in metadata and the user must opt into the filling in). - Build back-ends MUST raise an error if the metadata specifies a field in ``dynamic`` but is still unspecified in the final artifact (i.e. the build back-end was unable to provide the data for a field listed in ``dynamic``). Example ------- :: [project] name = "spam" version = "2020.0.0" description = "Lovely Spam! Wonderful Spam!" readme = "README.rst" requires-python = ">=3.8" license = {file = "LICENSE.txt"} keywords = ["egg", "bacon", "sausage", "tomatoes", "Lobster Thermidor"] authors = [ {email = "hi@pradyunsg.me"}, {name = "Tzu-Ping Chung"} ] maintainers = [ {name = "Brett Cannon", email = "brett@python.org"} ] classifiers = [ "Development Status :: 4 - Beta", "Programming Language :: Python" ] # Using 'dependencies' and 'optional-dependencies' as an example # as those fields' format are an Open Issue. dynamic = ["dependencies", "optional-dependencies"] [project.urls] homepage = "example.com" documentation = "readthedocs.org" repository = "github.com" changelog = "github.com/me/spam/blob/master/CHANGELOG.md" [project.scripts] spam-cli = "spam:main_cli" [project.gui-scripts] spam-gui = "spam:main_gui" [project.entry-points."spam.magical"] tomatoes = "spam:main_tomatoes" Backwards Compatibility ======================= As this provides a new way to specify a project's `core metadata`_ and is using a new table name which falls under the reserved namespace as outlined in :pep:`518`, there are no backwards-compatibility concerns. Security Implications ===================== There are no direct security concerns as this PEP covers how to statically define project metadata. Any security issues would stem from how tools consume the metadata and choose to act upon it. How to Teach This ================= [How to teach users, new and experienced, how to apply the PEP to their work.] Reference Implementation ======================== There are currently no proofs-of-concept from any build tools implementing this PEP. Rejected Ideas ============== Other table names ----------------- Anything under ``[build-system]`` ''''''''''''''''''''''''''''''''' There was worry that using this table name would exacerbate confusion between build metadata and project metadata, e.g. by using ``[build-system.metadata]`` as a table. ``[package]`` ''''''''''''' Garnered no strong support. ``[metadata]`` '''''''''''''' The strongest contender after ``[project]``, but in the end it was agreed that ``[project]`` read better for certain sub-tables, e.g. ``[project.urls]``. Support for a metadata provider ------------------------------- Initially there was a proposal to add a middle layer between the static metadata specified by this PEP and ``prepare_metadata_for_build_wheel()`` as specified by :pep:`517`. The idea was that if a project wanted to insert itself between a build back-end and the metadata there would be a hook to do so. In the end the authors considered this idea unnecessarily complicated and would move the PEP away from its design goal to push people to define core metadata statically as much as possible. Require a normalized project name --------------------------------- While it would make things easier for tools to only work with the normalized name as specified in :pep:`503`, the idea was ultimately rejected as it would hurt projects transitioning to using this PEP. Specify files to include when building -------------------------------------- The authors decided fairly quickly during design discussions that this PEP should focus exclusively on project metadata and not build metadata. As such, specifying what files should end up in a source distribution or wheel file is out of scope for this PEP. Name the ``[project.urls]`` table ``[project.project-urls]`` ------------------------------------------------------------ This suggestion came thanks to the corresponding `core metadata`_ being `Project-Url`. But once the overall table name of `[project]` was chosen, the redundant use of the word "project" suggested the current, shorter name was a better fit. Have a separate ``url``/``home-page`` field ------------------------------------------- While the `core metadata`_ supports it, having a single field for a project's URL while also supporting a full table seemed redundant and confusing. Recommend that tools put development-related dependencies into a "dev" extra ---------------------------------------------------------------------------- As various tools have grown the concept of required dependencies versus development dependencies, the idea of suggesting to tools that they put such development tool into a "dev" grouping came up. In the end, though, the authors deemed it out-of-scope for this specification to suggest such a workflow. Have the ``dynamic`` field only require specifying missing required fields -------------------------------------------------------------------------- The authors considered the idea that the ``dynamic`` field would only require the listing of missing required fields and make listing optional fields optional. In the end, though, this went against the design goal of promoting specifying as much information statically as possible. Different structures for the ``readme`` field --------------------------------------------- The ``readme`` field had a proposed ``readme_content_type`` field, but the authors considered the string/table hybrid more practical for the common case while still accommodating the more complex case. Same goes for using``long_description`` and a corresponding ``long_description_content_type`` field. The ``file`` key in the table format was originally proposed as ``path``, but ``file`` corresponds to setuptools' ``file`` key and there is no strong reason otherwise to choose one over the other. Allowing the ``readme`` field to imply ``text/plain`` ----------------------------------------------------- The authors considered allowing for unspecified content-types which would default to ``text/plain``, but decided that it would be best to be explicit in this case to prevent accidental incorrect renderings on PyPI and to force users to be clear in their intent. Other names for ``dependencies``/``optional-dependencies`` ---------------------------------------------------------- The authors originally proposed ``requires``/``extra-requires`` as names, but decided to go with the current names after a survey of other packaging ecosystems showed Python was an outlier: 1. `npm `__ 2. `Rust `__ 3. `Dart `__ 4. `Swift `__ 5. `Ruby `__ Normalizing on the current names helps minimize confusion for people coming from other ecosystems without using terminology that is necessarily foreign to new programmers. It also prevents potential confusion with ``requires`` in the ``[build-system]`` table as specified in :pep:`518`. Support ``Maintainers``/``Maintainers-email`` --------------------------------------------- When discussing how to support ``Authors``/``Authors-email``, the question was brought up as to how exactly authors differed from maintainers. As this was never clearly defined and no one could come up with a good definition, the decision was made to drop the concept of maintainers. Drop ``maintainers`` to unify with ``authors`` ---------------------------------------------- As the difference between ``Authors`` and ``Maintainers`` fields in the `core metadata`_ is unspecified and ambiguous, this PEP originally proposed unifying them as a single ``authors`` field. Other ecosystems have selected "author" as the term to use, so the thinking was to standardize on ``Author`` in the core metadata as the place to list people maintaining a project. In the end, though, the decision to adhere to the core metadata was deemed more important to help with the the acceptance of this PEP, rather than trying to introduce a new interpretation for some of the core metadata. Support an arbitrary depth of tables for ``project.entry-points`` ----------------------------------------------------------------- There was a worry that keeping ``project.entry-points`` to a depth of 1 for sub-tables would cause confusion to users if they use a dotted name and are not used to table names using quotation marks (e.g. ``project.entry-points."spam.magical"``). But supporting an arbitrary depth -- e.g. ``project.entry-points.spam.magical`` -- would preclude any form of an exploded table format in the future. It would also complicate things for build back-ends as they would have to make sure to traverse the full table structure rather than a single level and raising errors as appropriate on value types. Backfilling trove classifiers SHOULD occur instead of MAY happen ---------------------------------------------------------------- Originally this PEP said that tools SHOULD backfill appropriate trove classifiers. This was changed to say it MAY occur to emphasize it was entirely optional for build back-ends to implement. Open Issues =========== How to specify dependencies? ---------------------------- People seem to fall into two camps on how to specify dependencies: using :pep:`508` strings or TOML tables (sometimes referred to as the "exploded table" format due to it being the equivalent of translating a :pep:`508` string into a table format). There is no question as to whether one format or another can fully represent what the other can. This very much comes down to a question of familiarity and (perceived) ease of use. Supporters of :pep:`508` strings believe familiarity is important as the format has been in use for 5 years and in some variant for 15 years (since the introduction of :pep:`345`). This would facilitate transitioning people to using this PEP as there would be one less new concept to learn. Supporters also think the format is reasonably ergonomic and understandable upon first glance, so using a DSL for it is not a major drawback. Supporters of the exploded table format believe it has better ergonomics. Tooling which can validate TOML formats could also help detect errors in a ``pyproject.toml`` file while editing instead of waiting until the user has run a tool in the case of :pep:`508`'s DSL. Supporters also believe it is easier to read and reason (both in general and for first-time users). They also point out that other programming languages have adopted a format more like an exploded table thanks to their use of standardized configuration formats (e.g. `Rust `__, and `Dart `__). The thinking is that an exploded table format would be more familiar to people coming to Python from another programming language. The authors briefly considered supporting both formats, but decided that it would lead to confusion as people would need to be familiar with two formats instead of just one. Copyright ========= This document is placed in the public domain or under the CC0-1.0-Universal license, whichever is more permissive. .. _PyPI: https://pypi.org .. _core metadata: https://packaging.python.org/specifications/core-metadata/ .. _flit: https://flit.readthedocs.io/ .. _poetry: https://python-poetry.org/ .. _setuptools: https://setuptools.readthedocs.io/ .. _setuptools metadata: https://setuptools.readthedocs.io/en/latest/setuptools.html#metadata .. _survey of tools: https://github.com/uranusjr/packaging-metadata-comparisons .. _trove classifiers: https://pypi.org/classifiers/ .. _SPDX: https://spdx.dev/ .. _RFC #822: https://tools.ietf.org/html/rfc822 .. Local Variables: mode: indented-text indent-tabs-mode: nil sentence-end-double-space: t fill-column: 70 coding: utf-8 End: