python-peps/peps/pep-0639.rst

1017 lines
39 KiB
ReStructuredText

PEP: 639
Title: Improving License Clarity with Better Package Metadata
Author: Philippe Ombredanne <pombredanne@nexb.com>,
C.A.M. Gerlach <CAM.Gerlach@Gerlach.CAM>,
Karolina Surma <karolina.surma@gazeta.pl>,
PEP-Delegate: Brett Cannon <brett@python.org>
Discussions-To: https://discuss.python.org/t/53020
Status: Draft
Type: Standards Track
Topic: Packaging
Created: 15-Aug-2019
Post-History: `15-Aug-2019 <https://discuss.python.org/t/2154>`__,
`17-Dec-2021 <https://discuss.python.org/t/12622>`__,
`10-May-2024 <https://discuss.python.org/t/53020>`__,
.. _639-abstract:
Abstract
========
This PEP defines a specification how licenses are documented in the Python
projects.
To achieve that, it:
- Adopts the `SPDX license expression syntax <639-spdx_>`__ as a
means of expressing the license for a Python project.
- Defines how to include license files within the projects, source and built
distributions.
- Specifies the necessary changes to :term:`Core Metadata` and
the corresponding :term:`Pyproject Metadata key`\s
- Describes the necessary changes to related specifications,
namely the `source distribution (sdist) <sdistspec_>`__,
`built distribution (wheel) <wheelspec_>`__ and
`installed project <installedspec_>`__ standards.
- :ref:`Provides guidance <639-spec-converting-metadata>`
for authors and tools converting legacy license metadata.
This will make license declaration simpler and less ambiguous for
package authors to create, end users to understand,
and tools to programmatically process.
The changes will update the
`Core Metadata specification <coremetadataspec_>`__ to version 2.4.
.. _639-goals:
Goals
=====
This PEP's scope is limited to covering new mechanisms for documenting
the license of a :term:`distribution package`, specifically defining:
- A means of specifying a SPDX :term:`license expression`.
- A method of including license texts in :term:`distribution package`\s
and installed :term:`Project`\s.
The changes that this PEP requires have been designed to minimize impact and
maximize backward compatibility.
.. _639-non-goals:
Non-Goals
=========
This PEP doesn't recommend any particular license to be chosen by any
particular package author.
If projects decide not to use the new fields, no additional restrictions are
imposed by this PEP when uploading to PyPI.
This PEP also is not about license documentation for individual files,
though this is a :ref:`surveyed topic <639-license-doc-source-files>`
in an appendix, nor does it intend to cover cases where the
:term:`source distribution <Source Distribution (or "sdist")>` and
:term:`binary distribution` packages don't have
:ref:`the same licenses <639-rejected-ideas-difference-license-source-binary>`.
.. _639-motivation:
Motivation
==========
Software must be licensed in order for anyone other than its creator to
download, use, share and modify it.
Today, there are multiple fields where licenses
are documented in :term:`Core Metadata`,
and there are limitations to what can be expressed in each of them.
This often leads to confusion both for package authors
and end users, including distribution re-packagers.
This has triggered a number of license-related discussions and issues,
including on `outdated and ambiguous PyPI classifiers <classifierissue_>`__,
`license interoperability with other ecosystems <interopissue_>`__,
`too many confusing license metadata options <packagingissue_>`__,
`limited support for license files in the Wheel project <wheelfiles_>`__, and
`the lack of clear, precise and standardized license metadata <pepissue_>`__.
As a result, on average, Python packages tend to have more ambiguous and
missing license information than other common ecosystems. This is supported by
the `statistics page <cdstats_>`__ of the
`ClearlyDefined project <clearlydefined_>`__, an
`Open Source Initiative <osi_>`__ effort to help
improve licensing clarity of other FOSS projects, covering all packages
from PyPI, Maven, npm and Rubygems.
The current license classifiers could be extended to include the full range of
the SPDX identifiers while deprecating the ambiguous classifiers
(such as ``License :: OSI Approved :: BSD License``).
However, there are multiple arguments against such an approach:
- It requires a great effort to duplicate the SPDX license list and keep it in
sync.
- It is a hard break in backward compatibility, forcing package authors
to update to new classifiers immediately when PyPI deprecates the old ones.
- It only covers packages under a single license;
it doesn't address projects that vendor dependencies (e.g. Setuptools),
offer a choice of licenses (e.g. Packaging) or were relicensed,
adapt code from other projects or contain fonts, images,
examples, binaries or other assets under other licenses.
- It requires both authors and tools understand and implement the PyPI-specific
classifier system.
- It does not provide as clear an indicator that a package
has adopted the new system, and should be treated accordingly.
.. _639-rationale:
Rationale
=========
A survey was conducted to map the existing license metadata
definitions in the :ref:`Python ecosystem <639-license-doc-python>` and a
:ref:`variety of other packaging systems, Linux distributions,
language ecosystems and applications <639-license-doc-other-projects>`.
The takeaways from the survey have guided the recommendations of this PEP:
- SPDX and SPDX-like syntaxes are the most popular :term:`license expression`\s
in many modern package systems.
- Most Free and Open Source Software licenses require package authors to
include their full text in a :term:`Distribution Package`.
Therefore, this PEP introduces two new Core Metadata fields:
- :ref:`License-Expression <639-spec-field-license-expression>` that
provides an unambiguous way to express the license of a package
using SPDX license expressions.
- :ref:`License-File <639-spec-field-license-file>` that
offers a standardized way to include the full text of the license(s)
with the package when distributed,
and allows other tools consuming the :term:`Core Metadata`
to locate a :term:`distribution archive`'s license files.
Furthermore, this specification builds upon
existing practice in the `Setuptools <setuptoolsfiles_>`__ and
`Wheel <wheelfiles_>`__ projects.
An up-to-date version of the current draft of this PEP is
`implemented <hatchimplementation_>`__ in the
`Hatch <hatch_>`__ packaging tool, and an earlier draft of the
:ref:`license files portion <639-spec-field-license-file>`
is `implemented in Setuptools <setuptoolspep639_>`__.
.. _639-terminology:
Terminology
===========
The keywords "MUST", "MUST NOT", "REQUIRED",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL"
in this document are to be interpreted as described in :rfc:`2119`.
.. _639-terminology-license:
License terms
-------------
The license-related terminology draws heavily from the `SPDX Project <spdx_>`__,
particularly :term:`license identifier` and :term:`license expression`.
.. glossary::
license classifier
A `PyPI Trove classifier <classifiers_>`__
(as :ref:`described <packaging:core-metadata-classifier>`
in the :term:`Core Metadata` specification)
which begins with ``License ::``.
license expression
SPDX expression
A string with valid `SPDX license expression syntax <spdxpression_>`__
including one or more SPDX :term:`license identifier`\(s),
which describes a :term:`Project`'s license(s)
and how they inter-relate.
Examples:
``GPL-3.0-or-later``,
``MIT AND (Apache-2.0 OR BSD-2-clause)``
license identifier
SPDX identifier
A valid `SPDX short-form license identifier <spdxid_>`__,
as described in the
:ref:`639-spec-field-license-expression` section of this PEP.
This includes all valid SPDX identifiers and
the strings ``LicenseRef-Public-Domain`` and ``LicenseRef-Proprietary``.
Examples:
``MIT``,
``GPL-3.0-only``
root license directory
license directory
The directory under which license files are stored in a
:term:`project source tree`, :term:`distribution archive`
or :term:`installed project`.
Also, the root directory that their paths
recorded in the :ref:`License-File <639-spec-field-license-file>`
:term:`Core Metadata field` are relative to.
Defined to be the :term:`project root directory`
for a :term:`project source tree` or
:term:`source distribution <Source Distribution (or "sdist")>`;
and a subdirectory named ``licenses`` of
the directory containing the :term:`built metadata`
i.e., the ``.dist-info/licenses`` directory—
for a :term:`Built Distribution` or :term:`installed project`.
.. _639-specification:
Specification
=============
The changes necessary to implement this PEP include:
- additions to :ref:`Core Metadata <639-spec-core-metadata>`,
as defined in the `specification <coremetadataspec_>`__.
- additions to the author-provided
:ref:`project source metadata <639-spec-source-metadata>`,
as defined in the `specification <pyprojecttoml_>`__.
- :ref:`minor additions <639-spec-project-formats>` to the
source distribution (sdist), built distribution (wheel) and installed project
specifications.
- :ref:`guide for tools <639-spec-converting-metadata>`
handling and converting legacy license metadata to license
expressions, to ensure the results are consistent and correct.
Note that the guidance on errors and warnings is for tools' default behavior;
they MAY operate more strictly if users explicitly configure them to do so,
such as by a CLI flag or a configuration option.
.. _639-spdx:
SPDX license expression syntax
------------------------------
This PEP adopts the SPDX license expression syntax as
documented in the `SPDX specification <spdxpression_>`__, either
Version 2.2 or a later compatible version.
A license expression can use the following :term:`license identifier`\s:
- Any SPDX-listed license short-form identifiers that are published in the
`SPDX License List <spdxlist_>`__, version 3.17 or any later compatible
version. Note that the SPDX working group never removes any license
identifiers; instead, they may choose to mark an identifier as "deprecated".
- The ``LicenseRef-Public-Domain`` and ``LicenseRef-Proprietary`` strings, to
identify licenses that are not included in the SPDX license list.
Examples of valid SPDX expressions:
.. code-block:: none
MIT
BSD-3-Clause
MIT AND (Apache-2.0 OR BSD-2-clause)
MIT OR GPL-2.0-or-later OR (FSFUL AND BSD-2-Clause)
GPL-3.0-only WITH Classpath-Exception-2.0 OR BSD-3-Clause
LicenseRef-Public-Domain OR CC0-1.0 OR Unlicense
LicenseRef-Proprietary
Examples of invalid SPDX expressions:
.. code-block:: none
Use-it-after-midnight
Apache-2.0 OR 2-BSD-Clause
.. _639-spec-core-metadata:
Core Metadata
-------------
The error and warning guidance in this section applies to build and
publishing tools; end-user-facing install tools MAY be less strict than
mentioned here when encountering malformed metadata
that does not conform to this specification.
As it adds new fields, this PEP updates the Core Metadata version to 2.4.
.. _639-spec-field-license-expression:
Add ``License-Expression`` field
''''''''''''''''''''''''''''''''
The ``License-Expression`` optional :term:`Core Metadata field`
is specified to contain a text string
that is a valid SPDX :term:`license expression`, as defined by this PEP.
Publishing tools SHOULD issue an informational warning if this field is
missing, and MAY raise an error. Build tools MAY issue a similar warning,
but MUST NOT raise an error.
A license expression is an SPDX expression as :ref:`defined above <639-spdx>`.
When processing the ``License-Expression`` field, build and publishing tools:
- SHOULD halt execution and raise an error if:
- The field does not contain a valid license expression
- One or more license identifiers are not valid
(as :ref:`defined above <639-spdx>`)
- SHOULD report an informational warning, and publishing tools MAY raise an
error, if one or more license identifiers have been marked as deprecated in
the `SPDX License List <spdxlist_>`__.
- MUST store a case-normalized version of the ``License-Expression`` field
using the reference case for each SPDX license identifier and
uppercase for the ``AND``, ``OR`` and ``WITH`` keywords.
- SHOULD report an informational warning, and MAY raise an error if
the normalization process results in changes to the
``License-Expression`` field contents.
For all newly-uploaded :term:`distribution archive`\s
that include a ``License-Expression`` field,
the `Python Package Index (PyPI) <pypi_>`__ MUST
validate that they contain a valid, case-normalized license expression with
valid identifiers (as :ref:`defined above <639-spdx>`)
and MUST reject uploads that do not.
PyPI MAY reject an upload for using a deprecated license identifier,
so long as it was deprecated as of the above-mentioned SPDX License List
version.
.. _639-spec-field-license-file:
Add ``License-File`` field
''''''''''''''''''''''''''
``License-File`` is an optional :term:`Core Metadata field`.
Each instance contains the string
representation of the path of a license-related file. The path is located
within the :term:`project source tree`, relative to the
:term:`project root directory`.
It is a multi-use field that may appear zero or
more times and each instance lists the path to one such file. Files specified
under this field could include license text, author/attribution information,
or other legal notices that need to be distributed with the package.
As :ref:`specified by this PEP <639-spec-project-formats>`, its value
is also that file's path relative to the :term:`root license directory`
in both :term:`installed project`\s
and the standardized :term:`Distribution Package` types.
If a ``License-File`` is listed in a
:term:`Source Distribution <Source Distribution (or "sdist")>` or
:term:`Built Distribution`'s Core Metadata:
- That file MUST be included in the :term:`distribution archive` at the
specified path relative to the root license directory.
- That file MUST be installed with the :term:`project` at that same relative
path.
- The specified relative path MUST be consistent between project source trees,
source distributions (sdists), built distributions (:term:`Wheel`\s) and
installed projects.
- Inside the root license directory, packaging tools MUST reproduce the
directory structure under which the source license files are located
relative to the project root.
- Path delimiters MUST be the forward slash character (``/``),
and parent directory indicators (``..``) MUST NOT be used.
- License file content MUST be UTF-8 encoded text.
Build tools MAY and publishing tools SHOULD produce an informative warning
if a built distribution's metadata contains no ``License-File`` entries,
and publishing tools MAY but build tools MUST NOT raise an error.
For all newly-uploaded :term:`distribution archive`\s that include one or more
``License-File`` fields in their Core Metadata
and declare a ``Metadata-Version`` of ``2.4`` or higher,
PyPI SHOULD validate that all specified files are present in that
:term:`distribution archive`\s,
and MUST reject uploads that do not validate.
.. _639-spec-field-license:
Deprecate ``License`` field
'''''''''''''''''''''''''''
The legacy unstructured-text ``License`` :term:`Core Metadata field`
is deprecated and replaced by the new ``License-Expression`` field.
Build and publishing tools MUST raise an error
if both these fields are present and their values are not identical,
including capitalization and excluding leading and trailing whitespace.
If only the ``License`` field is present, such tools SHOULD issue a warning
informing users it is deprecated and recommending ``License-Expression``
instead.
For all newly-uploaded :term:`distribution archive`\s that include a
``License-Expression`` field, the `Python Package Index (PyPI) <pypi_>`__ MUST
reject any that specify a ``License`` field and the text of which is not
identical to that of ``License-Expression``,
as :ref:`defined here <639-spdx>`.
The ``License`` field may be removed from a new version of the specification
in a future PEP.
.. _639-spec-field-classifier:
Deprecate license classifiers
'''''''''''''''''''''''''''''
Using :term:`license classifier`\s
in the ``Classifier`` :term:`Core Metadata field`
(`described in the Core Metadata specification <coremetadataclassifiers_>`__)
is deprecated and replaced by the more precise ``License-Expression`` field.
If the ``License-Expression`` field is present, build tools SHOULD and
publishing tools MUST raise an error if one or more license classifiers
is included in a ``Classifier`` field, and MUST NOT add
such classifiers themselves.
Otherwise, if this field contains a license classifier, build tools MAY
and publishing tools SHOULD issue a warning informing users such classifiers
are deprecated, and recommending ``License-Expression`` instead.
For compatibility with existing publishing and installation processes,
the presence of license classifiers SHOULD NOT raise an error unless
``License-Expression`` is also provided.
For all newly-uploaded distributions that include a
``License-Expression`` field, the `Python Package Index (PyPI) <pypi_>`__ MUST
reject any that also specify any license classifiers.
New license classifiers MUST NOT be `added to PyPI <classifiersrepo_>`__;
users needing them SHOULD use the ``License-Expression`` field instead.
License classifiers may be removed from a new version of the specification
in a future PEP.
.. _639-spec-source-metadata:
Project source metadata
-----------------------
This PEP specifies changes to the project's source
metadata under a ``[project]`` table in the ``pyproject.toml`` file.
.. _639-spec-key-license-text:
Add string value to ``license`` key
'''''''''''''''''''''''''''''''''''
``license`` key in the ``[project]`` table is defined to contain a top-level
string value. It is a valid SPDX license expression as
:ref:`defined in this PEP <639-spdx>`.
Its value maps to the ``License-Expression`` field in the core metadata.
Build tools SHOULD validate the expression as described in the
:ref:`639-spec-field-license-expression` section,
outputting an error or warning as specified.
When generating the Core Metadata, tools MUST perform case normalization.
If a top-level string value for the ``license`` key is present and valid,
for purposes of backward compatibility
tools MAY back-fill the ``License`` Core Metadata field
with the normalized value of the ``license`` key.
Examples:
.. code-block:: toml
[project]
license = "MIT"
[project]
license = "MIT AND (Apache-2.0 OR BSD-2-clause)"
[project]
license = "MIT OR GPL-2.0-or-later OR (FSFUL AND BSD-2-Clause)"
[project]
license = "LicenseRef-Proprietary"
.. _639-spec-key-license-files:
Add ``license-files`` key
'''''''''''''''''''''''''
A new ``license-files`` key is added to the ``[project]`` table for specifying
paths in the project source tree relative to ``pyproject.toml`` to file(s)
containing licenses and other legal notices to be distributed with the package.
It corresponds to the ``License-File`` fields in the Core Metadata.
Its value is a table, which if present MUST contain one of two optional,
mutually exclusive subkeys, ``paths`` and ``globs``; if both are specified,
tools MUST raise an error. Both are arrays of strings; the ``paths`` subkey
contains verbatim file paths, and the ``globs`` subkey valid glob patterns,
which MUST be parsable by the ``glob`` `module <globmodule_>`__ in the
Python standard library.
Path delimiters MUST be the forward slash character (``/``),
and parent directory indicators (``..``) MUST NOT be used.
Tools MUST assume that license file content is valid UTF-8 encoded text,
and SHOULD validate this and raise an error if it is not.
If the ``paths`` subkey is a non-empty array, build tools:
- MUST treat each value as a verbatim, literal file path, and
MUST NOT treat them as glob patterns.
- MUST include each listed file in all distribution archives.
- MUST NOT match any additional license files beyond those explicitly
statically specified by the user under the ``paths`` subkey.
- MUST list each file path under a ``License-File`` field in the Core Metadata.
- MUST raise an error if one or more paths do not correspond to a valid file
in the project source that can be copied into the distribution archive.
If the ``globs`` subkey is a non-empty array, build tools:
- MUST treat each value as a glob pattern, and MUST raise an error if the
pattern contains invalid glob syntax.
- MUST include all files matched by at least one listed pattern in all
distribution archives.
- MAY exclude files matched by glob patterns that can be unambiguously
determined to be backup, temporary, hidden, OS-generated or VCS-ignored.
- MUST list each matched file path under a ``License-File`` field in the
Core Metadata.
- SHOULD issue a warning and MAY raise an error if no files are matched.
- MAY issue a warning if any individual user-specified pattern
does not match at least one file.
If the ``license-files`` key is present, and the ``paths`` or ``globs`` subkey
is set to a value of an empty array, then tools MUST NOT include any
license files and MUST NOT raise an error.
.. _639-default-patterns:
If the ``license-files`` key is not present and not explicitly marked as
``dynamic``, tools MUST assume a default value of the following:
.. code-block:: toml
license-files.globs = ["LICEN[CS]E*", "COPYING*", "NOTICE*", "AUTHORS*"]
In this case, tools MAY issue a warning if no license files are matched,
but MUST NOT raise an error.
If the ``license-files`` key is marked as ``dynamic`` (and not present),
to preserve consistent behavior with current tools and help ensure the packages
they create are legally distributable, build tools SHOULD default to
including at least the license files matching the above patterns, unless the
user has explicitly specified their own.
Examples of valid license files declaration:
.. code-block:: toml
[project]
license-files = { globs = ["LICEN[CS]E*", "AUTHORS*"] }
[project]
license-files.paths = ["licenses/LICENSE.MIT", "licenses/LICENSE.CC0"]
[project]
license-files = { paths = [] }
[project]
license-files.globs = []
Examples of invalid license files declaration:
.. code-block:: toml
[project]
license-files.globs = ["LICEN[CS]E*", "AUTHORS*"]
license-files.paths = ["LICENSE.MIT"]
Reason: license-files.paths and license-files.globs are mutually exclusive.
.. code-block:: toml
[project]
license-files = { paths = ["..\LICENSE.MIT"] }
Reason: ``..`` must not be used.
``\`` is an invalid path delimiter, ``/`` must be used.
.. code-block:: toml
[project]
license-files = { globs = ["LICEN{CSE*"] }
Reason: "LICEN{CSE*" is not a valid glob.
.. _639-spec-key-license-table:
Deprecate ``license`` key table subkeys
'''''''''''''''''''''''''''''''''''''''
Table values for the ``license`` key in the ``[project]`` table,
including the ``text`` and ``file`` table subkeys, are now deprecated.
If the new ``license-files`` key is present,
build tools MUST raise an error if the ``license`` key is defined
and has a value other than a single top-level string.
If the new ``license-files`` key is not present
and the ``text`` subkey is present in a ``license`` table,
tools SHOULD issue a warning informing users it is deprecated
and recommending a license expression as a top-level string key instead.
Likewise, if the new ``license-files`` key is not present
and the ``file`` subkey is present in the ``license`` table,
tools SHOULD issue a warning informing users it is deprecated and recommending
the ``license-files`` key instead.
If the specified license ``file`` is present in the source tree,
build tools SHOULD use it to fill the ``License-File`` field
in the core metadata, and MUST include the specified file
as if it were specified in a ``license-file.paths`` field.
If the file does not exist at the specified path,
tools MUST raise an informative error as previously specified.
However, tools MUST also still assume the
:ref:`specified default value <639-default-patterns>`
for the ``license-files`` key and also include,
in addition to a license file specified under the ``license.file`` subkey,
any license files that match the specified list of patterns.
Table values for the ``license`` key MAY be removed
from a new version of the specification in a future PEP.
.. _639-spec-project-formats:
License files in project formats
--------------------------------
A few additions will be made to the existing specifications.
:term:`Project source tree`\s
Per :ref:`639-spec-source-metadata` section, the
`Declaring Project Metadata specification <pyprojecttoml_>`__
will be updated to reflect that license file paths MUST be relative to the
project root directory; i.e. the directory containing the ``pyproject.toml``
(or equivalently, other legacy project configuration,
e.g. ``setup.py``, ``setup.cfg``, etc).
:term:`Source distributions (sdists) <Source Distribution (or "sdist")>`
The `sdist specification <sdistspec_>`__ will be updated to reflect that if
the ``Metadata-Version`` is ``2.4`` or greater,
the sdist MUST contain any license files specified by
the :ref:`License-File field <639-spec-field-license-file>`
in the ``PKG-INFO`` at their respective paths
relative to the of the sdist
(containing the ``pyproject.toml`` and the ``PKG-INFO`` Core Metadata).
:term:`Built distribution`\s (:term:`wheel`\s)
The `Wheel specification <wheelspec_>`__ will be updated to reflect that if
the ``Metadata-Version`` is ``2.4`` or greater and one or more
``License-File`` fields is specified, the ``.dist-info`` directory MUST
contain a ``licenses`` subdirectory, which MUST contain the files listed
in the ``License-File`` fields in the ``METADATA`` file at their respective
paths relative to the ``licenses`` directory.
:term:`Installed project`\s
The `Recording Installed Projects specification <installedspec_>`__ will be
updated to reflect that if the ``Metadata-Version`` is ``2.4`` or greater
and one or more ``License-File`` fields is specified, the ``.dist-info``
directory MUST contain a ``licenses`` subdirectory which MUST contain
the files listed in the ``License-File`` fields in the ``METADATA`` file
at their respective paths relative to the ``licenses`` directory,
and that any files in this directory MUST be copied from wheels
by install tools.
.. _639-spec-converting-metadata:
Converting legacy metadata
--------------------------
Tools MUST NOT use the contents of the ``license.text`` ``[project]`` key
(or equivalent tool-specific format),
license classifiers or the value of the Core Metadata ``License`` field
to fill the top-level string value of the ``license`` key
or the Core Metadata ``License-Expression`` field
without informing the user and requiring unambiguous, affirmative user action
to select and confirm the desired license expression value before proceeding.
Tool authors, who need to automatically convert license classifiers to
SPDX identifiers, can use the
:ref:`recommendation <639-spec-mapping-classifiers-identifiers>` prepared by
the PEP authors.
.. _639-backwards-compatibility:
Backwards Compatibility
=======================
Adding a new ``License-Expression`` Core Metadata field and a top-level string
value for the ``license`` key in the ``pyproject.toml`` ``[project]`` table
unambiguously means support for the specification in this PEP. This avoids the
risk of new tooling misinterpreting a license expression as a free-form license
description or vice versa.
The legacy deprecated Core Metadata ``License`` field, ``license`` key table
subkeys (``text`` and ``file``) in the ``pyproject.toml`` ``[project]`` table
and license classifiers retain backwards compatibility. A removal is
left to a future PEP and a new version of the Core Metadata specification.
Specification of the new ``License-File`` Core Metadata field and adding the
files in the distribution codifies the existing practices of many packaging
tools. It is designed to be largely backwards-compatible with their existing
use of that field. The new ``license-files`` key in the ``[project]`` table of
``pyproject.toml`` will only have an effect once users and tools adopt it.
This PEP specifies that license files should be placed in a dedicated
``licenses`` subdir of ``.dist-info`` directory. This is new and ensures that
wheels following this PEP will have differently-located licenses relative to
those produced via the previous installer-specific behavior. This is further
supported by a new metadata version.
This also resolves current issues where license files are accidentally
replaced if they have the same names in different places, making wheels
undistributable without noticing. It also prevents conflicts with other
metadata files in the same directory.
The additions will be made to the source distribution (sdist), built
distribution (wheel) and installed project specifications. They document
behaviors allowed under their current specifications, and gate them behind the
new metadata version.
This PEP proposes PyPI implement validation of the new
``License-Expression`` and ``License-File`` fields, which has no effect on
new and existing packages uploaded unless they explicitly opt in to using
these new fields and fail to follow the specification correctly.
Therefore, this does not have a backward compatibility impact, and guarantees
forward compatibility by ensuring all distributions uploaded to PyPI with the
new fields conform to the specification.
.. _639-security-implications:
Security Implications
=====================
This PEP has no foreseen security implications: the ``License-Expression``
field is a plain string and the ``License-File`` fields are file paths.
Neither introduces any known new security concerns.
.. _639-how-to-teach-this:
How to Teach This
=================
A majority of packages use a single license which makes the case simple:
a single license identifier is a valid license expression.
Users of packaging tools will learn the valid license expression of their
package through the messages issued by the tools when they detect invalid
ones, or when the deprecated ``License`` field or license classifiers are used.
If an invalid ``License-Expression`` is used, an error message will help users
understand they need to use SPDX identifiers. For authors using the
now-deprecated ``License`` field or license classifiers, packaging tools will
warn them and inform them of the modern replacement, ``License-Expression``.
Finally, the users who may not be aware of this PEP will be guided by the
publishing tools toward including ``license`` and ``license-files`` in their
project source metadata.
Tools may also help with the conversion and suggest a license expression in
many common cases:
- The appendix :ref:`639-spec-mapping-classifiers-identifiers` provides
tool authors with recommendation on how to suggest a license expression
produced from legacy classifiers.
- Tools may be able to suggest how to update an existing ``License`` value
in project source metadata and convert that to a license expression,
as also :ref:`specified in this PEP <639-spec-converting-metadata>`.
For instance, a tool may suggest converting a value of ``MIT`` in the
``license.text`` key in ``[project]`` (or the equivalent in tool-specific
formats) to a top-level string value of the ``license`` key (or equivalent).
Likewise, a tool could suggest converting from a ``License`` of ``Apache2``
(which is not a valid license expression as :ref:`defined in this PEP
<639-spdx>`) to a ``License-Expression`` of ``Apache-2.0``.
.. _639-reference-implementation:
Reference Implementation
========================
Tools will need to support parsing and validating license expressions in the
``License-Expression`` field.
The `license-expression library <licenseexplib_>`__ is a reference Python
implementation that handles license expressions including parsing,
formatting and validation, using flexible lists of license symbols
(including SPDX license IDs and any extra identifiers included here).
It is licensed under Apache-2.0 and is already used in several projects,
including the `SPDX Python Tools <spdxpy_>`__,
the `ScanCode toolkit <scancodetk_>`__
and the Free Software Foundation Europe (FSFE) `REUSE project <reuse_>`__.
.. _639-rejected-ideas:
Rejected Ideas
==============
Many alternative ideas were proposed and after a careful consideration,
rejected. The exhaustive list including the rationale for rejecting can be found
in a :ref:`separate page <639-rejected-ideas-details>`.
Open Issues
===========
Should the ``License`` field be back-filled, or mutually exclusive?
-------------------------------------------------------------------
At present, this PEP explicitly allows, but does not require, build tools to
back-fill the ``License`` Core Metadata field with the verbatim text from the
``License-Expression`` field. This would improve backwards compatibility and was
suggested by some on the Discourse thread. On the other hand, allowing it does
increase complexity and is less of a clean separation, preventing the
``License`` field from being mutually exclusive with the new
``License-Expression`` field and requiring that their values match.
As such, it would be useful to have a more concrete rationale and use cases for
the back-filled data in order to come to a final consensus on this matter.
Therefore, is the status quo acceptable, allowing tools to decide this for
themselves? Should this PEP recommend, or even require, that tools back-fill
this metadata (which would presumably be reversed once a breaking revision of
the metadata spec is issued)? Or should this not be explicitly allowed, or even
prohibited?
Should custom license identifiers be allowed?
---------------------------------------------
The current version of this PEP specifies the possibility to use the
custom identifiers ``LicenseRef-Public-Domain`` and ``LicenseRef-Proprietary``
to handle the cases where projects have a license, but there is not a
recognized SPDX license identifier for it. For maximum flexibility, custom
``LicenseRef-<CUSTOM-TEXT>`` license identifiers could be allowed. In some cases
``LicenseRef-Proprietary`` may not be appropriate or specific enough, but
package authors could still want to benefit from the mainstream Python build
tooling.
However, this could increase the confusion about licensing. Custom identifiers
cannot be checked for correctness and users may think they always have to
prepend identifiers with ``LicenseRef``. This would lead to tools producing
invalid metadata. Additionally, this promotes the use of custom license
identifiers, leading to even more ambiguity.
Standards-conforming tools should not be required to allow custom license
identifiers, since they will not recognize or know how to treat them. By
contrast, custom tools, which would be required to understand custom
identifiers, don't have to follow the listed rules for license identifiers. This
specification already allows such use in specific ecosystems, which avoids the
disadvantages of forcing them on all mainstream packaging tools.
As an alternative, a ``LicenseRef-Custom`` identifier could be defined, which
would more explicitly indicate that the license cannot be expressed with
existing identifiers and the license text should be referenced for details,
in cases where ``LicenseRef-Proprietary`` is not appropriate. This would avoid
the main downsides of the approach of allowing an arbitrary ``LicenseRef``,
while addressing several of the potential scenarios cited for it.
On the other hand, as SPDX aims to encompass all FSF-recognized "Free" and
OSI-approved "Open Source" licenses, anything outside those bounds would
generally be covered by ``LicenseRef-Proprietary``, thus making
``LicenseRef-Custom`` somewhat redundant to it. Furthermore, it may mislead
authors of projects with complex/multiple licenses that they should use it over
specifying a license expression.
At present, the PEP retains the existing approach over either of these, since
the benefits
otherwise seem marginal. Not defining this now enables allowing it later (or
even now, with custom packaging tools) without affecting backward compatibility.
This would be problematic, if they were allowed now and later determined to be
unnecessary.
Appendices
==========
A list of auxilliary documents is provided:
- Detailed :ref:`Licensing Examples <639-examples>`,
- :ref:`User Scenarios <639-user-scenarios>`,
- :ref:`License Documentation in Python and Other Projects <639-license-doc-python>`,
- :ref:`Mapping License Classifiers to SPDX Identifiers <639-spec-mapping-classifiers-identifiers>`,
- :ref:`Rejected Ideas <639-rejected-ideas-details>` in detail.
References
==========
.. _cc0: https://creativecommons.org/publicdomain/zero/1.0/
.. _cdstats: https://clearlydefined.io/stats
.. _choosealicense: https://choosealicense.com/
.. _classifierissue: https://github.com/pypa/trove-classifiers/issues/17
.. _classifiers: https://pypi.org/classifiers
.. _classifiersrepo: https://github.com/pypa/trove-classifiers
.. _clearlydefined: https://clearlydefined.io
.. _coremetadataspec: https://packaging.python.org/specifications/core-metadata
.. _coremetadataclassifiers: https://packaging.python.org/en/latest/specifications/core-metadata/#classifier-multiple-use
.. _globmodule: https://docs.python.org/3/library/glob.html
.. _hatch: https://hatch.pypa.io/latest/
.. _hatchimplementation: https://discuss.python.org/t/12622/22
.. _installedspec: https://packaging.python.org/specifications/recording-installed-packages/
.. _interopissue: https://github.com/pypa/interoperability-peps/issues/46
.. _licenseexplib: https://github.com/nexB/license-expression/
.. _osi: https://opensource.org
.. _packagingissue: https://github.com/pypa/packaging-problems/issues/41
.. _pyprojecttoml: https://packaging.python.org/en/latest/specifications/pyproject-toml/
.. _pepissue: https://github.com/pombredanne/spdx-pypi-pep/issues/1
.. _pypi: https://pypi.org/
.. _pypugdistributionpackage: https://packaging.python.org/en/latest/glossary/#term-Distribution-Package
.. _pypugglossary: https://packaging.python.org/glossary/
.. _pypugproject: https://packaging.python.org/en/latest/glossary/#term-Project
.. _reuse: https://reuse.software/
.. _scancodetk: https://github.com/nexB/scancode-toolkit
.. _sdistspec: https://packaging.python.org/specifications/source-distribution-format/
.. _setuptoolsfiles: https://github.com/pypa/setuptools/issues/2739
.. _setuptoolspep639: https://github.com/pypa/setuptools/pull/2645
.. _spdx: https://spdx.dev/
.. _spdxid: https://spdx.dev/ids/
.. _spdxlist: https://spdx.org/licenses/
.. _spdxpression: https://spdx.github.io/spdx-spec/v2.2.2/SPDX-license-expressions/
.. _spdxpy: https://github.com/spdx/tools-python/
.. _spdxversion: https://github.com/pombredanne/spdx-pypi-pep/issues/6
.. _wheelfiles: https://github.com/pypa/wheel/issues/138
.. _wheelproject: https://wheel.readthedocs.io/en/stable/
.. _wheelspec: https://packaging.python.org/specifications/binary-distribution-format/
Acknowledgments
===============
- Alyssa Coghlan
- Kevin P. Fleming
- Pradyun Gedam
- Oleg Grenrus
- Dustin Ingram
- Chris Jerdonek
- Cyril Roelandt
- Luis Villa
- Seth M. Larson
- Ofek Lev
Copyright
=========
This document is placed in the public domain or under the
`CC0-1.0-Universal license <cc0_>`__, whichever is more permissive.