diff --git a/peps/pep-0639.rst b/peps/pep-0639.rst index 2b3707fa3..eb9e941aa 100644 --- a/peps/pep-0639.rst +++ b/peps/pep-0639.rst @@ -33,14 +33,11 @@ To achieve that, it: - Specifies the necessary changes to :term:`Core Metadata` and the corresponding :term:`Pyproject Metadata key`\s -- Describes the necessary changes to related specifications, - namely the `source distribution (sdist) `__, +- Describes the necessary changes to + the `source distribution (sdist) `__, `built distribution (wheel) `__ and `installed project `__ standards. -- :ref:`Provides guidance <639-spec-converting-metadata>` - for authors and tools converting legacy license metadata. - This will make license declaration simpler and less ambiguous for package authors to create, end users to understand, and tools to programmatically process. @@ -101,7 +98,7 @@ including on `outdated and ambiguous PyPI classifiers `__, `license interoperability with other ecosystems `__, `too many confusing license metadata options `__, `limited support for license files in the Wheel project `__, and -`the lack of clear, precise and standardized license metadata `__. +`the lack of precise license metadata `__. As a result, on average, Python packages tend to have more ambiguous and missing license information than other common ecosystems. This is supported by @@ -218,10 +215,12 @@ particularly :term:`license identifier` and :term:`license expression`. as described in the :ref:`639-spec-field-license-expression` section of this PEP. This includes all valid SPDX identifiers and - the strings ``LicenseRef-Public-Domain`` and ``LicenseRef-Proprietary``. + the custom ``LicenseRef-[idstring]`` strings conforming to the + `SPDX specification, clause 10.1 `__. 
Examples: ``MIT``, - ``GPL-3.0-only`` + ``GPL-3.0-only``, + ``LicenseRef-My-Custom-License`` root license directory license directory @@ -254,7 +253,7 @@ The changes necessary to implement this PEP include: :ref:`project source metadata <639-spec-source-metadata>`, as defined in the `specification `__. -- :ref:`minor additions <639-spec-project-formats>` to the +- :ref:`additions <639-spec-project-formats>` to the source distribution (sdist), built distribution (wheel) and installed project specifications. @@ -283,8 +282,11 @@ A license expression can use the following :term:`license identifier`\s: version. Note that the SPDX working group never removes any license identifiers; instead, they may choose to mark an identifier as "deprecated". -- The ``LicenseRef-Public-Domain`` and ``LicenseRef-Proprietary`` strings, to - identify licenses that are not included in the SPDX license list. +- The custom ``LicenseRef-[idstring]`` string(s), where + ``[idstring]`` is a unique string containing letters, numbers, + ``.`` and/or ``-``, to identify licenses that are not included in the SPDX + license list. The custom identifiers must follow the SPDX specification, + `clause 10.1 `__ of the given specification version. Examples of valid SPDX expressions: @@ -293,10 +295,10 @@ Examples of valid SPDX expressions: MIT BSD-3-Clause - MIT AND (Apache-2.0 OR BSD-2-clause) + MIT AND (Apache-2.0 OR BSD-2-Clause) MIT OR GPL-2.0-or-later OR (FSFUL AND BSD-2-Clause) GPL-3.0-only WITH Classpath-Exception-2.0 OR BSD-3-Clause - LicenseRef-Public-Domain OR CC0-1.0 OR Unlicense + LicenseRef-Special-License OR CC0-1.0 OR Unlicense LicenseRef-Proprietary @@ -306,6 +308,8 @@ Examples of invalid SPDX expressions: Use-it-after-midnight Apache-2.0 OR 2-BSD-Clause + LicenseRef-License with spaces + LicenseRef-License_with_underscores .. 
_639-spec-core-metadata: @@ -328,34 +332,21 @@ Add ``License-Expression`` field The ``License-Expression`` optional :term:`Core Metadata field` is specified to contain a text string -that is a valid SPDX :term:`license expression`, as defined by this PEP. +that is a valid SPDX :term:`license expression`, +as :ref:`defined above <639-spdx>`. -Publishing tools SHOULD issue an informational warning if this field is -missing, and MAY raise an error. Build tools MAY issue a similar warning, -but MUST NOT raise an error. - -A license expression is an SPDX expression as :ref:`defined above <639-spdx>`. - -When processing the ``License-Expression`` field, build and publishing tools: - -- SHOULD halt execution and raise an error if: - - - The field does not contain a valid license expression - - - One or more license identifiers are not valid - (as :ref:`defined above <639-spdx>`) - -- SHOULD report an informational warning, and publishing tools MAY raise an - error, if one or more license identifiers have been marked as deprecated in - the `SPDX License List `__. - -- MUST store a case-normalized version of the ``License-Expression`` field - using the reference case for each SPDX license identifier and - uppercase for the ``AND``, ``OR`` and ``WITH`` keywords. - -- SHOULD report an informational warning, and MAY raise an error if - the normalization process results in changes to the - ``License-Expression`` field contents. +Build and publishing tools SHOULD +check that the ``License-Expression`` field contains a valid SPDX expression, +including the validity of the particular license identifiers +(as :ref:`defined above <639-spdx>`). +Tools MAY halt execution and raise an error when an invalid expression is found. +If tools choose to validate the SPDX expression, they also SHOULD +store a case-normalized version of the ``License-Expression`` +field using the reference case for each SPDX license identifier and uppercase +for the ``AND``, ``OR`` and ``WITH`` keywords. 
+Tools SHOULD report a warning and publishing tools MAY raise an error +if one or more license identifiers +have been marked as deprecated in the `SPDX License List `__. For all newly-uploaded :term:`distribution archive`\s that include a ``License-Expression`` field, @@ -363,6 +354,8 @@ the `Python Package Index (PyPI) `__ MUST validate that they contain a valid, case-normalized license expression with valid identifiers (as :ref:`defined above <639-spdx>`) and MUST reject uploads that do not. +Custom license identifiers which conform to the SPDX specification +are considered valid. PyPI MAY reject an upload for using a deprecated license identifier, so long as it was deprecated as of the above-mentioned SPDX License List version. @@ -430,19 +423,19 @@ Deprecate ``License`` field The legacy unstructured-text ``License`` :term:`Core Metadata field` is deprecated and replaced by the new ``License-Expression`` field. -Build and publishing tools MUST raise an error -if both these fields are present and their values are not identical, -including capitalization and excluding leading and trailing whitespace. +The fields are mutually exclusive. +Tools which generate Core Metadata MUST NOT create both these fields. +Tools which read Core Metadata, when dealing with both these fields present +at the same time, MUST read the value of ``License-Expression`` and MUST +disregard the value of the ``License`` field. -If only the ``License`` field is present, such tools SHOULD issue a warning +If only the ``License`` field is present, tools MAY issue a warning informing users it is deprecated and recommending ``License-Expression`` instead. For all newly-uploaded :term:`distribution archive`\s that include a ``License-Expression`` field, the `Python Package Index (PyPI) `__ MUST -reject any that specify a ``License`` field and the text of which is not -identical to that of ``License-Expression``, -as :ref:`defined here <639-spdx>`. 
+reject any that specify both ``License`` and ``License-Expression`` fields. The ``License`` field may be removed from a new version of the specification in a future PEP. @@ -499,15 +492,10 @@ string value. It is a valid SPDX license expression as :ref:`defined in this PEP <639-spdx>`. Its value maps to the ``License-Expression`` field in the core metadata. -Build tools SHOULD validate the expression as described in the +Build tools SHOULD validate and perform case normalization of the expression +as described in the :ref:`639-spec-field-license-expression` section, outputting an error or warning as specified. -When generating the Core Metadata, tools MUST perform case normalization. - -If a top-level string value for the ``license`` key is present and valid, -for purposes of backward compatibility -tools MAY back-fill the ``License`` Core Metadata field -with the normalized value of the ``license`` key. Examples: @@ -815,13 +803,14 @@ Users of packaging tools will learn the valid license expression of their package through the messages issued by the tools when they detect invalid ones, or when the deprecated ``License`` field or license classifiers are used. -If an invalid ``License-Expression`` is used, an error message will help users -understand they need to use SPDX identifiers. For authors using the -now-deprecated ``License`` field or license classifiers, packaging tools will -warn them and inform them of the modern replacement, ``License-Expression``. -Finally, the users who may not be aware of this PEP will be guided by the -publishing tools toward including ``license`` and ``license-files`` in their -project source metadata. +If an invalid ``License-Expression`` is used, the users will not be able +to publish their package to PyPI and an error message will help them +understand they need to use SPDX identifiers. 
+It will be possible to generate a distribution with incorrect license metadata, +but not to publish one on PyPI or any other index server that enforces ``License-Expression`` validity. +For authors using the now-deprecated ``License`` field or license classifiers, +packaging tools may warn them and inform them of the replacement, +``License-Expression``. Tools may also help with the conversion and suggest a license expression in many common cases: @@ -833,13 +822,6 @@ many common cases: - Tools may be able to suggest how to update an existing ``License`` value in project source metadata and convert that to a license expression, as also :ref:`specified in this PEP <639-spec-converting-metadata>`. - For instance, a tool may suggest converting a value of ``MIT`` in the - ``license.text`` key in ``[project]`` (or the equivalent in tool-specific - formats) to a top-level string value of the ``license`` key (or equivalent). - Likewise, a tool could suggest converting from a ``License`` of ``Apache2`` - (which is not a valid license expression as :ref:`defined in this PEP - <639-spdx>`) to a ``License-Expression`` of ``Apache-2.0``. - .. _639-reference-implementation: @@ -847,16 +829,13 @@ Reference Implementation ======================== Tools will need to support parsing and validating license expressions in the -``License-Expression`` field. - -The `license-expression library `__ is a reference Python -implementation that handles license expressions including parsing, -formatting and validation, using flexible lists of license symbols -(including SPDX license IDs and any extra identifiers included here). -It is licensed under Apache-2.0 and is already used in several projects, -including the `SPDX Python Tools `__, -the `ScanCode toolkit `__ -and the Free Software Foundation Europe (FSFE) `REUSE project `__. +``License-Expression`` field if they decide to implement this part of the +specification. 
+It's up to the tools whether they prefer to implement the validation on their +side (like `hatch `__) or use one of the available +Python libraries (e.g. `license-expression `__). +This PEP does not mandate using any specific library and leaves it to the +tool authors to choose the best implementation for their projects. .. _639-rejected-ideas: @@ -869,77 +848,6 @@ rejected. The exhaustive list including the rationale for rejecting can be found in a :ref:`separate page <639-rejected-ideas-details>`. -Open Issues -=========== - -Should the ``License`` field be back-filled, or mutually exclusive? -------------------------------------------------------------------- - -At present, this PEP explicitly allows, but does not require, build tools to -back-fill the ``License`` Core Metadata field with the verbatim text from the -``License-Expression`` field. This would improve backwards compatibility and was -suggested by some on the Discourse thread. On the other hand, allowing it does -increase complexity and is less of a clean separation, preventing the -``License`` field from being mutually exclusive with the new -``License-Expression`` field and requiring that their values match. - -As such, it would be useful to have a more concrete rationale and use cases for -the back-filled data in order to come to a final consensus on this matter. - -Therefore, is the status quo acceptable, allowing tools to decide this for -themselves? Should this PEP recommend, or even require, that tools back-fill -this metadata (which would presumably be reversed once a breaking revision of -the metadata spec is issued)? Or should this not be explicitly allowed, or even -prohibited? - - -Should custom license identifiers be allowed? 
---------------------------------------------- - -The current version of this PEP specifies the possibility to use the -custom identifiers ``LicenseRef-Public-Domain`` and ``LicenseRef-Proprietary`` -to handle the cases where projects have a license, but there is not a -recognized SPDX license identifier for it. For maximum flexibility, custom -``LicenseRef-`` license identifiers could be allowed. In some cases -``LicenseRef-Proprietary`` may not be appropriate or specific enough, but -package authors could still want to benefit from the mainstream Python build -tooling. - -However, this could increase the confusion about licensing. Custom identifiers -cannot be checked for correctness and users may think they always have to -prepend identifiers with ``LicenseRef``. This would lead to tools producing -invalid metadata. Additionally, this promotes the use of custom license -identifiers, leading to even more ambiguity. - -Standards-conforming tools should not be required to allow custom license -identifiers, since they will not recognize or know how to treat them. By -contrast, custom tools, which would be required to understand custom -identifiers, don't have to follow the listed rules for license identifiers. This -specification already allows such use in specific ecosystems, which avoids the -disadvantages of forcing them on all mainstream packaging tools. - -As an alternative, a ``LicenseRef-Custom`` identifier could be defined, which -would more explicitly indicate that the license cannot be expressed with -existing identifiers and the license text should be referenced for details, -in cases where ``LicenseRef-Proprietary`` is not appropriate. This would avoid -the main downsides of the approach of allowing an arbitrary ``LicenseRef``, -while addressing several of the potential scenarios cited for it. 
- -On the other hand, as SPDX aims to encompass all FSF-recognized "Free" and -OSI-approved "Open Source" licenses, anything outside those bounds would -generally be covered by ``LicenseRef-Proprietary``, thus making -``LicenseRef-Custom`` somewhat redundant to it. Furthermore, it may mislead -authors of projects with complex/multiple licenses that they should use it over -specifying a license expression. - -At present, the PEP retains the existing approach over either of these, since -the benefits -otherwise seem marginal. Not defining this now enables allowing it later (or -even now, with custom packaging tools) without affecting backward compatibility. -This would be problematic, if they were allowed now and later determined to be -unnecessary. - - Appendices ========== @@ -967,6 +875,7 @@ References .. _globmodule: https://docs.python.org/3/library/glob.html .. _hatch: https://hatch.pypa.io/latest/ .. _hatchimplementation: https://discuss.python.org/t/12622/22 +.. _hatchparseimpl: https://github.com/pypa/hatch/blob/hatchling-v1.24.2/backend/src/hatchling/licenses/parse.py#L8-L18 .. _installedspec: https://packaging.python.org/specifications/recording-installed-packages/ .. _interopissue: https://github.com/pypa/interoperability-peps/issues/46 .. _licenseexplib: https://github.com/nexB/license-expression/ @@ -978,16 +887,14 @@ References .. _pypugdistributionpackage: https://packaging.python.org/en/latest/glossary/#term-Distribution-Package .. _pypugglossary: https://packaging.python.org/glossary/ .. _pypugproject: https://packaging.python.org/en/latest/glossary/#term-Project -.. _reuse: https://reuse.software/ -.. _scancodetk: https://github.com/nexB/scancode-toolkit .. _sdistspec: https://packaging.python.org/specifications/source-distribution-format/ .. _setuptoolsfiles: https://github.com/pypa/setuptools/issues/2739 .. _setuptoolspep639: https://github.com/pypa/setuptools/pull/2645 .. _spdx: https://spdx.dev/ +.. 
_spdxcustom: https://spdx.github.io/spdx-spec/v2.2.2/other-licensing-information-detected/ .. _spdxid: https://spdx.dev/ids/ .. _spdxlist: https://spdx.org/licenses/ .. _spdxpression: https://spdx.github.io/spdx-spec/v2.2.2/SPDX-license-expressions/ -.. _spdxpy: https://github.com/spdx/tools-python/ .. _spdxversion: https://github.com/pombredanne/spdx-pypi-pep/issues/6 .. _wheelfiles: https://github.com/pypa/wheel/issues/138 .. _wheelproject: https://wheel.readthedocs.io/en/stable/ diff --git a/peps/pep-0639/appendix-rejected-ideas.rst b/peps/pep-0639/appendix-rejected-ideas.rst index 91f9b58cd..990456c47 100644 --- a/peps/pep-0639/appendix-rejected-ideas.rst +++ b/peps/pep-0639/appendix-rejected-ideas.rst @@ -300,37 +300,6 @@ Therefore, for these reasons, we reject this here in favor of the reserved string value of the ``license`` key. -Must be marked dynamic to back-fill -''''''''''''''''''''''''''''''''''' - -The ``license`` key in the ``pyproject.toml`` could be required to be -explicitly set to dynamic in order for the ``License`` Core Metadata field -to be automatically back-filled from -the top-level string value of the ``license`` key. -This would be more explicit that the filling will be done, -as strictly speaking the ``license`` key is not (and cannot be) specified in -``pyproject.toml``, and satisfies a stricter interpretation of the letter -of the previous :pep:`621` specification that PEP 639 revises. - -However, this doesn't seem to be necessary, because it is simply using the -static, literal value of the ``license`` key, as specified -strictly in PEP 639. Therefore, any conforming tool can -deterministically derive this using only the static data -in the ``pyproject.toml`` file itself. 
- -Furthermore, this actually adds significant ambiguity, as it means the value -could get filled arbitrarily by other tools, which would in turn compromise -and conflict with the value of the new ``License-Expression`` field, which is -why such is explicitly prohibited by PEP 639. Therefore, not marking it as -``dynamic`` will ensure it is only handled in accordance with PEP 639's -requirements. - -Finally, users explicitly being told to mark it as ``dynamic``, or not, to -control filling behavior seems to be a bit of a misuse of the ``dynamic`` -field as apparently intended, and prevents tools from adapting to best -practices (fill, don't fill, etc.) as they develop and evolve over time. - - Source metadata ``license-files`` key ------------------------------------- @@ -738,6 +707,32 @@ tools to immediate take advantage of improvements and accept new licenses balancing flexibility and compatibility. +Don't allow custom license identifiers +'''''''''''''''''''''''''''''''''''''' + +A previous draft of this PEP specified the possibility to use only two +custom identifiers: ``LicenseRef-Public-Domain`` and ``LicenseRef-Proprietary`` +to handle the cases where projects have a license, but there is not a +recognized SPDX license identifier for it. +The custom identifiers cannot be checked for correctness and users may think +they always have to prepend identifiers with ``LicenseRef``. +This would lead to tools producing invalid metadata. + +However, Python packages are produced in many open and closed +environments, +where it may be impossible to declare the license using only the small subset +of the allowed custom identifiers and where, for various reasons, +it's not possible to add the license to the SPDX license list. + +The custom license identifiers are explicitly allowed and described in the +official SPDX specification and they can be syntactically validated although +not case-normalized. 
+ +Therefore, with acknowledgement that the custom identifiers can't be fully +validated and may contain mistakes, it was decided to allow +them in line with the official SPDX specification. + + .. _639-rejected-ideas-difference-license-source-binary: Different licenses for source and binary distributions
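Reviewer note on the custom identifier change: the diff states that ``LicenseRef-[idstring]`` identifiers can be syntactically validated, with ``[idstring]`` made of letters, numbers, ``.`` and/or ``-``. The sketch below is one possible way a tool could perform that check. It is not taken from hatch or the license-expression library. The function name ``is_valid_license_ref`` is hypothetical, and treating the ``LicenseRef-`` prefix as case-sensitive is an assumption, since the text says custom identifiers are not case-normalized.

```python
import re

# "LicenseRef-" followed by one or more letters, digits, "." or "-",
# mirroring the description of [idstring] in the diff above.
# Assumption: the prefix is matched case-sensitively.
_LICENSE_REF = re.compile(r"LicenseRef-[A-Za-z0-9.\-]+")


def is_valid_license_ref(identifier: str) -> bool:
    """Return True if `identifier` is a syntactically valid custom identifier.

    Hypothetical helper: it checks syntax only and cannot tell whether the
    identifier refers to a real license text.
    """
    return _LICENSE_REF.fullmatch(identifier) is not None


if __name__ == "__main__":
    # Examples taken from the valid and invalid lists in the diff.
    for candidate in (
        "LicenseRef-My-Custom-License",
        "LicenseRef-License with spaces",
        "LicenseRef-License_with_underscores",
    ):
        print(candidate, is_valid_license_ref(candidate))
```

The check covers only the ``LicenseRef-[idstring]`` form described in this change; validating a full license expression (operators, parentheses, SPDX list identifiers) would need a real parser.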