PEP 723: Update based on feedback (#3279)
Co-authored-by: Adam Turner <9087854+aa-turner@users.noreply.github.com>
parent dc8c5c47a9
commit 6adff45cef
323
pep-0723.rst
|
@@ -73,36 +73,62 @@ begun getting frustrated with the lack of unification regarding both tooling
|
|||
and specs. Adding yet another way to define metadata, even for a currently
|
||||
unsatisfied use case, would further fragment the community.
|
||||
|
||||
A use case that this PEP wishes to support that other formats may preclude is
|
||||
a script that desires to transition to a directory-type project. A user may
|
||||
be rapidly prototyping locally or in a remote REPL environment and then decide
|
||||
to transition to a more formal project if their idea works out. This
|
||||
intermediate script stage would be very useful to have fully reproducible bug
|
||||
reports. By using the same metadata format, the user can simply copy and paste
|
||||
the metadata into a ``pyproject.toml`` file and continue working without having
|
||||
to learn a new format. More likely, even, is that tooling will eventually
|
||||
support this transformation with a single command.
|
||||
The following are some of the use cases that this PEP wishes to support:
|
||||
|
||||
* A user-facing CLI that is capable of executing scripts. If we take Hatch as
|
||||
an example, the interface would be simply
|
||||
``hatch run /path/to/script.py [args]`` and Hatch will manage the
|
||||
environment for that script. Such tools could be used as shebang lines on
|
||||
non-Windows systems e.g. ``#!/usr/bin/env hatch run``. You would also be
|
||||
able to enter a shell into that environment like other projects by doing
|
||||
``hatch -p /path/to/script.py shell`` since the project flag would learn
|
||||
that project metadata could be read from a single file.
|
||||
* A script that desires to transition to a directory-type project. A user may
|
||||
be rapidly prototyping locally or in a remote REPL environment and then
|
||||
decide to transition to a more formal project layout if their idea works
|
||||
out. This intermediate script stage would be very useful to have fully
|
||||
reproducible bug reports. By using the same metadata format, the user can
|
||||
simply copy and paste the metadata into a ``pyproject.toml`` file and
|
||||
continue working without having to learn a new format. More likely, even, is
|
||||
that tooling will eventually support this transformation with a single
|
||||
command.
|
||||
* Users that wish to avoid manual dependency management. For example, package
|
||||
managers that have commands to add/remove dependencies or dependency update
|
||||
automation in CI that triggers based on new versions or in response to
|
||||
CVEs [1]_.
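The script-to-project transition from the second use case could be sketched roughly as follows. This is only an illustration: ``export_pyproject`` is a hypothetical helper, not a command of any existing tool, and the regular expression is the one given in the Specification section.

```python
import re
import tempfile
from pathlib import Path

# Regular expression taken from the Specification section of this PEP.
REGEX = r'(?ms)^__pyproject__ *= *"""\\?$(.+?)^"""$'


def export_pyproject(script_path: Path, project_dir: Path) -> Path:
    """Copy a script's embedded metadata into <project_dir>/pyproject.toml.

    Hypothetical helper for illustration only.
    """
    match = re.search(REGEX, script_path.read_text(encoding="utf-8"))
    if match is None:
        raise ValueError("no embedded metadata found")
    project_dir.mkdir(parents=True, exist_ok=True)
    target = project_dir / "pyproject.toml"
    # The captured group starts with the newline that follows the opening quotes.
    target.write_text(match.group(1).lstrip("\n"), encoding="utf-8")
    return target


# Demonstration with a throwaway script in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    script = Path(tmp) / "script.py"
    script.write_text(
        '__pyproject__ = """\n[project]\ndependencies = ["rich"]\n"""\nimport rich\n',
        encoding="utf-8",
    )
    exported = export_pyproject(script, Path(tmp) / "my-project").read_text(
        encoding="utf-8"
    )
```

Because scripts may elide the name and version fields, a real project would likely still need those added to the resulting ``pyproject.toml``.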
|
||||
|
||||
|
||||
Specification
|
||||
=============
|
||||
|
||||
Any Python script may assign a variable named ``__pyproject__`` to a multi-line
|
||||
*double-quoted* string (``"""``) containing a valid TOML document. The opening
|
||||
of the string MUST be on the same line as the assignment. The closing of the
|
||||
string MUST be on a line by itself, and MUST NOT be indented.
|
||||
*double-quoted* string literal (``"""``) containing a valid TOML document. The
|
||||
variable MUST start at the beginning of the line and the opening of the string
|
||||
MUST be on the same line as the assignment. The closing of the string MUST be
|
||||
on a line by itself, and MUST NOT be indented.
|
||||
|
||||
When there are multiple ``__pyproject__`` variables defined, tools MUST produce
|
||||
an error.
|
||||
|
||||
The TOML document MUST NOT contain multi-line double-quoted strings, as that
|
||||
would conflict with the Python string containing the document. Single-quoted
|
||||
multi-line TOML strings may be used instead.
|
||||
|
||||
This is the canonical regular expression that MUST be used to parse the
|
||||
metadata:
|
||||
|
||||
.. code:: text
|
||||
|
||||
(?ms)^__pyproject__ *= *"""\\?$(.+?)^"""$
|
||||
|
||||
In circumstances where there is a discrepancy between the regular expression
|
||||
and the text specification, the regular expression takes precedence.
|
||||
|
||||
Tools reading embedded metadata MAY respect the standard Python encoding
|
||||
declaration. If they choose not to do so, they MUST process the file as UTF-8.
|
||||
|
||||
This document MAY include the ``[project]`` and ``[tool]`` tables but MUST NOT
|
||||
define the ``[build-system]`` table. The ``[build-system]`` table MAY be
|
||||
allowed in a future PEP that standardizes how backends are to build
|
||||
distributions from single file scripts.
|
||||
This document MAY include the ``[project]``, ``[tool]`` and ``[build-system]``
|
||||
tables.
|
||||
|
||||
The ``[project]`` table differs in the following ways:
|
||||
|
||||
|
@@ -110,11 +136,15 @@ The ``[project]`` table differs in the following ways:
|
|||
dynamically by tools if the user does not define them
|
||||
* These fields do not need to be listed in the ``dynamic`` array
|
||||
|
||||
Non-script running tools MAY choose to read from their expected ``[tool]``
|
||||
sub-table. If a single-file script is not the sole input to a tool then
|
||||
behavior SHOULD NOT be altered based on the embedded metadata. For example,
|
||||
if a linter is invoked with the path to a directory, it SHOULD behave the same
|
||||
as if zero files had embedded metadata.
|
||||
Non-script running tools MAY choose to alter their behavior based on
|
||||
configuration that is stored in their expected ``[tool]`` sub-table.
|
||||
|
||||
Build frontends SHOULD NOT use the backend defined in the ``[build-system]``
|
||||
table to build scripts with embedded metadata. This requires a new PEP because
|
||||
the current methods defined in :pep:`517` act upon a directory, not a file.
|
||||
We use ``SHOULD NOT`` instead of ``MUST NOT`` in order to allow tools to
|
||||
experiment [2]_ with such functionality before we standardize (indeed this
|
||||
would be a requirement).
|
||||
|
||||
Example
|
||||
-------
|
||||
|
@@ -180,15 +210,6 @@ raised in the Rust pre-RFC.
|
|||
Reference Implementation
|
||||
========================
|
||||
|
||||
This regular expression may be used to parse the metadata:
|
||||
|
||||
.. code:: text
|
||||
|
||||
(?ms)^__pyproject__ *= *"""\\?$(.+?)^"""$
|
||||
|
||||
In circumstances where there is a discrepancy between the regular expression
|
||||
and the text specification, the text specification takes precedence.
|
||||
|
||||
The following is an example of how to read the metadata on Python 3.11 or
|
||||
higher.
|
||||
|
||||
|
@@ -196,9 +217,16 @@ higher.
|
|||
|
||||
import re, tomllib
|
||||
|
||||
REGEX = r'(?ms)^__pyproject__ *= *"""\\?$(.+?)^"""$'
|
||||
|
||||
def read(script: str) -> dict | None:
|
||||
match = re.search(r'(?ms)^__pyproject__ *= *"""\\?$(.+?)^"""$', script)
|
||||
return tomllib.loads(match.group(1)) if match else None
|
||||
matches = list(re.finditer(REGEX, script))
|
||||
if len(matches) > 1:
|
||||
raise ValueError('Multiple __pyproject__ definitions found')
|
||||
elif len(matches) == 1:
|
||||
return tomllib.loads(matches[0].group(1))
|
||||
else:
|
||||
return None
|
||||
|
||||
Often tools will edit dependencies like package managers or dependency update
|
||||
automation in CI. The following is a crude example of modifying the content
|
||||
|
@@ -218,7 +246,7 @@ using the ``tomlkit`` library.
|
|||
|
||||
Note that this example used a library that preserves TOML formatting. This is
|
||||
not a requirement for editing by any means but rather is a "nice to have"
|
||||
especially since there are unlikely to be embedded comments.
|
||||
feature.
|
||||
|
||||
|
||||
Backwards Compatibility
|
||||
|
@@ -257,8 +285,13 @@ The risk here is part of the functionality of the tool being used to run the
|
|||
script, and as such should already be addressed by the tool itself. The only
|
||||
additional risk introduced by this PEP is if an untrusted script with
|
||||
embedded metadata is run, when a potentially malicious dependency might be
|
||||
installed. This risk is addressed by the normal good practice of reviewing code
|
||||
before running it.
|
||||
installed.
|
||||
|
||||
This risk is addressed by the normal good practice of reviewing code
|
||||
before running it. Additionally, tools may be able to provide locking
|
||||
functionality when configured by their ``[tool]`` sub-table to, for example,
|
||||
add the resolution result as managed metadata somewhere in the script (this
|
||||
is what Go's ``gorun`` can do).
|
||||
|
||||
|
||||
How to Teach This
|
||||
|
@@ -270,9 +303,13 @@ about metadata itself direct users to the living document that describes
|
|||
`project metadata <pyproject metadata_>`_.
|
||||
|
||||
We will document that the name and version fields in the ``[project]`` table
|
||||
may be elided for simplicity. Additionally, we will have guidance (perhaps
|
||||
temporary) explaining that single-file scripts cannot be built into a wheel
|
||||
and therefore you would never see the associated ``[build-system]`` metadata.
|
||||
may be elided for simplicity. Additionally, we will have guidance explaining
|
||||
that single-file scripts cannot (yet) be built into a wheel via standard means.
|
||||
|
||||
We will explain that it is up to individual tools whether or not their behavior
|
||||
is altered based on the embedded metadata. For example, not every script runner
|
||||
will be able to provide an environment for specific Python versions as defined
|
||||
by the ``requires-python`` field.
|
||||
|
||||
Finally, we may want to list some tools that support this PEP's format.
|
||||
|
||||
|
@@ -284,6 +321,47 @@ Tools that support managing different versions of Python should attempt to use
|
|||
the highest available version of Python that is compatible with the script's
|
||||
``requires-python`` metadata, if defined.
|
||||
|
||||
For projects that have large multi-line external metadata to embed like a
|
||||
README file, it is recommended that they become directories with a
|
||||
``pyproject.toml`` file. While this is technically allowed, it is strongly
|
||||
discouraged to have large chunks of multi-line metadata and is indicative
|
||||
of the fact that a script has graduated to a more traditional layout.
|
||||
|
||||
If the content is small, for example in the case of internal packages, it is
|
||||
recommended that multi-line *single-quoted* TOML strings (``'''``) be used.
|
||||
For example:
|
||||
|
||||
.. code:: python
|
||||
|
||||
__pyproject__ = """
|
||||
[project]
|
||||
readme.content-type = "text/markdown"
|
||||
readme.text = '''
|
||||
# Some Project
|
||||
Please refer to our corporate docs
|
||||
for more information.
|
||||
'''
|
||||
"""
|
||||
|
||||
|
||||
Tooling buy-in
|
||||
==============
|
||||
|
||||
The following is a list of tools that have expressed support for this PEP or
|
||||
have committed to implementing support should it be accepted:
|
||||
|
||||
* `Pantsbuild and Pex <https://discuss.python.org/t/31151/15>`__: expressed
|
||||
support for any way to define dependencies and also features that this PEP
|
||||
considers as valid use cases such as building packages from scripts and
|
||||
embedding tool configuration
|
||||
* `Mypy <https://discuss.python.org/t/31151/16>`__ and
|
||||
`Ruff <https://discuss.python.org/t/31151/42>`__: strongly expressed support
|
||||
for embedding tool configuration as it would solve existing pain points for
|
||||
users
|
||||
* `Hatch <https://discuss.python.org/t/31151/53>`__: (author of this PEP)
|
||||
expressed support for all aspects of this PEP, and will be one of the first
|
||||
tools to support running scripts with specifically configured Python versions
|
||||
|
||||
|
||||
Rejected Ideas
|
||||
==============
|
||||
|
@@ -317,6 +395,7 @@ the setup is too complex for the average user like when requiring Nvidia
|
|||
drivers. Situations like this would allow users to proceed with what they want
|
||||
to do whereas otherwise they may stop at that point altogether.
|
||||
|
||||
.. _723-comment-block:
|
||||
|
||||
Why not use a comment block resembling requirements.txt?
|
||||
--------------------------------------------------------
|
||||
|
@@ -411,23 +490,32 @@ small subset of users.
|
|||
Studio Code would be able to provide TOML syntax highlighting much more
|
||||
easily than each writing custom logic for this feature.
|
||||
|
||||
Additionally, the block comment format goes against the recommendation of
|
||||
:pep:`8`:
|
||||
Additionally, since the original block comment alternative format went against
|
||||
the recommendation of :pep:`8` and as a result linters and IDE auto-formatters
|
||||
that respected the recommendation would
|
||||
`fail by default <https://discuss.python.org/t/29905/247>`__, the final
|
||||
proposal uses standard comments starting with a single ``#`` character.
|
||||
|
||||
Each line of a block comment starts with a ``#`` and a single space (unless
|
||||
it is indented text inside the comment). [...] Paragraphs inside a block
|
||||
comment are separated by a line containing a single ``#``.
|
||||
The concept of regular comments that do not appear to be intended for machines
|
||||
(i.e. `encoding declarations`__) affecting behavior would not be customary to
|
||||
users of Python and goes directly against the "explicit is better than
|
||||
implicit" foundational principle.
|
||||
|
||||
Linters and IDE auto-formatters that respect this long-time recommendation
|
||||
would fail by default. The following uses the example from :pep:`722`:
|
||||
__ https://docs.python.org/3/reference/lexical_analysis.html#encoding-declarations
|
||||
|
||||
.. code:: bash
|
||||
|
||||
$ flake8 .
|
||||
.\script.py:3:1: E266 too many leading '#' for block comment
|
||||
.\script.py:4:1: E266 too many leading '#' for block comment
|
||||
.\script.py:5:1: E266 too many leading '#' for block comment
|
||||
Users typing what to them looks like prose could alter runtime behavior. This
|
||||
PEP takes the view that the possibility of that happening, even when a tool
|
||||
has been set up as such (maybe by a sysadmin), is unfriendly to users.
|
||||
|
||||
Finally, and critically, the alternatives to this PEP like :pep:`722` do not
|
||||
satisfy the use cases enumerated herein, such as setting the supported Python
|
||||
versions, the eventual building of scripts into packages, and the ability to
|
||||
have machines edit metadata on behalf of users. It is very likely that the
|
||||
requests for such features will persist, and conceivable that another PEP in the
|
||||
future would allow for the embedding of such metadata. At that point there
|
||||
would be multiple ways to achieve the same thing which goes against our
|
||||
foundational principle of "there should be one - and preferably only one -
|
||||
obvious way to do it".
|
||||
|
||||
Why not consider scripts as projects without wheels?
|
||||
----------------------------------------------------
|
||||
|
@@ -443,13 +531,67 @@ pinning e.g. a lock file with some sort of hash checking. Such projects would
|
|||
never be distributed as a wheel (except for maybe a transient editable one
|
||||
that is created when doing ``pip install -e .``).
|
||||
|
||||
In contrast, scripts are managed loosely by its runner and would almost
|
||||
always have relaxed dependency constraints. Additionally, to reduce
|
||||
friction associated with managing small projects there may be a future
|
||||
in which there is a standard prescribed way to ship projects that are in
|
||||
the form of a single file. The author of the Rust RFC for embedding metadata
|
||||
In contrast, scripts are managed loosely by their runners and would almost
|
||||
always have relaxed dependency constraints. Additionally, there may be a future
|
||||
in which there is `a standard way <723-limit-build-backend_>`_ to ship projects
|
||||
that are in the form of a single file.
|
||||
|
||||
.. _723-limit-build-backend:
|
||||
|
||||
Why not limit build backend behavior?
|
||||
-------------------------------------
|
||||
|
||||
A previous version of this PEP proposed that the ``[build-system]`` table
|
||||
mustn't be defined. The rationale was that builds would never occur so it
|
||||
did not make sense to allow this section.
|
||||
|
||||
We removed that limitation based on
|
||||
`feedback <https://discuss.python.org/t/31151/9>`__ stating that there
|
||||
are already tools that exist in the wild that build wheels and source
|
||||
distributions from single files.
|
||||
|
||||
The author of the Rust RFC for embedding metadata
|
||||
`mentioned to us <https://discuss.python.org/t/29905/179>`__ that they are
|
||||
actively looking into that based on user feedback.
|
||||
actively looking into that as well based on user feedback saying that there
|
||||
is unnecessary friction with managing small projects, which we have also
|
||||
heard in the Python community.
|
||||
|
||||
There has been `a commitment <https://discuss.python.org/t/31151/15>`__ to
|
||||
support this by at least one major build system.
|
||||
|
||||
Why not limit tool behavior?
|
||||
----------------------------
|
||||
|
||||
A previous version of this PEP proposed that non-script running tools SHOULD
|
||||
NOT modify their behavior when the script is not the sole input to the tool.
|
||||
For example, if a linter is invoked with the path to a directory, it SHOULD
|
||||
behave the same as if zero files had embedded metadata.
|
||||
|
||||
This was done as a precaution to avoid tool behavior confusion and generating
|
||||
various feature requests for tools to support this PEP. However, during
|
||||
discussion we received `feedback <https://discuss.python.org/t/31151/16>`__
|
||||
from maintainers of tools that this would be undesirable and potentially
|
||||
confusing to users. Additionally, this may allow for a universally easier
|
||||
way to configure tools in certain circumstances and solve existing issues.
|
||||
|
||||
Why not accept all valid Python expression syntax?
|
||||
--------------------------------------------------
|
||||
|
||||
There has been a suggestion that we should not restrict how the
|
||||
``__pyproject__`` variable is defined and we should parse the abstract syntax
|
||||
tree. For example:
|
||||
|
||||
.. code:: python
|
||||
|
||||
__pyproject__ = (
|
||||
"""
|
||||
[project]
|
||||
dependencies = []
|
||||
"""
|
||||
)
|
||||
|
||||
We will not be doing this so that every language has the possibility to read
|
||||
the metadata without dependence on knowledge of every version of Python.
|
||||
|
||||
Why not just set up a Python project with a ``pyproject.toml``?
|
||||
---------------------------------------------------------------
|
||||
|
@@ -472,6 +614,61 @@ suggestion until the `current discussion on Discourse
|
|||
won't be distributed as wheels is resolved. And even then, it doesn't address
|
||||
the "sending someone a script in a gist or email" use case.
|
||||
|
||||
Why not infer the requirements from import statements?
|
||||
------------------------------------------------------
|
||||
|
||||
The idea would be to automatically recognize ``import`` statements in the source
|
||||
file and turn them into a list of requirements.
|
||||
|
||||
However, this is infeasible for several reasons. First, the points above about
|
||||
the necessity to keep the syntax easily parsable, for all Python versions, also
|
||||
by tools written in other languages, apply equally here.
|
||||
|
||||
Second, PyPI and other package repositories conforming to the Simple Repository
|
||||
API do not provide a mechanism to resolve package names from the module names
|
||||
that are imported (see also `this related discussion`__).
|
||||
|
||||
__ https://discuss.python.org/t/record-the-top-level-names-of-a-wheel-in-metadata/29494
|
||||
|
||||
Third, even if repositories did offer this information, the same import name may
|
||||
correspond to several packages on PyPI. One might object that disambiguating
|
||||
which package is wanted would only be needed if there are several projects
|
||||
providing the same import name. However, this would make it easy for anyone to
|
||||
unintentionally or malevolently break working scripts, by uploading a package to
|
||||
PyPI providing an import name that is the same as an existing project. The
|
||||
alternative where, among the candidates, the first package to have been
|
||||
registered on the index is chosen, would be confusing in case a popular package
|
||||
is developed with the same import name as an existing obscure package, and even
|
||||
harmful if the existing package is malware intentionally uploaded with a
|
||||
sufficiently generic import name that has a high probability of being reused.
|
||||
|
||||
A related idea would be to attach the requirements as comments to the import
|
||||
statements instead of gathering them in a block, with a syntax such as::
|
||||
|
||||
import numpy as np # requires: numpy
|
||||
import rich # requires: rich
|
||||
|
||||
This still suffers from parsing difficulties. Also, where to place the comment
|
||||
in the case of multiline imports is ambiguous and may look ugly::
|
||||
|
||||
from PyQt5.QtWidgets import (
|
||||
QCheckBox, QComboBox, QDialog, QDialogButtonBox,
|
||||
QGridLayout, QLabel, QSpinBox, QTextEdit
|
||||
) # requires: PyQt5
|
||||
|
||||
Furthermore, this syntax cannot behave as might be intuitively expected
|
||||
in all situations. Consider::
|
||||
|
||||
import platform
|
||||
if platform.system() == "Windows":
|
||||
import pywin32 # requires: pywin32
|
||||
|
||||
Here, the user's intent is that the package is only required on Windows, but
|
||||
this cannot be understood by the script runner (the correct way to write
|
||||
it would be ``requires: pywin32 ; sys_platform == 'win32'``).
|
||||
|
||||
(Thanks to Jean Abou-Samra for the clear discussion of this point)
|
||||
|
||||
Why not use a requirements file for dependencies?
|
||||
-------------------------------------------------
|
||||
|
||||
|
@@ -574,6 +771,18 @@ References
|
|||
.. _pyproject without wheels: https://discuss.python.org/t/projects-that-arent-meant-to-generate-a-wheel-and-pyproject-toml/29684
|
||||
|
||||
|
||||
Footnotes
|
||||
=========
|
||||
|
||||
.. [1] A large number of users use scripts that are version controlled. For
|
||||
example, `the SREs that were mentioned <723-comment-block_>`_ or
|
||||
projects that require special maintenance like the
|
||||
`AWS CLI <https://github.com/aws/aws-cli/tree/4393dcdf044a5275000c9c193d1933c07a08fdf1/scripts>`__
|
||||
or `Calibre <https://github.com/kovidgoyal/calibre/tree/master/setup>`__.
|
||||
.. [2] For example, projects like Hatch and Poetry have their own backends
|
||||
and may wish to support this use case only when their backend is used.
|
||||
|
||||
|
||||
Copyright
|
||||
=========
|
||||
|
||||
|
|