PEP 723: Modify the spec based on post-acceptance discussions (gh-3601)

* PEP 723: Modify the spec based on post-acceptance discussions
Paul Moore 2023-12-22 22:50:04 +00:00 committed by GitHub
parent 9779c7efe6
commit 5f200941a3
1 changed file with 45 additions and 88 deletions

@ -12,6 +12,7 @@ Created: 04-Aug-2023
Post-History: `04-Aug-2023 <https://discuss.python.org/t/30979>`__,
`06-Aug-2023 <https://discuss.python.org/t/31151>`__,
`23-Aug-2023 <https://discuss.python.org/t/32149>`__,
`06-Dec-2023 <https://discuss.python.org/t/40418>`__,
Replaces: 722
Resolution: https://discuss.python.org/t/36763
@ -55,23 +56,12 @@ Rationale
This PEP defines a mechanism for embedding metadata *within the script itself*,
and not in an external file.
We choose to follow the latest developments of other modern packaging
ecosystems (namely `Go`__ and provisionally `Rust`__) by embedding the existing
file used to describe projects, in our case ``pyproject.toml``.
__ https://github.com/erning/gorun
__ https://rust-lang.github.io/rfcs/3424-cargo-script.html
The format is intended to bridge the gap between different types of users
of Python. Users will benefit from seamless interoperability with tools that
already work with TOML.
One of the central themes we discovered from the recent
`packaging survey <https://discuss.python.org/t/22420>`__ is that users have
begun getting frustrated with the lack of unification regarding both tooling
and specs. Adding yet another metadata format (like :pep:`722` syntax for a
list of dependencies), even for a currently unsatisfied use case, would
further fragment the community.
The metadata format is designed to be similar to the layout of data in the
``pyproject.toml`` file of a Python project directory, to provide a familiar
experience for users who have experience writing Python projects. By using a
similar format, we avoid unnecessary inconsistency between packaging tools,
a common frustration expressed by users in the recent
`packaging survey <https://discuss.python.org/t/22420>`__.
The following are some of the use cases that this PEP wishes to support:
@ -83,11 +73,8 @@ The following are some of the use cases that this PEP wishes to support:
* A script that desires to transition to a directory-type project. A user may
be rapidly prototyping locally or in a remote REPL environment and then
decide to transition to a more formal project layout if their idea works
out. This intermediate script stage would be very useful to have fully
reproducible bug reports. By using the same format, the user can simply copy
and paste the metadata into a ``pyproject.toml`` file and continue working
without having to learn a new format. More likely, even, is that tooling will
eventually support this transformation with a single command.
out. Being able to define dependencies in the script would be very useful to
have fully reproducible bug reports.
* Users that wish to avoid manual dependency management. For example, package
managers that have commands to add/remove dependencies or dependency update
automation in CI that triggers based on new versions or in response to
@ -155,18 +142,20 @@ and the regular expression, the text specification takes precedence.
Tools MUST NOT read from metadata blocks with types that have not been
standardized by this PEP or future ones.
pyproject type
--------------
script type
-----------
The first type of metadata block is named ``pyproject`` which represents
content similar to [3]_ what one would see in a ``pyproject.toml`` file.
The first type of metadata block is named ``script`` which contains script
metadata (dependency data and tool configuration).
This document MAY include the ``[run]`` and ``[tool]`` tables.
This document MAY include top-level fields ``dependencies`` and ``requires-python``,
and MAY optionally include a ``[tool]`` table.
The :pep:`tool table <518#tool-table>` MAY be used by any tool, script runner
or otherwise, to configure behavior.
The ``[tool]`` table MAY be used by any tool, script runner or otherwise, to configure
behavior. It has the same semantics as the :pep:`tool table <518#tool-table>` in
``pyproject.toml``.
The ``[run]`` table MAY include the following optional fields:
The top-level fields are:
* ``dependencies``: A list of strings that specifies the runtime dependencies
of the script. Each entry MUST be a valid :pep:`508` dependency.
@ -174,11 +163,6 @@ The ``[run]`` table MAY include the following optional fields:
the script is compatible. The value of this field MUST be a valid
:pep:`version specifier <440#version-specifiers>`.
Any future PEPs that define additional fields for the ``[run]`` table when used
in a ``pyproject.toml`` file MUST include the aforementioned fields exactly as
specified. The fields defined by this PEP are equally as applicable to
full-fledged projects as they are to single-file scripts.
Script runners MUST error if the specified ``dependencies`` cannot be provided.
Script runners SHOULD error if no version of Python that satisfies the specified
``requires-python`` can be provided.
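As an illustration of the layout described in this hunk, the following is a minimal sketch of a shape check that a tool might run on already-parsed ``script`` metadata. The function name ``check_script_metadata`` and its messages are hypothetical and not part of the PEP. It checks only the types of the two top-level fields and the ``[tool]`` table; it does not validate :pep:`508` or version specifier syntax.

```python
def check_script_metadata(meta: dict) -> list[str]:
    """Return a list of problems found in parsed ``script`` metadata.

    Hypothetical helper: it checks only the shape of the data, not
    whether each dependency or version specifier is syntactically valid.
    """
    problems = []

    # ``dependencies`` is optional; when present it is a list of strings.
    deps = meta.get("dependencies", [])
    if not isinstance(deps, list) or not all(isinstance(d, str) for d in deps):
        problems.append("dependencies must be a list of strings")

    # ``requires-python`` is optional; when present it is a single string.
    requires_python = meta.get("requires-python")
    if requires_python is not None and not isinstance(requires_python, str):
        problems.append("requires-python must be a string")

    # ``[tool]`` is optional; when present it is a table (a dict once parsed).
    tool = meta.get("tool", {})
    if not isinstance(tool, dict):
        problems.append("tool must be a table")

    return problems
```

A script runner could call such a check before attempting to resolve dependencies, so that malformed metadata is reported separately from dependencies that cannot be provided.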
@ -186,12 +170,11 @@ Script runners SHOULD error if no version of Python that satisfies the specified
Example
-------
The following is an example of a script with an embedded ``pyproject.toml``:
The following is an example of a script with embedded metadata:
.. code:: python
# /// pyproject
# [run]
# /// script
# requires-python = ">=3.11"
# dependencies = [
# "requests<3",
@ -206,24 +189,6 @@ The following is an example of a script with an embedded ``pyproject.toml``:
data = resp.json()
pprint([(k, v["title"]) for k, v in data.items()][:10])
The following [4]_ is an example of a proposed syntax for single-file Rust
projects that embeds their equivalent of ``pyproject.toml``, which is called
``Cargo.toml``:
.. code:: rust
#!/usr/bin/env cargo
//! ```cargo
//! [dependencies]
//! regex = "1.8.0"
//! ```
fn main() {
let re = Regex::new(r"^\d{4}-\d{2}-\d{2}$").unwrap();
println!("Did our date match? {}", re.is_match("2014-01-01"));
}
Reference Implementation
========================
@ -238,7 +203,7 @@ higher.
REGEX = r'(?m)^# /// (?P<type>[a-zA-Z0-9-]+)$\s(?P<content>(^#(| .*)$\s)+)^# ///$'
def read(script: str) -> dict | None:
name = 'pyproject'
name = 'script'
matches = list(
filter(lambda m: m.group('type') == name, re.finditer(REGEX, script))
)
@ -275,7 +240,7 @@ __ https://tomlkit.readthedocs.io/en/latest/
)
config = tomlkit.parse(content)
config['project']['dependencies'].append(dependency)
config['dependencies'].append(dependency)
new_content = ''.join(
f'# {line}' if line.strip() else f'#{line}'
for line in tomlkit.dumps(config).splitlines(keepends=True)
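The comment prefixing step in this hunk can be sketched on its own, without ``tomlkit``. The helper names ``comment_out`` and ``uncomment`` below are hypothetical. The sketch assumes the TOML text has already been produced by some serializer, and it only shows that prefixing and stripping are inverse operations.

```python
def comment_out(toml_text: str, block_type: str = "script") -> str:
    """Wrap TOML text in a ``# /// <type>`` ... ``# ///`` comment block."""
    if toml_text and not toml_text.endswith("\n"):
        toml_text += "\n"
    body = "".join(
        # Blank lines become a bare "#" so no trailing space is written.
        f"# {line}" if line.strip() else f"#{line}"
        for line in toml_text.splitlines(keepends=True)
    )
    return f"# /// {block_type}\n{body}# ///\n"


def uncomment(block: str) -> str:
    """Recover the TOML text from a block produced by ``comment_out``."""
    inner = block.splitlines(keepends=True)[1:-1]  # drop start and end markers
    return "".join(
        line[2:] if line.startswith("# ") else line[1:]
        for line in inner
    )


toml_text = 'dependencies = [\n  "rich",\n]\n\nrequires-python = ">=3.11"\n'
block = comment_out(toml_text)
```

Writing a bare ``#`` for blank lines matches the reading side, which removes two characters only when a line starts with ``# ``.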
@ -309,24 +274,24 @@ blocks.
Backwards Compatibility
=======================
At the time of writing, the ``# /// pyproject`` block comment starter does not
appear `on GitHub`__. Therefore, there is little risk of existing scripts being
broken by this PEP.
At the time of writing, the ``# /// script`` block comment starter does not
appear in any Python files `on GitHub`__. Therefore, there is little risk of existing
scripts being broken by this PEP.
__ https://github.com/search?q=%22%23+%2F%2F%2F+pyproject%22&type=code
__ https://github.com/search?q=%22%23+%2F%2F%2F+script%22&type=code
Security Implications
=====================
If a script containing embedded metadata is ran using a tool that automatically
If a script containing embedded metadata is run using a tool that automatically
installs dependencies, this could cause arbitrary code to be downloaded and
installed in the user's environment.
The risk here is part of the functionality of the tool being used to run the
script, and as such should already be addressed by the tool itself. The only
additional risk introduced by this PEP is if an untrusted script with a
embedded metadata is run, when a potentially malicious dependency or transitive
additional risk introduced by this PEP is if an untrusted script with embedded
metadata is run, when a potentially malicious dependency or transitive
dependency might be installed.
This risk is addressed by the normal good practice of reviewing code
@ -338,15 +303,13 @@ How to Teach This
=================
To embed metadata in a script, define a comment block that starts with the
line ``# /// pyproject`` and ends with the line ``# ///``. Every line between
line ``# /// script`` and ends with the line ``# ///``. Every line between
those two lines must be a comment and the full content is derived by removing
the first two characters. The ``pyproject`` type indicates that the content
is TOML and resembles a ``pyproject.toml`` file.
the first two characters.
.. code:: python
# /// pyproject
# [run]
# /// script
# dependencies = [
# "requests<3",
# "rich",
@ -354,8 +317,7 @@ is TOML and resembles a ``pyproject.toml`` file.
# requires-python = ">=3.11"
# ///
The two allowed tables are ``[run]`` and ``[tool]``. The ``[run]`` table may
contain the following fields:
The allowed fields are described in the following table:
.. list-table::
@ -376,6 +338,10 @@ contain the following fields:
- Tools might error if no version of Python that satisfies
the constraint can be executed.
In addition, a ``[tool]`` table is allowed. Details of what is permitted are similar
to what is permitted in ``pyproject.toml``, but precise information must be included
in the documentation of the relevant tool.
It is up to individual tools whether or not their behavior is altered based on
the embedded metadata. For example, every script runner may not be able to
provide an environment for specific Python versions as defined by the
@ -455,7 +421,7 @@ would live as single-file scripts:
up each Python script with a shebang line pointing to the desired Python
executable or script runner.
This PEP argues that reusing our TOML-based metadata format is the best for
This PEP argues that the proposed TOML-based metadata format is the best for
each category of user and that the requirements-like block comment is only
approachable for those who have familiarity with ``requirements.txt``, which
represents a small subset of users.
@ -464,12 +430,11 @@ represents a small subset of users.
already starting with zero context and are unlikely to be familiar with
TOML nor ``requirements.txt``. These users will very likely rely on
snippets found online via a search engine or utilize AI in the form
of a chat bot or direct code completion software. Searching for Python
metadata formatting will lead them to the TOML-based format that already
exists which they can reuse. The author tested GitHub Copilot with this
PEP and it already supports auto-completion of ``dependencies``. In contrast,
a new format may take years of being trained on the Internet for models to
learn.
of a chat bot or direct code completion software. The similarity with
dependency information stored in ``pyproject.toml`` will provide useful
search results relatively quickly, and while the ``pyproject.toml`` format
and the script metadata format are not the same, any resulting discrepancies
are unlikely to be difficult for the intended users to resolve.
Additionally, these users are most susceptible to formatting quirks and
syntax errors. TOML is a well-defined format with existing online
@ -484,7 +449,7 @@ represents a small subset of users.
with TOML since they are used to structured data formats and there would be
less perceived magic in their systems.
Additionally, for maintenance of their systems ``/// pyproject`` would be
Additionally, for maintenance of their systems ``/// script`` would be
much easier to search for from a shell than a block comment with potentially
numerous extensions over time.
* For the SRE types, they are likely to be familiar with TOML already from
@ -817,15 +782,7 @@ Footnotes
__ https://github.com/facelessuser/pymdown-extensions/discussions/1973
__ https://github.com/Python-Markdown/markdown
.. [3] A future PEP that officially introduces the ``[run]`` table to
``pyproject.toml`` files will make this PEP not just similar but a strict
subset.
.. [4] One important thing to note is that the metadata is embedded in a
`doc-comment`__ (their equivalent of docstrings). `Other syntaxes`__ are
under consideration within the Rust project.
__ https://doc.rust-lang.org/stable/book/ch14-02-publishing-to-crates-io.html#making-useful-documentation-comments
__ https://github.com/epage/cargo-script-mvs/blob/main/0000-cargo-script.md#embedded-manifest-format
Copyright