PEP: 722
Title: Dependency specification for single-file scripts
Author: Paul Moore <p.f.moore@gmail.com>
PEP-Delegate: Brett Cannon <brett@python.org>
Discussions-To: https://discuss.python.org/t/29905
Status: Draft
Type: Standards Track
Topic: Packaging
Content-Type: text/x-rst
Created: 19-Jul-2023
Post-History: `19-Jul-2023 <https://discuss.python.org/t/29905>`__


Abstract
========

This PEP specifies a format for including 3rd-party dependencies in a
single-file Python script.


Motivation
==========

Not all Python code is structured as a "project", in the sense of having its own
directory complete with a ``pyproject.toml`` file, and being built into an
installable distribution package. Python is also routinely used as a scripting
language, with Python scripts as a (better) alternative to shell scripts, batch
files, etc. When used to create scripts, Python code is typically stored as a
single file, often in a directory dedicated to such "utility scripts", which
might be in a mix of languages with Python being only one possibility among
many. Such scripts may be shared, often by something as simple as email, or a
link to a URL such as a GitHub gist. But they are typically *not* "distributed"
or "installed" as part of a normal workflow.

One problem when using Python as a scripting language in this way is how to run
the script in an environment that contains whatever third-party dependencies are
required by the script. There is currently no standard tool that addresses this
issue, and this PEP does *not* attempt to define one. However, any tool that
*does* address this issue will need to know what third-party dependencies a
script requires. By defining a standard format for storing such data, existing
tools, as well as any future tools, will be able to obtain that information
without requiring users to include tool-specific metadata in their scripts.


Rationale
=========

Because a key requirement is writing single-file scripts, and simple sharing by
giving someone a copy of the script, the PEP defines a mechanism for embedding
dependency data *within the script itself*, and not in an external file.

We define the concept of a *metadata block* that contains information about a
script. The only type of metadata defined here is dependency information, but
making the concept general allows expansion in the future, should it be needed.

In order to identify metadata blocks, the script can simply be read as a text
file. This is deliberate, as Python syntax changes over time, so attempting to
parse the script as Python code would require choosing a specific version of
Python syntax. Also, it is likely that at least some tools will not be written
in Python, and expecting them to implement a Python parser is too much of a
burden.

However, to avoid needing changes to core Python, the format is designed to
appear as comments to the Python parser. It is possible to write code where a
metadata block is *not* interpreted as a comment (for example, by embedding it
in a Python multi-line string), but such uses are discouraged and can easily be
avoided assuming you are not deliberately trying to create a pathological
example.

A `review <language survey_>`_ of how other languages allow scripts to specify
their dependencies shows that a "structured comment" like this is a
commonly-used approach.

Specification
=============

Any Python script may contain one or more *metadata blocks*. Metadata blocks are
identified by reading the script *as a text file* (i.e., the file is not parsed
as Python source code), looking for contiguous blocks of lines that start with
the identifying characters ``##``. Whitespace is not allowed before the
identifying ``##``. More than one metadata block may exist in a Python file.

Tools reading metadata blocks MAY respect the standard Python encoding
declaration. If they choose not to do so, they MUST process the file as UTF-8.

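As an illustration of the encoding rule, the two permitted behaviours might be
sketched as follows. This is not part of the specification; the helper name
``open_script`` and its ``respect_declaration`` flag are hypothetical, and only
the standard library is assumed.

.. code:: python

    import tokenize


    def open_script(filename, respect_declaration=True):
        """Open a script as text, following the encoding rule above."""
        if respect_declaration:
            # tokenize.open() honours a PEP 263 coding declaration or a BOM,
            # and falls back to UTF-8 when neither is present.
            return tokenize.open(filename)
        # A tool that ignores the declaration must read the file as UTF-8.
        return open(filename, encoding="utf-8")

A tool written in another language would typically take the second branch, as
it needs no knowledge of Python's encoding declaration.
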

Within a metadata block, whitespace after the ``##`` and at the end of the line
is ignored, and blank lines are ignored. The first line of a metadata block is
special, and identifies the type of block. This line MUST contain a colon
character, ``:``. If the colon is not present, the block is not considered to be
a metadata block, and tools MUST ignore it. The block type is all of the text on
the initial line, up to the colon. There must be no whitespace before the colon.
The initial line MAY contain text after the colon. How this is interpreted
depends on the block type. Block types MUST be treated as case sensitive.

The interpretation of any lines in a metadata block after the initial
identifying line is defined by the type of block.

Tools MUST ignore any blocks with types they do not handle.

Block types starting with the characters ``X-`` are reserved for the user, and
MUST NOT be given a meaning in any future standard.

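The rules for the initial line can be sketched in a few lines of Python. This
is an illustrative sketch rather than normative text: the function names are
hypothetical, and treating "whitespace before the colon" as "not a metadata
block" is an assumption, since the specification does not say how a tool
should react to it.

.. code:: python

    def parse_block_header(line):
        """Return (block_type, extra) for a block's first line, else None."""
        # The marker must be at the very start of the line.
        if not line.startswith("##"):
            return None
        # Whitespace after the marker and at the end of the line is ignored.
        text = line[2:].strip()
        block_type, sep, extra = text.partition(":")
        # Without a colon this is not a metadata block at all.
        if not sep:
            return None
        # Assumption: whitespace before the colon disqualifies the line.
        if block_type != block_type.rstrip():
            return None
        return block_type, extra.strip()


    def is_user_reserved(block_type):
        """True for block types in the user-reserved ``X-`` namespace."""
        # Block types are case sensitive, so "x-" does not count.
        return block_type.startswith("X-")

For example, ``parse_block_header("## X-Notes: draft")`` gives
``("X-Notes", "draft")``, while an ordinary ``## just a comment`` line gives
``None``.
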

Otherwise, the only defined block type is ``Script Dependencies``. For this
block type:

1. Text after the colon on the initial line is NOT allowed.
2. All subsequent lines MUST contain :pep:`508` requirement
   specifiers, one per line.

There SHOULD only be a single ``Script Dependencies`` block in the file. Tools
consuming dependency data MAY simply process the first such block found. This
avoids the need for tools to process more data than is necessary.

Consumers MUST validate that, at a minimum, all dependencies start with a
``name`` as defined in :pep:`508`, and they MAY validate that all dependencies
conform fully to :pep:`508`. They MUST fail with an error if they find an
invalid specifier.

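A minimal sketch of the "starts with a ``name``" check, assuming only the
standard library, is shown below. The regular expression follows the ``name``
rule from :pep:`508` (letters, digits, ``.``, ``_`` and ``-``, beginning and
ending with a letter or digit); the function name is hypothetical, and the
sketch deliberately does *not* validate the rest of the specifier.

.. code:: python

    import re

    # PEP 508 ``name``: starts and ends with a letter or digit, with
    # letters, digits, ".", "_" and "-" allowed in between.
    _NAME_PREFIX = re.compile(r"[A-Z0-9](?:[A-Z0-9._-]*[A-Z0-9])?", re.IGNORECASE)


    def check_starts_with_name(specifier):
        """Return the leading name, or raise ValueError if there is none."""
        match = _NAME_PREFIX.match(specifier.strip())
        if match is None:
            raise ValueError(f"invalid dependency specifier: {specifier!r}")
        return match.group(0)

A tool that wants full validation would instead hand each line to a complete
:pep:`508` parser.
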

Example
-------

The following is an example of a script with an embedded dependency block::

    # In order to run, this script needs the following 3rd party libraries
    #
    ## Script Dependencies:
    ## requests
    ## rich

    import requests
    from rich.pretty import pprint

    resp = requests.get("https://peps.python.org/api/peps.json")
    data = resp.json()
    pprint([(k, v["title"]) for k, v in data.items()][:10])


Backwards Compatibility
=======================

As metadata blocks take the form of a structured comment, they can be added
without altering the meaning of existing code.

It is possible that a comment may already exist which matches the form of a
metadata block. While the use of a double ``#`` prefix is intended to minimise
this risk, it is still possible.

Because tools must ignore unrecognised metadata types, the only potential issue
we need to consider is script dependencies. In that case, a tool might read the
wrong dependencies. In practice, though, this is unlikely to happen, as (a) the
header text (``Script Dependencies:``) is fairly unusual, and (b) any following
lines are unlikely to conform to :pep:`508` unless they *are* dependencies.

In the rare case where an existing comment would be interpreted incorrectly as a
dependency block, this can be addressed by adding an actual dependency block
(which can be empty if the script has no dependencies) earlier in the code.

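The effect of the workaround can be sketched with a small, standard-library-only
reader. It assumes a tool that processes only the first ``Script Dependencies``
block, which the specification permits; the function name and the sample script
are hypothetical.

.. code:: python

    def first_dependency_block(lines):
        """Return the raw entries of the first ``Script Dependencies`` block."""
        lines = iter(lines)
        for line in lines:
            if line.startswith("##") and line[2:].strip() == "Script Dependencies:":
                entries = []
                for line in lines:
                    if not line.startswith("##"):
                        break
                    entry = line[2:].strip()
                    if entry:
                        entries.append(entry)
                return entries
        # No dependency block was found at all.
        return None


    SCRIPT = """\
    ## Script Dependencies:

    print("no third-party imports here")

    ## Script Dependencies:
    ## these lines are an old note, not real requirements
    """.replace("\n    ", "\n")

    # The empty block at the top is found first, so the old note is never read.
    assert first_dependency_block(SCRIPT.splitlines()) == []

Without the empty block at the top, the same reader would return the text of
the old note and a tool would then reject it as an invalid specifier.
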

Security Implications
=====================

If a script containing a dependency block is run using a tool that automatically
installs dependencies, this could cause arbitrary code to be downloaded and
installed in the user's environment.

The risk here is part of the functionality of the tool being used to run the
script, and as such should already be addressed by the tool itself. The only
additional risk introduced by this PEP is if an untrusted script with a
dependency block is run, when a potentially malicious dependency might be
installed. This risk is addressed by the normal good practice of reviewing code
before running it.


How to Teach This
=================

The format is intended to be close to how a developer might already specify
script dependencies in an explanatory comment. The required structure is
deliberately minimal, and the concept of using a special comment marker (``##``
in this case) is not unusual (the "shebang" line in a Unix shell script is an
example).

Users will need to know how to write Python dependency specifiers. This is
covered by :pep:`508`, but for simple examples (which are expected to be the
norm for inexperienced users) the syntax is either just a package name, or a
name and a version restriction, which is fairly well-understood syntax.

Users will also need to know how to *run* a script using a tool that interprets
dependency data. This is not covered by this PEP, as it is the responsibility of
such a tool to document how it should be used.

Note that the core Python interpreter does *not* interpret dependency blocks.
This may be a point of confusion for beginners, who try to run ``python
some_script.py`` and do not understand why it fails. This is no different than
the current status quo, though, where running a script without its dependencies
present will give an error.

In general, it is assumed that if a beginner is given a script with dependencies
(regardless of whether they are specified in a dependency block), the person
supplying the script should explain how to run that script, and if that involves
using a script runner tool, that should be noted.


Recommendations
===============

This section is non-normative and simply describes "good practices" when using
metadata blocks.

Scripts should, in general, place metadata blocks at the top of the file,
either immediately after any shebang line, or straight after the script
docstring. In particular, the metadata block should always be placed before
any executable code in the file. This makes it easy for the human reader to
locate the metadata block, and allows tools to read only the minimum necessary
to identify it.


Reference Implementation
========================

Code to implement this proposal in Python is fairly straightforward, so the
reference implementation can be included here.

A parser that reads *only* the script dependency metadata.

.. code:: python

    import tokenize
    from packaging.requirements import Requirement

    DEPENDENCY_BLOCK_MARKER = "Script Dependencies:"

    def read_dependency_block(filename):
        # Use the tokenize module to handle any encoding declaration.
        with tokenize.open(filename) as f:
            for line in f:
                if line.startswith("##"):
                    line = line[2:].strip()
                    if line == DEPENDENCY_BLOCK_MARKER:
                        for line in f:
                            if not line.startswith("##"):
                                break
                            line = line[2:].strip()
                            if not line:
                                continue
                            # Try to convert to a requirement. This will raise
                            # an error if the line is not a PEP 508 requirement
                            yield Requirement(line)
                        break

A full metadata block parser that returns all metadata blocks in a script.

.. code:: python

    import tokenize

    def read_metadata_blocks(filename):
        # Use the tokenize module to handle any encoding declaration.
        with tokenize.open(filename) as f:
            for line in f:
                if line.startswith("##"):
                    # The block type is the text before the first colon.
                    block_type, sep, extra = line[2:].strip().partition(":")
                    if not sep:
                        # No colon, so this is not a metadata block. Skip
                        # the rest of this run of "##" lines.
                        for line in f:
                            if not line.startswith("##"):
                                break
                        continue
                    block_data = []
                    for line in f:
                        if not line.startswith("##"):
                            break
                        line = line[2:].strip()
                        if not line:
                            continue
                        block_data.append(line)
                    yield block_type, extra, block_data

A format similar to the one proposed here is already supported `in pipx
<https://github.com/pypa/pipx/pull/916>`__ and in `pip-run
<https://pypi.org/project/pip-run/>`__.


Rejected Ideas
==============

Why not include other metadata?
-------------------------------

The "metadata block" format is designed to allow additional metadata types, but
none are defined at this time. Currently, the only data used by tools is
dependency information, and therefore this is the only information required by
this standard. If, in future, a need is identified for other data to be
standardised, adding further metadata types is straightforward.

By reserving metadata types starting with ``X-``, the specification allows
experimentation with additional data *before* standardising.

Two particular cases are a script version number, and the version of Python
needed to run the script.

In the case of the version number, there are no known tools that try to extract
version information from scripts, so there is no immediate benefit to having the
version as metadata, rather than, for example, as a normal comment or a
``__version__`` attribute (see :pep:`396`). If it becomes common for tools to
want to introspect script versions, this could be added at a later date.

In the case of the Python version, existing tools provide a means for the *user*
to specify what Python interpreter to use when running the script (for example,
``pipx run`` provides the ``--python`` command line option), but they do not
typically allow the *script* to define a version range, and then automatically
pick an interpreter based on that. Having a "supported version" for a script may
allow the tool to provide better error messages when run with an inappropriate
interpreter, but currently, this is largely a theoretical benefit. Again, it is
something that can be added later if it becomes a commonly requested feature.


Why not use a more standard data format (e.g., TOML)?
-----------------------------------------------------

First of all, the only practical choice for an alternative format is TOML.
Python packaging has standardised on TOML for structured data, and using a
different format, such as YAML or JSON, would add complexity and confusion for
no real benefit.

So the question is essentially, "why not use TOML?"

The key idea behind the "metadata block" format is to define something that
reads naturally as a comment in the script. Dependency data is useful both for
tools and for the human reader, so having a human readable format is beneficial.
On the other hand, TOML of necessity has a syntax of its own, which distracts
from the underlying data.

It is important to remember that developers who *write* scripts in Python are
often *not* experienced in Python, or Python packaging. They are often systems
administrators, or data analysts, who may simply be using Python as a "better
batch file". For such users, the TOML format is extremely likely to be
unfamiliar, and the syntax will be obscure to them, and not particularly
intuitive. Such developers may well be copying dependency specifiers from
sources such as Stack Overflow, without really understanding them. Having to
embed such a requirement into a TOML structure is an additional complexity --
and it is important to remember that the goal here is to make using 3rd party
libraries *easy* for such users.

Furthermore, TOML, by its nature, is a flexible format intended to support very
general data structures. There are *many* ways of writing a simple list of
strings in it, and it will not be clear to inexperienced users which form to use.

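To make the point about multiple spellings concrete, the following sketch
parses several TOML fragments that all describe the same two-item list. It
assumes Python 3.11 or later, where ``tomllib`` is in the standard library;
the key name ``dependencies`` and the sample fragments are chosen purely for
illustration.

.. code:: python

    import tomllib  # standard library from Python 3.11

    SPELLINGS = [
        # Basic (double-quoted) strings on one line.
        'dependencies = ["requests", "rich"]',
        # Literal (single-quoted) strings.
        "dependencies = ['requests', 'rich']",
        # One item per line, with a trailing comma.
        'dependencies = [\n  "requests",\n  "rich",\n]',
        # A multi-line basic string mixed with a basic string.
        'dependencies = ["""requests""", "rich"]',
    ]

    # Every spelling above parses to exactly the same Python list.
    parsed = [tomllib.loads(text)["dependencies"] for text in SPELLINGS]
    assert all(item == ["requests", "rich"] for item in parsed)

A line-based format has only one spelling for the same data: one requirement
per line.
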

And finally, there will be tools that expect to *write* dependency data into
scripts -- for example, an IDE with a feature that automatically adds an import
and a dependency specifier when you reference a library function. While
libraries exist that allow editing TOML data, they are not always good at
preserving the user's layout, which could include comments, specific formatting,
etc. Even if libraries exist which do an effective job at this, expecting all
tools to use such a library is a significant imposition on code supporting this
PEP.

By choosing a simple, line-based format with no quoting rules, dependency data
is easy to read (for humans and tools) and easy to write. The format doesn't
have the flexibility of something like TOML, but the use case simply doesn't
demand that sort of flexibility.


Why not embed a ``pyproject.toml`` file in the script?
------------------------------------------------------

First of all, ``pyproject.toml`` is a TOML based format, so all of the previous
concerns around TOML as a format apply. However, ``pyproject.toml`` is a
standard used by Python packaging, and re-using an existing standard is a
reasonable suggestion that deserves to be addressed on its own merits.

The first issue is that the suggestion rarely implies that *all* of
``pyproject.toml`` is to be supported for scripts. A script is not intended to
be "built" into any sort of distributable artifact like a wheel (see below for
more on this point), so the ``[build-system]`` section of ``pyproject.toml``
makes little sense, for example. And while the tool-specific sections of
``pyproject.toml`` might be useful for scripts, it's not at all clear that a
tool like `ruff <https://beta.ruff.rs/docs/>`__ would want to support per-file
configuration in this way, leading to confusion when users *expect* it to work,
but it doesn't. Furthermore, this sort of tool-specific configuration is just as
useful for individual files in a larger project, so we have to consider what it
would mean to embed a ``pyproject.toml`` into a single file in a larger project
that has its own ``pyproject.toml``.

In addition, ``pyproject.toml`` is currently focused on projects that are to be
built into wheels. There is `an ongoing discussion <pyproject without wheels_>`_
about how to use ``pyproject.toml`` for projects that are not intended to be
built as wheels, and until that question is resolved (which will likely require
some PEPs of its own) it seems premature to be discussing embedding
``pyproject.toml`` into scripts, which are *definitely* not intended to be built
and distributed in that manner.

The conclusion, therefore (which has been stated explicitly in some, but not
all, cases) is that this proposal is intended to mean that we would embed *part
of* ``pyproject.toml``. Typically this is the ``[project]`` section from
:pep:`621`, or even just the ``dependencies`` item from that section.

At this point, the first issue is that by framing the proposal as "embedding
``pyproject.toml``", we would be encouraging the sort of confusion discussed in
the previous paragraphs: developers will expect the full capabilities of
``pyproject.toml``, and be confused when there are differences and limitations.
It would be better, therefore, to consider this suggestion as simply being a
proposal to use an embedded TOML format, but specifically re-using the
*structure* of a particular part of ``pyproject.toml``. The problem then becomes
how we describe that structure, *without* causing confusion for people familiar
with ``pyproject.toml``. If we describe it with reference to ``pyproject.toml``,
the link is still there. But if we describe it in isolation, people will be
confused by the "similar but different" nature of the structure.

It is also important to remember that a key part of the target audience for this
proposal is developers who are simply using Python as a "better batch file"
solution. These developers will generally not be familiar with Python packaging
and its conventions, and are often the people most critical of the "complexity"
and "difficulty" of packaging solutions. As a result, proposals based on those
existing solutions are likely to be unwelcome to that audience, and could easily
result in people simply continuing to use existing ad hoc solutions, and
ignoring the standard that was intended to make their lives easier.

Why not just set up a Python project with a ``pyproject.toml``?
---------------------------------------------------------------

Again, a key issue here is that the target audience for this proposal is people
writing scripts which aren't intended for distribution. Sometimes scripts will
be "shared", but this is far more informal than "distribution" - it typically
involves sending a script via an email with some written instructions on how to
run it, or passing someone a link to a gist.

Expecting such users to learn the complexities of Python packaging is a
significant step up in complexity, and would almost certainly give the
impression that "Python is too hard for scripts".

In addition, if the expectation here is that the ``pyproject.toml`` will somehow
be designed for running scripts in place, that's a new feature of the standard
that doesn't currently exist. At a minimum, this isn't a reasonable suggestion
until the `current discussion on Discourse <pyproject without wheels_>`_ about
using ``pyproject.toml`` for projects that won't be distributed as wheels is
resolved. And even then, it doesn't address the "sending someone a script in a
gist or email" use case.

Why not use a requirements file for dependencies?
-------------------------------------------------

Putting your requirements in a requirements file doesn't require a PEP. You can
do that right now, and in fact it's quite likely that many ad hoc solutions do
this. However, without a standard, there's no way of knowing how to locate a
script's dependency data. And furthermore, the requirements file format is
pip-specific, so tools relying on it are depending on a pip implementation
detail.

So in order to make a standard, two things would be required:

1. A standardised replacement for the requirements file format.
2. A standard for how to locate the requirements file for a given script.

The first item is a significant undertaking. It has been discussed on a number
of occasions, but so far no-one has attempted to actually do it. The most likely
approach would be for standards to be developed for individual use cases
currently addressed with requirements files. One option here would be for this
PEP to simply define a new file format which is simply a text file containing
:pep:`508` requirements, one per line. That would just leave the question of how
to locate that file.

The "obvious" solution here would be to do something like name the file the same
as the script, but with a ``.reqs`` extension (or something similar). However,
this still requires *two* files, where currently only a single file is needed,
and as such, does not match the "better batch file" model (shell scripts and
batch files are typically self-contained). It requires the developer to remember
to keep the two files together, and this may not always be possible. For
example, system administration policies may require that *all* files in a
certain directory are executable (the Linux filesystem standards require this of
``/usr/bin``, for example). And some methods of sharing a script (for example,
publishing it on a text file sharing service like GitHub's gist, or a corporate
intranet) may not allow for deriving the location of an associated requirements
file from the script's location (tools like ``pipx`` support running a script
directly from a URL, so "download and unpack a zip of the script and its
dependencies" may not be an appropriate requirement).

Essentially, though, the issue here is that there is an explicitly stated
requirement that the format supports storing dependency data *in the script file
itself*. Solutions that don't do that are simply ignoring that requirement.

Why not use (possibly restricted) Python syntax?
------------------------------------------------

This would typically involve storing the dependencies as a (runtime) list
variable with a conventional name, such as::

    __requires__ = [
        "requests",
        "click",
    ]

Other suggestions include a static multi-line string, or including the
dependencies in the script's docstring.

The most significant problem with this proposal is that it requires all
consumers of the dependency data to implement a Python parser. Even if the
syntax is restricted, the *rest* of the script will use the full Python syntax,
and trying to define a syntax which can be successfully parsed in isolation from
the surrounding code is likely to be extremely difficult and error-prone.

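For comparison, here is a rough sketch of what a Python-based consumer of the
``__requires__`` convention would have to do. It illustrates the rejected
approach and is not part of this proposal; the function name is hypothetical.
Note that ``ast.parse`` only accepts syntax understood by the interpreter
running the tool, and a tool written in another language would need an
equivalent parser of its own.

.. code:: python

    import ast


    def read_requires(source):
        """Return the literal list assigned to ``__requires__``, or None."""
        # The *whole* script must parse, not just the assignment of interest.
        tree = ast.parse(source)
        for node in tree.body:
            if isinstance(node, ast.Assign):
                for target in node.targets:
                    if isinstance(target, ast.Name) and target.id == "__requires__":
                        # literal_eval() rejects anything that is not a literal.
                        return ast.literal_eval(node.value)
        return None

A single line elsewhere in the script that the running interpreter cannot
parse makes ``ast.parse`` raise ``SyntaxError``, so the dependency data becomes
unreadable even though the ``__requires__`` assignment itself is valid.
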

Furthermore, Python's syntax changes in every release. If extracting dependency
data needs a Python parser, the parser will need to know which version of Python
the script is written for, and the overhead for a generic tool of having a
parser that can handle *multiple* versions of Python is unsustainable.

Even if the above issues could be addressed, the format would give the
impression that the data could be altered at runtime. However, this is not the
case in general, and code that tries to do so will encounter unexpected and
confusing behaviour.

And finally, there is no evidence that having dependency data available at
runtime is of any practical use. Should such a use be found, it is simple enough
to get the data by parsing the source - ``read_dependency_block(__file__)``.

It is worth noting, though, that the ``pip-run`` utility does implement (an
extended form of) this approach. `Further discussion <pip-run issue_>`_ of
the ``pip-run`` design is available on the project's issue tracker.

Should scripts be able to specify a package index?
--------------------------------------------------

Dependency metadata is about *what* package the code depends on, and not *where*
that package comes from. There is no difference here between metadata for
scripts, and metadata for distribution packages (as defined in
``pyproject.toml``). In both cases, dependencies are given in "abstract" form,
without specifying how they are obtained.

Some tools that use the dependency information may, of course, need to locate
concrete dependency artifacts - for example if they expect to create an
environment containing those dependencies. But the way they choose to do that
will be closely linked to the tool's UI in general, and this PEP does not try to
dictate the UI for tools.

There is more discussion of this point, and in particular of the UI choices made
by the ``pip-run`` tool, in
`the previously mentioned pip-run issue <pip-run issue_>`_.

What about local dependencies?
------------------------------

These can be handled without needing special metadata and tooling, simply by
adding the location of the dependencies to ``sys.path``. This PEP simply isn't
needed for this case. If, on the other hand, the "local dependencies" are actual
distributions which are published locally, they can be specified as usual with a
:pep:`508` requirement, and the local package index specified when running a
tool by using the tool's UI for that.

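As a rough sketch of the ``sys.path`` approach, a script might include a helper
like the one below and call it before importing its local modules. The helper
name and the ``vendor`` directory name are hypothetical; any directory holding
the local code would do.

.. code:: python

    import sys
    from pathlib import Path


    def add_local_dependencies(script_file, dirname="vendor"):
        """Put a directory next to the script at the front of ``sys.path``."""
        local_dir = Path(script_file).resolve().parent / dirname
        if str(local_dir) not in sys.path:
            # Insert at the front so local modules are found first.
            sys.path.insert(0, str(local_dir))
        return local_dir

A script would typically call ``add_local_dependencies(__file__)`` near the
top, after which modules in the ``vendor`` directory can be imported normally.
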

Open Issues
===========

None at this point.


References
==========

.. _pip-run issue: https://github.com/jaraco/pip-run/issues/44
.. _language survey: https://dbohdan.com/scripts-with-dependencies
.. _pyproject without wheels: https://discuss.python.org/t/projects-that-arent-meant-to-generate-a-wheel-and-pyproject-toml/29684

Copyright
=========

This document is placed in the public domain or under the
CC0-1.0-Universal license, whichever is more permissive.