PEP 722: Further revisions (gh-3282)
parent 7afb59534f
commit 189134e403
405
pep-0722.rst
|
@ -49,11 +49,10 @@ Because a key requirement is writing single-file scripts, and simple sharing by
|
||||||
giving someone a copy of the script, the PEP defines a mechanism for embedding
|
giving someone a copy of the script, the PEP defines a mechanism for embedding
|
||||||
dependency data *within the script itself*, and not in an external file.
|
dependency data *within the script itself*, and not in an external file.
|
||||||
|
|
||||||
We define the concept of a *metadata block* that contains information about a
|
We define the concept of a *dependency block* that contains information about
|
||||||
script. The only type of metadata defined here is dependency information, but
|
what 3rd party packages a script depends on.
|
||||||
making the concept general allows expansion in the future, should it be needed.
|
|
||||||
|
|
||||||
In order to identify metadata blocks, the script can simply be read as a text
|
In order to identify dependency blocks, the script can simply be read as a text
|
||||||
file. This is deliberate, as Python syntax changes over time, so attempting to
|
file. This is deliberate, as Python syntax changes over time, so attempting to
|
||||||
parse the script as Python code would require choosing a specific version of
|
parse the script as Python code would require choosing a specific version of
|
||||||
Python syntax. Also, it is likely that at least some tools will not be written
|
Python syntax. Also, it is likely that at least some tools will not be written
|
||||||
|
@ -62,7 +61,7 @@ burden.
|
||||||
|
|
||||||
However, to avoid needing changes to core Python, the format is designed to
|
However, to avoid needing changes to core Python, the format is designed to
|
||||||
appear as comments to the Python parser. It is possible to write code where a
|
appear as comments to the Python parser. It is possible to write code where a
|
||||||
metadata block is *not* interpreted as a comment (for example, by embedding it
|
dependency block is *not* interpreted as a comment (for example, by embedding it
|
||||||
in a Python multi-line string), but such uses are discouraged and can easily be
|
in a Python multi-line string), but such uses are discouraged and can easily be
|
||||||
avoided assuming you are not deliberately trying to create a pathological
|
avoided assuming you are not deliberately trying to create a pathological
|
||||||
example.
|
example.
|
||||||
|
@ -74,42 +73,41 @@ commonly-used approach.
|
||||||
Specification
|
Specification
|
||||||
=============
|
=============
|
||||||
|
|
||||||
Any Python script may contain one or more *metadata blocks*. Metadata blocks are
|
The content of this section will be published in the Python Packaging user
|
||||||
identified by reading the script *as a text file* (i.e., the file is not parsed
|
guide, PyPA Specifications section, as a document with the title "Embedding
|
||||||
as Python source code), looking for contiguous blocks of lines that start with
|
Metadata in Script Files".
|
||||||
the identifying characters ``##``. Whitespace is not allowed before the
|
|
||||||
identifying ``##``. More than one metadata block may exist in a Python file.
|
|
||||||
|
|
||||||
Tools reading metadata blocks MAY respect the standard Python encoding
|
Any Python script may contain a *dependency block*. The dependency block is
|
||||||
|
identified by reading the script *as a text file* (i.e., the file is not parsed
|
||||||
|
as Python source code), looking for the first line of the form::
|
||||||
|
|
||||||
|
# Script Dependencies:
|
||||||
|
|
||||||
|
The hash character must be at the start of the line with no preceding whitespace.
|
||||||
|
The text "Script Dependencies" is recognised regardless of case, and the spaces
|
||||||
|
represent arbitrary whitespace (although at least one space must be present). The
|
||||||
|
following regular expression recognises the dependency block header line::
|
||||||
|
|
||||||
|
(?i)^#\s+script\s+dependencies:\s*$
|
||||||
|
|
||||||
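As a minimal sketch of how the header rule behaves, the regular expression quoted above can be tried against a few candidate lines. The helper name ``is_header`` and the sample lines are illustrative assumptions, not part of the specification.

```python
import re

# The header pattern quoted in the specification text.
HEADER = re.compile(r"(?i)^#\s+script\s+dependencies:\s*$")


def is_header(line):
    """Return True if ``line`` is a dependency block header (hypothetical helper)."""
    return HEADER.match(line) is not None


samples = [
    "# Script Dependencies:\n",       # canonical form
    "#   script\tDEPENDENCIES:  \n",  # case and amount of whitespace may vary
    " # Script Dependencies:\n",      # leading whitespace: not a header
    "#Script Dependencies:\n",        # no whitespace after the hash: not a header
    "# Script Dependencies: rich\n",  # text after the colon: not a header
]
results = [is_header(s) for s in samples]
```

Only the first two samples are recognised; the remaining three fail because of leading whitespace, a missing space after the hash, and trailing text after the colon respectively.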
|
Tools reading the dependency block MAY respect the standard Python encoding
|
||||||
declaration. If they choose not to do so, they MUST process the file as UTF-8.
|
declaration. If they choose not to do so, they MUST process the file as UTF-8.
|
||||||
|
|
||||||
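The two permitted ways of decoding the file can be sketched as follows. This is only an illustration: ``read_script_lines`` and its ``honour_declaration`` parameter are hypothetical names, and the sample script is invented for the demonstration.

```python
import os
import tempfile
import tokenize


def read_script_lines(path, honour_declaration=True):
    """Read a script as text (hypothetical helper, not taken from the PEP).

    With honour_declaration=True the standard encoding declaration is
    respected via tokenize.open(); otherwise the file is read as UTF-8.
    """
    if honour_declaration:
        with tokenize.open(path) as f:
            return f.readlines()
    with open(path, encoding="utf-8") as f:
        return f.readlines()


# A Latin-1 encoded script that carries an explicit encoding declaration.
source = "# -*- coding: latin-1 -*-\n# Script Dependencies:\n# requests  # café\n"
with tempfile.NamedTemporaryFile("wb", suffix=".py", delete=False) as tmp:
    tmp.write(source.encode("latin-1"))

try:
    lines = read_script_lines(tmp.name)
    try:
        read_script_lines(tmp.name, honour_declaration=False)
        utf8_failed = False
    except UnicodeDecodeError:
        utf8_failed = True
finally:
    os.unlink(tmp.name)
```

A tool that always reads UTF-8 is still conforming; it simply fails on a script like this one, whose bytes are not valid UTF-8.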
Within a metadata block, whitespace after the ``##`` and at the end of the line
|
After the header line, all lines in the file up to the first line that doesn't
|
||||||
is ignored, and blank lines are ignored. The first line of a metadata block is
|
start with a ``#`` sign are considered *dependency lines* and are treated as
|
||||||
special, and identifies the type of block. This line MUST contain a colon
|
follows:
|
||||||
character, ``:``. If the colon is not present, the block is not considered to be
|
|
||||||
a metadata block, and tools MUST ignore it. The block type is all of the text on
|
|
||||||
the initial line, up to the colon. There must be no whitespace before the colon.
|
|
||||||
The initial line MAY contain text after the colon. How this is interpreted
|
|
||||||
depends on the block type. Block types MUST be treated as case sensitive.
|
|
||||||
|
|
||||||
The interpretation of any lines in a metadata block after the initial
|
1. The initial ``#`` sign is stripped.
|
||||||
identifying line, is defined by the type of block.
|
2. If the line contains the character sequence " # " (SPACE HASH SPACE), then
|
||||||
|
those characters and any subsequent characters are discarded. This allows
|
||||||
|
dependency blocks to contain inline comments.
|
||||||
|
3. Whitespace at the start and end of the remaining text is discarded.
|
||||||
|
4. If the line is now empty, it is ignored.
|
||||||
|
5. The content of the line MUST now be a valid :pep:`508` dependency specifier.
|
||||||
|
|
||||||
Tools MUST ignore any blocks with types they do not handle.
|
The requirement for spaces before and after the ``#`` in an inline comment is
|
||||||
|
necessary to distinguish them from part of a :pep:`508` URL specifier (which
|
||||||
Block types starting with the characters ``X-`` are reserved for the user, and
|
can contain a hash, but without surrounding whitespace).
|
||||||
MUST NOT be given a meaning in any future standard.
|
|
||||||
|
|
||||||
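The numbered steps for handling a dependency line can be sketched as a small function. This is an illustrative sketch rather than the reference implementation: ``clean_dependency_line`` is a hypothetical name, and step 5 (checking the result against :pep:`508`) is left to the caller.

```python
def clean_dependency_line(line):
    """Apply steps 1-4 to one dependency line (hypothetical helper).

    Returns the remaining text, or None if the line should be ignored.
    """
    if not line.startswith("#"):
        raise ValueError("not a dependency line")
    text = line[1:]                          # 1. strip the initial hash
    text = text.split(" # ", maxsplit=1)[0]  # 2. discard any inline comment
    text = text.strip()                      # 3. trim surrounding whitespace
    return text or None                      # 4. empty lines are ignored


block = [
    "# requests",
    "# rich # Needed for the output",
    "#",
    "# # A comment-only line",
    "# pip @ https://github.com/pypa/pip/archive/1.3.1.zip#sha1=da9234ee9982d4bbb3c72346a6de940a148ea686",
]
cleaned = [clean_dependency_line(entry) for entry in block]
```

The URL fragment in the last entry survives because its hash has no surrounding spaces, while the comment-only line is reduced to nothing and dropped.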
Otherwise, the only defined block type is ``Script Dependencies``. For this
|
|
||||||
block type,
|
|
||||||
|
|
||||||
1. Text after the colon on the initial line is NOT allowed.
|
|
||||||
2. All subsequent lines MUST contain :pep:`508` requirement
|
|
||||||
specifiers, one per line.
|
|
||||||
|
|
||||||
There SHOULD only be a single ``Script Dependencies`` block in the file. Tools
|
|
||||||
consuming dependency data MAY simply process the first such block found. This
|
|
||||||
avoids the need for tools to process more data than is necessary.
|
|
||||||
|
|
||||||
Consumers MUST validate that at a minimum, all dependencies start with a
|
Consumers MUST validate that at a minimum, all dependencies start with a
|
||||||
``name`` as defined in :pep:`508`, and they MAY validate that all dependencies
|
``name`` as defined in :pep:`508`, and they MAY validate that all dependencies
|
||||||
|
@ -123,9 +121,13 @@ The following is an example of a script with an embedded dependency block::
|
||||||
|
|
||||||
# In order to run, this script needs the following 3rd party libraries
|
# In order to run, this script needs the following 3rd party libraries
|
||||||
#
|
#
|
||||||
## Script Dependencies:
|
# Script Dependencies:
|
||||||
## requests
|
# requests
|
||||||
## rich
|
# rich # Needed for the output
|
||||||
|
#
|
||||||
|
# # Not needed - just to show that fragments in URLs do not
|
||||||
|
# # get treated as comments
|
||||||
|
# pip @ https://github.com/pypa/pip/archive/1.3.1.zip#sha1=da9234ee9982d4bbb3c72346a6de940a148ea686
|
||||||
|
|
||||||
import requests
|
import requests
|
||||||
from rich.pretty import pprint
|
from rich.pretty import pprint
|
||||||
|
@ -138,18 +140,12 @@ The following is an example of a script with an embedded dependency block::
|
||||||
Backwards Compatibility
|
Backwards Compatibility
|
||||||
=======================
|
=======================
|
||||||
|
|
||||||
As metadata blocks take the form of a structured comment, they can be added
|
As dependency blocks take the form of a structured comment, they can be added
|
||||||
without altering the meaning of existing code.
|
without altering the meaning of existing code.
|
||||||
|
|
||||||
It is possible that a comment may already exist which matches the form of a
|
It is possible that a comment may already exist which matches the form of a
|
||||||
metadata block. While the use of a double ``#`` prefix is intended to minimise
|
dependency block. While the identifying header text, "Script Dependencies", is
|
||||||
this risk, it is still possible.
|
chosen to minimise this risk, it is still possible.
|
||||||
|
|
||||||
Because tools must ignore unrecognised metadata types, the only potential issue
|
|
||||||
we need to consider is script dependencies. In that case, a tool might read the
|
|
||||||
wrong dependencies. In practice, though, this is unlikely to happen, as (a) the
|
|
||||||
header text (``Script Dependencies:``) is fairly unusual, and (b) any following
|
|
||||||
lines are unlikely to conform to :pep:`508` unless they *are* dependencies.
|
|
||||||
|
|
||||||
In the rare case where an existing comment would be interpreted incorrectly as a
|
In the rare case where an existing comment would be interpreted incorrectly as a
|
||||||
dependency block, this can be addressed by adding an actual dependency block
|
dependency block, this can be addressed by adding an actual dependency block
|
||||||
|
@ -176,9 +172,7 @@ How to Teach This
|
||||||
|
|
||||||
The format is intended to be close to how a developer might already specify
|
The format is intended to be close to how a developer might already specify
|
||||||
script dependencies in an explanatory comment. The required structure is
|
script dependencies in an explanatory comment. The required structure is
|
||||||
deliberately minimal, and the concept of using a special comment marker (``##``
|
deliberately minimal, so that formatting rules are easy to learn.
|
||||||
in this case) is not unusual (the "shebang" line in a Unix shell script is an
|
|
||||||
example).
|
|
||||||
|
|
||||||
Users will need to know how to write Python dependency specifiers. This is
|
Users will need to know how to write Python dependency specifiers. This is
|
||||||
covered by :pep:`508`, but for simple examples (which are expected to be the norm
|
covered by :pep:`508`, but for simple examples (which are expected to be the norm
|
||||||
|
@ -207,12 +201,20 @@ Recommendations
|
||||||
This section is non-normative and simply describes "good practices" when using
|
This section is non-normative and simply describes "good practices" when using
|
||||||
metadata blocks.
|
dependency blocks.
|
||||||
|
|
||||||
Scripts should, in general, place metadata blocks at the top of the file,
|
While it is permitted for tools to do minimal validation of requirements, in
|
||||||
|
practice they should do as much "sanity check" validation as possible, even if
|
||||||
|
they cannot do a full check for :pep:`508` syntax. This helps to ensure that
|
||||||
|
dependency blocks that are not correctly terminated are reported early. A good
|
||||||
|
compromise between the minimal approach of checking just that the requirement
|
||||||
|
starts with a name, and full :pep:`508` validation, is to check for a bare name,
|
||||||
|
or a name followed by optional whitespace, and then one of ``[`` (extra), ``@``
|
||||||
|
(urlspec), ``;`` (marker) or one of ``(<!=>~`` (version).
|
||||||
|
|
||||||
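One possible reading of this compromise check is a single regular expression, sketched below. The pattern and the name ``looks_like_requirement`` are assumptions made for illustration; the name part follows the project name rule given in :pep:`508`, and input is assumed to have been stripped of the leading ``#`` and surrounding whitespace already.

```python
import re

# A PEP 508 style name, optional whitespace, then either the end of the
# text or the first character of an extra, URL spec, marker or version.
SANITY_CHECK = re.compile(
    r"^([A-Za-z0-9]|[A-Za-z0-9][A-Za-z0-9._-]*[A-Za-z0-9])"
    r"\s*(?:$|[\[@;(<!=>~])"
)


def looks_like_requirement(text):
    """Cheap plausibility check for one cleaned dependency line (hypothetical)."""
    return SANITY_CHECK.match(text) is not None


candidates = [
    "requests",
    "rich >= 13",
    "requests[security]",
    "pip @ https://example.invalid/pip.zip",
    "pywin32 ; sys_platform == 'win32'",
    "This line is ordinary prose",
    "-r requirements.txt",
]
verdicts = [looks_like_requirement(c) for c in candidates]
```

The last two candidates are rejected, which is the behaviour that lets a tool report an ordinary comment that accidentally follows an unterminated dependency block.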
|
Scripts should, in general, place the dependency block at the top of the file,
|
||||||
either immediately after any shebang line, or straight after the script
|
either immediately after any shebang line, or straight after the script
|
||||||
docstring. In particular, the metadata block should always be placed before
|
docstring. In particular, the dependency block should always be placed before
|
||||||
any executable code in the file. This makes it easy for the human reader to
|
any executable code in the file. This makes it easy for the human reader to
|
||||||
locate the metadata block, and allows tools to only read the minimum necessary
|
locate it.
|
||||||
to identify them.
|
|
||||||
|
|
||||||
|
|
||||||
Reference Implementation
|
Reference Implementation
|
||||||
|
@ -221,58 +223,35 @@ Reference Implementation
|
||||||
Code to implement this proposal in Python is fairly straightforward, so the
|
Code to implement this proposal in Python is fairly straightforward, so the
|
||||||
reference implementation can be included here.
|
reference implementation can be included here.
|
||||||
|
|
||||||
A parser that reads *only* the script dependency metadata.
|
|
||||||
|
|
||||||
.. code:: python
|
.. code:: python
|
||||||
|
|
||||||
|
import re
|
||||||
import tokenize
|
import tokenize
|
||||||
from packaging.requirements import Requirement
|
from packaging.requirements import Requirement
|
||||||
|
|
||||||
DEPENDENCY_BLOCK_MARKER = "Script Dependencies:"
|
DEPENDENCY_BLOCK_MARKER = r"(?i)^#\s+script\s+dependencies:\s*$"
|
||||||
|
|
||||||
def read_dependency_block(filename):
|
def read_dependency_block(filename):
|
||||||
# Use the tokenize module to handle any encoding declaration.
|
# Use the tokenize module to handle any encoding declaration.
|
||||||
with tokenize.open(filename) as f:
|
with tokenize.open(filename) as f:
|
||||||
for line in f:
|
for line in f:
|
||||||
if line.startswith("##"):
|
if re.match(DEPENDENCY_BLOCK_MARKER, line):
|
||||||
line = line[2:].strip()
|
|
||||||
if line == DEPENDENCY_BLOCK_MARKER:
|
|
||||||
for line in f:
|
for line in f:
|
||||||
if not line.startswith("##"):
|
if not line.startswith("#"):
|
||||||
break
|
break
|
||||||
line = line[2:].strip()
|
# Remove comments. An inline comment is introduced by
|
||||||
|
# a hash, which must be preceded and followed by a
|
||||||
|
# space. The initial hash will be skipped as it has
|
||||||
|
# no space before it.
|
||||||
|
line = line.split(" # ", maxsplit=1)[0]
|
||||||
|
line = line[1:].strip()
|
||||||
if not line:
|
if not line:
|
||||||
continue
|
continue
|
||||||
# Try to convert to a requirement. This will raise
|
# Try to convert to a requirement. This will raise
|
||||||
# an error if the line is not a PEP 508 requirement
|
# an error if the line is not a PEP 508 requirement
|
||||||
yield Requirement(line)
|
yield Requirement(line)
|
||||||
break
|
break
|
||||||
|
|
||||||
A full metadata block parser that returns all metadata blocks in a script.
|
|
||||||
|
|
||||||
.. code:: python
|
|
||||||
|
|
||||||
import tokenize
|
|
||||||
from packaging.requirements import Requirement
|
|
||||||
|
|
||||||
def read_metadata_blocks(filename):
|
|
||||||
# Use the tokenize module to handle any encoding declaration.
|
|
||||||
with tokenize.open(filename) as f:
|
|
||||||
for line in f:
|
|
||||||
if line.startswith("##"):
|
|
||||||
block_type, sep, extra = line[2:].strip().partition(":")
|
|
||||||
if not sep:
|
|
||||||
continue
|
|
||||||
block_data = []
|
|
||||||
for line in f:
|
|
||||||
if not line.startswith("##"):
|
|
||||||
break
|
|
||||||
line = line[2:].strip()
|
|
||||||
if not line:
|
|
||||||
continue
|
|
||||||
block_data.append(line)
|
|
||||||
yield block_type, extra, block_data
|
|
||||||
|
|
||||||
A format similar to the one proposed here is already supported `in pipx
|
A format similar to the one proposed here is already supported `in pipx
|
||||||
<https://github.com/pypa/pipx/pull/916>`__ and in `pip-run
|
<https://github.com/pypa/pipx/pull/916>`__ and in `pip-run
|
||||||
<https://pypi.org/project/pip-run/>`__.
|
<https://pypi.org/project/pip-run/>`__.
|
||||||
|
@ -284,32 +263,97 @@ Rejected Ideas
|
||||||
Why not include other metadata?
|
Why not include other metadata?
|
||||||
-------------------------------
|
-------------------------------
|
||||||
|
|
||||||
The "metadata block" format is designed to allow additional metadata types, but
|
The core use case addressed by this proposal is that of identifying what
|
||||||
none are defined at this time. Currently, the only data used by tools is
|
dependencies a standalone script needs in order to run successfully. This is a
|
||||||
dependency information, and therefore this is the only information required by
|
common real-world issue that is currently solved by script runner tools, using
|
||||||
this standard. If, in future, a need is identified for other data to be
|
implementation-specific ways of storing the data. Standardising the storage
|
||||||
standardised, adding further metadata types is straightforward.
|
format improves interoperability by not tying the script to a particular
|
||||||
|
runner.
|
||||||
|
|
||||||
By reserving metadata types starting with ``X-``, the specification allows
|
While it is arguable that other forms of metadata could be useful in a
|
||||||
experimentation with additional data *before* standardising.
|
standalone script, the need is largely theoretical at this point. In practical
|
||||||
|
terms, scripts either don't use other metadata, or they store it in existing,
|
||||||
|
widely used (and therefore de facto standard) formats. For example, scripts
|
||||||
|
needing README style text typically use the standard Python module docstring,
|
||||||
|
and scripts wanting to declare a version use the common convention of having a
|
||||||
|
``__version__`` variable.
|
||||||
|
|
||||||
Two particular cases are a script version number, and the version of Python
|
One case raised during the discussion of this PEP was the ability to
|
||||||
needed to run the script.
|
declare a minimum Python version that a script needed to run, by analogy with
|
||||||
|
the ``Requires-Python`` core metadata item for packages. Unlike packages,
|
||||||
|
scripts are normally only run by one user or in one environment, in contexts
|
||||||
|
where multiple versions of Python are uncommon. The need for this metadata is
|
||||||
|
therefore much less critical in the case of scripts. As further evidence of
|
||||||
|
this, the two key script runners currently available, ``pipx`` and ``pip-run``,
|
||||||
|
do not offer a means of including this data in a script.
|
||||||
|
|
||||||
In the case of the version number, there are no known tools that try to extract
|
Creating a standard "metadata container" format would unify the various
|
||||||
version information from scripts, so there is no immediate benefit to having the
|
approaches, but in practical terms there is no real need for unification, and
|
||||||
version as metadata, rather than, for example, as a normal comment or a
|
the disruption would either delay adoption, or more likely simply mean script
|
||||||
``__version__`` attribute (see :pep:`396`). If it becomes common for tools to
|
authors would ignore the standard.
|
||||||
want to introspect script versions, this could be added at a later date.
|
|
||||||
|
|
||||||
In the case of the Python version, existing tools provide a means for the *user*
|
This proposal therefore chooses to focus just on the one use case where there is
|
||||||
to specify what Python interpreter to use when running the script (for example,
|
a clear need for something, and no existing standard or common practice.
|
||||||
``pipx run`` provides the ``--python`` command line option), but they do not
|
|
||||||
typically allow the *script* to define a version range, and then automatically
|
|
||||||
pick an interpreter based on that. Having a "supported version" for a script may
|
Why not use a marker per line?
|
||||||
allow the tool to provide better error messages when run with an inappropriate
|
------------------------------
|
||||||
interpreter, but currently, this is largely a theoretical benefit. Again, it is
|
|
||||||
something that can be added later if it becomes a commonly requested feature.
|
Rather than using a comment block with a header, another possibility would be to
|
||||||
|
use a marker on each line, something like::
|
||||||
|
|
||||||
|
# Script-Dependency: requests
|
||||||
|
# Script-Dependency: click
|
||||||
|
|
||||||
|
While this makes it easier to parse lines individually, it has a number of
|
||||||
|
issues. The first is simply that it's rather verbose, and less readable. This is
|
||||||
|
clearly affected by the chosen keyword, but all of the suggested options were
|
||||||
|
(in the author's opinion) less readable than the block comment form.
|
||||||
|
|
||||||
|
More importantly, this form *by design* makes it impossible to require that the
|
||||||
|
dependency specifiers are all together in a single block. As a result, it's not
|
||||||
|
possible for a human reader, without a careful check of the whole file, to be
|
||||||
|
sure that they have identified all of the dependencies. See the question below,
|
||||||
|
"Why not allow multiple dependency blocks and merge them?", for further
|
||||||
|
discussion of this problem.
|
||||||
|
|
||||||
|
Finally, as the reference implementation demonstrates, parsing the "comment
|
||||||
|
block" form isn't, in practice, significantly more difficult than parsing this
|
||||||
|
form.
|
||||||
|
|
||||||
|
|
||||||
|
Why not use a distinct form of comment for the dependency block?
|
||||||
|
----------------------------------------------------------------
|
||||||
|
|
||||||
|
A previous version of this proposal used ``##`` to identify dependency blocks.
|
||||||
|
Unfortunately, however, the flake8 linter implements a rule requiring that
|
||||||
|
comments must have a space after the initial ``#`` sign. While the PEP author
|
||||||
|
considers that rule misguided, it is on by default and as a result would cause
|
||||||
|
checks to fail when faced with a dependency block.
|
||||||
|
|
||||||
|
Furthermore, the ``black`` formatter, although it allows the ``##`` form, does
|
||||||
|
add a space after the ``#`` for most other forms of comment. This means that if
|
||||||
|
we chose an alternative like ``#%``, automatic reformatting would corrupt the
|
||||||
|
dependency block. Forms including a space, like ``# #`` are possible, but less
|
||||||
|
natural for the average user (omitting the space is an obvious mistake to make).
|
||||||
|
|
||||||
|
While it is possible that linters and formatters could be changed to recognise
|
||||||
|
the new standard, the benefit of having a dedicated prefix did not seem
|
||||||
|
sufficient to justify the transition cost, or the risk that users might be using
|
||||||
|
older tools.
|
||||||
|
|
||||||
|
|
||||||
|
Why not allow multiple dependency blocks and merge them?
|
||||||
|
--------------------------------------------------------
|
||||||
|
|
||||||
|
Because it's too easy for the human reader to miss the fact that there's a
|
||||||
|
second dependency block. This could simply result in the script runner
|
||||||
|
unexpectedly downloading extra packages, or it could even be a way to smuggle
|
||||||
|
malicious packages onto a user's machine (by "hiding" a second dependency block
|
||||||
|
in the body of the script).
|
||||||
|
|
||||||
|
While the principle of "don't run untrusted code" applies here, the benefits
|
||||||
|
aren't sufficient to be worth the risk.
|
||||||
|
|
||||||
|
|
||||||
Why not use a more standard data format (e.g., TOML)?
|
Why not use a more standard data format (e.g., TOML)?
|
||||||
|
@ -347,10 +391,9 @@ And finally, there will be tools that expect to *write* dependency data into
|
||||||
scripts -- for example, an IDE with a feature that automatically adds an import
|
scripts -- for example, an IDE with a feature that automatically adds an import
|
||||||
and a dependency specifier when you reference a library function. While
|
and a dependency specifier when you reference a library function. While
|
||||||
libraries exist that allow editing TOML data, they are not always good at
|
libraries exist that allow editing TOML data, they are not always good at
|
||||||
preserving the user's layout, which could include comments, specific formatting,
|
preserving the user's layout. Even if libraries exist which do an effective job
|
||||||
etc. Even if libraries exist which do an effective job at this, expecting all
|
at this, expecting all tools to use such a library is a significant imposition
|
||||||
tools to use such a library is a significant imposition on code supporting this
|
on code supporting this PEP.
|
||||||
PEP.
|
|
||||||
|
|
||||||
By choosing a simple, line-based format with no quoting rules, dependency data
|
By choosing a simple, line-based format with no quoting rules, dependency data
|
||||||
is easy to read (for humans and tools) and easy to write. The format doesn't
|
is easy to read (for humans and tools) and easy to write. The format doesn't
|
||||||
|
@ -358,6 +401,45 @@ have the flexibility of something like TOML, but the use case simply doesn't
|
||||||
demand that sort of flexibility.
|
demand that sort of flexibility.
|
||||||
|
|
||||||
|
|
||||||
|
Why not use (possibly restricted) Python syntax?
|
||||||
|
------------------------------------------------
|
||||||
|
|
||||||
|
This would typically involve storing the dependencies as a (runtime) list
|
||||||
|
variable with a conventional name, such as::
|
||||||
|
|
||||||
|
__requires__ = [
|
||||||
|
"requests",
|
||||||
|
"click",
|
||||||
|
]
|
||||||
|
|
||||||
|
Other suggestions include a static multi-line string, or including the
|
||||||
|
dependencies in the script's docstring.
|
||||||
|
|
||||||
|
The most significant problem with this proposal is that it requires all
|
||||||
|
consumers of the dependency data to implement a Python parser. Even if the
|
||||||
|
syntax is restricted, the *rest* of the script will use the full Python syntax,
|
||||||
|
and trying to define a syntax which can be successfully parsed in isolation from
|
||||||
|
the surrounding code is likely to be extremely difficult and error-prone.
|
||||||
|
|
||||||
|
Furthermore, Python's syntax changes in every release. If extracting dependency
|
||||||
|
data needs a Python parser, the parser will need to know which version of Python
|
||||||
|
the script is written for, and the overhead for a generic tool of having a
|
||||||
|
parser that can handle *multiple* versions of Python is unsustainable.
|
||||||
|
|
||||||
|
Even if the above issues could be addressed, the format would give the
|
||||||
|
impression that the data could be altered at runtime. However, this is not the
|
||||||
|
case in general, and code that tries to do so will encounter unexpected and
|
||||||
|
confusing behaviour.
|
||||||
|
|
||||||
|
And finally, there is no evidence that having dependency data available at
|
||||||
|
runtime is of any practical use. Should such a use be found, it is simple enough
|
||||||
|
to get the data by parsing the source - ``read_dependency_block(__file__)``.
|
||||||
|
|
||||||
|
It is worth noting, though, that the ``pip-run`` utility does implement (an
|
||||||
|
extended form of) this approach. `Further discussion <pip-run issue_>`_ of
|
||||||
|
the ``pip-run`` design is available on the project's issue tracker.
|
||||||
|
|
||||||
|
|
||||||
Why not embed a ``pyproject.toml`` file in the script?
|
Why not embed a ``pyproject.toml`` file in the script?
|
||||||
------------------------------------------------------
|
------------------------------------------------------
|
||||||
|
|
||||||
|
@ -413,6 +495,60 @@ existing solutions are likely to be unwelcome to that audience, and could easily
|
||||||
result in people simply continuing to use existing ad hoc solutions, and ignoring
|
result in people simply continuing to use existing ad hoc solutions, and ignoring
|
||||||
the standard that was intended to make their lives easier.
|
the standard that was intended to make their lives easier.
|
||||||
|
|
||||||
|
Why not infer the requirements from import statements?
|
||||||
|
------------------------------------------------------
|
||||||
|
|
||||||
|
The idea would be to automatically recognize ``import`` statements in the source
|
||||||
|
file and turn them into a list of requirements.
|
||||||
|
|
||||||
|
However, this is infeasible for several reasons. First, the points above about
|
||||||
|
the need to keep the syntax easily parsable, for all Python versions and also
|
||||||
|
by tools written in other languages, apply equally here.
|
||||||
|
|
||||||
|
Second, PyPI and other package repositories conforming to the Simple Repository
|
||||||
|
API do not provide a mechanism to resolve package names from the module names
|
||||||
|
that are imported (see also `this related discussion <import-names_>`_).
|
||||||
|
|
||||||
|
Third, even if repositories did offer this information, the same import name may
|
||||||
|
correspond to several packages on PyPI. One might object that disambiguating
|
||||||
|
which package is wanted would only be needed if there are several projects
|
||||||
|
providing the same import name. However, this would make it easy for anyone to
|
||||||
|
unintentionally or malevolently break working scripts, by uploading a package to
|
||||||
|
PyPI providing an import name that is the same as an existing project. The
|
||||||
|
alternative where, among the candidates, the first package to have been
|
||||||
|
registered on the index is chosen, would be confusing in case a popular package
|
||||||
|
is developed with the same import name as an existing obscure package, and even
|
||||||
|
harmful if the existing package is malware intentionally uploaded with a
|
||||||
|
sufficiently generic import name that has a high probability of being reused.
|
||||||
|
|
||||||
|
A related idea would be to attach the requirements as comments to the import
|
||||||
|
statements instead of gathering them in a block, with a syntax such as::
|
||||||
|
|
||||||
|
import numpy as np # requires: numpy
|
||||||
|
import rich # requires: rich
|
||||||
|
|
||||||
|
This still suffers from parsing difficulties. Also, where to place the comment
|
||||||
|
in the case of multiline imports is ambiguous and may look ugly::
|
||||||
|
|
||||||
|
from PyQt5.QtWidgets import (
|
||||||
|
QCheckBox, QComboBox, QDialog, QDialogButtonBox,
|
||||||
|
QGridLayout, QLabel, QSpinBox, QTextEdit
|
||||||
|
) # requires: PyQt5
|
||||||
|
|
||||||
|
Furthermore, this syntax cannot behave as might be intuitively expected
|
||||||
|
in all situations. Consider::
|
||||||
|
|
||||||
|
import platform
|
||||||
|
if platform.system() == "Windows":
|
||||||
|
import pywin32 # requires: pywin32
|
||||||
|
|
||||||
|
Here, the user's intent is that the package is only required on Windows, but
|
||||||
|
this cannot be understood by the script runner (the correct way to write
|
||||||
|
it would be ``requires: pywin32 ; sys_platform == 'win32'``).
|
||||||
|
|
||||||
|
(Thanks to Jean Abou-Samra for the clear discussion of this point)
|
||||||
|
|
||||||
|
|
||||||
Why not just set up a Python project with a ``pyproject.toml``?
|
Why not just set up a Python project with a ``pyproject.toml``?
|
||||||
---------------------------------------------------------------
|
---------------------------------------------------------------
|
||||||
|
|
||||||
|
@ -476,44 +612,6 @@ Essentially, though, the issue here is that there is an explicitly stated
|
||||||
requirement that the format supports storing dependency data *in the script file
|
requirement that the format supports storing dependency data *in the script file
|
||||||
itself*. Solutions that don't do that are simply ignoring that requirement.
|
itself*. Solutions that don't do that are simply ignoring that requirement.
|
||||||
|
|
||||||
Why not use (possibly restricted) Python syntax?
|
|
||||||
------------------------------------------------
|
|
||||||
|
|
||||||
This would typically involve storing the dependencies as a (runtime) list
|
|
||||||
variable with a conventional name, such as::
|
|
||||||
|
|
||||||
__requires__ = [
|
|
||||||
"requests",
|
|
||||||
"click",
|
|
||||||
]
|
|
||||||
|
|
||||||
Other suggestions include a static multi-line string, or including the
|
|
||||||
dependencies in the script's docstring.
|
|
||||||
|
|
||||||
The most significant problem with this proposal is that it requires all
|
|
||||||
consumers of the dependency data to implement a Python parser. Even if the
|
|
||||||
syntax is restricted, the *rest* of the script will use the full Python syntax,
|
|
||||||
and trying to define a syntax which can be successfully parsed in isolation from
|
|
||||||
the surrounding code is likely to be extremely difficult and error-prone.
|
|
||||||
|
|
||||||
Furthermore, Python's syntax changes in every release. If extracting dependency
|
|
||||||
data needs a Python parser, the parser will need to know which version of Python
|
|
||||||
the script is written for, and the overhead for a generic tool of having a
|
|
||||||
parser that can handle *multiple* versions of Python is unsustainable.
|
|
||||||
|
|
||||||
Even if the above issues could be addressed, the format would give the
|
|
||||||
impression that the data could be altered at runtime. However, this is not the
|
|
||||||
case in general, and code that tries to do so will encounter unexpected and
|
|
||||||
confusing behaviour.
|
|
||||||
|
|
||||||
And finally, there is no evidence that having dependency data available at
|
|
||||||
runtime is of any practical use. Should such a use be found, it is simple enough
|
|
||||||
to get the data by parsing the source - ``read_dependency_block(__file__)``.
|
|
||||||
|
|
||||||
It is worth noting, though, that the ``pip-run`` utility does implement (an
|
|
||||||
extended form of) this approach. `Further discussion <pip-run issue_>`_ of
|
|
||||||
the ``pip-run`` design is available on the project's issue tracker.
|
|
||||||
|
|
||||||
Should scripts be able to specify a package index?
|
Should scripts be able to specify a package index?
|
||||||
--------------------------------------------------
|
--------------------------------------------------
|
||||||
|
|
||||||
|
@ -555,6 +653,7 @@ References
|
||||||
.. _pip-run issue: https://github.com/jaraco/pip-run/issues/44
|
.. _pip-run issue: https://github.com/jaraco/pip-run/issues/44
|
||||||
.. _language survey: https://dbohdan.com/scripts-with-dependencies
|
.. _language survey: https://dbohdan.com/scripts-with-dependencies
|
||||||
.. _pyproject without wheels: https://discuss.python.org/t/projects-that-arent-meant-to-generate-a-wheel-and-pyproject-toml/29684
|
.. _pyproject without wheels: https://discuss.python.org/t/projects-that-arent-meant-to-generate-a-wheel-and-pyproject-toml/29684
|
||||||
|
.. _import-names: https://discuss.python.org/t/record-the-top-level-names-of-a-wheel-in-metadata/29494
|
||||||
|
|
||||||
Copyright
|
Copyright
|
||||||
=========
|
=========
|
||||||
|
|