PEP 625: Update following discussions (GH-2671)
parent
ed9d65c1d4
commit
43782e9251
151
pep-0625.rst
|
@ -3,7 +3,7 @@ Title: File name of a Source Distribution
|
|||
Author: Tzu-ping Chung <uranusjr@gmail.com>,
|
||||
Paul Moore <p.f.moore@gmail.com>
|
||||
Discussions-To: https://discuss.python.org/t/draft-pep-file-name-of-a-source-distribution/4686
|
||||
Status: Draft
|
||||
Status: Deferred
|
||||
Type: Standards Track
|
||||
Topic: Packaging
|
||||
Content-Type: text/x-rst
|
||||
|
@ -14,14 +14,17 @@ Abstract
|
|||
========
|
||||
|
||||
This PEP describes a standard naming scheme for a Source Distribution, also
|
||||
known as an *sdist*. This scheme distinguishes an sdist from an arbitrary
|
||||
archive file containing source code of Python packages, and can be used to
|
||||
communicate information about the distribution to packaging tools.
|
||||
known as an *sdist*. An sdist is distinct from an arbitrary archive file
|
||||
containing source code of Python packages, and can be used to communicate
|
||||
information about the distribution to packaging tools.
|
||||
|
||||
A standard sdist specified here is a gzipped tar file with a specially
|
||||
formatted file stem and a ``.sdist`` suffix. This PEP does not specify the
|
||||
contents of the tarball.
|
||||
formatted filename and the usual ``.tar.gz`` suffix. This PEP does not specify
|
||||
the contents of the tarball, as that is covered in other specifications.
|
||||
|
||||
**Note**: This PEP has been deferred until :pep:`643` has seen wider adoption
|
||||
(in particular, until Metadata 2.2 is accepted on PyPI, and a number of common
|
||||
backends have implemented it).
|
||||
|
||||
Motivation
|
||||
==========
|
||||
|
@ -32,17 +35,18 @@ installation. This format is often considered as an unbuilt counterpart of a
|
|||
:pep:`427` wheel, and given special treatments in various parts of the
|
||||
packaging ecosystem.
|
||||
|
||||
Compared to wheel, however, the sdist is entirely unspecified, and currently
|
||||
works by convention. The widely accepted format of an sdist is defined by the
|
||||
implementation of distutils and setuptools, which creates a source code
|
||||
archive in a predictable format and file name scheme. Installers exploit this
|
||||
predictability to assign this format certain contextual information that helps
|
||||
the installation process. pip, for example, parses the file name of an sdist
|
||||
from a :pep:`503` index, to obtain the distribution's project name and version
|
||||
for dependency resolution purposes. But due to the lack of specification,
|
||||
the installer does not have any guarantee as to the correctness of the inferred
|
||||
message, and must verify it at some point by locally building the distribution
|
||||
metadata.
|
||||
The content of an sdist is specified in :pep:`517` and :pep:`643`, but currently
|
||||
the filename of the sdist is incompletely specified, meaning that consumers
|
||||
of the format must download and process the sdist to confirm the name and
|
||||
version of the distribution included within.
|
||||
|
||||
Installers currently rely on heuristics to infer the name and/or version from
|
||||
the filename, to help the installation process. pip, for example, parses the
|
||||
filename of an sdist from a :pep:`503` index, to obtain the distribution's
|
||||
project name and version for dependency resolution purposes. But due to the
|
||||
lack of specification, the installer does not have any guarantee as to the
|
||||
correctness of the inferred data, and must verify it at some point by locally
|
||||
building the distribution metadata.
|
||||
|
||||
This build step is awkward for a certain class of operations, when the user
|
||||
does not expect the build process to occur. `pypa/pip#8387`_ describes an
|
||||
|
@ -73,63 +77,87 @@ is available in the file name.
|
|||
Specification
|
||||
=============
|
||||
|
||||
The name of an sdist should be ``{distribution}-{version}.sdist``.
|
||||
The name of an sdist should be ``{distribution}-{version}.tar.gz``.
|
||||
|
||||
* ``distribution`` is the name of the distribution as defined in :pep:`345`,
|
||||
and normalised according to :pep:`503`, e.g. ``'pip'``, ``'flit-core'``.
|
||||
and normalised as described in `the wheel spec`_, e.g. ``'pip'``,
|
||||
``'flit_core'``.
|
||||
* ``version`` is the version of the distribution as defined in :pep:`440`,
|
||||
e.g. ``20.2``.
|
||||
e.g. ``20.2``, and normalised according to the rules in that PEP.
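As a minimal sketch of how a tool might assemble a filename from these two components, the following assumes the name normalisation described in the wheel spec (runs of ``-``, ``_`` and ``.`` replaced by a single ``_``, then lowercased) and assumes the version string has already been normalised according to :pep:`440`. The helper names ``normalize_distribution`` and ``sdist_filename`` are hypothetical and not part of any specification.

```python
import re


def normalize_distribution(name: str) -> str:
    # Wheel-spec style normalisation: collapse runs of "-", "_" and "."
    # into a single "_" and lowercase the result.
    return re.sub(r"[-_.]+", "_", name).lower()


def sdist_filename(name: str, version: str) -> str:
    # Assumption: ``version`` is already normalised according to PEP 440;
    # this sketch does not attempt to normalise it.
    return f"{normalize_distribution(name)}-{version}.tar.gz"
```

Because the normalised name cannot contain a hyphen, the only hyphen in the resulting filename is the separator between the name and the version, provided the version itself contains none.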
|
||||
|
||||
Each component is escaped according to the same rules as :pep:`427`.
|
||||
An sdist must be a gzipped tar archive in pax format, that is able to be
|
||||
extracted by the standard library ``tarfile`` module with the open flag
|
||||
``'r:gz'``.
|
||||
|
||||
An sdist must be a gzipped tar archive that is able to be extracted by the
|
||||
standard library ``tarfile`` module with the open flag ``'r:gz'``.
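The archive requirement can be illustrated with the standard library alone. The sketch below builds a tiny gzipped tar archive in pax format in memory and reads it back with the ``'r:gz'`` flag. The function names and the single placeholder member are assumptions made for illustration; this PEP does not specify the contents of the tarball.

```python
import io
import tarfile


def make_example_archive(top_level: str) -> bytes:
    # Build a small gzipped tar archive in pax format, entirely in memory.
    payload = b"placeholder content\n"
    buffer = io.BytesIO()
    with tarfile.open(
        fileobj=buffer, mode="w:gz", format=tarfile.PAX_FORMAT
    ) as archive:
        info = tarfile.TarInfo(name=f"{top_level}/PKG-INFO")
        info.size = len(payload)
        archive.addfile(info, io.BytesIO(payload))
    return buffer.getvalue()


def list_members(data: bytes) -> list[str]:
    # Open with the 'r:gz' flag named in the specification.
    with tarfile.open(fileobj=io.BytesIO(data), mode="r:gz") as archive:
        return archive.getnames()
```

A file that ``list_members`` cannot open would not satisfy the requirement above, whatever its name.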
|
||||
Code that produces an sdist file MUST give the file a name that matches this
|
||||
specification. The specification of the ``build_sdist`` hook from :pep:`517` is
|
||||
extended to require this naming convention.
|
||||
|
||||
Code that processes sdist files MAY determine the distribution name and version
|
||||
by simply parsing the filename, and is not required to verify that information
|
||||
by generating or reading the metadata from the sdist contents.
|
||||
|
||||
Conforming sdist files can be recognised by the presence of the ``.tar.gz``
|
||||
suffix and a *single* hyphen in the filename. Note that some legacy files may
|
||||
also match these criteria, but this is not expected to be an issue in practice.
|
||||
See the "Backwards Compatibility" section of this document for more details.
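The recognition rule above (a ``.tar.gz`` suffix and a single hyphen) can be sketched as follows. The function name ``parse_sdist_filename`` is hypothetical, and the sketch only splits the filename; it does not check that the name and version are in normalised form.

```python
from typing import Optional, Tuple


def parse_sdist_filename(filename: str) -> Optional[Tuple[str, str]]:
    suffix = ".tar.gz"
    if not filename.endswith(suffix):
        return None
    stem = filename[: -len(suffix)]
    # No hyphen or multiple hyphens: cannot be a conforming name,
    # so treat the file as legacy.
    if stem.count("-") != 1:
        return None
    name, version = stem.split("-")
    if not name or not version:
        return None
    return name, version
```

A caller that receives ``None`` would fall back to whatever legacy handling it already has, such as building the metadata to confirm the name and version.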
|
||||
|
||||
|
||||
Backwards Compatibility
|
||||
=======================
|
||||
|
||||
The new file name scheme should not incur backwards incompatibility in
|
||||
existing tools. Installers are likely to have already implemented logic to
|
||||
exclude extensions they do not understand, since they already need to deal
|
||||
with legacy formats on PyPI such as ``.rpm`` and ``.egg``. They should be able
|
||||
to correctly ignore files with extension ``.sdist``.
|
||||
The new filename scheme is a subset of the current informal naming
|
||||
convention for sdist files, so tools that create or publish files conforming
|
||||
to this standard will be readable by older tools that only understand the
|
||||
previous naming conventions.
|
||||
|
||||
pip, for example, skips this extension with the following debug message::
|
||||
Tools that consume sdist filenames would technically not be able to determine
|
||||
whether a file is using the new standard or a legacy form. However, a review
|
||||
of the filenames on PyPI determined that 37% of files are obviously legacy
|
||||
(because they contain multiple or no hyphens) and of the remainder, parsing
|
||||
according to this PEP gives the correct answer in all but 0.004% of cases.
|
||||
|
||||
Skipping link: unsupported archive format: sdist: <URL to file>
|
||||
|
||||
While setuptools ignores it silently.
|
||||
Currently, tools that consume sdists should, if they are to be fully correct,
|
||||
treat the name and version parsed from the filename as provisional, and verify
|
||||
them by downloading the file and generating the actual metadata (or reading it,
|
||||
if the sdist conforms to :pep:`643`). Tools supporting this specification can
|
||||
treat the name and version from the filename as definitive. In theory, this
|
||||
could risk mistakes if a legacy filename is assumed to conform to this PEP,
|
||||
but in practice the chance of this appears to be vanishingly small.
|
||||
|
||||
|
||||
Rejected Ideas
|
||||
==============
|
||||
|
||||
Create specification for sdist metadata
|
||||
---------------------------------------
|
||||
Rely on the specification for sdist metadata
|
||||
--------------------------------------------
|
||||
|
||||
The topic of creating a trustworthy, standard sdist metadata format as a means
|
||||
to distinguish sdists from arbitrary archive files has been raised and
|
||||
discussed multiple times, but has yet to make significant progress due to
|
||||
the complexity of potential metadata inconsistency between an sdist and a
|
||||
wheel built from it.
|
||||
Since this PEP was first written, :pep:`643` has been accepted, defining a
|
||||
trustworthy, standard sdist metadata format. This allows distribution metadata
|
||||
(and in particular name and version) to be determined statically.
|
||||
|
||||
This PEP does not exclude the possibility of creating a metadata specification
|
||||
for sdists in the future. But by specifying only the file name of an sdist, a
|
||||
tool can reliably identify an sdist, and perform useful introspection on its
|
||||
identity, without going into the details required for metadata specification.
|
||||
This is not considered sufficient, however, as in a number of significant
|
||||
cases (for example, reading filenames from a package index) the application
|
||||
only has access to the filename, and reading metadata would involve a
|
||||
potentially costly download.
|
||||
|
||||
Use a currently common sdist naming scheme
|
||||
------------------------------------------
|
||||
Use a dedicated file extension
|
||||
------------------------------
|
||||
|
||||
There is a currently established practice to name an sdist in the format of
|
||||
``{distribution}-{version}.[tar.gz|zip]``.
|
||||
The original version of this PEP proposed a filename of
|
||||
``{distribution}-{version}.sdist``. This has the advantage of being explicit,
|
||||
as well as allowing a future change to the storage format without needing a
|
||||
further change of the file naming convention.
|
||||
|
||||
Popular source code management services use a similar scheme to name the
|
||||
downloaded source archive. GitHub, for example, uses ``distribution-1.0.zip``
|
||||
as the archive name containing source code of repository ``distribution`` on
|
||||
branch ``1.0``. Giving this scheme a special meaning would cause confusion
|
||||
since a source archive may not be a valid sdist.
|
||||
However, there are significant compatibility issues with a new extension. Index
|
||||
servers may currently disallow unknown extensions, and if we introduced a new
|
||||
one, it is not clear how to handle cases like a legacy index trying to mirror an
|
||||
index that hosts new-style sdists. Is it acceptable to only partially mirror,
|
||||
omitting sdists for newer versions of projects? Also, build backends that produce
|
||||
the new format would be incompatible with index servers that only accept the old
|
||||
format, and as there is often no way for a user to request an older version of a
|
||||
backend when doing a build, this could make it impossible to build and upload
|
||||
sdists.
|
||||
|
||||
Augment a currently common sdist naming scheme
|
||||
----------------------------------------------
|
||||
|
@ -141,15 +169,28 @@ parse ``distribution-1.0.sdist.tar.gz`` as project ``distribution`` with
|
|||
version ``1.0.sdist``. This would cause the sdist to be downloaded, but fail to
|
||||
install due to inconsistent metadata.
|
||||
|
||||
The same problem exists for all common archive suffixes. To avoid confusing
|
||||
old installers, the sdist scheme must use a suffix that they do not identify
|
||||
as an archive.
|
||||
The main advantage of this proposal was that it is easier for tools to
|
||||
recognise the new-style naming. But this is not a particularly significant
|
||||
benefit, given that all sdists with a single hyphen in the name are parsed
|
||||
the same way under the old and new rules.
|
||||
|
||||
|
||||
Open Issues
|
||||
===========
|
||||
|
||||
The contents of an sdist are required to contain a single top-level directory
|
||||
named ``{name}-{version}``. Currently no normalisation rules are required
|
||||
for the components of this name. Should this PEP require that the same normalisation
|
||||
rules are applied here as for the filename? Note that in practice, it is likely
|
||||
that tools will create the two names using the same code, so normalisation is
|
||||
likely to happen naturally, even if it is not explicitly required.
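To make the question concrete, here is one way a tool could compare the top-level directory with the filename, assuming the stricter reading in which the two must match exactly. The helper name ``has_expected_top_level`` is hypothetical, and this check is not required by the PEP.

```python
from typing import Iterable


def has_expected_top_level(member_names: Iterable[str], filename: str) -> bool:
    # Assumption: the expected directory is the filename without ".tar.gz".
    suffix = ".tar.gz"
    if not filename.endswith(suffix):
        return False
    expected = filename[: -len(suffix)]
    names = list(member_names)
    # Every member must be the directory itself or live beneath it.
    return bool(names) and all(
        name == expected or name.startswith(expected + "/") for name in names
    )
```

Under this reading, an archive named ``flit_core-3.8.0.tar.gz`` whose contents sit under ``flit-core-3.8.0/`` would fail the check, which is exactly the case the open issue asks about.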
|
||||
|
||||
|
||||
References
|
||||
==========
|
||||
|
||||
.. _`pypa/pip#8387`: https://github.com/pypa/pip/issues/8387
|
||||
.. _`the wheel spec`: https://packaging.python.org/en/latest/specifications/binary-distribution-format/
|
||||
|
||||
|
||||
Copyright
|
||||
|
|