210 lines
9.0 KiB
ReStructuredText
210 lines
9.0 KiB
ReStructuredText
PEP: 625
|
||
Title: Filename of a Source Distribution
|
||
Author: Tzu-ping Chung <uranusjr@gmail.com>,
|
||
Paul Moore <p.f.moore@gmail.com>
|
||
PEP-Delegate: Pradyun Gedam <pradyunsg@gmail.com>
|
||
Discussions-To: https://discuss.python.org/t/draft-pep-file-name-of-a-source-distribution/4686
|
||
Status: Accepted
|
||
Type: Standards Track
|
||
Topic: Packaging
|
||
Content-Type: text/x-rst
|
||
Created: 08-Jul-2020
|
||
Post-History: 08-Jul-2020
|
||
Resolution: https://discuss.python.org/t/pep-625-file-name-of-a-source-distribution/4686/159
|
||
|
||
Abstract
|
||
========
|
||
|
||
This PEP describes a standard naming scheme for a Source Distribution, also
|
||
known as an *sdist*. An sdist is distinct from an arbitrary archive file
|
||
containing source code of Python packages, and can be used to communicate
|
||
information about the distribution to packaging tools.
|
||
|
||
A standard sdist specified here is a gzipped tar file with a specially
|
||
formatted filename and the usual ``.tar.gz`` suffix. This PEP does not specify
|
||
the contents of the tarball, as that is covered in other specifications.
|
||
|
||
Motivation
|
||
==========
|
||
|
||
An sdist is a Python package distribution that contains "source code" of the
|
||
Python package, and requires a build step to be turned into a wheel on
|
||
installation. This format is often considered as an unbuilt counterpart of a
|
||
:pep:`427` wheel, and given special treatments in various parts of the
|
||
packaging ecosystem.
|
||
|
||
The content of an sdist is specified in :pep:`517` and :pep:`643`, but currently
|
||
the filename of the sdist is incompletely specified, meaning that consumers
|
||
of the format must download and process the sdist to confirm the name and
|
||
version of the distribution included within.
|
||
|
||
Installers currently rely on heuristics to infer the name and/or version from
|
||
the filename, to help the installation process. pip, for example, parses the
|
||
filename of an sdist from a :pep:`503` index, to obtain the distribution's
|
||
project name and version for dependency resolution purposes. But due to the
|
||
lack of specification, the installer does not have any guarantee as to the
|
||
correctness of the inferred data, and must verify it at some point by locally
|
||
building the distribution metadata.
|
||
|
||
This build step is awkward for a certain class of operations, when the user
|
||
does not expect the build process to occur. `pypa/pip#8387`_ describes an
|
||
example. The command ``pip download --no-deps --no-binary=numpy numpy`` is
|
||
expected to only download an sdist for numpy, since we do not need to check
|
||
for dependencies, and both the name and version are available by introspecting
|
||
the downloaded filename. pip, however, cannot assume the downloaded archive
|
||
follows the convention, and must build and check the metadata. For a :pep:`518`
|
||
project, this means running the ``prepare_metadata_for_build_wheel`` hook
|
||
specified in :pep:`517`, which incurs significant overhead.
|
||
|
||
|
||
Rationale
|
||
=========
|
||
|
||
By creating a special filename scheme for the sdist format, this PEP frees up
|
||
tools from the time-consuming metadata verification step when they only need
|
||
the metadata available in the filename.
|
||
|
||
This PEP also serves as the formal specification to the long-standing
|
||
filename convention used by the current sdist implementations. The filename
|
||
contains the distribution name and version, to aid tools identifying a
|
||
distribution without needing to download, unarchive the file, and perform
|
||
costly metadata generation for introspection, if all the information they need
|
||
is available in the filename.
|
||
|
||
|
||
Specification
|
||
=============
|
||
|
||
The name of an sdist should be ``{distribution}-{version}.tar.gz``.
|
||
|
||
* ``distribution`` is the name of the distribution as defined in :pep:`345`,
|
||
and normalised as described in `the wheel spec`_ e.g. ``'pip'``,
|
||
``'flit_core'``.
|
||
* ``version`` is the version of the distribution as defined in :pep:`440`,
|
||
e.g. ``20.2``, and normalised according to the rules in that PEP.
|
||
|
||
An sdist must be a gzipped tar archive in pax format, that is able to be
|
||
extracted by the standard library ``tarfile`` module with the open flag
|
||
``'r:gz'``.
|
||
|
||
Code that produces an sdist file MUST give the file a name that matches this
|
||
specification. The specification of the ``build_sdist`` hook from :pep:`517` is
|
||
extended to require this naming convention.
|
||
|
||
Code that processes sdist files MAY determine the distribution name and version
|
||
by simply parsing the filename, and is not required to verify that information
|
||
by generating or reading the metadata from the sdist contents.
|
||
|
||
Conforming sdist files can be recognised by the presence of the ``.tar.gz``
|
||
suffix and a *single* hyphen in the filename. Note that some legacy files may
|
||
also match these criteria, but this is not expected to be an issue in practice.
|
||
See the "Backwards Compatibility" section of this document for more details.
|
||
|
||
|
||
Backwards Compatibility
|
||
=======================
|
||
|
||
The new filename scheme is a subset of the current informal naming
|
||
convention for sdist files, so tools that create or publish files conforming
|
||
to this standard will be readable by older tools that only understand the
|
||
previous naming conventions.
|
||
|
||
Tools that consume sdist filenames would technically not be able to determine
|
||
whether a file is using the new standard or a legacy form. However, a review
|
||
of the filenames on PyPI determined that 37% of files are obviously legacy
|
||
(because they contain multiple or no hyphens) and of the remainder, parsing
|
||
according to this PEP gives the correct answer in all but 0.004% of cases.
|
||
|
||
Currently, tools that consume sdists should, if they are to be fully correct,
|
||
treat the name and version parsed from the filename as provisional, and verify
|
||
them by downloading the file and generating the actual metadata (or reading it,
|
||
if the sdist conforms to :pep:`643`). Tools supporting this specification can
|
||
treat the name and version from the filename as definitive. In theory, this
|
||
could risk mistakes if a legacy filename is assumed to conform to this PEP,
|
||
but in practice the chance of this appears to be vanishingly small.
|
||
|
||
|
||
Rejected Ideas
|
||
==============
|
||
|
||
Rely on the specification for sdist metadata
|
||
--------------------------------------------
|
||
|
||
Since this PEP was first written, :pep:`643` has been accepted, defining a
|
||
trustworthy, standard sdist metadata format. This allows distribution metadata
|
||
(and in particular name and version) to be determined statically.
|
||
|
||
This is not considered sufficient, however, as in a number of significant
|
||
cases (for example, reading filenames from a package index) the application
|
||
only has access to the filename, and reading metadata would involve a
|
||
potentially costly download.
|
||
|
||
Use a dedicated file extension
|
||
------------------------------
|
||
|
||
The original version of this PEP proposed a filename of
|
||
``{distribution}-{version}.sdist``. This has the advantage of being explicit,
|
||
as well as allowing a future change to the storage format without needing a
|
||
further change of the file naming convention.
|
||
|
||
However, there are significant compatibility issues with a new extension. Index
|
||
servers may currently disallow unknown extensions, and if we introduced a new
|
||
one, it is not clear how to handle cases like a legacy index trying to mirror an
|
||
index that hosts new-style sdists. Is it acceptable to only partially mirror,
|
||
omitting sdists for newer versions of projects? Also, build backends that produce
|
||
the new format would be incompaible with index servers that only accept the old
|
||
format, and as there is often no way for a user to request an older version of a
|
||
backend when doing a build, this could make it impossible to build and upload
|
||
sdists.
|
||
|
||
Augment a currently common sdist naming scheme
|
||
----------------------------------------------
|
||
|
||
A scheme ``{distribution}-{version}.sdist.tar.gz`` was raised during the
|
||
initial discussion. This was abandoned due to backwards compatibility issues
|
||
with currently available installation tools. pip 20.1, for example, would
|
||
parse ``distribution-1.0.sdist.tar.gz`` as project ``distribution`` with
|
||
version ``1.0.sdist``. This would cause the sdist to be downloaded, but fail to
|
||
install due to inconsistent metadata.
|
||
|
||
The main advantage of this proposal was that it is easier for tools to
|
||
recognise the new-style naming. But this is not a particularly significant
|
||
benefit, given that all sdists with a single hyphen in the name are parsed
|
||
the same way under the old and new rules.
|
||
|
||
|
||
Open Issues
|
||
===========
|
||
|
||
The contents of an sdist are required to contain a single top-level directory
|
||
named ``{name}-{version}``. Currently no normalisation rules are required
|
||
for the components of this name. Should this PEP require that the same normalisation
|
||
rules are applied here as for the filename? Note that in practice, it is likely
|
||
that tools will create the two names using the same code, so normalisation is
|
||
likely to happen naturally, even if it is not explicitly required.
|
||
|
||
|
||
References
|
||
==========
|
||
|
||
.. _`pypa/pip#8387`: https://github.com/pypa/pip/issues/8387
|
||
.. _`the wheel spec`: https://packaging.python.org/en/latest/specifications/binary-distribution-format/
|
||
|
||
|
||
Copyright
|
||
=========
|
||
|
||
This document is placed in the public domain or under the CC0-1.0-Universal
|
||
license, whichever is more permissive.
|
||
|
||
|
||
|
||
..
|
||
Local Variables:
|
||
mode: indented-text
|
||
indent-tabs-mode: nil
|
||
sentence-end-double-space: t
|
||
fill-column: 70
|
||
coding: utf-8
|
||
End:
|