diff --git a/pep-0625.rst b/pep-0625.rst new file mode 100644 index 000000000..a5b8c6a6a --- /dev/null +++ b/pep-0625.rst @@ -0,0 +1,168 @@ +PEP: 625 +Title: File name of a Source Distribution +Author: Tzu-ping Chung , + Paul Moore +Discussions-To: https://discuss.python.org/t/draft-pep-file-name-of-a-source-distribution/4686 +Status: Draft +Type: Standards Track +Content-Type: text/x-rst +Created: 08-Jul-2020 +Post-History: 08-Jul-2020 + +Abstract +======== + +This PEP describes a standard naming scheme for a Source Distribution, also +known as an *sdist*. This scheme distinguishes an sdist from an arbitrary +archive file containing source code of Python packages, and can be used to +communicate information about the distribution to packaging tools. + +A standard sdist specified here is a gzipped tar file with a specially +formatted file stem and a ``.sdist`` suffix. This PEP does not specify the +contents of the tarball. + + +Motivation +========== + +An sdist is a Python package distribution that contains "source code" of the +Python package, and requires a build step to be turned into a wheel on +installation. This format is often considered as an unbuilt counterpart of a +:pep:`427` wheel, and given special treatments in various parts of the +packaging ecosystem. + +Compared to wheel, however, the sdist is entirely unspecified, and currently +works by convention. The widely accepted format of an sdist is defined by the +implementation of distutils and setuptools, which creates a source code +archive in a predictable format and file name scheme. Installers exploit this +predictability to assign this format certain contextual information that helps +the installation process. pip, for example, parses the file name of an sdist +from a :pep:`503` index, to obtain the distribution's project name and version +for dependency resolution purposes. But due to the lack of specification, +the installer does not have any guarantee to the correctness of the inferred +message, and must verify it at some point by locally building the distribution +metadata. + +This build step is awkward for a certin class of operations, when the user +does not expect the build process to occur. `pypa/pip#8387`_ describes an +example. The command ``pip download --no-deps --no-binary=numpy numpy`` is +expected to only download an sdist for numpy, since we do not need to check +for dependencies, and both the name and version are available by introspecting +the downloaded file name. pip, however, cannot assume the downloaded archive +follows the convention, and must build check the metadata. For a :pep:`518` +project, this means running the ``prepare_metadata_for_build_wheel`` hook +specified in :pep:`517`, which incurs significant overhead. + + +Rationale +========= + +By creating a special file name scheme for the sdist format, this PEP frees up +tools from the time-consuming metadata verification step when they only need +the metadata available in the file name. + +This PEP also serves as the formal specification to the long-standing +file name convention used by the current sdist implementations. The file name +contains the distribution name and version, to aid tools identifying a +distribution without needing to download, unarchieve the file, and perform +costly metadata generation for introspection, if all the information they need +are available in the file name. + + +Specification +============= + +The name of an sdist should be ``{distribution}-{version}.sdist``. + +* ``distribution`` is the name of the distribution as defined in :pep:`345`, + and normalised according to :pep:`503`, e.g. ``'pip'``, ``'flit-core'``. +* ``version`` is the version of the distribution as defined in :pep:`440`, + e.g. ``20.2``. + +Each component is escaped according to the same rules as :pep:`427`. + +An sdist must be a gzipped tar archive that is able to be extracted by the +standard library ``tarfile`` module with the open flag ``'r:gz'``. + + +Backwards Compatibility +======================= + +The new file name scheme should not incur backwards incompatibility in +existing tools. Installers are likely to have already implemented logic to +exclude extensions they do not understand, since they already need to deal +with legacy formats on PyPI such as ``.rpm`` and ``.egg``. They should be able +to correctly ignore files with extension ``.sdist``. + +pip, for example, skips this extension with the following debug message:: + + Skipping link: unsupported archive format: sdist: + +While setuptools ignores it silently. + + +Rejected Ideas +============== + +Create specification for sdist metadata +--------------------------------------- + +The topic of creating a trustworthy, standard sdist metadata format as a mean +to distinguish sdists from arbitrary archive files has been raised and +discussed multiple times, but has yet to make significant progress due to +the complexity of potential metadata inconsistency between an sdist and a +wheel built from it. + +This PEP does not exclude the possibility of creating a metadata specification +for sdists in the future. But by specifying only the file name of an sdist, a +tool can reliably identify an sdist, and perform useful introspection to its +identity, without going into the details required for metadata specification. + +Use a currently common sdist naming scheme +------------------------------------------ + +There is a currently established practice to name an sdist in the format of +``{distribution}-{version}.[tar.gz|zip]``. + +Popular source code management services use a similar scheme to name the +downloaded source archive. GitHub, for example, uses ``distribution-1.0.zip`` +as the arhieve name containing source code of repository ``distribution`` on +branch ``1.0``. Giving this scheme a special meaning would cause confusion +since a source archive may not a valid sdist. + +Augment a currently common sdist naming scheme +---------------------------------------------- + +A scheme ``{distribution}-{version}.sdist.tar.gz`` was raised during the +initial discussion. This was abandoned due to backwards compatibility issues +with currently available installation tools. pip 20.1, for example, would +parse ``distribution-1.0.sdist.tar.gz`` as project ``distribution`` with +version ``1.0.sdist``. This would cause the sdist be downloaded, but fail to +install due to inconsistent metadata. + +The same problem exists for all common archive suffixes. To avoid confusing +old installers, the sdist scheme must use a suffix that they do not identify +as an archive. + + +References +========== + +.. _`pypa/pip#8387`: https://github.com/pypa/pip/issues/8387 + + +Copyright +========= + +This document is placed in the public domain or under the CC0-1.0-Universal +license, whichever is more permissive. + + + .. + Local Variables: + mode: indented-text + indent-tabs-mode: nil + sentence-end-double-space: t + fill-column: 70 + coding: utf-8 + End: