567 lines
20 KiB
ReStructuredText
567 lines
20 KiB
ReStructuredText
PEP: 610
|
||
Title: Recording the Direct URL Origin of installed distributions
|
||
Author: Stéphane Bidoul <stephane.bidoul@gmail.com>, Chris Jerdonek <chris.jerdonek@gmail.com>
|
||
Sponsor: Nick Coghlan <ncoghlan@gmail.com>
|
||
Discussions-To: https://discuss.python.org/t/recording-the-source-url-of-an-installed-distribution/1535
|
||
Status: Draft
|
||
Type: Standards Track
|
||
Content-Type: text/x-rst
|
||
Created: 21-Apr-2019
|
||
Post-History:
|
||
|
||
Abstract
|
||
========
|
||
|
||
Following PEP 440, a distribution can be identified by a name and either a
|
||
version, or a direct URL reference (see `PEP440 Direct References`_).
|
||
After installation, the name and version are captured in the project metadata,
|
||
but currently there is no way to obtain details of the URL used when the
|
||
distribution was identified by a direct URL reference.
|
||
|
||
This proposal defines
|
||
additional metadata, to be added to the installed distribution by the
|
||
installation front end, which records the Direct URL Origin for use by
|
||
consumers which introspect the database of installed packages (see PEP 376).
|
||
|
||
Motivation
|
||
==========
|
||
|
||
The original motivation of this PEP was to permit tools with a "freeze"
|
||
operation allowing a Python environment to be recreated to work in a broader
|
||
range of situations.
|
||
|
||
Specifically, the PEP originated from the desire to address `pip issue #609`_:
|
||
i.e. improving the behavior of ``pip freeze`` in the presence of distributions
|
||
installed from direct URL references. It follows a
|
||
`thread on discuss.python.org`_ about the best course of action to implement
|
||
it.
|
||
|
||
Installation from direct URL references
|
||
---------------------------------------
|
||
|
||
Python installers such as pip are capable of downloading and installing
|
||
distributions from package indexes. They are also capable of downloading
|
||
and installing source code from requirements specifying arbitrary URLs of
|
||
source archives and Version Control Systems (VCS) repositories,
|
||
as standardized in `PEP440 Direct References`_.
|
||
|
||
In other words two relevant installation modes exist.
|
||
|
||
1. the package to install is specified as a name and version specifier:
|
||
|
||
In this case, the installer looks in a package index (or optionally
|
||
using ``--find-links`` in the case of pip) to find the distribution to install.
|
||
|
||
2. The package to install is specified as a direct URL reference:
|
||
|
||
In this case, the installer downloads whatever is specified by the URL
|
||
(typically a wheel, a source archive or a VCS repository) and installs it.
|
||
|
||
In this mode, installers typically download the source code in a
|
||
temporary directory, invoke the PEP 517 build backend to produce a wheel
|
||
if needed, install the wheel, and delete the temporary directory.
|
||
|
||
After installation, no trace of the URL the user requested to download the
|
||
package is left on the user system.
|
||
|
||
Freezing an environment
|
||
-----------------------
|
||
|
||
Pip also sports a command named ``pip freeze`` which examines the Database of
|
||
Installed Python Distributions to generate a list of requirements. The main
|
||
goal of this command is to help users generating a list of requirements that
|
||
will later allow the re-installation the same environment with the highest
|
||
possible fidelity.
|
||
|
||
As of pip version 19.3, the ``pip freeze`` command outputs a ``name==version``
|
||
line for each installed
|
||
distribution (except for editable installs). To achieve the goal of
|
||
reinstalling the same environment, this requires the (name, version)
|
||
tuple to refer to an immutable version of the
|
||
distribution. The immutability is guaranteed by package indexes
|
||
such as Warehouse. The package index to use is typically known from
|
||
environmental or command line parameters of the installer.
|
||
|
||
This freeze mechanism therefore works fine for installation mode 1 (i.e.
|
||
when the package to install was specified as a name plus version specifier).
|
||
|
||
For installation mode 2, i.e. when the package to install was specified as a
|
||
direct URL reference, the ``name==version`` tuple is obviously not sufficient
|
||
to reinstall the same distribution and users of the freeze command expect it
|
||
to output the URL that was originally requested.
|
||
|
||
The reasoning above is equally applicable to tools, other than ``pip freeze``,
|
||
that would attempt to generate a ``Pipfile.lock`` or any other similar format
|
||
from the Database of Installed Python Distributions. Unless specified
|
||
otherwise, "freeze" is used in this document as a generic term for such
|
||
an operation.
|
||
|
||
The importance of installing from (VCS) URLs for application integrators
|
||
------------------------------------------------------------------------
|
||
|
||
For an application integrator, it is important to be able to reliably install
|
||
and freeze unreleased version of python distributions.
|
||
For instance when a developer needs to deploy an unreleased patched version
|
||
of a dependency, it is common to install the dependency directly from a VCS
|
||
branch that has the patch, while waiting for the maintainer to release an
|
||
updated version.
|
||
|
||
In such cases, it is important for "freeze" to pin the exact VCS
|
||
reference (commit-hash if available) that was installed, in order to create
|
||
reproducible builds with the highest possible fidelity.
|
||
|
||
Additional origin metadata available for VCS URLs
|
||
-------------------------------------------------
|
||
|
||
For VCS URLs, there is additional origin information available only at
|
||
install time useful for introspection and certain workflows. For example,
|
||
when installing a revision from a VCS URL, a tool can determine if the
|
||
revision corresponds to a branch, tag or (in the case of Git) a ref. This
|
||
information can be used when introspecting the Database of Installed Distributions
|
||
to communicate to users more information about what version was installed
|
||
(e.g. whether a branch or tag was installed and, if so, the name of the
|
||
branch or tag). This also permits one to know whether a PEP 440 direct
|
||
reference URL can be constructed using the tag form, as only tags have the
|
||
semantics of immutability.
|
||
|
||
In cases where the revision is mutable (e.g. branches and Git refs), knowing
|
||
this information enables workflows where users can e.g. update to the latest
|
||
version of a branch they are tracking, or update to the latest version of a
|
||
pull request they are reviewing locally. In contrast, when the revision is a
|
||
tag, tools can know in advance (e.g. without network calls) that no update is
|
||
needed.
|
||
|
||
As with the URL itself, if this information isn't recorded at install time
|
||
when the VCS repository is available, it would otherwise be lost.
|
||
|
||
Note about "editable" installs
|
||
------------------------------
|
||
|
||
The editable installation mode of pip roughly lets a user insert a
|
||
local directory in sys.path for development purpose. This mode is somewhat
|
||
abused to work around the fact that a non-editable install from a VCS URL
|
||
loses track of the origin after installation.
|
||
Indeed editable installs implicitly record the VCS origin in the checkout
|
||
directory, so the information can be recovered when running "freeze".
|
||
|
||
The use of this workaround, although useful, is fragile, creates confusion
|
||
about the purpose of the editable mode, and works only when the distribution
|
||
can be installed with setuptools (i.e. it is not usable with other PEP 517
|
||
build backends).
|
||
|
||
When this PEP is implemented, it will not be necessary anymore to use
|
||
editable installs for the purpose of making pip freeze work correctly with
|
||
VCS references.
|
||
|
||
Rationale
|
||
=========
|
||
|
||
This PEP specifies a new ``direct_url.json`` metadata file in the
|
||
``.dist-info`` directory of an installed distribution.
|
||
|
||
The fields specified are sufficient to reproduce the source archive and `VCS
|
||
URLs supported by pip`_. They are also sufficient to reproduce `PEP440 Direct
|
||
References`_, as well as `Pipfile and Pipfile.lock`_ entries. Finally, they
|
||
are sufficient to record the branch, tag, and/or Git ref origin of the
|
||
installed version that is already available for editable installs by virtue
|
||
of a VCS checkout being present.
|
||
|
||
Since at least three different ways already exist to encode this type of
|
||
information, this PEP uses a dictionary format, so as not to make any
|
||
assumption on how a direct
|
||
URL reference must ultimately be encoded in a requirement or lockfile. See also
|
||
the `Alternatives`_ section below for more discussion about this choice.
|
||
|
||
Information has been taken from Ruby's bundler manual to verify it has similar
|
||
capabilities and inform the selection and naming of fields in this
|
||
specifications.
|
||
|
||
The JSON format allows for the addition of additional fields in the future.
|
||
|
||
Specification
|
||
=============
|
||
|
||
This PEP specifies a ``direct_url.json`` file in the ``.dist-info`` directory
|
||
of an installed distribution, to record the Direct URL Origin of the distribution.
|
||
|
||
The canonical source for the name and semantics of this metadata file is
|
||
the `Recording the Direct URL Origin of installed distributions`_ document.
|
||
|
||
This file MUST be created by installers when installing a distribution
|
||
from a requirement specifying a direct URL reference (including a VCS URL).
|
||
|
||
This file MUST NOT be created when installing a distribution from an other
|
||
type of requirement (i.e. name plus version specifier).
|
||
|
||
This JSON file MUST be a dictionary, compliant with `RFC 8259`_ and UTF-8
|
||
encoded.
|
||
|
||
If present, it MUST contain at least two fields. The first one is ``url``, with
|
||
type ``string``. Depending on what ``url`` refers to, the second field MUST be
|
||
one of ``vcs_info`` (if ``url`` is a VCS reference), ``archive_info`` (if
|
||
``url`` is a source archives or a wheel), or ``dir_info`` (if ``url`` is a
|
||
local directory). These info fields have a (possibly empty) subdictionary as
|
||
value, with the possible keys defined below.
|
||
|
||
``url`` MUST be stripped of any sensitive authentication information,
|
||
for security reasons.
|
||
|
||
The user:password section of the URL MAY however
|
||
be composed of environment variables, matching the following regular
|
||
expression::
|
||
|
||
\$\{[A-Za-z0-9-_]+\}(:\$\{[A-Za-z0-9-_]+\})?
|
||
|
||
Additionally, the user:password section of the URL MAY be a
|
||
well-known, non security sensitive string. A typical example is ``git``
|
||
in the case of an URL such as ``ssh://git@gitlab.com``.
|
||
|
||
When ``url`` refers to a VCS repository, the ``vcs_info`` key MUST be present
|
||
as a dictionary with the following keys:
|
||
|
||
- A ``vcs`` key (type ``string``) MUST be present, containing the name of the VCS
|
||
(i.e. one of ``git``, ``hg``, ``bzr``, ``svn``). Other VCS's SHOULD be registered by
|
||
writing a PEP to amend this specification.
|
||
The ``url`` value MUST be compatible with the corresponding VCS,
|
||
so an installer can hand it off without transformation to a
|
||
checkout/download command of the VCS.
|
||
- A ``requested_revision`` key (type ``string``) MAY be present naming a
|
||
branch/tag/ref/commit/revision/etc (in a format compatible with the VCS)
|
||
to install.
|
||
- A ``commit_id`` key (type ``string``) MUST be present, containing the
|
||
exact commit/revision number that was installed.
|
||
If the VCS supports commit-hash
|
||
based revision identifiers, such commit-hash MUST be used as
|
||
``commit_id`` in order to reference the immutable
|
||
version of the source code that was installed.
|
||
- If the installer could discover additional information about
|
||
the requested revision, it MAY add a ``resolved_revision`` and/or
|
||
``resolved_revision_type`` field. If no revision was provided in
|
||
the requested URL, ``resolved_revision`` MAY contain the default branch
|
||
that was installed, and ``resolved_revision_type`` will be ``branch``.
|
||
If the installer determines that ``requested_revision`` was a tag, it MAY
|
||
add ``resolved_revision_type`` with value ``tag``.
|
||
|
||
When ``url`` refers to a source archive or a wheel, the ``archive_info`` key
|
||
MUST be present as a dictionary with the following key:
|
||
|
||
- A ``hash`` key (type ``string``) SHOULD be present, with value
|
||
``<hash-algorithm>=<expected-hash>``.
|
||
It is RECOMMENDED that only hashes which are unconditionally provided by
|
||
the latest version of the standard library's ``hashlib`` module be used for
|
||
source archive hashes. At time of writing, that list consists of 'md5',
|
||
'sha1', 'sha224', 'sha256', 'sha384', and 'sha512'.
|
||
|
||
When ``url`` refers to a local directory, the ``dir_info`` key MUST be
|
||
present as a dictionary with the following key:
|
||
|
||
- ``editable`` (type: ``boolean``): `true` if the distribution was installed
|
||
in editable mode, `false` otherwise. If absent, default to `false`.
|
||
|
||
When ``url`` refers to a local directory, it MUST have the ``file`` sheme
|
||
and be compliant with `RFC 8089`_. In particular, the path component must
|
||
be absolute. Symbolic links SHOULD be preserved when making relative
|
||
paths absolute.
|
||
|
||
.. note::
|
||
|
||
When the requested URL has the file:// scheme and points to a local directory that happens to contain a
|
||
VCS checkout, installers MUST NOT attempt to infer any VCS information and
|
||
therefore MUST NOT output any VCS related information (such as ``vcs_info``)
|
||
in ``direct_url.json``.
|
||
|
||
A top-level ``subdirectory`` field MAY be present containing a directory path,
|
||
relative to the root of the VCS repository, source archive or local directory,
|
||
to specify where ``pyproject.toml`` or ``setup.py`` is located.
|
||
|
||
.. note::
|
||
|
||
As a general rule, installers should as much as possible preserve the
|
||
information that was provided in the requested URL when generating
|
||
``direct_url.json``. For example user:password environment variables
|
||
should be preserved and ``requested_revision`` should reflect the revision that was
|
||
provided in the requested URL as faithfully as possible. This information is
|
||
however *enriched* with more precise data, such as ``commit_id``.
|
||
|
||
Registered VCS
|
||
--------------
|
||
|
||
This section lists the registered VCS's; expanded, VCS-specific information
|
||
on how to use the ``vcs``, ``requested_revision``, and other fields of
|
||
``vcs_info``; and in
|
||
some cases additional VCS-specific fields.
|
||
Tools MAY support other VCS's although it is RECOMMENDED to register
|
||
them by writing a PEP to amend this specification. The ``vcs`` field SHOULD be the command name
|
||
(lowercased). Additional fields that would be necessary to
|
||
support such VCS SHOULD be prefixed with the VCS command name.
|
||
|
||
Git
|
||
+++
|
||
|
||
Home page
|
||
|
||
https://git-scm.com/
|
||
|
||
vcs command
|
||
|
||
git
|
||
|
||
``vcs`` field
|
||
|
||
git
|
||
|
||
``requested_revision`` field
|
||
|
||
A tag name, branch name, Git ref, commit hash, shortened commit hash,
|
||
or other commit-ish.
|
||
|
||
``commit_id`` field
|
||
|
||
A commit hash (40 hexadecimal characters sha1).
|
||
|
||
.. note::
|
||
|
||
Installers can use the ``git show-ref`` and ``git symbolic-ref`` commands
|
||
to determine if the ``requested_revision`` corresponds to a Git ref.
|
||
In turn, a ref beginning with ``refs/tags/`` corresponds to a tag, and
|
||
a ref beginning with ``refs/remotes/origin/`` after cloning corresponds
|
||
to a branch.
|
||
|
||
Mercurial
|
||
+++++++++
|
||
|
||
Home page
|
||
|
||
https://www.mercurial-scm.org/
|
||
|
||
vcs command
|
||
|
||
hg
|
||
|
||
``vcs`` field
|
||
|
||
hg
|
||
|
||
``requested_revision`` field
|
||
|
||
A tag name, branch name, changeset ID, shortened changeset ID.
|
||
|
||
``commit_id`` field
|
||
|
||
A changeset ID (40 hexadecimal characters).
|
||
|
||
Bazaar
|
||
++++++
|
||
|
||
Home page
|
||
|
||
https://bazaar.canonical.com/
|
||
|
||
vcs command
|
||
|
||
bzr
|
||
|
||
``vcs`` field
|
||
|
||
bzr
|
||
|
||
``requested_revision`` field
|
||
|
||
A tag name, branch name, revision id.
|
||
|
||
``commit_id`` field
|
||
|
||
A revision id.
|
||
|
||
Subversion
|
||
++++++++++
|
||
|
||
Home page
|
||
|
||
https://subversion.apache.org/
|
||
|
||
vcs command
|
||
|
||
svn
|
||
|
||
``vcs`` field
|
||
|
||
svn
|
||
|
||
``requested_revision`` field
|
||
|
||
``requested_revision`` must be compatible with ``svn checkout`` ``--revision`` option.
|
||
In Subversion, branch or tag is part of ``url``.
|
||
|
||
``commit_id`` field
|
||
|
||
Since Subversion does not support globally unique identifiers,
|
||
this field is the Subversion revision number in the corresponding
|
||
repository.
|
||
|
||
Examples
|
||
========
|
||
|
||
Example direct_url.json
|
||
-----------------------
|
||
|
||
Source archive:
|
||
|
||
.. code::
|
||
|
||
{
|
||
"url": "https://github.com/pypa/pip/archive/1.3.1.zip",
|
||
"archive_info": {
|
||
"hash": "sha256=2dc6b5a470a1bde68946f263f1af1515a2574a150a30d6ce02c6ff742fcc0db8"
|
||
}
|
||
}
|
||
|
||
Git URL with tag and commit-hash:
|
||
|
||
.. code::
|
||
|
||
{
|
||
"url": "https://github.com/pypa/pip.git",
|
||
"vcs_info": {
|
||
"vcs": "git",
|
||
"requested_revision": "1.3.1",
|
||
"resolved_revision_type": "tag",
|
||
"commit_id": "7921be1537eac1e97bc40179a57f0349c2aee67d"
|
||
}
|
||
}
|
||
|
||
Local directory:
|
||
|
||
.. code::
|
||
|
||
{
|
||
"url": "file:///home/user/project",
|
||
"dir_info": {}
|
||
}
|
||
|
||
Local directory installed in editable mode:
|
||
|
||
.. code::
|
||
|
||
{
|
||
"url": "file:///home/user/project",
|
||
"dir_info": {
|
||
"editable": true
|
||
}
|
||
}
|
||
|
||
|
||
Example pip commands and their effect on direct_url.json
|
||
--------------------------------------------------------
|
||
|
||
Commands that generate a ``direct_url.json``:
|
||
|
||
* pip install https://example.com/app-1.0.tgz
|
||
* pip install https://example.com/app-1.0.whl
|
||
* pip install "git+https://example.com/repo/app.git#egg=app&subdirectory=setup"
|
||
* pip install ./app
|
||
* pip install file:///home/user/app
|
||
* pip install --editable "git+https://example.com/repo/app.git#egg=app&subdirectory=setup"
|
||
(in which case, ``url`` will be the local directory where the git repository has been
|
||
cloned to, and ``dir_info`` will be present with ``"editable": true`` and no
|
||
``vcs_info`` will be set)
|
||
* pip install -e ./app
|
||
|
||
Commands that *do not* generate a ``direct_url.json``
|
||
|
||
* pip install app
|
||
* pip install app --no-index --find-links https://example.com/
|
||
|
||
Use cases
|
||
=========
|
||
|
||
"Freezing" an environment
|
||
|
||
Tools, such as ``pip freeze``, which generate requirements from the Database
|
||
of Installed Python Distributions SHOULD exploit ``direct_url.json``
|
||
if it is present, and give it priority over the Version metadata in order
|
||
to generate a higher fidelity output. In the presence of a ``vcs`` direct URL reference,
|
||
the ``commit_id`` field SHOULD be used in priority in order to provide
|
||
the highest possible fidelity to the originally installed version. If
|
||
supported by their requirement format, tools are encouraged also to output
|
||
the ``tag`` value if present, as it has immutable semantics.
|
||
Tools MAY choose another approach, depending on the needs of their users.
|
||
|
||
Note the initial iteration of this PEP does not attempt to make environments
|
||
that include editable installs or installs from local directories
|
||
reproducible, but it does attempt to make them readily identifiable. By
|
||
locating the local project directory via the ``url`` and ``dir_info`` fields
|
||
of this specification, tools can implement any strategy that fits their use
|
||
cases.
|
||
|
||
Backwards Compatibility
|
||
=======================
|
||
|
||
Since this PEP specifies a new file in the ``.dist-info`` directory,
|
||
there are no backwards compatibility implications.
|
||
|
||
Alternatives
|
||
============
|
||
|
||
PEP426 source_url
|
||
-----------------
|
||
|
||
The now withdrawn PEP 426 specifies a ``source_url`` metadata entry.
|
||
It is also implemented in `distlib`_.
|
||
|
||
It was intended for a slightly different purpose, for use in sdists.
|
||
|
||
This format lacks support for the ``subdirectory`` option of pip requirement
|
||
URLs. The same limitation is present in `PEP440 Direct References`_.
|
||
|
||
It also lacks explicit support for `environment variables in the user:password
|
||
part of URLs`_.
|
||
|
||
The introduction of a key/value extensibility mechanism and support
|
||
for environment variables for user:password in PEP 440, would be necessary
|
||
for use in this PEP.
|
||
|
||
revision vs ref
|
||
---------------
|
||
|
||
The ``requested_revision`` key was retained over ``requested_ref`` as it is a more generic term
|
||
across various VCS and ``ref`` has a specific meaning for ``git``.
|
||
|
||
|
||
References
|
||
==========
|
||
|
||
.. _`pip issue #609`: https://github.com/pypa/pip/issues/609
|
||
.. _`thread on discuss.python.org`: https://discuss.python.org/t/pip-freeze-vcs-urls-and-pep-517-feat-editable-installs/1473
|
||
.. _PEP440: http://www.python.org/dev/peps/pep-0440
|
||
.. _`VCS URLs supported by pip`: https://pip.pypa.io/en/stable/reference/pip_install/#vcs-support
|
||
.. _`PEP440 Direct References`: https://www.python.org/dev/peps/pep-0440/#direct-references
|
||
.. _`Pipfile and Pipfile.lock`: https://github.com/pypa/pipfile
|
||
.. _distlib: https://distlib.readthedocs.io
|
||
.. _`environment variables in the user:password part of URLs`: https://pip.pypa.io/en/stable/reference/pip_install/#id10
|
||
.. _`RFC 8259`: https://tools.ietf.org/html/rfc8259
|
||
.. _`RFC 8089`: https://tools.ietf.org/html/rfc8089
|
||
.. _`Recording the Direct URL Origin of installed distributions`: https://packaging.python.org/specifications/direct-url
|
||
|
||
Acknowledgements
|
||
================
|
||
|
||
Various people helped make this PEP a reality. Paul F. Moore provided the
|
||
essence of the abstract. Nick Coghlan suggested the direct_url name.
|
||
|
||
Copyright
|
||
=========
|
||
|
||
This document is placed in the public domain or under the
|
||
CC0-1.0-Universal license, whichever is more permissive.
|
||
|
||
|
||
..
|
||
Local Variables:
|
||
mode: indented-text
|
||
indent-tabs-mode: nil
|
||
sentence-end-double-space: t
|
||
fill-column: 70
|
||
coding: utf-8
|
||
End:
|