Recording the Direct URL Origin of installed distributions (#1145)
This commit is contained in:
parent
da480869f8
commit
1e094877fd
|
@ -0,0 +1,508 @@
|
|||
PEP: 9999
|
||||
Title: Recording the Direct URL Origin of installed distributions
|
||||
Author: Stéphane Bidoul <stephane.bidoul@acsone.eu>, Chris Jerdonek <chris.jerdonek@gmail.com>
|
||||
Discussions-To: https://discuss.python.org/t/recording-the-source-url-of-an-installed-distribution/1535
|
||||
Status: Draft
|
||||
Type: Standards Track
|
||||
Content-Type: text/x-rst
|
||||
Created: 21-Apr-2019
|
||||
Post-History:
|
||||
|
||||
Abstract
|
||||
========
|
||||
|
||||
Following PEP 440, a distribution can be identified by a name and either a
|
||||
version, or a direct URL reference (see `PEP440 Direct References`_).
|
||||
After installation, the name and version are captured in the project metadata,
|
||||
but currently there is no way to obtain details of the URL used when the
|
||||
distribution was identified by a direct URL reference.
|
||||
|
||||
This proposal defines
|
||||
additional metadata, to be added to the installed distribution by the
|
||||
installation front end, which records the Direct URL Origin for use by
|
||||
consumers which introspect the database of installed packages (see PEP 376).
|
||||
|
||||
Motivation
|
||||
==========
|
||||
|
||||
The original motivation of this PEP was to permit tools with a "freeze"
|
||||
operation allowing a Python environment to be recreated to work in a broader
|
||||
range of situations.
|
||||
|
||||
Specifically, the PEP originated from the desire to address `pip issue #609`_:
|
||||
i.e. improving the behavior of ``pip freeze`` in the presence of distributions
|
||||
installed from direct URL references. It follows a
|
||||
`thread on discuss.python.org`_ about the best course of action to implement
|
||||
it.
|
||||
|
||||
Installation from direct URL references
|
||||
---------------------------------------
|
||||
|
||||
Python installers such as pip are capable of downloading and installing
|
||||
distributions from package indexes. They are also capable of downloading
|
||||
and installing source code from requirements specifying arbitrary URLs of
|
||||
source archives and Version Control Systems (VCS) repositories,
|
||||
as standardized in `PEP440 Direct References`_.
|
||||
|
||||
In other words two relevant installation modes exist.
|
||||
|
||||
1. the package to install is specified as a name and version specifier:
|
||||
|
||||
In this case, the installer looks in a package index (or optionally
|
||||
using ``--find-links`` in the case of pip) to find the distribution to install.
|
||||
|
||||
2. The package to install is specified as a direct URL reference:
|
||||
|
||||
In this case, the installer downloads whatever is specified by the URL
|
||||
(typically a wheel, a source archive or a VCS repository) and installs it.
|
||||
|
||||
In this mode, installers typically download the source code in a
|
||||
temporary directory, invoke the PEP 517 build backend to produce a wheel
|
||||
if needed, install the wheel, and delete the temporary directory.
|
||||
|
||||
After installation, no trace of the URL the user requested to download the
|
||||
package is left on the user system.
|
||||
|
||||
Freezing an environment
|
||||
-----------------------
|
||||
|
||||
Pip also sports a command named ``pip freeze`` which examines the Database of
|
||||
Installed Python Distributions to generate a list of requirements. The main
|
||||
goal of this command is to help users generating a list of requirements that
|
||||
will later allow the re-installation the same environment with the highest
|
||||
possible fidelity.
|
||||
|
||||
The ``pip freeze`` command outputs a ``name==version`` line for each installed
|
||||
distribution (except for editable installs). To achieve the goal of
|
||||
reinstalling the same environment, this requires the (name, version)
|
||||
tuple to refer to an immutable version of the
|
||||
distribution. The immutability is guaranteed by package indexes
|
||||
such as Warehouse. The package index to use is typically known from
|
||||
environmental or command line parameters of the installer.
|
||||
|
||||
This freeze mechanism therefore works fine for installation mode 1 (i.e.
|
||||
when the package to install was specified as a name plus version specifier).
|
||||
|
||||
For installation mode 2, i.e. when the package to install was specified as a
|
||||
direct URL reference, the ``name==version`` tuple is obviously not sufficient
|
||||
to reinstall the same distribution and users of the freeze command expect it
|
||||
to output the URL that was originally requested.
|
||||
|
||||
The reasoning above is equally applicable to tools, other than ``pip freeze``,
|
||||
that would attempt to generate a ``Pipfile.lock`` or any other similar format
|
||||
from the Database of Installed Python Distributions. Unless specified
|
||||
otherwise, "freeze" is used in this document as a generic term for such
|
||||
an operation.
|
||||
|
||||
The importance of installing from (VCS) URLs for application integrators
|
||||
------------------------------------------------------------------------
|
||||
|
||||
For an application integrator, it is important to be able to reliably install
|
||||
and freeze unreleased version of python distributions.
|
||||
For instance when a developer needs to deploy an unreleased patched version
|
||||
of a dependency, it is common to install the dependency directly from a VCS
|
||||
branch that has the patch, while waiting for the maintainer to release an
|
||||
updated version.
|
||||
|
||||
In such cases, it is important for "freeze" to pin the exact VCS
|
||||
reference (commit-hash if available) that was installed, in order to create
|
||||
reproducible builds with the highest possible fidelity.
|
||||
|
||||
Additional origin metadata available for VCS URLs
|
||||
-------------------------------------------------
|
||||
|
||||
For VCS URLs, there is additional origin information available only at
|
||||
install time useful for introspection and certain workflows. For example,
|
||||
when installing a revision from a VCS URL, a tool can determine if the
|
||||
revision corresponds to a branch, tag or (in the case of Git) a ref. This
|
||||
information can be used when introspecting the database of installed packages
|
||||
to communicate to users more information about what version was installed
|
||||
(e.g. whether a branch or tag was installed and, if so, the name of the
|
||||
branch or tag). This also permits one to know whether a PEP 440 direct
|
||||
reference URL can be constructed using the tag form, as only tags have the
|
||||
semantics of immutability.
|
||||
|
||||
In cases where the revision is mutable (e.g. branches and Git refs), knowing
|
||||
this information enables workflows where users can e.g. update to the latest
|
||||
version of a branch they are tracking, or update to the latest version of a
|
||||
pull request they are reviewing locally. In contrast, when the revision is a
|
||||
tag, tools can know in advance (e.g. without network calls) that no update is
|
||||
needed.
|
||||
|
||||
As with the URL itself, if this information isn't recorded at install time
|
||||
when the VCS repository is available, it would otherwise be lost.
|
||||
|
||||
Note about "editable" installs
|
||||
------------------------------
|
||||
|
||||
The editable installation mode of pip roughly lets a user insert a
|
||||
local directory in sys.path for development purpose. This mode is somewhat
|
||||
abused to work around the fact that a non-editable install from a VCS URL
|
||||
loses track of the origin after installation.
|
||||
Indeed editable installs implicitly record the VCS origin in the checkout
|
||||
directory, so the information can be recovered when running "freeze".
|
||||
|
||||
The use of this workaround, although useful, is fragile, creates confusion
|
||||
about the purpose of the editable mode, and works only when the distribution
|
||||
can be installed with setuptools (i.e. it is not usable with other PEP 517
|
||||
build backends).
|
||||
|
||||
For the sake of clarity, it is important to note that this PEP is otherwise
|
||||
unrelated to editable installs.
|
||||
|
||||
Rationale
|
||||
=========
|
||||
|
||||
This PEP specifies a new ``direct_url.json`` metadata file in the
|
||||
``.dist-info`` directory of an installed distribution.
|
||||
|
||||
The fields specified are sufficient to reproduce the source archive and `VCS
|
||||
URLs supported by pip`_. They are also sufficient to reproduce `PEP440 Direct
|
||||
References`_, as well as `Pipfile and Pipfile.lock`_ entries. Finally, they
|
||||
are sufficient to record the branch, tag, and/or Git ref origin of the
|
||||
installed version that is already available for editable installs by virtue
|
||||
of a VCS checkout being present.
|
||||
|
||||
Since at least three different ways already exist to encode this type of
|
||||
information, this PEP uses a key-value format, so as not to make any
|
||||
assumption on how a direct
|
||||
URL reference must ultimately be encoded in a requirement or lockfile. See also
|
||||
the `Alternatives`_ section below for more discussion about this choice.
|
||||
|
||||
Information has been taken from Ruby's bundler manual to verify it has similar
|
||||
capabilities and inform the selection and naming of fields in this
|
||||
specifications.
|
||||
|
||||
The JSON format allows for the addition of additional fields in the future.
|
||||
|
||||
Specification
|
||||
=============
|
||||
|
||||
This PEP specifies a ``direct_url.json`` file in the ``.dist-info`` directory
|
||||
of an installed distribution, to record the Direct URL Origin of the distribution.
|
||||
|
||||
This file MUST be created by installers when installing a distribution
|
||||
from a requirement specifying a direct URL reference (including a VCS URL)
|
||||
in *non*-editable mode.
|
||||
|
||||
This file MUST NOT be created when installing a distribution from an other
|
||||
type of requirement (i.e. name plus version specifier, or URL in editable mode).
|
||||
|
||||
This JSON MUST be a flat dictionary where all keys and values are of string type.
|
||||
For the sake of forward compatibility, tools SHOULD ignore values which are
|
||||
not of string type.
|
||||
|
||||
If present, it MUST contain at least one field with name ``url``.
|
||||
|
||||
``url`` MUST be stripped of any sensitive authentication information,
|
||||
for security reasons. The user:password section of the URL MAY however
|
||||
be composed of environment variables, matching the following regular
|
||||
expression::
|
||||
|
||||
\$\{[A-Za-z0-9-_]+\}(:\$\{[A-Za-z0-9-_]+\})?
|
||||
|
||||
When ``url`` refers to a VCS repository:
|
||||
|
||||
- A ``vcs`` field MUST be present, containing the name of the VCS
|
||||
(i.e. one of ``git``, ``hg``, ``bzr``, ``svn``). Other VCS's SHOULD be registered by
|
||||
amending this PEP.
|
||||
- The ``url`` value MUST be compatible with the corresponding VCS,
|
||||
so an installer can hand it off without transformation to a
|
||||
checkout/download command of the VCS.
|
||||
- A ``requested_revision`` field MAY be present naming a
|
||||
branch/tag/ref/commit/revision/etc (in a format compatible with the VCS)
|
||||
to install.
|
||||
- A ``commit_id`` field MUST be present, containing the
|
||||
exact commit/revision number that was installed.
|
||||
If the VCS supports commit-hash
|
||||
based revision identifiers, such commit-hash MUST be used as
|
||||
``commit_id`` in order to reference the immutable
|
||||
version of the source code that was installed.
|
||||
- A ``tag`` field naming a tag MAY be present to indicate that a particular
|
||||
tag was installed.
|
||||
- A ``branch`` field naming a branch MAY be present to indicate that a
|
||||
particular branch was installed. If ``branch`` is present, ``tag`` MUST not
|
||||
be present.
|
||||
|
||||
When ``url`` refers to a source archive, a wheel, or a local directory:
|
||||
|
||||
- A ``hash`` field SHOULD be present, with value
|
||||
``<hash-algorithm>=<expected-hash>``.
|
||||
It is RECOMMENDED that only hashes which are unconditionally provided by
|
||||
the latest version of the standard library's ``hashlib`` module be used for
|
||||
source archive hashes. At time of writing, that list consists of 'md5',
|
||||
'sha1', 'sha224', 'sha256', 'sha384', and 'sha512'.
|
||||
|
||||
.. note::
|
||||
|
||||
When the requested URL points to a local directory that happens to contain a
|
||||
VCS checkout, installers MUST NOT attempt to infer any VCS information and
|
||||
therefore MUST NOT output any vcs related information (such as the ``vcs`` field)
|
||||
in ``direct_url.json``.
|
||||
|
||||
A ``subdirectory`` field MAY be present containing a directory path,
|
||||
relative to the root of the VCS repository, source archive or local directory,
|
||||
to specify where ``pyproject.toml`` or ``setup.py`` is located.
|
||||
|
||||
.. note::
|
||||
|
||||
As a general rule, installers should as much as possible preserve the
|
||||
information that was provided in the requested URL when generating
|
||||
``direct_url.json``. For example user:password environment variables
|
||||
should be preserved and ``requested_revision`` should reflect the revision that was
|
||||
provided in the requested URL as faithfully as possible. This information is
|
||||
however *enriched* with more precise data, such as ``commit_id``.
|
||||
|
||||
Registered VCS
|
||||
--------------
|
||||
|
||||
This section lists the registered VCS's; expanded, VCS-specific information
|
||||
on how to use the ``vcs``, ``requested_revision``, and other fields; and in
|
||||
some cases additional VCS-specific fields.
|
||||
Tools MAY support other VCS's although it is RECOMMENDED to register
|
||||
them by amending this PEP. The ``vcs`` field SHOULD be the command name
|
||||
(lowercased). Additional fields that would be necessary to
|
||||
support such VCS SHOULD be prefixed with the VCS command name.
|
||||
|
||||
Git
|
||||
+++
|
||||
|
||||
Home page
|
||||
|
||||
https://git-scm.com/
|
||||
|
||||
vcs command
|
||||
|
||||
git
|
||||
|
||||
vcs field
|
||||
|
||||
git
|
||||
|
||||
requested_revision field
|
||||
|
||||
A tag name, branch name, Git ref, commit hash, shortened commit hash,
|
||||
or other commit-ish.
|
||||
|
||||
commit_id field
|
||||
|
||||
A commit hash (40 hexadecimal characters sha1).
|
||||
|
||||
branch field
|
||||
|
||||
If no ``requested_revision`` is provided and the remote repository has
|
||||
a default branch, this field can be used to record the default branch that
|
||||
was installed.
|
||||
|
||||
VCS-specific fields:
|
||||
|
||||
- A ``git_ref`` field naming a Git ref (string beginning with ``refs/``) MAY
|
||||
be present to indicate that a particular ref was installed (e.g.
|
||||
``refs/pull/123/head``).
|
||||
|
||||
.. note::
|
||||
|
||||
Installers can use the ``git show-ref`` and ``git symbolic-ref`` commands
|
||||
to determine if the ``requested_revision`` corresponds to a Git ref.
|
||||
In turn, a ref beginning with ``refs/tags/`` corresponds to a tag, and
|
||||
a ref beginning with ``refs/remotes/origin/`` after cloning corresponds
|
||||
to a branch.
|
||||
|
||||
Mercurial
|
||||
+++++++++
|
||||
|
||||
Home page
|
||||
|
||||
https://www.mercurial-scm.org/
|
||||
|
||||
vcs command
|
||||
|
||||
hg
|
||||
|
||||
vcs field
|
||||
|
||||
hg
|
||||
|
||||
requested_revision field
|
||||
|
||||
A tag name, branch name, changeset ID, shortened changeset ID.
|
||||
|
||||
commit_id field
|
||||
|
||||
A changeset ID (40 hexadecimal characters).
|
||||
|
||||
Bazaar
|
||||
++++++
|
||||
|
||||
Home page
|
||||
|
||||
https://bazaar.canonical.com/
|
||||
|
||||
vcs command
|
||||
|
||||
bzr
|
||||
|
||||
vcs field
|
||||
|
||||
bzr
|
||||
|
||||
requested_revision field
|
||||
|
||||
A tag name, branch name, revision id.
|
||||
|
||||
commit_id field
|
||||
|
||||
A revision id.
|
||||
|
||||
Subversion
|
||||
++++++++++
|
||||
|
||||
Home page
|
||||
|
||||
https://subversion.apache.org/
|
||||
|
||||
vcs command
|
||||
|
||||
svn
|
||||
|
||||
vcs field
|
||||
|
||||
svn
|
||||
|
||||
requested_revision field
|
||||
|
||||
``requested_revision`` must be compatible with ``svn checkout`` ``--revision`` option.
|
||||
In Subversion, branch or tag is part of ``url``.
|
||||
|
||||
commit_id
|
||||
|
||||
Since Subversion does not support globally unique identifiers,
|
||||
this field is the Subversion revision number in the corresponding
|
||||
repository.
|
||||
|
||||
Examples
|
||||
========
|
||||
|
||||
Example direct_url.json
|
||||
-----------------------
|
||||
|
||||
Source archive:
|
||||
|
||||
.. code::
|
||||
|
||||
{
|
||||
"url": "https://github.com/pypa/pip/archive/1.3.1.zip",
|
||||
"hash": "sha256=2dc6b5a470a1bde68946f263f1af1515a2574a150a30d6ce02c6ff742fcc0db8"
|
||||
}
|
||||
|
||||
Git URL with tag and commit-hash:
|
||||
|
||||
.. code::
|
||||
|
||||
{
|
||||
"url": "https://github.com/pypa/pip.git",
|
||||
"vcs": "git",
|
||||
"requested_revision": "1.3.1",
|
||||
"commit_id": "7921be1537eac1e97bc40179a57f0349c2aee67d"
|
||||
}
|
||||
|
||||
Example pip commands and their effect on direct_url.json
|
||||
--------------------------------------------------------
|
||||
|
||||
Commands that generate a ``direct_url.json``:
|
||||
|
||||
* pip install https://example.com/app-1.0.tgz
|
||||
* pip install https://example.com/app-1.0.whl
|
||||
* pip install "git+https://example.com/repo/app.git#egg=app&subdirectory=setup"
|
||||
* pip install ./app
|
||||
* pip install file:///home/user/app
|
||||
|
||||
Commands that *do not* generate a ``direct_url.json``
|
||||
|
||||
* pip install app
|
||||
* pip install app --no-index --find-links https://example.com/
|
||||
* pip install --editable "git+https://example.com/repo/app.git#egg=app&subdirectory=setup"
|
||||
* pip install -e ./app
|
||||
|
||||
Use cases
|
||||
=========
|
||||
|
||||
"Freezing" an environment
|
||||
|
||||
Tools, such as ``pip freeze``, which generate requirements from the Database
|
||||
of Installed Python Distributions SHOULD exploit ``direct_url.json``
|
||||
if it is present, and give it priority over the Version metadata in order
|
||||
to generate a higher fidelity output. In the presence of a ``vcs`` direct URL reference,
|
||||
the ``commit_id`` field SHOULD be used in priority in order to provide
|
||||
the highest possible fidelity to the originally installed version. If
|
||||
supported by their requirement format, tools are encouraged also to output
|
||||
the ``tag`` value if present, as it has immutable semantics.
|
||||
Tools MAY choose another approach, depending on the needs of their users.
|
||||
|
||||
Backwards Compatibility
|
||||
=======================
|
||||
|
||||
Since this PEP specifies a new file in the ``.dist-info`` directory,
|
||||
there are no backwards compatibility implications.
|
||||
|
||||
Alternatives
|
||||
============
|
||||
|
||||
PEP426 source_url
|
||||
-----------------
|
||||
|
||||
The now withdrawn PEP 426 specifies a ``source_url`` metadata entry.
|
||||
It is also implemented in `distlib`_.
|
||||
|
||||
It was intended for a slightly different purpose, for use in sdists.
|
||||
|
||||
This format lacks support for the ``subdirectory`` option of pip requirement
|
||||
URLs. The same limitation is present in `PEP440 Direct References`_.
|
||||
|
||||
It also lacks explicit support for `environment variables in the user:password
|
||||
part of URLs`_.
|
||||
|
||||
The introduction of a key/value extensibility mechanism and support
|
||||
for environment variables for user:password in PEP 440, would be necessary
|
||||
for use in this PEP.
|
||||
|
||||
revision vs ref
|
||||
---------------
|
||||
|
||||
The ``requested_revision`` key was retained over ``requested_ref`` as it is a more generic term
|
||||
across various VCS and ``ref`` has a specific meaning for ``git``.
|
||||
|
||||
|
||||
References
|
||||
==========
|
||||
|
||||
.. _`pip issue #609`: https://github.com/pypa/pip/issues/609
|
||||
.. _`thread on discuss.python.org`: https://discuss.python.org/t/pip-freeze-vcs-urls-and-pep-517-feat-editable-installs/1473
|
||||
.. _PEP440: http://www.python.org/dev/peps/pep-0440
|
||||
.. _`VCS URLs supported by pip`: https://pip.pypa.io/en/stable/reference/pip_install/#vcs-support
|
||||
.. _`PEP440 Direct References`: https://www.python.org/dev/peps/pep-0440/#direct-references
|
||||
.. _`Pipfile and Pipfile.lock`: https://github.com/pypa/pipfile
|
||||
.. _distlib: https://distlib.readthedocs.io
|
||||
.. _`environment variables in the user:password part of URLs`: https://pip.pypa.io/en/stable/reference/pip_install/#id10
|
||||
|
||||
Acknowledgements
|
||||
================
|
||||
|
||||
Various people helped make this PEP a reality. Paul F. Moore provided the
|
||||
essence of the abstract. Nick Coghlan suggested the direct_url name.
|
||||
|
||||
Copyright
|
||||
=========
|
||||
|
||||
This document is placed in the public domain or under the
|
||||
CC0-1.0-Universal license, whichever is more permissive.
|
||||
|
||||
|
||||
..
|
||||
Local Variables:
|
||||
mode: indented-text
|
||||
indent-tabs-mode: nil
|
||||
sentence-end-double-space: t
|
||||
fill-column: 70
|
||||
coding: utf-8
|
||||
End:
|
Loading…
Reference in New Issue