PEP: 517
Title: A build-system independent format for source trees
Version: $Revision$
Last-Modified: $Date$
Author: Nathaniel J. Smith <njs@pobox.com>,
        Thomas Kluyver <thomas@kluyver.me.uk>
BDFL-Delegate: Nick Coghlan <ncoghlan@gmail.com>
Discussions-To: <distutils-sig@python.org>
Status: Draft
Type: Standards Track
Content-Type: text/x-rst
Created: 30-Sep-2015
Post-History: 1 Oct 2015, 25 Oct 2015


==========
Abstract
==========

While ``distutils`` / ``setuptools`` have taken us a long way, they
suffer from three serious problems: (a) they're missing important
features like usable build-time dependency declaration,
autoconfiguration, and even basic ergonomic niceties like `DRY
<https://en.wikipedia.org/wiki/Don%27t_repeat_yourself>`_-compliant
version number management, and (b) extending them is difficult, so
while there do exist various solutions to the above problems, they're
often quirky, fragile, and expensive to maintain, and yet (c) it's
very difficult to use anything else, because distutils/setuptools
provide the standard interface for installing packages expected by
both users and installation tools like ``pip``.

Previous efforts (e.g. distutils2 or setuptools itself) have attempted
to solve problems (a) and/or (b). This proposal aims to solve (c).

The goal of this PEP is to get distutils-sig out of the business of being
a gatekeeper for Python build systems. If you want to use distutils,
great; if you want to use something else, then that should be easy to
do using standardized methods. The difficulty of interfacing with
distutils means that there aren't many such systems right now, but to
give a sense of what we're thinking about, see `flit
<https://github.com/takluyver/flit>`_ or `bento
<https://cournape.github.io/Bento/>`_. Fortunately, wheels have now
solved many of the hard problems here -- e.g. it's no longer necessary
that a build system also know about every possible installation
configuration -- so pretty much all we really need from a build system
is that it have some way to spit out standard-compliant wheels and
sdists.

We therefore propose a new, relatively minimal interface for
installation tools like ``pip`` to interact with package source trees
and source distributions.

==========================
Reversion to Draft Status
==========================

While this PEP was provisionally accepted for implementation in ``pip`` and other
tools, some additional concerns were subsequently raised around adequately
supporting out-of-tree builds. It has been reverted to Draft status while those
concerns are being resolved.

=======================
Terminology and goals
=======================

A *source tree* is something like a VCS checkout. We need a standard
interface for installing from this format, to support usages like
``pip install some-directory/``.

A *source distribution* is a static snapshot representing a particular
release of some source code, like ``lxml-3.4.4.zip``. Source
distributions serve many purposes: they form an archival record of
releases, they provide a stupid-simple de facto standard for tools
that want to ingest and process large corpora of code, possibly
written in many languages (e.g. code search), they act as the input to
downstream packaging systems like Debian/Fedora/Conda/..., and so
forth. In the Python ecosystem they additionally have a particularly
important role to play, because packaging tools like ``pip`` are able
to use source distributions to fulfill binary dependencies, e.g. if
there is a distribution ``foo.whl`` which declares a dependency on
``bar``, then we need to support the case where ``pip install bar`` or
``pip install foo`` automatically locates the sdist for ``bar``,
downloads it, builds it, and installs the resulting package.

Source distributions are also known as *sdists* for short.

A *build frontend* is a tool that users might run that takes arbitrary
source trees or source distributions and builds wheels from them. The
actual building is done by each source tree's *build backend*. In a
command like ``pip wheel some-directory/``, pip is acting as a build
frontend.

An *integration frontend* is a tool that users might run that takes a
set of package requirements (e.g. a requirements.txt file) and
attempts to update a working environment to satisfy those
requirements. This may require locating, building, and installing a
combination of wheels and sdists. In a command like ``pip install
lxml==2.4.0``, pip is acting as an integration frontend.


==============
Source trees
==============

There is an existing, legacy source tree format involving
``setup.py``. We don't try to specify it further; its de facto
specification is encoded in the source code and documentation of
``distutils``, ``setuptools``, ``pip``, and other tools. We'll refer
to it as the ``setup.py``\-style.

Here we define a new style of source tree based around the
``pyproject.toml`` file defined in PEP 518, extending the
``[build-system]`` table in that file with one additional key,
``build-backend``. Here's an example of how it would look::

    [build-system]
    # Defined by PEP 518:
    requires = ["flit"]
    # Defined by this PEP:
    build-backend = "flit.api:main"

``build-backend`` is a string naming a Python object that will be
used to perform the build (see below for details). This is formatted
following the same ``module:object`` syntax as a ``setuptools`` entry
point. For instance, if the string is ``"flit.api:main"`` as in the
example above, this object would be looked up by executing the
equivalent of::

    import flit.api
    backend = flit.api.main

It's also legal to leave out the ``:object`` part, e.g. ::

    build-backend = "flit.api"

which acts like::

    import flit.api
    backend = flit.api

Formally, the string should satisfy this grammar::

    identifier = (letter | '_') (letter | '_' | digit)*
    module_path = identifier ('.' identifier)*
    object_path = identifier ('.' identifier)*
    entry_point = module_path (':' object_path)?

And we import ``module_path`` and then look up
``module_path.object_path`` (or just ``module_path`` if
``object_path`` is missing).
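
As an illustration only, the lookup described above could be sketched in
Python roughly as follows. The helper name ``resolve_backend`` and the regular
expression are assumptions made for this sketch and are not part of the
specification; in particular, ``\w`` accepts a broader set of letters than a
frontend might choose to allow.

```python
import importlib
import re

# One identifier from the grammar: a letter or '_' followed by letters,
# digits or '_'. (Assumption: "letter" is read as any Unicode word letter.)
_IDENT = r"[^\W\d]\w*"
_DOTTED = rf"{_IDENT}(?:\.{_IDENT})*"
_ENTRY_POINT = re.compile(rf"(?P<module>{_DOTTED})(?::(?P<object>{_DOTTED}))?")


def resolve_backend(spec):
    """Import module_path, then walk object_path one attribute at a time."""
    match = _ENTRY_POINT.fullmatch(spec)
    if match is None:
        raise ValueError(f"invalid build-backend string: {spec!r}")
    backend = importlib.import_module(match.group("module"))
    object_path = match.group("object")
    if object_path is not None:
        for name in object_path.split("."):
            backend = getattr(backend, name)
    return backend
```

With this sketch, ``resolve_backend("flit.api:main")`` would behave like the
``import flit.api; backend = flit.api.main`` example, provided ``flit`` is
installed.
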

If the ``pyproject.toml`` file is absent, or the ``build-backend``
key is missing, the source tree is not using this specification, and
tools should fall back to running ``setup.py``.

Where the ``build-backend`` key exists, it takes precedence over
``setup.py``, and source trees need not include ``setup.py`` at all.
Projects may still wish to include a ``setup.py`` for compatibility
with tools that do not use this spec.

=========================
Build backend interface
=========================

The build backend object is expected to have attributes which provide
some or all of the following hooks. The common ``config_settings``
argument is described after the individual hooks.

Mandatory hooks
===============

build_wheel
-----------

::

    def build_wheel(wheel_directory, config_settings=None, build_directory=None, metadata_directory=None):
        ...

Must build a .whl file, and place it in the specified ``wheel_directory``. It
must return the basename (not the full path) of the ``.whl`` file it creates,
as a unicode string.

If the build frontend has previously called ``prepare_metadata_for_build_wheel``
and depends on the wheel resulting from this call to have metadata
matching this earlier call, then it should provide the path to the created
``.dist-info`` directory as the ``metadata_directory`` argument. If this
argument is provided, then ``build_wheel`` MUST produce a wheel with identical
metadata. The directory passed in by the build frontend MUST be
identical to the directory created by ``prepare_metadata_for_build_wheel``,
including any unrecognized files it created.

Backends which do not provide the ``prepare_metadata_for_build_wheel`` hook may
either silently ignore the ``metadata_directory`` parameter to ``build_wheel``,
or else raise an exception when it is set to anything other than ``None``.

If ``build_directory`` is not None, it is a unicode string containing the
path to a directory other than the source directory (the working directory where
the hook is called) where intermediate build artifacts may be stored.

These out-of-tree builds should have the same result as first building an sdist
and then building a wheel from that sdist. If the backend cannot otherwise
ensure that unexpected files won't end up in the built wheel archive, it should
copy any essential input files it needs to ``build_directory`` and perform the
build there.

The provided build directory may be empty, or it may contain artifacts from a
previous build to be used as a cache. The backend is responsible for determining
whether any cached artifacts are outdated.

If ``build_directory`` is None, the backend may do an 'in place' build which
modifies the source directory and may produce different results from those that
would be obtained by exporting an sdist first. The exact semantics of this will
depend on the build system in use, and hence are not specified here.

To guard against frontends that are not fully compliant with this specification,
defensively coded backends may treat
``os.path.samefile(build_directory, os.getcwd())`` as a request for an in place
build.

Whatever the value of ``build_directory``, the backend may also store
intermediate artifacts in other cache locations or temporary directories, which
it is responsible for managing. The presence or absence of any caches should not
make a material difference to the final result of the build.
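
A defensively coded backend might fold the two "in place" cases described
above into a single check. This is only a sketch; the helper name
``is_in_place_request`` is hypothetical, and it assumes that a non-None
``build_directory`` already exists on disk.

```python
import os


def is_in_place_request(build_directory):
    """Return True when the frontend appears to be asking for an in-place build."""
    if build_directory is None:
        return True
    # Defensive check: a build directory that is the source tree itself
    # (the working directory of the hook) is treated like None.
    return os.path.samefile(build_directory, os.getcwd())
```

A ``build_wheel`` implementation could call this first and only copy inputs
into ``build_directory`` when it returns False.
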

build_sdist
-----------

::

    def build_sdist(sdist_directory, config_settings=None):
        ...

Must build a .tar.gz source distribution and place it in the specified
``sdist_directory``. It must return the basename (not the full path) of the
``.tar.gz`` file it creates, as a unicode string.

A .tar.gz source distribution (sdist) contains a single top-level directory called
``{name}-{version}`` (e.g. ``foo-1.0``), containing the source files of the
package. This directory must also contain the
``pyproject.toml`` from the build directory, and a PKG-INFO file containing
metadata in the format described in
`PEP 345 <https://www.python.org/dev/peps/pep-0345/>`_. Although historically
zip files have also been used as sdists, this hook should produce a gzipped
tarball. This is already the more common format for sdists, and having a
consistent format makes for simpler tooling.

The generated tarball should use the modern POSIX.1-2001 pax tar format, which
specifies UTF-8 based file names. This is not yet the default for the tarfile
module shipped with Python 3.6, so backends using the tarfile module need to
explicitly pass ``format=tarfile.PAX_FORMAT``.
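
For instance, a backend using the standard library could write such an archive
along these lines. This is a minimal sketch: the helper name
``write_sdist_archive``, the in-memory file mapping and the sample file
contents are all assumptions chosen for illustration.

```python
import io
import tarfile


def write_sdist_archive(fileobj, top_level, files):
    """Write files (relative path -> bytes) under top_level/ as a pax .tar.gz."""
    with tarfile.open(fileobj=fileobj, mode="w:gz",
                      format=tarfile.PAX_FORMAT) as archive:
        for relative_path, data in sorted(files.items()):
            info = tarfile.TarInfo(name=f"{top_level}/{relative_path}")
            info.size = len(data)
            archive.addfile(info, io.BytesIO(data))


# Example: a tiny sdist layout with a single top-level directory.
buffer = io.BytesIO()
write_sdist_archive(buffer, "foo-1.0", {
    "pyproject.toml": b"[build-system]\nrequires = []\n",
    "PKG-INFO": b"Metadata-Version: 1.2\nName: foo\nVersion: 1.0\n",
    "données.txt": b"non-ASCII file names are stored as UTF-8\n",
})
```

Passing ``format=tarfile.PAX_FORMAT`` explicitly keeps the result independent
of the default format of whichever Python version runs the backend.
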

Frontends performing a local installation may want to produce an sdist as an
intermediate, but are advised to be prepared to use a fallback if it fails, as
e.g. some backends may require version control tools to build an sdist.

Optional hooks
==============

get_requires_for_build_wheel
----------------------------

::

    def get_requires_for_build_wheel(config_settings=None):
        ...

This hook MUST return an additional list of strings containing PEP 508
dependency specifications, above and beyond those specified in the
``pyproject.toml`` file, to be installed when calling the ``build_wheel`` or
``prepare_metadata_for_build_wheel`` hooks.

Example::

    def get_requires_for_build_wheel(config_settings):
        return ["wheel >= 0.25", "setuptools"]

If not defined, the default implementation is equivalent to ``return []``.

prepare_metadata_for_build_wheel
--------------------------------

::

    def prepare_metadata_for_build_wheel(metadata_directory, config_settings=None):
        ...

Must create a ``.dist-info`` directory containing wheel metadata
inside the specified ``metadata_directory`` (i.e., creates a directory
like ``{metadata_directory}/{package}-{version}.dist-info/``). This
directory MUST be a valid ``.dist-info`` directory as defined in the
wheel specification, except that it need not contain ``RECORD`` or
signatures. The hook MAY also create other files inside this
directory, and a build frontend MUST preserve, but otherwise ignore, such files;
the intention here is that in cases where the metadata depends on build-time
decisions, the build backend may need to record these decisions in
some convenient format for re-use by the actual wheel-building step.

This must return the basename (not the full path) of the ``.dist-info``
directory it creates, as a unicode string.

If a build frontend needs this information and the method is
not defined, it should call ``build_wheel`` and look at the resulting
metadata directly.

get_requires_for_build_sdist
----------------------------

::

    def get_requires_for_build_sdist(config_settings=None):
        ...

This hook MUST return an additional list of strings containing PEP 508
dependency specifications, above and beyond those specified in the
``pyproject.toml`` file. These dependencies will be installed when calling the
``build_sdist`` hook.

If not defined, the default implementation is equivalent to ``return []``.
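
On the frontend side, the "equivalent to ``return []``" default for the two
``get_requires_for_build_*`` hooks could be handled roughly like this. The
function name ``extra_build_requires`` and its ``kind`` parameter are
hypothetical, introduced only for this sketch.

```python
def extra_build_requires(backend, kind, config_settings=None):
    """Call get_requires_for_build_<kind> if the backend defines it, else return [].

    ``kind`` is expected to be "wheel" or "sdist".
    """
    hook = getattr(backend, f"get_requires_for_build_{kind}", None)
    if hook is None:
        # Optional hook not provided: behave as if it returned an empty list.
        return []
    return list(hook(config_settings))
```

The frontend would install the returned requirements, in addition to those
from ``pyproject.toml``, before calling the corresponding build hook.
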


.. note:: Editable installs

   This PEP originally specified another hook, ``install_editable``, to do an
   editable install (as with ``pip install -e``). It was removed due to the
   complexity of the topic, but may be specified in a later PEP.

   Briefly, the questions to be answered include: what reasonable ways exist
   of implementing an 'editable install'? Should the backend or the frontend
   pick how to make an editable install? And if the frontend does, what does it
   need from the backend to do so?

Config settings
===============

::

    config_settings

This argument, which is passed to all hooks, is an arbitrary
dictionary provided as an "escape hatch" for users to pass ad-hoc
configuration into individual package builds. Build backends MAY
assign any semantics they like to this dictionary. Build frontends
SHOULD provide some mechanism for users to specify arbitrary
string-key/string-value pairs to be placed in this dictionary. For
example, they might support some syntax like ``--package-config
CC=gcc``. Build frontends MAY also provide arbitrary other mechanisms
for users to place entries in this dictionary. For example, ``pip``
might choose to map a mix of modern and legacy command line arguments
like::

    pip install \
      --package-config CC=gcc \
      --global-option="--some-global-option" \
      --build-option="--build-option1" \
      --build-option="--build-option2"

into a ``config_settings`` dictionary like::

    {
     "CC": "gcc",
     "--global-option": ["--some-global-option"],
     "--build-option": ["--build-option1", "--build-option2"],
    }

Of course, it's up to users to make sure that they pass options which
make sense for the particular build backend and package that they are
building.
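
One way a frontend might produce the dictionary shown above from
already-parsed command line values is sketched below. The function name and
its three parameters are hypothetical; the PEP leaves the actual mapping to
each frontend.

```python
def build_config_settings(package_config=(), global_options=(), build_options=()):
    """Combine KEY=VALUE pairs and legacy option lists into one dictionary."""
    settings = {}
    for item in package_config:
        key, separator, value = item.partition("=")
        if not separator:
            raise ValueError(f"expected KEY=VALUE, got {item!r}")
        settings[key] = value
    # Legacy options are collected into lists under their option name.
    if global_options:
        settings["--global-option"] = list(global_options)
    if build_options:
        settings["--build-option"] = list(build_options)
    return settings


example = build_config_settings(
    package_config=["CC=gcc"],
    global_options=["--some-global-option"],
    build_options=["--build-option1", "--build-option2"],
)
```
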

The hooks may be called with positional or keyword arguments, so backends
implementing them should be careful to make sure that their signatures match
both the order and the names of the arguments above.

All hooks are run with working directory set to the root of the source
tree, and MAY print arbitrary informational text on stdout and
stderr. They MUST NOT read from stdin, and the build frontend MAY
close stdin before invoking the hooks.

The build frontend may capture stdout and/or stderr from the backend. If the
backend detects that an output stream is not a terminal/console (e.g.
``not sys.stdout.isatty()``), it SHOULD ensure that any output it writes to that
stream is UTF-8 encoded. The build frontend MUST NOT fail if captured output is
not valid UTF-8, but it MAY not preserve all the information in that case (e.g.
it may decode using the *replace* error handler in Python). If the output stream
is a terminal, the build backend is responsible for presenting its output
accurately, as for any program running in a terminal.
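
The tolerant decoding mentioned above can be as small as the following sketch,
where ``decode_backend_output`` is a hypothetical helper name.

```python
def decode_backend_output(raw):
    """Decode captured backend output without ever failing on invalid UTF-8."""
    # Invalid byte sequences become U+FFFD instead of raising UnicodeDecodeError.
    return raw.decode("utf-8", errors="replace")
```

Valid UTF-8 is returned unchanged, while stray bytes are replaced, so some
information may be lost but the frontend does not fail.
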

If a hook raises an exception, or causes the process to terminate,
then this indicates an error.


Build environment
=================

One of the responsibilities of a build frontend is to set up the
Python environment in which the build backend will run.

We do not require that any particular "virtual environment" mechanism
be used; a build frontend might use virtualenv, or venv, or no special
mechanism at all. But whatever mechanism is used MUST meet the
following criteria:

- All requirements specified by the project's build-requirements must
  be available for import from Python. In particular:

  - The ``get_requires_for_build_wheel`` and ``get_requires_for_build_sdist`` hooks are
    executed in an environment which contains the bootstrap requirements
    specified in the ``pyproject.toml`` file.

  - The ``prepare_metadata_for_build_wheel`` and ``build_wheel`` hooks are
    executed in an environment which contains the
    bootstrap requirements from ``pyproject.toml`` and those specified by the
    ``get_requires_for_build_wheel`` hook.

  - The ``build_sdist`` hook is executed in an environment which contains the
    bootstrap requirements from ``pyproject.toml`` and those specified by the
    ``get_requires_for_build_sdist`` hook.

- This must remain true even for new Python subprocesses spawned by
  the build environment, e.g. code like::

      import sys, subprocess
      subprocess.check_call([sys.executable, ...])

  must spawn a Python process which has access to all the project's
  build-requirements. This is necessary e.g. for build backends that
  want to run legacy ``setup.py`` scripts in a subprocess.

- All command-line scripts provided by the build-required packages
  must be present in the build environment's PATH. For example, if a
  project declares a build-requirement on `flit
  <https://flit.readthedocs.org/en/latest/>`__, then the following must
  work as a mechanism for running the flit command-line tool::

      import subprocess
      subprocess.check_call(["flit", ...])

A build backend MUST be prepared to function in any environment which
meets the above criteria. In particular, it MUST NOT assume that it
has access to any packages except those that are present in the
stdlib, or that are explicitly declared as build-requirements.

Frontends should call each hook in a fresh subprocess, so that backends are
free to change process global state (such as environment variables or the
working directory). A Python library will be provided which frontends can use
to easily call hooks this way.
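
To make the subprocess requirement concrete, here is a rough sketch of how a
frontend could run one hook in a fresh interpreter. It is not the library
mentioned above: ``call_hook``, the inline runner script and the JSON result
file are all assumptions of this sketch, and it only handles a
``build-backend`` value that names a module (no ``:object`` part) which is
importable from the source tree or the build environment.

```python
import json
import os
import subprocess
import sys
import tempfile

# Script executed in the child interpreter. The result travels through a file
# because hooks are allowed to print arbitrary text on stdout and stderr.
_RUNNER = """
import importlib, json, sys
module_name, hook_name, kwargs, result_path = json.loads(sys.argv[1])
backend = importlib.import_module(module_name)
result = getattr(backend, hook_name)(**kwargs)
with open(result_path, "w", encoding="utf-8") as f:
    json.dump(result, f)
"""


def call_hook(module_name, hook_name, source_dir, **kwargs):
    """Run one backend hook in a fresh subprocess and return its result."""
    with tempfile.TemporaryDirectory() as tmp:
        result_path = os.path.join(tmp, "result.json")
        payload = json.dumps([module_name, hook_name, kwargs, result_path])
        subprocess.run(
            [sys.executable, "-c", _RUNNER, payload],
            cwd=source_dir,            # hooks run at the root of the source tree
            stdin=subprocess.DEVNULL,  # hooks must not read from stdin
            check=True,                # a failing child indicates an error
        )
        with open(result_path, encoding="utf-8") as f:
            return json.load(f)
```

A real frontend would also arrange for ``sys.executable`` to belong to the
prepared build environment rather than to the frontend's own environment.
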

Recommendations for build frontends (non-normative)
---------------------------------------------------

A build frontend MAY use any mechanism for setting up a build
environment that meets the above criteria. For example, simply
installing all build-requirements into the global environment would be
sufficient to build any compliant package -- but this would be
sub-optimal for a number of reasons. This section contains
non-normative advice to frontend implementors.

A build frontend SHOULD, by default, create an isolated environment
for each build, containing only the standard library and any
explicitly requested build-dependencies. This has two benefits:

- It allows for a single installation run to build multiple packages
  that have contradictory build-requirements. E.g. if package1
  build-requires pbr==1.8.1, and package2 build-requires pbr==1.7.2,
  then these cannot both be installed simultaneously into the global
  environment -- which is a problem when the user requests ``pip
  install package1 package2``. Or if the user already has pbr==1.8.1
  installed in their global environment, and a package build-requires
  pbr==1.7.2, then downgrading the user's version would be rather
  rude.

- It acts as a kind of public health measure to maximize the number of
  packages that actually do declare accurate build-dependencies. We
  can write all the strongly worded admonitions to package authors we
  want, but if build frontends don't enforce isolation by default,
  then we'll inevitably end up with lots of packages on PyPI that
  build fine on the original author's machine and nowhere else, which
  is a headache that no-one needs.

However, there will also be situations where build-requirements are
problematic in various ways. For example, a package author might
accidentally leave off some crucial requirement despite our best
efforts; or, a package might declare a build-requirement on ``foo >=
1.0`` which worked great when 1.0 was the latest version, but now 1.1
is out and it has a showstopper bug; or, the user might decide to
build a package against numpy==1.7 -- overriding the package's
preferred numpy==1.8 -- to guarantee that the resulting build will be
compatible at the C ABI level with an older version of numpy (even if
this means the resulting build is unsupported upstream). Therefore,
build frontends SHOULD provide some mechanism for users to override
the above defaults. For example, a build frontend could have a
``--build-with-system-site-packages`` option that causes the
``--system-site-packages`` option to be passed to
virtualenv-or-equivalent when creating build environments, or a
``--build-requirements-override=my-requirements.txt`` option that
overrides the project's normal build-requirements.

The general principle here is that we want to enforce hygiene on
package *authors*, while still allowing *end-users* to open up the
hood and apply duct tape when necessary.


======================
Source distributions
======================

We continue with the legacy sdist format, adding some new restrictions.
This format is mostly
undefined, but basically comes down to: a file named
``{NAME}-{VERSION}.{EXT}``, which unpacks into a buildable source tree
called ``{NAME}-{VERSION}/``. Traditionally these have always
contained ``setup.py``\-style source trees; we now allow them to also
contain ``pyproject.toml``\-style source trees.

Integration frontends require that an sdist named
``{NAME}-{VERSION}.{EXT}`` will generate a wheel named
``{NAME}-{VERSION}-{COMPAT-INFO}.whl``.

The new restrictions for sdists built by PEP 517 backends are:

- They will be gzipped tar archives, with the ``.tar.gz`` extension. Zip
  archives, or other compression formats for tarballs, are not allowed at
  present.
- Tar archives must be created in the modern POSIX.1-2001 pax tar format, which
  uses UTF-8 for file names.
- The source tree contained in an sdist is expected to include the
  ``pyproject.toml`` file.

====================
Evolutionary notes
====================

A goal here is to make it as simple as possible to convert old-style
sdists to new-style sdists. (E.g., this is one motivation for
supporting dynamic build requirements.) The ideal would be that there
would be a single static ``pyproject.toml`` that could be dropped into any
"version 0" VCS checkout to convert it to the new shiny. This is
probably not 100% possible, but we can get close, and it's important
to keep track of how close we are... hence this section.

A rough plan would be: Create a build system package
(``setuptools_pypackage`` or whatever) that knows how to speak
whatever hook language we come up with, and convert them into calls to
``setup.py``. This will probably require some sort of hooking or
monkeypatching to setuptools to provide a way to extract the
``setup_requires=`` argument when needed, and to provide a new version
of the sdist command that generates the new-style format. This all
seems doable and sufficient for a large proportion of packages (though
obviously we'll want to prototype such a system before we finalize
anything here). (Alternatively, these changes could be made to
setuptools itself rather than going into a separate package.)

But there remain two obstacles that mean we probably won't be able to
automatically upgrade packages to the new format:

1) There currently exist packages which insist on particular packages
   being available in their environment before setup.py is
   executed. This means that if we decide to execute build scripts in
   an isolated virtualenv-like environment, then projects will need to
   check whether they do this, and if so then when upgrading to the
   new system they will have to start explicitly declaring these
   dependencies (either via ``setup_requires=`` or via static
   declaration in ``pyproject.toml``).

2) There currently exist packages which do not declare consistent
   metadata (e.g. ``egg_info`` and ``bdist_wheel`` might get different
   ``install_requires=``). When upgrading to the new system, projects
   will have to evaluate whether this applies to them, and if so they
   will need to stop doing that.


==================
Rejected options
==================

* We discussed making the wheel and sdist hooks build unpacked directories
  containing the same contents as their respective archives. In some cases this
  could avoid the need to pack and unpack an archive, but this seems like
  premature optimisation. It's advantageous for tools to work with archives
  as the canonical interchange formats (especially for wheels, where the archive
  format is already standardised). Close control of archive creation is
  important for reproducible builds. And it's not clear that tasks requiring an
  unpacked distribution will be more common than those requiring an archive.
* We considered an extra hook to copy files to a build directory before invoking
  ``build_wheel``. Looking at existing build systems, we found that passing
  a build directory into ``build_wheel`` makes more sense for many tools than
  pre-emptively copying files into a build directory.

===================================
|
||
Appendix A: Comparison to PEP 516
|
||
===================================
|
||
|
||
:pep:`516` is a competing proposal to specify a build system interface, which
|
||
has now been rejected in favour of this PEP. The primary difference is
|
||
that our build backend is defined via a Python hook-based interface
|
||
rather than a command-line based interface.
|
||
|
||
This appendix documents the arguments advanced for this PEP over PEP 516.
|
||
|
||
We do *not* expect that specifying Python hooks rather than command line
|
||
interfaces will, by itself, reduce the
|
||
complexity of calling into the backend, because build frontends will
|
||
in any case want to run hooks inside a child -- this is important to
|
||
isolate the build frontend itself from the backend code and to better
|
||
control the build backends execution environment. So under both
|
||
proposals, there will need to be some code in ``pip`` to spawn a
|
||
subprocess and talk to some kind of command-line/IPC interface, and
|
||
there will need to be some code in the subprocess that knows how to
|
||
parse these command line arguments and call the actual build backend
|
||
implementation. So this diagram applies to all proposals equally::
|
||
|
||
+-----------+ +---------------+ +----------------+
|
||
| frontend | -spawn-> | child cmdline | -Python-> | backend |
|
||
| (pip) | | interface | | implementation |
|
||
+-----------+ +---------------+ +----------------+
|
||
|
||
|
||
|
||
The key difference between the two approaches is how these interface
|
||
boundaries map onto project structure::
|
||
|
||
.-= This PEP =-.
|
||
|
||
+-----------+ +---------------+ | +----------------+
|
||
| frontend | -spawn-> | child cmdline | -Python-> | backend |
|
||
| (pip) | | interface | | | implementation |
|
||
+-----------+ +---------------+ | +----------------+
|
||
|
|
||
|______________________________________| |
|
||
Owned by pip, updated in lockstep |
|
||
|
|
||
|
|
||
PEP-defined interface boundary
|
||
Changes here require distutils-sig
|
||
|
||
|
||
.-= Alternative =-.
|
||
|
||
+-----------+ | +---------------+ +----------------+
|
||
| frontend | -spawn-> | child cmdline | -Python-> | backend |
|
||
| (pip) | | | interface | | implementation |
|
||
+-----------+ | +---------------+ +----------------+
|
||
|
|
||
| |____________________________________________|
|
||
| Owned by build backend, updated in lockstep
|
||
|
|
||
PEP-defined interface boundary
|
||
Changes here require distutils-sig
|
||
|
||
|
||
By moving the PEP-defined interface boundary into Python code, we gain
|
||
three key advantages.
|
||
|
||
**First**, because there will likely be only a small number of build
|
||
frontends (``pip``, and... maybe a few others?), while there will
|
||
likely be a long tail of custom build backends (since these are chosen
|
||
separately by each package to match their particular build
|
||
requirements), the actual diagrams probably look more like::
|
||
|
||
   .-= This PEP =-.

   +-----------+          +---------------+           +----------------+
   | frontend  | -spawn-> | child cmdline | -Python+> |    backend     |
   |   (pip)   |          |   interface   |        |  | implementation |
   +-----------+          +---------------+        |  +----------------+
                                                   |
                                                   |  +----------------+
                                                   +> |    backend     |
                                                   |  | implementation |
                                                   |  +----------------+
                                                   :
                                                   :

   .-= Alternative =-.

   +-----------+          +---------------+           +----------------+
   | frontend  | -spawn+> | child cmdline | -Python-> |    backend     |
   |   (pip)   |       |  |   interface   |           | implementation |
   +-----------+       |  +---------------+           +----------------+
                       |
                       |  +---------------+           +----------------+
                       +> | child cmdline | -Python-> |    backend     |
                       |  |   interface   |           | implementation |
                       |  +---------------+           +----------------+
                       :
                       :

That is, this PEP leads to less total code in the overall
ecosystem. And in particular, it reduces the barrier to entry of
making a new build system. For example, this is a complete, working
build backend::

    # mypackage_custom_build_backend.py
    import os.path
    import pathlib
    import shutil
    import tarfile

    SDIST_NAME = "mypackage-0.1"
    SDIST_FILENAME = SDIST_NAME + ".tar.gz"
    WHEEL_FILENAME = "mypackage-0.1-py2.py3-none-any.whl"

    #################
    # sdist creation
    #################

    def _exclude_hidden_and_special_files(archive_entry):
        """Tarfile filter to exclude hidden and special files from the archive"""
        if archive_entry.isfile() or archive_entry.isdir():
            if not os.path.basename(archive_entry.name).startswith("."):
                return archive_entry
        return None

    def _make_sdist(sdist_dir):
        """Make an sdist and return the path of the finished archive"""
        sdist_path = pathlib.Path(sdist_dir) / SDIST_FILENAME
        # The context manager closes the archive, so it is fully written to disk
        with tarfile.open(sdist_path, "w:gz", format=tarfile.PAX_FORMAT) as sdist:
            # Tar up the whole directory, minus hidden and special files
            sdist.add(os.getcwd(), arcname=SDIST_NAME,
                      filter=_exclude_hidden_and_special_files)
        return sdist_path

    def build_sdist(sdist_dir, config_settings):
        """PEP 517 sdist creation hook"""
        _make_sdist(sdist_dir)
        return SDIST_FILENAME

    #################
    # wheel creation
    #################

    def get_requires_for_build_wheel(config_settings):
        """PEP 517 wheel building dependency definition hook"""
        # As a simple static requirement, this could also just be
        # listed in the project's build system dependencies instead
        return ["wheel"]

    def _prepare_out_of_tree_build(build_directory):
        """Prepare out-of-tree build by way of unpacking the sdist"""
        _make_sdist(build_directory)
        os.chdir(build_directory)
        if os.path.exists(SDIST_NAME):
            # Prevent caching of stale input files
            shutil.rmtree(SDIST_NAME)
        # Reopen the finished archive for reading before unpacking it
        with tarfile.open(SDIST_FILENAME) as sdist:
            sdist.extractall()
        os.remove(SDIST_FILENAME)
        os.chdir(SDIST_NAME)

    def build_wheel(wheel_directory, build_directory=None,
                    metadata_directory=None, config_settings=None):
        """PEP 517 wheel creation hook"""
        # First check if the frontend has requested an out-of-tree build
        out_of_tree_build = (build_directory is not None and
                             not os.path.samefile(build_directory, os.getcwd()))
        if out_of_tree_build:
            _prepare_out_of_tree_build(build_directory)
        # Continue with assembling the wheel archive
        # (relies on the installed ``wheel`` release providing ``wheel.archive``)
        from wheel.archive import archive_wheelfile
        path = os.path.join(wheel_directory, WHEEL_FILENAME)
        archive_wheelfile(path, "src/")
        return WHEEL_FILENAME

Of course, this is a *terrible* build backend: it requires the user to
have manually set up the wheel metadata in
``src/mypackage-0.1.dist-info/``; when the version number changes it
must be manually updated in multiple places... but it works, and more features
could be added incrementally. Much experience suggests that large successful
projects often originate as quick hacks (e.g., Linux -- "just a hobby,
won't be big and professional"; `IPython/Jupyter
<https://en.wikipedia.org/wiki/IPython#Grants_and_awards>`_ -- `a grad
student's $PYTHONSTARTUP file
<http://blog.fperez.org/2012/01/ipython-notebook-historical.html>`_),
so if our goal is to encourage the growth of a vibrant ecosystem of
good build tools, it's important to minimize the barrier to entry.


**Second**, because Python provides a simpler yet richer structure for
describing interfaces, we remove unnecessary complexity from the
specification -- and specifications are the worst place for
complexity, because changing specifications requires painful
consensus-building across many stakeholders. In the command-line
interface approach, we have to come up with ad hoc ways to map
multiple different kinds of inputs into a single linear command line
(e.g. how do we avoid collisions between user-specified configuration
arguments and PEP-defined arguments? How do we specify optional
arguments? When working with a Python interface these questions have
simple, obvious answers). When spawning and managing subprocesses,
there are many fiddly details that must be gotten right, subtle
cross-platform differences, and some of the most obvious approaches --
e.g., using stdout to return data for the ``build_requires`` operation
-- can create unexpected pitfalls (e.g., what happens when computing
the build requirements requires spawning some child processes, and
these children occasionally print an error message to stdout?
Obviously a careful build backend author can avoid this problem, but
the most obvious way of defining a Python interface removes this
possibility entirely, because the hook return value is clearly
demarcated).

In general, the need to isolate build backends into their own process
means that we can't remove IPC complexity entirely -- but by placing
both sides of the IPC channel under the control of a single project,
we make it much cheaper to fix bugs in the IPC interface than if
fixing bugs requires coordinated agreement and coordinated changes
across the ecosystem.

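As a rough illustration of what "both sides of the IPC channel under the
control of a single project" can look like, here is a minimal sketch of a
frontend-owned child runner. It is not part of this PEP: ``call_hook``, the
embedded runner script, and the JSON result file are hypothetical choices
made only for this example, and the hook is assumed to take a single
``config_settings`` argument. Because the return value travels through a
dedicated file, anything the backend prints to stdout cannot be mistaken
for the result.

```python
import json
import os
import subprocess
import sys
import tempfile

# Hypothetical child-side runner, shipped with the frontend. It imports the
# backend, calls one hook with an empty config_settings dict, and writes the
# hook's return value to the result file named on its command line.
_CHILD_RUNNER = """
import importlib, json, sys
backend_name, hook_name, result_path = sys.argv[1:4]
backend = importlib.import_module(backend_name)
result = getattr(backend, hook_name)({})
with open(result_path, "w") as f:
    json.dump(result, f)
"""


def call_hook(backend_name, hook_name, source_dir):
    """Run one backend hook in a child process and return its result."""
    with tempfile.TemporaryDirectory() as tmp:
        result_path = os.path.join(tmp, "result.json")
        # The child runs with the source tree as its working directory, so a
        # backend module located there is importable under "python -c".
        subprocess.run(
            [sys.executable, "-c", _CHILD_RUNNER,
             backend_name, hook_name, result_path],
            cwd=source_dir,
            check=True,
        )
        # Only the result file is read; the child's stdout is never parsed.
        with open(result_path) as f:
            return json.load(f)
```

Since the frontend owns both the runner script and the code that reads the
result file, either side of this private protocol could be changed in a
single frontend release without any change to the specification.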
**Third**, and most crucially, the Python hook approach gives us much
more powerful options for evolving this specification in the future.

For concreteness, imagine that next year we add a new
``build_sdist_from_vcs`` hook, which provides an alternative to the current
``build_sdist`` hook where the frontend is responsible for passing
version control tracking metadata to backends (including indicating when all
on-disk files are tracked), rather than individual backends having to query that
information themselves. In order to manage the transition, we'd want it to be
possible for build frontends to transparently use ``build_sdist_from_vcs`` when
available and fall back to ``build_sdist`` otherwise; and we'd want it to be
possible for build backends to define both methods, for compatibility
with both old and new build frontends.

Furthermore, our mechanism should also fulfill two more goals: (a) If
new versions of e.g. ``pip`` and ``flit`` are both updated to support
the new interface, then this should be sufficient for it to be used;
in particular, it should *not* be necessary for every project that
*uses* ``flit`` to update its individual ``pyproject.toml`` file. (b)
We do not want to have to spawn extra processes just to perform this
negotiation, because process spawns can easily become a bottleneck when
deploying large multi-package stacks on some platforms (Windows).

In the interface described here, all of these goals are easy to
achieve. Because ``pip`` controls the code that runs inside the child
process, it can easily write it to do something like::

    command, backend, args = parse_command_line_args(...)
    if command == "build_sdist":
        if hasattr(backend, "build_sdist_from_vcs"):
            backend.build_sdist_from_vcs(...)
        elif hasattr(backend, "build_sdist"):
            backend.build_sdist(...)
        else:
            # error handling
            raise RuntimeError("backend cannot build an sdist")

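The same negotiation can be exercised end to end with stand-in backends.
The sketch below is illustrative only: ``run_build_sdist`` is a hypothetical
frontend helper, ``build_sdist_from_vcs`` is the imagined future hook from
the discussion above, and its ``(sdist_directory, vcs_info)`` signature is
an assumption made for this example.

```python
import types


def run_build_sdist(backend, sdist_directory, vcs_info=None):
    """Call the newest sdist hook that the given backend provides."""
    if hasattr(backend, "build_sdist_from_vcs"):
        # Newer backend: hand over the VCS metadata collected by the frontend
        return backend.build_sdist_from_vcs(sdist_directory, vcs_info)
    if hasattr(backend, "build_sdist"):
        # Older backend: fall back to the original hook
        return backend.build_sdist(sdist_directory, {})
    raise RuntimeError("backend provides no sdist hook")


# Stand-in backends; real backends would be imported modules
old_backend = types.SimpleNamespace(
    build_sdist=lambda sdist_directory, config_settings: "old-0.1.tar.gz",
)
new_backend = types.SimpleNamespace(
    build_sdist=lambda sdist_directory, config_settings: "new-0.1.tar.gz",
    build_sdist_from_vcs=lambda sdist_directory, vcs_info: "new-0.1+vcs.tar.gz",
)

print(run_build_sdist(old_backend, "dist"))
print(run_build_sdist(new_backend, "dist", vcs_info={"all_tracked": True}))
```

Neither stand-in backend needed any change to a ``pyproject.toml`` file for
the frontend to choose the right hook, and no extra process was spawned to
find out which hooks exist.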
In the alternative where the public interface boundary is placed at
the subprocess call, this is not possible -- either we need to spawn
an extra process just to query what interfaces are supported (as was
included in an earlier draft of PEP 516, an alternative to this), or
else we give up on autonegotiation entirely (as in the current version
of that PEP), meaning that any changes in the interface will require
N individual packages to update their ``pyproject.toml`` files before
any change can go live, and that any changes will necessarily be
restricted to new releases.

One specific consequence of this is that in this PEP, we're able to
make the ``prepare_metadata_for_build_wheel`` command optional. In our design,
this can be readily handled by build frontends, which can put code in
their subprocess runner like::

    import os

    def dump_wheel_metadata(backend, working_dir):
        """Dumps wheel metadata to working directory.

        Returns absolute path to resulting metadata directory
        """
        if hasattr(backend, "prepare_metadata_for_build_wheel"):
            subdir = backend.prepare_metadata_for_build_wheel(working_dir)
        else:
            wheel_fname = backend.build_wheel(working_dir)
            already_built = os.path.join(working_dir, "ALREADY_BUILT_WHEEL")
            with open(already_built, "w") as f:
                f.write(wheel_fname)
            # unzip_metadata is a frontend-private helper (not shown here)
            subdir = unzip_metadata(os.path.join(working_dir, wheel_fname))
        return os.path.join(working_dir, subdir)

    def ensure_wheel_is_built(backend, output_dir, working_dir, metadata_dir):
        """Ensures built wheel is available in output directory

        Returns absolute path to resulting wheel file
        """
        already_built = os.path.join(working_dir, "ALREADY_BUILT_WHEEL")
        if os.path.exists(already_built):
            with open(already_built, "r") as f:
                wheel_fname = f.read().strip()
            working_path = os.path.join(working_dir, wheel_fname)
            final_path = os.path.join(output_dir, wheel_fname)
            os.rename(working_path, final_path)
            os.remove(already_built)
        else:
            wheel_fname = backend.build_wheel(output_dir, metadata_dir=metadata_dir)
        return os.path.join(output_dir, wheel_fname)

and thus expose a totally uniform interface to the rest of the frontend,
with no extra subprocess calls, no duplicated builds, etc. But
obviously this is the kind of code that you only want to write as part
of a private, within-project interface (e.g. the given example requires that
the working directory be shared between the two calls, but not with any
other wheel builds, and that the return value from the metadata helper function
will be passed back in to the wheel building one).

(And, of course, making the ``prepare_metadata_for_build_wheel`` hook
optional is one piece of lowering the barrier to entry for developing new
backends, as discussed above.)


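The ``unzip_metadata`` helper called in the subprocess-runner example is
not defined anywhere in this PEP. One possible shape for it, offered only as
a sketch, is shown below. It assumes the wheel contains a single top-level
``.dist-info`` directory, extracts that directory next to the wheel file,
and returns the directory name relative to the wheel's location, which is
the form the runner example joins onto ``working_dir``.

```python
import os
import zipfile


def unzip_metadata(wheel_path):
    """Extract the wheel's .dist-info directory beside the wheel file.

    Returns the name of the extracted directory (relative to the wheel's
    parent directory). Hypothetical helper, not defined by the PEP.
    """
    target_dir = os.path.dirname(os.path.abspath(wheel_path))
    with zipfile.ZipFile(wheel_path) as wheel:
        # Keep only archive members that live under a "*.dist-info" directory
        members = [name for name in wheel.namelist()
                   if name.split("/", 1)[0].endswith(".dist-info")]
        if not members:
            raise ValueError("no .dist-info directory found in " + wheel_path)
        wheel.extractall(target_dir, members=members)
    # Assumes one .dist-info directory, so any member gives its name
    return members[0].split("/", 1)[0]
```

Extracting only the metadata members keeps the working directory free of
the package's own files, so the already-built wheel remains the only copy
of the build output until it is moved into place.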
Other differences
=================

Besides the key command line versus Python hook difference described
above, there are a few other differences in this proposal:

* Metadata command is optional (as described above).

* We return metadata as a directory, rather than a single METADATA
  file. This aligns better with the way that in practice wheel metadata
  is distributed across multiple files (e.g. entry points), and gives us
  more options in the future. (For example, instead of following the PEP
  426 proposal of switching the format of METADATA to JSON, we might
  decide to keep the existing METADATA the way it is for backcompat,
  while adding new extensions as JSON "sidecar" files inside the same
  directory. Or maybe not; the point is it keeps our options more open.)

* We provide a mechanism for passing information between the metadata
  step and the wheel building step. I guess everyone probably will
  agree this is a good idea?

* We provide more detailed recommendations about the build environment,
  but these aren't normative anyway.


===========
 Copyright
===========

This document has been placed in the public domain.


..
   Local Variables:
   mode: indented-text
   indent-tabs-mode: nil
   sentence-end-double-space: t
   fill-column: 70
   coding: utf-8
   End: