559 lines
20 KiB
Plaintext
559 lines
20 KiB
Plaintext
PEP: 518
|
||
Title: Specifying Minimum Build System Requirements for Python Projects
|
||
Version: $Revision$
|
||
Last-Modified: $Date$
|
||
Author: Brett Cannon <brett@python.org>,
|
||
Nathaniel Smith <njs@pobox.com>,
|
||
Donald Stufft <donald@stufft.io>
|
||
BDFL-Delegate: Nick Coghlan
|
||
Discussions-To: distutils-sig <distutils-sig at python.org>
|
||
Status: Provisional
|
||
Type: Standards Track
|
||
Content-Type: text/x-rst
|
||
Created: 10-May-2016
|
||
Post-History: 10-May-2016,
|
||
11-May-2016,
|
||
13-May-2016
|
||
Resolution: https://mail.python.org/pipermail/distutils-sig/2016-May/028969.html
|
||
|
||
|
||
Abstract
|
||
========
|
||
|
||
This PEP specifies how Python software packages should specify what
|
||
dependencies they have in order to execute their chosen build system.
|
||
As part of this specification, a new configuration file is introduced
|
||
for software packages to use to specify their build dependencies (with
|
||
the expectation that the same configuration file will be used for
|
||
future configuration details).
|
||
|
||
|
||
Rationale
|
||
=========
|
||
|
||
When Python first developed its tooling for building distributions of
|
||
software for projects, distutils [#distutils]_ was the chosen
|
||
solution. As time went on, setuptools [#setuptools]_ gained popularity
|
||
to add some features on top of distutils. Both used the concept of a
|
||
``setup.py`` file that project maintainers executed to build
|
||
distributions of their software (as well as users to install said
|
||
distribution).
|
||
|
||
Using an executable file to specify build requirements under distutils
|
||
isn't an issue as distutils is part of Python's standard library.
|
||
Having the build tool as part of Python means that a ``setup.py`` has
|
||
no external dependency that a project maintainer needs to worry about
|
||
to build a distribution of their project. There was no need to specify
|
||
any dependency information as the only dependency is Python.
|
||
|
||
But when a project chooses to use setuptools, the use of an executable
|
||
file like ``setup.py`` becomes an issue. You can't execute a
|
||
``setup.py`` file without knowing its dependencies, but currently
|
||
there is no standard way to know what those dependencies are in an
|
||
automated fashion without executing the ``setup.py`` file where that
|
||
information is stored. It's a catch-22 of a file not being runnable
|
||
without knowing its own contents which can't be known programmatically
|
||
unless you run the file.
|
||
|
||
Setuptools tried to solve this with a ``setup_requires`` argument to
|
||
its ``setup()`` function [#setup_args]_. This solution has a number
|
||
of issues, such as:
|
||
|
||
* No tooling (besides setuptools itself) can access this information
|
||
without executing the ``setup.py``, but ``setup.py`` can't be
|
||
executed without having these items installed.
|
||
* While setuptools itself will install anything listed in this, they
|
||
won't be installed until *during* the execution of the ``setup()``
|
||
function, which means that the only way to actually use anything
|
||
added here is through increasingly complex machinations that delay
|
||
the import and usage of these modules until later on in the
|
||
execution of the ``setup()`` function.
|
||
* This cannot include ``setuptools`` itself nor can it include a
|
||
replacement to ``setuptools``, which means that projects such as
|
||
``numpy.distutils`` are largely incapable of utilizing it and
|
||
projects cannot take advantage of newer setuptools features until
|
||
their users naturally upgrade the version of setuptools to a newer
|
||
one.
|
||
* The items listed in ``setup_requires`` get implicily installed
|
||
whenever you execute the ``setup.py`` but one of the common ways
|
||
that the ``setup.py`` is executed is via another tool, such as
|
||
``pip``, who is already managing dependencies. This means that
|
||
a command like ``pip install spam`` might end up having both
|
||
pip and setuptools downloading and installing packages and end
|
||
users needing to configure *both* tools (and for ``setuptools``
|
||
without being in control of the invocation) to change settings
|
||
like which repository it installs from. It also means that users
|
||
need to be aware of the discovery rules for both tools, as one
|
||
may support different package formats or determine the latest
|
||
version differently.
|
||
|
||
This has cumulated in a situation where use of ``setup_requires``
|
||
is rare, where projects tend to either simply copy and paste snippets
|
||
between ``setup.py`` files or they eschew it all together in favor
|
||
of simply documenting elsewhere what they expect the user to have
|
||
manually installed prior to attempting to build or install their
|
||
project.
|
||
|
||
All of this has led pip [#pip]_ to simply assume that setuptools is
|
||
necessary when executing a ``setup.py`` file. The problem with this,
|
||
though, is it doesn't scale if another project began to gain traction
|
||
in the community as setuptools has. It also prevents other projects
|
||
from gaining traction due to the friction required to use it with a
|
||
project when pip can't infer the fact that something other than
|
||
setuptools is required.
|
||
|
||
This PEP attempts to rectify the situation by specifying a way to list
|
||
the minimal dependencies of the build system of a project in a
|
||
declarative fashion in a specific file. This allows a project to list
|
||
what build dependencies it has to go from e.g. source checkout to
|
||
wheel, while not falling into the catch-22 trap that a ``setup.py``
|
||
has where tooling can't infer what a project needs to build itself.
|
||
Implementing this PEP will allow projects to specify what build system
|
||
they depend on upfront so that tools like pip can make sure that they
|
||
are installed in order to run the build system to build the project.
|
||
|
||
To provide more context and motivation for this PEP, think of the
|
||
(rough) steps required to produce a built artifact for a project:
|
||
|
||
1. The source checkout of the project.
|
||
2. Installation of the build system.
|
||
3. Execute the build system.
|
||
|
||
This PEP covers step #2. It is fully expected that a future PEP will
|
||
cover step #3, including how to have the build system dynamically
|
||
specify more dependencies that the build system requires to perform
|
||
its job. The purpose of this PEP though, is to specify the minimal set
|
||
of requirements for the build system to simply begin execution.
|
||
|
||
|
||
Specification
|
||
=============
|
||
|
||
File Format
|
||
-----------
|
||
|
||
The build system dependencies will be stored in a file named
|
||
``pyproject.toml`` that is written in the TOML format [#toml]_.
|
||
|
||
This format was chosen as it is human-usable (unlike JSON [#json]_),
|
||
it is flexible enough (unlike configparser [#configparser]_), stems
|
||
from a standard (also unlike configparser [#configparser]_), and it
|
||
is not overly complex (unlike YAML [#yaml]_). The TOML format is
|
||
already in use by the Rust community as part of their
|
||
Cargo package manager [#cargo]_ and in private email stated they have
|
||
been quite happy with their choice of TOML. A more thorough
|
||
discussion as to why various alternatives were not chosen can be read
|
||
in the `Other file formats`_ section.
|
||
|
||
Tables not specified in this PEP are reserved for future use by other
|
||
PEPs.
|
||
|
||
build-system table
|
||
------------------
|
||
|
||
The ``[build-system]`` table is used to store build-related data.
|
||
Initially only one key of the table will be valid and mandatory:
|
||
``requires``. This key must have a value of a list of strings
|
||
representing PEP 508 dependencies required to execute the build
|
||
system (currently that means what dependencies are required to
|
||
execute a ``setup.py`` file).
|
||
|
||
For the vast majority of Python projects that rely upon setuptools,
|
||
the ``pyproject.toml`` file will be::
|
||
|
||
[build-system]
|
||
# Minimum requirements for the build system to execute.
|
||
requires = ["setuptools", "wheel"] # PEP 508 specifications.
|
||
|
||
Because the use of setuptools and wheel are so expansive in the
|
||
community at the moment, build tools are expected to use the example
|
||
configuration file above as their default semantics when a
|
||
``pyproject.toml`` file is not present.
|
||
|
||
tool table
|
||
----------
|
||
|
||
The ``[tool]`` table is where tools can have users specify
|
||
configuration data as long as they use a sub-table within ``[tool]``,
|
||
e.g. the `flit <https://pypi.python.org/pypi/flit>`_ tool would store
|
||
its configuration in ``[tool.flit]``.
|
||
|
||
We need some mechanism to allocate names within the ``tool.*``
|
||
namespace, to make sure that different projects don't attempt to use
|
||
the same sub-table and collide. Our rule is that a project can use
|
||
the subtable ``tool.$NAME`` if, and only if, they own the entry for
|
||
``$NAME`` in the Cheeseshop/PyPI.
|
||
|
||
JSON Schema
|
||
-----------
|
||
|
||
To provide a type-specific representation of the resulting data from
|
||
the TOML file for illustrative purposes only, the following JSON
|
||
Schema [#jsonschema]_ would match the data format::
|
||
|
||
{
|
||
"$schema": "http://json-schema.org/schema#",
|
||
|
||
"type": "object",
|
||
"additionalProperties": false,
|
||
|
||
"properties": {
|
||
"build-system": {
|
||
"type": "object",
|
||
"additionalProperties": false,
|
||
|
||
"properties": {
|
||
"requires": {
|
||
"type": "array",
|
||
"items": {
|
||
"type": "string"
|
||
}
|
||
}
|
||
},
|
||
"required": ["requires"]
|
||
},
|
||
|
||
"tool": {
|
||
"type": "object"
|
||
}
|
||
}
|
||
}
|
||
|
||
|
||
Rejected Ideas
|
||
==============
|
||
|
||
A semantic version key
|
||
----------------------
|
||
|
||
For future-proofing the structure of the configuration file, a
|
||
``semantics-version`` key was initially proposed. Defaulting to ``1``,
|
||
the idea was that if any semantics changes to previously defined keys
|
||
or tables occurred which were not backwards-compatible, then the
|
||
``semantics-version`` would be incremented to a new number.
|
||
|
||
In the end, though, it was decided that this was a premature
|
||
optimization. The expectation is that changes to what is pre-defined
|
||
semantically in the configuration file will be rather conservative.
|
||
And in the instances where a backwards-incompatible change would have
|
||
occurred, different names can be used for the new semantics to avoid
|
||
breaking older tools.
|
||
|
||
|
||
A more nested namespace
|
||
-----------------------
|
||
|
||
An earlier draft of this PEP had a top-level ``[package]`` table. The
|
||
idea was to impose some scoping for a semantics versioning scheme
|
||
(see `A semantic version key`_ for why that idea was rejected).
|
||
With the need for scoping removed, the point of having a top-level
|
||
table became superfluous.
|
||
|
||
|
||
Other table names
|
||
-----------------
|
||
|
||
Another name proposed for the ``[build-system]`` table was
|
||
``[build]``. The alternative name is shorter, but doesn't convey as
|
||
much of the intention of what information is store in the table. After
|
||
a vote on the distutils-sig mailing list, the current name won out.
|
||
|
||
|
||
Other file formats
|
||
------------------
|
||
|
||
Several other file formats were put forward for consideration, all
|
||
rejected for various reasons. Key requirements were that the format
|
||
be editable by human beings and have an implementation that can be
|
||
vendored easily by projects. This outright excluded certain formats
|
||
like XML which are not friendly towards human beings and were never
|
||
seriously discussed.
|
||
|
||
Overview of file formats considered
|
||
'''''''''''''''''''''''''''''''''''
|
||
|
||
The key reasons for rejecting the other alternatives considered are
|
||
summarised in the following sections, while the full review (including
|
||
positive arguments in favour of TOML) can be found at [#file_formats]_.
|
||
|
||
TOML was ultimately selected as it provided all the features we
|
||
were interested in, while avoiding the downsides introduced by
|
||
the alternatives.
|
||
|
||
======================= ==== ==== ==== =======
|
||
Feature TOML YAML JSON CFG/INI
|
||
======================= ==== ==== ==== =======
|
||
Well-defined yes yes yes
|
||
Real data types yes yes yes
|
||
Reliable Unicode yes yes yes
|
||
Reliable comments yes yes
|
||
Easy for humans to edit yes ?? ??
|
||
Easy for tools to edit yes ?? yes ??
|
||
In standard library yes yes
|
||
Easy for pip to vendor yes n/a n/a
|
||
======================= ==== ==== ==== =======
|
||
|
||
("??" in the table indicates items where most folks would be
|
||
inclined to answer "yes", but there turn out to be a lot of
|
||
quirks and edge cases that arise in practice due to either
|
||
the lack of a clear specification, or else the underlying
|
||
file format specification being surprisingly complicated)
|
||
|
||
The ``pytoml`` TOML parser is ~300 lines of pure Python code,
|
||
so being outside the standard library didn't count heavily
|
||
against it.
|
||
|
||
Python literals were also discussed as a potential format, but
|
||
weren't considered in the file format review (since they're not
|
||
a common pre-existing file format).
|
||
|
||
|
||
JSON
|
||
''''
|
||
|
||
The JSON format [#json]_ was initially considered but quickly
|
||
rejected. While great as a human-readable, string-based data exchange
|
||
format, the syntax does not lend itself to easy editing by a human
|
||
being (e.g. the syntax is more verbose than necessary while not
|
||
allowing for comments).
|
||
|
||
An example JSON file for the proposed data would be::
|
||
|
||
{
|
||
"build": {
|
||
"requires": [
|
||
"setuptools",
|
||
"wheel>=0.27"
|
||
]
|
||
}
|
||
}
|
||
|
||
|
||
YAML
|
||
''''
|
||
|
||
The YAML format [#yaml]_ was designed to be a superset of JSON
|
||
[#json]_ while being easier to work with by hand. There are three main
|
||
issues with YAML.
|
||
|
||
One is that the specification is large: 86 pages if printed on
|
||
letter-sized paper. That leaves the possibility that someone may use a
|
||
feature of YAML that works with one parser but not another. It has
|
||
been suggested to standardize on a subset, but that basically means
|
||
creating a new standard specific to this file which is not tractable
|
||
long-term.
|
||
|
||
Two is that YAML itself is not safe by default. The specification
|
||
allows for the arbitrary execution of code which is best avoided when
|
||
dealing with configuration data. It is of course possible to avoid
|
||
this behavior -- for example, PyYAML provides a ``safe_load`` operation
|
||
-- but if any tool carelessly uses ``load`` instead then they open
|
||
themselves up to arbitrary code execution. While this PEP is focused on
|
||
the building of projects which inherently involves code execution,
|
||
other configuration data such as project name and version number may
|
||
end up in the same file someday where arbitrary code execution is not
|
||
desired.
|
||
|
||
And finally, the most popular Python implemenation of YAML is
|
||
PyYAML [#pyyaml]_ which is a large project of a few thousand lines of
|
||
code and an optional C extension module. While in and of itself this
|
||
isn't necessarily an issue, this becomes more of a problem for
|
||
projects like pip where they would most likely need to vendor PyYAML
|
||
as a dependency so as to be fully self-contained (otherwise you end
|
||
up with your install tool needing an install tool to work). A
|
||
proof-of-concept re-working of PyYAML has been done to see how easy
|
||
it would be to potentially vendor a simpler version of the library
|
||
which shows it is a possibility.
|
||
|
||
An example YAML file is::
|
||
|
||
build:
|
||
requires:
|
||
- setuptools
|
||
- wheel>=0.27
|
||
|
||
|
||
configparser
|
||
''''''''''''
|
||
|
||
An INI-style configuration file based on what
|
||
configparser [#configparser]_ accepts was considered. Unfortunately
|
||
there is no specification of what configparser accepts, leading to
|
||
support skew between versions. For instance, what ConfigParser in
|
||
Python 2.7 accepts is not the same as what configparser in Python 3
|
||
accepts. While one could standardize on what Python 3 accepts and
|
||
simply vendor the backport of the configparser module, that does mean
|
||
this PEP would have to codify that the backport of configparser must
|
||
be used by all project wishes to consume the metadata specified by
|
||
this PEP. This is overly restrictive and could lead to confusion if
|
||
someone is not aware of that a specific version of configparser is
|
||
expected.
|
||
|
||
An example INI file is::
|
||
|
||
[build]
|
||
requires =
|
||
setuptools
|
||
wheel>=0.27
|
||
|
||
|
||
Python literals
|
||
'''''''''''''''
|
||
|
||
Someone proposed using Python literals as the configuration format.
|
||
The file would contain one dict at the top level, with the data all
|
||
inside that dict, with sections defined by the keys. All Python
|
||
programmers would be used to the format, there would implicitly be no
|
||
third-party dependency to read the configuration data, and it can be
|
||
safe if parsed by ``ast.literal_eval()`` [#ast_literal_eval]_.
|
||
Python literals can be identical to JSON, with the added benefit of
|
||
supporting trailing commas and comments. In addition, Python's richer
|
||
data model may be useful for some future configuration needs (e.g. non-string
|
||
dict keys, floating point vs. integer values).
|
||
|
||
On the other hand, python literals are a Python-specific format, and
|
||
it is anticipated that these data may need to be read by packaging
|
||
tools, etc. that are not written in Python.
|
||
|
||
An example Python literal file for the proposed data would be::
|
||
|
||
# The build configuration
|
||
{"build": {"requires": ["setuptools",
|
||
"wheel>=0.27", # note the trailing comma
|
||
# "numpy>=1.10" # a commented out data line
|
||
]
|
||
# and here is an arbitrary comment.
|
||
}
|
||
}
|
||
|
||
|
||
Sticking with ``setup.cfg``
|
||
---------------------------
|
||
|
||
There are two issues with ``setup.cfg`` used by setuptools as a general
|
||
format. One is that they are ``.ini`` files which have issues as mentioned
|
||
in the configparser_ discussion above. The other is that the schema for
|
||
that file has never been rigorously defined and thus it's unknown which
|
||
format would be safe to use going forward without potentially confusing
|
||
setuptools installations.
|
||
|
||
|
||
|
||
Other file names
|
||
----------------
|
||
|
||
Several other file names were considered and rejected (although this
|
||
is very much a bikeshedding topic, and so the decision comes down to
|
||
mostly taste).
|
||
|
||
pysettings.toml
|
||
Most reasonable alternative.
|
||
|
||
pypa.toml
|
||
While it makes sense to reference the PyPA [#pypa]_, it is a
|
||
somewhat niche term. It's better to have the file name make sense
|
||
without having domain-specific knowledge.
|
||
|
||
pybuild.toml
|
||
From the restrictive perspective of this PEP this filename makes
|
||
sense, but if any non-build metadata ever gets added to the file
|
||
then the name ceases to make sense.
|
||
|
||
pip.toml
|
||
Too tool-specific.
|
||
|
||
meta.toml
|
||
Too generic; project may want to have its own metadata file.
|
||
|
||
setup.toml
|
||
While keeping with traditional thanks to ``setup.py``, it does not
|
||
necessarily match what the file may contain in the future (.e.g is
|
||
knowing the name of a project inerhently part of its setup?).
|
||
|
||
pymeta.toml
|
||
Not obvious to newcomers to programming and/or Python.
|
||
|
||
pypackage.toml & pypackaging.toml
|
||
Name conflation of what a "package" is (project versus namespace).
|
||
|
||
pydevelop.toml
|
||
The file may contain details not specific to development.
|
||
|
||
pysource.toml
|
||
Not directly related to source code.
|
||
|
||
pytools.toml
|
||
Misleading as the file is (currently) aimed at project management.
|
||
|
||
dstufft.toml
|
||
Too person-specific. ;)
|
||
|
||
|
||
References
|
||
==========
|
||
|
||
.. [#distutils] distutils
|
||
(https://docs.python.org/3/library/distutils.html#module-distutils)
|
||
|
||
.. [#setuptools] setuptools
|
||
(https://pypi.python.org/pypi/setuptools)
|
||
|
||
.. [#setup_args] setuptools: New and Changed setup() Keywords
|
||
(http://pythonhosted.org/setuptools/setuptools.html#new-and-changed-setup-keywords)
|
||
|
||
.. [#pip] pip
|
||
(https://pypi.python.org/pypi/pip)
|
||
|
||
.. [#wheel] wheel
|
||
(https://pypi.python.org/pypi/wheel)
|
||
|
||
.. [#toml] TOML
|
||
(https://github.com/toml-lang/toml)
|
||
|
||
.. [#json] JSON
|
||
(http://json.org/)
|
||
|
||
.. [#yaml] YAML
|
||
(http://yaml.org/)
|
||
|
||
.. [#configparser] configparser
|
||
(https://docs.python.org/3/library/configparser.html#module-configparser)
|
||
|
||
.. [#pyyaml] PyYAML
|
||
(https://pypi.python.org/pypi/PyYAML)
|
||
|
||
.. [#pypa] PyPA
|
||
(https://www.pypa.io)
|
||
|
||
.. [#bazel] Bazel
|
||
(http://bazel.io/)
|
||
|
||
.. [#ast_literal_eval] ``ast.literal_eval()``
|
||
(https://docs.python.org/3/library/ast.html#ast.literal_eval)
|
||
|
||
.. [#cargo] Cargo, Rust's package manager
|
||
(http://doc.crates.io/)
|
||
|
||
.. [#jsonschema] JSON Schema
|
||
(http://json-schema.org/)
|
||
|
||
.. [#file_formats] Nathaniel J. Smith's file format review
|
||
(https://gist.github.com/njsmith/78f68204c5d969f8c8bc645ef77d4a8f)
|
||
|
||
|
||
Copyright
|
||
=========
|
||
|
||
This document has been placed in the public domain.
|
||
|
||
|
||
|
||
..
|
||
Local Variables:
|
||
mode: indented-text
|
||
indent-tabs-mode: nil
|
||
sentence-end-double-space: t
|
||
fill-column: 70
|
||
coding: utf-8
|
||
End:
|