PEP 680: Format rst/PEP and copyedit for grammar, clarity & phrasing (#2233)
Co-authored-by: Shantanu <12621235+hauntsaninja@users.noreply.github.com>
Co-authored-by: Taneli Hukkinen <3275109+hukkin@users.noreply.github.com>
parent
3ec824affd
commit
8513edec8d
428
pep-0680.rst
|
@ -14,7 +14,7 @@ Post-History: 11-Jan-2022
|
|||
Abstract
|
||||
========
|
||||
|
||||
This proposes adding a module, ``tomllib``, to the standard library for
|
||||
This PEP proposes adding the ``tomllib`` module to the standard library for
|
||||
parsing TOML (Tom's Obvious Minimal Language,
|
||||
`https://toml.io <https://toml.io/en/>`_).
|
||||
|
||||
|
@ -22,67 +22,83 @@ parsing TOML (Tom's Obvious Minimal Language,
|
|||
Motivation
|
||||
==========
|
||||
|
||||
The TOML format is the format of choice for Python packaging, as evidenced by
|
||||
:pep:`517`, :pep:`518` and :pep:`621`. Including TOML support in the standard
|
||||
library helps avoid bootstrapping problems for Python build tools. Currently
|
||||
most Python build tools need to vendor a TOML parsing library.
|
||||
|
||||
Python tools are increasingly configurable via TOML, for examples: ``black``,
|
||||
``mypy``, ``pytest``, ``tox``, ``pylint``, ``isort``. Those that are not, such
|
||||
as ``flake8``, cite the lack of standard library support as a `main reason why
|
||||
<https://github.com/PyCQA/flake8/issues/234#issuecomment-812800657>`_.
|
||||
TOML is the format of choice for Python packaging, as evidenced by
|
||||
:pep:`517`, :pep:`518` and :pep:`621`. This creates a bootstrapping
|
||||
problem for Python build tools, forcing them to vendor a TOML parsing
|
||||
package or employ other undesirable workarounds, and causes serious issues
|
||||
for repackagers and other downstream consumers. Including TOML support in
|
||||
the standard library would neatly solve all of these issues.
|
||||
|
||||
Further, many Python tools are now configurable via TOML, such as
|
||||
``black``, ``mypy``, ``pytest``, ``tox``, ``pylint`` and ``isort``.
|
||||
Many that are not, such as ``flake8``, cite the lack of standard library
|
||||
support as a `main reason why
|
||||
<https://github.com/PyCQA/flake8/issues/234#issuecomment-812800657>`__.
|
||||
Given the special place TOML already has in the Python ecosystem, it makes sense
|
||||
for this to be an included battery.
|
||||
for it to be an included battery.
|
||||
|
||||
Finally, TOML as a format is increasingly popular (some reasons for this are
|
||||
outlined in PEP 518). Hence this is likely to be a generally useful addition,
|
||||
even looking beyond the needs of Python packaging and Python tooling: various
|
||||
Python TOML libraries have about 2000 reverse dependencies on PyPI. For
|
||||
comparison, ``requests`` has about 28k reverse dependencies.
|
||||
Finally, TOML as a format is increasingly popular (for the reasons
|
||||
outlined in :pep:`518`), with various Python TOML libraries having about
|
||||
2000 reverse dependencies on PyPI (for comparison, ``requests`` has about
|
||||
28000 reverse dependencies). Hence, this is likely to be a generally useful
|
||||
addition, even looking beyond the needs of Python packaging and related tools.
|
||||
|
||||
|
||||
Rationale
|
||||
=========
|
||||
|
||||
This PEP proposes basing the standard library support for reading TOML on the
|
||||
third party library ``tomli``
|
||||
third-party library ``tomli``
|
||||
(`github.com/hukkin/tomli <https://github.com/hukkin/tomli>`_).
|
||||
|
||||
Many projects have recently switched to using ``tomli``, for example, ``pip``,
|
||||
Many projects have recently switched to using ``tomli``, such as ``pip``,
|
||||
``build``, ``pytest``, ``mypy``, ``black``, ``flit``, ``coverage``,
|
||||
``setuptools-scm``, ``cibuildwheel``.
|
||||
``setuptools-scm`` and ``cibuildwheel``.
|
||||
|
||||
``tomli`` is actively maintained and well-tested. ``tomli`` is about 800 lines
|
||||
of code with 100% test coverage and passes all tests in a test suite `proposed
|
||||
as the official TOML compliance test suite
|
||||
<https://github.com/toml-lang/compliance/pull/8>`_, as well as `the more
|
||||
established BurntSushi/toml-test suite
|
||||
``tomli`` is actively maintained and well-tested. It is about 800 lines
|
||||
of code with 100% test coverage, and passes all tests in the
|
||||
`proposed official TOML compliance test suite
|
||||
<https://github.com/toml-lang/compliance/pull/8>`_, as well as
|
||||
`the more established BurntSushi/toml-test suite
|
||||
<https://github.com/BurntSushi/toml-test>`_.
|
||||
|
||||
|
||||
Specification
|
||||
=============
|
||||
|
||||
A new module ``tomllib`` with the following functions will be added:
|
||||
A new module ``tomllib`` will be added to the Python standard library,
|
||||
exposing the following public functions:
|
||||
|
||||
.. code-block::
|
||||
|
||||
def load(fp: SupportsRead[bytes], /, *, parse_float: Callable[[str], Any] = ...) -> dict[str, Any]: ...
|
||||
def loads(s: str, /, *, parse_float: Callable[[str], Any] = ...) -> dict[str, Any]: ...
|
||||
def load(
|
||||
fp: SupportsRead[bytes],
|
||||
/,
|
||||
*,
|
||||
parse_float: Callable[[str], Any] = ...,
|
||||
) -> dict[str, Any]: ...
|
||||
|
||||
``tomllib.load`` deserializes a binary file containing a
|
||||
TOML document to a Python dict.
|
||||
def loads(
|
||||
s: str,
|
||||
/,
|
||||
*,
|
||||
parse_float: Callable[[str], Any] = ...,
|
||||
) -> dict[str, Any]: ...
|
||||
|
||||
``tomllib.load`` deserializes a binary file-like object containing a
|
||||
TOML document to a Python ``dict``.
|
||||
The ``fp`` argument must have a ``read()`` method with the same API as
|
||||
``io.RawIOBase.read()``.
|
||||
|
||||
``tomllib.loads`` deserializes a str instance containing a TOML document
|
||||
to a Python dict.
|
||||
``tomllib.loads`` deserializes a ``str`` instance containing a TOML document
|
||||
to a Python ``dict``.
|
||||
|
||||
``parse_float`` is a function that takes a string representing a TOML float and
|
||||
returns a Python object (similar to ``parse_float`` in ``json.load``). For
|
||||
example, a function returning a ``decimal.Decimal`` in cases where precision is
|
||||
important. By default, TOML floats are represented as ``float`` type.
|
||||
The ``parse_float`` argument is a callable object that takes as input the
|
||||
original string representation of a TOML float, and returns a corresponding
|
||||
Python object (similar to ``parse_float`` in ``json.load``).
|
||||
For example, the user may pass a function returning a ``decimal.Decimal``,
|
||||
for use cases where exact precision is important. By default, TOML floats
|
||||
are parsed as instances of the Python ``float`` type.
|
||||
|
||||
The returned object contains only basic Python objects (``str``, ``int``,
|
||||
``bool``, ``float``, ``datetime.{datetime,date,time}``, ``list``, ``dict`` with
|
||||
|
@ -91,7 +107,7 @@ string keys), and the results of ``parse_float``.
|
|||
``tomllib.TOMLDecodeError`` is raised in the case of invalid TOML.
|
||||
|
||||
Note that this PEP does not propose ``tomllib.dump`` or ``tomllib.dumps``
|
||||
functions, see `<Including an API for writing TOML_>`_ for details.
|
||||
functions; see `Including an API for writing TOML`_ for details.
|
||||
|
||||
|
||||
Maintenance Implications
|
||||
|
@ -100,69 +116,74 @@ Maintenance Implications
|
|||
Stability of TOML
|
||||
-----------------
|
||||
|
||||
The release of TOML v1 in January 2021 indicates stability. Empirically, TOML
|
||||
has proven to be a stable format even prior to the release of TOML v1. From the
|
||||
`changelog <https://github.com/toml-lang/toml/blob/master/CHANGELOG.md>`_, we
|
||||
see TOML has had no major changes since April 2020 and has had two releases in
|
||||
the last five years.
|
||||
The release of TOML 1.0.0 in January 2021 indicates the TOML format should
|
||||
now be officially considered stable. Empirically, TOML has proven to be a
|
||||
stable format even prior to the release of TOML 1.0.0. From the
|
||||
`changelog <https://github.com/toml-lang/toml/blob/master/CHANGELOG.md>`__, we
|
||||
can see that TOML has had no major changes since April 2020, and has had
|
||||
two releases in the past five years (2017-2021).
|
||||
|
||||
In the event of changes to the TOML specification, we could treat minor
|
||||
In the event of changes to the TOML specification, we can treat minor
|
||||
revisions as bug fixes and update the implementation in place. In the event of
|
||||
major breaking changes, we should preserve support for TOML v1.
|
||||
major breaking changes, we should preserve support for TOML 1.x.
|
||||
|
||||
|
||||
Maintainability of proposed implementation
|
||||
------------------------------------------
|
||||
|
||||
The proposed implementation (``tomli``) is in pure Python, well tested and
|
||||
weighs under 1000 lines of code. It is minimalist, offering a smaller API
|
||||
The proposed implementation (``tomli``) is pure Python, well tested and
|
||||
weighs in at under 1000 lines of code. It is minimalist, offering a smaller API
|
||||
surface area than other TOML implementations.
|
||||
|
||||
The author of ``tomli`` is willing to help integrate ``tomli`` into the standard
|
||||
library and help maintain it, `as per this post
|
||||
<https://github.com/hukkin/tomli/issues/141#issuecomment-998018972>`__.
|
||||
Petr Viktorin has indicated willingness to maintain a read API,
|
||||
`as per this post
|
||||
Furthermore, Python core developer Petr Viktorin has indicated a willingness
|
||||
to maintain a read API, `as per this post
|
||||
<https://discuss.python.org/t/adopting-recommending-a-toml-parser/4068/88>`__.
|
||||
|
||||
Rewriting the parser in C is not deemed necessary at this time. It's rare for
|
||||
TOML parsing to be a bottleneck in applications. Users with higher performance
|
||||
needs can use a third party library (as is already often the case with JSON,
|
||||
despite a stdlib extension module).
|
||||
Rewriting the parser in C is not deemed necessary at this time. It is rare for
|
||||
TOML parsing to be a bottleneck in applications, and users with higher performance
|
||||
needs can use a third-party library (as is already often the case with JSON,
|
||||
despite Python offering a standard library C-extension module).
|
||||
|
||||
|
||||
TOML support a slippery slope for other things
|
||||
----------------------------------------------
|
||||
|
||||
As discussed in motivations, TOML holds a special place in the Python ecosystem.
|
||||
As discussed in the `Motivation`_ section, TOML holds a special place in the
|
||||
Python ecosystem, for reading :pep:`518` ``pyproject.toml`` packaging
|
||||
and tool configuration files.
|
||||
This chief reason to include TOML in the standard library does not apply to
|
||||
other formats, such as YAML or MessagePack.
|
||||
|
||||
In addition, the simplicity of TOML can help serve as a dividing line, for
|
||||
example, YAML is large and complicated.
|
||||
In addition, the simplicity of TOML distinguishes it from other formats like
|
||||
YAML, which are highly complicated to construct and parse.
|
||||
|
||||
Including an API for writing TOML may, however, be added in a future PEP.
|
||||
An API for writing TOML may, however, be added in a future PEP.
|
||||
|
||||
|
||||
Backwards Compatibility
|
||||
=======================
|
||||
|
||||
This proposal has no backwards compatibility issues within the stdlib, as it
|
||||
describes a new module.
|
||||
This proposal has no backwards compatibility issues within the standard
|
||||
library, as it describes a new module.
|
||||
Any existing third-party module named ``tomllib`` will break, as
|
||||
``import tomllib`` will import standard library module.
|
||||
However, ``tomllib`` is not registered on PyPI, so it is unlikely that such
|
||||
a module is widely used.
|
||||
``import tomllib`` will import the standard library module.
|
||||
However, ``tomllib`` is not registered on PyPI, so it is unlikely that any
|
||||
module with this name is widely used.
|
||||
|
||||
Note that we avoid using the more straightforward name ``toml``, to avoid
|
||||
Note that we avoid using the more straightforward name ``toml`` to avoid
|
||||
backwards compatibility implications for users who have pinned versions of the
|
||||
current ``toml`` PyPI package. For more details, see `<Alternative names for
|
||||
module_>`_.
|
||||
current ``toml`` PyPI package.
|
||||
For more details, see the `Alternative names for the module`_ section.
|
||||
|
||||
|
||||
Security Implications
|
||||
=====================
|
||||
|
||||
Errors in the implementation could cause potential security issues.
|
||||
The parser's output is limited to simple data types; inability to load
|
||||
However, the parser's output is limited to simple data types; inability to load
|
||||
arbitrary classes avoids security issues common in more "powerful" formats like
|
||||
pickle and YAML. Also, the implementation will be in pure Python, which reduces
|
||||
security issues endemic to C, such as buffer overflows.
|
||||
|
@ -174,7 +195,7 @@ How to Teach This
|
|||
The API of ``tomllib`` mimics that of other well-established file format
|
||||
libraries, such as ``json`` and ``pickle``. The lack of a ``dump`` function will
|
||||
be explained in the documentation, with a link to relevant third-party libraries
|
||||
(``tomlkit``, ``tomli-w``, ``pytomlpp``).
|
||||
(e.g. ``tomlkit``, ``tomli-w``, ``pytomlpp``).
|
||||
|
||||
|
||||
Reference Implementation
|
||||
|
@ -191,87 +212,94 @@ Basing on another TOML implementation
|
|||
|
||||
Several potential alternative implementations exist:
|
||||
|
||||
* ``tomlkit`` is well established, actively maintained and supports TOML v1. An
|
||||
important difference is that ``tomlkit`` supports style roundtripping. As a
|
||||
* ``tomlkit`` is well established, actively maintained and supports TOML 1.0.0.
|
||||
An important difference is that ``tomlkit`` supports style roundtripping. As a
|
||||
result, it has a more complex API and implementation (about 5x as much code as
|
||||
``tomli``). Its author does not believe that ``tomlkit`` is a good choice for
|
||||
the standard library.
|
||||
|
||||
* ``toml`` is a very widely used library. However, it is not actively
|
||||
maintained, does not support TOML v1 and has several known bugs. Its API is
|
||||
more complex than that of ``tomli``. It has some very limited and mostly
|
||||
unused ability to preserve style through an undocumented decoder API. It has
|
||||
the ability to customise output style through a complicated encoder API. For
|
||||
more details on API differences to this PEP, refer to `Appendix A`_.
|
||||
maintained, does not support TOML 1.0.0 and has a number of known bugs. Its
|
||||
API is more complex than that of ``tomli``. It allows customising output style
|
||||
through a complicated encoder API, and some very limited and mostly unused
|
||||
functionality to preserve input style through an undocumented decoder API.
|
||||
For more details on its API differences from this PEP, refer to `Appendix A`_.
|
||||
|
||||
* ``pytomlpp`` is a Python wrapper for the C++ project ``toml++``. Pure Python
|
||||
libraries are easier to maintain than extension modules.
|
||||
|
||||
* ``rtoml`` is a Python wrapper for the Rust project ``toml-rs`` and hence has
|
||||
similar shortcomings to ``pytomlpp``.
|
||||
In addition, it does not support TOML v1.
|
||||
In addition, it does not support TOML 1.0.0.
|
||||
|
||||
* Writing an implementation from scratch. It's unclear what we would get from
|
||||
this: ``tomli`` meets our needs and the author is willing to help with its
|
||||
this; ``tomli`` meets our needs and the author is willing to help with its
|
||||
inclusion in the standard library.
|
||||
|
||||
|
||||
Including an API for writing TOML
|
||||
---------------------------------
|
||||
|
||||
There are several reasons to not include an API for writing TOML:
|
||||
There are several reasons to not include an API for writing TOML.
|
||||
|
||||
The ability to write TOML is not needed for the use cases that motivate this
|
||||
PEP: for core Python packaging use cases or for tools that need to read
|
||||
configuration.
|
||||
PEP: core Python packaging tools, and projects that need to read TOML
|
||||
configuration files.
|
||||
|
||||
Use cases that involve editing TOML (as opposed to writing brand new TOML) are
|
||||
better served by a style preserving library. TOML is intended as human-readable
|
||||
and human-editable configuration, so it's important to preserve human markup,
|
||||
such as comments and formatting. This requires a parser whose output includes
|
||||
style-related metadata, making it impractical to output plain Python types like
|
||||
``str`` and ``dict``. Designing such an API is complicated.
|
||||
Use cases that involve editing an existing TOML file (as opposed to writing a
|
||||
brand new one) are better served by a style-preserving library. TOML is
|
||||
intended as a human-readable and -editable configuration format, so it's
|
||||
important to preserve comments, formatting and other markup. This requires
|
||||
a parser whose output includes style-related metadata, making it impractical
|
||||
to output plain Python types like ``str`` and ``dict``. Furthermore, it
|
||||
substantially complicates the design of the API.
|
||||
|
||||
But even without considering style preservation, there are too many degrees of
|
||||
freedom in how to design a write API. For example, how much control to allow
|
||||
users over output formatting, over serialization of custom types, and over input
|
||||
and output validation. While there are reasonable choices on how to resolve
|
||||
these, the nature of the standard library is such that one only gets one chance
|
||||
to get things right.
|
||||
Even without considering style preservation, there are too many degrees of
|
||||
freedom in how to design a write API. For example, what default style
|
||||
(indentation, vertical and horizontal spacing, quotes, etc.) should the library
|
||||
use for the output, and how much control should users be given over it?
|
||||
How should the library handle input and output validation? Should it support
|
||||
serialization of custom types, and if so, how? While there are reasonable
|
||||
options for resolving these issues, the nature of the standard library is such
|
||||
that we only get "one chance to get it right".
|
||||
|
||||
Currently no CPython core developers have expressed willingness to maintain a
|
||||
write API or sponsor a PEP that includes a write API. Since it is hard to change
|
||||
Currently, no CPython core developers have expressed willingness to maintain a
|
||||
write API, or sponsor a PEP that includes one. Since it is hard to change
|
||||
or remove something in the standard library, it is safer to err on the side of
|
||||
exclusion and potentially revisit later.
|
||||
exclusion for now, and potentially revisit this later.
|
||||
|
||||
So, writing TOML is left to third-party libraries. If a good API and relevant
|
||||
use cases for it are found later, it can be added in a future PEP.
|
||||
Therefore, writing TOML is left to third-party libraries. If a good API and
|
||||
relevant use cases for it are found later, write support can be added in a
|
||||
future PEP.
|
||||
|
||||
|
||||
Assorted API details
|
||||
--------------------
|
||||
|
||||
Types accepted by the first argument of ``tomllib.load``
|
||||
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
|
||||
Types accepted as the first argument of ``tomllib.load``
|
||||
''''''''''''''''''''''''''''''''''''''''''''''''''''''''
|
||||
|
||||
The ``toml`` library on PyPI allows passing paths (and lists of path-like
|
||||
objects, ignoring missing files and merging the documents into a single object).
|
||||
Doing this would be inconsistent with ``json.load``, ``pickle.load``, etc. If we
|
||||
agree consistency with other stdlib modules is desirable, allowing paths is
|
||||
somewhat out of scope for this PEP. This can easily and explicitly be worked
|
||||
around in user code, or a third-party library.
|
||||
objects, ignoring missing files and merging the documents into a single object)
|
||||
to its ``load`` function. However, allowing this here would be inconsistent
|
||||
with the behavior of ``json.load``, ``pickle.load`` and other standard library
|
||||
functions. If we agree that consistency here is desirable,
|
||||
allowing paths is out of scope for this PEP. This can easily and explicitly
|
||||
be worked around in user code, or by using a third-party library.
|
||||
|
||||
The proposed API takes a binary file, while ``toml.load`` takes a text file and
|
||||
``json.load`` takes either. Using a binary file allows us to a) ensure utf-8 is
|
||||
the encoding used, b) avoid incorrectly parsing single carriage returns as valid
|
||||
TOML due to universal newlines.
|
||||
``json.load`` takes either. Using a binary file allows us to ensure UTF-8 is
|
||||
the encoding used, and avoid incorrectly parsing single carriage returns as
|
||||
valid TOML due to universal newlines in text mode.
|
||||
|
||||
Type accepted by the first argument of ``tomllib.loads``
|
||||
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
|
||||
|
||||
Type accepted as the first argument of ``tomllib.loads``
|
||||
''''''''''''''''''''''''''''''''''''''''''''''''''''''''
|
||||
|
||||
While ``tomllib.load`` takes a binary file, ``tomllib.loads`` takes
|
||||
a text string. This may seem inconsistent at first.
|
||||
|
||||
Quoting TOML v1.0.0 specification:
|
||||
Quoting the `TOML v1.0.0 specification <https://toml.io/en/v1.0.0#spec>`_:
|
||||
|
||||
A TOML file must be a valid UTF-8 encoded Unicode document.
|
||||
|
||||
|
@ -282,84 +310,95 @@ a Unicode document in Python is ``str``, not ``bytes``.
|
|||
It is possible to add ``bytes`` support in the future if needed, but
|
||||
we are not aware of any use cases for it.
|
||||
|
||||
|
||||
Controlling the type of mappings returned by ``tomllib.load[s]``
|
||||
----------------------------------------------------------------
|
||||
|
||||
The ``toml`` library on PyPI supports a ``_dict`` argument, which works
|
||||
similarly to the ``object_hook`` argument in ``json.load[s]``. There are several
|
||||
uses of ``_dict`` found on https://grep.app, however, almost all of them are
|
||||
passing ``_dict=OrderedDict``, which should be unnecessary as of Python 3.7. We
|
||||
found two instances of legitimate use: in one case, a custom class was passed
|
||||
for friendlier KeyErrors, in another case, the custom class had several
|
||||
The ``toml`` library on PyPI accepts a ``_dict`` argument in its ``load[s]``
|
||||
functions, which works similarly to the ``object_hook`` argument in
|
||||
``json.load[s]``. There are several uses of ``_dict`` found on
|
||||
https://grep.app; however, almost all of them are passing
|
||||
``_dict=OrderedDict``, which should be unnecessary as of Python 3.7.
|
||||
We found two instances of relevant use: in one case, a custom class was passed
|
||||
for friendlier KeyErrors; in the other, the custom class had several
|
||||
additional lookup and mutation methods (e.g. to help resolve dotted keys).
|
||||
|
||||
Such an argument is not necessary for the core use cases outlined in the
|
||||
motivation section. The absence of this can be pretty easily worked around using
|
||||
a wrapper class, transformer function, or a third-party library. Finally,
|
||||
support could be added later in a backward compatible way.
|
||||
Such a parameter is not necessary for the core use cases outlined in the
|
||||
`Motivation`_ section. The absence of this can be pretty easily worked around
|
||||
using a wrapper class, transformer function, or a third-party library. Finally,
|
||||
support could be added later in a backward-compatible way.
|
||||
|
||||
|
||||
Removing support for ``parse_float`` in ``tomllib.load[s]``
|
||||
-----------------------------------------------------------
|
||||
|
||||
This option is not strictly necessary, since TOML floats are "IEEE 754 binary64
|
||||
values", which is ``float`` on most architectures. Using ``decimal.Decimal``
|
||||
thus allows users extra precision not promised by the TOML format. However, in
|
||||
the author of ``tomli``'s experience, this is useful in scientific and financial
|
||||
applications. TOML-facing users may include non-developers who are not aware of
|
||||
the limits of double-precision float.
|
||||
values", which is equivalent to a Python ``float`` on most architectures.
|
||||
However, parsing floats differently, such as to ``decimal.Decimal``, allows
|
||||
users extra precision beyond that promised by the TOML format. In the
|
||||
author of ``tomli``'s experience, this is particularly useful in scientific and
|
||||
financial applications. This is also useful for other cases that need greater
|
||||
precision, or where end-users include non-developers who may not be aware of
|
||||
the limits of binary64 floats.
|
||||
|
||||
There are also niche architectures where the Python ``float`` is not a IEEE-754
|
||||
binary64. The ``parse_float`` argument allows users to achieve correct TOML
|
||||
semantics even on such architectures.
|
||||
There are also niche architectures where the Python ``float`` is not an IEEE 754
|
||||
binary64 value. The ``parse_float`` argument allows users to achieve correct
|
||||
TOML semantics even on such architectures.
|
||||
|
||||
|
||||
Alternative names for module
|
||||
----------------------------
|
||||
Alternative names for the module
|
||||
--------------------------------
|
||||
|
||||
Ideally, we would be able to use the ``toml`` module name.
|
||||
|
||||
However, the ``toml`` package on PyPI is widely used, so there are backward
|
||||
compatibility concerns. Since the standard library takes precedence over third
|
||||
party packages, users who have pinned versions of ``toml`` would be broken when
|
||||
upgrading Python versions by any API incompatibilities.
|
||||
party packages, libraries and applications that currently depend on the ``toml``
|
||||
package would likely break when upgrading Python versions due to the many
|
||||
API incompatibilities listed in `Appendix A`_, even if they pin their
|
||||
dependency versions.
|
||||
|
||||
To further clarify, the user pins are the specific concern here. Even if we were
|
||||
able to get control over the ``toml`` PyPI package and repurpose it as a
|
||||
standard library backport, we would still break users who have pinned to
|
||||
versions of the current ``toml`` package. This is unfortunate, since pinning
|
||||
To further clarify, applications with pinned dependencies are of greatest
|
||||
concern here. Even if we were able to obtain control of the ``toml`` PyPI
|
||||
package name and repurpose it for a backport of the proposed new module,
|
||||
we would still break users on new Python versions that included it in the
|
||||
standard library, regardless of whether they have pinned an older version of
|
||||
the existing ``toml`` package. This is unfortunate, since pinning
|
||||
would likely be a common response to breaking changes introduced by repurposing
|
||||
the ``toml`` package as a backport (that is incompatible with today's ``toml``).
|
||||
|
||||
There are several API incompatibilities between ``toml`` and the API proposed in
|
||||
this PEP, listed in `Appendix A`_.
|
||||
|
||||
Finally, the ``toml`` package on PyPI is not actively maintained and `we have
|
||||
been unable to contact the author <https://github.com/uiri/toml/issues/361>`__,
|
||||
Finally, the ``toml`` package on PyPI is not actively maintained, and
|
||||
efforts to request that the author add other maintainers
|
||||
`have so far been unsuccessful <https://github.com/uiri/toml/issues/361>`__,
|
||||
so action here would likely have to be taken without the author's consent.
|
||||
|
||||
This PEP proposes ``tomllib``. This mirrors ``plistlib`` and ``xdrlib`` (two
|
||||
other file format modules in the standard library), as well as several others
|
||||
such as ``pathlib``, ``contextlib``, ``graphlib``, etc.
|
||||
Instead, this PEP proposes the name ``tomllib``. This mirrors ``plistlib``
|
||||
and ``xdrlib``, two other file format modules in the standard library, as well
|
||||
as other modules, such as ``pathlib``, ``contextlib`` and ``graphlib``.
|
||||
|
||||
Other considered names include:
|
||||
Other names considered but rejected include:
|
||||
|
||||
* ``tomlparser``. This mirrors ``configparser``, but is perhaps slightly less
|
||||
* ``tomlparser``. This mirrors ``configparser``, but is perhaps somewhat less
|
||||
appropriate if we include a write API in the future.
|
||||
* ``tomli``. This assumes we use ``tomli`` as the basis for implementation.
|
||||
* ``toml`` under some namespace, such as ``parser.toml``. However, this is
|
||||
awkward, especially so since existing libraries like ``json``, ``pickle``,
|
||||
``marshal``, ``html`` etc. would not be included in the namespace.
|
||||
awkward, especially so since existing parsing libraries like ``json``,
|
||||
``pickle``, ``xml``, ``html`` etc. would not be included in the namespace.
|
||||
|
||||
|
||||
Previous Discussion
|
||||
===================
|
||||
|
||||
* https://bugs.python.org/issue40059
|
||||
* https://mail.python.org/pipermail/python-dev/2019-May/157405.html
|
||||
* https://mail.python.org/archives/list/python-ideas@python.org/thread/IWJ3I32A4TY6CIVQ6ONPEBPWP4TOV2V7/
|
||||
* https://discuss.python.org/t/adopting-recommending-a-toml-parser/4068/84
|
||||
* https://github.com/hukkin/tomli/issues/141
|
||||
* `bpo-40059: Provide a toml module in the standard library
|
||||
<https://bugs.python.org/issue40059>`_
|
||||
* `[Python-Dev] Adding a toml module to the standard lib?
|
||||
<https://mail.python.org/pipermail/python-dev/2019-May/157405.html>`_
|
||||
* `[Python-Ideas] Python standard library TOML module
|
||||
<https://mail.python.org/archives/list/python-ideas@python.org/thread/IWJ3I32A4TY6CIVQ6ONPEBPWP4TOV2V7/>`_
|
||||
* `[Packaging] Adopting/recommending a toml parser?
|
||||
<https://discuss.python.org/t/adopting-recommending-a-toml-parser/4068>`_
|
||||
* `hukkin/tomli#141: Please consider pushing tomli into the stdlib
|
||||
<https://github.com/hukkin/tomli/issues/141>`_
|
||||
|
||||
|
||||
.. _Appendix A:
|
||||
|
@ -368,23 +407,25 @@ Appendix A: Differences between proposed API and ``toml``
|
|||
=========================================================
|
||||
|
||||
This appendix covers the differences between the API proposed in this PEP and
|
||||
that of the third party package ``toml``. These differences are relevant to
|
||||
that of the third-party package ``toml``. These differences are relevant to
|
||||
understanding the amount of breakage we could expect if we used the ``toml``
|
||||
name for the standard library module, as well as to better understand the design
|
||||
space. Note that this list might not be exhaustive.
|
||||
|
||||
#. This PEP currently proposes not to include a write API. That is, there will
|
||||
be no equivalent of ``toml.dump`` or ``toml.dumps``.
|
||||
#. No proposed inclusion of a write API (no ``toml.dump[s]``)
|
||||
|
||||
Discussed at `<Including an API for writing TOML_>`_.
|
||||
This PEP currently proposes not including a write API; that is, there will
|
||||
be no equivalent of ``toml.dump`` or ``toml.dumps``, as discussed at
|
||||
`Including an API for writing TOML`_.
|
||||
|
||||
If we included a write API, it would be relatively simple to convert most
|
||||
code that uses ``toml`` to use the API proposed in this PEP (acknowledging
|
||||
that that is very different from a compatible API).
|
||||
If we included a write API, it would be relatively straightforward to
|
||||
convert most code that uses ``toml`` to the new standard library module
|
||||
(acknowledging that this is very different from a compatible API, as it
|
||||
would still require code changes).
|
||||
|
||||
A significant fraction of ``toml`` users rely on this, based on comparing
|
||||
`occurrences of "toml.load" <https://grep.app/search?q=toml.load&filter[lang][0]=Python>`__
|
||||
to `occurences of "toml.dump" <https://grep.app/search?q=toml.dump&filter[lang][0]=Python>`__.
|
||||
to `occurrences of "toml.dump" <https://grep.app/search?q=toml.dump&filter[lang][0]=Python>`__.
|
||||
|
||||
#. Different first argument of ``toml.load``
|
||||
|
||||
|
@ -398,76 +439,77 @@ space. Note that this list might not be exhaustive.
|
|||
decoder: TomlDecoder = ...,
|
||||
) -> MutableMapping[str, Any]: ...
|
||||
|
||||
This is pretty different from the first argument proposed in this PEP: ``SupportsRead[bytes]``.
|
||||
This is quite different from the first argument proposed in this PEP:
|
||||
``SupportsRead[bytes]``.
|
||||
|
||||
Recapping the reasons for this, previously mentioned at
|
||||
`<Types accepted by the first argument of tomllib.load_>`_:
|
||||
`Types accepted as the first argument of tomllib.load`_:
|
||||
|
||||
* Allowing passing of paths (and lists of path-like objects, ignoring missing
|
||||
files and merging the documents into a single object) is inconsistent with
|
||||
* Allowing paths (and even lists of paths) as arguments is inconsistent with
|
||||
other similar functions in the standard library.
|
||||
* Using ``SupportsRead[bytes]`` allows us to a) ensure utf-8 is the encoding used,
|
||||
b) avoid incorrectly parsing single carriage returns as valid TOML due to
|
||||
universal newlines. TOML specifies file encoding and valid newline
|
||||
sequences, and hence is simply stricter format than what text file objects
|
||||
represent.
|
||||
* Using ``SupportsRead[bytes]`` allows us to ensure UTF-8 is the encoding used,
|
||||
and avoid incorrectly parsing single carriage returns as valid TOML.
|
||||
|
||||
A significant fraction of ``toml`` users rely on this, based on manual inspection
|
||||
of `occurrences of "toml.load" <https://grep.app/search?q=toml.load&filter[lang][0]=Python>`__.
|
||||
A significant fraction of ``toml`` users rely on this, based on manual
|
||||
inspection of `occurrences of "toml.load"
|
||||
<https://grep.app/search?q=toml.load&filter[lang][0]=Python>`__.
|
||||
|
||||
#. Errors
|
||||
|
||||
``toml`` raises ``TomlDecodeError`` vs the proposed PEP 8 compliant
|
||||
``toml`` raises ``TomlDecodeError``, vs. the proposed :pep:`8`-compliant
|
||||
``TOMLDecodeError``.
|
||||
|
||||
A significant fraction of ``toml`` users rely on this, based on `occurrences
|
||||
of "TomlDecodeError"
|
||||
A significant fraction of ``toml`` users rely on this, based on
|
||||
`occurrences of "TomlDecodeError"
|
||||
<https://grep.app/search?q=TomlDecodeError&case=true&filter[lang][0]=Python>`__.
|
||||
|
||||
#. ``toml.load[s]`` accepts a ``_dict`` argument
|
||||
|
||||
Discussed at `<Controlling the type of mappings returned by tomllib.load[s]_>`_.
|
||||
Discussed at `Controlling the type of mappings returned by tomllib.load[s]`_.
|
||||
|
||||
As discussed, almost all usage consists of ``_dict=OrderedDict``, which is
|
||||
not necessary in Python 3.7 and later.
|
||||
As mentioned there, almost all usage consists of ``_dict=OrderedDict``,
|
||||
which is not necessary in Python 3.7 and later.
|
||||
|
||||
#. ``toml.load[s]`` support an undocumented ``decoder`` argument
|
||||
|
||||
It seems the intended use case is for an implementation of comment
|
||||
preservation. The information recorded is not sufficient to roundtrip the
|
||||
TOML document preserving style, the implementation has known bugs, the
|
||||
feature is undocumented and I could only find one instance of its use on
|
||||
feature is undocumented and we could only find one instance of its use on
|
||||
https://grep.app.
|
||||
|
||||
The `toml.TomlDecoder interface <https://github.com/uiri/toml/blob/3f637dba5f68db63d4b30967fedda51c82459471/toml/decoder.pyi#L36>`__
|
||||
exposed is not simple, containing nine methods.
|
||||
The `toml.TomlDecoder interface
|
||||
<https://github.com/uiri/toml/blob/3f637dba5f68db63d4b30967fedda51c82459471/toml/decoder.pyi#L36>`__
|
||||
exposed is far from simple, containing nine methods.
|
||||
|
||||
Users are probably better served by a more complete implementation of style
|
||||
preserving parsing and writing.
|
||||
Users are likely better served by a more complete implementation of
|
||||
style-preserving parsing and writing.
|
||||
|
||||
#. ``toml.dump[s]`` support an ``encoder`` argument
|
||||
|
||||
Note that we currently propose not to include a write API, however if that
|
||||
Note that we currently propose to not include a write API; however, if that
|
||||
were to change, these differences would likely become relevant.
|
||||
|
||||
This enables two use cases, a) control over how custom types should be
|
||||
serialized, b) control over how output should be formatted.
|
||||
The ``encoder`` argument enables two use cases:
|
||||
|
||||
The first use case is reasonable, however, I could only find two instances of
|
||||
this on https://grep.app. One of these two instances used this ability to add
|
||||
support for dumping ``decimal.Decimal`` (which a potential standard library
|
||||
implementation would support out of the box).
|
||||
* control over how custom types should be serialized, and
|
||||
* control over how output should be formatted.
|
||||
|
||||
If needed, this use case could be well served by the equivalent of the
|
||||
``default`` argument in ``json.dump``.
|
||||
The first is reasonable; however, we could only find two instances of
|
||||
this on https://grep.app. One of these two used this ability to add
|
||||
support for dumping ``decimal.Decimal``, which a potential standard library
|
||||
implementation would support out of the box.
|
||||
If needed for other types, this use case could be well served by the
|
||||
equivalent of the ``default`` argument in ``json.dump``.
|
||||
|
||||
The second use case is enabled by allowing users to specify subclasses of
|
||||
`toml.TomlEncoder <https://github.com/uiri/toml/blob/3f637dba5f68db63d4b30967fedda51c82459471/toml/encoder.pyi#L9>`__
|
||||
`toml.TomlEncoder
|
||||
<https://github.com/uiri/toml/blob/3f637dba5f68db63d4b30967fedda51c82459471/toml/encoder.pyi#L9>`__
|
||||
and overriding methods to specify parts of the TOML writing process. The API
|
||||
consists of five methods and exposes a lot of implementation detail.
|
||||
consists of five methods and exposes substantial implementation detail.
|
||||
|
||||
There is some usage of the ``encoder`` API on https://grep.app, however, it
|
||||
likely accounts for a tiny fraction of overall usage of ``toml``.
|
||||
There is some usage of the ``encoder`` API on https://grep.app; however, it
|
||||
appears to account for a tiny fraction of the overall usage of ``toml``.
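For reference, this is how the ``default`` argument mentioned for the first use case works in the ``json`` module today. Nothing equivalent is proposed for TOML; the helper name ``encode_unknown`` and the sample data are purely illustrative.

```python
import json
from pathlib import PurePosixPath


def encode_unknown(obj):
    """Fallback called by json for objects it cannot serialize natively."""
    if isinstance(obj, PurePosixPath):
        return str(obj)
    # Follow the json convention: signal unsupported types with TypeError.
    raise TypeError(f"Object of type {type(obj).__name__} is not serializable")


# Hypothetical data containing a type that json does not know about.
data = {"config": PurePosixPath("/etc/app.toml")}

serialized = json.dumps(data, default=encode_unknown)
```

A single callback like this covers custom-type serialization without exposing the encoder internals that subclassing ``toml.TomlEncoder`` requires.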
|
||||
|
||||
#. Timezones
|
||||
|
||||
|
|