2022-01-10 15:55:30 -05:00
|
|
|
|
PEP: 680
|
2022-01-12 05:57:51 -05:00
|
|
|
|
Title: tomllib: Support for Parsing TOML in the Standard Library
|
2022-01-10 15:55:30 -05:00
|
|
|
|
Author: Taneli Hukkinen, Shantanu Jain <hauntsaninja at gmail.com>
|
|
|
|
|
Sponsor: Petr Viktorin <encukou@gmail.com>
|
2022-01-12 05:57:51 -05:00
|
|
|
|
Discussions-To: https://discuss.python.org/t/13040
|
2022-01-10 15:55:30 -05:00
|
|
|
|
Status: Draft
|
|
|
|
|
Type: Standards Track
|
|
|
|
|
Content-Type: text/x-rst
|
|
|
|
|
Created: 01-Jan-2022
|
|
|
|
|
Python-Version: 3.11
|
2022-01-12 05:57:51 -05:00
|
|
|
|
Post-History: 11-Jan-2022
|
2022-01-10 15:55:30 -05:00
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Abstract
|
|
|
|
|
========
|
|
|
|
|
|
2022-01-13 09:25:58 -05:00
|
|
|
|
This PEP proposes adding the ``tomllib`` module to the standard library for
|
2022-01-10 15:55:30 -05:00
|
|
|
|
parsing TOML (Tom's Obvious Minimal Language,
|
|
|
|
|
`https://toml.io <https://toml.io/en/>`_).
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Motivation
|
|
|
|
|
==========
|
|
|
|
|
|
2022-01-13 09:25:58 -05:00
|
|
|
|
TOML is the format of choice for Python packaging, as evidenced by
|
|
|
|
|
:pep:`517`, :pep:`518` and :pep:`621`. This creates a bootstrapping
|
|
|
|
|
problem for Python build tools, forcing them to vendor a TOML parsing
|
|
|
|
|
package or employ other undesirable workarounds, and causes serious issues
|
|
|
|
|
for repackagers and other downstream consumers. Including TOML support in
|
|
|
|
|
the standard library would neatly solve all of these issues.
|
|
|
|
|
|
|
|
|
|
Further, many Python tools are now configurable via TOML, such as
|
|
|
|
|
``black``, ``mypy``, ``pytest``, ``tox``, ``pylint`` and ``isort``.
|
|
|
|
|
Many that are not, such as ``flake8``, cite the lack of standard library
|
|
|
|
|
support as a `main reason why
|
2022-02-03 18:15:09 -05:00
|
|
|
|
<https://github.com/PyCQA/flake8/issues/234#issuecomment-812800722>`__.
|
2022-01-10 15:55:30 -05:00
|
|
|
|
Given the special place TOML already has in the Python ecosystem, it makes sense
|
2022-01-13 09:25:58 -05:00
|
|
|
|
for it to be an included battery.
|
2022-01-10 15:55:30 -05:00
|
|
|
|
|
2022-01-13 09:25:58 -05:00
|
|
|
|
Finally, TOML as a format is increasingly popular (for the reasons
|
|
|
|
|
outlined in :pep:`518`), with various Python TOML libraries having about
|
|
|
|
|
2000 reverse dependencies on PyPI (for comparison, ``requests`` has about
|
|
|
|
|
28000 reverse dependencies). Hence, this is likely to be a generally useful
|
|
|
|
|
addition, even looking beyond the needs of Python packaging and related tools.
|
2022-01-10 15:55:30 -05:00
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Rationale
|
|
|
|
|
=========
|
|
|
|
|
|
|
|
|
|
This PEP proposes basing the standard library support for reading TOML on the
|
2022-01-13 09:25:58 -05:00
|
|
|
|
third-party library ``tomli``
|
2022-01-10 15:55:30 -05:00
|
|
|
|
(`github.com/hukkin/tomli <https://github.com/hukkin/tomli>`_).
|
|
|
|
|
|
2022-01-13 09:25:58 -05:00
|
|
|
|
Many projects have recently switched to using ``tomli``, such as ``pip``,
|
2022-01-10 15:55:30 -05:00
|
|
|
|
``build``, ``pytest``, ``mypy``, ``black``, ``flit``, ``coverage``,
|
2022-01-13 09:25:58 -05:00
|
|
|
|
``setuptools-scm`` and ``cibuildwheel``.
|
2022-01-10 15:55:30 -05:00
|
|
|
|
|
2022-01-13 09:25:58 -05:00
|
|
|
|
``tomli`` is actively maintained and well-tested. It is about 800 lines
|
|
|
|
|
of code with 100% test coverage, and passes all tests in the
|
|
|
|
|
`proposed official TOML compliance test suite
|
|
|
|
|
<https://github.com/toml-lang/compliance/pull/8>`_, as well as
|
|
|
|
|
`the more established BurntSushi/toml-test suite
|
2022-01-10 15:55:30 -05:00
|
|
|
|
<https://github.com/BurntSushi/toml-test>`_.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Specification
|
|
|
|
|
=============
|
|
|
|
|
|
2022-01-13 09:25:58 -05:00
|
|
|
|
A new module ``tomllib`` will be added to the Python standard library,
|
|
|
|
|
exposing the following public functions:
|
2022-01-10 15:55:30 -05:00
|
|
|
|
|
|
|
|
|
.. code-block::
|
|
|
|
|
|
2022-01-13 09:25:58 -05:00
|
|
|
|
def load(
|
|
|
|
|
fp: SupportsRead[bytes],
|
|
|
|
|
/,
|
|
|
|
|
*,
|
|
|
|
|
parse_float: Callable[[str], Any] = ...,
|
|
|
|
|
) -> dict[str, Any]: ...
|
|
|
|
|
|
|
|
|
|
def loads(
|
|
|
|
|
s: str,
|
|
|
|
|
/,
|
|
|
|
|
*,
|
|
|
|
|
parse_float: Callable[[str], Any] = ...,
|
|
|
|
|
) -> dict[str, Any]: ...
|
|
|
|
|
|
|
|
|
|
``tomllib.load`` deserializes a binary file-like object containing a
|
|
|
|
|
TOML document to a Python ``dict``.
|
2022-01-10 15:55:30 -05:00
|
|
|
|
The ``fp`` argument must have a ``read()`` method with the same API as
|
|
|
|
|
``io.RawIOBase.read()``.
|
|
|
|
|
|
2022-01-13 09:25:58 -05:00
|
|
|
|
``tomllib.loads`` deserializes a ``str`` instance containing a TOML document
|
|
|
|
|
to a Python ``dict``.
|
2022-01-10 15:55:30 -05:00
|
|
|
|
|
2022-01-13 09:25:58 -05:00
|
|
|
|
The ``parse_float`` argument is a callable object that takes as input the
|
|
|
|
|
original string representation of a TOML float, and returns a corresponding
|
|
|
|
|
Python object (similar to ``parse_float`` in ``json.load``).
|
|
|
|
|
For example, the user may pass a function returning a ``decimal.Decimal``,
|
|
|
|
|
for use cases where exact precision is important. By default, TOML floats
|
|
|
|
|
are parsed as instances of the Python ``float`` type.
|
2022-01-10 15:55:30 -05:00
|
|
|
|
|
|
|
|
|
The returned object contains only basic Python objects (``str``, ``int``,
|
|
|
|
|
``bool``, ``float``, ``datetime.{datetime,date,time}``, ``list``, ``dict`` with
|
|
|
|
|
string keys), and the results of ``parse_float``.
|
|
|
|
|
|
|
|
|
|
``tomllib.TOMLDecodeError`` is raised in the case of invalid TOML.
|
|
|
|
|
|
|
|
|
|
Note that this PEP does not propose ``tomllib.dump`` or ``tomllib.dumps``
|
2022-01-13 09:25:58 -05:00
|
|
|
|
functions; see `Including an API for writing TOML`_ for details.
|
2022-01-10 15:55:30 -05:00
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Maintenance Implications
|
|
|
|
|
========================
|
|
|
|
|
|
|
|
|
|
Stability of TOML
|
|
|
|
|
-----------------
|
|
|
|
|
|
2022-01-13 09:25:58 -05:00
|
|
|
|
The release of TOML 1.0.0 in January 2021 indicates the TOML format should
|
|
|
|
|
now be officially considered stable. Empirically, TOML has proven to be a
|
|
|
|
|
stable format even prior to the release of TOML 1.0.0. From the
|
|
|
|
|
`changelog <https://github.com/toml-lang/toml/blob/master/CHANGELOG.md>`__, we
|
|
|
|
|
can see that TOML has had no major changes since April 2020, and has had
|
|
|
|
|
two releases in the past five years (2017-2021).
|
2022-01-10 15:55:30 -05:00
|
|
|
|
|
2022-01-13 09:25:58 -05:00
|
|
|
|
In the event of changes to the TOML specification, we can treat minor
|
2022-01-10 15:55:30 -05:00
|
|
|
|
revisions as bug fixes and update the implementation in place. In the event of
|
2022-01-13 09:25:58 -05:00
|
|
|
|
major breaking changes, we should preserve support for TOML 1.x.
|
|
|
|
|
|
2022-01-10 15:55:30 -05:00
|
|
|
|
|
|
|
|
|
Maintainability of proposed implementation
|
|
|
|
|
------------------------------------------
|
|
|
|
|
|
2022-01-13 09:25:58 -05:00
|
|
|
|
The proposed implementation (``tomli``) is pure Python, well tested and
|
|
|
|
|
weighs in at under 1000 lines of code. It is minimalist, offering a smaller API
|
2022-01-10 15:55:30 -05:00
|
|
|
|
surface area than other TOML implementations.
|
|
|
|
|
|
|
|
|
|
The author of ``tomli`` is willing to help integrate ``tomli`` into the standard
|
|
|
|
|
library and help maintain it, `as per this post
|
|
|
|
|
<https://github.com/hukkin/tomli/issues/141#issuecomment-998018972>`__.
|
2022-01-13 09:25:58 -05:00
|
|
|
|
Furthermore, Python core developer Petr Viktorin has indicated a willingness
|
|
|
|
|
to maintain a read API, `as per this post
|
2022-01-10 15:55:30 -05:00
|
|
|
|
<https://discuss.python.org/t/adopting-recommending-a-toml-parser/4068/88>`__.
|
|
|
|
|
|
2022-01-13 09:25:58 -05:00
|
|
|
|
Rewriting the parser in C is not deemed necessary at this time. It is rare for
|
|
|
|
|
TOML parsing to be a bottleneck in applications, and users with higher performance
|
|
|
|
|
needs can use a third-party library (as is already often the case with JSON,
|
|
|
|
|
despite Python offering a standard library C-extension module).
|
|
|
|
|
|
2022-01-10 15:55:30 -05:00
|
|
|
|
|
|
|
|
|
TOML support a slippery slope for other things
|
|
|
|
|
----------------------------------------------
|
|
|
|
|
|
2022-01-13 09:25:58 -05:00
|
|
|
|
As discussed in the `Motivation`_ section, TOML holds a special place in the
|
|
|
|
|
Python ecosystem, for reading :pep:`518` ``pyproject.toml`` packaging
|
|
|
|
|
and tool configuration files.
|
2022-01-10 15:55:30 -05:00
|
|
|
|
This chief reason to include TOML in the standard library does not apply to
|
|
|
|
|
other formats, such as YAML or MessagePack.
|
|
|
|
|
|
2022-01-13 09:25:58 -05:00
|
|
|
|
In addition, the simplicity of TOML distinguishes it from other formats like
|
|
|
|
|
YAML, which are highly complicated to construct and parse.
|
2022-01-10 15:55:30 -05:00
|
|
|
|
|
2022-01-13 09:25:58 -05:00
|
|
|
|
An API for writing TOML may, however, be added in a future PEP.
|
2022-01-10 15:55:30 -05:00
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Backwards Compatibility
|
|
|
|
|
=======================
|
|
|
|
|
|
2022-01-13 09:25:58 -05:00
|
|
|
|
This proposal has no backwards compatibility issues within the standard
|
|
|
|
|
library, as it describes a new module.
|
2022-01-10 15:55:30 -05:00
|
|
|
|
Any existing third-party module named ``tomllib`` will break, as
|
2022-01-13 09:25:58 -05:00
|
|
|
|
``import tomllib`` will import the standard library module.
|
|
|
|
|
However, ``tomllib`` is not registered on PyPI, so it is unlikely that any
|
|
|
|
|
module with this name is widely used.
|
2022-01-10 15:55:30 -05:00
|
|
|
|
|
2022-01-13 09:25:58 -05:00
|
|
|
|
Note that we avoid using the more straightforward name ``toml`` to avoid
|
2022-01-10 15:55:30 -05:00
|
|
|
|
backwards compatibility implications for users who have pinned versions of the
|
2022-01-13 09:25:58 -05:00
|
|
|
|
current ``toml`` PyPI package.
|
|
|
|
|
For more details, see the `Alternative names for the module`_ section.
|
2022-01-10 15:55:30 -05:00
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Security Implications
|
|
|
|
|
=====================
|
|
|
|
|
|
|
|
|
|
Errors in the implementation could cause potential security issues.
|
2022-01-13 09:25:58 -05:00
|
|
|
|
However, the parser's output is limited to simple data types; inability to load
|
2022-01-10 15:55:30 -05:00
|
|
|
|
arbitrary classes avoids security issues common in more "powerful" formats like
|
|
|
|
|
pickle and YAML. Also, the implementation will be in pure Python, which reduces
|
|
|
|
|
security issues endemic to C, such as buffer overflows.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
How to Teach This
|
|
|
|
|
=================
|
|
|
|
|
|
|
|
|
|
The API of ``tomllib`` mimics that of other well-established file format
|
|
|
|
|
libraries, such as ``json`` and ``pickle``. The lack of a ``dump`` function will
|
|
|
|
|
be explained in the documentation, with a link to relevant third-party libraries
|
2022-01-13 09:25:58 -05:00
|
|
|
|
(e.g. ``tomlkit``, ``tomli-w``, ``pytomlpp``).
|
2022-01-10 15:55:30 -05:00
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Reference Implementation
|
|
|
|
|
========================
|
|
|
|
|
|
|
|
|
|
The proposed implementation can be found at https://github.com/hukkin/tomli
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Rejected Ideas
|
|
|
|
|
==============
|
|
|
|
|
|
|
|
|
|
Basing on another TOML implementation
|
|
|
|
|
-------------------------------------
|
|
|
|
|
|
2022-01-12 05:57:51 -05:00
|
|
|
|
Several potential alternative implementations exist:
|
2022-01-10 15:55:30 -05:00
|
|
|
|
|
2022-01-13 09:25:58 -05:00
|
|
|
|
* ``tomlkit`` is well established, actively maintained and supports TOML 1.0.0.
|
|
|
|
|
An important difference is that ``tomlkit`` supports style roundtripping. As a
|
2022-01-10 15:55:30 -05:00
|
|
|
|
result, it has a more complex API and implementation (about 5x as much code as
|
2022-01-12 05:57:51 -05:00
|
|
|
|
``tomli``). Its author does not believe that ``tomlkit`` is a good choice for
|
2022-01-10 15:55:30 -05:00
|
|
|
|
the standard library.
|
|
|
|
|
|
2022-01-12 05:57:51 -05:00
|
|
|
|
* ``toml`` is a very widely used library. However, it is not actively
|
2022-01-13 09:25:58 -05:00
|
|
|
|
maintained, does not support TOML 1.0.0 and has a number of known bugs. Its
|
|
|
|
|
API is more complex than that of ``tomli``. It allows customising output style
|
|
|
|
|
through a complicated encoder API, and some very limited and mostly unused
|
|
|
|
|
functionality to preserve input style through an undocumented decoder API.
|
|
|
|
|
For more details on its API differences from this PEP, refer to `Appendix A`_.
|
2022-01-10 15:55:30 -05:00
|
|
|
|
|
2022-01-12 05:57:51 -05:00
|
|
|
|
* ``pytomlpp`` is a Python wrapper for the C++ project ``toml++``. Pure Python
|
2022-01-10 15:55:30 -05:00
|
|
|
|
libraries are easier to maintain than extension modules.
|
|
|
|
|
|
2022-01-12 05:57:51 -05:00
|
|
|
|
* ``rtoml`` is a Python wrapper for the Rust project ``toml-rs`` and hence has
|
2022-01-10 15:55:30 -05:00
|
|
|
|
similar shortcomings to ``pytomlpp``.
|
2022-01-13 09:25:58 -05:00
|
|
|
|
In addition, it does not support TOML 1.0.0.
|
2022-01-10 15:55:30 -05:00
|
|
|
|
|
2022-01-12 05:57:51 -05:00
|
|
|
|
* Writing an implementation from scratch. It's unclear what we would get from
|
2022-01-13 09:25:58 -05:00
|
|
|
|
this; ``tomli`` meets our needs and the author is willing to help with its
|
2022-01-12 05:57:51 -05:00
|
|
|
|
inclusion in the standard library.
|
2022-01-10 15:55:30 -05:00
|
|
|
|
|
2022-01-13 09:25:58 -05:00
|
|
|
|
|
2022-01-10 15:55:30 -05:00
|
|
|
|
Including an API for writing TOML
|
|
|
|
|
---------------------------------
|
|
|
|
|
|
2022-01-13 09:25:58 -05:00
|
|
|
|
There are several reasons to not include an API for writing TOML.
|
2022-01-10 15:55:30 -05:00
|
|
|
|
|
|
|
|
|
The ability to write TOML is not needed for the use cases that motivate this
|
2022-01-13 09:25:58 -05:00
|
|
|
|
PEP: core Python packaging tools, and projects that need to read TOML
|
|
|
|
|
configuration files.
|
|
|
|
|
|
|
|
|
|
Use cases that involve editing an existing TOML file (as opposed to writing a
|
|
|
|
|
brand new one) are better served by a style preserving library. TOML is
|
|
|
|
|
intended as a human-readable and -editable configuration format, so it's
|
|
|
|
|
important to preserve comments, formatting and other markup. This requires
|
|
|
|
|
a parser whose output includes style-related metadata, making it impractical
|
|
|
|
|
to output plain Python types like ``str`` and ``dict``. Furthermore, it
|
|
|
|
|
substantially complicates the design of the API.
|
|
|
|
|
|
|
|
|
|
Even without considering style preservation, there are too many degrees of
|
|
|
|
|
freedom in how to design a write API. For example, what default style
|
|
|
|
|
(indentation, vertical and horizontal spacing, quotes, etc) should the library
|
|
|
|
|
use for the output, and how much control should users be given over it?
|
|
|
|
|
How should the library handle input and output validation? Should it support
|
|
|
|
|
serialization of custom types, and if so, how? While there are reasonable
|
|
|
|
|
options for resolving these issues, the nature of the standard library is such
|
|
|
|
|
that we only get "one chance to get it right".
|
|
|
|
|
|
|
|
|
|
Currently, no CPython core developers have expressed willingness to maintain a
|
|
|
|
|
write API, or sponsor a PEP that includes one. Since it is hard to change
|
2022-01-10 15:55:30 -05:00
|
|
|
|
or remove something in the standard library, it is safer to err on the side of
|
2022-01-13 09:25:58 -05:00
|
|
|
|
exclusion for now, and potentially revisit this later.
|
2022-01-10 15:55:30 -05:00
|
|
|
|
|
2022-01-13 09:25:58 -05:00
|
|
|
|
Therefore, writing TOML is left to third-party libraries. If a good API and
|
|
|
|
|
relevant use cases for it are found later, write support can be added in a
|
|
|
|
|
future PEP.
|
2022-01-10 15:55:30 -05:00
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Assorted API details
|
|
|
|
|
--------------------
|
|
|
|
|
|
2022-01-13 09:25:58 -05:00
|
|
|
|
Types accepted as the first argument of ``tomllib.load``
|
|
|
|
|
''''''''''''''''''''''''''''''''''''''''''''''''''''''''
|
2022-01-10 15:55:30 -05:00
|
|
|
|
|
|
|
|
|
The ``toml`` library on PyPI allows passing paths (and lists of path-like
|
2022-01-13 09:25:58 -05:00
|
|
|
|
objects, ignoring missing files and merging the documents into a single object)
|
|
|
|
|
to its ``load`` function. However, allowing this here would be inconsistent
|
|
|
|
|
with the behavior of ``json.load``, ``pickle.load`` and other standard library
|
|
|
|
|
functions. If we agree that consistency here is desirable,
|
|
|
|
|
allowing paths is out of scope for this PEP. This can easily and explicitly
|
|
|
|
|
be worked around in user code, or by using a third-party library.
|
2022-01-10 15:55:30 -05:00
|
|
|
|
|
|
|
|
|
The proposed API takes a binary file, while ``toml.load`` takes a text file and
|
2022-01-13 09:25:58 -05:00
|
|
|
|
``json.load`` takes either. Using a binary file allows us to ensure UTF-8 is
|
2022-01-27 05:25:47 -05:00
|
|
|
|
the encoding used (ensuring correct parsing on platforms with other default
|
|
|
|
|
encodings, such as Windows), and avoid incorrectly parsing files containing
|
|
|
|
|
single carriage returns as valid TOML due to universal newlines in text mode.
|
2022-01-13 09:25:58 -05:00
|
|
|
|
|
2022-01-10 15:55:30 -05:00
|
|
|
|
|
2022-01-13 09:25:58 -05:00
|
|
|
|
Type accepted as the first argument of ``tomllib.loads``
|
|
|
|
|
''''''''''''''''''''''''''''''''''''''''''''''''''''''''
|
2022-01-10 15:55:30 -05:00
|
|
|
|
|
|
|
|
|
While ``tomllib.load`` takes a binary file, ``tomllib.loads`` takes
|
|
|
|
|
a text string. This may seem inconsistent at first.
|
|
|
|
|
|
2022-01-13 09:25:58 -05:00
|
|
|
|
Quoting the `TOML v1.0.0 specification <https://toml.io/en/v1.0.0#spec>`_:
|
2022-01-10 15:55:30 -05:00
|
|
|
|
|
2022-01-12 05:57:51 -05:00
|
|
|
|
A TOML file must be a valid UTF-8 encoded Unicode document.
|
2022-01-10 15:55:30 -05:00
|
|
|
|
|
|
|
|
|
``tomllib.loads`` does not intend to load a TOML file, but rather the
|
|
|
|
|
document that the file stores. The most natural representation of
|
|
|
|
|
a Unicode document in Python is ``str``, not ``bytes``.
|
|
|
|
|
|
|
|
|
|
It is possible to add ``bytes`` support in the future if needed, but
|
|
|
|
|
we are not aware of any use cases for it.
|
|
|
|
|
|
2022-01-13 09:25:58 -05:00
|
|
|
|
|
2022-01-10 15:55:30 -05:00
|
|
|
|
Controlling the type of mappings returned by ``tomllib.load[s]``
|
|
|
|
|
----------------------------------------------------------------
|
|
|
|
|
|
2022-01-13 09:25:58 -05:00
|
|
|
|
The ``toml`` library on PyPI accepts a ``_dict`` argument in its ``load[s]``
|
|
|
|
|
functions, which works similarly to the ``object_hook`` argument in
|
|
|
|
|
``json.load[s]``. There are several uses of ``_dict`` found on
|
|
|
|
|
https://grep.app; however, almost all of them are passing
|
|
|
|
|
``_dict=OrderedDict``, which should be unnecessary as of Python 3.7.
|
|
|
|
|
We found two instances of relevant use: in one case, a custom class was passed
|
|
|
|
|
for friendlier KeyErrors; in the other, the custom class had several
|
2022-01-10 15:55:30 -05:00
|
|
|
|
additional lookup and mutation methods (e.g. to help resolve dotted keys).
|
|
|
|
|
|
2022-01-13 09:25:58 -05:00
|
|
|
|
Such a parameter is not necessary for the core use cases outlined in the
|
|
|
|
|
`Motivation`_ section. The absence of this can be pretty easily worked around
|
|
|
|
|
using a wrapper class, transformer function, or a third-party library. Finally,
|
|
|
|
|
support could be added later in a backward-compatible way.
|
2022-01-10 15:55:30 -05:00
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Removing support for ``parse_float`` in ``tomllib.load[s]``
|
|
|
|
|
-----------------------------------------------------------
|
|
|
|
|
|
2022-01-27 05:27:05 -05:00
|
|
|
|
This option is not strictly necessary, since TOML floats should be implemented
|
|
|
|
|
as "IEEE 754 binary64 values", which is equivalent to a Python ``float`` on most
|
|
|
|
|
architectures.
|
|
|
|
|
|
|
|
|
|
The TOML specification uses the word "SHOULD", however, implying a
|
|
|
|
|
recommendation that can be ignored for valid reasons. Parsing floats
|
|
|
|
|
differently, such as to ``decimal.Decimal``, allows users extra precision beyond
|
|
|
|
|
that promised by the TOML format. In the author of ``tomli``'s experience, this
|
|
|
|
|
is particularly useful in scientific and financial applications. This is also
|
|
|
|
|
useful for other cases that need greater precision, or where end-users include
|
|
|
|
|
non-developers who may not be aware of the limits of binary64 floats.
|
2022-01-10 15:55:30 -05:00
|
|
|
|
|
2022-01-13 09:25:58 -05:00
|
|
|
|
There are also niche architectures where the Python ``float`` is not a IEEE 754
|
|
|
|
|
binary64 value. The ``parse_float`` argument allows users to achieve correct
|
|
|
|
|
TOML semantics even on such architectures.
|
2022-01-10 15:55:30 -05:00
|
|
|
|
|
|
|
|
|
|
2022-01-13 09:25:58 -05:00
|
|
|
|
Alternative names for the module
|
|
|
|
|
--------------------------------
|
2022-01-10 15:55:30 -05:00
|
|
|
|
|
|
|
|
|
Ideally, we would be able to use the ``toml`` module name.
|
|
|
|
|
|
|
|
|
|
However, the ``toml`` package on PyPI is widely used, so there are backward
|
|
|
|
|
compatibility concerns. Since the standard library takes precedence over third
|
2022-01-13 09:25:58 -05:00
|
|
|
|
party packages, libraries and applications who current depend on the ``toml``
|
|
|
|
|
package would likely break when upgrading Python versions due to the many
|
|
|
|
|
API incompatibilities listed in `Appendix A`_, even if they pin their
|
|
|
|
|
dependency versions.
|
|
|
|
|
|
|
|
|
|
To further clarify, applications with pinned dependencies are of greatest
|
|
|
|
|
concern here. Even if we were able to obtain control of the ``toml`` PyPI
|
|
|
|
|
package name and repurpose it for a backport of the proposed new module,
|
|
|
|
|
we would still break users on new Python versions that included it in the
|
|
|
|
|
standard library, regardless of whether they have pinned an older version of
|
|
|
|
|
the existing ``toml`` package. This is unfortunate, since pinning
|
2022-01-10 15:55:30 -05:00
|
|
|
|
would likely be a common response to breaking changes introduced by repurposing
|
|
|
|
|
the ``toml`` package as a backport (that is incompatible with today's ``toml``).
|
|
|
|
|
|
2022-01-13 09:25:58 -05:00
|
|
|
|
Finally, the ``toml`` package on PyPI is not actively maintained, but as of
|
|
|
|
|
yet, efforts to request that the author add other maintainers
|
|
|
|
|
`have been unsuccessful <https://github.com/uiri/toml/issues/361>`__,
|
2022-01-10 15:55:30 -05:00
|
|
|
|
so action here would likely have to be taken without the author's consent.
|
|
|
|
|
|
2022-01-13 09:25:58 -05:00
|
|
|
|
Instead, this PEP proposes the name ``tomllib``. This mirrors ``plistlib``
|
|
|
|
|
and ``xdrlib``, two other file format modules in the standard library, as well
|
|
|
|
|
as other modules, such as ``pathlib``, ``contextlib`` and ``graphlib``.
|
2022-01-10 15:55:30 -05:00
|
|
|
|
|
2022-01-13 09:25:58 -05:00
|
|
|
|
Other names considered but rejected include:
|
2022-01-10 15:55:30 -05:00
|
|
|
|
|
2022-01-13 09:25:58 -05:00
|
|
|
|
* ``tomlparser``. This mirrors ``configparser``, but is perhaps somewhat less
|
2022-01-10 15:55:30 -05:00
|
|
|
|
appropriate if we include a write API in the future.
|
|
|
|
|
* ``tomli``. This assumes we use ``tomli`` as the basis for implementation.
|
|
|
|
|
* ``toml`` under some namespace, such as ``parser.toml``. However, this is
|
2022-01-13 09:25:58 -05:00
|
|
|
|
awkward, especially so since existing parsing libraries like ``json``,
|
|
|
|
|
``pickle``, ``xml``, ``html`` etc. would not be included in the namespace.
|
2022-01-10 15:55:30 -05:00
|
|
|
|
|
|
|
|
|
|
2022-01-12 05:57:51 -05:00
|
|
|
|
Previous Discussion
|
2022-01-10 15:55:30 -05:00
|
|
|
|
===================
|
|
|
|
|
|
2022-01-13 09:25:58 -05:00
|
|
|
|
* `bpo-40059: Provide a toml module in the standard library
|
|
|
|
|
<https://bugs.python.org/issue40059>`_
|
|
|
|
|
* `[Python-Dev] Adding a toml module to the standard lib?
|
|
|
|
|
<https://mail.python.org/pipermail/python-dev/2019-May/157405.html>`_
|
|
|
|
|
* `[Python-Ideas] Python standard library TOML module
|
|
|
|
|
<https://mail.python.org/archives/list/python-ideas@python.org/thread/IWJ3I32A4TY6CIVQ6ONPEBPWP4TOV2V7/>`_
|
|
|
|
|
* `[Packaging] Adopting/recommending a toml parser?
|
|
|
|
|
<https://discuss.python.org/t/adopting-recommending-a-toml-parser/4068>`_
|
|
|
|
|
* `hukkin/tomli#141: Please consider pushing tomli into the stdlib
|
|
|
|
|
<https://github.com/hukkin/tomli/issues/141>`_
|
2022-01-10 15:55:30 -05:00
|
|
|
|
|
|
|
|
|
|
|
|
|
|
.. _Appendix A:
|
|
|
|
|
|
|
|
|
|
Appendix A: Differences between proposed API and ``toml``
|
|
|
|
|
=========================================================
|
|
|
|
|
|
|
|
|
|
This appendix covers the differences between the API proposed in this PEP and
|
2022-01-13 09:25:58 -05:00
|
|
|
|
that of the third-party package ``toml``. These differences are relevant to
|
2022-01-10 15:55:30 -05:00
|
|
|
|
understanding the amount of breakage we could expect if we used the ``toml``
|
|
|
|
|
name for the standard library module, as well as to better understand the design
|
|
|
|
|
space. Note that this list might not be exhaustive.
|
|
|
|
|
|
2022-01-13 09:25:58 -05:00
|
|
|
|
#. No proposed inclusion of a write API (no ``toml.dump[s]``)
|
2022-01-10 15:55:30 -05:00
|
|
|
|
|
2022-01-13 09:25:58 -05:00
|
|
|
|
This PEP currently proposes not including a write API; that is, there will
|
|
|
|
|
be no equivalent of ``toml.dump`` or ``toml.dumps``, as discussed at
|
|
|
|
|
`Including an API for writing TOML`_.
|
2022-01-10 15:55:30 -05:00
|
|
|
|
|
2022-01-13 09:25:58 -05:00
|
|
|
|
If we included a write API, it would be relatively straightforward to
|
|
|
|
|
convert most code that uses ``toml`` to the new standard library module
|
|
|
|
|
(acknowledging that this is very different from a compatible API, as it
|
|
|
|
|
would still require code changes).
|
2022-01-10 15:55:30 -05:00
|
|
|
|
|
2022-01-12 05:57:51 -05:00
|
|
|
|
A significant fraction of ``toml`` users rely on this, based on comparing
|
|
|
|
|
`occurrences of "toml.load" <https://grep.app/search?q=toml.load&filter[lang][0]=Python>`__
|
2022-01-13 09:25:58 -05:00
|
|
|
|
to `occurrences of "toml.dump" <https://grep.app/search?q=toml.dump&filter[lang][0]=Python>`__.
|
2022-01-10 15:55:30 -05:00
|
|
|
|
|
|
|
|
|
#. Different first argument of ``toml.load``
|
|
|
|
|
|
|
|
|
|
``toml.load`` has the following signature:
|
|
|
|
|
|
|
|
|
|
.. code-block::
|
|
|
|
|
|
|
|
|
|
def load(
|
|
|
|
|
f: Union[SupportsRead[str], str, bytes, list[PathLike | str | bytes]],
|
|
|
|
|
_dict: Type[MutableMapping[str, Any]] = ...,
|
|
|
|
|
decoder: TomlDecoder = ...,
|
|
|
|
|
) -> MutableMapping[str, Any]: ...
|
|
|
|
|
|
2022-01-13 09:25:58 -05:00
|
|
|
|
This is quite different from the first argument proposed in this PEP:
|
|
|
|
|
``SupportsRead[bytes]``.
|
2022-01-10 15:55:30 -05:00
|
|
|
|
|
|
|
|
|
Recapping the reasons for this, previously mentioned at
|
2022-01-13 09:25:58 -05:00
|
|
|
|
`Types accepted as the first argument of tomllib.load`_:
|
2022-01-10 15:55:30 -05:00
|
|
|
|
|
2022-01-13 09:25:58 -05:00
|
|
|
|
* Allowing paths (and even lists of paths) as arguments is inconsistent with
|
2022-01-10 15:55:30 -05:00
|
|
|
|
other similar functions in the standard library.
|
2022-01-13 09:25:58 -05:00
|
|
|
|
* Using ``SupportsRead[bytes]`` allows us to ensure UTF-8 is the encoding used,
|
|
|
|
|
and avoid incorrectly parsing single carriage returns as valid TOML.
|
2022-01-10 15:55:30 -05:00
|
|
|
|
|
2022-01-13 09:25:58 -05:00
|
|
|
|
A significant fraction of ``toml`` users rely on this, based on manual
|
|
|
|
|
inspection of `occurrences of "toml.load"
|
|
|
|
|
<https://grep.app/search?q=toml.load&filter[lang][0]=Python>`__.
|
2022-01-10 15:55:30 -05:00
|
|
|
|
|
|
|
|
|
#. Errors
|
|
|
|
|
|
2022-01-13 09:25:58 -05:00
|
|
|
|
``toml`` raises ``TomlDecodeError``, vs. the proposed :pep:`8`-compliant
|
2022-01-10 15:55:30 -05:00
|
|
|
|
``TOMLDecodeError``.
|
|
|
|
|
|
2022-01-13 09:25:58 -05:00
|
|
|
|
A significant fraction of ``toml`` users rely on this, based on
|
|
|
|
|
`occurrences of "TomlDecodeError"
|
2022-01-12 05:57:51 -05:00
|
|
|
|
<https://grep.app/search?q=TomlDecodeError&case=true&filter[lang][0]=Python>`__.
|
2022-01-10 15:55:30 -05:00
|
|
|
|
|
|
|
|
|
#. ``toml.load[s]`` accepts a ``_dict`` argument
|
|
|
|
|
|
2022-01-13 09:25:58 -05:00
|
|
|
|
Discussed at `Controlling the type of mappings returned by tomllib.load[s]`_.
|
2022-01-10 15:55:30 -05:00
|
|
|
|
|
2022-01-13 09:25:58 -05:00
|
|
|
|
As mentioned there, almost all usage consists of ``_dict=OrderedDict``,
|
|
|
|
|
which is not necessary in Python 3.7 and later.
|
2022-01-10 15:55:30 -05:00
|
|
|
|
|
|
|
|
|
#. ``toml.load[s]`` support an undocumented ``decoder`` argument
|
|
|
|
|
|
|
|
|
|
It seems the intended use case is for an implementation of comment
|
|
|
|
|
preservation. The information recorded is not sufficient to roundtrip the
|
|
|
|
|
TOML document preserving style, the implementation has known bugs, the
|
2022-01-13 09:25:58 -05:00
|
|
|
|
feature is undocumented and we could only find one instance of its use on
|
2022-01-10 15:55:30 -05:00
|
|
|
|
https://grep.app.
|
|
|
|
|
|
2022-01-13 09:25:58 -05:00
|
|
|
|
The `toml.TomlDecoder interface
|
|
|
|
|
<https://github.com/uiri/toml/blob/3f637dba5f68db63d4b30967fedda51c82459471/toml/decoder.pyi#L36>`__
|
|
|
|
|
exposed is far from simple, containing nine methods.
|
2022-01-10 15:55:30 -05:00
|
|
|
|
|
2022-01-13 09:25:58 -05:00
|
|
|
|
Users are likely better served by a more complete implementation of
|
|
|
|
|
style-preserving parsing and writing.
|
2022-01-10 15:55:30 -05:00
|
|
|
|
|
|
|
|
|
#. ``toml.dump[s]`` support an ``encoder`` argument
|
|
|
|
|
|
2022-01-13 09:25:58 -05:00
|
|
|
|
Note that we currently propose to not include a write API; however, if that
|
2022-01-10 15:55:30 -05:00
|
|
|
|
were to change, these differences would likely become relevant.
|
|
|
|
|
|
2022-01-13 09:25:58 -05:00
|
|
|
|
The ``encoder`` argument enables two use cases:
|
2022-01-10 15:55:30 -05:00
|
|
|
|
|
2022-01-13 09:25:58 -05:00
|
|
|
|
* control over how custom types should be serialized, and
|
|
|
|
|
* control over how output should be formatted.
|
2022-01-10 15:55:30 -05:00
|
|
|
|
|
2022-01-13 09:25:58 -05:00
|
|
|
|
The first is reasonable; however, we could only find two instances of
|
|
|
|
|
this on https://grep.app. One of these two used this ability to add
|
|
|
|
|
support for dumping ``decimal.Decimal``, which a potential standard library
|
|
|
|
|
implementation would support out of the box.
|
|
|
|
|
If needed for other types, this use case could be well served by the
|
|
|
|
|
equivalent of the ``default`` argument in ``json.dump``.
|
2022-01-10 15:55:30 -05:00
|
|
|
|
|
|
|
|
|
The second use case is enabled by allowing users to specify subclasses of
|
2022-01-13 09:25:58 -05:00
|
|
|
|
`toml.TomlEncoder
|
|
|
|
|
<https://github.com/uiri/toml/blob/3f637dba5f68db63d4b30967fedda51c82459471/toml/encoder.pyi#L9>`__
|
2022-01-10 19:45:09 -05:00
|
|
|
|
and overriding methods to specify parts of the TOML writing process. The API
|
2022-01-13 09:25:58 -05:00
|
|
|
|
consists of five methods and exposes substantial implementation detail.
|
2022-01-10 15:55:30 -05:00
|
|
|
|
|
2022-01-13 09:25:58 -05:00
|
|
|
|
There is some usage of the ``encoder`` API on https://grep.app; however, it
|
|
|
|
|
appears to account for a tiny fraction of the overall usage of ``toml``.
|
2022-01-10 15:55:30 -05:00
|
|
|
|
|
|
|
|
|
#. Timezones
|
|
|
|
|
|
|
|
|
|
``toml`` uses and exposes custom ``toml.tz.TomlTz`` timezone objects. The
|
|
|
|
|
proposed implementation uses ``datetime.timezone`` objects from the standard
|
|
|
|
|
library.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Copyright
|
|
|
|
|
=========
|
|
|
|
|
|
|
|
|
|
This document is placed in the public domain or under the
|
|
|
|
|
CC0-1.0-Universal license, whichever is more permissive.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
..
|
|
|
|
|
Local Variables:
|
|
|
|
|
mode: indented-text
|
|
|
|
|
indent-tabs-mode: nil
|
|
|
|
|
sentence-end-double-space: t
|
|
|
|
|
fill-column: 70
|
|
|
|
|
coding: utf-8
|
|
|
|
|
End:
|