PEP: 712
Title: Adding a "converter" parameter to dataclasses.field
Author: Joshua Cannon <joshdcannon@gmail.com>
Sponsor: Eric V. Smith <eric at trueblade.com>
Status: Draft
Type: Standards Track
Content-Type: text/x-rst
Created: 01-Jan-2023
Python-Version: 3.13
Post-History: `27-Dec-2022 <https://mail.python.org/archives/list/typing-sig@python.org/thread/NWZQIINJQZDOCZGO6TGCUP2PNW4PEKNY/>`__,
              `19-Jan-2023 <https://discuss.python.org/t/add-converter-to-dataclass-field/22956>`__,


Abstract
========

:pep:`557` added :mod:`dataclasses` to the Python stdlib. :pep:`681` added
:func:`~py3.11:typing.dataclass_transform` to help type checkers understand
several common dataclass-like libraries, such as attrs, Pydantic, and
object-relational mapper (ORM) packages such as SQLAlchemy and Django.

A common feature these libraries provide over the standard library
implementation is the ability to convert arguments given at
initialization time into the types expected for each field using a
user-provided conversion function.

Therefore, this PEP adds a ``converter`` parameter to :func:`dataclasses.field`
(along with the requisite changes to :class:`dataclasses.Field` and
:func:`~py3.11:typing.dataclass_transform`) to specify the function to use to
convert the input value for each field to the representation to be stored in
the dataclass.


Motivation
==========

There is no existing, standard way for :mod:`dataclasses` or third-party
dataclass-like libraries to support argument conversion in a type-checkable
way. To work around this limitation, library authors and users are forced to
choose one of the following:

* Opt in to a custom Mypy plugin. These plugins help Mypy understand the
  conversion semantics, but do not help other tools.
* Shift conversion responsibility onto the caller of the dataclass
  constructor. This can make constructing certain dataclasses unnecessarily
  verbose and repetitive.
* Provide a custom ``__init__`` that declares "wider" parameter types and
  converts them when setting the appropriate attribute. This not only duplicates
  the typing annotations between the converter and ``__init__``, but also opts
  the user out of many of the features :mod:`dataclasses` provides.
* Provide a custom ``__init__`` but without meaningful type annotations
  for the parameter types requiring conversion.

None of these choices are ideal.

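
As an illustration of the third workaround, the following sketch shows a
dataclass with a hand-written ``__init__``. The ``Point`` class and its fields
are hypothetical names chosen for this example. Note how the accepted types
must be spelled out by hand and how the synthesized ``__init__`` is given up.

.. code-block:: python

    import dataclasses
    from typing import Union


    # Hypothetical example class, not part of this PEP.
    # init=False: the hand-written __init__ replaces the synthesized one.
    @dataclasses.dataclass(init=False)
    class Point:
        x: int
        y: int

        def __init__(self, x: Union[int, str], y: Union[int, str]) -> None:
            # The "wider" parameter types are declared by hand and each
            # argument is converted before it is stored.
            self.x = int(x)
            self.y = int(y)


    point = Point("1", 2)

Features that rely on the synthesized ``__init__``, such as ``default_factory``
handling or ``InitVar`` fields, would have to be reimplemented by hand in such
a class.
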

Rationale
=========

Adding argument conversion semantics is useful and beneficial enough that most
dataclass-like libraries provide support for it. Adding this feature to the
standard library means more users are able to opt in to these benefits without
requiring third-party libraries. Additionally, third-party libraries are able
to clue type checkers in to their own conversion semantics through added
support in :func:`~py3.11:typing.dataclass_transform`, meaning users of those
libraries benefit as well.


Specification
=============

New ``converter`` parameter
---------------------------

This specification introduces a new parameter named ``converter`` to the
:func:`dataclasses.field` function. When an ``__init__`` method is synthesized
by ``dataclass``-like semantics, if an argument is provided for the field, the
``dataclass`` object's attribute will be assigned the result of calling the
converter on the provided argument. If no argument is given and the field was
constructed with a default value, the ``dataclass`` object's attribute will be
assigned the result of calling the converter on the provided default.

Adding this parameter also implies the following changes:

* A ``converter`` attribute will be added to :class:`dataclasses.Field`.
* ``converter`` will be added to :func:`~py3.11:typing.dataclass_transform`'s
  list of supported field specifier parameters.

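
The run-time behavior described above can be approximated in current Python
without the new parameter. The following is a minimal sketch, not the
reference implementation: it stores each converter under a hypothetical
``"converter"`` key in the field's ``metadata`` and applies it in
``__post_init__``. The helper name ``apply_converters`` is likewise an
assumption made for this example. Type checkers do not understand this
emulation, which is the gap this PEP aims to close.

.. code-block:: python

    import dataclasses
    import pathlib


    def apply_converters(obj):
        """Hypothetical helper: convert every field that declares a converter."""
        for f in dataclasses.fields(obj):
            converter = f.metadata.get("converter")  # assumed metadata key
            if converter is not None:
                # Applies to explicit arguments and to defaults alike.
                setattr(obj, f.name, converter(getattr(obj, f.name)))


    @dataclasses.dataclass
    class InventoryItem:
        id: int = dataclasses.field(metadata={"converter": int})
        stock_image_path: pathlib.PurePosixPath = dataclasses.field(
            default="assets/unknown.png",
            metadata={"converter": pathlib.PurePosixPath},
        )

        def __post_init__(self):
            apply_converters(self)


    # The string argument and the string default are both converted.
    item = InventoryItem("1")
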

Example
'''''''

.. code-block:: python

    import dataclasses
    import pathlib

    @dataclasses.dataclass
    class InventoryItem:
        # `converter` as a type
        id: int = dataclasses.field(converter=int)
        skus: tuple[int, ...] = dataclasses.field(converter=tuple[int, ...])
        # `converter` as a callable
        names: tuple[str, ...] = dataclasses.field(
            converter=lambda names: tuple(map(str.lower, names))
        )

        # The default value is also converted; therefore the following is not a
        # type error.
        stock_image_path: pathlib.PurePosixPath = dataclasses.field(
            converter=pathlib.PurePosixPath, default="assets/unknown.png"
        )

    item1 = InventoryItem("1", [234, 765], ["PYTHON PLUSHIE", "FLUFFY SNAKE"])
    # item1 would have the following values:
    #   id=1
    #   skus=(234, 765)
    #   names=('python plushie', 'fluffy snake')
    #   stock_image_path=pathlib.PurePosixPath("assets/unknown.png")


Impact on typing
----------------

A ``converter`` must be a callable that accepts a single positional argument, and
the parameter type corresponding to this positional argument provides the type
of the synthesized ``__init__`` parameter associated with the field.

In other words, the argument provided for the ``converter`` parameter must be
compatible with ``Callable[[T], X]`` where ``T`` is the input type for
the converter and ``X`` is the output type of the converter.

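
Type checkers determine ``T`` and ``X`` statically. Purely as an illustration,
the same information can be read at run time from an annotated converter. In
this sketch, ``parse_port`` is a hypothetical converter and the variable names
are assumptions made for the example.

.. code-block:: python

    import inspect
    import typing


    def parse_port(value: typing.Union[int, str]) -> int:
        """Hypothetical converter: accepts an int or a numeric string."""
        return int(value)


    hints = typing.get_type_hints(parse_port)
    # The converter takes exactly one positional parameter.
    (param_name,) = inspect.signature(parse_port).parameters

    init_param_type = hints[param_name]  # T: type of the __init__ parameter
    stored_type = hints["return"]        # X: must be compatible with the field type

A field declared as ``port: int`` with this converter would therefore accept
``int | str`` in the synthesized ``__init__`` while always storing an ``int``.
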

Type-checking the default value
'''''''''''''''''''''''''''''''

Because the ``default`` value is unconditionally converted using ``converter``,
if arguments for both ``converter`` and ``default`` are provided to
:func:`dataclasses.field`, the ``default`` argument's type should be checked
using the type of the single argument to the ``converter`` callable.

Converter return type
'''''''''''''''''''''

The return type of the callable must be a type that's compatible with the
field's declared type. This includes the field's type exactly, but can also be
a type that's more specialized (such as a converter returning a ``list[int]``
for a field annotated as ``list``, or a converter returning an ``int`` for a
field annotated as ``int | str``).


Example
'''''''

.. code-block:: python

    import dataclasses
    import pathlib

    # `Cheese` and `make_cheese` are assumed to be defined elsewhere.
    @dataclasses.dataclass
    class Example:
        my_int: int = dataclasses.field(converter=int)
        my_tuple: tuple[int, ...] = dataclasses.field(converter=tuple[int, ...])
        my_cheese: Cheese = dataclasses.field(converter=make_cheese)

        # Although the default value is of type `str` and the field is declared to
        # be of type `pathlib.Path`, this is not a type error because the default
        # value will be converted.
        tmpdir: pathlib.Path = dataclasses.field(default="/tmp", converter=pathlib.Path)


Backward Compatibility
======================

These changes don't introduce any compatibility problems since they
only introduce new opt-in features.


Security Implications
=====================

There are no direct security concerns with these changes.


How to Teach This
=================

Documentation and examples explaining the new parameter and behavior will be
added to the relevant sections of the docs site (primarily on
:mod:`dataclasses`) and linked from the *What's New* document.

The added documentation/examples will also cover the "common pitfalls" that
users of converters are likely to encounter. Such pitfalls include:

* Needing to handle ``None``/sentinel values.
* Needing to handle values that are already of the correct type.
* Avoiding lambdas for converters, as the synthesized ``__init__``
  parameter's type will become ``Any``.

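
The pitfalls above can be illustrated with a sketch of a defensive converter.
``to_date`` is a hypothetical function written for this example. It is a
named, annotated function rather than a lambda so that its parameter type is
available to type checkers.

.. code-block:: python

    import datetime
    from typing import Optional, Union


    def to_date(
        value: Union[datetime.date, str, None],
    ) -> Optional[datetime.date]:
        """Hypothetical converter for an optional date field."""
        if value is None:
            # Pass the "no value" marker through unchanged.
            return None
        if isinstance(value, datetime.date):
            # Already the correct type: nothing to convert.
            return value
        # Otherwise expect an ISO 8601 date string such as "2023-01-19".
        return datetime.date.fromisoformat(value)

Passing ``datetime.date.fromisoformat`` directly as the converter would raise
``TypeError`` for ``None`` and for values that are already ``date`` objects.
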

Reference Implementation
========================

The attrs library `already includes <attrs-converters_>`__ a ``converter``
parameter that provides converter semantics.

CPython support is implemented on `a branch in the author's fork <cpython-branch_>`__.


Rejected Ideas
==============

Just adding "converter" to ``typing.dataclass_transform``'s ``field_specifiers``
--------------------------------------------------------------------------------

The idea of isolating this addition to
:func:`~py3.11:typing.dataclass_transform` was briefly
`discussed on Typing-SIG <only-dataclass-transform_>`__, where it was suggested
to broaden this to :mod:`dataclasses` more generally.

Additionally, adding this to :mod:`dataclasses` ensures anyone can reap the
benefits without requiring additional libraries.


Not converting default values
-----------------------------

There are pros and cons to both converting and not converting default values.
Leaving default values as-is allows type checkers and dataclass authors to
expect that the type of the default matches the type of the field. However,
converting default values has two large advantages:

1. Compatibility with attrs. Attrs unconditionally uses the converter to
   convert the default value.

2. Simpler defaults. Allowing the default value to have the same type as
   user-provided values means dataclass authors get the same conveniences as
   their callers.


Automatic conversion using the field's type
-------------------------------------------

One idea could be to allow the type of the field specified (e.g. ``str`` or
``int``) to be used as a converter for each argument provided.
`Pydantic's data conversion <pydantic-data-conversion_>`__ has semantics which
appear to be similar to this approach.

This works well for fairly simple types, but leads to ambiguity in expected
behavior for complex types such as generics. For example, for
``tuple[int, ...]`` it is ambiguous whether the converter is supposed to simply
convert an iterable to a tuple, or whether it is additionally supposed to
convert each element to ``int``.

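
The two readings can be made concrete with a short sketch. The variable names
are chosen for this example only. Calling the parameterized alias
``tuple[int, ...]`` at run time takes the first reading; the element type is
not applied.

.. code-block:: python

    raw = ["1", "2", "3"]

    # Reading 1: only convert the iterable to a tuple.
    shallow = tuple(raw)

    # Reading 2: additionally convert every element to int.
    deep = tuple(int(element) for element in raw)

    # A parameterized alias is callable, but it only constructs the origin
    # type; the int parameter does not influence the elements.
    via_alias = tuple[int, ...](raw)
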

References
==========

.. _attrs-converters: https://www.attrs.org/en/21.2.0/examples.html#conversion
.. _cpython-branch: https://github.com/thejcannon/cpython/tree/converter
.. _only-dataclass-transform: https://mail.python.org/archives/list/typing-sig@python.org/thread/NWZQIINJQZDOCZGO6TGCUP2PNW4PEKNY/
.. _pydantic-data-conversion: https://docs.pydantic.dev/usage/models/#data-conversion


Copyright
=========

This document is placed in the public domain or under the
CC0-1.0-Universal license, whichever is more permissive.