diff --git a/.github/CODEOWNERS b/.github/CODEOWNERS
index bbea47164..5c9229fc2 100644
--- a/.github/CODEOWNERS
+++ b/.github/CODEOWNERS
@@ -592,6 +592,7 @@ pep-0708.rst @dstufft
 pep-0709.rst @carljm
 pep-0710.rst @dstufft
 pep-0711.rst @njsmith
+pep-0712.rst @ericvsmith
 # ...
 # pep-0754.txt
 # ...
diff --git a/pep-0712.rst b/pep-0712.rst
new file mode 100644
index 000000000..c5dd8b99b
--- /dev/null
+++ b/pep-0712.rst
@@ -0,0 +1,249 @@
+PEP: 712
+Title: Adding a "converter" parameter to dataclasses.field
+Author: Joshua Cannon
+Sponsor: Eric V. Smith
+Status: Draft
+Type: Standards Track
+Content-Type: text/x-rst
+Created: 01-Jan-2023
+Python-Version: 3.13
+Post-History: `27-Dec-2022 `__,
+              `19-Jan-2023 `__,
+
+Abstract
+========
+
+:pep:`557` added :mod:`dataclasses` to the Python stdlib. :pep:`681` added
+:func:`~py3.11:typing.dataclass_transform` to help type checkers understand
+several common dataclass-like libraries, such as attrs, Pydantic, and
+object-relational mapper (ORM) packages such as SQLAlchemy and Django.
+
+A common feature these libraries provide over the standard library
+implementation is the ability for the library to convert arguments given at
+initialization time into the types expected for each field using a
+user-provided conversion function.
+
+Therefore, this PEP adds a ``converter`` parameter to :func:`dataclasses.field`
+(along with the requisite changes to :class:`dataclasses.Field` and
+:func:`~py3.11:typing.dataclass_transform`) to specify the function to use to
+convert the input value for each field to the representation to be stored in
+the dataclass.
+
+Motivation
+==========
+
+There is no existing, standard way for :mod:`dataclasses` or third-party
+dataclass-like libraries to support argument conversion in a type-checkable
+way. To work around this limitation, library authors/users are forced to
+choose one of the following:
+
+* Opt in to a custom Mypy plugin. These plugins help Mypy understand the
+  conversion semantics, but not other tools.
+* Shift conversion responsibility onto the caller of the dataclass
+  constructor. This can make constructing certain dataclasses unnecessarily
+  verbose and repetitive.
+* Provide a custom ``__init__`` which declares "wider" parameter types and
+  converts them when setting the appropriate attribute. This not only duplicates
+  the type annotations between the converter and ``__init__``, but also opts
+  the user out of many of the features :mod:`dataclasses` provides.
+* Provide a custom ``__init__`` but without meaningful type annotations
+  for the parameter types requiring conversion.
+
+None of these choices is ideal.
+
+Rationale
+=========
+
+Adding argument conversion semantics is useful enough that most
+dataclass-like libraries support it. Adding this feature to the standard
+library means more users are able to opt in to these benefits without requiring
+third-party libraries. Additionally, third-party libraries are able to clue
+type checkers in to their own conversion semantics through added support in
+:func:`~py3.11:typing.dataclass_transform`, meaning users of those libraries
+benefit as well.
+
+Specification
+=============
+
+New ``converter`` parameter
+---------------------------
+
+This specification introduces a new parameter named ``converter`` to the
+:func:`dataclasses.field` function. When an ``__init__`` method is synthesized
+by ``dataclass``-like semantics, if an argument is provided for the field, the
+``dataclass`` object's attribute will be assigned the result of calling the
+converter on the provided argument. If no argument is given and the field was
+constructed with a default value, the ``dataclass`` object's attribute will be
+assigned the result of calling the converter on the provided default.
+
+Adding this parameter also implies the following changes:
+
+* A ``converter`` attribute will be added to :class:`dataclasses.Field`.
+* ``converter`` will be added to :func:`~py3.11:typing.dataclass_transform`'s
+  list of supported field specifier parameters.
+
+Example
+'''''''
+
+.. code-block:: python
+
+    @dataclasses.dataclass
+    class InventoryItem:
+        # `converter` as a type
+        id: int = dataclasses.field(converter=int)
+        skus: tuple[int, ...] = dataclasses.field(converter=tuple[int, ...])
+        # `converter` as a callable
+        names: tuple[str, ...] = dataclasses.field(
+            converter=lambda names: tuple(map(str.lower, names))
+        )
+
+        # The default value is also converted; therefore the following is not a
+        # type error.
+        stock_image_path: pathlib.PurePosixPath = dataclasses.field(
+            converter=pathlib.PurePosixPath, default="assets/unknown.png"
+        )
+
+    item1 = InventoryItem("1", [234, 765], ["PYTHON PLUSHIE", "FLUFFY SNAKE"])
+    # item1 would have the following values:
+    #   id=1
+    #   skus=(234, 765)
+    #   names=('python plushie', 'fluffy snake')
+    #   stock_image_path=pathlib.PurePosixPath("assets/unknown.png")
+
+
+Impact on typing
+----------------
+
+A ``converter`` must be a callable that accepts a single positional argument, and
+the parameter type corresponding to this positional argument provides the type
+of the synthesized ``__init__`` parameter associated with the field.
+
+In other words, the argument provided for the ``converter`` parameter must be
+compatible with ``Callable[[T], X]`` where ``T`` is the input type for
+the converter and ``X`` is the output type of the converter.
+
+Type-checking the default value
+'''''''''''''''''''''''''''''''
+
+Because the ``default`` value is unconditionally converted using ``converter``,
+if arguments for both ``converter`` and ``default`` are provided to
+:func:`dataclasses.field`, the ``default`` argument's type should be checked
+using the type of the single argument to the ``converter`` callable.
+
+Converter return type
+'''''''''''''''''''''
+
+The return type of the callable must be a type that's compatible with the
+field's declared type. This includes the field's type exactly, but can also be
+a type that's more specialized (such as a converter returning a ``list[int]``
+for a field annotated as ``list``, or a converter returning an ``int`` for a
+field annotated as ``int | str``).
+
+Example
+'''''''
+
+.. code-block:: python
+
+    @dataclasses.dataclass
+    class Example:
+        my_int: int = dataclasses.field(converter=int)
+        my_tuple: tuple[int, ...] = dataclasses.field(converter=tuple[int, ...])
+        my_cheese: Cheese = dataclasses.field(converter=make_cheese)
+
+        # Although the default value is of type `str` and the field is declared to
+        # be of type `pathlib.Path`, this is not a type error because the default
+        # value will be converted.
+        tmpdir: pathlib.Path = dataclasses.field(default="/tmp", converter=pathlib.Path)
+
+
+
+Backward Compatibility
+======================
+
+These changes don't introduce any compatibility problems since they
+only introduce new opt-in features.
+
+Security Implications
+=====================
+
+There are no direct security concerns with these changes.
+
+How to Teach This
+=================
+
+Documentation and examples explaining the new parameter and behavior will be
+added to the relevant sections of the docs site (primarily on
+:mod:`dataclasses`) and linked from the *What's New* document.
+
+The added documentation/examples will also cover the "common pitfalls" that
+users of converters are likely to encounter. Such pitfalls include:
+
+* Needing to handle ``None``/sentinel values.
+* Needing to handle values that are already of the correct type.
+* Avoiding lambdas for converters, as the synthesized ``__init__``
+  parameter's type will become ``Any``.
+
+Reference Implementation
+========================
+
+The attrs library `already includes `__ a ``converter``
+parameter containing converter semantics.
+
+CPython support is implemented on `a branch in the author's fork `__.
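For readers who want to experiment with the semantics described in the Specification without the branch, they can be approximated in current Python. The following is a minimal sketch under stated assumptions, not the reference implementation: the helper ``apply_converters`` and the ``"converter"`` metadata key are hypothetical names chosen for illustration, and conversion happens in ``__post_init__`` rather than in the synthesized ``__init__``.

```python
import dataclasses
import pathlib


def apply_converters(instance):
    """Hypothetical helper: run each field's converter over its current value."""
    for f in dataclasses.fields(instance):
        # "converter" is an illustrative metadata key, not a dataclasses feature.
        converter = f.metadata.get("converter")
        if converter is not None:
            # Defaults are converted too, mirroring the semantics proposed above.
            setattr(instance, f.name, converter(getattr(instance, f.name)))


@dataclasses.dataclass
class InventoryItem:
    id: int = dataclasses.field(metadata={"converter": int})
    names: tuple[str, ...] = dataclasses.field(
        metadata={"converter": lambda names: tuple(map(str.lower, names))}
    )
    stock_image_path: pathlib.PurePosixPath = dataclasses.field(
        default="assets/unknown.png",
        metadata={"converter": pathlib.PurePosixPath},
    )

    def __post_init__(self):
        apply_converters(self)


item = InventoryItem("1", ["PYTHON PLUSHIE", "FLUFFY SNAKE"])
```

Unlike the proposed ``converter`` parameter, this workaround gives type checkers no information about the wider input types: the synthesized ``__init__`` is still typed with the declared field types, so the call above would be reported as a type error even though it runs. It also does not work for frozen dataclasses, where ``setattr`` is blocked.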
+
+Rejected Ideas
+==============
+
+Just adding "converter" to ``typing.dataclass_transform``'s ``field_specifiers``
+--------------------------------------------------------------------------------
+
+The idea of isolating this addition to
+:func:`~py3.11:typing.dataclass_transform` was briefly
+`discussed on Typing-SIG `__ where it was suggested
+to broaden this to :mod:`dataclasses` more generally.
+
+Additionally, adding this to :mod:`dataclasses` ensures anyone can reap the
+benefits without requiring additional libraries.
+
+Not converting default values
+-----------------------------
+
+There are pros and cons to both converting and not converting default values.
+Leaving default values as-is allows type checkers and dataclass authors to
+expect that the type of the default matches the type of the field. However,
+converting default values has two large advantages:
+
+1. Compatibility with attrs. The attrs library unconditionally uses the
+   converter to convert the default value.
+
+2. Simpler defaults. Allowing the default value to have the same type as
+   user-provided values means dataclass authors get the same conveniences as
+   their callers.
+
+Automatic conversion using the field's type
+-------------------------------------------
+
+One idea could be to allow the type of the field specified (e.g. ``str`` or
+``int``) to be used as a converter for each argument provided.
+`Pydantic's data conversion `__ has semantics which
+appear to be similar to this approach.
+
+This works well for fairly simple types, but leads to ambiguity in expected
+behavior for complex types such as generics. For ``tuple[int, ...]``, for
+example, it is ambiguous whether the converter should simply convert an
+iterable to a tuple, or should additionally convert each element to ``int``.
+
+References
+==========
+
+.. _attrs-converters: https://www.attrs.org/en/21.2.0/examples.html#conversion
+.. _cpython-branch: https://github.com/thejcannon/cpython/tree/converter
+.. _only-dataclass-transform: https://mail.python.org/archives/list/typing-sig@python.org/thread/NWZQIINJQZDOCZGO6TGCUP2PNW4PEKNY/
+.. _pydantic-data-conversion: https://docs.pydantic.dev/usage/models/#data-conversion
+
+
+Copyright
+=========
+
+This document is placed in the public domain or under the
+CC0-1.0-Universal license, whichever is more permissive.