PEP: 712
Title: Adding a "converter" parameter to dataclasses.field
Author: Joshua Cannon <joshdcannon@gmail.com>
Sponsor: Eric V. Smith <eric at trueblade.com>
Status: Draft
Type: Standards Track
Content-Type: text/x-rst
Created: 01-Jan-2023
Python-Version: 3.13
Post-History: `27-Dec-2022 <https://mail.python.org/archives/list/typing-sig@python.org/thread/NWZQIINJQZDOCZGO6TGCUP2PNW4PEKNY/>`__,
              `19-Jan-2023 <https://discuss.python.org/t/add-converter-to-dataclass-field/22956>`__,

Abstract
========

:pep:`557` added :mod:`dataclasses` to the Python stdlib. :pep:`681` added
:func:`~py3.11:typing.dataclass_transform` to help type checkers understand
several common dataclass-like libraries, such as attrs, Pydantic, and object
relational mapper (ORM) packages such as SQLAlchemy and Django.

A common feature these libraries provide over the standard library
implementation is the ability for the library to convert arguments given at
initialization time into the types expected for each field using a
user-provided conversion function.

Therefore, this PEP adds a ``converter`` parameter to :func:`dataclasses.field`
(along with the requisite changes to :class:`dataclasses.Field` and
:func:`~py3.11:typing.dataclass_transform`) to specify the function to use to
convert the input value for each field to the representation to be stored in
the dataclass.

Motivation
==========

There is no existing, standard way for :mod:`dataclasses` or third-party
dataclass-like libraries to support argument conversion in a type-checkable
way. To work around this limitation, library authors and users are forced to
choose one of the following:

* Opt in to a custom Mypy plugin. These plugins help Mypy understand the
  conversion semantics, but not other tools.
* Shift conversion responsibility onto the caller of the dataclass
  constructor. This can make constructing certain dataclasses unnecessarily
  verbose and repetitive.
* Provide a custom ``__init__`` which declares "wider" parameter types and
  converts them when setting the appropriate attribute. This not only duplicates
  the typing annotations between the converter and ``__init__``, but also opts
  the user out of many of the features :mod:`dataclasses` provides.
* Provide a custom ``__init__`` but without meaningful type annotations
  for the parameter types requiring conversion.

None of these choices are ideal.

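As a rough illustration of the third workaround, the sketch below hand-writes
``__init__`` with wider parameter types. The ``Config`` class and its fields are
hypothetical and are not taken from any library.

.. code-block:: python

   import dataclasses
   import pathlib
   from typing import Union


   # init=False is needed so that the hand-written __init__ is kept.
   @dataclasses.dataclass(init=False)
   class Config:
       root: pathlib.PurePosixPath
       retries: int

       # The "wider" parameter types are spelled out by hand here, in
       # addition to the field annotations above.
       def __init__(
           self,
           root: Union[str, pathlib.PurePosixPath],
           retries: Union[int, str],
       ) -> None:
           self.root = pathlib.PurePosixPath(root)
           self.retries = int(retries)


   config = Config("assets/img", "3")

The generated ``__repr__`` and ``__eq__`` still work, but anything tied to the
synthesized ``__init__`` (for example ``default_factory`` handling) would have
to be reimplemented by hand.
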
Rationale
=========

Adding argument conversion semantics is useful and beneficial enough that most
dataclass-like libraries provide support. Adding this feature to the standard
library means more users are able to opt in to these benefits without requiring
third-party libraries. Additionally, third-party libraries are able to clue
type-checkers into their own conversion semantics through added support in
:func:`~py3.11:typing.dataclass_transform`, meaning users of those libraries
benefit as well.

Specification
=============

New ``converter`` parameter
---------------------------

This specification introduces a new parameter named ``converter`` to the
:func:`dataclasses.field` function. When an ``__init__`` method is synthesized
by ``dataclass``-like semantics, if an argument is provided for the field, the
``dataclass`` object's attribute will be assigned the result of calling the
converter on the provided argument. If no argument is given and the field was
constructed with a default value, the ``dataclass`` object's attribute will be
assigned the result of calling the converter on the provided default.

Adding this parameter also implies the following changes:

* A ``converter`` attribute will be added to :class:`dataclasses.Field`.
* ``converter`` will be added to :func:`~py3.11:typing.dataclass_transform`'s
  list of supported field specifier parameters.

Example
'''''''

.. code-block:: python

   import dataclasses
   import pathlib


   # Uses the ``converter`` parameter proposed by this PEP.
   @dataclasses.dataclass
   class InventoryItem:
       # `converter` as a type
       id: int = dataclasses.field(converter=int)
       skus: tuple[int, ...] = dataclasses.field(converter=tuple[int, ...])
       # `converter` as a callable
       names: tuple[str, ...] = dataclasses.field(
           converter=lambda names: tuple(map(str.lower, names))
       )

       # The default value is also converted; therefore the following is not a
       # type error.
       stock_image_path: pathlib.PurePosixPath = dataclasses.field(
           converter=pathlib.PurePosixPath, default="assets/unknown.png"
       )


   item1 = InventoryItem("1", [234, 765], ["PYTHON PLUSHIE", "FLUFFY SNAKE"])
   # item1 would have the following values:
   #   id=1
   #   skus=(234, 765)
   #   names=('python plushie', 'fluffy snake')
   #   stock_image_path=pathlib.PurePosixPath("assets/unknown.png")

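For comparison, the runtime behaviour described above can be roughly
approximated without the proposed parameter. The sketch below is only an
emulation, not the reference implementation: it stores each converter under a
``"converter"`` key in the field's ``metadata`` and applies it from
``__post_init__``. The ``ConvertingMixin`` and ``lower_names`` names are
hypothetical.

.. code-block:: python

   import dataclasses
   import pathlib


   def lower_names(names):
       """Lower-case every name and return them as a tuple."""
       return tuple(name.lower() for name in names)


   class ConvertingMixin:
       """Hypothetical helper that applies converters kept in field metadata."""

       def __post_init__(self):
           for f in dataclasses.fields(self):
               converter = f.metadata.get("converter")
               if converter is not None:
                   # Covers both explicit arguments and default values.
                   setattr(self, f.name, converter(getattr(self, f.name)))


   @dataclasses.dataclass
   class InventoryItem(ConvertingMixin):
       id: int = dataclasses.field(metadata={"converter": int})
       names: tuple = dataclasses.field(metadata={"converter": lower_names})
       stock_image_path: pathlib.PurePosixPath = dataclasses.field(
           default="assets/unknown.png",
           metadata={"converter": pathlib.PurePosixPath},
       )


   item = InventoryItem("1", ["PYTHON PLUSHIE", "FLUFFY SNAKE"])

A type checker still takes the ``__init__`` parameter types from the field
annotations in this emulation, so it would report the ``"1"`` argument and the
string default as errors. That gap is what the ``converter`` parameter is meant
to close.
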
Impact on typing
----------------

A ``converter`` must be a callable that accepts a single positional argument, and
the parameter type corresponding to this positional argument provides the type
of the synthesized ``__init__`` parameter associated with the field.

In other words, the argument provided for the converter parameter must be
compatible with ``Callable[[T], X]`` where ``T`` is the input type for
the converter and ``X`` is the output type of the converter.

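To make the roles of ``T`` and ``X`` concrete, the sketch below reads both from
an annotated converter at runtime. Type checkers would derive them statically;
the ``init_param_type`` helper and the ``to_int_tuple`` converter are
hypothetical and exist only for illustration.

.. code-block:: python

   import inspect
   import typing
   from typing import Iterable, Tuple


   def to_int_tuple(values: Iterable[str]) -> Tuple[int, ...]:
       """Converter with T = Iterable[str] and X = Tuple[int, ...]."""
       return tuple(int(v) for v in values)


   def init_param_type(converter):
       """Return the annotation of the converter's first parameter, or Any."""
       hints = typing.get_type_hints(converter)
       first_param = next(iter(inspect.signature(converter).parameters))
       return hints.get(first_param, typing.Any)


   param_type = init_param_type(to_int_tuple)
   return_type = typing.get_type_hints(to_int_tuple)["return"]
   unannotated = init_param_type(lambda value: value)

Here ``param_type`` plays the role of the synthesized ``__init__`` parameter
type and ``return_type`` is what must be compatible with the field's declared
type. The unannotated lambda carries no parameter annotation, so the helper
falls back to ``Any``.
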
Type-checking the default value
'''''''''''''''''''''''''''''''

Because the ``default`` value is unconditionally converted using ``converter``,
if arguments for both ``converter`` and ``default`` are provided to
:func:`dataclasses.field`, the ``default`` argument's type should be checked
using the type of the single argument to the ``converter`` callable.

Converter return type
'''''''''''''''''''''

The return type of the callable must be a type that's compatible with the
field's declared type. This includes the field's type exactly, but can also be
a type that's more specialized (such as a converter returning a ``list[int]``
for a field annotated as ``list``, or a converter returning an ``int`` for a
field annotated as ``int | str``).

Example
'''''''

.. code-block:: python

   import dataclasses
   import pathlib


   # `Cheese` and `make_cheese` are assumed to be defined elsewhere.
   @dataclasses.dataclass
   class Example:
       my_int: int = dataclasses.field(converter=int)
       my_tuple: tuple[int, ...] = dataclasses.field(converter=tuple[int, ...])
       my_cheese: Cheese = dataclasses.field(converter=make_cheese)

       # Although the default value is of type `str` and the field is declared to
       # be of type `pathlib.Path`, this is not a type error because the default
       # value will be converted.
       tmpdir: pathlib.Path = dataclasses.field(default="/tmp", converter=pathlib.Path)


Backward Compatibility
======================

These changes don't introduce any compatibility problems since they
only introduce new opt-in features.

Security Implications
=====================

There are no direct security concerns with these changes.

How to Teach This
=================

Documentation and examples explaining the new parameter and behavior will be
added to the relevant sections of the docs site (primarily on
:mod:`dataclasses`) and linked from the *What's New* document.

The added documentation/examples will also cover the "common pitfalls" that
users of converters are likely to encounter. Such pitfalls include:

* Needing to handle ``None``/sentinel values.
* Needing to handle values that are already of the correct type.
* Avoiding lambdas for converters, as the synthesized ``__init__``
  parameter's type will become ``Any``.

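A converter that sidesteps these pitfalls might look like the sketch below. It
is one possible shape rather than a prescribed pattern, and the ``to_path``
name is hypothetical. It is a named, annotated function instead of a lambda, it
passes ``None`` through, and it returns values of the target type unchanged.

.. code-block:: python

   import pathlib
   from typing import Optional, Union


   def to_path(
       value: Union[str, pathlib.PurePosixPath, None],
   ) -> Optional[pathlib.PurePosixPath]:
       """Convert strings to PurePosixPath, leaving None and paths untouched."""
       if value is None:
           # Pass the "no value" marker through instead of failing on it.
           return None
       if isinstance(value, pathlib.PurePosixPath):
           # Already the right type: return the same object.
           return value
       return pathlib.PurePosixPath(value)


   existing = pathlib.PurePosixPath("assets/unknown.png")
   converted = to_path("assets/unknown.png")

Because the parameter is annotated, the accepted input types are visible to
type checkers, which an unannotated lambda would not provide.
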
Reference Implementation
========================

The attrs library `already includes <attrs-converters_>`__ a ``converter``
parameter containing converter semantics.

CPython support is implemented on `a branch in the author's fork <cpython-branch_>`__.

Rejected Ideas
==============

Just adding "converter" to ``typing.dataclass_transform``'s ``field_specifiers``
--------------------------------------------------------------------------------

The idea of isolating this addition to
:func:`~py3.11:typing.dataclass_transform` was briefly
`discussed on Typing-SIG <only-dataclass-transform_>`__ where it was suggested
to broaden this to :mod:`dataclasses` more generally.

Additionally, adding this to :mod:`dataclasses` ensures anyone can reap the
benefits without requiring additional libraries.

Not converting default values
-----------------------------

There are pros and cons to both converting and not converting default values.
Leaving default values as-is allows type-checkers and dataclass authors to
expect that the type of the default matches the type of the field. However,
converting default values has two large advantages:

1. Compatibility with attrs. Attrs unconditionally uses the converter to
   convert the default value.

2. Simpler defaults. Allowing the default value to have the same type as
   user-provided values means dataclass authors get the same conveniences as
   their callers.

Automatic conversion using the field's type
-------------------------------------------

One idea could be to allow the type of the field specified (e.g. ``str`` or
``int``) to be used as a converter for each argument provided.
`Pydantic's data conversion <pydantic-data-conversion_>`__ has semantics which
appear to be similar to this approach.

This works well for fairly simple types, but leads to ambiguity in expected
behavior for complex types such as generics. For example, for
``tuple[int, ...]`` it is ambiguous if the converter is supposed to simply
convert an iterable to a tuple, or if it is additionally supposed to convert
each element type to ``int``.

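The two readings can be shown side by side. The sketch below assumes
Python 3.9 or later, where ``tuple[int, ...]`` is a callable generic alias.
Calling it only builds a tuple, so it corresponds to the first reading; the
second reading needs an explicit per-element conversion.

.. code-block:: python

   raw = ["1", "2", "3"]

   # Reading 1: only the container is converted. Calling the generic alias
   # behaves like calling ``tuple`` and leaves the elements as strings.
   shallow = tuple[int, ...](raw)

   # Reading 2: the container and every element are converted.
   deep = tuple(int(value) for value in raw)

Neither result is obviously the "right" one for a field annotated as
``tuple[int, ...]``, which is the ambiguity described above.
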
References
==========

.. _attrs-converters: https://www.attrs.org/en/21.2.0/examples.html#conversion
.. _cpython-branch: https://github.com/thejcannon/cpython/tree/converter
.. _only-dataclass-transform: https://mail.python.org/archives/list/typing-sig@python.org/thread/NWZQIINJQZDOCZGO6TGCUP2PNW4PEKNY/
.. _pydantic-data-conversion: https://docs.pydantic.dev/usage/models/#data-conversion


Copyright
=========

This document is placed in the public domain or under the
CC0-1.0-Universal license, whichever is more permissive.