PEP 712: Now convert all incoming values (#3152)

Josh Cannon 2023-08-16 15:22:53 -05:00 committed by GitHub
parent 2d383ca8ef
commit 333b29769c
1 changed files with 92 additions and 39 deletions


@ -20,7 +20,7 @@ Abstract
several common dataclass-like libraries, such as attrs, Pydantic, and object several common dataclass-like libraries, such as attrs, Pydantic, and object
relational mapper (ORM) packages such as SQLAlchemy and Django. relational mapper (ORM) packages such as SQLAlchemy and Django.
A common feature these libraries provide over the standard library A common feature other libraries provide over the standard library
implementation is the ability for the library to convert arguments given at implementation is the ability for the library to convert arguments given at
initialization time into the types expected for each field using a initialization time into the types expected for each field using a
user-provided conversion function. user-provided conversion function.
@ -71,12 +71,16 @@ New ``converter`` parameter
--------------------------- ---------------------------
This specification introduces a new parameter named ``converter`` to the This specification introduces a new parameter named ``converter`` to the
:func:`dataclasses.field` function. When an ``__init__`` method is synthesized :func:`dataclasses.field` function. If provided, it represents a single-argument
by ``dataclass``-like semantics, if an argument is provided for the field, the callable used to convert all values when assigning to the associated attribute.
``dataclass`` object's attribute will be assigned the result of calling the
converter on the provided argument. If no argument is given and the field was For frozen dataclasses, the converter is only used inside a ``dataclass``-synthesized
constructed with a default value, the ``dataclass`` object's attribute will be ``__init__`` when setting the attribute. For non-frozen dataclasses, the converter
assigned the result of calling the converter on the provided default. is used for all attribute assignment (e.g. ``obj.attr = value``), which includes
assignment of default values.
The converter is not used when reading attributes, as the attributes should already
have been converted.
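The assignment rules described here can be approximated in current Python with a descriptor. The following is only a sketch under that assumption: ``Converting`` is a hypothetical stand-in for ``field(converter=...)``, not part of ``dataclasses``, and the class names are illustrative.

```python
import dataclasses
from typing import Any, Callable


class Converting:
    """Hypothetical stand-in for ``field(converter=...)``; not a dataclasses API."""

    def __init__(self, converter: Callable[[Any], Any]) -> None:
        self.converter = converter

    def __set_name__(self, owner: type, name: str) -> None:
        self.storage_name = "_" + name

    def __get__(self, obj: Any, objtype: Any = None) -> Any:
        if obj is None:
            # Class-level access: tell the dataclass machinery there is no default.
            raise AttributeError(self.storage_name)
        # Reads return the stored value untouched; no conversion happens here.
        return getattr(obj, self.storage_name)

    def __set__(self, obj: Any, value: Any) -> None:
        # object.__setattr__ bypasses a frozen class's __setattr__, so the
        # synthesized __init__ of a frozen dataclass can still store the value.
        object.__setattr__(obj, self.storage_name, self.converter(value))


@dataclasses.dataclass
class Mutable:
    count: int = Converting(int)


@dataclasses.dataclass(frozen=True)
class Frozen:
    count: int = Converting(int)


m = Mutable("1")   # converted inside the synthesized __init__
m.count = "2"      # converted again on plain attribute assignment
f = Frozen("3")    # converted inside the synthesized __init__ only
```

Assigning to ``f.count`` afterwards raises ``dataclasses.FrozenInstanceError`` before any converter runs, which mirrors the frozen case described above.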
Adding this parameter also implies the following changes: Adding this parameter also implies the following changes:
@ -91,13 +95,14 @@ Example
@dataclasses.dataclass @dataclasses.dataclass
class InventoryItem: class InventoryItem:
# `converter` as a type # `converter` as a type (including a GenericAlias).
id: int = dataclasses.field(converter=int) id: int = dataclasses.field(converter=int)
skus: tuple[int, ...] = dataclasses.field(converter=tuple[int, ...]) skus: tuple[int, ...] = dataclasses.field(converter=tuple[int, ...])
# `converter` as a callable # `converter` as a callable.
vendor: str | None = dataclasses.field(converter=str_or_none)
names: tuple[str, ...] = dataclasses.field( names: tuple[str, ...] = dataclasses.field(
converter=lambda names: tuple(map(str.lower, names)) converter=lambda names: tuple(map(str.lower, names))
) ) # Note that lambdas are supported, but discouraged as they are untyped.
# The default value is also converted; therefore the following is not a # The default value is also converted; therefore the following is not a
# type error. # type error.
@ -105,12 +110,31 @@ Example
converter=pathlib.PurePosixPath, default="assets/unknown.png" converter=pathlib.PurePosixPath, default="assets/unknown.png"
) )
item1 = InventoryItem("1", [234, 765], ["PYTHON PLUSHIE", "FLUFFY SNAKE"]) # Default value conversion extends to `default_factory`;
# item1 would have the following values: # therefore the following is also not a type error.
# id=1 shelves: tuple = dataclasses.field(
# skus=(234, 765) converter=tuple, default_factory=list
# names=('python plushie', 'fluffy snake') )
# stock_image_path=pathlib.PurePosixPath("assets/unknown.png")
item1 = InventoryItem(
"1",
[234, 765],
None,
["PYTHON PLUSHIE", "FLUFFY SNAKE"]
)
# item1's repr would be (with added newlines for readability):
# InventoryItem(
# id=1,
# skus=(234, 765),
# vendor=None,
# names=('python plushie', 'fluffy snake'),
# stock_image_path=PurePosixPath('assets/unknown.png'),
# shelves=()
# )
# Attribute assignment also participates in conversion.
item1.skus = [555]
# item1's skus attribute is now (555,).
Impact on typing Impact on typing
@ -124,12 +148,13 @@ In other words, the argument provided for the converter parameter must be
compatible with ``Callable[[T], X]`` where ``T`` is the input type for compatible with ``Callable[[T], X]`` where ``T`` is the input type for
the converter and ``X`` is the output type of the converter. the converter and ``X`` is the output type of the converter.
Type-checking the default value Type-checking ``default`` and ``default_factory``
''''''''''''''''''''''''''''''' '''''''''''''''''''''''''''''''''''''''''''''''''
Because the ``default`` value is unconditionally converted using ``converter``, Because default values are unconditionally converted using ``converter``, if
if arguments for both ``converter`` and ``default`` are provided to an argument for ``converter`` is provided alongside either ``default`` or
:func:`dataclasses.field`, the ``default`` argument's type should be checked ``default_factory``, the type of the default (the ``default`` argument if
provided, otherwise the return value of ``default_factory``) should be checked
using the type of the single argument to the ``converter`` callable. using the type of the single argument to the ``converter`` callable.
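The relationship between the converter's input type ``T``, its output type ``X``, and the type of the default can be sketched with an ordinary generic function. This is an illustration only: ``converted_default`` and ``parse_port`` are hypothetical names, not part of ``dataclasses`` or of this PEP.

```python
from typing import Callable, TypeVar

T = TypeVar("T")  # converter input type
X = TypeVar("X")  # converter output type, i.e. the field's type


def converted_default(*, converter: Callable[[T], X], default: T) -> X:
    """Hypothetical helper: convert a default the way the PEP describes."""
    return converter(default)


def parse_port(text: str) -> int:
    return int(text)


# Accepted: the default is a `str`, matching parse_port's single parameter.
port = converted_default(converter=parse_port, default="8080")

# A type checker would be expected to flag the next line, because `None`
# is not a `str`:
# converted_default(converter=parse_port, default=None)
```

Typing ``default`` as ``T`` rather than ``X`` is the point of the sketch: the default is checked against what the converter accepts, not against the field's declared type.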
Converter return type Converter return type
@ -141,22 +166,17 @@ a type that's more specialized (such as a converter returning a ``list[int]``
for a field annotated as ``list``, or a converter returning an ``int`` for a for a field annotated as ``list``, or a converter returning an ``int`` for a
field annotated as ``int | str``). field annotated as ``int | str``).
Example Indirection of allowable argument types
''''''' ---------------------------------------
.. code-block:: python One downside introduced by this PEP is that knowing what argument types are
allowed in the dataclass' ``__init__`` and during attribute assignment is not
@dataclasses.dataclass immediately obvious from reading the dataclass. The allowable types are defined
class Example: by the converter.
my_int: int = dataclasses.field(converter=int)
my_tuple: tuple[int, ...] = dataclasses.field(converter=tuple[int, ...])
my_cheese: Cheese = dataclasses.field(converter=make_cheese)
# Although the default value is of type `str` and the field is declared to
# be of type `pathlib.Path`, this is not a type error because the default
# value will be converted.
tmpdir: pathlib.Path = dataclasses.field(default="/tmp", converter=pathlib.Path)
This is true when reading code from source; however, typing-related aids such
as ``typing.reveal_type`` and "IntelliSense" in an IDE should make it easy to know
exactly what types are allowed without having to read any source code.
Backward Compatibility Backward Compatibility
@ -184,6 +204,10 @@ users of converters are likely to encounter. Such pitfalls include:
* Needing to handle values that are already of the correct type. * Needing to handle values that are already of the correct type.
* Avoiding lambdas for converters, as the synthesized ``__init__`` * Avoiding lambdas for converters, as the synthesized ``__init__``
parameter's type will become ``Any``. parameter's type will become ``Any``.
* Forgetting to convert values in the bodies of user-defined ``__init__`` in
frozen dataclasses.
* Forgetting to convert values in the bodies of user-defined ``__setattr__`` in
non-frozen dataclasses.
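As an illustration of the first pitfall, a converter can pass through values that are already of the target type. This is a minimal sketch; ``to_date`` is a hypothetical converter chosen for the example, not something defined by the PEP.

```python
import datetime
from typing import Union


def to_date(value: Union[str, datetime.date]) -> datetime.date:
    """Hypothetical converter that tolerates already-converted input."""
    if isinstance(value, datetime.date):
        # Already the right type: return it unchanged instead of re-parsing.
        return value
    return datetime.date.fromisoformat(value)


parsed = to_date("2023-08-16")
same = to_date(parsed)  # passing the result back in is harmless
```

Because converters run on every assignment in non-frozen dataclasses, a converter written this way stays safe when an attribute is reassigned with a value read from another instance.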
Reference Implementation Reference Implementation
======================== ========================
@ -213,15 +237,18 @@ Not converting default values
There are pros and cons with both converting and not converting default values. There are pros and cons with both converting and not converting default values.
Leaving default values as-is allows type-checkers and dataclass authors to Leaving default values as-is allows type-checkers and dataclass authors to
expect that the type of the default matches the type of the field. However, expect that the type of the default matches the type of the field. However,
converting default values has two large advantages: converting default values has three large advantages:
1. Compatibility with attrs. Attrs unconditionally uses the converter to 1. Consistency. Unconditionally converting all values that are assigned to the
convert the default value. attribute involves fewer "special rules" that users must remember.
2. Simpler defaults. Allowing the default value to have the same type as 2. Simpler defaults. Allowing the default value to have the same type as
user-provided values means dataclass authors get the same conveniences as user-provided values means dataclass authors get the same conveniences as
their callers. their callers.
3. Compatibility with attrs. Attrs unconditionally uses the converter to
convert default values.
Automatic conversion using the field's type Automatic conversion using the field's type
------------------------------------------- -------------------------------------------
@ -233,7 +260,33 @@ appear to be similar to this approach.
This works well for fairly simple types, but leads to ambiguity in expected This works well for fairly simple types, but leads to ambiguity in expected
behavior for complex types such as generics. E.g. For ``tuple[int, ...]`` it is behavior for complex types such as generics. E.g. For ``tuple[int, ...]`` it is
ambiguous if the converter is supposed to simply convert an iterable to a tuple, ambiguous if the converter is supposed to simply convert an iterable to a tuple,
or if it is additionally supposed to convert each element type to ``int``. or if it is additionally supposed to convert each element type to ``int``. The
approach also breaks down for ``int | None``, which isn't callable.
Deducing the attribute type from the return type of the converter
-----------------------------------------------------------------
Another idea would be to allow the user to omit the attribute's type annotation
if providing a ``field`` with a ``converter`` argument. Although this would
reduce the common repetition this PEP introduces (e.g. ``x: str = field(converter=str)``),
it isn't clear how to best support this while maintaining the current dataclass
semantics (namely, that the attribute order is preserved for things like the
synthesized ``__init__``, or ``dataclasses.fields``). This is because there isn't
an easy way in Python (today) to get the annotation-only attributes interspersed
with un-annotated attributes in the order they were defined.
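The ordering problem can be seen by inspecting a plain class. This is a small demonstration; ``Example`` and its attributes are illustrative names only.

```python
class Example:
    a: int   # annotation only
    b = 1    # assignment only, no annotation
    c: str   # annotation only


annotated = list(Example.__annotations__)
assigned = [name for name in vars(Example) if not name.startswith("__")]

print(annotated)  # ['a', 'c']
print(assigned)   # ['b']
```

Each view preserves its own order, but neither records that ``b`` was defined between ``a`` and ``c``.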
A sentinel annotation could be applied (e.g. ``x: FromConverter = ...``);
however, this breaks a fundamental assumption of type annotations.
Lastly, this is feasible if *all* fields (including those without a converter)
were assigned to ``dataclasses.field``, which means the class' own namespace
holds the order; however, this trades repetition of type+converter for
repetition of field assignment. The end result is no gain or loss of repetition,
but with the added complexity of dataclasses semantics.
This PEP doesn't suggest that it can't or shouldn't be done, just that it isn't
included in this PEP.
References References
========== ==========