PEP 728: TypedDict with Typed Extra Fields (#3441)

Signed-off-by: Zixuan James Li <p359101898@gmail.com>
Co-authored-by: Jelle Zijlstra <jelle.zijlstra@gmail.com>
Zixuan Li 2024-02-08 21:55:10 -05:00 committed by GitHub
parent 4112eff73b
commit 395a3115da
2 changed files with 641 additions and 0 deletions

.github/CODEOWNERS vendored

@ -607,6 +607,7 @@ peps/pep-0724.rst @jellezijlstra
peps/pep-0725.rst @pradyunsg
peps/pep-0726.rst @AA-Turner
peps/pep-0727.rst @JelleZijlstra
peps/pep-0728.rst @JelleZijlstra
peps/pep-0729.rst @JelleZijlstra @hauntsaninja
peps/pep-0730.rst @ned-deily
peps/pep-0731.rst @gvanrossum @encukou @vstinner @zooba @iritkatriel

peps/pep-0728.rst Normal file

@ -0,0 +1,640 @@
PEP: 728
Title: TypedDict with Typed Extra Items
Author: Zixuan James Li <p359101898@gmail.com>
Sponsor: Jelle Zijlstra <jelle.zijlstra@gmail.com>
Status: Draft
Type: Standards Track
Topic: Typing
Content-Type: text/x-rst
Created: 12-Sep-2023
Python-Version: 3.13
.. highlight:: rst
Abstract
========
This PEP proposes a way to type extra items for :class:`~typing.TypedDict` using
a reserved ``__extra__`` key. This addresses the need to define a subset of
keys that might appear in a ``dict`` while permitting additional items of a
specified type, and the need to create a closed TypedDict type with ``__extra__:
Never``.
Motivation
==========
A :py:class:`typing.TypedDict` type can annotate the value type of each known
item in a dictionary. However, due to structural subtyping, a TypedDict can have
extra items that are not visible through its type. There is currently no way to
restrict the types of items that might be present in the TypedDict type's
structural subtypes.
Defining a Closed TypedDict Type
--------------------------------
The current behavior of TypedDict prevents users from defining a closed
TypedDict type when it is expected that the type contains no additional items.
Due to the possible presence of extra items, type checkers cannot infer more
precise return types for ``.items()`` and ``.values()`` on a TypedDict. This can
also be resolved by
`defining a closed TypedDict type <https://github.com/python/mypy/issues/7981>`__.
Another possible use case for this is a sound way to
`enable type narrowing <https://github.com/python/mypy/issues/9953>`__ with the
``in`` check::

    class Movie(TypedDict):
        name: str
        director: str

    class Book(TypedDict):
        name: str
        author: str

    def fun(entry: Movie | Book) -> None:
        if "author" in entry:
            reveal_type(entry)  # Revealed type is 'Movie | Book'

Nothing prevents a ``dict`` that is structurally compatible with ``Movie`` from
having the ``author`` key, so under the current specification it would be
incorrect for the type checker to narrow its type.
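
The hazard can be reproduced at runtime. The following is a minimal sketch, not
part of the proposal: ``MovieWithAuthor`` and ``classify`` are hypothetical names
introduced only for illustration. It shows a value that is structurally
compatible with ``Movie`` and still passes the ``"author" in entry`` check:

.. code-block:: python

    from typing import TypedDict, Union


    class Movie(TypedDict):
        name: str
        director: str


    class Book(TypedDict):
        name: str
        author: str


    class MovieWithAuthor(Movie):
        # Hypothetical structural subtype of Movie that also carries "author".
        author: str


    def classify(entry: Union[Movie, Book]) -> str:
        # Unsound narrowing: assumes that only a Book can contain "author".
        return "book" if "author" in entry else "movie"


    adaptation: MovieWithAuthor = {
        "name": "Blade Runner",
        "director": "Ridley Scott",
        "author": "Philip K. Dick",
    }
    # The call type-checks because MovieWithAuthor is consistent with Movie,
    # yet the membership test sends the value down the Book branch.
    result = classify(adaptation)

A closed ``Movie`` type would reject ``MovieWithAuthor`` as a subtype, which is
what would make this kind of narrowing sound.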
Allowing Extra Items of a Certain Type
--------------------------------------
To support API interfaces or legacy codebases where only a subset of the possible
keys is known, it would be useful to explicitly expect additional keys of
certain value types.
However, the typing spec is more restrictive on type checking the construction of a
TypedDict, `preventing users <https://github.com/python/mypy/issues/4617>`__
from doing this::

    class MovieBase(TypedDict):
        name: str

    def fun(movie: MovieBase) -> None:
        # movie can have extra items that are not visible through MovieBase
        ...

    movie: MovieBase = {"name": "Blade Runner", "year": 1982}  # Not OK
    fun({"name": "Blade Runner", "year": 1982})  # Not OK

While the restriction is enforced when constructing a TypedDict, due to
structural subtyping, the TypedDict may have extra items that are not visible
through its type. For example::

    class Movie(MovieBase):
        year: int

    movie: Movie = {"name": "Blade Runner", "year": 1982}
    fun(movie)  # OK

It is not possible to acknowledge the existence of the extra items through
``in`` checks and access them without breaking type safety, even though they
might exist from arbitrary structural subtypes of ``MovieBase``::

    def g(movie: MovieBase) -> None:
        if "year" in movie:
            reveal_type(movie["year"])  # Error: TypedDict 'MovieBase' has no key 'year'

Some workarounds have already been implemented in response to the need to allow
extra keys, but none of them is ideal. For mypy,
``--disable-error-code=typeddict-unknown-key``
`suppresses type checking errors <https://github.com/python/mypy/pull/14225>`__
specifically for unknown keys on TypedDict. This sacrifices type safety for
flexibility, and it does not offer a way to specify that the TypedDict type
expects additional keys compatible with a certain type.
Support Additional Keys for ``Unpack``
--------------------------------------
:pep:`692` adds a way to precisely annotate the types of individual keyword
arguments represented by ``**kwargs`` using TypedDict with ``Unpack``. However,
because TypedDict cannot be defined to accept arbitrary extra items, it is not
possible to
`allow additional keyword arguments <https://discuss.python.org/t/pep-692-using-typeddict-for-more-precise-kwargs-typing/17314/87>`__
that are not known at the time the TypedDict is defined.
Given the usage of pre-:pep:`692` type annotation for ``**kwargs`` in existing
codebases, it will be valuable to accept and type extra items on TypedDict so
that the old typing behavior can be supported in combination with the new
``Unpack`` construct.
Rationale
=========
A type that allows extra items of type ``str`` on a TypedDict can be loosely
described as the intersection between the TypedDict and ``Mapping[str, str]``.
`Index Signatures <https://www.typescriptlang.org/docs/handbook/2/objects.html#index-signatures>`__
in TypeScript achieve this:
.. code-block:: typescript

    type Foo = {
        a: string
        [key: string]: string
    }

This proposal aims to support a similar feature without introducing general
intersection of types or syntax changes, offering a natural extension to the
existing type consistency rules.
We propose that we give the dunder attribute ``__extra__`` a special meaning:
When it is defined on a TypedDict type, extra items are allowed, and their types
should be compatible with the value type of ``__extra__``. Different from index
signatures, the types of known items do not need to be consistent with the value
type of ``__extra__``.
There are some advantages to this approach:
- Inheritance works naturally. ``__extra__`` defined on a TypedDict will also
be available to its subclasses.
- We can build on top of the `type consistency rules defined in the typing spec
<https://typing.readthedocs.io/en/latest/spec/typeddict.html#type-consistency>`__.
``__extra__`` can be treated as a pseudo-item in terms of type consistency.
- There is no need to introduce a syntax to specify the type of the extra items.
- We can precisely type the extra items without making ``__extra__`` the union
of known items.
Specification
=============
This specification is structured to parallel :pep:`589` to highlight changes to
the original TypedDict specification.
Extra items are treated as non-required items that have the same type as
``__extra__`` and whose keys are allowed when determining
`supported and unsupported operations
<https://typing.readthedocs.io/en/latest/spec/typeddict.html#supported-and-unsupported-operations>`__.
Using TypedDict Types
---------------------
For a TypedDict type that has the ``__extra__`` key, during construction, the
value type of each unknown item is expected to be non-required and compatible
with the value type of ``__extra__``. For example::

    class Movie(TypedDict):
        name: str
        __extra__: bool

    a: Movie = {"name": "Blade Runner", "novel_adaptation": True}  # OK
    b: Movie = {
        "name": "Blade Runner",
        "year": 1982,  # Not OK. 'int' is incompatible with 'bool'
    }

In this example, ``__extra__: bool`` does not mean that ``Movie`` has a required
string key ``"__extra__"`` whose value type is ``bool``. Instead, it specifies that
keys other than ``"name"`` have a value type of ``bool`` and are non-required.
The alternative inline syntax is also supported::

    Movie = TypedDict("Movie", {"name": str, "__extra__": bool})

Accessing extra keys is allowed. Type checkers must infer their value type from
the value type of ``__extra__``::

    def f(movie: Movie) -> None:
        reveal_type(movie["name"])  # Revealed type is 'str'
        reveal_type(movie["novel_adaptation"])  # Revealed type is 'bool'

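
As a rough runtime analogue of the construction check described above, the sketch
below validates a plain ``dict`` against a set of known keys plus an extra-item
type. It is only an illustration under stated assumptions: ``check_construction``
and its parameters are hypothetical names, ``isinstance`` stands in for the
static compatibility check, and every known key is assumed to be required.

.. code-block:: python

    from typing import Any, Dict, List, Mapping


    def check_construction(
        value: Mapping[str, Any],
        known: Dict[str, type],
        extra: type,
    ) -> List[str]:
        """Return error messages for a dict checked against a TypedDict-like shape."""
        errors = []
        for key, expected in known.items():
            if key not in value:
                errors.append(f"missing required key {key!r}")
            elif not isinstance(value[key], expected):
                errors.append(f"{key!r} is not {expected.__name__}")
        for key in sorted(value.keys() - known.keys()):
            # Unknown keys are non-required and must match the __extra__ type.
            if not isinstance(value[key], extra):
                errors.append(f"extra key {key!r} is not {extra.__name__}")
        return errors


    movie_shape = {"name": str}  # mirrors 'name: str' with '__extra__: bool'
    ok = check_construction(
        {"name": "Blade Runner", "novel_adaptation": True}, movie_shape, bool
    )
    bad = check_construction({"name": "Blade Runner", "year": 1982}, movie_shape, bool)

Here ``ok`` is empty, while ``bad`` reports the ``"year"`` item, matching the
``Not OK`` case in the example above.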
Interaction with PEP 705
------------------------
When ``__extra__`` is annotated with ``ReadOnly[]``, the extra items on the
TypedDict have the properties of read-only items. This affects subclassing
according to the inheritance rules specified in :pep:`PEP 705 <705#Inheritance>`.
Notably, a subclass of the TypedDict type may redeclare the value type of
``__extra__`` or of additional non-extra items if the TypedDict type declares
``__extra__`` to be read-only.
More details are discussed in the later sections.
Interaction with Totality
-------------------------
It is an error to use ``Required[]`` or ``NotRequired[]`` with the special
``__extra__`` item. ``total=False`` and ``total=True`` have no effect on
``__extra__`` itself.
The extra items are non-required, regardless of the totality of the TypedDict.
Operations that are available to ``NotRequired`` items should also be available
to the extra items::

    class Movie(TypedDict):
        name: str
        __extra__: int

    def f(movie: Movie) -> None:
        del movie["name"]  # Not OK
        del movie["year"]  # OK

Interaction with ``Unpack``
---------------------------
For type checking purposes, ``Unpack[TypedDict]`` with extra items should be
treated as its equivalent in regular parameters, and the existing rules for
function parameters still apply::

    class Movie(TypedDict):
        name: str
        __extra__: int

    def f(**kwargs: Unpack[Movie]) -> None: ...

    # Should be equivalent to
    def f(*, name: str, **kwargs: int) -> None: ...

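
The spelled-out form already works at runtime without the proposed feature. The
short sketch below exercises it; the return value is added purely for
illustration and is not part of the signature shown above.

.. code-block:: python

    from typing import Dict, Tuple


    def f(*, name: str, **kwargs: int) -> Tuple[str, Dict[str, int]]:
        # 'name' plays the role of the known item; every other keyword
        # argument is collected into kwargs, like an extra item.
        return name, kwargs


    known, extras = f(name="Blade Runner", year=1982, runtime_minutes=117)

Omitting ``name`` raises ``TypeError`` at call time, which corresponds to the
known item being required while the extra items are not.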
Inheritance
-----------
``__extra__`` is inherited the same way as a regular ``key: value_type`` item.
As with the other keys, the same rules from
`the typing spec <https://typing.readthedocs.io/en/latest/spec/typeddict.html#inheritance>`__
and :pep:`PEP 705 <705#inheritance>` apply. We interpret the existing rules in the
context of ``__extra__``.
We need to reinterpret the following rule to define how ``__extra__`` interacts
with it:
* Changing a field type of a parent TypedDict class in a subclass is not allowed.
First, it is not allowed to change the value type of ``__extra__`` in a subclass
unless it is declared to be ``ReadOnly`` in the superclass::

    class Parent(TypedDict):
        __extra__: int | None

    class Child(Parent):
        __extra__: int  # Not OK. Like any other TypedDict item, __extra__'s type cannot be changed

Second, ``__extra__: T`` effectively defines the value type of any unnamed items
accepted by the TypedDict and marks them as non-required. Thus, the above
restriction applies to any additional items defined in a subclass. For each item
added in a subclass, all of the following conditions should apply:

- If ``__extra__`` is read-only

  - The item can be either required or non-required

  - The item's value type is consistent with ``T``

- If ``__extra__`` is not read-only

  - The item is non-required

  - The item's value type is consistent with ``T``

  - ``T`` is consistent with the item's value type

- If ``__extra__`` is not redeclared, the subclass inherits it as-is.

For example::

    class MovieBase(TypedDict):
        name: str
        __extra__: int | None

    class AdaptedMovie(MovieBase):  # Not OK. 'bool' is not consistent with 'int | None'
        adapted_from_novel: bool

    class MovieRequiredYear(MovieBase):  # Not OK. Required key 'year' is not known to 'MovieBase'
        year: int | None

    class MovieNotRequiredYear(MovieBase):  # Not OK. 'int | None' is not consistent with 'int'
        year: NotRequired[int]

    class MovieWithYear(MovieBase):  # OK
        year: NotRequired[int | None]

Due to this nature, an important side effect allows us to define a TypedDict
type that disallows additional items::

    class MovieFinal(TypedDict):
        name: str
        __extra__: Never

Here, annotating ``__extra__`` with :class:`typing.Never` specifies that
there can be no other keys in ``MovieFinal`` other than the known ones.
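
The conditions for items added in a subclass can be sketched as a small
predicate. This is an illustrative model only, under loud assumptions: ``Item``
and ``may_add_in_subclass`` are hypothetical names, a value type is modelled as a
``frozenset`` of its union members, consistency is approximated by the subset
relation (subclass relationships such as ``bool``/``int`` are ignored), and an
empty set stands in for ``Never``.

.. code-block:: python

    from dataclasses import dataclass
    from typing import FrozenSet

    NoneType = type(None)


    @dataclass(frozen=True)
    class Item:
        types: FrozenSet[type]  # union members; an empty set stands in for Never
        required: bool = True
        read_only: bool = False


    def may_add_in_subclass(item: Item, extra: Item) -> bool:
        """Check an item added in a subclass against the parent's __extra__."""
        # In both cases the item's value type must be consistent with T.
        if not item.types <= extra.types:
            return False
        if extra.read_only:
            return True  # the item may be required or non-required
        # Mutable __extra__: the item is non-required and T is consistent with it.
        return not item.required and extra.types <= item.types


    extra = Item(frozenset({int, NoneType}), required=False)  # __extra__: int | None
    required_year = Item(frozenset({int, NoneType}))           # year: int | None
    narrower_year = Item(frozenset({int}), required=False)     # year: NotRequired[int]
    matching_year = Item(frozenset({int, NoneType}), required=False)
    closed = Item(frozenset(), required=False)                 # __extra__: Never

Under this model only ``matching_year`` may be added to a subclass of
``MovieBase``, and nothing with a non-empty value type may be added when
``__extra__`` is ``closed``.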
Type Consistency
----------------
In addition to the set ``S`` of keys of the explicitly defined items, a
TypedDict type that has the item ``__extra__: T`` is considered to have an
infinite set of items that all satisfy the following conditions:

- If ``__extra__`` is read-only

  - The key's value type is consistent with ``T``

  - The key is not in ``S``.

- If ``__extra__`` is not read-only

  - The key is non-required

  - The key's value type is consistent with ``T``

  - ``T`` is consistent with the key's value type

  - The key is not in ``S``.

For type checking purposes, let ``__extra__`` be a non-required pseudo-item to
be included whenever "for each ... item/key" is stated in
:pep:`the existing type consistency rules from PEP 705 <705#type-consistency>`,
and we modify it as follows:
A TypedDict type ``A`` is consistent with TypedDict ``B`` if ``A`` is
structurally compatible with ``B``. This is true if and only if all of the
following are satisfied:
* For each item in ``B``, ``A`` has the corresponding key, unless the item
in ``B`` is read-only, not required, and of top value type
(``ReadOnly[NotRequired[object]]``). [Edit: Otherwise, if the
corresponding key with the same name cannot be found in ``A``, "__extra__"
is considered the corresponding key.]
* For each item in ``B``, if ``A`` has the corresponding key [Edit: or
"__extra__"], the corresponding value type in ``A`` is consistent with the
value type in ``B``.
* For each non-read-only item in ``B``, its value type is consistent with
the corresponding value type in ``A``. [Edit: if the corresponding key
with the same name cannot be found in ``A``, "__extra__" is considered the
corresponding key.]
* For each required key in ``B``, the corresponding key is required in ``A``.
For each non-required key in ``B``, if the item is not read-only in ``B``,
the corresponding key is not required in ``A``.
[Edit: if the corresponding key with the same name cannot be found in
``A``, "__extra__" is considered to be non-required as the corresponding
key.]
The following examples illustrate these checks in action.
``__extra__`` puts various restrictions on additional items for type
consistency checks::

    class Movie(TypedDict):
        name: str
        __extra__: int | None

    class MovieDetails(TypedDict):
        name: str
        year: NotRequired[int]

    details: MovieDetails = {"name": "Kill Bill Vol. 1", "year": 2003}
    movie: Movie = details  # Not OK. While 'int' is consistent with 'int | None',
                            # 'int | None' is not consistent with 'int'

    class MovieWithYear(TypedDict):
        name: str
        year: int | None

    details: MovieWithYear = {"name": "Kill Bill Vol. 1", "year": 2003}
    movie: Movie = details  # Not OK. 'year' is not required in 'Movie',
                            # so it shouldn't be required in 'MovieWithYear' either

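Because ``"year"`` is absent in ``Movie``, ``__extra__`` is considered the
corresponding key. ``__extra__`` is non-required and not read-only in ``Movie``,
so ``"year"`` being required in ``MovieWithYear`` violates the rule "For each
non-required key in ``B``, if the item is not read-only in ``B``, the
corresponding key is not required in ``A``".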
When ``__extra__`` is defined to be read-only in a TypedDict type, it is possible
for an item to have a narrower type than ``__extra__``'s value type::

    class Movie(TypedDict):
        name: str
        __extra__: ReadOnly[str | int]

    class MovieDetails(TypedDict):
        name: str
        year: NotRequired[int]

    details: MovieDetails = {"name": "Kill Bill Vol. 2", "year": 2004}
    movie: Movie = details  # OK. 'int' is consistent with 'str | int'.

This behaves the same way as :pep:`705` specifies for the case where
``year: ReadOnly[str | int]`` is an item defined in ``Movie``.
``__extra__`` as a pseudo-item follows the same rules that other items have, so
when both TypedDicts contain ``__extra__``, this check is naturally enforced::

    class MovieExtraInt(TypedDict):
        name: str
        __extra__: int

    class MovieExtraStr(TypedDict):
        name: str
        __extra__: str

    extra_int: MovieExtraInt = {"name": "No Country for Old Men", "year": 2007}
    extra_str: MovieExtraStr = {"name": "No Country for Old Men", "description": ""}
    extra_int = extra_str  # Not OK. 'str' is inconsistent with 'int' for item '__extra__'
    extra_str = extra_int  # Not OK. 'int' is inconsistent with 'str' for item '__extra__'

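
The per-key part of these checks can be sketched as follows. This is a
simplified model, not a reference implementation: ``Item``, ``Shape``,
``corresponding`` and ``key_is_consistent`` are hypothetical names, value types
are modelled as ``frozenset`` objects of union members with consistency
approximated by the subset relation, and only a single key is compared at a time
(the first rule, about which keys must exist at all, is not modelled).

.. code-block:: python

    from dataclasses import dataclass
    from typing import Dict, FrozenSet, Optional

    NoneType = type(None)


    @dataclass(frozen=True)
    class Item:
        types: FrozenSet[type]
        required: bool = True
        read_only: bool = False


    @dataclass(frozen=True)
    class Shape:
        items: Dict[str, Item]
        extra: Optional[Item] = None  # the __extra__ pseudo-item, if declared


    def corresponding(shape: Shape, key: str) -> Optional[Item]:
        # Fall back to __extra__ when the key is not explicitly declared.
        return shape.items.get(key, shape.extra)


    def key_is_consistent(a: Shape, b: Shape, key: str) -> bool:
        """Check one key when a value of shape A is assigned to shape B."""
        item_a, item_b = corresponding(a, key), corresponding(b, key)
        if item_a is None or item_b is None:
            raise LookupError(f"{key!r} has no corresponding item on both sides")
        if not item_a.types <= item_b.types:
            return False  # A's value type must be consistent with B's
        if not item_b.read_only and not item_b.types <= item_a.types:
            return False  # mutable items are checked in both directions
        if item_b.required and not item_a.required:
            return False
        if not item_b.required and not item_b.read_only and item_a.required:
            return False
        return True


    name = Item(frozenset({str}))
    movie = Shape({"name": name}, Item(frozenset({int, NoneType}), required=False))
    movie_ro = Shape(
        {"name": name}, Item(frozenset({str, int}), required=False, read_only=True)
    )
    movie_details = Shape({"name": name, "year": Item(frozenset({int}), required=False)})
    movie_with_year = Shape({"name": name, "year": Item(frozenset({int, NoneType}))})

With these definitions, ``"year"`` fails against ``movie`` for both
``movie_details`` and ``movie_with_year``, and passes against ``movie_ro``,
mirroring the examples in this section.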
Interaction with Mapping[KT, VT]
--------------------------------
A TypedDict type can be consistent with ``Mapping[KT, VT]`` types other than
``Mapping[str, object]`` as long as the union of value types on the TypedDict
type is consistent with ``VT``. It is an extension of this rule from the typing
spec::
* A TypedDict with all ``int`` values is not consistent with
``Mapping[str, int]``, since there may be additional non-``int``
values not visible through the type, due to structural subtyping.
These can be accessed using the ``values()`` and ``items()``
methods in ``Mapping``
For example::

    class MovieExtraInt(TypedDict):
        name: str
        __extra__: int

    class MovieExtraStr(TypedDict):
        name: str
        __extra__: str

    extra_int: MovieExtraInt = {"name": "Blade Runner", "year": 1982}
    extra_str: MovieExtraStr = {"name": "Blade Runner", "summary": ""}
    str_mapping: Mapping[str, str] = extra_str  # OK

    int_mapping: Mapping[str, int] = extra_int  # Not OK. 'int | str' is not consistent with 'int'
    int_str_mapping: Mapping[str, int | str] = extra_int  # OK

Furthermore, type checkers should be able to infer the precise return types of
``values()`` and ``items()`` on such TypedDict types::

    def fun(movie: MovieExtraStr) -> None:
        reveal_type(movie.items())  # Revealed type is 'dict_items[str, str]'
        reveal_type(movie.values())  # Revealed type is 'dict_values[str, str]'

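
The ``Mapping`` rule amounts to comparing the union of all value types, including
the ``__extra__`` type, against ``VT``. A minimal sketch, assuming types are
modelled as plain Python classes and consistency as set inclusion
(``value_type_union`` and ``consistent_with_mapping`` are hypothetical helper
names):

.. code-block:: python

    from typing import Dict, FrozenSet


    def value_type_union(known: Dict[str, type], extra: type) -> FrozenSet[type]:
        # Union of the value types of all known items plus the __extra__ type.
        return frozenset(known.values()) | {extra}


    def consistent_with_mapping(
        known: Dict[str, type], extra: type, vt: FrozenSet[type]
    ) -> bool:
        # vt lists the union members of the Mapping's value type.
        return value_type_union(known, extra) <= vt


    # MovieExtraStr: name: str, __extra__: str
    str_ok = consistent_with_mapping({"name": str}, str, frozenset({str}))
    # MovieExtraInt: name: str, __extra__: int
    int_ok = consistent_with_mapping({"name": str}, int, frozenset({int}))
    int_str_ok = consistent_with_mapping({"name": str}, int, frozenset({int, str}))
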
Interaction with dict[KT, VT]
-----------------------------
Note that because the presence of ``__extra__`` prohibits additional required
keys in a TypedDict type's structural subtypes, we can determine if the
TypedDict type and its structural subtypes will ever have any required key
during static analysis.
If there is no required key, the TypedDict type is consistent with ``dict[KT,
VT]`` and vice versa if all items on the TypedDict type satisfy the following
conditions:
- ``VT`` is consistent with the value type of the item
- The value type of the item is consistent with ``VT``
For example::

    class IntDict(TypedDict):
        __extra__: int

    class IntDictWithNum(IntDict):
        num: NotRequired[int]

    def f(x: IntDict) -> None:
        v: dict[str, int] = x  # OK
        v.clear()  # OK

    not_required_num: IntDictWithNum = {"num": 1, "bar": 2}
    regular_dict: dict[str, int] = not_required_num  # OK
    f(not_required_num)  # OK

In this case, methods that were previously unavailable on a TypedDict are allowed::

    not_required_num.clear()  # OK

    reveal_type(not_required_num.popitem())  # OK. Revealed type is tuple[str, int]

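
The two conditions for ``dict[KT, VT]`` can be sketched in the same simplified
style. The helper name ``consistent_with_dict`` and the ``(value_type, required)``
encoding of items are assumptions made for this illustration, and mutual
consistency is approximated by requiring identical types.

.. code-block:: python

    from typing import Dict, Tuple

    # Each known item is modelled as (value_type, required).
    Items = Dict[str, Tuple[type, bool]]


    def consistent_with_dict(items: Items, extra: type, vt: type) -> bool:
        """Whether the modelled TypedDict is interchangeable with dict[str, vt]."""
        if any(required for _, required in items.values()):
            return False  # a required key rules out operations such as clear()
        value_types = {value_type for value_type, _ in items.values()} | {extra}
        # Every item type, including __extra__, must be exactly vt.
        return value_types == {vt}


    int_dict = consistent_with_dict({}, int, int)                            # IntDict
    int_dict_with_num = consistent_with_dict({"num": (int, False)}, int, int)
    with_required = consistent_with_dict({"num": (int, True)}, int, int)
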
Open Issues
===========
Alternatives to the ``__extra__`` Reserved Key
----------------------------------------------
As was pointed out in the `PEP 705 review comment
<https://discuss.python.org/t/pep-705-typeddict-read-only-and-other-keys/36457/6>`__,
``__extra__`` as a reserved item has some disadvantages, including not allowing
``"__extra__"`` as a regular key and requiring special handling to disallow
``Required`` and ``NotRequired``. There could be better alternatives that avoid
these issues.
Backwards Compatibility
=======================
While dunder attributes like ``__extra__`` are reserved for the stdlib, it is still
a limitation that ``__extra__`` is no longer usable as a regular key. If the
proposal is accepted, none of ``__required_keys__``, ``__optional_keys__``,
``__readonly_keys__`` and ``__mutable_keys__`` should include ``__extra__`` at
runtime.
Because this is a type-checking feature, it can be made available to older
versions as long as the type checker supports it.
Rejected Ideas
==============
Allowing Extra Items without Specifying the Type
------------------------------------------------
``extra=True`` was originally proposed for defining a TypedDict that accepts extra
items regardless of the type, like how ``total=True`` works::

    class TypedDict(extra=True):
        pass

Because it did not offer a way to specify the type of the extra items, type
checkers would need to assume that the type of the extra items is ``Any``, which
compromises type safety. Furthermore, the current behavior of TypedDict already
allows untyped extra items to be present at runtime, due to structural
subtyping.
Supporting ``TypedDict(extra=type)``
------------------------------------
This adds more corner cases to determine whether a type should be treated as a
type or a value. And it will require more work to support using special forms to
type the extra items.
While this saves us from reserving an attribute for special use, it will require
extra work to implement inheritance, and it is less natural to integrate with
generic TypedDicts.
Support Extra Items with Intersection
-------------------------------------
Supporting intersections in Python's type system requires a lot of careful
consideration, and it can take a long time for the community to reach a
consensus on a reasonable design.

Ideally, extra items in TypedDict should not be blocked by work on
intersections, nor do they necessarily need to be supported through
intersections.
Moreover, the intersection between ``Mapping[...]`` and ``TypedDict`` is not
equivalent to a TypedDict type with the proposed ``__extra__`` special item, as
the value type of all known items in ``TypedDict`` needs to satisfy the
is-subtype-of relation with the value type of ``Mapping[...]``.
Requiring Type Compatibility of the Known Items with ``__extra__``
------------------------------------------------------------------
``__extra__`` restricts the value type for keys that are *unknown* to the
TypedDict type. So the value type of any *known* item is not necessarily
consistent with ``__extra__``'s type, and ``__extra__``'s type is not
necessarily consistent with the value types of all known items.
This differs from TypeScript's `Index Signatures
<https://www.typescriptlang.org/docs/handbook/2/objects.html#index-signatures>`__
syntax, which requires all properties' types to match the string index's type.
For example:
.. code-block:: typescript

    interface MovieWithExtraNumber {
        name: string // Property 'name' of type 'string' is not assignable to 'string' index type 'number'.
        [index: string]: number
    }

    interface MovieWithExtraNumberOrString {
        name: string // OK
        [index: string]: number | string
    }

This is a known limitation discussed in `TypeScript's issue tracker
<https://github.com/microsoft/TypeScript/issues/17867>`__,
where it is suggested that there should be a way to exclude the defined keys
from the index signature, so that it is possible to define a type like
``MovieWithExtraNumber``.
Reference Implementation
========================
pyanalyze has `experimental support
<https://github.com/quora/pyanalyze/blob/9bfc2c58467c87774a9950838402d2657b1486a0/pyanalyze/extensions.py#L590>`__
for a similar feature.
A reference implementation for this specific proposal, however, is not currently
available.
Acknowledgments
===============
Thanks to Jelle Zijlstra for sponsoring this PEP and providing review feedback,
to Eric Traut, who `proposed the original design
<https://mail.python.org/archives/list/typing-sig@python.org/message/3Z72OQWVTOVS6UYUUCCII2UZN56PV5II/>`__
that this PEP iterates on, and to Alice Purcell for offering their perspective as the
author of :pep:`705`.
Copyright
=========
This document is placed in the public domain or under the
CC0-1.0-Universal license, whichever is more permissive.