Update to Data Classes.
The major changes from the previous version are:
- Added InitVar to specify initialize-only fields.
- Renamed __dataclass_post_init__() to __post_init__().
- Renamed cmp to compare.
- Added eq, separate from compare, so you can test unorderable items for equality.
- Fleshed out asdict() and astuple().
- Changed replace() to just call __init__(), and dropped complex post-create logic.
parent 039d3b7132
commit 17cc5bc3f7
559
pep-0557.rst
|
@ -6,7 +6,7 @@ Type: Standards Track
|
||||||
Content-Type: text/x-rst
|
Content-Type: text/x-rst
|
||||||
Created: 02-Jun-2017
|
Created: 02-Jun-2017
|
||||||
Python-Version: 3.7
|
Python-Version: 3.7
|
||||||
Post-History: 08-Sep-2017
|
Post-History: 08-Sep-2017, 25-Nov-2017
|
||||||
|
|
||||||
Notice for Reviewers
|
Notice for Reviewers
|
||||||
====================
|
====================
|
||||||
|
@ -21,21 +21,27 @@ Abstract
|
||||||
|
|
||||||
This PEP describes an addition to the standard library called Data
|
This PEP describes an addition to the standard library called Data
|
||||||
Classes. Although they use a very different mechanism, Data Classes
|
Classes. Although they use a very different mechanism, Data Classes
|
||||||
can be thought of as "mutable namedtuples with defaults".
|
can be thought of as "mutable namedtuples with defaults". Because
|
||||||
|
Data Classes use normal class definition syntax, you are free to use
|
||||||
|
inheritance, metaclasses, docstrings, user-defined methods, class
|
||||||
|
factories, and other Python class features.
|
||||||
|
|
||||||
A class decorator is provided which inspects a class definition for
|
A class decorator is provided which inspects a class definition for
|
||||||
variables with type annotations as defined in PEP 526, "Syntax for
|
variables with type annotations as defined in PEP 526, "Syntax for
|
||||||
Variable Annotations". In this document, such variables are called
|
Variable Annotations". In this document, such variables are called
|
||||||
fields. Using these fields, the decorator adds generated method
|
fields. Using these fields, the decorator adds generated method
|
||||||
definitions to the class to support instance initialization, a repr,
|
definitions to the class to support instance initialization, a repr,
|
||||||
and comparisons methods. Such a class is called a Data Class, but
|
comparison methods, and optionally other methods as described in the
|
||||||
there's really nothing special about the class: it is the same class
|
Specification_ section. Such a class is called a Data Class, but
|
||||||
but with the generated methods added.
|
there's really nothing special about the class: the decorator adds
|
||||||
|
generated methods to the class and returns the same class it was
|
||||||
|
given.
|
||||||
|
|
||||||
As an example::
|
As an example::
|
||||||
|
|
||||||
@dataclass
|
@dataclass
|
||||||
class InventoryItem:
|
class InventoryItem:
|
||||||
|
'''Class for keeping track of an item in inventory.'''
|
||||||
name: str
|
name: str
|
||||||
unit_price: float
|
unit_price: float
|
||||||
quantity_on_hand: int = 0
|
quantity_on_hand: int = 0
|
||||||
|
@ -100,7 +106,7 @@ So, why is this PEP needed?
|
||||||
|
|
||||||
With the addition of PEP 526, Python has a concise way to specify the
|
With the addition of PEP 526, Python has a concise way to specify the
|
||||||
type of class members. This PEP leverages that syntax to provide a
|
type of class members. This PEP leverages that syntax to provide a
|
||||||
simple, unobtrusive way to describe Data Classes. With one exception,
|
simple, unobtrusive way to describe Data Classes. With two exceptions,
|
||||||
the specified attribute type annotation is completely ignored by Data
|
the specified attribute type annotation is completely ignored by Data
|
||||||
Classes.
|
Classes.
|
||||||
|
|
||||||
|
@ -110,6 +116,12 @@ interference from Data Classes. The decorated classes are truly
|
||||||
"normal" Python classes. The Data Class decorator should not
|
"normal" Python classes. The Data Class decorator should not
|
||||||
interfere with any usage of the class.
|
interfere with any usage of the class.
|
||||||
|
|
||||||
|
One main design goal of Data Classes is to support static type
|
||||||
|
checkers. The use of PEP 526 syntax is one example of this, but so is
|
||||||
|
the design of the ``fields()`` function and the ``@dataclass``
|
||||||
|
decorator. Due to their very dynamic nature, some of the libraries
|
||||||
|
mentioned above are difficult to use with static type checkers.
|
||||||
|
|
||||||
Data Classes are not, and are not intended to be, a replacement
|
Data Classes are not, and are not intended to be, a replacement
|
||||||
mechanism for all of the above libraries. But being in the standard
|
mechanism for all of the above libraries. But being in the standard
|
||||||
library will allow many of the simpler use cases to instead leverage
|
library will allow many of the simpler use cases to instead leverage
|
||||||
|
@ -118,14 +130,12 @@ sets, and will of course continue to exist and prosper.
|
||||||
|
|
||||||
Where is it not appropriate to use Data Classes?
|
Where is it not appropriate to use Data Classes?
|
||||||
|
|
||||||
- Compatibility with tuples is required.
|
- API compatibility with tuples or dicts is required.
|
||||||
|
|
||||||
- True immutability is required.
|
|
||||||
|
|
||||||
- Type validation beyond that provided by PEPs 484 and 526 is
|
- Type validation beyond that provided by PEPs 484 and 526 is
|
||||||
required, or value validation is required.
|
required, or value validation or conversion is required.
|
||||||
|
|
||||||
XXX Motivation for each dataclass() and field() parameter
|
.. _Specification:
|
||||||
|
|
||||||
Specification
|
Specification
|
||||||
=============
|
=============
|
||||||
|
@ -134,14 +144,14 @@ All of the functions described in this PEP will live in a module named
|
||||||
``dataclasses``.
|
``dataclasses``.
|
||||||
|
|
||||||
A function ``dataclass`` which is typically used as a class decorator
|
A function ``dataclass`` which is typically used as a class decorator
|
||||||
is provided to post-process classes and add generated member
|
is provided to post-process classes and add generated methods,
|
||||||
functions, described below.
|
described below.
|
||||||
|
|
||||||
The ``dataclass`` decorator examines the class to find ``field``'s. A
|
The ``dataclass`` decorator examines the class to find ``field``'s. A
|
||||||
``field`` is defined as any variable identified in
|
``field`` is defined as any variable identified in
|
||||||
``__annotations__``. That is, a variable that is decorated with a
|
``__annotations__``. That is, a variable that has a type annotation.
|
||||||
type annotation. With a single exception described below, none of the
|
With two exceptions described below, none of the Data Class machinery
|
||||||
Data Class machinery examines the type specified in the annotation.
|
examines the type specified in the annotation.
|
||||||
|
|
||||||
Note that ``__annotations__`` is guaranteed to be an ordered mapping,
|
Note that ``__annotations__`` is guaranteed to be an ordered mapping,
|
||||||
in class declaration order. The order of the fields in all of the
|
in class declaration order. The order of the fields in all of the
|
||||||
|
@ -151,7 +161,7 @@ The ``dataclass`` decorator is typically used with no parameters and
|
||||||
no parentheses. However, it also supports the following logical
|
no parentheses. However, it also supports the following logical
|
||||||
signature::
|
signature::
|
||||||
|
|
||||||
def dataclass(*, init=True, repr=True, hash=None, cmp=True, frozen=False)
|
def dataclass(*, init=True, repr=True, eq=True, compare=True, hash=None, frozen=False)
|
||||||
|
|
||||||
If ``dataclass`` is used just as a simple decorator with no
|
If ``dataclass`` is used just as a simple decorator with no
|
||||||
parameters, it acts as if it has the default values documented in this
|
parameters, it acts as if it has the default values documented in this
|
||||||
|
@ -165,37 +175,45 @@ signature. That is, these three uses of ``@dataclass`` are equivalent::
|
||||||
class C:
|
class C:
|
||||||
...
|
...
|
||||||
|
|
||||||
@dataclass(init=True, repr=True, hash=None, cmp=True, frozen=False)
|
@dataclass(init=True, repr=True, eq=True, compare=True, hash=None, frozen=False)
|
||||||
class C:
|
class C:
|
||||||
...
|
...
|
||||||
|
|
||||||
The parameters to ``dataclass`` are:
|
The parameters to ``dataclass`` are:
|
||||||
|
|
||||||
- ``init``: If true, a ``__init__`` method will be generated.
|
- ``init``: If true (the default), a ``__init__`` method will be
|
||||||
|
generated.
|
||||||
|
|
||||||
- ``repr``: If true, a ``__repr__`` function will be generated. The
|
- ``repr``: If true (the default), a ``__repr__`` function will be
|
||||||
generated repr string will have the class name and the name and repr
|
generated. The generated repr string will have the class name and
|
||||||
of each field, in the order they are defined in the class. Fields
|
the name and repr of each field, in the order they are defined in
|
||||||
that are marked as being excluded from the repr are not included.
|
the class. Fields that are marked as being excluded from the repr
|
||||||
For example:
|
are not included. For example:
|
||||||
``InventoryItem(name='widget',unit_price=3.0,quantity_on_hand=10)``.
|
``InventoryItem(name='widget',unit_price=3.0,quantity_on_hand=10)``.
|
||||||
|
|
||||||
- ``cmp``: If true, ``__eq__``, ``__ne__``, ``__lt__``, ``__le__``,
|
- ``eq``: If true (the default), ``__eq__`` and ``__ne__`` methods
|
||||||
|
will be generated. These compare the class as if it were a tuple of
|
||||||
|
its fields, in order. Both instances in the comparison must be of
|
||||||
|
the identical type.
|
||||||
|
|
||||||
|
- ``compare``: If true (the default), ``__lt__``, ``__le__``,
|
||||||
``__gt__``, and ``__ge__`` methods will be generated. These compare
|
``__gt__``, and ``__ge__`` methods will be generated. These compare
|
||||||
the class as if it were a tuple of its fields, in order. Both
|
the class as if it were a tuple of its fields, in order. Both
|
||||||
instances in the comparison must be of the identical type.
|
instances in the comparison must be of the identical type. If
|
||||||
|
``compare`` is True, then ``eq`` is ignored, and ``__eq__`` and
|
||||||
|
``__ne__`` will be automatically generated.
|
||||||
|
|
||||||
- ``hash``: Either a bool or ``None``. If ``None`` (the default), the
|
- ``hash``: Either a bool or ``None``. If ``None`` (the default), the
|
||||||
``__hash__`` method is generated according to how cmp and frozen are
|
``__hash__`` method is generated according to how ``eq`` and
|
||||||
set.
|
``frozen`` are set.
|
||||||
|
|
||||||
If ``cmp`` and ``frozen`` are both true, Data Classes will generate
|
If ``eq`` and ``frozen`` are both true, Data Classes will generate a
|
||||||
a ``__hash__`` for you. If ``cmp`` is true and ``frozen`` is false,
|
``__hash__`` method for you. If ``eq`` is true and ``frozen`` is
|
||||||
``__hash__`` will be set to ``None``, marking it unhashable (which
|
false, ``__hash__`` will be set to ``None``, marking it unhashable
|
||||||
it is). If cmp is false, ``__hash__`` will be left untouched
|
(which it is). If ``eq`` is false, ``__hash__`` will be left
|
||||||
meaning the ``__hash__`` method of the superclass will be used (if
|
untouched meaning the ``__hash__`` method of the superclass will be
|
||||||
superclass is ``object``, this means it will fall back to id-based
|
used (if the superclass is ``object``, this means it will fall back
|
||||||
hashing).
|
to id-based hashing).
|
||||||
|
|
||||||
Although not recommended, you can force Data Classes to create a
|
Although not recommended, you can force Data Classes to create a
|
||||||
``__hash__`` method with ``hash=True``. This might be the case if your
|
``__hash__`` method with ``hash=True``. This might be the case if your
|
||||||
|
@ -204,10 +222,11 @@ The parameters to ``dataclass`` are:
|
||||||
|
|
||||||
See the Python documentation [#]_ for more information.
|
See the Python documentation [#]_ for more information.
|
||||||
|
|
||||||
- ``frozen``: If True, assigning to fields will generate an exception.
|
- ``frozen``: If true (the default is False), assigning to fields will
|
||||||
This emulates read-only frozen instances. See the discussion below.
|
generate an exception. This emulates read-only frozen instances.
|
||||||
|
See the discussion below.
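The interaction between ``eq``, ``frozen`` and the default hashing behavior described above can be checked against the ``dataclasses`` module as released in Python 3.7 and later. This is a minimal sketch rather than part of the PEP text. The class names are illustrative, and it deliberately avoids the ``compare`` and ``hash`` keyword arguments, which the released module spells ``order`` and ``unsafe_hash``.

```python
from dataclasses import dataclass


@dataclass  # eq=True, frozen=False by default
class Mutable:
    x: int


@dataclass(frozen=True)  # eq=True and frozen=True
class Frozen:
    x: int


@dataclass(eq=False)  # no generated __eq__; __hash__ is left untouched
class Identity:
    x: int


# eq without frozen: __hash__ is set to None, so instances are unhashable.
mutable_hash = Mutable.__hash__

# eq with frozen: a field-based __hash__ is generated.
frozen_hashes_match = hash(Frozen(1)) == hash(Frozen(1))

# eq=False: object's id-based hashing and identity comparison are inherited.
identity_hash_inherited = Identity.__hash__ is object.__hash__
identity_equal = Identity(1) == Identity(1)
```

Only the frozen variant is safe to use as a dict key or set member by value, which is why the default leaves mutable, comparable instances unhashable.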
|
||||||
|
|
||||||
``field``'s may optionally specify a default value, using normal
|
``field``\s may optionally specify a default value, using normal
|
||||||
Python syntax::
|
Python syntax::
|
||||||
|
|
||||||
@dataclass
|
@dataclass
|
||||||
|
@ -215,6 +234,11 @@ Python syntax::
|
||||||
a: int # 'a' has no default value
|
a: int # 'a' has no default value
|
||||||
b: int = 0 # assign a default value for 'b'
|
b: int = 0 # assign a default value for 'b'
|
||||||
|
|
||||||
|
In this example, both ``a`` and ``b`` will be included in the added
|
||||||
|
``__init__`` function, which will be defined as::
|
||||||
|
|
||||||
|
def __init__(self, a: int, b: int = 0):
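One way to confirm the shape of the generated ``__init__`` is to inspect its signature. The sketch below assumes the ``dataclasses`` module as shipped with Python 3.7 or later, and uses ``inspect.signature`` purely for illustration.

```python
import inspect
from dataclasses import dataclass


@dataclass
class C:
    a: int       # 'a' has no default value
    b: int = 0   # assign a default value for 'b'


# Parameter names and defaults of the generated __init__.
params = inspect.signature(C.__init__).parameters
param_names = list(params)
b_default = params["b"].default

c = C(1)  # 'b' falls back to its default
```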
|
||||||
|
|
||||||
For common and simple use cases, no other functionality is required.
|
For common and simple use cases, no other functionality is required.
|
||||||
There are, however, some Data Class features that require additional
|
There are, however, some Data Class features that require additional
|
||||||
per-field information. To satisfy this need for additional
|
per-field information. To satisfy this need for additional
|
||||||
|
@ -222,7 +246,7 @@ information, you can replace the default field value with a call to
|
||||||
the provided ``field()`` function. The signature of ``field()`` is::
|
the provided ``field()`` function. The signature of ``field()`` is::
|
||||||
|
|
||||||
def field(*, default=_MISSING, default_factory=_MISSING, repr=True,
|
def field(*, default=_MISSING, default_factory=_MISSING, repr=True,
|
||||||
hash=None, init=True, cmp=True)
|
hash=None, init=True, compare=True, metadata=None)
|
||||||
|
|
||||||
The ``_MISSING`` value is a sentinel object used to detect if the
|
The ``_MISSING`` value is a sentinel object used to detect if the
|
||||||
``default`` and ``default_factory`` parameters are provided. Users
|
``default`` and ``default_factory`` parameters are provided. Users
|
||||||
|
@ -241,57 +265,133 @@ The parameters to ``field()`` are:
|
||||||
with mutable default values, as discussed below. It is an error to
|
with mutable default values, as discussed below. It is an error to
|
||||||
specify both ``default`` and ``default_factory``.
|
specify both ``default`` and ``default_factory``.
|
||||||
|
|
||||||
- ``init``: If True, this field is included as a parameter to the
|
- ``init``: If true (the default), this field is included as a
|
||||||
generated ``__init__`` function.
|
parameter to the generated ``__init__`` function.
|
||||||
|
|
||||||
- ``repr``: If True, this field is included in the string returned by
|
- ``repr``: If true (the default), this field is included in the
|
||||||
the generated ``__repr__`` function.
|
string returned by the generated ``__repr__`` function.
|
||||||
|
|
||||||
- ``cmp``: If True, this field is included in the generated comparison
|
- ``compare``: If True (the default), this field is included in the
|
||||||
methods (``__eq__`` et al).
|
generated equality and comparison methods (``__eq__``, ``__gt__``,
|
||||||
|
et al.).
|
||||||
|
|
||||||
- ``hash``: This can be a bool or ``None``. If True, this field is
|
- ``hash``: This can be a bool or ``None``. If True, this field is
|
||||||
included in the generated ``__hash__`` method. If ``None`` (the
|
included in the generated ``__hash__`` method. If ``None`` (the
|
||||||
default), use the value of ``cmp``: this would normally be the
|
default), use the value of ``compare``: this would normally be the
|
||||||
expected behavior. A field needs to be considered in the hash if
|
expected behavior. A field needs to be considered in the hash if
|
||||||
it's used for comparisons. Setting this value to anything other
|
it's used for comparisons. Setting this value to anything other
|
||||||
than ``None`` is discouraged.
|
than ``None`` is discouraged.
|
||||||
|
|
||||||
|
- ``metadata``: This can be a mapping or None. None is treated as an
|
||||||
|
empty dict. This value is wrapped in ``types.MappingProxyType`` to
|
||||||
|
make it read-only, and exposed on the Field object. It is not used
|
||||||
|
at all by Data Classes, and is provided as a third-party extension
|
||||||
|
mechanism. Multiple third-parties can each have their own key, to
|
||||||
|
use as a namespace in the metadata.
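As a rough illustration of the ``metadata`` parameter, the sketch below attaches a unit to a field and reads it back through ``fields()``. The class name and the ``myapp.unit`` key are hypothetical, chosen only to show a third party using its own key as a namespace. The code assumes the released ``dataclasses`` module.

```python
from dataclasses import dataclass, field, fields


@dataclass
class Measurement:
    # 'myapp.unit' is a hypothetical third-party namespace key.
    weight: float = field(default=0.0, metadata={"myapp.unit": "kg"})
    label: str = ""  # no metadata given: an empty read-only mapping


weight_field, label_field = fields(Measurement)
unit = weight_field.metadata["myapp.unit"]
label_metadata_size = len(label_field.metadata)

# The mapping is exposed read-only, so item assignment fails.
try:
    weight_field.metadata["myapp.unit"] = "lb"
    metadata_is_read_only = False
except TypeError:
    metadata_is_read_only = True
```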
|
||||||
|
|
||||||
|
If the default value of a field is specified by a call to ``field()``,
|
||||||
|
then the class attribute for this field will be replaced by the
|
||||||
|
specified ``default`` value, if one is provided in the call to
|
||||||
|
``field()``. If no ``default`` is provided, then the class attribute
|
||||||
|
will be deleted. The intent is that after the ``dataclass`` decorator
|
||||||
|
runs, the class attributes will all contain the default values for the
|
||||||
|
fields, just as if the default value itself were specified. For
|
||||||
|
example, after::
|
||||||
|
|
||||||
|
@dataclass
|
||||||
|
class C:
|
||||||
|
x: int
|
||||||
|
y: int = field(repr=False)
|
||||||
|
z: int = field(repr=False, default=10)
|
||||||
|
t: int = 20
|
||||||
|
|
||||||
|
The class attribute ``C.z`` will be ``10``, the class attribute
|
||||||
|
``C.t`` will be ``20``, and the class attributes ``C.x`` and ``C.y``
|
||||||
|
will not be set.
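The class-attribute behavior described above can be verified directly. This sketch reuses the class from the example and assumes the released ``dataclasses`` module. The variable names are illustrative.

```python
from dataclasses import dataclass, field


@dataclass
class C:
    x: int
    y: int = field(repr=False)
    z: int = field(repr=False, default=10)
    t: int = 20


z_attr = C.z               # default supplied through field()
t_attr = C.t               # default supplied directly
has_x = hasattr(C, "x")    # no default, so no class attribute
has_y = hasattr(C, "y")    # field() without a default is deleted

shown = repr(C(1, 2))      # y and z are excluded by repr=False
```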
|
||||||
|
|
||||||
``Field`` objects
|
``Field`` objects
|
||||||
-----------------
|
-----------------
|
||||||
|
|
||||||
``Field`` objects describe each defined field. These objects are
|
``Field`` objects describe each defined field. These objects are
|
||||||
created internally, and are returned by the ``fields()`` module-level
|
created internally, and are returned by the ``fields()`` module-level
|
||||||
method (see below). Users should never instantiate a ``Field``
|
method (see below). Users should never instantiate a ``Field``
|
||||||
object directly. Its attributes are:
|
object directly. Its documented attributes are:
|
||||||
|
|
||||||
- ``name``: The name of the field.
|
- ``name``: The name of the field.
|
||||||
|
|
||||||
- ``type``: The type of the field.
|
- ``type``: The type of the field.
|
||||||
|
|
||||||
- ``default``, ``default_factory``, ``init``, ``repr``, ``hash``, and
|
- ``default``, ``default_factory``, ``init``, ``repr``, ``hash``,
|
||||||
``cmp`` have the identical meaning as they do in the ``field()``
|
``compare``, and ``metadata`` have the identical meaning as they do
|
||||||
declaration.
|
in the ``field()`` declaration.
|
||||||
|
|
||||||
|
Other attributes may exist, but they are private.
|
||||||
|
|
||||||
post-init processing
|
post-init processing
|
||||||
--------------------
|
--------------------
|
||||||
|
|
||||||
The generated ``__init__`` code will call a method named
|
The generated ``__init__`` code will call a method named
|
||||||
``__dataclass_post_init__``, if it is defined on the class. It will
|
``__post_init__``, if it is defined on the class. It will
|
||||||
be called as ``self.__dataclass_post_init__()``.
|
be called as ``self.__post_init__()``.
|
||||||
|
|
||||||
Among other uses, this allows for initializing field values that
|
Among other uses, this allows for initializing field values that
|
||||||
depend on one or more other fields.
|
depend on one or more other fields. For example::
|
||||||
|
|
||||||
|
@dataclass
|
||||||
|
class C:
|
||||||
|
a: float
|
||||||
|
b: float
|
||||||
|
c: float = field(init=False)
|
||||||
|
|
||||||
|
def __post_init__(self):
|
||||||
|
self.c = self.a + self.b
|
||||||
|
|
||||||
|
See the section below on init-only variables for ways to pass
|
||||||
|
parameters to ``__post_init__()``. Also see the warning about how
|
||||||
|
``replace()`` handles ``init=False`` fields.
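A short usage sketch of the class above, assuming the released ``dataclasses`` module: the ``init=False`` field is computed in ``__post_init__`` and cannot be supplied by the caller.

```python
from dataclasses import dataclass, field


@dataclass
class C:
    a: float
    b: float
    c: float = field(init=False)  # not an __init__ parameter

    def __post_init__(self):
        self.c = self.a + self.b


obj = C(1.5, 2.0)

try:
    C(1.5, 2.0, 3.5)  # 'c' cannot be passed in
    accepts_c = True
except TypeError:
    accepts_c = False
```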
|
||||||
|
|
||||||
Class variables
|
Class variables
|
||||||
---------------
|
---------------
|
||||||
|
|
||||||
The one place where ``dataclass`` actually inspects the type of a
|
One place where ``dataclass`` actually inspects the type of a field is
|
||||||
field is to determine if a field is a class variable. It does this by
|
to determine if a field is a class variable as defined in PEP 526. It
|
||||||
seeing if the type of the field is given as of type
|
does this by checking if the type of the field is of type
|
||||||
``typing.ClassVar``. If a field is a ``ClassVar``, it is excluded
|
``typing.ClassVar``. If a field is a ``ClassVar``, it is excluded
|
||||||
from consideration as a field and is ignored by the Data Class
|
from consideration as a field and is ignored by the Data Class
|
||||||
mechanisms.
|
mechanisms. For more discussion, see [#]_. Such ``ClassVar``
|
||||||
|
pseudo-fields are not returned by the module-level ``fields()``
|
||||||
|
function.
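The following sketch, which assumes the released ``dataclasses`` module and uses an illustrative ``Counter`` class, shows a ``ClassVar`` annotation being skipped by both ``fields()`` and the generated ``__init__``.

```python
from dataclasses import dataclass, fields
from typing import ClassVar


@dataclass
class Counter:
    count: int = 0
    step: ClassVar[int] = 1  # class variable: not a field


field_names = [f.name for f in fields(Counter)]

try:
    Counter(count=5, step=2)  # 'step' is not an __init__ parameter
    accepts_step = True
except TypeError:
    accepts_step = False
```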
|
||||||
|
|
||||||
|
Init-only variables
|
||||||
|
-------------------
|
||||||
|
|
||||||
|
The other place where ``dataclass`` inspects a type annotation is to
|
||||||
|
determine if a field is an init-only variable. It does this by seeing
|
||||||
|
if the type of a field is of type ``dataclasses.InitVar``. If a field
|
||||||
|
is an ``InitVar``, it is considered a pseudo-field called an init-only
|
||||||
|
field. As it is not a true field, it is not returned by the
|
||||||
|
module-level ``fields()`` function. Init-only fields are added as
|
||||||
|
parameters to the generated ``__init__`` method, and are passed to
|
||||||
|
the optional ``__post_init__`` method. They are not otherwise used
|
||||||
|
by Data Classes.
|
||||||
|
|
||||||
|
For example, suppose a field will be initialized from a database, if a
|
||||||
|
value is not provided when creating the class::
|
||||||
|
|
||||||
|
@dataclass
|
||||||
|
class C:
|
||||||
|
i: int
|
||||||
|
j: int = None
|
||||||
|
database: InitVar[DatabaseType] = None
|
||||||
|
|
||||||
|
def __post_init__(self, database):
|
||||||
|
if self.j is None and database is not None:
|
||||||
|
self.j = database.lookup('j')
|
||||||
|
|
||||||
|
c = C(10, database=my_database)
|
||||||
|
|
||||||
|
In this case, ``fields()`` will return ``Field`` objects for ``i`` and
|
||||||
|
``j``, but not for ``database``.
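The example above leaves ``DatabaseType`` and ``my_database`` undefined. A self-contained variant might look like the sketch below, where ``FakeDatabase`` is a hypothetical stand-in and the ``Optional`` annotations are an addition for clarity. It assumes the released ``dataclasses`` module.

```python
from dataclasses import InitVar, dataclass, fields
from typing import Optional


class FakeDatabase:
    """Hypothetical stand-in for the example's database object."""

    def __init__(self, rows):
        self._rows = rows

    def lookup(self, key):
        return self._rows[key]


@dataclass
class C:
    i: int
    j: Optional[int] = None
    database: InitVar[Optional[FakeDatabase]] = None

    def __post_init__(self, database):
        # The init-only value is passed here and is not stored as a field.
        if self.j is None and database is not None:
            self.j = database.lookup('j')


c = C(10, database=FakeDatabase({'j': 42}))
field_names = [f.name for f in fields(C)]
```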
|
||||||
|
|
||||||
Frozen instances
|
Frozen instances
|
||||||
----------------
|
----------------
|
||||||
|
@ -299,46 +399,13 @@ Frozen instances
|
||||||
It is not possible to create truly immutable Python objects. However,
|
It is not possible to create truly immutable Python objects. However,
|
||||||
by passing ``frozen=True`` to the ``@dataclass`` decorator you can
|
by passing ``frozen=True`` to the ``@dataclass`` decorator you can
|
||||||
emulate immutability. In that case, Data Classes will add
|
emulate immutability. In that case, Data Classes will add
|
||||||
``__setattr__`` and ``__delattr__`` member functions to the class.
|
``__setattr__`` and ``__delattr__`` methods to the class. These
|
||||||
These functions will raise a ``FrozenInstanceError`` when invoked.
|
methods will raise a ``FrozenInstanceError`` when invoked.
|
||||||
|
|
||||||
There is a tiny performance penalty when using ``frozen=True``:
|
There is a tiny performance penalty when using ``frozen=True``:
|
||||||
``__init__`` cannot use simple assignment to initialize fields, and
|
``__init__`` cannot use simple assignment to initialize fields, and
|
||||||
must use ``object.__setattr__``.
|
must use ``object.__setattr__``.
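A minimal sketch of the emulated immutability, assuming the released ``dataclasses`` module. ``Point`` and ``try_change`` are illustrative names. It also shows why the immutability is only emulated: ``object.__setattr__`` still works.

```python
from dataclasses import FrozenInstanceError, dataclass


@dataclass(frozen=True)
class Point:
    x: int
    y: int


p = Point(1, 2)


def try_change(action):
    """Run action() and report whether the frozen class blocked it."""
    try:
        action()
    except FrozenInstanceError:
        return "blocked"
    return "allowed"


assign_result = try_change(lambda: setattr(p, "x", 10))
delete_result = try_change(lambda: delattr(p, "y"))

# The emulation is not airtight: object.__setattr__ bypasses it, which is
# also what the generated __init__ relies on.
object.__setattr__(p, "x", 10)
```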
|
||||||
|
|
||||||
Mutable default values
|
|
||||||
----------------------
|
|
||||||
|
|
||||||
Python stores the default field values in class attributes.
|
|
||||||
Consider this example, not using Data Classes::
|
|
||||||
|
|
||||||
class C:
|
|
||||||
x = []
|
|
||||||
def __init__(self, x=x):
|
|
||||||
self.x = x
|
|
||||||
|
|
||||||
assert C().x is C().x
|
|
||||||
assert C().x is not C([]).x
|
|
||||||
|
|
||||||
That is, two instances of class ``C`` that do not not specify a value
|
|
||||||
for ``x`` when creating a class instance will share the same copy of
|
|
||||||
the list. Because Data Classes just use normal Python class creation,
|
|
||||||
they also share this problem. There is no general way for Data
|
|
||||||
Classes to detect this condition. Instead, Data Classes will raise a
|
|
||||||
``TypeError`` if it detects a default parameter of type ``list``,
|
|
||||||
``dict``, or ``set``. This is a partial solution, but it does protect
|
|
||||||
against many common errors. See `How to support mutable default
|
|
||||||
values`_ in the Discussion section for more details.
|
|
||||||
|
|
||||||
Using default factory functions is a way to create new instances of
|
|
||||||
mutable types as default values for fields::
|
|
||||||
|
|
||||||
@dataclass
|
|
||||||
class C:
|
|
||||||
x: list = field(default_factory=list)
|
|
||||||
|
|
||||||
assert C().x is not C().x
|
|
||||||
|
|
||||||
Inheritance
|
Inheritance
|
||||||
-----------
|
-----------
|
||||||
|
|
||||||
|
@ -346,13 +413,15 @@ When the Data Class is being created by the ``@dataclass`` decorator,
|
||||||
it looks through all of the class's base classes in reverse MRO (that
|
it looks through all of the class's base classes in reverse MRO (that
|
||||||
is, starting at ``object``) and, for each Data Class that it finds,
|
is, starting at ``object``) and, for each Data Class that it finds,
|
||||||
adds the fields from that base class to an ordered mapping of fields.
|
adds the fields from that base class to an ordered mapping of fields.
|
||||||
After all of the base classes, it adds its own fields to the ordered
|
After all of the base class fields are added, it adds its own fields
|
||||||
mapping. Because the fields are in insertion order, derived classes
|
to the ordered mapping. All of the generated methods will use this
|
||||||
override base classes. An example::
|
combined, calculated ordered mapping of fields. Because the fields
|
||||||
|
are in insertion order, derived classes override base classes. An
|
||||||
|
example::
|
||||||
|
|
||||||
@dataclass
|
@dataclass
|
||||||
class Base:
|
class Base:
|
||||||
x: float = 15.0
|
x: Any = 15.0
|
||||||
y: int = 0
|
y: int = 0
|
||||||
|
|
||||||
@dataclass
|
@dataclass
|
||||||
|
@ -363,6 +432,10 @@ override base classes. An example::
|
||||||
The final list of fields is, in order, ``x``, ``y``, ``z``. The final
|
The final list of fields is, in order, ``x``, ``y``, ``z``. The final
|
||||||
type of ``x`` is ``int``, as specified in class ``C``.
|
type of ``x`` is ``int``, as specified in class ``C``.
|
||||||
|
|
||||||
|
The generated ``__init__`` method for ``C`` will look like::
|
||||||
|
|
||||||
|
def __init__(self, x: int = 15, y: int = 0, z: int = 10):
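The field ordering can be checked with a sketch like the one below. The body of the derived class ``C`` is not shown in this part of the diff, so the definition used here (``z: int = 10`` followed by ``x: int = 15``) is an assumption inferred from the generated signature above. The code relies on the released ``dataclasses`` module.

```python
import inspect
from dataclasses import dataclass, fields
from typing import Any


@dataclass
class Base:
    x: Any = 15.0
    y: int = 0


@dataclass
class C(Base):
    # Assumed body, inferred from the generated __init__ signature.
    z: int = 10
    x: int = 15  # overrides Base.x but keeps its original position


field_order = [f.name for f in fields(C)]
x_type = fields(C)[0].type
init_params = list(inspect.signature(C.__init__).parameters)
```

Because ``x`` was first inserted into the ordered mapping by ``Base``, redefining it in ``C`` replaces its type and default without moving it behind ``y`` and ``z``.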
|
||||||
|
|
||||||
Default factory functions
|
Default factory functions
|
||||||
-------------------------
|
-------------------------
|
||||||
|
|
||||||
|
@ -376,18 +449,165 @@ If a field is excluded from ``__init__`` (using ``init=False``) and
|
||||||
the field also specifies ``default_factory``, then the default factory
|
the field also specifies ``default_factory``, then the default factory
|
||||||
function will always be called from the generated ``__init__``
|
function will always be called from the generated ``__init__``
|
||||||
function. This happens because there is no other way to give the
|
function. This happens because there is no other way to give the
|
||||||
field a default value.
|
field an initial value.
|
||||||
|
|
||||||
|
Mutable default values
|
||||||
|
----------------------
|
||||||
|
|
||||||
|
Python stores default member variable values in class attributes.
|
||||||
|
Consider this example, not using Data Classes::
|
||||||
|
|
||||||
|
class C:
|
||||||
|
x = []
|
||||||
|
def add(self, element):
|
||||||
|
self.x.append(element)
|
||||||
|
|
||||||
|
o1 = C()
|
||||||
|
o2 = C()
|
||||||
|
o1.add(1)
|
||||||
|
o2.add(2)
|
||||||
|
assert o1.x == [1, 2]
|
||||||
|
assert o1.x is o2.x
|
||||||
|
|
||||||
|
Note that the two instances of class ``C`` share the same class
|
||||||
|
variable ``x``, as expected.
|
||||||
|
|
||||||
|
Using Data Classes, *if* this code was valid::
|
||||||
|
|
||||||
|
@dataclass
|
||||||
|
class D:
|
||||||
|
x: List = []
|
||||||
|
def add(self, element):
|
||||||
|
self.x.append(element)
|
||||||
|
|
||||||
|
it would generate code similar to::
|
||||||
|
|
||||||
|
class D:
|
||||||
|
x = []
|
||||||
|
def __init__(self, x=x):
|
||||||
|
self.x = x
|
||||||
|
def add(self, element):
|
||||||
|
self.x.append(element)
|
||||||
|
|
||||||
|
assert D().x is D().x
|
||||||
|
|
||||||
|
This has the same issue as the original example using class ``C``.
|
||||||
|
That is, two instances of class ``D`` that do not specify a value for
|
||||||
|
``x`` when creating a class instance will share the same copy of
|
||||||
|
``x``. Because Data Classes just use normal Python class creation
|
||||||
|
they also share this problem. There is no general way for Data
|
||||||
|
Classes to detect this condition. Instead, Data Classes will raise a
|
||||||
|
``TypeError`` if it detects a default parameter of type ``list``,
|
||||||
|
``dict``, or ``set``. This is a partial solution, but it does protect
|
||||||
|
against many common errors. See `Automatically support mutable
|
||||||
|
default values`_ in the Rejected Ideas section for more details.
|
||||||
|
|
||||||
|
Using default factory functions is a way to create new instances of
|
||||||
|
mutable types as default values for fields::
|
||||||
|
|
||||||
|
@dataclass
|
||||||
|
class D:
|
||||||
|
x: list = field(default_factory=list)
|
||||||
|
|
||||||
|
assert D().x is not D().x
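The rejection of a mutable default, and the factory-based alternative, can be exercised as follows. This is a sketch against the released ``dataclasses`` module. The helper ``define_with_list_default`` and class ``E`` are illustrative. The draft text names ``TypeError``, so the sketch catches both ``TypeError`` and ``ValueError`` rather than relying on which one a given release raises.

```python
from dataclasses import dataclass, field
from typing import List


def define_with_list_default():
    @dataclass
    class D:
        x: List = []  # rejected when the decorator runs
    return D


try:
    define_with_list_default()
    rejected = False
except (TypeError, ValueError):
    # Both are caught so the sketch does not depend on which exception
    # type a given release raises.
    rejected = True


@dataclass
class E:
    x: list = field(default_factory=list)  # a fresh list per instance


first, second = E(), E()
first.x.append(1)
```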
|
||||||
|
|
||||||
Module level helper functions
|
Module level helper functions
|
||||||
-----------------------------
|
-----------------------------
|
||||||
|
|
||||||
- ``fields(class_or_instance)``: Returns a list of ``Field`` objects
|
- ``fields(class_or_instance)``: Returns a list of ``Field`` objects
|
||||||
that define the fields for this Data Class. Accepts either a Data
|
that define the fields for this Data Class. Accepts either a Data
|
||||||
Class, or an instance of a Data Class.
|
Class, or an instance of a Data Class. Raises ``ValueError`` if not
|
||||||
|
passed a Data Class or instance of one. Does not return
|
||||||
|
pseudo-fields which are ``ClassVar`` or ``InitVar``.
|
||||||
|
|
||||||
- ``asdict(instance)``: todo: recursion, class factories, etc.
|
- ``asdict(instance, *, dict_factory=dict)``: Converts the Data Class
|
||||||
|
``instance`` to a dict (by using the factory function
|
||||||
|
``dict_factory``). Each Data Class is converted to a dict of its
|
||||||
|
fields, as name:value pairs. Data Classes, dicts, lists, and tuples
|
||||||
|
are recursed into. For example::
|
||||||
|
|
||||||
- ``astuple(instance)``: todo: recursion, class factories, etc.
|
@dataclass
|
||||||
|
class Point:
|
||||||
|
x: int
|
||||||
|
y: int
|
||||||
|
|
||||||
|
@dataclass
|
||||||
|
class C:
|
||||||
|
l: List[Point]
|
||||||
|
|
||||||
|
p = Point(10, 20)
|
||||||
|
assert asdict(p) == {'x': 10, 'y': 20}
|
||||||
|
|
||||||
|
c = C([Point(0, 0), Point(10, 4)])
|
||||||
|
assert asdict(c) == {'l': [{'x': 0, 'y': 0}, {'x': 10, 'y': 4}]}
|
||||||
|
|
||||||
|
Raises ``TypeError`` if ``instance`` is not a Data Class instance.
|
||||||
|
|
||||||
|
- ``astuple(instance, *, tuple_factory=tuple)``: Converts the Data Class
|
||||||
|
``instance`` to a tuple (by using the factory function
|
||||||
|
``tuple_factory``). Each Data Class is converted to a tuple of its
|
||||||
|
field values. Data Classes, dicts, lists, and tuples are recursed
|
||||||
|
into.
|
||||||
|
|
||||||
|
Continuing from the previous example::
|
||||||
|
|
||||||
|
assert astuple(p) == (10, 20)
|
||||||
|
assert astuple(c) == ([(0, 0), (10, 4)],)
|
||||||
|
|
||||||
|
Raises ``TypeError`` if ``instance`` is not a Data Class instance.
|
||||||
|
|
||||||
|
- ``isdataclass(instance)``: Returns ``True`` if ``instance`` is an
|
||||||
|
instance of a Data Class, otherwise returns ``False``.
|
||||||
|
|
||||||
|
- ``make_dataclass(cls_name, fields, *, bases=(), namespace=None)``:
|
||||||
|
Creates a new Data Class with name ``cls_name``, fields as defined
|
||||||
|
in ``fields``, base classes as given in ``bases``, and initialized
|
||||||
|
with a namespace as given in ``namespace``. This function is not
|
||||||
|
strictly required, because any Python mechanism for creating a new
|
||||||
|
class with ``__annotations__`` can then apply the ``dataclass``
|
||||||
|
function to convert that class to a Data Class. This function is
|
||||||
|
provided as a convenience. For example::
|
||||||
|
|
||||||
|
C = make_dataclass('C',
|
||||||
|
[('x', int),
|
||||||
|
('y', int, field(default=5))],
|
||||||
|
namespace={'add_one': lambda self: self.x + 1})
|
||||||
|
|
||||||
|
Is equivalent to::
|
||||||
|
|
||||||
|
@dataclass
|
||||||
|
class C:
|
||||||
|
x: int
|
||||||
|
y: int = 5
|
||||||
|
|
||||||
|
def add_one(self):
|
||||||
|
return self.x + 1
|
||||||
|
|
||||||
|
- ``replace(instance, **changes)``: Creates a new object of the same
|
||||||
|
type as ``instance``, replacing fields with values from ``changes``.
|
||||||
|
If ``instance`` is not a Data Class, raises ``TypeError``. If
|
||||||
|
names in ``changes`` do not specify fields, raises ``TypeError``.
|
||||||
|
|
||||||
|
The newly returned object is created by calling the ``__init__``
|
||||||
|
method of the Data Class. This ensures that
|
||||||
|
``__post_init__``, if present, is also called.
|
||||||
|
|
||||||
|
Init-only variables without default values, if any exist, must be
|
||||||
|
specified on the call to ``replace`` so that they can be passed to
|
||||||
|
``__init__`` and ``__post_init__``.
|
||||||
|
|
||||||
|
It is an error for ``changes`` to contain any fields that are
|
||||||
|
defined as having ``init=False``. A ``ValueError`` will be raised
|
||||||
|
in this case.
|
||||||
|
|
||||||
|
Be forewarned about how ``init=False`` fields work during a call to
|
||||||
|
``replace()``. They are not copied from the source object, but
|
||||||
|
rather are initialized in ``__post_init__()``, if they're
|
||||||
|
initialized at all. It is expected that ``init=False`` fields will
|
||||||
|
be rarely and judiciously used. If they are used, it might be wise
|
||||||
|
to have alternate class constructors, or perhaps a custom
|
||||||
|
``replace()`` (or similarly named) method which handles instance
|
||||||
|
copying.
|
||||||
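The helper functions above can be exercised together. The following is a minimal sketch, assuming the ``dataclasses`` module as shipped in Python 3.7 and later, whose details may differ from this draft. The ``Path`` class and its field names are hypothetical and exist only for illustration.

```python
from collections import OrderedDict
from dataclasses import asdict, astuple, dataclass, fields, replace
from typing import List


@dataclass
class Point:
    x: int
    y: int


@dataclass
class Path:  # hypothetical class, not part of the PEP
    points: List[Point]
    closed: bool = False


path = Path([Point(0, 0), Point(10, 4)])

# fields() accepts either the class or an instance.
assert [f.name for f in fields(Path)] == ['points', 'closed']
assert [f.name for f in fields(path)] == ['points', 'closed']

# asdict() and astuple() recurse into nested Data Classes and lists.
assert asdict(path) == {'points': [{'x': 0, 'y': 0}, {'x': 10, 'y': 4}],
                        'closed': False}
assert astuple(path) == ([(0, 0), (10, 4)], False)

# dict_factory replaces the container type used for every converted Data Class.
assert isinstance(asdict(path, dict_factory=OrderedDict), OrderedDict)

# replace() creates a new object via __init__; the original is untouched.
closed_path = replace(path, closed=True)
assert closed_path.closed is True
assert path.closed is False

# A name in changes that is not a field is rejected with TypeError.
try:
    replace(path, colour='red')
except TypeError:
    rejected = True
else:
    rejected = False
assert rejected
```

Because ``replace()`` goes through ``__init__``, the new object shares the ``points`` list with the original unless a new list is passed in ``changes``.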
|
|
||||||
.. _discussion:
|
.. _discussion:
|
||||||
|
|
||||||
|
@ -421,24 +641,16 @@ workarounds:
|
||||||
|
|
||||||
For more discussion, see [#]_.
|
For more discussion, see [#]_.
|
||||||
|
|
||||||
Should post-init take params?
|
Why not just use namedtuple?
|
||||||
-----------------------------
|
----------------------------
|
||||||
|
|
||||||
The post-init function ``__dataclass_post_init__`` takes no
|
- Any namedtuple can be accidentally compared to any other with the
|
||||||
parameters. This was deemed to be simpler than trying to find a
|
same number of fields. For example: ``Point3D(2017, 6, 2) ==
|
||||||
mechanism to optionally pass a parameter to the
|
Date(2017, 6, 2)``. With Data Classes, this would return False.
|
||||||
``__dataclass_post_init__`` function.
|
|
||||||
|
|
||||||
|
- A namedtuple can be accidentally compared to a tuple. For example
|
||||||
Why not just use namedtuple
|
``Point2D(1, 10) == (1, 10)``. With Data Classes, this would return
|
||||||
---------------------------
|
False.
|
||||||
|
|
||||||
- Any namedtuple can be compared to any other with the same number of
|
|
||||||
fields. For example: ``Point3D(2017, 6, 2) == Date(2017, 6, 2)``.
|
|
||||||
With Data Classes, this would return False.
|
|
||||||
|
|
||||||
- A namedtuple can be compared to a tuple. For example ``Point2D(1,
|
|
||||||
10) == (1, 10)``. With Data Classes, this would return False.
|
|
||||||
|
|
||||||
- Instances are always iterable, which can make it difficult to add
|
- Instances are always iterable, which can make it difficult to add
|
||||||
fields. If a library defines::
|
fields. If a library defines::
|
||||||
|
@ -461,16 +673,19 @@ Why not just use namedtuple
|
||||||
- Cannot control which fields are used for ``__init__``, ``__repr__``,
|
- Cannot control which fields are used for ``__init__``, ``__repr__``,
|
||||||
etc.
|
etc.
|
||||||
|
|
||||||
Why not just use typing.NamedTuple
|
- Cannot support combining fields by inheritance.
|
||||||
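The comparison pitfalls listed above can be reproduced directly. This is a minimal sketch, assuming the ``dataclasses`` module as shipped in Python 3.7 and later; the ``DCPoint3D`` and ``DCDate`` names are hypothetical Data Class counterparts of the namedtuples in the text.

```python
from collections import namedtuple
from dataclasses import dataclass

Point3D = namedtuple('Point3D', ['x', 'y', 'z'])
Date = namedtuple('Date', ['year', 'month', 'day'])

# namedtuples compare as plain tuples, so unrelated types can be equal.
assert Point3D(2017, 6, 2) == Date(2017, 6, 2)
assert Point3D(2017, 6, 2) == (2017, 6, 2)


@dataclass
class DCPoint3D:  # hypothetical Data Class equivalent
    x: int
    y: int
    z: int


@dataclass
class DCDate:  # hypothetical Data Class equivalent
    year: int
    month: int
    day: int


# The generated __eq__ only compares instances of the same class.
assert DCPoint3D(2017, 6, 2) != DCDate(2017, 6, 2)
assert DCPoint3D(2017, 6, 2) != (2017, 6, 2)
assert DCPoint3D(2017, 6, 2) == DCPoint3D(2017, 6, 2)
```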
----------------------------------
|
|
||||||
|
Why not just use typing.NamedTuple?
|
||||||
|
-----------------------------------
|
||||||
|
|
||||||
For classes with statically defined fields, it does support similar
|
For classes with statically defined fields, it does support similar
|
||||||
syntax to Data Classes, using type annotations. This produces a
|
syntax to Data Classes, using type annotations. This produces a
|
||||||
namedtuple, so it shares ``namedtuple``'s benefits and some of its
|
namedtuple, so it shares ``namedtuple``'s benefits and some of its
|
||||||
downsides.
|
downsides. Data Classes, unlike ``typing.NamedTuple``, support
|
||||||
|
combining fields via inheritance.
|
||||||
|
|
||||||
Why not just use attrs
|
Why not just use attrs?
|
||||||
----------------------
|
-----------------------
|
||||||
|
|
||||||
- attrs moves faster than could be accommodated if it were moved in to
|
- attrs moves faster than could be accommodated if it were moved in to
|
||||||
the standard library.
|
the standard library.
|
||||||
|
@ -482,30 +697,81 @@ Why not just use attrs
|
||||||
|
|
||||||
For more discussion, see [#]_.
|
For more discussion, see [#]_.
|
||||||
|
|
||||||
Dynamic creation of classes
|
post-init parameters
|
||||||
---------------------------
|
--------------------
|
||||||
|
|
||||||
An earlier version of this PEP and the sample implementation provided
|
In an earlier version of this PEP before ``InitVar`` was added, the
|
||||||
a ``make_class`` function that dynamically created Data Classes. This
|
post-init function ``__post_init__`` never took any parameters.
|
||||||
functionality was later dropped, although it might be added at a later
|
|
||||||
time as a helper function. The ``@dataclass`` decorator does not care
|
The normal way of doing parameterized initialization (and not just
|
||||||
how classes are created, so they could be either statically defined or
|
with Data Classes) is to provide an alternate classmethod constructor.
|
||||||
dynamically defined. For this Data Class::
|
For example::
|
||||||
|
|
||||||
@dataclass
|
@dataclass
|
||||||
class C:
|
class C:
|
||||||
x: int
|
x: int
|
||||||
y: int = field(init=False, default=0)
|
|
||||||
|
|
||||||
Here is one way of dynamically creating the same Data Class::
|
@classmethod
|
||||||
|
def from_file(cls, filename):
|
||||||
|
with open(filename) as fl:
|
||||||
|
file_value = int(fl.read())
|
||||||
|
return cls(file_value)
|
||||||
|
|
||||||
cls_dict = {'__annotations__': OrderedDict(x=int, y=int),
|
c = C.from_file('file.txt')
|
||||||
'y': field(init=False, default=0),
|
|
||||||
}
|
|
||||||
C = dataclass(type('C', (object,), cls_dict))
|
|
||||||
|
|
||||||
How to support mutable default values
|
Because the ``__post_init__`` function is the last thing called in the
|
||||||
-------------------------------------
|
generated ``__init__``, having a classmethod constructor (which can
|
||||||
|
also execute code immediately after constructing the object) is
|
||||||
|
functionally equivalent to being able to pass parameters to a
|
||||||
|
``__post_init__`` function.
|
||||||
|
|
||||||
|
With ``InitVar`` pseudo-fields, ``__post_init__`` functions can now take
|
||||||
|
parameters. They are passed first to ``__init__`` which passes them
|
||||||
|
to ``__post_init__`` where user code can use them as needed.
|
||||||
|
|
||||||
|
The only real difference between alternate classmethod constructors
|
||||||
|
and ``InitVar`` pseudo-fields is in object creation. With
|
||||||
|
``InitVar`` pseudo-fields, calls to ``__init__`` and the module-level
|
||||||
|
``replace()`` function must always specify them. With alternate
|
||||||
|
classmethod constructors the additional initialization parameters are
|
||||||
|
always optional. Which approach is more appropriate will be
|
||||||
|
application-specific, but both approaches are supported.
|
||||||
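The two approaches can be compared side by side. This is a minimal sketch, assuming the ``dataclasses`` module as shipped in Python 3.7 and later; the ``Scaled`` class, its ``factor`` pseudo-field, and the ``doubled`` constructor are hypothetical names chosen for illustration.

```python
from dataclasses import InitVar, dataclass, field, fields


@dataclass
class Scaled:  # hypothetical class, not part of the PEP
    x: int
    factor: InitVar[int]            # init-only: passed to __post_init__, not stored
    value: int = field(init=False)  # computed, never passed to __init__

    def __post_init__(self, factor):
        # InitVar values arrive here in the order they were declared.
        self.value = self.x * factor

    @classmethod
    def doubled(cls, x):
        # Alternate constructor: callers do not supply the extra parameter.
        return cls(x, 2)


a = Scaled(3, factor=10)
b = Scaled.doubled(3)

assert a.value == 30
assert b.value == 6

# InitVar pseudo-fields are not reported by fields().
assert [f.name for f in fields(Scaled)] == ['x', 'value']
```

With the ``InitVar`` form, every direct call to ``Scaled(...)`` must pass ``factor``; the classmethod hides that parameter behind a named constructor.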
|
|
||||||
|
Rejected ideas
|
||||||
|
==============
|
||||||
|
|
||||||
|
Copying ``init=False`` fields after new object creation in replace()
|
||||||
|
--------------------------------------------------------------------
|
||||||
|
|
||||||
|
Fields that are ``init=False`` are by definition not passed to
|
||||||
|
``__init__``, but instead are initialized with a default value, or by
|
||||||
|
calling a default factory function in ``__init__``, or by code in
|
||||||
|
``__post_init__``.
|
||||||
|
|
||||||
|
A previous version of this PEP specified that ``init=False`` fields
|
||||||
|
would be copied from the source object to the newly created object
|
||||||
|
after ``__init__`` returned, but that was deemed to be inconsistent
|
||||||
|
with using ``__init__`` and ``__post_init__`` to initialize the new
|
||||||
|
object. For example, consider this case::
|
||||||
|
|
||||||
|
@dataclass
|
||||||
|
class Square:
|
||||||
|
length: float
|
||||||
|
area: float = field(init=False, default=0.0)
|
||||||
|
|
||||||
|
def __post_init__(self):
|
||||||
|
self.area = self.length * self.length
|
||||||
|
|
||||||
|
s1 = Square(1.0)
|
||||||
|
s2 = replace(s1, length=2.0)
|
||||||
|
|
||||||
|
If ``init=False`` fields were copied from the source to the
|
||||||
|
destination object after ``__post_init__`` is run, then s2 would end
|
||||||
|
up being ``Square(length=2.0, area=1.0)``, instead of the correct
|
||||||
|
``Square(length=2.0, area=4.0)``.
|
||||||
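The behavior described for ``Square`` can be checked with a runnable version of the example. This is a sketch that assumes the ``dataclasses`` module as shipped in Python 3.7 and later. The exception type raised when an ``init=False`` field is passed to ``replace()`` is specified here as ``ValueError``; catching ``TypeError`` as well is an assumption, made in case a given release differs.

```python
from dataclasses import dataclass, field, replace


@dataclass
class Square:
    length: float
    area: float = field(init=False, default=0.0)

    def __post_init__(self):
        # Recomputed on every construction, including via replace().
        self.area = self.length * self.length


s1 = Square(1.0)
s2 = replace(s1, length=2.0)

# area is not copied from s1; __post_init__ derives it from the new length.
assert s1.area == 1.0
assert s2.area == 4.0
assert s2 == Square(2.0)

# Passing an init=False field to replace() is an error.
try:
    replace(s1, area=9.0)
except (ValueError, TypeError):  # TypeError is an assumption, see the note above
    refused = True
else:
    refused = False
assert refused
```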
|
|
||||||
|
Automatically support mutable default values
|
||||||
|
--------------------------------------------
|
||||||
|
|
||||||
One proposal was to automatically copy defaults, so that if a literal
|
One proposal was to automatically copy defaults, so that if a literal
|
||||||
list ``[]`` was a default value, each instance would get a new list.
|
list ``[]`` was a default value, each instance would get a new list.
|
||||||
|
@ -517,6 +783,9 @@ see [#]_.
|
||||||
Examples
|
Examples
|
||||||
========
|
========
|
||||||
|
|
||||||
|
A complicated example
|
||||||
|
---------------------
|
||||||
|
|
||||||
This code exists in a closed source project::
|
This code exists in a closed source project::
|
||||||
|
|
||||||
class Application:
|
class Application:
|
||||||
|
@ -536,10 +805,10 @@ This can be replaced by::
|
||||||
|
|
||||||
@dataclass
|
@dataclass
|
||||||
class Application:
|
class Application:
|
||||||
name: Str
|
name: str
|
||||||
requirements: List
|
requirements: List[Requirement]
|
||||||
constraints: List[str] = field(default_factory=list)
|
constraints: Dict[str, str] = field(default_factory=dict)
|
||||||
path: Str = ''
|
path: str = ''
|
||||||
executable_links: List[str] = field(default_factory=list)
|
executable_links: List[str] = field(default_factory=list)
|
||||||
executable_dir: Tuple[str] = ()
|
executable_dir: Tuple[str] = ()
|
||||||
additional_items: List[str] = field(init=False, default_factory=list)
|
additional_items: List[str] = field(init=False, default_factory=list)
|
||||||
|
@ -555,8 +824,8 @@ of this PEP and code: Ivan Levkivskyi, Guido van Rossum, Hynek
|
||||||
Schlawack, Raymond Hettinger, and Lisa Roach. I thank them for their
|
Schlawack, Raymond Hettinger, and Lisa Roach. I thank them for their
|
||||||
time and expertise.
|
time and expertise.
|
||||||
|
|
||||||
A special mention must be made about the attrs project. It was a true
|
A special mention must be made about the ``attrs`` project. It was a
|
||||||
inspiration for this PEP, and I respect the design decisions they
|
true inspiration for this PEP, and I respect the design decisions they
|
||||||
made.
|
made.
|
||||||
|
|
||||||
References
|
References
|
||||||
|
@ -580,6 +849,9 @@ References
|
||||||
.. [#] Python documentation for __hash__
|
.. [#] Python documentation for __hash__
|
||||||
(https://docs.python.org/3/reference/datamodel.html#object.__hash__)
|
(https://docs.python.org/3/reference/datamodel.html#object.__hash__)
|
||||||
|
|
||||||
|
.. [#] ClassVar discussion in PEP 526
|
||||||
|
(https://www.python.org/dev/peps/pep-0526/#class-and-instance-variable-annotations)
|
||||||
|
|
||||||
.. [#] Start of python-ideas discussion
|
.. [#] Start of python-ideas discussion
|
||||||
(https://mail.python.org/pipermail/python-ideas/2017-May/045618.html)
|
(https://mail.python.org/pipermail/python-ideas/2017-May/045618.html)
|
||||||
|
|
||||||
|
@ -595,6 +867,7 @@ References
|
||||||
.. [#] Copying mutable defaults
|
.. [#] Copying mutable defaults
|
||||||
(https://github.com/ericvsmith/dataclasses/issues/3)
|
(https://github.com/ericvsmith/dataclasses/issues/3)
|
||||||
|
|
||||||
|
|
||||||
Copyright
|
Copyright
|
||||||
=========
|
=========
|
||||||
|
|
||||||
|
|