python-peps/pep-0487.txt

488 lines
19 KiB
Plaintext

PEP: 487
Title: Simpler customisation of class creation
Version: $Revision$
Last-Modified: $Date$
Author: Martin Teichmann <lkb.teichmann@gmail.com>,
Status: Final
Type: Standards Track
Content-Type: text/x-rst
Created: 27-Feb-2015
Python-Version: 3.6
Post-History: 27-Feb-2015, 05-Feb-2016, 24-Jun-2016, 02-Jul-2016, 13-Jul-2016
Replaces: 422
Resolution: https://mail.python.org/pipermail/python-dev/2016-July/145629.html
Abstract
========
Currently, customising class creation requires the use of a custom metaclass.
This custom metaclass then persists for the entire lifecycle of the class,
creating the potential for spurious metaclass conflicts.
This PEP proposes to instead support a wide range of customisation
scenarios through a new ``__init_subclass__`` hook in the class body,
and a hook to initialize attributes.
The new mechanism should be easier to understand and use than
implementing a custom metaclass, and thus should provide a gentler
introduction to the full power of Python's metaclass machinery.
Background
==========
Metaclasses are a powerful tool to customize class creation. They have,
however, the problem that there is no automatic way to combine metaclasses.
If one wants to use two metaclasses for a class, a new metaclass combining
those two needs to be created, typically manually.
This need often occurs as a surprise to a user: inheriting from two base
classes coming from two different libraries suddenly raises the necessity
to manually create a combined metaclass, where typically one is not
interested in those details about the libraries at all. This becomes
even worse if one library starts to make use of a metaclass which it
has not done before. While the library itself continues to work perfectly,
suddenly every code combining those classes with classes from another library
fails.
Proposal
========
While there are many possible ways to use a metaclass, the vast majority
of use cases falls into just three categories: some initialization code
running after class creation, the initialization of descriptors and
keeping the order in which class attributes were defined.
The first two categories can easily be achieved by having simple hooks
into the class creation:
1. An ``__init_subclass__`` hook that initializes
all subclasses of a given class.
2. upon class creation, a ``__set_name__`` hook is called on all the
attribute (descriptors) defined in the class, and
The third category is the topic of another PEP, :pep:`520`.
As an example, the first use case looks as follows::
>>> class QuestBase:
... # this is implicitly a @classmethod (see below for motivation)
... def __init_subclass__(cls, swallow, **kwargs):
... cls.swallow = swallow
... super().__init_subclass__(**kwargs)
>>> class Quest(QuestBase, swallow="african"):
... pass
>>> Quest.swallow
'african'
The base class ``object`` contains an empty ``__init_subclass__``
method which serves as an endpoint for cooperative multiple inheritance.
Note that this method has no keyword arguments, meaning that all
methods which are more specialized have to process all keyword
arguments.
This general proposal is not a new idea (it was first suggested for
inclusion in the language definition `more than 10 years ago`_, and a
similar mechanism has long been supported by `Zope's ExtensionClass`_),
but the situation has changed sufficiently in recent years that
the idea is worth reconsidering for inclusion.
The second part of the proposal adds an ``__set_name__``
initializer for class attributes, especially if they are descriptors.
Descriptors are defined in the body of a
class, but they do not know anything about that class, they do not
even know the name they are accessed with. They do get to know their
owner once ``__get__`` is called, but still they do not know their
name. This is unfortunate, for example they cannot put their
associated value into their object's ``__dict__`` under their name,
since they do not know that name. This problem has been solved many
times, and is one of the most important reasons to have a metaclass in
a library. While it would be easy to implement such a mechanism using
the first part of the proposal, it makes sense to have one solution
for this problem for everyone.
To give an example of its usage, imagine a descriptor representing weak
referenced values::
import weakref
class WeakAttribute:
def __get__(self, instance, owner):
return instance.__dict__[self.name]()
def __set__(self, instance, value):
instance.__dict__[self.name] = weakref.ref(value)
# this is the new initializer:
def __set_name__(self, owner, name):
self.name = name
Such a ``WeakAttribute`` may, for example, be used in a tree structure
where one wants to avoid cyclic references via the parent::
class TreeNode:
parent = WeakAttribute()
def __init__(self, parent):
self.parent = parent
Note that the ``parent`` attribute is used like a normal attribute,
yet the tree contains no cyclic references and can thus be easily
garbage collected when out of use. The ``parent`` attribute magically
becomes ``None`` once the parent ceases existing.
While this example looks very trivial, it should be noted that until
now such an attribute cannot be defined without the use of a metaclass.
And given that such a metaclass can make life very hard, this kind of
attribute does not exist yet.
Initializing descriptors could simply be done in the
``__init_subclass__`` hook. But this would mean that descriptors can
only be used in classes that have the proper hook, the generic version
like in the example would not work generally. One could also call
``__set_name__`` from within the base implementation of
``object.__init_subclass__``. But given that it is a common mistake
to forget to call ``super()``, it would happen too often that suddenly
descriptors are not initialized.
Key Benefits
============
Easier inheritance of definition time behaviour
-----------------------------------------------
Understanding Python's metaclasses requires a deep understanding of
the type system and the class construction process. This is legitimately
seen as challenging, due to the need to keep multiple moving parts (the code,
the metaclass hint, the actual metaclass, the class object, instances of the
class object) clearly distinct in your mind. Even when you know the rules,
it's still easy to make a mistake if you're not being extremely careful.
Understanding the proposed implicit class initialization hook only requires
ordinary method inheritance, which isn't quite as daunting a task. The new
hook provides a more gradual path towards understanding all of the phases
involved in the class definition process.
Reduced chance of metaclass conflicts
-------------------------------------
One of the big issues that makes library authors reluctant to use metaclasses
(even when they would be appropriate) is the risk of metaclass conflicts.
These occur whenever two unrelated metaclasses are used by the desired
parents of a class definition. This risk also makes it very difficult to
*add* a metaclass to a class that has previously been published without one.
By contrast, adding an ``__init_subclass__`` method to an existing type poses
a similar level of risk to adding an ``__init__`` method: technically, there
is a risk of breaking poorly implemented subclasses, but when that occurs,
it is recognised as a bug in the subclass rather than the library author
breaching backwards compatibility guarantees.
New Ways of Using Classes
=========================
Subclass registration
---------------------
Especially when writing a plugin system, one likes to register new
subclasses of a plugin baseclass. This can be done as follows::
class PluginBase:
subclasses = []
def __init_subclass__(cls, **kwargs):
super().__init_subclass__(**kwargs)
cls.subclasses.append(cls)
In this example, ``PluginBase.subclasses`` will contain a plain list of all
subclasses in the entire inheritance tree. One should note that this also
works nicely as a mixin class.
Trait descriptors
-----------------
There are many designs of Python descriptors in the wild which, for
example, check boundaries of values. Often those "traits" need some support
of a metaclass to work. This is how this would look like with this
PEP::
class Trait:
def __init__(self, minimum, maximum):
self.minimum = minimum
self.maximum = maximum
def __get__(self, instance, owner):
return instance.__dict__[self.key]
def __set__(self, instance, value):
if self.minimum < value < self.maximum:
instance.__dict__[self.key] = value
else:
raise ValueError("value not in range")
def __set_name__(self, owner, name):
self.key = name
Implementation Details
======================
The hooks are called in the following order: ``type.__new__`` calls
the ``__set_name__`` hooks on the descriptor after the new class has been
initialized. Then it calls ``__init_subclass__`` on the base class, on
``super()``, to be precise. This means that subclass initializers already
see the fully initialized descriptors. This way, ``__init_subclass__`` users
can fix all descriptors again if this is needed.
Another option would have been to call ``__set_name__`` in the base
implementation of ``object.__init_subclass__``. This way it would be possible
even to prevent ``__set_name__`` from being called. Most of the times,
however, such a prevention would be accidental, as it often happens that a call
to ``super()`` is forgotten.
As a third option, all the work could have been done in ``type.__init__``.
Most metaclasses do their work in ``__new__``, as this is recommended by
the documentation. Many metaclasses modify their arguments before they
pass them over to ``super().__new__``. For compatibility with those kind
of classes, the hooks should be called from ``__new__``.
Another small change should be done: in the current implementation of
CPython, ``type.__init__`` explicitly forbids the use of keyword arguments,
while ``type.__new__`` allows for its attributes to be shipped as keyword
arguments. This is weirdly incoherent, and thus it should be forbidden.
While it would be possible to retain the current behavior, it would be better
if this was fixed, as it is probably not used at all: the only use case would
be that at metaclass calls its ``super().__new__`` with *name*, *bases* and
*dict* (yes, *dict*, not *namespace* or *ns* as mostly used with modern
metaclasses) as keyword arguments. This should not be done. This little
change simplifies the implementation of this PEP significantly, while
improving the coherence of Python overall.
As a second change, the new ``type.__init__`` just ignores keyword
arguments. Currently, it insists that no keyword arguments are given. This
leads to a (wanted) error if one gives keyword arguments to a class declaration
if the metaclass does not process them. Metaclass authors that do want to
accept keyword arguments must filter them out by overriding ``__init__``.
In the new code, it is not ``__init__`` that complains about keyword arguments,
but ``__init_subclass__``, whose default implementation takes no arguments. In
a classical inheritance scheme using the method resolution order, each
``__init_subclass__`` may take out it's keyword arguments until none are left,
which is checked by the default implementation of ``__init_subclass__``.
For readers who prefer reading Python over English, this PEP proposes to
replace the current ``type`` and ``object`` with the following::
class NewType(type):
def __new__(cls, *args, **kwargs):
if len(args) != 3:
return super().__new__(cls, *args)
name, bases, ns = args
init = ns.get('__init_subclass__')
if isinstance(init, types.FunctionType):
ns['__init_subclass__'] = classmethod(init)
self = super().__new__(cls, name, bases, ns)
for k, v in self.__dict__.items():
func = getattr(v, '__set_name__', None)
if func is not None:
func(self, k)
super(self, self).__init_subclass__(**kwargs)
return self
def __init__(self, name, bases, ns, **kwargs):
super().__init__(name, bases, ns)
class NewObject(object):
@classmethod
def __init_subclass__(cls):
pass
Reference Implementation
========================
The reference implementation for this PEP is attached to
`issue 27366 <http://bugs.python.org/issue27366>`__.
Backward compatibility issues
=============================
The exact calling sequence in ``type.__new__`` is slightly changed, raising
fears of backwards compatibility. It should be assured by tests that common use
cases behave as desired.
The following class definitions (except the one defining the metaclass)
continue to fail with a ``TypeError`` as superfluous class arguments are passed::
class MyMeta(type):
pass
class MyClass(metaclass=MyMeta, otherarg=1):
pass
MyMeta("MyClass", (), otherargs=1)
import types
types.new_class("MyClass", (), dict(metaclass=MyMeta, otherarg=1))
types.prepare_class("MyClass", (), dict(metaclass=MyMeta, otherarg=1))
A metaclass defining only a ``__new__`` method which is interested in keyword
arguments now does not need to define an ``__init__`` method anymore, as the
default ``type.__init__`` ignores keyword arguments. This is nicely in line
with the recommendation to override ``__new__`` in metaclasses instead of
``__init__``. The following code does not fail anymore::
class MyMeta(type):
def __new__(cls, name, bases, namespace, otherarg):
return super().__new__(cls, name, bases, namespace)
class MyClass(metaclass=MyMeta, otherarg=1):
pass
Only defining an ``__init__`` method in a metaclass continues to fail with
``TypeError`` if keyword arguments are given::
class MyMeta(type):
def __init__(self, name, bases, namespace, otherarg):
super().__init__(name, bases, namespace)
class MyClass(metaclass=MyMeta, otherarg=1):
pass
Defining both ``__init__`` and ``__new__`` continues to work fine.
About the only thing that stops working is passing the arguments of
``type.__new__`` as keyword arguments::
class MyMeta(type):
def __new__(cls, name, bases, namespace):
return super().__new__(cls, name=name, bases=bases,
dict=namespace)
class MyClass(metaclass=MyMeta):
pass
This will now raise ``TypeError``, but this is weird code, and easy
to fix even if someone used this feature.
Rejected Design Options
=======================
Calling the hook on the class itself
------------------------------------
Adding an ``__autodecorate__`` hook that would be called on the class
itself was the proposed idea of :pep:`422`. Most examples work the same
way or even better if the hook is called only on strict subclasses. In general,
it is much easier to arrange to explicitly call the hook on the class in which it
is defined (to opt-in to such a behavior) than to opt-out (by remember to check for
``cls is __class`` in the hook body), meaning that one does not want the hook to be
called on the class it is defined in.
This becomes most evident if the class in question is designed as a
mixin: it is very unlikely that the code of the mixin is to be
executed for the mixin class itself, as it is not supposed to be a
complete class on its own.
The original proposal also made major changes in the class
initialization process, rendering it impossible to back-port the
proposal to older Python versions.
When it's desired to also call the hook on the base class, two mechanisms are available:
1. Introduce an additional mixin class just to hold the ``__init_subclass__``
implementation. The original "base" class can then list the new mixin as its
first parent class.
2. Implement the desired behaviour as an independent class decorator, and apply that
decorator explicitly to the base class, and then implicitly to subclasses via
``__init_subclass__``.
Calling ``__init_subclass__`` explicitly from a class decorator will generally be
undesirable, as this will also typically call ``__init_subclass__`` a second time on
the parent class, which is unlikely to be desired behaviour.
Other variants of calling the hooks
-----------------------------------
Other names for the hook were presented, namely ``__decorate__`` or
``__autodecorate__``. This proposal opts for ``__init_subclass__`` as
it is very close to the ``__init__`` method, just for the subclass,
while it is not very close to decorators, as it does not return the
class.
For the ``__set_name__`` hook other names have been proposed as well,
``__set_owner__``, ``__set_ownership__`` and ``__init_descriptor__``.
Requiring an explicit decorator on ``__init_subclass__``
--------------------------------------------------------
One could require the explicit use of ``@classmethod`` on the
``__init_subclass__`` decorator. It was made implicit since there's no
sensible interpretation for leaving it out, and that case would need
to be detected anyway in order to give a useful error message.
This decision was reinforced after noticing that the user experience of
defining ``__prepare__`` and forgetting the ``@classmethod`` method
decorator is singularly incomprehensible (particularly since :pep:`3115`
documents it as an ordinary method, and the current documentation doesn't
explicitly say anything one way or the other).
A more ``__new__``-like hook
----------------------------
In :pep:`422` the hook worked more like the ``__new__`` method than the
``__init__`` method, meaning that it returned a class instead of
modifying one. This allows a bit more flexibility, but at the cost
of much harder implementation and undesired side effects.
Adding a class attribute with the attribute order
-------------------------------------------------
This got its own :pep:`520`.
History
=======
This used to be a competing proposal to :pep:`422` by Nick Coghlan and Daniel
Urban. :pep:`422` intended to achieve the same goals as this PEP, but with a
different way of implementation. In the meantime, :pep:`422` has been withdrawn
favouring this approach.
References
==========
.. _more than 10 years ago:
https://mail.python.org/pipermail/python-dev/2001-November/018651.html
.. _Zope's ExtensionClass:
http://docs.zope.org/zope_secrets/extensionclass.html
Copyright
=========
This document has been placed in the public domain.
..
Local Variables:
mode: indented-text
indent-tabs-mode: nil
sentence-end-double-space: t
fill-column: 70
coding: utf-8
End: