python-peps/pep-0560.rst

203 lines
7.2 KiB
ReStructuredText
Raw Normal View History

PEP: 560
Title: Core support for generic types
Author: Ivan Levkivskyi <levkivskyi@gmail.com>
Status: Draft
Type: Standards Track
Content-Type: text/x-rst
Created: 03-Sep-2017
Python-Version: 3.7
Post-History: 09-Sep-2017
Abstract
========
Initially PEP 484 was designed in such way that it would not introduce
*any* changes to the core CPython interpreter. Now type hints and
the ``typing`` module are extensively used by the community, e.g. PEP 526
and PEP 557 extend the usage of type hints, and the backport of ``typing``
on PyPI has 1M downloads/month. Therefore, this restriction can be removed.
It is proposed to add two special methods ``__class_getitem__`` and
``__subclass_base__`` to the core CPython for better support of
generic types.
Rationale
=========
The restriction to not modify the core CPython interpreter lead to some
design decisions that became questionable when the ``typing`` module started
to be widely used. There are three main points of concerns:
performance of the ``typing`` module, metaclass conflicts, and the large
number of hacks currently used in ``typing``.
Performance:
------------
The ``typing`` module is one of the heaviest and slowest modules in
the standard library even with all the optimizations made. Mainly this is
because subscripted generic types (see PEP 484 for definition of terms
used in this PEP) are class objects (see also [1]_). The three main ways how
the performance can be improved with the help of the proposed special methods:
- Creation of generic classes is slow since the ``GenericMeta.__new__`` is
very slow; we will not need it anymore.
- Very long MROs for generic classes will be twice shorter; they are present
because we duplicate the ``collections.abc`` inheritance chain
in ``typing``.
- Time of instantiation of generic classes will be improved
(this is minor however).
Metaclass conflicts:
--------------------
All generic types are instances of ``GenericMeta``, so if a user uses
a custom metaclass, then it is hard to make a corresponding class generic.
This is particularly hard for library classes that a user doesn't control.
A workaround is to always mix-in ``GenericMeta``::
class AdHocMeta(GenericMeta, LibraryMeta):
pass
class UserClass(LibraryBase, Generic[T], metaclass=AdHocMeta):
...
but this is not always practical or even possible. With the help of the
proposed special attributes the ``GenericMeta`` metaclass will not be needed.
Hacks and bugs that will be removed by this proposal:
-----------------------------------------------------
- ``_generic_new`` hack that exists since ``__init__`` is not called on
instances with a type differing form the type whose ``__new__`` was called,
``C[int]().__class__ is C``.
- ``_next_in_mro`` speed hack will be not necessary since subscription will
not create new classes.
- Ugly ``sys._getframe`` hack, this one is particularly nasty, since it looks
like we can't remove it without changes outside ``typing``.
- Currently generics do dangerous things with private ABC caches
to fix large memory consumption that grows at least as O(N\ :sup:`2`),
see [2]_. This point is also important because it was recently proposed to
re-implement ``ABCMeta`` in C.
- Problems with sharing attributes between subscripted generics,
see [3]_. Current solution already uses ``__getattr__`` and ``__setattr__``,
but it is still incomplete, and solving this without the current proposal
will be hard and will need ``__getattribute__``.
- ``_no_slots_copy`` hack, where we clean-up the class dictionary on every
subscription thus allowing generics with ``__slots__``.
- General complexity of the ``typing`` module, the new proposal will not
only allow to remove the above mentioned hacks/bugs, but also simplify
the implementation, so that it will be easier to maintain.
Specification
=============
The idea of ``__class_getitem__`` is simple: it is an exact analog of
``__getitem__`` with an exception that it is called on a class that
defines it, not on its instances, this allows us to avoid
``GenericMeta.__getitem__`` for things like ``Iterable[int]``.
The ``__class_getitem__`` is automatically a class method and
does not require ``@classmethod`` decorator (similar to
``__init_subclass__``) and is inherited like normal attributes.
For example::
class MyList:
def __getitem__(self, index):
return index + 1
def __class_getitem__(cls, item):
return f"{cls.__name__}[{item.__name__}]"
class MyOtherList(MyList):
pass
assert MyList()[0] == 1
assert MyList[int] == "MyList[int]"
assert MyOtherList()[0] == 1
assert MyOtherList[int] == "MyOtherList[int]"
Note that this method is used as a fallback, so if a metaclass defines
``__getitem__``, then that will have the priority.
If an object that is not a class object appears in the bases of a class
definition, the ``__subclass_base__`` is searched on it. If found,
it is called with the original tuple of bases as an argument. If the result
of the call is not ``None``, then it is substituted instead of this object.
Otherwise (if the result is ``None``), the base is just removed. This is
necessary to avoid inconsistent MRO errors, that are currently prevented by
manipulations in ``GenericMeta.__new__``. After creating the class,
the original bases are saved in ``__orig_bases__`` (currently this is also
done by the metaclass).
NOTE: These two method names are reserved for exclusive use by
the ``typing`` module and the generic types machinery, and any other use is
strongly discouraged. The reference implementation (with tests) can be found
in [4]_, the proposal was originally posted and discussed on
the ``typing`` tracker, see [5]_.
Backwards compatibility and impact on users who don't use ``typing``:
=====================================================================
This proposal may break code that currently uses the names
``__class_getitem__`` and ``__subclass_base__``.
This proposal will support almost complete backwards compatibility with
the current public generic types API; moreover the ``typing`` module is still
provisional. The only two exceptions are that currently
``issubclass(List[int], List)`` returns True, with this proposal it will raise
``TypeError``. Also ``issubclass(collections.abc.Iterable, typing.Iterable)``
will return ``False``, which is probably desirable, since currently we have
a (virtual) inheritance cycle between these two classes.
With the reference implementation I measured negligible performance effects
(under 1% on a micro-benchmark) for regular (non-generic) classes.
References
==========
.. [1] Discussion following Mark Shannon's presentation at Language Summit
(https://github.com/python/typing/issues/432)
.. [2] Pull Request to implement shared generic ABC caches
(https://github.com/python/typing/pull/383)
.. [3] An old bug with setting/accessing attributes on generic types
(https://github.com/python/typing/issues/392)
.. [4] The reference implementation
(https://github.com/ilevkivskyi/cpython/pull/2/files)
.. [5] Original proposal
(https://github.com/python/typing/issues/468)
Copyright
=========
This document has been placed in the public domain.
..
Local Variables:
mode: indented-text
indent-tabs-mode: nil
sentence-end-double-space: t
fill-column: 70
coding: utf-8
End: