From 1c3d85cc5e38d89b7c3bf319a512df8c28f2076d Mon Sep 17 00:00:00 2001
From: Ivan Levkivskyi
Date: Sun, 10 Sep 2017 20:53:54 +0200
Subject: [PATCH] PEP 560: Core support for generic types (#411)

---
 pep-0560.rst | 202 +++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 202 insertions(+)
 create mode 100644 pep-0560.rst

diff --git a/pep-0560.rst b/pep-0560.rst
new file mode 100644
index 000000000..ecc4001eb
--- /dev/null
+++ b/pep-0560.rst
@@ -0,0 +1,202 @@
+PEP: 560
+Title: Core support for generic types
+Author: Ivan Levkivskyi
+Status: Draft
+Type: Standards Track
+Content-Type: text/x-rst
+Created: 03-Sep-2017
+Python-Version: 3.7
+Post-History: 09-Sep-2017
+
+
+Abstract
+========
+
+Initially PEP 484 was designed in such a way that it would not introduce
+*any* changes to the core CPython interpreter. Now type hints and
+the ``typing`` module are extensively used by the community, e.g. PEP 526
+and PEP 557 extend the usage of type hints, and the backport of ``typing``
+on PyPI has 1M downloads/month. Therefore, this restriction can be removed.
+It is proposed to add two special methods ``__class_getitem__`` and
+``__subclass_base__`` to the core CPython for better support of
+generic types.
+
+
+Rationale
+=========
+
+The restriction to not modify the core CPython interpreter led to some
+design decisions that became questionable when the ``typing`` module started
+to be widely used. There are three main points of concern:
+performance of the ``typing`` module, metaclass conflicts, and the large
+number of hacks currently used in ``typing``.
+
+
+Performance:
+------------
+
+The ``typing`` module is one of the heaviest and slowest modules in
+the standard library even with all the optimizations made. Mainly this is
+because subscripted generic types (see PEP 484 for definition of terms
+used in this PEP) are class objects (see also [1]_).
+The three main ways in which
+performance can be improved with the help of the proposed special methods:
+
+- Creation of generic classes is slow since ``GenericMeta.__new__`` is
+  very slow; we will not need it anymore.
+
+- Very long MROs for generic classes will be half as long; they are present
+  because we duplicate the ``collections.abc`` inheritance chain
+  in ``typing``.
+
+- Time of instantiation of generic classes will be improved
+  (this is minor, however).
+
+
+Metaclass conflicts:
+--------------------
+
+All generic types are instances of ``GenericMeta``, so if a user uses
+a custom metaclass, then it is hard to make a corresponding class generic.
+This is particularly hard for library classes that a user doesn't control.
+A workaround is to always mix in ``GenericMeta``::
+
+  class AdHocMeta(GenericMeta, LibraryMeta):
+      pass
+
+  class UserClass(LibraryBase, Generic[T], metaclass=AdHocMeta):
+      ...
+
+but this is not always practical or even possible. With the help of the
+proposed special attributes the ``GenericMeta`` metaclass will not be needed.
+
+
+Hacks and bugs that will be removed by this proposal:
+-----------------------------------------------------
+
+- ``_generic_new`` hack that exists since ``__init__`` is not called on
+  instances with a type differing from the type whose ``__new__`` was called,
+  ``C[int]().__class__ is C``.
+
+- ``_next_in_mro`` speed hack will not be necessary since subscription will
+  not create new classes.
+
+- Ugly ``sys._getframe`` hack, this one is particularly nasty, since it looks
+  like we can't remove it without changes outside ``typing``.
+
+- Currently generics do dangerous things with private ABC caches
+  to fix large memory consumption that grows at least as O(N\ :sup:`2`),
+  see [2]_. This point is also important because it was recently proposed to
+  re-implement ``ABCMeta`` in C.
+
+- Problems with sharing attributes between subscripted generics,
+  see [3]_.
+  The current solution already uses ``__getattr__`` and ``__setattr__``,
+  but it is still incomplete, and solving this without the current proposal
+  will be hard and will need ``__getattribute__``.
+
+- ``_no_slots_copy`` hack, where we clean up the class dictionary on every
+  subscription, thus allowing generics with ``__slots__``.
+
+- General complexity of the ``typing`` module: the new proposal will not
+  only make it possible to remove the above-mentioned hacks/bugs, but also
+  simplify the implementation, so that it will be easier to maintain.
+
+
+Specification
+=============
+
+The idea of ``__class_getitem__`` is simple: it is an exact analog of
+``__getitem__`` with the exception that it is called on a class that
+defines it, not on its instances. This allows us to avoid
+``GenericMeta.__getitem__`` for things like ``Iterable[int]``.
+The ``__class_getitem__`` is automatically a class method and
+does not require the ``@classmethod`` decorator (similar to
+``__init_subclass__``) and is inherited like normal attributes.
+For example::
+
+  class MyList:
+      def __getitem__(self, index):
+          return index + 1
+      def __class_getitem__(cls, item):
+          return f"{cls.__name__}[{item.__name__}]"
+
+  class MyOtherList(MyList):
+      pass
+
+  assert MyList()[0] == 1
+  assert MyList[int] == "MyList[int]"
+
+  assert MyOtherList()[0] == 1
+  assert MyOtherList[int] == "MyOtherList[int]"
+
+Note that this method is used as a fallback, so if a metaclass defines
+``__getitem__``, then that will have the priority.
+
+If an object that is not a class object appears in the bases of a class
+definition, the ``__subclass_base__`` is looked up on it. If found,
+it is called with the original tuple of bases as an argument. If the result
+of the call is not ``None``, then it is substituted instead of this object.
+Otherwise (if the result is ``None``), the base is just removed. This is
+necessary to avoid inconsistent MRO errors that are currently prevented by
+manipulations in ``GenericMeta.__new__``.
+After creating the class,
+the original bases are saved in ``__orig_bases__`` (currently this is also
+done by the metaclass).
+
+NOTE: These two method names are reserved for exclusive use by
+the ``typing`` module and the generic types machinery, and any other use is
+strongly discouraged. The reference implementation (with tests) can be found
+in [4]_; the proposal was originally posted and discussed on
+the ``typing`` tracker, see [5]_.
+
+
+Backwards compatibility and impact on users who don't use ``typing``:
+=====================================================================
+
+This proposal may break code that currently uses the names
+``__class_getitem__`` and ``__subclass_base__``.
+
+This proposal will support almost complete backwards compatibility with
+the current public generic types API; moreover the ``typing`` module is still
+provisional. There are only two exceptions. Currently
+``issubclass(List[int], List)`` returns ``True``; with this proposal it will
+raise ``TypeError``. Also ``issubclass(collections.abc.Iterable, typing.Iterable)``
+will return ``False``, which is probably desirable, since currently we have
+a (virtual) inheritance cycle between these two classes.
+
+With the reference implementation I measured negligible performance effects
+(under 1% on a micro-benchmark) for regular (non-generic) classes.
+
+
+References
+==========
+
+.. [1] Discussion following Mark Shannon's presentation at Language Summit
+   (https://github.com/python/typing/issues/432)
+
+.. [2] Pull Request to implement shared generic ABC caches
+   (https://github.com/python/typing/pull/383)
+
+.. [3] An old bug with setting/accessing attributes on generic types
+   (https://github.com/python/typing/issues/392)
+
+.. [4] The reference implementation
+   (https://github.com/ilevkivskyi/cpython/pull/2/files)
+
+.. [5] Original proposal
+   (https://github.com/python/typing/issues/468)
+
+
+Copyright
+=========
+
+This document has been placed in the public domain.
+
+
+
+..
+   Local Variables:
+   mode: indented-text
+   indent-tabs-mode: nil
+   sentence-end-double-space: t
+   fill-column: 70
+   coding: utf-8
+   End: