Update PEP 575 (#640)

jdemeyer 2018-05-03 20:02:50 +02:00 committed by Chris Angelico
parent 23dfb58c72
commit 520405ff24
1 changed files with 126 additions and 101 deletions

@ -21,8 +21,8 @@ without sacrificing performance.
A new base class ``base_function`` is introduced and the various function A new base class ``base_function`` is introduced and the various function
classes, as well as ``method`` (renamed to ``bound_method``), inherit from it. classes, as well as ``method`` (renamed to ``bound_method``), inherit from it.
We also allow subclassing in some cases: We also allow subclassing the Python ``function`` class.
in particular the Python ``function`` class can be subclassed.
Motivation Motivation
========== ==========
@ -61,6 +61,7 @@ as true built-in functions.
All functions can access the function object All functions can access the function object
(the ``self`` in ``__call__``), paving the way for PEP 573. (the ``self`` in ``__call__``), paving the way for PEP 573.
New classes New classes
=========== ===========
@ -74,7 +75,7 @@ This is the new class hierarchy for functions and methods::
/ | \ / | \
/ | defined_function / | defined_function
/ | \ / | \
builtin_function (*) | \ cfunction (*) | \
| function | function
| |
bound_method (*) bound_method (*)
@ -135,7 +136,7 @@ but with the following differences and new features:
It is still needed in a few places though, for example `profiling`_. It is still needed in a few places though, for example `profiling`_.
#. A new flag ``METH_CUSTOM`` for ``ml_flags`` which prevents automatic #. A new flag ``METH_CUSTOM`` for ``ml_flags`` which prevents automatic
generation of a ``builtin_function``, see `automatic creation of built-in functions`_. generation of a ``cfunction``, see `automatic creation of built-in functions`_.
The goal of ``base_function`` is that it supports all different ways The goal of ``base_function`` is that it supports all different ways
of calling functions and methods in just one structure. of calling functions and methods in just one structure.
@ -175,10 +176,11 @@ then there is no ``__self__`` attribute at all.
For that reason, we write either ``m_self`` or ``__self__`` in this PEP For that reason, we write either ``m_self`` or ``__self__`` in this PEP
with slightly different meanings. with slightly different meanings.
builtin_function cfunction
---------------- ---------
This is a copy of ``base_function``, with the following differences: This is the new version of the old ``builtin_function_or_method`` class.
It is a copy of ``base_function``, with the following differences:
#. ``m_ml`` points to a ``PyMethodDef`` structure, #. ``m_ml`` points to a ``PyMethodDef`` structure,
extending ``PyCFunctionDef`` with an additional ``ml_doc`` extending ``PyCFunctionDef`` with an additional ``ml_doc``
@ -208,42 +210,37 @@ and we define ``PyCFunctionObject`` as alias of ``PyBaseFunctionObject``
defined_function defined_function
---------------- ----------------
The class ``defined_function`` (a subclass of ``base_function``) adds The class ``defined_function`` is an abstract base class meant
support for various standard attributes which are used by ``inspect``. to indicate that the function has introspection support.
This would be a good class to use for auto-generated C code, for example produced by Cython [#cython]_. Instances of ``defined_function`` are required to support all attributes
that Python functions have, namely
``__code__``, ``__globals__``, ``__doc__``,
``__defaults__``, ``__kwdefaults__``, ``__closure__`` and ``__annotations__``.
There is also a ``__dict__`` to support attributes added by the user.
The layout of the C structure is as follows:: None of these is required to be meaningful.
In particular, ``__code__`` may not be a working code object,
possibly only a few fields may be filled in.
This PEP does not dictate how the various attributes are implemented.
They may be simple struct members or more complicated descriptors.
Only read-only support is required; none of the attributes is required to be writable.
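The attribute requirement above can be illustrated in today's Python. This is a minimal sketch, not part of the PEP: the helper ``has_introspection_attributes`` and the tuple ``REQUIRED_ATTRIBUTES`` are hypothetical names, and the check is only a duck-typing approximation of what ``defined_function`` is meant to guarantee.

```python
# Attribute names copied from the list in this section.
REQUIRED_ATTRIBUTES = (
    "__code__", "__globals__", "__doc__", "__defaults__",
    "__kwdefaults__", "__closure__", "__annotations__",
)


def has_introspection_attributes(obj):
    """Hypothetical helper: True if obj exposes every listed attribute."""
    return all(hasattr(obj, name) for name in REQUIRED_ATTRIBUTES)


def example(x, y=1, *, z=2):
    return x + y + z


# A Python function exposes all of the attributes.
print(has_introspection_attributes(example))
# A built-in function such as len lacks, for example, __code__.
print(has_introspection_attributes(len))
```

Only the presence of the attributes is tested, in line with the statement that none of them is required to be meaningful or writable.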
The class ``defined_function`` is mainly meant for auto-generated C code,
for example produced by Cython [#cython]_.
There is no API to create instances of it.
The C structure is the following::
PyTypeObject PyDefinedFunction_Type; PyTypeObject PyDefinedFunction_Type;
typedef struct { typedef struct {
PyBaseFunctionObject base; PyBaseFunctionObject base;
PyObject *func_code; /* __code__: code */
PyObject *func_globals; /* __globals__: anything; readonly */
PyObject *func_name; /* __name__: string */
PyObject *func_qualname; /* __qualname__: string */
PyObject *func_doc; /* __doc__: can be anything or NULL */
PyObject *func_defaults; /* __defaults__: tuple or NULL */
PyObject *func_kwdefaults; /* __kwdefaults__: dict or NULL */
PyObject *func_closure; /* __closure__: tuple of cell objects or NULL; readonly */
PyObject *func_annotations; /* __annotations__: dict or NULL */
PyObject *func_dict; /* __dict__: dict or NULL */ PyObject *func_dict; /* __dict__: dict or NULL */
} PyDefinedFunctionObject; } PyDefinedFunctionObject;
The descriptor ``__name__`` returns ``func_name``. **TODO**: maybe find a better name for ``defined_function``.
When setting ``__name__``, also ``base.m_ml->ml_name`` is updated Other proposals: ``inspect_function`` (anything that satisfies ``inspect.isfunction``),
with the UTF-8 encoded name. ``builtout_function`` (a function that is better built out; pun on builtin),
None of the attributes is required to be meaningful.
In particular, ``__code__`` may not be a working code object,
possibly only a few fields may be filled in.
And ``__defaults__`` is not required to be used for calling the function.
Apart from adding these extra attributes,
``defined_function`` behaves exactly the same as ``base_function``.
**TODO**: find a better name for ``defined_function``.
Other proposals: ``builtout_function`` (a function that is better built out; pun on builtin),
``generic_function`` (original proposal but conflicts with ``functools.singledispatch`` generic functions), ``generic_function`` (original proposal but conflicts with ``functools.singledispatch`` generic functions),
``user_function`` (defined by the user as opposed to CPython). ``user_function`` (defined by the user as opposed to CPython).
@ -255,12 +252,13 @@ Unlike the other function types,
instances of ``function`` can be created from Python code. instances of ``function`` can be created from Python code.
This is not changed, so we do not describe the details in this PEP. This is not changed, so we do not describe the details in this PEP.
The layout of the C structure is almost the same as ``defined_function``:: The layout of the C structure is the following::
PyTypeObject PyFunction_Type; PyTypeObject PyFunction_Type;
typedef struct { typedef struct {
PyBaseFunctionObject base; PyBaseFunctionObject base;
PyObject *func_dict; /* __dict__: dict or NULL */
PyObject *func_code; /* __code__: code */ PyObject *func_code; /* __code__: code */
PyObject *func_globals; /* __globals__: dict; readonly */ PyObject *func_globals; /* __globals__: dict; readonly */
PyObject *func_name; /* __name__: string */ PyObject *func_name; /* __name__: string */
@ -270,12 +268,14 @@ The layout of the C structure is almost the same as ``defined_function``::
PyObject *func_kwdefaults; /* __kwdefaults__: dict or NULL */ PyObject *func_kwdefaults; /* __kwdefaults__: dict or NULL */
PyObject *func_closure; /* __closure__: tuple of cell objects or NULL; readonly */ PyObject *func_closure; /* __closure__: tuple of cell objects or NULL; readonly */
PyObject *func_annotations; /* __annotations__: dict or NULL */ PyObject *func_annotations; /* __annotations__: dict or NULL */
PyObject *func_dict; /* __dict__: dict or NULL */
PyCFunctionDef _ml; /* Storage for base.m_ml */ PyCFunctionDef _ml; /* Storage for base.m_ml */
} PyFunctionObject; } PyFunctionObject;
The only difference is an ``_ml`` field The descriptor ``__name__`` returns ``func_name``.
which reserves space to be used by ``base.m_ml``. When setting ``__name__``, also ``base.m_ml->ml_name`` is updated
with the UTF-8 encoded name.
The ``_ml`` field reserves space to be used by ``base.m_ml``.
When constructing an instance of ``function`` from ``code`` and ``globals``, When constructing an instance of ``function`` from ``code`` and ``globals``,
an instance is created with ``base.m_ml = &_ml``, an instance is created with ``base.m_ml = &_ml``,
@ -284,8 +284,17 @@ Instances of ``function`` should always have the flag ``METH_PYTHON`` set.
This is also handled by the constructors. This is also handled by the constructors.
To make subclassing easier, we also add a copy constructor: To make subclassing easier, we also add a copy constructor:
if ``f`` is an instance of ``defined_function`` with the ``METH_PYTHON`` if ``f`` is an instance of ``function``, then ``types.FunctionType(f)`` copies ``f``.
flag set, then ``types.FunctionType(f)`` copies ``f``. This conveniently allows using a custom function type as decorator::
>>> from types import FunctionType
>>> class CustomFunction(FunctionType):
... pass
>>> @CustomFunction
... def f(x):
... return x
>>> type(f)
<class '__main__.CustomFunction'>
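In CPython versions without this PEP, ``types.FunctionType`` cannot be subclassed and has no copy constructor. As a rough sketch of what copying a function involves, the existing constructor can be used to rebuild a function from its parts. The helper name ``copy_function`` is hypothetical and this is not how the PEP implements the copy.

```python
import types


def copy_function(f):
    """Hypothetical helper: build a new function object from the parts of f."""
    g = types.FunctionType(
        f.__code__, f.__globals__, f.__name__, f.__defaults__, f.__closure__
    )
    # Attributes that the constructor does not accept are copied by hand.
    g.__kwdefaults__ = f.__kwdefaults__
    g.__qualname__ = f.__qualname__
    g.__doc__ = f.__doc__
    g.__dict__.update(f.__dict__)
    return g


def f(x, y=10):
    """Add two numbers."""
    return x + y


g = copy_function(f)
print(g is f)   # a distinct object
print(g(1))     # same behaviour as f
```

The copy shares the code object and globals with the original, so only the function object itself is duplicated.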
bound_method bound_method
------------ ------------
@ -328,7 +337,6 @@ The C structure is::
} PyMethodObject; } PyMethodObject;
Calling base_function instances Calling base_function instances
=============================== ===============================
@ -429,14 +437,14 @@ Then ``args`` is an array of length 3 + 2 = 5, ``nargs`` equals 3 and ``kwnames`
Automatic creation of built-in functions Automatic creation of built-in functions
======================================== ========================================
Python automatically generates instances of ``builtin_function`` Python automatically generates instances of ``cfunction``
for extension types (using the ``PyTypeObject.tp_methods`` field) and modules for extension types (using the ``PyTypeObject.tp_methods`` field) and modules
(using the ``PyModuleDef.m_methods`` field). (using the ``PyModuleDef.m_methods`` field).
The arrays ``PyTypeObject.tp_methods`` and ``PyModuleDef.m_methods`` The arrays ``PyTypeObject.tp_methods`` and ``PyModuleDef.m_methods``
must be arrays of ``PyMethodDef`` structures. must be arrays of ``PyMethodDef`` structures.
If the ``METH_CUSTOM`` flag is set for an element of such an array, If the ``METH_CUSTOM`` flag is set for an element of such an array,
then no ``builtin_function`` will be generated. then no ``cfunction`` will be generated.
This allows an application to customize the creation of functions This allows an application to customize the creation of functions
in an extension type or module. in an extension type or module.
If ``METH_CUSTOM`` is set, then ``METH_STATIC`` and ``METH_CLASS`` are ignored. If ``METH_CUSTOM`` is set, then ``METH_STATIC`` and ``METH_CLASS`` are ignored.
@ -445,7 +453,7 @@ Unbound methods of extension types
---------------------------------- ----------------------------------
The type of unbound methods changes from ``method_descriptor`` The type of unbound methods changes from ``method_descriptor``
to ``builtin_function``. to ``cfunction``.
The object which appears as unbound method is the same object which The object which appears as unbound method is the same object which
appears in the class ``__dict__``. appears in the class ``__dict__``.
Python automatically sets the ``__parent__`` attribute to the defining class. Python automatically sets the ``__parent__`` attribute to the defining class.
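For comparison, the behaviour before this change can be inspected in any current CPython. The following sketch only uses existing attributes. The ``__parent__`` attribute proposed here does not exist there, so the existing ``__objclass__`` attribute is shown as the closest present-day analogue (an assumption made for illustration).

```python
unbound = list.append

# Today, unbound methods of extension types are method descriptors.
print(type(unbound).__name__)

# The object seen as unbound method is the one stored in the class __dict__.
print(unbound is list.__dict__["append"])

# Existing analogue of the proposed __parent__: the defining class.
print(unbound.__objclass__ is list)

# Binding to an instance currently gives a builtin_function_or_method.
print(type([].append).__name__)
```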
@ -477,16 +485,17 @@ New type flag
A new ``PyTypeObject`` flag (for ``tp_flags``) is added: A new ``PyTypeObject`` flag (for ``tp_flags``) is added:
``Py_TPFLAGS_BASEFUNCTION`` to indicate that instances of this type are ``Py_TPFLAGS_BASEFUNCTION`` to indicate that instances of this type are
functions which can be called as a ``base_function``. functions which can be called and bound as method like a ``base_function``.
In other words, subclasses of ``base_function``
which follow the implementation from the section `Calling base_function instances`_.
This is different from flags like ``Py_TPFLAGS_LIST_SUBCLASS`` This is different from flags like ``Py_TPFLAGS_LIST_SUBCLASS``
because it indicates more than just a subclass: because it indicates more than just a subclass:
it also indicates a default implementation of ``__call__``. it also indicates a default implementation of ``__call__`` and ``__get__``.
In particular, such subclasses of ``base_function``
must follow the implementation from the section `Calling base_function instances`_.
This flag is never inherited. This flag is never inherited.
However, extension types can explicitly specify it if they Extension types should explicitly specify it if they
do not override ``__call__`` or if they override ``__call__`` in a compatible way. do not override ``__call__`` nor ``__get__`` or if they override them in a compatible way.
The flag ``Py_TPFLAGS_BASEFUNCTION`` must never be set for a heap type The flag ``Py_TPFLAGS_BASEFUNCTION`` must never be set for a heap type
because that would not be safe (heap types can be changed dynamically). because that would not be safe (heap types can be changed dynamically).
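The remark about heap types can be illustrated from Python. This is a sketch with a made-up class ``Heap``: a class created by a ``class`` statement is a heap type, so its ``__call__`` can be replaced after instances exist, whereas a static type such as ``list`` rejects attribute assignment.

```python
class Heap:
    def __call__(self):
        return "original"


obj = Heap()
first = obj()

# Heap types can be changed dynamically, even after instances were created.
Heap.__call__ = lambda self: "replaced"
second = obj()

# Static (non-heap) types refuse such changes.
try:
    list.append = None
    static_is_mutable = True
except TypeError:
    static_is_mutable = False

print(first, second, static_is_mutable)
```

Because ``__call__`` of a heap type can change at any time, a flag promising a fixed ``__call__`` implementation could become false after the type was created.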
@ -509,10 +518,10 @@ Some of these are existing (possibly changed) functions, some are new:
from the given data. from the given data.
- ``int PyCFunction_Check(PyObject *op)``: return true if ``op`` - ``int PyCFunction_Check(PyObject *op)``: return true if ``op``
is an instance of ``builtin_function``. is an instance of ``cfunction``.
- ``int PyCFunction_NewEx(PyMethodDef* ml, PyObject *self, PyObject* module)``: - ``int PyCFunction_NewEx(PyMethodDef* ml, PyObject *self, PyObject* module)``:
create a new instance of ``builtin_function``. create a new instance of ``cfunction``.
As a special case, if ``self`` is ``NULL``, As a special case, if ``self`` is ``NULL``,
then set ``self = Py_None`` instead (for backwards compatibility). then set ``self = Py_None`` instead (for backwards compatibility).
If ``self`` is a module, then ``__parent__`` is set to ``self``. If ``self`` is a module, then ``__parent__`` is set to ``self``.
@ -524,7 +533,10 @@ Some of these are existing (possibly changed) functions, some are new:
The old functions are kept as aliases of the new functions. The old functions are kept as aliases of the new functions.
- ``int PyFunction_Check(PyObject *op)``: return true if ``op`` - ``int PyFunction_Check(PyObject *op)``: return true if ``op``
is an instance of ``defined_function``. is an instance of ``function``.
- ``int PyFunction_CheckExact(PyObject *op)``: return true
if the type of ``op`` is ``function``.
- ``PyObject *PyFunction_NewPython(PyTypeObject *cls, PyObject *code, PyObject *globals, PyObject *name, PyObject *qualname)``: - ``PyObject *PyFunction_NewPython(PyTypeObject *cls, PyObject *code, PyObject *globals, PyObject *name, PyObject *qualname)``:
create a new instance of ``cls`` (which must be a subclass of ``function``) create a new instance of ``cls`` (which must be a subclass of ``function``)
@ -538,10 +550,7 @@ Some of these are existing (possibly changed) functions, some are new:
- ``PyObject *PyFunction_Copy(PyTypeObject *cls, PyObject *func)``: - ``PyObject *PyFunction_Copy(PyTypeObject *cls, PyObject *func)``:
create a new instance of ``cls`` (which must be a subclass of ``function``) create a new instance of ``cls`` (which must be a subclass of ``function``)
by copying a given ``defined_function``. by copying a given ``function``.
- All other existing ``PyFunction_...`` functions now act on ``defined_function``
instances (instead of ``function``).
Changes to the types module Changes to the types module
--------------------------- ---------------------------
@ -563,7 +572,7 @@ The new function ``inspect.isbasefunction`` checks for an instance of ``base_fun
``inspect.isfunction`` checks for an instance of ``defined_function``. ``inspect.isfunction`` checks for an instance of ``defined_function``.
``inspect.isbuiltin`` checks for an instance of ``builtin_function``. ``inspect.isbuiltin`` checks for an instance of ``cfunction``.
``inspect.isroutine`` checks ``isbasefunction`` or ``ismethoddescriptor``. ``inspect.isroutine`` checks ``isbasefunction`` or ``ismethoddescriptor``.
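As a point of reference, this is how the existing ``inspect`` predicates classify a Python function, a built-in function and an unbound method of an extension type in current CPython, that is, before the changes described here. The sample names are chosen only for illustration.

```python
import inspect


def py_func():
    pass


samples = {"py_func": py_func, "len": len, "str.upper": str.upper}

# Collect (isfunction, isbuiltin, ismethoddescriptor, isroutine) per object.
results = {
    name: (
        inspect.isfunction(obj),
        inspect.isbuiltin(obj),
        inspect.ismethoddescriptor(obj),
        inspect.isroutine(obj),
    )
    for name, obj in samples.items()
}

for name, flags in results.items():
    print(name, flags)
```

All three objects are routines, but each is recognised by a different specific predicate.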
@ -658,12 +667,11 @@ This is done by thinking of the most likely use case:
2. ``inspect.isfunction()`` refers to ``defined_function`` 2. ``inspect.isfunction()`` refers to ``defined_function``
because this is the class where introspection is supported. because this is the class where introspection is supported.
3. The C API functions ``PyFunction_New...`` 3. The C API functions must refer to ``function`` because
refer to ``function`` simply because one cannot create instances we do not specify how the various attributes of ``defined_function``
of ``defined_function``. are implemented.
We expect that this is not a problem since there is typically no
4. The C API functions ``PyFunction_Check`` and ``PyFunction_Get/Set...`` reason for introspection to be implemented by C extensions.
refer to ``defined_function`` because all attributes exist for instances of ``defined_function``.
Scope of this PEP: which classes are involved? Scope of this PEP: which classes are involved?
---------------------------------------------- ----------------------------------------------
@ -729,21 +737,25 @@ Two implementations of __doc__
------------------------------ ------------------------------
``base_function`` does not support function docstrings. ``base_function`` does not support function docstrings.
Instead, the classes ``builtin_function`` and ``defined_function`` Instead, the classes ``cfunction`` and ``function``
each have their own way of dealing with docstrings each have their own way of dealing with docstrings
(and ``bound_method`` just takes the ``__doc__`` from the wrapped function). (and ``bound_method`` just takes the ``__doc__`` from the wrapped function).
For ``builtin_function``, the docstring is stored (together with the text signature) For ``cfunction``, the docstring is stored (together with the text signature)
as C string in the read-only ``ml_doc`` field of a ``PyMethodDef``. as C string in the read-only ``ml_doc`` field of a ``PyMethodDef``.
For ``defined_function``, the docstring is stored as a writable Python object For ``function``, the docstring is stored as a writable Python object
and it does not actually need to be a string. and it does not actually need to be a string.
It looks hard to unify these two very different ways of dealing with ``__doc__``. It looks hard to unify these two very different ways of dealing with ``__doc__``.
For backwards compatibility, we keep the existing implementations. For backwards compatibility, we keep the existing implementations.
For ``defined_function``, we require ``__doc__`` to be implemented
but we do not specify how. A subclass can implement ``__doc__`` the
same way as ``cfunction`` or using a struct member or some other way.
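The two existing docstring implementations can be observed in current CPython. The sketch below shows that ``__doc__`` of a Python function is writable and accepts any object, while ``__doc__`` of a built-in function is read-only. The variable names are illustrative only.

```python
def f():
    """Original docstring."""


# For Python functions, __doc__ is writable and need not be a string.
f.__doc__ = 42

# For built-in functions, __doc__ comes from the C-level ml_doc string
# and cannot be assigned.
try:
    len.__doc__ = "new docstring"
    builtin_doc_writable = True
except (AttributeError, TypeError):
    builtin_doc_writable = False

print(f.__doc__, builtin_doc_writable, type(len.__doc__).__name__)
```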
Subclassing Subclassing
----------- -----------
We disallow subclassing of ``builtin_function`` and ``bound_method`` We disallow subclassing of ``cfunction`` and ``bound_method``
to enable fast type checks for ``PyCFunction_Check`` and ``PyMethod_Check``. to enable fast type checks for ``PyCFunction_Check`` and ``PyMethod_Check``.
We allow subclassing of the other classes because there is no reason to disallow it. We allow subclassing of the other classes because there is no reason to disallow it.
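For context, current CPython (without this PEP) disallows subclassing of all three existing classes, including ``function``. A small sketch that probes this, using a hypothetical helper ``can_subclass``:

```python
import types


def can_subclass(base):
    """Hypothetical helper: try to create a subclass of base."""
    try:
        type("Sub", (base,), {})
        return True
    except TypeError:
        return False


print(can_subclass(types.BuiltinFunctionType))  # builtin_function_or_method
print(can_subclass(types.MethodType))           # method
print(can_subclass(types.FunctionType))         # function
print(can_subclass(dict))                       # an ordinary subclassable type
```

Under this PEP, only the result for ``types.FunctionType`` would change.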
@ -876,41 +888,54 @@ existing code.
In order to further minimize breakage, this PEP could be implemented In order to further minimize breakage, this PEP could be implemented
in two phases. in two phases.
Phase one: duplicate classes Phase one: keep existing classes but add base classes
---------------------------- -----------------------------------------------------
Implement this PEP but duplicate the classes ``bound_method`` Initially, implement the ``base_function`` class
and ``builtin_function``. and use it as common base class but otherwise keep the existing classes
Add a new class ``builtin_method`` which is an exact copy of ``builtin_function`` (but not their implementation).
add a new class ``bound_builtin_method`` which is an exact copy
of ``bound_method`` (in both cases, literally only the name of the class would differ).
The class ``builtin_method`` will be used for unbound methods In this proposal, the class hierarchy would become::
of extension types.
It should be seen as continuation of the existing class
``method_descriptor``.
This ensures 100% backwards compatibility for these objects
(except for added attributes and maybe the ``repr()``).
The same would be done for bound methods of extension types: object
these will be instances of ``bound_builtin_method``. |
This ensures full backwards compatibility, except for code |
assuming that ``types.BuiltinFunctionType`` is the same as ``types.BuiltinMethodType``. base_function
/ | \
/ | \
/ | \
cfunction | defined_function
| | | \
| | bound_method \
| | \
| method_descriptor function
|
builtin_function_or_method
For ``inspect``, we keep but deprecate the functions The leaf classes ``builtin_function_or_method``, ``method_descriptor``,
``isbuiltin``, ``ismethod`` and ``ismethoddescriptor``. ``bound_method`` and ``function`` correspond to the existing classes
To replace these, new functions ``isbuiltinfunction``, ``isboundmethod`` (with ``method`` renamed to ``bound_method``).
and ``isgetdescriptor`` (other possible names: ``isreaddescriptor`` or ``isdescriptor``)
are added. Functions automatically created in modules become instances
The function ``isbuiltinfunction`` checks for instances of ``builtin_function`` of ``builtin_function_or_method``.
and ``builtin_method``. Unbound methods of extension types become instances of ``method_descriptor``.
``isboundmethod`` checks for both ``bound_method`` and ``bound_builtin_method``.
And ``isgetdescriptor`` checks for non-data descriptors The class ``method_descriptor`` is a copy of ``cfunction`` except
which are not instances of ``base_function``. that ``__get__`` returns a ``builtin_function_or_method`` instead of a
``bound_method``.
The class ``builtin_function_or_method`` has the same C structure as a
``bound_method``, but it inherits from ``cfunction``.
The ``__func__`` attribute is not mandatory:
it is only defined when binding a ``method_descriptor``.
We keep the implementation of the ``inspect`` functions as they are.
Because of this and because the existing classes are kept,
backwards compatibility is ensured for code doing type checks.
Since showing an actual ``DeprecationWarning`` would affect a lot Since showing an actual ``DeprecationWarning`` would affect a lot
of correctly-functioning code, of correctly-functioning code,
the deprecations would only appear in the documentation. any deprecations would only appear in the documentation.
Another reason is that it is hard to show warnings for calling ``isinstance(x, t)`` Another reason is that it is hard to show warnings for calling ``isinstance(x, t)``
(but it could be done using ``__instancecheck__`` hacking) (but it could be done using ``__instancecheck__`` hacking)
and impossible for ``type(x) is t``. and impossible for ``type(x) is t``.
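The ``__instancecheck__`` approach mentioned above could look roughly as follows. This is only a sketch with made-up class names (``DeprecatedCheckMeta``, ``OldType``, ``Sub``), not something this PEP proposes to add. It also shows why ``type(x) is t`` cannot be intercepted.

```python
import warnings


class DeprecatedCheckMeta(type):
    """Hypothetical metaclass that warns on isinstance() checks."""

    def __instancecheck__(cls, instance):
        warnings.warn(
            f"isinstance checks against {cls.__name__} are deprecated",
            DeprecationWarning,
            stacklevel=2,
        )
        return super().__instancecheck__(instance)


class OldType(metaclass=DeprecatedCheckMeta):
    pass


class Sub(OldType):
    pass


# isinstance() skips __instancecheck__ when type(x) is exactly the class
# being tested, so an instance of a subclass is used to trigger the hook.
x = Sub()

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    via_isinstance = isinstance(x, OldType)   # goes through the hook
    n_after_isinstance = len(caught)
    via_type_is = type(x) is Sub              # plain identity test, no hook
    n_after_type_is = len(caught)

print(via_isinstance, n_after_isinstance, via_type_is, n_after_type_is)
```

The identity comparison never calls into the metaclass, which is why no warning can be attached to it.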
@ -918,9 +943,9 @@ and impossible for ``type(x) is t``.
Phase two Phase two
--------- ---------
Phase two is what is actually described in the rest of this PEP: Phase two is what is actually described in the rest of this PEP.
the duplicate classes would be merged and the ``inspect`` functions In terms of implementation,
adjusted accordingly. it would be a relatively small change compared to phase one.
Reference Implementation Reference Implementation
@ -932,12 +957,12 @@ https://github.com/jdemeyer/cpython/tree/pep575
There are four steps, corresponding to the commits on that branch. There are four steps, corresponding to the commits on that branch.
After each step, CPython is in a mostly working state. After each step, CPython is in a mostly working state.
1. Add the ``base_function`` class and make it a base class of ``builtin_function``. 1. Add the ``base_function`` class and make it a base class of ``cfunction``.
This is by far the biggest step as the complete ``__call__`` protocol This is by far the biggest step as the complete ``__call__`` protocol
is implemented in this step. is implemented in this step.
2. Rename ``method`` to ``bound_method`` and make it a subclass of ``base_function``. 2. Rename ``method`` to ``bound_method`` and make it a subclass of ``base_function``.
Change unbound methods of extension types to be instances of ``builtin_function`` Change unbound methods of extension types to be instances of ``cfunction``
such that bound methods of extension types are also instances of ``bound_method``. such that bound methods of extension types are also instances of ``bound_method``.
3. Implement ``defined_function`` and ``function``. 3. Implement ``defined_function`` and ``function``.