Update PEP 575 (#640)

jdemeyer 2018-05-03 20:02:50 +02:00 committed by Chris Angelico
parent 23dfb58c72
commit 520405ff24
1 changed files with 126 additions and 101 deletions


@ -21,8 +21,8 @@ without sacrificing performance.
A new base class ``base_function`` is introduced and the various function
classes, as well as ``method`` (renamed to ``bound_method``), inherit from it.
We also allow subclassing in some cases:
in particular the Python ``function`` class can be subclassed.
We also allow subclassing the Python ``function`` class.
Motivation
==========
@ -61,6 +61,7 @@ as true built-in functions.
All functions can access the function object
(the ``self`` in ``__call__``), paving the way for PEP 573.
New classes
===========
@ -74,7 +75,7 @@ This is the new class hierarchy for functions and methods::
/ | \
/ | defined_function
/ | \
builtin_function (*) | \
cfunction (*) | \
| function
|
bound_method (*)
@ -135,7 +136,7 @@ but with the following differences and new features:
It is still needed in a few places though, for example `profiling`_.
#. A new flag ``METH_CUSTOM`` for ``ml_flags`` which prevents automatic
generation of a ``builtin_function``, see `automatic creation of built-in functions`_.
generation of a ``cfunction``, see `automatic creation of built-in functions`_.
The goal of ``base_function`` is that it supports all different ways
of calling functions and methods in just one structure.
@ -175,10 +176,11 @@ then there is no ``__self__`` attribute at all.
For that reason, we write either ``m_self`` or ``__self__`` in this PEP
with slightly different meanings.
builtin_function
----------------
cfunction
---------
This is a copy of ``base_function``, with the following differences:
This is the new version of the old ``builtin_function_or_method`` class.
It is a copy of ``base_function``, with the following differences:
#. ``m_ml`` points to a ``PyMethodDef`` structure,
extending ``PyCFunctionDef`` with an additional ``ml_doc``
@ -208,42 +210,37 @@ and we define ``PyCFunctionObject`` as alias of ``PyBaseFunctionObject``
defined_function
----------------
The class ``defined_function`` (a subclass of ``base_function``) adds
support for various standard attributes which are used by ``inspect``.
This would be a good class to use for auto-generated C code, for example produced by Cython [#cython]_.
The class ``defined_function`` is an abstract base class meant
to indicate that the function has introspection support.
Instances of ``defined_function`` are required to support all attributes
that Python functions have, namely
``__code__``, ``__globals__``, ``__doc__``,
``__defaults__``, ``__kwdefaults__``, ``__closure__`` and ``__annotations__``.
There is also a ``__dict__`` to support attributes added by the user.
The layout of the C structure is as follows::
None of these is required to be meaningful.
In particular, ``__code__`` need not be a working code object;
possibly only a few of its fields are filled in.
This PEP does not dictate how the various attributes are implemented.
They may be simple struct members or more complicated descriptors.
Only read-only support is required; none of the attributes is required to be writable.
The class ``defined_function`` is mainly meant for auto-generated C code,
for example produced by Cython [#cython]_.
There is no API to create instances of it.
The C structure is the following::
PyTypeObject PyDefinedFunction_Type;
typedef struct {
PyBaseFunctionObject base;
PyObject *func_code; /* __code__: code */
PyObject *func_globals; /* __globals__: anything; readonly */
PyObject *func_name; /* __name__: string */
PyObject *func_qualname; /* __qualname__: string */
PyObject *func_doc; /* __doc__: can be anything or NULL */
PyObject *func_defaults; /* __defaults__: tuple or NULL */
PyObject *func_kwdefaults; /* __kwdefaults__: dict or NULL */
PyObject *func_closure; /* __closure__: tuple of cell objects or NULL; readonly */
PyObject *func_annotations; /* __annotations__: dict or NULL */
PyObject *func_dict; /* __dict__: dict or NULL */
} PyDefinedFunctionObject;
The descriptor ``__name__`` returns ``func_name``.
When setting ``__name__``, also ``base.m_ml->ml_name`` is updated
with the UTF-8 encoded name.
None of the attributes is required to be meaningful.
In particular, ``__code__`` need not be a working code object;
possibly only a few of its fields are filled in.
And ``__defaults__`` is not required to be used for calling the function.
Apart from adding these extra attributes,
``defined_function`` behaves exactly the same as ``base_function``.
**TODO**: find a better name for ``defined_function``.
Other proposals: ``builtout_function`` (a function that is better built out; pun on builtin),
**TODO**: maybe find a better name for ``defined_function``.
Other proposals: ``inspect_function`` (anything that satisfies ``inspect.isfunction``),
``builtout_function`` (a function that is better built out; pun on builtin),
``generic_function`` (original proposal but conflicts with ``functools.singledispatch`` generic functions),
``user_function`` (defined by the user as opposed to CPython).
@ -255,12 +252,13 @@ Unlike the other function types,
instances of ``function`` can be created from Python code.
This is not changed, so we do not describe the details in this PEP.
The layout of the C structure is almost the same as ``defined_function``::
The layout of the C structure is the following::
PyTypeObject PyFunction_Type;
typedef struct {
PyBaseFunctionObject base;
PyObject *func_dict; /* __dict__: dict or NULL */
PyObject *func_code; /* __code__: code */
PyObject *func_globals; /* __globals__: dict; readonly */
PyObject *func_name; /* __name__: string */
@ -270,12 +268,14 @@ The layout of the C structure is almost the same as ``defined_function``::
PyObject *func_kwdefaults; /* __kwdefaults__: dict or NULL */
PyObject *func_closure; /* __closure__: tuple of cell objects or NULL; readonly */
PyObject *func_annotations; /* __annotations__: dict or NULL */
PyObject *func_dict; /* __dict__: dict or NULL */
PyCFunctionDef _ml; /* Storage for base.m_ml */
} PyFunctionObject;
The only difference is an ``_ml`` field
which reserves space to be used by ``base.m_ml``.
The descriptor ``__name__`` returns ``func_name``.
When setting ``__name__``, also ``base.m_ml->ml_name`` is updated
with the UTF-8 encoded name.
The ``_ml`` field reserves space to be used by ``base.m_ml``.
When constructing an instance of ``function`` from ``code`` and ``globals``,
an instance is created with ``base.m_ml = &_ml``,
@ -284,8 +284,17 @@ Instances of ``function`` should always have the flag ``METH_PYTHON`` set.
This is also handled by the constructors.
To make subclassing easier, we also add a copy constructor:
if ``f`` is an instance of ``defined_function`` with the ``METH_PYTHON``
flag set, then ``types.FunctionType(f)`` copies ``f``.
if ``f`` is an instance of ``function``, then ``types.FunctionType(f)`` copies ``f``.
This conveniently allows using a custom function type as decorator::
>>> from types import FunctionType
>>> class CustomFunction(FunctionType):
... pass
>>> @CustomFunction
... def f(x):
... return x
>>> type(f)
<class '__main__.CustomFunction'>
bound_method
------------
@ -328,7 +337,6 @@ The C structure is::
} PyMethodObject;
Calling base_function instances
===============================
@ -429,14 +437,14 @@ Then ``args`` is an array of length 3 + 2 = 5, ``nargs`` equals 3 and ``kwnames`
Automatic creation of built-in functions
========================================
Python automatically generates instances of ``builtin_function``
Python automatically generates instances of ``cfunction``
for extension types (using the ``PyTypeObject.tp_methods`` field) and modules
(using the ``PyModuleDef.m_methods`` field).
The arrays ``PyTypeObject.tp_methods`` and ``PyModuleDef.m_methods``
must be arrays of ``PyMethodDef`` structures.
If the ``METH_CUSTOM`` flag is set for an element of such an array,
then no ``builtin_function`` will be generated.
then no ``cfunction`` will be generated.
This allows an application to customize the creation of functions
in an extension type or module.
If ``METH_CUSTOM`` is set, then ``METH_STATIC`` and ``METH_CLASS`` are ignored.
@ -445,7 +453,7 @@ Unbound methods of extension types
----------------------------------
The type of unbound methods changes from ``method_descriptor``
to ``builtin_function``.
to ``cfunction``.
The object which appears as unbound method is the same object which
appears in the class ``__dict__``.
Python automatically sets the ``__parent__`` attribute to the defining class.
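For comparison, the following sketch shows the current situation in CPython (without this PEP), which the change above would replace: unbound methods of extension types are ``method_descriptor`` objects, and binding them yields a ``builtin_function_or_method``. It only uses existing built-ins; nothing here is new API.

```python
# Status quo in CPython without this PEP (illustration only).
unbound = str.upper          # unbound method of an extension type
bound = "abc".upper          # the same method bound to an instance

unbound_type = type(unbound).__name__
bound_type = type(bound).__name__
function_type = type(len).__name__

# The object seen as unbound method is the one stored in the class __dict__.
same_object = str.__dict__["upper"] is unbound

print(unbound_type)   # method_descriptor
print(bound_type)     # builtin_function_or_method
print(function_type)  # builtin_function_or_method
print(same_object)    # True
```

Under this PEP, the first of these type names would become ``cfunction`` and the bound variant would be a ``bound_method``.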
@ -477,16 +485,17 @@ New type flag
A new ``PyTypeObject`` flag (for ``tp_flags``) is added:
``Py_TPFLAGS_BASEFUNCTION`` to indicate that instances of this type are
functions which can be called as a ``base_function``.
In other words, subclasses of ``base_function``
which follow the implementation from the section `Calling base_function instances`_.
functions which can be called and bound as method like a ``base_function``.
This is different from flags like ``Py_TPFLAGS_LIST_SUBCLASS``
because it indicates more than just a subclass:
it also indicates a default implementation of ``__call__``.
it also indicates a default implementation of ``__call__`` and ``__get__``.
In particular, such subclasses of ``base_function``
must follow the implementation from the section `Calling base_function instances`_.
This flag is never inherited.
However, extension types can explicitly specify it if they
do not override ``__call__`` or if they override ``__call__`` in a compatible way.
Extension types should explicitly specify it if they
override neither ``__call__`` nor ``__get__`` or if they override them in a compatible way.
The flag ``Py_TPFLAGS_BASEFUNCTION`` must never be set for a heap type
because that would not be safe (heap types can be changed dynamically).
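The remark about heap types can be illustrated from Python. In this minimal sketch, the class ``Greeter`` is a hypothetical example (not part of the PEP): its ``__call__`` can be replaced after instances exist, whereas a static type such as ``builtin_function_or_method`` rejects the assignment.

```python
class Greeter:
    """A heap type: created at runtime and mutable afterwards."""
    def __call__(self):
        return "original"

g = Greeter()
before = g()

# Replacing __call__ on the class changes how existing instances are called.
Greeter.__call__ = lambda self: "patched"
after = g()

# Static (non-heap) types do not allow this.
try:
    type(len).__call__ = lambda self: None
except TypeError:
    static_type_is_immutable = True
else:
    static_type_is_immutable = False

print(before, after, static_type_is_immutable)
```

Any flag that promises a fixed ``__call__`` implementation could therefore be invalidated at any time for a class like ``Greeter``.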
@ -509,10 +518,10 @@ Some of these are existing (possibly changed) functions, some are new:
from the given data.
- ``int PyCFunction_Check(PyObject *op)``: return true if ``op``
is an instance of ``builtin_function``.
is an instance of ``cfunction``.
- ``int PyCFunction_NewEx(PyMethodDef* ml, PyObject *self, PyObject* module)``:
create a new instance of ``builtin_function``.
create a new instance of ``cfunction``.
As a special case, if ``self`` is ``NULL``,
then set ``self = Py_None`` instead (for backwards compatibility).
If ``self`` is a module, then ``__parent__`` is set to ``self``.
@ -524,7 +533,10 @@ Some of these are existing (possibly changed) functions, some are new:
The old functions are kept as aliases of the new functions.
- ``int PyFunction_Check(PyObject *op)``: return true if ``op``
is an instance of ``defined_function``.
is an instance of ``function``.
- ``int PyFunction_CheckExact(PyObject *op)``: return true
if the type of ``op`` is ``function``.
- ``PyObject *PyFunction_NewPython(PyTypeObject *cls, PyObject *code, PyObject *globals, PyObject *name, PyObject *qualname)``:
create a new instance of ``cls`` (which must be a subclass of ``function``)
@ -538,10 +550,7 @@ Some of these are existing (possibly changed) functions, some are new:
- ``PyObject *PyFunction_Copy(PyTypeObject *cls, PyObject *func)``:
create a new instance of ``cls`` (which must be a subclass of ``function``)
by copying a given ``defined_function``.
- All other existing ``PyFunction_...`` functions now act on ``defined_function``
instances (instead of ``function``).
by copying a given ``function``.
Changes to the types module
---------------------------
@ -563,7 +572,7 @@ The new function ``inspect.isbasefunction`` checks for an instance of ``base_fun
``inspect.isfunction`` checks for an instance of ``defined_function``.
``inspect.isbuiltin`` checks for an instance of ``builtin_function``.
``inspect.isbuiltin`` checks for an instance of ``cfunction``.
``inspect.isroutine`` checks ``isbasefunction`` or ``ismethoddescriptor``.
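As a point of reference, this sketch records how the existing ``inspect`` predicates classify three kinds of callables in CPython without this PEP. The helper function ``py_func`` is only an illustration; the ``inspect`` functions are the existing ones.

```python
import inspect

def py_func(x):
    return x

samples = {
    "py_func": py_func,      # Python function
    "len": len,              # built-in function
    "str.upper": str.upper,  # unbound method of an extension type
}

# For each object: (isfunction, isbuiltin, ismethoddescriptor, isroutine)
results = {
    name: (
        inspect.isfunction(obj),
        inspect.isbuiltin(obj),
        inspect.ismethoddescriptor(obj),
        inspect.isroutine(obj),
    )
    for name, obj in samples.items()
}

for name, row in results.items():
    print(name, row)
```

With the changes described here, ``str.upper`` would become a ``cfunction`` and so would be expected to move from the ``ismethoddescriptor`` column to the ``isbuiltin`` column.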
@ -658,12 +667,11 @@ This is done by thinking of the most likely use case:
2. ``inspect.isfunction()`` refers to ``defined_function``
because this is the class where introspection is supported.
3. The C API functions ``PyFunction_New...``
refer to ``function`` simply because one cannot create instances
of ``defined_function``.
4. The C API functions ``PyFunction_Check`` and ``PyFunction_Get/Set...``
refer to ``defined_function`` because all attributes exist for instances of ``defined_function``.
3. The C API functions must refer to ``function`` because
we do not specify how the various attributes of ``defined_function``
are implemented.
We expect that this is not a problem since there is typically no
reason for introspection to be implemented by C extensions.
Scope of this PEP: which classes are involved?
----------------------------------------------
@ -729,21 +737,25 @@ Two implementations of __doc__
------------------------------
``base_function`` does not support function docstrings.
Instead, the classes ``builtin_function`` and ``defined_function``
Instead, the classes ``cfunction`` and ``function``
each have their own way of dealing with docstrings
(and ``bound_method`` just takes the ``__doc__`` from the wrapped function).
For ``builtin_function``, the docstring is stored (together with the text signature)
For ``cfunction``, the docstring is stored (together with the text signature)
as a C string in the read-only ``ml_doc`` field of a ``PyMethodDef``.
For ``defined_function``, the docstring is stored as a writable Python object
For ``function``, the docstring is stored as a writable Python object
and it does not actually need to be a string.
It looks hard to unify these two very different ways of dealing with ``__doc__``.
For backwards compatibility, we keep the existing implementations.
For ``defined_function``, we require ``__doc__`` to be implemented
but we do not specify how. A subclass can implement ``__doc__`` the
same way as ``cfunction`` or using a struct member or some other way.
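The difference between the two existing docstring implementations is visible from Python. A small sketch, using only current CPython behaviour (``py_func`` is an illustrative name):

```python
def py_func():
    """Original docstring."""

# For a Python function, __doc__ is writable and need not be a string.
py_func.__doc__ = 42
python_doc = py_func.__doc__

# For a built-in function, __doc__ is derived from the C-level ml_doc
# string and cannot be assigned.
try:
    len.__doc__ = "new docstring"
except AttributeError:
    builtin_doc_writable = False
else:
    builtin_doc_writable = True

print(python_doc, builtin_doc_writable)
```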
Subclassing
-----------
We disallow subclassing of ``builtin_function`` and ``bound_method``
We disallow subclassing of ``cfunction`` and ``bound_method``
to enable fast type checks for ``PyCFunction_Check`` and ``PyMethod_Check``.
We allow subclassing of the other classes because there is no reason to disallow it.
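For comparison, in CPython without this PEP none of the existing function classes can be subclassed, including the Python ``function`` class that this PEP proposes to open up. The helper ``can_subclass`` below is a hypothetical name used only for this sketch.

```python
import types

def can_subclass(base):
    """Return True if a class deriving from `base` can be created."""
    try:
        type("Sub", (base,), {})
    except TypeError:
        return False
    return True

status = {
    t.__name__: can_subclass(t)
    for t in (types.BuiltinFunctionType, types.MethodType, types.FunctionType)
}
print(status)
```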
@ -876,41 +888,54 @@ existing code.
In order to further minimize breakage, this PEP could be implemented
in two phases.
Phase one: duplicate classes
----------------------------
Phase one: keep existing classes but add base classes
-----------------------------------------------------
Implement this PEP but duplicate the classes ``bound_method``
and ``builtin_function``.
Add a new class ``builtin_method`` which is an exact copy of ``builtin_function``
and a new class ``bound_builtin_method`` which is an exact copy
of ``bound_method`` (in both cases, literally only the name of the class would differ).
Initially, implement the ``base_function`` class
and use it as common base class but otherwise keep the existing classes
(but not their implementation).
The class ``builtin_method`` will be used for unbound methods
of extension types.
It should be seen as continuation of the existing class
``method_descriptor``.
This ensures 100% backwards compatibility for these objects
(except for added attributes and maybe the ``repr()``).
In this proposal, the class hierarchy would become::
The same would be done for bound methods of extension types:
these will be instances of ``bound_builtin_method``.
This ensures full backwards compatibility, except for code
assuming that ``types.BuiltinFunctionType`` is the same as ``types.BuiltinMethodType``.
object
|
|
base_function
/ | \
/ | \
/ | \
cfunction | defined_function
| | | \
| | bound_method \
| | \
| method_descriptor function
|
builtin_function_or_method
For ``inspect``, we keep but deprecate the functions
``isbuiltin``, ``ismethod`` and ``ismethoddescriptor``.
To replace these, new functions ``isbuiltinfunction``, ``isboundmethod``
and ``isgetdescriptor`` (other possible names: ``isreaddescriptor`` or ``isdescriptor``)
are added.
The function ``isbuiltinfunction`` checks for instances of ``builtin_function``
and ``builtin_method``.
``isboundmethod`` checks for both ``bound_method`` and ``bound_builtin_method``.
And ``isgetdescriptor`` checks for non-data descriptors
which are not instances of ``base_function``.
The leaf classes ``builtin_function_or_method``, ``method_descriptor``,
``bound_method`` and ``function`` correspond to the existing classes
(with ``method`` renamed to ``bound_method``).
Functions created automatically in modules become instances
of ``builtin_function_or_method``.
Unbound methods of extension types become instances of ``method_descriptor``.
The class ``method_descriptor`` is a copy of ``cfunction`` except
that ``__get__`` returns a ``builtin_function_or_method`` instead of a
``bound_method``.
The class ``builtin_function_or_method`` has the same C structure as a
``bound_method``, but it inherits from ``cfunction``.
The ``__func__`` attribute is not mandatory:
it is only defined when binding a ``method_descriptor``.
We keep the implementation of the ``inspect`` functions as they are.
Because of this and because the existing classes are kept,
backwards compatibility is ensured for code doing type checks.
Since showing an actual ``DeprecationWarning`` would affect a lot
of correctly-functioning code,
the deprecations would only appear in the documentation.
any deprecations would only appear in the documentation.
Another reason is that it is hard to show warnings for calling ``isinstance(x, t)``
(but it could be done using ``__instancecheck__`` hacking)
and impossible for ``type(x) is t``.
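The ``__instancecheck__`` approach mentioned above could look roughly like the following sketch. All class names are hypothetical and this is not what the PEP proposes to implement; it only shows that ``isinstance`` can be intercepted through a metaclass while a ``type(x) is t`` comparison cannot.

```python
import warnings

class WarnOnInstanceCheck(type):
    """Metaclass emitting a DeprecationWarning on isinstance() checks."""
    def __instancecheck__(cls, instance):
        warnings.warn(
            f"isinstance() checks against {cls.__name__} are deprecated",
            DeprecationWarning,
            stacklevel=2,
        )
        return super().__instancecheck__(instance)

class Legacy(metaclass=WarnOnInstanceCheck):
    pass

class LegacyChild(Legacy):
    pass

# A subclass instance is used because CPython skips __instancecheck__
# when type(obj) is exactly the class being checked.
obj = LegacyChild()

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    via_isinstance = isinstance(obj, Legacy)   # goes through the metaclass
    warnings_after_isinstance = len(caught)
    via_type = type(obj) is LegacyChild        # plain identity comparison
    warnings_after_type = len(caught)

print(via_isinstance, warnings_after_isinstance)
print(via_type, warnings_after_type)
```

The identity comparison never calls into user code, so there is no hook on which a warning could be attached.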
@ -918,9 +943,9 @@ and impossible for ``type(x) is t``.
Phase two
---------
Phase two is what is actually described in the rest of this PEP:
the duplicate classes would be merged and the ``inspect`` functions
adjusted accordingly.
Phase two is what is actually described in the rest of this PEP.
In terms of implementation,
it would be a relatively small change compared to phase one.
Reference Implementation
@ -932,12 +957,12 @@ https://github.com/jdemeyer/cpython/tree/pep575
There are four steps, corresponding to the commits on that branch.
After each step, CPython is in a mostly working state.
1. Add the ``base_function`` class and make it the base class of ``builtin_function``.
1. Add the ``base_function`` class and make it the base class of ``cfunction``.
This is by far the biggest step as the complete ``__call__`` protocol
is implemented in this step.
2. Rename ``method`` to ``bound_method`` and make it a subclass of ``base_function``.
Change unbound methods of extension types to be instances of ``builtin_function``
Change unbound methods of extension types to be instances of ``cfunction``
such that bound methods of extension types are also instances of ``bound_method``.
3. Implement ``defined_function`` and ``function``.