PEP 579 and 580: the C call protocol (#675)
* PEP 579: refactoring C functions and methods
* PEP 580: the C call protocol
* PEP 575: withdraw
parent fa86a3cd23
commit 9900d8d696
10
pep-0575.rst
@@ -1,7 +1,7 @@
 PEP: 575
 Title: Unifying function/method classes
 Author: Jeroen Demeyer <J.Demeyer@UGent.be>
-Status: Draft
+Status: Withdrawn
 Type: Standards Track
 Content-Type: text/x-rst
 Created: 27-Mar-2018
@@ -9,6 +9,14 @@ Python-Version: 3.8
 Post-History: 31-Mar-2018, 12-Apr-2018, 27-Apr-2018, 5-May-2018
 
 
+Withdrawal notice
+=================
+
+See PEP 580 for a better solution to allowing fast calling of custom classes.
+
+See PEP 579 for a broader discussion of some of the other issues from this PEP.
+
+
 Abstract
 ========
 
@@ -0,0 +1,379 @@
|
||||||
|
PEP: 579
|
||||||
|
Title: Refactoring C functions and methods
|
||||||
|
Author: Jeroen Demeyer <J.Demeyer@UGent.be>
|
||||||
|
Status: Draft
|
||||||
|
Type: Informational
|
||||||
|
Content-Type: text/x-rst
|
||||||
|
Created: 04-Jun-2018
|
||||||
|
Post-History: 19-Jun-2018
|
||||||
|
|
||||||
|
|
||||||
|
Abstract
|
||||||
|
========
|
||||||
|
|
||||||
|
This meta-PEP collects various issues with CPython's existing implementation
|
||||||
|
of built-in functions (functions implemented in C) and methods.
|
||||||
|
|
||||||
|
Fixing all these issues is too much for one PEP,
|
||||||
|
so that will be delegated to other standards track PEPs.
|
||||||
|
However, this PEP does give some brief ideas of possible fixes.
|
||||||
|
This is mainly meant to coordinate an overall strategy.
|
||||||
|
For example, a proposed solution may sound too complicated
|
||||||
|
for fixing any one single issue, but it may be the best overall
|
||||||
|
solution for multiple issues.
|
||||||
|
|
||||||
|
This PEP is purely informational:
|
||||||
|
it does not imply that all issues will eventually
|
||||||
|
be fixed, nor that they will be fixed using the solution proposed here.
|
||||||
|
|
||||||
|
It also serves as a check-list of possible requested features
|
||||||
|
to verify that a given fix does not make those
|
||||||
|
other features harder to implement.
|
||||||
|
|
||||||
|
The major proposed change is replacing ``PyMethodDef``
|
||||||
|
by a new structure ``PyCCallDef``
|
||||||
|
which collects everything needed for calling the function/method.
|
||||||
|
In the ``PyTypeObject`` structure, a new field ``tp_ccalloffset``
|
||||||
|
is added giving an offset to a ``PyCCallDef *`` in the object structure.
|
||||||
|
|
||||||
|
**NOTE**: This PEP deals only with CPython implementation details;
|
||||||
|
it does not affect the Python language or standard library.
|
||||||
|
|
||||||
|
|
||||||
|
Issues
|
||||||
|
======
|
||||||
|
|
||||||
|
This lists various issues with built-in functions and methods,
|
||||||
|
together with a plan for a solution and (if applicable)
|
||||||
|
pointers to standards track PEPs discussing the details.
|
||||||
|
|
||||||
|
|
||||||
|
1. Naming
|
||||||
|
---------
|
||||||
|
|
||||||
|
The word "built-in" is overused in Python.
|
||||||
|
From a quick skim of the Python documentation, it mostly refers
|
||||||
|
to things from the ``builtins`` module.
|
||||||
|
In other words: things which are available in the global namespace
|
||||||
|
without a need for importing them.
|
||||||
|
This conflicts with the use of the word "built-in" to mean "implemented in C".
|
||||||
|
|
||||||
|
**Solution**: since the C structure for built-in functions and methods is already
|
||||||
|
called ``PyCFunctionObject``,
|
||||||
|
let's use the names "cfunction" and "cmethod" instead of "built-in function"
|
||||||
|
and "built-in method".
|
||||||
|
|
||||||
|
|
||||||
|
2. Not extendable
|
||||||
|
-----------------
|
||||||
|
|
||||||
|
The various classes involved (such as ``builtin_function_or_method``)
|
||||||
|
cannot be subclassed::
|
||||||
|
|
||||||
    >>> from types import BuiltinFunctionType
    >>> class X(BuiltinFunctionType):
    ...     pass
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    TypeError: type 'builtin_function_or_method' is not an acceptable base type
|
||||||
|
|
||||||
|
This is a problem because it makes it impossible to add features
|
||||||
|
such as introspection support to these classes.
|
||||||
|
|
||||||
|
If one wants to implement a function in C with additional functionality,
|
||||||
|
an entirely new class must be implemented from scratch.
|
||||||
|
The problem with this is that the existing classes like
|
||||||
|
``builtin_function_or_method`` are special-cased in the Python interpreter
|
||||||
|
to allow faster calling (for example, by using ``METH_FASTCALL``).
|
||||||
|
It is currently impossible to have a custom class with the same optimizations.
|
||||||
|
|
||||||
|
**Solution**: make the existing optimizations available to arbitrary classes.
|
||||||
|
This is done by adding a new ``PyTypeObject`` field ``tp_ccalloffset``
|
||||||
|
(or can we re-use ``tp_print`` for that?)
|
||||||
|
specifying the offset of a ``PyCCallDef`` pointer.
|
||||||
|
This is a new structure holding all information needed to call
|
||||||
|
a cfunction and it would be used instead of ``PyMethodDef``.
|
||||||
|
This implements the new "C call" protocol.
|
||||||
|
|
||||||
|
For constructing cfunctions and cmethods, ``PyMethodDef`` arrays
|
||||||
|
will still be used (for example, in ``tp_methods``) but that will
|
||||||
|
be the *only* remaining purpose of the ``PyMethodDef`` structure.
|
||||||
|
|
||||||
|
Additionally, we can also make some function classes subclassable.
|
||||||
|
However, this seems less important once we have ``tp_ccalloffset``.
|
||||||
|
|
||||||
|
**Reference**: PEP 580
|
||||||
|
|
||||||
|
|
||||||
|
3. cfunctions do not become methods
|
||||||
|
-----------------------------------
|
||||||
|
|
||||||
|
A cfunction like ``repr`` does not implement ``__get__`` to bind
|
||||||
|
as a method::
|
||||||
|
|
||||||
    >>> class X:
    ...     meth = repr
    >>> x = X()
    >>> x.meth()
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    TypeError: repr() takes exactly one argument (0 given)
|
||||||
|
|
||||||
|
In this example, one would have expected that ``x.meth()`` returns
|
||||||
|
``repr(x)`` by applying the normal rules of methods.
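
The contrast with Python functions can be checked directly.
The following is a minimal sketch, assuming CPython; the names ``py_repr`` and ``Demo`` are hypothetical and exist only for this illustration.

```python
def py_repr(obj):
    """A pure-Python stand-in for repr (hypothetical helper)."""
    return repr(obj)


class Demo:
    c_meth = repr      # cfunction: its type has no __get__, so it never binds
    py_meth = py_repr  # Python function: __get__ produces a bound method


d = Demo()

# The Python function is bound, so the instance is passed as first argument.
py_result = d.py_meth()

# The cfunction is returned unchanged and is therefore called with no arguments.
try:
    d.c_meth()
except TypeError:
    c_call_failed = True
else:
    c_call_failed = False
```

Here ``py_result`` equals ``repr(d)``, while the call through the cfunction fails because no ``self`` was supplied.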
|
||||||
|
|
||||||
|
This is surprising and a needless difference
|
||||||
|
between cfunctions and Python functions.
|
||||||
|
For the standard built-in functions, this is not really a problem
|
||||||
|
since those are not meant to be used as methods.
But it does become a problem when one wants to implement a
new cfunction with the goal of being usable as a method.
|
||||||
|
|
||||||
|
Again, a solution could be to create a new class whose instances behave
just like cfunctions but bind as methods.
|
||||||
|
However, that would lose some existing optimizations for methods,
|
||||||
|
such as the ``LOAD_METHOD``/``CALL_METHOD`` opcodes.
|
||||||
|
|
||||||
|
**Solution**: the same as the previous issue.
|
||||||
|
It just shows that handling ``self`` and ``__get__``
|
||||||
|
should be part of the new C call protocol.
|
||||||
|
|
||||||
|
For backwards compatibility, we would keep the existing non-binding
|
||||||
|
behavior of cfunctions. We would just allow it in custom classes.
|
||||||
|
|
||||||
|
**Reference**: PEP 580
|
||||||
|
|
||||||
|
|
||||||
|
4. Semantics of inspect.isfunction
|
||||||
|
----------------------------------
|
||||||
|
|
||||||
|
Currently, ``inspect.isfunction`` returns ``True`` only for instances
|
||||||
|
of ``types.FunctionType``.
|
||||||
|
That is, true Python functions.
|
||||||
|
|
||||||
|
A common use case for ``inspect.isfunction`` is checking for introspection:
|
||||||
|
it guarantees for example that ``inspect.getfile()`` will work.
|
||||||
|
Ideally, it should be possible for other classes to be treated as
|
||||||
|
functions too.
|
||||||
|
|
||||||
|
**Solution**: introduce a new ``InspectFunction`` abstract base class
|
||||||
|
and use that to implement ``inspect.isfunction``.
|
||||||
|
Alternatively, use duck typing for ``inspect.isfunction``
|
||||||
|
(as proposed in [#bpo30071]_)::
|
||||||
|
|
||||||
|
    def isfunction(obj):
        return hasattr(type(obj), "__code__")
|
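
To see how the duck-typed variant differs from the current behavior, here is a small sketch.
``isfunction_duck``, ``plain`` and ``FunctionLike`` are hypothetical names used only for illustration; they are not part of ``inspect``.

```python
import inspect


def isfunction_duck(obj):
    """Duck-typed check: does the type of obj expose __code__?"""
    return hasattr(type(obj), "__code__")


def plain():
    return 42


class FunctionLike:
    """Hypothetical custom callable exposing __code__ on its type."""
    __code__ = plain.__code__

    def __call__(self):
        return plain()


candidates = [plain, len, FunctionLike()]

# Current semantics: only instances of types.FunctionType qualify.
strict = [inspect.isfunction(obj) for obj in candidates]

# Duck-typed semantics: anything whose type has __code__ qualifies.
duck = [isfunction_duck(obj) for obj in candidates]
```

Both checks accept ``plain`` and reject ``len``; only the duck-typed check accepts the ``FunctionLike`` instance.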
||||||
|
|
||||||
|
|
||||||
|
5. C functions should have access to the function object
|
||||||
|
--------------------------------------------------------
|
||||||
|
|
||||||
|
The underlying C function of a cfunction currently
|
||||||
|
takes a ``self`` argument (for bound methods)
|
||||||
|
and then possibly a number of arguments.
|
||||||
|
There is no way for the C function to actually access the Python
|
||||||
|
cfunction object (the ``self`` in ``__call__`` or ``tp_call``).
|
||||||
|
Having such access would, for example, allow implementing the
|
||||||
|
C call protocol for Python functions (``types.FunctionType``):
|
||||||
|
the C function which implements calling Python functions
|
||||||
|
needs access to the ``__code__`` attribute of the function.
|
||||||
|
|
||||||
|
This is also needed for PEP 573
|
||||||
|
where all cfunctions require access to their "parent"
|
||||||
|
(the module for functions of a module or the defining class
|
||||||
|
for methods).
|
||||||
|
|
||||||
|
**Solution**: add a new ``PyMethodDef`` flag to specify
|
||||||
|
that the C function takes an additional argument (as first argument),
|
||||||
|
namely the function object.
|
||||||
|
|
||||||
|
**References**: PEP 580, PEP 573
|
||||||
|
|
||||||
|
|
||||||
|
6. METH_FASTCALL is private and undocumented
|
||||||
|
--------------------------------------------
|
||||||
|
|
||||||
|
The ``METH_FASTCALL`` mechanism allows calling cfunctions and cmethods
|
||||||
|
using a C array of Python objects instead of a ``tuple``.
|
||||||
|
This was introduced in Python 3.6 for positional arguments only
|
||||||
|
and extended in Python 3.7 with support for keyword arguments.
|
||||||
|
|
||||||
|
However, given that it is undocumented,
|
||||||
|
it is presumably only supposed to be used by CPython itself.
|
||||||
|
|
||||||
|
**Solution**: since this is an important optimization,
|
||||||
|
everybody should be encouraged to use it.
|
||||||
|
Now that the implementation of ``METH_FASTCALL`` is stable, document it!
|
||||||
|
|
||||||
|
As part of the C call protocol, we should also add a C API function ::
|
||||||
|
|
||||||
|
PyObject *PyCCall_FastCall(PyObject *func, PyObject *const *args, Py_ssize_t nargs, PyObject *keywords)
|
||||||
|
|
||||||
|
**Reference**: PEP 580
|
||||||
|
|
||||||
|
|
||||||
|
7. Allowing native C arguments
|
||||||
|
------------------------------
|
||||||
|
|
||||||
|
A cfunction always takes its arguments as Python objects
|
||||||
|
(say, an array of ``PyObject`` pointers).
|
||||||
|
In cases where the cfunction is really wrapping a native C function
|
||||||
|
(for example, coming from ``ctypes`` or some compiler like Cython),
|
||||||
|
this is inefficient: calls from C code to C code are forced to use
|
||||||
|
Python objects to pass arguments.
|
||||||
|
|
||||||
|
Analogous to the buffer protocol which allows access to C data,
|
||||||
|
we should also allow access to the underlying C callable.
|
||||||
|
|
||||||
|
**Solution**: when wrapping a C function with native arguments
|
||||||
|
(for example, a C ``long``) inside a cfunction,
|
||||||
|
we should also store a function pointer to the underlying C function,
|
||||||
|
together with its C signature.
|
||||||
|
|
||||||
|
Argument Clinic could automatically do this by storing
|
||||||
|
a pointer to the "impl" function.
|
||||||
|
|
||||||
|
|
||||||
|
8. Complexity
|
||||||
|
-------------
|
||||||
|
|
||||||
|
A huge number of classes are involved in implementing
all variations of methods.
|
||||||
|
This is not a problem by itself, but a compounding issue.
|
||||||
|
|
||||||
|
For ordinary Python classes, the table below gives the classes
|
||||||
|
for various kinds of methods, where columns
|
||||||
|
refer to the class in the class ``__dict__``,
|
||||||
|
the class for unbound methods (bound to the class)
|
||||||
|
and the class for bound methods (bound to the instance):
|
||||||
|
|
||||||
|
=============  ================  ============  ============
kind           __dict__          unbound       bound
=============  ================  ============  ============
Normal method  ``function``      ``function``  ``method``
Static method  ``staticmethod``  ``function``  ``function``
Class method   ``classmethod``   ``method``    ``method``
Slot method    ``function``      ``function``  ``method``
=============  ================  ============  ============
|
||||||
|
|
||||||
|
This is the analogous table for extension types (C classes):
|
||||||
|
|
||||||
|
=============  ==========================  ==============================  ==============================
kind           __dict__                    unbound                         bound
=============  ==========================  ==============================  ==============================
Normal method  ``method_descriptor``       ``method_descriptor``           ``builtin_function_or_method``
Static method  ``staticmethod``            ``builtin_function_or_method``  ``builtin_function_or_method``
Class method   ``classmethod_descriptor``  ``builtin_function_or_method``  ``builtin_function_or_method``
Slot method    ``wrapper_descriptor``      ``wrapper_descriptor``          ``method-wrapper``
=============  ==========================  ==============================  ==============================
|
||||||
|
|
||||||
|
There are a lot of classes involved
|
||||||
|
and these two tables look very different.
|
||||||
|
There is no good reason why Python methods should be
|
||||||
|
treated fundamentally differently from C methods.
|
||||||
|
Also the features are slightly different:
|
||||||
|
for example, ``method`` supports ``__func__``
|
||||||
|
but ``builtin_function_or_method`` does not.
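
Some rows of the two tables can be reproduced from Python.
This is a sketch assuming CPython; ``PyClass`` and ``type_name`` are hypothetical helpers, and ``list`` merely serves as a convenient extension type.

```python
class PyClass:
    def normal(self):
        pass


def type_name(obj):
    """Return the name of the class of obj."""
    return type(obj).__name__


# (class __dict__ entry, unbound, bound) for a normal Python method
python_row = (
    type_name(PyClass.__dict__["normal"]),
    type_name(PyClass.normal),
    type_name(PyClass().normal),
)

# The same three lookups for a normal method of an extension type
c_row = (
    type_name(list.__dict__["append"]),
    type_name(list.append),
    type_name([].append),
)

# ... and for a slot method of an extension type
slot_row = (
    type_name(list.__dict__["__len__"]),
    type_name(list.__len__),
    type_name([].__len__),
)

# Feature difference mentioned above: only ``method`` has __func__.
python_has_func = hasattr(PyClass().normal, "__func__")
c_has_func = hasattr([].append, "__func__")
```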
|
||||||
|
|
||||||
|
Since CPython has optimizations for calls to most of these objects,
|
||||||
|
the code for dealing with them can also become complex.
|
||||||
|
A good example of this is the ``call_function`` function in ``Python/ceval.c``.
|
||||||
|
|
||||||
|
**Solution**: all these classes should implement the C call protocol.
|
||||||
|
Then the complexity in the code can mostly be fixed by
|
||||||
|
checking for the C call protocol (``tp_ccalloffset != 0``)
|
||||||
|
instead of doing type checks.
|
||||||
|
|
||||||
|
Furthermore, it should be investigated whether some of these classes can be merged
|
||||||
|
and whether ``method`` can be re-used also for bound methods of extension types
|
||||||
|
(see PEP 576 for the latter,
|
||||||
|
keeping in mind that this may have some minor backwards compatibility issues).
|
||||||
|
This is not a goal by itself but just something to keep in mind
|
||||||
|
when working on these classes.
|
||||||
|
|
||||||
|
|
||||||
|
9. PyMethodDef is too limited
|
||||||
|
-----------------------------
|
||||||
|
|
||||||
|
The typical way to create a cfunction or cmethod in an extension module
|
||||||
|
is by using a ``PyMethodDef`` to define it.
|
||||||
|
These are then stored in an array ``PyModuleDef.m_methods``
|
||||||
|
(for cfunctions) or ``PyTypeObject.tp_methods`` (for cmethods).
|
||||||
|
However, because of the stable ABI (PEP 384),
|
||||||
|
we cannot change the ``PyMethodDef`` structure.
|
||||||
|
|
||||||
|
So, this means that we cannot add new fields for creating cfunctions/cmethods
|
||||||
|
this way.
|
||||||
|
This is probably the reason for the hack that
|
||||||
|
``__doc__`` and ``__text_signature__`` are stored in the same C string
|
||||||
|
(with the ``__doc__`` and ``__text_signature__`` descriptors extracting
|
||||||
|
the relevant part).
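
The following sketch shows the idea of extracting both attributes from a single string.
It is a simplified model, not CPython's actual parsing code: the helper ``split_internal_doc`` is hypothetical, and the ``raw`` string is assumed to resemble what Argument Clinic generates for ``len``.

```python
# Marker assumed to separate the signature from the docstring proper.
SIGNATURE_END = ")\n--\n\n"


def split_internal_doc(name, internal_doc):
    """Split a combined C docstring into (text_signature, doc).

    Returns (None, internal_doc) if no signature prefix is found.
    """
    end = internal_doc.find(SIGNATURE_END)
    if not internal_doc.startswith(name + "(") or end < 0:
        return None, internal_doc
    # Keep the parentheses, drop the function name.
    text_signature = internal_doc[len(name):end + 1]
    doc = internal_doc[end + len(SIGNATURE_END):]
    return text_signature, doc


raw = "len($module, obj, /)\n--\n\nReturn the number of items in a container."
signature, doc = split_internal_doc("len", raw)
```

With this input, ``signature`` is the text between and including the parentheses and ``doc`` is the remaining sentence.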
|
||||||
|
|
||||||
|
**Solution**: stop assuming that a single ``PyMethodDef`` entry
|
||||||
|
is sufficient to describe a cfunction/cmethod.
|
||||||
|
Instead, we could add some flag which means that one of the ``PyMethodDef``
|
||||||
|
fields is instead a pointer to an additional structure.
|
||||||
|
Or, we could add a flag to use two or more consecutive ``PyMethodDef``
|
||||||
|
entries in the array to store more data.
|
||||||
|
Then the ``PyMethodDef`` array would be used only to construct
|
||||||
|
cfunctions/cmethods but it would no longer be used after that.
|
||||||
|
|
||||||
|
|
||||||
|
10. Slot wrappers have no custom documentation
|
||||||
|
----------------------------------------------
|
||||||
|
|
||||||
|
Right now, slot wrappers like ``__init__`` or ``__lt__`` only have very
|
||||||
|
generic documentation, not at all specific to the class::
|
||||||
|
|
||||||
    >>> list.__init__.__doc__
    'Initialize self.  See help(type(self)) for accurate signature.'
    >>> list.__lt__.__doc__
    'Return self<value.'
|
||||||
|
|
||||||
|
The same happens for the signature::
|
||||||
|
|
||||||
    >>> list.__init__.__text_signature__
    '($self, /, *args, **kwargs)'
|
||||||
|
|
||||||
|
As you can see, slot wrappers do support ``__doc__``
|
||||||
|
and ``__text_signature__``.
|
||||||
|
The problem is that these are stored in ``struct wrapperbase``,
|
||||||
|
which is common for all wrappers of a specific slot
|
||||||
|
(for example, the same ``wrapperbase`` is used for ``str.__eq__`` and ``int.__eq__``).
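
The sharing can be observed from Python.
A short check, assuming CPython (the variable names are only illustrative):

```python
# Slot wrappers of unrelated classes report identical documentation,
# because it comes from the shared per-slot wrapperbase.
shared_doc = str.__eq__.__doc__ == int.__eq__.__doc__
shared_signature = (
    list.__init__.__text_signature__ == dict.__init__.__text_signature__
)

# Both attributes are looked up on the same kind of object.
wrapper_type = type(str.__eq__).__name__
```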
|
||||||
|
|
||||||
|
**Solution**: rethink the slot wrapper class to allow docstrings
|
||||||
|
(and text signatures) for each instance separately.
|
||||||
|
|
||||||
|
This still leaves the question of how extension modules
|
||||||
|
should specify the documentation.
|
||||||
|
The ``PyTypeObject`` entries like ``tp_init`` are just function pointers,
so we cannot attach documentation to those.
|
||||||
|
One solution would be to add entries to the ``tp_methods`` array
|
||||||
|
just for adding docstrings.
|
||||||
|
Such an entry could look like ::
|
||||||
|
|
||||||
|
{"__init__", NULL, METH_SLOTDOC, "pointer to __init__ doc goes here"}
|
||||||
|
|
||||||
|
|
||||||
|
References
|
||||||
|
==========
|
||||||
|
|
||||||
|
.. [#bpo30071] Duck-typing inspect.isfunction()
|
||||||
|
(https://bugs.python.org/issue30071)
|
||||||
|
|
||||||
|
|
||||||
|
Copyright
|
||||||
|
=========
|
||||||
|
|
||||||
|
This document has been placed in the public domain.
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
..
|
||||||
|
Local Variables:
|
||||||
|
mode: indented-text
|
||||||
|
indent-tabs-mode: nil
|
||||||
|
sentence-end-double-space: t
|
||||||
|
fill-column: 70
|
||||||
|
coding: utf-8
|
||||||
|
End:
|
|
@@ -0,0 +1,402 @@
|
||||||
|
PEP: 580
|
||||||
|
Title: The C call protocol
|
||||||
|
Author: Jeroen Demeyer <J.Demeyer@UGent.be>
|
||||||
|
Status: Draft
|
||||||
|
Type: Standards Track
|
||||||
|
Content-Type: text/x-rst
|
||||||
|
Created: 14-Jun-2018
|
||||||
|
Python-Version: 3.8
|
||||||
|
Post-History: 19-Jun-2018
|
||||||
|
|
||||||
|
|
||||||
|
Abstract
|
||||||
|
========
|
||||||
|
|
||||||
|
A new "C call" protocol is proposed.
|
||||||
|
It is meant for classes representing functions or methods
|
||||||
|
which need to implement fast calling.
|
||||||
|
The goal is to generalize existing optimizations for built-in functions
|
||||||
|
to arbitrary extension types.
|
||||||
|
|
||||||
|
In the reference implementation,
|
||||||
|
this new protocol is used for the existing classes
|
||||||
|
``builtin_function_or_method`` and ``method_descriptor``.
|
||||||
|
However, in the future, more classes may implement it.
|
||||||
|
|
||||||
|
**NOTE**: This PEP deals only with CPython implementation details;
|
||||||
|
it does not affect the Python language or standard library.
|
||||||
|
|
||||||
|
|
||||||
|
Motivation
|
||||||
|
==========
|
||||||
|
|
||||||
|
Currently, the Python bytecode interpreter has various optimizations
|
||||||
|
for calling instances of ``builtin_function_or_method``,
|
||||||
|
``method_descriptor``, ``method`` and ``function``.
|
||||||
|
However, none of these classes is subclassable.
|
||||||
|
Therefore, these optimizations are not available to
|
||||||
|
user-defined extension types.
|
||||||
|
|
||||||
|
If this PEP is implemented, then the checks
|
||||||
|
for ``builtin_function_or_method`` and ``method_descriptor``
|
||||||
|
could be replaced by simply checking for and using the C call protocol.
|
||||||
|
This simplifies existing code.
|
||||||
|
|
||||||
|
We also design the C call protocol such that it can easily
|
||||||
|
be extended with new features in the future.
|
||||||
|
|
||||||
|
This protocol replaces the use of ``PyMethodDef`` pointers
|
||||||
|
in instances of ``builtin_function_or_method`` for example.
|
||||||
|
However, ``PyMethodDef`` arrays are still used to construct
|
||||||
|
functions/methods but no longer for calling them.
|
||||||
|
|
||||||
|
For more background and motivation, see PEP 579.
|
||||||
|
|
||||||
|
|
||||||
|
New data structures
|
||||||
|
===================
|
||||||
|
|
||||||
|
The ``PyTypeObject`` structure gains a new field ``Py_ssize_t tp_ccalloffset``
|
||||||
|
and a new flag ``Py_TPFLAGS_HAVE_CCALL``.
|
||||||
|
If this flag is set, then ``tp_ccalloffset`` is assumed to be a valid
|
||||||
|
offset inside the object structure (similar to ``tp_weaklistoffset``).
|
||||||
|
It must be a strictly positive integer.
|
||||||
|
At that offset, a ``PyCMethodDef`` structure appears::
|
||||||
|
|
||||||
|
    typedef struct {
        PyCCallDef *cm_ccall;
        PyObject *cm_self;  /* __self__ argument for methods */
    } PyCMethodDef;
|
||||||
|
|
||||||
|
The ``PyCCallDef`` structure contains everything needed to describe how
|
||||||
|
the function can be called::
|
||||||
|
|
||||||
|
    typedef struct {
        uint32_t cc_flags;
        PyCFunction cc_func;  /* C function to call */
        PyObject *cc_name;    /* str object */
        PyObject *cc_parent;  /* class or module */
    } PyCCallDef;
|
||||||
|
|
||||||
|
The reason for putting ``__self__`` outside of ``PyCCallDef``
|
||||||
|
is that ``PyCCallDef`` is not meant to be changed after creating the function.
|
||||||
|
A single ``PyCCallDef`` can be shared
|
||||||
|
by an unbound method and multiple bound methods.
|
||||||
|
This wouldn't work if we put ``__self__`` inside that structure.
|
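
The sharing argument can be modelled in pure Python.
This is only a sketch: the dataclasses ``CCallDef`` and ``CMethodDef`` are hypothetical stand-ins for the C structures, with ``None`` playing the role of a NULL ``cm_self``.

```python
from dataclasses import FrozenInstanceError, dataclass
from typing import Any, Callable, Optional


@dataclass(frozen=True)
class CCallDef:
    """Immutable call description, shared between function objects."""
    cc_flags: int
    cc_func: Callable[..., Any]
    cc_name: str
    cc_parent: Any


@dataclass
class CMethodDef:
    """Per-object part: a pointer to the shared CCallDef plus __self__."""
    cm_ccall: CCallDef
    cm_self: Optional[Any] = None  # None models NULL (unbound)


append_def = CCallDef(cc_flags=0, cc_func=list.append,
                      cc_name="append", cc_parent=list)

first, second = [], []
unbound = CMethodDef(append_def)
bound_first = CMethodDef(append_def, first)
bound_second = CMethodDef(append_def, second)

# Binding never touches the shared description.
try:
    append_def.cc_name = "other"
except FrozenInstanceError:
    description_is_immutable = True
else:
    description_is_immutable = False
```

One ``CCallDef`` serves the unbound method and both bound methods; only ``cm_self`` differs between them.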
||||||
|
|
||||||
|
**NOTE**: unlike ``tp_dictoffset`` we do not allow negative numbers
|
||||||
|
for ``tp_ccalloffset`` to mean counting from the end.
|
||||||
|
There does not seem to be a use case for it and it would only complicate
|
||||||
|
the implementation.
|
||||||
|
|
||||||
|
**NOTE**: in the reference implementation, ``tp_ccalloffset`` actually
|
||||||
|
replaces ``tp_print`` and ``Py_TPFLAGS_HAVE_CCALL`` is *not*
|
||||||
|
added to ``Py_TPFLAGS_DEFAULT``.
|
||||||
|
The latter ensures full backwards compatibility for existing
|
||||||
|
extension modules setting ``tp_print``.
|
||||||
|
It also means that we can require that ``tp_ccalloffset`` is a valid
|
||||||
|
offset when ``Py_TPFLAGS_HAVE_CCALL`` is specified:
|
||||||
|
we do not need to check ``tp_ccalloffset != 0``.
|
||||||
|
In future Python versions, we may decide that ``tp_print``
|
||||||
|
becomes ``tp_ccalloffset`` unconditionally,
|
||||||
|
drop the ``Py_TPFLAGS_HAVE_CCALL`` flag and instead check for
|
||||||
|
``tp_ccalloffset != 0``.
|
||||||
|
|
||||||
|
Parent
|
||||||
|
------
|
||||||
|
|
||||||
|
The ``cc_parent`` field (accessed for example by a ``__parent__``
|
||||||
|
or ``__objclass__`` descriptor from Python code) can be any Python object.
|
||||||
|
For methods of extension types, this is set to the class.
|
||||||
|
For functions of modules, this is set to the module.
|
||||||
|
|
||||||
|
The parent serves multiple purposes: for methods of extension types,
|
||||||
|
it is used for type checks like the following::
|
||||||
|
|
||||||
    >>> list.append({}, "x")
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    TypeError: descriptor 'append' requires a 'list' object but received a 'dict'
|
||||||
|
|
||||||
|
PEP 573 specifies that every function should have access to the
|
||||||
|
module in which it is defined.
|
||||||
|
For functions of a module, this is given by the parent.
|
||||||
|
For methods, this works indirectly through the class,
|
||||||
|
assuming that the class has a pointer to the module.
|
||||||
|
|
||||||
|
The parent would also typically be used to implement ``__qualname__``.
|
||||||
|
|
||||||
|
Custom classes are free to set ``cc_parent`` to whatever they want.
|
||||||
|
It is only used by the C call protocol if the ``CCALL_OBJCLASS`` flag is set.
|
||||||
|
|
||||||
|
|
||||||
|
The C call protocol
|
||||||
|
===================
|
||||||
|
|
||||||
|
We say that a class implements the C call protocol
|
||||||
|
if it has the ``Py_TPFLAGS_HAVE_CCALL`` flag set
|
||||||
|
(as explained above, it must then set ``tp_ccalloffset > 0``).
|
||||||
|
Such a class must implement ``__call__`` as described in this section
|
||||||
|
(in practice, this just means setting ``tp_call`` to ``PyCCall_Call``).
|
||||||
|
|
||||||
|
The ``cc_func`` field is a C function pointer.
|
||||||
|
Its precise signature depends on flags.
|
||||||
|
Below are the possible values for ``cc_flags & CCALL_SIGNATURE``
|
||||||
|
together with the arguments that the C function takes.
|
||||||
|
The return value is always ``PyObject *``.
|
||||||
|
The following are completely analogous to the existing ``PyMethodDef``
|
||||||
|
signature flags:
|
||||||
|
|
||||||
|
- ``CCALL_VARARGS``: ``cc_func(PyObject *self, PyObject *args)``
|
||||||
|
|
||||||
|
- ``CCALL_VARARGS | CCALL_KEYWORDS``: ``cc_func(PyObject *self, PyObject *args, PyObject *kwds)``
|
||||||
|
|
||||||
|
- ``CCALL_FASTCALL``: ``cc_func(PyObject *self, PyObject *const *args, Py_ssize_t nargs)``
|
||||||
|
|
||||||
|
- ``CCALL_FASTCALL | CCALL_KEYWORDS``: ``cc_func(PyObject *self, PyObject *const *args, Py_ssize_t nargs, PyObject *kwnames)``
|
||||||
|
|
||||||
|
- ``CCALL_NOARGS``: ``cc_func(PyObject *self, PyObject *unused)``
|
||||||
|
|
||||||
|
- ``CCALL_O``: ``cc_func(PyObject *self, PyObject *arg)``
|
||||||
|
|
||||||
|
The flag ``CCALL_FUNCARG`` may be combined with any of these.
|
||||||
|
If so, the C function takes an additional argument as first argument
|
||||||
|
which is the function object (the ``self`` in ``__call__``).
|
||||||
|
For example, we have the following signature:
|
||||||
|
|
||||||
|
- ``CCALL_FUNCARG | CCALL_VARARGS``: ``cc_func(PyObject *func, PyObject *self, PyObject *args)``
|
||||||
|
|
||||||
|
**NOTE**: unlike the existing ``METH_...`` flags,
|
||||||
|
the ``CCALL_...`` constants do not necessarily represent single bits.
|
||||||
|
So checking ``(cc_flags & CCALL_VARARGS) != 0`` is not a valid way
|
||||||
|
for checking the signature.
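
A small model shows why a single-bit test fails.
The numeric values below are hypothetical, chosen only so that the constants are not single bits; the actual values are not specified in this text.

```python
# Hypothetical numeric values, for illustration only.
CCALL_VARARGS = 0x0
CCALL_FASTCALL = 0x1
CCALL_O = 0x2
CCALL_NOARGS = 0x3
CCALL_KEYWORDS = 0x4
CCALL_SIGNATURE = 0x7  # mask selecting the signature part of cc_flags


def has_signature(cc_flags, signature):
    """Valid check: compare the masked flags with the full signature."""
    return (cc_flags & CCALL_SIGNATURE) == signature


def naive_bit_test(cc_flags, flag):
    """Invalid check: treats the constant as if it were a single bit."""
    return (cc_flags & flag) != 0


# With these values, the bit test misses CCALL_VARARGS entirely ...
missed = not naive_bit_test(CCALL_VARARGS, CCALL_VARARGS)
# ... and wrongly reports CCALL_FASTCALL for a CCALL_NOARGS function.
false_positive = naive_bit_test(CCALL_NOARGS, CCALL_FASTCALL)
```

Comparing ``cc_flags & CCALL_SIGNATURE`` against the complete signature value avoids both mistakes.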
|
||||||
|
|
||||||
|
Checking __objclass__
|
||||||
|
---------------------
|
||||||
|
|
||||||
|
If the ``CCALL_OBJCLASS`` flag is set and if ``cm_self`` is NULL
|
||||||
|
(this is the case for unbound methods of extension types),
|
||||||
|
then a type check is done:
|
||||||
|
the function must be called with at least one positional argument
|
||||||
|
and the first (typically called ``self``) must be an instance of
|
||||||
|
``cc_parent`` (which must be a class).
|
||||||
|
If not, a ``TypeError`` is raised.
|
||||||
|
|
||||||
|
Self slicing
|
||||||
|
------------
|
||||||
|
|
||||||
|
If ``cm_self`` is not NULL or if the flag ``CCALL_SLICE_SELF``
|
||||||
|
is not set in ``cc_flags``, then the argument passed as ``self``
|
||||||
|
is simply ``cm_self``.
|
||||||
|
|
||||||
|
If ``cm_self`` is NULL and the flag ``CCALL_SLICE_SELF`` is set,
|
||||||
|
then the first positional argument (if any) is removed from
|
||||||
|
``args`` and instead passed as first argument to the C function.
|
||||||
|
Effectively, the first positional argument is treated as ``__self__``.
|
||||||
|
This is meant to support unbound methods such that the C function does
|
||||||
|
not see the difference between bound and unbound method calls.
|
||||||
|
This does not affect keyword arguments in any way.
|
||||||
|
|
||||||
|
This process is called self slicing and a function is said to have self
|
||||||
|
slicing if ``cm_self`` is NULL and ``CCALL_SLICE_SELF`` is set.
|
||||||
|
|
||||||
|
Note that a ``CCALL_NOARGS`` function with self slicing effectively has
one argument, namely ``self``.
Analogously, a ``CCALL_O`` function with self slicing has two arguments.
|
||||||
|
|
||||||
|
Supporting the LOAD_METHOD/CALL_METHOD opcodes
|
||||||
|
----------------------------------------------
|
||||||
|
|
||||||
|
Classes supporting the C call protocol
|
||||||
|
must implement ``__get__`` in a specific way.
|
||||||
|
This is required to correctly deal with the ``LOAD_METHOD``/``CALL_METHOD`` optimization.
|
||||||
|
If ``func`` supports the C call protocol, then ``func.__get__``
|
||||||
|
must behave as follows:
|
||||||
|
|
||||||
|
- If ``cm_self`` is not NULL, then ``__get__`` must be a no-op
|
||||||
|
in the sense that ``func.__get__(obj, cls)(*args, **kwds)``
|
||||||
|
behaves exactly the same as ``func(*args, **kwds)``.
|
||||||
|
It is also allowed for ``__get__`` to not be implemented at all.
|
||||||
|
|
||||||
|
- If ``cm_self`` is NULL, then ``func.__get__(obj, cls)(*args, **kwds)``
|
||||||
|
(with ``obj`` not None)
|
||||||
|
must be equivalent to ``func(obj, *args, **kwds)``.
|
||||||
|
Note that this is unrelated to self slicing: ``obj`` may be passed
|
||||||
|
as ``self`` argument to the C function or it may be the first positional argument.
|
||||||
|
|
||||||
|
- If ``cm_self`` is NULL, then ``func.__get__(None, cls)(*args, **kwds)``
|
||||||
|
must be equivalent to ``func(*args, **kwds)``.
|
||||||
|
|
||||||
|
There are no restrictions on the object ``func.__get__(obj, cls)``.
|
||||||
|
The latter is not required to implement the C call protocol for example.
|
||||||
|
It only specifies what ``func.__get__(obj, cls).__call__`` does.
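
One way to satisfy the three ``__get__`` rules is sketched below in pure Python.
The class ``CCallLike`` and the names ``describe`` and ``Host`` are hypothetical; ``_NULL`` stands in for a C NULL pointer, and returning the same object for the no-op cases is just one permitted choice.

```python
_NULL = object()  # stands in for a C NULL pointer


class CCallLike:
    """Pure-Python model of an object following the __get__ rules."""

    def __init__(self, cc_func, cm_self=_NULL):
        self.cc_func = cc_func
        self.cm_self = cm_self

    def __call__(self, *args, **kwds):
        if self.cm_self is _NULL:
            return self.cc_func(*args, **kwds)
        return self.cc_func(self.cm_self, *args, **kwds)

    def __get__(self, obj, cls=None):
        if self.cm_self is not _NULL or obj is None:
            # Already bound, or accessed on the class: act as a no-op.
            return self
        # Unbound and accessed on an instance: bind obj as self.
        return CCallLike(self.cc_func, obj)


def describe(self, value):
    return (self, value)


class Host:
    meth = CCallLike(describe)


host = Host()
bound = host.meth
```

Calling ``host.meth(1)`` and ``Host.meth(host, 1)`` gives the same result, and calling ``__get__`` again on ``bound`` changes nothing.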
|
||||||
|
|
||||||
|
For classes that do not care about ``__self__`` and ``__get__`` at all,
|
||||||
|
the easiest solution is to assign ``cm_self = Py_None``
|
||||||
|
(or any other non-NULL value).
|
||||||
|
|
||||||
|
Generic API functions
|
||||||
|
---------------------
|
||||||
|
|
||||||
|
The following C API functions are added:
|
||||||
|
|
||||||
|
- ``int PyCCall_Check(PyObject *op)``:
|
||||||
|
return true if ``op`` implements the C call protocol.
|
||||||
|
|
||||||
|
- ``PyObject * PyCCall_Call(PyObject *func, PyObject *args, PyObject *kwds)``:
|
||||||
|
call ``func`` (which must implement the C call protocol)
|
||||||
|
with positional arguments ``args`` and keyword arguments ``kwds``
|
||||||
|
(``kwds`` may be NULL).
|
||||||
|
This function is meant to be put in the ``tp_call`` slot.
|
||||||
|
|
||||||
|
- ``PyObject * PyCCall_FastCall(PyObject *func, PyObject *const *args, Py_ssize_t nargs, PyObject *kwds)``:
|
||||||
|
call ``func`` (which must implement the C call protocol)
|
||||||
|
with ``nargs`` positional arguments given by ``args[0]``, …, ``args[nargs-1]``.
|
||||||
|
The parameter ``kwds`` can be NULL (no keyword arguments),
|
||||||
|
a dict with ``name:value`` items or a tuple with keyword names.
|
||||||
|
In the latter case, the keyword values are stored in the ``args``
|
||||||
|
array, starting at ``args[nargs]``.
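
The argument layout used when ``kwds`` is a tuple of names can be sketched
as follows.
This is a standalone illustration, not the proposed API:
``toy_lookup_kwarg`` is a hypothetical helper, and C strings stand in for
``PyObject *`` values and keyword names.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, NOT part of CPython.
 * args holds nargs positional values followed by nkw keyword values;
 * kwnames[i] is the name belonging to args[nargs + i].
 * Returns the value for `name`, or NULL if it was not passed. */
static const char *toy_lookup_kwarg(const char *const *args, size_t nargs,
                                    const char *const *kwnames, size_t nkw,
                                    const char *name)
{
    assert(args != NULL && name != NULL);
    for (size_t i = 0; i < nkw; i++) {
        if (strcmp(kwnames[i], name) == 0) {
            /* Keyword values start right after the positional ones. */
            return args[nargs + i];
        }
    }
    return NULL;
}
```

For a call like ``f(1, 2, color="red", size=10)``, the model uses
``args = {"1", "2", "red", "10"}``, ``nargs = 2`` and
``kwnames = {"color", "size"}``; looking up ``"size"`` then yields ``args[3]``.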
|
||||||
|
|
||||||
|
Profiling
|
||||||
|
---------
|
||||||
|
|
||||||
|
A flag ``CCALL_PROFILE`` is added to control profiling [#setprofile]_.
|
||||||
|
If this flag is set, then the profiling events
|
||||||
|
``c_call``, ``c_return`` and ``c_exception`` are generated.
|
||||||
|
When an unbound method is called
|
||||||
|
(``cm_self`` is NULL and ``CCALL_SLICE_SELF`` is set),
|
||||||
|
the argument to the profiling function is the corresponding bound method
|
||||||
|
(obtained by calling ``__get__``).
|
||||||
|
This is meant for backwards compatibility and to simplify
|
||||||
|
the implementation of the profiling function.
|
||||||
|
|
||||||
|
|
||||||
|
Changes to built-in functions and methods
|
||||||
|
=========================================
|
||||||
|
|
||||||
|
The reference implementation of this PEP changes
|
||||||
|
the existing classes ``builtin_function_or_method`` and ``method_descriptor``
|
||||||
|
to use the C call protocol.
|
||||||
|
In both cases, the ``PyCCallDef`` structure is simply stored
|
||||||
|
as part of the object structure.
|
||||||
|
So, these are the new layouts of ``PyCFunctionObject`` and ``PyMethodDescrObject``::
|
||||||
|
|
||||||
|
typedef struct {
|
||||||
|
PyObject_HEAD
|
||||||
|
PyCCallDef *m_ccall;
|
||||||
|
PyObject *m_self;
|
||||||
|
PyObject *m_module;
|
||||||
|
PyObject *m_weakreflist;
|
||||||
|
PyCCallDef _ccalldef;
|
||||||
|
} PyCFunctionObject;
|
||||||
|
|
||||||
|
typedef struct {
|
||||||
|
PyObject_HEAD
|
||||||
|
PyCCallDef *md_ccall;
|
||||||
|
PyObject *md_self; /* Always NULL */
|
||||||
|
PyObject *md_qualname;
|
||||||
|
PyCCallDef _ccalldef;
|
||||||
|
} PyMethodDescrObject;
|
||||||
|
|
||||||
|
For functions of a module, ``m_ccall`` would point to the ``_ccalldef``
|
||||||
|
field.
|
||||||
|
For bound methods, ``m_ccall`` would point to the ``PyCCallDef``
|
||||||
|
of the unbound method.
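
The two ways of setting ``m_ccall`` can be modelled with simplified
structures.
The following is a minimal sketch, assuming a reduced ``ToyCCallDef`` with
only a flags field; the ``Toy*`` types and ``toy_init_*`` functions are
hypothetical and are not taken from the reference implementation.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical reduced stand-ins for PyCCallDef and PyCFunctionObject. */
typedef struct {
    int cc_flags;
} ToyCCallDef;

typedef struct {
    ToyCCallDef *m_ccall;    /* definition actually used for calls */
    void *m_self;            /* NULL unless bound */
    ToyCCallDef _ccalldef;   /* inline storage, used by module functions */
} ToyFunction;

/* Function of a module: m_ccall points to the object's own _ccalldef. */
static void toy_init_module_function(ToyFunction *f, int flags)
{
    assert(f != NULL);
    f->_ccalldef.cc_flags = flags;
    f->m_ccall = &f->_ccalldef;
    f->m_self = NULL;
}

/* Bound method: m_ccall points to the definition owned by the unbound
 * method, so the inline _ccalldef is left unused. */
static void toy_init_bound_method(ToyFunction *bound,
                                  ToyCCallDef *unbound_def, void *self)
{
    assert(bound != NULL && unbound_def != NULL && self != NULL);
    bound->_ccalldef.cc_flags = 0;
    bound->m_ccall = unbound_def;
    bound->m_self = self;
}
```

In this model a bound method does not copy the call definition;
it only refers to the one stored in the unbound method,
which must therefore outlive the bound method.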
|
||||||
|
|
||||||
|
**NOTE**: the new layout of ``PyMethodDescrObject`` changes it
|
||||||
|
such that it no longer starts with ``PyDescr_COMMON``.
|
||||||
|
This is really an implementation detail and it should cause few (if any)
|
||||||
|
compatibility problems.
|
||||||
|
|
||||||
|
|
||||||
|
Inheritance
|
||||||
|
===========
|
||||||
|
|
||||||
|
Extension types inherit the type flag ``Py_TPFLAGS_HAVE_CCALL``
|
||||||
|
and the value ``tp_ccalloffset`` from the base class,
|
||||||
|
provided that they implement ``tp_call`` and ``tp_descr_get``
|
||||||
|
the same way as the base class.
|
||||||
|
Heap types never inherit the C call protocol because
|
||||||
|
that would not be safe (heap types can be changed dynamically).
|
||||||
|
|
||||||
|
|
||||||
|
Backwards compatibility
|
||||||
|
=======================
|
||||||
|
|
||||||
|
There should be no difference at all for the Python interface,
|
||||||
|
nor for the documented C API
|
||||||
|
(in the sense that all functions remain supported with the same functionality).
|
||||||
|
|
||||||
|
So the only potential breakage is with C code accessing the
|
||||||
|
internals of ``PyCFunctionObject`` and ``PyMethodDescrObject``.
|
||||||
|
We expect very few problems because of this.
|
||||||
|
|
||||||
|
|
||||||
|
Rationale
|
||||||
|
=========
|
||||||
|
|
||||||
|
Why is this better than PEP 575?
|
||||||
|
--------------------------------
|
||||||
|
|
||||||
|
One of the major complaints about PEP 575 was that it was coupling
|
||||||
|
functionality (the calling and introspection protocol)
|
||||||
|
with the class hierarchy:
|
||||||
|
a class could only benefit from the new features
|
||||||
|
if it was a subclass of ``base_function``.
|
||||||
|
It may be difficult for existing classes to do that
|
||||||
|
because they may have other constraints on the layout of the C object structure,
|
||||||
|
coming from an existing base class or implementation details.
|
||||||
|
For example, ``functools.lru_cache`` cannot implement PEP 575 as-is.
|
||||||
|
|
||||||
|
It also complicated the implementation precisely because changes
|
||||||
|
were needed both in the implementation details and in the class hierarchy.
|
||||||
|
|
||||||
|
The current PEP does not have these problems.
|
||||||
|
|
||||||
|
Why store the function pointer in the instance?
|
||||||
|
-----------------------------------------------
|
||||||
|
|
||||||
|
The actual information needed for calling an object
|
||||||
|
is stored in the instance (in the ``PyCCallDef`` structure)
|
||||||
|
instead of the class.
|
||||||
|
This is different from the ``tp_call`` slot or earlier attempts
|
||||||
|
at implementing a ``tp_fastcall`` slot [#bpo29259]_.
|
||||||
|
|
||||||
|
The main use case is built-in functions and methods.
|
||||||
|
For those, the C function to be called does depend on the instance.
|
||||||
|
|
||||||
|
However, the current protocol makes it easy to support the case
|
||||||
|
where the same C function is called for all instances:
|
||||||
|
just use a single static ``PyCCallDef`` structure for every instance.
|
||||||
|
|
||||||
|
|
||||||
|
Reference implementation
|
||||||
|
========================
|
||||||
|
|
||||||
|
Work in progress.
|
||||||
|
|
||||||
|
|
||||||
|
References
|
||||||
|
==========
|
||||||
|
|
||||||
|
.. [#setprofile] ``sys.setprofile`` documentation,
|
||||||
|
https://docs.python.org/3.8/library/sys.html#sys.setprofile
|
||||||
|
|
||||||
|
.. [#bpo29259] Add tp_fastcall to PyTypeObject: support FASTCALL calling convention for all callable objects,
|
||||||
|
https://bugs.python.org/issue29259
|
||||||
|
|
||||||
|
Copyright
|
||||||
|
=========
|
||||||
|
|
||||||
|
This document has been placed in the public domain.
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
..
|
||||||
|
Local Variables:
|
||||||
|
mode: indented-text
|
||||||
|
indent-tabs-mode: nil
|
||||||
|
sentence-end-double-space: t
|
||||||
|
fill-column: 70
|
||||||
|
coding: utf-8
|
||||||
|
End:
|