PEP 580: minor update (#781)

jdemeyer 2018-09-21 23:31:11 +02:00 committed by Brett Cannon
parent bde4ef1f98
commit f9e94be1a7
1 changed files with 90 additions and 41 deletions

@@ -15,7 +15,7 @@ Abstract
A new "C call" protocol is proposed. A new "C call" protocol is proposed.
It is meant for classes representing functions or methods It is meant for classes representing functions or methods
which need to implement fast calling. which need to implement fast calling.
The goal is to generalize existing optimizations for built-in functions The goal is to generalize all existing optimizations for built-in functions
to arbitrary extension types. to arbitrary extension types.
In the reference implementation, In the reference implementation,
@@ -53,7 +53,7 @@ Basic idea
Currently, CPython has multiple optimizations for fast calling Currently, CPython has multiple optimizations for fast calling
for a few specific function classes. for a few specific function classes.
Calling instances of these classes using a plain ``tp_call`` is slower Calling instances of these classes using the ``tp_call`` slot is slower
than using the optimizations. than using the optimizations.
The basic idea of this PEP is to allow user-defined extension types The basic idea of this PEP is to allow user-defined extension types
(not Python classes) to use these optimizations also, (not Python classes) to use these optimizations also,
@@ -74,7 +74,7 @@ giving an offset to a ``PyCCallDef *`` in the object structure
and a flag ``Py_TPFLAGS_HAVE_CCALL`` indicating that ``tp_ccalloffset`` is valid. and a flag ``Py_TPFLAGS_HAVE_CCALL`` indicating that ``tp_ccalloffset`` is valid.
Third, since we want to deal efficiently with unbound and bound methods too Third, since we want to deal efficiently with unbound and bound methods too
(as opposed to only plain functions), we need to handle ``__self__`` too: (as opposed to only plain functions), we need to handle ``__self__`` in the protocol:
after the ``PyCCallDef *`` in the object structure, after the ``PyCCallDef *`` in the object structure,
there is a ``PyObject *self`` field. there is a ``PyObject *self`` field.
These two fields together are referred to as a ``PyCCallRoot`` structure. These two fields together are referred to as a ``PyCCallRoot`` structure.
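As a rough illustration of the layout described in this hunk, the following C sketch locates a ``PyCCallRoot``-like pair of fields through a byte offset. It does not use the real CPython headers: ``object_stub``, ``calldef_stub``, ``ccall_root``, ``my_func_object``, ``MY_FUNC_CCALLOFFSET`` and ``get_ccall_root`` are hypothetical stand-ins, and the field name ``cr_ccall`` is an assumption (only ``cr_self`` is named in the text shown here).

```c
#include <stddef.h>

/* Hypothetical stand-ins for PyObject and PyCCallDef (not the real types). */
typedef struct object_stub object_stub;
typedef struct calldef_stub calldef_stub;

/* The two adjacent fields that the text calls a PyCCallRoot. */
typedef struct {
    calldef_stub *cr_ccall;  /* assumed name for the PyCCallDef pointer */
    object_stub *cr_self;    /* the __self__ object, or NULL */
} ccall_root;

/* A hypothetical extension object embedding the root after its own data. */
typedef struct {
    long refcount;
    void *type;
    int private_data;
    ccall_root root;
} my_func_object;

/* What such a type would store in tp_ccalloffset: a strictly positive offset. */
enum { MY_FUNC_CCALLOFFSET = offsetof(my_func_object, root) };

/* Find the root of an arbitrary object, given the offset stored in its type. */
ccall_root *get_ccall_root(void *obj, ptrdiff_t ccalloffset)
{
    return (ccall_root *)((char *)obj + ccalloffset);
}
```

Because the lookup only needs the offset, generic calling code would not have to know the concrete object structure.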
@@ -89,7 +89,7 @@ New data structures
The ``PyTypeObject`` structure gains a new field ``Py_ssize_t tp_ccalloffset`` The ``PyTypeObject`` structure gains a new field ``Py_ssize_t tp_ccalloffset``
and a new flag ``Py_TPFLAGS_HAVE_CCALL``. and a new flag ``Py_TPFLAGS_HAVE_CCALL``.
If this flag is set, then ``tp_ccalloffset`` is assumed to be a valid If this flag is set, then ``tp_ccalloffset`` is assumed to be a valid
offset inside the object structure (similar to ``tp_weaklistoffset``). offset inside the object structure (similar to ``tp_dictoffset`` and ``tp_weaklistoffset``).
It must be a strictly positive integer. It must be a strictly positive integer.
At that offset, a ``PyCCallRoot`` structure appears:: At that offset, a ``PyCCallRoot`` structure appears::
@@ -171,6 +171,15 @@ becomes ``tp_ccalloffset`` unconditionally,
drop the ``Py_TPFLAGS_HAVE_CCALL`` flag and instead check for drop the ``Py_TPFLAGS_HAVE_CCALL`` flag and instead check for
``tp_ccalloffset != 0``. ``tp_ccalloffset != 0``.
**NOTE**: the exact layout of ``PyTypeObject`` is not part of the stable ABI ([#pep384]_).
Therefore, changing the ``tp_print`` field from a ``printfunc`` (a function pointer)
to a ``Py_ssize_t`` should not be a problem,
even if this changes the memory layout of the ``PyTypeObject`` structure.
Moreover, on all systems for which binaries are commonly built
(Windows, Linux, macOS),
the size of ``printfunc`` and ``Py_ssize_t`` are the same,
so the issue of binary compatibility will not come up anyway.
The C call protocol The C call protocol
=================== ===================
@@ -184,6 +193,8 @@ Such a class must implement ``__call__`` as described in this section
The ``cc_func`` field is a C function pointer, The ``cc_func`` field is a C function pointer,
which plays the same role as the existing ``ml_meth`` field of ``PyMethodDef``. which plays the same role as the existing ``ml_meth`` field of ``PyMethodDef``.
Its precise signature depends on flags. Its precise signature depends on flags.
The subset of flags influencing the signature of ``cc_func``
is given by the bitmask ``CCALL_SIGNATURE``.
Below are the possible values for ``cc_flags & CCALL_SIGNATURE`` Below are the possible values for ``cc_flags & CCALL_SIGNATURE``
together with the arguments that the C function takes. together with the arguments that the C function takes.
The return value is always ``PyObject *``. The return value is always ``PyObject *``.
@@ -195,6 +206,7 @@ signature flags:
- ``CCALL_VARARGS | CCALL_KEYWORDS``: - ``CCALL_VARARGS | CCALL_KEYWORDS``:
``cc_func(PyObject *self, PyObject *args, PyObject *kwds)`` ``cc_func(PyObject *self, PyObject *args, PyObject *kwds)``
(``kwds`` is either ``NULL`` or a dict; this dict must not be modified by the callee)
- ``CCALL_FASTCALL``: - ``CCALL_FASTCALL``:
``cc_func(PyObject *self, PyObject *const *args, Py_ssize_t nargs)`` ``cc_func(PyObject *self, PyObject *const *args, Py_ssize_t nargs)``
@@ -212,8 +224,7 @@ signature flags:
The flag ``CCALL_FUNCARG`` may be combined with any of these. The flag ``CCALL_FUNCARG`` may be combined with any of these.
If so, the C function takes an additional argument If so, the C function takes an additional argument
as first argument before ``self``. as first argument before ``self``.
This argument is used to pass the function object This argument is used to pass the function object (see NOTE 1 below).
(the ``self`` in ``__call__`` but see NOTE below).
For example, we have the following signature: For example, we have the following signature:
- ``CCALL_FUNCARG | CCALL_VARARGS``: - ``CCALL_FUNCARG | CCALL_VARARGS``:
@@ -225,8 +236,9 @@ the ``unused`` argument is dropped, so the signature becomes
- ``CCALL_FUNCARG | CCALL_NOARGS``: - ``CCALL_FUNCARG | CCALL_NOARGS``:
``cc_func(PyObject *func, PyObject *self)`` ``cc_func(PyObject *func, PyObject *self)``
**NOTE**: in the case of bound methods, it is currently unspecified **NOTE 1**: with "function object", we mean the ``self`` in ``__call__``.
whether the "function object" in the paragraph above refers In the case of bound methods, it is currently unspecified
whether this refers
to the bound method or the original function (which is wrapped by the bound method). to the bound method or the original function (which is wrapped by the bound method).
In the reference implementation, the bound method is passed. In the reference implementation, the bound method is passed.
In the future, this may change to the wrapped function. In the future, this may change to the wrapped function.
@@ -234,12 +246,18 @@ Despite this ambiguity, the implementation of bound methods
guarantees that ``PyCCall_CCALLDEF(func)`` guarantees that ``PyCCall_CCALLDEF(func)``
points to the ``PyCCallDef`` of the original function. points to the ``PyCCallDef`` of the original function.
**NOTE**: unlike the existing ``METH_...`` flags, **NOTE 2**: unlike the existing ``METH_...`` flags,
the ``CCALL_...`` constants do not necessarily represent single bits. the ``CCALL_...`` constants do not necessarily represent single bits.
So checking ``(cc_flags & CCALL_VARARGS) == 0`` is not a valid way So checking ``if (cc_flags & CCALL_VARARGS)`` is not a valid way
for checking the signature. for checking the signature.
There are also no guarantees of binary compatibility There are also no guarantees of binary compatibility for these flags
between Python versions for these flags. between Python versions.
This allows the implementation to choose the most efficient
numerical values of the flags.
In the reference implementation,
the legal values for ``cc_flags & CCALL_SIGNATURE`` form exactly the interval [0, …, 11].
This means that the compiler can easily
optimize a ``switch`` statement for those cases using a computed goto.
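The point of NOTE 2 can be sketched in C as follows. The numeric values below are hypothetical and chosen only for illustration (they are not claimed to match the reference implementation), ``CCALL_FUNCARG`` is left out for brevity, and the helper names ``is_varargs_bit_test``, ``is_varargs`` and ``takes_kwds`` are invented for this example.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical numeric values, for illustration only. */
enum {
    CCALL_VARARGS   = 0,
    CCALL_FASTCALL  = 1,
    CCALL_O         = 2,
    CCALL_NOARGS    = 3,
    CCALL_KEYWORDS  = 4,  /* only combined with VARARGS or FASTCALL here */
    CCALL_SIGNATURE = 7   /* mask selecting the signature part of cc_flags */
};

static_assert(((CCALL_FASTCALL | CCALL_KEYWORDS) & ~CCALL_SIGNATURE) == 0,
              "the mask must cover every signature value");

/* Not valid: a bit test. With CCALL_VARARGS == 0 it can never succeed. */
int is_varargs_bit_test(uint32_t cc_flags)
{
    return (cc_flags & CCALL_VARARGS) != 0;
}

/* Valid: extract the signature and compare it against the whole value. */
int is_varargs(uint32_t cc_flags)
{
    return (cc_flags & CCALL_SIGNATURE) == CCALL_VARARGS;
}

/* Dense signature values allow the compiler to turn a switch like this
   into a jump table. */
int takes_kwds(uint32_t cc_flags)
{
    switch (cc_flags & CCALL_SIGNATURE) {
    case CCALL_VARARGS | CCALL_KEYWORDS:
    case CCALL_FASTCALL | CCALL_KEYWORDS:
        return 1;
    default:
        return 0;
    }
}
```

With these assumed values, the bit test rejects a plain ``CCALL_VARARGS`` function, while the masked comparison classifies it correctly.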
Checking __objclass__ Checking __objclass__
--------------------- ---------------------
@@ -261,7 +279,7 @@ is simply ``cr_self``.
If ``cr_self`` is NULL and the flag ``CCALL_SELFARG`` is set, If ``cr_self`` is NULL and the flag ``CCALL_SELFARG`` is set,
then the first positional argument is removed from then the first positional argument is removed from
``args`` and instead passed as first argument to the C function. ``args`` and instead passed as ``self`` argument to the C function.
Effectively, the first positional argument is treated as ``__self__``. Effectively, the first positional argument is treated as ``__self__``.
If there are no positional arguments, ``TypeError`` is raised. If there are no positional arguments, ``TypeError`` is raised.
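The ``CCALL_SELFARG`` rule in this hunk can be sketched in plain C, without the CPython headers. The names ``resolved_call`` and ``resolve_self`` are hypothetical, ``void *`` stands in for ``PyObject *``, and a return value of -1 marks the case where the protocol would raise ``TypeError``. The sketch assumes that in every other case ``cr_self`` is passed through unchanged.

```c
#include <stddef.h>

typedef struct {
    void *self;          /* value passed as self to the C function */
    void *const *args;   /* remaining positional arguments */
    ptrdiff_t nargs;     /* number of remaining positional arguments */
} resolved_call;

/* Returns 0 on success and -1 where the protocol would raise TypeError. */
int resolve_self(void *cr_self, int has_selfarg,
                 void *const *args, ptrdiff_t nargs, resolved_call *out)
{
    if (cr_self == NULL && has_selfarg) {
        if (nargs == 0) {
            return -1;  /* no positional argument to use as self */
        }
        /* The first positional argument becomes self. */
        out->self = args[0];
        out->args = args + 1;
        out->nargs = nargs - 1;
        return 0;
    }
    /* Assumed: in every other case cr_self is passed through unchanged. */
    out->self = cr_self;
    out->args = args;
    out->nargs = nargs;
    return 0;
}
```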
@@ -277,12 +295,17 @@ Descriptor behavior
Classes supporting the C call protocol Classes supporting the C call protocol
must implement the descriptor protocol in a specific way. must implement the descriptor protocol in a specific way.
This is required for an efficient implementation of bound methods: This is required for an efficient implementation of bound methods:
it allows sharing the ``PyCCallDef`` structure between bound and unbound methods. if other code can make assumptions on what ``__get__`` does,
It is also needed for a correct implementation of ``_PyObject_GetMethod`` it enables optimizations which would not be possible otherwise.
In particular, we want to allow sharing
the ``PyCCallDef`` structure between bound and unbound methods.
We also need a correct implementation of ``_PyObject_GetMethod``
which is used by the ``LOAD_METHOD``/``CALL_METHOD`` optimization. which is used by the ``LOAD_METHOD``/``CALL_METHOD`` optimization.
First of all, if ``func`` supports the C call protocol, First of all, if ``func`` supports the C call protocol,
then ``func.__set__`` must not be implemented. then ``func.__set__`` and ``func.__delete__`` must not be implemented.
Second, ``func.__get__`` must behave as follows: Second, ``func.__get__`` must behave as follows:
@@ -295,7 +318,7 @@ Second, ``func.__get__`` must behave as follows:
(with ``obj`` not None) (with ``obj`` not None)
must be equivalent to ``func(obj, *args, **kwds)``. must be equivalent to ``func(obj, *args, **kwds)``.
In particular, ``__get__`` must be implemented in this case. In particular, ``__get__`` must be implemented in this case.
Note that this is unrelated to self slicing: ``obj`` may be passed This is unrelated to `self slicing`_: ``obj`` may be passed
as ``self`` argument to the C function or it may be the first positional argument. as ``self`` argument to the C function or it may be the first positional argument.
- If ``cr_self`` is NULL, then ``func.__get__(None, cls)(*args, **kwds)`` - If ``cr_self`` is NULL, then ``func.__get__(None, cls)(*args, **kwds)``
@@ -303,14 +326,14 @@ Second, ``func.__get__`` must behave as follows:
There are no restrictions on the object ``func.__get__(obj, cls)``. There are no restrictions on the object ``func.__get__(obj, cls)``.
The latter is not required to implement the C call protocol for example. The latter is not required to implement the C call protocol for example.
It only specifies what ``func.__get__(obj, cls).__call__`` does. We only specify what ``func.__get__(obj, cls).__call__`` does.
For classes that do not care about ``__self__`` and ``__get__`` at all, For classes that do not care about ``__self__`` and ``__get__`` at all,
the easiest solution is to assign ``cr_self = Py_None`` the easiest solution is to assign ``cr_self = Py_None``
(or any other non-NULL value). (or any other non-NULL value).
__name__ attribute The __name__ attribute
------------------ ----------------------
The C call protocol requires that the function has a ``__name__`` The C call protocol requires that the function has a ``__name__``
attribute which is of type ``str`` (not a subclass). attribute which is of type ``str`` (not a subclass).
@@ -340,7 +363,7 @@ In other words, ``PyCCall_Check(func)`` must be true.
and keyword arguments ``kwds`` (``kwds`` may be NULL). and keyword arguments ``kwds`` (``kwds`` may be NULL).
This function is meant to be put in the ``tp_call`` slot. This function is meant to be put in the ``tp_call`` slot.
- ``PyObject *PyCCall_FASTCALL(PyObject *func, PyObject *const *args, Py_ssize_t nargs, PyObject *kwds)``: - ``PyObject *PyCCall_FastCall(PyObject *func, PyObject *const *args, Py_ssize_t nargs, PyObject *kwds)``:
call ``func`` with ``nargs`` positional arguments given by ``args[0]``, …, ``args[nargs-1]``. call ``func`` with ``nargs`` positional arguments given by ``args[0]``, …, ``args[nargs-1]``.
The parameter ``kwds`` can be NULL (no keyword arguments), The parameter ``kwds`` can be NULL (no keyword arguments),
a dict with ``name:value`` items or a tuple with keyword names. a dict with ``name:value`` items or a tuple with keyword names.
@@ -436,9 +459,21 @@ The existing functions ``PyCFunction_New``, ``PyCFunction_NewEx`` and
``PyDescr_NewMethod`` are implemented in terms of ``PyCFunction_ClsNew``. ``PyDescr_NewMethod`` are implemented in terms of ``PyCFunction_ClsNew``.
The undocumented functions ``PyCFunction_GetFlags`` The undocumented functions ``PyCFunction_GetFlags``
and ``PyCFunction_GET_FLAGS`` and ``PyCFunction_GET_FLAGS`` are deprecated.
are removed because it would be non-trivial to support them They are still artificially supported by storing the original ``METH_...``
in a backwards-compatible way. flags in a bitfield inside ``cc_flags``.
Despite the fact that ``PyCFunction_GetFlags`` is technically
part of the stable ABI [#pep384]_,
it is highly unlikely to be used that way:
first of all, it is not even documented.
Second, the flag ``METH_FASTCALL``
is not part of the stable ABI but it is very common
(because of Argument Clinic).
So, if one cannot support ``METH_FASTCALL``,
it is hard to imagine a use case for ``PyCFunction_GetFlags``.
The fact that ``PyCFunction_GET_FLAGS`` and ``PyCFunction_GetFlags``
are not used at all by CPython outside of ``Objects/call.c``
further shows that these functions are not particularly useful.
Inheritance Inheritance
@@ -473,8 +508,10 @@ performance improvements are discussed:
Stable ABI Stable ABI
========== ==========
The function ``PyCFunction_ClsNew`` is added to the stable ABI [#pep384]_.
None of the functions, structures or constants dealing with the C call protocol None of the functions, structures or constants dealing with the C call protocol
are added to the stable ABI [#pep384]_. are added to the stable ABI.
There are two reasons for this: There are two reasons for this:
first of all, the most useful feature of the C call protocol is probably the first of all, the most useful feature of the C call protocol is probably the
@@ -495,20 +532,7 @@ There is no difference at all for the Python interface,
nor for the documented C API nor for the documented C API
(in the sense that all functions remain supported with the same functionality). (in the sense that all functions remain supported with the same functionality).
The removed function ``PyCFunction_GetFlags``, The only potential breakage is with C code
is officially part of the stable ABI [#pep384]_.
However, this is probably an oversight:
first of all, it is not even documented.
Second, the flag ``METH_FASTCALL``
is not part of the stable ABI but it is very common
(because of Argument Clinic).
So, if one cannot support ``METH_FASTCALL``,
it is hard to imagine a use case for ``PyCFunction_GetFlags``.
The fact that ``PyCFunction_GET_FLAGS`` and ``PyCFunction_GetFlags``
are not used at all by CPython outside of ``Objects/call.c``
further shows that these functions are not particularly useful.
Concluding: the only potential breakage is with C code
which accesses the internals of ``PyCFunctionObject`` and ``PyMethodDescrObject``. which accesses the internals of ``PyCFunctionObject`` and ``PyMethodDescrObject``.
We expect very few problems because of this. We expect very few problems because of this.
@@ -593,10 +617,35 @@ Thanks to the proposed C call protocol, we can support this in such a way
that both the unbound and the bound method share a ``PyCCallDef`` that both the unbound and the bound method share a ``PyCCallDef``
structure (with the ``CCALL_SELFARG`` flag set). structure (with the ``CCALL_SELFARG`` flag set).
Concluding, ``CCALL_SELFARG`` has two advantages: So, ``CCALL_SELFARG`` has two advantages:
there is no extra layer of indirection for calling there is no extra layer of indirection for calling methods
and constructing bound methods does not require setting up a ``PyCCallDef`` structure. and constructing bound methods does not require setting up a ``PyCCallDef`` structure.
Another minor advantage is that we could
make the error messages for a wrong call signature
more uniform between Python methods and built-in methods.
In the following example, Python is undecided whether
a method takes 2 or 3 arguments::
>>> class List(list):
... def myappend(self, item):
... self.append(item)
>>> List().myappend(1, 2)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: myappend() takes 2 positional arguments but 3 were given
>>> List().append(1, 2)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: append() takes exactly one argument (2 given)
It is currently impossible for ``PyCFunction_Call``
to know the actual number of user-visible arguments
since it cannot distinguish at runtime between
a function (without ``self`` argument) and a bound method (with ``self`` argument).
The ``CCALL_SELFARG`` flag makes this difference explicit.
Replacing tp_print Replacing tp_print
------------------ ------------------