902 lines
36 KiB
ReStructuredText
902 lines
36 KiB
ReStructuredText
PEP: 575
|
||
Title: Unifying function/method classes
|
||
Author: Jeroen Demeyer <J.Demeyer@UGent.be>
|
||
Status: Draft
|
||
Type: Standards Track
|
||
Content-Type: text/x-rst
|
||
Created: 27-Mar-2018
|
||
Python-Version: 3.8
|
||
Post-History: 31-Mar-2018
|
||
|
||
|
||
Abstract
|
||
========
|
||
|
||
Reorganize the class hierarchy for functions and methods
|
||
with the goal of reducing the difference between
|
||
built-in functions (implemented in C) and Python functions.
|
||
Mainly, make built-in functions behave more like Python functions
|
||
without sacrificing performance.
|
||
|
||
A new base class ``basefunction`` is introduced and the various function
|
||
classes, as well as ``method``, inherit from it.
|
||
|
||
We also allow subclassing of some of these function classes.
|
||
|
||
Motivation
|
||
==========
|
||
|
||
Currently, CPython has two different function classes:
|
||
the first is Python functions, which is what you get
|
||
when defining a function with ``def`` or ``lambda``.
|
||
The second is built-in functions such as ``len``, ``isinstance`` or ``numpy.dot``.
|
||
These are implemented in C.
|
||
|
||
These two classes are completely independent with different functionality.
|
||
In particular, it is currently not possible to implement a function efficiently in C
|
||
(only built-in functions can do that)
|
||
while still allowing introspection like ``inspect.signature`` or ``inspect.getsourcefile``
|
||
(only Python functions can do that).
|
||
This is a problem for projects like Cython [#cython]_ that want to do exactly that.
|
||
|
||
In Cython, this was worked around by inventing a new function class called ``cyfunction``.
|
||
Unfortunately, a new function class creates problems:
|
||
the ``inspect`` module does not recognize such functions as being functions [#bpo30071]_
|
||
and the performance is worse
|
||
(CPython has specific optimizations for calling built-in functions).
|
||
|
||
A second motivation is more generally making built-in functions and methods
|
||
behave more like Python functions and methods.
|
||
For example, Python unbound methods are just functions but
|
||
unbound methods of extension types (e.g. ``dict.get``) are a distinct class.
|
||
Bound methods of Python classes have a ``__func__`` attribute,
|
||
bound methods of extension types do not.
|
||
|
||
New classes
|
||
===========
|
||
|
||
This is the new class hierarchy for functions and methods::
|
||
|
||
object
|
||
|
|
||
|
|
||
basefunction
|
||
/ | \
|
||
/ | \
|
||
/ | generic_function
|
||
/ | \
|
||
builtin_function (*) | \
|
||
| python_function
|
||
|
|
||
method (*)
|
||
|
||
The two classes marked with (*) do *not* allow subclassing;
|
||
the others do.
|
||
|
||
There is no difference between functions and unbound methods,
|
||
while bound methods are instances of ``method``.
|
||
|
||
basefunction
|
||
------------
|
||
|
||
The class ``basefunction`` becomes a new base class for all function types.
|
||
It behaves like the existing ``builtin_function_or_method``
|
||
with some differences:
|
||
|
||
#. It acts as a descriptor implementing ``__get__`` to turn a function into a method
|
||
if there was no ``__self__`` attribute.
|
||
If the ``__self__`` attribute was already set, then this is a no-op:
|
||
the existing function is returned instead.
|
||
|
||
#. A new read-only slot ``__objclass__``, represented in the C structure as ``m_selftype``.
|
||
If this attribute exists, it must be a class (it cannot be ``None``).
|
||
If so, the function must be called with ``self`` being an instance of that class.
|
||
This is meant to support unbound methods of extension types, replacing ``method_descriptor``.
|
||
|
||
#. Argument Clinic [#clinic]_ is not supported.
|
||
|
||
#. The field ``ml_doc`` and the attributes ``__doc__`` and ``__text_signature__``
|
||
are gone.
|
||
|
||
#. A new flag ``METH_CUSTOM`` for ``ml_flags`` which prevents automatic
|
||
generation of a ``builtin_function``, see `Automatic creation of built-in functions`_.
|
||
|
||
#. A new flag ``METH_ARG0_FUNCTION`` for ``ml_flags``.
|
||
If this flag is set, the C function stored in ``ml_meth`` will be called with first argument
|
||
equal to the function object instead of ``__self__``.
|
||
|
||
#. A new flag ``METH_ARG0_NO_SLICE`` for ``ml_flags``.
|
||
If this flag is *not* set, ``__objclass__`` is set and ``__self__`` is not set,
|
||
then the first positional argument is treated as ``__self__``.
|
||
For more details, see `Calling basefunction instances`_.
|
||
|
||
#. A new flag ``METH_PYTHON`` for ``ml_flags``.
|
||
This flag indicates that this function should be treated as Python function.
|
||
Ideally, use of this flag should be avoided because it goes
|
||
against the duck typing philosophy.
|
||
It is still needed in a few places though, for example `Profiling`_.
|
||
|
||
The goal of ``basefunction`` is that it supports all different ways
|
||
of calling functions and methods in just one structure.
|
||
For example, the new flag ``METH_ARG0_FUNCTION``
|
||
will be used by the implementation of Python functions.
|
||
|
||
It is not possible to directly create instances of ``basefunction``
|
||
(``tp_new`` is ``NULL``).
|
||
However, it is legal for C code to manually create instances.
|
||
|
||
These are the relevant C structures::
|
||
|
||
PyTypeObject PyBaseFunction_Type;
|
||
|
||
typedef struct {
|
||
PyObject_HEAD
|
||
PyCFunctionDef *m_ml; /* Description of the C function to call */
|
||
PyObject *m_self; /* __self__: anything, can be NULL; readonly */
|
||
PyObject *m_module; /* __module__: anything */
|
||
PyObject *m_weakreflist; /* List of weak references */
|
||
PyObject *m_selftype; /* __objclass__: type object or NULL; readonly */
|
||
} PyBaseFunctionObject;
|
||
|
||
typedef struct {
|
||
const char *ml_name; /* The name of the built-in function/method */
|
||
PyCFunction ml_meth; /* The C function that implements it */
|
||
uint32_t ml_flags; /* Combination of METH_xxx flags, which mostly
|
||
describe the args expected by the C func */
|
||
} PyCFunctionDef;
|
||
|
||
Note that the type of ``ml_flags`` was changed from ``int`` to
|
||
``uint32_t`` (it makes a lot of sense to fix the number of bits).
|
||
Subclasses may extend ``PyCFunctionDef`` with extra fields.
|
||
|
||
builtin_function
|
||
----------------
|
||
|
||
This is a copy of ``basefunction``, with the following differences:
|
||
|
||
#. ``m_ml`` points to a ``PyMethodDef`` structure,
|
||
extending ``PyCFunctionDef`` with an additional ``ml_doc``
|
||
field to implement ``__doc__`` and ``__text_signature__``
|
||
as read-only attributes::
|
||
|
||
typedef struct {
|
||
const char *ml_name;
|
||
PyCFunction ml_meth;
|
||
uint32_t ml_flags;
|
||
const char *ml_doc;
|
||
} PyMethodDef;
|
||
|
||
#. Argument Clinic [#clinic]_ is supported.
|
||
|
||
The type object is ``PyTypeObject PyCFunction_Type``
|
||
and we define ``PyCFunctionObject`` as alias of ``PyBaseFunctionObject``.
|
||
|
||
generic_function
|
||
----------------
|
||
|
||
The class ``generic_function`` (a subclass of ``basefunction``) adds
|
||
support for various standard attributes which are used in ``inspect``.
|
||
This would be a good class to use for auto-generated C code, for example produced by Cython [#cython]_.
|
||
|
||
The layout of the C structure is as follows::
|
||
|
||
PyTypeObject PyGenericFunction_Type;
|
||
|
||
typedef struct {
|
||
PyBaseFunctionObject base;
|
||
PyObject *func_name; /* __name__: string */
|
||
PyObject *func_qualname; /* __qualname__: string */
|
||
PyObject *func_doc; /* __doc__: can be anything or NULL */
|
||
PyObject *func_code; /* __code__: code or NULL */
|
||
PyObject *func_defaults; /* __defaults__: tuple or NULL */
|
||
PyObject *func_kwdefaults; /* __kwdefaults__: dict or NULL */
|
||
PyObject *func_annotations; /* __annotations__: dict or NULL */
|
||
PyObject *func_globals; /* __globals__: anything or NULL; readonly */
|
||
PyObject *func_closure; /* __closure__: tuple of cell objects or NULL; readonly */
|
||
PyObject *func_dict; /* __dict__: dict or NULL */
|
||
} PyGenericFunctionObject;
|
||
|
||
This class adds various slots like ``__doc__`` and ``__code__`` to access the C attributes.
|
||
The slot ``__name__`` returns ``func_name``.
|
||
When setting ``__name__``, also ``base.m_ml.ml_name`` is updated
|
||
with the UTF-8 encoded name.
|
||
|
||
None of the attributes is required to be meaningful.
|
||
In particular, ``__code__`` may not be a working code object,
|
||
possibly only a few fields may be filled in.
|
||
And ``__defaults__`` is not required to be used for calling the function.
|
||
|
||
Apart from adding these extra attributes,
|
||
``generic_function`` behaves exactly the same as ``basefunction``.
|
||
|
||
python_function
|
||
---------------
|
||
|
||
This is the class meant for functions implemented in Python,
|
||
formerly known as ``function``.
|
||
Unlike the other function types,
|
||
instances of ``python_function`` can be created from Python code.
|
||
|
||
The layout of the C structure is almost the same as ``generic_function``::
|
||
|
||
PyTypeObject PyFunction_Type;
|
||
|
||
typedef struct {
|
||
PyBaseFunctionObject base;
|
||
PyObject *func_name; /* __name__: string */
|
||
PyObject *func_qualname; /* __qualname__: string */
|
||
PyObject *func_doc; /* __doc__: can be anything or NULL */
|
||
PyObject *func_code; /* __code__: code or NULL */
|
||
PyObject *func_defaults; /* __defaults__: tuple or NULL */
|
||
PyObject *func_kwdefaults; /* __kwdefaults__: dict or NULL */
|
||
PyObject *func_annotations; /* __annotations__: dict or NULL */
|
||
PyObject *func_globals; /* __globals__: anything or NULL; readonly */
|
||
PyObject *func_closure; /* __closure__: tuple of cell objects or NULL; readonly */
|
||
PyObject *func_dict; /* __dict__: dict or NULL */
|
||
PyCFunctionDef _ml; /* Storage for base.m_ml */
|
||
} PyFunctionObject;
|
||
|
||
The only difference is an ``_ml`` field
|
||
which reserves space to be used by ``base.m_ml``.
|
||
However, it is not required that ``base.m_ml`` points to ``_ml``.
|
||
|
||
The constructor takes care of setting up ``base.m_ml``.
|
||
In particular, it sets the ``METH_PYTHON`` flag.
|
||
|
||
method
|
||
------
|
||
|
||
The class ``method`` is used for all bound methods,
|
||
regardless of the class of the underlying function.
|
||
There is one extra attribute ``__func__`` pointing to that function.
|
||
|
||
For methods, there is a complication because we want to allow
|
||
constructing a method from a arbitrary callable which
|
||
may not be an instance of ``basefunction``.
|
||
Therefore, in practice there are two kinds of methods:
|
||
for arbitrary callables, we use a single fixed ``PyCFunctionDef``
|
||
structure with ``ml_name`` equal to ``"?"``
|
||
and with the ``METH_ARG0_FUNCTION`` flag set.
|
||
The C function then calls ``__func__`` with the correct arguments.
|
||
|
||
For methods which bind instances of ``basefunction``
|
||
(more precisely, which have the ``Py_TPFLAGS_BASEFUNCTION`` flag set),
|
||
we instead use the ``PyCFunctionDef`` from the original function.
|
||
In this case, the ``__func__`` attribute is only used to implement various attributes
|
||
but not for calling the method.
|
||
|
||
When constructing a new method from a ``basefunction``,
|
||
we check that the ``self`` object is an instance of ``__objclass__``
|
||
(if such a class was specified) and raise a ``TypeError`` otherwise.
|
||
|
||
The C structure is::
|
||
|
||
typedef struct {
|
||
PyBaseFunctionObject base;
|
||
PyObject *im_func; /* __func__: function implementing the method; readonly */
|
||
} PyMethodObject;
|
||
|
||
|
||
|
||
Calling basefunction instances
|
||
==============================
|
||
|
||
We specify the implementation of ``__call__`` for instances of ``basefunction``.
|
||
|
||
__objclass__
|
||
------------
|
||
|
||
First of all, if the function has an ``__objclass__`` attribute but no
|
||
``__self__`` attribute (this is the case for unbound methods of extension types),
|
||
then the function must be called with at least one positional argument
|
||
and the first (typically called ``self``) must be an instance of ``__objclass__``.
|
||
If not, a ``TypeError`` is raised.
|
||
|
||
Flags
|
||
-----
|
||
|
||
For convenience, we define two new constants:
|
||
``METH_CALLSIGNATURE`` combines the flags from ``PyCFunctionDef.ml_flags``
|
||
which specify the signature of the C function to be called.
|
||
It is equal to ::
|
||
|
||
METH_NOARGS | METH_O | METH_VARARGS | METH_FASTCALL | METH_KEYWORDS
|
||
|
||
Exactly one of the first four flags above must be set
|
||
and only ``METH_VARARGS`` and ``METH_FASTCALL`` may be combined with ``METH_KEYWORDS``.
|
||
Violating these rules is undefined behaviour.
|
||
|
||
The second new constant is ``METH_CALLFLAGS``.
|
||
It combines all flags which influence how a function is called.
|
||
It is equal to ::
|
||
|
||
METH_CALLSIGNATURE | METH_ARG0_FUNCTION | METH_ARG0_NO_SLICE
|
||
|
||
Some of these flags are already documented [#methoddoc]_.
|
||
We explain the others below.
|
||
|
||
METH_FASTCALL
|
||
-------------
|
||
|
||
This is an existing but undocumented flag.
|
||
We suggest to officially support and document it.
|
||
|
||
If the flag ``METH_FASTCALL`` is set without ``METH_KEYWORDS``,
|
||
then the ``ml_meth`` field is of type ``PyCFunctionFast``
|
||
which takes the arguments ``(PyObject *arg0, PyObject *const *args, Py_ssize_t nargs)``.
|
||
Such a function takes only positional arguments and they are passed as plain C array
|
||
``args`` of length ``nargs``.
|
||
|
||
If the flags ``METH_FASTCALL | METH_KEYWORDS`` are set,
|
||
then the ``ml_meth`` field is of type ``PyCFunctionFastWithKeywords``
|
||
which takes the arguments ``(PyObject *arg0, PyObject *const *args, Py_ssize_t nargs, PyObject *kwnames)``.
|
||
The positional arguments are passed as C array ``args`` of length ``nargs``.
|
||
The *values* of the keyword arguments follow in that array,
|
||
starting at position ``nargs``.
|
||
The *keys* (names) of the keyword arguments are passed as a ``tuple`` in ``kwnames``.
|
||
As an example, assume that 3 positional and 2 keyword arguments are given.
|
||
Then ``args`` is an array of length 3 + 2 = 5, ``nargs`` equals 3 and ``kwnames`` is a 2-tuple.
|
||
|
||
METH_ARG0_FUNCTION
|
||
------------------
|
||
|
||
If this flag is set, then the first argument to the C function
|
||
is the function itself (the ``basefunction`` instance) instead of ``__self__``.
|
||
In this case, the C function should deal with ``__self__``
|
||
by getting it from the function, for example using ``PyBaseFunction_GET_SELF``.
|
||
|
||
METH_ARG0_NO_SLICE
|
||
------------------
|
||
|
||
If the function has a ``__objclass__`` attribute, no ``__self__``
|
||
attribute and neither ``METH_ARG0_FUNCTION`` nor ``METH_ARG0_NO_SLICE`` are set,
|
||
then the first positional argument (which must exist because of ``__objclass__``)
|
||
is removed from ``*args`` and instead passed as first argument to the C function.
|
||
Effectively, the first positional argument is treated as ``__self__``.
|
||
This process is called "self slicing".
|
||
This does not affect keyword arguments.
|
||
|
||
It is not allowed to combine the flags ``METH_ARG0_FUNCTION`` and ``METH_ARG0_NO_SLICE``.
|
||
That is not a problem because ``METH_ARG0_FUNCTION`` already disables self slicing.
|
||
|
||
|
||
Automatic creation of built-in functions
|
||
========================================
|
||
|
||
Python automatically generates instances of ``builtin_function``
|
||
for extension types (using the ``PyTypeObject.tp_methods`` field) and modules
|
||
(using the ``PyModuleDef.m_methods`` field).
|
||
The arrays ``PyTypeObject.tp_methods`` and ``PyModuleDef.m_methods``
|
||
must be arrays of ``PyMethodDef`` structures.
|
||
|
||
If the ``METH_CUSTOM`` flag is set for an element of such an array,
|
||
then no ``builtin_function`` will be generated.
|
||
This allows an application to customize the creation of functions
|
||
in an extension type or module.
|
||
If ``METH_CUSTOM`` is set, then ``METH_STATIC`` and ``METH_CLASS`` are ignored.
|
||
|
||
Built-in unbound methods
|
||
------------------------
|
||
|
||
The type of unbound methods changes from ``method_descriptor``
|
||
to ``builtin_function``.
|
||
The object which appears as unbound method is the same object which
|
||
appears in the class ``__dict__``.
|
||
Python automatically sets the ``__objclass__`` attribute.
|
||
|
||
Built-in functions of a module
|
||
------------------------------
|
||
|
||
For the case of functions of a module,
|
||
``__self__`` will be set to the module unless the flag ``METH_STATIC`` is set.
|
||
|
||
An important consequence is that such functions by default
|
||
do not become methods when used as attribute
|
||
(``basefunction.__get__`` only does that if ``__self__`` was unset).
|
||
One could consider this a bug, but this was done for backwards compatibility reasons:
|
||
in an initial post on python-ideas [#proposal]_ the concensus was to keep this
|
||
misfeature of built-in functions.
|
||
|
||
However, to allow this anyway for specific or newly implemented
|
||
built-in functions, the ``METH_STATIC`` flag prevents setting ``__self__``.
|
||
Previously, ``METH_STATIC`` was an error, so this is fullt backwards compatible.
|
||
Specifying ``METH_CLASS`` is still an error.
|
||
|
||
|
||
Further changes
|
||
===============
|
||
|
||
New type flag
|
||
-------------
|
||
|
||
A new ``PyTypeObject`` flag (for ``tp_flags``) is added:
|
||
``Py_TPFLAGS_BASEFUNCTION`` to indicate that instances of this type are
|
||
functions which can be called as a ``basefunction``.
|
||
In other words, subclasses of ``basefunction``
|
||
which follow the implementation from `Calling basefunction instances`_.
|
||
|
||
This is different from flags like ``Py_TPFLAGS_LIST_SUBCLASS``
|
||
because it indicates more than just a subclass:
|
||
it also indicates a default implementation of ``__call__``.
|
||
This flag is never inherited.
|
||
However, extension types can explicitly specify it if they
|
||
do not override ``__call__`` or if they override ``__call__`` in a compatible way.
|
||
The flag ``Py_TPFLAGS_BASEFUNCTION`` must never be set for a heap type
|
||
because that would not be safe (heap types can be changed dynamically).
|
||
|
||
C API functions
|
||
---------------
|
||
|
||
We add and change some Python/C API functions:
|
||
|
||
- ``int PyBaseFunction_Check(PyObject *op)``: return true if ``op``
|
||
is an instance of a type with the ``Py_TPFLAGS_BASEFUNCTION`` set.
|
||
|
||
- ``int PyCFunction_Check(PyObject *op)``: return true if ``PyBaseFunction_Check(op)``
|
||
is True and the function ``op`` does not have the flag ``METH_PYTHON`` set.
|
||
|
||
- ``int PyBuiltinFunction_Check(PyObject *op)``: return true if ``op``
|
||
is an instance of ``builtin_function``.
|
||
|
||
- ``int PyFunction_Check(PyObject *op)``: return true if ``op``
|
||
is an instance of ``generic_function``.
|
||
|
||
- ``PyObject* PyFunction_New(PyObject *code, PyObject *globals)``:
|
||
create a new instance of ``python_function``.
|
||
|
||
- ``PyObject* PyFunction_NewWithQualName(PyObject *code, PyObject *globals)``:
|
||
create a new instance of ``python_function``.
|
||
|
||
- For some existing ``PyCFunction_...`` and ``PyMethod_`` functions,
|
||
we define a new function ``PyBaseFunction_...``
|
||
acting on ``basefunction`` instances.
|
||
For backwards compatibility,
|
||
the old functions are kept as aliases of the new functions.
|
||
|
||
**TODO**: more functions may be added when implementing this PEP.
|
||
In particular, maybe there should be functions for creating instances of ``basefunction``
|
||
or ``generic_function``.
|
||
|
||
Changes to the types module
|
||
---------------------------
|
||
|
||
Two types are added: ``types.BaseFunctionType`` corresponding to
|
||
``basefunction`` and ``types.GenericFunctionType`` corresponding to
|
||
``generic_function``.
|
||
|
||
Apart from that, no changes to the ``types`` module are made.
|
||
In particular, ``types.FunctionType`` refers to ``python_function``.
|
||
However, the actual types will change:
|
||
for example, ``types.BuiltinFunctionType`` will no longer be the same
|
||
as ``types.BuiltinMethodType``.
|
||
|
||
Changes to the inspect module
|
||
-----------------------------
|
||
|
||
``inspect.isbasefunction`` checks for an instance of ``basefunction``.
|
||
|
||
``inspect.isfunction`` checks for an instance of ``generic_function``.
|
||
|
||
``inspect.isbuiltin`` checks for an instance of ``builtin_function``.
|
||
|
||
Profiling
|
||
---------
|
||
|
||
Currently, ``sys.setprofile`` supports ``c_call``, ``c_return`` and ``c_exception``
|
||
events for built-in functions.
|
||
These events are generated when calling or returning from a built-in function.
|
||
By contrast, the ``call`` and ``return`` events are generated by the function itself.
|
||
So nothing needs to change for the ``call`` and ``return`` events.
|
||
|
||
Since we no longer make a difference between C functions and Python functions,
|
||
we need to prevent the ``c_*`` events for Python functions.
|
||
This is done by not generating those events if the
|
||
``METH_PYTHON`` flag in ``ml_flags`` is set.
|
||
|
||
User flags in PyCFunctionDef.ml_flags
|
||
----------------------------------------
|
||
|
||
8 consecutive bits in ``ml_flags`` are reserved for the "user",
|
||
meaning the person or program who implemented the function.
|
||
These are ``METH_USR0``, ..., ``METH_USR7``.
|
||
Python will ignore these flags.
|
||
|
||
It should be clear that different users may use these flags
|
||
for different purposes, so users should only look at those flags in
|
||
functions that they implemented (for example, by looking for those flags
|
||
in the ``tp_methods`` array of an extension type).
|
||
|
||
|
||
Non-CPython implementations
|
||
===========================
|
||
|
||
For other implementations of Python apart from CPython,
|
||
only the classes ``basefunction``, ``method`` and ``python_function`` are required.
|
||
The latter two are the only classes which can be instantiated directly
|
||
from the Python interpreter.
|
||
We require ``basefunction`` for consistency but we put no requirements on it:
|
||
it is acceptable if this is just a copy of ``object``.
|
||
Support for the new ``__objclass__`` attribute is not required.
|
||
If there is no ``generic_function`` type,
|
||
then ``types.GenericFunctionType`` should be an alias of ``types.FunctionType``.
|
||
|
||
|
||
Rationale
|
||
=========
|
||
|
||
Why not simply change existing classes?
|
||
---------------------------------------
|
||
|
||
One could try to solve the problem not by introducing a new ``basefunction``
|
||
class and changing the class hierarchy, but by just changing existing classes.
|
||
|
||
That might look like a simpler solution but it is not:
|
||
it would require introspection support for 3 distinct classes:
|
||
``function``, ``builtin_function_or_method`` and ``method_descriptor``.
|
||
In the current PEP, there is only a single class where introspection needs
|
||
to be implemented.
|
||
It is also not clear how this would interact with ``__text_signature__``.
|
||
Having two independent kinds of ``inspect.signature`` support on the same
|
||
class sounds like asking for problems.
|
||
|
||
And this would not fix some of the other differences between built-in functions
|
||
and Python functions that were mentioned in the `Motivation`_.
|
||
|
||
Why __text_signature__ is not a solution
|
||
----------------------------------------
|
||
|
||
Built-in functions have an attribute ``__text_signature__``,
|
||
which gives the signature of the function as plain text.
|
||
The default values are evaluated by ``ast.literal_eval``.
|
||
Because of this, it supports only a small number of standard Python classes
|
||
and not arbitrary Python objects.
|
||
|
||
And even if ``__text_signature__`` would allow arbitrary signatures somehow,
|
||
that is only one piece of introspection:
|
||
it does not help with ``inspect.getsourcefile`` for example.
|
||
|
||
generic_function versus python_function
|
||
---------------------------------------
|
||
|
||
The names ``generic_function`` and ``python_function``
|
||
were chosen to be different from ``function``
|
||
because none of the two classes ``generic_function``/``python_function``
|
||
is an obvious candidate to receive the ``function`` name.
|
||
It also allows to use the word "function" informally without referring
|
||
to a specific class.
|
||
|
||
In many places, a decision needs to be made whether the old ``function`` class
|
||
should be replaced by ``generic_function`` or ``python_function``.
|
||
This is done by thinking of the most likely use case:
|
||
|
||
1. ``types.FunctionType`` refers to ``python_function`` because that
|
||
type might be used to construct instances using ``types.FunctionType(...)``.
|
||
|
||
2. ``inspect.isfunction()`` refers to ``generic_function``
|
||
because this is the class where introspection is supported.
|
||
|
||
3. The C API functions ``PyFunction_New...``
|
||
refer to ``python_function`` simply because one cannot create instances
|
||
of ``generic_function``.
|
||
|
||
4. The C API functions ``PyFunction_Check`` and ``PyFunction_Get/Set...``
|
||
refer to ``generic_function`` because all attributes exist for instances of ``generic_function``.
|
||
|
||
Scope of this PEP: which classes are involved?
|
||
----------------------------------------------
|
||
|
||
The main motivation of this PEP is fixing function classes,
|
||
so we certainly want to unify the existing classes
|
||
``builtin_function_or_method`` and ``function``.
|
||
|
||
Since built-in functions and methods have the same class,
|
||
it seems natural to include bound methods too.
|
||
And since there are no "unbound methods" for Python functions,
|
||
it makes sense to get rid of unbound methods for extension types.
|
||
|
||
For now, no changes are made to the classes ``staticmethod``,
|
||
``classmethod`` and ``classmethod_descriptor``.
|
||
It would certainly make sense to put these in the ``basefunction``
|
||
class hierarchy and unify ``classmethod`` and ``classmethod_descriptor``.
|
||
However, this PEP is already big enough
|
||
and this is left as a possible future improvement.
|
||
|
||
Slot wrappers for extension types like ``__init__`` or ``__eq__``
|
||
are quite different from normal methods.
|
||
They are also typically not called directly because you would normally
|
||
write ``foo[i]`` instead of ``foo.__getitem__(i)`` for example.
|
||
So these are left outside the scope of this PEP.
|
||
|
||
Python also has an ``instancemethod`` class, which was used in Python 2
|
||
for unbound methods.
|
||
It is not clear whether there is still a use case for it.
|
||
In any case, there is no reason to deal with it in this PEP.
|
||
|
||
**TODO**: should ``instancemethod`` be deprecated?
|
||
It doesn't seem used at all within CPython 3.7,
|
||
but maybe external packages use it?
|
||
|
||
__self__ in basefunction
|
||
------------------------
|
||
|
||
It may look strange at first sight to add the ``__self__`` slot
|
||
in ``basefunction`` as opposed to ``method``.
|
||
We took this idea from the existing ``builtin_function_or_method`` class.
|
||
It allows us to have a single general implementation of ``__call__``
|
||
for the various function classes discussed in this PEP.
|
||
It also makes it easy to support existing built-in functions
|
||
which set ``__self__`` to the module (for example, ``sys.exit.__self__`` is ``sys``).
|
||
|
||
Subclassing
|
||
-----------
|
||
|
||
We disallow subclassing of ``builtin_function`` and ``method``
|
||
to enable fast type checks for ``PyBuiltinFunction_Check`` and ``PyMethod_Check()``.
|
||
|
||
We allow subclassing of the other classes because there is no reason to disallow it.
|
||
For Python modules, the only relevant class to subclass is
|
||
``python_function`` because the others cannot be instantiated anyway.
|
||
|
||
Replacing tp_call: METH_ARG0_FUNCTION
|
||
-------------------------------------
|
||
|
||
The new flag ``METH_ARG0_FUNCTION`` is meant to support cases where
|
||
formerly a custom ``tp_call`` was used.
|
||
It would reduce the number of special fast paths in ``Python/ceval.c``
|
||
for calling objects:
|
||
instead of treating Python functions, built-in functions and methods,
|
||
there would only be a single check.
|
||
|
||
The signature of ``tp_call`` is essentially the signature
|
||
of ``PyBaseFunctionObject.m_ml.ml_meth`` with flags
|
||
``METH_VARARGS | METH_KEYWORDS | METH_ARG0_FUNCTION``.
|
||
Therefore, it should be easy to change existing ``tp_call`` slots
|
||
to use ``METH_ARG0_FUNCTION``.
|
||
There is just one extra complication: ``__self__`` must be handled manually.
|
||
That is not hard though: it just means adapting that logic from ``method``.
|
||
|
||
Self slicing: METH_ARG0_NO_SLICE
|
||
--------------------------------
|
||
|
||
We define "self slicing" to mean slicing off the ``self`` argument of a method
|
||
from the ``*args`` tuple when an unbound method is called.
|
||
This ``self`` argument is then passed as first argument to the C function.
|
||
|
||
The specification of ``METH_ARG0_NO_SLICE`` may seem strange at first.
|
||
The negation is confusing, but it is done for backwards compatibility:
|
||
existing methods require self slicing but do not specify a flag for it.
|
||
|
||
The requirement for ``__objclass__`` in order to use self slicing
|
||
makes sense because it guarantees that there is a ``self`` argument in the first place.
|
||
|
||
Since ``METH_ARG0_FUNCTION`` is clearly incompatible with self slicing
|
||
(both use the first argument of the C function),
|
||
this PEP dictates that ``METH_ARG0_FUNCTION`` disables self slicing.
|
||
So one may wonder if there is actually a use case for ``METH_ARG0_NO_SLICE``
|
||
without ``METH_ARG0_FUNCTION``.
|
||
If not, then one could simply unify those two flags in one flag
|
||
``METH_ARG0_FUNCTION``.
|
||
|
||
However, a priori, the flag ``METH_ARG0_NO_SLICE`` is meaningful,
|
||
so we keep the two flags ``METH_ARG0_FUNCTION`` and ``METH_ARG0_NO_SLICE`` separate.
|
||
|
||
**TODO**: this should be reconsidered after initial implementation
|
||
and testing of this PEP.
|
||
|
||
User flags: METH_CUSTOM and METH_USRx
|
||
-------------------------------------
|
||
|
||
These flags are meant for applications that want to use
|
||
``tp_methods`` for an extension type or ``m_methods`` for a module
|
||
but that do not want the default built-in functions to be created.
|
||
Those applications would set ``METH_CUSTOM``.
|
||
The application is also free to use ``METH_USR0``, ..., ``METH_USR7``
|
||
for its own purposes,
|
||
for example to customize the creation of special function instances.
|
||
|
||
There is no obvious concrete use case,
|
||
but given that it costs essentially nothing to have these flags,
|
||
it seems like a good idea to allow it.
|
||
|
||
|
||
Backwards Compatibility
|
||
=======================
|
||
|
||
While designing this PEP, great care was taken to not break
|
||
backwards compatibility too much.
|
||
|
||
Python functions
|
||
----------------
|
||
|
||
For Python functions, essentially nothing changes.
|
||
The attributes that existed before still exist and Python functions
|
||
can be initialized, called and turned into methods as before.
|
||
|
||
Built-in functions of a module
|
||
------------------------------
|
||
|
||
Also for built-in functions, nothing changes.
|
||
We keep the old behaviour that such functions do not bind as methods.
|
||
This is a consequence of the fact that ``__self__`` is set to the module.
|
||
|
||
Built-in bound and unbound methods
|
||
----------------------------------
|
||
|
||
The types of built-in bound and unbound methods will change.
|
||
However, this does not affect calling such methods
|
||
because the protocol in ``basefunction.__call__``
|
||
(in particular the handling of ``__objclass__`` and self slicing)
|
||
was specifically designed to be backwards compatible.
|
||
All attributes which existed before (like ``__objclass__`` and ``__self__``)
|
||
still exist.
|
||
|
||
New classes
|
||
-----------
|
||
|
||
Tools which take various kinds of functions as input will need to deal
|
||
with the new function hieararchy and the possibility of custom
|
||
function classes.
|
||
If those tools use ``inspect`` properly, there should be few
|
||
backwards compatibility problems.
|
||
|
||
New attributes
|
||
--------------
|
||
|
||
Some objects get new attributes.
|
||
For example, ``__objclass__`` now appears on bound methods too
|
||
and all methods get a ``__func__`` attribute.
|
||
We expect that this will not cause problems.
|
||
|
||
|
||
Reference Implementation
|
||
========================
|
||
|
||
After initial discussions of this PEP draft,
|
||
work will start on a reference implementation in CPython.
|
||
|
||
|
||
Appendix: current situation
|
||
===========================
|
||
|
||
**NOTE**:
|
||
This section is more useful during the draft period of the PEP,
|
||
so feel free to remove this once the PEP has been accepted.
|
||
|
||
For reference, we describe in detail the relevant existing classes in CPython 3.7.
|
||
|
||
There are a surprisingly large number of classes involved,
|
||
each of them is an "orphan" class (no non-trivial subclasses nor superclasses).
|
||
|
||
builtin_function_or_method: built-in functions and bound methods
|
||
----------------------------------------------------------------
|
||
|
||
These are of type `PyCFunction_Type <https://github.com/python/cpython/blob/2cb4661707818cfd92556e7fdf9068a993577002/Objects/methodobject.c#L271>`_
|
||
with structure `PyCFunctionObject <https://github.com/python/cpython/blob/2cb4661707818cfd92556e7fdf9068a993577002/Include/methodobject.h#L102>`_::
|
||
|
||
typedef struct {
|
||
PyObject_HEAD
|
||
PyMethodDef *m_ml; /* Description of the C function to call */
|
||
PyObject *m_self; /* Passed as 'self' arg to the C func, can be NULL */
|
||
PyObject *m_module; /* The __module__ attribute, can be anything */
|
||
PyObject *m_weakreflist; /* List of weak references */
|
||
} PyCFunctionObject;
|
||
|
||
struct PyMethodDef {
|
||
const char *ml_name; /* The name of the built-in function/method */
|
||
PyCFunction ml_meth; /* The C function that implements it */
|
||
int ml_flags; /* Combination of METH_xxx flags, which mostly
|
||
describe the args expected by the C func */
|
||
const char *ml_doc; /* The __doc__ attribute, or NULL */
|
||
};
|
||
|
||
where ``PyCFunction`` is a C function pointer (there are various forms of this, the most basic
|
||
takes two arguments for ``self`` and ``*args``).
|
||
|
||
This class is used both for functions and bound methods:
|
||
for a method, the ``m_self`` slot points to the object::
|
||
|
||
>>> dict(foo=42).get
|
||
<built-in method get of dict object at 0x...>
|
||
>>> dict(foo=42).get.__self__
|
||
{'foo': 42}
|
||
|
||
In some cases, a function is considered a "method" of the module defining it::
|
||
|
||
>>> import os
|
||
>>> os.kill
|
||
<built-in function kill>
|
||
>>> os.kill.__self__
|
||
<module 'posix' (built-in)>
|
||
|
||
method_descriptor: built-in unbound methods
|
||
-------------------------------------------
|
||
|
||
These are of type `PyMethodDescr_Type <https://github.com/python/cpython/blob/2cb4661707818cfd92556e7fdf9068a993577002/Objects/descrobject.c#L538>`_
|
||
with structure `PyMethodDescrObject <https://github.com/python/cpython/blob/2cb4661707818cfd92556e7fdf9068a993577002/Include/descrobject.h#L53>`_::
|
||
|
||
typedef struct {
|
||
PyDescrObject d_common;
|
||
PyMethodDef *d_method;
|
||
} PyMethodDescrObject;
|
||
|
||
typedef struct {
|
||
PyObject_HEAD
|
||
PyTypeObject *d_type;
|
||
PyObject *d_name;
|
||
PyObject *d_qualname;
|
||
} PyDescrObject;
|
||
|
||
function: Python functions
|
||
--------------------------
|
||
|
||
These are of type `PyFunction_Type <https://github.com/python/cpython/blob/2cb4661707818cfd92556e7fdf9068a993577002/Objects/funcobject.c#L592>`_
|
||
with structure `PyFunctionObject <https://github.com/python/cpython/blob/2cb4661707818cfd92556e7fdf9068a993577002/Include/funcobject.h#L21>`_::
|
||
|
||
typedef struct {
|
||
PyObject_HEAD
|
||
PyObject *func_code; /* A code object, the __code__ attribute */
|
||
PyObject *func_globals; /* A dictionary (other mappings won't do) */
|
||
PyObject *func_defaults; /* NULL or a tuple */
|
||
PyObject *func_kwdefaults; /* NULL or a dict */
|
||
PyObject *func_closure; /* NULL or a tuple of cell objects */
|
||
PyObject *func_doc; /* The __doc__ attribute, can be anything */
|
||
PyObject *func_name; /* The __name__ attribute, a string object */
|
||
PyObject *func_dict; /* The __dict__ attribute, a dict or NULL */
|
||
PyObject *func_weakreflist; /* List of weak references */
|
||
PyObject *func_module; /* The __module__ attribute, can be anything */
|
||
PyObject *func_annotations; /* Annotations, a dict or NULL */
|
||
PyObject *func_qualname; /* The qualified name */
|
||
|
||
/* Invariant:
|
||
* func_closure contains the bindings for func_code->co_freevars, so
|
||
* PyTuple_Size(func_closure) == PyCode_GetNumFree(func_code)
|
||
* (func_closure may be NULL if PyCode_GetNumFree(func_code) == 0).
|
||
*/
|
||
} PyFunctionObject;
|
||
|
||
In Python 3, there is no "unbound method" class:
|
||
an unbound method is just a plain function.
|
||
|
||
method: Python bound methods
|
||
----------------------------
|
||
|
||
These are of type `PyMethod_Type <https://github.com/python/cpython/blob/2cb4661707818cfd92556e7fdf9068a993577002/Objects/classobject.c#L329>`_
|
||
with structure `PyMethodObject <https://github.com/python/cpython/blob/2cb4661707818cfd92556e7fdf9068a993577002/Include/classobject.h#L12>`_::
|
||
|
||
typedef struct {
|
||
PyObject_HEAD
|
||
PyObject *im_func; /* The callable object implementing the method */
|
||
PyObject *im_self; /* The instance it is bound to */
|
||
PyObject *im_weakreflist; /* List of weak references */
|
||
} PyMethodObject;
|
||
|
||
|
||
References
|
||
==========
|
||
|
||
.. [#cython] Cython (http://cython.org/)
|
||
|
||
.. [#bpo30071] Python bug 30071 (https://bugs.python.org/issue30071)
|
||
|
||
.. [#clinic] PEP 436, The Argument Clinic DSL, Hastings (https://www.python.org/dev/peps/pep-0436)
|
||
|
||
.. [#methoddoc] PyMethodDef documentation (https://docs.python.org/3.7/c-api/structures.html#c.PyMethodDef)
|
||
|
||
.. [#proposal] PEP proposal: unifying function/method classes (https://mail.python.org/pipermail/python-ideas/2018-March/049398.html)
|
||
|
||
Copyright
|
||
=========
|
||
|
||
This document has been placed in the public domain.
|
||
|
||
|
||
|
||
..
|
||
Local Variables:
|
||
mode: indented-text
|
||
indent-tabs-mode: nil
|
||
sentence-end-double-space: t
|
||
fill-column: 70
|
||
coding: utf-8
|
||
End:
|