PEP 575: Unifying function/method classes (#605)
parent d8f315b81a
commit 9012ec6e8c

@ -0,0 +1,901 @@

PEP: 575
Title: Unifying function/method classes
Author: Jeroen Demeyer <J.Demeyer@UGent.be>
Status: Draft
Type: Standards Track
Content-Type: text/x-rst
Created: 27-Mar-2018
Python-Version: 3.8
Post-History: 31-Mar-2018


Abstract
========

Reorganize the class hierarchy for functions and methods
with the goal of reducing the difference between
built-in functions (implemented in C) and Python functions.
Mainly, make built-in functions behave more like Python functions
without sacrificing performance.

A new base class ``basefunction`` is introduced and the various function
classes, as well as ``method``, inherit from it.

We also allow subclassing of some of these function classes.

Motivation
==========

Currently, CPython has two different function classes:
the first is the class of Python functions, which is what you get
when defining a function with ``def`` or ``lambda``.
The second is the class of built-in functions such as ``len``, ``isinstance`` or ``numpy.dot``.
These are implemented in C.

These two classes are completely independent and offer different functionality.
In particular, it is currently not possible to implement a function efficiently in C
(only built-in functions can do that)
while still allowing introspection like ``inspect.signature`` or ``inspect.getsourcefile``
(only Python functions can do that).
This is a problem for projects like Cython [#cython]_ that want to do exactly that.

In Cython, this was worked around by inventing a new function class called ``cyfunction``.
Unfortunately, a new function class creates problems:
the ``inspect`` module does not recognize such functions as being functions [#bpo30071]_
and the performance is worse
(CPython has specific optimizations for calling built-in functions).

A second motivation is more generally making built-in functions and methods
behave more like Python functions and methods.
For example, Python unbound methods are just functions but
unbound methods of extension types (e.g. ``dict.get``) are a distinct class.
Bound methods of Python classes have a ``__func__`` attribute,
while bound methods of extension types do not.
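
The differences listed above can be observed from Python code.
The following is a minimal sketch of the *current* behaviour (before this PEP),
assuming a CPython 3.x interpreter; other implementations may report different
types. The class ``Example`` is a hypothetical helper introduced only for
illustration.

```python
import inspect
import types


class Example:
    """Hypothetical class used only to obtain a Python-level method."""

    def method(self):
        return None


# An "unbound method" of a Python class is just a plain function ...
python_unbound = Example.method
# ... but an unbound method of an extension type is a distinct class.
builtin_unbound = dict.get

# Bound methods: only the Python one exposes __func__.
python_bound = Example().method
builtin_bound = {}.get

print(type(python_unbound).__name__)
print(type(builtin_unbound).__name__)
print(hasattr(python_bound, "__func__"), hasattr(builtin_bound, "__func__"))
print(inspect.isfunction(python_unbound), inspect.isfunction(len))
```

The last line shows the introspection gap from the first motivation:
``inspect.isfunction`` accepts the Python function but rejects ``len``.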

New classes
===========

This is the new class hierarchy for functions and methods::

                       object
                          |
                          |
                    basefunction
                   /      |      \
                  /       |       \
                 /        |   generic_function
                /         |            \
    builtin_function (*)  |             \
                          |       python_function
                          |
                     method (*)

The two classes marked with (*) do *not* allow subclassing;
the others do.

There is no difference between functions and unbound methods,
while bound methods are instances of ``method``.

basefunction
------------

The class ``basefunction`` becomes a new base class for all function types.
It behaves like the existing ``builtin_function_or_method``
with some differences:

#. It acts as a descriptor implementing ``__get__`` to turn a function into a method
   if there is no ``__self__`` attribute.
   If the ``__self__`` attribute is already set, then this is a no-op:
   the existing function is returned instead.

#. A new read-only slot ``__objclass__``, represented in the C structure as ``m_selftype``.
   If this attribute exists, it must be a class (it cannot be ``None``).
   If so, the function must be called with ``self`` being an instance of that class.
   This is meant to support unbound methods of extension types, replacing ``method_descriptor``.

#. Argument Clinic [#clinic]_ is not supported.

#. The field ``ml_doc`` and the attributes ``__doc__`` and ``__text_signature__``
   are gone.

#. A new flag ``METH_CUSTOM`` for ``ml_flags`` which prevents automatic
   generation of a ``builtin_function``, see `Automatic creation of built-in functions`_.

#. A new flag ``METH_ARG0_FUNCTION`` for ``ml_flags``.
   If this flag is set, the C function stored in ``ml_meth`` will be called with the first argument
   equal to the function object instead of ``__self__``.

#. A new flag ``METH_ARG0_NO_SLICE`` for ``ml_flags``.
   If this flag is *not* set while ``__objclass__`` is set and ``__self__`` is not set,
   then the first positional argument is treated as ``__self__``.
   For more details, see `Calling basefunction instances`_.

#. A new flag ``METH_PYTHON`` for ``ml_flags``.
   This flag indicates that this function should be treated as a Python function.
   Ideally, use of this flag should be avoided because it goes
   against the duck typing philosophy.
   It is still needed in a few places though, for example `Profiling`_.
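
The descriptor rule from the first item can be modelled in pure Python.
This is only a sketch of the intended semantics, not the proposed C
implementation: ``BaseFunctionModel``, ``describe`` and ``Widget`` are
hypothetical names, and the existing ``types.MethodType`` stands in for the
new ``method`` class.

```python
import types


class BaseFunctionModel:
    """Toy model of the proposed ``basefunction.__get__`` (illustration only)."""

    def __init__(self, func, bound_self=None, has_self=False):
        self._func = func
        self._has_self = has_self      # models "the __self__ attribute is set"
        self._self = bound_self

    def __call__(self, *args, **kwargs):
        if self._has_self:
            return self._func(self._self, *args, **kwargs)
        return self._func(*args, **kwargs)

    def __get__(self, instance, owner=None):
        # No-op if __self__ is already set (or on class access):
        # the existing function object is returned.
        if self._has_self or instance is None:
            return self
        # Otherwise bind the function to the instance.
        return types.MethodType(self, instance)


def describe(obj, suffix=""):
    return type(obj).__name__ + suffix


class Widget:
    unbound = BaseFunctionModel(describe)
    prebound = BaseFunctionModel(describe, bound_self=42, has_self=True)


w = Widget()
print(w.unbound("!"))    # bound to the Widget instance
print(w.prebound("!"))   # keeps its own __self__, ignores the instance
```

The ``prebound`` case mirrors what happens today for built-in functions of a
module, whose ``__self__`` is already set to the module.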

The goal of ``basefunction`` is to support all the different ways
of calling functions and methods in just one structure.
For example, the new flag ``METH_ARG0_FUNCTION``
will be used by the implementation of Python functions.

It is not possible to directly create instances of ``basefunction``
(``tp_new`` is ``NULL``).
However, it is legal for C code to manually create instances.

These are the relevant C structures::

    PyTypeObject PyBaseFunction_Type;

    typedef struct {
        PyObject_HEAD
        PyCFunctionDef *m_ml;     /* Description of the C function to call */
        PyObject *m_self;         /* __self__: anything, can be NULL; readonly */
        PyObject *m_module;       /* __module__: anything */
        PyObject *m_weakreflist;  /* List of weak references */
        PyObject *m_selftype;     /* __objclass__: type object or NULL; readonly */
    } PyBaseFunctionObject;

    typedef struct {
        const char *ml_name;   /* The name of the built-in function/method */
        PyCFunction ml_meth;   /* The C function that implements it */
        uint32_t ml_flags;     /* Combination of METH_xxx flags, which mostly
                                  describe the args expected by the C func */
    } PyCFunctionDef;

Note that the type of ``ml_flags`` was changed from ``int`` to
``uint32_t`` (it makes a lot of sense to fix the number of bits).
Subclasses may extend ``PyCFunctionDef`` with extra fields.

builtin_function
----------------

This is a copy of ``basefunction``, with the following differences:

#. ``m_ml`` points to a ``PyMethodDef`` structure,
   extending ``PyCFunctionDef`` with an additional ``ml_doc``
   field to implement ``__doc__`` and ``__text_signature__``
   as read-only attributes::

       typedef struct {
           const char *ml_name;
           PyCFunction ml_meth;
           uint32_t ml_flags;
           const char *ml_doc;
       } PyMethodDef;

#. Argument Clinic [#clinic]_ is supported.

The type object is ``PyTypeObject PyCFunction_Type``
and we define ``PyCFunctionObject`` as an alias of ``PyBaseFunctionObject``.

generic_function
----------------

The class ``generic_function`` (a subclass of ``basefunction``) adds
support for various standard attributes which are used in ``inspect``.
This would be a good class to use for auto-generated C code, for example produced by Cython [#cython]_.

The layout of the C structure is as follows::

    PyTypeObject PyGenericFunction_Type;

    typedef struct {
        PyBaseFunctionObject base;
        PyObject *func_name;         /* __name__: string */
        PyObject *func_qualname;     /* __qualname__: string */
        PyObject *func_doc;          /* __doc__: can be anything or NULL */
        PyObject *func_code;         /* __code__: code or NULL */
        PyObject *func_defaults;     /* __defaults__: tuple or NULL */
        PyObject *func_kwdefaults;   /* __kwdefaults__: dict or NULL */
        PyObject *func_annotations;  /* __annotations__: dict or NULL */
        PyObject *func_globals;      /* __globals__: anything or NULL; readonly */
        PyObject *func_closure;      /* __closure__: tuple of cell objects or NULL; readonly */
        PyObject *func_dict;         /* __dict__: dict or NULL */
    } PyGenericFunctionObject;

This class adds various slots like ``__doc__`` and ``__code__`` to access the C attributes.
The slot ``__name__`` returns ``func_name``.
When setting ``__name__``, ``base.m_ml->ml_name`` is also updated
with the UTF-8 encoded name.

None of the attributes is required to be meaningful.
In particular, ``__code__`` may not be a working code object:
possibly only a few fields are filled in.
Likewise, ``__defaults__`` is not required to be used for calling the function.

Apart from adding these extra attributes,
``generic_function`` behaves exactly the same as ``basefunction``.

python_function
---------------

This is the class meant for functions implemented in Python,
formerly known as ``function``.
Unlike the other function types,
instances of ``python_function`` can be created from Python code.
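
As a reminder of what "created from Python code" means today, here is a
minimal sketch using the existing ``types.FunctionType`` constructor
(which this PEP keeps pointing at ``python_function``). The names
``template`` and ``clone`` are hypothetical.

```python
import types


def template(x):
    return x + 1


# Build a second function object from the same code object,
# with a fresh globals dict and a different name.
clone = types.FunctionType(template.__code__, {}, "clone")

print(clone(41), clone.__name__)
```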

The layout of the C structure is almost the same as ``generic_function``::

    PyTypeObject PyFunction_Type;

    typedef struct {
        PyBaseFunctionObject base;
        PyObject *func_name;         /* __name__: string */
        PyObject *func_qualname;     /* __qualname__: string */
        PyObject *func_doc;          /* __doc__: can be anything or NULL */
        PyObject *func_code;         /* __code__: code or NULL */
        PyObject *func_defaults;     /* __defaults__: tuple or NULL */
        PyObject *func_kwdefaults;   /* __kwdefaults__: dict or NULL */
        PyObject *func_annotations;  /* __annotations__: dict or NULL */
        PyObject *func_globals;      /* __globals__: anything or NULL; readonly */
        PyObject *func_closure;      /* __closure__: tuple of cell objects or NULL; readonly */
        PyObject *func_dict;         /* __dict__: dict or NULL */
        PyCFunctionDef _ml;          /* Storage for base.m_ml */
    } PyFunctionObject;

The only difference is an ``_ml`` field
which reserves space to be used by ``base.m_ml``.
However, it is not required that ``base.m_ml`` points to ``_ml``.

The constructor takes care of setting up ``base.m_ml``.
In particular, it sets the ``METH_PYTHON`` flag.

method
------

The class ``method`` is used for all bound methods,
regardless of the class of the underlying function.
There is one extra attribute ``__func__`` pointing to that function.

For methods, there is a complication because we want to allow
constructing a method from an arbitrary callable which
may not be an instance of ``basefunction``.
Therefore, in practice there are two kinds of methods:
for arbitrary callables, we use a single fixed ``PyCFunctionDef``
structure with ``ml_name`` equal to ``"?"``
and with the ``METH_ARG0_FUNCTION`` flag set.
The C function then calls ``__func__`` with the correct arguments.

For methods which bind instances of ``basefunction``
(more precisely, which have the ``Py_TPFLAGS_BASEFUNCTION`` flag set),
we instead use the ``PyCFunctionDef`` from the original function.
In this case, the ``__func__`` attribute is only used to implement various attributes
but not for calling the method.

When constructing a new method from a ``basefunction``,
we check that the ``self`` object is an instance of ``__objclass__``
(if such a class was specified) and raise a ``TypeError`` otherwise.
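
The "arbitrary callable" case already exists in today's Python and is the
behaviour that must be preserved. A minimal sketch, assuming CPython 3.x;
``greet``, ``User`` and the variable names are hypothetical:

```python
import functools
import types


def greet(prefix, obj):
    return f"{prefix} {obj.name}"


class User:
    def __init__(self, name):
        self.name = name


# A partial object is callable but is not a function object.
hello = functools.partial(greet, "hello")
user = User("ada")

# A bound method can still be constructed from it; calling the method
# calls __func__ with __self__ prepended to the arguments.
bound = types.MethodType(hello, user)
print(bound())
```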

The C structure is::

    typedef struct {
        PyBaseFunctionObject base;
        PyObject *im_func;  /* __func__: function implementing the method; readonly */
    } PyMethodObject;


Calling basefunction instances
==============================

We specify the implementation of ``__call__`` for instances of ``basefunction``.

__objclass__
------------

First of all, if the function has an ``__objclass__`` attribute but no
``__self__`` attribute (this is the case for unbound methods of extension types),
then the function must be called with at least one positional argument
and the first (typically called ``self``) must be an instance of ``__objclass__``.
If not, a ``TypeError`` is raised.
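
This rule matches what ``method_descriptor`` objects already do in current
CPython, which is the behaviour the new check is meant to preserve.
A minimal sketch, assuming CPython 3.x (the helper ``raises_type_error``
is hypothetical):

```python
def raises_type_error(func, *args):
    """Return True if calling func(*args) raises TypeError."""
    try:
        func(*args)
    except TypeError:
        return True
    return False


# dict.get is an unbound method of an extension type.
owner = dict.get.__objclass__

ok = dict.get({"a": 1}, "a")                           # self is a dict: fine
wrong_self = raises_type_error(dict.get, [], "a")      # self is a list
missing_self = raises_type_error(dict.get)             # no positional argument

print(owner, ok, wrong_self, missing_self)
```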

Flags
-----

For convenience, we define two new constants:
``METH_CALLSIGNATURE`` combines the flags from ``PyCFunctionDef.ml_flags``
which specify the signature of the C function to be called.
It is equal to ::

    METH_NOARGS | METH_O | METH_VARARGS | METH_FASTCALL | METH_KEYWORDS

Exactly one of the first four flags above must be set
and only ``METH_VARARGS`` and ``METH_FASTCALL`` may be combined with ``METH_KEYWORDS``.
Violating these rules is undefined behaviour.
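
The two rules can be written down as a small checker. This is a sketch in
Python rather than C, intended only to make the rules concrete: the function
``is_valid_call_signature`` is hypothetical, and the numeric flag values are
assumed to mirror those in CPython's ``methodobject.h`` (only their
distinctness matters here).

```python
# Flag values assumed for illustration; only distinct bits are required.
METH_VARARGS = 0x0001
METH_KEYWORDS = 0x0002
METH_NOARGS = 0x0004
METH_O = 0x0008
METH_FASTCALL = 0x0080

METH_CALLSIGNATURE = (
    METH_NOARGS | METH_O | METH_VARARGS | METH_FASTCALL | METH_KEYWORDS
)


def is_valid_call_signature(ml_flags):
    """Check the two rules stated for METH_CALLSIGNATURE."""
    signature = ml_flags & METH_CALLSIGNATURE
    base = signature & ~METH_KEYWORDS
    # Rule 1: exactly one of the first four flags is set.
    if base not in (METH_NOARGS, METH_O, METH_VARARGS, METH_FASTCALL):
        return False
    # Rule 2: METH_KEYWORDS only with METH_VARARGS or METH_FASTCALL.
    if signature & METH_KEYWORDS and base in (METH_NOARGS, METH_O):
        return False
    return True


print(is_valid_call_signature(METH_VARARGS | METH_KEYWORDS))
print(is_valid_call_signature(METH_O | METH_KEYWORDS))
```

In C no such check is performed at call time, which is why the text above
declares violations to be undefined behaviour.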

The second new constant is ``METH_CALLFLAGS``.
It combines all flags which influence how a function is called.
It is equal to ::

    METH_CALLSIGNATURE | METH_ARG0_FUNCTION | METH_ARG0_NO_SLICE

Some of these flags are already documented [#methoddoc]_.
We explain the others below.

METH_FASTCALL
-------------

This is an existing but undocumented flag.
We suggest officially supporting and documenting it.

If the flag ``METH_FASTCALL`` is set without ``METH_KEYWORDS``,
then the ``ml_meth`` field is of type ``PyCFunctionFast``
which takes the arguments ``(PyObject *arg0, PyObject *const *args, Py_ssize_t nargs)``.
Such a function takes only positional arguments and they are passed as a plain C array
``args`` of length ``nargs``.

If the flags ``METH_FASTCALL | METH_KEYWORDS`` are set,
then the ``ml_meth`` field is of type ``PyCFunctionFastWithKeywords``
which takes the arguments ``(PyObject *arg0, PyObject *const *args, Py_ssize_t nargs, PyObject *kwnames)``.
The positional arguments are passed as a C array ``args`` of length ``nargs``.
The *values* of the keyword arguments follow in that array,
starting at position ``nargs``.
The *keys* (names) of the keyword arguments are passed as a ``tuple`` in ``kwnames``.
As an example, assume that 3 positional and 2 keyword arguments are given.
Then ``args`` is an array of length 3 + 2 = 5, ``nargs`` equals 3 and ``kwnames`` is a 2-tuple.
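
The argument layout can be modelled in Python, using a list in place of the
C array. This is only an illustration of the layout described above; the
helpers ``pack_fastcall`` and ``unpack_fastcall`` are hypothetical and do not
correspond to any C API function.

```python
def pack_fastcall(*positional, **keywords):
    """Lay out arguments the way METH_FASTCALL | METH_KEYWORDS describes."""
    args = list(positional) + list(keywords.values())  # values follow positionals
    nargs = len(positional)                            # positional count only
    kwnames = tuple(keywords)                          # keyword names, in order
    return args, nargs, kwnames


def unpack_fastcall(args, nargs, kwnames):
    """Recover (positional, keywords) from the flat layout."""
    positional = tuple(args[:nargs])
    keywords = dict(zip(kwnames, args[nargs:]))
    return positional, keywords


# The example from the text: 3 positional and 2 keyword arguments.
args, nargs, kwnames = pack_fastcall(1, 2, 3, x=10, y=20)
print(len(args), nargs, kwnames)
print(unpack_fastcall(args, nargs, kwnames))
```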

METH_ARG0_FUNCTION
------------------

If this flag is set, then the first argument to the C function
is the function itself (the ``basefunction`` instance) instead of ``__self__``.
In this case, the C function should deal with ``__self__``
by getting it from the function, for example using ``PyBaseFunction_GET_SELF``.

METH_ARG0_NO_SLICE
------------------

If the function has an ``__objclass__`` attribute, no ``__self__``
attribute and neither ``METH_ARG0_FUNCTION`` nor ``METH_ARG0_NO_SLICE`` is set,
then the first positional argument (which must exist because of ``__objclass__``)
is removed from ``*args`` and instead passed as first argument to the C function.
Effectively, the first positional argument is treated as ``__self__``.
This process is called "self slicing".
This does not affect keyword arguments.

It is not allowed to combine the flags ``METH_ARG0_FUNCTION`` and ``METH_ARG0_NO_SLICE``.
That is not a problem because ``METH_ARG0_FUNCTION`` already disables self slicing.
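
The interaction between ``__objclass__``, ``__self__`` and the two ``ARG0``
flags can be summarised in a short Python model. Everything here is a sketch
under stated assumptions: ``prepare_arg0`` and ``UNSET`` are hypothetical
names, the flag values are invented, and a C ``NULL`` first argument is
represented by ``None``.

```python
METH_ARG0_FUNCTION = 1 << 20   # hypothetical bit positions
METH_ARG0_NO_SLICE = 1 << 21

UNSET = object()               # models "attribute does not exist"


def prepare_arg0(func, ml_flags, args, objclass=UNSET, bound_self=UNSET):
    """Return (arg0, args) as the C function would receive them."""
    if ml_flags & METH_ARG0_FUNCTION and ml_flags & METH_ARG0_NO_SLICE:
        raise ValueError("flags must not be combined")

    if objclass is not UNSET and bound_self is UNSET:
        # Unbound method of an extension type: self is required.
        if not args or not isinstance(args[0], objclass):
            raise TypeError("first argument must be an instance of __objclass__")
        if not ml_flags & (METH_ARG0_FUNCTION | METH_ARG0_NO_SLICE):
            # Self slicing: move the first positional argument into arg0.
            return args[0], args[1:]

    if ml_flags & METH_ARG0_FUNCTION:
        return func, args
    return (None if bound_self is UNSET else bound_self), args


d = {"a": 1}
print(prepare_arg0("f", 0, (d, "a"), objclass=dict))
print(prepare_arg0("f", METH_ARG0_NO_SLICE, (d, "a"), objclass=dict))
print(prepare_arg0("f", METH_ARG0_FUNCTION, (d, "a"), objclass=dict))
```

Only the first call performs self slicing; in the other two the full
``args`` tuple is passed through unchanged.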


Automatic creation of built-in functions
========================================

Python automatically generates instances of ``builtin_function``
for extension types (using the ``PyTypeObject.tp_methods`` field) and modules
(using the ``PyModuleDef.m_methods`` field).
The arrays ``PyTypeObject.tp_methods`` and ``PyModuleDef.m_methods``
must be arrays of ``PyMethodDef`` structures.

If the ``METH_CUSTOM`` flag is set for an element of such an array,
then no ``builtin_function`` will be generated.
This allows an application to customize the creation of functions
in an extension type or module.
If ``METH_CUSTOM`` is set, then ``METH_STATIC`` and ``METH_CLASS`` are ignored.

Built-in unbound methods
------------------------

The type of unbound methods changes from ``method_descriptor``
to ``builtin_function``.
The object which appears as unbound method is the same object which
appears in the class ``__dict__``.
Python automatically sets the ``__objclass__`` attribute.

Built-in functions of a module
------------------------------

For the case of functions of a module,
``__self__`` will be set to the module unless the flag ``METH_STATIC`` is set.

An important consequence is that such functions by default
do not become methods when used as an attribute
(``basefunction.__get__`` only does that if ``__self__`` is unset).
One could consider this a bug, but this was done for backwards compatibility reasons:
in an initial post on python-ideas [#proposal]_ the consensus was to keep this
misfeature of built-in functions.
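
The behaviour being preserved can be seen in current CPython. A minimal
sketch, assuming CPython 3.x; ``py_len`` and ``Container`` are hypothetical
names:

```python
import builtins
import sys


def py_len(obj):
    return len(obj)


class Container:
    builtin_attr = len      # built-in function: does NOT bind
    python_attr = py_len    # Python function: binds to the instance

    def __len__(self):
        return 7


c = Container()

# Built-in functions of a module carry the module as __self__ ...
print(len.__self__ is builtins, sys.exit.__self__ is sys)
# ... so attribute access returns the function itself, unbound.
print(c.builtin_attr is len, c.builtin_attr("abc"))
# The Python function becomes a bound method and receives c as argument.
print(c.python_attr())
```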

However, to allow binding as a method anyway for specific or newly implemented
built-in functions, the ``METH_STATIC`` flag prevents setting ``__self__``.
Previously, ``METH_STATIC`` was an error, so this is fully backwards compatible.
Specifying ``METH_CLASS`` is still an error.


Further changes
===============

New type flag
-------------

A new ``PyTypeObject`` flag (for ``tp_flags``) is added:
``Py_TPFLAGS_BASEFUNCTION`` to indicate that instances of this type are
functions which can be called as a ``basefunction``.
In other words, these are subclasses of ``basefunction``
which follow the implementation from `Calling basefunction instances`_.

This is different from flags like ``Py_TPFLAGS_LIST_SUBCLASS``
because it indicates more than just a subclass:
it also indicates a default implementation of ``__call__``.
This flag is never inherited.
However, extension types can explicitly specify it if they
do not override ``__call__`` or if they override ``__call__`` in a compatible way.
The flag ``Py_TPFLAGS_BASEFUNCTION`` must never be set for a heap type
because that would not be safe (heap types can be changed dynamically).

C API functions
---------------

We add and change some Python/C API functions:

- ``int PyBaseFunction_Check(PyObject *op)``: return true if ``op``
  is an instance of a type with the ``Py_TPFLAGS_BASEFUNCTION`` flag set.

- ``int PyCFunction_Check(PyObject *op)``: return true if ``PyBaseFunction_Check(op)``
  is true and the function ``op`` does not have the flag ``METH_PYTHON`` set.

- ``int PyBuiltinFunction_Check(PyObject *op)``: return true if ``op``
  is an instance of ``builtin_function``.

- ``int PyFunction_Check(PyObject *op)``: return true if ``op``
  is an instance of ``generic_function``.

- ``PyObject* PyFunction_New(PyObject *code, PyObject *globals)``:
  create a new instance of ``python_function``.

- ``PyObject* PyFunction_NewWithQualName(PyObject *code, PyObject *globals, PyObject *qualname)``:
  create a new instance of ``python_function``.

- For some existing ``PyCFunction_...`` and ``PyMethod_...`` functions,
  we define a new function ``PyBaseFunction_...``
  acting on ``basefunction`` instances.
  For backwards compatibility,
  the old functions are kept as aliases of the new functions.

**TODO**: more functions may be added when implementing this PEP.
In particular, maybe there should be functions for creating instances of ``basefunction``
or ``generic_function``.

Changes to the types module
---------------------------

Two types are added: ``types.BaseFunctionType`` corresponding to
``basefunction`` and ``types.GenericFunctionType`` corresponding to
``generic_function``.

Apart from that, no changes to the ``types`` module are made.
In particular, ``types.FunctionType`` refers to ``python_function``.
However, the actual types will change:
for example, ``types.BuiltinFunctionType`` will no longer be the same
as ``types.BuiltinMethodType``.
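
For comparison, this is the situation before this PEP, which code relying on
the identity of these types may depend on. A minimal sketch, assuming
CPython 3.7 or later (``types.MethodDescriptorType`` was added in 3.7):

```python
import types

# Today both names refer to the same class, builtin_function_or_method.
same_type = types.BuiltinFunctionType is types.BuiltinMethodType

function_type = type(len)             # built-in function of a module
bound_type = type([].append)          # bound method of an extension type
unbound_type = type(list.append)      # unbound method of an extension type

print(same_type, function_type.__name__, unbound_type.__name__)
```

Under this PEP, the bound method would become an instance of ``method`` and
the unbound one an instance of ``builtin_function``.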

Changes to the inspect module
-----------------------------

``inspect.isbasefunction`` checks for an instance of ``basefunction``.

``inspect.isfunction`` checks for an instance of ``generic_function``.

``inspect.isbuiltin`` checks for an instance of ``builtin_function``.

Profiling
---------

Currently, ``sys.setprofile`` supports ``c_call``, ``c_return`` and ``c_exception``
events for built-in functions.
These events are generated when calling or returning from a built-in function.
By contrast, the ``call`` and ``return`` events are generated by the function itself.
So nothing needs to change for the ``call`` and ``return`` events.

Since we no longer make a difference between C functions and Python functions,
we need to prevent the ``c_*`` events for Python functions.
This is done by not generating those events if the
``METH_PYTHON`` flag in ``ml_flags`` is set.
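
The distinction between the two families of events can be observed with the
existing ``sys.setprofile`` hook. This is a sketch of current behaviour,
assuming CPython 3.x; ``profiler`` and ``python_function`` are hypothetical
names, and the exact set of additional events recorded (for example for
``sys.setprofile`` itself) may vary between versions.

```python
import sys

events = []


def profiler(frame, event, arg):
    # For c_* events, arg is the C function object being called.
    name = getattr(arg, "__name__", None) if event.startswith("c_") else None
    events.append((event, name))


def python_function():
    return len("abc")


sys.setprofile(profiler)
try:
    python_function()
finally:
    sys.setprofile(None)

# Python function: "call"/"return". Built-in len: "c_call"/"c_return".
print([e for e in events if e[1] in (None, "len")])
```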

User flags in PyCFunctionDef.ml_flags
-------------------------------------

8 consecutive bits in ``ml_flags`` are reserved for the "user",
meaning the person or program who implemented the function.
These are ``METH_USR0``, ..., ``METH_USR7``.
Python will ignore these flags.

It should be clear that different users may use these flags
for different purposes, so users should only look at those flags in
functions that they implemented (for example, by looking for those flags
in the ``tp_methods`` array of an extension type).


Non-CPython implementations
===========================

For other implementations of Python apart from CPython,
only the classes ``basefunction``, ``method`` and ``python_function`` are required.
The latter two are the only classes which can be instantiated directly
from the Python interpreter.
We require ``basefunction`` for consistency but we put no requirements on it:
it is acceptable if this is just a copy of ``object``.
Support for the new ``__objclass__`` attribute is not required.
If there is no ``generic_function`` type,
then ``types.GenericFunctionType`` should be an alias of ``types.FunctionType``.


Rationale
=========

Why not simply change existing classes?
---------------------------------------

One could try to solve the problem not by introducing a new ``basefunction``
class and changing the class hierarchy, but by just changing existing classes.

That might look like a simpler solution but it is not:
it would require introspection support for 3 distinct classes:
``function``, ``builtin_function_or_method`` and ``method_descriptor``.
In the current PEP, there is only a single class where introspection needs
to be implemented.
It is also not clear how this would interact with ``__text_signature__``.
Having two independent kinds of ``inspect.signature`` support on the same
class sounds like asking for problems.

And this would not fix some of the other differences between built-in functions
and Python functions that were mentioned in the `Motivation`_.

Why __text_signature__ is not a solution
----------------------------------------

Built-in functions have an attribute ``__text_signature__``,
which gives the signature of the function as plain text.
The default values are evaluated by ``ast.literal_eval``.
Because of this, it supports only a small number of standard Python classes
and not arbitrary Python objects.

And even if ``__text_signature__`` allowed arbitrary signatures somehow,
that is only one piece of introspection:
it does not help with ``inspect.getsourcefile`` for example.
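
The limitation can be illustrated with the standard library as it exists
today. A minimal sketch, assuming CPython 3.x; the helper
``literal_eval_accepts`` is hypothetical:

```python
import ast
import inspect


def literal_eval_accepts(source):
    """Return True if ast.literal_eval can evaluate the given source text."""
    try:
        ast.literal_eval(source)
    except (ValueError, SyntaxError):
        return False
    return True


# Built-in functions describe their signature as text ...
text = len.__text_signature__
# ... which inspect.signature parses into a Signature object.
parameters = list(inspect.signature(len).parameters)

# Literals are fine as default values, arbitrary expressions are not.
print(parameters)
print(literal_eval_accepts("(1, 'a', None)"), literal_eval_accepts("object()"))
```

A default value such as ``object()`` therefore cannot be expressed this way,
which is the restriction described above.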

generic_function versus python_function
---------------------------------------

The names ``generic_function`` and ``python_function``
were chosen to be different from ``function``
because neither of the two classes ``generic_function``/``python_function``
is an obvious candidate to receive the ``function`` name.
It also allows using the word "function" informally without referring
to a specific class.

In many places, a decision needs to be made whether the old ``function`` class
should be replaced by ``generic_function`` or ``python_function``.
This is done by thinking of the most likely use case:

1. ``types.FunctionType`` refers to ``python_function`` because that
   type might be used to construct instances using ``types.FunctionType(...)``.

2. ``inspect.isfunction()`` refers to ``generic_function``
   because this is the class where introspection is supported.

3. The C API functions ``PyFunction_New...``
   refer to ``python_function`` simply because one cannot create instances
   of ``generic_function``.

4. The C API functions ``PyFunction_Check`` and ``PyFunction_Get/Set...``
   refer to ``generic_function`` because all attributes exist for instances of ``generic_function``.

Scope of this PEP: which classes are involved?
----------------------------------------------

The main motivation of this PEP is fixing function classes,
so we certainly want to unify the existing classes
``builtin_function_or_method`` and ``function``.

Since built-in functions and methods have the same class,
it seems natural to include bound methods too.
And since there are no "unbound methods" for Python functions,
it makes sense to get rid of unbound methods for extension types.

For now, no changes are made to the classes ``staticmethod``,
``classmethod`` and ``classmethod_descriptor``.
It would certainly make sense to put these in the ``basefunction``
class hierarchy and unify ``classmethod`` and ``classmethod_descriptor``.
However, this PEP is already big enough
and this is left as a possible future improvement.

Slot wrappers for extension types like ``__init__`` or ``__eq__``
are quite different from normal methods.
They are also typically not called directly because you would normally
write ``foo[i]`` instead of ``foo.__getitem__(i)`` for example.
So these are left outside the scope of this PEP.

Python also has an ``instancemethod`` class, which was used in Python 2
for unbound methods.
It is not clear whether there is still a use case for it.
In any case, there is no reason to deal with it in this PEP.

**TODO**: should ``instancemethod`` be deprecated?
It doesn't seem used at all within CPython 3.7,
but maybe external packages use it?

__self__ in basefunction
------------------------

It may look strange at first sight to add the ``__self__`` slot
in ``basefunction`` as opposed to ``method``.
We took this idea from the existing ``builtin_function_or_method`` class.
It allows us to have a single general implementation of ``__call__``
for the various function classes discussed in this PEP.
It also makes it easy to support existing built-in functions
which set ``__self__`` to the module (for example, ``sys.exit.__self__`` is ``sys``).

Subclassing
-----------

We disallow subclassing of ``builtin_function`` and ``method``
to enable fast type checks for ``PyBuiltinFunction_Check`` and ``PyMethod_Check()``.

We allow subclassing of the other classes because there is no reason to disallow it.
For Python modules, the only relevant class to subclass is
``python_function`` because the others cannot be instantiated anyway.

Replacing tp_call: METH_ARG0_FUNCTION
-------------------------------------

The new flag ``METH_ARG0_FUNCTION`` is meant to support cases where
formerly a custom ``tp_call`` was used.
It would reduce the number of special fast paths in ``Python/ceval.c``
for calling objects:
instead of treating Python functions, built-in functions and methods separately,
there would only be a single check.

The signature of ``tp_call`` is essentially the signature
of ``PyBaseFunctionObject.m_ml.ml_meth`` with flags
``METH_VARARGS | METH_KEYWORDS | METH_ARG0_FUNCTION``.
Therefore, it should be easy to change existing ``tp_call`` slots
to use ``METH_ARG0_FUNCTION``.
There is just one extra complication: ``__self__`` must be handled manually.
That is not hard though: it just means adapting that logic from ``method``.

Self slicing: METH_ARG0_NO_SLICE
--------------------------------

We define "self slicing" to mean slicing off the ``self`` argument of a method
from the ``*args`` tuple when an unbound method is called.
This ``self`` argument is then passed as first argument to the C function.

The specification of ``METH_ARG0_NO_SLICE`` may seem strange at first.
The negation is confusing, but it is done for backwards compatibility:
existing methods require self slicing but do not specify a flag for it.

The requirement for ``__objclass__`` in order to use self slicing
makes sense because it guarantees that there is a ``self`` argument in the first place.

Since ``METH_ARG0_FUNCTION`` is clearly incompatible with self slicing
(both use the first argument of the C function),
this PEP dictates that ``METH_ARG0_FUNCTION`` disables self slicing.
So one may wonder if there is actually a use case for ``METH_ARG0_NO_SLICE``
without ``METH_ARG0_FUNCTION``.
If not, then one could simply unify those two flags in one flag
``METH_ARG0_FUNCTION``.

However, a priori, the flag ``METH_ARG0_NO_SLICE`` is meaningful,
so we keep the two flags ``METH_ARG0_FUNCTION`` and ``METH_ARG0_NO_SLICE`` separate.

**TODO**: this should be reconsidered after initial implementation
and testing of this PEP.

User flags: METH_CUSTOM and METH_USRx
-------------------------------------

These flags are meant for applications that want to use
``tp_methods`` for an extension type or ``m_methods`` for a module
but that do not want the default built-in functions to be created.
Those applications would set ``METH_CUSTOM``.
The application is also free to use ``METH_USR0``, ..., ``METH_USR7``
for its own purposes,
for example to customize the creation of special function instances.

There is no obvious concrete use case,
but given that it costs essentially nothing to have these flags,
it seems like a good idea to allow them.


Backwards Compatibility
=======================

While designing this PEP, great care was taken to not break
backwards compatibility too much.

Python functions
----------------

For Python functions, essentially nothing changes.
The attributes that existed before still exist and Python functions
can be initialized, called and turned into methods as before.

Built-in functions of a module
------------------------------

Also for built-in functions, nothing changes.
We keep the old behaviour that such functions do not bind as methods.
This is a consequence of the fact that ``__self__`` is set to the module.

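The existing behaviour that is being preserved can be observed directly.
The short example below is only an illustration: the names ``Demo`` and ``py_len``
are made up for this sketch, and it shows today's CPython behaviour
rather than anything specific to the proposed classes.

```python
import builtins


def py_len(obj):
    """Python-level equivalent of len, used for comparison."""
    return len(obj)


class Demo:
    size = len        # built-in function: not turned into a method
    py_size = py_len  # Python function: becomes a bound method on access


demo = Demo()
builtin_result = demo.size([1, 2, 3])      # called as len([1, 2, 3])
bound_type = type(demo.py_size).__name__   # demo is bound as first argument
module_self = len.__self__ is builtins     # __self__ of len is its module
```

Accessing ``demo.size`` returns ``len`` itself, whereas ``demo.py_size``
is a ``method`` object wrapping ``py_len``.
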
Built-in bound and unbound methods
----------------------------------

The types of built-in bound and unbound methods will change.
However, this does not affect calling such methods
because the protocol in ``basefunction.__call__``
(in particular the handling of ``__objclass__`` and self slicing)
was specifically designed to be backwards compatible.
All attributes which existed before (like ``__objclass__`` and ``__self__``)
still exist.

New classes
-----------

Tools which take various kinds of functions as input will need to deal
with the new function hierarchy and the possibility of custom
function classes.
If those tools use ``inspect`` properly, there should be few
backwards compatibility problems.

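One possible reading of "using ``inspect`` properly" is to rely on the predicates
of the ``inspect`` module instead of comparing against concrete types.
The sketch below is an interpretation, not part of this PEP;
``describe`` and ``Greeter`` are hypothetical names.

```python
import inspect


def describe(obj):
    """Classify a callable using inspect predicates rather than exact types."""
    if inspect.isroutine(obj):
        return "routine"
    if callable(obj):
        return "other callable"
    return "not callable"


class Greeter:
    def greet(self):
        return "hello"


kinds = {
    "python function": describe(Greeter.greet),
    "bound method": describe(Greeter().greet),
    "built-in function": describe(len),
    "unbound built-in method": describe(str.upper),
    "class": describe(Greeter),
}
```

All four kinds of functions and methods are reported as ``"routine"``,
while the class itself is only an ``"other callable"``.
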
New attributes
--------------

Some objects get new attributes.
For example, ``__objclass__`` now appears on bound methods too
and all methods get a ``__func__`` attribute.
We expect that this will not cause problems.


Reference Implementation
========================

After initial discussions of this PEP draft,
work will start on a reference implementation in CPython.


Appendix: current situation
===========================

**NOTE**:
This section is more useful during the draft period of the PEP,
so feel free to remove this once the PEP has been accepted.

For reference, we describe in detail the relevant existing classes in CPython 3.7.

There are a surprisingly large number of classes involved,
and each of them is an "orphan" class (no non-trivial subclasses nor superclasses).

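The "orphan" nature of these classes can be checked from Python.
The following is a small sketch using the ``types`` module;
the helper ``can_subclass`` is a hypothetical name, and the result
reflects the interpreter running the code, which may differ in detail
from CPython 3.7.

```python
import types

classes = {
    "builtin_function_or_method": types.BuiltinFunctionType,
    "method_descriptor": types.MethodDescriptorType,
    "function": types.FunctionType,
    "method": types.MethodType,
}

# Each class derives directly from object.
direct_bases = {name: cls.__bases__ for name, cls in classes.items()}


def can_subclass(cls):
    """Return True if Python code can create a subclass of cls."""
    try:
        type("Sub", (cls,), {})
    except TypeError:
        return False
    return True


function_subclassable = can_subclass(types.FunctionType)
```

Every entry of ``direct_bases`` is ``(object,)``, and attempting to subclass
``function`` from Python fails with ``TypeError``.
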
builtin_function_or_method: built-in functions and bound methods
----------------------------------------------------------------

These are of type `PyCFunction_Type <https://github.com/python/cpython/blob/2cb4661707818cfd92556e7fdf9068a993577002/Objects/methodobject.c#L271>`_
with structure `PyCFunctionObject <https://github.com/python/cpython/blob/2cb4661707818cfd92556e7fdf9068a993577002/Include/methodobject.h#L102>`_::

    typedef struct {
        PyObject_HEAD
        PyMethodDef *m_ml;          /* Description of the C function to call */
        PyObject    *m_self;        /* Passed as 'self' arg to the C func, can be NULL */
        PyObject    *m_module;      /* The __module__ attribute, can be anything */
        PyObject    *m_weakreflist; /* List of weak references */
    } PyCFunctionObject;

    struct PyMethodDef {
        const char  *ml_name;   /* The name of the built-in function/method */
        PyCFunction  ml_meth;   /* The C function that implements it */
        int          ml_flags;  /* Combination of METH_xxx flags, which mostly
                                   describe the args expected by the C func */
        const char  *ml_doc;    /* The __doc__ attribute, or NULL */
    };

where ``PyCFunction`` is a C function pointer (there are various forms of this, the most basic
takes two arguments for ``self`` and ``*args``).

This class is used both for functions and bound methods:
for a method, the ``m_self`` slot points to the object::

    >>> dict(foo=42).get
    <built-in method get of dict object at 0x...>
    >>> dict(foo=42).get.__self__
    {'foo': 42}

In some cases, a function is considered a "method" of the module defining it::

    >>> import os
    >>> os.kill
    <built-in function kill>
    >>> os.kill.__self__
    <module 'posix' (built-in)>

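That functions and bound methods really share one class can be confirmed
with a few lines of Python. This is only an illustrative check;
the variable names are arbitrary.

```python
import types

bound = [].append  # built-in method bound to a list instance
func = len         # built-in function defined by the builtins module

same_class = type(bound) is type(func)
class_name = type(func).__name__
bound_self_is_list = isinstance(bound.__self__, list)

# The types module exposes the same class under two names.
alias_ok = types.BuiltinFunctionType is types.BuiltinMethodType
```

Both objects report ``builtin_function_or_method`` as their class name;
only the value of ``__self__`` distinguishes them.
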
method_descriptor: built-in unbound methods
-------------------------------------------

These are of type `PyMethodDescr_Type <https://github.com/python/cpython/blob/2cb4661707818cfd92556e7fdf9068a993577002/Objects/descrobject.c#L538>`_
with structure `PyMethodDescrObject <https://github.com/python/cpython/blob/2cb4661707818cfd92556e7fdf9068a993577002/Include/descrobject.h#L53>`_::

    typedef struct {
        PyDescrObject d_common;
        PyMethodDef *d_method;
    } PyMethodDescrObject;

    typedef struct {
        PyObject_HEAD
        PyTypeObject *d_type;
        PyObject *d_name;
        PyObject *d_qualname;
    } PyDescrObject;

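From Python, such a descriptor and its relation to the bound form
can be inspected as follows. This is a small illustrative sketch
with arbitrary variable names.

```python
d = {"foo": 42}

descr = dict.get                    # unbound built-in method
descr_class = type(descr).__name__
owner = descr.__objclass__          # the class that defines the method

unbound_call = descr(d, "foo")      # self is passed explicitly
bound = descr.__get__(d)            # binding yields a built-in bound method
bound_class = type(bound).__name__
bound_call = bound("foo")
```

Here ``descr_class`` is ``method_descriptor`` while ``bound_class`` is
``builtin_function_or_method``, so binding changes the class of the object.
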
function: Python functions
--------------------------

These are of type `PyFunction_Type <https://github.com/python/cpython/blob/2cb4661707818cfd92556e7fdf9068a993577002/Objects/funcobject.c#L592>`_
with structure `PyFunctionObject <https://github.com/python/cpython/blob/2cb4661707818cfd92556e7fdf9068a993577002/Include/funcobject.h#L21>`_::

    typedef struct {
        PyObject_HEAD
        PyObject *func_code;        /* A code object, the __code__ attribute */
        PyObject *func_globals;     /* A dictionary (other mappings won't do) */
        PyObject *func_defaults;    /* NULL or a tuple */
        PyObject *func_kwdefaults;  /* NULL or a dict */
        PyObject *func_closure;     /* NULL or a tuple of cell objects */
        PyObject *func_doc;         /* The __doc__ attribute, can be anything */
        PyObject *func_name;        /* The __name__ attribute, a string object */
        PyObject *func_dict;        /* The __dict__ attribute, a dict or NULL */
        PyObject *func_weakreflist; /* List of weak references */
        PyObject *func_module;      /* The __module__ attribute, can be anything */
        PyObject *func_annotations; /* Annotations, a dict or NULL */
        PyObject *func_qualname;    /* The qualified name */

        /* Invariant:
         *     func_closure contains the bindings for func_code->co_freevars, so
         *     PyTuple_Size(func_closure) == PyCode_GetNumFree(func_code)
         *     (func_closure may be NULL if PyCode_GetNumFree(func_code) == 0).
         */
    } PyFunctionObject;

In Python 3, there is no "unbound method" class:
an unbound method is just a plain function.

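The last point is easy to verify. In the sketch below, ``Greeter`` is
a made-up class used only for illustration.

```python
import types


class Greeter:
    def greet(self, name):
        return f"hello {name}"


unbound = Greeter.greet  # looked up on the class: a plain function
is_plain_function = type(unbound) is types.FunctionType
explicit_self = unbound(Greeter(), "world")  # self must be passed explicitly
has_code_object = isinstance(unbound.__code__, types.CodeType)
```

Because ``Greeter.greet`` is an ordinary function, it carries the usual
function attributes such as ``__code__``.
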
method: Python bound methods
----------------------------

These are of type `PyMethod_Type <https://github.com/python/cpython/blob/2cb4661707818cfd92556e7fdf9068a993577002/Objects/classobject.c#L329>`_
with structure `PyMethodObject <https://github.com/python/cpython/blob/2cb4661707818cfd92556e7fdf9068a993577002/Include/classobject.h#L12>`_::

    typedef struct {
        PyObject_HEAD
        PyObject *im_func;        /* The callable object implementing the method */
        PyObject *im_self;        /* The instance it is bound to */
        PyObject *im_weakreflist; /* List of weak references */
    } PyMethodObject;


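The ``im_func`` and ``im_self`` slots are visible from Python as
``__func__`` and ``__self__``. The example below is a brief sketch;
``Greeter`` is again a hypothetical class.

```python
import types


class Greeter:
    def greet(self, name):
        return f"hello {name}"


g = Greeter()
bound = g.greet  # attribute access creates a method object
wraps_function = bound.__func__ is Greeter.greet  # corresponds to im_func
bound_to_g = bound.__self__ is g                  # corresponds to im_self

# A method object can also be built by hand from any callable and an instance.
manual = types.MethodType(len, [1, 2, 3])
manual_result = manual()  # calls len([1, 2, 3])
```

Note that ``im_func`` may be any callable, which is why ``len`` can be
wrapped in a ``method`` object here.
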
References
==========

.. [#cython] Cython (http://cython.org/)

.. [#bpo30071] Python bug 30071 (https://bugs.python.org/issue30071)

.. [#clinic] PEP 436, The Argument Clinic DSL, Hastings (https://www.python.org/dev/peps/pep-0436)

.. [#methoddoc] PyMethodDef documentation (https://docs.python.org/3.7/c-api/structures.html#c.PyMethodDef)

.. [#proposal] PEP proposal: unifying function/method classes (https://mail.python.org/pipermail/python-ideas/2018-March/049398.html)

Copyright
=========

This document has been placed in the public domain.


..
   Local Variables:
   mode: indented-text
   indent-tabs-mode: nil
   sentence-end-double-space: t
   fill-column: 70
   coding: utf-8
   End: