PEP 575: Unifying function/method classes (#605)

jdemeyer 2018-04-02 22:47:37 +02:00 committed by Chris Angelico
parent d8f315b81a
commit 9012ec6e8c
1 changed files with 901 additions and 0 deletions

PEP: 575
Title: Unifying function/method classes
Author: Jeroen Demeyer <J.Demeyer@UGent.be>
Status: Draft
Type: Standards Track
Content-Type: text/x-rst
Created: 27-Mar-2018
Python-Version: 3.8
Post-History: 31-Mar-2018
Abstract
========
Reorganize the class hierarchy for functions and methods
with the goal of reducing the difference between
built-in functions (implemented in C) and Python functions.
Mainly, make built-in functions behave more like Python functions
without sacrificing performance.
A new base class ``basefunction`` is introduced and the various function
classes, as well as ``method``, inherit from it.
We also allow subclassing of some of these function classes.
Motivation
==========
Currently, CPython has two different function classes:
the first is Python functions, which is what you get
when defining a function with ``def`` or ``lambda``.
The second is built-in functions such as ``len``, ``isinstance`` or ``numpy.dot``.
These are implemented in C.
These two classes are completely independent with different functionality.
In particular, it is currently not possible to implement a function efficiently in C
(only built-in functions can do that)
while still allowing introspection like ``inspect.signature`` or ``inspect.getsourcefile``
(only Python functions can do that).
This is a problem for projects like Cython [#cython]_ that want to do exactly that.
In Cython, this was worked around by inventing a new function class called ``cyfunction``.
Unfortunately, a new function class creates problems:
the ``inspect`` module does not recognize such functions as being functions [#bpo30071]_
and the performance is worse
(CPython has specific optimizations for calling built-in functions).
A second motivation is more generally making built-in functions and methods
behave more like Python functions and methods.
For example, Python unbound methods are just functions but
unbound methods of extension types (e.g. ``dict.get``) are a distinct class.
Bound methods of Python classes have a ``__func__`` attribute,
bound methods of extension types do not.
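
The differences listed above can be observed directly in a current CPython interpreter. The following is a small sketch of today's behaviour (not of the behaviour proposed by this PEP); the class ``Example`` is a hypothetical name used only for illustration.

```python
import types


class Example:
    def method(self):
        return "called"


# Functions implemented in C and in Python have unrelated classes.
builtin_class = type(len).__name__            # "builtin_function_or_method"
python_class = type(lambda: None).__name__    # "function"

# An unbound method of a Python class is a plain function ...
python_unbound_is_function = type(Example.method) is types.FunctionType
# ... but an unbound method of an extension type is a distinct class.
extension_unbound_class = type(dict.get).__name__   # "method_descriptor"

# Only bound methods of Python classes expose __func__.
python_bound_has_func = hasattr(Example().method, "__func__")
extension_bound_has_func = hasattr({}.get, "__func__")
```
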
New classes
===========
This is the new class hierarchy for functions and methods::

                       object
                          |
                          |
                     basefunction
                    /     |      \
                   /      |       \
                  /       |     generic_function
                 /        |           \
  builtin_function (*)    |            \
                          |        python_function
                          |
                      method (*)

The two classes marked with (*) do *not* allow subclassing;
the others do.
There is no difference between functions and unbound methods,
while bound methods are instances of ``method``.
basefunction
------------
The class ``basefunction`` becomes a new base class for all function types.
It behaves like the existing ``builtin_function_or_method``
with some differences:

#. It acts as a descriptor implementing ``__get__`` to turn a function into a method
   if there is no ``__self__`` attribute.
   If the ``__self__`` attribute is already set, then this is a no-op:
   the existing function is returned instead.
#. A new read-only slot ``__objclass__``, represented in the C structure as ``m_selftype``.
   If this attribute exists, it must be a class (it cannot be ``None``).
   If so, the function must be called with ``self`` being an instance of that class.
   This is meant to support unbound methods of extension types, replacing ``method_descriptor``.
#. Argument Clinic [#clinic]_ is not supported.
#. The field ``ml_doc`` and the attributes ``__doc__`` and ``__text_signature__``
   are gone.
#. A new flag ``METH_CUSTOM`` for ``ml_flags`` which prevents automatic
   generation of a ``builtin_function``, see `Automatic creation of built-in functions`_.
#. A new flag ``METH_ARG0_FUNCTION`` for ``ml_flags``.
   If this flag is set, the C function stored in ``ml_meth`` will be called with first argument
   equal to the function object instead of ``__self__``.
#. A new flag ``METH_ARG0_NO_SLICE`` for ``ml_flags``.
   If this flag is *not* set, ``__objclass__`` is set and ``__self__`` is not set,
   then the first positional argument is treated as ``__self__``.
   For more details, see `Calling basefunction instances`_.
#. A new flag ``METH_PYTHON`` for ``ml_flags``.
   This flag indicates that this function should be treated as a Python function.
   Ideally, use of this flag should be avoided because it goes
   against the duck typing philosophy.
   It is still needed in a few places though, for example `Profiling`_.

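
The descriptor rule in the first item of the list above can be modelled in pure Python. This is only a sketch under stated assumptions: ``BaseFunctionModel``, ``impl`` and ``bound_self`` are hypothetical names, the model returns another ``BaseFunctionModel`` where the PEP would return a ``method``, and returning the function itself for access through the class is an assumption borrowed from how Python functions behave today.

```python
class BaseFunctionModel:
    """Hypothetical pure-Python stand-in for the proposed basefunction."""

    _UNSET = object()  # sentinel meaning "no __self__ attribute"

    def __init__(self, impl, bound_self=_UNSET):
        self.impl = impl
        self.bound_self = bound_self

    def __get__(self, instance, owner=None):
        # No-op if __self__ is already set (or on access through the class).
        if self.bound_self is not self._UNSET or instance is None:
            return self
        # Otherwise bind: the real proposal would create a ``method`` here.
        return BaseFunctionModel(self.impl, instance)

    def __call__(self, *args, **kwargs):
        if self.bound_self is self._UNSET:
            return self.impl(*args, **kwargs)
        return self.impl(self.bound_self, *args, **kwargs)


class Point:
    def __init__(self, x):
        self.x = x

    # No __self__: binds to the instance on attribute access.
    get_x = BaseFunctionModel(lambda self: self.x)
    # __self__ already set (like a module function): never rebinds.
    module_like = BaseFunctionModel(lambda mod, value: (mod, value), "fake-module")


p = Point(3)
bound_result = p.get_x()
module_result = p.module_like(5)
```

Here ``p.get_x()`` receives ``p`` as first argument, while ``p.module_like`` is the very same object that is stored in the class ``__dict__``.
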
The goal of ``basefunction`` is that it supports all different ways
of calling functions and methods in just one structure.
For example, the new flag ``METH_ARG0_FUNCTION``
will be used by the implementation of Python functions.
It is not possible to directly create instances of ``basefunction``
(``tp_new`` is ``NULL``).
However, it is legal for C code to manually create instances.
These are the relevant C structures::

    PyTypeObject PyBaseFunction_Type;

    typedef struct {
        PyObject_HEAD
        PyCFunctionDef *m_ml;     /* Description of the C function to call */
        PyObject *m_self;         /* __self__: anything, can be NULL; readonly */
        PyObject *m_module;       /* __module__: anything */
        PyObject *m_weakreflist;  /* List of weak references */
        PyObject *m_selftype;     /* __objclass__: type object or NULL; readonly */
    } PyBaseFunctionObject;

    typedef struct {
        const char *ml_name;   /* The name of the built-in function/method */
        PyCFunction ml_meth;   /* The C function that implements it */
        uint32_t ml_flags;     /* Combination of METH_xxx flags, which mostly
                                  describe the args expected by the C func */
    } PyCFunctionDef;

Note that the type of ``ml_flags`` was changed from ``int`` to
``uint32_t``, so that the number of available flag bits is fixed
and does not depend on the platform.
Subclasses may extend ``PyCFunctionDef`` with extra fields.
builtin_function
----------------
This is a copy of ``basefunction``, with the following differences:

#. ``m_ml`` points to a ``PyMethodDef`` structure,
   extending ``PyCFunctionDef`` with an additional ``ml_doc``
   field to implement ``__doc__`` and ``__text_signature__``
   as read-only attributes::

       typedef struct {
           const char *ml_name;
           PyCFunction ml_meth;
           uint32_t ml_flags;
           const char *ml_doc;
       } PyMethodDef;

#. Argument Clinic [#clinic]_ is supported.

The type object is ``PyTypeObject PyCFunction_Type``
and we define ``PyCFunctionObject`` as alias of ``PyBaseFunctionObject``.
generic_function
----------------
The class ``generic_function`` (a subclass of ``basefunction``) adds
support for various standard attributes which are used in ``inspect``.
This would be a good class to use for auto-generated C code, for example produced by Cython [#cython]_.
The layout of the C structure is as follows::

    PyTypeObject PyGenericFunction_Type;

    typedef struct {
        PyBaseFunctionObject base;
        PyObject *func_name;         /* __name__: string */
        PyObject *func_qualname;     /* __qualname__: string */
        PyObject *func_doc;          /* __doc__: can be anything or NULL */
        PyObject *func_code;         /* __code__: code or NULL */
        PyObject *func_defaults;     /* __defaults__: tuple or NULL */
        PyObject *func_kwdefaults;   /* __kwdefaults__: dict or NULL */
        PyObject *func_annotations;  /* __annotations__: dict or NULL */
        PyObject *func_globals;      /* __globals__: anything or NULL; readonly */
        PyObject *func_closure;      /* __closure__: tuple of cell objects or NULL; readonly */
        PyObject *func_dict;         /* __dict__: dict or NULL */
    } PyGenericFunctionObject;

This class adds various slots like ``__doc__`` and ``__code__`` to access the C attributes.
The slot ``__name__`` returns ``func_name``.
When setting ``__name__``, also ``base.m_ml.ml_name`` is updated
with the UTF-8 encoded name.
None of the attributes is required to be meaningful.
In particular, ``__code__`` may not be a working code object,
possibly only a few fields may be filled in.
And ``__defaults__`` is not required to be used for calling the function.
Apart from adding these extra attributes,
``generic_function`` behaves exactly the same as ``basefunction``.
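
To illustrate why these attributes matter for introspection, the sketch below uses an ordinary Python function in current CPython: ``inspect.signature`` derives its result from ``__code__`` and ``__defaults__``, which are the same attributes that ``generic_function`` would expose. The function ``scale`` is a hypothetical example.

```python
import inspect


def scale(value, factor=2):
    return value * factor


signature_before = str(inspect.signature(scale))

# The signature is computed from the attributes, not stored separately:
# changing __defaults__ changes what inspect reports (and the call result).
scale.__defaults__ = (10,)
signature_after = str(inspect.signature(scale))
result_after = scale(3)

# The parameter names come from the code object.
parameter_names = scale.__code__.co_varnames[: scale.__code__.co_argcount]
```
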
python_function
---------------
This is the class meant for functions implemented in Python,
formerly known as ``function``.
Unlike the other function types,
instances of ``python_function`` can be created from Python code.
The layout of the C structure is almost the same as ``generic_function``::

    PyTypeObject PyFunction_Type;

    typedef struct {
        PyBaseFunctionObject base;
        PyObject *func_name;         /* __name__: string */
        PyObject *func_qualname;     /* __qualname__: string */
        PyObject *func_doc;          /* __doc__: can be anything or NULL */
        PyObject *func_code;         /* __code__: code or NULL */
        PyObject *func_defaults;     /* __defaults__: tuple or NULL */
        PyObject *func_kwdefaults;   /* __kwdefaults__: dict or NULL */
        PyObject *func_annotations;  /* __annotations__: dict or NULL */
        PyObject *func_globals;      /* __globals__: anything or NULL; readonly */
        PyObject *func_closure;      /* __closure__: tuple of cell objects or NULL; readonly */
        PyObject *func_dict;         /* __dict__: dict or NULL */
        PyCFunctionDef _ml;          /* Storage for base.m_ml */
    } PyFunctionObject;

The only difference is an ``_ml`` field
which reserves space to be used by ``base.m_ml``.
However, it is not required that ``base.m_ml`` points to ``_ml``.
The constructor takes care of setting up ``base.m_ml``.
In particular, it sets the ``METH_PYTHON`` flag.
method
------
The class ``method`` is used for all bound methods,
regardless of the class of the underlying function.
There is one extra attribute ``__func__`` pointing to that function.
For methods, there is a complication because we want to allow
constructing a method from an arbitrary callable which
may not be an instance of ``basefunction``.
Therefore, in practice there are two kinds of methods:
for arbitrary callables, we use a single fixed ``PyCFunctionDef``
structure with ``ml_name`` equal to ``"?"``
and with the ``METH_ARG0_FUNCTION`` flag set.
The C function then calls ``__func__`` with the correct arguments.
For methods which bind instances of ``basefunction``
(more precisely, which have the ``Py_TPFLAGS_BASEFUNCTION`` flag set),
we instead use the ``PyCFunctionDef`` from the original function.
In this case, the ``__func__`` attribute is only used to implement various attributes
but not for calling the method.
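
Constructing a method from an arbitrary callable is already possible today through ``types.MethodType``, which is the behaviour the first kind of method has to keep supporting. A minimal sketch in current CPython, where ``Greeter`` is a hypothetical callable class that is not a function:

```python
import types


class Greeter:
    """A callable that is not a function of any kind."""

    def __call__(self, bound_obj, name):
        return f"{bound_obj} greets {name}"


greeter = Greeter()
# Bind the callable to an arbitrary object; calling the method
# calls __func__ with __self__ prepended to the arguments.
bound = types.MethodType(greeter, "Alice")
greeting = bound("Bob")
```
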
When constructing a new method from a ``basefunction``,
we check that the ``self`` object is an instance of ``__objclass__``
(if such a class was specified) and raise a ``TypeError`` otherwise.
The C structure is::

    typedef struct {
        PyBaseFunctionObject base;
        PyObject *im_func;  /* __func__: function implementing the method; readonly */
    } PyMethodObject;

Calling basefunction instances
==============================
We specify the implementation of ``__call__`` for instances of ``basefunction``.
__objclass__
------------
First of all, if the function has an ``__objclass__`` attribute but no
``__self__`` attribute (this is the case for unbound methods of extension types),
then the function must be called with at least one positional argument
and the first (typically called ``self``) must be an instance of ``__objclass__``.
If not, a ``TypeError`` is raised.
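
This rule matches what ``method_descriptor`` objects already do in current CPython, so it can be tried out today. The sketch below uses ``dict.get``; the helper ``raises_type_error`` is a hypothetical name introduced only for this illustration.

```python
unbound = dict.get  # an unbound method of an extension type


def raises_type_error(*args):
    """Return True if calling the unbound method raises TypeError."""
    try:
        unbound(*args)
    except TypeError:
        return True
    return False


objclass = unbound.__objclass__
ok_result = unbound({"a": 1}, "a")          # first argument is a dict: fine
wrong_self = raises_type_error([], "a")     # a list is not an instance of dict
missing_self = raises_type_error()          # no positional argument at all
```
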
Flags
-----
For convenience, we define two new constants:
``METH_CALLSIGNATURE`` combines the flags from ``PyCFunctionDef.ml_flags``
which specify the signature of the C function to be called.
It is equal to ::

    METH_NOARGS | METH_O | METH_VARARGS | METH_FASTCALL | METH_KEYWORDS

Exactly one of the first four flags above must be set
and only ``METH_VARARGS`` and ``METH_FASTCALL`` may be combined with ``METH_KEYWORDS``.
Violating these rules is undefined behaviour.
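
The two rules above can be written down as a small validity check. This is a sketch in Python rather than C; the numeric values are believed to match ``Include/methodobject.h`` in CPython 3.7 but should be treated as illustrative, and ``is_valid_call_signature`` is a hypothetical helper, not part of any API.

```python
# Existing METH_xxx values (illustrative; see the lead-in).
METH_VARARGS = 0x0001
METH_KEYWORDS = 0x0002
METH_NOARGS = 0x0004
METH_O = 0x0008
METH_FASTCALL = 0x0080

METH_CALLSIGNATURE = (
    METH_NOARGS | METH_O | METH_VARARGS | METH_FASTCALL | METH_KEYWORDS
)


def is_valid_call_signature(flags):
    """Check the two rules for the signature part of ml_flags."""
    signature = flags & METH_CALLSIGNATURE
    base = signature & ~METH_KEYWORDS
    # Rule 1: exactly one of the first four flags must be set.
    if base not in (METH_NOARGS, METH_O, METH_VARARGS, METH_FASTCALL):
        return False
    # Rule 2: METH_KEYWORDS only combines with METH_VARARGS or METH_FASTCALL.
    if signature & METH_KEYWORDS:
        return base in (METH_VARARGS, METH_FASTCALL)
    return True
```

For example, ``METH_VARARGS | METH_KEYWORDS`` passes the check while ``METH_NOARGS | METH_KEYWORDS`` and ``METH_NOARGS | METH_O`` do not.
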
The second new constant is ``METH_CALLFLAGS``.
It combines all flags which influence how a function is called.
It is equal to ::

    METH_CALLSIGNATURE | METH_ARG0_FUNCTION | METH_ARG0_NO_SLICE

Some of these flags are already documented [#methoddoc]_.
We explain the others below.
METH_FASTCALL
-------------
This is an existing but undocumented flag.
We suggest officially supporting and documenting it.
If the flag ``METH_FASTCALL`` is set without ``METH_KEYWORDS``,
then the ``ml_meth`` field is of type ``PyCFunctionFast``
which takes the arguments ``(PyObject *arg0, PyObject *const *args, Py_ssize_t nargs)``.
Such a function takes only positional arguments and they are passed as plain C array
``args`` of length ``nargs``.
If the flags ``METH_FASTCALL | METH_KEYWORDS`` are set,
then the ``ml_meth`` field is of type ``PyCFunctionFastWithKeywords``
which takes the arguments ``(PyObject *arg0, PyObject *const *args, Py_ssize_t nargs, PyObject *kwnames)``.
The positional arguments are passed as C array ``args`` of length ``nargs``.
The *values* of the keyword arguments follow in that array,
starting at position ``nargs``.
The *keys* (names) of the keyword arguments are passed as a ``tuple`` in ``kwnames``.
As an example, assume that 3 positional and 2 keyword arguments are given.
Then ``args`` is an array of length 3 + 2 = 5, ``nargs`` equals 3 and ``kwnames`` is a 2-tuple.
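
The argument layout for ``METH_FASTCALL | METH_KEYWORDS`` can be modelled in Python, with a list standing in for the C array. This is a sketch only; ``pack_fastcall`` and ``unpack_fastcall`` are hypothetical helper names.

```python
def pack_fastcall(*positional, **keywords):
    """Lay out a call the way a FASTCALL|KEYWORDS function receives it."""
    # Positional arguments first, then the keyword *values* in order.
    args = list(positional) + list(keywords.values())
    nargs = len(positional)
    # The keyword *names* travel separately as a tuple.
    kwnames = tuple(keywords)
    return args, nargs, kwnames


def unpack_fastcall(args, nargs, kwnames):
    """Recover (positional, keywords) from the fastcall layout."""
    positional = tuple(args[:nargs])
    keywords = dict(zip(kwnames, args[nargs:]))
    return positional, keywords


# The example from the text: 3 positional and 2 keyword arguments.
args, nargs, kwnames = pack_fastcall(1, 2, 3, x=4, y=5)
roundtrip = unpack_fastcall(args, nargs, kwnames)
```
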
METH_ARG0_FUNCTION
------------------
If this flag is set, then the first argument to the C function
is the function itself (the ``basefunction`` instance) instead of ``__self__``.
In this case, the C function should deal with ``__self__``
by getting it from the function, for example using ``PyBaseFunction_GET_SELF``.
METH_ARG0_NO_SLICE
------------------
If the function has an ``__objclass__`` attribute, no ``__self__``
attribute and neither ``METH_ARG0_FUNCTION`` nor ``METH_ARG0_NO_SLICE`` is set,
then the first positional argument (which must exist because of ``__objclass__``)
is removed from ``*args`` and instead passed as first argument to the C function.
Effectively, the first positional argument is treated as ``__self__``.
This process is called "self slicing".
This does not affect keyword arguments.
It is not allowed to combine the flags ``METH_ARG0_FUNCTION`` and ``METH_ARG0_NO_SLICE``.
That is not a problem because ``METH_ARG0_FUNCTION`` already disables self slicing.
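
The interplay of ``__objclass__``, ``METH_ARG0_FUNCTION`` and ``METH_ARG0_NO_SLICE`` can be summarised in a small Python model of how the first argument of the C function is chosen. Everything here is an assumption made for illustration: the flag values are hypothetical (this PEP does not fix them), ``resolve_arg0`` is not a proposed API, and ``None`` stands in for a ``NULL`` first argument.

```python
METH_ARG0_FUNCTION = 0x1   # hypothetical value
METH_ARG0_NO_SLICE = 0x2   # hypothetical value
UNSET = object()           # sentinel meaning "no __self__ attribute"


def resolve_arg0(func, flags, self_obj, objclass, args):
    """Return (arg0, remaining_args) as passed to the C function."""
    if flags & METH_ARG0_FUNCTION and flags & METH_ARG0_NO_SLICE:
        raise ValueError("the two ARG0 flags may not be combined")
    unbound = objclass is not None and self_obj is UNSET
    if unbound:
        # The __objclass__ check from the previous section.
        if not args or not isinstance(args[0], objclass):
            raise TypeError("first argument must be an instance of __objclass__")
    if flags & METH_ARG0_FUNCTION:
        # The function itself is passed; no self slicing happens.
        return func, args
    if unbound and not flags & METH_ARG0_NO_SLICE:
        # Self slicing: the first positional argument becomes arg0.
        return args[0], args[1:]
    return (None if self_obj is UNSET else self_obj), args


d = {"k": 1}
sliced = resolve_arg0("f", 0, UNSET, dict, (d, "k"))
not_sliced = resolve_arg0("f", METH_ARG0_NO_SLICE, UNSET, dict, (d, "k"))
function_first = resolve_arg0("f", METH_ARG0_FUNCTION, UNSET, dict, (d, "k"))
already_bound = resolve_arg0("f", 0, d, dict, ("k",))
```
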
Automatic creation of built-in functions
========================================
Python automatically generates instances of ``builtin_function``
for extension types (using the ``PyTypeObject.tp_methods`` field) and modules
(using the ``PyModuleDef.m_methods`` field).
The arrays ``PyTypeObject.tp_methods`` and ``PyModuleDef.m_methods``
must be arrays of ``PyMethodDef`` structures.
If the ``METH_CUSTOM`` flag is set for an element of such an array,
then no ``builtin_function`` will be generated.
This allows an application to customize the creation of functions
in an extension type or module.
If ``METH_CUSTOM`` is set, then ``METH_STATIC`` and ``METH_CLASS`` are ignored.
Built-in unbound methods
------------------------
The type of unbound methods changes from ``method_descriptor``
to ``builtin_function``.
The object which appears as unbound method is the same object which
appears in the class ``__dict__``.
Python automatically sets the ``__objclass__`` attribute.
Built-in functions of a module
------------------------------
For the case of functions of a module,
``__self__`` will be set to the module unless the flag ``METH_STATIC`` is set.
An important consequence is that such functions by default
do not become methods when used as attribute
(``basefunction.__get__`` only does that if ``__self__`` was unset).
One could consider this a bug, but this was done for backwards compatibility reasons:
in an initial post on python-ideas [#proposal]_ the consensus was to keep this
misfeature of built-in functions.
However, to allow this anyway for specific or newly implemented
built-in functions, the ``METH_STATIC`` flag prevents setting ``__self__``.
Previously, specifying ``METH_STATIC`` for a module function was an error, so this is fully backwards compatible.
Specifying ``METH_CLASS`` is still an error.
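
The existing behaviour that is being preserved here can be checked in current CPython. In this sketch, ``Container`` is a hypothetical class: the built-in ``len`` stored as a class attribute does not turn into a method, whereas a Python function does.

```python
import builtins


class Container:
    size = len            # built-in function of a module: does not bind

    def py_size(self):    # Python function: binds as a method
        return 0


c = Container()
same_object = c.size is len       # no method object is created
size_of_string = c.size("abc")    # called without the instance
module_self = len.__self__        # __self__ is the defining module
binds = c.py_size.__self__ is c
```
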
Further changes
===============
New type flag
-------------
A new ``PyTypeObject`` flag (for ``tp_flags``) is added:
``Py_TPFLAGS_BASEFUNCTION`` to indicate that instances of this type are
functions which can be called as a ``basefunction``.
In other words, subclasses of ``basefunction``
which follow the implementation from `Calling basefunction instances`_.
This is different from flags like ``Py_TPFLAGS_LIST_SUBCLASS``
because it indicates more than just a subclass:
it also indicates a default implementation of ``__call__``.
This flag is never inherited.
However, extension types can explicitly specify it if they
do not override ``__call__`` or if they override ``__call__`` in a compatible way.
The flag ``Py_TPFLAGS_BASEFUNCTION`` must never be set for a heap type
because that would not be safe (heap types can be changed dynamically).
C API functions
---------------
We add and change some Python/C API functions:

- ``int PyBaseFunction_Check(PyObject *op)``: return true if ``op``
  is an instance of a type with the ``Py_TPFLAGS_BASEFUNCTION`` flag set.
- ``int PyCFunction_Check(PyObject *op)``: return true if ``PyBaseFunction_Check(op)``
  is true and the function ``op`` does not have the flag ``METH_PYTHON`` set.
- ``int PyBuiltinFunction_Check(PyObject *op)``: return true if ``op``
  is an instance of ``builtin_function``.
- ``int PyFunction_Check(PyObject *op)``: return true if ``op``
  is an instance of ``generic_function``.
- ``PyObject* PyFunction_New(PyObject *code, PyObject *globals)``:
  create a new instance of ``python_function``.
- ``PyObject* PyFunction_NewWithQualName(PyObject *code, PyObject *globals, PyObject *qualname)``:
  create a new instance of ``python_function``.
- For some existing ``PyCFunction_...`` and ``PyMethod_...`` functions,
  we define a new function ``PyBaseFunction_...``
  acting on ``basefunction`` instances.
  For backwards compatibility,
  the old functions are kept as aliases of the new functions.

**TODO**: more functions may be added when implementing this PEP.
In particular, maybe there should be functions for creating instances of ``basefunction``
or ``generic_function``.
Changes to the types module
---------------------------
Two types are added: ``types.BaseFunctionType`` corresponding to
``basefunction`` and ``types.GenericFunctionType`` corresponding to
``generic_function``.
Apart from that, no changes to the ``types`` module are made.
In particular, ``types.FunctionType`` refers to ``python_function``.
However, the actual types will change:
for example, ``types.BuiltinFunctionType`` will no longer be the same
as ``types.BuiltinMethodType``.
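
For comparison, this is how the ``types`` module looks in current CPython (3.7 or later), before the changes proposed here. The sketch also shows why ``types.FunctionType`` should keep referring to a class that can be instantiated; ``template`` and ``clone`` are hypothetical names.

```python
import types

# Today, both names refer to the same class.
same_builtin_types = types.BuiltinFunctionType is types.BuiltinMethodType

len_type_matches = type(len) is types.BuiltinFunctionType
unbound_type_matches = type(dict.get) is types.MethodDescriptorType


def template():
    return "hello"


# types.FunctionType can be called to construct a new function
# from a code object and a globals dictionary.
clone = types.FunctionType(template.__code__, globals(), "clone")
clone_result = clone()
```
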
Changes to the inspect module
-----------------------------
``inspect.isbasefunction`` checks for an instance of ``basefunction``.
``inspect.isfunction`` checks for an instance of ``generic_function``.
``inspect.isbuiltin`` checks for an instance of ``builtin_function``.
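
As a baseline, the following sketch records what the existing ``inspect`` predicates return in current CPython. Under this PEP, the answers for ``dict.get`` would change because unbound methods of extension types become instances of ``builtin_function``. The function ``py_func`` is a hypothetical example.

```python
import inspect


def py_func():
    pass


current_answers = {
    "isfunction(py_func)": inspect.isfunction(py_func),
    "isfunction(len)": inspect.isfunction(len),
    "isbuiltin(len)": inspect.isbuiltin(len),
    # An unbound method of an extension type is neither today:
    "isbuiltin(dict.get)": inspect.isbuiltin(dict.get),
    "ismethoddescriptor(dict.get)": inspect.ismethoddescriptor(dict.get),
}
```
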
Profiling
---------
Currently, ``sys.setprofile`` supports ``c_call``, ``c_return`` and ``c_exception``
events for built-in functions.
These events are generated when calling or returning from a built-in function.
By contrast, the ``call`` and ``return`` events are generated by the function itself.
So nothing needs to change for the ``call`` and ``return`` events.
Since we no longer make a difference between C functions and Python functions,
we need to prevent the ``c_*`` events for Python functions.
This is done by not generating those events if the
``METH_PYTHON`` flag in ``ml_flags`` is set.
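
The two families of events can be observed with ``sys.setprofile`` in current CPython. This is a minimal sketch; ``record_profile_events`` and ``measure`` are hypothetical names, and the recorded list also contains a few events for the ``sys.setprofile`` calls themselves, which is why no complete expected output is given.

```python
import sys


def record_profile_events(func):
    """Call func() under a profiler and return the (event, name) pairs."""
    events = []

    def profiler(frame, event, arg):
        if event.startswith("c_"):
            # For c_* events, arg is the C function object.
            name = getattr(arg, "__name__", repr(arg))
        else:
            # For call/return, the frame identifies the Python function.
            name = frame.f_code.co_name
        events.append((event, name))

    previous = sys.getprofile()
    sys.setprofile(profiler)
    try:
        func()
    finally:
        sys.setprofile(previous)
    return events


def measure():
    return len("abc")


events = record_profile_events(measure)
```

The Python function ``measure`` produces ``call`` and ``return`` events, while the built-in ``len`` produces ``c_call`` and ``c_return`` events.
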
User flags in PyCFunctionDef.ml_flags
----------------------------------------
8 consecutive bits in ``ml_flags`` are reserved for the "user",
meaning the person or program who implemented the function.
These are ``METH_USR0``, ..., ``METH_USR7``.
Python will ignore these flags.
It should be clear that different users may use these flags
for different purposes, so users should only look at those flags in
functions that they implemented (for example, by looking for those flags
in the ``tp_methods`` array of an extension type).
Non-CPython implementations
===========================
For other implementations of Python apart from CPython,
only the classes ``basefunction``, ``method`` and ``python_function`` are required.
The latter two are the only classes which can be instantiated directly
from the Python interpreter.
We require ``basefunction`` for consistency but we put no requirements on it:
it is acceptable if this is just a copy of ``object``.
Support for the new ``__objclass__`` attribute is not required.
If there is no ``generic_function`` type,
then ``types.GenericFunctionType`` should be an alias of ``types.FunctionType``.
Rationale
=========
Why not simply change existing classes?
---------------------------------------
One could try to solve the problem not by introducing a new ``basefunction``
class and changing the class hierarchy, but by just changing existing classes.
That might look like a simpler solution but it is not:
it would require introspection support for 3 distinct classes:
``function``, ``builtin_function_or_method`` and ``method_descriptor``.
In the current PEP, there is only a single class where introspection needs
to be implemented.
It is also not clear how this would interact with ``__text_signature__``.
Having two independent kinds of ``inspect.signature`` support on the same
class sounds like asking for problems.
And this would not fix some of the other differences between built-in functions
and Python functions that were mentioned in the `Motivation`_.
Why __text_signature__ is not a solution
----------------------------------------
Built-in functions have an attribute ``__text_signature__``,
which gives the signature of the function as plain text.
The default values are evaluated by ``ast.literal_eval``.
Because of this, it supports only a small number of standard Python classes
and not arbitrary Python objects.
And even if ``__text_signature__`` would allow arbitrary signatures somehow,
that is only one piece of introspection:
it does not help with ``inspect.getsourcefile`` for example.
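
The limitation can be demonstrated in current CPython. In this sketch, ``literal_eval_accepts`` is a hypothetical helper; it shows that ``ast.literal_eval`` handles literals but rejects an expression that would construct an arbitrary object, so such an object cannot appear as a default value in ``__text_signature__``.

```python
import ast
import inspect

# The signature of a built-in function is stored as plain text.
text_signature = len.__text_signature__
parameter_names = list(inspect.signature(len).parameters)


def literal_eval_accepts(source):
    """Return True if ast.literal_eval can evaluate the given source."""
    try:
        ast.literal_eval(source)
    except (ValueError, SyntaxError):
        return False
    return True


literal_ok = literal_eval_accepts("[1, 2.5, 'x', None]")
arbitrary_object_ok = literal_eval_accepts("object()")
```
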
generic_function versus python_function
---------------------------------------
The names ``generic_function`` and ``python_function``
were chosen to be different from ``function``
because neither of the two classes ``generic_function``/``python_function``
is an obvious candidate to receive the ``function`` name.
It also allows using the word "function" informally without referring
to a specific class.
In many places, a decision needs to be made whether the old ``function`` class
should be replaced by ``generic_function`` or ``python_function``.
This is done by thinking of the most likely use case:

1. ``types.FunctionType`` refers to ``python_function`` because that
   type might be used to construct instances using ``types.FunctionType(...)``.
2. ``inspect.isfunction()`` refers to ``generic_function``
   because this is the class where introspection is supported.
3. The C API functions ``PyFunction_New...``
   refer to ``python_function`` simply because one cannot create instances
   of ``generic_function``.
4. The C API functions ``PyFunction_Check`` and ``PyFunction_Get/Set...``
   refer to ``generic_function`` because all attributes exist for instances of ``generic_function``.

Scope of this PEP: which classes are involved?
----------------------------------------------
The main motivation of this PEP is fixing function classes,
so we certainly want to unify the existing classes
``builtin_function_or_method`` and ``function``.
Since built-in functions and methods have the same class,
it seems natural to include bound methods too.
And since there are no "unbound methods" for Python functions,
it makes sense to get rid of unbound methods for extension types.
For now, no changes are made to the classes ``staticmethod``,
``classmethod`` and ``classmethod_descriptor``.
It would certainly make sense to put these in the ``basefunction``
class hierarchy and unify ``classmethod`` and ``classmethod_descriptor``.
However, this PEP is already big enough
and this is left as a possible future improvement.
Slot wrappers for extension types like ``__init__`` or ``__eq__``
are quite different from normal methods.
They are also typically not called directly because you would normally
write ``foo[i]`` instead of ``foo.__getitem__(i)`` for example.
So these are left outside the scope of this PEP.
Python also has an ``instancemethod`` class, which was used in Python 2
for unbound methods.
It is not clear whether there is still a use case for it.
In any case, there is no reason to deal with it in this PEP.
**TODO**: should ``instancemethod`` be deprecated?
It doesn't seem used at all within CPython 3.7,
but maybe external packages use it?
__self__ in basefunction
------------------------
It may look strange at first sight to add the ``__self__`` slot
in ``basefunction`` as opposed to ``method``.
We took this idea from the existing ``builtin_function_or_method`` class.
It allows us to have a single general implementation of ``__call__``
for the various function classes discussed in this PEP.
It also makes it easy to support existing built-in functions
which set ``__self__`` to the module (for example, ``sys.exit.__self__`` is ``sys``).
Subclassing
-----------
We disallow subclassing of ``builtin_function`` and ``method``
to enable fast type checks for ``PyBuiltinFunction_Check`` and ``PyMethod_Check()``.
We allow subclassing of the other classes because there is no reason to disallow it.
For Python modules, the only relevant class to subclass is
``python_function`` because the others cannot be instantiated anyway.
Replacing tp_call: METH_ARG0_FUNCTION
-------------------------------------
The new flag ``METH_ARG0_FUNCTION`` is meant to support cases where
formerly a custom ``tp_call`` was used.
It would reduce the number of special fast paths in ``Python/ceval.c``
for calling objects:
instead of treating Python functions, built-in functions and methods separately,
there would only be a single check.
The signature of ``tp_call`` is essentially the signature
of ``PyBaseFunctionObject.m_ml.ml_meth`` with flags
``METH_VARARGS | METH_KEYWORDS | METH_ARG0_FUNCTION``.
Therefore, it should be easy to change existing ``tp_call`` slots
to use ``METH_ARG0_FUNCTION``.
There is just one extra complication: ``__self__`` must be handled manually.
That is not hard though: it just means adapting that logic from ``method``.
Self slicing: METH_ARG0_NO_SLICE
--------------------------------
We define "self slicing" to mean slicing off the ``self`` argument of a method
from the ``*args`` tuple when an unbound method is called.
This ``self`` argument is then passed as first argument to the C function.
The specification of ``METH_ARG0_NO_SLICE`` may seem strange at first.
The negation is confusing, but it is done for backwards compatibility:
existing methods require self slicing but do not specify a flag for it.
The requirement for ``__objclass__`` in order to use self slicing
makes sense because it guarantees that there is a ``self`` argument in the first place.
Since ``METH_ARG0_FUNCTION`` is clearly incompatible with self slicing
(both use the first argument of the C function),
this PEP dictates that ``METH_ARG0_FUNCTION`` disables self slicing.
So one may wonder if there is actually a use case for ``METH_ARG0_NO_SLICE``
without ``METH_ARG0_FUNCTION``.
If not, then one could simply unify those two flags in one flag
``METH_ARG0_FUNCTION``.
However, a priori, the flag ``METH_ARG0_NO_SLICE`` is meaningful,
so we keep the two flags ``METH_ARG0_FUNCTION`` and ``METH_ARG0_NO_SLICE`` separate.
**TODO**: this should be reconsidered after initial implementation
and testing of this PEP.
User flags: METH_CUSTOM and METH_USRx
-------------------------------------
These flags are meant for applications that want to use
``tp_methods`` for an extension type or ``m_methods`` for a module
but that do not want the default built-in functions to be created.
Those applications would set ``METH_CUSTOM``.
The application is also free to use ``METH_USR0``, ..., ``METH_USR7``
for its own purposes,
for example to customize the creation of special function instances.
There is no obvious concrete use case,
but given that it costs essentially nothing to have these flags,
it seems like a good idea to allow it.
Backwards Compatibility
=======================
While designing this PEP, great care was taken to not break
backwards compatibility too much.
Python functions
----------------
For Python functions, essentially nothing changes.
The attributes that existed before still exist and Python functions
can be initialized, called and turned into methods as before.
Built-in functions of a module
------------------------------
Also for built-in functions, nothing changes.
We keep the old behaviour that such functions do not bind as methods.
This is a consequence of the fact that ``__self__`` is set to the module.
Built-in bound and unbound methods
----------------------------------
The types of built-in bound and unbound methods will change.
However, this does not affect calling such methods
because the protocol in ``basefunction.__call__``
(in particular the handling of ``__objclass__`` and self slicing)
was specifically designed to be backwards compatible.
All attributes which existed before (like ``__objclass__`` and ``__self__``)
still exist.
New classes
-----------
Tools which take various kinds of functions as input will need to deal
with the new function hierarchy and the possibility of custom
function classes.
If those tools use ``inspect`` properly, there should be few
backwards compatibility problems.
New attributes
--------------
Some objects get new attributes.
For example, ``__objclass__`` now appears on bound methods too
and all methods get a ``__func__`` attribute.
We expect that this will not cause problems.
Reference Implementation
========================
After initial discussions of this PEP draft,
work will start on a reference implementation in CPython.
Appendix: current situation
===========================
**NOTE**:
This section is more useful during the draft period of the PEP,
so feel free to remove this once the PEP has been accepted.
For reference, we describe in detail the relevant existing classes in CPython 3.7.
There are a surprisingly large number of classes involved,
and each of them is an "orphan" class (with no non-trivial subclasses or superclasses).
builtin_function_or_method: built-in functions and bound methods
----------------------------------------------------------------
These are of type `PyCFunction_Type <https://github.com/python/cpython/blob/2cb4661707818cfd92556e7fdf9068a993577002/Objects/methodobject.c#L271>`_
with structure `PyCFunctionObject <https://github.com/python/cpython/blob/2cb4661707818cfd92556e7fdf9068a993577002/Include/methodobject.h#L102>`_::

    typedef struct {
        PyObject_HEAD
        PyMethodDef *m_ml;        /* Description of the C function to call */
        PyObject *m_self;         /* Passed as 'self' arg to the C func, can be NULL */
        PyObject *m_module;       /* The __module__ attribute, can be anything */
        PyObject *m_weakreflist;  /* List of weak references */
    } PyCFunctionObject;

    struct PyMethodDef {
        const char *ml_name;   /* The name of the built-in function/method */
        PyCFunction ml_meth;   /* The C function that implements it */
        int ml_flags;          /* Combination of METH_xxx flags, which mostly
                                  describe the args expected by the C func */
        const char *ml_doc;    /* The __doc__ attribute, or NULL */
    };

where ``PyCFunction`` is a C function pointer (there are various forms of this, the most basic
takes two arguments for ``self`` and ``*args``).
This class is used both for functions and bound methods:
for a method, the ``m_self`` slot points to the object::

    >>> dict(foo=42).get
    <built-in method get of dict object at 0x...>
    >>> dict(foo=42).get.__self__
    {'foo': 42}

In some cases, a function is considered a "method" of the module defining it::

    >>> import os
    >>> os.kill
    <built-in function kill>
    >>> os.kill.__self__
    <module 'posix' (built-in)>

method_descriptor: built-in unbound methods
-------------------------------------------
These are of type `PyMethodDescr_Type <https://github.com/python/cpython/blob/2cb4661707818cfd92556e7fdf9068a993577002/Objects/descrobject.c#L538>`_
with structure `PyMethodDescrObject <https://github.com/python/cpython/blob/2cb4661707818cfd92556e7fdf9068a993577002/Include/descrobject.h#L53>`_::

    typedef struct {
        PyDescrObject d_common;
        PyMethodDef *d_method;
    } PyMethodDescrObject;

    typedef struct {
        PyObject_HEAD
        PyTypeObject *d_type;
        PyObject *d_name;
        PyObject *d_qualname;
    } PyDescrObject;

function: Python functions
--------------------------
These are of type `PyFunction_Type <https://github.com/python/cpython/blob/2cb4661707818cfd92556e7fdf9068a993577002/Objects/funcobject.c#L592>`_
with structure `PyFunctionObject <https://github.com/python/cpython/blob/2cb4661707818cfd92556e7fdf9068a993577002/Include/funcobject.h#L21>`_::

    typedef struct {
        PyObject_HEAD
        PyObject *func_code;         /* A code object, the __code__ attribute */
        PyObject *func_globals;      /* A dictionary (other mappings won't do) */
        PyObject *func_defaults;     /* NULL or a tuple */
        PyObject *func_kwdefaults;   /* NULL or a dict */
        PyObject *func_closure;      /* NULL or a tuple of cell objects */
        PyObject *func_doc;          /* The __doc__ attribute, can be anything */
        PyObject *func_name;         /* The __name__ attribute, a string object */
        PyObject *func_dict;         /* The __dict__ attribute, a dict or NULL */
        PyObject *func_weakreflist;  /* List of weak references */
        PyObject *func_module;       /* The __module__ attribute, can be anything */
        PyObject *func_annotations;  /* Annotations, a dict or NULL */
        PyObject *func_qualname;     /* The qualified name */

        /* Invariant:
         *     func_closure contains the bindings for func_code->co_freevars, so
         *     PyTuple_Size(func_closure) == PyCode_GetNumFree(func_code)
         *     (func_closure may be NULL if PyCode_GetNumFree(func_code) == 0).
         */
    } PyFunctionObject;

In Python 3, there is no "unbound method" class:
an unbound method is just a plain function.
method: Python bound methods
----------------------------
These are of type `PyMethod_Type <https://github.com/python/cpython/blob/2cb4661707818cfd92556e7fdf9068a993577002/Objects/classobject.c#L329>`_
with structure `PyMethodObject <https://github.com/python/cpython/blob/2cb4661707818cfd92556e7fdf9068a993577002/Include/classobject.h#L12>`_::

    typedef struct {
        PyObject_HEAD
        PyObject *im_func;         /* The callable object implementing the method */
        PyObject *im_self;         /* The instance it is bound to */
        PyObject *im_weakreflist;  /* List of weak references */
    } PyMethodObject;

References
==========
.. [#cython] Cython (http://cython.org/)
.. [#bpo30071] Python bug 30071 (https://bugs.python.org/issue30071)
.. [#clinic] PEP 436, The Argument Clinic DSL, Hastings (https://www.python.org/dev/peps/pep-0436)
.. [#methoddoc] PyMethodDef documentation (https://docs.python.org/3.7/c-api/structures.html#c.PyMethodDef)
.. [#proposal] PEP proposal: unifying function/method classes (https://mail.python.org/pipermail/python-ideas/2018-March/049398.html)
Copyright
=========
This document has been placed in the public domain.
..
Local Variables:
mode: indented-text
indent-tabs-mode: nil
sentence-end-double-space: t
fill-column: 70
coding: utf-8
End: