diff --git a/pep-0253.txt b/pep-0253.txt index 8747dde1f..a3fb28e5e 100644 --- a/pep-0253.txt +++ b/pep-0253.txt @@ -5,104 +5,108 @@ Last-Modified: $Date$ Author: guido@python.org (Guido van Rossum) Status: Final Type: Standards Track +Content-Type: text/x-rst Created: 14-May-2001 Python-Version: 2.2 Post-History: Abstract +======== - This PEP proposes additions to the type object API that will allow - the creation of subtypes of built-in types, in C and in Python. +This PEP proposes additions to the type object API that will allow +the creation of subtypes of built-in types, in C and in Python. - [Editor's note: the ideas described in this PEP have been incorporated - into Python. The PEP no longer accurately describes the implementation.] +[Editor's note: the ideas described in this PEP have been incorporated +into Python. The PEP no longer accurately describes the implementation.] Introduction +============ - Traditionally, types in Python have been created statically, by - declaring a global variable of type PyTypeObject and initializing - it with a static initializer. The slots in the type object - describe all aspects of a Python type that are relevant to the - Python interpreter. A few slots contain dimensional information - (like the basic allocation size of instances), others contain - various flags, but most slots are pointers to functions to - implement various kinds of behaviors. A NULL pointer means that - the type does not implement the specific behavior; in that case - the system may provide a default behavior or raise an exception - when the behavior is invoked for an instance of the type. Some - collections of function pointers that are usually defined together - are obtained indirectly via a pointer to an additional structure - containing more function pointers. +Traditionally, types in Python have been created statically, by +declaring a global variable of type PyTypeObject and initializing +it with a static initializer. 
The slots in the type object +describe all aspects of a Python type that are relevant to the +Python interpreter. A few slots contain dimensional information +(like the basic allocation size of instances), others contain +various flags, but most slots are pointers to functions to +implement various kinds of behaviors. A NULL pointer means that +the type does not implement the specific behavior; in that case +the system may provide a default behavior or raise an exception +when the behavior is invoked for an instance of the type. Some +collections of function pointers that are usually defined together +are obtained indirectly via a pointer to an additional structure +containing more function pointers. - While the details of initializing a PyTypeObject structure haven't - been documented as such, they are easily gleaned from the examples - in the source code, and I am assuming that the reader is - sufficiently familiar with the traditional way of creating new - Python types in C. +While the details of initializing a PyTypeObject structure haven't +been documented as such, they are easily gleaned from the examples +in the source code, and I am assuming that the reader is +sufficiently familiar with the traditional way of creating new +Python types in C. - This PEP will introduce the following features: +This PEP will introduce the following features: - - a type can be a factory function for its instances +- a type can be a factory function for its instances - - types can be subtyped in C +- types can be subtyped in C - - types can be subtyped in Python with the class statement +- types can be subtyped in Python with the class statement - - multiple inheritance from types is supported (insofar as - practical -- you still can't multiply inherit from list and - dictionary) +- multiple inheritance from types is supported (insofar as + practical -- you still can't multiply inherit from list and + dictionary) - - the standard coercion functions (int, tuple, str etc.) 
will - be redefined to be the corresponding type objects, which serve - as their own factory functions +- the standard coercion functions (int, tuple, str etc.) will + be redefined to be the corresponding type objects, which serve + as their own factory functions - - a class statement can contain a __metaclass__ declaration, - specifying the metaclass to be used to create the new class +- a class statement can contain a ``__metaclass__`` declaration, + specifying the metaclass to be used to create the new class - - a class statement can contain a __slots__ declaration, - specifying the specific names of the instance variables - supported +- a class statement can contain a ``__slots__`` declaration, + specifying the specific names of the instance variables + supported - This PEP builds on PEP 252, which adds standard introspection to - types; for example, when a particular type object initializes the - tp_hash slot, that type object has a __hash__ method when - introspected. PEP 252 also adds a dictionary to type objects - which contains all methods. At the Python level, this dictionary - is read-only for built-in types; at the C level, it is accessible - directly (but it should not be modified except as part of - initialization). +This PEP builds on PEP 252, which adds standard introspection to +types; for example, when a particular type object initializes the +``tp_hash`` slot, that type object has a ``__hash__`` method when +introspected. PEP 252 also adds a dictionary to type objects +which contains all methods. At the Python level, this dictionary +is read-only for built-in types; at the C level, it is accessible +directly (but it should not be modified except as part of +initialization). - For binary compatibility, a flag bit in the tp_flags slot - indicates the existence of the various new slots in the type - object introduced below. 
Types that don't have the - Py_TPFLAGS_HAVE_CLASS bit set in their tp_flags slot are assumed - to have NULL values for all the subtyping slots. (Warning: the - current implementation prototype is not yet consistent in its - checking of this flag bit. This should be fixed before the final - release.) +For binary compatibility, a flag bit in the tp_flags slot +indicates the existence of the various new slots in the type +object introduced below. Types that don't have the +``Py_TPFLAGS_HAVE_CLASS`` bit set in their ``tp_flags`` slot are assumed +to have NULL values for all the subtyping slots. (Warning: the +current implementation prototype is not yet consistent in its +checking of this flag bit. This should be fixed before the final +release.) - In current Python, a distinction is made between types and - classes. This PEP together with PEP 254 will remove that - distinction. However, for backwards compatibility the distinction - will probably remain for years to come, and without PEP 254, the - distinction is still large: types ultimately have a built-in type - as a base class, while classes ultimately derive from a - user-defined class. Therefore, in the rest of this PEP, I will - use the word type whenever I can -- including base type or - supertype, derived type or subtype, and metatype. However, - sometimes the terminology necessarily blends, for example an - object's type is given by its __class__ attribute, and subtyping - in Python is spelled with a class statement. If further - distinction is necessary, user-defined classes can be referred to - as "classic" classes. +In current Python, a distinction is made between types and +classes. This PEP together with PEP 254 will remove that +distinction. However, for backwards compatibility the distinction +will probably remain for years to come, and without PEP 254, the +distinction is still large: types ultimately have a built-in type +as a base class, while classes ultimately derive from a +user-defined class. 
Therefore, in the rest of this PEP, I will +use the word type whenever I can -- including base type or +supertype, derived type or subtype, and metatype. However, +sometimes the terminology necessarily blends, for example an +object's type is given by its ``__class__`` attribute, and subtyping +in Python is spelled with a class statement. If further +distinction is necessary, user-defined classes can be referred to +as "classic" classes. About metatypes +=============== - Inevitably the discussion comes to metatypes (or metaclasses). - Metatypes are nothing new in Python: Python has always been able - to talk about the type of a type: +Inevitably the discussion comes to metatypes (or metaclasses). +Metatypes are nothing new in Python: Python has always been able +to talk about the type of a type:: >>> a = 0 >>> type(a) @@ -113,841 +117,853 @@ About metatypes >>> - In this example, type(a) is a "regular" type, and type(type(a)) is - a metatype. While as distributed all types have the same metatype - (PyType_Type, which is also its own metatype), this is not a - requirement, and in fact a useful and relevant 3rd party extension - (ExtensionClasses by Jim Fulton) creates an additional metatype. - The type of classic classes, known as types.ClassType, can also be - considered a distinct metatype. +In this example, ``type(a)`` is a "regular" type, and ``type(type(a))`` is +a metatype. While as distributed all types have the same metatype +(``PyType_Type``, which is also its own metatype), this is not a +requirement, and in fact a useful and relevant 3rd party extension +(ExtensionClasses by Jim Fulton) creates an additional metatype. +The type of classic classes, known as ``types.ClassType``, can also be +considered a distinct metatype. - A feature closely connected to metatypes is the "Don Beaudry - hook", which says that if a metatype is callable, its instances - (which are regular types) can be subclassed (really subtyped) - using a Python class statement. 
I will use this rule to support - subtyping of built-in types, and in fact it greatly simplifies the - logic of class creation to always simply call the metatype. When - no base class is specified, a default metatype is called -- the - default metatype is the "ClassType" object, so the class statement - will behave as before in the normal case. (This default can be - changed per module by setting the global variable __metaclass__.) +A feature closely connected to metatypes is the "Don Beaudry +hook", which says that if a metatype is callable, its instances +(which are regular types) can be subclassed (really subtyped) +using a Python class statement. I will use this rule to support +subtyping of built-in types, and in fact it greatly simplifies the +logic of class creation to always simply call the metatype. When +no base class is specified, a default metatype is called -- the +default metatype is the "ClassType" object, so the class statement +will behave as before in the normal case. (This default can be +changed per module by setting the global variable ``__metaclass__``.) - Python uses the concept of metatypes or metaclasses in a different - way than Smalltalk. In Smalltalk-80, there is a hierarchy of - metaclasses that mirrors the hierarchy of regular classes, - metaclasses map 1-1 to classes (except for some funny business at - the root of the hierarchy), and each class statement creates both - a regular class and its metaclass, putting class methods in the - metaclass and instance methods in the regular class. +Python uses the concept of metatypes or metaclasses in a different +way than Smalltalk. In Smalltalk-80, there is a hierarchy of +metaclasses that mirrors the hierarchy of regular classes, +metaclasses map 1-1 to classes (except for some funny business at +the root of the hierarchy), and each class statement creates both +a regular class and its metaclass, putting class methods in the +metaclass and instance methods in the regular class. 
- Nice though this may be in the context of Smalltalk, it's not - compatible with the traditional use of metatypes in Python, and I - prefer to continue in the Python way. This means that Python - metatypes are typically written in C, and may be shared between - many regular types. (It will be possible to subtype metatypes in - Python, so it won't be absolutely necessary to write C to use - metatypes; but the power of Python metatypes will be limited. For - example, Python code will never be allowed to allocate raw memory - and initialize it at will.) +Nice though this may be in the context of Smalltalk, it's not +compatible with the traditional use of metatypes in Python, and I +prefer to continue in the Python way. This means that Python +metatypes are typically written in C, and may be shared between +many regular types. (It will be possible to subtype metatypes in +Python, so it won't be absolutely necessary to write C to use +metatypes; but the power of Python metatypes will be limited. For +example, Python code will never be allowed to allocate raw memory +and initialize it at will.) - Metatypes determine various *policies* for types, such as what - happens when a type is called, how dynamic types are (whether a - type's __dict__ can be modified after it is created), what the - method resolution order is, how instance attributes are looked - up, and so on. +Metatypes determine various **policies** for types, such as what +happens when a type is called, how dynamic types are (whether a +type's ``__dict__`` can be modified after it is created), what the +method resolution order is, how instance attributes are looked +up, and so on. - I'll argue that left-to-right depth-first is not the best - solution when you want to get the most use from multiple - inheritance. +I'll argue that left-to-right depth-first is not the best +solution when you want to get the most use from multiple +inheritance. 
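The metatype discussion above can be tried out at the Python level. What follows is a minimal sketch, not part of the PEP: it assumes a modern Python 3 interpreter (the PEP targets Python 2.2, where the declaration is spelled ``__metaclass__`` rather than the ``metaclass=`` keyword), and the names ``MyList``, ``describe``, ``CountingMeta`` and ``Point`` are hypothetical, introduced only for illustration.

```python
# Illustrative sketch in modern Python 3; the PEP itself targets Python 2.2.

a = 0
regular_type = type(a)          # int, a "regular" type
metatype = type(regular_type)   # type, the metatype of int

# The default metatype is its own metatype.
assert metatype is type
assert type(metatype) is metatype


def describe(self):
    """Hypothetical method placed in the namespace of the new type."""
    return "MyList of length %d" % len(self)


# A class statement is roughly equivalent to calling the metatype with
# (name, bases, namespace); this is the idea behind the Don Beaudry hook.
MyList = metatype("MyList", (list,), {"describe": describe})

# Types act as factories for their instances: calling the type creates one.
items = MyList([1, 2, 3])
assert isinstance(items, list)
assert type(items) is MyList
print(items.describe())


class CountingMeta(type):
    """Hypothetical metatype that sets a policy for what happens on a call."""

    def __call__(cls, *args, **kwds):
        # Count the call, then defer to the default creation logic.
        cls.instances_created = getattr(cls, "instances_created", 0) + 1
        return super().__call__(*args, **kwds)


class Point(metaclass=CountingMeta):
    def __init__(self, x, y):
        self.x, self.y = x, y


Point(1, 2)
Point(3, 4)
assert Point.instances_created == 2
```

Overriding ``__call__`` on the metatype is only one of the policies mentioned above; it is chosen here because its effect is easy to observe from Python code.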
- I'll argue that with multiple inheritance, the metatype of the - subtype must be a descendant of the metatypes of all base types. +I'll argue that with multiple inheritance, the metatype of the +subtype must be a descendant of the metatypes of all base types. - I'll come back to metatypes later. +I'll come back to metatypes later. Making a type a factory for its instances +========================================= - Traditionally, for each type there is at least one C factory - function that creates instances of the type (PyTuple_New(), - PyInt_FromLong() and so on). These factory functions take care of - both allocating memory for the object and initializing that - memory. As of Python 2.0, they also have to interface with the - garbage collection subsystem, if the type chooses to participate - in garbage collection (which is optional, but strongly recommended - for so-called "container" types: types that may contain references - to other objects, and hence may participate in reference cycles). +Traditionally, for each type there is at least one C factory +function that creates instances of the type (``PyTuple_New()``, +``PyInt_FromLong()`` and so on). These factory functions take care of +both allocating memory for the object and initializing that +memory. As of Python 2.0, they also have to interface with the +garbage collection subsystem, if the type chooses to participate +in garbage collection (which is optional, but strongly recommended +for so-called "container" types: types that may contain references +to other objects, and hence may participate in reference cycles). - In this proposal, type objects can be factory functions for their - instances, making the types directly callable from Python. This - mimics the way classes are instantiated. The C APIs for creating - instances of various built-in types will remain valid and in some - cases more efficient. Not all types will become their own factory - functions. 
+In this proposal, type objects can be factory functions for their +instances, making the types directly callable from Python. This +mimics the way classes are instantiated. The C APIs for creating +instances of various built-in types will remain valid and in some +cases more efficient. Not all types will become their own factory +functions. - The type object has a new slot, tp_new, which can act as a factory - for instances of the type. Types are now callable, because the - tp_call slot is set in PyType_Type (the metatype); the function - looks for the tp_new slot of the type that is being called. +The type object has a new slot, tp_new, which can act as a factory +for instances of the type. Types are now callable, because the +tp_call slot is set in ``PyType_Type`` (the metatype); the function +looks for the tp_new slot of the type that is being called. - Explanation: the tp_call slot of a regular type object (such as - PyInt_Type or PyList_Type) defines what happens when *instances* - of that type are called; in particular, the tp_call slot in the - function type, PyFunction_Type, is the key to making functions - callable. As another example, PyInt_Type.tp_call is NULL, because - integers are not callable. The new paradigm makes *type objects* - callable. Since type objects are instances of their metatype - (PyType_Type), the metatype's tp_call slot (PyType_Type.tp_call) - points to a function that is invoked when any type object is - called. Now, since each type has to do something different to - create an instance of itself, PyType_Type.tp_call immediately - defers to the tp_new slot of the type that is being called. - PyType_Type itself is also callable: its tp_new slot creates a new - type. This is used by the class statement (formalizing the Don - Beaudry hook, see above). And what makes PyType_Type callable? - The tp_call slot of *its* metatype -- but since it is its own - metatype, that is its own tp_call slot! 
+Explanation: the ``tp_call`` slot of a regular type object (such as +``PyInt_Type`` or ``PyList_Type``) defines what happens when **instances** +of that type are called; in particular, the ``tp_call`` slot in the +function type, ``PyFunction_Type``, is the key to making functions +callable. As another example, ``PyInt_Type.tp_call`` is ``NULL``, because +integers are not callable. The new paradigm makes **type objects** +callable. Since type objects are instances of their metatype +(``PyType_Type``), the metatype's ``tp_call`` slot (``PyType_Type.tp_call``) +points to a function that is invoked when any type object is +called. Now, since each type has to do something different to +create an instance of itself, ``PyType_Type.tp_call`` immediately +defers to the ``tp_new`` slot of the type that is being called. +``PyType_Type`` itself is also callable: its ``tp_new`` slot creates a new +type. This is used by the class statement (formalizing the Don +Beaudry hook, see above). And what makes ``PyType_Type`` callable? +The ``tp_call`` slot of **its** metatype -- but since it is its own +metatype, that is its own ``tp_call`` slot! - If the type's tp_new slot is NULL, an exception is raised. - Otherwise, the tp_new slot is called. The signature for the - tp_new slot is +If the type's ``tp_new`` slot is NULL, an exception is raised. +Otherwise, the tp_new slot is called. The signature for the +``tp_new`` slot is - PyObject *tp_new(PyTypeObject *type, - PyObject *args, - PyObject *kwds) +:: - where 'type' is the type whose tp_new slot is called, and 'args' - and 'kwds' are the sequential and keyword arguments to the call, - passed unchanged from tp_call. (The 'type' argument is used in - combination with inheritance, see below.) - - There are no constraints on the object type that is returned, - although by convention it should be an instance of the given - type. It is not necessary that a new object is returned; a - reference to an existing object is fine too. 
The return value - should always be a new reference, owned by the caller. - - Once the tp_new slot has returned an object, further initialization - is attempted by calling the tp_init() slot of the resulting - object's type, if not NULL. This has the following signature: - - int tp_init(PyObject *self, + PyObject *tp_new(PyTypeObject *type, PyObject *args, PyObject *kwds) - It corresponds more closely to the __init__() method of classic - classes, and in fact is mapped to that by the slot/special-method - correspondence rules. The difference in responsibilities between - the tp_new() slot and the tp_init() slot lies in the invariants - they ensure. The tp_new() slot should ensure only the most - essential invariants, without which the C code that implements the - objects would break. The tp_init() slot should be used for - overridable user-specific initializations. Take for example the - dictionary type. The implementation has an internal pointer to a - hash table which should never be NULL. This invariant is taken - care of by the tp_new() slot for dictionaries. The dictionary - tp_init() slot, on the other hand, could be used to give the - dictionary an initial set of keys and values based on the - arguments passed in. +where 'type' is the type whose ``tp_new`` slot is called, and 'args' +and 'kwds' are the sequential and keyword arguments to the call, +passed unchanged from tp_call. (The 'type' argument is used in +combination with inheritance, see below.) - Note that for immutable object types, the initialization cannot be - done by the tp_init() slot: this would provide the Python user - with a way to change the initialization. Therefore, immutable - objects typically have an empty tp_init() implementation and do - all their initialization in their tp_new() slot. +There are no constraints on the object type that is returned, +although by convention it should be an instance of the given +type. 
It is not necessary that a new object is returned; a +reference to an existing object is fine too. The return value +should always be a new reference, owned by the caller. - You may wonder why the tp_new() slot shouldn't call the tp_init() - slot itself. The reason is that in certain circumstances (like - support for persistent objects), it is important to be able to - create an object of a particular type without initializing it any - further than necessary. This may conveniently be done by calling - the tp_new() slot without calling tp_init(). It is also possible - that tp_init() is not called, or called more than once -- its - operation should be robust even in these anomalous cases. +Once the ``tp_new`` slot has returned an object, further initialization +is attempted by calling the ``tp_init()`` slot of the resulting +object's type, if not NULL. This has the following signature:: - For some objects, tp_new() may return an existing object. For - example, the factory function for integers caches the integers -1 - through 99. This is permissible only when the type argument to - tp_new() is the type that defined the tp_new() function (in the - example, if type == &PyInt_Type), and when the tp_init() slot for - this type does nothing. If the type argument differs, the - tp_new() call is initiated by a derived type's tp_new() to - create the object and initialize the base type portion of the - object; in this case tp_new() should always return a new object - (or raise an exception). + int tp_init(PyObject *self, + PyObject *args, + PyObject *kwds) - Both tp_new() and tp_init() should receive exactly the same 'args' - and 'kwds' arguments, and both should check that the arguments are - acceptable, because they may be called independently. +It corresponds more closely to the ``__init__()`` method of classic +classes, and in fact is mapped to that by the slot/special-method +correspondence rules. 
The difference in responsibilities between +the ``tp_new()`` slot and the ``tp_init()`` slot lies in the invariants +they ensure. The ``tp_new()`` slot should ensure only the most +essential invariants, without which the C code that implements the +objects would break. The ``tp_init()`` slot should be used for +overridable user-specific initializations. Take for example the +dictionary type. The implementation has an internal pointer to a +hash table which should never be NULL. This invariant is taken +care of by the ``tp_new()`` slot for dictionaries. The dictionary +``tp_init()`` slot, on the other hand, could be used to give the +dictionary an initial set of keys and values based on the +arguments passed in. - There's a third slot related to object creation: tp_alloc(). Its - responsibility is to allocate the memory for the object, - initialize the reference count (ob_refcnt) and the type pointer - (ob_type), and initialize the rest of the object to all zeros. It - should also register the object with the garbage collection - subsystem if the type supports garbage collection. This slot - exists so that derived types can override the memory allocation - policy (like which heap is being used) separately from the - initialization code. The signature is: +Note that for immutable object types, the initialization cannot be +done by the ``tp_init()`` slot: this would provide the Python user +with a way to change the initialization. Therefore, immutable +objects typically have an empty ``tp_init()`` implementation and do +all their initialization in their ``tp_new()`` slot. - PyObject *tp_alloc(PyTypeObject *type, int nitems) +You may wonder why the ``tp_new()`` slot shouldn't call the ``tp_init()`` +slot itself. The reason is that in certain circumstances (like +support for persistent objects), it is important to be able to +create an object of a particular type without initializing it any +further than necessary. 
This may conveniently be done by calling +the ``tp_new()`` slot without calling ``tp_init()``. It is also possible +that ``tp_init()`` is not called, or called more than once -- its +operation should be robust even in these anomalous cases. - The type argument is the type of the new object. The nitems - argument is normally zero, except for objects with a variable - allocation size (basically strings, tuples, and longs). The - allocation size is given by the following expression: +For some objects, ``tp_new()`` may return an existing object. For +example, the factory function for integers caches the integers -1 +through 99. This is permissible only when the type argument to +``tp_new()`` is the type that defined the ``tp_new()`` function (in the +example, if ``type == &PyInt_Type``), and when the ``tp_init()`` slot for +this type does nothing. If the type argument differs, the +``tp_new()`` call is initiated by a derived type's ``tp_new()`` to +create the object and initialize the base type portion of the +object; in this case ``tp_new()`` should always return a new object +(or raise an exception). - type->tp_basicsize + nitems * type->tp_itemsize +Both ``tp_new()`` and ``tp_init()`` should receive exactly the same 'args' +and 'kwds' arguments, and both should check that the arguments are +acceptable, because they may be called independently. - The tp_alloc slot is only used for subclassable types. The tp_new() - function of the base class must call the tp_alloc() slot of the - type passed in as its first argument. It is the tp_new() - function's responsibility to calculate the number of items. The - tp_alloc() slot will set the ob_size member of the new object if - the type->tp_itemsize member is nonzero. +There's a third slot related to object creation: ``tp_alloc()``. 
Its +responsibility is to allocate the memory for the object, +initialize the reference count (``ob_refcnt``) and the type pointer +(``ob_type``), and initialize the rest of the object to all zeros. It +should also register the object with the garbage collection +subsystem if the type supports garbage collection. This slot +exists so that derived types can override the memory allocation +policy (like which heap is being used) separately from the +initialization code. The signature is:: - (Note: in certain debugging compilation modes, the type structure - used to have members named tp_alloc and a tp_free slot already, - counters for the number of allocations and deallocations. These - are renamed to tp_allocs and tp_deallocs.) + PyObject *tp_alloc(PyTypeObject *type, int nitems) - Standard implementations for tp_alloc() and tp_new() are - available. PyType_GenericAlloc() allocates an object from the - standard heap and initializes it properly. It uses the above - formula to determine the amount of memory to allocate, and takes - care of GC registration. The only reason not to use this - implementation would be to allocate objects from a different heap - (as is done by some very small frequently used objects like ints - and tuples). PyType_GenericNew() adds very little: it just calls - the type's tp_alloc() slot with zero for nitems. But for mutable - types that do all their initialization in their tp_init() slot, - this may be just the ticket. +The type argument is the type of the new object. The nitems +argument is normally zero, except for objects with a variable +allocation size (basically strings, tuples, and longs). The +allocation size is given by the following expression:: + + type->tp_basicsize + nitems * type->tp_itemsize + +The ``tp_alloc`` slot is only used for subclassable types. The ``tp_new()`` +function of the base class must call the ``tp_alloc()`` slot of the +type passed in as its first argument. 
It is the ``tp_new()`` +function's responsibility to calculate the number of items. The +``tp_alloc()`` slot will set the ob_size member of the new object if +the ``type->tp_itemsize`` member is nonzero. + +(Note: in certain debugging compilation modes, the type structure +used to have members named ``tp_alloc`` and a ``tp_free`` slot already, +counters for the number of allocations and deallocations. These +are renamed to ``tp_allocs`` and ``tp_deallocs``.) + +Standard implementations for ``tp_alloc()`` and ``tp_new()`` are +available. ``PyType_GenericAlloc()`` allocates an object from the +standard heap and initializes it properly. It uses the above +formula to determine the amount of memory to allocate, and takes +care of GC registration. The only reason not to use this +implementation would be to allocate objects from a different heap +(as is done by some very small frequently used objects like ints +and tuples). ``PyType_GenericNew()`` adds very little: it just calls +the type's ``tp_alloc()`` slot with zero for nitems. But for mutable +types that do all their initialization in their ``tp_init()`` slot, +this may be just the ticket. Preparing a type for subtyping +============================== - The idea behind subtyping is very similar to that of single - inheritance in C++. A base type is described by a structure - declaration (similar to the C++ class declaration) plus a type - object (similar to the C++ vtable). A derived type can extend the - structure (but must leave the names, order and type of the members - of the base structure unchanged) and can override certain slots in - the type object, leaving others the same. (Unlike C++ vtables, - all Python type objects have the same memory layout.) +The idea behind subtyping is very similar to that of single +inheritance in C++. A base type is described by a structure +declaration (similar to the C++ class declaration) plus a type +object (similar to the C++ vtable). 
A derived type can extend the +structure (but must leave the names, order and type of the members +of the base structure unchanged) and can override certain slots in +the type object, leaving others the same. (Unlike C++ vtables, +all Python type objects have the same memory layout.) - The base type must do the following: +The base type must do the following: - - Add the flag value Py_TPFLAGS_BASETYPE to tp_flags. +- Add the flag value ``Py_TPFLAGS_BASETYPE`` to ``tp_flags``. - - Declare and use tp_new(), tp_alloc() and optional tp_init() - slots. +- Declare and use ``tp_new()``, ``tp_alloc()`` and optional ``tp_init()`` + slots. - - Declare and use tp_dealloc() and tp_free(). +- Declare and use ``tp_dealloc()`` and ``tp_free()``. - - Export its object structure declaration. +- Export its object structure declaration. - - Export a subtyping-aware type-checking macro. +- Export a subtyping-aware type-checking macro. - The requirements and signatures for tp_new(), tp_alloc() and - tp_init() have already been discussed above: tp_alloc() should - allocate the memory and initialize it to mostly zeros; tp_new() - should call the tp_alloc() slot and then proceed to do the - minimally required initialization; tp_init() should be used for - more extensive initialization of mutable objects. +The requirements and signatures for ``tp_new()``, ``tp_alloc()`` and +``tp_init()`` have already been discussed above: ``tp_alloc()`` should +allocate the memory and initialize it to mostly zeros; ``tp_new()`` +should call the ``tp_alloc()`` slot and then proceed to do the +minimally required initialization; ``tp_init()`` should be used for +more extensive initialization of mutable objects. - It should come as no surprise that there are similar conventions - at the end of an object's lifetime. The slots involved are - tp_dealloc() (familiar to all who have ever implemented a Python - extension type) and tp_free(), the new kid on the block. 
(The - names aren't quite symmetric; tp_free() corresponds to tp_alloc(), - which is fine, but tp_dealloc() corresponds to tp_new(). Maybe - the tp_dealloc slot should be renamed?) +It should come as no surprise that there are similar conventions +at the end of an object's lifetime. The slots involved are +``tp_dealloc()`` (familiar to all who have ever implemented a Python +extension type) and ``tp_free()``, the new kid on the block. (The +names aren't quite symmetric; ``tp_free()`` corresponds to ``tp_alloc()``, +which is fine, but ``tp_dealloc()`` corresponds to ``tp_new()``. Maybe +the tp_dealloc slot should be renamed?) - The tp_free() slot should be used to free the memory and - unregister the object with the garbage collection subsystem, and - can be overridden by a derived class; tp_dealloc() should - deinitialize the object (usually by calling Py_XDECREF() for - various sub-objects) and then call tp_free() to deallocate the - memory. The signature for tp_dealloc() is the same as it always - was: +The ``tp_free()`` slot should be used to free the memory and +unregister the object with the garbage collection subsystem, and +can be overridden by a derived class; ``tp_dealloc()`` should +deinitialize the object (usually by calling ``Py_XDECREF()`` for +various sub-objects) and then call ``tp_free()`` to deallocate the +memory. The signature for ``tp_dealloc()`` is the same as it always +was:: - void tp_dealloc(PyObject *object) + void tp_dealloc(PyObject *object) - The signature for tp_free() is the same: +The signature for tp_free() is the same:: - void tp_free(PyObject *object) + void tp_free(PyObject *object) - (In a previous version of this PEP, there was also a role reserved - for the tp_clear() slot. This turned out to be a bad idea.) +(In a previous version of this PEP, there was also a role reserved +for the ``tp_clear()`` slot. This turned out to be a bad idea.) 
- To be usefully subtyped in C, a type must export the structure - declaration for its instances through a header file, as it is - needed to derive a subtype. The type object for the base type - must also be exported. +To be usefully subtyped in C, a type must export the structure +declaration for its instances through a header file, as it is +needed to derive a subtype. The type object for the base type +must also be exported. - If the base type has a type-checking macro (like PyDict_Check()), - this macro should be made to recognize subtypes. This can be done - by using the new PyObject_TypeCheck(object, type) macro, which - calls a function that follows the base class links. +If the base type has a type-checking macro (like ``PyDict_Check()``), +this macro should be made to recognize subtypes. This can be done +by using the new ``PyObject_TypeCheck(object, type)`` macro, which +calls a function that follows the base class links. - The PyObject_TypeCheck() macro contains a slight optimization: it - first compares object->ob_type directly to the type argument, and - if this is a match, bypasses the function call. This should make - it fast enough for most situations. +The ``PyObject_TypeCheck()`` macro contains a slight optimization: it +first compares ``object->ob_type`` directly to the type argument, and +if this is a match, bypasses the function call. This should make +it fast enough for most situations. - Note that this change in the type-checking macro means that C - functions that require an instance of the base type may be invoked - with instances of the derived type. Before enabling subtyping of - a particular type, its code should be checked to make sure that - this won't break anything. 
It has proved useful in the prototype - to add another type-checking macro for the built-in Python object - types, to check for exact type match too (for example, - PyDict_Check(x) is true if x is an instance of dictionary or of a - dictionary subclass, while PyDict_CheckExact(x) is true only if x - is a dictionary). +Note that this change in the type-checking macro means that C +functions that require an instance of the base type may be invoked +with instances of the derived type. Before enabling subtyping of +a particular type, its code should be checked to make sure that +this won't break anything. It has proved useful in the prototype +to add another type-checking macro for the built-in Python object +types, to check for exact type match too (for example, +``PyDict_Check(x)`` is true if x is an instance of dictionary or of a +dictionary subclass, while ``PyDict_CheckExact(x)`` is true only if x +is a dictionary). Creating a subtype of a built-in type in C +========================================== - The simplest form of subtyping is subtyping in C. It is the - simplest form because we can require the C code to be aware of - some of the problems, and it's acceptable for C code that doesn't - follow the rules to dump core. For added simplicity, it is - limited to single inheritance. +The simplest form of subtyping is subtyping in C. It is the +simplest form because we can require the C code to be aware of +some of the problems, and it's acceptable for C code that doesn't +follow the rules to dump core. For added simplicity, it is +limited to single inheritance. - Let's assume we're deriving from a mutable base type whose - tp_itemsize is zero. The subtype code is not GC-aware, although - it may inherit GC-awareness from the base type (this is - automatic). The base type's allocation uses the standard heap. +Let's assume we're deriving from a mutable base type whose +tp_itemsize is zero. 
The subtype code is not GC-aware, although +it may inherit GC-awareness from the base type (this is +automatic). The base type's allocation uses the standard heap. - The derived type begins by declaring a type structure which - contains the base type's structure. For example, here's the type - structure for a subtype of the built-in list type: +The derived type begins by declaring a type structure which +contains the base type's structure. For example, here's the type +structure for a subtype of the built-in list type:: typedef struct { PyListObject list; int state; } spamlistobject; - Note that the base type structure member (here PyListObject) must - be the first member of the structure; any following members are - additions. Also note that the base type is not referenced via a - pointer; the actual contents of its structure must be included! - (The goal is for the memory layout of the beginning of the - subtype instance to be the same as that of the base type - instance.) +Note that the base type structure member (here ``PyListObject``) must +be the first member of the structure; any following members are +additions. Also note that the base type is not referenced via a +pointer; the actual contents of its structure must be included! +(The goal is for the memory layout of the beginning of the +subtype instance to be the same as that of the base type +instance.) - Next, the derived type must declare a type object and initialize - it. Most of the slots in the type object may be initialized to - zero, which is a signal that the base type slot must be copied - into it. Some slots that must be initialized properly: +Next, the derived type must declare a type object and initialize +it. Most of the slots in the type object may be initialized to +zero, which is a signal that the base type slot must be copied +into it. Some slots that must be initialized properly: - - The object header must be filled in as usual; the type should - be &PyType_Type. 
+- The object header must be filled in as usual; the type should + be ``&PyType_Type``. - - The tp_basicsize slot must be set to the size of the subtype - instance struct (in the above example: - sizeof(spamlistobject)). +- The tp_basicsize slot must be set to the size of the subtype + instance struct (in the above example: ``sizeof(spamlistobject)``). - - The tp_base slot must be set to the address of the base type's - type object. +- The tp_base slot must be set to the address of the base type's + type object. - - If the derived slot defines any pointer members, the - tp_dealloc slot function requires special attention, see - below; otherwise, it can be set to zero, to inherit the base - type's deallocation function. +- If the derived slot defines any pointer members, the + ``tp_dealloc`` slot function requires special attention, see + below; otherwise, it can be set to zero, to inherit the base + type's deallocation function. - - The tp_flags slot must be set to the usual Py_TPFLAGS_DEFAULT - value. +- The ``tp_flags`` slot must be set to the usual ``Py_TPFLAGS_DEFAULT`` + value. - - The tp_name slot must be set; it is recommended to set tp_doc - as well (these are not inherited). +- The ``tp_name`` slot must be set; it is recommended to set ``tp_doc`` + as well (these are not inherited). - If the subtype defines no additional structure members (it only - defines new behavior, no new data), the tp_basicsize and the - tp_dealloc slots may be left set to zero. +If the subtype defines no additional structure members (it only +defines new behavior, no new data), the ``tp_basicsize`` and the +``tp_dealloc`` slots may be left set to zero. - The subtype's tp_dealloc slot deserves special attention. If the - derived type defines no additional pointer members that need to be - DECREF'ed or freed when the object is deallocated, it can be set - to zero. 
Otherwise, the subtype's tp_dealloc() function must call - Py_XDECREF() for any PyObject * members and the correct memory - freeing function for any other pointers it owns, and then call the - base class's tp_dealloc() slot. This call has to be made via the - base type's type structure, for example, when deriving from the - standard list type: +The subtype's ``tp_dealloc`` slot deserves special attention. If the +derived type defines no additional pointer members that need to be +DECREF'ed or freed when the object is deallocated, it can be set +to zero. Otherwise, the subtype's ``tp_dealloc()`` function must call +``Py_XDECREF()`` for any ``PyObject *`` members and the correct memory +freeing function for any other pointers it owns, and then call the +base class's ``tp_dealloc()`` slot. This call has to be made via the +base type's type structure, for example, when deriving from the +standard list type:: - PyList_Type.tp_dealloc(self); + PyList_Type.tp_dealloc(self); - If the subtype wants to use a different allocation heap than the - base type, the subtype must override both the tp_alloc() and the - tp_free() slots. These will be called by the base class's - tp_new() and tp_dealloc() slots, respectively. +If the subtype wants to use a different allocation heap than the +base type, the subtype must override both the ``tp_alloc()`` and the +``tp_free()`` slots. These will be called by the base class's +``tp_new()`` and ``tp_dealloc()`` slots, respectively. - To complete the initialization of the type, PyType_InitDict() must - be called. This replaces slots initialized to zero in the subtype - with the value of the corresponding base type slots. (It also - fills in tp_dict, the type's dictionary, and does various other - initializations necessary for type objects.) +To complete the initialization of the type, ``PyType_InitDict()`` must +be called. This replaces slots initialized to zero in the subtype +with the value of the corresponding base type slots. 
(It also +fills in ``tp_dict``, the type's dictionary, and does various other +initializations necessary for type objects.) - A subtype is not usable until PyType_InitDict() is called for it; - this is best done during module initialization, assuming the - subtype belongs to a module. An alternative for subtypes added to - the Python core (which don't live in a particular module) would be - to initialize the subtype in their constructor function. It is - allowed to call PyType_InitDict() more than once; the second and - further calls have no effect. To avoid unnecessary calls, a test - for tp_dict==NULL can be made. +A subtype is not usable until ``PyType_InitDict()`` is called for it; +this is best done during module initialization, assuming the +subtype belongs to a module. An alternative for subtypes added to +the Python core (which don't live in a particular module) would be +to initialize the subtype in their constructor function. It is +allowed to call ``PyType_InitDict()`` more than once; the second and +further calls have no effect. To avoid unnecessary calls, a test +for ``tp_dict==NULL`` can be made. - (During initialization of the Python interpreter, some types are - actually used before they are initialized. As long as the slots - that are actually needed are initialized, especially tp_dealloc, - this works, but it is fragile and not recommended as a general - practice.) +(During initialization of the Python interpreter, some types are +actually used before they are initialized. As long as the slots +that are actually needed are initialized, especially ``tp_dealloc``, +this works, but it is fragile and not recommended as a general +practice.) - To create a subtype instance, the subtype's tp_new() slot is - called. This should first call the base type's tp_new() slot and - then initialize the subtype's additional data members. To further - initialize the instance, the tp_init() slot is typically called. 
- Note that the tp_new() slot should *not* call the tp_init() slot; - this is up to tp_new()'s caller (typically a factory function). - There are circumstances where it is appropriate not to call - tp_init(). +To create a subtype instance, the subtype's ``tp_new()`` slot is +called. This should first call the base type's ``tp_new()`` slot and +then initialize the subtype's additional data members. To further +initialize the instance, the ``tp_init()`` slot is typically called. +Note that the ``tp_new()`` slot should **not** call the ``tp_init()`` slot; +this is up to ``tp_new()``'s caller (typically a factory function). +There are circumstances where it is appropriate not to call +``tp_init()``. - If a subtype defines a tp_init() slot, the tp_init() slot should - normally first call the base type's tp_init() slot. +If a subtype defines a ``tp_init()`` slot, the ``tp_init()`` slot should +normally first call the base type's ``tp_init()`` slot. - (XXX There should be a paragraph or two about argument passing - here.) +(XXX There should be a paragraph or two about argument passing +here.) Subtyping in Python +=================== - The next step is to allow subtyping of selected built-in types - through a class statement in Python. Limiting ourselves to single - inheritance for now, here is what happens for a simple class - statement: +The next step is to allow subtyping of selected built-in types +through a class statement in Python. Limiting ourselves to single +inheritance for now, here is what happens for a simple class +statement:: class C(B): var1 = 1 def method1(self): pass # etc. - The body of the class statement is executed in a fresh environment - (basically, a new dictionary used as local namespace), and then C - is created. The following explains how C is created. +The body of the class statement is executed in a fresh environment +(basically, a new dictionary used as local namespace), and then C +is created. The following explains how C is created. 
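As a minimal sketch of the step just described, the following runs a class body in a fresh dictionary and then hands the result to the metatype. It assumes modern Python 3 syntax (the PEP itself targets Python 2.2), uses ``exec`` on a hypothetical source string to stand in for the compiler's handling of the class suite, and takes ``type(B)`` as the metatype. It illustrates the idea only and is not the interpreter's actual code path.

```python
# Sketch only: emulate "execute the class body, then create C".
# The names class_body and namespace are illustrative assumptions.

class B(object):
    pass


# Source text standing in for the suite of "class C(B): ...".
class_body = (
    "var1 = 1\n"
    "def method1(self):\n"
    "    return 'method1 called'\n"
)

# Step 1: execute the body in a fresh environment, i.e. a new
# dictionary used as the local namespace.
namespace = {}
exec(class_body, {}, namespace)

# Step 2: create C by calling the metatype of the base with the
# name, the tuple of bases, and the namespace dictionary.
M = type(B)
C = M("C", (B,), namespace)
```

After this runs, ``C`` behaves like the class that the equivalent class statement would have produced: ``C.var1`` is ``1`` and ``C().method1()`` returns the string from the body.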
- Assume B is a type object. Since type objects are objects, and - every object has a type, B has a type. Since B is itself a type, - we also call its type its metatype. B's metatype is accessible - via type(B) or B.__class__ (the latter notation is new for types; - it is introduced in PEP 252). Let's say this metatype is M (for - Metatype). The class statement will create a new type, C. Since - C will be a type object just like B, we view the creation of C as - an instantiation of the metatype, M. The information that needs - to be provided for the creation of a subclass is: +Assume B is a type object. Since type objects are objects, and +every object has a type, B has a type. Since B is itself a type, +we also call its type its metatype. B's metatype is accessible +via ``type(B)`` or ``B.__class__`` (the latter notation is new for types; +it is introduced in PEP 252). Let's say this metatype is M (for +Metatype). The class statement will create a new type, C. Since +C will be a type object just like B, we view the creation of C as +an instantiation of the metatype, M. The information that needs +to be provided for the creation of a subclass is: - - its name (in this example the string "C"); +- its name (in this example the string "C"); - - its bases (a singleton tuple containing B); +- its bases (a singleton tuple containing B); - - the results of executing the class body, in the form of a - dictionary (for example {"var1": 1, "method1": , ...}). +- the results of executing the class body, in the form of a + dictionary (for example + ``{"var1": 1, "method1": , ...}``). - The class statement will result in the following call: +The class statement will result in the following call:: - C = M("C", (B,), dict) + C = M("C", (B,), dict) - where dict is the dictionary resulting from execution of the - class body. In other words, the metatype (M) is called. +where dict is the dictionary resulting from execution of the +class body. 
In other words, the metatype (M) is called. - Note that even though the example has only one base, we still pass - in a (singleton) sequence of bases; this makes the interface - uniform with the multiple-inheritance case. +Note that even though the example has only one base, we still pass +in a (singleton) sequence of bases; this makes the interface +uniform with the multiple-inheritance case. - In current Python, this is called the "Don Beaudry hook" after its - inventor; it is an exceptional case that is only invoked when a - base class is not a regular class. For a regular base class (or - when no base class is specified), current Python calls - PyClass_New(), the C level factory function for classes, directly. +In current Python, this is called the "Don Beaudry hook" after its +inventor; it is an exceptional case that is only invoked when a +base class is not a regular class. For a regular base class (or +when no base class is specified), current Python calls +``PyClass_New()``, the C level factory function for classes, directly. - Under the new system this is changed so that Python *always* - determines a metatype and calls it as given above. When one or - more bases are given, the type of the first base is used as the - metatype; when no base is given, a default metatype is chosen. By - setting the default metatype to PyClass_Type, the metatype of - "classic" classes, the classic behavior of the class statement is - retained. This default can be changed per module by setting the - global variable __metaclass__. +Under the new system this is changed so that Python **always** +determines a metatype and calls it as given above. When one or +more bases are given, the type of the first base is used as the +metatype; when no base is given, a default metatype is chosen. By +setting the default metatype to ``PyClass_Type``, the metatype of +"classic" classes, the classic behavior of the class statement is +retained. 
This default can be changed per module by setting the +global variable ``__metaclass__``. - There are two further refinements here. First, a useful feature - is to be able to specify a metatype directly. If the class - suite defines a variable __metaclass__, that is the metatype - to call. (Note that setting __metaclass__ at the module level - only affects class statements without a base class and without an - explicit __metaclass__ declaration; but setting __metaclass__ in a - class suite overrides the default metatype unconditionally.) +There are two further refinements here. First, a useful feature +is to be able to specify a metatype directly. If the class +suite defines a variable ``__metaclass__``, that is the metatype +to call. (Note that setting ``__metaclass__`` at the module level +only affects class statements without a base class and without an +explicit ``__metaclass__`` declaration; but setting ``__metaclass__`` in a +class suite overrides the default metatype unconditionally.) - Second, with multiple bases, not all bases need to have the same - metatype. This is called a metaclass conflict [1]. Some - metaclass conflicts can be resolved by searching through the set - of bases for a metatype that derives from all other given - metatypes. If such a metatype cannot be found, an exception is - raised and the class statement fails. +Second, with multiple bases, not all bases need to have the same +metatype. This is called a metaclass conflict [1]_. Some +metaclass conflicts can be resolved by searching through the set +of bases for a metatype that derives from all other given +metatypes. If such a metatype cannot be found, an exception is +raised and the class statement fails. - This conflict resolution can be implemented by the metatype - constructors: the class statement just calls the metatype of the first - base (or that specified by the __metaclass__ variable), and this - metatype's constructor looks for the most derived metatype. 
If - that is itself, it proceeds; otherwise, it calls that metatype's - constructor. (Ultimate flexibility: another metatype might choose - to require that all bases have the same metatype, or that there's - only one base class, or whatever.) +This conflict resolution can be implemented by the metatype +constructors: the class statement just calls the metatype of the first +base (or that specified by the ``__metaclass__`` variable), and this +metatype's constructor looks for the most derived metatype. If +that is itself, it proceeds; otherwise, it calls that metatype's +constructor. (Ultimate flexibility: another metatype might choose +to require that all bases have the same metatype, or that there's +only one base class, or whatever.) - (In [1], a new metaclass is automatically derived that is a - subclass of all given metaclasses. But since it is questionable - in Python how conflicting method definitions of the various - metaclasses should be merged, I don't think this is feasible. - Should the need arise, the user can derive such a metaclass - manually and specify it using the __metaclass__ variable. It is - also possible to have a new metaclass that does this.) +(In [1]_, a new metaclass is automatically derived that is a +subclass of all given metaclasses. But since it is questionable +in Python how conflicting method definitions of the various +metaclasses should be merged, I don't think this is feasible. +Should the need arise, the user can derive such a metaclass +manually and specify it using the ``__metaclass__`` variable. It is +also possible to have a new metaclass that does this.) - Note that calling M requires that M itself has a type: the - meta-metatype. And the meta-metatype has a type, the - meta-meta-metatype. And so on. This is normally cut short at - some level by making a metatype be its own metatype. This is - indeed what happens in Python: the ob_type reference in - PyType_Type is set to &PyType_Type. 
In the absence of third party - metatypes, PyType_Type is the only metatype in the Python - interpreter. +Note that calling M requires that M itself has a type: the +meta-metatype. And the meta-metatype has a type, the +meta-meta-metatype. And so on. This is normally cut short at +some level by making a metatype be its own metatype. This is +indeed what happens in Python: the ``ob_type`` reference in +``PyType_Type`` is set to ``&PyType_Type``. In the absence of third party +metatypes, ``PyType_Type`` is the only metatype in the Python +interpreter. - (In a previous version of this PEP, there was one additional - meta-level, and there was a meta-metatype called "turtle". This - turned out to be unnecessary.) +(In a previous version of this PEP, there was one additional +meta-level, and there was a meta-metatype called "turtle". This +turned out to be unnecessary.) - In any case, the work for creating C is done by M's tp_new() slot. - It allocates space for an "extended" type structure, containing: - the type object; the auxiliary structures (as_sequence etc.); the - string object containing the type name (to ensure that this object - isn't deallocated while the type object is still referencing it); and - some auxiliary storage (to be described later). It initializes this - storage to zeros except for a few crucial slots (for example, tp_name - is set to point to the type name) and then sets the tp_base slot to - point to B. Then PyType_InitDict() is called to inherit B's slots. - Finally, C's tp_dict slot is updated with the contents of the - namespace dictionary (the third argument to the call to M). +In any case, the work for creating C is done by M's ``tp_new()`` slot. 
+It allocates space for an "extended" type structure, containing: +the type object; the auxiliary structures (as_sequence etc.); the +string object containing the type name (to ensure that this object +isn't deallocated while the type object is still referencing it); and +some auxiliary storage (to be described later). It initializes this +storage to zeros except for a few crucial slots (for example, tp_name +is set to point to the type name) and then sets the tp_base slot to +point to B. Then ``PyType_InitDict()`` is called to inherit B's slots. +Finally, C's ``tp_dict`` slot is updated with the contents of the +namespace dictionary (the third argument to the call to M). Multiple inheritance +==================== - The Python class statement supports multiple inheritance, and we - will also support multiple inheritance involving built-in types. +The Python class statement supports multiple inheritance, and we +will also support multiple inheritance involving built-in types. - However, there are some restrictions. The C runtime architecture - doesn't make it feasible to have a meaningful subtype of two - different built-in types except in a few degenerate cases. - Changing the C runtime to support fully general multiple - inheritance would be too much of an upheaval of the code base. +However, there are some restrictions. The C runtime architecture +doesn't make it feasible to have a meaningful subtype of two +different built-in types except in a few degenerate cases. +Changing the C runtime to support fully general multiple +inheritance would be too much of an upheaval of the code base. - The main problem with multiple inheritance from different built-in - types stems from the fact that the C implementation of built-in - types accesses structure members directly; the C compiler - generates an offset relative to the object pointer and that's - that. 
For example, the list and dictionary type structures each - declare a number of different but overlapping structure members. - A C function accessing an object expecting a list won't work when - passed a dictionary, and vice versa, and there's not much we could - do about this without rewriting all code that accesses lists and - dictionaries. This would be too much work, so we won't do this. +The main problem with multiple inheritance from different built-in +types stems from the fact that the C implementation of built-in +types accesses structure members directly; the C compiler +generates an offset relative to the object pointer and that's +that. For example, the list and dictionary type structures each +declare a number of different but overlapping structure members. +A C function accessing an object expecting a list won't work when +passed a dictionary, and vice versa, and there's not much we could +do about this without rewriting all code that accesses lists and +dictionaries. This would be too much work, so we won't do this. - The problem with multiple inheritance is caused by conflicting - structure member allocations. Classes defined in Python normally - don't store their instance variables in structure members: they - are stored in an instance dictionary. This is the key to a - partial solution. Suppose we have the following two classes: +The problem with multiple inheritance is caused by conflicting +structure member allocations. Classes defined in Python normally +don't store their instance variables in structure members: they +are stored in an instance dictionary. This is the key to a +partial solution. Suppose we have the following two classes:: - class A(dictionary): - def foo(self): pass + class A(dictionary): + def foo(self): pass - class B(dictionary): - def bar(self): pass + class B(dictionary): + def bar(self): pass - class C(A, B): pass + class C(A, B): pass - (Here, 'dictionary' is the type of built-in dictionary objects, - a.k.a. 
type({}) or {}.__class__ or types.DictType.) If we look at - the structure layout, we find that an A instance has the layout - of a dictionary followed by the __dict__ pointer, and a B instance - has the same layout; since there are no structure member layout - conflicts, this is okay. +(Here, 'dictionary' is the type of built-in dictionary objects, +a.k.a. ``type({})`` or ``{}.__class__`` or ``types.DictType``.) If we look at +the structure layout, we find that an A instance has the layout +of a dictionary followed by the ``__dict__`` pointer, and a B instance +has the same layout; since there are no structure member layout +conflicts, this is okay. - Here's another example: +Here's another example:: - class X(object): - def foo(self): pass + class X(object): + def foo(self): pass - class Y(dictionary): - def bar(self): pass + class Y(dictionary): + def bar(self): pass - class Z(X, Y): pass + class Z(X, Y): pass - (Here, 'object' is the base for all built-in types; its structure - layout only contains the ob_refcnt and ob_type members.) This - example is more complicated, because the __dict__ pointer for X - instances has a different offset than that for Y instances. Where - is the __dict__ pointer for Z instances? The answer is that the - offset for the __dict__ pointer is not hardcoded, it is stored in - the type object. +(Here, 'object' is the base for all built-in types; its structure +layout only contains the ``ob_refcnt`` and ``ob_type`` members.) This +example is more complicated, because the ``__dict__`` pointer for X +instances has a different offset than that for Y instances. Where +is the ``__dict__`` pointer for Z instances? The answer is that the +offset for the ``__dict__`` pointer is not hardcoded, it is stored in +the type object. - Suppose on a particular machine an 'object' structure is 8 bytes - long, and a 'dictionary' struct is 60 bytes, and an object pointer - is 4 bytes. 
Then an X structure is 12 bytes (an object structure - followed by a __dict__ pointer), and a Y structure is 64 bytes (a - dictionary structure followed by a __dict__ pointer). The Z - structure has the same layout as the Y structure in this example. - Each type object (X, Y and Z) has a "__dict__ offset" which is - used to find the __dict__ pointer. Thus, the recipe for looking - up an instance variable is: +Suppose on a particular machine an 'object' structure is 8 bytes +long, and a 'dictionary' struct is 60 bytes, and an object pointer +is 4 bytes. Then an X structure is 12 bytes (an object structure +followed by a ``__dict__`` pointer), and a Y structure is 64 bytes (a +dictionary structure followed by a ``__dict__`` pointer). The Z +structure has the same layout as the Y structure in this example. +Each type object (X, Y and Z) has a "__dict__ offset" which is +used to find the ``__dict__`` pointer. Thus, the recipe for looking +up an instance variable is: - 1. get the type of the instance - 2. get the __dict__ offset from the type object - 3. add the __dict__ offset to the instance pointer - 4. look in the resulting address to find a dictionary reference - 5. look up the instance variable name in that dictionary +1. get the type of the instance +2. get the ``__dict__`` offset from the type object +3. add the ``__dict__`` offset to the instance pointer +4. look in the resulting address to find a dictionary reference +5. look up the instance variable name in that dictionary - Of course, this recipe can only be implemented in C, and I have - left out some details. But this allows us to use multiple - inheritance patterns similar to the ones we can use with classic - classes. +Of course, this recipe can only be implemented in C, and I have +left out some details. But this allows us to use multiple +inheritance patterns similar to the ones we can use with classic +classes. 
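The five-step recipe above can be sketched with a toy model. This is not how CPython lays out memory: ``ToyType``, ``ToyInstance`` and ``lookup_instance_variable`` are hypothetical names invented for the illustration, the instance "memory" is a plain mapping from byte offset to value, and the sizes are the example figures from the text (8-byte object, 60-byte dictionary, 4-byte pointer).

```python
# Toy model of the five-step recipe. ToyType, ToyInstance and the byte
# offsets are illustrative assumptions, not CPython's real structures.

class ToyType:
    def __init__(self, name, basicsize, dictoffset):
        self.name = name
        self.basicsize = basicsize    # instance size in bytes
        self.dictoffset = dictoffset  # byte offset of the __dict__ pointer


class ToyInstance:
    def __init__(self, toy_type):
        self.ob_type = toy_type
        # "memory" maps a byte offset within the instance to the value
        # stored there; only the __dict__ pointer is modeled.
        self.memory = {toy_type.dictoffset: {}}


def lookup_instance_variable(instance, name):
    toy_type = instance.ob_type           # 1. get the type of the instance
    offset = toy_type.dictoffset          # 2. get the __dict__ offset
    address = 0 + offset                  # 3. add it to the instance pointer (0 here)
    namespace = instance.memory[address]  # 4. find the dictionary reference
    return namespace[name]                # 5. look up the variable name


# Sizes from the text: object = 8 bytes, dictionary = 60, pointer = 4.
X = ToyType("X", basicsize=8 + 4, dictoffset=8)
Y = ToyType("Y", basicsize=60 + 4, dictoffset=60)
Z = ToyType("Z", basicsize=60 + 4, dictoffset=60)  # same layout as Y

x, z = ToyInstance(X), ToyInstance(Z)
x.memory[X.dictoffset]["color"] = "red"
z.memory[Z.dictoffset]["color"] = "blue"
```

The point of the model is that ``lookup_instance_variable`` never hardcodes an offset: the same function serves X instances (offset 8) and Z instances (offset 60) because the offset is read from the type object each time.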
- XXX I should write up the complete algorithm here to determine - base class compatibility, but I can't be bothered right now. Look - at best_base() in typeobject.c in the implementation mentioned - below. +XXX I should write up the complete algorithm here to determine +base class compatibility, but I can't be bothered right now. Look +at ``best_base()`` in typeobject.c in the implementation mentioned +below. MRO: Method resolution order (the lookup rule) +=============================================== - With multiple inheritance comes the question of method resolution - order: the order in which a class or type and its bases are - searched looking for a method of a given name. +With multiple inheritance comes the question of method resolution +order: the order in which a class or type and its bases are +searched looking for a method of a given name. - In classic Python, the rule is given by the following recursive - function, also known as the left-to-right depth-first rule: +In classic Python, the rule is given by the following recursive +function, also known as the left-to-right depth-first rule:: - def classic_lookup(cls, name): - if cls.__dict__.has_key(name): - return cls.__dict__[name] - for base in cls.__bases__: - try: - return classic_lookup(base, name) - except AttributeError: - pass - raise AttributeError, name + def classic_lookup(cls, name): + if cls.__dict__.has_key(name): + return cls.__dict__[name] + for base in cls.__bases__: + try: + return classic_lookup(base, name) + except AttributeError: + pass + raise AttributeError, name - The problem with this becomes apparent when we consider a "diamond - diagram": +The problem with this becomes apparent when we consider a "diamond +diagram":: - class A: - ^ ^ def save(self): ... - / \ - / \ - / \ - / \ - class B class C: - ^ ^ def save(self): ... - \ / - \ / - \ / - \ / - class D + class A: + ^ ^ def save(self): ... + / \ + / \ + / \ + / \ + class B class C: + ^ ^ def save(self): ... 
+ \ / + \ / + \ / + \ / + class D - Arrows point from a subtype to its base type(s). This particular - diagram means B and C derive from A, and D derives from B and C - (and hence also, indirectly, from A). +Arrows point from a subtype to its base type(s). This particular +diagram means B and C derive from A, and D derives from B and C +(and hence also, indirectly, from A). - Assume that C overrides the method save(), which is defined in the - base A. (C.save() probably calls A.save() and then saves some of - its own state.) B and D don't override save(). When we invoke - save() on a D instance, which method is called? According to the - classic lookup rule, A.save() is called, ignoring C.save()! +Assume that C overrides the method ``save()``, which is defined in the +base A. (``C.save()`` probably calls ``A.save()`` and then saves some of +its own state.) B and D don't override ``save()``. When we invoke +``save()`` on a D instance, which method is called? According to the +classic lookup rule, ``A.save()`` is called, ignoring ``C.save()``! - This is not good. It probably breaks C (its state doesn't get - saved), defeating the whole purpose of inheriting from C in the - first place. +This is not good. It probably breaks C (its state doesn't get +saved), defeating the whole purpose of inheriting from C in the +first place. - Why was this not a problem in classic Python? Diamond diagrams - are rarely found in classic Python class hierarchies. Most class - hierarchies use single inheritance, and multiple inheritance is - usually confined to mix-in classes. In fact, the problem shown - here is probably the reason why multiple inheritance is unpopular - in classic Python. +Why was this not a problem in classic Python? Diamond diagrams +are rarely found in classic Python class hierarchies. Most class +hierarchies use single inheritance, and multiple inheritance is +usually confined to mix-in classes.
In fact, the problem shown +here is probably the reason why multiple inheritance is unpopular +in classic Python. - Why will this be a problem in the new system? The 'object' type - at the top of the type hierarchy defines a number of methods that - can usefully be extended by subtypes, for example __getattr__(). +Why will this be a problem in the new system? The 'object' type +at the top of the type hierarchy defines a number of methods that +can usefully be extended by subtypes, for example ``__getattr__()``. - (Aside: in classic Python, the __getattr__() method is not really - the implementation for the get-attribute operation; it is a hook - that only gets invoked when an attribute cannot be found by normal - means. This has often been cited as a shortcoming -- some class - designs have a legitimate need for a __getattr__() method that - gets called for *all* attribute references. But then of course - this method has to be able to invoke the default implementation - directly. The most natural way is to make the default - implementation available as object.__getattr__(self, name).) +(Aside: in classic Python, the ``__getattr__()`` method is not really +the implementation for the get-attribute operation; it is a hook +that only gets invoked when an attribute cannot be found by normal +means. This has often been cited as a shortcoming -- some class +designs have a legitimate need for a ``__getattr__()`` method that +gets called for **all** attribute references. But then of course +this method has to be able to invoke the default implementation +directly. The most natural way is to make the default +implementation available as ``object.__getattr__(self, name)``.) - Thus, a classic class hierarchy like this: +Thus, a classic class hierarchy like this:: - class B class C: - ^ ^ def __getattr__(self, name): ... - \ / - \ / - \ / - \ / - class D + class B class C: + ^ ^ def __getattr__(self, name): ... 
+ \ / + \ / + \ / + \ / + class D - will change into a diamond diagram under the new system: +will change into a diamond diagram under the new system:: - object: - ^ ^ __getattr__() - / \ - / \ - / \ - / \ - class B class C: - ^ ^ def __getattr__(self, name): ... - \ / - \ / - \ / - \ / - class D + object: + ^ ^ __getattr__() + / \ + / \ + / \ + / \ + class B class C: + ^ ^ def __getattr__(self, name): ... + \ / + \ / + \ / + \ / + class D - and while in the original diagram C.__getattr__() is invoked, - under the new system with the classic lookup rule, - object.__getattr__() would be invoked! +and while in the original diagram ``C.__getattr__()`` is invoked, +under the new system with the classic lookup rule, +``object.__getattr__()`` would be invoked! - Fortunately, there's a lookup rule that's better. It's a bit - difficult to explain, but it does the right thing in the diamond - diagram, and it is the same as the classic lookup rule when there - are no diamonds in the inheritance graph (when it is a tree). +Fortunately, there's a lookup rule that's better. It's a bit +difficult to explain, but it does the right thing in the diamond +diagram, and it is the same as the classic lookup rule when there +are no diamonds in the inheritance graph (when it is a tree). - The new lookup rule constructs a list of all classes in the - inheritance diagram in the order in which they will be searched. - This construction is done at class definition time to save time. - To explain the new lookup rule, let's first consider what such a - list would look like for the classic lookup rule. Note that in - the presence of diamonds the classic lookup visits some classes - multiple times. For example, in the ABCD diamond diagram above, - the classic lookup rule visits the classes in this order: +The new lookup rule constructs a list of all classes in the +inheritance diagram in the order in which they will be searched. +This construction is done at class definition time to save time. 
+To explain the new lookup rule, let's first consider what such a +list would look like for the classic lookup rule. Note that in +the presence of diamonds the classic lookup visits some classes +multiple times. For example, in the ABCD diamond diagram above, +the classic lookup rule visits the classes in this order:: - D, B, A, C, A + D, B, A, C, A - Note how A occurs twice in the list. The second occurrence is - redundant, since anything that could be found there would already - have been found when searching the first occurrence. +Note how A occurs twice in the list. The second occurrence is +redundant, since anything that could be found there would already +have been found when searching the first occurrence. - We use this observation to explain our new lookup rule. Using the - classic lookup rule, construct the list of classes that would be - searched, including duplicates. Now for each class that occurs in - the list multiple times, remove all occurrences except for the - last. The resulting list contains each ancestor class exactly - once (including the most derived class, D in the example). +We use this observation to explain our new lookup rule. Using the +classic lookup rule, construct the list of classes that would be +searched, including duplicates. Now for each class that occurs in +the list multiple times, remove all occurrences except for the +last. The resulting list contains each ancestor class exactly +once (including the most derived class, D in the example). - Searching for methods in this order will do the right thing for - the diamond diagram. Because of the way the list is constructed, - it does not change the search order in situations where no diamond - is involved. +Searching for methods in this order will do the right thing for +the diamond diagram. Because of the way the list is constructed, +it does not change the search order in situations where no diamond +is involved. - Isn't this backwards incompatible? Won't it break existing code? 
- It would, if we changed the method resolution order for all - classes. However, in Python 2.2, the new lookup rule will only be - applied to types derived from built-in types, which is a new - feature. Class statements without a base class create "classic - classes", and so do class statements whose base classes are - themselves classic classes. For classic classes the classic - lookup rule will be used. (To experiment with the new lookup rule - for classic classes, you will be able to specify a different - metaclass explicitly.) We'll also provide a tool that analyzes a - class hierarchy looking for methods that would be affected by a - change in method resolution order. +Isn't this backwards incompatible? Won't it break existing code? +It would, if we changed the method resolution order for all +classes. However, in Python 2.2, the new lookup rule will only be +applied to types derived from built-in types, which is a new +feature. Class statements without a base class create "classic +classes", and so do class statements whose base classes are +themselves classic classes. For classic classes the classic +lookup rule will be used. (To experiment with the new lookup rule +for classic classes, you will be able to specify a different +metaclass explicitly.) We'll also provide a tool that analyzes a +class hierarchy looking for methods that would be affected by a +change in method resolution order. - XXX Another way to explain the motivation for the new MRO, due to - Damian Conway: you never use the method defined in a base class if - it is defined in a derived class that you haven't explored yet - (using the old search order). +XXX Another way to explain the motivation for the new MRO, due to +Damian Conway: you never use the method defined in a base class if +it is defined in a derived class that you haven't explored yet +(using the old search order). 
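The construction of the new lookup order described above (build the classic left-to-right depth-first list with duplicates, then keep only the last occurrence of each class) can be sketched as follows. This is a minimal illustration, not the implementation in typeobject.c: the inheritance graph is modelled as a plain dictionary mapping a class name to its list of bases, and ``classic_order``, ``new_order`` and ``first_definer`` are hypothetical helper names chosen for this example. It implements the rule as described in this PEP; later Python releases compute the order with a different algorithm.

```python
def classic_order(cls, bases):
    """Left-to-right depth-first visit order, duplicates included."""
    order = [cls]
    for base in bases[cls]:
        order.extend(classic_order(base, bases))
    return order


def new_order(cls, bases):
    """Classic order with all but the last occurrence of each class removed."""
    visited = classic_order(cls, bases)
    result = []
    for index, name in enumerate(visited):
        # Keep this entry only if the class does not occur again later.
        if name not in visited[index + 1:]:
            result.append(name)
    return result


def first_definer(order, methods, method_name):
    """Return the first class in 'order' that defines 'method_name'."""
    for name in order:
        if method_name in methods.get(name, ()):
            return name
    raise AttributeError(method_name)


# The ABCD diamond from the text: B and C derive from A, D from B and C.
diamond = {"A": [], "B": ["A"], "C": ["A"], "D": ["B", "C"]}
# Only A and C define save(), as in the example.
methods = {"A": {"save"}, "C": {"save"}}

print(classic_order("D", diamond))  # ['D', 'B', 'A', 'C', 'A']
print(new_order("D", diamond))      # ['D', 'B', 'C', 'A']
print(first_definer(classic_order("D", diamond), methods, "save"))  # A
print(first_definer(new_order("D", diamond), methods, "save"))      # C
```

With the classic order, ``save()`` on a D instance resolves to A and skips C; with the new order it resolves to C. For a graph without diamonds the two functions return the same list, since no class is visited twice.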
XXX To be done +============== - Additional topics to be discussed in this PEP: +Additional topics to be discussed in this PEP: - - backwards compatibility issues!!! +- backwards compatibility issues!!! - - class methods and static methods +- class methods and static methods - - cooperative methods and super() +- cooperative methods and ``super()`` - - mapping between type object slots (tp_foo) and special methods - (__foo__) (actually, this may belong in PEP 252) +- mapping between type object slots (tp_foo) and special methods + (``__foo__``) (actually, this may belong in PEP 252) - - built-in names for built-in types (object, int, str, list etc.) +- built-in names for built-in types (object, int, str, list etc.) - - __dict__ and __dictoffset__ +- ``__dict__`` and ``__dictoffset__`` - - __slots__ +- ``__slots__`` - - the HEAPTYPE flag bit +- the ``HEAPTYPE`` flag bit - - GC support +- GC support - - API docs for all the new functions +- API docs for all the new functions - - how to use __new__ +- how to use ``__new__`` - - writing metaclasses (using mro() etc.) +- writing metaclasses (using ``mro()`` etc.) - - high level user overview +- high level user overview - - open issues: +open issues +----------- - - do we need __del__? +- do we need ``__del__``? - - assignment to __dict__, __bases__ +- assignment to ``__dict__``, ``__bases__`` - - inconsistent naming - (e.g. tp_dealloc/tp_new/tp_init/tp_alloc/tp_free) +- inconsistent naming + (e.g. tp_dealloc/tp_new/tp_init/tp_alloc/tp_free) - - add builtin alias 'dict' for 'dictionary'? +- add builtin alias 'dict' for 'dictionary'? - - when subclasses of dict/list etc. are passed to system - functions, the __getitem__ overrides (etc.) aren't always - used +- when subclasses of dict/list etc. are passed to system + functions, the ``__getitem__`` overrides (etc.) 
aren't always + used Implementation +============== - A prototype implementation of this PEP (and for PEP 252) is - available from CVS, and in the series of Python 2.2 alpha and beta - releases. For some examples of the features described here, see - the file Lib/test/test_descr.py and the extension module - Modules/xxsubtype.c. +A prototype implementation of this PEP (and for PEP 252) is +available from CVS, and in the series of Python 2.2 alpha and beta +releases. For some examples of the features described here, see +the file Lib/test/test_descr.py and the extension module +Modules/xxsubtype.c. References +========== - [1] "Putting Metaclasses to Work", by Ira R. Forman and Scott - H. Danforth, Addison-Wesley 1999. - (http://www.aw.com/product/0,2627,0201433052,00.html) +.. [1] "Putting Metaclasses to Work", by Ira R. Forman and Scott + H. Danforth, Addison-Wesley 1999. + (http://www.aw.com/product/0,2627,0201433052,00.html) Copyright +========= - This document has been placed in the public domain. +This document has been placed in the public domain. - -Local Variables: -mode: indented-text -indent-tabs-mode: nil -End: +.. + Local Variables: + mode: indented-text + indent-tabs-mode: nil + End: