PEP: 253
Title: Subtyping Built-in Types
Version: $Revision$
Author: guido@python.org (Guido van Rossum)
Status: Draft
Type: Standards Track
Python-Version: 2.2
Created: 14-May-2001
Post-History:

Abstract

    This PEP proposes additions to the type object API that will allow
    the creation of subtypes of built-in types, in C and in Python.


Introduction

    Traditionally, types in Python have been created statically, by
    declaring a global variable of type PyTypeObject and initializing
    it with a static initializer.  The slots in the type object
    describe all aspects of a Python type that are relevant to the
    Python interpreter.  A few slots contain dimensional information
    (like the basic allocation size of instances), others contain
    various flags, but most slots are pointers to functions to
    implement various kinds of behaviors.  A NULL pointer means that
    the type does not implement the specific behavior; in that case
    the system may provide a default behavior or raise an exception
    when the behavior is invoked.  Some collections of function
    pointers that are usually defined together are obtained
    indirectly via a pointer to an additional structure containing
    more function pointers.

    While the details of initializing a PyTypeObject structure haven't
    been documented as such, they are easily gleaned from the examples
    in the source code, and I am assuming that the reader is
    sufficiently familiar with the traditional way of creating new
    Python types in C.

    This PEP will introduce the following features:

      - a type can be a factory function for its instances

      - types can be subtyped in C

      - types can be subtyped in Python with the class statement

      - multiple inheritance from types is supported (insofar as
        practical -- you still can't multiply inherit from list and
        dictionary)

      - the standard coercion functions (int, tuple, str etc.) will
        be redefined to be the corresponding type objects, which serve
        as their own factory functions

      - a class statement can contain a __metaclass__ declaration,
        specifying the metaclass to be used to create the new class

      - a class statement can contain a __slots__ declaration,
        specifying the specific names of the instance variables
        supported

    This PEP builds on PEP 252, which adds standard introspection to
    types; for example, when a particular type object initializes the
    tp_hash slot, that type object has a __hash__ method when
    introspected.  PEP 252 also adds a dictionary to type objects
    which contains all methods.  At the Python level, this dictionary
    is read-only for built-in types; at the C level, it is accessible
    directly (but it should not be modified except as part of
    initialization).

    For binary compatibility, a flag bit in the tp_flags slot
    indicates the existence of the various new slots in the type
    object introduced below.  Types that don't have the
    Py_TPFLAGS_HAVE_CLASS bit set in their tp_flags slot are assumed
    to have NULL values for all the subtyping slots.  (Warning: the
    current implementation prototype is not yet consistent in its
    checking of this flag bit.  This should be fixed before the final
    release.)

    In current Python, a distinction is made between types and
    classes.  This PEP together with PEP 254 will remove that
    distinction.  However, for backwards compatibility the distinction
    will probably remain for years to come, and without PEP 254, the
    distinction is still large: types ultimately have a built-in type
    as a base class, while classes ultimately derive from a
    user-defined class.  Therefore, in the rest of this PEP, I will
    use the word type whenever I can -- including base type or
    supertype, derived type or subtype, and metatype.  However,
    sometimes the terminology necessarily blends, for example an
    object's type is given by its __class__ attribute, and subtyping
    in Python is spelled with a class statement.  If further
    distinction is necessary, user-defined classes can be referred to
    as "classic" classes.


About metatypes

    Inevitably the discussion comes to metatypes (or metaclasses).
    Metatypes are nothing new in Python: Python has always been able
    to talk about the type of a type:

    >>> a = 0
    >>> type(a)
    <type 'int'>
    >>> type(type(a))
    <type 'type'>
    >>> type(type(type(a)))
    <type 'type'>
    >>> 

    In this example, type(a) is a "regular" type, and type(type(a)) is
    a metatype.  While as distributed all types have the same metatype
    (PyType_Type, which is also its own metatype), this is not a
    requirement, and in fact a useful and relevant 3rd party extension
    (ExtensionClasses by Jim Fulton) creates an additional metatype.
    The type of classic classes, known as types.ClassType, can also be
    considered a distinct metatype.

    A feature closely connected to metatypes is the "Don Beaudry
    hook", which says that if a metatype is callable, its instances
    (which are regular types) can be subclassed (really subtyped)
    using a Python class statement.  I will use this rule to support
    subtyping of built-in types, and in fact it greatly simplifies the
    logic of class creation to always simply call the metatype.  When
    no base class is specified, a default metatype is called -- the
    default metatype is the "ClassType" object, so the class statement
    will behave as before in the normal case.  (This default can be
    changed per module by setting the global variable __metaclass__.)

    Python uses the concept of metatypes or metaclasses in a different
    way than Smalltalk.  In Smalltalk-80, there is a hierarchy of
    metaclasses that mirrors the hierarchy of regular classes,
    metaclasses map 1-1 to classes (except for some funny business at
    the root of the hierarchy), and each class statement creates both
    a regular class and its metaclass, putting class methods in the
    metaclass and instance methods in the regular class.

    Nice though this may be in the context of Smalltalk, it's not
    compatible with the traditional use of metatypes in Python, and I
    prefer to continue in the Python way.  This means that Python
    metatypes are typically written in C, and may be shared between
    many regular types. (It will be possible to subtype metatypes in
    Python, so it won't be absolutely necessary to write C in order to
    use metatypes; but the power of Python metatypes will be limited.
    For example, Python code will never be allowed to allocate raw
    memory and initialize it at will.)

    Metatypes determine various *policies* for types, such as what
    happens when a type is called, how dynamic types are (whether a
    type's __dict__ can be modified after it is created), what the
    method resolution order is, how instance attributes are looked
    up, and so on.

    I'll argue that left-to-right depth-first is not the best
    solution when you want to get the most use from multiple
    inheritance.

    I'll argue that with multiple inheritance, the metatype of the
    subtype must be a descendant of the metatypes of all base types.

    I'll come back to metatypes later.


Making a type a factory for its instances

    Traditionally, for each type there is at least one C factory
    function that creates instances of the type (PyTuple_New(),
    PyInt_FromLong() and so on).  These factory functions take care of
    both allocating memory for the object and initializing that
    memory.  As of Python 2.0, they also have to interface with the
    garbage collection subsystem, if the type chooses to participate
    in garbage collection (which is optional, but strongly recommended
    for so-called "container" types: types that may contain arbitrary
    references to other objects, and hence may participate in
    reference cycles).

    In this proposal, type objects can be factory functions for their
    instances, making the types directly callable from Python.  This
    mimics the way classes are instantiated.  Of course, the C APIs
    for creating instances of various built-in types will remain valid
    and probably the most common; and not all types will become their
    own factory functions.

    The type object has a new slot, tp_new, which can act as a factory
    for instances of the type.  Types are made callable by providing a
    tp_call slot in PyType_Type (the metatype); the slot
    implementation function looks for the tp_new slot of the type that
    is being called.

    (Confusion alert: the tp_call slot of a regular type object (such
    as PyInt_Type or PyList_Type) defines what happens when
    *instances* of that type are called; in particular, the tp_call
    slot in the function type, PyFunction_Type, is the key to making
    functions callable.  As another example, PyInt_Type.tp_call is
    NULL, because integers are not callable.  The new paradigm makes
    *type objects* callable.  Since type objects are instances of
    their metatype (PyType_Type), the metatype's tp_call slot
    (PyType_Type.tp_call) points to a function that is invoked when
    any type object is called.  Now, since each type has to do
    something different to create an instance of itself,
    PyType_Type.tp_call immediately defers to the tp_new slot of the
    type that is being called.  To add to the confusion, PyType_Type
    itself is also callable: its tp_new slot creates a new type.  This
    is used by the class statement (via the Don Beaudry hook, see
    above).  And what makes PyType_Type callable?  The tp_call slot of
    *its* metatype -- but since it is its own metatype, that is its
    own tp_call slot!)

    If the type's tp_new slot is NULL, an exception is raised.
    Otherwise, the tp_new slot is called.  The signature for the
    tp_new slot is

        PyObject *tp_new(PyTypeObject *type,
                         PyObject *args,
                         PyObject *kwds)

    where 'type' is the type whose tp_new slot is called, and 'args'
    and 'kwds' are the sequential and keyword arguments to the call,
    passed unchanged from tp_call.  (The 'type' argument is used in
    combination with inheritance, see below.)

    There are no constraints on the object type that is returned,
    although by convention it should be an instance of the given
    type.  It is not necessary that a new object is returned; a
    reference to an existing object is fine too.  The return value
    should always be a new reference, owned by the caller.

    Once the tp_new slot has returned an object, further initialization
    is attempted by calling the tp_init() slot of the resulting
    object's type, if not NULL.  This has the following signature:

        PyObject *tp_init(PyObject *self,
                          PyObject *args,
                          PyObject *kwds)

    It corresponds more closely to the __init__() method of classic
    classes, and in fact is mapped to that by the slot/special-method
    correspondence rules.  The difference in responsibilities between
    the tp_new() slot and the tp_init() slot lies in the invariants
    they ensure.  The tp_new() slot should ensure only the most
    essential invariants, without which the C code that implements the
    objects would break.  The tp_init() slot should be used for
    overridable user-specific initializations.  Take for example the
    dictionary type.  The implementation has an internal pointer to a
    hash table which should never be NULL.  This invariant is taken
    care of by the tp_new() slot for dictionaries.  The dictionary
    tp_init() slot, on the other hand, could be used to give the
    dictionary an initial set of keys and values based on the
    arguments passed in.
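
    The call sequence described above (the metatype's tp_call
    deferring to tp_new, followed by tp_init) can be modelled roughly
    in present-day Python, where __new__ and __init__ are the
    special-method spellings of the two slots.  This is only a sketch:
    call_type() and Account are hypothetical names, and the
    isinstance() guard reflects how later Python releases behave, not
    necessarily this draft.

```python
def call_type(cls, *args, **kwds):
    """Model of calling a type: the factory slot first, then the initializer."""
    # Step 1: the factory creates (or returns) the object.
    obj = cls.__new__(cls, *args, **kwds)
    # Step 2: further initialization via the resulting object's type.
    # Later Python releases only do this when obj is an instance of cls.
    if isinstance(obj, cls):
        type(obj).__init__(obj, *args, **kwds)
    return obj


class Account:
    def __new__(cls, owner, balance=0):
        self = super().__new__(cls)
        self.history = []          # essential invariant: always present
        return self

    def __init__(self, owner, balance=0):
        self.owner = owner         # overridable, user-level setup
        self.balance = balance
        self.history.append(("open", balance))


acct = call_type(Account, "guido", balance=10)
```

    Both functions receive the same arguments, which matches the
    intent stated for tp_new() and tp_init() in this section.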

    Note that for immutable object types, the initialization cannot be
    done by the tp_init() slot: this would provide the Python user
    with a way to change the initialization.  Therefore, immutable
    objects typically have an empty tp_init() implementation and do
    all their initialization in their tp_new() slot.
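
    As an illustration of the immutable case, here is a small sketch
    in present-day Python of a float subtype that fixes its value in
    __new__ (the counterpart of tp_new).  The Celsius class and its
    range check are made up for this example.

```python
class Celsius(float):
    """Immutable value: all initialization happens in __new__."""

    def __new__(cls, degrees):
        # The value must be fixed at creation time; float offers no way
        # to change it afterwards, so __init__ would come too late.
        if degrees < -273.15:
            raise ValueError("below absolute zero")
        return super().__new__(cls, degrees)


t = Celsius(21.5)
```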

    You may wonder why the tp_new() slot shouldn't call the tp_init()
    slot itself.  The reason is that in certain circumstances (like
    support for persistent objects), it is important to be able to
    create an object of a particular type without initializing it any
    further than necessary.  This may conveniently be done by calling
    the tp_new() slot without calling tp_init().  It is also possible
    that tp_init() is not called, or called more than once -- its
    operation should be robust even in these anomalous cases.
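
    A minimal sketch of the persistence use case in present-day
    Python: the object is created through __new__ alone and its state
    is restored afterwards, so __init__ never runs.  The Connection
    class and the saved_state dictionary are hypothetical.

```python
class Connection:
    def __init__(self, host):
        self.host = host
        self.opened = True      # stands in for an expensive side effect


# Create an instance without running __init__, then restore saved state.
saved_state = {"host": "example.invalid", "opened": False}
conn = Connection.__new__(Connection)
conn.__dict__.update(saved_state)
```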

    For some objects, tp_new() may return an existing object.  For
    example, the factory function for integers caches the integers -1
    through 99.  This is permissible only when the type argument to
    tp_new() is the type that defined the tp_new() function (in the
    example, if type == &PyInt_Type), and when the tp_init() slot for
    this type does nothing.  If the type argument differs, the
    tp_new() call is initiated by a derived type's tp_new() to
    create the object and initialize the base type portion of the
    object; in this case tp_new() should always return a new object
    (or raise an exception).
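
    The rule about returning existing objects can be sketched in
    present-day Python as follows.  SmallInt and TaggedInt are toy
    classes invented for the example (the real integer cache lives in
    C); the point is only that the cache is consulted when the exact
    defining type is requested and bypassed for a derived type.

```python
class SmallInt:
    """Toy value type that shares instances for -1 through 99."""
    _cache = {}

    def __new__(cls, value):
        # Sharing is only safe when the exact defining type was requested.
        cacheable = cls is SmallInt and -1 <= value <= 99
        if cacheable and value in SmallInt._cache:
            return SmallInt._cache[value]
        self = super().__new__(cls)
        self.value = value
        if cacheable:
            SmallInt._cache[value] = self
        return self


class TaggedInt(SmallInt):
    """Derived type: must always receive a fresh object."""


shared_a = SmallInt(5)
shared_b = SmallInt(5)
fresh_a = TaggedInt(5)
fresh_b = TaggedInt(5)
```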

    There's a third slot related to object creation: tp_alloc().  Its
    responsibility is to allocate the memory for the object,
    initialize the reference count (ob_refcnt) and the type pointer
    (ob_type), and initialize the rest of the object to all zeros.  It
    should also register the object with the garbage collection
    subsystem if the type supports garbage collection.  This slot
    exists so that derived types can override the memory allocation
    policy (like which heap is being used) separately from the
    initialization code.  The signature is:

        PyObject *tp_alloc(PyTypeObject *type, int nitems)

    The type argument is the type of the new object.  The nitems
    argument is normally zero, except for objects with a variable
    allocation size (basically strings, tuples, and longs).  The
    allocation size is given by the following expression:

        type->tp_basicsize  +  nitems * type->tp_itemsize

    This slot is only used for subclassable types.  The tp_new()
    function of the base class must call the tp_alloc() slot of the
    type passed in as its first argument.  It is the tp_new()
    function's responsibility to calculate the number of items.  The
    tp_alloc() slot will set the ob_size member of the new object if
    the type->tp_itemsize member is nonzero.
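
    The allocation size expression is simple enough to check from
    Python.  The sketch below assumes a present-day CPython, where
    type objects expose the two sizes as __basicsize__ and
    __itemsize__; allocation_size() is a hypothetical helper, not part
    of any API.

```python
def allocation_size(basicsize, itemsize, nitems):
    """Bytes requested for an instance: basicsize + nitems * itemsize."""
    return basicsize + nitems * itemsize


# Type objects expose both sizes to Python code.
three_tuple = allocation_size(tuple.__basicsize__, tuple.__itemsize__, 3)
plain_object = allocation_size(object.__basicsize__, object.__itemsize__, 0)
```

    The actual numbers depend on the platform and Python version, so
    none are quoted here.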

    (Note: in certain debugging compilation modes, the type structure
    used to have members named tp_alloc and tp_free already, which
    were counters for the number of allocations and deallocations.
    These are renamed to tp_allocs and tp_deallocs.)

    XXX The keyword arguments are currently not passed to tp_new();
    its kwds argument is always NULL.  This is a relic from a previous
    revision and should probably be fixed.  Both tp_new() and
    tp_init() should receive exactly the same arguments, and both
    should check that the arguments are acceptable, because they may
    be called independently.

    Standard implementations for tp_alloc() and tp_new() are
    available.  PyType_GenericAlloc() allocates an object from the
    standard heap and initializes it properly.  It uses the above
    formula to determine the amount of memory to allocate, and takes
    care of GC registration.  The only reason not to use this
    implementation would be to allocate objects from a different heap
    (as is done by some very small frequently used objects like ints
    and tuples).  PyType_GenericNew() adds very little: it just calls
    the type's tp_alloc() slot with zero for nitems.  But for mutable
    types that do all their initialization in their tp_init() slot,
    this may be just the ticket.


Preparing a type for subtyping

    The idea behind subtyping is very similar to that of single
    inheritance in C++.  A base type is described by a structure
    declaration (similar to the C++ class declaration) plus a type
    object (similar to the C++ vtable).  A derived type can extend the
    structure (but must leave the names, order and type of the members
    of the base structure unchanged) and can override certain slots in
    the type object, leaving others the same.  (Unlike C++ vtables,
    all Python type objects have the same memory lay-out.)

    The base type must do the following:

      - Add the flag value Py_TPFLAGS_BASETYPE to tp_flags.

      - Declare and use tp_new(), tp_alloc() and optional tp_init()
        slots.

      - Declare and use tp_dealloc() and tp_free().

      - Export its object structure declaration.

      - Export a subtyping-aware type-checking macro.

    The requirements and signatures for tp_new(), tp_alloc() and
    tp_init() have already been discussed above: tp_alloc() should
    allocate the memory and initialize it to mostly zeros; tp_new()
    should call the tp_alloc() slot and then proceed to do the
    minimally required initialization; tp_init() should be used for
    more extensive initialization of mutable objects.

    It should come as no surprise that there are similar conventions
    at the end of an object's lifetime.  The slots involved are
    tp_dealloc() (familiar to all who have ever implemented a Python
    extension type) and tp_free(), the new kid on the block.  (The
    names aren't quite symmetric; tp_free() corresponds to tp_alloc(),
    which is fine, but tp_dealloc() corresponds to tp_new().  Maybe
    the tp_dealloc slot should be renamed?)

    The tp_free() slot should be used to free the memory and
    unregister the object with the garbage collection subsystem, and
    can be overridden by a derived class; tp_dealloc() should
    deinitialize the object (usually by calling Py_XDECREF() for
    various sub-objects) and then call tp_free() to deallocate the
    memory.  The signature for tp_dealloc() is the same as it always
    was:

        void tp_dealloc(PyObject *object)

    The signature for tp_free() is the same:

        void tp_free(PyObject *object)

    (In a previous version of this PEP, there was also a role reserved
    for the tp_clear() slot.  This turned out to be a bad idea.)

    In order to be usefully subtyped in C, a type must export the
    structure declaration for its instances through a header file, as
    it is needed in order to derive a subtype.  The type object for
    the base type must also be exported.

    If the base type has a type-checking macro (like PyDict_Check()),
    this macro should be made to recognize subtypes.  This can be done
    by using the new PyObject_TypeCheck(object, type) macro, which
    calls a function that follows the base class links.

    The PyObject_TypeCheck() macro contains a slight optimization: it
    first compares object->ob_type directly to the type argument, and
    if this is a match, bypasses the function call.  This should make
    it fast enough for most situations.
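
    The logic of the type check (exact match first, then a walk along
    the base links) can be sketched in Python.  type_check() is a
    hypothetical stand-in for the C macro; it follows only the single
    __base__ chain, which is an assumption that suits the
    single-inheritance case discussed here.

```python
def type_check(obj, tp):
    """Return True if obj is an instance of tp or of a subtype of tp."""
    t = type(obj)
    if t is tp:                 # fast path: exact match, no walk needed
        return True
    while t is not None:        # slow path: follow the base links
        if t is tp:
            return True
        t = t.__base__          # None once we step past 'object'
    return False


class SpamList(list):
    pass
```

    With this check, type_check(SpamList(), list) is true, which is
    exactly why C functions written for lists may now be handed
    subtype instances.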

    Note that this change in the type-checking macro means that C
    functions that require an instance of the base type may be invoked
    with instances of the derived type.  Before enabling subtyping of
    a particular type, its code should be checked to make sure that
    this won't break anything.


Creating a subtype of a built-in type in C

    The simplest form of subtyping is subtyping in C.  It is the
    simplest form because we can require the C code to be aware of
    some of the problems, and it's acceptable for C code that doesn't
    follow the rules to dump core.  For added simplicity, it is
    limited to single inheritance.

    Let's assume we're deriving from a mutable base type whose
    tp_itemsize is zero.  The subtype code is not GC-aware, although
    it may inherit GC-awareness from the base type (this is
    automatic).  The base type's allocation uses the standard heap.

    The derived type begins by declaring a type structure which
    contains the base type's structure.  For example, here's the type
    structure for a subtype of the built-in list type:

    typedef struct {
        PyListObject list;
        int state;
    } spamlistobject;

    Note that the base type structure member (here PyListObject) must
    be the first member of the structure; any following members are
    additions.  Also note that the base type is not referenced via a
    pointer; the actual contents of its structure must be included!
    (The goal is for the memory lay-out of the beginning of the
    subtype instance to be the same as that of the base type
    instance.)
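
    The lay-out requirement can be checked from Python with the ctypes
    module, used here purely as a stand-in for C struct declarations.
    BaseObject and SpamObject are hypothetical structures, not the
    real PyListObject; the sketch only shows that an embedded first
    member sits at offset zero, so a pointer to the subtype instance
    can be read as a pointer to the base.

```python
import ctypes


class BaseObject(ctypes.Structure):
    # Hypothetical stand-in for the base type's instance struct.
    _fields_ = [("refcnt", ctypes.c_ssize_t),
                ("items", ctypes.c_void_p)]


class SpamObject(ctypes.Structure):
    # The base struct is embedded by value as the first member.
    _fields_ = [("base", BaseObject),
                ("state", ctypes.c_int)]


spam = SpamObject()
spam.base.refcnt = 1
spam.state = 42

# Reinterpret a pointer to the subtype instance as a pointer to the base.
as_base = ctypes.cast(ctypes.pointer(spam),
                      ctypes.POINTER(BaseObject)).contents
```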

    Next, the derived type must declare a type object and initialize
    it.  Most of the slots in the type object may be initialized to
    zero, which is a signal that the base type slot must be copied
    into it.  Some slots that must be initialized properly:

      - The object header must be filled in as usual; the type should
        be &PyType_Type.

      - The tp_basicsize slot must be set to the size of the subtype
        instance struct (in the above example:
        sizeof(spamlistobject)).

      - The tp_base slot must be set to the address of the base type's
        type object.

      - If the derived type defines any pointer members, the
        tp_dealloc slot function requires special attention, see
        below; otherwise, it can be set to zero, to inherit the base
        type's deallocation function.

      - The tp_flags slot must be set to the usual Py_TPFLAGS_DEFAULT
        value.

      - The tp_name slot must be set; it is recommended to set tp_doc
        as well (these are not inherited).

    If the subtype defines no additional structure members (it only
    defines new behavior, no new data), the tp_basicsize and the
    tp_dealloc slots may be left set to zero.

    The subtype's tp_dealloc slot deserves special attention.  If the
    derived type defines no additional pointer members that need to be
    DECREF'ed or freed when the object is deallocated, it can be set
    to zero.  Otherwise, the subtype's tp_dealloc() function must call
    Py_XDECREF() for any PyObject * members and the correct memory
    freeing function for any other pointers it owns, and then call the
    base class's tp_dealloc() slot.  This call has to be made via the
    base type's type structure, for example, when deriving from the
    standard list type:

        PyList_Type.tp_dealloc(self);

    If the subtype wants to use a different allocation heap than the
    base type, the subtype must override both the tp_alloc() and the
    tp_free() slots.  These will be called by the base class's
    tp_new() and tp_dealloc() slots, respectively.

    In order to complete the initialization of the type,
    PyType_InitDict() must be called.  This replaces slots initialized
    to zero in the subtype with the value of the corresponding base
    type slots.  (It also fills in tp_dict, the type's dictionary, and
    does various other initializations necessary for type objects.)

    A subtype is not usable until PyType_InitDict() is called for it;
    this is best done during module initialization, assuming the
    subtype belongs to a module.  An alternative for subtypes added to
    the Python core (which don't live in a particular module) would be
    to initialize the subtype in their constructor function.  It is
    allowed to call PyType_InitDict() more than once; the second and
    further calls have no effect.  In order to avoid unnecessary
    calls, a test for tp_dict==NULL can be made.
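
    As a toy model of the slot inheritance step, the following Python
    sketch represents each type object as a plain dictionary of slots,
    with None playing the role of a zero slot.  init_type() and the
    slot values are invented for illustration; the real function does
    considerably more.

```python
def init_type(subtype, base):
    """Toy model: copy base slots into subtype slots that are still unset."""
    if subtype.get("tp_dict") is not None:
        return                       # already initialized: no effect
    for slot, value in base.items():
        if slot != "tp_dict" and subtype.get(slot) is None:
            subtype[slot] = value    # inherit the base type's slot
    subtype["tp_dict"] = {}          # marks the type as initialized


list_slots = {"tp_name": "list",
              "tp_dealloc": "list_dealloc",
              "tp_len": "list_length"}
spam_slots = {"tp_name": "spamlist",
              "tp_dealloc": None,
              "tp_len": None}

init_type(spam_slots, list_slots)
init_type(spam_slots, list_slots)    # harmless repeat
```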

    (During initialization of the Python interpreter, some types are
    actually used before they are initialized.  As long as the slots
    that are actually needed are initialized, especially tp_dealloc,
    this works, but it is fragile and not recommended as a general
    practice.)

    To create a subtype instance, the subtype's tp_new() slot is
    called.  This should first call the base type's tp_new() slot and
    then initialize the subtype's additional data members.  To further
    initialize the instance, the tp_init() slot is typically called.
    Note that the tp_new() slot should *not* call the tp_init() slot;
    this is up to tp_new()'s caller (typically a factory function).
    There are circumstances where it is appropriate not to call
    tp_init().

    If a subtype defines a tp_init() slot, the tp_init() slot should
    normally first call the base type's tp_init() slot.
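
    The same chaining discipline can be shown at the Python level,
    where __new__ and __init__ stand for tp_new() and tp_init().  This
    is a sketch in present-day Python; SpamList and its 'state' member
    mirror the C example above but are otherwise hypothetical.

```python
class SpamList(list):
    def __new__(cls, *args, **kwds):
        self = super().__new__(cls)   # base type builds its part first
        self.state = 0                # then the subtype's extra member
        return self

    def __init__(self, items=(), state=0):
        super().__init__(items)       # base initialization comes first
        self.state = state


spam = SpamList([1, 2, 3], state=7)
bare = SpamList.__new__(SpamList)     # created but never initialized
```

    Note that 'bare' is still a usable (empty) list with a valid
    state, because the creation step alone established the essential
    invariants.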

    (XXX There should be a paragraph or two about argument passing
    here.)


Subtyping in Python

    The next step is to allow subtyping of selected built-in types
    through a class statement in Python.  Limiting ourselves to single
    inheritance for now, here is what happens for a simple class
    statement:

    class C(B):
        var1 = 1
        def method1(self): pass
        # etc.

    The body of the class statement is executed in a fresh environment
    (basically, a new dictionary used as local namespace), and then C
    is created.  The following explains how C is created.

    Assume B is a type object.  Since type objects are objects, and
    every object has a type, B has a type.  Since B is itself a type,
    we also call its type its metatype.  B's metatype is accessible
    via type(B) or B.__class__ (the latter notation is new for types;
    it is introduced in PEP 252).  Let's say this metatype is M (for
    Metatype).  The class statement will create a new type, C.  Since
    C will be a type object just like B, we view the creation of C as
    an instantiation of the metatype, M.  The information that needs
    to be provided for the creation of a subclass is:

      - its name (in this example the string "C");

      - its bases (a singleton tuple containing B);

      - the results of executing the class body, in the form of a
        dictionary (for example {"var1": 1, "method1": <function
        method1 at ...>, ...}).

    The class statement will result in the following call:

        C = M("C", (B,), dict)

    (where dict is the dictionary resulting from execution of the
    class body).  In other words, the metatype (M) is called.

    Note that even though the example has only one base, we still pass
    in a (singleton) sequence of bases; this makes the interface
    uniform with the multiple-inheritance case.
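
    The equivalence between the class statement and a call to the
    metatype can be tried directly in present-day Python, where the
    default metatype is 'type' rather than the classic ClassType.  The
    names B, M, C and C2 below follow the text; 'namespace' stands for
    the dictionary produced by the class body.

```python
def method1(self):
    return "called on " + type(self).__name__


class B:
    pass


M = type(B)                       # B's metatype
namespace = {"var1": 1, "method1": method1}
C = M("C", (B,), namespace)       # what the class statement does


class C2(B):                      # the equivalent class statement
    var1 = 1

    def method1(self):
        return "called on " + type(self).__name__
```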

    In current Python, this is called the "Don Beaudry hook" after its
    inventor; it is an exceptional case that is only invoked when a
    base class is not a regular class.  For a regular base class (or
    when no base class is specified), current Python calls
    PyClass_New(), the C level factory function for classes, directly.

    Under the new system this is changed so that Python *always*
    determines a metatype and calls it as given above.  When one or
    more bases are given, the type of the first base is used as the
    metatype; when no base is given, a default metatype is chosen.  By
    setting the default metatype to PyClass_Type, the metatype of
    "classic" classes, the classic behavior of the class statement is
    retained.  This default can be changed per module by setting the
    global variable __metaclass__.

    There are two further refinements here.  First, a useful feature
    is to be able to specify a metatype directly.  If the class
    statement defines a variable __metaclass__, that is the metatype
    to call.  (Note that setting __metaclass__ at the module level
    only affects class statements without a base class and without an
    explicit __metaclass__ declaration; but setting __metaclass__ in a
    class statement overrides the default metatype unconditionally.)

    Second, with multiple bases, not all bases need to have the same
    metatype.  This is called a metaclass conflict [1].  Some
    metaclass conflicts can be resolved by searching through the set
    of bases for a metatype that derives from all other given
    metatypes.  If such a metatype cannot be found, an exception is
    raised and the class statement fails.
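
    The search described above can be sketched in Python.
    find_metatype() is a hypothetical helper that follows the wording
    of this paragraph literally (look for one metatype that derives
    from all the others); it is not claimed to match the algorithm of
    any particular interpreter release.

```python
def find_metatype(bases, default=type):
    """Return the metatype among the bases that derives from all others."""
    metatypes = [type(base) for base in bases] or [default]
    for candidate in metatypes:
        if all(issubclass(candidate, other) for other in metatypes):
            return candidate
    raise TypeError("metaclass conflict: no metatype derives from all others")


class MetaA(type):
    pass


class MetaB(MetaA):
    pass


class MetaC(type):                # unrelated to MetaA and MetaB
    pass


A = MetaA("A", (), {})
B = MetaB("B", (), {})
C = MetaC("C", (), {})
```

    Here find_metatype((A, B)) yields MetaB, while
    find_metatype((A, C)) raises TypeError because neither MetaA nor
    MetaC derives from the other.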

    This conflict resolution can be implemented in the metatypes
    themselves: the class statement just calls the metatype of the first
    base (or that specified by the __metaclass__ variable), and this
    metatype's constructor looks for the most derived metatype.  If
    that is itself, it proceeds; otherwise, it calls that metatype's
    constructor.  (Ultimate flexibility: another metatype might choose
    to require that all bases have the same metatype, or that there's
    only one base class, or whatever.)

    (In [1], a new metaclass is automatically derived that is a
    subclass of all given metaclasses.  But since it is questionable
    in Python how conflicting method definitions of the various
    metaclasses should be merged, I don't think this is feasible.
    Should the need arise, the user can derive such a metaclass
    manually and specify it using the __metaclass__ variable.  It is
    also possible to have a new metaclass that does this.)

    Note that calling M requires that M itself has a type: the
    meta-metatype.  And the meta-metatype has a type, the
    meta-meta-metatype.  And so on.  This is normally cut short at
    some level by making a metatype be its own metatype.  This is
    indeed what happens in Python: the ob_type reference in
    PyType_Type is set to &PyType_Type.  In the absence of third party
    metatypes, PyType_Type is the only metatype in the Python
    interpreter.

    (In a previous version of this PEP, there was one additional
    meta-level, and there was a meta-metatype called "turtle".  This
    turned out to be unnecessary.)

    In any case, the work for creating C is done by M's tp_new() slot.
    It allocates space for an "extended" type structure, which
    contains space for: the type object; the auxiliary structures
    (as_sequence etc.); the string object containing the type name (to
    ensure that this object isn't deallocated while the type object is
    still referencing it); and some more auxiliary storage (to be
    described later).  It initializes this storage to zeros except for
    a few crucial slots (for example, tp_name is set to point to the
    type name) and then sets the tp_base slot to point to B.  Then
    PyType_InitDict() is called to inherit B's slots.  Finally, C's
    tp_dict slot is updated with the contents of the namespace
    dictionary (the third argument to the call to M).


Multiple Inheritance

    The Python class statement supports multiple inheritance, and we
    will also support multiple inheritance involving built-in types.

    However, there are some restrictions.  The C runtime architecture
    doesn't make it feasible to have a meaningful subtype of two
    different built-in types except in a few degenerate cases.
    Changing the C runtime to support fully general multiple
    inheritance would be too much of an upheaval of the code base.

    The main problem with multiple inheritance from different built-in
    types stems from the fact that the C implementation of built-in
    types accesses structure members directly; the C compiler
    generates an offset relative to the object pointer and that's
    that.  For example, the list and dictionary type structures each
    declare a number of different but overlapping structure members.
    A C function accessing an object expecting a list won't work when
    passed a dictionary, and vice versa, and there's not much we could
    do about this without rewriting all code that accesses lists and
    dictionaries.  This would be too much work, so we won't do this.

    The problem with multiple inheritance is caused by conflicting
    structure member allocations.  Classes defined in Python normally
    don't store their instance variables in structure members: they
    are stored in an instance dictionary.  This is the key to a
    partial solution.  Suppose we have the following two classes:

      class A(dictionary):
          def foo(self): pass

      class B(dictionary):
          def bar(self): pass

      class C(A, B): pass

    (Here, 'dictionary' is the type of built-in dictionary objects,
    a.k.a. type({}) or {}.__class__ or types.DictType.)  If we look at
    the structure lay-out, we find that an A instance has the lay-out
    of a dictionary followed by the __dict__ pointer, and a B instance
    has the same lay-out; since there are no structure member lay-out
    conflicts, a C instance can use this lay-out too, and this is okay.
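
    A runnable version of this example, assuming a current Python in
    which 'dictionary' is spelled 'dict'.  The method bodies and the
    attribute name 'note' are added here for illustration only.

```python
class A(dict):
    def foo(self):
        return "A.foo"


class B(dict):
    def bar(self):
        return "B.bar"


class C(A, B):
    pass


c = C(x=1)                     # the dictionary part of the instance
c.note = "instance variable"   # stored in the instance __dict__

print(c["x"], c.foo(), c.bar())
print(c.__dict__)
```

    The dictionary contents and the instance variables are kept
    apart: the former live in the dictionary structure, the latter in
    the __dict__ that follows it.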

    Here's another example:

      class X(object):
          def foo(self): pass

      class Y(dictionary):
          def bar(self): pass

      class Z(X, Y): pass

    (Here, 'object' is the base for all built-in types; its structure
    lay-out only contains the ob_refcnt and ob_type members.)  This
    example is more complicated, because the __dict__ pointer for X
    instances has a different offset than that for Y instances.  Where
    is the __dict__ pointer for Z instances?  The answer is that the
    offset for the __dict__ pointer is not hardcoded; it is stored in
    the type object.
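
    In CPython the per-type offset can be inspected through the
    __dictoffset__ attribute of a type.  The sketch below assumes a
    current CPython with 'dict' as the dictionary type; the reported
    values are an implementation detail and differ between versions
    (some releases report a negative or sentinel value rather than a
    plain byte offset), so only "zero" versus "nonzero" is meaningful
    here.

```python
class X(object):
    def foo(self):
        return "X.foo"


class Y(dict):
    def bar(self):
        return "Y.bar"


class Z(X, Y):
    pass


# Zero means "instances have no __dict__"; anything else means they do.
offsets = {cls.__name__: cls.__dictoffset__ for cls in (dict, X, Y, Z)}
print(offsets)

z = Z()
z.attr = 1          # found via the offset recorded in Z, not in X or Y
print(z.__dict__, z.foo(), z.bar())
```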

    Suppose on a particular machine an 'object' structure is 8 bytes
    long, and a 'dictionary' struct is 60 bytes, and an object pointer
    is 4 bytes.  Then an X structure is 12 bytes (an object structure
    followed by a __dict__ pointer), and a Y structure is 64 bytes (a
    dictionary structure followed by a __dict__ pointer).  The Z
    structure has the same lay-out as the Y structure in this example.
    Each type object (X, Y and Z) has a "__dict__ offset" which is
    used to find the __dict__ pointer.  Thus, the recipe for looking
    up an instance variable is:

      1. get the type of the instance
      2. get the __dict__ offset from the type object
      3. add the __dict__ offset to the instance pointer
      4. look in the resulting address to find a dictionary reference
      5. look up the instance variable name in that dictionary

    Of course, this recipe can only be implemented in C, and I have
    left out some details.  But this allows us to use multiple
    inheritance patterns similar to the ones we can use with classic
    classes.
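
    The five steps above can be modelled in pure Python using the
    sizes from the example (8-byte object, 60-byte dictionary, 4-byte
    pointer).  This is a toy model, not the C implementation:
    ModelType, ModelInstance, HEAP and lookup are hypothetical names,
    and fake integer addresses stand in for real pointers.

```python
import struct

POINTER_SIZE = 4      # bytes, as assumed in the text
OBJECT_SIZE = 8       # ob_refcnt + ob_type
DICTIONARY_SIZE = 60  # the built-in dictionary structure


class ModelType:
    """Toy stand-in for a type object (not the real PyTypeObject)."""

    def __init__(self, name, base_size):
        self.name = name
        # The __dict__ pointer directly follows the base structure.
        self.dict_offset = base_size
        self.basicsize = base_size + POINTER_SIZE


X = ModelType("X", OBJECT_SIZE)
Y = ModelType("Y", DICTIONARY_SIZE)
Z = ModelType("Z", DICTIONARY_SIZE)  # Z has the same lay-out as Y

HEAP = {}  # fake address -> dictionary, standing in for real memory


class ModelInstance:
    def __init__(self, type_, address, **attrs):
        self.type = type_
        self.memory = bytearray(type_.basicsize)
        HEAP[address] = dict(attrs)
        # Store the "pointer" to the dictionary at the type's offset.
        struct.pack_into("<I", self.memory, type_.dict_offset, address)


def lookup(instance, name):
    type_ = instance.type                    # 1. get the type
    offset = type_.dict_offset               # 2. get the __dict__ offset
    (address,) = struct.unpack_from(         # 3. and 4. read the pointer
        "<I", instance.memory, offset)       #    found at that offset
    return HEAP[address][name]               # 5. look up the name


x = ModelInstance(X, 0x1000, colour="red")
z = ModelInstance(Z, 0x2000, colour="blue")
print(X.basicsize, Y.basicsize, Z.basicsize)
print(lookup(x, "colour"), lookup(z, "colour"))
```

    The same lookup function serves X and Z instances even though
    their __dict__ pointers sit at different offsets, because the
    offset is read from the type rather than compiled in.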

    XXX I should write up the complete algorithm here to determine
    base class compatibility, but I can't be bothered right now.  Look
    at best_base() in typeobject.c in the implementation mentioned
    below.


XXX To be done

    Additional topics to be discussed in this PEP:

      - class methods and static methods

      - mapping between type object slots (tp_foo) and special methods
        (__foo__) (actually, this may belong in PEP 252)

      - built-in names for built-in types (object, int, str, list etc.)

      - method resolution order

      - __dict__

      - __slots__

      - the HEAPTYPE and DYNAMICTYPE flag bits

      - GC support

      - API docs for all the new functions

      - high level user overview


Implementation

    A prototype implementation of this PEP (and of PEP 252) is
    available from CVS as a branch named "descr-branch".  To
    experiment with this implementation, proceed to check out Python
    from CVS according to the instructions at
    http://sourceforge.net/cvs/?group_id=5470 but add the arguments
    "-r descr-branch" to the cvs checkout command.  (You can also
    start with an existing checkout and do "cvs update -r
    descr-branch".)  For some examples of the features described here,
    see the file Lib/test/test_descr.py and the extension module
    Modules/xxsubtype.c.


References

    [1] "Putting Metaclasses to Work", by Ira R. Forman and Scott
        H. Danforth, Addison-Wesley 1999.
        (http://www.aw.com/product/0,2627,0201433052,00.html)


Copyright

    This document has been placed in the public domain.


Local Variables:
mode: indented-text
indent-tabs-mode: nil
End:
-												Checking in a stub of PEP 253.

											
										
										
											2001-05-14 09:43:23 -04:00
+								PEP: 253
 								Title: Subtyping Built-in Types
 								Version: $Revision$
 								Author: guido@python.org (Guido van Rossum)
 								Status: Draft
 								Type: Standards Track
 								Python-Version: 2.2
 								Created: 14-May-2001
 								Post-History:
 								Abstract
-												Another intermediate update.  I've rewritten the requirements for a
base type to be subtypable.  Needs way more work!

											
										
										
											2001-06-13 17:48:31 -04:00
+								    This PEP proposes additions to the type object API that will allow
 								    the creation of subtypes of built-in types, in C and in Python.
 								Introduction
-												Checking in a stub of PEP 253.

											
										
										
											2001-05-14 09:43:23 -04:00
-												Add a lot of text.  A looooooot of text.  Way too much rambling.  And
it isn't even finished.  I'll do that later.  But at least there's
some text here now...

											
										
										
											2001-05-14 21:36:46 -04:00
+								    Traditionally, types in Python have been created statically, by
 								    declaring a global variable of type PyTypeObject and initializing
-												Another intermediate checkin.  Removed a lot of lies about an older
idea for what tp_alloc() should be.

											
										
										
											2001-07-10 16:01:52 -04:00
+								    it with a static initializer.  The slots in the type object
-												Another intermediate update.  I've rewritten the requirements for a
base type to be subtypable.  Needs way more work!

											
										
										
											2001-06-13 17:48:31 -04:00
+								    describe all aspects of a Python type that are relevant to the
-												Another intermediate checkin.  Removed a lot of lies about an older
idea for what tp_alloc() should be.

											
										
										
											2001-07-10 16:01:52 -04:00
+								    Python interpreter.  A few slots contain dimensional information
-												Intermediate checkin (documented tp_new, tp_init, tp_alloc properly).

											
										
										
											2001-07-10 13:11:19 -04:00
+								    (like the basic allocation size of instances), others contain
-												Another intermediate checkin.  Removed a lot of lies about an older
idea for what tp_alloc() should be.

											
										
										
											2001-07-10 16:01:52 -04:00
+								    various flags, but most slots are pointers to functions to
-												Add a lot of text.  A looooooot of text.  Way too much rambling.  And
it isn't even finished.  I'll do that later.  But at least there's
some text here now...

											
										
										
											2001-05-14 21:36:46 -04:00
+								    implement various kinds of behaviors.  A NULL pointer means that
 								    the type does not implement the specific behavior; in that case
 								    the system may provide a default behavior in that case or raise an
 								    exception when the behavior is invoked.  Some collections of
 								    functions pointers that are usually defined together are obtained
-												Another intermediate update.  I've rewritten the requirements for a
base type to be subtypable.  Needs way more work!

											
										
										
											2001-06-13 17:48:31 -04:00
+								    indirectly via a pointer to an additional structure containing
 								    more function pointers.
-												Checking in a stub of PEP 253.

											
										
										
											2001-05-14 09:43:23 -04:00
-												Add a lot of text.  A looooooot of text.  Way too much rambling.  And
it isn't even finished.  I'll do that later.  But at least there's
some text here now...

											
										
										
											2001-05-14 21:36:46 -04:00
+								    While the details of initializing a PyTypeObject structure haven't
-												Another intermediate update.  I've rewritten the requirements for a
base type to be subtypable.  Needs way more work!

											
										
										
											2001-06-13 17:48:31 -04:00
+								    been documented as such, they are easily gleaned from the examples
-												Add a lot of text.  A looooooot of text.  Way too much rambling.  And
it isn't even finished.  I'll do that later.  But at least there's
some text here now...

											
										
										
											2001-05-14 21:36:46 -04:00
+								    in the source code, and I am assuming that the reader is
 								    sufficiently familiar with the traditional way of creating new
 								    Python types in C.
-												PEP 259: Omit printing newline after newline

											
										
										
											2001-06-11 16:07:37 -04:00
+								    This PEP will introduce the following features:
-												Add a lot of text.  A looooooot of text.  Way too much rambling.  And
it isn't even finished.  I'll do that later.  But at least there's
some text here now...

											
										
										
											2001-05-14 21:36:46 -04:00
-												Intermediate checkin (documented tp_new, tp_init, tp_alloc properly).

											
										
										
											2001-07-10 13:11:19 -04:00
+								      - a type can be a factory function for its instances
-												Another intermediate update.  I've rewritten the requirements for a
base type to be subtypable.  Needs way more work!

											
										
										
											2001-06-13 17:48:31 -04:00
-												Intermediate checkin (documented tp_new, tp_init, tp_alloc properly).

											
										
										
											2001-07-10 13:11:19 -04:00
+								      - types can be subtyped in C
-												Another intermediate update.  I've rewritten the requirements for a
base type to be subtypable.  Needs way more work!

											
										
										
											2001-06-13 17:48:31 -04:00
-												Intermediate checkin (documented tp_new, tp_init, tp_alloc properly).

											
										
										
											2001-07-10 13:11:19 -04:00
+								      - types can be subtyped in Python with the class statement
-												Add a lot of text.  A looooooot of text.  Way too much rambling.  And
it isn't even finished.  I'll do that later.  But at least there's
some text here now...

											
										
										
											2001-05-14 21:36:46 -04:00
-												Intermediate checkin (documented tp_new, tp_init, tp_alloc properly).

											
										
										
											2001-07-10 13:11:19 -04:00
+								      - multiple inheritance from types is supported (insofar as
 								        practical -- you still can't multiply inherit from list and
 								        dictionary)
-												Add a lot of text.  A looooooot of text.  Way too much rambling.  And
it isn't even finished.  I'll do that later.  But at least there's
some text here now...

											
										
										
											2001-05-14 21:36:46 -04:00
-												Intermediate checkin (documented tp_new, tp_init, tp_alloc properly).

											
										
										
											2001-07-10 13:11:19 -04:00
+								      - the standard coercions functions (int, tuple, str etc.) will
 								        be redefined to be the corresponding type objects, which serve
 								        as their own factory functions
-												Add a lot of text.  A looooooot of text.  Way too much rambling.  And
it isn't even finished.  I'll do that later.  But at least there's
some text here now...

											
										
										
											2001-05-14 21:36:46 -04:00
-												Intermediate checkin (documented tp_new, tp_init, tp_alloc properly).

											
										
										
											2001-07-10 13:11:19 -04:00
+								      - a class statement can contain a __metaclass__ declaration,
 								        specifying the metaclass to be used to create the new class
-												Add a lot of text.  A looooooot of text.  Way too much rambling.  And
it isn't even finished.  I'll do that later.  But at least there's
some text here now...

											
										
										
											2001-05-14 21:36:46 -04:00
-												Intermediate checkin (documented tp_new, tp_init, tp_alloc properly).

											
										
										
											2001-07-10 13:11:19 -04:00
+								      - a class statement can contain a __slots__ declaration,
 								        specifying the specific names of the instance variables
 								        supported
-												PEP 259: Omit printing newline after newline

											
										
										
											2001-06-11 16:07:37 -04:00
-												Conform to Barry's new PEP referencing guidelines.

											
										
										
											2001-07-05 15:00:02 -04:00
+								    This PEP builds on PEP 252, which adds standard introspection to
-												Intermediate checkin (documented tp_new, tp_init, tp_alloc properly).

											
										
										
											2001-07-10 13:11:19 -04:00
+								    types; for example, when a particular type object initializes the
 								    tp_hash slot, that type object has a __hash__ method when
 								    introspected.  PEP 252 also adds a dictionary to type objects
 								    which contains all methods.  At the Python level, this dictionary
 								    is read-only for built-in types; at the C level, it is accessible
 								    directly (but it should not be modified except as part of
 								    initialization).
-												Add a lot of text.  A looooooot of text.  Way too much rambling.  And
it isn't even finished.  I'll do that later.  But at least there's
some text here now...

											
										
										
											2001-05-14 21:36:46 -04:00
-												Another intermediate update.  I've rewritten the requirements for a
base type to be subtypable.  Needs way more work!

											
										
										
											2001-06-13 17:48:31 -04:00
+								    For binary compatibility, a flag bit in the tp_flags slot
 								    indicates the existence of the various new slots in the type
 								    object introduced below.  Types that don't have the
-												Another intermediate checkin.  Removed a lot of lies about an older
idea for what tp_alloc() should be.

											
										
										
											2001-07-10 16:01:52 -04:00
+								    Py_TPFLAGS_HAVE_CLASS bit set in their tp_flags slot are assumed
-												Another intermediate update.  I've rewritten the requirements for a
base type to be subtypable.  Needs way more work!

											
										
										
											2001-06-13 17:48:31 -04:00
+								    to have NULL values for all the subtyping slots.  (Warning: the
 								    current implementation prototype is not yet consistent in its
 								    checking of this flag bit.  This should be fixed before the final
 								    release.)
-												Add a lot of text.  A looooooot of text.  Way too much rambling.  And
it isn't even finished.  I'll do that later.  But at least there's
some text here now...

											
										
										
											2001-05-14 21:36:46 -04:00
-												More good stuff.  Consider this just a checkpoint.

											
										
										
											2001-06-14 16:48:43 -04:00
+								    In current Python, a distinction is made between types and
-												Conform to Barry's new PEP referencing guidelines.

											
										
										
											2001-07-05 15:00:02 -04:00
+								    classes.  This PEP together with PEP 254 will remove that
-												Intermediate checkin (documented tp_new, tp_init, tp_alloc properly).

											
										
										
											2001-07-10 13:11:19 -04:00
+								    distinction.  However, for backwards compatibility the distinction
 								    will probably remain for years to come, and without PEP 254, the
 								    distinction is still large: types ultimately have a built-in type
 								    as a base class, while classes ultimately derive from a
 								    user-defined class.  Therefore, in the rest of this PEP, I will
 								    use the word type whenever I can -- including base type or
 								    supertype, derived type or subtype, and metatype.  However,
 								    sometimes the terminology necessarily blends, for example an
-												More good stuff.  Consider this just a checkpoint.

											
										
										
											2001-06-14 16:48:43 -04:00
+								    object's type is given by its __class__ attribute, and subtyping
 								    in Python is spelled with a class statement.  If further
 								    distinction is necessary, user-defined classes can be referred to
 								    as "classic" classes.
-												Another intermediate update.  I've rewritten the requirements for a
base type to be subtypable.  Needs way more work!

											
										
										
											2001-06-13 17:48:31 -04:00
 								About metatypes
-												Add a lot of text.  A looooooot of text.  Way too much rambling.  And
it isn't even finished.  I'll do that later.  But at least there's
some text here now...

											
										
										
											2001-05-14 21:36:46 -04:00
-												Intermediate checkin (documented tp_new, tp_init, tp_alloc properly).

											
										
										
											2001-07-10 13:11:19 -04:00
+								    Inevitably the discussion comes to metatypes (or metaclasses).
 								    Metatypes are nothing new in Python: Python has always been able
 								    to talk about the type of a type:
-												Add a lot of text.  A looooooot of text.  Way too much rambling.  And
it isn't even finished.  I'll do that later.  But at least there's
some text here now...

											
										
										
											2001-05-14 21:36:46 -04:00
 								    >>> a = 0
 								    >>> type(a)
 								    <type 'int'>
 								    >>> type(type(a))
 								    <type 'type'>
 								    >>> type(type(type(a)))
 								    <type 'type'>
 								    >>>
 								    In this example, type(a) is a "regular" type, and type(type(a)) is
 								    a metatype.  While as distributed all types have the same metatype
-												Another intermediate update.  I've rewritten the requirements for a
base type to be subtypable.  Needs way more work!

											
										
										
											2001-06-13 17:48:31 -04:00
+								    (PyType_Type, which is also its own metatype), this is not a
 								    requirement, and in fact a useful and relevant 3rd party extension
 								    (ExtensionClasses by Jim Fulton) creates an additional metatype.
-												Intermediate checkin (documented tp_new, tp_init, tp_alloc properly).

											
										
										
											2001-07-10 13:11:19 -04:00
+								    The type of classic classes, known as types.ClassType, can also be
 								    considered a distinct metatype.
 								    A feature closely connected to metatypes is the "Don Beaudry
 								    hook", which says that if a metatype is callable, its instances
 								    (which are regular types) can be subclassed (really subtyped)
 								    using a Python class statement.  I will use this rule to support
 								    subtyping of built-in types, and in fact it greatly simplifies the
 								    logic of class creation to always simply call the metatype.  When
 								    no base class is specified, a default metatype is called -- the
 								    default metatype is the "ClassType" object, so the class statement
 								    will behave as before in the normal case.  (This default can be
 								    changed per module by setting the global variable __metaclass__.)
-												Another intermediate update.  I've rewritten the requirements for a
base type to be subtypable.  Needs way more work!

											
										
										
											2001-06-13 17:48:31 -04:00
 								    Python uses the concept of metatypes or metaclasses in a different
 								    way than Smalltalk.  In Smalltalk-80, there is a hierarchy of
 								    metaclasses that mirrors the hierarchy of regular classes,
 								    metaclasses map 1-1 to classes (except for some funny business at
 								    the root of the hierarchy), and each class statement creates both
 								    a regular class and its metaclass, putting class methods in the
 								    metaclass and instance methods in the regular class.
-												Add a lot of text.  A looooooot of text.  Way too much rambling.  And
it isn't even finished.  I'll do that later.  But at least there's
some text here now...

											
										
										
											2001-05-14 21:36:46 -04:00
 								    Nice though this may be in the context of Smalltalk, it's not
 								    compatible with the traditional use of metatypes in Python, and I
 								    prefer to continue in the Python way.  This means that Python
 								    metatypes are typically written in C, and may be shared between
 								    many regular types. (It will be possible to subtype metatypes in
 								    Python, so it won't be absolutely necessary to write C in order to
-												Intermediate checkin (documented tp_new, tp_init, tp_alloc properly).

											
										
										
											2001-07-10 13:11:19 -04:00
+								    use metatypes; but the power of Python metatypes will be limited.
 								    For example, Python code will never be allowed to allocate raw
 								    memory and initialize it at will.)
-												Add a lot of text.  A looooooot of text.  Way too much rambling.  And
it isn't even finished.  I'll do that later.  But at least there's
some text here now...

											
										
										
											2001-05-14 21:36:46 -04:00
-												Intermediate checkin (documented tp_new, tp_init, tp_alloc properly).

											
										
										
											2001-07-10 13:11:19 -04:00
+								    Metatypes determine various *policies* for types,such as what
-												Another intermediate update.  I've rewritten the requirements for a
base type to be subtypable.  Needs way more work!

											
										
										
											2001-06-13 17:48:31 -04:00
+								    happens when a type is called, how dynamic types are (whether a
 								    type's __dict__ can be modified after it is created), what the
 								    method resolution order is, how instance attributes are looked
 								    up, and so on.
 								    I'll argue that left-to-right depth-first is not the best
 								    solution when you want to get the most use from multiple
 								    inheritance.
 								    I'll argue that with multiple inheritance, the metatype of the
 								    subtype must be a descendant of the metatypes of all base types.
-												Add a lot of text.  A looooooot of text.  Way too much rambling.  And
it isn't even finished.  I'll do that later.  But at least there's
some text here now...

											
										
										
											2001-05-14 21:36:46 -04:00
-												Another intermediate update.  I've rewritten the requirements for a
base type to be subtypable.  Needs way more work!

											
										
										
											2001-06-13 17:48:31 -04:00
+								    I'll come back to metatypes later.
-												Add a lot of text.  A looooooot of text.  Way too much rambling.  And
it isn't even finished.  I'll do that later.  But at least there's
some text here now...

											
										
										
											2001-05-14 21:36:46 -04:00
-												Another intermediate update.  I've rewritten the requirements for a
base type to be subtypable.  Needs way more work!

											
										
										
											2001-06-13 17:48:31 -04:00
 								Making a type a factory for its instances
 								    Traditionally, for each type there is at least one C factory
 								    function that creates instances of the type (PyTuple_New(),
 								    PyInt_FromLong() and so on).  These factory functions take care of
-												Add a lot of text.  A looooooot of text.  Way too much rambling.  And
it isn't even finished.  I'll do that later.  But at least there's
some text here now...

											
										
										
											2001-05-14 21:36:46 -04:00
+								    both allocating memory for the object and initializing that
-												Another intermediate update.  I've rewritten the requirements for a
base type to be subtypable.  Needs way more work!

											
										
										
											2001-06-13 17:48:31 -04:00
+								    memory.  As of Python 2.0, they also have to interface with the
-												Add a lot of text.  A looooooot of text.  Way too much rambling.  And
it isn't even finished.  I'll do that later.  But at least there's
some text here now...

											
										
										
											2001-05-14 21:36:46 -04:00
+								    garbage collection subsystem, if the type chooses to participate
 								    in garbage collection (which is optional, but strongly recommended
 								    for so-called "container" types: types that may contain arbitrary
 								    references to other objects, and hence may participate in
 								    reference cycles).
-												Another intermediate update.  I've rewritten the requirements for a
base type to be subtypable.  Needs way more work!

											
										
										
											2001-06-13 17:48:31 -04:00
+								    In this proposal, type objects can be factory functions for their
 								    instances, making the types directly callable from Python.  This
 								    mimics the way classes are instantiated.  Of course, the C APIs
 								    for creating instances of various built-in types will remain valid
 								    and probably the most common; and not all types will become their
 								    own factory functions.
-												Add a lot of text.  A looooooot of text.  Way too much rambling.  And
it isn't even finished.  I'll do that later.  But at least there's
some text here now...

											
										
										
											2001-05-14 21:36:46 -04:00
-												Another intermediate update.  I've rewritten the requirements for a
base type to be subtypable.  Needs way more work!

											
										
										
											2001-06-13 17:48:31 -04:00
+								    The type object has a new slot, tp_new, which can act as a factory
 								    for instances of the type.  Types are made callable by providing a
 								    tp_call slot in PyType_Type (the metatype); the slot
 								    implementation function looks for the tp_new slot of the type that
 								    is being called.
-												Add a lot of text.  A looooooot of text.  Way too much rambling.  And
it isn't even finished.  I'll do that later.  But at least there's
some text here now...

											
										
										
											2001-05-14 21:36:46 -04:00
-												Intermediate checkin (documented tp_new, tp_init, tp_alloc properly).

											
										
										
											2001-07-10 13:11:19 -04:00
+								    (Confusion alert: the tp_call slot of a regular type object (such
 								    as PyInt_Type or PyList_Type) defines what happens when
 								    *instances* of that type are called; in particular, the tp_call
 								    slot in the function type, PyFunction_Type, is the key to making
 								    functions callable.  As another example, PyInt_Type.tp_call is
 								    NULL, because integers are not callable.  The new paradigm makes
 								    *type objects* callable.  Since type objects are instances of
 								    their metatype (PyType_Type), the metatype's tp_call slot
 								    (PyType_Type.tp_call) points to a function that is invoked when
 								    any type object is called.  Now, since each type has do do
 								    something different to create an instance of itself,
 								    PyType_Type.tp_call immediately defers to the tp_new slot of the
 								    type that is being called.  To add to the confusion, PyType_Type
 								    itself is also callable: its tp_new slot creates a new type.  This
 								    is used by the class statement (via the Don Beaudry hook, see
 								    above).  And what makes PyType_Type callable?  The tp_call slot of
 								    *its* metatype -- but since it is its own metatype, that is its
 								    own tp_call slot!)
-												Another intermediate update.  I've rewritten the requirements for a
base type to be subtypable.  Needs way more work!

											
										
										
											2001-06-13 17:48:31 -04:00
+								    If the type's tp_new slot is NULL, an exception is raised.
 								    Otherwise, the tp_new slot is called.  The signature for the
 								    tp_new slot is
-												Add a lot of text.  A looooooot of text.  Way too much rambling.  And
it isn't even finished.  I'll do that later.  But at least there's
some text here now...

											
										
										
											2001-05-14 21:36:46 -04:00
-												Another intermediate update.  I've rewritten the requirements for a
base type to be subtypable.  Needs way more work!

											
										
										
											2001-06-13 17:48:31 -04:00
+								        PyObject *tp_new(PyTypeObject *type,
 								                         PyObject *args,
 								                         PyObject *kwds)
 								    where 'type' is the type whose tp_new slot is called, and 'args'
 								    and 'kwds' are the sequential and keyword arguments to the call,
 								    passed unchanged from tp_call.  (The 'type' argument is used in
 								    combination with inheritance, see below.)
-												Add a lot of text.  A looooooot of text.  Way too much rambling.  And
it isn't even finished.  I'll do that later.  But at least there's
some text here now...

											
										
										
											2001-05-14 21:36:46 -04:00
-												Another intermediate update.  I've rewritten the requirements for a
base type to be subtypable.  Needs way more work!

											
										
										
											2001-06-13 17:48:31 -04:00
+								    There are no constraints on the object type that is returned,
 								    although by convention it should be an instance of the given
 								    type.  It is not necessary that a new object is returned; a
 								    reference to an existing object is fine too.  The return value
 								    should always be a new reference, owned by the caller.
-												Add a lot of text.  A looooooot of text.  Way too much rambling.  And
it isn't even finished.  I'll do that later.  But at least there's
some text here now...

											
										
										
											2001-05-14 21:36:46 -04:00
-												Intermediate checkin (documented tp_new, tp_init, tp_alloc properly).

											
										
										
											2001-07-10 13:11:19 -04:00
+								    One the tp_new slot has returned an object, further initialization
 								    is attempted by calling the tp_init() slot of the resulting
 								    object's type, if not NULL.  This has the following signature:
 								        PyObject *tp_init(PyObject *self,
 								                          PyObject *args,
 								                          PyObject *kwds)
 								    It corresponds more closely to the __init__() method of classic
 								    classes, and in fact is mapped to that by the slot/special-method
 								    correspondence rules.  The difference in responsibilities between
 								    the tp_new() slot and the tp_init() slot lies in the invariants
 								    they ensure.  The tp_new() slot should ensure only the most
 								    essential invariants, without which the C code that implements the
 								    object's wold break.  The tp_init() slot should be used for
 								    overridable user-specific initializations.  Take for example the
 								    dictionary type.  The implementation has an internal pointer to a
 								    hash table which should never be NULL.  This invariant is taken
 								    care of by the tp_new() slot for dictionaries.  The dictionary
 								    tp_init() slot, on the other hand, could be used to give the
 								    dictionary an initial set of keys and values based on the
 								    arguments passed in.
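
    A rough Python-level sketch of this division of labor, using the
    built-in dictionary type.  It assumes a released Python 3, where
    tp_new() is visible as __new__() and tp_init() as __init__(); the
    variable name d is illustrative only.

```python
# tp_new() analogue: create the dictionary and establish the essential
# invariant (a valid, empty hash table) without adding any contents.
d = dict.__new__(dict)
print(d)                      # an empty but fully usable dictionary

# tp_init() analogue: give the dictionary its initial keys and values
# based on the arguments passed in.
d.__init__({"spam": 1}, eggs=2)
print(sorted(d.items()))
```

    The object is already safe to use after the first step; the second
    step only adds user-visible state.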
+								    Note that for immutable object types, the initialization cannot be
 								    done by the tp_init() slot: this would provide the Python user
 								    with a way to change the initialization.  Therefore, immutable
 								    objects typically have an empty tp_init() implementation and do
 								    all their initialization in their tp_new() slot.
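
    As an illustration only, here is how that rule looks from Python in
    a released Python 3 (an assumption: __new__() is the Python-level
    name of tp_new()).  The class Even is hypothetical.

```python
class Even(int):
    """Hypothetical immutable subtype: rounds down to an even number."""

    def __new__(cls, value):
        # All initialization happens here; once the object exists its
        # numeric value can no longer be changed.
        return super().__new__(cls, value - value % 2)


e = Even(7)
print(e)    # the value that was fixed by __new__()
```

    No __init__() is defined, so there is nothing a later call could
    use to alter the value.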
+								    You may wonder why the tp_new() slot shouldn't call the tp_init()
 								    slot itself.  The reason is that in certain circumstances (like
 								    support for persistent objects), it is important to be able to
 								    create an object of a particular type without initializing it any
 								    further than necessary.  This may conveniently be done by calling
 								    the tp_new() slot without calling tp_init().  It is also possible
 								    that tp_init() is not called, or called more than once -- its
 								    operation should be robust even in these anomalous cases.
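
    A minimal sketch of these cases in Python 3 terms, assuming
    __new__() and __init__() stand in for tp_new() and tp_init().  The
    Stack class is hypothetical.

```python
class Stack:
    """Hypothetical mutable type used only for illustration."""

    def __new__(cls, *args, **kwds):
        self = super().__new__(cls)
        self.items = []          # essential invariant: items always exists
        return self

    def __init__(self, initial=()):
        # Safe to skip and safe to repeat: it resets rather than appends.
        self.items[:] = list(initial)


# A persistence layer might create the object without initializing it ...
bare = Stack.__new__(Stack)
# ... while the ordinary path calls __new__() followed by __init__().
full = Stack([1, 2])
full.__init__([1, 2])            # a second call leaves a consistent object
```

    Because the invariant is set up in __new__(), the object is usable
    whether __init__() runs zero, one or several times.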
 								    For some objects, tp_new() may return an existing object.  For
 								    example, the factory function for integers caches the integers -1
 								    through 99.  This is permissible only when the type argument to
 								    tp_new() is the type that defined the tp_new() function (in the
 								    example, if type == &PyInt_Type), and when the tp_init() slot for
 								    this type does nothing.  If the type argument differs, the
 								    tp_new() call is initiated by a derived type's tp_new() to
 								    create the object and initialize the base type portion of the
 								    object; in this case tp_new() should always return a new object
 								    (or raise an exception).
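
    The caching rule can be sketched in Python as follows.  This is a
    hypothetical flyweight class written for a released Python 3, not
    the way the integer type is actually implemented.

```python
class SmallInt:
    """Hypothetical flyweight type; not how CPython implements int."""

    _cache = {}

    def __new__(cls, n):
        # Sharing is only safe for the exact type that defines __new__().
        shareable = cls is SmallInt and -1 <= n <= 99
        if shareable and n in cls._cache:
            return cls._cache[n]
        self = super().__new__(cls)
        self.n = n
        if shareable:
            cls._cache[n] = self
        return self
    # No __init__() is defined, so re-initializing a shared object is a no-op.


class Tagged(SmallInt):
    """A derived type always receives a fresh object."""
```

    A call such as SmallInt(5) may return a shared object, while
    Tagged(5) must always produce a new one so that the derived type
    can finish building its own portion.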
 								    There's a third slot related to object creation: tp_alloc().  Its
 								    responsibility is to allocate the memory for the object,
+								    initialize the reference count (ob_refcnt) and the type pointer
 								    (ob_type), and initialize the rest of the object to all zeros.  It
 								    should also register the object with the garbage collection
 								    subsystem if the type supports garbage collection.  This slot
 								    exists so that derived types can override the memory allocation
+								    policy (like which heap is being used) separately from the
+								    initialization code.  The signature is:
 								        PyObject *tp_alloc(PyTypeObject *type, int nitems)
 								    The type argument is the type of the new object.  The nitems
 								    argument is normally zero, except for objects with a variable
 								    allocation size (basically strings, tuples, and longs).  The
 								    allocation size is given by the following expression:
 								        type->tp_basicsize  +  nitems * type->tp_itemsize
 								    This slot is only used for subclassable types.  The tp_new()
 								    function of the base class must call the tp_alloc() slot of the
 								    type passed in as its first argument.  It is the tp_new()
 								    function's responsibility to calculate the number of items.  The
+								    tp_alloc() slot will set the ob_size member of the new object if
 								    the type->tp_itemsize member is nonzero.
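
    The size expression can be evaluated from Python as a quick check.
    This sketch assumes a released CPython, where tp_basicsize and
    tp_itemsize are readable as __basicsize__ and __itemsize__; the
    helper name allocation_size is made up for this example.

```python
def allocation_size(tp, nitems=0):
    """Bytes requested for an instance, following the expression above.

    __basicsize__ and __itemsize__ are the Python-visible names of
    tp_basicsize and tp_itemsize in released CPython.
    """
    return tp.__basicsize__ + nitems * tp.__itemsize__


# Fixed-size type: tp_itemsize is zero, so nitems does not matter.
print(allocation_size(list))
# Variable-size type: the size grows linearly with the number of items.
print(allocation_size(tuple, 3) - allocation_size(tuple, 0))
```

    The actual numbers depend on the platform and interpreter build, so
    none are shown here.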
 								    (Note: in certain debugging compilation modes, the type structure
 								    already had members named tp_alloc and tp_free, which served as
 								    counters for the number of allocations and deallocations.  These
 								    are renamed to tp_allocs and tp_deallocs.)
 								    XXX The keyword arguments are currently not passed to tp_new();
 								    its kwds argument is always NULL.  This is a relic from a previous
 								    revision and should probably be fixed.  Both tp_new() and
 								    tp_init() should receive exactly the same arguments, and both
 								    should check that the arguments are acceptable, because they may
 								    be called independently.
+								    Standard implementations for tp_alloc() and tp_new() are
 								    available.  PyType_GenericAlloc() allocates an object from the
 								    standard heap and initializes it properly.  It uses the above
 								    formula to determine the amount of memory to allocate, and takes
 								    care of GC registration.  The only reason not to use this
 								    implementation would be to allocate objects from a different heap
 								    (as is done by some very small frequently used objects like ints
 								    and tuples).  PyType_GenericNew() adds very little: it just calls
 								    the type's tp_alloc() slot with zero for nitems.  But for mutable
 								    types that do all their initialization in their tp_init() slot,
 								    this may be just the ticket.
+								Preparing a type for subtyping
 								    The idea behind subtyping is very similar to that of single
 								    inheritance in C++.  A base type is described by a structure
+								    declaration (similar to the C++ class declaration) plus a type
 								    object (similar to the C++ vtable).  A derived type can extend the
 								    structure (but must leave the names, order and type of the members
+								    of the base structure unchanged) and can override certain slots in
+								    the type object, leaving others the same.  (Unlike C++ vtables,
 								    all Python type objects have the same memory lay-out.)
 								    The base type must do the following:
+								      - Add the flag value Py_TPFLAGS_BASETYPE to tp_flags.
 								      - Declare and use tp_new(), tp_alloc() and optional tp_init()
 								        slots.
 								      - Declare and use tp_dealloc() and tp_free().
 								      - Export its object structure declaration.
 								      - Export a subtyping-aware type-checking macro.
 								    The requirements and signatures for tp_new(), tp_alloc() and
 								    tp_init() have already been discussed above: tp_alloc() should
 								    allocate the memory and initialize it to mostly zeros; tp_new()
 								    should call the tp_alloc() slot and then proceed to do the
 								    minimally required initialization; tp_init() should be used for
 								    more extensive initialization of mutable objects.
 								    It should come as no surprise that there are similar conventions
 								    at the end of an object's lifetime.  The slots involved are
 								    tp_dealloc() (familiar to all who have ever implemented a Python
 								    extension type) and tp_free(), the new kid on the block.  (The
 								    names aren't quite symmetric; tp_free() corresponds to tp_alloc(),
 								    which is fine, but tp_dealloc() corresponds to tp_new().  Maybe
 								    the tp_dealloc slot should be renamed?)
 								    The tp_free() slot should be used to free the memory and
 								    unregister the object with the garbage collection subsystem, and
 								    can be overridden by a derived class; tp_dealloc() should
+								    deinitialize the object (usually by calling Py_XDECREF() for
 								    various sub-objects) and then call tp_free() to deallocate the
 								    memory.  The signature for tp_dealloc() is the same as it always
 								    was:
 								        void tp_dealloc(PyObject *object)
 								    The signature for tp_free() is the same:
 								        void tp_free(PyObject *object)
 								    (In a previous version of this PEP, there was also a role reserved
 								    for the tp_clear() slot.  This turned out to be a bad idea.)
 								    In order to be usefully subtyped in C, a type must export the
+								    structure declaration for its instances through a header file, as
 								    it is needed in order to derive a subtype.  The type object for
 								    the base type must also be exported.
+								    If the base type has a type-checking macro (like PyDict_Check()),
+								    this macro should be made to recognize subtypes.  This can be done
 								    by using the new PyObject_TypeCheck(object, type) macro, which
 								    calls a function that follows the base class links.
 								    The PyObject_TypeCheck() macro contains a slight optimization: it
 								    first compares object->ob_type directly to the type argument, and
 								    if this is a match, bypasses the function call.  This should make
 								    it fast enough for most situations.
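
    The logic of such a check can be modelled in Python.  This is only
    a sketch of the idea, not the macro itself: the function name
    type_check is hypothetical, and it follows a single chain of base
    links via the __base__ attribute of a released Python 3.

```python
def type_check(obj, tp):
    """Sketch of a subtype-aware check (hypothetical helper)."""
    ob_type = type(obj)
    if ob_type is tp:            # fast path: exact match, no further work
        return True
    base = ob_type.__base__      # slow path: follow the base class links
    while base is not None:
        if base is tp:
            return True
        base = base.__base__
    return False


class MyDict(dict):
    """Hypothetical subtype used to exercise the slow path."""
```

    With this, type_check(MyDict(), dict) succeeds through the base
    links, while type_check({}, dict) succeeds on the first comparison.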
 								    Note that this change in the type-checking macro means that C
 								    functions that require an instance of the base type may be invoked
 								    with instances of the derived type.  Before enabling subtyping of
 								    a particular type, its code should be checked to make sure that
 								    this won't break anything.
 								Creating a subtype of a built-in type in C
+								    The simplest form of subtyping is subtyping in C.  It is the
 								    simplest form because we can require the C code to be aware of
 								    some of the problems, and it's acceptable for C code that doesn't
 								    follow the rules to dump core.  For added simplicity, it is
 								    limited to single inheritance.
+								    Let's assume we're deriving from a mutable base type whose
 								    tp_itemsize is zero.  The subtype code is not GC-aware, although
 								    it may inherit GC-awareness from the base type (this is
 								    automatic).  The base type's allocation uses the standard heap.
+								    The derived type begins by declaring a type structure which
 								    contains the base type's structure.  For example, here's the type
 								    structure for a subtype of the built-in list type:
 								    typedef struct {
 								        PyListObject list;
 								        int state;
 								    } spamlistobject;
+								    Note that the base type structure member (here PyListObject) must
 								    be the first member of the structure; any following members are
 								    additions.  Also note that the base type is not referenced via a
 								    pointer; the actual contents of its structure must be included!
 								    (The goal is for the memory layout of the beginning of the
 								    subtype instance to be the same as that of the base type
+								    instance.)
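
    The layout requirement can be checked with the standard ctypes
    module.  The two structures below are stand-ins invented for this
    sketch; they are not PyListObject or any real interpreter
    structure.

```python
import ctypes


class BaseObject(ctypes.Structure):
    # Stand-in for a base type's instance structure (not PyListObject).
    _fields_ = [("refcnt", ctypes.c_ssize_t),
                ("items", ctypes.c_void_p)]


class SpamObject(ctypes.Structure):
    # The base structure is embedded by value as the first member;
    # the subtype's additions follow it.
    _fields_ = [("base", BaseObject),
                ("state", ctypes.c_int)]


spam = SpamObject()
spam.base.refcnt = 7

# Code that only knows the base layout can read the same memory.
as_base = ctypes.cast(ctypes.pointer(spam), ctypes.POINTER(BaseObject))
print(SpamObject.base.offset)    # offset of the embedded base structure
print(as_base.contents.refcnt)   # the value stored through the subtype
```

    Because the base part starts at offset zero, a pointer to the
    subtype instance can be treated as a pointer to a base instance.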
 								    Next, the derived type must declare a type object and initialize
 								    it.  Most of the slots in the type object may be initialized to
 								    zero, which is a signal that the base type slot must be copied
+								    into it.  Some slots that must be initialized properly:
+								      - The object header must be filled in as usual; the type should
 								        be &PyType_Type.
+								      - The tp_basicsize slot must be set to the size of the subtype
 								        instance struct (in the above example:
 								        sizeof(spamlistobject)).
+								      - The tp_base slot must be set to the address of the base type's
 								        type object.
 								      - If the derived type defines any pointer members, the
 								        tp_dealloc slot function requires special attention, see
 								        below; otherwise, it can be set to zero, to inherit the base
 								        type's deallocation function.
+								      - The tp_flags slot must be set to the usual Py_TPFLAGS_DEFAULT
 								        value.
+								      - The tp_name slot must be set; it is recommended to set tp_doc
 								        as well (these are not inherited).
+								    If the subtype defines no additional structure members (it only
 								    defines new behavior, no new data), the tp_basicsize and the
 								    tp_dealloc slots may be left set to zero.
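
    Several of these slots can be observed from Python in a released
    CPython 3 (an assumption; the attribute names __base__ and
    __basicsize__ are the Python-visible forms of tp_base and
    tp_basicsize).  Both classes below are hypothetical, and __slots__
    is used so that the only size difference comes from the declared
    member.

```python
class SpamList(list):
    """Hypothetical subtype that adds one data member."""
    __slots__ = ("state",)


class QuietList(list):
    """Hypothetical subtype that adds behavior only, no new data."""
    __slots__ = ()

    def first(self):
        return self[0]


print(SpamList.__base__)         # tp_base: the base type's type object
print(type(SpamList))            # object header: an instance of type
print(SpamList.__basicsize__, list.__basicsize__, QuietList.__basicsize__)
```

    The subtype with an extra member reports a larger basic size than
    list, while the behavior-only subtype keeps the size of its base.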
 								    The subtype's tp_dealloc slot deserves special attention.  If the
+								    derived type defines no additional pointer members that need to be
+								    DECREF'ed or freed when the object is deallocated, it can be set
+								    to zero.  Otherwise, the subtype's tp_dealloc() function must call
 								    Py_XDECREF() for any PyObject * members and the correct memory
+								    freeing function for any other pointers it owns, and then call the
+								    base class's tp_dealloc() slot.  This call has to be made via the
 								    base type's type structure, for example, when deriving from the
+								    standard list type:
 								        PyList_Type.tp_dealloc(self);
+								    If the subtype wants to use a different allocation heap than the
 								    base type, the subtype must override both the tp_alloc() and the
 								    tp_free() slots.  These will be called by the base class's
 								    tp_new() and tp_dealloc() slots, respectively.
 								    In order to complete the initialization of the type,
 								    PyType_InitDict() must be called.  This replaces slots initialized
 								    to zero in the subtype with the value of the corresponding base
 								    type slots.  (It also fills in tp_dict, the type's dictionary, and
 								    does various other initializations necessary for type objects.)
 								    A subtype is not usable until PyType_InitDict() is called for it;
 								    this is best done during module initialization, assuming the
 								    subtype belongs to a module.  An alternative for subtypes added to
 								    the Python core (which don't live in a particular module) would be
 								    to initialize the subtype in their constructor function.  It is
+								    allowed to call PyType_InitDict() more than once; the second and
+								    further calls have no effect.  In order to avoid unnecessary
 								    calls, a test for tp_dict==NULL can be made.
+								    (During initialization of the Python interpreter, some types are
 								    actually used before they are initialized.  As long as the slots
 								    that are actually needed are initialized, especially tp_dealloc,
 								    this works, but it is fragile and not recommended as a general
 								    practice.)
 								    To create a subtype instance, the subtype's tp_new() slot is
 								    called.  This should first call the base type's tp_new() slot and
 								    then initialize the subtype's additional data members.  To further
 								    initialize the instance, the tp_init() slot is typically called.
 								    Note that the tp_new() slot should *not* call the tp_init() slot;
 								    this is up to tp_new()'s caller (typically a factory function).
 								    There are circumstances where it is appropriate not to call
 								    tp_init().
 								    If a subtype defines a tp_init() slot, the tp_init() slot should
 								    normally first call the base type's tp_init() slot.
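
    The calling order described above, sketched at the Python level for
    a released Python 3 (assuming __new__() and __init__() correspond
    to tp_new() and tp_init()).  The class and its state member are
    hypothetical.

```python
class SpamList(list):
    """Hypothetical list subtype with one extra data member."""

    def __new__(cls, *args, **kwds):
        self = super().__new__(cls)      # base type's tp_new() first
        self.state = 0                   # then the subtype's own members
        return self                      # note: no call to __init__() here

    def __init__(self, iterable=(), state=0):
        super().__init__(iterable)       # base type's tp_init() first
        self.state = state


s = SpamList([1, 2, 3], state=42)        # caller runs __new__, then __init__
raw = SpamList.__new__(SpamList)         # creation without initialization
```

    The instance named raw shows the case where tp_init() is not
    called: it is an empty list whose state member still has the value
    set during creation.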
 								    (XXX There should be a paragraph or two about argument passing
 								    here.)
 								Subtyping in Python
 								    The next step is to allow subtyping of selected built-in types
 								    through a class statement in Python.  Limiting ourselves to single
 								    inheritance for now, here is what happens for a simple class
 								    statement:
 								    class C(B):
 								        var1 = 1
 								        def method1(self): pass
 								        # etc.
 								    The body of the class statement is executed in a fresh environment
 								    (basically, a new dictionary used as local namespace), and then C
 								    is created.  The following explains how C is created.
 								    Assume B is a type object.  Since type objects are objects, and
+								    every object has a type, B has a type.  Since B is itself a type,
 								    we also call its type its metatype.  B's metatype is accessible
 								    via type(B) or B.__class__ (the latter notation is new for types;
 								    it is introduced in PEP 252).  Let's say this metatype is M (for
+								    Metatype).  The class statement will create a new type, C.  Since
 								    C will be a type object just like B, we view the creation of C as
 								    an instantiation of the metatype, M.  The information that needs
+								    to be provided for the creation of a subclass is:
 								      - its name (in this example the string "C");
+								      - its bases (a singleton tuple containing B);
 								      - the results of executing the class body, in the form of a
 								        dictionary (for example {"var1": 1, "method1": <function
 								        method1 at ...>, ...}).
 								    The class statement will result in the following call:
+								        C = M("C", (B,), dict)
 								    (where dict is the dictionary resulting from execution of the
+								    class body).  In other words, the metatype (M) is called.
+								    Note that even though the example has only one base, we still pass
 								    in a (singleton) sequence of bases; this makes the interface
 								    uniform with the multiple-inheritance case.
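
    The call can be written out by hand.  The sketch below uses a
    released Python 3, where an ordinary class already has a metatype;
    the names B, M, C and namespace follow the example above and are
    otherwise arbitrary.

```python
class B:
    pass


# What executing the class body would leave in its local namespace.
namespace = {"var1": 1, "method1": lambda self: None}

M = type(B)                      # the metatype of the base
C = M("C", (B,), namespace)      # what "class C(B): ..." amounts to

print(C.__name__, C.__bases__)
```

    The resulting C behaves like a class created by the class
    statement: it has B as its only base and M as its type.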
 								    In current Python, this is called the "Don Beaudry hook" after its
 								    inventor; it is an exceptional case that is only invoked when a
 								    base class is not a regular class.  For a regular base class (or
 								    when no base class is specified), current Python calls
 								    PyClass_New(), the C level factory function for classes, directly.
 								    Under the new system this is changed so that Python *always*
 								    determines a metatype and calls it as given above.  When one or
 								    more bases are given, the type of the first base is used as the
 								    metatype; when no base is given, a default metatype is chosen.  By
 								    setting the default metatype to PyClass_Type, the metatype of
+								    "classic" classes, the classic behavior of the class statement is
+								    retained.  This default can be changed per module by setting the
 								    global variable __metaclass__.
 								    There are two further refinements here.  First, a useful feature
 								    is to be able to specify a metatype directly.  If the class
 								    statement defines a variable __metaclass__, that is the metatype
-												Just a little bit more cleanup.  Added a TODO list.

											
										
										
											2001-07-10 16:46:24 -04:00
+								    to call.  (Note that setting __metaclass__ at the module level
 								    only affects class statements without a base class and without an
 								    explicit __metaclass__ declaration; but setting __metaclass__ in a
 								    class statement overrides the default metatype unconditionally.)
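
    The selection rules described so far can be sketched in Python as
    follows.  This is only a model: the real decision is made by the
    interpreter in C, and pick_metatype, ClassicStandIn, Meta and Base
    are hypothetical names introduced for illustration.

```python
# Hypothetical helper modelling the rules in the text; not a real API.
_MISSING = object()

def pick_metatype(bases, class_namespace, module_default):
    """Return the metatype a class statement would call."""
    # 1. __metaclass__ set inside the class statement always wins.
    explicit = class_namespace.get("__metaclass__", _MISSING)
    if explicit is not _MISSING:
        return explicit
    # 2. Otherwise the type of the first base is used.
    if bases:
        return type(bases[0])
    # 3. With no bases, fall back to the per-module default.
    return module_default


class ClassicStandIn(object):
    """Stand-in for PyClass_Type, used here as the module default."""

class Meta(type):
    pass

Base = Meta("Base", (), {})
```

    For example, pick_metatype((Base,), {}, ClassicStandIn) yields
    Meta, while pick_metatype((), {}, ClassicStandIn) falls through to
    the module default.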

    Second, with multiple bases, not all bases need to have the same
    metatype.  This is called a metaclass conflict [1].  Some
    metaclass conflicts can be resolved by searching through the set
    of bases for a metatype that derives from all other given
    metatypes.  If such a metatype cannot be found, an exception is
    raised and the class statement fails.

    This conflict resolution can be implemented in the metatype
    itself: the class statement just calls the metatype of the first
    base (or that specified by the __metaclass__ variable), and this
    metatype's constructor looks for the most derived metatype.  If
    that is itself, it proceeds; otherwise, it calls that metatype's
    constructor.  (Ultimate flexibility: another metatype might choose
    to require that all bases have the same metatype, or that there's
    only one base class, or whatever.)
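
    One way such a search for the most derived metatype could look is
    sketched below.  The function name and the example metatypes MA,
    MB and MC are assumptions made for illustration; the sketch does
    not claim to match the prototype implementation line by line.

```python
def most_derived_metatype(bases, first_choice):
    """Find a metatype that derives from the metatypes of all bases.

    'first_choice' is the metatype the class statement called first
    (the type of the first base, or an explicit __metaclass__).
    """
    winner = first_choice
    for base in bases:
        candidate = type(base)
        if issubclass(winner, candidate):
            continue            # winner is already at least as derived
        if issubclass(candidate, winner):
            winner = candidate  # found a more derived metatype
            continue
        raise TypeError("metaclass conflict: "
                        "no metatype derives from all others")
    return winner


class MA(type):
    pass

class MB(MA):       # derives from MA, so MB can serve both bases
    pass

class MC(type):     # unrelated to MA and MB
    pass

A = MA("A", (), {})
B = MB("B", (), {})
C0 = MC("C0", (), {})
```

    With these definitions, most_derived_metatype((A, B), MA) selects
    MB, whereas the bases (A, C0) raise TypeError because neither MA
    nor MC derives from the other.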

    (In [1], a new metaclass is automatically derived that is a
    subclass of all given metaclasses.  But since it is questionable
    in Python how conflicting method definitions of the various
    metaclasses should be merged, I don't think this is feasible.
    Should the need arise, the user can derive such a metaclass
    manually and specify it using the __metaclass__ variable.  It is
    also possible to have a new metaclass that does this.)
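
    A sketch of deriving such a metaclass by hand.  It is written in
    today's Python 3 syntax, which names the metaclass with a
    "metaclass=" keyword in the class statement rather than with the
    __metaclass__ variable described in this PEP; all class names are
    hypothetical.

```python
class MetaA(type):
    pass

class MetaB(type):
    pass

class A(metaclass=MetaA):
    pass

class B(metaclass=MetaB):
    pass

# Neither MetaA nor MetaB derives from the other, so this fails.
try:
    class Broken(A, B):
        pass
except TypeError:
    conflict_detected = True
else:
    conflict_detected = False

# Deriving the combined metaclass manually resolves the conflict.
class MetaAB(MetaA, MetaB):
    pass

class C(A, B, metaclass=MetaAB):
    pass
```

    How MetaAB merges conflicting methods of MetaA and MetaB is left
    to its author, which is exactly the decision that cannot be made
    automatically.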

    Note that calling M requires that M itself has a type: the
    meta-metatype.  And the meta-metatype has a type, the
    meta-meta-metatype.  And so on.  This is normally cut short at
    some level by making a metatype be its own metatype.  This is
    indeed what happens in Python: the ob_type reference in
    PyType_Type is set to &PyType_Type.  In the absence of third party
    metatypes, PyType_Type is the only metatype in the Python
    interpreter.
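
    Seen from Python code, where the built-in 'type' is the object
    that corresponds to PyType_Type, the self-reference looks like
    this.  The class C and the variable 'chain' are illustrative.

```python
# The chain of metatypes stops at 'type', which is its own type.
assert type(type) is type

class C(object):
    pass

# Walk upward: instance -> class -> metatype -> meta-metatype.
chain = [C()]
for _ in range(3):
    chain.append(type(chain[-1]))
```

    After the loop, the last two entries of 'chain' are both 'type':
    asking for the type once more no longer reaches anything new.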

    (In a previous version of this PEP, there was one additional
    meta-level, and there was a meta-metatype called "turtle".  This
    turned out to be unnecessary.)

    In any case, the work for creating C is done by M's tp_new() slot.
    It allocates space for an "extended" type structure, which
    contains space for: the type object; the auxiliary structures
    (as_sequence etc.); the string object containing the type name (to
    ensure that this object isn't deallocated while the type object is
    still referencing it); and some more auxiliary storage (to be
    described later).  It initializes this storage to zeros except for
    a few crucial slots (for example, tp_name is set to point to the
    type name) and then sets the tp_base slot to point to B.  Then
    PyType_InitDict() is called to inherit B's slots.  Finally, C's
    tp_dict slot is updated with the contents of the namespace
    dictionary (the third argument to the call to M).
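
    The visible results of these steps can be inspected from Python.
    The sketch below uses today's Python, where __name__, __base__ and
    __dict__ on a type reflect the tp_name, tp_base and tp_dict slots
    (the last one through a read-only view); B, C and the namespace
    contents are illustrative assumptions.

```python
class B(object):
    def hello(self):
        return "hello from B"

namespace = {"answer": 42}

# Calling the metatype creates the new type object C.
C = type("C", (B,), namespace)

name = C.__name__                        # set from the first argument
base = C.__base__                        # points at B
own_names = set(C.__dict__)              # includes the namespace keys
inherited_call = C().hello()             # behavior inherited from B
```

    Note that 'hello' is found through the base B and is not copied
    into C's own dictionary; only the names from the namespace
    argument end up there.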


Multiple Inheritance

    The Python class statement supports multiple inheritance, and we
    will also support multiple inheritance involving built-in types.

    However, there are some restrictions.  The C runtime architecture
    doesn't make it feasible to have a meaningful subtype of two
    different built-in types except in a few degenerate cases.
    Changing the C runtime to support fully general multiple
    inheritance would be too much of an upheaval of the code base.

    The main problem with multiple inheritance from different built-in
    types stems from the fact that the C implementation of built-in
    types accesses structure members directly; the C compiler
    generates an offset relative to the object pointer and that's
    that.  For example, the list and dictionary type structures each
    declare a number of different but overlapping structure members.
    A C function accessing an object expecting a list won't work when
    passed a dictionary, and vice versa, and there's not much we could
    do about this without rewriting all code that accesses lists and
    dictionaries.  This would be too much work, so we won't do this.
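
    The restriction can be observed directly.  In today's CPython, a
    class statement that tries to combine list and dict is rejected
    with a TypeError; the exact message and the behavior of other
    implementations are not something this sketch relies on.  The
    names ListAndDict and layout_error are illustrative.

```python
# list and dict instances have incompatible C structure lay-outs.
try:
    class ListAndDict(list, dict):
        pass
except TypeError as exc:
    layout_error = exc
else:
    layout_error = None
```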

    The problem with multiple inheritance is caused by conflicting
    structure member allocations.  Classes defined in Python normally
    don't store their instance variables in structure members: they
    are stored in an instance dictionary.  This is the key to a
    partial solution.  Suppose we have the following two classes:

      class A(dictionary):
          def foo(self): pass

      class B(dictionary):
          def bar(self): pass

      class C(A, B): pass

    (Here, 'dictionary' is the type of built-in dictionary objects,
    a.k.a. type({}) or {}.__class__ or types.DictType.)  If we look at
    the structure lay-out, we find that an A instance has the lay-out
    of a dictionary followed by the __dict__ pointer, and a B instance
    has the same lay-out; since there are no structure member lay-out
    conflicts, this is okay.
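
    A runnable variant of this example for today's Python, which
    spells the dictionary type 'dict'.  The method bodies and the
    stored values are illustrative additions.

```python
class A(dict):
    def foo(self):
        return "foo"

class B(dict):
    def bar(self):
        return "bar"

class C(A, B):
    pass

c = C()
c["key"] = "kept in the dictionary part of the structure"
c.note = "kept in the instance __dict__"
```

    The item and the attribute do not interfere: the first lives in
    the dictionary structure shared by A and B, the second in the
    instance dictionary that follows it.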

    Here's another example:

      class X(object):
          def foo(self): pass

      class Y(dictionary):
          def bar(self): pass

      class Z(X, Y): pass

    (Here, 'object' is the base for all built-in types; its structure
    lay-out only contains the ob_refcnt and ob_type members.)  This
    example is more complicated, because the __dict__ pointer for X
    instances has a different offset than that for Y instances.  Where
    is the __dict__ pointer for Z instances?  The answer is that the
    offset for the __dict__ pointer is not hardcoded; it is stored in
    the type object.
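
    The same example in runnable form for today's Python, again using
    'dict' for the dictionary type.  Where the __dict__ pointer sits
    in each structure is an implementation detail that this sketch
    deliberately does not inspect; it only shows that attribute access
    works uniformly for all three classes.

```python
class X(object):
    def foo(self):
        return "foo"

class Y(dict):
    def bar(self):
        return "bar"

class Z(X, Y):
    pass

# Instances of all three classes keep attributes in their own __dict__.
instances = [X(), Y(), Z()]
for obj in instances:
    obj.label = type(obj).__name__

labels = [obj.__dict__["label"] for obj in instances]
```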

    Suppose on a particular machine an 'object' structure is 8 bytes
    long, and a 'dictionary' struct is 60 bytes, and an object pointer
    is 4 bytes.  Then an X structure is 12 bytes (an object structure
    followed by a __dict__ pointer), and a Y structure is 64 bytes (a
    dictionary structure followed by a __dict__ pointer).  The Z
    structure has the same lay-out as the Y structure in this example.
    Each type object (X, Y and Z) has a "__dict__ offset" which is
    used to find the __dict__ pointer.  Thus, the recipe for looking
    up an instance variable is:

      1. get the type of the instance
      2. get the __dict__ offset from the type object
      3. add the __dict__ offset to the instance pointer
      4. look in the resulting address to find a dictionary reference
      5. look up the instance variable name in that dictionary

    Of course, this recipe can only be implemented in C, and I have
    left out some details.  But this allows us to use multiple
    inheritance patterns similar to the ones we can use with classic
    classes.
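
    The recipe and the byte counts from the example machine can be
    modelled in a few lines of Python.  This is a toy simulation, not
    how the interpreter is written: "memory" is a plain dict keyed by
    address, and ToyType, allocate and lookup are hypothetical names.

```python
OBJECT_SIZE = 8     # an 'object' structure on the example machine
DICT_SIZE = 60      # a 'dictionary' struct on the example machine
POINTER_SIZE = 4    # an object pointer on the example machine

class ToyType:
    def __init__(self, name, basic_size):
        self.name = name
        # The __dict__ pointer follows the base structure.
        self.dict_offset = basic_size
        self.instance_size = basic_size + POINTER_SIZE

X = ToyType("X", OBJECT_SIZE)
Y = ToyType("Y", DICT_SIZE)
Z = ToyType("Z", DICT_SIZE)     # same lay-out as Y

memory = {}     # toy address space: address -> stored value

def allocate(toy_type, address, **attrs):
    memory[address] = toy_type                             # stand-in for ob_type
    memory[address + toy_type.dict_offset] = dict(attrs)   # __dict__ pointer
    return address

def lookup(address, name):
    toy_type = memory[address]              # 1. type of the instance
    offset = toy_type.dict_offset           # 2. __dict__ offset from the type
    dict_address = address + offset         # 3. add it to the instance pointer
    instance_dict = memory[dict_address]    # 4. dictionary reference found there
    return instance_dict[name]              # 5. look up the variable name

x = allocate(X, 1000, color="red")
z = allocate(Z, 2000, color="blue")
```

    Here X.instance_size is 12 and Y.instance_size is 64, matching
    the figures above, and lookup() finds 'color' at offset 8 for the
    X instance but at offset 60 for the Z instance without any code
    knowing those offsets in advance.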

    XXX I should write up the complete algorithm here to determine
    base class compatibility, but I can't be bothered right now.  Look
    at best_base() in typeobject.c in the implementation mentioned
    below.


XXX To be done

    Additional topics to be discussed in this PEP:

      - class methods and static methods

      - mapping between type object slots (tp_foo) and special methods
        (__foo__) (actually, this may belong in PEP 252)

      - built-in names for built-in types (object, int, str, list etc.)

      - method resolution order

      - __dict__

      - __slots__

      - the HEAPTYPE and DYNAMICTYPE flag bits

      - GC support

      - API docs for all the new functions

      - high level user overview


Implementation

    A prototype implementation of this PEP (and for PEP 252) is
    available from CVS as a branch named "descr-branch".  To
    experiment with this implementation, proceed to check out Python
    from CVS according to the instructions at
    http://sourceforge.net/cvs/?group_id=5470 but add the arguments
    "-r descr-branch" to the cvs checkout command.  (You can also
    start with an existing checkout and do "cvs update -r
    descr-branch".)  For some examples of the features described here,
    see the file Lib/test/test_descr.py and the extension module
    Modules/xxsubtype.c.


References

    [1] "Putting Metaclasses to Work", by Ira R. Forman and Scott
        H. Danforth, Addison-Wesley 1999.
        (http://www.aw.com/product/0,2627,0201433052,00.html)


Copyright

    This document has been placed in the public domain.



Local Variables:
mode: indented-text
indent-tabs-mode: nil
End: