Another intermediate checkin. Removed a lot of lies about an older
idea for what tp_alloc() should be.
2001-07-10 20:01:52 +00:00 · 14f1593cc7
parent 15299026e7
commit 14f1593cc7
1 changed files with 141 additions and 210 deletions
--- a/pep-0253.txt
+++ b/pep-0253.txt
@@ -18,11 +18,11 @@ Introduction
    Traditionally, types in Python have been created statically, by
    declaring a global variable of type PyTypeObject and initializing
-    it with a static initializer.  The fields in the type object
+    it with a static initializer.  The slots in the type object
    describe all aspects of a Python type that are relevant to the
-    Python interpreter.  A few fields contain dimensional information
+    Python interpreter.  A few slots contain dimensional information
    (like the basic allocation size of instances), others contain
-    various flags, but most fields are pointers to functions to
+    various flags, but most slots are pointers to functions to
    implement various kinds of behaviors.  A NULL pointer means that
    the type does not implement the specific behavior; in that case
    the system may provide a default behavior or raise an
@@ -74,7 +74,7 @@ Introduction
    For binary compatibility, a flag bit in the tp_flags slot
    indicates the existence of the various new slots in the type
    object introduced below.  Types that don't have the
-    Py_TPFLAGS_HAVE_CLASS bit set in their tp_flags field are assumed
+    Py_TPFLAGS_HAVE_CLASS bit set in their tp_flags slot are assumed
    to have NULL values for all the subtyping slots.  (Warning: the
    current implementation prototype is not yet consistent in its
    checking of this flag bit.  This should be fixed before the final
@@ -251,6 +251,12 @@ Making a type a factory for its instances
    dictionary an initial set of keys and values based on the
    arguments passed in.
    Note that for immutable object types, the initialization cannot be
    done by the tp_init() slot: this would provide the Python user
    with a way to change the initialization.  Therefore, immutable
    objects typically have an empty tp_init() implementation and do
    all their initialization in their tp_new() slot.
    You may wonder why the tp_new() slot shouldn't call the tp_init()
    slot itself.  The reason is that in certain circumstances (like
    support for persistent objects), it is important to be able to
@@ -273,13 +279,13 @@ Making a type a factory for its instances
    There's a third slot related to object creation: tp_alloc().  Its
    responsibility is to allocate the memory for the object,
-    initialize the reference count and type pointer field, and
+    initialize the reference count (ob_refcnt) and the type pointer
-    initialize the rest of the object to all zeros.  It should also
+    (ob_type), and initialize the rest of the object to all zeros.  It
-    register the object with the garbage collection subsystem if the
+    should also register the object with the garbage collection
-    type supports garbage collection.  This slot exists so that
+    subsystem if the type supports garbage collection.  This slot
-    derived types can override the memory allocation policy
+    exists so that derived types can override the memory allocation
-    (e.g. which heap is being used) separately from the initialization
+    policy (e.g. which heap is being used) separately from the
-    code.  The signature is:
+    initialization code.  The signature is:
        PyObject *tp_alloc(PyTypeObject *type, int nitems)
@@ -294,8 +300,13 @@ Making a type a factory for its instances
    function of the base class must call the tp_alloc() slot of the
    type passed in as its first argument.  It is the tp_new()
    function's responsibility to calculate the number of items.  The
-    tp_alloc() slot will set the ob_size field of the new object if
+    tp_alloc() slot will set the ob_size member of the new object if
-    the type->tp_itemsize field is nonzero.
+    the type->tp_itemsize member is nonzero.
    (Note: in certain debugging compilation modes, the type structure
    used to have members named tp_alloc and tp_free already,
    counters for the number of allocations and deallocations.  These
    are renamed to tp_allocs and tp_deallocs.)
    XXX The keyword arguments are currently not passed to tp_new();
    its kwds argument is always NULL.  This is a relic from a previous
@@ -304,189 +315,99 @@ Making a type a factory for its instances
    should check that the arguments are acceptable, because they may
    be called independently.
    Standard implementations for tp_alloc() and tp_new() are
    available.  PyType_GenericAlloc() allocates an object from the
    standard heap and initializes it properly.  It uses the above
    formula to determine the amount of memory to allocate, and takes
    care of GC registration.  The only reason not to use this
    implementation would be to allocate objects from a different heap
    (as is done by some very small frequently used objects like ints
    and tuples).  PyType_GenericNew() adds very little: it just calls
    the type's tp_alloc() slot with zero for nitems.  But for mutable
    types that do all their initialization in their tp_init() slot,
    this may be just the ticket.
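
The behavior described for a generic tp_alloc() can be sketched in plain C. This is only an illustration under stated assumptions, not the code of PyType_GenericAlloc(): the sketch_* structures and functions are hypothetical stand-ins for the interpreter's own, sizeof(void *) stands in for sizeof(PyObject *), and GC registration is left out. The size computation follows the rule quoted elsewhere in this diff (tp_basicsize + n*tp_itemsize, rounded up to a multiple of the pointer size).

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins; the real PyObject and PyTypeObject have
   many more members. */
typedef struct sketch_type sketch_type;

typedef struct {
    int ob_refcnt;
    sketch_type *ob_type;
    int ob_size;              /* only meaningful if tp_itemsize != 0 */
} sketch_varobject;

struct sketch_type {
    size_t tp_basicsize;      /* must be >= sizeof(sketch_varobject) */
    size_t tp_itemsize;
};

/* tp_basicsize + nitems*tp_itemsize, rounded up to the next multiple
   of sizeof(void *). */
static size_t
sketch_alloc_size(const sketch_type *type, int nitems)
{
    size_t size = type->tp_basicsize + (size_t)nitems * type->tp_itemsize;
    size_t align = sizeof(void *);
    return (size + align - 1) / align * align;
}

/* Allocate from the standard heap, zero the whole object, then fill
   in the reference count, the type pointer and, for variable-size
   types, ob_size. */
static sketch_varobject *
sketch_generic_alloc(sketch_type *type, int nitems)
{
    sketch_varobject *obj;

    assert(type->tp_basicsize >= sizeof(sketch_varobject));
    obj = calloc(1, sketch_alloc_size(type, nitems));
    if (obj == NULL)
        return NULL;
    obj->ob_refcnt = 1;
    obj->ob_type = type;
    if (type->tp_itemsize != 0)
        obj->ob_size = nitems;
    return obj;
}
```

A caller that only needs zeroed memory with a valid header (the PyType_GenericNew() case) would call sketch_generic_alloc(type, 0) and do nothing else.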
 Requirements for a type to allow subtyping
-    The simplest form of subtyping is subtyping in C.  It is the
+Preparing a type for subtyping
    simplest form because we can require the C code to be aware of the
    various problems, and it's acceptable for C code that doesn't
    follow the rules to dump core.  For added simplicity, it is
    limited to single inheritance.
    The idea behind subtyping is very similar to that of single
    inheritance in C++.  A base type is described by a structure
-    declaration plus a type object.  A derived type can extend the
+    declaration (similar to the C++ class declaration) plus a type
-    structure (but must leave the names, order and type of the fields
+    object (similar to the C++ vtable).  A derived type can extend the
    structure (but must leave the names, order and type of the members
    of the base structure unchanged) and can override certain slots in
-    the type object, leaving others the same.
+    the type object, leaving others the same.  (Unlike C++ vtables,
    all Python type objects have the same memory lay-out.)
-    Most issues have to do with construction and destruction of
+    The base type must do the following:
    instances of derived types.
-    Creation of a new object is separated into allocation and
+    - Add the flag value Py_TPFLAGS_BASETYPE to tp_flags.
-    initialization: allocation allocates the memory, and
+    - Declare and use tp_new(), tp_alloc() and optional tp_init() slots.
-    initialization fill it with appropriate initial values.  The
+    - Declare and use tp_dealloc() and tp_free().
-    separation is needed for the convenience of subtypes.
+    - Export its object structure declaration.
-    Instantiation of a subtype goes as follows:
+    - Export a subtyping-aware type-checking macro.
-        1. allocate memory for the whole (subtype) instance
+    The requirements and signatures for tp_new(), tp_alloc() and
-        2. initialize the base type
+    tp_init() have already been discussed above: tp_alloc() should
-        3. initialize the subtype's instance variables
+    allocate the memory and initialize it to mostly zeros; tp_new()
    should call the tp_alloc() slot and then proceed to do the
    minimally required initialization; tp_init() should be used for
    more extensive initialization of mutable objects.
-    If allocation and initialization were done by the same function,
+    It should come as no surprise that there are similar conventions
-    you would need a way to tell the base type's constructor to
+    at the end of an object's lifetime.  The slots involved are
-    allocate additional memory for the subtype's instance variables,
+    tp_dealloc() (familiar to all who have ever implemented a Python
-    and there would be no way to change the allocation method for a
+    extension type) and tp_free(), the new kid on the block.  (The
-    subtype (without giving up on calling the base type to initialize
+    names aren't quite symmetric; tp_free() corresponds to tp_alloc(),
-    its part of the instance structure).
+    which is fine, but tp_dealloc() corresponds to tp_new().  Maybe
    the tp_dealloc slot should be renamed?)
-    A similar reasoning applies to destruction: if a subtype changes
+    The tp_free() slot should be used to free the memory and
-    the instance allocator (for example to use a different heap), it
+    unregister the object with the garbage collection subsystem, and
-    must also change the instance deallocator; but it must still call
+    can be overridden by a derived class; tp_dealloc() should
-    on the base type's destructor to DECREF the base type's instance
+    deinitialize the object (e.g. by calling Py_XDECREF() for various
-    variables.
+    sub-objects) and then call tp_free() to deallocate the memory.
    The signature for tp_dealloc() is the same as it always was:
-    In this proposal, I assign stricter meanings to two existing
+        void tp_dealloc(PyObject *object)
    slots for deallocation and deinitialization, and I add two new
    slots for allocation and initialization.
-    The tp_clear slot gets the new task of deinitializing an object so
+    The signature for tp_free() is the same:
    that all that remains to be done is free its memory.  Originally,
    all it had to do was clear object references.  The difference is
    subtle: the list and dictionary objects contain references to an
    additional heap-allocated piece of memory that isn't freed by
    tp_clear in Python 2.1, but which must be freed by tp_clear under
    this proposal. It should be safe to call tp_clear repeatedly on
    the same object.  If an object contains no references to other
    objects or heap-allocated memory, the tp_clear slot may be NULL.
-    The only additional requirement for the tp_dealloc slot is that it
+        void tp_free(PyObject *object)
    should do the right thing whether or not tp_clear has been called.
-    The new slots are tp_alloc for allocation and tp_init for
+    (In a previous version of this PEP, there was also a role reserved
-    initialization.  Their signatures:
+    for the tp_clear() slot.  This turned out to be a bad idea.)
-        PyObject *tp_alloc(PyTypeObject *type,
+    In order to be usefully subtyped in C, a type must export the
                           PyObject *args,
                           PyObject *kwds)
        int tp_init(PyObject *self,
                    PyObject *args,
                    PyObject *kwds)
    [XXX We'll have to rename tp_alloc to something else, because in
    debug mode there's already a tp_alloc field.]
    The arguments for tp_alloc are the same as for tp_new, described
    above.  The arguments for tp_init are the same except that the
    first argument is replaced with the instance to be initialized.
    Its return value is 0 for success or -1 for failure.
    It is possible that tp_init is called more than once or not at
    all.  The implementation should allow this usage.  The object may
    be non-functional until tp_init is called, and a second call to
    tp_init may raise an exception, but it should not be possible to
    cause a core dump or memory leakage this way.
    Because tp_init is in a sense optional, tp_alloc is required to do
    *some* initialization of the object.  It must initialize ob_refcnt
    to 1 and ob_type to its type argument.  It should zero out the
    rest of the object.
    The constructor arguments are passed to tp_alloc so that for
    variable-size objects (like tuples and strings) it knows to
    allocate the right amount of memory.
    For immutable types, tp_alloc may have to do the full
    initialization; otherwise, different calls to tp_init might cause
    an immutable object to be modified, which is considered a grave
    offense in Python (unlike in Fortran :-).
    Not every type can serve as a base type.  The assumption is made
    that if a type has a non-NULL value in its tp_init slot, it is
    ready to be subclassed; otherwise, it is not, and using it as a
    base class will raise an exception.
    In order to be usefully subtyped in C, a type must also export the
    structure declaration for its instances through a header file, as
    it is needed in order to derive a subtype.  The type object for
    the base type must also be exported.
    If the base type has a type-checking macro (like PyDict_Check()),
-    this macro probably should be changed to recognize subtypes.  This
+    this macro should be made to recognize subtypes.  This can be done
-    can be done by using the new PyObject_TypeCheck(object, type)
+    by using the new PyObject_TypeCheck(object, type) macro, which
-    macro, which calls a function that follows the base class links.
+    calls a function that follows the base class links.
-    (An argument against changing the type-checking macro could be
+    The PyObject_TypeCheck() macro contains a slight optimization: it
-    that the type check is used frequently and a function call would
+    first compares object->ob_type directly to the type argument, and
-    slow things down too much, but I find this hard to believe.  One
+    if this is a match, bypasses the function call.  This should make
-    could also fear that a subtype might break an invariant assumed by
+    it fast enough for most situations.
    the support functions of the base type.  Usually it is best to
    change the base type to remove this reliance, at least to the
    point of raising an exception rather than dumping core when the
    invariant is broken.)
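
The fast path described for PyObject_TypeCheck() might look roughly like the following. This is a sketch, not the macro from the implementation: sketch_type, sketch_object, sketch_is_subtype() and SKETCH_TYPE_CHECK() are hypothetical names, and only the tp_base link is modeled.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, heavily reduced type and object headers. */
typedef struct sketch_type {
    const char *tp_name;
    struct sketch_type *tp_base;   /* NULL for a root type */
} sketch_type;

typedef struct {
    int ob_refcnt;
    sketch_type *ob_type;
} sketch_object;

/* Slow path: follow the base class links until a match or the end. */
static int
sketch_is_subtype(const sketch_type *a, const sketch_type *b)
{
    while (a != NULL) {
        if (a == b)
            return 1;
        a = a->tp_base;
    }
    return 0;
}

/* Fast path first: an exact type match skips the function call. */
#define SKETCH_TYPE_CHECK(ob, tp) \
    ((ob)->ob_type == (tp) || sketch_is_subtype((ob)->ob_type, (tp)))
```

With this shape, checking an instance of the exact type costs one pointer comparison; only instances of other types, including subtypes, pay for the function call.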
-    Here are the interactions between tp_alloc, tp_clear, tp_dealloc
+    Note that this change in the type-checking macro means that C
-    and subtypes; all assuming that the base type defines tp_init
+    functions that require an instance of the base type may be invoked
-    (otherwise it cannot be subtyped anyway):
+    with instances of the derived type.  Before enabling subtyping of
-
+    a particular type, its code should be checked to make sure that
-    - If the base type's allocation scheme doesn't use the standard
+    this won't break anything.
      heap, it should not define tp_alloc.  This is a signal for the
      subclass to provide its own tp_alloc *and* tp_dealloc
      implementation (probably using the standard heap).
    - If the base type's tp_dealloc does anything besides calling
      PyObject_DEL() (typically, calling Py_XDECREF() on contained
      objects or freeing dependent memory blocks), it should define a
      tp_clear that does the same without calling PyObject_DEL(), and
      which checks for zero pointers before and zeros the pointers
      afterwards, so that calling tp_clear more than once or calling
      tp_dealloc after tp_clear will not attempt to DECREF or free the
      same object/memory twice.  (It should also be allowed to
      continue using the object after tp_clear -- tp_clear should
      simply reset the object to its pristine state.)
    - If the derived type overrides tp_alloc, it should also override
      tp_dealloc, and tp_dealloc should call the derived type's
      tp_clear if non-NULL (or its own tp_clear).
    - If the derived type overrides tp_clear, it should call the base
      type's tp_clear if non-NULL.
    - If the base type defines tp_init as well as tp_new, its tp_new
      should be inheritable: it should call the tp_alloc and the
      tp_init of the type passed in as its first argument.
    - If the base type defines tp_init as well as tp_alloc, its
      tp_alloc should be inheritable: it should look in the
      tp_basicsize slot of the type passed in for the amount of memory
      to allocate, and it should initialize all allocated bytes to
      zero.
    - For types whose tp_itemsize is nonzero, the allocation size used
      in tp_alloc should be tp_basicsize + n*tp_itemsize, rounded up
      to the next integral multiple of sizeof(PyObject *), where n is
      the number of items determined by the arguments to tp_alloc.
    - Things are further complicated by the garbage collection API.
      This affects tp_basicsize, and the actions to be taken by
      tp_alloc.  tp_alloc should look at the Py_TPFLAGS_GC flag bit in
      the tp_flags field of the type passed in, and not assume that
      this is the same as the corresponding bit in the base type.  (In
      part, the GC API is at fault; Neil Schemenauer has a patch that
      fixes the API, but it is currently backwards incompatible.)
    Note: the rules here are very complicated -- probably too
    complicated.  It may be better to give up on subtyping immutable
    types, types with custom allocators, and types with variable size
    allocation (such as int, string and tuple) -- then the rules can
    be much simplified because you can assume allocation on the
    standard heap, no requirement beyond zeroing memory in tp_alloc,
    and no variable length allocation.
 Creating a subtype of a built-in type in C
    The simplest form of subtyping is subtyping in C.  It is the
    simplest form because we can require the C code to be aware of
    some of the problems, and it's acceptable for C code that doesn't
    follow the rules to dump core.  For added simplicity, it is
    limited to single inheritance.
    Let's assume we're deriving from a mutable base type whose
    tp_itemsize is zero.  The subtype code is not GC-aware, although
    it may inherit GC-awareness from the base type (this is
@@ -501,85 +422,95 @@ Creating a subtype of a built-in type in C
        int state;
    } spamlistobject;
-    Note that the base type structure field (here PyListObject) must
+    Note that the base type structure member (here PyListObject) must
-    be the first field in the structure; any following fields are
+    be the first member of the structure; any following members are
-    extension fields.  Also note that the base type is not referenced
+    additions.  Also note that the base type is not referenced via a
-    via a pointer; the actual contents of its structure must be
+    pointer; the actual contents of its structure must be included!
-    included! (The goal is for the memory lay out of the beginning of
+    (The goal is for the memory layout of the beginning of the
-    the subtype instance to be the same as that of the base type
+    subtype instance to be the same as that of the base type
    instance.)
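
The layout requirement can be illustrated without the interpreter headers. In this sketch, sketch_baseobject is a hypothetical stand-in for a base structure such as PyListObject, and sketch_spamobject plays the role of spamlistobject; neither is the real declaration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical base structure: an object header plus one data member. */
typedef struct {
    int ob_refcnt;
    void *ob_type;
    int length;
} sketch_baseobject;

/* Derived structure: the base is embedded by value as the FIRST
   member, so a pointer to the derived object is also a valid pointer
   to the base object. */
typedef struct {
    sketch_baseobject base;
    int state;                /* addition made by the subtype */
} sketch_spamobject;

/* A function written for the base type; it knows nothing about the
   subtype's extra members. */
static int
sketch_base_length(sketch_baseobject *self)
{
    return self->length;
}
```

Because the base structure sits at offset zero, code written for the base type can be handed a pointer to a subtype instance after a cast; had the base been referenced through a pointer member instead, that cast would read the wrong memory.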
    Next, the derived type must declare a type object and initialize
    it.  Most of the slots in the type object may be initialized to
    zero, which is a signal that the base type slot must be copied
-    into it.  Some fields that must be initialized properly:
+    into it.  Some slots that must be initialized properly:
    - The object header must be filled in as usual; the type should be
      &PyType_Type.
-    - The tp_basicsize field must be set to the size of the subtype
+    - The tp_basicsize slot must be set to the size of the subtype
      instance struct (in the above example: sizeof(spamlistobject)).
-    - The tp_base field must be set to the address of the base type's
+    - The tp_base slot must be set to the address of the base type's
      type object.
-    - If the derived slot defines any pointer fields, the tp_dealloc
+    - If the derived type defines any pointer members, the tp_dealloc
      slot function requires special attention, see below; otherwise,
      it can be set to zero, to inherit the base type's deallocation
      function.
-    - The tp_flags field must be set to the usual Py_TPFLAGS_DEFAULT
+    - The tp_flags slot must be set to the usual Py_TPFLAGS_DEFAULT
      value.
-    - The tp_name field must be set; it is recommended to set tp_doc
+    - The tp_name slot must be set; it is recommended to set tp_doc
      as well (these are not inherited).
-    Exception: if the subtype defines no additional fields in its
+    If the subtype defines no additional structure members (it only
-    structure (it only defines new behavior, no new data), the
+    defines new behavior, no new data), the tp_basicsize and the
-    tp_basicsize and the tp_dealloc fields may be set to zero.
+    tp_dealloc slots may be left set to zero.
    In order to complete the initialization of the type,
    PyType_InitDict() must be called.  This replaces zero slots in the
    subtype with the value of the corresponding base type slots.  (It
    also fills in tp_dict, the type's dictionary, and does various
    other initializations necessary for type objects.)
    The subtype's tp_dealloc slot deserves special attention.  If the
-    derived type defines no additional pointers that need to be
+    derived type defines no additional pointer members that need to be
    DECREF'ed or freed when the object is deallocated, it can be set
-    to zero.  Otherwise, the subtype's deallocation function must call
+    to zero.  Otherwise, the subtype's tp_dealloc() function must call
-    Py_XDECREF() for any PyObject * fields and the correct memory
+    Py_XDECREF() for any PyObject * members and the correct memory
    freeing function for any other pointers it owns, and then call the
-    base class's tp_dealloc slot.  Because deallocation functions
+    base class's tp_dealloc() slot.  This call has to be made via the
-    typically are not exported, this call has to be made via the base
+    base type's type structure, for example, when deriving from the
    type's type structure, for example, when deriving from the
    standard list type:
        PyList_Type.tp_dealloc(self);
-    (If the subtype uses a different allocation heap than the base
+    If the subtype wants to use a different allocation heap than the
-    type, the subtype must call the base type's tp_clear() slot
+    base type, the subtype must override both the tp_alloc() and the
-    instead, followed by a call to free the object's memory from the
+    tp_free() slots.  These will be called by the base class's
-    appropriate heap, such as PyObject_DEL(self) if the subtype uses
+    tp_new() and tp_dealloc() slots, respectively.
-    the standard heap.  But in this case subtyping is not
+
-    recommended.)
+    In order to complete the initialization of the type,
    PyType_InitDict() must be called.  This replaces slots initialized
    to zero in the subtype with the value of the corresponding base
    type slots.  (It also fills in tp_dict, the type's dictionary, and
    does various other initializations necessary for type objects.)
    A subtype is not usable until PyType_InitDict() is called for it;
    this is best done during module initialization, assuming the
    subtype belongs to a module.  An alternative for subtypes added to
    the Python core (which don't live in a particular module) would be
    to initialize the subtype in their constructor function.  It is
-    allowed to call PyType_InitDict() more than once, the second and
+    allowed to call PyType_InitDict() more than once; the second and
    further calls have no effect.  In order to avoid unnecessary
    calls, a test for tp_dict==NULL can be made.
-    To create a subtype instance, the base type's tp_alloc slot must
+    (During initialization of the Python interpreter, some types are
-    be called with the subtype as its first argument.  Then, if the
+    actually used before they are initialized.  As long as the slots
-    base type has a tp_init slot, that must be called to initialize
+    that are actually needed are initialized, especially tp_dealloc,
-    the base portion of the instance; finally the subtype's own fields
+    this works, but it is fragile and not recommended as a general
-    must be initialized.  After allocation, the initialization can
+    practice.)
-    also be done by calling the subtype's tp_init slot, assuming this
+
-    correctly calls its base type's tp_init slot.
+    To create a subtype instance, the subtype's tp_new() slot is
    called.  This should first call the base type's tp_new() slot and
    then initialize the subtype's additional data members.  To further
    initialize the instance, the tp_init() slot is typically called.
    Note that the tp_new() slot should *not* call the tp_init() slot;
    this is up to tp_new()'s caller (typically a factory function).
    There are circumstances where it is appropriate not to call
    tp_init().
    If a subtype defines a tp_init() slot, the tp_init() slot should
    normally first call the base type's tp_init() slot.
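
The calling conventions above (tp_new() goes through the base type's tp_new(), tp_init() chains to the base tp_init(), tp_dealloc() ends by calling tp_free()) can be put together in one self-contained sketch. Everything here is hypothetical: the sketch_* types are reduced stand-ins, tp_new() and tp_init() take a simplified argument list instead of args and kwds, and the subtype's slots are filled in by hand rather than inherited by PyType_InitDict().

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

typedef struct sketch_type sketch_type;
typedef struct { int ob_refcnt; sketch_type *ob_type; } sketch_object;

struct sketch_type {                       /* reduced slot table */
    size_t tp_basicsize;
    sketch_object *(*tp_alloc)(sketch_type *type, int nitems);
    sketch_object *(*tp_new)(sketch_type *type);
    int (*tp_init)(sketch_object *self, int value);
    void (*tp_dealloc)(sketch_object *self);
    void (*tp_free)(sketch_object *self);
};

typedef struct { sketch_object head; int *items; } sketch_base;
typedef struct { sketch_base base; int state; } sketch_spam;

static int sketch_free_count = 0;          /* lets a caller observe tp_free */

static sketch_object *sketch_alloc(sketch_type *type, int nitems)
{
    sketch_object *ob = calloc(1, type->tp_basicsize);
    (void)nitems;                          /* fixed-size objects only */
    if (ob != NULL) { ob->ob_refcnt = 1; ob->ob_type = type; }
    return ob;
}

static void sketch_free(sketch_object *self)
{
    sketch_free_count++;
    free(self);
}

/* Base tp_new: allocate via the slot of the type passed in, so that a
   subtype's size and allocator are respected.  It does not call tp_init. */
static sketch_object *base_new(sketch_type *type)
{
    return type->tp_alloc(type, 0);
}

/* Base tp_init: safe to call more than once. */
static int base_init(sketch_object *self, int value)
{
    sketch_base *b = (sketch_base *)self;
    if (b->items == NULL && (b->items = malloc(sizeof(int))) == NULL)
        return -1;
    b->items[0] = value;
    return 0;
}

/* Base tp_dealloc: deinitialize, then free through the object's own type. */
static void base_dealloc(sketch_object *self)
{
    free(((sketch_base *)self)->items);
    self->ob_type->tp_free(self);
}

static sketch_type sketch_base_type = {
    sizeof(sketch_base), sketch_alloc, base_new, base_init,
    base_dealloc, sketch_free
};

/* Subtype tp_new: base tp_new first, then the subtype's own members. */
static sketch_object *spam_new(sketch_type *type)
{
    sketch_object *self = sketch_base_type.tp_new(type);
    if (self != NULL)
        ((sketch_spam *)self)->state = -1;
    return self;
}

/* Subtype tp_init: call the base tp_init first. */
static int spam_init(sketch_object *self, int value)
{
    if (sketch_base_type.tp_init(self, value) < 0)
        return -1;
    ((sketch_spam *)self)->state = value * 2;
    return 0;
}

/* The subtype adds no pointer members, so it reuses base_dealloc. */
static sketch_type sketch_spam_type = {
    sizeof(sketch_spam), sketch_alloc, spam_new, spam_init,
    base_dealloc, sketch_free
};
```

A factory would call sketch_spam_type.tp_new(&sketch_spam_type), then tp_init() on the result, and eventually ob->ob_type->tp_dealloc(ob); skipping the tp_init() call still leaves an object that can be deallocated safely, which is the property the text asks for.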
    (XXX There should be a paragraph or two about argument passing
    here.)
 Subtyping in Python