Another intermediate update. I've rewritten the requirements for a
base type to be subtypable. Needs way more work!
parent 137a35ac05
commit b54a36962d
451
pep-0253.txt
@@ -10,14 +10,16 @@ Post-History:
|
|||
|
||||
Abstract
|
||||
|
||||
This PEP proposes ways for creating subtypes of existing built-in
|
||||
types, either in C or in Python. The text is currently long and
|
||||
rambling; I'll go over it again later to make it shorter.
|
||||
This PEP proposes additions to the type object API that will allow
|
||||
the creation of subtypes of built-in types, in C and in Python.
|
||||
|
||||
|
||||
Introduction
|
||||
|
||||
Traditionally, types in Python have been created statically, by
|
||||
declaring a global variable of type PyTypeObject and initializing
|
||||
it with a static initializer. The fields in the type object
|
||||
describe all aspects of a Python object that are relevant to the
|
||||
describe all aspects of a Python type that are relevant to the
|
||||
Python interpreter. A few fields contain dimensional information
|
||||
(e.g. the basic allocation size of instances), others contain
|
||||
various flags, but most fields are pointers to functions to
|
||||
|
@@ -26,39 +28,57 @@ Abstract
|
|||
the system may provide a default behavior in that case or raise an
|
||||
exception when the behavior is invoked. Some collections of
|
||||
function pointers that are usually defined together are obtained
|
||||
indirectly via a pointer to an additional structure containing.
|
||||
indirectly via a pointer to an additional structure containing
|
||||
more function pointers.
|
||||
|
||||
While the details of initializing a PyTypeObject structure haven't
|
||||
been documented as such, they are easily glanced from the examples
|
||||
been documented as such, they are easily gleaned from the examples
|
||||
in the source code, and I am assuming that the reader is
|
||||
sufficiently familiar with the traditional way of creating new
|
||||
Python types in C.
|
||||
|
||||
This PEP will introduce the following features:
|
||||
|
||||
- a type, like a class, can be a factory for its instances
|
||||
- a type can be a factory function for its instances
|
||||
|
||||
- types can be subtyped in C by specifying a base type pointer
|
||||
- types can be subtyped in C
|
||||
|
||||
- types can be subtyped in Python using the class statement
|
||||
- types can be subtyped in Python with the class statement
|
||||
|
||||
- multiple inheritance from types (insofar as practical)
|
||||
- multiple inheritance from types is supported (insofar as
|
||||
practical)
|
||||
|
||||
- the standard coercions (int, tuple, str etc.) will be the
|
||||
corresponding type objects
|
||||
- the standard coercion functions (int, tuple, str, etc.) will be
|
||||
redefined to be the corresponding type objects, which serve as
|
||||
their own factory functions
|
||||
|
||||
- a standard type hierarchy
|
||||
- there will be a standard type hierarchy
|
||||
|
||||
- a class statement can contain a metaclass declaration,
|
||||
specifying the metaclass to be used to create the new class
|
||||
|
||||
- a class statement can contain a slots declaration, specifying
|
||||
the specific names of the instance variables supported
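
Several of the features listed above can be sketched in Python. This is only an illustration under assumptions: the draft does not fix any syntax, so the `__slots__` spelling is borrowed from later Python releases, and the names `Stack` and `Point` are hypothetical.

```python
# A built-in "coercion function" is a type object that acts as its own factory.
number = int("42")

# A built-in type subtyped with the class statement (Stack is a hypothetical name).
class Stack(list):
    def push(self, item):
        self.append(item)

# A slots declaration names the instance variables that are supported.
# The __slots__ spelling is an assumption taken from later Python releases.
class Point:
    __slots__ = ("x", "y")

    def __init__(self, x, y):
        self.x = x
        self.y = y

stack = Stack()
stack.push(number)
point = Point(1, 2)
```

Here `Stack` inherits every list operation and adds one method, while `Point` instances carry only the two named variables.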
|
||||
|
||||
This PEP builds on pep-0252, which adds standard introspection to
|
||||
types; in particular, types are assumed to have e.g. a __hash__
|
||||
method when the type object defines the tp_hash slot. pep-0252 also
|
||||
adds a dictionary to type objects which contains all methods. At
|
||||
the Python level, this dictionary is read-only; at the C level, it
|
||||
is accessible directly (but modifying it is not recommended except
|
||||
as part of initialization).
|
||||
types; e.g., when the type object defines the tp_hash slot, the
|
||||
type object has a __hash__ method. pep-0252 also adds a
|
||||
dictionary to type objects which contains all methods. At the
|
||||
Python level, this dictionary is read-only for built-in types; at
|
||||
the C level, it is accessible directly (but it should not be
|
||||
modified except as part of initialization).
|
||||
|
||||
For binary compatibility, a flag bit in the tp_flags slot
|
||||
indicates the existence of the various new slots in the type
|
||||
object introduced below. Types that don't have the
|
||||
Py_TPFLAGS_HAVE_CLASS bit set in their tp_flags field are assumed
|
||||
to have NULL values for all the subtyping slots. (Warning: the
|
||||
current implementation prototype is not yet consistent in its
|
||||
checking of this flag bit. This should be fixed before the final
|
||||
release.)
|
||||
|
||||
|
||||
Metatypes
|
||||
About metatypes
|
||||
|
||||
Inevitably the following discussion will come to mention metatypes
|
||||
(or metaclasses). Metatypes are nothing new in Python: Python has
|
||||
|
@@ -75,26 +95,27 @@ Metatypes
|
|||
|
||||
In this example, type(a) is a "regular" type, and type(type(a)) is
|
||||
a metatype. While as distributed all types have the same metatype
|
||||
(which is also its own metatype), this is not a requirement, and
|
||||
in fact a useful 3rd party extension (ExtensionClasses by Jim
|
||||
Fulton) creates an additional metatype. A related feature is the
|
||||
"Don Beaudry hook", which says that if a metatype is callable, its
|
||||
instances (which are regular types) can be subclassed (really
|
||||
subtyped) using a Python class statement. We will use this rule
|
||||
to support subtyping of built-in types, and in fact it greatly
|
||||
simplifies the logic of class creation to always simply call the
|
||||
metatype. When no base class is specified, a default metatype is
|
||||
called -- the default metatype is the "ClassType" object, so the
|
||||
class statement will behave as before in the normal case.
|
||||
(PyType_Type, which is also its own metatype), this is not a
|
||||
requirement, and in fact a useful and relevant 3rd party extension
|
||||
(ExtensionClasses by Jim Fulton) creates an additional metatype.
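
The relationship between an object, its type, and its metatype can be checked interactively. A minimal sketch, assuming a modern Python 3 interpreter, where PyType_Type is exposed as the built-in `type`:

```python
a = 1
regular_type = type(a)             # the "regular" type of the object
metatype = type(regular_type)      # the type of that type: the metatype

# The default metatype is also its own metatype.
metatype_of_metatype = type(metatype)
```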
|
||||
|
||||
Python uses the concept of metatypes or metaclasses in a
|
||||
different way than Smalltalk. In Smalltalk-80, there is a
|
||||
hierarchy of metaclasses that mirrors the hierarchy of regular
|
||||
classes, metaclasses map 1-1 to classes (except for some funny
|
||||
business at the root of the hierarchy), and each class statement
|
||||
creates both a regular class and its metaclass, putting class
|
||||
methods in the metaclass and instance methods in the regular
|
||||
class.
|
||||
A related feature is the "Don Beaudry hook", which says that if a
|
||||
metatype is callable, its instances (which are regular types) can
|
||||
be subclassed (really subtyped) using a Python class statement.
|
||||
I will use this rule to support subtyping of built-in types, and
|
||||
in fact it greatly simplifies the logic of class creation to
|
||||
always simply call the metatype. When no base class is specified,
|
||||
a default metatype is called -- the default metatype is the
|
||||
"ClassType" object, so the class statement will behave as before
|
||||
in the normal case.
|
||||
|
||||
Python uses the concept of metatypes or metaclasses in a different
|
||||
way than Smalltalk. In Smalltalk-80, there is a hierarchy of
|
||||
metaclasses that mirrors the hierarchy of regular classes,
|
||||
metaclasses map 1-1 to classes (except for some funny business at
|
||||
the root of the hierarchy), and each class statement creates both
|
||||
a regular class and its metaclass, putting class methods in the
|
||||
metaclass and instance methods in the regular class.
|
||||
|
||||
Nice though this may be in the context of Smalltalk, it's not
|
||||
compatible with the traditional use of metatypes in Python, and I
|
||||
|
@@ -106,98 +127,75 @@ Metatypes
|
|||
e.g. Python code will never be allowed to allocate raw memory and
|
||||
initialize it at will.)
|
||||
|
||||
Metatypes determine various *policies* for types, e.g. what
|
||||
happens when a type is called, how dynamic types are (whether a
|
||||
type's __dict__ can be modified after it is created), what the
|
||||
method resolution order is, how instance attributes are looked
|
||||
up, and so on.
|
||||
|
||||
Instantiation by calling the type object
|
||||
I'll argue that left-to-right depth-first is not the best
|
||||
solution when you want to get the most use from multiple
|
||||
inheritance.
|
||||
|
||||
Traditionally, for each type there is at least one C function that
|
||||
creates instances of the type (e.g. PyInt_FromLong(),
|
||||
PyTuple_New() and so on). This function has to take care of
|
||||
I'll argue that with multiple inheritance, the metatype of the
|
||||
subtype must be a descendant of the metatypes of all base types.
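
The metatype constraint can be illustrated with the metaclass syntax of later Python 3 releases, which is an assumption here since the draft does not settle any syntax. The names `MetaA`, `MetaB`, and `MetaAB` are hypothetical.

```python
class MetaA(type):
    pass

class MetaB(type):
    pass

class A(metaclass=MetaA):
    pass

class B(metaclass=MetaB):
    pass

# Neither MetaA nor MetaB descends from the other, so no valid metatype
# exists for a class that inherits from both A and B.
try:
    class Broken(A, B):
        pass
    conflict = False
except TypeError:
    conflict = True

# A metatype derived from both base metatypes resolves the conflict.
class MetaAB(MetaA, MetaB):
    pass

class C(A, B, metaclass=MetaAB):
    pass
```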
|
||||
|
||||
I'll come back to metatypes later.
|
||||
|
||||
|
||||
Making a type a factory for its instances
|
||||
|
||||
Traditionally, for each type there is at least one C factory
|
||||
function that creates instances of the type (PyTuple_New(),
|
||||
PyInt_FromLong() and so on). These factory functions take care of
|
||||
both allocating memory for the object and initializing that
|
||||
memory. As of Python 2.0, it also has to interface with the
|
||||
memory. As of Python 2.0, they also have to interface with the
|
||||
garbage collection subsystem, if the type chooses to participate
|
||||
in garbage collection (which is optional, but strongly recommended
|
||||
for so-called "container" types: types that may contain arbitrary
|
||||
references to other objects, and hence may participate in
|
||||
reference cycles).
|
||||
|
||||
If we're going to implement subtyping, we must separate allocation
|
||||
and initialization: typically, the most derived subtype is in
|
||||
charge of allocation (and hence deallocation!), but in most cases
|
||||
each base type's initializer (constructor) must still be called,
|
||||
from the "most base" type to the most derived type.
|
||||
In this proposal, type objects can be factory functions for their
|
||||
instances, making the types directly callable from Python. This
|
||||
mimics the way classes are instantiated. Of course, the C APIs
|
||||
for creating instances of various built-in types will remain valid
|
||||
and probably the most common; and not all types will become their
|
||||
own factory functions.
|
||||
|
||||
But let's first get the interface for instantiation right. If we
|
||||
call an object, the tp_call slot of its type gets invoked. Thus,
|
||||
if we call a type, this invokes the tp_call slot of the type's
|
||||
type: in other words, the tp_call slot of the metatype.
|
||||
Traditionally this has been a NULL pointer, meaning that types
|
||||
can't be called. Now we're adding a tp_call slot to the metatype,
|
||||
which makes all types "callable" in a trivial sense. But
|
||||
obviously the metatype's tp_call implementation doesn't know how
|
||||
to initialize the instances of individual types. So the type
|
||||
defines a new slot, tp_new, which is invoked by the metatype's
|
||||
tp_call slot. If the tp_new slot is NULL, the metatype's tp_call
|
||||
issues a nice error message: the type isn't callable.
|
||||
The type object has a new slot, tp_new, which can act as a factory
|
||||
for instances of the type. Types are made callable by providing a
|
||||
tp_call slot in PyType_Type (the metatype); the slot
|
||||
implementation function looks for the tp_new slot of the type that
|
||||
is being called.
|
||||
|
||||
This mechanism gives the maximum freedom to the type: a type's
|
||||
tp_new doesn't necessarily have to return a new object, or even an
|
||||
object that is an instance of the type (although the latter should
|
||||
be rare).
|
||||
If the type's tp_new slot is NULL, an exception is raised.
|
||||
Otherwise, the tp_new slot is called. The signature for the
|
||||
tp_new slot is
|
||||
|
||||
|
||||
PyObject *tp_new(PyTypeObject *type,
|
||||
PyObject *args,
|
||||
PyObject *kwds)
|
||||
|
||||
The deallocation mechanism chosen should match the allocation
|
||||
mechanism: an allocation policy should prescribe both the
|
||||
allocation and deallocation mechanism. And again, planning ahead
|
||||
for subtyping would be nice. But the available mechanisms are
|
||||
different. The deallocation function has always been part of the
|
||||
type structure, as tp_dealloc, which combines the
|
||||
"uninitialization" with deallocation. This was good enough for
|
||||
the traditional situation, where it matched the combined
|
||||
allocation and initialization of the creation function. But now
|
||||
imagine a type whose creation function uses a special free list
|
||||
for allocation. Its deallocation function puts the object's
|
||||
memory back on the same free list. But when allocation and
|
||||
creation are separate, the object may have been allocated from the
|
||||
regular heap, and it would be wrong (in some cases disastrous) if
|
||||
it were placed on the free list by the deallocation function.
|
||||
where 'type' is the type whose tp_new slot is called, and 'args'
|
||||
and 'kwds' are the sequential and keyword arguments to the call,
|
||||
passed unchanged from tp_call. (The 'type' argument is used in
|
||||
combination with inheritance, see below.)
|
||||
|
||||
A solution would be for the tp_construct function to somehow mark
|
||||
whether the object was allocated from the special free list, so
|
||||
that the tp_dealloc function can choose the right deallocation
|
||||
method (assuming that the only two alternatives are a special free
|
||||
list or the regular heap). A variant that doesn't require space
|
||||
for an allocation flag bit would be to have two type objects,
|
||||
identical in the contents of all their slots except for their
|
||||
deallocation slot. But this requires that all type-checking code
|
||||
(e.g. the PyDict_Check()) recognizes both types. We'll come back
|
||||
to this solution in the context of subtyping. Another alternative
|
||||
is to require the metatype's tp_call to leave the allocation to
|
||||
the tp_construct method, by passing in a NULL pointer. But this
|
||||
doesn't work once we allow subtyping.
|
||||
|
||||
Eventually, when we add any form of subtyping, we'll have to
|
||||
separate deallocation from uninitialization. The way to do this
|
||||
is to add a separate slot to the type object that does the
|
||||
uninitialization without the deallocation. Fortunately, there is
|
||||
already such a slot: tp_clear, currently used by the garbage
|
||||
collection subsystem. A simple rule makes this slot reusable as
|
||||
an uninitialization: for types that support separate allocation
|
||||
and initialization, tp_clear must be defined (even if the object
|
||||
doesn't support garbage collection) and it must DECREF all
|
||||
contained objects and FREE all other memory areas the object owns.
|
||||
It must also be reentrant: it must be possible to clear an already
|
||||
cleared object. The easiest way to do this is to replace all
|
||||
pointers DECREFed or FREEd with NULL pointers.
|
||||
There are no constraints on the object type that is returned,
|
||||
although by convention it should be an instance of the given
|
||||
type. It is not necessary that a new object is returned; a
|
||||
reference to an existing object is fine too. The return value
|
||||
should always be a new reference, owned by the caller.
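
The freedom to return an existing object can be sketched at the Python level, assuming that tp_new is exposed as `__new__` as in later Python releases. The `Symbol` class and its `_cache` attribute are hypothetical.

```python
class Symbol:
    """Hypothetical interned type: one instance per distinct name."""

    _cache = {}

    def __new__(cls, name):
        existing = cls._cache.get(name)
        if existing is not None:
            return existing            # a reference to an existing object
        obj = super().__new__(cls)     # otherwise build a fresh instance
        obj.name = name
        cls._cache[name] = obj
        return obj

first = Symbol("spam")
second = Symbol("spam")
other = Symbol("eggs")
```

Both calls with the same name yield the same object, which matches the convention that the result is an instance of the given type without requiring it to be new.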
|
||||
|
||||
|
||||
Subtyping in C
|
||||
Requirements for a type to allow subtyping
|
||||
|
||||
The simplest form of subtyping is subtyping in C. It is the
|
||||
simplest form because we can require the C code to be aware of the
|
||||
various problems, and it's acceptable for C code that doesn't
|
||||
follow the rules to dump core; while for Python subtyping we would
|
||||
need to catch all errors before they become core dumps.
|
||||
follow the rules to dump core. For added simplicity, it is
|
||||
limited to single inheritance.
|
||||
|
||||
The idea behind subtyping is very similar to that of single
|
||||
inheritance in C++. A base type is described by a structure
|
||||
|
@@ -206,31 +204,169 @@ Subtyping in C
|
|||
of the base structure unchanged) and can override certain slots in
|
||||
the type object, leaving others the same.
|
||||
|
||||
Not every type can serve as a base type. The base type must
|
||||
support separation of allocation and initialization by having a
|
||||
tp_construct slot that can be called with a preallocated object,
|
||||
and it must support uninitialization without deallocation by
|
||||
having a tp_clear slot as described above. The derived type must
|
||||
also export the structure declaration for its instances through a
|
||||
header file, as it is needed in order to derive a subtype. The
|
||||
type object for the base type must also be exported.
|
||||
Most issues have to do with construction and destruction of
|
||||
instances of derived types.
|
||||
|
||||
Creation of a new object is separated into allocation and
|
||||
initialization: allocation allocates the memory, and
|
||||
initialization fills it with appropriate initial values. The
|
||||
separation is needed for the convenience of subtypes.
|
||||
Instantiation of a subtype goes as follows:
|
||||
|
||||
1. allocate memory for the whole (subtype) instance
|
||||
2. initialize the base type
|
||||
3. initialize the subtype's instance variables
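
The three steps can be mimicked in Python by separating allocation from initialization by hand. This is a sketch under the assumption that `object.__new__` plays the role of the allocator and `__init__` the role of the initializer, as in later Python releases. `Base` and `Derived` are hypothetical names.

```python
class Base:
    def __init__(self, name):
        self.name = name               # the base type's part of the instance

class Derived(Base):
    def __init__(self, name, size):
        Base.__init__(self, name)      # step 2: initialize the base type
        self.size = size               # step 3: the subtype's own variables

# Step 1: allocate the whole (subtype) instance without initializing it.
obj = object.__new__(Derived)
uninitialized = not hasattr(obj, "name")

# Steps 2 and 3: run the initializers on the preallocated object.
Derived.__init__(obj, "spam", 3)
```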
|
||||
|
||||
If allocation and initialization were done by the same function,
|
||||
you would need a way to tell the base type's constructor to
|
||||
allocate additional memory for the subtype's instance variables,
|
||||
and there would be no way to change the allocation method for a
|
||||
subtype (without giving up on calling the base type to initialize
|
||||
its part of the instance structure).
|
||||
|
||||
A similar reasoning applies to destruction: if a subtype changes
|
||||
the instance allocator (e.g. to use a different heap), it must
|
||||
also change the instance deallocator; but it must still call on
|
||||
the base type's destructor to DECREF the base type's instance
|
||||
variables.
|
||||
|
||||
In this proposal, I assign stricter meanings to two existing
|
||||
slots for deallocation and deinitialization, and I add two new
|
||||
slots for allocation and initialization.
|
||||
|
||||
The tp_clear slot gets the new task of deinitializing an object so
|
||||
that all that remains to be done is free its memory. Originally,
|
||||
all it had to do was clear object references. The difference is
|
||||
subtle: the list and dictionary objects contain references to an
|
||||
additional heap-allocated piece of memory that isn't freed by
|
||||
tp_clear in Python 2.1, but which must be freed by tp_clear under
|
||||
this proposal. It should be safe to call tp_clear repeatedly on
|
||||
the same object. If an object contains no references to other
|
||||
objects or heap-allocated memory, the tp_clear slot may be NULL.
|
||||
|
||||
The only additional requirement for the tp_dealloc slot is that it
|
||||
should do the right thing whether or not tp_clear has been called.
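
The requirements on tp_clear and tp_dealloc can be modelled with a toy class. This is not the C implementation; `Resource`, `clear`, `dealloc`, and the `frees` counter are hypothetical stand-ins chosen to show why nulling the pointer makes repeated calls safe.

```python
class Resource:
    """Toy stand-in for an object that owns a separately allocated buffer."""

    def __init__(self, items):
        self.buffer = list(items)
        self.frees = 0                 # counts how often the buffer was released

    def clear(self):
        # Models tp_clear: release what the object owns, then null the
        # pointer so that a second call finds nothing left to release.
        if self.buffer is not None:
            self.buffer = None
            self.frees += 1

    def dealloc(self):
        # Models tp_dealloc: correct whether or not clear() already ran.
        self.clear()

res = Resource([1, 2, 3])
res.clear()
res.clear()
res.dealloc()
```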
|
||||
|
||||
The new slots are tp_alloc for allocation and tp_init for
|
||||
initialization. Their signatures:
|
||||
|
||||
PyObject *tp_alloc(PyTypeObject *type,
|
||||
PyObject *args,
|
||||
PyObject *kwds)
|
||||
|
||||
int tp_init(PyObject *self,
|
||||
PyObject *args,
|
||||
PyObject *kwds)
|
||||
|
||||
The arguments for tp_alloc are the same as for tp_new, described
|
||||
above. The arguments for tp_init are the same except that the
|
||||
first argument is replaced with the instance to be initialized.
|
||||
Its return value is 0 for success or -1 for failure.
|
||||
|
||||
It is possible that tp_init is called more than once or not at
|
||||
all. The implementation should allow this usage. The object may
|
||||
be non-functional until tp_init is called, and a second call to
|
||||
tp_init may raise an exception, but it should not be possible to
|
||||
cause a core dump or memory leakage this way.
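
The built-in list in modern CPython tolerates both situations, which makes it a convenient illustration. This sketch assumes tp_init is reachable as `__init__` and that `list.__new__` performs allocation only.

```python
# Allocation only: the object is usable (an empty list) even though
# __init__ has not run yet.
items = list.__new__(list)
before_init = list(items)

items.__init__([1, 2])
after_first = list(items)

# A second initialization replaces the contents instead of corrupting them.
items.__init__([3])
after_second = list(items)
```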
|
||||
|
||||
Because tp_init is in a sense optional, tp_alloc is required to do
|
||||
*some* initialization of the object. It is required to initialize
|
||||
ob_refcnt to 1 and ob_type to its type argument. To be safe, it
|
||||
should probably zero out the rest of the object.
|
||||
|
||||
The constructor arguments are passed to tp_alloc so that for
|
||||
variable-size objects (like tuples and strings) it knows to
|
||||
allocate the right amount of memory.
|
||||
|
||||
For immutable types, tp_alloc may have to do the full
|
||||
initialization; otherwise, different calls to tp_init might cause
|
||||
an immutable object to be modified, which is considered a grave
|
||||
offense in Python (unlike in Fortran :-).
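
For an immutable type the value therefore has to be fixed at creation time. A minimal sketch, assuming the creation step is exposed as `__new__` as in later Python releases. `Positive` is a hypothetical subtype.

```python
class Positive(int):
    """Hypothetical subtype of an immutable built-in type."""

    def __new__(cls, value):
        # The value is fixed while the object is being created; it cannot
        # be assigned afterwards from an initializer.
        return super().__new__(cls, abs(value))

p = Positive(-5)
```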
|
||||
|
||||
Not every type can serve as a base type. The assumption is made
|
||||
that if a type has a non-NULL value in its tp_init slot, it is
|
||||
ready to be subclassed; otherwise, it is not, and using it as a
|
||||
base class will raise an exception.
|
||||
|
||||
In order to be usefully subtyped in C, a type must also export the
|
||||
structure declaration for its instances through a header file, as
|
||||
it is needed in order to derive a subtype. The type object for
|
||||
the base type must also be exported.
|
||||
|
||||
If the base type has a type-checking macro (e.g. PyDict_Check()),
|
||||
this macro may be changed to recognize subtypes. This can be done
|
||||
by using the new PyObject_TypeCheck(object, type) macro, which
|
||||
calls a function that follows the base class links. There are
|
||||
arguments for and against changing the type-checking macro in this
|
||||
way. The argument for the change should be clear: it allows
|
||||
subtypes to be used in places where the base type is required,
|
||||
which is often the prime attraction of subtyping (as opposed to
|
||||
sharing implementation). An argument against changing the
|
||||
type-checking macro could be that the type check is used
|
||||
frequently and a function call would slow things down too much
|
||||
(hard to believe); or one could fear that a subtype might break an
|
||||
invariant assumed by the support functions of the base type.
|
||||
Sometimes it would be wise to change the base type to remove this
|
||||
reliance; other times, it would be better to require that derived
|
||||
types (implemented in C) maintain the invariants.
|
||||
this macro probably should be changed to recognize subtypes. This
|
||||
can be done by using the new PyObject_TypeCheck(object, type)
|
||||
macro, which calls a function that follows the base class links.
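
The idea of following the base class links can be sketched in Python. The helper `type_check` is hypothetical and only mirrors the single-inheritance walk described for PyObject_TypeCheck(); it is not the C implementation.

```python
def type_check(obj, expected):
    """Return True if type(obj) is `expected` or derives from it.

    Hypothetical helper that follows the single-inheritance __base__ links.
    """
    current = type(obj)
    while current is not None:
        if current is expected:
            return True
        current = current.__base__     # None once we step past `object`
    return False

class MyDict(dict):
    pass
```

With a check like this, a `MyDict` instance is accepted wherever a dict is required, while a plain dict is not mistaken for a `MyDict`.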
|
||||
|
||||
(An argument against changing the type-checking macro could be
|
||||
that the type check is used frequently and a function call would
|
||||
slow things down too much, but I find this hard to believe. One
|
||||
could also fear that a subtype might break an invariant assumed by
|
||||
the support functions of the base type. Usually it is best to
|
||||
change the base type to remove this reliance, at least to the
|
||||
point of raising an exception rather than dumping core when the
|
||||
invariant is broken.)
|
||||
|
||||
Here are the interactions between tp_alloc, tp_clear, tp_dealloc
|
||||
and subtypes; all assuming that the base type defines tp_init
|
||||
(otherwise it cannot be subtyped anyway):
|
||||
|
||||
- If the base type's allocation scheme doesn't use the standard
|
||||
heap, it should not define tp_alloc. This is a signal for the
|
||||
subclass to provide its own tp_alloc *and* tp_dealloc
|
||||
implementation (probably using the standard heap).
|
||||
|
||||
- If the base type's tp_dealloc does anything besides calling
|
||||
PyObject_DEL() (typically, calling Py_XDECREF() on contained
|
||||
objects or freeing dependent memory blocks), it should define a
|
||||
tp_clear that does the same without calling PyObject_DEL(), and
|
||||
which checks for zero pointers before and zeros the pointers
|
||||
afterwards, so that calling tp_clear more than once or calling
|
||||
tp_dealloc after tp_clear will not attempt to DECREF or free the
|
||||
same object/memory twice. (It should also be allowed to
|
||||
continue using the object after tp_clear -- tp_clear should
|
||||
simply reset the object to its pristine state.)
|
||||
|
||||
- If the derived type overrides tp_alloc, it should also override
|
||||
tp_dealloc, and tp_dealloc should call the derived type's
|
||||
tp_clear if non-NULL (or its own tp_clear).
|
||||
|
||||
- If the derived type overrides tp_clear, it should call the base
|
||||
type's tp_clear if non-NULL.
|
||||
|
||||
- If the base type defines tp_init as well as tp_new, its tp_new
|
||||
should be inheritable: it should call the tp_alloc and the
|
||||
tp_init of the type passed in as its first argument.
|
||||
|
||||
- If the base type defines tp_init as well as tp_alloc, its
|
||||
tp_alloc should be inheritable: it should look in the
|
||||
tp_basicsize slot of the type passed in for the amount of memory
|
||||
to allocate, and it should initialize all allocated bytes to
|
||||
zero.
|
||||
|
||||
- For types whose tp_itemsize is nonzero, the allocation size used
|
||||
in tp_alloc should be tp_basicsize + n*tp_itemsize, rounded up
|
||||
to the next integral multiple of sizeof(PyObject *), where n is
|
||||
the number of items determined by the arguments to tp_alloc.
|
||||
|
||||
- Things are further complicated by the garbage collection API.
|
||||
This affects tp_basicsize, and the actions to be taken by
|
||||
tp_alloc. tp_alloc should look at the Py_TPFLAGS_GC flag bit in
|
||||
the tp_flags field of the type passed in, and not assume that
|
||||
this is the same as the corresponding bit in the base type. (In
|
||||
part, the GC API is at fault; Neil Schemenauer has a patch that
|
||||
fixes the API, but it is currently backwards incompatible.)
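
The size rule for types with a nonzero tp_itemsize is plain arithmetic and can be checked with a small function. The name `allocation_size` and the `pointer_size` parameter are hypothetical; the default of 8 stands in for sizeof(PyObject *) on a typical 64-bit build.

```python
def allocation_size(basicsize, itemsize, n, pointer_size=8):
    """Bytes to allocate for an instance holding n items.

    pointer_size stands in for sizeof(PyObject *); 8 is an assumption
    for a typical 64-bit build.
    """
    raw = basicsize + n * itemsize
    # Round up to the next integral multiple of pointer_size.
    return (raw + pointer_size - 1) // pointer_size * pointer_size
```

For example, a basic size of 17 with two 1-byte items needs 19 bytes, which rounds up to 24.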
|
||||
|
||||
Note: the rules here are very complicated -- probably too
|
||||
complicated. It may be better to give up on subtyping immutable
|
||||
types, types with custom allocators, and types with variable size
|
||||
allocation (such as int, string and tuple) -- then the rules can
|
||||
be much simplified because you can assume allocation on the
|
||||
standard heap, no requirement beyond zeroing memory in tp_alloc,
|
||||
and no variable length allocation.
|
||||
|
||||
|
||||
Creating a subtype of a built-in type in C
|
||||
|
||||
The derived type begins by declaring a type structure which
|
||||
contains the base type's structure. For example, here's the type
|
||||
|
@@ -400,6 +536,53 @@ Copyright
|
|||
|
||||
This document has been placed in the public domain.
|
||||
|
||||
|
||||
Junk text (to be reused somewhere above)
|
||||
|
||||
The deallocation mechanism chosen should match the allocation
|
||||
mechanism: an allocation policy should prescribe both the
|
||||
allocation and deallocation mechanism. And again, planning ahead
|
||||
for subtyping would be nice. But the available mechanisms are
|
||||
different. The deallocation function has always been part of the
|
||||
type structure, as tp_dealloc, which combines the
|
||||
"uninitialization" with deallocation. This was good enough for
|
||||
the traditional situation, where it matched the combined
|
||||
allocation and initialization of the creation function. But now
|
||||
imagine a type whose creation function uses a special free list
|
||||
for allocation. Its deallocation function puts the object's
|
||||
memory back on the same free list. But when allocation and
|
||||
creation are separate, the object may have been allocated from the
|
||||
regular heap, and it would be wrong (in some cases disastrous) if
|
||||
it were placed on the free list by the deallocation function.
|
||||
|
||||
A solution would be for the tp_construct function to somehow mark
|
||||
whether the object was allocated from the special free list, so
|
||||
that the tp_dealloc function can choose the right deallocation
|
||||
method (assuming that the only two alternatives are a special free
|
||||
list or the regular heap). A variant that doesn't require space
|
||||
for an allocation flag bit would be to have two type objects,
|
||||
identical in the contents of all their slots except for their
|
||||
deallocation slot. But this requires that all type-checking code
|
||||
(e.g. the PyDict_Check()) recognizes both types. We'll come back
|
||||
to this solution in the context of subtyping. Another alternative
|
||||
is to require the metatype's tp_call to leave the allocation to
|
||||
the tp_construct method, by passing in a NULL pointer. But this
|
||||
doesn't work once we allow subtyping.
|
||||
|
||||
Eventually, when we add any form of subtyping, we'll have to
|
||||
separate deallocation from uninitialization. The way to do this
|
||||
is to add a separate slot to the type object that does the
|
||||
uninitialization without the deallocation. Fortunately, there is
|
||||
already such a slot: tp_clear, currently used by the garbage
|
||||
collection subsystem. A simple rule makes this slot reusable as
|
||||
an uninitialization: for types that support separate allocation
|
||||
and initialization, tp_clear must be defined (even if the object
|
||||
doesn't support garbage collection) and it must DECREF all
|
||||
contained objects and FREE all other memory areas the object owns.
|
||||
It must also be reentrant: it must be possible to clear an already
|
||||
cleared object. The easiest way to do this is to replace all
|
||||
pointers DECREFed or FREEd with NULL pointers.
|
||||
|
||||
|
||||
Local Variables:
|
||||
mode: indented-text
|
||||
|
|