PEP: 252
Title: Making Types Look More Like Classes
Version: $Revision$
Author: guido@python.org (Guido van Rossum)
Status: Draft
Type: Standards Track
Python-Version: 2.2
Created: 19-Apr-2001
Post-History:

Abstract

This PEP proposes changes to the introspection API for types that
make them look more like classes. For example, type(x) will be
equivalent to x.__class__ for most built-in types. When C is
x.__class__, x.meth(a) will be equivalent to C.meth(x, a), and
C.__dict__ contains descriptors for x's methods and other
attributes.

The PEP also introduces a new approach to specifying attributes,
using attribute descriptors, or descriptors for short.
Descriptors unify and generalize several different common
mechanisms used for describing attributes: a descriptor can
describe a method, a typed field in the object structure, or a
generalized attribute represented by getter and setter functions.


Introduction

One of Python's oldest language warts is the difference between
classes and types. For example, you can't directly subclass the
dictionary type, and the introspection interface for finding out
what methods and instance variables an object has is different for
types and for classes.

Healing the class/type split is a big effort, because it affects
many aspects of how Python is implemented. This PEP concerns
itself with making the introspection API for types look the same
as that for classes. Other PEPs will propose making classes look
more like types, and subclassing from built-in types; these topics
are not on the table for this PEP.


Introspection APIs

Introspection concerns itself with finding out what attributes an
object has. Python's very general getattr/setattr API makes it
impossible to guarantee that there always is a way to get a list
of all attributes supported by a specific object, but in practice
two conventions have appeared that together work for almost all
objects. I'll call them the class-based introspection API and the
type-based introspection API; class API and type API for short.

The class-based introspection API is used primarily for class
instances; it is also used by Jim Fulton's ExtensionClasses. It
assumes that all data attributes of an object x are stored in the
dictionary x.__dict__, and that all methods and class variables
can be found by inspection of x's class, written as x.__class__.
Classes have a __dict__ attribute, which yields a dictionary
containing methods and class variables defined by the class
itself, and a __bases__ attribute, which is a tuple of base
classes that must be inspected recursively. Some assumptions here
are:

- attributes defined in the instance dict override attributes
  defined by the object's class;

- attributes defined in a derived class override attributes
  defined in a base class;

- attributes in an earlier base class (meaning occurring earlier
  in __bases__) override attributes in a later base class.

(The last two rules together are often summarized as the
left-to-right, depth-first rule for attribute search.)

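As a rough illustration of the left-to-right, depth-first rule, the
following sketch walks __dict__ and __bases__ by hand. The helper
name classic_lookup() and the classes A, B, C and D are made up for
this example. Note that present-day Python resolves diamond-shaped
hierarchies in a different order, so the result below describes the
rule as stated here, not what getattr() returns today.

```python
def classic_lookup(meta, name):
    """Return (owner, value) for the first match, or None if absent.

    Hypothetical helper: searches meta itself first, then each base in
    __bases__ order, fully exploring one base before trying the next.
    """
    namespace = getattr(meta, "__dict__", {})
    if name in namespace:
        return meta, namespace[name]
    for base in getattr(meta, "__bases__", ()):
        found = classic_lookup(base, name)
        if found is not None:
            return found
    return None


class A:
    def who(self):
        return "A"

class B(A):
    pass

class C(A):
    def who(self):
        return "C"

class D(B, C):
    pass


# D has no "who"; B has none either, so B's base A is searched
# before the later base C is ever considered.
owner, value = classic_lookup(D, "who")
missing = classic_lookup(D, "no_such_name")
```

Here owner is A, because the search exhausts B (and its base A)
before moving on to C.
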
The type-based introspection API is supported in one form or
another by most built-in objects. It uses two special attributes,
__members__ and __methods__. The __methods__ attribute, if
present, is a list of method names supported by the object. The
__members__ attribute, if present, is a list of data attribute
names supported by the object.

The type API is sometimes combined with a __dict__ that works the
same way as for instances (e.g., for function objects in Python
2.1, f.__dict__ contains f's dynamic attributes, while
f.__members__ lists the names of f's statically defined
attributes).

Some caution must be exercised: some objects don't list their
"intrinsic" attributes (e.g. __dict__ and __doc__) in __members__,
while others do; sometimes attribute names occur both in
__members__ or __methods__ and as keys in __dict__, in which case
it's anybody's guess whether the value found in __dict__ is used
or not.

The type API has never been carefully specified. It is part of
Python folklore, and most third party extensions support it
because they follow examples that support it. Also, any type that
uses Py_FindMethod() and/or PyMember_Get() in its tp_getattr
handler supports it, because these two functions special-case the
attribute names __methods__ and __members__, respectively.

Jim Fulton's ExtensionClasses ignore the type API, and instead
emulate the class API, which is more powerful. In this PEP, I
propose to phase out the type API in favor of supporting the class
API for all types.

One argument in favor of the class API is that it doesn't require
you to create an instance in order to find out which attributes a
type supports; this in turn is useful for documentation
processors. For example, the socket module exports the SocketType
object, but this currently doesn't tell us what methods are
defined on socket objects. Using the class API, SocketType shows
us exactly what the methods for socket objects are, and we can
even extract their docstrings, without creating a socket. (Since
this is a C extension module, the source-scanning approach to
docstring extraction isn't feasible in this case.)


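The documentation-processor argument can be tried out in a
present-day Python 3 interpreter, where built-in types already
expose __dict__ and __bases__. This is only a sketch: the helper
static_attributes() is a made-up name, and using socket.socket in
place of SocketType is an assumption of the example.

```python
import socket


def static_attributes(meta):
    """Collect {name: value} from meta.__dict__ and, recursively, its bases.

    Hypothetical helper. Later bases are merged first so that earlier
    bases, and finally meta itself, take precedence on name clashes.
    """
    found = {}
    for base in reversed(meta.__bases__):
        found.update(static_attributes(base))
    found.update(meta.__dict__)
    return found


# No socket object is created anywhere in this example.
attrs = static_attributes(socket.socket)

# Public callables are a reasonable first guess at "the methods".
methods = sorted(name for name, value in attrs.items()
                 if callable(value) and not name.startswith("_"))

# The docstring comes straight from the object stored in the type's
# __dict__ (it may be None in a build without docstrings).
connect_doc = attrs["connect"].__doc__
```

A documentation tool could iterate over methods and emit each
entry's __doc__ without ever opening a network connection.

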
Specification of the class-based introspection API

Objects may have two kinds of attributes: static and dynamic. The
names and sometimes other properties of static attributes are
knowable by inspection of the object's type or class, which is
accessible through obj.__class__ or type(obj). (I'm using type
and class interchangeably, because that's the goal of the
exercise.)

(XXX static and dynamic are lousy names, because the "static"
attributes may actually behave quite dynamically.)

The names and values of dynamic attributes are typically stored in
a dictionary, and this dictionary is typically accessible as
obj.__dict__. The rest of this specification is more concerned
with discovering the names and properties of static attributes
than with dynamic attributes.

Examples of dynamic attributes are instance variables of class
instances, module attributes, etc. Examples of static attributes
are the methods of built-in objects like lists and dictionaries,
and the attributes of frame and code objects (c.co_code,
c.co_filename, etc.). When an object with dynamic attributes
exposes these through its __dict__ attribute, __dict__ is a static
attribute.

In the discussion below, I distinguish two kinds of objects:
regular objects (e.g. lists, ints, functions) and meta-objects.
Meta-objects are types and classes. Meta-objects are also regular
objects, but we're mostly interested in them because they are
referenced by the __class__ attribute of regular objects (or by
the __bases__ attribute of meta-objects).

The class introspection API consists of the following elements:

- the __class__ and __dict__ attributes on regular objects;

- the __bases__ and __dict__ attributes on meta-objects;

- precedence rules;

- attribute descriptors.

1. The __dict__ attribute on regular objects

A regular object may have a __dict__ attribute. If it does,
this should be a mapping (not necessarily a dictionary)
supporting at least __getitem__, keys(), and has_key(). This
gives the dynamic attributes of the object. The keys in the
mapping give attribute names, and the corresponding values give
their values.

Typically, the value of an attribute with a given name is the
same object as the value corresponding to that name as a key in
the __dict__. In other words, obj.__dict__['spam'] is obj.spam.
(But see the precedence rules below; a static attribute with
the same name *may* override the dictionary item.)

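A minimal illustration of this identity, using a throwaway class
named Record (an assumption of the example). It runs on present-day
Python, where has_key() no longer exists and the "in" operator
plays its role.

```python
class Record:
    pass


obj = Record()
obj.spam = ["eggs"]          # creates a dynamic attribute

# The value stored under the key is the very same object that
# attribute access returns.
same_object = obj.__dict__["spam"] is obj.spam

# The keys of the mapping are the dynamic attribute names.
names = sorted(obj.__dict__.keys())
has_spam = "spam" in obj.__dict__
```
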
2. The __class__ attribute on regular objects

A regular object may have a __class__ attribute. If it does,
this references a meta-object. A meta-object can define static
attributes for the regular object whose __class__ it is.

3. The __dict__ attribute on meta-objects

A meta-object may have a __dict__ attribute, of the same form
as the __dict__ attribute for regular objects (mapping, etc).
If it does, the keys of the meta-object's __dict__ are names of
static attributes for the corresponding regular object. The
values are attribute descriptors; we'll explain these later.
(An unbound method is a special case of an attribute
descriptor.)

Because a meta-object is also a regular object, the items in a
meta-object's __dict__ correspond to attributes of the
meta-object; however, some transformation may be applied, and
bases (see below) may define additional dynamic attributes. In
other words, mobj.spam is not always mobj.__dict__['spam'].
(This rule contains a loophole because for classes, if
C.__dict__['spam'] is a function, C.spam is an unbound method
object.)

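The relationships in items 2 and 3 can be observed in a present-day
Python 3 interpreter; the following is a sketch under that
assumption, with K as a made-up class. One caveat: unbound method
objects no longer exist there, so K.spam is simply the function, and
the transformation shows up on instance access instead.

```python
x = [3, 1, 2]
C = x.__class__

# For a built-in type, type(x) and x.__class__ name the same meta-object.
same_meta = type(x) is C

# The meta-object's __dict__ holds a descriptor for the method,
# keyed by the attribute name.
descr = C.__dict__["append"]

# Calling through the meta-object is equivalent to x.append(4).
C.append(x, 4)


class K:
    def spam(self):
        return "spam"


raw = K.__dict__["spam"]     # the plain function stored in the class
bound = K().spam             # attribute access wraps it in a bound method
```

So bound is not the object stored in K.__dict__, although it refers
to it through its __func__ attribute.
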
4. The __bases__ attribute on meta-objects

A meta-object may have a __bases__ attribute. If it does, this
should be a sequence (not necessarily a tuple) of other
meta-objects, the bases. An absent __bases__ is equivalent to
an empty sequence of bases. There must never be a cycle in the
relationship between meta-objects defined by __bases__
attributes; in other words, the __bases__ attributes define an
inheritance tree, where the root of the tree is the __class__
attribute of a regular object, and the leaves of the tree are
meta-objects without bases. The __dict__ attributes of the
meta-objects in the inheritance tree supply attribute
descriptors for the regular object whose __class__ is at the
top of the inheritance tree.

5. Precedence rules

When two meta-objects in the inheritance tree both define an
attribute descriptor with the same name, the left-to-right
depth-first rule applies. (XXX define rigorously.)

When a dynamic attribute (one defined in a regular object's
__dict__) has the same name as a static attribute (one defined
by a meta-object in the inheritance tree rooted at the regular
object's __class__), the dynamic attribute *usually* wins, but
for some attributes the meta-object may specify that the static
attribute overrides the dynamic attribute.

(We can't have a simple rule like "static overrides dynamic"
or "dynamic overrides static", because some static attributes
indeed override dynamic attributes, e.g. a key '__class__' in
an instance's __dict__ is ignored in favor of the statically
defined __class__ pointer, but on the other hand most keys in
inst.__dict__ override attributes defined in inst.__class__.
The mechanism whereby a meta-object can specify that a
particular attribute has precedence is not yet specified.)

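Both outcomes can be seen side by side in a present-day Python 3
interpreter. This is a sketch under that assumption; the class C and
the attribute name spam are made up for the example.

```python
class C:
    spam = "class value"     # static attribute defined by the meta-object


x = C()

# A dynamic attribute with the same name as an ordinary static one.
x.__dict__["spam"] = "instance value"

# A dynamic entry that tries to shadow the __class__ pointer.
x.__dict__["__class__"] = int

dynamic_wins = x.spam        # the instance __dict__ entry is used
static_wins = x.__class__    # the '__class__' key is ignored
```

The first lookup yields the instance's value, while the second
still yields C despite the entry in x.__dict__.
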
6. Attribute descriptors

This is where it gets interesting -- and messy. Attribute
descriptors (descriptors for short) are stored in the
meta-object's __dict__, and have two uses: a descriptor can be
used to get or set the corresponding attribute value on the
(non-meta) object, and it has an additional interface that
describes the attribute for documentation or introspection
purposes.

There is little prior art in Python for designing the
descriptor's interface, either for getting/setting the value
or for describing the attribute otherwise, except some trivial
properties (e.g. it's reasonable to assume that __name__ and
__doc__ should be the attribute's name and docstring). I will
propose such an API below.

If an object found in the meta-object's __dict__ is not an
attribute descriptor, backward compatibility dictates its
semantics. This basically means that if it is a Python
function or an unbound method, the attribute is a method;
otherwise, it is the default value for a data attribute.
Backwards compatibility also dictates that (in the absence of a
__setattr__ method) it is legal to assign to an attribute of
type method, and that this creates a data attribute shadowing
the method for this particular instance. However, these
semantics are only required for backwards compatibility with
regular classes.

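These backward-compatible semantics for regular classes still hold
in present-day Python, so they can be sketched directly. The class C
and its attributes are made up for the example.

```python
class C:
    count = 0                # not a function: default value for a data attribute

    def meth(self):          # a function: the attribute is a method
        return "method"


x, y = C(), C()

default_value = x.count      # falls back to the value stored in the class

# Assigning to a method attribute creates a data attribute that
# shadows the method for this one instance only.
x.meth = "shadow"

shadowed = x.meth
untouched = y.meth()
```

The assignment adds a "meth" key to x.__dict__ and leaves
C.__dict__ unchanged, so other instances keep the method.
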
The introspection API is a read-only API. We don't define the
effect of assignment to any of the special attributes (__dict__,
__class__ and __bases__), nor the effect of assignment to the
items of a __dict__. Generally, such assignments should be
considered off-limits. An extension of this PEP may define some
semantics for some such assignments. (Especially because
currently instances support assignment to __class__ and __dict__,
and classes support assignment to __bases__ and __dict__.)


Specification of the attribute descriptor API

Attribute descriptors have the following attributes. In the
examples, x is an object, C is x.__class__, x.meth() is a method,
and x.ivar is a data attribute or instance variable.

- name: the original attribute name. Note that because of
  aliasing and renaming, the attribute may be known under a
  different name, but this is the name under which it was born.
  Example: C.meth.name == 'meth'.

- doc: the attribute's documentation string.

- objclass: the class that declared this attribute. The
  descriptor only applies to objects that are instances of this
  class (this includes instances of its subclasses). Example:
  C.meth.objclass is C.

- kind: either "method" or "data". This distinguishes between
  methods and data attributes. The primary operation on a method
  attribute is to call it. The primary operations on a data
  attribute are to get and to set it. Example: C.meth.kind ==
  'method'; C.ivar.kind == 'data'.

- default: for optional data attributes, this gives a default or
  initial value. XXX Python has two kinds of semantics for
  referencing "absent" attributes: this may raise an
  AttributeError, or it may produce a default value stored
  somewhere in the class. There could be a flag that
  distinguishes between these two cases. Also, there could be a
  flag that tells whether it's OK to delete an attribute (and what
  happens then -- a default value takes its place, or it's truly
  gone).

- attrclass: for data attributes, this can be the class of the
  attribute value, or None. If this is not None, the attribute
  value is restricted to being an instance of this class (or of a
  subclass thereof). If this is None, the attribute value is not
  constrained. For method attributes, this should normally be
  None (a class is not sufficient information to describe a method
  signature). If and when optional static typing is added to
  Python, the meaning of this attribute may change to
  describe the type of the attribute.

- signature: for methods, an object that describes the signature
  of the method. Signature objects will be described further
  below.

- readonly: Boolean indicating whether assignment to this
  attribute is disallowed. This is usually true for methods.
  Example: C.meth.readonly == 1; C.ivar.readonly == 0.

- get(): a function of one argument that retrieves the attribute
  value from an object. Examples: C.ivar.get(x) ~~ x.ivar;
  C.meth.get(x) ~~ x.meth.

- set(): a function of two arguments that sets the attribute value
  on the object. If readonly is set, this method raises a
  TypeError exception. Example: C.ivar.set(x, y) ~~ x.ivar = y.

- call(): for method descriptors, this is a function of at least
  one argument that calls the method. The first argument is the
  object whose method is called; the remaining arguments
  (including keyword arguments) are passed on to the method.
  Example: C.meth.call(x, 1, 2) ~~ x.meth(1, 2).

- bind(): for method descriptors, this is a function of one
  argument that returns a "bound method object". This in turn can
  be called exactly like the method should be called (in fact this
  is what is returned for a bound method). This is the same as
  get(). Example: C.meth.bind(x) ~~ x.meth.

For convenience, __name__ and __doc__ are defined as aliases for
name and doc. Also for convenience, calling the descriptor can do
one of three things:

- Calling a method descriptor is the same as calling its call()
  method. Example: C.meth(x, 1, 2) ~~ x.meth(1, 2).

- Calling a data descriptor with one argument is the same as
  calling its get() method. Example: C.ivar(x) ~~ x.ivar.

- Calling a data descriptor with two arguments is the same as
  calling its set() method. Example: C.ivar(x, y) ~~ x.ivar = y.

Note that this specification does not define how to create
specific attribute descriptors. This is up to the individual
attribute descriptor implementations, of which there may be many.


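To make the proposed interface easier to picture, here is one
possible pure-Python rendering of it. Everything in it is an
assumption made for illustration: the class names DataDescriptor
and MethodDescriptor, the constructor arguments, the choice to
return the default for an absent attribute, and the use of
functools.partial as a stand-in for a bound method object. The
descriptors are kept in ordinary variables instead of being
installed in a class __dict__, so that they do not interact with
the interpreter's own attribute lookup.

```python
import functools


def _check_instance(descr, obj):
    # objclass: the descriptor only applies to instances of that class.
    if not isinstance(obj, descr.objclass):
        raise TypeError("%r requires a %s instance"
                        % (descr.name, descr.objclass.__name__))


class DataDescriptor:
    kind = "data"
    signature = None                      # only meaningful for methods

    def __init__(self, objclass, name, doc=None, default=None,
                 attrclass=None, readonly=False):
        self.objclass = objclass
        self.name = self.__name__ = name  # __name__ aliases name
        self.doc = self.__doc__ = doc     # __doc__ aliases doc
        self.default = default
        self.attrclass = attrclass
        self.readonly = readonly

    def get(self, obj):
        _check_instance(self, obj)
        # Assumption: an absent attribute yields the default value.
        return obj.__dict__.get(self.name, self.default)

    def set(self, obj, value):
        _check_instance(self, obj)
        if self.readonly:
            raise TypeError("attribute %r is read-only" % self.name)
        if self.attrclass is not None and not isinstance(value, self.attrclass):
            raise TypeError("attribute %r must be a %s"
                            % (self.name, self.attrclass.__name__))
        obj.__dict__[self.name] = value

    def __call__(self, obj, *value):
        # One argument behaves like get(); two arguments like set().
        if not value:
            return self.get(obj)
        if len(value) > 1:
            raise TypeError("expected at most two arguments")
        self.set(obj, value[0])


class MethodDescriptor:
    kind = "method"
    attrclass = None
    readonly = True                       # usually true for methods

    def __init__(self, objclass, func, signature=None):
        self.objclass = objclass
        self.func = func
        self.name = self.__name__ = func.__name__
        self.doc = self.__doc__ = func.__doc__
        self.signature = signature

    def call(self, obj, *args, **kwargs):
        _check_instance(self, obj)
        return self.func(obj, *args, **kwargs)

    def bind(self, obj):
        _check_instance(self, obj)
        return functools.partial(self.call, obj)

    get = bind                            # bind() is the same as get()
    __call__ = call                       # calling the descriptor calls the method

    def set(self, obj, value):
        raise TypeError("method %r is read-only" % self.name)


class Point:
    def __init__(self, x):
        self.x = x

    def scaled(self, factor):
        """Return x multiplied by factor."""
        return self.x * factor


ivar = DataDescriptor(Point, "x", doc="horizontal position", attrclass=int)
meth = MethodDescriptor(Point, Point.scaled)

p = Point(2)
ivar(p, 5)                                # same as ivar.set(p, 5), i.e. p.x = 5
```

With these definitions, ivar.get(p) and ivar(p) both return 5, and
meth.call(p, 3), meth(p, 3) and meth.bind(p)(3) all return 15.

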
Specification of the signature object API

XXX


Discussion

XXX


Examples

XXX


Backwards compatibility

XXX


Compatibility of C API

XXX


Warnings and Errors

XXX


Implementation

A partial implementation of this PEP is available from CVS as a
branch named "descr-branch". To experiment with this
implementation, proceed to check out Python from CVS according to
the instructions at http://sourceforge.net/cvs/?group_id=5470 but
add the arguments "-r descr-branch" to the cvs checkout command.
(You can also start with an existing checkout and do "cvs update
-r descr-branch".) For some examples of the features described
here, see the file Lib/test/test_descr.py.

Note: the code in this branch goes beyond this PEP; it is also
on the way to implementing PEP 253 (Subtyping Built-in Types).


References

XXX


Copyright

This document has been placed in the public domain.



Local Variables:
mode: indented-text
indent-tabs-mode: nil
End: