PEP: 252
Title: Making Types Look More Like Classes
Author: Guido van Rossum <guido@python.org>
Status: Final
Type: Standards Track
Content-Type: text/x-rst
Created: 19-Apr-2001
Python-Version: 2.2
Post-History:


Abstract
========

This PEP proposes changes to the introspection API for types that
make them look more like classes, and their instances more like
class instances. For example, ``type(x)`` will be equivalent to
``x.__class__`` for most built-in types. When C is ``x.__class__``,
``x.meth(a)`` will generally be equivalent to ``C.meth(x, a)``, and
``C.__dict__`` contains x's methods and other attributes.

This PEP also introduces a new approach to specifying attributes,
using attribute descriptors, or descriptors for short.
Descriptors unify and generalize several different common
mechanisms used for describing attributes: a descriptor can
describe a method, a typed field in the object structure, or a
generalized attribute represented by getter and setter functions.

Based on the generalized descriptor API, this PEP also introduces
a way to declare class methods and static methods.

[Editor's note: the ideas described in this PEP have been incorporated
into Python. The PEP no longer accurately describes the implementation.]


Introduction
============

One of Python's oldest language warts is the difference between
classes and types. For example, you can't directly subclass the
dictionary type, and the introspection interface for finding out
what methods and instance variables an object has is different for
types and for classes.

Healing the class/type split is a big effort, because it affects
many aspects of how Python is implemented. This PEP concerns
itself with making the introspection API for types look the same
as that for classes. Other PEPs will propose making classes look
more like types, and subclassing from built-in types; these topics
are not on the table for this PEP.


Introspection APIs
==================

Introspection concerns itself with finding out what attributes an
object has. Python's very general getattr/setattr API makes it
impossible to guarantee that there always is a way to get a list
of all attributes supported by a specific object, but in practice
two conventions have appeared that together work for almost all
objects. I'll call them the class-based introspection API and the
type-based introspection API; class API and type API for short.

The class-based introspection API is used primarily for class
instances; it is also used by Jim Fulton's ExtensionClasses. It
assumes that all data attributes of an object x are stored in the
dictionary ``x.__dict__``, and that all methods and class variables
can be found by inspection of x's class, written as ``x.__class__``.
Classes have a ``__dict__`` attribute, which yields a dictionary
containing methods and class variables defined by the class
itself, and a ``__bases__`` attribute, which is a tuple of base
classes that must be inspected recursively. Some assumptions here
are:

- attributes defined in the instance dict override attributes
  defined by the object's class;

- attributes defined in a derived class override attributes
  defined in a base class;

- attributes in an earlier base class (meaning occurring earlier
  in ``__bases__``) override attributes in a later base class.

(The last two rules together are often summarized as the
left-to-right, depth-first rule for attribute search. This is the
classic Python attribute lookup rule. Note that :pep:`253` will
propose to change the attribute lookup order, and if accepted,
this PEP will follow suit.)

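
The three assumptions of the class API can be sketched in a few lines.
This is only an illustration, written for present-day Python 3; the
helper names ``classic_lookup`` and ``instance_lookup`` are hypothetical
and not part of any API. It returns the raw dictionary values and does
not attempt any method binding.

```python
def classic_lookup(cls, name):
    """Search cls first, then its bases left to right, depth first."""
    if name in cls.__dict__:
        return cls.__dict__[name]
    for base in cls.__bases__:
        try:
            return classic_lookup(base, name)
        except AttributeError:
            pass  # not found along this base; try the next one
    raise AttributeError(name)


def instance_lookup(obj, name):
    """The instance dict overrides the class and its bases."""
    if name in obj.__dict__:
        return obj.__dict__[name]
    return classic_lookup(obj.__class__, name)


class A:
    colour = "red"
    size = 1

class B(A):
    size = 2          # a derived class overrides its base class

class C:
    colour = "blue"

class D(B, C):        # B comes before C in __bases__
    pass

d = D()
d.size = 3            # stored in d.__dict__, overrides B.size
```

Here ``instance_lookup(d, "size")`` finds the instance variable, while
``classic_lookup(D, "colour")`` reaches ``A`` through ``B`` before it
ever looks at ``C``.
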
The type-based introspection API is supported in one form or
another by most built-in objects. It uses two special attributes,
``__members__`` and ``__methods__``. The ``__methods__`` attribute, if
present, is a list of method names supported by the object. The
``__members__`` attribute, if present, is a list of data attribute
names supported by the object.

The type API is sometimes combined with a ``__dict__`` that works the
same as for instances (for example for function objects in
Python 2.1, ``f.__dict__`` contains f's dynamic attributes, while
``f.__members__`` lists the names of f's statically defined
attributes).

Some caution must be exercised: some objects don't list their
"intrinsic" attributes (like ``__dict__`` and ``__doc__``) in ``__members__``,
while others do; sometimes attribute names occur both in
``__members__`` or ``__methods__`` and as keys in ``__dict__``, in which case
it's anybody's guess whether the value found in ``__dict__`` is used
or not.

The type API has never been carefully specified. It is part of
Python folklore, and most third party extensions support it
because they follow examples that support it. Also, any type that
uses ``Py_FindMethod()`` and/or ``PyMember_Get()`` in its ``tp_getattr``
handler supports it, because these two functions special-case the
attribute names ``__methods__`` and ``__members__``, respectively.

Jim Fulton's ExtensionClasses ignore the type API, and instead
emulate the class API, which is more powerful. In this PEP, I
propose to phase out the type API in favor of supporting the class
API for all types.

One argument in favor of the class API is that it doesn't require
you to create an instance in order to find out which attributes a
type supports; this in turn is useful for documentation
processors. For example, the socket module exports the SocketType
object, but this currently doesn't tell us what methods are
defined on socket objects. Using the class API, SocketType would
show exactly what the methods for socket objects are, and we can
even extract their docstrings, without creating a socket. (Since
this is a C extension module, the source-scanning approach to
docstring extraction isn't feasible in this case.)

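
As a rough illustration of the documentation use case, the following
Python 3 sketch lists the public methods of a built-in type together
with the first line of each docstring, without ever creating an
instance. The helper name ``describe_type`` is hypothetical, and the
built-in ``dict`` stands in for SocketType so that the example does not
depend on any extension module.

```python
def describe_type(tp):
    """Map each public name in tp.__dict__ to the first line of its docstring."""
    summary = {}
    for name, descr in vars(tp).items():
        if name.startswith("_"):
            continue  # skip special and private names
        doc = (getattr(descr, "__doc__", None) or "").strip()
        summary[name] = doc.splitlines()[0] if doc else ""
    return summary


# No dict instance is created here; only the type object is inspected.
methods = describe_type(dict)
for name in sorted(methods):
    print(name, "-", methods[name])
```
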

Specification of the class-based introspection API
==================================================

Objects may have two kinds of attributes: static and dynamic. The
names and sometimes other properties of static attributes are
knowable by inspection of the object's type or class, which is
accessible through ``obj.__class__`` or ``type(obj)``. (I'm using type
and class interchangeably; a clumsy but descriptive term that fits
both is "meta-object".)

(XXX static and dynamic are not great terms to use here, because
"static" attributes may actually behave quite dynamically, and
because they have nothing to do with static class members in C++
or Java. Barry suggests using immutable and mutable instead, but
those words already have precise and different meanings in
slightly different contexts, so I think that would still be
confusing.)

Examples of dynamic attributes are instance variables of class
instances, module attributes, etc. Examples of static attributes
are the methods of built-in objects like lists and dictionaries,
and the attributes of frame and code objects (``f.f_code``,
``c.co_filename``, etc.). When an object with dynamic attributes
exposes these through its ``__dict__`` attribute, ``__dict__`` is a static
attribute.

The names and values of dynamic attributes are typically stored in
a dictionary, and this dictionary is typically accessible as
``obj.__dict__``. The rest of this specification is more concerned
with discovering the names and properties of static attributes
than with dynamic attributes; the latter are easily discovered by
inspection of ``obj.__dict__``.

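
The distinction can be observed directly. The sketch below assumes
present-day Python 3 (CPython); the class ``Point`` and the variable
names are made up for the illustration.

```python
class Point:
    def __init__(self, x, y):
        self.x = x
        self.y = y

p = Point(1, 2)

# Dynamic attributes are found by inspecting the instance dictionary.
dynamic_names = sorted(p.__dict__)

# __dict__ itself is a static attribute: it is described by the class.
dict_is_static = "__dict__" in type(p).__dict__

# A code object has only static attributes, all described by its type.
code = compile("1 + 1", "<demo>", "eval")
code_has_dict = hasattr(code, "__dict__")
filename_is_static = "co_filename" in type(code).__dict__
```
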
In the discussion below, I distinguish two kinds of objects:
regular objects (like lists, ints, functions) and meta-objects.
Types and classes are meta-objects. Meta-objects are also regular
objects, but we're mostly interested in them because they are
referenced by the ``__class__`` attribute of regular objects (or by
the ``__bases__`` attribute of other meta-objects).

The class introspection API consists of the following elements:

- the ``__class__`` and ``__dict__`` attributes on regular objects;

- the ``__bases__`` and ``__dict__`` attributes on meta-objects;

- precedence rules;

- attribute descriptors.

Together, these not only tell us about **all** attributes defined by
a meta-object, but they also help us calculate the value of a
specific attribute of a given object.

1. The ``__dict__`` attribute on regular objects

   A regular object may have a ``__dict__`` attribute. If it does,
   this should be a mapping (not necessarily a dictionary)
   supporting at least ``__getitem__()``, ``keys()``, and ``has_key()``. This
   gives the dynamic attributes of the object. The keys in the
   mapping give attribute names, and the corresponding values give
   their values.

   Typically, the value of an attribute with a given name is the
   same object as the value corresponding to that name as a key in
   the ``__dict__``. In other words, ``obj.__dict__['spam']`` is ``obj.spam``.
   (But see the precedence rules below; a static attribute with
   the same name **may** override the dictionary item.)

2. The ``__class__`` attribute on regular objects

   A regular object usually has a ``__class__`` attribute. If it
   does, this references a meta-object. A meta-object can define
   static attributes for the regular object whose ``__class__`` it
   is. This is normally done through the following mechanism:

3. The ``__dict__`` attribute on meta-objects

   A meta-object may have a ``__dict__`` attribute, of the same form
   as the ``__dict__`` attribute for regular objects (a mapping but
   not necessarily a dictionary). If it does, the keys of the
   meta-object's ``__dict__`` are names of static attributes for the
   corresponding regular object. The values are attribute
   descriptors; we'll explain these later. An unbound method is a
   special case of an attribute descriptor.

   Because a meta-object is also a regular object, the items in a
   meta-object's ``__dict__`` correspond to attributes of the
   meta-object; however, some transformation may be applied, and
   bases (see below) may define additional dynamic attributes. In
   other words, ``mobj.spam`` is not always ``mobj.__dict__['spam']``.
   (This rule contains a loophole because for classes, if
   ``C.__dict__['spam']`` is a function, ``C.spam`` is an unbound method
   object.)

4. The ``__bases__`` attribute on meta-objects

   A meta-object may have a ``__bases__`` attribute. If it does, this
   should be a sequence (not necessarily a tuple) of other
   meta-objects, the bases. An absent ``__bases__`` is equivalent to
   an empty sequence of bases. There must never be a cycle in the
   relationship between meta-objects defined by ``__bases__``
   attributes; in other words, the ``__bases__`` attributes define a
   directed acyclic graph, with arcs pointing from derived
   meta-objects to their base meta-objects. (It is not
   necessarily a tree, since multiple classes can have the same
   base class.) The ``__dict__`` attributes of a meta-object in the
   inheritance graph supply attribute descriptors for the regular
   object whose ``__class__`` attribute points to the root of the
   inheritance tree (which is not the same as the root of the
   inheritance hierarchy -- rather more the opposite, at the
   bottom given how inheritance trees are typically drawn).
   Descriptors are first searched in the dictionary of the root
   meta-object, then in its bases, according to a precedence rule
   (see the next item).

5. Precedence rules

   When two meta-objects in the inheritance graph for a given
   regular object both define an attribute descriptor with the
   same name, the search order is up to the meta-object. This
   allows different meta-objects to define different search
   orders. In particular, classic classes use the old
   left-to-right depth-first rule, while new-style classes use a
   more advanced rule (see the section on method resolution order
   in :pep:`253`).

   When a dynamic attribute (one defined in a regular object's
   ``__dict__``) has the same name as a static attribute (one defined
   by a meta-object in the inheritance graph rooted at the regular
   object's ``__class__``), the static attribute has precedence if it
   is a descriptor that defines a ``__set__`` method (see below);
   otherwise (if there is no ``__set__`` method) the dynamic attribute
   has precedence. In other words, for data attributes (those
   with a ``__set__`` method), the static definition overrides the
   dynamic definition, but for other attributes, dynamic overrides
   static.

   Rationale: we can't have a simple rule like "static overrides
   dynamic" or "dynamic overrides static", because some static
   attributes indeed override dynamic attributes; for example, a
   key ``'__class__'`` in an instance's ``__dict__`` is ignored in favor
   of the statically defined ``__class__`` pointer, but on the other
   hand most keys in ``inst.__dict__`` override attributes defined in
   ``inst.__class__``. Presence of a ``__set__`` method on a descriptor
   indicates that this is a data descriptor. (Even read-only data
   descriptors have a ``__set__`` method: it always raises an
   exception.) Absence of a ``__set__`` method on a descriptor
   indicates that the descriptor isn't interested in intercepting
   assignment, and then the classic rule applies: an instance
   variable with the same name as a method hides the method until
   it is deleted.

6. Attribute descriptors

   This is where it gets interesting -- and messy. Attribute
   descriptors (descriptors for short) are stored in the
   meta-object's ``__dict__`` (or in the ``__dict__`` of one of its
   ancestors), and have two uses: a descriptor can be used to get
   or set the corresponding attribute value on the (regular,
   non-meta) object, and it has an additional interface that
   describes the attribute for documentation and introspection
   purposes.

   There is little prior art in Python for designing the
   descriptor's interface, either for getting/setting the value
   or for describing the attribute otherwise, except some trivial
   properties (it's reasonable to assume that ``__name__`` and ``__doc__``
   should be the attribute's name and docstring). I will propose
   such an API below.

   If an object found in the meta-object's ``__dict__`` is not an
   attribute descriptor, backward compatibility dictates certain
   minimal semantics. This basically means that if it is a Python
   function or an unbound method, the attribute is a method;
   otherwise, it is the default value for a dynamic data
   attribute. Backwards compatibility also dictates that (in the
   absence of a ``__setattr__`` method) it is legal to assign to an
   attribute corresponding to a method, and that this creates a
   data attribute shadowing the method for this particular
   instance. However, these semantics are only required for
   backwards compatibility with regular classes.

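
The precedence rule in item 5 can be demonstrated with two small
descriptors. This is a sketch for present-day Python 3, where every
class is new-style; the class names ``DataDescr`` and ``NonDataDescr``
are invented for the example.

```python
class DataDescr:
    """Defines __set__, so it is a data descriptor and beats the instance dict."""

    def __get__(self, obj, objtype=None):
        return "from data descriptor"

    def __set__(self, obj, value):
        # A read-only data descriptor still has __set__; it just refuses.
        raise AttributeError("read-only attribute")


class NonDataDescr:
    """No __set__, so an instance dict entry with the same name wins."""

    def __get__(self, obj, objtype=None):
        return "from non-data descriptor"


class C:
    data = DataDescr()
    nondata = NonDataDescr()


x = C()
before = x.nondata  # nothing in x.__dict__ yet, so the descriptor is used

# Write to the instance dict directly, bypassing attribute assignment.
x.__dict__["data"] = "from instance dict"
x.__dict__["nondata"] = "from instance dict"
```

After this, ``x.data`` still comes from the descriptor, while
``x.nondata`` comes from the instance dict until that entry is deleted.
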
The introspection API is a read-only API. We don't define the
effect of assignment to any of the special attributes (``__dict__``,
``__class__`` and ``__bases__``), nor the effect of assignment to the
items of a ``__dict__``. Generally, such assignments should be
considered off-limits. A future PEP may define some semantics for
some such assignments. (Especially because currently instances
support assignment to ``__class__`` and ``__dict__``, and classes support
assignment to ``__bases__`` and ``__dict__``.)


Specification of the attribute descriptor API
=============================================

Attribute descriptors may have the following attributes. In the
examples, x is an object, C is ``x.__class__``, ``x.meth()`` is a method,
and ``x.ivar`` is a data attribute or instance variable. All
attributes are optional -- a specific attribute may or may not be
present on a given descriptor. An absent attribute means that the
corresponding information is not available or the corresponding
functionality is not implemented.

- ``__name__``: the attribute name. Because of aliasing and renaming,
  the attribute may (additionally or exclusively) be known under a
  different name, but this is the name under which it was born.
  Example: ``C.meth.__name__ == 'meth'``.

- ``__doc__``: the attribute's documentation string. This may be
  None.

- ``__objclass__``: the class that declared this attribute. The
  descriptor only applies to objects that are instances of this
  class (this includes instances of its subclasses). Example:
  ``C.meth.__objclass__ is C``.

- ``__get__()``: a function callable with one or two arguments that
  retrieves the attribute value from an object. This is also
  referred to as a "binding" operation, because it may return a
  "bound method" object in the case of method descriptors. The
  first argument, X, is the object from which the attribute must
  be retrieved or to which it must be bound. When X is None, the
  optional second argument, T, should be a meta-object and the
  binding operation may return an **unbound** method restricted to
  instances of T. When both X and T are specified, X should be an
  instance of T. Exactly what is returned by the binding
  operation depends on the semantics of the descriptor; for
  example, static methods and class methods (see below) ignore the
  instance and bind to the type instead.

- ``__set__()``: a function of two arguments that sets the attribute
  value on the object. If the attribute is read-only, this method
  may raise a ``TypeError`` or ``AttributeError`` exception (both are
  allowed, because both are historically found for undefined or
  unsettable attributes). Example:
  ``C.ivar.__set__(x, y)`` is roughly equivalent to ``x.ivar = y``.

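
The attributes listed above can be tried out on a built-in method
descriptor and on a small hand-written one. The sketch assumes
present-day Python 3; the ``Typed`` class is a hypothetical example of a
typed field and is not taken from any library.

```python
# A built-in method descriptor, fetched without triggering any binding.
descr = list.__dict__["append"]
a = []
bound = descr.__get__(a, list)   # the "binding" operation
bound("spam")                    # same effect as a.append("spam")


class Typed:
    """A data descriptor that only accepts values of one type."""

    def __init__(self, name, kind):
        self.__name__ = name
        self.kind = kind

    def __get__(self, obj, objtype=None):
        if obj is None:
            return self          # accessed on the class: no instance to bind
        return obj.__dict__[self.__name__]

    def __set__(self, obj, value):
        if not isinstance(value, self.kind):
            raise TypeError("%s must be %s" % (self.__name__, self.kind.__name__))
        obj.__dict__[self.__name__] = value


class C:
    ivar = Typed("ivar", int)


x = C()
C.ivar.__set__(x, 5)             # same effect as x.ivar = 5
```
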

Static methods and class methods
================================

The descriptor API makes it possible to add static methods and
class methods. Static methods are easy to describe: they behave
pretty much like static methods in C++ or Java. Here's an
example::

    class C:

        def foo(x, y):
            print "staticmethod", x, y
        foo = staticmethod(foo)

    C.foo(1, 2)
    c = C()
    c.foo(1, 2)

Both the call ``C.foo(1, 2)`` and the call ``c.foo(1, 2)`` call ``foo()`` with
two arguments, and print "staticmethod 1 2". No "self" is declared in
the definition of ``foo()``, and no instance is required in the call.

The line ``foo = staticmethod(foo)`` in the class statement is the
crucial element: this makes ``foo()`` a static method. The built-in
``staticmethod()`` wraps its function argument in a special kind of
descriptor whose ``__get__()`` method returns the original function
unchanged. Without this, the ``__get__()`` method of standard
function objects would have created a bound method object for
``c.foo`` and an unbound method object for ``C.foo``.

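
What ``staticmethod()`` does can be approximated in pure Python. The
following is a minimal sketch for present-day Python 3, not the actual
implementation; the class name ``StaticMethod`` is made up, and ``foo()``
returns a tuple instead of printing so that the result is easy to check.

```python
class StaticMethod:
    """Wrap a function so that binding returns it unchanged."""

    def __init__(self, func):
        self.func = func

    def __get__(self, obj, objtype=None):
        # Ignore both the instance and the class.
        return self.func


class C:

    def foo(x, y):
        return ("staticmethod", x, y)
    foo = StaticMethod(foo)


c = C()
via_class = C.foo(1, 2)
via_instance = c.foo(1, 2)
```
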
(XXX Barry suggests using "sharedmethod" instead of
"staticmethod", because the word static is being overloaded in so
many ways already. But I'm not sure if shared conveys the right
meaning.)

Class methods use a similar pattern to declare methods that
receive an implicit first argument that is the *class* for which
they are invoked. This has no C++ or Java equivalent, and is not
quite the same as what class methods are in Smalltalk, but may
serve a similar purpose. According to Armin Rigo, they are
similar to "virtual class methods" in the Borland Pascal dialect
Delphi. (Python also has real metaclasses, and perhaps methods
defined in a metaclass have more right to the name "class method";
but I expect that most programmers won't be using metaclasses.)
Here's an example::

    class C:

        def foo(cls, y):
            print "classmethod", cls, y
        foo = classmethod(foo)

    C.foo(1)
    c = C()
    c.foo(1)

Both the call ``C.foo(1)`` and the call ``c.foo(1)`` end up calling ``foo()``
with **two** arguments, and print "classmethod __main__.C 1". The
first argument of ``foo()`` is implied, and it is the class, even if
the method was invoked via an instance. Now let's continue the
example::

    class D(C):
        pass

    D.foo(1)
    d = D()
    d.foo(1)

This prints "classmethod __main__.D 1" both times; in other words,
the class passed as the first argument of ``foo()`` is the class
involved in the call, not the class involved in the definition of
``foo()``.

But notice this::

    class E(C):
        def foo(cls, y): # override C.foo
            print "E.foo() called"
            C.foo(y)
        foo = classmethod(foo)

    E.foo(1)
    e = E()
    e.foo(1)

In this example, the call to ``C.foo()`` from ``E.foo()`` will see class C
as its first argument, not class E. This is to be expected, since
the call specifies the class C. But it stresses the difference
between these class methods and methods defined in metaclasses,
where an upcall to a metamethod would pass the target class as an
explicit first argument. (If you don't understand this, don't
worry, you're not alone.) Note that calling ``cls.foo(y)`` would be a
mistake -- it would cause infinite recursion. Also note that you
can't specify an explicit 'cls' argument to a class method. If
you want this (e.g. the ``__new__`` method in :pep:`253` requires this),
use a static method with a class as its explicit first argument
instead.

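
The behaviour of ``classmethod()`` described in this section can be
approximated in pure Python as well. This is a sketch for present-day
Python 3 and not the real implementation; ``ClassMethod`` is a made-up
name, and the methods return tuples instead of printing. It reproduces
the C, D and E examples above.

```python
import types


class ClassMethod:
    """Wrap a function so that binding attaches it to the class."""

    def __init__(self, func):
        self.func = func

    def __get__(self, obj, objtype=None):
        if objtype is None:
            objtype = type(obj)
        # Bind to the class involved in the call, never to the instance.
        return types.MethodType(self.func, objtype)


class C:

    def foo(cls, y):
        return (cls.__name__, y)
    foo = ClassMethod(foo)


class D(C):
    pass


class E(C):
    def foo(cls, y):  # override C.foo
        # The upcall names C explicitly, so C.foo sees C, not E.
        return ("E",) + C.foo(y)
    foo = ClassMethod(foo)
```
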

C API
=====

XXX The following is VERY rough text that I wrote with a different
audience in mind; I'll have to go through this to edit it more.
XXX It also doesn't go into enough detail for the C API.

A built-in type can declare special data attributes in two ways:
using a struct memberlist (defined in structmember.h) or a struct
getsetlist (defined in descrobject.h). The struct memberlist is
an old mechanism put to new use: each attribute has a descriptor
record including its name, an enum giving its type (various C
types are supported as well as ``PyObject *``), an offset from the
start of the instance, and a read-only flag.

The struct getsetlist mechanism is new, and intended for cases
that don't fit in that mold, because they either require
additional checking, or are plain calculated attributes. Each
attribute here has a name, a getter C function pointer, a setter C
function pointer, and a context pointer. The function pointers
are optional, so that for example setting the setter function
pointer to ``NULL`` makes a read-only attribute. The context pointer
is intended to pass auxiliary information to generic getter/setter
functions, but I haven't found a need for this yet.

Note that there is also a similar mechanism to declare built-in
methods: these are ``PyMethodDef`` structures, which contain a name
and a C function pointer (and some flags for the calling
convention).

Traditionally, built-in types have had to define their own
``tp_getattro`` and ``tp_setattro`` slot functions to make these attribute
definitions work (``PyMethodDef`` and struct memberlist are quite
old). There are convenience functions that take an array of
``PyMethodDef`` or memberlist structures, an object, and an attribute
name, and return or set the attribute if found in the list, or
raise an exception if not found. But these convenience functions
had to be explicitly called by the ``tp_getattro`` or ``tp_setattro``
method of the specific type, and they did a linear search of the
array using ``strcmp()`` to find the array element describing the
requested attribute.

I now have a brand spanking new generic mechanism that improves
this situation substantially.

- Pointers to arrays of ``PyMethodDef``, memberlist, getsetlist
  structures are part of the new type object (``tp_methods``,
  ``tp_members``, ``tp_getset``).

- At type initialization time (in ``PyType_InitDict()``), for each
  entry in those three arrays, a descriptor object is created and
  placed in a dictionary that belongs to the type (``tp_dict``).

- Descriptors are very lean objects that mostly point to the
  corresponding structure. An implementation detail is that all
  descriptors share the same object type, and a discriminator
  field tells what kind of descriptor it is (method, member, or
  getset).

- As explained above, descriptors have a ``__get__()`` method that
  takes an object argument and returns that object's attribute;
  descriptors for writable attributes also have a ``__set__()`` method
  that takes an object and a value and sets that object's
  attribute. Note that the ``__get__()`` method also serves as a bind
  operation for methods, binding the unbound method implementation
  to the object.

- Instead of providing their own ``tp_getattro`` and ``tp_setattro``
  implementation, almost all built-in objects now place
  ``PyObject_GenericGetAttr`` and (if they have any writable
  attributes) ``PyObject_GenericSetAttr`` in their ``tp_getattro`` and
  ``tp_setattro`` slots. (Or, they can leave these ``NULL``, and inherit
  them from the default base object, if they arrange for an
  explicit call to ``PyType_InitDict()`` for the type before the first
  instance is created.)

- In the simplest case, ``PyObject_GenericGetAttr()`` does exactly one
  dictionary lookup: it looks up the attribute name in the type's
  dictionary (``obj->ob_type->tp_dict``). Upon success, there are two
  possibilities: the descriptor has a get method, or it doesn't.
  For speed, the get and set methods are type slots: ``tp_descr_get``
  and ``tp_descr_set``. If the ``tp_descr_get`` slot is non-NULL, it is
  called, passing the object as its only argument, and the return
  value from this call is the result of the getattr operation. If
  the ``tp_descr_get`` slot is ``NULL``, as a fallback the descriptor
  itself is returned (compare class attributes that are not
  methods but simple values).

- ``PyObject_GenericSetAttr()`` works very similarly but uses the
  ``tp_descr_set`` slot and calls it with the object and the new
  attribute value; if the ``tp_descr_set`` slot is ``NULL``, an
  ``AttributeError`` is raised.

- But now for a more complicated case. The approach described
  above is suitable for most built-in objects such as lists,
  strings, numbers. However, some object types have a dictionary
  in each instance that can store arbitrary attributes. In fact,
  when you use a class statement to subtype an existing built-in
  type, you automatically get such a dictionary (unless you
  explicitly turn it off, using another advanced feature,
  ``__slots__``). Let's call this the instance dict, to distinguish
  it from the type dict.

- In the more complicated case, there's a conflict between names
  stored in the instance dict and names stored in the type dict.
  If both dicts have an entry with the same key, which one should
  we return? Looking at classic Python for guidance, I find
  conflicting rules: for class instances, the instance dict
  overrides the class dict, **except** for the special attributes
  (like ``__dict__`` and ``__class__``), which have priority over the
  instance dict.

- I resolved this with the following set of rules, implemented in
  ``PyObject_GenericGetAttr()``:

  1. Look in the type dict. If you find a **data** descriptor, use
     its ``__get__()`` method to produce the result. This takes care of
     special attributes like ``__dict__`` and ``__class__``.

  2. Look in the instance dict. If you find anything, that's it.
     (This takes care of the requirement that normally the
     instance dict overrides the class dict.)

  3. Look in the type dict again (in reality this uses the saved
     result from step 1, of course). If you find a descriptor,
     use its ``__get__()`` method; if you find something else, that's it;
     if it's not there, raise ``AttributeError``.

  This requires a classification of descriptors as data and
  nondata descriptors. The current implementation quite sensibly
  classifies member and getset descriptors as data (even if they
  are read-only!) and method descriptors as nondata.
  Non-descriptors (like function pointers or plain values) are
  also classified as non-data (!).

- This scheme has one drawback: in what I assume to be the most
  common case, referencing an instance variable stored in the
  instance dict, it does **two** dictionary lookups, whereas the
  classic scheme did a quick test for attributes starting with two
  underscores plus a single dictionary lookup. (Although the
  implementation is sadly structured as ``instance_getattr()`` calling
  ``instance_getattr1()`` calling ``instance_getattr2()`` which finally
  calls ``PyDict_GetItem()``, and the underscore test calls
  ``PyString_AsString()`` rather than inlining this. I wonder if
  optimizing the snot out of this might not be a good idea to
  speed up Python 2.2, if we weren't going to rip it all out. :-)

- A benchmark verifies that in fact this is as fast as classic
  instance variable lookup, so I'm no longer worried.

- Modification for dynamic types: steps 1 and 3 look in the
  dictionary of the type and all its base classes (in MRO
  sequence, of course).

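
The three lookup steps, including the modification for base classes,
can be mirrored in Python. The sketch below is an approximation for
present-day Python 3 and not a transcription of the C code; the names
``find_in_mro`` and ``generic_getattr`` are hypothetical. One assumption
differs from the text: current Python also treats a descriptor that
defines only ``__delete__`` as a data descriptor, so the sketch checks
for both.

```python
_MISSING = object()


def find_in_mro(tp, name):
    """Look for name in the dict of tp and of its bases, in MRO order."""
    for base in tp.__mro__:
        if name in vars(base):
            return vars(base)[name]
    return _MISSING


def generic_getattr(obj, name):
    tp = type(obj)
    descr = find_in_mro(tp, name)
    descr_type = type(descr)
    getter = None if descr is _MISSING else getattr(descr_type, "__get__", None)
    is_data = hasattr(descr_type, "__set__") or hasattr(descr_type, "__delete__")

    # Step 1: a data descriptor in the type dict wins outright.
    if getter is not None and is_data:
        return getter(descr, obj, tp)

    # Step 2: otherwise the instance dict, if there is one, is consulted.
    inst_dict = getattr(obj, "__dict__", {})
    if name in inst_dict:
        return inst_dict[name]

    # Step 3: reuse the saved result from step 1.
    if getter is not None:
        return getter(descr, obj, tp)   # non-data descriptor, e.g. a method
    if descr is not _MISSING:
        return descr                    # plain class attribute
    raise AttributeError(name)


class C:
    kind = "class value"

    def meth(self):
        return "meth called"

    @property
    def prop(self):
        return "prop"


x = C()
x.ivar = 42
x.__dict__["prop"] = "shadow"   # ignored: property is a data descriptor
```
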

Discussion
==========

XXX


Examples
========

Let's look at lists. In classic Python, the method names of
lists were available as the ``__methods__`` attribute of list objects::

    >>> [].__methods__
    ['append', 'count', 'extend', 'index', 'insert', 'pop',
    'remove', 'reverse', 'sort']
    >>>

Under the new proposal, the ``__methods__`` attribute no longer exists::

    >>> [].__methods__
    Traceback (most recent call last):
      File "<stdin>", line 1, in ?
    AttributeError: 'list' object has no attribute '__methods__'
    >>>

Instead, you can get the same information from the list type::

    >>> T = [].__class__
    >>> T
    <type 'list'>
    >>> dir(T)                # like T.__dict__.keys(), but sorted
    ['__add__', '__class__', '__contains__', '__eq__', '__ge__',
    '__getattr__', '__getitem__', '__getslice__', '__gt__',
    '__iadd__', '__imul__', '__init__', '__le__', '__len__',
    '__lt__', '__mul__', '__ne__', '__new__', '__radd__',
    '__repr__', '__rmul__', '__setitem__', '__setslice__', 'append',
    'count', 'extend', 'index', 'insert', 'pop', 'remove',
    'reverse', 'sort']
    >>>

The new introspection API gives more information than the old one:
in addition to the regular methods, it also shows the methods that
are normally invoked through special notations, e.g. ``__iadd__``
(``+=``), ``__len__`` (``len``), ``__ne__`` (``!=``).
You can invoke any method from this list directly::

    >>> a = ['tic', 'tac']
    >>> T.__len__(a)          # same as len(a)
    2
    >>> T.append(a, 'toe')    # same as a.append('toe')
    >>> a
    ['tic', 'tac', 'toe']
    >>>

This is just like it is for user-defined classes.

Notice a familiar yet surprising name in the list: ``__init__``. This
is the domain of :pep:`253`.


Backwards compatibility
=======================

XXX


Warnings and Errors
===================

XXX


Implementation
==============

A partial implementation of this PEP is available from CVS as a
branch named "descr-branch". To experiment with this
implementation, proceed to check out Python from CVS according to
the instructions at http://sourceforge.net/cvs/?group_id=5470 but
add the arguments "-r descr-branch" to the cvs checkout command.
(You can also start with an existing checkout and do "cvs update
-r descr-branch".) For some examples of the features described
here, see the file Lib/test/test_descr.py.

Note: the code in this branch goes way beyond this PEP; it is also
the experimentation area for :pep:`253` (Subtyping Built-in Types).


References
==========

XXX


Copyright
=========

This document has been placed in the public domain.