Another checkpoint. Add some invariants. Resolve an XXX or two.

This commit is contained in:
Guido van Rossum 2007-04-19 18:53:49 +00:00
parent a9354df361
commit e3aab6db4b
1 changed files with 104 additions and 62 deletions

View File

@ -108,13 +108,14 @@ The specification follows the four categories listed in the abstract:
* An "ABC support framework" which defines a metaclass, a base class,
a decorator, and some helpers that make it easy to define ABCs.
This will be added as a new library module named "abc".
This will be added as a new library module named "abc", or (perhaps)
made built-in functionality.
* Specific ABCs for containers and iterators, to be added to the
collections module.
* Specific ABCs for numbers, to be added to a new module, yet to be
named.
* Specific ABCs for numbers, to be added to a new module that is yet
to be named.
* Guidelines for writing additional ABCs.
@ -126,13 +127,13 @@ The abc module will define some utilities that help defining ABCs.
These are:
``@abstractmethod``
A decorator to be used to declare abstract methods. This should
only be used with classes whose class is derived from ``Abstract``
below. A class containing at least one method declared with this
A decorator used to declare abstract methods. This should only be
used with classes whose class is derived from ``Abstract`` below.
A class containing at least one method declared with this
decorator that hasn't been overridden yet cannot be instantiated.
Such a methods may be called from the overriding method in the
subclass (using ``super`` or direct invocation).
``Abstract``
A class implementing the constraint that it or its subclasses
cannot be instantiated unless each abstract method has been
@ -185,46 +186,57 @@ Python 2 has to be implemented anew by each iterator class).
No ABCs override ``__init__``, ``__new__``, ``__str__`` or
``__repr__``.
XXX define how set, frozenset, list, tuple, dict, bytes and str derive
from these.
One Trick Ponies
''''''''''''''''
These abstract classes represent single methods like ``__iter__`` or
``__len__``.
``__len__``. The ``Iterator`` class is included as well, even though
it has two prescribed methods.
``Hashable``
The base class for classes defining ``__hash__``. Its abstract
``__hash__`` method always returns 0, which is a valid (albeit
inefficient) implementation. Note: being an instance of this
class does not imply that an object is immutable; e.g. a tuple
containing a list as a member is not immutable; its ``__hash__``
method raises an exception.
The base class for classes defining ``__hash__``. The
``__hash__`` method should return an ``Integer`` (see "Numbers"
below). The abstract ``__hash__`` method always returns 0, which
is a valid (albeit inefficient) implementation. **Invariant:** If
classes ``C1`` and ``C2`` both derive from ``Hashable``, the
invariant ``hash(o1) == hash(o2)`` must imply ``o1 == o2`` for all
instances ``o1`` of ``C1`` and all instances ``o2`` of ``C2``.
Note: being an instance of this class does not imply that an
object is immutable; e.g. a tuple containing a list as a member is
not immutable; its ``__hash__`` method raises an exception.
``Iterable``
The base class for classes defining ``__iter__``. Its abstract
``__iter__`` method returns an empty iterator.
The base class for classes defining ``__iter__``. The
``__iter__`` method should always return an instance of
``Iterator`` (see below). The abstract ``__iter__`` method
returns an empty iterator.
``Iterator``
The base class for classes defining ``__next__``. This derives
from ``Iterable``. Its abstract ``__next__`` method raises
``StopIteration``. Its ``__iter__`` method returns ``self``, and
is *not* abstract. (Note: this assumes PEP 3114 is implemented.)
from ``Iterable``. The abstract ``__next__`` method raises
``StopIteration``. The concrete ``__iter__`` method returns
``self``. (Note: this assumes PEP 3114 is implemented.)
``Finite``
The base class for classes defining ``__len__``. Its abstract
``__len__`` method returns 0. Any ``__len__`` method should
return an ``Integer`` (see "Numbers" below) >= 0. If class ``C``
derives from ``Finite`` as well as from ``Iterable``, the
invariant ``sum(1 for x in o) == len(o)`` should hold for any
The base class for classes defining ``__len__``. The ``__len__``
method should return an ``Integer`` (see "Numbers" below) >= 0.
The abstract ``__len__`` method returns 0. **Invariant:** If a
class ``C`` derives from ``Finite`` as well as from ``Iterable``,
the invariant ``sum(1 for x in o) == len(o)`` should hold for any
instance ``o`` of ``C``.
``Container``
The base class for classes defining ``__contains__``. Its
abstract ``__contains__`` method returns ``False``. Note:
strictly speaking, there are three variants of this method's
The base class for classes defining ``__contains__``. The
``__contains__` method should return a ``bool``.` The abstract
``__contains__`` method returns ``False``. **Invariant:** If a
class ``C`` derives from ``Container`` as well as from
``Iterable``, ``(x in o for x in o)`` should be a generator
yielding only True values for any instance ``o`` of ``C``.
Note: strictly speaking, there are three variants of this method's
semantics. The first one is for sets and mappings, which is fast:
O(1) or O(log N). The second one is for membership checking on
sequences, which is slow: O(N). The third one is for subsequence
@ -232,22 +244,44 @@ These abstract classes represent single methods like ``__iter__`` or
Would it make sense to distinguish these? The signature of the
third variant is different, since it takes a sequence (typically
of the same type as the method's target) intead of an element.
For now, I'm using the same type for all three.
For now, I'm using the same type for all three. This means that
is is possible for ``x in o`` to be True even though ``x`` is
never yielded by ``iter(o)``.
Sets
''''
These abstract classes represent various stages of "set-ness".
These abstract classes represent various stages of "set-ness". The
most fundamental set operation is the membership test, written as ``x
in s`` and implemented by ``s.__contains__(x)``. This is already
taken care of by the `Container`` class defined above. Therefore, we
define a set as finite, iterable container for which certain
invariants from mathematical set theory hold.
The built-in type ``set`` derives from ``MutableSet``. The built-in
type ``frozenset`` derives from ``HashableSet``.
You might wonder why we require a set to be finite -- surely certain
infinite sets can be represented just fine in Python. For example,
the set of even integers could be defined like this::
class EvenIntegers(Container):
def __contains__(self, x):
return x % 2 == 0
However, such sets have rather limited practical value, and deciding
whether one such set is a subset of another would be difficult in
general without using a symbolic algebra package. So I consider this
out of the scope of a pragmatic proposal like this.
``Set``
This is a finite, iterable container, i.e. a subclass of
``Finite``, ``Iterable`` and ``Container``. Not every subset of
those three classes is a set though! Sets have the additional
property (though it is not expressed in code) that each element
occurs only once (as can be determined by iteration), and in
addition sets implement rich comparisons defined as
subclass/superclass tests.
invariant that each element occurs only once (as can be determined
by iteration), and in addition sets implement rich comparisons
defined as subclass/superclass tests.
Sets with different implementations can be compared safely,
efficiently and correctly. Because ``Set`` derives from
@ -255,23 +289,24 @@ These abstract classes represent various stages of "set-ness".
immediately if two sets of unequal length are compared.
Similarly, ``__le__`` returns ``False`` immediately if the first
set has more members than the second set. Note that set inclusion
implements only a partial ordering; e.g. {1, 2} and {1, 3} are not
ordered (all three of ``<``, ``==`` and ``>`` return ``False`` for
these arguments). Sets cannot be ordered relative to mappings or
sequences, but they can be compared for equality (and then they
always compare unequal).
implements only a partial ordering; e.g. ``{1, 2}`` and ``{1, 3}``
are not ordered (all three of ``<``, ``==`` and ``>`` return
``False`` for these arguments). Sets cannot be ordered relative
to mappings or sequences, but they can be compared for equality
(and then they always compare unequal).
XXX Should we also implement the ``issubset`` and ``issuperset``
methods found on the set type in Python 2 (which are apparently
just aliases for ``__le__`` and ``__ge__``)?
**Open issue:** Should we also implement the ``issubset`` and
``issuperset`` methods found on the set type in Python 2? As
these are just aliases for ``__le__`` and ``__ge__``, I'm tempted
to leave these out.
XXX Should this class also implement union, intersection,
symmetric and asymmetric difference and/or the corresponding
operators? The problem with those (unlike the comparison
operators) is what should be the type of the return value. I'm
tentatively leaving these out -- if you need them, you can test
for a ``Set`` instance that implements e.g. ``__and__``. Some
alternatives: make these abstract methods (even though the
**Open issue:** Should this class also implement the operators
and/or methods that compute union, intersection, symmetric and
asymmetric difference? The problem with those (unlike the
comparison operators) is what should be the type of the return
value. I'm tentatively leaving these out -- if you need them, you
can test for a ``Set`` instance that implements e.g. ``__and__``.
Some alternatives: make these abstract methods (even though the
semantics apart from the type are well-defined); or make them
concrete methods that return a specific concrete set type; or make
them concrete methods that assume the class constructor always
@ -299,7 +334,7 @@ These abstract classes represent various stages of "set-ness".
``.add(x)``
Abstract method that adds the element ``x``, if it isn't
already in the set.
``.remove(x)``
Abstract method that removes the element ``x``; raises
@ -314,22 +349,25 @@ These abstract classes represent various stages of "set-ness".
Abstract method that empties the set. (Making this concrete
would just add a slow, cumbersome default implementation.)
XXX Should we support all the operations implemented by the Python
2 ``set`` type? I.e. union, update, __or__, __ror__, __ior__,
intersection, intersection_update, __and__, __rand__, __iand__,
difference, difference_update, __xor__, __rxor__, __ixor__,
symmetric_difference, symmetric_difference_update, __sub__,
__rsub__, __isub__. Note that in Python 2, ``a.update(b)`` is not
exactly the same as ``a |= b``, since ``update()`` takes any
iterable for an argument, while ``|=`` requires another set;
similar for the other operators.
**Open issue:** Should we support all the operations implemented
by the Python 2 ``set`` type? I.e. union, update, __or__,
__ror__, __ior__, intersection, intersection_update, __and__,
__rand__, __iand__, difference, difference_update, __xor__,
__rxor__, __ixor__, symmetric_difference,
symmetric_difference_update, __sub__, __rsub__, __isub__. Note
that in Python 2, ``a.update(b)`` is not exactly the same as ``a
|= b``, since ``update()`` takes any iterable for an argument,
while ``|=`` requires another set; similar for the other
operators.
Mappings
''''''''
These abstract classes represent various stages of mapping-ness.
The built-in type ``dict`` derives from ``MutableMapping``.
XXX Do we need BasicMapping and IterableMapping? Perhaps we should
just start with Mapping.
@ -387,6 +425,10 @@ Sequences
These abstract classes represent various stages of sequence-ness.
The built-in ``list`` and ``bytes`` types derive from
``MutableSequence``. The built-in ``tuple`` and ``str`` types derive
from ``HashableSequence``.
``Sequence``
A subclass of ``Iterable``, ``Finite``, ``Container``. It
defines a new abstract method ``__getitem__`` that has a