From 66d4f339bef20b7ef8ba452dd24fee5bd7002a07 Mon Sep 17 00:00:00 2001
From: Guido van Rossum
Date: Fri, 20 Apr 2007 22:26:10 +0000
Subject: [PATCH] Work in many email responses.

---
 pep-3119.txt | 216 ++++++++++++++++++++++++++++++++-------------------
 1 file changed, 136 insertions(+), 80 deletions(-)

diff --git a/pep-3119.txt b/pep-3119.txt
index 678ccd50f..065763a78 100644
--- a/pep-3119.txt
+++ b/pep-3119.txt
@@ -39,9 +39,9 @@ makes a set", "what makes a mapping" and "what makes a sequence".
 Acknowledgements
 ----------------
 
-Talin wrote the Rationale below [1]_. For that alone he deserves
-co-authorship. But the rest of the PEP uses "I" referring to the
-first author.
+Talin wrote the Rationale below [1]_ as well as most of the section on
+ABCs vs. Interfaces. For that alone he deserves co-authorship. But
+the rest of the PEP uses "I" referring to the first author.
 
 
 Rationale
@@ -55,54 +55,58 @@ Invocation means interacting with an object by invoking its methods.
 Usually this is combined with polymorphism, so that invoking a given
 method may run different code depending on the type of an object.
 
-Inspection means the ability for external code (outside of the object's
-methods) to examine the type or properties of that object, and make
-decisions on how to treat that object based on that information.
+Inspection means the ability for external code (outside of the
+object's methods) to examine the type or properties of that object,
+and make decisions on how to treat that object based on that
+information.
 
 Both usage patterns serve the same general end, which is to be able to
 support the processing of diverse and potentially novel objects in a
 uniform way, but at the same time allowing processing decisions to be
 customized for each different type of object.
 
-In classical OOP theory, invocation is the preferred usage pattern, and
-inspection is actively discouraged, being considered a relic of an
-earlier, procedural programming style. However, in practice this view is
-simply too dogmatic and inflexible, and leads to a kind of design
-rigidity that is very much at odds with the dynamic nature of a language
-like Python.
+In classical OOP theory, invocation is the preferred usage pattern,
+and inspection is actively discouraged, being considered a relic of an
+earlier, procedural programming style. However, in practice this view
+is simply too dogmatic and inflexible, and leads to a kind of design
+rigidity that is very much at odds with the dynamic nature of a
+language like Python.
 
 In particular, there is often a need to process objects in a way that
-wasn't anticipated by the creator of the object class. It is not always
-the best solution to build in to every object methods that satisfy the
-needs of every possible user of that object. Moreover, there are many
-powerful dispatch philosophies that are in direct contrast to the
-classic OOP requirement of behavior being strictly encapsulated within
-an object, examples being rule or pattern-match driven logic.
+wasn't anticipated by the creator of the object class. It is not
+always the best solution to build in to every object methods that
+satisfy the needs of every possible user of that object. Moreover,
+there are many powerful dispatch philosophies that are in direct
+contrast to the classic OOP requirement of behavior being strictly
+encapsulated within an object, examples being rule or pattern-match
+driven logic.
 
 On the the other hand, one of the criticisms of inspection by classic
-OOP theorists is the lack of formalisms and the ad hoc nature of what is
-being inspected. In a language such as Python, in which almost any
+OOP theorists is the lack of formalisms and the ad hoc nature of what
+is being inspected. In a language such as Python, in which almost any
 aspect of an object can be reflected and directly accessed by external
 code, there are many different ways to test whether an object conforms
-to a particular protocol or not. For example, if asking 'is this object
-a mutable sequence container?', one can look for a base class of 'list',
-or one can look for a method named '__getitem__'. But note that although
-these tests may seem obvious, neither of them are correct, as one
-generates false negatives, and the other false positives.
+to a particular protocol or not. For example, if asking 'is this
+object a mutable sequence container?', one can look for a base class
+of 'list', or one can look for a method named '__getitem__'. But note
+that although these tests may seem obvious, neither of them are
+correct, as one generates false negatives, and the other false
+positives.
 
-The generally agreed-upon remedy is to standardize the tests, and group
-them into a formal arrangement. This is most easily done by associating
-with each class a set of standard testable properties, either via the
-inheritance mechanism or some other means. Each test carries with it a
-set of promises: it contains a promise about the general behavior of the
-class, and a promise as to what other class methods will be available.
+The generally agreed-upon remedy is to standardize the tests, and
+group them into a formal arrangement. This is most easily done by
+associating with each class a set of standard testable properties,
+either via the inheritance mechanism or some other means. Each test
+carries with it a set of promises: it contains a promise about the
+general behavior of the class, and a promise as to what other class
+methods will be available.
 
-This PEP proposes a particular strategy for organizing these tests known
-as Abstract Base Classes, or ABC. ABCs are simply Python classes that
-are added into an object's inheritance tree to signal certain features
-of that object to an external inspector. Tests are done using
-isinstance(), and the presence of a particular ABC means that the test
-has passed.
+This PEP proposes a particular strategy for organizing these tests
+known as Abstract Base Classes, or ABC. ABCs are simply Python
+classes that are added into an object's inheritance tree to signal
+certain features of that object to an external inspector. Tests are
+done using isinstance(), and the presence of a particular ABC means
+that the test has passed.
 
 Like all other things in Python, these promises are in the nature of a
 gentlemen's agreement - which means that the language does not attempt
@@ -203,8 +207,7 @@ One Trick Ponies
 ''''''''''''''''
 
 These abstract classes represent single methods like ``__iter__`` or
-``__len__``. The ``Iterator`` class is included as well, even though
-it has two prescribed methods.
+``__len__``.
 
 ``Hashable``
     The base class for classes defining ``__hash__``. The
@@ -221,12 +224,12 @@ it has two prescribed methods.
     never change their value (as compared by ``==``) or their hash
     value. If a class cannot guarantee this, it should not derive
     from ``Hashable``; if it cannot guarantee this for certain
-    instances only, ``__hash__`` for those instances should raise an
-    exception.
+    instances only, ``__hash__`` for those instances should raise a
+    ``TypeError`` exception.
 
     Note: being an instance of this class does not imply that an
     object is immutable; e.g. a tuple containing a list as a member is
-    not immutable; its ``__hash__`` method raises an exception.
+    not immutable; its ``__hash__`` method raises ``TypeError``.
 
 ``Iterable``
     The base class for classes defining ``__iter__``. The
@@ -252,11 +255,11 @@ it has two prescribed methods.
     too cute), ``Countable`` (the set of natural numbers is a countable
     set in math), ``Enumerable`` (sounds like a sysnonym for
     ``Iterable``), ``Dimension``, ``Extent`` (sound like numbers to
-    me).
+    me), ``Bounded`` (probably just as confusing as ``Fininte``).
 
 ``Container``
     The base class for classes defining ``__contains__``. The
-    ``__contains__`` method should return a ``bool``. The abstract
+    ``__contains__`` method should return a ``bool``. The abstract
     ``__contains__`` method returns ``False``. **Invariant:** If a
     class ``C`` derives from ``Container`` as well as from
     ``Iterable``, then ``(x in o for x in o)`` should be a generator
@@ -272,7 +275,23 @@ it has two prescribed methods.
     of the same type as the method's target) intead of an element.
     For now, I'm using the same type for all three. This means that
     is is possible for ``x in o`` to be True even though ``x`` is
-    never yielded by ``iter(o)``.
+    never yielded by ``iter(o)``. A suggested name for the third form
+    is ``Searchable``.
+
+``PartiallyOrdered``
+    This ABC defines the 4 inequality operations ``<``, ``<=``, ``>=``,
+    ``>``. (Note that ``==`` and ``!=`` are defined by ``object``.)
+    Classes deriving from this ABC should satisfy weak invariants such
+    as ``a < b < c`` implies ``a < c`` but don't require that for any
+    two instances ``x`` and ``y`` exactly one of ``x < y``, ``x == y``
+    or ``x >= y`` apply.
+
+``TotallyOrdered``
+    This ABC derives from ``PartiallyOrdered``. It adds no new
+    operations but implies a promise of stronger invariants. **Open
+    issues:** Should ``float`` derive from ``TotallyOrdered`` even
+    though for ``NaN`` this isn't strictly correct?
+
 
 
 Sets
@@ -302,12 +321,14 @@ general without using a symbolic algebra package. So I consider this
 out of the scope of a pragmatic proposal like this.
 
 ``Set``
-    This is a finite, iterable container, i.e. a subclass of
-    ``Sized``, ``Iterable`` and ``Container``. Not every subset of
-    those three classes is a set though! Sets have the additional
-    invariant that each element occurs only once (as can be determined
-    by iteration), and in addition sets define concrete operators that
-    implement rich comparisons defined as subclass/superclass tests.
+
+    This is a finite, iterable, partially ordered container, i.e. a
+    subclass of ``Sized``, ``Iterable``, ``Container`` and
+    ``PartiallyOrdered``. Not every subset of those three classes is
+    a set though! Sets have the additional invariant that each
+    element occurs only once (as can be determined by iteration), and
+    in addition sets define concrete operators that implement the
+    inequality operations as subclass/superclass tests.
 
     Sets with different implementations can be compared safely,
     efficiently and correctly. Because ``Set`` derives from
@@ -325,8 +346,7 @@ out of the scope of a pragmatic proposal like this.
     type in Python 2 are not supported, as these are mostly just
     aliases for ``__le__`` and ``__ge__``.
 
-    **Open issues:** Should I spell out the invariants and method
-    definitions?
+    **Open issues:** Spell out the invariants and method definitions.
 
 ``ComposableSet``
     This is a subclass of ``Set`` that defines abstract operators to
@@ -360,42 +380,45 @@ out of the scope of a pragmatic proposal like this.
     from Composable)?
 
 ``MutableSet``
-
     This is a subclass of ``ComposableSet`` implementing additional
     operations to add and remove elements. The supported methods have
-    the semantics known from the ``set`` type in Python 2:
+    the semantics known from the ``set`` type in Python 2 (except
+    for ``discard``, which is modeled after Java):
 
     ``.add(x)``
-        Abstract method that adds the element ``x``, if it isn't
-        already in the set.
-
-    ``.remove(x)``
-        Abstract method that removes the element ``x``; raises
-        ``KeyError`` if ``x`` is not in the set.
+        Abstract method returning a ``bool`` that adds the element
+        ``x`` if it isn't already in the set. It should return
+        ``True`` if ``x`` was added, ``False`` if it was already
+        there. The abstract implementation raises
+        ``NotImplementedError``.
 
     ``.discard(x)``
-        Concrete method that removes the element ``x`` if it is
-        a member of the set; implemented using ``__contains__``
-        and ``remove``.
+        Abstract method returning a ``bool`` that removes the element
+        ``x`` if present. It should return ``True`` if the element
+        was present and ``False`` if it wasn't. The abstract
+        implementation raises ``NotImplementedError``.
 
     ``.clear()``
-        Abstract method that empties the set. (Making this concrete
-        would just add a slow, cumbersome default implementation.)
+        Abstract method that empties the set. The abstract
+        implementation raises ``NotImplementedError``. (Making this
+        concrete would just add a slow, cumbersome default
+        implementation.)
 
     ``.pop()``
         Concrete method that removes an arbitrary item. If the set is
         empty, it raises ``KeyError``. The default implementation
         removes the first item returned by the set's iterator.
 
-    This also supports the in-place mutating operations ``|=``,
-    ``&=``, ``^=``, ``-=``. It does not support the named methods
-    that perform (almost) the same operations, like ``update``, even
-    though these don't have exactly the same rules (``update`` takes
-    any iterable, while ``|=`` requires a set).
+    ``.toggle(x)``
+        Concrete method returning a ``bool`` that adds x to the set if
+        it wasn't there, but removes it if it was there. It should
+        return ``True`` if ``x`` was added, ``False`` if it was
+        removed.
 
-    **Open issues:** Should we unify ``remove`` and ``discard``, a la
-    Java (which has a single method returning a boolean indicating
-    whether it was removed or not)?
+    This also supports the in-place mutating operations ``|=``,
+    ``&=``, ``^=``, ``-=``. These are concrete methods whose right
+    operand can be an arbitrary ``Iterable``. It does not support the
+    named methods that perform (almost) the same operations.
 
 
 Mappings
@@ -449,16 +472,19 @@ The built-in type ``dict`` derives from ``MutableMapping``.
 
 ``MutableMapping``
     A subclass of ``Mapping`` that also implements some standard
-
     mutating methods. Abstract methods include ``__setitem__``,
     ``__delitem__``, ``clear``, ``update``. Concrete methods include
     ``pop``, ``popitem``. Note: ``setdefault`` is *not* included.
 
+**Open issues:**
+
 * Do we need BasicMapping and IterableMapping? We should probably
   just start with Mapping.
 
 * We should say more about mapping view types.
 
+* Should we add the ``copy`` method?
+
 
 Sequences
 '''''''''
@@ -501,13 +527,27 @@ from ``HashableSequence``.
     instances.
 
 
+Strings
+-------
+
+Python 3000 has two built-in string types: byte strings (``bytes``),
+deriving from ``MutableSequence``, and (Unicode) character strings
+(``str``), deriving from ``HashableSequence``.
+
+**Open issues:** define the base interfaces for these so alternative
+implementations and subclasses know what they are in for. This may be
+the subject of a new PEP or PEPs (maybe PEP 358 can be co-opted for
+the ``bytes`` type).
+
+
 ABCs for Numbers
 ----------------
 
-**Open issues:** Define: Number, Complex, Real, Rational, Integer. Do
-we have a use case for Cardinal (Integer >= 0)? Do we need Index
-(converts to Integer using __index__)? Or is that just subsumed into
-Integer and should we use __index__ only at the C level?
+**Open issues:** Define: ``Number``, ``Complex``, ``Real``,
+``Rational``, ``Integer``. Maybe also ``Cardinal`` (``Integer`` >=
+0)? We probably also need ``Index``, which converts to ``Integer``
+using ``__index__``. This should probably be moved out to a separate
+PEP.
 
 
 Guidelines for Writing ABCs
@@ -572,7 +612,7 @@ pretty-printing of sets like this::
         return "{" + ... + "}" # Details left as an exercise
 
 and implementations for specific subclasses of Set could be added
-easily.
+easily.
 
 I believe ABCs also won't present any problems for RuleDispatch,
 Phillip Eby's GF implementation in PEAK [5]_.
@@ -595,6 +635,22 @@ of the work that went into e.g. defining the various shades of
 "mapping-ness" and the nomenclature could easily be adapted for a
 proposal to use Interfaces instead of ABCs.
 
+"Interfaces" in this context refers to a set of proposals for
+additional metadata elements attached to a class which are not part of
+the regular class hierarchy, but do allow for certain types of
+inheritance testing.
+
+Such metadata would be designed, at least in some proposals, so as to
+be easily mutable by an application, allowing application writers to
+override the normal classification of an object.
+
+The drawback to this idea of attaching mutable metadata to a class is
+that classes are shared state, and mutating them may lead to conflicts
+of intent. Additionally, the need to override the classification of
+an object can be done more cleanly using generic functions: In the
+simplest case, one can define a "category membership" generic function
+that simply returns False in the base implementation, and then provide
+overrides that return True for any classes of interest.
 
 Open Issues
 ===========