Greg Wilson's latest version
parent
cc08f77942
commit
4158b00536
100
pep-0218.txt
|
@ -16,38 +16,45 @@ Introduction
|
||||||
|
|
||||||
Rationale
|
Rationale
|
||||||
|
|
||||||
Sets are a fundamental mathematical structure, and are commonly
|
One of Python's greatest strengths as a teaching language is its
|
||||||
used to specify algorithms. They are much less frequently used in
|
clarity. Its syntax and object model are so clean, and so simple,
|
||||||
implementations, even when they are the "right" structure.
|
that it can serve as "executable pseudocode". Anything that makes
|
||||||
Programmers frequently use lists instead, even when the ordering
|
it even better suited for this role will help increase its use in
|
||||||
information in lists is irrelevant, and by-value lookups are
|
school and college courses.
|
||||||
frequent. (Most medium-sized C programs contain a depressing
|
|
||||||
number of start-to-end searches through malloc'd vectors to
|
Sets are a fundamental mathematical structure, and are very
|
||||||
determine whether particular items are present or not...)
|
commonly used in algorithm specifications. They are much less
|
||||||
|
frequently used in implementations, even when they are the "right"
|
||||||
|
structure. Programmers frequently use lists instead, even when
|
||||||
|
the ordering information in lists is irrelevant, and by-value
|
||||||
|
lookups are frequent. (Most medium-sized C programs contain a
|
||||||
|
depressing number of start-to-end searches through malloc'd
|
||||||
|
vectors to determine whether particular items are present or
|
||||||
|
not...)
|
||||||
|
|
||||||
Programmers are often told that they can implement sets as
|
Programmers are often told that they can implement sets as
|
||||||
dictionaries with "don't care" values. Items can be added to
|
dictionaries with "don't care" values. Items can be added to
|
||||||
these "sets" by assigning the "don't care" value to them;
|
these "sets" by assigning the "don't care" value to them;
|
||||||
membership can be tested using "dict.has_key"; and items can be
|
membership can be tested using "dict.has_key"; and items can be
|
||||||
deleted using "del". However, the three main binary operations
|
deleted using "del". However, the other main operations on sets
|
||||||
on sets --- union, intersection, and difference --- are not
|
(union, intersection, and difference) are not directly supported
|
||||||
directly supported by this representation, since their meaning is
|
by this representation, since their meaning is ambiguous for
|
||||||
ambiguous for dictionaries containing key/value pairs.
|
dictionaries containing key/value pairs.
|
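As a rough illustration of the workaround described above, the following sketch emulates a set with a dictionary whose values are ignored. It is written for present-day Python, where the "in" operator replaces "dict.has_key". The names DONT_CARE, make_dict_set, and the three helper functions are hypothetical and do not come from the PEP. Note that union, intersection, and difference each have to be written by hand:

```python
# Emulating a set with a dictionary whose values are ignored.
# All names here are illustrative assumptions, not part of the PEP.
DONT_CARE = None  # placeholder stored under every key


def make_dict_set(items):
    """Build a dictionary-backed "set" from an iterable of hashable items."""
    return {item: DONT_CARE for item in items}


def dict_set_union(a, b):
    """Keys present in either a or b."""
    result = dict(a)
    result.update(b)
    return result


def dict_set_intersection(a, b):
    """Keys present in both a and b."""
    return {key: DONT_CARE for key in a if key in b}


def dict_set_difference(a, b):
    """Keys present in a but not in b."""
    return {key: DONT_CARE for key in a if key not in b}


s = make_dict_set([1, 2, 3])
s[4] = DONT_CARE        # add an item
del s[1]                # delete an item
has_two = 2 in s        # membership test (formerly dict.has_key)
t = make_dict_set([3, 4, 5])
```

The helpers work only because every value is the same placeholder. With real key/value pairs, it would be unclear which value the result of a union or intersection should keep.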
||||||
|
|
||||||
|
|
||||||
Proposal
|
Proposal
|
||||||
|
|
||||||
We propose adding a new built-in type to Python to represent sets.
|
We propose adding a set type to Python. This type will be an
|
||||||
This type will be an unordered collection of unique values, just
|
unordered collection of unique values, just as a dictionary is an
|
||||||
as a dictionary is an unordered collection of key/value pairs.
|
unordered collection of key/value pairs. Constant sets will be
|
||||||
Constant sets will be represented using the usual mathematical
|
represented using the usual mathematical notation, so that
|
||||||
notation, so that "{1, 2, 3}" will be a set of three integers.
|
"{1, 2, 3}" will be a set of three integers.
|
||||||
|
|
||||||
In order to avoid ambiguity, the empty set will be written "{,}",
|
In order to avoid ambiguity, the empty set will be written "{,}",
|
||||||
rather than "{}" (which is already used to represent empty
|
rather than "{}" (which is already used to represent empty
|
||||||
dictionaries). We feel that this notation is as reasonable as the
|
dictionaries). We feel that this notation is as reasonable as the
|
||||||
use of "(3,)" to represent single-element tuples; a more radical
|
use of "(3,)" to represent single-element tuples; a more radical
|
||||||
alternative is discussed in the "Alternatives" section.
|
strategy is discussed in the "Alternatives" section.
|
||||||
|
|
||||||
Iteration and comprehension will be implemented in the obvious
|
Iteration and comprehension will be implemented in the obvious
|
||||||
ways, so that:
|
ways, so that:
|
||||||
|
@ -64,7 +71,10 @@ Proposal
|
||||||
|
|
||||||
The binary operators '|', '&', '-', and "^" will implement set
|
The binary operators '|', '&', '-', and "^" will implement set
|
||||||
union, intersection, difference, and symmetric difference. Their
|
union, intersection, difference, and symmetric difference. Their
|
||||||
in-place equivalents will have the obvious semantics.
|
in-place equivalents will have the obvious semantics. (We feel
|
||||||
|
that it is more sensible to overload the bitwise operators '|' and
|
||||||
|
'&', rather than the arithmetic operators '+' and '*', because
|
||||||
|
there is no arithmetic equivalent of '^'.)
|
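The operator semantics described above can be tried with the set type as it exists in present-day Python. This is a sketch under that assumption; the behaviour shipped today may differ in detail from what this revision of the PEP proposes, and the variable names are illustrative only:

```python
# Binary set operators and their in-place forms, using today's built-in set.
a = {1, 2, 3}
b = {3, 4}

union = a | b                 # elements in a or b
intersection = a & b          # elements in both a and b
difference = a - b            # elements in a but not in b
symmetric_difference = a ^ b  # elements in exactly one of a and b

# In-place equivalents update the left-hand set.
c = {1, 2, 3}
c |= {5}   # c now also contains 5
c -= {1}   # c no longer contains 1
```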
||||||
|
|
||||||
The method "add" will add an element to a set. This is different
|
The method "add" will add an element to a set. This is different
|
||||||
from set union, as the following example shows:
|
from set union, as the following example shows:
|
||||||
|
@ -83,14 +93,21 @@ Proposal
|
||||||
using "del":
|
using "del":
|
||||||
|
|
||||||
>>> S = {1, 2, 3}
|
>>> S = {1, 2, 3}
|
||||||
|
>>> S.remove(3)
|
||||||
|
>>> S
|
||||||
|
{1, 2}
|
||||||
>>> del S[1]
|
>>> del S[1]
|
||||||
>>> S
|
>>> S
|
||||||
{2, 3}
|
|
||||||
>>> S.remove(3)
|
|
||||||
{2}
|
{2}
|
||||||
|
|
||||||
The "KeyError" exception will be raised if an attempt is made to
|
The "KeyError" exception will be raised if an attempt is made to
|
||||||
remove an element which is not in a set.
|
remove an element which is not in a set. This definition of "del"
|
||||||
|
is consistent with that used for dictionaries:
|
||||||
|
|
||||||
|
>>> D = {1:2, 3:4}
|
||||||
|
>>> del D[1]
|
||||||
|
>>> D
|
||||||
|
{3: 4}
|
||||||
|
|
||||||
A new method "dict.keyset" will return the keys of a dictionary as
|
A new method "dict.keyset" will return the keys of a dictionary as
|
||||||
a set. A corresponding method "dict.valueset" will return the
|
a set. A corresponding method "dict.valueset" will return the
|
||||||
|
@ -101,8 +118,47 @@ Proposal
|
||||||
handle sets as input.
|
handle sets as input.
|
||||||
|
|
||||||
|
|
||||||
|
Open Issues
|
||||||
|
|
||||||
|
One major issue remains to be resolved: will sets be allowed to
|
||||||
|
contain mutable values, or will their values be required to be
|
||||||
|
immutable (as dictionary keys are)? The disadvantages of allowing
|
||||||
|
only immutable values are clear --- if nothing else, it would
|
||||||
|
prevent users from creating sets of sets.
|
||||||
|
|
||||||
|
However, no efficient implementation of sets of mutable values has
|
||||||
|
yet been suggested. Hashing approaches will obviously fail (which
|
||||||
|
is why mutable values are not allowed to be dictionary keys).
|
||||||
|
Even simple-minded implementations, such as storing the set's
|
||||||
|
values in a list, can give incorrect results, as the following
|
||||||
|
example shows:
|
||||||
|
|
||||||
|
>>> a = [1, 2]
|
||||||
|
>>> b = [3, 4]
|
||||||
|
>>> S = [a, b]
|
||||||
|
>>> a[0:2] = [3, 4]
|
||||||
|
>>> S
|
||||||
|
[[3, 4], [3, 4]]
|
||||||
|
|
||||||
|
One way to solve this problem would be to add observer/observable
|
||||||
|
functionality to every data structure in Python, so that
|
||||||
|
structures would know to update themselves when their contained
|
||||||
|
values mutated. This is clearly impractical given the current
|
||||||
|
code base, and the performance penalties (in both memory and
|
||||||
|
execution time) would probably be unacceptable anyway.
|
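For comparison, the sketch below shows how the trade-off looks in present-day Python, where set elements must be hashable and an immutable "frozenset" type allows sets of sets. This illustrates one possible resolution and assumes today's built-in types; it is not something this revision of the PEP specifies, and the variable names are hypothetical:

```python
# Mutable values such as lists are rejected as set elements.
error_name = None
try:
    bad = {[1, 2], [3, 4]}
except TypeError as exc:
    error_name = type(exc).__name__

# Immutable frozensets are hashable, so they can be elements of a set.
inner_a = frozenset([1, 2])
inner_b = frozenset([3, 4])
set_of_sets = {inner_a, inner_b}

# Lookup is by value: an equal frozenset built separately is found.
found = frozenset([2, 1]) in set_of_sets
```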
||||||
|
|
||||||
|
|
||||||
Alternatives
|
Alternatives
|
||||||
|
|
||||||
|
A more conservative alternative to this proposal would be to add a
|
||||||
|
new built-in class "Set", rather than adding new syntax for direct
|
||||||
|
expression of sets. On the positive side, this would not require
|
||||||
|
any changes to the Python language definition. On the negative
|
||||||
|
side, people would then not be able to write Python programs using
|
||||||
|
the same notation as they would use on a whiteboard. We feel that
|
||||||
|
the more Python supports standard pre-existing notation, the
|
||||||
|
greater the chances of it being adopted as a teaching language.
|
||||||
|
|
||||||
A radical alternative to the (admittedly clumsy) notation "{,}" is
|
A radical alternative to the (admittedly clumsy) notation "{,}" is
|
||||||
to re-define "{}" to be the empty collection, rather than the
|
to re-define "{}" to be the empty collection, rather than the
|
||||||
empty dictionary. Operations which made this object non-empty
|
empty dictionary. Operations which made this object non-empty
|
||||||
|
|