Update and finalize PEP 218 (builtin set types):
* List the additional methods and operators that are supported.
* List differences between sets.py and the built-in types.
* Mark the {-} syntax as rejected by Guido until Python 3000.
* Note that genexps make set comprehensions moot.
* Mark the pep as final and implemented.
parent a584e413a0
commit d324b2fcff
@ -73,7 +73,6 @@ Index by Category
 I   206  2.0 Batteries Included                   Zadka
 S   209  Adding Multidimensional Arrays           Barrett, Oliphant
 S   215  String Interpolation                     Yee
 S   218  Adding a Built-In Set Object Type        Wilson
 S   228  Reworking Python's Numeric Model         Zadka, GvR
 S   237  Unifying Long Integers and Integers      Zadka, GvR
 S   239  Adding a Rational Type to Python         Craig, Zadka
@ -137,6 +136,7 @@ Index by Category
 SF  208  Reworking the Coercion Model             Schemenauer, Lemburg
 SF  214  Extended Print Statement                 Warsaw
 SF  217  Display Hook for Interactive Use         Zadka
 SF  218  Adding a Built-In Set Object Type        Wilson, Hettinger
 SF  221  Import As                                Wouters
 SF  223  Change the Meaning of \x Escapes         Peters
 I   226  Python 2.1 Release Schedule              Hylton
@ -247,7 +247,7 @@ Numerical Index
 SD  215  String Interpolation                     Yee
 IR  216  Docstring Format                         Zadka
 SF  217  Display Hook for Interactive Use         Zadka
 S   218  Adding a Built-In Set Object Type        Wilson
 SF  218  Adding a Built-In Set Object Type        Wilson, Hettinger
 SD  219  Stackless Python                         McMillan
 ID  220  Coroutines, Generators, Continuations    McMillan
 SF  221  Import As                                Wouters
128
pep-0218.txt
@ -1,8 +1,8 @@
PEP: 218
Title: Adding a Built-In Set Object Type
Version: $Revision$
Author: gvwilson@ddj.com (Greg Wilson)
Status: Draft
Author: gvwilson at ddj.com (Greg Wilson), python at rcn.com (Raymond Hettinger)
Status: Final
Type: Standards Track
Python-Version: 2.2
Created: 31-Jul-2000
@ -16,11 +16,9 @@ Introduction
module is widely used. After explaining why sets are desirable,
and why the common idiom of using dictionaries in their place is
inadequate, we describe how we intend built-in sets to work, and
then how the preliminary Set module will behave. The penultimate
then how the preliminary Set module will behave. The last
section discusses the mutability (or otherwise) of sets and set
elements, and the solution which the Set module will implement.
The last section then looks at alternatives that were considered,
but discarded.


Rationale
@ -45,21 +43,12 @@ Rationale
dictionaries containing key/value pairs.


Long-Term Proposal
Proposal

The long-term goal of this PEP is to add a built-in set type to
Python. This type will be an unordered collection of unique
values, just as a dictionary is an unordered collection of
key/value pairs. Constant sets will be represented using the
usual mathematical notation, so that "{1, 2, 3}" will be a set of
three integers.

In order to avoid ambiguity, the empty set will be written "{-}",
rather than "{}" (which is already used to represent empty
dictionaries). We feel that this notation is as reasonable as the
use of "(3,)" to represent single-element tuples; a more radical
strategy is discussed in the "Alternatives" section, and more
readable than the earlier proposal "{,}".
key/value pairs.

Iteration and comprehension will be implemented in the obvious
ways, so that:
@ -68,7 +57,7 @@ Long-Term Proposal

will step through the elements of S in arbitrary order, while:

    {x**2 for x in S}
    set(x**2 for x in S)

will produce a set containing the squares of all elements in S,
Membership will be tested using "in" and "not in", and basic set
@ -79,6 +68,9 @@ Long-Term Proposal
&               intersection
^               symmetric difference
-               asymmetric difference
== !=           equality and inequality tests
< <= >= >       subset and superset tests
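
The iteration, membership, and operator behaviour listed in these hunks can be tried with the built-in set type. This is a minimal sketch only: the sets S and T and every variable name are illustrative assumptions, not taken from the PEP.

```python
# Hypothetical example sets; the values are illustrative only.
S = set([1, 2, 3, 4])
T = set([3, 4, 5])

# A generator expression passed to set() builds the set of squares.
squares = set(x**2 for x in S)

# Binary operators from the table above.
union = S | T
intersection = S & T
symmetric_difference = S ^ T
difference = S - T

# Membership tests use "in" and "not in".
has_three = 3 in S
lacks_five = 5 not in S

# Comparison operators test equality and subset/superset relationships.
is_proper_subset = set([3, 4]) < S
is_superset = S >= set([1, 2])
same_elements = S == set([4, 3, 2, 1])
```

Equality depends only on the elements, not on the order in which they were supplied.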


and methods:

@ -99,11 +91,21 @@ Long-Term Proposal

S.clear()       Remove all elements from this set.

and one new built-in conversion function:
S.copy()        Make a new set.

s.issuperset()  Check for a superset relationship.

s.issubset()    Check for a subset relationship.


and two new built-in conversion functions:

set(x)          Create a set containing the elements of the
                collection "x".

frozenset(x)    Create an immutable set containing the elements
                of the collection "x".
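
As a rough illustration of the methods and conversion functions added in this hunk, the sketch here calls each of them once. The variable names and sample values are assumptions chosen for the example.

```python
# Hypothetical sample data for illustration.
S = set("abc")

# copy() returns a new set; changing the copy leaves S untouched.
duplicate = S.copy()
duplicate.add("d")

# issubset() and issuperset() check containment relationships.
small = set("ab")
subset_ok = small.issubset(S)
superset_ok = S.issuperset(small)

# frozenset(x) builds an immutable set; duplicate elements collapse.
frozen = frozenset([1, 2, 2, 3])

# set(x) converts any iterable, including a frozenset, to a mutable set.
mutable = set(frozen)

# clear() removes every element from the set it is called on.
duplicate.clear()
```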

Notes:

1. We propose using the bitwise operators "|&" for intersection
@ -117,44 +119,39 @@ Long-Term Proposal
of "add" will also avoid confusion between that operation and
set union.

3. Sets raise "LookupError" exceptions, rather than "KeyError" or
"ValueError", because set elements are neither keys nor values.

Set Notation

The PEP originally proposed {1,2,3} as the set notation and {-} for
the empty set. Experience with Python 2.3's sets.py showed that
the notation was not necessary. Also, there was some risk of making
dictionaries less instantly recognizable.

It was also contemplated that the braced notation would support set
comprehensions; however, Python 2.4 provided generator expressions
which fully met that need and did so in a more general way.
(See PEP 289 for details on generator expressions).

So, Guido ruled that there would not be a set syntax; however, the
issue could be revisited for Python 3000 (see PEP 3000).
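
To make the point about generator expressions concrete, the following sketch shows an empty set created without special syntax and one generator expression feeding several consumers. The names and values are illustrative assumptions.

```python
# An empty set is created by calling set(); "{}" remains an empty dict.
empty = set()
still_a_dict = {}

# Hypothetical sample set for illustration.
S = set([1, 2, 3])

# The same generator expression form works with set() and with other
# consumers, which is the sense in which it is more general.
squares = set(x**2 for x in S)
total = sum(x**2 for x in S)
ordered = sorted(x**2 for x in S)
```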


Open Issues for the Long-Term Proposal
History

Earlier drafts of PEP 218 had only a single set type, but the
sets.py implementation in Python 2.3 has two, Set and
ImmutableSet. The long-term proposal has a single built-in
conversion function, set(iterable); how should instances of a
built-in immutable set type be created? Possibilities include a
second immutable_set() built-in, or perhaps the set() function
could take an additional argument,
e.g. set(iterable, immutable=True)?
To gain experience with sets, a pure Python module was introduced
in Python 2.3. Based on that implementation, the set and frozenset
types were introduced in Python 2.4. The improvements are:

The PEP proposes {1,2,3} as the set notation and {-} for the empty
set. Would there be different syntax for an immutable and a
mutable set? Perhaps the built-in syntax would only be for
mutable sets, and an immutable set would be created from a mutable
set using the appropriate built-in function,
e.g. immutable_set({1,2,3}).


Short-Term Proposal

In order to determine whether there is enough demand for sets to
justify making them a built-in type, and to give users a chance to
try out the semantics we propose for sets, our short-term proposal
is to add a "Set" class to the standard Python library. This
class will have the operators and methods described above; it will
also have named methods corresponding to all of the operations: a
"union" method for "|", and a "union_update" method for "|=", and
so on.

This class will use a dictionary internally to contain set values.
To avoid having to duplicate values (e.g. for iteration through
the set), the class will rely on the iterators added in Python
2.2.
* Better hash algorithm for frozensets
* More compact pickle format (storing only an element list
  instead of a dictionary of key:value pairs where the value
  is always True).
* Use a __reduce__ function so that deep copying is automatic.
* The BaseSet concept was eliminated.
* The union_update() method became just update().
* Auto-conversion between mutable and immutable sets was dropped.
* The _repr method was dropped (the need is met by the new
  sorted() built-in function).
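
Several of the listed improvements are visible from user code. The sketch here exercises update(), pickling, deep copying, and sorted() on the built-in types; it does not show the internal pickle format or hash algorithm, and all names and values are illustrative assumptions.

```python
import copy
import pickle

# Hypothetical sample set for illustration.
s = set([3, 1, 2])

# update() is the built-in counterpart of union_update() in sets.py.
s.update([4, 5])

# Deep copying works on nested sets without extra support code.
nested = set([frozenset([1, 2]), frozenset([3])])
clone = copy.deepcopy(nested)

# Sets survive a pickle round trip.
restored = pickle.loads(pickle.dumps(s))

# sorted() gives a stable, ordered view for display purposes.
display = sorted(s)
```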

Tim Peters believes that the class's constructor should take a
single sequence as an argument, and populate the set with that
@ -173,10 +170,8 @@ Short-Term Proposal

    >>> Set(1, 2, 3, 4)         # case 2

On the other, other hand, if Python does adopt a dictionary-like
notation for sets in the future, then case 2 will become
redundant. We have therefore adopted the first strategy, in which
the initializer takes a single iterable argument.
Ultimately, we adopted the first strategy in which the initializer
takes a single iterable argument.
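
The adopted single-iterable initializer can be checked against the built-in set type, which behaves the same way. This is a small sketch with illustrative names; the multi-argument call is wrapped in try/except because the built-in rejects it with a TypeError.

```python
# The initializer takes one iterable argument.
from_list = set([1, 2, 3, 4])
from_string = set("abca")        # any iterable works; duplicates collapse

# Passing the elements as separate arguments is rejected.
try:
    set(1, 2, 3, 4)
    rejected = False
except TypeError:
    rejected = True
```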


Mutability
@ -188,26 +183,19 @@ Mutability
to be immutable, this would preclude sets of sets (which are
widely used in graph algorithms and other applications).

Earlier drafts of PEP 218 had only a single set type, but the
sets.py implementation in Python 2.3 has two, Set and
ImmutableSet. For Python 2.4, the new built-in types were named
set and frozenset which are slightly less cumbersome.

There are two classes implemented in the "sets" module. Instances
of the Set class can be modified by the addition or removal of
elements, and the ImmutableSet class is "frozen", with an
unchangeable collection of elements. Therefore, an ImmutableSet
may be used as a dictionary key or as a set element, but cannot be
updated. Both types of set require that their elements are
immutable, hashable objects.


Alternatives

An alternative to the notation "{-}" for the empty set would be to
re-define "{}" to be the empty collection, rather than the empty
dictionary. Operations which made this object non-empty would
silently convert it to either a dictionary or a set; it would then
retain that type for the rest of its existence. This idea was
rejected because of its potential impact on existing Python
programs. A similar proposal to modify "dict.keys" and
"dict.values" to return sets, rather than lists, was rejected for
the same reasons.
immutable, hashable objects. Parallel comments apply to the "set"
and "frozenset" built-in types.
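
The mutability rules described here can be illustrated with the built-in types. In this sketch, with names chosen only for the example, frozensets serve as set elements and dictionary keys, while a mutable set cannot be hashed and a frozenset offers no way to add elements.

```python
# Hypothetical sample frozensets for illustration.
inner_a = frozenset([1, 2])
inner_b = frozenset([2, 3])

# Frozensets are hashable, so they can be set elements and dict keys.
set_of_sets = set([inner_a, inner_b])
labels = {inner_a: "first", inner_b: "second"}

# A mutable set is not hashable, so hash() raises TypeError.
try:
    hash(set([1, 2]))
    mutable_hashable = True
except TypeError:
    mutable_hashable = False

# A frozenset cannot be updated: it has no add() method.
can_update_frozen = hasattr(inner_a, "add")
```

Lookups depend on the elements only, so an equal frozenset built separately finds the same dictionary entry.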


Copyright