Update and finalize PEP 218 (builtin set types):
* List the additional methods and operators that are supported.
* List differences between sets.py and the built-in types.
* Mark the {-} syntax as rejected by Guido until Python 3000.
* Note that genexps make set comprehensions moot.
* Mark the pep as final and implemented.
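
For orientation, the behaviour summarized in the commit message can be sketched as below. This is an illustrative example rather than text from the PEP: the variable names are invented, and it assumes a Python version (2.4 or later) in which set and frozenset are built in.

```python
# Illustrative sketch only; all variable names are invented for this example.

S = set([1, 2, 3])

# A generator expression passed to set() covers the set-comprehension use case.
squares = set(x**2 for x in S)

A = set("abc")
B = set("bcd")

# Operators listed in the PEP.
union = A | B
intersection = A & B
sym_diff = A ^ B
difference = A - B

# Subset and superset tests, as operators and as methods.
is_subset = set("ab") <= A
is_superset = A.issuperset("ab")

# With no "{-}" syntax, the empty set is spelled set().
empty = set()

# frozenset is hashable, so it can be a set element or a dictionary key.
nested = set([frozenset([1, 2]), frozenset([3])])
lookup = {frozenset("ab"): "pair"}

# update() is the built-in spelling of what sets.py called union_update().
merged = A.copy()
merged.update(B)
```

A mutable set cannot itself be a set element, which is why the nested example wraps the inner collections in frozenset.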
parent a584e413a0
commit d324b2fcff
|
@ -73,7 +73,6 @@ Index by Category
|
||||||
I 206 2.0 Batteries Included Zadka
|
I 206 2.0 Batteries Included Zadka
|
||||||
S 209 Adding Multidimensional Arrays Barrett, Oliphant
|
S 209 Adding Multidimensional Arrays Barrett, Oliphant
|
||||||
S 215 String Interpolation Yee
|
S 215 String Interpolation Yee
|
||||||
S 218 Adding a Built-In Set Object Type Wilson
|
|
||||||
S 228 Reworking Python's Numeric Model Zadka, GvR
|
S 228 Reworking Python's Numeric Model Zadka, GvR
|
||||||
S 237 Unifying Long Integers and Integers Zadka, GvR
|
S 237 Unifying Long Integers and Integers Zadka, GvR
|
||||||
S 239 Adding a Rational Type to Python Craig, Zadka
|
S 239 Adding a Rational Type to Python Craig, Zadka
|
||||||
|
@ -137,6 +136,7 @@ Index by Category
|
||||||
SF 208 Reworking the Coercion Model Schemenauer, Lemburg
|
SF 208 Reworking the Coercion Model Schemenauer, Lemburg
|
||||||
SF 214 Extended Print Statement Warsaw
|
SF 214 Extended Print Statement Warsaw
|
||||||
SF 217 Display Hook for Interactive Use Zadka
|
SF 217 Display Hook for Interactive Use Zadka
|
||||||
|
SF 218 Adding a Built-In Set Object Type Wilson, Hettinger
|
||||||
SF 221 Import As Wouters
|
SF 221 Import As Wouters
|
||||||
SF 223 Change the Meaning of \x Escapes Peters
|
SF 223 Change the Meaning of \x Escapes Peters
|
||||||
I 226 Python 2.1 Release Schedule Hylton
|
I 226 Python 2.1 Release Schedule Hylton
|
||||||
|
@ -247,7 +247,7 @@ Numerical Index
|
||||||
SD 215 String Interpolation Yee
|
SD 215 String Interpolation Yee
|
||||||
IR 216 Docstring Format Zadka
|
IR 216 Docstring Format Zadka
|
||||||
SF 217 Display Hook for Interactive Use Zadka
|
SF 217 Display Hook for Interactive Use Zadka
|
||||||
S 218 Adding a Built-In Set Object Type Wilson
|
SF 218 Adding a Built-In Set Object Type Wilson, Hettinger
|
||||||
SD 219 Stackless Python McMillan
|
SD 219 Stackless Python McMillan
|
||||||
ID 220 Coroutines, Generators, Continuations McMillan
|
ID 220 Coroutines, Generators, Continuations McMillan
|
||||||
SF 221 Import As Wouters
|
SF 221 Import As Wouters
|
||||||
|
|
128
pep-0218.txt
|
@ -1,8 +1,8 @@
|
||||||
PEP: 218
|
PEP: 218
|
||||||
Title: Adding a Built-In Set Object Type
|
Title: Adding a Built-In Set Object Type
|
||||||
Version: $Revision$
|
Version: $Revision$
|
||||||
Author: gvwilson@ddj.com (Greg Wilson)
|
Author: gvwilson at ddj.com (Greg Wilson), python at rcn.com (Raymond Hettinger)
|
||||||
Status: Draft
|
Status: Final
|
||||||
Type: Standards Track
|
Type: Standards Track
|
||||||
Python-Version: 2.2
|
Python-Version: 2.2
|
||||||
Created: 31-Jul-2000
|
Created: 31-Jul-2000
|
||||||
|
@ -16,11 +16,9 @@ Introduction
|
||||||
module is widely used. After explaining why sets are desirable,
|
module is widely used. After explaining why sets are desirable,
|
||||||
and why the common idiom of using dictionaries in their place is
|
and why the common idiom of using dictionaries in their place is
|
||||||
inadequate, we describe how we intend built-in sets to work, and
|
inadequate, we describe how we intend built-in sets to work, and
|
||||||
then how the preliminary Set module will behave. The penultimate
|
then how the preliminary Set module will behave. The last
|
||||||
section discusses the mutability (or otherwise) of sets and set
|
section discusses the mutability (or otherwise) of sets and set
|
||||||
elements, and the solution which the Set module will implement.
|
elements, and the solution which the Set module will implement.
|
||||||
The last section then looks at alternatives that were considered,
|
|
||||||
but discarded.
|
|
||||||
|
|
||||||
|
|
||||||
Rationale
|
Rationale
|
||||||
|
@ -45,21 +43,12 @@ Rationale
|
||||||
dictionaries containing key/value pairs.
|
dictionaries containing key/value pairs.
|
||||||
|
|
||||||
|
|
||||||
Long-Term Proposal
|
Proposal
|
||||||
|
|
||||||
The long-term goal of this PEP is to add a built-in set type to
|
The long-term goal of this PEP is to add a built-in set type to
|
||||||
Python. This type will be an unordered collection of unique
|
Python. This type will be an unordered collection of unique
|
||||||
values, just as a dictionary is an unordered collection of
|
values, just as a dictionary is an unordered collection of
|
||||||
key/value pairs. Constant sets will be represented using the
|
key/value pairs.
|
||||||
usual mathematical notation, so that "{1, 2, 3}" will be a set of
|
|
||||||
three integers.
|
|
||||||
|
|
||||||
In order to avoid ambiguity, the empty set will be written "{-}",
|
|
||||||
rather than "{}" (which is already used to represent empty
|
|
||||||
dictionaries). We feel that this notation is as reasonable as the
|
|
||||||
use of "(3,)" to represent single-element tuples; a more radical
|
|
||||||
strategy is discussed in the "Alternatives" section, and more
|
|
||||||
readable than the earlier proposal "{,}".
|
|
||||||
|
|
||||||
Iteration and comprehension will be implemented in the obvious
|
Iteration and comprehension will be implemented in the obvious
|
||||||
ways, so that:
|
ways, so that:
|
||||||
|
@ -68,7 +57,7 @@ Long-Term Proposal
|
||||||
|
|
||||||
will step through the elements of S in arbitrary order, while:
|
will step through the elements of S in arbitrary order, while:
|
||||||
|
|
||||||
{x**2 for x in S}
|
set(x**2 for x in S)
|
||||||
|
|
||||||
will produce a set containing the squares of all elements in S,
|
will produce a set containing the squares of all elements in S,
|
||||||
Membership will be tested using "in" and "not in", and basic set
|
Membership will be tested using "in" and "not in", and basic set
|
||||||
|
@ -79,6 +68,9 @@ Long-Term Proposal
|
||||||
& intersection
|
& intersection
|
||||||
^ symmetric difference
|
^ symmetric difference
|
||||||
- asymmetric difference
|
- asymmetric difference
|
||||||
|
== != equality and inequality tests
|
||||||
|
< <= >= > subset and superset tests
|
||||||
|
|
||||||
|
|
||||||
and methods:
|
and methods:
|
||||||
|
|
||||||
|
@ -99,11 +91,21 @@ Long-Term Proposal
|
||||||
|
|
||||||
S.clear() Remove all elements from this set.
|
S.clear() Remove all elements from this set.
|
||||||
|
|
||||||
and one new built-in conversion function:
|
S.copy() Make a new set.
|
||||||
|
|
||||||
|
s.issuperset() Check for a superset relationship.
|
||||||
|
|
||||||
|
s.issubset() Check for a subset relationship.
|
||||||
|
|
||||||
|
|
||||||
|
and two new built-in conversion functions:
|
||||||
|
|
||||||
set(x) Create a set containing the elements of the
|
set(x) Create a set containing the elements of the
|
||||||
collection "x".
|
collection "x".
|
||||||
|
|
||||||
|
frozenset(x) Create an immutable set containing the elements
|
||||||
|
of the collection "x".
|
||||||
|
|
||||||
Notes:
|
Notes:
|
||||||
|
|
||||||
1. We propose using the bitwise operators "|&" for intersection
|
1. We propose using the bitwise operators "|&" for intersection
|
||||||
|
@ -117,44 +119,39 @@ Long-Term Proposal
|
||||||
of "add" will also avoid confusion between that operation and
|
of "add" will also avoid confusion between that operation and
|
||||||
set union.
|
set union.
|
||||||
|
|
||||||
3. Sets raise "LookupError" exceptions, rather than "KeyError" or
|
|
||||||
"ValueError", because set elements are neither keys nor values.
|
Set Notation
|
||||||
|
|
||||||
|
The PEP originally proposed {1,2,3} as the set notation and {-} for
|
||||||
|
the empty set. Experience with Python 2.3's sets.py showed that
|
||||||
|
the notation was not necessary. Also, there was some risk of making
|
||||||
|
dictionaries less instantly recognizable.
|
||||||
|
|
||||||
|
It was also contemplated that the braced notation would support set
|
||||||
|
comprehensions; however, Python 2.4 provided generator expressions
|
||||||
|
which fully met that need and did so in a more general way.
|
||||||
|
(See PEP 289 for details on generator expressions).
|
||||||
|
|
||||||
|
So, Guido ruled that there would not be a set syntax; however, the
|
||||||
|
issue could be revisited for Python 3000 (see PEP 3000).
|
||||||
|
|
||||||
|
|
||||||
Open Issues for the Long-Term Proposal
|
History
|
||||||
|
|
||||||
Earlier drafts of PEP 218 had only a single set type, but the
|
To gain experience with sets, a pure Python module was introduced
|
||||||
sets.py implementation in Python 2.3 has two, Set and
|
in Python 2.3. Based on that implementation, the set and frozenset
|
||||||
ImmutableSet. The long-term proposal has a single built-in
|
types were introduced in Python 2.4. The improvements are:
|
||||||
conversion function, set(iterable); how should instances of a
|
|
||||||
built-in immutable set type be created? Possibilities include a
|
|
||||||
second immutable_set() built-in, or perhaps the set() function
|
|
||||||
could take an additional argument,
|
|
||||||
e.g. set(iterable, immutable=True)?
|
|
||||||
|
|
||||||
The PEP proposes {1,2,3} as the set notation and {-} for the empty
|
* Better hash algorithm for frozensets
|
||||||
set. Would there be different syntax for an immutable and a
|
* More compact pickle format (storing only an element list
|
||||||
mutable set? Perhaps the built-in syntax would only be for
|
instead of a dictionary of key:value pairs where the value
|
||||||
mutable sets, and an immutable set would be created from a mutable
|
is always True).
|
||||||
set using the appropriate built-in function,
|
* Use a __reduce__ function so that deep copying is automatic.
|
||||||
e.g. immutable_set({1,2,3}).
|
* The BaseSet concept was eliminated.
|
||||||
|
* The union_update() method became just update().
|
||||||
|
* Auto-conversion between mutable and immutable sets was dropped.
|
||||||
Short-Term Proposal
|
* The _repr method was dropped (the need is met by the new
|
||||||
|
sorted() built-in function).
|
||||||
In order to determine whether there is enough demand for sets to
|
|
||||||
justify making them a built-in type, and to give users a chance to
|
|
||||||
try out the semantics we propose for sets, our short-term proposal
|
|
||||||
is to add a "Set" class to the standard Python library. This
|
|
||||||
class will have the operators and methods described above; it will
|
|
||||||
also have named methods corresponding to all of the operations: a
|
|
||||||
"union" method for "|", and a "union_update" method for "|=", and
|
|
||||||
so on.
|
|
||||||
|
|
||||||
This class will use a dictionary internally to contain set values.
|
|
||||||
To avoid having to duplicate values (e.g. for iteration through
|
|
||||||
the set), the class will rely on the iterators added in Python
|
|
||||||
2.2.
|
|
||||||
|
|
||||||
Tim Peters believes that the class's constructor should take a
|
Tim Peters believes that the class's constructor should take a
|
||||||
single sequence as an argument, and populate the set with that
|
single sequence as an argument, and populate the set with that
|
||||||
|
@ -173,10 +170,8 @@ Short-Term Proposal
|
||||||
|
|
||||||
>>> Set(1, 2, 3, 4) # case 2
|
>>> Set(1, 2, 3, 4) # case 2
|
||||||
|
|
||||||
On the other, other hand, if Python does adopt a dictionary-like
|
Ultimately, we adopted the first strategy in which the initializer
|
||||||
notation for sets in the future, then case 2 will become
|
takes a single iterable argument.
|
||||||
redundant. We have therefore adopted the first strategy, in which
|
|
||||||
the initializer takes a single iterable argument.
|
|
||||||
|
|
||||||
|
|
||||||
Mutability
|
Mutability
|
||||||
|
@ -188,26 +183,19 @@ Mutability
|
||||||
to be immutable, this would preclude sets of sets (which are
|
to be immutable, this would preclude sets of sets (which are
|
||||||
widely used in graph algorithms and other applications).
|
widely used in graph algorithms and other applications).
|
||||||
|
|
||||||
|
Earlier drafts of PEP 218 had only a single set type, but the
|
||||||
|
sets.py implementation in Python 2.3 has two, Set and
|
||||||
|
ImmutableSet. For Python 2.4, the new built-in types were named
|
||||||
|
set and frozenset which are slightly less cumbersome.
|
||||||
|
|
||||||
There are two classes implemented in the "sets" module. Instances
|
There are two classes implemented in the "sets" module. Instances
|
||||||
of the Set class can be modified by the addition or removal of
|
of the Set class can be modified by the addition or removal of
|
||||||
elements, and the ImmutableSet class is "frozen", with an
|
elements, and the ImmutableSet class is "frozen", with an
|
||||||
unchangeable collection of elements. Therefore, an ImmutableSet
|
unchangeable collection of elements. Therefore, an ImmutableSet
|
||||||
may be used as a dictionary key or as a set element, but cannot be
|
may be used as a dictionary key or as a set element, but cannot be
|
||||||
updated. Both types of set require that their elements are
|
updated. Both types of set require that their elements are
|
||||||
immutable, hashable objects.
|
immutable, hashable objects. Parallel comments apply to the "set"
|
||||||
|
and "frozenset" built-in types.
|
||||||
|
|
||||||
Alternatives
|
|
||||||
|
|
||||||
An alternative to the notation "{-}" for the empty set would be to
|
|
||||||
re-define "{}" to be the empty collection, rather than the empty
|
|
||||||
dictionary. Operations which made this object non-empty would
|
|
||||||
silently convert it to either a dictionary or a set; it would then
|
|
||||||
retain that type for the rest of its existence. This idea was
|
|
||||||
rejected because of its potential impact on existing Python
|
|
||||||
programs. A similar proposal to modify "dict.keys" and
|
|
||||||
"dict.values" to return sets, rather than lists, was rejected for
|
|
||||||
the same reasons.
|
|
||||||
|
|
||||||
|
|
||||||
Copyright
|
Copyright
|
||||||
|
|