Update and finalize PEP 218 (builtin set types):

* List the additional methods and operators that are supported.
* List differences between sets.py and the built-in types.
* Mark the {-} syntax as rejected by Guido until Python 3000.
* Note that genexps make set comprehensions moot.
* Mark the pep as final and implemented.
Raymond Hettinger 2004-08-27 20:28:58 +00:00
parent a584e413a0
commit d324b2fcff
2 changed files with 60 additions and 72 deletions


@ -73,7 +73,6 @@ Index by Category
I 206 2.0 Batteries Included Zadka I 206 2.0 Batteries Included Zadka
S 209 Adding Multidimensional Arrays Barrett, Oliphant S 209 Adding Multidimensional Arrays Barrett, Oliphant
S 215 String Interpolation Yee S 215 String Interpolation Yee
S 218 Adding a Built-In Set Object Type Wilson
S 228 Reworking Python's Numeric Model Zadka, GvR S 228 Reworking Python's Numeric Model Zadka, GvR
S 237 Unifying Long Integers and Integers Zadka, GvR S 237 Unifying Long Integers and Integers Zadka, GvR
S 239 Adding a Rational Type to Python Craig, Zadka S 239 Adding a Rational Type to Python Craig, Zadka
@ -137,6 +136,7 @@ Index by Category
SF 208 Reworking the Coercion Model Schemenauer, Lemburg SF 208 Reworking the Coercion Model Schemenauer, Lemburg
SF 214 Extended Print Statement Warsaw SF 214 Extended Print Statement Warsaw
SF 217 Display Hook for Interactive Use Zadka SF 217 Display Hook for Interactive Use Zadka
SF 218 Adding a Built-In Set Object Type Wilson, Hettinger
SF 221 Import As Wouters SF 221 Import As Wouters
SF 223 Change the Meaning of \x Escapes Peters SF 223 Change the Meaning of \x Escapes Peters
I 226 Python 2.1 Release Schedule Hylton I 226 Python 2.1 Release Schedule Hylton
@ -247,7 +247,7 @@ Numerical Index
SD 215 String Interpolation Yee SD 215 String Interpolation Yee
IR 216 Docstring Format Zadka IR 216 Docstring Format Zadka
SF 217 Display Hook for Interactive Use Zadka SF 217 Display Hook for Interactive Use Zadka
S 218 Adding a Built-In Set Object Type Wilson SF 218 Adding a Built-In Set Object Type Wilson, Hettinger
SD 219 Stackless Python McMillan SD 219 Stackless Python McMillan
ID 220 Coroutines, Generators, Continuations McMillan ID 220 Coroutines, Generators, Continuations McMillan
SF 221 Import As Wouters SF 221 Import As Wouters


@ -1,8 +1,8 @@
PEP: 218 PEP: 218
Title: Adding a Built-In Set Object Type Title: Adding a Built-In Set Object Type
Version: $Revision$ Version: $Revision$
Author: gvwilson@ddj.com (Greg Wilson) Author: gvwilson at ddj.com (Greg Wilson), python at rcn.com (Raymond Hettinger)
Status: Draft Status: Final
Type: Standards Track Type: Standards Track
Python-Version: 2.2 Python-Version: 2.2
Created: 31-Jul-2000 Created: 31-Jul-2000
@ -16,11 +16,9 @@ Introduction
module is widely used. After explaining why sets are desirable, module is widely used. After explaining why sets are desirable,
and why the common idiom of using dictionaries in their place is and why the common idiom of using dictionaries in their place is
inadequate, we describe how we intend built-in sets to work, and inadequate, we describe how we intend built-in sets to work, and
then how the preliminary Set module will behave. The penultimate then how the preliminary Set module will behave. The last
section discusses the mutability (or otherwise) of sets and set section discusses the mutability (or otherwise) of sets and set
elements, and the solution which the Set module will implement. elements, and the solution which the Set module will implement.
The last section then looks at alternatives that were considered,
but discarded.
Rationale Rationale
@ -45,21 +43,12 @@ Rationale
dictionaries containing key/value pairs. dictionaries containing key/value pairs.
Long-Term Proposal Proposal
The long-term goal of this PEP is to add a built-in set type to The long-term goal of this PEP is to add a built-in set type to
Python. This type will be an unordered collection of unique Python. This type will be an unordered collection of unique
values, just as a dictionary is an unordered collection of values, just as a dictionary is an unordered collection of
key/value pairs. Constant sets will be represented using the key/value pairs.
usual mathematical notation, so that "{1, 2, 3}" will be a set of
three integers.
In order to avoid ambiguity, the empty set will be written "{-}",
rather than "{}" (which is already used to represent empty
dictionaries). We feel that this notation is as reasonable as the
use of "(3,)" to represent single-element tuples; a more radical
strategy is discussed in the "Alternatives" section, and more
readable than the earlier proposal "{,}".
Iteration and comprehension will be implemented in the obvious Iteration and comprehension will be implemented in the obvious
ways, so that: ways, so that:
@ -68,7 +57,7 @@ Long-Term Proposal
will step through the elements of S in arbitrary order, while: will step through the elements of S in arbitrary order, while:
{x**2 for x in S} set(x**2 for x in S)
will produce a set containing the squares of all elements in S, will produce a set containing the squares of all elements in S,
Membership will be tested using "in" and "not in", and basic set Membership will be tested using "in" and "not in", and basic set
@ -79,6 +68,9 @@ Long-Term Proposal
& intersection & intersection
^ symmetric difference ^ symmetric difference
- asymmetric difference - asymmetric difference
== != equality and inequality tests
< <= >= > subset and superset tests
and methods: and methods:
@ -99,11 +91,21 @@ Long-Term Proposal
S.clear() Remove all elements from this set. S.clear() Remove all elements from this set.
and one new built-in conversion function: S.copy() Make a new set.
s.issuperset() Check for a superset relationship.
s.issubset() Check for a subset relationship.
and two new built-in conversion functions:
set(x) Create a set containing the elements of the set(x) Create a set containing the elements of the
collection "x". collection "x".
frozenset(x) Create an immutable set containing the elements
of the collection "x".
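
For illustration only (this sketch is not text from the PEP), the operators, methods, and the two constructors listed above can be exercised as follows. The variable names are made up for the example, and the code assumes a Python version that provides the built-in set and frozenset types.

```python
# Illustrative sketch; variable names are hypothetical.
a = set([1, 2, 3])
b = set([3, 4])

union = a | b                 # elements in either set
intersection = a & b          # elements in both sets
symmetric = a ^ b             # elements in exactly one set
difference = a - b            # elements of a that are not in b

is_subset = set([1, 2]) <= a           # subset test
is_proper_superset = a > set([1, 2])   # proper superset test
same = a == set([3, 2, 1])             # equality ignores ordering

frozen = frozenset(a)         # immutable set with the same elements
duplicate = a.copy()          # new, independent mutable set
duplicate.add(99)             # does not affect a
```

The named methods issubset() and issuperset() express the same relationships as "<=" and ">=".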
Notes: Notes:
1. We propose using the bitwise operators "|&" for intersection 1. We propose using the bitwise operators "|&" for intersection
@ -117,44 +119,39 @@ Long-Term Proposal
of "add" will also avoid confusion between that operation and of "add" will also avoid confusion between that operation and
set union. set union.
3. Sets raise "LookupError" exceptions, rather than "KeyError" or
"ValueError", because set elements are neither keys nor values. Set Notation
The PEP originally proposed {1,2,3} as the set notation and {-} for
the empty set. Experience with Python 2.3's sets.py showed that
the notation was not necessary. Also, there was some risk of making
dictionaries less instantly recognizable.
It was also contemplated that the braced notation would support set
comprehensions; however, Python 2.4 provided generator expressions
which fully met that need and did so in a more general way.
(See PEP 289 for details on generator expressions).
So, Guido ruled that there would not be a set syntax; however, the
issue could be revisited for Python 3000 (see PEP 3000).
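
As a hedged illustration (not part of the PEP text), the generator-expression spelling that replaced the proposed braced comprehension, and the way an empty set is created without a dedicated literal, might look like this. The names S, squares, and empty are chosen for the example.

```python
# Illustrative sketch; names are hypothetical.
S = set([1, 2, 3, -3])

# A generator expression passed to set() does the job that the
# proposed "{x**2 for x in S}" notation was meant to do.
squares = set(x**2 for x in S)

# With no set syntax, the empty set is spelled as a call;
# "{}" continues to mean an empty dictionary.
empty = set()
empty_dict = {}
```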
Open Issues for the Long-Term Proposal History
Earlier drafts of PEP 218 had only a single set type, but the To gain experience with sets, a pure Python module was introduced
sets.py implementation in Python 2.3 has two, Set and in Python 2.3. Based on that implementation, the set and frozenset
ImmutableSet. The long-term proposal has a single built-in types were introduced in Python 2.4. The improvements are:
conversion function, set(iterable); how should instances of a
built-in immutable set type be created? Possibilities include a
second immutable_set() built-in, or perhaps the set() function
could take an additional argument,
e.g. set(iterable, immutable=True)?
The PEP proposes {1,2,3} as the set notation and {-} for the empty * Better hash algorithm for frozensets
set. Would there be different syntax for an immutable and a * More compact pickle format (storing only an element list
mutable set? Perhaps the built-in syntax would only be for instead of a dictionary of key:value pairs where the value
mutable sets, and an immutable set would be created from a mutable is always True).
set using the appropriate built-in function, * Use a __reduce__ function so that deep copying is automatic.
e.g. immutable_set({1,2,3}). * The BaseSet concept was eliminated.
* The union_update() method became just update().
* Auto-conversion between mutable and immutable sets was dropped.
Short-Term Proposal * The _repr method was dropped (the need is met by the new
sorted() built-in function).
In order to determine whether there is enough demand for sets to
justify making them a built-in type, and to give users a chance to
try out the semantics we propose for sets, our short-term proposal
is to add a "Set" class to the standard Python library. This
class will have the operators and methods described above; it will
also have named methods corresponding to all of the operations: a
"union" method for "|", and a "union_update" method for "|=", and
so on.
This class will use a dictionary internally to contain set values.
To avoid having to duplicate values (e.g. for iteration through
the set), the class will rely on the iterators added in Python
2.2.
Tim Peters believes that the class's constructor should take a Tim Peters believes that the class's constructor should take a
single sequence as an argument, and populate the set with that single sequence as an argument, and populate the set with that
@ -173,10 +170,8 @@ Short-Term Proposal
>>> Set(1, 2, 3, 4) # case 2 >>> Set(1, 2, 3, 4) # case 2
On the other, other hand, if Python does adopt a dictionary-like Ultimately, we adopted the first strategy in which the initializer
notation for sets in the future, then case 2 will become takes a single iterable argument.
redundant. We have therefore adopted the first strategy, in which
the initializer takes a single iterable argument.
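
A small sketch of the adopted strategy, shown here with the built-in set type rather than the sets.Set class discussed in the text (an assumption made so the example runs without the sets module): the constructor accepts one iterable, and passing the elements as separate arguments is rejected.

```python
# Illustrative sketch; names are hypothetical.
from_list = set([1, 2, 3, 4])    # case 1: a single iterable argument
from_string = set("abca")        # any iterable works; duplicates collapse

try:
    set(1, 2, 3, 4)              # case 2: separate arguments
except TypeError:
    rejected = True              # the constructor takes at most one argument
else:
    rejected = False
```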
Mutability Mutability
@ -188,26 +183,19 @@ Mutability
to be immutable, this would preclude sets of sets (which are to be immutable, this would preclude sets of sets (which are
widely used in graph algorithms and other applications). widely used in graph algorithms and other applications).
Earlier drafts of PEP 218 had only a single set type, but the
sets.py implementation in Python 2.3 has two, Set and
ImmutableSet. For Python 2.4, the new built-in types were named
set and frozenset which are slightly less cumbersome.
There are two classes implemented in the "sets" module. Instances There are two classes implemented in the "sets" module. Instances
of the Set class can be modified by the addition or removal of of the Set class can be modified by the addition or removal of
elements, and the ImmutableSet class is "frozen", with an elements, and the ImmutableSet class is "frozen", with an
unchangeable collection of elements. Therefore, an ImmutableSet unchangeable collection of elements. Therefore, an ImmutableSet
may be used as a dictionary key or as a set element, but cannot be may be used as a dictionary key or as a set element, but cannot be
updated. Both types of set require that their elements are updated. Both types of set require that their elements are
immutable, hashable objects. immutable, hashable objects. Parallel comments apply to the "set"
and "frozenset" built-in types.
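
The following sketch (an illustration, not text from the PEP) shows the consequence described above for the built-in types: a frozenset can serve as a set element or dictionary key, while a mutable set cannot, and a frozenset offers no mutating methods. All names are invented for the example.

```python
# Illustrative sketch; names are hypothetical.
inner = frozenset([1, 2])

# A frozenset is hashable, so it can be a set element or a dict key.
set_of_sets = set([inner, frozenset([3])])
lookup = {inner: "some value"}

# A mutable set is not hashable, so it cannot be nested directly.
try:
    set([set([1, 2])])
except TypeError:
    mutable_is_unhashable = True
else:
    mutable_is_unhashable = False

# A frozenset cannot be updated; it has no add() method at all.
try:
    inner.add(3)
except AttributeError:
    frozen_is_immutable = True
else:
    frozen_is_immutable = False
```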
Alternatives
An alternative to the notation "{-}" for the empty set would be to
re-define "{}" to be the empty collection, rather than the empty
dictionary. Operations which made this object non-empty would
silently convert it to either a dictionary or a set; it would then
retain that type for the rest of its existence. This idea was
rejected because of its potential impact on existing Python
programs. A similar proposal to modify "dict.keys" and
"dict.values" to return sets, rather than lists, was rejected for
the same reasons.
Copyright Copyright