From 4003d90c2d42d987f3798afd679da1d07eba5dee Mon Sep 17 00:00:00 2001 From: Barry Warsaw Date: Mon, 27 Nov 2000 04:10:06 +0000 Subject: [PATCH] Greg Wilson's latest, with spell-checking by Barry. --- pep-0218.txt | 112 ++++++++++++++++++++++++++++++++++++++++++++++++--- 1 file changed, 106 insertions(+), 6 deletions(-) diff --git a/pep-0218.txt b/pep-0218.txt index 823f9aa7d..8e7c83d07 100644 --- a/pep-0218.txt +++ b/pep-0218.txt @@ -9,14 +9,114 @@ Created: 31-Jul-2000 Post-History: -Abstract +Introduction + + This PEP proposes adding sets as a built-in type in Python. + + +Rationale Sets are a fundamental mathematical structure, and are commonly - used to both specify and implement programs. Sets are often - implemented as dictionaries with "don't care" values, but this - leaves the meaning of intersection, union, difference, and other - basic operations are ambiguous. This PEP therefore proposes - syntax and semantics for a concrete, built-in set type. + used to specify algorithms. They are much less frequently used in + implementations, even when they are the "right" structure. + Programmers frequently use lists instead, even when the ordering + information in lists is irrelevant, and by-value lookups are + frequent. (Most medium-sized C programs contain a depressing + number of start-to-end searches through malloc'd vectors to + determine whether particular items are present or not...) + + Programmers are often told that they can implement sets as + dictionaries with "don't care" values. Items can be added to + these "sets" by assigning the "don't care" value to them; + membership can be tested using "dict.has_key"; and items can be + deleted using "del". However, the three main binary operations + on sets --- union, intersection, and difference --- are not + directly supported by this representation, since their meaning is + ambiguous for dictionaries containing key/value pairs. + + +Proposal + + We propose adding a new built-in type to Python to represent sets. + This type will be an unordered collection of unique values, just + as a dictionary is an unordered collection of key/value pairs. + Constant sets will be represented using the usual mathematical + notation, so that "{1, 2, 3}" will be a set of three integers. + + In order to avoid ambiguity, the empty set will be written "{,}", + rather than "{}" (which is already used to represent empty + dictionaries). We feel that this notation is as reasonable as the + use of "(3,)" to represent single-element tuples; a more radical + alternative is discussed in the "Alternatives" section. + + Iteration and comprehension will be implemented in the obvious + ways, so that: + + for x in S: + + will step through the elements of S in arbitrary order, while: + + {x**2 for x in S} + + will produce a set containing the squares of all elements in S, + + Membership will be tested using "in" and "not in". + + The binary operators '|', '&', '-', and "^" will implement set + union, intersection, difference, and symmetric difference. Their + in-place equivalents will have the obvious semantics. + + The method "add" will add an element to a set. This is different + from set union, as the following example shows: + + >>> {1, 2, 3} | {4, 5, 6} + {1, 2, 3, 4, 5, 6} + + >>> {1, 2, 3}.add({4, 5, 6}) + {1, 2, 3, {4, 5, 6}} + + Note that we expect that items can also be added to sets using + in-place union of temporaries, i.e. "S |= {x}" instead of + "S.add(x)". + + Elements will be deleted from sets using a "remove" method, or + using "del": + + >>> S = {1, 2, 3} + >>> del S[1] + >>> S + {2, 3} + >>> S.remove(3) + {2} + + The "KeyError" exception will be raised if an attempt is made to + remove an element which is not in a set. + + A new method "dict.keyset" will return the keys of a dictionary as + a set. A corresponding method "dict.valueset" will return the + dictionary's values as a set. + + A built-in converter "set()" will convert any sequence type to a + set; converters such as "list()" and "tuple()" will be extended to + handle sets as input. + + +Alternatives + + A radical alternative to the (admittedly clumsy) notation "{,}" is + to re-define "{}" to be the empty collection, rather than the + empty dictionary. Operations which made this object non-empty + would silently convert it to either a dictionary or a set; it + would then retain that type for the rest of its existence. This + idea was rejected because of its potential impact on existing + Python programs. A similar proposal to modify "dict.keys" and + "dict.values" to return sets, rather than lists, was rejected for + the same reasons. + + +Copyright + + This document has been placed in the Public Domain.