2000-07-31 16:20:31 -04:00
|
|
|
|
PEP: 218
|
|
|
|
|
Title: Adding a Built-In Set Object Type
|
|
|
|
|
Version: $Revision$
|
|
|
|
|
Author: gvwilson@nevex.com (Greg Wilson)
|
|
|
|
|
Status: Draft
|
2000-08-23 01:51:13 -04:00
|
|
|
|
Type: Standards Track
|
|
|
|
|
Python-Version: 2.1
|
2000-07-31 16:20:31 -04:00
|
|
|
|
Created: 31-Jul-2000
|
|
|
|
|
Post-History:
|
|
|
|
|
|
|
|
|
|
|
2000-11-26 23:10:06 -05:00
|
|
|
|
Introduction
|
|
|
|
|
|
|
|
|
|
This PEP proposes adding sets as a built-in type in Python.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Rationale
|
2000-07-31 16:20:31 -04:00
|
|
|
|
|
|
|
|
|
Sets are a fundamental mathematical structure, and are commonly
|
2000-11-26 23:10:06 -05:00
|
|
|
|
used to specify algorithms. They are much less frequently used in
|
|
|
|
|
implementations, even when they are the "right" structure.
|
|
|
|
|
Programmers frequently use lists instead, even when the ordering
|
|
|
|
|
information in lists is irrelevant, and by-value lookups are
|
|
|
|
|
frequent. (Most medium-sized C programs contain a depressing
|
|
|
|
|
number of start-to-end searches through malloc'd vectors to
|
|
|
|
|
determine whether particular items are present or not...)
|
|
|
|
|
|
|
|
|
|
Programmers are often told that they can implement sets as
|
|
|
|
|
dictionaries with "don't care" values. Items can be added to
|
|
|
|
|
these "sets" by assigning the "don't care" value to them;
|
|
|
|
|
membership can be tested using "dict.has_key"; and items can be
|
|
|
|
|
deleted using "del". However, the three main binary operations
|
|
|
|
|
on sets --- union, intersection, and difference --- are not
|
|
|
|
|
directly supported by this representation, since their meaning is
|
|
|
|
|
ambiguous for dictionaries containing key/value pairs.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Proposal
|
|
|
|
|
|
|
|
|
|
We propose adding a new built-in type to Python to represent sets.
|
|
|
|
|
This type will be an unordered collection of unique values, just
|
|
|
|
|
as a dictionary is an unordered collection of key/value pairs.
|
|
|
|
|
Constant sets will be represented using the usual mathematical
|
|
|
|
|
notation, so that "{1, 2, 3}" will be a set of three integers.
|
|
|
|
|
|
|
|
|
|
In order to avoid ambiguity, the empty set will be written "{,}",
|
|
|
|
|
rather than "{}" (which is already used to represent empty
|
|
|
|
|
dictionaries). We feel that this notation is as reasonable as the
|
|
|
|
|
use of "(3,)" to represent single-element tuples; a more radical
|
|
|
|
|
alternative is discussed in the "Alternatives" section.
|
|
|
|
|
|
|
|
|
|
Iteration and comprehension will be implemented in the obvious
|
|
|
|
|
ways, so that:
|
|
|
|
|
|
|
|
|
|
for x in S:
|
|
|
|
|
|
|
|
|
|
will step through the elements of S in arbitrary order, while:
|
|
|
|
|
|
|
|
|
|
{x**2 for x in S}
|
|
|
|
|
|
|
|
|
|
will produce a set containing the squares of all elements in S,
|
|
|
|
|
|
|
|
|
|
Membership will be tested using "in" and "not in".
|
|
|
|
|
|
|
|
|
|
The binary operators '|', '&', '-', and "^" will implement set
|
|
|
|
|
union, intersection, difference, and symmetric difference. Their
|
|
|
|
|
in-place equivalents will have the obvious semantics.
|
|
|
|
|
|
|
|
|
|
The method "add" will add an element to a set. This is different
|
|
|
|
|
from set union, as the following example shows:
|
|
|
|
|
|
|
|
|
|
>>> {1, 2, 3} | {4, 5, 6}
|
|
|
|
|
{1, 2, 3, 4, 5, 6}
|
|
|
|
|
|
|
|
|
|
>>> {1, 2, 3}.add({4, 5, 6})
|
|
|
|
|
{1, 2, 3, {4, 5, 6}}
|
|
|
|
|
|
|
|
|
|
Note that we expect that items can also be added to sets using
|
|
|
|
|
in-place union of temporaries, i.e. "S |= {x}" instead of
|
|
|
|
|
"S.add(x)".
|
|
|
|
|
|
|
|
|
|
Elements will be deleted from sets using a "remove" method, or
|
|
|
|
|
using "del":
|
|
|
|
|
|
|
|
|
|
>>> S = {1, 2, 3}
|
|
|
|
|
>>> del S[1]
|
|
|
|
|
>>> S
|
|
|
|
|
{2, 3}
|
|
|
|
|
>>> S.remove(3)
|
|
|
|
|
{2}
|
|
|
|
|
|
|
|
|
|
The "KeyError" exception will be raised if an attempt is made to
|
|
|
|
|
remove an element which is not in a set.
|
|
|
|
|
|
|
|
|
|
A new method "dict.keyset" will return the keys of a dictionary as
|
|
|
|
|
a set. A corresponding method "dict.valueset" will return the
|
|
|
|
|
dictionary's values as a set.
|
|
|
|
|
|
|
|
|
|
A built-in converter "set()" will convert any sequence type to a
|
|
|
|
|
set; converters such as "list()" and "tuple()" will be extended to
|
|
|
|
|
handle sets as input.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Alternatives
|
|
|
|
|
|
|
|
|
|
A radical alternative to the (admittedly clumsy) notation "{,}" is
|
|
|
|
|
to re-define "{}" to be the empty collection, rather than the
|
|
|
|
|
empty dictionary. Operations which made this object non-empty
|
|
|
|
|
would silently convert it to either a dictionary or a set; it
|
|
|
|
|
would then retain that type for the rest of its existence. This
|
|
|
|
|
idea was rejected because of its potential impact on existing
|
|
|
|
|
Python programs. A similar proposal to modify "dict.keys" and
|
|
|
|
|
"dict.values" to return sets, rather than lists, was rejected for
|
|
|
|
|
the same reasons.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Copyright
|
|
|
|
|
|
|
|
|
|
This document has been placed in the Public Domain.
|
2000-07-31 16:20:31 -04:00
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Local Variables:
|
|
|
|
|
mode: indented-text
|
|
|
|
|
indent-tabs-mode: nil
|
|
|
|
|
End:
|