Wrote the real PEP (mostly).
This commit is contained in:
parent
c599c8e2ef
commit
9be72a8133
238
pep-3106.txt
238
pep-3106.txt
|
@ -11,5 +11,241 @@ Post-History:
|
|||
|
||||
|
||||
Abstract
|
||||
========
|
||||
|
||||
Stub to reserve the PEP number.
|
||||
This PEP proposes to change the .keys(), .values() and .items()
|
||||
methods of the built-in dict type to return a set-like or
|
||||
multiset-like object whose contents are derived of the underlying
|
||||
dictionary rather than a list which is a copy of the keys, etc.; and
|
||||
to remove the .iterkeys(), .itervalues() and .iteritems() methods.
|
||||
|
||||
The approach is inspired by that taken in the Java Collections
|
||||
Framework [1]_.
|
||||
|
||||
Introduction
|
||||
============
|
||||
|
||||
It has long been the plan to change the .keys(), .values() and
|
||||
.items() methods of the built-in dict type to return a more
|
||||
lightweight object than a list, and to get rid of .iterkeys(),
|
||||
.itervalues() and .iteritems(). The idea is that code that currently
|
||||
(in 2.x) reads::
|
||||
|
||||
for x in d.iterkeys(): ...
|
||||
|
||||
should be rewritten as::
|
||||
|
||||
for x in d.keys(): ...
|
||||
|
||||
and code that currently reads::
|
||||
|
||||
a = d.keys() # assume we really want a list here
|
||||
|
||||
should be rewritten as
|
||||
|
||||
a = list(d.keys())
|
||||
|
||||
There are (at least) two ways to accomplish this. The original plan
|
||||
was to simply let .keys(), .values() and .items() return an iterator,
|
||||
i.e. exactly what iterkeys(), itervalues() and iteritems() return
|
||||
in Python 2.x. However, the Java Collections Framework [1]_ suggests
|
||||
that a better solution is possible: the methods return objects with
|
||||
set behavior (for .keys() and .items()) or multiset behavior (for
|
||||
.values()) that do not contain copies of the keys, values or items,
|
||||
but rather reference the underlying dict and pull their values out of
|
||||
the dict as needed.
|
||||
|
||||
The advantage of this approach is that one can still write code like
|
||||
this::
|
||||
|
||||
a = d.keys()
|
||||
for x in a: ...
|
||||
for x in a: ...
|
||||
|
||||
Effectively, iter(d.keys()) in Python 3.0 does what d.iterkeys() does
|
||||
in Python 2.x; but in most contexts we don't have to write the iter()
|
||||
call because it is implied by a for-loop.
|
||||
|
||||
The objects returned by the .keys() and .items() methods behave like
|
||||
sets with limited mutability; the allow removing elements, but not
|
||||
adding them. Removing an item from these sets removes it from the
|
||||
underlying dict. The object returned by the values() method behaves
|
||||
like a multiset (Java calls this a Collection). It does not allow
|
||||
removing elements, because a value might occur multiple times and the
|
||||
implementation wouldn't know which key to remove from the underlying
|
||||
dict. (The Java Collections Framework has a way around this by
|
||||
removing from an iterator, but I see no practical use case for that
|
||||
functionality.)
|
||||
|
||||
Because of the set behavior, it will be possible to check whether two
|
||||
dicts have the same keys by simply testing::
|
||||
|
||||
if a.keys() == b.keys(): ...
|
||||
|
||||
and similarly for values. (Two multisets are deemed equal if they
|
||||
have the same elements with the same cardinalities,
|
||||
e.g. the multiset {1, 2, 2} is equal to the multiset {2, 1, 2} but
|
||||
differs from the multiset {1, 2}.)
|
||||
|
||||
These operations are thread-safe only to the extent that using them in
|
||||
a thread-unsafe way may cause an exception but will not cause
|
||||
corruption of the internal representation.
|
||||
|
||||
As in Python 2.x, mutating a dict while iterating over it using an
|
||||
iterator has an undefined effect and will in most cases raise a
|
||||
RuntimeError exception. (This is similar to the guarantees made by
|
||||
the Java Collections Framework.)
|
||||
|
||||
The objects returned by .keys() and .items() are fully interoperable
|
||||
with instances of the built-in set and frozenset types; for example::
|
||||
|
||||
set(d.keys()) == d.keys()
|
||||
|
||||
is guaranteed to be True (except when d is being modified
|
||||
simultaneously by another thread).
|
||||
|
||||
|
||||
Specification
|
||||
=============
|
||||
|
||||
I'll try pseudo-code to specify the semantics::
|
||||
|
||||
class dict:
|
||||
|
||||
# Omitting all other dict methods for brevity
|
||||
|
||||
def keys(self):
|
||||
return d_keys(self)
|
||||
|
||||
def items(self):
|
||||
return d_items(self)
|
||||
|
||||
def values(self):
|
||||
return d_values(self)
|
||||
|
||||
class d_keys:
|
||||
|
||||
def __init__(self, d):
|
||||
self.__d = d
|
||||
|
||||
def __len__(self):
|
||||
return len(self.__d)
|
||||
|
||||
def __contains__(self, key):
|
||||
return key in self.__d
|
||||
|
||||
def __iter__(self):
|
||||
for key in self.__d:
|
||||
yield key
|
||||
|
||||
def remove(self, key):
|
||||
del self.__d[key]
|
||||
|
||||
def discard(self, key):
|
||||
if key in self:
|
||||
self.remove(key)
|
||||
|
||||
def pop(self):
|
||||
return self.__d.popitem()[0]
|
||||
|
||||
def clear(self):
|
||||
self.__d.clear()
|
||||
|
||||
def copy(self):
|
||||
return set(self)
|
||||
|
||||
# The following operations should be implemented to be
|
||||
# compatible with sets; this can be done by exploiting
|
||||
# the above primitive operations:
|
||||
#
|
||||
# <, <=, ==, !=, >=, > (returning a bool)
|
||||
# &, |, ^, - (returning a new, real set object)
|
||||
# &=, -= (updating in place and returning self; but not |=, ^=)
|
||||
#
|
||||
# as well as their method counterparts (.union(), etc.).
|
||||
|
||||
class d_items:
|
||||
|
||||
def __init__(self, d):
|
||||
self.__d = d
|
||||
|
||||
def __len__(self):
|
||||
return len(self.__d)
|
||||
|
||||
def __contains__(self, (key, value)):
|
||||
return key in self.__d and self.__d[key] == value
|
||||
|
||||
def __iter__(self):
|
||||
for key in self.__d:
|
||||
yield key, self.__d[key]
|
||||
|
||||
def remove(self, (key, value)):
|
||||
del self.__d[key]
|
||||
|
||||
def discard(self, item):
|
||||
if item in self:
|
||||
self.remove(item)
|
||||
|
||||
def pop(self):
|
||||
return self.__d.popitem()
|
||||
|
||||
def clear(self):
|
||||
self.__d.clear()
|
||||
|
||||
def copy(self):
|
||||
return set(self)
|
||||
|
||||
# As well as the same set operations as mentioned for d_keys above.
|
||||
|
||||
class d_values:
|
||||
|
||||
def __init__(self, d):
|
||||
self.__d = d
|
||||
|
||||
def __len__(self):
|
||||
return len(self.__d)
|
||||
|
||||
def __contains__(self, value):
|
||||
# Slow! Do we even want to implement this?
|
||||
for v in self:
|
||||
if v == value:
|
||||
return True
|
||||
return False
|
||||
|
||||
def __iter__(self):
|
||||
for key in self.__d:
|
||||
yield self.__d[key]
|
||||
|
||||
# Do we care about the following?
|
||||
|
||||
def pop(self):
|
||||
return self.__d.popitem()[1]
|
||||
|
||||
def clear(self):
|
||||
return self.__d.clear()
|
||||
|
||||
def copy(self):
|
||||
# XXX What should this return?
|
||||
|
||||
# Should we bother implementing set-like operations on
|
||||
# multisets? If so, how about mixed operations on sets and
|
||||
# multisets? I'm not sure that these are worth the effort.
|
||||
|
||||
I'm soliciting better names than d_keys, d_values and d_items; these
|
||||
classes will be public so that their implementations may be reused by
|
||||
the .keys(), .values() and .items() methods of other mappings. (Or
|
||||
should they?)
|
||||
|
||||
|
||||
Open Issues
|
||||
===========
|
||||
|
||||
Should the d_keys, d_values and d_items classes be reusable? Should
|
||||
they be subclassable?
|
||||
|
||||
|
||||
References
|
||||
==========
|
||||
|
||||
.. [1] Java Collections Framework
|
||||
http://java.sun.com/docs/books/tutorial/collections/index.html
|
||||
|
|
Loading…
Reference in New Issue