Join this PEP (with Armin's blessing) and move it forward

from where it's been sitting for a while.  Update the text
for Py2.7 and Py3.1.  Link to a new implementation that
sticks with the basic dict API and no new methods.
This commit is contained in:
Raymond Hettinger 2009-02-26 12:38:29 +00:00
parent afa5b4cb7f
commit 2489667c4c
1 changed files with 55 additions and 96 deletions

View File

@ -3,11 +3,12 @@ Title: Adding an ordered dictionary to collections
Version: $Revision$
Last-Modified: $Date$
Author: Armin Ronacher <armin.ronacher@active-4.com>
Raymond Hettinger <python@rcn.com>
Status: Draft
Type: Standards Track
Content-Type: text/x-rst
Created: 15-Jun-2008
Python-Version: 2.6, 3.0
Python-Version: 2.7, 3.1
Post-History:
@ -73,7 +74,7 @@ situations:
provide an ordered dictionary. [1]_
- Code ported from other programming languages such as PHP often
depends on a ordered dict. Having an implementation of an
depends on an ordered dict. Having an implementation of an
ordering-preserving dictionary in the standard library could ease
the transition and improve the compatibility of different libraries.
@ -82,12 +83,12 @@ Ordered Dict API
================
The ordered dict API would be mostly compatible with dict and existing
ordered dicts. (Note: this PEP refers to the Python 2.x dictionary
API; the transfer to the 3.x API is trivial.)
ordered dicts. Note: this PEP refers to the 2.7 and 3.0 dictionary
API as described in collections.Mapping abstract base class.
The constructor and ``update()`` both accept iterables of tuples as
well as mappings like a dict does. The ordering however is preserved
for the first case:
well as mappings like a dict does. Unlike a regular dictionary,
the insertion order is preserved.
>>> d = odict([('a', 'b'), ('c', 'd')])
>>> d.update({'foo': 'bar'})
@ -95,11 +96,11 @@ for the first case:
collections.odict([('a', 'b'), ('c', 'd'), ('foo', 'bar')])
If ordered dicts are updated from regular dicts, the ordering of new
keys is of course undefined again unless ``sort()`` is called.
keys is of course undefined.
All iteration methods as well as ``keys()``, ``values()`` and
``items()`` return the values ordered by the time the key-value pair
was inserted:
``items()`` return the values ordered by the time the key was
first inserted:
>>> d['spam'] = 'eggs'
>>> d.keys()
@ -111,80 +112,9 @@ was inserted:
New methods not available on dict:
``odict.byindex(index)``
Returns the key/value pair for an index, that is, the "position" of a key in
the ordered dict. 0 is the first key/value pair, -1 the last.
>>> d.byindex(2)
('foo', 'bar')
If there is no key for index an `IndexError` is raised. Slices are not
supported.
``odict.index(key)``
Returns the index of a key. If the key does not exist, a `ValueError` is
raised.
``odict.sort(cmp=None, key=None, reverse=False)``
Sorts the odict in place by cmp or key. This works exactly like
``list.sort()``, but the comparison functions are passed a key/value tuple,
not only the value.
>>> d = odict([(42, 1), (1, 4), (23, 7)]) d.sort() d
collections.odict([(1, 4), (23, 7), (42, 1)])
``odict.reverse()``
Reverses the odict in place.
``odict.__reversed__()``
Supports reverse iteration by key.
``odict.__eq__()`` / ``odict.__ne__()``
Compares the odict to another object. If it's compared to another
odict the ordering of items is taken into account, otherwise only
the keys and values.
``odict.__cmp__()``
Ordered dicts are sorted by their items. ``cmp(od1, od2)`` is
equivalent to ``cmp(od1.items(), od2.items())`` if both ``od1``
and ``od2`` are ordered dicts. Otherwise the regular dict comparison
kicks in.
Python 3 Version
================
The Python 3 version of the ``odict`` returns dictionary views rather
than lists for ``odict.keys()``, ``odict.values()`` and
``odict.items()``. The keys view is equivalent to a regular keys view
but supports the following extra or changed operations:
``odict_keys.__getitem__(index)``
Returns the key for an index. This is equivalent to
``odict.byindex(index)``.
``odict_keys.index(key)``
Returns the index for a key. This exists for compatibility with
the ``Sequence`` abstract base class and is equivalent to
``odict.index(key)``.
``odict_keys.__iter__()``
Has the same semantics as ``odict.__iter__()``.
``odict_keys.__reversed__()``
Has the same semantics as ``odict.__reversed__()``.
``odict_keys.__cmp__()`` / ``odict_keys.__eq__()`` /
``odict_keys.__ne__()``
Same semantics as the equivalent ``odict`` operation. E.g.: when
compared to another odict keys view the ordering is taken into
account.
Questions and Answers
=====================
@ -205,7 +135,7 @@ constructor?
The same as for regular dicts: The latter item overrides the
former. This has the side-effect that the position of the first
key is used because the key is actually overwritten:
key is used because only the value is actually overwritten:
>>> odict([('a', 1), ('b', 2), ('a', 3)])
collections.odict([('a', 3), ('b', 2)])
@ -216,7 +146,7 @@ constructor?
Why is there no ``odict.insert()``?
There are few situations where you really want to insert a key at
an specified index. To avoid API complication, the proposed
a specified index. To avoid API complication, the proposed
solution for this situation is creating a list of items,
manipulating that and converting it back into an odict:
@ -230,31 +160,60 @@ Is the ordered dict a dict subclass?
Yes. Like ``defaultdict``, ``odict`` subclasses ``dict``.
Does ``odict.pop()`` support list-like popping of items?
Does ``odict.popitem()`` return a particular key/value pair?
No. Neither ``odict.__getitem__()`` nor ``odict.pop()`` support
retrieving or deleting items by index. Slices are not supported
either. This would introduce ambiguities if integers or slice
objects are used as dict keys.
Yes. It pops-off the most recently inserted new key and its
corresponding value. This corresponds to the usual LIFO behavior
exhibited by traditional push/pop pairs. It is semantically
equivalent to ``k=list(od)[-1]; v=od[k]; del od[k]; return (k,v)``.
The actual implementation is more efficient. It is O(n log n)
on the first call, any successive calls are O(1).
Does odict support indexing, slicing, and whatnot?
As a matter of fact, ``odict`` does not implement the ``Sequence``
interface.
interface. Rather, it is a ``MutableMapping`` that remembers
the order of key insertion. The only sequence-like addition is
automatic support for ``reversed``.
Does odict support alternate sort orders such as alphabetical?
Example Implementation
======================
No. Those wanting different sort orders really need to be using another
technique. The odict is all about recording insertion order. If any
other order is of interest, then another structure (like an in-memory
dbm) is likely a better fit. It would be a mistake to try to be all
things to all users.
A poorly performing example implementation of the odict written in
Python is available:
Reference Implementation
========================
`odict.py <http://dev.pocoo.org/hg/sandbox/raw-file/tip/odict.py>`_
A proposed version is at:
The version for ``collections`` should be implemented in C and use a
linked list internally.
`OrderedDict recipe <http://code.activestate.com/recipes/576669/>`_
The proposed version has several merits:
* Strict compliance with the MutableMapping API and no new methods
so that the learning curve is near zero. It is simply a dictionary
that remembers insertion order.
* Generally good performance. The big-oh times are the same as regular
dictionaries except for the cost of a single sort prior to the
first ordered retrieval (via *__iter__* or somesuch).
* Key insertion and deletion is O(1). The work of organizing keys into
correct order is deferred to the end (instead of trying to maintain
sorted list, linked list, or btree as the dict is built-up). This
corresponds to typical use patterns (read-in ordered key/value pairs,
make modifications, and then write them back out in insertion order)
and it takes advantage of Python's highly efficient built-in sort.
* The code runs without modification on Py2.6, Py2.7, Py3.0, and Py3.1.
Other implementations of ordered dicts in various Python projects or
standalone libraries, that inspired the API proposed here, are:
- `odict in Python`_
- `odict in Babel`_
- `OrderedDict in Django`_
- `The odict module`_
@ -262,7 +221,7 @@ standalone libraries, that inspired the API proposed here, are:
- `StableDict`_
- `Armin Rigo's OrderedDict`_
.. _odict in Python: http://dev.pocoo.org/hg/sandbox/raw-file/tip/odict.py
.. _odict in Babel: http://babel.edgewall.org/browser/trunk/babel/util.py?rev=374#L178
.. _OrderedDict in Django:
http://code.djangoproject.com/browser/django/trunk/django/utils/datastructures.py?rev=7140#L53