Add PEP 372, Adding an ordered directory to collections.
This commit is contained in:
parent
86de0a6dbe
commit
60cae8daed
|
@ -98,6 +98,7 @@ Index by Category
|
||||||
S 364 Transitioning to the Py3K Standard Library Warsaw
|
S 364 Transitioning to the Py3K Standard Library Warsaw
|
||||||
S 368 Standard image protocol and class Mastrodomenico
|
S 368 Standard image protocol and class Mastrodomenico
|
||||||
S 369 Post import hooks Heimes
|
S 369 Post import hooks Heimes
|
||||||
|
S 372 Adding an ordered directory to collections Ronacher
|
||||||
S 3135 New Super Spealman, Delaney
|
S 3135 New Super Spealman, Delaney
|
||||||
|
|
||||||
Finished PEPs (done, implemented in Subversion)
|
Finished PEPs (done, implemented in Subversion)
|
||||||
|
@ -476,6 +477,7 @@ Numerical Index
|
||||||
S 369 Post import hooks Heimes
|
S 369 Post import hooks Heimes
|
||||||
SA 370 Per user site-packages directory Heimes
|
SA 370 Per user site-packages directory Heimes
|
||||||
SA 371 Addition of the multiprocessing package Noller, Oudkerk
|
SA 371 Addition of the multiprocessing package Noller, Oudkerk
|
||||||
|
S 372 Adding an ordered directory to collections Ronacher
|
||||||
SR 666 Reject Foolish Indentation Creighton
|
SR 666 Reject Foolish Indentation Creighton
|
||||||
SR 754 IEEE 754 Floating Point Special Values Warnes
|
SR 754 IEEE 754 Floating Point Special Values Warnes
|
||||||
P 3000 Python 3000 GvR
|
P 3000 Python 3000 GvR
|
||||||
|
@ -629,6 +631,7 @@ Owners
|
||||||
Reis, Christian R. kiko@async.com.br
|
Reis, Christian R. kiko@async.com.br
|
||||||
Riehl, Jonathan jriehl@spaceship.com
|
Riehl, Jonathan jriehl@spaceship.com
|
||||||
Roberge, André andre.roberge@gmail.com
|
Roberge, André andre.roberge@gmail.com
|
||||||
|
Ronacher, Armin armin.ronacher@active-4.com
|
||||||
van Rossum, Guido (GvR) guido@python.org
|
van Rossum, Guido (GvR) guido@python.org
|
||||||
van Rossum, Just (JvR) just@letterror.com
|
van Rossum, Just (JvR) just@letterror.com
|
||||||
Sajip, Vinay vinay_sajip@red-dove.com
|
Sajip, Vinay vinay_sajip@red-dove.com
|
||||||
|
|
|
@ -0,0 +1,234 @@
|
||||||
|
PEP: 372
|
||||||
|
Title: Adding an ordered directory to collections
|
||||||
|
Version: $Revision$
|
||||||
|
Last-Modified: $Date$
|
||||||
|
Author: Armin Ronacher <armin.ronacher@active-4.com>
|
||||||
|
Status: Draft
|
||||||
|
Type: Standards Track
|
||||||
|
Content-Type: text/x-rst
|
||||||
|
Created: 15-Jun-2008
|
||||||
|
Python-Version: 2.6, 3.0
|
||||||
|
Post-History:
|
||||||
|
|
||||||
|
|
||||||
|
Abstract
|
||||||
|
========
|
||||||
|
|
||||||
|
This PEP proposes an ordered dictionary as a new data structure for
|
||||||
|
the ``collections`` module, called "odict" in this PEP for short. The
|
||||||
|
proposed API incorporates the experiences gained from working with
|
||||||
|
similar implementations that exist in various real-world applications
|
||||||
|
and other programming languages.
|
||||||
|
|
||||||
|
|
||||||
|
Rationale
|
||||||
|
=========
|
||||||
|
|
||||||
|
In current Python versions, the widely used built-in dict type does
|
||||||
|
not specify an order for the key/value pairs stored. This makes it
|
||||||
|
hard to use dictionaries as data storage for some specific use cases.
|
||||||
|
|
||||||
|
Some dynamic programming languages like PHP and Ruby 1.9 guarantee a
|
||||||
|
certain order on iteration. In those languages, and existing Python
|
||||||
|
ordered-dict implementations, the ordering of items is defined by the
|
||||||
|
time of insertion of the key. New keys are appended at the end, keys
|
||||||
|
that are overwritten and not moved.
|
||||||
|
|
||||||
|
The following example shows the behavior for simple assignments:
|
||||||
|
|
||||||
|
>>> d = odict()
|
||||||
|
>>> d['parrot'] = 'dead'
|
||||||
|
>>> d['penguin'] = 'exploded'
|
||||||
|
>>> d.items()
|
||||||
|
[('parrot', 'dead'), ('penguin', 'exploded')]
|
||||||
|
|
||||||
|
That the ordering is preserved makes an odict useful for a couple of
|
||||||
|
situations:
|
||||||
|
|
||||||
|
- XML/HTML processing libraries currently drop the ordering of
|
||||||
|
attributes, use a list instead of a dict which makes filtering
|
||||||
|
cumbersome, or implement their own ordered dictionary. This affects
|
||||||
|
ElementTree, html5lib, Genshi and many more libraries.
|
||||||
|
|
||||||
|
- There are many ordererd dict implementations in various libraries
|
||||||
|
and applications, most of them subtly incompatible with each other.
|
||||||
|
Furthermore, subclassing dict is a non-trivial task and many
|
||||||
|
implementations don't override all the methods properly which can
|
||||||
|
lead to unexpected results.
|
||||||
|
|
||||||
|
Additionally, many ordered dicts are implemented in an inefficient
|
||||||
|
way, making many operations more complex then they have to be.
|
||||||
|
|
||||||
|
- PEP 3115 allows metaclasses to change the mapping object used for
|
||||||
|
the class body. An ordered dict could be used to create ordered
|
||||||
|
member declarations similar to C structs. This could be useful, for
|
||||||
|
example, for future ``ctypes`` releases as well as ORMs that define
|
||||||
|
database tables as classes, like the one the Django framework ships.
|
||||||
|
Django currently uses an ugly hack to restore the ordering of
|
||||||
|
members in database models.
|
||||||
|
|
||||||
|
- Code ported from other programming languages such as PHP often
|
||||||
|
depends on a ordered dict. Having an implementation of an
|
||||||
|
ordering-preserving dictionary in the standard library could ease
|
||||||
|
the transition and improve the compatibility of different libraries.
|
||||||
|
|
||||||
|
|
||||||
|
Ordered Dict API
|
||||||
|
================
|
||||||
|
|
||||||
|
The ordered dict API would be mostly compatible with dict and existing
|
||||||
|
ordered dicts. (Note: this PEP refers to the Python 2.x dictionary
|
||||||
|
API; the transfer to the 3.x API is trivial.)
|
||||||
|
|
||||||
|
The constructor and ``update()`` both accept iterables of tuples as
|
||||||
|
well as mappings like a dict does. The ordering however is preserved
|
||||||
|
for the first case:
|
||||||
|
|
||||||
|
>>> d = odict([('a', 'b'), ('c', 'd')])
|
||||||
|
>>> d.update({'foo': 'bar'})
|
||||||
|
>>> d
|
||||||
|
collections.odict([('a', 'b'), ('c', 'd'), ('foo', 'bar')])
|
||||||
|
|
||||||
|
If ordered dicts are updated from regular dicts, the ordering of new
|
||||||
|
keys is of course undefined again unless ``sort()`` is called.
|
||||||
|
|
||||||
|
All iteration methods as well as ``keys()``, ``values()`` and
|
||||||
|
``items()`` return the values ordered by the the time the key-value
|
||||||
|
pair was inserted:
|
||||||
|
|
||||||
|
>>> d['spam'] = 'eggs'
|
||||||
|
>>> d.keys()
|
||||||
|
['a', 'c', 'foo', 'spam']
|
||||||
|
>>> d.values()
|
||||||
|
['b', 'd', 'bar', 'eggs']
|
||||||
|
>>> d.items()
|
||||||
|
[('a', 'b'), ('c', 'd'), ('foo', 'bar'), ('spam', 'eggs')]
|
||||||
|
|
||||||
|
New methods not available on dict:
|
||||||
|
|
||||||
|
``odict.byindex(index)``
|
||||||
|
|
||||||
|
Index-based lookup is supported by ``byindex()`` which returns
|
||||||
|
the key/value pair for an index, that is, the "position" of a
|
||||||
|
key in the ordered dict. 0 is the first key/value pair, -1
|
||||||
|
the last.
|
||||||
|
|
||||||
|
>>> d.byindex(2)
|
||||||
|
('foo', 'bar')
|
||||||
|
|
||||||
|
``odict.sort(cmp=None, key=None, reverse=False)``
|
||||||
|
|
||||||
|
Sorts the odict in place by cmp or key. This works exactly
|
||||||
|
like ``list.sort()``, but the comparison functions are passed
|
||||||
|
a key/value tuple, not only the value.
|
||||||
|
|
||||||
|
>>> d = odict([(42, 1), (1, 4), (23, 7)])
|
||||||
|
>>> d.sort()
|
||||||
|
>>> d
|
||||||
|
collections.odict([(1, 4), (23, 7), (42, 1)])
|
||||||
|
|
||||||
|
``odict.reverse()``
|
||||||
|
|
||||||
|
Reverses the odict in place.
|
||||||
|
|
||||||
|
``odict.__reverse__()``
|
||||||
|
|
||||||
|
Supports reverse iteration by key.
|
||||||
|
|
||||||
|
|
||||||
|
Questions and Answers
|
||||||
|
=====================
|
||||||
|
|
||||||
|
What happens if an existing key is reassigned?
|
||||||
|
|
||||||
|
The key is not moved but assigned a new value in place. This is
|
||||||
|
consistent with existing implementations and allows subclasses to
|
||||||
|
change the behavior easily::
|
||||||
|
|
||||||
|
class movingcollections.odict):
|
||||||
|
def __setitem__(self, key, value):
|
||||||
|
self.pop(key, None)
|
||||||
|
odict.__setitem__(self, key, value)
|
||||||
|
|
||||||
|
What happens if keys appear multiple times in the list passed to the
|
||||||
|
constructor?
|
||||||
|
|
||||||
|
The same as for regular dicts: The latter item overrides the
|
||||||
|
former. This has the side-effect that the position of the first
|
||||||
|
key is used because the key is actually overwritten:
|
||||||
|
|
||||||
|
>>> odict([('a', 1), ('b', 2), ('a', 3)])
|
||||||
|
collections.odict([('a', 3), ('b', 2)])
|
||||||
|
|
||||||
|
This behavior is consistent with existing implementations in
|
||||||
|
Python, the PHP array and the hashmap in Ruby 1.9.
|
||||||
|
|
||||||
|
Why is there no ``odict.insert()``?
|
||||||
|
|
||||||
|
There are few situations where you really want to insert a key at
|
||||||
|
an specified index. To avoid API complication, the proposed
|
||||||
|
solution for this situation is creating a list of items,
|
||||||
|
manipulating that and converting it back into an odict:
|
||||||
|
|
||||||
|
>>> d = odict([('a', 42), ('b', 23), ('c', 19)])
|
||||||
|
>>> l = d.items()
|
||||||
|
>>> l.insert(1, ('x', 0))
|
||||||
|
>>> odict(l)
|
||||||
|
collections.odict([('a', 42), ('x', 0), ('b', 23), ('c', 19)])
|
||||||
|
|
||||||
|
|
||||||
|
Example Implementation
|
||||||
|
======================
|
||||||
|
|
||||||
|
A poorly performing example implementation of the odict written in
|
||||||
|
Python is available:
|
||||||
|
|
||||||
|
`odict.py <http://dev.pocoo.org/hg/sandbox/raw-file/tip/odict.py>`_
|
||||||
|
|
||||||
|
The version for ``collections`` should be implemented in C and use a
|
||||||
|
linked list internally.
|
||||||
|
|
||||||
|
Other implementations of ordered dicts in various Python projects or
|
||||||
|
standalone libraries, that inspired the API proposed here, are:
|
||||||
|
|
||||||
|
- `odict in Babel`_
|
||||||
|
- `OrderedDict in Django`_
|
||||||
|
- `The odict module`_
|
||||||
|
- `ordereddict`_ (a C implementation of the odict module)
|
||||||
|
- `StableDict`_
|
||||||
|
- `Armin Rigo's OrderedDict`_
|
||||||
|
|
||||||
|
|
||||||
|
.. _odict in Babel: http://babel.edgewall.org/browser/trunk/babel/util.py?rev=374#L178
|
||||||
|
.. _OrderedDict in Django:
|
||||||
|
http://code.djangoproject.com/browser/django/trunk/django/utils/datastructures.py?rev=7140#L53
|
||||||
|
.. _The odict module: http://www.voidspace.org.uk/python/odict.html
|
||||||
|
.. _ordereddict: http://www.xs4all.nl/~anthon/Python/ordereddict/
|
||||||
|
.. _StableDict: http://pypi.python.org/pypi/StableDict/0.2
|
||||||
|
.. _Armin Rigo's OrderedDict: http://codespeak.net/svn/user/arigo/hack/pyfuse/OrderedDict.py
|
||||||
|
|
||||||
|
|
||||||
|
Future Directions
|
||||||
|
=================
|
||||||
|
|
||||||
|
With the availability of an ordered dict in the standard library,
|
||||||
|
other libraries may take advantage of that. For example, ElementTree
|
||||||
|
could return odicts in the future that retain the attribute ordering
|
||||||
|
of the source file.
|
||||||
|
|
||||||
|
|
||||||
|
Copyright
|
||||||
|
=========
|
||||||
|
|
||||||
|
This document has been placed in the public domain.
|
||||||
|
|
||||||
|
|
||||||
|
..
|
||||||
|
Local Variables:
|
||||||
|
mode: indented-text
|
||||||
|
indent-tabs-mode: nil
|
||||||
|
sentence-end-double-space: t
|
||||||
|
fill-column: 70
|
||||||
|
coding: utf-8
|
||||||
|
End:
|
||||||
|
|
Loading…
Reference in New Issue