Add PEP 372, Adding an ordered directory to collections.
This commit is contained in:
parent
86de0a6dbe
commit
60cae8daed
|
@ -98,6 +98,7 @@ Index by Category
|
|||
S 364 Transitioning to the Py3K Standard Library Warsaw
|
||||
S 368 Standard image protocol and class Mastrodomenico
|
||||
S 369 Post import hooks Heimes
|
||||
S 372 Adding an ordered directory to collections Ronacher
|
||||
S 3135 New Super Spealman, Delaney
|
||||
|
||||
Finished PEPs (done, implemented in Subversion)
|
||||
|
@ -476,6 +477,7 @@ Numerical Index
|
|||
S 369 Post import hooks Heimes
|
||||
SA 370 Per user site-packages directory Heimes
|
||||
SA 371 Addition of the multiprocessing package Noller, Oudkerk
|
||||
S 372 Adding an ordered directory to collections Ronacher
|
||||
SR 666 Reject Foolish Indentation Creighton
|
||||
SR 754 IEEE 754 Floating Point Special Values Warnes
|
||||
P 3000 Python 3000 GvR
|
||||
|
@ -629,6 +631,7 @@ Owners
|
|||
Reis, Christian R. kiko@async.com.br
|
||||
Riehl, Jonathan jriehl@spaceship.com
|
||||
Roberge, André andre.roberge@gmail.com
|
||||
Ronacher, Armin armin.ronacher@active-4.com
|
||||
van Rossum, Guido (GvR) guido@python.org
|
||||
van Rossum, Just (JvR) just@letterror.com
|
||||
Sajip, Vinay vinay_sajip@red-dove.com
|
||||
|
|
|
@ -0,0 +1,234 @@
|
|||
PEP: 372
|
||||
Title: Adding an ordered directory to collections
|
||||
Version: $Revision$
|
||||
Last-Modified: $Date$
|
||||
Author: Armin Ronacher <armin.ronacher@active-4.com>
|
||||
Status: Draft
|
||||
Type: Standards Track
|
||||
Content-Type: text/x-rst
|
||||
Created: 15-Jun-2008
|
||||
Python-Version: 2.6, 3.0
|
||||
Post-History:
|
||||
|
||||
|
||||
Abstract
|
||||
========
|
||||
|
||||
This PEP proposes an ordered dictionary as a new data structure for
|
||||
the ``collections`` module, called "odict" in this PEP for short. The
|
||||
proposed API incorporates the experiences gained from working with
|
||||
similar implementations that exist in various real-world applications
|
||||
and other programming languages.
|
||||
|
||||
|
||||
Rationale
|
||||
=========
|
||||
|
||||
In current Python versions, the widely used built-in dict type does
|
||||
not specify an order for the key/value pairs stored. This makes it
|
||||
hard to use dictionaries as data storage for some specific use cases.
|
||||
|
||||
Some dynamic programming languages like PHP and Ruby 1.9 guarantee a
|
||||
certain order on iteration. In those languages, and existing Python
|
||||
ordered-dict implementations, the ordering of items is defined by the
|
||||
time of insertion of the key. New keys are appended at the end, keys
|
||||
that are overwritten and not moved.
|
||||
|
||||
The following example shows the behavior for simple assignments:
|
||||
|
||||
>>> d = odict()
|
||||
>>> d['parrot'] = 'dead'
|
||||
>>> d['penguin'] = 'exploded'
|
||||
>>> d.items()
|
||||
[('parrot', 'dead'), ('penguin', 'exploded')]
|
||||
|
||||
That the ordering is preserved makes an odict useful for a couple of
|
||||
situations:
|
||||
|
||||
- XML/HTML processing libraries currently drop the ordering of
|
||||
attributes, use a list instead of a dict which makes filtering
|
||||
cumbersome, or implement their own ordered dictionary. This affects
|
||||
ElementTree, html5lib, Genshi and many more libraries.
|
||||
|
||||
- There are many ordererd dict implementations in various libraries
|
||||
and applications, most of them subtly incompatible with each other.
|
||||
Furthermore, subclassing dict is a non-trivial task and many
|
||||
implementations don't override all the methods properly which can
|
||||
lead to unexpected results.
|
||||
|
||||
Additionally, many ordered dicts are implemented in an inefficient
|
||||
way, making many operations more complex then they have to be.
|
||||
|
||||
- PEP 3115 allows metaclasses to change the mapping object used for
|
||||
the class body. An ordered dict could be used to create ordered
|
||||
member declarations similar to C structs. This could be useful, for
|
||||
example, for future ``ctypes`` releases as well as ORMs that define
|
||||
database tables as classes, like the one the Django framework ships.
|
||||
Django currently uses an ugly hack to restore the ordering of
|
||||
members in database models.
|
||||
|
||||
- Code ported from other programming languages such as PHP often
|
||||
depends on a ordered dict. Having an implementation of an
|
||||
ordering-preserving dictionary in the standard library could ease
|
||||
the transition and improve the compatibility of different libraries.
|
||||
|
||||
|
||||
Ordered Dict API
|
||||
================
|
||||
|
||||
The ordered dict API would be mostly compatible with dict and existing
|
||||
ordered dicts. (Note: this PEP refers to the Python 2.x dictionary
|
||||
API; the transfer to the 3.x API is trivial.)
|
||||
|
||||
The constructor and ``update()`` both accept iterables of tuples as
|
||||
well as mappings like a dict does. The ordering however is preserved
|
||||
for the first case:
|
||||
|
||||
>>> d = odict([('a', 'b'), ('c', 'd')])
|
||||
>>> d.update({'foo': 'bar'})
|
||||
>>> d
|
||||
collections.odict([('a', 'b'), ('c', 'd'), ('foo', 'bar')])
|
||||
|
||||
If ordered dicts are updated from regular dicts, the ordering of new
|
||||
keys is of course undefined again unless ``sort()`` is called.
|
||||
|
||||
All iteration methods as well as ``keys()``, ``values()`` and
|
||||
``items()`` return the values ordered by the the time the key-value
|
||||
pair was inserted:
|
||||
|
||||
>>> d['spam'] = 'eggs'
|
||||
>>> d.keys()
|
||||
['a', 'c', 'foo', 'spam']
|
||||
>>> d.values()
|
||||
['b', 'd', 'bar', 'eggs']
|
||||
>>> d.items()
|
||||
[('a', 'b'), ('c', 'd'), ('foo', 'bar'), ('spam', 'eggs')]
|
||||
|
||||
New methods not available on dict:
|
||||
|
||||
``odict.byindex(index)``
|
||||
|
||||
Index-based lookup is supported by ``byindex()`` which returns
|
||||
the key/value pair for an index, that is, the "position" of a
|
||||
key in the ordered dict. 0 is the first key/value pair, -1
|
||||
the last.
|
||||
|
||||
>>> d.byindex(2)
|
||||
('foo', 'bar')
|
||||
|
||||
``odict.sort(cmp=None, key=None, reverse=False)``
|
||||
|
||||
Sorts the odict in place by cmp or key. This works exactly
|
||||
like ``list.sort()``, but the comparison functions are passed
|
||||
a key/value tuple, not only the value.
|
||||
|
||||
>>> d = odict([(42, 1), (1, 4), (23, 7)])
|
||||
>>> d.sort()
|
||||
>>> d
|
||||
collections.odict([(1, 4), (23, 7), (42, 1)])
|
||||
|
||||
``odict.reverse()``
|
||||
|
||||
Reverses the odict in place.
|
||||
|
||||
``odict.__reverse__()``
|
||||
|
||||
Supports reverse iteration by key.
|
||||
|
||||
|
||||
Questions and Answers
|
||||
=====================
|
||||
|
||||
What happens if an existing key is reassigned?
|
||||
|
||||
The key is not moved but assigned a new value in place. This is
|
||||
consistent with existing implementations and allows subclasses to
|
||||
change the behavior easily::
|
||||
|
||||
class movingcollections.odict):
|
||||
def __setitem__(self, key, value):
|
||||
self.pop(key, None)
|
||||
odict.__setitem__(self, key, value)
|
||||
|
||||
What happens if keys appear multiple times in the list passed to the
|
||||
constructor?
|
||||
|
||||
The same as for regular dicts: The latter item overrides the
|
||||
former. This has the side-effect that the position of the first
|
||||
key is used because the key is actually overwritten:
|
||||
|
||||
>>> odict([('a', 1), ('b', 2), ('a', 3)])
|
||||
collections.odict([('a', 3), ('b', 2)])
|
||||
|
||||
This behavior is consistent with existing implementations in
|
||||
Python, the PHP array and the hashmap in Ruby 1.9.
|
||||
|
||||
Why is there no ``odict.insert()``?
|
||||
|
||||
There are few situations where you really want to insert a key at
|
||||
an specified index. To avoid API complication, the proposed
|
||||
solution for this situation is creating a list of items,
|
||||
manipulating that and converting it back into an odict:
|
||||
|
||||
>>> d = odict([('a', 42), ('b', 23), ('c', 19)])
|
||||
>>> l = d.items()
|
||||
>>> l.insert(1, ('x', 0))
|
||||
>>> odict(l)
|
||||
collections.odict([('a', 42), ('x', 0), ('b', 23), ('c', 19)])
|
||||
|
||||
|
||||
Example Implementation
|
||||
======================
|
||||
|
||||
A poorly performing example implementation of the odict written in
|
||||
Python is available:
|
||||
|
||||
`odict.py <http://dev.pocoo.org/hg/sandbox/raw-file/tip/odict.py>`_
|
||||
|
||||
The version for ``collections`` should be implemented in C and use a
|
||||
linked list internally.
|
||||
|
||||
Other implementations of ordered dicts in various Python projects or
|
||||
standalone libraries, that inspired the API proposed here, are:
|
||||
|
||||
- `odict in Babel`_
|
||||
- `OrderedDict in Django`_
|
||||
- `The odict module`_
|
||||
- `ordereddict`_ (a C implementation of the odict module)
|
||||
- `StableDict`_
|
||||
- `Armin Rigo's OrderedDict`_
|
||||
|
||||
|
||||
.. _odict in Babel: http://babel.edgewall.org/browser/trunk/babel/util.py?rev=374#L178
|
||||
.. _OrderedDict in Django:
|
||||
http://code.djangoproject.com/browser/django/trunk/django/utils/datastructures.py?rev=7140#L53
|
||||
.. _The odict module: http://www.voidspace.org.uk/python/odict.html
|
||||
.. _ordereddict: http://www.xs4all.nl/~anthon/Python/ordereddict/
|
||||
.. _StableDict: http://pypi.python.org/pypi/StableDict/0.2
|
||||
.. _Armin Rigo's OrderedDict: http://codespeak.net/svn/user/arigo/hack/pyfuse/OrderedDict.py
|
||||
|
||||
|
||||
Future Directions
|
||||
=================
|
||||
|
||||
With the availability of an ordered dict in the standard library,
|
||||
other libraries may take advantage of that. For example, ElementTree
|
||||
could return odicts in the future that retain the attribute ordering
|
||||
of the source file.
|
||||
|
||||
|
||||
Copyright
|
||||
=========
|
||||
|
||||
This document has been placed in the public domain.
|
||||
|
||||
|
||||
..
|
||||
Local Variables:
|
||||
mode: indented-text
|
||||
indent-tabs-mode: nil
|
||||
sentence-end-double-space: t
|
||||
fill-column: 70
|
||||
coding: utf-8
|
||||
End:
|
||||
|
Loading…
Reference in New Issue