357 lines
15 KiB
Plaintext
357 lines
15 KiB
Plaintext
|
PEP: 3128
|
|||
|
Title: BList: A Faster List-like Type
|
|||
|
Version: $Revision$
|
|||
|
Last-Modified: $Date$
|
|||
|
Author: Daniel Stutzbach <daniel@stutzbachenterprises.com>
|
|||
|
Discussions-To: Python 3000 List <python-3000@python.org>
|
|||
|
Status: Draft
|
|||
|
Type: Standards Track
|
|||
|
Content-Type: text/x-rst
|
|||
|
Created: 30-Apr-2007
|
|||
|
Python-Version: 2.6 and/or 3.0
|
|||
|
Post-History: 30-Apr-2007
|
|||
|
|
|||
|
|
|||
|
Abstract
|
|||
|
========
|
|||
|
|
|||
|
The common case for list operations is on small lists. The current
|
|||
|
array-based list implementation excels at small lists due to the
|
|||
|
strong locality of reference and infrequency of memory allocation
|
|||
|
operations. However, an array takes O(n) time to insert and delete
|
|||
|
elements, which can become problematic as the list gets large.
|
|||
|
|
|||
|
This PEP introduces a new data type, the BList, that has array-like
|
|||
|
and tree-like aspects. It enjoys the same good performance on small
|
|||
|
lists as the existing array-based implementation, but offers superior
|
|||
|
asymptotic performance for most operations. This PEP proposes
|
|||
|
replacing the makes two mutually exclusive proposals for including the
|
|||
|
BList type in Python:
|
|||
|
|
|||
|
1. Add it to the collections module, or
|
|||
|
2. Replace the existing list type
|
|||
|
|
|||
|
|
|||
|
Motivation
|
|||
|
==========
|
|||
|
|
|||
|
The BList grew out of the frustration of needing to rewrite intuitive
|
|||
|
algorithms that worked fine for small inputs but took O(n**2) time for
|
|||
|
large inputs due to the underlying O(n) behavior of array-based lists.
|
|||
|
The deque type, introduced in Python 2.4, solved the most common
|
|||
|
problem of needing a fast FIFO queue. However, the deque type doesn't
|
|||
|
help if we need to repeatedly insert or delete elements from the
|
|||
|
middle of a long list.
|
|||
|
|
|||
|
A wide variety of data structure provide good asymptotic performance
|
|||
|
for insertions and deletions, but they either have O(n) performance
|
|||
|
for other operations (e.g., linked lists) or have inferior performance
|
|||
|
for small lists (e.g., binary trees and skip lists).
|
|||
|
|
|||
|
The BList type proposed in this PEP is based on the principles of
|
|||
|
B+Trees, which have array-like and tree-like aspects. The BList
|
|||
|
offers array-like performance on small lists, while offering O(log n)
|
|||
|
asymptotic performance for all insert and delete operations.
|
|||
|
Additionally, the BList implements copy-on-write under-the-hood, so
|
|||
|
even operations like getslice take O(log n) time. The table below
|
|||
|
compares the asymptotic performance of the current array-based list
|
|||
|
implementation with the asymptotic performance of the BList.
|
|||
|
|
|||
|
========= ================ ====================
|
|||
|
Operation Array-based list BList
|
|||
|
========= ================ ====================
|
|||
|
Copy O(n) **O(1)**
|
|||
|
Append **O(1)** O(log n)
|
|||
|
Insert O(n) **O(log n)**
|
|||
|
Get Item **O(1)** O(log n)
|
|||
|
Set Item **O(1)** **O(log n)**
|
|||
|
Del Item O(n) **O(log n)**
|
|||
|
Iteration O(n) O(n)
|
|||
|
Get Slice O(k) **O(log n)**
|
|||
|
Del Slice O(n) **O(log n)**
|
|||
|
Set Slice O(n+k) **O(log k + log n)**
|
|||
|
Extend O(k) **O(log k + log n)**
|
|||
|
Sort O(n log n) O(n log n)
|
|||
|
Multiply O(nk) **O(log k)**
|
|||
|
========= ================ ====================
|
|||
|
|
|||
|
An extensive empirical comparison of Python's array-based list and the
|
|||
|
BList are available at [2]_.
|
|||
|
|
|||
|
Use Case Trade-offs
|
|||
|
===================
|
|||
|
|
|||
|
The BList offers superior performance for many, but not all,
|
|||
|
operations. Choosing the correct data type for a particular use case
|
|||
|
depends on which operations are used. Choosing the correct data type
|
|||
|
as a built-in depends on balancing the importance of different use
|
|||
|
cases and the magnitude of the performance differences.
|
|||
|
|
|||
|
For the common uses cases of small lists, the array-based list and the
|
|||
|
BList have similar performance characteristics.
|
|||
|
|
|||
|
For the slightly less common case of large lists, there are two common
|
|||
|
uses cases where the existing array-based list outperforms the
|
|||
|
existing BList reference implementation. These are:
|
|||
|
|
|||
|
1. A large LIFO stack, where there are many .append() and .pop(-1)
|
|||
|
operations. Each operation is O(1) for an array-based list, but
|
|||
|
O(log n) for the BList.
|
|||
|
|
|||
|
2. A large list that does not change size. The getitem and setitem
|
|||
|
calls are O(1) for an array-based list, but O(log n) for the BList.
|
|||
|
|
|||
|
In performance tests on a 10,000 element list, BLists exhibited a 50%
|
|||
|
and 5% increase in execution time for these two uses cases,
|
|||
|
respectively.
|
|||
|
|
|||
|
The performance for the LIFO use case could be improved to O(n) time,
|
|||
|
by caching a pointer to the right-most leaf within the root node. For
|
|||
|
lists that do not change size, the common case of sequential access
|
|||
|
could also be improved to O(n) time via caching in the root node.
|
|||
|
However, the performance of these approaches has not been empirically
|
|||
|
tested.
|
|||
|
|
|||
|
Many operations exhibit a tremendous speed-up (O(n) to O(log n)) when
|
|||
|
switching from the array-based list to BLists. In performance tests
|
|||
|
on a 10,000 element list, operations such as getslice, setslice, and
|
|||
|
FIFO-style insert and deletes on a BList take only 1% of the time
|
|||
|
needed on array-based lists.
|
|||
|
|
|||
|
In light of the large performance speed-ups for many operations, the
|
|||
|
small performance costs for some operations will be worthwhile for
|
|||
|
many (but not all) applications.
|
|||
|
|
|||
|
Implementation
|
|||
|
==============
|
|||
|
|
|||
|
The BList is based on the B+Tree data structure. The BList is a wide,
|
|||
|
bushy tree where each node contains an array of up to 128 pointers to
|
|||
|
its children. If the node is a leaf, its children are the
|
|||
|
user-visible objects that the user has placed in the list. If node is
|
|||
|
not a leaf, its children are other BList nodes that are not
|
|||
|
user-visible. If the list contains only a few elements, they will all
|
|||
|
be a children of single node that is both the root and a leaf. Since
|
|||
|
a node is little more than array of pointers, small lists operate in
|
|||
|
effectively the same way as an array-based data type and share the
|
|||
|
same good performance characteristics.
|
|||
|
|
|||
|
The BList maintains a few invariants to ensure good (O(log n))
|
|||
|
asymptotic performance regardless of the sequence of insert and delete
|
|||
|
operations. The principle invariants are as follows:
|
|||
|
|
|||
|
1. Each node has at most 128 children.
|
|||
|
2. Each non-root node has at least 64 children.
|
|||
|
3. The root node has at least 2 children, unless the list contains
|
|||
|
fewer than 2 elements.
|
|||
|
4. The tree is of uniform depth.
|
|||
|
|
|||
|
If an insert would cause a node to exceed 128 children, the node
|
|||
|
spawns a sibling and transfers half of its children to the sibling.
|
|||
|
The sibling is inserted into the node's parent. If the node is the
|
|||
|
root node (and thus has no parent), a new parent is created and the
|
|||
|
depth of the tree increases by one.
|
|||
|
|
|||
|
If a deletion would cause a node to have fewer than 64 children, the
|
|||
|
node moves elements from one of its siblings if possible. If both of
|
|||
|
its siblings also only have 64 children, then two of the nodes merge
|
|||
|
and the empty one is removed from its parent. If the root node is
|
|||
|
reduced to only one child, its single child becomes the new root
|
|||
|
(i.e., the depth of the tree is reduced by one).
|
|||
|
|
|||
|
In addition to tree-like asymptotic performance and array-like
|
|||
|
performance on small-lists, BLists support transparent
|
|||
|
**copy-on-write**. If a non-root node needs to be copied (as part of
|
|||
|
a getslice, copy, setslice, etc.), the node is shared between multiple
|
|||
|
parents instead of being copied. If it needs to be modified later, it
|
|||
|
will be copied at that time. This is completely behind-the-scenes;
|
|||
|
from the user's point of view, the BList works just like a regular
|
|||
|
Python list.
|
|||
|
|
|||
|
Memory Usage
|
|||
|
============
|
|||
|
|
|||
|
In the worst case, the leaf nodes of a BList have only 64 children
|
|||
|
each, rather than a full 128, meaning that memory usage is around
|
|||
|
twice that of a best-case array implementation. Non-leaf nodes use up
|
|||
|
a negligible amount of additional memory, since there are at least 63
|
|||
|
times as many leaf nodes as non-leaf nodes.
|
|||
|
|
|||
|
The existing array-based list implementation must grow and shrink as
|
|||
|
items are added and removed. To be efficient, it grows and shrinks
|
|||
|
only when the list has grow or shrunk exponentially. In the worst
|
|||
|
case, it, too, uses twice as much memory as the best case.
|
|||
|
|
|||
|
In summary, the BList's memory footprint is not significantly
|
|||
|
different from the existing array-based implementation.
|
|||
|
|
|||
|
Backwards Compatibility
|
|||
|
=======================
|
|||
|
|
|||
|
If the BList is added to the collections module, backwards
|
|||
|
compatibility is not an issue. This section focuses on the option of
|
|||
|
replacing the existing array-based list with the BList. For users of
|
|||
|
the Python interpreter, a BList has an identical interface to the
|
|||
|
current list-implementation. For virtually all operations, the
|
|||
|
behavior is identical, aside from execution speed.
|
|||
|
|
|||
|
For the C API, BList has a different interface than the existing
|
|||
|
list-implementation. Due to its more complex structure, the BList
|
|||
|
does not lend itself well to poking and prodding by external sources.
|
|||
|
Thankfully, the existing list-implementation defines an API of
|
|||
|
functions and macros for accessing data from list objects. Google
|
|||
|
Code Search suggests that the majority of third-party modules uses the
|
|||
|
well-defined API rather than relying on the list's structure
|
|||
|
directly. The table below summarizes the search queries and results:
|
|||
|
|
|||
|
======================== =================
|
|||
|
Search String Number of Results
|
|||
|
======================== =================
|
|||
|
PyList_GetItem 2,000
|
|||
|
PySequence_GetItem 800
|
|||
|
PySequence_Fast_GET_ITEM 100
|
|||
|
PyList_GET_ITEM 400
|
|||
|
\[^a\-zA\-Z\_\]ob_item 100
|
|||
|
======================== =================
|
|||
|
|
|||
|
|
|||
|
This can be achieved in one of two ways:
|
|||
|
|
|||
|
1. Redefine the various accessor functions and macros in listobject.h
|
|||
|
to access a BList instead. The interface would be unchanged. The
|
|||
|
functions can easily be redefined. The macros need a bit more care
|
|||
|
and would have to resort to function calls for large lists.
|
|||
|
|
|||
|
The macros would need to evaluate their arguments more than once,
|
|||
|
which could be a problem if the arguments have side effects. A
|
|||
|
Google Code Search for "PyList_GET_ITEM\(\[^)\]+\(" found only a
|
|||
|
handful of cases where this occurs, so the impact appears to be
|
|||
|
low.
|
|||
|
|
|||
|
The few extension modules that use list's undocumented structure
|
|||
|
directly, instead of using the API, would break. The core code
|
|||
|
itself uses the accessor macros fairly consistently and should be
|
|||
|
easy to port.
|
|||
|
|
|||
|
2. Deprecate the existing list type, but continue to include it.
|
|||
|
Extension modules wishing to use the new BList type must do so
|
|||
|
explicitly. The BList C interface can be changed to match the
|
|||
|
existing PyList interface so that a simple search-replace will be
|
|||
|
sufficient for 99% of module writers.
|
|||
|
|
|||
|
Existing modules would continue to compile and work without change,
|
|||
|
but they would need to make a deliberate (but small) effort to
|
|||
|
migrate to the BList.
|
|||
|
|
|||
|
The downside of this approach is that mixing modules that use
|
|||
|
BLists and array-based lists might lead to slow down if conversions
|
|||
|
are frequently necessary.
|
|||
|
|
|||
|
Reference Implementation
|
|||
|
========================
|
|||
|
|
|||
|
A reference implementations of the BList is available for CPython at [1]_.
|
|||
|
|
|||
|
The source package also includes a pure Python implementation,
|
|||
|
originally developed as a prototype for the CPython version.
|
|||
|
Naturally, the pure Python version is rather slow and the asymptotic
|
|||
|
improvements don't win out until the list is quite large.
|
|||
|
|
|||
|
When compiled with Py_DEBUG, the C implementation checks the
|
|||
|
BList invariants when entering and exiting most functions.
|
|||
|
|
|||
|
An extensive set of test cases is also included in the source package.
|
|||
|
The test cases include the existing Python sequence and list test
|
|||
|
cases as a subset. When the interpreter is built with Py_DEBUG, the
|
|||
|
test cases also check for reference leaks.
|
|||
|
|
|||
|
Porting to Other Python Variants
|
|||
|
--------------------------------
|
|||
|
|
|||
|
If the BList is added to the collections module, other Python variants
|
|||
|
can support it in one of three ways:
|
|||
|
|
|||
|
1. Make blist an alias for list. The asymptotic performance won't be
|
|||
|
as good, but it'll work.
|
|||
|
2. Use the pure Python reference implementation. The performance for
|
|||
|
small lists won't be as good, but it'll work.
|
|||
|
3. Port the reference implementation.
|
|||
|
|
|||
|
Discussion
|
|||
|
==========
|
|||
|
|
|||
|
This proposal has been discussed briefly on the Python-3000 mailing
|
|||
|
list [3]_. Although a number of people favored the proposal, there
|
|||
|
were also some objections. Below summarizes the pros and cons as
|
|||
|
observed by posters to the thread.
|
|||
|
|
|||
|
General comments:
|
|||
|
|
|||
|
- Pro: Will outperform the array-based list in most cases
|
|||
|
- Pro: "I've implemented variants of this ... a few different times"
|
|||
|
- Con: Desirability and performance in actual applications is unproven
|
|||
|
|
|||
|
Comments on adding BList to the collections module:
|
|||
|
|
|||
|
- Pro: Matching the list-API reduces the learning curve to near-zero
|
|||
|
- Pro: Useful for intermediate-level users; won't get in the way of beginners
|
|||
|
- Con: Proliferation of data types makes the choices for developers harder.
|
|||
|
|
|||
|
Comments on replacing the array-based list with the BList:
|
|||
|
|
|||
|
- Con: Impact on extension modules (addressed in `Backwards
|
|||
|
Compatibility`_)
|
|||
|
- Con: The use cases where BLists are slower are important
|
|||
|
(see `Use Case Trade-Offs`_ for how these might be addressed).
|
|||
|
- Con: The array-based list code is simple and easy to maintain
|
|||
|
|
|||
|
To assess the desirability and performance in actual applications,
|
|||
|
Raymond Hettinger suggested releasing the BList as an extension module
|
|||
|
(now available at [1]_). If it proves useful, he felt it would be a
|
|||
|
strong candidate for inclusion in 2.6 as part of the collections
|
|||
|
module. If widely popular, then it could be considered for replacing
|
|||
|
the array-based list, but not otherwise.
|
|||
|
|
|||
|
Guido van Rossum commented that he opposed the proliferation of data
|
|||
|
types, but favored replacing the array-based list if backwards
|
|||
|
compatibility could be addressed and the BList's performance was
|
|||
|
uniformly better.
|
|||
|
|
|||
|
On-going Tasks
|
|||
|
==============
|
|||
|
|
|||
|
- Reduce the memory footprint of small lists
|
|||
|
- Implement TimSort for BLists, so that best-case sorting is O(n)
|
|||
|
instead of O(log n).
|
|||
|
- Implement __reversed__
|
|||
|
- Cache a pointer in the root to the rightmost leaf, to make LIFO
|
|||
|
operation O(n) time.
|
|||
|
|
|||
|
References
|
|||
|
==========
|
|||
|
|
|||
|
.. [1] Reference Implementations for C and Python:
|
|||
|
http://www.python.org/pypi/blist/
|
|||
|
|
|||
|
.. [2] Empirical performance comparison between Python's array-based
|
|||
|
list and the blist: http://stutzbachenterprises.com/blist/
|
|||
|
|
|||
|
.. [3] Discussion on python-3000 starting at post:
|
|||
|
http://mail.python.org/pipermail/python-3000/2007-April/006757.html
|
|||
|
|
|||
|
Copyright
|
|||
|
=========
|
|||
|
|
|||
|
This document has been placed in the public domain.
|
|||
|
|
|||
|
|
|||
|
|
|||
|
..
|
|||
|
Local Variables:
|
|||
|
mode: indented-text
|
|||
|
indent-tabs-mode: nil
|
|||
|
sentence-end-double-space: t
|
|||
|
fill-column: 70
|
|||
|
coding: utf-8
|
|||
|
End:
|