387 lines
16 KiB
Plaintext
387 lines
16 KiB
Plaintext
PEP: 3128
|
||
Title: BList: A Faster List-like Type
|
||
Version: $Revision$
|
||
Last-Modified: $Date$
|
||
Author: Daniel Stutzbach <daniel@stutzbachenterprises.com>
|
||
Discussions-To: python-3000@python.org
|
||
Status: Rejected
|
||
Type: Standards Track
|
||
Content-Type: text/x-rst
|
||
Created: 30-Apr-2007
|
||
Python-Version: 2.6, 3.0
|
||
Post-History: 30-Apr-2007
|
||
|
||
|
||
Rejection Notice
|
||
================
|
||
|
||
Rejected based on Raymond Hettinger's sage advice [4]_:
|
||
|
||
After looking at the source, I think this has almost zero chance
|
||
for replacing list(). There is too much value in a simple C API,
|
||
low space overhead for small lists, good performance is common use
|
||
cases, and having performance that is easily understood. The
|
||
BList implementation lacks these virtues and it trades-off a little
|
||
performance in common cases in for much better performance in
|
||
uncommon cases. As a Py3.0 PEP, I think it can be rejected.
|
||
|
||
Depending on its success as a third-party module, it still has a
|
||
chance for inclusion in the collections module. The essential
|
||
criteria for that is whether it is a superior choice for some
|
||
real-world use cases. I've scanned my own code and found no instances
|
||
where BList would have been preferable to a regular list. However,
|
||
that scan has a selection bias because it doesn't reflect what I would
|
||
have written had BList been available. So, after a few months, I
|
||
intend to poll comp.lang.python for BList success stories. If they
|
||
exist, then I have no problem with inclusion in the collections
|
||
module. After all, its learning curve is near zero -- the only cost
|
||
is the clutter factor stemming from indecision about the most
|
||
appropriate data structure for a given task.
|
||
|
||
|
||
Abstract
|
||
========
|
||
|
||
The common case for list operations is on small lists. The current
|
||
array-based list implementation excels at small lists due to the
|
||
strong locality of reference and infrequency of memory allocation
|
||
operations. However, an array takes O(n) time to insert and delete
|
||
elements, which can become problematic as the list gets large.
|
||
|
||
This PEP introduces a new data type, the BList, that has array-like
|
||
and tree-like aspects. It enjoys the same good performance on small
|
||
lists as the existing array-based implementation, but offers superior
|
||
asymptotic performance for most operations. This PEP proposes
|
||
replacing the makes two mutually exclusive proposals for including the
|
||
BList type in Python:
|
||
|
||
1. Add it to the collections module, or
|
||
2. Replace the existing list type
|
||
|
||
|
||
Motivation
|
||
==========
|
||
|
||
The BList grew out of the frustration of needing to rewrite intuitive
|
||
algorithms that worked fine for small inputs but took O(n**2) time for
|
||
large inputs due to the underlying O(n) behavior of array-based lists.
|
||
The deque type, introduced in Python 2.4, solved the most common
|
||
problem of needing a fast FIFO queue. However, the deque type doesn't
|
||
help if we need to repeatedly insert or delete elements from the
|
||
middle of a long list.
|
||
|
||
A wide variety of data structure provide good asymptotic performance
|
||
for insertions and deletions, but they either have O(n) performance
|
||
for other operations (e.g., linked lists) or have inferior performance
|
||
for small lists (e.g., binary trees and skip lists).
|
||
|
||
The BList type proposed in this PEP is based on the principles of
|
||
B+Trees, which have array-like and tree-like aspects. The BList
|
||
offers array-like performance on small lists, while offering O(log n)
|
||
asymptotic performance for all insert and delete operations.
|
||
Additionally, the BList implements copy-on-write under-the-hood, so
|
||
even operations like getslice take O(log n) time. The table below
|
||
compares the asymptotic performance of the current array-based list
|
||
implementation with the asymptotic performance of the BList.
|
||
|
||
========= ================ ====================
|
||
Operation Array-based list BList
|
||
========= ================ ====================
|
||
Copy O(n) **O(1)**
|
||
Append **O(1)** O(log n)
|
||
Insert O(n) **O(log n)**
|
||
Get Item **O(1)** O(log n)
|
||
Set Item **O(1)** **O(log n)**
|
||
Del Item O(n) **O(log n)**
|
||
Iteration O(n) O(n)
|
||
Get Slice O(k) **O(log n)**
|
||
Del Slice O(n) **O(log n)**
|
||
Set Slice O(n+k) **O(log k + log n)**
|
||
Extend O(k) **O(log k + log n)**
|
||
Sort O(n log n) O(n log n)
|
||
Multiply O(nk) **O(log k)**
|
||
========= ================ ====================
|
||
|
||
An extensive empirical comparison of Python's array-based list and the
|
||
BList are available at [2]_.
|
||
|
||
Use Case Trade-offs
|
||
===================
|
||
|
||
The BList offers superior performance for many, but not all,
|
||
operations. Choosing the correct data type for a particular use case
|
||
depends on which operations are used. Choosing the correct data type
|
||
as a built-in depends on balancing the importance of different use
|
||
cases and the magnitude of the performance differences.
|
||
|
||
For the common uses cases of small lists, the array-based list and the
|
||
BList have similar performance characteristics.
|
||
|
||
For the slightly less common case of large lists, there are two common
|
||
uses cases where the existing array-based list outperforms the
|
||
existing BList reference implementation. These are:
|
||
|
||
1. A large LIFO stack, where there are many .append() and .pop(-1)
|
||
operations. Each operation is O(1) for an array-based list, but
|
||
O(log n) for the BList.
|
||
|
||
2. A large list that does not change size. The getitem and setitem
|
||
calls are O(1) for an array-based list, but O(log n) for the BList.
|
||
|
||
In performance tests on a 10,000 element list, BLists exhibited a 50%
|
||
and 5% increase in execution time for these two uses cases,
|
||
respectively.
|
||
|
||
The performance for the LIFO use case could be improved to O(n) time,
|
||
by caching a pointer to the right-most leaf within the root node. For
|
||
lists that do not change size, the common case of sequential access
|
||
could also be improved to O(n) time via caching in the root node.
|
||
However, the performance of these approaches has not been empirically
|
||
tested.
|
||
|
||
Many operations exhibit a tremendous speed-up (O(n) to O(log n)) when
|
||
switching from the array-based list to BLists. In performance tests
|
||
on a 10,000 element list, operations such as getslice, setslice, and
|
||
FIFO-style insert and deletes on a BList take only 1% of the time
|
||
needed on array-based lists.
|
||
|
||
In light of the large performance speed-ups for many operations, the
|
||
small performance costs for some operations will be worthwhile for
|
||
many (but not all) applications.
|
||
|
||
Implementation
|
||
==============
|
||
|
||
The BList is based on the B+Tree data structure. The BList is a wide,
|
||
bushy tree where each node contains an array of up to 128 pointers to
|
||
its children. If the node is a leaf, its children are the
|
||
user-visible objects that the user has placed in the list. If node is
|
||
not a leaf, its children are other BList nodes that are not
|
||
user-visible. If the list contains only a few elements, they will all
|
||
be a children of single node that is both the root and a leaf. Since
|
||
a node is little more than array of pointers, small lists operate in
|
||
effectively the same way as an array-based data type and share the
|
||
same good performance characteristics.
|
||
|
||
The BList maintains a few invariants to ensure good (O(log n))
|
||
asymptotic performance regardless of the sequence of insert and delete
|
||
operations. The principle invariants are as follows:
|
||
|
||
1. Each node has at most 128 children.
|
||
2. Each non-root node has at least 64 children.
|
||
3. The root node has at least 2 children, unless the list contains
|
||
fewer than 2 elements.
|
||
4. The tree is of uniform depth.
|
||
|
||
If an insert would cause a node to exceed 128 children, the node
|
||
spawns a sibling and transfers half of its children to the sibling.
|
||
The sibling is inserted into the node's parent. If the node is the
|
||
root node (and thus has no parent), a new parent is created and the
|
||
depth of the tree increases by one.
|
||
|
||
If a deletion would cause a node to have fewer than 64 children, the
|
||
node moves elements from one of its siblings if possible. If both of
|
||
its siblings also only have 64 children, then two of the nodes merge
|
||
and the empty one is removed from its parent. If the root node is
|
||
reduced to only one child, its single child becomes the new root
|
||
(i.e., the depth of the tree is reduced by one).
|
||
|
||
In addition to tree-like asymptotic performance and array-like
|
||
performance on small-lists, BLists support transparent
|
||
**copy-on-write**. If a non-root node needs to be copied (as part of
|
||
a getslice, copy, setslice, etc.), the node is shared between multiple
|
||
parents instead of being copied. If it needs to be modified later, it
|
||
will be copied at that time. This is completely behind-the-scenes;
|
||
from the user's point of view, the BList works just like a regular
|
||
Python list.
|
||
|
||
Memory Usage
|
||
============
|
||
|
||
In the worst case, the leaf nodes of a BList have only 64 children
|
||
each, rather than a full 128, meaning that memory usage is around
|
||
twice that of a best-case array implementation. Non-leaf nodes use up
|
||
a negligible amount of additional memory, since there are at least 63
|
||
times as many leaf nodes as non-leaf nodes.
|
||
|
||
The existing array-based list implementation must grow and shrink as
|
||
items are added and removed. To be efficient, it grows and shrinks
|
||
only when the list has grow or shrunk exponentially. In the worst
|
||
case, it, too, uses twice as much memory as the best case.
|
||
|
||
In summary, the BList's memory footprint is not significantly
|
||
different from the existing array-based implementation.
|
||
|
||
Backwards Compatibility
|
||
=======================
|
||
|
||
If the BList is added to the collections module, backwards
|
||
compatibility is not an issue. This section focuses on the option of
|
||
replacing the existing array-based list with the BList. For users of
|
||
the Python interpreter, a BList has an identical interface to the
|
||
current list-implementation. For virtually all operations, the
|
||
behavior is identical, aside from execution speed.
|
||
|
||
For the C API, BList has a different interface than the existing
|
||
list-implementation. Due to its more complex structure, the BList
|
||
does not lend itself well to poking and prodding by external sources.
|
||
Thankfully, the existing list-implementation defines an API of
|
||
functions and macros for accessing data from list objects. Google
|
||
Code Search suggests that the majority of third-party modules uses the
|
||
well-defined API rather than relying on the list's structure
|
||
directly. The table below summarizes the search queries and results:
|
||
|
||
======================== =================
|
||
Search String Number of Results
|
||
======================== =================
|
||
PyList_GetItem 2,000
|
||
PySequence_GetItem 800
|
||
PySequence_Fast_GET_ITEM 100
|
||
PyList_GET_ITEM 400
|
||
\[^a\-zA\-Z\_\]ob_item 100
|
||
======================== =================
|
||
|
||
|
||
This can be achieved in one of two ways:
|
||
|
||
1. Redefine the various accessor functions and macros in listobject.h
|
||
to access a BList instead. The interface would be unchanged. The
|
||
functions can easily be redefined. The macros need a bit more care
|
||
and would have to resort to function calls for large lists.
|
||
|
||
The macros would need to evaluate their arguments more than once,
|
||
which could be a problem if the arguments have side effects. A
|
||
Google Code Search for "PyList_GET_ITEM\(\[^)\]+\(" found only a
|
||
handful of cases where this occurs, so the impact appears to be
|
||
low.
|
||
|
||
The few extension modules that use list's undocumented structure
|
||
directly, instead of using the API, would break. The core code
|
||
itself uses the accessor macros fairly consistently and should be
|
||
easy to port.
|
||
|
||
2. Deprecate the existing list type, but continue to include it.
|
||
Extension modules wishing to use the new BList type must do so
|
||
explicitly. The BList C interface can be changed to match the
|
||
existing PyList interface so that a simple search-replace will be
|
||
sufficient for 99% of module writers.
|
||
|
||
Existing modules would continue to compile and work without change,
|
||
but they would need to make a deliberate (but small) effort to
|
||
migrate to the BList.
|
||
|
||
The downside of this approach is that mixing modules that use
|
||
BLists and array-based lists might lead to slow down if conversions
|
||
are frequently necessary.
|
||
|
||
Reference Implementation
|
||
========================
|
||
|
||
A reference implementations of the BList is available for CPython at [1]_.
|
||
|
||
The source package also includes a pure Python implementation,
|
||
originally developed as a prototype for the CPython version.
|
||
Naturally, the pure Python version is rather slow and the asymptotic
|
||
improvements don't win out until the list is quite large.
|
||
|
||
When compiled with Py_DEBUG, the C implementation checks the
|
||
BList invariants when entering and exiting most functions.
|
||
|
||
An extensive set of test cases is also included in the source package.
|
||
The test cases include the existing Python sequence and list test
|
||
cases as a subset. When the interpreter is built with Py_DEBUG, the
|
||
test cases also check for reference leaks.
|
||
|
||
Porting to Other Python Variants
|
||
--------------------------------
|
||
|
||
If the BList is added to the collections module, other Python variants
|
||
can support it in one of three ways:
|
||
|
||
1. Make blist an alias for list. The asymptotic performance won't be
|
||
as good, but it'll work.
|
||
2. Use the pure Python reference implementation. The performance for
|
||
small lists won't be as good, but it'll work.
|
||
3. Port the reference implementation.
|
||
|
||
Discussion
|
||
==========
|
||
|
||
This proposal has been discussed briefly on the Python-3000 mailing
|
||
list [3]_. Although a number of people favored the proposal, there
|
||
were also some objections. Below summarizes the pros and cons as
|
||
observed by posters to the thread.
|
||
|
||
General comments:
|
||
|
||
- Pro: Will outperform the array-based list in most cases
|
||
- Pro: "I've implemented variants of this ... a few different times"
|
||
- Con: Desirability and performance in actual applications is unproven
|
||
|
||
Comments on adding BList to the collections module:
|
||
|
||
- Pro: Matching the list-API reduces the learning curve to near-zero
|
||
- Pro: Useful for intermediate-level users; won't get in the way of beginners
|
||
- Con: Proliferation of data types makes the choices for developers harder.
|
||
|
||
Comments on replacing the array-based list with the BList:
|
||
|
||
- Con: Impact on extension modules (addressed in `Backwards
|
||
Compatibility`_)
|
||
- Con: The use cases where BLists are slower are important
|
||
(see `Use Case Trade-Offs`_ for how these might be addressed).
|
||
- Con: The array-based list code is simple and easy to maintain
|
||
|
||
To assess the desirability and performance in actual applications,
|
||
Raymond Hettinger suggested releasing the BList as an extension module
|
||
(now available at [1]_). If it proves useful, he felt it would be a
|
||
strong candidate for inclusion in 2.6 as part of the collections
|
||
module. If widely popular, then it could be considered for replacing
|
||
the array-based list, but not otherwise.
|
||
|
||
Guido van Rossum commented that he opposed the proliferation of data
|
||
types, but favored replacing the array-based list if backwards
|
||
compatibility could be addressed and the BList's performance was
|
||
uniformly better.
|
||
|
||
On-going Tasks
|
||
==============
|
||
|
||
- Reduce the memory footprint of small lists
|
||
- Implement TimSort for BLists, so that best-case sorting is O(n)
|
||
instead of O(log n).
|
||
- Implement __reversed__
|
||
- Cache a pointer in the root to the rightmost leaf, to make LIFO
|
||
operation O(n) time.
|
||
|
||
References
|
||
==========
|
||
|
||
.. [1] Reference Implementations for C and Python:
|
||
http://www.python.org/pypi/blist/
|
||
|
||
.. [2] Empirical performance comparison between Python's array-based
|
||
list and the blist: http://stutzbachenterprises.com/blist/
|
||
|
||
.. [3] Discussion on python-3000 starting at post:
|
||
https://mail.python.org/pipermail/python-3000/2007-April/006757.html
|
||
|
||
.. [4] Raymond Hettinger's feedback on python-3000:
|
||
https://mail.python.org/pipermail/python-3000/2007-May/007491.html
|
||
|
||
Copyright
|
||
=========
|
||
|
||
This document has been placed in the public domain.
|
||
|
||
|
||
|
||
..
|
||
Local Variables:
|
||
mode: indented-text
|
||
indent-tabs-mode: nil
|
||
sentence-end-double-space: t
|
||
fill-column: 70
|
||
coding: utf-8
|
||
End:
|