PEP: 3128
Title: BList: A Faster List-like Type
Version: $Revision$
Last-Modified: $Date$
Author: Daniel Stutzbach <daniel@stutzbachenterprises.com>
Discussions-To: python-3000@python.org
Status: Rejected
Type: Standards Track
Content-Type: text/x-rst
Created: 30-Apr-2007
Python-Version: 2.6, 3.0
Post-History: 30-Apr-2007


Rejection Notice
================

Rejected based on Raymond Hettinger's sage advice [4]_:

    After looking at the source, I think this has almost zero chance
    for replacing list().  There is too much value in a simple C API,
    low space overhead for small lists, good performance is common use
    cases, and having performance that is easily understood.  The
    BList implementation lacks these virtues and it trades-off a little
    performance in common cases in for much better performance in
    uncommon cases.  As a Py3.0 PEP, I think it can be rejected.

    Depending on its success as a third-party module, it still has a
    chance for inclusion in the collections module.  The essential
    criteria for that is whether it is a superior choice for some
    real-world use cases.  I've scanned my own code and found no instances
    where BList would have been preferable to a regular list.  However,
    that scan has a selection bias because it doesn't reflect what I would
    have written had BList been available.  So, after a few months, I
    intend to poll comp.lang.python for BList success stories.  If they
    exist, then I have no problem with inclusion in the collections
    module.  After all, its learning curve is near zero -- the only cost
    is the clutter factor stemming from indecision about the most
    appropriate data structure for a given task.


Abstract
========

The common case for list operations is on small lists.  The current
array-based list implementation excels at small lists due to the
strong locality of reference and infrequency of memory allocation
operations.  However, an array takes O(n) time to insert and delete
elements, which can become problematic as the list gets large.

This PEP introduces a new data type, the BList, that has array-like
and tree-like aspects.  It enjoys the same good performance on small
lists as the existing array-based implementation, but offers superior
asymptotic performance for most operations.  This PEP makes two
mutually exclusive proposals for including the BList type in Python:

1. Add it to the collections module, or
2. Replace the existing list type


Motivation
==========

The BList grew out of the frustration of needing to rewrite intuitive
algorithms that worked fine for small inputs but took O(n**2) time for
large inputs due to the underlying O(n) behavior of array-based lists.
The deque type, introduced in Python 2.4, solved the most common
problem of needing a fast FIFO queue.  However, the deque type doesn't
help if we need to repeatedly insert or delete elements from the
middle of a long list.
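
To make the quadratic cost concrete, the sketch below counts how many
element moves an array needs when a list is built by repeatedly
inserting at the middle.  The function name ``middle_insert_moves`` is
hypothetical, and the cost model (one move per element at or after the
insertion point) is a simplifying assumption rather than a measurement
of CPython.

.. code-block:: python

    def middle_insert_moves(n):
        """Count element moves when building an n-element array-based
        list by always inserting at the middle position (cost model)."""
        moves = 0
        for size in range(n):
            index = size // 2
            # An array insert shifts every element at or after `index`.
            moves += size - index
        return moves


    # Doubling the input size quadruples the modelled work.
    small = middle_insert_moves(1000)
    large = middle_insert_moves(2000)
    print(small, large, large / small)

Under this model the total work grows as roughly n**2 / 4, which is
the behavior that motivates a structure with O(log n) inserts.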

A wide variety of data structures provide good asymptotic
performance for insertions and deletions, but they either have O(n)
performance for other operations (e.g., linked lists) or have inferior
performance for small lists (e.g., binary trees and skip lists).

The BList type proposed in this PEP is based on the principles of
B+Trees, which have array-like and tree-like aspects.  The BList
offers array-like performance on small lists, while offering O(log n)
asymptotic performance for all insert and delete operations.
Additionally, the BList implements copy-on-write under-the-hood, so
even operations like getslice take O(log n) time.  The table below
compares the asymptotic performance of the current array-based list
implementation with the asymptotic performance of the BList.

========= ================                     ====================
Operation Array-based list                     BList
========= ================                     ====================
Copy      O(n)                                 **O(1)**
Append    **O(1)**                             O(log n)
Insert    O(n)                                 **O(log n)**
Get Item  **O(1)**                             O(log n)
Set Item  **O(1)**                             O(log n)
Del Item  O(n)                                 **O(log n)**
Iteration O(n)                                 O(n)
Get Slice O(k)                                 **O(log n)**
Del Slice O(n)                                 **O(log n)**
Set Slice O(n+k)                               **O(log k + log n)**
Extend    O(k)                                 **O(log k + log n)**
Sort      O(n log n)                           O(n log n)
Multiply  O(nk)                                **O(log k)**
========= ================                     ====================

An extensive empirical comparison of Python's array-based list and the
BList is available at [2]_.

Use Case Trade-offs
===================

The BList offers superior performance for many, but not all,
operations.  Choosing the correct data type for a particular use case
depends on which operations are used.  Choosing the correct data type
as a built-in depends on balancing the importance of different use
cases and the magnitude of the performance differences.

For the common use cases of small lists, the array-based list and the
BList have similar performance characteristics.

For the slightly less common case of large lists, there are two common
use cases where the existing array-based list outperforms the
existing BList reference implementation.  These are:

1. A large LIFO stack, where there are many .append() and .pop(-1)
   operations.  Each operation is O(1) for an array-based list, but
   O(log n) for the BList.

2. A large list that does not change size.  The getitem and setitem
   calls are O(1) for an array-based list, but O(log n) for the BList.

In performance tests on a 10,000 element list, BLists exhibited a 50%
and 5% increase in execution time for these two use cases,
respectively.

The performance for the LIFO use case could be improved to O(n) time
for a sequence of n operations (amortized O(1) per operation) by
caching a pointer to the right-most leaf within the root node.  For
lists that do not change size, the common case of sequential access
to all n elements could also be improved to O(n) total time via
caching in the root node.  However, the performance of these
approaches has not been empirically tested.

Many operations exhibit a tremendous speed-up (O(n) to O(log n)) when
switching from the array-based list to BLists.  In performance tests
on a 10,000 element list, operations such as getslice, setslice, and
FIFO-style insert and deletes on a BList take only 1% of the time
needed on array-based lists.

In light of the large performance speed-ups for many operations, the
small performance costs for some operations will be worthwhile for
many (but not all) applications.

Implementation
==============

The BList is based on the B+Tree data structure.  The BList is a wide,
bushy tree where each node contains an array of up to 128 pointers to
its children.  If the node is a leaf, its children are the
user-visible objects that the user has placed in the list.  If the
node is not a leaf, its children are other BList nodes that are not
user-visible.  If the list contains only a few elements, they will all
be children of a single node that is both the root and a leaf.  Since
a node is little more than an array of pointers, small lists operate in
effectively the same way as an array-based data type and share the
same good performance characteristics.

The BList maintains a few invariants to ensure good (O(log n))
asymptotic performance regardless of the sequence of insert and delete
operations.  The principal invariants are as follows:

1. Each node has at most 128 children.
2. Each non-root node has at least 64 children.
3. The root node has at least 2 children, unless the list contains
   fewer than 2 elements.
4. The tree is of uniform depth.

If an insert would cause a node to exceed 128 children, the node
spawns a sibling and transfers half of its children to the sibling.
The sibling is inserted into the node's parent.  If the node is the
root node (and thus has no parent), a new parent is created and the
depth of the tree increases by one.

If a deletion would cause a node to have fewer than 64 children, the
node moves elements from one of its siblings if possible.  If both of
its siblings also only have 64 children, then two of the nodes merge
and the empty one is removed from its parent.  If the root node is
reduced to only one child, its single child becomes the new root
(i.e., the depth of the tree is reduced by one).
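
The insert path described above can be sketched in pure Python.  This
is a minimal illustration under stated assumptions, not the reference
implementation: the names ``Node`` and ``SketchBList`` are
hypothetical, deletion and rebalancing are omitted, and the element
counts are recomputed naively on every insert.

.. code-block:: python

    LIMIT = 128          # maximum children per node (from the PEP)
    HALF = LIMIT // 2    # minimum children per non-root node


    class Node:
        def __init__(self, leaf=True, children=None):
            self.leaf = leaf
            self.children = children if children is not None else []
            self.recount()

        def recount(self):
            # n is the number of user-visible elements below this node.
            if self.leaf:
                self.n = len(self.children)
            else:
                self.n = sum(child.n for child in self.children)

        def locate(self, index):
            """Return (child position, index within that child)."""
            for pos, child in enumerate(self.children):
                if index < child.n:
                    return pos, index
                index -= child.n
            # index equals self.n: the item goes at the end of the last child.
            last = len(self.children) - 1
            return last, self.children[last].n

        def get(self, index):
            if self.leaf:
                return self.children[index]
            pos, sub = self.locate(index)
            return self.children[pos].get(sub)

        def insert(self, index, item):
            """Insert item; return a new right sibling if this node split."""
            if self.leaf:
                self.children.insert(index, item)
            else:
                pos, sub = self.locate(index)
                sibling = self.children[pos].insert(sub, item)
                if sibling is not None:
                    self.children.insert(pos + 1, sibling)
            if len(self.children) <= LIMIT:
                self.recount()
                return None
            # Overflow: move the upper half of the children to a new sibling.
            sibling = Node(self.leaf, self.children[HALF:])
            del self.children[HALF:]
            self.recount()
            return sibling


    class SketchBList:
        def __init__(self):
            self.root = Node()

        def __len__(self):
            return self.root.n

        def __getitem__(self, index):
            if not 0 <= index < len(self):
                raise IndexError(index)
            return self.root.get(index)

        def insert(self, index, item):
            if index < 0:
                index += len(self)
            index = max(0, min(index, len(self)))
            sibling = self.root.insert(index, item)
            if sibling is not None:
                # The root split: grow the tree by one level.
                self.root = Node(False, [self.root, sibling])

A node that overflows keeps 64 children and hands 65 to its new
sibling, so both halves satisfy the minimum-size invariant.  Because
the tree only grows at the root, every leaf stays at the same depth.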

In addition to tree-like asymptotic performance and array-like
performance on small lists, BLists support transparent
**copy-on-write**.  If a non-root node needs to be copied (as part of
a getslice, copy, setslice, etc.), the node is shared between multiple
parents instead of being copied.  If it needs to be modified later, it
will be copied at that time.  This is completely behind-the-scenes;
from the user's point of view, the BList works just like a regular
Python list.
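
The sharing behavior can be illustrated with a deliberately simplified
two-level structure.  Everything in this sketch is an assumption made
for illustration: the names ``SharedNode`` and ``CowList``, the
explicit ``refs`` counter, and the fixed leaf size are hypothetical and
are not taken from the reference implementation.

.. code-block:: python

    class SharedNode:
        def __init__(self, children):
            self.children = list(children)
            self.refs = 1   # number of parents pointing at this node


    class CowList:
        """Fixed-size, two-level list: a root array of leaf nodes."""

        def __init__(self, items=(), leaf_size=4):
            items = list(items)
            self.leaf_size = leaf_size
            self.leaves = [SharedNode(items[i:i + leaf_size])
                           for i in range(0, len(items), leaf_size)]

        def copy(self):
            clone = CowList(leaf_size=self.leaf_size)
            # Copy only the root's pointer array; the leaves are shared.
            clone.leaves = list(self.leaves)
            for leaf in clone.leaves:
                leaf.refs += 1
            return clone

        def __getitem__(self, index):
            pos, offset = divmod(index, self.leaf_size)
            return self.leaves[pos].children[offset]

        def __setitem__(self, index, value):
            pos, offset = divmod(index, self.leaf_size)
            leaf = self.leaves[pos]
            if leaf.refs > 1:
                # The leaf is shared: copy just this node before writing.
                leaf.refs -= 1
                leaf = SharedNode(leaf.children)
                self.leaves[pos] = leaf
            leaf.children[offset] = value


    a = CowList(range(10))
    b = a.copy()
    b[1] = 99   # copies only the first leaf of b

After the assignment, ``a[1]`` is still ``1`` and the two lists still
share every leaf except the first.  Unlike the BList, this sketch has
an unbounded root, so its ``copy()`` is not O(1).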

Memory Usage
============

In the worst case, the leaf nodes of a BList have only 64 children
each, rather than a full 128, meaning that memory usage is around
twice that of a best-case array implementation.  Non-leaf nodes use up
a negligible amount of additional memory, since there are at least 63
times as many leaf nodes as non-leaf nodes.

The existing array-based list implementation must grow and shrink as
items are added and removed.  To be efficient, it grows and shrinks
only when the list has grown or shrunk exponentially.  In the worst
case, it, too, uses twice as much memory as the best case.

In summary, the BList's memory footprint is not significantly
different from the existing array-based implementation.
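
The worst-case figures above can be checked with a short calculation.
The sketch assumes that every node allocates a full array of 128
pointer slots and that every node is exactly half full; the function
name ``worst_case_nodes`` is hypothetical.

.. code-block:: python

    LIMIT = 128        # maximum children per node
    HALF = LIMIT // 2  # minimum children per non-root node


    def worst_case_nodes(n):
        """Return (leaf_count, non_leaf_count) when nodes are half full."""
        leaves = -(-n // HALF)            # ceil(n / 64)
        non_leaves = 0
        level = leaves
        while level > 1:
            level = -(-level // HALF)     # parents needed for this level
            non_leaves += level
        return leaves, non_leaves


    n = 64 ** 3
    leaves, non_leaves = worst_case_nodes(n)
    slots = (leaves + non_leaves) * LIMIT   # pointer slots allocated
    print(leaves, non_leaves, slots / n)

For 262,144 elements this gives 4,096 leaves and 65 non-leaf nodes, or
about 2.03 pointer slots per element.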

Backwards Compatibility
=======================

If the BList is added to the collections module, backwards
compatibility is not an issue.  This section focuses on the option of
replacing the existing array-based list with the BList.  For users of
the Python interpreter, a BList has an identical interface to the
current list-implementation.  For virtually all operations, the
behavior is identical, aside from execution speed.

For the C API, BList has a different interface than the existing
list-implementation.  Due to its more complex structure, the BList
does not lend itself well to poking and prodding by external sources.
Thankfully, the existing list-implementation defines an API of
functions and macros for accessing data from list objects.  Google
Code Search suggests that the majority of third-party modules use the
well-defined API rather than relying on the list's structure
directly.  The table below summarizes the search queries and results:

======================== =================
Search String            Number of Results
======================== =================
PyList_GetItem           2,000
PySequence_GetItem         800
PySequence_Fast_GET_ITEM   100
PyList_GET_ITEM            400
\[^a\-zA\-Z\_\]ob_item          100
======================== =================


Replacing the list implementation while keeping this API working can
be achieved in one of two ways:

1. Redefine the various accessor functions and macros in listobject.h
   to access a BList instead.  The interface would be unchanged.  The
   functions can easily be redefined.  The macros need a bit more care
   and would have to resort to function calls for large lists.

   The macros would need to evaluate their arguments more than once,
   which could be a problem if the arguments have side effects.  A
   Google Code Search for "PyList_GET_ITEM\(\[^)\]+\(" found only a
   handful of cases where this occurs, so the impact appears to be
   low.

   The few extension modules that use list's undocumented structure
   directly, instead of using the API, would break.  The core code
   itself uses the accessor macros fairly consistently and should be
   easy to port.

2. Deprecate the existing list type, but continue to include it.
   Extension modules wishing to use the new BList type must do so
   explicitly.  The BList C interface can be changed to match the
   existing PyList interface so that a simple search-replace will be
   sufficient for 99% of module writers.

   Existing modules would continue to compile and work without change,
   but they would need to make a deliberate (but small) effort to
   migrate to the BList.

   The downside of this approach is that mixing modules that use
   BLists and array-based lists might lead to slowdowns if conversions
   are frequently necessary.

Reference Implementation
========================

A reference implementation of the BList is available for CPython at [1]_.

The source package also includes a pure Python implementation,
originally developed as a prototype for the CPython version.
Naturally, the pure Python version is rather slow and the asymptotic
improvements don't win out until the list is quite large.

When compiled with Py_DEBUG, the C implementation checks the
BList invariants when entering and exiting most functions.

An extensive set of test cases is also included in the source package.
The test cases include the existing Python sequence and list test
cases as a subset.  When the interpreter is built with Py_DEBUG, the
test cases also check for reference leaks.

Porting to Other Python Variants
--------------------------------

If the BList is added to the collections module, other Python variants
can support it in one of three ways:

1. Make blist an alias for list.  The asymptotic performance won't be
   as good, but it'll work.
2. Use the pure Python reference implementation.  The performance for
   small lists won't be as good, but it'll work.
3. Port the reference implementation.
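
From the point of view of user code, the first option amounts to the
fallback sketched below.  It assumes the third-party package from
[1]_ exposes the type as ``blist.blist``; where the package is not
installed, the name is simply bound to the built-in list.

.. code-block:: python

    try:
        # Assumption: the third-party package provides blist.blist.
        from blist import blist
    except ImportError:
        # Option 1: same interface, array-based asymptotic performance.
        blist = list

    data = blist(range(10))
    data.insert(5, "x")   # identical call on either implementation

Because the BList matches the list interface, the rest of the program
does not need to know which implementation was selected.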

Discussion
==========

This proposal has been discussed briefly on the Python-3000 mailing
list [3]_.  Although a number of people favored the proposal, there
were also some objections.  The pros and cons observed by posters to
the thread are summarized below.

General comments:

- Pro: Will outperform the array-based list in most cases
- Pro: "I've implemented variants of this ... a few different times"
- Con: Desirability and performance in actual applications is unproven

Comments on adding BList to the collections module:

- Pro: Matching the list-API reduces the learning curve to near-zero
- Pro: Useful for intermediate-level users; won't get in the way of beginners
- Con: Proliferation of data types makes the choices for developers harder

Comments on replacing the array-based list with the BList:

- Con: Impact on extension modules (addressed in `Backwards
  Compatibility`_)
- Con: The use cases where BLists are slower are important
  (see `Use Case Trade-Offs`_ for how these might be addressed).
- Con: The array-based list code is simple and easy to maintain

To assess the desirability and performance in actual applications,
Raymond Hettinger suggested releasing the BList as an extension module
(now available at [1]_).  If it proves useful, he felt it would be a
strong candidate for inclusion in 2.6 as part of the collections
module.  If widely popular, then it could be considered for replacing
the array-based list, but not otherwise.

Guido van Rossum commented that he opposed the proliferation of data
types, but favored replacing the array-based list if backwards
compatibility could be addressed and the BList's performance was
uniformly better.

On-going Tasks
==============

- Reduce the memory footprint of small lists
- Implement TimSort for BLists, so that best-case sorting is O(n)
  instead of O(n log n).
- Implement __reversed__
- Cache a pointer in the root to the rightmost leaf, to make a
  sequence of n LIFO operations take O(n) time.

References
==========

.. [1] Reference Implementations for C and Python:
   http://www.python.org/pypi/blist/

.. [2] Empirical performance comparison between Python's array-based
   list and the blist: http://stutzbachenterprises.com/blist/

.. [3] Discussion on python-3000 starting at post:
   https://mail.python.org/pipermail/python-3000/2007-April/006757.html

.. [4] Raymond Hettinger's feedback on python-3000:
   https://mail.python.org/pipermail/python-3000/2007-May/007491.html

Copyright
=========

This document has been placed in the public domain.


..
   Local Variables:
   mode: indented-text
   indent-tabs-mode: nil
   sentence-end-double-space: t
   fill-column: 70
   coding: utf-8
   End:
-												PEP 3127: Integer Literal Support and Syntax           Maupin
PEP 3128: BList: A Faster List-like Type               Stutzbach

											
										
										
											2007-05-01 12:57:09 -04:00
+								PEP: 3128
 								Title: BList: A Faster List-like Type
 								Version: $Revision$
 								Last-Modified: $Date$
 								Author: Daniel Stutzbach <daniel@stutzbachenterprises.com>
-												Display list names in Resolution and Discussions-To headers (#2361)


											
										
										
											2022-02-27 17:46:36 -05:00
+								Discussions-To: python-3000@python.org
-												Reject PEP 3128 (BList).

											
										
										
											2007-05-10 19:41:53 -04:00
+								Status: Rejected
-												PEP 3127: Integer Literal Support and Syntax           Maupin
PEP 3128: BList: A Faster List-like Type               Stutzbach

											
										
										
											2007-05-01 12:57:09 -04:00
+								Type: Standards Track
 								Content-Type: text/x-rst
 								Created: 30-Apr-2007
-												Lint: Validate PEP fields against PEP 12 (#1890)


											
										
										
											2021-03-23 10:55:26 -04:00
+								Python-Version: 2.6, 3.0
-												PEP 3127: Integer Literal Support and Syntax           Maupin
PEP 3128: BList: A Faster List-like Type               Stutzbach

											
										
										
											2007-05-01 12:57:09 -04:00
+								Post-History: 30-Apr-2007
-												Reject PEP 3128 (BList).

											
										
										
											2007-05-10 19:41:53 -04:00
+								Rejection Notice
 								================
-												Fix typos.

											
										
										
											2008-12-15 22:52:16 -05:00
+								Rejected based on Raymond Hettinger's sage advice [4]_:
-												Reject PEP 3128 (BList).

											
										
										
											2007-05-10 19:41:53 -04:00
 								    After looking at the source, I think this has almost zero chance
 								    for replacing list().  There is too much value in a simple C API,
 								    low space overhead for small lists, good performance is common use
 								    cases, and having performance that is easily understood.  The
-												Fix typos.

											
										
										
											2008-12-15 22:52:16 -05:00
+								    BList implementation lacks these virtues and it trades-off a little
 								    performance in common cases in for much better performance in
-												Reject PEP 3128 (BList).

											
										
										
											2007-05-10 19:41:53 -04:00
+								    uncommon cases.  As a Py3.0 PEP, I think it can be rejected.
 								    Depending on its success as a third-party module, it still has a
 								    chance for inclusion in the collections module.  The essential
 								    criteria for that is whether it is a superior choice for some
 								    real-world use cases.  I've scanned my own code and found no instances
 								    where BList would have been preferable to a regular list.  However,
 								    that scan has a selection bias because it doesn't reflect what I would
 								    have written had BList been available.  So, after a few months, I
 								    intend to poll comp.lang.python for BList success stories.  If they
 								    exist, then I have no problem with inclusion in the collections
 								    module.  After all, its learning curve is near zero -- the only cost
 								    is the clutter factor stemming from indecision about the most
 								    appropriate data structure for a given task.
-												PEP 3127: Integer Literal Support and Syntax           Maupin
PEP 3128: BList: A Faster List-like Type               Stutzbach

											
										
										
											2007-05-01 12:57:09 -04:00
+								Abstract
 								========
 								The common case for list operations is on small lists.  The current
 								array-based list implementation excels at small lists due to the
 								strong locality of reference and infrequency of memory allocation
 								operations.  However, an array takes O(n) time to insert and delete
 								elements, which can become problematic as the list gets large.
 								This PEP introduces a new data type, the BList, that has array-like
 								and tree-like aspects.  It enjoys the same good performance on small
 								lists as the existing array-based implementation, but offers superior
 								asymptotic performance for most operations.  This PEP proposes
 								replacing the makes two mutually exclusive proposals for including the
 								BList type in Python:
 . Add it to the collections module, or
 . Replace the existing list type
 								Motivation
 								==========
 								The BList grew out of the frustration of needing to rewrite intuitive
 								algorithms that worked fine for small inputs but took O(n**2) time for
 								large inputs due to the underlying O(n) behavior of array-based lists.
 								The deque type, introduced in Python 2.4, solved the most common
 								problem of needing a fast FIFO queue.  However, the deque type doesn't
 								help if we need to repeatedly insert or delete elements from the
 								middle of a long list.
 								A wide variety of data structure provide good asymptotic performance
 								for insertions and deletions, but they either have O(n) performance
 								for other operations (e.g., linked lists) or have inferior performance
 								for small lists (e.g., binary trees and skip lists).
 								The BList type proposed in this PEP is based on the principles of
 								B+Trees, which have array-like and tree-like aspects.  The BList
 								offers array-like performance on small lists, while offering O(log n)
 								asymptotic performance for all insert and delete operations.
 								Additionally, the BList implements copy-on-write under-the-hood, so
 								even operations like getslice take O(log n) time.  The table below
 								compares the asymptotic performance of the current array-based list
 								implementation with the asymptotic performance of the BList.
 								========= ================                     ====================
 								Operation Array-based list                     BList
 								========= ================                     ====================
 								Copy      O(n)                                 **O(1)**
 								Append    **O(1)**                             O(log n)
 								Insert    O(n)                                 **O(log n)**
 								Get Item  **O(1)**                             O(log n)
 								Set Item  **O(1)**                             **O(log n)**
 								Del Item  O(n)                                 **O(log n)**
 								Iteration O(n)                                 O(n)
 								Get Slice O(k)                                 **O(log n)**
 								Del Slice O(n)                                 **O(log n)**
 								Set Slice O(n+k)                               **O(log k + log n)**
 								Extend    O(k)                                 **O(log k + log n)**
 								Sort      O(n log n)                           O(n log n)
 								Multiply  O(nk)                                **O(log k)**
 								========= ================                     ====================
 								An extensive empirical comparison of Python's array-based list and the
 								BList are available at [2]_.
 								Use Case Trade-offs
 								===================
 								The BList offers superior performance for many, but not all,
 								operations.  Choosing the correct data type for a particular use case
 								depends on which operations are used.  Choosing the correct data type
 								as a built-in depends on balancing the importance of different use
 								cases and the magnitude of the performance differences.
 								For the common uses cases of small lists, the array-based list and the
 								BList have similar performance characteristics.
 								For the slightly less common case of large lists, there are two common
 								uses cases where the existing array-based list outperforms the
 								existing BList reference implementation.  These are:
 . A large LIFO stack, where there are many .append() and .pop(-1)
 								   operations.  Each operation is O(1) for an array-based list, but
 								   O(log n) for the BList.
 . A large list that does not change size.  The getitem and setitem
 								   calls are O(1) for an array-based list, but O(log n) for the BList.
 								In performance tests on a 10,000 element list, BLists exhibited a 50%
 								and 5% increase in execution time for these two uses cases,
 								respectively.
 								The performance for the LIFO use case could be improved to O(n) time,
 								by caching a pointer to the right-most leaf within the root node.  For
 								lists that do not change size, the common case of sequential access
 								could also be improved to O(n) time via caching in the root node.
 								However, the performance of these approaches has not been empirically
 								tested.
 								Many operations exhibit a tremendous speed-up (O(n) to O(log n)) when
 								switching from the array-based list to BLists.  In performance tests
 								on a 10,000 element list, operations such as getslice, setslice, and
 								FIFO-style insert and deletes on a BList take only 1% of the time
 								needed on array-based lists.
 								In light of the large performance speed-ups for many operations, the
 								small performance costs for some operations will be worthwhile for
 								many (but not all) applications.
 								Implementation
 								==============
 								The BList is based on the B+Tree data structure.  The BList is a wide,
 								bushy tree where each node contains an array of up to 128 pointers to
 								its children.  If the node is a leaf, its children are the
 								user-visible objects that the user has placed in the list.  If node is
 								not a leaf, its children are other BList nodes that are not
 								user-visible.  If the list contains only a few elements, they will all
 								be a children of single node that is both the root and a leaf.  Since
 								a node is little more than array of pointers, small lists operate in
 								effectively the same way as an array-based data type and share the
 								same good performance characteristics.
 								The BList maintains a few invariants to ensure good (O(log n))
 								asymptotic performance regardless of the sequence of insert and delete
 								operations.  The principle invariants are as follows:
 . Each node has at most 128 children.
 . Each non-root node has at least 64 children.
 . The root node has at least 2 children, unless the list contains
 								   fewer than 2 elements.
 . The tree is of uniform depth.
 								If an insert would cause a node to exceed 128 children, the node
 								spawns a sibling and transfers half of its children to the sibling.
 								The sibling is inserted into the node's parent.  If the node is the
 								root node (and thus has no parent), a new parent is created and the
 								depth of the tree increases by one.
 								If a deletion would cause a node to have fewer than 64 children, the
 								node moves elements from one of its siblings if possible.  If both of
 								its siblings also only have 64 children, then two of the nodes merge
 								and the empty one is removed from its parent.  If the root node is
 								reduced to only one child, its single child becomes the new root
 								(i.e., the depth of the tree is reduced by one).
 								In addition to tree-like asymptotic performance and array-like
 								performance on small-lists, BLists support transparent
 								**copy-on-write**.  If a non-root node needs to be copied (as part of
 								a getslice, copy, setslice, etc.), the node is shared between multiple
 								parents instead of being copied.  If it needs to be modified later, it
 								will be copied at that time.  This is completely behind-the-scenes;
 								from the user's point of view, the BList works just like a regular
 								Python list.
 								Memory Usage
 								============
 								In the worst case, the leaf nodes of a BList have only 64 children
 								each, rather than a full 128, meaning that memory usage is around
 								twice that of a best-case array implementation.  Non-leaf nodes use up
 								a negligible amount of additional memory, since there are at least 63
 								times as many leaf nodes as non-leaf nodes.
 								The existing array-based list implementation must grow and shrink as
 								items are added and removed.  To be efficient, it grows and shrinks
 								only when the list has grow or shrunk exponentially.  In the worst
 								case, it, too, uses twice as much memory as the best case.
 								In summary, the BList's memory footprint is not significantly
 								different from the existing array-based implementation.
 								Backwards Compatibility
 								=======================
 								If the BList is added to the collections module, backwards
 								compatibility is not an issue.  This section focuses on the option of
 								replacing the existing array-based list with the BList.  For users of
 								the Python interpreter, a BList has an identical interface to the
 								current list-implementation.  For virtually all operations, the
 								behavior is identical, aside from execution speed.
 								For the C API, BList has a different interface than the existing
 								list-implementation.  Due to its more complex structure, the BList
 								does not lend itself well to poking and prodding by external sources.
 								Thankfully, the existing list-implementation defines an API of
 								functions and macros for accessing data from list objects.  Google
 								Code Search suggests that the majority of third-party modules uses the
 								well-defined API rather than relying on the list's structure
 								directly.  The table below summarizes the search queries and results:
 								======================== =================
 								Search String            Number of Results
 								======================== =================
 								PyList_GetItem           2,000
 								PySequence_GetItem         800
 								PySequence_Fast_GET_ITEM   100
 								PyList_GET_ITEM            400
 								\[^a\-zA\-Z\_\]ob_item          100
 								======================== =================

Replacing the array-based list with the BList can be achieved in one
of two ways:

1. Redefine the various accessor functions and macros in listobject.h
   to access a BList instead.  The interface would be unchanged.  The
   functions can easily be redefined.  The macros need a bit more care
   and would have to resort to function calls for large lists.

   The macros would need to evaluate their arguments more than once,
   which could be a problem if the arguments have side effects.  A
   Google Code Search for ``PyList_GET_ITEM\([^)]+\(`` found only a
   handful of cases where this occurs, so the impact appears to be
   low.

   The few extension modules that use list's undocumented structure
   directly, instead of using the API, would break.  The core code
   itself uses the accessor macros fairly consistently and should be
   easy to port.

2. Deprecate the existing list type, but continue to include it.
   Extension modules wishing to use the new BList type must do so
   explicitly.  The BList C interface can be changed to match the
   existing PyList interface so that a simple search-replace will be
   sufficient for 99% of module writers.

   Existing modules would continue to compile and work without change,
   but they would need to make a deliberate (but small) effort to
   migrate to the BList.

   The downside of this approach is that mixing modules that use
   BLists and array-based lists might lead to a slowdown if
   conversions are frequently necessary.
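
As a rough illustration of what the two regular-expression searches
mentioned above look for, the following sketch applies them to a few C
fragments.  The fragments, the names ``next_index`` and ``state``, and
the helper ``classify`` are hypothetical and were invented for this
example; they are not taken from any real extension module.

.. code-block:: python

   import re

   # Patterns quoted in the text above, written as Python regular expressions.
   DIRECT_OB_ITEM = re.compile(r"[^a-zA-Z_]ob_item")
   NESTED_CALL_IN_MACRO = re.compile(r"PyList_GET_ITEM\([^)]+\(")

   # Hypothetical C fragments, invented purely for illustration.
   SAMPLES = {
       "uses_api": "item = PyList_GET_ITEM(list, i);",
       "pokes_structure": "item = ((PyListObject *)list)->ob_item[i];",
       "side_effect_arg": "item = PyList_GET_ITEM(list, next_index(state));",
   }

   def classify(source):
       """Return the set of risk labels that apply to one C fragment."""
       risks = set()
       if DIRECT_OB_ITEM.search(source):
           # Touches the list structure directly; would break under option 1.
           risks.add("direct-structure-access")
       if NESTED_CALL_IN_MACRO.search(source):
           # A call inside the macro argument may run more than once.
           risks.add("call-inside-macro-argument")
       return risks

   for name, source in SAMPLES.items():
       print(name, sorted(classify(source)))

Only the last two fragments are flagged: one reads ``ob_item`` directly,
and the other passes a function call as a macro argument, which is the
case where multiple evaluation could matter.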

Reference Implementation
========================

A reference implementation of the BList is available for CPython at [1]_.

The source package also includes a pure Python implementation,
originally developed as a prototype for the CPython version.
Naturally, the pure Python version is rather slow and the asymptotic
improvements don't win out until the list is quite large.

When compiled with Py_DEBUG, the C implementation checks the
BList invariants when entering and exiting most functions.

An extensive set of test cases is also included in the source package.
The test cases include the existing Python sequence and list test
cases as a subset.  When the interpreter is built with Py_DEBUG, the
test cases also check for reference leaks.
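
The remark above that asymptotic improvements only pay off for large
lists can be explored with a small timing sketch.  This is not the
methodology behind [2]_; it is a minimal setup that times insertions at
index 0, where an array-based list needs O(n) time per insertion.  It
assumes that the package at [1]_ exposes a ``blist`` type importable as
``from blist import blist``, and it times only the built-in list if
that import fails.  The helper name ``time_front_inserts`` and the
chosen sizes are illustrative.

.. code-block:: python

   import timeit

   def time_front_inserts(factory, size, inserts=200):
       """Return the seconds spent on `inserts` insertions at index 0."""
       seq = factory(range(size))

       def work():
           for i in range(inserts):
               seq.insert(0, i)

       return timeit.timeit(work, number=1)

   # Assumption: the third-party package provides ``blist.blist``.
   try:
       from blist import blist
       candidates = {"list": list, "blist": blist}
   except ImportError:
       candidates = {"list": list}

   results = {
       (name, size): time_front_inserts(factory, size)
       for name, factory in candidates.items()
       for size in (1_000, 100_000)
   }

   for (name, size), seconds in sorted(results.items()):
       print(f"{name:>5} n={size:>7}: {seconds:.6f}s")

Absolute numbers depend on the machine and interpreter, so the sketch
is only useful for comparing how the timings grow with the list size.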

Porting to Other Python Variants
--------------------------------

If the BList is added to the collections module, other Python variants
can support it in one of three ways:

1. Make blist an alias for list.  The asymptotic performance won't be
   as good, but it'll work.
2. Use the pure Python reference implementation.  The performance for
   small lists won't be as good, but it'll work.
3. Port the reference implementation.
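
The first option above can also be sketched from the point of view of
application code: use the BList when it is available and fall back to
an alias for the built-in list otherwise.  This assumes that the
package at [1]_ exposes a ``blist`` type importable as
``from blist import blist``; the variable ``items`` is illustrative.

.. code-block:: python

   # Prefer the third-party BList when it is installed; otherwise fall
   # back to the built-in list, which offers the same interface.
   try:
       from blist import blist  # assumption: name exported by the package
   except ImportError:
       blist = list  # alias: same behavior, array-based performance

   items = blist(range(10))
   items.insert(0, -1)  # O(log n) with a BList, O(n) with the alias
   del items[5]         # removes the value 4
   print(list(items))

Because the two types share an interface, the rest of the program does
not need to know which one it received.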

Discussion
==========

This proposal has been discussed briefly on the Python-3000 mailing
list [3]_.  Although a number of people favored the proposal, there
were also some objections.  The pros and cons observed by posters to
the thread are summarized below.

General comments:

- Pro: Will outperform the array-based list in most cases
- Pro: "I've implemented variants of this ... a few different times"
- Con: Desirability and performance in actual applications are unproven

Comments on adding BList to the collections module:

- Pro: Matching the list-API reduces the learning curve to near-zero
- Pro: Useful for intermediate-level users; won't get in the way of beginners
- Con: Proliferation of data types makes the choices for developers harder

Comments on replacing the array-based list with the BList:

- Con: Impact on extension modules (addressed in `Backwards
  Compatibility`_)
- Con: The use cases where BLists are slower are important
  (see `Use Case Trade-Offs`_ for how these might be addressed)
- Con: The array-based list code is simple and easy to maintain

To assess the desirability and performance in actual applications,
Raymond Hettinger suggested releasing the BList as an extension module
(now available at [1]_).  If it proves useful, he felt it would be a
strong candidate for inclusion in 2.6 as part of the collections
module.  If widely popular, then it could be considered for replacing
the array-based list, but not otherwise.

Guido van Rossum commented that he opposed the proliferation of data
types, but favored replacing the array-based list if backwards
compatibility could be addressed and the BList's performance was
uniformly better.

On-going Tasks
==============

- Reduce the memory footprint of small lists
- Implement TimSort for BLists, so that best-case sorting is O(n)
  instead of O(n log n)
- Implement __reversed__
- Cache a pointer in the root to the rightmost leaf, to make LIFO
  operations O(1) time

References
==========

.. [1] Reference Implementations for C and Python:
   http://www.python.org/pypi/blist/

.. [2] Empirical performance comparison between Python's array-based
   list and the blist: http://stutzbachenterprises.com/blist/

.. [3] Discussion on python-3000 starting at post:
   https://mail.python.org/pipermail/python-3000/2007-April/006757.html

.. [4] Raymond Hettinger's feedback on python-3000:
   https://mail.python.org/pipermail/python-3000/2007-May/007491.html


Copyright
=========

This document has been placed in the public domain.



..
   Local Variables:
   mode: indented-text
   indent-tabs-mode: nil
   sentence-end-double-space: t
   fill-column: 70
   coding: utf-8
   End: