PEP: 424
Title: A method for exposing a length hint
Version: $Revision$
Last-Modified: $Date$
Author: Alex Gaynor <alex.gaynor@gmail.com>
Status: Accepted
Type: Standards Track
Content-Type: text/x-rst
Created: 14-July-2012
Python-Version: 3.4
Post-History: http://mail.python.org/pipermail/python-dev/2012-July/120920.html

Abstract
========

CPython currently defines a ``__length_hint__`` method on several
types, such as various iterators. Other functions (such as ``list``)
use this method to presize lists based on the estimate that
``__length_hint__`` returns. Types that are not sized, and thus should
not define ``__len__`` (such as many iterators), can define
``__length_hint__`` to allow estimating or computing a size.

Specification
=============

This PEP formally documents ``__length_hint__`` for other
interpreters and non-standard-library Python modules to implement.

``__length_hint__`` must return an integer (else a ``TypeError`` is raised)
or ``NotImplemented``, and is not required to be accurate. It may return a
value that is either larger or smaller than the actual size of the
container. A return value of ``NotImplemented`` indicates that there is no
finite length estimate. It may not return a negative value (else a
``ValueError`` is raised).
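
As a minimal sketch of types that follow these rules, the two classes
below are hypothetical illustrations and not part of this PEP.
``Countdown`` is an iterator that is not sized, so it defines
``__length_hint__`` rather than ``__len__``. ``Unbounded`` returns
``NotImplemented`` to signal that it has no finite estimate. The sketch
assumes Python 3 iterator conventions (``__next__``).

```python
class Countdown:
    """Hypothetical iterator yielding n-1, n-2, ..., 0."""

    def __init__(self, n):
        self._remaining = n

    def __iter__(self):
        return self

    def __next__(self):
        if self._remaining <= 0:
            raise StopIteration
        self._remaining -= 1
        return self._remaining

    def __length_hint__(self):
        # Exact in this case, but a hint is allowed to over- or
        # under-estimate. It must be a non-negative integer.
        return self._remaining


class Unbounded:
    """Hypothetical iterator with no finite length estimate."""

    def __iter__(self):
        return self

    def __next__(self):
        return 0

    def __length_hint__(self):
        # Signals that no finite estimate is available.
        return NotImplemented


countdown = Countdown(3)
hint_before = countdown.__length_hint__()  # hint while 3 items remain
items = list(countdown)                    # consumes the iterator
hint_after = countdown.__length_hint__()   # hint once exhausted
```

Here the hint shrinks as the iterator is consumed, so a consumer that
asks early can presize its container, while one that asks late gets a
smaller estimate.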

In addition, a new function ``operator.length_hint`` is added,
with the following semantics (which define how ``__length_hint__`` should
be used)::

    def length_hint(obj, default=0):
        """Return an estimate of the number of items in obj.

        This is useful for presizing containers when building from an
        iterable.

        If the object supports len(), the result will be
        exact. Otherwise, it may over- or under-estimate by an
        arbitrary amount. The result will be an integer >= 0.
        """
        if <obj has a __len__ method>:
            return len(obj)
        else:
            try:
                get_hint = obj.__length_hint__
            except AttributeError:
                return default
            hint = get_hint()
            if hint is NotImplemented:
                return default
            if not isinstance(hint, int):
                raise TypeError("Length hint must be an integer, not %r" %
                                type(hint))
            if hint < 0:
                raise ValueError("__length_hint__() should return >= 0")
            return hint

Note: there is no good way to spell "obj has a ``__len__`` method" in pure
Python. In CPython, this comes down to checking for a ``sq_length``
slot. Other implementations presumably have their own way of
checking. Calling ``len(obj)`` and catching ``TypeError`` is not quite
correct (as it would assume no ``__len__`` method exists when in fact one
exists but calling it raises ``TypeError``); checking
``hasattr(obj, '__len__')`` likewise is incorrect if ``obj`` is a class
defining a ``__len__`` method for its instances.
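
One way to approximate the "obj has a ``__len__`` method" check from the
note above in pure Python is to look ``__len__`` up on the object's type
rather than on the object itself. The sketch below illustrates that
assumption and is not the CPython implementation: ``has_len``,
``length_hint_sketch``, ``Sized`` and ``NoEstimate`` are hypothetical
names, and the result may still differ from a real slot check in edge
cases. Negative hints raise ``ValueError``, as the specification text
requires.

```python
def has_len(obj):
    # Look the method up on the type, not the instance. A class object
    # passed as obj is then not mistaken for a sized object, and a
    # TypeError raised inside __len__ is not mistaken for "no __len__".
    return hasattr(type(obj), "__len__")


def length_hint_sketch(obj, default=0):
    """Pure-Python approximation of the semantics described above."""
    if has_len(obj):
        return len(obj)
    try:
        get_hint = type(obj).__length_hint__
    except AttributeError:
        return default
    hint = get_hint(obj)
    if hint is NotImplemented:
        return default
    if not isinstance(hint, int):
        raise TypeError("Length hint must be an integer, not %r" % type(hint))
    if hint < 0:
        raise ValueError("__length_hint__() should return >= 0")
    return hint


class Sized:
    """Hypothetical sized type."""

    def __len__(self):
        return 5


class NoEstimate:
    """Hypothetical type with no finite length estimate."""

    def __length_hint__(self):
        return NotImplemented


results = [
    length_hint_sketch(Sized()),           # sized instance: uses len()
    length_hint_sketch(Sized),             # the class itself is not sized
    length_hint_sketch(iter([1, 2, 3])),   # list iterators provide a hint
    length_hint_sketch(NoEstimate(), 10),  # NotImplemented: use the default
    length_hint_sketch(object(), 7),       # no __len__ and no hint
]
```

With these definitions, ``results`` is ``[5, 0, 3, 10, 7]``: the class
``Sized`` falls through to the default because ``__len__`` is defined for
its instances, not for the class object.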


Rationale
=========

Being able to pre-allocate lists based on the expected size, as estimated by
``__length_hint__``, can be a significant optimization. CPython has been
observed to run some code faster than PyPy purely because this optimization
is present.

Copyright
=========

This document has been placed into the public domain.

..
   Local Variables:
   mode: indented-text
   indent-tabs-mode: nil
   sentence-end-double-space: t
   fill-column: 70
   coding: utf-8