update from Josiah Carlson
parent cef13d2601
commit 778f8da1d3
309
pep-0326.txt
@@ -9,34 +9,34 @@ Type: Standards Track
 Content-Type: text/x-rst
 Created: 20-Dec-2003
 Python-Version: 2.4
-Post-History: 20-Dec-2003, 03-Jan-2004, 05-Jan-2004
+Post-History: 20-Dec-2003, 03-Jan-2004, 05-Jan-2004, 07-Jan-2004
 
 
 Abstract
 ========
 
-This PEP proposes two new attributes to the ``cmp`` built-in that
-represent a top and bottom [4]_ value: ``high`` and ``low`` (or a pair
-of similarly named attributes [5]_).
+This PEP proposes two singleton constants that represent a top and
+bottom [3]_ value: ``Max`` and ``Min`` (or two similarly suggestive
+names [4]_; see `Open Issues`_).
 
-As suggested by their names, ``cmp.high`` and ``cmp.low`` would
-compare higher or lower than any other object (respectively). Such
-behavior results in easier to understand code and fewer special cases
-in which a temporary minimum or maximum is required, and an actual
-minimum or maximum numeric value is not limited.
+As suggested by their names, ``Max`` and ``Min`` would compare higher
+or lower than any other object (respectively). Such behavior results
+in easier to understand code and fewer special cases in which a
+temporary minimum or maximum value is required, and an actual minimum
+or maximum numeric value is not limited.
 
 
 Rationale
 =========
 
 While ``None`` can be used as an absolute minimum that any value can
-attain [1]_, this may be depreciated [5]_ in Python 3.0, and shouldn't
+attain [1]_, this may be depreciated [4]_ in Python 3.0, and shouldn't
 be relied upon.
 
 As a replacement for ``None`` being used as an absolute minimum, as
-well as the introduction of an absolute maximum, attaching ``low`` and
-``high`` to ``cmp`` addresses concerns for namespace pollution and
-serves to make both self-documenting.
+well as the introduction of an absolute maximum, the introduction of
+two singleton constants ``Max`` and ``Min`` address concerns for the
+constants to be self-documenting.
 
 What is commonly done to deal with absolute minimum or maximum values,
 is to set a value that is larger than the script author ever expects
@@ -69,24 +69,23 @@ infinity). However, each has their drawbacks.
 
 - These same drawbacks exist when numbers are small.
 
-Introducing ``high`` and ``low`` attributes to ``cmp`` that work as
-described does not take much effort. A sample Python `reference
-implementation`_ of both attributes is included.
+Introducing ``Max`` and ``Min`` that work as described above does not
+take much effort. A sample Python `reference implementation`_ of both
+is included.
 
 
 Motivation
 ==========
 
-``cmp.high`` Examples
----------------------
-
 There are hundreds of algorithms that begin by initializing some set
 of values to a logical (or numeric) infinity or negative infinity.
 Python lacks either infinity that works consistently or really is the
-most extreme value that can be attained. By adding the ``cmp.high``
-and ``cmp.low`` attributes, Python would have a real maximum and
-minimum value, and such algorithms can become clearer due to the
-reduction of special cases.
+most extreme value that can be attained. By adding ``Max`` and
+``Min``, Python would have a real maximum and minimum value, and such
+algorithms can become clearer due to the reduction of special cases.
 
+``Max`` Examples
+---------------------
+
 Take for example, finding the minimum in a sequence::
 
@@ -114,26 +113,26 @@ Take for example, finding the minimum in a sequence::
 
 ::
 
-    def findmin_high(seq):
-        cur = cmp.high
+    def findmin_Max(seq):
+        cur = Max
         for obj in seq:
             cur = min(obj, cur)
         return cur
 
 Please note that there are an arbitrarily large number of ways to find
 the minimum (or maximum) of a sequence, these seek to show a simple
-example where using ``cmp.high`` makes the algorithm easier to
-understand and results in simplification of code.
+example where using ``Max`` makes the algorithm easier to understand
+and results in the simplification of code.
 
 Guido brought up the idea of just negating everything and comparing
 [2]_. Certainly this does work when using numbers, but it does not
 remove the special case (actually adds one) and results in the code
 being less readable. ::
 
-    #we have cmp.high available
+    #we have Max available
     a = min(a, b)
 
-    #we don't have cmp.high available
+    #we don't have Max available
     if a is not None:
         if b is None:
             a = b
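Note on the hunk above: ``findmin_Max`` only works if ``Max`` compares greater than every other object. The following is a minimal sketch of that behavior, not code from the patch. It assumes Python 3, where ``__cmp__`` no longer exists, so the hypothetical ``_Max`` class below spells out the rich comparison methods by hand.

```python
class _Max(object):
    """Hypothetical stand-in for the PEP's ``Max`` (an assumption, not from the patch)."""

    def __eq__(self, other):
        return isinstance(other, _Max)

    def __hash__(self):
        return hash(_Max)

    def __lt__(self, other):
        # Max is never smaller than anything.
        return False

    def __le__(self, other):
        return isinstance(other, _Max)

    def __gt__(self, other):
        # Max is greater than everything except itself.
        return not isinstance(other, _Max)

    def __ge__(self, other):
        return True

    def __repr__(self):
        return "Max"


Max = _Max()


def findmin_Max(seq):
    # Start from the top value so the loop body needs no special case.
    cur = Max
    for obj in seq:
        cur = min(obj, cur)
    return cur
```

With this sketch an empty sequence yields ``Max`` itself, which the caller can test for, and no ``None`` checks are needed inside the loop.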
@@ -166,10 +165,10 @@ table (a bit more understandable) and one using a heap (much faster)::
         table = {}
         for node in graph.iterkeys():
             #(visited, distance, node, parent)
-            table[node] = (0, cmp.high, node, None)
+            table[node] = (0, Max, node, None)
         table[S] = (0, 0, S, None)
         cur = min(table.values())
-        while (not cur[0]) and cur[1] < cmp.high:
+        while (not cur[0]) and cur[1] < Max:
             (visited, distance, node, parent) = cur
             table[node] = (1, distance, node, parent)
             for cdist, child in graph[node]:
@@ -194,10 +193,10 @@ table (a bit more understandable) and one using a heap (much faster)::
         #runs in O(NlgN) time using a minheap
         #find the shortest path
         import heapq
-        Q = [(cmp.high, i, None) for i in graph.iterkeys()]
+        Q = [(Max, i, None) for i in graph.iterkeys()]
         heapq.heappush(Q, (0, S, None))
         V = {}
-        while Q[0][0] < cmp.high and T not in V:
+        while Q[0][0] < Max and T not in V:
             dist, node, parent = heapq.heappop(Q)
             if node in V:
                 continue
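Note on the two Dijkstra hunks above: both use ``Max`` as the initial "not yet reached" distance. The sketch below shows the same idea in Python 3 with a simplified structure. It is not the patch's ``DijkstraSP_heap``. The ``_Top`` class, the ``shortest_distance`` function, and the sample graph are hypothetical names chosen for illustration, and the graph layout (``graph[node]`` is a list of ``(weight, child)`` pairs) is inferred from the loops in the hunks.

```python
import heapq


class _Top(object):
    """Hypothetical top value: compares greater than any other object."""

    def __lt__(self, other):
        return False

    def __gt__(self, other):
        return other is not self

    def __repr__(self):
        return "Max"


Max = _Top()


def shortest_distance(graph, source, target):
    # Every node starts at the top value, so no distance can exceed it.
    dist = {node: Max for node in graph}
    dist[source] = 0
    heap = [(0, source)]
    done = set()
    while heap:
        d, node = heapq.heappop(heap)
        if node in done:
            continue
        done.add(node)
        if node == target:
            return d
        for weight, child in graph[node]:
            nd = d + weight
            # No special case for "unreached": any number is below Max.
            if nd < dist[child]:
                dist[child] = nd
                heapq.heappush(heap, (nd, child))
    # Unreachable targets keep their initial top value.
    return dist[target]


graph = {
    "a": [(1, "b"), (4, "c")],
    "b": [(2, "c")],
    "c": [],
    "d": [],
}
```

Here ``shortest_distance(graph, "a", "d")`` returns ``Max`` because ``"d"`` is never reached, whatever the edge weights are.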
@@ -215,20 +214,20 @@ table (a bit more understandable) and one using a heap (much faster)::
         path.reverse()
         return path
 
-Readers should note that replacing ``cmp.high`` in the above code with
-an arbitrarily large number does not guarantee that the actual
+Readers should note that replacing ``Max`` in the above code with an
+arbitrarily large number does not guarantee that the shortest path
 distance to a node will never exceed that number. Well, with one
 caveat: one could certainly sum up the weights of every edge in the
 graph, and set the 'arbitrarily large number' to that total. However,
 doing so does not make the algorithm any easier to understand and has
-potential problems with various numeric overflows.
+potential problems with numeric overflows.
 
 
-A ``cmp.low`` Example
----------------------
+A ``Min`` Example
+-----------------
 
-An example of usage for ``cmp.low`` is an algorithm that solves the
-following problem [7]_:
+An example of usage for ``Min`` is an algorithm that solves the
+following problem [6]_:
 
     Suppose you are given a directed graph, representing a
     communication network. The vertices are the nodes in the network,
@@ -246,10 +245,10 @@ Such an algorithm is a 7 line modification to the DijkstraSP_table
 algorithm given above::
 
     #only showing the changed to lines with the proper indentation
-            table[node] = (0, cmp.low, node, None)
+            table[node] = (0, Min, node, None)
         table[S] = (0, 1, S, None)
         cur = max(table.values())
-        while (not cur[0]) and cur[1] > cmp.low:
+        while (not cur[0]) and cur[1] > Min:
                 ndist = distance*cdist
                 if not table[child][0] and ndist > table[child][1]:
             cur = max(table.values())
@@ -260,9 +259,9 @@ to)::
 
     #only showing the changed to lines with the proper indentation
         import maxheapq
-        Q = [(cmp.low, i, None) for i in graph.iterkeys()]
+        Q = [(Min, i, None) for i in graph.iterkeys()]
         maxheapq.heappush(Q, (1, S, None))
-        while Q[0][0] > cmp.low and T not in V:
+        while Q[0][0] > Min and T not in V:
             dist, node, parent = maxheapq.heappop(Q)
                 maxheapq.heappush(Q, (next*dist, dest, node))
 
@@ -270,45 +269,69 @@ Note that there is an equivalent way of translating the graph to
 produce something that can be passed unchanged into the original
 Dijkstra shortest path algorithm.
 
-Further usage examples of both ``cmp.high`` and ``cmp.low`` are
-available in the realm of graph algorithms.
+
+Other Examples
+--------------
+
+Andrew P. Lentvorski, Jr. [7]_ has pointed out that various data
+structures involving range searching have immediate use for ``Max``
+and ``Min`` values. More specifically; Segment trees, Range trees,
+k-d trees and database keys:
+
+    ...The issue is that a range can be open on one side and does not
+    always have an initialized case.
+
+    The solutions I have seen are to either overload None as the
+    extremum or use an arbitrary large magnitude number. Overloading
+    None means that the built-ins can't really be used without special
+    case checks to work around the undefined (or "wrongly defined")
+    ordering of None. These checks tend to swamp the nice performance
+    of built-ins like max() and min().
+
+    Choosing a large magnitude number throws away the ability of
+    Python to cope with arbitrarily large integers and introduces a
+    potential source of overrun/underrun bugs.
+
+Further use examples of both ``Max`` and ``Min`` are available in the
+realm of graph algorithms, range searching algorithms, computational
+geometry algorithms, and others.
 
 
 Independent Implementations?
 ----------------------------
 
-Independent implementations of the top/bottom concept by users
+Independent implementations of the ``Min``/``Max`` concept by users
 desiring such functionality are not likely to be compatible, and
-certainly will be inconsistent. The following examples seek to show
-how inconsistent they can be.
+certainly will produce inconsistent orderings. The following examples
+seek to show how inconsistent they can be.
 
 - Let us pretend we have created proper separate implementations of
-  Myhigh, Mylow, Yourhigh and Yourlow with the same code as given in
+  MyMax, MyMin, YourMax and YourMin with the same code as given in
   the sample implementation (with some minor renaming)::
 
-      >>> lst = [Yourlow, Mylow, Mylow, Yourlow, Myhigh, Yourlow,
-      Myhigh, Yourhigh, Myhigh]
+      >>> lst = [YourMin, MyMin, MyMin, YourMin, MyMax, YourMin, MyMax,
+      YourMax, MyMax]
       >>> lst.sort()
       >>> lst
-      [Yourlow, Yourlow, Mylow, Mylow, Yourlow, Myhigh, Myhigh,
-      Yourhigh, Myhigh]
+      [YourMin, YourMin, MyMin, MyMin, YourMin, MyMax, MyMax, YourMax,
+      MyMax]
 
-  Notice that while all the "low"s are before the "high"s, there is no
-  guarantee that all instances of Yourlow will come before Mylow, the
-  reverse, or the equivalent Myhigh and Yourhigh.
+  Notice that while all the "Min"s are before the "Max"s, there is no
+  guarantee that all instances of YourMin will come before MyMin, the
+  reverse, or the equivalent MyMax and YourMax.
 
-- The problem is evident even when using the heapq module::
+- The problem is also evident when using the heapq module::
 
-      >>> lst = [Yourlow, Mylow, Mylow, Yourlow, Myhigh, Yourlow,
-      Myhigh, Yourhigh, Myhigh]
+      >>> lst = [YourMin, MyMin, MyMin, YourMin, MyMax, YourMin, MyMax,
+      YourMax, MyMax]
       >>> heapq.heapify(lst) #not needed, but it can't hurt
       >>> while lst: print heapq.heappop(lst),
       ...
-      Yourlow Mylow Yourlow Yourlow Mylow Myhigh Myhigh Yourhigh Myhigh
+      YourMin MyMin YourMin YourMin MyMin MyMax MyMax YourMax MyMax
 
-- Furthermore, the findmin_high code and both versions of Dijkstra
+- Furthermore, the findmin_Max code and both versions of Dijkstra
   could result in incorrect output by passing in secondary versions of
-  high.
+  ``Max``.
 
 
 Reference Implementation
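Note on the "Independent Implementations?" text above: the inconsistency it describes can be reproduced with two unrelated top values. This is a minimal Python 3 sketch under stated assumptions, not code from the patch. ``make_top`` is a hypothetical helper that builds a fresh class on every call, standing in for two libraries that each ship their own ``Max``.

```python
def make_top(name):
    """Build an independent top value, as two unrelated libraries might."""

    class _Top(object):
        def __eq__(self, other):
            return other is self

        def __hash__(self):
            return id(self)

        def __lt__(self, other):
            # A top value is never below anything.
            return False

        def __gt__(self, other):
            # A top value is above everything except itself.
            return other is not self

        def __repr__(self):
            return name

    return _Top()


MyMax = make_top("MyMax")
YourMax = make_top("YourMax")

# Each object claims to be above the other, so there is no single order.
both_claim_top = (MyMax > YourMax) and (YourMax > MyMax)

# Sorting therefore just preserves whatever order the input had.
first_order = sorted([MyMax, YourMax])
second_order = sorted([YourMax, MyMax])
```

Because neither object is ever "less than" the other, ``sorted`` leaves each input in the order it arrived, which is the kind of inconsistent ordering the bullet list above warns about.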
@@ -316,42 +339,36 @@ Reference Implementation
 
 ::
 
-    class _HighType(object):
+    class _ExtremeType(object):
 
+        def __init__(self, cmpr, rep):
+            object.__init__(self)
+            self._cmpr = cmpr
+            self._rep = rep
+
         def __cmp__(self, other):
-            if isinstance(other, self.__class__):
+            if isinstance(other, self.__class__) and\
+               other._cmpr == self._cmpr:
                 return 0
-            return 1
+            return self._cmpr
 
         def __repr__(self):
-            return 'cmp.high'
+            return self._rep
 
-    class _LowType(object):
+    Max = _ExtremeType(1, "Max")
+    Min = _ExtremeType(-1, "Min")
 
-        def __cmp__(self, other):
-            if isinstance(other, self.__class__):
-                return 0
-            return -1
+Results of Test Run::
 
-        def __repr__(self):
-            return 'cmp.low'
-
-    # please note that the following code doesn't
-    # work due to built-ins being read-only
-    cmp.high = _HighType()
-    cmp.low = _LowType()
-
-Results of Test Run if we could set cmp.high and cmp.low::
-
-    >>> max(cmp.high, 2**65536)
-    cmp.high
-    >>> min(cmp.high, 2**65536)
+    >>> max(Max, 2**65536)
+    Max
+    >>> min(Max, 2**65536)
     20035299304068464649790...
     (lines removed for brevity)
     ...72339445587895905719156736L
-    >>> min(cmp.low, -2**65536)
-    cmp.low
-    >>> max(cmp.low, -2**65536)
+    >>> min(Min, -2**65536)
+    Min
+    >>> max(Min, -2**65536)
     -2003529930406846464979...
     (lines removed for brevity)
     ...072339445587895905719156736L
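Note on the reference implementation above: it relies on ``__cmp__``, which Python 3 no longer supports. The sketch below is one possible Python 3 adaptation, offered as an assumption rather than as part of the patch. It keeps the ``_ExtremeType(cmpr, rep)`` constructor from the hunk but expresses the ordering through ``__eq__`` and ``__lt__``, with ``functools.total_ordering`` deriving the remaining comparisons.

```python
import functools


@functools.total_ordering
class _ExtremeType(object):
    """Python 3 adaptation (an assumption) of the patch's _ExtremeType."""

    def __init__(self, cmpr, rep):
        self._cmpr = cmpr   # 1 for the top value, -1 for the bottom value
        self._rep = rep

    def __eq__(self, other):
        # Equal only to another extreme of the same direction.
        return (isinstance(other, _ExtremeType)
                and other._cmpr == self._cmpr)

    def __hash__(self):
        return hash((_ExtremeType, self._cmpr))

    def __lt__(self, other):
        if self == other:
            return False
        # The bottom value is below everything else; the top value never is.
        return self._cmpr < 0

    def __repr__(self):
        return self._rep


Max = _ExtremeType(1, "Max")
Min = _ExtremeType(-1, "Min")
```

Under this sketch the four calls from the test run in the hunk behave the same way: ``max(Max, 2**65536)`` and ``min(Min, -2**65536)`` return the extreme objects, while ``min(Max, 2**65536)`` and ``max(Min, -2**65536)`` return the integers.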
@@ -360,66 +377,15 @@ Results of Test Run if we could set cmp.high and cmp.low::
 Open Issues
 ===========
 
-- Previously, ``Some`` and ``All`` were names for the idea that
-  ``cmp.high`` now represents. They have been subsumed by
-  ``cmp.high`` due to the relative ambiguity of ``Some`` and ``All``.
+Current options for the naming and namespace for ``Min``/``Max``, in
+no particular order:
 
-- Terry Reedy [5]_ and others have offered alternate names for the
-  ``high/low`` objects: ``ObjHi/ObjLo``, ``NoneHi/NoneLo``,
-  ``Highest/Lowest``, ``Infinity/NegativeInfinity``, ``hi/lo`` and
-  ``High/Low``.
-
-- Terry Reedy has also offered possible default behaviors of ``min``
-  and ``max`` on empty lists using these values. Some have voiced
-  that changing the behavior of min and max are not desirable due to
-  the amount of code that actively uses min and max, which may rely on
-  min and max raising exceptions on empty lists.
-
-- Choosing ``high`` and ``low`` to be the attributes of ``cmp`` is
-  arbitrary, but meaningful. Other meaningful parent locations
-  include, but are not limited to: ``math``, ``int`` and ``number``
-  (if such a numeric superclass existed). ``sys`` probably does not
-  make sense, as such maximum and minimum values are not platform
-  dependent.
-
-- The base class of the high and low objects do not necessarily have
-  to be ``object``. ``object``, ``NoneType`` or even a new class
-  called ``cmp.extreme`` have been suggested.
-
-- Various built-in names such as ``All`` and ``Some`` have been
-  rejected by many users. Based on comments, it seems that regardless
-  of name, any new built-in would be rejected. [6]_
-
-- Should ``-cmp.high == cmp.low``? This seems to make logical sense.
-
-- Certainly ``bool(cmp.high) == True``, but should ``bool(cmp.low)``
-  be ``True`` or ``False``? Due to ``bool(1) == bool(-1) == True``,
-  it seems to follow that ``bool(cmp.high) == bool(cmp.low) == True``.
-
-- Whatever name the concepts of a top and bottom value come to have,
-  the question of whether or not math can be done on them may or may
-  not make sense. If math were not allowed, it brings up a potential
-  ambiguity that while ``-cmp.high == cmp.low``, ``-1 * cmp.high``
-  would produce an exception.
-
-
-Most-Preferred Options
-======================
-
-Through a non-rigorous method, the following behavior of the objects
-seem to be preferred by those who are generally in favor of this PEP
-in python-dev.
-
-- ``high`` and ``low`` objects should be attached to the ``cmp``
-  built-in as ``cmp.high`` and ``cmp.low`` (or ``cmp.High/cmp.Low``).
-
-- ``-cmp.high == cmp.low`` and equivalently ``-cmp.low == cmp.high``.
-
-- ``bool(cmp.high) == bool(cmp.low) == True``
-
-- The base type seems to be a cosmetic issue and has not resulted in
-  any real preference other than ``cmp.extreme`` making the most
-  sense.
+1. Give the built-in ``max`` and ``min`` appropriate ``__cmp__``
+   methods to allow them to double as ``Min``/``Max``.
+2. Attach them to attributes of the ``cmp()`` built-in.
+3. Attach them to attributes of an appropriate type object.
+4. Make them an appropriate module object.
+5. Create two new built-ins with appropriate names.
 
 
 References
@@ -431,61 +397,54 @@ References
 .. [2] Re: [Python-Dev] Got None. Maybe Some?, von Rossum, Guido
    (http://mail.python.org/pipermail/python-dev/2003-December/041352.html)
 
-.. [3] [Python-Dev] Re: Got None. Maybe Some?, Reedy, Terry
-   (http://mail.python.org/pipermail/python-dev/2003-December/041337.html)
-
-.. [4] RE: [Python-Dev] Got None. Maybe Some?, Peters, Tim
+.. [3] RE: [Python-Dev] Got None. Maybe Some?, Peters, Tim
    (http://mail.python.org/pipermail/python-dev/2003-December/041332.html)
 
-.. [5] [Python-Dev] Re: PEP 326 now online, Reedy, Terry
+.. [4] [Python-Dev] Re: PEP 326 now online, Reedy, Terry
    (http://mail.python.org/pipermail/python-dev/2004-January/041685.html)
 
-.. [6] [Python-Dev] PEP 326 now online, Chermside, Michael
+.. [5] [Python-Dev] PEP 326 now online, Chermside, Michael
    (http://mail.python.org/pipermail/python-dev/2004-January/041704.html)
 
-.. [7] Homework 6, Problem 7, Dillencourt, Michael
+.. [6] Homework 6, Problem 7, Dillencourt, Michael
    (link may not be valid in the future)
    (http://www.ics.uci.edu/~dillenco/ics161/hw/hw6.pdf)
 
+.. [7] RE: [Python-Dev] PEP 326 now online, Lentvorski, Andrew P., Jr.
+   (http://mail.python.org/pipermail/python-dev/2004-January/041727.html)
+
+.. [8] Re: It's not really Some is it?, Ippolito, Bob
+   (http://www.livejournal.com/users/chouyu_31/138195.html?thread=274643#t274643)
 
 Changes
 =======
 
 - Added this section.
 
-- Renamed ``Some`` to ``All``: "Some" was an arbitrary name that
-  suffered from being unclear. [3]_
-
-- Made ``All`` a subclass of ``object`` in order for it to become a
-  new-style class.
-
-- Removed mathematical negation and casting to float in Open Issues.
-  ``None`` is not a number and is not treated as one, ``All``
-  shouldn't be either.
-
 - Added Motivation_ section.
 
 - Changed markup to reStructuredText.
 
-- Renamed ``All`` to ``cmp.hi`` to remove builtin requirement and to
-  provide a better better name, as well as adding an equivalent
-  future-proof bottom value ``cmp.lo``. [5]_
+- Concept gets a possible name and location. [5]_
 
 - Clarified Abstract_, Motivation_, `Reference Implementation`_ and
-  `Open Issues`_ based on the simultaneous concepts of ``cmp.hi`` and
-  ``cmp.lo``.
+  `Open Issues`_ based on the simultaneous concepts of ``Max`` and
+  ``Min``.
 
 - Added two implementations of Dijkstra's Shortest Path algorithm that
-  show where ``cmp.hi`` can be used to remove special cases.
+  show where ``Max`` can be used to remove special cases.
 
-- Renamed ``hi`` to ``high`` and ``lo`` to ``low`` to address concerns
-  for non-native english speakers.
+- Added an example of use for ``Min`` to Motivation_.
 
-- Added an example of use for ``cmp.low`` to Motivation_.
+- Added some `Open Issues`_ and clarified some others.
 
-- Added a couple `Open Issues`_ and clarified some others.
+- Added an example and `Other Examples`_ subheading.
 
-- Added `Most-Preferred Options`_ section.
+- Modified `Reference Implementation`_ to instantiate both items from
+  a single class/type.
+
+- Removed a large number of open issues that are not within the scope
+  of this PEP.
 
 
 Copyright