527 lines
20 KiB
Plaintext
527 lines
20 KiB
Plaintext
PEP: 326
|
||
Title: A Case for Top and Bottom Values
|
||
Version: $Revision$
|
||
Last-Modified: $Date$
|
||
Author: Josiah Carlson <jcarlson@uci.edu>,
|
||
Terry Reedy <tjreedy@udel.edu>
|
||
Status: Rejected
|
||
Type: Standards Track
|
||
Content-Type: text/x-rst
|
||
Created: 20-Dec-2003
|
||
Python-Version: 2.4
|
||
Post-History: 20-Dec-2003, 03-Jan-2004, 05-Jan-2004, 07-Jan-2004,
|
||
21-Feb-2004
|
||
|
||
Results
|
||
=======
|
||
|
||
This PEP has been rejected by the BDFL [12]_. As per the
|
||
pseudo-sunset clause [13]_, :pep:`326` is being updated one last time
|
||
with the latest suggestions, code modifications, etc., and includes a
|
||
link to a module [14]_ that implements the behavior described in the
|
||
PEP. Users who desire the behavior listed in this PEP are encouraged
|
||
to use the module for the reasons listed in
|
||
`Independent Implementations?`_.
|
||
|
||
|
||
Abstract
|
||
========
|
||
|
||
This PEP proposes two singleton constants that represent a top and
|
||
bottom [3]_ value: ``Max`` and ``Min`` (or two similarly suggestive
|
||
names [4]_; see `Open Issues`_).
|
||
|
||
As suggested by their names, ``Max`` and ``Min`` would compare higher
|
||
or lower than any other object (respectively). Such behavior results
|
||
in easier to understand code and fewer special cases in which a
|
||
temporary minimum or maximum value is required, and an actual minimum
|
||
or maximum numeric value is not limited.
|
||
|
||
|
||
Rationale
|
||
=========
|
||
|
||
While ``None`` can be used as an absolute minimum that any value can
|
||
attain [1]_, this may be deprecated [4]_ in Python 3.0 and shouldn't
|
||
be relied upon.
|
||
|
||
As a replacement for ``None`` being used as an absolute minimum, as
|
||
well as the introduction of an absolute maximum, the introduction of
|
||
two singleton constants ``Max`` and ``Min`` address concerns for the
|
||
constants to be self-documenting.
|
||
|
||
What is commonly done to deal with absolute minimum or maximum values,
|
||
is to set a value that is larger than the script author ever expects
|
||
the input to reach, and hope that it isn't reached.
|
||
|
||
Guido has brought up [2]_ the fact that there exists two constants
|
||
that can be used in the interim for maximum values: sys.maxint and
|
||
floating point positive infinity (1e309 will evaluate to positive
|
||
infinity). However, each has their drawbacks.
|
||
|
||
- On most architectures sys.maxint is arbitrarily small (2**31-1 or
|
||
2**63-1) and can be easily eclipsed by large 'long' integers or
|
||
floating point numbers.
|
||
|
||
- Comparing long integers larger than the largest floating point
|
||
number representable against any float will result in an exception
|
||
being raised::
|
||
|
||
>>> cmp(1.0, 10**309)
|
||
Traceback (most recent call last):
|
||
File "<stdin>", line 1, in ?
|
||
OverflowError: long int too large to convert to float
|
||
|
||
Even when large integers are compared against positive infinity::
|
||
|
||
>>> cmp(1e309, 10**309)
|
||
Traceback (most recent call last):
|
||
File "<stdin>", line 1, in ?
|
||
OverflowError: long int too large to convert to float
|
||
|
||
- These same drawbacks exist when numbers are negative.
|
||
|
||
Introducing ``Max`` and ``Min`` that work as described above does not
|
||
take much effort. A sample Python `reference implementation`_ of both
|
||
is included.
|
||
|
||
|
||
Motivation
|
||
==========
|
||
|
||
There are hundreds of algorithms that begin by initializing some set
|
||
of values to a logical (or numeric) infinity or negative infinity.
|
||
Python lacks either infinity that works consistently or really is the
|
||
most extreme value that can be attained. By adding ``Max`` and
|
||
``Min``, Python would have a real maximum and minimum value, and such
|
||
algorithms can become clearer due to the reduction of special cases.
|
||
|
||
``Max`` Examples
|
||
---------------------
|
||
|
||
When testing various kinds of servers, it is sometimes necessary to
|
||
only serve a certain number of clients before exiting, which results
|
||
in code like the following::
|
||
|
||
count = 5
|
||
|
||
def counts(stop):
|
||
i = 0
|
||
while i < stop:
|
||
yield i
|
||
i += 1
|
||
|
||
for client_number in counts(count):
|
||
handle_one_client()
|
||
|
||
When using ``Max`` as the value assigned to count, our testing server
|
||
becomes a production server with minimal effort.
|
||
|
||
As another example, in Dijkstra's shortest path algorithm on a graph
|
||
with weighted edges (all positive).
|
||
|
||
1. Set distances to every node in the graph to infinity.
|
||
2. Set the distance to the start node to zero.
|
||
3. Set visited to be an empty mapping.
|
||
4. While shortest distance of a node that has not been visited is less
|
||
than infinity and the destination has not been visited.
|
||
|
||
a. Get the node with the shortest distance.
|
||
b. Visit the node.
|
||
c. Update neighbor distances and parent pointers if necessary for
|
||
neighbors that have not been visited.
|
||
|
||
5. If the destination has been visited, step back through parent
|
||
pointers to find the reverse of the path to be taken.
|
||
|
||
.. _DijkstraSP_table:
|
||
|
||
Below is an example of Dijkstra's shortest path algorithm on a graph
|
||
with weighted edges using a table (a faster version that uses a heap
|
||
is available, but this version is offered due to its similarity to the
|
||
description above, the heap version is available via older versions of
|
||
this document). ::
|
||
|
||
def DijkstraSP_table(graph, S, T):
|
||
table = {} #3
|
||
for node in graph.iterkeys():
|
||
#(visited, distance, node, parent)
|
||
table[node] = (0, Max, node, None) #1
|
||
table[S] = (0, 0, S, None) #2
|
||
cur = min(table.values()) #4a
|
||
while (not cur[0]) and cur[1] < Max: #4
|
||
(visited, distance, node, parent) = cur
|
||
table[node] = (1, distance, node, parent) #4b
|
||
for cdist, child in graph[node]: #4c
|
||
ndist = distance+cdist #|
|
||
if not table[child][0] and ndist < table[child][1]:#|
|
||
table[child] = (0, ndist, child, node) #|_
|
||
cur = min(table.values()) #4a
|
||
if not table[T][0]:
|
||
return None
|
||
cur = T #5
|
||
path = [T] #|
|
||
while table[cur][3] is not None: #|
|
||
path.append(table[cur][3]) #|
|
||
cur = path[-1] #|
|
||
path.reverse() #|
|
||
return path #|_
|
||
|
||
Readers should note that replacing ``Max`` in the above code with an
|
||
arbitrarily large number does not guarantee that the shortest path
|
||
distance to a node will never exceed that number. Well, with one
|
||
caveat: one could certainly sum up the weights of every edge in the
|
||
graph, and set the 'arbitrarily large number' to that total. However,
|
||
doing so does not make the algorithm any easier to understand and has
|
||
potential problems with numeric overflows.
|
||
|
||
.. _DijkstraSP_table_node:
|
||
|
||
Gustavo Niemeyer [9]_ points out that using a more Pythonic data
|
||
structure than tuples, to store information about node distances,
|
||
increases readability. Two equivalent node structures (one using
|
||
``None``, the other using ``Max``) and their use in a suitably
|
||
modified Dijkstra's shortest path algorithm is given below. ::
|
||
|
||
class SuperNode:
|
||
def __init__(self, node, parent, distance, visited):
|
||
self.node = node
|
||
self.parent = parent
|
||
self.distance = distance
|
||
self.visited = visited
|
||
|
||
class MaxNode(SuperNode):
|
||
def __init__(self, node, parent=None, distance=Max,
|
||
visited=False):
|
||
SuperNode.__init__(self, node, parent, distance, visited)
|
||
def __cmp__(self, other):
|
||
return cmp((self.visited, self.distance),
|
||
(other.visited, other.distance))
|
||
|
||
class NoneNode(SuperNode):
|
||
def __init__(self, node, parent=None, distance=None,
|
||
visited=False):
|
||
SuperNode.__init__(self, node, parent, distance, visited)
|
||
def __cmp__(self, other):
|
||
pair = ((self.visited, self.distance),
|
||
(other.visited, other.distance))
|
||
if None in (self.distance, other.distance):
|
||
return -cmp(*pair)
|
||
return cmp(*pair)
|
||
|
||
def DijkstraSP_table_node(graph, S, T, Node):
|
||
table = {} #3
|
||
for node in graph.iterkeys():
|
||
table[node] = Node(node) #1
|
||
table[S] = Node(S, distance=0) #2
|
||
cur = min(table.values()) #4a
|
||
sentinel = Node(None).distance
|
||
while not cur.visited and cur.distance != sentinel: #4
|
||
cur.visited = True #4b
|
||
for cdist, child in graph[node]: #4c
|
||
ndist = distance+cdist #|
|
||
if not table[child].visited and\ #|
|
||
ndist < table[child].distance: #|
|
||
table[child].distance = ndist #|_
|
||
cur = min(table.values()) #4a
|
||
if not table[T].visited:
|
||
return None
|
||
cur = T #5
|
||
path = [T] #|
|
||
while table[cur].parent is not None: #|
|
||
path.append(table[cur].parent) #|
|
||
cur = path[-1] #|
|
||
path.reverse() #|
|
||
return path #|_
|
||
|
||
In the above, passing in either NoneNode or MaxNode would be
|
||
sufficient to use either ``None`` or ``Max`` for the node distance
|
||
'infinity'. Note the additional special case required for ``None``
|
||
being used as a sentinel in NoneNode in the __cmp__ method.
|
||
|
||
This example highlights the special case handling where ``None`` is
|
||
used as a sentinel value for maximum values "in the wild", even though
|
||
None itself compares smaller than any other object in the standard
|
||
distribution.
|
||
|
||
As an aside, it is not clear to the author that using Nodes as a
|
||
replacement for tuples has increased readability significantly, if at
|
||
all.
|
||
|
||
|
||
A ``Min`` Example
|
||
-----------------
|
||
|
||
An example of usage for ``Min`` is an algorithm that solves the
|
||
following problem [6]_:
|
||
|
||
Suppose you are given a directed graph, representing a
|
||
communication network. The vertices are the nodes in the network,
|
||
and each edge is a communication channel. Each edge ``(u, v)`` has
|
||
an associated value ``r(u, v)``, with ``0 <= r(u, v) <= 1``, which
|
||
represents the reliability of the channel from ``u`` to ``v``
|
||
(i.e., the probability that the channel from ``u`` to ``v`` will
|
||
**not** fail). Assume that the reliability probabilities of the
|
||
channels are independent. (This implies that the reliability of
|
||
any path is the product of the reliability of the edges along the
|
||
path.) Now suppose you are given two nodes in the graph, ``A``
|
||
and ``B``.
|
||
|
||
Such an algorithm is a 7 line modification to the `DijkstraSP_table`_
|
||
algorithm given above (modified lines prefixed with `*`)::
|
||
|
||
def DijkstraSP_table(graph, S, T):
|
||
table = {} #3
|
||
for node in graph.iterkeys():
|
||
#(visited, distance, node, parent)
|
||
* table[node] = (0, Min, node, None) #1
|
||
* table[S] = (0, 1, S, None) #2
|
||
* cur = max(table.values()) #4a
|
||
* while (not cur[0]) and cur[1] > Min: #4
|
||
(visited, distance, node, parent) = cur
|
||
table[node] = (1, distance, node, parent) #4b
|
||
for cdist, child in graph[node]: #4c
|
||
* ndist = distance*cdist #|
|
||
* if not table[child][0] and ndist > table[child][1]:#|
|
||
table[child] = (0, ndist, child, node) #|_
|
||
* cur = max(table.values()) #4a
|
||
if not table[T][0]:
|
||
return None
|
||
cur = T #5
|
||
path = [T] #|
|
||
while table[cur][3] is not None: #|
|
||
path.append(table[cur][3]) #|
|
||
cur = path[-1] #|
|
||
path.reverse() #|
|
||
return path #|_
|
||
|
||
Note that there is a way of translating the graph to so that it can be
|
||
passed unchanged into the original `DijkstraSP_table`_ algorithm.
|
||
There also exists a handful of easy methods for constructing Node
|
||
objects that would work with `DijkstraSP_table_node`_. Such
|
||
translations are left as an exercise to the reader.
|
||
|
||
|
||
Other Examples
|
||
--------------
|
||
|
||
Andrew P. Lentvorski, Jr. [7]_ has pointed out that various data
|
||
structures involving range searching have immediate use for ``Max``
|
||
and ``Min`` values. More specifically; Segment trees, Range trees,
|
||
k-d trees and database keys:
|
||
|
||
...The issue is that a range can be open on one side and does not
|
||
always have an initialized case.
|
||
|
||
The solutions I have seen are to either overload None as the
|
||
extremum or use an arbitrary large magnitude number. Overloading
|
||
None means that the built-ins can't really be used without special
|
||
case checks to work around the undefined (or "wrongly defined")
|
||
ordering of None. These checks tend to swamp the nice performance
|
||
of built-ins like max() and min().
|
||
|
||
Choosing a large magnitude number throws away the ability of
|
||
Python to cope with arbitrarily large integers and introduces a
|
||
potential source of overrun/underrun bugs.
|
||
|
||
Further use examples of both ``Max`` and ``Min`` are available in the
|
||
realm of graph algorithms, range searching algorithms, computational
|
||
geometry algorithms, and others.
|
||
|
||
|
||
Independent Implementations?
|
||
----------------------------
|
||
|
||
Independent implementations of the ``Min``/``Max`` concept by users
|
||
desiring such functionality are not likely to be compatible, and
|
||
certainly will produce inconsistent orderings. The following examples
|
||
seek to show how inconsistent they can be.
|
||
|
||
- Let us pretend we have created proper separate implementations of
|
||
MyMax, MyMin, YourMax and YourMin with the same code as given in
|
||
the sample implementation (with some minor renaming)::
|
||
|
||
>>> lst = [YourMin, MyMin, MyMin, YourMin, MyMax, YourMin, MyMax,
|
||
YourMax, MyMax]
|
||
>>> lst.sort()
|
||
>>> lst
|
||
[YourMin, YourMin, MyMin, MyMin, YourMin, MyMax, MyMax, YourMax,
|
||
MyMax]
|
||
|
||
Notice that while all the "Min"s are before the "Max"s, there is no
|
||
guarantee that all instances of YourMin will come before MyMin, the
|
||
reverse, or the equivalent MyMax and YourMax.
|
||
|
||
- The problem is also evident when using the heapq module::
|
||
|
||
>>> lst = [YourMin, MyMin, MyMin, YourMin, MyMax, YourMin, MyMax,
|
||
YourMax, MyMax]
|
||
>>> heapq.heapify(lst) #not needed, but it can't hurt
|
||
>>> while lst: print heapq.heappop(lst),
|
||
...
|
||
YourMin MyMin YourMin YourMin MyMin MyMax MyMax YourMax MyMax
|
||
|
||
- Furthermore, the findmin_Max code and both versions of Dijkstra
|
||
could result in incorrect output by passing in secondary versions of
|
||
``Max``.
|
||
|
||
It has been pointed out [9]_ that the reference implementation given
|
||
below would be incompatible with independent implementations of
|
||
``Max``/``Min``. The point of this PEP is for the introduction of
|
||
"The One True Implementation" of "The One True Maximum" and "The One
|
||
True Minimum". User-based implementations of ``Max`` and ``Min``
|
||
objects would thusly be discouraged, and use of "The One True
|
||
Implementation" would obviously be encouraged. Ambiguous behavior
|
||
resulting from mixing users' implementations of ``Max`` and ``Min``
|
||
with "The One True Implementation" should be easy to discover through
|
||
variable and/or source code introspection.
|
||
|
||
|
||
Reference Implementation
|
||
========================
|
||
|
||
::
|
||
|
||
class _ExtremeType(object):
|
||
|
||
def __init__(self, cmpr, rep):
|
||
object.__init__(self)
|
||
self._cmpr = cmpr
|
||
self._rep = rep
|
||
|
||
def __cmp__(self, other):
|
||
if isinstance(other, self.__class__) and\
|
||
other._cmpr == self._cmpr:
|
||
return 0
|
||
return self._cmpr
|
||
|
||
def __repr__(self):
|
||
return self._rep
|
||
|
||
Max = _ExtremeType(1, "Max")
|
||
Min = _ExtremeType(-1, "Min")
|
||
|
||
Results of Test Run::
|
||
|
||
>>> max(Max, 2**65536)
|
||
Max
|
||
>>> min(Max, 2**65536)
|
||
20035299304068464649790...
|
||
(lines removed for brevity)
|
||
...72339445587895905719156736L
|
||
>>> min(Min, -2**65536)
|
||
Min
|
||
>>> max(Min, -2**65536)
|
||
-2003529930406846464979...
|
||
(lines removed for brevity)
|
||
...072339445587895905719156736L
|
||
|
||
|
||
Open Issues
|
||
===========
|
||
|
||
As the PEP was rejected, all open issues are now closed and
|
||
inconsequential. The module will use the names ``UniversalMaximum``
|
||
and ``UniversalMinimum`` due to the fact that it would be very
|
||
difficult to mistake what each does. For those who require a shorter
|
||
name, renaming the singletons during import is suggested::
|
||
|
||
from extremes import UniversalMaximum as uMax,
|
||
UniversalMinimum as uMin
|
||
|
||
|
||
References
|
||
==========
|
||
|
||
.. [1] RE: [Python-Dev] Re: Got None. Maybe Some?, Peters, Tim
|
||
(https://mail.python.org/pipermail/python-dev/2003-December/041374.html)
|
||
|
||
.. [2] Re: [Python-Dev] Got None. Maybe Some?, van Rossum, Guido
|
||
(https://mail.python.org/pipermail/python-dev/2003-December/041352.html)
|
||
|
||
.. [3] RE: [Python-Dev] Got None. Maybe Some?, Peters, Tim
|
||
(https://mail.python.org/pipermail/python-dev/2003-December/041332.html)
|
||
|
||
.. [4] [Python-Dev] Re: PEP 326 now online, Reedy, Terry
|
||
(https://mail.python.org/pipermail/python-dev/2004-January/041685.html)
|
||
|
||
.. [5] [Python-Dev] PEP 326 now online, Chermside, Michael
|
||
(https://mail.python.org/pipermail/python-dev/2004-January/041704.html)
|
||
|
||
.. [6] Homework 6, Problem 7, Dillencourt, Michael
|
||
(link may not be valid in the future)
|
||
(http://www.ics.uci.edu/~dillenco/ics161/hw/hw6.pdf)
|
||
|
||
.. [7] RE: [Python-Dev] PEP 326 now online, Lentvorski, Andrew P., Jr.
|
||
(https://mail.python.org/pipermail/python-dev/2004-January/041727.html)
|
||
|
||
.. [8] Re: It's not really Some is it?, Ippolito, Bob
|
||
(http://www.livejournal.com/users/chouyu_31/138195.html?thread=274643#t274643)
|
||
|
||
.. [9] [Python-Dev] Re: PEP 326 now online, Niemeyer, Gustavo
|
||
(https://mail.python.org/pipermail/python-dev/2004-January/042261.html);
|
||
[Python-Dev] Re: PEP 326 now online, Carlson, Josiah
|
||
(https://mail.python.org/pipermail/python-dev/2004-January/042272.html)
|
||
|
||
.. [11] [Python-Dev] PEP 326 (quick location possibility), Carlson, Josiah
|
||
(https://mail.python.org/pipermail/python-dev/2004-January/042275.html)
|
||
|
||
.. [12] [Python-Dev] PEP 326 (quick location possibility), van Rossum, Guido
|
||
(https://mail.python.org/pipermail/python-dev/2004-January/042306.html)
|
||
|
||
.. [13] [Python-Dev] PEP 326 (quick location possibility), Carlson, Josiah
|
||
(https://mail.python.org/pipermail/python-dev/2004-January/042300.html)
|
||
|
||
.. [14] Recommended standard implementation of PEP 326, extremes.py,
|
||
Carlson, Josiah
|
||
(http://www.ics.uci.edu/~jcarlson/pep326/extremes.py)
|
||
|
||
|
||
Changes
|
||
=======
|
||
|
||
- Added this section.
|
||
|
||
- Added Motivation_ section.
|
||
|
||
- Changed markup to reStructuredText.
|
||
|
||
- Clarified Abstract_, Motivation_, `Reference Implementation`_ and
|
||
`Open Issues`_ based on the simultaneous concepts of ``Max`` and
|
||
``Min``.
|
||
|
||
- Added two implementations of Dijkstra's Shortest Path algorithm that
|
||
show where ``Max`` can be used to remove special cases.
|
||
|
||
- Added an example of use for ``Min`` to Motivation_.
|
||
|
||
- Added an example and `Other Examples`_ subheading.
|
||
|
||
- Modified `Reference Implementation`_ to instantiate both items from
|
||
a single class/type.
|
||
|
||
- Removed a large number of open issues that are not within the scope
|
||
of this PEP.
|
||
|
||
- Replaced an example from `Max Examples`_, changed an example in
|
||
`A Min Example`_.
|
||
|
||
- Added some `References`_.
|
||
|
||
- BDFL rejects [12]_ :pep:`326`
|
||
|
||
|
||
Copyright
|
||
=========
|
||
|
||
This document has been placed in the public domain.
|
||
|
||
|
||
..
|
||
Local Variables:
|
||
mode: indented-text
|
||
indent-tabs-mode: nil
|
||
sentence-end-double-space: t
|
||
fill-column: 70
|
||
End:
|