PEP: 285
Title: Adding a bool type
Version: $Revision$
Last-Modified: $Date$
Author: guido@python.org (Guido van Rossum)
Status: Draft
Type: Standards Track
Created: 8-Mar-2002
Python-Version: 2.3
Post-History: 8-Mar-2002, 30-Mar-2002


Abstract

    This PEP proposes the introduction of a new built-in type, bool,
    with two constants, False and True.  The bool type would be a
    straightforward subtype (in C) of the int type, and the values
    False and True would behave like 0 and 1 in most respects (for
    example, False==0 and True==1 would be true) except repr() and
    str().  All built-in operations that conceptually return a Boolean
    result will be changed to return False or True instead of 0 or 1;
    for example, comparisons, the "not" operator, and predicates like
    isinstance().


Review

    Dear reviewers:

    I'm particularly interested in hearing your opinion about the
    following three issues:

    1) Should this PEP be accepted at all?

    => The majority of reviewers so far are in favor, ranging from +0
       (don't hate it) to +1 (yes please).  Votes against are mixed:
       some are against all change, some think it's not needed, some
       think it will just add more confusion or complexity, some have
       irrational fears about code breakage based on misunderstanding
       the PEP (believing it adds reserved words, or believing it will
       require you to write "if bool(x):" where previously "if x:"
       worked; neither belief is true).

    2) Should str(True) return "True" or "1"?  "1" might reduce
       backwards compatibility problems, but looks strange to me.
       (repr(True) would always return "True".)

    => Most reviewers prefer str(True) == "True" (which may mean that
       they don't appreciate the specific 

    3) Should the constants be called 'True' and 'False'
       (corresponding to None) or 'true' and 'false' (as in C++, Java
       and C99)?

    => There's no clear preference either way here, so I'll break the
       tie by pronouncing False and True.

    Most other details of the proposal are pretty much forced by the
    backwards compatibility requirement; for example, True == 1 and
    True+1 == 2 must hold, else reams of existing code would break.

    Minor additional issues:

    4) Should we strive to eliminate non-Boolean operations on bools
       in the future, through suitable warnings, so that for example
       True+1 would eventually (in Python 3000) be illegal?
       Personally, I think we shouldn't; 28+isleap(y) seems totally
       reasonable to me.

    => Most reviewers agree with me.

    5) Should operator.truth(x) return an int or a bool?  Tim Peters
       believes it should return an int because it's been documented
       as such.  I think it should return a bool; most other standard
       predicates (like issubclass()) have also been documented as
       returning 0 or 1, and it's obvious that we want to change those
       to return a bool.

    => Most reviewers agree with me.  My take: operator.truth() exists
       to force a Boolean context on its argument (it calls the C API
       PyObject_IsTrue()).  Whether the outcome is reported as int or
       bool is secondary; if bool exists there's no reason not to use
       it.

    New issues brought up during the review:

    6) Should bool inherit from int?

    => My take: in an ideal world, bool might be better implemented as
       a separate integer type that knows how to perform mixed-mode
       arithmetic.  However, inheriting bool from int eases the
       implementation enormously (in part since all C code that calls
       PyInt_Check() will continue to work -- this returns true for
       subclasses of int).  Also, I believe in terms of
       substitutability, this is right: code that requires an int can
       be fed a bool and it will behave the same as 0 or 1.  Code that
       requires a bool may not work when it is given an int; for
       example, 3 & 4 is 0, but both 3 and 4 are true when considered
       as truth values.

    7) Should the name 'bool' be changed?

    => Some reviewers argue for boolean instead of bool, because this
       would be easier to understand (novices may have heard of
       Boolean algebra but may not make the connection with bool) or
       because they hate abbreviations.  My take: Python uses
       abbreviations judiciously (like 'def', 'int', 'dict') and I
       don't think these are a burden to understanding.

       One reviewer argues to make the name 'truth'.  I find this an
       unattractive name, and would actually prefer to reserve this
       term (in documentation) for the more abstract concept of truth
       values that already exists in Python.  For example: "when a
       container is interpreted as a truth value, an empty container
       is considered false and a non-empty one is considered true."


Rationale

    Most languages eventually grow a Boolean type; even C99 (the new
    and improved C standard, not yet widely adopted) has one.

    Many programmers apparently feel the need for a Boolean type; most
    Python documentation contains a bit of an apology for the absence
    of a Boolean type.  I've seen lots of modules that defined
    constants "False=0" and "True=1" (or similar) at the top and used
    those.  The problem with this is that everybody does it
    differently.  For example, should you use "FALSE", "false",
    "False", "F" or even "f"?  And should false be the value zero or
    None, or perhaps a truth value of a different type that will print
    as "true" or "false"?  Adding a standard bool type to the language
    resolves those issues.

    Some external libraries (like databases and RPC packages) need to
    be able to distinguish between Boolean and integral values, and
    while it's usually possible to craft a solution, it would be
    easier if the language offered a standard Boolean type.  This also
    applies to Jython: some Java classes have separately overloaded
    methods or constructors for int and boolean arguments.  The bool
    type can be used to select the boolean variant.
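
    A sketch of how a library might use the distinction.  The encode()
    function and its "b:" and "i:" tags are hypothetical and not taken
    from any real database or RPC package; Python 3 syntax is assumed.

```python
def encode(value):
    """Hypothetical wire encoder that tags Booleans and integers differently."""
    # bool must be tested first, because every bool is also an int.
    if isinstance(value, bool):
        return "b:" + ("1" if value else "0")
    if isinstance(value, int):
        return "i:" + str(int(value))
    raise TypeError("unsupported type: %r" % type(value))

encoded = [encode(True), encode(1), encode(False), encode(0)]
```

    Without a separate bool type, True and 1 would be
    indistinguishable to such an encoder.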

    The standard bool type can also serve as a way to force a value to
    be interpreted as a Boolean, which can be used to normalize
    Boolean values.  When a Boolean value needs to be normalized to
    one of two values, bool(x) is much clearer than "not not x" and
    much more concise than

        if x:
            return 1
        else:
            return 0
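
    The three spellings can be compared side by side.  This is a
    sketch assuming a current Python 3 interpreter; normalize_verbose()
    is an illustrative name for the if/else form shown above.

```python
def normalize_verbose(x):
    # The long-hand form: map any object to 1 or 0.
    if x:
        return 1
    else:
        return 0

samples = [0, 1, 42, "", "text", [], [0], None]

# All three spellings agree on every sample (True == 1 and False == 0).
for x in samples:
    assert bool(x) == (not not x) == normalize_verbose(x)

normalized = [bool(x) for x in samples]
```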

    Here are some arguments derived from teaching Python.  When
    showing people comparison operators etc. in the interactive shell,
    I think this is a bit ugly:

        >>> a = 13
        >>> b = 12
        >>> a > b
        1
        >>>

    If this was:

        >>> a > b
        True
        >>>

    it would require one millisecond less thinking each time a 0 or 1
    was printed.

    There's also the issue (which I've seen puzzling even experienced
    Pythonistas who had been away from the language for a while) that if
    you see:

        >>> cmp(a, b)
        1
        >>> cmp(a, a)
        0
        >>> 

    you might be tempted to believe that cmp() also returned a truth
    value.  If ints are not (normally) used for Boolean results, this
    would stand out much more clearly as something completely
    different.


Specification

    The following Python code specifies most of the properties of the
    new type:

        class bool(int):

            def __new__(cls, val=0):
                # This constructor always returns an existing instance
                if val:
                    return True
                else:
                    return False

            def __repr__(self):
                if self:
                    return "True"
                else:
                    return "False"

            __str__ = __repr__

            def __and__(self, other):
                if isinstance(other, bool):
                    return bool(int(self) & int(other))
                else:
                    return int.__and__(self, other)

            __rand__ = __and__

            def __or__(self, other):
                if isinstance(other, bool):
                    return bool(int(self) | int(other))
                else:
                    return int.__or__(self, other)

            __ror__ = __or__

            def __xor__(self, other):
                if isinstance(other, bool):
                    return bool(int(self) ^ int(other))
                else:
                    return int.__xor__(self, other)

            __rxor__ = __xor__

        # Bootstrap truth values through sheer willpower
        False = int.__new__(bool, 0)
        True = int.__new__(bool, 1)
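
    The effect of the __and__, __or__ and __xor__ definitions above
    can be checked as follows.  This is a sketch assuming a current
    Python 3 interpreter, whose built-in bool behaves this way; the
    names mixed_and and mixed_or are illustrative.

```python
# Between two bools, the bitwise operators stay within bool.
assert (True & False) is False
assert (True | False) is True
assert (True ^ True) is False

# With a plain int operand, the int behavior applies and the result
# is an ordinary int.
mixed_and = True & 3
mixed_or = False | 6
```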

    The values False and True will be singletons, like None; the C
    implementation will not allow other instances of bool to be
    created.  At the C level, the existing globals Py_False and
    Py_True will be appropriated to refer to False and True.
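
    The singleton property can be observed with identity tests.  This
    is a sketch assuming a current CPython 3 interpreter.  The refusal
    to subclass bool shown at the end is CPython behavior that is not
    spelled out in the text above, and MyBool is a hypothetical name.

```python
# Every call to bool() hands back one of the two existing objects.
assert bool(1) is True
assert bool("non-empty") is True
assert bool(0) is False
assert bool() is False

# CPython additionally refuses to let bool be subclassed, which
# rules out creating further instances by that route.
try:
    class MyBool(bool):   # hypothetical subclass, expected to fail
        pass
    subclass_allowed = True
except TypeError:
    subclass_allowed = False
```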

    All built-in operations that are defined to return a Boolean
    result will be changed to return False or True instead of 0 or 1.
    In particular, this affects comparisons (<, <=, ==, !=, >, >=, is,
    is not, in, not in), the unary operator 'not', the built-in
    functions callable(), hasattr(), isinstance() and issubclass(),
    the dict method has_key(), the string and unicode methods
    endswith(), isalnum(), isalpha(), isdigit(), islower(), isspace(),
    istitle(), isupper(), and startswith(), the unicode methods
    isdecimal() and isnumeric(), and the 'closed' attribute of file
    objects.
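
    A quick check of several of the operations listed above, assuming
    a current Python 3 interpreter.  The dict method has_key() is left
    out because it no longer exists there; the dictionary name results
    is illustrative.

```python
results = {
    "3 < 4": 3 < 4,
    "3 == 4": 3 == 4,
    "None is None": None is None,
    "2 in [1, 2]": 2 in [1, 2],
    "not 0": not 0,
    "callable(len)": callable(len),
    "hasattr('', 'upper')": hasattr("", "upper"),
    "isinstance(1, int)": isinstance(1, int),
    "issubclass(bool, int)": issubclass(bool, int),
    "'abc'.startswith('a')": "abc".startswith("a"),
    "'abc'.isdigit()": "abc".isdigit(),
}

# Every one of these yields a bool rather than a plain int.
all_bools = all(type(value) is bool for value in results.values())
```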

    Because bool inherits from int, True+1 is valid and equals 2, and
    so on.  This is important for backwards compatibility: because
    comparisons and so on currently return integer values, there's no
    way of telling what uses existing applications make of these
    values.
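
    A sketch of existing-style code that relies on this, assuming a
    current Python 3 interpreter.  calendar.isleap() is a standard
    library function; days_in_february() and positives are
    illustrative names.

```python
import calendar

def days_in_february(year):
    # calendar.isleap() returns a bool; adding it to 28 treats True as 1.
    return 28 + calendar.isleap(year)

# Summing comparison results counts how many of them were true.
positives = sum(n > 0 for n in [-2, 5, 0, 7])
```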

    It is expected that over time, the standard library will be
    updated to use False and True when appropriate (but not to require
    a bool argument type where previously an int was allowed).  This
    change should not pose additional problems and is not specified in
    detail by this PEP.


Clarification

    This PEP does *not* change the fact that almost all object types
    can be used as truth values.  For example, when used in an if
    statement, an empty list is false and a non-empty one is true;
    this does not change and there is no plan to ever change this.

    The only thing that changes is the preferred values to represent
    truth values when returned or assigned explicitly.  Previously,
    these preferred truth values were 0 and 1; the PEP changes the
    preferred values to False and True, and changes built-in
    operations to return these preferred values.
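
    A sketch of the unchanged truth-testing rules, assuming a current
    Python 3 interpreter; describe() and outcomes are illustrative
    names.

```python
def describe(obj):
    # Any object can be tested directly; no conversion to bool is needed.
    if obj:
        return "true"
    else:
        return "false"

outcomes = [describe(x) for x in ([], [0], "", "a", 0, 0.0, None, {}, {"k": 1})]
```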


Compatibility

    Because of backwards compatibility, the bool type lacks many
    properties that some would like to see.  For example, arithmetic
    operations with one or two bool arguments are allowed, treating
    False as 0 and True as 1.  Also, a bool may be used as a sequence
    index.

    I don't see this as a problem, and I don't want to evolve the
    language in this direction either; I don't believe that a stricter
    interpretation of "Booleanness" makes the language any clearer.
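
    The use of a bool as a sequence index mentioned above looks like
    this in practice.  This is a sketch assuming a current Python 3
    interpreter; labels and the other names are illustrative.

```python
labels = ["no", "yes"]

# False selects index 0 and True selects index 1.
picked_true = labels[True]
picked_false = labels[False]
picked_by_test = labels[3 > 2]
```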

    Another consequence of the compatibility requirement is that the
    expression "True and 6" has the value 6, and similarly the
    expression "False or None" has the value None.  The "and" and "or"
    operators are usefully defined to return the first argument that
    determines the outcome, and this won't change; in particular, they
    don't force the outcome to be a bool.  Of course, if both
    arguments are bools, the outcome is always a bool.  It can also
    easily be coerced into being a bool by writing for example
    "bool(x and y)".

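    The behavior of "and" and "or" described above can be sketched as
    follows, assuming a current Python 3 interpreter; the variable
    names are illustrative.

```python
first = True and 6          # 6: the operand that decided the outcome
second = False or None      # None
third = True and False      # both operands are bools, so the result is a bool

x, y = [], 6                # illustrative operands
uncoerced = x and y         # the empty list itself, not a bool
coerced = bool(x and y)     # bool() turns the empty list into False
```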

Issues

    Because the repr() or str() of a bool value is different from an
    int value, some code (for example doctest-based unit tests, and
    possibly database code that relies on things like "%s" % truth)
    may fail.  How much of a backwards compatibility problem this will
    be, I don't know.  If this turns out to be a real problem, we
    could change the rules so that str() of a bool returns "0" or
    "1", while repr() of a bool still returns "False" or "True".

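    A sketch of the formatting difference at stake, assuming a current
    Python 3 interpreter in which str(True) returns "True"; flag and
    the other names are illustrative.

```python
flag = True

as_string = "%s" % flag      # goes through str(): the name of the constant
as_number = "%d" % flag      # goes through the integer value
explicit = str(int(flag))    # conversion that does not depend on str() of a bool
```

    Code that needs "0" or "1" in its output can use "%d" or an
    explicit int() conversion regardless of which rule is chosen.
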
    Other languages (C99, C++, Java) name the constants "false" and
    "true", in all lowercase.  In Python, I prefer to stick with the
    example set by the existing built-in constants, which all use
    CapitalizedWords: None, Ellipsis, NotImplemented (as well as all
    built-in exceptions).  Python's built-in module uses all lowercase
    for functions and types only.  But I'm willing to consider the
    lowercase alternatives if enough people think it looks better.

    It has been suggested that, in order to satisfy user expectations,
    for every x that is considered true in a Boolean context, the
    expression x == True should be true, and likewise if x is
    considered false, x == False should be true.  This is of course
    impossible; it would mean that for example 6 == True and 7 ==
    True, from which one could infer 6 == 7.  Similarly, [] == False
    == None would be true, and one could infer [] == None, which is
    not the case.  I'm not sure where this suggestion came from; it
    was made several times during the first review period.  For truth
    testing, one should use "if", as in "if x: print 'Yes'", not
    comparison to a truth value; "if x == True: print 'Yes'" is not
    only wrong, it is also strangely redundant.
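
    The difference between truth testing and comparison to True or
    False can be tabulated.  This is a sketch assuming a current
    Python 3 interpreter; values and table are illustrative names.

```python
values = [6, 7, [], None, 1, 0]

# For each value: (truth value, equal to True?, equal to False?)
table = [(bool(v), v == True, v == False) for v in values]
```

    Only 1 and 0 compare equal to True and False; every other value
    is true or false in a Boolean context without being equal to
    either constant.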


Implementation

    An experimental, but fairly complete implementation in C has been
    uploaded to the SourceForge patch manager:

    http://python.org/sf/528022


Copyright

    This document has been placed in the public domain.


Local Variables:
mode: indented-text
indent-tabs-mode: nil
fill-column: 70
End: