PEP: 285
Title: Adding a bool type
Version: $Revision$
Last-Modified: $Date$
Author: guido@python.org (Guido van Rossum)
Status: Draft
Type: Standards Track
Created: 8-Mar-2002
Python-Version: 2.3
Post-History: 8-Mar-2002, 30-Mar-2002


Abstract

This PEP proposes the introduction of a new built-in type, bool,
with two constants, False and True. The bool type would be a
straightforward subtype (in C) of the int type, and the values
False and True would behave like 0 and 1 in most respects (for
example, False==0 and True==1 would be true) except repr() and
str(). All built-in operations that conceptually return a Boolean
result will be changed to return False or True instead of 0 or 1;
for example, comparisons, the "not" operator, and predicates like
isinstance().
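As a rough illustration of the behavior summarized above, the
following sketch assumes an interpreter that already provides the
proposed type (any current Python 3 release does). The print() call
uses Python 3 syntax, which is an assumption of the sketch and not
part of this proposal.

```python
# Sketch only: assumes an interpreter in which bool already exists.
assert issubclass(bool, int)       # bool is a subtype of int
assert False == 0 and True == 1    # the constants compare equal to 0 and 1
assert repr(True) == "True"        # repr() and str() are where they differ
assert str(False) == "False"

# Operations that conceptually return a Boolean yield False or True.
results = [3 < 4, not 0, isinstance(3, int)]
print(results)
```

Each element of results is a bool instance instead of the int 1.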


Review

Dear reviewers:

I'm particularly interested in hearing your opinion about the
following three issues:

1) Should this PEP be accepted at all?

=> The majority of reviewers so far are in favor, ranging from +0
   (don't hate it) to +1 (yes please). Votes against are mixed:
   some are against all change, some think it's not needed, some
   think it will just add more confusion or complexity, some have
   irrational fears about code breakage based on misunderstanding
   the PEP (believing it adds reserved words, or believing it will
   require you to write "if bool(x):" where previously "if x:"
   worked; neither belief is true).

2) Should str(True) return "True" or "1"? "1" might reduce
   backwards compatibility problems, but looks strange to me.
   (repr(True) would always return "True".)

=> Most reviewers prefer str(True) == "True" (which may mean that
   they don't appreciate the specific backwards compatibility
   issue brought up by Marc-Andre Lemburg :-).

3) Should the constants be called 'True' and 'False'
   (corresponding to None) or 'true' and 'false' (as in C++, Java
   and C99)?

=> There's no clear preference either way here, so I'll break the
   tie by pronouncing False and True.

Most other details of the proposal are pretty much forced by the
backwards compatibility requirement; for example, True == 1 and
True+1 == 2 must hold, else reams of existing code would break.

Minor additional issues:

4) Should we strive to eliminate non-Boolean operations on bools
   in the future, through suitable warnings, so that for example
   True+1 would eventually (in Python 3000) be illegal?
   Personally, I think we shouldn't; 28+isleap(y) seems totally
   reasonable to me.

=> Most reviewers agree with me.

5) Should operator.truth(x) return an int or a bool? Tim Peters
   believes it should return an int because it's been documented
   as such. I think it should return a bool; most other standard
   predicates (like issubclass()) have also been documented as
   returning 0 or 1, and it's obvious that we want to change those
   to return a bool.

=> Most reviewers agree with me. My take: operator.truth() exists
   to force a Boolean context on its argument (it calls the C API
   PyObject_IsTrue()). Whether the outcome is reported as int or
   bool is secondary; if bool exists there's no reason not to use
   it.

New issues brought up during the review:

6) Should bool inherit from int?

=> My take: in an ideal world, bool might be better implemented as
   a separate integer type that knows how to perform mixed-mode
   arithmetic. However, inheriting bool from int eases the
   implementation enormously (in part since all C code that calls
   PyInt_Check() will continue to work -- this returns true for
   subclasses of int). Also, I believe in terms of
   substitutability, this is right: code that requires an int can
   be fed a bool and it will behave the same as 0 or 1. Code that
   requires a bool may not work when it is given an int; for
   example, 3 & 4 is 0, but both 3 and 4 are true when considered
   as truth values.
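The substitutability argument above can be sketched as follows. The
helper name count_matches() is hypothetical and exists only for this
illustration; the sketch assumes an interpreter that already has
bool and uses Python 3 print() syntax.

```python
def count_matches(flags):
    """Code written for ints: add up a sequence of 0/1 values."""
    return sum(flags)

# Direction 1: bools fed to code that expects ints behave like 0 and 1.
total = count_matches([True, False, True])

# Direction 2: ints fed to code that expects bools can mislead.
bitwise = 3 & 4                   # 0, although 3 and 4 are both true
logical = bool(3) and bool(4)     # True once each operand is normalized

print(total, bitwise, logical)
```

The second half shows why the reverse substitution is unsafe: the
bitwise result on raw ints disagrees with the truth values of the
operands.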

7) Should the name 'bool' be changed?

=> Some reviewers argue for boolean instead of bool, because this
   would be easier to understand (novices may have heard of
   Boolean algebra but may not make the connection with bool) or
   because they hate abbreviations. My take: Python uses
   abbreviations judiciously (like 'def', 'int', 'dict') and I
   don't think these are a burden to understanding.

   One reviewer argues to make the name 'truth'. I find this an
   unattractive name, and would actually prefer to reserve this
   term (in documentation) for the more abstract concept of truth
   values that already exists in Python. For example: "when a
   container is interpreted as a truth value, an empty container
   is considered false and a non-empty one is considered true."


Rationale

Most languages eventually grow a Boolean type; even C99 (the new
and improved C standard, not yet widely adopted) has one.

Many programmers apparently feel the need for a Boolean type; most
Python documentation contains a bit of an apology for the absence
of a Boolean type. I've seen lots of modules that defined
constants "False=0" and "True=1" (or similar) at the top and used
those. The problem with this is that everybody does it
differently. For example, should you use "FALSE", "false",
"False", "F" or even "f"? And should false be the value zero or
None, or perhaps a truth value of a different type that will print
as "true" or "false"? Adding a standard bool type to the language
resolves those issues.

Some external libraries (like databases and RPC packages) need to
be able to distinguish between Boolean and integral values, and
while it's usually possible to craft a solution, it would be
easier if the language offered a standard Boolean type. This also
applies to Jython: some Java classes have separately overloaded
methods or constructors for int and boolean arguments. The bool
type can be used to select the boolean variant. (The same is
apparently the case for some COM interfaces.)

The standard bool type can also serve as a way to force a value to
be interpreted as a Boolean, which can be used to normalize
Boolean values. When a Boolean value needs to be normalized to
one of two values, bool(x) is much clearer than "not not x" and
much more concise than

    if x:
        return 1
    else:
        return 0
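The three normalization spellings compared above can be checked
against each other in a short sketch. The function name
normalize_verbose() and the sample values are hypothetical; the
sketch assumes an interpreter that already has bool and uses
Python 3 print() syntax.

```python
def normalize_verbose(x):
    # The long-hand spelling: an explicit if/else returning 1 or 0.
    if x:
        return 1
    else:
        return 0

samples = [0, 1, 42, "", "text", [], [0], None]

# All three spellings agree on the truth value of every sample.
for x in samples:
    assert bool(x) == (not not x) == normalize_verbose(x)

normalized = [bool(x) for x in samples]
print(normalized)
```

The comparison with normalize_verbose() succeeds only because True
and False compare equal to 1 and 0.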

Here are some arguments derived from teaching Python. When
showing people comparison operators etc. in the interactive shell,
I think this is a bit ugly:

    >>> a = 13
    >>> b = 12
    >>> a > b
    1
    >>>

If this were:

    >>> a > b
    True
    >>>

it would require one millisecond less thinking each time a 0 or 1
was printed.

There's also the issue (which I've seen puzzling even experienced
Pythonistas who had been away from the language for a while) that if
you see:

    >>> cmp(a, b)
    1
    >>> cmp(a, a)
    0
    >>>

you might be tempted to believe that cmp() also returned a truth
value. If ints are not (normally) used for Boolean results, this
would stand out much more clearly as something completely
different.
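The distinction between a three-way comparison and a truth value can
be sketched as below. Current Python 3 releases no longer provide
cmp(), so the sketch defines a hypothetical stand-in named
three_way(); that name and the Python 3 print() syntax are
assumptions of this illustration.

```python
def three_way(a, b):
    # Hypothetical stand-in for cmp(): returns -1, 0 or 1 as an int.
    return (a > b) - (a < b)

a, b = 13, 12

ordering = three_way(a, b)     # an int that encodes an ordering
is_greater = a > b             # a bool that encodes a truth value

print(type(ordering).__name__, ordering)
print(type(is_greater).__name__, is_greater)
```

With a separate bool type, the two results print differently (1
versus True), so the ordering result is less likely to be mistaken
for a truth value.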


Specification

The following Python code specifies most of the properties of the
new type:

    class bool(int):

        def __new__(cls, val=0):
            # This constructor always returns an existing instance
            if val:
                return True
            else:
                return False

        def __repr__(self):
            if self:
                return "True"
            else:
                return "False"

        __str__ = __repr__

        def __and__(self, other):
            if isinstance(other, bool):
                return bool(int(self) & int(other))
            else:
                return int.__and__(self, other)

        __rand__ = __and__

        def __or__(self, other):
            if isinstance(other, bool):
                return bool(int(self) | int(other))
            else:
                return int.__or__(self, other)

        __ror__ = __or__

        def __xor__(self, other):
            if isinstance(other, bool):
                return bool(int(self) ^ int(other))
            else:
                return int.__xor__(self, other)

        __rxor__ = __xor__

    # Bootstrap truth values through sheer willpower
    False = int.__new__(bool, 0)
    True = int.__new__(bool, 1)

The values False and True will be singletons, like None; the C
implementation will not allow other instances of bool to be
created. At the C level, the existing globals Py_False and
Py_True will be appropriated to refer to False and True.
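The properties specified above can be observed from Python in an
interpreter that already implements them. The sketch below assumes
CPython 3; the subclassing check reflects how CPython happens to
prevent further bool instances and is not mandated by the text
above. The class name MyBool is hypothetical.

```python
# The constructor always hands back one of the two existing instances.
assert bool(1) is True and bool("x") is True
assert bool(0) is False and bool() is False

# &, | and ^ stay within bool when both operands are bools...
both = True & False
either = True | False
differ = True ^ True

# ...and fall back to int behavior when the other operand is an int.
mixed = True & 3

# CPython refuses to create subclasses, so no third instance can exist.
try:
    class MyBool(bool):
        pass
except TypeError:
    subclass_allowed = False
else:
    subclass_allowed = True

print(both, either, differ, mixed, subclass_allowed)
```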

All built-in operations that are defined to return a Boolean
result will be changed to return False or True instead of 0 or 1.
In particular, this affects comparisons (<, <=, ==, !=, >, >=, is,
is not, in, not in), the unary operator 'not', the built-in
functions callable(), hasattr(), isinstance() and issubclass(),
the dict method has_key(), the string and unicode methods
endswith(), isalnum(), isalpha(), isdigit(), islower(), isspace(),
istitle(), isupper(), and startswith(), the unicode methods
isdecimal() and isnumeric(), and the 'closed' attribute of file
objects.
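A sample of the operations listed above can be checked for their
result type as follows. This is a sketch under the assumption of a
current Python 3 interpreter, so it leaves out has_key() and the
separate unicode type, which no longer exist there; the dictionary
name checks is hypothetical.

```python
# Each value is the result of an operation that conceptually
# returns a Boolean.
checks = {
    "comparison": 1 < 2,
    "identity": None is None,
    "membership": 3 in [1, 2, 3],
    "not": not [],
    "callable": callable(len),
    "hasattr": hasattr("", "upper"),
    "isinstance": isinstance(1, int),
    "issubclass": issubclass(bool, int),
    "startswith": "spam".startswith("sp"),
    "isdigit": "123".isdigit(),
}

# Collect the names of any operations that did not return a bool.
non_bool = [name for name, value in checks.items()
            if type(value) is not bool]
print(non_bool)
```

An empty list means every sampled operation returned False or True
and none returned a plain int.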

Because bool inherits from int, True+1 is valid and equals 2, and
so on. This is important for backwards compatibility: because
comparisons and so on currently return integer values, there's no
way of telling what uses existing applications make of these
values.
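Two common uses of comparison results as integers are sketched
below, including the 28+isleap(y) idiom mentioned in the review.
The sketch assumes that calendar.isleap() from the standard library
plays the role of isleap(); the function name days_in_february() and
the sample data are hypothetical.

```python
import calendar

def days_in_february(year):
    # calendar.isleap() returns a truth value; adding it to 28
    # treats True as 1 and False as 0.
    return 28 + calendar.isleap(year)

# Counting how many items satisfy a condition by summing truth values.
positives = sum(n > 0 for n in [3, -1, 4, 0, 5])

print(days_in_february(2000), days_in_february(1900), positives)
```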

It is expected that over time, the standard library will be
updated to use False and True when appropriate (but not to require
a bool argument type where previously an int was allowed). This
change should not pose additional problems and is not specified in
detail by this PEP.


Clarification

This PEP does *not* change the fact that almost all object types
can be used as truth values. For example, when used in an if
statement, an empty list is false and a non-empty one is true;
this does not change and there is no plan to ever change this.

The only thing that changes is the preferred values to represent
truth values when returned or assigned explicitly. Previously,
these preferred truth values were 0 and 1; the PEP changes the
preferred values to False and True, and changes built-in
operations to return these preferred values.
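The two points above, that any object may still be tested directly
and that only the explicitly returned values change, can be sketched
as follows. The function name describe_container() is hypothetical,
and the sketch assumes an interpreter that already has bool.

```python
def describe_container(container):
    # The container itself is tested; no bool() call or comparison
    # against True is needed.
    if container:
        return "non-empty"
    return "empty"

# An explicitly computed truth value is what changes: this is now
# True where it used to be 1.
has_items = len([1, 2]) > 0

print(describe_container([]), describe_container([0]), has_items)
```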


Compatibility

Because of backwards compatibility, the bool type lacks many
properties that some would like to see. For example, arithmetic
operations with one or two bool arguments are allowed, treating
False as 0 and True as 1. Also, a bool may be used as a sequence
index.

I don't see this as a problem, and I don't want to evolve the
language in this direction either; I don't believe that a stricter
interpretation of "Booleanness" makes the language any clearer.
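The use of a bool as a sequence index mentioned above might look
like the following sketch. The names labels and sign_label() are
hypothetical, and the sketch assumes an interpreter that already has
bool.

```python
labels = ("not positive", "positive")

def sign_label(n):
    # The comparison yields False or True, which index the tuple
    # exactly as 0 and 1 would.
    return labels[n > 0]

print(sign_label(5), "/", sign_label(-2))
```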

Another consequence of the compatibility requirement is that the
expression "True and 6" has the value 6, and similarly the
expression "False or None" has the value None. The "and" and "or"
operators are usefully defined to return the first argument that
determines the outcome, and this won't change; in particular, they
don't force the outcome to be a bool. Of course, if both
arguments are bools, the outcome is always a bool. It can also
easily be coerced into being a bool by writing for example
"bool(x and y)".
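The behavior of "and" and "or" described above can be confirmed
with a few assignments. This is a sketch that assumes an interpreter
which already has bool; the variable names are hypothetical.

```python
first = True and 6          # "and" returns its second operand here
second = False or None      # "or" returns its second operand here
both_bools = True and False # with two bools the outcome is a bool
coerced = bool([] or 0)     # explicit coercion of a non-bool outcome

print(first, second, both_bools, coerced)
```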


Issues

Because the repr() or str() of a bool value is different from an
int value, some code (for example doctest-based unit tests, and
possibly database code that relies on things like "%s" % truth)
may fail. How much of a backwards compatibility problem this will
be, I don't know. If this turns out to be a real problem, we
could change the rules so that str() of a bool returns "0" or
"1", while repr() of a bool still returns "False" or "True".
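The formatting difference discussed above, and two ways that code
depending on the old "0"/"1" text could ask for it explicitly, are
sketched below. The sketch assumes an interpreter in which str() of
a bool returns "True" or "False", as current Python 3 releases do;
the variable names are hypothetical.

```python
truth = 3 > 2

as_string = "%s" % truth          # uses str(), so the text is "True"
as_digit = "%d" % truth           # integer formatting still gives "1"
via_int = str(int(truth))         # explicit conversion to int first

print(as_string, as_digit, via_int)
```

Code that builds output with "%s" is the case at risk; the other two
spellings do not depend on how str() of a bool is defined.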

Other languages (C99, C++, Java) name the constants "false" and
"true", in all lowercase. In Python, I prefer to stick with the
example set by the existing built-in constants, which all use
CapitalizedWords: None, Ellipsis, NotImplemented (as well as all
built-in exceptions). Python's built-in module uses all lowercase
for functions and types only. But I'm willing to consider the
lowercase alternatives if enough people think it looks better.

It has been suggested that, in order to satisfy user expectations,
for every x that is considered true in a Boolean context, the
expression x == True should be true, and likewise if x is
considered false, x == False should be true. This is of course
impossible; it would mean that for example 6 == True and 7 ==
True, from which one could infer 6 == 7. Similarly, [] == False
== None would be true, and one could infer [] == None, which is
not the case. I'm not sure where this suggestion came from; it
was made several times during the first review period. For truth
testing, one should use "if", as in "if x: print 'Yes'", not
comparison to a truth value; "if x == True: print 'Yes'" is not
only wrong, it is also strangely redundant.
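The difference between testing a value and comparing it with True
can be sketched as follows. The sketch assumes an interpreter that
already has bool; the variable names are hypothetical.

```python
x = 6

tested = bool(x)             # what "if x:" evaluates
compared = (x == True)       # compares 6 with 1, so this is False

empty_tested = bool([])      # an empty list is false when tested...
empty_compared = ([] == False)   # ...but it is not equal to False

print(tested, compared, empty_tested, empty_compared)
```

Only the value 1 (and values equal to it) compares equal to True,
which is why the comparison form gives the wrong answer for 6.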


Implementation

An experimental, but fairly complete implementation in C has been
uploaded to the SourceForge patch manager:

    http://python.org/sf/528022


Copyright

This document has been placed in the public domain.



Local Variables:
mode: indented-text
indent-tabs-mode: nil
fill-column: 70
End: