442 lines
17 KiB
Plaintext
442 lines
17 KiB
Plaintext
PEP: 285
|
||
Title: Adding a bool type
|
||
Version: $Revision$
|
||
Last-Modified: $Date$
|
||
Author: guido@python.org (Guido van Rossum)
|
||
Status: Final
|
||
Type: Standards Track
|
||
Created: 8-Mar-2002
|
||
Python-Version: 2.3
|
||
Post-History: 8-Mar-2002, 30-Mar-2002, 3-Apr-2002
|
||
|
||
|
||
Abstract
|
||
|
||
This PEP proposes the introduction of a new built-in type, bool,
|
||
with two constants, False and True. The bool type would be a
|
||
straightforward subtype (in C) of the int type, and the values
|
||
False and True would behave like 0 and 1 in most respects (for
|
||
example, False==0 and True==1 would be true) except repr() and
|
||
str(). All built-in operations that conceptually return a Boolean
|
||
result will be changed to return False or True instead of 0 or 1;
|
||
for example, comparisons, the "not" operator, and predicates like
|
||
isinstance().
|
||
|
||
|
||
Review
|
||
|
||
I've collected enough feedback to last me a lifetime, so I declare
|
||
the review period officially OVER. I had Chinese food today; my
|
||
fortune cookie said "Strong and bitter words indicate a weak
|
||
cause." It reminded me of some of the posts against this
|
||
PEP... :-)
|
||
|
||
Anyway, here are my BDFL pronouncements. (Executive summary: I'm
|
||
not changing a thing; all variants are rejected.)
|
||
|
||
1) Should this PEP be accepted?
|
||
|
||
=> Yes.
|
||
|
||
There have been many arguments against the PEP. Many of them
|
||
were based on misunderstandings. I've tried to clarify some of
|
||
the most common misunderstandings below in the main text of the
|
||
PEP. The only issue that weighs at all for me is the tendency
|
||
of newbies to write "if x == True" where "if x" would suffice.
|
||
More about that below too. I think this is not a sufficient
|
||
reason to reject the PEP.
|
||
|
||
2) Should str(True) return "True" or "1"? "1" might reduce
|
||
backwards compatibility problems, but looks strange.
|
||
(repr(True) would always return "True".)
|
||
|
||
=> "True".
|
||
|
||
Almost all reviewers agree with this.
|
||
|
||
3) Should the constants be called 'True' and 'False' (similar to
|
||
None) or 'true' and 'false' (as in C++, Java and C99)?
|
||
|
||
=> True and False.
|
||
|
||
Most reviewers agree that consistency within Python is more
|
||
important than consistency with other languages.
|
||
|
||
4) Should we strive to eliminate non-Boolean operations on bools
|
||
in the future, through suitable warnings, so that for example
|
||
True+1 would eventually (in Python 3000) be illegal?
|
||
|
||
=> No.
|
||
|
||
There's a small but vocal minority that would prefer to see
|
||
"textbook" bools that don't support arithmetic operations at
|
||
all, but most reviewers agree with me that bools should always
|
||
allow arithmetic operations.
|
||
|
||
5) Should operator.truth(x) return an int or a bool?
|
||
|
||
=> bool.
|
||
|
||
Tim Peters believes it should return an int, but almost all
|
||
other reviewers agree that it should return a bool. My
|
||
rationale: operator.truth() exists to force a Boolean context
|
||
on its argument (it calls the C API PyObject_IsTrue()).
|
||
Whether the outcome is reported as int or bool is secondary; if
|
||
bool exists there's no reason not to use it. (Under the PEP,
|
||
operator.truth() now becomes an alias for bool(); that's fine.)
|
||
|
||
6) Should bool inherit from int?
|
||
|
||
=> Yes.
|
||
|
||
In an ideal world, bool might be better implemented as a
|
||
separate integer type that knows how to perform mixed-mode
|
||
arithmetic. However, inheriting bool from int eases the
|
||
implementation enormously (in part since all C code that calls
|
||
PyInt_Check() will continue to work -- this returns true for
|
||
subclasses of int). Also, I believe this is right in terms of
|
||
substitutability: code that requires an int can be fed a bool
|
||
and it will behave the same as 0 or 1. Code that requires a
|
||
bool may not work when it is given an int; for example, 3 & 4
|
||
is 0, but both 3 and 4 are true when considered as truth
|
||
values.
|
||
|
||
7) Should the name 'bool' be changed?
|
||
|
||
=> No.
|
||
|
||
Some reviewers have argued for boolean instead of bool, because
|
||
this would be easier to understand (novices may have heard of
|
||
Boolean algebra but may not make the connection with bool) or
|
||
because they hate abbreviations. My take: Python uses
|
||
abbreviations judiciously (like 'def', 'int', 'dict') and I
|
||
don't think these are a burden to understanding. To a newbie,
|
||
it doesn't matter whether it's called a waffle or a bool; it's
|
||
a new word, and they learn quickly what it means.
|
||
|
||
One reviewer has argued to make the name 'truth'. I find this
|
||
an unattractive name, and would actually prefer to reserve this
|
||
term (in documentation) for the more abstract concept of truth
|
||
values that already exists in Python. For example: "when a
|
||
container is interpreted as a truth value, an empty container
|
||
is considered false and a non-empty one is considered true."
|
||
|
||
8) Should we strive to require that Boolean operations (like "if",
|
||
"and", "not") have a bool as an argument in the future, so that
|
||
for example "if []:" would become illegal and would have to be
|
||
written as "if bool([]):" ???
|
||
|
||
=> No!!!
|
||
|
||
Some people believe that this is how a language with a textbook
|
||
Boolean type should behave. Because it was brought up, others
|
||
have worried that I might agree with this position. Let me
|
||
make my position on this quite clear. This is not part of the
|
||
PEP's motivation and I don't intend to make this change. (See
|
||
also the section "Clarification" below.)
|
||
|
||
|
||
Rationale
|
||
|
||
Most languages eventually grow a Boolean type; even C99 (the new
|
||
and improved C standard, not yet widely adopted) has one.
|
||
|
||
Many programmers apparently feel the need for a Boolean type; most
|
||
Python documentation contains a bit of an apology for the absence
|
||
of a Boolean type. I've seen lots of modules that defined
|
||
constants "False=0" and "True=1" (or similar) at the top and used
|
||
those. The problem with this is that everybody does it
|
||
differently. For example, should you use "FALSE", "false",
|
||
"False", "F" or even "f"? And should false be the value zero or
|
||
None, or perhaps a truth value of a different type that will print
|
||
as "true" or "false"? Adding a standard bool type to the language
|
||
resolves those issues.
|
||
|
||
Some external libraries (like databases and RPC packages) need to
|
||
be able to distinguish between Boolean and integral values, and
|
||
while it's usually possible to craft a solution, it would be
|
||
easier if the language offered a standard Boolean type. This also
|
||
applies to Jython: some Java classes have separately overloaded
|
||
methods or constructors for int and boolean arguments. The bool
|
||
type can be used to select the boolean variant. (The same is
|
||
apparently the case for some COM interfaces.)
|
||
|
||
The standard bool type can also serve as a way to force a value to
|
||
be interpreted as a Boolean, which can be used to normalize
|
||
Boolean values. When a Boolean value needs to be normalized to
|
||
one of two values, bool(x) is much clearer than "not not x" and
|
||
much more concise than
|
||
|
||
if x:
|
||
return 1
|
||
else:
|
||
return 0
|
||
|
||
Here are some arguments derived from teaching Python. When
|
||
showing people comparison operators etc. in the interactive shell,
|
||
I think this is a bit ugly:
|
||
|
||
>>> a = 13
|
||
>>> b = 12
|
||
>>> a > b
|
||
1
|
||
>>>
|
||
|
||
If this was:
|
||
|
||
>>> a > b
|
||
True
|
||
>>>
|
||
|
||
it would require a millisecond less thinking each time a 0 or 1
|
||
was printed.
|
||
|
||
There's also the issue (which I've seen baffling even experienced
|
||
Pythonistas who had been away from the language for a while) that
|
||
if you see:
|
||
|
||
>>> cmp(a, b)
|
||
1
|
||
>>> cmp(a, a)
|
||
0
|
||
>>>
|
||
|
||
you might be tempted to believe that cmp() also returned a truth
|
||
value, whereas in reality it can return three different values
|
||
(-1, 0, 1). If ints were not (normally) used to represent
|
||
Booleans results, this would stand out much more clearly as
|
||
something completely different.
|
||
|
||
|
||
Specification
|
||
|
||
The following Python code specifies most of the properties of the
|
||
new type:
|
||
|
||
class bool(int):
|
||
|
||
def __new__(cls, val=0):
|
||
# This constructor always returns an existing instance
|
||
if val:
|
||
return True
|
||
else:
|
||
return False
|
||
|
||
def __repr__(self):
|
||
if self:
|
||
return "True"
|
||
else:
|
||
return "False"
|
||
|
||
__str__ = __repr__
|
||
|
||
def __and__(self, other):
|
||
if isinstance(other, bool):
|
||
return bool(int(self) & int(other))
|
||
else:
|
||
return int.__and__(self, other)
|
||
|
||
__rand__ = __and__
|
||
|
||
def __or__(self, other):
|
||
if isinstance(other, bool):
|
||
return bool(int(self) | int(other))
|
||
else:
|
||
return int.__or__(self, other)
|
||
|
||
__ror__ = __or__
|
||
|
||
def __xor__(self, other):
|
||
if isinstance(other, bool):
|
||
return bool(int(self) ^ int(other))
|
||
else:
|
||
return int.__xor__(self, other)
|
||
|
||
__rxor__ = __xor__
|
||
|
||
# Bootstrap truth values through sheer willpower
|
||
False = int.__new__(bool, 0)
|
||
True = int.__new__(bool, 1)
|
||
|
||
The values False and True will be singletons, like None. Because
|
||
the type has two values, perhaps these should be called
|
||
"doubletons"? The real implementation will not allow other
|
||
instances of bool to be created.
|
||
|
||
True and False will properly round-trip through pickling and
|
||
marshalling; for example pickle.loads(pickle.dumps(True)) will
|
||
return True, and so will marshal.loads(marshal.dumps(True)).
|
||
|
||
All built-in operations that are defined to return a Boolean
|
||
result will be changed to return False or True instead of 0 or 1.
|
||
In particular, this affects comparisons (<, <=, ==, !=, >, >=, is,
|
||
is not, in, not in), the unary operator 'not', the built-in
|
||
functions callable(), hasattr(), isinstance() and issubclass(),
|
||
the dict method has_key(), the string and unicode methods
|
||
endswith(), isalnum(), isalpha(), isdigit(), islower(), isspace(),
|
||
istitle(), isupper(), and startswith(), the unicode methods
|
||
isdecimal() and isnumeric(), and the 'closed' attribute of file
|
||
objects. The predicates in the operator module are also changed
|
||
to return a bool, including operator.truth().
|
||
|
||
Because bool inherits from int, True+1 is valid and equals 2, and
|
||
so on. This is important for backwards compatibility: because
|
||
comparisons and so on currently return integer values, there's no
|
||
way of telling what uses existing applications make of these
|
||
values.
|
||
|
||
It is expected that over time, the standard library will be
|
||
updated to use False and True when appropriate (but not to require
|
||
a bool argument type where previous an int was allowed). This
|
||
change should not pose additional problems and is not specified in
|
||
detail by this PEP.
|
||
|
||
|
||
C API
|
||
|
||
The header file "boolobject.h" defines the C API for the bool
|
||
type. It is included by "Python.h" so there is no need to include
|
||
it directly.
|
||
|
||
The existing names Py_False and Py_True reference the unique bool
|
||
objects False and True (previously these referenced static int
|
||
objects with values 0 and 1, which were not unique amongst int
|
||
values).
|
||
|
||
A new API, PyObject *PyBool_FromLong(long), takes a C long int
|
||
argument and returns a new reference to either Py_False (when the
|
||
argument is zero) or Py_True (when it is nonzero).
|
||
|
||
To check whether an object is a bool, the macro PyBool_Check() can
|
||
be used.
|
||
|
||
The type of bool instances is PyBoolObject *.
|
||
|
||
The bool type object is available as PyBool_Type.
|
||
|
||
|
||
Clarification
|
||
|
||
This PEP does *not* change the fact that almost all object types
|
||
can be used as truth values. For example, when used in an if
|
||
statement, an empty list is false and a non-empty one is true;
|
||
this does not change and there is no plan to ever change this.
|
||
|
||
The only thing that changes is the preferred values to represent
|
||
truth values when returned or assigned explicitly. Previously,
|
||
these preferred truth values were 0 and 1; the PEP changes the
|
||
preferred values to False and True, and changes built-in
|
||
operations to return these preferred values.
|
||
|
||
|
||
Compatibility
|
||
|
||
Because of backwards compatibility, the bool type lacks many
|
||
properties that some would like to see. For example, arithmetic
|
||
operations with one or two bool arguments is allowed, treating
|
||
False as 0 and True as 1. Also, a bool may be used as a sequence
|
||
index.
|
||
|
||
I don't see this as a problem, and I don't want evolve the
|
||
language in this direction either. I don't believe that a
|
||
stricter interpretation of "Booleanness" makes the language any
|
||
clearer.
|
||
|
||
Another consequence of the compatibility requirement is that the
|
||
expression "True and 6" has the value 6, and similarly the
|
||
expression "False or None" has the value None. The "and" and "or"
|
||
operators are usefully defined to return the first argument that
|
||
determines the outcome, and this won't change; in particular, they
|
||
don't force the outcome to be a bool. Of course, if both
|
||
arguments are bools, the outcome is always a bool. It can also
|
||
easily be coerced into being a bool by writing for example "bool(x
|
||
and y)".
|
||
|
||
|
||
Resolved Issues
|
||
|
||
(See also the Review section above.)
|
||
|
||
- Because the repr() or str() of a bool value is different from an
|
||
int value, some code (for example doctest-based unit tests, and
|
||
possibly database code that relies on things like "%s" % truth)
|
||
may fail. It is easy to work around this (without explicitly
|
||
referencing the bool type), and it is expected that this only
|
||
affects a very small amount of code that can easily be fixed.
|
||
|
||
- Other languages (C99, C++, Java) name the constants "false" and
|
||
"true", in all lowercase. For Python, I prefer to stick with
|
||
the example set by the existing built-in constants, which all
|
||
use CapitalizedWords: None, Ellipsis, NotImplemented (as well as
|
||
all built-in exceptions). Python's built-in namespace uses all
|
||
lowercase for functions and types only.
|
||
|
||
- It has been suggested that, in order to satisfy user
|
||
expectations, for every x that is considered true in a Boolean
|
||
context, the expression x == True should be true, and likewise
|
||
if x is considered false, x == False should be true. In
|
||
particular newbies who have only just learned about Boolean
|
||
variables are likely to write
|
||
|
||
if x == True: ...
|
||
|
||
instead of the correct form,
|
||
|
||
if x: ...
|
||
|
||
There seem to be strong psychological and linguistic reasons why
|
||
many people are at first uncomfortable with the latter form, but
|
||
I believe that the solution should be in education rather than
|
||
in crippling the language. After all, == is general seen as a
|
||
transitive operator, meaning that from a==b and b==c we can
|
||
deduce a==c. But if any comparison to True were to report
|
||
equality when the other operand was a true value of any type,
|
||
atrocities like 6==True==7 would hold true, from which one could
|
||
infer the falsehood 6==7. That's unacceptable. (In addition,
|
||
it would break backwards compatibility. But even if it didn't,
|
||
I'd still be against this, for the stated reasons.)
|
||
|
||
Newbies should also be reminded that there's never a reason to
|
||
write
|
||
|
||
if bool(x): ...
|
||
|
||
since the bool is implicit in the "if". Explicit is *not*
|
||
better than implicit here, since the added verbiage impairs
|
||
redability and there's no other interpretation possible. There
|
||
is, however, sometimes a reason to write
|
||
|
||
b = bool(x)
|
||
|
||
This is useful when it is unattractive to keep a reference to an
|
||
arbitrary object x, or when normalization is required for some
|
||
other reason. It is also sometimes appropriate to write
|
||
|
||
i = int(bool(x))
|
||
|
||
which converts the bool to an int with the value 0 or 1. This
|
||
conveys the intention to henceforth use the value as an int.
|
||
|
||
|
||
Implementation
|
||
|
||
A complete implementation in C has been uploaded to the
|
||
SourceForge patch manager:
|
||
|
||
http://python.org/sf/528022
|
||
|
||
This will soon be checked into CVS for python 2.3a0.
|
||
|
||
|
||
Copyright
|
||
|
||
This document has been placed in the public domain.
|
||
|
||
|
||
|
||
Local Variables:
|
||
mode: indented-text
|
||
indent-tabs-mode: nil
|
||
fill-column: 70
|
||
End:
|