Accepted, closed the review period, pronounced all open issues, and
added clarification and extra opinions to various spots.  Will post
once more.
Guido van Rossum 2002-04-03 22:11:05 +00:00
parent 624f9377bf
commit 10dbefaed1
1 changed files with 174 additions and 110 deletions

@@ -3,11 +3,11 @@ Title: Adding a bool type
Version: $Revision$
Last-Modified: $Date$
Author: guido@python.org (Guido van Rossum)
Status: Draft
Status: Accepted
Type: Standards Track
Created: 8-Mar-2002
Python-Version: 2.3
Post-History: 8-Mar-2002, 30-Mar-2002
Post-History: 8-Mar-2002, 30-Mar-2002, 3-Apr-2002
Abstract
@@ -25,91 +25,97 @@ Abstract
Review
Dear reviewers:
I've collected enough feedback to last me a lifetime, so I declare
the review period officially OVER. I had Chinese food today; my
fortune cookie said "Strong and bitter words indicate a weak
cause." It reminded me of some of the posts against this
PEP... :-)
I'm particularly interested in hearing your opinion about the
following three issues:
Anyway, here are my BDFL pronouncements. (Executive summary: I'm
not changing a thing; all variants are rejected.)
1) Should this PEP be accepted at all.
1) Should this PEP be accepted?
=> The majority of reviewers so far are in favor, ranging from +0
(don't hate it) to +1 (yes please). Votes against are mixed:
some are against all change, some think it's not needed, some
think it will just add more confusion or complexity, some have
irrational fears about code breakage based on misunderstanding
the PEP (believing it adds reserved words, or believing it will
require you to write "if bool(x):" where previously "if x:"
worked; neither belief is true).
=> Yes.
2) Should str(True) return "True" or "1": "1" might reduce
backwards compatibility problems, but looks strange to me.
There have been many arguments against the PEP. Many of them
were based on misunderstandings. I've tried to clarify some of
the most common misunderstandings below in the main text of the
PEP. The only issue that weighs at all for me is the tendency
of newbies to write "if x == True" where "if x" would suffice.
More about that below too. I think this is not a sufficient
reason to reject the PEP.
2) Should str(True) return "True" or "1"? "1" might reduce
backwards compatibility problems, but looks strange.
(repr(True) would always return "True".)
=> Most reviewers prefer str(True) == "True" (which may mean that
they don't appreciate the specific backwards compatibility
issue brought up by Marc-Andre Lemburg :-).
=> "True".
3) Should the constants be called 'True' and 'False'
(corresponding to None) or 'true' and 'false' (as in C++, Java
and C99).
Almost all reviewers agree with this.
=> There's no clear preference either way here, so I'll break the
tie by pronouncing False and True.
3) Should the constants be called 'True' and 'False' (similar to
None) or 'true' and 'false' (as in C++, Java and C99)?
Most other details of the proposal are pretty much forced by the
backwards compatibility requirement; for example, True == 1 and
True+1 == 2 must hold, else reams of existing code would break.
=> True and False.
Minor additional issues:
Most reviewers agree that consistency within Python is more
important than consistency with other languages.
4) Should we strive to eliminate non-Boolean operations on bools
in the future, through suitable warnings, so that for example
True+1 would eventually (in Python 3000) be illegal.
Personally, I think we shouldn't; 28+isleap(y) seems totally
reasonable to me.
True+1 would eventually (in Python 3000) be illegal?
=> Most reviewers agree with me.
=> No.
5) Should operator.truth(x) return an int or a bool. Tim Peters
believes it should return an int because it's been documented
as such. I think it should return a bool; most other standard
predicates (like issubclass()) have also been documented as
returning 0 or 1, and it's obvious that we want to change those
to return a bool.
There's a small but vocal minority that would prefer to see
"textbook" bools that don't support arithmetic operations at
all, but most reviewers agree with me that bools should always
allow arithmetic operations.
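
As a small illustration of why arithmetic on bools is worth keeping,
here is a sketch of the 28+isleap(y) idiom. The helper name
days_in_february is hypothetical; calendar.isleap() is the standard
library predicate, which returns a bool.

```python
import calendar

def days_in_february(year):
    # calendar.isleap() returns a bool; because bool is a subclass of
    # int, it can be added directly to 28.
    return 28 + calendar.isleap(year)

print(days_in_february(2000))  # a leap year
print(days_in_february(2002))  # not a leap year
```

The result is a plain int, since int + bool yields an int.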
=> Most reviewers agree with me. My take: operator.truth() exists
to force a Boolean context on its argument (it calls the C API
PyObject_IsTrue()). Whether the outcome is reported as int or
bool is secondary; if bool exists there's no reason not to use
it.
5) Should operator.truth(x) return an int or a bool?
New issues brought up during the review:
=> bool.
Tim Peters believes it should return an int, but almost all
other reviewers agree that it should return a bool. My
rationale: operator.truth() exists to force a Boolean context
on its argument (it calls the C API PyObject_IsTrue()).
Whether the outcome is reported as int or bool is secondary; if
bool exists there's no reason not to use it. (Under the PEP,
operator.truth() now becomes an alias for bool(); that's fine.)
6) Should bool inherit from int?
=> My take: in an ideal world, bool might be better implemented as
a separate integer type that knows how to perform mixed-mode
=> Yes.
In an ideal world, bool might be better implemented as a
separate integer type that knows how to perform mixed-mode
arithmetic. However, inheriting bool from int eases the
implementation enormously (in part since all C code that calls
PyInt_Check() will continue to work -- this returns true for
subclasses of int). Also, I believe in terms of
substitutability, this is right: code that requires an int can
be fed a bool and it will behave the same as 0 or 1. Code that
requires a bool may not work when it is given an int; for
example, 3 & 4 is 0, but both 3 and 4 are true when considered
as truth values.
subclasses of int). Also, I believe this is right in terms of
substitutability: code that requires an int can be fed a bool
and it will behave the same as 0 or 1. Code that requires a
bool may not work when it is given an int; for example, 3 & 4
is 0, but both 3 and 4 are true when considered as truth
values.
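
The substitutability argument can be sketched in a few lines. This is
only an illustration, not part of the specification; the list name
choices is made up for the example.

```python
# bool is a subclass of int, so a bool works wherever an int is required.
assert issubclass(bool, int)
assert isinstance(True, int)

# Indexing and arithmetic treat False/True exactly like 0/1.
choices = ["no", "yes"]
print(choices[True])
print(True + 1)

# The reverse does not hold: ints are not interchangeable with bools.
# 3 and 4 are both true as truth values, yet their bitwise AND is 0.
print(3 & 4)
print(bool(3) & bool(4))
```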
7) Should the name 'bool' be changed?
=> Some reviewers argue for boolean instead of bool, because this
would be easier to understand (novices may have heard of
=> No.
Some reviewers have argued for boolean instead of bool, because
this would be easier to understand (novices may have heard of
Boolean algebra but may not make the connection with bool) or
because they hate abbreviations. My take: Python uses
abbreviations judiciously (like 'def', 'int', 'dict') and I
don't think these are a burden to understanding.
don't think these are a burden to understanding. To a newbie,
it doesn't matter whether it's called a waffle or a bool; it's
a new word, and they learn quickly what it means.
One reviewer argues to make the name 'truth'. I find this an
unattractive name, and would actually prefer to reserve this
One reviewer has argued to make the name 'truth'. I find this
an unattractive name, and would actually prefer to reserve this
term (in documentation) for the more abstract concept of truth
values that already exists in Python. For example: "when a
container is interpreted as a truth value, an empty container
@@ -120,18 +126,15 @@ Review
for example "if []:" would become illegal and would have to be
written as "if bool([]):" ???
=> No!!! Some people believe that this is how a language with a
=> No!!!
Some people believe that this is how a language with a textbook
Boolean type should behave. Because it was brought up, others
have worried that I might agree with this position. Let me
make my position on this quite clear. This is not part of the
PEP's motivation and I don't intend to make this change. (See
also the section "Clarification" below.)
9) What about pickle and marshal?
=> I've added a paragraph to the specification requiring that
bool values roundtrip when pickled or marshalled.
Rationale
@@ -185,12 +188,12 @@ Rationale
True
>>>
it would require one millisecond less thinking each time a 0 or 1
it would require a millisecond less thinking each time a 0 or 1
was printed.
There's also the issue (which I've seen puzzling even experienced
Pythonistas who had been away from the language for a while) that if
you see:
There's also the issue (which I've seen baffling even experienced
Pythonistas who had been away from the language for a while) that
if you see:
>>> cmp(a, b)
1
@@ -199,9 +202,10 @@ Rationale
>>>
you might be tempted to believe that cmp() also returned a truth
value. If ints are not (normally) used for Boolean results, this
would stand out much more clearly as something completely
different.
value, whereas in reality it can return three different values
(-1, 0, 1). If ints were not (normally) used to represent
Boolean results, this would stand out much more clearly as
something completely different.
Specification
@@ -254,14 +258,14 @@ Specification
False = int.__new__(bool, 0)
True = int.__new__(bool, 1)
The values False and True will be singletons, like None; the C
implementation will not allow other instances of bool to be
created. At the C level, the existing globals Py_False and
Py_True will be appropriated to refer to False and True.
The values False and True will be singletons, like None. Because
the type has two values, perhaps these should be called
"doubletons"? The real implementation will not allow other
instances of bool to be created.
True and False will properly round-trip through pickling and
marshalling; for example pickle.loads(pickle.dumps(True)) will
return True.
return True, and so will marshal.loads(marshal.dumps(True)).
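
A minimal check of the round-trip requirement might look like this.
It is a sketch that assumes the singleton property described above,
so it compares with "is" rather than "==".

```python
import marshal
import pickle

for value in (True, False):
    # Round-trip through pickle and marshal; the result should be the
    # very same singleton object, not merely an equal int.
    assert pickle.loads(pickle.dumps(value)) is value
    assert marshal.loads(marshal.dumps(value)) is value

roundtripped = pickle.loads(pickle.dumps(True))
print(type(roundtripped).__name__, roundtripped)
```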
All built-in operations that are defined to return a Boolean
result will be changed to return False or True instead of 0 or 1.
@@ -272,7 +276,8 @@ Specification
endswith(), isalnum(), isalpha(), isdigit(), islower(), isspace(),
istitle(), isupper(), and startswith(), the unicode methods
isdecimal() and isnumeric(), and the 'closed' attribute of file
objects.
objects. The predicates in the operator module are also changed
to return a bool, including operator.truth().
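
For illustration, a few of these predicates can be exercised as
follows. The sample values are arbitrary; the sketch only shows that
the results are bool objects and that operator.truth() agrees with
bool().

```python
import operator

# String predicates return False or True rather than 0 or 1.
print("abc".startswith("a"))
print("abc".isdigit())

# operator.truth() forces a Boolean context on its argument and
# reports the result as a bool, just as bool() does.
for obj in ([], [0], "", "x", 0, 3.5, None):
    assert operator.truth(obj) is bool(obj)
```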
Because bool inherits from int, True+1 is valid and equals 2, and
so on. This is important for backwards compatibility: because
@@ -287,6 +292,29 @@ Specification
detail by this PEP.
C API
The header file "boolobject.h" defines the C API for the bool
type. It is included by "Python.h" so there is no need to include
it directly.
The existing names Py_False and Py_True reference the unique bool
objects False and True (previously these referenced static int
objects with values 0 and 1, which were not unique amongst int
values).
A new API, PyObject *PyBool_FromLong(long), takes a C long int
argument and returns a new reference to either Py_False (when the
argument is zero) or Py_True (when it is nonzero).
To check whether an object is a bool, the macro PyBool_Check() can
be used.
The type of bool instances is PyBoolObject *.
The bool type object is available as PyBool_Type.
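
The behavior of PyBool_FromLong() can be observed from Python itself.
The sketch below is not how extension code would normally call it; it
assumes CPython, where ctypes.pythonapi exposes the C API, and it
declares the argument and result types by hand.

```python
import ctypes

# Assumption: running on CPython, where ctypes.pythonapi exposes the
# C API of the running interpreter.
PyBool_FromLong = ctypes.pythonapi.PyBool_FromLong
PyBool_FromLong.argtypes = [ctypes.c_long]
PyBool_FromLong.restype = ctypes.py_object

# Zero maps to False; any nonzero value maps to True.
print(PyBool_FromLong(0))
print(PyBool_FromLong(42))
```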
Clarification
This PEP does *not* change the fact that almost all object types
@@ -310,8 +338,9 @@ Compatibility
index.
I don't see this as a problem, and I don't want to evolve the
language in this direction either; I don't believe that a stricter
interpretation of "Booleanness" makes the language any clearer.
language in this direction either. I don't believe that a
stricter interpretation of "Booleanness" makes the language any
clearer.
Another consequence of the compatibility requirement is that the
expression "True and 6" has the value 6, and similarly the
@@ -320,48 +349,83 @@ Compatibility
determines the outcome, and this won't change; in particular, they
don't force the outcome to be a bool. Of course, if both
arguments are bools, the outcome is always a bool. It can also
easily be coerced into being a bool by writing for example
"bool(x and y)".
easily be coerced into being a bool by writing for example "bool(x
and y)".
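
The operand-returning behavior of "and" and "or" can be sketched as
follows. The operand values are arbitrary examples chosen for this
illustration.

```python
# "and" and "or" return one of their operands, not necessarily a bool.
print(True and 6)
print(False or 7)

# With two bool operands the outcome is always a bool.
print(True and False)

# Coerce explicitly when a bool is needed.
x, y = [1, 2], "text"
print(x and y)
print(bool(x and y))
```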
Issues
Resolved Issues
Because the repr() or str() of a bool value is different from an
int value, some code (for example doctest-based unit tests, and
possibly database code that relies on things like "%s" % truth)
may fail. How much of a backwards compatibility problem this will
be, I don't know. If this turns out to be a real problem, we
could change the rules so that str() of a bool returns "0" or
"1", while repr() of a bool still returns "False" or "True".
(See also the Review section above.)
Other languages (C99, C++, Java) name the constants "false" and
"true", in all lowercase. In Python, I prefer to stick with the
example set by the existing built-in constants, which all use
CapitalizedWords: None, Ellipsis, NotImplemented (as well as all
built-in exceptions). Python's built-in module uses all lowercase
for functions and types only. But I'm willing to consider the
lowercase alternatives if enough people think it looks better.
- Because the repr() or str() of a bool value is different from an
int value, some code (for example doctest-based unit tests, and
possibly database code that relies on things like "%s" % truth)
may fail. It is easy to work around this (without explicitly
referencing the bool type), and it is expected that this only
affects a very small amount of code that can easily be fixed.
It has been suggested that, in order to satisfy user expectations,
for every x that is considered true in a Boolean context, the
expression x == True should be true, and likewise if x is
considered false, x == False should be true. This is of course
impossible; it would mean that for example 6 == True and 7 ==
True, from which one could infer 6 == 7. Similarly, [] == False
== None would be true, and one could infer [] == None, which is
not the case. I'm not sure where this suggestion came from; it
was made several times during the first review period. For truth
testing, one should use "if", as in "if x: print 'Yes'", not
comparison to a truth value; "if x == True: print 'Yes'" is not
only wrong, it is also strangely redundant.
- Other languages (C99, C++, Java) name the constants "false" and
"true", in all lowercase. For Python, I prefer to stick with
the example set by the existing built-in constants, which all
use CapitalizedWords: None, Ellipsis, NotImplemented (as well as
all built-in exceptions). Python's built-in namespace uses all
lowercase for functions and types only.
- It has been suggested that, in order to satisfy user
expectations, for every x that is considered true in a Boolean
context, the expression x == True should be true, and likewise
if x is considered false, x == False should be true. In
particular newbies who have only just learned about Boolean
variables are likely to write
if x == True: ...
instead of the correct form,
if x: ...
There seem to be strong psychological and linguistic reasons why
many people are at first uncomfortable with the latter form, but
I believe that the solution should be in education rather than
in crippling the language. After all, == is generally seen as a
transitive operator, meaning that from a==b and b==c we can
deduce a==c. But if any comparison to True were to report
equality when the other operand was a true value of any type,
atrocities like 6==True==7 would hold true, from which one could
infer the falsehood 6==7. That's unacceptable. (In addition,
it would break backwards compatibility. But even if it didn't,
I'd still be against this, for the stated reasons.)
Newbies should also be reminded that there's never a reason to
write
if bool(x): ...
since the bool is implicit in the "if". Explicit is *not*
better than implicit here, since the added verbiage impairs
readability and there's no other interpretation possible. There
is, however, sometimes a reason to write
b = bool(x)
This is useful when it is unattractive to keep a reference to an
arbitrary object x, or when normalization is required for some
other reason. It is also sometimes appropriate to write
i = int(bool(x))
which converts the bool to an int with the value 0 or 1. This
conveys the intention to henceforth use the value as an int.
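
The difference between truth testing and comparison with True can be
sketched as follows. The helper name describe is hypothetical and
exists only for this illustration.

```python
def describe(x):
    # Truth testing: rely on the implicit Boolean context of "if".
    return "true" if x else "false"

for x in (6, 7, [0], "a"):
    # Each value is true in a Boolean context...
    assert describe(x) == "true"
    # ...but none of them compares equal to True.
    assert not (x == True)

# 1 compares equal to True; 6 does not, although both are true values.
print(1 == True, 6 == True)

# Normalization when a reference to the original object is unwanted.
b = bool([0])
i = int(bool("a"))
print(b, i, type(i).__name__)
```

Because only values equal to 1 compare equal to True, == stays
transitive and 6 == 7 can never be inferred.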
Implementation
An experimental, but fairly complete implementation in C has been
uploaded to the SourceForge patch manager:
A complete implementation in C has been uploaded to the
SourceForge patch manager:
http://python.org/sf/528022
http://python.org/sf/528022
This will soon be checked into CVS for Python 2.3a0.
Copyright