After lengthy discussion on python-dev, here is an update to the Python style
guide.  Hopefully it has been simplified and made more prescriptive, as well
as improved by all the great community suggestions.
Barry Warsaw 2005-12-14 21:12:58 +00:00
parent 91a41916ba
commit 478d6d498e
1 changed files with 389 additions and 305 deletions

@ -11,83 +11,86 @@ Post-History: 05-Jul-2001
Introduction
This document gives coding conventions for the Python code
comprising the standard library for the main Python distribution.
Please see the companion informational PEP describing style
guidelines for the C code in the C implementation of Python[1].
This document gives coding conventions for the Python code comprising the
standard library in the main Python distribution. Please see the
companion informational PEP describing style guidelines for the C code in
the C implementation of Python[1].
This document was adapted from Guido's original Python Style
Guide essay[2], with some additions from Barry's style guide[5].
Where there's conflict, Guido's style rules for the purposes of
this PEP. This PEP may still be incomplete (in fact, it may never
be finished <wink>).
This document was adapted from Guido's original Python Style Guide
essay[2], with some additions from Barry's style guide[5]. Where there's
conflict, Guido's style rules for the purposes of this PEP. This PEP may
still be incomplete (in fact, it may never be finished <wink>).
A Foolish Consistency is the Hobgoblin of Little Minds
A style guide is about consistency. Consistency with this style
guide is important. Consistency within a project is more
important. Consistency within one module or function is most
important.
One of Guido's key insights is that code is read much more often than it
is written. The guidelines provided here are intended to improve the
readability of code and make it consistent across the wide spectrum of
Python code. As PEP 20 [6] says, "Readability counts".
But most importantly: know when to be inconsistent -- sometimes
the style guide just doesn't apply. When in doubt, use your best
judgement. Look at other examples and decide what looks best.
And don't hesitate to ask!
A style guide is about consistency. Consistency with this style guide is
important. Consistency within a project is more important. Consistency
within one module or function is most important.
But most importantly: know when to be inconsistent -- sometimes the style
guide just doesn't apply. When in doubt, use your best judgment. Look
at other examples and decide what looks best. And don't hesitate to ask!
Two good reasons to break a particular rule:
(1) When applying the rule would make the code less readable, even
for someone who is used to reading code that follows the rules.
(1) When applying the rule would make the code less readable, even for
someone who is used to reading code that follows the rules.
(2) To be consistent with surrounding code that also breaks it
(maybe for historic reasons) -- although this is also an
opportunity to clean up someone else's mess (in true XP style).
(2) To be consistent with surrounding code that also breaks it (maybe for
historic reasons) -- although this is also an opportunity to clean up
someone else's mess (in true XP style).
Code lay-out
Indentation
Use the default of Emacs' Python-mode: 4 spaces for one
indentation level. For really old code that you don't want to
mess up, you can continue to use 8-space tabs. Emacs Python-mode
auto-detects the prevailing indentation level used in a file and
sets its indentation parameters accordingly.
Use 4 spaces per indentation level.
This is the default for Emacs's python-mode. For really old code that you
don't want to mess up, you can continue to use 8-space tabs. Emacs
python-mode auto-detects the prevailing indentation level used in a file
and sets its indentation parameters accordingly.
Tabs or Spaces?
Never mix tabs and spaces. The most popular way of indenting
Python is with spaces only. The second-most popular way is with
tabs only. Code indented with a mixture of tabs and spaces should
be converted to using spaces exclusively. (In Emacs, select the
whole buffer and hit ESC-x untabify.) When invoking the python
command line interpreter with the -t option, it issues warnings
about code that illegally mixes tabs and spaces. When using -tt
these warnings become errors. These options are highly
recommended!
Never mix tabs and spaces.
For new projects, spaces-only indentation is strongly recommended over tabs.
Most editors have features that make this easy to do. (In Emacs,
make sure indent-tabs-mode is nil.)
The most popular way of indenting Python is with spaces only. The
second-most popular way is with tabs only. Code indented with a mixture
of tabs and spaces should be converted to using spaces exclusively. (In
Emacs, select the whole buffer and hit ESC-x untabify.) When invoking the
Python command line interpreter with the -t option, it issues warnings
about code that illegally mixes tabs and spaces. When using -tt these
warnings become errors. These options are highly recommended!
For new projects, spaces-only indentation is strongly recommended over
tabs. Most editors have features that make this easy to do. (In Emacs,
make sure indent-tabs-mode is nil.)
Maximum Line Length
There are still many devices around that are limited to 80
character lines; plus, limiting windows to 80 characters makes it
possible to have several windows side-by-side. The default
wrapping on such devices looks ugly. Therefore, please limit all
lines to a maximum of 79 characters (Emacs wraps lines that are
exactly 80 characters long). For flowing long blocks of text
(docstrings or comments), limiting the length to 72 characters is
recommended.
Limit all lines to a maximum of 79 characters.
The preferred way of wrapping long lines is by using Python's
implied line continuation inside parentheses, brackets and braces.
If necessary, you can add an extra pair of parentheses around an
expression, but sometimes using a backslash looks better. Make
sure to indent the continued line appropriately. Emacs
Python-mode does this right. Some examples:
There are still many devices around that are limited to 80 character
lines; plus, limiting windows to 80 characters makes it possible to have
several windows side-by-side. The default wrapping on such devices looks
ugly. Therefore, please limit all lines to a maximum of 79 characters
(Emacs wraps lines that are exactly 80 characters long). For flowing long
blocks of text (docstrings or comments), limiting the length to 72
characters is recommended.
The preferred way of wrapping long lines is by using Python's implied line
continuation inside parentheses, brackets and braces. If necessary, you
can add an extra pair of parentheses around an expression, but sometimes
using a backslash looks better. Make sure to indent the continued line
appropriately. Emacs's python-mode does this right. Some examples:
class Rectangle(Blob):
@ -105,65 +108,63 @@ Code lay-out
Blank Lines
Separate top-level function and class definitions with two blank
lines. Method definitions inside a class are separated by a
single blank line. Extra blank lines may be used (sparingly) to
separate groups of related functions. Blank lines may be omitted
between a bunch of related one-liners (e.g. a set of dummy
implementations).
Separate top-level function and class definitions with two blank lines.
When blank lines are used to separate method definitions, there is
also a blank line between the `class' line and the first method
definition.
Method definitions inside a class are separated by a single blank line.
Use blank lines in functions, sparingly, to indicate logical
sections.
Extra blank lines may be used (sparingly) to separate groups of related
functions. Blank lines may be omitted between a bunch of related
one-liners (e.g. a set of dummy implementations).
Python accepts the control-L (i.e. ^L) form feed character as
whitespace; Emacs (and some printing tools) treat these
characters as page separators, so you may use them to separate
pages of related sections of your file.
Use blank lines in functions, sparingly, to indicate logical sections.
Python accepts the control-L (i.e. ^L) form feed character as whitespace;
Emacs (and some printing tools) treat these characters as page separators,
so you may use them to separate pages of related sections of your file.
Encodings (PEP 263)
Code in the core Python distribution should always use the ASCII or
Latin-1 encoding (a.k.a. ISO-8859-1). Files using ASCII should
not have a coding cookie. Latin-1 should only be used when a
comment or docstring needs to mention an author name that requires
Latin-1; otherwise, using \x escapes is the preferred way to
include non-ASCII data in string literals. An exception is made
for those files that are part of the test suite for the code
implementing PEP 263.
Code in the core Python distribution should always use the ASCII or Latin-1
encoding (a.k.a. ISO-8859-1).
Files using ASCII should not have a coding cookie. Latin-1 should only be
used when a comment or docstring needs to mention an author name that
requires Latin-1; otherwise, using \x escapes is the preferred way to
include non-ASCII data in string literals. An exception is made for those
files that are part of the test suite for the code implementing PEP 263.
Imports
- Imports should usually be on separate lines, e.g.:
Yes: import os
import sys
No: import sys, os
Yes: import sys
import os
it's okay to say this though:
from types import StringType, ListType
from subprocess import Popen, PIPE
- Imports are always put at the top of the file, just after any
module comments and docstrings, and before module globals and
constants. Imports should be grouped, with the order being
- Imports are always put at the top of the file, just after any module
comments and docstrings, and before module globals and constants.
Imports should be grouped in the following order:
1. standard library imports
2. related major package imports (i.e. all email package imports next)
3. application specific imports
2. related third party imports
3. local application/library specific imports
You should put a blank line between each group of imports.
- Relative imports for intra-package imports are highly
discouraged. Always use the absolute package path for all
imports.
Put any relevant __all__ specification after the imports.
- When importing a class from a class-containing module, it's usually
okay to spell this
- Relative imports for intra-package imports are highly discouraged.
Always use the absolute package path for all imports.
- When importing a class from a class-containing module, it's usually okay
to spell this
from MyClass import MyClass
from foo.bar.YourClass import YourClass
@ -180,40 +181,45 @@ Whitespace in Expressions and Statements
Pet Peeves
Guido hates whitespace in the following places:
Avoid extraneous whitespace in the following situations:
- Immediately inside parentheses, brackets or braces, as in:
"spam( ham[ 1 ], { eggs: 2 } )". Always write this as
"spam(ham[1], {eggs: 2})".
- Immediately inside parentheses, brackets or braces.
- Immediately before a comma, semicolon, or colon, as in:
"if x == 4 : print x , y ; x , y = y , x". Always write this as
"if x == 4: print x, y; x, y = y, x".
Yes: spam(ham[1], {eggs: 2})
No: spam( ham[ 1 ], { eggs: 2 } )
- Immediately before a comma, semicolon, or colon:
Yes: if x == 4: print x, y; x, y = y, x
No: if x == 4 : print x , y ; x , y = y , x
- Immediately before the open parenthesis that starts the argument
list of a function call, as in "spam (1)". Always write
this as "spam(1)".
list of a function call:
Yes: spam(1)
No: spam (1)
- Immediately before the open parenthesis that starts an indexing or
slicing, as in: "dict ['key'] = list [index]". Always
write this as "dict['key'] = list[index]".
slicing:
Yes: dict['key'] = list[index]
No: dict ['key'] = list [index]
- More than one space around an assignment (or other) operator to
align it with another, as in:
align it with another.
Yes:
x = 1
y = 2
long_variable = 3
Always write this as
No:
x = 1
y = 2
long_variable = 3
(Don't bother to argue with him on any of the above -- Guido's
grown accustomed to this style over 20 years.)
Other Recommendations
@ -221,53 +227,70 @@ Whitespace in Expressions and Statements
either side: assignment (=), comparisons (==, <, >, !=, <>, <=,
>=, in, not in, is, is not), Booleans (and, or, not).
- Use your better judgment for the insertion of spaces around
arithmetic operators. Always be consistent about whitespace on
either side of a binary operator. Some examples:
- Use spaces around arithmetic operators:
i = i+1
submitted = submitted + 1
Yes:
i = i + 1
submitted += 1
x = x * 2 - 1
hypot2 = x * x + y * y
c = (a + b) * (a - b)
No:
i=i+1
submitted +=1
x = x*2 - 1
hypot2 = x*x + y*y
c = (a+b) * (a-b)
c = (a + b) * (a - b)
- Don't use spaces around the '=' sign when used to indicate a
keyword argument or a default parameter value. For instance:
keyword argument or a default parameter value.
Yes:
def complex(real, imag=0.0):
return magic(r=real, i=imag)
No:
def complex(real, imag = 0.0):
return magic(r = real, i = imag)
- Compound statements (multiple statements on the same line) are
generally discouraged.
No: if foo == 'blah': do_blah_thing()
Yes: if foo == 'blah':
do_blah_thing()
Yes:
No: do_one(); do_two(); do_three()
Yes: do_one()
if foo == 'blah':
do_blah_thing()
do_one()
do_two()
do_three()
No:
if foo == 'blah': do_blah_thing()
do_one(); do_two(); do_three()
Comments
Comments that contradict the code are worse than no comments.
Always make a priority of keeping the comments up-to-date when the
code changes!
Comments that contradict the code are worse than no comments. Always make
a priority of keeping the comments up-to-date when the code changes!
Comments should be complete sentences. If a comment is a phrase
or sentence, its first word should be capitalized, unless it is an
identifier that begins with a lower case letter (never alter the
case of identifiers!).
Comments should be complete sentences. If a comment is a phrase or
sentence, its first word should be capitalized, unless it is an identifier
that begins with a lower case letter (never alter the case of
identifiers!).
If a comment is short, the period at the end is best omitted.
Block comments generally consist of one or more paragraphs built
out of complete sentences, and each sentence should end in a
period.
If a comment is short, the period at the end can be omitted. Block
comments generally consist of one or more paragraphs built out of complete
sentences, and each sentence should end in a period.
You should use two spaces after a sentence-ending period, since it
makes Emacs wrapping and filling work consistenty.
makes Emacs wrapping and filling work consistently.
When writing English, Strunk and White apply.
@ -278,66 +301,65 @@ Comments
Block Comments
Block comments generally apply to some (or all) code that follows
them, and are indented to the same level as that code. Each line
of a block comment starts with a # and a single space (unless it
is indented text inside the comment). Paragraphs inside a block
comment are separated by a line containing a single #. Block
comments are best surrounded by a blank line above and below them
(or two lines above and a single line below for a block comment at
the start of a a new section of function definitions).
Block comments generally apply to some (or all) code that follows them,
and are indented to the same level as that code. Each line of a block
comment starts with a # and a single space (unless it is indented text
inside the comment).
Paragraphs inside a block comment are separated by a line containing a
single #.
Inline Comments
An inline comment is a comment on the same line as a statement.
Inline comments should be used sparingly. Inline comments should
be separated by at least two spaces from the statement. They
should start with a # and a single space.
Use inline comments sparingly.
An inline comment is a comment on the same line as a statement. Inline
comments should be separated by at least two spaces from the statement.
They should start with a # and a single space.
Inline comments are unnecessary and in fact distracting if they state
the obvious. Don't do this:
x = x+1 # Increment x
x = x + 1 # Increment x
But sometimes, this is useful:
x = x+1 # Compensate for border
x = x + 1 # Compensate for border
Documentation Strings
Conventions for writing good documentation strings
(a.k.a. "docstrings") are immortalized in PEP 257 [3].
Conventions for writing good documentation strings (a.k.a. "docstrings")
are immortalized in PEP 257 [3].
- Write docstrings for all public modules, functions, classes, and
methods. Docstrings are not necessary for non-public methods,
but you should have a comment that describes what the method
does. This comment should appear after the "def" line.
methods. Docstrings are not necessary for non-public methods, but you
should have a comment that describes what the method does. This comment
should appear after the "def" line.
- PEP 257 describes good docstring conventions. Note that most
importantly, the """ that ends a multiline docstring should be
on a line by itself, e.g.:
importantly, the """ that ends a multiline docstring should be on a line
by itself, e.g.:
"""Return a foobang
Optional plotz says to frobnicate the bizbaz first.
"""
- For one liner docstrings, it's okay to keep the closing """ on
the same line.
- For one liner docstrings, it's okay to keep the closing """ on the same
line.
Version Bookkeeping
If you have to have RCS or CVS crud in your source file, do it as
follows.
If you have to have Subversion, CVS, or RCS crud in your source file, do
it as follows.
__version__ = "$Revision$"
# $Source$
These lines should be included after the module's docstring,
before any other code, separated by a blank line above and
below.
These lines should be included after the module's docstring, before any
other code, separated by a blank line above and below.
Naming Conventions
@ -345,15 +367,15 @@ Naming Conventions
The naming conventions of Python's library are a bit of a mess, so we'll
never get this completely consistent -- nevertheless, here are the
currently recommended naming standards. New modules and packages
(including 3rd party frameworks) should be written to these standards, but
where an existing library has a different style, internal consistency is
preferred.
(including third party frameworks) should be written to these standards,
but where an existing library has a different style, internal consistency
is preferred.
Descriptive: Naming Styles
There are a lot of different naming styles. It helps to be able
to recognize what naming style is being used, independently from
what they are used for.
There are a lot of different naming styles. It helps to be able to
recognize what naming style is being used, independently from what they
are used for.
The following naming styles are commonly distinguished:
@ -373,73 +395,66 @@ Naming Conventions
of the bumpy look of its letters[4]). This is also sometimes known as
StudlyCaps.
Note: When using abbreviations in CapWords, capitalize all the letters
of the abbreviation. Thus HTTPServerError is better than
HttpServerError.
- mixedCase (differs from CapitalizedWords by initial lowercase
character!)
- Capitalized_Words_With_Underscores (ugly!)
There's also the style of using a short unique prefix to group
related names together. This is not used much in Python, but it
is mentioned for completeness. For example, the os.stat()
function returns a tuple whose items traditionally have names like
st_mode, st_size, st_mtime and so on. The X11 library uses a
leading X for all its public functions. (In Python, this style is
generally deemed unnecessary because attribute and method names
are prefixed with an object, and function names are prefixed with
a module name.)
There's also the style of using a short unique prefix to group related
names together. This is not used much in Python, but it is mentioned for
completeness. For example, the os.stat() function returns a tuple whose
items traditionally have names like st_mode, st_size, st_mtime and so on.
The X11 library uses a leading X for all its public functions. In Python,
this style is generally deemed unnecessary because attribute and method
names are prefixed with an object, and function names are prefixed with a
module name.
In addition, the following special forms using leading or trailing
underscores are recognized (these can generally be combined with any
case convention):
underscores are recognized (these can generally be combined with any case
convention):
- _single_leading_underscore: weak "internal use" indicator
(e.g. "from M import *" does not import objects whose name
starts with an underscore).
- _single_leading_underscore: weak "internal use" indicator. E.g. "from M
import *" does not import objects whose name starts with an underscore.
- single_trailing_underscore_: used by convention to avoid
conflicts with Python keyword, e.g.
"Tkinter.Toplevel(master, class_='ClassName')".
- single_trailing_underscore_: used by convention to avoid conflicts with
Python keyword, e.g.
- __double_leading_underscore: class-private names as of Python 1.4.
Tkinter.Toplevel(master, class_='ClassName')
- __double_leading_underscore: when naming a class attribute, invokes name
mangling as of Python 1.4.
- __double_leading_and_trailing_underscore__: "magic" objects or
attributes that live in user-controlled namespaces,
e.g. __init__, __import__ or __file__. Sometimes these are
defined by the user to trigger certain magic behavior
(e.g. operator overloading); sometimes these are inserted by the
infrastructure for its own use or for debugging purposes. Since
the infrastructure (loosely defined as the Python interpreter
and the standard library) may decide to grow its list of magic
attributes in future versions, user code should generally
refrain from using this convention for its own use. User code
that aspires to become part of the infrastructure could combine
this with a short prefix inside the underscores,
e.g. __bobo_magic_attr__.
attributes that live in user-controlled namespaces. E.g. __init__,
__import__ or __file__.
Prescriptive: Naming Conventions
Names to Avoid
Never use the characters `l' (lowercase letter el), `O'
(uppercase letter oh), or `I' (uppercase letter eye) as single
character variable names. In some fonts, these characters are
indistinguisable from the numerals one and zero. When tempted
to use `l' use `L' instead.
Never use the characters `l' (lowercase letter el), `O' (uppercase
letter oh), or `I' (uppercase letter eye) as single character variable
names.
In some fonts, these characters are indistinguishable from the numerals
one and zero. When tempted to use `l' use `L' instead.
Module Names
Modules should have short, lowercase names, without underscores.
Since module names are mapped to file names, and some file systems
are case insensitive and truncate long names, it is important that
module names be chosen to be fairly short -- this won't be a
problem on Unix, but it may be a problem when the code is
transported to Mac or Windows.
Since module names are mapped to file names, and some file systems are
case insensitive and truncate long names, it is important that module
names be chosen to be fairly short -- this won't be a problem on Unix,
but it may be a problem when the code is transported to Mac or Windows.
When an extension module written in C or C++ has an accompanying
Python module that provides a higher level (e.g. more object
oriented) interface, the C/C++ module has a leading underscore
(e.g. _socket).
When an extension module written in C or C++ has an accompanying Python
module that provides a higher level (e.g. more object oriented)
interface, the C/C++ module has a leading underscore (e.g. _socket).
Python packages should have short, all-lowercase names, without
underscores.
@ -451,137 +466,201 @@ Naming Conventions
Exception Names
If a module defines a single exception raised for all sorts of
conditions, it is generally called "error" or "Error". It seems
that built-in (extension) modules use "error" (e.g. os.error),
while Python modules generally use "Error" (e.g. xdrlib.Error).
The trend seems to be toward CapWords exception names.
Because exceptions should be classes, the class naming convention
applies here. However, you should use the suffix "Error" on your
exception names (if the exception actually is an error).
Global Variable Names
(Let's hope that these variables are meant for use inside one
module only.) The conventions are about the same as those for
functions. Modules that are designed for use via "from M import *"
should prefix their globals (and internal functions and classes)
with an underscore to prevent exporting them.
(Let's hope that these variables are meant for use inside one module
only.) The conventions are about the same as those for functions.
Modules that are designed for use via "from M import *" should use the
__all__ mechanism to prevent exporting globals, or use the older
convention of prefixing such globals with an underscore (which you might
want to do to indicate these globals are "module non-public").
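A rough demonstration of how `__all__` limits what "from M import *" exports. The module name `demo_mod` and its three globals are invented for this sketch, and the module is built in memory only so that the example is self-contained; a real module would simply be a file.

```python
import sys
import types

# Build a throwaway module in memory (hypothetical name "demo_mod").
demo = types.ModuleType("demo_mod")
exec(
    "__all__ = ['public_value']\n"
    "public_value = 1\n"
    "helper_value = 2\n"
    "_internal_value = 3\n",
    demo.__dict__,
)
sys.modules["demo_mod"] = demo

# Run a star-import in a fresh namespace and see which names arrive.
namespace = {}
exec("from demo_mod import *", namespace)
exported = sorted(name for name in namespace if not name.startswith("__"))
```

Only the name listed in `__all__` is exported. Without `__all__`, `helper_value` would also be exported, and only the underscore-prefixed name would be held back.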
Function Names
Function names should be lowercase, possibly with words separated by
underscores to improve readability. mixedCase is allowed only in
contexts where that's already the prevailing style (e.g. threading.py),
to retain backwards compatibility.
Function names should be lowercase, with words separated by underscores
as necessary to improve readability.
mixedCase is allowed only in contexts where that's already the
prevailing style (e.g. threading.py), to retain backwards compatibility.
Function and method arguments
Always use 'self' for the first argument to instance methods.
Always use 'cls' for the first argument to class methods.
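A minimal sketch of the two conventions above. The class `Widget` and its methods are hypothetical names used only for illustration.

```python
class Widget(object):
    """Hypothetical class used only to illustrate argument naming."""

    def __init__(self, name):
        # Instance methods take 'self' as their first argument.
        self.name = name

    def describe(self):
        return 'Widget(%s)' % self.name

    @classmethod
    def default(cls):
        # Class methods take 'cls' as their first argument, so that
        # subclasses get an instance of their own type back.
        return cls('default')


class FancyWidget(Widget):
    pass
```

Calling `FancyWidget.default()` returns a `FancyWidget`, because `cls` refers to the class the method was called on.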
If a function argument's name clashes with a reserved keyword, it is
generally better to append a single trailing underscore rather than use
an abbreviation or spelling corruption. Thus "print_" is better than
"prnt".
Method Names and Instance Variables
The story is largely the same as with functions: in general, use
lowercase with words separated by underscores as necessary to improve
readability.
Use the function naming rules: lowercase with words separated by
underscores as necessary to improve readability.
Use one leading underscore only for internal methods and instance
variables which are not intended to be part of the class's public
interface. Python does not enforce this; it is up to programmers to
respect the convention.
Use one leading underscore only for non-public methods and instance
variables.
Use two leading underscores to denote class-private names. Python
"mangles" these names with the class name: if class Foo has an
To avoid name clashes with subclasses, use two leading underscores to
invoke Python's name mangling rules.
Python mangles these names with the class name: if class Foo has an
attribute named __a, it cannot be accessed by Foo.__a. (An insistent
user could still gain access by calling Foo._Foo__a.) Generally,
double leading underscores should be used only to avoid name conflicts
with attributes in classes designed to be subclassed.
user could still gain access by calling Foo._Foo__a.) Generally, double
leading underscores should be used only to avoid name conflicts with
attributes in classes designed to be subclassed.
Note: there is some controversy about the use of __names (see below).
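The mangling behavior described above can be seen in a short sketch. `Foo` and `__a` come from the text; the subclass `Bar` and the accessor methods are assumptions added for the example.

```python
class Foo(object):
    def __init__(self):
        self.__a = 'base'      # stored on the instance as _Foo__a

    def base_value(self):
        return self.__a


class Bar(Foo):
    def __init__(self):
        Foo.__init__(self)
        self.__a = 'derived'   # stored as _Bar__a, no clash with Foo

    def derived_value(self):
        return self.__a


bar = Bar()
```

Both values survive side by side on the same instance, and an insistent user can still reach the base class value as `bar._Foo__a`.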
Designing for inheritance
Always decide whether a class's methods and instance variables
should be public or non-public. In general, never make data
variables public unless you're implementing essentially a
record. It's almost always preferrable to give a functional
interface to your class instead (and some Python 2.2
developments will make this much nicer).
(collectively: "attributes") should be public or non-public. If in
doubt, choose non-public; it's easier to make it public later than to
make a public attribute non-public.
Also decide whether your attributes should be private or not.
The difference between private and non-public is that the former
will never be useful for a derived class, while the latter might
be. Yes, you should design your classes with inheritence in
mind!
Public attributes are those that you expect unrelated clients of your
class to use, with your commitment to avoid backward incompatible
changes. Non-public attributes are those that are not intended to be
used by third parties; you make no guarantees that non-public attributes
won't change or even be removed.
Private attributes should have two leading underscores, no
trailing underscores.
We don't use the term "private" here, since no attribute is really
private in Python (without a generally unnecessary amount of work).
Non-public attributes should have a single leading underscore,
no trailing underscores.
Another category of attributes are those that are part of the "subclass
API" (often called "protected" in other languages). Some classes are
designed to be inherited from, either to extend or modify aspects of the
class's behavior. When designing such a class, take care to make
explicit decisions about which attributes are public, which are part of
the subclass API, and which are truly only to be used by your base
class.
Public attributes should have no leading or trailing
underscores, unless they conflict with reserved words, in which
case, a single trailing underscore is preferrable to a leading
one, or a corrupted spelling, e.g. class_ rather than klass.
(This last point is a bit controversial; if you prefer klass
over class_ then just be consistent. :).
With this in mind, here are the Pythonic guidelines:
- Public attributes should have no leading underscores.
- If your public attribute name collides with a reserved keyword, append
a single trailing underscore to your attribute name. This is
preferable to an abbreviation or corrupted spelling. E.g. "class_"
is preferable to "cls" or "klass".
Note 1: See the argument name recommendation above for class methods.
- For simple public data attributes, it is best to expose just the
attribute name, without complicated accessor/mutator methods. Keep in
mind that Python provides an easy path to future enhancement, should
you find that a simple data attribute needs to grow functional
behavior. In that case, use properties to hide functional
implementation behind simple data attribute access syntax.
Note 1: Properties only work on new-style classes.
Note 2: Try to keep the functional behavior side-effect free, although
side-effects such as caching are generally fine.
- If your class is intended to be subclassed, and you have attributes
that you do not want subclasses to use, consider naming them with
double leading underscores and no trailing underscores. This invokes
Python's name mangling algorithm, where the name of the class is
mangled into the attribute name. This helps avoid attribute name
collisions should subclasses inadvertently contain attributes with the
same name.
Note 1: Only the simple class name is used in the mangled name, so if a
subclass chooses both the same class name and attribute name, you can
still get name collisions.
Note 2: Name mangling can make certain uses, such as debugging and
__getattr__(), less convenient. However the name mangling algorithm
is well documented and easy to perform manually.
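As a sketch of the guideline on simple public data attributes: the hypothetical `Circle` class below starts with a plain `radius` attribute and later grows a computed `diameter` without changing how callers access it. The class and attribute names are illustrative only.

```python
class Circle(object):
    """Hypothetical new-style class; the names are illustrative only."""

    def __init__(self, radius):
        self.radius = radius          # simple public data attribute

    def _get_diameter(self):
        return self.radius * 2

    def _set_diameter(self, value):
        self.radius = value / 2.0

    # Callers keep using plain attribute syntax: circle.diameter
    diameter = property(_get_diameter, _set_diameter)


circle = Circle(3)
circle.diameter = 10
```

The accessor functions carry a single leading underscore because they are non-public; only `radius` and `diameter` form the public interface.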
Programming Recommendations
- Code should be written in a way that does not disadvantage other
implementations of Python (PyPy, Jython, IronPython, Pyrex, Psyco,
and such). For example, do not rely on CPython's efficient
implementation of in-place string concatenation for statements in
the form a+=b or a=a+b. Those statements run more slowly in
Jython. In performance sensitive parts of the library, the
''.join() form should be used instead. This will assure that
concatenation occurs in linear time across various implementations.
and such).
For example, do not rely on CPython's efficient implementation of
in-place string concatenation for statements in the form a+=b or a=a+b.
Those statements run more slowly in Jython. In performance sensitive
parts of the library, the ''.join() form should be used instead. This
will ensure that concatenation occurs in linear time across various
implementations.
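One way to follow this recommendation is to collect the fragments in a list
and join them once at the end. The function name build_report and its output
format are hypothetical, chosen only for this sketch:

```python
def build_report(lines):
    """Number each line, joining all fragments in a single pass."""
    parts = []
    for number, line in enumerate(lines):
        parts.append('%d: %s\n' % (number + 1, line))
    # One join at the end instead of repeated 'result += fragment'.
    return ''.join(parts)


report = build_report(['alpha', 'beta'])
```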
- Comparisons to singletons like None should always be done with
'is' or 'is not'. Also, beware of writing "if x" when you
really mean "if x is not None" -- e.g. when testing whether a
variable or argument that defaults to None was set to some other
value. The other value might be a value that's false in a
Boolean context!
'is' or 'is not', never the equality operators.
- Class-based exceptions are always preferred over string-based
exceptions. Modules or packages should define their own
domain-specific base exception class, which should be subclassed
from the built-in Exception class. Always include a class
docstring. E.g.:
Also, beware of writing "if x" when you really mean "if x is not None"
-- e.g. when testing whether a variable or argument that defaults to
None was set to some other value. The other value might be a value
that's false in a boolean context!
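The difference can be seen with an argument that defaults to None. The two
functions below are hypothetical and differ only in how they test the
argument:

```python
def total(values=None):
    """Return the sum of values, or None if no argument was supplied."""
    if values is not None:      # an empty list still counts as supplied
        return sum(values)
    return None


def total_buggy(values=None):
    """Same intent, but confuses 'empty' with 'not supplied'."""
    if values:                  # wrong: [] is false in a boolean context
        return sum(values)
    return None


explicit_empty = total([])          # the caller passed an empty list
mistaken = total_buggy([])          # treated as if nothing was passed
```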
- Use class-based exceptions.
String exceptions in new code are strongly discouraged, as they will
eventually be deprecated and then removed.
Modules or packages should define their own domain-specific base
exception class, which should be subclassed from the built-in Exception
class. Always include a class docstring. E.g.:
class MessageError(Exception):
"""Base class for errors in the email package."""
When raising an exception, use "raise ValueError('message')"
instead of the older form "raise ValueError, 'message'". The
paren-using form is preferred because when the exception
arguments are long or include string formatting, you don't need
to use line continuation characters thanks to the containing
parentheses. The older form will be removed in Python 3000.
Class naming conventions apply here, although you should add the suffix
"Error" to your exception classes, if the exception is an error.
Non-error exceptions need no special suffix.
- Use string methods instead of the string module unless
backward-compatibility with versions earlier than Python 2.0 is
important. String methods are always much faster and share the
same API with unicode strings.
- When raising an exception, use "raise ValueError('message')" instead of
the older form "raise ValueError, 'message'".
- Avoid slicing strings when checking for prefixes or suffixes.
Use startswith() and endswith() instead, since they are
cleaner and less error prone. For example:
The paren-using form is preferred because when the exception arguments
are long or include string formatting, you don't need to use line
continuation characters thanks to the containing parentheses. The older
form will be removed in Python 3000.
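The exception recommendations above might be combined as in the following
sketch. The 'config' package, the exception names, and get_option() are all
hypothetical; the point is the domain-specific base class, the "Error"
suffix, and a long formatted message that needs no backslash continuation:

```python
class ConfigError(Exception):
    """Base class for errors in a hypothetical 'config' package."""


class MissingOptionError(ConfigError):
    """Raised when a required option is absent."""


def get_option(options, name):
    """Return options[name], raising MissingOptionError if it is absent."""
    if name not in options:
        # The parentheses let the long message span several lines.
        raise MissingOptionError(
            'option %r not found; known options are: %s'
            % (name, ', '.join(sorted(options))))
    return options[name]
```

Callers can then catch ConfigError to handle every error from the package
without also catching unrelated exceptions.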
- Use string methods instead of the string module.
String methods are always much faster and share the same API with
unicode strings. Override this rule if backward compatibility with
Pythons older than 2.0 is required.
- Use ''.startswith() and ''.endswith() instead of string slicing to check
for prefixes or suffixes.
startswith() and endswith() are cleaner and less error prone. For
example:
No: if foo[:3] == 'bar':
Yes: if foo.startswith('bar'):
The exception is if your code must work with Python 1.5.2 (but
let's hope not!).
No: if foo[:3] == 'bar':
The exception is if your code must work with Python 1.5.2 (but let's
hope not!).
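A typical slicing mistake is a slice length that does not match the prefix.
The helper names below are hypothetical and exist only to contrast the two
styles:

```python
def is_test_file_sliced(name):
    # Bug: the prefix has 5 characters but the slice takes only 4,
    # so this comparison can never be true.
    return name[:4] == 'test_'


def is_test_file(name):
    # No lengths to count, so the off-by-one cannot happen.
    return name.startswith('test_') and name.endswith('.py')


sliced = is_test_file_sliced('test_parser.py')
checked = is_test_file('test_parser.py')
```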
- Object type comparisons should always use isinstance() instead
of comparing types directly. E.g.
of comparing types directly.
No: if type(obj) is type(1):
Yes: if isinstance(obj, int):
When checking if an object is a string, keep in mind that it
might be a unicode string too! In Python 2.3, str and unicode
have a common base class, basestring, so you can do:
No: if type(obj) is type(1):
When checking if an object is a string, keep in mind that it might be a
unicode string too! In Python 2.3, str and unicode have a common base
class, basestring, so you can do:
if isinstance(obj, basestring):
In Python 2.2, the types module has the StringTypes type defined
for that purpose, e.g.:
In Python 2.2, the types module has the StringTypes type defined for
that purpose, e.g.:
from types import StringTypes
if isinstance(obj, StringTypes):
@ -593,19 +672,22 @@ Programming Recommendations
isinstance(obj, UnicodeType):
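One reason to prefer isinstance() is that a direct type comparison rejects
subclasses. The Celsius class below is a hypothetical int subclass used only
to show the difference:

```python
class Celsius(int):
    """Hypothetical int subclass used only for this illustration."""


temp = Celsius(21)

direct = type(temp) is type(1)              # exact type is Celsius, not int
via_isinstance = isinstance(temp, int)      # Celsius is a kind of int
```

Here the direct comparison fails even though temp behaves as an integer in
every other respect.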
- For sequences (strings, lists, tuples), use the fact that empty
sequences are false, so "if not seq" or "if seq" is preferable
to "if len(seq)" or "if not len(seq)".
sequences are false.
Yes: if not seq:
if seq:
No: if len(seq):
if not len(seq):
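A small sketch of the idiom, assuming a hypothetical helper named
first_or_default(). The same test works for strings, lists, and tuples:

```python
def first_or_default(seq, default=None):
    """Return the first item of seq, or default if seq is empty."""
    if not seq:         # true for '', [] and ()
        return default
    return seq[0]
```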
- Don't write string literals that rely on significant trailing
whitespace. Such trailing whitespace is visually
indistinguishable and some editors (or more recently,
reindent.py) will trim it.
whitespace. Such trailing whitespace is visually indistinguishable and
some editors (or more recently, reindent.py) will trim it.
- Don't compare boolean values to True or False using == (bool
types are new in Python 2.3):
- Don't compare boolean values to True or False using ==
No: if greeting == True:
Yes: if greeting:
No: if greeting == True:
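Besides being redundant, comparing with == True can give a different answer
from the plain truth test when the value is not actually a bool. The
variable names below are illustrative only:

```python
greeting = 'hello'

equals_true = (greeting == True)    # a non-empty string is not equal to True
is_truthy = bool(greeting)          # the value "if greeting:" actually tests
```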
References
@ -621,6 +703,8 @@ References
[5] Barry's GNU Mailman style guide
http://barry.warsaw.us/software/STYLEGUIDE.txt
[6] PEP 20, The Zen of Python
Copyright