PEP: 8
Title: Style Guide for Python Code
Version: $Revision$
Author: guido@python.org (Guido van Rossum),
    barry@zope.com (Barry Warsaw)
Status: Active
Type: Informational
Created: 05-Jul-2001
Post-History: 05-Jul-2001


Introduction

This document gives coding conventions for the Python code
comprising the standard library for the main Python distribution.
Please see the companion informational PEP describing style
guidelines for the C code in the C implementation of Python[1].

This document was adapted from Guido's original Python Style
Guide essay[2]. This PEP inherits that essay's incompleteness.


A Foolish Consistency is the Hobgoblin of Little Minds

A style guide is about consistency. Consistency with this style
guide is important. Consistency within a project is more
important. Consistency within one module or function is most
important.

But most importantly: know when to be inconsistent -- sometimes
the style guide just doesn't apply. When in doubt, use your best
judgement. Look at other examples and decide what looks best. And
don't hesitate to ask!

Two good reasons to break a particular rule:

(1) When applying the rule would make the code less readable, even
    for someone who is used to reading code that follows the rules.

(2) To be consistent with surrounding code that also breaks it
    (maybe for historic reasons) -- although this is also an
    opportunity to clean up someone else's mess (in true XP style).


Code lay-out

Indentation

Use the default of Emacs' Python-mode: 4 spaces for one
indentation level. For really old code that you don't want to
mess up, you can continue to use 8-space tabs. Emacs Python-mode
auto-detects the prevailing indentation level used in a file and
sets its indentation parameters accordingly.

Tabs or Spaces?

Never mix tabs and spaces. The most popular way of indenting
Python is with spaces only. The second-most popular way is with
tabs only. Code indented with a mixture of tabs and spaces should
be converted to using spaces exclusively. (In Emacs, select the
whole buffer and hit ESC-x untabify.) When invoked with the -t
option, the python command line interpreter issues warnings about
code that illegally mixes tabs and spaces. With -tt these
warnings become errors. These options are highly recommended!

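As a rough illustration of the rule above, the following sketch flags
lines whose own indentation contains both tabs and spaces. It is
narrower than the interpreter's -t check and is not how the
interpreter works; the function name find_mixed_indentation and the
sample text are hypothetical, and the code assumes a current Python 3
interpreter.

```python
import re


def find_mixed_indentation(source):
    """Return 1-based numbers of lines whose indentation mixes tabs and spaces."""
    mixed = []
    for number, line in enumerate(source.splitlines(), start=1):
        # The leading run of spaces and tabs is the indentation of the line.
        indent = re.match(r"[ \t]*", line).group(0)
        if " " in indent and "\t" in indent:
            mixed.append(number)
    return mixed


# Hypothetical sample: the fourth line is indented with two spaces and a tab.
sample = "if x:\n    y = 1\n\tz = 2\n  \tw = 3\n"
print(find_mixed_indentation(sample))
```

A file that uses tabs on some lines and spaces on others passes this
per-line check, so it complements the -t and -tt options rather than
replacing them.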
Maximum Line Length

There are still many devices around that are limited to 80
character lines. The default wrapping on such devices looks ugly.
Therefore, please limit all lines to a maximum of 79 characters
(Emacs wraps lines that are exactly 80 characters long).

The preferred way of wrapping long lines is by using Python's
implied line continuation inside parentheses, brackets and braces.
If necessary, you can add an extra pair of parentheses around an
expression, but sometimes using a backslash looks better. Make
sure to indent the continued line appropriately. Emacs
Python-mode does this right. Some examples:

    class Rectangle(Blob):

        def __init__(self, width, height,
                     color='black', emphasis=None, highlight=0):
            if width == 0 and height == 0 and \
               color == 'red' and emphasis == 'strong' or \
               highlight > 100:
                raise ValueError("sorry, you lose")
            if width == 0 and height == 0 and (color == 'red' or
                                               emphasis is None):
                raise ValueError("I don't think so")
            Blob.__init__(self, width, height,
                          color, emphasis, highlight)

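The 79-character limit is easy to check mechanically. The sketch
below is one possible way to do it, not a tool this guide prescribes;
find_long_lines and MAX_LINE_LENGTH are hypothetical names, and the
code assumes a current Python 3 interpreter.

```python
MAX_LINE_LENGTH = 79  # the limit recommended above


def find_long_lines(source, limit=MAX_LINE_LENGTH):
    """Return (line_number, length) pairs for lines longer than `limit`."""
    return [(number, len(line))
            for number, line in enumerate(source.splitlines(), start=1)
            if len(line) > limit]


# Hypothetical sample: the second line is 85 characters long.
sample = "short = 1\n" + "x = " + "1 + " * 20 + "1\n"
print(find_long_lines(sample))
```

Note that the list comprehension itself is wrapped with implied
continuation inside brackets, as recommended above.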
Blank Lines

Separate top-level function and class definitions with two blank
lines. Method definitions inside a class are separated by a
single blank line. Extra blank lines may be used (sparingly) to
separate groups of related functions. Blank lines may be omitted
between a bunch of related one-liners (e.g. a set of dummy
implementations).

When blank lines are used to separate method definitions, there is
also a blank line between the `class' line and the first method
definition.

Use blank lines in functions, sparingly, to indicate logical
sections.

Python accepts the control-L (i.e. ^L) form feed character as
whitespace; Emacs (and some printing facilities) treat this
character as a page separator, so you may use it to separate
pages of related sections of your file.


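The blank-line conventions above can be seen together in a small
module skeleton. Every name in it (rectangle_area, Shape, on_open,
on_close) is hypothetical and exists only to show the spacing; the
code assumes a current Python 3 interpreter.

```python
"""Hypothetical module used only to illustrate blank-line placement."""


def rectangle_area(width, height):
    return width * height


class Shape:

    def __init__(self, width, height):
        self.width = width
        self.height = height

    def area(self):
        return rectangle_area(self.width, self.height)


# Related one-liners (dummy implementations) may omit the blank lines.
def on_open(): pass
def on_close(): pass
```

Two blank lines separate the top-level definitions, one blank line
separates the methods, and one blank line follows the class line.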
Whitespace in Expressions and Statements

Pet Peeves

Guido hates whitespace in the following places:

- Immediately inside parentheses, brackets or braces, as in:
  "spam( ham[ 1 ], { eggs: 2 } )". Always write this as
  "spam(ham[1], {eggs: 2})".

- Immediately before a comma, semicolon, or colon, as in:
  "if x == 4 : print x , y ; x , y = y , x". Always write this as
  "if x == 4: print x, y; x, y = y, x".

- Immediately before the open parenthesis that starts the argument
  list of a function call, as in "spam (1)". Always write
  this as "spam(1)".

- Immediately before the open bracket that starts an indexing or
  slicing, as in: "dict ['key'] = list [index]". Always
  write this as "dict['key'] = list[index]".

- More than one space around an assignment (or other) operator to
  align it with another, as in:

      x             = 1
      y             = 2
      long_variable = 3

  Always write this as

      x = 1
      y = 2
      long_variable = 3

(Don't bother to argue with him on any of the above -- Guido's
grown accustomed to this style over 15 years.)


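The whitespace in the pet peeves above is purely a matter of style:
the parser discards it. One way to convince yourself is to compare
syntax trees with the standard ast module, as in this sketch. The
helper name same_program is hypothetical, and the code assumes a
current Python 3 interpreter.

```python
import ast


def same_program(first, second):
    """Return True if two source strings parse to the same syntax tree."""
    # ast.dump() omits line and column positions by default, so only the
    # structure of the two programs is compared.
    return ast.dump(ast.parse(first)) == ast.dump(ast.parse(second))


print(same_program("spam( ham[ 1 ], { eggs: 2 } )", "spam(ham[1], {eggs: 2})"))
print(same_program("spam (1)", "spam(1)"))
print(same_program("x             = 1", "x = 1"))
```

Because the interpreter accepts both spellings, only convention (and
reviewers) will catch the peeves listed above.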
Other Recommendations

- Always surround these binary operators with a single space on
  either side: assignment (=), comparisons (==, <, >, !=, <>, <=,
  >=, in, not in, is, is not), Booleans (and, or, not).

- Use your better judgment for the insertion of spaces around
  arithmetic operators. Always be consistent about whitespace on
  either side of a binary operator. Some examples:

      i = i+1
      submitted = submitted + 1
      x = x*2 - 1
      hypot2 = x*x + y*y
      c = (a+b) * (a-b)
      c = (a + b) * (a - b)

- Don't use spaces around the '=' sign when used to indicate a
  keyword argument or a default parameter value. For instance:

      def complex(real, imag=0.0):
          return magic(r=real, i=imag)


Comments

Comments that contradict the code are worse than no comments.
Always make a priority of keeping the comments up-to-date when the
code changes!

If a comment is a phrase or sentence, its first word should be
capitalized, unless it is an identifier that begins with a lower
case letter (never alter the case of identifiers!).

If a comment is short, the period at the end is best omitted.
Block comments generally consist of one or more paragraphs built
out of complete sentences, and each sentence should end in a
period.

You can use two spaces after a sentence-ending period.

As always when writing English, Strunk and White apply.

Python coders from non-English speaking countries: please write
your comments in English, unless you are 120% sure that the code
will never be read by people who don't speak your language.


Block Comments

Block comments generally apply to some (or all) code that follows
them, and are indented to the same level as that code. Each line
of a block comment starts with a # and a single space (unless it
is indented text inside the comment). Paragraphs inside a block
comment are separated by a line containing a single #. Block
comments are best surrounded by a blank line above and below them
(or two lines above and a single line below for a block comment at
the start of a new section of function definitions).

Inline Comments

An inline comment is a comment on the same line as a statement.
Inline comments should be used sparingly. Inline comments should
be separated by at least two spaces from the statement. They
should start with a # and a single space.

Inline comments are unnecessary and in fact distracting if they state
the obvious. Don't do this:

    x = x+1                 # Increment x

But sometimes, this is useful:

    x = x+1                 # Compensate for border


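The block and inline comment rules above are easier to see in
context. The function below is a made-up example (padded_width and
its "separator column" are hypothetical); only the comment layout
matters. It assumes a current Python 3 interpreter.

```python
def padded_width(width, border):
    total = width

    # The border is drawn on both the left and the right edge, so it
    # is counted twice.
    #
    # A second paragraph is separated from the first by a line that
    # contains only a single #.

    total = total + 2*border
    total = total + 1  # Compensate for the separator column
    return total


print(padded_width(10, 2))
```

The block comment sits at the indentation of the code it describes,
with a blank line above and below, and the inline comment is set off
from its statement by two spaces.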
Documentation Strings

Conventions for writing good documentation strings
(a.k.a. "docstrings") are immortalized in their own PEP[3].


Version Bookkeeping

If you have to have RCS or CVS crud in your source file, do it as
follows.

    __version__ = "$Revision$"
    # $Source$

These lines should be included after the module's docstring,
before any other code, separated by a blank line above and
below.


Naming Conventions

The naming conventions of Python's library are a bit of a mess, so
we'll never get this completely consistent -- nevertheless, here
are some guidelines.

Descriptive: Naming Styles

There are a lot of different naming styles. It helps to be able
to recognize what naming style is being used, independently from
what it is used for.

The following naming styles are commonly distinguished:

- x (single lowercase letter)

- X (single uppercase letter)

- lowercase

- lower_case_with_underscores

- UPPERCASE

- UPPER_CASE_WITH_UNDERSCORES

- CapitalizedWords (or CapWords)

- mixedCase (differs from CapitalizedWords by initial lowercase
  character!)

- Capitalized_Words_With_Underscores (ugly!)

There's also the style of using a short unique prefix to group
related names together. This is not used much in Python, but it
is mentioned for completeness. For example, the os.stat()
function returns a tuple whose items traditionally have names like
st_mode, st_size, st_mtime and so on. The X11 library uses a
leading X for all its public functions. (In Python, this style is
generally deemed unnecessary because attribute and method names
are prefixed with an object, and function names are prefixed with
a module name.)

In addition, the following special forms using leading or trailing
underscores are recognized (these can generally be combined with any
case convention):

- _single_leading_underscore: weak "internal use" indicator
  (e.g. "from M import *" does not import objects whose names
  start with an underscore).

- single_trailing_underscore_: used by convention to avoid
  conflicts with a Python keyword, e.g.
  "Tkinter.Toplevel(master, class_='ClassName')".

- __double_leading_underscore: class-private names as of Python 1.4.

- __double_leading_and_trailing_underscore__: "magic" objects or
  attributes that live in user-controlled namespaces,
  e.g. __init__, __import__ or __file__. Sometimes these are
  defined by the user to trigger certain magic behavior
  (e.g. operator overloading); sometimes these are inserted by the
  infrastructure for its own use or for debugging purposes. Since
  the infrastructure (loosely defined as the Python interpreter
  and the standard library) may decide to grow its list of magic
  attributes in future versions, user code should generally
  refrain from using this convention for its own use. User code
  that aspires to become part of the infrastructure could combine
  this with a short prefix inside the underscores,
  e.g. __bobo_magic_attr__.

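Two of the underscore forms listed above have effects that can be
observed directly. The sketch below builds a throwaway module in
memory to show what "from M import *" skips, and uses a trailing
underscore to dodge the keyword "class". The names demo_mod,
public_name, _internal_name and make_window are hypothetical, and the
code assumes a current Python 3 interpreter.

```python
import sys
import types

# Build a throwaway module in memory; "demo_mod" and its attributes are
# hypothetical names used only for this illustration.
demo_mod = types.ModuleType("demo_mod")
demo_mod.public_name = 1
demo_mod._internal_name = 2
sys.modules["demo_mod"] = demo_mod

# Run the star-import in a fresh namespace and look at what arrived.
namespace = {}
exec("from demo_mod import *", namespace)
print("public_name" in namespace)
print("_internal_name" in namespace)


def make_window(master, class_="Toplevel"):
    # "class" is a keyword, so the parameter carries a trailing underscore.
    return (master, class_)


print(make_window("root", class_="ClassName"))
```

The leading underscore is only a weak indicator: the name is still
reachable as demo_mod._internal_name.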
Prescriptive: Naming Conventions

Module Names

Module names can be either CapWords or lowercase. There is no
unambiguous convention to decide which to use. Modules that
export a single class (or a number of closely related classes,
plus some additional support) are often named in CapWords, with
the module name being the same as the class name (e.g. the
standard StringIO module). Modules that export a bunch of
functions are usually named in all lowercase.

Since module names are mapped to file names, and some file
systems are case insensitive and truncate long names, it is
important that module names be chosen to be fairly short and not
in conflict with other module names that only differ in the case
-- this won't be a problem on Unix, but it may be a problem when
the code is transported to Mac or Windows.

There is an emerging convention that when an extension module
written in C or C++ has an accompanying Python module that
provides a higher level (e.g. more object oriented) interface,
the Python module's name uses CapWords, while the C/C++ module is
named in all lowercase and has a leading underscore
(e.g. _socket).

Python packages generally have a short all lowercase name.

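The case-insensitivity problem described under Module Names can be
checked for ahead of time. The following sketch groups names that
would collide on a case-insensitive file system; find_case_conflicts
is a hypothetical helper and the lowercase "stringio" entry is
invented for the example. It assumes a current Python 3 interpreter.

```python
from collections import defaultdict


def find_case_conflicts(module_names):
    """Group module names that differ only in case."""
    groups = defaultdict(list)
    for name in module_names:
        # Names with the same lowercase form map to the same file name
        # on a case-insensitive file system.
        groups[name.lower()].append(name)
    return [sorted(names) for names in groups.values() if len(names) > 1]


print(find_case_conflicts(["StringIO", "stringio", "socket", "_socket"]))
```

Note that socket and _socket do not conflict, since they differ by
more than case.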
Class Names

Almost without exception, class names use the CapWords
convention. Classes for internal use have a leading underscore
in addition.

Exception Names

If a module defines a single exception raised for all sorts of
conditions, it is generally called "error" or "Error". It seems
that built-in (extension) modules use "error" (e.g. os.error),
while Python modules generally use "Error" (e.g. xdrlib.Error).

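A minimal sketch of the single-exception convention for a module
written in Python follows. The parse_port function is hypothetical
and stands in for whatever the module really does; the code assumes a
current Python 3 interpreter.

```python
class Error(Exception):
    """Single exception raised by this hypothetical module."""


def parse_port(text):
    """Convert text to a port number, raising Error on any problem."""
    try:
        port = int(text)
    except ValueError:
        raise Error("not a number: %r" % (text,))
    if not 0 < port < 65536:
        raise Error("port out of range: %d" % port)
    return port


print(parse_port("8080"))
```

Callers can then catch the one exception, written as module.Error,
for every failure the module reports.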
Function Names

Plain functions exported by a module can either use the CapWords
style or lowercase (or lower_case_with_underscores). There is
no strong preference, but it seems that the CapWords style is
used for functions that provide major functionality
(e.g. nstools.WorldOpen()), while lowercase is used more for
"utility" functions (e.g. pathhack.kos_root()).

Global Variable Names

(Let's hope that these variables are meant for use inside one
module only.) The conventions are about the same as those for
exported functions. Modules that are designed for use via "from
M import *" should prefix their globals (and internal functions
and classes) with an underscore to prevent exporting them.

Method Names

The story is largely the same as for functions. Use lowercase
for methods accessed by other classes or functions that are part
of the implementation of an object type. Use one leading
underscore for "internal" methods and instance variables when
there is no chance of a conflict with subclass or superclass
attributes or when a subclass might actually need access to
them. Use two leading underscores (class-private names,
enforced by Python 1.4) in those cases where it is important
that only the current class accesses an attribute. (But realize
that Python contains enough loopholes so that an insistent user
could gain access nevertheless, e.g. via the __dict__ attribute.)


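The difference between one and two leading underscores, and the
__dict__ loophole mentioned under Method Names, can be seen in a
short sketch. The Account class and its attributes are hypothetical,
and the code assumes a current Python 3 interpreter.

```python
class Account:

    def __init__(self, balance):
        # One underscore: internal by convention, still visible to subclasses.
        self._owner_note = "internal, but subclasses may use it"
        # Two underscores: class-private, stored as _Account__balance.
        self.__balance = balance

    def balance(self):
        return self.__balance


account = Account(100)
print(account.balance())
print(hasattr(account, "__balance"))
print(account.__dict__["_Account__balance"])
```

Outside the class the name __balance does not resolve, yet the value
is still reachable through __dict__ under its mangled name.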
References

[1] PEP 7, Style Guide for C Code, van Rossum

[2] http://www.python.org/doc/essays/styleguide.html

[3] PEP 257, Docstring Conventions, Goodger, van Rossum


Copyright

This document has been placed in the public domain.



Local Variables:
mode: indented-text
indent-tabs-mode: nil
End: