635 lines
24 KiB
Plaintext
635 lines
24 KiB
Plaintext
PEP: 8
|
||
Title: Style Guide for Python Code
|
||
Version: $Revision$
|
||
Author: guido@python.org (Guido van Rossum),
|
||
barry@python.org (Barry Warsaw)
|
||
Status: Active
|
||
Type: Informational
|
||
Created: 05-Jul-2001
|
||
Post-History: 05-Jul-2001
|
||
|
||
|
||
Introduction
|
||
|
||
This document gives coding conventions for the Python code
|
||
comprising the standard library for the main Python distribution.
|
||
Please see the companion informational PEP describing style
|
||
guidelines for the C code in the C implementation of Python[1].
|
||
|
||
This document was adapted from Guido's original Python Style
|
||
Guide essay[2], with some additions from Barry's style guide[5].
|
||
Where there's conflict, Guido's style rules for the purposes of
|
||
this PEP. This PEP may still be incomplete (in fact, it may never
|
||
be finished <wink>).
|
||
|
||
|
||
A Foolish Consistency is the Hobgoblin of Little Minds
|
||
|
||
A style guide is about consistency. Consistency with this style
|
||
guide is important. Consistency within a project is more
|
||
important. Consistency within one module or function is most
|
||
important.
|
||
|
||
But most importantly: know when to be inconsistent -- sometimes
|
||
the style guide just doesn't apply. When in doubt, use your best
|
||
judgement. Look at other examples and decide what looks best.
|
||
And don't hesitate to ask!
|
||
|
||
Two good reasons to break a particular rule:
|
||
|
||
(1) When applying the rule would make the code less readable, even
|
||
for someone who is used to reading code that follows the rules.
|
||
|
||
(2) To be consistent with surrounding code that also breaks it
|
||
(maybe for historic reasons) -- although this is also an
|
||
opportunity to clean up someone else's mess (in true XP style).
|
||
|
||
|
||
Code lay-out
|
||
|
||
Indentation
|
||
|
||
Use the default of Emacs' Python-mode: 4 spaces for one
|
||
indentation level. For really old code that you don't want to
|
||
mess up, you can continue to use 8-space tabs. Emacs Python-mode
|
||
auto-detects the prevailing indentation level used in a file and
|
||
sets its indentation parameters accordingly.
|
||
|
||
Tabs or Spaces?
|
||
|
||
Never mix tabs and spaces. The most popular way of indenting
|
||
Python is with spaces only. The second-most popular way is with
|
||
tabs only. Code indented with a mixture of tabs and spaces should
|
||
be converted to using spaces exclusively. (In Emacs, select the
|
||
whole buffer and hit ESC-x untabify.) When invoking the python
|
||
command line interpreter with the -t option, it issues warnings
|
||
about code that illegally mixes tabs and spaces. When using -tt
|
||
these warnings become errors. These options are highly
|
||
recommended!
|
||
|
||
For new projects, spaces-only are strongly recommended over tabs.
|
||
Most editors have features that make this easy to do. (In Emacs,
|
||
make sure indent-tabs-mode is nil).
|
||
|
||
Maximum Line Length
|
||
|
||
There are still many devices around that are limited to 80
|
||
character lines; plus, limiting windows to 80 characters makes it
|
||
possible to have several windows side-by-side. The default
|
||
wrapping on such devices looks ugly. Therefore, please limit all
|
||
lines to a maximum of 79 characters (Emacs wraps lines that are
|
||
exactly 80 characters long). For flowing long blocks of text
|
||
(docstrings or comments), limiting the length to 72 characters is
|
||
recommended.
|
||
|
||
The preferred way of wrapping long lines is by using Python's
|
||
implied line continuation inside parentheses, brackets and braces.
|
||
If necessary, you can add an extra pair of parentheses around an
|
||
expression, but sometimes using a backslash looks better. Make
|
||
sure to indent the continued line appropriately. Emacs
|
||
Python-mode does this right. Some examples:
|
||
|
||
class Rectangle(Blob):
|
||
|
||
def __init__(self, width, height,
|
||
color='black', emphasis=None, highlight=0):
|
||
if width == 0 and height == 0 and \
|
||
color == 'red' and emphasis == 'strong' or \
|
||
highlight > 100:
|
||
raise ValueError, "sorry, you lose"
|
||
if width == 0 and height == 0 and (color == 'red' or
|
||
emphasis is None):
|
||
raise ValueError, "I don't think so"
|
||
Blob.__init__(self, width, height,
|
||
color, emphasis, highlight)
|
||
|
||
Blank Lines
|
||
|
||
Separate top-level function and class definitions with two blank
|
||
lines. Method definitions inside a class are separated by a
|
||
single blank line. Extra blank lines may be used (sparingly) to
|
||
separate groups of related functions. Blank lines may be omitted
|
||
between a bunch of related one-liners (e.g. a set of dummy
|
||
implementations).
|
||
|
||
When blank lines are used to separate method definitions, there is
|
||
also a blank line between the `class' line and the first method
|
||
definition.
|
||
|
||
Use blank lines in functions, sparingly, to indicate logical
|
||
sections.
|
||
|
||
Python accepts the control-L (i.e. ^L) form feed character as
|
||
whitespace; Emacs (and some printing tools) treat these
|
||
characters as page separators, so you may use them to separate
|
||
pages of related sections of your file.
|
||
|
||
Encodings (PEP 263)
|
||
|
||
Code in the core Python distribution should aways use the ASCII or
|
||
Latin-1 encoding (a.k.a. ISO-8859-1). Files using ASCII should
|
||
not have a coding cookie. Latin-1 should only be used when a
|
||
comment or docstring needs to mention an author name that requires
|
||
Latin-1; otherwise, using \x escapes is the preferred way to
|
||
include non-ASCII data in string literals. An exception is made
|
||
for those files that are part of the test suite for the code
|
||
implementing PEP 263.
|
||
|
||
|
||
Imports
|
||
|
||
- Imports should usually be on separate lines, e.g.:
|
||
|
||
No: import sys, os
|
||
Yes: import sys
|
||
import os
|
||
|
||
it's okay to say this though:
|
||
|
||
from types import StringType, ListType
|
||
|
||
- Imports are always put at the top of the file, just after any
|
||
module comments and docstrings, and before module globals and
|
||
constants. Imports should be grouped, with the order being
|
||
|
||
1. standard library imports
|
||
2. related major package imports (i.e. all email package imports next)
|
||
3. application specific imports
|
||
|
||
You should put a blank line between each group of imports.
|
||
|
||
- Relative imports for intra-package imports are highly
|
||
discouraged. Always use the absolute package path for all
|
||
imports.
|
||
|
||
- When importing a class from a class-containing module, it's usually
|
||
okay to spell this
|
||
|
||
from MyClass import MyClass
|
||
from foo.bar.YourClass import YourClass
|
||
|
||
If this spelling causes local name clashes, then spell them
|
||
|
||
import MyClass
|
||
import foo.bar.YourClass
|
||
|
||
and use "MyClass.MyClass" and "foo.bar.YourClass.YourClass"
|
||
|
||
|
||
Whitespace in Expressions and Statements
|
||
|
||
Pet Peeves
|
||
|
||
Guido hates whitespace in the following places:
|
||
|
||
- Immediately inside parentheses, brackets or braces, as in:
|
||
"spam( ham[ 1 ], { eggs: 2 } )". Always write this as
|
||
"spam(ham[1], {eggs: 2})".
|
||
|
||
- Immediately before a comma, semicolon, or colon, as in:
|
||
"if x == 4 : print x , y ; x , y = y , x". Always write this as
|
||
"if x == 4: print x, y; x, y = y, x".
|
||
|
||
- Immediately before the open parenthesis that starts the argument
|
||
list of a function call, as in "spam (1)". Always write
|
||
this as "spam(1)".
|
||
|
||
- Immediately before the open parenthesis that starts an indexing or
|
||
slicing, as in: "dict ['key'] = list [index]". Always
|
||
write this as "dict['key'] = list[index]".
|
||
|
||
- More than one space around an assignment (or other) operator to
|
||
align it with another, as in:
|
||
|
||
x = 1
|
||
y = 2
|
||
long_variable = 3
|
||
|
||
Always write this as
|
||
|
||
x = 1
|
||
y = 2
|
||
long_variable = 3
|
||
|
||
(Don't bother to argue with him on any of the above -- Guido's
|
||
grown accustomed to this style over 20 years.)
|
||
|
||
|
||
Other Recommendations
|
||
|
||
- Always surround these binary operators with a single space on
|
||
either side: assignment (=), comparisons (==, <, >, !=, <>, <=,
|
||
>=, in, not in, is, is not), Booleans (and, or, not).
|
||
|
||
- Use your better judgment for the insertion of spaces around
|
||
arithmetic operators. Always be consistent about whitespace on
|
||
either side of a binary operator. Some examples:
|
||
|
||
i = i+1
|
||
submitted = submitted + 1
|
||
x = x*2 - 1
|
||
hypot2 = x*x + y*y
|
||
c = (a+b) * (a-b)
|
||
c = (a + b) * (a - b)
|
||
|
||
- Don't use spaces around the '=' sign when used to indicate a
|
||
keyword argument or a default parameter value. For instance:
|
||
|
||
def complex(real, imag=0.0):
|
||
return magic(r=real, i=imag)
|
||
|
||
- Compound statements (multiple statements on the same line) are
|
||
generally discouraged.
|
||
|
||
No: if foo == 'blah': do_blah_thing()
|
||
Yes: if foo == 'blah':
|
||
do_blah_thing()
|
||
|
||
No: do_one(); do_two(); do_three()
|
||
Yes: do_one()
|
||
do_two()
|
||
do_three()
|
||
|
||
Comments
|
||
|
||
Comments that contradict the code are worse than no comments.
|
||
Always make a priority of keeping the comments up-to-date when the
|
||
code changes!
|
||
|
||
Comments should be complete sentences. If a comment is a phrase
|
||
or sentence, its first word should be capitalized, unless it is an
|
||
identifier that begins with a lower case letter (never alter the
|
||
case of identifiers!).
|
||
|
||
If a comment is short, the period at the end is best omitted.
|
||
Block comments generally consist of one or more paragraphs built
|
||
out of complete sentences, and each sentence should end in a
|
||
period.
|
||
|
||
You should use two spaces after a sentence-ending period, since it
|
||
makes Emacs wrapping and filling work consistenty.
|
||
|
||
When writing English, Strunk and White apply.
|
||
|
||
Python coders from non-English speaking countries: please write
|
||
your comments in English, unless you are 120% sure that the code
|
||
will never be read by people who don't speak your language.
|
||
|
||
|
||
Block Comments
|
||
|
||
Block comments generally apply to some (or all) code that follows
|
||
them, and are indented to the same level as that code. Each line
|
||
of a block comment starts with a # and a single space (unless it
|
||
is indented text inside the comment). Paragraphs inside a block
|
||
comment are separated by a line containing a single #. Block
|
||
comments are best surrounded by a blank line above and below them
|
||
(or two lines above and a single line below for a block comment at
|
||
the start of a a new section of function definitions).
|
||
|
||
Inline Comments
|
||
|
||
An inline comment is a comment on the same line as a statement.
|
||
Inline comments should be used sparingly. Inline comments should
|
||
be separated by at least two spaces from the statement. They
|
||
should start with a # and a single space.
|
||
|
||
Inline comments are unnecessary and in fact distracting if they state
|
||
the obvious. Don't do this:
|
||
|
||
x = x+1 # Increment x
|
||
|
||
But sometimes, this is useful:
|
||
|
||
x = x+1 # Compensate for border
|
||
|
||
|
||
Documentation Strings
|
||
|
||
Conventions for writing good documentation strings
|
||
(a.k.a. "docstrings") are immortalized in PEP 257 [3].
|
||
|
||
- Write docstrings for all public modules, functions, classes, and
|
||
methods. Docstrings are not necessary for non-public methods,
|
||
but you should have a comment that describes what the method
|
||
does. This comment should appear after the "def" line.
|
||
|
||
- PEP 257 describes good docstring conventions. Note that most
|
||
importantly, the """ that ends a multiline docstring should be
|
||
on a line by itself, e.g.:
|
||
|
||
"""Return a foobang
|
||
|
||
Optional plotz says to frobnicate the bizbaz first.
|
||
"""
|
||
|
||
- For one liner docstrings, it's okay to keep the closing """ on
|
||
the same line.
|
||
|
||
|
||
Version Bookkeeping
|
||
|
||
If you have to have RCS or CVS crud in your source file, do it as
|
||
follows.
|
||
|
||
__version__ = "$Revision$"
|
||
# $Source$
|
||
|
||
These lines should be included after the module's docstring,
|
||
before any other code, separated by a blank line above and
|
||
below.
|
||
|
||
|
||
Naming Conventions
|
||
|
||
The naming conventions of Python's library are a bit of a mess, so we'll
|
||
never get this completely consistent -- nevertheless, here are the
|
||
currently recommended naming standards. New modules and packages
|
||
(including 3rd party frameworks) should be written to these standards, but
|
||
where an existing library has a different style, internal consistency is
|
||
preferred.
|
||
|
||
Descriptive: Naming Styles
|
||
|
||
There are a lot of different naming styles. It helps to be able
|
||
to recognize what naming style is being used, independently from
|
||
what they are used for.
|
||
|
||
The following naming styles are commonly distinguished:
|
||
|
||
- b (single lowercase letter)
|
||
|
||
- B (single uppercase letter)
|
||
|
||
- lowercase
|
||
|
||
- lower_case_with_underscores
|
||
|
||
- UPPERCASE
|
||
|
||
- UPPER_CASE_WITH_UNDERSCORES
|
||
|
||
- CapitalizedWords (or CapWords, or CamelCase -- so named because
|
||
of the bumpy look of its letters[4]). This is also sometimes known as
|
||
StudlyCaps.
|
||
|
||
- mixedCase (differs from CapitalizedWords by initial lowercase
|
||
character!)
|
||
|
||
- Capitalized_Words_With_Underscores (ugly!)
|
||
|
||
There's also the style of using a short unique prefix to group
|
||
related names together. This is not used much in Python, but it
|
||
is mentioned for completeness. For example, the os.stat()
|
||
function returns a tuple whose items traditionally have names like
|
||
st_mode, st_size, st_mtime and so on. The X11 library uses a
|
||
leading X for all its public functions. (In Python, this style is
|
||
generally deemed unnecessary because attribute and method names
|
||
are prefixed with an object, and function names are prefixed with
|
||
a module name.)
|
||
|
||
In addition, the following special forms using leading or trailing
|
||
underscores are recognized (these can generally be combined with any
|
||
case convention):
|
||
|
||
- _single_leading_underscore: weak "internal use" indicator
|
||
(e.g. "from M import *" does not import objects whose name
|
||
starts with an underscore).
|
||
|
||
- single_trailing_underscore_: used by convention to avoid
|
||
conflicts with Python keyword, e.g.
|
||
"Tkinter.Toplevel(master, class_='ClassName')".
|
||
|
||
- __double_leading_underscore: class-private names as of Python 1.4.
|
||
|
||
- __double_leading_and_trailing_underscore__: "magic" objects or
|
||
attributes that live in user-controlled namespaces,
|
||
e.g. __init__, __import__ or __file__. Sometimes these are
|
||
defined by the user to trigger certain magic behavior
|
||
(e.g. operator overloading); sometimes these are inserted by the
|
||
infrastructure for its own use or for debugging purposes. Since
|
||
the infrastructure (loosely defined as the Python interpreter
|
||
and the standard library) may decide to grow its list of magic
|
||
attributes in future versions, user code should generally
|
||
refrain from using this convention for its own use. User code
|
||
that aspires to become part of the infrastructure could combine
|
||
this with a short prefix inside the underscores,
|
||
e.g. __bobo_magic_attr__.
|
||
|
||
Prescriptive: Naming Conventions
|
||
|
||
Names to Avoid
|
||
|
||
Never use the characters `l' (lowercase letter el), `O'
|
||
(uppercase letter oh), or `I' (uppercase letter eye) as single
|
||
character variable names. In some fonts, these characters are
|
||
indistinguisable from the numerals one and zero. When tempted
|
||
to use `l' use `L' instead.
|
||
|
||
Module Names
|
||
|
||
Modules should have short, lowercase names, without underscores.
|
||
|
||
Since module names are mapped to file names, and some file systems
|
||
are case insensitive and truncate long names, it is important that
|
||
module names be chosen to be fairly short -- this won't be a
|
||
problem on Unix, but it may be a problem when the code is
|
||
transported to Mac or Windows.
|
||
|
||
When an extension module written in C or C++ has an accompanying
|
||
Python module that provides a higher level (e.g. more object
|
||
oriented) interface, the C/C++ module has a leading underscore
|
||
(e.g. _socket).
|
||
|
||
Python packages should have short, all-lowercase names, without
|
||
underscores.
|
||
|
||
Class Names
|
||
|
||
Almost without exception, class names use the CapWords convention.
|
||
Classes for internal use have a leading underscore in addition.
|
||
|
||
Exception Names
|
||
|
||
If a module defines a single exception raised for all sorts of
|
||
conditions, it is generally called "error" or "Error". It seems
|
||
that built-in (extension) modules use "error" (e.g. os.error),
|
||
while Python modules generally use "Error" (e.g. xdrlib.Error).
|
||
The trend seems to be toward CapWords exception names.
|
||
|
||
Global Variable Names
|
||
|
||
(Let's hope that these variables are meant for use inside one
|
||
module only.) The conventions are about the same as those for
|
||
functions. Modules that are designed for use via "from M import *"
|
||
should prefix their globals (and internal functions and classes)
|
||
with an underscore to prevent exporting them.
|
||
|
||
Function Names
|
||
|
||
Function names should be lowercase, possibly with words separated by
|
||
underscores to improve readability. mixedCase is allowed only in
|
||
contexts where that's already the prevailing style (e.g. threading.py),
|
||
to retain backwards compatibility.
|
||
|
||
Method Names and Instance Variables
|
||
|
||
The story is largely the same as with functions: in general, use
|
||
lowercase with words separated by underscores as necessary to improve
|
||
readability.
|
||
|
||
Use one leading underscore only for internal methods and instance
|
||
variables which are not intended to be part of the class's public
|
||
interface. Python does not enforce this; it is up to programmers to
|
||
respect the convention.
|
||
|
||
Use two leading underscores to denote class-private names. Python
|
||
"mangles" these names with the class name: if class Foo has an
|
||
attribute named __a, it cannot be accessed by Foo.__a. (An insistent
|
||
user could still gain access by calling Foo._Foo__a.) Generally,
|
||
double leading underscores should be used only to avoid name conflicts
|
||
with attributes in classes designed to be subclassed.
|
||
|
||
Designing for inheritance
|
||
|
||
Always decide whether a class's methods and instance variables
|
||
should be public or non-public. In general, never make data
|
||
variables public unless you're implementing essentially a
|
||
record. It's almost always preferrable to give a functional
|
||
interface to your class instead (and some Python 2.2
|
||
developments will make this much nicer).
|
||
|
||
Also decide whether your attributes should be private or not.
|
||
The difference between private and non-public is that the former
|
||
will never be useful for a derived class, while the latter might
|
||
be. Yes, you should design your classes with inheritence in
|
||
mind!
|
||
|
||
Private attributes should have two leading underscores, no
|
||
trailing underscores.
|
||
|
||
Non-public attributes should have a single leading underscore,
|
||
no trailing underscores.
|
||
|
||
Public attributes should have no leading or trailing
|
||
underscores, unless they conflict with reserved words, in which
|
||
case, a single trailing underscore is preferrable to a leading
|
||
one, or a corrupted spelling, e.g. class_ rather than klass.
|
||
(This last point is a bit controversial; if you prefer klass
|
||
over class_ then just be consistent. :).
|
||
|
||
|
||
Programming Recommendations
|
||
|
||
- Code should be written in a way that does not disadvantage other
|
||
implementations of Python (PyPy, Jython, IronPython, Pyrex, Psyco,
|
||
and such). For example, do not rely on CPython's efficient
|
||
implementation of in-place string concatenation for statements in
|
||
the form a+=b or a=a+b. Those statements run more slowly in
|
||
Jython. In performance sensitive parts of the library, the
|
||
''.join() form should be used instead. This will assure that
|
||
concatenation occurs in linear time across various implementations.
|
||
|
||
- Comparisons to singletons like None should always be done with
|
||
'is' or 'is not'. Also, beware of writing "if x" when you
|
||
really mean "if x is not None" -- e.g. when testing whether a
|
||
variable or argument that defaults to None was set to some other
|
||
value. The other value might be a value that's false in a
|
||
Boolean context!
|
||
|
||
- Class-based exceptions are always preferred over string-based
|
||
exceptions. Modules or packages should define their own
|
||
domain-specific base exception class, which should be subclassed
|
||
from the built-in Exception class. Always include a class
|
||
docstring. E.g.:
|
||
|
||
class MessageError(Exception):
|
||
"""Base class for errors in the email package."""
|
||
|
||
When raising an exception, use "raise ValueError('message')"
|
||
instead of the older form "raise ValueError, 'message'". The
|
||
paren-using form is preferred because when the exception
|
||
arguments are long or include string formatting, you don't need
|
||
to use line continuation characters thanks to the containing
|
||
parentheses. The older form will be removed in Python 3000.
|
||
|
||
- Use string methods instead of the string module unless
|
||
backward-compatibility with versions earlier than Python 2.0 is
|
||
important. String methods are always much faster and share the
|
||
same API with unicode strings.
|
||
|
||
- Avoid slicing strings when checking for prefixes or suffixes.
|
||
Use startswith() and endswith() instead, since they are
|
||
cleaner and less error prone. For example:
|
||
|
||
No: if foo[:3] == 'bar':
|
||
Yes: if foo.startswith('bar'):
|
||
|
||
The exception is if your code must work with Python 1.5.2 (but
|
||
let's hope not!).
|
||
|
||
- Object type comparisons should always use isinstance() instead
|
||
of comparing types directly. E.g.
|
||
|
||
No: if type(obj) is type(1):
|
||
Yes: if isinstance(obj, int):
|
||
|
||
When checking if an object is a string, keep in mind that it
|
||
might be a unicode string too! In Python 2.3, str and unicode
|
||
have a common base class, basestring, so you can do:
|
||
|
||
if isinstance(obj, basestring):
|
||
|
||
In Python 2.2, the types module has the StringTypes type defined
|
||
for that purpose, e.g.:
|
||
|
||
from types import StringTypes
|
||
if isinstance(obj, StringTypes):
|
||
|
||
In Python 2.0 and 2.1, you should do:
|
||
|
||
from types import StringType, UnicodeType
|
||
if isinstance(obj, StringType) or \
|
||
isinstance(obj, UnicodeType) :
|
||
|
||
- For sequences, (strings, lists, tuples), use the fact that empty
|
||
sequences are false, so "if not seq" or "if seq" is preferable
|
||
to "if len(seq)" or "if not len(seq)".
|
||
|
||
- Don't write string literals that rely on significant trailing
|
||
whitespace. Such trailing whitespace is visually
|
||
indistinguishable and some editors (or more recently,
|
||
reindent.py) will trim them.
|
||
|
||
- Don't compare boolean values to True or False using == (bool
|
||
types are new in Python 2.3):
|
||
|
||
No: if greeting == True:
|
||
Yes: if greeting:
|
||
|
||
|
||
References
|
||
|
||
[1] PEP 7, Style Guide for C Code, van Rossum
|
||
|
||
[2] http://www.python.org/doc/essays/styleguide.html
|
||
|
||
[3] PEP 257, Docstring Conventions, Goodger, van Rossum
|
||
|
||
[4] http://www.wikipedia.com/wiki/CamelCase
|
||
|
||
[5] Barry's GNU Mailman style guide
|
||
http://barry.warsaw.us/software/STYLEGUIDE.txt
|
||
|
||
|
||
Copyright
|
||
|
||
This document has been placed in the public domain.
|
||
|
||
|
||
|
||
Local Variables:
|
||
mode: indented-text
|
||
indent-tabs-mode: nil
|
||
End:
|