diff --git a/pep-0008.txt b/pep-0008.txt
new file mode 100644
index 000000000..c2fb303f1
--- /dev/null
+++ b/pep-0008.txt
@@ -0,0 +1,401 @@
+PEP: 8
+Title: Style Guide for Python Code
+Version: $Revision$
+Author: guido@python.org (Guido van Rossum),
+    barry@digicool.com (Barry Warsaw)
+Status: Active
+Type: Informational
+Created: 05-Jul-2001
+Post-History:
+
+
+Introduction
+
+    This document gives coding conventions for the Python code
+    comprising the standard library for the main Python distribution.
+    Please see the companion informational PEP describing style
+    guidelines for the C code in the C implementation of Python[1].
+
+    Note, rules are there to be broken. Two good reasons to break a
+    particular rule:
+
+    (1) When applying the rule would make the code less readable, even
+        for someone who is used to reading code that follows the rules.
+
+    (2) To be consistent with surrounding code that also breaks it
+        (maybe for historic reasons) -- although this is also an
+        opportunity to clean up someone else's mess (in true XP style).
+
+    This document was adapted from Guido's original Python Style
+    Guide essay[2]. This PEP inherits that essay's incompleteness.
+
+
+Code lay-out
+
+  Indentation
+
+    Use the default of Emacs' Python-mode: 4 spaces for one
+    indentation level. For really old code that you don't want to
+    mess up, you can continue to use 8-space tabs. Emacs Python-mode
+    auto-detects the prevailing indentation level used in a file and
+    sets its indentation parameters accordingly.
+
+  Tabs or Spaces?
+
+    Never mix tabs and spaces. The most popular way of indenting
+    Python is with spaces only. The second-most popular way is with
+    tabs only. Code indented with a mixture of tabs and spaces should
+    be converted to using spaces exclusively. (In Emacs, select the
+    whole buffer and hit ESC-x untabify.) When invoking the python
+    command line interpreter with the -t option, it issues warnings
+    about code that illegally mixes tabs and spaces.
+    When using -tt these warnings become errors. These options are highly
+    recommended!
+
+  Maximum Line Length
+
+    There are still many devices around that are limited to 80
+    character lines. The default wrapping on such devices looks ugly.
+    Therefore, please limit all lines to a maximum of 79 characters
+    (Emacs wraps lines that are exactly 80 characters long).
+
+    The preferred way of wrapping long lines is by using Python's
+    implied line continuation inside parentheses, brackets and braces.
+    If necessary, you can add an extra pair of parentheses around an
+    expression, but sometimes using a backslash looks better. Make
+    sure to indent the continued line appropriately. Emacs
+    Python-mode does this right. Some examples:
+
+        class Rectangle(Blob):
+
+            def __init__(self, width, height,
+                         color='black', emphasis=None, highlight=0):
+                if width == 0 and height == 0 and \
+                   color == 'red' and emphasis == 'strong' or \
+                   highlight > 100:
+                    raise ValueError, "sorry, you lose"
+                if width == 0 and height == 0 and (color == 'red' or
+                                                   emphasis is None):
+                    raise ValueError, "I don't think so"
+                Blob.__init__(self, width, height,
+                              color, emphasis, highlight)
+
+  Blank Lines
+
+    Separate top-level function and class definitions with two blank
+    lines. Method definitions inside a class are separated by a
+    single blank line. Extra blank lines may be used (sparingly) to
+    separate groups of related functions. Blank lines may be omitted
+    between a bunch of related one-liners (e.g. a set of dummy
+    implementations).
+
+    When blank lines are used to separate method definitions, there is
+    also a blank line between the `class' line and the first method
+    definition.
+
+    Use blank lines in functions, sparingly, to indicate logical
+    sections.
+
+    Python accepts the control-L (i.e. ^L) form feed character as
+    whitespace; Emacs (and some printing facilities) treat these
+    characters as page separators, so you may use them to separate
+    pages of related sections of your file.
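The wrapping advice under Maximum Line Length can be sketched as follows. This is only an illustration, not part of the PEP: the function name check_dimensions and its validation rules are hypothetical. It uses the call form of raise, because the comma form in the PEP's own sample is accepted only by Python 2.

```python
# Hypothetical helper, loosely modelled on the PEP's Rectangle sample.
# The name and the validation rules are assumptions for illustration.

def check_dimensions(width, height, color='black', emphasis=None,
                     highlight=0):
    """Return a short description, or raise ValueError for bad input."""
    # Implied continuation: the extra pair of parentheses lets the
    # condition span several lines without a trailing backslash.
    if (width == 0 and height == 0
            and (color == 'red' or emphasis is None)):
        raise ValueError("I don't think so")
    # Backslash continuation: equivalent here, but fragile, since any
    # character after the backslash (even a space) is a syntax error.
    if width < 0 or \
       height < 0:
        raise ValueError("negative dimension")
    return "%dx%d %s" % (width, height, color)
```

Both conditions stay well under 79 characters per line. The parenthesized form is usually easier to edit later, since lines can be added or removed without touching any backslashes.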
+
+
+Whitespace in Expressions and Statements
+
+  Pet Peeves
+
+    Guido hates whitespace in the following places:
+
+    - Immediately inside parentheses, brackets or braces, as in:
+      "spam( ham[ 1 ], { eggs: 2 } )". Always write this as
+      "spam(ham[1], {eggs: 2})".
+
+    - Immediately before a comma, semicolon, or colon, as in:
+      "if x == 4 : print x , y ; x , y = y , x". Always write this as
+      "if x == 4: print x, y; x, y = y, x".
+
+    - Immediately before the open parenthesis that starts the argument
+      list of a function call, as in "spam (1)". Always write
+      this as "spam(1)".
+
+    - Immediately before the open bracket that starts an indexing or
+      slicing, as in: "dict ['key'] = list [index]". Always
+      write this as "dict['key'] = list[index]".
+
+    - More than one space around an assignment (or other) operator to
+      align it with another, as in:
+
+          x             = 1
+          y             = 2
+          long_variable = 3
+
+      Always write this as
+
+          x = 1
+          y = 2
+          long_variable = 3
+
+    (Don't bother to argue with him on any of the above -- Guido's
+    grown accustomed to this style over 15 years.)
+
+
+  Other Recommendations
+
+    - Always surround these binary operators with a single space on
+      either side: assignment (=), comparisons (==, <, >, !=, <>, <=,
+      >=, in, not in, is, is not), Booleans (and, or, not).
+
+    - Use your better judgment for the insertion of spaces around
+      arithmetic operators. Always be consistent about whitespace on
+      either side of a binary operator. Some examples:
+
+          i = i+1
+          submitted = submitted + 1
+          x = x*2 - 1
+          hypot2 = x*x + y*y
+          c = (a+b) * (a-b)
+          c = (a + b) * (a - b)
+
+    - Don't use spaces around the '=' sign when used to indicate a
+      keyword argument or a default parameter value. For instance:
+
+          def complex(real, imag=0.0):
+              return magic(r=real, i=imag)
+
+
+Comments
+
+    Comments that contradict the code are worse than no comments.
+    Always make a priority of keeping the comments up-to-date when the
+    code changes!
+
+    If a comment is a phrase or sentence, its first word should be
+    capitalized, unless it is an identifier that begins with a lower
+    case letter (never alter the case of identifiers!).
+
+    If a comment is short, the period at the end is best omitted.
+    Block comments generally consist of one or more paragraphs built
+    out of complete sentences, and each sentence should end in a
+    period.
+
+    You can use two spaces after a sentence-ending period.
+
+    As always when writing English, Strunk and White apply.
+
+    Python coders from non-English speaking countries: please write
+    your comments in English, unless you are 120% sure that the code
+    will never be read by people who don't speak your language.
+
+
+  Block Comments
+
+    Block comments generally apply to some (or all) code that follows
+    them, and are indented to the same level as that code. Each line
+    of a block comment starts with a # and a single space (unless it
+    is indented text inside the comment). Paragraphs inside a block
+    comment are separated by a line containing a single #. Block
+    comments are best surrounded by a blank line above and below them
+    (or two lines above and a single line below for a block comment at
+    the start of a new section of function definitions).
+
+  Inline Comments
+
+    An inline comment is a comment on the same line as a statement.
+    Inline comments should be used sparingly. Inline comments should
+    be separated by at least two spaces from the statement. They
+    should start with a # and a single space.
+
+    Inline comments are unnecessary and in fact distracting if they state
+    the obvious. Don't do this:
+
+        x = x+1  # Increment x
+
+    But sometimes, this is useful:
+
+        x = x+1  # Compensate for border
+
+
+Documentation Strings
+
+    Conventions for writing good documentation strings
+    (a.k.a. "docstrings") are immortalized in their own PEP[3].
+
+
+Version Bookkeeping
+
+    If you have to have RCS or CVS crud in your source file, do it as
+    follows.
+
+        __version__ = "$Revision$"
+        # $Source$
+
+    These lines should be included after the module's docstring,
+    before any other code, separated by a blank line above and
+    below.
+
+
+Naming Conventions
+
+    The naming conventions of Python's library are a bit of a mess, so
+    we'll never get this completely consistent -- nevertheless, here
+    are some guidelines.
+
+  Descriptive: Naming Styles
+
+    There are a lot of different naming styles. It helps to be able
+    to recognize what naming style is being used, independently of
+    what it is used for.
+
+    The following naming styles are commonly distinguished:
+
+    - x (single lowercase letter)
+
+    - X (single uppercase letter)
+
+    - lowercase
+
+    - lower_case_with_underscores
+
+    - UPPERCASE
+
+    - UPPER_CASE_WITH_UNDERSCORES
+
+    - CapitalizedWords (or CapWords)
+
+    - mixedCase (differs from CapitalizedWords by initial lowercase
+      character!)
+
+    - Capitalized_Words_With_Underscores (ugly!)
+
+    There's also the style of using a short unique prefix to group
+    related names together. This is not used much in Python, but it
+    is mentioned for completeness. For example, the os.stat()
+    function returns a tuple whose items traditionally have names like
+    st_mode, st_size, st_mtime and so on. The X11 library uses a
+    leading X for all its public functions. (In Python, this style is
+    generally deemed unnecessary because attribute and method names
+    are prefixed with an object, and function names are prefixed with
+    a module name.)
+
+    In addition, the following special forms using leading or trailing
+    underscores are recognized (these can generally be combined with any
+    case convention):
+
+    - _single_leading_underscore: weak "internal use" indicator
+      (e.g. "from M import *" does not import objects whose name
+      starts with an underscore).
+
+    - single_trailing_underscore_: used by convention to avoid
+      conflicts with a Python keyword, e.g.
+      "Tkinter.Toplevel(master, class_='ClassName')".
+
+    - __double_leading_underscore: class-private names as of Python 1.4.
+
+    - __double_leading_and_trailing_underscore__: "magic" objects or
+      attributes that live in user-controlled namespaces,
+      e.g. __init__, __import__ or __file__. Sometimes these are
+      defined by the user to trigger certain magic behavior
+      (e.g. operator overloading); sometimes these are inserted by the
+      infrastructure for its own use or for debugging purposes. Since
+      the infrastructure (loosely defined as the Python interpreter
+      and the standard library) may decide to grow its list of magic
+      attributes in future versions, user code should generally
+      refrain from using this convention for its own use. User code
+      that aspires to become part of the infrastructure could combine
+      this with a short prefix inside the underscores,
+      e.g. __bobo_magic_attr__.
+
+  Prescriptive: Naming Conventions
+
+    Module Names
+
+      Module names can be either CapWords or lowercase. There is no
+      unambiguous convention to decide which to use. Modules that
+      export a single class (or a number of closely related classes,
+      plus some additional support) are often named in CapWords, with
+      the module name being the same as the class name (e.g. the
+      standard StringIO module). Modules that export a bunch of
+      functions are usually named in all lowercase.
+
+      Since module names are mapped to file names, and some file
+      systems are case insensitive and truncate long names, it is
+      important that module names be chosen to be fairly short and not
+      in conflict with other module names that only differ in the case
+      -- this won't be a problem on Unix, but it may be a problem when
+      the code is transported to Mac or Windows.
+
+      There is an emerging convention that when an extension module
+      written in C or C++ has an accompanying Python module that
+      provides a higher level (e.g.
+      more object oriented) interface, the Python module's name uses
+      CapWords, while the C/C++ module is named in all lowercase and
+      has a leading underscore (e.g. _socket).
+
+      Python packages generally have a short all lowercase name.
+
+    Class Names
+
+      Almost without exception, class names use the CapWords
+      convention. Classes for internal use have a leading underscore
+      in addition.
+
+    Exception Names
+
+      If a module defines a single exception raised for all sorts of
+      conditions, it is generally called "error" or "Error". It seems
+      that built-in (extension) modules use "error" (e.g. os.error),
+      while Python modules generally use "Error" (e.g. xdrlib.Error).
+
+    Function Names
+
+      Plain functions exported by a module can either use the CapWords
+      style or lowercase (or lower_case_with_underscores). There is
+      no strong preference, but it seems that the CapWords style is
+      used for functions that provide major functionality
+      (e.g. nstools.WorldOpen()), while lowercase is used more for
+      "utility" functions (e.g. pathhack.kos_root()).
+
+    Global Variable Names
+
+      (Let's hope that these variables are meant for use inside one
+      module only.) The conventions are about the same as those for
+      exported functions. Modules that are designed for use via "from
+      M import *" should prefix their globals (and internal functions
+      and classes) with an underscore to prevent exporting them.
+
+    Method Names
+
+      The story is largely the same as for functions. Use lowercase
+      for methods accessed by other classes or functions that are part
+      of the implementation of an object type. Use one leading
+      underscore for "internal" methods and instance variables when
+      there is no chance of a conflict with subclass or superclass
+      attributes or when a subclass might actually need access to
+      them. Use two leading underscores (class-private names,
+      enforced by Python 1.4) in those cases where it is important
+      that only the current class accesses an attribute.
+      (But realize that Python contains enough loopholes so that an insistent
+      user could gain access nevertheless, e.g. via the __dict__ attribute.)
+
+
+References
+
+    [1] PEP 7, Style Guide for C Code, van Rossum
+
+    [2] http://www.python.org/doc/essays/styleguide.html
+
+    [3] PEP 257, Docstring Conventions, Goodger, van Rossum
+
+
+Copyright
+
+    This document has been placed in the public domain.
+
+
+
+Local Variables:
+mode: indented-text
+indent-tabs-mode: nil
+End: