diff --git a/pep-0256.txt b/pep-0256.txt index 7f438a8d8..e6715eb14 100644 --- a/pep-0256.txt +++ b/pep-0256.txt @@ -2,290 +2,298 @@ PEP: 256 Title: Docstring Processing System Framework Version: $Revision$ Last-Modified: $Date$ -Author: goodger@users.sourceforge.net (David Goodger) -Discussions-To: doc-sig@python.org +Author: David Goodger +Discussions-To: Status: Draft Type: Standards Track +Content-Type: text/x-rst Created: 01-Jun-2001 Post-History: 13-Jun-2001 Abstract +======== - Python lends itself to inline documentation. With its built-in - docstring syntax, a limited form of Literate Programming [1]_ is - easy to do in Python. However, there are no satisfactory standard - tools for extracting and processing Python docstrings. The lack - of a standard toolset is a significant gap in Python's - infrastructure; this PEP aims to fill the gap. +Python lends itself to inline documentation. With its built-in +docstring syntax, a limited form of `Literate Programming`_ is easy to +do in Python. However, there are no satisfactory standard tools for +extracting and processing Python docstrings. The lack of a standard +toolset is a significant gap in Python's infrastructure; this PEP aims +to fill the gap. - The issues surrounding docstring processing have been contentious - and difficult to resolve. This PEP proposes a generic Docstring - Processing System (DPS) framework, which separates out the - components (program and conceptual), enabling the resolution of - individual issues either through consensus (one solution) or - through divergence (many). It promotes standard interfaces which - will allow a variety of plug-in components (input context readers, - markup parsers, and output format writers) to be used. +The issues surrounding docstring processing have been contentious and +difficult to resolve. 
This PEP proposes a generic Docstring +Processing System (DPS) framework, which separates out the components +(program and conceptual), enabling the resolution of individual issues +either through consensus (one solution) or through divergence (many). +It promotes standard interfaces which will allow a variety of plug-in +components (input context readers, markup parsers, and output format +writers) to be used. - The concepts of a DPS framework are presented independently of - implementation details. +The concepts of a DPS framework are presented independently of +implementation details. Roadmap to the Docstring PEPs +============================= - There are many aspects to docstring processing. The "Docstring - PEPs" have broken up the issues in order to deal with each of them - in isolation, or as close as possible. The individual aspects and - associated PEPs are as follows: +There are many aspects to docstring processing. The "Docstring PEPs" +have broken up the issues in order to deal with each of them in +isolation, or as close as possible. The individual aspects and +associated PEPs are as follows: - * Docstring syntax. PEP 287, reStructuredText Docstring Format, - proposes a syntax for Python docstrings, PEPs, and other uses. +* Docstring syntax. PEP 287, "reStructuredText Docstring Format" + [#PEP-287]_, proposes a syntax for Python docstrings, PEPs, and + other uses. - * Docstring semantics consist of at least two aspects: +* Docstring semantics consist of at least two aspects: - - Conventions: the high-level structure of docstrings. Dealt - with in PEP 257, Docstring Conventions. + - Conventions: the high-level structure of docstrings. Dealt with + in PEP 257, "Docstring Conventions" [#PEP-257]_. - - Methodology: rules for the informational content of - docstrings. Not addressed. + - Methodology: rules for the informational content of docstrings. + Not addressed. - * Processing mechanisms. 
This PEP outlines the high-level issues - and specification of an abstract docstring processing system - (DPS). PEP 258, Docutils Design Specification, is an overview - of the design and implementation of one DPS under development. +* Processing mechanisms. This PEP (PEP 256) outlines the high-level + issues and specification of an abstract docstring processing system + (DPS). PEP 258, "Docutils Design Specification" [#PEP-258]_, is an + overview of the design and implementation of one DPS under + development. - * Output styles: developers want the documentation generated from - their source code to look good, and there are many different - ideas about what that means. PEP 258 touches on "Stylist - Transforms". This aspect of docstring processing has yet to be - fully explored. +* Output styles: developers want the documentation generated from + their source code to look good, and there are many different ideas + about what that means. PEP 258 touches on "Stylist Transforms". + This aspect of docstring processing has yet to be fully explored. - By separating out the issues, we can form consensus more easily - (smaller fights ;-), and accept divergence more readily. +By separating out the issues, we can form consensus more easily +(smaller fights ;-), and accept divergence more readily. Rationale +========= - There are standard inline documentation systems for some other - languages. For example, Perl has POD [2]_ and Java has Javadoc - [3]_, but neither of these mesh with the Pythonic way. POD syntax - is very explicit, but takes after Perl in terms of readability. - Javadoc is HTML-centric; except for '@field' tags, raw HTML is - used for markup. There are also general tools such as Autoduck - [4]_ and Web (Tangle & Weave) [5]_, useful for multiple languages. +There are standard inline documentation systems for some other +languages. For example, Perl has POD_ ("Plain Old Documentation") and +Java has Javadoc_, but neither of these mesh with the Pythonic way. 
+POD syntax is very explicit, but takes after Perl in terms of +readability. Javadoc is HTML-centric; except for "``@field``" tags, +raw HTML is used for markup. There are also general tools such as +Autoduck_ and Web_ (Tangle & Weave), useful for multiple languages. - There have been many attempts to write auto-documentation systems - for Python (not an exhaustive list): +There have been many attempts to write auto-documentation systems +for Python (not an exhaustive list): - - Marc-Andre Lemburg's doc.py [6]_ +- Marc-Andre Lemburg's doc.py_ - - Daniel Larsson's pythondoc & gendoc [7]_ +- Daniel Larsson's pythondoc_ & gendoc_ - - Doug Hellmann's HappyDoc [8]_ +- Doug Hellmann's HappyDoc_ - - Laurence Tratt's Crystal [9]_ +- Laurence Tratt's Crystal_ - - Ka-Ping Yee's htmldoc & pydoc [10]_ (pydoc.py is now part of the - Python standard library; see below) +- Ka-Ping Yee's pydoc_ (pydoc.py is now part of the Python standard + library; see below) - - Tony Ibbs' docutils [11]_ +- Tony Ibbs' docutils_ (Tony has donated this name to the `Docutils + project`_) - - Edward Loper's STminus formalization and related efforts [12]_ +- Edward Loper's STminus_ formalization and related efforts - These systems, each with different goals, have had varying degrees - of success. A problem with many of the above systems was - over-ambition combined with inflexibility. They provided a - self-contained set of components: a docstring extraction system, a - markup parser, an internal processing system and one or more - output format writers with a fixed style. Inevitably, one or more - aspects of each system had serious shortcomings, and they were not - easily extended or modified, preventing them from being adopted as - standard tools. +These systems, each with different goals, have had varying degrees of +success. A problem with many of the above systems was over-ambition +combined with inflexibility. 
They provided a self-contained set of +components: a docstring extraction system, a markup parser, an +internal processing system and one or more output format writers with +a fixed style. Inevitably, one or more aspects of each system had +serious shortcomings, and they were not easily extended or modified, +preventing them from being adopted as standard tools. - It has become clear (to this author, at least) that the "all or - nothing" approach cannot succeed, since no monolithic - self-contained system could possibly be agreed upon by all - interested parties. A modular component approach designed for - extension, where components may be multiply implemented, may be - the only chance for success. Standard inter-component APIs will - make the DPS components comprehensible without requiring detailed - knowledge of the whole, lowering the barrier for contributions, - and ultimately resulting in a rich and varied system. +It has become clear (to this author, at least) that the "all or +nothing" approach cannot succeed, since no monolithic self-contained +system could possibly be agreed upon by all interested parties. A +modular component approach designed for extension, where components +may be multiply implemented, may be the only chance for success. +Standard inter-component APIs will make the DPS components +comprehensible without requiring detailed knowledge of the whole, +lowering the barrier for contributions, and ultimately resulting in a +rich and varied system. - Each of the components of a docstring processing system should be - developed independently. A 'best of breed' system should be - chosen, either merged from existing systems, and/or developed - anew. This system should be included in Python's standard - library. +Each of the components of a docstring processing system should be +developed independently. A "best of breed" system should be chosen, +either merged from existing systems, and/or developed anew. 
This +system should be included in Python's standard library. PyDoc & Other Existing Systems +------------------------------ - PyDoc became part of the Python standard library as of release - 2.1. It extracts and displays docstrings from within the Python - interactive interpreter, from the shell command line, and from a - GUI window into a web browser (HTML). Although a very useful - tool, PyDoc has several deficiencies, including: +PyDoc became part of the Python standard library as of release 2.1. +It extracts and displays docstrings from within the Python interactive +interpreter, from the shell command line, and from a GUI window into a +web browser (HTML). Although a very useful tool, PyDoc has several +deficiencies, including: - - In the case of the GUI/HTML, except for some heuristic - hyperlinking of identifier names, no formatting of the - docstrings is done. They are presented within
<p><pre>

- tags to avoid unwanted line wrapping. Unfortunately, the result is not attractive. +- In the case of the GUI/HTML, except for some heuristic hyperlinking + of identifier names, no formatting of the docstrings is done. They + are presented within ``<p><pre>
`` tags to avoid unwanted line + wrapping. Unfortunately, the result is not attractive. - - PyDoc extracts docstrings and structural information (class - identifiers, method signatures, etc.) from imported module - objects. There are security issues involved with importing - untrusted code. Also, information from the source is lost when - importing, such as comments, "additional docstrings" (string - literals in non-docstring contexts; see PEP 258 [13]_), and the - order of definitions. +- PyDoc extracts docstrings and structural information (class + identifiers, method signatures, etc.) from imported module objects. + There are security issues involved with importing untrusted code. + Also, information from the source is lost when importing, such as + comments, "additional docstrings" (string literals in non-docstring + contexts; see PEP 258 [#PEP-258]_), and the order of definitions. - The functionality proposed in this PEP could be added to or used - by PyDoc when serving HTML pages. The proposed docstring - processing system's functionality is much more than PyDoc needs in - its current form. Either an independent tool will be developed - (which PyDoc may or may not use), or PyDoc could be expanded to - encompass this functionality and *become* the docstring processing - system (or one such system). That decision is beyond the scope of - this PEP. +The functionality proposed in this PEP could be added to or used by +PyDoc when serving HTML pages. The proposed docstring processing +system's functionality is much more than PyDoc needs in its current +form. Either an independent tool will be developed (which PyDoc may +or may not use), or PyDoc could be expanded to encompass this +functionality and *become* the docstring processing system (or one +such system). That decision is beyond the scope of this PEP. - Similarly for other existing docstring processing systems, their - authors may or may not choose compatibility with this framework. 
- However, if this framework is accepted and adopted as the Python - standard, compatibility will become an important consideration in - these systems' future. +Similarly for other existing docstring processing systems, their +authors may or may not choose compatibility with this framework. +However, if this framework is accepted and adopted as the Python +standard, compatibility will become an important consideration in +these systems' future. Specification +============= - The docstring processing system framework consists of components, - as follows:: +The docstring processing system framework is broken up as follows: - 1. Docstring conventions. Documents issues such as: +1. Docstring conventions. Documents issues such as: - - What should be documented where. + - What should be documented where. - - First line is a one-line synopsis. + - First line is a one-line synopsis. - PEP 257, Docstring Conventions [14]_, documents some of these - issues. + PEP 257 [#PEP-257]_ documents some of these issues. - 2. Docstring processing system design specification. Documents - issues such as: +2. Docstring processing system design specification. Documents + issues such as: - - High-level spec: what a DPS does. + - High-level spec: what a DPS does. - - Command-line interface for executable script. + - Command-line interface for executable script. - - System Python API. + - System Python API. - - Docstring extraction rules. + - Docstring extraction rules. - - Readers, which encapsulate the input context . + - Readers, which encapsulate the input context . - - Parsers. + - Parsers. - - Document tree: the intermediate internal data structure. The - output of the Parser and Reader, and the input to the Writer - all share the same data structure. + - Document tree: the intermediate internal data structure. The + output of the Parser and Reader, and the input to the Writer all + share the same data structure. - - Transforms, which modify the document tree. 
+ - Transforms, which modify the document tree. - - Writers for output formats. + - Writers for output formats. - - Distributors, which handle output management (one file, many - files, or objects in memory). + - Distributors, which handle output management (one file, many + files, or objects in memory). - These issues are applicable to any docstring processing system - implementation. PEP 258, Docutils Design Specification [13 ]_, - documents these issues. + These issues are applicable to any docstring processing system + implementation. PEP 258 [#PEP-258]_ documents these issues. - 3. Docstring processing system implementation. +3. Docstring processing system implementation. - 4. Input markup specifications: docstring syntax. PEP 287, - reStructuredText Docstring Format [15]_, proposes a standard - syntax. +4. Input markup specifications: docstring syntax. PEP 287 [#PEP-287]_ + proposes a standard syntax. - 5. Input parser implementations. +5. Input parser implementations. - 6. Input context readers ("modes": Python source code, PEP, - standalone text file, email, etc.) and implementations. +6. Input context readers ("modes": Python source code, PEP, standalone + text file, email, etc.) and implementations. - 7. Stylists: certain input context readers may have associated - stylists which allow for a variety of output document styles. +7. Stylists: certain input context readers may have associated + stylists which allow for a variety of output document styles. - 8. Output formats (HTML, XML, TeX, DocBook, info, etc.) and writer - implementations. +8. Output formats (HTML, XML, TeX, DocBook, info, etc.) and writer + implementations. - Components 1, 2/3, and 4/5 are the subject of individual companion - PEPs. If there is another implementation of the framework or - syntax/parser, additional PEPs may be required. Multiple - implementations of each of components 6 and 7 will be required; - the PEP mechanism may be overkill for these components. 
+Components 1, 2/3/5, and 4 are the subject of individual companion +PEPs. If there is another implementation of the framework or +syntax/parser, additional PEPs may be required. Multiple +implementations of each of components 6 and 7 will be required; the +PEP mechanism may be overkill for these components. Project Web Site +================ - A SourceForge project has been set up for this work at - http://docutils.sourceforge.net/. +A SourceForge project has been set up for this work at +http://docutils.sourceforge.net/. References and Footnotes +======================== - [1] http://www.literateprogramming.com/ +.. [#PEP-287] PEP 287, reStructuredText Docstring Format, Goodger + (http://www.python.org/peps/pep-0287.html) - [2] Perl "Plain Old Documentation" - http://www.perldoc.com/perl5.6/pod/perlpod.html +.. [#PEP-257] PEP 257, Docstring Conventions, Goodger, Van Rossum + (http://www.python.org/peps/pep-0257.html) - [3] http://java.sun.com/j2se/javadoc/ +.. [#PEP-258] PEP 258, Docutils Design Specification, Goodger + (http://www.python.org/peps/pep-0258.html) - [4] http://www.helpmaster.com/hlp-developmentaids-autoduck.htm +.. _Literate Programming: http://www.literateprogramming.com/ - [5] http://www-cs-faculty.stanford.edu/~knuth/cweb.html +.. _POD: http://www.perldoc.com/perl5.6/pod/perlpod.html - [6] http://www.lemburg.com/files/python/SoftwareDescriptions.html#doc.py +.. _Javadoc: http://java.sun.com/j2se/javadoc/ - [7] http://starship.python.net/crew/danilo/pythondoc/ +.. _Autoduck: + http://www.helpmaster.com/hlp-developmentaids-autoduck.htm - [8] http://happydoc.sourceforge.net/ +.. _Web: http://www-cs-faculty.stanford.edu/~knuth/cweb.html - [9] http://www.btinternet.com/~tratt/comp/python/crystal/ +.. _doc.py: + http://www.lemburg.com/files/python/SoftwareDescriptions.html#doc.py - [10] http://www.python.org/doc/current/lib/module-pydoc.html +.. _pythondoc: +.. 
_gendoc: http://starship.python.net/crew/danilo/pythondoc/ - [11] http://homepage.ntlworld.com/tibsnjoan/docutils/ +.. _HappyDoc: http://happydoc.sourceforge.net/ - [12] http://www.cis.upenn.edu/~edloper/pydoc/ +.. _Crystal: http://www.btinternet.com/~tratt/comp/python/crystal/ - [13] PEP 258, Docutils Design Specification, Goodger - http://www.python.org/peps/pep-0258.html +.. _pydoc: http://www.python.org/doc/current/lib/module-pydoc.html - [14] PEP 257, Docstring Conventions, Goodger, Van Rossum - http://www.python.org/peps/pep-0257.html +.. _docutils: http://homepage.ntlworld.com/tibsnjoan/docutils/ - [15] PEP 287, reStructuredText Docstring Format, Goodger - http://www.python.org/peps/pep-0287.html +.. _Docutils project: http://docutils.sourceforge.net/ - [16] http://www.python.org/sigs/doc-sig/ +.. _STMinus: http://www.cis.upenn.edu/~edloper/pydoc/ + +.. _Python Doc-SIG: http://www.python.org/sigs/doc-sig/ Copyright +========= - This document has been placed in the public domain. +This document has been placed in the public domain. Acknowledgements +================ - This document borrows ideas from the archives of the Python - Doc-SIG [16]_. Thanks to all members past & present. +This document borrows ideas from the archives of the `Python +Doc-SIG`_. Thanks to all members past & present. -Local Variables: -mode: indented-text -indent-tabs-mode: nil -fill-column: 70 -sentence-end-double-space: t -End: +.. 
+ Local Variables: + mode: indented-text + indent-tabs-mode: nil + sentence-end-double-space: t + fill-column: 70 + End: diff --git a/pep-0257.txt b/pep-0257.txt index fc5edbcb0..9ce589f88 100644 --- a/pep-0257.txt +++ b/pep-0257.txt @@ -2,255 +2,262 @@ PEP: 257 Title: Docstring Conventions Version: $Revision$ Last-Modified: $Date$ -Author: goodger@users.sourceforge.net (David Goodger), - guido@python.org (Guido van Rossum) +Author: David Goodger , + Guido van Rossum Discussions-To: doc-sig@python.org Status: Active Type: Informational +Content-Type: text/x-rst Created: 29-May-2001 Post-History: 13-Jun-2001 Abstract +======== - This PEP documents the semantics and conventions associated with - Python docstrings. +This PEP documents the semantics and conventions associated with +Python docstrings. Rationale +========= - The aim of this PEP is to standardize the high-level structure of - docstrings: what they should contain, and how to say it (without - touching on any markup syntax within docstrings). The PEP - contains conventions, not laws or syntax. +The aim of this PEP is to standardize the high-level structure of +docstrings: what they should contain, and how to say it (without +touching on any markup syntax within docstrings). The PEP contains +conventions, not laws or syntax. - "A universal convention supplies all of maintainability, - clarity, consistency, and a foundation for good programming - habits too. What it doesn't do is insist that you follow it - against your will. That's Python!" + "A universal convention supplies all of maintainability, clarity, + consistency, and a foundation for good programming habits too. + What it doesn't do is insist that you follow it against your will. + That's Python!" - --Tim Peters on comp.lang.python, 2001-06-16 + -- Tim Peters on comp.lang.python, 2001-06-16 - If you violate the conventions, the worst you'll get is some dirty - looks. 
But some software (such as the Docutils docstring - processing system [1] [2]) will be aware of the conventions, so - following them will get you the best results. +If you violate these conventions, the worst you'll get is some dirty +looks. But some software (such as the Docutils_ docstring processing +system [1]_ [2]_) will be aware of the conventions, so following them +will get you the best results. Specification +============= - What is a Docstring? - -------------------- +What is a Docstring? +-------------------- - A docstring is a string literal that occurs as the first statement - in a module, function, class, or method definition. Such a - docstring becomes the __doc__ special attribute of that object. +A docstring is a string literal that occurs as the first statement in +a module, function, class, or method definition. Such a docstring +becomes the ``__doc__`` special attribute of that object. - All modules should normally have docstrings, and all functions and - classes exported by a module should also have docstrings. Public - methods (including the __init__ constructor) should also have - docstrings. A package may be documented in the module docstring - of the __init__.py file in the package directory. +All modules should normally have docstrings, and all functions and +classes exported by a module should also have docstrings. Public +methods (including the ``__init__`` constructor) should also have +docstrings. A package may be documented in the module docstring of +the ``__init__.py`` file in the package directory. - String literals occurring elsewhere in Python code may also act as - documentation. They are not recognized by the Python bytecode - compiler and are not accessible as runtime object attributes - (i.e. not assigned to __doc__), but two types of extra docstrings - may be extracted by software tools: +String literals occurring elsewhere in Python code may also act as +documentation. 
They are not recognized by the Python bytecode +compiler and are not accessible as runtime object attributes (i.e. not +assigned to ``__doc__``), but two types of extra docstrings may be +extracted by software tools: - 1. String literals occurring immediately after a simple assignment - at the top level of a module, class, or __init__ method - are called "attribute docstrings". +1. String literals occurring immediately after a simple assignment at + the top level of a module, class, or ``__init__`` method are called + "attribute docstrings". - 2. String literals occurring immediately after another docstring - are called "additional docstrings". +2. String literals occurring immediately after another docstring are + called "additional docstrings". - Please see PEP 258 "Docutils Design Specification" [2] for a - detailed description of attribute and additional docstrings. +Please see PEP 258, "Docutils Design Specification" [2]_, for a +detailed description of attribute and additional docstrings. - XXX Mention docstrings of 2.2 properties. +XXX Mention docstrings of 2.2 properties. - For consistency, always use """triple double quotes""" around - docstrings. Use r"""raw triple double quotes""" if you use any - backslashes in your docstrings. For Unicode docstrings, use - u"""Unicode triple-quoted strings""". +For consistency, always use ``"""triple double quotes"""`` around +docstrings. Use ``r"""raw triple double quotes"""`` if you use any +backslashes in your docstrings. For Unicode docstrings, use +``u"""Unicode triple-quoted strings"""``. - There are two forms of docstrings: one-liners and multi-line - docstrings. +There are two forms of docstrings: one-liners and multi-line +docstrings. - One-line Docstrings - -------------------- - One-liners are for really obvious cases. They should really fit - on one line. 
For example:: +One-line Docstrings +-------------------- - def kos_root(): - """Return the pathname of the KOS root directory.""" - global _kos_root - if _kos_root: return _kos_root - ... +One-liners are for really obvious cases. They should really fit on +one line. For example:: - Notes: + def kos_root(): + """Return the pathname of the KOS root directory.""" + global _kos_root + if _kos_root: return _kos_root + ... - - Triple quotes are used even though the string fits on one line. - This makes it easy to later expand it. +Notes: - - The closing quotes are on the same line as the opening quotes. - This looks better for one-liners. +- Triple quotes are used even though the string fits on one line. + This makes it easy to later expand it. - - There's no blank line either before or after the docstring. +- The closing quotes are on the same line as the opening quotes. This + looks better for one-liners. - - The docstring is a phrase ending in a period. It prescribes the - function or method's effect as a command ("Do this", "Return - that"), not as a description: e.g. don't write "Returns the - pathname ..." +- There's no blank line either before or after the docstring. - - The one-line docstring should NOT be a "signature" reiterating - the function/method parameters (which can be obtained by - introspection). Don't do:: +- The docstring is a phrase ending in a period. It prescribes the + function or method's effect as a command ("Do this", "Return that"), + not as a description; e.g. don't write "Returns the pathname ...". - def function(a, b): - """function(a, b) -> list""" +- The one-line docstring should NOT be a "signature" reiterating the + function/method parameters (which can be obtained by introspection). + Don't do:: - This type of docstring is only appropriate for C functions (such - as built-ins), where introspection is not possible. However, - the nature of the *return value* cannot be determined by - introspection, so it should be mentioned. 
The preferred form - for such a docstring would be something like:: + def function(a, b): + """function(a, b) -> list""" - def function(a, b): - """Do X and return a list.""" + This type of docstring is only appropriate for C functions (such as + built-ins), where introspection is not possible. However, the + nature of the *return value* cannot be determined by introspection, + so it should be mentioned. The preferred form for such a docstring + would be something like:: - (Of course "Do X" should be replaced by a useful description!) + def function(a, b): + """Do X and return a list.""" - Multi-line Docstrings - ---------------------- + (Of course "Do X" should be replaced by a useful description!) - Multi-line docstrings consist of a summary line just like a - one-line docstring, followed by a blank line, followed by a more - elaborate description. The summary line may be used by automatic - indexing tools; it is important that it fits on one line and is - separated from the rest of the docstring by a blank line. The - summary line may be on the same line as the opening quotes or on - the next line. - The entire docstring is indented the same as the quotes at its - first line (see example below). Docstring processing tools will - strip an amount of indentation from the second and further lines - of the docstring equal to the indentation of the first non-blank - line after the first line of the docstring. Relative indentation - of later lines in the docstring is retained. +Multi-line Docstrings +---------------------- - Insert a blank line before and after all docstrings (one-line or - multi-line) that document a class -- generally speaking, the - class's methods are separated from each other by a single blank - line, and the docstring needs to be offset from the first method - by a blank line; for symmetry, put a blank line between the class - header and the docstring. 
Docstrings documenting functions or - methods generally don't have this requirement, unless the function - or method's body is written as a number of blank-line separated - sections -- in this case, treat the docstring as another section, - and precede it with a blank line. +Multi-line docstrings consist of a summary line just like a one-line +docstring, followed by a blank line, followed by a more elaborate +description. The summary line may be used by automatic indexing +tools; it is important that it fits on one line and is separated from +the rest of the docstring by a blank line. The summary line may be on +the same line as the opening quotes or on the next line. - The docstring of a script (a stand-alone program) should be usable - as its "usage" message, printed when the script is invoked with - incorrect or missing arguments (or perhaps with a "-h" option, for - "help"). Such a docstring should document the script's function - and command line syntax, environment variables, and files. Usage - messages can be fairly elaborate (several screens full) and should - be sufficient for a new user to use the command properly, as well - as a complete quick reference to all options and arguments for the - sophisticated user. +The entire docstring is indented the same as the quotes at its first +line (see example below). Docstring processing tools will strip an +amount of indentation from the second and further lines of the +docstring equal to the indentation of the first non-blank line after +the first line of the docstring. Relative indentation of later lines +in the docstring is retained. - The docstring for a module should generally list the classes, - exceptions and functions (and any other objects) that are exported - by the module, with a one-line summary of each. (These summaries - generally give less detail than the summary line in the object's - docstring.) 
The docstring for a package (i.e., the docstring of - the package's __init__.py module) should also list the modules and - subpackages exported by the package. +Insert a blank line before and after all docstrings (one-line or +multi-line) that document a class -- generally speaking, the class's +methods are separated from each other by a single blank line, and the +docstring needs to be offset from the first method by a blank line; +for symmetry, put a blank line between the class header and the +docstring. Docstrings documenting functions or methods generally +don't have this requirement, unless the function or method's body is +written as a number of blank-line separated sections -- in this case, +treat the docstring as another section, and precede it with a blank +line. - The docstring for a function or method should summarize its - behavior and document its arguments, return value(s), side - effects, exceptions raised, and restrictions on when it can be - called (all if applicable). Optional arguments should be - indicated. It should be documented whether keyword arguments are - part of the interface. +The docstring of a script (a stand-alone program) should be usable as +its "usage" message, printed when the script is invoked with incorrect +or missing arguments (or perhaps with a "-h" option, for "help"). +Such a docstring should document the script's function and command +line syntax, environment variables, and files. Usage messages can be +fairly elaborate (several screens full) and should be sufficient for a +new user to use the command properly, as well as a complete quick +reference to all options and arguments for the sophisticated user. - The docstring for a class should summarize its behavior and list - the public methods and instance variables. If the class is - intended to be subclassed, and has an additional interface for - subclasses, this interface should be listed separately (in the - docstring). 
The class constructor should be documented in the - docstring for its __init__ method. Individual methods should be - documented by their own docstring. +The docstring for a module should generally list the classes, +exceptions and functions (and any other objects) that are exported by +the module, with a one-line summary of each. (These summaries +generally give less detail than the summary line in the object's +docstring.) The docstring for a package (i.e., the docstring of the +package's ``__init__.py`` module) should also list the modules and +subpackages exported by the package. - If a class subclasses another class and its behavior is mostly - inherited from that class, its docstring should mention this and - summarize the differences. Use the verb "override" to indicate - that a subclass method replaces a superclass method and does not - call the superclass method; use the verb "extend" to indicate that - a subclass method calls the superclass method (in addition to its - own behavior). +The docstring for a function or method should summarize its behavior +and document its arguments, return value(s), side effects, exceptions +raised, and restrictions on when it can be called (all if applicable). +Optional arguments should be indicated. It should be documented +whether keyword arguments are part of the interface. - *Do not* use the Emacs convention of mentioning the arguments of - functions or methods in upper case in running text. Python is - case sensitive and the argument names can be used for keyword - arguments, so the docstring should document the correct argument - names. It is best to list each argument on a separate line. For - example:: +The docstring for a class should summarize its behavior and list the +public methods and instance variables. If the class is intended to be +subclassed, and has an additional interface for subclasses, this +interface should be listed separately (in the docstring). 
The class +constructor should be documented in the docstring for its ``__init__`` +method. Individual methods should be documented by their own +docstring. - def complex(real=0.0, imag=0.0): - """Form a complex number. +If a class subclasses another class and its behavior is mostly +inherited from that class, its docstring should mention this and +summarize the differences. Use the verb "override" to indicate that a +subclass method replaces a superclass method and does not call the +superclass method; use the verb "extend" to indicate that a subclass +method calls the superclass method (in addition to its own behavior). - Keyword arguments: - real -- the real part (default 0.0) - imag -- the imaginary part (default 0.0) +*Do not* use the Emacs convention of mentioning the arguments of +functions or methods in upper case in running text. Python is case +sensitive and the argument names can be used for keyword arguments, so +the docstring should document the correct argument names. It is best +to list each argument on a separate line. For example:: - """ - if imag == 0.0 and real == 0.0: return complex_zero - ... + def complex(real=0.0, imag=0.0): + """Form a complex number. - The BDFL [3] recommends inserting a blank line between the last - paragraph in a multi-line docstring and its closing quotes, - placing the closing quotes on a line by themselves. This way, - Emacs' fill-paragraph command can be used on it. + Keyword arguments: + real -- the real part (default 0.0) + imag -- the imaginary part (default 0.0) + + """ + if imag == 0.0 and real == 0.0: return complex_zero + ... + +The BDFL [3]_ recommends inserting a blank line between the last +paragraph in a multi-line docstring and its closing quotes, placing +the closing quotes on a line by themselves. This way, Emacs' +``fill-paragraph`` command can be used on it. 
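The indentation rule quoted earlier in this PEP 257 text (strip from the second and later lines the indentation of the first non-blank line after the first line, keeping relative indentation) can be illustrated with a minimal sketch. The helper name ``strip_docstring_indent`` and the sample string are hypothetical and are not part of the PEP or of Docutils; the standard library's ``inspect.cleandoc`` does a similar job but removes the smallest common indentation instead.

```python
def strip_docstring_indent(docstring):
    """Return `docstring` with its indentation removed as the text describes.

    Hypothetical helper for illustration only: the reference indentation is
    taken from the first non-blank line after the first line.
    """
    lines = docstring.expandtabs().splitlines()
    if not lines:
        return ""

    # Indentation of the first non-blank line after the first line.
    indent = 0
    for line in lines[1:]:
        stripped = line.lstrip()
        if stripped:
            indent = len(line) - len(stripped)
            break

    result = [lines[0].strip()]
    for line in lines[1:]:
        leading = len(line) - len(line.lstrip())
        # Never cut into text on a line indented less than the reference.
        result.append(line[min(indent, leading):].rstrip())

    # Drop blank lines left at either end by the quote placement.
    while result and not result[-1]:
        result.pop()
    while result and not result[0]:
        result.pop(0)
    return "\n".join(result)


# Sample docstring body, written as it would appear inside a function
# indented by four spaces (assumed example, modelled on the one above).
SAMPLE = (
    "Form a complex number.\n"
    "\n"
    "    Keyword arguments:\n"
    "    real -- the real part (default 0.0)\n"
    "        (continuation, indented further)\n"
    "    imag -- the imaginary part (default 0.0)\n"
    "\n"
    "    "
)

CLEANED = strip_docstring_indent(SAMPLE)
```

In this sketch the continuation line keeps its four extra spaces after cleaning, which is the "relative indentation is retained" behavior the text calls for.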
References and Footnotes +======================== - [1] PEP 256, Docstring Processing System Framework, Goodger - http://www.python.org/peps/pep-0256.html +.. [1] PEP 256, Docstring Processing System Framework, Goodger + (http://www.python.org/peps/pep-0256.html) - [2] PEP 258, Docutils Design Specification, Goodger - http://www.python.org/peps/pep-0258.html +.. [2] PEP 258, Docutils Design Specification, Goodger + (http://www.python.org/peps/pep-0258.html) - [3] Guido van Rossum, Python's creator and Benevolent Dictator For - Life. +.. [3] Guido van Rossum, Python's creator and Benevolent Dictator For + Life. - [4] http://www.python.org/doc/essays/styleguide.html +.. _Docutils: http://docutils.sourceforge.net/ - [5] http://www.python.org/sigs/doc-sig/ +.. _Python Style Guide: + http://www.python.org/doc/essays/styleguide.html + +.. _Doc-SIG: http://www.python.org/sigs/doc-sig/ Copyright +========= - This document has been placed in the public domain. +This document has been placed in the public domain. Acknowledgements +================ - The "Specification" text comes mostly verbatim from the Python - Style Guide essay by Guido van Rossum [4]. +The "Specification" text comes mostly verbatim from the `Python Style +Guide`_ essay by Guido van Rossum. - This document borrows ideas from the archives of the Python - Doc-SIG [5]. Thanks to all members past and present. +This document borrows ideas from the archives of the Python Doc-SIG_. +Thanks to all members past and present. -Local Variables: -mode: indented-text -indent-tabs-mode: nil -fill-column: 70 -sentence-end-double-space: t -End: +.. 
+ Local Variables: + mode: indented-text + indent-tabs-mode: nil + fill-column: 70 + sentence-end-double-space: t + End: diff --git a/pep-0258.txt b/pep-0258.txt index a66feeb95..593460421 100644 --- a/pep-0258.txt +++ b/pep-0258.txt @@ -2,923 +2,928 @@ PEP: 258 Title: Docutils Design Specification Version: $Revision$ Last-Modified: $Date$ -Author: goodger@users.sourceforge.net (David Goodger) -Discussions-To: doc-sig@python.org +Author: David Goodger +Discussions-To: Status: Draft Type: Standards Track +Content-Type: text/x-rst Requires: 256, 257 Created: 31-May-2001 Post-History: 13-Jun-2001 -Abstract - - This PEP documents design issues and implementation details for - Docutils, a Python Docstring Processing System (DPS). The - rationale and high-level concepts of a DPS are documented in PEP - 256, "Docstring Processing System Framework" [1]. Also see PEP - 256 for a "Roadmap to the Doctring PEPs". - - Docutils is being designed modularly so that any of its components - can be replaced easily. In addition, Docutils is not limited to - the processing of Python docstrings; it processes standalone - documents as well, in several contexts. - - No changes to the core Python language are required by this PEP. - Its deliverables consist of a package for the standard library and - its documentation. 
- - -Specification - - Docutils Project Model - ====================== - - :: - - +--------------------------+ - | Docutils: | - | docutils.core.Publisher, | - | docutils.core.publish() | - +--------------------------+ - / \ - / \ - 1,3,5,7 / \ 8,10 - +--------+ +--------+ - | READER | =========================> | WRITER | - +--------+ +--------+ - / || \ / \ - / || \ / \ - 2 / 4 || \ 6 9 / \ 11 - +-----+ +--------+ +-------------+ +------------+ +-----+ - | I/O | | PARSER |...| reader | | writer | | I/O | - +-----+ +--------+ | transforms | | transforms | +-----+ - | | | | - | - docinfo | | - system | - | - titles | | messages | - | - linking | | - final | - | - lookups | | checks | - | - reader- | | - writer- | - | specific | | specific | - | - parser- | | - etc. | - | specific | +------------+ - | - layout | - | (stylist) | - | - etc. | - +-------------+ - - The numbers indicate the path a document's data takes through the - code. Double-width lines between reader & parser and between - reader & writer indicate that data sent along these paths should - be standard (pure & unextended) Docutils doc trees. Single-width - lines signify that internal tree extensions or completely - unrelated representations are possible, but they must be supported - at both ends. +========== + Abstract +========== + +This PEP documents design issues and implementation details for +Docutils, a Python Docstring Processing System (DPS). The rationale +and high-level concepts of a DPS are documented in PEP 256, "Docstring +Processing System Framework" [#PEP-256]_. Also see PEP 256 for a +"Roadmap to the Doctring PEPs". + +Docutils is being designed modularly so that any of its components can +be replaced easily. In addition, Docutils is not limited to the +processing of Python docstrings; it processes standalone documents as +well, in several contexts. + +No changes to the core Python language are required by this PEP. 
Its +deliverables consist of a package for the standard library and its +documentation. + + +=============== + Specification +=============== + +Docutils Project Model +====================== + +:: + + +--------------------------+ + | Docutils: | + | docutils.core.Publisher, | + | docutils.core.publish() | + +--------------------------+ + / \ + / \ + 1,3,5,7 / \ 8,10 + +--------+ +--------+ + | READER | =========================> | WRITER | + +--------+ +--------+ + / || \ / \ + / || \ / \ + 2 / 4 || \ 6 9 / \ 11 + +-----+ +--------+ +-------------+ +------------+ +-----+ + | I/O | | PARSER |...| reader | | writer | | I/O | + +-----+ +--------+ | transforms | | transforms | +-----+ + | | | | + | - docinfo | | - system | + | - titles | | messages | + | - linking | | - final | + | - lookups | | checks | + | - reader- | | - writer- | + | specific | | specific | + | - parser- | | - etc. | + | specific | +------------+ + | - layout | + | (stylist) | + | - etc. | + +-------------+ + +The numbers indicate the path a document's data takes through the +code. Double-width lines between reader & parser and between reader & +writer indicate that data sent along these paths should be standard +(pure & unextended) Docutils doc trees. Single-width lines signify +that internal tree extensions or completely unrelated representations +are possible, but they must be supported at both ends. - Publisher - --------- +Publisher +--------- - The "docutils.core" module contains a "Publisher" facade class and - "publish" convenience function. Publisher encapsulates the - high-level logic of a Docutils system. The Publisher.publish() - method first calls its Reader, which reads data from its source - I/O, parses and transforms the data, and returns it. - Publisher.publish() then passes the resulting document tree to its - Writer, which further transforms the document before translating - it to the final output format and writing the formatted data to - its destination I/O. 
+The ``docutils.core`` module contains a "Publisher" facade class and +"publish" convenience function. Publisher encapsulates the high-level +logic of a Docutils system. The ``Publisher.publish()`` method first +calls its Reader, which reads data from its source I/O, parses and +transforms the data, and returns it. ``Publisher.publish()`` then +passes the resulting document tree to its Writer, which further +transforms the document before translating it to the final output +format and writing the formatted data to its destination I/O. - Calling the "publish" function (or instantiating a "Publisher" - object) with component names will result in default behavior. For - custom behavior (setting component options), create custom - component objects first, and pass *them* to publish/Publisher. +Calling the "publish" function (or instantiating a "Publisher" object) +with component names will result in default behavior. For custom +behavior (setting component options), create custom component objects +first, and pass *them* to publish/Publisher. - Readers - ------- +Readers +------- - Readers understand the input context (where the data is coming - from), send the whole input or discrete "chunks" to the parser, - and provide the context to bind the chunks together back into a - cohesive whole. Using transforms_, Readers also resolve - references, footnote numbers, interpreted text processing, and - anything else that requires context-sensitive computation. +Readers understand the input context (where the data is coming from), +send the whole input or discrete "chunks" to the parser, and provide +the context to bind the chunks together back into a cohesive whole. +Using transforms_, Readers also resolve references, footnote numbers, +interpreted text processing, and anything else that requires +context-sensitive computation. - Each reader is a module or package exporting a "Reader" class with - a "read" method. 
The base "Reader" class can be found in the - docutils/readers/__init__.py module. +Each reader is a module or package exporting a "Reader" class with a +"read" method. The base "Reader" class can be found in the +``docutils/readers/__init__.py`` module. - Most Readers will have to be told what parser to use. So far (see - the list of examples below), only the Python Source Reader - (PySource; still incomplete) will be able to determine the parser - on its own. +Most Readers will have to be told what parser to use. So far (see the +list of examples below), only the Python Source Reader ("PySource"; +still incomplete) will be able to determine the parser on its own. - Responsibilities: +Responsibilities: - - Get input text from the source I/O. +- Get input text from the source I/O. - - Pass the input text to the parser, along with a fresh doctree - root. +- Pass the input text to the parser, along with a fresh doctree root. - - Run transforms over the doctree(s). +- Run transforms over the doctree(s). - Examples: +Examples: - - Standalone (Raw/Plain): Just read a text file and process it. - The reader needs to be told which parser to use. +- Standalone (Raw/Plain): Just read a text file and process it. + The reader needs to be told which parser to use. - The "Standalone Reader" has been implemented in - docutils/readers/standalone.py. + The "Standalone Reader" has been implemented in module + ``docutils.readers.standalone``. - - Python Source: See `Python Source Reader`_ below. This Reader - is currently in development in the Docutils sandbox. +- Python Source: See `Python Source Reader`_ below. This Reader is + currently in development in the Docutils sandbox. - - Email: RFC-822 headers, quoted excerpts, signatures, MIME parts. +- Email: RFC-822 headers, quoted excerpts, signatures, MIME parts. - - PEP: RFC-822 headers, "PEP xxxx" and "RFC xxxx" conversion to - URIs. Either interpret PEPs' indented sections or convert - existing PEPs to reStructuredText (or both?). 
+- PEP: RFC-822 headers, "PEP xxxx" and "RFC xxxx" conversion to URIs. + Either interpret PEPs' indented sections or convert existing PEPs to + reStructuredText (or both?). - The "PEP Reader" is being implemented in - docutils/readers/pep.py. + The "PEP Reader" is being implemented in module + ``docutils.readers.pep``. - - Wiki: Global reference lookups of "wiki links" incorporated into - transforms. (CamelCase only or unrestricted?) Lazy - indentation? +- Wiki: Global reference lookups of "wiki links" incorporated into + transforms. (CamelCase only or unrestricted?) Lazy + indentation? - - Web Page: As standalone, but recognize meta fields as meta tags. - Support for templates of some sort? (After , before - ?) +- Web Page: As standalone, but recognize meta fields as meta tags. + Support for templates of some sort? (After ````, before + ````?) - - FAQ: Structured "question & answer(s)" constructs. +- FAQ: Structured "question & answer(s)" constructs. - - Compound document: Merge chapters into a book. Master TOC file? +- Compound document: Merge chapters into a book. Master TOC file? - Parsers - ------- +Parsers +------- - Parsers analyze their input and produce a Docutils `document - tree`_. They don't know or care anything about the source or - destination of the data. +Parsers analyze their input and produce a Docutils `document tree`_. +They don't know or care anything about the source or destination of +the data. - Each input parser is a module or package exporting a "Parser" - class with a "parse" method. The base "Parser" class can be found - in the docutils/parsers/__init__.py module. +Each input parser is a module or package exporting a "Parser" class +with a "parse" method. The base "Parser" class can be found in the +``docutils/parsers/__init__.py`` module. - Responsibilities: Given raw input text and a doctree root node, - populate the doctree by parsing the input text. 
+Responsibilities: Given raw input text and a doctree root node, +populate the doctree by parsing the input text. - Example: The only parser implemented so far is for the - reStructuredText markup. It is implemented in the - docutils/parsers/rst/ package. +Example: The only parser implemented so far is for the +reStructuredText markup. It is implemented in the +``docutils/parsers/rst/`` package. - Transforms - ---------- +Transforms +---------- - Transforms change the document tree from one form to another, add - to the tree, or prune it. Transforms are run by Reader and Writer - objects. Some transforms are Reader-specific, some are - Parser-specific, and others are Writer-specific. The choice and - order of transforms is specified in the Reader and Writer objects. +Transforms change the document tree from one form to another, add to +the tree, or prune it. Transforms are run by Reader and Writer +objects. Some transforms are Reader-specific, some are +Parser-specific, and others are Writer-specific. The choice and order +of transforms is specified in the Reader and Writer objects. - Each transform is a class in a module in the docutils/transforms - package, a subclass of docutils.tranforms.Transform. +Each transform is a class in a module in the ``docutils/transforms/`` +package, a subclass of docutils.tranforms.Transform. - Responsibilities: +Responsibilities: - - Modify a doctree in-place, either purely transforming one - structure into another, or adding new structures based on the - doctree and/or external data. +- Modify a doctree in-place, either purely transforming one structure + into another, or adding new structures based on the doctree and/or + external data. - Examples (in the docutils/transforms/ package): +Examples (in the ``docutils/transforms/`` package): - - frontmatter.DocInfo: Conversion of document metadata - (bibliographic information). +- frontmatter.DocInfo: Conversion of document metadata (bibliographic + information). 
- - references.Hyperlinks: Resolution of hyperlinks. +- references.Hyperlinks: Resolution of hyperlinks. - - parts.Contents: Generates a table of contents for a document. +- parts.Contents: Generates a table of contents for a document. - - document.Merger: Combining multiple populated doctrees into one - (not yet implemented or fully understood). +- document.Merger: Combining multiple populated doctrees into one (not + yet implemented or fully understood). - - document.Splitter: Splits a document into a tree-structure of - subdocuments, perhaps by section. It will have to transform - references appropriately. (Neither implemented not remotely - understood.) +- document.Splitter: Splits a document into a tree-structure of + subdocuments, perhaps by section. It will have to transform + references appropriately. (Neither implemented not remotely + understood.) - - universal.Pending: Handles transforms that must be executed at - specific stages of processing. +- universal.Pending: Handles transforms that must be executed at + specific stages of processing. - - components.Filter: Includes or excludes elements which depend on - a specific Docutils component (triggered by the - universal.Pending transform). +- components.Filter: Includes or excludes elements which depend on a + specific Docutils component (triggered by the universal.Pending + transform). - Writers - ------- +Writers +------- - Writers produce the final output (HTML, XML, TeX, etc.). Writers - translate the internal document tree structure into the final data - format, possibly running Writer-specific transforms_ first. +Writers produce the final output (HTML, XML, TeX, etc.). Writers +translate the internal document tree structure into the final data +format, possibly running Writer-specific transforms_ first. - Each writer is a module or package exporting a "Writer" class with - a "write" method. The base "Writer" class can be found in the - docutils/writers/__init__.py module. 
+Each writer is a module or package exporting a "Writer" class with a +"write" method. The base "Writer" class can be found in the +``docutils/writers/__init__.py`` module. - Responsibilities: +Responsibilities: - - Run transforms over the doctree(s). +- Run transforms over the doctree(s). - - Translate doctree(s) into specific output formats. +- Translate doctree(s) into specific output formats. - - Transform references into format-native forms. + - Transform references into format-native forms. - - Write the translated output to the destination I/O. +- Write the translated output to the destination I/O. - Examples: +Examples: - - XML: Various forms, such as: +- XML: Various forms, such as: - - DocBook (being implemented in the Docutils sandbox). + - DocBook (being implemented in the Docutils sandbox). - - Raw doctree XML (accessible via "doctree.asdom().toxml()"; no - Writer component implemented yet). + - Raw doctree XML (accessible via "``doctree.asdom().toxml()``"; no + Writer component implemented yet). - - HTML (XHTML implemented as docutils/writers/html4css1.py). +- HTML (XHTML implemented as ``docutils.writers.html4css1``). - - PDF (a ReportLabs interface is being developed in the Docutils - sandbox). +- PDF (a ReportLabs interface is being developed in the Docutils + sandbox). - - TeX +- TeX - - Docutils-native pseudo-XML (implemented as - docutils/writers/pseudoxml.py, used for testing). +- Docutils-native pseudo-XML (implemented as + ``docutils.writers.pseudoxml``, used for testing). - - Plain text +- Plain text - - reStructuredText? +- reStructuredText? - I/O - --- +Input/Output +------------ - I/O classes provide a uniform API for low-level input and output. - Subclasses will exist for a variety of input/output mechanisms. +I/O classes provide a uniform API for low-level input and output. +Subclasses will exist for a variety of input/output mechanisms. - I/O classes are currently in the preliminary stages; there's a lot - of work yet to be done. 
Issues: +I/O classes are currently in the preliminary stages; there's a lot of +work yet to be done. Issues: - - Looking at the list of writers, it seems that only HTML would - require anything other than monolithic output. Perhaps "Writer" - variants, one for each output distribution type? +- Looking at the list of writers, it seems that only HTML would + require anything other than monolithic output. Perhaps "Writer" + variants, one for each output distribution type? - - How to represent a multi-file document (files & directories) in - the API? +- How to represent a multi-file document (files & directories) in the + API? - Responsibilities: +Responsibilities: - - Read data from the input source and/or write data to the output - destination. +- Read data from the input source and/or write data to the output + destination. - Examples of input sources: +Examples of input sources: - - A single file on disk or a stream (implemented as - docutils.io.FileIO). +- A single file on disk or a stream (implemented as + ``docutils.io.FileInput``). - - Multiple files on disk (MultiFileIO?). +- Multiple files on disk (``MultiFileInput``?). - - Python source files: modules and packages. +- Python source files: modules and packages. - - Python strings, as received from a client application - (implemented as docutils.io.StringIO). +- Python strings, as received from a client application + (implemented as ``docutils.io.StringInput``). - Examples of output destinations: +Examples of output destinations: - - A single file on disk or a stream (implemented as - docutils.io.FileIO). +- A single file on disk or a stream (implemented as + ``docutils.io.FileOutput``). - - A tree of directories and files on disk. +- A tree of directories and files on disk. - - A Python string, returned to a client application (implemented - as docutils.io.StringIO). +- A Python string, returned to a client application (implemented as + ``docutils.io.StringOutput``). 
- - A single tree-shaped data structure in memory. +- A single tree-shaped data structure in memory. - - Some other set of data structures in memory. +- Some other set of data structures in memory. - Docutils Package Structure - ========================== +Docutils Package Structure +========================== - - Package "docutils". +- Package "docutils". - - Class "Component" is a base class for Docutils components. + - Class "Component" is a base class for Docutils components. - - Module "docutils.core" contains facade class "Publisher" and - convenience function "publish()". See `Publisher`_ above. + - Module "docutils.core" contains facade class "Publisher" and + convenience function "publish()". See `Publisher`_ above. - - Module "docutils.frontend" provides command-line and option - processing for Docutils front-ends. + - Module "docutils.frontend" provides command-line and option + processing for Docutils front-end tools. - - Module "docutils.io" provides a uniform API for low-level - input and output. + - Module "docutils.io" provides a uniform API for low-level input + and output. See `Input/Output`_ above. - - Module "docutils.nodes" contains the Docutils document tree - element class library plus Visitor pattern base classes. See - `Document Tree`_ below. + - Module "docutils.nodes" contains the Docutils document tree + element class library plus Visitor pattern base classes. See + `Document Tree`_ below. - - Module "docutils.optik" provides option parsing and - command-line help; from Greg Ward's http://optik.sf.net/ - project, included for convenience. + - Module "docutils.optik" provides option parsing and command-line + help; from Greg Ward's http://optik.sf.net/ project, included for + convenience. - - Module "docutils.roman" contains Roman numeral conversion - routines. + - Module "docutils.roman" contains Roman numeral conversion + routines. 
- - Module "docutils.statemachine" contains a finite state machine - specialized for regular-expression-based text filters. The - reStructuredText parser implementation is based on this - module. + - Module "docutils.statemachine" contains a finite state machine + specialized for regular-expression-based text filters. The + reStructuredText parser implementation is based on this module. - - Module "docutils.urischemes" contains a mapping of known URI - schemes ("http", "ftp", "mail", etc.). + - Module "docutils.urischemes" contains a mapping of known URI + schemes ("http", "ftp", "mail", etc.). - - Module "docutils.utils" contains utility functions and - classes, including a logger class ("Reporter"; see `Error - Handling`_ below). + - Module "docutils.utils" contains utility functions and classes, + including a logger class ("Reporter"; see `Error Handling`_ + below). - - Package "docutils.parsers": markup parsers_. + - Package "docutils.parsers": markup parsers_. - - Function "get_parser_class(parser_name)" returns a parser - module by name. Class "Parser" is the base class of - specific parsers. (docutils/parsers/__init__.py) + - Function "get_parser_class(parser_name)" returns a parser module + by name. Class "Parser" is the base class of specific parsers. + (``docutils/parsers/__init__.py``) - - Package "docutils.parsers.rst": the reStructuredText parser. + - Package "docutils.parsers.rst": the reStructuredText parser. - - Alternate markup parsers may be added. + - Alternate markup parsers may be added. - - Package "docutils.readers": context-aware input readers. + See `Parsers`_ above. - - Function "get_reader_class(reader_name)" returns a reader - module by name or alias. Class "Reader" is the base class - of specific readers. (docutils/readers/__init__.py) + - Package "docutils.readers": context-aware input readers. - - Module "docutils.readers.standalone" reads independent - document files. 
+ - Function "get_reader_class(reader_name)" returns a reader module + by name or alias. Class "Reader" is the base class of specific + readers. (``docutils/readers/__init__.py``) - - Module "docutils.readers.pep" reads PEPs (Python Enhancement - Proposals). + - Module "docutils.readers.standalone" reads independent document + files. - - Readers to be added for: Python source code (structure & - docstrings), PEPs, email, FAQ, and perhaps Wiki and others. + - Module "docutils.readers.pep" reads PEPs (Python Enhancement + Proposals). - - Package "docutils.writers": output format writers. + - Readers to be added for: Python source code (structure & + docstrings), PEPs, email, FAQ, and perhaps Wiki and others. - - Function "get_writer_class(writer_name)" returns a writer - module by name. Class "Writer" is the base class of - specific writers. (docutils/writers/__init__.py) + See `Readers`_ above. - - Module "docutils.writers.pseudoxml" is a simple internal - document tree writer; it writes indented pseudo-XML. + - Package "docutils.writers": output format writers. - - Module "docutils.writers.html4css1" is a simple HyperText - Markup Language document tree writer for HTML 4.01 and CSS1. + - Function "get_writer_class(writer_name)" returns a writer module + by name. Class "Writer" is the base class of specific writers. + (``docutils/writers/__init__.py``) - - Writers to be added: HTML 3.2 or 4.01-loose, XML (various - forms, such as DocBook and the raw internal doctree), PDF, - TeX, plaintext, reStructuredText, and perhaps others. + - Module "docutils.writers.pseudoxml" is a simple internal + document tree writer; it writes indented pseudo-XML. - - Package "docutils.transforms": tree transform classes. + - Module "docutils.writers.html4css1" is a simple HyperText Markup + Language document tree writer for HTML 4.01 and CSS1. - - Class "Transform" is the base class of specific transforms; - see `Transform API`_ below. 
- (docutils/transforms/__init__.py) + - Writers to be added: HTML 3.2 or 4.01-loose, XML (various forms, + such as DocBook and the raw internal doctree), PDF, TeX, + plaintext, reStructuredText, and perhaps others. - - Each module contains related transform classes. + See `Writers`_ above. - - Package "docutils.languages": Language modules contain - language-dependent strings and mappings. They are named for - their language identifier (as defined in `Choice of Docstring - Format`_ above), converting dashes to underscores. + - Package "docutils.transforms": tree transform classes. - - Function "get_language(language_code)", returns matching - language module. (docutils/languages/__init__.py) + - Class "Transform" is the base class of specific transforms. + (``docutils/transforms/__init__.py``) - - Module "docutils.languages.en" (English). + - Each module contains related transform classes. - - Other languages to be added. + See `Transforms`_ above. + - Package "docutils.languages": Language modules contain + language-dependent strings and mappings. They are named for their + language identifier (as defined in `Choice of Docstring Format`_ + above), converting dashes to underscores. - Front-End Tools - =============== + - Function "get_language(language_code)", returns matching + language module. (``docutils/languages/__init__.py``) - @@@ To be determined. + - Module "docutils.languages.en" (English). - @@@ Document tools & summarize their command-line interfaces. + - Other languages to be added. - Document Tree - ============= +Front-End Tools +=============== - A single intermediate data structure is used internally by - Docutils, in the interfaces between components; it is defined in - the docutils.nodes module. It is not required that this data - structure be used *internally* by any of the components, just - *between* components. +See `Docutils Front-End Tools`_. 
- Custom node types are allowed, providing that either (A) a - transform converts them to standard Docutils nodes before they - reach the Writer proper, or (B) the custom node is explicitly - supported by certain Writers, and is wrapped in a filtered - "pending" node. An example of condition A is the `Python Source - Reader`_ (see below), where a "stylist" transform converts custom - nodes. The HTML tag is an example of condition B; it is - supported by the HTML Writer but not by others. The - reStructuredText ".. meta::" directive creates a "pending" node, - which contains knowledge that the embedded "meta" node can only be - handled by HTML-compatible writers. The "pending" node is - resolved by the "transforms.components.Filter" transform, which - checks that the calling writer supports HTML; if it doesn't, the - "meta" node is removed from the document. +.. _Docutils Front-End Tools: http://docutils.sf.net/docs/tools.html - The document tree data structure is similar to a DOM tree, but - with specific node names (classes) instead of DOM's generic nodes. - The schema is documented in an XML DTD (eXtensible Markup Language - Document Type Definition), which comes in two parts: - - the Docutils Generic DTD, docutils.dtd [2], and +Document Tree +============= - - the OASIS Exchange Table Model, soextbl.dtd [3]. +A single intermediate data structure is used internally by Docutils, +in the interfaces between components; it is defined in the +docutils.nodes module. It is not required that this data structure be +used *internally* by any of the components, just *between* components. - The DTD defines a rich set of elements, suitable for many input - and output formats. The DTD retains all information necessary to - reconstruct the original input text, or a reasonable facsimile - thereof. 
+Custom node types are allowed, provided that either (a) a transform +converts them to standard Docutils nodes before they reach the Writer +proper, or (b) the custom node is explicitly supported by certain +Writers, and is wrapped in a filtered "pending" node. An example of +condition A is the `Python Source Reader`_ (see below), where a +"stylist" transform converts custom nodes. The HTML ``<meta>`` tag is +an example of condition B; it is supported by the HTML Writer but not +by others. The reStructuredText "meta" directive creates a "pending" +node, which contains knowledge that the embedded "meta" node can only +be handled by HTML-compatible writers. The "pending" node is resolved +by the "transforms.components.Filter" transform, which checks that the +calling writer supports HTML; if it doesn't, the "meta" node is +removed from the document. - See "The Docutils Document Tree" [4] for details (incomplete). +The document tree data structure is similar to a DOM tree, but with +specific node names (classes) instead of DOM's generic nodes. The +schema is documented in an XML DTD (eXtensible Markup Language +Document Type Definition), which comes in two parts: +- the Docutils Generic DTD, docutils.dtd_, and - Error Handling - ============== +- the OASIS Exchange Table Model, soextbl.dtd_. - When the parser encounters an error in markup, it inserts a system - message (DTD element "system_message"). There are five levels of - system messages: +The DTD defines a rich set of elements, suitable for many input and +output formats. The DTD retains all information necessary to +reconstruct the original input text, or a reasonable facsimile +thereof. - - Level-0, "DEBUG": an internal reporting issue. There is no - effect on the processing. Level-0 system messages are - handled separately from the others. +See `The Docutils Document Tree`_ for details (incomplete). - - Level-1, "INFO": a minor issue that can be ignored. There is - little or no effect on the processing.
Typically level-1 system - messages are not reported. - - Level-2, "WARNING": an issue that should be addressed. If - ignored, there may be minor problems with the output. Typically - level-2 system messages are reported but do not halt processing +Error Handling +============== - - Level-3, "ERROR": a major issue that should be addressed. If - ignored, the output will contain unpredictable errors. - Typically level-3 system messages are reported but do not halt - processing +When the parser encounters an error in markup, it inserts a system +message (DTD element "system_message"). There are five levels of +system messages: - - Level-4, "SEVERE": a critical error that must be addressed. - Typically level-4 system messages are turned into exceptions - which halt processing. If ignored, the output will contain - severe errors. +- Level-0, "DEBUG": an internal reporting issue. There is no effect + on the processing. Level-0 system messages are handled separately + from the others. - Although the initial message levels were devised independently, - they have a strong correspondence to VMS error condition severity - levels [5]; the names in quotes for levels 1 through 4 were - borrowed from VMS. Error handling has since been influenced by - the log4j project [6]. +- Level-1, "INFO": a minor issue that can be ignored. There is little + or no effect on the processing. Typically level-1 system messages + are not reported. +- Level-2, "WARNING": an issue that should be addressed. If ignored, + there may be minor problems with the output. Typically level-2 + system messages are reported but do not halt processing - Python Source Reader - ==================== +- Level-3, "ERROR": a major issue that should be addressed. If + ignored, the output will contain unpredictable errors. 
Typically + level-3 system messages are reported but do not halt processing - The Python Source Reader ("PySource") is the Docutils component - that reads Python source files, extracts docstrings in context, - then parses, links, and assembles the docstrings into a cohesive - whole. It is a major and non-trivial component, currently under - experimental development in the Docutils sandbox. High-level - design issues are presented here. +- Level-4, "SEVERE": a critical error that must be addressed. + Typically level-4 system messages are turned into exceptions which + halt processing. If ignored, the output will contain severe errors. +Although the initial message levels were devised independently, they +have a strong correspondence to `VMS error condition severity +levels`_; the names in quotes for levels 1 through 4 were borrowed +from VMS. Error handling has since been influenced by the `log4j +project`_. - Processing Model - ---------------- - This model will evolve over time, incorporating experience and - discoveries. +Python Source Reader +==================== - 1. The PySource Reader uses an I/O class to read in some Python - packages and modules, into a tree of strings. +The Python Source Reader ("PySource") is the Docutils component that +reads Python source files, extracts docstrings in context, then +parses, links, and assembles the docstrings into a cohesive whole. It +is a major and non-trivial component, currently under experimental +development in the Docutils sandbox. High-level design issues are +presented here. - 2. The Python modules are parsed, converting the tree of strings - into a tree of abstract syntax trees. - 3. The abstract syntax trees are converted into an internal - representation of the packages/modules. Docstrings are - extracted, as well as code structure details. See `AST - Mining`_ below. Namespaces are constructed for lookup in step - 6. +Processing Model +---------------- - 4. 
One at a time, the docstrings are parsed, producing standard - Docutils doctrees. +This model will evolve over time, incorporating experience and +discoveries. - 5. PySource assembles all the individual docstrings' doctrees into - a Python-specific custom Docutils tree parallelling the - package/module/class structure; this is a custom - Reader-specific internal representation (see the Docutils - Python Source DTD [7]). Namespaces must be merged: Python - identifiers, hyperlink targets. +1. The PySource Reader uses an I/O class to read in some Python + packages and modules, into a tree of strings. - 6. Cross-references from docstrings (interpreted text) to Python - identifiers are resolved according to the Python namespace - lookup rules. See `Identifier Cross-References`_ below. +2. The Python modules are parsed, converting the tree of strings into + a tree of abstract syntax trees. - 7. A "Stylist" transform is applied to the custom doctree, custom - nodes are rendered using standard nodes as primitives, and a - standard document tree is emitted. See `Stylist Transforms`_ - below. +3. The abstract syntax trees are converted into an internal + representation of the packages/modules. Docstrings are extracted, + as well as code structure details. See `AST Mining`_ below. + Namespaces are constructed for lookup in step 6. - 8. Other transforms are applied to the standard doctree. +4. One at a time, the docstrings are parsed, producing standard + Docutils doctrees. - 9. The standard doctree is sent to a Writer, which translates the - document into a concrete format (HTML, PDF, etc.). +5. PySource assembles all the individual docstrings' doctrees into a + Python-specific custom Docutils tree parallelling the + package/module/class structure; this is a custom Reader-specific + internal representation (see the `Docutils Python Source DTD`_). + Namespaces must be merged: Python identifiers, hyperlink targets. - 10. 
The Writer uses an I/O class to write the resulting data to - its destination (disk file, directories and files, etc.). +6. Cross-references from docstrings (interpreted text) to Python + identifiers are resolved according to the Python namespace lookup + rules. See `Identifier Cross-References`_ below. +7. A "Stylist" transform is applied to the custom doctree, custom + nodes are rendered using standard nodes as primitives, and a + standard document tree is emitted. See `Stylist Transforms`_ + below. - AST Mining - ---------- +8. Other transforms are applied to the standard doctree. - Abstract Syntax Tree mining code will be written that scans a - parsed Python module, and returns an ordered tree containing the - names, docstrings (including attribute and additional docstrings; - see below), and additional info (in parentheses below) of all of - the following objects: +9. The standard doctree is sent to a Writer, which translates the + document into a concrete format (HTML, PDF, etc.). - - packages - - modules - - module attributes (+ initial values) - - classes (+ inheritance) - - class attributes (+ initial values) - - instance attributes (+ initial values) - - methods (+ parameters & defaults) - - functions (+ parameters & defaults) +10. The Writer uses an I/O class to write the resulting data to its + destination (disk file, directories and files, etc.). - (Extract comments too? For example, comments at the start of a - module would be a good place for bibliographic field lists.) - In order to evaluate interpreted text cross-references, namespaces - for each of the above will also be required. +AST Mining +---------- - See python-dev/docstring-develop thread "AST mining", started on - 2001-08-14. 
+Abstract Syntax Tree mining code will be written that scans a parsed +Python module, and returns an ordered tree containing the names, +docstrings (including attribute and additional docstrings; see below), +and additional info (in parentheses below) of all of the following +objects: +- packages +- modules +- module attributes (+ initial values) +- classes (+ inheritance) +- class attributes (+ initial values) +- instance attributes (+ initial values) +- methods (+ parameters & defaults) +- functions (+ parameters & defaults) - Docstring Extraction Rules - -------------------------- +(Extract comments too? For example, comments at the start of a module +would be a good place for bibliographic field lists.) - 1. What to examine: +In order to evaluate interpreted text cross-references, namespaces for +each of the above will also be required. - a) If the "__all__" variable is present in the module being - documented, only identifiers listed in "__all__" are - examined for docstrings. +See python-dev/docstring-develop thread "AST mining", started on +2001-08-14. - b) In the absense of "__all__", all identifiers are examined, - except those whose names are private (names begin with "_" - but don't begin and end with "__"). - c) 1a and 1b can be overridden by a parameter or command-line - option. +Docstring Extraction Rules +-------------------------- - 2. Where: +1. What to examine: - Docstrings are string literal expressions, and are recognized - in the following places within Python modules: + a) If the "``__all__``" variable is present in the module being + documented, only identifiers listed in "``__all__``" are + examined for docstrings. - a) At the beginning of a module, function definition, class - definition, or method definition, after any comments. This - is the standard for Python __doc__ attributes. 
+ b) In the absence of "``__all__``", all identifiers are examined, + except those whose names are private (names begin with "_" but + don't begin and end with "__"). - b) Immediately following a simple assignment at the top level - of a module, class definition, or __init__ method - definition, after any comments. See "Attribute Docstrings" - below. + c) 1a and 1b can be overridden by a parameter or command-line + option. - c) Additional string literals found immediately after the - docstrings in (a) and (b) will be recognized, extracted, and - concatenated. See "Additional Docstrings" below. +2. Where: - d) @@@ 2.2-style "properties" with attribute docstrings? + Docstrings are string literal expressions, and are recognized in + the following places within Python modules: - 3. How: + a) At the beginning of a module, function definition, class + definition, or method definition, after any comments. This is + the standard for Python ``__doc__`` attributes. - Whenever possible, Python modules should be parsed by Docutils, - not imported. There are several reasons: + b) Immediately following a simple assignment at the top level of a + module, class definition, or ``__init__`` method definition, + after any comments. See `Attribute Docstrings`_ below. - - Importing untrusted code is inherently insecure. + c) Additional string literals found immediately after the + docstrings in (a) and (b) will be recognized, extracted, and + concatenated. See `Additional Docstrings`_ below. - - Information from the source is lost when using introspection - to examine an imported module, such as comments and the order - of definitions. + d) @@@ 2.2-style "properties" with attribute docstrings? - - Docstrings are to be recognized in places where the bytecode - compiler ignores string literal expressions (2b and 2c - above), meaning importing the module will lose these - docstrings. +3. How: - Of course, standard Python parsing tools such as the "parser" - library module should be used.
+ Whenever possible, Python modules should be parsed by Docutils, not + imported. There are several reasons: - When the Python source code for a module is not available - (i.e. only the .pyc file exists) or for C extension modules, to - access docstrings the module can only be imported, and any - limitations must be lived with. + - Importing untrusted code is inherently insecure. - Since attribute docstrings and additional docstrings are ignored - by the Python bytecode compiler, no namespace pollution or runtime - bloat will result from their use. They are not assigned to - __doc__ or to any other attribute. The initial parsing of a - module may take a slight performance hit. + - Information from the source is lost when using introspection to + examine an imported module, such as comments and the order of + definitions. + - Docstrings are to be recognized in places where the bytecode + compiler ignores string literal expressions (2b and 2c above), + meaning importing the module will lose these docstrings. - Attribute Docstrings - ```````````````````` + Of course, standard Python parsing tools such as the "parser" + library module should be used. - (This is a simplified version of PEP 224 [8] by Marc-Andre - Lemberg.) + When the Python source code for a module is not available + (i.e. only the ``.pyc`` file exists) or for C extension modules, to + access docstrings the module can only be imported, and any + limitations must be lived with. - A string literal immediately following an assignment statement is - interpreted by the docstring extration machinery as the docstring - of the target of the assignment statement, under the following - conditions: +Since attribute docstrings and additional docstrings are ignored by +the Python bytecode compiler, no namespace pollution or runtime bloat +will result from their use. They are not assigned to ``__doc__`` or +to any other attribute. The initial parsing of a module may take a +slight performance hit. - 1. 
The assignment must be in one of the following contexts: - a) At the top level of a module (i.e., not nested inside a - compound statement such as a loop or conditional): a module - attribute. +Attribute Docstrings +'''''''''''''''''''' - b) At the top level of a class definition: a class attribute. +(This is a simplified version of PEP 224 [#PEP-224]_.) - c) At the top level of the "__init__" method definition of a - class: an instance attribute. +A string literal immediately following an assignment statement is +interpreted by the docstring extraction machinery as the docstring of +the target of the assignment statement, under the following +conditions: - Since each of the above contexts are at the top level (i.e., in - the outermost suite of a definition), it may be necessary to - place dummy assignments for attributes assigned conditionally - or in a loop. +1. The assignment must be in one of the following contexts: - 2. The assignment must be to a single target, not to a list or a - tuple of targets. + a) At the top level of a module (i.e., not nested inside a compound + statement such as a loop or conditional): a module attribute. - 3. The form of the target: + b) At the top level of a class definition: a class attribute. - a) For contexts 1a and 1b above, the target must be a simple - identifier (not a dotted identifier, a subscripted - expression, or a sliced expression). + c) At the top level of the "``__init__``" method definition of a + class: an instance attribute. - b) For context 1c above, the target must be of the form - "self.attrib", where "self" matches the "__init__" method's - first parameter (the instance parameter) and "attrib" is a - simple indentifier as in 3a. + Since each of the above contexts are at the top level (i.e., in the + outermost suite of a definition), it may be necessary to place + dummy assignments for attributes assigned conditionally or in a + loop.
- Blank lines may be used after attribute docstrings to emphasize - the connection between the assignment and the docstring. +2. The assignment must be to a single target, not to a list or a tuple + of targets. - Examples:: +3. The form of the target: - g = 'module attribute (module-global variable)' - """This is g's docstring.""" + a) For contexts 1a and 1b above, the target must be a simple + identifier (not a dotted identifier, a subscripted expression, + or a sliced expression). - class AClass: + b) For context 1c above, the target must be of the form + "``self.attrib``", where "``self``" matches the "``__init__``" + method's first parameter (the instance parameter) and "attrib" + is a simple indentifier as in 3a. - c = 'class attribute' - """This is AClass.c's docstring.""" +Blank lines may be used after attribute docstrings to emphasize the +connection between the assignment and the docstring. - def __init__(self): - self.i = 'instance attribute' - """This is self.i's docstring.""" +Examples:: + g = 'module attribute (module-global variable)' + """This is g's docstring.""" - Additional Docstrings - ````````````````````` + class AClass: - (This idea was adapted from PEP 216, Docstring Format [9], by - Moshe Zadka.) + c = 'class attribute' + """This is AClass.c's docstring.""" - Many programmers would like to make extensive use of docstrings - for API documentation. However, docstrings do take up space in - the running program, so some of these programmers are reluctant to - "bloat up" their code. Also, not all API documentation is - applicable to interactive environments, where __doc__ would be - displayed. + def __init__(self): + self.i = 'instance attribute' + """This is self.i's docstring.""" - The docstring processing system's extraction tools will - concatenate all string literal expressions which appear at the - beginning of a definition or after a simple assignment. 
Only the - first strings in definitions will be available as __doc__, and can - be used for brief usage text suitable for interactive sessions; - subsequent string literals and all attribute docstrings are - ignored by the Python bytecode compiler and may contain more - extensive API information. - Example:: +Additional Docstrings +''''''''''''''''''''' - def function(arg): - """This is __doc__, function's docstring.""" +(This idea was adapted from PEP 216 [#PEP-216]_.) + +Many programmers would like to make extensive use of docstrings for +API documentation. However, docstrings do take up space in the +running program, so some of these programmers are reluctant to "bloat +up" their code. Also, not all API documentation is applicable to +interactive environments, where ``__doc__`` would be displayed. + +The docstring processing system's extraction tools will concatenate +all string literal expressions which appear at the beginning of a +definition or after a simple assignment. Only the first strings in +definitions will be available as ``__doc__``, and can be used for +brief usage text suitable for interactive sessions; subsequent string +literals and all attribute docstrings are ignored by the Python +bytecode compiler and may contain more extensive API information. + +Example:: + + def function(arg): + """This is __doc__, function's docstring.""" + """ + This is an additional docstring, ignored by the bytecode + compiler, but extracted by the Docutils. + """ + pass + +.. topic:: Issue: ``from __future__ import`` + + This would break "``from __future__ import``" statements introduced + in Python 2.1 for multiple module docstrings (main docstring plus + additional docstring(s)). The Python Reference Manual specifies: + + A future statement must appear near the top of the module. The + only lines that can appear before a future statement are: + + * the module docstring (if any), + * comments, + * blank lines, and + * other future statements. + + Resolution? + + 1. 
Should we search for docstrings after a ``__future__`` + statement? Very ugly. + + 2. Redefine ``__future__`` statements to allow multiple preceding + string literals? + + 3. Or should we not even worry about this? There probably + shouldn't be ``__future__`` statements in production code, after + all. Will modules with ``__future__`` statements simply have to + put up with the single-docstring limitation? + + +Choice of Docstring Format +-------------------------- + +Rather than force everyone to use a single docstring format, multiple +input formats are allowed by the processing system. A special +variable, ``__docformat__``, may appear at the top level of a module +before any function or class definitions. Over time or through +decree, a standard format or set of formats should emerge. + +The ``__docformat__`` variable is a string containing the name of the +format being used, a case-insensitive string matching the input +parser's module or package name (i.e., the same name as required to +"import" the module or package), or a registered alias. If no +``__docformat__`` is specified, the default format is "plaintext" for +now; this may be changed to the standard format once determined. + +The ``__docformat__`` string may contain an optional second field, +separated from the format name (first field) by a single space: a +case-insensitive language identifier as defined in RFC 1766. A +typical language identifier consists of a 2-letter language code from +`ISO 639`_ (3-letter codes used only if no 2-letter code exists; RFC +1766 is currently being revised to allow 3-letter codes). If no +language identifier is specified, the default is "en" for English. +The language identifier is passed to the parser and can be used for +language-dependent markup features.
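The two-field ``__docformat__`` convention described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not Docutils code: the helper names ``parse_docformat`` and ``language_module_name`` are hypothetical, lowercasing is assumed as one way to honor the case-insensitivity of both fields, and whitespace splitting is assumed to be an acceptable reading of "a single space". The second helper reflects the rule that language modules are named for their language identifier with dashes converted to underscores.

```python
# Hypothetical helpers illustrating the ``__docformat__`` rules above;
# neither function is part of Docutils.

DEFAULT_FORMAT = "plaintext"   # default format "for now", per the text
DEFAULT_LANGUAGE = "en"        # default language identifier (English)


def parse_docformat(value=None):
    """Return (format_name, language_id) for a ``__docformat__`` value."""
    if value is None or not value.strip():
        # No ``__docformat__`` given: fall back to both defaults.
        return (DEFAULT_FORMAT, DEFAULT_LANGUAGE)
    # The text specifies a single separating space; split() with no
    # argument is an assumption that also tolerates extra whitespace.
    fields = value.split()
    format_name = fields[0].lower()
    language_id = fields[1].lower() if len(fields) > 1 else DEFAULT_LANGUAGE
    return (format_name, language_id)


def language_module_name(language_id):
    """Map a language identifier to a module name (dashes to underscores)."""
    return language_id.lower().replace("-", "_")


if __name__ == "__main__":
    print(parse_docformat("reStructuredText en"))
    print(parse_docformat(None))
    print(language_module_name("en-GB"))
```

A real reader would additionally have to resolve the format name against registered aliases, which this sketch leaves out.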
+ + +Identifier Cross-References +--------------------------- + +In Python docstrings, interpreted text is used to classify and mark up +program identifiers, such as the names of variables, functions, +classes, and modules. If the identifier alone is given, its role is +inferred implicitly according to the Python namespace lookup rules. +For functions and methods (even when dynamically assigned), +parentheses ('()') may be included:: + + This function uses `another()` to do its work. + +For class, instance and module attributes, dotted identifiers are used +when necessary. For example (using reStructuredText markup):: + + class Keeper(Storer): + + """ + Extend `Storer`. Class attribute `instances` keeps track + of the number of `Keeper` objects instantiated. + """ + + instances = 0 + """How many `Keeper` objects are there?""" + + def __init__(self): """ - This is an additional docstring, ignored by the bytecode - compiler, but extracted by the Docutils. + Extend `Storer.__init__()` to keep track of instances. + + Keep count in `self.instances`, data in `self.data`. """ - pass + Storer.__init__(self) + self.instances += 1 - Issue: This breaks "from __future__ import" statements in Python - 2.1 for multiple module docstrings. The Python Reference Manual - specifies: - - A future statement must appear near the top of the module. - The only lines that can appear before a future statement are: - - * the module docstring (if any), - * comments, - * blank lines, and - * other future statements. - - Resolution? - - 1. Should we search for docstrings after a __future__ statement? - Very ugly. - - 2. Redefine __future__ statements to allow multiple preceeding - string literals? - - 3. Or should we not even worry about this? There shouldn't be - __future__ statements in production code, after all. Will - modules with __future__ statements simply have to put up with - the single-docstring limitation? 
- - - Choice of Docstring Format - -------------------------- - - Rather than force everyone to use a single docstring format, - multiple input formats are allowed by the processing system. A - special variable, __docformat__, may appear at the top level of a - module before any function or class definitions. Over time or - through decree, a standard format or set of formats should emerge. - - The __docformat__ variable is a string containing the name of the - format being used, a case-insensitive string matching the input - parser's module or package name (i.e., the same name as required - to "import" the module or package), or a registered alias. If no - __docformat__ is specified, the default format is "plaintext" for - now; this may be changed to the standard format once determined. - - The __docformat__ string may contain an optional second field, - separated from the format name (first field) by a single space: a - case-insensitive language identifier as defined in RFC 1766 [10]. - A typical language identifier consists of a 2-letter language code - from ISO 639 [11] (3-letter codes used only if no 2-letter code - exists; RFC 1766 is currently being revised to allow 3-letter - codes). If no language identifier is specified, the default is - "en" for English. The language identifier is passed to the parser - and can be used for language-dependent markup features. - - - Identifier Cross-References - --------------------------- - - In Python docstrings, interpreted text is used to classify and - mark up program identifiers, such as the names of variables, - functions, classes, and modules. If the identifier alone is - given, its role is inferred implicitly according to the Python - namespace lookup rules. For functions and methods (even when - dynamically assigned), parentheses ('()') may be included:: - - This function uses `another()` to do its work. - - For class, instance and module attributes, dotted identifiers are - used when necessary. 
For example (using reStructuredText - markup):: - - class Keeper(Storer): + self.data = [] + """Store data in a list, most recent last.""" + def storedata(self, data): """ - Extend `Storer`. Class attribute `instances` keeps track - of the number of `Keeper` objects instantiated. + Extend `Storer.storedata()`; append new `data` to a + list (in `self.data`). """ + self.data = data - instances = 0 - """How many `Keeper` objects are there?""" - - def __init__(self): - """ - Extend `Storer.__init__()` to keep track of instances. - - Keep count in `self.instances`, data in `self.data`. - """ - Storer.__init__(self) - self.instances += 1 - - self.data = [] - """Store data in a list, most recent last.""" - - def storedata(self, data): - """ - Extend `Storer.storedata()`; append new `data` to a - list (in `self.data`). - """ - self.data = data - - Each of the identifiers quoted with backquotes ("`") will become - references to the definitions of the identifiers themselves. +Each of the identifiers quoted with backquotes ("`") will become +references to the definitions of the identifiers themselves. - Stylist Transforms - ------------------ +Stylist Transforms +------------------ - Stylist transforms are specialized transforms specific to a - Reader. The PySource Reader doesn't have to make any decisions as - to style; it just produces a logically constructed document tree, - parsed and linked, including custom node types. Stylist - transforms understand the custom nodes created by the Reader and - convert them into standard Docutils nodes. +Stylist transforms are specialized transforms specific to a Reader. +The PySource Reader doesn't have to make any decisions as to style; it +just produces a logically constructed document tree, parsed and +linked, including custom node types. Stylist transforms understand +the custom nodes created by the Reader and convert them into standard +Docutils nodes. 
- Multiple Stylist transforms may be implemented and one can be - chosen at runtime (through a "--style" or "--stylist" command-line - option). Each Stylist transform implements a different layout or - style; thus the name. They decouple the context-understanding - part of the Reader from the layout-generating part of processing, - resulting in a more flexible and robust system. This also serves - to "separate style from content", the SGML/XML ideal. +Multiple Stylist transforms may be implemented and one can be chosen +at runtime (through a "--style" or "--stylist" command-line option). +Each Stylist transform implements a different layout or style; thus +the name. They decouple the context-understanding part of the Reader +from the layout-generating part of processing, resulting in a more +flexible and robust system. This also serves to "separate style from +content", the SGML/XML ideal. - By keeping the piece of code that does the styling small and - modular, it becomes much easier for people to roll their own - styles. The "barrier to entry" is too high with existing tools; - extracting the stylist code will lower the barrier considerably. +By keeping the piece of code that does the styling small and modular, +it becomes much easier for people to roll their own styles. The +"barrier to entry" is too high with existing tools; extracting the +stylist code will lower the barrier considerably. -References and Footnotes +========================== + References and Footnotes +========================== - [1] PEP 256, Docstring Processing System Framework, Goodger - http://www.python.org/peps/pep-0256.html +.. [#PEP-256] PEP 256, Docstring Processing System Framework, Goodger + (http://www.python.org/peps/pep-0256.html) - [2] http://docutils.sourceforge.net/spec/docutils.dtd +.. [#PEP-224] PEP 224, Attribute Docstrings, Lemburg + (http://www.python.org/peps/pep-0224.html) - [3] http://docutils.sourceforge.net/spec/soextblx.dtd +.. 
[#PEP-216] PEP 216, Docstring Format, Zadka + (http://www.python.org/peps/pep-0216.html) - [4] http://docutils.sourceforge.net/spec/doctree.txt +.. _docutils.dtd: http://docutils.sourceforge.net/spec/docutils.dtd - [5] http://www.openvms.compaq.com:8000/73final/5841/ - 5841pro_027.html#error_cond_severity +.. _soextbl.dtd: http://docutils.sourceforge.net/spec/soextblx.dtd - [6] http://jakarta.apache.org/log4j/ +.. _The Docutils Document Tree: + http://docutils.sourceforge.net/spec/doctree.html - [7] http://docutils.sourceforge.net/spec/pysource.dtd +.. _VMS error condition severity levels: + http://www.openvms.compaq.com:8000/73final/5841/841pro_027.html + #error_cond_severity - [8] PEP 224, Attribute Docstrings, Lemburg - http://www.python.org/peps/pep-0224.html +.. _log4j project: http://jakarta.apache.org/log4j/ - [9] PEP 216, Docstring Format, Zadka - http://www.python.org/peps/pep-0216.html +.. _Docutils Python Source DTD: + http://docutils.sourceforge.net/spec/pysource.dtd - [10] http://www.rfc-editor.org/rfc/rfc1766.txt +.. _ISO 639: http://lcweb.loc.gov/standards/iso639-2/englangn.html - [11] http://lcweb.loc.gov/standards/iso639-2/englangn.html - - [12] http://www.python.org/sigs/doc-sig/ +.. _Python Doc-SIG: http://www.python.org/sigs/doc-sig/ -Project Web Site +================== + Project Web Site +================== - A SourceForge project has been set up for this work at - http://docutils.sourceforge.net/. +A SourceForge project has been set up for this work at +http://docutils.sourceforge.net/. -Copyright +=========== + Copyright +=========== - This document has been placed in the public domain. +This document has been placed in the public domain. -Acknowledgements +================== + Acknowledgements +================== - This document borrows ideas from the archives of the Python - Doc-SIG [12]. Thanks to all members past & present. +This document borrows ideas from the archives of the `Python +Doc-SIG`_. Thanks to all members past & present. 
-Local Variables: -mode: indented-text -indent-tabs-mode: nil -fill-column: 70 -sentence-end-double-space: t -End: +.. + Local Variables: + mode: indented-text + indent-tabs-mode: nil + sentence-end-double-space: t + fill-column: 70 + End: diff --git a/pep-0287.txt b/pep-0287.txt index 96a8ff3c8..4b4ed4e44 100644 --- a/pep-0287.txt +++ b/pep-0287.txt @@ -2,812 +2,814 @@ PEP: 287 Title: reStructuredText Docstring Format Version: $Revision$ Last-Modified: $Date$ -Author: goodger@users.sourceforge.net (David Goodger) -Discussions-To: doc-sig@python.org +Author: David Goodger +Discussions-To: Status: Draft Type: Informational +Content-Type: text/x-rst Created: 25-Mar-2002 Post-History: 02-Apr-2002 Replaces: 216 Abstract +======== - When plaintext hasn't been expressive enough for inline - documentation, Python programmers have sought out a format for - docstrings. This PEP proposes that the reStructuredText markup - [1]_ be adopted as a standard markup format for structured - plaintext documentation in Python docstrings, and for PEPs and - ancillary documents as well. reStructuredText is a rich and - extensible yet easy-to-read, what-you-see-is-what-you-get - plaintext markup syntax. +When plaintext hasn't been expressive enough for inline documentation, +Python programmers have sought out a format for docstrings. This PEP +proposes that the `reStructuredText markup`_ be adopted as a standard +markup format for structured plaintext documentation in Python +docstrings, and for PEPs and ancillary documents as well. +reStructuredText is a rich and extensible yet easy-to-read, +what-you-see-is-what-you-get plaintext markup syntax. - Only the low-level syntax of docstrings is addressed here. This - PEP is not concerned with docstring semantics or processing at all - (see PEP 256 for a "Roadmap to the Doctring PEPs"). Nor is it an - attempt to deprecate pure plaintext docstrings, which are always - going to be legitimate. 
The reStructuredText markup is an - alternative for those who want more expressive docstrings. +Only the low-level syntax of docstrings is addressed here. This PEP +is not concerned with docstring semantics or processing at all (see +PEP 256 for a "Roadmap to the Doctring PEPs"). Nor is it an attempt +to deprecate pure plaintext docstrings, which are always going to be +legitimate. The reStructuredText markup is an alternative for those +who want more expressive docstrings. Benefits +======== - Programmers are by nature a lazy breed. We reuse code with - functions, classes, modules, and subsystems. Through its - docstring syntax, Python allows us to document our code from - within. The "holy grail" of the Python Documentation Special - Interest Group (Doc-SIG) [2]_ has been a markup syntax and toolset - to allow auto-documentation, where the docstrings of Python - systems can be extracted in context and processed into useful, - high-quality documentation for multiple purposes. +Programmers are by nature a lazy breed. We reuse code with functions, +classes, modules, and subsystems. Through its docstring syntax, +Python allows us to document our code from within. The "holy grail" +of the Python Documentation Special Interest Group (Doc-SIG_) has been +a markup syntax and toolset to allow auto-documentation, where the +docstrings of Python systems can be extracted in context and processed +into useful, high-quality documentation for multiple purposes. - Document markup languages have three groups of customers: the - authors who write the documents, the software systems that process - the data, and the readers, who are the final consumers and the - most important group. Most markups are designed for the authors - and software systems; readers are only meant to see the processed - form, either on paper or via browser software. ReStructuredText - is different: it is intended to be easily readable in source form, - without prior knowledge of the markup. 
ReStructuredText is - entirely readable in plaintext format, and many of the markup - forms match common usage (e.g., ``*emphasis*``), so it reads quite - naturally. Yet it is rich enough to produce complex documents, - and extensible so that there are few limits. Of course, to write - reStructuredText documents some prior knowledge is required. +Document markup languages have three groups of customers: the authors +who write the documents, the software systems that process the data, +and the readers, who are the final consumers and the most important +group. Most markups are designed for the authors and software +systems; readers are only meant to see the processed form, either on +paper or via browser software. ReStructuredText is different: it is +intended to be easily readable in source form, without prior knowledge +of the markup. ReStructuredText is entirely readable in plaintext +format, and many of the markup forms match common usage (e.g., +``*emphasis*``), so it reads quite naturally. Yet it is rich enough +to produce complex documents, and extensible so that there are few +limits. Of course, to write reStructuredText documents some prior +knowledge is required. - The reStructuredText parser is available now. The Docutils - project is at the point where standalone reStructuredText - documents can be converted to HTML; other output format writers - will become available over time. Work is progressing on a Python - source "Reader" which will implement auto-documentation from - docstrings. Authors of existing auto-documentation tools are - encouraged to integrate the reStructuredText parser into their - projects, or better yet, to join forces to produce a world-class - toolset for the Python standard library. +The markup offers functionality and expressivity, while maintaining +easy readability in the source text. The processed form (HTML etc.) 
+makes it all accessible to readers: inline live hyperlinks; live links +to and from footnotes; automatic tables of contents (with live +links!); tables; images for diagrams etc.; pleasant, readable styled +text. - Tools will become available in the near future, which will allow - programmers to generate HTML for online help, XML for multiple - purposes, and eventually PDF, DocBook, and LaTeX for printed - documentation, essentially "for free" from the existing - docstrings. The adoption of a standard will, at the very least, - benefit docstring processing tools by preventing further - "reinventing the wheel". +The reStructuredText parser is available now, part of the Docutils_ +project. Standalone reStructuredText documents and PEPs can be +converted to HTML; other output format writers are being worked on and +will become available over time. Work is progressing on a Python +source "Reader" which will implement auto-documentation from +docstrings. Authors of existing auto-documentation tools are +encouraged to integrate the reStructuredText parser into their +projects, or better yet, to join forces to produce a world-class +toolset for the Python standard library. - Eventually PyDoc, the one existing standard auto-documentation - tool, could have reStructuredText support added. In the interim - it will have no problem with reStructuredText markup, since it - treats all docstrings as preformatted plaintext. +Tools will become available in the near future, which will allow +programmers to generate HTML for online help, XML for multiple +purposes, and eventually PDF, DocBook, and LaTeX for printed +documentation, essentially "for free" from the existing docstrings. +The adoption of a standard will, at the very least, benefit docstring +processing tools by preventing further "reinventing the wheel". + +Eventually PyDoc, the one existing standard auto-documentation tool, +could have reStructuredText support added. 
In the interim it will +have no problem with reStructuredText markup, since it treats all +docstrings as preformatted plaintext. Goals +===== - These are the generally accepted goals for a docstring format, as - discussed in the Doc-SIG: +These are the generally accepted goals for a docstring format, as +discussed in the Doc-SIG: - 1. It must be readable in source form by the casual observer. +1. It must be readable in source form by the casual observer. - 2. It must be easy to type with any standard text editor. +2. It must be easy to type with any standard text editor. - 3. It must not need to contain information which can be deduced - from parsing the module. +3. It must not need to contain information which can be deduced from + parsing the module. - 4. It must contain sufficient information (structure) so it can be - converted to any reasonable markup format. +4. It must contain sufficient information (structure) so it can be + converted to any reasonable markup format. - 5. It must be possible to write a module's entire documentation in - docstrings, without feeling hampered by the markup language. +5. It must be possible to write a module's entire documentation in + docstrings, without feeling hampered by the markup language. - reStructuredText meets and exceeds all of these goals, and sets - its own goals as well, even more stringent. See - "Docstring-Significant Features" below. +reStructuredText meets and exceeds all of these goals, and sets its +own goals as well, even more stringent. See `Docstring-Significant +Features`_ below. - The goals of this PEP are as follows: +The goals of this PEP are as follows: - 1. To establish reStructuredText as a standard structured - plaintext format for docstrings (inline documentation of Python - modules and packages), PEPs, README-type files and other - standalone documents. "Accepted" status will be sought through - Python community consensus and eventual BDFL pronouncement. +1. 
To establish reStructuredText as a standard structured plaintext + format for docstrings (inline documentation of Python modules and + packages), PEPs, README-type files and other standalone documents. + "Accepted" status will be sought through Python community consensus + and eventual BDFL pronouncement. - Please note that reStructuredText is being proposed as *a* - standard, not *the only* standard. Its use will be entirely - optional. Those who don't want to use it need not. + Please note that reStructuredText is being proposed as *a* + standard, not *the only* standard. Its use will be entirely + optional. Those who don't want to use it need not. - 2. To solicit and address any related concerns raised by the - Python community. +2. To solicit and address any related concerns raised by the Python + community. - 3. To encourage community support. As long as multiple competing - markups are out there, the development community remains - fractured. Once a standard exists, people will start to use - it, and momentum will inevitably gather. +3. To encourage community support. As long as multiple competing + markups are out there, the development community remains fractured. + Once a standard exists, people will start to use it, and momentum + will inevitably gather. - 4. To consolidate efforts from related auto-documentation - projects. It is hoped that interested developers will join - forces and work on a joint/merged/common implementation. +4. To consolidate efforts from related auto-documentation projects. + It is hoped that interested developers will join forces and work on + a joint/merged/common implementation. - Once reStructuredText is a Python standard, effort can be focused - on tools instead of arguing for a standard. Python needs a - standard set of documentation tools. +Once reStructuredText is a Python standard, effort can be focused on +tools instead of arguing for a standard. Python needs a standard set +of documentation tools. 
- With regard to PEPs, one or both of the following strategies may - be applied: +With regard to PEPs, one or both of the following strategies may be +applied: - a) Keep the existing PEP section structure constructs (one-line - section headers, indented body text). Subsections can either - be forbidden, or supported with reStructuredText-style - underlined headers in the indented body text. +a) Keep the existing PEP section structure constructs (one-line + section headers, indented body text). Subsections can either be + forbidden, or supported with reStructuredText-style underlined + headers in the indented body text. - b) Replace the PEP section structure constructs with the - reStructuredText syntax. Section headers will require - underlines, subsections will be supported out of the box, - and body text need not be indented (except for block - quotes). +b) Replace the PEP section structure constructs with the + reStructuredText syntax. Section headers will require underlines, + subsections will be supported out of the box, and body text need + not be indented (except for block quotes). - Support for RFC 2822 headers has been added to the - reStructuredText parser for PEPs (unambiguous given a specific - context: the first contiguous block of the document). It may be - desired to concretely specify what over/underline styles are - allowed for PEP section headers, for uniformity. +Strategy (b) is recommended, and its implementation is complete. + +Support for RFC 2822 headers has been added to the reStructuredText +parser for PEPs (unambiguous given a specific context: the first +contiguous block of the document). It may be desired to concretely +specify what over/underline styles are allowed for PEP section +headers, for uniformity. Rationale +========= - The lack of a standard syntax for docstrings has hampered the - development of standard tools for extracting and converting - docstrings into documentation in standard formats (e.g., HTML, - DocBook, TeX). 
There have been a number of proposed markup - formats and variations, and many tools tied to these proposals, - but without a standard docstring format they have failed to gain a - strong following and/or floundered half-finished. +The lack of a standard syntax for docstrings has hampered the +development of standard tools for extracting and converting docstrings +into documentation in standard formats (e.g., HTML, DocBook, TeX). +There have been a number of proposed markup formats and variations, +and many tools tied to these proposals, but without a standard +docstring format they have failed to gain a strong following and/or +floundered half-finished. - Throughout the existence of the Doc-SIG, consensus on a single - standard docstring format has never been reached. A lightweight, - implicit markup has been sought, for the following reasons (among - others): +Throughout the existence of the Doc-SIG, consensus on a single +standard docstring format has never been reached. A lightweight, +implicit markup has been sought, for the following reasons (among +others): - 1. Docstrings written within Python code are available from within - the interactive interpreter, and can be "print"ed. Thus the - use of plaintext for easy readability. +1. Docstrings written within Python code are available from within the + interactive interpreter, and can be "print"ed. Thus the use of + plaintext for easy readability. - 2. Programmers want to add structure to their docstrings, without - sacrificing raw docstring readability. Unadorned plaintext - cannot be transformed ("up-translated") into useful structured - formats. +2. Programmers want to add structure to their docstrings, without + sacrificing raw docstring readability. Unadorned plaintext cannot + be transformed ("up-translated") into useful structured formats. - 3. Explicit markup (like XML or TeX) is widely considered - unreadable by the uninitiated. +3. 
Explicit markup (like XML or TeX) is widely considered unreadable + by the uninitiated. - 4. Implicit markup is aesthetically compatible with the clean and - minimalist Python syntax. +4. Implicit markup is aesthetically compatible with the clean and + minimalist Python syntax. - Many alternative markups for docstrings have been proposed on the - Doc-SIG over the years; a representative sample is listed below. - Each is briefly analyzed in terms of the goals stated above. - Please note that this is *not* intended to be an exclusive list of - all existing markup systems; there are many other markups - (Texinfo, Doxygen, TIM, YODL, AFT, ...) which are not mentioned. +Many alternative markups for docstrings have been proposed on the +Doc-SIG over the years; a representative sample is listed below. Each +is briefly analyzed in terms of the goals stated above. Please note +that this is *not* intended to be an exclusive list of all existing +markup systems; there are many other markups (Texinfo, Doxygen, TIM, +YODL, AFT, ...) which are not mentioned. - - XML [3]_, SGML [4]_, DocBook [5]_, HTML [6]_, XHTML [7]_ +- XML_, SGML_, DocBook_, HTML_, XHTML_ - XML and SGML are explicit, well-formed meta-languages suitable - for all kinds of documentation. XML is a variant of SGML. They - are best used behind the scenes, because to untrained eyes they - are verbose, difficult to type, and too cluttered to read - comfortably as source. DocBook, HTML, and XHTML are all - applications of SGML and/or XML, and all share the same basic - syntax and the same shortcomings. + XML and SGML are explicit, well-formed meta-languages suitable for + all kinds of documentation. XML is a variant of SGML. They are + best used behind the scenes, because to untrained eyes they are + verbose, difficult to type, and too cluttered to read comfortably as + source. DocBook, HTML, and XHTML are all applications of SGML + and/or XML, and all share the same basic syntax and the same + shortcomings. 
- - TeX [8]_ +- TeX_ - TeX is similar to XML/SGML in that it's explicit, but not very - easy to write, and not easy for the uninitiated to read. + TeX is similar to XML/SGML in that it's explicit, but not very easy + to write, and not easy for the uninitiated to read. - - Perl POD [9]_ +- `Perl POD`_ - Most Perl modules are documented in a format called POD (Plain - Old Documentation). This is an easy-to-type, very low level - format with strong integration with the Perl parser. Many tools - exist to turn POD documentation into other formats: info, HTML - and man pages, among others. However, the POD syntax takes - after Perl itself in terms of readability. + Most Perl modules are documented in a format called POD (Plain Old + Documentation). This is an easy-to-type, very low level format with + strong integration with the Perl parser. Many tools exist to turn + POD documentation into other formats: info, HTML and man pages, + among others. However, the POD syntax takes after Perl itself in + terms of readability. - - JavaDoc [10]_ +- JavaDoc_ - Special comments before Java classes and functions serve to - document the code. A program to extract these, and turn them - into HTML documentation is called javadoc, and is part of the - standard Java distribution. However, JavaDoc has a very - intimate relationship with HTML, using HTML tags for most - markup. Thus it shares the readability problems of HTML. + Special comments before Java classes and functions serve to document + the code. A program to extract these, and turn them into HTML + documentation is called javadoc, and is part of the standard Java + distribution. However, JavaDoc has a very intimate relationship + with HTML, using HTML tags for most markup. Thus it shares the + readability problems of HTML. 
- - Setext [11]_, StructuredText [12]_ +- Setext_, StructuredText_ - Early on, variants of Setext (Structure Enhanced Text), - including Zope Corp's StructuredText, were proposed for Python - docstring formatting. Hereafter these variants will - collectively be called "STexts". STexts have the advantage of - being easy to read without special knowledge, and relatively - easy to write. + Early on, variants of Setext (Structure Enhanced Text), including + Zope Corp's StructuredText, were proposed for Python docstring + formatting. Hereafter these variants will collectively be called + "STexts". STexts have the advantage of being easy to read without + special knowledge, and relatively easy to write. - Although used by some (including in most existing Python - auto-documentation tools), until now STexts have failed to - become standard because: + Although used by some (including in most existing Python + auto-documentation tools), until now STexts have failed to become + standard because: - - STexts have been incomplete. Lacking "essential" constructs - that people want to use in their docstrings, STexts are - rendered less than ideal. Note that these "essential" - constructs are not universal; everyone has their own - requirements. + - STexts have been incomplete. Lacking "essential" constructs that + people want to use in their docstrings, STexts are rendered less + than ideal. Note that these "essential" constructs are not + universal; everyone has their own requirements. - - STexts have been sometimes surprising. Bits of text are - unexpectedly interpreted as being marked up, leading to user - frustration. + - STexts have been sometimes surprising. Bits of text are + unexpectedly interpreted as being marked up, leading to user + frustration. - - SText implementations have been buggy. + - SText implementations have been buggy. - - Most STexts have have had no formal specification except for - the implementation itself. 
A buggy implementation meant a - buggy spec, and vice-versa. + - Most STexts have have had no formal specification except for the + implementation itself. A buggy implementation meant a buggy spec, + and vice-versa. - - There has been no mechanism to get around the SText markup - rules when a markup character is used in a non-markup context. - In other words, no way to escape markup. + - There has been no mechanism to get around the SText markup rules + when a markup character is used in a non-markup context. In other + words, no way to escape markup. - Proponents of implicit STexts have vigorously opposed proposals - for explicit markup (XML, HTML, TeX, POD, etc.), and the debates - have continued off and on since 1996 or earlier. +Proponents of implicit STexts have vigorously opposed proposals for +explicit markup (XML, HTML, TeX, POD, etc.), and the debates have +continued off and on since 1996 or earlier. - reStructuredText is a complete revision and reinterpretation of - the SText idea, addressing all of the problems listed above. +reStructuredText is a complete revision and reinterpretation of the +SText idea, addressing all of the problems listed above. Specification +============= - The specification and user documentaton for reStructuredText is - quite extensive. Rather than repeating or summarizing it all - here, links to the originals are provided. +The specification and user documentaton for reStructuredText is +quite extensive. Rather than repeating or summarizing it all +here, links to the originals are provided. - Please first take a look at "A ReStructuredText Primer" [13]_, a - short and gentle introduction. The "Quick reStructuredText" user - reference [14]_ quickly summarizes all of the markup constructs. - For complete and extensive details, please refer to the following - documents: +Please first take a look at `A ReStructuredText Primer`_, a short and +gentle introduction. 
The `Quick reStructuredText`_ user reference +quickly summarizes all of the markup constructs. For complete and +extensive details, please refer to the following documents: - - An Introduction to reStructuredText [15]_ +- `An Introduction to reStructuredText`_ - - reStructuredText Markup Specification [16]_ +- `reStructuredText Markup Specification`_ - - reStructuredText Directives [17]_ +- `reStructuredText Directives`_ - In addition, "Problems With StructuredText" [18]_ explains many - markup decisions made with regards to StructuredText, and "A - Record of reStructuredText Syntax Alternatives" [19]_ records - markup decisions made independently. +In addition, `Problems With StructuredText`_ explains many markup +decisions made with regards to StructuredText, and `A Record of +reStructuredText Syntax Alternatives`_ records markup decisions made +independently. Docstring-Significant Features +============================== - - A markup escaping mechanism. +- A markup escaping mechanism. - Backslashes (``\``) are used to escape markup characters when - needed for non-markup purposes. However, the inline markup - recognition rules have been constructed in order to minimize the - need for backslash-escapes. For example, although asterisks are - used for *emphasis*, in non-markup contexts such as "*" or "(*)" - or "x * y", the asterisks are not interpreted as markup and are - left unchanged. For many non-markup uses of backslashes (e.g., - describing regular expressions), inline literals or literal - blocks are applicable; see the next item. + Backslashes (``\``) are used to escape markup characters when needed + for non-markup purposes. However, the inline markup recognition + rules have been constructed in order to minimize the need for + backslash-escapes. For example, although asterisks are used for + *emphasis*, in non-markup contexts such as "*" or "(*)" or "x * y", + the asterisks are not interpreted as markup and are left unchanged. 
+ For many non-markup uses of backslashes (e.g., describing regular + expressions), inline literals or literal blocks are applicable; see + the next item. - - Markup to include Python source code and Python interactive - sessions: inline literals, literal blocks, and doctest blocks. +- Markup to include Python source code and Python interactive + sessions: inline literals, literal blocks, and doctest blocks. - Inline literals use ``double-backquotes`` to indicate program - I/O or code snippets. No markup interpretation (including - backslash-escape [``\``] interpretation) is done within inline - literals. + Inline literals use ``double-backquotes`` to indicate program I/O or + code snippets. No markup interpretation (including backslash-escape + [``\``] interpretation) is done within inline literals. - Literal blocks (block-level literal text, such as code excerpts - or ASCII graphics) are indented, and indicated with a - double-colon ("::") at the end of the preceding paragraph (right - here -->):: + Literal blocks (block-level literal text, such as code excerpts or + ASCII graphics) are indented, and indicated with a double-colon + ("::") at the end of the preceding paragraph (right here -->):: - if literal_block: - text = 'is left as-is' - spaces_and_linebreaks = 'are preserved' - markup_processing = None + if literal_block: + text = 'is left as-is' + spaces_and_linebreaks = 'are preserved' + markup_processing = None - Doctest blocks begin with ">>> " and end with a blank line. - Neither indentation nor literal block double-colons are - required. For example:: + Doctest blocks begin with ">>> " and end with a blank line. Neither + indentation nor literal block double-colons are required. 
For + example:: - Here's a doctest block: + Here's a doctest block: - >>> print 'Python-specific usage examples; begun with ">>>"' - Python-specific usage examples; begun with ">>>" - >>> print '(cut and pasted from interactive sessions)' - (cut and pasted from interactive sessions) + >>> print 'Python-specific usage examples; begun with ">>>"' + Python-specific usage examples; begun with ">>>" + >>> print '(cut and pasted from interactive sessions)' + (cut and pasted from interactive sessions) - - Markup that isolates a Python identifier: interpreted text. +- Markup that isolates a Python identifier: interpreted text. - Text enclosed in single backquotes is recognized as "interpreted - text", whose interpretation is application-dependent. In the - context of a Python docstring, the default interpretation of - interpreted text is as Python identifiers. The text will be - marked up with a hyperlink connected to the documentation for - the identifier given. Lookup rules are the same as in Python - itself: LGB namespace lookups (local, global, builtin). The - "role" of the interpreted text (identifying a class, module, - function, etc.) is determined implicitly from the namespace - lookup. For example:: + Text enclosed in single backquotes is recognized as "interpreted + text", whose interpretation is application-dependent. In the + context of a Python docstring, the default interpretation of + interpreted text is as Python identifiers. The text will be marked + up with a hyperlink connected to the documentation for the + identifier given. Lookup rules are the same as in Python itself: + LGB namespace lookups (local, global, builtin). The "role" of the + interpreted text (identifying a class, module, function, etc.) is + determined implicitly from the namespace lookup. For example:: - class Keeper(Storer): + class Keeper(Storer): + """ + Keep data fresher longer. + + Extend `Storer`. 
Class attribute `instances` keeps track + of the number of `Keeper` objects instantiated. + """ + + instances = 0 + """How many `Keeper` objects are there?""" + + def __init__(self): """ - Keep data fresher longer. - - Extend `Storer`. Class attribute `instances` keeps track - of the number of `Keeper` objects instantiated. + Extend `Storer.__init__()` to keep track of + instances. Keep count in `self.instances` and data + in `self.data`. """ + Storer.__init__(self) + self.instances += 1 - instances = 0 - """How many `Keeper` objects are there?""" + self.data = [] + """Store data in a list, most recent last.""" - def __init__(self): - """ - Extend `Storer.__init__()` to keep track of - instances. Keep count in `self.instances` and data - in `self.data`. - """ - Storer.__init__(self) - self.instances += 1 + def storedata(self, data): + """ + Extend `Storer.storedata()`; append new `data` to a + list (in `self.data`). + """ + self.data = data - self.data = [] - """Store data in a list, most recent last.""" + Each piece of interpreted text is looked up according to the local + namespace of the block containing its docstring. - def storedata(self, data): - """ - Extend `Storer.storedata()`; append new `data` to a - list (in `self.data`). - """ - self.data = data +- Markup that isolates a Python identifier and specifies its type: + interpreted text with roles. - Each piece of interpreted text is looked up according to the - local namespace of the block containing its docstring. + Although the Python source context reader is designed not to require + explicit roles, they may be used. To classify identifiers + explicitly, the role is given along with the identifier in either + prefix or suffix form:: - - Markup that isolates a Python identifier and specifies its type: - interpreted text with roles. + Use :method:`Keeper.storedata` to store the object's data in + `Keeper.data`:instance_attribute:. 
- Although the Python source context reader is designed not to - require explicit roles, they may be used. To classify - identifiers explicitly, the role is given along with the - identifier in either prefix or suffix form:: + The syntax chosen for roles is verbose, but necessarily so (if + anyone has a better alternative, please post it to the Doc-SIG_). + The intention of the markup is that there should be little need to + use explicit roles; their use is to be kept to an absolute minimum. - Use :method:`Keeper.storedata` to store the object's data in - `Keeper.data`:instance_attribute:. +- Markup for "tagged lists" or "label lists": field lists. - The syntax chosen for roles is verbose, but necessarily so (if - anyone has a better alternative, please post it to the Doc-SIG). - The intention of the markup is that there should be little need - to use explicit roles; their use is to be kept to an absolute - minimum. + Field lists represent a mapping from field name to field body. + These are mostly used for extension syntax, such as "bibliographic + field lists" (representing document metadata such as author, date, + and version) and extension attributes for directives (see below). + They may be used to implement methodologies (docstring semantics), + such as identifying parameters, exceptions raised, etc.; such usage + is beyond the scope of this PEP. - - Markup for "tagged lists" or "label lists": field lists. + A modified RFC 2822 syntax is used, with a colon *before* as well as + *after* the field name. Field bodies are more versatile as well; + they may contain multiple field bodies (even nested field lists). + For example:: - Field lists represent a mapping from field name to field body. - These are mostly used for extension syntax, such as - "bibliographic field lists" (representing document metadata such - as author, date, and version) and extension attributes for - directives (see below). 
They may be used to implement - methodologies (docstring semantics), such as identifying - parameters, exceptions raised, etc.; such usage is beyond the - scope of this PEP. + :Date: 2002-03-22 + :Version: 1 + :Authors: + - Me + - Myself + - I - A modified RFC 2822 syntax is used, with a colon *before* as - well as *after* the field name. Field bodies are more versatile - as well; they may contain multiple field bodies (even nested - field lists). For example:: + Standard RFC 2822 header syntax cannot be used for this construct + because it is ambiguous. A word followed by a colon at the + beginning of a line is common in written text. - :Date: 2002-03-22 - :Version: 1 - :Authors: - - Me - - Myself - - I +- Markup extensibility: directives and substitutions. - Standard RFC 2822 header syntax cannot be used for this - construct because it is ambiguous. A word followed by a colon - at the beginning of a line is common in written text. + Directives are used as an extension mechanism for reStructuredText, + a way of adding support for new block-level constructs without + adding new syntax. Directives for images, admonitions (note, + caution, etc.), and tables of contents generation (among others) + have been implemented. For example, here's how to place an image:: - - Markup extensibility: directives and substitutions. + .. image:: mylogo.png - Directives are used as an extension mechanism for - reStructuredText, a way of adding support for new block-level - constructs without adding new syntax. Directives for images, - admonitions (note, caution, etc.), and tables of contents - generation (among others) have been implemented. For example, - here's how to place an image:: + Substitution definitions allow the power and flexibility of + block-level directives to be shared by inline text. For example:: - .. image:: mylogo.png + The |biohazard| symbol must be used on containers used to + dispose of medical waste. 
- Substitution definitions allow the power and flexibility of - block-level directives to be shared by inline text. For - example:: + .. |biohazard| image:: biohazard.png - The |biohazard| symbol must be used on containers used to - dispose of medical waste. +- Section structure markup. - .. |biohazard| image:: biohazard.png + Section headers in reStructuredText use adornment via underlines + (and possibly overlines) rather than indentation. For example:: - - Section structure markup. + This is a Section Title + ======================= - Section headers in reStructuredText use adornment via underlines - (and possibly overlines) rather than indentation. For example:: + This is a Subsection Title + -------------------------- - This is a Section Title - ======================= + This paragraph is in the subsection. - This is a Subsection Title - -------------------------- + This is Another Section Title + ============================= - This paragraph is in the subsection. - - This is Another Section Title - ============================= - - This paragraph is in the second section. + This paragraph is in the second section. Questions & Answers +=================== - 1. Is reStructuredText rich enough? +1. Is reStructuredText rich enough? - Yes, it is for most people. If it lacks some construct that is - required for a specific application, it can be added via the - directive mechanism. If a useful and common construct has been - overlooked and a suitably readable syntax can be found, it can - be added to the specification and parser. + Yes, it is for most people. If it lacks some construct that is + required for a specific application, it can be added via the + directive mechanism. If a useful and common construct has been + overlooked and a suitably readable syntax can be found, it can be + added to the specification and parser. - 2. Is reStructuredText *too* rich? +2. Is reStructuredText *too* rich? - For specific applications or individuals, perhaps. 
In general, - no. + For specific applications or individuals, perhaps. In general, no. - Since the very beginning, whenever a docstring markup syntax - has been proposed on the Doc-SIG, someone has complained about - the lack of support for some construct or other. The reply was - often something like, "These are docstrings we're talking - about, and docstrings shouldn't have complex markup." The - problem is that a construct that seems superfluous to one - person may be absolutely essential to another. + Since the very beginning, whenever a docstring markup syntax has + been proposed on the Doc-SIG_, someone has complained about the + lack of support for some construct or other. The reply was often + something like, "These are docstrings we're talking about, and + docstrings shouldn't have complex markup." The problem is that a + construct that seems superfluous to one person may be absolutely + essential to another. - reStructuredText takes the opposite approach: it provides a - rich set of implicit markup constructs (plus a generic - extension mechanism for explicit markup), allowing for all - kinds of documents. If the set of constructs is too rich for a - particular application, the unused constructs can either be - removed from the parser (via application-specific overrides) or - simply omitted by convention. + reStructuredText takes the opposite approach: it provides a rich + set of implicit markup constructs (plus a generic extension + mechanism for explicit markup), allowing for all kinds of + documents. If the set of constructs is too rich for a particular + application, the unused constructs can either be removed from the + parser (via application-specific overrides) or simply omitted by + convention. - 3. Why not use indentation for section structure, like - StructuredText does? Isn't it more "Pythonic"? +3. Why not use indentation for section structure, like StructuredText + does? Isn't it more "Pythonic"? 
- Guido van Rossum wrote the following in a 2001-06-13 Doc-SIG - post: + Guido van Rossum wrote the following in a 2001-06-13 Doc-SIG post: - I still think that using indentation to indicate sectioning - is wrong. If you look at how real books and other print - publications are laid out, you'll notice that indentation - is used frequently, but mostly at the intra-section level. - Indentation can be used to offset lists, tables, - quotations, examples, and the like. (The argument that - docstrings are different because they are input for a text - formatter is wrong: the whole point is that they are also - readable without processing.) + I still think that using indentation to indicate sectioning is + wrong. If you look at how real books and other print + publications are laid out, you'll notice that indentation is + used frequently, but mostly at the intra-section level. + Indentation can be used to offset lists, tables, quotations, + examples, and the like. (The argument that docstrings are + different because they are input for a text formatter is wrong: + the whole point is that they are also readable without + processing.) - I reject the argument that using indentation is Pythonic: - text is not code, and different traditions and conventions - hold. People have been presenting text for readability for - over 30 centuries. Let's not innovate needlessly. + I reject the argument that using indentation is Pythonic: text + is not code, and different traditions and conventions hold. + People have been presenting text for readability for over 30 + centuries. Let's not innovate needlessly. - See "Section Structure via Indentation" in "Problems With - StructuredText" [18 ]_ for further elaboration. + See `Section Structure via Indentation`__ in `Problems With + StructuredText`_ for further elaboration. - 4. Why use reStructuredText for PEPs? What's wrong with the - existing standard? 
+ __ http://docutils.sourceforge.net/spec/rst/problems.html + #section-structure-via-indentation - The existing standard for PEPs is very limited in terms of - general expressibility, and referencing is especially lacking - for such a reference-rich document type. PEPs are currently - converted into HTML, but the results (mostly monospaced text) - are less than attractive, and most of the value-added potential - of HTML (especially inline hyperlinks) is untapped. +4. Why use reStructuredText for PEPs? What's wrong with the existing + standard? - Making reStructuredText a standard markup for PEPs will enable - much richer expression, including support for section - structure, inline markup, graphics, and tables. In several - PEPs there are ASCII graphics diagrams, which are all that - plaintext documents can support. Since PEPs are made available - in HTML form, the ability to include proper diagrams would be - immediately useful. + The existing standard for PEPs is very limited in terms of general + expressibility, and referencing is especially lacking for such a + reference-rich document type. PEPs are currently converted into + HTML, but the results (mostly monospaced text) are less than + attractive, and most of the value-added potential of HTML + (especially inline hyperlinks) is untapped. - Current PEP practices allow for reference markers in the form - "[1]" in the text, and the footnotes/references themselves are - listed in a section toward the end of the document. There is - currently no hyperlinking between the reference marker and the - footnote/reference itself (it would be possible to add this to - pep2html.py, but the "markup" as it stands is ambiguous and - mistakes would be inevitable). A PEP with many references - (such as this one ;-) requires a lot of flipping back and - forth. When revising a PEP, often new references are added or - unused references deleted. 
It is painful to renumber the - references, since it has to be done in two places and can have - a cascading effect (insert a single new reference 1, and every - other reference has to be renumbered; always adding new - references to the end is suboptimal). It is easy for - references to go out of sync. + Making reStructuredText a standard markup for PEPs will enable much + richer expression, including support for section structure, inline + markup, graphics, and tables. In several PEPs there are ASCII + graphics diagrams, which are all that plaintext documents can + support. Since PEPs are made available in HTML form, the ability + to include proper diagrams would be immediately useful. - PEPs use references for two purposes: simple URL references and - footnotes. reStructuredText differentiates between the two. A - PEP might contain references like this:: + Current PEP practices allow for reference markers in the form "[1]" + in the text, and the footnotes/references themselves are listed in + a section toward the end of the document. There is currently no + hyperlinking between the reference marker and the + footnote/reference itself (it would be possible to add this to + pep2html.py, but the "markup" as it stands is ambiguous and + mistakes would be inevitable). A PEP with many references (such as + this one ;-) requires a lot of flipping back and forth. When + revising a PEP, often new references are added or unused references + deleted. It is painful to renumber the references, since it has to + be done in two places and can have a cascading effect (insert a + single new reference 1, and every other reference has to be + renumbered; always adding new references to the end is suboptimal). + It is easy for references to go out of sync. - Abstract + PEPs use references for two purposes: simple URL references and + footnotes. reStructuredText differentiates between the two. 
A PEP + might contain references like this:: - This PEP proposes adding frungible doodads [1] to the - core. It extends PEP 9876 [2] via the BCA [3] - mechanism. + Abstract - References and Footnotes + This PEP proposes adding frungible doodads [1] to the core. + It extends PEP 9876 [2] via the BCA [3] mechanism. - [1] http://www.example.org/ + ... - [2] PEP 9876, Let's Hope We Never Get Here - http://www.python.org/peps/pep-9876.html + References and Footnotes - [3] "Bogus Complexity Addition" + [1] http://www.example.org/ - Reference 1 is a simple URL reference. Reference 2 is a - footnote containing text and a URL. Reference 3 is a footnote - containing text only. Rewritten using reStructuredText, this - PEP could look like this:: + [2] PEP 9876, Let's Hope We Never Get Here + http://www.python.org/peps/pep-9876.html - Abstract - ======== + [3] "Bogus Complexity Addition" - This PEP proposes adding `frungible doodads`_ to the core. - It extends PEP 9876 [#pep9876]_ via the BCA [#]_ mechanism. + Reference 1 is a simple URL reference. Reference 2 is a footnote + containing text and a URL. Reference 3 is a footnote containing + text only. Rewritten using reStructuredText, this PEP could look + like this:: - ... - - References & Footnotes - ====================== + Abstract + ======== - .. _frungible doodads: http://www.example.org/ + This PEP proposes adding `frungible doodads`_ to the core. It + extends PEP 9876 [#pep9876]_ via the BCA [#]_ mechanism. - .. [#pep9876] PEP 9876, Let's Hope We Never Get Here + ... - .. [#] "Bogus Complexity Addition" + References & Footnotes + ====================== - URLs and footnotes can be defined close to their references if - desired, making them easier to read in the source text, and - making the PEPs easier to revise. The "References and - Footnotes" section can be auto-generated with a document tree - transform. Footnotes from throughout the PEP would be gathered - and displayed under a standard header. 
If URL references - should likewise be written out explicitly (in citation form), - another tree transform could be used. + .. _frungible doodads: http://www.example.org/ - URL references can be named ("frungible doodads"), and can be - referenced from multiple places in the document without - additional definitions. When converted to HTML, references - will be replaced with inline hyperlinks (HTML <A> tags). The - two footnotes are automatically numbered, so they will always - stay in sync. The first footnote also contains an internal - reference name, "pep9876", so it's easier to see the connection - between reference and footnote in the source text. Named - footnotes can be referenced multiple times, maintaining - consistent numbering. + .. [#pep9876] PEP 9876, Let's Hope We Never Get Here - The "#pep9876" footnote could also be written in the form of a - citation:: + .. [#] "Bogus Complexity Addition" - It extends PEP 9876 [PEP9876]_ ... + URLs and footnotes can be defined close to their references if + desired, making them easier to read in the source text, and making + the PEPs easier to revise. The "References and Footnotes" section + can be auto-generated with a document tree transform. Footnotes + from throughout the PEP would be gathered and displayed under a + standard header. If URL references should likewise be written out + explicitly (in citation form), another tree transform could be + used. - .. [PEP9876] PEP 9876, Let's Hope We Never Get Here + URL references can be named ("frungible doodads"), and can be + referenced from multiple places in the document without additional + definitions. When converted to HTML, references will be replaced + with inline hyperlinks (HTML <A> tags). The two footnotes are + automatically numbered, so they will always stay in sync. The + first footnote also contains an internal reference name, "pep9876", + so it's easier to see the connection between reference and footnote + in the source text.
Named footnotes can be referenced multiple + times, maintaining consistent numbering. - Footnotes are numbered, whereas citations use text for their - references. + The "#pep9876" footnote could also be written in the form of a + citation:: - 5. Wouldn't it be better to keep the docstring and PEP proposals - separate? + It extends PEP 9876 [PEP9876]_ ... - The PEP markup proposal may be removed if it is deemed that - there is no need for PEP markup, or it could be made into a - separate PEP. If accepted, PEP 1, PEP Purpose and Guidelines - [20]_, and PEP 9, Sample PEP Template [21]_ will be updated. + .. [PEP9876] PEP 9876, Let's Hope We Never Get Here - It seems natural to adopt a single consistent markup standard - for all uses of structured plaintext in Python, and to propose - it all in one place. + Footnotes are numbered, whereas citations use text for their + references. - 6. The existing pep2html.py script converts the existing PEP - format to HTML. How will the new-format PEPs be converted to - HTML? +5. Wouldn't it be better to keep the docstring and PEP proposals + separate? - One of the deliverables of the Docutils project [22]_ will be a - new version of pep2html.py with integrated reStructuredText - parsing. The Docutils project will support PEPs with a "PEP - Reader" component, including all functionality currently in - pep2html.py (auto-recognition of PEP & RFC references). + The PEP markup proposal may be removed if it is deemed that there + is no need for PEP markup, or it could be made into a separate PEP. + If accepted, PEP 1, PEP Purpose and Guidelines [#PEP-1]_, and PEP + 9, Sample PEP Template [#PEP-9]_ will be updated. - 7. Who's going to convert the existing PEPs to reStructuredText? + It seems natural to adopt a single consistent markup standard for + all uses of structured plaintext in Python, and to propose it all + in one place. - PEP authors or volunteers may convert existing PEPs if they - like, but there is no requirement to do so. 
The - reStructuredText-based PEPs will coexist with the old PEP - standard. The pep2html.py mentioned in answer 6 will process - both old and new standards. +6. The existing pep2html.py script converts the existing PEP format to + HTML. How will the new-format PEPs be converted to HTML? - 8. Why use reStructuredText for README and other ancillary files? + A new version of pep2html.py with integrated reStructuredText + parsing has been completed. The Docutils project supports PEPs + with a "PEP Reader" component, including all functionality + currently in pep2html.py (auto-recognition of PEP & RFC references, + email masking, etc.). - The reasoning given for PEPs in answer 4 above also applies to - README and other ancillary files. By adopting a standard - markup, these files can be converted to attractive - cross-referenced HTML and put up on python.org. Developers of - Python projects can also take advantage of this facility for - their own documentation. +7. Who's going to convert the existing PEPs to reStructuredText? - 9. Won't the superficial similarity to existing markup conventions - cause problems, and result in people writing invalid markup - (and not noticing, because the plaintext looks natural)? How - forgiving is reStructuredText of "not quite right" markup? + PEP authors or volunteers may convert existing PEPs if they like, + but there is no requirement to do so. The reStructuredText-based + PEPs will coexist with the old PEP standard. The pep2html.py + mentioned in answer 6 processes both old and new standards. - There will be some mis-steps, as there would be when moving - from one programming language to another. As with any - language, proficiency grows with experience. Luckily, - reStructuredText is a very little language indeed. +8. Why use reStructuredText for README and other ancillary files? - As with any syntax, there is the possibility of syntax errors. 
- It is expected that a user will run the processing system over - their input and check the output for correctness. + The reasoning given for PEPs in answer 4 above also applies to + README and other ancillary files. By adopting a standard markup, + these files can be converted to attractive cross-referenced HTML + and put up on python.org. Developers of other projects can also + take advantage of this facility for their own documentation. - In a strict sense, the reStructuredText parser is very - unforgiving (as it should be; "In the face of ambiguity, refuse - the temptation to guess" [23]_ applies to parsing markup as - well as computer languages). Here's design goal 3 from "An - Introduction to reStructuredText" [15 ]_: +9. Won't the superficial similarity to existing markup conventions + cause problems, and result in people writing invalid markup (and + not noticing, because the plaintext looks natural)? How forgiving + is reStructuredText of "not quite right" markup? - Unambiguous. The rules for markup must not be open for - interpretation. For any given input, there should be one - and only one possible output (including error output). + There will be some mis-steps, as there would be when moving from + one programming language to another. As with any language, + proficiency grows with experience. Luckily, reStructuredText is a + very little language indeed. - While unforgiving, at the same time the parser does try to be - helpful by producing useful diagnostic output ("system - messages"). The parser reports problems, indicating their - level of severity (from least to most: debug, info, warning, - error, severe). The user or the client software can decide on - reporting thresholds; they can ignore low-level problems or - cause high-level problems to bring processing to an immediate - halt. 
Problems are reported during the parse as well as - included in the output, often with two-way links between the - source of the problem and the system message explaining it. + As with any syntax, there is the possibility of syntax errors. It + is expected that a user will run the processing system over their + input and check the output for correctness. - 10. Will the docstrings in the Python standard library modules be - converted to reStructuredText? + In a strict sense, the reStructuredText parser is very unforgiving + (as it should be; "In the face of ambiguity, refuse the temptation + to guess" [#Zen]_ applies to parsing markup as well as computer + languages). Here's design goal 3 from `An Introduction to + reStructuredText`_: - No. Python's library reference documentation is maintained - separately from the source. Docstrings in the Python standard - library should not try to duplicate the library reference - documentation. The current policy for docstrings in the - Python standard library is that they should be no more than - concise hints, simple and markup-free (although many *do* - contain ad-hoc implicit markup). + Unambiguous. The rules for markup must not be open for + interpretation. For any given input, there should be one and + only one possible output (including error output). - 11. I want to write all my strings in Unicode. Will anything - break? + While unforgiving, at the same time the parser does try to be + helpful by producing useful diagnostic output ("system messages"). + The parser reports problems, indicating their level of severity + (from least to most: debug, info, warning, error, severe). The + user or the client software can decide on reporting thresholds; + they can ignore low-level problems or cause high-level problems to + bring processing to an immediate halt. 
Problems are reported + during the parse as well as included in the output, often with + two-way links between the source of the problem and the system + message explaining it. - The parser fully supports Unicode. Docutils supports - arbitrary input encodings. +10. Will the docstrings in the Python standard library modules be + converted to reStructuredText? - 12. Why does the community need a new structured text design? + No. Python's library reference documentation is maintained + separately from the source. Docstrings in the Python standard + library should not try to duplicate the library reference + documentation. The current policy for docstrings in the Python + standard library is that they should be no more than concise + hints, simple and markup-free (although many *do* contain ad-hoc + implicit markup). - The existing structured text designs are deficient, for the - reasons given in "Rationale" above. reStructuredText aims to - be a complete markup syntax, within the limitations of the - "readable plaintext" medium. +11. I want to write all my strings in Unicode. Will anything + break? - 13. What is wrong with existing documentation methodologies? + The parser fully supports Unicode. Docutils supports arbitrary + input and output encodings. - What existing methodologies? For Python docstrings, there is - **no** official standard markup format, let alone a - documentation methodology, akin to JavaDoc. The question of - methodology is at a much higher level than syntax (which this - PEP addresses). It is potentially much more controversial and - difficult to resolve, and is intentionally left out of this - discussion. +12. Why does the community need a new structured text design? + + The existing structured text designs are deficient, for the + reasons given in "Rationale" above. reStructuredText aims to be a + complete markup syntax, within the limitations of the "readable + plaintext" medium. + +13. What is wrong with existing documentation methodologies? 
+ + What existing methodologies? For Python docstrings, there is + **no** official standard markup format, let alone a documentation + methodology akin to JavaDoc. The question of methodology is at a + much higher level than syntax (which this PEP addresses). It is + potentially much more controversial and difficult to resolve, and + is intentionally left out of this discussion. References & Footnotes +====================== - [1] http://docutils.sourceforge.net/spec/rst.html +.. [#PEP-1] PEP 1, PEP Guidelines, Warsaw, Hylton + (http://www.python.org/peps/pep-0001.html) - [2] http://www.python.org/sigs/doc-sig/ +.. [#PEP-9] PEP 9, Sample PEP Template, Warsaw + (http://www.python.org/peps/pep-0009.html) - [3] http://www.w3.org/XML/ +.. [#Zen] From `The Zen of Python (by Tim Peters)`__ (or just + "``import this``" in Python) - [4] http://www.oasis-open.org/cover/general.html +__ http://www.python.org/doc/Humor.html#zen - [5] http://docbook.org/tdg/en/html/docbook.html +.. [#PEP-216] PEP 216, Docstring Format, Zadka + (http://www.python.org/peps/pep-0216.html) - [6] http://www.w3.org/MarkUp/ +.. _reStructuredText markup: http://docutils.sourceforge.net/spec/rst.html - [7] http://www.w3.org/MarkUp/#xhtml1 +.. _Doc-SIG: http://www.python.org/sigs/doc-sig/ - [8] http://www.tug.org/interest.html +.. _XML: http://www.w3.org/XML/ - [9] http://www.perldoc.com/perl5.6/pod/perlpod.html +.. _SGML: http://www.oasis-open.org/cover/general.html - [10] http://java.sun.com/j2se/javadoc/ +.. _DocBook: http://docbook.org/tdg/en/html/docbook.html - [11] http://docutils.sourceforge.net/mirror/setext.html +.. _HTML: http://www.w3.org/MarkUp/ - [12] http://dev.zope.org/Members/jim/StructuredTextWiki/FrontPage +.. _XHTML: http://www.w3.org/MarkUp/#xhtml1 - [13] A ReStructuredText Primer - http://docutils.sourceforge.net/docs/rst/quickstart.html +.. _TeX: http://www.tug.org/interest.html - [14] Quick reStructuredText - http://docutils.sourceforge.net/docs/rst/quickref.html +.. 
_Perl POD: http://www.perldoc.com/perl5.6/pod/perlpod.html - [15] An Introduction to reStructuredText - http://docutils.sourceforge.net/spec/rst/introduction.html +.. _JavaDoc: http://java.sun.com/j2se/javadoc/ - [16] reStructuredText Markup Specification - http://docutils.sourceforge.net/spec/rst/reStructuredText.html +.. _Setext: http://docutils.sourceforge.net/mirror/setext.html - [17] reStructuredText Directives - http://docutils.sourceforge.net/spec/rst/directives.html +.. _StructuredText: + http://dev.zope.org/Members/jim/StructuredTextWiki/FrontPage - [18] Problems with StructuredText - http://docutils.sourceforge.net/spec/rst/problems.html +.. _A ReStructuredText Primer: + http://docutils.sourceforge.net/docs/rst/quickstart.html - [19] A Record of reStructuredText Syntax Alternatives - http://docutils.sourceforge.net/spec/rst/alternatives.html +.. _Quick reStructuredText: + http://docutils.sourceforge.net/docs/rst/quickref.html - [20] PEP 1, PEP Guidelines, Warsaw, Hylton - http://www.python.org/peps/pep-0001.html +.. _An Introduction to reStructuredText: + http://docutils.sourceforge.net/spec/rst/introduction.html - [21] PEP 9, Sample PEP Template, Warsaw - http://www.python.org/peps/pep-0009.html +.. _reStructuredText Markup Specification: + http://docutils.sourceforge.net/spec/rst/reStructuredText.html - [22] http://docutils.sourceforge.net/ +.. _reStructuredText Directives: + http://docutils.sourceforge.net/spec/rst/directives.html - [23] From "The Zen of Python (by Tim Peters)" - (http://www.python.org/doc/Humor.html#zen or just - "``import this``" in Python) +.. _Problems with StructuredText: + http://docutils.sourceforge.net/spec/rst/problems.html - [24] PEP 216, Docstring Format, Zadka - http://www.python.org/peps/pep-0216.html +.. _A Record of reStructuredText Syntax Alternatives: + http://docutils.sourceforge.net/spec/rst/alternatives.html + +.. 
_Docutils: http://docutils.sourceforge.net/ Copyright +========= - This document has been placed in the public domain. +This document has been placed in the public domain. Acknowledgements +================ - Some text is borrowed from PEP 216, Docstring Format [24]_, by - Moshe Zadka. +Some text is borrowed from PEP 216, Docstring Format [#PEP-216]_, by +Moshe Zadka. - Special thanks to all members past & present of the Python - Doc-SIG. +Special thanks to all members past & present of the Python Doc-SIG_. -Local Variables: -mode: indented-text -indent-tabs-mode: nil -sentence-end-double-space: t -fill-column: 70 -End: +.. + Local Variables: + mode: indented-text + indent-tabs-mode: nil + sentence-end-double-space: t + fill-column: 70 + End: