From 18550ff27a6e8f4593bcdd44b7e7ddcc216d2c01 Mon Sep 17 00:00:00 2001 From: Barry Warsaw Date: Mon, 1 Apr 2002 15:57:27 +0000 Subject: [PATCH] PEP 287, reStructuredText Standard Docstring Format, Goodger Replaces PEP 216, Zadka --- pep-0000.txt | 2 + pep-0216.txt | 5 +- pep-0287.txt | 679 +++++++++++++++++++++++++++++++++++++++++++++++++++ 3 files changed, 684 insertions(+), 2 deletions(-) create mode 100644 pep-0287.txt diff --git a/pep-0000.txt b/pep-0000.txt index 5829d4857..26dcb30f0 100644 --- a/pep-0000.txt +++ b/pep-0000.txt @@ -93,6 +93,7 @@ Index by Category S 284 Integer for-loops Eppstein, Ewing S 285 Adding a bool type van Rossum S 286 Enhanced Argument Tuples von Loewis + S 287 reStructuredText Standard Docstring Format Goodger Finished PEPs (done, implemented in CVS) @@ -261,6 +262,7 @@ Numerical Index S 284 Integer for-loops Eppstein, Ewing S 285 Adding a bool type van Rossum S 286 Enhanced Argument Tuples von Loewis + S 287 reStructuredText Standard Docstring Format Goodger SR 666 Reject Foolish Indentation Creighton diff --git a/pep-0216.txt b/pep-0216.txt index 4d6602c04..44f78d3b8 100644 --- a/pep-0216.txt +++ b/pep-0216.txt @@ -5,11 +5,12 @@ Author: moshez@zadka.site.co.il (Moshe Zadka) Status: Rejected Type: Informational Created: 31-Jul-2000 +Replaced-By: PEP 287 Notice - This PEP is rejected by the author. It will be superseded by a - new PEP, at which time this notice will be updated. + This PEP is rejected by the author. It has been superseded by PEP + 287. 
Abstract diff --git a/pep-0287.txt b/pep-0287.txt new file mode 100644 index 000000000..f8c19dac9 --- /dev/null +++ b/pep-0287.txt @@ -0,0 +1,679 @@ +PEP: 287 +Title: reStructuredText Standard Docstring Format +Version: $Revision$ +Last-Modified: $Date$ +Author: goodger@users.sourceforge.net (David Goodger) +Discussions-To: doc-sig@python.org +Status: Draft +Type: Informational +Created: 25-Mar-2002 +Post-History: +Replaces: PEP 216 + + +Abstract + + This PEP proposes that the reStructuredText [1]_ markup be adopted + as the standard markup format for plaintext documentation in + Python docstrings, and (optionally) for PEPs and ancillary + documents as well. reStructuredText is a rich and extensible yet + easy-to-read, what-you-see-is-what-you-get plaintext markup + syntax. + + Only the low-level syntax of docstrings is addressed here. This + PEP is not concerned with docstring semantics or processing at + all. + + +Goals + + These are the generally accepted goals for a docstring format, as + discussed in the Python Documentation Special Interest Group + (Doc-SIG) [2]_: + + 1. It must be easy to type with any standard text editor. + + 2. It must be readable to the casual observer. + + 3. It must not need to contain information which can be deduced + from parsing the module. + + 4. It must contain sufficient information (structure) so it can be + converted to any reasonable markup format. + + 5. It must be possible to write a module's entire documentation in + docstrings, without feeling hampered by the markup language. + + [[Are these in fact the goals of the Doc-SIG members? Anything to + add?]] + + reStructuredText meets and exceeds all of these goals, and sets + its own goals as well, even more stringent. See "Features" below. + + The goals of this PEP are as follows: + + 1. To establish a standard docstring format by attaining + "accepted" status (Python community consensus; BDFL + pronouncement). 
Once reStructuredText is a Python standard, + all effort can be focused on tools instead of arguing for a + standard. Python needs a standard set of documentation tools. + + 2. To address any related concerns raised by the Python community. + + 3. To encourage community support. As long as multiple competing + markups are out there, the development community remains + fractured. Once a standard exists, people will start to use + it, and momentum will inevitably gather. + + 4. To consolidate efforts from related auto-documentation + projects. It is hoped that interested developers will join + forces and work on a joint/merged/common implementation. + + 5. (Optional.) To adopt reStructuredText as the standard markup + for PEPs. One or both of the following strategies may be + applied: + + a) Keep the existing PEP section structure constructs (one-line + section headers, indented body text). Subsections can + either be forbidden or supported with underlined headers in + the indented body text. + + b) Replace the PEP section structure constructs with the + reStructuredText syntax. Section headers will require + underlines, subsections will be supported out of the box, + and body text need not be indented (except for block + quotes). + + Support for RFC 2822 headers will be added to the + reStructuredText parser (unambiguous given a specific context: + the first contiguous block of a PEP document). It may be + desired to concretely specify what over/underline styles are + allowed for PEP section headers, for uniformity. + + 6. (Optional.) To adopt reStructuredText as the standard markup + for README-type files and other standalone documents in the + Python distribution. + + +Rationale + + The __doc__ attribute is called a documentation string, or + docstring. It is often used to summarize the interface of the + module, class or function. 
The lack of a standard syntax for + docstrings has hampered the development of standard tools for + extracting docstrings and transforming them into documentation in + standard formats (e.g., HTML, DocBook, TeX). There have been a + number of proposed markup formats and variations, and many tools + tied to these proposals, but without a standard docstring format + they have failed to gain a strong following and/or floundered + half-finished. + + The adoption of a standard will, at the very least, benefit + docstring processing tools by preventing further "reinventing the + wheel". + + Throughout the existence of the Doc-SIG, consensus on a single + standard docstring format has never been reached. A lightweight, + implicit markup has been sought, for the following reasons (among + others): + + 1. Docstrings written within Python code are available from within + the interactive interpreter, and can be 'print'ed. Thus the + use of plaintext for easy readability. + + 2. Programmers want to add structure to their docstrings, without + sacrificing raw docstring readability. Unadorned plaintext + cannot be transformed ('up-translated') into useful structured + formats. + + 3. Explicit markup (like XML or TeX) is widely considered + unreadable by the uninitiated. + + 4. Implicit markup is aesthetically compatible with the clean and + minimalist Python syntax. + + Proposed alternatives have included: + + - XML [3]_, SGML [4]_, DocBook [5]_, HTML [6]_, XHTML [7]_ + + XML and SGML are explicit, well-formed meta-languages suitable + for all kinds of documentation. XML is a variant of SGML. They + are best used behind the scenes, because they are verbose, + difficult to type, and too cluttered to read comfortably as + source. DocBook, HTML, and XHTML are all applications of SGML + and/or XML, and all share the same basic syntax and the same + shortcomings. 
+ + - TeX [8]_ + + TeX is similar to XML/SGML in that it's explicit, not very easy + to write, and not easy for the uninitiated to read. + + - Perl POD [9]_ + + Most Perl modules are documented in a format called POD -- Plain + Old Documentation. This is an easy-to-type, very low level + format with strong integration with the Perl parser. Many tools + exist to turn POD documentation into other formats: info, HTML + and man pages, among others. However, the POD syntax takes + after Perl itself in terms of readability. + + - JavaDoc [10]_ + + Special comments before Java classes and functions serve to + document the code. A program to extract these, and turn them + into HTML documentation is called javadoc, and is part of the + standard Java distribution. However, the only output format + that is supported is HTML, and JavaDoc has a very intimate + relationship with HTML, using HTML tags for most markup. Thus + it shares the readability problems of HTML. + + - Setext [11]_, StructuredText [12]_ + + Early on, variants of Setext (Structure Enhanced Text), + including Zope Corp's StructuredText, were proposed for Python + docstring formatting. Hereafter these variants will + collectively be called 'STexts'. STexts have the advantage of + being easy to read without special knowledge, and relatively + easy to write. + + Although used by some (including in most existing Python + auto-documentation tools), until now STexts have failed to + become standard because: + + - STexts have been incomplete. Lacking "essential" constructs + that people want to use in their docstrings, STexts are + rendered less than ideal. Note that these "essential" + constructs are not universal; everyone has their own + requirements. + + - STexts have been sometimes surprising. Bits of text are + marked up unexpectedly, leading to user frustration. + + - SText implementations have been buggy. + + - Most STexts have had no formal specification except for + the implementation itself. 
A buggy implementation meant a + buggy spec, and vice-versa. + + - There has been no mechanism to get around the SText markup + rules when a markup character is used in a non-markup context. + + Proponents of implicit STexts have vigorously opposed proposals + for explicit markup (XML, HTML, TeX, POD, etc.), and the debates + have continued off and on since 1996 or earlier. + + reStructuredText is a complete revision and reinterpretation of + the SText idea, addressing all of the problems listed above. + + +Features + + Rather than repeating or summarizing the extensive + reStructuredText spec, please read the originals available from + http://structuredtext.sourceforge.net/spec/ (.txt & .html files). + Reading the documents in the following order is recommended: + + - An Introduction to reStructuredText [13]_ + + - Problems With StructuredText [14]_ (optional, if you've used + StructuredText; it explains many markup decisions made) + + - reStructuredText Markup Specification [15]_ + + - A Record of reStructuredText Syntax Alternatives [16]_ (explains + markup decisions made independently of StructuredText) + + - reStructuredText Directives [17]_ + + There is also a "Quick reStructuredText" user reference [18]_. + + A summary of features addressing often-raised docstring markup + concerns follows: + + - A markup escaping mechanism. + + Backslashes (``\``) are used to escape markup characters when + needed for non-markup purposes. However, the inline markup + recognition rules have been constructed in order to minimize the + need for backslash-escapes. For example, although asterisks are + used for *emphasis*, in non-markup contexts such as "*" or "(*)" + or "x * y", the asterisks are not interpreted as markup and are + left unchanged. For many non-markup uses of backslashes (e.g., + describing regular expressions), inline literals or literal + blocks are applicable; see the next item. 
+ + - Markup to include Python source code and Python interactive + sessions: inline literals, literal blocks, and doctest blocks. + + Inline literals use ``double-backquotes`` to indicate program + I/O or code snippets. No markup interpretation (including + backslash-escape [``\``] interpretation) is done within inline + literals. + + Literal blocks (block-level literal text, such as code excerpts + or ASCII graphics) are indented, and indicated with a + double-colon ("::") at the end of the preceding paragraph (right + here -->):: + + if literal_block: + text = 'is left as-is' + spaces_and_linebreaks = 'are preserved' + markup_processing = None + + Doctest blocks begin with ">>> " and end with a blank line. + Neither indentation nor literal block double-colons are + required. For example:: + + Here's a doctest block: + + >>> print 'Python-specific usage examples; begun with ">>>"' + Python-specific usage examples; begun with ">>>" + >>> print '(cut and pasted from interactive sessions)' + (cut and pasted from interactive sessions) + + - Markup that isolates a Python identifier: interpreted text. + + Text enclosed in single backquotes is recognized as "interpreted + text", whose interpretation is application-dependent. In the + context of a Python docstring, the default interpretation of + interpreted text is as Python identifiers. The text will be + marked up with a hyperlink connected to the documentation for + the identifier given. Lookup rules are the same as in Python + itself: LGB namespace lookups (local, global, builtin). The + "role" of the interpreted text (identifying a class, module, + function, etc.) is determined implicitly from the namespace + lookup. For example:: + + class Keeper(Storer): + + """ + Extend `Storer`. Class attribute `instances` keeps track + of the number of `Keeper` objects instantiated. 
+ """ + + instances = 0 + """How many `Keeper` objects are there?""" + + def __init__(self): + """ + Extend `Storer.__init__()` to keep track of + instances. Keep count in `self.instances` and data + in `self.data`. + """ + Storer.__init__(self) + self.instances += 1 + + self.data = [] + """Store data in a list, most recent last.""" + + def storedata(self, data): + """ + Extend `Storer.storedata()`; append new `data` to a + list (in `self.data`). + """ + self.data.append(data) + + Each piece of interpreted text is looked up according to the + local namespace of the block containing its docstring. + + - Markup that isolates a Python identifier and specifies its type: + interpreted text with roles. + + Although the Python source context reader is designed not to + require explicit roles, they may be used. To classify + identifiers explicitly, the role is given along with the + identifier in either prefix or suffix form:: + + Use :method:`Keeper.storedata` to store the object's data in + `Keeper.data`:instance_attribute:. + + The syntax chosen for roles is verbose, but necessarily so (if + anyone has a better alternative, please post it to the Doc-SIG). + The intention of the markup is that there should be little need + to use explicit roles; their use is to be kept to an absolute + minimum. + + - Markup for "tagged lists" or "label lists": field lists. + + Field lists represent a mapping from field name to field body. + These are mostly used for extension syntax, such as + "bibliographic field lists" (representing document metadata such + as author, date, and version) and extension attributes for + directives (see below). They may be used to implement docstring + semantics, such as identifying parameters, exceptions raised, + etc.; such usage is beyond the scope of this PEP. + + A modified RFC 2822 syntax is used, with a colon *before* as + well as *after* the field name. 
Field bodies are more versatile + as well; they may contain multiple field bodies (even nested + field lists). For example:: + + :Date: 2002-03-22 + :Version: 1 + :Authors: + - Me + - Myself + - I + + Standard RFC 2822 header syntax cannot be used for this + construct because it is ambiguous. A word followed by a colon + at the beginning of a line is common in written text. However, + with the addition of a well-defined context, such as when a + field list invariably occurs at the beginning of a document + (e.g., PEPs and email messages), standard RFC 2822 header syntax + can be used. + + - Markup extensibility: directives and substitutions. + + Directives are used as an extension mechanism for + reStructuredText, a way of adding support for new block-level + constructs without adding new syntax. Directives for images, + admonitions (note, caution, etc.), and tables of contents + generation (among others) have been implemented. For example, + here's how to place an image:: + + .. image:: mylogo.png + + Substitution definitions allow the power and flexibility of + block-level directives to be shared by inline text. For + example:: + + The |biohazard| symbol must be used on containers used to + dispose of medical waste. + + .. |biohazard| image:: biohazard.png + + - Section structure markup. + + Section headers in reStructuredText use adornment via underlines + (and possibly overlines) rather than indentation. For example:: + + This is a Section Title + ======================= + + This is a Subsection Title + -------------------------- + + This paragraph is in the subsection. + + This is Another Section Title + ============================= + + This paragraph is in the second section. + + +Questions & Answers + + Q: Is reStructuredText rich enough? + + A: Yes, it is for most people. If it lacks some construct that is + required for a specific application, it can be added via the + directive mechanism. 
If a common construct has been + overlooked and a suitably readable syntax can be found, it can + be added to the specification and parser. + + Q: Is reStructuredText *too* rich? + + A: No. + + Since the very beginning, whenever a markup syntax has been + proposed on the Doc-SIG, someone has complained about the lack + of support for some construct or other. The reply was often + something like, "These are docstrings we're talking about, and + docstrings shouldn't have complex markup." The problem is that + a construct that seems superfluous to one person may be + absolutely essential to another. + + reStructuredText takes the opposite approach: it provides a + rich set of implicit markup constructs (plus a generic + extension mechanism for explicit markup), allowing for all + kinds of documents. If the set of constructs is too rich for a + particular application, the unused constructs can either be + removed from the parser (via application-specific overrides) or + simply omitted by convention. + + Q: Why not use indentation for section structure, like + StructuredText does? Isn't it more "Pythonic"? + + A: Guido van Rossum wrote the following in a 2001-06-13 Doc-SIG + post: + + I still think that using indentation to indicate sectioning + is wrong. If you look at how real books and other print + publications are laid out, you'll notice that indentation + is used frequently, but mostly at the intra-section level. + Indentation can be used to offset lists, tables, + quotations, examples, and the like. (The argument that + docstrings are different because they are input for a text + formatter is wrong: the whole point is that they are also + readable without processing.) + + I reject the argument that using indentation is Pythonic: + text is not code, and different traditions and conventions + hold. People have been presenting text for readability for + over 30 centuries. Let's not innovate needlessly. 
+ + See "Section Structure via Indentation" in "Problems With + StructuredText" [14]_ for further elaboration. + + Q: Why use reStructuredText for PEPs? What's wrong with the + existing standard? + + A: The existing standard for PEPs is very limited in terms of + general expressibility, and referencing is especially lacking + for such a reference-rich document type. PEPs are currently + converted into HTML, but the results (mostly monospaced text) + are less than attractive, and most of the value-added potential + of HTML is untapped. + + Making reStructuredText the standard markup for PEPs will + enable much richer expression, including support for section + structure, inline markup, graphics, and tables. In several + PEPs there are ASCII graphics diagrams, which are all that + plaintext documents can support. Since PEPs are made available + in HTML form, the ability to include proper diagrams would be + immediately useful. + + Current PEP practices allow for reference markers in the form + "[1]" in the text, and the footnotes/references themselves are + listed in a section toward the end of the document. There is + currently no hyperlinking between the reference marker and the + footnote/reference itself (it would be possible to add this to + pep2html.py, but the "markup" as it stands is ambiguous and + mistakes would be inevitable). A PEP with many references + (such as this one ;-) requires a lot of flipping back and + forth. When revising a PEP, often new references are added or + unused references deleted. It is painful to renumber the + references, since it has to be done in two places and can have + a cascading effect (insert a single new reference 1, and every + other reference has to be renumbered; always adding new + references to the end is suboptimal). It is easy for + references to go out of sync. + + PEPs use references for two purposes: simple URL references and + footnotes. reStructuredText differentiates between the two. 
A + PEP might contain references like this:: + + Abstract + + This PEP proposes adding frungible doodads [1] to the + core. It extends PEP 9876 [2] via the BCA [3] + mechanism. + + References and Footnotes + + [1] http://www.doodads.org/frungible.html + + [2] PEP 9876, Let's Hope We Never Get Here + http://www.python.org/peps/pep-9876.html + + [3] "Bogus Complexity Addition" + + Reference 1 is a simple URL reference. Reference 2 is a + footnote containing text and a URL. Reference 3 is a footnote + containing text only. Rewritten using reStructuredText, this + PEP could look like this:: + + Abstract + ======== + + This PEP proposes adding `frungible doodads`_ to the + core. It extends PEP 9876 [#pep9876]_ via the BCA [#]_ + mechanism. + + .. _frungible doodads: + http://www.doodads.org/frungible.html + + .. [#pep9876] `PEP 9876`__, Let's Hope We Never Get Here + + __ http://www.python.org/peps/pep-9876.html + + .. [#] "Bogus Complexity Addition" + + URLs and footnotes can be defined close to their references if + desired, making them easier to read in the source text, and + making the PEPs easier to revise. The "References and + Footnotes" section can be auto-generated with a document tree + transform. Footnotes from throughout the PEP would be gathered + and displayed under a standard header. If URL references + should likewise be written out explicitly (in citation form), + another tree transform could be used. + + URL references can be named ("frungible doodads"), and can be + referenced from multiple places in the document without + additional definitions. When converted to HTML, references + will be replaced with inline hyperlinks (HTML <A> tags). The + two footnotes are automatically numbered, so they will always + stay in sync. The first footnote also contains an internal + reference name, "pep9876", so it's easier to see the connection + between reference and footnote in the source text. 
Named + footnotes can be referenced multiple times, maintaining + consistent numbering. + + The "#pep9876" footnote could also be written in the form of a + citation:: + + It extends PEP 9876 [PEP9876]_ ... + + .. [PEP9876] `PEP 9876`_, Let's Hope We Never Get Here + + Footnotes are numbered, whereas citations use text for their + references. + + Q: Wouldn't it be better to keep the docstring and PEP proposals + separate? + + A: The PEP markup proposal is an option to this PEP. It may be + removed if it is deemed that there is no need for PEP markup. + The PEP markup proposal could be made into a separate PEP if + necessary. If accepted, PEP 1, PEP Purpose and Guidelines [19]_, + and PEP 9, Sample PEP Template [20]_ will be updated. + + It seems natural to adopt a single consistent markup standard + for all uses of plaintext in Python. + + Q: The existing pep2html.py script converts the existing PEP + format to HTML. How will the new-format PEPs be converted to + HTML? + + A: One of the deliverables of the Docutils project [21]_ will be a + new version of pep2html.py with integrated reStructuredText + parsing. The Docutils project will support PEPs with a "PEP + Reader" component, including all functionality currently in + pep2html.py (auto-recognition of PEP & RFC references). + + Q: Who's going to convert the existing PEPs to reStructuredText? + + A: A call for volunteers will be put out to the Doc-SIG and + greater Python communities. If insufficient volunteers are + forthcoming, I (David Goodger) will convert the documents + myself, perhaps with some level of automation. A transitional + system whereby both old and new standards can coexist will be + easy to implement (and I pledge to implement it if necessary). + + Q: Why use reStructuredText for README and other ancillary files? + + A: The same reasoning used for PEPs above applies to README and + other ancillary files. 
By adopting a standard markup, these + files can be converted to attractive cross-referenced HTML and + put up on python.org. Developers of Python projects can also + take advantage of this facility for their own documentation. + + +References and Footnotes + + [1] http://structuredtext.sourceforge.net/ + + [2] http://www.python.org/sigs/doc-sig/ + + [3] http://www.w3.org/XML/ + + [4] http://www.oasis-open.org/cover/general.html + + [5] http://docbook.org/tdg/en/html/docbook.html + + [6] http://www.w3.org/MarkUp/ + + [7] http://www.w3.org/MarkUp/#xhtml1 + + [8] http://www.tug.org/interest.html + + [9] http://www.perldoc.com/perl5.6/pod/perlpod.html + + [10] http://java.sun.com/j2se/javadoc/ + + [11] http://docutils.sourceforge.net/mirror/setext.html + + [12] http://dev.zope.org/Members/jim/StructuredTextWiki/FrontPage + + [13] An Introduction to reStructuredText + http://structuredtext.sourceforge.net/spec/introduction.txt + + [14] Problems with StructuredText + http://structuredtext.sourceforge.net/spec/problems.txt + + [15] reStructuredText Markup Specification + http://structuredtext.sourceforge.net/spec/reStructuredText.txt + + [16] A Record of reStructuredText Syntax Alternatives + http://structuredtext.sourceforge.net/spec/alternatives.txt + + [17] reStructuredText Directives + http://structuredtext.sourceforge.net/spec/directives.txt + + [18] Quick reStructuredText + http://structuredtext.sourceforge.net/docs/quickref.html + + [19] PEP 1, PEP Guidelines, Warsaw, Hylton + http://www.python.org/peps/pep-0001.html + + [20] PEP 9, Sample PEP Template, Warsaw + http://www.python.org/peps/pep-0009.html + + [21] http://docutils.sourceforge.net/ + + [22] PEP 216, Docstring Format, Zadka + http://www.python.org/peps/pep-0216.html + + +Copyright + + This document has been placed in the public domain. + + +Acknowledgements + + Some text is borrowed from PEP 216, Docstring Format, by Moshe + Zadka [22]_. 
+ + Special thanks to all members past & present of the Python Doc-SIG. + + + +Local Variables: +mode: indented-text +indent-tabs-mode: nil +sentence-end-double-space: t +fill-column: 70 +End:
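Illustration (not part of the patch): the Features section of the added pep-0287.txt says that doctest blocks begin with ">>> " and end with a blank line. The sketch below shows that rule using only Python's standard doctest module. It is an assumption-laden example, not the reStructuredText parser: the docstring text is adapted to Python 3 print() calls, and the names DOCSTRING, extract_examples and pep287_example are hypothetical.

```python
import doctest

# Docstring text modelled on the doctest block in the PEP's Features
# section, adapted to print() so it runs on Python 3 (an assumption;
# the PEP text itself uses the Python 2 print statement).
DOCSTRING = """
Here's a doctest block:

>>> print('Python-specific usage examples; begun with ">>>"')
Python-specific usage examples; begun with ">>>"
>>> print('(cut and pasted from interactive sessions)')
(cut and pasted from interactive sessions)

This paragraph follows the blank line that ends the block.
"""


def extract_examples(text):
    """Return (source, expected_output) pairs found in `text`."""
    parser = doctest.DocTestParser()
    return [(ex.source.strip(), ex.want.strip())
            for ex in parser.get_examples(text)]


# Each ">>> " line starts an example; the blank line ends the block.
examples = extract_examples(DOCSTRING)

# Run the same examples and collect the attempted/failed counts.
test = doctest.DocTestParser().get_doctest(
    DOCSTRING, {}, "pep287_example", None, 0)
results = doctest.DocTestRunner(verbose=False).run(test)

print(len(examples), results.attempted, results.failed)
```

The closing paragraph of DOCSTRING is not picked up as expected output, because the blank line before it ends the doctest block. A real docstring processor would also handle the surrounding markup; this sketch covers only the doctest-block rule.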