PEP 256, Docstring Processing System Framework, David Goodger
Editing pass by Barry.
PEP: 256
Title: Docstring Processing System Framework
Version: $Revision$
Last-Modified: $Date$
Author: dgoodger@bigfoot.com (David Goodger)
Discussions-To: doc-sig@python.org
Status: Draft
Type: Standards Track
Requires: PEP 257 Docstring Conventions
          PEP 258 DPS Generic Implementation Details
Created: 01-Jun-2001
Post-History:


Abstract

Python modules, classes and functions have a string attribute
called __doc__. If the first expression inside the definition is
a literal string, that string is assigned to the __doc__
attribute and is called a documentation string, or docstring. It
is often used to summarize the interface of the module, class or
function.

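As a minimal illustration of the rule above, the following sketch
shows when a string literal becomes a docstring. The names area,
Shape and no_doc are invented for this example and are not part of
the PEP.

```python
def area(width, height):
    """Return the area of a width-by-height rectangle."""
    return width * height


class Shape:
    """Base class for shapes."""


def no_doc(x):
    x + 1  # the first expression is not a string literal
    "This string is an ordinary expression, not a docstring."
    return x


print(area.__doc__)
print(Shape.__doc__)
print(no_doc.__doc__)  # no docstring was assigned
```

Only the first expression counts: in no_doc the string literal comes
after another expression, so __doc__ is left as None.
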
There is no standard format (markup) for docstrings, nor are there
standard tools for extracting docstrings and transforming them
into useful structured formats (e.g., HTML, DocBook, TeX). Those
tools that do exist are for the most part unmaintained and unused.
The issues surrounding docstring processing have been contentious
and difficult to resolve.

This PEP proposes a Docstring Processing System (DPS) framework.
It separates out the components (program and conceptual), enabling
the resolution of individual issues either through consensus (one
solution) or through divergence (many). It promotes standard
interfaces which will allow a variety of plug-in components (e.g.,
input parsers and output formatters) to be used.

This PEP presents the concepts of a DPS framework independently of
implementation details.


Rationale

Python lends itself to inline documentation. With its built-in
docstring syntax, a limited form of Literate Programming [2] is
easy to do in Python. However, there are no satisfactory standard
tools for extracting and processing Python docstrings. The lack
of a standard toolset is a significant gap in Python's
infrastructure; this PEP aims to fill the gap.

There are standard inline documentation systems for some other
languages. For example, Perl has POD (plain old documentation)
and Java has Javadoc, but neither of these meshes with the Pythonic
way. POD is very explicit, but takes after Perl in terms of
readability. Javadoc is HTML-centric; except for '@field' tags,
raw HTML is used for markup. There are also general tools such as
Autoduck and Web (Tangle & Weave), useful for multiple languages.

There have been many attempts to write autodocumentation systems
for Python (not an exhaustive list):

- Marc-Andre Lemburg's doc.py [3]

- Daniel Larsson's pythondoc & gendoc [4]

- Doug Hellmann's HappyDoc [5]

- Laurence Tratt's Crystal [6]

- Ka-Ping Yee's htmldoc & pydoc [7] (pydoc.py is now part of the Python
  standard library; see below)

- Tony Ibbs' docutils [8]

These systems, each with different goals, have had varying degrees
of success. A problem with many of the above systems was
over-ambition. They provided a self-contained set of components: a
docstring extraction system, an input parser, an internal
processing system and one or more output formatters. Inevitably,
one or more components had serious shortcomings, preventing the
system from being adopted as a standard tool.

Throughout the existence of the Python Documentation Special
Interest Group (Doc-SIG) [9], consensus on a single standard
docstring format has never been reached. A lightweight, implicit
markup has been sought, for the following reasons (among others):

1. Docstrings written within Python code are available from within
   the interactive interpreter, and can be 'print'ed. Thus
   plaintext is used for easy readability.

2. Programmers want to add structure to their docstrings, without
   sacrificing raw docstring readability. Unadorned plaintext
   cannot be transformed ('up-translated') into useful structured
   formats.

3. Explicit markup (like XML or TeX) has been widely considered
   unreadable by the uninitiated.

4. Implicit markup is aesthetically compatible with the clean and
   minimalist Python syntax.

Early on, variants of Setext (Structure Enhanced Text) [10],
including Digital Creations' StructuredText [11], were proposed
for Python docstring formatting. Hereafter we will collectively
call these variants 'STexts'. Although used by some (including in
most of the above-listed autodocumentation tools), these markup
schemes have failed to become standard because:

- STexts have been incomplete: lacking 'essential' constructs that
  people want to use in their docstrings, STexts are rendered less
  than ideal. Note that these 'essential' constructs are not
  universal; everyone has their own requirements.

- STexts have been sometimes surprising: bits of text are marked
  up unexpectedly, leading to user frustration.

- SText implementations have been buggy.

- Some STexts have had no formal specification except for the
  implementation itself. A buggy implementation meant a buggy
  spec, and vice-versa.

- There has been no mechanism to get around the SText markup rules
  when a markup character is used in a non-markup context.

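The 'surprising' markup and missing escape mechanism described in
the list above can be illustrated with a deliberately naive sketch.
The rule below is a hypothetical toy invented for this example; it
is not the actual Setext or StructuredText grammar.

```python
import re

# Hypothetical toy rule (NOT a real SText grammar): any text between
# two asterisks is treated as emphasis.
EMPHASIS = re.compile(r"\*(.+?)\*")


def toy_markup(text):
    """Convert *emphasis* to <em>emphasis</em> using the toy rule."""
    return EMPHASIS.sub(r"<em>\1</em>", text)


intended = toy_markup("This is *important*.")
surprise = toy_markup("area = 2*width*height")

print(intended)
print(surprise)
```

The second call marks up part of an arithmetic expression, because
the toy rule has no way to tell that the asterisks are not markup
and offers no way to escape them.
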
Recognizing the deficiencies of STexts, some people have proposed
using explicit markup of some kind. There have been proposals for
using XML, HTML, TeX, POD, and Javadoc at one time or another.
Proponents of STexts have vigorously opposed these proposals, and
the debates have continued off and on for at least five years.

It has become clear (to this author, at least) that the "all or
nothing" approach cannot succeed, since no all-encompassing
proposal could possibly be agreed upon by all interested parties.
A modular component approach, where components may be multiply
implemented, is the only chance at success. By separating out the
issues, we can form consensus more easily (smaller fights ;-), and
accept divergence more readily.

Each of the components of a docstring processing system should be
developed independently. A 'best of breed' system should be
chosen and/or developed and eventually included in Python's
standard library.


Pydoc & Other Existing Systems

Pydoc is part of the Python 2.1 standard library. It extracts and
displays docstrings from within the Python interactive
interpreter, from the shell command line, and from a GUI window
into a web browser (HTML). In the case of GUI/HTML, except for
some heuristic hyperlinking of identifier names, no formatting of
the docstrings is done. They are presented within <p><small><tt>
tags to avoid unwanted line wrapping. Unfortunately, the result
is not pretty.

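For orientation only, here is a small sketch of asking pydoc for the
text it would display for one object. It assumes the pydoc module of
a current Python 3, where render_doc() accepts a renderer argument
and pydoc.plaintext is available; the Python 2.1 version described
above may differ. The function greet is invented for the example.

```python
import pydoc


def greet(name):
    """Return a greeting for name."""
    return "Hello, " + name


# render_doc() builds the text that help() would display; the
# plaintext renderer avoids terminal bold (overstrike) sequences.
text = pydoc.render_doc(greet, renderer=pydoc.plaintext)
print(text)
```

The docstring appears in the output as written, which matches the
point made above: pydoc extracts and displays docstrings but does
not interpret any markup inside them.
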
The functionality proposed in this PEP could be added to or used
by pydoc when serving HTML pages. However, the proposed docstring
processing system's functionality is much more than pydoc needs
(in its current form). Either an independent tool will be
developed (which pydoc may or may not use), or pydoc could be
expanded to encompass this functionality and *become* the
docstring processing system (or one such system). That decision
is beyond the scope of this PEP.

Similarly, the authors of other existing docstring processing
systems may or may not choose compatibility with this framework.
However, if this framework is accepted and adopted as the Python
standard, compatibility will become an important consideration in
these systems' future.


Specification

The docstring processing system framework consists of components,
as follows:

1. Docstring conventions. Documents issues such as:

   - What should be documented where.

   - First line is a one-line synopsis.

   PEP 257, "Docstring Conventions" [12], documents these issues.

2. Docstring processing system generic implementation details.
   Documents issues such as:

   - High-level spec: what a DPS does.

   - Command-line interface for executable script.

   - System Python API.

   - Docstring extraction rules.

   - Input parser API.

   - Intermediate internal data structure: output from input parser,
     input to output formatter.

   - Output formatter API.

   - Output management.

   These issues are applicable to any docstring processing system
   implementation. PEP 258, "DPS Generic Implementation Details"
   [13], documents these issues.

3. Docstring processing system implementation.

4. Input markup specifications: docstring syntax.

5. Input parser implementations.

6. Output formats (HTML, XML, TeX, DocBook, info, etc.).

7. Output formatter implementations.

Components 1, 2, and 3 will be the subject of individual companion
PEPs, although they may be merged into this PEP once consensus is
reached. If there is only one implementation, PEPs for components
2 & 3 can be combined. Multiple PEPs will be necessary for each
of components 4, 5, 6, and 7. An alternative to the PEP mechanism
may be used instead, since these components are not directly
related to the Python language.

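To make the separation of components concrete, the following is a
rough sketch of how extraction, an input parser, an intermediate
data structure and an output formatter might plug together. Every
name in it (DocNode, extract, plaintext_parser, html_formatter,
PARSERS, FORMATTERS, process) is an assumption made for illustration;
the actual interfaces are left to PEP 258 and to implementations.

```python
import inspect
from dataclasses import dataclass, field
from html import escape
from typing import Callable, Dict, List


@dataclass
class DocNode:
    """Hypothetical intermediate data structure for one object."""
    name: str
    paragraphs: List[str] = field(default_factory=list)


def extract(obj) -> str:
    """Docstring extraction: return the cleaned docstring, or ''."""
    return inspect.getdoc(obj) or ""


def plaintext_parser(name: str, docstring: str) -> DocNode:
    """Input parser plug-in: split on blank lines into paragraphs."""
    blocks = [" ".join(b.split()) for b in docstring.split("\n\n")]
    return DocNode(name, [b for b in blocks if b])


def html_formatter(node: DocNode) -> str:
    """Output formatter plug-in: render a DocNode as HTML."""
    body = "".join(f"<p>{escape(p)}</p>" for p in node.paragraphs)
    return f"<h1>{escape(node.name)}</h1>{body}"


# Plug-in registries keyed by name (illustrative only).
PARSERS: Dict[str, Callable[[str, str], DocNode]] = {
    "plaintext": plaintext_parser,
}
FORMATTERS: Dict[str, Callable[[DocNode], str]] = {
    "html": html_formatter,
}


def process(obj, parser="plaintext", formatter="html") -> str:
    """System API: extract, parse, then format one docstring."""
    node = PARSERS[parser](obj.__name__, extract(obj))
    return FORMATTERS[formatter](node)


def sample(x):
    """Return x unchanged.

    Values such as a < b are
    escaped in the output.
    """
    return x


result = process(sample)
print(result)
```

Because the parser and formatter only meet at DocNode, a different
markup parser or output format could be registered without touching
the other components, which is the kind of divergence the framework
is meant to allow.
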
The following diagram shows an overview of the framework.
Interfaces are indicated by double-borders. The ASCII diagram is
very wide; please turn off line wrapping to view it:

                                                     +========================+
                                                     | Command-Line Interface |
                                                     +========================+
                                                     |   Executable Script    |
                                                     +------------------------+
                                                                  |
                                                                  | calls
                                                                  v
                                            +===========================================+  returns  +---------+
                                            |             System Python API             |==========>| output  |
                      +--------+            +===========================================+           | objects |
  _        writes     | Python |   reads    |        Docstring Processing System        |           +---------+
 / \   ==============>| module |<===========|                                           |
 \_/                  +--------+            |   input    | transformation, |   output   |            +--------+
  |              +-------------+  follows   | docstring  |  integration,   |   object   |   writes   | output |
--+--   consults |  docstring  |<-----------| extraction |     linking     | management |===========>| files  |
  |    --------->| conventions |            +============+=====+=====+=====+============+            +--------+
 / \             +-------------+            |    parser API    |     |  formatter API   |
/   \            +-------------+            +===========+======+     +======+===========+            +--------+
author  consults |   markup    | implements |   input   |   intermediate    |  output   | implements | output |
       --------->| syntax spec |<-----------|  parser   |  data structure   | formatter |----------->| format |
                 +-------------+            +-----------+-------------------+-----------+            +--------+


Project Web Site

A SourceForge project has been set up for this work at
http://docstring.sf.net.


References and Footnotes

[1] http://python.sf.net/peps/pep-0216.html

[2] http://www.literateprogramming.com/

[3] http://www.lemburg.com/files/python/SoftwareDescriptions.html#doc.py

[4] http://starship.python.net/crew/danilo/pythondoc/

[5] http://happydoc.sf.net/

[6] http://www.btinternet.com/~tratt/comp/python/crystal/index.html

[7] http://www.lfw.org/python/

[8] http://homepage.ntlworld.com/tibsnjoan/docutils/

[9] http://www.python.org/sigs/doc-sig/

[10] http://www.bsdi.com/setext/

[11] http://dev.zope.org/Members/jim/StructuredTextWiki/FrontPage/

[12] http://python.sf.net/peps/pep-0257.html

[13] http://python.sf.net/peps/pep-0258.html


Copyright

This document has been placed in the public domain.


Acknowledgements

This document borrows text from PEP 216 "Docstring Format" by
Moshe Zadka [1]. It is intended as a reorganization of PEP 216
and its approach.

This document also borrows ideas from the archives of the Python
Doc-SIG. Thanks to all members past & present.



Local Variables:
mode: indented-text
indent-tabs-mode: nil
End: