PEP: 256
Title: Docstring Processing System Framework
Version: $Revision$
Last-Modified: $Date$
Author: dgoodger@bigfoot.com (David Goodger)
Discussions-To: doc-sig@python.org
Status: Draft
Type: Standards Track
Requires: PEP 257 Docstring Conventions
          PEP 258 DPS Generic Implementation Details
Created: 01-Jun-2001
Post-History:


Abstract

Python modules, classes and functions have a string attribute
called __doc__. If the first expression inside the definition is
a literal string, that string, called a documentation string or
docstring, is assigned to the __doc__ attribute. It is often used
to summarize the interface of the module, class or function.

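As a minimal illustration of this rule, the sketch below defines a
function and a class whose first expression is a string literal.
The names area and Shape are invented for the example; only the
__doc__ attribute itself comes from the language.

```python
def area(width, height):
    """Return the area of a width-by-height rectangle."""
    return width * height


class Shape:
    """Base class for drawable shapes."""

    def draw(self):
        # A comment is not a string literal, so draw.__doc__ stays None.
        pass


print(area.__doc__)        # the string literal from the definition
print(Shape.__doc__)
print(Shape.draw.__doc__)  # None: no docstring was given
```
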
There is no standard format (markup) for docstrings, nor are there
standard tools for extracting docstrings and transforming them
into useful structured formats (e.g., HTML, DocBook, TeX). Those
tools that do exist are for the most part unmaintained and unused.
The issues surrounding docstring processing have been contentious
and difficult to resolve.

This PEP proposes a Docstring Processing System (DPS) framework.
It separates out the components (program and conceptual), enabling
the resolution of individual issues either through consensus (one
solution) or through divergence (many). It promotes standard
interfaces which will allow a variety of plug-in components (e.g.,
input parsers and output formatters) to be used.

This PEP presents the concepts of a DPS framework independently of
implementation details.


Rationale

Python lends itself to inline documentation. With its built-in
docstring syntax, a limited form of Literate Programming [2] is
easy to do in Python. However, there are no satisfactory standard
tools for extracting and processing Python docstrings. The lack
of a standard toolset is a significant gap in Python's
infrastructure; this PEP aims to fill the gap.

There are standard inline documentation systems for some other
languages. For example, Perl has POD (plain old documentation)
and Java has Javadoc, but neither of these meshes with the Pythonic
way. POD is very explicit, but takes after Perl in terms of
readability. Javadoc is HTML-centric; except for '@field' tags,
raw HTML is used for markup. There are also general tools such as
Autoduck and Web (Tangle & Weave), useful for multiple languages.

There have been many attempts to write autodocumentation systems
for Python (not an exhaustive list):

- Marc-Andre Lemburg's doc.py [3]

- Daniel Larsson's pythondoc & gendoc [4]

- Doug Hellmann's HappyDoc [5]

- Laurence Tratt's Crystal [6]

- Ka-Ping Yee's htmldoc & pydoc [7] (pydoc.py is now part of the Python
  standard library; see below)

- Tony Ibbs' docutils [8]

These systems, each with different goals, have had varying degrees
of success. A problem with many of the above systems was
over-ambition. They provided a self-contained set of components: a
docstring extraction system, an input parser, an internal
processing system and one or more output formatters. Inevitably,
one or more components had serious shortcomings, preventing the
system from being adopted as a standard tool.

Throughout the existence of the Python Documentation Special
Interest Group (Doc-SIG) [9], consensus on a single standard
docstring format has never been reached. A lightweight, implicit
markup has been sought, for the following reasons (among others):

1. Docstrings written within Python code are available from within
   the interactive interpreter, and can be 'print'ed. Plaintext is
   therefore preferred, for easy readability.

2. Programmers want to add structure to their docstrings, without
   sacrificing raw docstring readability. Unadorned plaintext
   cannot be transformed ('up-translated') into useful structured
   formats.

3. Explicit markup (like XML or TeX) has been widely considered
   unreadable by the uninitiated.

4. Implicit markup is aesthetically compatible with the clean and
   minimalist Python syntax.

Early on, variants of Setext (Structure Enhanced Text) [10],
including Digital Creations' StructuredText [11], were proposed
for Python docstring formatting. Hereafter we will collectively
call these variants 'STexts'. Although used by some (including in
most of the above-listed autodocumentation tools), these markup
schemes have failed to become standard because:

- STexts have been incomplete: lacking 'essential' constructs that
  people want to use in their docstrings, STexts are rendered less
  than ideal. Note that these 'essential' constructs are not
  universal; everyone has their own requirements.

- STexts have sometimes been surprising: bits of text are marked
  up unexpectedly, leading to user frustration.

- SText implementations have been buggy.

- Some STexts have had no formal specification except for the
  implementation itself. A buggy implementation meant a buggy
  spec, and vice-versa.

- There has been no mechanism to get around the SText markup rules
  when a markup character is used in a non-markup context.

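The 'surprising markup' and 'no escape mechanism' problems can be
illustrated with a toy rule. The sketch below is not taken from any
actual SText: the asterisk-emphasis rule, the backslash escape and
both function names are assumptions made up for this example.

```python
import re

# Toy implicit-markup rule (hypothetical): text between asterisks
# is treated as emphasis.
EMPHASIS = re.compile(r"\*(.+?)\*")


def naive_markup(text):
    """Apply the toy rule with no way to opt out of it."""
    return EMPHASIS.sub(r"<em>\1</em>", text)


# The same rule, but an asterisk preceded by a backslash is left alone.
ESCAPED_EMPHASIS = re.compile(r"(?<!\\)\*(.+?)(?<!\\)\*")


def markup_with_escapes(text):
    """Apply the toy rule, then turn escaped asterisks into plain ones."""
    marked = ESCAPED_EMPHASIS.sub(r"<em>\1</em>", text)
    return marked.replace("\\*", "*")


# The multiplication signs are mistaken for emphasis markers.
print(naive_markup("volume = width*height*depth"))

# With an escape mechanism the author can say what was meant.
print(markup_with_escapes("volume = width\\*height\\*depth"))
print(markup_with_escapes("this is *important*"))
```

Without the escape, the author of the first docstring has no way to
write a literal asterisk; that is the gap the last bullet describes.
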
Recognizing the deficiencies of STexts, some people have proposed
using explicit markup of some kind. There have been proposals for
using XML, HTML, TeX, POD, and Javadoc at one time or another.
Proponents of STexts have vigorously opposed these proposals, and
the debates have continued off and on for at least five years.

It has become clear (to this author, at least) that the "all or
nothing" approach cannot succeed, since no all-encompassing
proposal could possibly be agreed upon by all interested parties.
A modular component approach, where components may be multiply
implemented, is the only chance at success. By separating out the
issues, we can form consensus more easily (smaller fights ;-), and
accept divergence more readily.

Each of the components of a docstring processing system should be
developed independently. A 'best of breed' system should be
chosen and/or developed and eventually included in Python's
standard library.


Pydoc & Other Existing Systems

Pydoc is part of the Python 2.1 standard library. It extracts and
displays docstrings from within the Python interactive
interpreter, from the shell command line, and from a GUI window
into a web browser (HTML). In the case of GUI/HTML, except for
some heuristic hyperlinking of identifier names, no formatting of
the docstrings is done. They are presented within <p><small><tt>
tags to avoid unwanted line wrapping. Unfortunately, the result
is not pretty.

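For readers who want to see the extraction step in action, here is
a minimal sketch using the pydoc and inspect modules as they exist
in a current Python 3 (not the 2.1 version discussed above). The
function greet is invented for the example.

```python
import inspect
import pydoc


def greet(name):
    """Return a greeting for name.

    The body of the docstring is indented in the source, but
    inspect.getdoc() strips that indentation.
    """
    return "Hello, " + name


# Raw attribute: continuation lines keep their source indentation.
raw = greet.__doc__

# Cleaned docstring, with the common indentation removed.
clean = inspect.getdoc(greet)

# Plain-text page, similar to what help(greet) shows in the
# interactive interpreter.
page = pydoc.render_doc(greet, renderer=pydoc.plaintext)
print(page)
```

Note that the docstring text is passed through as-is; no markup in
it is interpreted, which is the limitation this section describes.
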
The functionality proposed in this PEP could be added to or used
by pydoc when serving HTML pages. However, the proposed docstring
processing system's functionality is much more than pydoc needs
(in its current form). Either an independent tool will be
developed (which pydoc may or may not use), or pydoc could be
expanded to encompass this functionality and *become* the
docstring processing system (or one such system). That decision
is beyond the scope of this PEP.

Similarly, the authors of other existing docstring processing
systems may or may not choose compatibility with this framework.
However, if this framework is accepted and adopted as the Python
standard, compatibility will become an important consideration in
these systems' future.


Specification

The docstring processing system framework consists of components,
as follows:

1. Docstring conventions. Documents issues such as:

   - What should be documented where.

   - First line is a one-line synopsis.

   PEP 257, "Docstring Conventions" [12], documents these issues.

2. Docstring processing system generic implementation details.
   Documents issues such as:

   - High-level spec: what a DPS does.

   - Command-line interface for executable script.

   - System Python API.

   - Docstring extraction rules.

   - Input parser API.

   - Intermediate internal data structure: output from input parser,
     input to output formatter.

   - Output formatter API.

   - Output management.

   These issues are applicable to any docstring processing system
   implementation. PEP 258, "DPS Generic Implementation Details"
   [13], documents these issues.

3. Docstring processing system implementation.

4. Input markup specifications: docstring syntax.

5. Input parser implementations.

6. Output formats (HTML, XML, TeX, DocBook, info, etc.).

7. Output formatter implementations.

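To make the 'docstring extraction' item in component 2 concrete,
here is a minimal sketch built on the standard ast module of a
current Python 3. It is an illustration only, not the extraction
rules of PEP 258; the function name extract_docstrings, the
"<module>" key and the sample source are assumptions made up for
this example.

```python
import ast


def extract_docstrings(source):
    """Map dotted names to docstrings found in a module's source text.

    Only the module, its classes and functions, and definitions nested
    directly inside those are visited. Objects without a docstring map
    to None.
    """
    tree = ast.parse(source)
    found = {"<module>": ast.get_docstring(tree)}
    definitions = (ast.ClassDef, ast.FunctionDef, ast.AsyncFunctionDef)

    def visit(node, prefix):
        for child in ast.iter_child_nodes(node):
            if isinstance(child, definitions):
                name = prefix + child.name
                found[name] = ast.get_docstring(child)
                visit(child, name + ".")

    visit(tree, "")
    return found


# Hypothetical module source, built line by line to keep it readable.
SAMPLE = "\n".join([
    '"""Module summary."""',
    "",
    "class Stack:",
    '    """A last-in, first-out container."""',
    "",
    "    def push(self, item):",
    '        """Add item to the top of the stack."""',
    "",
    "    def pop(self):",
    "        pass",
    "",
])

docs = extract_docstrings(SAMPLE)
for name, doc in docs.items():
    print(name, "->", doc)
```

Working from the source text rather than from imported objects means
the module being documented is never executed.
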
Components 1, 2, and 3 will be the subject of individual companion
PEPs, although they may be merged into this PEP once consensus is
reached. If there is only one implementation, PEPs for components
2 & 3 can be combined. Multiple PEPs will be necessary for each
of components 4, 5, 6, and 7. An alternative to the PEP mechanism
may be used instead, since these are not directly related to the
Python language.

The following diagram shows an overview of the framework.
Interfaces are indicated by double-borders. The ASCII diagram is
very wide; please turn off line wrapping to view it:

                                                       +========================+
                                                       | Command-Line Interface |
                                                       +========================+
                                                       |   Executable Script    |
                                                       +------------------------+
                                                                    |
                                                                    | calls
                                                                    v
                                              +===========================================+  returns  +---------+
                                              |             System Python API             |==========>| output  |
                        +--------+            +===========================================+           | objects |
    _        writes     | Python |   reads    |        Docstring Processing System        |           +---------+
   / \   ==============>| module |<===========|                                           |
   \_/                  +--------+            |   input    | transformation, |   output   |            +--------+
    |              +-------------+  follows   |  docstring |  integration,   |   object   |   writes   | output |
  --+--   consults | docstring   |<-----------| extraction |     linking     | management |===========>| files  |
    |    --------->| conventions |            +============+=====+=====+=====+============+            +--------+
   / \             +-------------+            |    parser API    |     |  formatter API   |
  /   \            +-------------+            +===========+======+     +======+===========+            +--------+
 author   consults | markup      | implements |   input   |   intermediate    |  output   | implements | output |
         --------->| syntax spec |<-----------|  parser   |  data structure   | formatter |----------->| format |
                   +-------------+            +-----------+-------------------+-----------+            +--------+


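The flow through the parser API, the intermediate data structure
and the formatter API can be sketched in a few lines of Python.
This is a sketch under stated assumptions, not the interfaces of
PEP 258: every name below (Node, PlainParagraphParser,
HTMLFormatter, TextFormatter, process) is hypothetical, and the
'markup' is simply blank-line-separated paragraphs.

```python
import html
from dataclasses import dataclass, field


@dataclass
class Node:
    """Intermediate data structure: a very small document tree."""
    kind: str                                   # "document" or "paragraph"
    text: str = ""
    children: list = field(default_factory=list)


class PlainParagraphParser:
    """Input parser: paragraphs separated by blank lines, no inline markup."""

    def parse(self, docstring):
        blocks = [block.strip() for block in docstring.split("\n\n")]
        paragraphs = [Node("paragraph", " ".join(block.split()))
                      for block in blocks if block]
        return Node("document", children=paragraphs)


class HTMLFormatter:
    """Output formatter: one <p> element per paragraph."""

    def format(self, tree):
        return "\n".join("<p>%s</p>" % html.escape(node.text)
                         for node in tree.children)


class TextFormatter:
    """Output formatter: paragraphs separated by blank lines."""

    def format(self, tree):
        return "\n\n".join(node.text for node in tree.children)


def process(docstring, parser, formatter):
    """Run one docstring through a parser and a formatter."""
    return formatter.format(parser.parse(docstring))


doc = "Summary line.\n\nDetails with <b> & more."
print(process(doc, PlainParagraphParser(), HTMLFormatter()))
print(process(doc, PlainParagraphParser(), TextFormatter()))
```

Because both formatters consume the same tree, a new input parser or
output formatter can be swapped in without touching the other side,
which is the point of placing the interfaces where the diagram does.
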
Project Web Site

A SourceForge project has been set up for this work at
http://docstring.sf.net.


References and Footnotes

[1] http://python.sf.net/peps/pep-0216.html

[2] http://www.literateprogramming.com/

[3] http://www.lemburg.com/files/python/SoftwareDescriptions.html#doc.py

[4] http://starship.python.net/crew/danilo/pythondoc/

[5] http://happydoc.sf.net/

[6] http://www.btinternet.com/~tratt/comp/python/crystal/index.html

[7] http://www.lfw.org/python/

[8] http://homepage.ntlworld.com/tibsnjoan/docutils/

[9] http://www.python.org/sigs/doc-sig/

[10] http://www.bsdi.com/setext/

[11] http://dev.zope.org/Members/jim/StructuredTextWiki/FrontPage/

[12] http://python.sf.net/peps/pep-0257.html

[13] http://python.sf.net/peps/pep-0258.html


Copyright

This document has been placed in the public domain.


Acknowledgements

This document borrows text from PEP 216 "Docstring Format" by
Moshe Zadka [1]. It is intended as a reorganization of PEP 216
and its approach.

This document also borrows ideas from the archives of the Python
Doc-SIG. Thanks to all members past & present.



Local Variables:
mode: indented-text
indent-tabs-mode: nil
End: