PEP: 256
Title: Docstring Processing System Framework
Version: $Revision$
Last-Modified: $Date$
Author: goodger@users.sourceforge.net (David Goodger)
Discussions-To: doc-sig@python.org
Status: Draft
Type: Standards Track
Created: 01-Jun-2001
Post-History: 13-Jun-2001


Abstract

Python lends itself to inline documentation. With its built-in
docstring syntax, a limited form of Literate Programming [1]_ is
easy to do in Python. However, there are no satisfactory standard
tools for extracting and processing Python docstrings. The lack
of a standard toolset is a significant gap in Python's
infrastructure; this PEP aims to fill the gap.

The issues surrounding docstring processing have been contentious
and difficult to resolve. This PEP proposes a generic Docstring
Processing System (DPS) framework, which separates out the
components (program and conceptual), enabling the resolution of
individual issues either through consensus (one solution) or
through divergence (many). It promotes standard interfaces which
will allow a variety of plug-in components (input context readers,
markup parsers, and output format writers) to be used.

The concepts of a DPS framework are presented independently of
implementation details.


Roadmap to the Docstring PEPs

There are many aspects to docstring processing. The "Docstring
PEPs" have broken up the issues in order to deal with each of them
in isolation, or as close as possible. The individual aspects and
associated PEPs are as follows:

* Docstring syntax. PEP 287, reStructuredText Docstring Format,
  proposes a syntax for Python docstrings, PEPs, and other uses.

* Docstring semantics consist of at least two aspects:

  - Conventions: the high-level structure of docstrings. Dealt
    with in PEP 257, Docstring Conventions.

  - Methodology: rules for the informational content of
    docstrings. Not addressed.

* Processing mechanisms. This PEP outlines the high-level issues
  and specification of an abstract docstring processing system
  (DPS). PEP 258, Docutils Design Specification, is an overview
  of the design and implementation of one DPS under development.

* Output styles: developers want the documentation generated from
  their source code to look good, and there are many different
  ideas about what that means. PEP 258 touches on "Stylist
  Transforms". This aspect of docstring processing has yet to be
  fully explored.

By separating out the issues, we can form consensus more easily
(smaller fights ;-), and accept divergence more readily.


Rationale

There are standard inline documentation systems for some other
languages. For example, Perl has POD [2]_ and Java has Javadoc
[3]_, but neither of these meshes with the Pythonic way. POD syntax
is very explicit, but takes after Perl in terms of readability.
Javadoc is HTML-centric; except for '@field' tags, raw HTML is
used for markup. There are also general tools such as Autoduck
[4]_ and Web (Tangle & Weave) [5]_, useful for multiple languages.

There have been many attempts to write auto-documentation systems
for Python (not an exhaustive list):

- Marc-Andre Lemburg's doc.py [6]_

- Daniel Larsson's pythondoc & gendoc [7]_

- Doug Hellmann's HappyDoc [8]_

- Laurence Tratt's Crystal [9]_

- Ka-Ping Yee's htmldoc & pydoc [10]_ (pydoc.py is now part of the
  Python standard library; see below)

- Tony Ibbs' docutils [11]_

- Edward Loper's STminus formalization and related efforts [12]_

These systems, each with different goals, have had varying degrees
of success. A problem with many of the above systems was
over-ambition combined with inflexibility. They provided a
self-contained set of components: a docstring extraction system, a
markup parser, an internal processing system and one or more
output format writers with a fixed style. Inevitably, one or more
aspects of each system had serious shortcomings, and they were not
easily extended or modified, preventing them from being adopted as
standard tools.

It has become clear (to this author, at least) that the "all or
nothing" approach cannot succeed, since no monolithic
self-contained system could possibly be agreed upon by all
interested parties. A modular component approach designed for
extension, where components may be multiply implemented, may be
the only chance for success. Standard inter-component APIs will
make the DPS components comprehensible without requiring detailed
knowledge of the whole, lowering the barrier for contributions,
and ultimately resulting in a rich and varied system.

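As a rough illustration of "components may be multiply implemented"
behind a standard inter-component API, the sketch below registers two
interchangeable parsers under names and selects one at run time. All
names here (Parser, register_parser, get_parser, and the two parser
classes) are hypothetical and invented for this example; they are not
taken from this PEP or from any existing system.

```python
from typing import Callable, Dict, List, Protocol


class Parser(Protocol):
    """Hypothetical inter-component API: every parser offers parse()."""

    def parse(self, text: str) -> List[str]: ...


# Registry mapping a markup name to a factory for a parser component.
_PARSERS: Dict[str, Callable[[], Parser]] = {}


def register_parser(name: str):
    """Class decorator that makes a parser selectable by name."""
    def decorator(cls):
        _PARSERS[name] = cls
        return cls
    return decorator


def get_parser(name: str) -> Parser:
    """Instantiate the parser registered under `name`."""
    try:
        return _PARSERS[name]()
    except KeyError:
        raise LookupError(f"no parser registered for {name!r}") from None


@register_parser("plaintext")
class PlainTextParser:
    """Treats the whole docstring as a single block."""

    def parse(self, text: str) -> List[str]:
        return [text.strip()]


@register_parser("lines")
class LineParser:
    """Treats every non-empty line as its own block."""

    def parse(self, text: str) -> List[str]:
        return [line.strip() for line in text.splitlines() if line.strip()]


if __name__ == "__main__":
    for name in ("plaintext", "lines"):
        print(name, get_parser(name).parse("First line.\nSecond line.\n"))
```

Because callers depend only on the parse() signature, a contributor
could add a third parser without reading or changing the other two,
which is the lowered barrier the paragraph above argues for.
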
Each of the components of a docstring processing system should be
developed independently. A 'best of breed' system should be
chosen, whether merged from existing systems, developed anew, or
both. This system should be included in Python's standard
library.


PyDoc & Other Existing Systems

PyDoc became part of the Python standard library as of release
2.1. It extracts and displays docstrings from within the Python
interactive interpreter, from the shell command line, and from a
GUI window into a web browser (HTML). Although a very useful
tool, PyDoc has several deficiencies, including:

- In the case of the GUI/HTML, except for some heuristic
  hyperlinking of identifier names, no formatting of the
  docstrings is done. They are presented within <p><small><tt>
  tags to avoid unwanted line wrapping. Unfortunately, the result
  is not attractive.

- PyDoc extracts docstrings and structural information (class
  identifiers, method signatures, etc.) from imported module
  objects. There are security issues involved with importing
  untrusted code. Also, information from the source is lost when
  importing, such as comments, "additional docstrings" (string
  literals in non-docstring contexts; see PEP 258 [13]_), and the
  order of definitions.

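As a rough illustration of the alternative to importing, the sketch
below reads docstrings from source text with the standard library's
ast module, so the code is parsed but never executed, and definitions
are reported in source order. The sample source and the helper name
extract_docstrings are assumptions made for this example; they are
not part of PyDoc or of this PEP. Note that ast does not keep
comments; recovering those would need a token-level pass, for example
with the tokenize module.

```python
import ast

# Sample input, held as text. It is parsed below but never executed.
SOURCE = '''
"""Module docstring."""

# A comment that importing would discard.

def second():
    """Defined first in the file."""

class First:
    """Defined second in the file."""

    def method(self):
        """A method docstring."""
'''

DEFINITIONS = (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)


def extract_docstrings(source):
    """Return (qualified name, docstring) pairs in source order."""
    tree = ast.parse(source)
    found = [("<module>", ast.get_docstring(tree))]

    def visit(node, prefix):
        # Only direct children are examined, so definitions nested in
        # "if" or "try" blocks are skipped by this simple sketch.
        for child in node.body:
            if isinstance(child, DEFINITIONS):
                name = prefix + child.name
                found.append((name, ast.get_docstring(child)))
                visit(child, name + ".")

    visit(tree, "")
    return found


if __name__ == "__main__":
    for name, doc in extract_docstrings(SOURCE):
        print(f"{name}: {doc}")
```

An imported module object would list "First" and "second" in whatever
order introspection yields them; walking the syntax tree keeps the
order in which the author wrote them.
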
The functionality proposed in this PEP could be added to or used
by PyDoc when serving HTML pages. The proposed docstring
processing system's functionality is much more than PyDoc needs in
its current form. Either an independent tool will be developed
(which PyDoc may or may not use), or PyDoc could be expanded to
encompass this functionality and *become* the docstring processing
system (or one such system). That decision is beyond the scope of
this PEP.

Similarly for other existing docstring processing systems, their
authors may or may not choose compatibility with this framework.
However, if this framework is accepted and adopted as the Python
standard, compatibility will become an important consideration in
these systems' future.


Specification

The docstring processing system framework consists of components,
as follows:

1. Docstring conventions. Documents issues such as:

   - What should be documented where.

   - First line is a one-line synopsis.

   PEP 257, Docstring Conventions [14]_, documents some of these
   issues.

2. Docstring processing system design specification. Documents
   issues such as:

   - High-level spec: what a DPS does.

   - Command-line interface for executable script.

   - System Python API.

   - Docstring extraction rules.

   - Readers, which encapsulate the input context.

   - Parsers.

   - Document tree: the intermediate internal data structure. The
     output of the Parser and Reader, and the input to the Writer
     all share the same data structure.

   - Transforms, which modify the document tree.

   - Writers for output formats.

   - Distributors, which handle output management (one file, many
     files, or objects in memory).

   These issues are applicable to any docstring processing system
   implementation. PEP 258, Docutils Design Specification [13]_,
   documents these issues.

3. Docstring processing system implementation.

4. Input markup specifications: docstring syntax. PEP 287,
   reStructuredText Docstring Format [15]_, proposes a standard
   syntax.

5. Input parser implementations.

6. Input context readers ("modes": Python source code, PEP,
   standalone text file, email, etc.) and implementations.

7. Stylists: certain input context readers may have associated
   stylists which allow for a variety of output document styles.

8. Output formats (HTML, XML, TeX, DocBook, info, etc.) and writer
   implementations.

Components 1, 2/3, and 4/5 are the subject of individual companion
PEPs. If there is another implementation of the framework or
syntax/parser, additional PEPs may be required. Multiple
implementations of each of components 6 and 7 will be required;
the PEP mechanism may be overkill for these components.

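To make the data flow between the components listed under item 2 more
concrete, here is a minimal sketch in which a Reader and Parser build
a shared document tree, a Transform modifies it, a Writer renders it,
and a Distributor keeps the result in memory. Every class, function,
and attribute name below is an assumption made for illustration; none
is specified by this PEP, and PEP 258 should be consulted for the
design of an actual system.

```python
import html
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    """One node of the shared document tree (hypothetical structure)."""
    kind: str
    text: str = ""
    children: List["Node"] = field(default_factory=list)


class ParagraphParser:
    """Parser: turns markup into nodes; blank lines separate paragraphs."""

    def parse(self, text):
        blocks = (" ".join(block.split()) for block in text.split("\n\n"))
        return [Node("paragraph", block) for block in blocks if block]


class DocstringReader:
    """Reader: knows the input context and delegates markup to a parser."""

    def __init__(self, parser):
        self.parser = parser

    def read(self, name, docstring):
        return Node("document", name, self.parser.parse(docstring))


def mark_synopsis(tree):
    """Transform: tag the first paragraph as the one-line synopsis."""
    if tree.children:
        tree.children[0].kind = "synopsis"
    return tree


class HTMLWriter:
    """Writer: renders the document tree in one output format."""

    def write(self, tree):
        lines = [f"<h1>{html.escape(tree.text)}</h1>"]
        for child in tree.children:
            attrs = ' class="synopsis"' if child.kind == "synopsis" else ""
            lines.append(f"<p{attrs}>{html.escape(child.text)}</p>")
        return "\n".join(lines)


class MemoryDistributor:
    """Distributor: keeps each output as an object in memory."""

    def __init__(self):
        self.outputs = {}

    def distribute(self, name, data):
        self.outputs[name] = data


def process(name, docstring, reader, transforms, writer, distributor):
    """Run one docstring through reader, transforms, writer, distributor."""
    tree = reader.read(name, docstring)
    for transform in transforms:
        tree = transform(tree)
    distributor.distribute(name, writer.write(tree))


if __name__ == "__main__":
    sink = MemoryDistributor()
    process("spam", "Return spam.\n\nMore detail\nhere.",
            DocstringReader(ParagraphParser()), [mark_synopsis],
            HTMLWriter(), sink)
    print(sink.outputs["spam"])
```

Since every stage exchanges only the tree, swapping HTMLWriter for a
writer of another format, or MemoryDistributor for one that writes
files, would leave the other stages untouched.
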

Project Web Site

A SourceForge project has been set up for this work at
http://docutils.sourceforge.net/.


References and Footnotes

[1] http://www.literateprogramming.com/

[2] Perl "Plain Old Documentation"
    http://www.perldoc.com/perl5.6/pod/perlpod.html

[3] http://java.sun.com/j2se/javadoc/

[4] http://www.helpmaster.com/hlp-developmentaids-autoduck.htm

[5] http://www-cs-faculty.stanford.edu/~knuth/cweb.html

[6] http://www.lemburg.com/files/python/SoftwareDescriptions.html#doc.py

[7] http://starship.python.net/crew/danilo/pythondoc/

[8] http://happydoc.sourceforge.net/

[9] http://www.btinternet.com/~tratt/comp/python/crystal/

[10] http://www.python.org/doc/current/lib/module-pydoc.html

[11] http://homepage.ntlworld.com/tibsnjoan/docutils/

[12] http://www.cis.upenn.edu/~edloper/pydoc/

[13] PEP 258, Docutils Design Specification, Goodger
     http://www.python.org/peps/pep-0258.html

[14] PEP 257, Docstring Conventions, Goodger, Van Rossum
     http://www.python.org/peps/pep-0257.html

[15] PEP 287, reStructuredText Docstring Format, Goodger
     http://www.python.org/peps/pep-0287.html

[16] http://www.python.org/sigs/doc-sig/


Copyright

This document has been placed in the public domain.


Acknowledgements

This document borrows ideas from the archives of the Python
Doc-SIG [16]_. Thanks to all members past & present.



Local Variables:
mode: indented-text
indent-tabs-mode: nil
fill-column: 70
sentence-end-double-space: t
End: