Updated

2002-11-08 04:26:56 +00:00 · 2002-11-08 04:26:56 +00:00 · a80082ba80
parent 4a04782272
commit a80082ba80
3 changed files with 266 additions and 218 deletions
--- a/pep-0256.txt
+++ b/pep-0256.txt
@@ -34,8 +34,8 @@ The concepts of a DPS framework are presented independently of
 implementation details.


-Roadmap to the Doctring PEPs
-============================
+Road Map to the Doctring PEPs
+=============================

 There are many aspects to docstring processing.  The "Docstring PEPs"
 have broken up the issues in order to deal with each of them in
@@ -184,7 +184,7 @@ The docstring processing system framework is broken up as follows:

   - Docstring extraction rules.

-   - Readers, which encapsulate the input context .
+   - Readers, which encapsulate the input context.

   - Parsers.

--- a/pep-0258.txt
+++ b/pep-0258.txt
@@ -20,7 +20,7 @@ This PEP documents design issues and implementation details for
 Docutils, a Python Docstring Processing System (DPS).  The rationale
 and high-level concepts of a DPS are documented in PEP 256, "Docstring
 Processing System Framework" [#PEP-256]_.  Also see PEP 256 for a
-"Roadmap to the Doctring PEPs".
+"Road Map to the Docstring PEPs".

 Docutils is being designed modularly so that any of its components can
 be replaced easily.  In addition, Docutils is not limited to the
@@ -39,63 +39,65 @@ documentation.
 Docutils Project Model
 ======================

-::
+Project components and data flow::

-                        +--------------------------+
-                        |        Docutils:         |
-                        | docutils.core.Publisher, |
-                        | docutils.core.publish()  |
-                        +--------------------------+
-                         /                        \
-                        /                          \
-               1,3,5,7 /                            \ 8,10
-              +--------+                            +--------+
-              | READER | =========================> | WRITER |
-              +--------+                            +--------+
-             /    ||    \                             /     \
-            /     ||     \                           /       \
-     2     /   4  ||      \ 6                  9    /         \ 11
-    +-----+   +--------+   +-------------+    +------------+   +-----+
-    | I/O |   | PARSER |...| reader      |    | writer     |   | I/O |
-    +-----+   +--------+   | transforms  |    | transforms |   +-----+
-                           |             |    |            |
-                           | - docinfo   |    | - system   |
-                           | - titles    |    |   messages |
-                           | - linking   |    | - final    |
-                           | - lookups   |    |   checks   |
-                           | - reader-   |    | - writer-  |
-                           |   specific  |    |   specific |
-                           | - parser-   |    | - etc.     |
-                           |   specific  |    +------------+
-                           | - layout    |
-                           |   (stylist) |
-                           | - etc.      |
-                           +-------------+
+                     +---------------------------+
+                     |        Docutils:          |
+                     | docutils.core.Publisher,  |
+                     | docutils.core.publish_*() |
+                     +---------------------------+
+                      /            |            \
+                     /             |             \
+            1,3,5   /        6     |              \ 7
+           +--------+       +-------------+       +--------+
+           | READER | ----> | TRANSFORMER | ====> | WRITER |
+           +--------+       +-------------+       +--------+
+            /     \\                                  |
+           /       \\                                 |
+     2    /      4  \\                             8  |
+    +-------+   +--------+                        +--------+
+    | INPUT |   | PARSER |                        | OUTPUT |
+    +-------+   +--------+                        +--------+

-The numbers indicate the path a document's data takes through the
-code.  Double-width lines between reader & parser and between reader &
-writer indicate that data sent along these paths should be standard
-(pure & unextended) Docutils doc trees.  Single-width lines signify
-that internal tree extensions or completely unrelated representations
-are possible, but they must be supported at both ends.
+The numbers above each component indicate the path a document's data
+takes.  Double-width lines between Reader & Parser and between
+Transformer & Writer indicate that data sent along these paths should
+be standard (pure & unextended) Docutils doc trees.  Single-width
+lines signify that internal tree extensions or completely unrelated
+representations are possible, but they must be supported at both ends.


 Publisher
 ---------

 The ``docutils.core`` module contains a "Publisher" facade class and
-"publish" convenience function.  Publisher encapsulates the high-level
-logic of a Docutils system.  The ``Publisher.publish()`` method first
-calls its Reader, which reads data from its source I/O, parses and
-transforms the data, and returns it.  ``Publisher.publish()`` then
-passes the resulting document tree to its Writer, which further
-transforms the document before translating it to the final output
-format and writing the formatted data to its destination I/O.
+several convenience functions: "publish_cmdline()" (for command-line
+front ends), "publish_file()" (for programmatic use with file-like
+I/O), and "publish_string()" (for programmatic use with string I/O).
+The Publisher class encapsulates the high-level logic of a Docutils
+system.  The Publisher class has overall responsibility for
+processing, controlled by the ``Publisher.publish()`` method:
+
+1. Set up internal settings (may include config files & command-line
+   options) and I/O objects.
+
+2. Call the Reader object to read data from the source Input object
+   and parse the data with the Parser object.  A document object is
+   returned.
+
+3. Set up and apply transforms via the Transformer object attached to
+   the document.
+
+4. Call the Writer object which translates the document to the final
+   output format and writes the formatted data to the destination
+   Output object.  Depending on the Output object, the output may be
+   returned from the Writer, and then from the ``publish()`` method.

 Calling the "publish" function (or instantiating a "Publisher" object)
 with component names will result in default behavior.  For custom
-behavior (setting component options), create custom component objects
-first, and pass *them* to publish/Publisher.
+behavior (customizing component settings), create custom component
+objects first, and pass *them* to the Publisher or ``publish_*``
+convenience functions.


 Readers
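The four ``Publisher.publish()`` steps added in this hunk can be sketched with toy components. This is a minimal illustration only: every class below, and the ``toy_publish_string`` helper, is a hypothetical stand-in written for this note, not the Docutils API, and the whitespace-collapsing transform is an invented example.

```python
# Hypothetical stand-ins for the components named in the hunk above;
# none of these classes are the real Docutils implementations.

class StringInput:
    """Toy Input object: hands back the string it was given."""
    def __init__(self, source):
        self.source = source

    def read(self):
        return self.source


class StringOutput:
    """Toy Output object: stores and returns whatever is written."""
    def write(self, data):
        self.destination = data
        return data


def collapse_whitespace(document):
    """Invented example transform: normalize whitespace in each paragraph."""
    document["paragraphs"] = [" ".join(p.split()) for p in document["paragraphs"]]


class Parser:
    """Toy parser: blank-line-separated paragraphs."""
    def parse(self, text, document):
        document["paragraphs"] = [p.strip() for p in text.split("\n\n") if p.strip()]


class Reader:
    def __init__(self, parser):
        self.parser = parser

    def read(self, source):
        # A fresh document root is created and handed to the parser.
        document = {"paragraphs": [], "transforms": [collapse_whitespace]}
        self.parser.parse(source.read(), document)
        return document


class Writer:
    def write(self, document, destination):
        body = "".join(f"<p>{p}</p>\n" for p in document["paragraphs"])
        return destination.write(body)


class Publisher:
    def __init__(self, reader, writer):
        self.reader, self.writer = reader, writer

    def publish(self, source_text):
        # 1. Set up I/O objects (settings handling is omitted here).
        source, destination = StringInput(source_text), StringOutput()
        # 2. The Reader reads from the Input and parses; a document is returned.
        document = self.reader.read(source)
        # 3. Apply the transforms attached to the document.
        for transform in document["transforms"]:
            transform(document)
        # 4. The Writer translates and writes; string output is returned.
        return self.writer.write(document, destination)


def toy_publish_string(source_text):
    """Convenience wrapper in the spirit of the publish_* functions."""
    return Publisher(Reader(Parser()), Writer()).publish(source_text)
```

Calling ``toy_publish_string("Hello\n  there\n\nWorld")`` runs all four steps and returns the two paragraphs as ``<p>`` elements, one per line.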
@@ -104,9 +106,6 @@ Readers
 Readers understand the input context (where the data is coming from),
 send the whole input or discrete "chunks" to the parser, and provide
 the context to bind the chunks together back into a cohesive whole.
-Using transforms_, Readers also resolve references, footnote numbers,
-interpreted text processing, and anything else that requires
-context-sensitive computation.

 Each reader is a module or package exporting a "Reader" class with a
 "read" method.  The base "Reader" class can be found in the
@@ -118,43 +117,40 @@ still incomplete) will be able to determine the parser on its own.

 Responsibilities:

- Get input text from the source I/O.
+* Get input text from the source I/O.

- Pass the input text to the parser, along with a fresh doctree root.
-
- Run transforms over the doctree(s).
+* Pass the input text to the parser, along with a fresh `document
+  tree`_ root.

 Examples:

- Standalone (Raw/Plain): Just read a text file and process it.
+* Standalone (Raw/Plain): Just read a text file and process it.
  The reader needs to be told which parser to use.

  The "Standalone Reader" has been implemented in module
  ``docutils.readers.standalone``.

- Python Source: See `Python Source Reader`_ below.  This Reader is
+* Python Source: See `Python Source Reader`_ below.  This Reader is
  currently in development in the Docutils sandbox.

- Email: RFC-822 headers, quoted excerpts, signatures, MIME parts.
+* Email: RFC-822 headers, quoted excerpts, signatures, MIME parts.

- PEP: RFC-822 headers, "PEP xxxx" and "RFC xxxx" conversion to URIs.
-  Either interpret PEPs' indented sections or convert existing PEPs to
-  reStructuredText (or both?).
+* PEP: RFC-822 headers, "PEP xxxx" and "RFC xxxx" conversion to URIs.
+  The "PEP Reader" has been implemented in module
+  ``docutils.readers.pep``; see PEP 287 and PEP 12.

-  The "PEP Reader" is being implemented in module
-  ``docutils.readers.pep``.
-
- Wiki: Global reference lookups of "wiki links" incorporated into
+* Wiki: Global reference lookups of "wiki links" incorporated into
  transforms.  (CamelCase only or unrestricted?)  Lazy
  indentation?

- Web Page: As standalone, but recognize meta fields as meta tags.
+* Web Page: As standalone, but recognize meta fields as meta tags.
  Support for templates of some sort?  (After ``<body>``, before
  ``</body>``?)

- FAQ: Structured "question & answer(s)" constructs.
+* FAQ: Structured "question & answer(s)" constructs.

- Compound document: Merge chapters into a book.  Master TOC file?
+* Compound document: Merge chapters into a book.  Master manifest
+  file?


 Parsers
@@ -175,93 +171,121 @@ Example: The only parser implemented so far is for the
 reStructuredText markup.  It is implemented in the
 ``docutils/parsers/rst/`` package.

+The development and integration of other parsers is possible and
+encouraged.

-Transforms
----------

-Transforms change the document tree from one form to another, add to
-the tree, or prune it.  Transforms are run by Reader and Writer
-objects.  Some transforms are Reader-specific, some are
-Parser-specific, and others are Writer-specific.  The choice and order
-of transforms is specified in the Reader and Writer objects.
+.. _transforms:
+
+Transformer
+-----------
+
+The Transformer class, in ``docutils/transforms/__init__.py``, stores
+transforms and applies them to documents.  A transformer object is
+attached to every new document tree.  The Publisher_ calls
+``Transformer.apply_transforms()`` to apply all stored transforms to
+the document tree.  Transforms change the document tree from one form
+to another, add to the tree, or prune it.  Transforms resolve
+references and footnote numbers, process interpreted text, and do
+other context-sensitive processing.
+
+Some transforms are specific to components (Readers, Parser, Writers,
+Input, Output).  Standard component-specific transforms are specified
+in the ``default_transforms`` attribute of component classes.  After
+the Reader has finished processing, the Publisher_ calls
+``Transformer.populate_from_components()`` with a list of components
+and all default transforms are stored.

 Each transform is a class in a module in the ``docutils/transforms/``
-package, a subclass of docutils.tranforms.Transform.
+package, a subclass of ``docutils.transforms.Transform``.  Transform
+classes each have a ``default_priority`` attribute which is used by
+the Transformer to apply transforms in order (low to high).  The
+default priority can be overridden when adding transforms to the
+Transformer object.

-Responsibilities:
+Transformer responsibilities:

- Modify a doctree in-place, either purely transforming one structure
+* Apply transforms to the document tree, in priority order.
+
+* Store a mapping of component type name ('reader', 'writer', etc.) to
+  component objects.  These are used by certain transforms (such as
+  "components.Filter") to determine suitability.
+
+Transform responsibilities:
+
+* Modify a doctree in-place, either purely transforming one structure
  into another, or adding new structures based on the doctree and/or
  external data.

-Examples (in the ``docutils/transforms/`` package):
+Examples of transforms (in the ``docutils/transforms/`` package):

- frontmatter.DocInfo: Conversion of document metadata (bibliographic
+* frontmatter.DocInfo: Conversion of document metadata (bibliographic
  information).

- references.Hyperlinks: Resolution of hyperlinks.
+* references.AnonymousHyperlinks: Resolution of anonymous references
+  to corresponding targets.

- parts.Contents: Generates a table of contents for a document.
+* parts.Contents: Generates a table of contents for a document.

- document.Merger: Combining multiple populated doctrees into one (not
-  yet implemented or fully understood).
+* document.Merger: Combining multiple populated doctrees into one.
+  (Not yet implemented or fully understood.)

- document.Splitter: Splits a document into a tree-structure of
+* document.Splitter: Splits a document into a tree-structure of
  subdocuments, perhaps by section.  It will have to transform
  references appropriately.  (Neither implemented not remotely
  understood.)

- universal.Pending: Handles transforms that must be executed at
-  specific stages of processing.
-
- components.Filter: Includes or excludes elements which depend on a
-  specific Docutils component (triggered by the universal.Pending
-  transform).
+* components.Filter: Includes or excludes elements which depend on a
+  specific Docutils component.


 Writers
 -------

 Writers produce the final output (HTML, XML, TeX, etc.).  Writers
-translate the internal document tree structure into the final data
+translate the internal `document tree`_ structure into the final data
 format, possibly running Writer-specific transforms_ first.

+By the time the document gets to the Writer, it should be in final
+form.  The Writer's job is simply (and only) to translate from the
+Docutils doctree structure to the target format.  Some small
+transforms may be required, but they should be local and
+format-specific.
+
 Each writer is a module or package exporting a "Writer" class with a
 "write" method.  The base "Writer" class can be found in the
 ``docutils/writers/__init__.py`` module.

 Responsibilities:

- Run transforms over the doctree(s).
-
- Translate doctree(s) into specific output formats.
+* Translate doctree(s) into specific output formats.

  - Transform references into format-native forms.

- Write the translated output to the destination I/O.
+* Write the translated output to the destination I/O.

 Examples:

- XML: Various forms, such as:
+* XML: Various forms, such as:
+
+  - Docutils XML (an expression of the internal document tree,
+    implemented as ``docutils.writers.docutils_xml``).

  - DocBook (being implemented in the Docutils sandbox).

-  - Raw doctree XML (accessible via "``doctree.asdom().toxml()``"; no
-    Writer component implemented yet).
+* HTML (XHTML implemented as ``docutils.writers.html4css1``).

- HTML (XHTML implemented as ``docutils.writers.html4css1``).
-
- PDF (a ReportLabs interface is being developed in the Docutils
+* PDF (a ReportLabs interface is being developed in the Docutils
  sandbox).

- TeX
+* TeX (a LaTeX Writer is being implemented in the sandbox).

- Docutils-native pseudo-XML (implemented as
+* Docutils-native pseudo-XML (implemented as
  ``docutils.writers.pseudoxml``, used for testing).

- Plain text
+* Plain text

- reStructuredText?
+* reStructuredText?


 Input/Output
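The priority ordering described in the Transformer section of this hunk (``default_priority``, applied low to high, overridable when a transform is added) can be illustrated as follows. This is a sketch under assumptions: the ``add_transform`` signature, the three example transform classes, and the priority numbers are all invented for illustration, and the document is a plain dict rather than a doctree.

```python
class Transform:
    """Base class: subclasses set default_priority and implement apply()."""
    default_priority = None

    def __init__(self, document):
        self.document = document

    def apply(self):
        raise NotImplementedError


# Three invented transforms that only record that they ran.
class NumberFootnotes(Transform):
    default_priority = 620   # arbitrary illustrative value

    def apply(self):
        self.document["log"].append("footnotes")


class ResolveLinks(Transform):
    default_priority = 440   # arbitrary illustrative value

    def apply(self):
        self.document["log"].append("links")


class BuildContents(Transform):
    default_priority = 720   # arbitrary illustrative value

    def apply(self):
        self.document["log"].append("contents")


class Transformer:
    """Stores transforms and applies them in priority order (low to high)."""

    def __init__(self, document):
        self.document = document
        self.transforms = []
        self.serial = 0   # tie-breaker: preserves insertion order

    def add_transform(self, transform_class, priority=None):
        # The default priority can be overridden at registration time.
        if priority is None:
            priority = transform_class.default_priority
        self.transforms.append((priority, self.serial, transform_class))
        self.serial += 1

    def apply_transforms(self):
        # Sort on (priority, serial) only, so classes are never compared.
        for _, _, cls in sorted(self.transforms, key=lambda entry: entry[:2]):
            cls(self.document).apply()
        self.transforms = []
```

Registering ``NumberFootnotes``, ``ResolveLinks``, and then ``BuildContents`` with ``priority=100`` applies them in the order contents, links, footnotes, regardless of registration order.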
@@ -269,68 +293,78 @@ Input/Output

 I/O classes provide a uniform API for low-level input and output.
 Subclasses will exist for a variety of input/output mechanisms.
+However, they can be considered an implementation detail.  Most
+applications should be satisfied using one of the convenience
+functions associated with the Publisher_.

 I/O classes are currently in the preliminary stages; there's a lot of
 work yet to be done.  Issues:

- Looking at the list of writers, it seems that only HTML would
-  require anything other than monolithic output.  Perhaps "Writer"
-  variants, one for each output distribution type?
+* How to represent multi-file input (files & directories) in the API?

- How to represent a multi-file document (files & directories) in the
-  API?
+* How to represent multi-file output?  Perhaps "Writer" variants, one
+  for each output distribution type?  Or Output objects with
+  associated transforms?

 Responsibilities:

- Read data from the input source and/or write data to the output
-  destination.
+* Read data from the input source (Input objects) or write data to the
+  output destination (Output objects).

 Examples of input sources:

- A single file on disk or a stream (implemented as
+* A single file on disk or a stream (implemented as
  ``docutils.io.FileInput``).

- Multiple files on disk (``MultiFileInput``?).
+* Multiple files on disk (``MultiFileInput``?).

- Python source files: modules and packages.
+* Python source files: modules and packages.

- Python strings, as received from a client application
+* Python strings, as received from a client application
  (implemented as ``docutils.io.StringInput``).

 Examples of output destinations:

- A single file on disk or a stream (implemented as
+* A single file on disk or a stream (implemented as
  ``docutils.io.FileOutput``).

- A tree of directories and files on disk.
+* A tree of directories and files on disk.

- A Python string, returned to a client application (implemented as
+* A Python string, returned to a client application (implemented as
  ``docutils.io.StringOutput``).

- A single tree-shaped data structure in memory.
+* No output; useful for programmatic applications where only a portion
+  of the normal output is to be used (implemented as
+  ``docutils.io.NullOutput``).

- Some other set of data structures in memory.
+* A single tree-shaped data structure in memory.
+
+* Some other set of data structures in memory.


 Docutils Package Structure
 ==========================

- Package "docutils".
+* Package "docutils".

-  - Class "Component" is a base class for Docutils components.
+  - Module "__init__.py" contains: class "Component", a base class for
+    Docutils components; class "SettingsSpec", a base class for
+    specifying runtime settings (used by docutils.frontend); and class
+    "TransformSpec", a base class for specifying transforms.

  - Module "docutils.core" contains facade class "Publisher" and
-    convenience function "publish()".  See `Publisher`_ above.
+    convenience functions.  See `Publisher`_ above.

-  - Module "docutils.frontend" provides command-line and option
-    processing for Docutils front-end tools.
+  - Module "docutils.frontend" provides runtime settings support, for
+    programmatic use and front-end tools (including configuration file
+    support, and command-line argument and option processing).

  - Module "docutils.io" provides a uniform API for low-level input
    and output.  See `Input/Output`_ above.

  - Module "docutils.nodes" contains the Docutils document tree
-    element class library plus Visitor pattern base classes.  See
-    `Document Tree`_ below.
+    element class library plus tree-traversal Visitor pattern base
+    classes.  See `Document Tree`_ below.

  - Module "docutils.optik" provides option parsing and command-line
    help; from Greg Ward's http://optik.sf.net/ project, included for
@@ -340,8 +374,9 @@ Docutils Package Structure
    routines.

  - Module "docutils.statemachine" contains a finite state machine
-    specialized for regular-expression-based text filters.  The
-    reStructuredText parser implementation is based on this module.
+    specialized for regular-expression-based text filters and parsers.
+    The reStructuredText parser implementation is based on this
+    module.

  - Module "docutils.urischemes" contains a mapping of known URI
    schemes ("http", "ftp", "mail", etc.).
@@ -375,7 +410,7 @@ Docutils Package Structure
      Proposals).

    - Readers to be added for: Python source code (structure &
-      docstrings), PEPs, email, FAQ, and perhaps Wiki and others.
+      docstrings), email, FAQ, and perhaps Wiki and others.

    See `Readers`_ above.

@@ -385,20 +420,26 @@ Docutils Package Structure
      by name.  Class "Writer" is the base class of specific writers.
      (``docutils/writers/__init__.py``)

-    - Module "docutils.writers.pseudoxml" is a simple internal
-      document tree writer; it writes indented pseudo-XML.
-
    - Module "docutils.writers.html4css1" is a simple HyperText Markup
      Language document tree writer for HTML 4.01 and CSS1.

+    - Module "docutils.writers.docutils_xml" writes the internal
+      document tree in XML form.
+
+    - Module "docutils.writers.pseudoxml" is a simple internal
+      document tree writer; it writes indented pseudo-XML.
+
    - Writers to be added: HTML 3.2 or 4.01-loose, XML (various forms,
-      such as DocBook and the raw internal doctree), PDF, TeX,
-      plaintext, reStructuredText, and perhaps others.
+      such as DocBook), PDF, TeX, plaintext, reStructuredText, and
+      perhaps others.

    See `Writers`_ above.

  - Package "docutils.transforms": tree transform classes.

+    - Class "Transformer" stores transforms and applies them to
+      document trees.  (``docutils/transforms/__init__.py``)
+
    - Class "Transform" is the base class of specific transforms.
      (``docutils/transforms/__init__.py``)

@@ -414,7 +455,8 @@ Docutils Package Structure
    - Function "get_language(language_code)", returns matching
      language module.  (``docutils/languages/__init__.py``)

-    - Module "docutils.languages.en" (English).
+    - Modules: en.py (English), de.py (German), fr.py (French), sk.py
+      (Slovak), sv.py (Swedish).

    - Other languages to be added.

@@ -422,7 +464,8 @@ Docutils Package Structure
 Front-End Tools
 ===============

-See `Docutils Front-End Tools`_.
+The ``tools/`` directory contains several front ends for common
+Docutils processing.  See `Docutils Front-End Tools`_ for details.

 .. _Docutils Front-End Tools: http://docutils.sf.net/docs/tools.html

@@ -432,31 +475,34 @@ Document Tree

 A single intermediate data structure is used internally by Docutils,
 in the interfaces between components; it is defined in the
-docutils.nodes module.  It is not required that this data structure be
-used *internally* by any of the components, just *between* components.
+``docutils.nodes`` module.  It is not required that this data
+structure be used *internally* by any of the components, just
+*between* components as outlined in the diagram in the `Docutils
+Project Model`_ above.

 Custom node types are allowed, provided that either (a) a transform
 converts them to standard Docutils nodes before they reach the Writer
 proper, or (b) the custom node is explicitly supported by certain
 Writers, and is wrapped in a filtered "pending" node.  An example of
-condition A is the `Python Source Reader`_ (see below), where a
+condition (a) is the `Python Source Reader`_ (see below), where a
 "stylist" transform converts custom nodes.  The HTML ``<meta>`` tag is
-an example of condition B; it is supported by the HTML Writer but not
-by others.  The reStructuredText "meta" directive creates a "pending"
-node, which contains knowledge that the embedded "meta" node can only
-be handled by HTML-compatible writers.  The "pending" node is resolved
-by the "transforms.components.Filter" transform, which checks that the
-calling writer supports HTML; if it doesn't, the "meta" node is
-removed from the document.
+an example of condition (b); it is supported by the HTML Writer but
+not by others.  The reStructuredText "meta" directive creates a
+"pending" node, which contains knowledge that the embedded "meta" node
+can only be handled by HTML-compatible writers.  The "pending" node is
+resolved by the ``docutils.transforms.components.Filter`` transform,
+which checks that the calling writer supports HTML; if it doesn't, the
+"pending" node (and enclosed "meta" node) is removed from the
+document.

 The document tree data structure is similar to a DOM tree, but with
 specific node names (classes) instead of DOM's generic nodes. The
 schema is documented in an XML DTD (eXtensible Markup Language
 Document Type Definition), which comes in two parts:

- the Docutils Generic DTD, docutils.dtd_, and
+* the Docutils Generic DTD, docutils.dtd_, and

- the OASIS Exchange Table Model, soextbl.dtd_.
+* the OASIS Exchange Table Model, soextbl.dtd_.

 The DTD defines a rich set of elements, suitable for many input and
 output formats.  The DTD retains all information necessary to
@@ -473,23 +519,23 @@ When the parser encounters an error in markup, it inserts a system
 message (DTD element "system_message").  There are five levels of
 system messages:

- Level-0, "DEBUG": an internal reporting issue.  There is no effect
+* Level-0, "DEBUG": an internal reporting issue.  There is no effect
  on the processing.  Level-0 system messages are handled separately
  from the others.

- Level-1, "INFO": a minor issue that can be ignored.  There is little
+* Level-1, "INFO": a minor issue that can be ignored.  There is little
  or no effect on the processing.  Typically level-1 system messages
  are not reported.

- Level-2, "WARNING": an issue that should be addressed.  If ignored,
+* Level-2, "WARNING": an issue that should be addressed.  If ignored,
  there may be minor problems with the output.  Typically level-2
  system messages are reported but do not halt processing

- Level-3, "ERROR": a major issue that should be addressed.  If
+* Level-3, "ERROR": a major issue that should be addressed.  If
  ignored, the output will contain unpredictable errors.  Typically
  level-3 system messages are reported but do not halt processing

- Level-4, "SEVERE": a critical error that must be addressed.
+* Level-4, "SEVERE": a critical error that must be addressed.
  Typically level-4 system messages are turned into exceptions which
  halt processing.  If ignored, the output will contain severe errors.

@@ -517,11 +563,11 @@ Processing Model
 This model will evolve over time, incorporating experience and
 discoveries.

-1. The PySource Reader uses an I/O class to read in some Python
-   packages and modules, into a tree of strings.
+1. The PySource Reader uses an Input class to read in Python packages
+   and modules, into a tree of strings.

 2. The Python modules are parsed, converting the tree of strings into
-   a tree of abstract syntax trees.
+   a tree of abstract syntax trees with docstring nodes.

 3. The abstract syntax trees are converted into an internal
   representation of the packages/modules.  Docstrings are extracted,
@@ -532,7 +578,7 @@ discoveries.
   Docutils doctrees.

 5. PySource assembles all the individual docstrings' doctrees into a
-   Python-specific custom Docutils tree parallelling the
+   Python-specific custom Docutils tree paralleling the
   package/module/class structure; this is a custom Reader-specific
   internal representation (see the `Docutils Python Source DTD`_).
   Namespaces must be merged: Python identifiers, hyperlink targets.
@@ -541,37 +587,38 @@ discoveries.
   identifiers are resolved according to the Python namespace lookup
   rules.  See `Identifier Cross-References`_ below.

-7. A "Stylist" transform is applied to the custom doctree, custom
-   nodes are rendered using standard nodes as primitives, and a
-   standard document tree is emitted.  See `Stylist Transforms`_
-   below.
+7. A "Stylist" transform is applied to the custom doctree (by the
+   Transformer_), custom nodes are rendered using standard nodes as
+   primitives, and a standard document tree is emitted.  See `Stylist
+   Transforms`_ below.

-8. Other transforms are applied to the standard doctree.
+8. Other transforms are applied to the standard doctree by the
+   Transformer_.

 9. The standard doctree is sent to a Writer, which translates the
   document into a concrete format (HTML, PDF, etc.).

-10. The Writer uses an I/O class to write the resulting data to its
+10. The Writer uses an Output class to write the resulting data to its
    destination (disk file, directories and files, etc.).


 AST Mining
 ----------

-Abstract Syntax Tree mining code will be written that scans a parsed
-Python module, and returns an ordered tree containing the names,
-docstrings (including attribute and additional docstrings; see below),
-and additional info (in parentheses below) of all of the following
-objects:
+Abstract Syntax Tree mining code will be written (or adapted) that
+scans a parsed Python module, and returns an ordered tree containing
+the names, docstrings (including attribute and additional docstrings;
+see below), and additional info (in parentheses below) of all of the
+following objects:

- packages
- modules
- module attributes (+ initial values)
- classes (+ inheritance)
- class attributes (+ initial values)
- instance attributes (+ initial values)
- methods (+ parameters & defaults)
- functions (+ parameters & defaults)
+* packages
+* modules
+* module attributes (+ initial values)
+* classes (+ inheritance)
+* class attributes (+ initial values)
+* instance attributes (+ initial values)
+* methods (+ parameters & defaults)
+* functions (+ parameters & defaults)

 (Extract comments too?  For example, comments at the start of a module
 would be a good place for bibliographic field lists.)
@@ -579,7 +626,7 @@ would be a good place for bibliographic field lists.)
 In order to evaluate interpreted text cross-references, namespaces for
 each of the above will also be required.

-See python-dev/docstring-develop thread "AST mining", started on
+See the python-dev/docstring-develop thread "AST mining", started on
 2001-08-14.


@@ -592,12 +639,11 @@ Docstring Extraction Rules
      documented, only identifiers listed in "``__all__``" are
      examined for docstrings.

-   b) In the absense of "``__all__``", all identifiers are examined,
+   b) In the absence of "``__all__``", all identifiers are examined,
      except those whose names are private (names begin with "_" but
      don't begin and end with "__").

-   c) 1a and 1b can be overridden by a parameter or command-line
-      option.
+   c) 1a and 1b can be overridden by runtime settings.

 2. Where:

@@ -616,7 +662,8 @@ Docstring Extraction Rules
      docstrings in (a) and (b) will be recognized, extracted, and
      concatenated.  See `Additional Docstrings`_ below.

-   d) @@@ 2.2-style "properties" with attribute docstrings?
+   d) @@@ 2.2-style "properties" with attribute docstrings?  Wait for
+      syntax?

 3. How:

@@ -629,7 +676,7 @@ Docstring Extraction Rules
     examine an imported module, such as comments and the order of
     definitions.

-   - Docstrings are to be recognized in places where the bytecode
+   - Docstrings are to be recognized in places where the byte-code
     compiler ignores string literal expressions (2b and 2c above),
     meaning importing the module will lose these docstrings.

@@ -642,7 +689,7 @@ Docstring Extraction Rules
   limitations must be lived with.

 Since attribute docstrings and additional docstrings are ignored by
-the Python bytecode compiler, no namespace pollution or runtime bloat
+the Python byte-code compiler, no namespace pollution or runtime bloat
 will result from their use.  They are not assigned to ``__doc__`` or
 to any other attribute.  The initial parsing of a module may take a
 slight performance hit.
@@ -654,7 +701,7 @@ Attribute Docstrings
 (This is a simplified version of PEP 224 [#PEP-224]_.)

 A string literal immediately following an assignment statement is
-interpreted by the docstring extration machinery as the docstring of
+interpreted by the docstring extraction machinery as the docstring of
 the target of the assignment statement, under the following
 conditions:

@ -666,7 +713,7 @@ conditions:
   b) At the top level of a class definition: a class attribute.

   c) At the top level of the "``__init__``" method definition of a
-      class: an instance attribute.
+      class: an instance attribute.  (@@@ ``__new__`` methods?)

   Since each of the above contexts is at the top level (i.e., in the
   outermost suite of a definition), it may be necessary to place
@ -685,7 +732,7 @@ conditions:
   b) For context 1c above, the target must be of the form
      "``self.attrib``", where "``self``" matches the "``__init__``"
      method's first parameter (the instance parameter) and "attrib"
-      is a simple indentifier as in 3a.
+      is a simple identifier as in 3a.

 Blank lines may be used after attribute docstrings to emphasize the
 connection between the assignment and the docstring.
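The attribute docstring rule can be illustrated with the standard library's `ast` module. This is a minimal sketch that assumes only the module-level case with a single simple-identifier target; the function name `attribute_docstrings` is hypothetical, and the PEP does not prescribe `ast` or any other mechanism.

```python
import ast


def attribute_docstrings(source):
    """Map each simple assignment target to the string literal after it.

    Hypothetical helper; handles only the outermost suite of a module.
    """
    body = ast.parse(source).body
    found = {}
    # Pair every statement with the statement immediately following it.
    for stmt, following in zip(body, body[1:]):
        is_simple_assign = (
            isinstance(stmt, ast.Assign)
            and len(stmt.targets) == 1
            and isinstance(stmt.targets[0], ast.Name)
        )
        is_string_literal = (
            isinstance(following, ast.Expr)
            and isinstance(following.value, ast.Constant)
            and isinstance(following.value.value, str)
        )
        if is_simple_assign and is_string_literal:
            found[stmt.targets[0].id] = following.value.value
    return found
```

Because the source is parsed rather than imported, the string literals survive even though the byte-code compiler would discard them.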
@ -712,25 +759,25 @@ Additional Docstrings

 Many programmers would like to make extensive use of docstrings for
 API documentation.  However, docstrings do take up space in the
-running program, so some of these programmers are reluctant to "bloat
-up" their code.  Also, not all API documentation is applicable to
-interactive environments, where ``__doc__`` would be displayed.
+running program, so some programmers are reluctant to "bloat up" their
+code.  Also, not all API documentation is applicable to interactive
+environments, where ``__doc__`` would be displayed.

-The docstring processing system's extraction tools will concatenate
-all string literal expressions which appear at the beginning of a
-definition or after a simple assignment.  Only the first strings in
-definitions will be available as ``__doc__``, and can be used for
-brief usage text suitable for interactive sessions; subsequent string
-literals and all attribute docstrings are ignored by the Python
-bytecode compiler and may contain more extensive API information.
+Docutils' docstring extraction tools will concatenate all string
+literal expressions which appear at the beginning of a definition or
+after a simple assignment.  Only the first strings in definitions will
+be available as ``__doc__``, and can be used for brief usage text
+suitable for interactive sessions; subsequent string literals and all
+attribute docstrings are ignored by the Python byte-code compiler and
+may contain more extensive API information.

 Example::

    def function(arg):
        """This is __doc__, function's docstring."""
        """
-        This is an additional docstring, ignored by the bytecode
-        compiler, but extracted by the Docutils.
+        This is an additional docstring, ignored by the byte-code
+        compiler, but extracted by Docutils.
        """
        pass

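The example in the hunk above can be checked mechanically: collect every string literal expression at the start of a definition, of which only the first becomes `__doc__`. The sketch below again assumes the standard `ast` module and a hypothetical helper name, `leading_docstrings`. It returns the pieces as a list because the text does not say how the strings are joined.

```python
import ast


def leading_docstrings(source, name):
    """Return the string literals that open the definition called `name`.

    Hypothetical helper, not a Docutils API. The first item is what
    Python stores as ``__doc__``; the rest are additional docstrings.
    """
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.FunctionDef, ast.ClassDef)) and node.name == name:
            strings = []
            for stmt in node.body:
                if (isinstance(stmt, ast.Expr)
                        and isinstance(stmt.value, ast.Constant)
                        and isinstance(stmt.value.value, str)):
                    strings.append(stmt.value.value)
                else:
                    # Stop at the first statement that is not a string literal.
                    break
            return strings
    return []
```

A caller that wants a single concatenated docstring can join the returned list with whatever separator suits the output format.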
@ -753,13 +800,13 @@ Example::
   1. Should we search for docstrings after a ``__future__``
      statement?  Very ugly.

-   2. Redefine ``__future__`` statements to allow multiple preceeding
+   2. Redefine ``__future__`` statements to allow multiple preceding
      string literals?

   3. Or should we not even worry about this?  There probably
      shouldn't be ``__future__`` statements in production code, after
-      all.  Will modules with ``__future__`` statements simply have to
-      put up with the single-docstring limitation?
+      all.  Perhaps modules with ``__future__`` statements will simply
+      have to put up with the single-docstring limitation.


 Choice of Docstring Format
@ -776,7 +823,8 @@ format being used, a case-insensitive string matching the input
 parser's module or package name (i.e., the same name as required to
 "import" the module or package), or a registered alias.  If no
 ``__docformat__`` is specified, the default format is "plaintext" for
-now; this may be changed to the standard format once determined.
+now; this may be changed to the standard format if one is ever
+established.

 The ``__docformat__`` string may contain an optional second field,
 separated from the format name (first field) by a single space: a
@ -818,17 +866,17 @@ when necessary.  For example (using reStructuredText markup)::
            """
            Extend `Storer.__init__()` to keep track of instances.

-            Keep count in `self.instances`, data in `self.data`.
+            Keep count in `Keeper.instances`, data in `self.data`.
            """
            Storer.__init__(self)
-            self.instances += 1
+            Keeper.instances += 1

            self.data = []
            """Store data in a list, most recent last."""

-        def storedata(self, data):
+        def store_data(self, data):
            """
-            Extend `Storer.storedata()`; append new `data` to a
+            Extend `Storer.store_data()`; append new `data` to a
            list (in `self.data`).
            """
             self.data.append(data)
@ -840,12 +888,12 @@ references to the definitions of the identifiers themselves.
 Stylist Transforms
 ------------------

-Stylist transforms are specialized transforms specific to a Reader.
-The PySource Reader doesn't have to make any decisions as to style; it
-just produces a logically constructed document tree, parsed and
-linked, including custom node types.  Stylist transforms understand
-the custom nodes created by the Reader and convert them into standard
-Docutils nodes.
+Stylist transforms are specialized transforms specific to the PySource
+Reader.  The PySource Reader doesn't have to make any decisions as to
+style; it just produces a logically constructed document tree, parsed
+and linked, including custom node types.  Stylist transforms
+understand the custom nodes created by the Reader and convert them
+into standard Docutils nodes.

 Multiple Stylist transforms may be implemented and one can be chosen
 at runtime (through a "--style" or "--stylist" command-line option).
--- a/pep-0287.txt
+++ b/pep-0287.txt
@ -25,7 +25,7 @@ what-you-see-is-what-you-get plaintext markup syntax.

 Only the low-level syntax of docstrings is addressed here.  This PEP
 is not concerned with docstring semantics or processing at all (see
-PEP 256 for a "Roadmap to the Doctring PEPs").  Nor is it an attempt
+PEP 256 for a "Road Map to the Docstring PEPs").  Nor is it an attempt
 to deprecate pure plaintext docstrings, which are always going to be
 legitimate.  The reStructuredText markup is an alternative for those
 who want more expressive docstrings.