reSTify PEP 100 (#422)

Parent:  5a15c92dcf
Commit:  0fcce00dad

pep-0100.txt | 447
@@ -5,12 +5,14 @@ Last-Modified: $Date$
 Author: mal@lemburg.com (Marc-André Lemburg)
 Status: Final
 Type: Standards Track
+Content-Type: text/x-rst
 Created: 10-Mar-2000
 Python-Version: 2.0
 Post-History:
 
 
 Historical Note
+===============
 
 This document was first written by Marc-Andre in the pre-PEP days,
 and was originally distributed as Misc/unicode.txt in Python
@@ -26,6 +28,7 @@ Historical Note
 
 
 Introduction
+============
 
 The idea of this proposal is to add native Unicode 3.0 support to
 Python in a way that makes use of Unicode strings as simple as
@@ -40,11 +43,9 @@ Introduction
 integration.
 
 The latest version of this document is always available at:
-
 http://starship.python.net/~lemburg/unicode-proposal.txt
 
 Older versions are available as:
-
 http://starship.python.net/~lemburg/unicode-proposal-X.X.txt
 
 [ed. note: new revisions should be made to this PEP document,
@@ -53,6 +54,7 @@ Introduction
 
 
 Conventions
+===========
 
 - In examples we use u = Unicode object and s = Python string
 
@@ -60,6 +62,7 @@ Conventions
 
 
 General Remarks
+===============
 
 - Unicode encoding names should be lower case on output and
   case-insensitive on input (they will be converted to lower case
@@ -70,10 +73,11 @@ General Remarks
   16' is written as 'utf-16'.
 
 - Codec modules should use the same names, but with hyphens
-  converted to underscores, e.g. utf_8, utf_16, iso_8859_1.
+  converted to underscores, e.g. ``utf_8``, ``utf_16``, ``iso_8859_1``.
 
 
 Unicode Default Encoding
+========================
 
 The Unicode implementation has to make some assumption about the
 encoding of 8-bit strings passed to it for coercion and about the
@@ -86,16 +90,16 @@ Unicode Default Encoding
 possible. The <default encoding> can be set and queried using the
 two sys module APIs:
 
-sys.setdefaultencoding(encoding)
-   --> Sets the <default encoding> used by the Unicode implementation.
+``sys.setdefaultencoding(encoding)``
+   Sets the <default encoding> used by the Unicode implementation.
    encoding has to be an encoding which is supported by the
    Python installation, otherwise, a LookupError is raised.
 
    Note: This API is only available in site.py! It is
    removed from the sys module by site.py after usage.
 
-sys.getdefaultencoding()
-   --> Returns the current <default encoding>.
+``sys.getdefaultencoding()``
+   Returns the current <default encoding>.
 
 If not otherwise defined or set, the <default encoding> defaults
 to 'ascii'. This encoding is also the startup default of Python
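[ed. note: a present-day sketch of this API, for the modern reader. Only the query half survives: ``sys.setdefaultencoding()`` was removed long ago, and in Python 3 the default encoding is permanently ``'utf-8'`` rather than the ``'ascii'`` startup default described above.]

```python
import sys

# In Python 2.0 the <default encoding> started out as 'ascii' and could be
# changed from site.py; in Python 3 only the query API remains and the
# answer is fixed.
default = sys.getdefaultencoding()

# Implicit decoding without an explicit encoding argument uses this default.
decoded = b"abc".decode()
```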
@@ -113,9 +117,10 @@ Unicode Default Encoding
 
 
 Unicode Constructors
+====================
 
 Python should provide a built-in constructor for Unicode strings
-which is available through __builtins__:
+which is available through ``__builtins__``::
 
    u = unicode(encoded_string[,encoding=<default encoding>][,errors="strict"])
 
@@ -129,17 +134,17 @@ Unicode Constructors
   ordinal (e.g. 'a' -> U+0061).
 
 - all existing defined Python escape sequences are interpreted as
-  Unicode ordinals; note that \xXXXX can represent all Unicode
-  ordinals, and \OOO (octal) can represent Unicode ordinals up to
+  Unicode ordinals; note that ``\xXXXX`` can represent all Unicode
+  ordinals, and ``\OOO`` (octal) can represent Unicode ordinals up to
   U+01FF.
 
-- a new escape sequence, \uXXXX, represents U+XXXX; it is a syntax
-  error to have fewer than 4 digits after \u.
+- a new escape sequence, ``\uXXXX``, represents U+XXXX; it is a syntax
+  error to have fewer than 4 digits after ``\u``.
 
 For an explanation of possible values for errors see the Codec
 section below.
 
-Examples:
+Examples::
 
    u'abc' -> U+0061 U+0062 U+0063
    u'\u1234' -> U+1234
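[ed. note: the ``\u`` escape introduced here survives unchanged in modern Python, where every string literal is a Unicode string. A minimal sketch of the examples above:]

```python
# \xXX and \uXXXX escapes denote Unicode ordinals, as specified above.
a = "abc"        # three characters: U+0061 U+0062 U+0063
b = "\u1234"     # one character: U+1234 (exactly four hex digits after \u)

ordinals = [hex(ord(c)) for c in a]
```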
@@ -147,7 +152,7 @@ Unicode Constructors
 
 The 'raw-unicode-escape' encoding is defined as follows:
 
-- \uXXXX sequence represent the U+XXXX Unicode character if and
+- ``\uXXXX`` sequences represent the U+XXXX Unicode character if and
   only if the number of leading backslashes is odd
 
 - all other characters represent themselves as Unicode ordinal
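[ed. note: the 'raw-unicode-escape' codec defined here still ships with Python 3 and behaves as specified: only a ``\uXXXX`` with an odd number of leading backslashes is taken as an escape, everything else passes through by ordinal. A small sketch:]

```python
# A single backslash before uXXXX (odd count) triggers the escape;
# the remaining bytes map straight to the same ordinals.
raw = b"\\u0061 plain"
text = raw.decode("raw-unicode-escape")
```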
@@ -164,17 +169,21 @@ Unicode Constructors
 
 
 Unicode Type Object
+===================
 
 Unicode objects should have the type UnicodeType with type name
 'unicode', made available through the standard types module.
 
 
 Unicode Output
+==============
 
 Unicode objects have a method .encode([encoding=<default encoding>])
 which returns a Python string encoding the Unicode string using the
 given scheme (see Codecs).
 
+::
+
    print u := print u.encode() # using the <default encoding>
 
    str(u) := u.encode() # using the <default encoding>
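[ed. note: ``.encode()`` survives into Python 3 essentially as proposed, except that it now returns a ``bytes`` object and ``str(u)`` no longer encodes. A sketch of the surviving behaviour:]

```python
# Encoding a (Unicode) string into bytes with an explicit scheme.
u = "äbc"
as_utf8 = u.encode("utf-8")      # default scheme in Python 3
as_latin1 = u.encode("latin-1")  # one byte per character here
```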
@@ -186,10 +195,11 @@ Unicode Output
 
 
 Unicode Ordinals
+================
 
 Since Unicode 3.0 has a 32-bit ordinal character set, the
 implementation should provide 32-bit aware ordinal conversion
-APIs:
+APIs::
 
    ord(u[:1]) (this is the standard ord() extended to work with Unicode
                objects)
@@ -199,8 +209,8 @@ Unicode Ordinals
    --> Unicode object for character i (provided it is 32-bit);
        ValueError otherwise
 
-Both APIs should go into __builtins__ just like their string
-counterparts ord() and chr().
+Both APIs should go into ``__builtins__`` just like their string
+counterparts ``ord()`` and ``chr()``.
 
 Note that Unicode provides space for private encodings. Usage of
 these can cause different output representations on different
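[ed. note: in Python 3 the built-ins ``ord()`` and ``chr()`` are exactly the 32-bit-aware pair asked for here (``unichr()`` is gone); ``chr()`` accepts any ordinal up to U+10FFFF and raises ``ValueError`` beyond that. A sketch:]

```python
# Round-tripping a code point outside the 16-bit range.
cp = 0x1F40D
snake = chr(cp)

# Out-of-range ordinals raise ValueError, as the proposal requires.
try:
    chr(0x110000)
    overflow_raised = False
except ValueError:
    overflow_raised = True
```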
@@ -209,6 +219,7 @@ Unicode Ordinals
 
 
 Comparison & Hash Value
+=======================
 
 Unicode objects should compare equal to other objects after these
 other objects have been coerced to Unicode. For strings this
@@ -220,10 +231,10 @@ Comparison & Hash Value
 not guaranteed to return the same hash values as the default
 encoded equivalent string representation.
 
-When compared using cmp() (or PyObject_Compare()) the
-implementation should mask TypeErrors raised during the conversion
+When compared using ``cmp()`` (or ``PyObject_Compare()``) the
+implementation should mask ``TypeErrors`` raised during the conversion
 to remain in synch with the string behavior. All other errors
-such as ValueErrors raised during coercion of strings to Unicode
+such as ``ValueErrors`` raised during coercion of strings to Unicode
 should not be masked and passed through to the user.
 
 In containment tests ('a' in u'abc' and u'a' in 'abc') both sides
@@ -233,11 +244,14 @@ Comparison & Hash Value
 
 
 Coercion
+========
 
 Using Python strings and Unicode objects to form new objects
 should always coerce to the more precise format, i.e. Unicode
 objects.
 
+::
+
    u + s := u + unicode(s)
 
    s + u := unicode(s) + u
@@ -247,6 +261,8 @@ Coercion
 Unicode and then applying the arguments to the Unicode method of
 the same name, e.g.
 
+::
+
    string.join((s,u),sep) := (s + sep) + u
 
    sep.join((s,u)) := (s + sep) + u
@@ -256,17 +272,19 @@ Coercion
 
 
 Exceptions
+==========
 
-UnicodeError is defined in the exceptions module as a subclass of
-ValueError. It is available at the C level via
-PyExc_UnicodeError. All exceptions related to Unicode
-encoding/decoding should be subclasses of UnicodeError.
+``UnicodeError`` is defined in the exceptions module as a subclass of
+``ValueError``. It is available at the C level via
+``PyExc_UnicodeError``. All exceptions related to Unicode
+encoding/decoding should be subclasses of ``UnicodeError``.
 
 
 Codecs (Coder/Decoders) Lookup
+==============================
 
 A Codec (see Codec Interface Definition) search registry should be
-implemented by a module "codecs":
+implemented by a module "codecs"::
 
    codecs.register(search_function)
 
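[ed. note: ``codecs.register()`` exists today with the interface proposed here, except that the search function now returns a ``codecs.CodecInfo`` object (the successor of the 4-tuple) or ``None``. A minimal sketch with a hypothetical codec name, reusing the stock latin-1 functions purely for illustration:]

```python
import codecs

def search(name):
    # Return a CodecInfo for the one name we know, None otherwise so
    # that the registry moves on to the next search function.
    if name == "identity":
        info = codecs.lookup("latin-1")
        return codecs.CodecInfo(info.encode, info.decode, name="identity")
    return None

codecs.register(search)

# The registered name is now usable anywhere an encoding name is accepted.
data = codecs.decode(b"abc", "identity")
```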
@@ -276,22 +294,20 @@ Codecs (Coder/Decoders) Lookup
 (encoder, decoder, stream_reader, stream_writer) taking the
 following arguments:
 
-encoder and decoder:
-
+encoder and decoder
    These must be functions or methods which have the same
-   interface as the .encode/.decode methods of Codec instances
+   interface as the ``.encode``/``.decode`` methods of Codec instances
    (see Codec Interface). The functions/methods are expected to
    work in a stateless mode.
 
-stream_reader and stream_writer:
-
+stream_reader and stream_writer
    These need to be factory functions with the following
-   interface:
+   interface::
 
        factory(stream,errors='strict')
 
    The factory functions must return objects providing the
-   interfaces defined by StreamWriter/StreamReader resp. (see
+   interfaces defined by ``StreamWriter``/``StreamReader`` resp. (see
    Codec Interface). Stream codecs can maintain state.
 
 Possible values for errors are defined in the Codec section
@@ -309,24 +325,27 @@ Codecs (Coder/Decoders) Lookup
 codecs tuple is found, a LookupError is raised. Otherwise, the
 codecs tuple is stored in the cache and returned to the caller.
 
-To query the Codec instance the following API should be used:
+To query the Codec instance the following API should be used::
 
    codecs.lookup(encoding)
 
 This will either return the found codecs tuple or raise a
-LookupError.
+``LookupError``.
 
 
 Standard Codecs
+===============
 
 Standard codecs should live inside an encodings/ package directory
-in the Standard Python Code Library. The __init__.py file of that
+in the Standard Python Code Library. The ``__init__.py`` file of that
 directory should include a Codec Lookup compatible search function
 implementing a lazy module based codec lookup.
 
 Python should provide a few standard codecs for the most relevant
 encodings, e.g.
 
+::
+
    'utf-8': 8-bit variable length encoding
    'utf-16': 16-bit variable length encoding (little/big endian)
    'utf-16-le': utf-16 but explicitly little endian
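[ed. note: ``codecs.lookup()`` works today exactly as described, with the four entries exposed as attributes of the returned ``CodecInfo``; names are case-insensitive on input and unknown names raise ``LookupError``. A sketch:]

```python
import codecs

# Case-insensitive lookup; the canonical lower-case name is reported back.
info = codecs.lookup("UTF-8")
text, consumed = info.decode(b"abc")

# Unknown encodings raise LookupError rather than returning None.
try:
    codecs.lookup("no-such-encoding")
    lookup_failed = False
except LookupError:
    lookup_failed = True
```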
@@ -350,6 +369,7 @@ Standard Codecs
 
 
 Codecs Interface Definition
+===========================
 
 The following base class should be defined in the module "codecs".
 They provide not only templates for use by encoding module
@@ -358,15 +378,17 @@ Codecs Interface Definition
 
 Note that the Codec Interface defined here is well suitable for a
 larger range of applications. The Unicode implementation expects
-Unicode objects on input for .encode() and .write() and character
-buffer compatible objects on input for .decode(). Output of
-.encode() and .read() should be a Python string and .decode() must
+Unicode objects on input for ``.encode()`` and ``.write()`` and character
+buffer compatible objects on input for ``.decode()``. Output of
+``.encode()`` and ``.read()`` should be a Python string and ``.decode()`` must
 return an Unicode object.
 
 First, we have the stateless encoders/decoders. These do not work
 in chunks as the stream codecs (see below) do, because all
 components are expected to be available in memory.
 
+::
+
    class Codec:
 
        """Defines the interface for stateless encoders/decoders.
@@ -415,14 +437,16 @@ Codecs Interface Definition
 
        """
 
-StreamWriter and StreamReader define the interface for stateful
+``StreamWriter`` and ``StreamReader`` define the interface for stateful
 encoders/decoders which work on streams. These allow processing
 of the data in chunks to efficiently use memory. If you have
-large strings in memory, you may want to wrap them with cStringIO
+large strings in memory, you may want to wrap them with ``cStringIO``
 objects and then use these codecs on them to be able to do chunk
 processing as well, e.g. to provide progress information to the
 user.
 
+::
+
    class StreamWriter(Codec):
 
        def __init__(self,stream,errors='strict'):
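[ed. note: the ``factory(stream, errors='strict')`` interface survives as ``codecs.getwriter()`` in modern Python; a chunked-write sketch using ``io.BytesIO`` as the stand-in byte stream (``cStringIO`` no longer exists):]

```python
import codecs
import io

# Wrap a byte stream in a stateful UTF-8 StreamWriter and feed it
# string chunks; the writer encodes each chunk onto the stream.
raw = io.BytesIO()
writer = codecs.getwriter("utf-8")(raw, errors="strict")
for chunk in ("Hello ", "wörld"):
    writer.write(chunk)

encoded = raw.getvalue()
```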
@@ -593,8 +617,8 @@ Codecs Interface Definition
            return getattr(self.stream,name)
 
 
-Stream codec implementors are free to combine the StreamWriter and
-StreamReader interfaces into one class. Even combining all these
+Stream codec implementors are free to combine the ``StreamWriter`` and
+``StreamReader`` interfaces into one class. Even combining all these
 with the Codec class should be possible.
 
 Implementors are free to add additional methods to enhance the
@@ -616,12 +640,14 @@ Codecs Interface Definition
 
 
 Whitespace
+==========
 
-The .split() method will have to know about what is considered
+The ``.split()`` method will have to know about what is considered
 whitespace in Unicode.
 
 
 Case Conversion
+===============
 
 Case conversion is rather complicated with Unicode data, since
 there are many different conditions to respect. See
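[ed. note: today's ``str.split()`` is Unicode-aware as required here; with no separator argument it breaks on any character the Unicode database classifies as whitespace, not only ASCII. A sketch:]

```python
# NO-BREAK SPACE (U+00A0) and EM SPACE (U+2003) both count as
# whitespace for the default split.
parts = "one\u00a0two\u2003three".split()
```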
@@ -635,24 +661,26 @@ Case Conversion
 (see the Unicode standard file SpecialCasing.txt) should be left
 to user land routines and not go into the core interpreter.
 
-The methods .capitalize() and .iscapitalized() should follow the
+The methods ``.capitalize()`` and ``.iscapitalized()`` should follow the
 case mapping algorithm defined in the above technical report as
 closely as possible.
 
 
 Line Breaks
+===========
 
 Line breaking should be done for all Unicode characters having the
 B property as well as the combinations CRLF, CR, LF (interpreted
 in that order) and other special line separators defined by the
 standard.
 
-The Unicode type should provide a .splitlines() method which
+The Unicode type should provide a ``.splitlines()`` method which
 returns a list of lines according to the above specification. See
 Unicode Methods.
 
 
 Unicode Character Properties
+============================
 
 A separate module "unicodedata" should provide a compact interface
 to all Unicode character properties defined in the standard's
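[ed. note: both proposals landed and still work as specified: ``.splitlines()`` honours the special Unicode line separators, and the ``unicodedata`` module exposes the character database. A combined sketch:]

```python
import unicodedata

# LINE SEPARATOR (U+2028) breaks a line just like \n and \r\n do.
lines = "a\nb\r\nc\u2028d".splitlines()

# unicodedata reports the general category that makes this so:
# 'Zl' is the line-separator category.
cat = unicodedata.category("\u2028")
```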
@@ -675,14 +703,16 @@ Unicode Character Properties
 
 
 Private Code Point Areas
+========================
 
 Support for these is left to user land Codecs and not explicitly
 integrated into the core. Note that due to the Internal Format
-being implemented, only the area between \uE000 and \uF8FF is
+being implemented, only the area between ``\uE000`` and ``\uF8FF`` is
 usable for private encodings.
 
 
 Internal Format
+===============
 
 The internal format for Unicode objects should use a Python
 specific fixed format <PythonUnicode> implemented as 'unsigned
@@ -720,10 +750,10 @@ Internal Format
 Interning is not needed (for now), since Python identifiers are
 defined as being ASCII only.
 
-codecs.BOM should return the byte order mark (BOM) for the format
+``codecs.BOM`` should return the byte order mark (BOM) for the format
 used internally. The codecs module should provide the following
-additional constants for convenience and reference (codecs.BOM
-will either be BOM_BE or BOM_LE depending on the platform):
+additional constants for convenience and reference (``codecs.BOM``
+will either be ``BOM_BE`` or ``BOM_LE`` depending on the platform)::
 
    BOM_BE: '\376\377'
    (corresponds to Unicode U+0000FEFF in UTF-16 on big endian
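[ed. note: these constants exist in the ``codecs`` module today, as ``bytes`` objects; ``codecs.BOM`` matches whichever of the two fits the native byte order. A sketch:]

```python
import codecs

# The UTF-16 byte order marks proposed above.
big = codecs.BOM_BE       # '\376\377' in the PEP's octal notation
little = codecs.BOM_LE    # '\377\376'
```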
@@ -744,19 +774,20 @@ Internal Format
 format, hence the illegal character definition.
 
 The configure script should provide aid in deciding whether Python
-can use the native wchar_t type or not (it has to be a 16-bit
+can use the native ``wchar_t`` type or not (it has to be a 16-bit
 unsigned type).
 
 
 Buffer Interface
+================
 
 Implement the buffer interface using the <defenc> Python string
-object as basis for bf_getcharbuf and the internal buffer for
-bf_getreadbuf. If bf_getcharbuf is requested and the <defenc>
+object as basis for ``bf_getcharbuf`` and the internal buffer for
+``bf_getreadbuf``. If ``bf_getcharbuf`` is requested and the <defenc>
 object does not yet exist, it is created first.
 
 Note that as special case, the parser marker "s#" will not return
-raw Unicode UTF-16 data (which the bf_getreadbuf returns), but
+raw Unicode UTF-16 data (which the ``bf_getreadbuf`` returns), but
 instead tries to encode the Unicode object using the default
 encoding and then returns a pointer to the resulting string object
 (or raises an exception in case the conversion fails). This was
@@ -768,13 +799,14 @@ Buffer Interface
 specification of the encoding to use.
 
 If you need to access the read buffer interface of Unicode
-objects, use the PyObject_AsReadBuffer() interface.
+objects, use the ``PyObject_AsReadBuffer()`` interface.
 
 The internal format can also be accessed using the
-'unicode-internal' codec, e.g. via u.encode('unicode-internal').
+'unicode-internal' codec, e.g. via ``u.encode('unicode-internal')``.
 
 
 Pickle/Marshalling
+==================
 
 Should have native Unicode object support. The objects should be
 encoded using platform independent encodings.
@@ -786,6 +818,7 @@ Pickle/Marshalling
 
 
 Regular Expressions
+===================
 
 Secret Labs AB is working on a Unicode-aware regular expression
 machinery. It works on plain 8-bit, UCS-2, and (optionally) UCS-4
@@ -799,10 +832,11 @@ Regular Expressions
 
 
 Formatting Markers
+==================
 
 Format markers are used in Python format strings. If Python
 strings are used as format strings, the following interpretations
-should be in effect:
+should be in effect::
 
    '%s': For Unicode objects this will cause coercion of the
          whole format string to Unicode. Note that you should use
@@ -814,71 +848,78 @@ Formatting Markers
 according to the format string. Numbers are first converted to
 strings and then to Unicode.
 
+::
+
    '%s': Python strings are interpreted as Unicode
          string using the <default encoding>. Unicode objects are
          taken as is.
 
 All other string formatters should work accordingly.
 
-Example:
+Example::
 
    u"%s %s" % (u"abc", "abc") == u"abc abc"
 
 
 Internal Argument Parsing
+=========================
 
-These markers are used by the PyArg_ParseTuple() APIs:
+These markers are used by the ``PyArg_ParseTuple()`` APIs:
 
-"U": Check for Unicode object and return a pointer to it
+"U"
+   Check for Unicode object and return a pointer to it
 
-"s": For Unicode objects: return a pointer to the object's
+"s"
+   For Unicode objects: return a pointer to the object's
    <defenc> buffer (which uses the <default encoding>).
 
-"s#": Access to the default encoded version of the Unicode object
+"s#"
+   Access to the default encoded version of the Unicode object
    (see Buffer Interface); note that the length relates to
    the length of the default encoded string rather than the
    Unicode object length.
 
-"t#": Same as "s#".
+"t#"
+   Same as "s#".
 
-"es":
-   Takes two parameters: encoding (const char *) and buffer
-   (char **).
+"es"
+   Takes two parameters: encoding (``const char *``) and buffer
+   (``char **``).
 
    The input object is first coerced to Unicode in the usual
    way and then encoded into a string using the given
    encoding.
 
    On output, a buffer of the needed size is allocated and
-   returned through *buffer as NULL-terminated string. The
+   returned through ``*buffer`` as NULL-terminated string. The
    encoded may not contain embedded NULL characters. The
-   caller is responsible for calling PyMem_Free() to free the
-   allocated *buffer after usage.
+   caller is responsible for calling ``PyMem_Free()`` to free the
+   allocated ``*buffer`` after usage.
 
-"es#":
-   Takes three parameters: encoding (const char *), buffer
-   (char **) and buffer_len (int *).
+"es#"
+   Takes three parameters: encoding (``const char *``), buffer
+   (``char **``) and buffer_len (``int *``).
 
    The input object is first coerced to Unicode in the usual
    way and then encoded into a string using the given
    encoding.
 
-   If *buffer is non-NULL, *buffer_len must be set to
-   sizeof(buffer) on input. Output is then copied to *buffer.
+   If ``*buffer`` is non-NULL, ``*buffer_len`` must be set to
+   ``sizeof(buffer)`` on input. Output is then copied to ``*buffer``.
 
-   If *buffer is NULL, a buffer of the needed size is
-   allocated and output copied into it. *buffer is then
+   If ``*buffer`` is NULL, a buffer of the needed size is
+   allocated and output copied into it. ``*buffer`` is then
    updated to point to the allocated memory area. The caller
-   is responsible for calling PyMem_Free() to free the
-   allocated *buffer after usage.
+   is responsible for calling ``PyMem_Free()`` to free the
+   allocated ``*buffer`` after usage.
 
-   In both cases *buffer_len is updated to the number of
+   In both cases ``*buffer_len`` is updated to the number of
    characters written (excluding the trailing NULL-byte).
    The output buffer is assured to be NULL-terminated.
 
 Examples:
 
-Using "es#" with auto-allocation:
+Using "es#" with auto-allocation::
 
    static PyObject *
    test_parser(PyObject *self,
@@ -902,7 +943,7 @@ Internal Argument Parsing
        return str;
    }
 
-Using "es" with auto-allocation returning a NULL-terminated string:
+Using "es" with auto-allocation returning a NULL-terminated string::
 
    static PyObject *
    test_parser(PyObject *self,
@@ -925,7 +966,7 @@ Internal Argument Parsing
        return str;
    }
 
-Using "es#" with a pre-allocated buffer:
+Using "es#" with a pre-allocated buffer::
 
    static PyObject *
    test_parser(PyObject *self,
@@ -951,6 +992,7 @@ Internal Argument Parsing
 
 
 File/Stream Output
+==================
 
 Since file.write(object) and most other stream writers use the
 "s#" or "t#" argument parsing marker for querying the data to
@@ -966,6 +1008,7 @@ File/Stream Output
 
 
 File/Stream Input
+=================
 
 Only the user knows what encoding the input data uses, so no
 special magic is applied. The user will have to explicitly
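[ed. note: this "only the user knows the encoding" rule is how Python still works; a sketch of explicit decoding of a byte stream, with ``io.TextIOWrapper`` as the modern counterpart of the stream readers proposed in this PEP and ``io.BytesIO`` standing in for a file:]

```python
import io

# The byte stream carries latin-1 data; nothing in Python guesses that,
# so the reader is told explicitly.
raw = io.BytesIO("hällo\n".encode("latin-1"))
reader = io.TextIOWrapper(raw, encoding="latin-1")
text = reader.read()
```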
@@ -975,8 +1018,9 @@ File/Stream Input
 
 
 Unicode Methods & Attributes
+============================
 
-All Python string methods, plus:
+All Python string methods, plus::
 
    .encode([encoding=<default encoding>][,errors="strict"])
    --> see Unicode Output
@@ -989,6 +1033,7 @@ Unicode Methods & Attributes
 
 
 Code Base
+=========
 
 We should use Fredrik Lundh's Unicode object implementation as
 basis. It already implements most of the string methods needed
@@ -999,6 +1044,7 @@ Code Base
 
 
 Test Cases
+==========
 
 Test cases should follow those in Lib/test/test_string.py and
 include additional checks for the Codec Registry and the Standard
@@ -1006,130 +1052,205 @@ Test Cases
 
 
 References
+==========
 
-Unicode Consortium:
-   http://www.unicode.org/
+* Unicode Consortium: http://www.unicode.org/
 
-Unicode FAQ:
-   http://www.unicode.org/unicode/faq/
+* Unicode FAQ: http://www.unicode.org/unicode/faq/
 
-Unicode 3.0:
-   http://www.unicode.org/unicode/standard/versions/Unicode3.0.html
+* Unicode 3.0: http://www.unicode.org/unicode/standard/versions/Unicode3.0.html
 
-Unicode-TechReports:
-   http://www.unicode.org/unicode/reports/techreports.html
+* Unicode-TechReports: http://www.unicode.org/unicode/reports/techreports.html
 
-Unicode-Mappings:
-   ftp://ftp.unicode.org/Public/MAPPINGS/
+* Unicode-Mappings: ftp://ftp.unicode.org/Public/MAPPINGS/
 
-Introduction to Unicode (a little outdated by still nice to read):
+* Introduction to Unicode (a little outdated but still nice to read):
   http://www.nada.kth.se/i18n/ucs/unicode-iso10646-oview.html
 
-For comparison:
+* For comparison:
   Introducing Unicode to ECMAScript (aka JavaScript) --
   http://www-4.ibm.com/software/developer/library/internationalization-support.html
 
-IANA Character Set Names:
+* IANA Character Set Names:
   ftp://ftp.isi.edu/in-notes/iana/assignments/character-sets
 
-Discussion of UTF-8 and Unicode support for POSIX and Linux:
+* Discussion of UTF-8 and Unicode support for POSIX and Linux:
   http://www.cl.cam.ac.uk/~mgk25/unicode.html
 
-Encodings:
+* Encodings:
 
-   Overview:
-      http://czyborra.com/utf/
+  * Overview: http://czyborra.com/utf/
 
-   UCS-2:
-      http://www.uazone.org/multiling/unicode/ucs2.html
+  * UCS-2: http://www.uazone.org/multiling/unicode/ucs2.html
 
-   UTF-7:
-      Defined in RFC2152, e.g.
+  * UTF-7: Defined in RFC2152, e.g.
     http://www.uazone.org/multiling/ml-docs/rfc2152.txt
 
-   UTF-8:
-      Defined in RFC2279, e.g.
+  * UTF-8: Defined in RFC2279, e.g.
     https://tools.ietf.org/html/rfc2279
 
-   UTF-16:
-      http://www.uazone.org/multiling/unicode/wg2n1035.html
+  * UTF-16: http://www.uazone.org/multiling/unicode/wg2n1035.html
 
 
 History of this Proposal
+========================
 
 [ed. note: revisions prior to 1.7 are available in the CVS history
 of Misc/unicode.txt from the standard Python distribution. All
 subsequent history is available via the CVS revisions on this
 file.]
 
-1.7: Added note about the changed behaviour of "s#".
-1.6: Changed <defencstr> to <defenc> since this is the name used in the
-     implementation. Added notes about the usage of <defenc> in
+1.7
+---
+
+* Added note about the changed behaviour of "s#".
+
+1.6
+---
+
+* Changed <defencstr> to <defenc> since this is the name used in the
+  implementation.
+* Added notes about the usage of <defenc> in
  the buffer protocol implementation.
-1.5: Added notes about setting the <default encoding>. Fixed some
-     typos (thanks to Andrew Kuchling). Changed <defencstr> to
-     <utf8str>.
-1.4: Added note about mixed type comparisons and contains tests.
-     Changed treating of Unicode objects in format strings (if
-     used with '%s' % u they will now cause the format string to
+
+1.5
+---
+
+* Added notes about setting the <default encoding>.
+* Fixed some typos (thanks to Andrew Kuchling).
+* Changed <defencstr> to <utf8str>.
+
+1.4
+---
+
+* Added note about mixed type comparisons and contains tests.
+* Changed treating of Unicode objects in format strings (if
+  used with ``'%s' % u`` they will now cause the format string to
  be coerced to Unicode, thus producing a Unicode object on
-     return). Added link to IANA charset names (thanks to Lars
-     Marius Garshol). Added new codec methods .readline(),
-     .readlines() and .writelines().
-1.3: Added new "es" and "es#" parser markers
-1.2: Removed POD about codecs.open()
-1.1: Added note about comparisons and hash values. Added note about
-     case mapping algorithms. Changed stream codecs .read() and
-     .write() method to match the standard file-like object
+  return).
+* Added link to IANA charset names (thanks to Lars
+  Marius Garshol).
+* Added new codec methods ``.readline()``,
+  ``.readlines()`` and ``.writelines()``.
+
+1.3
+---
+
+* Added new "es" and "es#" parser markers
+
+1.2
+---
+
+* Removed POD about ``codecs.open()``
+
+1.1
+---
+
+* Added note about comparisons and hash values.
+* Added note about case mapping algorithms.
+* Changed stream codecs ``.read()`` and ``.write()`` method
+  to match the standard file-like object
  methods (bytes consumed information is no longer returned by
  the methods)
-1.0: changed encode Codec method to be symmetric to the decode method
+
+1.0
+---
+
+* changed encode Codec method to be symmetric to the decode method
  (they both return (object, data consumed) now and thus become
-     interchangeable); removed __init__ method of Codec class (the
+  interchangeable);
+* removed ``__init__`` method of Codec class (the
  methods are stateless) and moved the errors argument down to
the methods; made the Codec design more generic w/r to type
|
||||
of input and output objects; changed StreamWriter.flush to
|
||||
StreamWriter.reset in order to avoid overriding the stream's
|
||||
.flush() method; renamed .breaklines() to .splitlines();
|
||||
renamed the module unicodec to codecs; modified the File I/O
|
||||
section to refer to the stream codecs.
|
||||
0.9: changed errors keyword argument definition; added 'replace' error
|
||||
handling; changed the codec APIs to accept buffer like
|
||||
objects on input; some minor typo fixes; added Whitespace
|
||||
section and included references for Unicode characters that
|
||||
have the whitespace and the line break characteristic; added
|
||||
note that search functions can expect lower-case encoding
|
||||
names; dropped slicing and offsets in the codec APIs
|
||||
0.8: added encodings package and raw unicode escape encoding; untabified
|
||||
the proposal; added notes on Unicode format strings; added
|
||||
.breaklines() method
|
||||
0.7: added a whole new set of codec APIs; added a different
|
||||
encoder lookup scheme; fixed some names
|
||||
0.6: changed "s#" to "t#"; changed <defencbuf> to <defencstr> holding
|
||||
a real Python string object; changed Buffer Interface to
|
||||
delegate requests to <defencstr>'s buffer interface; removed
|
||||
the explicit reference to the unicodec.codecs dictionary (the
|
||||
the methods;
|
||||
* made the Codec design more generic w/r to type
|
||||
of input and output objects;
|
||||
* changed ``StreamWriter.flush`` to ``StreamWriter.reset`` in order to
|
||||
avoid overriding the stream's ``.flush()`` method;
|
||||
* renamed ``.breaklines()`` to ``.splitlines()``;
|
||||
* renamed the module unicodec to codecs;
|
||||
* modified the File I/O section to refer to the stream codecs.
|
||||
|
||||
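The symmetric encode/decode design adopted in 1.0 is still visible in the ``codecs`` module today. As an illustration (written in modern Python 3 syntax rather than the Python 2.0-era API this PEP targeted), both directions return an ``(object, length consumed)`` pair:

```python
import codecs

# Look up the stateless encode/decode functions for a codec.
encode = codecs.getencoder("utf-8")
decode = codecs.getdecoder("utf-8")

# Both calls return a (result, length_consumed) pair, which is
# what makes the two directions symmetric and interchangeable.
data, consumed = encode("abc")
text, used = decode(data)

print((data, consumed))  # (b'abc', 3)
print((text, used))      # ('abc', 3)
```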
0.9
---

* changed errors keyword argument definition;

* added 'replace' error handling;

* changed the codec APIs to accept buffer like
  objects on input;

* some minor typo fixes;

* added Whitespace section and included references for Unicode characters
  that have the whitespace and the line break characteristic;

* added note that search functions can expect lower-case encoding names;

* dropped slicing and offsets in the codec APIs

0.8
---

* added encodings package and raw unicode escape encoding;

* untabified the proposal;

* added notes on Unicode format strings;

* added ``.breaklines()`` method

0.7
---

* added a whole new set of codec APIs;

* added a different encoder lookup scheme;

* fixed some names

0.6
---

* changed "s#" to "t#";

* changed <defencbuf> to <defencstr> holding
  a real Python string object;

* changed Buffer Interface to
  delegate requests to <defencstr>'s buffer interface;

* removed the explicit reference to the unicodec.codecs dictionary (the
  module can implement this in a way fit for the purpose);

* removed the settable default encoding;

* moved ``UnicodeError`` from unicodec to exceptions;

* "s#" now returns the internal data;

* passed the UCS-2/UTF-16 checking from the Unicode constructor
  to the Codecs

0.5
---

* moved ``sys.bom`` to ``unicodec.BOM``;

* added sections on case mapping, private use encodings and
  Unicode character properties

0.4
---

* added Codec interface, notes on %-formatting;

* changed some encoding details;

* added comments on stream wrappers;

* fixed some discussion points (most important: Internal Format);

* clarified the 'unicode-escape' encoding, added encoding
  references

0.3
---

* added references, comments on codec modules, the internal format,
  bf_getcharbuffer and the RE engine;

* added 'unicode-escape'
  encoding proposed by Tim Peters and fixed repr(u) accordingly

0.2
---

* integrated Guido's suggestions, added stream codecs and file wrapping


0.1
---

* first version

..
   Local Variables:
   mode: indented-text
   indent-tabs-mode: nil