Fix lay-out glitches and remove gmail turd.

2008-06-02 22:26:21 +00:00 · 9934e842de
parent 564e85c33f
commit 9934e842de
1 changed file with 64 additions and 65 deletions
--- a/pep-3138.txt
+++ b/pep-3138.txt
@@ -29,20 +29,20 @@ algorithm.
 - Convert CR, LF, TAB and '\\' to '\\r', '\\n', '\\t', '\\\\'.

 - Convert other non-printable characters(0x00-0x1f, 0x7f) and non-ASCII
- characters(>=0x80) to '\\xXX'.
+  characters(>=0x80) to '\\xXX'.

 - Backslash-escape quote characters (apostrophe, ') and add the quote
- character at the beginning and the end.
+  character at the beginning and the end.

 For Unicode strings, the following additional conversions are done.

 - Convert leading surrogate pair characters without trailing character
- (0xd800-0xdbff, but not followed by 0xdc00-0xdfff) to '\\uXXXX'.
+  (0xd800-0xdbff, but not followed by 0xdc00-0xdfff) to '\\uXXXX'.

 - Convert 16-bit characters(>=0x100) to '\\uXXXX'.

 - Convert 21-bit characters(>=0x10000) and surrogate pair characters to
- '\\U00xxxxxx'.
+  '\\U00xxxxxx'.

 This algorithm converts any string to printable ASCII, and repr() is
 used as a handy and safe way to print strings for debugging or for
@@ -75,19 +75,19 @@ Specification
 =============

 - Add a new function to the Python C API ``int Py_UNICODE_ISPRINTABLE
- (Py_UNICODE ch)``. This function returns 0 if repr() should escape the
- Unicode character ``ch``; otherwise it returns 1. Characters that should
- be escaped are defined in the Unicode character database as:
+  (Py_UNICODE ch)``. This function returns 0 if repr() should escape the
+  Unicode character ``ch``; otherwise it returns 1. Characters that should
+  be escaped are defined in the Unicode character database as:

- * Cc (Other, Control)
- * Cf (Other, Format)
- * Cs (Other, Surrogate)
- * Co (Other, Private Use)
- * Cn (Other, Not Assigned)
- * Zl (Separator, Line), refers to LINE SEPARATOR ('\\u2028').
- * Zp (Separator, Paragraph), refers to PARAGRAPH SEPARATOR ('\\u2029').
- * Zs (Separator, Space) other than ASCII space('\\x20'). Characters in
-   this category should be escaped to avoid ambiguity.
+  * Cc (Other, Control)
+  * Cf (Other, Format)
+  * Cs (Other, Surrogate)
+  * Co (Other, Private Use)
+  * Cn (Other, Not Assigned)
+  * Zl (Separator, Line), refers to LINE SEPARATOR ('\\u2028').
+  * Zp (Separator, Paragraph), refers to PARAGRAPH SEPARATOR ('\\u2029').
+  * Zs (Separator, Space) other than ASCII space('\\x20'). Characters in
+    this category should be escaped to avoid ambiguity.

 - The algorithm to build repr() strings should be changed to:

@@ -105,22 +105,22 @@ Specification
   character at the beginning and the end.

 - Set the Unicode error-handler for sys.stderr to 'backslashreplace' by
- default.
+  default.

 - Add ``'%a'`` string format operator. ``'%a'`` converts any python
- object to a string using repr() and then hex-escapes all non-ASCII
- characters. The ``'%a'`` format operator generates the same string as
- ``'%r'`` in Python 2.
+  object to a string using repr() and then hex-escapes all non-ASCII
+  characters. The ``'%a'`` format operator generates the same string as
+  ``'%r'`` in Python 2.

 - Add a new built-in function, ``ascii()``. This function converts any
- python object to a string using repr() and then hex-escapes all non-
- ASCII characters. ``ascii()`` generates the same string as ``repr()``
- in Python 2.
+  python object to a string using repr() and then hex-escapes all non-
+  ASCII characters. ``ascii()`` generates the same string as ``repr()``
+  in Python 2.

 - Add an ``isprintable()`` method to the string type. ``str.isprintable()``
- returns False if repr() should escape any character in the string;
- otherwise returns True. The ``isprintable()`` method calls the
- `` Py_UNICODE_ISPRINTABLE()`` function internally.
+  returns False if repr() should escape any character in the string;
+  otherwise returns True. The ``isprintable()`` method calls the
+  `` Py_UNICODE_ISPRINTABLE()`` function internally.


 Rationale
@@ -157,38 +157,38 @@ suggestions were made.

 - Supply a tool to print lists or dicts.

- Strings to be printed for debugging are not only contained by lists or
- dicts, but also in many other types of object. File objects contain a
- file name in Unicode, exception objects contain a message in Unicode,
- etc. These strings should be printed in readable form when repr()ed.
- It is unlikely to be possible to implement a tool to print all
- possible object types.
+  Strings to be printed for debugging are not only contained by lists or
+  dicts, but also in many other types of object. File objects contain a
+  file name in Unicode, exception objects contain a message in Unicode,
+  etc. These strings should be printed in readable form when repr()ed.
+  It is unlikely to be possible to implement a tool to print all
+  possible object types.

 - Use sys.displayhook and sys.excepthook.

- For interactive sessions, we can write hooks to restore hex escaped
- characters to the original characters. But these hooks are called only
- when printing the result of evaluating an expression entered in an
- interactive Python session, and doesn't work for the print() function,
- for non-interactive sessions or for logging.debug("%r", ...), etc.
+  For interactive sessions, we can write hooks to restore hex escaped
+  characters to the original characters. But these hooks are called only
+  when printing the result of evaluating an expression entered in an
+  interactive Python session, and doesn't work for the print() function,
+  for non-interactive sessions or for logging.debug("%r", ...), etc.

 - Subclass sys.stdout and sys.stderr.

- It is difficult to implement a subclass to restore hex-escaped
- characters since there isn't enough information left by the time it's
- a string to undo the escaping correctly in all cases. For example, ``
- print("\\"+"u0041")`` should be printed as '\\u0041', not 'A'. But
- there is no chance to tell file objects apart.
+  It is difficult to implement a subclass to restore hex-escaped
+  characters since there isn't enough information left by the time it's
+  a string to undo the escaping correctly in all cases. For example, ``
+  print("\\"+"u0041")`` should be printed as '\\u0041', not 'A'. But
+  there is no chance to tell file objects apart.

 - Make the encoding used by unicode_repr() adjustable, and make the
- existing repr() the default.
+  existing repr() the default.

- With adjustable repr(), the result of using repr() is unpredictable
- and would make it impossible to write correct code involving repr().
- And if current repr() is the default, then the old convention remains
- intact and users may expect ASCII strings as the result of repr().
- Third party applications or libraries could be confused when a custom
- repr() function is used.
+  With adjustable repr(), the result of using repr() is unpredictable
+  and would make it impossible to write correct code involving repr().
+  And if current repr() is the default, then the old convention remains
+  intact and users may expect ASCII strings as the result of repr().
+  Third party applications or libraries could be confused when a custom
+  repr() function is used.


 Backwards Compatibility
@@ -234,37 +234,36 @@ Open Issues
 ===========

 - Is the ``ascii()`` function necessary, or is it sufficient to document
- how to do it? If necessary, should ``ascii()`` belong to the builtin
- namespace?
+  how to do it? If necessary, should ``ascii()`` belong to the builtin
+  namespace?


 Rejected Proposals
 ==================

 - Add encoding and errors arguments to the builtin print() function,
- with defaults of sys.getfilesystemencoding() and 'backslashreplace'.
+  with defaults of sys.getfilesystemencoding() and 'backslashreplace'.

- Complicated to implement, and in general, this is not seen as a good
- idea. [2]_
+  Complicated to implement, and in general, this is not seen as a good
+  idea. [2]_

 - Use character names to escape characters, instead of hex character
- codes. For example, ``repr('\u03b1')`` can be converted to
- ``"\N{GREEK SMALL LETTER ALPHA}"``.
+  codes. For example, ``repr('\u03b1')`` can be converted to
+  ``"\N{GREEK SMALL LETTER ALPHA}"``.

- Using character names can be very verbose compared to hex-escape.
- e.g., ``repr("\ufbf9")`` is converted to ``"\N{ARABIC LIGATURE UIGHUR
- KIRGHIZ YEH WITH HAMZA ABOVE WITH ALEF MAKSURA ISOLATED FORM}"``.
+  Using character names can be very verbose compared to hex-escape.
+  e.g., ``repr("\ufbf9")`` is converted to ``"\N{ARABIC LIGATURE UIGHUR
+  KIRGHIZ YEH WITH HAMZA ABOVE WITH ALEF MAKSURA ISOLATED FORM}"``.

 - Default error-handler of sys.stdout should be 'backslashreplace'.

- Stuff written to stdout might be consumed by another program that
- might misinterpret the \ escapes. For interactive session, it is
- possible to make 'backslashreplace' error-handler to default, but may
- add confusion of the kind "it works in interactive mode but not when
- redirecting to a file".
+  Stuff written to stdout might be consumed by another program that
+  might misinterpret the \ escapes. For interactive session, it is
+  possible to make 'backslashreplace' error-handler to default, but may
+  add confusion of the kind "it works in interactive mode but not when
+  redirecting to a file".


- Hide quoted text -
 Reference Implementation
 ========================