New spec for newline= parameter to open() and TextIOBase().
parent 42966cb9b0
commit 911566c261
65
pep-3116.txt
@@ -342,20 +342,55 @@ signature:
 ``.__init__(self, buffer, encoding=None, newline=None)``

 ``buffer`` is a reference to the ``BufferedIOBase`` object to
-be wrapped with the ``TextIOWrapper``. ``encoding`` refers to
-an encoding to be used for translating between the
-byte-representation and character-representation. If it is
-``None``, then the system's locale setting will be used as the
-default. ``newline`` can be ``None``, ``'\n'``, ``'\r'``, or
-``'\r\n'`` (all other values are illegal); it indicates the
-translation for ``'\n'`` characters written. If ``None``, a
-system-specific default is chosen, i.e., ``'\r\n'`` on Windows
-and ``'\n'`` on Unix/Linux. Setting ``newline='\n'`` on input
-means that no CRLF translation is done; lines ending in
-``'\r\n'`` will be returned as ``'\r\n'``. (``'\r'`` support
-is still needed for some OSX applications that produce files
-using ``'\r'`` line endings; Excel (when exporting to text)
-and Adobe Illustrator EPS files are the most common examples.
+be wrapped with the ``TextIOWrapper``.
+
+``encoding`` refers to an encoding to be used for translating
+between the byte-representation and character-representation.
+If it is ``None``, then the system's locale setting will be
+used as the default.
+
+``newline`` can be ``None``, ``''``, ``'\n'``, ``'\r'``, or
+``'\r\n'``; all other values are illegal. It controls the
+handling of line endings. It works as follows:
+
+* On input, if ``newline`` is ``None``, universal newlines
+  mode is enabled. Lines in the input can end in ``'\n'``,
+  ``'\r'``, or ``'\r\n'``, and these are translated into
+  ``'\n'`` before being returned to the caller. If it is
+  ``''``, universal newline mode is enabled, but line endings
+  are returned to the caller untranslated. If it has any of
+  the other legal values, input lines are only terminated by
+  the given string, and the line ending is returned to the
+  caller translated to ``'\n'``.
+
+* On output, if ``newline`` is ``None``, any ``'\n'``
+  characters written are translated to the system default
+  line separator, ``os.linesep``. If ``newline`` is ``''``,
+  no translation takes place. If ``newline`` is any of the
+  other legal values, any ``'\n'`` characters written are
+  translated to the given string.
+
+Further notes on the ``newline`` parameter:
+
+* ``'\r'`` support is still needed for some OSX applications
+  that produce files using ``'\r'`` line endings; Excel (when
+  exporting to text) and Adobe Illustrator EPS files are the
+  most common examples.
+
+* If translation is enabled, it happens regardless of which
+  method is called for reading or writing. For example,
+  {{{f.read()}}} will always produce the same result as
+  {{{''.join(f.readlines())}}}.
+
+* If universal newlines without translation are requested on
+  input (i.e. ``newline=''``), if a system read operation
+  returns a buffer ending in ``'\r'``, another system read
+  operation is done to determine whether it is followed by
+  ``'\n'`` or not. In universal newlines mode with
+  translation, the second system read operation may be
+  postponed until the next read request, and if the following
+  system read operation returns a buffer starting with
+  ``'\n'``, that character is simply discarded.

 Another implementation, ``StringIO``, creates a file-like ``TextIO``
 implementation without an underlying Buffered I/O object. While
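
Review note: the input and output rules added in this hunk can be tried out with a small sketch. It assumes the ``io`` module shipped with current Python 3 as a stand-in for the classes this PEP specifies; the helper names (``read_lines``, ``read_all``, ``written_bytes``) and the sample data are hypothetical. Only ``newline=None`` and ``newline=''`` are exercised on input, because the shipped ``io`` module returns line endings untranslated for ``'\n'``, ``'\r'``, and ``'\r\n'``, which differs from the wording in this revision.

```python
import io

# Hypothetical sample mixing all three line-ending conventions.
SAMPLE = b"one\ntwo\rthree\r\nfour"


def read_lines(data, newline):
    """Decode data through a TextIOWrapper and return its lines."""
    wrapper = io.TextIOWrapper(io.BytesIO(data), encoding="ascii", newline=newline)
    return wrapper.readlines()


def read_all(data, newline):
    """Decode data through a TextIOWrapper and return it as one string."""
    wrapper = io.TextIOWrapper(io.BytesIO(data), encoding="ascii", newline=newline)
    return wrapper.read()


def written_bytes(text, newline):
    """Write text through a TextIOWrapper and return the resulting bytes."""
    buffer = io.BytesIO()
    wrapper = io.TextIOWrapper(buffer, encoding="ascii", newline=newline)
    wrapper.write(text)
    wrapper.flush()  # push the encoded text into the underlying buffer
    return buffer.getvalue()


# Input, newline=None: universal newlines, every ending becomes '\n'.
translated = read_lines(SAMPLE, None)

# Input, newline='': universal newlines, endings returned as found.
untranslated = read_lines(SAMPLE, "")

# Output, newline='\r\n': each '\n' written becomes '\r\n'.
crlf_output = written_bytes("a\nb\n", "\r\n")

# Output, newline='': no translation takes place.
raw_output = written_bytes("a\nb\n", "")
```

With ``newline=None`` the three endings in ``SAMPLE`` all come back as ``'\n'``, while ``newline=''`` still splits at each of them but leaves the bytes intact. In both modes ``read_all`` agrees with joining the result of ``read_lines``, which is the consistency the second note in the hunk asks for.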
@@ -422,7 +457,7 @@ pseudo-code::
 assert isinstance(mode, str)
 assert buffering is None or isinstance(buffering, int)
 assert encoding is None or isinstance(encoding, str)
-assert newline in (None, "\n", "\r", "\r\n")
+assert newline in (None, "", "\n", "\r", "\r\n")
 modes = set(mode)
 if modes - set("arwb+t") or len(mode) > len(modes):
     raise ValueError("invalid mode: %r" % mode)
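
Review note: the argument checks touched by this hunk can be run in isolation as below. This is a minimal sketch, not the PEP's ``open()``: the function name ``check_open_args`` is hypothetical, and it raises ``ValueError`` for an illegal ``newline`` where the pseudo-code uses ``assert``, purely so the failure is easy to catch.

```python
LEGAL_NEWLINES = (None, "", "\n", "\r", "\r\n")


def check_open_args(mode="r", buffering=None, encoding=None, newline=None):
    """Hypothetical helper mirroring the checks in the open() pseudo-code."""
    assert isinstance(mode, str)
    assert buffering is None or isinstance(buffering, int)
    assert encoding is None or isinstance(encoding, str)
    # The empty string is accepted as of this change.
    if newline not in LEGAL_NEWLINES:
        raise ValueError("illegal newline value: %r" % (newline,))
    modes = set(mode)
    # Reject unknown mode characters and repeated ones such as "rr".
    if modes - set("arwb+t") or len(mode) > len(modes):
        raise ValueError("invalid mode: %r" % mode)
    return modes
```

Calling ``check_open_args("rb", newline="")`` passes with this revision's tuple, whereas the previous tuple would have rejected the empty string.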