<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="utf-8">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<meta name="color-scheme" content="light dark">
<title>PEP 597: Add optional EncodingWarning | peps.python.org</title>
<link rel="shortcut icon" href="../_static/py.png">
<link rel="canonical" href="https://peps.python.org/pep-0597/">
<link rel="stylesheet" href="../_static/style.css" type="text/css">
<link rel="stylesheet" href="../_static/mq.css" type="text/css">
<link rel="stylesheet" href="../_static/pygments.css" type="text/css" media="(prefers-color-scheme: light)" id="pyg-light">
<link rel="stylesheet" href="../_static/pygments_dark.css" type="text/css" media="(prefers-color-scheme: dark)" id="pyg-dark">
<link rel="alternate" type="application/rss+xml" title="Latest PEPs" href="https://peps.python.org/peps.rss">
<meta property="og:title" content='PEP 597: Add optional EncodingWarning | peps.python.org'>
<meta property="og:description" content="Add a new warning category EncodingWarning. It is emitted when the encoding argument to open() is omitted and the default locale-specific encoding is used.">
<meta property="og:type" content="website">
<meta property="og:url" content="https://peps.python.org/pep-0597/">
<meta property="og:site_name" content="Python Enhancement Proposals (PEPs)">
<meta property="og:image" content="https://peps.python.org/_static/og-image.png">
<meta property="og:image:alt" content="Python PEPs">
<meta property="og:image:width" content="200">
<meta property="og:image:height" content="200">
<meta name="description" content="Add a new warning category EncodingWarning. It is emitted when the encoding argument to open() is omitted and the default locale-specific encoding is used.">
<meta name="theme-color" content="#3776ab">
</head>
<body>
<svg xmlns="http://www.w3.org/2000/svg" style="display: none;">
<symbol id="svg-sun-half" viewBox="0 0 24 24" pointer-events="all">
<title>Following system colour scheme</title>
<svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 24 24" fill="none"
stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round">
<circle cx="12" cy="12" r="9"></circle>
<path d="M12 3v18m0-12l4.65-4.65M12 14.3l7.37-7.37M12 19.6l8.85-8.85"></path>
</svg>
</symbol>
<symbol id="svg-moon" viewBox="0 0 24 24" pointer-events="all">
<title>Selected dark colour scheme</title>
<svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 24 24" fill="none"
stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round">
<path stroke="none" d="M0 0h24v24H0z" fill="none"></path>
<path d="M12 3c.132 0 .263 0 .393 0a7.5 7.5 0 0 0 7.92 12.446a9 9 0 1 1 -8.313 -12.454z"></path>
</svg>
</symbol>
<symbol id="svg-sun" viewBox="0 0 24 24" pointer-events="all">
<title>Selected light colour scheme</title>
<svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 24 24" fill="none"
stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round">
<circle cx="12" cy="12" r="5"></circle>
<line x1="12" y1="1" x2="12" y2="3"></line>
<line x1="12" y1="21" x2="12" y2="23"></line>
<line x1="4.22" y1="4.22" x2="5.64" y2="5.64"></line>
<line x1="18.36" y1="18.36" x2="19.78" y2="19.78"></line>
<line x1="1" y1="12" x2="3" y2="12"></line>
<line x1="21" y1="12" x2="23" y2="12"></line>
<line x1="4.22" y1="19.78" x2="5.64" y2="18.36"></line>
<line x1="18.36" y1="5.64" x2="19.78" y2="4.22"></line>
</svg>
</symbol>
</svg>
<script>
document.documentElement.dataset.colour_scheme = localStorage.getItem("colour_scheme") || "auto"
</script>
<section id="pep-page-section">
<header>
<h1>Python Enhancement Proposals</h1>
</header>
<article>
<section id="pep-content">
<h1 class="page-title">PEP 597: Add optional EncodingWarning</h1>
<dl class="rfc2822 field-list simple">
<dt class="field-odd">Author<span class="colon">:</span></dt>
<dd class="field-odd">Inada Naoki &lt;songofacandy&#32;&#97;t&#32;gmail.com&gt;</dd>
<dt class="field-even">Status<span class="colon">:</span></dt>
<dd class="field-even"><abbr title="Accepted and implementation complete, or no longer active">Final</abbr></dd>
<dt class="field-odd">Type<span class="colon">:</span></dt>
<dd class="field-odd"><abbr title="Normative PEP with a new feature for Python, implementation change for CPython or interoperability standard for the ecosystem">Standards Track</abbr></dd>
<dt class="field-even">Created<span class="colon">:</span></dt>
<dd class="field-even">05-Jun-2019</dd>
<dt class="field-odd">Python-Version<span class="colon">:</span></dt>
<dd class="field-odd">3.10</dd>
</dl>
<hr class="docutils" />
<section id="contents">
<details><summary>Table of Contents</summary><ul class="simple">
<li><a class="reference internal" href="#abstract">Abstract</a></li>
<li><a class="reference internal" href="#motivation">Motivation</a><ul>
<li><a class="reference internal" href="#using-the-default-encoding-is-a-common-mistake">Using the default encoding is a common mistake</a></li>
<li><a class="reference internal" href="#explicit-way-to-use-locale-specific-encoding">Explicit way to use locale-specific encoding</a></li>
<li><a class="reference internal" href="#prepare-to-change-the-default-encoding-to-utf-8">Prepare to change the default encoding to UTF-8</a></li>
</ul>
</li>
<li><a class="reference internal" href="#specification">Specification</a><ul>
<li><a class="reference internal" href="#encodingwarning"><code class="docutils literal notranslate"><span class="pre">EncodingWarning</span></code></a></li>
<li><a class="reference internal" href="#options-to-enable-the-warning">Options to enable the warning</a></li>
<li><a class="reference internal" href="#encoding-locale"><code class="docutils literal notranslate"><span class="pre">encoding=&quot;locale&quot;</span></code></a></li>
<li><a class="reference internal" href="#io-text-encoding"><code class="docutils literal notranslate"><span class="pre">io.text_encoding()</span></code></a></li>
<li><a class="reference internal" href="#affected-standard-library-modules">Affected standard library modules</a></li>
</ul>
</li>
<li><a class="reference internal" href="#rationale">Rationale</a><ul>
<li><a class="reference internal" href="#opt-in-warning">Opt-in warning</a></li>
<li><a class="reference internal" href="#locale-is-not-a-codec-alias">“locale” is not a codec alias</a></li>
</ul>
</li>
<li><a class="reference internal" href="#backward-compatibility">Backward Compatibility</a></li>
<li><a class="reference internal" href="#forward-compatibility">Forward Compatibility</a></li>
<li><a class="reference internal" href="#how-to-teach-this">How to Teach This</a><ul>
<li><a class="reference internal" href="#for-new-users">For new users</a></li>
<li><a class="reference internal" href="#for-experienced-users">For experienced users</a></li>
</ul>
</li>
<li><a class="reference internal" href="#reference-implementation">Reference Implementation</a></li>
<li><a class="reference internal" href="#discussions">Discussions</a></li>
<li><a class="reference internal" href="#references">References</a></li>
<li><a class="reference internal" href="#copyright">Copyright</a></li>
</ul>
</details></section>
<section id="abstract">
<h2><a class="toc-backref" href="#abstract" role="doc-backlink">Abstract</a></h2>
<p>Add a new warning category <code class="docutils literal notranslate"><span class="pre">EncodingWarning</span></code>. It is emitted when the
<code class="docutils literal notranslate"><span class="pre">encoding</span></code> argument to <code class="docutils literal notranslate"><span class="pre">open()</span></code> is omitted and the default
locale-specific encoding is used.</p>
<p>The warning is disabled by default. A new <code class="docutils literal notranslate"><span class="pre">-X</span> <span class="pre">warn_default_encoding</span></code>
command-line option and a new <code class="docutils literal notranslate"><span class="pre">PYTHONWARNDEFAULTENCODING</span></code> environment
variable can be used to enable it.</p>
<p>A <code class="docutils literal notranslate"><span class="pre">&quot;locale&quot;</span></code> argument value for <code class="docutils literal notranslate"><span class="pre">encoding</span></code> is added too. It
explicitly specifies that the locale encoding should be used, silencing
the warning.</p>
</section>
<section id="motivation">
<h2><a class="toc-backref" href="#motivation" role="doc-backlink">Motivation</a></h2>
<section id="using-the-default-encoding-is-a-common-mistake">
<h3><a class="toc-backref" href="#using-the-default-encoding-is-a-common-mistake" role="doc-backlink">Using the default encoding is a common mistake</a></h3>
<p>Developers using macOS or Linux may forget that the default encoding
is not always UTF-8.</p>
<p>For example, using <code class="docutils literal notranslate"><span class="pre">long_description</span> <span class="pre">=</span> <span class="pre">open(&quot;README.md&quot;).read()</span></code> in
<code class="docutils literal notranslate"><span class="pre">setup.py</span></code> is a common mistake. Many Windows users cannot install
such packages if there is at least one non-ASCII character
(e.g. emoji, author names, copyright symbols, and the like)
in their UTF-8-encoded <code class="docutils literal notranslate"><span class="pre">README.md</span></code> file.</p>
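<p>A minimal sketch of how the <code class="docutils literal notranslate"><span class="pre">setup.py</span></code> pattern above could be made
independent of the locale, assuming the README is stored as UTF-8. The helper
name <code class="docutils literal notranslate"><span class="pre">read_long_description</span></code> is hypothetical and used only for illustration.</p>

```python
from pathlib import Path


def read_long_description(path="README.md"):
    """Read a UTF-8 README regardless of the platform's locale encoding."""
    # Passing encoding explicitly avoids the locale-dependent default
    # that open() and Path.read_text() fall back to when it is omitted.
    return Path(path).read_text(encoding="utf-8")
```

<p>With the encoding spelled out, the same bytes decode to the same text on
Windows, macOS, and Linux.</p>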
<p>Of the 4000 most downloaded packages from PyPI, 489 use non-ASCII
characters in their README, and 82 fail to install from source on
non-UTF-8 locales due to not specifying an encoding for a non-ASCII
file. <a class="footnote-reference brackets" href="#id10" id="id1">[1]</a></p>
<p>Another example is <code class="docutils literal notranslate"><span class="pre">logging.basicConfig(filename=&quot;log.txt&quot;)</span></code>.
Some users might expect it to use UTF-8 by default, but the locale
encoding is actually what is used. <a class="footnote-reference brackets" href="#id11" id="id2">[2]</a></p>
<p>Even Python experts may assume that the default encoding is UTF-8.
This creates bugs that only happen on Windows; see <a class="footnote-reference brackets" href="#id12" id="id3">[3]</a>, <a class="footnote-reference brackets" href="#id13" id="id4">[4]</a>, <a class="footnote-reference brackets" href="#id14" id="id5">[5]</a>,
and <a class="footnote-reference brackets" href="#id15" id="id6">[6]</a> for example.</p>
<p>Emitting a warning when the <code class="docutils literal notranslate"><span class="pre">encoding</span></code> argument is omitted will help
find such mistakes.</p>
</section>
<section id="explicit-way-to-use-locale-specific-encoding">
<h3><a class="toc-backref" href="#explicit-way-to-use-locale-specific-encoding" role="doc-backlink">Explicit way to use locale-specific encoding</a></h3>
<p><code class="docutils literal notranslate"><span class="pre">open(filename)</span></code> isn’t explicit about which encoding is expected:</p>
<ul class="simple">
<li>If ASCII is assumed, this isn’t a bug, but may result in decreased
performance on Windows, particularly with non-Latin-1 locale encodings</li>
<li>If UTF-8 is assumed, this may be a bug, or the script may be
platform-specific</li>
<li>If the locale encoding is assumed, the behavior is as expected
(but could change if future versions of Python modify the default)</li>
</ul>
<p>From this point of view, <code class="docutils literal notranslate"><span class="pre">open(filename)</span></code> is not readable code.</p>
<p><code class="docutils literal notranslate"><span class="pre">encoding=locale.getpreferredencoding(False)</span></code> can be used to
specify the locale encoding explicitly, but it is too long and easy
to misuse (e.g. one can forget to pass <code class="docutils literal notranslate"><span class="pre">False</span></code> as its argument).</p>
<p>This PEP provides an explicit way to specify the locale encoding.</p>
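<p>For comparison, the longer spelling mentioned above might look like the
following sketch. The function name <code class="docutils literal notranslate"><span class="pre">open_in_locale_encoding</span></code> is a
hypothetical helper, not part of any library.</p>

```python
import locale


def open_in_locale_encoding(path, mode="r"):
    """Open a text file using the current locale encoding, spelled out."""
    # Passing False (do_setlocale) keeps the call from invoking setlocale();
    # this is the argument the text above describes as easy to forget.
    encoding = locale.getpreferredencoding(False)
    return open(path, mode, encoding=encoding)
```

<p>The call works, but every call site has to repeat the lookup and remember
the <code class="docutils literal notranslate"><span class="pre">False</span></code> argument, which is the verbosity this PEP aims to remove.</p>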
</section>
<section id="prepare-to-change-the-default-encoding-to-utf-8">
<h3><a class="toc-backref" href="#prepare-to-change-the-default-encoding-to-utf-8" role="doc-backlink">Prepare to change the default encoding to UTF-8</a></h3>
<p>Since UTF-8 has become the de-facto standard text encoding,
we might default to it for opening files in the future.</p>
<p>However, such a change will affect many applications and libraries.
If we start emitting <code class="docutils literal notranslate"><span class="pre">DeprecationWarning</span></code> everywhere the <code class="docutils literal notranslate"><span class="pre">encoding</span></code>
argument is omitted, it will be too noisy and painful.</p>
<p>Although this PEP doesn’t propose changing the default encoding,
it will help enable that change by:</p>
<ul class="simple">
<li>Reducing the number of omitted <code class="docutils literal notranslate"><span class="pre">encoding</span></code> arguments in libraries
before we start emitting a <code class="docutils literal notranslate"><span class="pre">DeprecationWarning</span></code> by default.</li>
<li>Allowing users to pass <code class="docutils literal notranslate"><span class="pre">encoding=&quot;locale&quot;</span></code> to suppress
the current warning and any <code class="docutils literal notranslate"><span class="pre">DeprecationWarning</span></code> added in the future,
as well as retaining consistent behavior if later Python versions
change the default, ensuring support for any Python version &gt;=3.10.</li>
</ul>
</section>
</section>
<section id="specification">
<h2><a class="toc-backref" href="#specification" role="doc-backlink">Specification</a></h2>
<section id="encodingwarning">
<h3><a class="toc-backref" href="#encodingwarning" role="doc-backlink"><code class="docutils literal notranslate"><span class="pre">EncodingWarning</span></code></a></h3>
<p>Add a new <code class="docutils literal notranslate"><span class="pre">EncodingWarning</span></code> warning class as a subclass of
<code class="docutils literal notranslate"><span class="pre">Warning</span></code>. It is emitted when the <code class="docutils literal notranslate"><span class="pre">encoding</span></code> argument is omitted and
the default locale-specific encoding is used.</p>
</section>
<section id="options-to-enable-the-warning">
<h3><a class="toc-backref" href="#options-to-enable-the-warning" role="doc-backlink">Options to enable the warning</a></h3>
<p>The <code class="docutils literal notranslate"><span class="pre">-X</span> <span class="pre">warn_default_encoding</span></code> option and the
<code class="docutils literal notranslate"><span class="pre">PYTHONWARNDEFAULTENCODING</span></code> environment variable are added. They
are used to enable <code class="docutils literal notranslate"><span class="pre">EncodingWarning</span></code>.</p>
<p><code class="docutils literal notranslate"><span class="pre">sys.flags.warn_default_encoding</span></code> is also added. The flag is true when
<code class="docutils literal notranslate"><span class="pre">EncodingWarning</span></code> is enabled.</p>
<p>When the flag is set, <code class="docutils literal notranslate"><span class="pre">io.TextIOWrapper()</span></code>, <code class="docutils literal notranslate"><span class="pre">open()</span></code> and other
modules using them will emit <code class="docutils literal notranslate"><span class="pre">EncodingWarning</span></code> when the <code class="docutils literal notranslate"><span class="pre">encoding</span></code>
argument is omitted.</p>
<p>Since <code class="docutils literal notranslate"><span class="pre">EncodingWarning</span></code> is a subclass of <code class="docutils literal notranslate"><span class="pre">Warning</span></code>, it is
shown by default (if the <code class="docutils literal notranslate"><span class="pre">warn_default_encoding</span></code> flag is set), unlike
<code class="docutils literal notranslate"><span class="pre">DeprecationWarning</span></code>.</p>
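<p>One way to observe the two switches is to start a child interpreter and
print its flag, as in the sketch below. The names <code class="docutils literal notranslate"><span class="pre">CHILD_CODE</span></code> and
<code class="docutils literal notranslate"><span class="pre">child_flag_value</span></code> are hypothetical. The <code class="docutils literal notranslate"><span class="pre">getattr()</span></code> fallback is an
assumption that lets the snippet also run on interpreters older than 3.10,
where the flag does not exist.</p>

```python
import os
import subprocess
import sys

# Code run by the child: print the flag as 0 or 1 (0 if the flag is missing).
CHILD_CODE = "import sys; print(int(getattr(sys.flags, 'warn_default_encoding', 0)))"


def child_flag_value(use_option=False, use_env=False):
    """Start a child interpreter and return its warn_default_encoding flag."""
    env = dict(os.environ)
    env.pop("PYTHONWARNDEFAULTENCODING", None)  # start from a clean state
    if use_env:
        env["PYTHONWARNDEFAULTENCODING"] = "1"
    cmd = [sys.executable]
    if use_option:
        cmd += ["-X", "warn_default_encoding"]
    cmd += ["-c", CHILD_CODE]
    result = subprocess.run(cmd, env=env, stdout=subprocess.PIPE, check=True)
    return int(result.stdout.decode("ascii").strip())
```

<p>On Python 3.10 or later, both <code class="docutils literal notranslate"><span class="pre">child_flag_value(use_option=True)</span></code> and
<code class="docutils literal notranslate"><span class="pre">child_flag_value(use_env=True)</span></code> are expected to report the flag as set,
while a plain call reports it as unset.</p>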
</section>
<section id="encoding-locale">
<h3><a class="toc-backref" href="#encoding-locale" role="doc-backlink"><code class="docutils literal notranslate"><span class="pre">encoding=&quot;locale&quot;</span></code></a></h3>
<p><code class="docutils literal notranslate"><span class="pre">io.TextIOWrapper</span></code> will accept <code class="docutils literal notranslate"><span class="pre">&quot;locale&quot;</span></code> as a valid argument to
<code class="docutils literal notranslate"><span class="pre">encoding</span></code>. It has the same meaning as the current <code class="docutils literal notranslate"><span class="pre">encoding=None</span></code>,
except that <code class="docutils literal notranslate"><span class="pre">io.TextIOWrapper</span></code> doesn’t emit <code class="docutils literal notranslate"><span class="pre">EncodingWarning</span></code> when
<code class="docutils literal notranslate"><span class="pre">encoding=&quot;locale&quot;</span></code> is specified.</p>
</section>
<section id="io-text-encoding">
<h3><a class="toc-backref" href="#io-text-encoding" role="doc-backlink"><code class="docutils literal notranslate"><span class="pre">io.text_encoding()</span></code></a></h3>
<p><code class="docutils literal notranslate"><span class="pre">io.text_encoding()</span></code> is a helper for functions with an
<code class="docutils literal notranslate"><span class="pre">encoding=None</span></code> parameter that pass it to <code class="docutils literal notranslate"><span class="pre">io.TextIOWrapper()</span></code> or
<code class="docutils literal notranslate"><span class="pre">open()</span></code>.</p>
<p>A pure Python implementation will look like this:</p>
<div class="highlight-default notranslate"><div class="highlight"><pre><span></span><span class="k">def</span> <span class="nf">text_encoding</span><span class="p">(</span><span class="n">encoding</span><span class="p">,</span> <span class="n">stacklevel</span><span class="o">=</span><span class="mi">1</span><span class="p">):</span>
<span class="w">    </span><span class="sd">&quot;&quot;&quot;A helper function to choose the text encoding.</span>
<span class="sd">    When *encoding* is not None, just return it.</span>
<span class="sd">    Otherwise, return the default text encoding (i.e. &quot;locale&quot;).</span>
<span class="sd">    This function emits an EncodingWarning if *encoding* is None and</span>
<span class="sd">    sys.flags.warn_default_encoding is true.</span>
<span class="sd">    This function can be used in APIs with an encoding=None parameter</span>
<span class="sd">    that pass it to TextIOWrapper or open.</span>
<span class="sd">    However, please consider using encoding=&quot;utf-8&quot; for new APIs.</span>
<span class="sd">    &quot;&quot;&quot;</span>
    <span class="k">if</span> <span class="n">encoding</span> <span class="ow">is</span> <span class="kc">None</span><span class="p">:</span>
        <span class="k">if</span> <span class="n">sys</span><span class="o">.</span><span class="n">flags</span><span class="o">.</span><span class="n">warn_default_encoding</span><span class="p">:</span>
            <span class="kn">import</span> <span class="nn">warnings</span>
            <span class="n">warnings</span><span class="o">.</span><span class="n">warn</span><span class="p">(</span>
                <span class="s2">&quot;&#39;encoding&#39; argument not specified.&quot;</span><span class="p">,</span>
                <span class="ne">EncodingWarning</span><span class="p">,</span> <span class="n">stacklevel</span> <span class="o">+</span> <span class="mi">2</span><span class="p">)</span>
        <span class="n">encoding</span> <span class="o">=</span> <span class="s2">&quot;locale&quot;</span>
    <span class="k">return</span> <span class="n">encoding</span>
</pre></div>
</div>
<p>For example, <code class="docutils literal notranslate"><span class="pre">pathlib.Path.read_text()</span></code> can use it like this:</p>
<div class="highlight-default notranslate"><div class="highlight"><pre><span></span><span class="k">def</span> <span class="nf">read_text</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">encoding</span><span class="o">=</span><span class="kc">None</span><span class="p">,</span> <span class="n">errors</span><span class="o">=</span><span class="kc">None</span><span class="p">):</span>
    <span class="n">encoding</span> <span class="o">=</span> <span class="n">io</span><span class="o">.</span><span class="n">text_encoding</span><span class="p">(</span><span class="n">encoding</span><span class="p">)</span>
    <span class="k">with</span> <span class="bp">self</span><span class="o">.</span><span class="n">open</span><span class="p">(</span><span class="n">mode</span><span class="o">=</span><span class="s1">&#39;r&#39;</span><span class="p">,</span> <span class="n">encoding</span><span class="o">=</span><span class="n">encoding</span><span class="p">,</span> <span class="n">errors</span><span class="o">=</span><span class="n">errors</span><span class="p">)</span> <span class="k">as</span> <span class="n">f</span><span class="p">:</span>
        <span class="k">return</span> <span class="n">f</span><span class="o">.</span><span class="n">read</span><span class="p">()</span>
</pre></div>
</div>
<p>By using <code class="docutils literal notranslate"><span class="pre">io.text_encoding()</span></code>, <code class="docutils literal notranslate"><span class="pre">EncodingWarning</span></code> is emitted for
the caller of <code class="docutils literal notranslate"><span class="pre">read_text()</span></code> instead of <code class="docutils literal notranslate"><span class="pre">read_text()</span></code> itself.</p>
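<p>The same pattern could be applied outside the standard library. The sketch
below assumes a hypothetical third-party function <code class="docutils literal notranslate"><span class="pre">read_settings()</span></code> that must
also run on Python versions older than 3.10, where <code class="docutils literal notranslate"><span class="pre">io.text_encoding()</span></code> is
not available. The <code class="docutils literal notranslate"><span class="pre">hasattr()</span></code> guard is an assumption added for that case.</p>

```python
import io


def read_settings(path, encoding=None):
    """Hypothetical library function that forwards *encoding* to open()."""
    # io.text_encoding() exists only on Python 3.10+; on older versions
    # encoding=None is passed through to open() unchanged.
    if hasattr(io, "text_encoding"):
        encoding = io.text_encoding(encoding)
    with open(path, encoding=encoding) as f:
        return f.read()
```

<p>When the warning is enabled, a caller that omits <code class="docutils literal notranslate"><span class="pre">encoding</span></code> is the one
reported, rather than the <code class="docutils literal notranslate"><span class="pre">open()</span></code> call inside the wrapper.</p>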
</section>
<section id="affected-standard-library-modules">
<h3><a class="toc-backref" href="#affected-standard-library-modules" role="doc-backlink">Affected standard library modules</a></h3>
<p>Many standard library modules will be affected by this change.</p>
<p>Most APIs accepting <code class="docutils literal notranslate"><span class="pre">encoding=None</span></code> will use <code class="docutils literal notranslate"><span class="pre">io.text_encoding()</span></code>
as written in the previous section.</p>
<p>Where using the locale encoding as the default encoding is reasonable,
<code class="docutils literal notranslate"><span class="pre">encoding=&quot;locale&quot;</span></code> will be used instead. For example,
the <code class="docutils literal notranslate"><span class="pre">subprocess</span></code> module will use the locale encoding as the default
for pipes.</p>
<p>Many tests use <code class="docutils literal notranslate"><span class="pre">open()</span></code> without <code class="docutils literal notranslate"><span class="pre">encoding</span></code> specified to read
ASCII text files. They should be rewritten with <code class="docutils literal notranslate"><span class="pre">encoding=&quot;ascii&quot;</span></code>.</p>
</section>
</section>
<section id="rationale">
<h2><a class="toc-backref" href="#rationale" role="doc-backlink">Rationale</a></h2>
<section id="opt-in-warning">
<h3><a class="toc-backref" href="#opt-in-warning" role="doc-backlink">Opt-in warning</a></h3>
<p>Although <code class="docutils literal notranslate"><span class="pre">DeprecationWarning</span></code> is suppressed by default, always
emitting <code class="docutils literal notranslate"><span class="pre">DeprecationWarning</span></code> when the <code class="docutils literal notranslate"><span class="pre">encoding</span></code> argument is
omitted would be too noisy.</p>
<p>Noisy warnings may lead developers to dismiss the
<code class="docutils literal notranslate"><span class="pre">DeprecationWarning</span></code>.</p>
</section>
<section id="locale-is-not-a-codec-alias">
<h3><a class="toc-backref" href="#locale-is-not-a-codec-alias" role="doc-backlink">“locale” is not a codec alias</a></h3>
<p>We don’t add “locale” as a codec alias because the locale can be
changed at runtime.</p>
<p>Additionally, <code class="docutils literal notranslate"><span class="pre">TextIOWrapper</span></code> checks <code class="docutils literal notranslate"><span class="pre">os.device_encoding()</span></code>
when <code class="docutils literal notranslate"><span class="pre">encoding=None</span></code>. This behavior cannot be implemented in
a codec.</p>
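<p>A small check can illustrate the distinction, assuming no extra codec
search function has been registered in the running interpreter. The helper
name <code class="docutils literal notranslate"><span class="pre">is_codec_name</span></code> is hypothetical.</p>

```python
import codecs


def is_codec_name(name):
    """Return True if *name* resolves to a registered codec."""
    try:
        codecs.lookup(name)
    except LookupError:
        return False
    return True
```

<p>Here <code class="docutils literal notranslate"><span class="pre">is_codec_name(&quot;utf-8&quot;)</span></code> is true, while <code class="docutils literal notranslate"><span class="pre">is_codec_name(&quot;locale&quot;)</span></code>
should be false, because <code class="docutils literal notranslate"><span class="pre">&quot;locale&quot;</span></code> is handled by <code class="docutils literal notranslate"><span class="pre">TextIOWrapper</span></code> itself
and not by the codec registry.</p>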
</section>
</section>
<section id="backward-compatibility">
<h2><a class="toc-backref" href="#backward-compatibility" role="doc-backlink">Backward Compatibility</a></h2>
<p>The new warning is not emitted by default, so this PEP is 100%
backwards-compatible.</p>
</section>
<section id="forward-compatibility">
<h2><a class="toc-backref" href="#forward-compatibility" role="doc-backlink">Forward Compatibility</a></h2>
<p>Passing <code class="docutils literal notranslate"><span class="pre">&quot;locale&quot;</span></code> as the argument to <code class="docutils literal notranslate"><span class="pre">encoding</span></code> is not
forward-compatible. Code using it will not work on Python older than
3.10, and will instead raise <code class="docutils literal notranslate"><span class="pre">LookupError:</span> <span class="pre">unknown</span> <span class="pre">encoding:</span> <span class="pre">locale</span></code>.</p>
<p>Until developers can drop Python 3.9 support, <code class="docutils literal notranslate"><span class="pre">EncodingWarning</span></code>
can only be used for finding missing <code class="docutils literal notranslate"><span class="pre">encoding=&quot;utf-8&quot;</span></code> arguments.</p>
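<p>Code that really needs the locale encoding and must still run on Python 3.9
or older could choose the spelling at run time. This is only a sketch, and
<code class="docutils literal notranslate"><span class="pre">explicit_locale_encoding</span></code> is a hypothetical helper name.</p>

```python
import locale
import sys


def explicit_locale_encoding():
    """Return an encoding argument that means "use the locale encoding"."""
    if sys.version_info >= (3, 10):
        # Understood by open() and io.TextIOWrapper since Python 3.10.
        return "locale"
    # Older versions raise LookupError for "locale", so name the codec.
    return locale.getpreferredencoding(False)
```

<p>A call such as <code class="docutils literal notranslate"><span class="pre">open(path,</span> <span class="pre">encoding=explicit_locale_encoding())</span></code> then
avoids the <code class="docutils literal notranslate"><span class="pre">LookupError</span></code> on older versions while staying explicit on
newer ones.</p>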
</section>
<section id="how-to-teach-this">
<h2><a class="toc-backref" href="#how-to-teach-this" role="doc-backlink">How to Teach This</a></h2>
<section id="for-new-users">
<h3><a class="toc-backref" href="#for-new-users" role="doc-backlink">For new users</a></h3>
<p>Since <code class="docutils literal notranslate"><span class="pre">EncodingWarning</span></code> is used to write cross-platform code,
there is no need to teach it to new users.</p>
<p>We can just recommend using UTF-8 for text files and using
<code class="docutils literal notranslate"><span class="pre">encoding=&quot;utf-8&quot;</span></code> when opening them.</p>
</section>
<section id="for-experienced-users">
<h3><a class="toc-backref" href="#for-experienced-users" role="doc-backlink">For experienced users</a></h3>
<p>Using <code class="docutils literal notranslate"><span class="pre">open(filename)</span></code> to read text files encoded in UTF-8 is a
common mistake. It may not work on Windows because UTF-8 is not the
default encoding.</p>
<p>You can use <code class="docutils literal notranslate"><span class="pre">-X</span> <span class="pre">warn_default_encoding</span></code> or
<code class="docutils literal notranslate"><span class="pre">PYTHONWARNDEFAULTENCODING=1</span></code> to find this type of mistake.</p>
<p>Omitting the <code class="docutils literal notranslate"><span class="pre">encoding</span></code> argument is not a bug when opening text files
encoded in the locale encoding, but <code class="docutils literal notranslate"><span class="pre">encoding=&quot;locale&quot;</span></code> is recommended
in Python 3.10 and later because it is more explicit.</p>
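<p>In a test suite, the warnings could be collected programmatically, as in
the sketch below. <code class="docutils literal notranslate"><span class="pre">collect_encoding_warnings</span></code> is a hypothetical helper. It only
finds something when the interpreter was started with
<code class="docutils literal notranslate"><span class="pre">-X</span> <span class="pre">warn_default_encoding</span></code> or <code class="docutils literal notranslate"><span class="pre">PYTHONWARNDEFAULTENCODING=1</span></code>, and it
returns an empty list on Python versions that lack <code class="docutils literal notranslate"><span class="pre">EncodingWarning</span></code>.</p>

```python
import builtins
import warnings


def collect_encoding_warnings(func, *args, **kwargs):
    """Call *func* and return any EncodingWarning records it triggered."""
    category = getattr(builtins, "EncodingWarning", None)
    if category is None:  # Python older than 3.10: nothing to collect
        func(*args, **kwargs)
        return []
    with warnings.catch_warnings(record=True) as caught:
        # Record every EncodingWarning, even repeated ones.
        warnings.simplefilter("always", category)
        func(*args, **kwargs)
    return [w for w in caught if issubclass(w.category, category)]
```

<p>A function that always passes an explicit <code class="docutils literal notranslate"><span class="pre">encoding</span></code> yields an empty
list whether or not the option is enabled.</p>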
</section>
</section>
<section id="reference-implementation">
<h2><a class="toc-backref" href="#reference-implementation" role="doc-backlink">Reference Implementation</a></h2>
<p><a class="reference external" href="https://github.com/python/cpython/pull/19481">https://github.com/python/cpython/pull/19481</a></p>
</section>
<section id="discussions">
<h2><a class="toc-backref" href="#discussions" role="doc-backlink">Discussions</a></h2>
<p>The latest discussion thread is:
<a class="reference external" href="https://mail.python.org/archives/list/python-dev&#64;python.org/thread/SFYUP2TWD5JZ5KDLVSTZ44GWKVY4YNCV/">https://mail.python.org/archives/list/python-dev&#64;python.org/thread/SFYUP2TWD5JZ5KDLVSTZ44GWKVY4YNCV/</a></p>
<ul class="simple">
<li>Why not implement this in linters?<ul>
<li><code class="docutils literal notranslate"><span class="pre">encoding=&quot;locale&quot;</span></code> and <code class="docutils literal notranslate"><span class="pre">io.text_encoding()</span></code> must be implemented
in Python.</li>
<li>It is difficult to find all callers of functions wrapping
<code class="docutils literal notranslate"><span class="pre">open()</span></code> or <code class="docutils literal notranslate"><span class="pre">TextIOWrapper()</span></code> (see the <code class="docutils literal notranslate"><span class="pre">io.text_encoding()</span></code>
section).</li>
</ul>
</li>
<li>Many developers will not use the option.<ul>
<li>Some will, and report the warnings to libraries they use,
so the option is worth it even if many developers don’t enable it.</li>
<li>For example, I found <a class="footnote-reference brackets" href="#id16" id="id7">[7]</a> and <a class="footnote-reference brackets" href="#id17" id="id8">[8]</a> by running
<code class="docutils literal notranslate"><span class="pre">pip</span> <span class="pre">install</span> <span class="pre">-U</span> <span class="pre">pip</span></code>, and <a class="footnote-reference brackets" href="#id18" id="id9">[9]</a> by running <code class="docutils literal notranslate"><span class="pre">tox</span></code>
with the reference implementation. This demonstrates how this
option can be used to find potential issues.</li>
</ul>
</li>
</ul>
</section>
<section id="references">
<h2><a class="toc-backref" href="#references" role="doc-backlink">References</a></h2>
<aside class="footnote-list brackets">
<aside class="footnote brackets" id="id10" role="doc-footnote">
<dt class="label" id="id10">[<a href="#id1">1</a>]</dt>
<dd>“Packages can’t be installed when encoding is not UTF-8”
(<a class="reference external" href="https://github.com/methane/pep597-pypi-ascii">https://github.com/methane/pep597-pypi-ascii</a>)</aside>
<aside class="footnote brackets" id="id11" role="doc-footnote">
<dt class="label" id="id11">[<a href="#id2">2</a>]</dt>
<dd>“Logging - Inconsistent behaviour when handling unicode”
(<a class="reference external" href="https://bugs.python.org/issue37111">https://bugs.python.org/issue37111</a>)</aside>
<aside class="footnote brackets" id="id12" role="doc-footnote">
<dt class="label" id="id12">[<a href="#id3">3</a>]</dt>
<dd>Packaging tutorial in packaging.python.org didn’t specify
encoding to read a <code class="docutils literal notranslate"><span class="pre">README.md</span></code>
(<a class="reference external" href="https://github.com/pypa/packaging.python.org/pull/682">https://github.com/pypa/packaging.python.org/pull/682</a>)</aside>
<aside class="footnote brackets" id="id13" role="doc-footnote">
<dt class="label" id="id13">[<a href="#id4">4</a>]</dt>
<dd><code class="docutils literal notranslate"><span class="pre">json.tool</span></code> used the locale encoding to read JSON files.
(<a class="reference external" href="https://bugs.python.org/issue33684">https://bugs.python.org/issue33684</a>)</aside>
<aside class="footnote brackets" id="id14" role="doc-footnote">
<dt class="label" id="id14">[<a href="#id5">5</a>]</dt>
<dd>site: Potential UnicodeDecodeError when handling pth file
(<a class="reference external" href="https://bugs.python.org/issue33684">https://bugs.python.org/issue33684</a>)</aside>
<aside class="footnote brackets" id="id15" role="doc-footnote">
<dt class="label" id="id15">[<a href="#id6">6</a>]</dt>
<dd>pypa/pip: “Installing packages fails if Python 3 installed
into path with non-ASCII characters”
(<a class="reference external" href="https://github.com/pypa/pip/issues/9054">https://github.com/pypa/pip/issues/9054</a>)</aside>
<aside class="footnote brackets" id="id16" role="doc-footnote">
<dt class="label" id="id16">[<a href="#id7">7</a>]</dt>
<dd>“site: Potential UnicodeDecodeError when handling pth file”
(<a class="reference external" href="https://bugs.python.org/issue43214">https://bugs.python.org/issue43214</a>)</aside>
<aside class="footnote brackets" id="id17" role="doc-footnote">
<dt class="label" id="id17">[<a href="#id8">8</a>]</dt>
<dd>“[pypa/pip] Use <code class="docutils literal notranslate"><span class="pre">encoding</span></code> option or binary mode for open()”
(<a class="reference external" href="https://github.com/pypa/pip/pull/9608">https://github.com/pypa/pip/pull/9608</a>)</aside>
<aside class="footnote brackets" id="id18" role="doc-footnote">
<dt class="label" id="id18">[<a href="#id9">9</a>]</dt>
<dd>“Possible UnicodeError caused by missing <code class="docutils literal notranslate"><span class="pre">encoding=&quot;utf-8&quot;</span></code>”
(<a class="reference external" href="https://github.com/tox-dev/tox/issues/1908">https://github.com/tox-dev/tox/issues/1908</a>)</aside>
</aside>
</section>
<section id="copyright">
<h2><a class="toc-backref" href="#copyright" role="doc-backlink">Copyright</a></h2>
<p>This document is placed in the public domain or under the
CC0-1.0-Universal license, whichever is more permissive.</p>
</section>
</section>
<hr class="docutils" />
<p>Source: <a class="reference external" href="https://github.com/python/peps/blob/main/peps/pep-0597.rst">https://github.com/python/peps/blob/main/peps/pep-0597.rst</a></p>
<p>Last modified: <a class="reference external" href="https://github.com/python/peps/commits/main/peps/pep-0597.rst">2023-09-09 17:39:29 GMT</a></p>
</article>
<nav id="pep-sidebar">
<h2>Contents</h2>
<ul>
<li><a class="reference internal" href="#abstract">Abstract</a></li>
<li><a class="reference internal" href="#motivation">Motivation</a><ul>
<li><a class="reference internal" href="#using-the-default-encoding-is-a-common-mistake">Using the default encoding is a common mistake</a></li>
<li><a class="reference internal" href="#explicit-way-to-use-locale-specific-encoding">Explicit way to use locale-specific encoding</a></li>
<li><a class="reference internal" href="#prepare-to-change-the-default-encoding-to-utf-8">Prepare to change the default encoding to UTF-8</a></li>
</ul>
</li>
<li><a class="reference internal" href="#specification">Specification</a><ul>
<li><a class="reference internal" href="#encodingwarning"><code class="docutils literal notranslate"><span class="pre">EncodingWarning</span></code></a></li>
<li><a class="reference internal" href="#options-to-enable-the-warning">Options to enable the warning</a></li>
<li><a class="reference internal" href="#encoding-locale"><code class="docutils literal notranslate"><span class="pre">encoding=&quot;locale&quot;</span></code></a></li>
<li><a class="reference internal" href="#io-text-encoding"><code class="docutils literal notranslate"><span class="pre">io.text_encoding()</span></code></a></li>
<li><a class="reference internal" href="#affected-standard-library-modules">Affected standard library modules</a></li>
</ul>
</li>
<li><a class="reference internal" href="#rationale">Rationale</a><ul>
<li><a class="reference internal" href="#opt-in-warning">Opt-in warning</a></li>
<li><a class="reference internal" href="#locale-is-not-a-codec-alias">“locale” is not a codec alias</a></li>
</ul>
</li>
<li><a class="reference internal" href="#backward-compatibility">Backward Compatibility</a></li>
<li><a class="reference internal" href="#forward-compatibility">Forward Compatibility</a></li>
<li><a class="reference internal" href="#how-to-teach-this">How to Teach This</a><ul>
<li><a class="reference internal" href="#for-new-users">For new users</a></li>
<li><a class="reference internal" href="#for-experienced-users">For experienced users</a></li>
</ul>
</li>
<li><a class="reference internal" href="#reference-implementation">Reference Implementation</a></li>
<li><a class="reference internal" href="#discussions">Discussions</a></li>
<li><a class="reference internal" href="#references">References</a></li>
<li><a class="reference internal" href="#copyright">Copyright</a></li>
</ul>
</nav>
</section>
<script src="../_static/colour_scheme.js"></script>
<script src="../_static/wrap_tables.js"></script>
<script src="../_static/sticky_banner.js"></script>
</body>
</html>