<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="utf-8">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<meta name="color-scheme" content="light dark">
<title>PEP 266 Optimizing Global Variable/Attribute Access | peps.python.org</title>
<link rel="shortcut icon" href="../_static/py.png">
<link rel="canonical" href="https://peps.python.org/pep-0266/">
<link rel="stylesheet" href="../_static/style.css" type="text/css">
<link rel="stylesheet" href="../_static/mq.css" type="text/css">
<link rel="stylesheet" href="../_static/pygments.css" type="text/css" media="(prefers-color-scheme: light)" id="pyg-light">
<link rel="stylesheet" href="../_static/pygments_dark.css" type="text/css" media="(prefers-color-scheme: dark)" id="pyg-dark">
<link rel="alternate" type="application/rss+xml" title="Latest PEPs" href="https://peps.python.org/peps.rss">
<meta property="og:title" content='PEP 266 Optimizing Global Variable/Attribute Access | peps.python.org'>
<meta property="og:description" content="The bindings for most global variables and attributes of other modules typically never change during the execution of a Python program, but because of Python’s dynamic nature, code which accesses such global objects must run through a full lookup each t...">
<meta property="og:type" content="website">
<meta property="og:url" content="https://peps.python.org/pep-0266/">
<meta property="og:site_name" content="Python Enhancement Proposals (PEPs)">
<meta property="og:image" content="https://peps.python.org/_static/og-image.png">
<meta property="og:image:alt" content="Python PEPs">
<meta property="og:image:width" content="200">
<meta property="og:image:height" content="200">
<meta name="description" content="The bindings for most global variables and attributes of other modules typically never change during the execution of a Python program, but because of Python’s dynamic nature, code which accesses such global objects must run through a full lookup each t...">
<meta name="theme-color" content="#3776ab">
</head>
<body>
<svg xmlns="http://www.w3.org/2000/svg" style="display: none;">
<symbol id="svg-sun-half" viewBox="0 0 24 24" pointer-events="all">
<title>Following system colour scheme</title>
<svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 24 24" fill="none"
stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round">
<circle cx="12" cy="12" r="9"></circle>
<path d="M12 3v18m0-12l4.65-4.65M12 14.3l7.37-7.37M12 19.6l8.85-8.85"></path>
</svg>
</symbol>
<symbol id="svg-moon" viewBox="0 0 24 24" pointer-events="all">
<title>Selected dark colour scheme</title>
<svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 24 24" fill="none"
stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round">
<path stroke="none" d="M0 0h24v24H0z" fill="none"></path>
<path d="M12 3c.132 0 .263 0 .393 0a7.5 7.5 0 0 0 7.92 12.446a9 9 0 1 1 -8.313 -12.454z"></path>
</svg>
</symbol>
<symbol id="svg-sun" viewBox="0 0 24 24" pointer-events="all">
<title>Selected light colour scheme</title>
<svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 24 24" fill="none"
stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round">
<circle cx="12" cy="12" r="5"></circle>
<line x1="12" y1="1" x2="12" y2="3"></line>
<line x1="12" y1="21" x2="12" y2="23"></line>
<line x1="4.22" y1="4.22" x2="5.64" y2="5.64"></line>
<line x1="18.36" y1="18.36" x2="19.78" y2="19.78"></line>
<line x1="1" y1="12" x2="3" y2="12"></line>
<line x1="21" y1="12" x2="23" y2="12"></line>
<line x1="4.22" y1="19.78" x2="5.64" y2="18.36"></line>
<line x1="18.36" y1="5.64" x2="19.78" y2="4.22"></line>
</svg>
</symbol>
</svg>
<script>
document.documentElement.dataset.colour_scheme = localStorage.getItem("colour_scheme") || "auto"
</script>
<section id="pep-page-section">
<header>
<h1>Python Enhancement Proposals</h1>
<ul class="breadcrumbs">
<li><a href="https://www.python.org/" title="The Python Programming Language">Python</a> &raquo; </li>
<li><a href="../pep-0000/">PEP Index</a> &raquo; </li>
<li>PEP 266</li>
</ul>
<button id="colour-scheme-cycler" onClick="setColourScheme(nextColourScheme())">
<svg aria-hidden="true" class="colour-scheme-icon-when-auto"><use href="#svg-sun-half"></use></svg>
<svg aria-hidden="true" class="colour-scheme-icon-when-dark"><use href="#svg-moon"></use></svg>
<svg aria-hidden="true" class="colour-scheme-icon-when-light"><use href="#svg-sun"></use></svg>
<span class="visually-hidden">Toggle light / dark / auto colour theme</span>
</button>
</header>
<article>
<section id="pep-content">
<h1 class="page-title">PEP 266 Optimizing Global Variable/Attribute Access</h1>
<dl class="rfc2822 field-list simple">
<dt class="field-odd">Author<span class="colon">:</span></dt>
<dd class="field-odd">Skip Montanaro &lt;skip&#32;&#97;t&#32;pobox.com&gt;</dd>
<dt class="field-even">Status<span class="colon">:</span></dt>
<dd class="field-even"><abbr title="Removed from consideration by sponsor or authors">Withdrawn</abbr></dd>
<dt class="field-odd">Type<span class="colon">:</span></dt>
<dd class="field-odd"><abbr title="Normative PEP with a new feature for Python, implementation change for CPython or interoperability standard for the ecosystem">Standards Track</abbr></dd>
<dt class="field-even">Created<span class="colon">:</span></dt>
<dd class="field-even">13-Aug-2001</dd>
<dt class="field-odd">Python-Version<span class="colon">:</span></dt>
<dd class="field-odd">2.3</dd>
<dt class="field-even">Post-History<span class="colon">:</span></dt>
<dd class="field-even"><p></p></dd>
</dl>
<hr class="docutils" />
<section id="contents">
<details><summary>Table of Contents</summary><ul class="simple">
<li><a class="reference internal" href="#abstract">Abstract</a></li>
<li><a class="reference internal" href="#introduction">Introduction</a></li>
<li><a class="reference internal" href="#proposed-change">Proposed Change</a></li>
<li><a class="reference internal" href="#threads">Threads</a></li>
<li><a class="reference internal" href="#rationale">Rationale</a></li>
<li><a class="reference internal" href="#questions">Questions</a><ul>
<li><a class="reference internal" href="#what-about-threads-what-if-math-sin-changes-while-in-cache">What about threads? What if <code class="docutils literal notranslate"><span class="pre">math.sin</span></code> changes while in cache?</a></li>
</ul>
</li>
<li><a class="reference internal" href="#unresolved-issues">Unresolved Issues</a><ul>
<li><a class="reference internal" href="#threading">Threading</a></li>
<li><a class="reference internal" href="#nested-scopes">Nested Scopes</a></li>
<li><a class="reference internal" href="#missing-attributes">Missing Attributes</a></li>
<li><a class="reference internal" href="#who-does-the-dirty-work">Who does the dirty work?</a></li>
</ul>
</li>
<li><a class="reference internal" href="#discussion">Discussion</a></li>
<li><a class="reference internal" href="#backwards-compatibility">Backwards Compatibility</a></li>
<li><a class="reference internal" href="#implementation">Implementation</a></li>
<li><a class="reference internal" href="#performance">Performance</a></li>
<li><a class="reference internal" href="#references">References</a></li>
<li><a class="reference internal" href="#copyright">Copyright</a></li>
</ul>
</details></section>
<section id="abstract">
<h2><a class="toc-backref" href="#abstract" role="doc-backlink">Abstract</a></h2>
<p>The bindings for most global variables and attributes of other modules
typically never change during the execution of a Python program, but because
of Python’s dynamic nature, code which accesses such global objects must run
through a full lookup each time the object is needed. This PEP proposes a
mechanism that allows code that accesses most global objects to treat them as
local objects and places the burden of updating references on the code that
changes the name bindings of such objects.</p>
</section>
<section id="introduction">
<h2><a class="toc-backref" href="#introduction" role="doc-backlink">Introduction</a></h2>
<p>Consider the workhorse function <code class="docutils literal notranslate"><span class="pre">sre_compile._compile</span></code>. It is the internal
compilation function for the <code class="docutils literal notranslate"><span class="pre">sre</span></code> module. It consists almost entirely of a
loop over the elements of the pattern being compiled, comparing opcodes with
known constant values and appending tokens to an output list. Most of the
comparisons are with constants imported from the <code class="docutils literal notranslate"><span class="pre">sre_constants</span></code> module.
This means there are lots of <code class="docutils literal notranslate"><span class="pre">LOAD_GLOBAL</span></code> bytecodes in the compiled output
of this module. Just by reading the code it’s apparent that the author
intended <code class="docutils literal notranslate"><span class="pre">LITERAL</span></code>, <code class="docutils literal notranslate"><span class="pre">NOT_LITERAL</span></code>, <code class="docutils literal notranslate"><span class="pre">OPCODES</span></code> and many other symbols to
be constants. Still, each time they are involved in an expression, they must
be looked up anew.</p>
<p>Most global accesses are actually to objects that are “almost constants”.
This includes global variables in the current module as well as the attributes
of other imported modules. Since they rarely change, it seems reasonable to
place the burden of updating references to such objects on the code that
changes the name bindings. If <code class="docutils literal notranslate"><span class="pre">sre_constants.LITERAL</span></code> is changed to refer
to another object, perhaps it would be worthwhile for the code that modifies
the <code class="docutils literal notranslate"><span class="pre">sre_constants</span></code> module dict to correct any active references to that
object. By doing so, in many cases global variables and the attributes of
many objects could be cached as local variables. If the bindings between the
names given to the objects and the objects themselves change rarely, the cost
of keeping track of such objects should be low and the potential payoff fairly
large.</p>
<p>In an attempt to gauge the effect of this proposal, I modified the Pystone
benchmark program included in the Python distribution to cache global
functions. Its main function, <code class="docutils literal notranslate"><span class="pre">Proc0</span></code>, makes calls to ten different
functions inside its <code class="docutils literal notranslate"><span class="pre">for</span></code> loop. In addition, <code class="docutils literal notranslate"><span class="pre">Func2</span></code> calls <code class="docutils literal notranslate"><span class="pre">Func1</span></code>
repeatedly inside a loop. If local copies of these 11 global identifiers are
made before the functions’ loops are entered, performance on this particular
benchmark improves by about two percent (from 5561 pystones to 5685 on my
laptop). It gives some indication that performance would be improved by
caching most global variable access. Note also that the pystone benchmark
makes essentially no accesses of global module attributes, an anticipated area
of improvement for this PEP.</p>
</section>
<section id="proposed-change">
<h2><a class="toc-backref" href="#proposed-change" role="doc-backlink">Proposed Change</a></h2>
<p>I propose that the Python virtual machine be modified to include
<code class="docutils literal notranslate"><span class="pre">TRACK_OBJECT</span></code> and <code class="docutils literal notranslate"><span class="pre">UNTRACK_OBJECT</span></code> opcodes. <code class="docutils literal notranslate"><span class="pre">TRACK_OBJECT</span></code> would
associate a global name or attribute of a global name with a slot in the local
variable array and perform an initial lookup of the associated object to fill
in the slot with a valid value. The association it creates would be noted by
the code responsible for changing the name-to-object binding to cause the
associated local variable to be updated. The <code class="docutils literal notranslate"><span class="pre">UNTRACK_OBJECT</span></code> opcode would
delete any association between the name and the local variable slot.</p>
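<p>The bookkeeping described above can be modelled in pure Python. This is only a sketch under stated assumptions, not the proposed implementation: the class <code class="docutils literal notranslate"><span class="pre">TrackedNamespace</span></code> and its methods <code class="docutils literal notranslate"><span class="pre">track</span></code>, <code class="docutils literal notranslate"><span class="pre">untrack</span></code> and <code class="docutils literal notranslate"><span class="pre">rebind</span></code> are hypothetical names, and a one-element list stands in for the local variable array.</p>

```python
class TrackedNamespace:
    """Toy stand-in for a module dict that updates tracked slots on rebinding."""

    def __init__(self, **bindings):
        self._bindings = dict(bindings)
        self._watchers = {}  # name: list of (slots, index) pairs

    def track(self, name, slots, index):
        # Like TRACK_OBJECT: do the initial lookup, fill the slot,
        # and remember the association.
        slots[index] = self._bindings[name]
        self._watchers.setdefault(name, []).append((slots, index))

    def untrack(self, name, slots, index):
        # Like UNTRACK_OBJECT: forget the association.
        self._watchers[name] = [
            (s, i) for (s, i) in self._watchers.get(name, [])
            if not (s is slots and i == index)
        ]

    def rebind(self, name, value):
        # The code that changes a binding pays for updating tracked slots.
        self._bindings[name] = value
        for slots, index in self._watchers.get(name, []):
            slots[index] = value


fastlocals = [None]                 # hypothetical one-slot local array
ns = TrackedNamespace(LITERAL=19)
ns.track("LITERAL", fastlocals, 0)
first = fastlocals[0]               # the initial lookup filled the slot
ns.rebind("LITERAL", 42)
second = fastlocals[0]              # the writer updated the slot
ns.untrack("LITERAL", fastlocals, 0)
ns.rebind("LITERAL", 7)
third = fastlocals[0]               # no longer tracked, so the slot is stale
```

<p>Reads of the slot never consult the namespace; all of the extra work happens in <code class="docutils literal notranslate"><span class="pre">rebind</span></code>, which is expected to run rarely.</p>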
</section>
<section id="threads">
<h2><a class="toc-backref" href="#threads" role="doc-backlink">Threads</a></h2>
<p>Operation of this code in threaded programs will be no different than in
unthreaded programs. If you need to lock an object to access it, you would
have had to do that before <code class="docutils literal notranslate"><span class="pre">TRACK_OBJECT</span></code> would have been executed and
retain that lock until after you stop using it.</p>
<p>FIXME: I suspect I need more here.</p>
</section>
<section id="rationale">
<h2><a class="toc-backref" href="#rationale" role="doc-backlink">Rationale</a></h2>
<p>Global variables and attributes rarely change. For example, once a function
imports the math module, the binding between the name <em>math</em> and the
module it refers to isn’t likely to change. Similarly, if the function that
uses the <code class="docutils literal notranslate"><span class="pre">math</span></code> module refers to its <em>sin</em> attribute, it’s unlikely to
change. Still, every time the module wants to call the <code class="docutils literal notranslate"><span class="pre">math.sin</span></code> function,
it must first execute a pair of instructions:</p>
<div class="highlight-default notranslate"><div class="highlight"><pre><span></span><span class="n">LOAD_GLOBAL</span> <span class="n">math</span>
<span class="n">LOAD_ATTR</span> <span class="n">sin</span>
</pre></div>
</div>
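<p>The instruction pair can be observed with the standard <code class="docutils literal notranslate"><span class="pre">dis</span></code> module. The helper <code class="docutils literal notranslate"><span class="pre">get_sin</span></code> is a hypothetical example function. Bytecode is a CPython implementation detail, so the exact listing varies between versions; this sketch assumes only that a plain attribute read of a module-level name still compiles to a global load followed by an attribute load.</p>

```python
import dis
import math


def get_sin():
    # A plain attribute read of a module-level name.
    return math.sin


# Collect the opcode names the compiler emitted for get_sin.
opnames = [ins.opname for ins in dis.get_instructions(get_sin)]
print(opnames)
```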
<p>If the client module always assumed that <code class="docutils literal notranslate"><span class="pre">math.sin</span></code> was a local constant and
it was the responsibility of “external forces” outside the function to keep
the reference correct, we might have code like this:</p>
<div class="highlight-default notranslate"><div class="highlight"><pre><span></span><span class="n">TRACK_OBJECT</span> <span class="n">math</span><span class="o">.</span><span class="n">sin</span>
<span class="o">...</span>
<span class="n">LOAD_FAST</span> <span class="n">math</span><span class="o">.</span><span class="n">sin</span>
<span class="o">...</span>
<span class="n">UNTRACK_OBJECT</span> <span class="n">math</span><span class="o">.</span><span class="n">sin</span>
</pre></div>
</div>
<p>If the <code class="docutils literal notranslate"><span class="pre">LOAD_FAST</span></code> was in a loop the payoff in reduced global loads and
attribute lookups could be significant.</p>
<p>This technique could, in theory, be applied to any global variable access or
attribute lookup. Consider this code:</p>
<div class="highlight-default notranslate"><div class="highlight"><pre><span></span><span class="n">l</span> <span class="o">=</span> <span class="p">[]</span>
<span class="k">for</span> <span class="n">i</span> <span class="ow">in</span> <span class="nb">range</span><span class="p">(</span><span class="mi">10</span><span class="p">):</span>
    <span class="n">l</span><span class="o">.</span><span class="n">append</span><span class="p">(</span><span class="n">math</span><span class="o">.</span><span class="n">sin</span><span class="p">(</span><span class="n">i</span><span class="p">))</span>
<span class="k">return</span> <span class="n">l</span>
</pre></div>
</div>
<p>Even though <em>l</em> is a local variable, you still pay the cost of loading
<code class="docutils literal notranslate"><span class="pre">l.append</span></code> ten times in the loop. The compiler (or an optimizer) could
recognize that both <code class="docutils literal notranslate"><span class="pre">math.sin</span></code> and <code class="docutils literal notranslate"><span class="pre">l.append</span></code> are being called in the loop
and decide to generate the tracked local code, avoiding it for the builtin
<code class="docutils literal notranslate"><span class="pre">range()</span></code> function because it’s only called once during loop setup.
Performance issues related to accessing local variables make tracking
<code class="docutils literal notranslate"><span class="pre">l.append</span></code> less attractive than tracking globals such as <code class="docutils literal notranslate"><span class="pre">math.sin</span></code>.</p>
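<p>What tracked locals would automate can be written by hand today. The sketch below wraps the loop in two hypothetical functions, <code class="docutils literal notranslate"><span class="pre">sines_plain</span></code> and <code class="docutils literal notranslate"><span class="pre">sines_hoisted</span></code>, and hoists both lookups out of the loop in the second one. Unlike the proposed tracking, the hoisted copies would not notice a rebinding of <code class="docutils literal notranslate"><span class="pre">math.sin</span></code> made while the loop is running.</p>

```python
import math


def sines_plain(n):
    result = []
    for i in range(n):
        # Looks up math, then .sin, then result.append on every pass.
        result.append(math.sin(i))
    return result


def sines_hoisted(n):
    result = []
    # One lookup each, before the loop; afterwards these are plain locals.
    append = result.append
    sin = math.sin
    for i in range(n):
        append(sin(i))
    return result


same = sines_plain(10) == sines_hoisted(10)
```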
<p>According to a post to python-dev by Marc-Andre Lemburg <a class="footnote-reference brackets" href="#id4" id="id1">[1]</a>, <code class="docutils literal notranslate"><span class="pre">LOAD_GLOBAL</span></code>
opcodes account for over 7% of all instructions executed by the Python virtual
machine. This can be a very expensive instruction, at least relative to a
<code class="docutils literal notranslate"><span class="pre">LOAD_FAST</span></code> instruction, which is a simple array index and requires no extra
function calls by the virtual machine. I believe many <code class="docutils literal notranslate"><span class="pre">LOAD_GLOBAL</span></code>
instructions and <code class="docutils literal notranslate"><span class="pre">LOAD_GLOBAL/LOAD_ATTR</span></code> pairs could be converted to
<code class="docutils literal notranslate"><span class="pre">LOAD_FAST</span></code> instructions.</p>
<p>Code that uses global variables heavily often resorts to various tricks to
avoid global variable and attribute lookup. The aforementioned
<code class="docutils literal notranslate"><span class="pre">sre_compile._compile</span></code> function caches the <code class="docutils literal notranslate"><span class="pre">append</span></code> method of the growing
output list. Many people commonly abuse functions’ default argument feature
to cache global variable lookups. Both of these schemes are hackish and
rarely address all the available opportunities for optimization. (For
example, <code class="docutils literal notranslate"><span class="pre">sre_compile._compile</span></code> does not cache the two globals that it uses
most frequently: the builtin <code class="docutils literal notranslate"><span class="pre">len</span></code> function and the global <code class="docutils literal notranslate"><span class="pre">OPCODES</span></code> array
that it imports from <code class="docutils literal notranslate"><span class="pre">sre_constants.py</span></code>.)</p>
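<p>For readers who have not seen the default-argument trick, here is a minimal sketch. The function <code class="docutils literal notranslate"><span class="pre">hypot_sum</span></code> is a hypothetical example, not code from the standard library.</p>

```python
import math


def hypot_sum(points, sqrt=math.sqrt, len=len):
    # sqrt and len are bound once, at definition time, and are read as
    # fast locals inside the function body.
    total = 0.0
    for x, y in points:
        total += sqrt(x * x + y * y)
    return total, len(points)


total, count = hypot_sum([(3, 4), (6, 8)])
```

<p>The cost of the trick is visible in the signature: the cached names become parameters that a caller can override by accident, and a later rebinding of <code class="docutils literal notranslate"><span class="pre">math.sqrt</span></code> goes unnoticed.</p>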
</section>
<section id="questions">
<h2><a class="toc-backref" href="#questions" role="doc-backlink">Questions</a></h2>
<section id="what-about-threads-what-if-math-sin-changes-while-in-cache">
<h3><a class="toc-backref" href="#what-about-threads-what-if-math-sin-changes-while-in-cache" role="doc-backlink">What about threads? What if <code class="docutils literal notranslate"><span class="pre">math.sin</span></code> changes while in cache?</a></h3>
<p>I believe the global interpreter lock will protect values from being
corrupted. In any case, the situation would be no worse than it is today.
If one thread modified <code class="docutils literal notranslate"><span class="pre">math.sin</span></code> after another thread had already executed
<code class="docutils literal notranslate"><span class="pre">LOAD_GLOBAL</span> <span class="pre">math</span></code>, but before it executed <code class="docutils literal notranslate"><span class="pre">LOAD_ATTR</span> <span class="pre">sin</span></code>, the client
thread would see the old value of <code class="docutils literal notranslate"><span class="pre">math.sin</span></code>.</p>
<p>The idea is this. I use a multi-attribute load below as an example, not
because it would happen very often, but because by demonstrating the recursive
nature with an extra call hopefully it will become clearer what I have in
mind. Suppose a function defined in module <code class="docutils literal notranslate"><span class="pre">foo</span></code> wants to access
<code class="docutils literal notranslate"><span class="pre">spam.eggs.ham</span></code> and that <code class="docutils literal notranslate"><span class="pre">spam</span></code> is a module imported at the module level
in <code class="docutils literal notranslate"><span class="pre">foo</span></code>:</p>
<div class="highlight-default notranslate"><div class="highlight"><pre><span></span><span class="kn">import</span> <span class="nn">spam</span>
<span class="o">...</span>
<span class="k">def</span> <span class="nf">somefunc</span><span class="p">():</span>
    <span class="o">...</span>
    <span class="n">x</span> <span class="o">=</span> <span class="n">spam</span><span class="o">.</span><span class="n">eggs</span><span class="o">.</span><span class="n">ham</span>
</pre></div>
</div>
<p>Upon entry to <code class="docutils literal notranslate"><span class="pre">somefunc</span></code>, a <code class="docutils literal notranslate"><span class="pre">TRACK_GLOBAL</span></code> instruction will be executed:</p>
<div class="highlight-default notranslate"><div class="highlight"><pre><span></span><span class="n">TRACK_GLOBAL</span> <span class="n">spam</span><span class="o">.</span><span class="n">eggs</span><span class="o">.</span><span class="n">ham</span> <span class="n">n</span>
</pre></div>
</div>
<p><em>spam.eggs.ham</em> is a string literal stored in the function’s constants
array. <em>n</em> is a fastlocals index. <code class="docutils literal notranslate"><span class="pre">&amp;fastlocals[n]</span></code> is a reference to
slot <em>n</em> in the executing frame’s <code class="docutils literal notranslate"><span class="pre">fastlocals</span></code> array, the location in
which the <em>spam.eggs.ham</em> reference will be stored. Here’s what I envision
happening:</p>
<ol class="arabic">
<li>The <code class="docutils literal notranslate"><span class="pre">TRACK_GLOBAL</span></code> instruction locates the object referred to by the name
<em>spam</em> and finds it in its module scope. It then executes a C function
like:<div class="highlight-default notranslate"><div class="highlight"><pre><span></span><span class="n">_PyObject_TrackName</span><span class="p">(</span><span class="n">m</span><span class="p">,</span> <span class="s2">&quot;spam.eggs.ham&quot;</span><span class="p">,</span> <span class="o">&amp;</span><span class="n">fastlocals</span><span class="p">[</span><span class="n">n</span><span class="p">])</span>
</pre></div>
</div>
<p>where <code class="docutils literal notranslate"><span class="pre">m</span></code> is the module object with an attribute <code class="docutils literal notranslate"><span class="pre">spam</span></code>.</p>
</li>
<li>The module object strips the leading <em>spam.</em> and stores the necessary
information (<em>eggs.ham</em> and <code class="docutils literal notranslate"><span class="pre">&amp;fastlocals[n]</span></code>) in case its binding for the
name <em>eggs</em> changes. It then locates the object referred to by the key
<em>eggs</em> in its dict and recursively calls:<div class="highlight-default notranslate"><div class="highlight"><pre><span></span><span class="n">_PyObject_TrackName</span><span class="p">(</span><span class="n">eggs</span><span class="p">,</span> <span class="s2">&quot;eggs.ham&quot;</span><span class="p">,</span> <span class="o">&amp;</span><span class="n">fastlocals</span><span class="p">[</span><span class="n">n</span><span class="p">])</span>
</pre></div>
</div>
</li>
<li>The <code class="docutils literal notranslate"><span class="pre">eggs</span></code> object strips the leading <em>eggs.</em>, stores the
(<em>ham</em>, <code class="docutils literal notranslate"><span class="pre">&amp;fastlocals[n]</span></code>) info, locates the object in its namespace called
<code class="docutils literal notranslate"><span class="pre">ham</span></code> and calls <code class="docutils literal notranslate"><span class="pre">_PyObject_TrackName</span></code> once again:<div class="highlight-default notranslate"><div class="highlight"><pre><span></span><span class="n">_PyObject_TrackName</span><span class="p">(</span><span class="n">ham</span><span class="p">,</span> <span class="s2">&quot;ham&quot;</span><span class="p">,</span> <span class="o">&amp;</span><span class="n">fastlocals</span><span class="p">[</span><span class="n">n</span><span class="p">])</span>
</pre></div>
</div>
</li>
<li>The <code class="docutils literal notranslate"><span class="pre">ham</span></code> object strips the leading string (no “.” this time, but that’s
a minor point), sees that the result is empty, then uses its own value
(<code class="docutils literal notranslate"><span class="pre">self</span></code>, probably) to update the location it was handed:<div class="highlight-default notranslate"><div class="highlight"><pre><span></span><span class="n">Py_XDECREF</span><span class="p">(</span><span class="n">fastlocals</span><span class="p">[</span><span class="n">n</span><span class="p">]);</span>
<span class="n">fastlocals</span><span class="p">[</span><span class="n">n</span><span class="p">]</span> <span class="o">=</span> <span class="bp">self</span><span class="p">;</span>
<span class="n">Py_INCREF</span><span class="p">(</span><span class="n">fastlocals</span><span class="p">[</span><span class="n">n</span><span class="p">]);</span>
</pre></div>
</div>
<p>At this point, each object involved in resolving <code class="docutils literal notranslate"><span class="pre">spam.eggs.ham</span></code>
knows which entry in its namespace needs to be tracked and what location
to update if that name changes. Furthermore, if the one name it is
tracking in its local storage changes, it can call <code class="docutils literal notranslate"><span class="pre">_PyObject_TrackName</span></code>
using the new object once the change has been made. At the bottom end of
the food chain, the last object will always strip a name, see the empty
string and know that its value should be stuffed into the location it’s
been passed.</p>
<p>When the object referred to by the dotted expression <code class="docutils literal notranslate"><span class="pre">spam.eggs.ham</span></code>
is going to go out of scope, an <code class="docutils literal notranslate"><span class="pre">UNTRACK_GLOBAL</span> <span class="pre">spam.eggs.ham</span> <span class="pre">n</span></code>
instruction is executed. It has the effect of deleting all the tracking
information that <code class="docutils literal notranslate"><span class="pre">TRACK_GLOBAL</span></code> established.</p>
<p>The tracking operation may seem expensive, but recall that the objects
being tracked are assumed to be “almost constant”, so the setup cost will
be traded off against hopefully multiple local instead of global loads.
For globals with attributes the tracking setup cost grows but is offset by
avoiding the extra <code class="docutils literal notranslate"><span class="pre">LOAD_ATTR</span></code> cost. The <code class="docutils literal notranslate"><span class="pre">TRACK_GLOBAL</span></code> instruction
needs to perform a <code class="docutils literal notranslate"><span class="pre">PyDict_GetItemString</span></code> for the first name in the chain
to determine where the top-level object resides. Each object in the chain
has to store a string and an address somewhere, probably in a dict that
uses storage locations as keys (e.g. the <code class="docutils literal notranslate"><span class="pre">&amp;fastlocals[n]</span></code>) and strings as
values. (This dict could possibly be a central dict of dicts whose keys
are object addresses instead of a per-object dict.) It shouldn’t be the
other way around because multiple active frames may want to track
<code class="docutils literal notranslate"><span class="pre">spam.eggs.ham</span></code>, but only one frame will want to associate that name with
one of its fast locals slots.</p>
</li>
</ol>
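<p>The recursive walk in the steps above can be sketched in pure Python. Everything here is an illustrative assumption: the class <code class="docutils literal notranslate"><span class="pre">Tracked</span></code> and its methods are hypothetical stand-ins for the C-level <code class="docutils literal notranslate"><span class="pre">_PyObject_TrackName</span></code>, a list stands in for <code class="docutils literal notranslate"><span class="pre">fastlocals</span></code>, and plain values end the chain directly instead of stripping their own name. As suggested above, the tracking table is keyed on the storage location and holds the remaining name.</p>

```python
class Tracked:
    """Toy object whose namespace remembers which slots depend on it."""

    def __init__(self, **attrs):
        self.ns = dict(attrs)
        self.watch = {}  # (id(slots), index): (remaining path, slots, index)

    def track_name(self, path, slots, index):
        # Strip this object's own leading name, as each step above does.
        self._track_rest(path.partition(".")[2], slots, index)

    def _track_rest(self, rest, slots, index):
        if not rest:
            slots[index] = self          # end of the chain: store own value
            return
        self.watch[(id(slots), index)] = (rest, slots, index)
        child = self.ns[rest.partition(".")[0]]
        if isinstance(child, Tracked):
            child.track_name(rest, slots, index)
        else:
            slots[index] = child         # a plain value ends the chain

    def untrack_name(self, slots, index):
        # Drop the tracking entry here and in every object further down.
        entry = self.watch.pop((id(slots), index), None)
        if entry is not None:
            child = self.ns.get(entry[0].partition(".")[0])
            if isinstance(child, Tracked):
                child.untrack_name(slots, index)

    def rebind(self, name, value):
        # The writer pays: re-track every slot whose path starts with name.
        affected = [e for e in self.watch.values()
                    if e[0].partition(".")[0] == name]
        for rest, slots, index in affected:
            self.untrack_name(slots, index)
        self.ns[name] = value
        for rest, slots, index in affected:
            self._track_rest(rest, slots, index)


eggs = Tracked(ham="old ham")
spam = Tracked(eggs=eggs)
fastlocals = [None]

spam.track_name("spam.eggs.ham", fastlocals, 0)
seen_initial = fastlocals[0]

eggs.rebind("ham", "new ham")                  # the leaf binding changes
seen_after_leaf = fastlocals[0]

spam.rebind("eggs", Tracked(ham="other ham"))  # the middle of the chain changes
seen_after_middle = fastlocals[0]

eggs.rebind("ham", "stale")                    # the old eggs is no longer tracked
seen_after_stale = fastlocals[0]

spam.untrack_name(fastlocals, 0)               # like UNTRACK_GLOBAL
```

<p>Rebinding a name in the middle of the chain first removes the tracking entries below the old object and then repeats the walk from the new one, so the slot always reflects the current value of the whole dotted expression.</p>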
</section>
</section>
<section id="unresolved-issues">
<h2><a class="toc-backref" href="#unresolved-issues" role="doc-backlink">Unresolved Issues</a></h2>
<section id="threading">
<h3><a class="toc-backref" href="#threading" role="doc-backlink">Threading</a></h3>
<p>What about this (dumb) code?:</p>
<div class="highlight-default notranslate"><div class="highlight"><pre><span></span><span class="n">l</span> <span class="o">=</span> <span class="p">[]</span>
<span class="n">lock</span> <span class="o">=</span> <span class="n">threading</span><span class="o">.</span><span class="n">Lock</span><span class="p">()</span>
<span class="o">...</span>
<span class="k">def</span> <span class="nf">fill_l</span><span class="p">():</span>
    <span class="k">for</span> <span class="n">i</span> <span class="ow">in</span> <span class="nb">range</span><span class="p">(</span><span class="mi">1000</span><span class="p">):</span>
        <span class="n">lock</span><span class="o">.</span><span class="n">acquire</span><span class="p">()</span>
        <span class="n">l</span><span class="o">.</span><span class="n">append</span><span class="p">(</span><span class="n">math</span><span class="o">.</span><span class="n">sin</span><span class="p">(</span><span class="n">i</span><span class="p">))</span>
        <span class="n">lock</span><span class="o">.</span><span class="n">release</span><span class="p">()</span>
<span class="o">...</span>
<span class="k">def</span> <span class="nf">consume_l</span><span class="p">():</span>
    <span class="k">while</span> <span class="mi">1</span><span class="p">:</span>
        <span class="n">lock</span><span class="o">.</span><span class="n">acquire</span><span class="p">()</span>
        <span class="k">if</span> <span class="n">l</span><span class="p">:</span>
            <span class="n">elt</span> <span class="o">=</span> <span class="n">l</span><span class="o">.</span><span class="n">pop</span><span class="p">()</span>
        <span class="n">lock</span><span class="o">.</span><span class="n">release</span><span class="p">()</span>
        <span class="n">fiddle</span><span class="p">(</span><span class="n">elt</span><span class="p">)</span>
</pre></div>
</div>
<p>It’s not clear from a static analysis of the code what the lock is protecting.
(You can’t tell at compile-time that threads are even involved, can you?)
Would or should it affect attempts to track <code class="docutils literal notranslate"><span class="pre">l.append</span></code> or <code class="docutils literal notranslate"><span class="pre">math.sin</span></code> in
the <code class="docutils literal notranslate"><span class="pre">fill_l</span></code> function?</p>
<p>If we annotate the code with mythical <code class="docutils literal notranslate"><span class="pre">track_object</span></code> and <code class="docutils literal notranslate"><span class="pre">untrack_object</span></code>
builtins (I’m not proposing this, just illustrating where stuff would go!), we
get:</p>
<div class="highlight-default notranslate"><div class="highlight"><pre><span></span><span class="n">l</span> <span class="o">=</span> <span class="p">[]</span>
<span class="n">lock</span> <span class="o">=</span> <span class="n">threading</span><span class="o">.</span><span class="n">Lock</span><span class="p">()</span>
<span class="o">...</span>
<span class="k">def</span> <span class="nf">fill_l</span><span class="p">():</span>
    <span class="n">track_object</span><span class="p">(</span><span class="s2">&quot;l.append&quot;</span><span class="p">,</span> <span class="n">append</span><span class="p">)</span>
    <span class="n">track_object</span><span class="p">(</span><span class="s2">&quot;math.sin&quot;</span><span class="p">,</span> <span class="n">sin</span><span class="p">)</span>
    <span class="k">for</span> <span class="n">i</span> <span class="ow">in</span> <span class="nb">range</span><span class="p">(</span><span class="mi">1000</span><span class="p">):</span>
        <span class="n">lock</span><span class="o">.</span><span class="n">acquire</span><span class="p">()</span>
        <span class="n">append</span><span class="p">(</span><span class="n">sin</span><span class="p">(</span><span class="n">i</span><span class="p">))</span>
        <span class="n">lock</span><span class="o">.</span><span class="n">release</span><span class="p">()</span>
    <span class="n">untrack_object</span><span class="p">(</span><span class="s2">&quot;math.sin&quot;</span><span class="p">,</span> <span class="n">sin</span><span class="p">)</span>
    <span class="n">untrack_object</span><span class="p">(</span><span class="s2">&quot;l.append&quot;</span><span class="p">,</span> <span class="n">append</span><span class="p">)</span>
<span class="o">...</span>
<span class="k">def</span> <span class="nf">consume_l</span><span class="p">():</span>
    <span class="k">while</span> <span class="mi">1</span><span class="p">:</span>
        <span class="n">lock</span><span class="o">.</span><span class="n">acquire</span><span class="p">()</span>
        <span class="k">if</span> <span class="n">l</span><span class="p">:</span>
            <span class="n">elt</span> <span class="o">=</span> <span class="n">l</span><span class="o">.</span><span class="n">pop</span><span class="p">()</span>
        <span class="n">lock</span><span class="o">.</span><span class="n">release</span><span class="p">()</span>
        <span class="n">fiddle</span><span class="p">(</span><span class="n">elt</span><span class="p">)</span>
</pre></div>
</div>
<p>Is that correct both with and without threads (or at least equally incorrect
with and without threads)?</p>
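<p>As a rough, runnable approximation of the annotated code, the sketch below hoists <code class="docutils literal notranslate"><span class="pre">l.append</span></code> and <code class="docutils literal notranslate"><span class="pre">math.sin</span></code> into local names by hand, which is the idiom that tracking would automate. Several things here are assumptions made only so the example terminates and can be checked: <code class="docutils literal notranslate"><span class="pre">fiddle</span></code> is a stand-in that records its argument, <code class="docutils literal notranslate"><span class="pre">results</span></code> and <code class="docutils literal notranslate"><span class="pre">N</span></code> are hypothetical, and the consumer stops after <code class="docutils literal notranslate"><span class="pre">N</span></code> items instead of looping forever.</p>

```python
import math
import threading

l = []
lock = threading.Lock()
results = []   # hypothetical sink so the run can be checked afterwards
N = 1000

def fiddle(elt):
    # Stand-in for the undefined fiddle() in the example: record the value.
    results.append(elt)

def fill_l():
    # Manual equivalent of the track_object() calls: bind once, reuse below.
    append = l.append
    sin = math.sin
    for i in range(N):
        with lock:
            append(sin(i))

def consume_l():
    consumed = 0
    while consumed != N:
        with lock:
            elt = l.pop() if l else None
        if elt is not None:
            fiddle(elt)
            consumed += 1

producer = threading.Thread(target=fill_l)
consumer = threading.Thread(target=consume_l)
producer.start()
consumer.start()
producer.join()
consumer.join()
```

<p>The cached names stay valid here only because neither <code class="docutils literal notranslate"><span class="pre">l</span></code> nor <code class="docutils literal notranslate"><span class="pre">math.sin</span></code> is rebound while <code class="docutils literal notranslate"><span class="pre">fill_l</span></code> runs. The lock protects the list contents, not the name bindings, which is the gap the question above is pointing at.</p>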
</section>
<section id="nested-scopes">
<h3><a class="toc-backref" href="#nested-scopes" role="doc-backlink">Nested Scopes</a></h3>
<p>The presence of nested scopes will affect where <code class="docutils literal notranslate"><span class="pre">TRACK_OBJECT</span></code> finds a
global variable, but shouldn't affect anything after that. (I think.)</p>
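<p>A small illustration of the point, using only introspection that exists in today's interpreter: the same name can resolve to an enclosing function's variable or to a module global depending on where it is used, so a tracking scheme would first have to decide which binding it is looking at. The function names below are made up for the example.</p>

```python
ham = "global ham"

def outer():
    ham = "enclosing ham"
    def inner():
        return ham      # resolved in outer's scope, not the module's
    return inner

def plain():
    return ham          # resolved as a module global

inner = outer()

# Names the compiler resolved to an enclosing function scope.
print(inner.__code__.co_freevars)
# Names the compiler left to be looked up as globals or builtins.
print(plain.__code__.co_names)
```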
</section>
<section id="missing-attributes">
<h3><a class="toc-backref" href="#missing-attributes" role="doc-backlink">Missing Attributes</a></h3>
<p>Suppose I am tracking the object referred to by <code class="docutils literal notranslate"><span class="pre">spam.eggs.ham</span></code> and
<code class="docutils literal notranslate"><span class="pre">spam.eggs</span></code> is rebound to an object that does not have a <code class="docutils literal notranslate"><span class="pre">ham</span></code> attribute.
It's clear this will be an <code class="docutils literal notranslate"><span class="pre">AttributeError</span></code> if the programmer attempts to
resolve <code class="docutils literal notranslate"><span class="pre">spam.eggs.ham</span></code> in the current Python virtual machine, but suppose
the programmer has anticipated this case:</p>
<div class="highlight-default notranslate"><div class="highlight"><pre><span></span><span class="k">if</span> <span class="nb">hasattr</span><span class="p">(</span><span class="n">spam</span><span class="o">.</span><span class="n">eggs</span><span class="p">,</span> <span class="s2">&quot;ham&quot;</span><span class="p">):</span>
    <span class="nb">print</span><span class="p">(</span><span class="n">spam</span><span class="o">.</span><span class="n">eggs</span><span class="o">.</span><span class="n">ham</span><span class="p">)</span>
<span class="k">elif</span> <span class="nb">hasattr</span><span class="p">(</span><span class="n">spam</span><span class="o">.</span><span class="n">eggs</span><span class="p">,</span> <span class="s2">&quot;bacon&quot;</span><span class="p">):</span>
    <span class="nb">print</span><span class="p">(</span><span class="n">spam</span><span class="o">.</span><span class="n">eggs</span><span class="o">.</span><span class="n">bacon</span><span class="p">)</span>
<span class="k">else</span><span class="p">:</span>
    <span class="nb">print</span><span class="p">(</span><span class="s2">&quot;what? no meat?&quot;</span><span class="p">)</span>
</pre></div>
</div>
<p>The interpreter can't raise an <code class="docutils literal notranslate"><span class="pre">AttributeError</span></code> when the tracking information is
recalculated, because the guarded code above may never touch the missing
attribute. If it does not raise <code class="docutils literal notranslate"><span class="pre">AttributeError</span></code> and instead lets the stale
tracking stand, it may be setting the programmer up for a very subtle error.</p>
<p>One solution to this problem would be to track the shortest possible root of
each dotted expression the function refers to directly. In the above example,
<code class="docutils literal notranslate"><span class="pre">spam.eggs</span></code> would be tracked, but <code class="docutils literal notranslate"><span class="pre">spam.eggs.ham</span></code> and <code class="docutils literal notranslate"><span class="pre">spam.eggs.bacon</span></code>
would not.</p>
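<p>The effect of tracking only the root can be approximated in plain Python. In this sketch the parameter <code class="docutils literal notranslate"><span class="pre">root</span></code> plays the role of the tracked value of <code class="docutils literal notranslate"><span class="pre">spam.eggs</span></code>, and the final attribute is still resolved on every access, so the <code class="docutils literal notranslate"><span class="pre">hasattr</span></code> guards keep their usual meaning. The <code class="docutils literal notranslate"><span class="pre">describe</span></code> helper and the use of <code class="docutils literal notranslate"><span class="pre">SimpleNamespace</span></code> are assumptions for illustration, and refreshing the tracked root is simulated by passing in the current value.</p>

```python
from types import SimpleNamespace

spam = SimpleNamespace(eggs=SimpleNamespace(ham="ham!"))

def describe(root):
    # `root` stands for the tracked spam.eggs; .ham and .bacon are not
    # cached, so each hasattr() check sees the object's current state.
    if hasattr(root, "ham"):
        return root.ham
    elif hasattr(root, "bacon"):
        return root.bacon
    else:
        return "what? no meat?"

first = describe(spam.eggs)

# Rebind spam.eggs to an object without a ham attribute.
spam.eggs = SimpleNamespace(bacon="bacon!")
second = describe(spam.eggs)

# Rebind again to an object with neither attribute.
spam.eggs = SimpleNamespace()
third = describe(spam.eggs)
```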
</section>
<section id="who-does-the-dirty-work">
<h3><a class="toc-backref" href="#who-does-the-dirty-work" role="doc-backlink">Who does the dirty work?</a></h3>
<p>In the Questions section I postulated the existence of a
<code class="docutils literal notranslate"><span class="pre">_PyObject_TrackName</span></code> function. While the API is fairly easy to specify,
the implementation behind the scenes is not so obvious. A central dictionary
could be used to track the name/location mappings, but it appears that all
<code class="docutils literal notranslate"><span class="pre">setattr</span></code> functions might need to be modified to accommodate this new
functionality.</p>
<p>If all types used the <code class="docutils literal notranslate"><span class="pre">PyObject_GenericSetAttr</span></code> function to set attributes,
that would localize the update code somewhat. They don't, however (which is
not too surprising), so it seems that all <code class="docutils literal notranslate"><span class="pre">setattrfunc</span></code> and <code class="docutils literal notranslate"><span class="pre">setattrofunc</span></code>
functions will have to be updated. In addition, this would place an absolute
requirement on C extension module authors to call some function when an
attribute changes value (<code class="docutils literal notranslate"><span class="pre">PyObject_TrackUpdate</span></code>?).</p>
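<p>To make the concern concrete, here is a minimal Python model of a central registry, under the assumption that every attribute store must notify it. All names (<code class="docutils literal notranslate"><span class="pre">TrackRegistry</span></code>, <code class="docutils literal notranslate"><span class="pre">Tracked</span></code>, <code class="docutils literal notranslate"><span class="pre">Uncooperative</span></code>) are hypothetical, and the <code class="docutils literal notranslate"><span class="pre">update</span></code> method merely plays the role of the speculative <code class="docutils literal notranslate"><span class="pre">PyObject_TrackUpdate</span></code>. It is not a design for the C implementation.</p>

```python
class TrackRegistry:
    """Hypothetical central registry of tracked attribute values."""

    def __init__(self):
        self._cache = {}   # (id(owner), attribute name) -> cached value

    def track(self, owner, name):
        self._cache[(id(owner), name)] = getattr(owner, name)

    def untrack(self, owner, name):
        self._cache.pop((id(owner), name), None)

    def update(self, owner, name, value):
        # Called by cooperating setattr code when a binding changes.
        key = (id(owner), name)
        if key in self._cache:
            self._cache[key] = value

    def lookup(self, owner, name):
        return self._cache[(id(owner), name)]

registry = TrackRegistry()

class Tracked:
    """A type whose attribute stores notify the registry."""

    def __setattr__(self, name, value):
        object.__setattr__(self, name, value)
        registry.update(self, name, value)

class Uncooperative:
    """A type that stores attributes without telling the registry."""

good, bad = Tracked(), Uncooperative()
good.x = 1
bad.x = 1
registry.track(good, "x")
registry.track(bad, "x")
good.x = 2
bad.x = 2
```

<p>After the second round of assignments the registry holds the fresh value for <code class="docutils literal notranslate"><span class="pre">good.x</span></code> but a stale one for <code class="docutils literal notranslate"><span class="pre">bad.x</span></code>, which is the failure mode that forces every attribute-setting path to participate.</p>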
<p>Finally, it's quite possible that some attributes will be set by side effect
and not by any direct call to a <code class="docutils literal notranslate"><span class="pre">setattr</span></code> method of some sort. Consider a
device interface module that has an interrupt routine that copies the contents
of a device register into a slot in the object's <code class="docutils literal notranslate"><span class="pre">struct</span></code> whenever it
changes. In these situations, more extensive modifications would have to be
made by the module author. To identify such situations at compile time would
be impossible. I think an extra slot could be added to <code class="docutils literal notranslate"><span class="pre">PyTypeObject</span></code> to
indicate if an object's code is safe for global tracking. It would have a
default value of 0 (<code class="docutils literal notranslate"><span class="pre">Py_TRACKING_NOT_SAFE</span></code>). If an extension module author
has implemented the necessary tracking support, that field could be
initialized to 1 (<code class="docutils literal notranslate"><span class="pre">Py_TRACKING_SAFE</span></code>). <code class="docutils literal notranslate"><span class="pre">_PyObject_TrackName</span></code> could check
that field and issue a warning if it is asked to track an object that the
author has not explicitly said was safe for tracking.</p>
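<p>The opt-in flag could behave roughly as in the following Python model. The class attribute <code class="docutils literal notranslate"><span class="pre">tracking_flag</span></code>, the function <code class="docutils literal notranslate"><span class="pre">track_name</span></code>, and both example classes are hypothetical stand-ins for the proposed type slot and for <code class="docutils literal notranslate"><span class="pre">_PyObject_TrackName</span></code>. The choice of <code class="docutils literal notranslate"><span class="pre">RuntimeWarning</span></code> is also an assumption, since the text only says "a warning".</p>

```python
import warnings

TRACKING_NOT_SAFE = 0   # mirrors the proposed Py_TRACKING_NOT_SAFE default
TRACKING_SAFE = 1       # mirrors the proposed Py_TRACKING_SAFE

class DeviceRegister:
    """Type that never opted in, so the not-safe default applies."""

class PlainConfig:
    """Type whose author has declared it safe for tracking."""
    tracking_flag = TRACKING_SAFE   # hypothetical opt-in slot

def track_name(obj, name):
    """Stand-in for _PyObject_TrackName: warn unless the type opted in."""
    flag = getattr(type(obj), "tracking_flag", TRACKING_NOT_SAFE)
    if flag != TRACKING_SAFE:
        warnings.warn(
            "tracking %r on %s, which is not marked safe for tracking"
            % (name, type(obj).__name__),
            RuntimeWarning,
            stacklevel=2,
        )
    return getattr(obj, name)
```

<p>Defaulting to "not safe" means existing extension types keep working unchanged and only produce a warning when someone asks to track them.</p>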
</section>
</section>
<section id="discussion">
<h2><a class="toc-backref" href="#discussion" role="doc-backlink">Discussion</a></h2>
<p>Jeremy Hylton has an alternate proposal on the table <a class="footnote-reference brackets" href="#id5" id="id2">[2]</a>. His proposal seeks
to create a hybrid dictionary/list object for use in global name lookups that
would make global variable access look more like local variable access. While
there is no C code available to examine, the Python implementation given in
his proposal still appears to require dictionary key lookup. It doesn't
appear that his proposal could speed local variable attribute lookup, which
might be worthwhile in some situations if potential performance burdens could
be addressed.</p>
</section>
<section id="backwards-compatibility">
<h2><a class="toc-backref" href="#backwards-compatibility" role="doc-backlink">Backwards Compatibility</a></h2>
<p>I don't believe there will be any serious issues of backward compatibility.
Obviously, Python bytecode that contains <code class="docutils literal notranslate"><span class="pre">TRACK_OBJECT</span></code> opcodes could not be
executed by earlier versions of the interpreter, but breakage at the bytecode
level is often assumed between versions.</p>
</section>
<section id="implementation">
<h2><a class="toc-backref" href="#implementation" role="doc-backlink">Implementation</a></h2>
<p>TBD. This is where I need help. I believe there should be either a central
name/location registry or the code that modifies object attributes should be
modified, but I'm not sure of the best way to go about this. If you look at the
code that implements the <code class="docutils literal notranslate"><span class="pre">STORE_GLOBAL</span></code> and <code class="docutils literal notranslate"><span class="pre">STORE_ATTR</span></code> opcodes, it seems
likely that some changes will be required to <code class="docutils literal notranslate"><span class="pre">PyDict_SetItem</span></code> and
<code class="docutils literal notranslate"><span class="pre">PyObject_SetAttr</span></code> or their String variants. Ideally, there'd be a fairly
central place to localize these changes. If you begin considering tracking
attributes of local variables you get into issues of modifying <code class="docutils literal notranslate"><span class="pre">STORE_FAST</span></code>
as well, which could be a problem, since the name bindings for local variables
are changed much more frequently. (I think an optimizer could avoid inserting
the tracking code for the attributes for any local variables where the
variable's name binding changes.)</p>
</section>
<section id="performance">
<h2><a class="toc-backref" href="#performance" role="doc-backlink">Performance</a></h2>
<p>I believe (though I have no code to prove it at this point) that implementing
<code class="docutils literal notranslate"><span class="pre">TRACK_OBJECT</span></code> will generally not be much more expensive than a single
<code class="docutils literal notranslate"><span class="pre">LOAD_GLOBAL</span></code> instruction or a <code class="docutils literal notranslate"><span class="pre">LOAD_GLOBAL</span></code>/<code class="docutils literal notranslate"><span class="pre">LOAD_ATTR</span></code> pair. An
optimizer should be able to avoid converting <code class="docutils literal notranslate"><span class="pre">LOAD_GLOBAL</span></code> and
<code class="docutils literal notranslate"><span class="pre">LOAD_GLOBAL</span></code>/<code class="docutils literal notranslate"><span class="pre">LOAD_ATTR</span></code> to the new scheme unless the object access
occurred within a loop. Further down the line, a register-oriented
replacement for the current Python virtual machine <a class="footnote-reference brackets" href="#id6" id="id3">[3]</a> could conceivably
eliminate most of the <code class="docutils literal notranslate"><span class="pre">LOAD_FAST</span></code> instructions as well.</p>
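<p>The loop case can be illustrated with two hand-written variants of the same function: one that looks up <code class="docutils literal notranslate"><span class="pre">math.sin</span></code> on every iteration and one that binds it to a local name once, which is roughly what tracking would arrange automatically. The function names and the <code class="docutils literal notranslate"><span class="pre">global_loads</span></code> helper are assumptions for this sketch. It inspects bytecode only and makes no timing claim.</p>

```python
import dis
import math

def lookup_each_time(n):
    total = 0.0
    for i in range(n):
        total += math.sin(i)   # global plus attribute lookup per iteration
    return total

def lookup_once(n):
    sin = math.sin             # one global plus attribute lookup
    total = 0.0
    for i in range(n):
        total += sin(i)        # local variable access per iteration
    return total

def global_loads(func):
    """Return the names fetched with LOAD_GLOBAL in func's bytecode."""
    return [ins.argval for ins in dis.get_instructions(func)
            if ins.opname == "LOAD_GLOBAL"]
```

<p>Both functions compute the same sum. The difference is that in <code class="docutils literal notranslate"><span class="pre">lookup_once</span></code> the load of <code class="docutils literal notranslate"><span class="pre">math</span></code> sits before the loop instead of inside it.</p>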
<p>The number of tracked objects should be relatively small. All active frames
of all active threads could conceivably be tracking objects, but this seems
small compared to the number of functions defined in a given application.</p>
</section>
<section id="references">
<h2><a class="toc-backref" href="#references" role="doc-backlink">References</a></h2>
<aside class="footnote-list brackets">
<aside class="footnote brackets" id="id4" role="doc-footnote">
<dt class="label" id="id4">[<a href="#id1">1</a>]</dt>
<dd><a class="reference external" href="https://mail.python.org/pipermail/python-dev/2000-July/007609.html">https://mail.python.org/pipermail/python-dev/2000-July/007609.html</a></aside>
<aside class="footnote brackets" id="id5" role="doc-footnote">
<dt class="label" id="id5">[<a href="#id2">2</a>]</dt>
<dd><a class="reference external" href="http://www.zope.org/Members/jeremy/CurrentAndFutureProjects/FastGlobalsPEP">http://www.zope.org/Members/jeremy/CurrentAndFutureProjects/FastGlobalsPEP</a></aside>
<aside class="footnote brackets" id="id6" role="doc-footnote">
<dt class="label" id="id6">[<a href="#id3">3</a>]</dt>
<dd><a class="reference external" href="http://www.musi-cal.com/~skip/python/rattlesnake20010813.tar.gz">http://www.musi-cal.com/~skip/python/rattlesnake20010813.tar.gz</a></aside>
</aside>
</section>
<section id="copyright">
<h2><a class="toc-backref" href="#copyright" role="doc-backlink">Copyright</a></h2>
<p>This document has been placed in the public domain.</p>
</section>
</section>
<hr class="docutils" />
<p>Source: <a class="reference external" href="https://github.com/python/peps/blob/main/peps/pep-0266.rst">https://github.com/python/peps/blob/main/peps/pep-0266.rst</a></p>
<p>Last modified: <a class="reference external" href="https://github.com/python/peps/commits/main/peps/pep-0266.rst">2023-09-09 17:39:29 GMT</a></p>
</article>
<nav id="pep-sidebar">
<h2>Contents</h2>
<ul>
<li><a class="reference internal" href="#abstract">Abstract</a></li>
<li><a class="reference internal" href="#introduction">Introduction</a></li>
<li><a class="reference internal" href="#proposed-change">Proposed Change</a></li>
<li><a class="reference internal" href="#threads">Threads</a></li>
<li><a class="reference internal" href="#rationale">Rationale</a></li>
<li><a class="reference internal" href="#questions">Questions</a><ul>
<li><a class="reference internal" href="#what-about-threads-what-if-math-sin-changes-while-in-cache">What about threads? What if <code class="docutils literal notranslate"><span class="pre">math.sin</span></code> changes while in cache?</a></li>
</ul>
</li>
<li><a class="reference internal" href="#unresolved-issues">Unresolved Issues</a><ul>
<li><a class="reference internal" href="#threading">Threading</a></li>
<li><a class="reference internal" href="#nested-scopes">Nested Scopes</a></li>
<li><a class="reference internal" href="#missing-attributes">Missing Attributes</a></li>
<li><a class="reference internal" href="#who-does-the-dirty-work">Who does the dirty work?</a></li>
</ul>
</li>
<li><a class="reference internal" href="#discussion">Discussion</a></li>
<li><a class="reference internal" href="#backwards-compatibility">Backwards Compatibility</a></li>
<li><a class="reference internal" href="#implementation">Implementation</a></li>
<li><a class="reference internal" href="#performance">Performance</a></li>
<li><a class="reference internal" href="#references">References</a></li>
<li><a class="reference internal" href="#copyright">Copyright</a></li>
</ul>
</nav>
</section>
<script src="../_static/colour_scheme.js"></script>
<script src="../_static/wrap_tables.js"></script>
<script src="../_static/sticky_banner.js"></script>
</body>
</html>