reSTify PEP 266 (#265)
parent
1ef52e9d6a
commit
d66124f598
674
pep-0266.txt
|
@ -5,450 +5,440 @@ Last-Modified: $Date$
|
||||||
Author: skip@pobox.com (Skip Montanaro)
|
Author: skip@pobox.com (Skip Montanaro)
|
||||||
Status: Withdrawn
|
Status: Withdrawn
|
||||||
Type: Standards Track
|
Type: Standards Track
|
||||||
|
Content-Type: text/x-rst
|
||||||
Created: 13-Aug-2001
|
Created: 13-Aug-2001
|
||||||
Python-Version: 2.3
|
Python-Version: 2.3
|
||||||
Post-History:
|
Post-History:
|
||||||
|
|
||||||
|
|
||||||
Abstract
|
Abstract
|
||||||
|
========
|
||||||
|
|
||||||
The bindings for most global variables and attributes of other
|
The bindings for most global variables and attributes of other modules
|
||||||
modules typically never change during the execution of a Python
|
typically never change during the execution of a Python program, but because
|
||||||
program, but because of Python's dynamic nature, code which
|
of Python's dynamic nature, code which accesses such global objects must run
|
||||||
accesses such global objects must run through a full lookup each
|
through a full lookup each time the object is needed. This PEP proposes a
|
||||||
time the object is needed. This PEP proposes a mechanism that
|
mechanism that allows code that accesses most global objects to treat them as
|
||||||
allows code that accesses most global objects to treat them as
|
local objects and places the burden of updating references on the code that
|
||||||
local objects and places the burden of updating references on the
|
changes the name bindings of such objects.
|
||||||
code that changes the name bindings of such objects.
|
|
||||||
|
|
||||||
|
|
||||||
Introduction
|
Introduction
|
||||||
|
============
|
||||||
|
|
||||||
Consider the workhorse function sre_compile._compile. It is the
|
Consider the workhorse function ``sre_compile._compile``. It is the internal
|
||||||
internal compilation function for the sre module. It consists
|
compilation function for the ``sre`` module. It consists almost entirely of a
|
||||||
almost entirely of a loop over the elements of the pattern being
|
loop over the elements of the pattern being compiled, comparing opcodes with
|
||||||
compiled, comparing opcodes with known constant values and
|
known constant values and appending tokens to an output list. Most of the
|
||||||
appending tokens to an output list. Most of the comparisons are
|
comparisons are with constants imported from the ``sre_constants`` module.
|
||||||
with constants imported from the sre_constants module. This means
|
This means there are lots of ``LOAD_GLOBAL`` bytecodes in the compiled output
|
||||||
there are lots of LOAD_GLOBAL bytecodes in the compiled output of
|
of this module. Just by reading the code it's apparent that the author
|
||||||
this module. Just by reading the code it's apparent that the
|
intended ``LITERAL``, ``NOT_LITERAL``, ``OPCODES`` and many other symbols to
|
||||||
author intended LITERAL, NOT_LITERAL, OPCODES and many other
|
be constants. Still, each time they are involved in an expression, they must
|
||||||
symbols to be constants. Still, each time they are involved in an
|
be looked up anew.
|
||||||
expression, they must be looked up anew.
|
|
||||||
|
|
||||||
Most global accesses are actually to objects that are "almost
|
Most global accesses are actually to objects that are "almost constants".
|
||||||
constants". This includes global variables in the current module
|
This includes global variables in the current module as well as the attributes
|
||||||
as well as the attributes of other imported modules. Since they
|
of other imported modules. Since they rarely change, it seems reasonable to
|
||||||
rarely change, it seems reasonable to place the burden of updating
|
place the burden of updating references to such objects on the code that
|
||||||
references to such objects on the code that changes the name
|
changes the name bindings. If ``sre_constants.LITERAL`` is changed to refer
|
||||||
bindings. If sre_constants.LITERAL is changed to refer to another
|
to another object, perhaps it would be worthwhile for the code that modifies
|
||||||
object, perhaps it would be worthwhile for the code that modifies
|
the ``sre_constants`` module dict to correct any active references to that
|
||||||
the sre_constants module dict to correct any active references to
|
object. By doing so, in many cases global variables and the attributes of
|
||||||
that object. By doing so, in many cases global variables and the
|
many objects could be cached as local variables. If the bindings between the
|
||||||
attributes of many objects could be cached as local variables. If
|
names given to the objects and the objects themselves change rarely, the cost
|
||||||
the bindings between the names given to the objects and the
|
of keeping track of such objects should be low and the potential payoff fairly
|
||||||
objects themselves change rarely, the cost of keeping track of
|
large.
|
||||||
such objects should be low and the potential payoff fairly large.
|
|
||||||
|
In an attempt to gauge the effect of this proposal, I modified the Pystone
|
||||||
|
benchmark program included in the Python distribution to cache global
|
||||||
|
functions. Its main function, ``Proc0``, makes calls to ten different
|
||||||
|
functions inside its ``for`` loop. In addition, ``Func2`` calls ``Func1``
|
||||||
|
repeatedly inside a loop. If local copies of these 11 global identifiers are
|
||||||
|
made before the functions' loops are entered, performance on this particular
|
||||||
|
benchmark improves by about two percent (from 5561 pystones to 5685 on my
|
||||||
|
laptop). It gives some indication that performance would be improved by
|
||||||
|
caching most global variable access. Note also that the pystone benchmark
|
||||||
|
makes essentially no accesses of global module attributes, an anticipated area
|
||||||
|
of improvement for this PEP.
|
||||||
|
|
||||||
In an attempt to gauge the effect of this proposal, I modified the
|
|
||||||
Pystone benchmark program included in the Python distribution to
|
|
||||||
cache global functions. Its main function, Proc0, makes calls to
|
|
||||||
ten different functions inside its for loop. In addition, Func2
|
|
||||||
calls Func1 repeatedly inside a loop. If local copies of these 11
|
|
||||||
global identifiers are made before the functions' loops are entered,
|
|
||||||
performance on this particular benchmark improves by about two per
|
|
||||||
cent (from 5561 pystones to 5685 on my laptop). It gives some
|
|
||||||
indication that performance would be improved by caching most
|
|
||||||
global variable access. Note also that the pystone benchmark
|
|
||||||
makes essentially no accesses of global module attributes, an
|
|
||||||
anticipated area of improvement for this PEP.
|
|
||||||
|
|
||||||
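The experiment described above can be approximated by hand. The following is a minimal sketch, not the actual Pystone change: ``func1``, ``loop_with_globals`` and ``loop_with_locals`` are hypothetical stand-ins, and the timings it prints depend on the interpreter and machine.

```python
import timeit


def func1(a, b):
    # Hypothetical stand-in for a small global helper such as Pystone's Func1.
    return a == b


def loop_with_globals(n):
    # Every call looks the name func1 up in the module namespace.
    hits = 0
    for i in range(n):
        if func1(i, i):
            hits += 1
    return hits


def loop_with_locals(n):
    # Bind the global once, before the loop, then call through the local.
    f1 = func1
    hits = 0
    for i in range(n):
        if f1(i, i):
            hits += 1
    return hits


if __name__ == "__main__":
    for fn in (loop_with_globals, loop_with_locals):
        seconds = timeit.timeit(lambda: fn(50_000), number=5)
        print(f"{fn.__name__}: {seconds:.3f}s")
```

Both loops compute the same result; only the way the callee is located differs, which is the cost this PEP aims to remove without hand-written aliases.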
Proposed Change
|
Proposed Change
|
||||||
|
===============
|
||||||
|
|
||||||
I propose that the Python virtual machine be modified to include
|
I propose that the Python virtual machine be modified to include
|
||||||
TRACK_OBJECT and UNTRACK_OBJECT opcodes. TRACK_OBJECT would
|
``TRACK_OBJECT`` and ``UNTRACK_OBJECT`` opcodes. ``TRACK_OBJECT`` would
|
||||||
associate a global name or attribute of a global name with a slot
|
associate a global name or attribute of a global name with a slot in the local
|
||||||
in the local variable array and perform an initial lookup of the
|
variable array and perform an initial lookup of the associated object to fill
|
||||||
associated object to fill in the slot with a valid value. The
|
in the slot with a valid value. The association it creates would be noted by
|
||||||
association it creates would be noted by the code responsible for
|
the code responsible for changing the name-to-object binding to cause the
|
||||||
changing the name-to-object binding to cause the associated local
|
associated local variable to be updated. The ``UNTRACK_OBJECT`` opcode would
|
||||||
variable to be updated. The UNTRACK_OBJECT opcode would delete
|
delete any association between the name and the local variable slot.
|
||||||
any association between the name and the local variable slot.
|
|
||||||
|
|
||||||
|
|
||||||
Threads
|
Threads
|
||||||
|
=======
|
||||||
|
|
||||||
Operation of this code in threaded programs will be no different
|
Operation of this code in threaded programs will be no different than in
|
||||||
than in unthreaded programs. If you need to lock an object to
|
unthreaded programs. If you need to lock an object to access it, you would
|
||||||
access it, you would have had to do that before TRACK_OBJECT would
|
have had to do that before ``TRACK_OBJECT`` would have been executed and
|
||||||
have been executed and retain that lock until after you stop using
|
retain that lock until after you stop using it.
|
||||||
it.
|
|
||||||
|
|
||||||
FIXME: I suspect I need more here.
|
FIXME: I suspect I need more here.
|
||||||
|
|
||||||
|
|
||||||
Rationale
|
Rationale
|
||||||
|
=========
|
||||||
|
|
||||||
Global variables and attributes rarely change. For example, once
|
Global variables and attributes rarely change. For example, once a function
|
||||||
a function imports the math module, the binding between the name
|
imports the math module, the binding between the name *math* and the
|
||||||
"math" and the module it refers to aren't likely to change.
|
module it refers to isn't likely to change. Similarly, if the function that
|
||||||
Similarly, if the function that uses the math module refers to its
|
uses the ``math`` module refers to its *sin* attribute, it's unlikely to
|
||||||
"sin" attribute, it's unlikely to change. Still, every time the
|
change. Still, every time the module wants to call the ``math.sin`` function,
|
||||||
module wants to call the math.sin function, it must first execute
|
it must first execute a pair of instructions::
|
||||||
a pair of instructions:
|
|
||||||
|
|
||||||
LOAD_GLOBAL math
|
LOAD_GLOBAL math
|
||||||
LOAD_ATTR sin
|
LOAD_ATTR sin
|
||||||
|
|
||||||
If the client module always assumed that math.sin was a local
|
If the client module always assumed that ``math.sin`` was a local constant and
|
||||||
constant and it was the responsibility of "external forces"
|
it was the responsibility of "external forces" outside the function to keep
|
||||||
outside the function to keep the reference correct, we might have
|
the reference correct, we might have code like this::
|
||||||
code like this:
|
|
||||||
|
|
||||||
TRACK_OBJECT math.sin
|
TRACK_OBJECT math.sin
|
||||||
...
|
...
|
||||||
LOAD_FAST math.sin
|
LOAD_FAST math.sin
|
||||||
...
|
...
|
||||||
UNTRACK_OBJECT math.sin
|
UNTRACK_OBJECT math.sin
|
||||||
|
|
||||||
If the LOAD_FAST was in a loop the payoff in reduced global loads
|
If the ``LOAD_FAST`` was in a loop the payoff in reduced global loads and
|
||||||
and attribute lookups could be significant.
|
attribute lookups could be significant.
|
||||||
|
|
||||||
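The instruction pair can be observed with the standard ``dis`` module. This is a sketch for a current CPython; ``use_sin`` is a hypothetical function, and the exact opcode names and surrounding instructions vary between interpreter versions.

```python
import dis
import math


def use_sin():
    # A plain attribute access on a module-level name.
    s = math.sin
    return s


# Collect the opcode names the compiler generated for use_sin.
opnames = [ins.opname for ins in dis.get_instructions(use_sin)]
print(opnames)
```

The listing should contain a global load for ``math`` followed by an attribute load for ``sin``, whereas reading the local ``s`` is a fast-locals access.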
This technique could, in theory, be applied to any global variable
|
This technique could, in theory, be applied to any global variable access or
|
||||||
access or attribute lookup. Consider this code:
|
attribute lookup. Consider this code::
|
||||||
|
|
||||||
l = []
|
l = []
|
||||||
for i in range(10):
|
for i in range(10):
|
||||||
l.append(math.sin(i))
|
l.append(math.sin(i))
|
||||||
return l
|
return l
|
||||||
|
|
||||||
Even though l is a local variable, you still pay the cost of
|
Even though *l* is a local variable, you still pay the cost of loading
|
||||||
loading l.append ten times in the loop. The compiler (or an
|
``l.append`` ten times in the loop. The compiler (or an optimizer) could
|
||||||
optimizer) could recognize that both math.sin and l.append are
|
recognize that both ``math.sin`` and ``l.append`` are being called in the loop
|
||||||
being called in the loop and decide to generate the tracked local
|
and decide to generate the tracked local code, avoiding it for the builtin
|
||||||
code, avoiding it for the builtin range() function because it's
|
``range()`` function because it's only called once during loop setup.
|
||||||
only called once during loop setup. Performance issues related to
|
Performance issues related to accessing local variables make tracking
|
||||||
accessing local variables make tracking l.append less attractive
|
``l.append`` less attractive than tracking globals such as ``math.sin``.
|
||||||
than tracking globals such as math.sin.
|
|
||||||
|
|
||||||
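What a compiler or optimizer might generate for that loop can be imitated by hand. A minimal sketch, assuming the hypothetical function names ``sines_plain`` and ``sines_hoisted``:

```python
import math


def sines_plain(n):
    # Looks up math, then sin, then l.append on every iteration.
    l = []
    for i in range(n):
        l.append(math.sin(i))
    return l


def sines_hoisted(n):
    # Resolve both attributes once and reuse them as fast locals.
    l = []
    append = l.append
    sin = math.sin
    for i in range(n):
        append(sin(i))
    return l
```

The hand-written aliases are snapshots: if ``math.sin`` were rebound while the loop ran, ``sines_hoisted`` would keep calling the old object. Closing that gap is the job the proposed tracking opcodes would take on.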
According to a post to python-dev by Marc-Andre Lemburg [1],
|
According to a post to python-dev by Marc-Andre Lemburg [1]_, ``LOAD_GLOBAL``
|
||||||
LOAD_GLOBAL opcodes account for over 7% of all instructions
|
opcodes account for over 7% of all instructions executed by the Python virtual
|
||||||
executed by the Python virtual machine. This can be a very
|
machine. This can be a very expensive instruction, at least relative to a
|
||||||
expensive instruction, at least relative to a LOAD_FAST
|
``LOAD_FAST`` instruction, which is a simple array index and requires no extra
|
||||||
instruction, which is a simple array index and requires no extra
|
function calls by the virtual machine. I believe many ``LOAD_GLOBAL``
|
||||||
function calls by the virtual machine. I believe many LOAD_GLOBAL
|
instructions and ``LOAD_GLOBAL/LOAD_ATTR`` pairs could be converted to
|
||||||
instructions and LOAD_GLOBAL/LOAD_ATTR pairs could be converted to
|
``LOAD_FAST`` instructions.
|
||||||
LOAD_FAST instructions.
|
|
||||||
|
|
||||||
Code that uses global variables heavily often resorts to various
|
Code that uses global variables heavily often resorts to various tricks to
|
||||||
tricks to avoid global variable and attribute lookup. The
|
avoid global variable and attribute lookup. The aforementioned
|
||||||
aforementioned sre_compile._compile function caches the append
|
``sre_compile._compile`` function caches the ``append`` method of the growing
|
||||||
method of the growing output list. Many people commonly abuse
|
output list. Many people commonly abuse functions' default argument feature
|
||||||
functions' default argument feature to cache global variable
|
to cache global variable lookups. Both of these schemes are hackish and
|
||||||
lookups. Both of these schemes are hackish and rarely address all
|
rarely address all the available opportunities for optimization. (For
|
||||||
the available opportunities for optimization. (For example,
|
example, ``sre_compile._compile`` does not cache the two globals that it uses
|
||||||
sre_compile._compile does not cache the two globals that it uses
|
most frequently: the builtin ``len`` function and the global ``OPCODES`` array
|
||||||
most frequently: the builtin len function and the global OPCODES
|
that it imports from ``sre_constants.py``.)
|
||||||
array that it imports from sre_constants.py.)
|
|
||||||
|
|
||||||
|
|
||||||
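For readers unfamiliar with the default-argument trick mentioned above, here is a small sketch. ``count_long`` is a hypothetical function, not code from ``sre_compile``.

```python
def count_long(items, limit=3, len=len):
    # The default value is evaluated once, when the def statement runs,
    # so inside the body len is a fast local rather than a builtin lookup.
    total = 0
    for item in items:
        if len(item) > limit:
            total += 1
    return total


print(count_long(["a", "abcd", "abcde"]))
```

The cost is a polluted signature: a caller can pass a third argument by accident, and the cached value is frozen at definition time.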
Questions
|
Questions
|
||||||
|
=========
|
||||||
|
|
||||||
Q. What about threads? What if math.sin changes while in cache?
|
What about threads? What if ``math.sin`` changes while in cache?
|
||||||
|
-----------------------------------------------------------------
|
||||||
|
|
||||||
A. I believe the global interpreter lock will protect values from
|
I believe the global interpreter lock will protect values from being
|
||||||
being corrupted. In any case, the situation would be no worse
|
corrupted. In any case, the situation would be no worse than it is today.
|
||||||
than it is today. If one thread modified math.sin after another
|
If one thread modified ``math.sin`` after another thread had already executed
|
||||||
thread had already executed "LOAD_GLOBAL math", but before it
|
``LOAD_GLOBAL math``, but before it executed ``LOAD_ATTR sin``, the client
|
||||||
executed "LOAD_ATTR sin", the client thread would see the old
|
thread would see the old value of ``math.sin``.
|
||||||
value of math.sin.
|
|
||||||
|
|
||||||
The idea is this. I use a multi-attribute load below as an
|
The idea is this. I use a multi-attribute load below as an example, not
|
||||||
example, not because it would happen very often, but because by
|
because it would happen very often, but because by demonstrating the recursive
|
||||||
demonstrating the recursive nature with an extra call hopefully
|
nature with an extra call hopefully it will become clearer what I have in
|
||||||
it will become clearer what I have in mind. Suppose a function
|
mind. Suppose a function defined in module ``foo`` wants to access
|
||||||
defined in module foo wants to access spam.eggs.ham and that
|
``spam.eggs.ham`` and that ``spam`` is a module imported at the module level
|
||||||
spam is a module imported at the module level in foo:
|
in ``foo``::
|
||||||
|
|
||||||
import spam
|
import spam
|
||||||
...
|
...
|
||||||
def somefunc():
|
def somefunc():
|
||||||
...
|
...
|
||||||
x = spam.eggs.ham
|
x = spam.eggs.ham
|
||||||
|
|
||||||
Upon entry to somefunc, a TRACK_GLOBAL instruction will be
|
Upon entry to ``somefunc``, a ``TRACK_GLOBAL`` instruction will be executed::
|
||||||
executed:
|
|
||||||
|
|
||||||
TRACK_GLOBAL spam.eggs.ham n
|
TRACK_GLOBAL spam.eggs.ham n
|
||||||
|
|
||||||
"spam.eggs.ham" is a string literal stored in the function's
|
*spam.eggs.ham* is a string literal stored in the function's constants
|
||||||
constants array. "n" is a fastlocals index. "&fastlocals[n]"
|
array. *n* is a fastlocals index. ``&fastlocals[n]`` is a reference to
|
||||||
is a reference to slot "n" in the executing frame's fastlocals
|
slot *n* in the executing frame's ``fastlocals`` array, the location in
|
||||||
array, the location in which the spam.eggs.ham reference will
|
which the *spam.eggs.ham* reference will be stored. Here's what I envision
|
||||||
be stored. Here's what I envision happening:
|
happening:
|
||||||
|
|
||||||
1. The TRACK_GLOBAL instruction locates the object referred to
|
1. The ``TRACK_GLOBAL`` instruction locates the object referred to by the name
|
||||||
by the name "spam" and finds it in its module scope. It
|
*spam* and finds it in its module scope. It then executes a C function
|
||||||
then executes a C function like
|
like::
|
||||||
|
|
||||||
_PyObject_TrackName(m, "spam.eggs.ham", &fastlocals[n])
|
_PyObject_TrackName(m, "spam.eggs.ham", &fastlocals[n])
|
||||||
|
|
||||||
where "m" is the module object with an attribute "spam".
|
where ``m`` is the module object with an attribute ``spam``.
|
||||||
|
|
||||||
2. The module object strips the leading "spam." stores the
|
2. The module object strips the leading *spam.* and stores the necessary
|
||||||
necessary information ("eggs.ham" and &fastlocals[n]) in
|
information (*eggs.ham* and ``&fastlocals[n]``) in case its binding for the
|
||||||
case its binding for the name "eggs" changes. It then
|
name *eggs* changes. It then locates the object referred to by the key
|
||||||
locates the object referred to by the key "eggs" in its
|
*eggs* in its dict and recursively calls::
|
||||||
dict and recursively calls
|
|
||||||
|
|
||||||
_PyObject_TrackName(eggs, "eggs.ham", &fastlocals[n])
|
_PyObject_TrackName(eggs, "eggs.ham", &fastlocals[n])
|
||||||
|
|
||||||
3. The eggs object strips the leading "eggs.", stores the
|
3. The ``eggs`` object strips the leading *eggs.*, stores the
|
||||||
("ham", &fastlocals[n]) info, locates the object in its
|
(*ham*, ``&fastlocals[n]``) info, locates the object in its namespace called
|
||||||
namespace called "ham" and calls _PyObject_TrackName once
|
``ham`` and calls ``_PyObject_TrackName`` once again::
|
||||||
again:
|
|
||||||
|
|
||||||
_PyObject_TrackName(ham, "ham", &fastlocals[n])
|
_PyObject_TrackName(ham, "ham", &fastlocals[n])
|
||||||
|
|
||||||
4. The "ham" object strips the leading string (no "." this
|
4. The ``ham`` object strips the leading string (no "." this time, but that's
|
||||||
time, but that's a minor point), sees that the result is
|
a minor point), sees that the result is empty, then uses its own value
|
||||||
empty, then uses its own value (self, probably) to update
|
(``self``, probably) to update the location it was handed::
|
||||||
the location it was handed:
|
|
||||||
|
|
||||||
Py_XDECREF(&fastlocals[n]);
|
Py_XDECREF(&fastlocals[n]);
|
||||||
&fastlocals[n] = self;
|
&fastlocals[n] = self;
|
||||||
Py_INCREF(&fastlocals[n]);
|
Py_INCREF(&fastlocals[n]);
|
||||||
|
|
||||||
At this point, each object involved in resolving
|
At this point, each object involved in resolving ``spam.eggs.ham``
|
||||||
"spam.eggs.ham" knows which entry in its namespace needs to be
|
knows which entry in its namespace needs to be tracked and what location
|
||||||
tracked and what location to update if that name changes.
|
to update if that name changes. Furthermore, if the one name it is
|
||||||
Furthermore, if the one name it is tracking in its local
|
tracking in its local storage changes, it can call ``_PyObject_TrackName``
|
||||||
storage changes, it can call _PyObject_TrackName using the new
|
using the new object once the change has been made. At the bottom end of
|
||||||
object once the change has been made. At the bottom end of
|
the food chain, the last object will always strip a name, see the empty
|
||||||
the food chain, the last object will always strip a name, see
|
string and know that its value should be stuffed into the location it's
|
||||||
the empty string and know that its value should be stuffed
|
been passed.
|
||||||
into the location it's been passed.
|
|
||||||
|
|
||||||
When the object referred to by the dotted expression
|
When the object referred to by the dotted expression ``spam.eggs.ham``
|
||||||
"spam.eggs.ham" is going to go out of scope, an
|
is going to go out of scope, an ``UNTRACK_GLOBAL spam.eggs.ham n``
|
||||||
"UNTRACK_GLOBAL spam.eggs.ham n" instruction is executed. It
|
instruction is executed. It has the effect of deleting all the tracking
|
||||||
has the effect of deleting all the tracking information that
|
information that ``TRACK_GLOBAL`` established.
|
||||||
TRACK_GLOBAL established.
|
|
||||||
|
|
||||||
The tracking operation may seem expensive, but recall that the
|
The tracking operation may seem expensive, but recall that the objects
|
||||||
objects being tracked are assumed to be "almost constant", so
|
being tracked are assumed to be "almost constant", so the setup cost will
|
||||||
the setup cost will be traded off against hopefully multiple
|
be traded off against hopefully multiple local instead of global loads.
|
||||||
local instead of global loads. For globals with attributes
|
For globals with attributes the tracking setup cost grows but is offset by
|
||||||
the tracking setup cost grows but is offset by avoiding the
|
avoiding the extra ``LOAD_ATTR`` cost. The ``TRACK_GLOBAL`` instruction
|
||||||
extra LOAD_ATTR cost. The TRACK_GLOBAL instruction needs to
|
needs to perform a ``PyDict_GetItemString`` for the first name in the chain
|
||||||
perform a PyDict_GetItemString for the first name in the chain
|
to determine where the top-level object resides. Each object in the chain
|
||||||
to determine where the top-level object resides. Each object
|
has to store a string and an address somewhere, probably in a dict that
|
||||||
in the chain has to store a string and an address somewhere,
|
uses storage locations as keys (e.g. the ``&fastlocals[n]``) and strings as
|
||||||
probably in a dict that uses storage locations as keys
|
values. (This dict could possibly be a central dict of dicts whose keys
|
||||||
(e.g. the &fastlocals[n]) and strings as values. (This dict
|
are object addresses instead of a per-object dict.) It shouldn't be the
|
||||||
could possibly be a central dict of dicts whose keys are
|
other way around because multiple active frames may want to track
|
||||||
object addresses instead of a per-object dict.) It shouldn't
|
``spam.eggs.ham``, but only one frame will want to associate that name with
|
||||||
be the other way around because multiple active frames may
|
one of its fast locals slots.
|
||||||
want to track "spam.eggs.ham", but only one frame will want to
|
|
||||||
associate that name with one of its fast locals slots.
|
|
||||||
|
|
||||||
|
|
||||||
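The recursive registration walked through above can be modelled in pure Python. This is a simplified sketch under stated assumptions, not the proposed C implementation. ``Namespace``, ``Slot``, ``track_name`` and ``untrack_name`` are hypothetical names; each object keeps a per-object dict keyed by slot, holding the dotted path it still has to resolve. The sketch also drops stale registrations when a name is rebound, which the text does not spell out.

```python
class Slot:
    """Stand-in for &fastlocals[n]: a writable cell in a frame's locals."""

    def __init__(self, fastlocals, n):
        self.fastlocals = fastlocals
        self.n = n

    def store(self, value):
        self.fastlocals[self.n] = value


def track_name(obj, path, slot):
    """Model of the hypothetical _PyObject_TrackName(obj, path, slot)."""
    if not path:
        # End of the chain: the object stores itself in the slot.
        slot.store(obj)
        return
    head, _, rest = path.partition(".")
    if isinstance(obj, Namespace):
        # Storage locations are keys, dotted paths are values.
        obj._trackers[slot] = path
    track_name(getattr(obj, head), rest, slot)


def untrack_name(obj, path, slot):
    """Model of UNTRACK_GLOBAL: forget the slot along the current chain."""
    while path and isinstance(obj, Namespace):
        obj._trackers.pop(slot, None)
        head, _, path = path.partition(".")
        obj = getattr(obj, head, None)


class Namespace:
    """Toy object whose attribute rebindings update tracked slots."""

    def __init__(self, **attrs):
        object.__setattr__(self, "_trackers", {})
        for name, value in attrs.items():
            object.__setattr__(self, name, value)

    def __setattr__(self, name, value):
        # Find the slots whose tracked path starts with the rebound name.
        affected = [(slot, path) for slot, path in self._trackers.items()
                    if path.partition(".")[0] == name]
        for slot, path in affected:
            untrack_name(self, path, slot)  # forget the old chain
        object.__setattr__(self, name, value)
        for slot, path in affected:
            track_name(self, path, slot)    # resolve through the new object


foo = Namespace(spam=Namespace(eggs=Namespace(ham="ham-1")))
fastlocals = [None]
slot = Slot(fastlocals, 0)
track_name(foo, "spam.eggs.ham", slot)    # like TRACK_GLOBAL spam.eggs.ham 0
foo.spam.eggs.ham = "ham-2"               # rebinding the leaf updates the slot
foo.spam.eggs = Namespace(ham="ham-3")    # so does rebinding mid-chain
untrack_name(foo, "spam.eggs.ham", slot)  # like UNTRACK_GLOBAL spam.eggs.ham 0
```

In this model, rebinding ``spam.eggs`` to an object with no ``ham`` attribute raises ``AttributeError`` from inside the assignment.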
Unresolved Issues
|
Unresolved Issues
|
||||||
|
=================
|
||||||
|
|
||||||
Threading -
|
Threading
|
||||||
|
---------
|
||||||
|
|
||||||
What about this (dumb) code?
|
What about this (dumb) code? ::
|
||||||
|
|
||||||
l = []
|
l = []
|
||||||
lock = threading.Lock()
|
lock = threading.Lock()
|
||||||
...
|
...
|
||||||
def fill_l():
|
def fill_l():
|
||||||
for i in range(1000):
|
for i in range(1000):
|
||||||
lock.acquire()
|
lock.acquire()
|
||||||
l.append(math.sin(i))
|
l.append(math.sin(i))
|
||||||
lock.release()
|
lock.release()
|
||||||
...
|
...
|
||||||
def consume_l():
|
def consume_l():
|
||||||
while 1:
|
while 1:
|
||||||
lock.acquire()
|
lock.acquire()
|
||||||
if l:
|
if l:
|
||||||
elt = l.pop()
|
elt = l.pop()
|
||||||
lock.release()
|
lock.release()
|
||||||
fiddle(elt)
|
fiddle(elt)
|
||||||
|
|
||||||
It's not clear from a static analysis of the code what the lock is
|
It's not clear from a static analysis of the code what the lock is protecting.
|
||||||
protecting. (You can't tell at compile-time that threads are even
|
(You can't tell at compile-time that threads are even involved can you?)
|
||||||
involved can you?) Would or should it affect attempts to track
|
Would or should it affect attempts to track ``l.append`` or ``math.sin`` in
|
||||||
"l.append" or "math.sin" in the fill_l function?
|
the ``fill_l`` function?
|
||||||
|
|
||||||
If we annotate the code with mythical track_object and untrack_object
|
If we annotate the code with mythical ``track_object`` and ``untrack_object``
|
||||||
builtins (I'm not proposing this, just illustrating where stuff would
|
builtins (I'm not proposing this, just illustrating where stuff would go!), we
|
||||||
go!), we get
|
get::
|
||||||
|
|
||||||
l = []
|
l = []
|
||||||
lock = threading.Lock()
|
lock = threading.Lock()
|
||||||
...
|
...
|
||||||
def fill_l():
|
def fill_l():
|
||||||
track_object("l.append", append)
|
track_object("l.append", append)
|
||||||
track_object("math.sin", sin)
|
track_object("math.sin", sin)
|
||||||
for i in range(1000):
|
for i in range(1000):
|
||||||
lock.acquire()
|
lock.acquire()
|
||||||
append(sin(i))
|
append(sin(i))
|
||||||
lock.release()
|
lock.release()
|
||||||
untrack_object("math.sin", sin)
|
untrack_object("math.sin", sin)
|
||||||
untrack_object("l.append", append)
|
untrack_object("l.append", append)
|
||||||
...
|
...
|
||||||
def consume_l():
|
def consume_l():
|
||||||
while 1:
|
while 1:
|
||||||
lock.acquire()
|
lock.acquire()
|
||||||
if l:
|
if l:
|
||||||
elt = l.pop()
|
elt = l.pop()
|
||||||
lock.release()
|
lock.release()
|
||||||
fiddle(elt)
|
fiddle(elt)
|
||||||
|
|
||||||
Is that correct both with and without threads (or at least equally
|
Is that correct both with and without threads (or at least equally incorrect
|
||||||
incorrect with and without threads)?
|
with and without threads)?
|
||||||
|
|
||||||
Nested Scopes -
|
Nested Scopes
|
||||||
|
-------------
|
||||||
|
|
||||||
The presence of nested scopes will affect where TRACK_GLOBAL finds
|
The presence of nested scopes will affect where ``TRACK_GLOBAL`` finds a
|
||||||
a global variable, but shouldn't affect anything after that. (I
|
global variable, but shouldn't affect anything after that. (I think.)
|
||||||
think.)
|
|
||||||
|
|
||||||
Missing Attributes -
|
Missing Attributes
|
||||||
|
------------------
|
||||||
|
|
||||||
Suppose I am tracking the object referred to by "spam.eggs.ham"
|
Suppose I am tracking the object referred to by ``spam.eggs.ham`` and
|
||||||
and "spam.eggs" is rebound to an object that does not have a "ham"
|
``spam.eggs`` is rebound to an object that does not have a ``ham`` attribute.
|
||||||
attribute. It's clear this will be an AttributeError if the
|
It's clear this will be an ``AttributeError`` if the programmer attempts to
|
||||||
programmer attempts to resolve "spam.eggs.ham" in the current
|
resolve ``spam.eggs.ham`` in the current Python virtual machine, but suppose
|
||||||
Python virtual machine, but suppose the programmer has anticipated
|
the programmer has anticipated this case::
|
||||||
this case:
|
|
||||||
|
|
||||||
if hasattr(spam.eggs, "ham"):
|
if hasattr(spam.eggs, "ham"):
|
||||||
print spam.eggs.ham
|
print spam.eggs.ham
|
||||||
elif hasattr(spam.eggs, "bacon"):
|
elif hasattr(spam.eggs, "bacon"):
|
||||||
print spam.eggs.bacon
|
print spam.eggs.bacon
|
||||||
else:
|
else:
|
||||||
print "what? no meat?"
|
print "what? no meat?"
|
||||||
|
|
||||||
You can't raise an AttributeError when the tracking information is
|
You can't raise an ``AttributeError`` when the tracking information is
|
||||||
recalculated. If it does not raise AttributeError and instead
|
recalculated. If it does not raise ``AttributeError`` and instead lets the
|
||||||
lets the tracking stand, it may be setting the programmer up for a
|
tracking stand, it may be setting the programmer up for a very subtle error.
|
||||||
very subtle error.
|
|
||||||
|
|
||||||
One solution to this problem would be to track the shortest
|
One solution to this problem would be to track the shortest possible root of
|
||||||
possible root of each dotted expression the function refers to
|
each dotted expression the function refers to directly. In the above example,
|
||||||
directly. In the above example, "spam.eggs" would be tracked, but
|
``spam.eggs`` would be tracked, but ``spam.eggs.ham`` and ``spam.eggs.bacon``
|
||||||
"spam.eggs.ham" and "spam.eggs.bacon" would not.
|
would not.
|
||||||
|
|
||||||
Who does the dirty work? -
|
Who does the dirty work?
|
||||||
|
------------------------
|
||||||
|
|
||||||
In the Questions section I postulated the existence of a
|
In the Questions section I postulated the existence of a
|
||||||
_PyObject_TrackName function. While the API is fairly easy to
|
``_PyObject_TrackName`` function. While the API is fairly easy to specify,
|
||||||
specify, the implementation behind-the-scenes is not so obvious.
|
the implementation behind-the-scenes is not so obvious. A central dictionary
|
||||||
A central dictionary could be used to track the name/location
|
could be used to track the name/location mappings, but it appears that all
|
||||||
mappings, but it appears that all setattr functions might need to
|
``setattr`` functions might need to be modified to accommodate this new
|
||||||
be modified to accommodate this new functionality.
|
functionality.
|
||||||
|
|
||||||
If all types used the PyObject_GenericSetAttr function to set
|
If all types used the ``PyObject_GenericSetAttr`` function to set attributes
|
||||||
attributes that would localize the update code somewhat. They
|
that would localize the update code somewhat. They don't however (which is
|
||||||
don't however (which is not too surprising), so it seems that all
|
not too surprising), so it seems that all ``getattrfunc`` and ``getattrofunc``
|
||||||
getattrfunc and getattrofunc functions will have to be updated.
|
functions will have to be updated. In addition, this would place an absolute
|
||||||
In addition, this would place an absolute requirement on C
|
requirement on C extension module authors to call some function when an
|
||||||
extension module authors to call some function when an attribute
|
attribute changes value (``PyObject_TrackUpdate``?).
|
||||||
changes value (PyObject_TrackUpdate?).
|
|
||||||
|
Finally, it's quite possible that some attributes will be set by side effect
|
||||||
|
and not by any direct call to a ``setattr`` method of some sort. Consider a
|
||||||
|
device interface module that has an interrupt routine that copies the contents
|
||||||
|
of a device register into a slot in the object's ``struct`` whenever it
|
||||||
|
changes. In these situations, more extensive modifications would have to be
|
||||||
|
made by the module author. To identify such situations at compile time would
|
||||||
|
be impossible. I think an extra slot could be added to ``PyTypeObjects`` to
|
||||||
|
indicate if an object's code is safe for global tracking. It would have a
|
||||||
|
default value of 0 (``Py_TRACKING_NOT_SAFE``). If an extension module author
|
||||||
|
has implemented the necessary tracking support, that field could be
|
||||||
|
initialized to 1 (``Py_TRACKING_SAFE``). ``_PyObject_TrackName`` could check
|
||||||
|
that field and issue a warning if it is asked to track an object that the
|
||||||
|
author has not explicitly said was safe for tracking.
|
||||||
|
|
||||||
Finally, it's quite possible that some attributes will be set by
|
|
||||||
side effect and not by any direct call to a setattr method of some
|
|
||||||
sort. Consider a device interface module that has an interrupt
|
|
||||||
routine that copies the contents of a device register into a slot
|
|
||||||
in the object's struct whenever it changes. In these situations,
|
|
||||||
more extensive modifications would have to be made by the module
|
|
||||||
author. To identify such situations at compile time would be
|
|
||||||
impossible. I think an extra slot could be added to PyTypeObjects
|
|
||||||
to indicate if an object's code is safe for global tracking. It
|
|
||||||
would have a default value of 0 (Py_TRACKING_NOT_SAFE). If an
|
|
||||||
extension module author has implemented the necessary tracking
|
|
||||||
support, that field could be initialized to 1 (Py_TRACKING_SAFE).
|
|
||||||
_PyObject_TrackName could check that field and issue a warning if
|
|
||||||
it is asked to track an object that the author has not explicitly
|
|
||||||
said was safe for tracking.
|
|
||||||
|
|
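
The safety flag described above could behave roughly as in the following Python sketch. The constant values 0 and 1 come from the text; the ``tracking_safe`` class attribute, the ``track_name`` helper, its boolean return value, and the two device classes are hypothetical stand-ins for the proposed type slot and for ``_PyObject_TrackName``, chosen only to make the check concrete.

```python
import warnings

PY_TRACKING_NOT_SAFE = 0  # default value named in the text
PY_TRACKING_SAFE = 1      # set by authors who implemented tracking support


def track_name(obj, name):
    """Hypothetical stand-in for the check inside _PyObject_TrackName.

    Returns True if the object's type is marked safe. Otherwise it warns
    and returns False (the return value is an assumption of this sketch).
    """
    flag = getattr(type(obj), "tracking_safe", PY_TRACKING_NOT_SAFE)
    if flag != PY_TRACKING_SAFE:
        warnings.warn(
            f"{type(obj).__name__} is not marked safe for tracking {name!r}",
            RuntimeWarning,
            stacklevel=2,
        )
        return False
    return True


class PlainDevice:
    """The author did nothing, so the default (not safe) applies."""


class TrackedDevice:
    """The author opted in after adding the necessary update calls."""
    tracking_safe = PY_TRACKING_SAFE
```

Defaulting to "not safe" means that an extension type whose attributes change by side effect, such as the interrupt-driven device example, is flagged unless its author explicitly opts in.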
||||||
Discussion
|
Discussion
|
||||||
|
==========
|
||||||
|
|
||||||
Jeremy Hylton has an alternate proposal on the table [2]. His
|
Jeremy Hylton has an alternate proposal on the table [2]_. His proposal seeks
|
||||||
proposal seeks to create a hybrid dictionary/list object for use
|
to create a hybrid dictionary/list object for use in global name lookups that
|
||||||
in global name lookups that would make global variable access look
|
would make global variable access look more like local variable access. While
|
||||||
more like local variable access. While there is no C code
|
there is no C code available to examine, the Python implementation given in
|
||||||
available to examine, the Python implementation given in his
|
his proposal still appears to require dictionary key lookup. It doesn't
|
||||||
proposal still appears to require dictionary key lookup. It
|
appear that his proposal could speed local variable attribute lookup, which
|
||||||
doesn't appear that his proposal could speed local variable
|
might be worthwhile in some situations if potential performance burdens could
|
||||||
attribute lookup, which might be worthwhile in some situations if
|
be addressed.
|
||||||
potential performance burdens could be addressed.
|
|
||||||
|
|
||||||
|
|
||||||
Backwards Compatibility
|
Backwards Compatibility
|
||||||
|
=======================
|
||||||
|
|
||||||
I don't believe there will be any serious issues of backward
|
I don't believe there will be any serious issues of backward compatibility.
|
||||||
compatibility. Obviously, Python bytecode that contains
|
Obviously, Python bytecode that contains ``TRACK_OBJECT`` opcodes could not be
|
||||||
TRACK_OBJECT opcodes could not be executed by earlier versions of
|
executed by earlier versions of the interpreter, but breakage at the bytecode
|
||||||
the interpreter, but breakage at the bytecode level is often
|
level is often assumed between versions.
|
||||||
assumed between versions.
|
|
||||||
|
|
||||||
|
|
||||||
Implementation
|
Implementation
|
||||||
|
==============
|
||||||
|
|
||||||
TBD. This is where I need help. I believe there should be either
|
TBD. This is where I need help. I believe there should be either a central
|
||||||
a central name/location registry or the code that modifies object
|
name/location registry or the code that modifies object attributes should be
|
||||||
attributes should be modified, but I'm not sure the best way to go
|
modified, but I'm not sure the best way to go about this. If you look at the
|
||||||
about this. If you look at the code that implements the
|
code that implements the ``STORE_GLOBAL`` and ``STORE_ATTR`` opcodes, it seems
|
||||||
STORE_GLOBAL and STORE_ATTR opcodes, it seems likely that some
|
likely that some changes will be required to ``PyDict_SetItem`` and
|
||||||
changes will be required to PyDict_SetItem and PyObject_SetAttr or
|
``PyObject_SetAttr`` or their String variants. Ideally, there'd be a fairly
|
||||||
their String variants. Ideally, there'd be a fairly central place
|
central place to localize these changes. If you begin considering tracking
|
||||||
to localize these changes. If you begin considering tracking
|
attributes of local variables you get into issues of modifying ``STORE_FAST``
|
||||||
attributes of local variables you get into issues of modifying
|
as well, which could be a problem, since the name bindings for local variables
|
||||||
STORE_FAST as well, which could be a problem, since the name
|
are changed much more frequently. (I think an optimizer could avoid inserting
|
||||||
bindings for local variables are changed much more frequently. (I
|
the tracking code for the attributes for any local variables where the
|
||||||
think an optimizer could avoid inserting the tracking code for the
|
variable's name binding changes.)
|
||||||
attributes for any local variables where the variable's name
|
|
||||||
binding changes.)
|
|
||||||
|
|
||||||
|
|
||||||
Performance
|
Performance
|
||||||
|
===========
|
||||||
|
|
||||||
I believe (though I have no code to prove it at this point), that
|
I believe (though I have no code to prove it at this point), that implementing
|
||||||
implementing TRACK_OBJECT will generally not be much more
|
``TRACK_OBJECT`` will generally not be much more expensive than a single
|
||||||
expensive than a single LOAD_GLOBAL instruction or a
|
``LOAD_GLOBAL`` instruction or a ``LOAD_GLOBAL``/``LOAD_ATTR`` pair. An
|
||||||
LOAD_GLOBAL/LOAD_ATTR pair. An optimizer should be able to avoid
|
optimizer should be able to avoid converting ``LOAD_GLOBAL`` and
|
||||||
converting LOAD_GLOBAL and LOAD_GLOBAL/LOAD_ATTR to the new scheme
|
``LOAD_GLOBAL``/``LOAD_ATTR`` to the new scheme unless the object access
|
||||||
unless the object access occurred within a loop. Further down the
|
occurred within a loop. Further down the line, a register-oriented
|
||||||
line, a register-oriented replacement for the current Python
|
replacement for the current Python virtual machine [3]_ could conceivably
|
||||||
virtual machine [3] could conceivably eliminate most of the
|
eliminate most of the ``LOAD_FAST`` instructions as well.
|
||||||
LOAD_FAST instructions as well.
|
|
||||||
|
|
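
For comparison, the manual idiom that tracking inside a loop would in effect automate looks like this in plain Python. The function names are made up for this example. Note the trade-off: the hand-hoisted version does not notice if ``math.sqrt`` is rebound while the loop runs, which is the gap the tracking mechanism is meant to close.

```python
import math


def sum_sqrt_global(values):
    # math.sqrt is resolved (global lookup plus attribute lookup)
    # on every iteration of the loop.
    total = 0.0
    for v in values:
        total += math.sqrt(v)
    return total


def sum_sqrt_hoisted(values):
    # The lookup happens once, before the loop; the loop body then
    # refers only to a local name.
    sqrt = math.sqrt
    total = 0.0
    for v in values:
        total += sqrt(v)
    return total
```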
||||||
The number of tracked objects should be relatively small. All
|
The number of tracked objects should be relatively small. All active frames
|
||||||
active frames of all active threads could conceivably be tracking
|
of all active threads could conceivably be tracking objects, but this seems
|
||||||
objects, but this seems small compared to the number of functions
|
small compared to the number of functions defined in a given application.
|
||||||
defined in a given application.
|
|
||||||
|
|
||||||
|
|
||||||
References
|
References
|
||||||
|
==========
|
||||||
|
|
||||||
[1] http://mail.python.org/pipermail/python-dev/2000-July/007609.html
|
.. [1] http://mail.python.org/pipermail/python-dev/2000-July/007609.html
|
||||||
|
|
||||||
[2] http://www.zope.org/Members/jeremy/CurrentAndFutureProjects/FastGlobalsPEP
|
.. [2] http://www.zope.org/Members/jeremy/CurrentAndFutureProjects/FastGlobalsPEP
|
||||||
|
|
||||||
[3] http://www.musi-cal.com/~skip/python/rattlesnake20010813.tar.gz
|
.. [3] http://www.musi-cal.com/~skip/python/rattlesnake20010813.tar.gz
|
||||||
|
|
||||||
|
|
||||||
Copyright
|
Copyright
|
||||||
|
=========
|
||||||
|
|
||||||
This document has been placed in the public domain.
|
This document has been placed in the public domain.
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
Local Variables:
|
..
|
||||||
mode: indented-text
|
Local Variables:
|
||||||
indent-tabs-mode: nil
|
mode: indented-text
|
||||||
fill-column: 70
|
indent-tabs-mode: nil
|
||||||
End:
|
fill-column: 70
|
||||||
|
End:
|
||||||
|
|