598 lines
19 KiB
Plaintext
598 lines
19 KiB
Plaintext
PEP: 511
|
||
Title: API for code transformers
|
||
Version: $Revision$
|
||
Last-Modified: $Date$
|
||
Author: Victor Stinner <victor.stinner@gmail.com>
|
||
Status: Draft
|
||
Type: Standards Track
|
||
Content-Type: text/x-rst
|
||
Created: 4-January-2016
|
||
Python-Version: 3.6
|
||
|
||
Abstract
|
||
========
|
||
|
||
Propose an API to register bytecode and AST transformers. Add also ``-o
|
||
OPTIM_TAG`` command line option to change ``.pyc`` filenames, ``-o
|
||
noopt`` disables the peephole optimizer. Raise an ``ImportError``
|
||
exception on import if the ``.pyc`` file is missing and the code
|
||
transformers required to transform the code are missing. code
|
||
transformers are not needed code transformed ahead of time (loaded from
|
||
``.pyc`` files).
|
||
|
||
|
||
Rationale
|
||
=========
|
||
|
||
Python does not provide a standard way to transform the code. Projects
|
||
transforming the code use various hooks. The MacroPy project uses an
|
||
import hook: it adds its own module finder in ``sys.meta_path`` to
|
||
hook its AST transformer. Another option is to monkey-patch the
|
||
builtin ``compile()`` function. There are even more options to
|
||
hook a code transformer.
|
||
|
||
Python 3.4 added a ``compile_source()`` method to
|
||
``importlib.abc.SourceLoader``. But code transformation is wider than
|
||
just importing modules, see described use cases below.
|
||
|
||
Writing an optimizer or a preprocessor is out of the scope of this PEP.
|
||
|
||
Usage 1: AST optimizer
|
||
----------------------
|
||
|
||
Transforming an Abstract Syntax Tree (AST) is a convenient
|
||
way to implement an optimizer. It's easier to work on the AST than
|
||
working on the bytecode, AST contains more information and is more high
|
||
level.
|
||
|
||
Since the optimization can done ahead of time, complex but slow
|
||
optimizations can be implemented.
|
||
|
||
Example of optimizations which can be implemented with an AST optimizer:
|
||
|
||
* `Copy propagation
|
||
<https://en.wikipedia.org/wiki/Copy_propagation>`_:
|
||
replace ``x=1; y=x`` with ``x=1; y=1``
|
||
* `Constant folding
|
||
<https://en.wikipedia.org/wiki/Constant_folding>`_:
|
||
replace ``1+1`` with ``2``
|
||
* `Dead code elimination
|
||
<https://en.wikipedia.org/wiki/Dead_code_elimination>`_
|
||
|
||
Using guards (see the `PEP 510
|
||
<https://www.python.org/dev/peps/pep-0510/>`_), it is possible to
|
||
implement a much wider choice of optimizations. Examples:
|
||
|
||
* Simplify iterable: replace ``range(3)`` with ``(0, 1, 2)`` when used
|
||
as iterable
|
||
* `Loop unrolling <https://en.wikipedia.org/wiki/Loop_unrolling>`_
|
||
* Call pure builtins: replace ``len("abc")`` with ``3``
|
||
* Copy used builtin symbols to constants
|
||
* See also `optimizations implemented in fatoptimizer
|
||
<https://fatoptimizer.readthedocs.org/en/latest/optimizations.html>`_,
|
||
a static optimizer for Python 3.6.
|
||
|
||
The following issues can be implemented with an AST optimizer:
|
||
|
||
* `Issue #1346238
|
||
<https://bugs.python.org/issue1346238>`_: A constant folding
|
||
optimization pass for the AST
|
||
* `Issue #2181 <http://bugs.python.org/issue2181>`_:
|
||
optimize out local variables at end of function
|
||
* `Issue #2499 <http://bugs.python.org/issue2499>`_:
|
||
Fold unary + and not on constants
|
||
* `Issue #4264 <http://bugs.python.org/issue4264>`_:
|
||
Patch: optimize code to use LIST_APPEND instead of calling list.append
|
||
* `Issue #7682 <http://bugs.python.org/issue7682>`_:
|
||
Optimisation of if with constant expression
|
||
* `Issue #10399 <https://bugs.python.org/issue10399>`_: AST
|
||
Optimization: inlining of function calls
|
||
* `Issue #11549 <http://bugs.python.org/issue11549>`_:
|
||
Build-out an AST optimizer, moving some functionality out of the
|
||
peephole optimizer
|
||
* `Issue #17068 <http://bugs.python.org/issue17068>`_:
|
||
peephole optimization for constant strings
|
||
* `Issue #17430 <http://bugs.python.org/issue17430>`_:
|
||
missed peephole optimization
|
||
|
||
|
||
Usage 2: Preprocessor
|
||
---------------------
|
||
|
||
A preprocessor can be easily implemented with an AST transformer. A
|
||
preprocessor has various and different usages.
|
||
|
||
Some examples:
|
||
|
||
* Remove debug code like assertions and logs to make the code faster to
|
||
run it for production.
|
||
* `Tail-call Optimization <https://en.wikipedia.org/wiki/Tail_call>`_
|
||
* Add profiling code
|
||
* `Lazy evaluation <https://en.wikipedia.org/wiki/Lazy_evaluation>`_:
|
||
see `lazy_python <https://github.com/llllllllll/lazy_python>`_
|
||
(bytecode transformer) and `lazy macro of MacroPy
|
||
<https://github.com/lihaoyi/macropy#lazy>`_ (AST transformer)
|
||
* Change dictionary literals into collection.OrderedDict instances
|
||
* Declare constants: see `@asconstants of codetransformer
|
||
<https://pypi.python.org/pypi/codetransformer>`_
|
||
* Domain Specific Language (DSL) like SQL queries. The
|
||
Python language itself doesn't need to be modified. Previous attempts
|
||
to implement DSL for SQL like `PEP 335 - Overloadable Boolean
|
||
Operators <https://www.python.org/dev/peps/pep-0335/>`_ was rejected.
|
||
* Pattern Matching of functional languages
|
||
* String Interpolation, but `PEP 498 -- Literal String Interpolation
|
||
<https://www.python.org/dev/peps/pep-0498/>`_ was merged into Python
|
||
3.6.
|
||
|
||
`MacroPy <https://github.com/lihaoyi/macropy>`_ has a long list of
|
||
examples and use cases.
|
||
|
||
This PEP does not add any new code transformer. Using a code transformer
|
||
will require an external module and to register it manually.
|
||
|
||
See also `PyXfuscator <https://bitbucket.org/namn/pyxfuscator>`_: Python
|
||
obfuscator, deobfuscator, and user-assisted decompiler.
|
||
|
||
|
||
Usage 3: Disable all optimization
|
||
---------------------------------
|
||
|
||
Ned Batchelder asked to add an option to disable the peephole optimizer
|
||
because it makes code coverage more difficult to implement. See the
|
||
discussion on the python-ideas mailing list: `Disable all peephole
|
||
optimizations
|
||
<https://mail.python.org/pipermail/python-ideas/2014-May/027893.html>`_.
|
||
|
||
This PEP adds a new ``-o noopt`` command line option to disable the
|
||
peephole optimizer. In Python, it's as easy as::
|
||
|
||
sys.set_code_transformers([])
|
||
|
||
It will fix the `Issue #2506 <https://bugs.python.org/issue2506>`_: Add
|
||
mechanism to disable optimizations.
|
||
|
||
|
||
Usage 4: Write new bytecode optimizers in Python
|
||
------------------------------------------------
|
||
|
||
Python 3.6 optimizes the code using a peephole optimizer. By
|
||
definition, a peephole optimizer has a narrow view of the code and so
|
||
can only implement basic optimizations. The optimizer rewrites the
|
||
bytecode. It is difficult to enhance it, because it written in C.
|
||
|
||
With this PEP, it becomes possible to implement a new bytecode optimizer
|
||
in pure Python and experiment new optimizations.
|
||
|
||
Some optimizations are easier to implement on the AST like constant
|
||
folding, but optimizations on the bytecode are still useful. For
|
||
example, when the AST is compiled to bytecode, useless jumps can be
|
||
emited because the compiler is naive and does not try to optimize
|
||
anything.
|
||
|
||
|
||
Use Cases
|
||
=========
|
||
|
||
This section give examples of use cases explaining when and how code
|
||
transformers will be used.
|
||
|
||
Interactive interpreter
|
||
-----------------------
|
||
|
||
It will be possible to use code transformers with the interactive
|
||
interpreter which is popular in Python and commonly used to demonstrate
|
||
Python.
|
||
|
||
The code is transformed at runtime and so the interpreter can be slower
|
||
when expensive code transformers are used.
|
||
|
||
Build a transformed package
|
||
---------------------------
|
||
|
||
It will be possible to build a package of the transformed code.
|
||
|
||
A transformer can have a configuration. The configuration is not stored
|
||
in the package.
|
||
|
||
All ``.pyc`` files of the package must be transformed with the same code
|
||
transformers and the same transformers configuration.
|
||
|
||
It is possible to build different ``.pyc`` files using different
|
||
optimizer tags. Example: ``fat`` for the default configuration and
|
||
``fat_inline`` for a different configuration with function inlining
|
||
enabled.
|
||
|
||
A package can contain ``.pyc`` files with different optimizer tags.
|
||
|
||
|
||
Install a package containing transformed .pyc files
|
||
---------------------------------------------------
|
||
|
||
It will be possible to install a package which contains transformed
|
||
``.pyc`` files.
|
||
|
||
All ``.pyc`` files with any optimizer tag contained in the package are
|
||
installed, not only for the current optimizer tag.
|
||
|
||
|
||
Build .pyc files when installing a package
|
||
------------------------------------------
|
||
|
||
If a package does not contain any ``.pyc`` files of the current
|
||
optimizer tag (or some ``.pyc`` files are missing), the ``.pyc`` are
|
||
created during the installation.
|
||
|
||
Code transformers of the optimizer tag are required. Otherwise, the
|
||
installation fails with an error.
|
||
|
||
|
||
Execute transformed code
|
||
------------------------
|
||
|
||
It will be possible to execute transformed code.
|
||
|
||
Raise an ``ImportError`` exception on import if the ``.pyc`` file of the
|
||
current optimizer tag is missing and the code transformers required to
|
||
transform the code are missing.
|
||
|
||
The interesting point here is that code transformers are not needed to
|
||
execute the transformed code if all required ``.pyc`` files are already
|
||
available.
|
||
|
||
|
||
Code transformer API
|
||
====================
|
||
|
||
A code transformer is a class with ``ast_transformer()`` and/or
|
||
``code_transformer()`` methods (API described below) and a ``name``
|
||
attribute.
|
||
|
||
For efficiency, do not define a ``code_transformer()`` or
|
||
``ast_transformer()`` method if it does nothing.
|
||
|
||
The ``name`` attribute (``str``) must be a short string used to identify
|
||
an optimizer. It is used to build a ``.pyc`` filename. The name must not
|
||
contain dots (``'.'``), dashes (``'-'``) or directory separators: dots
|
||
are used to separated fields in a ``.pyc`` filename and dashes areused
|
||
to join code transformer names to build the optimizer tag.
|
||
|
||
.. note::
|
||
It would be nice to pass the fully qualified name of a module in the
|
||
*context* when an AST transformer is used to transform a module on
|
||
import, but it looks like the information is not available in
|
||
``PyParser_ASTFromStringObject()``.
|
||
|
||
|
||
code_transformer()
|
||
------------------
|
||
|
||
Prototype::
|
||
|
||
def code_transformer(code, consts, names, lnotab, context):
|
||
...
|
||
return (code, consts, names, lnotab)
|
||
|
||
Parameters:
|
||
|
||
* *code*: the bytecode (``bytes``)
|
||
* *consts*: a sequence of constants
|
||
* *names*: tuple of variable names
|
||
* *lnotab*: table mapping instruction offsets to line numbers
|
||
(``bytes``)
|
||
|
||
The code transformer is run after the compilation to bytecode
|
||
|
||
|
||
ast_transformer()
|
||
------------------
|
||
|
||
Prototype::
|
||
|
||
def ast_transformer(tree, context):
|
||
...
|
||
return tree
|
||
|
||
Parameters:
|
||
|
||
* *tree*: an AST tree
|
||
* *context*: an object with a ``filename`` attribute (``str``)
|
||
|
||
It must return an AST tree. It can modify the AST tree in place, or
|
||
create a new AST tree.
|
||
|
||
The AST transformer is called after the creation of the AST by the
|
||
parser and before the compilation to bytecode. New attributes may be
|
||
added to *context* in the future.
|
||
|
||
|
||
Changes
|
||
=======
|
||
|
||
In short, add:
|
||
|
||
* -o OPTIM_TAG command line option
|
||
* sys.implementation.optim_tag
|
||
* sys.get_code_transformers()
|
||
* sys.set_code_transformers(transformers)
|
||
* ast.Constant
|
||
* ast.PyCF_TRANSFORMED_AST
|
||
|
||
|
||
API to get/set code transformers
|
||
--------------------------------
|
||
|
||
Add new functions to register code transformers:
|
||
|
||
* ``sys.set_code_transformers(transformers)``: set the list of code
|
||
transformers and update ``sys.implementation.optim_tag``
|
||
* ``sys.get_code_transformers()``: get the list of code
|
||
transformers.
|
||
|
||
The order of code transformers matter. Running transformer A and then
|
||
transformer B can give a different output than running transformer B an
|
||
then transformer A.
|
||
|
||
Example to prepend a new code transformer::
|
||
|
||
transformers = sys.get_code_transformers()
|
||
transformers.insert(0, new_cool_transformer)
|
||
sys.set_code_transformers(transformers)
|
||
|
||
All AST tranformers are run sequentially (ex: the second transformer
|
||
gets the input of the first transformer), and then all bytecode
|
||
transformers are run sequentially.
|
||
|
||
|
||
Optimizer tag
|
||
-------------
|
||
|
||
Changes:
|
||
|
||
* Add ``sys.implementation.optim_tag`` (``str``): optimization tag.
|
||
The default optimization tag is ``'opt'``.
|
||
* Add a new ``-o OPTIM_TAG`` command line option to set
|
||
``sys.implementation.optim_tag``.
|
||
|
||
Changes on ``importlib``:
|
||
|
||
* ``importlib`` uses ``sys.implementation.optim_tag`` to build the
|
||
``.pyc`` filename to importing modules, instead of always using
|
||
``opt``. Remove also the special case for the optimizer level ``0``
|
||
with the default optimizer tag ``'opt'`` to simplify the code.
|
||
* When loading a module, if the ``.pyc`` file is missing but the ``.py``
|
||
is available, the ``.py`` is only used if code optimizers have the
|
||
same optimizer tag than the current tag, otherwise an ``ImportError``
|
||
exception is raised.
|
||
|
||
Pseudo-code of a ``use_py()`` function to decide if a ``.py`` file can
|
||
be compiled to import a module::
|
||
|
||
def transformers_tag():
|
||
transformers = sys.get_code_transformers()
|
||
if not transformers:
|
||
return 'noopt'
|
||
return '-'.join(transformer.name
|
||
for transformer in transformers)
|
||
|
||
def use_py():
|
||
return (transformers_tag() == sys.implementation.optim_tag)
|
||
|
||
The order of ``sys.get_code_transformers()`` matter. For example, the
|
||
``fat`` transformer followed by the ``pythran`` transformer gives the
|
||
optimizer tag ``fat-pythran``.
|
||
|
||
The behaviour of the ``importlib`` module is unchanged with the default
|
||
optimizer tag (``'opt'``).
|
||
|
||
|
||
Peephole optimizer
|
||
------------------
|
||
|
||
By default, ``sys.implementation.optim_tag`` is ``opt`` and
|
||
``sys.get_code_transformers()`` returns a list of one code transformer:
|
||
the peephole optimizer (optimize the bytecode).
|
||
|
||
Use ``-o noopt`` to disable the peephole optimizer. In this case, the
|
||
optimizer tag is ``noopt`` and no code transformer is registered.
|
||
|
||
Using the ``-o opt`` option has not effect.
|
||
|
||
|
||
AST enhancements
|
||
----------------
|
||
|
||
Enhancements to simplify the implementation of AST transformers:
|
||
|
||
* Add a new compiler flag ``PyCF_TRANSFORMED_AST`` to get the
|
||
transformed AST. ``PyCF_ONLY_AST`` returns the AST before the
|
||
transformers.
|
||
* Add ``ast.Constant``: this type is not emited by the compiler, but
|
||
can be used in an AST transformer to simplify the code. It does not
|
||
contain line number and column offset informations on tuple or
|
||
frozenset items.
|
||
* ``PyCodeObject.co_lnotab``: line number delta becomes signed to
|
||
support moving instructions (note: need to modify MAGIC_NUMBER in
|
||
importlib). Implemented in the `issue #26107
|
||
<https://bugs.python.org/issue26107>`_
|
||
* Enhance the bytecode compiler to support ``tuple`` and ``frozenset``
|
||
constants. Currently, ``tuple`` and ``frozenset`` constants are
|
||
created by the peephole transformer, after the bytecode compilation.
|
||
* ``marshal`` module: fix serialization of the empty frozenset singleton
|
||
* update ``Tools/parser/unparse.py`` to support the new ``ast.Constant``
|
||
node type
|
||
|
||
|
||
Examples
|
||
========
|
||
|
||
.pyc filenames
|
||
--------------
|
||
|
||
Example of ``.pyc`` filenames of the ``os`` module.
|
||
|
||
With the default optimizer tag ``'opt'``:
|
||
|
||
=========================== ==================
|
||
.pyc filename Optimization level
|
||
=========================== ==================
|
||
``os.cpython-36.opt-0.pyc`` 0
|
||
``os.cpython-36.opt-1.pyc`` 1
|
||
``os.cpython-36.opt-2.pyc`` 2
|
||
=========================== ==================
|
||
|
||
With the ``'fat'`` optimizer tag:
|
||
|
||
=========================== ==================
|
||
.pyc filename Optimization level
|
||
=========================== ==================
|
||
``os.cpython-36.fat-0.pyc`` 0
|
||
``os.cpython-36.fat-1.pyc`` 1
|
||
``os.cpython-36.fat-2.pyc`` 2
|
||
=========================== ==================
|
||
|
||
|
||
Bytecode transformer
|
||
--------------------
|
||
|
||
Scary bytecode transformer replacing all strings with
|
||
``"Ni! Ni! Ni!"``::
|
||
|
||
import sys
|
||
|
||
class BytecodeTransformer:
|
||
name = "knights_who_say_ni"
|
||
|
||
def code_transformer(self, code, consts, names, lnotab, context):
|
||
consts = ['Ni! Ni! Ni!' if isinstance(const, str) else const
|
||
for const in consts]
|
||
return (code, consts, names, lnotab)
|
||
|
||
# replace existing code transformers with the new bytecode transformer
|
||
sys.set_code_transformers([BytecodeTransformer()])
|
||
|
||
# execute code which will be transformed by code_transformer()
|
||
exec("print('Hello World!')")
|
||
|
||
Output::
|
||
|
||
Ni! Ni! Ni!
|
||
|
||
|
||
AST transformer
|
||
---------------
|
||
|
||
Similary to the bytecode transformer example, the AST transformer also
|
||
replaces all strings with ``"Ni! Ni! Ni!"``::
|
||
|
||
import ast
|
||
import sys
|
||
|
||
class KnightsWhoSayNi(ast.NodeTransformer):
|
||
def visit_Str(self, node):
|
||
node.s = 'Ni! Ni! Ni!'
|
||
return node
|
||
|
||
class ASTTransformer:
|
||
name = "knights_who_say_ni"
|
||
|
||
def __init__(self):
|
||
self.transformer = KnightsWhoSayNi()
|
||
|
||
def ast_transformer(self, tree, context):
|
||
self.transformer.visit(tree)
|
||
return tree
|
||
|
||
# replace existing code transformers with the new AST transformer
|
||
sys.set_code_transformers([ASTTransformer()])
|
||
|
||
# execute code which will be transformed by ast_transformer()
|
||
exec("print('Hello World!')")
|
||
|
||
Output::
|
||
|
||
Ni! Ni! Ni!
|
||
|
||
|
||
Other Python implementations
|
||
============================
|
||
|
||
The PEP 511 should be implemented by all Python implementation, but the
|
||
bytecode and the AST are not standardized.
|
||
|
||
By the way, even between minor version of CPython, there are changes on
|
||
the AST API. There are differences, but only minor differences. It is
|
||
quite easy to write an AST transformer which works on Python 2.7 and
|
||
Python 3.5 for example.
|
||
|
||
|
||
Discussion
|
||
==========
|
||
|
||
* `[Python-Dev] AST optimizer implemented in Python
|
||
<https://mail.python.org/pipermail/python-dev/2012-August/121286.html>`_
|
||
(August 2012)
|
||
|
||
|
||
Prior Art
|
||
=========
|
||
|
||
AST optimizers
|
||
--------------
|
||
|
||
In 2011, Eugene Toder proposed to rewrite some peephole optimizations in
|
||
a new AST optimizer: issue #11549, `Build-out an AST optimizer, moving
|
||
some functionality out of the peephole optimizer
|
||
<https://bugs.python.org/issue11549>`_. The patch adds ``ast.Lit`` (it
|
||
was proposed to rename it to ``ast.Literal``).
|
||
|
||
In 2012, Victor Stinner wrote the `astoptimizer
|
||
<https://bitbucket.org/haypo/astoptimizer/>`_ project, an AST optimizer
|
||
implementing various optimizations. Most interesting optimizations break
|
||
the Python semantics since no guard is used to disable optimization if
|
||
something changes.
|
||
|
||
In 2015, Victor Stinner wrote the `fatoptimizer
|
||
<http://fatoptimizer.readthedocs.org/>`_ project, an AST optimizer
|
||
specializing functions using guards.
|
||
|
||
The Issue #17515 `"Add sys.setasthook() to allow to use a custom AST"
|
||
optimizer <https://bugs.python.org/issue17515>`_ was a first attempt of
|
||
API for code transformers, but specific to AST.
|
||
|
||
|
||
Python Preprocessors
|
||
--------------------
|
||
|
||
* `MacroPy <https://github.com/lihaoyi/macropy>`_: MacroPy is an
|
||
implementation of Syntactic Macros in the Python Programming Language.
|
||
MacroPy provides a mechanism for user-defined functions (macros) to
|
||
perform transformations on the abstract syntax tree (AST) of a Python
|
||
program at import time.
|
||
* `pypreprocessor <https://code.google.com/p/pypreprocessor/>`_: C-style
|
||
preprocessor directives in Python, like ``#define`` and ``#ifdef``
|
||
|
||
|
||
Bytecode transformers
|
||
---------------------
|
||
|
||
* `codetransformer <https://pypi.python.org/pypi/codetransformer>`_:
|
||
Bytecode transformers for CPython inspired by the ``ast`` module’s
|
||
``NodeTransformer``.
|
||
* `byteplay <http://code.google.com/p/byteplay/>`_: Byteplay lets you
|
||
convert Python code objects into equivalent objects which are easy to
|
||
play with, and lets you convert those objects back into living Python
|
||
code objects. It's useful for applying crazy transformations on Python
|
||
functions, and is also useful in learning Python byte code
|
||
intricacies. See `byteplay documentation
|
||
<http://wiki.python.org/moin/ByteplayDoc>`_.
|
||
|
||
See also:
|
||
|
||
* `BytecodeAssembler <http://pypi.python.org/pypi/BytecodeAssembler>`_
|
||
|
||
|
||
Copyright
|
||
=========
|
||
|
||
This document has been placed in the public domain.
|