PEP: 511
Title: API for AST transformers
Version: $Revision$
Last-Modified: $Date$
Author: Victor Stinner <victor.stinner@gmail.com>
Status: Draft
Type: Standards Track
Content-Type: text/x-rst
Created: 4-January-2016
Python-Version: 3.6

Abstract
========

Propose an API to support AST transformers. Also add a ``-o OPTIM_TAG``
command line option to change ``.pyc`` filenames. Raise an
``ImportError`` exception on import if the ``.pyc`` file is missing and
the AST transformers required to transform the code are missing.
AST transformers are not needed to run code transformed ahead of time
(loaded from ``.pyc`` files).


Rationale
=========

Python does not provide a standard way to transform the code. Projects
transforming the code use various hooks. The MacroPy project uses an
import hook: it adds its own module finder in ``sys.meta_path`` to
hook its AST transformer. Another option is to monkey-patch the
builtin ``compile()`` function. There are even more options to
hook a code transformer.

Python 3.4 added a ``source_to_code()`` method to
``importlib.abc.SourceLoader``. But code transformation is broader than just
importing modules; see the use cases described below.

Writing an optimizer or a preprocessor is out of the scope of this PEP.

Usage 1: AST optimizer
----------------------

Python 3.6 optimizes the code using a peephole optimizer. By
definition, a peephole optimizer has a narrow view of the code and so
can only implement basic optimizations. The optimizer rewrites the
bytecode. It is difficult to enhance it, because it is written in C.

Transforming an Abstract Syntax Tree (AST) is a convenient
way to implement an optimizer. It is easier to work on the AST than
on the bytecode: the AST contains more information and is higher
level.

Examples of optimizations which can be implemented with an AST optimizer:

* `Copy propagation
  <https://en.wikipedia.org/wiki/Copy_propagation>`_:
  replace ``x=1; y=x`` with ``x=1; y=1``
* `Constant folding
  <https://en.wikipedia.org/wiki/Constant_folding>`_:
  replace ``1+1`` with ``2``
* `Dead code elimination
  <https://en.wikipedia.org/wiki/Dead_code_elimination>`_

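As an illustration of constant folding, here is a minimal sketch built on
``ast.NodeTransformer``. It is not the optimizer of any project cited in
this PEP. It assumes Python 3.9 or newer (the parser emits ``ast.Constant``
nodes and ``ast.unparse()`` is available), it only folds ``+``, ``-`` and
``*`` on numbers, and the names ``ConstantFolder`` and ``fold_constants``
are invented for the example.

```python
import ast
import operator

# Binary operators this sketch knows how to fold (an arbitrary small subset).
_FOLDABLE = {ast.Add: operator.add, ast.Sub: operator.sub, ast.Mult: operator.mul}


class ConstantFolder(ast.NodeTransformer):
    """Replace ``<number> <op> <number>`` with the computed value."""

    def visit_BinOp(self, node):
        # Fold the operands first so that nested expressions collapse too.
        self.generic_visit(node)
        func = _FOLDABLE.get(type(node.op))
        if (func is not None
                and isinstance(node.left, ast.Constant)
                and isinstance(node.right, ast.Constant)
                and isinstance(node.left.value, (int, float))
                and isinstance(node.right.value, (int, float))):
            folded = ast.Constant(value=func(node.left.value, node.right.value))
            return ast.copy_location(folded, node)
        return node


def fold_constants(source):
    """Parse *source*, fold constants and return the transformed AST."""
    tree = ConstantFolder().visit(ast.parse(source))
    return ast.fix_missing_locations(tree)


print(ast.unparse(fold_constants("x = (1 + 1) * 3")))  # x = 6
```

Expressions involving names, such as ``a + 1``, are left untouched because
their value is not known when the AST is transformed.
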
Using guards (see `PEP 510
<https://www.python.org/dev/peps/pep-0510/>`_), it is possible to
implement a much wider range of optimizations. Examples:

* Simplify iterable: replace ``range(3)`` with ``(0, 1, 2)`` when used
  as iterable
* `Loop unrolling <https://en.wikipedia.org/wiki/Loop_unrolling>`_
* Call pure builtins: replace ``len("abc")`` with ``3``
* Copy used builtin symbols to constants
* See also `optimizations implemented in fatoptimizer
  <https://fatoptimizer.readthedocs.org/en/latest/optimizations.html>`_,
  a static optimizer for Python 3.6.

The following issues could be addressed with an AST optimizer:

* `Issue #1346238
  <https://bugs.python.org/issue1346238>`_: A constant folding
  optimization pass for the AST
* `Issue #2181 <http://bugs.python.org/issue2181>`_:
  optimize out local variables at end of function
* `Issue #2499 <http://bugs.python.org/issue2499>`_:
  Fold unary + and not on constants
* `Issue #4264 <http://bugs.python.org/issue4264>`_:
  Patch: optimize code to use LIST_APPEND instead of calling list.append
* `Issue #7682 <http://bugs.python.org/issue7682>`_:
  Optimisation of if with constant expression
* `Issue #10399 <https://bugs.python.org/issue10399>`_: AST
  Optimization: inlining of function calls
* `Issue #11549 <http://bugs.python.org/issue11549>`_:
  Build-out an AST optimizer, moving some functionality out of the
  peephole optimizer
* `Issue #17068 <http://bugs.python.org/issue17068>`_:
  peephole optimization for constant strings
* `Issue #17430 <http://bugs.python.org/issue17430>`_:
  missed peephole optimization


Usage 2: Preprocessor
---------------------

A preprocessor can be easily implemented with an AST transformer. A
preprocessor has many different uses. Examples:

* Remove debug code like assertions and logs to make the code faster
  when run in production.
* `Tail-call Optimization <https://en.wikipedia.org/wiki/Tail_call>`_
* Add profiling code
* `Lazy evaluation <https://en.wikipedia.org/wiki/Lazy_evaluation>`_:
  see `lazy_python <https://github.com/llllllllll/lazy_python>`_
  (bytecode transformer) and `lazy macro of MacroPy
  <https://github.com/lihaoyi/macropy#lazy>`_ (AST transformer)
* Change dictionary literals into ``collections.OrderedDict`` instances
* Declare constants: see `@asconstants of codetransformer
  <https://pypi.python.org/pypi/codetransformer>`_
* Domain Specific Language (DSL) like SQL queries. The
  Python language itself doesn't need to be modified. Previous attempts to
  implement a DSL for SQL, like `PEP 335 - Overloadable Boolean Operators
  <https://www.python.org/dev/peps/pep-0335/>`_, were rejected.
* Pattern Matching of functional languages
* String Interpolation, but `PEP 498 -- Literal String Interpolation
  <https://www.python.org/dev/peps/pep-0498/>`_ was merged into Python 3.6.

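The first item of the list, removing assertions, can be sketched in a few
lines. This is only an illustration under stated assumptions: the names
``StripAsserts`` and ``strip_asserts`` are invented here, and the code is
compiled explicitly with ``compile()`` because the registration API
proposed by this PEP is not assumed to be available.

```python
import ast


class StripAsserts(ast.NodeTransformer):
    """Replace every ``assert`` statement with ``pass``."""

    def visit_Assert(self, node):
        # A block must not become empty, so ``pass`` takes the place
        # of the removed assertion.
        return ast.copy_location(ast.Pass(), node)


def strip_asserts(source, filename="<preprocessed>"):
    """Return a code object for *source* with all assertions removed."""
    tree = StripAsserts().visit(ast.parse(source, filename))
    ast.fix_missing_locations(tree)
    return compile(tree, filename, "exec")


source = """
def half(n):
    assert n % 2 == 0, "n must be even"
    return n // 2
"""
namespace = {}
exec(strip_asserts(source), namespace)
print(namespace["half"](3))  # the assertion no longer fires
```
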
`MacroPy <https://github.com/lihaoyi/macropy>`_ has a long list of
examples and use cases.

See also `PyXfuscator <https://bitbucket.org/namn/pyxfuscator>`_: Python
obfuscator, deobfuscator, and user-assisted decompiler.


Use Cases
=========

This section gives examples of use cases explaining when and how AST
transformers will be used.

Interactive interpreter
-----------------------

It will be possible to use AST transformers with the interactive
interpreter, which is popular in Python and commonly used to demonstrate
Python.

The code is transformed at runtime and so the interpreter can be slower
when expensive AST transformers are used.

Build a transformed package
---------------------------

It will be possible to build a package of the transformed code.

A transformer can have a configuration. The configuration is not stored
in the package.

All ``.pyc`` files of the package must be transformed with the same AST
transformers and the same transformer configuration.

It is possible to build different ``.pyc`` files using different
optimizer tags. Example: ``fat`` for the default configuration and
``fat_inline`` for a different configuration with function inlining
enabled.

A package can contain ``.pyc`` files with different optimizer tags.


Install a package containing transformed .pyc files
---------------------------------------------------

It will be possible to install a package which contains transformed
``.pyc`` files.

All ``.pyc`` files contained in the package are installed, whatever their
optimizer tag, not only those for the current optimizer tag.


Build .pyc files when installing a package
------------------------------------------

If a package does not contain any ``.pyc`` files of the current
optimizer tag (or some ``.pyc`` files are missing), the ``.pyc`` files are
created during the installation.

The AST transformers of the optimizer tag are required; otherwise, the
installation fails with an error.


Execute transformed code
------------------------

It will be possible to execute transformed code.

Raise an ``ImportError`` exception on import if the ``.pyc`` file of the
current optimizer tag is missing and the AST transformers required to
transform the code are missing.

The interesting point here is that AST transformers are not needed to
execute the transformed code if all required ``.pyc`` files are already
available.


Changes
=======

This PEP proposes to add an API to register AST transformers.

The transformation can be done ahead of time, which allows implementing
powerful but expensive transformations.


API for AST transformers
------------------------

Add new functions to register AST transformers:

* ``sys.set_ast_transformers(transformers)``: set the list of AST
  transformers
* ``sys.get_ast_transformers()``: get the list of AST
  transformers.

The order of AST transformers matters. Running transformer A and then
transformer B can give a different output than running transformer B and
then transformer A.

API of an AST transformer:

* An AST transformer is a callable object with the prototype::

      def ast_transformer(tree, context):
          ...
          return tree

  where *tree* is an AST tree and *context* is an object with a
  ``filename`` attribute (``str``). New attributes may be added to
  *context* in the future.

* It must return an AST tree.
* It must have a ``name`` attribute (``str``): short string used to identify an
  optimizer. The name must not contain ``.`` (dot) or ``-`` (dash) characters:
  ``.`` is used to separate fields in a ``.pyc`` filename and ``-`` is used
  to join AST transformer names to build the optimizer tag.
* The transformer is called after the creation of the AST by the parser
  and before the compilation to bytecode
* It can modify the AST tree in place, or create a new AST tree.

.. note::
   It would be nice to pass the fully qualified name of a module in the
   *context* when an AST transformer is used to transform a module, but
   it looks like the information is not available in
   ``PyParser_ASTFromStringObject()``.


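The transformer protocol and the importance of ordering can be tried out
without the proposed ``sys`` functions. The sketch below is an assumption
made for illustration only: ``run_transformed`` is a hypothetical driver
that plays the role the compiler would play, the *context* object is a
plain ``types.SimpleNamespace`` with a ``filename`` attribute, and the two
toy transformers rely on the parser emitting ``ast.Constant`` nodes
(Python 3.8 or newer).

```python
import ast
from types import SimpleNamespace


class Doubler:
    """Toy transformer: double every integer constant, in place."""
    name = "double"

    def __call__(self, tree, context):
        for node in ast.walk(tree):
            if isinstance(node, ast.Constant) and type(node.value) is int:
                node.value *= 2
        return tree


class Incrementer:
    """Toy transformer: add one to every integer constant, in place."""
    name = "incr"

    def __call__(self, tree, context):
        for node in ast.walk(tree):
            if isinstance(node, ast.Constant) and type(node.value) is int:
                node.value += 1
        return tree


def run_transformed(source, transformers, filename="<demo>"):
    """Hypothetical driver: parse, apply each transformer in order, execute."""
    for transformer in transformers:
        # Names must not contain '.' or '-' (see the rules above).
        if "." in transformer.name or "-" in transformer.name:
            raise ValueError(f"invalid transformer name: {transformer.name!r}")
    context = SimpleNamespace(filename=filename)
    tree = ast.parse(source, filename)
    for transformer in transformers:
        tree = transformer(tree, context)
    ast.fix_missing_locations(tree)
    namespace = {}
    exec(compile(tree, filename, "exec"), namespace)
    return namespace


print(run_transformed("x = 5", [Doubler(), Incrementer()])["x"])  # 5 * 2 + 1
print(run_transformed("x = 5", [Incrementer(), Doubler()])["x"])  # (5 + 1) * 2
```

The two calls give different values for ``x``, which is why the list of
transformers is ordered.

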
Optimizer tag
-------------

Changes:

* Add ``sys.implementation.optim_tag`` (``str``): optimization tag.
  The default optimization tag is ``'opt'``.
* Add a new ``-o OPTIM_TAG`` command line option to set
  ``sys.implementation.optim_tag``

Changes on ``importlib``:

* ``importlib`` uses ``sys.implementation.optim_tag`` to build the
  ``.pyc`` filename when importing modules, instead of always using
  ``opt``. Also remove the special case for the optimizer level ``0``
  with the default optimizer tag ``'opt'`` to simplify the code.
* When loading a module, if the ``.pyc`` file is missing but the ``.py``
  is available, the ``.py`` is only used if the optimizer tag built from
  the registered AST transformers is the same as the current tag;
  otherwise an ``ImportError`` exception is raised.

Pseudo-code of a ``use_py()`` function to decide if a ``.py`` file can
be compiled to import a module::

    import sys

    def get_ast_optim_tag():
        # 'opt' is the default tag when no transformer is registered
        transformers = sys.get_ast_transformers()
        if not transformers:
            return 'opt'
        # join the transformer names in registration order
        return '-'.join(transformer.name for transformer in transformers)

    def use_py():
        # the .py file is usable only if both tags are the same
        return (get_ast_optim_tag() == sys.implementation.optim_tag)

The order of ``sys.get_ast_transformers()`` matters. For example, the
``fat`` transformer followed by the ``pythran`` transformer gives the
optimizer tag ``fat-pythran``.

The behaviour of the ``importlib`` module is unchanged with the default
optimizer tag (``'opt'``).


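The import rules of this section can be put together in a small
self-contained model. It is a sketch under assumptions, not ``importlib``
code: ``ast_optim_tag`` and ``select_source`` are hypothetical helpers, and
the state of the file system and of the registered transformers is passed
in as plain arguments so that the logic can run on any Python version.

```python
def ast_optim_tag(transformer_names):
    """Build the optimizer tag from transformer names, in registration order."""
    if not transformer_names:
        return "opt"
    return "-".join(transformer_names)


def select_source(pyc_exists, py_exists, transformer_names, optim_tag):
    """Return 'pyc' or 'py', or raise ImportError (hypothetical helper)."""
    if pyc_exists:
        # Code transformed ahead of time: no transformer is needed.
        return "pyc"
    if py_exists and ast_optim_tag(transformer_names) == optim_tag:
        # The registered transformers can rebuild the missing .pyc file.
        return "py"
    raise ImportError(
        f"no usable .pyc or .py file for optimizer tag {optim_tag!r}"
    )


print(ast_optim_tag(["fat", "pythran"]))            # fat-pythran
print(select_source(False, True, ["fat"], "fat"))   # py
print(select_source(True, True, [], "fat"))         # pyc
```

With the ``fat`` tag selected, a missing ``.pyc`` file and no registered
transformer, ``select_source(False, True, [], "fat")`` raises
``ImportError``.

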
AST enhancements
----------------

Enhancements to simplify the implementation of AST transformers:

* Add a new compiler flag ``PyCF_TRANSFORMED_AST`` to get the
  transformed AST. ``PyCF_ONLY_AST`` returns the AST before the
  transformers.
* Add ``ast.Constant``: this type is not emitted by the compiler, but
  can be used in an AST transformer to simplify the code. It does not
  contain line number and column offset information on tuple or
  frozenset items.
* ``PyCodeObject.co_lnotab``: line number delta becomes signed to
  support moving instructions (note: need to modify MAGIC_NUMBER in
  importlib). Implemented in `issue #26107
  <https://bugs.python.org/issue26107>`_
* Enhance the bytecode compiler to support ``tuple`` and ``frozenset``
  constants. Currently, ``tuple`` and ``frozenset`` constants are
  created by the peephole optimizer, after the bytecode compilation.
* ``marshal`` module: fix serialization of the empty frozenset singleton
* Update ``Tools/parser/unparse.py`` to support the new ``ast.Constant``
  node type


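To show how ``ast.Constant`` and ``PyCF_ONLY_AST`` fit together, the
following sketch turns a set display made only of constants into a single
``frozenset`` constant. It assumes a Python version where the parser emits
``ast.Constant`` nodes and the compiler accepts ``frozenset`` values in
them (Python 3.8 or newer); the class name ``SetToFrozenset`` is invented
for the example.

```python
import ast


class SetToFrozenset(ast.NodeTransformer):
    """Replace a set display of constants with one frozenset constant."""

    def visit_Set(self, node):
        self.generic_visit(node)
        if all(isinstance(elt, ast.Constant) for elt in node.elts):
            value = frozenset(elt.value for elt in node.elts)
            return ast.copy_location(ast.Constant(value=value), node)
        return node


# PyCF_ONLY_AST makes compile() stop after parsing and return the AST.
tree = compile("x in {1, 2, 3}", "<demo>", "eval", ast.PyCF_ONLY_AST)
tree = ast.fix_missing_locations(SetToFrozenset().visit(tree))
code = compile(tree, "<demo>", "eval")

print(eval(code, {"x": 2}))
print(eval(code, {"x": 5}))
```

The first call prints ``True`` and the second prints ``False``; the
membership test behaves as before, only the representation of the set in
the AST has changed.

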
Example
=======

.pyc filenames
--------------

Example of ``.pyc`` filenames of the ``os`` module.

With the default optimizer tag ``'opt'``:

===========================  ==================
.pyc filename                Optimization level
===========================  ==================
``os.cpython-36.opt-0.pyc``  0
``os.cpython-36.opt-1.pyc``  1
``os.cpython-36.opt-2.pyc``  2
===========================  ==================

With the ``'fat'`` optimizer tag:

===========================  ==================
.pyc filename                Optimization level
===========================  ==================
``os.cpython-36.fat-0.pyc``  0
``os.cpython-36.fat-1.pyc``  1
``os.cpython-36.fat-2.pyc``  2
===========================  ==================


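The naming scheme of the two tables can be written down as a short
function. ``pyc_filename`` is a hypothetical helper, not an ``importlib``
function, and the cache tag ``cpython-36`` is passed explicitly so that the
output does not depend on the running interpreter.

```python
def pyc_filename(module, cache_tag, optim_tag="opt", level=0):
    """Build '<module>.<cache tag>.<optimizer tag>-<level>.pyc'."""
    if "." in optim_tag:
        # '.' separates the fields of the filename.
        raise ValueError("optimizer tag must not contain '.'")
    return f"{module}.{cache_tag}.{optim_tag}-{level}.pyc"


for tag in ("opt", "fat"):
    for level in range(3):
        print(pyc_filename("os", "cpython-36", tag, level))
```

The loop prints the six filenames listed in the tables above, in the same
order.

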
AST transformer
---------------

Scary AST transformer replacing all strings with ``"Ni! Ni! Ni!"``::

    import ast
    import sys


    class KnightsWhoSayNi(ast.NodeTransformer):
        def visit_Str(self, node):
            node.s = 'Ni! Ni! Ni!'
            return node


    class ASTTransformer:
        name = "knights_who_say_ni"

        def __init__(self):
            self.transformer = KnightsWhoSayNi()

        def __call__(self, tree, context):
            self.transformer.visit(tree)
            return tree


    # register the AST transformer
    sys.set_ast_transformers([ASTTransformer()])

    # execute code which will be transformed by the registered ASTTransformer
    exec("print('Hello World!')")

Output::

    Ni! Ni! Ni!


Other Python implementations
============================

PEP 511 should be implemented by all Python implementations. The AST
emitted by the parser is not specified.

By the way, even between minor versions of CPython, there are changes to
the AST API. The differences are minor, however: it is
quite easy to write an AST transformer which works on Python 2.7 and
Python 3.5 for example.


Discussion
==========

* `[Python-Dev] AST optimizer implemented in Python
  <https://mail.python.org/pipermail/python-dev/2012-August/121286.html>`_
  (August 2012)


Prior Art
=========

AST optimizers
--------------

In 2011, Eugene Toder proposed to rewrite some peephole optimizations in
a new AST optimizer: issue #11549, `Build-out an AST optimizer, moving
some functionality out of the peephole optimizer
<https://bugs.python.org/issue11549>`_. The patch adds ``ast.Lit`` (it
was proposed to rename it to ``ast.Literal``).

In 2012, Victor Stinner wrote the `astoptimizer
<https://bitbucket.org/haypo/astoptimizer/>`_ project, an AST optimizer
implementing various optimizations. The most interesting optimizations break
Python semantics, since no guard is used to disable an optimization if
something changes.

Issue #17515: `Add sys.setasthook() to allow to use a custom AST
optimizer <https://bugs.python.org/issue17515>`_.


Python Preprocessors
--------------------

* `MacroPy <https://github.com/lihaoyi/macropy>`_: MacroPy is an
  implementation of Syntactic Macros in the Python Programming Language.
  MacroPy provides a mechanism for user-defined functions (macros) to
  perform transformations on the abstract syntax tree (AST) of a Python
  program at import time.
* `pypreprocessor <https://code.google.com/p/pypreprocessor/>`_: C-style
  preprocessor directives in Python, like ``#define`` and ``#ifdef``


Modify the bytecode
-------------------

* `codetransformer <https://pypi.python.org/pypi/codetransformer>`_:
  Bytecode transformers for CPython inspired by the ``ast`` module’s
  ``NodeTransformer``.
* `byteplay <http://code.google.com/p/byteplay/>`_: Byteplay lets you
  convert Python code objects into equivalent objects which are easy to
  play with, and lets you convert those objects back into living Python
  code objects. It's useful for applying crazy transformations on Python
  functions, and is also useful in learning Python byte code
  intricacies. See `byteplay documentation
  <http://wiki.python.org/moin/ByteplayDoc>`_.

See also:

* `BytecodeAssembler <http://pypi.python.org/pypi/BytecodeAssembler>`_
* `Issue #2506 <https://bugs.python.org/issue2506>`_: Add mechanism to
  disable optimizations
* `[Python-ideas] Disable all peephole optimizations
  <https://mail.python.org/pipermail/python-ideas/2014-May/027893.html>`_


Copyright
=========

This document has been placed in the public domain.