PEP 617: Expand the section about actions to cover Python-based actions (#1357)

* PEP 617: Expand the section about actions to cover Python-based actions

* Update pep-0617.rst

Co-Authored-By: Guido van Rossum <gvanrossum@gmail.com>

* Convert rules to new format for alternatives

Co-authored-by: Guido van Rossum <gvanrossum@gmail.com>
Pablo Galindo 2020-04-05 14:59:33 +01:00 committed by GitHub
parent 050635872f
commit 75f841a0fb
1 changed files with 65 additions and 28 deletions

@ -545,15 +545,25 @@ A subexpression can be named by preceding it with an identifier and an
---------------
Grammar actions
---------------
To avoid the intermediate steps that obscure the relationship between the
grammar and the AST generation the proposed PEG parser allows directly generating
AST nodes for a rule via grammar actions. Grammar actions are C expressions that
are evaluated when a grammar rule is successfully parsed. This allows to directly
describe how the AST is composed in the grammar itself, making it more clear and
maintainable. This AST generation process is supported by the use of some helper
functions that factor out common AST object manipulations and some other required
operations that are not directly related to the grammar.
grammar and the AST generation the proposed PEG parser allows directly
generating AST nodes for a rule via grammar actions. Grammar actions are
language-specific expressions that are evaluated when a grammar rule is
successfully parsed. These expressions can be written in Python or C
depending on the desired output of the parser generator. This means that to
generate one parser in Python and another in C, two grammar files must be
written, each with its own set of actions, while everything apart from the
actions stays identical in both files. As an
example of a grammar with Python actions, the piece of the parser generator
that parses grammar files is bootstrapped from a meta-grammar file with
Python actions that generate the grammar tree as a result of the parsing.
In the specific case of the newly proposed PEG grammar for Python, having
actions makes it possible to describe how the AST is composed directly in the
grammar itself, making it clearer and more maintainable. This AST generation process is
supported by the use of some helper functions that factor out common AST
object manipulations and some other required operations that are not directly
related to the grammar.
To indicate these actions each alternative can be followed by the action code
inside curly-braces, which specifies the return value of the alternative:::
@ -571,35 +581,62 @@ different possibilities:
If the action is omitted and Python code is being generated, then a list
with all the parsed expressions gets returned (this is meant for debugging).
As an illustrative example this simple grammar file allows to directly generate a full
parser that can parse simple aritmetic expressions and that returns a valid Python AST:
As an illustrative example, this simple grammar file can be used to directly
generate a full parser that parses simple arithmetic expressions and
returns a valid C-based Python AST:
::
start[mod_ty]: a=stmt* $ { Module(a, NULL, p->arena) }
stmt[stmt_ty]: a=expr_stmt { a }
expr_stmt[stmt_ty]: a=expression NEWLINE { _Py_Expr(a, EXTRA) }
expression[expr_ty]: ( l=expression '+' r=term { _Py_BinOp(l, Add, r, EXTRA) }
| l=expression '-' r=term { _Py_BinOp(l, Sub, r, EXTRA) }
| t=term { t }
)
term[expr_ty]: ( l=term '*' r=factor { _Py_BinOp(l, Mult, r, EXTRA }
| l=term '/' r=factor { _Py_BinOp(l, Div, r, EXTRA) }
| f=factor { f }
)
factor[expr_ty]: ('(' e=expression ')' { e }
| a=atom { a }
)
atom[expr_ty]: ( n=NAME { n }
| n=NUMBER { n }
| s=STRING { s }
)
start[mod_ty]: a=expr_stmt* $ { Module(a, NULL, p->arena) }
expr_stmt[stmt_ty]: a=expr NEWLINE { _Py_Expr(a, EXTRA) }
expr[expr_ty]:
| l=expr '+' r=term { _Py_BinOp(l, Add, r, EXTRA) }
| l=expr '-' r=term { _Py_BinOp(l, Sub, r, EXTRA) }
| t=term { t }
term[expr_ty]:
| l=term '*' r=factor { _Py_BinOp(l, Mult, r, EXTRA) }
| l=term '/' r=factor { _Py_BinOp(l, Div, r, EXTRA) }
| f=factor { f }
factor[expr_ty]:
| '(' e=expr ')' { e }
| a=atom { a }
atom[expr_ty]:
| n=NAME { n }
| n=NUMBER { n }
| s=STRING { s }
Here ``EXTRA`` is a macro that expands to ``start_lineno, start_col_offset,
end_lineno, end_col_offset, p->arena``, those being variables automatically
injected by the parser; ``p`` points to an object that holds on to all state
for the parser.
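The four position values carried by ``EXTRA`` have direct counterparts on
Python-level AST nodes. As a rough illustration (not part of this commit, and
using the standard ``ast`` module rather than the generated parser), the same
fields can be inspected like this:

```python
import ast

# Parse a small expression with the standard library parser.
tree = ast.parse("1 + 2", mode="eval")
binop = tree.body  # the ast.BinOp node for "1 + 2"

# These four attributes mirror the position arguments that EXTRA supplies
# to the C node constructors. The arena has no Python-level counterpart.
location = (
    binop.lineno,
    binop.col_offset,
    binop.end_lineno,
    binop.end_col_offset,
)
print(location)  # (1, 0, 1, 5)
```

Line numbers are 1-based and column offsets are 0-based, so the whole
expression spans columns 0 to 5 of line 1.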
A similar grammar written to target Python AST objects:
::
start: expr NEWLINE? ENDMARKER { ast.Expression(expr) }
expr:
| expr '+' term { ast.BinOp(expr, ast.Add(), term) }
| expr '-' term { ast.BinOp(expr, ast.Sub(), term) }
| term { term }
term:
| l=term '*' r=factor { ast.BinOp(l, ast.Mult(), r) }
| term '/' factor { ast.BinOp(term, ast.Div(), factor) }
| factor { factor }
factor:
| '(' expr ')' { expr }
| atom { atom }
atom:
| NAME { ast.Name(id=name.string, ctx=ast.Load()) }
| NUMBER { ast.Constant(value=ast.literal_eval(number.string)) }
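To make the effect of the Python actions concrete, here is a minimal
hand-written sketch that applies the same actions as the grammar above. It is
an assumption-laden illustration, not the output of the parser generator: the
``Parser`` class, ``tokenize`` and ``evaluate`` helpers are hypothetical names,
the tokenizer is a simple regular expression, newlines are ignored, and the
left-recursive ``expr`` and ``term`` rules are unrolled into loops instead of
relying on the generated parser's own left-recursion support.

```python
import ast
import re

# Hypothetical tokenizer: integers, identifiers, or any single character.
TOKEN = re.compile(r"\s*(?:(\d+)|([A-Za-z_]\w*)|(.))")


def tokenize(text):
    tokens = []
    for number, name, op in TOKEN.findall(text):
        if number:
            tokens.append(("NUMBER", number))
        elif name:
            tokens.append(("NAME", name))
        elif op.strip():  # skip stray whitespace
            tokens.append(("OP", op))
    tokens.append(("ENDMARKER", ""))
    return tokens


class Parser:
    def __init__(self, text):
        self.tokens = tokenize(text)
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos]

    def expect(self, kind, value=None):
        tok = self.tokens[self.pos]
        if tok[0] != kind or (value is not None and tok[1] != value):
            raise SyntaxError(f"unexpected token {tok!r}")
        self.pos += 1
        return tok

    def start(self):
        expr = self.expr()
        self.expect("ENDMARKER")
        return ast.Expression(expr)  # action: { ast.Expression(expr) }

    def expr(self):
        node = self.term()  # action: { term }
        while self.peek() in (("OP", "+"), ("OP", "-")):
            op = ast.Add() if self.expect("OP")[1] == "+" else ast.Sub()
            node = ast.BinOp(node, op, self.term())  # left-associative
        return node

    def term(self):
        node = self.factor()  # action: { factor }
        while self.peek() in (("OP", "*"), ("OP", "/")):
            op = ast.Mult() if self.expect("OP")[1] == "*" else ast.Div()
            node = ast.BinOp(node, op, self.factor())
        return node

    def factor(self):
        if self.peek() == ("OP", "("):
            self.expect("OP", "(")
            expr = self.expr()
            self.expect("OP", ")")
            return expr  # action: { expr }
        return self.atom()  # action: { atom }

    def atom(self):
        kind, string = self.peek()
        if kind == "NAME":
            self.pos += 1
            return ast.Name(id=string, ctx=ast.Load())
        if kind == "NUMBER":
            self.pos += 1
            return ast.Constant(value=ast.literal_eval(string))
        raise SyntaxError(f"unexpected token {(kind, string)!r}")


def evaluate(text, **names):
    # The sketch records no positions, so fill them in before compiling.
    tree = ast.fix_missing_locations(Parser(text).start())
    return eval(compile(tree, "<sketch>", "eval"), dict(names))


result = evaluate("1 + 2 * (x - 3)", x=5)
print(result)  # 5
```

Because every action returns a standard ``ast`` node, the resulting tree can be
handed straight to ``compile()``; the only extra step is
``ast.fix_missing_locations``, which stands in for the position information a
real parser would record.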
==============
Migration plan
==============