PEP 617: Expand the section about actions to cover Python-based actions (#1357)

* PEP 617: Expand the section about actions to cover Python-based actions

* Update pep-0617.rst

Co-Authored-By: Guido van Rossum <gvanrossum@gmail.com>

* Convert rules to new format for alternatives

Co-authored-by: Guido van Rossum <gvanrossum@gmail.com>

parent 050635872f
commit 75f841a0fb

93 pep-0617.rst

@@ -545,15 +545,25 @@ A subexpression can be named by preceding it with an identifier and an
 ---------------
 Grammar actions
 ---------------

 To avoid the intermediate steps that obscure the relationship between the
-grammar and the AST generation the proposed PEG parser allows directly generating
-AST nodes for a rule via grammar actions. Grammar actions are C expressions that
-are evaluated when a grammar rule is successfully parsed. This allows to directly
-describe how the AST is composed in the grammar itself, making it more clear and
-maintainable. This AST generation process is supported by the use of some helper
-functions that factor out common AST object manipulations and some other required
-operations that are not directly related to the grammar.
+grammar and the AST generation the proposed PEG parser allows directly
+generating AST nodes for a rule via grammar actions. Grammar actions are
+language-specific expressions that are evaluated when a grammar rule is
+successfully parsed. These expressions can be written in Python or C
+depending on the desired output of the parser generator. This means that if
+one would want to generate a parser in Python and another in C, two grammar
+files should be written, each one with a different set of actions, keeping
+everything else apart from said actions identical in both files. As an
+example of a grammar with Python actions, the piece of the parser generator
+that parses grammar files is bootstrapped from a meta-grammar file with
+Python actions that generate the grammar tree as a result of the parsing.
+
+In the specific case of the new proposed PEG grammar for the Python, having
+actions allows to directly describe how the AST is composed in the grammar
+itself, making it more clear and maintainable. This AST generation process is
+supported by the use of some helper functions that factor out common AST
+object manipulations and some other required operations that are not directly
+related to the grammar.

 To indicate these actions each alternative can be followed by the action code
 inside curly-braces, which specifies the return value of the alternative:::

@@ -571,35 +581,62 @@ different possibilities:
 If the action is omitted and Python code is being generated, then a list
 with all the parsed expressions get returned (this is meant for debugging).

-As an illustrative example this simple grammar file allows to directly generate a full
-parser that can parse simple aritmetic expressions and that returns a valid Python AST:
+As an illustrative example this simple grammar file allows to directly
+generate a full parser that can parse simple arithmetic expressions and that
+returns a valid C-based Python AST:

 ::

-    start[mod_ty]: a=stmt* $ { Module(a, NULL, p->arena) }
-    stmt[stmt_ty]: a=expr_stmt { a }
-    expr_stmt[stmt_ty]: a=expression NEWLINE { _Py_Expr(a, EXTRA) }
-    expression[expr_ty]: ( l=expression '+' r=term { _Py_BinOp(l, Add, r, EXTRA) }
-        | l=expression '-' r=term { _Py_BinOp(l, Sub, r, EXTRA) }
-        | t=term { t }
-        )
-    term[expr_ty]: ( l=term '*' r=factor { _Py_BinOp(l, Mult, r, EXTRA }
-        | l=term '/' r=factor { _Py_BinOp(l, Div, r, EXTRA) }
-        | f=factor { f }
-        )
-    factor[expr_ty]: ('(' e=expression ')' { e }
-        | a=atom { a }
-        )
-    atom[expr_ty]: ( n=NAME { n }
-        | n=NUMBER { n }
-        | s=STRING { s }
-        )
+    start[mod_ty]: a=expr_stmt* $ { Module(a, NULL, p->arena) }
+    expr_stmt[stmt_ty]: a=expr NEWLINE { _Py_Expr(a, EXTRA) }
+    expr[expr_ty]:
+        | l=expr '+' r=term { _Py_BinOp(l, Add, r, EXTRA) }
+        | l=expr '-' r=term { _Py_BinOp(l, Sub, r, EXTRA) }
+        | t=term { t }
+
+    term[expr_ty]:
+        | l=term '*' r=factor { _Py_BinOp(l, Mult, r, EXTRA) }
+        | l=term '/' r=factor { _Py_BinOp(l, Div, r, EXTRA) }
+        | f=factor { f }
+
+    factor[expr_ty]:
+        | '(' e=expr ')' { e }
+        | a=atom { a }
+
+    atom[expr_ty]:
+        | n=NAME { n }
+        | n=NUMBER { n }
+        | s=STRING { s }
+
+Here ``EXTRA`` is a macro that expands to ``start_lineno, start_col_offset,
+end_lineno, end_col_offset, p->arena``, those being variables automatically
+injected by the parser; ``p`` points to an object that holds on to all state
+for the parser.
+
+A similar grammar written to target Python AST objects:
+
+::
+
+    start: expr NEWLINE? ENDMARKER { ast.Expression(expr) }
+    expr:
+        | expr '+' term { ast.BinOp(expr, ast.Add(), term) }
+        | expr '-' term { ast.BinOp(expr, ast.Sub(), term) }
+        | term { term }
+
+    term:
+        | l=term '*' r=factor { ast.BinOp(l, ast.Mult(), r) }
+        | term '/' factor { ast.BinOp(term, ast.Div(), factor) }
+        | factor { factor }
+
+    factor:
+        | '(' expr ')' { expr }
+        | atom { atom }
+
+    atom:
+        | NAME { ast.Name(id=name.string, ctx=ast.Load()) }
+        | NUMBER { ast.Constant(value=ast.literal_eval(number.string)) }


 ==============
 Migration plan
 ==============

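
Reviewer note: to make the Python-action grammar added in this commit more concrete, the sketch below builds by hand the tree that those actions would plausibly return for the input ``a + 1``, then compiles and evaluates it. It uses only the standard ``ast`` module. The input expression, the variable names, and the ``<peg-sketch>`` filename are assumptions made for illustration and are not part of the commit; no generated parser is involved.

```python
import ast

# Hand-built equivalent of what the Python actions would return for "a + 1":
#   atom  -> ast.Name(...) or ast.Constant(...)
#   expr  -> ast.BinOp(expr, ast.Add(), term)
#   start -> ast.Expression(expr)
left = ast.Name(id="a", ctx=ast.Load())
right = ast.Constant(value=ast.literal_eval("1"))
tree = ast.Expression(ast.BinOp(left, ast.Add(), right))

# The actions in the grammar do not set line/column information,
# so fill it in before handing the tree to compile().
ast.fix_missing_locations(tree)

code = compile(tree, filename="<peg-sketch>", mode="eval")
result = eval(code, {"a": 41})
print(result)  # 42
```

Because ``start`` returns an ``ast.Expression``, the tree is compiled in ``"eval"`` mode; a grammar whose start rule returned ``ast.Module`` would need ``"exec"`` instead.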