New DSL syntax and slightly changed semantics for the Argument Clinic DSL.
parent
c084fd16c3
commit
d247c96780
654
pep-0436.txt
|
@ -13,7 +13,7 @@ Created: 22-Feb-2013
|
||||||
Abstract
|
Abstract
|
||||||
========
|
========
|
||||||
|
|
||||||
This document proposes "Argument Clinic", a DSL designed to facilitate
|
This document proposes "Argument Clinic", a DSL to facilitate
|
||||||
argument processing for built-in functions in the implementation of
|
argument processing for built-in functions in the implementation of
|
||||||
CPython.
|
CPython.
|
||||||
|
|
||||||
|
@ -22,36 +22,37 @@ Rationale and Goals
|
||||||
===================
|
===================
|
||||||
|
|
||||||
The primary implementation of Python, "CPython", is written in a
|
The primary implementation of Python, "CPython", is written in a
|
||||||
mixture of Python and C. One of the implementation details of CPython
|
mixture of Python and C. One implementation detail of CPython
|
||||||
is what are called "built-in" functions -- functions available to
|
is what are called "built-in" functions -- functions available to
|
||||||
Python programs but written in C. When a Python program calls a
|
Python programs but written in C. When a Python program calls a
|
||||||
built-in function and passes in arguments, those arguments must be
|
built-in function and passes in arguments, those arguments must be
|
||||||
translated from Python values into C values. This process is called
|
translated from Python values into C values. This process is called
|
||||||
"parsing arguments".
|
"parsing arguments".
|
||||||
|
|
||||||
As of CPython 3.3, arguments to functions are primarily parsed with
|
As of CPython 3.3, built-in functions nearly always parse their arguments
|
||||||
one of two functions: the original ``PyArg_ParseTuple()``, [1]_ and
|
with one of two functions: the original ``PyArg_ParseTuple()``, [1]_ and
|
||||||
the more modern ``PyArg_ParseTupleAndKeywords()``. [2]_ The former
|
the more modern ``PyArg_ParseTupleAndKeywords()``. [2]_ The former
|
||||||
function only handles positional parameters; the latter also
|
only handles positional parameters; the latter also accommodates keyword
|
||||||
accommodates keyword and keyword-only parameters, and is preferred for
|
and keyword-only parameters, and is preferred for new code.
|
||||||
new code.
|
|
||||||
|
|
||||||
``PyArg_ParseTuple()`` was a reasonable approach when it was first
|
With either function, the caller specifies the translation for
|
||||||
conceived. The programmer specified the translation for the arguments
|
parsing arguments in a "format string": [3]_ each parameter corresponds
|
||||||
in a "format string": [3]_ each parameter matched to a "format unit",
|
to a "format unit", a short character sequence telling the parsing
|
||||||
a one-or-two character sequence telling ``PyArg_ParseTuple()`` what
|
function what Python types to accept and how to translate them into
|
||||||
Python types to accept and how to translate them into the appropriate
|
the appropriate C value for that parameter.
|
||||||
C value for that parameter. There were only a dozen or so of these
|
|
||||||
"format units", and each one was distinct and easy to understand.
|
|
||||||
|
|
||||||
Over the years the ``PyArg_Parse`` interface has been extended in
|
|
||||||
numerous ways. The modern API is quite complex, to the point that it
|
``PyArg_ParseTuple()`` was reasonable when it was first conceived.
|
||||||
|
There were only a dozen or so of these "format units"; each one
|
||||||
|
was distinct, and easy to understand and remember.
|
||||||
|
But over the years the ``PyArg_Parse`` interface has been extended
|
||||||
|
in numerous ways. The modern API is complex, to the point that it
|
||||||
is somewhat painful to use. Consider:
|
is somewhat painful to use. Consider:
|
||||||
|
|
||||||
* There are now forty different "format units"; a few are even three
|
* There are now forty different "format units"; a few are even three
|
||||||
characters long. This makes it difficult to understand what the
|
characters long. This makes it difficult for the programmer to
|
||||||
format string says without constantly cross-indexing it with the
|
understand what the format string says--or even perhaps to parse
|
||||||
documentation.
|
it--without constantly cross-indexing it with the documentation.
|
||||||
* There are also six meta-format units that may be buried in the
|
* There are also six meta-format units that may be buried in the
|
||||||
format string. (They are: ``"()|$:;"``.)
|
format string. (They are: ``"()|$:;"``.)
|
||||||
* The more format units are added, the less likely it is the
|
* The more format units are added, the less likely it is the
|
||||||
|
@ -61,8 +62,9 @@ is somewhat painful to use. Consider:
|
||||||
format units become.
|
format units become.
|
||||||
* Several format units are nearly identical to others, having only
|
* Several format units are nearly identical to others, having only
|
||||||
subtle differences. This makes understanding the exact semantics
|
subtle differences. This makes understanding the exact semantics
|
||||||
of the format string even harder.
|
of the format string even harder, and can make choosing the right
|
||||||
* The docstring is specified as a static C string, which is mildly
|
format unit a conundrum.
|
||||||
|
* The docstring is specified as a static C string, making it mildly
|
||||||
bothersome to read and edit.
|
bothersome to read and edit.
|
||||||
* When adding a new parameter to a function using
|
* When adding a new parameter to a function using
|
||||||
``PyArg_ParseTupleAndKeywords()``, it's necessary to touch six
|
``PyArg_ParseTupleAndKeywords()``, it's necessary to touch six
|
||||||
|
@ -91,18 +93,32 @@ inheriting none of these downsides:
|
||||||
|
|
||||||
* You need specify each parameter only once.
|
* You need specify each parameter only once.
|
||||||
* All information about a parameter is kept together in one place.
|
* All information about a parameter is kept together in one place.
|
||||||
* For each parameter, you specify its type in C; Argument Clinic
|
* For each parameter, you specify a conversion function; Argument
|
||||||
handles the translation from Python value into C value for you.
|
Clinic handles the translation from Python value into C value for
|
||||||
|
you.
|
||||||
* Argument Clinic also allows for fine-tuning of argument processing
|
* Argument Clinic also allows for fine-tuning of argument processing
|
||||||
behavior with highly-readable "flags", both per-parameter and
|
behavior with parameterized conversion functions.
|
||||||
applying across the whole function.
|
* Docstrings are written in plain text. Function docstrings are
|
||||||
* Docstrings are written in plain text.
|
required; per-parameter docstrings are encouraged.
|
||||||
* From this, Argument Clinic generates for you all the mundane,
|
* From this, Argument Clinic generates for you all the mundane,
|
||||||
repetitious code and data structures CPython needs internally.
|
repetitious code and data structures CPython needs internally.
|
||||||
Once you've specified the interface, the next step is simply to
|
Once you've specified the interface, the next step is simply to
|
||||||
write your implementation using native C types. Every detail of
|
write your implementation using native C types. Every detail of
|
||||||
argument parsing is handled for you.
|
argument parsing is handled for you.
|
||||||
|
|
||||||
|
Argument Clinic is implemented as a preprocessor. It draws inspiration
|
||||||
|
for its workflow directly from [Cog]_ by Ned Batchelder. To use Clinic,
|
||||||
|
add a block comment to your C source code beginning and ending with
|
||||||
|
special text strings, then run Clinic on the file. Clinic will find the
|
||||||
|
block comment, process the contents, and write the output back into your
|
||||||
|
C source file directly after the comment. The intent is that Clinic's
|
||||||
|
output becomes part of your source code; it's checked in to revision
|
||||||
|
control, and distributed with source packages. This means that Python
|
||||||
|
will still ship ready-to-build. It does complicate development slightly;
|
||||||
|
in order to add a new function, or modify the arguments or documentation
|
||||||
|
of an existing function using Clinic, you'll need a working Python 3
|
||||||
|
interpreter.
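
A minimal sketch of the first step of that workflow, locating Clinic blocks in a C source file. Only the ``/*[clinic]`` and ``[clinic]*/`` markers come from this document; the regular expression and the helper name ``find_clinic_blocks`` are assumptions made for illustration.

```python
import re

# Assumption: each marker sits on a line of its own, as in the examples
# shown in this document.
CLINIC_BLOCK = re.compile(r"/\*\[clinic\]\n(.*?)\n\[clinic\]\*/", re.DOTALL)


def find_clinic_blocks(c_source):
    """Return the DSL text of every Clinic block found in c_source."""
    return [match.group(1) for match in CLINIC_BLOCK.finditer(c_source)]


example_source = (
    "static int counter;\n"
    "/*[clinic]\n"
    "os.access\n"
    "    path: path\n"
    "Use the real uid/gid to test for access to a path.\n"
    "[clinic]*/\n"
    "static int other;\n"
)
blocks = find_clinic_blocks(example_source)
```

A real preprocessor would go on to write its generated output directly after each block; this sketch stops at extraction.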
|
||||||
|
|
||||||
Future goals of Argument Clinic include:
|
Future goals of Argument Clinic include:
|
||||||
|
|
||||||
* providing signature information for builtins, and
|
* providing signature information for builtins, and
|
||||||
|
@ -117,24 +133,118 @@ file, as follows. The "Example" column on the right shows you sample
|
||||||
input to the Argument Clinic DSL, and the "Section" column on the left
|
input to the Argument Clinic DSL, and the "Section" column on the left
|
||||||
specifies what each line represents in turn.
|
specifies what each line represents in turn.
|
||||||
|
|
||||||
|
Argument Clinic's DSL syntax mirrors the Python ``def``
|
||||||
|
statement, lending it some familiarity to Python core developers.
|
||||||
|
|
||||||
::
|
::
|
||||||
|
|
||||||
+-----------------------+-----------------------------------------------------+
|
+-----------------------+-----------------------------------------------------------------+
|
||||||
| Section | Example |
|
| Section | Example |
|
||||||
+-----------------------+-----------------------------------------------------+
|
+-----------------------+-----------------------------------------------------------------+
|
||||||
| Clinic DSL start | /*[clinic] |
|
| Clinic DSL start | /*[clinic] |
|
||||||
| Function declaration | module.function_name -> return_annotation |
|
| Module declaration | module module_name |
|
||||||
| Function flags | flag flag2 flag3=value |
|
| Class declaration | class module_name.class_name |
|
||||||
| Parameter declaration | type name = default |
|
| Function declaration | module_name.function_name -> return_annotation |
|
||||||
| Parameter flags | flag flag2 flag3=value |
|
| Parameter declaration | name : converter(param=value) |
|
||||||
| Parameter docstring | Lorem ipsum dolor sit amet, consectetur |
|
| Parameter docstring | Lorem ipsum dolor sit amet, consectetur |
|
||||||
| | adipisicing elit, sed do eiusmod tempor |
|
| | adipisicing elit, sed do eiusmod tempor |
|
||||||
| Function docstring | Lorem ipsum dolor sit amet, consectetur adipisicing |
|
| Function docstring | Lorem ipsum dolor sit amet, consectetur adipisicing |
|
||||||
| | elit, sed do eiusmod tempor incididunt ut labore et |
|
| | elit, sed do eiusmod tempor incididunt ut labore et |
|
||||||
| Clinic DSL end | [clinic]*/ |
|
| Clinic DSL end | [clinic]*/ |
|
||||||
| Clinic output | ... |
|
| Clinic output | ... |
|
||||||
| Clinic output end | /*[clinic end output:<checksum>]*/ |
|
| Clinic output end | /*[clinic end output:<checksum>]*/ |
|
||||||
+-----------------------+-----------------------------------------------------+
|
+-----------------------+-----------------------------------------------------------------+
|
||||||
|
|
||||||
|
To give some flavor of the proposed DSL syntax, here are some sample Clinic
|
||||||
|
code blocks. This first block reflects the normally preferred style, including
|
||||||
|
blank lines between parameters and per-argument docstrings.
|
||||||
|
|
||||||
|
::
|
||||||
|
|
||||||
|
/*[clinic]
|
||||||
|
os.stat as os_stat_fn -> stat result
|
||||||
|
|
||||||
|
path: path_t(allow_fd=1)
|
||||||
|
Path to be examined; can be string, bytes, or open-file-descriptor int.
|
||||||
|
|
||||||
|
*
|
||||||
|
|
||||||
|
dir_fd: OS_STAT_DIR_FD_CONVERTER = DEFAULT_DIR_FD
|
||||||
|
If not None, it should be a file descriptor open to a directory,
|
||||||
|
and path should be a relative string; path will then be relative to
|
||||||
|
that directory.
|
||||||
|
|
||||||
|
follow_symlinks: bool = True
|
||||||
|
If False, and the last element of the path is a symbolic link,
|
||||||
|
stat will examine the symbolic link itself instead of the file
|
||||||
|
the link points to.
|
||||||
|
|
||||||
|
Perform a stat system call on the given path.
|
||||||
|
|
||||||
|
{parameters}
|
||||||
|
|
||||||
|
dir_fd and follow_symlinks may not be implemented
|
||||||
|
on your platform. If they are unavailable, using them will raise a
|
||||||
|
NotImplementedError.
|
||||||
|
|
||||||
|
It's an error to use dir_fd or follow_symlinks when specifying path as
|
||||||
|
an open file descriptor.
|
||||||
|
|
||||||
|
[clinic]*/
|
||||||
|
|
||||||
|
This second example shows a minimal Clinic code block, omitting all
|
||||||
|
parameter docstrings and non-significant blank lines::
|
||||||
|
|
||||||
|
/*[clinic]
|
||||||
|
os.access
|
||||||
|
path: path
|
||||||
|
mode: int
|
||||||
|
*
|
||||||
|
dir_fd: OS_ACCESS_DIR_FD_CONVERTER = 1
|
||||||
|
effective_ids: bool = False
|
||||||
|
follow_symlinks: bool = True
|
||||||
|
Use the real uid/gid to test for access to a path.
|
||||||
|
Returns True if granted, False otherwise.
|
||||||
|
|
||||||
|
{parameters}
|
||||||
|
|
||||||
|
dir_fd, effective_ids, and follow_symlinks may not be implemented
|
||||||
|
on your platform. If they are unavailable, using them will raise a
|
||||||
|
NotImplementedError.
|
||||||
|
|
||||||
|
Note that most operations will use the effective uid/gid, therefore this
|
||||||
|
routine can be used in a suid/sgid environment to test if the invoking user
|
||||||
|
has the specified access to the path.
|
||||||
|
|
||||||
|
[clinic]*/
|
||||||
|
|
||||||
|
This final example shows a Clinic code block handling groups of
|
||||||
|
optional parameters, including parameters on the left::
|
||||||
|
|
||||||
|
/*[clinic]
|
||||||
|
curses.window.addch
|
||||||
|
|
||||||
|
[
|
||||||
|
x: int
|
||||||
|
X-coordinate.
|
||||||
|
|
||||||
|
y: int
|
||||||
|
Y-coordinate.
|
||||||
|
]
|
||||||
|
|
||||||
|
ch: char
|
||||||
|
Character to add.
|
||||||
|
|
||||||
|
[
|
||||||
|
attr: long
|
||||||
|
Attributes for the character.
|
||||||
|
]
|
||||||
|
|
||||||
|
Paint character ch at (y, x) with attributes attr,
|
||||||
|
overwriting any character previously painted at that location.
|
||||||
|
By default, the character position and attributes are the
|
||||||
|
current settings for the window object.
|
||||||
|
[clinic]*/
|
||||||
|
|
||||||
|
|
||||||
General Behavior Of the Argument Clinic DSL
|
General Behavior Of the Argument Clinic DSL
|
||||||
|
@ -145,113 +255,219 @@ docstrings. Blank lines are always ignored.
|
||||||
|
|
||||||
Like Python itself, leading whitespace is significant in the Argument
|
Like Python itself, leading whitespace is significant in the Argument
|
||||||
Clinic DSL. The first line of the "function" section is the
|
Clinic DSL. The first line of the "function" section is the
|
||||||
declaration; all subsequent lines at the same indent are function
|
function declaration. Indented lines below the function declaration
|
||||||
flags. Once you indent, the first line is a parameter declaration;
|
declare parameters, one per line; lines below those that are indented even
|
||||||
subsequent lines at that indent are parameter flags. Indent one more
|
further are per-parameter docstrings. Finally, the first line dedented
|
||||||
time for the lines of the parameter docstring. Finally, dedent back
|
back to column 0 ends the parameter declarations and starts the function docstring.
|
||||||
to the same level as the function declaration for the function
|
|
||||||
docstring.
|
Parameter docstrings are optional; function docstrings are not.
|
||||||
|
Functions that specify no arguments may simply specify the function
|
||||||
|
declaration followed by the docstring.
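
The indentation rules can be sketched as a small line classifier. This is an illustration under stated assumptions, not Clinic's implementation: the function name ``split_clinic_block`` is hypothetical, the block is assumed to hold a single function with no directives, and special symbol lines such as ``*`` or ``[`` are not treated specially.

```python
def split_clinic_block(text):
    """Split one Clinic block into declaration, parameters and docstring."""
    declaration = None
    parameters = []
    docstring = []
    param_indent = None
    in_docstring = False
    for line in text.splitlines():
        if in_docstring:
            # Everything after the first dedented line is docstring text.
            docstring.append(line)
            continue
        if not line.strip():
            continue  # blank lines are ignored outside the docstring
        indent = len(line) - len(line.lstrip())
        if declaration is None:
            declaration = line.strip()
        elif indent == 0:
            in_docstring = True
            docstring.append(line)
        elif param_indent is None or indent == param_indent:
            # The first parameter line fixes the indent for all parameters.
            param_indent = indent
            parameters.append({"declaration": line.strip(), "docstring": []})
        elif indent > param_indent:
            parameters[-1]["docstring"].append(line.strip())
        else:
            raise ValueError("inconsistent indentation: %r" % line)
    return declaration, parameters, docstring


example_block = (
    "os.stat\n"
    "    path: path_t(allow_fd=1)\n"
    "        Path to be examined.\n"
    "\n"
    "    follow_symlinks: bool = True\n"
    "Perform a stat system call on the given path.\n"
)
declaration, parameters, docstring = split_clinic_block(example_block)
```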
|
||||||
|
|
||||||
|
Module and Class Declarations
|
||||||
|
-----------------------------
|
||||||
|
|
||||||
|
When a C file implements a module or class, this should be declared to
|
||||||
|
Clinic. The syntax is simple:
|
||||||
|
|
||||||
|
::
|
||||||
|
|
||||||
|
module module_name
|
||||||
|
|
||||||
|
or
|
||||||
|
|
||||||
|
::
|
||||||
|
|
||||||
|
class module_name.class_name
|
||||||
|
|
||||||
|
(Note that these are not actually special syntax; they are implemented
|
||||||
|
as `Directives`_.)
|
||||||
|
|
||||||
|
The module name or class name should always be the full dotted path
|
||||||
|
from the top-level module. Nested modules and classes are supported.
|
||||||
|
|
||||||
|
|
||||||
Function Declaration
|
Function Declaration
|
||||||
--------------------
|
--------------------
|
||||||
|
|
||||||
The return annotation is optional. If skipped, the arrow ("``->``")
|
The full form of the function declaration is as follows:
|
||||||
|
|
||||||
|
::
|
||||||
|
|
||||||
|
dotted.name [ as legal_c_id ] [ -> return_annotation ]
|
||||||
|
|
||||||
|
The dotted name should be the full name of the function, starting
|
||||||
|
with the highest-level package (e.g. "os.stat" or "curses.window.addch").
|
||||||
|
|
||||||
|
The "as legal_c_id" syntax is optional.
|
||||||
|
Argument Clinic uses the name of the function to create the names of
|
||||||
|
the generated C functions. In some circumstances, the generated name
|
||||||
|
may collide with other global names in the C program's namespace.
|
||||||
|
The "as legal_c_id" syntax allows you to override the generated name
|
||||||
|
with your own; substitute "legal_c_id" with any legal C identifier.
|
||||||
|
If skipped, the "as" keyword must also be omitted.
|
||||||
|
|
||||||
|
The return annotation is also optional. If skipped, the arrow ("``->``")
|
||||||
must also be omitted.
|
must also be omitted.
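
As a rough illustration of the declaration grammar, the sketch below parses it with a regular expression. The pattern and the name ``parse_function_declaration`` are hypothetical. The fallback C name (periods replaced by underscores) is an assumption borrowed from the description of the old ``basename`` flag; the new text does not spell out how the default name is derived.

```python
import re

# Hypothetical pattern for:
#   dotted.name [ as legal_c_id ] [ -> return_annotation ]
DECLARATION = re.compile(
    r"""
    ^\s*(?P<name>[A-Za-z_]\w*(?:\.[A-Za-z_]\w*)*)   # full dotted name
    (?:\s+as\s+(?P<c_id>[A-Za-z_]\w*))?             # optional C identifier
    (?:\s*->\s*(?P<returns>.+?))?                   # optional return annotation
    \s*$
    """,
    re.VERBOSE,
)


def parse_function_declaration(line):
    match = DECLARATION.match(line)
    if match is None:
        raise ValueError("not a function declaration: %r" % line)
    name = match.group("name")
    # Assumption: the default C name replaces periods with underscores.
    c_id = match.group("c_id") or name.replace(".", "_")
    return {"name": name, "c_id": c_id, "returns": match.group("returns")}


stat_decl = parse_function_declaration("os.stat as os_stat_fn -> stat result")
addch_decl = parse_function_declaration("curses.window.addch")
```

A bare arrow with no annotation fails to match, which mirrors the rule that the arrow must be omitted when the annotation is.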
|
||||||
|
|
||||||
|
|
||||||
Parameter Declaration
|
Parameter Declaration
|
||||||
---------------------
|
---------------------
|
||||||
|
|
||||||
The "type" is a C type. If it's a pointer type, you must specify a
|
The full form of the parameter declaration line is as follows:
|
||||||
single space between the type and the "``*``", and zero spaces between
|
|
||||||
the "``*``" and the name. (e.g. "``PyObject *foo``", not "``PyObject*
|
|
||||||
foo``")
|
|
||||||
|
|
||||||
The "name" must be a legal C identifier.
|
::
|
||||||
|
|
||||||
The "default" is a Python value. Default values are optional; if not
|
name: converter [ (parameter=value [, parameter2=value2]) ] [ = default]
|
||||||
specified you must omit the equals sign too. Parameters which don't
|
|
||||||
have a default are implicitly required. The default value is
|
The "name" must be a legal C identifier. Whitespace is permitted between
|
||||||
|
the name and the colon (though this is not the preferred style). Whitespace
|
||||||
|
is permitted (and encouraged) between the colon and the converter.
|
||||||
|
|
||||||
|
The "converter" is the name of one of the "converter functions" registered
|
||||||
|
with Argument Clinic. Clinic will ship with a number of built-in converters;
|
||||||
|
new converters can also be added dynamically. In choosing a converter, you
|
||||||
|
are automatically constraining what Python types are permitted on the input,
|
||||||
|
and specifying what type the output variable (or variables) will be. Although
|
||||||
|
many of the converters will resemble the names of C types or perhaps Python
|
||||||
|
types, the name of a converter may be any legal Python identifier.
|
||||||
|
|
||||||
|
If the converter is followed by parentheses, these parentheses enclose
|
||||||
|
parameters to the conversion function. The syntax mirrors providing arguments
|
||||||
|
to a Python function call: the parameters must always be named, as if they were
|
||||||
|
"keyword-only parameters", and the values provided for the parameters will
|
||||||
|
syntactically resemble Python literal values. These parameters are always
|
||||||
|
optional, permitting all conversion functions to be called without
|
||||||
|
any parameters. In this case, you may also omit the parentheses entirely;
|
||||||
|
this is always equivalent to specifying empty parentheses.
|
||||||
|
|
||||||
|
The "default" is a Python literal value. Default values are optional;
|
||||||
|
if not specified you must omit the equals sign too. Parameters which
|
||||||
|
don't have a default are implicitly required. The default value is
|
||||||
dynamically assigned, "live" in the generated C code, and although
|
dynamically assigned, "live" in the generated C code, and although
|
||||||
it's specified as a Python value, it's translated into a native C
|
it's specified as a Python value, it's translated into a native C
|
||||||
value in the generated C code.
|
value in the generated C code. Few default values are permitted,
|
||||||
|
owing to this manual translation step.
|
||||||
|
|
||||||
It's explicitly permitted to end the parameter declaration line with a
|
If this were a Python function declaration, a parameter declaration
|
||||||
semicolon, though the semicolon is optional. This is intended to
|
would be delimited by either a trailing comma or an ending parenthesis.
|
||||||
allow directly cutting and pasting in declarations from C code.
|
However, Argument Clinic uses neither; parameter declarations are
|
||||||
However, the preferred style is without the semicolon.
|
delimited by a newline. A trailing comma or right parenthesis is not
|
||||||
|
permitted.
|
||||||
|
|
||||||
|
The first parameter declaration establishes the indent for all parameter
|
||||||
|
declarations in a particular Clinic code block. All subsequent parameters
|
||||||
|
must be indented to the same level.
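
Because a parameter declaration happens to be valid Python annotated-assignment syntax, one way to prototype a parser is to borrow Python's own ``ast`` module. This is only a sketch: the function name ``parse_parameter`` and the shape of the returned dictionary are hypothetical, and the default is kept as source text rather than translated into a C value.

```python
import ast


def parse_parameter(line):
    """Split 'name: converter(param=value) = default' into its parts."""
    source = line.strip()
    tree = ast.parse(source)
    if len(tree.body) != 1 or not isinstance(tree.body[0], ast.AnnAssign):
        raise ValueError("not a parameter declaration: %r" % line)
    node = tree.body[0]
    if not isinstance(node.target, ast.Name):
        raise ValueError("parameter name must be a plain identifier")
    annotation = node.annotation
    if isinstance(annotation, ast.Call) and isinstance(annotation.func, ast.Name):
        if annotation.args:
            raise ValueError("converter parameters must always be named")
        converter = annotation.func.id
        converter_args = {
            keyword.arg: ast.literal_eval(keyword.value)
            for keyword in annotation.keywords
        }
    elif isinstance(annotation, ast.Name):
        # Omitting the parentheses is equivalent to empty parentheses.
        converter, converter_args = annotation.id, {}
    else:
        raise ValueError("unsupported converter syntax: %r" % line)
    default = None
    if node.value is not None:
        # Keep the default as written; translating it to C is out of scope.
        default = ast.get_source_segment(source, node.value)
    return {
        "name": node.target.id,
        "converter": converter,
        "converter_args": converter_args,
        "default": default,
    }


path_param = parse_parameter("    path: path_t(allow_fd=1)")
symlink_param = parse_parameter("follow_symlinks: bool = True")
```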
|
||||||
|
|
||||||
|
|
||||||
Flags
|
Legacy Converters
|
||||||
-----
|
-----------------
|
||||||
|
|
||||||
"Flags" are like "``make -D``" arguments. They're unordered. Flags
|
For convenience's sake in converting existing code to Argument Clinic,
|
||||||
lines are parsed much like the shell (specifically, using
|
Clinic provides a set of legacy converters that match ``PyArg_ParseTuple``
|
||||||
``shlex.split()`` [5]_ ). You can have as many flag lines as you
|
format units. They are specified as a C string containing the format
|
||||||
like. Specifying a flag twice is currently an error.
|
unit. For example, to specify a parameter "foo" as taking a Python
|
||||||
|
"int" and emitting a C int, you could specify:
|
||||||
|
|
||||||
Supported flags for functions:
|
::
|
||||||
|
|
||||||
``basename``
|
foo : "i"
|
||||||
The basename to use for the generated C functions. By default this
|
|
||||||
is the name of the function from the DSL, only with periods replaced
|
|
||||||
by underscores.
|
|
||||||
|
|
||||||
``positional-only``
|
(To more closely resemble a C string, these must always use double quotes.)
|
||||||
This function only supports positional parameters, not keyword
|
|
||||||
parameters. See `Functions With Positional-Only Parameters`_ below.
|
|
||||||
|
|
||||||
Supported flags for parameters:
|
Although these resemble ``PyArg_ParseTuple`` format units, no guarantee is
|
||||||
|
made that the implementation will call a ``PyArg_Parse`` function for parsing.
|
||||||
|
|
||||||
``bitwise``
|
This syntax does not support parameters. Therefore it doesn't support any
|
||||||
If the Python integer passed in is signed, copy the bits directly
|
of the format units that require input parameters (``"O!", "O&", "es", "es#",
|
||||||
even if it is negative. Only valid for unsigned integer types.
|
"et", "et#"``). Parameters requiring one of these conversions cannot use the
|
||||||
|
legacy syntax. (You may still, however, supply a default value.)
|
||||||
|
|
||||||
``converter``
|
|
||||||
Backwards-compatibility support for parameter "converter"
|
Parameter Docstrings
|
||||||
functions. [6]_ The value should be the name of the converter
|
--------------------
|
||||||
function in C. Only valid when the type of the parameter is
|
|
||||||
``void *``.
|
All lines that appear below and are indented further than a parameter declaration
|
||||||
|
are the docstring for that parameter. All such lines are "dedented" until the
|
||||||
|
first line is flush left.
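
One reading of this rule is that the indent of the first docstring line is removed from every line, so any extra indentation on later lines survives. The helper below sketches that reading; the name ``dedent_parameter_docstring`` is hypothetical.

```python
def dedent_parameter_docstring(lines):
    """Remove the first line's indent from every line of the docstring."""
    if not lines:
        return []
    width = len(lines[0]) - len(lines[0].lstrip(" "))
    dedented = []
    for line in lines:
        # Never strip more leading spaces than the line actually has.
        removable = len(line) - len(line.lstrip(" "))
        dedented.append(line[min(width, removable):])
    return dedented


dedented_doc = dedent_parameter_docstring([
    "        If not None, it should be a file descriptor",
    "        open to a directory.",
    "          (extra indentation is kept)",
])
```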
|
||||||
|
|
||||||
|
Special Syntax For Parameter Lines
|
||||||
|
----------------------------------
|
||||||
|
|
||||||
|
There are four special symbols that may be used in the parameter section. Each
|
||||||
|
of these must appear on a line by itself, indented to the same level as parameter
|
||||||
|
declarations. The four symbols are:
|
||||||
|
|
||||||
|
``*``
|
||||||
|
Establishes that all subsequent parameters are keyword-only.
|
||||||
|
|
||||||
|
``[``
|
||||||
|
Establishes the start of an optional "group" of parameters.
|
||||||
|
Note that "groups" may nest inside other "groups".
|
||||||
|
See `Functions With Positional-Only Parameters`_ below.
|
||||||
|
|
||||||
|
``]``
|
||||||
|
Ends an optional "group" of parameters.
|
||||||
|
|
||||||
|
``/``
|
||||||
|
This hints to Argument Clinic that this function is performance-sensitive,
|
||||||
|
and that it's acceptable to forgo supporting keyword parameters when parsing.
|
||||||
|
(In early implementations of Clinic, this will switch Clinic from generating
|
||||||
|
code using ``PyArg_ParseTupleAndKeywords`` to using ``PyArg_ParseTuple``.
|
||||||
|
The hope is that in the future there will be no appreciable speed difference,
|
||||||
|
rendering this syntax irrelevant and deprecated but harmless.)
|
||||||
|
|
||||||
|
|
||||||
|
Function Docstring
|
||||||
|
------------------
|
||||||
|
|
||||||
|
The first line with no leading whitespace after the function declaration is the
|
||||||
|
first line of the function docstring. All subsequent lines of the Clinic block
|
||||||
|
are considered part of the docstring, and their leading whitespace is preserved.
|
||||||
|
|
||||||
|
If the string ``{parameters}`` appears on a line by itself inside the function
|
||||||
|
docstring, Argument Clinic will insert a list of all parameters that have
|
||||||
|
docstrings, each such parameter followed by its docstring. The name of the
|
||||||
|
parameter is on a line by itself; the docstring starts on a subsequent line,
|
||||||
|
and all lines of the docstring are indented by two spaces. (Parameters with
|
||||||
|
no per-parameter docstring are suppressed.) The entire list is indented by the
|
||||||
|
leading whitespace that appeared before the ``{parameters}`` token.
|
||||||
|
|
||||||
|
If the string ``{parameters}`` doesn't appear in the docstring, Argument Clinic
|
||||||
|
will append one to the end of the docstring, inserting a blank line above it if
|
||||||
|
the docstring does not end with a blank line, and with the parameter list at
|
||||||
|
column 0.
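
The substitution rules can be sketched as follows. The function name ``render_docstring`` and its input format, a list of ``(name, docstring)`` pairs with ``None`` for parameters that have no docstring, are assumptions made for this illustration.

```python
def render_docstring(docstring, parameters):
    """Expand {parameters} in a function docstring.

    parameters is a list of (name, docstring-or-None) pairs in
    declaration order.
    """
    def parameter_list(indent):
        rendered = []
        for name, doc in parameters:
            if not doc:
                continue  # parameters without a docstring are suppressed
            rendered.append(indent + name)
            rendered.extend(
                indent + "  " + line if line else ""
                for line in doc.splitlines()
            )
        return rendered

    result = []
    found = False
    for line in docstring.splitlines():
        if line.strip() == "{parameters}":
            found = True
            # Reuse the whitespace that preceded the token as the list indent.
            indent = line[: len(line) - len(line.lstrip())]
            result.extend(parameter_list(indent))
        else:
            result.append(line)
    if not found:
        if result and result[-1].strip():
            result.append("")  # separate the appended list with a blank line
        result.extend(parameter_list(""))
    return "\n".join(result)


example_params = [
    ("path", "Path to be examined."),
    ("mode", None),
    ("dir_fd", "A directory\nfile descriptor."),
]
with_token = render_docstring("Perform a stat.\n\n  {parameters}\n\nDone.", example_params)
without_token = render_docstring("Perform a stat.", example_params)
```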
|
||||||
|
|
||||||
|
Converters
|
||||||
|
----------
|
||||||
|
|
||||||
|
Argument Clinic contains a pre-initialized registry of converter functions.
|
||||||
|
Example converter functions:
|
||||||
|
|
||||||
|
``int``
|
||||||
|
Accepts a Python object implementing ``__int__``; emits a C ``int``.
|
||||||
|
|
||||||
|
``byte``
|
||||||
|
Accepts a Python int; emits an ``unsigned char``. The integer
|
||||||
|
must be in the range [0, 256).
|
||||||
|
|
||||||
|
``str``
|
||||||
|
Accepts a Python str object; emits a C ``char *``. Automatically
|
||||||
|
encodes the string using the ``ascii`` codec.
|
||||||
|
|
||||||
|
``PyObject``
|
||||||
|
Accepts any object; emits a C ``PyObject *`` without any conversion.
|
||||||
|
|
||||||
|
All converters accept the following parameters:
|
||||||
|
|
||||||
``default``
|
``default``
|
||||||
The Python value to use in place of the parameter's actual default
|
The Python value to use in place of the parameter's actual default
|
||||||
in Python contexts. Specifically, when specified, this value will
|
in Python contexts. In other words: when specified, this value will
|
||||||
be used for the parameter's default in the docstring, and in the
|
be used for the parameter's default in the docstring, and in the
|
||||||
``Signature``. (TBD: If the string is a valid Python expression
|
``Signature``. (TBD alternative semantics: If the string is a valid
|
||||||
which can be rendered into a Python value using ``eval()``, then the
|
Python expression which can be rendered into a Python value using
|
||||||
result of ``eval()`` on it will be used as the default in the
|
``eval()``, then the result of ``eval()`` on it will be used as the
|
||||||
``Signature``.) Ignored if there is no default.
|
default in the ``Signature``.) Ignored if there is no default.
|
||||||
|
|
||||||
``encoding``
|
|
||||||
Encoding to use when encoding a Unicode string to a ``char *``.
|
|
||||||
Only valid when the type of the parameter is ``char *``.
|
|
||||||
|
|
||||||
``group=``
|
|
||||||
This parameter is part of a group of options that must either all be
|
|
||||||
specified or none specified. Parameters in the same "group" must be
|
|
||||||
contiguous. The value of the group flag is the name used for the
|
|
||||||
group variable, and therefore must be legal as a C identifier. Only
|
|
||||||
valid for functions marked "``positional-only``"; see `Functions
|
|
||||||
With Positional-Only Parameters`_ below.
|
|
||||||
|
|
||||||
``immutable``
|
|
||||||
Only accept immutable values.
|
|
||||||
|
|
||||||
``keyword-only``
|
|
||||||
This parameter (and all subsequent parameters) is keyword-only.
|
|
||||||
Keyword-only parameters must also be optional parameters. Not valid
|
|
||||||
for positional-only functions.
|
|
||||||
|
|
||||||
``length``
|
|
||||||
This is an iterable type, and we also want its length. The DSL will
|
|
||||||
generate a second ``Py_ssize_t`` variable; its name will be this
|
|
||||||
parameter's name appended with "``_length``".
|
|
||||||
|
|
||||||
``nullable``
|
|
||||||
``None`` is a legal argument for this parameter. If ``None`` is
|
|
||||||
supplied on the Python side, the equivalent C argument will be
|
|
||||||
``NULL``. Only valid for pointer types.
|
|
||||||
|
|
||||||
``required``
|
``required``
|
||||||
Normally any parameter that has a default value is automatically
|
Normally any parameter that has a default value is automatically
|
||||||
|
@ -259,24 +475,78 @@ Supported flags for parameters:
|
||||||
required (non-optional) even if it has a default value. The
|
required (non-optional) even if it has a default value. The
|
||||||
generated documentation will also not show any default value.
|
generated documentation will also not show any default value.
|
||||||
|
|
||||||
``types``
|
|
||||||
Space-separated list of acceptable Python types for this object.
|
|
||||||
There are also four special-case types which represent Python
|
|
||||||
protocols:
|
|
||||||
|
|
||||||
* buffer
|
Additionally, converters may accept one or more of these optional
|
||||||
* mapping
|
parameters, on an individual basis:
|
||||||
* number
|
|
||||||
* sequence
|
``bitwise``
|
||||||
|
For converters that accept unsigned integers. If the Python integer
|
||||||
|
passed in is signed, copy the bits directly even if it is negative.
|
||||||
|
|
||||||
|
``encoding``
|
||||||
|
For converters that accept str. Encoding to use when encoding a
|
||||||
|
Unicode string to a ``char *``.
|
||||||
|
|
||||||
|
``immutable``
|
||||||
|
Only accept immutable values.
|
||||||
|
|
||||||
|
``length``
|
||||||
|
For converters that accept iterable types. Requests that the converter
|
||||||
|
also emit the length of the iterable, passed in to the ``_impl`` function
|
||||||
|
in a ``Py_ssize_t`` variable; its name will be this
|
||||||
|
parameter's name appended with "``_length``".
|
||||||
|
|
||||||
|
``nullable``
|
||||||
|
This converter normally does not accept ``None``, but in this case
|
||||||
|
it should. If ``None`` is supplied on the Python side, the equivalent
|
||||||
|
C argument will be ``NULL``. (The ``_impl`` argument emitted by this
|
||||||
|
converter will presumably be a pointer type.)
|
||||||
|
|
||||||
|
``types``
|
||||||
|
A list of strings representing acceptable Python types for this object.
|
||||||
|
There are also four strings which represent Python protocols:
|
||||||
|
|
||||||
|
* "buffer"
|
||||||
|
* "mapping"
|
||||||
|
* "number"
|
||||||
|
* "sequence"
|
||||||
|
|
||||||
``zeroes``
|
``zeroes``
|
||||||
This parameter is a string type, and its value should be allowed to
|
For converters that accept string types. The converted value should
|
||||||
have embedded zeroes. Not valid for all varieties of string
|
be allowed to have embedded zeroes.
|
||||||
parameters.
|
|
||||||
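As a rough illustration of the ``bitwise`` parameter described above, the following Python sketch shows what "copy the bits directly" could mean for a negative integer. The helper name ``to_unsigned_bitwise`` and the default width of 32 bits are assumptions made for this example; they are not part of Argument Clinic.

```python
def to_unsigned_bitwise(value, bits=32):
    """Reinterpret a Python int as an unsigned value of the given width.

    Hypothetical helper: masks the value to `bits` bits, so a negative
    number keeps its two's-complement bit pattern instead of being rejected.
    """
    mask = (1 << bits) - 1
    return value & mask


# -1 has all bits set, so it maps to the largest 32-bit unsigned value.
all_ones = to_unsigned_bitwise(-1)
# Values that already fit are left unchanged.
unchanged = to_unsigned_bitwise(42)
```

Without ``bitwise``, a converter would presumably reject the negative value rather than mask it.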
|
|
||||||
|
Directives
|
||||||
|
----------
|
||||||
|
|
||||||
|
Argument Clinic also permits "directives" in Clinic code blocks.
|
||||||
|
Directives are similar to *pragmas* in C; they are statements
|
||||||
|
that modify Argument Clinic's behavior.
|
||||||
|
|
||||||
|
The format of a directive is as follows:
|
||||||
|
|
||||||
|
::
|
||||||
|
|
||||||
|
directive_name [argument [second_argument [ ... ]]]
|
||||||
|
|
||||||
|
Directives only take positional arguments.
|
||||||
|
|
||||||
|
A Clinic code block must contain either one or more directives,
|
||||||
|
or a function declaration. It may contain both, in which
|
||||||
|
case all directives must come before the function declaration.
|
||||||
|
|
||||||
|
Internally directives map directly to Python callables.
|
||||||
|
The directive's arguments are passed directly to the callable
|
||||||
|
as positional arguments of type ``str``.
|
||||||
|
|
||||||
|
Possible directives include the production,
|
||||||
|
suppression, or redirection of Clinic output. Also, the
|
||||||
|
"module" and "class" keywords are actually implemented
|
||||||
|
as directives.
|
||||||
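The mapping from directives to Python callables described above can be sketched as follows. This is a minimal illustration, not Clinic's implementation: the registry ``DIRECTIVES``, the handlers ``set_module`` and ``set_class``, and the function ``run_directive`` are all hypothetical names.

```python
# Hypothetical handlers; each receives its arguments as plain strings.
def set_module(name):
    return ("module", name)


def set_class(name):
    return ("class", name)


# Hypothetical registry mapping directive names to callables.
DIRECTIVES = {
    "module": set_module,
    "class": set_class,
}


def run_directive(line):
    """Split a directive line on whitespace and call the matching handler.

    Every argument is passed positionally and stays a str, matching the
    "directive_name [argument [second_argument [ ... ]]]" format.
    """
    parts = line.split()
    if not parts:
        raise ValueError("empty directive")
    name, args = parts[0], parts[1:]
    if name not in DIRECTIVES:
        raise ValueError(f"unknown directive: {name!r}")
    return DIRECTIVES[name](*args)


result = run_directive("module os")
```

Because arguments are only ever positional strings, a handler that needs another type would have to convert its arguments itself.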
|
|
||||||
|
|
||||||
Python Code
|
Python Code
|
||||||
-----------
|
===========
|
||||||
|
|
||||||
Argument Clinic also permits embedding Python code inside C files,
|
Argument Clinic also permits embedding Python code inside C files,
|
||||||
which is executed in-place when Argument Clinic processes the file.
|
which is executed in-place when Argument Clinic processes the file.
|
||||||
|
@ -290,16 +560,21 @@ Embedded code looks like this:
|
||||||
print("/" + "* Hello world! *" + "/")
|
print("/" + "* Hello world! *" + "/")
|
||||||
|
|
||||||
[python]*/
|
[python]*/
|
||||||
|
/* Hello world! */
|
||||||
|
/*[python end:da39a3ee5e6b4b0d3255bfef95601890afd80709]*/
|
||||||
|
|
||||||
|
The ``"/* Hello world! */"`` line above was generated by running the Python
|
||||||
|
code in the preceding comment.
|
||||||
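The hexadecimal string in the ``[python end:...]`` marker above is 40 characters long, the length of a SHA-1 digest, and its value is the SHA-1 digest of empty input. The sketch below assumes the checksum is a SHA-1 hex digest of some text; exactly which text Clinic hashes is not stated here, and ``checksum`` is a hypothetical helper name.

```python
import hashlib


def checksum(text):
    """Return the SHA-1 hex digest of text (assumed checksum scheme)."""
    return hashlib.sha1(text.encode("utf-8")).hexdigest()


# Hashing the empty string reproduces the value shown in the end marker.
empty_digest = checksum("")
# Hashing the generated line gives a different 40-character digest.
output_digest = checksum("/* Hello world! */\n")
```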
|
|
||||||
Any Python code is valid. Python code sections in Argument Clinic can
|
Any Python code is valid. Python code sections in Argument Clinic can
|
||||||
also be used to modify Clinic's behavior at runtime; for example, see
|
also be used to directly interact with Clinic; see
|
||||||
`Extending Argument Clinic`_.
|
`Argument Clinic Programmatic Interfaces`_.
|
||||||
|
|
||||||
|
|
||||||
Output
|
Output
|
||||||
======
|
======
|
||||||
|
|
||||||
Argument Clinic writes its output in-line in the C file, immediately
|
Argument Clinic writes its output inline in the C file, immediately
|
||||||
after the section of Clinic code. For "python" sections, the output
|
after the section of Clinic code. For "python" sections, the output
|
||||||
is everything printed using ``builtins.print``. For "clinic"
|
is everything printed using ``builtins.print``. For "clinic"
|
||||||
sections, the output is valid C code, including:
|
sections, the output is valid C code, including:
|
||||||
|
@ -313,11 +588,10 @@ sections, the output is valid C code, including:
|
||||||
* the definition line of the "impl" function
|
* the definition line of the "impl" function
|
||||||
* and a comment indicating the end of output.
|
* and a comment indicating the end of output.
|
||||||
|
|
||||||
The intention is that you will write the body of your impl function
|
The intention is that you write the body of your impl function immediately
|
||||||
immediately after the output -- as in, you write a left-curly-brace
|
after the output -- as in, you write a left-curly-brace immediately after
|
||||||
immediately after the end-of-output comment and write the
|
the end-of-output comment and implement the builtin in the body there.
|
||||||
implementation of the builtin in the body there. (It's a bit strange
|
(It's a bit strange at first, but oddly convenient.)
|
||||||
at first, but oddly convenient.)
|
|
||||||
|
|
||||||
Argument Clinic will define the parameters of the impl function for
|
Argument Clinic will define the parameters of the impl function for
|
||||||
you. The function will take the "self" parameter passed in
|
you. The function will take the "self" parameter passed in
|
||||||
|
@ -332,6 +606,9 @@ overwrite the file. (You can force Clinic to overwrite with the
|
||||||
"``-f``" command-line argument; Clinic will also ignore the checksums
|
"``-f``" command-line argument; Clinic will also ignore the checksums
|
||||||
when using the "``-o``" command-line argument.)
|
when using the "``-o``" command-line argument.)
|
||||||
|
|
||||||
|
Finally, Argument Clinic can also emit the boilerplate definition
|
||||||
|
of the PyMethodDef array for the defined classes and modules.
|
||||||
|
|
||||||
|
|
||||||
Functions With Positional-Only Parameters
|
Functions With Positional-Only Parameters
|
||||||
=========================================
|
=========================================
|
||||||
|
@ -342,61 +619,90 @@ older positional-only API for processing arguments
|
||||||
their arguments differently based on how many arguments were passed
|
their arguments differently based on how many arguments were passed
|
||||||
in. This can provide some bewildering flexibility: there may be
|
in. This can provide some bewildering flexibility: there may be
|
||||||
groups of optional parameters, which must either all be specified or
|
groups of optional parameters, which must either all be specified or
|
||||||
none specified. And occasionally these groups are on the *left!* (For
|
none specified. And occasionally these groups are on the *left!* (A
|
||||||
example: ``curses.window.addch()``.)
|
representative example: ``curses.window.addch()``.)
|
||||||
|
|
||||||
Argument Clinic supports these legacy use-cases with a special set of
|
Argument Clinic supports these legacy use-cases by allowing you to
|
||||||
flags. First, set the flag "``positional-only``" on the entire
|
specify parameters in groups. Each optional group of parameters
|
||||||
function. Then, for every group of parameters that is collectively
|
is marked with square brackets. Note that these groups are permitted
|
||||||
optional, add a "``group=``" flag with a unique string to all the
|
on the right *or left* of any required parameters!
|
||||||
parameters in that group. Note that these groups are permitted on the
|
|
||||||
right *or left* of any required parameters! However, all groups
|
|
||||||
(including the group of required parameters) must be contiguous.
|
|
||||||
|
|
||||||
The impl function generated by Clinic will add an extra parameter for
|
The impl function generated by Clinic will add an extra parameter for
|
||||||
every group, "``int <group>_group``". This argument will be nonzero
|
every group, "``int group_{left|right}_<x>``", where x is a monotonically
|
||||||
if the group was specified on this call, and zero if it was not.
|
increasing number assigned to each group as it builds away from the
|
||||||
|
required arguments. This argument will be nonzero if the group was
|
||||||
|
specified on this call, and zero if it was not.
|
||||||
|
|
||||||
Note that when operating in this mode, you cannot specify default
|
Note that when operating in this mode, you cannot specify default
|
||||||
arguments. You can simulate defaults by putting parameters in
|
arguments.
|
||||||
individual groups and detecting whether or not they were specified;
|
|
||||||
generally speaking it's better to simply not use "positional-only"
|
|
||||||
where it isn't absolutely necessary. (TBD: It might be possible to
|
|
||||||
relax this restriction. But adding default arguments into the mix of
|
|
||||||
groups would seemingly make calculating which groups are active a good
|
|
||||||
deal harder.)
|
|
||||||
|
|
||||||
Also, note that it's possible to specify a set of groups to a function
|
Also, note that it's possible to specify a set of groups to a function
|
||||||
such that there are several valid mappings from the number of
|
such that there are several valid mappings from the number of
|
||||||
arguments to a valid set of groups. If this happens, Clinic will exit
|
arguments to a valid set of groups. If this happens, Clinic will abort
|
||||||
with an error message. This should not be a problem, as
|
with an error message. This should not be a problem, as
|
||||||
positional-only operation is only intended for legacy use cases, and
|
positional-only operation is only intended for legacy use cases, and
|
||||||
all the legacy functions using this quirky behavior should have
|
all the legacy functions using this quirky behavior have unambiguous
|
||||||
unambiguous mappings.
|
mappings.
|
||||||
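The ambiguity check described above can be sketched in Python. This is an illustration under stated assumptions, not Clinic's algorithm: it assumes a group can be active only if every group between it and the required parameters is also active, and the names ``resolve_groups``, ``left``, ``required``, and ``right`` are hypothetical.

```python
def resolve_groups(left, required, right):
    """Map each legal argument count to (active left groups, active right groups).

    left and right are lists of group sizes, ordered outward from the
    required parameters; required is the number of required parameters.
    Raises ValueError if two different group selections accept the same
    number of arguments.
    """
    mapping = {}
    for n_left in range(len(left) + 1):
        for n_right in range(len(right) + 1):
            count = sum(left[:n_left]) + required + sum(right[:n_right])
            if count in mapping:
                raise ValueError(
                    f"ambiguous: {count} arguments match both "
                    f"{mapping[count]} and {(n_left, n_right)}"
                )
            mapping[count] = (n_left, n_right)
    return mapping


# curses.window.addch([y, x,] ch[, attr]): one left group of two parameters,
# one required parameter, one right group of one parameter.
addch = resolve_groups(left=[2], required=1, right=[1])
```

With a one-parameter group on each side of a single required parameter, two arguments could activate either group, so this sketch would raise ``ValueError`` for that declaration.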
|
|
||||||
|
|
||||||
Current Status
|
Current Status
|
||||||
==============
|
==============
|
||||||
|
|
||||||
As of this writing, there is a working prototype implementation of
|
As of this writing, there is a working prototype implementation of
|
||||||
Argument Clinic available online. [7]_ The prototype implements the
|
Argument Clinic available online (though the syntax may be out of date
|
||||||
syntax above, and generates code using the existing ``PyArg_Parse``
|
as you read this). [7]_ The prototype generates code using the
|
||||||
APIs. It supports translating to all current format units except
|
existing ``PyArg_Parse`` APIs. It supports translating to all current
|
||||||
``"w*"``. Sample functions using Argument Clinic exercise all major
|
format units except the mysterious ``"w*"``. Sample functions using
|
||||||
features, including positional-only argument parsing.
|
Argument Clinic exercise all major features, including positional-only
|
||||||
|
argument parsing.
|
||||||
|
|
||||||
|
|
||||||
Extending Argument Clinic
|
Argument Clinic Programmatic Interfaces
|
||||||
-------------------------
|
---------------------------------------
|
||||||
|
|
||||||
The prototype also currently provides an experimental extension
|
The prototype also currently provides an experimental extension
|
||||||
mechanism, allowing adding support for new types on-the-fly. See
|
mechanism, allowing adding support for new types on-the-fly. See
|
||||||
``Modules/posixmodule.c`` in the prototype for an example of its use.
|
``Modules/posixmodule.c`` in the prototype for an example of its use.
|
||||||
|
|
||||||
|
In the future, Argument Clinic is expected to be automatable enough
|
||||||
|
to allow querying, modification, or outright new construction of
|
||||||
|
function declarations through Python code. It may even permit
|
||||||
|
dynamically adding your own custom DSL!
|
||||||
|
|
||||||
|
|
||||||
Notes / TBD
|
Notes / TBD
|
||||||
===========
|
===========
|
||||||
|
|
||||||
|
* Ideally we'd want Argument Clinic to run automatically as part of the
|
||||||
|
normal Python build process. But this presents a bootstrapping problem;
|
||||||
|
if you don't have a system Python 3, you need a Python 3 executable to
|
||||||
|
build Python 3. I'm sure this is a solvable problem, but I don't know
|
||||||
|
what the best solution might be. (Supporting this will also require
|
||||||
|
a parallel solution for Windows.)
|
||||||
|
|
||||||
|
* The original Clinic DSL syntax allowed naming the "groups" for
|
||||||
|
positional-only argument parsing. The current one does not;
|
||||||
|
they will therefore get computer-generated names (probably
|
||||||
|
left_1, right_2, etc.). Do we care about allowing the user
|
||||||
|
to explicitly specify names for the groups? The thing is, there's
|
||||||
|
no good place to put it. Only one syntax suggests itself to me,
|
||||||
|
and it's a syntax only a mother could love:
|
||||||
|
|
||||||
|
::
|
||||||
|
|
||||||
|
[ group_name
|
||||||
|
name: converter
|
||||||
|
name2: converter2
|
||||||
|
]
|
||||||
|
|
||||||
|
* During the PyCon US 2013 Language Summit, there was discussion of having
|
||||||
|
Argument Clinic also generate the actual documentation (in ReST, processed
|
||||||
|
by Sphinx) for the function. The logistics of this are TBD, but it would
|
||||||
|
require that the docstrings be written in ReST, and require that Python
|
||||||
|
ship a ReST -> ASCII converter. It would be best to come to a decision
|
||||||
|
about this before we begin any large-scale conversion of the CPython
|
||||||
|
source tree to using Clinic.
|
||||||
|
|
||||||
* Guido proposed having the "function docstring" be hand-written inline,
|
* Guido proposed having the "function docstring" be hand-written inline,
|
||||||
in the middle of the output, something like this:
|
in the middle of the output, something like this:
|
||||||
|
|
||||||
|
@ -420,10 +726,6 @@ Notes / TBD
|
||||||
* Do we need to support tuple unpacking? (The "``(OOO)``" style
|
* Do we need to support tuple unpacking? (The "``(OOO)``" style
|
||||||
format string.) Boy I sure hope not.
|
format string.) Boy I sure hope not.
|
||||||
|
|
||||||
* What about Python functions that take no arguments? This syntax
|
|
||||||
doesn't provide for that. Perhaps a lone indented "None" should
|
|
||||||
mean "no arguments"?
|
|
||||||
|
|
||||||
* This approach removes some dynamism / flexibility. With the
|
* This approach removes some dynamism / flexibility. With the
|
||||||
existing syntax one could theoretically pass in different encodings
|
existing syntax one could theoretically pass in different encodings
|
||||||
at runtime for the "``es``"/"``et``" format units. AFAICT CPython
|
at runtime for the "``es``"/"``et``" format units. AFAICT CPython
|
||||||
|
@ -433,14 +735,14 @@ Notes / TBD
|
||||||
socketmodule.c, except for one in _ssl.c. They're all static,
|
socketmodule.c, except for one in _ssl.c. They're all static,
|
||||||
specifying the encoding ``"idna"``.)
|
specifying the encoding ``"idna"``.)
|
||||||
|
|
||||||
* Right now the "basename" flag on a function changes the ``#define
|
|
||||||
methoddef`` name too. Should it, or should the #define'd methoddef
|
|
||||||
name always be ``{module_name}_{function_name}`` ?
|
|
||||||
|
|
||||||
|
|
||||||
References
|
References
|
||||||
==========
|
==========
|
||||||
|
|
||||||
|
.. [Cog] ``Cog``:
|
||||||
|
http://nedbatchelder.com/code/cog/
|
||||||
|
|
||||||
|
|
||||||
.. [1] ``PyArg_ParseTuple()``:
|
.. [1] ``PyArg_ParseTuple()``:
|
||||||
http://docs.python.org/3/c-api/arg.html#PyArg_ParseTuple
|
http://docs.python.org/3/c-api/arg.html#PyArg_ParseTuple
|
||||||
|
|
||||||
|
|