New DSL syntax and slightly changed semantics for the Argument Clinic DSL.

Larry Hastings 2013-03-17 00:09:36 -07:00
parent c084fd16c3
commit d247c96780
1 changed files with 478 additions and 176 deletions


@ -13,7 +13,7 @@ Created: 22-Feb-2013
Abstract Abstract
======== ========
This document proposes "Argument Clinic", a DSL designed to facilitate This document proposes "Argument Clinic", a DSL to facilitate
argument processing for built-in functions in the implementation of argument processing for built-in functions in the implementation of
CPython. CPython.
@ -22,36 +22,37 @@ Rationale and Goals
=================== ===================
The primary implementation of Python, "CPython", is written in a The primary implementation of Python, "CPython", is written in a
mixture of Python and C. One of the implementation details of CPython mixture of Python and C. One implementation detail of CPython
is what are called "built-in" functions -- functions available to is what are called "built-in" functions -- functions available to
Python programs but written in C. When a Python program calls a Python programs but written in C. When a Python program calls a
built-in function and passes in arguments, those arguments must be built-in function and passes in arguments, those arguments must be
translated from Python values into C values. This process is called translated from Python values into C values. This process is called
"parsing arguments". "parsing arguments".
As of CPython 3.3, arguments to functions are primarily parsed with As of CPython 3.3, builtin functions nearly always parse their arguments
one of two functions: the original ``PyArg_ParseTuple()``, [1]_ and with one of two functions: the original ``PyArg_ParseTuple()``, [1]_ and
the more modern ``PyArg_ParseTupleAndKeywords()``. [2]_ The former the more modern ``PyArg_ParseTupleAndKeywords()``. [2]_ The former
function only handles positional parameters; the latter also only handles positional parameters; the latter also accommodates keyword
accommodates keyword and keyword-only parameters, and is preferred for and keyword-only parameters, and is preferred for new code.
new code.
``PyArg_ParseTuple()`` was a reasonable approach when it was first With either function, the caller specifies the translation for
conceived. The programmer specified the translation for the arguments parsing arguments in a "format string": [3]_ each parameter corresponds
in a "format string": [3]_ each parameter matched to a "format unit", to a "format unit", a short character sequence telling the parsing
a one-or-two character sequence telling ``PyArg_ParseTuple()`` what function what Python types to accept and how to translate them into
Python types to accept and how to translate them into the appropriate the appropriate C value for that parameter.
C value for that parameter. There were only a dozen or so of these
"format units", and each one was distinct and easy to understand.
Over the years the ``PyArg_Parse`` interface has been extended in
numerous ways. The modern API is quite complex, to the point that it ``PyArg_ParseTuple()`` was reasonable when it was first conceived.
There were only a dozen or so of these "format units"; each one
was distinct, and easy to understand and remember.
But over the years the ``PyArg_Parse`` interface has been extended
in numerous ways. The modern API is complex, to the point that it
is somewhat painful to use. Consider: is somewhat painful to use. Consider:
* There are now forty different "format units"; a few are even three * There are now forty different "format units"; a few are even three
characters long. This makes it difficult to understand what the characters long. This makes it difficult for the programmer to
format string says without constantly cross-indexing it with the understand what the format string says--or even perhaps to parse
documentation. it--without constantly cross-indexing it with the documentation.
* There are also six meta-format units that may be buried in the * There are also six meta-format units that may be buried in the
format string. (They are: ``"()|$:;"``.) format string. (They are: ``"()|$:;"``.)
* The more format units are added, the less likely it is the * The more format units are added, the less likely it is the
@ -61,8 +62,9 @@ is somewhat painful to use. Consider:
format units become. format units become.
* Several format units are nearly identical to others, having only * Several format units are nearly identical to others, having only
subtle differences. This makes understanding the exact semantics subtle differences. This makes understanding the exact semantics
of the format string even harder. of the format string even harder, and can make choosing the right
* The docstring is specified as a static C string, which is mildly format unit a conundrum.
* The docstring is specified as a static C string, making it mildly
bothersome to read and edit. bothersome to read and edit.
* When adding a new parameter to a function using * When adding a new parameter to a function using
``PyArg_ParseTupleAndKeywords()``, it's necessary to touch six ``PyArg_ParseTupleAndKeywords()``, it's necessary to touch six
@ -91,18 +93,32 @@ inheriting none of these downsides:
* You need specify each parameter only once. * You need specify each parameter only once.
* All information about a parameter is kept together in one place. * All information about a parameter is kept together in one place.
* For each parameter, you specify its type in C; Argument Clinic * For each parameter, you specify a conversion function; Argument
handles the translation from Python value into C value for you. Clinic handles the translation from Python value into C value for
you.
* Argument Clinic also allows for fine-tuning of argument processing * Argument Clinic also allows for fine-tuning of argument processing
behavior with highly-readable "flags", both per-parameter and behavior with parameterized conversion functions.
applying across the whole function. * Docstrings are written in plain text. Function docstrings are
* Docstrings are written in plain text. required; per-parameter docstrings are encouraged.
* From this, Argument Clinic generates for you all the mundane, * From this, Argument Clinic generates for you all the mundane,
repetitious code and data structures CPython needs internally. repetitious code and data structures CPython needs internally.
Once you've specified the interface, the next step is simply to Once you've specified the interface, the next step is simply to
write your implementation using native C types. Every detail of write your implementation using native C types. Every detail of
argument parsing is handled for you. argument parsing is handled for you.
Argument Clinic is implemented as a preprocessor. It draws inspiration
for its workflow directly from [Cog]_ by Ned Batchelder. To use Clinic,
add a block comment to your C source code beginning and ending with
special text strings, then run Clinic on the file. Clinic will find the
block comment, process the contents, and write the output back into your
C source file directly after the comment. The intent is that Clinic's
output becomes part of your source code; it's checked in to revision
control, and distributed with source packages. This means that Python
will still ship ready-to-build. It does complicate development slightly;
in order to add a new function, or modify the arguments or documentation
of an existing function using Clinic, you'll need a working Python 3
interpreter.
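The first step of that workflow, finding the block comment, can be sketched in a few lines of Python. This is only an illustration: the start and end markers are the ones shown in the DSL table in this document, but the regular expression and the ``find_clinic_blocks`` name are assumptions, not Clinic's actual implementation.

```python
import re

# Markers taken from the DSL table in this document; the regex itself is
# an illustrative assumption about how a block could be located.
CLINIC_BLOCK = re.compile(r"/\*\[clinic\]\n(.*?)\n\[clinic\]\*/", re.DOTALL)


def find_clinic_blocks(c_source):
    """Return the DSL text of every Clinic block comment in c_source."""
    return [match.group(1) for match in CLINIC_BLOCK.finditer(c_source)]


example = (
    "#include <Python.h>\n"
    "/*[clinic]\n"
    "os.access\n"
    "    path: path\n"
    "Use the real uid/gid to test for access to a path.\n"
    "[clinic]*/\n"
    "static int unrelated;\n"
)
blocks = find_clinic_blocks(example)
```

A real preprocessor would also have to locate and replace any previously generated output that follows the comment; this sketch only extracts the input.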
Future goals of Argument Clinic include: Future goals of Argument Clinic include:
* providing signature information for builtins, and * providing signature information for builtins, and
@ -117,24 +133,118 @@ file, as follows. The "Example" column on the right shows you sample
input to the Argument Clinic DSL, and the "Section" column on the left input to the Argument Clinic DSL, and the "Section" column on the left
specifies what each line represents in turn. specifies what each line represents in turn.
Argument Clinic's DSL syntax mirrors the Python ``def``
statement, lending it some familiarity to Python core developers.
:: ::
+-----------------------+-----------------------------------------------------+ +-----------------------+-----------------------------------------------------------------+
| Section | Example | | Section | Example |
+-----------------------+-----------------------------------------------------+ +-----------------------+-----------------------------------------------------------------+
| Clinic DSL start | /*[clinic] | | Clinic DSL start | /*[clinic] |
| Function declaration | module.function_name -> return_annotation | | Module declaration | module module_name |
| Function flags | flag flag2 flag3=value | | Class declaration | class module_name.class_name |
| Parameter declaration | type name = default | | Function declaration | module_name.function_name -> return_annotation |
| Parameter flags | flag flag2 flag3=value | | Parameter declaration | name : converter(param=value) |
| Parameter docstring | Lorem ipsum dolor sit amet, consectetur | | Parameter docstring | Lorem ipsum dolor sit amet, consectetur |
| | adipisicing elit, sed do eiusmod tempor | | | adipisicing elit, sed do eiusmod tempor |
| Function docstring | Lorem ipsum dolor sit amet, consectetur adipisicing | | Function docstring | Lorem ipsum dolor sit amet, consectetur adipisicing |
| | elit, sed do eiusmod tempor incididunt ut labore et | | | elit, sed do eiusmod tempor incididunt ut labore et |
| Clinic DSL end | [clinic]*/ | | Clinic DSL end | [clinic]*/ |
| Clinic output | ... | | Clinic output | ... |
| Clinic output end | /*[clinic end output:<checksum>]*/ | | Clinic output end | /*[clinic end output:<checksum>]*/ |
+-----------------------+-----------------------------------------------------+ +-----------------------+-----------------------------------------------------------------+
To give some flavor of the proposed DSL syntax, here are some sample Clinic
code blocks. This first block reflects the normally preferred style, including
blank lines between parameters and per-argument docstrings.
::
/*[clinic]
os.stat as os_stat_fn -> stat result
path: path_t(allow_fd=1)
Path to be examined; can be string, bytes, or open-file-descriptor int.
*
dir_fd: OS_STAT_DIR_FD_CONVERTER = DEFAULT_DIR_FD
If not None, it should be a file descriptor open to a directory,
and path should be a relative string; path will then be relative to
that directory.
follow_symlinks: bool = True
If False, and the last element of the path is a symbolic link,
stat will examine the symbolic link itself instead of the file
the link points to.
Perform a stat system call on the given path.
{parameters}
dir_fd and follow_symlinks may not be implemented
on your platform. If they are unavailable, using them will raise a
NotImplementedError.
It's an error to use dir_fd or follow_symlinks when specifying path as
an open file descriptor.
[clinic]*/
This second example shows a minimal Clinic code block, omitting all
parameter docstrings and non-significant blank lines::
/*[clinic]
os.access
path: path
mode: int
*
dir_fd: OS_ACCESS_DIR_FD_CONVERTER = 1
effective_ids: bool = False
follow_symlinks: bool = True
Use the real uid/gid to test for access to a path.
Returns True if granted, False otherwise.
{parameters}
dir_fd, effective_ids, and follow_symlinks may not be implemented
on your platform. If they are unavailable, using them will raise a
NotImplementedError.
Note that most operations will use the effective uid/gid, therefore this
routine can be used in a suid/sgid environment to test if the invoking user
has the specified access to the path.
[clinic]*/
This final example shows a Clinic code block handling groups of
optional parameters, including parameters on the left::
/*[clinic]
curses.window.addch
[
x: int
X-coordinate.
y: int
Y-coordinate.
]
ch: char
Character to add.
[
attr: long
Attributes for the character.
]
Paint character ch at (y, x) with attributes attr,
overwriting any character previously painted at that location.
By default, the character position and attributes are the
current settings for the window object.
[clinic]*/
General Behavior Of the Argument Clinic DSL General Behavior Of the Argument Clinic DSL
@ -145,113 +255,219 @@ docstrings. Blank lines are always ignored.
Like Python itself, leading whitespace is significant in the Argument Like Python itself, leading whitespace is significant in the Argument
Clinic DSL. The first line of the "function" section is the Clinic DSL. The first line of the "function" section is the
declaration; all subsequent lines at the same indent are function function declaration. Indented lines below the function declaration
flags. Once you indent, the first line is a parameter declaration; declare parameters, one per line; lines below those that are indented even
subsequent lines at that indent are parameter flags. Indent one more further are per-parameter docstrings. Finally, the first line dedented
time for the lines of the parameter docstring. Finally, dedent back back to column 0 ends parameter declarations and starts the function docstring.
to the same level as the function declaration for the function
docstring. Parameter docstrings are optional; function docstrings are not.
Functions that specify no arguments may simply specify the function
declaration followed by the docstring.
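The indentation rules in the new text can be illustrated with a small Python sketch that labels each line of a Clinic block. The ``classify_block`` helper is hypothetical and deliberately simplified: it assumes space-only indentation and does not handle directives or the special symbol lines (``*``, ``[``, ``]``, ``/``) differently from ordinary parameter lines.

```python
def classify_block(lines):
    """Label each line of a Clinic block according to its indentation."""
    kinds = []
    seen_declaration = False
    parameter_indent = None
    in_docstring = False
    for line in lines:
        if in_docstring:
            # Everything after the docstring starts belongs to it.
            kinds.append("function docstring")
            continue
        if not line.strip():
            kinds.append("blank")
            continue
        indent = len(line) - len(line.lstrip(" "))
        if not seen_declaration:
            seen_declaration = True
            kinds.append("function declaration")
        elif indent == 0:
            # First line back at column 0 starts the function docstring.
            in_docstring = True
            kinds.append("function docstring")
        elif parameter_indent is None or indent == parameter_indent:
            # The first parameter establishes the indent for all others.
            parameter_indent = indent
            kinds.append("parameter")
        elif indent > parameter_indent:
            kinds.append("parameter docstring")
        else:
            raise ValueError("inconsistent indent: {!r}".format(line))
    return kinds


kinds = classify_block([
    "os.access",
    "    path: path",
    "        Path to test.",
    "    mode: int",
    "",
    "Use the real uid/gid to test for access to a path.",
    "    Indented docstring text stays part of the docstring.",
])
```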
Module and Class Declarations
-----------------------------
When a C file implements a module or class, this should be declared to
Clinic. The syntax is simple:
::
module module_name
or
::
class module_name.class_name
(Note that these are not actually special syntax; they are implemented
as `Directives`_.)
The module name or class name should always be the full dotted path
from the top-level module. Nested modules and classes are supported.
Function Declaration Function Declaration
-------------------- --------------------
The return annotation is optional. If skipped, the arrow ("``->``") The full form of the function declaration is as follows:
::
dotted.name [ as legal_c_id ] [ -> return_annotation ]
The dotted name should be the full name of the function, starting
with the highest-level package (e.g. "os.stat" or "curses.window.addch").
The "as legal_c_id" syntax is optional.
Argument Clinic uses the name of the function to create the names of
the generated C functions. In some circumstances, the generated name
may collide with other global names in the C program's namespace.
The "as legal_c_id" syntax allows you to override the generated name
with your own; substitute "legal_c_id" with any legal C identifier.
If skipped, the "as" keyword must also be omitted.
The return annotation is also optional. If skipped, the arrow ("``->``")
must also be omitted. must also be omitted.
Parameter Declaration Parameter Declaration
--------------------- ---------------------
The "type" is a C type. If it's a pointer type, you must specify a The full form of the parameter declaration line is as follows:
single space between the type and the "``*``", and zero spaces between
the "``*``" and the name. (e.g. "``PyObject *foo``", not "``PyObject*
foo``")
The "name" must be a legal C identifier. ::
The "default" is a Python value. Default values are optional; if not name: converter [ (parameter=value [, parameter2=value2]) ] [ = default]
specified you must omit the equals sign too. Parameters which don't
have a default are implicitly required. The default value is The "name" must be a legal C identifier. Whitespace is permitted between
the name and the colon (though this is not the preferred style). Whitespace
is permitted (and encouraged) between the colon and the converter.
The "converter" is the name of one of the "converter functions" registered
with Argument Clinic. Clinic will ship with a number of built-in converters;
new converters can also be added dynamically. In choosing a converter, you
are automatically constraining what Python types are permitted on the input,
and specifying what type the output variable (or variables) will be. Although
many of the converters will resemble the names of C types or perhaps Python
types, the name of a converter may be any legal Python identifier.
If the converter is followed by parentheses, these parentheses enclose
parameters to the conversion function. The syntax mirrors providing arguments to
a Python function call: the parameters must always be named, as if they were
"keyword-only parameters", and the values provided for the parameters will
syntactically resemble Python literal values. These parameters are always
optional, permitting all conversion functions to be called without
any parameters. In this case, you may also omit the parentheses entirely;
this is always equivalent to specifying empty parentheses.
The "default" is a Python literal value. Default values are optional;
if not specified you must omit the equals sign too. Parameters which
don't have a default are implicitly required. The default value is
dynamically assigned, "live" in the generated C code, and although dynamically assigned, "live" in the generated C code, and although
it's specified as a Python value, it's translated into a native C it's specified as a Python value, it's translated into a native C
value in the generated C code. value in the generated C code. Few default values are permitted,
owing to this manual translation step.
It's explicitly permitted to end the parameter declaration line with a If this were a Python function declaration, a parameter declaration
semicolon, though the semicolon is optional. This is intended to would be delimited by either a trailing comma or an ending parenthesis.
allow directly cutting and pasting in declarations from C code. However, Argument Clinic uses neither; parameter declarations are
However, the preferred style is without the semicolon. delimited by a newline. A trailing comma or right parenthesis is not
permitted.
The first parameter declaration establishes the indent for all parameter
declarations in a particular Clinic code block. All subsequent parameters
must be indented to the same level.
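Because the new parameter syntax mirrors a Python ``def`` parameter, one way to picture parsing it is to hand the line to Python's own parser. The sketch below does that with the standard ``ast`` module (``ast.unparse`` needs Python 3.9 or later). The ``parse_parameter`` helper and its dictionary result are hypothetical; nothing here is claimed to match how Clinic itself parses the line.

```python
import ast


def parse_parameter(line):
    """Split one parameter declaration line into its parts (illustrative only)."""
    text = line.strip()
    if text.endswith(","):
        raise ValueError("a trailing comma is not permitted")
    # Wrap the line in a throwaway "def" so Python's parser does the work.
    arguments = ast.parse("def f({}): pass".format(text)).body[0].args
    if len(arguments.args) != 1 or arguments.vararg or arguments.kwarg:
        raise ValueError("expected exactly one parameter: {!r}".format(line))
    argument = arguments.args[0]
    annotation = argument.annotation
    options = {}
    if isinstance(annotation, ast.Call):
        # converter(parameter=value, ...): parameters must always be named.
        if annotation.args:
            raise ValueError("converter parameters must be named")
        converter = annotation.func.id
        options = {kw.arg: ast.literal_eval(kw.value) for kw in annotation.keywords}
    elif isinstance(annotation, ast.Name):
        converter = annotation.id
    elif isinstance(annotation, ast.Constant) and isinstance(annotation.value, str):
        converter = annotation.value  # legacy converter given as a quoted format unit
    else:
        raise ValueError("missing or unsupported converter: {!r}".format(line))
    # Keep the default as source text; translating it to C is a separate step.
    default = ast.unparse(arguments.defaults[0]) if arguments.defaults else None
    return {"name": argument.arg, "converter": converter,
            "options": options, "default": default}


stat_path = parse_parameter("path: path_t(allow_fd=1)")
follow = parse_parameter("follow_symlinks: bool = True")
legacy = parse_parameter('foo : "i"')
```

Keeping the default as unevaluated text lets the sketch accept a line such as ``dir_fd: OS_STAT_DIR_FD_CONVERTER = DEFAULT_DIR_FD`` from the example above without deciding what the name means.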
Flags Legacy Converters
----- -----------------
"Flags" are like "``make -D``" arguments. They're unordered. Flags For convenience's sake in converting existing code to Argument Clinic,
lines are parsed much like the shell (specifically, using Clinic provides a set of legacy converters that match ``PyArg_ParseTuple``
``shlex.split()`` [5]_ ). You can have as many flag lines as you format units. They are specified as a C string containing the format
like. Specifying a flag twice is currently an error. unit. For example, to specify a parameter "foo" as taking a Python
"int" and emitting a C int, you could specify:
Supported flags for functions: ::
``basename`` foo : "i"
The basename to use for the generated C functions. By default this
is the name of the function from the DSL, only with periods replaced
by underscores.
``positional-only`` (To more closely resemble a C string, these must always use double quotes.)
This function only supports positional parameters, not keyword
parameters. See `Functions With Positional-Only Parameters`_ below.
Supported flags for parameters: Although these resemble ``PyArg_ParseTuple`` format units, no guarantee is
made that the implementation will call a ``PyArg_Parse`` function for parsing.
``bitwise`` This syntax does not support parameters. Therefore it doesn't support any
If the Python integer passed in is signed, copy the bits directly of the format units that require input parameters (``"O!", "O&", "es", "es#",
even if it is negative. Only valid for unsigned integer types. "et", "et#"``). Parameters requiring one of these conversions cannot use the
legacy syntax. (You may still, however, supply a default value.)
``converter``
Backwards-compatibility support for parameter "converter" Parameter Docstrings
functions. [6]_ The value should be the name of the converter --------------------
function in C. Only valid when the type of the parameter is
``void *``. All lines that appear below and are indented further than a parameter declaration
are the docstring for that parameter. All such lines are "dedented" until the
first line is flush left.
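Read literally, the dedent rule removes the first line's indent from every line of the parameter docstring. A minimal Python sketch of that reading follows; the ``dedent_parameter_docstring`` name is hypothetical and space-only indentation is assumed.

```python
def dedent_parameter_docstring(lines):
    """Shift all lines left until the first line has no leading spaces."""
    if not lines:
        return []
    first_indent = len(lines[0]) - len(lines[0].lstrip(" "))
    result = []
    for line in lines:
        own_indent = len(line) - len(line.lstrip(" "))
        # Never strip more than the line actually has.
        result.append(line[min(first_indent, own_indent):])
    return result


dedented = dedent_parameter_docstring([
    "        If False, and the last element of the path is a symbolic link,",
    "        stat will examine the symbolic link itself instead of the file",
    "          the link points to.",
])
```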
Special Syntax For Parameter Lines
----------------------------------
There are four special symbols that may be used in the parameter section. Each
of these must appear on a line by itself, indented to the same level as parameter
declarations. The four symbols are:
``*``
Establishes that all subsequent parameters are keyword-only.
``[``
Establishes the start of an optional "group" of parameters.
Note that "groups" may nest inside other "groups".
See `Functions With Positional-Only Parameters`_ below.
``]``
Ends an optional "group" of parameters.
``/``
This hints to Argument Clinic that this function is performance-sensitive,
and that it's acceptable to forgo supporting keyword parameters when parsing.
(In early implementations of Clinic, this will switch Clinic from generating
code using ``PyArg_ParseTupleAndKeywords`` to using ``PyArg_ParseTuple``.
The hope is that in the future there will be no appreciable speed difference,
rendering this syntax irrelevant and deprecated but harmless.)
Function Docstring
------------------
The first line with no leading whitespace after the function declaration is the
first line of the function docstring. All subsequent lines of the Clinic block
are considered part of the docstring, and their leading whitespace is preserved.
If the string ``{parameters}`` appears on a line by itself inside the function
docstring, Argument Clinic will insert a list of all parameters that have
docstrings, each such parameter followed by its docstring. The name of the
parameter is on a line by itself; the docstring starts on a subsequent line,
and all lines of the docstring are indented by two spaces. (Parameters with
no per-parameter docstring are suppressed.) The entire list is indented by the
leading whitespace that appeared before the ``{parameters}`` token.
If the string ``{parameters}`` doesn't appear in the docstring, Argument Clinic
will append one to the end of the docstring, inserting a blank line above it if
the docstring does not end with a blank line, and with the parameter list at
column 0.
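The ``{parameters}`` substitution rules above can be sketched in Python as follows. The ``render_docstring`` helper is hypothetical, and it takes the docstring as a list of lines and the parameters as ``(name, docstring_lines)`` pairs, which is an assumption made only to keep the example short.

```python
def render_docstring(docstring_lines, parameters):
    """Expand {parameters} in a function docstring (illustrative only)."""
    entries = []
    for name, doc_lines in parameters:
        if not doc_lines:
            continue  # parameters without a docstring are suppressed
        entries.append(name)
        entries.extend("  " + line for line in doc_lines)
    lines = list(docstring_lines)
    for index, line in enumerate(lines):
        if line.strip() == "{parameters}":
            # Indent the whole list by the whitespace before the token.
            prefix = line[: len(line) - len(line.lstrip())]
            lines[index:index + 1] = [prefix + entry for entry in entries]
            return lines
    # No token: append the list at column 0, separated by a blank line.
    if lines and lines[-1].strip():
        lines.append("")
    return lines + entries


parameters = [
    ("path", ["Path to be examined."]),
    ("mode", []),
    ("follow_symlinks", ["If False, examine the link itself."]),
]
with_token = render_docstring(
    ["Perform a stat system call.", "", "  {parameters}", "", "See also lstat."],
    parameters,
)
without_token = render_docstring(["Perform a stat system call."], parameters)
```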
Converters
----------
Argument Clinic contains a pre-initialized registry of converter functions.
Example converter functions:
``int``
Accepts a Python object implementing ``__int__``; emits a C ``int``.
``byte``
Accepts a Python int; emits an ``unsigned char``. The integer
must be in the range [0, 256).
``str``
Accepts a Python str object; emits a C ``char *``. Automatically
encodes the string using the ``ascii`` codec.
``PyObject``
Accepts any object; emits a C ``PyObject *`` without any conversion.
All converters accept the following parameters:
``default`` ``default``
The Python value to use in place of the parameter's actual default The Python value to use in place of the parameter's actual default
in Python contexts. Specifically, when specified, this value will in Python contexts. In other words: when specified, this value will
be used for the parameter's default in the docstring, and in the be used for the parameter's default in the docstring, and in the
``Signature``. (TBD: If the string is a valid Python expression ``Signature``. (TBD alternative semantics: If the string is a valid
which can be rendered into a Python value using ``eval()``, then the Python expression which can be rendered into a Python value using
result of ``eval()`` on it will be used as the default in the ``eval()``, then the result of ``eval()`` on it will be used as the
``Signature``.) Ignored if there is no default. default in the ``Signature``.) Ignored if there is no default.
``encoding``
Encoding to use when encoding a Unicode string to a ``char *``.
Only valid when the type of the parameter is ``char *``.
``group=``
This parameter is part of a group of options that must either all be
specified or none specified. Parameters in the same "group" must be
contiguous. The value of the group flag is the name used for the
group variable, and therefore must be legal as a C identifier. Only
valid for functions marked "``positional-only``"; see `Functions
With Positional-Only Parameters`_ below.
``immutable``
Only accept immutable values.
``keyword-only``
This parameter (and all subsequent parameters) is keyword-only.
Keyword-only parameters must also be optional parameters. Not valid
for positional-only functions.
``length``
This is an iterable type, and we also want its length. The DSL will
generate a second ``Py_ssize_t`` variable; its name will be this
parameter's name appended with "``_length``".
``nullable``
``None`` is a legal argument for this parameter. If ``None`` is
supplied on the Python side, the equivalent C argument will be
``NULL``. Only valid for pointer types.
``required`` ``required``
Normally any parameter that has a default value is automatically Normally any parameter that has a default value is automatically
@ -259,24 +475,78 @@ Supported flags for parameters:
required (non-optional) even if it has a default value. The required (non-optional) even if it has a default value. The
generated documentation will also not show any default value. generated documentation will also not show any default value.
``types``
Space-separated list of acceptable Python types for this object.
There are also four special-case types which represent Python
protocols:
* buffer Additionally, converters may accept one or more of these optional
* mapping parameters, on an individual basis:
* number
* sequence ``bitwise``
For converters that accept unsigned integers. If the Python integer
passed in is signed, copy the bits directly even if it is negative.
``encoding``
For converters that accept str. Encoding to use when encoding a
Unicode string to a ``char *``.
``immutable``
Only accept immutable values.
``length``
For converters that accept iterable types. Requests that the converter
also emit the length of the iterable, passed in to the ``_impl`` function
in a ``Py_ssize_t`` variable; its name will be this
parameter's name appended with "``_length``".
``nullable``
This converter normally does not accept ``None``, but in this case
it should. If ``None`` is supplied on the Python side, the equivalent
C argument will be ``NULL``. (The ``_impl`` argument emitted by this
converter will presumably be a pointer type.)
``types``
A list of strings representing acceptable Python types for this object.
There are also four strings which represent Python protocols:
* "buffer"
* "mapping"
* "number"
* "sequence"
``zeroes`` ``zeroes``
This parameter is a string type, and its value should be allowed to For converters that accept string types. The converted value should
have embedded zeroes. Not valid for all varieties of string be allowed to have embedded zeroes.
parameters.
Directives
----------
Argument Clinic also permits "directives" in Clinic code blocks.
Directives are similar to *pragmas* in C; they are statements
that modify Argument Clinic's behavior.
The format of a directive is as follows:
::
directive_name [argument [second_argument [ ... ]]]
Directives only take positional arguments.
A Clinic code block must contain either one or more directives,
or a function declaration. It may contain both, in which
case all directives must come before the function declaration.
Internally directives map directly to Python callables.
The directive's arguments are passed directly to the callable
as positional arguments of type ``str``.
Examples of possible directives include the production,
suppression, or redirection of Clinic output. Also, the
"module" and "class" keywords are actually implemented
as directives.
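The mapping from directives to Python callables might look something like the sketch below. The ``ClinicState`` class, the registry dictionary, and the handler names are all hypothetical; the only details taken from the text are that a directive is a name followed by positional arguments, that the arguments arrive as strings, and that ``module`` and ``class`` are directives.

```python
class ClinicState:
    """Hypothetical container for what the directives record."""

    def __init__(self):
        self.modules = []
        self.classes = []


def make_directives(state):
    """Build a name -> callable registry bound to the given state."""
    def directive_module(name):
        state.modules.append(name)

    def directive_class(name):
        state.classes.append(name)

    return {"module": directive_module, "class": directive_class}


def run_directive(directives, line):
    """Split a directive line and call its handler with str arguments."""
    name, *arguments = line.split()
    if name not in directives:
        raise ValueError("unknown directive: {!r}".format(name))
    return directives[name](*arguments)


state = ClinicState()
directives = make_directives(state)
run_directive(directives, "module curses")
run_directive(directives, "class curses.window")
```

Because the handlers take only positional parameters, passing the wrong number of arguments surfaces as an ordinary ``TypeError`` from the call.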
Python Code Python Code
----------- ===========
Argument Clinic also permits embedding Python code inside C files, Argument Clinic also permits embedding Python code inside C files,
which is executed in-place when Argument Clinic processes the file. which is executed in-place when Argument Clinic processes the file.
@ -290,16 +560,21 @@ Embedded code looks like this:
print("/" + "* Hello world! *" + "/") print("/" + "* Hello world! *" + "/")
[python]*/ [python]*/
/* Hello world! */
/*[python end:da39a3ee5e6b4b0d3255bfef95601890afd80709]*/
The ``"/* Hello world! */"`` line above was generated by running the Python
code in the preceding comment.
Any Python code is valid. Python code sections in Argument Clinic can Any Python code is valid. Python code sections in Argument Clinic can
also be used to modify Clinic's behavior at runtime; for example, see also be used to directly interact with Clinic; see
`Extending Argument Clinic`_. `Argument Clinic Programmatic Interfaces`_.
Output Output
====== ======
Argument Clinic writes its output in-line in the C file, immediately Argument Clinic writes its output inline in the C file, immediately
after the section of Clinic code. For "python" sections, the output after the section of Clinic code. For "python" sections, the output
is everything printed using ``builtins.print``. For "clinic" is everything printed using ``builtins.print``. For "clinic"
sections, the output is valid C code, including: sections, the output is valid C code, including:
@ -313,11 +588,10 @@ sections, the output is valid C code, including:
* the definition line of the "impl" function * the definition line of the "impl" function
* and a comment indicating the end of output. * and a comment indicating the end of output.
The intention is that you will write the body of your impl function The intention is that you write the body of your impl function immediately
immediately after the output -- as in, you write a left-curly-brace after the output -- as in, you write a left-curly-brace immediately after
immediately after the end-of-output comment and write the the end-of-output comment and implement the builtin in the body there.
implementation of the builtin in the body there. (It's a bit strange (It's a bit strange at first, but oddly convenient.)
at first, but oddly convenient.)
Argument Clinic will define the parameters of the impl function for Argument Clinic will define the parameters of the impl function for
you. The function will take the "self" parameter passed in you. The function will take the "self" parameter passed in
@ -332,6 +606,9 @@ overwrite the file. (You can force Clinic to overwrite with the
"``-f``" command-line argument; Clinic will also ignore the checksums "``-f``" command-line argument; Clinic will also ignore the checksums
when using the "``-o``" command-line argument.) when using the "``-o``" command-line argument.)
Finally, Argument Clinic can also emit the boilerplate definition
of the PyMethodDef array for the defined classes and modules.
Functions With Positional-Only Parameters Functions With Positional-Only Parameters
========================================= =========================================
@ -342,61 +619,90 @@ older positional-only API for processing arguments
their arguments differently based on how many arguments were passed
in. This can provide some bewildering flexibility: there may be
groups of optional parameters, which must either all be specified or
none specified. And occasionally these groups are on the *left!* (A
representative example: ``curses.window.addch()``.)

Argument Clinic supports these legacy use-cases by allowing you to
specify parameters in groups. Each optional group of parameters
is marked with square brackets. Note that these groups are permitted
on the right *or left* of any required parameters!

The impl function generated by Clinic will add an extra parameter for
every group, "``int group_{left|right}_<x>``", where x is a monotonically
increasing number assigned to each group as it builds away from the
required arguments. This argument will be nonzero if the group was
specified on this call, and zero if it was not.
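The naming scheme above can be illustrated with a short Python sketch.
This is not code from the prototype: the helper ``group_flag_names`` is
hypothetical, and it assumes that numbering starts at 1 for the group
nearest the required parameters.

```python
def group_flag_names(n_left, n_right):
    """Return hypothetical group flag names in declaration order.

    Assumption: numbering starts at 1 for the group adjacent to the
    required parameters and increases moving away from them.
    """
    # Left groups are declared farthest-first, so their numbers count
    # down toward the required parameters.
    left = [f"group_left_{i}" for i in range(n_left, 0, -1)]
    # Right groups are declared nearest-first, so their numbers count up.
    right = [f"group_right_{i}" for i in range(1, n_right + 1)]
    return left + right


if __name__ == "__main__":
    # One optional group on each side of the required parameters,
    # in the style of curses.window.addch().
    print(group_flag_names(1, 1))
```

Under these assumptions, a function with two optional groups on the
left and none on the right would receive the flags ``group_left_2``
and ``group_left_1``, in that order.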
Note that when operating in this mode, you cannot specify default
arguments. You can simulate defaults by putting parameters in
individual groups and detecting whether or not they were specified;
generally speaking it's better to simply not use "positional-only"
where it isn't absolutely necessary. (TBD: It might be possible to
relax this restriction. But adding default arguments into the mix of
groups would seemingly make calculating which groups are active a good
deal harder.)

Also, note that it's possible to specify a set of groups to a function
such that there are several valid mappings from the number of
arguments to a valid set of groups. If this happens, Clinic will abort
with an error message. This should not be a problem, as
positional-only operation is only intended for legacy use cases, and
all the legacy functions using this quirky behavior have unambiguous
mappings.
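The ambiguity check described above might be sketched as follows. This
is an illustration only, not the prototype's algorithm: the function
``map_counts_to_groups`` is hypothetical, and it assumes that optional
groups become active nearest-to-required first on each side.

```python
from itertools import product


def map_counts_to_groups(left_sizes, required, right_sizes):
    """Map each legal argument count to (active left groups, active right groups).

    left_sizes and right_sizes list the number of parameters in each
    optional group, ordered from the group nearest the required
    parameters outward.  Raises ValueError if two different sets of
    groups accept the same number of arguments.
    """
    mapping = {}
    for n_left, n_right in product(range(len(left_sizes) + 1),
                                   range(len(right_sizes) + 1)):
        # Total arguments consumed when this many groups are active.
        count = sum(left_sizes[:n_left]) + required + sum(right_sizes[:n_right])
        if count in mapping:
            raise ValueError(
                f"ambiguous: {count} arguments could mean groups "
                f"{mapping[count]} or {(n_left, n_right)}")
        mapping[count] = (n_left, n_right)
    return mapping


if __name__ == "__main__":
    # Shaped like curses.window.addch(): [y, x] ch [attr]
    print(map_counts_to_groups([2], 1, [1]))
```

A declaration with a one-parameter group on each side of a single
required parameter would be rejected by this sketch, since two
arguments could activate either the left group or the right group.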
Current Status
==============

As of this writing, there is a working prototype implementation of
Argument Clinic available online (though the syntax may be out of date
as you read this). [7]_ The prototype generates code using the
existing ``PyArg_Parse`` APIs. It supports translating to all current
format units except the mysterious ``"w*"``. Sample functions using
Argument Clinic exercise all major features, including positional-only
argument parsing.

Argument Clinic Programmatic Interfaces
---------------------------------------

The prototype also currently provides an experimental extension
mechanism, which allows adding support for new types on the fly. See
``Modules/posixmodule.c`` in the prototype for an example of its use.

In the future, Argument Clinic is expected to be automatable enough
to allow querying, modification, or outright new construction of
function declarations through Python code. It may even permit
dynamically adding your own custom DSL!
Notes / TBD
===========

* Optimally we'd want Argument Clinic run automatically as part of the
  normal Python build process. But this presents a bootstrapping problem;
  if you don't have a system Python 3, you need a Python 3 executable to
  build Python 3. I'm sure this is a solvable problem, but I don't know
  what the best solution might be. (Supporting this will also require
  a parallel solution for Windows.)

* The original Clinic DSL syntax allowed naming the "groups" for
  positional-only argument parsing. The current one does not;
  they will therefore get computer-generated names (probably
  left_1, right_2, etc.). Do we care about allowing the user
  to explicitly specify names for the groups? The thing is, there's
  no good place to put it. Only one syntax suggests itself to me,
  and it's a syntax only a mother could love:

  ::

      [ group_name
        name: converter
        name2: converter2
      ]

* During the PyCon US 2013 Language Summit, there was discussion of having
  Argument Clinic also generate the actual documentation (in ReST, processed
  by Sphinx) for the function. The logistics of this are TBD, but it would
  require that the docstrings be written in ReST, and require that Python
  ship a ReST -> ascii converter. It would be best to come to a decision
  about this before we begin any large-scale conversion of the CPython
  source tree to using Clinic.

* Guido proposed having the "function docstring" be hand-written inline,
  in the middle of the output, something like this:
@ -420,10 +726,6 @@ Notes / TBD
* Do we need to support tuple unpacking? (The "``(OOO)``" style
  format string.) Boy I sure hope not.

* What about Python functions that take no arguments? This syntax
  doesn't provide for that. Perhaps a lone indented "None" should
  mean "no arguments"?

* This approach removes some dynamism / flexibility. With the
  existing syntax one could theoretically pass in different encodings
  at runtime for the "``es``"/"``et``" format units. AFAICT CPython
@ -433,14 +735,14 @@ Notes / TBD
  socketmodule.c, except for one in _ssl.c. They're all static,
  specifying the encoding ``"idna"``.)

* Right now the "basename" flag on a function changes the ``#define
  methoddef`` name too. Should it, or should the #define'd methoddef
  name always be ``{module_name}_{function_name}`` ?


References
==========

.. [Cog] ``Cog``:
   http://nedbatchelder.com/code/cog/

.. [1] ``PyArg_ParseTuple()``:
   http://docs.python.org/3/c-api/arg.html#PyArg_ParseTuple