811 lines
31 KiB
Plaintext
811 lines
31 KiB
Plaintext
PEP: 436
|
||
Title: The Argument Clinic DSL
|
||
Version: $Revision$
|
||
Last-Modified: $Date$
|
||
Author: Larry Hastings <larry@hastings.org>
|
||
Discussions-To: python-dev@python.org
|
||
Status: Final
|
||
Type: Standards Track
|
||
Content-Type: text/x-rst
|
||
Created: 22-Feb-2013
|
||
Python-Version: 3.4
|
||
|
||
|
||
Abstract
|
||
========
|
||
|
||
This document proposes "Argument Clinic", a DSL to facilitate
|
||
argument processing for built-in functions in the implementation of
|
||
CPython.
|
||
|
||
|
||
Rationale and Goals
|
||
===================
|
||
|
||
The primary implementation of Python, "CPython", is written in a
|
||
mixture of Python and C. One implementation detail of CPython
|
||
is what are called "built-in" functions -- functions available to
|
||
Python programs but written in C. When a Python program calls a
|
||
built-in function and passes in arguments, those arguments must be
|
||
translated from Python values into C values. This process is called
|
||
"parsing arguments".
|
||
|
||
As of CPython 3.3, builtin functions nearly always parse their arguments
|
||
with one of two functions: the original ``PyArg_ParseTuple()``, [1]_ and
|
||
the more modern ``PyArg_ParseTupleAndKeywords()``. [2]_ The former
|
||
only handles positional parameters; the latter also accommodates keyword
|
||
and keyword-only parameters, and is preferred for new code.
|
||
|
||
With either function, the caller specifies the translation for
|
||
parsing arguments in a "format string": [3]_ each parameter corresponds
|
||
to a "format unit", a short character sequence telling the parsing
|
||
function what Python types to accept and how to translate them into
|
||
the appropriate C value for that parameter.
|
||
|
||
|
||
``PyArg_ParseTuple()`` was reasonable when it was first conceived.
|
||
There were only a dozen or so of these "format units"; each one
|
||
was distinct, and easy to understand and remember.
|
||
But over the years the ``PyArg_Parse`` interface has been extended
|
||
in numerous ways. The modern API is complex, to the point that it
|
||
is somewhat painful to use. Consider:
|
||
|
||
* There are now forty different "format units"; a few are even three
|
||
characters long. This makes it difficult for the programmer to
|
||
understand what the format string says--or even perhaps to parse
|
||
it--without constantly cross-indexing it with the documentation.
|
||
* There are also six meta-format units that may be buried in the
|
||
format string. (They are: ``"()|$:;"``.)
|
||
* The more format units are added, the less likely it is the
|
||
implementer can pick an easy-to-use mnemonic for the format unit,
|
||
because the character of choice is probably already in use. In
|
||
other words, the more format units we have, the more obtuse the
|
||
format units become.
|
||
* Several format units are nearly identical to others, having only
|
||
subtle differences. This makes understanding the exact semantics
|
||
of the format string even harder, and can make it difficult to
|
||
figure out exactly which format unit you want.
|
||
* The docstring is specified as a static C string, making it mildly
|
||
bothersome to read and edit since it must obey C string quoting rules.
|
||
* When adding a new parameter to a function using
|
||
``PyArg_ParseTupleAndKeywords()``, it's necessary to touch six
|
||
different places in the code: [4]_
|
||
|
||
* Declaring the variable to store the argument.
|
||
* Passing in a pointer to that variable in the correct spot in
|
||
``PyArg_ParseTupleAndKeywords()``, also passing in any
|
||
"length" or "converter" arguments in the correct order.
|
||
* Adding the name of the argument in the correct spot of the
|
||
"keywords" array passed in to
|
||
``PyArg_ParseTupleAndKeywords()``.
|
||
* Adding the format unit to the correct spot in the format
|
||
string.
|
||
* Adding the parameter to the prototype in the docstring.
|
||
* Documenting the parameter in the docstring.
|
||
|
||
* There is currently no mechanism for builtin functions to provide
|
||
their "signature" information (see ``inspect.getfullargspec`` and
|
||
``inspect.Signature``). Adding this information using a mechanism
|
||
similar to the existing ``PyArg_Parse`` functions would require
|
||
repeating ourselves yet again.
|
||
|
||
The goal of Argument Clinic is to replace this API with a mechanism
|
||
inheriting none of these downsides:
|
||
|
||
* You need specify each parameter only once.
|
||
* All information about a parameter is kept together in one place.
|
||
* For each parameter, you specify a conversion function; Argument
|
||
Clinic handles the translation from Python value into C value for
|
||
you.
|
||
* Argument Clinic also allows for fine-tuning of argument processing
|
||
behavior with parameterized conversion functions.
|
||
* Docstrings are written in plain text. Function docstrings are
|
||
required; per-parameter docstrings are encouraged.
|
||
* From this, Argument Clinic generates for you all the mundane,
|
||
repetitious code and data structures CPython needs internally.
|
||
Once you've specified the interface, the next step is simply to
|
||
write your implementation using native C types. Every detail of
|
||
argument parsing is handled for you.
|
||
|
||
Argument Clinic is implemented as a preprocessor. It draws inspiration
|
||
for its workflow directly from [Cog]_ by Ned Batchelder. To use Clinic,
|
||
add a block comment to your C source code beginning and ending with
|
||
special text strings, then run Clinic on the file. Clinic will find the
|
||
block comment, process the contents, and write the output back into your
|
||
C source file directly after the comment. The intent is that Clinic's
|
||
output becomes part of your source code; it's checked in to revision
|
||
control, and distributed with source packages. This means that Python
|
||
will still ship ready-to-build. It does complicate development slightly;
|
||
in order to add a new function, or modify the arguments or documentation
|
||
of an existing function using Clinic, you'll need a working Python 3
|
||
interpreter.
|
||
|
||
Future goals of Argument Clinic include:
|
||
|
||
* providing signature information for builtins,
|
||
* enabling alternative implementations of Python to create
|
||
automated library compatibility tests, and
|
||
* speeding up argument parsing with improvements to the
|
||
generated code.
|
||
|
||
|
||
DSL Syntax Summary
|
||
==================
|
||
|
||
The Argument Clinic DSL is specified as a comment embedded in a C
|
||
file, as follows. The "Example" column on the right shows you sample
|
||
input to the Argument Clinic DSL, and the "Section" column on the left
|
||
specifies what each line represents in turn.
|
||
|
||
Argument Clinic's DSL syntax mirrors the Python ``def``
|
||
statement, lending it some familiarity to Python core developers.
|
||
|
||
::
|
||
|
||
+-----------------------+-----------------------------------------------------------------+
|
||
| Section | Example |
|
||
+-----------------------+-----------------------------------------------------------------+
|
||
| Clinic DSL start | /*[clinic] |
|
||
| Module declaration | module module_name |
|
||
| Class declaration | class module_name.class_name |
|
||
| Function declaration | module_name.function_name -> return_annotation |
|
||
| Parameter declaration | name : converter(param=value) |
|
||
| Parameter docstring | Lorem ipsum dolor sit amet, consectetur |
|
||
| | adipisicing elit, sed do eiusmod tempor |
|
||
| Function docstring | Lorem ipsum dolor sit amet, consectetur adipisicing |
|
||
| | elit, sed do eiusmod tempor incididunt ut labore et |
|
||
| Clinic DSL end | [clinic]*/ |
|
||
| Clinic output | ... |
|
||
| Clinic output end | /*[clinic end output:<checksum>]*/ |
|
||
+-----------------------+-----------------------------------------------------------------+
|
||
|
||
To give some flavor of the proposed DSL syntax, here are some sample Clinic
|
||
code blocks. This first block reflects the normally preferred style, including
|
||
blank lines between parameters and per-argument docstrings.
|
||
It also includes a user-defined converter (``path_t``) created
|
||
locally::
|
||
|
||
/*[clinic]
|
||
os.stat as os_stat_fn -> stat result
|
||
|
||
path: path_t(allow_fd=1)
|
||
Path to be examined; can be string, bytes, or open-file-descriptor int.
|
||
|
||
*
|
||
|
||
dir_fd: OS_STAT_DIR_FD_CONVERTER = DEFAULT_DIR_FD
|
||
If not None, it should be a file descriptor open to a directory,
|
||
and path should be a relative string; path will then be relative to
|
||
that directory.
|
||
|
||
follow_symlinks: bool = True
|
||
If False, and the last element of the path is a symbolic link,
|
||
stat will examine the symbolic link itself instead of the file
|
||
the link points to.
|
||
|
||
Perform a stat system call on the given path.
|
||
|
||
{parameters}
|
||
|
||
dir_fd and follow_symlinks may not be implemented
|
||
on your platform. If they are unavailable, using them will raise a
|
||
NotImplementedError.
|
||
|
||
It's an error to use dir_fd or follow_symlinks when specifying path as
|
||
an open file descriptor.
|
||
|
||
[clinic]*/
|
||
|
||
This second example shows a minimal Clinic code block, omitting all
|
||
parameter docstrings and non-significant blank lines::
|
||
|
||
/*[clinic]
|
||
os.access
|
||
path: path
|
||
mode: int
|
||
*
|
||
dir_fd: OS_ACCESS_DIR_FD_CONVERTER = 1
|
||
effective_ids: bool = False
|
||
follow_symlinks: bool = True
|
||
Use the real uid/gid to test for access to a path.
|
||
Returns True if granted, False otherwise.
|
||
|
||
{parameters}
|
||
|
||
dir_fd, effective_ids, and follow_symlinks may not be implemented
|
||
on your platform. If they are unavailable, using them will raise a
|
||
NotImplementedError.
|
||
|
||
Note that most operations will use the effective uid/gid, therefore this
|
||
routine can be used in a suid/sgid environment to test if the invoking user
|
||
has the specified access to the path.
|
||
|
||
[clinic]*/
|
||
|
||
This final example shows a Clinic code block handling groups of
|
||
optional parameters, including parameters on the left::
|
||
|
||
/*[clinic]
|
||
curses.window.addch
|
||
|
||
[
|
||
y: int
|
||
Y-coordinate.
|
||
|
||
x: int
|
||
X-coordinate.
|
||
]
|
||
|
||
ch: char
|
||
Character to add.
|
||
|
||
[
|
||
attr: long
|
||
Attributes for the character.
|
||
]
|
||
|
||
/
|
||
|
||
Paint character ch at (y, x) with attributes attr,
|
||
overwriting any character previously painter at that location.
|
||
By default, the character position and attributes are the
|
||
current settings for the window object.
|
||
[clinic]*/
|
||
|
||
|
||
General Behavior Of the Argument Clinic DSL
|
||
-------------------------------------------
|
||
|
||
All lines support ``#`` as a line comment delimiter *except*
|
||
docstrings. Blank lines are always ignored.
|
||
|
||
Like Python itself, leading whitespace is significant in the Argument
|
||
Clinic DSL. The first line of the "function" section is the
|
||
function declaration. Indented lines below the function declaration
|
||
declare parameters, one per line; lines below those that are indented even
|
||
further are per-parameter docstrings. Finally, the first line dedented
|
||
back to column 0 end parameter declarations and start the function docstring.
|
||
|
||
Parameter docstrings are optional; function docstrings are not.
|
||
Functions that specify no arguments may simply specify the function
|
||
declaration followed by the docstring.
|
||
|
||
Module and Class Declarations
|
||
-----------------------------
|
||
|
||
When a C file implements a module or class, this should be declared to
|
||
Clinic. The syntax is simple::
|
||
|
||
module module_name
|
||
|
||
or ::
|
||
|
||
class module_name.class_name
|
||
|
||
(Note that these are not actually special syntax; they are implemented
|
||
as `Directives`_.)
|
||
|
||
The module name or class name should always be the full dotted path
|
||
from the top-level module. Nested modules and classes are supported.
|
||
|
||
|
||
Function Declaration
|
||
--------------------
|
||
|
||
The full form of the function declaration is as follows::
|
||
|
||
dotted.name [ as legal_c_id ] [ -> return_annotation ]
|
||
|
||
The dotted name should be the full name of the function, starting
|
||
with the highest-level package (e.g. "os.stat" or "curses.window.addch").
|
||
|
||
The "as legal_c_id" syntax is optional.
|
||
Argument Clinic uses the name of the function to create the names of
|
||
the generated C functions. In some circumstances, the generated name
|
||
may collide with other global names in the C program's namespace.
|
||
The "as legal_c_id" syntax allows you to override the generated name
|
||
with your own; substitute "legal_c_id" with any legal C identifier.
|
||
If skipped, the "as" keyword must also be omitted.
|
||
|
||
The return annotation is also optional. If skipped, the arrow ("``->``")
|
||
must also be omitted. If specified, the value for the return annotation
|
||
must be compatible with ``ast.literal_eval``, and it is interpreted as
|
||
a *return converter*.
|
||
|
||
|
||
Parameter Declaration
|
||
---------------------
|
||
|
||
The full form of the parameter declaration line as follows::
|
||
|
||
name: converter [ (parameter=value [, parameter2=value2]) ] [ = default]
|
||
|
||
The "name" must be a legal C identifier. Whitespace is permitted between
|
||
the name and the colon (though this is not the preferred style). Whitespace
|
||
is permitted (and encouraged) between the colon and the converter.
|
||
|
||
The "converter" is the name of one of the "converter functions" registered
|
||
with Argument Clinic. Clinic will ship with a number of built-in converters;
|
||
new converters can also be added dynamically. In choosing a converter, you
|
||
are automatically constraining what Python types are permitted on the input,
|
||
and specifying what type the output variable (or variables) will be. Although
|
||
many of the converters will resemble the names of C types or perhaps Python
|
||
types, the name of a converter may be any legal Python identifier.
|
||
|
||
If the converter is followed by parentheses, these parentheses enclose
|
||
parameter to the conversion function. The syntax mirrors providing arguments
|
||
a Python function call: the parameter must always be named, as if they were
|
||
"keyword-only parameters", and the values provided for the parameters will
|
||
syntactically resemble Python literal values. These parameters are always
|
||
optional, permitting all conversion functions to be called without
|
||
any parameters. In this case, you may also omit the parentheses entirely;
|
||
this is always equivalent to specifying empty parentheses. The values
|
||
supplied for these parameters must be compatible with ``ast.literal_eval``.
|
||
|
||
The "default" is a Python literal value. Default values are optional;
|
||
if not specified you must omit the equals sign too. Parameters which
|
||
don't have a default are implicitly required. The default value is
|
||
dynamically assigned, "live" in the generated C code, and although
|
||
it's specified as a Python value, it's translated into a native C
|
||
value in the generated C code. Few default values are permitted,
|
||
owing to this manual translation step.
|
||
|
||
If this were a Python function declaration, a parameter declaration
|
||
would be delimited by either a trailing comma or an ending parenthesis.
|
||
However, Argument Clinic uses neither; parameter declarations are
|
||
delimited by a newline. A trailing comma or right parenthesis is not
|
||
permitted.
|
||
|
||
The first parameter declaration establishes the indent for all parameter
|
||
declarations in a particular Clinic code block. All subsequent parameters
|
||
must be indented to the same level.
|
||
|
||
|
||
Legacy Converters
|
||
-----------------
|
||
|
||
For convenience's sake in converting existing code to Argument Clinic,
|
||
Clinic provides a set of legacy converters that match ``PyArg_ParseTuple``
|
||
format units. They are specified as a C string containing the format
|
||
unit. For example, to specify a parameter "foo" as taking a Python
|
||
"int" and emitting a C int, you could specify::
|
||
|
||
foo : "i"
|
||
|
||
(To more closely resemble a C string, these must always use double quotes.)
|
||
|
||
Although these resemble ``PyArg_ParseTuple`` format units, no guarantee is
|
||
made that the implementation will call a ``PyArg_Parse`` function for parsing.
|
||
|
||
This syntax does not support parameters. Therefore, it doesn't support any
|
||
of the format units that require input parameters (``"O!", "O&", "es", "es#",
|
||
"et", "et#"``). Parameters requiring one of these conversions cannot use the
|
||
legacy syntax. (You may still, however, supply a default value.)
|
||
|
||
|
||
Parameter Docstrings
|
||
--------------------
|
||
|
||
All lines that appear below and are indented further than a parameter declaration
|
||
are the docstring for that parameter. All such lines are "dedented" until the
|
||
first line is flush left.
|
||
|
||
Special Syntax For Parameter Lines
|
||
----------------------------------
|
||
|
||
There are four special symbols that may be used in the parameter section. Each
|
||
of these must appear on a line by itself, indented to the same level as parameter
|
||
declarations. The four symbols are:
|
||
|
||
``*``
|
||
Establishes that all subsequent parameters are keyword-only.
|
||
|
||
``[``
|
||
Establishes the start of an optional "group" of parameters.
|
||
Note that "groups" may nest inside other "groups".
|
||
See `Functions With Positional-Only Parameters`_ below.
|
||
Note that currently ``[`` is only legal for use in functions
|
||
where *all* parameters are marked positional-only, see
|
||
``/`` below.
|
||
|
||
``]``
|
||
Ends an optional "group" of parameters.
|
||
|
||
``/``
|
||
Establishes that all the *proceeding* arguments are
|
||
positional-only. For now, Argument Clinic does not
|
||
support functions with both positional-only and
|
||
non-positional-only arguments. Therefore: if ``/``
|
||
is specified for a function, it must currently always
|
||
be after the *last* parameter. Also, Argument Clinic
|
||
does not currently support default values for
|
||
positional-only parameters.
|
||
|
||
(The semantics of ``/`` follow a syntax for positional-only
|
||
parameters in Python once proposed by Guido. [5]_ )
|
||
|
||
|
||
Function Docstring
|
||
------------------
|
||
|
||
The first line with no leading whitespace after the function declaration is the
|
||
first line of the function docstring. All subsequent lines of the Clinic block
|
||
are considered part of the docstring, and their leading whitespace is preserved.
|
||
|
||
If the string ``{parameters}`` appears on a line by itself inside the function
|
||
docstring, Argument Clinic will insert a list of all parameters that have
|
||
docstrings, each such parameter followed by its docstring. The name of the
|
||
parameter is on a line by itself; the docstring starts on a subsequent line,
|
||
and all lines of the docstring are indented by two spaces. (Parameters with
|
||
no per-parameter docstring are suppressed.) The entire list is indented by the
|
||
leading whitespace that appeared before the ``{parameters}`` token.
|
||
|
||
If the string ``{parameters}`` doesn't appear in the docstring, Argument Clinic
|
||
will append one to the end of the docstring, inserting a blank line above it if
|
||
the docstring does not end with a blank line, and with the parameter list at
|
||
column 0.
|
||
|
||
Converters
|
||
----------
|
||
|
||
Argument Clinic contains a pre-initialized registry of converter functions.
|
||
Example converter functions:
|
||
|
||
``int``
|
||
Accepts a Python object implementing ``__int__``; emits a C ``int``.
|
||
|
||
``byte``
|
||
Accepts a Python int; emits an ``unsigned char``. The integer
|
||
must be in the range [0, 256).
|
||
|
||
``str``
|
||
Accepts a Python str object; emits a C ``char *``. Automatically
|
||
encodes the string using the ``ascii`` codec.
|
||
|
||
``PyObject``
|
||
Accepts any object; emits a C ``PyObject *`` without any conversion.
|
||
|
||
All converters accept the following parameters:
|
||
|
||
``doc_default``
|
||
The Python value to use in place of the parameter's actual default
|
||
in Python contexts. In other words: when specified, this value will
|
||
be used for the parameter's default in the docstring, and in the
|
||
``Signature``. (TBD alternative semantics: If the string is a valid
|
||
Python expression which can be rendered into a Python value using
|
||
``eval()``, then the result of ``eval()`` on it will be used as the
|
||
default in the ``Signature``.) Ignored if there is no default.
|
||
|
||
``required``
|
||
Normally any parameter that has a default value is automatically
|
||
optional. A parameter that has "required" set will be considered
|
||
required (non-optional) even if it has a default value. The
|
||
generated documentation will also not show any default value.
|
||
|
||
|
||
Additionally, converters may accept one or more of these optional
|
||
parameters, on an individual basis:
|
||
|
||
``annotation``
|
||
Explicitly specifies the per-parameter annotation for this
|
||
parameter. Normally it's the responsibility of the conversion
|
||
function to generate the annotation (if any).
|
||
|
||
``bitwise``
|
||
For converters that accept unsigned integers. If the Python integer
|
||
passed in is signed, copy the bits directly even if it is negative.
|
||
|
||
``encoding``
|
||
For converters that accept str. Encoding to use when encoding a
|
||
Unicode string to a ``char *``.
|
||
|
||
``immutable``
|
||
Only accept immutable values.
|
||
|
||
``length``
|
||
For converters that accept iterable types. Requests that the converter
|
||
also emit the length of the iterable, passed in to the ``_impl`` function
|
||
in a ``Py_ssize_t`` variable; its name will be this
|
||
parameter's name appended with "``_length``".
|
||
|
||
``nullable``
|
||
This converter normally does not accept ``None``, but in this case
|
||
it should. If ``None`` is supplied on the Python side, the equivalent
|
||
C argument will be ``NULL``. (The ``_impl`` argument emitted by this
|
||
converter will presumably be a pointer type.)
|
||
|
||
``types``
|
||
A list of strings representing acceptable Python types for this object.
|
||
There are also four strings which represent Python protocols:
|
||
|
||
* "buffer"
|
||
* "mapping"
|
||
* "number"
|
||
* "sequence"
|
||
|
||
``zeroes``
|
||
For converters that accept string types. The converted value should
|
||
be allowed to have embedded zeroes.
|
||
|
||
|
||
Return Converters
|
||
-----------------
|
||
|
||
A *return converter* conceptually performs the inverse operation of
|
||
a converter: it converts a native C value into its equivalent Python
|
||
value.
|
||
|
||
|
||
Directives
|
||
----------
|
||
|
||
Argument Clinic also permits "directives" in Clinic code blocks.
|
||
Directives are similar to *pragmas* in C; they are statements
|
||
that modify Argument Clinic's behavior.
|
||
|
||
The format of a directive is as follows::
|
||
|
||
directive_name [argument [second_argument [ ... ]]]
|
||
|
||
Directives only take positional arguments.
|
||
|
||
A Clinic code block must contain either one or more directives,
|
||
or a function declaration. It may contain both, in which
|
||
case all directives must come before the function declaration.
|
||
|
||
Internally directives map directly to Python callables.
|
||
The directive's arguments are passed directly to the callable
|
||
as positional arguments of type ``str()``.
|
||
|
||
Example possible directives include the production,
|
||
suppression, or redirection of Clinic output. Also, the
|
||
"module" and "class" keywords are implemented
|
||
as directives in the prototype.
|
||
|
||
|
||
Python Code
|
||
===========
|
||
|
||
Argument Clinic also permits embedding Python code inside C files,
|
||
which is executed in-place when Argument Clinic processes the file.
|
||
Embedded code looks like this::
|
||
|
||
/*[python]
|
||
|
||
# this is python code!
|
||
print("/" + "* Hello world! *" + "/")
|
||
|
||
[python]*/
|
||
/* Hello world! */
|
||
/*[python end:da39a3ee5e6b4b0d3255bfef95601890afd80709]*/
|
||
|
||
The ``"/* Hello world! */"`` line above was generated by running the Python
|
||
code in the preceding comment.
|
||
|
||
Any Python code is valid. Python code sections in Argument Clinic can
|
||
also be used to directly interact with Clinic; see
|
||
`Argument Clinic Programmatic Interfaces`_.
|
||
|
||
|
||
Output
|
||
======
|
||
|
||
Argument Clinic writes its output inline in the C file, immediately
|
||
after the section of Clinic code. For "python" sections, the output
|
||
is everything printed using ``builtins.print``. For "clinic"
|
||
sections, the output is valid C code, including:
|
||
|
||
* a ``#define`` providing the correct ``methoddef`` structure for the
|
||
function
|
||
* a prototype for the "impl" function -- this is what you'll write
|
||
to implement this function
|
||
* a function that handles all argument processing, which calls your
|
||
"impl" function
|
||
* the definition line of the "impl" function
|
||
* and a comment indicating the end of output.
|
||
|
||
The intention is that you write the body of your impl function immediately
|
||
after the output -- as in, you write a left-curly-brace immediately after
|
||
the end-of-output comment and implement builtin in the body there.
|
||
(It's a bit strange at first, but oddly convenient.)
|
||
|
||
Argument Clinic will define the parameters of the impl function for
|
||
you. The function will take the "self" parameter passed in
|
||
originally, all the parameters you define, and possibly some extra
|
||
generated parameters ("length" parameters; also "group" parameters,
|
||
see next section).
|
||
|
||
Argument Clinic also writes a checksum for the output section. This
|
||
is a valuable safety feature: if you modify the output by hand, Clinic
|
||
will notice that the checksum doesn't match, and will refuse to
|
||
overwrite the file. (You can force Clinic to overwrite with the
|
||
"``-f``" command-line argument; Clinic will also ignore the checksums
|
||
when using the "``-o``" command-line argument.)
|
||
|
||
Finally, Argument Clinic can also emit the boilerplate definition
|
||
of the PyMethodDef array for the defined classes and modules.
|
||
|
||
|
||
Functions With Positional-Only Parameters
|
||
=========================================
|
||
|
||
A significant fraction of Python builtins implemented in C use the
|
||
older positional-only API for processing arguments
|
||
(``PyArg_ParseTuple()``). In some instances, these builtins parse
|
||
their arguments differently based on how many arguments were passed
|
||
in. This can provide some bewildering flexibility: there may be
|
||
groups of optional parameters, which must either all be specified or
|
||
none specified. And occasionally these groups are on the *left!* (A
|
||
representative example: ``curses.window.addch()``.)
|
||
|
||
Argument Clinic supports these legacy use-cases by allowing you to
|
||
specify parameters in groups. Each optional group of parameters
|
||
is marked with square brackets. Note that these groups are permitted
|
||
on the right *or left* of any required parameters!
|
||
|
||
The impl function generated by Clinic will add an extra parameter for
|
||
every group, "``int group_{left|right}_<x>``", where x is a monotonically
|
||
increasing number assigned to each group as it builds away from the
|
||
required arguments. This argument will be nonzero if the group was
|
||
specified on this call, and zero if it was not.
|
||
|
||
Note that when operating in this mode, you cannot specify default
|
||
arguments.
|
||
|
||
Also, note that it's possible to specify a set of groups to a function
|
||
such that there are several valid mappings from the number of
|
||
arguments to a valid set of groups. If this happens, Clinic will abort
|
||
with an error message. This should not be a problem, as
|
||
positional-only operation is only intended for legacy use cases, and
|
||
all the legacy functions using this quirky behavior have unambiguous
|
||
mappings.
|
||
|
||
|
||
Current Status
|
||
==============
|
||
|
||
As of this writing, there is a working prototype implementation of
|
||
Argument Clinic available online (though the syntax may be out of date
|
||
as you read this). [6]_ The prototype generates code using the
|
||
existing ``PyArg_Parse`` APIs. It supports translating to all current
|
||
format units except the mysterious ``"w*"``. Sample functions using
|
||
Argument Clinic exercise all major features, including positional-only
|
||
argument parsing.
|
||
|
||
|
||
Argument Clinic Programmatic Interfaces
|
||
---------------------------------------
|
||
|
||
The prototype also currently provides an experimental extension
|
||
mechanism, allowing adding support for new types on-the-fly. See
|
||
``Modules/posixmodule.c`` in the prototype for an example of its use.
|
||
|
||
In the future, Argument Clinic is expected to be automatable enough
|
||
to allow querying, modification, or outright new construction of
|
||
function declarations through Python code. It may even permit
|
||
dynamically adding your own custom DSL!
|
||
|
||
|
||
Notes / TBD
|
||
===========
|
||
|
||
* The API for supplying inspect.Signature metadata for builtins is
|
||
currently under discussion. Argument Clinic will add support for
|
||
the prototype when it becomes viable.
|
||
|
||
* Nick Coghlan suggests that we a) only support at most one left-optional
|
||
group per function, and b) in the face of ambiguity, prefer the left
|
||
group over the right group. This would solve all our existing use cases
|
||
including range().
|
||
|
||
* Optimally we'd want Argument Clinic run automatically as part of the
|
||
normal Python build process. But this presents a bootstrapping problem;
|
||
if you don't have a system Python 3, you need a Python 3 executable to
|
||
build Python 3. I'm sure this is a solvable problem, but I don't know
|
||
what the best solution might be. (Supporting this will also require
|
||
a parallel solution for Windows.)
|
||
|
||
* On a related note: inspect.Signature has no way of representing
|
||
blocks of arguments, like the left-optional block of ``y`` and ``x``
|
||
for ``curses.window.addch``. How far are we going to go in supporting
|
||
this admittedly aberrant parameter paradigm?
|
||
|
||
* During the PyCon US 2013 Language Summit, there was discussion of having
|
||
Argument Clinic also generate the actual documentation (in ReST, processed
|
||
by Sphinx) for the function. The logistics of this are TBD, but it would
|
||
require that the docstrings be written in ReST, and require that Python
|
||
ship a ReST -> ascii converter. It would be best to come to a decision
|
||
about this before we begin any large-scale conversion of the CPython
|
||
source tree to using Clinic.
|
||
|
||
* Guido proposed having the "function docstring" be hand-written inline,
|
||
in the middle of the output, something like this:
|
||
|
||
::
|
||
|
||
/*[clinic]
|
||
... prototype and parameters (including parameter docstrings) go here
|
||
[clinic]*/
|
||
... some output ...
|
||
/*[clinic docstring start]*/
|
||
... hand-edited function docstring goes here <-- you edit this by hand!
|
||
/*[clinic docstring end]*/
|
||
... more output
|
||
/*[clinic output end]*/
|
||
|
||
I tried it this way and don't like it -- I think it's clumsy. I
|
||
prefer that everything you write goes in one place, rather than
|
||
having an island of hand-edited stuff in the middle of the DSL
|
||
output.
|
||
|
||
* Argument Clinic does not support automatic tuple unpacking
|
||
(the "``(OOO)``" style format string for ``PyArg_ParseTuple()``.)
|
||
|
||
* Argument Clinic removes some dynamism / flexibility. With
|
||
``PyArg_ParseTuple()`` one could theoretically pass in different
|
||
encodings at runtime for the "``es``"/"``et``" format units.
|
||
AFAICT CPython doesn't do this itself, however it's possible
|
||
external users might do this. (Trivia: there are no uses of
|
||
"``es``" exercised by regrtest, and all the uses of "``et``"
|
||
exercised are in socketmodule.c, except for one in _ssl.c.
|
||
They're all static, specifying the encoding ``"idna"``.)
|
||
|
||
Acknowledgements
|
||
================
|
||
|
||
The PEP author wishes to thank Ned Batchelder for permission to
|
||
shamelessly rip off his clever design for Cog--"my favorite tool
|
||
that I've never gotten to use". Thanks also to everyone who provided
|
||
feedback on the [bugtracker issue] and on python-dev. Special thanks
|
||
to Nick Coglan and Guido van Rossum for a rousing two-hour in-person
|
||
deep dive on the topic at PyCon US 2013.
|
||
|
||
|
||
References
|
||
==========
|
||
|
||
.. [Cog] ``Cog``:
|
||
http://nedbatchelder.com/code/cog/
|
||
|
||
.. [bugtracker issue] Issue 16612 on the python.org bug tracker:
|
||
http://bugs.python.org/issue16612
|
||
|
||
.. [1] ``PyArg_ParseTuple()``:
|
||
http://docs.python.org/3/c-api/arg.html#PyArg_ParseTuple
|
||
|
||
.. [2] ``PyArg_ParseTupleAndKeywords()``:
|
||
http://docs.python.org/3/c-api/arg.html#PyArg_ParseTupleAndKeywords
|
||
|
||
.. [3] ``PyArg_`` format units:
|
||
http://docs.python.org/3/c-api/arg.html#strings-and-buffers
|
||
|
||
.. [4] Keyword parameters for extension functions:
|
||
http://docs.python.org/3/extending/extending.html#keyword-parameters-for-extension-functions
|
||
|
||
.. [5] Guido van Rossum, posting to python-ideas, March 2012:
|
||
https://mail.python.org/pipermail/python-ideas/2012-March/014364.html
|
||
and
|
||
https://mail.python.org/pipermail/python-ideas/2012-March/014378.html
|
||
and
|
||
https://mail.python.org/pipermail/python-ideas/2012-March/014417.html
|
||
|
||
.. [6] Argument Clinic prototype:
|
||
https://bitbucket.org/larry/python-clinic/
|
||
|
||
|
||
Copyright
|
||
=========
|
||
|
||
This document has been placed in the public domain.
|
||
|
||
|
||
|
||
..
|
||
Local Variables:
|
||
mode: indented-text
|
||
indent-tabs-mode: nil
|
||
sentence-end-double-space: t
|
||
fill-column: 70
|
||
coding: utf-8
|
||
End:
|