PEP: 671
Title: Syntax for late-bound function argument defaults
Author: Chris Angelico <rosuav@gmail.com>
Status: Draft
Type: Standards Track
Content-Type: text/x-rst
Created: 24-Oct-2021
Python-Version: 3.11
Post-History: 24-Oct-2021


Abstract
========

Function parameters can have default values which are calculated during
function definition and saved. This proposal introduces a new form of
argument default, defined by an expression to be evaluated at function
call time.


Motivation
==========

Optional function arguments, if omitted, often have some sort of logical
default value. When this value depends on other arguments, or needs to be
reevaluated each function call, there is currently no clean way to state
this in the function header.

Currently-legal idioms for this include::

    # Very common: Use None and replace it in the function
    def bisect_right(a, x, lo=0, hi=None, *, key=None):
        if hi is None:
            hi = len(a)

    # Also well known: Use a unique custom sentinel object
    _USE_GLOBAL_DEFAULT = object()
    def connect(timeout=_USE_GLOBAL_DEFAULT):
        if timeout is _USE_GLOBAL_DEFAULT:
            timeout = default_timeout

    # Unusual: Accept star-args and then validate
    def add_item(item, *optional_target):
        if not optional_target:
            target = []
        else:
            target = optional_target[0]

In each form, ``help(function)`` fails to show the true default value. Each
one has additional problems, too: using ``None`` is only valid if ``None`` is
not itself a plausible argument value; the custom sentinel requires a global
constant; and use of star-args implies that more than one argument could be
given.

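As a minimal sketch of the introspection problem in current Python, the
following uses ``inspect.signature`` to show what is reported for the ``None``
idiom. The simplified ``bisect_right`` body is illustrative only and is not
the standard library implementation::

    import inspect

    def bisect_right(a, x, lo=0, hi=None):
        # Sentinel idiom: the real default, len(a), is hidden in the body
        if hi is None:
            hi = len(a)
        while lo < hi:
            mid = (lo + hi) // 2
            if x < a[mid]:
                hi = mid
            else:
                lo = mid + 1
        return lo

    sig = inspect.signature(bisect_right)
    print(sig)  # (a, x, lo=0, hi=None)

The signature reports ``hi=None``, so a reader has to consult the function
body or its documentation to learn that the effective default is ``len(a)``.
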
Specification
=============

Function default arguments can be defined using the new ``=>`` notation::

    def bisect_right(a, x, lo=0, hi=>len(a), *, key=None):
    def connect(timeout=>default_timeout):
    def add_item(item, target=>[]):
    def format_time(fmt, time_t=>time.time()):

The expression is saved in its source code form for the purpose of inspection,
and bytecode to evaluate it is prepended to the function's body.

Notably, the expression is evaluated in the function's run-time scope, NOT in
the scope in which the function was defined (which is where early-bound
defaults are evaluated). This allows the expression to refer to other
arguments.

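For contrast, here is a small sketch of what happens in current Python when
an early-bound default tries to refer to another parameter. The name
``bisect_early`` is hypothetical and used only for illustration::

    error = None
    try:
        # Early-bound: len(a) is evaluated once, at definition time, in the
        # defining scope, where the parameter "a" does not exist yet.
        def bisect_early(a, hi=len(a)):
            return hi
    except NameError as exc:
        error = exc

    print(type(error).__name__)  # NameError

The ``def`` statement itself fails, assuming no variable named ``a`` exists in
the enclosing scope. Under this proposal, ``hi=>len(a)`` would instead be
evaluated inside the call, where ``a`` is already bound.
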
Multiple late-bound arguments are evaluated from left to right, and can refer
to previously-defined values. Order is defined by the function, regardless of
the order in which keyword arguments may be passed. Using names of later
arguments should not be relied upon, and while this MAY work in some Python
implementations, it should be considered dubious::

    def prevref(word="foo", a=>len(word), b=>a//2): # Valid
    def selfref(spam=>spam): # Highly likely to give an error
    def spaminate(sausage=>eggs + 1, eggs=>sausage - 1): # Confusing, may fail
    def frob(n=>len(items), items=[]): # May fail, may succeed

Moreover, even if syntactically and semantically legal, this kind of construct
is highly confusing to other programmers, and should be avoided.


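The left-to-right rule for the valid ``prevref`` case can be approximated in
current Python with a sentinel. This is only a sketch of the intended
semantics, not the reference implementation; ``_MISSING`` is a hypothetical
helper name::

    _MISSING = object()  # hypothetical stand-in for "argument omitted"

    def prevref(word="foo", a=_MISSING, b=_MISSING):
        # Emulates: def prevref(word="foo", a=>len(word), b=>a//2)
        # Defaults are filled in parameter order, left to right.
        if a is _MISSING:
            a = len(word)
        if b is _MISSING:
            b = a // 2
        return word, a, b

    print(prevref())                     # ('foo', 3, 1)
    print(prevref(a=10))                 # ('foo', 10, 5)
    print(prevref(b=10, word="spam"))    # ('spam', 4, 10)

The last call passes keyword arguments in a different order from the
signature, yet ``a`` is still computed from ``word`` because the order is
fixed by the function definition.
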
Choice of spelling
------------------

While this document specifies a single syntax ``name=>expression``, alternate
spellings are similarly plausible. Open for consideration are the following::

    def bisect(a, hi=>len(a)):
    def bisect(a, hi:=len(a)):
    def bisect(a, hi?=len(a)):

An alternative reference implementation is under consideration, which would
use this syntax::

    def bisect(a, @hi=len(a)):

Since default arguments behave largely the same whether they're early or late
bound, the syntax is deliberately similar to the existing early-bind syntax.

How to Teach This
=================

Early-bound default arguments should always be taught first, as they are the
simpler and more efficient way to evaluate arguments. Building on them,
late-bound arguments are broadly equivalent to code at the top of the
function::

    def add_item(item, target=>[]):

    # Equivalent pseudocode:
    def add_item(item, target=<OPTIONAL>):
        if target was omitted: target = []

A simple rule of thumb is: "target=expression" is evaluated when the function
is defined, and "target=>expression" is evaluated when the function is called.
Either way, if the argument is provided at call time, the default is ignored.
While this does not completely explain all the subtleties, it is sufficient to
cover the important distinction here (and the fact that they are similar).


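The rule of thumb can be demonstrated in current Python by comparing an
early-bound mutable default with a sentinel-based emulation of the late-bound
form. The names ``add_item_early``, ``add_item_late`` and ``_MISSING`` are
hypothetical and chosen only for this sketch::

    def add_item_early(item, target=[]):
        # The list is created once, when the function is defined
        target.append(item)
        return target

    _MISSING = object()  # hypothetical stand-in for "argument omitted"

    def add_item_late(item, target=_MISSING):
        # Emulates: def add_item(item, target=>[])
        if target is _MISSING:
            target = []  # a new list on every call that omits target
        target.append(item)
        return target

    first = add_item_early(1)
    second = add_item_early(2)
    print(first is second, second)   # True [1, 2]

    fresh_a = add_item_late(1)
    fresh_b = add_item_late(2)
    print(fresh_a, fresh_b)          # [1] [2]

The early-bound version keeps accumulating into the same list across calls,
while the emulated late-bound version starts from a fresh list each time.
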
Interaction with other open PEPs
================================

:pep:`661` attempts to solve one of the same problems as this does. It seeks to
improve the documentation of sentinel values in default arguments, where this
proposal seeks to remove the need for sentinels in many common cases. :pep:`661`
is able to improve documentation in arbitrarily complicated functions (it
cites ``traceback.print_exception`` as its primary motivation, which has two
arguments which must both-or-neither be specified); on the other hand, many
of the common cases would no longer need sentinels if the true default could
be defined by the function. Additionally, dedicated sentinel objects can be
used as dictionary lookup keys, where :pep:`671` does not apply.


Interaction with annotations
============================

Annotations go before the default, so in all syntax options, it must be
unambiguous (both to the human and the parser) whether this is an annotation,
a default, or both. The alternate syntax ``target:=expr`` runs the risk of
being misinterpreted as ``target:int=expr`` with the annotation omitted in
error, and may thus mask bugs. The preferred syntax ``target=>expr`` does not
have this problem.


Implementation details
======================

The following relates to the reference implementation, and is not necessarily
part of the specification.

Argument defaults (positional or keyword) have both their values, as already
retained, and an extra piece of information. For positional arguments, the
extras are stored in a tuple in ``__defaults_extra__``, and for keyword-only,
a dict in ``__kwdefaults_extra__``. If this attribute is ``None``, it is
equivalent to having ``None`` for every argument default.

For each parameter with a late-bound default, the special value ``Ellipsis``
is stored as the value placeholder, and the corresponding extra information
needs to be queried. If it is ``None``, then the default is indeed the value
``Ellipsis``; otherwise, it is a descriptive string and the true value is
calculated as the function begins.

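For reference, the existing storage that these extras sit alongside can be
inspected in current Python. The sketch below uses only the attributes that
exist today (``__defaults__`` and ``__kwdefaults__``); the ``*_extra__``
attributes belong to the reference implementation and are not accessed here.
The function ``uses_ellipsis`` is a hypothetical example::

    def bisect_right(a, x, lo=0, hi=None, *, key=None):
        pass

    def uses_ellipsis(marker=...):
        # Ellipsis is already a legal early-bound default today
        return marker

    print(bisect_right.__defaults__)     # (0, None)
    print(bisect_right.__kwdefaults__)   # {'key': None}
    print(uses_ellipsis.__defaults__)    # (Ellipsis,)

Because ``Ellipsis`` can legitimately appear as an early-bound default, as in
the last line, the placeholder alone cannot identify a late-bound default,
which is why the extra information has to be consulted.
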
When a parameter with a late-bound default is omitted, the function will begin
with the parameter unbound. The function begins by testing for each parameter
with a late-bound default using a new opcode QUERY_FAST/QUERY_DEREF, and if
unbound, evaluates the original expression. This opcode (available only for
fast locals and closure variables) pushes True onto the stack if the given
local has a value, and False if not - meaning that it pushes False if LOAD_FAST
or LOAD_DEREF would raise UnboundLocalError, and True if it would succeed.

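The QUERY_FAST opcode exists only in the reference implementation, but the
condition it tests can be approximated in current Python by catching
``UnboundLocalError``. The following is a rough analogue under that
assumption, not a description of the actual bytecode; ``late_default_demo``
is a hypothetical name::

    def late_default_demo(supplied):
        if supplied:
            target = ["given"]
        # Rough analogue of QUERY_FAST: is the fast local "target" bound?
        try:
            target
            bound = True
        except UnboundLocalError:
            bound = False
        if not bound:
            target = []  # evaluate the late-bound default expression
        return bound, target

    print(late_default_demo(True))    # (True, ['given'])
    print(late_default_demo(False))   # (False, [])
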
Out-of-order variable references are permitted as long as the referent has a
value from an argument or early-bound default.


Costs
-----

When no late-bound argument defaults are used, the following costs should be
all that are incurred:

* Function objects require two additional pointers, which will be NULL
* Compiling code and constructing functions have additional flag checks
* Using ``Ellipsis`` as a default value will require run-time verification
  to see if late-bound defaults exist.

These costs are expected to be minimal (on 64-bit Linux, this increases all
function objects from 152 bytes to 168), with virtually no run-time cost when
late-bound defaults are not used.

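The 16-byte increase quoted above is consistent with two native pointers on a
64-bit platform. A small sketch of that arithmetic, using only the standard
library (the measured function size depends on the Python version and build,
so no specific value is assumed for it)::

    import struct
    import sys

    pointer_size = struct.calcsize("P")   # bytes per native pointer
    extra_bytes = 2 * pointer_size        # two additional pointers per function

    def sample():
        pass

    print(extra_bytes)             # 16 on a typical 64-bit build
    print(sys.getsizeof(sample))   # current size; varies by version and platform
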
Backward incompatibility
------------------------

Where late-bound defaults are not used, behaviour should be identical. Care
should be taken if Ellipsis is found, as it may not represent itself, but
beyond that, tools should see existing code unchanged.

References
==========

https://github.com/rosuav/cpython/tree/pep-671

Copyright
=========

This document is placed in the public domain or under the
CC0-1.0-Universal license, whichever is more permissive.



..
    Local Variables:
    mode: indented-text
    indent-tabs-mode: nil
    sentence-end-double-space: t
    fill-column: 70
    coding: utf-8
    End: