207 lines
8.3 KiB
ReStructuredText
207 lines
8.3 KiB
ReStructuredText
PEP: 671
|
|
Title: Syntax for late-bound function argument defaults
|
|
Author: Chris Angelico <rosuav@gmail.com>
|
|
Discussions-To: https://mail.python.org/archives/list/python-ideas@python.org/thread/UVOQEK7IRFSCBOH734T5GFJOEJXFCR6A/
|
|
Status: Draft
|
|
Type: Standards Track
|
|
Content-Type: text/x-rst
|
|
Created: 24-Oct-2021
|
|
Python-Version: 3.11
|
|
Post-History: `24-Oct-2021 <https://mail.python.org/archives/list/python-ideas@python.org/thread/KR2TMLPFR7NHDZCDOS6VTNWDKZQQJN3V/>`__,
|
|
`01-Dec-2021 <https://mail.python.org/archives/list/python-ideas@python.org/thread/UVOQEK7IRFSCBOH734T5GFJOEJXFCR6A/>`__
|
|
|
|
Abstract
|
|
========
|
|
|
|
Function parameters can have default values which are calculated during
|
|
function definition and saved. This proposal introduces a new form of
|
|
argument default, defined by an expression to be evaluated at function
|
|
call time.
|
|
|
|
|
|
Motivation
|
|
==========
|
|
|
|
Optional function arguments, if omitted, often have some sort of logical
|
|
default value. When this value depends on other arguments, or needs to be
|
|
reevaluated each function call, there is currently no clean way to state
|
|
this in the function header.
|
|
|
|
Currently-legal idioms for this include::
|
|
|
|
# Very common: Use None and replace it in the function
|
|
def bisect_right(a, x, lo=0, hi=None, *, key=None):
|
|
if hi is None:
|
|
hi = len(a)
|
|
|
|
# Also well known: Use a unique custom sentinel object
|
|
_USE_GLOBAL_DEFAULT = object()
|
|
def connect(timeout=_USE_GLOBAL_DEFAULT):
|
|
if timeout is _USE_GLOBAL_DEFAULT:
|
|
timeout = default_timeout
|
|
|
|
# Unusual: Accept star-args and then validate
|
|
def add_item(item, *optional_target):
|
|
if not optional_target:
|
|
target = []
|
|
else:
|
|
target = optional_target[0]
|
|
|
|
In each form, ``help(function)`` fails to show the true default value. Each
|
|
one has additional problems, too; using ``None`` is only valid if None is not
|
|
itself a plausible function parameter, the custom sentinel requires a global
|
|
constant; and use of star-args implies that more than one argument could be
|
|
given.
|
|
|
|
Specification
|
|
=============
|
|
|
|
Function default arguments can be defined using the new ``=>`` notation::
|
|
|
|
def bisect_right(a, x, lo=0, hi=>len(a), *, key=None):
|
|
def connect(timeout=>default_timeout):
|
|
def add_item(item, target=>[]):
|
|
def format_time(fmt, time_t=>time.time()):
|
|
|
|
The expression is saved in its source code form for the purpose of inspection,
|
|
and bytecode to evaluate it is prepended to the function's body.
|
|
|
|
Notably, the expression is evaluated in the function's run-time scope, NOT the
|
|
scope in which the function was defined (as are early-bound defaults). This
|
|
allows the expression to refer to other arguments.
|
|
|
|
Multiple late-bound arguments are evaluated from left to right, and can refer
|
|
to previously-defined values. Order is defined by the function, regardless of
|
|
the order in which keyword arguments may be passed.
|
|
|
|
def prevref(word="foo", a=>len(word), b=>a//2): # Valid
|
|
def selfref(spam=>spam): # UnboundLocalError
|
|
def spaminate(sausage=>eggs + 1, eggs=>sausage - 1): # Confusing, don't do this
|
|
def frob(n=>len(items), items=[]): # See below
|
|
|
|
Evaluation order is left-to-right; however, implementations MAY choose to do so
|
|
in two separate passes, first for all passed arguments and early-bound defaults,
|
|
and then a second pass for late-bound defaults. Otherwise, all arguments will be
|
|
assigned strictly left-to-right.
|
|
|
|
Rejected choices of spelling
|
|
----------------------------
|
|
|
|
While this document specifies a single syntax ``name=>expression``, alternate
|
|
spellings are similarly plausible. The following spellings were considered::
|
|
|
|
def bisect(a, hi=>len(a)):
|
|
def bisect(a, hi:=len(a)):
|
|
def bisect(a, hi?=len(a)):
|
|
def bisect(a, @hi=len(a)):
|
|
|
|
Since default arguments behave largely the same whether they're early or late
|
|
bound, the chosen syntax ``hi=>len(a)`` is deliberately similar to the existing
|
|
early-bind syntax.
|
|
|
|
One reason for rejection of the ``:=`` syntax is its behaviour with annotations.
|
|
Annotations go before the default, so in all syntax options, it must be
|
|
unambiguous (both to the human and the parser) whether this is an annotation,
|
|
a default, or both. The alternate syntax ``target:=expr`` runs the risk of
|
|
being misinterpreted as ``target:int=expr`` with the annotation omitted in
|
|
error, and may thus mask bugs. The chosen syntax ``target=>expr`` does not
|
|
have this problem.
|
|
|
|
|
|
How to Teach This
|
|
=================
|
|
|
|
Early-bound default arguments should always be taught first, as they are the
|
|
simpler and more efficient way to evaluate arguments. Building on them, late
|
|
bound arguments are broadly equivalent to code at the top of the function::
|
|
|
|
def add_item(item, target=>[]):
|
|
|
|
# Equivalent pseudocode:
|
|
def add_item(item, target=<OPTIONAL>):
|
|
if target was omitted: target = []
|
|
|
|
A simple rule of thumb is: "target=expression" is evaluated when the function
|
|
is defined, and "target=>expression" is evaluated when the function is called.
|
|
Either way, if the argument is provided at call time, the default is ignored.
|
|
While this does not completely explain all the subtleties, it is sufficient to
|
|
cover the important distinction here (and the fact that they are similar).
|
|
|
|
|
|
Interaction with other open PEPs
|
|
================================
|
|
|
|
:pep:`661` attempts to solve one of the same problems as this does. It seeks to
|
|
improve the documentation of sentinel values in default arguments, where this
|
|
proposal seeks to remove the need for sentinels in many common cases. :pep:`661`
|
|
is able to improve documentation in arbitrarily complicated functions (it
|
|
cites ``traceback.print_exception`` as its primary motivation, which has two
|
|
arguments which must both-or-neither be specified); on the other hand, many
|
|
of the common cases would no longer need sentinels if the true default could
|
|
be defined by the function. Additionally, dedicated sentinel objects can be
|
|
used as dictionary lookup keys, where :pep:`671` does not apply.
|
|
|
|
|
|
Implementation details
|
|
======================
|
|
|
|
The following relates to the reference implementation, and is not necessarily
|
|
part of the specification.
|
|
|
|
Argument defaults (positional or keyword) have both their values, as already
|
|
retained, and an extra piece of information. For positional arguments, the
|
|
extras are stored in a tuple in ``__defaults_extra__``, and for keyword-only,
|
|
a dict in ``__kwdefaults_extra__``. If this attribute is ``None``, it is
|
|
equivalent to having ``None`` for every argument default.
|
|
|
|
For each parameter with a late-bound default, the special value ``Ellipsis``
|
|
is stored as the value placeholder, and the corresponding extra information
|
|
needs to be queried. If it is ``None``, then the default is indeed the value
|
|
``Ellipsis``; otherwise, it is a descriptive string and the true value is
|
|
calculated as the function begins.
|
|
|
|
When a parameter with a late-bound default is omitted, the function will begin
|
|
with the parameter unbound. The function begins by testing for each parameter
|
|
with a late-bound default using a new opcode QUERY_FAST/QUERY_DEREF, and if
|
|
unbound, evaluates the original expression. This opcode (available only for
|
|
fast locals and closure variables) pushes True onto the stack if the given
|
|
local has a value, and False if not - meaning that it pushes False if LOAD_FAST
|
|
or LOAD_DEREF would raise UnboundLocalError, and True if it would succeed.
|
|
|
|
Out-of-order variable references are permitted as long as the referent has a
|
|
value from an argument or early-bound default.
|
|
|
|
|
|
Costs
|
|
-----
|
|
|
|
When no late-bound argument defaults are used, the following costs should be
|
|
all that are incurred:
|
|
|
|
* Function objects require two additional pointers, which will be NULL
|
|
* Compiling code and constructing functions have additional flag checks
|
|
* Using ``Ellipsis`` as a default value will require run-time verification
|
|
to see if late-bound defaults exist.
|
|
|
|
These costs are expected to be minimal (on 64-bit Linux, this increases all
|
|
function objects from 152 bytes to 168), with virtually no run-time cost when
|
|
late-bound defaults are not used.
|
|
|
|
Backward incompatibility
|
|
------------------------
|
|
|
|
Where late-bound defaults are not used, behaviour should be identical. Care
|
|
should be taken if Ellipsis is found, as it may not represent itself, but
|
|
beyond that, tools should see existing code unchanged.
|
|
|
|
References
|
|
==========
|
|
|
|
https://github.com/rosuav/cpython/tree/pep-671
|
|
|
|
Copyright
|
|
=========
|
|
|
|
This document is placed in the public domain or under the
|
|
CC0-1.0-Universal license, whichever is more permissive.
|