added two PEPs by Talin: 3101, Advanced String Formatting; and 3102, Keyword-Only Arguments

2006-04-26 20:33:25 +00:00 · 2006-04-26 20:33:25 +00:00 · 0ac7bc2d0c
parent a30697598f
commit 0ac7bc2d0c
3 changed files with 535 additions and 1 deletions
--- a/pep-0000.txt
+++ b/pep-0000.txt
@@ -104,6 +104,8 @@ Index by Category
 S   358  The "bytes" Object                           Schemenauer
 S   359  The "make" Statement                         Bethard
 S   754  IEEE 754 Floating Point Special Values       Warnes
+ S  3101  Advanced String Formatting                   Talin
+ S  3102  Keyword-Only Arguments                       Talin

 Finished PEPs (done, implemented in Subversion)

@@ -425,7 +427,8 @@ Numerical Index
 P  3002  Procedure for Backwards-Incompatible Changes Bethard
 I  3099  Things that will Not Change in Python 3000   Brandl
 I  3100  Python 3.0 Plans                             Kuchling, Cannon
-
+ S  3101  Advanced String Formatting                   Talin
+ S  3102  Keyword-Only Arguments                       Talin

 Key

@@ -522,6 +525,7 @@ Owners
    Smith, Kevin D.          Kevin.Smith@theMorgue.org
    Stein, Greg              gstein@lyra.org
    Suzi, Roman              rnd@onego.ru
+    Talin                    talin at acm.org
    Taschuk, Steven          staschuk@telusplanet.net
    Tirosh, Oren             oren at hishome.net
    Warnes, Gregory R.       warnes@users.sourceforge.net
--- a/pep-3101.txt
+++ b/pep-3101.txt
@@ -0,0 +1,346 @@
+PEP: 3101
+Title: Advanced String Formatting
+Version: $Revision$
+Last-Modified: $Date$
+Author: Talin <talin at acm.org>
+Status: Draft
+Type: Standards
+Content-Type: text/plain
+Created: 16-Apr-2006
+Python-Version: 3.0
+Post-History:
+
+
+Abstract
+
+    This PEP proposes a new system for built-in string formatting
+    operations, intended as a replacement for the existing '%' string
+    formatting operator.
+
+
+Rationale
+
+    Python currently provides two methods of string interpolation:
+
+    - The '%' operator for strings.
+
+    - The string.Template module.
+
+    The scope of this PEP will be restricted to proposals for built-in
+    string formatting operations (in other words, methods of the
+    built-in string type).  This does not obviate the need for more
+    sophisticated string-manipulation modules in the standard library
+    such as string.Template.  In any case, string.Template will not be
+    discussed here, except to say that this proposal will most
+    likely have some overlapping functionality with that module.
+
+    The '%' operator is primarily limited by the fact that it is a
+    binary operator, and therefore can take at most two arguments.
+    One of those arguments is already dedicated to the format string,
+    leaving all other variables to be squeezed into the remaining
+    argument.  The current practice is to use either a dictionary or a
+    tuple as the second argument, but as many people have commented
+    [1], this lacks flexibility.  The "all or nothing" approach
+    (meaning that one must choose between only positional arguments,
+    or only named arguments) is felt to be overly constraining.
+
+
+Specification
+
+    The specification will consist of 4 parts:
+
+    - Specification of a set of methods to be added to the built-in
+      string class.
+
+    - Specification of a new syntax for format strings.
+
+    - Specification of a new set of class methods to control the
+      formatting and conversion of objects.
+
+    - Specification of an API for user-defined formatting classes.
+
+
+String Methods
+
+    The built-in string class will gain two new methods.  The first
+    method is 'format', and takes an arbitrary number of positional
+    and keyword arguments:
+
+        "The story of {0}, {1}, and {c}".format(a, b, c=d)
+
+    Within a format string, each positional argument is identified
+    with a number, starting from zero, so in the above example, 'a' is
+    argument 0 and 'b' is argument 1.  Each keyword argument is
+    identified by its keyword name, so in the above example, 'c' is
+    used to refer to the third argument.
+
+    The result of the format call is an object of the same type
+    (string or unicode) as the format string.
+
+
+Format Strings
+
+    Brace characters ('curly braces') are used to indicate a
+    replacement field within the string:
+
+        "My name is {0}".format('Fred')
+
+    The result of this is the string:
+
+        "My name is Fred"
+
+    Braces can be escaped using a backslash:
+
+        "My name is {0} :-\{\}".format('Fred')
+
+    Which would produce:
+
+        "My name is Fred :-{}"
+
+    The element within the braces is called a 'field'.  Fields consist
+    of a name, which can either be simple or compound, and an optional
+    'conversion specifier'.
+
+    Simple names are either names or numbers.  If numbers, they must
+    be valid decimal numbers; if names, they must be valid Python
+    identifiers.  A number is used to identify a positional argument,
+    while a name is used to identify a keyword argument.
+
+    Compound names are a sequence of simple names separated by
+    periods:
+
+        "My name is {0.name} :-\{\}".format(dict(name='Fred'))
+
+    Compound names can be used to access specific dictionary entries,
+    array elements, or object attributes.  In the above example, the
+    '{0.name}' field refers to the dictionary entry 'name' within
+    positional argument 0.
+
+    Each field can also specify an optional set of 'conversion
+    specifiers'.  Conversion specifiers follow the field name, with a
+    colon (':') character separating the two:
+
+        "My name is {0:8}".format('Fred')
+
+    The meaning and syntax of the conversion specifiers depend on the
+    type of object that is being formatted; however, many of the
+    built-in types will recognize a standard set of conversion
+    specifiers.
+
+    The conversion specifier consists of a sequence of zero or more
+    characters, each of which can consist of any printable character
+    except for a non-escaped '}'.  The format() method does not
+    attempt to interpret the conversion specifiers in any way; it
+    merely passes all of the characters between the first colon ':'
+    and the matching right brace ('}') to the various underlying
+    formatters (described later).
+
+    When using the 'cformat' variant, it is possible to omit the field
+    name entirely, and simply include the conversion specifiers:
+
+        "My name is {:pad(23)}"
+
+    This syntax is used to send special instructions to the custom
+    formatter object (such as instructing it to insert padding
+    characters up to a given column.)  The interpretation of this
+    'empty' field is entirely up to the custom formatter; no
+    standard interpretation will be defined in this PEP.
+
+    If a custom formatter is not being used, then it is an error to
+    omit the field name.
+
+
+Standard Conversion Specifiers
+
+    For most built-in types, the conversion specifiers will be the
+    same or similar to the existing conversion specifiers used with
+    the '%' operator.  Thus, instead of '%02.2x', you will say
+    '{0:2.2x}'.
+
+    There are a few differences, however:
+
+    - The trailing letter is optional - you don't need to say '2.2d',
+      you can instead just say '2.2'.  If the letter is omitted, the
+      value will be converted into its 'natural' form (that is, the
+      form that it would take if str() or unicode() were called on it)
+      subject to the field length and precision specifiers (if
+      supplied).
+
+    - Variable field width specifiers use a nested version of the {}
+      syntax, allowing the width specifier to be either a positional
+      or keyword argument:
+
+        "{0:{1}.{2}d}".format(a, b, c)
+
+      (Note: It might be easier to parse if these used a different
+      type of delimiter, such as parens - avoiding the need to create
+      a regex that handles the recursive case.)
+
+    - The support for length modifiers (which are ignored by Python
+      anyway) is dropped.
+
+    For non-built-in types, the conversion specifiers will be specific
+    to that type.  An example is the 'datetime' class, whose
+    conversion specifiers are identical to the arguments to the
+    strftime() function:
+
+        "Today is: {0:%x}".format(datetime.now())
+
+
+Controlling Formatting
+
+    A class that wishes to implement a custom interpretation of its
+    conversion specifiers can implement a __format__ method:
+
+        class AST:
+            def __format__(self, specifiers):
+                ...
+
+    The 'specifiers' argument will be either a string object or a
+    unicode object, depending on the type of the original format
+    string.  The __format__ method should test the type of the
+    specifiers parameter to determine whether to return a string or
+    unicode object.  It is the responsibility of the __format__ method
+    to return an object of the proper type.
+
+    string.format() will format each field using the following steps:
+
+     1) See if the value to be formatted has a __format__ method.  If
+        it does, then call it.
+
+     2) Otherwise, check the internal formatter within string.format
+        that contains knowledge of certain builtin types.
+
+     3) Otherwise, call str() or unicode() as appropriate.
+
+
+User-Defined Formatting Classes
+
+    The code that interprets format strings can be called explicitly
+    from user code.  This allows the creation of custom formatter
+    classes that can override the normal formatting rules.
+
+    The string and unicode classes will have a class method called
+    'cformat' that does all the actual work of formatting; the
+    format() method is just a wrapper that calls cformat.
+
+    The parameters to the cformat function are:
+
+        -- The format string (or unicode; the same function handles
+           both.)
+        -- A field format hook (see below)
+        -- A tuple containing the positional arguments
+        -- A dict containing the keyword arguments
+
+    The cformat function will parse all of the fields in the format
+    string, and return a new string (or unicode) with all of the
+    fields replaced with their formatted values.
+
+    For each field, the cformat function will attempt to call the
+    field format hook with the following arguments:
+
+       field_hook(value, conversion, buffer)
+
+    The 'value' field corresponds to the value being formatted, which
+    was retrieved from the arguments using the field name.  (The
+    field_hook has no control over the selection of values, only
+    how they are formatted.)
+
+    The 'conversion' argument is the conversion spec part of the
+    field, which will be either a string or unicode object, depending
+    on the type of the original format string.
+
+    The 'buffer' argument is a Python array object, either a byte
+    array or unicode character array.  The buffer object will contain
+    the partially constructed string; the field hook is free to modify
+    the contents of this buffer if needed.
+
+    The field_hook will be called once per field. The field_hook may
+    take one of two actions:
+
+        1) Return False, indicating that the field_hook will not
+           process this field and the default formatting should be
+           used.  This decision should be based on the type of the
+           value object, and the contents of the conversion string.
+
+        2) Append the formatted field to the buffer, and return True.
+
+
+Alternate Syntax
+
+    Naturally, one of the most contentious issues is the syntax of the
+    format strings, and in particular the markup conventions used to
+    indicate fields.
+
+    Rather than attempting to exhaustively list all of the various
+    proposals, I will cover the ones that are most widely used
+    already.
+
+    - Shell variable syntax: $name and $(name) (or in some variants,
+      ${name}).  This is probably the oldest convention out there, and
+      is used by Perl and many others.  When used without the braces,
+      the length of the variable is determined by lexically scanning
+      until an invalid character is found.
+
+      This scheme is generally used in cases where interpolation is
+      implicit - that is, in environments where any string can contain
+      interpolation variables, and no special substitution function
+      need be invoked.  In such cases, it is important to prevent the
+      interpolation behavior from occurring accidentally, so the '$'
+      (which is otherwise a relatively uncommonly-used character) is
+      used to signal when the behavior should occur.
+
+      It is the author's opinion, however, that in cases where the
+      formatting is explicitly invoked, less care needs to be
+      taken to prevent accidental interpolation, in which case a
+      lighter and less unwieldy syntax can be used.
+
+    - Printf and its cousins ('%'), including variations that add a
+      field index, so that fields can be interpolated out of order.
+
+    - Other bracket-only variations.  Various MUDs (Multi-User
+      Dungeons) such as MUSH have used brackets (e.g. [name]) to do
+      string interpolation.  The Microsoft .Net libraries use braces
+      ({}), and a syntax which is very similar to the one in this
+      proposal, although the syntax for conversion specifiers is quite
+      different. [2]
+
+    - Backquoting.  This method has the benefit of minimal syntactical
+      clutter; however, it lacks many of the benefits of a function
+      call syntax (such as complex expression arguments, custom
+      formatters, etc.).
+
+    - Other variations include Ruby's #{}, PHP's {$name}, and so
+      on.
+
+
+Backwards Compatibility
+
+    Backwards compatibility can be maintained by leaving the existing
+    mechanisms in place.  The new system does not collide with any of
+    the method names of the existing string formatting techniques, so
+    both systems can co-exist until it comes time to deprecate the
+    older system.
+
+
+References
+
+    [1] [Python-3000] String formating operations in python 3k
+        http://mail.python.org/pipermail/python-3000/2006-April/000285.html
+
+    [2] Composite Formatting - [.Net Framework Developer's Guide]
+        http://msdn.microsoft.com/library/en-us/cpguide/html/cpconcompositeformatting.asp?frame=true
+
+
+Copyright
+
+    This document has been placed in the public domain.
+
+
+Local Variables:
+mode: indented-text
+indent-tabs-mode: nil
+sentence-end-double-space: t
+fill-column: 70
+coding: utf-8
+End:
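
Note on pep-3101.txt: the draft's main ideas can be tried against the str.format() method that later shipped in Python 3, with the caveat that the released behaviour differs from this draft in several places. The sketch below is an illustration under those assumptions, not the API specified above. In released Python, braces are escaped by doubling ({{ and }}) rather than with a backslash, a compound name such as {0.name} performs attribute access only (dictionary entries are written {0[name]}), and there is no 'cformat' class method; string.Formatter.format_field is used here as the nearest stand-in for the field hook. The Money and ShoutingFormatter classes and the 'usd' and 'shout' specifiers are hypothetical names invented for the example.

```python
import datetime
import string
import types


class Money:
    """Hypothetical type that interprets its own conversion specifier."""

    def __init__(self, amount):
        self.amount = amount

    def __format__(self, specifiers):
        # An empty specifier falls back to the value's natural form.
        if not specifiers:
            return str(self.amount)
        if specifiers == "usd":
            return "${:,.2f}".format(self.amount)
        raise ValueError("unknown specifier: " + specifiers)


class ShoutingFormatter(string.Formatter):
    """Hypothetical custom formatter; stands in for the draft's field hook."""

    def format_field(self, value, format_spec):
        # Handle one invented specifier, defer everything else to the default.
        if format_spec == "shout":
            return str(value).upper() + "!"
        return super().format_field(value, format_spec)


# Positional and keyword arguments mixed in one call.
story = "The story of {0}, {1}, and {c}".format("a", "b", c="d")

# Released Python escapes braces by doubling them, not with a backslash.
escaped = "My name is {0} :-{{}}".format("Fred")

# Compound names: attribute access with '.', item access with '[...]'.
by_attr = "My name is {0.name}".format(types.SimpleNamespace(name="Fred"))
by_item = "My name is {0[name]}".format({"name": "Fred"})

# A plain width specifier, and a nested (variable) width and precision.
padded = "My name is {0:8}".format("Fred")
nested = "{0:{1}.{2}f}".format(3.14159, 8, 2)

# Type-specific specifiers: __format__ on a user class, strftime on dates.
custom = "Total: {0:usd}".format(Money(1234.5))
dated = "Committed: {0:%Y-%m-%d}".format(datetime.date(2006, 4, 26))

# A formatter object that overrides how individual fields are rendered.
shouted = ShoutingFormatter().format("Hello {0:shout} and {1:>5}", "fred", "x")
```

The nested example uses the 'f' presentation type rather than the 'd' shown in the draft, because released Python rejects a precision on integer formats.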
--- a/pep-3102.txt
+++ b/pep-3102.txt
@ -0,0 +1,184 @@
+PEP: 3102
+Title: Keyword-Only Arguments
+Version: $Revision$
+Last-Modified: $Date$
+Author: Talin <talin at acm.org>
+Status: Draft
+Type: Standards
+Content-Type: text/plain
+Created: 22-Apr-2006
+Python-Version: 3.0
+Post-History:
+
+
+Abstract
+
+    This PEP proposes a change to the way that function arguments are
+    assigned to named parameter slots.  In particular, it enables the
+    declaration of "keyword-only" arguments: arguments that can only
+    be supplied by keyword and which will never be automatically
+    filled in by a positional argument.
+
+
+Rationale
+
+    The current Python function-calling paradigm allows arguments to
+    be specified either by position or by keyword.  An argument can be
+    filled in either explicitly by name, or implicitly by position.
+
+    There are often cases where it is desirable for a function to take
+    a variable number of arguments.  The Python language supports this
+    using the 'varargs' syntax ('*name'), which specifies that any
+    'left over' arguments be passed into the varargs parameter as a
+    tuple.
+
+    One limitation on this is that currently, all of the regular
+    argument slots must be filled before the vararg slot can be.
+
+    This is not always desirable.  One can easily envision a function
+    which takes a variable number of arguments, but also takes one
+    or more 'options' in the form of keyword arguments.  Currently,
+    the only way to do this is to define both a varargs argument,
+    and a 'keywords' argument (**kwargs), and then manually extract
+    the desired keywords from the dictionary.
+
+
+Specification
+
+    Syntactically, the proposed changes are fairly simple.  The first
+    change is to allow regular arguments to appear after a varargs
+    argument:
+
+        def sortwords(*wordlist, case_sensitive=False):
+           ...
+
+    This function accepts any number of positional arguments, and it
+    also accepts a keyword option called 'case_sensitive'.  This
+    option will never be filled in by a positional argument, but
+    must be explicitly specified by name.
+
+    Keyword-only arguments are not required to have a default value.
+    Since Python requires that all arguments be bound to a value,
+    and since the only way to bind a value to a keyword-only argument
+    is via keyword, such arguments are therefore 'required keyword'
+    arguments.  Such arguments must be supplied by the caller, and
+    they must be supplied via keyword.
+
+    The second syntactical change is to allow the argument name to
+    be omitted for a varargs argument:
+
+        def compare(a, b, *, key=None):
+            ...
+
+    The reasoning behind this change is as follows.  Imagine for a
+    moment a function which takes several positional arguments, as
+    well as a keyword argument:
+
+        def compare(a, b, key=None):
+            ...
+
+    Now, suppose you wanted to have 'key' be a keyword-only argument.
+    Under the above syntax, you could accomplish this by adding a
+    varargs argument immediately before the keyword argument:
+
+        def compare(a, b, *ignore, key=None):
+            ...
+
+    Unfortunately, the 'ignore' argument will also suck up any
+    erroneous positional arguments that may have been supplied by the
+    caller.  Given that we'd prefer any unwanted arguments to raise an
+    error, we could do this:
+
+        def compare(a, b, *ignore, key=None):
+            if ignore:  # If ignore is not empty
+                raise TypeError
+
+    As a convenient shortcut, we can simply omit the 'ignore' name,
+    meaning 'don't allow any positional arguments beyond this point'.
+
+
+Function Calling Behavior
+
+    The previous section describes the difference between the old
+    behavior and the new.  However, it is also useful to have a
+    description of the new behavior that stands by itself, without
+    reference to the previous model.  So this next section will
+    attempt to provide such a description.
+
+    When a function is called, the input arguments are assigned to
+    formal parameters as follows:
+
+      - For each formal parameter, there is a slot which will be used
+        to contain the value of the argument assigned to that
+        parameter.
+
+      - Slots which have had values assigned to them are marked as
+        'filled'.  Slots which have no value assigned to them yet are
+        considered 'empty'.
+
+      - Initially, all slots are marked as empty.
+
+      - Positional arguments are assigned first, followed by keyword
+        arguments.
+
+      - For each positional argument:
+
+         o Attempt to bind the argument to the first unfilled
+           parameter slot.  If the slot is not a vararg slot, then
+           mark the slot as 'filled'.
+
+         o If the next unfilled slot is a vararg slot, and it does
+           not have a name, then it is an error.
+
+         o Otherwise, if the next unfilled slot is a vararg slot then
+           all remaining non-keyword arguments are placed into the
+           vararg slot.
+
+      - For each keyword argument:
+
+         o If there is a parameter with the same name as the keyword,
+           then the argument value is assigned to that parameter slot.
+           However, if the parameter slot is already filled, then that
+           is an error.
+
+         o Otherwise, if there is a 'keyword dictionary' argument,
+           the argument is added to the dictionary using the keyword
+           name as the dictionary key, unless there is already an
+           entry with that key, in which case it is an error.
+
+         o Otherwise, if there is no keyword dictionary, and no
+           matching named parameter, then it is an error.
+
+      - Finally:
+
+         o If the vararg slot is not yet filled, assign an empty tuple
+           as its value.
+
+         o For each remaining empty slot: if there is a default value
+           for that slot, then fill the slot with the default value.
+           If there is no default value, then it is an error.
+
+    In accordance with the current Python implementation, any errors
+    encountered will be signaled by raising TypeError.  (If you want
+    something different, that's a subject for a different PEP.)
+
+
+Backwards Compatibility
+
+    The function calling behavior specified in this PEP is a superset
+    of the existing behavior - that is, it is expected that any
+    existing programs will continue to work.
+
+
+Copyright
+
+    This document has been placed in the public domain.
+
+
+Local Variables:
+mode: indented-text
+indent-tabs-mode: nil
+sentence-end-double-space: t
+fill-column: 70
+coding: utf-8
+End:
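
Note on pep-3102.txt: the two syntax changes proposed in the Specification section can be exercised in Python 3, where this syntax was eventually adopted. The sketch below fills in the bodies that the PEP leaves as '...', so the function bodies, the connect() function, and the raises_type_error() helper are assumptions made purely for illustration.

```python
def sortwords(*wordlist, case_sensitive=False):
    # 'case_sensitive' follows the varargs parameter, so it can only be
    # supplied by keyword; every positional argument lands in 'wordlist'.
    key = None if case_sensitive else str.lower
    return sorted(wordlist, key=key)


def compare(a, b, *, key=None):
    # The bare '*' accepts no extra positional arguments; 'key' is
    # keyword-only.  Returns -1, 0 or 1 (an assumed convention).
    if key is not None:
        a, b = key(a), key(b)
    return (a > b) - (a < b)


def connect(host, *, timeout):
    # Hypothetical example of a 'required keyword' argument: no default,
    # so the caller must pass timeout=... explicitly.
    return (host, timeout)


def raises_type_error(call):
    """Helper for the demonstration: report whether call() raises TypeError."""
    try:
        call()
    except TypeError:
        return True
    return False


insensitive = sortwords("b", "a", "C")
sensitive = sortwords("b", "a", "C", case_sensitive=True)
folded = compare("a", "B", key=str.lower)

# A third positional argument is an error rather than a silent 'key'.
extra_positional = raises_type_error(lambda: compare(1, 2, abs))

# Omitting a required keyword-only argument is also an error.
missing_keyword = raises_type_error(lambda: connect("example.org"))
```

As the PEP states, both failure cases are reported by raising TypeError, which is what the helper checks for.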