diff --git a/pep-0422.txt b/pep-0422.txt index 37087e6d4..f33f1cb3e 100644 --- a/pep-0422.txt +++ b/pep-0422.txt @@ -1,5 +1,5 @@ PEP: 422 -Title: Simple class initialisation hook +Title: Simpler customisation of class creation Version: $Revision$ Last-Modified: $Date$ Author: Nick Coghlan , @@ -9,29 +9,23 @@ Type: Standards Track Content-Type: text/x-rst Created: 5-Jun-2012 Python-Version: 3.4 -Post-History: 5-Jun-2012, 10-Feb-2012 +Post-History: 5-Jun-2012, 10-Feb-2013 Abstract ======== -In Python 2, the body of a class definition could modify the way a class -was created (or simply arrange to run other code after the class was created) -by setting the ``__metaclass__`` attribute in the class body. While doing -this implicitly from called code required the use of an implementation detail -(specifically, ``sys._getframes()``), it could also be done explicitly in a -fully supported fashion (for example, by passing ``locals()`` to a -function that calculated a suitable ``__metaclass__`` value) +Currently, customising class creation requires the use of a custom metaclass. +This custom metaclass then persists for the entire lifecycle of the class, +creating the potential for spurious metaclass conflicts. -There is currently no corresponding mechanism in Python 3 that allows the -code executed in the class body to directly influence how the class object -is created. Instead, the class creation process is fully defined by the -class header, before the class body even begins executing. +This PEP proposes to instead support a wide range of customisation +scenarios through a new ``namespace`` parameter in the class header, and +a new ``__init_class__`` hook in the class body. -This PEP proposes a mechanism that will once again allow the body of a -class definition to more directly influence the way a class is created -(albeit in a more constrained fashion), as well as replacing some current -uses of metaclasses with a simpler, easier to understand alternative. 
+The new mechanism is also much easier to understand and use than +implementing a custom metaclass, and thus should provide a gentler +introduction to the full power of Python's metaclass machinery. Background @@ -81,25 +75,32 @@ metaclass could not call methods that referenced the class by name (as the name had not yet been bound in the containing scope), similarly, Python 3 metaclasses cannot call methods that rely on the implicit ``__class__`` reference (as it is not populated until after the metaclass has returned -control to the class creation machiner). +control to the class creation machinery). + +Finally, when a class uses a custom metaclass, it can pose additional +challenges to the use of multiple inheritance, as a new class cannot +inherit from parent classes with unrelated metaclasses. This means that +it is impossible to add a metaclass to an already published class: such +an addition is a backwards incompatible change due to the risk of metaclass +conflicts. Proposal ======== -This PEP proposes that a mechanism be added to Python 3 that meets the -following criteria: +This PEP proposes that a new mechanism to customise class creation be +added to Python 3.4 that meets the following criteria: -1. Restores the ability for class namespaces to have some influence on the +1. Integrates nicely with class inheritance structures (including mixins and + multiple inheritance) +2. Integrates nicely with the implicit ``__class__`` reference and + zero-argument ``super()`` syntax introduced by PEP 3135 +3. Can be added to an existing base class without a significant risk of + introducing backwards compatibility problems +4. Restores the ability for class namespaces to have some influence on the class creation process (above and beyond populating the namespace itself), but potentially without the full flexibility of the Python 2 style ``__metaclass__`` hook -2. Integrates nicely with class inheritance structures (including mixins and - multiple inheritance) -3. 
Integrates nicely with the implicit ``__class__`` reference and - zero-argument ``super()`` syntax introduced by PEP 3135 -4. Can be added to an existing base class without a significant risk of - introducing backwards compatibility problems One mechanism that can achieve this goal is to add a new class initialisation hook, modelled directly on the existing instance @@ -110,7 +111,6 @@ Specifically, it is proposed that class definitions be able to provide a class initialisation hook as follows:: class Example: - @classmethod def __init_class__(cls): # This is invoked after the class is created, but before any # explicit decorators are called @@ -121,13 +121,15 @@ class initialisation hook as follows:: If present on the created object, this new hook will be called by the class creation machinery *after* the ``__class__`` reference has been initialised. For ``types.new_class()``, it will be called as the last step before -returning the created class object. +returning the created class object. ``__init_class__`` is implicitly +converted to a class method when the class is created (prior to the hook +being invoked). If a metaclass wishes to block class initialisation for some reason, it must arrange for ``cls.__init_class__`` to trigger ``AttributeError``. Note, that when ``__init_class__`` is called, the name of the class is not -bound to the new class object yet. As a consequence, the two argument form +yet bound to the new class object. As a consequence, the two argument form of ``super()`` cannot be used to call methods (e.g., ``super(Example, cls)`` wouldn't work in the example above). However, the zero argument form of ``super()`` works as expected, since the ``__class__`` reference is already @@ -139,19 +141,47 @@ similar mechanism has long been supported by `Zope's ExtensionClass`_), but the situation has changed sufficiently in recent years that the idea is worth reconsidering. 
+In addition, the introduction of the metaclass ``__prepare__`` method in PEP +3115 allows a further enhancement that was not possible in Python 2: this +PEP also proposes that ``type.__prepare__`` be updated to accept a factory +function as a ``namespace`` keyword-only argument. If present, the value +provided as the ``namespace`` argument will be called without arguments +to create the result of ``type.__prepare__`` instead of using a freshly +created dictionary instance. For example, the following will use +an ordered dictionary as the class namespace:: + + class OrderedExample(namespace=collections.OrderedDict): + def __init_class__(cls): + # cls.__dict__ is still a read-only proxy to the class namespace, + # but the underlying storage is an OrderedDict instance + +.. note:: + + This PEP, along with the existing ability to use __prepare__ to share a + single namespace amongst multiple class objects, highlights a possible + issue with the attribute lookup caching: when the underlying mapping is + updated by other means, the attribute lookup cache is not invalidated + correctly (this is a key part of the reason class ``__dict__`` attributes + produce a read-only view of the underlying storage). + + Since the optimisation provided by that cache is highly desirable, + the use of a preexisting namespace as the class namespace may need to + be declared as officially unsupported (since the observed behaviour is + rather strange when the caches get out of sync). + Key Benefits ============ -Replaces many use cases for dynamic setting of ``__metaclass__`` ------------------------------------------------------------------ +Easier use of custom namespaces for a class +------------------------------------------- -For use cases that don't involve completely replacing the defined class, -Python 2 code that dynamically set ``__metaclass__`` can now dynamically -set ``__init_class__`` instead. 
For more advanced use cases, introduction of -an explicit metaclass (possibly made available as a required base class) will -still be necessary in order to support Python 3. +Currently, to use a different type (such as ``collections.OrderedDict``) for +a class namespace, or to use a pre-populated namespace, it is necessary to +write and use a custom metaclass. With this PEP, using a custom namespace +becomes as simple as specifying an appropriate factory function in the +class header. Easier inheritance of definition time behaviour @@ -201,137 +231,124 @@ implicit ``__class__`` reference introduced by PEP 3135, including methods that use the zero argument form of ``super()``. -Alternatives -============ +Replaces many use cases for dynamic setting of ``__metaclass__`` +----------------------------------------------------------------- + +For use cases that don't involve completely replacing the defined class, +Python 2 code that dynamically set ``__metaclass__`` can now dynamically +set ``__init_class__`` instead. For more advanced use cases, introduction of +an explicit metaclass (possibly made available as a required base class) will +still be necessary in order to support Python 3. -The Python 3 Status Quo ------------------------ +New Ways of Using Classes +========================= -The Python 3 status quo already offers a great deal of flexibility. For -changes which only affect a single class definition and which can be -specified at the time the code is written, then class decorators can be -used to modify a class explicitly. Class decorators largely ignore class -inheritance and can make full use of methods that rely on the ``__class__`` -reference being populated. +The new ``namespace`` keyword in the class header enables a number of +interesting options for controlling the way a class is initialised, +including some aspects of the object models of both Javascript and Ruby. -Using a custom metaclass provides the same level of power as it did in -Python 2. 
However, it's notable that, unlike class decorators, a metaclass -cannot call any methods that rely on the ``__class__`` reference, as that -reference is not populated until after the metaclass constructor returns -control to the class creation code. +All of the examples below are actually possible today through the use of a +custom metaclass:: -One major use case for metaclasses actually closely resembles the use of -class decorators. It occurs whenever a metaclass has an implementation that -uses the following pattern:: + class CustomNamespace(type): + @classmethod + def __prepare__(meta, name, bases, *, namespace=None, **kwds): + parent_namespace = super().__prepare__(name, bases, **kwds) + return namespace() if namespace is not None else parent_namespace + def __new__(meta, name, bases, ns, *, namespace=None, **kwds): + return super().__new__(meta, name, bases, ns, **kwds) + def __init__(cls, name, bases, ns, *, namespace=None, **kwds): + return super().__init__(name, bases, ns, **kwds) - class Metaclass(type): - def __new__(meta, *args, **kwds): - cls = super(Metaclass, meta).__new__(meta, *args, **kwds) - # Do something with cls - return cls - -The key difference between this pattern and a class decorator is that it -is automatically inherited by subclasses. However, it also comes with a -major disadvantage: Python does not allow you to inherit from classes with -unrelated metaclasses. - -Thus, the status quo requires that developers choose between the following -two alternatives: - -* Use a class decorator, meaning that behaviour is not inherited and must be - requested explicitly on every subclass -* Use a metaclass, meaning that behaviour is inherited, but metaclass - conflicts may make integration with other libraries and frameworks more - difficult than it otherwise would be - -If this PEP is ultimately rejected, then this is the existing design that -will remain in place by default. 
+The advantage of implementing the new keyword directly in +``type.__prepare__`` is that the *only* persistent effect is then +the change in the underlying storage of the class attributes. The metaclass +of the class remains unchanged, eliminating many of the drawbacks +typically associated with these kinds of customisations. -Restoring the Python 2 metaclass hook -------------------------------------- - -One simple alternative would be to restore support for a Python 2 style -``metaclass`` hook in the class body. This would be checked after the class -body was executed, potentially overwriting the metaclass hint provided in the -class header. - -The main attraction of such an approach is that it would simplify porting -Python 2 applications that make use of this hook (especially those that do -so dynamically). - -However, this approach does nothing to simplify the process of adding -*inherited* class definition time behaviour, nor does it interoperate -cleanly with the PEP 3135 ``__class__`` and ``super()`` semantics (as with -any metaclass based solution, the ``__metaclass__`` hook would have to run -before the ``__class__`` reference has been populated. - - -Dynamic class decorators +Order preserving classes ------------------------ -The original version of this PEP was called "Dynamic class decorators" and -focused solely on a significantly more complicated proposal than that -presented in the current version. +:: -As with the current version, it proposed that a new step be added to the -class creation process, after the metaclass invocation to construct the -class instance and before the application of lexical decorators. However, -instead of a simple process of calling a single class method that relies -on normal inheritance mechanisms, it proposed a far more complicated -procedure that walked the class MRO looking for decorators stored in -iterable ``__decorators__`` attributes. 
- -Using the current version of the PEP, the scheme originally proposed could -be implemented as:: - - class DynamicDecorators(Base): - @classmethod - def __init_class__(cls): - # Process any classes later in the MRO - try: - mro_chain = super().__init_class__ - except AttributeError: - pass - else: - mro_chain() - # Process any __decorators__ attributes in the MRO - for entry in reversed(cls.mro()): - decorators = entry.__dict__.get("__decorators__", ()) - for deco in reversed(decorators): - cls = deco(cls) - -Any subclasses of ``DynamicDecorators`` would then automatically have the -contents of any ``__decorators__`` attributes processed and invoked. - -The mechanism in the current PEP is considered superior, as many issues -to do with ordering and the same decorator being invoked multiple times -just go away, as that kind of thing is taken care of through the use of an -ordinary class method invocation. + class OrderedClass(namespace=collections.OrderedDict): + a = 1 + b = 2 + c = 3 -Automatic metaclass derivation ------------------------------- +Prepopulated namespaces +----------------------- -When no appropriate metaclass is found, it's theoretically possible to -automatically derive a metaclass for a new type based on the metaclass hint -and the metaclasses of the bases. +:: -While adding such a mechanism would reduce the risk of spurious metaclass -conflicts, it would do nothing to improve integration with PEP 3135, would -not help with porting Python 2 code that set ``__metaclass__`` dynamically -and would not provide a more straightforward inherited mechanism for invoking -additional operations after the class invocation is complete. - -In addition, there would still be a risk of metaclass conflicts in cases -where the base metaclasses were not written with multiple inheritance in -mind. In such situations, there's a chance of introducing latent defects -if one or more metaclasses are not invoked correctly. 
+ seed_data = dict(a=1, b=2, c=3) + class PrepopulatedClass(namespace=seed_data.copy): + pass -Calling the new hook from ``type.__init__`` -------------------------------------------- +Cloning a prototype class +------------------------- + +:: + + class NewClass(namespace=Prototype.__dict__.copy): + pass + + +Extending a class +----------------- + +.. note:: Just because the PEP makes it *possible* to do this relatively + cleanly doesn't mean anyone *should* do this! + +:: + + from collections.abc import MutableMapping + + # The MutableMapping + dict combination should give something that + # generally behaves correctly as a mapping, while still being accepted + # as a class namespace + class ClassNamespace(MutableMapping, dict): + def __init__(self, cls): + self._cls = cls + def __len__(self): + return len(dir(self._cls)) + def __iter__(self): + for attr in dir(self._cls): + yield attr + def __contains__(self, attr): + return hasattr(self._cls, attr) + def __getitem__(self, attr): + return getattr(self._cls, attr) + def __setitem__(self, attr, value): + setattr(self._cls, attr, value) + def __delitem__(self, attr): + delattr(self._cls, attr) + + def extend(cls): + return lambda: ClassNamespace(cls) + + class Example: + pass + + class ExtendedExample(namespace=extend(Example)): + a = 1 + b = 2 + c = 3 + + >>> Example.a, Example.b, Example.c + (1, 2, 3) + + +Rejected Design Options +======================= + + +Calling ``__init_class__`` from ``type.__init__`` +------------------------------------------------- Calling the new hook automatically from ``type.__init__``, would achieve most of the goals of this PEP. However, using that approach would mean that @@ -340,11 +357,43 @@ relied on the ``__class__`` reference (or used the zero-argument form of ``super()``), and could not make use of those features themselves. 
+Requiring an explicit decorator on ``__init_class__`` +----------------------------------------------------- + +Originally, this PEP required the explicit use of ``@classmethod`` on the +``__init_class__`` method. It was made implicit since there's no +sensible interpretation for leaving it out, and that case would need to be +detected anyway in order to give a useful error message. + +This decision was reinforced after noticing that the user experience of +defining ``__prepare__`` and forgetting the ``@classmethod`` method +decorator is singularly incomprehensible (particularly since PEP 3115 +documents it as an ordinary method, and the current documentation doesn't +explicitly say anything one way or the other). + + +Passing in the namespace directly rather than a factory function +---------------------------------------------------------------- + +At one point, this PEP proposed that the class namespace be passed +directly as a keyword argument, rather than passing a factory function. +However, this encourages an unsupported behaviour (that is, passing the +same namespace to multiple classes, or retaining direct write access +to a mapping used as a class namespace), so the API was switched to +the factory function version. + + Reference Implementation ======================== -The reference implementation has been posted to the `issue tracker`_. +A reference implementation for ``__init_class__`` has been posted to the +`issue tracker`_. It does not yet include the new ``namespace`` parameter +for ``type.__prepare__``. +TODO +==== + +* address the 5 points in http://mail.python.org/pipermail/python-dev/2013-February/123970.html References ========== diff --git a/pep-0426.txt b/pep-0426.txt index 2095b6801..b6ea1ccb1 100644 --- a/pep-0426.txt +++ b/pep-0426.txt @@ -343,10 +343,10 @@ Examples:: Provides-Extra (multiple use) ----------------------------- -A string containing the name of an optional feature. 
Must be printable -ASCII, not containing whitespace, comma (,), or square brackets []. -May be used to make a dependency conditional on whether the optional -feature has been requested. +A string containing the name of an optional feature or "extra" that may +only be available when additional dependencies have been installed. Must +be printable ASCII, not containing whitespace, comma (,), or square +brackets []. See `Optional Features`_ for details on the use of this field. @@ -861,7 +861,7 @@ ordered as shown:: Within a post-release (``1.0.post1``), the following suffixes are permitted and are ordered as shown:: - devN, + .devN, Note that ``devN`` and ``postN`` must always be preceded by a dot, even when used immediately following a numeric version (e.g. ``1.0.dev456``, @@ -976,8 +976,9 @@ Date based versions As with other incompatible version schemes, date based versions can be stored in the ``Private-Version`` field. Translating them to a compliant -version is straightforward: the simplest approach is to subtract the year -of the first release from the major component in the release number. +public version is straightforward: the simplest approach is to subtract +the year before the first release from the major component in the release +number. Version specifiers @@ -994,48 +995,93 @@ Each version identifier must be in the standard format described in The comma (",") is equivalent to a logical **and** operator. -Comparison operators must be one of ``<``, ``>``, ``<=``, ``>=``, ``==`` -or ``!=``. - -The ``==`` and ``!=`` operators are strict - in order to match, the -version supplied must exactly match the specified version, with no -additional trailing suffix. 
- -However, when no comparison operator is provided along with a version -identifier ``V``, it is equivalent to using the following pair of version -clauses:: - - >= V, < V+1 - -where ``V+1`` is the next version after ``V``, as determined by -incrementing the last numeric component in ``V`` (for example, if -``V == 1.0a3``, then ``V+1 == 1.0a4``, while if ``V == 1.0``, then -``V+1 == 1.1``). - -This approach makes it easy to depend on a particular release series -simply by naming it in a version specifier, without requiring any -additional annotation. For example, the following pairs of version -specifiers are equivalent:: - - 2 - >= 2, < 3 - - 3.3 - >= 3.3, < 3.4 - Whitespace between a conditional operator and the following version identifier is optional, as is the whitespace around the commas. + +Compatible release +------------------ + +A compatible release clause omits the comparison operator and matches any +version that is expected to be compatible with the specified version. + +For a given release identifier ``V.N``, the compatible release clause is +approximately equivalent to the pair of comparison clauses:: + + >= V.N, < V+1.dev0 + +where ``V+1`` is the next version after ``V``, as determined by +incrementing the last numeric component in ``V``. For example, +the following version clauses are approximately equivalent:: + + 2.2 + >= 2.2, < 3.dev0 + + 1.4.5 + >= 1.4.5, < 1.5.dev0 + +The difference between the two is that using a compatible release clause +does *not* count as `explicitly mentioning a pre-release`__. 
+ +__ `Handling of pre-releases`_ + +If a pre-release, post-release or developmental release is named in a +compatible release clause as ``V.N.suffix``, then the suffix is ignored +when determining the upper limit of compatibility:: + + 2.2.post3 + >= 2.2.post3, < 3.dev0 + + 1.4.5a4 + >= 1.4.5a4, < 1.5.dev0 + + +Version comparisons +------------------- + +A version comparison clause includes a comparison operator and a version +identifier, and will match any version where the comparison is true. + +Comparison clauses are only needed to cover cases which cannot be handled +with an appropriate compatible release clause, including coping with +dependencies which do not have a robust backwards compatibility policy +and thus break the assumptions of a compatible release clause. + +The defined comparison operators are ``<``, ``>``, ``<=``, ``>=``, ``==``, +and ``!=``. + +The ordered comparison operators ``<``, ``>``, ``<=``, ``>=`` are based +on the consistent ordering defined by the standard `Version scheme`_. + +The ``==`` and ``!=`` operators are based on string comparisons - in order +to match, the version being checked must start with exactly that sequence of +characters. + +.. note:: + + The use of ``==`` when defining dependencies for published distributions + is strongly discouraged, as it greatly complicates the deployment of + security fixes (the strict version comparison operator is intended + primarily for use when defining dependencies for particular + applications while using a shared distribution index). + + +Handling of pre-releases +------------------------ + Pre-releases of any kind, including developmental releases, are implicitly excluded from all version specifiers, *unless* a pre-release or developmental -developmental release is explicitly mentioned in one of the clauses. For -example, this specifier implicitly excludes all pre-releases and development +release is explicitly mentioned in one of the clauses. 
For example, these +specifiers implicitly exclude all pre-releases and development releases of later versions:: + 2.2 >= 1.0 -While these specifiers would include them:: +While these specifiers would include at least some of them:: + 2.2.dev0 + 2.2, != 2.3b2 >= 1.0a1 >= 1.0c1 >= 1.0, != 1.0b2 @@ -1053,37 +1099,26 @@ controlled on a per-distribution basis. Post-releases and purely numeric releases receive no special treatment - they are always included unless explicitly excluded. -Given the above rules, projects which include the ``.0`` suffix for the -first release in a series, such as ``2.5.0``, can easily refer specifically -to that version with the clause ``2.5.0``, while the clause ``2.5`` refers -to that entire series. Projects which omit the ".0" suffix for the first -release of a series, by using a version string like ``2.5`` rather than -``2.5.0``, will need to use an explicit clause like ``>= 2.5, < 2.5.1`` to -refer specifically to that initial release. -Some examples: +Examples +-------- -* ``Requires-Dist: zope.interface (3.1)``: any version that starts with 3.1, +* ``Requires-Dist: zope.interface (3.1)``: version 3.1 or later, but not + version 4.0 or later. Excludes pre-releases and developmental releases. +* ``Requires-Dist: zope.interface (3.1.0)``: version 3.1.0 or later, but not + version 3.2.0 or later. Excludes pre-releases and developmental releases. +* ``Requires-Dist: zope.interface (==3.1)``: any version that starts + with 3.1, excluding pre-releases and developmental releases. +* ``Requires-Dist: zope.interface (3.1.0,!=3.1.3)``: version 3.1.0 or later, + but not version 3.1.3 and not version 3.2.0 or later. Excludes pre-releases + and developmental releases. For this particular project, this means: "any + version of the 3.1 series but not 3.1.3". This is equivalent to: + ``>=3.1, !=3.1.3, <3.2``. +* ``Requires-Python: 2.6``: Any version of Python 2.6 or 2.7. It + automatically excludes Python 3 or later. 
+* ``Requires-Python: 3.2, < 3.3``: Specifically requires Python 3.2, excluding pre-releases. -* ``Requires-Dist: zope.interface (==3.1)``: equivalent to ``Requires-Dist: - zope.interface (3.1)``. -* ``Requires-Dist: zope.interface (3.1.0)``: any version that starts with - 3.1.0, excluding pre-releases. Since that particular project doesn't - use more than 3 digits, it also means "only the 3.1.0 release". -* ``Requires-Python: 3``: Any Python 3 version, excluding pre-releases. -* ``Requires-Python: >=2.6,<3``: Any version of Python 2.6 or 2.7, including - post-releases (if they were used for Python). It excludes pre releases of - Python 3. -* ``Requires-Python: 2.6.2``: Equivalent to ">=2.6.2,<2.6.3". So this includes - only Python 2.6.2. Of course, if Python was numbered with 4 digits, it would - include all versions of the 2.6.2 series, excluding pre-releases. -* ``Requires-Python: 2.5``: Equivalent to ">=2.5,<2.6". -* ``Requires-Dist: zope.interface (3.1,!=3.1.3)``: any version that starts - with 3.1, excluding pre-releases of 3.1 *and* excluding any version that - starts with "3.1.3". For this particular project, this means: "any version - of the 3.1 series but not 3.1.3". This is equivalent to: - ">=3.1,!=3.1.3,<3.2". -* ``Requires-Python: >=3.3a1``: Any version of Python 3.3+, including +* ``Requires-Python: 3.3a1``: Any version of Python 3.3+, including pre-releases like 3.4a1. @@ -1439,10 +1474,10 @@ Changing the interpretation of version specifiers The previous interpretation of version specifiers made it very easy to accidentally download a pre-release version of a dependency. This in turn made it difficult for developers to publish pre-release versions -of software to the Python Package Index, as leaving the package set as -public would lead to users inadvertently downloading pre-release software, -while hiding it would defeat the purpose of publishing it for user -testing. 
+of software to the Python Package Index, as even marking the package as +hidden wasn't enough to keep automated tools from downloading it, and also +made it harder for users to obtain the test release manually through the +main PyPI web interface. The previous interpretation also excluded post-releases from some version specifiers for no adequately justified reason. @@ -1451,6 +1486,16 @@ The updated interpretation is intended to make it difficult to accidentally accept a pre-release version as satisfying a dependency, while allowing pre-release versions to be explicitly requested when needed. +The "some forward compatibility assumed" default version constraint is +taken directly from the Ruby community's "pessimistic version constraint" +operator [4]_ to allow projects to take a cautious approach to forward +compatibility promises, while still easily setting a minimum required +version for their dependencies. It is made the default behaviour rather +than needing a separate operator in order to explicitly discourage +overspecification of dependencies by library developers. The explicit +comparison operators remain available to cope with dependencies with +unreliable or non-existent backwards compatibility policies. + Packaging, build and installation dependencies ---------------------------------------------- @@ -1548,6 +1593,9 @@ justifications for needing such a standard can be found in PEP 386. .. [3] Version compatibility analysis script: http://hg.python.org/peps/file/default/pep-0426/pepsort.py +.. [4] Pessimistic version constraint + http://docs.rubygems.org/read/chapter/16 + Appendix A ========== diff --git a/pep-0427.txt b/pep-0427.txt index 3715a107c..10d0a7b0f 100644 --- a/pep-0427.txt +++ b/pep-0427.txt @@ -101,6 +101,15 @@ Generate script wrappers. accompanying .exe wrappers. Windows installers may want to add them during install. +Recommended archiver features +''''''''''''''''''''''''''''' + +Place ``.dist-info`` at the end of the archive. 
+ Archivers are encouraged to place the ``.dist-info`` files physically + at the end of the archive. This enables some potentially interesting + ZIP tricks including the ability to amend the metadata without + rewriting the entire archive. + File Format ----------- @@ -149,9 +158,14 @@ non-alphanumeric characters with an underscore ``_``:: re.sub("[^\w\d.]+", "_", distribution, re.UNICODE) -The filename is Unicode. It will be some time before the tools are -updated to support non-ASCII filenames, but they are supported in this -specification. +The archive filename is Unicode. It will be some time before the tools +are updated to support non-ASCII filenames, but they are supported in +this specification. + +The filenames *inside* the archive are encoded as UTF-8. Although some +ZIP clients in common use do not properly display UTF-8 filenames, +the encoding is supported by both the ZIP specification and Python's +``zipfile``. File contents ''''''''''''' diff --git a/pep-0428.txt b/pep-0428.txt index 61d82d81c..efd322a08 100644 --- a/pep-0428.txt +++ b/pep-0428.txt @@ -55,6 +55,15 @@ rejection of :pep:`355`. .. _`Unipath`: https://bitbucket.org/sluggo/unipath/overview +Implementation +============== + +The implementation of this proposal is tracked in the ``pep428`` branch +of pathlib's `Mercurial repository`_. + +.. 
_`Mercurial repository`: https://bitbucket.org/pitrou/pathlib/ + + Why an object-oriented API ========================== @@ -341,6 +350,15 @@ call ``bytes()`` on it, or use the ``as_bytes()`` method:: >>> bytes(p) b'/home/antoine/pathlib/setup.py' +To represent the path as a ``file`` URI, call the ``as_uri()`` method:: + + >>> p = PurePosixPath('/etc/passwd') + >>> p.as_uri() + 'file:///etc/passwd' + >>> p = PureNTPath('c:/Windows') + >>> p.as_uri() + 'file:///c:/Windows' + Properties ---------- diff --git a/pep-0435.txt b/pep-0435.txt new file mode 100644 index 000000000..6ca2c0b8a --- /dev/null +++ b/pep-0435.txt @@ -0,0 +1,542 @@ +PEP: 435 +Title: Adding an Enum type to the Python standard library +Version: $Revision$ +Last-Modified: $Date$ +Author: Barry Warsaw , + Eli Bendersky +Status: Draft +Type: Standards Track +Content-Type: text/x-rst +Created: 2013-02-23 +Python-Version: 3.4 +Post-History: 2013-02-23 + + +Abstract +======== + +This PEP proposes adding an enumeration type to the Python standard library. +Specifically, it proposes moving the existing ``flufl.enum`` package by Barry +Warsaw into the standard library. Much of this PEP is based on the "using" +[1]_ document from the documentation of ``flufl.enum``. + +An enumeration is a set of symbolic names bound to unique, constant integer +values. Within an enumeration, the values can be compared by identity, and the +enumeration itself can be iterated over. Enumeration items can be converted to +and from their integer equivalents, supporting use cases such as storing +enumeration values in a database. + + +Status of discussions +===================== + +The idea of adding an enum type to Python is not new - PEP 354 [2]_ is a +previous attempt that was rejected in 2005. Recently a new set of discussions +was initiated [3]_ on the ``python-ideas`` mailing list. Many new ideas were +proposed in several threads; after a lengthy discussion Guido proposed adding +``flufl.enum`` to the standard library [4]_. 
This PEP is an attempt to +formalize this decision as well as discuss a number of variations that can be +considered for inclusion. + + +Motivation +========== + +*[Based partly on the Motivation stated in PEP 354]* + +The properties of an enumeration are useful for defining an immutable, related +set of constant values that have a defined sequence but no inherent semantic +meaning. Classic examples are days of the week (Sunday through Saturday) and +school assessment grades ('A' through 'D', and 'F'). Other examples include +error status values and states within a defined process. + +It is possible to simply define a sequence of values of some other basic type, +such as ``int`` or ``str``, to represent discrete arbitrary values. However, +an enumeration ensures that such values are distinct from any others including, +importantly, values within other enumerations, and that operations without +meaning ("Wednesday times two") are not defined for these values. It also +provides a convenient printable representation of enum values without requiring +tedious repetition while defining them (i.e. no ``GREEN = 'green'``). + + +Module and type name +==================== + +We propose to add a module named ``enum`` to the standard library. The main +type exposed by this module is ``Enum``. Hence, to import the ``Enum`` type +user code will run:: + + >>> from enum import Enum + + +Proposed semantics for the new enumeration type +=============================================== + +Creating an Enum +---------------- + +Enumerations are created using the class syntax, which makes them easy to read +and write. Every enumeration value must have a unique integer value and the +only restriction on their names is that they must be valid Python identifiers. +To define an enumeration, derive from the ``Enum`` class and add attributes +with assignment to their integer values:: + + >>> from enum import Enum + >>> class Colors(Enum): + ... red = 1 + ... green = 2 + ... 
blue = 3 + +Enumeration values are compared by identity:: + + >>> Colors.red is Colors.red + True + >>> Colors.blue is Colors.blue + True + >>> Colors.red is not Colors.blue + True + >>> Colors.blue is Colors.red + False + +Enumeration values have nice, human readable string representations:: + + >>> print(Colors.red) + Colors.red + +...while their repr has more information:: + + >>> print(repr(Colors.red)) + + +The enumeration value names are available through the class members:: + + >>> for member in Colors.__members__: + ... print(member) + red + green + blue + +Let's say you wanted to encode an enumeration value in a database. You might +want to get the enumeration class object from an enumeration value:: + + >>> cls = Colors.red.enum + >>> print(cls.__name__) + Colors + +Enums also have a property that contains just their item name:: + + >>> print(Colors.red.name) + red + >>> print(Colors.green.name) + green + >>> print(Colors.blue.name) + blue + +The str and repr of the enumeration class also provides useful information:: + + >>> print(Colors) + + >>> print(repr(Colors)) + + +You can extend previously defined Enums by subclassing:: + + >>> class MoreColors(Colors): + ... pink = 4 + ... cyan = 5 + +When extended in this way, the base enumeration's values are identical to the +same named values in the derived class:: + + >>> Colors.red is MoreColors.red + True + >>> Colors.blue is MoreColors.blue + True + +However, these are not doing comparisons against the integer equivalent +values, because if you define an enumeration with similar item names and +integer values, they will not be identical:: + + >>> class OtherColors(Enum): + ... red = 1 + ... blue = 2 + ... 
yellow = 3 + >>> Colors.red is OtherColors.red + False + >>> Colors.blue is not OtherColors.blue + True + +These enumeration values are not equal, nor do they hash equally:: + + >>> Colors.red == OtherColors.red + False + >>> len(set((Colors.red, OtherColors.red))) + 2 + +Ordered comparisons between enumeration values are *not* supported. Enums are +not integers:: + + >>> Colors.red < Colors.blue + Traceback (most recent call last): + ... + NotImplementedError + >>> Colors.red <= Colors.blue + Traceback (most recent call last): + ... + NotImplementedError + >>> Colors.blue > Colors.green + Traceback (most recent call last): + ... + NotImplementedError + >>> Colors.blue >= Colors.green + Traceback (most recent call last): + ... + NotImplementedError + +Equality comparisons are defined though:: + + >>> Colors.blue == Colors.blue + True + >>> Colors.green != Colors.blue + True + +Enumeration values do not support ordered comparisons:: + + >>> Colors.red < Colors.blue + Traceback (most recent call last): + ... + NotImplementedError + >>> Colors.red < 3 + Traceback (most recent call last): + ... + NotImplementedError + >>> Colors.red <= 3 + Traceback (most recent call last): + ... + NotImplementedError + >>> Colors.blue > 2 + Traceback (most recent call last): + ... + NotImplementedError + >>> Colors.blue >= 2 + Traceback (most recent call last): + ... + NotImplementedError + +While equality comparisons are allowed, comparisons against non-enumeration +values will always compare not equal:: + + >>> Colors.green == 2 + False + >>> Colors.blue == 3 + False + >>> Colors.green != 3 + True + >>> Colors.green == 'green' + False + +If you really want the integer equivalent values, you can convert enumeration +values explicitly using the ``int()`` built-in. 
This is quite convenient for +storing enums in a database, as well as for interoperability with C extensions +that expect integers:: + + >>> int(Colors.red) + 1 + >>> int(Colors.green) + 2 + >>> int(Colors.blue) + 3 + +You can also convert back to the enumeration value by calling the Enum +subclass, passing in the integer value for the item you want:: + + >>> Colors(1) + + >>> Colors(2) + + >>> Colors(3) + + >>> Colors(1) is Colors.red + True + +The Enum subclass also accepts the string name of the enumeration value:: + + >>> Colors('red') + + >>> Colors('blue') is Colors.blue + True + +You get exceptions though, if you try to use invalid arguments:: + + >>> Colors('magenta') + Traceback (most recent call last): + ... + ValueError: magenta + >>> Colors(99) + Traceback (most recent call last): + ... + ValueError: 99 + +The Enum base class also supports getitem syntax, exactly equivalent to the +class's call semantics:: + + >>> Colors[1] + + >>> Colors[2] + + >>> Colors[3] + + >>> Colors[1] is Colors.red + True + >>> Colors['red'] + + >>> Colors['blue'] is Colors.blue + True + >>> Colors['magenta'] + Traceback (most recent call last): + ... + ValueError: magenta + >>> Colors[99] + Traceback (most recent call last): + ... + ValueError: 99 + +The integer equivalent values serve another purpose. You may not define two +enumeration values with the same integer value:: + + >>> class Bad(Enum): + ... cartman = 1 + ... stan = 2 + ... kyle = 3 + ... kenny = 3 # Oops! + ... butters = 4 + Traceback (most recent call last): + ... + TypeError: Multiple enum values: 3 + +You also may not duplicate values in derived enumerations:: + + >>> class BadColors(Colors): + ... yellow = 4 + ... chartreuse = 2 # Oops! + Traceback (most recent call last): + ... + TypeError: Multiple enum values: 2 + +The Enum class supports iteration. 
Enumeration values are returned in the +sorted order of their integer equivalent values:: + + >>> [v.name for v in MoreColors] + ['red', 'green', 'blue', 'pink', 'cyan'] + >>> [int(v) for v in MoreColors] + [1, 2, 3, 4, 5] + +Enumeration values are hashable, so they can be used in dictionaries and sets:: + + >>> apples = {} + >>> apples[Colors.red] = 'red delicious' + >>> apples[Colors.green] = 'granny smith' + >>> for color in sorted(apples, key=int): + ... print(color.name, '->', apples[color]) + red -> red delicious + green -> granny smith + + +Pickling +-------- + +Enumerations created with the class syntax can also be pickled and unpickled:: + + >>> from enum.tests.fruit import Fruit + >>> from pickle import dumps, loads + >>> Fruit.tomato is loads(dumps(Fruit.tomato)) + True + + +Convenience API +--------------- + +You can also create enumerations using the convenience function ``make()``, +which takes an iterable object or dictionary to provide the item names and +values. ``make()`` is a module-level function. + +The first argument to ``make()`` is the name of the enumeration, and it returns +the so-named `Enum` subclass. The second argument is a *source* which can be +either an iterable or a dictionary. In the most basic usage, *source* returns +a sequence of strings which name the enumeration items. In this case, the +values are automatically assigned starting from 1:: + + >>> import enum + >>> enum.make('Animals', ('ant', 'bee', 'cat', 'dog')) + + +The items in source can also be 2-tuples, where the first item is the +enumeration value name and the second is the integer value to assign to the +value. If 2-tuples are used, all items must be 2-tuples:: + + >>> def enumiter(): + ... start = 1 + ... while True: + ... yield start + ... start <<= 1 + >>> enum.make('Flags', zip(list('abcdefg'), enumiter())) + + + +Proposed variations +=================== + +Some variations were proposed during the discussions in the mailing list. 
+Here are some of the more popular ones. + + +Not having to specify values for enums +-------------------------------------- + +Michael Foord proposed (and Tim Delaney provided a proof-of-concept +implementation) to use metaclass magic that makes this possible:: + + class Color(Enum): + red, green, blue + +The values are actually assigned only when first looked up. + +Pros: cleaner syntax that requires less typing for a very common task (just +listing enumeration names without caring about the values). + +Cons: involves much magic in the implementation, which makes even the +definition of such enums baffling when first seen. Besides, explicit is +better than implicit. + + +Using special names or forms to auto-assign enum values +------------------------------------------------------- + +A different approach to avoid specifying enum values is to use a special name +or form to auto-assign them. For example:: + + class Color(Enum): + red = None # auto-assigned to 0 + green = None # auto-assigned to 1 + blue = None # auto-assigned to 2 + +More flexibly:: + + class Color(Enum): + red = 7 + green = None # auto-assigned to 8 + blue = 19 + purple = None # auto-assigned to 20 + +Some variations on this theme: + +#. A special name ``auto`` imported from the enum package. +#. Georg Brandl proposed ellipsis (``...``) instead of ``None`` to achieve the + same effect. + +Pros: no need to manually enter values. Makes it easier to change the enum and +extend it, especially for large enumerations. + +Cons: actually longer to type in many simple cases. The argument of explicit +vs. implicit applies here as well. + + +Use-cases in the standard library +================================= + +The Python standard library has many places where the usage of enums would be +beneficial to replace other idioms currently used to represent them. Such +usages can be divided into two categories: user-code facing constants, and +internal constants. 
+ +User-code facing constants like ``os.SEEK_*``, ``socket`` module constants, +decimal rounding modes, and HTTP status codes could benefit from being enums had +they been implemented this way from the beginning. At this point, however, +such a change cannot be made without the risk of breaking user code (that +relies on the constants' actual values rather than their meaning). This does not mean +that future uses in the stdlib can't use an enum for defining new user-code +facing constants. + +Internal constants are not seen by user code but are employed internally by +stdlib modules. It appears that nothing should stand in the way of +implementing such constants with enums. Some examples uncovered by a very +partial skim through the stdlib: ``binhex``, ``imaplib``, ``http/client``, +``urllib/robotparser``, ``idlelib``, ``concurrent.futures``, ``turtledemo``. + +In addition, looking at the code of the Twisted library, there are many use +cases for replacing internal state constants with enums. The same can be said +about a lot of networking code (especially implementation of protocols) and +can be seen in test protocols written with the Tulip library as well. + + +Differences from PEP 354 +======================== + +Unlike PEP 354, enumeration values are not defined as a sequence of strings, +but as attributes of a class. This design was chosen because it was felt that +class syntax is more readable. + +Unlike PEP 354, enumeration values require an explicit integer value. This +difference recognizes that enumerations often represent real-world values, or +must interoperate with external real-world systems. For example, to store an +enumeration in a database, it is better to convert it to an integer on the way +in and back to an enumeration on the way out. Providing an integer value also +provides an explicit ordering. However, there is no automatic conversion to +and from the integer values, because explicit is better than implicit. 
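The database round trip described above can be sketched as follows. This is a minimal illustration, not part of the proposal: it assumes the ``enum`` module as it eventually shipped in the standard library, where the integer is read through ``.value`` rather than ``int()`` as in this draft, and the ``shirts`` table is hypothetical.

```python
import sqlite3
from enum import Enum


class Colors(Enum):
    red = 1
    green = 2
    blue = 3


# Hypothetical in-memory table used only for this illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE shirts (color INTEGER)")

# On the way in: convert the enumeration value to its integer explicitly.
conn.execute("INSERT INTO shirts VALUES (?)", (Colors.green.value,))

# On the way out: convert the stored integer back by calling the class.
(stored,) = conn.execute("SELECT color FROM shirts").fetchone()
restored = Colors(stored)
conn.close()
```

Both conversions are spelled out at the boundary, so the rest of the program only ever handles ``Colors`` values and never bare integers.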
+ +Unlike PEP 354, this implementation does use a metaclass to define the +enumeration's syntax, and allows for extended base-enumerations so that the +common values in derived classes are identical (a singleton model). While PEP +354 dismisses this approach for its complexity, in practice any perceived +complexity, though minimal, is hidden from users of the enumeration. + +Unlike PEP 354, enumeration values can only be tested by identity comparison. +This is to emphasize the fact that enumeration values are singletons, much +like ``None``. + + +Acknowledgments +=============== + +This PEP describes the ``flufl.enum`` package by Barry Warsaw. ``flufl.enum`` +is based on an example by Jeremy Hylton. It has been modified and extended +by Barry Warsaw for use in the GNU Mailman [5]_ project. Ben Finney is the +author of the earlier enumeration PEP 354. + + +References +========== + +.. [1] http://pythonhosted.org/flufl.enum/docs/using.html +.. [2] http://www.python.org/dev/peps/pep-0354/ +.. [3] http://mail.python.org/pipermail/python-ideas/2013-January/019003.html +.. [4] http://mail.python.org/pipermail/python-ideas/2013-February/019373.html +.. [5] http://www.list.org + + +Copyright +========= + +This document has been placed in the public domain. + + +Todo +==== + + * Mark PEP 354 "superseded by" this one, if accepted + * New package name within stdlib - enum? (top-level) + * For make, can we add an API like namedtuple's? + make('Animals', 'ant bee cat dog') + I.e. when make sees a string argument it splits it, making it similar to a + tuple but with far less manual quote typing. OTOH, it just saves a ".split" + so may not be worth the effort? + +.. 
+ Local Variables: + mode: indented-text + indent-tabs-mode: nil + sentence-end-double-space: t + fill-column: 70 + coding: utf-8 + End: + diff --git a/pep-0436.txt b/pep-0436.txt new file mode 100644 index 000000000..a9534b424 --- /dev/null +++ b/pep-0436.txt @@ -0,0 +1,480 @@ +PEP: 436 +Title: The Argument Clinic DSL +Version: $Revision$ +Last-Modified: $Date$ +Author: Larry Hastings +Discussions-To: Python-Dev +Status: Draft +Type: Standards Track +Content-Type: text/x-rst +Created: 22-Feb-2013 + + +Abstract +======== + +This document proposes "Argument Clinic", a DSL designed to facilitate +argument processing for built-in functions in the implementation of +CPython. + + +Rationale and Goals +=================== + +The primary implementation of Python, "CPython", is written in a +mixture of Python and C. One of the implementation details of CPython +is what are called "built-in" functions -- functions available to +Python programs but written in C. When a Python program calls a +built-in function and passes in arguments, those arguments must be +translated from Python values into C values. This process is called +"parsing arguments". + +As of CPython 3.3, arguments to functions are primarily parsed with +one of two functions: the original ``PyArg_ParseTuple()``, [1]_ and +the more modern ``PyArg_ParseTupleAndKeywords()``. [2]_ The former +function only handles positional parameters; the latter also +accommodates keyword and keyword-only parameters, and is preferred for +new code. + +``PyArg_ParseTuple()`` was a reasonable approach when it was first +conceived. The programmer specified the translation for the arguments +in a "format string": [3]_ each parameter matched to a "format unit", +a one-or-two character sequence telling ``PyArg_ParseTuple()`` what +Python types to accept and how to translate them into the appropriate +C value for that parameter. There were only a dozen or so of these +"format units", and each one was distinct and easy to understand. 
+ +Over the years the ``PyArg_Parse`` interface has been extended in +numerous ways. The modern API is quite complex, to the point that it +is somewhat painful to use. Consider: + + * There are now forty different "format units"; a few are even three + characters long. This makes it difficult to understand what the + format string says without constantly cross-indexing it with the + documentation. + * There are also six meta-format units that may be buried in the + format string. (They are: ``"()|$:;"``.) + * The more format units are added, the less likely it is the + implementer can pick an easy-to-use mnemonic for the format unit, + because the character of choice is probably already in use. In + other words, the more format units we have, the more obtuse the + format units become. + * Several format units are nearly identical to others, having only + subtle differences. This makes understanding the exact semantics + of the format string even harder. + * The docstring is specified as a static C string, which is mildly + bothersome to read and edit. + * When adding a new parameter to a function using + ``PyArg_ParseTupleAndKeywords()``, it's necessary to touch six + different places in the code: [4]_ + + * Declaring the variable to store the argument. + * Passing in a pointer to that variable in the correct spot in + ``PyArg_ParseTupleAndKeywords()``, also passing in any + "length" or "converter" arguments in the correct order. + * Adding the name of the argument in the correct spot of the + "keywords" array passed in to + ``PyArg_ParseTupleAndKeywords()``. + * Adding the format unit to the correct spot in the format + string. + * Adding the parameter to the prototype in the docstring. + * Documenting the parameter in the docstring. + + * There is currently no mechanism for builtin functions to provide + their "signature" information (see ``inspect.getfullargspec`` and + ``inspect.Signature``). 
Adding this information using a mechanism + similar to the existing ``PyArg_Parse`` functions would require + repeating ourselves yet again. + +The goal of Argument Clinic is to replace this API with a mechanism +inheriting none of these downsides: + + * You need specify each parameter only once. + * All information about a parameter is kept together in one place. + * For each parameter, you specify its type in C; Argument Clinic + handles the translation from Python value into C value for you. + * Argument Clinic also allows for fine-tuning of argument processing + behavior with highly-readable "flags", both per-parameter and + applying across the whole function. + * Docstrings are written in plain text. + * From this, Argument Clinic generates for you all the mundane, + repetitious code and data structures CPython needs internally. + Once you've specified the interface, the next step is simply to + write your implementation using native C types. Every detail of + argument parsing is handled for you. + +Future goals of Argument Clinic include: + + * providing signature information for builtins, and + * speed improvements to the generated code. + + +DSL Syntax Summary +================== + +The Argument Clinic DSL is specified as a comment embedded in a C +file, as follows. The "Example" column on the right shows you sample +input to the Argument Clinic DSL, and the "Section" column on the left +specifies what each line represents in turn. 
+ +:: + + +-----------------------+-----------------------------------------------------+ + | Section | Example | + +-----------------------+-----------------------------------------------------+ + | Clinic DSL start | /*[clinic] | + | Function declaration | module.function_name -> return_annotation | + | Function flags | flag flag2 flag3=value | + | Parameter declaration | type name = default | + | Parameter flags | flag flag2 flag3=value | + | Parameter docstring | Lorem ipsum dolor sit amet, consectetur | + | | adipisicing elit, sed do eiusmod tempor | + | Function docstring | Lorem ipsum dolor sit amet, consectetur adipisicing | + | | elit, sed do eiusmod tempor incididunt ut labore et | + | Clinic DSL end | [clinic]*/ | + | Clinic output | ... | + | Clinic output end | /*[clinic end output:]*/ | + +-----------------------+-----------------------------------------------------+ + + +General Behavior Of the Argument Clinic DSL +------------------------------------------- + +All lines support ``#`` as a line comment delimiter *except* +docstrings. Blank lines are always ignored. + +Like Python itself, leading whitespace is significant in the Argument +Clinic DSL. The first line of the "function" section is the +declaration; all subsequent lines at the same indent are function +flags. Once you indent, the first line is a parameter declaration; +subsequent lines at that indent are parameter flags. Indent one more +time for the lines of the parameter docstring. Finally, dedent back +to the same level as the function declaration for the function +docstring. + + +Function Declaration +-------------------- + +The return annotation is optional. If skipped, the arrow ("``->``") +must also be omitted. + + +Parameter Declaration +--------------------- + +The "type" is a C type. If it's a pointer type, you must specify a +single space between the type and the "``*``", and zero spaces between +the "``*``" and the name. (e.g. 
"``PyObject *foo``", not "``PyObject* +foo``") + +The "name" must be a legal C identifier. + +The "default" is a Python value. Default values are optional; if not +specified you must omit the equals sign too. Parameters which don't +have a default are implicitly required. The default value is +dynamically assigned, "live" in the generated C code, and although +it's specified as a Python value, it's translated into a native C +value in the generated C code. + +It's explicitly permitted to end the parameter declaration line with a +semicolon, though the semicolon is optional. This is intended to +allow directly cutting and pasting in declarations from C code. +However, the preferred style is without the semicolon. + + +Flags +----- + +"Flags" are like "``make -D``" arguments. They're unordered. Flags +lines are parsed much like the shell (specifically, using +``shlex.split()`` [5]_ ). You can have as many flag lines as you +like. Specifying a flag twice is currently an error. + +Supported flags for functions: + +``basename`` + The basename to use for the generated C functions. By default this + is the name of the function from the DSL, only with periods replaced + by underscores. + +``positional-only`` + This function only supports positional parameters, not keyword + parameters. See `Functions With Positional-Only Parameters`_ below. + +Supported flags for parameters: + +``bitwise`` + If the Python integer passed in is signed, copy the bits directly + even if it is negative. Only valid for unsigned integer types. + +``converter`` + Backwards-compatibility support for parameter "converter" + functions. [6]_ The value should be the name of the converter + function in C. Only valid when the type of the parameter is + ``void *``. + +``default`` + The Python value to use in place of the parameter's actual default + in Python contexts. Specifically, when specified, this value will + be used for the parameter's default in the docstring, and in the + ``Signature``. 
(TBD: If the string is a valid Python expression + which can be rendered into a Python value using ``eval()``, then the + result of ``eval()`` on it will be used as the default in the + ``Signature``.) Ignored if there is no default. + +``encoding`` + Encoding to use when encoding a Unicode string to a ``char *``. + Only valid when the type of the parameter is ``char *``. + +``group=`` + This parameter is part of a group of options that must either all be + specified or none specified. Parameters in the same "group" must be + contiguous. The value of the group flag is the name used for the + group variable, and therefore must be legal as a C identifier. Only + valid for functions marked "``positional-only``"; see `Functions + With Positional-Only Parameters`_ below. + +``immutable`` + Only accept immutable values. + +``keyword-only`` + This parameter (and all subsequent parameters) is keyword-only. + Keyword-only parameters must also be optional parameters. Not valid + for positional-only functions. + +``length`` + This is an iterable type, and we also want its length. The DSL will + generate a second ``Py_ssize_t`` variable; its name will be this + parameter's name appended with "``_length``". + +``nullable`` + ``None`` is a legal argument for this parameter. If ``None`` is + supplied on the Python side, the equivalent C argument will be + ``NULL``. Only valid for pointer types. + +``required`` + Normally any parameter that has a default value is automatically + optional. A parameter that has "required" set will be considered + required (non-optional) even if it has a default value. The + generated documentation will also not show any default value. + +``types`` + Space-separated list of acceptable Python types for this object. + There are also four special-case types which represent Python + protocols: + + * buffer + * mapping + * number + * sequence + +``zeroes`` + This parameter is a string type, and its value should be allowed to + have embedded zeroes. 
Not valid for all varieties of string + parameters. + + +Python Code +----------- + +Argument Clinic also permits embedding Python code inside C files, +which is executed in-place when Argument Clinic processes the file. +Embedded code looks like this: + +:: + + /*[python] + + # this is python code! + print("/" + "* Hello world! *" + "/") + + [python]*/ + +Any Python code is valid. Python code sections in Argument Clinic can +also be used to modify Clinic's behavior at runtime; for example, see +`Extending Argument Clinic`_. + + +Output +====== + +Argument Clinic writes its output in-line in the C file, immediately +after the section of Clinic code. For "python" sections, the output +is everything printed using ``builtins.print``. For "clinic" +sections, the output is valid C code, including: + + * a ``#define`` providing the correct ``methoddef`` structure for the + function + * a prototype for the "impl" function -- this is what you'll write + to implement this function + * a function that handles all argument processing, which calls your + "impl" function + * the definition line of the "impl" function + * and a comment indicating the end of output. + +The intention is that you will write the body of your impl function +immediately after the output -- as in, you write a left-curly-brace +immediately after the end-of-output comment and write the +implementation of the builtin in the body there. (It's a bit strange +at first, but oddly convenient.) + +Argument Clinic will define the parameters of the impl function for +you. The function will take the "self" parameter passed in +originally, all the parameters you define, and possibly some extra +generated parameters ("length" parameters; also "group" parameters, +see next section). + +Argument Clinic also writes a checksum for the output section. This +is a valuable safety feature: if you modify the output by hand, Clinic +will notice that the checksum doesn't match, and will refuse to +overwrite the file. 
(You can force Clinic to overwrite with the +"``-f``" command-line argument; Clinic will also ignore the checksums +when using the "``-o``" command-line argument.) + + +Functions With Positional-Only Parameters +========================================= + +A significant fraction of Python builtins implemented in C use the +older positional-only API for processing arguments +(``PyArg_ParseTuple()``). In some instances, these builtins parse +their arguments differently based on how many arguments were passed +in. This can provide some bewildering flexibility: there may be +groups of optional parameters, which must either all be specified or +none specified. And occasionally these groups are on the *left!* (For +example: ``curses.window.addch()``.) + +Argument Clinic supports these legacy use-cases with a special set of +flags. First, set the flag "``positional-only``" on the entire +function. Then, for every group of parameters that is collectively +optional, add a "``group=``" flag with a unique string to all the +parameters in that group. Note that these groups are permitted on the +right *or left* of any required parameters! However, all groups +(including the group of required parameters) must be contiguous. + +The impl function generated by Clinic will add an extra parameter for +every group, "``int _group``". This argument will be nonzero +if the group was specified on this call, and zero if it was not. + +Note that when operating in this mode, you cannot specify default +arguments. You can simulate defaults by putting parameters in +individual groups and detecting whether or not they were specified; +generally speaking it's better to simply not use "positional-only" +where it isn't absolutely necessary. (TBD: It might be possible to +relax this restriction. But adding default arguments into the mix of +groups would seemingly make calculating which groups are active a good +deal harder.) 
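The relationship between argument counts and optional groups can be sketched in a few lines of Python. This is an illustrative model only, not Clinic's actual algorithm: the helper names ``group_mappings`` and ``ambiguous_counts`` and the group names ``yx`` and ``attr`` are hypothetical, and the sketch ignores the contiguity and ordering rules.

```python
from itertools import combinations


def group_mappings(required, optional_groups):
    """Map each argument count to the sets of groups that produce it.

    required: number of required parameters.
    optional_groups: dict mapping a group name to its parameter count.
    """
    names = sorted(optional_groups)
    mappings = {}
    # Try every subset of optional groups and record its total size.
    for size in range(len(names) + 1):
        for active in combinations(names, size):
            count = required + sum(optional_groups[n] for n in active)
            mappings.setdefault(count, []).append(active)
    return mappings


def ambiguous_counts(mappings):
    """Return the argument counts reachable by more than one group set."""
    return sorted(c for c, choices in mappings.items() if len(choices) > 1)


# Loosely modelled on curses.window.addch(): [y, x,] ch [, attr]
addch = group_mappings(1, {"yx": 2, "attr": 1})

# Two one-parameter groups cannot be told apart by count alone.
clash = group_mappings(1, {"a": 1, "b": 1})
```

In this model every argument count for the ``addch``-like layout selects exactly one set of groups, while the ``clash`` layout reaches two arguments in two different ways, so the count alone cannot select a group.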
+ +Also, note that it's possible to specify a set of groups to a function +such that there are several valid mappings from the number of +arguments to a valid set of groups. If this happens, Clinic will exit +with an error message. This should not be a problem, as +positional-only operation is only intended for legacy use cases, and +all the legacy functions using this quirky behavior should have +unambiguous mappings. + + +Current Status +============== + +As of this writing, there is a working prototype implementation of +Argument Clinic available online. [7]_ The prototype implements the +syntax above, and generates code using the existing ``PyArg_Parse`` +APIs. It supports translating to all current format units except +``"w*"``. Sample functions using Argument Clinic exercise all major +features, including positional-only argument parsing. + + +Extending Argument Clinic +------------------------- + +The prototype also currently provides an experimental extension +mechanism, allowing adding support for new types on-the-fly. See +``Modules/posixmodule.c`` in the prototype for an example of its use. + + +Notes / TBD +=========== + +* Guido proposed having the "function docstring" be hand-written inline, + in the middle of the output, something like this: + + :: + + /*[clinic] + ... prototype and parameters (including parameter docstrings) go here + [clinic]*/ + ... some output ... + /*[clinic docstring start]*/ + ... hand-edited function docstring goes here <-- you edit this by hand! + /*[clinic docstring end]*/ + ... more output + /*[clinic output end]*/ + + I tried it this way and don't like it -- I think it's clumsy. I + prefer that everything you write goes in one place, rather than + having an island of hand-edited stuff in the middle of the DSL + output. + +* Do we need to support tuple unpacking? (The "``(OOO)``" style + format string.) Boy I sure hope not. + +* What about Python functions that take no arguments? This syntax + doesn't provide for that. 
Perhaps a lone indented "None" should + mean "no arguments"? + +* This approach removes some dynamism / flexibility. With the + existing syntax one could theoretically pass in different encodings + at runtime for the "``es``"/"``et``" format units. AFAICT CPython + doesn't do this itself, however it's possible external users might + do this. (Trivia: there are no uses of "``es``" exercised by + regrtest, and all the uses of "``et``" exercised are in + socketmodule.c, except for one in _ssl.c. They're all static, + specifying the encoding ``"idna"``.) + +* Right now the "basename" flag on a function changes the ``#define + methoddef`` name too. Should it, or should the #define'd methoddef + name always be ``{module_name}_{function_name}`` ? + + +References +========== + +.. [1] ``PyArg_ParseTuple()``: + http://docs.python.org/3/c-api/arg.html#PyArg_ParseTuple + +.. [2] ``PyArg_ParseTupleAndKeywords()``: + http://docs.python.org/3/c-api/arg.html#PyArg_ParseTupleAndKeywords + +.. [3] ``PyArg_`` format units: + http://docs.python.org/3/c-api/arg.html#strings-and-buffers + +.. [4] Keyword parameters for extension functions: + http://docs.python.org/3/extending/extending.html#keyword-parameters-for-extension-functions + +.. [5] ``shlex.split()``: + http://docs.python.org/3/library/shlex.html#shlex.split + +.. [6] ``PyArg_`` "converter" functions, see ``"O&"`` in this section: + http://docs.python.org/3/c-api/arg.html#other-objects + +.. [7] Argument Clinic prototype: + https://bitbucket.org/larry/python-clinic/ + + +Copyright +========= + +This document has been placed in the public domain. + + + +.. + Local Variables: + mode: indented-text + indent-tabs-mode: nil + sentence-end-double-space: t + fill-column: 70 + coding: utf-8 + End: