This commit is contained in:
Terry Jan Reedy 2013-03-05 23:16:18 -05:00
commit 2c3b8db87c
6 changed files with 1377 additions and 226 deletions


@ -1,5 +1,5 @@
PEP: 422
Title: Simpler customisation of class creation
Version: $Revision$
Last-Modified: $Date$
Author: Nick Coghlan <ncoghlan@gmail.com>,
@ -9,29 +9,23 @@ Type: Standards Track
Content-Type: text/x-rst
Created: 5-Jun-2012
Python-Version: 3.4
Post-History: 5-Jun-2012, 10-Feb-2013
Abstract
========
Currently, customising class creation requires the use of a custom metaclass.
This custom metaclass then persists for the entire lifecycle of the class,
creating the potential for spurious metaclass conflicts.
This PEP proposes to instead support a wide range of customisation
scenarios through a new ``namespace`` parameter in the class header, and
a new ``__init_class__`` hook in the class body.
The new mechanism is also much easier to understand and use than
implementing a custom metaclass, and thus should provide a gentler
introduction to the full power of Python's metaclass machinery.
Background
@ -81,25 +75,32 @@ metaclass could not call methods that referenced the class by name (as the
name had not yet been bound in the containing scope), similarly, Python 3
metaclasses cannot call methods that rely on the implicit ``__class__``
reference (as it is not populated until after the metaclass has returned
control to the class creation machinery).
Finally, when a class uses a custom metaclass, it can pose additional
challenges to the use of multiple inheritance, as a new class cannot
inherit from parent classes with unrelated metaclasses. This means that
it is impossible to add a metaclass to an already published class: such
an addition is a backwards incompatible change due to the risk of metaclass
conflicts.
Proposal
========
This PEP proposes that a new mechanism to customise class creation be
added to Python 3.4 that meets the following criteria:
1. Integrates nicely with class inheritance structures (including mixins and
multiple inheritance)
2. Integrates nicely with the implicit ``__class__`` reference and
zero-argument ``super()`` syntax introduced by PEP 3135
3. Can be added to an existing base class without a significant risk of
introducing backwards compatibility problems
4. Restores the ability for class namespaces to have some influence on the
class creation process (above and beyond populating the namespace itself),
but potentially without the full flexibility of the Python 2 style
``__metaclass__`` hook
One mechanism that can achieve this goal is to add a new class
initialisation hook, modelled directly on the existing instance
@ -110,7 +111,6 @@ Specifically, it is proposed that class definitions be able to provide a
class initialisation hook as follows::
    class Example:
        def __init_class__(cls):
            # This is invoked after the class is created, but before any
            # explicit decorators are called
@ -121,13 +121,15 @@ class initialisation hook as follows::
If present on the created object, this new hook will be called by the class
creation machinery *after* the ``__class__`` reference has been initialised.
For ``types.new_class()``, it will be called as the last step before
returning the created class object. ``__init_class__`` is implicitly
converted to a class method when the class is created (prior to the hook
being invoked).
If a metaclass wishes to block class initialisation for some reason, it
must arrange for ``cls.__init_class__`` to trigger ``AttributeError``.
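Since ``__init_class__`` is not yet part of the language, the proposed
semantics can be approximated today with a small metaclass. The sketch below
is illustrative only; the ``InitClassMeta`` name and the registry example are
ours, not part of the proposal:

```python
class InitClassMeta(type):
    """Rough emulation of the proposed __init_class__ hook."""
    def __init__(cls, name, bases, ns, **kwds):
        super().__init__(name, bases, ns, **kwds)
        # mimic the implicit conversion of the hook into a class method
        if "__init_class__" in ns and not isinstance(ns["__init_class__"], classmethod):
            setattr(cls, "__init_class__", classmethod(ns["__init_class__"]))
        # the hook is looked up on the created class, so it is inherited
        hook = getattr(cls, "__init_class__", None)
        if hook is not None:
            hook()

class Example(metaclass=InitClassMeta):
    registry = []
    def __init_class__(cls):
        cls.registry.append(cls.__name__)

class Child(Example):   # unlike a class decorator, the hook is inherited
    pass

print(Example.registry)   # ['Example', 'Child']
```

Note that the emulation still suffers the usual metaclass drawbacks the PEP
describes; it is shown only to make the intended call sequence concrete.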
Note that when ``__init_class__`` is called, the name of the class is not
yet bound to the new class object. As a consequence, the two argument form
of ``super()`` cannot be used to call methods (e.g., ``super(Example, cls)``
wouldn't work in the example above). However, the zero argument form of
``super()`` works as expected, since the ``__class__`` reference is already
@ -139,19 +141,47 @@ similar mechanism has long been supported by `Zope's ExtensionClass`_),
but the situation has changed sufficiently in recent years that
the idea is worth reconsidering.
In addition, the introduction of the metaclass ``__prepare__`` method in PEP
3115 allows a further enhancement that was not possible in Python 2: this
PEP also proposes that ``type.__prepare__`` be updated to accept a factory
function as a ``namespace`` keyword-only argument. If present, the value
provided as the ``namespace`` argument will be called without arguments
to create the result of ``type.__prepare__`` instead of using a freshly
created dictionary instance. For example, the following will use
an ordered dictionary as the class namespace::
    class OrderedExample(namespace=collections.OrderedDict):
        def __init_class__(cls):
            # cls.__dict__ is still a read-only proxy to the class namespace,
            # but the underlying storage is an OrderedDict instance
.. note::

   This PEP, along with the existing ability to use ``__prepare__`` to share a
   single namespace amongst multiple class objects, highlights a possible
   issue with the attribute lookup caching: when the underlying mapping is
   updated by other means, the attribute lookup cache is not invalidated
   correctly (this is a key part of the reason class ``__dict__`` attributes
   produce a read-only view of the underlying storage).

   Since the optimisation provided by that cache is highly desirable,
   the use of a preexisting namespace as the class namespace may need to
   be declared as officially unsupported (since the observed behaviour is
   rather strange when the caches get out of sync).
Key Benefits
============
Easier use of custom namespaces for a class
-------------------------------------------
Currently, to use a different type (such as ``collections.OrderedDict``) for
a class namespace, or to use a pre-populated namespace, it is necessary to
write and use a custom metaclass. With this PEP, using a custom namespace
becomes as simple as specifying an appropriate factory function in the
class header.
Easier inheritance of definition time behaviour
@ -201,137 +231,124 @@ implicit ``__class__`` reference introduced by PEP 3135, including methods
that use the zero argument form of ``super()``.
Replaces many use cases for dynamic setting of ``__metaclass__``
-----------------------------------------------------------------
For use cases that don't involve completely replacing the defined class,
Python 2 code that dynamically set ``__metaclass__`` can now dynamically
set ``__init_class__`` instead. For more advanced use cases, introduction of
an explicit metaclass (possibly made available as a required base class) will
still be necessary in order to support Python 3.
New Ways of Using Classes
=========================
The new ``namespace`` keyword in the class header enables a number of
interesting options for controlling the way a class is initialised,
including some aspects of the object models of both Javascript and Ruby.
All of the examples below are actually possible today through the use of a
custom metaclass::
    class CustomNamespace(type):
        @classmethod
        def __prepare__(meta, name, bases, *, namespace=None, **kwds):
            parent_namespace = super().__prepare__(name, bases, **kwds)
            return namespace() if namespace is not None else parent_namespace
        def __new__(meta, name, bases, ns, *, namespace=None, **kwds):
            return super().__new__(meta, name, bases, ns, **kwds)
        def __init__(cls, name, bases, ns, *, namespace=None, **kwds):
            return super().__init__(name, bases, ns, **kwds)
The advantage of implementing the new keyword directly in
``type.__prepare__`` is that the *only* persistent effect is then
the change in the underlying storage of the class attributes. The metaclass
of the class remains unchanged, eliminating many of the drawbacks
typically associated with these kinds of customisations.
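For reference, the metaclass-based equivalent shown above can be exercised
today. The following is a minimal runnable sketch; the ``_member_order``
attribute is our own addition, purely to make the captured definition order
visible (with the emulation, ``type.__new__`` still copies the namespace into
a plain dict, so the order must be recorded explicitly):

```python
import collections

class CustomNamespace(type):
    @classmethod
    def __prepare__(meta, name, bases, *, namespace=None, **kwds):
        parent_namespace = super().__prepare__(name, bases, **kwds)
        return namespace() if namespace is not None else parent_namespace
    def __new__(meta, name, bases, ns, *, namespace=None, **kwds):
        cls = super().__new__(meta, name, bases, ns, **kwds)
        # record the definition order observed in the custom namespace
        cls._member_order = [k for k in ns if not k.startswith('_')]
        return cls
    def __init__(cls, name, bases, ns, *, namespace=None, **kwds):
        super().__init__(name, bases, ns, **kwds)

class OrderedExample(metaclass=CustomNamespace,
                     namespace=collections.OrderedDict):
    b = 2
    a = 1
    c = 3

print(OrderedExample._member_order)  # ['b', 'a', 'c']
```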
Order preserving classes
------------------------
::
    class OrderedClass(namespace=collections.OrderedDict):
        a = 1
        b = 2
        c = 3
Prepopulated namespaces
-----------------------
::
    seed_data = dict(a=1, b=2, c=3)

    class PrepopulatedClass(namespace=seed_data.copy):
        pass
Cloning a prototype class
-------------------------
::
    class NewClass(namespace=Prototype.__dict__.copy):
        pass
Extending a class
-----------------
.. note:: Just because the PEP makes it *possible* to do this relatively
   cleanly doesn't mean anyone *should* do this!
::
    from collections import MutableMapping

    # The MutableMapping + dict combination should give something that
    # generally behaves correctly as a mapping, while still being accepted
    # as a class namespace
    class ClassNamespace(MutableMapping, dict):
        def __init__(self, cls):
            self._cls = cls
        def __len__(self):
            return len(dir(self._cls))
        def __iter__(self):
            for attr in dir(self._cls):
                yield attr
        def __contains__(self, attr):
            return hasattr(self._cls, attr)
        def __getitem__(self, attr):
            return getattr(self._cls, attr)
        def __setitem__(self, attr, value):
            setattr(self._cls, attr, value)
        def __delitem__(self, attr):
            delattr(self._cls, attr)

    def extend(cls):
        return lambda: ClassNamespace(cls)

    class Example:
        pass

    class ExtendedExample(namespace=extend(Example)):
        a = 1
        b = 2
        c = 3

    >>> Example.a, Example.b, Example.c
    (1, 2, 3)
Rejected Design Options
=======================
Calling ``__init_class__`` from ``type.__init__``
-------------------------------------------------
Calling the new hook automatically from ``type.__init__`` would achieve most
of the goals of this PEP. However, using that approach would mean that
@ -340,11 +357,43 @@ relied on the ``__class__`` reference (or used the zero-argument form of
``super()``), and could not make use of those features themselves.
Requiring an explicit decorator on ``__init_class__``
-----------------------------------------------------
Originally, this PEP required the explicit use of ``@classmethod`` on the
``__init_class__`` method. It was made implicit since there's no
sensible interpretation for leaving it out, and that case would need to be
detected anyway in order to give a useful error message.
This decision was reinforced after noticing that the user experience of
defining ``__prepare__`` and forgetting the ``@classmethod`` method
decorator is singularly incomprehensible (particularly since PEP 3115
documents it as an ordinary method, and the current documentation doesn't
explicitly say anything one way or the other).
Passing in the namespace directly rather than a factory function
----------------------------------------------------------------
At one point, this PEP proposed that the class namespace be passed
directly as a keyword argument, rather than passing a factory function.
However, this encourages an unsupported behaviour (that is, passing the
same namespace to multiple classes, or retaining direct write access
to a mapping used as a class namespace), so the API was switched to
the factory function version.
Reference Implementation
========================
A reference implementation for ``__init_class__`` has been posted to the
`issue tracker`_. It does not yet include the new ``namespace`` parameter
for ``type.__prepare__``.
TODO
====
* address the 5 points in http://mail.python.org/pipermail/python-dev/2013-February/123970.html
References
==========


@ -343,10 +343,10 @@ Examples::
Provides-Extra (multiple use)
-----------------------------
A string containing the name of an optional feature or "extra" that may
only be available when additional dependencies have been installed. Must
be printable ASCII, not containing whitespace, comma (,), or square
brackets [].
See `Optional Features`_ for details on the use of this field.
@ -861,7 +861,7 @@ ordered as shown::
Within a post-release (``1.0.post1``), the following suffixes are permitted
and are ordered as shown::
    .devN, <no suffix>
Note that ``devN`` and ``postN`` must always be preceded by a dot, even
when used immediately following a numeric version (e.g. ``1.0.dev456``,
@ -976,8 +976,9 @@ Date based versions
As with other incompatible version schemes, date based versions can be
stored in the ``Private-Version`` field. Translating them to a compliant
public version is straightforward: the simplest approach is to subtract
the year before the first release from the major component in the release
number.
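As an illustration of the arithmetic (the helper below is hypothetical, not
part of the metadata specification): a project whose first date based release
was in 2007 would subtract 2006, so its first release maps to major version 1
and a "2013.2" release maps to the public version "7.2".

```python
def date_based_to_public(year, minor, first_release_year):
    # subtract the year *before* the first release, so the first
    # release year itself maps to major version 1
    return "{}.{}".format(year - (first_release_year - 1), minor)

print(date_based_to_public(2013, 2, 2007))  # 7.2
```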
Version specifiers
@ -994,48 +995,93 @@ Each version identifier must be in the standard format described in
The comma (",") is equivalent to a logical **and** operator.
Whitespace between a conditional operator and the following version
identifier is optional, as is the whitespace around the commas.
Compatible release
------------------
A compatible release clause omits the comparison operator and matches any
version that is expected to be compatible with the specified version.
For a given release identifier ``V.N``, the compatible release clause is
approximately equivalent to the pair of comparison clauses::
    >= V.N, < V+1.dev0
where ``V+1`` is the next version after ``V``, as determined by
incrementing the last numeric component in ``V``. For example,
the following version clauses are approximately equivalent::
    2.2
    >= 2.2, < 3.dev0

    1.4.5
    >= 1.4.5, < 1.5.dev0
The difference between the two is that using a compatible release clause
does *not* count as `explicitly mentioning a pre-release`__.
__ `Handling of pre-releases`_
If a pre-release, post-release or developmental release is named in a
compatible release clause as ``V.N.suffix``, then the suffix is ignored
when determining the upper limit of compatibility::
    2.2.post3
    >= 2.2.post3, < 3.dev0

    1.4.5a4
    >= 1.4.5a4, < 1.5.dev0
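The expansion rule can be sketched as follows. This is an illustrative helper
of ours, not the specification's reference implementation:

```python
import re

def compatible_release_bounds(clause):
    """Approximate a compatible release clause V as (>= V, < V+1.dev0).

    Any pre-release, post-release or developmental suffix is ignored
    when computing the upper limit, per the rules above.
    """
    # keep only the numeric release portion for the upper bound
    release = re.match(r"\d+(\.\d+)*", clause).group(0)
    parts = release.split(".")
    # drop the final component and increment the one before it
    upper = parts[:-1] or ["0"]
    upper[-1] = str(int(upper[-1]) + 1)
    return ">= " + clause, "< " + ".".join(upper) + ".dev0"

print(compatible_release_bounds("2.2"))      # ('>= 2.2', '< 3.dev0')
print(compatible_release_bounds("1.4.5a4"))  # ('>= 1.4.5a4', '< 1.5.dev0')
```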
Version comparisons
-------------------
A version comparison clause includes a comparison operator and a version
identifier, and will match any version where the comparison is true.
Comparison clauses are only needed to cover cases which cannot be handled
with an appropriate compatible release clause, including coping with
dependencies which do not have a robust backwards compatibility policy
and thus break the assumptions of a compatible release clause.
The defined comparison operators are ``<``, ``>``, ``<=``, ``>=``, ``==``,
and ``!=``.
The ordered comparison operators ``<``, ``>``, ``<=``, ``>=`` are based
on the consistent ordering defined by the standard `Version scheme`_.
The ``==`` and ``!=`` operators are based on string comparisons - in order
to match, the version being checked must start with exactly that sequence of
characters.
.. note::

   The use of ``==`` when defining dependencies for published distributions
   is strongly discouraged, as it greatly complicates the deployment of
   security fixes (the strict version comparison operator is intended
   primarily for use when defining dependencies for particular
   applications while using a shared distribution index).
Handling of pre-releases
------------------------
Pre-releases of any kind, including developmental releases, are implicitly
excluded from all version specifiers, *unless* a pre-release or developmental
release is explicitly mentioned in one of the clauses. For example, these
specifiers implicitly exclude all pre-releases and development
releases of later versions::
    2.2
    >= 1.0
While these specifiers would include at least some of them::
    2.2.dev0
    2.2, != 2.3b2
    >= 1.0a1
    >= 1.0c1
    >= 1.0, != 1.0b2
@ -1053,37 +1099,26 @@ controlled on a per-distribution basis.
Post-releases and purely numeric releases receive no special treatment -
they are always included unless explicitly excluded.
Given the above rules, projects which include the ``.0`` suffix for the
first release in a series, such as ``2.5.0``, can easily refer specifically
to that version with the clause ``2.5.0``, while the clause ``2.5`` refers
to that entire series. Projects which omit the ".0" suffix for the first
release of a series, by using a version string like ``2.5`` rather than
``2.5.0``, will need to use an explicit clause like ``>= 2.5, < 2.5.1`` to
refer specifically to that initial release.
Examples
--------
* ``Requires-Dist: zope.interface (3.1)``: version 3.1 or later, but not
version 4.0 or later. Excludes pre-releases and developmental releases.
* ``Requires-Dist: zope.interface (3.1.0)``: version 3.1.0 or later, but not
version 3.2.0 or later. Excludes pre-releases and developmental releases.
* ``Requires-Dist: zope.interface (3.1.0,!=3.1.3)``: version 3.1.0 or later,
but not version 3.1.3 and not version 3.2.0 or later. Excludes pre-releases
and developmental releases. For this particular project, this means: "any
version of the 3.1 series but not 3.1.3". This is equivalent to:
``>=3.1, !=3.1.3, <3.2``.
* ``Requires-Dist: zope.interface (==3.1)``: equivalent to ``Requires-Dist:
zope.interface (3.1)``.
* ``Requires-Python: 3``: Any Python 3 version, excluding pre-releases.
* ``Requires-Python: >=2.6,<3``: Any version of Python 2.6 or 2.7, including
post-releases (if they were used for Python). It excludes pre releases of
Python 3.
* ``Requires-Python: 3.3a1``: Any version of Python 3.3+, including
pre-releases like 3.4a1.
@ -1439,10 +1474,10 @@ Changing the interpretation of version specifiers
The previous interpretation of version specifiers made it very easy to
accidentally download a pre-release version of a dependency. This in
turn made it difficult for developers to publish pre-release versions
of software to the Python Package Index, as even marking the package as
hidden wasn't enough to keep automated tools from downloading it, and also
made it harder for users to obtain the test release manually through the
main PyPI web interface.
The previous interpretation also excluded post-releases from some version
specifiers for no adequately justified reason.
@ -1451,6 +1486,16 @@ The updated interpretation is intended to make it difficult to accidentally
accept a pre-release version as satisfying a dependency, while allowing
pre-release versions to be explicitly requested when needed.
The "some forward compatibility assumed" default version constraint is
taken directly from the Ruby community's "pessimistic version constraint"
operator [4]_ to allow projects to take a cautious approach to forward
compatibility promises, while still easily setting a minimum required
version for their dependencies. It is made the default behaviour rather
than needing a separate operator in order to explicitly discourage
overspecification of dependencies by library developers. The explicit
comparison operators remain available to cope with dependencies with
unreliable or non-existent backwards compatibility policies.
Packaging, build and installation dependencies
----------------------------------------------
@ -1548,6 +1593,9 @@ justifications for needing such a standard can be found in PEP 386.
.. [3] Version compatibility analysis script:
http://hg.python.org/peps/file/default/pep-0426/pepsort.py
.. [4] Pessimistic version constraint
http://docs.rubygems.org/read/chapter/16
Appendix A
==========


@ -101,6 +101,15 @@ Generate script wrappers.
accompanying .exe wrappers. Windows installers may want to add them
during install.
Recommended archiver features
'''''''''''''''''''''''''''''
Place ``.dist-info`` at the end of the archive.
Archivers are encouraged to place the ``.dist-info`` files physically
at the end of the archive. This enables some potentially interesting
ZIP tricks including the ability to amend the metadata without
rewriting the entire archive.
File Format
-----------
@ -149,9 +158,14 @@ non-alphanumeric characters with an underscore ``_``::
    re.sub(r"[^\w\d.]+", "_", distribution, flags=re.UNICODE)
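For example (the ``escape`` wrapper name is ours; note that ``re.UNICODE``
must be supplied via the ``flags`` keyword, since passing it positionally
would be interpreted as ``re.sub``'s ``count`` argument):

```python
import re

def escape(distribution):
    # collapse each run of characters outside word characters and "."
    # into a single underscore
    return re.sub(r"[^\w\d.]+", "_", distribution, flags=re.UNICODE)

print(escape("my-example dist"))  # my_example_dist
```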
The archive filename is Unicode. It will be some time before the tools
are updated to support non-ASCII filenames, but they are supported in
this specification.
The filenames *inside* the archive are encoded as UTF-8. Although some
ZIP clients in common use do not properly display UTF-8 filenames,
the encoding is supported by both the ZIP specification and Python's
``zipfile``.
File contents
'''''''''''''


@ -55,6 +55,15 @@ rejection of :pep:`355`.
.. _`Unipath`: https://bitbucket.org/sluggo/unipath/overview
Implementation
==============
The implementation of this proposal is tracked in the ``pep428`` branch
of pathlib's `Mercurial repository`_.
.. _`Mercurial repository`: https://bitbucket.org/pitrou/pathlib/
Why an object-oriented API
==========================
@ -341,6 +350,15 @@ call ``bytes()`` on it, or use the ``as_bytes()`` method::
    >>> bytes(p)
    b'/home/antoine/pathlib/setup.py'
To represent the path as a ``file`` URI, call the ``as_uri()`` method::
    >>> p = PurePosixPath('/etc/passwd')
    >>> p.as_uri()
    'file:///etc/passwd'
    >>> p = PureNTPath('c:/Windows')
    >>> p.as_uri()
    'file:///c:/Windows'
Properties
----------

pep-0435.txt (new file, 542 lines)

@ -0,0 +1,542 @@
PEP: 435
Title: Adding an Enum type to the Python standard library
Version: $Revision$
Last-Modified: $Date$
Author: Barry Warsaw <barry@python.org>,
Eli Bendersky <eliben@gmail.com>
Status: Draft
Type: Standards Track
Content-Type: text/x-rst
Created: 2013-02-23
Python-Version: 3.4
Post-History: 2013-02-23
Abstract
========
This PEP proposes adding an enumeration type to the Python standard library.
Specifically, it proposes moving the existing ``flufl.enum`` package by Barry
Warsaw into the standard library. Much of this PEP is based on the "using"
[1]_ document from the documentation of ``flufl.enum``.
An enumeration is a set of symbolic names bound to unique, constant integer
values. Within an enumeration, the values can be compared by identity, and the
enumeration itself can be iterated over. Enumeration items can be converted to
and from their integer equivalents, supporting use cases such as storing
enumeration values in a database.
Status of discussions
=====================
The idea of adding an enum type to Python is not new - PEP 354 [2]_ is a
previous attempt that was rejected in 2005. Recently a new set of discussions
was initiated [3]_ on the ``python-ideas`` mailing list. Many new ideas were
proposed in several threads; after a lengthy discussion Guido proposed adding
``flufl.enum`` to the standard library [4]_. This PEP is an attempt to
formalize this decision as well as discuss a number of variations that can be
considered for inclusion.
Motivation
==========
*[Based partly on the Motivation stated in PEP 354]*
The properties of an enumeration are useful for defining an immutable, related
set of constant values that have a defined sequence but no inherent semantic
meaning. Classic examples are days of the week (Sunday through Saturday) and
school assessment grades ('A' through 'D', and 'F'). Other examples include
error status values and states within a defined process.
It is possible to simply define a sequence of values of some other basic type,
such as ``int`` or ``str``, to represent discrete arbitrary values. However,
an enumeration ensures that such values are distinct from any others including,
importantly, values within other enumerations, and that operations without
meaning ("Wednesday times two") are not defined for these values. It also
provides a convenient printable representation of enum values without requiring
tedious repetition while defining them (i.e. no ``GREEN = 'green'``).
Module and type name
====================
We propose to add a module named ``enum`` to the standard library. The main
type exposed by this module is ``Enum``. Hence, to import the ``Enum`` type
user code will run::
>>> from enum import Enum
Proposed semantics for the new enumeration type
===============================================
Creating an Enum
----------------
Enumerations are created using the class syntax, which makes them easy to read
and write. Every enumeration value must have a unique integer value and the
only restriction on their names is that they must be valid Python identifiers.
To define an enumeration, derive from the ``Enum`` class and add attributes
with assignment to their integer values::
>>> from enum import Enum
>>> class Colors(Enum):
... red = 1
... green = 2
... blue = 3
Enumeration values are compared by identity::
>>> Colors.red is Colors.red
True
>>> Colors.blue is Colors.blue
True
>>> Colors.red is not Colors.blue
True
>>> Colors.blue is Colors.red
False
Enumeration values have nice, human readable string representations::
>>> print(Colors.red)
Colors.red
...while their repr has more information::
>>> print(repr(Colors.red))
<EnumValue: Colors.red [int=1]>
The enumeration value names are available through the class members::
>>> for member in Colors.__members__:
... print(member)
red
green
blue
Let's say you wanted to encode an enumeration value in a database. You might
want to get the enumeration class object from an enumeration value::
>>> cls = Colors.red.enum
>>> print(cls.__name__)
Colors
Enums also have a property that contains just their item name::
>>> print(Colors.red.name)
red
>>> print(Colors.green.name)
green
>>> print(Colors.blue.name)
blue
The str and repr of the enumeration class also provide useful information::
>>> print(Colors)
<Colors {red: 1, green: 2, blue: 3}>
>>> print(repr(Colors))
<Colors {red: 1, green: 2, blue: 3}>
You can extend previously defined Enums by subclassing::
>>> class MoreColors(Colors):
... pink = 4
... cyan = 5
When extended in this way, the base enumeration's values are identical to the
same named values in the derived class::
>>> Colors.red is MoreColors.red
True
>>> Colors.blue is MoreColors.blue
True
However, this identity is not based on the integer equivalent values: if you
define a separate enumeration with the same item names and integer values,
its items will not be identical to these::
>>> class OtherColors(Enum):
... red = 1
... blue = 2
... yellow = 3
>>> Colors.red is OtherColors.red
False
>>> Colors.blue is not OtherColors.blue
True
These enumeration values are not equal, nor do they hash equally::
>>> Colors.red == OtherColors.red
False
>>> len(set((Colors.red, OtherColors.red)))
2
Ordered comparisons between enumeration values are *not* supported. Enums are
not integers::
>>> Colors.red < Colors.blue
Traceback (most recent call last):
...
NotImplementedError
>>> Colors.red <= Colors.blue
Traceback (most recent call last):
...
NotImplementedError
>>> Colors.blue > Colors.green
Traceback (most recent call last):
...
NotImplementedError
>>> Colors.blue >= Colors.green
Traceback (most recent call last):
...
NotImplementedError
Equality comparisons are defined though::
>>> Colors.blue == Colors.blue
True
>>> Colors.green != Colors.blue
True
Ordered comparisons against non-enumeration values raise exceptions as well::
>>> Colors.red < 3
Traceback (most recent call last):
...
NotImplementedError
>>> Colors.red <= 3
Traceback (most recent call last):
...
NotImplementedError
>>> Colors.blue > 2
Traceback (most recent call last):
...
NotImplementedError
>>> Colors.blue >= 2
Traceback (most recent call last):
...
NotImplementedError
While equality comparisons are allowed, comparisons against non-enumeration
values will always compare not equal::
>>> Colors.green == 2
False
>>> Colors.blue == 3
False
>>> Colors.green != 3
True
>>> Colors.green == 'green'
False
If you really want the integer equivalent values, you can convert enumeration
values explicitly using the ``int()`` built-in. This is quite convenient for
storing enums in a database, as well as for interoperability with C extensions
that expect integers::
>>> int(Colors.red)
1
>>> int(Colors.green)
2
>>> int(Colors.blue)
3
You can also convert back to the enumeration value by calling the Enum
subclass, passing in the integer value for the item you want::
>>> Colors(1)
<EnumValue: Colors.red [int=1]>
>>> Colors(2)
<EnumValue: Colors.green [int=2]>
>>> Colors(3)
<EnumValue: Colors.blue [int=3]>
>>> Colors(1) is Colors.red
True
The Enum subclass also accepts the string name of the enumeration value::
>>> Colors('red')
<EnumValue: Colors.red [int=1]>
>>> Colors('blue') is Colors.blue
True
You get exceptions though, if you try to use invalid arguments::
>>> Colors('magenta')
Traceback (most recent call last):
...
ValueError: magenta
>>> Colors(99)
Traceback (most recent call last):
...
ValueError: 99
The Enum base class also supports getitem syntax, exactly equivalent to the
class's call semantics::
>>> Colors[1]
<EnumValue: Colors.red [int=1]>
>>> Colors[2]
<EnumValue: Colors.green [int=2]>
>>> Colors[3]
<EnumValue: Colors.blue [int=3]>
>>> Colors[1] is Colors.red
True
>>> Colors['red']
<EnumValue: Colors.red [int=1]>
>>> Colors['blue'] is Colors.blue
True
>>> Colors['magenta']
Traceback (most recent call last):
...
ValueError: magenta
>>> Colors[99]
Traceback (most recent call last):
...
ValueError: 99
The integer equivalent values serve another purpose. You may not define two
enumeration values with the same integer value::
>>> class Bad(Enum):
... cartman = 1
... stan = 2
... kyle = 3
... kenny = 3 # Oops!
... butters = 4
Traceback (most recent call last):
...
TypeError: Multiple enum values: 3
You also may not duplicate values in derived enumerations::
>>> class BadColors(Colors):
... yellow = 4
... chartreuse = 2 # Oops!
Traceback (most recent call last):
...
TypeError: Multiple enum values: 2
The Enum class supports iteration. Enumeration values are returned in the
sorted order of their integer equivalent values::
>>> [v.name for v in MoreColors]
['red', 'green', 'blue', 'pink', 'cyan']
>>> [int(v) for v in MoreColors]
[1, 2, 3, 4, 5]
Enumeration values are hashable, so they can be used in dictionaries and sets::
>>> apples = {}
>>> apples[Colors.red] = 'red delicious'
>>> apples[Colors.green] = 'granny smith'
>>> for color in sorted(apples, key=int):
... print(color.name, '->', apples[color])
red -> red delicious
green -> granny smith
Pickling
--------
Enumerations created with the class syntax can also be pickled and unpickled::
>>> from enum.tests.fruit import Fruit
>>> from pickle import dumps, loads
>>> Fruit.tomato is loads(dumps(Fruit.tomato))
True
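The mechanics behind this round trip can be sketched without the actual
``enum`` package. The following is a minimal stand-in (all names invented for
illustration; ``flufl.enum``'s real implementation may differ) in which
``__reduce__`` serializes only the value's name, and unpickling resolves the
name back to the one existing singleton:

```python
import pickle

_registry = {}

def _lookup(name):
    # Module-level so pickle can reference it by qualified name.
    return _registry[name]

class Fruit:
    """Minimal singleton type mimicking enum-value pickling."""
    def __init__(self, name):
        self.name = name
        _registry[name] = self

    def __reduce__(self):
        # Serialize just the name; loads() resolves it back to the
        # existing instance, preserving identity.
        return (_lookup, (self.name,))

tomato = Fruit('tomato')
assert pickle.loads(pickle.dumps(tomato)) is tomato
```

Because only the name crosses the pickle boundary, identity comparisons keep
working after a dump/load cycle.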
Convenience API
---------------
You can also create enumerations using the convenience function ``make()``,
which takes an iterable object or dictionary to provide the item names and
values. ``make()`` is a module-level function.
The first argument to ``make()`` is the name of the enumeration, and it returns
the so-named ``Enum`` subclass. The second argument is a *source*, which can
be either an iterable or a dictionary. In the most basic usage, *source* is a
sequence of strings naming the enumeration items. In this case, the values
are automatically assigned starting from 1::
>>> import enum
>>> enum.make('Animals', ('ant', 'bee', 'cat', 'dog'))
<Animals {ant: 1, bee: 2, cat: 3, dog: 4}>
The items in *source* can also be 2-tuples, where the first item is the
enumeration value name and the second is the integer value to assign to it.
If 2-tuples are used, all items must be 2-tuples::
>>> def enumiter():
... start = 1
... while True:
... yield start
... start <<= 1
>>> enum.make('Flags', zip(list('abcdefg'), enumiter()))
<Flags {a: 1, b: 2, c: 4, d: 8, e: 16, f: 32, g: 64}>
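A rough, self-contained approximation of what ``make()`` does (not the actual
``flufl.enum`` code; for simplicity it builds a plain class with ``type()``
rather than a true ``Enum`` subclass, and ``make_sketch`` is an invented name)
might look like:

```python
def make_sketch(name, source):
    """Illustrative stand-in for enum.make()."""
    if isinstance(source, dict):
        items = dict(source)
    else:
        pairs = list(source)
        if pairs and isinstance(pairs[0], tuple):
            # Explicit (name, value) 2-tuples.
            items = dict(pairs)
        else:
            # Bare names: auto-assign values starting from 1.
            items = {n: i for i, n in enumerate(pairs, 1)}
    return type(name, (), items)

Animals = make_sketch('Animals', ('ant', 'bee', 'cat', 'dog'))
assert (Animals.ant, Animals.dog) == (1, 4)

Flags = make_sketch('Flags', zip('abc', (1, 2, 4)))
assert Flags.c == 4
```

The real ``make()`` additionally enforces unique values and produces items
with the identity and conversion semantics described above.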
Proposed variations
===================
Some variations were proposed during the discussions on the mailing list.
Here are some of the more popular ones.
Not having to specify values for enums
--------------------------------------
Michael Foord proposed (and Tim Delaney provided a proof-of-concept
implementation) to use metaclass magic that makes this possible::
class Color(Enum):
red, green, blue
The values are actually assigned only when first looked up.
Pros: cleaner syntax that requires less typing for a very common task (just
listing enumeration names without caring about the values).
Cons: involves much magic in the implementation, which makes even the
definition of such enums baffling when first seen. Besides, explicit is
better than implicit.
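The kind of metaclass magic involved can be sketched as follows (a simplified
reconstruction, not Tim Delaney's actual code): a custom ``__prepare__``
namespace assigns the next integer the first time an unknown name is looked
up in the class body:

```python
class AutoNamespace(dict):
    """Class-body namespace that assigns the next integer to any
    unknown (non-dunder) name the first time it is looked up."""
    def __init__(self):
        super().__init__()
        self._next = 1

    def __missing__(self, key):
        if key.startswith('__'):
            # Let dunder lookups (e.g. __name__) fall through normally.
            raise KeyError(key)
        self[key] = self._next
        self._next += 1
        return self[key]

class AutoEnumMeta(type):
    @classmethod
    def __prepare__(mcls, name, bases):
        return AutoNamespace()

class Color(metaclass=AutoEnumMeta):
    red, green, blue   # a bare expression; each lookup assigns a value

assert (Color.red, Color.green, Color.blue) == (1, 2, 3)
```

The "baffling when first seen" criticism is visible here: ``red, green, blue``
is just a tuple expression, and the assignments happen as a side effect of
name lookup.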
Using special names or forms to auto-assign enum values
-------------------------------------------------------
A different approach to avoid specifying enum values is to use a special name
or form to auto-assign them. For example::
class Color(Enum):
red = None # auto-assigned to 0
green = None # auto-assigned to 1
blue = None # auto-assigned to 2
More flexibly::
class Color(Enum):
red = 7
green = None # auto-assigned to 8
blue = 19
purple = None # auto-assigned to 20
Some variations on this theme:
#. A special name ``auto`` imported from the enum package.
#. Georg Brandl proposed ellipsis (``...``) instead of ``None`` to achieve the
same effect.
Pros: no need to manually enter values. Makes it easier to change the enum and
extend it, especially for large enumerations.
Cons: actually longer to type in many simple cases. The argument of explicit
vs. implicit applies here as well.
Use-cases in the standard library
=================================
The Python standard library has many places where the usage of enums would be
beneficial to replace other idioms currently used to represent them. Such
usages can be divided into two categories: user-code facing constants, and
internal constants.
User-code facing constants like ``os.SEEK_*``, ``socket`` module constants,
decimal rounding modes and HTTP error codes could have benefited from being
enums had they been implemented this way from the beginning. At this point,
however, such a change cannot be made without risking breakage of user code
that relies on the constants' actual values rather than their meaning. This
does not mean that future additions to the stdlib can't use enums for
defining new user-code facing constants.
Internal constants are not seen by user code but are employed internally by
stdlib modules. It appears that nothing should stand in the way of
implementing such constants with enums. Some examples uncovered by a very
partial skim through the stdlib: ``binhex``, ``imaplib``, ``http/client``,
``urllib/robotparser``, ``idlelib``, ``concurrent.futures``, ``turtledemo``.
In addition, the code of the Twisted library contains many use cases for
replacing internal state constants with enums. The same can be said of a lot
of networking code (especially protocol implementations), as seen in test
protocols written with the Tulip library.
Differences from PEP 354
========================
Unlike PEP 354, enumeration values are not defined as a sequence of strings,
but as attributes of a class. This design was chosen because it was felt that
class syntax is more readable.
Unlike PEP 354, enumeration values require an explicit integer value. This
difference recognizes that enumerations often represent real-world values, or
must interoperate with external real-world systems. For example, to store an
enumeration in a database, it is better to convert it to an integer on the way
in and back to an enumeration on the way out. Providing an integer value also
provides an explicit ordering. However, there is no automatic conversion to
and from the integer values, because explicit is better than implicit.
Unlike PEP 354, this implementation does use a metaclass to define the
enumeration's syntax, and allows for extended base-enumerations so that the
common values in derived classes are identical (a singleton model). While PEP
354 dismisses this approach for its complexity, in practice any perceived
complexity, though minimal, is hidden from users of the enumeration.
Unlike PEP 354, enumeration values can only be tested by identity comparison.
This is to emphasize the fact that enumeration values are singletons, much
like ``None``.
Acknowledgments
===============
This PEP describes the ``flufl.enum`` package by Barry Warsaw. ``flufl.enum``
is based on an example by Jeremy Hylton. It has been modified and extended
by Barry Warsaw for use in the GNU Mailman [5]_ project. Ben Finney is the
author of the earlier enumeration PEP 354.
References
==========
.. [1] http://pythonhosted.org/flufl.enum/docs/using.html
.. [2] http://www.python.org/dev/peps/pep-0354/
.. [3] http://mail.python.org/pipermail/python-ideas/2013-January/019003.html
.. [4] http://mail.python.org/pipermail/python-ideas/2013-February/019373.html
.. [5] http://www.list.org
Copyright
=========
This document has been placed in the public domain.
Todo
====
* Mark PEP 354 "superseded by" this one, if accepted
* New package name within stdlib - enum? (top-level)
* For make, can we add an API like namedtuple's?
make('Animals', 'ant bee cat dog')
I.e. when make sees a string argument it splits it, making it similar to a
tuple but with far less manual quote typing. OTOH, it just saves a
``.split``, so it may not be worth the effort.
..
Local Variables:
mode: indented-text
indent-tabs-mode: nil
sentence-end-double-space: t
fill-column: 70
coding: utf-8
End:

pep-0436.txt
PEP: 436
Title: The Argument Clinic DSL
Version: $Revision$
Last-Modified: $Date$
Author: Larry Hastings <larry@hastings.org>
Discussions-To: Python-Dev <python-dev@python.org>
Status: Draft
Type: Standards Track
Content-Type: text/x-rst
Created: 22-Feb-2013
Abstract
========
This document proposes "Argument Clinic", a DSL designed to facilitate
argument processing for built-in functions in the implementation of
CPython.
Rationale and Goals
===================
The primary implementation of Python, "CPython", is written in a
mixture of Python and C. One of the implementation details of CPython
is what are called "built-in" functions -- functions available to
Python programs but written in C. When a Python program calls a
built-in function and passes in arguments, those arguments must be
translated from Python values into C values. This process is called
"parsing arguments".
As of CPython 3.3, arguments to functions are primarily parsed with
one of two functions: the original ``PyArg_ParseTuple()``, [1]_ and
the more modern ``PyArg_ParseTupleAndKeywords()``. [2]_ The former
function only handles positional parameters; the latter also
accommodates keyword and keyword-only parameters, and is preferred for
new code.
``PyArg_ParseTuple()`` was a reasonable approach when it was first
conceived. The programmer specified the translation for the arguments
in a "format string": [3]_ each parameter matched to a "format unit",
a one-or-two character sequence telling ``PyArg_ParseTuple()`` what
Python types to accept and how to translate them into the appropriate
C value for that parameter. There were only a dozen or so of these
"format units", and each one was distinct and easy to understand.
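For example, a builtin taking one required string and one optional int could
parse its arguments like this (an illustrative fragment only; ``s`` accepts a
string, ``i`` an int, ``|`` marks the start of optional arguments, and the
``:myfunc`` suffix names the function in error messages):

```c
/* Hypothetical builtin: myfunc(name, count=1) */
const char *name;
int count = 1;

if (!PyArg_ParseTuple(args, "s|i:myfunc", &name, &count))
    return NULL;
/* name and count now hold C values translated from the Python args. */
```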
Over the years the ``PyArg_Parse`` interface has been extended in
numerous ways. The modern API is quite complex, to the point that it
is somewhat painful to use. Consider:
* There are now forty different "format units"; a few are even three
characters long. This makes it difficult to understand what the
format string says without constantly cross-indexing it with the
documentation.
* There are also six meta-format units that may be buried in the
format string. (They are: ``"()|$:;"``.)
* The more format units are added, the less likely it is the
implementer can pick an easy-to-use mnemonic for the format unit,
because the character of choice is probably already in use. In
other words, the more format units we have, the more obtuse the
format units become.
* Several format units are nearly identical to others, having only
subtle differences. This makes understanding the exact semantics
of the format string even harder.
* The docstring is specified as a static C string, which is mildly
bothersome to read and edit.
* When adding a new parameter to a function using
``PyArg_ParseTupleAndKeywords()``, it's necessary to touch six
different places in the code: [4]_
* Declaring the variable to store the argument.
* Passing in a pointer to that variable in the correct spot in
``PyArg_ParseTupleAndKeywords()``, also passing in any
"length" or "converter" arguments in the correct order.
* Adding the name of the argument in the correct spot of the
"keywords" array passed in to
``PyArg_ParseTupleAndKeywords()``.
* Adding the format unit to the correct spot in the format
string.
* Adding the parameter to the prototype in the docstring.
* Documenting the parameter in the docstring.
* There is currently no mechanism for builtin functions to provide
their "signature" information (see ``inspect.getfullargspec`` and
``inspect.Signature``). Adding this information using a mechanism
similar to the existing ``PyArg_Parse`` functions would require
repeating ourselves yet again.
The goal of Argument Clinic is to replace this API with a mechanism
inheriting none of these downsides:
* You need specify each parameter only once.
* All information about a parameter is kept together in one place.
* For each parameter, you specify its type in C; Argument Clinic
handles the translation from Python value into C value for you.
* Argument Clinic also allows for fine-tuning of argument processing
behavior with highly-readable "flags", both per-parameter and
applying across the whole function.
* Docstrings are written in plain text.
* From this, Argument Clinic generates for you all the mundane,
repetitious code and data structures CPython needs internally.
Once you've specified the interface, the next step is simply to
write your implementation using native C types. Every detail of
argument parsing is handled for you.
Future goals of Argument Clinic include:
* providing signature information for builtins, and
* speed improvements to the generated code.
DSL Syntax Summary
==================
The Argument Clinic DSL is specified as a comment embedded in a C
file, as follows. The "Example" column on the right shows you sample
input to the Argument Clinic DSL, and the "Section" column on the left
specifies what each line represents in turn.
::
+-----------------------+-----------------------------------------------------+
| Section | Example |
+-----------------------+-----------------------------------------------------+
| Clinic DSL start | /*[clinic] |
| Function declaration | module.function_name -> return_annotation |
| Function flags | flag flag2 flag3=value |
| Parameter declaration | type name = default |
| Parameter flags | flag flag2 flag3=value |
| Parameter docstring | Lorem ipsum dolor sit amet, consectetur |
| | adipisicing elit, sed do eiusmod tempor |
| Function docstring | Lorem ipsum dolor sit amet, consectetur adipisicing |
| | elit, sed do eiusmod tempor incididunt ut labore et |
| Clinic DSL end | [clinic]*/ |
| Clinic output | ... |
| Clinic output end | /*[clinic end output:<checksum>]*/ |
+-----------------------+-----------------------------------------------------+
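Putting the rows of the table together, a hypothetical Clinic comment (all
names invented for illustration; the checksum line is emitted by Clinic
itself) might look like:

```c
/*[clinic]
mymodule.myfunc -> int

    int x
        The x coordinate.
    int y = 0
        The y coordinate; optional, defaulting to 0.

Compute a value from the given coordinates.
[clinic]*/
... generated argument-processing code appears here ...
/*[clinic end output:<checksum>]*/
```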
General Behavior Of the Argument Clinic DSL
-------------------------------------------
All lines support ``#`` as a line comment delimiter *except*
docstrings. Blank lines are always ignored.
Like Python itself, leading whitespace is significant in the Argument
Clinic DSL. The first line of the "function" section is the
declaration; all subsequent lines at the same indent are function
flags. Once you indent, the first line is a parameter declaration;
subsequent lines at that indent are parameter flags. Indent one more
time for the lines of the parameter docstring. Finally, dedent back
to the same level as the function declaration for the function
docstring.
Function Declaration
--------------------
The return annotation is optional. If skipped, the arrow ("``->``")
must also be omitted.
Parameter Declaration
---------------------
The "type" is a C type. If it's a pointer type, you must specify a
single space between the type and the "``*``", and zero spaces between
the "``*``" and the name. (e.g. "``PyObject *foo``", not "``PyObject*
foo``")
The "name" must be a legal C identifier.
The "default" is a Python value. Default values are optional; if not
specified you must omit the equals sign too. Parameters which don't
have a default are implicitly required. The default value is
dynamically assigned, "live" in the generated C code, and although
it's specified as a Python value, it's translated into a native C
value in the generated C code.
It's explicitly permitted to end the parameter declaration line with a
semicolon, though the semicolon is optional. This is intended to
allow directly cutting and pasting in declarations from C code.
However, the preferred style is without the semicolon.
Flags
-----
"Flags" are like "``make -D``" arguments. They're unordered. Flags
lines are parsed much like the shell (specifically, using
``shlex.split()`` [5]_ ). You can have as many flag lines as you
like. Specifying a flag twice is currently an error.
Supported flags for functions:
``basename``
The basename to use for the generated C functions. By default this
is the name of the function from the DSL, only with periods replaced
by underscores.
``positional-only``
This function only supports positional parameters, not keyword
parameters. See `Functions With Positional-Only Parameters`_ below.
Supported flags for parameters:
``bitwise``
If the Python integer passed in is signed, copy the bits directly
even if it is negative. Only valid for unsigned integer types.
``converter``
Backwards-compatibility support for parameter "converter"
functions. [6]_ The value should be the name of the converter
function in C. Only valid when the type of the parameter is
``void *``.
``default``
The Python value to use in place of the parameter's actual default
in Python contexts. Specifically, when specified, this value will
be used for the parameter's default in the docstring, and in the
``Signature``. (TBD: If the string is a valid Python expression
which can be rendered into a Python value using ``eval()``, then the
result of ``eval()`` on it will be used as the default in the
``Signature``.) Ignored if there is no default.
``encoding``
Encoding to use when encoding a Unicode string to a ``char *``.
Only valid when the type of the parameter is ``char *``.
``group=``
This parameter is part of a group of options that must either all be
specified or none specified. Parameters in the same "group" must be
contiguous. The value of the group flag is the name used for the
group variable, and therefore must be legal as a C identifier. Only
valid for functions marked "``positional-only``"; see `Functions
With Positional-Only Parameters`_ below.
``immutable``
Only accept immutable values.
``keyword-only``
This parameter (and all subsequent parameters) is keyword-only.
Keyword-only parameters must also be optional parameters. Not valid
for positional-only functions.
``length``
This is an iterable type, and we also want its length. The DSL will
generate a second ``Py_ssize_t`` variable; its name will be this
parameter's name appended with "``_length``".
``nullable``
``None`` is a legal argument for this parameter. If ``None`` is
supplied on the Python side, the equivalent C argument will be
``NULL``. Only valid for pointer types.
``required``
Normally any parameter that has a default value is automatically
optional. A parameter that has "required" set will be considered
required (non-optional) even if it has a default value. The
generated documentation will also not show any default value.
``types``
Space-separated list of acceptable Python types for this object.
There are also four special-case types which represent Python
protocols:
* buffer
* mapping
* number
* sequence
``zeroes``
This parameter is a string type, and its value should be allowed to
have embedded zeroes. Not valid for all varieties of string
parameters.
Python Code
-----------
Argument Clinic also permits embedding Python code inside C files,
which is executed in-place when Argument Clinic processes the file.
Embedded code looks like this:
::
/*[python]
# this is python code!
print("/" + "* Hello world! *" + "/")
[python]*/
Any Python code is valid. Python code sections in Argument Clinic can
also be used to modify Clinic's behavior at runtime; for example, see
`Extending Argument Clinic`_.
Output
======
Argument Clinic writes its output in-line in the C file, immediately
after the section of Clinic code. For "python" sections, the output
is everything printed using ``builtins.print``. For "clinic"
sections, the output is valid C code, including:
* a ``#define`` providing the correct ``methoddef`` structure for the
function
* a prototype for the "impl" function -- this is what you'll write
to implement this function
* a function that handles all argument processing, which calls your
"impl" function
* the definition line of the "impl" function
* and a comment indicating the end of output.
The intention is that you will write the body of your impl function
immediately after the output -- as in, you write a left-curly-brace
immediately after the end-of-output comment and write the
implementation of the builtin in the body there. (It's a bit strange
at first, but oddly convenient.)
Argument Clinic will define the parameters of the impl function for
you. The function will take the "self" parameter passed in
originally, all the parameters you define, and possibly some extra
generated parameters ("length" parameters; also "group" parameters,
see next section).
Argument Clinic also writes a checksum for the output section. This
is a valuable safety feature: if you modify the output by hand, Clinic
will notice that the checksum doesn't match, and will refuse to
overwrite the file. (You can force Clinic to overwrite with the
"``-f``" command-line argument; Clinic will also ignore the checksums
when using the "``-o``" command-line argument.)
Functions With Positional-Only Parameters
=========================================
A significant fraction of Python builtins implemented in C use the
older positional-only API for processing arguments
(``PyArg_ParseTuple()``). In some instances, these builtins parse
their arguments differently based on how many arguments were passed
in. This can provide some bewildering flexibility: there may be
groups of optional parameters, which must either all be specified or
none specified. And occasionally these groups are on the *left!* (For
example: ``curses.window.addch()``.)
Argument Clinic supports these legacy use-cases with a special set of
flags. First, set the flag "``positional-only``" on the entire
function. Then, for every group of parameters that is collectively
optional, add a "``group=``" flag with a unique string to all the
parameters in that group. Note that these groups are permitted on the
right *or left* of any required parameters! However, all groups
(including the group of required parameters) must be contiguous.
The impl function generated by Clinic will add an extra parameter for
every group, "``int <group>_group``". This argument will be nonzero
if the group was specified on this call, and zero if it was not.
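For example, a hypothetical positional-only declaration modeled on
``curses.window.addch()`` could mark a left-hand optional group like this
(parameter names, types, and the generated prototype are all invented for
illustration):

```c
/*[clinic]
curses.window.addch
positional-only
    int y
    group=yx
        Y coordinate.
    int x
    group=yx
        X coordinate.
    char ch
        Character to paint.
Paint the character ch at position (y, x).
[clinic]*/

/* Sketch of the generated impl prototype; yx_group is nonzero
   when y and x were supplied on this call. */
static PyObject *
curses_window_addch_impl(PyObject *self, int yx_group,
                         int y, int x, char ch);
```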
Note that when operating in this mode, you cannot specify default
arguments. You can simulate defaults by putting parameters in
individual groups and detecting whether or not they were specified;
generally speaking it's better to simply not use "positional-only"
where it isn't absolutely necessary. (TBD: It might be possible to
relax this restriction. But adding default arguments into the mix of
groups would seemingly make calculating which groups are active a good
deal harder.)
Also, note that it's possible to specify a set of groups to a function
such that there are several valid mappings from the number of
arguments to a valid set of groups. If this happens, Clinic will exit
with an error message. This should not be a problem, as
positional-only operation is only intended for legacy use cases, and
all the legacy functions using this quirky behavior should have
unambiguous mappings.
Current Status
==============
As of this writing, there is a working prototype implementation of
Argument Clinic available online. [7]_ The prototype implements the
syntax above, and generates code using the existing ``PyArg_Parse``
APIs. It supports translating to all current format units except
``"w*"``. Sample functions using Argument Clinic exercise all major
features, including positional-only argument parsing.
Extending Argument Clinic
-------------------------
The prototype also currently provides an experimental extension
mechanism, allowing adding support for new types on-the-fly. See
``Modules/posixmodule.c`` in the prototype for an example of its use.
Notes / TBD
===========
* Guido proposed having the "function docstring" be hand-written inline,
in the middle of the output, something like this:
::
/*[clinic]
... prototype and parameters (including parameter docstrings) go here
[clinic]*/
... some output ...
/*[clinic docstring start]*/
... hand-edited function docstring goes here <-- you edit this by hand!
/*[clinic docstring end]*/
... more output
/*[clinic output end]*/
I tried it this way and don't like it -- I think it's clumsy. I
prefer that everything you write goes in one place, rather than
having an island of hand-edited stuff in the middle of the DSL
output.
* Do we need to support tuple unpacking? (The "``(OOO)``" style
format string.) Boy I sure hope not.
* What about Python functions that take no arguments? This syntax
doesn't provide for that. Perhaps a lone indented "None" should
mean "no arguments"?
* This approach removes some dynamism / flexibility. With the
existing syntax one could theoretically pass in different encodings
at runtime for the "``es``"/"``et``" format units. AFAICT CPython
doesn't do this itself, however it's possible external users might
do this. (Trivia: there are no uses of "``es``" exercised by
regrtest, and all the uses of "``et``" exercised are in
socketmodule.c, except for one in _ssl.c. They're all static,
specifying the encoding ``"idna"``.)
* Right now the "basename" flag on a function changes the ``#define
methoddef`` name too. Should it, or should the #define'd methoddef
name always be ``{module_name}_{function_name}`` ?
References
==========
.. [1] ``PyArg_ParseTuple()``:
http://docs.python.org/3/c-api/arg.html#PyArg_ParseTuple
.. [2] ``PyArg_ParseTupleAndKeywords()``:
http://docs.python.org/3/c-api/arg.html#PyArg_ParseTupleAndKeywords
.. [3] ``PyArg_`` format units:
http://docs.python.org/3/c-api/arg.html#strings-and-buffers
.. [4] Keyword parameters for extension functions:
http://docs.python.org/3/extending/extending.html#keyword-parameters-for-extension-functions
.. [5] ``shlex.split()``:
http://docs.python.org/3/library/shlex.html#shlex.split
.. [6] ``PyArg_`` "converter" functions, see ``"O&"`` in this section:
http://docs.python.org/3/c-api/arg.html#other-objects
.. [7] Argument Clinic prototype:
https://bitbucket.org/larry/python-clinic/
Copyright
=========
This document has been placed in the public domain.
..
Local Variables:
mode: indented-text
indent-tabs-mode: nil
sentence-end-double-space: t
fill-column: 70
coding: utf-8
End: