Update to remove the 100% branch coverage requirement to simply say

comprehensive coverage is needed. Also mention how more abstract C
APIs should be used over type-specific ones to make sure APIs in C
code do not restrict to a specific type when not needed.
This commit is contained in:
Brett Cannon 2011-07-17 17:18:06 -07:00
parent 467a8e1147
commit 74a80ac945
1 changed files with 23 additions and 64 deletions

View File

@ -8,7 +8,7 @@ Type: Informational
Content-Type: text/x-rst
Created: 04-Apr-2011
Python-Version: 3.3
Post-History: 04-Apr-2011, 12-Apr-2011
Post-History: 04-Apr-2011, 12-Apr-2011, 17-Jul-2011
Abstract
========
@ -17,28 +17,28 @@ The Python standard library under CPython contains various instances
of modules implemented in both pure Python and C (either entirely or
partially). This PEP requires that in these instances that the
C code *must* pass the test suite used for the pure Python code
so as to act as much as a drop-in replacement as possible
so as to act as much as a drop-in replacement as reasonably possible
(C- and VM-specific tests are exempt). It is also required that new
C-based modules lacking a pure Python equivalent implementation get
special permissions to be added to the standard library.
special permission to be added to the standard library.
Rationale
=========
Python has grown beyond the CPython virtual machine (VM). IronPython_,
Jython_, and PyPy_ all currently being viable alternatives to the
CPython VM. This VM ecosystem that has sprung up around the Python
Jython_, and PyPy_ are all currently viable alternatives to the
CPython VM. The VM ecosystem that has sprung up around the Python
programming language has led to Python being used in many different
areas where CPython cannot be used, e.g., Jython allowing Python to be
used in Java applications.
A problem all of the VMs other than CPython face is handling modules
from the standard library that are implemented (to some extent) in C.
Since they do not typically support the entire `C API of CPython`_
Since other VMs do not typically support the entire `C API of CPython`_
they are unable to use the code used to create the module. Often times
this leads these other VMs to either re-implement the modules in pure
Python or in the programming language used to implement the VM
Python or in the programming language used to implement the VM itself
(e.g., in C# for IronPython). This duplication of effort between
CPython, PyPy, Jython, and IronPython is extremely unfortunate as
implementing a module *at least* in pure Python would help mitigate
@ -56,45 +56,8 @@ Re-implementing parts (or all) of a module in C (in the case
of CPython) is still allowed for performance reasons, but any such
accelerated code must pass the same test suite (sans VM- or C-specific
tests) to verify semantics and prevent divergence. To accomplish this,
the test suite for the module must have 100% branch coverage of the
the test suite for the module must have comprehensive coverage of the
pure Python implementation before the acceleration code may be added.
This is to prevent users from accidentally relying
on semantics that are specific to the C code and are not reflected in
the pure Python implementation that other VMs rely upon. For example,
in CPython 3.2.0, ``heapq.heappop()`` does an explicit type
check in its accelerated C code while the Python code uses duck
typing::
from test.support import import_fresh_module
c_heapq = import_fresh_module('heapq', fresh=['_heapq'])
py_heapq = import_fresh_module('heapq', blocked=['_heapq'])
class Spam:
"""Tester class which defines no other magic methods but
__len__()."""
def __len__(self):
return 0
try:
c_heapq.heappop(Spam())
except TypeError:
# Explicit type check failure: "heap argument must be a list"
pass
try:
py_heapq.heappop(Spam())
except AttributeError:
# Duck typing failure: "'Foo' object has no attribute 'pop'"
pass
This kind of divergence is a problem for users as they unwittingly
write code that is CPython-specific. This is also an issue for other
VM teams as they have to deal with bug reports from users thinking
that they incorrectly implemented the module when in fact it was
caused by an untested case.
Details
@ -128,13 +91,8 @@ the stdlib that lack a pure Python equivalent gain such a module. But
if people do volunteer to provide and maintain a pure Python
equivalent (e.g., the PyPy team volunteering their pure Python
implementation of the ``csv`` module and maintaining it) then such
code will be accepted.
This requirement does not apply to modules already existing as only C
code in the standard library. It is acceptable to retroactively add a
pure Python implementation of a module implemented entirely in C, but
in those instances the C version is considered the reference
implementation in terms of expected semantics.
code will be accepted. In those instances the C version is considered
the reference implementation in terms of expected semantics.
Any new accelerated code must act as a drop-in replacement as close
to the pure Python implementation as reasonable. Technical details of
@ -143,16 +101,8 @@ necessary, e.g., a class being a ``type`` when implemented in C. To
verify that the Python and equivalent C code operate as similarly as
possible, both code bases must be tested using the same tests which
apply to the pure Python code (tests specific to the C code or any VM
do not follow under this requirement). To make sure that the test
suite is thorough enough to cover all relevant semantics, the tests
must have 100% branch coverage for the Python code being replaced by
C code. This will make sure that the new acceleration code will
operate as much like a drop-in replacement for the Python code is as
possible. Testing should still be done for issues that come up when
working with C code even if it is not explicitly required to meet the
coverage requirement, e.g., Tests should be aware that C code typically
has special paths for things such as built-in types, subclasses of
built-in types, etc.
do not follow under this requirement). The test suite is expected to
be extensive in order to verify expected semantics.
Acting as a drop-in replacement also dictates that no public API be
provided in accelerated code that does not exist in the pure Python
@ -214,13 +164,22 @@ C accelerated versions of a module, a basic idiom can be followed::
test_main()
If this test were to provide 100% branch coverage for
If this test were to provide extensive coverage for
``heapq.heappop()`` in the pure Python implementation then the
accelerated C code would be allowed to be added to CPython's standard
library. If it did not, then the test suite would need to be updated
until 100% branch coverage was provided before the accelerated C code
until proper coverage was provided before the accelerated C code
could be added.
To also help with compatibility, C code should use abstract APIs on
objects to prevent accidental dependence on specific types. For
instance, if a function accepts a sequence then the C code should
default to using `PyObject_GetItem()` instead of something like
`PyList_GetItem()`. C code is allowed to have a fast path if the
proper `PyList_Check()` is used, but otherwise APIs should work with
any object that duck types to the proper interface instead of a
specific type.
Copyright
=========