PEP 554: Updates post-implementation. (#645)

Eric Snow 2018-05-14 13:39:07 -04:00 committed by GitHub
parent da41a811d7
commit d77a8be7c9
1 changed files with 143 additions and 70 deletions

@ -6,15 +6,16 @@ Type: Standards Track
Content-Type: text/x-rst
Created: 2017-09-05
Python-Version: 3.8
Post-History: 07-Sep-2017, 08-Sep-2017, 13-Sep-2017, 05-Dec-2017
Post-History: 07-Sep-2017, 08-Sep-2017, 13-Sep-2017, 05-Dec-2017,
09-May-2018
Abstract
========
CPython has supported multiple interpreters in the same process (AKA
"subinterpreters") since version 1.5. The feature has been available
via the C-API. [c-api]_ Subinterpreters operate in
"subinterpreters") since version 1.5 (1997). The feature has been
available via the C-API. [c-api]_ Subinterpreters operate in
`relative isolation from one another <Interpreter Isolation_>`_, which
provides the basis for an
`alternative concurrency model <Concurrency_>`_.
@ -47,6 +48,8 @@ At first only the following types will be supported for sharing:
* None
* bytes
* str
* int
* PEP 3118 buffer objects (via ``send_buffer()``)
Support for other basic types (e.g. Ellipsis) will be added later.
@ -87,6 +90,14 @@ For creating and using interpreters:
| channels=None) | | (This blocks the current thread until done.) |
+-----------------------+-----------------------------------------------------+
|
+----------------+--------------+------------------------------------------------------+
| exception | base | description |
+================+==============+======================================================+
| RunFailedError | RuntimeError | Interpreter.run() resulted in an uncaught exception. |
+----------------+--------------+------------------------------------------------------+
For sharing data between interpreters:
+--------------------------------+--------------------------------------------+
@ -120,9 +131,11 @@ For sharing data between interpreters:
| .recv_nowait(default=None) -> | | Like recv(), but return the default |
| object | | instead of waiting. |
+-------------------------------+-----------------------------------------------+
| .close() | | No longer associate the current interpreter |
| .release() | | No longer associate the current interpreter |
| | | with the channel (on the receiving end). |
+-------------------------------+-----------------------------------------------+
| .close(force=False) | | Close the channel in all interpreters. |
+-------------------------------+-----------------------------------------------+
|
@ -147,9 +160,31 @@ For sharing data between interpreters:
+---------------------------+-------------------------------------------------+
| .send_buffer_nowait(obj) | | Like send_buffer(), but fail if not received. |
+---------------------------+-------------------------------------------------+
| .close() | | No longer associate the current interpreter |
| .release() | | No longer associate the current interpreter |
| | | with the channel (on the sending end). |
+---------------------------+-------------------------------------------------+
| .close(force=False) | | Close the channel in all interpreters. |
+---------------------------+-------------------------------------------------+
|
+----------------------+--------------------+------------------------------------------------+
| exception | base | description |
+======================+====================+================================================+
| ChannelError | Exception | The base class for channel-related exceptions. |
+----------------------+--------------------+------------------------------------------------+
| ChannelNotFoundError | ChannelError | The identified channel was not found. |
+----------------------+--------------------+------------------------------------------------+
| ChannelEmptyError | ChannelError | The channel was unexpectedly empty. |
+----------------------+--------------------+------------------------------------------------+
| ChannelNotEmptyError | ChannelError | The channel was unexpectedly not empty. |
+----------------------+--------------------+------------------------------------------------+
| NotReceivedError | ChannelError | Nothing was waiting to receive a sent object. |
+----------------------+--------------------+------------------------------------------------+
| ChannelClosedError | ChannelError | The channel is closed. |
+----------------------+--------------------+------------------------------------------------+
| ChannelReleasedError | ChannelClosedError | The channel is released (but not yet closed). |
+----------------------+--------------------+------------------------------------------------+
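The two exception tables above map onto a small class hierarchy. The following is a minimal sketch in plain Python, so it runs without the proposed ``interpreters`` module. The class names and bases come from the tables and the docstrings restate the table descriptions; this is an illustration, not the actual implementation.

```python
# Illustrative stand-ins for the exceptions listed in the tables above.
# Names and base classes follow the tables; nothing else is implied.

class RunFailedError(RuntimeError):
    """Interpreter.run() resulted in an uncaught exception."""


class ChannelError(Exception):
    """The base class for channel-related exceptions."""


class ChannelNotFoundError(ChannelError):
    """The identified channel was not found."""


class ChannelEmptyError(ChannelError):
    """The channel was unexpectedly empty."""


class ChannelNotEmptyError(ChannelError):
    """The channel was unexpectedly not empty."""


class NotReceivedError(ChannelError):
    """Nothing was waiting to receive a sent object."""


class ChannelClosedError(ChannelError):
    """The channel is closed."""


class ChannelReleasedError(ChannelClosedError):
    """The channel is released (but not yet closed)."""
```

Because ``ChannelReleasedError`` derives from ``ChannelClosedError``, a handler written as ``except ChannelClosedError:`` also covers the released case, while code that needs to tell the two apart can catch ``ChannelReleasedError`` first.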
Examples
@ -218,7 +253,7 @@ Synchronize using a channel
interp.run(tw.dedent("""
reader.recv()
print("during")
reader.close()
reader.release()
"""),
shared=dict(
reader=r,
@ -229,7 +264,7 @@ Synchronize using a channel
t.start()
print('after')
s.send(b'')
s.close()
s.release()
Sharing a file descriptor
-------------------------
@ -280,7 +315,7 @@ Passing objects via marshal
obj = marshal.loads(data)
do_something(obj)
data = reader.recv()
reader.close()
reader.release()
"""))
t = threading.Thread(target=run)
t.start()
@ -310,7 +345,7 @@ Passing objects via pickle
obj = pickle.loads(data)
do_something(obj)
data = reader.recv()
reader.close()
reader.release()
"""))
t = threading.Thread(target=run)
t.start()
@ -514,6 +549,8 @@ channels to the following:
* None
* bytes
* str
* int
* PEP 3118 buffer objects (via ``send_buffer()``)
Limiting the initial shareable types is a practical matter, reducing
@ -686,16 +723,24 @@ The module also provides the following class:
"run()" call into one long script. This is the same as how the
REPL operates.
Regarding uncaught exceptions, we noted that they are
"effectively" propagated into the code where ``run()`` was called.
To prevent leaking exceptions (and tracebacks) between
interpreters, we create a surrogate of the exception and its
traceback (see ``traceback.TracebackException``), wrap it in a
RuntimeError, and raise that.
Supported code: source text.
Uncaught Exceptions
-------------------
Regarding uncaught exceptions in ``Interpreter.run()``, we noted that
they are "effectively" propagated into the code where ``run()`` was
called. To prevent leaking exceptions (and tracebacks) between
interpreters, we create a surrogate of the exception and its traceback
(see ``traceback.TracebackException``), set it to ``__cause__`` on a
new ``RunFailedError``, and raise that.
Raising (a proxy of) the exception is problematic since it's harder to
distinguish between an error in the ``run()`` call and an uncaught
exception from the subinterpreter.
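The surrogate approach described above can be sketched in a single interpreter. This is a hedged illustration only: ``ExceptionProxy`` and ``run_isolated()`` are hypothetical names invented for the example, ``RunFailedError`` is a local stand-in for the proposed class, and a plain function call stands in for ``Interpreter.run()``. Only ``traceback.TracebackException`` is a real standard-library API.

```python
import traceback


class RunFailedError(RuntimeError):
    """Local stand-in for the proposed interpreters.RunFailedError."""


class ExceptionProxy(Exception):
    """Hypothetical surrogate that holds only strings, never the
    original exception object or its traceback frames."""

    def __init__(self, type_name, formatted):
        super().__init__(type_name)
        self.type_name = type_name
        self.formatted = formatted


def run_isolated(func):
    """Call func(); on an uncaught exception raise RunFailedError
    whose __cause__ is a string-only proxy of the original."""
    proxy = None
    try:
        func()
    except Exception as exc:
        # TracebackException captures a printable summary without
        # keeping references to the live frames.
        summary = traceback.TracebackException.from_exception(exc)
        proxy = ExceptionProxy(type(exc).__name__,
                               "".join(summary.format()))
    if proxy is not None:
        # Raised outside the except block so the original exception
        # is not attached as __context__ either.
        raise RunFailedError(f"uncaught {proxy.type_name}") from proxy
```

A caller can then distinguish a failure inside the callable from any other error by catching ``RunFailedError`` and inspecting ``exc.__cause__.formatted`` for the original traceback text.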
API for sharing data
--------------------
@ -703,8 +748,8 @@ Subinterpreters are less useful without a mechanism for sharing data
between them. Sharing actual Python objects between interpreters,
however, has enough potential problems that we are avoiding support
for that here. Instead, only a minimum set of types will be supported.
Initially this will include ``bytes`` and channels. Further types may
be supported later.
Initially this will include ``None``, ``bytes``, ``str``, ``int``,
and channels. Further types may be supported later.
The ``interpreters`` module provides a way for users to determine
whether an object is shareable or not:
@ -737,11 +782,12 @@ many-to-many, channels have no buffer.
Create a new channel and return (recv, send), the RecvChannel and
SendChannel corresponding to the ends of the channel. The channel
is not closed and destroyed (i.e. garbage-collected) until the number
of associated interpreters returns to 0.
of associated interpreters returns to 0 (including when the channel
is explicitly closed).
An interpreter gets associated with a channel by calling its "send()"
or "recv()" method. That association gets dropped by calling
"close()" on the channel.
"release()" on the channel.
Both ends of the channel are supported "shared" objects (i.e. may be
safely shared by different interpreters). Thus they may be passed as
@ -765,7 +811,8 @@ many-to-many, channels have no buffer.
interpreters:
The list of associated interpreters: those that have called
the "recv()" or "__next__()" methods and haven't called "close()".
the "recv()" or "__next__()" methods and haven't called
"release()" (and the channel hasn't been explicitly closed).
recv():
@ -773,10 +820,11 @@ many-to-many, channels have no buffer.
the channel. If none have been sent then wait until the next
send. This associates the current interpreter with the channel.
If the channel is already closed (see the close() method)
then raise EOFError. If the channel isn't closed, but the current
interpreter already called the "close()" method (which drops its
association with the channel) then raise ValueError.
If the channel is already closed then raise ChannelClosedError.
If the channel isn't closed but the current interpreter already
called the "release()" method (which drops its association with
the channel) then raise ChannelReleasedError (which is a subclass
of ChannelClosedError).
recv_nowait(default=None):
@ -784,26 +832,35 @@ many-to-many, channels have no buffer.
then return the default. Otherwise, this is the same as the
"recv()" method.
close():
release():
No longer associate the current interpreter with the channel (on
the receiving end) and block future association (via the "recv()"
method. If the interpreter was never associated with the channel
method). If the interpreter was never associated with the channel
then still block future association. Once an interpreter is no
longer associated with the channel, subsequent (or current) send()
and recv() calls from that interpreter will raise ValueError
(or EOFError if the channel is actually marked as closed).
and recv() calls from that interpreter will raise
ChannelReleasedError (or ChannelClosedError if the channel
is actually marked as closed).
Once the number of associated interpreters on both ends drops
to 0, the channel is actually marked as closed. The Python
runtime will garbage collect all closed channels, though it may
not be immediately. Note that "close()" is automatically called
not be immediately. Note that "release()" is automatically called
on behalf of the current interpreter when the channel is no longer
used (i.e. has no references) in that interpreter.
This operation is idempotent. Return True if "close()" has not
This operation is idempotent. Return True if "release()" has not
been called before by the current interpreter.
close(force=False):
Close both ends of the channel (in all interpreters). This means
that any further use of the channel raises ChannelClosedError. If
the channel is not empty then raise ChannelNotEmptyError (if
"force" is False) or discard the remaining objects (if "force"
is True) and close it.
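The difference between ``release()`` and ``close(force=False)`` on the receiving end can be modelled with a toy, single-threaded channel. This is a rough sketch under loud assumptions: ``ToyChannel`` is a hypothetical class, interpreters are represented by plain string IDs passed in explicitly, both ends share one association set, items are queued rather than handed off, and ``recv()`` on an empty toy channel raises ``IndexError`` instead of blocking. The exception classes are local stand-ins named after the ones in this PEP.

```python
from collections import deque


class ChannelError(Exception):
    pass


class ChannelClosedError(ChannelError):
    pass


class ChannelReleasedError(ChannelClosedError):
    pass


class ChannelNotEmptyError(ChannelError):
    pass


class ToyChannel:
    """Hypothetical in-process model of the release/close rules."""

    def __init__(self):
        self._items = deque()
        self._associated = set()   # IDs that have used the channel
        self._released = set()     # IDs that called release()
        self._closed = False

    def _associate(self, interp_id):
        if self._closed:
            raise ChannelClosedError("channel is closed")
        if interp_id in self._released:
            raise ChannelReleasedError("released by this interpreter")
        self._associated.add(interp_id)

    def send(self, interp_id, obj):
        self._associate(interp_id)
        self._items.append(obj)

    def recv(self, interp_id):
        self._associate(interp_id)
        return self._items.popleft()

    def release(self, interp_id):
        """Drop this ID's association; idempotent.
        Return True only the first time it is called for the ID."""
        first_time = interp_id not in self._released
        self._released.add(interp_id)
        had_users = bool(self._associated)
        self._associated.discard(interp_id)
        if had_users and not self._associated:
            # Last associated interpreter is gone: mark as closed.
            self._closed = True
        return first_time

    def close(self, force=False):
        """Close for every ID; refuse if items remain unless forced."""
        if self._items and not force:
            raise ChannelNotEmptyError("channel still holds items")
        self._items.clear()
        self._closed = True
```

In this model ``release()`` affects only the calling ID (and closes the channel only once the last associated ID is gone), whereas ``close()`` affects everyone at once.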
``SendChannel(id)``::
@ -827,16 +884,16 @@ many-to-many, channels have no buffer.
object is not shareable then ValueError is raised. Currently
only bytes are supported.
If the channel is already closed (see the close() method)
then raise EOFError. If the channel isn't closed, but the current
interpreter already called the "close()" method (which drops its
association with the channel) then raise ValueError.
If the channel is already closed then raise ChannelClosedError.
If the channel isn't closed but the current interpreter already
called the "release()" method (which drops its association with
the channel) then raise ChannelReleasedError.
send_nowait(obj):
Send the object to the receiving end of the channel. If the other
end is not currently receiving then raise RuntimeError. Otherwise
this is the same as "send()".
end is not currently receiving then raise NotReceivedError.
Otherwise this is the same as "send()".
send_buffer(obj):
@ -847,14 +904,23 @@ many-to-many, channels have no buffer.
send_buffer_nowait(obj):
Send a MemoryView of the object rather than the object. If the
other end is not currently receiving then raise RuntimeError.
other end is not currently receiving then raise NotReceivedError.
Otherwise this is the same as "send_buffer()".
close():
release():
This is the same as "RecvChannel.close()", but applied to the
This is the same as "RecvChannel.release()", but applied to the
sending end of the channel.
close(force=False):
Close both ends of the channel (in all interpreters). No matter
what, the "send" end of the channel is immediately closed. If the
channel is empty then close the "recv" end immediately too.
Otherwise wait until the channel is empty before closing it (if
"force" is False) or discard the remaining items and close
immediately (if "force" is True).
Note that ``send_buffer()`` is similar to how
``multiprocessing.Connection`` works. [mp-conn]_
@ -862,7 +928,9 @@ Note that ``send_buffer()`` is similar to how
Open Questions
==============
None
* "force" argument to ``ch.release()``?
* add a "tp_share" type slot instead of using a global registry
for shareable types?
Open Implementation Questions
@ -1020,9 +1088,8 @@ exception, effectively ending execution in the interpreter that tried
to use the poisoned channel.
This could be accomplished by adding a ``poison()`` method to both ends
of the channel. The ``close()`` method could work if it had a ``force``
option to force the channel closed. Regardless, these semantics are
relatively specialized and can wait.
of the channel. The ``close()`` method can be used in this way
(mostly), but these semantics are relatively specialized and can wait.
Sending channels over channels
------------------------------
@ -1070,14 +1137,6 @@ generic module reset mechanism may prove unnecessary.
This isn't a critical feature initially. It can wait until later
if desirable.
Support passing ints in channels
--------------------------------
Passing ints around should be fine and ultimately is probably
desirable. However, we can get by with serializing them as bytes
for now. The goal is a minimal API for the sake of basic
functionality at first.
File descriptors and sockets in channels
----------------------------------------
@ -1119,7 +1178,8 @@ Channel context managers
Context manager support on ``RecvChannel`` and ``SendChannel`` may be
helpful. The implementation would be simple, wrapping a call to
``close()`` like files do. As with iteration, this can wait.
``close()`` (or maybe ``release()``) like files do. As with iteration,
this can wait.
Pipes and Queues
----------------
@ -1136,19 +1196,11 @@ reasonable. The could be trivially implemented as wrappers around
channels. Alternatively they could be implemented for efficiency at the
same low level as channels.
interpreters.RunFailedError
---------------------------
Buffering
---------
As currently proposed, ``Interpreter.run()`` offers you no way to
distinguish an error coming from the subinterpreter from any other
error in the current interpreter. Your only option would be to
explicitly wrap your ``run()`` call in a
``try: ... except RuntimeError:`` (since we wrap a proxy of the original
exception in a RuntimeError and raise that).
If this is a problem in practice then we could add something like
``interpreters.RunFailedError`` (subclassing RuntimeError) and raise that
in ``run()``.
The proposed channels are unbuffered. This simplifies the API and
implementation. If buffering is desirable we can add it later.
Return a lock from send()
-------------------------
@ -1162,6 +1214,26 @@ This matters for buffered channels (i.e. queues). For unbuffered
channels it is a non-issue. So this can be dealt with once channels
support buffering.
Add a "reraise" method to RunFailedError
----------------------------------------
While having ``__cause__`` set on ``RunFailedError`` helps produce a
more useful traceback, it's less helpful when handling the original
error. To help facilitate this, we could add
``RunFailedError.reraise()``. This method would enable the following
pattern::
    try:
        interp.run(script)
    except RunFailedError as exc:
        try:
            exc.reraise()
        except MyException:
            ...
This would be made even simpler if there existed a ``__reraise__``
protocol.
Rejected Ideas
==============
@ -1170,7 +1242,7 @@ Explicit channel association
----------------------------
Interpreters are implicitly associated with channels upon ``recv()`` and
``send()`` calls. They are de-associated with ``close()`` calls. The
``send()`` calls. They are de-associated with ``release()`` calls. The
alternative would be explicit methods. It would be either
``add_channel()`` and ``remove_channel()`` methods on ``Interpreter``
objects or something similar on channel objects.
@ -1216,15 +1288,16 @@ While that might not be a problem currently, it would be a problem once
interpreters get better isolation relative to memory management (which
is necessary to stop sharing the GIL between interpreters). We've
resolved the semantics of how the exceptions propagate by raising a
RuntimeError instead, which wraps a safe proxy for the original
exception and traceback.
``RunFailedError`` instead, for which ``__cause__`` wraps a safe proxy
for the original exception and traceback.
Rejected possible solutions:
* set the RuntimeError's __cause__ to the proxy of the original
exception
* reproduce the exception and traceback in the original interpreter
and raise that.
* raise a subclass of RunFailedError that proxies the original
exception and traceback.
* raise RuntimeError instead of RunFailedError
* convert at the boundary (a la ``subprocess.CalledProcessError``)
(requires a cross-interpreter representation)
* support customization via ``Interpreter.excepthook``