PEP 554: updates after feedback (#1388)
parent
e589d83236
commit
08a58eccaa
236
pep-0554.rst
|
@ -8,7 +8,7 @@ Content-Type: text/x-rst
|
|||
Created: 2017-09-05
|
||||
Python-Version: 3.9
|
||||
Post-History: 07-Sep-2017, 08-Sep-2017, 13-Sep-2017, 05-Dec-2017,
|
||||
09-May-2018
|
||||
09-May-2018, 20-Apr-2020
|
||||
|
||||
|
||||
Abstract
|
||||
|
@ -106,7 +106,7 @@ For creating and using interpreters:
|
|||
+----------------------------------------+-----------------------------------------------------+
|
||||
| ``.is_running() -> bool`` | Is the interpreter currently executing code? |
|
||||
+----------------------------------------+-----------------------------------------------------+
|
||||
| ``.destroy()`` | Finalize and destroy the interpreter. |
|
||||
| ``.close()`` | Finalize and destroy the interpreter. |
|
||||
+----------------------------------------+-----------------------------------------------------+
|
||||
| ``.run(src_str, /, *, channels=None)`` | | Run the given source code in the interpreter. |
|
||||
| | | (This blocks the current thread until done.) |
|
||||
|
@ -738,7 +738,8 @@ The module provides the following functions::
|
|||
|
||||
get_main() => Interpreter
|
||||
|
||||
Return the main interpreter.
|
||||
Return the main interpreter. If the Python implementation
|
||||
has no concept of a main interpreter then return None.
|
||||
|
||||
create() -> Interpreter
|
||||
|
||||
|
@ -763,7 +764,7 @@ The module also provides the following class::
|
|||
code. Calling this on the current interpreter will always
|
||||
return True.
|
||||
|
||||
destroy():
|
||||
close():
|
||||
|
||||
Finalize and destroy the interpreter.
|
||||
|
||||
|
@ -925,9 +926,13 @@ The module also provides the following channel-related classes::
|
|||
|
||||
recv():
|
||||
|
||||
Return the next object (i.e. the data from the sent object)
|
||||
from the channel. If none have been sent then wait until
|
||||
the next send.
|
||||
Return the next object from the channel. If none have been
|
||||
sent then wait until the next send.
|
||||
|
||||
At the least, the object will be equivalent to the sent object.
|
||||
That will almost always mean the same type with the same data,
|
||||
though it could also be a compatible proxy. Regardless, it may
|
||||
use a copy of that data or actually share the data.
|
||||
|
||||
If the channel is already closed then raise ChannelClosedError.
|
||||
If the channel isn't closed but the current interpreter already
|
||||
|
@ -1085,17 +1090,27 @@ Open Questions
|
|||
|
||||
* add "isolated" mode to subinterpreters API?
|
||||
|
||||
An "isolated" mode for subinterpreters would mean an interpreter in
|
||||
that mode is especially restricted. It might include any of the
|
||||
following::
|
||||
There are various ways that an interpreter could potentially operate
|
||||
in a more isolated/restricted way::
|
||||
|
||||
* ImportError when importing ext. module without PEP 489 support
|
||||
* no daemon threads
|
||||
* no threads at all
|
||||
* no multiprocessing
|
||||
* ...
|
||||
|
||||
For now the default would be ``False``, but it would become ``True``
|
||||
later.
|
||||
This could be facilitated via settings (separate or an int flag) on
|
||||
the ``PyConfig`` struct on each ``PyInterpreterState``. (This would
|
||||
require moving ``_PyInterpreterState_SetConfig()`` to the public C-API.)
|
||||
By default the settings would all be False, for backward compatibility.
|
||||
|
||||
The ``interpreters`` module, however, would likely use a more
|
||||
restrictive default (e.g. always require PEP 489 support). This would
|
||||
effectively be the "isolated" mode. It would make sense to add an arg
|
||||
to ``interpreters.create()`` to disable "isolated" mode (at least the
|
||||
PEP 489 part), since then extension authors could test their modules
|
||||
under subinterpreters (without having to release a potentially broken
|
||||
build with PEP 489 support).
|
||||
|
||||
* add a shareable synchronization primitive?
|
||||
|
||||
|
@ -1104,15 +1119,49 @@ interpreters would actually share the underlying mutex. This would
|
|||
provide much better efficiency than blocking channel ops. The main
|
||||
concern is that locks and channels don't mix well (as learned in Go).
|
||||
|
||||
* add readiness callback support to channels?
|
||||
|
||||
This is an alternative to channel buffering. It is probably
|
||||
unnecessary, but may have enough advantages to consider it for the
|
||||
high-level API. It may also be better only for the low-level
|
||||
implementation.
|
||||
|
||||
* also track which interpreters are using a channel end?
|
||||
|
||||
* auto-run in a thread?
|
||||
|
||||
The PEP proposes a hard separation between subinterpreters and threads:
|
||||
if you want to run in a thread you must create the thread yourself and
|
||||
call ``run()`` in it. However, it might be convenient if ``run()``
|
||||
could do that for you, meaning there would be less boilerplate.
|
||||
|
||||
Furthermore, we anticipate that users will want to run in a thread much
|
||||
more often than not. So it would make sense to make this the default
|
||||
behavior. We would add a kw-only param "threaded" (default ``True``)
|
||||
to ``run()`` to allow the run-in-the-current-thread operation.
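The "threaded" idea above can be sketched as follows. This is only an illustration under stated assumptions: ``SketchInterpreter`` is a hypothetical stand-in that uses ``exec()`` in the current process rather than a real subinterpreter, and returning the ``threading.Thread`` object from ``run()`` is an assumption, since the text does not say what a threaded ``run()`` would return.

```python
import threading


class SketchInterpreter:
    """Hypothetical stand-in: runs code with exec(), not a real subinterpreter."""

    def __init__(self):
        self._ns = {}  # namespace shared by successive run() calls

    def run(self, src_str, /, *, threaded=True):
        if not threaded:
            # Run in the current thread and block until done.
            exec(src_str, self._ns)
            return None
        # Assumption: hand the thread back so the caller can join it.
        thread = threading.Thread(target=exec, args=(src_str, self._ns))
        thread.start()
        return thread


interp = SketchInterpreter()
worker = interp.run("x = 6 * 7")          # default: runs in a new thread
worker.join()
result_threaded = interp._ns["x"]

interp.run("y = x + 1", threaded=False)   # opt out: runs in this thread
result_inline = interp._ns["y"]
```

With ``threaded=True`` as the default, the caller no longer writes the thread boilerplate, but does have to decide when to wait for the result.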
|
||||
|
||||
* what to do about BaseException propagation?
|
||||
|
||||
The exception types that inherit from ``BaseException`` (aside from
|
||||
``Exception``) are usually treated specially. These types are:
|
||||
``KeyboardInterrupt``, ``SystemExit``, and ``GeneratorExit``. It may
|
||||
make sense to treat them specially when it comes to propagation from
|
||||
``run()``. Here are some options::
|
||||
|
||||
* propagate like normal via RunFailedError
|
||||
* do not propagate (handle them somehow in the subinterpreter)
|
||||
* propagate them directly (avoid RunFailedError)
|
||||
* propagate them directly (set RunFailedError as __cause__)
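The four options listed above could look roughly like this. It is a sketch only: ``RunFailedError`` here is a local stand-in class, ``run_sketch()`` and its ``mode`` names are hypothetical, and ``exec()`` stands in for running code in a subinterpreter.

```python
class RunFailedError(RuntimeError):
    """Local stand-in for the proposed exception type."""


def run_sketch(src, *, mode):
    """Run src with exec() and apply one of the propagation options."""
    try:
        exec(src, {})
    except BaseException as exc:
        special = not isinstance(exc, Exception)
        if not special or mode == "wrap":
            # Option 1: propagate like normal via RunFailedError.
            raise RunFailedError(str(exc)) from exc
        if mode == "swallow":
            # Option 2: do not propagate (handled "in the subinterpreter").
            return None
        if mode == "direct":
            # Option 3: propagate directly, no RunFailedError involved.
            raise
        if mode == "direct-with-cause":
            # Option 4: propagate directly, RunFailedError as __cause__.
            raise exc from RunFailedError("run failed")
        raise ValueError(mode)


def outcome(mode):
    """Report (exception type, __cause__ type) seen by the caller."""
    try:
        run_sketch("raise SystemExit(3)", mode=mode)
    except BaseException as exc:
        return type(exc).__name__, type(exc.__cause__).__name__
    return None
```

Calling ``outcome()`` with each mode shows what the caller of ``run()`` would have to catch in each case.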
|
||||
|
||||
|
||||
TODO
|
||||
======
|
||||
|
||||
* add a more detailed description of channel lifespan
|
||||
|
||||
A state machine diagram may be most effective. Relevant questions:
|
||||
|
||||
* How does an interpreter detach from the receiving end of a channel
|
||||
that is never empty?
|
||||
* What happens if an interpreter deletes the last reference to a
|
||||
non-empty channel?
|
||||
* On the receiving end, or on the sending end?
|
||||
|
||||
* run the CPython test suite in a subinterpreter and see what shakes out
|
||||
|
||||
|
||||
Deferred Functionality
|
||||
======================
|
||||
|
@ -1143,18 +1192,6 @@ Typically functions that have a ``block`` argument also have a
|
|||
functions that otherwise block, like the channel ``recv()`` and
|
||||
``send()`` methods. We can add it later if needed.
|
||||
|
||||
get_main()
|
||||
----------
|
||||
|
||||
CPython has a concept of a "main" interpreter. This is the initial
|
||||
interpreter created during CPython's runtime initialization. It may
|
||||
be useful to identify the main interpreter. For instance, the main
|
||||
interpreter should not be destroyed. However, for the basic
|
||||
functionality of a high-level API a ``get_main()`` function is not
|
||||
necessary. Furthermore, there is no requirement that a Python
|
||||
implementation have a concept of a main interpreter. So until there's
|
||||
a clear need we'll leave ``get_main()`` out.
|
||||
|
||||
Interpreter.run_in_thread()
|
||||
---------------------------
|
||||
|
||||
|
@ -1318,6 +1355,15 @@ channel methods (``recv()``, and ``send()``). However,
|
|||
the basic functionality of subinterpreters does not depend on async and
|
||||
can be added later.
|
||||
|
||||
Alternately, "readiness callbacks" could be used to simplify use in
|
||||
async scenarios. This would mean adding an optional ``callback``
|
||||
(kw-only) parameter to the ``recv_nowait()`` and ``send_nowait()``
|
||||
channel methods. The callback would be called once the object was sent
|
||||
or received (respectively).
|
||||
|
||||
(Note that making channels buffered makes readiness callbacks less
|
||||
important.)
|
||||
|
||||
Support for iteration
|
||||
---------------------
|
||||
|
||||
|
@ -1340,9 +1386,9 @@ Pipes and Queues
|
|||
|
||||
With the proposed object passing mechanism of "channels", other similar
|
||||
basic types aren't required to achieve the minimal useful functionality
|
||||
of subinterpreters. Such types include pipes (like channels, but
|
||||
one-to-one) and queues (like channels, but more generic). See below in
|
||||
`Rejected Ideas` for more information.
|
||||
of subinterpreters. Such types include pipes (like unbuffered channels,
|
||||
but one-to-one) and queues (like channels, but more generic). See below
|
||||
in `Rejected Ideas` for more information.
|
||||
|
||||
Even though these types aren't part of this proposal, they may still
|
||||
be useful in the context of concurrency. Adding them later is entirely
|
||||
|
@ -1350,12 +1396,6 @@ reasonable. They could be trivially implemented as wrappers around
|
|||
channels. Alternatively they could be implemented for efficiency at the
|
||||
same low level as channels.
|
||||
|
||||
Buffering
|
||||
---------
|
||||
|
||||
The proposed channels are unbuffered. This simplifies the API and
|
||||
implementation. If buffering is desirable we can add it later.
|
||||
|
||||
Return a lock from send()
|
||||
-------------------------
|
||||
|
||||
|
@ -1371,26 +1411,6 @@ less likely to confuse users.
|
|||
Note that returning a lock would matter for buffered channels
|
||||
(i.e. queues). For unbuffered channels it is a non-issue.
|
||||
|
||||
Add a "reraise" method to RunFailedError
|
||||
----------------------------------------
|
||||
|
||||
While having ``__cause__`` set on ``RunFailedError`` helps produce a
|
||||
more useful traceback, it's less helpful when handling the original
|
||||
error. To help facilitate this, we could add
|
||||
``RunFailedError.reraise()``. This method would enable the following
|
||||
pattern::
|
||||
|
||||
    try:
        interp.run(script)
    except RunFailedError as exc:
        try:
            exc.reraise()
        except MyException:
            ...
|
||||
|
||||
This would be made even simpler if there existed a ``__reraise__``
|
||||
protocol.
|
||||
|
||||
Support prioritization in channels
|
||||
----------------------------------
|
||||
|
||||
|
@ -1411,6 +1431,51 @@ will require significant work, especially when it comes to complex
|
|||
objects and most especially for mutable containers of mutable
|
||||
complex objects.
|
||||
|
||||
Make exceptions shareable
|
||||
-------------------------
|
||||
|
||||
Exceptions are propagated out of ``run()`` calls, so it isn't a big
|
||||
leap to make them shareable in channels. However, as noted elsewhere,
|
||||
it isn't essential (or particularly common), so we can wait on doing
|
||||
that.
|
||||
|
||||
Make RunFailedError.__cause__ lazy
|
||||
----------------------------------
|
||||
|
||||
An uncaught exception in a subinterpreter (from ``run()``) is copied
|
||||
to the calling interpreter and set as ``__cause__`` on a
|
||||
``RunFailedError`` which is then raised. That copying part involves
|
||||
some sort of deserialization in the calling interpreter, which can be
|
||||
expensive (e.g. due to imports) yet is not always necessary.
|
||||
|
||||
So it may be useful to use an ``ExceptionProxy`` type to wrap the
|
||||
serialized exception and only deserialize it when needed. That could
|
||||
be via ``ExceptionProxy.__getattribute__()`` or perhaps through
|
||||
``RunFailedError.resolve()`` (which would raise the deserialized
|
||||
exception and set ``RunFailedError.__cause__`` to the exception).
|
||||
|
||||
It may also make sense to have ``RunFailedError.__cause__`` be a
|
||||
descriptor that does the lazy deserialization (and sets ``__cause__``
on the ``RunFailedError`` instance).
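A minimal sketch of the ``resolve()`` variant, assuming pickle as the serialization format (the text only says "some sort of deserialization"). The ``RunFailedError`` class, its ``payload`` argument, and the ``deserialized`` flag are hypothetical names introduced for illustration.

```python
import pickle


class RunFailedError(RuntimeError):
    """Local stand-in; keeps the original error in serialized form."""

    def __init__(self, msg, payload):
        super().__init__(msg)
        self._payload = payload      # bytes produced on the "other side"
        self.deserialized = False    # hypothetical flag, for illustration

    def resolve(self):
        # Deserialize only on first use, then cache on __cause__.
        if not self.deserialized:
            self.__cause__ = pickle.loads(self._payload)
            self.deserialized = True
        raise self.__cause__


# Simulate an error that was serialized in a subinterpreter.
payload = pickle.dumps(KeyError("missing"))
err = RunFailedError("run failed", payload)
before = err.deserialized

try:
    err.resolve()
except KeyError as exc:
    resolved = exc
```

Code that only logs the ``RunFailedError`` never pays the deserialization cost; code that needs the original exception calls ``resolve()`` and pays it once.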
|
||||
|
||||
Serialize everything through channels
|
||||
-------------------------------------
|
||||
|
||||
We could use pickle (or marshal) to serialize everything sent through
|
||||
channels. Doing this is potentially inefficient, but it may be a
|
||||
matter of convenience in the end. We can add it later, but trying to
|
||||
remove it later would be significantly more painful.
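As a rough illustration of what serializing everything would mean, here is a sketch that pickles on send and unpickles on receive. ``PicklingChannel`` is a hypothetical stand-in built on ``queue.Queue`` within one process; it does not model the proposed channel API beyond ``send()`` and ``recv()``.

```python
import pickle
import queue


class PicklingChannel:
    """Hypothetical stand-in: every object crosses the channel as bytes."""

    def __init__(self):
        self._queue = queue.Queue()

    def send(self, obj):
        # Serialize eagerly so the receiver never sees the sender's object.
        self._queue.put(pickle.dumps(obj))

    def recv(self):
        # Blocks until something has been sent, then rebuilds a copy.
        return pickle.loads(self._queue.get())


channel = PicklingChannel()
sent = {"id": 1, "tags": ["a", "b"]}
channel.send(sent)
received = channel.recv()
```

The received object is equal to the sent one but is a separate copy, and each transfer pays for a full serialize/deserialize round trip, which is the inefficiency the paragraph above refers to.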
|
||||
|
||||
Return a value from ``run()``
|
||||
-----------------------------
|
||||
|
||||
Currently ``run()`` always returns None. One idea is to return the
|
||||
return value from whatever the subinterpreter ran. However, for now
|
||||
it doesn't make sense. The only thing folks can run is a string of
|
||||
code (i.e. a script). This is equivalent to ``PyRun_StringFlags()``,
|
||||
``exec()``, or a module body. None of those "return" anything. We can
|
||||
revisit this once ``run()`` supports functions, etc.
|
||||
|
||||
|
||||
Rejected Ideas
|
||||
==============
|
||||
|
@ -1440,15 +1505,11 @@ Use queues instead of channels
|
|||
------------------------------
|
||||
|
||||
Queues and buffered channels are almost the same thing. The main
|
||||
difference is that channels has a stronger relationship with context
|
||||
difference is that channels have a stronger relationship with context
|
||||
(i.e. the associated interpreter).
|
||||
|
||||
The name "Channel" was used instead of "Queue" to avoid confusion with
|
||||
the stdlib ``queue`` module.
|
||||
|
||||
Note that buffering in channels does complicate the blocking semantics
|
||||
of ``recv()`` and ``send()``. Also, queues can be built on top of
|
||||
unbuffered channels.
|
||||
the stdlib ``queue.Queue``.
|
||||
|
||||
"enumerate"
|
||||
-----------
|
||||
|
@ -1542,6 +1603,49 @@ Doing this is potentially confusing and also can lead to unexpected
|
|||
races where a channel is auto-closed before it can be used in the
|
||||
original (creating) interpreter.
|
||||
|
||||
Add a "reraise" method to RunFailedError
|
||||
----------------------------------------
|
||||
|
||||
While having ``__cause__`` set on ``RunFailedError`` helps produce a
|
||||
more useful traceback, it's less helpful when handling the original
|
||||
error. To help facilitate this, we could add
|
||||
``RunFailedError.reraise()``. This method would enable the following
|
||||
pattern::
|
||||
|
||||
    try:
        try:
            interp.run(script)
        except RunFailedError as exc:
            exc.reraise()
    except MyException:
        ...
|
||||
|
||||
This would be made even simpler if there existed a ``__reraise__``
|
||||
protocol.
|
||||
|
||||
All that said, this is completely unnecessary. Using ``__cause__``
|
||||
is good enough::
|
||||
|
||||
    try:
        try:
            interp.run(script)
        except RunFailedError as exc:
            raise exc.__cause__
    except MyException:
        ...
|
||||
|
||||
Note that in extreme cases it may require a little extra boilerplate::
|
||||
|
||||
    try:
        try:
            interp.run(script)
        except RunFailedError as exc:
            if exc.__cause__ is not None:
                raise exc.__cause__
            raise  # re-raise
    except MyException:
        ...
|
||||
|
||||
|
||||
Implementation
|
||||
==============
|
||||
|
|