From d77a8be7c92d9df7e10bc3a76edc3a1a8d1c7c30 Mon Sep 17 00:00:00 2001 From: Eric Snow Date: Mon, 14 May 2018 13:39:07 -0400 Subject: [PATCH] PEP 554: Updates post-implementation. (#645) --- pep-0554.rst | 213 ++++++++++++++++++++++++++++++++++----------------- 1 file changed, 143 insertions(+), 70 deletions(-) diff --git a/pep-0554.rst b/pep-0554.rst index 8c09fc55d..4d9d5beea 100644 --- a/pep-0554.rst +++ b/pep-0554.rst @@ -6,15 +6,16 @@ Type: Standards Track Content-Type: text/x-rst Created: 2017-09-05 Python-Version: 3.8 -Post-History: 07-Sep-2017, 08-Sep-2017, 13-Sep-2017, 05-Dec-2017 +Post-History: 07-Sep-2017, 08-Sep-2017, 13-Sep-2017, 05-Dec-2017, + 09-May-2018 Abstract ======== CPython has supported multiple interpreters in the same process (AKA -"subinterpreters") since version 1.5. The feature has been available -via the C-API. [c-api]_ Subinterpreters operate in +"subinterpreters") since version 1.5 (1997). The feature has been +available via the C-API. [c-api]_ Subinterpreters operate in `relative isolation from one another `_, which provides the basis for an `alternative concurrency model `_. @@ -47,6 +48,8 @@ At first only the following types will be supported for sharing: * None * bytes +* str +* int * PEP 3118 buffer objects (via ``send_buffer()``) Support for other basic types (e.g. int, Ellipsis) will be added later. @@ -87,6 +90,14 @@ For creating and using interpreters: | channels=None) | | (This blocks the current thread until done.) | +-----------------------+-----------------------------------------------------+ +| + ++----------------+--------------+------------------------------------------------------+ +| exception | base | description | ++================+==============+======================================================+ +| RunFailedError | RuntimeError | Interpreter.run() resulted in an uncaught exception. 
| ++----------------+--------------+------------------------------------------------------+ + For sharing data between interpreters: +--------------------------------+--------------------------------------------+ @@ -120,9 +131,11 @@ For sharing data between interpreters: | .recv_nowait(default=None) -> | | Like recv(), but return the default | | object | | instead of waiting. | +-------------------------------+-----------------------------------------------+ -| .close() | | No longer associate the current interpreter | +| .release() | | No longer associate the current interpreter | | | | with the channel (on the receiving end). | +-------------------------------+-----------------------------------------------+ +| .close(force=False) | | Close the channel in all interpreters. | ++-------------------------------+-----------------------------------------------+ | @@ -147,9 +160,31 @@ For sharing data between interpreters: +---------------------------+-------------------------------------------------+ | .send_buffer_nowait(obj) | | Like send_buffer(), but fail if not received. | +---------------------------+-------------------------------------------------+ -| .close() | | No longer associate the current interpreter | +| .release() | | No longer associate the current interpreter | | | | with the channel (on the sending end). | +---------------------------+-------------------------------------------------+ +| .close(force=False) | | Close the channel in all interpreters. | ++---------------------------+-------------------------------------------------+ + +| + ++----------------------+--------------------+------------------------------------------------+ +| exception | base | description | ++======================+====================+================================================+ +| ChannelError | Exception | The base class for channel-related exceptions. 
| ++----------------------+--------------------+------------------------------------------------+ +| ChannelNotFoundError | ChannelError | The identified channel was not found. | ++----------------------+--------------------+------------------------------------------------+ +| ChannelEmptyError | ChannelError | The channel was unexpectedly empty. | ++----------------------+--------------------+------------------------------------------------+ +| ChannelNotEmptyError | ChannelError | The channel was unexpectedly not empty. | ++----------------------+--------------------+------------------------------------------------+ +| NotReceivedError | ChannelError | Nothing was waiting to receive a sent object. | ++----------------------+--------------------+------------------------------------------------+ +| ChannelClosedError | ChannelError | The channel is closed. | ++----------------------+--------------------+------------------------------------------------+ +| ChannelReleasedError | ChannelClosedError | The channel is released (but not yet closed). 
| ++----------------------+--------------------+------------------------------------------------+ Examples @@ -218,7 +253,7 @@ Synchronize using a channel interp.run(tw.dedent(""" reader.recv() print("during") - reader.close() + reader.release() """), shared=dict( reader=r, @@ -229,7 +264,7 @@ Synchronize using a channel t.start() print('after') s.send(b'') - s.close() + s.release() Sharing a file descriptor ------------------------- @@ -280,7 +315,7 @@ Passing objects via marshal obj = marshal.loads(data) do_something(obj) data = reader.recv() - reader.close() + reader.release() """)) t = threading.Thread(target=run) t.start() @@ -310,7 +345,7 @@ Passing objects via pickle obj = pickle.loads(data) do_something(obj) data = reader.recv() - reader.close() + reader.release() """)) t = threading.Thread(target=run) t.start() @@ -514,6 +549,8 @@ channels to the following: * None * bytes +* str +* int * PEP 3118 buffer objects (via ``send_buffer()``) Limiting the initial shareable types is a practical matter, reducing @@ -686,16 +723,24 @@ The module also provides the following class: "run()" call into one long script. This is the same as how the REPL operates. - Regarding uncaught exceptions, we noted that they are - "effectively" propagated into the code where ``run()`` was called. - To prevent leaking exceptions (and tracebacks) between - interpreters, we create a surrogate of the exception and its - traceback (see ``traceback.TracebackException``), wrap it in a - RuntimeError, and raise that. - Supported code: source text. +Uncaught Exceptions +------------------- + +Regarding uncaught exceptions in ``Interpreter.run()``, we noted that +they are "effectively" propagated into the code where ``run()`` was +called. To prevent leaking exceptions (and tracebacks) between +interpreters, we create a surrogate of the exception and its traceback +(see ``traceback.TracebackException``), set it to ``__cause__`` on a +new ``RunFailedError``, and raise that. 
+ +Raising (a proxy of) the exception is problematic since it's harder to +distinguish between an error in the ``run()`` call and an uncaught +exception from the subinterpreter. + + API for sharing data -------------------- @@ -703,8 +748,8 @@ Subinterpreters are less useful without a mechanism for sharing data between them. Sharing actual Python objects between interpreters, however, has enough potential problems that we are avoiding support for that here. Instead, only mimimum set of types will be supported. -Initially this will include ``bytes`` and channels. Further types may -be supported later. +Initially this will include ``None``, ``bytes``, ``str``, ``int``, +and channels. Further types may be supported later. The ``interpreters`` module provides a way for users to determine whether an object is shareable or not: @@ -737,11 +782,12 @@ many-to-many, channels have no buffer. Create a new channel and return (recv, send), the RecvChannel and SendChannel corresponding to the ends of the channel. The channel is not closed and destroyed (i.e. garbage-collected) until the number - of associated interpreters returns to 0. + of associated interpreters returns to 0 (including when the channel + is explicitly closed). An interpreter gets associated with a channel by calling its "send()" or "recv()" method. That association gets dropped by calling - "close()" on the channel. + "release()" on the channel. Both ends of the channel are supported "shared" objects (i.e. may be safely shared by different interpreters. Thus they may be passed as @@ -765,7 +811,8 @@ many-to-many, channels have no buffer. interpreters: The list of associated interpreters: those that have called - the "recv()" or "__next__()" methods and haven't called "close()". + the "recv()" or "__next__()" methods and haven't called + "release()" (and the channel hasn't been explicitly closed). recv(): @@ -773,10 +820,11 @@ many-to-many, channels have no buffer. the channel. 
If none have been sent then wait until the next send. This associates the current interpreter with the channel. - If the channel is already closed (see the close() method) - then raise EOFError. If the channel isn't closed, but the current - interpreter already called the "close()" method (which drops its - association with the channel) then raise ValueError. + If the channel is already closed then raise ChannelClosedError. + If the channel isn't closed but the current interpreter already + called the "release()" method (which drops its association with + the channel) then raise ChannelReleasedError (which is a subclass + of ChannelClosedError). recv_nowait(default=None): @@ -784,26 +832,35 @@ many-to-many, channels have no buffer. then return the default. Otherwise, this is the same as the "recv()" method. - close(): + release(): No longer associate the current interpreter with the channel (on the receiving end) and block future association (via the "recv()" - method. If the interpreter was never associated with the channel + method). If the interpreter was never associated with the channel then still block future association. Once an interpreter is no longer associated with the channel, subsequent (or current) send() - and recv() calls from that interpreter will raise ValueError - (or EOFError if the channel is actually marked as closed). + and recv() calls from that interpreter will raise + ChannelReleasedError (or ChannelClosedError if the channel + is actually marked as closed). Once the number of associated interpreters on both ends drops to 0, the channel is actually marked as closed. The Python runtime will garbage collect all closed channels, though it may - not be immediately. Note that "close()" is automatically called + not be immediately. Note that "release()" is automatically called in behalf of the current interpreter when the channel is no longer used (i.e. has no references) in that interpreter. - This operation is idempotent. 
Return True if "close()" has not + This operation is idempotent.  Return True if "release()" has not been called before by the current interpreter. + close(force=False): + + Close both ends of the channel (in all interpreters).  This means + that any further use of the channel raises ChannelClosedError.  If + the channel is not empty then raise ChannelNotEmptyError (if + "force" is False) or discard the remaining objects (if "force" + is True) and close it. + ``SendChannel(id)``:: @@ -827,16 +884,16 @@ many-to-many, channels have no buffer. object is not shareable then ValueError is raised.  Currently only bytes are supported. - If the channel is already closed (see the close() method) - then raise EOFError.  If the channel isn't closed, but the current - interpreter already called the "close()" method (which drops its - association with the channel) then raise ValueError. + If the channel is already closed then raise ChannelClosedError. + If the channel isn't closed but the current interpreter already + called the "release()" method (which drops its association with + the channel) then raise ChannelReleasedError. send_nowait(obj): Send the object to the receiving end of the channel.  If the other - end is not currently receiving then raise RuntimeError.  Otherwise - this is the same as "send()". + end is not currently receiving then raise NotReceivedError. + Otherwise this is the same as "send()". send_buffer(obj): @@ -847,14 +904,23 @@ many-to-many, channels have no buffer. send_buffer_nowait(obj): Send a MemoryView of the object rather than the object.  If the - other end is not currently receiving then raise RuntimeError. + other end is not currently receiving then raise NotReceivedError. Otherwise this is the same as "send_buffer()". - close(): + release(): - This is the same as "RecvChannel.close(), but applied to the + This is the same as "RecvChannel.release()", but applied to the sending end of the channel. 
+ close(force=False): + + Close both ends of the channel (in all interpreters).  No matter + what, the "send" end of the channel is immediately closed.  If the + channel is empty then close the "recv" end immediately too. + Otherwise wait until the channel is empty before closing it (if + "force" is False) or discard the remaining items and close + immediately (if "force" is True). + Note that ``send_buffer()`` is similar to how ``multiprocessing.Connection`` works. [mp-conn]_ @@ -862,7 +928,9 @@ Note that ``send_buffer()`` is similar to how Open Questions ============== -None +* "force" argument to ``ch.release()``? +* add a "tp_share" type slot instead of using a global registry + for shareable types? Open Implementation Questions @@ -1020,9 +1088,8 @@ exception, effectively ending execution in the interpreter that tried to use the poisoned channel. This could be accomplished by adding a ``poison()`` method to both ends -of the channel.  The ``close()`` method could work if it had a ``force`` -option to force the channel closed.  Regardless, these semantics are -relatively specialized and can wait. +of the channel.  The ``close()`` method can be used in this way +(mostly), but these semantics are relatively specialized and can wait. Sending channels over channels ------------------------------ @@ -1070,14 +1137,6 @@ generic module reset mechanism may prove unnecessary. This isn't a critical feature initially.  It can wait until later if desirable. -Support passing ints in channels --------------------------------- - -Passing ints around should be fine and ultimately is probably -desirable.  However, we can get by with serializing them as bytes -for now.  The goal is a minimal API for the sake of basic -functionality at first. - File descriptors and sockets in channels ---------------------------------------- @@ -1119,7 +1178,8 @@ Channel context managers Context manager support on ``RecvChannel`` and ``SendChannel`` may be helpful. 
The implementation would be simple, wrapping a call to -``close()`` like files do.  As with iteration, this can wait. +``close()`` (or maybe ``release()``) like files do.  As with iteration, +this can wait. Pipes and Queues ---------------- @@ -1136,19 +1196,11 @@ reasonable. The could be trivially implemented as wrappers around channels. Alternatively they could be implemented for efficiency at the same low level as channels. -interpreters.RunFailedError ---------------------------- +Buffering +--------- -As currently proposed, ``Interpreter.run()`` offers you no way to -distinguish an error coming from the subinterpreter from any other -error in the current interpreter.  Your only option would be to -explicitly wrap your ``run()`` call in a -``try: ... except RuntimeError:`` (since we wrap a proxy of the original -exception in a RuntimeError and raise that). - -If this is a problem in practice then would could add something like -``interpreters.RunFailedError`` (subclassing RuntimeError) and raise that -in ``run()``. +The proposed channels are unbuffered.  This simplifies the API and +implementation.  If buffering is desirable we can add it later. Return a lock from send() ------------------------- @@ -1162,6 +1214,26 @@ This matters for buffered channels (i.e. queues). For unbuffered channels it is a non-issue.  So this can be dealt with once channels support buffering. +Add a "reraise" method to RunFailedError +---------------------------------------- + +While having ``__cause__`` set on ``RunFailedError`` helps produce a +more useful traceback, it's less helpful when handling the original +error.  To help facilitate this, we could add +``RunFailedError.reraise()``.  This method would enable the following +pattern:: + + try: + interp.run(script) + except RunFailedError as exc: + try: + exc.reraise() + except MyException: + ... + +This would be made even simpler if there existed a ``__reraise__`` +protocol. 
+ Rejected Ideas ============== @@ -1170,7 +1242,7 @@ Explicit channel association ---------------------------- Interpreters are implicitly associated with channels upon ``recv()`` and -``send()`` calls. They are de-associated with ``close()`` calls. The +``send()`` calls. They are de-associated with ``release()`` calls. The alternative would be explicit methods. It would be either ``add_channel()`` and ``remove_channel()`` methods on ``Interpreter`` objects or something similar on channel objects. @@ -1216,15 +1288,16 @@ While that might not be a problem currently, it would be a problem once interpreters get better isolation relative to memory management (which is necessary to stop sharing the GIL between interpreters). We've resolved the semantics of how the exceptions propagate by raising a -RuntimeError instead, which wraps a safe proxy for the original -exception and traceback. +``RunFailedError`` instead, for which ``__cause__`` wraps a safe proxy +for the original exception and traceback. Rejected possible solutions: -* set the RuntimeError's __cause__ to the proxy of the original - exception * reproduce the exception and traceback in the original interpreter and raise that. +* raise a subclass of RunFailedError that proxies the original + exception and traceback. +* raise RuntimeError instead of RunFailedError * convert at the boundary (a la ``subprocess.CalledProcessError``) (requires a cross-interpreter representation) * support customization via ``Interpreter.excepthook``
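
Reviewer note (not part of the patch): the exception handling this change settles on, a safe proxy of the original exception set as ``__cause__`` on a new ``RunFailedError``, can be approximated in a single interpreter with the standard library alone. The sketch below is illustrative only and makes several assumptions: ``RunFailedError`` is a local stand-in for the proposed ``interpreters.RunFailedError``, ``ExceptionProxy`` and the ``run()`` helper are hypothetical names, and ``exec()`` stands in for ``Interpreter.run()``. Because ``__cause__`` must be an exception instance, the ``traceback.TracebackException`` snapshot is rendered to text and carried by the proxy rather than assigned directly.

```python
import traceback


class RunFailedError(RuntimeError):
    """Local stand-in for the proposed interpreters.RunFailedError."""


class ExceptionProxy(Exception):
    """Hypothetical surrogate that keeps only strings, not live objects."""

    def __init__(self, exc):
        # TracebackException captures a printable snapshot of the exception
        # and its traceback without holding on to frames.
        snapshot = traceback.TracebackException.from_exception(exc)
        self.type_name = type(exc).__name__
        self.formatted = "".join(snapshot.format())
        super().__init__(f"{self.type_name}: {exc}")


def run(source):
    """Simulate Interpreter.run() by exec()-ing source text (assumption)."""
    try:
        exec(source, {"__name__": "__main__"})
    except Exception as exc:
        proxy = ExceptionProxy(exc)
    else:
        return None
    # Raise outside the except block so the original exception is not
    # attached as __context__; only the proxy crosses the boundary.
    raise RunFailedError("uncaught exception in run()") from proxy
```

A caller would wrap ``run(...)`` in ``try: ... except RunFailedError as err:`` and inspect ``err.__cause__.type_name`` or ``err.__cause__.formatted``. This distinguishes a failure inside the executed script from an error raised by the ``run()`` call itself, which is the problem the patch describes for raising a bare proxy.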