PEP 554: Drop "Shareable" Objects (#3057)

This change also includes some minor fixes and a note about concurrent.futures.
Eric Snow 2023-03-14 17:13:30 -06:00 committed by GitHub
parent 2bc335d7a4
commit 860b950c77
1 changed files with 64 additions and 141 deletions


@ -1,14 +1,16 @@
PEP: 554
Title: Multiple Interpreters in the Stdlib
Author: Eric Snow <ericsnowcurrently@gmail.com>
BDFL-Delegate: Antoine Pitrou <antoine@python.org>
Status: Draft
Type: Standards Track
Content-Type: text/x-rst
Created: 05-Sep-2017
Python-Version: 3.12
Post-History: 07-Sep-2017, 08-Sep-2017, 13-Sep-2017, 05-Dec-2017,
09-May-2018, 20-Apr-2020, 04-May-2020
Post-History: `07-Sep-2017 <https://mail.python.org/archives/list/python-ideas@python.org/thread/HQQWEE527HG3ILJVKQTXVSJIQO6NUSIA/>`__,
`08-Sep-2017 <https://mail.python.org/archives/list/python-dev@python.org/thread/NBWMA6LVD22XOUYC5ZMPBFWDQOECRP77/>`__,
`13-Sep-2017 <https://mail.python.org/archives/list/python-dev@python.org/thread/EG4FSFG5E3O22FTIUQOXMQ6X6B5X3DP7/>`__,
`05-Dec-2017 <https://mail.python.org/archives/list/python-dev@python.org/thread/BCSRGAMCYB3NGXNU42U66J56XNZVMQP2/>`__,
`04-May-2020 <https://mail.python.org/archives/list/python-dev@python.org/thread/X2KPCSRVBD2QD5GP5IMXXZTGZ46OXD3D/>`__,
Abstract
@ -23,7 +25,7 @@ facilitates novel alternative approaches to
This proposal introduces the stdlib ``interpreters`` module. It exposes
the basic functionality of multiple interpreters already provided by the
C-API, along with a *very* basic way to communicate
C-API, along with describing a *very* basic way to communicate
(i.e. pass data between interpreters).
@ -44,6 +46,12 @@ necessarily agree.
Proposal
========
Summary:
* add a new stdlib module: "interpreters"
* help for extension module maintainers
The "interpreters" Module
-------------------------
@ -88,20 +96,20 @@ For creating and using interpreters:
|
+---------------------------------------------------+---------------------------------------------------+
| signature | description |
+===================================================+===================================================+
| ``class Interpreter`` | A single interpreter. |
+---------------------------------------------------+---------------------------------------------------+
| ``.id`` | The interpreter's ID (read-only). |
+---------------------------------------------------+---------------------------------------------------+
| ``.is_running() -> bool`` | Is the interpreter currently executing code? |
+---------------------------------------------------+---------------------------------------------------+
| ``.close()`` | Finalize and destroy the interpreter. |
+---------------------------------------------------+---------------------------------------------------+
| ``.run(src_str, /, *, shared=None) -> Status`` | | Run the given source code in the interpreter |
| | | (in its own thread). |
+---------------------------------------------------+---------------------------------------------------+
+----------------------------------+---------------------------------------------------+
| signature | description |
+==================================+===================================================+
| ``class Interpreter`` | A single interpreter. |
+----------------------------------+---------------------------------------------------+
| ``.id`` | The interpreter's ID (read-only). |
+----------------------------------+---------------------------------------------------+
| ``.is_running() -> bool`` | Is the interpreter currently executing code? |
+----------------------------------+---------------------------------------------------+
| ``.close()`` | Finalize and destroy the interpreter. |
+----------------------------------+---------------------------------------------------+
| ``.run(src_str, /) -> Status`` | | Run the given source code in the interpreter |
| | | (in its own thread). |
+----------------------------------+---------------------------------------------------+
.. XXX Support blocking interp.run() until the interpreter
finishes its current work.
@ -136,15 +144,6 @@ Asynchronous results:
| ``NotFinishedError`` | ``Exception`` | The request has not completed yet. |
+--------------------------+------------------------+------------------------------------------------+
For sharing data between interpreters:
+---------------------------------------------------------+--------------------------------------------+
| signature | description |
+=========================================================+============================================+
| ``is_shareable(obj) -> bool`` | | Can the object's data be shared |
| | | between interpreters? |
+---------------------------------------------------------+--------------------------------------------+
Help for Extension Module Maintainers
-------------------------------------
@ -239,15 +238,11 @@ Synchronize using an OS pipe
interp = interpreters.create()
r, s = os.pipe()
print('before')
interp.run(tw.dedent("""
interp.run(tw.dedent(f"""
import os
os.read(reader, 1)
os.read({r}, 1)
print("during")
"""),
shared=dict(
reader=r,
),
)
"""))
print('after')
os.write(s, b'\0')  # os.write() requires bytes; one byte unblocks the reader
@ -259,19 +254,14 @@ Sharing a file descriptor
interp = interpreters.create()
r1, s1 = os.pipe()
r2, s2 = os.pipe()
interp.run(tw.dedent("""
interp.run(tw.dedent(f"""
import os
fd = int.from_bytes(
os.read(reader, 10), 'big')
os.read({r1}, 10), 'big')
for line in os.fdopen(fd):
print(line)
os.write(writer, b'')
"""),
shared=dict(
reader=r1,
writer=s2,
),
)
os.write({s2}, b'')
"""))
with open('spamspamspam') as infile:
fd = infile.fileno().to_bytes(1, 'big')
os.write(s1, fd)
@ -284,14 +274,11 @@ Passing objects via pickle
interp = interpreters.create()
r, s = os.pipe()
interp.run(tw.dedent("""
interp.run(tw.dedent(f"""
import os
import pickle
"""),
shared=dict(
reader=r,
),
).wait()
reader = {r}
""")).wait()
interp.run(tw.dedent("""
data = b''
c = os.read(reader, 1)
@ -507,7 +494,7 @@ created. Furthermore, the complexity of *object* sharing increases as
interpreters become more isolated, e.g. after GIL removal (though this
is mitigated somewhat for some "immortal" objects (see :pep:`683`)).
Consequently,the mechanism for sharing needs to be carefully considered.
Consequently, the mechanism for sharing needs to be carefully considered.
There are a number of valid solutions, several of which may be
appropriate to support in Python. Earlier versions of this proposal
included a basic capability ("channels"), though most of the options
@ -643,17 +630,11 @@ The module also provides the following classes::
This may not be called on an already running interpreter.
Doing so results in a RuntimeError.
run(source_str, /, *, shared=None) -> Status:
run(source_str, /) -> Status:
Run the provided Python source code in the interpreter and
return a Status object that tracks when it finishes.
If the "shared" keyword argument is provided (and is a mapping
of attribute name keys) then each key-value pair is added to
the interpreter's execution namespace (the interpreter's
"__main__" module). If any of the values are not a shareable
object (see below) then ValueError gets raised.
This may not be called on an already running interpreter.
Doing so results in a RuntimeError.
@ -737,92 +718,17 @@ The various aspects of the approach, including keeping the API minimal,
help us avoid further exposing any underlying complexity
to Python users.
.. _interpreters-is-shareable:
Shareable Objects
'''''''''''''''''
A "shareable" object is one that the runtime knows how to safely "share"
between interpreters. For now this actually means that a copy of the
object is provided to the second interpreter. Legitimate sharing is
feasible but beyond the scope of this proposal.
In fact, this proposal only covers very minimal "sharing" of a handful
of simple, immutable object types. We will initially limit the types
that are shareable to the following:
* ``None``
* ``bytes``
* ``str``
* ``int``
Support for other basic types (e.g. ``bool``, ``float``, ``Ellipsis``)
will be added later, separately.
Limiting the initial shareable types is a practical matter, reducing
the potential complexity of the initial implementation. There are a
number of solutions we may pursue in the future to expand supported
objects and object sharing strategies.
However, this PEP does provide one concrete addition related to
shareable objects. The ``interpreters`` module provides a function
that users may call to determine whether an object is shareable or not::
is_shareable(obj) -> bool:
Return True if the object may be shared between interpreters.
This does not necessarily mean that the actual objects will be
shared. Instead, it means that the objects' underlying data will
be shared in a cross-interpreter way, whether via a proxy, a
copy, or some other means.
How Sharing Works
'''''''''''''''''
In this proposal, shareable objects are used with ``Interpreter.run()``.
The steps look something like this:
1. a "shareable" object is mapped to an identifier in some container
2. that mapping is passed as the "shared" argument in the
``Interpreter.run()`` call
3. the mapped object is converted to an object that the target
interpreter may safely use
4. that object is bound to the mapped name in the target interpreter's
``__main__`` module, where the running code has access to it
The critical part is what happens in step 3. The object must be
converted to some cross-interpreter-safe data (its raw data or even
a pointer). Then that data must be converted back into an object
for the target interpreter to use, likely a new object. For example,
an ``int`` object could be converted to the underlying C ``long`` value
and then back into a Python ``int`` object.
To make this work, the intermediate data (and any associated mutable
shared state) will be managed by the Python runtime, not by any of the
interpreters.
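The conversion described in step 3 can be modeled roughly in pure Python. This is only a sketch under stated assumptions: the real conversion would happen inside the C runtime, and the names ``to_raw``, ``from_raw``, ``_ENCODERS`` and ``_DECODERS`` are hypothetical, not part of the proposal. It covers only the four initially shareable types and rejects everything else with ``ValueError``, mirroring the behavior described for the "shared" argument.

```python
# Hypothetical pure-Python model of step 3. The real mechanism would be
# implemented in the C runtime; these names are illustrative only.
_ENCODERS = {
    type(None): lambda obj: b'',
    bytes: lambda obj: obj,
    str: lambda obj: obj.encode('utf-8'),
    # One extra bit of room so the sign always fits.
    int: lambda obj: obj.to_bytes((obj.bit_length() + 8) // 8, 'big', signed=True),
}

_DECODERS = {
    'NoneType': lambda data: None,
    'bytes': lambda data: data,
    'str': lambda data: data.decode('utf-8'),
    'int': lambda data: int.from_bytes(data, 'big', signed=True),
}


def to_raw(obj):
    """Reduce an object to (type tag, raw bytes) that no interpreter owns."""
    try:
        # Exact type lookup, so subclasses such as bool are rejected.
        encode = _ENCODERS[type(obj)]
    except KeyError:
        raise ValueError(f'{type(obj).__name__} is not shareable') from None
    return type(obj).__name__, encode(obj)


def from_raw(tag, data):
    """Build a new object in the "target interpreter" from the raw data."""
    return _DECODERS[tag](data)
```

In this model the target side always receives a new object built from the raw data, which matches the statement that "sharing" currently means providing a copy.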
The underlying runtime capability that ``Interpreter.run()`` uses is
what enables data/object "sharing", and is available for use elsewhere
in the runtime. In fact, it was used in the implementation of the
"channels" that were part of an earlier version of this PEP.
Likewise, this runtime functionality facilitates most of the possible
solutions to which `Shared Data`_ alluded. Thus any separate effort
to introduce effective means for communicating and sharing data will
be well served by the underlying functionality proposed here.
.. XXX Add Interpreter.set_on___main__() and drop the "shared" arg?
Communicating Through OS Pipes
''''''''''''''''''''''''''''''
As noted, this proposal enables a very basic mechanism for
communicating between interpreters, which makes use of
``Interpreter.run()`` and shareable objects:
``Interpreter.run()``:
1. interpreter A calls ``os.pipe()`` to get a read/write pair
of file descriptors (both shareable ``int`` objects)
2. interpreter A calls ``run()`` on interpreter B, passing
the read FD via the "shared" argument
of file descriptors (both ``int`` objects)
2. interpreter A calls ``run()`` on interpreter B, including
the read FD via string formatting
3. interpreter A writes some bytes to the write FD
4. interpreter B reads those bytes
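The four steps above can be sketched as follows. Because the proposed ``interpreters`` module cannot be assumed to be importable, this sketch substitutes a thread running ``exec()`` for interpreter B; the helper ``run_in_stand_in`` is hypothetical and is not the proposed API. The only point illustrated is that the read FD reaches the other side as text inside the source string.

```python
import os
import textwrap
import threading


def run_in_stand_in(source, namespace):
    """Hypothetical stand-in for Interpreter.run(): exec the source in a thread."""
    worker = threading.Thread(target=exec, args=(source, namespace))
    worker.start()
    return worker


# Step 1: "interpreter A" creates the pipe; both ends are plain ints.
r, s = os.pipe()

# Step 2: the read FD is embedded in the source text via an f-string.
source = textwrap.dedent(f"""
    import os
    received = os.read({r}, 5)
    """)
namespace = {}
worker = run_in_stand_in(source, namespace)

# Step 3: A writes some bytes to the write FD.
os.write(s, b'hello')

# Step 4: "B" reads those bytes; wait for it to finish, then clean up.
worker.join()
os.close(r)
os.close(s)
result = namespace['received']
```

With real interpreters the namespace would not be visible to the caller, so any reply would have to travel back over a second pipe, as in the file descriptor example.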
@ -890,6 +796,20 @@ a message that clearly says it is due to missing multiple interpreters
compatibility and that extensions are not required to provide it. This
will help set user expectations properly.
Alternative Solutions
=====================
One possible alternative to a new module is to add support for interpreters
to ``concurrent.futures``. There are several reasons why that wouldn't work:
* the obvious place to look for multiple interpreters support
is an "interpreters" module, much as with "threading", etc.
* ``concurrent.futures`` is all about executing functions
but currently we don't have a good way to run a function
from one interpreter in another
Similar reasoning applies for support in the ``multiprocessing`` module.
Deferred Functionality
======================
@ -899,6 +819,14 @@ functionality has been left out for future consideration. Note that
this is not a judgement against any of said capability, but rather a
deferment. That said, each is arguably valid.
Shareable Objects
-----------------
Earlier versions of this proposal included a mechanism by which the
data underlying a given object could be passed to another interpreter
or even shared, even if the object can't be. Without channels there
isn't enough benefit to keep the concept of shareable objects around.
Interpreter.call()
------------------
@ -1144,11 +1072,6 @@ code (i.e. a script). This is equivalent to ``PyRun_StringFlags()``,
``exec()``, or a module body. None of those "return" anything. We can
revisit this once ``run()`` supports functions, etc.
Add a "tp_share" type slot
--------------------------
This would replace the current global registry for shareable types.
Add a shareable synchronization primitive
-----------------------------------------