PEP 432: Flesh out a design for main execution
This commit is contained in:
parent
38af865f01
commit
cc027218c7
231
pep-0432.txt
|
@ -43,9 +43,11 @@ well-defined phases during the startup sequence:
|
|||
* Initializing - interpreter partially available
|
||||
* Initialized - interpreter available, __main__ related metadata
|
||||
incomplete
|
||||
* Main Execution - __main__ related metadata populated, bytecode
|
||||
executing in the __main__ module namespace (embedding applications
|
||||
may choose not to use this phase)
|
||||
* Main Preparation - __main__ related metadata populated
|
||||
* Main Execution - bytecode executing in the __main__ module namespace
|
||||
|
||||
(Embedding applications may choose not to use the Main Preparation and
|
||||
Execution phases)
|
||||
|
||||
As a concrete use case to help guide any design changes, and to solve a known
|
||||
problem where the appropriate defaults for system utilities differ from those
|
||||
|
@ -78,21 +80,21 @@ complicated, offering more options, as well as performing more complex tasks
|
|||
(such as configuring the Unicode settings for OS interfaces in Python 3 as
|
||||
well as bootstrapping a pure Python implementation of the import system).
|
||||
|
||||
Much of this complexity is accessible only through the ``Py_Main`` and
|
||||
``Py_Initialize`` APIs, offering embedding applications little opportunity
|
||||
for customisation. This creeping complexity also makes life difficult for
|
||||
maintainers, as much of the configuration needs to take place prior to the
|
||||
``Py_Initialize`` call, meaning much of the Python C API cannot be used
|
||||
safely.
|
||||
Much of this complexity is formally accessible only through the ``Py_Main``
|
||||
and ``Py_Initialize`` APIs, offering embedding applications little
|
||||
opportunity for customisation. This creeping complexity also makes life
|
||||
difficult for maintainers, as much of the configuration needs to take
|
||||
place prior to the ``Py_Initialize`` call, meaning much of the Python C
|
||||
API cannot be used safely.
|
||||
|
||||
A number of proposals are on the table for even *more* sophisticated
|
||||
startup behaviour, such as an isolated mode equivalent to that described in
|
||||
this PEP as a "system Python" [6_], better control over ``sys.path``
|
||||
initialization (easily adding additional directories on the command line
|
||||
in a cross-platform fashion [7_], as well as controlling the configuration of
|
||||
``sys.path[0]`` [8_]), easier configuration of utilities like coverage tracing
|
||||
when launching Python subprocesses [9_], and easier control of the encoding
|
||||
used for the standard IO streams when embedding CPython in a larger
|
||||
``sys.path[0]`` [8_]), easier configuration of utilities like coverage
|
||||
tracing when launching Python subprocesses [9_], and easier control of the
|
||||
encoding used for the standard IO streams when embedding CPython in a larger
|
||||
application [10_].
|
||||
|
||||
Rather than attempting to bolt such behaviour onto an already complicated
|
||||
|
@ -118,8 +120,8 @@ to ``Py_Initialize`` when the ``-X`` or ``-W`` options are used [1_].
|
|||
|
||||
By moving to an explicitly multi-phase startup sequence, developers should
|
||||
only need to understand which features are not available in the core
|
||||
bootstrapping state, as the vast majority of the configuration process
|
||||
will now take place in that state.
|
||||
bootstrapping phase, as the vast majority of the configuration process
|
||||
will now take place during that phase.
|
||||
|
||||
By basing the new design on a combination of C structures and Python
|
||||
data types, it should also be easier to modify the system in the
|
||||
|
@ -504,14 +506,13 @@ CPython command line application.
|
|||
Interpreter Initialization Phases
|
||||
---------------------------------
|
||||
|
||||
Four distinct phases are proposed:
|
||||
Five distinct phases are proposed:
|
||||
|
||||
* Pre-Initialization:
|
||||
|
||||
* no interpreter is available.
|
||||
* ``Py_IsInitializing()`` returns ``0``
|
||||
* ``Py_IsInitialized()`` returns ``0``
|
||||
* ``Py_IsRunningMain()`` returns ``0``
|
||||
* The embedding application determines the settings required to create the
|
||||
main interpreter and moves to the next phase by calling
|
||||
``Py_BeginInitialization``.
|
||||
|
@ -521,7 +522,6 @@ Four distinct phases are proposed:
|
|||
* the main interpreter is available, but only partially configured.
|
||||
* ``Py_IsInitializing()`` returns ``1``
|
||||
* ``Py_IsInitialized()`` returns ``0``
|
||||
* ``Py_RunningMain()`` returns ``0``
|
||||
* The embedding application determines and applies the settings
|
||||
required to complete the initialization process by calling
|
||||
``Py_ReadConfiguration`` and ``Py_EndInitialization``.
|
||||
|
@ -529,26 +529,28 @@ Four distinct phases are proposed:
|
|||
* Initialized:
|
||||
|
||||
* the main interpreter is available and fully operational, but
|
||||
``__main__`` related metadata is incomplete and the site module may
|
||||
not have been imported.
|
||||
``__main__`` related metadata is incomplete
|
||||
* ``Py_IsInitializing()`` returns ``0``
|
||||
* ``Py_IsInitialized()`` returns ``1``
|
||||
* ``Py_IsRunningMain()`` returns ``0``
|
||||
* Optionally, the embedding application may identify and begin
|
||||
executing code in the ``__main__`` module namespace by calling
|
||||
``Py_RunPathAsMain``, ``Py_RunModuleAsMain`` or ``Py_RunStreamAsMain``.
|
||||
``PyRun_PrepareMain`` and ``PyRun_ExecMain``.
|
||||
|
||||
* Main Preparation:
|
||||
|
||||
* subphase of Initialized (not separately identified at runtime)
|
||||
* fully populates ``__main__`` related metadata
|
||||
* may execute code in ``__main__`` namespace (e.g. ``PYTHONSTARTUP``)
|
||||
|
||||
* Main Execution:
|
||||
|
||||
* bytecode is being executed in the ``__main__`` namespace
|
||||
* ``Py_IsInitializing()`` returns ``0``
|
||||
* ``Py_IsInitialized()`` returns ``1``
|
||||
* ``Py_IsRunningMain()`` returns ``1``
|
||||
* subphase of Initialized (not separately identified at runtime)
|
||||
* user supplied bytecode is being executed in the ``__main__`` namespace
|
||||
|
||||
As indicated by the phase reporting functions, main module execution is
|
||||
an optional subphase of Initialized rather than a completely distinct phase.
|
||||
As noted above, main module preparation and execution are optional subphases
|
||||
of Initialized rather than completely distinct phases.
|
||||
|
||||
All 4 phases will be used by the standard CPython interpreter and the
|
||||
All listed phases will be used by the standard CPython interpreter and the
|
||||
proposed System Python interpreter. Other embedding applications may
|
||||
choose to skip the step of executing code in the ``__main__`` namespace.
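The runtime-visible part of the phase list can be modelled as a small state machine. The following is only a sketch: the ``sketch_*`` names are hypothetical, it does not call any CPython API, and it assumes (as the list states) that main preparation and execution are subphases of Initialized that cannot be distinguished at runtime.

```c
/* Hypothetical model of the runtime-visible phases.
 * None of these names exist in any CPython header. */
typedef enum {
    SKETCH_PRE_INITIALIZATION,
    SKETCH_INITIALIZING,
    SKETCH_INITIALIZED   /* includes the Main Preparation/Execution subphases */
} sketch_phase;

/* Models the value the text gives for Py_IsInitializing() in each phase. */
static int sketch_is_initializing(sketch_phase phase)
{
    return phase == SKETCH_INITIALIZING;
}

/* Models the value the text gives for Py_IsInitialized() in each phase. */
static int sketch_is_initialized(sketch_phase phase)
{
    return phase == SKETCH_INITIALIZED;
}

/* Move to the next phase. Returns 0 on success, or -1 if the
 * interpreter is already in the final (Initialized) phase. */
static int sketch_advance(sketch_phase *phase)
{
    if (*phase == SKETCH_INITIALIZED) {
        return -1;
    }
    *phase = (sketch_phase)(*phase + 1);
    return 0;
}
```

In this model, the two advancing steps stand in for ``Py_BeginInitialization`` and ``Py_EndInitialization`` respectively.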
|
||||
|
||||
|
@ -817,15 +819,9 @@ data types (not set == ``NULL``) or numeric flags (not set == ``-1``)::
|
|||
/* Filesystem access */
|
||||
PyUnicodeObject *fs_encoding;
|
||||
|
||||
/* Interactive interpreter */
|
||||
int stdin_is_interactive; /* Force interactive behaviour */
|
||||
int inspect_main; /* -i switch, PYTHONINSPECT */
|
||||
PyUnicodeObject *startup_file; /* PYTHONSTARTUP */
|
||||
|
||||
/* Debugging output */
|
||||
int debug_parser; /* -d switch, PYTHONDEBUG */
|
||||
int verbosity; /* -v switch */
|
||||
int suppress_banner; /* -q switch */
|
||||
|
||||
/* Code generation */
|
||||
int bytes_warnings; /* -b switch */
|
||||
|
@ -833,6 +829,32 @@ data types (not set == ``NULL``) or numeric flags (not set == ``-1``)::
|
|||
|
||||
/* Signal handling */
|
||||
int install_sig_handlers;
|
||||
|
||||
/* Implicit execution */
|
||||
PyUnicodeObject *startup_file; /* PYTHONSTARTUP */
|
||||
|
||||
/* Main module
|
||||
*
|
||||
* If prepare_main is set, at most one of the main_* settings should
|
||||
* be set before calling PyRun_PrepareMain (Py_ReadConfiguration will
|
||||
* set one of them based on the command line arguments if prepare_main
|
||||
* is non-zero when that API is called).
*/
|
||||
int prepare_main;
|
||||
PyUnicodeObject *main_source; /* -c switch */
|
||||
PyUnicodeObject *main_path; /* filesystem path */
|
||||
PyUnicodeObject *main_module; /* -m switch */
|
||||
PyCodeObject *main_code; /* Run directly from a code object */
|
||||
PyObject *main_stream; /* Run from stream */
|
||||
int run_implicit_code; /* Run implicit code during prep */
|
||||
|
||||
/* Interactive main
|
||||
*
|
||||
* Note: Settings related to interactive mode are very much in flux.
|
||||
*/
|
||||
PyObject *prompt_stream; /* Output interactive prompt */
|
||||
int show_banner; /* -q switch (inverted) */
|
||||
int inspect_main; /* -i switch, PYTHONINSPECT */
|
||||
|
||||
} Py_Config;
|
||||
|
||||
|
||||
|
@ -844,17 +866,19 @@ data types (not set == ``NULL``) or numeric flags (not set == ``-1``)::
|
|||
#define _Py_ImportConfig_INIT -1, -1, NULL
|
||||
#define _Py_StreamConfig_INIT -1, NULL, NULL, NULL, NULL, NULL, NULL
|
||||
#define _Py_FilesystemConfig_INIT NULL
|
||||
#define _Py_InteractiveConfig_INIT -1, -1, NULL
|
||||
#define _Py_DebuggingConfig_INIT -1, -1, -1
|
||||
#define _Py_CodeGenConfig_INIT -1, -1
|
||||
#define _Py_SignalConfig_INIT -1
|
||||
#define _Py_ImplicitConfig_INIT NULL
|
||||
#define _Py_MainConfig_INIT -1, NULL, NULL, NULL, NULL, NULL, -1
|
||||
#define _Py_InteractiveConfig_INIT NULL, -1, -1
|
||||
|
||||
#define Py_Config_INIT {_Py_ArgConfig_INIT, _Py_LocationConfig_INIT,
|
||||
_Py_SiteConfig_INIT, _Py_ImportConfig_INIT,
|
||||
_Py_StreamConfig_INIT, _Py_FilesystemConfig_INIT,
|
||||
_Py_InteractiveConfig_INIT,
|
||||
_Py_DebuggingConfig_INIT, _Py_CodeGenConfig_INIT,
|
||||
_Py_SignalConfig_INIT}
|
||||
_Py_SignalConfig_INIT, _Py_ImplicitConfig_INIT,
|
||||
_Py_MainConfig_INIT, _Py_InteractiveConfig_INIT}
|
||||
|
||||
<TBD: did I miss anything?>
|
||||
|
||||
|
@ -893,6 +917,11 @@ incomplete:
|
|||
* it will be the same as ``sys.path[0]`` rather than the location of
|
||||
the ``__main__`` module when executing a valid ``sys.path`` entry
|
||||
(typically a zipfile or directory)
|
||||
* otherwise, it will be accurate:
|
||||
|
||||
* the script name if running an ordinary script
|
||||
* ``-c`` if executing a supplied string
|
||||
* ``-`` or the empty string if running from stdin
|
||||
|
||||
* the metadata in the ``__main__`` module will still indicate it is a
|
||||
builtin module
|
||||
|
@ -904,21 +933,103 @@ behaviour, as well as eliminating any side effects on global state if
|
|||
``import site`` is later explicitly executed in the process.
|
||||
|
||||
|
||||
Preparing the main module
|
||||
-------------------------
|
||||
|
||||
This subphase completes the population of the ``__main__`` module
|
||||
related metadata, without actually starting execution of the ``__main__``
|
||||
module code.
|
||||
|
||||
It is handled by calling the following API::
|
||||
|
||||
int PyRun_PrepareMain();
|
||||
|
||||
The actual processing is driven by the main related settings stored in
|
||||
the interpreter state as part of the configuration struct.
|
||||
|
||||
If ``prepare_main`` is zero, this call does nothing.
|
||||
|
||||
If all of ``main_source``, ``main_path``, ``main_module``,
|
||||
``main_stream`` and ``main_code`` are NULL, this call does nothing.
|
||||
|
||||
If more than one of ``main_source``, ``main_path``, ``main_module``,
|
||||
``main_stream`` or ``main_code`` is set, ``RuntimeError`` will be reported.
|
||||
|
||||
If ``main_code`` is already set, then this call does nothing.
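The early-exit rules above could be checked in the order they are listed. This is a minimal sketch, not the reference implementation: ``sketch_main_config`` and ``sketch_check_prepare`` are hypothetical names, and ``const void *`` stands in for the ``PyObject``-based field types so that the example compiles without ``Python.h``.

```c
#include <stddef.h>

/* Hypothetical stand-in for the main-related fields of Py_Config. */
typedef struct {
    int prepare_main;
    const void *main_source;
    const void *main_path;
    const void *main_module;
    const void *main_code;
    const void *main_stream;
} sketch_main_config;

typedef enum {
    SKETCH_PREP_SKIP,     /* the call does nothing */
    SKETCH_PREP_ERROR,    /* RuntimeError would be reported */
    SKETCH_PREP_PROCEED   /* preparation work is needed */
} sketch_prep_action;

/* Apply the documented checks in the order the text lists them. */
static sketch_prep_action sketch_check_prepare(const sketch_main_config *c)
{
    int set_count = (c->main_source != NULL) + (c->main_path != NULL)
                  + (c->main_module != NULL) + (c->main_code != NULL)
                  + (c->main_stream != NULL);

    if (c->prepare_main == 0) {
        return SKETCH_PREP_SKIP;
    }
    if (set_count == 0) {
        return SKETCH_PREP_SKIP;
    }
    if (set_count > 1) {
        return SKETCH_PREP_ERROR;
    }
    if (c->main_code != NULL) {
        return SKETCH_PREP_SKIP;   /* code object already available */
    }
    return SKETCH_PREP_PROCEED;
}
```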
|
||||
|
||||
If ``main_stream`` is set, and ``run_implicit_code`` is also set, then
|
||||
the file identified in ``startup_file`` will be read, compiled and
|
||||
executed in the ``__main__`` namespace.
|
||||
|
||||
If ``main_source``, ``main_path`` or ``main_module`` are set, then this
|
||||
call will take whatever steps are needed to populate ``main_code``:
|
||||
|
||||
* For ``main_source``, the supplied string will be compiled and saved to
|
||||
``main_code``.
|
||||
|
||||
* For ``main_path``:
|
||||
* if the supplied path is recognised as a valid ``sys.path`` entry, it
|
||||
is inserted as ``sys.path[0]``, ``main_module`` is set
|
||||
to ``__main__`` and processing continues as for ``main_module`` below.
|
||||
* otherwise, the path is read as a CPython bytecode file
|
||||
* if that fails, it is read as a Python source file and compiled
|
||||
* in the latter two cases, the code object is saved to ``main_code``
|
||||
and ``__main__.__file__`` is set appropriately
|
||||
|
||||
* For ``main_module``:
|
||||
* any parent package is imported
|
||||
* the loader for the module is determined
|
||||
* if the loader indicates the module is a package, add ``.__main__`` to
|
||||
the end of ``main_module`` and try again (if the final name segment
|
||||
is already ``.__main__`` then fail immediately)
|
||||
* once the module source code is located, save the compiled module code
|
||||
as ``main_code`` and populate the following attributes in ``__main__``
|
||||
appropriately: ``__name__``, ``__loader__``, ``__file__``,
|
||||
``__cached__``, ``__package__``.
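The package fallback for ``main_module`` (append ``.__main__`` and retry, failing if the name already ends that way) comes down to a small string operation. The helper below is a hypothetical illustration of that one rule only. It works on plain C strings rather than ``PyUnicodeObject`` values and does not consult any loader.

```c
#include <string.h>

/* Write "<name>.__main__" into out.
 * Returns 0 on success, or -1 if name already ends in ".__main__"
 * (the "fail immediately" case) or if out is too small. */
static int sketch_package_main_name(const char *name, char *out, size_t out_size)
{
    static const char suffix[] = ".__main__";
    size_t name_len = strlen(name);
    size_t suffix_len = sizeof(suffix) - 1;

    if (name_len >= suffix_len
            && strcmp(name + name_len - suffix_len, suffix) == 0) {
        return -1;
    }
    if (name_len + suffix_len + 1 > out_size) {
        return -1;
    }
    memcpy(out, name, name_len);
    memcpy(out + name_len, suffix, suffix_len + 1);  /* copies the NUL too */
    return 0;
}
```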
|
||||
|
||||
|
||||
(Note: the behaviour described in this section isn't new; it's a write-up
|
||||
of the current behaviour of the CPython interpreter adjusted for the new
|
||||
configuration system)
|
||||
|
||||
|
||||
Executing the main module
|
||||
-------------------------
|
||||
|
||||
<TBD>
|
||||
This subphase covers the execution of the actual ``__main__`` module code.
|
||||
|
||||
Initial thought is that hiding the various options behind a single API
|
||||
would make that API too complicated, so 3 separate APIs is more likely::
|
||||
It is handled by calling the following API::
|
||||
|
||||
Py_RunPathAsMain
|
||||
Py_RunModuleAsMain
|
||||
Py_RunStreamAsMain
|
||||
int PyRun_ExecMain();
|
||||
|
||||
Query API to indicate that ``sys.argv[0]`` is fully populated::
|
||||
The actual processing is driven by the main related settings stored in
|
||||
the interpreter state as part of the configuration struct.
|
||||
|
||||
If both ``main_stream`` and ``main_code`` are NULL, this call does nothing.
|
||||
|
||||
If both ``main_stream`` and ``main_code`` are set, ``RuntimeError`` will
|
||||
be reported.
|
||||
|
||||
If ``main_stream`` and ``prompt_stream`` are both set, main execution will
|
||||
be delegated to a new API::
|
||||
|
||||
int PyRun_InteractiveMain(PyObject *input, PyObject* output);
|
||||
|
||||
If ``main_stream`` is set and ``prompt_stream`` is NULL, main execution will
|
||||
be delegated to a new API::
|
||||
|
||||
int PyRun_StreamInMain(PyObject *input);
|
||||
|
||||
If ``main_code`` is set, main execution will be delegated to a new
|
||||
API::
|
||||
|
||||
int PyRun_CodeInMain(PyCodeObject *code);
|
||||
|
||||
After execution of main completes, if ``inspect_main`` is set, or
|
||||
the ``PYTHONINSPECT`` environment variable has been set, then
|
||||
``PyRun_ExecMain`` will invoke
|
||||
``PyRun_InteractiveMain(sys.__stdin__, sys.__stdout__)``.
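The delegation rules for ``PyRun_ExecMain`` can be summarised as a routing function. This is a sketch under stated assumptions: ``sketch_route_exec`` and the enum are hypothetical names, ``const void *`` replaces the object pointer types, and the post-execution ``inspect_main`` step is left out.

```c
#include <stddef.h>

typedef enum {
    SKETCH_EXEC_NOTHING,      /* both inputs NULL: the call does nothing */
    SKETCH_EXEC_ERROR,        /* both inputs set: RuntimeError */
    SKETCH_EXEC_INTERACTIVE,  /* would delegate to PyRun_InteractiveMain */
    SKETCH_EXEC_STREAM,       /* would delegate to PyRun_StreamInMain */
    SKETCH_EXEC_CODE          /* would delegate to PyRun_CodeInMain */
} sketch_exec_route;

/* Decide which proposed API main execution would be delegated to. */
static sketch_exec_route sketch_route_exec(const void *main_stream,
                                           const void *main_code,
                                           const void *prompt_stream)
{
    if (main_stream == NULL && main_code == NULL) {
        return SKETCH_EXEC_NOTHING;
    }
    if (main_stream != NULL && main_code != NULL) {
        return SKETCH_EXEC_ERROR;
    }
    if (main_stream != NULL) {
        return prompt_stream != NULL ? SKETCH_EXEC_INTERACTIVE
                                     : SKETCH_EXEC_STREAM;
    }
    return SKETCH_EXEC_CODE;
}
```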
|
||||
|
||||
Py_IsRunningMain()
|
||||
|
||||
Internal Storage of Configuration Data
|
||||
--------------------------------------
|
||||
|
@ -931,7 +1042,7 @@ structs.
|
|||
For debugging purposes, the configuration settings will be exposed as
|
||||
a ``sys._configuration`` simple namespace (similar to ``sys.flags`` and
|
||||
``sys.implementation``). Field names will match those in the configuration
|
||||
structs, exception for ``hash_seed``, which will be deliberately excluded.
|
||||
structs, except for ``hash_seed``, which will be deliberately excluded.
|
||||
|
||||
An underscored attribute is chosen deliberately, as these configuration
|
||||
settings are part of the CPython implementation, rather than part of the
|
||||
|
@ -941,16 +1052,19 @@ should be agreed with the other implementations and exposed as new required
|
|||
attributes on ``sys.implementation``, as described in PEP 421.
|
||||
|
||||
These are *snapshots* of the initial configuration settings. They are not
|
||||
consulted by the interpreter during runtime.
|
||||
modified by the interpreter during runtime (except as noted above).
|
||||
|
||||
|
||||
Stable ABI
|
||||
----------
|
||||
|
||||
All of the APIs proposed in this PEP are excluded from the stable ABI, as
|
||||
Most of the APIs proposed in this PEP are excluded from the stable ABI, as
|
||||
embedding a Python interpreter involves a much higher degree of coupling
|
||||
than merely writing an extension.
|
||||
|
||||
The only newly exposed API that will be part of the stable ABI is the
|
||||
``Py_IsInitializing()`` query.
|
||||
|
||||
|
||||
Build time configuration
|
||||
------------------------
|
||||
|
@ -1038,27 +1152,32 @@ Open Questions
|
|||
|
||||
* Error details for Py_ReadConfiguration and Py_EndInitialization (these
|
||||
should become clear as the implementation progresses)
|
||||
* Is ``Py_IsRunningMain()`` worth keeping?
|
||||
* Should the answers to ``Py_IsInitialized()`` and ``Py_IsRunningMain()`` be
|
||||
exposed via the ``sys`` module?
|
||||
* Is the ``Py_Config`` struct too unwieldy to be practical? Would a Python
|
||||
dictionary be a better choice?
|
||||
* Should there be ``Py_PreparingMain()`` and ``Py_RunningMain()`` query APIs?
|
||||
* Should the answer to ``Py_IsInitialized()`` be exposed via the ``sys``
|
||||
module?
|
||||
* Is initialisation of the ``Py_Config`` struct too unwieldy to be
|
||||
maintainable? Would a Python dictionary be a better choice, despite
|
||||
being harder to work with from C code?
|
||||
* Would it be better to manage the flag variables in ``Py_Config`` as
|
||||
Python integers or as "negative means false, positive means true, zero
|
||||
means not set" so the struct can be initialized with a simple
|
||||
``memset(&config, 0, sizeof(config))``, eliminating the need to update
|
||||
both ``Py_Config`` and ``Py_Config_INIT`` when adding new fields?
|
||||
* The name of the system Python executable is a bikeshed waiting to be
|
||||
* The name of the new system Python executable is a bikeshed waiting to be
|
||||
painted. The 3 options considered so far are ``spython``, ``pysystem``
|
||||
and ``python-minimal``. The PEP text reflects my current preferred choice
|
||||
i.e. ``pysystem``.
|
||||
(``pysystem``).
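To make the flag-encoding question above concrete, here is a sketch of the "negative means false, positive means true, zero means not set" alternative. The struct and helper names are hypothetical and only loosely mirror ``Py_Config``. Note that ``memset`` to zero yields NULL pointers on mainstream platforms, although the C standard does not strictly guarantee that.

```c
#include <string.h>

/* Hypothetical cut-down configuration struct using tri-state flags. */
typedef struct {
    int show_banner;            /* <0 false, >0 true, 0 not set */
    int inspect_main;           /* <0 false, >0 true, 0 not set */
    const char *startup_file;   /* all-bits-zero is NULL on common platforms */
} sketch_config;

/* A flag counts as set if it is non-zero. */
static int sketch_flag_is_set(int flag)
{
    return flag != 0;
}

/* Resolve a flag to 0 or 1, using default_value when it is not set. */
static int sketch_flag_value(int flag, int default_value)
{
    if (flag == 0) {
        return default_value;
    }
    return flag > 0;
}
```

With this encoding, a caller can zero the whole struct once and then override only the fields it cares about, so adding a field would not require touching a separate initializer macro.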
|
||||
|
||||
|
||||
Implementation
|
||||
==============
|
||||
|
||||
The reference implementation is being developed as a feature branch in my
|
||||
BitBucket sandbox [2_].
|
||||
BitBucket sandbox [2_]. Pull requests to fix the inevitably broken
|
||||
Windows builds are welcome, but the basic design is still in too much flux
|
||||
for other pull requests to be feasible just yet. Once the overall design
|
||||
settles down and it's a matter of migrating individual settings over to
|
||||
the new design, that level of collaboration should become more practical.
|
||||
|
||||
As the number of application binaries created by the build process is now
|
||||
four, the reference implementation also creates a new top level "Apps"
|
||||
|
|