PEP 432: Update based on the extracted PEP 587 API (#965)

The overall PEP 432 design is still a work in progress,
but the parts that Victor extracted out to PEP 587 should
be pretty solid at this point.
Nick Coghlan 2019-04-16 23:58:12 +10:00 committed by GitHub
parent 7a0b6e7ef0
commit 1b0e8221be
2 changed files with 274 additions and 211 deletions


@ -2,14 +2,17 @@ PEP: 432
Title: Restructuring the CPython startup sequence Title: Restructuring the CPython startup sequence
Version: $Revision$ Version: $Revision$
Last-Modified: $Date$ Last-Modified: $Date$
Author: Nick Coghlan <ncoghlan@gmail.com> Author: Nick Coghlan <ncoghlan@gmail.com>,
Victor Stinner <vstinner@redhat.com>,
Eric Snow <ericsnowcurrently@gmail.com>
Discussions-To: capi-sig@python.org Discussions-To: capi-sig@python.org
Status: Draft Status: Draft
Type: Standards Track Type: Standards Track
Content-Type: text/x-rst Content-Type: text/x-rst
Requires: 587
Created: 28-Dec-2012 Created: 28-Dec-2012
Python-Version: 3.9 Python-Version: 3.9
Post-History: 28-Dec-2012, 2-Jan-2013 Post-History: 28-Dec-2012, 2-Jan-2013, 30-Mar-2019
Abstract Abstract
@ -24,7 +27,7 @@ embedding it as a Python execution engine inside a larger application.
When implementation of this proposal is completed, interpreter startup will When implementation of this proposal is completed, interpreter startup will
consist of three clearly distinct and independently configurable phases: consist of three clearly distinct and independently configurable phases:
* Python core runtime preconfiguration * Python core runtime preinitialization
* setting up memory management * setting up memory management
* determining the encodings used for system interfaces (including settings * determining the encodings used for system interfaces (including settings
@ -57,12 +60,35 @@ explored by way of private updates to the initialisation API used by the main
CPython CLI application. CPython CLI application.
As long as that is the case, the Discussions-To header in the PEP will remain As long as that is the case, the Discussions-To header in the PEP will remain
set to ``capi-sig@python.org``. Once a coherent and feasible proposal for a new set to ``capi-sig@python.org``.
public API has been developed, then that PEP header will be removed, and the
PEP will be submitted back to ``python-dev`` for further review and discussion. However, as coherent and feasible proposals to address subsets of the problem
(Python 3.9 currently seems like a more plausible time frame for that than are identified, they may be spun out as separate PEPs, which will then be
Python 3.8 - if that changes, then the target Python version in the PEP will be added to this PEP as dependencies.
adjusted accordingly)
So far, the following subproposals have been extracted:
* PEP 587 (Python Initialization Configuration): this splits out the
preinitialization stage, but otherwise keeps the combined ``Py_Initialize*``
model that configures a Python interpreter with a ready-to-run ``__main__``
module in a single step. While this is still a nice improvement over the
pre-PEP status quo and gets embedding applications back on an equal footing
with the private APIs exposed to the default CPython CLI in the Python 3.7
release, it doesn't yet allow for more of the configuration code to be
migrated out of C and into frozen Python modules.
For PEP 432 itself, it is now expected that once enough subproposals are
eventually split out, the only things left to be proposed in PEP 432 itself will
be actually shipping a ``system-python`` executable (which would run in isolated
mode by default and ignore all user-level settings) and enhancing the ``zipapp``
module to support the creation of single-file executables from pure Python
scripts. Once that is the case, the ``Discussions-To`` PEP header will be
removed, and the PEP will be submitted back to ``python-dev`` for further
discussion.
In the meantime, the PEP will continue to be used as a draft specification for
the aspects of the proposed startup sequence redesign that don't yet have their
own dedicated PEP.
Proposal Proposal
@ -71,7 +97,7 @@ Proposal
This PEP proposes that initialization of the CPython runtime be split into This PEP proposes that initialization of the CPython runtime be split into
three clearly distinct phases: three clearly distinct phases:
* core runtime preconfiguration * core runtime preinitialization
* core runtime initialization * core runtime initialization
* main interpreter configuration * main interpreter configuration
@ -87,15 +113,23 @@ The proposed design also has significant implications for:
In the new design, the interpreter will move through the following In the new design, the interpreter will move through the following
well-defined phases during the initialization sequence: well-defined phases during the initialization sequence:
* Uninitialized - haven't even started the pre-initialization phase yet
* Pre-Initialization - no interpreter available * Pre-Initialization - no interpreter available
* Runtime Initialized - main interpreter partially available, * Runtime Initialized - main interpreter partially available,
subinterpreter creation not yet available subinterpreter creation not yet available
* Initialized - main interpreter fully available, subinterpreter creation * Initialized - main interpreter fully available, subinterpreter creation
available available
PEP 587 is a more detailed proposal that covers separating out the
Pre-Initialization phase from the last two phases, but doesn't allow embedding
applications to run arbitrary code while in the "Runtime Initialized" state
(instead, initializing the core runtime will also always fully initialize the
main interpreter, as that's the way the native CPython CLI still works in
Python 3.8).
As a concrete use case to help guide any design changes, and to solve a known As a concrete use case to help guide any design changes, and to solve a known
problem where the appropriate defaults for system utilities differ from those problem where the appropriate defaults for system utilities differ from those
for running user scripts, this PEP also proposes the creation and for running user scripts, this PEP proposes the creation and
distribution of a separate system Python (``system-python``) executable distribution of a separate system Python (``system-python``) executable
which, by default, operates in "isolated mode" (as selected by the CPython which, by default, operates in "isolated mode" (as selected by the CPython
``-I`` switch), as well as the creation of an example stub binary that just ``-I`` switch), as well as the creation of an example stub binary that just
@ -109,9 +143,9 @@ occur in order to make the startup sequence easier to maintain is already a
substantial change, and attempting to make those other changes at the same time substantial change, and attempting to make those other changes at the same time
will make the change significantly more invasive and much harder to review. will make the change significantly more invasive and much harder to review.
However, such proposals may be suitable topics for follow-on PEPs or patches However, such proposals may be suitable topics for follow-on PEPs or patches
- one key benefit of this PEP is decreasing the coupling between the internal - one key benefit of this PEP and its related subproposals is decreasing the
storage model and the configuration interface, so such changes should be easier coupling between the internal storage model and the configuration interface,
once this PEP has been implemented. so such changes should be easier once this PEP has been implemented.
Background Background
@ -143,6 +177,11 @@ system indefinitely, this PEP proposes to start simplifying the status quo by
introducing a more structured startup sequence, with the aim of making these introducing a more structured startup sequence, with the aim of making these
further feature requests easier to implement. further feature requests easier to implement.
Originally the entire proposal was maintained in this one PEP, but that proved
impractical, so as parts of the proposed design stabilised, they are now split
out into their own PEPs, allowing progress to be made, even while the details
of the overall design are still evolving.
Key Concerns Key Concerns
============ ============
@ -154,16 +193,26 @@ needs to take into account.
Maintainability Maintainability
--------------- ---------------
The current CPython startup sequence is difficult to understand, and even The CPython startup sequence as of Python 3.6 was difficult to understand, and
more difficult to modify. It is not clear what state the interpreter is in even more difficult to modify. It was not clear what state the interpreter was
while much of the initialization code executes, leading to behaviour such in while much of the initialization code executed, leading to behaviour such
as lists, dictionaries and Unicode values being created prior to the call as lists, dictionaries and Unicode values being created prior to the call
to ``Py_Initialize`` when the ``-X`` or ``-W`` options are used [1_]. to ``Py_Initialize`` when the ``-X`` or ``-W`` options are used [1_].
By moving to an explicitly multi-phase startup sequence, developers should By moving to an explicitly multi-phase startup sequence, developers should
only need to understand which features are not available in the main only need to understand:
interpreter configuration phase, as the vast majority of the configuration
process will now take place during that phase. * which APIs and features are available prior to pre-configuration (essentially
none, except for the pre-configuration API itself)
* which APIs and features are available prior to core runtime configuration, and
will implicitly run the pre-configuration with default settings that match the
behaviour of Python 3.6 if the pre-configuration hasn't been run explicitly
* which APIs and features are only available after the main interpreter has been
fully configured (which will hopefully be a relatively small subset of the
full C API)
The first two aspects of that are covered by PEP 587, while the details of the
latter distinction are still being considered.
By basing the new design on a combination of C structures and Python By basing the new design on a combination of C structures and Python
data types, it should also be easier to modify the system in the data types, it should also be easier to modify the system in the
@ -214,92 +263,8 @@ sophisticated microbenchmark will be developed to assist in investigation.
Required Configuration Settings Required Configuration Settings
=============================== ===============================
A comprehensive configuration scheme requires that an embedding application See PEP 587 for a detailed listing of CPython interpreter configuration settings
be able to control the following aspects of the final interpreter state: and the various means available for setting them.
* Whether or not to use randomised hashes (and if used, potentially specify
a specific random seed)
* Whether or not to enable the import system (required by CPython's
build process when freezing the importlib._bootstrap bytecode)
* The "Where is Python located?" elements in the ``sys`` module:
* ``sys.executable``
* ``sys.base_exec_prefix``
* ``sys.base_prefix``
* ``sys.exec_prefix``
* ``sys.prefix``
* The path searched for imports from the filesystem (and other path hooks):
* ``sys.path``
* The command line arguments seen by the interpreter:
* ``sys.argv``
* The filesystem encoding used by:
* ``sys.getfsencoding``
* ``os.fsencode``
* ``os.fsdecode``
* The IO encoding (if any), error handling, and buffering used by:
* ``sys.stdin``
* ``sys.stdout``
* ``sys.stderr``
* The initial warning system state:
* ``sys.warnoptions``
* Arbitrary extended options (e.g. to automatically enable ``faulthandler``):
* ``sys._xoptions``
* Whether or not to implicitly cache bytecode files:
* ``sys.dont_write_bytecode``
* Whether or not to enforce correct case in filenames on case-insensitive
platforms
* ``os.environ["PYTHONCASEOK"]``
* The other settings exposed to Python code in ``sys.flags``:
* ``debug`` (Enable debugging output in the pgen parser)
* ``inspect`` (Enter interactive interpreter after __main__ terminates)
* ``interactive`` (Treat stdin as a tty)
* ``optimize`` (__debug__ status, .pyc optimization marker, strip doc strings)
* ``no_user_site`` (don't add the user site directory to sys.path)
* ``no_site`` (don't implicitly import site during startup)
* ``ignore_environment`` (whether environment vars are used during config)
* ``verbose`` (enable a variety of additional debugging messages)
* ``bytes_warning`` (warnings/errors for implicit str/bytes interaction)
* ``quiet`` (disable banner output even if verbose is also enabled or
stdin is a tty and the interpreter is launched in interactive mode)
* Whether or not CPython's signal handlers should be installed
* What code (if any) should be executed as ``__main__``:
* Nothing (just create an empty module)
* A filesystem path referring to a Python script (source or bytecode)
* A filesystem path referring to a valid ``sys.path`` entry (typically
a directory or zipfile)
* A given string (equivalent to the "-c" option)
* A module or package (equivalent to the "-m" option)
* Standard input as a script (i.e. a non-interactive stream)
* Standard input as an interactive interpreter session
<TBD: What, if anything, is still missing from this list?>
Note that this just covers settings that are currently configurable in some
manner when using the main CPython executable. While this PEP aims to make
adding additional configuration settings easier in the future, it
deliberately avoids adding any new settings of its own (except where such
additional settings arise naturally in the course of migrating existing
settings to the new structure).
Implementation Strategy Implementation Strategy
@ -326,10 +291,10 @@ previously been decoded with the locale encoding, and decode them again using
UTF-8 instead). Eric Snow also migrated a number of internal subsystems over as UTF-8 instead). Eric Snow also migrated a number of internal subsystems over as
part of making the subinterpreter feature more robust. part of making the subinterpreter feature more robust.
That work showed that the detailed design currently proposed in this PEP has a That work showed that the detailed design originally proposed in this PEP had a
range of practical issues, so it's currently expected to remain a private API range of practical issues, so Victor designed and implemented an improved
for CPython 3.8, with the possibility of making it public and stable in CPython private API (inspired by an earlier iteration of this PEP), which PEP 587
3.9. proposes to promote to a public API in Python 3.8.
Design Details Design Details
@ -337,15 +302,17 @@ Design Details
.. note:: .. note::
The API details here are still very much in flux, as the private refactoring The API details here are still very much in flux. The header files that show
work has shown that these specific API designs aren't really viable in the current state of the private API are mainly:
practice. The header files that show the current state of the private API
are mainly:
* https://github.com/python/cpython/blob/master/Include/cpython/coreconfig.h * https://github.com/python/cpython/blob/master/Include/cpython/coreconfig.h
* https://github.com/python/cpython/blob/master/Include/cpython/pystate.h * https://github.com/python/cpython/blob/master/Include/cpython/pystate.h
* https://github.com/python/cpython/blob/master/Include/cpython/pylifecycle.h * https://github.com/python/cpython/blob/master/Include/cpython/pylifecycle.h
PEP 587 covers the aspects of the API that are considered potentially stable
enough to make public. Where a proposed API is covered by that PEP,
"(see PEP 587)" is added to the text below.
The main theme of this proposal is to initialize the core language runtime The main theme of this proposal is to initialize the core language runtime
and create a partially initialized interpreter state for the main interpreter and create a partially initialized interpreter state for the main interpreter
*much* earlier in the startup process. This will allow most of the CPython API *much* earlier in the startup process. This will allow most of the CPython API
@ -354,6 +321,11 @@ simplifying a number of operations that currently need to rely on basic C
functionality rather than being able to use the richer data structures provided functionality rather than being able to use the richer data structures provided
by the CPython C API. by the CPython C API.
PEP 587 covers a subset of that task, which is splitting out the components that
even the existing "May be called before ``Py_Initialize``" interfaces need (like
memory allocators and operating system interface encoding details) into a
separate pre-configuration step.
In the following, the term "embedding application" also covers the standard In the following, the term "embedding application" also covers the standard
CPython command line application. CPython command line application.
@ -361,31 +333,52 @@ CPython command line application.
Interpreter Initialization Phases Interpreter Initialization Phases
--------------------------------- ---------------------------------
Three distinct interpreter initialisation phases are proposed: The following distinct interpreter initialisation phases are proposed:
* Pre-Initialization: * Uninitialized:
* Not really a phase, but the absence of a phase
* ``Py_IsInitializing()`` returns ``0``
* ``Py_IsRuntimeInitialized()`` returns ``0``
* ``Py_IsInitialized()`` returns ``0``
* The embedding application determines which memory allocator to use, and
which encoding to use to access operating system interfaces (or chooses
to delegate those decisions to the Python runtime)
* Application starts the initialization process by calling one of the
``Py_PreInitialize`` APIs (see PEP 587)
* Runtime Pre-Initialization:
* no interpreter is available * no interpreter is available
* ``Py_IsInitializing()`` returns ``1``
* ``Py_IsRuntimeInitialized()`` returns ``0`` * ``Py_IsRuntimeInitialized()`` returns ``0``
* ``Py_IsInitialized()`` returns ``0`` * ``Py_IsInitialized()`` returns ``0``
* The embedding application determines the settings required to initialize * The embedding application determines the settings required to initialize
the core CPython runtime and create the main interpreter and moves to the the core CPython runtime and create the main interpreter and moves to the
next phase by calling ``Py_InitializeRuntime`` next phase by calling ``Py_InitializeRuntime``
* Note: as of PEP 587, the embedding application instead calls ``Py_Main()``,
``Py_UnixMain``, or one of the ``Py_Initialize`` APIs, and hence jumps
directly to the Initialized state.
* Initializing: * Main Interpreter Initialization:
* the builtin data types and other core runtime services are available * the builtin data types and other core runtime services are available
* the main interpreter is available, but only partially configured * the main interpreter is available, but only partially configured
* ``Py_IsInitializing()`` returns ``1``
* ``Py_IsRuntimeInitialized()`` returns ``1`` * ``Py_IsRuntimeInitialized()`` returns ``1``
* ``Py_IsInitialized()`` returns ``0`` * ``Py_IsInitialized()`` returns ``0``
* The embedding application determines and applies the settings * The embedding application determines and applies the settings
required to complete the initialization process by calling required to complete the initialization process by calling
``Py_ReadMainInterpreterConfig`` and ``Py_ConfigureMainInterpreter``. ``Py_InitializeMainInterpreter``
* Note: as of PEP 587, this state is not reachable via any public API, it
only exists as an implicit internal state while one of the ``Py_Initialize``
functions is running
* Initialized: * Initialized:
* the main interpreter is available and fully operational, but * the main interpreter is available and fully operational, but
``__main__`` related metadata is incomplete ``__main__`` related metadata is incomplete
* ``Py_IsInitializing()`` returns ``0``
* ``Py_IsRuntimeInitialized()`` returns ``1`` * ``Py_IsRuntimeInitialized()`` returns ``1``
* ``Py_IsInitialized()`` returns ``1`` * ``Py_IsInitialized()`` returns ``1``
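The phase list above can be read as a small truth table over the three query functions. The following is only an illustrative model, not CPython code: the ``demo_`` names and the enum are hypothetical, and the real ``Py_Is*`` functions take no arguments and inspect process-wide state instead.

```c
/* Hypothetical model of the four states in the new column of the list
 * above. Nothing here is part of the CPython API. */
typedef enum {
    DEMO_UNINITIALIZED,
    DEMO_RUNTIME_PREINIT,    /* "Runtime Pre-Initialization" */
    DEMO_MAIN_INTERP_INIT,   /* "Main Interpreter Initialization" */
    DEMO_INITIALIZED
} demo_phase;

/* Models Py_IsInitializing(): true only while startup is in progress. */
int demo_is_initializing(demo_phase phase)
{
    return phase == DEMO_RUNTIME_PREINIT || phase == DEMO_MAIN_INTERP_INIT;
}

/* Models Py_IsRuntimeInitialized(): true once the core runtime exists. */
int demo_is_runtime_initialized(demo_phase phase)
{
    return phase == DEMO_MAIN_INTERP_INIT || phase == DEMO_INITIALIZED;
}

/* Models Py_IsInitialized(): true only for a fully configured interpreter. */
int demo_is_initialized(demo_phase phase)
{
    return phase == DEMO_INITIALIZED;
}

/* The "nothing has started yet" check is derived from two of the queries. */
int demo_is_uninitialized(demo_phase phase)
{
    return !(demo_is_initialized(phase) || demo_is_initializing(phase));
}
```

In this model each state yields a distinct combination of the three results, which is what lets calling code tell the states apart without a dedicated "current phase" accessor.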
@ -398,33 +391,45 @@ proposed System Python interpreter.
An embedding application may still continue to leave initialization almost An embedding application may still continue to leave initialization almost
entirely under CPython's control by using the existing ``Py_Initialize`` entirely under CPython's control by using the existing ``Py_Initialize``
API. Alternatively, if an embedding application wants greater control or ``Py_Main()`` APIs - backwards compatibility will be preserved.
Alternatively, if an embedding application wants greater control
over CPython's initial state, it will be able to use the new, finer over CPython's initial state, it will be able to use the new, finer
grained API, which allows the embedding application greater control grained API, which allows the embedding application greater control
over the initialization process:: over the initialization process.
/* Phase 1: Pre-Initialization */ PEP 587 covers an initial iteration of that API, separating out the
PyRuntimeConfig runtime_config = PyRuntimeConfig_INIT; pre-initialization phase without attempting to separate core runtime
PyMainInterpreterConfig interpreter_config; initialization from main interpreter initialization.
/* Easily control the core configuration */
runtime_config.ignore_environment = 1; /* Ignore environment variables */
runtime_config.use_hash_seed = 0; /* Full hash randomisation */
Py_InitializeRuntime(&runtime_config);
/* Phase 2: Initializing */
/* Optionally preconfigure some settings here - they will then be
* used to derive other settings */
Py_ReadMainInterpreterConfig(&interpreter_config);
/* Can completely override derived settings here */
Py_ConfigureMainInterpreter(&interpreter_config);
/* Phase 3: Initialized */
/* If an embedding application has no real concept of a main module
* it can just stop the initialization process here.
* Alternatively, it can launch __main__ via the relevant API functions.
*/
Pre-Initialization Phase Uninitialized State
------------------------ -------------------
The uninitialized state is where an embedding application determines the settings
which are required in order to be able to correctly pass configuration settings
to the embedded Python runtime.
This covers telling Python which memory allocator to use, as well as which text
encoding to use when processing provided settings.
PEP 587 defines the settings needed to exit this state in its ``PyPreConfig``
struct.
A new query API will allow code to determine if the interpreter hasn't even
started the initialization process::
int Py_IsInitializing();
The query for a completely uninitialized environment would then be
``!(Py_IsInitialized() || Py_IsInitializing())``.
Runtime Pre-Initialization Phase
--------------------------------
.. note:: In PEP 587, the settings for this phase are not yet separated out,
and are instead only available through the combined ``PyConfig`` struct
The pre-initialization phase is where an embedding application determines The pre-initialization phase is where an embedding application determines
the settings which are absolutely required before the CPython runtime can be the settings which are absolutely required before the CPython runtime can be
@ -433,7 +438,7 @@ category are those related to the randomised hash algorithm - the hash
algorithms must be consistent for the lifetime of the process, and so they algorithms must be consistent for the lifetime of the process, and so they
must be in place before the core interpreter is created. must be in place before the core interpreter is created.
The specific settings needed are a flag indicating whether or not to use a The essential settings needed are a flag indicating whether or not to use a
specific seed value for the randomised hashes, and if so, the specific value specific seed value for the randomised hashes, and if so, the specific value
for the seed (a seed value of zero disables randomised hashing). In addition, for the seed (a seed value of zero disables randomised hashing). In addition,
due to the possible use of ``PYTHONHASHSEED`` in configuring the hash due to the possible use of ``PYTHONHASHSEED`` in configuring the hash
@ -442,15 +447,28 @@ variables must also be addressed early. Finally, to support the CPython
build process, an option is offered to completely disable the import build process, an option is offered to completely disable the import
system. system.
The proposed API for this step in the startup sequence is:: The proposed APIs for this step in the startup sequence are::
void Py_InitializeRuntime(const PyRuntimeConfig *config); PyInitError Py_InitializeRuntime(
const PyRuntimeConfig *config
);
Like ``Py_Initialize``, this part of the new API treats initialization failures PyInitError Py_InitializeRuntimeFromArgs(
as fatal errors. While that's still not particularly embedding friendly, const PyRuntimeConfig *config, int argc, char **argv
the operations in this step *really* shouldn't be failing, and changing them );
to return error codes instead of aborting would be an even larger task than
the one already being proposed. PyInitError Py_InitializeRuntimeFromWideArgs(
const PyRuntimeConfig *config, int argc, wchar_t **argv
);
If ``Py_IsInitializing()`` is false, the ``Py_InitializeRuntime`` functions will
implicitly call the corresponding ``Py_PreInitialize`` function. The
``use_environment`` setting will be passed down, while other settings will be
processed according to their defaults, as described in PEP 587.
The ``PyInitError`` return type is defined in PEP 587, and allows an embedding
application to gracefully handle Python runtime initialization failures,
rather than having the entire process abruptly terminated by ``Py_FatalError``.
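To illustrate what returning an error buys an embedding application, here is a minimal self-contained sketch. It does not use the real ``PyInitError`` type (its fields are defined in PEP 587 and are not reproduced here); ``demo_init_result``, ``demo_core_settings`` and both functions are hypothetical stand-ins.

```c
#include <stddef.h>

/* Hypothetical stand-in for a PyInitError-style result. */
typedef struct {
    int failed;           /* non-zero when initialization did not complete */
    const char *message;  /* static description of the failure, or NULL */
} demo_init_result;

/* Hypothetical settings struct, reduced to the two hash-related fields. */
typedef struct {
    int use_hash_seed;
    unsigned long hash_seed;
} demo_core_settings;

/* Models an initializer that reports failure instead of aborting. */
demo_init_result demo_initialize_runtime(const demo_core_settings *settings)
{
    demo_init_result result = {0, NULL};
    if (settings == NULL) {
        result.failed = 1;
        result.message = "no runtime settings supplied";
    }
    return result;
}

/* The embedding application decides what a failure means for it. */
int demo_start_embedded_runtime(const demo_core_settings *settings)
{
    demo_init_result result = demo_initialize_runtime(settings);
    if (result.failed) {
        /* e.g. log result.message and carry on without scripting support */
        return 1;
    }
    return 0;
}
```

The point of the design is visible in ``demo_start_embedded_runtime``: the host process stays alive and chooses its own recovery path, which is not possible when the failure is reported through ``Py_FatalError``.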
The new ``PyRuntimeConfig`` struct holds the settings required for preliminary The new ``PyRuntimeConfig`` struct holds the settings required for preliminary
configuration of the core runtime and creation of the main interpreter:: configuration of the core runtime and creation of the main interpreter::
@ -458,10 +476,10 @@ configuration of the core runtime and creation of the main interpreter::
/* Note: if changing anything in PyRuntimeConfig, also update /* Note: if changing anything in PyRuntimeConfig, also update
* PyRuntimeConfig_INIT */ * PyRuntimeConfig_INIT */
typedef struct { typedef struct {
bool ignore_environment; /* -E switch, -I switch */ bool use_environment; /* as in PyPreConfig, PyConfig from PEP 587 */
int use_hash_seed; /* PYTHONHASHSEED */ int use_hash_seed; /* PYTHONHASHSEED, as in PyConfig from PEP 587 */
unsigned long hash_seed; /* PYTHONHASHSEED */ unsigned long hash_seed; /* PYTHONHASHSEED, as in PyConfig from PEP 587 */
bool _disable_importlib; /* Needed by freeze_importlib */ bool _install_importlib; /* Needed by freeze_importlib */
} PyRuntimeConfig; } PyRuntimeConfig;
/* Rely on the "designated initializer" feature of C99 */ /* Rely on the "designated initializer" feature of C99 */
@ -475,8 +493,8 @@ of a struct instance with sensible defaults::
PyRuntimeConfig runtime_config = PyRuntimeConfig_INIT; PyRuntimeConfig runtime_config = PyRuntimeConfig_INIT;
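As a rough illustration of the designated-initializer approach, the sketch below mirrors the new-side field names of the struct but uses hypothetical ``demo_`` names throughout. The default values chosen in ``DEMO_RUNTIME_CONFIG_INIT`` are assumptions made for this example, not the values of the real ``PyRuntimeConfig_INIT`` macro.

```c
#include <stdbool.h>

/* Hypothetical mirror of the struct above; not the real PyRuntimeConfig. */
typedef struct {
    bool use_environment;
    int use_hash_seed;
    unsigned long hash_seed;
    bool install_importlib;
} demo_runtime_config;

/* C99 designated initializers: fields that are not named (hash_seed here)
 * are zero-initialized. The defaults are assumptions for illustration. */
#define DEMO_RUNTIME_CONFIG_INIT \
    { .use_environment = true, .use_hash_seed = -1, .install_importlib = true }

/* Start from the defaults, then override only what the application needs. */
demo_runtime_config demo_isolated_config(void)
{
    demo_runtime_config config = DEMO_RUNTIME_CONFIG_INIT;
    config.use_environment = false;  /* ignore Python environment variables */
    return config;
}
```

Because unnamed fields fall back to zero, a field added to the struct later only needs to appear in the macro if its default is non-zero.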
``ignore_environment`` controls the processing of all Python related ``use_environment`` controls the processing of all Python related
environment variables. If the flag is false, then environment variables are environment variables. If the flag is true, then ``PYTHONHASHSEED`` is
processed normally. Otherwise, all Python-specific environment variables processed normally. Otherwise, all Python-specific environment variables
are considered undefined (exceptions may be made for some OS specific are considered undefined (exceptions may be made for some OS specific
environment variables, such as those used on Mac OS X to communicate environment variables, such as those used on Mac OS X to communicate
@ -488,7 +506,7 @@ be used. If it is positive, then the value in ``hash_seed`` will be used be used. If it is positive, then the value in ``hash_seed`` will be used
to seed the random number generator. If the ``hash_seed`` is zero in this to seed the random number generator. If the ``hash_seed`` is zero in this
case, then the randomised hashing is disabled completely. case, then the randomised hashing is disabled completely.
If ``use_hash_seed`` is negative (and ``ignore_environment`` is zero), If ``use_hash_seed`` is negative (and ``use_environment`` is true),
then CPython will inspect the ``PYTHONHASHSEED`` environment variable. If the then CPython will inspect the ``PYTHONHASHSEED`` environment variable. If the
environment variable is not set, is set to the empty string, or to the value environment variable is not set, is set to the empty string, or to the value
``"random"``, then randomised hashes with a random seed will be used. If the ``"random"``, then randomised hashes with a random seed will be used. If the
@ -512,7 +530,7 @@ the empty string or the value ``"random"``, both ``use_hash_seed`` and
``hash_seed``. On success the function will return zero. A non-zero return ``hash_seed``. On success the function will return zero. A non-zero return
value indicates an error (most likely in the conversion to an integer). value indicates an error (most likely in the conversion to an integer).
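The seed-parsing behaviour described here could look roughly like the following. This is a sketch under stated assumptions: the name ``demo_read_hash_seed`` and its signature are hypothetical, and it assumes that both outputs are reset to zero for the NULL, empty-string and ``"random"`` cases, and that only plain decimal digits are accepted.

```c
#include <ctype.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: convert PYTHONHASHSEED-style text into a flag and a
 * seed value. Returns 0 on success, non-zero if the text is not an integer. */
int demo_read_hash_seed(const char *seed_text,
                        int *use_hash_seed,
                        unsigned long *hash_seed)
{
    char *end = NULL;
    unsigned long value;

    if (seed_text == NULL || seed_text[0] == '\0'
            || strcmp(seed_text, "random") == 0) {
        /* Assumption: "no explicit seed" is reported as two zeros. */
        *use_hash_seed = 0;
        *hash_seed = 0;
        return 0;
    }

    /* Reject signs and leading whitespace, which strtoul would accept. */
    if (!isdigit((unsigned char)seed_text[0])) {
        return -1;
    }

    errno = 0;
    value = strtoul(seed_text, &end, 10);
    if (errno != 0 || *end != '\0') {
        return -1;  /* overflow or trailing non-digit characters */
    }

    *use_hash_seed = 1;
    *hash_seed = value;
    return 0;
}
```

On failure the outputs are left untouched, so a caller can keep whatever defaults it had already placed in its configuration struct.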
The ``_disable_importlib`` setting is used as part of the CPython build The ``_install_importlib`` setting is used as part of the CPython build
process to create an interpreter with no import capability at all. It is process to create an interpreter with no import capability at all. It is
considered private to the CPython development team (hence the leading considered private to the CPython development team (hence the leading
underscore), as the only currently supported use case is to permit compiler underscore), as the only currently supported use case is to permit compiler
@ -522,9 +540,8 @@ changes that invalidate the previously frozen bytecode for
The aim is to keep this initial level of configuration as small as possible The aim is to keep this initial level of configuration as small as possible
in order to keep the bootstrapping environment consistent across in order to keep the bootstrapping environment consistent across
different embedding applications. If we can create a valid interpreter state different embedding applications. If we can create a valid interpreter state
without the setting, then the setting should go in the configuration passed without the setting, then the setting should appear solely in the comprehensive
to ``Py_ConfigureMainInterpreter()`` rather than in the core runtime ``PyConfig`` struct rather than in the core runtime configuration.
configuration.
A new query API will allow code to determine if the interpreter is in the A new query API will allow code to determine if the interpreter is in the
bootstrapping state between the core runtime initialization and the creation of bootstrapping state between the core runtime initialization and the creation of
@ -534,7 +551,11 @@ interpreter initialization process::
int Py_IsRuntimeInitialized(); int Py_IsRuntimeInitialized();
Attempting to call ``Py_InitializeRuntime()`` again when Attempting to call ``Py_InitializeRuntime()`` again when
``Py_IsRuntimeInitialized()`` is already true is a fatal error. ``Py_IsRuntimeInitialized()`` is already true is reported as a user
configuration error. (TBC, as existing public initialisation APIs support being
called multiple times without error, and simply ignore changes to any
write-once settings. It may make sense to keep that behaviour rather than trying
to make the new API stricter than the old one)
As frozen bytecode may now be legitimately run in an interpreter which is not As frozen bytecode may now be legitimately run in an interpreter which is not
yet fully initialized, ``sys.flags`` will gain a new ``initialized`` flag. yet fully initialized, ``sys.flags`` will gain a new ``initialized`` flag.
@ -584,22 +605,32 @@ object rather than in C process globals.
Any call to ``Py_InitializeRuntime()`` must have a matching call to Any call to ``Py_InitializeRuntime()`` must have a matching call to
``Py_Finalize()``. It is acceptable to skip calling ``Py_Finalize()``. It is acceptable to skip calling
``Py_ConfigureMainInterpreter()`` in between (e.g. if attempting to read the ``Py_InitializeMainInterpreter()`` in between (e.g. if attempting to build the
main interpreter configuration settings fails). main interpreter configuration settings fails).
Determining the remaining configuration settings Determining the remaining configuration settings
------------------------------------------------ ------------------------------------------------
The next step in the initialization sequence is to determine the full The next step in the initialization sequence is to determine the remaining
settings needed to complete the process. No changes are made to the settings needed to complete the process. No changes are made to the
interpreter state at this point. The core API for this step is:: interpreter state at this point. The core APIs for this step are::
int Py_ReadMainInterpreterConfig(PyMainInterpreterConfig *config); int Py_BuildPythonConfig(
PyConfigAsObjects *py_config, const PyConfig *c_config
);
The config argument should be a pointer to a config struct (which may be int Py_BuildPythonConfigFromArgs(
a temporary one stored on the C stack). For any already configured value PyConfigAsObjects *py_config, const PyConfig *c_config, int argc, char **argv
(i.e. any non-NULL pointer), CPython will sanity check the supplied value, );
int Py_BuildPythonConfigFromWideArgs(
PyConfigAsObjects *py_config, const PyConfig *c_config, int argc, wchar_t **argv
);
The ``py_config`` argument should be a pointer to a PyConfigAsObjects struct
(which may be a temporary one stored on the C stack). For any already configured
value (i.e. any non-NULL pointer), CPython will sanity check the supplied value,
but otherwise accept it as correct. but otherwise accept it as correct.
A struct is used rather than a Python dictionary as the struct is easier A struct is used rather than a Python dictionary as the struct is easier
@ -608,24 +639,28 @@ CPython version and only a read-only view needs to be exposed to Python
code (which is relatively straightforward, thanks to the infrastructure code (which is relatively straightforward, thanks to the infrastructure
already put in place to expose ``sys.implementation``). already put in place to expose ``sys.implementation``).
Unlike ``Py_Initialize`` and ``Py_InitializeRuntime``, this call will raise Unlike ``Py_InitializeRuntime``, this call will raise a Python exception and
an exception and report an error return rather than exhibiting fatal errors report an error return rather than returning a Python initialization specific
if a problem is found with the config data. C struct if a problem is found with the config data.
Any supported configuration setting which is not already set will be Any supported configuration setting which is not already set will be
populated appropriately in the supplied configuration struct. The default populated appropriately in the supplied configuration struct. The default
configuration can be overridden entirely by setting the value *before* configuration can be overridden entirely by setting the value *before*
calling ``Py_ReadMainInterpreterConfig``. The provided value will then also be calling ``Py_BuildPythonConfig``. The provided value will then also be
used in calculating any other settings derived from that value. used in calculating any other settings derived from that value.
Alternatively, settings may be overridden *after* the Alternatively, settings may be overridden *after* the
``Py_ReadMainInterpreterConfig`` call (this can be useful if an embedding ``Py_BuildPythonConfig`` call (this can be useful if an embedding
application wants to adjust a setting rather than replace it completely, application wants to adjust a setting rather than replace it completely,
such as removing ``sys.path[0]``). such as removing ``sys.path[0]``).
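
The "set it before the call to override, leave it ``NULL`` to get the default" convention described in this hunk can be sketched in plain C. This is only an illustrative sketch, not the PEP's API: ``sketch_config``, ``sketch_build_config``, the ``prefix``/``stdlib_dir`` fields and the ``/usr/local`` default are all hypothetical names invented for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a config struct: NULL means "not set". */
typedef struct {
    const char *prefix;      /* base setting; may be preset by the caller */
    const char *stdlib_dir;  /* derived from prefix unless preset */
    char derived[256];       /* backing storage for a derived stdlib_dir */
} sketch_config;

/* Populate only the fields that are still unset.
 * Preset values are sanity checked but otherwise accepted as-is, and
 * they feed into any settings derived from them.
 * Returns 0 on success, -1 on a configuration error. */
static int sketch_build_config(sketch_config *cfg)
{
    static const char suffix[] = "/lib/python";
    size_t n;

    if (cfg == NULL) {
        return -1;
    }
    if (cfg->prefix == NULL) {
        cfg->prefix = "/usr/local";   /* hypothetical default */
    }
    if (cfg->prefix[0] == '\0') {
        return -1;                    /* sanity check on a supplied value */
    }
    if (cfg->stdlib_dir == NULL) {
        n = strlen(cfg->prefix);
        if (n + sizeof(suffix) > sizeof(cfg->derived)) {
            return -1;
        }
        memcpy(cfg->derived, cfg->prefix, n);
        memcpy(cfg->derived + n, suffix, sizeof(suffix)); /* copies the NUL */
        cfg->stdlib_dir = cfg->derived;
    }
    return 0;
}
```

A caller that zero-initializes the struct gets every default; a caller that sets ``prefix`` first gets a ``stdlib_dir`` derived from that value; and a caller can still adjust ``stdlib_dir`` after the call, since nothing has been applied to any interpreter state yet.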
The ``c_config`` argument is an optional pointer to a ``PyConfig`` structure,
as defined in PEP 587. If provided, it is used in preference to reading settings
directly from the environment or process global state.
Merely reading the configuration has no effect on the interpreter state: it Merely reading the configuration has no effect on the interpreter state: it
only modifies the passed in configuration struct. The settings are not only modifies the passed in configuration struct. The settings are not
applied to the running interpreter until the ``Py_ConfigureMainInterpreter`` applied to the running interpreter until the ``Py_InitializeMainInterpreter``
call (see below). call (see below).
@ -642,10 +677,16 @@ or not the interpreter is the main interpreter will be configured on a per
interpreter basis. Other fields will be reviewed for whether or not they can interpreter basis. Other fields will be reviewed for whether or not they can
feasibly be made interpreter specific over the course of the implementation. feasibly be made interpreter specific over the course of the implementation.
The ``PyMainInterpreterConfig`` struct holds the settings required to .. note:: The list of config fields below is currently out of sync with PEP 587.
complete the main interpreter configuration. These settings are also all Where they differ, PEP 587 takes precedence.
passed through unmodified to subinterpreters. Fields are always pointers to
Python data types, with unset values indicated by ``NULL``:: The ``PyConfigAsObjects`` struct mirrors the ``PyConfig`` struct from PEP 587,
but uses full Python objects to store values, rather than C level data types.
It adds ``raw_argv`` and ``argv`` list fields, so later initialisation steps
don't need to accept those separately.
Fields are always pointers to Python data types, with unset values indicated by
``NULL``::
typedef struct { typedef struct {
/* Argument processing */ /* Argument processing */
@ -719,11 +760,11 @@ Python data types, with unset values indicated by ``NULL``::
PyBoolObject *show_banner; /* -q switch (inverted) */ PyBoolObject *show_banner; /* -q switch (inverted) */
PyBoolObject *inspect_main; /* -i switch, PYTHONINSPECT */ PyBoolObject *inspect_main; /* -i switch, PYTHONINSPECT */
} PyMainInterpreterConfig; } PyConfigAsObjects;
The ``PyInterpreterConfig`` struct holds the settings that may vary between The ``PyInterpreterConfig`` struct holds the settings that may vary between
the main interpreter and subinterpreters. For the main interpreter, these the main interpreter and subinterpreters. For the main interpreter, these
settings are automatically populated by ``Py_ConfigureMainInterpreter()``. settings are automatically populated by ``Py_InitializeMainInterpreter()``.
:: ::
@ -735,31 +776,33 @@ As these structs consist solely of object pointers, no explicit initializer
definitions are needed - C99's default initialization of struct memory to zero definitions are needed - C99's default initialization of struct memory to zero
is sufficient. is sufficient.
<TBD: did I miss anything?>
Completing the main interpreter initialization
Completing the interpreter initialization ----------------------------------------------
-----------------------------------------
The final step in the initialization process is to actually put the The final step in the initialization process is to actually put the
configuration settings into effect and finish bootstrapping the main configuration settings into effect and finish bootstrapping the main
interpreter up to full operation:: interpreter up to full operation::
int Py_ConfigureMainInterpreter(const PyMainInterpreterConfig *config); int Py_InitializeMainInterpreter(const PyConfigAsObjects *config);
Like ``Py_ReadMainInterpreterConfig``, this call will raise an exception and Like ``Py_BuildPythonConfig``, this call will raise an exception and
report an error return rather than exhibiting fatal errors if a problem is report an error return rather than exhibiting fatal errors if a problem is
found with the config data. found with the config data. (TBC, as existing public initialisation APIs support
being called multiple times without error, and simply ignore changes to any
write-once settings. It may make sense to keep that behaviour rather than trying
to make the new API stricter than the old one)
All configuration settings are required - the configuration struct All configuration settings are required - the configuration struct
should always be passed through ``Py_ReadMainInterpreterConfig`` to ensure it should always be passed through ``Py_BuildPythonConfig`` to ensure it
is fully populated. is fully populated.
After a successful call ``Py_IsInitialized()`` will become true. The caveats After a successful call ``Py_IsInitialized()`` will become true and
described above for the interpreter during the phase where only the core ``Py_IsInitializing()`` will become false. The caveats described above for the
runtime is initialized will no longer hold. interpreter during the phase where only the core runtime is initialized will
no longer hold.
Attempting to call ``Py_ConfigureMainInterpreter()`` again when Attempting to call ``Py_InitializeMainInterpreter()`` again when
``Py_IsInitialized()`` is true is an error. ``Py_IsInitialized()`` is true is an error.
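
The phase transitions described in this hunk can be modelled as a small state machine. The sketch below is a hypothetical model only: the ``sketch_*`` names are invented for illustration and do not call into CPython, and it assumes the stricter reading in which a repeated initialization call is reported as an error (the diff itself marks that behaviour as TBC).

```c
#include <assert.h>

/* Hypothetical model of the startup phases; not CPython code. */
typedef enum {
    SKETCH_UNINITIALIZED,   /* before the core runtime is set up */
    SKETCH_RUNTIME_READY,   /* core runtime up, main interpreter pending */
    SKETCH_FULLY_INITIALIZED
} sketch_phase;

static sketch_phase sketch_state = SKETCH_UNINITIALIZED;

/* Query helpers modelled on the three query APIs named in the text. */
static int sketch_is_runtime_initialized(void)
{
    return sketch_state != SKETCH_UNINITIALIZED;
}

static int sketch_is_initializing(void)
{
    return sketch_state == SKETCH_RUNTIME_READY;
}

static int sketch_is_initialized(void)
{
    return sketch_state == SKETCH_FULLY_INITIALIZED;
}

/* Each step returns 0 on success and -1 if called in the wrong phase. */
static int sketch_initialize_runtime(void)
{
    if (sketch_state != SKETCH_UNINITIALIZED) {
        return -1;  /* assumed: repeat calls are reported as an error */
    }
    sketch_state = SKETCH_RUNTIME_READY;
    return 0;
}

static int sketch_initialize_main_interpreter(void)
{
    if (sketch_state != SKETCH_RUNTIME_READY) {
        return -1;  /* not yet bootstrapped, or already fully initialized */
    }
    sketch_state = SKETCH_FULLY_INITIALIZED;
    return 0;
}

/* Finalization is valid from either initialized phase, so the main
 * interpreter step may be skipped if building its configuration fails. */
static void sketch_finalize(void)
{
    sketch_state = SKETCH_UNINITIALIZED;
}
```

In this model the "initializing" query is true only in the window between the two initialization steps, which matches the statement that it becomes false once the main interpreter step succeeds.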
However, some metadata related to the ``__main__`` module may still be However, some metadata related to the ``__main__`` module may still be
@ -790,6 +833,11 @@ state if ``import site`` is later explicitly executed in the process.
Preparing the main module Preparing the main module
------------------------- -------------------------
.. note:: In PEP 587, ``PyRun_PrepareMain`` and ``PyRun_ExecMain`` are not
exposed separately, and are instead accessed through a ``Py_RunMain`` API
that both prepares and executes main, and then finalizes the Python
interpreter.
This subphase completes the population of the ``__main__`` module This subphase completes the population of the ``__main__`` module
related metadata, without actually starting execution of the ``__main__`` related metadata, without actually starting execution of the ``__main__``
module code. module code.
@ -856,6 +904,12 @@ configuration system)
Executing the main module Executing the main module
------------------------- -------------------------
.. note:: In PEP 587, ``PyRun_PrepareMain`` and ``PyRun_ExecMain`` are not
exposed separately, and are instead accessed through a ``Py_RunMain`` API
that both prepares and executes main, and then finalizes the Python
interpreter.
This subphase covers the execution of the actual ``__main__`` module code. This subphase covers the execution of the actual ``__main__`` module code.
It is handled by calling the following API:: It is handled by calling the following API::
@ -900,17 +954,16 @@ Internal Storage of Configuration Data
The interpreter state will be updated to include details of the configuration The interpreter state will be updated to include details of the configuration
settings supplied during initialization by extending the interpreter state settings supplied during initialization by extending the interpreter state
object with an embedded copy of the ``PyRuntimeConfig``, object with at least an embedded copy of the ``PyConfigAsObjects`` and
``PyMainInterpreterConfig`` and ``PyInterpreterConfig`` structs. ``PyInterpreterConfig`` structs.
For debugging purposes, the configuration settings will be exposed as For debugging purposes, the configuration settings will be exposed as
a ``sys._configuration`` simple namespace (similar to ``sys.flags`` and a ``sys._configuration`` simple namespace (similar to ``sys.flags`` and
``sys.implementation``). The attributes will themselves be simple namespaces ``sys.implementation``). The attributes will themselves be simple namespaces
corresponding to the three levels of configurations setting: corresponding to the two levels of configuration setting:
* ``runtime`` * ``all_interpreters``
* ``main_interpreter`` * ``active_interpreter``
* ``interpreter``
Field names will match those in the configuration structs, except for Field names will match those in the configuration structs, except for
``hash_seed``, which will be deliberately excluded. ``hash_seed``, which will be deliberately excluded.
@ -965,8 +1018,8 @@ Most of the APIs proposed in this PEP are excluded from the stable ABI, as
embedding a Python interpreter involves a much higher degree of coupling embedding a Python interpreter involves a much higher degree of coupling
than merely writing an extension module. than merely writing an extension module.
The only newly exposed API that will be part of the stable ABI is the The only newly exposed APIs that will be part of the stable ABI are the
``Py_IsRuntimeInitialized()`` query. ``Py_IsInitializing()`` and ``Py_IsRuntimeInitialized()`` queries.
Build time configuration Build time configuration
@ -981,9 +1034,9 @@ Backwards Compatibility
----------------------- -----------------------
Backwards compatibility will be preserved primarily by ensuring that Backwards compatibility will be preserved primarily by ensuring that
``Py_ReadMainInterpreterConfig()`` interrogates all the previously defined ``Py_BuildPythonConfig()`` interrogates all the previously defined
configuration settings stored in global variables and environment variables, configuration settings stored in global variables and environment variables,
and that ``Py_ConfigureMainInterpreter()`` writes affected settings back to and that ``Py_InitializeMainInterpreter()`` writes affected settings back to
the relevant locations. the relevant locations.
One acknowledged incompatibility is that some environment variables which One acknowledged incompatibility is that some environment variables which
@ -1007,10 +1060,6 @@ prior to ``Py_Initialize()`` will
continue to do so, and will also support being called prior to continue to do so, and will also support being called prior to
``Py_InitializeRuntime()``. ``Py_InitializeRuntime()``.
To minimise unnecessary code churn, and to ensure the backwards compatibility
is well tested, the main CPython executable may continue to use some elements
of the old style initialization API. (very much TBC)
A System Python Executable A System Python Executable
========================== ==========================
@ -1053,8 +1102,8 @@ argument parsing infrastructure for use during the initializing phase.
Open Questions Open Questions
============== ==============
* Error details for ``Py_ReadMainInterpreterConfig`` and * Error details for ``Py_BuildPythonConfig`` and
``Py_ConfigureMainInterpreter`` (these should become clearer as the ``Py_InitializeMainInterpreter`` (these should become clearer as the
implementation progresses) implementation progresses)
@ -1065,6 +1114,9 @@ The reference implementation is being developed as a private API refactoring
within the CPython reference interpreter (as attempting to maintain it as an within the CPython reference interpreter (as attempting to maintain it as an
independent project proved impractical). independent project proved impractical).
PEP 587 extracts a subset of the proposal that is considered sufficiently stable
to be worth proposing as a public API for Python 3.8.
The Status Quo (as of Python 3.6) The Status Quo (as of Python 3.6)
================================= =================================
@ -1073,6 +1125,9 @@ The current mechanisms for configuring the interpreter have accumulated in
a fairly ad hoc fashion over the past 20+ years, leading to a rather a fairly ad hoc fashion over the past 20+ years, leading to a rather
inconsistent interface with varying levels of documentation. inconsistent interface with varying levels of documentation.
Also see PEP 587 for further discussion of the existing settings and their
handling.
(Note: some of the info below could probably be cleaned up and added to the (Note: some of the info below could probably be cleaned up and added to the
C API documentation for 3.x - it's all CPython specific, so it C API documentation for 3.x - it's all CPython specific, so it
doesn't belong in the language reference) doesn't belong in the language reference)


@ -1,6 +1,6 @@
PEP: 587 PEP: 587
Title: Python Initialization Configuration Title: Python Initialization Configuration
Author: Nick Coghlan <ncoghlan@gmail.com>, Victor Stinner <vstinner@redhat.com> Author: Victor Stinner <vstinner@redhat.com>, Nick Coghlan <ncoghlan@gmail.com>
Discussions-To: python-dev@python.org Discussions-To: python-dev@python.org
Status: Draft Status: Draft
Type: Standards Track Type: Standards Track
@ -14,6 +14,11 @@ Abstract
Add a new C API to configure the Python Initialization providing finer Add a new C API to configure the Python Initialization providing finer
control on the whole configuration and better error reporting. control on the whole configuration and better error reporting.
This extracts a subset of the API design from the PEP 432 development and
refactoring work that is now considered sufficiently stable to make public
(allowing 3rd party embedding applications access to the same configuration
APIs that the native CPython CLI is now using).
Rationale Rationale
========= =========
@ -31,9 +36,12 @@ initialization.
This PEP is a partial implementation of PEP 432 which is the overall This PEP is a partial implementation of PEP 432 which is the overall
design. New fields can be added later to ``PyConfig`` structure to design. New fields can be added later to ``PyConfig`` structure to
finish the implementation of the PEP 432 (add a new partial finish the implementation of the PEP 432 (e.g. by adding a new partial
initialization which allows to configure Python using Python objects to initialization API which allows to configure Python using Python objects to
finish the full initialization). finish the full initialization). However, those features are omitted from this
PEP as even the native CPython CLI doesn't work that way - the public API
proposal in this PEP is limited to features which have already been implemented
and adopted as private APIs for use in the native CPython CLI.
Python Initialization C API Python Initialization C API