Further PEP 432 updates
- describe all 4 proposed initialisation phases
- more detailed interface for hash seed handling
- move ignore environment flag into core config
- consistently use American spelling of initialize
- preliminary concept for main execution API
- misc notes on status quo
parent 1be4a026ed
commit 4281048f3b
246
pep-0432.txt
|
@ -15,7 +15,7 @@ Abstract
|
||||||
========
|
========
|
||||||
|
|
||||||
This PEP proposes a mechanism for simplifying the startup sequence for
|
This PEP proposes a mechanism for simplifying the startup sequence for
|
||||||
CPython, making it easier to modify the initialisation behaviour of the
|
CPython, making it easier to modify the initialization behaviour of the
|
||||||
reference interpreter executable, as well as making it easier to control
|
reference interpreter executable, as well as making it easier to control
|
||||||
CPython's startup behaviour when creating an alternate executable or
|
CPython's startup behaviour when creating an alternate executable or
|
||||||
embedding it as a Python execution engine inside a larger application.
|
embedding it as a Python execution engine inside a larger application.
|
||||||
|
@ -25,15 +25,24 @@ resolution for most of these should become clearer as the reference
|
||||||
implementation is developed.
|
implementation is developed.
|
||||||
|
|
||||||
|
|
||||||
Proposal Summary
|
Proposal
|
||||||
================
|
========
|
||||||
|
|
||||||
This PEP proposes that CPython move to an explicit 2-phase initialisation
|
This PEP proposes that CPython move to an explicit multi-phase initialization
|
||||||
process, where a preliminary interpreter is put in place with limited OS
|
process, where a preliminary interpreter is put in place with limited OS
|
||||||
interaction capabilities early in the startup sequence. This essential core
|
interaction capabilities early in the startup sequence. This essential core
|
||||||
remains in place while all of the configuration settings are determined,
|
remains in place while all of the configuration settings are determined,
|
||||||
until a final configuration call takes those settings and finishes
|
until a final configuration call takes those settings and finishes
|
||||||
bootstrapping the interpreter immediately before executing the main module.
|
bootstrapping the interpreter immediately before locating and executing
|
||||||
|
the main module.
|
||||||
|
|
||||||
|
In the new design, the interpreter will move through the following
|
||||||
|
well-defined phases during the startup sequence:
|
||||||
|
|
||||||
|
* Pre-Initialization - no interpreter available
|
||||||
|
* Initialization - limited interpreter available
|
||||||
|
* Pre-Main - full interpreter available, __main__ related metadata incomplete
|
||||||
|
* Main Execution - normal interpreter operation
|
||||||
|
|
||||||
As a concrete use case to help guide any design changes, and to solve a known
|
As a concrete use case to help guide any design changes, and to solve a known
|
||||||
problem where the appropriate defaults for system utilities differ from those
|
problem where the appropriate defaults for system utilities differ from those
|
||||||
|
@ -46,20 +55,21 @@ script being executed.
|
||||||
To keep the implementation complexity under control, this PEP does *not*
|
To keep the implementation complexity under control, this PEP does *not*
|
||||||
propose wholesale changes to the way the interpreter state is accessed at
|
propose wholesale changes to the way the interpreter state is accessed at
|
||||||
runtime, nor does it propose changes to the way subinterpreters are
|
runtime, nor does it propose changes to the way subinterpreters are
|
||||||
created after the main interpreter has already been initialised. Changing
|
created after the main interpreter has already been initialized. Changing
|
||||||
the order in which the existing initialisation steps occur to make the
|
the order in which the existing initialization steps occur in order to make
|
||||||
startup sequence easier to maintain is already a substantial change, and
|
the startup sequence easier to maintain is already a substantial change, and
|
||||||
attempting to make those other changes at the same time will make the
|
attempting to make those other changes at the same time will make the
|
||||||
change significantly more invasive and much harder to review. However, such
|
change significantly more invasive and much harder to review. However, such
|
||||||
proposals may be suitable topics for follow-on PEPs or patches - one key
|
proposals may be suitable topics for follow-on PEPs or patches - one key
|
||||||
benefit of this PEP is decreasing the coupling between the internal storage
|
benefit of this PEP is decreasing the coupling between the internal storage
|
||||||
model and the configuration interface.
|
model and the configuration interface, so such changes should be easier
|
||||||
|
once this PEP has been implemented.
|
||||||
|
|
||||||
|
|
||||||
Background
|
Background
|
||||||
==========
|
==========
|
||||||
|
|
||||||
Over time, CPython's initialisation sequence has become progressively more
|
Over time, CPython's initialization sequence has become progressively more
|
||||||
complicated, offering more options, as well as performing more complex tasks
|
complicated, offering more options, as well as performing more complex tasks
|
||||||
(such as configuring the Unicode settings for OS interfaces in Python 3 as
|
(such as configuring the Unicode settings for OS interfaces in Python 3 as
|
||||||
well as bootstrapping a pure Python implementation of the import system).
|
well as bootstrapping a pure Python implementation of the import system).
|
||||||
|
@ -72,7 +82,7 @@ maintainers, as much of the configuration needs to take place prior to the
|
||||||
safely.
|
safely.
|
||||||
|
|
||||||
A number of proposals are on the table for even *more* sophisticated
|
A number of proposals are on the table for even *more* sophisticated
|
||||||
startup behaviour, such as better control over ``sys.path`` initialisation
|
startup behaviour, such as better control over ``sys.path`` initialization
|
||||||
(easily adding additional directories on the command line in a cross-platform
|
(easily adding additional directories on the command line in a cross-platform
|
||||||
fashion, as well as controlling the configuration of ``sys.path[0]``), easier
|
fashion, as well as controlling the configuration of ``sys.path[0]``), easier
|
||||||
configuration of utilities like coverage tracing when launching Python
|
configuration of utilities like coverage tracing when launching Python
|
||||||
|
@ -96,14 +106,14 @@ Maintainability
|
||||||
|
|
||||||
The current CPython startup sequence is difficult to understand, and even
|
The current CPython startup sequence is difficult to understand, and even
|
||||||
more difficult to modify. It is not clear what state the interpreter is in
|
more difficult to modify. It is not clear what state the interpreter is in
|
||||||
while much of the initialisation code executes, leading to behaviour such
|
while much of the initialization code executes, leading to behaviour such
|
||||||
as lists, dictionaries and Unicode values being created prior to the call
|
as lists, dictionaries and Unicode values being created prior to the call
|
||||||
to ``Py_Initialize`` when the ``-X`` or ``-W`` options are used [1_].
|
to ``Py_Initialize`` when the ``-X`` or ``-W`` options are used [1_].
|
||||||
|
|
||||||
By moving to a 2-phase startup sequence, developers should only need to
|
By moving to an explicitly multi-phase startup sequence, developers should
|
||||||
understand which features are not available in the core bootstrapping state,
|
only need to understand which features are not available in the core
|
||||||
as the vast majority of the configuration process will now take place in
|
bootstrapping state, as the vast majority of the configuration process
|
||||||
that state.
|
will now take place in that state.
|
||||||
|
|
||||||
By basing the new design on a combination of C structures and Python
|
By basing the new design on a combination of C structures and Python
|
||||||
dictionaries, it should also be easier to modify the system in the
|
dictionaries, it should also be easier to modify the system in the
|
||||||
|
@ -114,7 +124,7 @@ Performance
|
||||||
-----------
|
-----------
|
||||||
|
|
||||||
CPython is used heavily to run short scripts where the runtime is dominated
|
CPython is used heavily to run short scripts where the runtime is dominated
|
||||||
by the interpreter initialisation time. Any changes to the startup sequence
|
by the interpreter initialization time. Any changes to the startup sequence
|
||||||
should minimise their impact on the startup overhead.
|
should minimise their impact on the startup overhead.
|
||||||
|
|
||||||
Experience with the importlib migration suggests that the startup time is
|
Experience with the importlib migration suggests that the startup time is
|
||||||
|
@ -141,15 +151,15 @@ builds)::
|
||||||
Improvements in the import system and the Unicode support already resulted
|
Improvements in the import system and the Unicode support already resulted
|
||||||
in a more than 30% improvement in startup time in Python 3.3 relative to
|
in a more than 30% improvement in startup time in Python 3.3 relative to
|
||||||
3.2. Python 3.3 is still slightly slower to start than Python 2.7 due to the
|
3.2. Python 3.3 is still slightly slower to start than Python 2.7 due to the
|
||||||
additional infrastructure that needs to be put in place to support the Unicode
|
additional infrastructure that needs to be put in place to support the
|
||||||
based text model.
|
Unicode based text model.
|
||||||
|
|
||||||
This PEP is not expected to have any significant effect on the startup time,
|
This PEP is not expected to have any significant effect on the startup time,
|
||||||
as it is aimed primarily at *reordering* the existing initialisation
|
as it is aimed primarily at *reordering* the existing initialization
|
||||||
sequence, without making substantial changes to the individual steps.
|
sequence, without making substantial changes to the individual steps.
|
||||||
|
|
||||||
However, if this simple check suggests that the proposed changes to the
|
However, if this simple check suggests that the proposed changes to the
|
||||||
initialisation sequence may pose a performance problem, then a more
|
initialization sequence may pose a performance problem, then a more
|
||||||
sophisticated microbenchmark will be developed to assist in investigation.
|
sophisticated microbenchmark will be developed to assist in investigation.
|
||||||
|
|
||||||
|
|
||||||
|
@ -198,7 +208,7 @@ be able to control the following aspects of the final interpreter state:
|
||||||
* ``no_site`` (don't implicitly import site during startup)
|
* ``no_site`` (don't implicitly import site during startup)
|
||||||
* ``ignore_environment`` (whether environment vars are used during config)
|
* ``ignore_environment`` (whether environment vars are used during config)
|
||||||
* ``verbose`` (enable all sorts of random output)
|
* ``verbose`` (enable all sorts of random output)
|
||||||
* ``bytes_warning``
|
* ``bytes_warning`` (warnings/errors for implicit str/bytes interaction)
|
||||||
* ``quiet`` (disable banner output even if verbose is also enabled or
|
* ``quiet`` (disable banner output even if verbose is also enabled or
|
||||||
stdin is a tty and the interpreter is launched in interactive mode)
|
stdin is a tty and the interpreter is launched in interactive mode)
|
||||||
|
|
||||||
|
@ -219,7 +229,7 @@ be able to control the following aspects of the final interpreter state:
|
||||||
Note that this just covers settings that are currently configurable in some
|
Note that this just covers settings that are currently configurable in some
|
||||||
manner when using the main CPython executable. While this PEP aims to make
|
manner when using the main CPython executable. While this PEP aims to make
|
||||||
adding additional configuration settings easier in the future, it
|
adding additional configuration settings easier in the future, it
|
||||||
deliberately avoids any new settings of its own.
|
deliberately avoids adding any new settings of its own.
|
||||||
|
|
||||||
|
|
||||||
The Status Quo
|
The Status Quo
|
||||||
|
@ -238,13 +248,14 @@ Ignoring Environment Variables
|
||||||
------------------------------
|
------------------------------
|
||||||
|
|
||||||
The ``-E`` command line option allows all environment variables to be
|
The ``-E`` command line option allows all environment variables to be
|
||||||
ignored when initialising the Python interpreter. An embedding application
|
ignored when initializing the Python interpreter. An embedding application
|
||||||
can enable this behaviour by setting ``Py_IgnoreEnvironmentFlag`` before
|
can enable this behaviour by setting ``Py_IgnoreEnvironmentFlag`` before
|
||||||
calling ``Py_Initialize()``.
|
calling ``Py_Initialize()``.
|
||||||
|
|
||||||
In the CPython source code, the ``Py_GETENV`` macro implicitly checks this
|
In the CPython source code, the ``Py_GETENV`` macro implicitly checks this
|
||||||
flag, and always produces ``NULL`` if it is set.
|
flag, and always produces ``NULL`` if it is set.
|
||||||
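The flag check described above can be sketched in a few lines of C. This is only an illustration of the behaviour, not CPython's source: `demo_ignore_environment` and `demo_getenv` are hypothetical stand-ins for `Py_IgnoreEnvironmentFlag` and `Py_GETENV`.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for Py_IgnoreEnvironmentFlag (set by -E). */
static int demo_ignore_environment = 0;

/* Hypothetical stand-in for Py_GETENV: every variable is reported as
   unset while the ignore flag is non-zero, otherwise the lookup is
   delegated to getenv(). */
static const char *demo_getenv(const char *name)
{
    assert(name != NULL);
    if (demo_ignore_environment) {
        return NULL;
    }
    return getenv(name);
}
```

Because every lookup goes through a single wrapper, one flag is enough to make all Python related variables appear undefined.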
|
|
||||||
|
<TBD: I believe PYTHONCASEOK is checked regardless of this setting >
|
||||||
<TBD: Does -E also ignore Windows registry keys? >
|
<TBD: Does -E also ignore Windows registry keys? >
|
||||||
|
|
||||||
|
|
||||||
|
@ -266,7 +277,8 @@ rather than in ``Py_Initialize()``).
|
||||||
|
|
||||||
The new configuration API should make it straightforward for an
|
The new configuration API should make it straightforward for an
|
||||||
embedding application to reuse the ``PYTHONHASHSEED`` processing with
|
embedding application to reuse the ``PYTHONHASHSEED`` processing with
|
||||||
a text based configuration setting provided by other means.
|
a text based configuration setting provided by other means (e.g. a
|
||||||
|
config file or separate environment variable).
|
||||||
|
|
||||||
|
|
||||||
Locating Python and the standard library
|
Locating Python and the standard library
|
||||||
|
@ -301,7 +313,7 @@ Configuring ``sys.path``
|
||||||
An embedding application may call ``Py_SetPath()`` prior to
|
An embedding application may call ``Py_SetPath()`` prior to
|
||||||
``Py_Initialize()`` to completely override the calculation of
|
``Py_Initialize()`` to completely override the calculation of
|
||||||
``sys.path``. It is not straightforward to only allow *some* of the
|
``sys.path``. It is not straightforward to only allow *some* of the
|
||||||
calculations, as modifying ``sys.path`` after initialisation is
|
calculations, as modifying ``sys.path`` after initialization is
|
||||||
already complete means those modifications will not be in effect
|
already complete means those modifications will not be in effect
|
||||||
when standard library modules are imported during the startup sequence.
|
when standard library modules are imported during the startup sequence.
|
||||||
|
|
||||||
|
@ -332,10 +344,10 @@ for a given Python executable on a given system:
|
||||||
|
|
||||||
(Note: you can see similar information using ``-m site`` instead of ``-c``,
|
(Note: you can see similar information using ``-m site`` instead of ``-c``,
|
||||||
but this is slightly misleading as it calls ``os.path.abspath`` on all of the
|
but this is slightly misleading as it calls ``os.path.abspath`` on all of the
|
||||||
path entries (making relative path entries look absolute), and also causes
|
path entries, making relative path entries look absolute. Using the ``site``
|
||||||
problems in the last case, as on Python versions prior to 3.3, explicitly
|
module also causes problems in the last case, as on Python versions prior to
|
||||||
importing site will carry out the path modifications ``-S`` avoids, while on
|
3.3, explicitly importing site will carry out the path modifications ``-S``
|
||||||
3.3+ combining ``-m site`` with ``-S`` currently fails)
|
avoids, while on 3.3+ combining ``-m site`` with ``-S`` currently fails)
|
||||||
|
|
||||||
The calculation of ``sys.path[0]`` is comparatively straightforward:
|
The calculation of ``sys.path[0]`` is comparatively straightforward:
|
||||||
|
|
||||||
|
@ -386,7 +398,7 @@ However, the ``runpy`` module does provide roughly equivalent logic in
|
||||||
Other configuration settings
|
Other configuration settings
|
||||||
----------------------------
|
----------------------------
|
||||||
|
|
||||||
TBD: Cover the initialisation of the following in more detail:
|
TBD: Cover the initialization of the following in more detail:
|
||||||
|
|
||||||
* The initial warning system state:
|
* The initial warning system state:
|
||||||
* ``sys.warnoptions``
|
* ``sys.warnoptions``
|
||||||
|
@ -419,7 +431,7 @@ TBD: Cover the initialisation of the following in more detail:
|
||||||
* ``no_site`` (don't implicitly import site during startup)
|
* ``no_site`` (don't implicitly import site during startup)
|
||||||
* ``ignore_environment`` (whether environment vars are used during config)
|
* ``ignore_environment`` (whether environment vars are used during config)
|
||||||
* ``verbose`` (enable all sorts of random output)
|
* ``verbose`` (enable all sorts of random output)
|
||||||
* ``bytes_warning`` (This may be obsolete in Py3k...)
|
* ``bytes_warning`` (warnings/errors for implicit str/bytes interaction)
|
||||||
* ``quiet`` (disable banner output even if verbose is also enabled or
|
* ``quiet`` (disable banner output even if verbose is also enabled or
|
||||||
stdin is a tty and the interpreter is launched in interactive mode)
|
stdin is a tty and the interpreter is launched in interactive mode)
|
||||||
|
|
||||||
|
@ -428,15 +440,15 @@ TBD: Cover the initialisation of the following in more detail:
|
||||||
Much of the configuration of CPython is currently handled through C level
|
Much of the configuration of CPython is currently handled through C level
|
||||||
global variables::
|
global variables::
|
||||||
|
|
||||||
Py_BytesWarningFlag
|
Py_BytesWarningFlag (-b)
|
||||||
Py_DebugFlag (-d option)
|
Py_DebugFlag (-d option)
|
||||||
Py_InspectFlag (-i option, PYTHONINSPECT)
|
Py_InspectFlag (-i option, PYTHONINSPECT)
|
||||||
Py_InteractiveFlag
|
Py_InteractiveFlag (property of stdin, cannot be overridden)
|
||||||
Py_OptimizeFlag (-O option, PYTHONOPTIMIZE)
|
Py_OptimizeFlag (-O option, PYTHONOPTIMIZE)
|
||||||
Py_DontWriteBytecodeFlag (-B option, PYTHONDONTWRITEBYTECODE)
|
Py_DontWriteBytecodeFlag (-B option, PYTHONDONTWRITEBYTECODE)
|
||||||
Py_NoUserSiteDirectory (-s option, PYTHONNOUSERSITE)
|
Py_NoUserSiteDirectory (-s option, PYTHONNOUSERSITE)
|
||||||
Py_NoSiteFlag (-S option)
|
Py_NoSiteFlag (-S option)
|
||||||
Py_UnbufferedStdioFlag
|
Py_UnbufferedStdioFlag (-u, PYTHONUNBUFFERED)
|
||||||
Py_VerboseFlag (-v option, PYTHONVERBOSE)
|
Py_VerboseFlag (-v option, PYTHONVERBOSE)
|
||||||
|
|
||||||
For the above variables, the conversion of command line options and
|
For the above variables, the conversion of command line options and
|
||||||
|
@ -463,34 +475,63 @@ first comment line in the main script)
|
||||||
Also see detailed sequence of operations notes at [1_]
|
Also see detailed sequence of operations notes at [1_]
|
||||||
|
|
||||||
|
|
||||||
Proposal
|
Design Details
|
||||||
========
|
==============
|
||||||
|
|
||||||
(Note: details here are still very much in flux, but preliminary feedback
|
(Note: details here are still very much in flux, but preliminary feedback
|
||||||
is appreciated anyway)
|
is appreciated anyway)
|
||||||
|
|
||||||
The main theme of this proposal is to create the interpreter state for
|
The main theme of this proposal is to create the interpreter state for
|
||||||
the main interpreter *much* earlier in the startup process. This will allow
|
the main interpreter *much* earlier in the startup process. This will allow
|
||||||
most of the CPython API to be used during the remainder of the initialisation
|
most of the CPython API to be used during the remainder of the initialization
|
||||||
process, potentially simplifying a number of operations that currently need
|
process, potentially simplifying a number of operations that currently need
|
||||||
to rely on basic C functionality rather than being able to use the richer
|
to rely on basic C functionality rather than being able to use the richer
|
||||||
data structures provided by the CPython C API.
|
data structures provided by the CPython C API.
|
||||||
|
|
||||||
|
In the following, the term "embedding application" also covers the standard
|
||||||
|
CPython command line application.
|
||||||
|
|
||||||
Core Interpreter Initialisation
|
|
||||||
-------------------------------
|
|
||||||
|
|
||||||
The only configuration that currently absolutely needs to be in place
|
Startup Phases
|
||||||
before even the interpreter core can be initialised is a flag indicating
|
--------------
|
||||||
whether or not to use a specific seed value for the randomised hashes, and
|
|
||||||
if so, the specific value for the seed (a seed value of zero disables
|
Four distinct phases are proposed:
|
||||||
randomised hashing).
|
|
||||||
|
* Pre-Initialization - no interpreter is available. Embedding application
|
||||||
|
determines the settings required to create the core interpreter and
|
||||||
|
moves to the next phase by calling ``Py_BeginInitialization``.
|
||||||
|
* Initialization - a limited interpreter is available. Embedding application
|
||||||
|
determines and applies the settings required to complete the initialization
|
||||||
|
process by calling ``Py_ReadConfiguration`` and ``Py_EndInitialization``.
|
||||||
|
* Pre-Main - the full interpreter is available, but ``__main__`` related
|
||||||
|
metadata is incomplete.
|
||||||
|
* Main Execution - normal interpreter operation
|
||||||
|
|
||||||
|
All 4 phases will be used by the standard CPython interpreter and the
|
||||||
|
proposed System Python interpreter. Other embedding applications may
|
||||||
|
choose to skip the step of executing code in the ``__main__`` module.
|
||||||
|
|
||||||
|
Pre-Initialization Phase
|
||||||
|
------------------------
|
||||||
|
|
||||||
|
The pre-initialization phase is where an embedding application determines
|
||||||
|
the settings which are absolutely required before the interpreter can be
|
||||||
|
initialized at all. Currently, the only configuration settings in this
|
||||||
|
category are those related to the randomised hash algorithm - the hash
|
||||||
|
algorithms must be consistent for the lifetime of the process, and so they
|
||||||
|
must be in place before the core interpreter is created.
|
||||||
|
|
||||||
|
The specific settings needed are a flag indicating whether or not to use a
|
||||||
|
specific seed value for the randomised hashes, and if so, the specific value
|
||||||
|
for the seed (a seed value of zero disables randomised hashing). In addition,
|
||||||
|
the question of whether or not to consider environment variables must be
|
||||||
|
addressed early.
|
||||||
|
|
||||||
The proposed API for this step in the startup sequence is::
|
The proposed API for this step in the startup sequence is::
|
||||||
|
|
||||||
void Py_BeginInitialization(Py_CoreConfig *config);
|
void Py_BeginInitialization(Py_CoreConfig *config);
|
||||||
|
|
||||||
Like Py_Initialize, this part of the new API treats initialisation failures
|
Like Py_Initialize, this part of the new API treats initialization failures
|
||||||
as fatal errors. While that's still not particularly embedding friendly,
|
as fatal errors. While that's still not particularly embedding friendly,
|
||||||
the operations in this step *really* shouldn't be failing, and changing them
|
the operations in this step *really* shouldn't be failing, and changing them
|
||||||
to return error codes instead of aborting would be an even larger task than
|
to return error codes instead of aborting would be an even larger task than
|
||||||
|
@ -500,16 +541,50 @@ The new Py_CoreConfig struct holds the settings required for preliminary
|
||||||
configuration::
|
configuration::
|
||||||
|
|
||||||
typedef struct {
|
typedef struct {
|
||||||
|
int ignore_environment;
|
||||||
int use_hash_seed;
|
int use_hash_seed;
|
||||||
unsigned long hash_seed;
|
unsigned long hash_seed;
|
||||||
} Py_CoreConfig;
|
} Py_CoreConfig;
|
||||||
|
|
||||||
To disable hash randomisation, set "use_hash_seed" and pass a hash seed of
|
The core configuration settings pointer may be ``NULL``, in which case the
|
||||||
zero. (This is the same approach already used when interpreting the
|
default values are ``ignore_environment = 0`` and ``use_hash_seed = -1``.
|
||||||
``PYTHONHASHSEED`` environment variable)
|
|
||||||
|
|
||||||
The core configuration settings pointer may be NULL, in which case the
|
``ignore_environment`` controls the processing of all Python related
|
||||||
default behaviour of randomised hashes with a random seed will be used.
|
environment variables. If the flag is zero, then environment variables are
|
||||||
|
processed normally. Otherwise, all Python-specific environment variables
|
||||||
|
are considered undefined (exceptions may be made for some OS specific
|
||||||
|
environment variables, such as those used on Mac OS X to communicate
|
||||||
|
between the App bundle and the main Python binary).
|
||||||
|
|
||||||
|
``use_hash_seed`` controls the configuration of the randomised hash
|
||||||
|
algorithm. If it is zero, then randomised hashes with a random seed will
|
||||||
|
be used. If it is positive, then the value in ``hash_seed`` will be used
|
||||||
|
to seed the random number generator. If the ``hash_seed`` is zero in this
|
||||||
|
case, then the randomised hashing is disabled completely.
|
||||||
|
|
||||||
|
If ``use_hash_seed`` is negative (and ``ignore_environment`` is zero),
|
||||||
|
then CPython will inspect the ``PYTHONHASHSEED`` environment variable. If it
|
||||||
|
is not set, is set to the empty string, or to the value ``"random"``, then
|
||||||
|
randomised hashes with a random seed will be used. If it is set to the string
|
||||||
|
``"0"`` the randomised hashing will be disabled. Otherwise, the hash seed is
|
||||||
|
expected to be a string representation of an integer in the range
|
||||||
|
``[0; 4294967295]``.
|
||||||
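The decision rules for ``use_hash_seed`` can be summarised as a small C sketch. Everything prefixed with `demo_` or `DEMO_` is hypothetical and exists only for illustration; `Demo_CoreConfig` simply repeats the fields of the proposed `Py_CoreConfig`. One case is not spelled out in the text: a negative ``use_hash_seed`` combined with a non-zero ``ignore_environment``. The sketch assumes that this falls back to a random seed, since the environment variable is then treated as undefined.

```c
#include <assert.h>
#include <stddef.h>

/* Same fields as the proposed Py_CoreConfig (illustrative copy). */
typedef struct {
    int ignore_environment;
    int use_hash_seed;
    unsigned long hash_seed;
} Demo_CoreConfig;

/* Hypothetical result type: where the hash seed comes from. */
typedef enum {
    DEMO_SEED_RANDOM,    /* randomised hashes with a random seed */
    DEMO_SEED_FIXED,     /* randomised hashes seeded with hash_seed */
    DEMO_SEED_DISABLED,  /* randomised hashing disabled completely */
    DEMO_SEED_FROM_ENV   /* defer to PYTHONHASHSEED processing */
} demo_seed_source;

static demo_seed_source
demo_seed_source_for(const Demo_CoreConfig *config, unsigned long *seed_out)
{
    /* A NULL config means ignore_environment = 0, use_hash_seed = -1. */
    int ignore_environment = config ? config->ignore_environment : 0;
    int use_hash_seed = config ? config->use_hash_seed : -1;
    unsigned long hash_seed = config ? config->hash_seed : 0;

    assert(seed_out != NULL);

    if (use_hash_seed == 0) {
        return DEMO_SEED_RANDOM;
    }
    if (use_hash_seed > 0) {
        if (hash_seed == 0) {
            return DEMO_SEED_DISABLED;
        }
        *seed_out = hash_seed;
        return DEMO_SEED_FIXED;
    }
    /* Negative: consult the environment unless told to ignore it.
       Assumption: with the environment ignored, use a random seed. */
    if (ignore_environment) {
        return DEMO_SEED_RANDOM;
    }
    return DEMO_SEED_FROM_ENV;
}
```

Keeping the tri-state flag separate from the seed value lets a seed of zero mean "disabled" without colliding with "no seed supplied".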
|
|
||||||
|
To make it easier for embedding applications to use the ``PYTHONHASHSEED``
|
||||||
|
processing with a different data source, the following helper function
|
||||||
|
will be added to the C API::
|
||||||
|
|
||||||
|
int Py_ReadHashSeed(char *seed_text,
|
||||||
|
int *use_hash_seed,
|
||||||
|
unsigned long *hash_seed);
|
||||||
|
|
||||||
|
This function accepts a seed string in ``seed_text`` and converts it to
|
||||||
|
the appropriate flag and seed values. If ``seed_text`` is ``NULL``,
|
||||||
|
the empty string or the value ``"random"``, both ``use_hash_seed`` and
|
||||||
|
``hash_seed`` will be set to zero. Otherwise, ``use_hash_seed`` will be set to
|
||||||
|
``1`` and the seed text will be interpreted as an integer and reported as
|
||||||
|
``hash_seed``. On success the function will return zero. A non-zero return
|
||||||
|
value indicates an error (most likely in the conversion to an integer).
|
||||||
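To make the described conversion concrete, here is one possible shape for such a helper. It is a sketch under stated assumptions, not the proposed implementation: `demo_read_hash_seed` is a hypothetical name, it takes `const char *` rather than the `char *` in the proposed signature, it accepts only plain decimal digits, and it enforces the range ``[0; 4294967295]`` given for ``PYTHONHASHSEED``.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical sketch of the proposed Py_ReadHashSeed helper.
   Returns 0 on success and -1 on a conversion error; on error the
   output parameters are left untouched. */
static int
demo_read_hash_seed(const char *seed_text,
                    int *use_hash_seed,
                    unsigned long *hash_seed)
{
    char *end = NULL;
    unsigned long value;

    assert(use_hash_seed != NULL && hash_seed != NULL);

    /* NULL, "" and "random" all request a random seed. */
    if (seed_text == NULL || seed_text[0] == '\0' ||
        strcmp(seed_text, "random") == 0) {
        *use_hash_seed = 0;
        *hash_seed = 0;
        return 0;
    }

    /* Assumption: reject signs and leading whitespace, which strtoul
       would otherwise accept silently. */
    if (seed_text[0] < '0' || seed_text[0] > '9') {
        return -1;
    }

    errno = 0;
    value = strtoul(seed_text, &end, 10);
    if (errno != 0 || *end != '\0' || value > 4294967295UL) {
        return -1;
    }

    *use_hash_seed = 1;
    *hash_seed = value;
    return 0;
}
```

An embedding application could pass text read from its own configuration file to a helper like this and copy the two results into the core configuration struct. A seed text of ``"0"`` yields ``use_hash_seed = 1`` with ``hash_seed = 0``, which disables randomised hashing.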
|
|
||||||
The aim is to keep this initial level of configuration as small as possible
|
The aim is to keep this initial level of configuration as small as possible
|
||||||
in order to keep the bootstrapping environment consistent across
|
in order to keep the bootstrapping environment consistent across
|
||||||
|
@ -519,14 +594,14 @@ to ``Py_EndInitialization()`` rather than in the core configuration.
|
||||||
|
|
||||||
A new query API will allow code to determine if the interpreter is in the
|
A new query API will allow code to determine if the interpreter is in the
|
||||||
bootstrapping state between the creation of the interpreter state and the
|
bootstrapping state between the creation of the interpreter state and the
|
||||||
completion of the bulk of the initialisation process::
|
completion of the bulk of the initialization process::
|
||||||
|
|
||||||
int Py_IsInitializing();
|
int Py_IsInitializing();
|
||||||
|
|
||||||
Attempting to call ``Py_BeginInitialization()`` again when
|
Attempting to call ``Py_BeginInitialization()`` again when
|
||||||
``Py_IsInitializing()`` or ``Py_IsInitialized()`` is true is a fatal error.
|
``Py_IsInitializing()`` or ``Py_IsInitialized()`` is true is a fatal error.
|
||||||
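The relationship between the startup phases and the two query functions can be modelled as a small state machine. The sketch below is purely illustrative: all `demo_` and `DEMO_` names are hypothetical, and where the PEP specifies a fatal error the model returns ``-1`` instead, so that the rule can be exercised without aborting the process.

```c
#include <assert.h>

/* Hypothetical model of the four proposed startup phases. */
typedef enum {
    DEMO_PRE_INITIALIZATION,  /* no interpreter available */
    DEMO_INITIALIZATION,      /* limited interpreter available */
    DEMO_PRE_MAIN,            /* full interpreter, __main__ metadata incomplete */
    DEMO_MAIN_EXECUTION       /* normal interpreter operation */
} demo_phase;

static demo_phase demo_current_phase = DEMO_PRE_INITIALIZATION;

/* Models Py_IsInitializing(): true only in the Initialization phase. */
static int demo_is_initializing(void)
{
    return demo_current_phase == DEMO_INITIALIZATION;
}

/* Models Py_IsInitialized(): true once initialization has completed. */
static int demo_is_initialized(void)
{
    return demo_current_phase >= DEMO_PRE_MAIN;
}

/* Models Py_BeginInitialization(). A repeated call is a fatal error in
   the proposal; this model reports it by returning -1. */
static int demo_begin_initialization(void)
{
    if (demo_is_initializing() || demo_is_initialized()) {
        return -1;
    }
    demo_current_phase = DEMO_INITIALIZATION;
    assert(demo_is_initializing() && !demo_is_initialized());
    return 0;
}

/* Models Py_EndInitialization(): only valid while initializing. */
static int demo_end_initialization(void)
{
    if (!demo_is_initializing()) {
        return -1;
    }
    demo_current_phase = DEMO_PRE_MAIN;
    return 0;
}
```

At most one of the two queries is true at any time, and both are false only before the first call that begins initialization.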
|
|
||||||
While in the initialising state, the interpreter should be fully functional
|
While in the initializing state, the interpreter should be fully functional
|
||||||
except that:
|
except that:
|
||||||
|
|
||||||
* compilation is not allowed (as the parser and compiler are not yet
|
* compilation is not allowed (as the parser and compiler are not yet
|
||||||
|
@ -551,7 +626,7 @@ except that:
|
||||||
* only builtin and frozen modules may be imported (due to above limitations)
|
* only builtin and frozen modules may be imported (due to above limitations)
|
||||||
* ``sys.stderr`` is set to a temporary IO object using unbuffered binary
|
* ``sys.stderr`` is set to a temporary IO object using unbuffered binary
|
||||||
mode
|
mode
|
||||||
* The ``warnings`` module is not yet initialised
|
* The ``warnings`` module is not yet initialized
|
||||||
* The ``__main__`` module does not yet exist
|
* The ``__main__`` module does not yet exist
|
||||||
|
|
||||||
<TBD: identify any other notable missing functionality>
|
<TBD: identify any other notable missing functionality>
|
||||||
|
@ -573,7 +648,7 @@ between (e.g. if attempting to read the configuration settings fails)
|
||||||
Determining the remaining configuration settings
|
Determining the remaining configuration settings
|
||||||
------------------------------------------------
|
------------------------------------------------
|
||||||
|
|
||||||
The next step in the initialisation sequence is to determine the full
|
The next step in the initialization sequence is to determine the full
|
||||||
settings needed to complete the process. No changes are made to the
|
settings needed to complete the process. No changes are made to the
|
||||||
interpreter state at this point. The core API for this step is::
|
interpreter state at this point. The core API for this step is::
|
||||||
|
|
||||||
|
@ -630,11 +705,12 @@ At least the following configuration settings will be supported::
|
||||||
<TBD: at least more from sys.flags need to go here>
|
<TBD: at least more from sys.flags need to go here>
|
||||||
|
|
||||||
|
|
||||||
Completing the interpreter initialisation
|
Completing the interpreter initialization
|
||||||
-----------------------------------------
|
-----------------------------------------
|
||||||
|
|
||||||
The final step in the process is to actually put the configuration settings
|
The final step in the initialization process is to actually put the
|
||||||
into effect and finish bootstrapping the interpreter up to full operation::
|
configuration settings into effect and finish bootstrapping the interpreter
|
||||||
|
up to full operation::
|
||||||
|
|
||||||
int Py_EndInitialization(PyObject *config);
|
int Py_EndInitialization(PyObject *config);
|
||||||
|
|
||||||
|
@ -648,7 +724,48 @@ is fully populated.
|
||||||
|
|
||||||
After a successful call, Py_IsInitializing() will be false, while
|
After a successful call, Py_IsInitializing() will be false, while
|
||||||
Py_IsInitialized() will become true. The caveats described above for the
|
Py_IsInitialized() will become true. The caveats described above for the
|
||||||
interpreter during the initialisation phase will no longer hold.
|
interpreter during the initialization phase will no longer hold.
|
||||||
|
|
||||||
|
However, some metadata related to the ``__main__`` module may still be
|
||||||
|
incomplete:
|
||||||
|
|
||||||
|
* ``sys.argv[0]`` may not yet have its final value
|
||||||
|
* it will be ``-m`` when executing a module or package with CPython
|
||||||
|
* it will be the same as ``sys.path[0]`` rather than the location of
|
||||||
|
the ``__main__`` module when executing a valid ``sys.path`` entry
|
||||||
|
(typically a zipfile or directory)
|
||||||
|
* the metadata in the ``__main__`` module will still indicate it is a
|
||||||
|
builtin module
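
The provisional ``-m`` value in the first bullet appears to mirror behaviour that existing CPython releases already exhibit while a module is being located. The sketch below is an illustration only, not part of the proposed API: it builds a throwaway package (the name ``demo_pkg`` and the helper function are hypothetical) and records ``sys.argv[0]`` both while the package is being imported and once ``__main__`` is running.

```python
import os
import subprocess
import sys
import tempfile

# Both files in the throwaway package simply report sys.argv[0].
REPORT = "import sys\nprint(sys.argv[0])\n"


def observe_argv0_under_dash_m():
    """Return sys.argv[0] as seen by demo_pkg/__init__.py and demo_pkg/__main__.py."""
    with tempfile.TemporaryDirectory() as workdir:
        pkg_dir = os.path.join(workdir, "demo_pkg")  # hypothetical package name
        os.mkdir(pkg_dir)
        for name in ("__init__.py", "__main__.py"):
            with open(os.path.join(pkg_dir, name), "w") as f:
                f.write(REPORT)
        # Make the package importable even if the current directory is not on sys.path.
        env = dict(os.environ, PYTHONPATH=workdir)
        result = subprocess.run(
            [sys.executable, "-m", "demo_pkg"],
            cwd=workdir, env=env, capture_output=True, text=True, check=True,
        )
    return result.stdout.splitlines()


if __name__ == "__main__":
    early, final = observe_argv0_under_dash_m()
    print("while locating the module:", early)
    print("while running __main__:   ", final)
```

The first reported value is the placeholder and the second is the final location of the ``__main__`` module, which is the gap the bullet describes.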
|
||||||
|
|
||||||
|
|
||||||
|
Executing the main module
|
||||||
|
-------------------------
|
||||||
|
|
||||||
|
<TBD>
|
||||||
|
|
||||||
|
Initial thought is that hiding the various options behind a single API
|
||||||
|
would make that API too complicated, so 3 separate APIs is more likely::
|
||||||
|
|
||||||
|
Py_RunPathAsMain
|
||||||
|
Py_RunModuleAsMain
|
||||||
|
Py_RunStreamAsMain
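
Since the C signatures are still to be determined, the split into three entry points can only be sketched by analogy. The following Python-level illustration uses the existing ``runpy`` module and ``exec``. The wrapper function names are hypothetical, and nothing here should be read as the eventual C API.

```python
import io
import runpy


def run_path_as_main(path):
    """Rough analogue of Py_RunPathAsMain: run a script, directory or zipfile."""
    return runpy.run_path(path, run_name="__main__")


def run_module_as_main(module_name):
    """Rough analogue of Py_RunModuleAsMain: locate a module by name and run it."""
    return runpy.run_module(module_name, run_name="__main__", alter_sys=True)


def run_stream_as_main(stream, filename="<stdin>"):
    """Rough analogue of Py_RunStreamAsMain: run source text read from a stream."""
    namespace = {"__name__": "__main__"}
    code = compile(stream.read(), filename, "exec")
    exec(code, namespace)
    return namespace


if __name__ == "__main__":
    result = run_stream_as_main(io.StringIO("answer = 6 * 7\n"))
    print(result["answer"])
```

Each wrapper takes a different kind of argument (a filesystem path, a dotted module name, an open stream), which is one way to see why a single combined entry point would need a more complicated signature.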
|
||||||
|
|
||||||
|
|
||||||
|
Internal Storage of Configuration Data
|
||||||
|
--------------------------------------
|
||||||
|
|
||||||
|
The interpreter state will be updated to include details of the configuration
|
||||||
|
settings supplied during initialization by extending the interpreter state
|
||||||
|
object with an embedded copy of the ``Py_CoreConfig`` struct and an
|
||||||
|
additional ``PyObject`` pointer to hold a reference to a copy of the
|
||||||
|
supplied configuration dictionary.
|
||||||
|
|
||||||
|
For debugging purposes, the copied configuration dictionary will be
|
||||||
|
exposed as ``sys._configuration``. It will include additional keys for
|
||||||
|
the fields in the ``Py_CoreConfig`` struct.
|
||||||
|
|
||||||
|
These are *snapshots* of the initial configuration settings. They are not
|
||||||
|
consulted by the interpreter during runtime.
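
The snapshot semantics described above can be modelled in a few lines of Python. This is a sketch under stated assumptions: the ``CoreConfig`` fields and the ``InterpreterState`` class are hypothetical stand-ins for the C-level structures, chosen only to show that the stored copies are independent of the caller's objects.

```python
import copy
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class CoreConfig:
    # Hypothetical stand-ins for a few Py_CoreConfig fields.
    ignore_environment: bool = False
    use_hash_seed: bool = False
    hash_seed: int = 0


class InterpreterState:
    """Toy model of an interpreter state that keeps copies of its configuration."""

    def __init__(self, core_config, config):
        self.core_config = copy.copy(core_config)  # embedded copy of the struct
        self.config = copy.deepcopy(config)        # copy of the supplied dict

    def configuration_snapshot(self):
        """Model of sys._configuration: the dict copy plus the core config fields."""
        snapshot = copy.deepcopy(self.config)
        snapshot.update(asdict(self.core_config))
        return snapshot


if __name__ == "__main__":
    supplied = {"argv": ["prog"]}
    state = InterpreterState(CoreConfig(ignore_environment=True), supplied)
    supplied["argv"].append("--late-change")  # does not reach the stored copy
    print(state.configuration_snapshot())
```

Because both the stored dictionary and the exposed snapshot are copies, later changes on either side leave the recorded initial settings untouched.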
|
||||||
|
|
||||||
|
|
||||||
Stable ABI
|
Stable ABI
|
||||||
|
@ -670,7 +787,7 @@ locations.
|
||||||
|
|
||||||
One acknowledged incompatibility is that some environment variables which
|
One acknowledged incompatibility is that some environment variables which
|
||||||
are currently read lazily may instead be read once during interpreter
|
are currently read lazily may instead be read once during interpreter
|
||||||
initialisation. As the PEP matures, these will be discussed in more detail
|
initialization. As the PEP matures, these will be discussed in more detail
|
||||||
on a case-by-case basis. The environment variables which are currently
|
on a case-by-case basis. The environment variables which are currently
|
||||||
known to be looked up dynamically are:
|
known to be looked up dynamically are:
|
||||||
|
|
||||||
|
@ -680,7 +797,7 @@ known to be looked up dynamically are:
|
||||||
* ``PYTHONINSPECT``: ``os.environ['PYTHONINSPECT']`` will still be checked
|
* ``PYTHONINSPECT``: ``os.environ['PYTHONINSPECT']`` will still be checked
|
||||||
after execution of the ``__main__`` module terminates
|
after execution of the ``__main__`` module terminates
|
||||||
|
|
||||||
The ``Py_Initialize()`` style of initialisation will continue to be
|
The ``Py_Initialize()`` style of initialization will continue to be
|
||||||
supported. It will use (at least some elements of) the new API
|
supported. It will use (at least some elements of) the new API
|
||||||
internally, but will continue to exhibit the same behaviour as it
|
internally, but will continue to exhibit the same behaviour as it
|
||||||
does today, ensuring that ``sys.argv`` is not populated until a subsequent
|
does today, ensuring that ``sys.argv`` is not populated until a subsequent
|
||||||
|
@ -691,7 +808,7 @@ continue to do so, and will also support being called prior to
|
||||||
|
|
||||||
To minimise unnecessary code churn, and to ensure the backwards compatibility
|
To minimise unnecessary code churn, and to ensure the backwards compatibility
|
||||||
is well tested, the main CPython executable may continue to use some elements
|
is well tested, the main CPython executable may continue to use some elements
|
||||||
of the old style initialisation API. (very much TBC)
|
of the old style initialization API. (very much TBC)
|
||||||
|
|
||||||
|
|
||||||
A System Python Executable
|
A System Python Executable
|
||||||
|
@ -712,8 +829,8 @@ application to make use of key components of ``Py_Main``. Including this
|
||||||
change in the PEP is designed to help avoid acceptance of a design that
|
change in the PEP is designed to help avoid acceptance of a design that
|
||||||
sounds good in theory but proves to be problematic in practice.
|
sounds good in theory but proves to be problematic in practice.
|
||||||
|
|
||||||
One final aspect not addressed by the general embedding changes above is
|
Better supporting this kind of "alternate CLI" is the main reason for the
|
||||||
the current inaccessibility of the core logic for deciding between the
|
proposed changes to better expose the core logic for deciding between the
|
||||||
different execution modes supported by CPython:
|
different execution modes supported by CPython:
|
||||||
|
|
||||||
* script execution
|
* script execution
|
||||||
|
@ -723,7 +840,6 @@ different execution modes supported by CPython:
|
||||||
* execution from stdin (non-interactive)
|
* execution from stdin (non-interactive)
|
||||||
* interactive stdin
|
* interactive stdin
|
||||||
|
|
||||||
<TBD: concrete proposal for better exposing the __main__ execution step>
|
|
||||||
|
|
||||||
Implementation
|
Implementation
|
||||||
==============
|
==============
|
||||||
|
|