diff --git a/pep-0432.txt b/pep-0432.txt index 99548a520..67f6df46a 100644 --- a/pep-0432.txt +++ b/pep-0432.txt @@ -242,250 +242,6 @@ adding additional configuration settings easier in the future, it deliberately avoids adding any new settings of its own. -The Status Quo -============== - -The current mechanisms for configuring the interpreter have accumulated in -a fairly ad hoc fashion over the past 20+ years, leading to a rather -inconsistent interface with varying levels of documentation. - -(Note: some of the info below could probably be cleaned up and added to the -C API documentation - it's all CPython specific, so it doesn't belong in -the language reference) - - -Ignoring Environment Variables ------------------------------- - -The ``-E`` command line option allows all environment variables to be -ignored when initializing the Python interpreter. An embedding application -can enable this behaviour by setting ``Py_IgnoreEnvironmentFlag`` before -calling ``Py_Initialize()``. - -In the CPython source code, the ``Py_GETENV`` macro implicitly checks this -flag, and always produces ``NULL`` if it is set. - - - - - -Randomised Hashing ------------------- - -The randomised hashing is controlled via the ``-R`` command line option (in -releases prior to 3.3), as well as the ``PYTHONHASHSEED`` environment -variable. - -In Python 3.3, only the environment variable remains relevant. It can be -used to disable randomised hashing (by using a seed value of 0) or else -to force a specific hash value (e.g. for repeatability of testing, or -to share hash values between processes) - -However, embedding applications must use the ``Py_HashRandomizationFlag`` -to explicitly request hash randomisation (CPython sets it in ``Py_Main()`` -rather than in ``Py_Initialize()``). - -The new configuration API should make it straightforward for an -embedding application to reuse the ``PYTHONHASHSEED`` processing with -a text based configuration setting provided by other means (e.g. 
a -config file or separate environment variable). - - -Locating Python and the standard library ----------------------------------------- - -The location of the Python binary and the standard library is influenced -by several elements. The algorithm used to perform the calculation is -not documented anywhere other than in the source code [3_,4_]. Even that -description is incomplete, as it failed to be updated for the virtual -environment support added in Python 3.3 (detailed in PEP 405). - -These calculations are affected by the following function calls (made -prior to calling ``Py_Initialize()``) and environment variables: - -* ``Py_SetProgramName()`` -* ``Py_SetPythonHome()`` -* ``PYTHONHOME`` - -The filesystem is also inspected for ``pyvenv.cfg`` files (see PEP 405) or, -failing that, a ``lib/os.py`` (Windows) or ``lib/python$VERSION/os.py`` -file. - -The build time settings for ``PREFIX`` and ``EXEC_PREFIX`` are also relevant, -as are some registry settings on Windows. The hardcoded fallbacks are -based on the layout of the CPython source tree and build output when -working in a source checkout. - - -Configuring ``sys.path`` ------------------------- - -An embedding application may call ``Py_SetPath()`` prior to -``Py_Initialize()`` to completely override the calculation of -``sys.path``. It is not straightforward to only allow *some* of the -calculations, as modifying ``sys.path`` after initialization is -already complete means those modifications will not be in effect -when standard library modules are imported during the startup sequence. - -If ``Py_SetPath()`` is not used prior to the first call to ``Py_GetPath()`` -(implicit in ``Py_Initialize()``), then it builds on the location data -calculations above to calculate suitable path entries, along with -the ``PYTHONPATH`` environment variable. 
- - - -The ``site`` module, which is implicitly imported at startup (unless -disabled via the ``-S`` option) adds additional paths to this initial -set of paths, as described in its documentation [5_]. - -The ``-s`` command line option can be used to exclude the user site -directory from the list of directories added. Embedding applications -can control this by setting the ``Py_NoUserSiteDirectory`` global variable. - -The following commands can be used to check the default path configurations -for a given Python executable on a given system: - -* ``./python -c "import sys, pprint; pprint.pprint(sys.path)"`` - - standard configuration -* ``./python -s -c "import sys, pprint; pprint.pprint(sys.path)"`` - - user site directory disabled -* ``./python -S -c "import sys, pprint; pprint.pprint(sys.path)"`` - - all site path modifications disabled - -(Note: you can see similar information using ``-m site`` instead of ``-c``, -but this is slightly misleading as it calls ``os.abspath`` on all of the -path entries, making relative path entries look absolute. Using the ``site`` -module also causes problems in the last case, as on Python versions prior to -3.3, explicitly importing site will carry out the path modifications ``-S`` -avoids, while on 3.3+ combining ``-m site`` with ``-S`` currently fails) - -The calculation of ``sys.path[0]`` is comparatively straightforward: - -* For an ordinary script (Python source or compiled bytecode), - ``sys.path[0]`` will be the directory containing the script. 
-* For a valid ``sys.path`` entry (typically a zipfile or directory), - ``sys.path[0]`` will be that path -* For an interactive session, running from stdin or when using the ``-c`` or - ``-m`` switches, ``sys.path[0]`` will be the empty string, which the import - system interprets as allowing imports from the current directory - - -Configuring ``sys.argv`` ------------------------- - -Unlike most other settings discussed in this PEP, ``sys.argv`` is not -set implicitly by ``Py_Initialize()``. Instead, it must be set via an -explicitly call to ``Py_SetArgv()``. - -CPython calls this in ``Py_Main()`` after calling ``Py_Initialize()``. The -calculation of ``sys.argv[1:]`` is straightforward: they're the command line -arguments passed after the script name or the argument to the ``-c`` or -``-m`` options. - -The calculation of ``sys.argv[0]`` is a little more complicated: - -* For an ordinary script (source or bytecode), it will be the script name -* For a ``sys.path`` entry (typically a zipfile or directory) it will - initially be the zipfile or directory name, but will later be changed by - the ``runpy`` module to the full path to the imported ``__main__`` module. -* For a module specified with the ``-m`` switch, it will initially be the - string ``"-m"``, but will later be changed by the ``runpy`` module to the - full path to the executed module. -* For a package specified with the ``-m`` switch, it will initially be the - string ``"-m"``, but will later be changed by the ``runpy`` module to the - full path to the executed ``__main__`` submodule of the package. -* For a command executed with ``-c``, it will be the string ``"-c"`` -* For explicitly requested input from stdin, it will be the string ``"-"`` -* Otherwise, it will be the empty string - -Embedding applications must call Py_SetArgv themselves. The CPython logic -for doing so is part of ``Py_Main()`` and is not exposed separately. 
-However, the ``runpy`` module does provide roughly equivalent logic in -``runpy.run_module`` and ``runpy.run_path``. - - - -Other configuration settings ----------------------------- - -TBD: Cover the initialization of the following in more detail: - -* Completely disabling the import system -* The initial warning system state: - * ``sys.warnoptions`` - * (-W option, PYTHONWARNINGS) -* Arbitrary extended options (e.g. to automatically enable ``faulthandler``): - * ``sys._xoptions`` - * (-X option) -* The filesystem encoding used by: - * ``sys.getfsencoding`` - * ``os.fsencode`` - * ``os.fsdecode`` -* The IO encoding and buffering used by: - * ``sys.stdin`` - * ``sys.stdout`` - * ``sys.stderr`` - * (-u option, PYTHONIOENCODING, PYTHONUNBUFFEREDIO) -* Whether or not to implicitly cache bytecode files: - * ``sys.dont_write_bytecode`` - * (-B option, PYTHONDONTWRITEBYTECODE) -* Whether or not to enforce correct case in filenames on case-insensitive - platforms - * ``os.environ["PYTHONCASEOK"]`` -* The other settings exposed to Python code in ``sys.flags``: - - * ``debug`` (Enable debugging output in the pgen parser) - * ``inspect`` (Enter interactive interpreter after __main__ terminates) - * ``interactive`` (Treat stdin as a tty) - * ``optimize`` (__debug__ status, write .pyc or .pyo, strip doc strings) - * ``no_user_site`` (don't add the user site directory to sys.path) - * ``no_site`` (don't implicitly import site during startup) - * ``ignore_environment`` (whether environment vars are used during config) - * ``verbose`` (enable all sorts of random output) - * ``bytes_warning`` (warnings/errors for implicit str/bytes interaction) - * ``quiet`` (disable banner output even if verbose is also enabled or - stdin is a tty and the interpreter is launched in interactive mode) - -* Whether or not CPython's signal handlers should be installed - -Much of the configuration of CPython is currently handled through C level -global variables:: - - Py_BytesWarningFlag (-b) - 
Py_DebugFlag (-d option) - Py_InspectFlag (-i option, PYTHONINSPECT) - Py_InteractiveFlag (property of stdin, cannot be overridden) - Py_OptimizeFlag (-O option, PYTHONOPTIMIZE) - Py_DontWriteBytecodeFlag (-B option, PYTHONDONTWRITEBYTECODE) - Py_NoUserSiteDirectory (-s option, PYTHONNOUSERSITE) - Py_NoSiteFlag (-S option) - Py_UnbufferedStdioFlag (-u, PYTHONUNBUFFEREDIO) - Py_VerboseFlag (-v option, PYTHONVERBOSE) - -For the above variables, the conversion of command line options and -environment variables to C global variables is handled by ``Py_Main``, -so each embedding application must set those appropriately in order to -change them from their defaults. - -Some configuration can only be provided as OS level environment variables:: - - PYTHONSTARTUP - PYTHONCASEOK - PYTHONIOENCODING - -The ``Py_InitializeEx()`` API also accepts a boolean flag to indicate -whether or not CPython's signal handlers should be installed. - -Finally, some interactive behaviour (such as printing the introductory -banner) is triggered only when standard input is reported as a terminal -connection by the operating system. - -TBD: Document how the "-x" option is handled (skips processing of the -first comment line in the main script) - -Also see detailed sequence of operations notes at [1_] - - Design Details ============== @@ -623,7 +379,7 @@ configuration:: #define Py_CoreConfig_INIT {0, -1, 0, 0} The core configuration settings pointer may be ``NULL``, in which case the -default values are ``ignore_environment = 0`` and ``use_hash_seed = -1``. +default values are ``ignore_environment = -1`` and ``use_hash_seed = -1``. The ``Py_CoreConfig_INIT`` macro is designed to allow easy initialization of a struct instance with sensible defaults:: @@ -1189,6 +945,250 @@ is intended for CPython builtin and extension modules) and into the Tools directory. 
+The Status Quo +============== + +The current mechanisms for configuring the interpreter have accumulated in +a fairly ad hoc fashion over the past 20+ years, leading to a rather +inconsistent interface with varying levels of documentation. + +(Note: some of the info below could probably be cleaned up and added to the +C API documentation for at least 3.3 - it's all CPython specific, so it +doesn't belong in the language reference) + + +Ignoring Environment Variables +------------------------------ + +The ``-E`` command line option allows all environment variables to be +ignored when initializing the Python interpreter. An embedding application +can enable this behaviour by setting ``Py_IgnoreEnvironmentFlag`` before +calling ``Py_Initialize()``. + +In the CPython source code, the ``Py_GETENV`` macro implicitly checks this +flag, and always produces ``NULL`` if it is set. + + + + + +Randomised Hashing +------------------ + +Randomised hashing is controlled via the ``-R`` command line option (in +releases prior to 3.3), as well as the ``PYTHONHASHSEED`` environment +variable. + +In Python 3.3, only the environment variable remains relevant. It can be +used to disable randomised hashing (by using a seed value of 0) or else +to force a specific seed value (e.g. for repeatability of testing, or +to share hash values between processes). + +However, embedding applications must use the ``Py_HashRandomizationFlag`` +to explicitly request hash randomisation (CPython sets it in ``Py_Main()`` +rather than in ``Py_Initialize()``). + +The new configuration API should make it straightforward for an +embedding application to reuse the ``PYTHONHASHSEED`` processing with +a text based configuration setting provided by other means (e.g. a +config file or separate environment variable). + + +Locating Python and the standard library +---------------------------------------- + +The location of the Python binary and the standard library is influenced +by several elements.
The algorithm used to perform the calculation is +not documented anywhere other than in the source code [3_,4_]. Even that +description is incomplete, as it failed to be updated for the virtual +environment support added in Python 3.3 (detailed in PEP 405). + +These calculations are affected by the following function calls (made +prior to calling ``Py_Initialize()``) and environment variables: + +* ``Py_SetProgramName()`` +* ``Py_SetPythonHome()`` +* ``PYTHONHOME`` + +The filesystem is also inspected for ``pyvenv.cfg`` files (see PEP 405) or, +failing that, a ``lib/os.py`` (Windows) or ``lib/python$VERSION/os.py`` +file. + +The build time settings for ``PREFIX`` and ``EXEC_PREFIX`` are also relevant, +as are some registry settings on Windows. The hardcoded fallbacks are +based on the layout of the CPython source tree and build output when +working in a source checkout. + + +Configuring ``sys.path`` +------------------------ + +An embedding application may call ``Py_SetPath()`` prior to +``Py_Initialize()`` to completely override the calculation of +``sys.path``. It is not straightforward to only allow *some* of the +calculations, as modifying ``sys.path`` after initialization is +already complete means those modifications will not be in effect +when standard library modules are imported during the startup sequence. + +If ``Py_SetPath()`` is not used prior to the first call to ``Py_GetPath()`` +(implicit in ``Py_Initialize()``), then it builds on the location data +calculations above to calculate suitable path entries, along with +the ``PYTHONPATH`` environment variable. + + + +The ``site`` module, which is implicitly imported at startup (unless +disabled via the ``-S`` option) adds additional paths to this initial +set of paths, as described in its documentation [5_]. + +The ``-s`` command line option can be used to exclude the user site +directory from the list of directories added. 
Embedding applications +can control this by setting the ``Py_NoUserSiteDirectory`` global variable. + +The following commands can be used to check the default path configurations +for a given Python executable on a given system: + +* ``./python -c "import sys, pprint; pprint.pprint(sys.path)"`` + - standard configuration +* ``./python -s -c "import sys, pprint; pprint.pprint(sys.path)"`` + - user site directory disabled +* ``./python -S -c "import sys, pprint; pprint.pprint(sys.path)"`` + - all site path modifications disabled + +(Note: you can see similar information using ``-m site`` instead of ``-c``, +but this is slightly misleading as it calls ``os.path.abspath`` on all of the +path entries, making relative path entries look absolute. Using the ``site`` +module also causes problems in the last case, as on Python versions prior to +3.3, explicitly importing site will carry out the path modifications ``-S`` +avoids, while on 3.3+ combining ``-m site`` with ``-S`` currently fails) + +The calculation of ``sys.path[0]`` is comparatively straightforward: + +* For an ordinary script (Python source or compiled bytecode), + ``sys.path[0]`` will be the directory containing the script. +* For a valid ``sys.path`` entry (typically a zipfile or directory), + ``sys.path[0]`` will be that path +* For an interactive session, running from stdin or when using the ``-c`` or + ``-m`` switches, ``sys.path[0]`` will be the empty string, which the import + system interprets as allowing imports from the current directory + + +Configuring ``sys.argv`` +------------------------ + +Unlike most other settings discussed in this PEP, ``sys.argv`` is not +set implicitly by ``Py_Initialize()``. Instead, it must be set via an +explicit call to ``PySys_SetArgv()``. + +CPython calls this in ``Py_Main()`` after calling ``Py_Initialize()``.
The +calculation of ``sys.argv[1:]`` is straightforward: they're the command line +arguments passed after the script name or the argument to the ``-c`` or +``-m`` options. + +The calculation of ``sys.argv[0]`` is a little more complicated: + +* For an ordinary script (source or bytecode), it will be the script name +* For a ``sys.path`` entry (typically a zipfile or directory) it will + initially be the zipfile or directory name, but will later be changed by + the ``runpy`` module to the full path to the imported ``__main__`` module. +* For a module specified with the ``-m`` switch, it will initially be the + string ``"-m"``, but will later be changed by the ``runpy`` module to the + full path to the executed module. +* For a package specified with the ``-m`` switch, it will initially be the + string ``"-m"``, but will later be changed by the ``runpy`` module to the + full path to the executed ``__main__`` submodule of the package. +* For a command executed with ``-c``, it will be the string ``"-c"`` +* For explicitly requested input from stdin, it will be the string ``"-"`` +* Otherwise, it will be the empty string + +Embedding applications must call ``PySys_SetArgv()`` themselves. The CPython logic +for doing so is part of ``Py_Main()`` and is not exposed separately. +However, the ``runpy`` module does provide roughly equivalent logic in +``runpy.run_module`` and ``runpy.run_path``. + + + +Other configuration settings +---------------------------- + +TBD: Cover the initialization of the following in more detail: + +* Completely disabling the import system +* The initial warning system state: + * ``sys.warnoptions`` + * (-W option, PYTHONWARNINGS) +* Arbitrary extended options (e.g.
to automatically enable ``faulthandler``): + * ``sys._xoptions`` + * (-X option) +* The filesystem encoding used by: + * ``sys.getfilesystemencoding`` + * ``os.fsencode`` + * ``os.fsdecode`` +* The IO encoding and buffering used by: + * ``sys.stdin`` + * ``sys.stdout`` + * ``sys.stderr`` + * (-u option, PYTHONIOENCODING, PYTHONUNBUFFERED) +* Whether or not to implicitly cache bytecode files: + * ``sys.dont_write_bytecode`` + * (-B option, PYTHONDONTWRITEBYTECODE) +* Whether or not to enforce correct case in filenames on case-insensitive + platforms + * ``os.environ["PYTHONCASEOK"]`` +* The other settings exposed to Python code in ``sys.flags``: + + * ``debug`` (Enable debugging output in the pgen parser) + * ``inspect`` (Enter interactive interpreter after __main__ terminates) + * ``interactive`` (Treat stdin as a tty) + * ``optimize`` (__debug__ status, write .pyc or .pyo, strip doc strings) + * ``no_user_site`` (don't add the user site directory to sys.path) + * ``no_site`` (don't implicitly import site during startup) + * ``ignore_environment`` (whether environment vars are used during config) + * ``verbose`` (enable all sorts of random output) + * ``bytes_warning`` (warnings/errors for implicit str/bytes interaction) + * ``quiet`` (disable banner output even if verbose is also enabled or + stdin is a tty and the interpreter is launched in interactive mode) + +* Whether or not CPython's signal handlers should be installed + +Much of the configuration of CPython is currently handled through C level +global variables:: + + Py_BytesWarningFlag (-b option) + Py_DebugFlag (-d option) + Py_InspectFlag (-i option, PYTHONINSPECT) + Py_InteractiveFlag (property of stdin, cannot be overridden) + Py_OptimizeFlag (-O option, PYTHONOPTIMIZE) + Py_DontWriteBytecodeFlag (-B option, PYTHONDONTWRITEBYTECODE) + Py_NoUserSiteDirectory (-s option, PYTHONNOUSERSITE) + Py_NoSiteFlag (-S option) + Py_UnbufferedStdioFlag (-u option, PYTHONUNBUFFERED) + Py_VerboseFlag (-v option, PYTHONVERBOSE) + +For
the above variables, the conversion of command line options and +environment variables to C global variables is handled by ``Py_Main``, +so each embedding application must set those appropriately in order to +change them from their defaults. + +Some configuration can only be provided as OS level environment variables:: + + PYTHONSTARTUP + PYTHONCASEOK + PYTHONIOENCODING + +The ``Py_InitializeEx()`` API also accepts a boolean flag to indicate +whether or not CPython's signal handlers should be installed. + +Finally, some interactive behaviour (such as printing the introductory +banner) is triggered only when standard input is reported as a terminal +connection by the operating system. + +TBD: Document how the "-x" option is handled (skips processing of the +first comment line in the main script) + +Also see detailed sequence of operations notes at [1_] + + References ==========