PEP: 587
Title: Python Initialization Configuration
Author: Victor Stinner, Nick Coghlan
BDFL-Delegate: Thomas Wouters
Discussions-To: python-dev@python.org
Status: Draft
Type: Standards Track
Content-Type: text/x-rst
Created: 27-Mar-2019
Python-Version: 3.8

Abstract
========

Add a new C API to configure the Python Initialization providing finer
control on the whole configuration and better error reporting.

It becomes possible to read the configuration and then override some
computed parameters before it is applied. It also becomes possible to
completely override how Python computes the module search paths
(``sys.path``).

The new `Isolated Configuration`_ provides sane default values to
isolate Python from the system. For example, to embed Python into an
application. Using the environment is now an opt-in option, rather than
an opt-out option. For example, environment variables, command line
arguments and global configuration variables are ignored by default.

Building a customized Python which behaves as the regular Python becomes
easier using the new ``Py_RunMain()`` function. Moreover, using the
`Python Configuration`_, ``PyConfig.argv`` arguments are now parsed the
same way the regular Python parses command line arguments, and
``PyConfig.xoptions`` are handled as ``-X opt`` command line options.

This extracts a subset of the API design from the PEP 432 development
and refactoring work that is now considered sufficiently stable to make
public (allowing 3rd party embedding applications access to the same
configuration APIs that the native CPython CLI is now using).

Rationale
=========

Python is highly configurable but its configuration evolved organically.
The initialization configuration is scattered all around the code using
different ways to set them: global configuration variables (ex:
``Py_IsolatedFlag``), environment variables (ex: ``PYTHONPATH``),
command line arguments (ex: ``-b``), configuration files (ex:
``pyvenv.cfg``), function calls (ex: ``Py_SetProgramName()``). A
straightforward and reliable way to configure Python is needed.

Some configuration parameters are not accessible from the C API, or not
easily. For example, there is no API to override the default values of
``sys.executable``.

Some options like ``PYTHONPATH`` can only be set using an environment
variable which has a side effect on Python child processes if not unset
properly.

Some options also depend on other options: see `Priority and Rules`_.
Python 3.7 API does not provide a consistent view of the overall
configuration.

The C API of Python 3.7 Initialization takes ``wchar_t*`` strings as
input whereas the Python filesystem encoding is set during the
initialization which can lead to mojibake.

Python 3.7 APIs like ``Py_Initialize()`` abort the process on memory
allocation failure which is not convenient when Python is embedded.
Moreover, ``Py_Main()`` could exit the process directly rather than
returning an exit code. The proposed new API reports the error or exit
code to the caller which can decide how to handle it.

Implementing the PEP 540 (UTF-8 Mode) and the new ``-X dev`` correctly
was almost impossible in Python 3.6. The code base has been deeply
reworked in Python 3.7 and then in Python 3.8 to read the configuration
into a structure with no side effect. It becomes possible to clear the
configuration (release memory) and read again the configuration if the
encoding changed. This is required to properly implement the UTF-8 Mode,
which changes the encoding using the ``-X utf8`` command line option.
Internally, bytes ``argv`` strings are decoded from the filesystem
encoding.
The ``-X dev`` changes the memory allocator (behaves as
``PYTHONMALLOC=debug``), whereas it was not possible to change the
memory allocator *while* parsing the command line arguments. The new
design of the internal implementation not only made it possible to
properly implement ``-X utf8`` and ``-X dev``, it also makes it much
easier to change the Python behavior, especially for corner cases like
that, and ensures that the configuration remains consistent: see
`Priority and Rules`_.

This PEP is a partial implementation of PEP 432 which is the overall
design. New fields can be added later to ``PyConfig`` structure to
finish the implementation of the PEP 432 (e.g. by adding a new partial
initialization API which allows to configure Python using Python objects
to finish the full initialization). However, those features are omitted
from this PEP as even the native CPython CLI doesn't work that way - the
public API proposal in this PEP is limited to features which have
already been implemented and adopted as private APIs for use in the
native CPython CLI.

Python Initialization C API
===========================

This PEP proposes to add the following new structures, functions and
macros.

New structures:

* ``PyConfig``
* ``PyInitError``
* ``PyPreConfig``
* ``PyWideStringList``

New functions:

* ``PyConfig_Clear(config)``
* ``PyConfig_InitIsolatedConfig()``
* ``PyConfig_InitPythonConfig()``
* ``PyConfig_Read(config)``
* ``PyConfig_SetArgv(config, argc, argv)``
* ``PyConfig_SetBytesArgv(config, argc, argv)``
* ``PyConfig_SetBytesString(config, config_str, str)``
* ``PyConfig_SetString(config, config_str, str)``
* ``PyInitError_Error(err_msg)``
* ``PyInitError_Exit(exitcode)``
* ``PyInitError_Failed(err)``
* ``PyInitError_IsError(err)``
* ``PyInitError_IsExit(err)``
* ``PyInitError_NoMemory()``
* ``PyInitError_Ok()``
* ``PyPreConfig_InitIsolatedConfig(preconfig)``
* ``PyPreConfig_InitPythonConfig(preconfig)``
* ``PyWideStringList_Append(list, item)``
* ``PyWideStringList_Insert(list, index, item)``
* ``Py_BytesMain(argc, argv)``
* ``Py_ExitInitError(err)``
* ``Py_InitializeFromConfig(config)``
* ``Py_PreInitialize(preconfig)``
* ``Py_PreInitializeFromArgs(preconfig, argc, argv)``
* ``Py_PreInitializeFromBytesArgs(preconfig, argc, argv)``
* ``Py_RunMain()``

This PEP also adds ``_PyRuntimeState.preconfig`` (``PyPreConfig`` type)
and ``PyInterpreterState.config`` (``PyConfig`` type) fields to these
internal structures. ``PyInterpreterState.config`` becomes the new
reference configuration, replacing global configuration variables and
other private variables.

PyWideStringList
----------------

``PyWideStringList`` is a list of ``wchar_t*`` strings.

``PyWideStringList`` structure fields:

* ``length`` (``Py_ssize_t``)
* ``items`` (``wchar_t**``)

Methods:

* ``PyInitError PyWideStringList_Append(PyWideStringList *list, const wchar_t *item)``:
  Append *item* to *list*.
* ``PyInitError PyWideStringList_Insert(PyWideStringList *list, Py_ssize_t index, const wchar_t *item)``:
  Insert *item* into *list* at *index*. If *index* is greater than
  *list* length, just append *item* to *list*.

If *length* is non-zero, *items* must be non-NULL and all strings must
be non-NULL.
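
The insert rule above (an *index* past the end of the list appends) can
be illustrated without linking against Python. The following is only a
sketch: ``wide_list``, ``wide_list_insert()``, ``wide_list_append()``
and ``wide_list_clear()`` are hypothetical stand-ins written with
``malloc()``, not the CPython implementation, and rejecting a negative
index is an assumption of this sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>
#include <wchar.h>

/* Hypothetical stand-in for PyWideStringList (same two fields). */
typedef struct {
    ptrdiff_t length;
    wchar_t **items;
} wide_list;

/* Insert a copy of item at index; an index greater than the list
   length appends. Returns 0 on success, -1 on a negative index,
   a NULL item or a memory allocation failure. */
int wide_list_insert(wide_list *list, ptrdiff_t index, const wchar_t *item)
{
    if (index < 0 || item == NULL) {
        return -1;
    }
    if (index > list->length) {
        index = list->length;
    }

    size_t size = (wcslen(item) + 1) * sizeof(wchar_t);
    wchar_t *copy = malloc(size);
    if (copy == NULL) {
        return -1;
    }
    memcpy(copy, item, size);

    wchar_t **items = realloc(list->items,
                              (size_t)(list->length + 1) * sizeof(wchar_t *));
    if (items == NULL) {
        free(copy);
        return -1;
    }
    /* Shift the tail one slot to the right to make room at index. */
    if (index < list->length) {
        memmove(&items[index + 1], &items[index],
                (size_t)(list->length - index) * sizeof(wchar_t *));
    }
    items[index] = copy;
    list->items = items;
    list->length++;
    return 0;
}

/* Append is an insert at the current end of the list. */
int wide_list_append(wide_list *list, const wchar_t *item)
{
    return wide_list_insert(list, list->length, item);
}

/* Release every string and the items array. */
void wide_list_clear(wide_list *list)
{
    for (ptrdiff_t i = 0; i < list->length; i++) {
        free(list->items[i]);
    }
    free(list->items);
    list->items = NULL;
    list->length = 0;
}
```

In this model the invariant stated above holds after every successful
call: when ``length`` is non-zero, ``items`` and every stored string are
non-NULL.
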

PyInitError
-----------

``PyInitError`` is a structure to store an error message or an exit code
for the Python Initialization. For an error, it stores the C function
name which created the error.

Example::

    PyInitError alloc(void **ptr, size_t size)
    {
        *ptr = PyMem_RawMalloc(size);
        if (*ptr == NULL) {
            return PyInitError_NoMemory();
        }
        return PyInitError_Ok();
    }

    int main(int argc, char **argv)
    {
        void *ptr;
        PyInitError err = alloc(&ptr, 16);
        if (PyInitError_Failed(err)) {
            Py_ExitInitError(err);
        }
        /* Memory from PyMem_RawMalloc() is released with PyMem_RawFree() */
        PyMem_RawFree(ptr);
        return 0;
    }

``PyInitError`` fields:

* ``exitcode`` (``int``):
  Argument passed to ``exit()``.
* ``err_msg`` (``const char*``):
  Error message.
* ``func`` (``const char *``):
  Name of the function which created an error, can be ``NULL``.
* private ``_type`` field: for internal usage only.

Functions to create an error:

* ``PyInitError_Ok()``: Success.
* ``PyInitError_Error(err_msg)``: Initialization error with a message.
* ``PyInitError_NoMemory()``: Memory allocation failure (out of memory).
* ``PyInitError_Exit(exitcode)``: Exit Python with the specified exit code.

Functions to handle an error:

* ``PyInitError_Failed(err)``: Is the result an error or an exit?
* ``PyInitError_IsError(err)``: Is the result an error?
* ``PyInitError_IsExit(err)``: Is the result an exit?
* ``Py_ExitInitError(err)``: Call ``exit(exitcode)`` if *err* is an
  exit, print the error and exit if *err* is an error. Must only be
  called with an error or an exit: if ``PyInitError_Failed(err)`` is
  true.
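
One detail of the functions above is easy to miss: an *exit* counts as
"failed" even when its exit code is 0. The sketch below models the three
outcomes with a hypothetical ``init_status`` type (it is not
``PyInitError`` itself and does not use any CPython function). The
helper ``status_to_exit_code()`` and the choice of exit code 1 for an
error are assumptions of this sketch; they show how an embedding
application could return a code to its caller instead of exiting the
process.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of the three possible outcomes. */
typedef enum { STATUS_OK, STATUS_ERROR, STATUS_EXIT } status_type;

typedef struct {
    status_type type;
    const char *func;     /* function which created the error, or NULL */
    const char *err_msg;  /* error message, or NULL */
    int exitcode;         /* only meaningful for STATUS_EXIT */
} init_status;

init_status status_ok(void)
{
    init_status s = {STATUS_OK, NULL, NULL, 0};
    return s;
}

init_status status_error(const char *func, const char *err_msg)
{
    init_status s = {STATUS_ERROR, func, err_msg, 0};
    return s;
}

init_status status_exit(int exitcode)
{
    init_status s = {STATUS_EXIT, NULL, NULL, exitcode};
    return s;
}

/* "Failed" means error OR exit, mirroring PyInitError_Failed(). */
int status_failed(init_status s)   { return s.type != STATUS_OK; }
int status_is_error(init_status s) { return s.type == STATUS_ERROR; }
int status_is_exit(init_status s)  { return s.type == STATUS_EXIT; }

/* Map a failed status to a process exit code without calling exit().
   Assumption of this sketch: an error maps to exit code 1. */
int status_to_exit_code(init_status s)
{
    if (status_is_exit(s)) {
        return s.exitcode;
    }
    return status_is_error(s) ? 1 : 0;
}
```
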

Preinitialization with PyPreConfig
----------------------------------

The ``PyPreConfig`` structure is used to preinitialize Python:

* Set the Python memory allocator
* Configure the LC_CTYPE locale
* Set the UTF-8 mode

Example using the preinitialization to enable the UTF-8 Mode::

    PyPreConfig preconfig;
    PyPreConfig_InitPythonConfig(&preconfig);

    preconfig.utf8_mode = 1;

    PyInitError err = Py_PreInitialize(&preconfig);
    if (PyInitError_Failed(err)) {
        Py_ExitInitError(err);
    }

    /* at this point, Python will speak UTF-8 */

    Py_Initialize();
    /* ... use Python API here ... */
    Py_Finalize();

Functions to initialize a pre-configuration:

* ``void PyPreConfig_InitIsolatedConfig(PyPreConfig *preconfig)``
* ``void PyPreConfig_InitPythonConfig(PyPreConfig *preconfig)``

Functions to preinitialize Python:

* ``PyInitError Py_PreInitialize(const PyPreConfig *preconfig)``
* ``PyInitError Py_PreInitializeFromBytesArgs(const PyPreConfig *preconfig, int argc, char * const *argv)``
* ``PyInitError Py_PreInitializeFromArgs(const PyPreConfig *preconfig, int argc, wchar_t * const * argv)``

The caller is responsible for handling an error or exit using
``PyInitError_Failed()`` and ``Py_ExitInitError()``.

If Python is initialized with command line arguments, the command line
arguments must also be passed to preinitialize Python, since they have
an effect on the pre-configuration like encodings. For example, the
``-X utf8`` command line option enables the UTF-8 Mode.

``PyPreConfig`` fields:

* ``allocator`` (``int``):
  Name of the memory allocator (ex: ``PYMEM_ALLOCATOR_MALLOC``).
  Valid values:

  * ``PYMEM_ALLOCATOR_NOT_SET`` (``0``):
    don't change memory allocators (use defaults)
  * ``PYMEM_ALLOCATOR_DEFAULT`` (``1``):
    default memory allocators
  * ``PYMEM_ALLOCATOR_DEBUG`` (``2``):
    enable debug hooks
  * ``PYMEM_ALLOCATOR_MALLOC`` (``3``):
    force usage of ``malloc()``
  * ``PYMEM_ALLOCATOR_MALLOC_DEBUG`` (``4``):
    ``malloc()`` with debug hooks
  * ``PYMEM_ALLOCATOR_PYMALLOC`` (``5``):
    Python "pymalloc" allocator
  * ``PYMEM_ALLOCATOR_PYMALLOC_DEBUG`` (``6``):
    pymalloc with debug hooks
  * Note: ``PYMEM_ALLOCATOR_PYMALLOC`` and
    ``PYMEM_ALLOCATOR_PYMALLOC_DEBUG`` are not supported if Python is
    configured using ``--without-pymalloc``

* ``configure_locale`` (``int``):
  Set the LC_CTYPE locale to the user preferred locale? If equal to 0,
  set ``coerce_c_locale`` and ``coerce_c_locale_warn`` to 0.
* ``coerce_c_locale`` (``int``):
  If equal to 2, coerce the C locale; if equal to 1, read the LC_CTYPE
  locale to decide if it should be coerced.
* ``coerce_c_locale_warn`` (``int``):
  If non-zero, emit a warning if the C locale is coerced.
* ``dev_mode`` (``int``):
  See ``PyConfig.dev_mode``.
* ``isolated`` (``int``):
  See ``PyConfig.isolated``.
* ``legacy_windows_fs_encoding`` (``int``):
  If non-zero, disable UTF-8 Mode, set the Python filesystem encoding to
  ``mbcs``, set the filesystem error handler to ``replace``.
* ``parse_argv`` (``int``):
  If non-zero, ``Py_PreInitializeFromArgs()`` and
  ``Py_PreInitializeFromBytesArgs()`` parse their ``argv`` argument the
  same way the regular Python parses command line arguments: see
  `Command Line Arguments`_.
* ``use_environment`` (``int``):
  See ``PyConfig.use_environment``.
* ``utf8_mode`` (``int``):
  If non-zero, enable the UTF-8 mode.

The ``legacy_windows_fs_encoding`` field is only available on Windows.

There is also a private field, for internal use only,
``_config_version`` (``int``): the configuration version, used for ABI
compatibility.

``PyMem_SetAllocator()`` can be called after ``Py_PreInitialize()`` and
before ``Py_InitializeFromConfig()`` to install a custom memory
allocator. It can be called before ``Py_PreInitialize()`` if
``allocator`` is set to ``PYMEM_ALLOCATOR_NOT_SET`` (default value).

Python memory allocation functions like ``PyMem_RawMalloc()`` must not
be used before Python preinitialization, whereas calling ``malloc()``
and ``free()`` directly is always safe. ``Py_DecodeLocale()`` must not
be called before the preinitialization.

Initialization with PyConfig
----------------------------

The ``PyConfig`` structure contains most parameters to configure Python.

Example setting the program name::

    void init_python(void)
    {
        PyInitError err;
        PyConfig config;

        err = PyConfig_InitPythonConfig(&config);
        if (PyInitError_Failed(err)) {
            goto fail;
        }

        /* Set the program name. Implicitly preinitialize Python. */
        err = PyConfig_SetString(&config, &config.program_name,
                                 L"/path/to/my_program");
        if (PyInitError_Failed(err)) {
            goto fail;
        }

        err = Py_InitializeFromConfig(&config);
        if (PyInitError_Failed(err)) {
            goto fail;
        }
        PyConfig_Clear(&config);
        return;

    fail:
        PyConfig_Clear(&config);
        Py_ExitInitError(err);
    }

``PyConfig`` methods:

* ``PyInitError PyConfig_InitPythonConfig(PyConfig *config)``:
  Initialize configuration with `Python Configuration`_.
* ``PyInitError PyConfig_InitIsolatedConfig(PyConfig *config)``:
  Initialize configuration with `Isolated Configuration`_.
* ``PyInitError PyConfig_SetString(PyConfig *config, wchar_t * const *config_str, const wchar_t *str)``:
  Copy the wide character string *str* into ``*config_str``.
  Preinitialize Python if needed.
* ``PyInitError PyConfig_SetBytesString(PyConfig *config, wchar_t * const *config_str, const char *str)``:
  Decode *str* using ``Py_DecodeLocale()`` and set the result into
  ``*config_str``.
  Preinitialize Python if needed.
* ``PyInitError PyConfig_SetArgv(PyConfig *config, int argc, wchar_t * const *argv)``:
  Set command line arguments from wide character strings.
  Preinitialize Python if needed.
* ``PyInitError PyConfig_SetBytesArgv(PyConfig *config, int argc, char * const *argv)``:
  Set command line arguments: decode bytes using ``Py_DecodeLocale()``.
  Preinitialize Python if needed.
* ``PyInitError PyConfig_Read(PyConfig *config)``:
  Read all Python configuration. Fields which are already initialized
  are left unchanged.
  Preinitialize Python if needed.
* ``void PyConfig_Clear(PyConfig *config)``:
  Release configuration memory.

Most ``PyConfig`` methods preinitialize Python if needed. In that case,
the Python preinitialization configuration is based on the ``PyConfig``.
If configuration fields which are in common with ``PyPreConfig`` are
tuned, they must be set before calling a ``PyConfig`` method:

* ``dev_mode``
* ``isolated``
* ``parse_argv``
* ``use_environment``

Moreover, if ``PyConfig_SetArgv()`` or ``PyConfig_SetBytesArgv()`` is
used, this method must be called first, before other methods, since the
preinitialization configuration depends on command line arguments (if
``parse_argv`` is non-zero).

Functions to initialize Python:

* ``PyInitError Py_InitializeFromConfig(const PyConfig *config)``:
  Initialize Python from *config* configuration.

The caller of these methods and functions is responsible for handling an
error or exit using ``PyInitError_Failed()`` and ``Py_ExitInitError()``.

``PyConfig`` fields:

* ``argv`` (``PyWideStringList``):
  Command line arguments, ``sys.argv``. See ``parse_argv`` to parse
  ``argv`` the same way the regular Python parses Python command line
  arguments. If ``argv`` is empty, an empty string is added to ensure
  that ``sys.argv`` always exists and is never empty.
* ``base_exec_prefix`` (``wchar_t*``):
  ``sys.base_exec_prefix``.
* ``base_prefix`` (``wchar_t*``):
  ``sys.base_prefix``.
* ``buffered_stdio`` (``int``):
  If equal to 0, enable unbuffered mode, making the stdout and stderr
  streams unbuffered.
* ``bytes_warning`` (``int``):
  If equal to 1, issue a warning when comparing ``bytes`` or
  ``bytearray`` with ``str``, or comparing ``bytes`` with ``int``. If
  greater than or equal to 2, raise a ``BytesWarning`` exception.
* ``check_hash_pycs_mode`` (``wchar_t*``):
  ``--check-hash-based-pycs`` command line option value (see PEP 552).
* ``configure_c_stdio`` (``int``):
  If non-zero, configure C standard streams (``stdin``, ``stdout``,
  ``stderr``). For example, set their mode to ``O_BINARY`` on Windows.
* ``dev_mode`` (``int``):
  Development mode.
* ``dump_refs`` (``int``):
  If non-zero, dump all objects which are still alive at exit.
* ``exec_prefix`` (``wchar_t*``):
  ``sys.exec_prefix``.
* ``executable`` (``wchar_t*``):
  ``sys.executable``.
* ``faulthandler`` (``int``):
  If non-zero, call ``faulthandler.enable()``.
* ``filesystem_encoding`` (``wchar_t*``):
  Filesystem encoding, ``sys.getfilesystemencoding()``.
* ``filesystem_errors`` (``wchar_t*``):
  Filesystem encoding errors, ``sys.getfilesystemencodeerrors()``.
* ``use_hash_seed`` (``int``),
  ``hash_seed`` (``unsigned long``):
  Randomized hash function seed.
* ``home`` (``wchar_t*``):
  Python home directory.
* ``import_time`` (``int``):
  If non-zero, profile import time.
* ``inspect`` (``int``):
  Enter interactive mode after executing a script or a command.
* ``install_signal_handlers`` (``int``):
  Install signal handlers?
* ``interactive`` (``int``):
  Interactive mode.
* ``legacy_windows_stdio`` (``int``, Windows only):
  If non-zero, use ``io.FileIO`` instead of ``WindowsConsoleIO`` for
  ``sys.stdin``, ``sys.stdout`` and ``sys.stderr``.
* ``malloc_stats`` (``int``):
  If non-zero, dump memory allocation statistics at exit.
* ``pythonpath_env`` (``wchar_t*``):
  Module search paths as a string separated by DELIM (usually ``:``).
  Initialized from ``PYTHONPATH`` environment variable value by default.
* ``module_search_paths_set`` (``int``),
  ``module_search_paths`` (``PyWideStringList``):
  ``sys.path``. If ``module_search_paths_set`` is equal to 0, the
  ``module_search_paths`` is replaced by the function computing the
  `Path Configuration`_.
* ``optimization_level`` (``int``):
  Compilation optimization level.
* ``parse_argv`` (``int``):
  If non-zero, parse ``argv`` the same way the regular Python parses
  command line arguments, and strip Python arguments from ``argv``: see
  `Command Line Arguments`_.
* ``parser_debug`` (``int``):
  If non-zero, turn on parser debugging output (for experts only,
  depending on compilation options).
* ``pathconfig_warnings`` (``int``):
  If equal to 0, suppress warnings when computing the path configuration
  (Unix only, Windows does not log any warning). Otherwise, warnings are
  written into stderr.
* ``prefix`` (``wchar_t*``):
  ``sys.prefix``.
* ``program_name`` (``wchar_t*``):
  Program name.
* ``pycache_prefix`` (``wchar_t*``):
  ``.pyc`` cache prefix.
* ``quiet`` (``int``):
  Quiet mode. For example, don't display the copyright and version
  messages even in interactive mode.
* ``run_command`` (``wchar_t*``):
  ``-c COMMAND`` argument.
* ``run_filename`` (``wchar_t*``):
  ``python3 SCRIPT`` argument.
* ``run_module`` (``wchar_t*``):
  ``python3 -m MODULE`` argument.
* ``show_alloc_count`` (``int``):
  Show allocation counts at exit?
* ``show_ref_count`` (``int``):
  Show total reference count at exit?
* ``site_import`` (``int``):
  Import the ``site`` module at startup?
* ``skip_source_first_line`` (``int``):
  Skip the first line of the source?
* ``stdio_encoding`` (``wchar_t*``),
  ``stdio_errors`` (``wchar_t*``):
  Encoding and encoding errors of ``sys.stdin``, ``sys.stdout`` and
  ``sys.stderr``.
* ``tracemalloc`` (``int``):
  If non-zero, call ``tracemalloc.start(value)``.
* ``user_site_directory`` (``int``):
  If non-zero, add user site directory to ``sys.path``.
* ``verbose`` (``int``):
  If non-zero, enable verbose mode.
* ``warnoptions`` (``PyWideStringList``):
  Options of the ``warnings`` module to build warnings filters.
* ``write_bytecode`` (``int``):
  If non-zero, write ``.pyc`` files.
* ``xoptions`` (``PyWideStringList``):
  ``sys._xoptions``.

If ``parse_argv`` is non-zero, ``argv`` arguments are parsed the same
way the regular Python parses command line arguments, and Python
arguments are stripped from ``argv``: see `Command Line Arguments`_.

The ``xoptions`` options are parsed to set other options: see
`-X Options`_.

``PyConfig`` private fields, for internal use only:

* ``_config_version`` (``int``):
  Configuration version, used for ABI compatibility.
* ``_config_init`` (``int``):
  Function used to initialize ``PyConfig``, used for preinitialization.
* ``_install_importlib`` (``int``):
  Install importlib?
* ``_init_main`` (``int``):
  If equal to 0, stop Python initialization before the "main" phase
  (see PEP 432).

More complete example modifying the default configuration, reading the
configuration, and then overriding some parameters::

    PyInitError init_python(const char *program_name)
    {
        PyInitError err;
        PyConfig config;

        err = PyConfig_InitPythonConfig(&config);
        if (PyInitError_Failed(err)) {
            goto done;
        }

        /* Set the program name before reading the configuration
           (decode byte string from the locale encoding).

           Implicitly preinitialize Python. */
        err = PyConfig_SetBytesString(&config, &config.program_name,
                                      program_name);
        if (PyInitError_Failed(err)) {
            goto done;
        }

        /* Read all configuration at once */
        err = PyConfig_Read(&config);
        if (PyInitError_Failed(err)) {
            goto done;
        }

        /* Append our custom search path to sys.path */
        err = PyWideStringList_Append(&config.module_search_paths,
                                      L"/path/to/more/modules");
        if (PyInitError_Failed(err)) {
            goto done;
        }

        /* Override executable computed by PyConfig_Read() */
        err = PyConfig_SetString(&config, &config.executable,
                                 L"/path/to/my_executable");
        if (PyInitError_Failed(err)) {
            goto done;
        }

        err = Py_InitializeFromConfig(&config);

    done:
        PyConfig_Clear(&config);
        return err;
    }

.. note::
   ``PyImport_FrozenModules``, ``PyImport_AppendInittab()`` and
   ``PyImport_ExtendInittab()`` functions are still relevant and
   continue to work as previously. They should be set or called before
   the Python initialization.

Isolated Configuration
----------------------

``PyPreConfig_InitIsolatedConfig()`` and
``PyConfig_InitIsolatedConfig()`` functions create a configuration to
isolate Python from the system. For example, to embed Python into an
application.

This configuration ignores global configuration variables, environment
variables and command line arguments (``argv`` is not parsed). The C
standard streams (ex: ``stdout``) and the LC_CTYPE locale are left
unchanged by default.

Configuration files are still used with this configuration. Set the
`Path Configuration`_ ("output fields") to ignore these configuration
files and avoid the function computing the default path configuration.

Python Configuration
--------------------

``PyPreConfig_InitPythonConfig()`` and ``PyConfig_InitPythonConfig()``
functions create a configuration to build a customized Python which
behaves as the regular Python.

Environment variables and command line arguments are used to configure
Python, whereas global configuration variables are ignored.

This function enables C locale coercion (PEP 538) and UTF-8 Mode (PEP
540) depending on the LC_CTYPE locale, ``PYTHONUTF8`` and
``PYTHONCOERCECLOCALE`` environment variables.

Example of customized Python always running in isolated mode::

    int main(int argc, char **argv)
    {
        PyConfig config;
        PyInitError err;

        err = PyConfig_InitPythonConfig(&config);
        if (PyInitError_Failed(err)) {
            goto fail;
        }

        config.isolated = 1;

        /* Decode command line arguments.
           Implicitly preinitialize Python (in isolated mode). */
        err = PyConfig_SetBytesArgv(&config, argc, argv);
        if (PyInitError_Failed(err)) {
            goto fail;
        }

        err = Py_InitializeFromConfig(&config);
        if (PyInitError_Failed(err)) {
            goto fail;
        }
        PyConfig_Clear(&config);

        return Py_RunMain();

    fail:
        PyConfig_Clear(&config);
        if (!PyInitError_IsExit(err)) {
            /* Display the error message and exit the process with
               non-zero exit code */
            Py_ExitInitError(err);
        }
        return err.exitcode;
    }

This example is a basic implementation of the "System Python Executable"
discussed in PEP 432.

Path Configuration
------------------

``PyConfig`` contains multiple fields for the path configuration:

* Path configuration input fields:

  * ``home``
  * ``pythonpath_env``
  * ``pathconfig_warnings``

* Path configuration output fields:

  * ``exec_prefix``
  * ``executable``
  * ``prefix``
  * ``module_search_paths_set``, ``module_search_paths``

It is possible to completely ignore the function computing the default
path configuration by setting explicitly all path configuration output
fields listed above. A string is considered as set even if it's an empty
string. ``module_search_paths`` is considered as set if
``module_search_paths_set`` is set to 1. In this case, path
configuration input fields are ignored as well.

Set ``pathconfig_warnings`` to 0 to suppress warnings when computing the
path configuration (Unix only, Windows does not log any warning).
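
The "all output fields are set" condition described above can be written
down as a small predicate. This is only a sketch: ``path_outputs`` and
``path_outputs_all_set()`` are hypothetical names holding a subset of
the ``PyConfig`` fields, and the sketch assumes that an unset string is
represented by ``NULL``.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical subset of PyConfig: path configuration output fields. */
typedef struct {
    const wchar_t *exec_prefix;
    const wchar_t *executable;
    const wchar_t *prefix;
    int module_search_paths_set;
} path_outputs;

/* Return 1 if every output field is set, 0 otherwise.
   Assumption: NULL means "not set"; an empty string counts as set. */
int path_outputs_all_set(const path_outputs *p)
{
    return p->exec_prefix != NULL
        && p->executable != NULL
        && p->prefix != NULL
        && p->module_search_paths_set == 1;
}
```

When the predicate is true, the model corresponds to the case where the
function computing the default path configuration is skipped and the
input fields are ignored.
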

If ``base_prefix`` or ``base_exec_prefix`` fields are not set, they
inherit their value from ``prefix`` and ``exec_prefix`` respectively.

If ``site_import`` is non-zero, ``sys.path`` can be modified by the
``site`` module. For example, if ``user_site_directory`` is non-zero,
the user site directory is added to ``sys.path`` (if it exists).

See also `Configuration Files`_ used by the path configuration.

Py_BytesMain()
--------------

Python 3.7 provides a high-level ``Py_Main()`` function which requires
passing command line arguments as ``wchar_t*`` strings. It is
non-trivial to use the correct encoding to decode bytes. Python has its
own set of issues with C locale coercion and UTF-8 Mode.

This PEP adds a new ``Py_BytesMain()`` function which takes command line
arguments as bytes::

    int Py_BytesMain(int argc, char **argv)

Py_RunMain()
------------

The new ``Py_RunMain()`` function executes the command
(``PyConfig.run_command``), the script (``PyConfig.run_filename``) or
the module (``PyConfig.run_module``) specified on the command line or in
the configuration, and then finalizes Python. It returns an exit status
that can be passed to the ``exit()`` function. ::

    int Py_RunMain(void);

See `Python Configuration`_ for an example of customized Python always
running in isolated mode using ``Py_RunMain()``.

Backwards Compatibility
=======================

This PEP only adds a new API: it leaves the existing API unchanged and
has no impact on the backwards compatibility.

The Python 3.7 ``Py_Initialize()`` function now disables the C locale
coercion (PEP 538) and the UTF-8 Mode (PEP 540) by default to prevent
mojibake. The new API using the `Python Configuration`_ is needed to
enable them automatically.

Annexes
=======

Comparison of Python and Isolated Configurations
------------------------------------------------

Differences between ``PyPreConfig_InitPythonConfig()``
and ``PyPreConfig_InitIsolatedConfig()``:

===============================  =======  ========
PyPreConfig                      Python   Isolated
===============================  =======  ========
``coerce_c_locale_warn``         -1       0
``coerce_c_locale``              -1       0
``configure_locale``             **1**    0
``dev_mode``                     -1       0
``isolated``                     -1       **1**
``legacy_windows_fs_encoding``   -1       0
``use_environment``              -1       0
``parse_argv``                   **1**    0
``utf8_mode``                    -1       0
===============================  =======  ========

Differences between ``PyConfig_InitPythonConfig()``
and ``PyConfig_InitIsolatedConfig()``:

===============================  =======  ========
PyConfig                         Python   Isolated
===============================  =======  ========
``configure_c_stdio``            **1**    0
``install_signal_handlers``      **1**    0
``isolated``                     0        **1**
``parse_argv``                   **1**    0
``pathconfig_warnings``          **1**    0
``use_environment``              **1**    0
``user_site_directory``          **1**    0
===============================  =======  ========

Priority and Rules
------------------

Priority of configuration parameters, highest to lowest:

* ``PyConfig``
* ``PyPreConfig``
* Configuration files
* Command line options
* Environment variables
* Global configuration variables

Priority of warning options, highest to lowest:

* ``PyConfig.warnoptions``
* ``PyConfig.dev_mode`` (add ``"default"``)
* ``PYTHONWARNINGS`` environment variables
* ``-W WARNOPTION`` command line argument
* ``PyConfig.bytes_warning`` (add ``"error::BytesWarning"`` if greater
  than 1, or add ``"default::BytesWarning"``)

Rules on ``PyConfig`` parameters:

* If ``isolated`` is non-zero, ``use_environment`` and
  ``user_site_directory`` are set to 0.
* If ``dev_mode`` is non-zero, ``allocator`` is set to ``"debug"``,
  ``faulthandler`` is set to 1, and ``"default"`` filter is added to
  ``warnoptions``.
  But the ``PYTHONMALLOC`` environment variable has the priority over
  ``dev_mode`` to set the memory allocator.
* If ``base_prefix`` is not set, it inherits ``prefix`` value.
* If ``base_exec_prefix`` is not set, it inherits ``exec_prefix`` value.
* If the ``python._pth`` configuration file is present, ``isolated`` is
  set to 1 and ``site_import`` is set to 0; but ``site_import`` is set
  to 1 if ``python._pth`` contains ``import site``.

Rules on ``PyConfig`` and ``PyPreConfig`` parameters:

* If ``PyPreConfig.legacy_windows_fs_encoding`` is non-zero,
  set ``PyPreConfig.utf8_mode`` to 0, set
  ``PyConfig.filesystem_encoding`` to ``mbcs``, and set
  ``PyConfig.filesystem_errors`` to ``replace``.

Configuration Files
-------------------

Python configuration files used by the `Path Configuration`_:

* ``pyvenv.cfg``
* ``python._pth`` (Windows only)
* ``pybuilddir.txt`` (Unix only)

Global Configuration Variables
------------------------------

Global configuration variables mapped to ``PyPreConfig`` fields:

========================================  ================================
Variable                                  Field
========================================  ================================
``Py_IgnoreEnvironmentFlag``              ``use_environment`` (NOT)
``Py_IsolatedFlag``                       ``isolated``
``Py_LegacyWindowsFSEncodingFlag``        ``legacy_windows_fs_encoding``
``Py_UTF8Mode``                           ``utf8_mode``
========================================  ================================

(NOT) means that the ``PyPreConfig`` value is the opposite of the global
configuration variable value. ``Py_LegacyWindowsFSEncodingFlag`` is only
available on Windows.

Global configuration variables mapped to ``PyConfig`` fields:

========================================  ================================
Variable                                  Field
========================================  ================================
``Py_BytesWarningFlag``                   ``bytes_warning``
``Py_DebugFlag``                          ``parser_debug``
``Py_DontWriteBytecodeFlag``              ``write_bytecode`` (NOT)
``Py_FileSystemDefaultEncodeErrors``      ``filesystem_errors``
``Py_FileSystemDefaultEncoding``          ``filesystem_encoding``
``Py_FrozenFlag``                         ``pathconfig_warnings`` (NOT)
``Py_HasFileSystemDefaultEncoding``       ``filesystem_encoding``
``Py_HashRandomizationFlag``              ``use_hash_seed``, ``hash_seed``
``Py_IgnoreEnvironmentFlag``              ``use_environment`` (NOT)
``Py_InspectFlag``                        ``inspect``
``Py_InteractiveFlag``                    ``interactive``
``Py_IsolatedFlag``                       ``isolated``
``Py_LegacyWindowsStdioFlag``             ``legacy_windows_stdio``
``Py_NoSiteFlag``                         ``site_import`` (NOT)
``Py_NoUserSiteDirectory``                ``user_site_directory`` (NOT)
``Py_OptimizeFlag``                       ``optimization_level``
``Py_QuietFlag``                          ``quiet``
``Py_UnbufferedStdioFlag``                ``buffered_stdio`` (NOT)
``Py_VerboseFlag``                        ``verbose``
``_Py_HasFileSystemDefaultEncodeErrors``  ``filesystem_errors``
========================================  ================================

(NOT) means that the ``PyConfig`` value is the opposite of the global
configuration variable value. ``Py_LegacyWindowsStdioFlag`` is only
available on Windows.
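
To make the (NOT) marker concrete, the sketch below copies a few flag
values into a configuration-like structure, inverting the ones marked
(NOT) in the table. It is an illustration only: ``legacy_flags``,
``config_subset`` and ``config_from_flags()`` are hypothetical names,
the flags are passed in as plain integers rather than read from the real
global variables, and this is not how CPython implements the mapping.

```c
#include <assert.h>

/* Hypothetical snapshot of a few global configuration variables. */
typedef struct {
    int ignore_environment;   /* Py_IgnoreEnvironmentFlag */
    int dont_write_bytecode;  /* Py_DontWriteBytecodeFlag */
    int no_site;              /* Py_NoSiteFlag */
    int unbuffered_stdio;     /* Py_UnbufferedStdioFlag */
    int verbose;              /* Py_VerboseFlag */
} legacy_flags;

/* Hypothetical subset of the matching PyConfig fields. */
typedef struct {
    int use_environment;
    int write_bytecode;
    int site_import;
    int buffered_stdio;
    int verbose;
} config_subset;

/* Fields marked (NOT) get the logical negation of the flag;
   the other fields copy the flag value unchanged. */
void config_from_flags(config_subset *config, const legacy_flags *flags)
{
    config->use_environment = !flags->ignore_environment;
    config->write_bytecode = !flags->dont_write_bytecode;
    config->site_import = !flags->no_site;
    config->buffered_stdio = !flags->unbuffered_stdio;
    config->verbose = flags->verbose;
}
```
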

Command Line Arguments
----------------------

Usage::

    python3 [options]
    python3 [options] -c COMMAND
    python3 [options] -m MODULE
    python3 [options] SCRIPT

Command line options mapped to pseudo-action on ``PyPreConfig`` fields:

================================  ==========================================
Option                            ``PyPreConfig`` field
================================  ==========================================
``-E``                            ``use_environment = 0``
``-I``                            ``isolated = 1``
``-X dev``                        ``dev_mode = 1``
``-X utf8``                       ``utf8_mode = 1``
``-X utf8=VALUE``                 ``utf8_mode = VALUE``
================================  ==========================================

Command line options mapped to pseudo-action on ``PyConfig`` fields:

================================  ==========================================
Option                            ``PyConfig`` field
================================  ==========================================
``-b``                            ``bytes_warning++``
``-B``                            ``write_bytecode = 0``
``-c COMMAND``                    ``run_command = COMMAND``
``--check-hash-based-pycs=MODE``  ``check_hash_pycs_mode = MODE``
``-d``                            ``parser_debug++``
``-E``                            ``use_environment = 0``
``-i``                            ``inspect++`` and ``interactive++``
``-I``                            ``isolated = 1``
``-m MODULE``                     ``run_module = MODULE``
``-O``                            ``optimization_level++``
``-q``                            ``quiet++``
``-R``                            ``use_hash_seed = 0``
``-s``                            ``user_site_directory = 0``
``-S``                            ``site_import = 0``
``-t``                            ignored (kept for backwards compatibility)
``-u``                            ``buffered_stdio = 0``
``-v``                            ``verbose++``
``-W WARNING``                    add ``WARNING`` to ``warnoptions``
``-x``                            ``skip_source_first_line = 1``
``-X OPTION``                     add ``OPTION`` to ``xoptions``
================================  ==========================================

``-h``, ``-?`` and ``-V`` options are handled without ``PyConfig``.
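
The pseudo-actions in the table above can be read as plain assignments
and increments on a structure. The following sketch applies a few
single-letter options to a hypothetical ``cli_config`` structure and
then applies the ``isolated`` rule from `Priority and Rules`_. The names
``cli_config``, ``cli_config_init()``, ``cli_apply_option()`` and
``cli_apply_rules()`` are assumptions of this sketch, as are the
starting values; it does not reproduce the CPython command line parser.

```c
#include <assert.h>

/* Hypothetical subset of PyConfig fields touched by simple options. */
typedef struct {
    int bytes_warning;
    int write_bytecode;
    int optimization_level;
    int quiet;
    int verbose;
    int site_import;
    int user_site_directory;
    int use_environment;
    int isolated;
    int buffered_stdio;
} cli_config;

/* Starting values modeled on the Python Configuration defaults
   (assumed for this sketch). */
void cli_config_init(cli_config *c)
{
    c->bytes_warning = 0;
    c->write_bytecode = 1;
    c->optimization_level = 0;
    c->quiet = 0;
    c->verbose = 0;
    c->site_import = 1;
    c->user_site_directory = 1;
    c->use_environment = 1;
    c->isolated = 0;
    c->buffered_stdio = 1;
}

/* Apply the pseudo-action of one option letter.
   Returns 0 if the option is handled here, -1 otherwise. */
int cli_apply_option(cli_config *c, char opt)
{
    switch (opt) {
    case 'b': c->bytes_warning++; break;
    case 'B': c->write_bytecode = 0; break;
    case 'E': c->use_environment = 0; break;
    case 'I': c->isolated = 1; break;
    case 'O': c->optimization_level++; break;
    case 'q': c->quiet++; break;
    case 's': c->user_site_directory = 0; break;
    case 'S': c->site_import = 0; break;
    case 'u': c->buffered_stdio = 0; break;
    case 'v': c->verbose++; break;
    default: return -1;
    }
    return 0;
}

/* Rule: if isolated is non-zero, use_environment and
   user_site_directory are set to 0. */
void cli_apply_rules(cli_config *c)
{
    if (c->isolated) {
        c->use_environment = 0;
        c->user_site_directory = 0;
    }
}
```

Keeping the rule in a separate step mirrors the distinction the PEP
makes between what an option sets directly and what is derived
afterwards to keep the configuration consistent.
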
-X Options
----------

-X options mapped to pseudo-action on ``PyConfig`` fields:

================================  ================================
Option                            ``PyConfig`` field
================================  ================================
``-X dev``                        ``dev_mode = 1``
``-X faulthandler``               ``faulthandler = 1``
``-X importtime``                 ``import_time = 1``
``-X pycache_prefix=PREFIX``      ``pycache_prefix = PREFIX``
``-X showalloccount``             ``show_alloc_count = 1``
``-X showrefcount``               ``show_ref_count = 1``
``-X tracemalloc=N``              ``tracemalloc = N``
================================  ================================

Environment Variables
---------------------

Environment variables mapped to ``PyPreConfig`` fields:

=================================  =============================================
Variable                           ``PyPreConfig`` field
=================================  =============================================
``PYTHONCOERCECLOCALE``            ``coerce_c_locale``, ``coerce_c_locale_warn``
``PYTHONDEVMODE``                  ``dev_mode``
``PYTHONLEGACYWINDOWSFSENCODING``  ``legacy_windows_fs_encoding``
``PYTHONMALLOC``                   ``allocator``
``PYTHONUTF8``                     ``utf8_mode``
=================================  =============================================

Environment variables mapped to ``PyConfig`` fields:

=================================  ====================================
Variable                           ``PyConfig`` field
=================================  ====================================
``PYTHONDEBUG``                    ``parser_debug``
``PYTHONDEVMODE``                  ``dev_mode``
``PYTHONDONTWRITEBYTECODE``        ``write_bytecode``
``PYTHONDUMPREFS``                 ``dump_refs``
``PYTHONEXECUTABLE``               ``program_name``
``PYTHONFAULTHANDLER``             ``faulthandler``
``PYTHONHASHSEED``                 ``use_hash_seed``, ``hash_seed``
``PYTHONHOME``                     ``home``
``PYTHONINSPECT``                  ``inspect``
``PYTHONIOENCODING``               ``stdio_encoding``, ``stdio_errors``
``PYTHONLEGACYWINDOWSSTDIO``       ``legacy_windows_stdio``
``PYTHONMALLOCSTATS``              ``malloc_stats``
``PYTHONNOUSERSITE``               ``user_site_directory``
``PYTHONOPTIMIZE``                 ``optimization_level``
``PYTHONPATH``                     ``pythonpath_env``
``PYTHONPROFILEIMPORTTIME``        ``import_time``
``PYTHONPYCACHEPREFIX``            ``pycache_prefix``
``PYTHONTRACEMALLOC``              ``tracemalloc``
``PYTHONUNBUFFERED``               ``buffered_stdio``
``PYTHONVERBOSE``                  ``verbose``
``PYTHONWARNINGS``                 ``warnoptions``
=================================  ====================================

``PYTHONLEGACYWINDOWSFSENCODING`` and ``PYTHONLEGACYWINDOWSSTDIO`` are
specific to Windows.

Default Python Configuration
----------------------------

``PyPreConfig_InitPythonConfig()``:

* ``allocator`` = ``PYMEM_ALLOCATOR_NOT_SET``
* ``coerce_c_locale_warn`` = -1
* ``coerce_c_locale`` = -1
* ``configure_locale`` = 1
* ``dev_mode`` = -1
* ``isolated`` = -1
* ``legacy_windows_fs_encoding`` = -1
* ``use_environment`` = -1
* ``utf8_mode`` = -1

``PyConfig_InitPythonConfig()``:

* ``argv`` = []
* ``base_exec_prefix`` = ``NULL``
* ``base_prefix`` = ``NULL``
* ``buffered_stdio`` = 1
* ``bytes_warning`` = 0
* ``check_hash_pycs_mode`` = ``NULL``
* ``configure_c_stdio`` = 1
* ``dev_mode`` = 0
* ``dump_refs`` = 0
* ``exec_prefix`` = ``NULL``
* ``executable`` = ``NULL``
* ``faulthandler`` = 0
* ``filesystem_encoding`` = ``NULL``
* ``filesystem_errors`` = ``NULL``
* ``hash_seed`` = 0
* ``home`` = ``NULL``
* ``import_time`` = 0
* ``inspect`` = 0
* ``install_signal_handlers`` = 1
* ``interactive`` = 0
* ``isolated`` = 0
* ``malloc_stats`` = 0
* ``pythonpath_env`` = ``NULL``
* ``module_search_paths`` = []
* ``optimization_level`` = 0
* ``parse_argv`` = 1
* ``parser_debug`` = 0
* ``pathconfig_warnings`` = 1
* ``prefix`` = ``NULL``
* ``program_name`` = ``NULL``
* ``pycache_prefix`` = ``NULL``
* ``quiet`` = 0
* ``run_command`` = ``NULL``
* ``run_filename`` = ``NULL``
* ``run_module`` = ``NULL``
* ``show_alloc_count`` = 0
* ``show_ref_count`` = 0
* ``site_import`` = 1
* ``skip_source_first_line`` = 0
* ``stdio_encoding`` = ``NULL``
* ``stdio_errors`` = ``NULL``
* ``tracemalloc`` = 0
* ``use_environment`` = 1
* ``use_hash_seed`` = 0
* ``user_site_directory`` = 1
* ``verbose`` = 0
* ``warnoptions`` = []
* ``write_bytecode`` = 1
* ``xoptions`` = []
* ``_init_main`` = 1
* ``_install_importlib`` = 1

Default Isolated Configuration
------------------------------

``PyPreConfig_InitIsolatedConfig()``:

* ``allocator`` = ``PYMEM_ALLOCATOR_NOT_SET``
* ``coerce_c_locale_warn`` = 0
* ``coerce_c_locale`` = 0
* ``configure_locale`` = 0
* ``dev_mode`` = 0
* ``isolated`` = 1
* ``legacy_windows_fs_encoding`` = 0
* ``use_environment`` = 0
* ``utf8_mode`` = 0

``PyConfig_InitIsolatedConfig()``:

* ``argv`` = []
* ``base_exec_prefix`` = ``NULL``
* ``base_prefix`` = ``NULL``
* ``buffered_stdio`` = 1
* ``bytes_warning`` = 0
* ``check_hash_pycs_mode`` = ``NULL``
* ``configure_c_stdio`` = 0
* ``dev_mode`` = 0
* ``dump_refs`` = 0
* ``exec_prefix`` = ``NULL``
* ``executable`` = ``NULL``
* ``faulthandler`` = 0
* ``filesystem_encoding`` = ``NULL``
* ``filesystem_errors`` = ``NULL``
* ``hash_seed`` = 0
* ``home`` = ``NULL``
* ``import_time`` = 0
* ``inspect`` = 0
* ``install_signal_handlers`` = 0
* ``interactive`` = 0
* ``isolated`` = 1
* ``malloc_stats`` = 0
* ``pythonpath_env`` = ``NULL``
* ``module_search_paths`` = []
* ``optimization_level`` = 0
* ``parse_argv`` = 0
* ``parser_debug`` = 0
* ``pathconfig_warnings`` = 0
* ``prefix`` = ``NULL``
* ``program_name`` = ``NULL``
* ``pycache_prefix`` = ``NULL``
* ``quiet`` = 0
* ``run_command`` = ``NULL``
* ``run_filename`` = ``NULL``
* ``run_module`` = ``NULL``
* ``show_alloc_count`` = 0
* ``show_ref_count`` = 0
* ``site_import`` = 1
* ``skip_source_first_line`` = 0
* ``stdio_encoding`` = ``NULL``
* ``stdio_errors`` = ``NULL``
* ``tracemalloc`` = 0
* ``use_environment`` = 0
* ``use_hash_seed`` = 0
* ``user_site_directory`` = 0
* ``verbose`` = 0
* ``warnoptions`` = []
* ``write_bytecode`` = 1
* ``xoptions`` = []
* ``_init_main`` = 1
* ``_install_importlib`` = 1

Python 3.7 API
--------------

Python 3.7 has 4 functions in its C API to initialize and finalize Python:

* ``Py_Initialize()``, ``Py_InitializeEx()``: initialize Python
* ``Py_Finalize()``, ``Py_FinalizeEx()``: finalize Python

Python 3.7 can be configured using `Global Configuration Variables`_,
`Environment Variables`_, and the following functions:

* ``PyImport_AppendInittab()``
* ``PyImport_ExtendInittab()``
* ``PyMem_SetAllocator()``
* ``PyMem_SetupDebugHooks()``
* ``PyObject_SetArenaAllocator()``
* ``Py_SetPath()``
* ``Py_SetProgramName()``
* ``Py_SetPythonHome()``
* ``Py_SetStandardStreamEncoding()``
* ``PySys_AddWarnOption()``
* ``PySys_AddXOption()``
* ``PySys_ResetWarnOptions()``

There is also a high-level ``Py_Main()`` function and
``PyImport_FrozenModules`` variable which can be overridden.

See `Initialization, Finalization, and Threads `_ documentation.

Python Issues
=============

Issues that will be fixed by this PEP, directly or indirectly:

* `bpo-1195571 `_: "simple callback system for Py_FatalError"
* `bpo-11320 `_: "Usage of API method Py_SetPath causes errors in
  Py_Initialize() (Posix ony)"
* `bpo-13533 `_: "Would like Py_Initialize to play friendly with host app"
* `bpo-14956 `_: "custom PYTHONPATH may break apps embedding Python"
* `bpo-19983 `_: "When interrupted during startup, Python should not call
  abort() but exit()"
* `bpo-22213 `_: "Make pyvenv style virtual environments easier to configure
  when embedding Python". This PEP more or
* `bpo-22257 `_: "PEP 432: Redesign the interpreter startup sequence"
* `bpo-29778 `_: "_Py_CheckPython3 uses uninitialized dllpath when embedder
  sets module path with Py_SetPath"
* `bpo-30560 `_: "Add Py_SetFatalErrorAbortFunc: Allow embedding program to
  handle fatal errors".
* `bpo-31745 `_: "Overloading "Py_GetPath" does not work"
* `bpo-32573 `_: "All sys attributes (.argv, ...) should exist in embedded
  environments".
* `bpo-34725 `_: "Py_GetProgramFullPath() odd behaviour in Windows"
* `bpo-36204 `_: "Deprecate calling Py_Main() after Py_Initialize()? Add
  Py_InitializeFromArgv()?"
* `bpo-33135 `_: "Define field prefixes for the various config structs".
  The PEP now clearly defines how warning options are handled.

Issues of the PEP implementation:

* `bpo-16961 `_: "No regression tests for -E and individual environment vars"
* `bpo-20361 `_: "-W command line options and PYTHONWARNINGS environmental
  variable should not override -b / -bb command line options"
* `bpo-26122 `_: "Isolated mode doesn't ignore PYTHONHASHSEED"
* `bpo-29818 `_: "Py_SetStandardStreamEncoding leads to a memory error in
  debug mode"
* `bpo-31845 `_: "PYTHONDONTWRITEBYTECODE and PYTHONOPTIMIZE have no effect"
* `bpo-32030 `_: "PEP 432: Rewrite Py_Main()"
* `bpo-32124 `_: "Document functions safe to be called before Py_Initialize()"
* `bpo-33042 `_: "New 3.7 startup sequence crashes PyInstaller"
* `bpo-33932 `_: "Calling Py_Initialize() twice now triggers a fatal error
  (Python 3.7)"
* `bpo-34008 `_: "Do we support calling Py_Main() after Py_Initialize()?"
* `bpo-34170 `_: "Py_Initialize(): computing path configuration must not have
  side effect (PEP 432)"
* `bpo-34589 `_: "Py_Initialize() and Py_Main() should not enable C locale
  coercion"
* `bpo-34639 `_: "PYTHONCOERCECLOCALE is ignored when using -E or -I option"
* `bpo-36142 `_: "Add a new _PyPreConfig step to Python initialization to
  setup memory allocator and encodings"
* `bpo-36202 `_: "Calling Py_DecodeLocale() before _PyPreConfig_Write() can
  produce mojibake"
* `bpo-36301 `_: "Add _Py_PreInitialize() function"
* `bpo-36443 `_: "Disable coerce_c_locale and utf8_mode by default in
  _PyPreConfig?"
* `bpo-36444 `_: "Python initialization: remove _PyMainInterpreterConfig"
* `bpo-36471 `_: "PEP 432, PEP 587: Add _Py_RunMain()"
* `bpo-36763 `_: "PEP 587: Rework initialization API to prepare second
  version of the PEP"
* `bpo-36775 `_: "Rework filesystem codec implementation"
* `bpo-36900 `_: "Use _PyCoreConfig rather than global configuration variables"

Issues related to this PEP:

* `bpo-12598 `_: "Move sys variable initialization from import.c to
  sysmodule.c"
* `bpo-15577 `_: "Real argc and argv in embedded interpreter"
* `bpo-16202 `_: "sys.path[0] security issues"
* `bpo-18309 `_: "Make python slightly more relocatable"
* `bpo-25631 `_: "Segmentation fault with invalid Unicode command-line
  arguments in embedded Python"
* `bpo-26007 `_: "Support embedding the standard library in an executable"
* `bpo-31210 `_: "Can not import modules if sys.prefix contains DELIM".
* `bpo-31349 `_: "Embedded initialization ignores Py_SetProgramName()"
* `bpo-33919 `_: "Expose _PyCoreConfig structure to Python"
* `bpo-35173 `_: "Re-use already existing functionality to allow Python 2.7.x
  (both embedded and standalone) to locate the module path according to the
  shared library"

Version History
===============

* Version 4:

  * Introduce "Python Configuration" and "Isolated Configuration" which are
    better defined. Replace all macros with functions.
  * Replace ``PyPreConfig_INIT`` and ``PyConfig_INIT`` macros with functions:

    * ``PyPreConfig_InitIsolatedConfig()``, ``PyConfig_InitIsolatedConfig()``
    * ``PyPreConfig_InitPythonConfig()``, ``PyConfig_InitPythonConfig()``

  * ``PyPreConfig`` no longer uses dynamic memory, the ``allocator`` field
    type becomes an int, add ``configure_locale`` and ``parse_argv`` fields.
  * ``PyConfig``: rename ``module_search_path_env`` to ``pythonpath_env``,
    rename ``use_module_search_paths`` to ``module_search_paths_set``,
    remove ``program`` and ``dll_path``.
  * Replace ``Py_INIT_xxx()`` macros with ``PyInitError_xxx()`` functions.
  * Remove the "Constant PyConfig" section. Remove
    ``Py_InitializeFromArgs()`` and ``Py_InitializeFromBytesArgs()``
    functions.

* Version 3:

  * ``PyConfig``: Add ``configure_c_stdio`` and ``parse_argv``;
    rename ``_frozen`` to ``pathconfig_warnings``.
  * Rename functions using bytes strings and wide character strings. For
    example, ``Py_PreInitializeFromWideArgs()`` becomes
    ``Py_PreInitializeFromArgs()``, and ``PyConfig_SetArgv()`` becomes
    ``PyConfig_SetBytesArgv()``.
  * Add ``PyWideStringList_Insert()`` function.
  * New "Path configuration", "Isolate Python", "Python Issues"
    and "Version History" sections.
  * ``PyConfig_SetString()`` and ``PyConfig_SetBytesString()`` now
    require the configuration as the first argument.
  * Rename ``Py_UnixMain()`` to ``Py_BytesMain()``.

* Version 2: Add ``PyConfig`` methods (ex: ``PyConfig_Read()``), add
  ``PyWideStringList_Append()``, rename ``PyWideCharList`` to
  ``PyWideStringList``.
* Version 1: Initial version.

Copyright
=========

This document has been placed in the public domain.