PEP: 587 Title: Python Initialization Configuration Author: Victor Stinner Discussions-To: python-dev@python.org Status: Draft Type: Standards Track Content-Type: text/x-rst Created: 27-Mar-2019 Python-Version: 3.8 Abstract ======== Add a new C API to configure the Python Initialization providing finer control on the whole configuration and better error reporting. Rationale ========= Python is highly configurable but its configuration evolved organically: configuration parameters is scattered all around the code using different ways to set them (mostly global configuration variables and functions). A straightforward and reliable way to configure Python is needed. Some configuration parameters are not accessible from the C API, or not easily. The C API of Python 3.7 Initialization takes ``wchar_t*`` strings as input whereas the Python filesystem encoding is set during the initialization. Python Initialization C API =========================== This PEP proposes to add the following new structures, functions and macros. New structures (4): * ``PyConfig`` * ``PyInitError`` * ``PyPreConfig`` * ``PyWideCharList`` New functions (8): * ``Py_PreInitialize(config)`` * ``Py_PreInitializeFromArgs(config, argc, argv)`` * ``Py_PreInitializeFromWideArgs(config, argc, argv)`` * ``Py_InitializeFromConfig(config)`` * ``Py_InitializeFromArgs(config, argc, argv)`` * ``Py_InitializeFromWideArgs(config, argc, argv)`` * ``Py_UnixMain(argc, argv)`` * ``Py_ExitInitError(err)`` New macros (6): * ``Py_INIT_OK()`` * ``Py_INIT_ERR(MSG)`` * ``Py_INIT_USER_ERR(MSG)`` * ``Py_INIT_NO_MEMORY()`` * ``Py_INIT_EXIT(EXITCODE)`` * ``Py_INIT_FAILED(err)`` PyWideCharList -------------- ``PyWideCharList`` is a list of ``wchar_t*`` strings. Example to initialize a string from C static array:: static wchar_t* argv[2] = { L"-c", L"pass", }; PyWideCharList config_argv = PyWideCharList_INIT; config_argv.length = Py_ARRAY_LENGTH(argv); config_argv.items = argv; ``PyWideCharList`` structure fields: * ``length`` (``Py_ssize_t``) * ``items`` (``wchar_t**``) If *length* is non-zero, *items* must be non-NULL and all strings must be non-NULL. .. note:: The "WideChar" name comes from the existing Python C API. Example: ``PyUnicode_FromWideChar(const wchar_t *str, Py_ssize_t size)``. PyInitError ----------- ``PyInitError`` is a structure to store an error message or an exit code for the Python Initialization. Example:: PyInitError alloc(void **ptr, size_t size) { *ptr = PyMem_RawMalloc(size); if (*ptr == NULL) { return Py_INIT_NO_MEMORY(); } return Py_INIT_OK(); } int main(int argc, char **argv) { void *ptr; PyInitError err = alloc(&ptr, 16); if (Py_INIT_FAILED(err)) { Py_ExitInitError(err); } PyMem_Free(ptr); return 0; } ``PyInitError`` fields: * ``exitcode`` (``int``): if greater or equal to zero, argument passed to ``exit()`` * ``msg`` (``const char*``): error message * ``prefix`` (``const char*``): string displayed before the message, usually rendered as ``prefix: msg``. * ``user_err`` (int): if non-zero, the error is caused by the user configuration, otherwise it's an internal Python error. Macro to create an error: * ``Py_INIT_OK()``: success * ``Py_INIT_NO_MEMORY()``: memory allocation failure (out of memory) * ``Py_INIT_ERR(MSG)``: Python internal error * ``Py_INIT_USER_ERR(MSG)``: error caused by user configuration * ``Py_INIT_EXIT(STATUS)``: exit Python with the specified status Other macros and functions: * ``Py_INIT_FAILED(err)``: Is the result an error or an exit? * ``Py_ExitInitError(err)``: call ``exit(status)`` for an error created by ``Py_INIT_EXIT(status)``, call ``Py_FatalError(msg)`` for other errors. Must not be called for an error created by ``Py_INIT_OK()``. Pre-Initialization with PyPreConfig ----------------------------------- ``PyPreConfig`` structure is used to pre-initialize Python: * Set the memory allocator * Configure the LC_CTYPE locale * Set the UTF-8 mode Example using the pre-initialization to enable the UTF-8 Mode:: PyPreConfig preconfig = PyPreConfig_INIT; preconfig.utf8_mode = 1; PyInitError err = Py_PreInitialize(&preconfig); if (Py_INIT_FAILED(err)) { Py_ExitInitError(err); } /* at this point, Python will speak UTF-8 */ Py_Initialize(); /* ... use Python API here ... */ Py_Finalize(); Functions to pre-initialize Python: * ``PyInitError Py_PreInitialize(const PyPreConfig *config)`` * ``PyInitError Py_PreInitializeFromArgs( const PyPreConfig *config, int argc, char **argv)`` * ``PyInitError Py_PreInitializeFromWideArgs( const PyPreConfig *config, int argc, wchar_t **argv)`` These functions can be called with *config* set to ``NULL``. The caller is responsible to handler error using ``Py_INIT_FAILED()`` and ``Py_ExitInitError()``. ``PyPreConfig`` fields: * ``allocator``: name of the memory allocator (ex: ``"malloc"``) * ``coerce_c_locale_warn``: if non-zero, emit a warning if the C locale is coerced. * ``coerce_c_locale``: if equals to 2, coerce the C locale; if equals to 1, read the LC_CTYPE to decide if it should be coerced. * ``dev_mode``: see ``PyConfig.dev_mode`` * ``isolated``: see ``PyConfig.isolated`` * ``legacy_windows_fs_encoding`` (Windows only): if non-zero, set the Python filesystem encoding to ``"mbcs"``. * ``use_environment``: see ``PyConfig.use_environment`` * ``utf8_mode``: if non-zero, enable the UTF-8 mode The C locale coercion (PEP 538) and the UTF-8 Mode (PEP 540) are disabled by default in ``PyPreConfig``. Set ``coerce_c_locale``, ``coerce_c_locale_warn`` and ``utf8_mode`` to ``-1`` to let Python enable them depending on the user configuration. Initialization with PyConfig ---------------------------- The ``PyConfig`` structure contains all parameters to configure Python. Example of Python initialization enabling the isolated mode:: PyConfig config = PyConfig_INIT; config.isolated = 1; PyInitError err = Py_InitializeFromConfig(&config); if (Py_INIT_FAILED(err)) { Py_ExitInitError(err); } /* ... use Python API here ... */ Py_Finalize(); Functions to initialize Python: * ``PyInitError Py_InitializeFromConfig(const PyConfig *config)`` * ``PyInitError Py_InitializeFromArgs(const PyConfig *config, int argc, char **argv)`` * ``PyInitError Py_InitializeFromWideArgs(const PyConfig *config, int argc, wchar_t **argv)`` These functions can be called with *config* set to ``NULL``. The caller is responsible to handler error using ``Py_INIT_FAILED()`` and ``Py_ExitInitError()``. PyConfig fields: * ``argv``: ``sys.argv`` * ``base_exec_prefix``: ``sys.base_exec_prefix`` * ``base_prefix``: ``sys.base_prefix`` * ``buffered_stdio``: if equals to 0, enable unbuffered mode, make stdout and stderr streams to be unbuffered. * ``bytes_warning``: if equals to 1, issue a warning when comparing ``bytes`` or ``bytearray`` with ``str``, or comparing ``bytes`` with ``int``. If equal or greater to 2, raise a ``BytesWarning`` exception. * ``dll_path`` (Windows only): Windows DLL path * ``dump_refs``: if non-zero, display all objects still alive at exit * ``exec_prefix``: ``sys.exec_prefix`` * ``executable``: ``sys.executable`` * ``faulthandler``: if non-zero, call ``faulthandler.enable()`` * ``filesystem_encoding``: Filesystem encoding, ``sys.getfilesystemencoding()`` * ``filesystem_errors``: Filesystem encoding errors, ``sys.getfilesystemencodeerrors()`` * ``hash_seed``, ``use_hash_seed``: randomized hash function seed * ``home``: Python home * ``import_time``: if non-zero, profile import time * ``inspect``: enter interactive mode after executing a script or a command * ``install_signal_handlers``: install signal handlers? * ``interactive``: interactive mode * ``legacy_windows_stdio`` (Windows only): if non-zero, use ``io.FileIO`` instead of ``WindowsConsoleIO`` for ``sys.stdin``, ``sys.stdout`` and ``sys.stderr``. * ``malloc_stats``: if non-zero, dump memory allocation statistics at exit * ``module_search_path_env``: ``PYTHONPATH`` environment variale value * ``module_search_paths``, ``use_module_search_paths``: ``sys.path`` * ``optimization_level``: compilation optimization level * ``parser_debug``: if non-zero, turn on parser debugging output (for expert only, depending on compilation options). * ``prefix``: ``sys.prefix`` * ``program_name``: Program name * ``program``: ``argv[0]`` or an empty string * ``pycache_prefix``: ``.pyc`` cache prefix * ``quiet``: quiet mode (ex: don't display the copyright and version messages even in interactive mode) * ``run_command``: ``-c COMMAND`` argument * ``run_filename``: ``python3 SCRIPT`` argument * ``run_module``: ``python3 -m MODULE`` argument * ``show_alloc_count``: show allocation counts at exit * ``show_ref_count``: show total reference count at exit * ``site_import``: import the ``site`` module at startup? * ``skip_source_first_line``: skip the first line of the source * ``stdio_encoding``, ``stdio_errors``: encoding and encoding errors of ``sys.stdin``, ``sys.stdout`` and ``sys.stderr`` * ``tracemalloc``: if non-zero, call ``tracemalloc.start(value)`` * ``user_site_directory``: if non-zero, add user site directory to ``sys.path`` * ``verbose``: if non-zero, enable verbose mode * ``warnoptions``: options of the ``warnings`` module to build filters * ``write_bytecode``: if non-zero, write ``.pyc`` files * ``xoptions``: ``sys._xoptions`` There are also private fields which are for internal-usage only: * ``_check_hash_pycs_mode`` * ``_frozen`` * ``_init_main`` * ``_install_importlib`` New Py_UnixMain() function -------------------------- Python 3.7 provides a high-level ``Py_Main()`` function which requires to pass command line arguments as ``wchar_t*`` strings. It is non-trivial to use the correct encoding to decode bytes. Python has its own set of issues with C locale coercion and UTF-8 Mode. This PEP adds a new ``Py_UnixMain()`` function which takes command line arguments as bytes:: int Py_UnixMain(int argc, char **argv) Memory allocations and Py_DecodeLocale() ---------------------------------------- New pre-initialization and initialization APIs use constant ``PyPreConfig`` or ``PyConfig`` structures. If memory is allocated dynamically, the caller is responsible to release it. Using static strings is just fine. Python memory allocation functions like ``PyMem_RawMalloc()`` must not be used before Python pre-initialization. Using ``malloc()`` and ``free()`` is always safe. ``Py_DecodeLocale()`` must only be used after the pre-initialization. XXX Open Questions ================== This PEP is still a draft with open questions which should be answered: * Do we need to add an API for import ``inittab``? * What about the stable ABI? Should we add a version into ``PyPreConfig`` and ``PyConfig`` structures somehow? The Windows API is known for its ABI stability and it stores the structure size into the structure directly. Do the same? * The PEP 432 stores ``PYTHONCASEOK`` into the config. Do we need to add something for that into ``PyConfig``? How would it be exposed at the Python level for ``importlib``? Passed as an argument to ``importlib._bootstrap._setup()`` maybe? Backwards Compatibility ======================= This PEP only adds a new API: it leaves the existing API unchanged and has no impact on the backwards compatibility. Alternative: PEP 432 ==================== This PEP is inspired by Nick Coghlan's PEP 432 with a main difference: it only allows to configure Python before its initialization. The PEP 432 uses three initialization phases: Pre-Initialization, Initializing, Initialized. It is possible to configure Python between Initializing and Initialized phases using Python objects. This PEP only uses C types like ``int`` and ``wchar_t*`` (and ``PyWideCharList`` structure). All parameters must be configured at once before the Python initialization using the ``PyConfig`` structure. Annex: Python Configuration =========================== Priority and Rules ------------------ Priority of configuration parameters, highest to lowest: * ``PyConfig`` * ``PyPreConfig`` * Configuration files * Command line options * Environment variables * Global configuration variables Priority of warning options, highest to lowest: * ``PyConfig.warnoptions`` * ``PyConfig.dev_mode`` (add ``"default"``) * ``PYTHONWARNINGS`` environment variables * ``-W WARNOPTION`` command line argument * ``PyConfig.bytes_warning`` (add ``"error::BytesWarning"`` if greater than 1, or add ``"default::BytesWarning``) Rules on ``PyConfig`` and ``PyPreConfig`` parameters: * If ``isolated`` is non-zero, ``use_environment`` and ``user_site_directory`` are set to 0 * If ``legacy_windows_fs_encoding`` is non-zero, ``utf8_mode`` is set to 0 * If ``dev_mode`` is non-zero, ``allocator`` is set to ``"debug"``, ``faulthandler`` is set to 1, and ``"default"`` filter is added to ``warnoptions``. But ``PYTHONMALLOC`` has the priority over ``dev_mode`` to set the memory allocator. Configuration Files ------------------- Python configuration files: * ``pyvenv.cfg`` * ``python._pth`` (Windows only) * ``pybuilddir.txt`` (Unix only) Global Configuration Variables ------------------------------ Global configuration variables mapped to ``PyPreConfig`` fields: ======================================== ================================ Variable Field ======================================== ================================ ``Py_LegacyWindowsFSEncodingFlag`` ``legacy_windows_fs_encoding`` ``Py_LegacyWindowsFSEncodingFlag`` ``legacy_windows_fs_encoding`` ``Py_UTF8Mode`` ``utf8_mode`` ``Py_UTF8Mode`` ``utf8_mode`` ======================================== ================================ Global configuration variables mapped to ``PyConfig`` fields: ======================================== ================================ Variable Field ======================================== ================================ ``Py_BytesWarningFlag`` ``bytes_warning`` ``Py_DebugFlag`` ``parser_debug`` ``Py_DontWriteBytecodeFlag`` ``write_bytecode`` ``Py_FileSystemDefaultEncodeErrors`` ``filesystem_errors`` ``Py_FileSystemDefaultEncoding`` ``filesystem_encoding`` ``Py_FrozenFlag`` ``_frozen`` ``Py_HasFileSystemDefaultEncoding`` ``filesystem_encoding`` ``Py_HashRandomizationFlag`` ``use_hash_seed``, ``hash_seed`` ``Py_IgnoreEnvironmentFlag`` ``use_environment`` ``Py_InspectFlag`` ``inspect`` ``Py_InteractiveFlag`` ``interactive`` ``Py_IsolatedFlag`` ``isolated`` ``Py_LegacyWindowsStdioFlag`` ``legacy_windows_stdio`` ``Py_NoSiteFlag`` ``site_import`` ``Py_NoUserSiteDirectory`` ``user_site_directory`` ``Py_OptimizeFlag`` ``optimization_level`` ``Py_QuietFlag`` ``quiet`` ``Py_UnbufferedStdioFlag`` ``buffered_stdio`` ``Py_VerboseFlag`` ``verbose`` ``_Py_HasFileSystemDefaultEncodeErrors`` ``filesystem_errors`` ``Py_BytesWarningFlag`` ``bytes_warning`` ``Py_DebugFlag`` ``parser_debug`` ``Py_DontWriteBytecodeFlag`` ``write_bytecode`` ``Py_FileSystemDefaultEncodeErrors`` ``filesystem_errors`` ``Py_FileSystemDefaultEncoding`` ``filesystem_encoding`` ``Py_FrozenFlag`` ``_frozen`` ``Py_HasFileSystemDefaultEncoding`` ``filesystem_encoding`` ``Py_HashRandomizationFlag`` ``use_hash_seed``, ``hash_seed`` ``Py_IgnoreEnvironmentFlag`` ``use_environment`` ``Py_InspectFlag`` ``inspect`` ``Py_InteractiveFlag`` ``interactive`` ``Py_IsolatedFlag`` ``isolated`` ``Py_LegacyWindowsStdioFlag`` ``legacy_windows_stdio`` ``Py_NoSiteFlag`` ``site_import`` ``Py_NoUserSiteDirectory`` ``user_site_directory`` ``Py_OptimizeFlag`` ``optimization_level`` ``Py_QuietFlag`` ``quiet`` ``Py_UnbufferedStdioFlag`` ``buffered_stdio`` ``Py_VerboseFlag`` ``verbose`` ``_Py_HasFileSystemDefaultEncodeErrors`` ``filesystem_errors`` ======================================== ================================ ``Py_LegacyWindowsFSEncodingFlag`` and ``Py_LegacyWindowsStdioFlag`` are only available on Windows. Command Line Arguments ---------------------- Usage:: python3 [options] python3 [options] -c COMMAND python3 [options] -m MODULE python3 [options] SCRIPT Command line options mapped to pseudo-action on ``PyConfig`` fields: ================================ ================================ Option ``PyConfig`` field ================================ ================================ ``-b`` ``bytes_warning++`` ``-B`` ``write_bytecode = 0`` ``-c COMMAND`` ``run_module = COMMAND`` ``--check-hash-based-pycs=MODE`` ``_check_hash_pycs_mode = MODE`` ``-d`` ``parser_debug++`` ``-E`` ``use_environment = 0`` ``-i`` ``inspect++`` and ``interactive++`` ``-I`` ``isolated = 1`` ``-m MODULE`` ``run_module = MODULE`` ``-O`` ``optimization_level++`` ``-q`` ``quiet++`` ``-R`` ``use_hash_seed = 0`` ``-s`` ``user_site_directory = 0`` ``-S`` ``site_import`` ``-t`` ignored (kept for backwards compatibility) ``-u`` ``buffered_stdio = 0`` ``-v`` ``verbose++`` ``-W WARNING`` add ``WARNING`` to ``warnoptions`` ``-x`` ``skip_source_first_line = 1`` ``-X XOPTION`` add ``XOPTION`` to ``xoptions`` ================================ ================================ ``-h``, ``-?`` and ``-V`` options are handled outside ``PyConfig``. Environment Variables --------------------- Environment variables mapped to ``PyPreConfig`` fields: ================================= ============================================= Variable ``PyPreConfig`` field ================================= ============================================= ``PYTHONCOERCECLOCALE`` ``coerce_c_locale``, ``coerce_c_locale_warn`` ``PYTHONDEVMODE`` ``dev_mode`` ``PYTHONLEGACYWINDOWSFSENCODING`` ``legacy_windows_fs_encoding`` ``PYTHONMALLOC`` ``allocator`` ``PYTHONUTF8`` ``utf8_mode`` ================================= ============================================= Environment variables mapped to ``PyConfig`` fields: ================================= ==================================== Variable ``PyConfig`` field ================================= ==================================== ``PYTHONDEBUG`` ``parser_debug`` ``PYTHONDEVMODE`` ``dev_mode`` ``PYTHONDONTWRITEBYTECODE`` ``write_bytecode`` ``PYTHONDUMPREFS`` ``dump_refs`` ``PYTHONEXECUTABLE`` ``program_name`` ``PYTHONFAULTHANDLER`` ``faulthandler`` ``PYTHONHASHSEED`` ``use_hash_seed``, ``hash_seed`` ``PYTHONHOME`` ``home`` ``PYTHONINSPECT`` ``inspect`` ``PYTHONIOENCODING`` ``stdio_encoding``, ``stdio_errors`` ``PYTHONLEGACYWINDOWSSTDIO`` ``legacy_windows_stdio`` ``PYTHONMALLOCSTATS`` ``malloc_stats`` ``PYTHONNOUSERSITE`` ``user_site_directory`` ``PYTHONOPTIMIZE`` ``optimization_level`` ``PYTHONPATH`` ``module_search_path_env`` ``PYTHONPROFILEIMPORTTIME`` ``import_time`` ``PYTHONPYCACHEPREFIX,`` ``pycache_prefix`` ``PYTHONTRACEMALLOC`` ``tracemalloc`` ``PYTHONUNBUFFERED`` ``buffered_stdio`` ``PYTHONVERBOSE`` ``verbose`` ``PYTHONWARNINGS`` ``warnoptions`` ================================= ==================================== ``PYTHONLEGACYWINDOWSFSENCODING`` and ``PYTHONLEGACYWINDOWSSTDIO`` are specific to Windows. ``PYTHONDEVMODE`` is mapped to ``PyPreConfig.dev_mode`` and ``PyConfig.dev_mode``. Annex: Python 3.7 API ===================== Python 3.7 has 4 functions in its C API to initialize and finalize Python: * ``Py_Initialize()``, ``Py_InitializeEx()``: initialize Python * ``Py_Finalize()``, ``Py_FinalizeEx()``: finalize Python Python can be configured using scattered global configuration variables (like ``Py_IgnoreEnvironmentFlag``) and using the following functions: * ``PyImport_AppendInittab()`` * ``PyImport_ExtendInittab()`` * ``PyMem_SetAllocator()`` * ``PyMem_SetupDebugHooks()`` * ``PyObject_SetArenaAllocator()`` * ``Py_SetPath()`` * ``Py_SetProgramName()`` * ``Py_SetPythonHome()`` * ``Py_SetStandardStreamEncoding()`` * ``PySys_AddWarnOption()`` * ``PySys_AddXOption()`` * ``PySys_ResetWarnOptions()`` There is also a high-level ``Py_Main()`` function. Copyright ========= This document has been placed in the public domain.