PEP 741: Python Configuration C API (#3625)

Victor Stinner 2024-01-19 17:43:45 +01:00 committed by GitHub
parent f31f53dcba
commit c3af863741
1 changed files with 524 additions and 0 deletions

peps/pep-0741.rst Normal file

@ -0,0 +1,524 @@
PEP: 741
Title: Python Configuration C API
Author: Victor Stinner <vstinner@python.org>
Status: Draft
Type: Standards Track
Created: 18-Jan-2024
Python-Version: 3.13
Abstract
========
Add a C API to the limited C API to configure the Python
preinitialization and initialization, and to get the current
configuration. It can be used with the stable ABI.
Add ``sys.get_config(name)`` function to get the current value of a
configuration option.
Allow setting custom configuration options, not used by Python but by
third-party code. Options are referred to by their name as a string.
:pep:`587` "Python Initialization Configuration" unified all the ways to
configure the Python **initialization**. This PEP also unifies the
configuration of the Python **preinitialization** and the Python
**initialization** in a single API.
Rationale
=========
PyConfig is not part of the limited C API
-----------------------------------------
When the first versions of :pep:`587` "Python Initialization Configuration"
were discussed, there was a private field ``_config_version`` (``int``):
the configuration version, used for ABI compatibility. It was decided
that if an application embeds Python, it sticks to a Python version
anyway, and so there is no need to bother with the ABI compatibility.
The final PyConfig API of :pep:`587` is excluded from the limited C API
since its main ``PyConfig`` structure is not versioned. Python cannot
guarantee ABI backward and forward compatibility for it, so the structure
is incompatible with the stable ABI.
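The ABI concern can be illustrated with a minimal sketch. The two
structures below are hypothetical stand-ins and do not reproduce the real
``PyConfig`` layout. They only show what can happen when code compiled
against an older, unversioned layout is handed a structure filled in by a
newer library.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical layout seen by an application built against an older
   Python. */
typedef struct {
    int isolated;
    int verbose;
} config_v1;

/* Hypothetical layout after a newer Python inserted a member. Nothing in
   the structure records which layout the caller was compiled against. */
typedef struct {
    int isolated;
    int int_max_str_digits;   /* new member */
    int verbose;
} config_v2;

/* Read "verbose" the way code compiled against the v1 layout would:
   copy sizeof(config_v1) bytes and use the v1 member offsets. */
int read_verbose_with_v1_layout(const void *config)
{
    config_v1 view;
    assert(config != NULL);
    memcpy(&view, config, sizeof(view));
    return view.verbose;
}
```

With these layouts, the old code silently reads ``int_max_str_digits``
where it expects ``verbose``. Setting options by name, as this PEP
proposes, avoids exposing member offsets to the caller.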
Since PyConfig was added to Python 3.8, the limited C API and the stable
ABI are getting more popular. For example, Rust bindings such as the
`PyO3 project <https://pyo3.rs/>`_ target the limited C API to embed
Python in Rust. In practice, PyO3 can use the non-limited C API for specific
needs, but doing so forfeits the advantages of the stable ABI.
Deprecated legacy API
---------------------
The legacy configuration API has been progressively deprecated:
* Set the initialization configuration such as ``Py_SetPath()``:
deprecated in Python 3.11.
* Global configuration variables such as ``Py_VerboseFlag``:
deprecated in Python 3.12.
* Get the current configuration such as ``Py_GetPath()``:
deprecated in Python 3.13.
Get the current configuration
-----------------------------
:pep:`587` has no API to **get** the **current** configuration, only to
**configure** the Python **initialization**.
For example, the global configuration variable
``Py_UnbufferedStdioFlag`` was deprecated in Python 3.12 and using
``PyConfig.buffered_stdio`` is recommended instead. However, that member
can only be used to configure Python: there is no public API to get the
current value of ``PyConfig.buffered_stdio``.
Users of the limited C API are asking for a public API to get the
current configuration.
Security fix
------------
To fix `CVE-2020-10735
<https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2020-10735>`_,
a denial-of-service when converting a very large string to an integer (in base
10), adding a new ``PyConfig`` member to stable branches was discussed, but
such a change affects the ABI.
Gregory P. Smith proposed a different API using a text-based configuration
file to not be limited by ``PyConfig`` members: `FR: Allow private
runtime config to enable extending without breaking the PyConfig ABI
<https://discuss.python.org/t/fr-allow-private-runtime-config-to-enable-extending-without-breaking-the-pyconfig-abi/18004>`__
(August 2022).
In the end, it was decided to not add a new ``PyConfig`` member to
stable branches, but only add a new ``PyConfig.int_max_str_digits``
member to the development branch (which became Python 3.12). A dedicated
private global variable (unrelated to ``PyConfig``) is used in stable
branches.
Redundancy between PyPreConfig and PyConfig
-------------------------------------------
The Python preinitialization uses the ``PyPreConfig`` structure and the
Python initialization uses the ``PyConfig`` structure. Both structures
have four duplicated members: ``dev_mode``, ``parse_argv``, ``isolated``
and ``use_environment``.
The redundancy is caused by the fact that the two structures are
separated, whereas some ``PyConfig`` members are needed by the
preinitialization.
Specification
=============
C API:
* ``PyInitConfig`` structure
* ``PyInitConfig_Python_New()``
* ``PyInitConfig_Isolated_New()``
* ``PyInitConfig_Free(config)``
* ``PyInitConfig_SetInt(config, name, value)``
* ``PyInitConfig_SetStr(config, name, value)``
* ``PyInitConfig_SetWStr(config, name, value)``
* ``PyInitConfig_SetStrList(config, name, length, items)``
* ``PyInitConfig_SetWStrList(config, name, length, items)``
* ``Py_PreInitializeFromInitConfig(config)``
* ``Py_InitializeFromInitConfig(config)``
* ``PyInitConfig_Exception(config)``
* ``PyInitConfig_GetError(config, &err_msg)``
* ``PyInitConfig_GetExitCode(config, &exitcode)``
* ``Py_ExitWithInitConfig(config)``
* ``PyConfig_Get(name)``
* ``PyConfig_GetInt(name, &value)``
Python API:
* ``sys.get_config(name)``
The C API uses null-terminated UTF-8 encoded strings to refer to a
configuration option.
All C API functions are added to the limited C API version 3.13.
The ``PyInitConfig`` structure is implemented by combining the three
structures of the ``PyConfig`` API:
* ``PyPreConfig preconfig``
* ``PyConfig config``
* ``PyStatus status``
The ``PyStatus`` status is no longer separated, but part of the unified
``PyInitConfig`` structure, which makes the API easier to use.
Configuration Options
---------------------
Configuration options are named after ``PyPreConfig`` and
``PyConfig`` structure members such as ``"verbose"``
(``PyConfig.verbose``), ``"buffered_stdio"``
(``PyConfig.buffered_stdio``), or ``"allocator"``
(``PyPreConfig.allocator``).
The type of configuration options depends on the option. For example,
the ``"verbose"`` option type is an integer, whereas
``"module_search_paths"`` option type is an array of wide strings.
See the `PyConfig documentation
<https://docs.python.org/dev/c-api/init_config.html#pyconfig>`_
and the `PyPreConfig documentation
<https://docs.python.org/dev/c-api/init_config.html#pypreconfig>`_.
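As a rough illustration of how a name can map to an option type, here is a
minimal sketch of a lookup table. The ``option_type`` enum, the
``option_spec`` structure and ``option_type_by_name()`` are hypothetical
names invented for this example. They are not part of the proposed API,
and the table lists only a few options mentioned in this PEP.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical type tags for configuration options. */
typedef enum {
    OPT_UNKNOWN = 0,
    OPT_INT,
    OPT_WSTR,
    OPT_WSTR_LIST
} option_type;

typedef struct {
    const char *name;
    option_type type;
} option_spec;

/* A few options named in this PEP; a real table would cover every
   PyPreConfig and PyConfig member. */
static const option_spec OPTIONS[] = {
    {"verbose", OPT_INT},
    {"buffered_stdio", OPT_INT},
    {"program_name", OPT_WSTR},
    {"module_search_paths", OPT_WSTR_LIST},
};

/* Return the type of the named option, or OPT_UNKNOWN if the name is not
   in the table. */
option_type option_type_by_name(const char *name)
{
    assert(name != NULL);
    for (size_t i = 0; i < sizeof(OPTIONS) / sizeof(OPTIONS[0]); i++) {
        if (strcmp(OPTIONS[i].name, name) == 0) {
            return OPTIONS[i].type;
        }
    }
    return OPT_UNKNOWN;
}
```

A setter such as ``PyInitConfig_SetInt()`` could use a lookup of this kind
to reject a name whose option type is not an integer.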
Configure the Python initialization
-----------------------------------
``PyInitConfig`` structure:
Opaque structure to configure the Python preinitialization and the
Python initialization.
``PyInitConfig* PyInitConfig_Python_New(void)``:
Create a new initialization configuration using default values
of the `Python Configuration
<https://docs.python.org/dev/c-api/init_config.html#python-configuration>`_.
It must be freed with ``PyInitConfig_Free()``.
Return ``NULL`` on memory allocation failure.
``PyInitConfig* PyInitConfig_Isolated_New(void)``:
Similar to ``PyInitConfig_Python_New()``, but use default values
of the `Isolated Configuration
<https://docs.python.org/dev/c-api/init_config.html#isolated-configuration>`_.
``void PyInitConfig_Free(PyInitConfig *config)``:
Free memory of an initialization configuration.
``int PyInitConfig_SetInt(PyInitConfig *config, const char *name, int64_t value)``:
Set an integer configuration option.
* Return ``0`` on success.
* Set an error in *config* and return ``-1`` on error.
``int PyInitConfig_SetStr(PyInitConfig *config, const char *name, const char *value)``:
Set a string configuration option from a null-terminated bytes
string.
The bytes string is decoded by ``Py_DecodeLocale()``. If Python is
not yet preinitialized, this function preinitializes it to ensure
that encodings are properly configured.
* Return ``0`` on success.
* Set an error in *config* and return ``-1`` on error.
``int PyInitConfig_SetWStr(PyInitConfig *config, const char *name, const wchar_t *value)``:
Set a string configuration option from a null-terminated wide
string.
If Python is not yet preinitialized, this function preinitializes
it.
* Return ``0`` on success.
* Set an error in *config* and return ``-1`` on error.
``int PyInitConfig_SetStrList(PyInitConfig *config, const char *name, size_t length, char * const *items)``:
Set a string list configuration option from an array of
null-terminated bytes strings.
The bytes strings are decoded by :c:func:`Py_DecodeLocale`. If Python
is not yet preinitialized, this function preinitializes it to ensure
that encodings are properly configured.
* Return ``0`` on success.
* Set an error in *config* and return ``-1`` on error.
``int PyInitConfig_SetWStrList(PyInitConfig *config, const char *name, size_t length, wchar_t * const *items)``:
Set a string list configuration option from an array of
null-terminated wide strings.
If Python is not yet preinitialized, this function preinitializes
it.
* Return ``0`` on success.
* Set an error in *config* and return ``-1`` on error.
``int Py_PreInitializeFromInitConfig(PyInitConfig *config)``:
Preinitialize Python from the initialization configuration.
* Return ``0`` on success.
* Set an error in *config* and return ``-1`` on error.
``int Py_InitializeFromInitConfig(PyInitConfig *config)``:
Initialize Python from the initialization configuration.
* Return ``0`` on success.
* Set an error in *config* and return ``-1`` on error.
* Set an exit code in *config* and return ``-1`` if Python wants to exit.
Error handling
--------------
``int PyInitConfig_Exception(PyInitConfig* config)``:
Check if an exception is set in *config*:
* Return non-zero if an error was set or if an exit code was set.
* Return zero otherwise.
``int PyInitConfig_GetError(PyInitConfig* config, const char **err_msg)``:
Get the *config* error message.
* Set *\*err_msg* and return ``1`` if an error is set.
* Set *\*err_msg* to ``NULL`` and return ``0`` otherwise.
An error message is a UTF-8 encoded string.
The error message remains valid until a ``PyInitConfig`` function is
called with *config*. The caller doesn't have to free the error
message.
``int PyInitConfig_GetExitCode(PyInitConfig* config, int *exitcode)``:
Get the *config* exit code.
* Set *\*exitcode* and return ``1`` if an exit code is set.
* Return ``0`` otherwise.
``void Py_ExitWithInitConfig(PyInitConfig *config)``:
Exit Python and free memory of an initialization configuration.
If an error message is set, display the error message.
If an exit code is set, use it to exit the process.
The function does not return.
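The return-value conventions of the error functions above can be modelled
with a small, self-contained sketch. ``toy_config`` and the ``toy_*``
functions are hypothetical stand-ins written for this illustration. They
only mirror the documented return values and are not the CPython
implementation.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for the opaque PyInitConfig; only the status part is
   modelled here. */
typedef struct {
    const char *err_msg;   /* NULL when no error is set */
    int has_exitcode;      /* non-zero when an exit code is set */
    int exitcode;
} toy_config;

/* Mirrors PyInitConfig_Exception(): non-zero if an error or an exit code
   is set, zero otherwise. */
int toy_exception(const toy_config *config)
{
    assert(config != NULL);
    return config->err_msg != NULL || config->has_exitcode;
}

/* Mirrors PyInitConfig_GetError(): set *err_msg and return 1 if an error
   is set, otherwise set *err_msg to NULL and return 0. */
int toy_get_error(const toy_config *config, const char **err_msg)
{
    assert(config != NULL && err_msg != NULL);
    *err_msg = config->err_msg;
    return config->err_msg != NULL;
}

/* Mirrors PyInitConfig_GetExitCode(): set *exitcode and return 1 if an
   exit code is set, otherwise leave *exitcode untouched and return 0. */
int toy_get_exitcode(const toy_config *config, int *exitcode)
{
    assert(config != NULL && exitcode != NULL);
    if (!config->has_exitcode) {
        return 0;
    }
    *exitcode = config->exitcode;
    return 1;
}
```

A caller would typically check for an error message first, then for an
exit code, before deciding how to terminate.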
Get current configuration
-------------------------
``PyObject* PyConfig_Get(const char *name)``:
Get the current value of a configuration option as an object.
* Return a new reference on success.
* Set an exception and return ``NULL`` on error.
The object type depends on the option.
``int PyConfig_GetInt(const char *name, int *value)``:
Similar to ``PyConfig_Get()``, but get the value as an integer.
* Set ``*value`` and return ``0`` on success.
* Set an exception and return ``-1`` on error.
sys.get_config()
----------------
Add ``sys.get_config(name: str)`` function which calls
``PyConfig_Get()``:
* Return the configuration option value on success.
* Raise an exception on error.
Custom configuration options
----------------------------
It is possible to set custom configuration options, not used by Python
but only by third-party code, by calling:
``PyInitConfig_SetInt(config, "allow_custom_options", 1)``. In this
case, setting custom configuration options is accepted, rather than
failing with an "unknown option" error. By default, setting custom
configuration options is not allowed.
Custom configuration options are set with the ``PyInitConfig`` API, such
as ``PyInitConfig_SetInt()``, and can be retrieved later with the
``PyConfig_Get()`` API.
To avoid conflicts with future Python configuration options, it is
recommended to use a prefix separated by a colon. For example, an
application called ``myapp`` can use the ``"myapp:verbose"`` option name
instead of ``"verbose"`` name, to avoid conflict with the Python
``verbose`` option.
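The prefix convention can be sketched with two small helpers. Both
``make_custom_option_name()`` and ``has_namespace_prefix()`` are
hypothetical names used only for illustration. The PEP does not propose
such helpers, and an application is free to build its option names in any
other way.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Build "<prefix>:<option>" into buf.
   Return 0 on success, or -1 if the result does not fit in buf. */
int make_custom_option_name(char *buf, size_t size,
                            const char *prefix, const char *option)
{
    assert(buf != NULL && size > 0);
    int n = snprintf(buf, size, "%s:%s", prefix, option);
    return (n >= 0 && (size_t)n < size) ? 0 : -1;
}

/* Return 1 if name starts with a non-empty prefix followed by a colon,
   0 otherwise. */
int has_namespace_prefix(const char *name)
{
    assert(name != NULL);
    const char *colon = strchr(name, ':');
    return colon != NULL && colon != name;
}
```

The resulting name, such as ``"myapp:verbose"``, would then be passed
unchanged as the *name* argument of ``PyInitConfig_SetInt()`` and
``PyConfig_Get()``.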
Examples
========
Initialize Python
-----------------
Example setting some configuration options of different types to
initialize Python.
.. code-block:: c

    #include <Python.h>

    void init_python(void)
    {
        PyInitConfig *config = PyInitConfig_Python_New();
        if (config == NULL) {
            printf("Init allocation error\n");
            return;
        }

        // Set an integer (dev_mode)
        if (PyInitConfig_SetInt(config, "dev_mode", 1) < 0) {
            goto error;
        }

        // Set a list of wide strings (argv)
        wchar_t *argv[] = {L"my_program", L"-c", L"pass"};
        if (PyInitConfig_SetWStrList(config, "argv",
                                     Py_ARRAY_LENGTH(argv), argv) < 0) {
            goto error;
        }

        // Set a wide string (program_name)
        if (PyInitConfig_SetWStr(config, "program_name", L"my_program") < 0) {
            goto error;
        }

        // Set a list of bytes strings (xoptions).
        // Implicitly preinitializes Python to decode the bytes strings.
        char* xoptions[] = {"faulthandler"};
        if (PyInitConfig_SetStrList(config, "xoptions",
                                    Py_ARRAY_LENGTH(xoptions), xoptions) < 0) {
            goto error;
        }

        // Initialize Python with the configuration
        if (Py_InitializeFromInitConfig(config) < 0) {
            goto error;
        }
        PyInitConfig_Free(config);
        return;

    error:
        // Display the error message if one is set, free the
        // configuration and exit the process. Does not return.
        Py_ExitWithInitConfig(config);
    }
Get the verbose option
-----------------------
Example getting the configuration option ``verbose``:
.. code-block:: c

    int get_verbose(void)
    {
        int verbose;
        if (PyConfig_GetInt("verbose", &verbose) < 0) {
            // Silently ignore the error
            PyErr_Clear();
            return -1;
        }
        return verbose;
    }
On error, the function silently ignores the error and returns ``-1``.
Implementation
==============
* Issue: `No limited C API to customize Python initialization
<https://github.com/python/cpython/issues/107954>`_
* PR: `Add PyInitConfig C API
<https://github.com/python/cpython/pull/110176>`_
* PR: `Add PyConfig_Get() function
<https://github.com/python/cpython/pull/112609>`_
Backwards Compatibility
=======================
Changes are fully backward compatible. Only new APIs are added.
Existing APIs such as the ``PyConfig`` C API are left unchanged.
Discussions
===========
* `FR: Allow private runtime config to enable extending without breaking
the PyConfig ABI
<https://discuss.python.org/t/fr-allow-private-runtime-config-to-enable-extending-without-breaking-the-pyconfig-abi/18004>`__
(August 2022).
Rejected Ideas
==============
Configuration as text
---------------------
It was proposed to provide the configuration as text to make the API
compatible with the stable ABI and to allow custom options.
Example::

    # integer
    bytes_warning = 2

    # string
    filesystem_encoding = "utf8"   # comment

    # list of strings
    argv = ['python', '-c', 'code']
The API would take the configuration as a string, not as a file. Example
with a hypothetical ``PyInit_SetConfig()`` function:
.. code-block:: c

    void stable_abi_init_demo(int set_path)
    {
        PyInit_SetConfig(
            "isolated = 1\n"
            "argv = ['python', '-c', 'code']\n"
            "filesystem_encoding = 'utf-8'\n"
        );

        if (set_path) {
            PyInit_SetConfig("pythonpath = '/my/path'");
        }
    }
The example ignores error handling to make it easier to read.
The problem is that generating such configuration text requires adding
quotes around strings and escaping any quotes inside them. Formatting an
array of strings becomes non-trivial.
Providing an API to format a string or an array of strings is not really
worth it, whereas Python can provide directly an API to set a
configuration option where the value is passed directly as a string or
an array of strings. It avoids giving special meaning to some
characters, such as newline characters, which would have to be escaped.
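To give an idea of the extra work a text format would push onto callers,
here is a minimal sketch of a quoting helper. The function name
``quote_config_string()`` and the escaping rules (backslash before a single
quote, a backslash, or a newline) are assumptions made for this
illustration. The rejected proposal never defined an exact syntax.

```c
#include <assert.h>
#include <stddef.h>

/* Write src into dst as a single-quoted literal, escaping single quotes,
   backslashes and newlines with a backslash.
   Return the number of characters written (excluding the terminating
   NUL), or -1 if dst is too small. */
int quote_config_string(char *dst, size_t size, const char *src)
{
    size_t pos = 0;

    assert(dst != NULL && src != NULL);
    if (size < 3) {
        return -1;   /* no room for the two quotes and the NUL */
    }
    dst[pos++] = '\'';
    for (; *src != '\0'; src++) {
        char c = *src;
        int needs_escape = (c == '\'' || c == '\\' || c == '\n');
        /* Keep room for this character, the closing quote and the NUL. */
        if (pos + (needs_escape ? 2 : 1) + 2 > size) {
            return -1;
        }
        if (needs_escape) {
            dst[pos++] = '\\';
            if (c == '\n') {
                c = 'n';
            }
        }
        dst[pos++] = c;
    }
    dst[pos++] = '\'';
    dst[pos] = '\0';
    return (int)pos;
}
```

Every embedding application would need an equivalent helper, plus another
one for lists, whereas the typed setters take the raw value directly.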
Refer to an option with an integer
----------------------------------
Using strings to refer to a configuration option requires comparing
strings which can be slower than comparing integers.
The proposed alternative is to use integers, similar to type "slots" such
as ``Py_tp_doc``, to refer to a configuration option. The
``const char *name`` parameter would be replaced with ``int option``.
Accepting custom options is more likely to cause conflicts when using
integers, since it's harder to maintain "namespaces" (ranges) for
integer options. Using strings, a simple prefix with a colon separator
can be used.
Using integers also requires maintaining a list of integer constants, which
makes the C API and the Python API larger.
Python 3.13 only has around 62 configuration options, and so performance
is not really a blocker issue. If better performance is needed later, a
hash table can be used to get an option by its name.
If getting a configuration option is used in hot code, the value can be
read once and cached. By the way, most configuration options cannot be
changed at runtime.
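The caching approach can be sketched as follows. ``slow_get_int()`` is a
hypothetical stand-in for a by-name getter such as ``PyConfig_GetInt()``.
It is not the real function, and ``lookup_count`` exists only to make the
effect of the cache observable.

```c
#include <assert.h>
#include <string.h>

/* Counts by-name lookups so the effect of caching can be observed. */
int lookup_count = 0;

/* Hypothetical stand-in for a getter that looks an option up by name.
   Return 0 on success, -1 if the option is unknown. */
int slow_get_int(const char *name, int *value)
{
    assert(name != NULL && value != NULL);
    lookup_count++;
    if (strcmp(name, "verbose") == 0) {
        *value = 1;
        return 0;
    }
    return -1;
}

/* Read "verbose" by name once, then reuse the cached value in hot code.
   Return -1 if the lookup fails. */
int get_verbose_cached(void)
{
    static int cached = -1;
    static int is_cached = 0;

    if (!is_cached) {
        if (slow_get_int("verbose", &cached) < 0) {
            return -1;
        }
        is_cached = 1;
    }
    return cached;
}
```

This pattern is only safe for options that cannot change at runtime. The
sketch is also not thread-safe as written.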
Copyright
=========
This document is placed in the public domain or under the
CC0-1.0-Universal license, whichever is more permissive.