PEP 737: Add %N format, recommend fully qualified name (#3560)

* Add %N and %#N formats.
* The %T and %#T formats now expect an object instead of a type.
* Exchange %T and %#T formats: %T now formats the fully qualified
  name.
* Recommend using the type fully qualified name in error messages and
  in __repr__() methods in new code.
* Skip the __main__ module in the fully qualified name.
Victor Stinner 2023-12-05 12:15:09 +01:00 committed by GitHub
parent 15dbd2632c
commit 99825461f8
1 changed files with 140 additions and 66 deletions


@ -14,8 +14,14 @@ Abstract
Add new convenient APIs to format type names the same way in Python and Add new convenient APIs to format type names the same way in Python and
in C. No longer format type names differently depending on how types are in C. No longer format type names differently depending on how types are
implemented. Also, put an end to truncating type names in C. The new C implemented. No longer truncate type names in the standard library.
API is compatible with the limited C API.
Recommend using the type fully qualified name in error messages and in
``__repr__()`` methods in new code.
Make C code safer by avoiding borrowed references, which can lead to
crashes. The new C API is compatible with the limited C API.
Rationale Rationale
========= =========
@ -41,7 +47,7 @@ Example with the ``datetime.timedelta`` type:
Python code Python code
^^^^^^^^^^^ ^^^^^^^^^^^
In Python, ``type.__name__`` gets the type "short name", whereas In Python, ``type.__name__`` gets the type short name, whereas
``f"{type.__module__}.{type.__qualname__}"`` formats the type "fully ``f"{type.__module__}.{type.__qualname__}"`` formats the type "fully
qualified name". Usually, ``type(obj)`` or ``obj.__class__`` are used to qualified name". Usually, ``type(obj)`` or ``obj.__class__`` are used to
get the type of the object *obj*. Sometimes, the type name is put get the type of the object *obj*. Sometimes, the type name is put
@ -67,11 +73,14 @@ In C, the most common way to format a type name is to get the
PyErr_Format(PyExc_TypeError, "globals must be a dict, not %.100s", PyErr_Format(PyExc_TypeError, "globals must be a dict, not %.100s",
Py_TYPE(globals)->tp_name); Py_TYPE(globals)->tp_name);
The type qualified name (``type.__qualname__``) is only used at a single The type "fully qualified name" is used in a few places:
place, by the ``type.__repr__()`` implementation. Using ``PyErr_Display()``, ``type.__repr__()`` implementation, and
``Py_TYPE(obj)->tp_name`` is more convenient than calling ``sys.unraisablehook`` implementation.
``PyType_GetQualName()`` which requires ``Py_DECREF()``. Moreover,
``PyType_GetQualName()`` was only added recently, in Python 3.11. Using ``Py_TYPE(obj)->tp_name`` is preferred since it is more convenient
than calling ``PyType_GetQualName()`` which requires ``Py_DECREF()``.
Moreover, ``PyType_GetQualName()`` was only added recently, in Python
3.11.
Some functions use ``%R`` (``repr(type)``) to format a type name, the Some functions use ``%R`` (``repr(type)``) to format a type name, the
output contains the type fully qualified name. Example: output contains the type fully qualified name. Example:
@ -163,29 +172,48 @@ Specification
============= =============
* Add ``type.__fully_qualified_name__`` attribute. * Add ``type.__fully_qualified_name__`` attribute.
* Add ``%T`` and ``%#T`` formats to ``PyUnicode_FromFormat()``. * Add ``%T``, ``%#T``, ``%N``, ``%#N`` formats to
``PyUnicode_FromFormat()``.
* Add ``PyType_GetFullyQualifiedName()`` function. * Add ``PyType_GetFullyQualifiedName()`` function.
* Recommend using the type fully qualified name in error messages and
in ``__repr__()`` methods in new code.
* Recommend not truncating type names. * Recommend not truncating type names.
Python API Python API
---------- ----------
Add ``type.__fully_qualified_name__`` read-only attribute, the fully Add ``type.__fully_qualified_name__`` read-only attribute, the fully
qualified name of a type: similar to qualified name of a type: similar to
``f"{type.__module__}.{type.__qualname__}"`` or ``type.__qualname__`` if ``f"{type.__module__}.{type.__qualname__}"``, or ``type.__qualname__`` if
``type.__module__`` is not a string or is equal to ``"builtins"``. ``type.__module__`` is not a string or is equal to ``"builtins"`` or is
equal to ``"__main__"``.
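The rule above can be approximated in pure Python. The following is a minimal sketch, not the actual implementation: ``fully_qualified_name()`` is a hypothetical helper name used only for illustration, and it assumes the rule exactly as worded in this section (omit the module when it is not a string, or is ``"builtins"`` or ``"__main__"``).

```python
import datetime


def fully_qualified_name(cls):
    """Hypothetical helper approximating the proposed attribute."""
    module = cls.__module__
    # Fall back to the qualified name alone when the module is not a
    # string, or when it is "builtins" or "__main__".
    if not isinstance(module, str) or module in ("builtins", "__main__"):
        return cls.__qualname__
    return f"{module}.{cls.__qualname__}"


# A class created as if it were defined in a script run as __main__.
ScriptType = type("ScriptType", (), {"__module__": "__main__"})

print(fully_qualified_name(datetime.timedelta))
print(fully_qualified_name(int))
print(fully_qualified_name(ScriptType))
```

Under these assumptions, the module prefix is kept only for ``datetime.timedelta``; the built-in type and the ``__main__`` type are reduced to their qualified names.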
``type.__repr__()`` is left unchanged: it only omits the module if
the module is equal to ``"builtins"``. It includes the module if the
module is equal to ``"__main__"``. Pseudo-code::
    def type_repr(cls):
        if isinstance(cls.__module__, str) and cls.__module__ != "builtins":
            name = f"{cls.__module__}.{cls.__qualname__}"
        else:
            name = cls.__qualname__
        return f"<class '{name}'>"
Add PyUnicode_FromFormat() formats Add PyUnicode_FromFormat() formats
---------------------------------- ----------------------------------
Add ``%T`` and ``%#T`` formats to ``PyUnicode_FromFormat()`` to format Add formats to ``PyUnicode_FromFormat()``:
a type name:
* ``%T`` formats the type "short name" (``type.__name__``). * ``%T`` formats the type fully qualified name of an **object**:
* ``%#T`` formats the type "fully qualified name" similar to ``type(obj).__fully_qualified_name__``.
(``type.__fully_qualified_name__``). * ``%#T`` formats the type short name of an **object**:
similar to ``type(obj).__name__``.
Both formats expect a type as argument. * ``%N`` formats the fully qualified name of a **type**:
similar to ``type.__fully_qualified_name__``.
* ``%#N`` formats the short name of a **type**:
similar to ``type.__name__``.
The hash character (``#``) in the format string stands for The hash character (``#``) in the format string stands for
`alternative format `alternative format
@ -209,11 +237,11 @@ can be replaced with the ``%T`` format:
.. code-block:: c .. code-block:: c
PyErr_Format(PyExc_TypeError, PyErr_Format(PyExc_TypeError,
"__format__ must return a str, not %T", "__format__ must return a str, not %T", result);
Py_TYPE(result));
Advantages of the updated code: Advantages of the updated code:
* Safer C code: avoid ``Py_TYPE()`` which returns a borrowed reference.
* The ``PyTypeObject.tp_name`` member is no longer read explicitly: the * The ``PyTypeObject.tp_name`` member is no longer read explicitly: the
code becomes compatible with the limited C API. code becomes compatible with the limited C API.
* The ``PyTypeObject.tp_name`` bytes string no longer has to be decoded * The ``PyTypeObject.tp_name`` bytes string no longer has to be decoded
@ -221,6 +249,7 @@ Advantages of the updated code:
``type.__fully_qualified_name__`` is already a Unicode string. ``type.__fully_qualified_name__`` is already a Unicode string.
* The type name is no longer truncated. * The type name is no longer truncated.
Add PyType_GetFullyQualifiedName() function Add PyType_GetFullyQualifiedName() function
------------------------------------------- -------------------------------------------
@ -235,6 +264,18 @@ On success, return a new reference to the string. On error, raise an
exception and return ``NULL``. exception and return ``NULL``.
Recommend using the type fully qualified name
---------------------------------------------
The type fully qualified name is recommended in error messages and in
``__repr__()`` methods in new code.
In non-trivial applications, two types defined in different modules are
likely to share the same short name, especially when the name is generic.
Using the fully qualified name helps to identify the type unambiguously.
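As an illustration of the ambiguity described above, the standard library itself has two distinct ``Queue`` classes. This is a minimal sketch: ``qualified()`` is a hypothetical helper that always joins the module and the qualified name, and it is not part of the proposed API.

```python
import asyncio
import queue


def qualified(cls):
    # Hypothetical helper: always include the module name.
    return f"{cls.__module__}.{cls.__qualname__}"


for cls in (queue.Queue, asyncio.Queue):
    # The short names are identical; only the qualified names differ.
    print(f"short: {cls.__name__!r}, qualified: {qualified(cls)!r}")
```

An error message that only says ``Queue`` does not tell the reader which of the two classes was involved, whereas the qualified form does.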
Recommend not truncating type names Recommend not truncating type names
----------------------------------- -----------------------------------
@ -242,6 +283,9 @@ Type names must not be truncated. For example, the ``%.100s`` format
should be avoided: use the ``%s`` format instead (or ``%T`` and ``%#T`` should be avoided: use the ``%s`` format instead (or ``%T`` and ``%#T``
formats in C). formats in C).
Code in the standard library is updated to no longer truncate type
names.
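The effect of a ``%.100s`` precision can be reproduced with Python's ``%`` string formatting, which also truncates a string to the given precision. This is a small sketch using a made-up type whose name is longer than 100 characters; the type name is an assumption chosen purely for the demonstration.

```python
# Build a type whose name is 116 characters long (hypothetical name).
LongType = type("VeryLongTypeName" + "X" * 100, (), {})
name = LongType.__name__

# "%.100s" keeps only the first 100 characters of the name.
truncated = "expected a dict, not %.100s" % name
# "%s" keeps the whole name.
full = "expected a dict, not %s" % name

print(len(name), len(truncated), len(full))
```

The truncated message silently drops the end of the name, which is why the ``%s`` form is preferred.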
Implementation Implementation
============== ==============
@ -253,8 +297,18 @@ Implementation
Backwards Compatibility Backwards Compatibility
======================= =======================
Only new APIs are added. No existing API is modified. Changes are fully Changes proposed in this PEP are backward compatible.
backward compatible.
Adding new APIs has no effect on backward compatibility. Existing
APIs are left unchanged.
Replacing the type short name with the type fully qualified name is only
recommended in new code. Existing code should be left
unchanged and so remains backward compatible.
In the standard library, type names are no longer truncated. We believe
that no code should be affected in practice, since type names longer
than 100 characters are rare.
Rejected Ideas Rejected Ideas
@ -332,13 +386,6 @@ can be formatted as ``f"{type.__module__}:{type.__qualname__}"``, or
In the standard library, no code formats a type fully qualified name In the standard library, no code formats a type fully qualified name
this way. this way.
It is already tricky to get a type from its qualified name. The type
qualified name already uses the dot (``.``) separator between different
parts: class name, ``<locals>``, nested class name, etc.
The colon separator is not consistent with dot separator used in a
module fully qualified name (``module.__name__``).
Other ways to format type names in C Other ways to format type names in C
------------------------------------ ------------------------------------
@ -378,35 +425,47 @@ modifier for ``ptrdiff_t`` argument.
can be used in C to format a type qualified name. can be used in C to format a type qualified name.
Omit Py_TYPE() with %T format: pass an object Use %T format with Py_TYPE(): pass a type
----------------------------------------------- -----------------------------------------
It was proposed to format a type name of an object, like: It was proposed to pass a type to the ``%T`` format, like:
.. code-block:: c .. code-block:: c
PyErr_Format(PyExc_TypeError, "type name: %T", obj); PyErr_Format(PyExc_TypeError, "object type name: %T", Py_TYPE(obj));
The intent is to avoid ``Py_TYPE()`` which returns a borrowed reference The ``Py_TYPE()`` function returns a borrowed reference. Just to format
to the type. Using a borrowed reference can cause a bug or crash if the an error, using a borrowed reference to a type looks safe. In practice,
type is finalized or deallocated while being used. it can lead to a crash. Example::
In practice, it's unlikely that a type is finalized while the error import gc
message is formatted. Instances of static types cannot have their type import my_cext
deallocated: static types are never deallocated. Since Python 3.8,
instances of heap types hold a strong reference to their type (in
``PyObject.ob_type``) and it's safe to make the assumption that the code
holds a strong reference to the formatted object, so the object type
cannot be deallocated.
In short, it is safe to use a ``Py_TYPE(obj)`` borrowed reference while class ClassA:
formatting an error message. pass
If the ``%T`` format expects an instance, formatting a type cannot use def create_object():
the ``%T`` format, whereas it's a common operation in stdlib C class ClassB:
extensions. The ``%T`` format would only cover half of cases (only def __repr__(self):
instances). If the ``%T`` format takes a type, all cases are covered self.__class__ = ClassA
(types, and instances using ``Py_TYPE()``). gc.collect()
return "ClassB repr"
return ClassB()
obj = create_object()
my_cext.func(obj)
where ``my_cext.func()`` is a C function which calls::
PyErr_Format(PyExc_ValueError,
"Unexpected value %R of type %T",
obj, Py_TYPE(obj));
``PyErr_Format()`` is called with a borrowed reference to ``ClassB``.
When ``repr(obj)`` is called by the ``%R`` format, the last reference to
``ClassB`` is removed and the class is deallocated. When the ``%T``
format is processed, ``Py_TYPE(obj)`` is already a dangling pointer and
Python crashes.
Other proposed APIs to get a type fully qualified name Other proposed APIs to get a type fully qualified name
@ -423,26 +482,41 @@ Other proposed APIs to get a type fully qualified name
``inspect`` module to use it. ``inspect`` module to use it.
Omit __main__ module in the type fully qualified name Include the __main__ module in the type fully qualified name
----------------------------------------------------- ------------------------------------------------------------
The ``pdb`` module formats a type fully qualified names in a similar way Format ``type.__fully_qualified_name__`` as
as the proposed ``type.__fully_qualified_name__``, but it omits the module ``f"{type.__module__}.{type.__qualname__}"``, or ``type.__qualname__`` if
if the module is equal to ``"__main__"``. ``type.__module__`` is not a string or is equal to ``"builtins"``. Do
not treat the ``__main__`` module differently: include it in the name.
The ``unittest`` module and a lot of existing stdlib code format a type Existing code such as ``type.__repr__()``, ``collections.abc`` and
fully qualified names the same way as the proposed ``unittest`` modules format a type name with
``type.__fully_qualified_name__``: only omits the module if the module ``f'{obj.__module__}.{obj.__qualname__}'`` and only omit the module part
is equal to ``"builtins"``. if the module is equal to ``builtins``. Only the ``traceback`` and
``pdb`` modules also omit the module if it's equal to ``"builtins"`` or
``"__main__"``.
It's possible to omit the ``"__main__."`` prefix of the ``__main__`` The ``type.__fully_qualified_name__`` attribute omits the ``__main__``
module with:: module to produce shorter names for a common case: types defined in a
script run with ``python script.py``. For debugging, the ``repr()``
function can be used on a type: it includes the ``__main__`` module in
the type name. Alternatively, use the ``f"{type.__module__}.{type.__qualname__}"``
format to always include the module name, even for the ``"builtins"``
module.
def format_type(cls): Example of script::
if cls.__module__ != "__main__":
return cls.__fully_qualified_name__ class MyType:
else: pass
return cls.__qualname__
print(f"name: {MyType.__fully_qualified_name__}")
print(f"repr: {repr(MyType)}")
Output::
name: MyType
repr: <class '__main__.MyType'>
Discussions Discussions