PEP 737: Add %N format, recommend fully qualified name (#3560)
* Add %N and %#N formats.
* The %T and %#T formats now expect an object instead of a type.
* Exchange %T and %#T formats: %T now formats the fully qualified name.
* Recommend using the type fully qualified name in error messages and in __repr__() methods in new code.
* Skip the __main__ module in the fully qualified name.
This commit is contained in:
parent
15dbd2632c
commit
99825461f8
|
@ -14,8 +14,14 @@ Abstract
|
||||||
|
|
||||||
Add new convenient APIs to format type names the same way in Python and
|
Add new convenient APIs to format type names the same way in Python and
|
||||||
in C. No longer format type names differently depending on how types are
|
in C. No longer format type names differently depending on how types are
|
||||||
implemented. Also, put an end to truncating type names in C. The new C
|
implemented. No longer truncate type names in the standard library.
|
||||||
API is compatible with the limited C API.
|
|
||||||
|
Recommend using the type fully qualified name in error messages and in
|
||||||
|
``__repr__()`` methods in new code.
|
||||||
|
|
||||||
|
Make C code safer by avoiding borrowed references, which can lead to
|
||||||
|
crashes. The new C API is compatible with the limited C API.
|
||||||
|
|
||||||
|
|
||||||
Rationale
|
Rationale
|
||||||
=========
|
=========
|
||||||
|
@ -41,7 +47,7 @@ Example with the ``datetime.timedelta`` type:
|
||||||
Python code
|
Python code
|
||||||
^^^^^^^^^^^
|
^^^^^^^^^^^
|
||||||
|
|
||||||
In Python, ``type.__name__`` gets the type "short name", whereas
|
In Python, ``type.__name__`` gets the type short name, whereas
|
||||||
``f"{type.__module__}.{type.__qualname__}"`` formats the type "fully
|
``f"{type.__module__}.{type.__qualname__}"`` formats the type "fully
|
||||||
qualified name". Usually, ``type(obj)`` or ``obj.__class__`` are used to
|
qualified name". Usually, ``type(obj)`` or ``obj.__class__`` are used to
|
||||||
get the type of the object *obj*. Sometimes, the type name is put
|
get the type of the object *obj*. Sometimes, the type name is put
|
||||||
|
@ -67,11 +73,14 @@ In C, the most common way to format a type name is to get the
|
||||||
PyErr_Format(PyExc_TypeError, "globals must be a dict, not %.100s",
|
PyErr_Format(PyExc_TypeError, "globals must be a dict, not %.100s",
|
||||||
Py_TYPE(globals)->tp_name);
|
Py_TYPE(globals)->tp_name);
|
||||||
|
|
||||||
The type qualified name (``type.__qualname__``) is only used at a single
|
The type "fully qualified name" is used in a few places:
|
||||||
place, by the ``type.__repr__()`` implementation. Using
|
``PyErr_Display()``, ``type.__repr__()`` implementation, and
|
||||||
``Py_TYPE(obj)->tp_name`` is more convenient than calling
|
``sys.unraisablehook`` implementation.
|
||||||
``PyType_GetQualName()`` which requires ``Py_DECREF()``. Moreover,
|
|
||||||
``PyType_GetQualName()`` was only added recently, in Python 3.11.
|
Using ``Py_TYPE(obj)->tp_name`` is preferred since it is more convenient
|
||||||
|
than calling ``PyType_GetQualName()`` which requires ``Py_DECREF()``.
|
||||||
|
Moreover, ``PyType_GetQualName()`` was only added recently, in Python
|
||||||
|
3.11.
|
||||||
|
|
||||||
Some functions use ``%R`` (``repr(type)``) to format a type name, the
|
Some functions use ``%R`` (``repr(type)``) to format a type name, the
|
||||||
output contains the type fully qualified name. Example:
|
output contains the type fully qualified name. Example:
|
||||||
|
@ -163,29 +172,48 @@ Specification
|
||||||
=============
|
=============
|
||||||
|
|
||||||
* Add ``type.__fully_qualified_name__`` attribute.
|
* Add ``type.__fully_qualified_name__`` attribute.
|
||||||
* Add ``%T`` and ``%#T`` formats to ``PyUnicode_FromFormat()``.
|
* Add ``%T``, ``%#T``, ``%N``, ``%#N`` formats to
|
||||||
|
``PyUnicode_FromFormat()``.
|
||||||
* Add ``PyType_GetFullyQualifiedName()`` function.
|
* Add ``PyType_GetFullyQualifiedName()`` function.
|
||||||
|
* Recommend using the type fully qualified name in error messages and
|
||||||
|
in ``__repr__()`` methods in new code.
|
||||||
* Recommend not truncating type names.
|
* Recommend not truncating type names.
|
||||||
|
|
||||||
|
|
||||||
Python API
|
Python API
|
||||||
----------
|
----------
|
||||||
|
|
||||||
Add ``type.__fully_qualified_name__`` read-only attribute, the fully
|
Add ``type.__fully_qualified_name__`` read-only attribute, the fully
|
||||||
qualified name of a type: similar to
|
qualified name of a type: similar to
|
||||||
``f"{type.__module__}.{type.__qualname__}"`` or ``type.__qualname__`` if
|
``f"{type.__module__}.{type.__qualname__}"``, or ``type.__qualname__`` if
|
||||||
``type.__module__`` is not a string or is equal to ``"builtins"``.
|
``type.__module__`` is not a string or is equal to ``"builtins"`` or is
|
||||||
|
equal to ``"__main__"``.
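The rule above can be sketched in pure Python. This is only an approximation of the proposed attribute: ``fully_qualified_name()`` is a hypothetical helper name, and ``Script`` is a stand-in for a class defined in a script run as ``__main__``.

```python
import datetime


def fully_qualified_name(cls):
    """Hypothetical helper approximating type.__fully_qualified_name__."""
    module = cls.__module__
    # Omit the module if it is not a string, or is "builtins" or "__main__".
    if not isinstance(module, str) or module in ("builtins", "__main__"):
        return cls.__qualname__
    return f"{module}.{cls.__qualname__}"


# Stand-in for a class defined in a script (module set to "__main__").
Script = type("Script", (), {"__module__": "__main__"})

print(fully_qualified_name(int))                 # int
print(fully_qualified_name(datetime.timedelta))  # datetime.timedelta
print(fully_qualified_name(Script))              # Script
```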
|
||||||
|
|
||||||
|
``type.__repr__()`` is left unchanged: it only omits the module if
|
||||||
|
the module is equal to ``"builtins"``. It includes the module if the
|
||||||
|
module is equal to ``"__main__"``. Pseudo-code::
|
||||||
|
|
||||||
|
def type_repr(cls):
|
||||||
|
    if isinstance(cls.__module__, str) and cls.__module__ != "builtins":
|
||||||
|
        name = f"{cls.__module__}.{cls.__qualname__}"
|
||||||
|
    else:
|
||||||
|
        name = cls.__qualname__
|
||||||
|
    return f"<class '{name}'>"
|
||||||
|
|
||||||
|
|
||||||
Add PyUnicode_FromFormat() formats
|
Add PyUnicode_FromFormat() formats
|
||||||
----------------------------------
|
----------------------------------
|
||||||
|
|
||||||
Add ``%T`` and ``%#T`` formats to ``PyUnicode_FromFormat()`` to format
|
Add formats to ``PyUnicode_FromFormat()``:
|
||||||
a type name:
|
|
||||||
|
|
||||||
* ``%T`` formats the type "short name" (``type.__name__``).
|
* ``%T`` formats the type fully qualified name of an **object**:
|
||||||
* ``%#T`` formats the type "fully qualified name"
|
similar to ``type(obj).__fully_qualified_name__``.
|
||||||
(``type.__fully_qualified_name__``).
|
* ``%#T`` formats the type short name of an **object**:
|
||||||
|
similar to ``type(obj).__name__``.
|
||||||
Both formats expect a type as argument.
|
* ``%N`` formats the fully qualified name of a **type**:
|
||||||
|
similar to ``type.__fully_qualified_name__``.
|
||||||
|
* ``%#N`` formats the short name of a **type**:
|
||||||
|
similar to ``type.__name__``.
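As a rough illustration, the four formats listed above can be emulated in Python. This is not the C implementation: ``EMULATED_FORMATS`` and ``_fqn()`` are hypothetical names, and the mapping only mirrors the semantics described in this revision of the PEP.

```python
import collections


def _fqn(cls):
    """Hypothetical stand-in for type.__fully_qualified_name__."""
    module = cls.__module__
    if not isinstance(module, str) or module in ("builtins", "__main__"):
        return cls.__qualname__
    return f"{module}.{cls.__qualname__}"


# %T and %#T take an object; %N and %#N take a type.
EMULATED_FORMATS = {
    "%T": lambda obj: _fqn(type(obj)),
    "%#T": lambda obj: type(obj).__name__,
    "%N": lambda cls: _fqn(cls),
    "%#N": lambda cls: cls.__name__,
}

obj = collections.OrderedDict()
for spec in ("%T", "%#T"):
    print(spec, EMULATED_FORMATS[spec](obj))
for spec in ("%N", "%#N"):
    print(spec, EMULATED_FORMATS[spec](type(obj)))
```

With this mapping, the object formats and the type formats give the same text for ``obj`` and ``type(obj)``; the difference is only in which argument the caller passes.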
|
||||||
|
|
||||||
The hash character (``#``) in the format string stands for
|
The hash character (``#``) in the format string stands for
|
||||||
`alternative format
|
`alternative format
|
||||||
|
@ -209,11 +237,11 @@ can be replaced with the ``%T`` format:
|
||||||
.. code-block:: c
|
.. code-block:: c
|
||||||
|
|
||||||
PyErr_Format(PyExc_TypeError,
|
PyErr_Format(PyExc_TypeError,
|
||||||
"__format__ must return a str, not %T",
|
"__format__ must return a str, not %T", result);
|
||||||
Py_TYPE(result));
|
|
||||||
|
|
||||||
Advantages of the updated code:
|
Advantages of the updated code:
|
||||||
|
|
||||||
|
* Safer C code: avoid ``Py_TYPE()`` which returns a borrowed reference.
|
||||||
* The ``PyTypeObject.tp_name`` member is no longer read explicitly: the
|
* The ``PyTypeObject.tp_name`` member is no longer read explicitly: the
|
||||||
code becomes compatible with the limited C API.
|
code becomes compatible with the limited C API.
|
||||||
* The ``PyTypeObject.tp_name`` bytes string no longer has to be decoded
|
* The ``PyTypeObject.tp_name`` bytes string no longer has to be decoded
|
||||||
|
@ -221,6 +249,7 @@ Advantages of the updated code:
|
||||||
``type.__fully_qualified_name__`` is already a Unicode string.
|
``type.__fully_qualified_name__`` is already a Unicode string.
|
||||||
* The type name is no longer truncated.
|
* The type name is no longer truncated.
|
||||||
|
|
||||||
|
|
||||||
Add PyType_GetFullyQualifiedName() function
|
Add PyType_GetFullyQualifiedName() function
|
||||||
-------------------------------------------
|
-------------------------------------------
|
||||||
|
|
||||||
|
@ -235,6 +264,18 @@ On success, return a new reference to the string. On error, raise an
|
||||||
exception and return ``NULL``.
|
exception and return ``NULL``.
|
||||||
|
|
||||||
|
|
||||||
|
Recommend using the type fully qualified name
|
||||||
|
---------------------------------------------
|
||||||
|
|
||||||
|
The type fully qualified name is recommended in error messages and in
|
||||||
|
``__repr__()`` methods in new code.
|
||||||
|
|
||||||
|
In non-trivial applications, it is likely to have two types with the
|
||||||
|
same short name defined in two different modules, especially with
|
||||||
|
generic names. Using the fully qualified name helps identify the type
|
||||||
|
in an unambiguous way.
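The standard library itself has such a case. The sketch below compares ``queue.Queue`` and ``asyncio.Queue``, which share a short name; the variable names are chosen for illustration only.

```python
import asyncio
import queue

classes = (queue.Queue, asyncio.Queue)

# Both classes have the same short name...
short_names = [cls.__name__ for cls in classes]
# ...but the module-qualified names tell them apart.
full_names = [f"{cls.__module__}.{cls.__qualname__}" for cls in classes]

print(short_names)  # ['Queue', 'Queue']
print(full_names)   # ['queue.Queue', 'asyncio.queues.Queue']
```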
|
||||||
|
|
||||||
|
|
||||||
Recommend not truncating type names
|
Recommend not truncating type names
|
||||||
-----------------------------------
|
-----------------------------------
|
||||||
|
|
||||||
|
@ -242,6 +283,9 @@ Type names must not be truncated. For example, the ``%.100s`` format
|
||||||
should be avoided: use the ``%s`` format instead (or ``%T`` and ``%#T``
|
should be avoided: use the ``%s`` format instead (or ``%T`` and ``%#T``
|
||||||
formats in C).
|
formats in C).
|
||||||
|
|
||||||
|
Code in the standard library is updated to no longer truncate type
|
||||||
|
names.
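The effect of truncation can be reproduced with Python ``%`` formatting, which handles a string precision the same way as C ``printf``. The 150-character class name below is an artificial example.

```python
# Artificial type whose name is longer than 100 characters.
LongType = type("A" * 150, (), {})

prefix = "globals must be a dict, not "
truncated = (prefix + "%.100s") % LongType.__name__
full = (prefix + "%s") % LongType.__name__

# The precision ".100" silently drops the last 50 characters of the name.
print(len(truncated) - len(prefix))  # 100
print(len(full) - len(prefix))       # 150
```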
|
||||||
|
|
||||||
|
|
||||||
Implementation
|
Implementation
|
||||||
==============
|
==============
|
||||||
|
@ -253,8 +297,18 @@ Implementation
|
||||||
Backwards Compatibility
|
Backwards Compatibility
|
||||||
=======================
|
=======================
|
||||||
|
|
||||||
Only new APIs are added. No existing API is modified. Changes are fully
|
Changes proposed in this PEP are backward compatible.
|
||||||
backward compatible.
|
|
||||||
|
Adding new APIs has no effect on the backward compatibility. Existing
|
||||||
|
APIs are left unchanged.
|
||||||
|
|
||||||
|
Replacing the type short name with the type fully qualified name is only
|
||||||
|
recommended in new code. Existing code should be left
|
||||||
|
unchanged and so remains backward compatible.
|
||||||
|
|
||||||
|
In the standard library, type names are no longer truncated. We believe
|
||||||
|
that no code should be affected in practice, since type names longer
|
||||||
|
than 100 characters are rare.
|
||||||
|
|
||||||
|
|
||||||
Rejected Ideas
|
Rejected Ideas
|
||||||
|
@ -332,13 +386,6 @@ can be formatted as ``f"{type.__module__}:{type.__qualname__}"``, or
|
||||||
In the standard library, no code formats a type fully qualified name
|
In the standard library, no code formats a type fully qualified name
|
||||||
this way.
|
this way.
|
||||||
|
|
||||||
It is already tricky to get a type from its qualified name. The type
|
|
||||||
qualified name already uses the dot (``.``) separator between different
|
|
||||||
parts: class name, ``<locals>``, nested class name, etc.
|
|
||||||
|
|
||||||
The colon separator is not consistent with dot separator used in a
|
|
||||||
module fully qualified name (``module.__name__``).
|
|
||||||
|
|
||||||
|
|
||||||
Other ways to format type names in C
|
Other ways to format type names in C
|
||||||
------------------------------------
|
------------------------------------
|
||||||
|
@ -378,35 +425,47 @@ modifier for ``ptrdiff_t`` argument.
|
||||||
can be used in C to format a type qualified name.
|
can be used in C to format a type qualified name.
|
||||||
|
|
||||||
|
|
||||||
Omit Py_TYPE() with %T format: pass an object
|
Use %T format with Py_TYPE(): pass a type
|
||||||
-----------------------------------------------
|
-----------------------------------------
|
||||||
|
|
||||||
It was proposed to format a type name of an object, like:
|
It was proposed to pass a type to the ``%T`` format, like:
|
||||||
|
|
||||||
.. code-block:: c
|
.. code-block:: c
|
||||||
|
|
||||||
PyErr_Format(PyExc_TypeError, "type name: %T", obj);
|
PyErr_Format(PyExc_TypeError, "object type name: %T", Py_TYPE(obj));
|
||||||
|
|
||||||
The intent is to avoid ``Py_TYPE()`` which returns a borrowed reference
|
The ``Py_TYPE()`` function returns a borrowed reference. Just to format
|
||||||
to the type. Using a borrowed reference can cause a bug or crash if the
|
an error, using a borrowed reference to a type looks safe. In practice,
|
||||||
type is finalized or deallocated while being used.
|
it can lead to a crash. Example::
|
||||||
|
|
||||||
In practice, it's unlikely that a type is finalized while the error
|
import gc
|
||||||
message is formatted. Instances of static types cannot have their type
|
import my_cext
|
||||||
deallocated: static types are never deallocated. Since Python 3.8,
|
|
||||||
instances of heap types hold a strong reference to their type (in
|
|
||||||
``PyObject.ob_type``) and it's safe to make the assumption that the code
|
|
||||||
holds a strong reference to the formatted object, so the object type
|
|
||||||
cannot be deallocated.
|
|
||||||
|
|
||||||
In short, it is safe to use a ``Py_TYPE(obj)`` borrowed reference while
|
class ClassA:
|
||||||
formatting an error message.
|
    pass
|
||||||
|
|
||||||
If the ``%T`` format expects an instance, formatting a type cannot use
|
def create_object():
|
||||||
the ``%T`` format, whereas it's a common operation in stdlib C
|
    class ClassB:
|
||||||
extensions. The ``%T`` format would only cover half of cases (only
|
        def __repr__(self):
|
||||||
instances). If the ``%T`` format takes a type, all cases are covered
|
            self.__class__ = ClassA
|
||||||
(types, and instances using ``Py_TYPE()``).
|
            gc.collect()
|
||||||
|
            return "ClassB repr"
|
||||||
|
    return ClassB()
|
||||||
|
|
||||||
|
obj = create_object()
|
||||||
|
my_cext.func(obj)
|
||||||
|
|
||||||
|
where ``my_cext.func()`` is a C function which calls::
|
||||||
|
|
||||||
|
PyErr_Format(PyExc_ValueError,
|
||||||
|
"Unexpected value %R of type %T",
|
||||||
|
obj, Py_TYPE(obj));
|
||||||
|
|
||||||
|
``PyErr_Format()`` is called with a borrowed reference to ``ClassB``.
|
||||||
|
When ``repr(obj)`` is called by the ``%R`` format, the last reference to
|
||||||
|
``ClassB`` is removed and the class is deallocated. When the ``%T``
|
||||||
|
format is processed, ``Py_TYPE(obj)`` is already a dangling pointer and
|
||||||
|
Python crashes.
|
||||||
|
|
||||||
|
|
||||||
Other proposed APIs to get a type fully qualified name
|
Other proposed APIs to get a type fully qualified name
|
||||||
|
@ -423,26 +482,41 @@ Other proposed APIs to get a type fully qualified name
|
||||||
``inspect`` module to use it.
|
``inspect`` module to use it.
|
||||||
|
|
||||||
|
|
||||||
Omit __main__ module in the type fully qualified name
|
Include the __main__ module in the type fully qualified name
|
||||||
-----------------------------------------------------
|
------------------------------------------------------------
|
||||||
|
|
||||||
The ``pdb`` module formats a type fully qualified names in a similar way
|
Format ``type.__fully_qualified_name__`` as
|
||||||
as the proposed ``type.__fully_qualified_name__``, but it omits the module
|
``f"{type.__module__}.{type.__qualname__}"``, or ``type.__qualname__`` if
|
||||||
if the module is equal to ``"__main__"``.
|
``type.__module__`` is not a string or is equal to ``"builtins"``. Do
|
||||||
|
not treat the ``__main__`` module differently: include it in the name.
|
||||||
|
|
||||||
The ``unittest`` module and a lot of existing stdlib code format a type
|
Existing code such as ``type.__repr__()``, ``collections.abc`` and
|
||||||
fully qualified names the same way as the proposed
|
``unittest`` modules format a type name with
|
||||||
``type.__fully_qualified_name__``: only omits the module if the module
|
``f'{obj.__module__}.{obj.__qualname__}'`` and only omit the module part
|
||||||
is equal to ``"builtins"``.
|
if the module is equal to ``builtins``. Only the ``traceback`` and
|
||||||
|
``pdb`` modules also omit the module if it's equal to ``"builtins"`` or
|
||||||
|
``"__main__"``.
|
||||||
|
|
||||||
It's possible to omit the ``"__main__."`` prefix of the ``__main__``
|
The ``type.__fully_qualified_name__`` attribute omits the ``__main__``
|
||||||
module with::
|
module to produce shorter names for a common case: types defined in a
|
||||||
|
script run with ``python script.py``. For debugging, the ``repr()``
|
||||||
|
function can be used on a type: it includes the ``__main__`` module in
|
||||||
|
the type name. Or use ``f"{type.__module__}.{type.__qualname__}"``
|
||||||
|
format to always include the module name, even for the ``"builtins"``
|
||||||
|
module.
|
||||||
|
|
||||||
def format_type(cls):
|
Example of script::
|
||||||
    if cls.__module__ != "__main__":
|
|
||||||
        return cls.__fully_qualified_name__
|
class MyType:
|
||||||
    else:
|
    pass
|
||||||
        return cls.__qualname__
|
|
||||||
|
print(f"name: {MyType.__fully_qualified_name__}")
|
||||||
|
print(f"repr: {repr(MyType)}")
|
||||||
|
|
||||||
|
Output::
|
||||||
|
|
||||||
|
name: MyType
|
||||||
|
repr: <class '__main__.MyType'>
|
||||||
|
|
||||||
|
|
||||||
Discussions
|
Discussions
|
||||||
|
|