PEP 737: Add %N format, recommend fully qualified name (#3560)
* Add %N and %#N formats.
* The %T and %#T formats now expect an object instead of a type.
* Exchange %T and %#T formats: %T now formats the fully qualified name.
* Recommend using the type fully qualified name in error messages and
  in __repr__() methods in new code.
* Skip the __main__ module in the fully qualified name.
parent 15dbd2632c
commit 99825461f8
@@ -14,8 +14,14 @@ Abstract
 
 Add new convenient APIs to format type names the same way in Python and
 in C. No longer format type names differently depending on how types are
-implemented. Also, put an end to truncating type names in C. The new C
-API is compatible with the limited C API.
+implemented. No longer truncate type names in the standard library.
+
+Recommend using the type fully qualified name in error messages and in
+``__repr__()`` methods in new code.
+
+Make C code safer by avoiding borrowed references, which can lead to
+crashes. The new C API is compatible with the limited C API.
+
 
 Rationale
 =========
@@ -41,7 +47,7 @@ Example with the ``datetime.timedelta`` type:
 Python code
 ^^^^^^^^^^^
 
-In Python, ``type.__name__`` gets the type "short name", whereas
+In Python, ``type.__name__`` gets the type short name, whereas
 ``f"{type.__module__}.{type.__qualname__}"`` formats the type "fully
 qualified name". Usually, ``type(obj)`` or ``obj.__class__`` are used to
 get the type of the object *obj*. Sometimes, the type name is put
@@ -67,11 +73,14 @@ In C, the most common way to format a type name is to get the
     PyErr_Format(PyExc_TypeError, "globals must be a dict, not %.100s",
                  Py_TYPE(globals)->tp_name);
 
-The type qualified name (``type.__qualname__``) is only used at a single
-place, by the ``type.__repr__()`` implementation. Using
-``Py_TYPE(obj)->tp_name`` is more convenient than calling
-``PyType_GetQualName()`` which requires ``Py_DECREF()``. Moreover,
-``PyType_GetQualName()`` was only added recently, in Python 3.11.
+The type "fully qualified name" is used in a few places:
+``PyErr_Display()``, ``type.__repr__()`` implementation, and
+``sys.unraisablehook`` implementation.
+
+Using ``Py_TYPE(obj)->tp_name`` is preferred since it is more convenient
+than calling ``PyType_GetQualName()`` which requires ``Py_DECREF()``.
+Moreover, ``PyType_GetQualName()`` was only added recently, in Python
+3.11.
 
 Some functions use ``%R`` (``repr(type)``) to format a type name, the
 output contains the type fully qualified name. Example:
@@ -163,29 +172,48 @@ Specification
 =============
 
 * Add ``type.__fully_qualified_name__`` attribute.
-* Add ``%T`` and ``%#T`` formats to ``PyUnicode_FromFormat()``.
+* Add ``%T``, ``%#T``, ``%N``, ``%#N`` formats to
+  ``PyUnicode_FromFormat()``.
 * Add ``PyType_GetFullyQualifiedName()`` function.
+* Recommend using the type fully qualified name in error messages and
+  in ``__repr__()`` methods in new code.
 * Recommend not truncating type names.
 
+
 Python API
 ----------
 
 Add ``type.__fully_qualified_name__`` read-only attribute, the fully
 qualified name of a type: similar to
-``f"{type.__module__}.{type.__qualname__}"`` or ``type.__qualname__`` if
-``type.__module__`` is not a string or is equal to ``"builtins"``.
+``f"{type.__module__}.{type.__qualname__}"``, or ``type.__qualname__`` if
+``type.__module__`` is not a string or is equal to ``"builtins"`` or is
+equal to ``"__main__"``.
+
+The ``type.__repr__()`` is left unchanged: it only omits the module if
+the module is equal to ``"builtins"``. It includes the module if the
+module is equal to ``"__main__"``. Pseudo-code::
+
+    def type_repr(cls):
+        if isinstance(cls.__module__, str) and cls.__module__ != "builtins":
+            name = f"{cls.__module__}.{cls.__qualname__}"
+        else:
+            name = cls.__qualname__
+        return f"<class '{name}'>"
+
 
 Add PyUnicode_FromFormat() formats
 ----------------------------------
 
-Add ``%T`` and ``%#T`` formats to ``PyUnicode_FromFormat()`` to format
-a type name:
+Add formats to ``PyUnicode_FromFormat()``:
 
-* ``%T`` formats the type "short name" (``type.__name__``).
-* ``%#T`` formats the type "fully qualified name"
-  (``type.__fully_qualified_name__``).
-
-Both formats expect a type as argument.
+* ``%T`` formats the type fully qualified name of an **object**:
+  similar to ``type(obj).__fully_qualified_name__``.
+* ``%#T`` formats the type short name of an **object**:
+  similar to ``type(obj).__name__``.
+* ``%N`` formats the fully qualified name of a **type**:
+  similar to ``type.__fully_qualified_name__``.
+* ``%#N`` formats the short name of a **type**:
+  similar to ``type.__name__``.
 
 The hash character (``#``) in the format string stands for
 `alternative format
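Review note: the naming rule and the four format codes added in the Specification hunk above can be approximated in pure Python. This is only a sketch under stated assumptions: ``fully_qualified_name()`` and ``emulate_format()`` are hypothetical helper names introduced here for illustration, not APIs from the PEP or from CPython, and the proposed ``type.__fully_qualified_name__`` attribute is not assumed to exist at runtime.

```python
import datetime


def fully_qualified_name(cls):
    """Hypothetical helper mirroring the proposed attribute's rule."""
    module = cls.__module__
    # Omit the module if it is not a string, or is "builtins" or "__main__".
    if isinstance(module, str) and module not in ("builtins", "__main__"):
        return f"{module}.{cls.__qualname__}"
    return cls.__qualname__


def type_repr(cls):
    """The PEP's pseudo-code for type.__repr__(): only "builtins" is omitted."""
    if isinstance(cls.__module__, str) and cls.__module__ != "builtins":
        name = f"{cls.__module__}.{cls.__qualname__}"
    else:
        name = cls.__qualname__
    return f"<class '{name}'>"


def emulate_format(spec, arg):
    """Hypothetical Python analogue of the four proposed C format codes."""
    if spec == "%T":    # object -> fully qualified name of its type
        return fully_qualified_name(type(arg))
    if spec == "%#T":   # object -> short name of its type
        return type(arg).__name__
    if spec == "%N":    # type -> fully qualified name
        return fully_qualified_name(arg)
    if spec == "%#N":   # type -> short name
        return arg.__name__
    raise ValueError(f"unsupported format: {spec!r}")


# Stand-in for a class defined in a script run as "python script.py".
ScriptType = type("ScriptType", (), {"__module__": "__main__"})

delta = datetime.timedelta(days=1)
print(emulate_format("%T", delta))                # datetime.timedelta
print(emulate_format("%#T", delta))               # timedelta
print(emulate_format("%N", int))                  # int
print(emulate_format("%#N", datetime.timedelta))  # timedelta
print(fully_qualified_name(ScriptType))           # ScriptType
print(type_repr(ScriptType))                      # <class '__main__.ScriptType'>
```

The last two lines show the difference the commit introduces: the proposed name drops ``__main__``, while ``repr()`` of the type keeps it.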
@@ -209,11 +237,11 @@ can be replaced with the ``%T`` format:
 .. code-block:: c
 
     PyErr_Format(PyExc_TypeError,
-                 "__format__ must return a str, not %T",
-                 Py_TYPE(result));
+                 "__format__ must return a str, not %T", result);
 
 Advantages of the updated code:
 
+* Safer C code: avoid ``Py_TYPE()`` which returns a borrowed reference.
 * The ``PyTypeObject.tp_name`` member is no longer read explicitly: the
   code becomes compatible with the limited C API.
 * The ``PyTypeObject.tp_name`` bytes string no longer has to be decoded
@@ -221,6 +249,7 @@ Advantages of the updated code:
   ``type.__fully_qualified_name__`` is already a Unicode string.
 * The type name is no longer truncated.
 
+
 Add PyType_GetFullyQualifiedName() function
 -------------------------------------------
 
@@ -235,6 +264,18 @@ On success, return a new reference to the string. On error, raise an
 exception and return ``NULL``.
 
 
+Recommend using the type fully qualified name
+---------------------------------------------
+
+The type fully qualified name is recommended in error messages and in
+``__repr__()`` methods in new code.
+
+In non-trivial applications, it is likely to have two types with the
+same short name defined in two different modules, especially with
+generic names. Using the fully qualified name helps identify the type
+in an unambiguous way.
+
+
 Recommend not truncating type names
 -----------------------------------
 
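Review note: the ambiguity argument added in the hunk above can be seen with two standard library classes that share a short name. This is a minimal sketch; ``short_and_full()`` is a hypothetical helper, and it builds the long form with the plain f-string quoted in the PEP rather than with the proposed attribute.

```python
import asyncio
import queue


def short_and_full(cls):
    """Return (short name, module-qualified name) for a type."""
    return cls.__name__, f"{cls.__module__}.{cls.__qualname__}"


for cls in (queue.Queue, asyncio.Queue):
    short, full = short_and_full(cls)
    # The short name is "Queue" for both classes; only the qualified
    # name tells a reader which one an error message refers to.
    print(f"short={short!r} full={full!r}")
```

An error message such as "expected Queue" does not say which of the two classes is meant, whereas ``queue.Queue`` and ``asyncio.queues.Queue`` are unambiguous.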
@@ -242,6 +283,9 @@ Type names must not be truncated. For example, the ``%.100s`` format
 should be avoided: use the ``%s`` format instead (or ``%T`` and ``%#T``
 formats in C).
 
+Code in the standard library is updated to no longer truncate type
+names.
+
 
 Implementation
 ==============
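Review note: the effect of the ``%.100s`` precision that this section advises against can be reproduced with Python's own ``%`` formatting. The sketch below assumes an artificially long type name (120 characters) created only for illustration; such names are rare in practice.

```python
# Build a class whose name is longer than the 100-character limit.
LongName = type("A" * 120, (), {})

truncated = "expected str, not %.100s" % LongName.__name__
full = "expected str, not %s" % LongName.__name__

prefix_len = len("expected str, not ")
print(len(truncated) - prefix_len)  # 100: the last 20 characters are lost
print(len(full) - prefix_len)       # 120: the whole name is kept
```

With ``%s`` the message always carries the complete name, so two long names that differ only near the end stay distinguishable.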
@@ -253,8 +297,18 @@ Implementation
 Backwards Compatibility
 =======================
 
-Only new APIs are added. No existing API is modified. Changes are fully
-backward compatible.
+Changes proposed in this PEP are backward compatible.
+
+Adding new APIs has no effect on backward compatibility. Existing
+APIs are left unchanged.
+
+Replacing the type short name with the type fully qualified name is only
+recommended in new code. Existing code should be left
+unchanged and so remains backward compatible.
+
+In the standard library, type names are no longer truncated. We believe
+that no code should be affected in practice, since type names longer
+than 100 characters are rare.
 
 
 Rejected Ideas
@@ -332,13 +386,6 @@ can be formatted as ``f"{type.__module__}:{type.__qualname__}"``, or
 In the standard library, no code formats a type fully qualified name
 this way.
 
-It is already tricky to get a type from its qualified name. The type
-qualified name already uses the dot (``.``) separator between different
-parts: class name, ``<locals>``, nested class name, etc.
-
-The colon separator is not consistent with dot separator used in a
-module fully qualified name (``module.__name__``).
-
 
 Other ways to format type names in C
 ------------------------------------
@@ -378,35 +425,47 @@ modifier for ``ptrdiff_t`` argument.
 can be used in C to format a type qualified name.
 
 
-Omit Py_TYPE() with %T format: pass an object
------------------------------------------------
+Use %T format with Py_TYPE(): pass a type
+-----------------------------------------
 
-It was proposed to format a type name of an object, like:
+It was proposed to pass a type to the ``%T`` format, like:
 
 .. code-block:: c
 
-    PyErr_Format(PyExc_TypeError, "type name: %T", obj);
+    PyErr_Format(PyExc_TypeError, "object type name: %T", Py_TYPE(obj));
 
-The intent is to avoid ``Py_TYPE()`` which returns a borrowed reference
-to the type. Using a borrowed reference can cause a bug or crash if the
-type is finalized or deallocated while being used.
+The ``Py_TYPE()`` function returns a borrowed reference. Just to format
+an error, using a borrowed reference to a type looks safe. In practice,
+it can lead to a crash. Example::
 
-In practice, it's unlikely that a type is finalized while the error
-message is formatted. Instances of static types cannot have their type
-deallocated: static types are never deallocated. Since Python 3.8,
-instances of heap types hold a strong reference to their type (in
-``PyObject.ob_type``) and it's safe to make the assumption that the code
-holds a strong reference to the formatted object, so the object type
-cannot be deallocated.
+    import gc
+    import my_cext
 
-In short, it is safe to use a ``Py_TYPE(obj)`` borrowed reference while
-formatting an error message.
+    class ClassA:
+        pass
 
-If the ``%T`` format expects an instance, formatting a type cannot use
-the ``%T`` format, whereas it's a common operation in stdlib C
-extensions. The ``%T`` format would only cover half of cases (only
-instances). If the ``%T`` format takes a type, all cases are covered
-(types, and instances using ``Py_TYPE()``).
+    def create_object():
+        class ClassB:
+            def __repr__(self):
+                self.__class__ = ClassA
+                gc.collect()
+                return "ClassB repr"
+        return ClassB()
+
+    obj = create_object()
+    my_cext.func(obj)
+
+where ``my_cext.func()`` is a C function which calls::
+
+    PyErr_Format(PyExc_ValueError,
+                 "Unexpected value %R of type %T",
+                 obj, Py_TYPE(obj));
+
+``PyErr_Format()`` is called with a borrowed reference to ``ClassB``.
+When ``repr(obj)`` is called by the ``%R`` format, the last reference to
+``ClassB`` is removed and the class is deallocated. When the ``%T``
+format is processed, ``Py_TYPE(obj)`` is already a dangling pointer and
+Python crashes.
 
 
 Other proposed APIs to get a type fully qualified name
@@ -423,26 +482,41 @@ Other proposed APIs to get a type fully qualified name
 ``inspect`` module to use it.
 
 
-Omit __main__ module in the type fully qualified name
------------------------------------------------------
+Include the __main__ module in the type fully qualified name
+------------------------------------------------------------
 
-The ``pdb`` module formats type fully qualified names in a similar way
-as the proposed ``type.__fully_qualified_name__``, but it omits the module
-if the module is equal to ``"__main__"``.
+Format ``type.__fully_qualified_name__`` as
+``f"{type.__module__}.{type.__qualname__}"``, or ``type.__qualname__`` if
+``type.__module__`` is not a string or is equal to ``"builtins"``. Do
+not treat the ``__main__`` module differently: include it in the name.
 
-The ``unittest`` module and a lot of existing stdlib code format type
-fully qualified names the same way as the proposed
-``type.__fully_qualified_name__``: they only omit the module if the module
-is equal to ``"builtins"``.
+Existing code such as ``type.__repr__()``, ``collections.abc`` and
+``unittest`` modules format a type name with
+``f'{obj.__module__}.{obj.__qualname__}'`` and only omit the module part
+if the module is equal to ``builtins``. Only the ``traceback`` and
+``pdb`` modules also omit the module if it's equal to ``"builtins"`` or
+``"__main__"``.
 
-It's possible to omit the ``"__main__."`` prefix of the ``__main__``
-module with::
+The ``type.__fully_qualified_name__`` attribute omits the ``__main__``
+module to produce shorter names for a common case: types defined in a
+script run with ``python script.py``. For debugging, the ``repr()``
+function can be used on a type: it includes the ``__main__`` module in
+the type name. Or use ``f"{type.__module__}.{type.__qualname__}"``
+format to always include the module name, even for the ``"builtins"``
+module.
 
-    def format_type(cls):
-        if cls.__module__ != "__main__":
-            return cls.__fully_qualified_name__
-        else:
-            return cls.__qualname__
+Example of script::
+
+    class MyType:
+        pass
+
+    print(f"name: {MyType.__fully_qualified_name__}")
+    print(f"repr: {repr(MyType)}")
+
+Output::
+
+    name: MyType
+    repr: <class '__main__.MyType'>
 
 
 Discussions