python-peps/peps/pep-3123.rst

160 lines
4.5 KiB
ReStructuredText
Raw Blame History

This file contains invisible Unicode characters

This file contains invisible Unicode characters that are indistinguishable to humans but may be processed differently by a computer. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.

PEP: 3123
Title: Making PyObject_HEAD conform to standard C
Version: $Revision$
Last-Modified: $Date$
Author: Martin von Löwis <martin@v.loewis.de>
Status: Final
Type: Standards Track
Content-Type: text/x-rst
Created: 27-Apr-2007
Python-Version: 3.0
Post-History:
Abstract
========
Python currently relies on undefined C behavior, with its
usage of ``PyObject_HEAD``. This PEP proposes to change that
into standard C.
Rationale
=========
Standard C defines that an object must be accessed only through a
pointer of its type, and that all other accesses are undefined
behavior, with a few exceptions. In particular, the following
code has undefined behavior::
struct FooObject{
PyObject_HEAD
int data;
};
PyObject *foo(struct FooObject*f){
return (PyObject*)f;
}
int bar(){
struct FooObject *f = malloc(sizeof(struct FooObject));
struct PyObject *o = foo(f);
f->ob_refcnt = 0;
o->ob_refcnt = 1;
return f->ob_refcnt;
}
The problem here is that the storage is both accessed as
if it where struct ``PyObject``, and as struct ``FooObject``.
Historically, compilers did not have any problems with this
code. However, modern compilers use that clause as an
optimization opportunity, finding that ``f->ob_refcnt`` and
``o->ob_refcnt`` cannot possibly refer to the same memory, and
that therefore the function should return 0, without having
to fetch the value of ob_refcnt at all in the return
statement. For GCC, Python now uses ``-fno-strict-aliasing``
to work around that problem; with other compilers, it
may just see undefined behavior. Even with GCC, using
``-fno-strict-aliasing`` may pessimize the generated code
unnecessarily.
Specification
=============
Standard C has one specific exception to its aliasing rules precisely
designed to support the case of Python: a value of a struct type may
also be accessed through a pointer to the first field. E.g. if a
struct starts with an ``int``, the ``struct *`` may also be cast to
an ``int *``, allowing to write int values into the first field.
For Python, ``PyObject_HEAD`` and ``PyObject_VAR_HEAD`` will be changed
to not list all fields anymore, but list a single field of type
``PyObject``/``PyVarObject``::
typedef struct _object {
_PyObject_HEAD_EXTRA
Py_ssize_t ob_refcnt;
struct _typeobject *ob_type;
} PyObject;
typedef struct {
PyObject ob_base;
Py_ssize_t ob_size;
} PyVarObject;
#define PyObject_HEAD PyObject ob_base;
#define PyObject_VAR_HEAD PyVarObject ob_base;
Types defined as fixed-size structure will then include PyObject
as its first field, PyVarObject for variable-sized objects. E.g.::
typedef struct {
PyObject ob_base;
PyObject *start, *stop, *step;
} PySliceObject;
typedef struct {
PyVarObject ob_base;
PyObject **ob_item;
Py_ssize_t allocated;
} PyListObject;
The above definitions of ``PyObject_HEAD`` are normative, so extension
authors MAY either use the macro, or put the ``ob_base`` field explicitly
into their structs.
As a convention, the base field SHOULD be called ob_base. However, all
accesses to ob_refcnt and ob_type MUST cast the object pointer to
PyObject* (unless the pointer is already known to have that type), and
SHOULD use the respective accessor macros. To simplify access to
ob_type, ob_refcnt, and ob_size, macros::
#define Py_TYPE(o) (((PyObject*)(o))->ob_type)
#define Py_REFCNT(o) (((PyObject*)(o))->ob_refcnt)
#define Py_SIZE(o) (((PyVarObject*)(o))->ob_size)
are added. E.g. the code blocks ::
#define PyList_CheckExact(op) ((op)->ob_type == &PyList_Type)
return func->ob_type->tp_name;
needs to be changed to::
#define PyList_CheckExact(op) (Py_TYPE(op) == &PyList_Type)
return Py_TYPE(func)->tp_name;
For initialization of type objects, the current sequence ::
PyObject_HEAD_INIT(NULL)
0, /* ob_size */
becomes incorrect, and must be replaced with ::
PyVarObject_HEAD_INIT(NULL, 0)
Compatibility with Python 2.6
=============================
To support modules that compile with both Python 2.6 and Python 3.0,
the ``Py_*`` macros are added to Python 2.6. The macros ``Py_INCREF``
and ``Py_DECREF`` will be changed to cast their argument to ``PyObject *``,
so that module authors can also explicitly declare the ``ob_base``
field in modules designed for Python 2.6.
Copyright
=========
This document has been placed in the public domain.
..
Local Variables:
mode: indented-text
indent-tabs-mode: nil
sentence-end-double-space: t
fill-column: 70
coding: utf-8
End: