PEP: 298
Title: The Locked Buffer Interface
Version: $Revision$
Last-Modified: $Date$
Author: Thomas Heller <theller@python.net>
Status: Withdrawn
Type: Standards Track
Content-Type: text/x-rst
Created: 26-Jul-2002
Python-Version: 2.3
Post-History: 30-Jul-2002, 01-Aug-2002


Abstract
========

This PEP proposes an extension to the buffer interface called the
'locked buffer interface'.

The locked buffer interface avoids the flaws of the 'old' buffer
interface [1]_ as defined in Python versions up to and including
2.2, and has the following semantics:

- The lifetime of the retrieved pointer is clearly defined and
  controlled by the client.

- The buffer size is returned as a 'size_t' data type, which
  allows access to large buffers on platforms where ``sizeof(int)
  != sizeof(void *)``.

(Guido comments: This second sounds like a change we could also
make to the "old" buffer interface, if we introduce another flag
bit that's *not* part of the default flags.)
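
The size issue only arises where ``int`` is narrower than a pointer.
As a rough illustration (the helper name ``length_fits_in_int`` is
hypothetical and not part of this proposal), a block length that an
``int`` return value cannot report could be detected like this::

    #include <limits.h>
    #include <stddef.h>

    /* Hypothetical helper: returns 1 if a buffer length can be
       reported through an int return value, 0 otherwise. */
    static int length_fits_in_int(size_t len)
    {
        return len <= (size_t)INT_MAX;
    }

A length of ``(size_t)INT_MAX + 1`` fails this check, yet is
representable as a ``size_t``.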


Specification
=============

The locked buffer interface exposes new functions which return the
size and the pointer to the internal memory block of any Python
object which chooses to implement this interface.

Retrieving a buffer from an object puts this object in a locked
state during which the buffer may not be freed, resized, or
reallocated.

When the buffer is no longer in use, the object must be unlocked
again by releasing the buffer through another function in the
locked buffer interface.  If the object never resizes or
reallocates the buffer during its lifetime, this function may be
NULL.  Failure to call this function (if it is not NULL) is a
programming error and may have unexpected results.
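
The locked state described above can be pictured with a small
stand-alone model.  This is only a sketch: ``ToyArray`` and the
``toy_*`` functions are hypothetical names that stand in for a real
Python object and its buffer slots, and a plain counter stands in
for whatever locking an implementation would choose::

    #include <stddef.h>
    #include <stdlib.h>

    /* Hypothetical object that owns a resizable memory block. */
    typedef struct {
        char  *data;    /* internal memory block */
        size_t size;    /* size of the block in bytes */
        int    locks;   /* number of buffers currently acquired */
    } ToyArray;

    /* Returns 0 on success, -1 if the allocation fails. */
    static int toy_init(ToyArray *a, size_t size)
    {
        a->data = malloc(size ? size : 1);
        if (a->data == NULL)
            return -1;
        a->size = size;
        a->locks = 0;
        return 0;
    }

    /* Acquire: hand out pointer and size, enter the locked state. */
    static size_t toy_acquire(ToyArray *a, void **ptr)
    {
        a->locks++;
        *ptr = a->data;
        return a->size;
    }

    /* Release: must be called once per successful acquire. */
    static void toy_release(ToyArray *a)
    {
        a->locks--;
    }

    /* Resizing is refused while locked, so pointers stay valid. */
    static int toy_resize(ToyArray *a, size_t new_size)
    {
        char *p;

        if (a->locks > 0)
            return -1;
        p = realloc(a->data, new_size ? new_size : 1);
        if (p == NULL)
            return -1;
        a->data = p;
        a->size = new_size;
        return 0;
    }

    static void toy_free(ToyArray *a)
    {
        free(a->data);
        a->data = NULL;
        a->size = 0;
    }

In this model a ``toy_resize()`` call between ``toy_acquire()`` and
``toy_release()`` returns -1, and succeeds again once the buffer has
been released.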

The locked buffer interface omits the memory segment model which
is present in the old buffer interface - only a single memory
block can be exposed.

The memory blocks can be accessed without holding the global
interpreter lock.


Implementation
==============

Define a new flag in Include/object.h::

    /* PyBufferProcs contains bf_acquirelockedreadbuffer,
       bf_acquirelockedwritebuffer, and bf_releaselockedbuffer */
    #define Py_TPFLAGS_HAVE_LOCKEDBUFFER (1L<<15)


This flag would be included in ``Py_TPFLAGS_DEFAULT``::

    #define Py_TPFLAGS_DEFAULT  ( \
                        ....
                        Py_TPFLAGS_HAVE_LOCKEDBUFFER | \
                        ....
                        0)


Extend the ``PyBufferProcs`` structure by new fields in
Include/object.h::

    typedef size_t (*acquirelockedreadbufferproc)(PyObject *,
                                                  const void **);
    typedef size_t (*acquirelockedwritebufferproc)(PyObject *,
                                                   void **);
    typedef void (*releaselockedbufferproc)(PyObject *);

    typedef struct {
        getreadbufferproc bf_getreadbuffer;
        getwritebufferproc bf_getwritebuffer;
        getsegcountproc bf_getsegcount;
        getcharbufferproc bf_getcharbuffer;
        /* locked buffer interface functions */
        acquirelockedreadbufferproc bf_acquirelockedreadbuffer;
        acquirelockedwritebufferproc bf_acquirelockedwritebuffer;
        releaselockedbufferproc bf_releaselockedbuffer;
    } PyBufferProcs;


The new fields are present if the ``Py_TPFLAGS_HAVE_LOCKEDBUFFER``
flag is set in the object's type.

The ``Py_TPFLAGS_HAVE_LOCKEDBUFFER`` flag implies the
``Py_TPFLAGS_HAVE_GETCHARBUFFER`` flag.

The ``acquirelockedreadbufferproc`` and ``acquirelockedwritebufferproc``
functions return the size in bytes of the memory block and fill in
the passed void \* pointer on success.  If these functions fail -
either because an error occurs or no memory block is exposed - they
must set the void \* pointer to NULL and raise an exception.  The
return value is undefined in these cases and should not be used.
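
The failure convention can be sketched as follows.  The example is
a stand-alone model, not code from the reference implementation:
``ToyMap``, ``toy_error`` and ``toymap_acquire_read`` are hypothetical
names, ``ToyMap`` replaces ``PyObject``, and a global string replaces
the interpreter's exception state::

    #include <stddef.h>

    /* Hypothetical mmap-like object that may already be closed. */
    typedef struct {
        const char *data;   /* NULL once the mapping is closed */
        size_t      size;
        int         locks;
    } ToyMap;

    /* Assumption: stands in for "an exception has been raised". */
    static const char *toy_error = NULL;

    /* Shaped like acquirelockedreadbufferproc. */
    static size_t toymap_acquire_read(ToyMap *m, const void **ptr)
    {
        if (m->data == NULL) {
            *ptr = NULL;                      /* pointer must be NULL */
            toy_error = "mapping is closed";  /* "raise" an exception */
            return 0;                         /* value is undefined */
        }
        m->locks++;
        *ptr = m->data;
        return m->size;
    }

A caller of such a function would test the pointer for NULL rather
than inspect the returned size.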

If calls to these functions succeed, the buffer must eventually be
released by a call to the ``releaselockedbufferproc``, supplying the
original object as argument.  The ``releaselockedbufferproc`` cannot
fail.  For objects that actually maintain an internal lock count,
it is a fatal error if the ``releaselockedbufferproc`` function
is called too often, leading to a negative lock count.

Similar to the 'old' buffer interface, any of these functions may
be set to NULL, but it is strongly recommended to implement the
``releaselockedbufferproc`` function (even if it does nothing) if
either of the ``acquirelockedreadbufferproc`` or
``acquirelockedwritebufferproc`` functions is implemented, to
discourage extension writers from checking for a NULL value and
not calling it.

These functions aren't supposed to be called directly; they are
called through convenience functions declared in
Include/abstract.h::

    int PyObject_AcquireLockedReadBuffer(PyObject *obj,
                                        const void **buffer,
                                        size_t *buffer_len);

    int PyObject_AcquireLockedWriteBuffer(PyObject *obj,
                                          void **buffer,
                                          size_t *buffer_len);

    void PyObject_ReleaseLockedBuffer(PyObject *obj);

On success, the former two functions return 0 and set ``buffer`` to
the memory location and ``buffer_len`` to the length of the memory
block in bytes.  On failure, or if the locked buffer interface is
not implemented by ``obj``, they return -1 and set an exception.

The latter function doesn't return anything, and cannot fail.
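
How such a convenience function might dispatch to the slots can be
sketched with a stand-alone model.  Everything prefixed ``Toy`` or
``toy_`` is hypothetical: ``ToyObject`` merges the roles of the
object and its type, ``flags`` plays the role of ``tp_flags``, and
no exception is set because the model has no interpreter::

    #include <stddef.h>

    #define TOY_HAVE_LOCKEDBUFFER (1L << 15)  /* mirrors the flag */

    struct ToyObject;

    typedef size_t (*toy_acquirereadproc)(struct ToyObject *,
                                          const void **);
    typedef void (*toy_releaseproc)(struct ToyObject *);

    typedef struct {
        toy_acquirereadproc acquire_read;   /* may be NULL */
        toy_releaseproc     release;        /* may be NULL */
    } ToyBufferProcs;

    typedef struct ToyObject {
        long                  flags;   /* role of tp_flags */
        const ToyBufferProcs *procs;   /* role of tp_as_buffer */
        const char           *data;
        size_t                size;
    } ToyObject;

    /* Returns 0 and fills buffer/buffer_len, or -1 on failure. */
    static int toy_AcquireLockedReadBuffer(ToyObject *obj,
                                           const void **buffer,
                                           size_t *buffer_len)
    {
        const void *p = NULL;
        size_t n;

        /* The new fields may only be read if the flag is set. */
        if (!(obj->flags & TOY_HAVE_LOCKEDBUFFER)
            || obj->procs == NULL
            || obj->procs->acquire_read == NULL)
            return -1;
        n = obj->procs->acquire_read(obj, &p);
        if (p == NULL)
            return -1;              /* the slot reported failure */
        *buffer = p;
        *buffer_len = n;
        return 0;
    }

    /* Cannot fail; a NULL release slot means nothing to unlock. */
    static void toy_ReleaseLockedBuffer(ToyObject *obj)
    {
        if ((obj->flags & TOY_HAVE_LOCKEDBUFFER)
            && obj->procs != NULL
            && obj->procs->release != NULL)
            obj->procs->release(obj);
    }

    /* A string-like object: immutable, so no release slot. */
    static size_t str_acquire_read(ToyObject *obj, const void **ptr)
    {
        *ptr = obj->data;
        return obj->size;
    }

    static const ToyBufferProcs str_procs = { str_acquire_read, NULL };

Because the wrapper hides the NULL check on the release slot, client
code can pair every successful acquire with a release call without
inspecting the slots itself.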


Backward Compatibility
======================

The size of the ``PyBufferProcs`` structure changes if this proposal
is implemented, but the type's ``tp_flags`` slot can be used to
determine if the additional fields are present.


Reference Implementation
========================

An implementation has been uploaded to the SourceForge patch
manager as https://bugs.python.org/issue652857.


Additional Notes/Comments
=========================

Python strings, unicode strings, mmap objects, and array objects
would expose the locked buffer interface.

mmap and array objects would actually enter a locked state while
the buffer is active; this is not needed for strings and unicode
objects.  Resizing locked array objects is not allowed and will
raise an exception.  Whether closing a locked mmap object is an
error or will only be deferred until the lock count reaches zero
is an implementation detail.

Guido recommends:

    But I'm still very concerned that if most built-in types
    (e.g. strings, bytes) don't implement the release
    functionality, it's too easy for an extension to seem to work
    while forgetting to release the buffer.

    I recommend that at least some built-in types implement the
    acquire/release functionality with a counter, and assert that
    the counter is zero when the object is deleted -- if the
    assert fails, someone DECREF'ed their reference to the object
    without releasing it.  (The rule should be that you must own a
    reference to the object while you've acquired the object.)

    For strings that might be impractical because the string
    object would have to grow 4 bytes to hold the counter; but the
    new bytes object (:pep:`296`) could easily implement the counter,
    and the array object too -- that way there will be plenty of
    opportunity to test proper use of the protocol.
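
The counter-based checking recommended above might look roughly like
the following stand-alone model.  ``ToyBytes`` and the ``toybytes_*``
functions are hypothetical names; the functions return status codes
where a real implementation would more likely assert or report a
fatal error::

    /* Hypothetical object carrying the lock counter. */
    typedef struct {
        int locks;   /* buffers acquired and not yet released */
    } ToyBytes;

    static void toybytes_acquire(ToyBytes *b)
    {
        b->locks++;
    }

    /* Returns -1 when released too often instead of letting the
       counter go negative. */
    static int toybytes_release(ToyBytes *b)
    {
        if (b->locks == 0)
            return -1;
        b->locks--;
        return 0;
    }

    /* The condition a deallocator would assert: nobody dropped
       their reference while still holding the buffer. */
    static int toybytes_safe_to_delete(const ToyBytes *b)
    {
        return b->locks == 0;
    }

With such a counter in place, an extension that forgets to release
the buffer is caught when the object is deleted, instead of only
appearing to work.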


Community Feedback
==================

Greg Ewing doubts the locked buffer interface is needed at all; he
thinks the normal buffer interface could be used if the pointer is
(re)fetched each time it's used.  This seems to be dangerous,
because even innocent-looking calls to the Python API like
``Py_DECREF()`` may trigger execution of arbitrary Python code.

The first version of this proposal didn't have the release
function, but it turned out that this would have been too
restrictive: mmap and array objects wouldn't have been able to
implement it, because mmap objects can be closed anytime if not
locked, and array objects could resize or reallocate the buffer.

This PEP will probably be rejected because nobody except the
author needs it.


References
==========

.. [1] The buffer interface
       https://mail.python.org/pipermail/python-dev/2000-October/009974.html


Copyright
=========

This document has been placed in the public domain.


..
  Local Variables:
  mode: indented-text
  indent-tabs-mode: nil
  sentence-end-double-space: t
  fill-column: 70
  End: