PEP 574: Rephrase and clarify the "open questions" (#752)

Antoine Pitrou 2018-08-04 06:52:39 +02:00 committed by Nick Coghlan
parent fd91a80aaf
commit 4df40cca06
1 changed file with 25 additions and 15 deletions


@ -420,26 +420,36 @@ This mechanism has two drawbacks:
such as ints and strings) triggers a call to the user's ``persistent_id()``
method, leading to a possible performance drop compared to nominal.
Passing a sequence of buffers in ``buffer_callback``
----------------------------------------------------
Open questions
==============
By passing a sequence of buffers, rather than a single buffer, we would
potentially save on function call overhead in case a large number
of buffers are produced during serialization. This would need
additional support in the Pickler to save buffers before calling the
callback. However, it would also prevent the buffer callback from returning
a boolean to indicate whether a buffer is to be serialized in-band or
out-of-band.
Should ``buffer_callback`` take a single buffer or a sequence of buffers?
We consider that having a large number of buffers to serialize is an
unlikely case, and decided to pass a single buffer to the buffer callback.
* Taking a single buffer would allow returning a boolean indicating whether
the given buffer should be serialized in-band or out-of-band.
* Taking a sequence of buffers is potentially more efficient by reducing
function call overhead.
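To make the single-buffer option concrete, here is a minimal sketch. It assumes the ``pickle`` API as available in Python 3.8 and later (``buffer_callback``, ``buffers``, ``PickleBuffer``), where a truthy return value keeps the buffer in-band. The 64-byte threshold and the names ``small``, ``large`` and ``out_of_band`` are illustrative assumptions, not part of the proposal.

```python
import pickle

small = b"tiny"
large = bytes(1024)
payload = [pickle.PickleBuffer(small), pickle.PickleBuffer(large)]

out_of_band = []  # buffers handed over outside the pickle stream


def buffer_callback(buf):
    # Hypothetical policy: keep small buffers inside the pickle stream.
    if memoryview(buf).nbytes < 64:
        return True   # truthy: serialize this buffer in-band
    out_of_band.append(buf)
    return False      # falsy: serialize this buffer out-of-band


data = pickle.dumps(payload, protocol=5, buffer_callback=buffer_callback)

# The consumer must supply the out-of-band buffers in the same order.
restored = pickle.loads(data, buffers=out_of_band)
```

With a callback that receives a whole sequence of buffers at once, this kind of per-buffer decision could not be expressed through the return value.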
Allow serializing a ``PickleBuffer`` in protocol 4 and earlier
--------------------------------------------------------------
Should it be allowed to serialize a ``PickleBuffer`` in protocol 4 and earlier?
It would simply be serialized as a ``bytes`` object (if read-only) or
``bytearray`` (if writable).
If we were to allow serializing a ``PickleBuffer`` in protocols 4 and earlier,
it would actually make a supplementary memory copy when the buffer is mutable.
Indeed, a mutable ``PickleBuffer`` would serialize as a bytearray object
in those protocols (that is a first copy), and serializing the bytearray
object would call ``bytearray.__reduce_ex__`` which returns a bytes object
(that is a second copy).
* It can make implementing ``__reduce__`` simpler.
* Serializing a ``bytearray`` in protocol 4 makes a supplementary memory
copy when ``bytearray.__reduce_ex__`` returns a ``bytes`` object. This
is a performance regression that may be overlooked by ``__reduce__``
implementors.
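The extra copy mentioned in the last bullet can be observed directly. This is a small sketch against CPython's ``bytearray``; the variable names are illustrative.

```python
# Ask a bytearray how it wants to be reduced under protocol 4.
original = bytearray(b"some mutable payload")
reduced = original.__reduce_ex__(4)

constructor, args = reduced[0], reduced[1]

# The payload is handed back as a separate bytes object, i.e. a copy
# of the bytearray contents made before anything is written out.
payload_copy = args[0]
```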
To prevent ``__reduce__`` implementors from introducing involuntary
performance regressions, we decided to reject ``PickleBuffer`` when
the protocol is smaller than 5. This forces implementors to switch to
``__reduce_ex__`` and implement protocol-dependent serialization, taking
advantage of the best path for each protocol (or at least treat protocol
5 and upwards separately from protocols 4 and downwards).
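A protocol-dependent ``__reduce_ex__`` along these lines could look like the sketch below. The ``Frame`` class is a hypothetical example, and the code assumes the ``pickle`` API as available in Python 3.8 and later. The last part shows the rejection of a bare ``PickleBuffer`` under protocol 4.

```python
import pickle


class Frame:
    """Hypothetical wrapper around a writable byte payload."""

    def __init__(self, data):
        self.data = bytearray(data)

    def __reduce_ex__(self, protocol):
        if protocol >= 5:
            # Protocol 5 and up: expose the payload without copying it.
            return Frame, (pickle.PickleBuffer(self.data),)
        # Protocol 4 and down: fall back to a single bytes copy.
        return Frame, (bytes(self.data),)


frame = Frame(b"abc")
_, (arg_proto5,) = frame.__reduce_ex__(5)
_, (arg_proto4,) = frame.__reduce_ex__(4)

# A PickleBuffer itself is refused when the protocol is lower than 5.
rejected = False
try:
    pickle.dumps(pickle.PickleBuffer(b"abc"), protocol=4)
except pickle.PicklingError:
    rejected = True
```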
Implementation