PEP 574: Rephrase and clarify the "open questions" (#752)
parent
fd91a80aaf
commit
4df40cca06
40
pep-0574.rst
@@ -420,26 +420,36 @@ This mechanism has two drawbacks:
such as ints and strings) triggers a call to the user's ``persistent_id()``
method, leading to a possible performance drop compared to nominal.

Passing a sequence of buffers in ``buffer_callback``
----------------------------------------------------

Open questions
==============
By passing a sequence of buffers, rather than a single buffer, we would
potentially save on function call overhead in case a large number
of buffers are produced during serialization. This would need
additional support in the Pickler to save buffers before calling the
callback. However, it would also prevent the buffer callback from returning
a boolean to indicate whether a buffer is to be serialized in-band or
out-of-band.

Should ``buffer_callback`` take a single buffer or a sequence of buffers?
We consider that having a large number of buffers to serialize is an
unlikely case, and decided to pass a single buffer to the buffer callback.

* Taking a single buffer would allow returning a boolean indicating whether
  the given buffer should be serialized in-band or out-of-band.
* Taking a sequence of buffers is potentially more efficient by reducing
  function call overhead.
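As a rough illustration of the single-buffer design discussed above, the sketch below uses the ``pickle`` module as shipped in Python 3.8 and later, where the callback is invoked once per buffer and a true return value keeps that buffer in-band. The 16-byte threshold and the names ``out_of_band`` and ``keep_small_buffers_in_band`` are assumptions made for this example only.

```python
import pickle

# Buffers the callback chooses to ship separately from the pickle stream.
out_of_band = []


def keep_small_buffers_in_band(buf):
    """Called once per PickleBuffer; the return value picks the path."""
    # Hypothetical policy: tiny buffers are not worth a separate transfer.
    if memoryview(buf).nbytes < 16:
        return True  # true value: serialize this buffer in-band
    out_of_band.append(buf)
    return False  # false value: this buffer travels out-of-band


buffers = [
    pickle.PickleBuffer(bytearray(b"tiny")),
    pickle.PickleBuffer(bytearray(b"x" * 1024)),
]
payload = pickle.dumps(
    buffers, protocol=5, buffer_callback=keep_small_buffers_in_band
)

# The consumer must supply the out-of-band buffers in the same order.
restored = pickle.loads(payload, buffers=out_of_band)
```

A callback that received a whole sequence at once could not express this per-buffer choice through its return value, which is the trade-off the text describes.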
Allow serializing a ``PickleBuffer`` in protocol 4 and earlier
--------------------------------------------------------------

Should it be allowed to serialize a ``PickleBuffer`` in protocol 4 and earlier?
It would simply be serialized as a ``bytes`` object (if read-only) or
``bytearray`` (if writable).
If we were to allow serializing a ``PickleBuffer`` in protocols 4 and earlier,
it would actually make a supplementary memory copy when the buffer is mutable.
Indeed, a mutable ``PickleBuffer`` would serialize as a bytearray object
in those protocols (that is a first copy), and serializing the bytearray
object would call ``bytearray.__reduce_ex__`` which returns a bytes object
(that is a second copy).

* It can make implementing ``__reduce__`` simpler.
* Serializing a ``bytearray`` in protocol 4 makes a supplementary memory
  copy when ``bytearray.__reduce_ex__`` returns a ``bytes`` object. This
  is a performance regression that may be overlooked by ``__reduce__``
  implementors.
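The extra copy mentioned above can be observed directly in CPython. The following is a minimal sketch, assuming a non-empty ``bytearray`` and protocol 4; the variable names are illustrative.

```python
import pickle

data = bytearray(b"spam" * 4)

# For protocol 4, bytearray reduces to a callable plus a ``bytes`` argument,
# that is, an immutable copy of the mutable buffer's contents.
func, args = data.__reduce_ex__(4)[:2]
print(func, type(args[0]))

# The round trip rebuilds a new bytearray from that intermediate bytes copy.
restored = pickle.loads(pickle.dumps(data, protocol=4))
```

A mutable ``PickleBuffer`` routed through this path would pay for the intermediate ``bytes`` object in addition to the ``bytearray`` it was first converted to.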
To prevent ``__reduce__`` implementors from introducing involuntary
performance regressions, we decided to reject ``PickleBuffer`` when
the protocol is smaller than 5. This forces implementors to switch to
``__reduce_ex__`` and implement protocol-dependent serialization, taking
advantage of the best path for each protocol (or at least treat protocol
5 and upwards separately from protocols 4 and downwards).
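A protocol-dependent ``__reduce_ex__`` along these lines might look like the sketch below. The ``Payload`` class is hypothetical and exists only for illustration; the rejection of ``PickleBuffer`` below protocol 5 is shown as it behaves in the ``pickle`` module of Python 3.8 and later.

```python
import pickle
from pickle import PickleBuffer


class Payload:
    """Hypothetical container; name and layout are illustrative only."""

    def __init__(self, data):
        # Any object supporting the buffer protocol (bytes, bytearray, ...).
        self.data = data

    def __reduce_ex__(self, protocol):
        if protocol >= 5:
            # Protocol 5 and upwards: hand the pickler a view, no copy here.
            return Payload, (PickleBuffer(self.data),)
        # Protocols 4 and downwards: PickleBuffer is rejected, so the copy
        # is made explicitly and visibly by the implementor.
        return Payload, (bytes(self.data),)


obj = Payload(bytearray(b"abc" * 10))
_, (arg5,) = obj.__reduce_ex__(5)
_, (arg4,) = obj.__reduce_ex__(4)
print(type(arg5).__name__, type(arg4).__name__)

# A bare PickleBuffer is refused outright below protocol 5.
try:
    pickle.dumps(PickleBuffer(b"abc"), protocol=4)
except pickle.PicklingError as exc:
    rejected = True
    print("rejected:", exc)
else:
    rejected = False
```

Keeping the two branches explicit means the cost of the older protocols is written down in the class itself rather than hidden inside the pickler.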


Implementation