PEP 574: Rephrase and clarify the "open questions" (#752)

2018-08-04 06:52:39 +02:00 · 2018-08-04 06:52:39 +02:00 · 4df40cca06
parent fd91a80aaf
commit 4df40cca06
1 changed file with 25 additions and 15 deletions
--- a/pep-0574.rst
+++ b/pep-0574.rst
@@ -420,26 +420,36 @@ This mechanism has two drawbacks:
  such as ints and strings) triggers a call to the user's ``persistent_id()``
  method, leading to a possible performance drop compared to nominal.

+Passing a sequence of buffers in ``buffer_callback``
+----------------------------------------------------

-Open questions
-==============
+By passing a sequence of buffers, rather than a single buffer, we would
+potentially save on function call overhead in case a large number
+of buffers are produced during serialization.  This would need
+additional support in the Pickler to save buffers before calling the
+callback.  However, it would also prevent the buffer callback from returning
+a boolean to indicate whether a buffer is to be serialized in-band or
+out-of-band.

-Should ``buffer_callback`` take a single buffers or a sequence of buffers?
+We consider that having a large number of buffers to serialize is an
+unlikely case, and decided to pass a single buffer to the buffer callback.

-* Taking a single buffer would allow returning a boolean indicating whether
-  the given buffer should be serialized in-band or out-of-band.
-* Taking a sequence of buffers is potentially more efficient by reducing
-  function call overhead.
+Allow serializing a ``PickleBuffer`` in protocol 4 and earlier
+--------------------------------------------------------------

-Should it be allowed to serialize a ``PickleBuffer`` in protocol 4 and earlier?
-It would simply be serialized as a ``bytes`` object (if read-only) or
-``bytearray`` (if writable).
+If we were to allow serializing a ``PickleBuffer`` in protocols 4 and earlier,
+it would actually make a supplementary memory copy when the buffer is mutable.
+Indeed, a mutable ``PickleBuffer`` would serialize as a bytearray object
+in those protocols (that is a first copy), and serializing the bytearray
+object would call ``bytearray.__reduce_ex__`` which returns a bytes object
+(that is a second copy).

-* It can make implementing ``__reduce__`` simpler.
-* Serializing a ``bytearray`` in protocol 4 makes a supplementary memory
-  copy when ``bytearray.__reduce_ex__`` returns a ``bytes`` object.  This
-  is a performance regression that may be overlooked by ``__reduce__``
-  implementors.
+To prevent ``__reduce__`` implementors from introducing involuntary
+performance regressions, we decided to reject ``PickleBuffer`` when
+the protocol is smaller than 5.  This forces implementors to switch to
+``__reduce_ex__`` and implement protocol-dependent serialization, taking
+advantage of the best path for each protocol (or at least treat protocol
+5 and upwards separately from protocols 4 and downwards).


 Implementation
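
Note on the hunk above: the two decisions it records can be illustrated with a minimal sketch, assuming the ``pickle`` API as it later shipped in Python 3.8 (``pickle.PickleBuffer``, protocol 5, ``buffer_callback``). The class ``Frame`` and the helpers ``dumps_out_of_band`` and ``rejects_pickle_buffer`` are hypothetical names introduced only for this example; they do not appear in the PEP. The sketch shows a protocol-dependent ``__reduce_ex__``, a callback that receives one buffer per call, and the rejection of ``PickleBuffer`` below protocol 5.

```python
import pickle
from pickle import PickleBuffer


class Frame:
    """Hypothetical container owning a mutable byte buffer."""

    def __init__(self, data):
        # Copies whatever bytes-like object it is given.
        self.data = bytearray(data)

    def __reduce_ex__(self, protocol):
        if protocol >= 5:
            # Protocol 5 and later: expose the buffer without copying it.
            return Frame, (PickleBuffer(self.data),)
        # Protocol 4 and earlier: a single explicit copy into bytes,
        # instead of the double copy a bytearray would incur.
        return Frame, (bytes(self.data),)


def dumps_out_of_band(obj):
    """Pickle with protocol 5, collecting buffers one call at a time."""
    buffers = []
    # list.append returns None, a false value, so (in the Python 3.8 API)
    # each buffer handed to the callback is kept out-of-band.
    payload = pickle.dumps(obj, protocol=5, buffer_callback=buffers.append)
    return payload, buffers


def rejects_pickle_buffer(protocol):
    """Return True if pickling a bare PickleBuffer fails at this protocol."""
    try:
        pickle.dumps(PickleBuffer(bytearray(b"abc")), protocol=protocol)
    except pickle.PicklingError:
        return True
    return False
```

Because the callback is invoked once per buffer, its return value can decide in-band versus out-of-band placement for that buffer alone, which is the property the first decision preserves. The consumer passes the collected buffers back with ``pickle.loads(payload, buffers=buffers)``.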