Add a FRAME opcode for framing, and document Alexandre's new MEMOIZE opcode
parent 5a6cb6b60f
commit 3bafcf54c5
26
pep-3154.txt
@@ -58,15 +58,19 @@ Protocol 4, by contrast, features binary framing. The general structure
 of a pickle is thus the following::
 
    +------+------+
    | 0x80 | 0x04 |     protocol header (2 bytes)
+   +------+------+
+   |  OP  |            FRAME opcode (1 byte)
    +------+------+-----------+
    | MM MM MM MM MM MM MM MM | frame size (8 bytes, little-endian)
    +------+------------------+
    | .... |            first frame contents (M bytes)
+   +------+
+   |  OP  |            FRAME opcode (1 byte)
    +------+------+-----------+
    | NN NN NN NN NN NN NN NN | frame size (8 bytes, little-endian)
    +------+------------------+
    | .... |            second frame contents (N bytes)
    +------+
      etc.
 
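The frame layout in the hunk above can be sketched in a few lines of Python. This is only an illustration of the byte layout, not the pickle module's implementation. The helper names ``write_frames`` and ``read_frames`` are hypothetical, and the value chosen for ``FRAME_OP`` is an assumption, since the diagram only labels the byte as "OP".

```python
import struct

PROTO_HEADER = b"\x80\x04"  # protocol header (2 bytes), as in the diagram
FRAME_OP = b"\x95"          # assumption: the diagram only calls this byte "OP"


def write_frames(chunks):
    """Serialize each chunk as FRAME opcode + 8-byte LE size + contents."""
    out = bytearray(PROTO_HEADER)
    for chunk in chunks:
        out += FRAME_OP
        out += struct.pack("<Q", len(chunk))  # 8 bytes, little-endian
        out += chunk
    return bytes(out)


def read_frames(data):
    """Split a framed stream back into the contents of each frame."""
    if data[:2] != PROTO_HEADER:
        raise ValueError("missing protocol 4 header")
    pos = 2
    frames = []
    while pos < len(data):
        if data[pos:pos + 1] != FRAME_OP:
            raise ValueError("expected FRAME opcode at offset %d" % pos)
        (size,) = struct.unpack_from("<Q", data, pos + 1)
        start = pos + 1 + 8
        end = start + size
        if end > len(data):
            raise ValueError("truncated frame at offset %d" % pos)
        frames.append(data[start:end])
        pos = end
    return frames


stream = write_frames([b"abc", b"", b"hello world"])
frames = read_frames(stream)
```

Because every frame announces its size up front, a reader can fetch a whole frame with a single read instead of consuming the stream opcode by opcode.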
@@ -142,6 +146,16 @@ Short str objects currently have their length coded as a 4-bytes
 integer, which is wasteful.  A specific opcode with a 1-byte length
 would make many pickles smaller.
 
+Smaller memoization
+-------------------
+
+The PUT opcodes all require an explicit index to select in which entry
+of the memo dictionary the top-of-stack is memoized.  However, in practice
+those numbers are allocated in sequential order.  A new opcode, MEMOIZE,
+will instead store the top-of-stack at the index equal to the current
+size of the memo dictionary.  This allows for shorter pickles, since PUT
+opcodes are emitted for all non-atomic datatypes.
+
 
 Summary of new opcodes
 ======================
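The difference between PUT and MEMOIZE described in the hunk above can be modelled with a plain dictionary. This is a minimal sketch rather than the unpickler's actual code. The functions ``put`` and ``memoize`` are hypothetical stand-ins for the opcode handlers, and the stack is an ordinary list.

```python
def put(memo, stack, index):
    """PUT-style: the memo index travels explicitly in the pickle stream."""
    memo[index] = stack[-1]


def memoize(memo, stack):
    """MEMOIZE-style: the index is implied by the memo's current size."""
    memo[len(memo)] = stack[-1]


explicit_memo = {}
implicit_memo = {}
stack = []

# Memoize three non-atomic objects in the order a pickler would emit them.
for index, obj in enumerate([["a"], {"b": 1}, ("c",)]):
    stack.append(obj)
    put(explicit_memo, stack, index)  # index must be encoded after the opcode
    memoize(implicit_memo, stack)     # no operand needed
```

As long as indices are handed out sequentially, both memos end up identical, so the explicit operand carries no information and dropping it saves bytes on every memoized object.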
@@ -149,6 +163,9 @@ Summary of new opcodes
 These reflect the state of the proposed implementation (thanks mostly
 to Alexandre Vassalotti's work):
 
+* ``FRAME``: introduce a new frame (followed by the 8-byte frame size
+  and the frame contents).
+
 * ``SHORT_BINUNICODE``: push a utf8-encoded str object with a one-byte
   size prefix (therefore less than 256 bytes long).
 
@@ -178,6 +195,9 @@ to Alexandre Vassalotti's work):
   ``qualname``, and push the result of looking up the dotted ``qualname``
   in the module named ``module_name``.
 
+* ``MEMOIZE``: store the top-of-stack object in the memo dictionary with
+  an index equal to the current size of the memo dictionary.
+
 
 Alternative ideas
 =================