Extend intro; add motivation and a section on protocol versions.
parent f739e21311
commit 584036ddd6
63
pep-0307.txt
@@ -15,9 +15,68 @@ Introduction
 Pickling new-style objects in Python 2.2 is done somewhat clumsily
 and causes pickle size to bloat compared to classic class
 instances. This PEP documents a new pickle protocol that takes
 care of this and many other pickling issues.
 
-(XXX The rest of this PEP is TBD.)
+There are two sides to specifying a new pickle protocol: the byte
+stream constituting pickled data must be specified, and the
+interface between objects and the pickling and unpickling engines
+must be specified. This PEP focuses on API issues, although it
+may occasionally touch on byte stream format details to motivate a
+choice. The pickle byte stream format is documented formally by
+the standard library module pickletools.py (already checked into
+CVS for Python 2.3).
+
+
+Motivation
+
+Pickling new-style objects causes serious pickle bloat. For
+example, the binary pickle for a classic object with one instance
+variable takes up 33 bytes; a new-style object with one instance
+variable takes up 86 bytes. This was measured as follows:
+
+    class C(object): # Omit "(object)" for classic class
+        pass
+    x = C()
+    x.foo = 42
+    print len(pickle.dumps(x, 1))
+
+The bloat has several causes, but most of it stems from the fact
+that new-style objects use __reduce__ in order to be picklable at
+all. After ample consideration we've concluded that the only way
+to reduce pickle sizes for new-style objects is to add new opcodes
+to the pickle protocol. The net result is that with the new
+protocol, the pickle size in the above example is 35 bytes (two
+extra bytes are used at the start to indicate the protocol
+version, although this isn't strictly necessary).
+
+
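The measurement above can be repeated on a current interpreter. The following is a minimal sketch, assuming Python 3, where every class is new-style and there is no classic class to compare against. The byte counts it prints will not match the 33/86/35 figures quoted in the PEP text, so none are given here; the point is only that protocol 2 yields a smaller pickle than protocol 1 for the same instance.

```python
import pickle


class C:  # assumption: Python 3, where every class is new-style
    pass


x = C()
x.foo = 42

# Pickle the same instance under each of the three protocols discussed here.
sizes = {proto: len(pickle.dumps(x, proto)) for proto in (0, 1, 2)}

for proto, size in sizes.items():
    print(f"protocol {proto}: {size} bytes")
```

Protocols 0 and 1 go through the generic __reduce__ machinery for such an instance, while protocol 2 can use its dedicated opcodes, which is where the saving comes from.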
+Protocol versions
+
+Previously, pickling (but not unpickling) distinguished between
+text mode and binary mode. By design, text mode is a subset of
+binary mode, and unpicklers don't need to know in advance whether
+an incoming pickle uses text mode or binary mode. The virtual
+machine used for unpickling is the same regardless of the mode;
+certain opcodes simply aren't used in text mode.
+
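A small sketch of the point made above, assuming a current Python 3 interpreter: the same pickle.loads call accepts both text-mode and binary-mode pickles, and pickletools can report which opcodes each pickle actually uses. The helper name newest_opcode_protocol is hypothetical; it relies on the proto attribute that pickletools records for each opcode.

```python
import pickle
import pickletools

data = {"a": [1, 2, 3]}


def newest_opcode_protocol(pickled):
    """Return the highest protocol number among the opcodes actually used."""
    return max(op.proto for op, arg, pos in pickletools.genops(pickled))


text_pickle = pickle.dumps(data, 0)    # text mode
binary_pickle = pickle.dumps(data, 1)  # binary mode

# The unpickler is never told which mode produced the bytes.
assert pickle.loads(text_pickle) == data
assert pickle.loads(binary_pickle) == data

print(newest_opcode_protocol(text_pickle))
print(newest_opcode_protocol(binary_pickle))
```

For this particular payload the text-mode pickle uses only opcodes from the oldest set, while the binary-mode pickle also uses opcodes that text mode never emits.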
+Retroactively, text mode is called protocol 0, and binary mode is
+called protocol 1. The new protocol is called protocol 2. In the
+tradition of pickling protocols, protocol 2 is a superset of
+protocol 1. But just so that future pickling protocols aren't
+required to be supersets of the oldest protocols, a new opcode is
+inserted at the start of a protocol 2 pickle indicating that it is
+using protocol 2.
+
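The leading protocol marker can be inspected directly. This is a minimal sketch, assuming a current Python 3 interpreter; it uses pickletools.genops to read the first opcode of a protocol 2 pickle and compares it with a protocol 1 pickle of the same list, which carries no such marker.

```python
import pickle
import pickletools

payload = pickle.dumps([1, 2, 3], 2)

# genops yields (opcode, argument, position) triples in stream order.
first_op, first_arg, first_pos = next(pickletools.genops(payload))
print(first_op.name, first_arg)

# A protocol 1 pickle of the same object has no version marker up front.
older = pickle.dumps([1, 2, 3], 1)
print(older[:1] == payload[:1])
```

The marker occupies two bytes: the opcode itself and a one-byte protocol number.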
+Several functions, methods and constructors used for pickling
+formerly took a positional argument named 'bin', a flag defaulting
+to 0 that indicated binary mode. This argument is renamed to
+'proto' and now gives the protocol number, defaulting to 0.
+
+It so happens that passing 2 for the 'bin' argument in previous
+Python versions had the same effect as passing 1. Nevertheless, a
+special case is added here: passing a negative number selects the
+highest protocol version supported by a particular
+implementation. This works in previous Python versions, too.
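A short sketch of the negative-number special case, assuming a current Python 3 interpreter. Note two differences from the text of this revision: in current releases the keyword is spelled protocol rather than 'proto', and the default is no longer 0. The positional form is used here so the call matches the description above.

```python
import pickle

record = {"name": "spam", "count": 3}

# A negative protocol number selects the highest protocol the
# running implementation supports.
newest = pickle.dumps(record, -1)
explicit = pickle.dumps(record, pickle.HIGHEST_PROTOCOL)

print(newest == explicit)
print(pickle.loads(newest) == record)
```

Passing -1 is convenient when the pickle is only ever read back by the same Python version; an explicit number is the safer choice when older readers are involved.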
 
 
 Copyright