PEP 412: add missing variables footer and reformat.

2012-03-31 21:40:52 +02:00 · 2012-03-31 21:40:52 +02:00 · 4408906f8a
parent 7638f18993
commit 4408906f8a
1 changed file with 101 additions and 84 deletions
--- a/pep-0412.txt
+++ b/pep-0412.txt
@@ -14,26 +14,28 @@ Post-History: 08-Feb-2012
 Abstract
 ========
-This PEP proposes a change in the implementation of the builtin dictionary
+This PEP proposes a change in the implementation of the builtin
-type ``dict``. The new implementation allows dictionaries which are used as
+dictionary type ``dict``.  The new implementation allows dictionaries
-attribute dictionaries (the ``__dict__`` attribute of an object) to share
+which are used as attribute dictionaries (the ``__dict__`` attribute
-keys with other attribute dictionaries of instances of the same class.
+of an object) to share keys with other attribute dictionaries of
+instances of the same class.
 Motivation
 ==========
-The current dictionary implementation uses more memory than is necessary
+The current dictionary implementation uses more memory than is
-when used as a container for object attributes as the keys are
+necessary when used as a container for object attributes as the keys
-replicated for each instance rather than being shared across many instances
+are replicated for each instance rather than being shared across many
-of the same class.
+instances of the same class.  Despite this, the current dictionary
-Despite this, the current dictionary implementation is finely tuned and
+implementation is finely tuned and performs very well as a
-performs very well as a general-purpose mapping object.
+general-purpose mapping object.
-By separating the keys (and hashes) from the values it is possible to share
+By separating the keys (and hashes) from the values it is possible to
-the keys between multiple dictionaries and improve memory use.
+share the keys between multiple dictionaries and improve memory use.
-By ensuring that keys are separated from the values only when beneficial,
+By ensuring that keys are separated from the values only when
-it is possible to retain the high-performance of the current dictionary
+beneficial, it is possible to retain the high-performance of the
-implementation when used as a general-purpose mapping object.
+current dictionary implementation when used as a general-purpose
+mapping object.
 Behaviour
 =========
@@ -47,76 +49,80 @@ Performance
 Memory Usage
 ------------
-Reduction in memory use is directly related to the number of dictionaries
+Reduction in memory use is directly related to the number of
-with shared keys in existence at any time. These dictionaries are typically
+dictionaries with shared keys in existence at any time.  These
-half the size of the current dictionary implementation.
+dictionaries are typically half the size of the current dictionary
+implementation.
 Benchmarking shows that memory use is reduced by 10% to 20% for
-object-oriented programs with no significant change in memory use
+object-oriented programs with no significant change in memory use for
-for other programs.
+other programs.
 Speed
 -----
-The performance of the new implementation is dominated by memory locality
+The performance of the new implementation is dominated by memory
-effects. When keys are not shared (for example in module dictionaries
+locality effects.  When keys are not shared (for example in module
-and dictionaries explicitly created by dict() or {} ) then performance is
+dictionaries and dictionaries explicitly created by ``dict()`` or
-unchanged (within a percent or two) from the current implementation.
+``{}``) then performance is unchanged (within a percent or two) from
+the current implementation.
-For the shared keys case, the new implementation tends to separate keys
+For the shared keys case, the new implementation tends to separate
-from values, but reduces total memory usage. This will improve performance
+keys from values, but reduces total memory usage.  This will improve
-in many cases as the effects of reduced memory usage outweigh the loss of
+performance in many cases as the effects of reduced memory usage
-locality, but some programs may show a small slow down.
+outweigh the loss of locality, but some programs may show a small slow
+down.
 Benchmarking shows no significant change of speed for most benchmarks.
 Object-oriented benchmarks show small speed ups when they create large
-numbers of objects of the same class (the gcbench benchmark shows a 10%
+numbers of objects of the same class (the gcbench benchmark shows a
-speed up; this is likely to be an upper limit).
+10% speed up; this is likely to be an upper limit).
 Implementation
 ==============
-Both the old and new dictionaries consist of a fixed-sized dict struct and
+Both the old and new dictionaries consist of a fixed-sized dict struct
-a re-sizeable table.
+and a re-sizeable table.  In the new dictionary the table can be
-In the new dictionary the table can be further split into a keys table and
+further split into a keys table and values array.  The keys table
-values array.
+holds the keys and hashes and (for non-split tables) the values as
-The keys table holds the keys and hashes and (for non-split tables) the
+well.  It differs only from the original implementation in that it
-values as well. It differs only from the original implementation in that it
 contains a number of fields that were previously in the dict struct.
-If a table is split the values in the keys table are ignored, instead the
+If a table is split the values in the keys table are ignored, instead
-values are held in a separate array.
+the values are held in a separate array.
 Split-Table dictionaries
 ------------------------
-When dictionaries are created to fill the __dict__ slot of an object, they are
+When dictionaries are created to fill the __dict__ slot of an object,
-created in split form. The keys table is cached in the type, potentially
+they are created in split form.  The keys table is cached in the type,
-allowing all attribute dictionaries of instances of one class to share keys.
+potentially allowing all attribute dictionaries of instances of one
-In the event of the keys of these dictionaries starting to diverge,
+class to share keys.  In the event of the keys of these dictionaries
-individual dictionaries will lazily convert to the combined-table form.
+starting to diverge, individual dictionaries will lazily convert to
-This ensures good memory use in the common case, and correctness in all cases.
+the combined-table form.  This ensures good memory use in the common
+case, and correctness in all cases.
 When resizing a split dictionary it is converted to a combined table.
-If resizing is as a result of storing an instance attribute, and there is
+If resizing is as a result of storing an instance attribute, and there
-only one instance of a class, then the dictionary will be re-split immediately.
+is only one instance of a class, then the dictionary will be re-split
-Since most OO code will set attributes in the __init__ method, all attributes
+immediately.  Since most OO code will set attributes in the __init__
-will be set before a second instance is created and no more resizing will be
+method, all attributes will be set before a second instance is created
-necessary as all further instance dictionaries will have the correct size.
+and no more resizing will be necessary as all further instance
-For more complex use patterns, it is impossible to know what is the best
+dictionaries will have the correct size.  For more complex use
-approach, so the implementation allows extra insertions up to the point
+patterns, it is impossible to know what is the best approach, so the
-of a resize when it reverts to the combined table (non-shared keys).
+implementation allows extra insertions up to the point of a resize
+when it reverts to the combined table (non-shared keys).
-A deletion from a split dictionary does not change the keys table, it simply
+A deletion from a split dictionary does not change the keys table, it
-removes the value from the values array.
+simply removes the value from the values array.
 Combined-Table dictionaries
 ---------------------------
-Explicit dictionaries (dict() or {}), module dictionaries and most other
+Explicit dictionaries (``dict()`` or ``{}``), module dictionaries and
-dictionaries are created as combined-table dictionaries.
+most other dictionaries are created as combined-table dictionaries.  A
-A combined-table dictionary never becomes a split-table dictionary.
+combined-table dictionary never becomes a split-table dictionary.
-Combined tables are laid out in much the same way as the tables in the old
+Combined tables are laid out in much the same way as the tables in the
-dictionary, resulting in very similar performance.
+old dictionary, resulting in very similar performance.
 Implementation
 ==============
@@ -129,44 +135,45 @@ Pros and Cons
 Pros
 ----
-Significant memory savings for object-oriented applications.
+Significant memory savings for object-oriented applications.  Small
-Small improvement to speed for programs which create lots of similar objects.
+improvement to speed for programs which create lots of similar
+objects.
 Cons
 ----
-Change to data structures:
+Change to data structures: Third party modules which meddle with the
-Third party modules which meddle with the internals of the dictionary
+internals of the dictionary implementation will break.
-implementation will break.
-Changes to repr() output and iteration order:
-For most cases, this will be unchanged.
-However for some split-table dictionaries the iteration order will
-change.
-Neither of these cons should be a problem.
+Changes to repr() output and iteration order: For most cases, this
-Modules which meddle with the internals of the dictionary
+will be unchanged.  However for some split-table dictionaries the
-implementation are already broken and should be fixed to use the API.
+iteration order will change.
-The iteration order of dictionaries was never defined and has always been
+
-arbitrary; it is different for Jython and PyPy.
+Neither of these cons should be a problem.  Modules which meddle with
+the internals of the dictionary implementation are already broken and
+should be fixed to use the API.  The iteration order of dictionaries
+was never defined and has always been arbitrary; it is different for
+Jython and PyPy.
 Alternative Implementation
 --------------------------
-An alternative implementation for split tables, which could save even more
+An alternative implementation for split tables, which could save even
-memory, is to store an index in the value field of the keys table (instead
+more memory, is to store an index in the value field of the keys table
-of ignoring the value field). This index would explicitly state where in the
+(instead of ignoring the value field).  This index would explicitly
-value array to look. The value array would then only require 1 field for each
+state where in the value array to look.  The value array would then
-usable slot in the key table, rather than each slot in the key table.
+only require 1 field for each usable slot in the key table, rather
+than each slot in the key table.
 This "indexed" version would reduce the size of value array by about
-one third. The keys table would need an extra "values_size" field, increasing
+one third. The keys table would need an extra "values_size" field,
-the size of combined dicts by one word.
+increasing the size of combined dicts by one word.  The extra
-The extra indirection adds more complexity to the code, potentially reducing
+indirection adds more complexity to the code, potentially reducing
 performance a little.
-The "indexed" version will not be included in this implementation,
+The "indexed" version will not be included in this implementation, but
-but should be considered deferred rather than rejected,
+should be considered deferred rather than rejected, pending further
-pending further experimentation.
+experimentation.
 References
 ==========
@@ -179,3 +186,13 @@ Copyright
 This document has been placed in the public domain.
+..
+  Local Variables:
+  mode: indented-text
+  indent-tabs-mode: nil
+  sentence-end-double-space: t
+  fill-column: 70
+  coding: utf-8
+  End:
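
Note on the text being reflowed: the split-table behaviour described in the PEP (keys shared through the type, values held per instance, deletion leaving the keys table untouched, and lazy conversion to a combined table when keys diverge) can be illustrated with a small toy model. This is only a sketch under simplifying assumptions, not the CPython implementation: the names ``SharedKeys`` and ``SplitDict`` are hypothetical, the model converts to combined form on the first diverging key rather than at the next resize, and it lets a sole user extend the shared keys directly instead of resizing and re-splitting.

```python
_MISSING = object()  # marks an empty slot in a values array


class SharedKeys:
    """Toy keys table (hypothetical name): maps each key to a slot index."""

    def __init__(self):
        self.slots = {}
        self.users = 0  # number of split dicts currently sharing this table


class SplitDict:
    """Toy attribute dictionary that starts split and may become combined."""

    def __init__(self, shared):
        self.shared = shared
        self.values = []      # one entry per slot, private to this dict
        self.combined = None  # plain dict once this instance has diverged
        shared.users += 1

    def is_split(self):
        return self.combined is None

    def __setitem__(self, key, value):
        if self.combined is not None:
            self.combined[key] = value
            return
        slot = self.shared.slots.get(key)
        if slot is None:
            if self.shared.users > 1:
                # Keys diverge from the other instances: stop sharing.
                self._convert_to_combined()
                self.combined[key] = value
                return
            # Sole user: extend the shared keys table (simplification).
            slot = len(self.shared.slots)
            self.shared.slots[key] = slot
        while len(self.values) <= slot:
            self.values.append(_MISSING)
        self.values[slot] = value

    def __getitem__(self, key):
        if self.combined is not None:
            return self.combined[key]
        slot = self.shared.slots.get(key)
        if (slot is None or slot >= len(self.values)
                or self.values[slot] is _MISSING):
            raise KeyError(key)
        return self.values[slot]

    def __delitem__(self, key):
        if self.combined is not None:
            del self.combined[key]
            return
        self[key]  # raises KeyError if the key is absent
        # The keys table is left alone; only the value is removed.
        self.values[self.shared.slots[key]] = _MISSING

    def _convert_to_combined(self):
        self.combined = {
            key: self.values[slot]
            for key, slot in self.shared.slots.items()
            if slot < len(self.values) and self.values[slot] is not _MISSING
        }
        self.values = []
        self.shared.users -= 1


keys = SharedKeys()
a = SplitDict(keys)
a["x"] = 1
a["y"] = 2      # sole user, so the shared keys table grows
b = SplitDict(keys)
b["x"] = 10
b["y"] = 20     # reuses the slots created through ``a``
del b["x"]      # keys table unchanged, value slot cleared
b["z"] = 30     # diverges from ``a``: ``b`` becomes a combined table
```

In this model ``a`` keeps sharing keys while ``b`` ends up holding its own keys and values, which mirrors the "good memory use in the common case, and correctness in all cases" trade-off stated in the PEP text.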