PEP: 416
Title: Add a frozendict builtin type
Version: $Revision$
Last-Modified: $Date$
Author: Victor Stinner
Status: Draft
Type: Standards Track
Content-Type: text/x-rst
Created: 29-February-2012
Python-Version: 3.3


Abstract
========

Add a new frozendict builtin type.


Rationale
=========

A frozendict is a read-only mapping: a key cannot be added or removed, and a
key is always mapped to the same value. However, frozendict values are not
required to be hashable. A frozendict is hashable if and only if all of its
values are hashable.

Use cases:

* Immutable global variable, like a default configuration.
* Default value of a function parameter. This avoids the issue of mutable
  default arguments.
* Implement a cache: frozendict can be used to store function keywords.
  frozendict can be used as a key of a mapping or as a member of a set.
* frozendict avoids the need for a lock when the frozendict is shared
  by multiple threads or processes, especially a hashable frozendict. It would
  also help to prevent coroutines (generators + greenlets) from modifying the
  global state.
* frozendict lookup can be done at compile time instead of runtime because the
  mapping is read-only. frozendict can be used instead of a preprocessor to
  remove conditional code at compilation, like code specific to a debug build.
* frozendict helps to implement read-only object proxies for security modules.
  For example, it would be possible to use the frozendict type for the
  __builtins__ mapping or type.__dict__. This is possible because frozendict is
  compatible with the PyDict C API.
* frozendict avoids the need for a read-only proxy in some cases. frozendict is
  faster than a proxy because getting an item in a frozendict is a fast lookup
  whereas a proxy requires a function call.

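The semantics described in this section can be approximated in pure Python.
The following is a minimal sketch, not the proposed C implementation: the
class name FrozenDictSketch is hypothetical, and its internal dict remains
reachable through the _data attribute, so it is not truly read-only. It only
illustrates that write operations fail, that unhashable values are accepted,
and that hashing succeeds if and only if all values are hashable.

```python
from collections.abc import Mapping


class FrozenDictSketch(Mapping):
    """Hypothetical pure-Python stand-in for the proposed frozendict."""

    __slots__ = ("_data", "_hash")

    def __init__(self, *args, **kw):
        # Copy the input so that later changes to the source are not visible.
        self._data = dict(*args, **kw)
        self._hash = None

    def __getitem__(self, key):
        return self._data[key]

    def __iter__(self):
        return iter(self._data)

    def __len__(self):
        return len(self._data)

    def __hash__(self):
        # Raises TypeError if any value is unhashable; the result is cached.
        if self._hash is None:
            self._hash = hash(frozenset(self._data.items()))
        return self._hash


# All values hashable: usable as a dict key, for example in a cache.
config = FrozenDictSketch(debug=False, level=3)
cache = {config: "cached result"}

# Unhashable values are allowed; only hash() on such an instance fails.
mixed = FrozenDictSketch(names=["a", "b"])
```

Because no __setitem__ or __delitem__ is defined, item assignment and
deletion on an instance raise a TypeError.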

Constraints
===========

* frozendict has to implement the Mapping abstract base class
* frozendict keys and values can be unorderable
* a frozendict is hashable if all keys and values are hashable
* frozendict hash does not depend on the items creation order


Implementation
==============

* Add a PyFrozenDictObject structure based on PyDictObject with an extra
  "Py_hash_t hash;" field
* frozendict.__hash__() is implemented using hash(frozenset(self.items())) and
  caches the result in its private hash attribute
* Register frozendict as a collections.abc.Mapping
* frozendict can be used with PyDict_GetItem(), but PyDict_SetItem() and
  PyDict_DelItem() raise a TypeError


Recipe: hashable dict
======================

To ensure that a frozendict is hashable, values can be checked
before creating the frozendict::

    def hashabledict(*args, **kw):
        # build a temporary dict so that positional arguments (a mapping
        # or an iterable of pairs) and keywords are handled like dict()
        items = dict(*args, **kw)
        # ensure that all values are hashable
        for value in items.values():
            if isinstance(value, (int, str, bytes, float, frozenset, complex)):
                # avoid computing the hash (which may be slow) for builtin
                # types known to be hashable for any value
                continue
            hash(value)
            # don't check the key: frozendict already checks the key
        return frozendict(items)


Objections
==========

*namedtuple may fit the requirements of a frozendict.*

A namedtuple is not a mapping: it does not implement the Mapping abstract base
class.

*"frozendict can be implemented in Python using descriptors" and "frozendict
just needs to be practically constant."*

If frozendict is used to harden Python (security purpose), it must be
implemented in C. A type implemented in C is also faster.

*PEP 351 was rejected.*

PEP 351 tries to freeze an object and so may convert a mutable object to an
immutable object (using a different type). frozendict doesn't convert anything:
hash(frozendict) raises a TypeError if a value is not hashable. Freezing an
object is not the purpose of this PEP.

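The constraint that the hash must not depend on item creation order, and the
hash(frozenset(self.items())) expression given under Implementation, can be
checked with plain dicts. This is a small sketch under stated assumptions:
the helper name items_hash is hypothetical, and the comparison of iteration
order relies on dicts preserving insertion order (Python 3.7 and later).

```python
def items_hash(mapping):
    # Same expression the PEP gives for frozendict.__hash__().
    return hash(frozenset(mapping.items()))


first = {"host": "localhost", "port": 8080}
second = {"port": 8080, "host": "localhost"}  # same items, other order

# The two dicts iterate in different orders...
assert list(first) != list(second)
# ...but the frozenset-based hash is identical for both.
assert items_hash(first) == items_hash(second)
# A tuple of the items keeps the order, so it would not satisfy the
# constraint as a basis for the hash.
assert tuple(first.items()) != tuple(second.items())
```

As with the proposed type, items_hash() raises a TypeError when a value is
not hashable, because the (key, value) pairs themselves become unhashable.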

Alternative: dictproxy
======================

Python has a builtin dictproxy type used by the type.__dict__ getter
descriptor. This type is not public. dictproxy is a read-only view of a
dictionary, but it is not a read-only mapping. If the dictionary is modified,
the dictproxy is also modified.

dictproxy can be used via ctypes and the Python C API; see for example
*make dictproxy object via ctypes.pythonapi and type() (Python recipe 576540)*
by Ikkei Shimomura, listed under Links. The recipe contains a test checking
that a dictproxy is "mutable" (it modifies the dictionary linked to the
dictproxy).

However, dictproxy can be useful in some cases where its mutable property is
not an issue, to avoid a copy of the dictionary.


Existing implementations
========================

Whitelist approach.

* Implementing an Immutable Dictionary (Python recipe 498072) by Aristotelis
  Mikropoulos. Similar to frozendict except that it is not truly read-only: it
  is possible to access its private internal dict. It does not implement
  __hash__ and has an implementation issue: it is possible to call __init__()
  again to modify the mapping.
* PyWebmail contains an ImmutableDict type: webmail.utils.ImmutableDict.
  It is hashable if keys and values are hashable. It is not truly read-only:
  its internal dict is a public attribute.
* remember project: remember.dicts.FrozenDict.
  It is used to implement a cache: FrozenDict is used to store function
  callbacks. FrozenDict may be hashable. It has an extra supply_dict() class
  method to create a FrozenDict from a dict without copying the dict: the dict
  is stored as the internal dict. Implementation issue: __init__() can be
  called to modify the mapping and the hash may differ depending on item
  creation order. The mapping is not truly read-only: the internal dict is
  accessible in Python.

Blacklist approach: inherit from dict and override write methods to raise an
exception.
It is not truly read-only: it is still possible to call dict methods
on such a "frozen dictionary" to modify it.

* brownie: brownie.datastructures.ImmutableDict.
  It is hashable if keys and values are hashable. The werkzeug project has the
  same code: werkzeug.datastructures.ImmutableDict.
  ImmutableDict is used for global constants (configuration options). The Flask
  project uses werkzeug's ImmutableDict for its default configuration.
* SQLAlchemy project: sqlalchemy.util.immutabledict.
  It is not hashable and has an extra method: union(). immutabledict is used
  for the default value of parameters of some functions expecting a mapping.
  Example: mapper_args=immutabledict() in SqlSoup.map().
* Frozen dictionaries (Python recipe 414283) by Oren Tirosh. It is hashable if
  keys and values are hashable. Included in the following projects:

  * lingospot: frozendict/frozendict.py
  * factor-graphics: frozendict type in python/fglib/util_ext_frozendict.py

* The gsakkis-utils project written by George Sakkis includes a frozendict
  type: datastructs.frozendict
* characters: scripts/python/frozendict.py.
  It is hashable. __init__() sets __init__ to None.
* Old NLTK (1.x): nltk.util.frozendict. Keys and values must be hashable.
  __init__() can be called twice to modify the mapping. frozendict is used to
  "freeze" an object.

Hashable dict: inherit from dict and just add a __hash__ method.

* pypy.rpython.lltypesystem.lltype.frozendict.
  It is hashable but does not prevent modification of the mapping.
* factor-graphics: hashabledict type in python/fglib/util_ext_frozendict.py


Links
=====

* Issue #14162: PEP 416: Add a builtin frozendict type
* PEP 412: Key-Sharing Dictionary (issue #13903)
* PEP 351: The freeze protocol
* The case for immutable dictionaries; and the central misunderstanding of
  PEP 351
* make dictproxy object via ctypes.pythonapi and type() (Python recipe
  576540) by Ikkei Shimomura.
* Python security modules implementing read-only object proxies using a C
  extension:

  * pysandbox
  * mxProxy
  * zope.proxy
  * zope.security


Copyright
=========

This document has been placed in the public domain.