python-peps/pep-0416.txt

224 lines
9.5 KiB
Plaintext

PEP: 416
Title: Add a frozendict builtin type
Version: $Revision$
Last-Modified: $Date$
Author: Victor Stinner <victor.stinner@gmail.com>
Status: Draft
Type: Standards Track
Content-Type: text/x-rst
Created: 29-February-2012
Python-Version: 3.3
Abstract
========
Add a new frozendict builtin type.
Rationale
=========
A frozendict is a read-only mapping: a key cannot be added nor removed, and a
key is always mapped to the same value. However, frozendict values can be not
hashable. A frozendict is hashable if and only if all values are hashable.
Use cases:
* Immutable global variable like a default configuration.
* Default value of a function parameter. Avoid the issue of mutable default arguments.
* Implement a cache: frozendict can be used to store function keywords.
frozendict can be used as a key of a mapping or as a member of set.
* frozendict avoids the need of a lock when the frozendict is shared
by multiple threads or processes, especially hashable frozendict. It would
also help to prohibe coroutines (generators + greenlets) to modify the
global state.
* frozendict lookup can be done at compile time instead of runtime because the
mapping is read-only. frozendict can be used instead of a preprocessor to
remove conditional code at compilation, like code specific to a debug build.
* frozendict helps to implement read-only object proxies for security modules.
For example, it would be possible to use frozendict type for __builtins__
mapping or type.__dict__. This is possible because frozendict is compatible
with the PyDict C API.
* frozendict avoids the need of a read-only proxy in some cases. frozendict is
faster than a proxy because getting an item in a frozendict is a fast lookup
whereas a proxy requires a function call.
Constraints
===========
* frozendict has to implement the Mapping abstract base class
* frozendict keys and values can be unorderable
* a frozendict is hashable if all keys and values are hashable
* frozendict hash does not depend on the items creation order
Implementation
==============
* Add a PyFrozenDictObject structure based on PyDictObject with an extra
"Py_hash_t hash;" field
* frozendict.__hash__() is implemented using hash(frozenset(self.items())) and
caches the result in its private hash attribute
* Register frozendict as a collections.abc.Mapping
* frozendict can be used with PyDict_GetItem(), but PyDict_SetItem() and
PyDict_DelItem() raise a TypeError
Recipe: hashable dict
======================
To ensure that a a frozendict is hashable, values can be checked
before creating the frozendict::
import itertools
def hashabledict(*args, **kw):
# ensure that all values are hashable
for key, value in itertools.chain(args, kw.items()):
if isinstance(value, (int, str, bytes, float, frozenset, complex)):
# avoid the compute the hash (which may be slow) for builtin
# types known to be hashable for any value
continue
hash(value)
# don't check the key: frozendict already checks the key
return frozendict.__new__(cls, *args, **kw)
Objections
==========
*namedtuple may fit the requiements of a frozendict.*
A namedtuple is not a mapping, it does not implement the Mapping abstract base
class.
*frozendict can be implemented in Python using descriptors" and "frozendict
just need to be practically constant.*
If frozendict is used to harden Python (security purpose), it must be
implemented in C. A type implemented in C is also faster.
*The PEP 351 was rejected.*
The PEP 351 tries to freeze an object and so may convert a mutable object to an
immutable object (using a different type). frozendict doesn't convert anything:
hash(frozendict) raises a TypeError if a value is not hashable. Freezing an
object is not the purpose of this PEP.
Alternative: dictproxy
======================
Python has a builtin dictproxy type used by type.__dict__ getter descriptor.
This type is not public. dictproxy is a read-only view of a dictionary, but it
is not read-only mapping. If a dictionary is modified, the dictproxy is also
modified.
dictproxy can be used using ctypes and the Python C API, see for example the
`make dictproxy object via ctypes.pythonapi and type() (Python recipe 576540)`_
by Ikkei Shimomura. The recipe contains a test checking that a dictproxy is
"mutable" (modify the dictionary linked to the dictproxy).
However dictproxy can be useful in some cases, where its mutable property is
not an issue, to avoid a copy of the dictionary.
Existing implementations
========================
Whitelist approach.
* `Implementing an Immutable Dictionary (Python recipe 498072)
<http://code.activestate.com/recipes/498072/>`_ by Aristotelis Mikropoulos.
Similar to frozendict except that it is not truly read-only: it is possible
to access to this private internal dict. It does not implement __hash__ and
has an implementation issue: it is possible to call again __init__() to
modify the mapping.
* PyWebmail contains an ImmutableDict type: `webmail.utils.ImmutableDict
<http://pywebmail.cvs.sourceforge.net/viewvc/pywebmail/webmail/webmail/utils/ImmutableDict.py?revision=1.2&view=markup>`_.
It is hashable if keys and values are hashable. It is not truly read-only:
its internal dict is a public attribute.
* remember project: `remember.dicts.FrozenDict
<https://bitbucket.org/mikegraham/remember/src/tip/remember/dicts.py>`_.
It is used to implement a cache: FrozenDict is used to store function callbacks.
FrozenDict may be hashable. It has an extra supply_dict() class method to
create a FrozenDict from a dict without copying the dict: store the dict as
the internal dict. Implementation issue: __init__() can be called to modify
the mapping and the hash may differ depending on item creation order. The
mapping is not truly read-only: the internal dict is accessible in Python.
Blacklist approach: inherit from dict and override write methods to raise an
exception. It is not truly read-only: it is still possible to call dict methods
on such "frozen dictionary" to modify it.
* brownie: `brownie.datastructures.ImmuatableDict
<https://github.com/DasIch/brownie/blob/HEAD/brownie/datastructures/mappings.py>`_.
It is hashable if keys and values are hashable. werkzeug project has the
same code: `werkzeug.datastructures.ImmutableDict
<https://github.com/mitsuhiko/werkzeug/blob/master/werkzeug/datastructures.py>`_.
ImmutableDict is used for global constant (configuration options). The Flask
project uses ImmutableDict of werkzeug for its default configuration.
* SQLAchemy project: `sqlachemy.util.immutabledict
<http://hg.sqlalchemy.org/sqlalchemy/file/tip/lib/sqlalchemy/util/_collections.py>`_.
It is not hashable and has an extra method: union(). immutabledict is used
for the default value of parameter of some functions expecting a mapping.
Example: mapper_args=immutabledict() in SqlSoup.map().
* `Frozen dictionaries (Python recipe 414283) <http://code.activestate.com/recipes/414283/>`_
by Oren Tirosh. It is hashable if keys and values are hashable. Included in
the following projects:
* lingospot: `frozendict/frozendict.py
<http://code.google.com/p/lingospot/source/browse/trunk/frozendict/frozendict.py>`_
* factor-graphics: frozendict type in `python/fglib/util_ext_frozendict.py
<https://github.com/ih/factor-graphics/blob/41006fb71a09377445cc140489da5ce8eeb9c8b1/python/fglib/util_ext_frozendict.py>`_
* The gsakkis-utils project written by George Sakkis includes a frozendict
type: `datastructs.frozendict
<http://code.google.com/p/gsakkis-utils/source/browse/trunk/datastructs/frozendict.py>`_
* characters: `scripts/python/frozendict.py
<https://github.com/JasonGross/characters/blob/15a2af5f7861cd33a0dbce70f1569cda74e9a1e3/scripts/python/frozendict.py#L1>`_.
It is hashable. __init__() sets __init__ to None.
* Old NLTK (1.x): `nltk.util.frozendict
<http://nltk.googlecode.com/svn/trunk/nltk-old/src/nltk/util.py>`_. Keys and
values must be hashable. __init__() can be called twice to modify the
mapping. frozendict is used to "freeze" an object.
Hashable dict: inherit from dict and just add an __hash__ method.
* `pypy.rpython.lltypesystem.lltype.frozendict
<https://bitbucket.org/pypy/pypy/src/1f49987cc2fe/pypy/rpython/lltypesystem/lltype.py#cl-86>`_.
It is hashable but don't deny modification of the mapping.
* factor-graphics: hashabledict type in `python/fglib/util_ext_frozendict.py
<https://github.com/ih/factor-graphics/blob/41006fb71a09377445cc140489da5ce8eeb9c8b1/python/fglib/util_ext_frozendict.py>`_
Links
=====
* `Issue #14162: PEP 416: Add a builtin frozendict type
<http://bugs.python.org/issue14162>`_
* PEP 412: Key-Sharing Dictionary
(`issue #13903 <http://bugs.python.org/issue13903>`_)
* PEP 351: The freeze protocol
* `The case for immutable dictionaries; and the central misunderstanding of
PEP 351 <http://www.cs.toronto.edu/~tijmen/programming/immutableDictionaries.html>`_
* `make dictproxy object via ctypes.pythonapi and type() (Python recipe
576540) <http://code.activestate.com/recipes/576540/>`_ by Ikkei Shimomura.
* Python security modules implementing read-only object proxies using a C
extension:
* `pysandbox <https://github.com/haypo/pysandbox/>`_
* `mxProxy <http://www.egenix.com/products/python/mxBase/mxProxy/>`_
* `zope.proxy <http://pypi.python.org/pypi/zope.proxy>`_
* `zope.security <http://pypi.python.org/pypi/zope.security>`_
Copyright
=========
This document has been placed in the public domain.