PEP: 416
Title: Add a frozendict builtin type
Version: $Revision$
Last-Modified: $Date$
Author: Victor Stinner <vstinner@redhat.com>
Status: Rejected
Type: Standards Track
Content-Type: text/x-rst
Created: 29-February-2012
Python-Version: 3.3
Rejection Notice
================
I'm rejecting this PEP. A number of reasons (not exhaustive):
* According to Raymond Hettinger, use of frozendict is low. Those
that do use it tend to use it as a hint only, such as declaring
global or class-level "constants": they aren't really immutable,
since anyone can still assign to the name.
* There are existing idioms for avoiding mutable default values.
* The potential of optimizing code using frozendict in PyPy is
unsure; a lot of other things would have to change first. The same
holds for compile-time lookups in general.
* Multiple threads can agree by convention not to mutate a shared
dict, there's no great need for enforcement. Multiple processes
can't share dicts.
* Adding a security sandbox written in Python, even with a limited
scope, is frowned upon by many, due to the inherent difficulty with
ever proving that the sandbox is actually secure. Because of this
we won't be adding one to the stdlib any time soon, so this use
case falls outside the scope of a PEP.
On the other hand, exposing the existing read-only dict proxy as a
built-in type sounds good to me. (It would need to be changed to
allow calling the constructor.) GvR.
**Update** (2012-04-15): A new ``MappingProxyType`` type was added to the types
module of Python 3.3.
Abstract
========
Add a new frozendict builtin type.
Rationale
=========
A frozendict is a read-only mapping: a key cannot be added or removed, and a
key is always mapped to the same value. However, frozendict values may be
unhashable. A frozendict is hashable if and only if all of its values are hashable.
Use cases:
* Immutable global variables, such as a default configuration.
* Default value of a function parameter, which avoids the issue of mutable
default arguments.
* Implement a cache: frozendict can be used to store function keywords.
frozendict can be used as a key of a mapping or as a member of set.
* frozendict avoids the need for a lock when it is shared by multiple threads
or processes, especially if the frozendict is hashable. It would
also help to prevent coroutines (generators + greenlets) from modifying the
global state.
* frozendict lookup can be done at compile time instead of runtime because the
mapping is read-only. frozendict can be used instead of a preprocessor to
remove conditional code at compilation, like code specific to a debug build.
* frozendict helps to implement read-only object proxies for security modules.
For example, it would be possible to use the frozendict type for the __builtins__
mapping or type.__dict__. This is possible because frozendict is compatible
with the PyDict C API.
* frozendict avoids the need for a read-only proxy in some cases. frozendict is
faster than a proxy because getting an item from a frozendict is a fast lookup,
whereas a proxy requires a function call.
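
The cache use case can be approximated today by freezing the keyword
arguments by hand. The following is a minimal illustration and not part of the
proposal: the decorator name ``memoize_kwargs`` is invented here, and
``frozenset(kwargs.items())`` is only a stand-in for a hashable frozendict.

```python
import functools


def memoize_kwargs(func):
    """Cache results of func, keyed by its (hashable) arguments.

    Hypothetical helper: frozenset(kwargs.items()) stands in for the
    hashable frozendict proposed by the PEP.
    """
    cache = {}

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        # A plain dict cannot be used as a key; a frozen snapshot can.
        key = (args, frozenset(kwargs.items()))
        if key not in cache:
            cache[key] = func(*args, **kwargs)
        return cache[key]

    wrapper.cache = cache  # exposed only so the example can inspect it
    return wrapper


calls = []


@memoize_kwargs
def area(*, width, height):
    calls.append((width, height))
    return width * height


first = area(width=2, height=3)
second = area(height=3, width=2)  # same keywords, different order
```

Both calls map to the same cache key, so ``area`` runs only once. With a
frozendict the key could keep the mapping interface instead of being reduced
to a set of pairs.
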
Constraints
===========
* frozendict has to implement the Mapping abstract base class
* frozendict keys and values can be unorderable
* a frozendict is hashable if all keys and values are hashable
* frozendict hash does not depend on the item creation order
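
Taken together, these constraints rule out hashing a sorted tuple of items,
because sorting needs orderable keys. The short check below, which uses only
builtins, suggests why a ``frozenset`` of the items fits; the helper name
``items_hash`` is made up for this illustration.

```python
def items_hash(mapping):
    """Order-independent hash of a mapping's items (illustrative helper)."""
    return hash(frozenset(mapping.items()))


# Same items, different creation order, keys of unorderable types.
a = {1: "one", "two": 2, None: 3.0}
b = {None: 3.0, "two": 2, 1: "one"}

same_hash = items_hash(a) == items_hash(b)

# Sorting the items fails because int, str and None cannot be compared.
try:
    sorted(a.items())
    sortable = True
except TypeError:
    sortable = False

# A single unhashable value makes the whole mapping unhashable.
try:
    items_hash({"key": []})
    hashable_with_list = True
except TypeError:
    hashable_with_list = False
```
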
Implementation
==============
* Add a PyFrozenDictObject structure based on PyDictObject with an extra
"Py_hash_t hash;" field
* frozendict.__hash__() is implemented using hash(frozenset(self.items())) and
caches the result in its private hash attribute
* Register frozendict as a collections.abc.Mapping
* frozendict can be used with PyDict_GetItem(), but PyDict_SetItem() and
PyDict_DelItem() raise a TypeError
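
For readers who want to experiment, the hashing strategy described above can
be approximated in pure Python. This is only a sketch under stated
assumptions: the PEP proposes a C type based on PyDictObject, whereas the
class below wraps a regular dict and is therefore not truly read-only. The
name ``FrozenDictSketch`` and its ``_data`` and ``_hash`` attributes are
invented for the example.

```python
from collections.abc import Mapping


class FrozenDictSketch(Mapping):
    """Pure-Python approximation of the proposed frozendict (illustrative)."""

    __slots__ = ("_data", "_hash")

    def __init__(self, *args, **kwargs):
        # Copy the input so later changes to it do not affect this mapping.
        self._data = dict(*args, **kwargs)
        self._hash = None  # computed lazily, then cached

    def __getitem__(self, key):
        return self._data[key]

    def __iter__(self):
        return iter(self._data)

    def __len__(self):
        return len(self._data)

    def __hash__(self):
        # Same formula as in the PEP: hash(frozenset(self.items())).
        # Raises TypeError if any value is not hashable.
        if self._hash is None:
            self._hash = hash(frozenset(self._data.items()))
        return self._hash

    def __repr__(self):
        return f"{type(self).__name__}({self._data!r})"


fd1 = FrozenDictSketch(a=1, b=2)
fd2 = FrozenDictSketch([("b", 2), ("a", 1)])
```

Because only the ``Mapping`` read methods are defined, ``fd1["c"] = 3`` raises
a ``TypeError``, and the two instances compare and hash equal regardless of
creation order. Nothing stops code from reaching ``fd1._data``, which is one
reason the PEP argues for a C implementation.
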
Recipe: hashable dict
======================
To ensure that a frozendict is hashable, its values can be checked
before the frozendict is created::

    class hashabledict(frozendict):
        def __new__(cls, *args, **kw):
            # normalize the dict()-style arguments into key/value pairs
            items = dict(*args, **kw)
            # ensure that all values are hashable
            for value in items.values():
                if isinstance(value, (int, str, bytes, float, frozenset, complex)):
                    # avoid computing the hash (which may be slow) for builtin
                    # types known to be hashable for any value
                    continue
                hash(value)  # raises TypeError if the value is not hashable
            # don't check the keys: frozendict already checks them
            return frozendict.__new__(cls, items)

Objections
==========
*namedtuple may fit the requirements of a frozendict.*

A namedtuple is not a mapping: it does not implement the Mapping abstract base
class.

*"frozendict can be implemented in Python using descriptors" and "frozendict
just needs to be practically constant."*

If frozendict is used to harden Python (security purpose), it must be
implemented in C. A type implemented in C is also faster.

*PEP 351 was rejected.*

PEP 351 tries to freeze an object and so may convert a mutable object to an
immutable object (using a different type). frozendict doesn't convert anything:
hash(frozendict) raises a TypeError if a value is not hashable. Freezing an
object is not the purpose of this PEP.

Alternative: dictproxy
======================
Python has a builtin dictproxy type used by the type.__dict__ getter descriptor.
This type is not public. dictproxy is a read-only view of a dictionary, but it
is not a read-only mapping: if the underlying dictionary is modified, the
change is visible through the dictproxy.

A dictproxy can be created using ctypes and the Python C API; see for example
`make dictproxy object via ctypes.pythonapi and type() (Python recipe 576540)`_
by Ikkei Shimomura. The recipe contains a test checking that a dictproxy is
"mutable" (by modifying the dictionary linked to the dictproxy).

However, dictproxy can be useful in some cases where its mutable property is
not an issue, to avoid a copy of the dictionary.
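
Since Python 3.3 the same behaviour can be observed through
``types.MappingProxyType``, the public type mentioned in the update to the
rejection notice. The short example below shows that the proxy refuses writes
yet still reflects changes made to the underlying dictionary; the variable
names are arbitrary.

```python
from types import MappingProxyType

settings = {"debug": False}
view = MappingProxyType(settings)

# Writing through the proxy is refused...
try:
    view["debug"] = True
    write_allowed = True
except TypeError:
    write_allowed = False

# ...but the proxy is only a view: changes to the dict show through it.
settings["debug"] = True
seen_through_view = view["debug"]
```
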
Existing implementations
========================
Whitelist approach.
* `Implementing an Immutable Dictionary (Python recipe 498072)
<http://code.activestate.com/recipes/498072/>`_ by Aristotelis Mikropoulos.
Similar to frozendict except that it is not truly read-only: it is possible
to access its private internal dict. It does not implement __hash__ and
has an implementation issue: __init__() can be called again to
modify the mapping.
* PyWebmail contains an ImmutableDict type: `webmail.utils.ImmutableDict
<http://pywebmail.cvs.sourceforge.net/viewvc/pywebmail/webmail/webmail/utils/ImmutableDict.py?revision=1.2&view=markup>`_.
It is hashable if keys and values are hashable. It is not truly read-only:
its internal dict is a public attribute.
* remember project: `remember.dicts.FrozenDict
<https://bitbucket.org/mikegraham/remember/src/tip/remember/dicts.py>`_.
It is used to implement a cache: FrozenDict is used to store function callbacks.
FrozenDict may be hashable. It has an extra supply_dict() class method to
create a FrozenDict from a dict without copying the dict: store the dict as
the internal dict. Implementation issue: __init__() can be called to modify
the mapping and the hash may differ depending on item creation order. The
mapping is not truly read-only: the internal dict is accessible in Python.
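
The two weaknesses reported for these wrappers, an accessible internal dict
and an ``__init__()`` that can be called again, are easy to reproduce. The class
below is a deliberately naive stand-in written for this illustration; it is
not the code of any of the projects listed above.

```python
class NaiveImmutableDict:
    """Wrapper exposing only read methods (illustrative, intentionally flawed)."""

    def __init__(self, *args, **kwargs):
        self._dict = dict(*args, **kwargs)

    def __getitem__(self, key):
        return self._dict[key]

    def __len__(self):
        return len(self._dict)

    def __iter__(self):
        return iter(self._dict)


d = NaiveImmutableDict(a=1)

# Leak 1: the "private" dict is reachable from Python code.
d._dict["b"] = 2
keys_after_leak_1 = sorted(d)

# Leak 2: calling __init__() again replaces the content.
d.__init__(c=3)
keys_after_leak_2 = sorted(d)
```
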
Blacklist approach: inherit from dict and override write methods to raise an
exception. It is not truly read-only: it is still possible to call dict methods
on such a "frozen dictionary" to modify it.
* brownie: `brownie.datastructures.ImmutableDict
<https://github.com/DasIch/brownie/blob/HEAD/brownie/datastructures/mappings.py>`_.
It is hashable if keys and values are hashable. The werkzeug project has the
same code: `werkzeug.datastructures.ImmutableDict
<https://github.com/mitsuhiko/werkzeug/blob/master/werkzeug/datastructures.py>`_.
ImmutableDict is used for global constants (configuration options). The Flask
project uses werkzeug's ImmutableDict for its default configuration.
* SQLAlchemy project: `sqlalchemy.util.immutabledict
<http://hg.sqlalchemy.org/sqlalchemy/file/tip/lib/sqlalchemy/util/_collections.py>`_.
It is not hashable and has an extra method: union(). immutabledict is used
as the default value of parameters of some functions expecting a mapping.
Example: mapper_args=immutabledict() in SqlSoup.map().
* `Frozen dictionaries (Python recipe 414283) <http://code.activestate.com/recipes/414283/>`_
  by Oren Tirosh. It is hashable if keys and values are hashable. Included in
  the following projects:

  * lingospot: `frozendict/frozendict.py
    <http://code.google.com/p/lingospot/source/browse/trunk/frozendict/frozendict.py>`_
  * factor-graphics: frozendict type in `python/fglib/util_ext_frozendict.py
    <https://github.com/ih/factor-graphics/blob/41006fb71a09377445cc140489da5ce8eeb9c8b1/python/fglib/util_ext_frozendict.py>`_

* The gsakkis-utils project written by George Sakkis includes a frozendict
type: `datastructs.frozendict
<http://code.google.com/p/gsakkis-utils/source/browse/trunk/datastructs/frozendict.py>`_
* characters: `scripts/python/frozendict.py
<https://github.com/JasonGross/characters/blob/15a2af5f7861cd33a0dbce70f1569cda74e9a1e3/scripts/python/frozendict.py#L1>`_.
It is hashable. __init__() sets __init__ to None.
* Old NLTK (1.x): `nltk.util.frozendict
<http://nltk.googlecode.com/svn/trunk/nltk-old/src/nltk/util.py>`_. Keys and
values must be hashable. __init__() can be called twice to modify the
mapping. frozendict is used to "freeze" an object.
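
The bypass described for the blacklist approach can be shown in a few lines.
The subclass below is a simplified sketch written for this example, with an
invented name; it is not the code of brownie, werkzeug or SQLAlchemy.

```python
class BlacklistFrozenDict(dict):
    """dict subclass whose write methods raise (illustrative sketch)."""

    def _readonly(self, *args, **kwargs):
        raise TypeError(f"{type(self).__name__} is read-only")

    # Override the mutating methods of dict.
    __setitem__ = __delitem__ = _readonly
    clear = pop = popitem = setdefault = update = _readonly
    __ior__ = _readonly  # in-place merge operator (Python 3.9+)


frozen = BlacklistFrozenDict(a=1)

try:
    frozen["b"] = 2
    blocked = False
except TypeError:
    blocked = True

# The overrides are skipped when the base-class method is called directly.
dict.__setitem__(frozen, "b", 2)
```

After the last line ``frozen`` contains both keys, even though every write
method visible on the subclass raises.
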
Hashable dict: inherit from dict and just add a __hash__ method.
* `pypy.rpython.lltypesystem.lltype.frozendict
<https://bitbucket.org/pypy/pypy/src/1f49987cc2fe/pypy/rpython/lltypesystem/lltype.py#cl-86>`_.
It is hashable but does not prevent modification of the mapping.
* factor-graphics: hashabledict type in `python/fglib/util_ext_frozendict.py
<https://github.com/ih/factor-graphics/blob/41006fb71a09377445cc140489da5ce8eeb9c8b1/python/fglib/util_ext_frozendict.py>`_
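
The risk of a dict that is hashable but still mutable can be seen by using one
as a key and then changing it. The class name ``HashableDict`` is invented for
this sketch and does not come from the projects above.

```python
class HashableDict(dict):
    """dict with a __hash__ method but no protection against writes."""

    def __hash__(self):
        return hash(frozenset(self.items()))


key = HashableDict(a=1)
table = {key: "value"}
found_before = key in table

# Mutating the key changes its hash, so the lookup no longer finds the entry
# even though the very same object is still stored in the table.
key["b"] = 2
found_after = key in table
still_stored = any(k is key for k in table)
```
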
Links
=====
* `Issue #14162: PEP 416: Add a builtin frozendict type
<http://bugs.python.org/issue14162>`_
* PEP 412: Key-Sharing Dictionary
(`issue #13903 <http://bugs.python.org/issue13903>`_)
* PEP 351: The freeze protocol
* `The case for immutable dictionaries; and the central misunderstanding of
PEP 351 <http://www.cs.toronto.edu/~tijmen/programming/immutableDictionaries.html>`_
* `make dictproxy object via ctypes.pythonapi and type() (Python recipe
576540) <http://code.activestate.com/recipes/576540/>`_ by Ikkei Shimomura.
* Python security modules implementing read-only object proxies using a C
  extension:

  * `pysandbox <https://github.com/vstinner/pysandbox/>`_
  * `mxProxy <http://www.egenix.com/products/python/mxBase/mxProxy/>`_
  * `zope.proxy <http://pypi.python.org/pypi/zope.proxy>`_
  * `zope.security <http://pypi.python.org/pypi/zope.security>`_

Copyright
=========
This document has been placed in the public domain.