PEP: 416
Title: Add a frozendict builtin type
Version: $Revision$
Last-Modified: $Date$
Author: Victor Stinner <vstinner@python.org>
Status: Rejected
Type: Standards Track
Content-Type: text/x-rst
Created: 29-Feb-2012
Python-Version: 3.3


Rejection Notice
================

I'm rejecting this PEP. A number of reasons (not exhaustive):

* According to Raymond Hettinger, use of frozendict is low. Those
  that do use it tend to use it as a hint only, such as declaring
  global or class-level "constants": they aren't really immutable,
  since anyone can still assign to the name.
* There are existing idioms for avoiding mutable default values.
* The potential of optimizing code using frozendict in PyPy is
  unsure; a lot of other things would have to change first. The same
  holds for compile-time lookups in general.
* Multiple threads can agree by convention not to mutate a shared
  dict, there's no great need for enforcement. Multiple processes
  can't share dicts.
* Adding a security sandbox written in Python, even with a limited
  scope, is frowned upon by many, due to the inherent difficulty with
  ever proving that the sandbox is actually secure. Because of this
  we won't be adding one to the stdlib any time soon, so this use
  case falls outside the scope of a PEP.

On the other hand, exposing the existing read-only dict proxy as a
built-in type sounds good to me. (It would need to be changed to
allow calling the constructor.) GvR.

**Update** (2012-04-15): A new ``MappingProxyType`` type was added to the types
module of Python 3.3.


Abstract
========

Add a new frozendict builtin type.


Rationale
=========

A frozendict is a read-only mapping: a key cannot be added or removed, and a
key is always mapped to the same value. However, frozendict values need not be
hashable. A frozendict is hashable if and only if all values are hashable.

Use cases:

* An immutable global variable, such as a default configuration.
* The default value of a function parameter, which avoids the issue of mutable
  default arguments.
* Implementing a cache: frozendict can be used to store function keywords.
  A hashable frozendict can be used as a key of a mapping or as a member of
  a set.
* frozendict avoids the need for a lock when the frozendict is shared
  by multiple threads or processes, especially a hashable frozendict. It would
  also help to prevent coroutines (generators + greenlets) from modifying the
  global state.
* frozendict lookup can be done at compile time instead of runtime because the
  mapping is read-only. frozendict can be used instead of a preprocessor to
  remove conditional code at compilation, like code specific to a debug build.
* frozendict helps to implement read-only object proxies for security modules.
  For example, it would be possible to use the frozendict type for the
  __builtins__ mapping or type.__dict__. This is possible because frozendict
  is compatible with the PyDict C API.
* frozendict avoids the need for a read-only proxy in some cases. frozendict is
  faster than a proxy because getting an item in a frozendict is a fast lookup
  whereas a proxy requires a function call.

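Two of the use cases above, default arguments and cache keys, can be
illustrated without the proposed type. The sketch below is only an
approximation: it uses ``types.MappingProxyType`` and ``frozenset`` as
stand-ins for frozendict, and every function name in it is hypothetical.

```python
from types import MappingProxyType


def add_option_buggy(name, options={}):
    # The same dict object is reused across calls, so state leaks.
    options[name] = True
    return options


_EMPTY = MappingProxyType({})  # read-only stand-in for an empty frozendict


def add_option_safe(name, options=_EMPTY):
    # The default cannot be mutated, so a new dict is built on every call.
    merged = dict(options)
    merged[name] = True
    return merged


def make_cache_key(*args, **kwargs):
    # Stand-in for a hashable frozendict holding the keyword arguments.
    return args, frozenset(kwargs.items())


add_option_buggy("a")
leaked = add_option_buggy("b")    # still contains "a" from the first call
add_option_safe("a")
isolated = add_option_safe("b")   # contains only "b"
same_key = make_cache_key(1, x=1, y=2) == make_cache_key(1, y=2, x=1)
```

With a hashable frozendict, the cache key could keep the keyword arguments as
a mapping instead of flattening them into a set of pairs.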

Constraints
===========

* frozendict has to implement the Mapping abstract base class
* frozendict keys and values can be unorderable
* a frozendict is hashable if all keys and values are hashable
* the hash of a frozendict does not depend on the order in which its items
  were created


Implementation
==============

* Add a PyFrozenDictObject structure based on PyDictObject with an extra
  "Py_hash_t hash;" field
* frozendict.__hash__() is implemented using hash(frozenset(self.items())) and
  caches the result in its private hash attribute
* Register frozendict as a collections.abc.Mapping
* frozendict can be used with PyDict_GetItem(), but PyDict_SetItem() and
  PyDict_DelItem() raise a TypeError

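The hashing scheme listed above can be sketched in pure Python. This is not
the proposed C implementation: ``FrozenDict`` is a hypothetical class that
only mirrors the ``hash(frozenset(self.items()))`` rule and the cached hash
field, and its internal dict remains reachable from Python code.

```python
from collections.abc import Mapping


class FrozenDict(Mapping):
    """Pure-Python approximation of the proposed type (illustrative only)."""

    __slots__ = ("_data", "_hash")

    def __init__(self, *args, **kwargs):
        self._data = dict(*args, **kwargs)
        self._hash = None  # filled lazily by __hash__()

    def __getitem__(self, key):
        return self._data[key]

    def __iter__(self):
        return iter(self._data)

    def __len__(self):
        return len(self._data)

    def __hash__(self):
        # A frozenset ignores insertion order; hashing it raises TypeError
        # if any value is unhashable.
        if self._hash is None:
            self._hash = hash(frozenset(self._data.items()))
        return self._hash

    def __repr__(self):
        return f"FrozenDict({self._data!r})"


a = FrozenDict(x=1, y=2)
b = FrozenDict([("y", 2), ("x", 1)])  # same items, different creation order
```

Because ``Mapping`` provides no ``__setitem__``, item assignment on an
instance raises ``TypeError``, while equality and ``get()`` come from the
abstract base class.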

Recipe: hashable dict
======================

To ensure that a frozendict is hashable, values can be checked
before creating the frozendict::

    def hashabledict(*args, **kw):
        # collect the items once, the same way the dict constructor would
        items = dict(*args, **kw)
        # ensure that all values are hashable
        for value in items.values():
            if isinstance(value, (int, str, bytes, float, frozenset, complex)):
                # avoid computing the hash (which may be slow) for builtin
                # types known to be hashable for any value
                continue
            hash(value)
        # don't check the keys: frozendict already checks them
        return frozendict(items)


Objections
==========

*namedtuple may fit the requirements of a frozendict.*

A namedtuple is not a mapping: it does not implement the Mapping abstract base
class.

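The answer above can be checked directly. In this small sketch, ``Point`` is a
hypothetical namedtuple used only for illustration.

```python
from collections import namedtuple
from collections.abc import Mapping

Point = namedtuple("Point", ["x", "y"])  # hypothetical example type
p = Point(x=1, y=2)

# A namedtuple is a tuple subclass, not a Mapping.
is_mapping = isinstance(p, Mapping)

# Key-based lookup is not supported: tuple indices must be integers.
try:
    p["x"]
    key_lookup_works = True
except TypeError:
    key_lookup_works = False

# A mapping can only be obtained by building a new object.
as_dict = p._asdict()
```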

*"frozendict can be implemented in Python using descriptors" and "frozendict
just needs to be practically constant."*

If frozendict is used to harden Python (security purpose), it must be
implemented in C. A type implemented in C is also faster.

*The* :pep:`351` *was rejected.*

:pep:`351` tries to freeze an object and so may convert a mutable object to an
immutable object (using a different type). frozendict doesn't convert anything:
hash(frozendict) raises a TypeError if a value is not hashable. Freezing an
object is not the purpose of this PEP.


Alternative: dictproxy
======================

Python has a builtin dictproxy type used by the type.__dict__ getter
descriptor. This type is not public. dictproxy is a read-only view of a
dictionary, but it is not a read-only mapping. If the underlying dictionary is
modified, the dictproxy reflects the modification.

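The difference between a read-only view and a read-only mapping can be seen on
any class. The sketch below assumes Python 3.3 or later, where the proxy
returned by ``type.__dict__`` is named ``mappingproxy``; the ``Config`` class
is hypothetical.

```python
class Config:
    level = 1


view = Config.__dict__  # read-only proxy over the class namespace
before = view["level"]

# Direct writes through the proxy are rejected...
try:
    view["level"] = 2
    direct_write_allowed = True
except TypeError:
    direct_write_allowed = False

# ...but the proxy is a live view: changing the class changes what it shows.
Config.level = 2
after = view["level"]
```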

dictproxy can be accessed using ctypes and the Python C API, see for example the
`make dictproxy object via ctypes.pythonapi and type() (Python recipe 576540)`_
by Ikkei Shimomura. The recipe contains a test checking that a dictproxy is
"mutable" (by modifying the dictionary linked to the dictproxy).

However, dictproxy can be useful in some cases, where its mutable property is
not an issue, to avoid a copy of the dictionary.


Existing implementations
========================

Whitelist approach.

* `Implementing an Immutable Dictionary (Python recipe 498072)
  <http://code.activestate.com/recipes/498072/>`_ by Aristotelis Mikropoulos.
  Similar to frozendict except that it is not truly read-only: it is possible
  to access its private internal dict. It does not implement __hash__ and
  has an implementation issue: it is possible to call __init__() again to
  modify the mapping.
* PyWebmail contains an ImmutableDict type: `webmail.utils.ImmutableDict
  <http://pywebmail.cvs.sourceforge.net/viewvc/pywebmail/webmail/webmail/utils/ImmutableDict.py?revision=1.2&view=markup>`_.
  It is hashable if keys and values are hashable. It is not truly read-only:
  its internal dict is a public attribute.
* remember project: `remember.dicts.FrozenDict
  <https://bitbucket.org/mikegraham/remember/src/tip/remember/dicts.py>`_.
  It is used to implement a cache: FrozenDict is used to store function callbacks.
  FrozenDict may be hashable. It has an extra supply_dict() class method to
  create a FrozenDict from a dict without copying the dict: it stores the dict
  as the internal dict. Implementation issues: __init__() can be called to
  modify the mapping, and the hash may differ depending on item creation order.
  The mapping is not truly read-only: the internal dict is accessible in Python.

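The two weaknesses reported for the wrappers above, calling ``__init__()``
again and reaching the internal dict, can be reproduced with a small sketch.
``WrappedDict`` is a hypothetical class written for this illustration; it is
not the code of any of the listed projects.

```python
from collections.abc import Mapping


class WrappedDict(Mapping):
    """Hypothetical wrapper exposing only read methods (whitelist style)."""

    def __init__(self, *args, **kwargs):
        self._data = dict(*args, **kwargs)

    def __getitem__(self, key):
        return self._data[key]

    def __iter__(self):
        return iter(self._data)

    def __len__(self):
        return len(self._data)


d = WrappedDict(a=1)

d.__init__(a=2)        # calling __init__() again replaces the content
after_reinit = d["a"]

d._data["b"] = 3       # the "private" dict is still reachable
after_poke = dict(d)
```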

Blacklist approach: inherit from dict and override write methods to raise an
exception. It is not truly read-only: it is still possible to call dict methods
on such a "frozen dictionary" to modify it.

* brownie: `brownie.datastructures.ImmutableDict
  <https://github.com/DasIch/brownie/blob/HEAD/brownie/datastructures/mappings.py>`_.
  It is hashable if keys and values are hashable. The werkzeug project has the
  same code: `werkzeug.datastructures.ImmutableDict
  <https://github.com/mitsuhiko/werkzeug/blob/master/werkzeug/datastructures.py>`_.
  ImmutableDict is used for global constants (configuration options). The Flask
  project uses werkzeug's ImmutableDict for its default configuration.
* SQLAlchemy project: `sqlalchemy.util.immutabledict
  <http://hg.sqlalchemy.org/sqlalchemy/file/tip/lib/sqlalchemy/util/_collections.py>`_.
  It is not hashable and has an extra method: union(). immutabledict is used
  as the default value of parameters of some functions expecting a mapping.
  Example: mapper_args=immutabledict() in SqlSoup.map().
* `Frozen dictionaries (Python recipe 414283) <http://code.activestate.com/recipes/414283/>`_
  by Oren Tirosh. It is hashable if keys and values are hashable. Included in
  the following projects:

  * lingospot: `frozendict/frozendict.py
    <http://code.google.com/p/lingospot/source/browse/trunk/frozendict/frozendict.py>`_
  * factor-graphics: frozendict type in `python/fglib/util_ext_frozendict.py
    <https://github.com/ih/factor-graphics/blob/41006fb71a09377445cc140489da5ce8eeb9c8b1/python/fglib/util_ext_frozendict.py>`_

* The gsakkis-utils project written by George Sakkis includes a frozendict
  type: `datastructs.frozendict
  <http://code.google.com/p/gsakkis-utils/source/browse/trunk/datastructs/frozendict.py>`_
* characters: `scripts/python/frozendict.py
  <https://github.com/JasonGross/characters/blob/15a2af5f7861cd33a0dbce70f1569cda74e9a1e3/scripts/python/frozendict.py#L1>`_.
  It is hashable. __init__() sets __init__ to None.
* Old NLTK (1.x): `nltk.util.frozendict
  <http://nltk.googlecode.com/svn/trunk/nltk-old/src/nltk/util.py>`_. Keys and
  values must be hashable. __init__() can be called twice to modify the
  mapping. frozendict is used to "freeze" an object.

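The bypass mentioned for the blacklist approach can be sketched as follows.
``ImmutableDict`` here is a hypothetical dict subclass written for this
illustration, not the code of any of the listed projects.

```python
class ImmutableDict(dict):
    """Hypothetical dict subclass overriding write methods (blacklist style)."""

    def _readonly(self, *args, **kwargs):
        raise TypeError("ImmutableDict is read-only")

    # Every write method that is remembered gets blocked...
    __setitem__ = __delitem__ = _readonly
    clear = pop = popitem = setdefault = update = _readonly


d = ImmutableDict(a=1)

try:
    d["a"] = 2
    blocked = False
except TypeError:
    blocked = True

# ...but the base class methods still work on the instance.
dict.__setitem__(d, "a", 2)
bypassed_value = d["a"]
```

Any write method left out of the override list, or added to ``dict`` in a
later Python version, is a second way around the protection.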

Hashable dict: inherit from dict and just add a __hash__ method.

* `pypy.rpython.lltypesystem.lltype.frozendict
  <https://bitbucket.org/pypy/pypy/src/1f49987cc2fe/pypy/rpython/lltypesystem/lltype.py#cl-86>`_.
  It is hashable but does not prevent modification of the mapping.
* factor-graphics: hashabledict type in `python/fglib/util_ext_frozendict.py
  <https://github.com/ih/factor-graphics/blob/41006fb71a09377445cc140489da5ce8eeb9c8b1/python/fglib/util_ext_frozendict.py>`_

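A hashable but still mutable dict is fragile when used as a key, as this
sketch suggests. ``HashableDict`` is a hypothetical class written for this
illustration; it is not the code of the projects listed above.

```python
class HashableDict(dict):
    """Hypothetical dict subclass that only adds __hash__()."""

    def __hash__(self):
        return hash(frozenset(self.items()))


key = HashableDict(a=1)
table = {key: "value"}

# Nothing prevents the key from being modified after insertion.
key["a"] = 2

# The entry is still stored, but an equal-looking key no longer finds it:
# the slot remembers the old hash while the stored key now compares
# differently.
stale_lookup = HashableDict(a=1) in table
entries = len(table)
```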

Links
=====

* `Issue #14162: PEP 416: Add a builtin frozendict type
  <http://bugs.python.org/issue14162>`_
* PEP 412: Key-Sharing Dictionary
  (`issue #13903 <http://bugs.python.org/issue13903>`_)
* :pep:`351`: The freeze protocol
* `The case for immutable dictionaries; and the central misunderstanding of
  PEP 351 <http://www.cs.toronto.edu/~tijmen/programming/immutableDictionaries.html>`_
* `make dictproxy object via ctypes.pythonapi and type() (Python recipe
  576540) <http://code.activestate.com/recipes/576540/>`_ by Ikkei Shimomura.
* Python security modules implementing read-only object proxies using a C
  extension:

  * `pysandbox <https://github.com/vstinner/pysandbox/>`_
  * `mxProxy <http://www.egenix.com/products/python/mxBase/mxProxy/>`_
  * `zope.proxy <http://pypi.python.org/pypi/zope.proxy>`_
  * `zope.security <http://pypi.python.org/pypi/zope.security>`_


Copyright
=========

This document has been placed in the public domain.