PEP 456: drop pluggable and go for compile time configuration of the hash algorithm.

Christian Heimes 2013-10-06 14:28:39 +02:00
parent 75c006347a
commit 781581bd26
1 changed file with 34 additions and 47 deletions

@@ -1,5 +1,5 @@
 PEP: 456
-Title: Pluggable and secure hash algorithm
+Title: Secure and interchangeable hash algorithm
 Version: $Revision$
 Last-Modified: $Date$
 Author: Christian Heimes <christian@python.org>
@@ -8,16 +8,16 @@ Type: Standards Track
 Content-Type: text/x-rst
 Created: 27-Sep-2013
 Python-Version: 3.4
-Post-History:
+Post-History: 06-Oct-2013
 Abstract
 ========
 This PEP proposes SipHash as default string and bytes hash algorithm to properly
-fix hash randomization once and for all. It also proposes an addition to
-Python's C API in order to make the hash code pluggable. The new API allows to
-select the algorithm on startup as well as the addition of more hash algorithms.
+fix hash randomization once and for all. It also proposes modifications to
+Python's C code in order to unify the hash code and to make it easily
+interchangeable.
 Rationale
@@ -57,10 +57,8 @@ This PEP proposes three major changes to the hash code for strings and bytes:
 ``Objects/object.c`` and ``Objects/unicodeobject.c``. The function takes a
 void pointer plus length and returns the hash for it.
-* The algorithm can be selected by the user with an environment variable,
-command line argument or with an API function (for embedders). FNV is
-guaranteed to exist on all platforms. SipHash is available on the majority
-of modern systems.
+* The algorithm can be selected at compile time. FNV is guaranteed to exist
+on all platforms. SipHash is available on the majority of modern systems.
 Requirements for a hash function
@@ -321,50 +319,25 @@ hash function table
 type definition::
 typedef struct {
-PyHash_Func hashfunc; /* function pointer */
+PyHash_Func hash; /* function pointer */
 char *name; /* name of the hash algorithm and variant */
 int hash_bits; /* internal size of hash value */
 int seed_bits; /* size of seed input */
-int precedence; /* ranking for auto-selection */
-} PyHash_FuncDef;
+} _PyHash_FuncDef;
-PyAPI_DATA(PyHash_FuncDef *) PyHash_FuncTable;
+PyAPI_DATA(_PyHash_FuncDef *) _PyHash_Func;
 Implementation::
-PyHash_FuncDef hash_func_table[] = {
-{fnv, "fnv", 64, 128, 10},
+#ifndef PY_HASH_FUNC
 #ifdef PY_UINT64_T
-{siphash24, "sip24", sizeof(Py_hash_t)*8, sizeof(Py_hash_t)*8, 20},
+_PyHash_Func = {siphash24, "sip24", 64, 128}
+#else
+_PyHash_Func = {fnv, "fnv", 8 * sizeof(Py_hash_t), 16 * sizeof(Py_hash_t)}
+#endif
 #endif
-{NULL, NULL},
-};
-PyHash_FuncDef *PyHash_FuncTable = hash_func_table;
+TODO: select hash algorithm with autoconf variable
-hash function API
------------------
-function proto types::
-PyAPI_FUNC(int) PyHash_SetHashAlgorithm(char *name);
-PyAPI_FUNC(PyHash_FuncDef *) PyHash_GetHashAlgorithm(void);
-PyAPI_DATA(PyHash_FuncDef *) _PyHash_Func;
-``PyHash_SetHashAlgorithm(NULL)`` selects the hash algorithm with the highest
-precedence. ``PyHash_SetHashAlgorithm("sip24")`` selects siphash24 as hash
-algorithm. The function returns ``0`` on success. In case the algorithm is
-not supported or a hash algorithm is already set it returns ``-1``.
-(XXX use enum?)
-``PyHash_GetHashAlgorithm()`` returns a pointer to current hash function
-definition or `NULL`.
-``_PyHash_Func`` holds the set hash function definition. It can't be modified
-or reset once a hash algorithm is set.
 Python API addition
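Illustrative note on the hunk above: after this change a build carries exactly one function-definition record (hash callable, name, hash_bits, seed_bits) instead of a table with precedences. The following is a rough Python model of that record, not the C implementation. ``HashFuncDef``, ``fnv1a_64``, ``FNV_DEF`` and ``SIZEOF_PY_HASH_T`` are hypothetical names introduced for this sketch, the hash is textbook 64-bit FNV-1a rather than CPython's modified FNV, and an 8-byte ``Py_hash_t`` is assumed.

```python
from typing import Callable, NamedTuple

# Standard 64-bit FNV-1a parameters (textbook FNV, not CPython's variant).
FNV64_OFFSET_BASIS = 0xCBF29CE484222325
FNV64_PRIME = 0x100000001B3
MASK64 = 0xFFFFFFFFFFFFFFFF


def fnv1a_64(data: bytes) -> int:
    """Hash a byte string with 64-bit FNV-1a."""
    h = FNV64_OFFSET_BASIS
    for byte in data:
        h ^= byte                        # mix in the next input byte
        h = (h * FNV64_PRIME) & MASK64   # multiply, keep the low 64 bits
    return h


class HashFuncDef(NamedTuple):
    """Hypothetical Python stand-in for the C struct _PyHash_FuncDef."""
    hash: Callable[[bytes], int]  # counterpart of the function pointer
    name: str                     # name of the hash algorithm and variant
    hash_bits: int                # internal size of hash value
    seed_bits: int                # size of seed input


# Assumption: an 8-byte Py_hash_t, as on common 64-bit builds.
SIZEOF_PY_HASH_T = 8

# Mirrors the fallback branch of the diff:
# {fnv, "fnv", 8 * sizeof(Py_hash_t), 16 * sizeof(Py_hash_t)}
FNV_DEF = HashFuncDef(fnv1a_64, "fnv",
                      8 * SIZEOF_PY_HASH_T, 16 * SIZEOF_PY_HASH_T)

if __name__ == "__main__":
    print(FNV_DEF.name, FNV_DEF.hash_bits, FNV_DEF.seed_bits)
    print(hex(FNV_DEF.hash(b"a")))
```

Callers only ever go through the single record (``FNV_DEF.hash(...)`` here, ``_PyHash_Func->hash(...)`` in the diff), which is what makes the algorithm interchangeable at build time without a runtime selection API.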
@@ -379,9 +352,8 @@ algorithm as well as all available algorithms.
 ::
 sys.hash_info(algorithm='siphash24',
-available_algorithms=('siphash24', 'fnv'),
 hash_bits=64,
-hash_output=64, # sizeof(Py_hash_t)*8
+hash_output=64, # 8 * sizeof(Py_hash_t)
 seed_bits=128)
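Illustrative note on the hunk above: the fields that remain after this change can be read at runtime as sketched below. This assumes a released CPython 3.4 or later, where ``sys.hash_info`` has ``algorithm``, ``hash_bits`` and ``seed_bits`` next to the older ``width`` field. The sketch deliberately does not use the draft's ``hash_output`` name, and ``describe_hash_info`` is a hypothetical helper.

```python
import sys


def describe_hash_info() -> dict:
    """Collect the hash-related fields of sys.hash_info into a dict."""
    info = sys.hash_info
    return {
        "algorithm": info.algorithm,  # name of the str/bytes hash algorithm
        "hash_bits": info.hash_bits,  # internal output size of the algorithm
        "seed_bits": info.seed_bits,  # size of the seed key
        "width": info.width,          # width in bits of hash values
    }


if __name__ == "__main__":
    for field, value in describe_hash_info().items():
        print(f"{field} = {value}")
```

The reported algorithm name depends on the interpreter version and build options, so code should treat it as informational rather than compare it against a fixed string.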
@@ -439,7 +411,7 @@ multiplied with the size of the internal unicode kind::
 if (PyUnicode_READY(u) == -1)
 return -1;
-x = _PyHash_Func->hashfunc(PyUnicode_DATA(u),
+x = _PyHash_Func->hash(PyUnicode_DATA(u),
 PyUnicode_GET_LENGTH(u) * PyUnicode_KIND(u));
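Illustrative note on the hunk above: the second argument passed to the hash function is a byte count, namely the number of code points multiplied by the per-character width (1, 2 or 4 bytes) of the string's internal representation. The Python sketch below models that arithmetic only. ``unicode_kind`` and ``hashed_byte_length`` are hypothetical helpers that infer the width from the largest code point, which is how the compact string representation chooses it.

```python
def unicode_kind(s: str) -> int:
    """Bytes per character of the narrowest representation holding s."""
    largest = max(map(ord, s), default=0)
    if largest < 0x100:
        return 1   # every code point fits in one byte
    if largest < 0x10000:
        return 2   # every code point fits in two bytes
    return 4       # at least one code point needs four bytes


def hashed_byte_length(s: str) -> int:
    """Model of PyUnicode_GET_LENGTH(u) * PyUnicode_KIND(u)."""
    return len(s) * unicode_kind(s)


if __name__ == "__main__":
    for text in ("abc", "a\u20ac", "a\U0001F600"):
        print(repr(text), unicode_kind(text), hashed_byte_length(text))
```

One wide character therefore widens the whole buffer: ``"a\U0001F600"`` is two code points but eight bytes of hash input.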
@@ -534,6 +506,19 @@ the past, but are not subject of this PEP.
 prefixes are stored within the tree structure.
+Discussion
+==========
+Pluggable
+---------
+The first draft of this PEP made the hash algorithm pluggable at runtime. It
+supported multiple hash algorithms in one binary to give the user the
+possibility to select a hash algorithm at startup. The approach was considered
+an unnecessary complication by several core committers [pluggable]_. Subsequent
+versions of the PEP aim for compile time configuration.
 Reference
 =========
@@ -567,6 +552,8 @@ Reference
 .. [aes-ni] http://en.wikipedia.org/wiki/AES_instruction_set
+.. [pluggable] https://mail.python.org/pipermail/python-dev/2013-October/129138.html
 Copyright
 =========