PEP: 456
Title: Pluggable and secure hash algorithm
Version: $Revision$
Last-Modified: $Date$
Author: Christian Heimes <christian@python.org>
Status: Draft
Type: Standards Track
Content-Type: text/x-rst
Created: 27-Sep-2013
Python-Version: 3.4
Post-History:


Abstract
========

This PEP proposes SipHash as the default string and bytes hash algorithm to
properly fix hash randomization once and for all. It also proposes an addition
to Python's C API in order to make the hash code pluggable. The new API allows
selecting the algorithm at startup and adding more hash algorithms.


Rationale
=========

Despite the last attempt [issue13703]_ CPython is still vulnerable to hash
collision DoS attacks [29c3]_ [issue14621]_. The current hash algorithm and
its randomization are not resilient against attacks. Only a proper
cryptographic hash function prevents the extraction of secret randomization
keys. Although no practical attack against a Python-based service has been
seen yet, the weakness has to be fixed. Jean-Philippe Aumasson and Daniel
J. Bernstein have already shown how the seed for the current implementation
can be recovered [poc]_.

Furthermore, the current hash algorithm is hard-coded and implemented multiple
times for bytes and three different Unicode representations: UCS1, UCS2 and
UCS4. This makes it impossible for embedders to replace it with a different
implementation without patching and recompiling large parts of the interpreter.
Embedders may want to choose a more suitable hash function.

Finally, the current implementation does not perform well. In the common
case it only processes one or two bytes per cycle. On a modern 64-bit processor
the code can easily be adjusted to deal with eight bytes at once.

This PEP proposes four major changes to the hash code for strings and bytes:

* SipHash [sip]_ is introduced as the default hash algorithm. It is fast and
  small despite its cryptographic properties. Because it was designed by
  well-known security and crypto experts, it is safe to assume that it is
  secure for the near future.

* The existing FNV code is kept for platforms without a 64-bit data type. The
  algorithm is optimized to process larger chunks per cycle.

* Calculation of the hash of strings and bytes is moved into a single API
  function instead of multiple specialized implementations in
  ``Objects/object.c`` and ``Objects/unicodeobject.c``. The function takes a
  void pointer plus length and returns the hash for it.

* The algorithm can be selected by the user with an environment variable,
  command line argument or with an API function (for embedders). FNV is
  guaranteed to exist on all platforms. SipHash is available on the majority
  of modern systems.


Requirements for a hash function
================================

* It MUST be able to hash arbitrarily large blocks of memory from 1 byte up
  to the maximum ``ssize_t`` value.

* It MUST produce at least 32 bits on 32-bit platforms and at least 64 bits
  on 64-bit platforms. (Note: Larger outputs can be compressed with e.g.
  ``v ^ (v >> 32)``.)

* It MUST support hashing of unaligned memory in order to support
  ``hash(memoryview)``.

* It MUST NOT return ``-1``. The value is reserved for error cases and hash
  values that are not cached yet. (Note: A special case can be added to map
  ``-1`` to ``-2``.)

* It is highly RECOMMENDED that the length of the input influences the
  outcome, so that ``hash(b'\00') != hash(b'\x00\x00')``.

* It MAY return ``0`` for zero length input in order to disguise the
  randomization seed. (Note: This can be handled as a special case, too.)


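The notes attached to the requirements above (folding a wide digest, reserving
``-1`` and returning ``0`` for empty input) can be sketched in a few lines of
Python. This is only an illustration: ``finalize`` and ``HASH_BITS`` are
hypothetical names, not part of the proposed API, and the raw digest is assumed
to be an unsigned 64-bit integer.

```python
import sys

# Width of Py_hash_t in bits on the running interpreter (32 or 64).
HASH_BITS = sys.maxsize.bit_length() + 1


def finalize(raw, length, out_bits=HASH_BITS):
    """Post-process an unsigned 64-bit digest (hypothetical helper)."""
    if length == 0:
        return 0  # optional special case that hides the seed for empty input
    if out_bits == 32:
        # compress a larger output with v ^ (v >> 32)
        raw = (raw ^ (raw >> 32)) & 0xFFFFFFFF
    # reinterpret the unsigned value as a signed integer of out_bits width
    if raw >= 1 << (out_bits - 1):
        raw -= 1 << out_bits
    # -1 is reserved for errors and uncached values, so map it to -2
    return -2 if raw == -1 else raw
```

Handling the special cases in one wrapper keeps them out of the hash algorithm
itself, so any algorithm that yields a wide enough digest can be plugged in.

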
Current implementation with modified FNV
========================================

CPython currently uses a variant of the Fowler-Noll-Vo hash function
[fnv]_. The variant has been modified to reduce the number and cost of hash
collisions for common strings. The first character of the string is added
twice, the first time with a bit shift of 7. The length of the input
string is XOR-ed to the final value. Both deviations from the original FNV
algorithm reduce the number of hash collisions for short strings.

Recently [issue13703]_ a random prefix and suffix were added as an attempt to
randomize the hash values. In order to protect the hash secret the code still
returns ``0`` for zero length input.

C code::

    Py_uhash_t x;
    Py_ssize_t len;
    /* p is either 1, 2 or 4 byte type */
    unsigned char *p;
    Py_UCS2 *p;
    Py_UCS4 *p;

    if (len == 0)
        return 0;
    x = (Py_uhash_t) _Py_HashSecret.prefix;
    x ^= (Py_uhash_t) *p << 7;
    for (i = 0; i < len; i++)
        x = (1000003 * x) ^ (Py_uhash_t) *p++;
    x ^= (Py_uhash_t) len;
    x ^= (Py_uhash_t) _Py_HashSecret.suffix;
    return x;


Which roughly translates to Python::

    import sys

    def fnv(p):
        # p is a str; hashsecret stands for the random prefix and suffix
        if len(p) == 0:
            return 0

        # bit mask, 2**32-1 or 2**64-1
        mask = 2 * sys.maxsize + 1

        x = hashsecret.prefix
        x = (x ^ (ord(p[0]) << 7)) & mask
        for c in p:
            x = ((1000003 * x) ^ ord(c)) & mask
        x = (x ^ len(p)) & mask
        x = (x ^ hashsecret.suffix) & mask

        # reinterpret the unsigned value as a signed Py_hash_t
        if x > sys.maxsize:
            x -= mask + 1
        if x == -1:
            x = -2

        return x


FNV is a simple multiply and XOR algorithm with no cryptographic properties.
The randomization was not part of the initial hash code, but was added as a
countermeasure against hash collision attacks as explained in oCERT-2011-003
[ocert]_. Because FNV is not a cryptographic hash algorithm and the dict
implementation is not fortified against side channel analysis, the
randomization secrets can be calculated by a remote attacker. The author of
this PEP strongly believes that the nature of a non-cryptographic hash
function makes it impossible to conceal the secrets.


Examined hashing algorithms
===========================

The author of this PEP has researched several hashing algorithms that are
considered modern, fast and state-of-the-art.

SipHash
-------

SipHash [sip]_ is a cryptographic pseudo random function with a 128-bit seed
and 64-bit output. It was designed by Jean-Philippe Aumasson and Daniel J.
Bernstein as a fast and secure keyed hash algorithm. It's used by Ruby, Perl,
OpenDNS, Rust, Redis, FreeBSD and more. The C reference implementation has
been released under the CC0 license (public domain).

Quote from SipHash's site:

    SipHash is a family of pseudorandom functions (a.k.a. keyed hash
    functions) optimized for speed on short messages. Target applications
    include network traffic authentication and defense against hash-flooding
    DoS attacks.

siphash24 is the recommended variant with the best performance. It uses 2
rounds per message block and 4 finalization rounds. Besides the reference
implementation several other implementations are available. Some are
single-shot functions, others use a Merkle–Damgård construction-like approach
with init, update and finalize functions. Marek Majkowski's C implementation
csiphash [csiphash]_ defines the prototype of the function. (Note: ``k`` is
split up into two uint64_t)::

    uint64_t siphash24(const void *src,
                       unsigned long src_sz,
                       const char k[16]);

SipHash requires a 64-bit data type and is not compatible with pure C89
platforms.


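To make the round structure concrete, the following is a minimal pure-Python
sketch of SipHash-2-4 as described above: two rounds per 8-byte message block
and four finalization rounds. It is meant for illustration and testing only.
It is neither the reference implementation nor the code proposed for CPython,
and the helper names ``_rotl`` and ``_sipround`` are made up for this example.

```python
import struct

MASK = 0xFFFFFFFFFFFFFFFF  # all arithmetic is modulo 2**64


def _rotl(x, b):
    """Rotate the 64-bit value x left by b bits."""
    return ((x << b) | (x >> (64 - b))) & MASK


def _sipround(v0, v1, v2, v3):
    """One SipRound: add, rotate and XOR on the four state words."""
    v0 = (v0 + v1) & MASK
    v1 = _rotl(v1, 13) ^ v0
    v0 = _rotl(v0, 32)
    v2 = (v2 + v3) & MASK
    v3 = _rotl(v3, 16) ^ v2
    v0 = (v0 + v3) & MASK
    v3 = _rotl(v3, 21) ^ v0
    v2 = (v2 + v1) & MASK
    v1 = _rotl(v1, 17) ^ v2
    v2 = _rotl(v2, 32)
    return v0, v1, v2, v3


def siphash24(key, data):
    """Return the 64-bit SipHash-2-4 of data under a 16-byte key."""
    k0, k1 = struct.unpack("<QQ", key)  # the key is split into two uint64
    v0 = k0 ^ 0x736F6D6570736575
    v1 = k1 ^ 0x646F72616E646F6D
    v2 = k0 ^ 0x6C7967656E657261
    v3 = k1 ^ 0x7465646279746573

    n = len(data)
    end = n - n % 8
    blocks = [int.from_bytes(data[i:i + 8], "little") for i in range(0, end, 8)]
    # final block: leftover bytes, with the input length in the top byte
    blocks.append(((n & 0xFF) << 56) | int.from_bytes(data[end:], "little"))

    for m in blocks:
        v3 ^= m
        for _ in range(2):  # 2 compression rounds per block
            v0, v1, v2, v3 = _sipround(v0, v1, v2, v3)
        v0 ^= m

    v2 ^= 0xFF
    for _ in range(4):  # 4 finalization rounds
        v0, v1, v2, v3 = _sipround(v0, v1, v2, v3)
    return v0 ^ v1 ^ v2 ^ v3
```

Because the input length is mixed into the last block, inputs that differ only
in the number of trailing zero bytes produce different digests, which matches
the recommendation in the requirements section.

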
MurmurHash
----------

MurmurHash [murmur]_ is a family of non-cryptographic keyed hash functions
developed by Austin Appleby. Murmur3 is the latest variant of MurmurHash and
a fast one. The C++ reference implementation has been released into the public
domain. It features 32- or 128-bit output with a 32-bit seed. (Note: The out
parameter is a buffer of either 4 or 16 bytes.)

Murmur3's function prototypes are::

    void MurmurHash3_x86_32(const void *key,
                            int len,
                            uint32_t seed,
                            void *out);

    void MurmurHash3_x86_128(const void *key,
                             int len,
                             uint32_t seed,
                             void *out);

    void MurmurHash3_x64_128(const void *key,
                             int len,
                             uint32_t seed,
                             void *out);

The 128-bit variants require a 64-bit data type and are not compatible with
pure C89 platforms. The 32-bit variant is fully C89-compatible.

Aumasson, Bernstein and Boßlet have shown [sip]_ [ocert-2012-001]_ that
Murmur3 is not resilient against hash collision attacks. Therefore Murmur3
can no longer be considered a secure algorithm. It may still be an
alternative if hash collision attacks are of no concern.

CityHash
--------

CityHash [city]_ is a family of non-cryptographic hash functions developed by
Geoff Pike and Jyrki Alakuijala for Google. The C++ reference implementation
has been released under the MIT license. The algorithm is partly based on
MurmurHash and claims to be faster. It supports 64- and 128-bit output with a
128-bit seed as well as 32-bit output without a seed.

The relevant function prototype for 64-bit CityHash with 128-bit seed is::

    uint64 CityHash64WithSeeds(const char *buf,
                               size_t len,
                               uint64 seed0,
                               uint64 seed1);

CityHash also offers SSE 4.2 optimizations with the CRC32 intrinsic for long
inputs. All variants except CityHash32 require 64-bit data types. CityHash32
uses only 32-bit data types but it doesn't support seeding.

As with MurmurHash, Aumasson, Bernstein and Boßlet have shown [sip]_ a similar
weakness in CityHash.


HMAC, MD5, SHA-1, SHA-2
-----------------------

These hash algorithms are too slow and have high setup and finalization costs.
For these reasons they are not considered fit for this purpose.


AES CMAC
--------

Modern AMD and Intel CPUs have AES-NI (AES instruction set) [aes-ni]_ to speed
up AES encryption. CMAC with AES-NI might be a viable option but it's probably
too slow for daily operation. (testing required)


Conclusion
----------

SipHash provides the best combination of speed and security. Developers of
other prominent projects have come to the same conclusion.


C API additions
===============

None of the C API additions are part of the stable API.

hash secret
-----------

The ``_Py_HashSecret_t`` type of Python 2.6 to 3.3 has two members with either
32- or 64-bit length each. SipHash requires two 64-bit unsigned integers as
keys. The typedef will be changed to a union with a guaranteed size of 128 bits
on all architectures. On platforms with a 64-bit data type it will have two
``uint64`` members. Because C89 compatible compilers may not have ``uint64``
the union also has an array of 16 chars.

new type definition::

    typedef union {
        unsigned char uc16[16];
        struct {
            Py_hash_t prefix;
            Py_hash_t suffix;
        } ht;
    #ifdef PY_UINT64_T
        struct {
            PY_UINT64_T k0;
            PY_UINT64_T k1;
        } ui64;
    #endif
    } _Py_HashSecret_t;

    PyAPI_DATA(_Py_HashSecret_t) _Py_HashSecret;

``_Py_HashSecret_t`` is initialized in ``Python/random.c:_PyRandom_Init()``
exactly once at startup.

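The layout of the union can be modelled with ``ctypes`` to check the size
guarantee and to see how the three views share the same 16 bytes. This is a
sketch for illustration only: the class names are hypothetical, and
``c_ssize_t`` is used here as a stand-in for ``Py_hash_t``.

```python
import ctypes
import sys


class _HT(ctypes.Structure):
    # prefix/suffix view used by the FNV code
    _fields_ = [("prefix", ctypes.c_ssize_t), ("suffix", ctypes.c_ssize_t)]


class _UI64(ctypes.Structure):
    # two 64-bit keys as required by SipHash
    _fields_ = [("k0", ctypes.c_uint64), ("k1", ctypes.c_uint64)]


class HashSecret(ctypes.Union):
    # all members overlap; the 16-byte array fixes the size at 128 bits
    _fields_ = [("uc16", ctypes.c_ubyte * 16), ("ht", _HT), ("ui64", _UI64)]


seed = bytes(range(16))
secret = HashSecret.from_buffer_copy(seed)  # fill the union from 16 raw bytes
```

``ctypes.sizeof(HashSecret)`` is 16 on both 32- and 64-bit builds, and
``secret.ui64.k0`` reads the first eight seed bytes in native byte order.
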
hash function
-------------

function prototype::

    typedef Py_hash_t (*PyHash_Func)(const void *, Py_ssize_t);


hash function table
-------------------

type definition::

    typedef struct {
        PyHash_Func hashfunc;  /* function pointer */
        char *name;            /* name of the hash algorithm and variant */
        int hash_bits;         /* internal size of hash value */
        int seed_bits;         /* size of seed input */
        int precedence;        /* ranking for auto-selection */
    } PyHash_FuncDef;

    PyAPI_DATA(PyHash_FuncDef *) PyHash_FuncTable;

Implementation::

    PyHash_FuncDef hash_func_table[] = {
        {fnv, "fnv", 64, 128, 10},
    #ifdef PY_UINT64_T
        {siphash24, "sip24", sizeof(Py_hash_t)*8, sizeof(Py_hash_t)*8, 20},
    #endif
        {NULL, NULL},
    };

    PyHash_FuncDef *PyHash_FuncTable = hash_func_table;


hash function API
-----------------

function prototypes::

    PyAPI_FUNC(int) PyHash_SetHashAlgorithm(char *name);

    PyAPI_FUNC(PyHash_FuncDef *) PyHash_GetHashAlgorithm(void);

    PyAPI_DATA(PyHash_FuncDef *) _PyHash_Func;

``PyHash_SetHashAlgorithm(NULL)`` selects the hash algorithm with the highest
precedence. ``PyHash_SetHashAlgorithm("sip24")`` selects siphash24 as hash
algorithm. The function returns ``0`` on success. In case the algorithm is
not supported or a hash algorithm is already set it returns ``-1``.
(XXX use enum?)

``PyHash_GetHashAlgorithm()`` returns a pointer to the current hash function
definition or ``NULL``.

``_PyHash_Func`` holds the set hash function definition. It can't be modified
or reset once a hash algorithm is set.


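The selection rules described above (highest precedence by default, lookup by
name, set exactly once) can be modelled in Python as follows. This is only a
behavioural sketch under stated assumptions: ``HashRegistry``, ``FuncDef`` and
the placeholder hash callables are hypothetical and stand in for the C table
and functions.

```python
from collections import namedtuple

# Python stand-in for PyHash_FuncDef
FuncDef = namedtuple("FuncDef", "hashfunc name hash_bits seed_bits precedence")


def _placeholder_fnv(data):
    return 0  # dummy callable; a real entry would point at the FNV code


def _placeholder_sip(data):
    return 0  # dummy callable; a real entry would point at siphash24


class HashRegistry:
    """Models PyHash_SetHashAlgorithm() / PyHash_GetHashAlgorithm()."""

    def __init__(self, table):
        self._table = list(table)
        self._selected = None

    def set_algorithm(self, name=None):
        """Return 0 on success, -1 if unsupported or already set."""
        if self._selected is not None:
            return -1  # the choice cannot be modified or reset
        if name is None:
            candidates = self._table
        else:
            candidates = [d for d in self._table if d.name == name]
        if not candidates:
            return -1
        # None selects the entry with the highest precedence
        self._selected = max(candidates, key=lambda d: d.precedence)
        return 0

    def get_algorithm(self):
        """Return the selected definition or None."""
        return self._selected


TABLE = [
    FuncDef(_placeholder_fnv, "fnv", 64, 128, 10),
    FuncDef(_placeholder_sip, "sip24", 64, 64, 20),
]
```

A ``HashRegistry(TABLE)`` instance picks ``sip24`` when called with no name,
and every later call to ``set_algorithm`` fails, mirroring the set-once rule.

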
Python API addition
===================

sys module
----------

The sys module grows a new struct member with information about the selected
algorithm as well as all available algorithms.

::

    sys.hash_info(algorithm='siphash24',
                  available_algorithms=('siphash24', 'fnv'),
                  hash_bits=64,
                  hash_output=64, # sizeof(Py_hash_t)*8
                  seed_bits=128)


_testcapi
---------

The ``_testcapi`` C module gets a function to hash a buffer or string object
with any supported hash algorithm. The function neither uses nor sets the
cached hash value of the object. The feature is solely intended for benchmarks
and testing.

::

    _testcapi.get_hash(name: str, str_or_buffer) -> int


Necessary modifications to C code
=================================

_Py_HashBytes (Objects/object.c)
--------------------------------

``_Py_HashBytes`` is an internal helper function that provides the hashing
code for bytes, memoryview and datetime classes. It currently implements FNV
for ``unsigned char*``. The function can either be modified to use the new
API or it could be completely removed to avoid an unnecessary level of
indirection.


bytes_hash (Objects/bytesobject.c)
----------------------------------

``bytes_hash`` uses ``_Py_HashBytes`` to provide the tp_hash slot function
for bytes objects. If ``_Py_HashBytes`` is to be removed then ``bytes_hash``
must be reimplemented.


memory_hash (Objects/memoryobject.c)
------------------------------------

``memory_hash`` provides the tp_hash slot function for read-only memory
views if the original object is hashable, too. It's the only function that
has to support hashing of unaligned memory segments in the future.


unicode_hash (Objects/unicodeobject.c)
--------------------------------------

``unicode_hash`` provides the tp_hash slot function for unicode. Right now it
implements the FNV algorithm three times for ``unsigned char*``, ``Py_UCS2``
and ``Py_UCS4``. A reimplementation of the function must take care to use the
correct length. Since the macro ``PyUnicode_GET_LENGTH`` returns the length
of the unicode string and not its size in octets, the length must be
multiplied by the size of the internal unicode kind::

    if (PyUnicode_READY(u) == -1)
        return -1;
    x = _PyHash_Func->hashfunc(PyUnicode_DATA(u),
                               PyUnicode_GET_LENGTH(u) * PyUnicode_KIND(u));


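The length calculation can be illustrated in Python. The sketch below derives
the PEP 393 kind (1, 2 or 4 bytes per code point) from the largest code point
in a string and multiplies it by the string length. ``kind_of`` and
``hashed_size`` are hypothetical helper names used only for this example.

```python
def kind_of(text):
    """Bytes per code point in the compact representation (1, 2 or 4)."""
    top = max(map(ord, text), default=0)
    if top < 0x100:
        return 1  # UCS1
    if top < 0x10000:
        return 2  # UCS2
    return 4      # UCS4


def hashed_size(text):
    """Number of octets that would be passed to the hash function."""
    return len(text) * kind_of(text)
```

For a pure ASCII string the result equals ``len(text)``. A single code point
outside the Basic Multilingual Plane makes every character count four bytes.

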
generic_hash (Modules/_datetimemodule.c)
----------------------------------------

``generic_hash`` acts as a wrapper around ``_Py_HashBytes`` for the tp_hash
slots of date, time and datetime types. timedelta objects are hashed by their
state (days, seconds, microseconds) and tzinfo objects are not hashable. The
data members of the date, time and datetime structs are not ``void*`` aligned.
This can easily be fixed by memcpy()ing four to ten bytes to an aligned
buffer.


Further things to consider
==========================

ASCII str / bytes hash collision
--------------------------------

Since the implementation of [pep-0393]_ bytes and ASCII text have the same
memory layout. Because of this the new hashing API will keep the invariant::

    hash("ascii string") == hash(b"ascii string")

for ASCII string and ASCII bytes. Equal hash values result in a hash collision
and therefore cause a minor speed penalty for dicts and sets with mixed keys.
The cause of the collision could be removed by e.g. subtracting ``2`` from
the hash value of bytes. (``2`` because ``hash(b"") == 0`` and ``-1`` is
reserved.)


Performance
===========

TBD

First tests suggest that SipHash performs a bit faster on 64-bit CPUs when
it is fed with medium size byte strings as well as ASCII and UCS2 Unicode
strings. For very short strings the setup costs for SipHash dominate its
speed but it is still in the same order of magnitude as the current FNV code.

It's yet unknown how the new distribution of hash values affects collisions
of common keys in dicts of Python classes.

Serhiy Storchaka has shown in [issue16427]_ that a modified FNV
implementation with 64 bits per cycle is able to process long strings several
times faster than the current FNV implementation.


Grand Unified Python Benchmark Suite
------------------------------------

Initial tests with an experimental implementation and the Grand Unified Python
Benchmark Suite have shown minimal deviations. The summarized total runtime
of the benchmark is within 1% of the runtime of an unmodified Python 3.4
binary. The tests were run on an Intel i7-2860QM machine with a 64-bit Linux
installation. The interpreter was compiled with GCC 4.7 for 64- and 32-bit.

More benchmarks will be conducted.


Backwards Compatibility
=======================

The modifications don't alter any existing API.

The output of ``hash()`` for strings and bytes is going to be different. The
hash values for ASCII Unicode and ASCII bytes will stay equal.


Alternative counter measures against hash collision DoS
=======================================================

Three alternative countermeasures against hash collisions were discussed in
the past, but are not the subject of this PEP.

1. Marc-Andre Lemburg has suggested that dicts shall count hash collisions. In
   case an insert operation causes too many collisions an exception shall be
   raised.

2. Some applications (e.g. PHP) limit the number of keys for GET and POST
   HTTP requests. The approach effectively mitigates the impact of a hash
   collision attack. (XXX citation needed)

3. Hash maps have a worst case of O(n) for insertion and lookup of keys. This
   results in a quadratic runtime during a hash collision attack. The
   introduction of a new and additional data structure with O(log n)
   worst case behavior would eliminate the root cause. Data structures like
   red-black trees or prefix trees (tries [trie]_) would have other benefits,
   too. Prefix trees with string keys can reduce memory usage as common
   prefixes are stored within the tree structure.


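The quadratic behaviour mentioned in the last item can be observed without
timing anything, by counting key comparisons. The following sketch uses a
hypothetical key class ``Colliding`` whose instances all share one hash value,
which is the situation an attacker tries to create with crafted strings.

```python
class Colliding:
    """Key type whose instances all hash to the same value."""

    comparisons = 0  # class-wide counter of __eq__ calls

    def __init__(self, ident):
        self.ident = ident

    def __hash__(self):
        return 42  # every key lands in the same probe sequence

    def __eq__(self, other):
        Colliding.comparisons += 1
        return self.ident == other.ident


def count_comparisons(n):
    """Insert n colliding keys into a dict and return the __eq__ count."""
    Colliding.comparisons = 0
    table = {}
    for i in range(n):
        table[Colliding(i)] = i
    return Colliding.comparisons
```

Each insert has to compare the new key against every key already stored, so
``count_comparisons(n)`` is at least ``n * (n - 1) // 2``. Doubling the number
of keys roughly quadruples the work.

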
Reference
=========

.. [29c3] http://events.ccc.de/congress/2012/Fahrplan/events/5152.en.html

.. [fnv] http://en.wikipedia.org/wiki/Fowler-Noll-Vo_hash_function

.. [sip] https://131002.net/siphash/

.. [ocert] http://www.nruns.com/_downloads/advisory28122011.pdf

.. [ocert-2012-001] http://www.ocert.org/advisories/ocert-2012-001.html

.. [poc] https://131002.net/siphash/poc.py

.. [issue13703] http://bugs.python.org/issue13703

.. [issue14621] http://bugs.python.org/issue14621

.. [issue16427] http://bugs.python.org/issue16427

.. [trie] http://en.wikipedia.org/wiki/Trie

.. [city] http://code.google.com/p/cityhash/

.. [murmur] http://code.google.com/p/smhasher/

.. [csiphash] https://github.com/majek/csiphash/

.. [pep-0393] http://www.python.org/dev/peps/pep-0393/

.. [aes-ni] http://en.wikipedia.org/wiki/AES_instruction_set


Copyright
=========

This document has been placed in the public domain.



..
   Local Variables:
   mode: indented-text
   indent-tabs-mode: nil
   sentence-end-double-space: t
   fill-column: 70
   coding: utf-8
   End: