Make Nick BDFL delegate

add string length distribution
add http://bugs.python.org/issue19183
Christian Heimes 2013-10-07 15:20:25 +02:00
parent e8c49a7e04
commit 47829f733d
1 changed files with 12 additions and 0 deletions


@ -3,6 +3,7 @@ Title: Secure and interchangeable hash algorithm
Version: $Revision$
Last-Modified: $Date$
Author: Christian Heimes <christian@python.org>
BDFL-Delegate: Nick Coghlan
Status: Draft
Type: Standards Track
Content-Type: text/x-rst
@ -461,10 +462,17 @@ speed but it is still in the same order of magnitude as the current FNV code.
It is not yet known how the new distribution of hash values affects collisions
of common keys in the dicts of Python classes.
Typical length
--------------
Serhiy Storchaka has shown in [issue16427]_ that a modified FNV
implementation with 64 bits per cycle is able to process long strings several
times faster than the current FNV implementation.
However, according to the statistics in [issue19183]_, about 50% of the strings
hashed by a typical Python program, as well as by the Python test suite, are
small strings of 1 to 6 bytes. Only 5% of the strings are longer than 16 bytes.
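The kind of length statistic quoted above can be approximated with a small
script. The following is only a minimal sketch, not the instrumentation used in
[issue19183]_: the helper name ``length_distribution``, the bucket boundaries
and the sample strings are assumptions chosen for illustration, and the lengths
are measured on the UTF-8 encoded form of each string.

```python
from collections import Counter


def length_distribution(strings, encoding="utf-8"):
    """Return the fraction of strings per byte-length bucket.

    Hypothetical helper: the buckets mirror the ranges mentioned in the
    text (1 to 6 bytes, more than 16 bytes), plus the gaps between them.
    """
    buckets = Counter()
    for s in strings:
        n = len(s.encode(encoding))
        if n == 0:
            buckets["0"] += 1
        elif n <= 6:
            buckets["1-6"] += 1
        elif n <= 16:
            buckets["7-16"] += 1
        else:
            buckets[">16"] += 1
    total = sum(buckets.values())
    if total == 0:
        return {}
    return {name: count / total for name, count in buckets.items()}


# Made-up sample of identifier-like strings, not data from the issue.
sample = ["self", "name", "x", "__init__", "a_rather_long_identifier"]
dist = length_distribution(sample)
for name, share in sorted(dist.items()):
    print(f"{name:>5}: {share:.0%}")
```

If short strings dominate, as the statistics suggest, the fixed setup and
finalization cost of a hash function matters more than its throughput on long
inputs.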
Grand Unified Python Benchmark Suite
------------------------------------
@ -526,6 +534,8 @@ versions of the PEP aim for compile time configuration.
Reference
=========
* Issue 19183 [issue19183]_ contains a reference implementation.
.. [29c3] http://events.ccc.de/congress/2012/Fahrplan/events/5152.en.html
.. [fnv] http://en.wikipedia.org/wiki/Fowler-Noll-Vo_hash_function
@ -544,6 +554,8 @@ Reference
.. [issue16427] http://bugs.python.org/issue16427
.. [issue19183] http://bugs.python.org/issue19183
.. [trie] http://en.wikipedia.org/wiki/Trie
.. [city] http://code.google.com/p/cityhash/