explain performance implications
move some paragraphs around link to benchmark and benchmark results
This commit is contained in:
parent
734743d99b
commit
41037972de
67
pep-0456.txt
67
pep-0456.txt
|
@ -296,9 +296,14 @@ Total 7921678
|
|||
However a fast function like DJBX33A is not as secure as SipHash24. A cutoff
|
||||
at about 5 to 7 bytes should provide a decent safety margin and speed up at
|
||||
the same time. The PEP's reference implementation provides such a cutoff with
|
||||
``Py_HASH_CUTOFF`` but disables the optimization by default. Multiple runs of
|
||||
Python's benchmark suite shows an average speedups between 3% and 5% for
|
||||
benchmarks such as django_v2, mako and etree with a cutoff of 7 on 64 bit Linux.
|
||||
``Py_HASH_CUTOFF``. The optimization is disabled by default for several
|
||||
reasons. For one the security implications are unclear yet and should be
|
||||
thoroughly studied before the optimization is enabled by default. Secondly
|
||||
the performance benefits vary. On 64 bit Linux system with Intel Core i7
|
||||
multiple runs of Python's benchmark suite [pybench]_ show an average speedups
|
||||
between 3% and 5% for benchmarks such as django_v2, mako and etree with a
|
||||
cutoff of 7. Benchmarks with X86 binaries and Windows X86_64 builds on the
|
||||
same machine are a bit slower with small string optimization.
|
||||
|
||||
|
||||
C API additions
|
||||
|
@ -506,33 +511,23 @@ This can easily by fixed with memcpy()ing four to ten bytes to an aligned
|
|||
buffer.
|
||||
|
||||
|
||||
Further things to consider
|
||||
==========================
|
||||
|
||||
ASCII str / bytes hash collision
|
||||
--------------------------------
|
||||
|
||||
Since the implementation of [pep-0393]_ bytes and ASCII text have the same
|
||||
memory layout. Because of this the new hashing API will keep the invariant::
|
||||
|
||||
hash("ascii string") == hash(b"ascii string")
|
||||
|
||||
for ASCII string and ASCII bytes. Equal hash values result in a hash collision
|
||||
and therefore cause a minor speed penalty for dicts and sets with mixed keys.
|
||||
The cause of the collision could be removed by e.g. subtracting ``2`` from
|
||||
the hash value of bytes. (``-2`` because ``hash(b"") == 0`` and ``-1`` is
|
||||
reserved.)
|
||||
|
||||
|
||||
Performance
|
||||
===========
|
||||
|
||||
TBD
|
||||
In general the PEP 456 code with SipHash24 is about as fast as the old code
|
||||
with FNV. SipHash24 seems to make better use of modern compilers, CPUs and
|
||||
large L1 cache. Several benchmarks show a small speed improvement on 64 bit
|
||||
CPUs such as Intel Core i5 and Intel Core i7 processes. 32 bit builds and
|
||||
benchmarks on older CPUs such as an AMD Athlon X2 are slightly slower with
|
||||
SipHash24. The performance increase or decrease are so small that they should
|
||||
not affect any application code.
|
||||
|
||||
First tests suggest that SipHash performs a bit faster on 64-bit CPUs when
|
||||
it is fed with medium size byte strings as well as ASCII and UCS2 Unicode
|
||||
strings. For very short strings the setup cost for SipHash dominates its
|
||||
speed but it is still in the same order of magnitude as the current FNV code.
|
||||
The benchmarks were conducted on CPython default branch revision b08868fd5994
|
||||
and the PEP repository [pep-456-repos]_. All upstream changes were merged
|
||||
into the pep-456 branch. The "performance" CPU governor was configured and
|
||||
almost all programs were stopped so the benchmarks were able to utilize
|
||||
TurboBoost and the CPU caches as much as possible. The raw benchmark results
|
||||
of multiple machines and platforms are made available at [benchmarks]_.
|
||||
|
||||
|
||||
Hash value distribution
|
||||
|
@ -630,6 +625,20 @@ data are always aligned. Only memoryviews can point to unaligned blocks
|
|||
under rare circumstances. The PEP implementation is optimized and simplified
|
||||
for the common case.
|
||||
|
||||
ASCII str / bytes hash collision
|
||||
--------------------------------
|
||||
|
||||
Since the implementation of [pep-0393]_ bytes and ASCII text have the same
|
||||
memory layout. Because of this the new hashing API will keep the invariant::
|
||||
|
||||
hash("ascii string") == hash(b"ascii string")
|
||||
|
||||
for ASCII string and ASCII bytes. Equal hash values result in a hash collision
|
||||
and therefore cause a minor speed penalty for dicts and sets with mixed keys.
|
||||
The cause of the collision could be removed by e.g. subtracting ``2`` from
|
||||
the hash value of bytes. ``-2`` because ``hash(b"") == 0`` and ``-1`` is
|
||||
reserved. The PEP doesn't change the hash value.
|
||||
|
||||
|
||||
References
|
||||
==========
|
||||
|
@ -672,6 +681,12 @@ References
|
|||
|
||||
.. [alignmentmyth] http://lemire.me/blog/archives/2012/05/31/data-alignment-for-speed-myth-or-reality/
|
||||
|
||||
.. [pybench] http://hg.python.org/benchmarks/
|
||||
|
||||
.. [benchmarks] https://bitbucket.org/tiran/pep-456-benchmarks/src
|
||||
|
||||
.. [pep-456-repos] http://hg.python.org/features/pep-456
|
||||
|
||||
|
||||
Copyright
|
||||
=========
|
||||
|
|
Loading…
Reference in New Issue