reSTify PEP 452 (#327)

This commit is contained in:
Huang Huang 2017-08-17 06:53:46 +08:00 committed by Brett Cannon
parent 7f8e97ca3f
commit a5e937ea43
1 changed files with 192 additions and 179 deletions

View File

@ -5,245 +5,258 @@ Last-Modified: $Date$
Author: A.M. Kuchling <amk@amk.ca>, Christian Heimes <christian@python.org> Author: A.M. Kuchling <amk@amk.ca>, Christian Heimes <christian@python.org>
Status: Draft Status: Draft
Type: Informational Type: Informational
Content-Type: text/x-rst
Created: 15-Aug-2013 Created: 15-Aug-2013
Post-History: Post-History:
Replaces: 247 Replaces: 247
Abstract Abstract
========
There are several different modules available that implement There are several different modules available that implement
cryptographic hashing algorithms such as MD5 or SHA. This cryptographic hashing algorithms such as MD5 or SHA. This
document specifies a standard API for such algorithms, to make it document specifies a standard API for such algorithms, to make it
easier to switch between different implementations. easier to switch between different implementations.
Specification Specification
=============
All hashing modules should present the same interface. Additional All hashing modules should present the same interface. Additional
methods or variables can be added, but those described in this methods or variables can be added, but those described in this
document should always be present. document should always be present.
Hash function modules define one function: Hash function modules define one function:
new([string]) (unkeyed hashes) ``new([string]) (unkeyed hashes)``
new(key, [string], [digestmod]) (keyed hashes)
Create a new hashing object and return it. The first form is ``new(key, [string], [digestmod]) (keyed hashes)``
for hashes that are unkeyed, such as MD5 or SHA. For keyed Create a new hashing object and return it. The first form is
hashes such as HMAC, 'key' is a required parameter containing for hashes that are unkeyed, such as MD5 or SHA. For keyed
a string giving the key to use. In both cases, the optional hashes such as HMAC, 'key' is a required parameter containing
'string' parameter, if supplied, will be immediately hashed a string giving the key to use. In both cases, the optional
into the object's starting state, as if obj.update(string) was 'string' parameter, if supplied, will be immediately hashed
called. into the object's starting state, as if ``obj.update(string)`` was
called.
After creating a hashing object, arbitrary bytes can be fed After creating a hashing object, arbitrary bytes can be fed
into the object using its update() method, and the hash value into the object using its ``update()`` method, and the hash value
can be obtained at any time by calling the object's digest() can be obtained at any time by calling the object's ``digest()``
method. method.
Although the parameter is called 'string', hashing objects operate Although the parameter is called 'string', hashing objects operate
on 8-bit data only. Both 'key' and 'string' must be a bytes-like on 8-bit data only. Both 'key' and 'string' must be a bytes-like
object (bytes, bytearray...). A hashing object may support object (bytes, bytearray...). A hashing object may support
one-dimensional, contiguous buffers as argument, too. Text one-dimensional, contiguous buffers as argument, too. Text
(unicode) is no longer supported in Python 3.x. Python 2.x (unicode) is no longer supported in Python 3.x. Python 2.x
implementations may take ASCII-only unicode as argument, but implementations may take ASCII-only unicode as argument, but
portable code should not rely on the feature. portable code should not rely on the feature.
Arbitrary additional keyword arguments can be added to this Arbitrary additional keyword arguments can be added to this
function, but if they're not supplied, sensible default values function, but if they're not supplied, sensible default values
should be used. For example, 'rounds' and 'digest_size' should be used. For example, 'rounds' and 'digest_size'
keywords could be added for a hash function which supports a keywords could be added for a hash function which supports a
variable number of rounds and several different output sizes, variable number of rounds and several different output sizes,
and they should default to values believed to be secure. and they should default to values believed to be secure.
Hash function modules define one variable: Hash function modules define one variable:
digest_size ``digest_size``
An integer value; the size of the digest produced by the
hashing objects created by this module, measured in bytes.
You could also obtain this value by creating a sample object
and accessing its 'digest_size' attribute, but it can be
convenient to have this value available from the module.
Hashes with a variable output size will set this variable to
None.
An integer value; the size of the digest produced by the Hashing objects require the following attribute:
hashing objects created by this module, measured in bytes.
You could also obtain this value by creating a sample object
and accessing its 'digest_size' attribute, but it can be
convenient to have this value available from the module.
Hashes with a variable output size will set this variable to
None.
Hashing objects require the following attribute: ``digest_size``
This attribute is identical to the module-level digest_size
variable, measuring the size of the digest produced by the
hashing object, measured in bytes. If the hash has a variable
output size, this output size must be chosen when the hashing
object is created, and this attribute must contain the
selected size. Therefore, ``None`` is **not** a legal value for this
attribute.
digest_size ``block_size``
An integer value or ``NotImplemented``; the internal block size
of the hash algorithm in bytes. The block size is used by the
HMAC module to pad the secret key to ``digest_size`` or to hash the
secret key if it is longer than ``digest_size``. If no HMAC
algorithm is standardized for the hash algorithm, return
``NotImplemented`` instead.
This attribute is identical to the module-level digest_size ``name``
variable, measuring the size of the digest produced by the A text string value; the canonical, lowercase name of the hashing
hashing object, measured in bytes. If the hash has a variable algorithm. The name should be a suitable parameter for
output size, this output size must be chosen when the hashing ``hashlib.new``.
object is created, and this attribute must contain the
selected size. Therefore, None is *not* a legal value for this
attribute.
block_size Hashing objects require the following methods:
An integer value or ``NotImplemented``; the internal block size ``copy()``
of the hash algorithm in bytes. The block size is used by the Return a separate copy of this hashing object. An update to
HMAC module to pad the secret key to digest_size or to hash the this copy won't affect the original object.
secret key if it is longer than digest_size. If no HMAC
algorithm is standardized for the hash algorithm, return
``NotImplemented`` instead.
name ``digest()``
Return the hash value of this hashing object as a bytes
containing 8-bit data. The object is not altered in any way
by this function; you can continue updating the object after
calling this function.
A text string value; the canonical, lowercase name of the hashing ``hexdigest()``
algorithm. The name should be a suitable parameter for Return the hash value of this hashing object as a string
:func:`hashlib.new`. containing hexadecimal digits. Lowercase letters should be used
for the digits 'a' through 'f'. Like the ``.digest()`` method, this
method mustn't alter the object.
Hashing objects require the following methods: ``update(string)``
Hash bytes-like 'string' into the current state of the hashing
object. ``update()`` can be called any number of times during a
hashing object's lifetime.
copy() Hashing modules can define additional module-level functions or
object methods and still be compliant with this specification.
Return a separate copy of this hashing object. An update to Here's an example, using a module named 'MD5'::
this copy won't affect the original object.
digest() >>> import hashlib
>>> from Crypto.Hash import MD5
Return the hash value of this hashing object as a bytes >>> m = MD5.new()
containing 8-bit data. The object is not altered in any way >>> isinstance(m, hashlib.CryptoHash)
by this function; you can continue updating the object after True
calling this function. >>> m.name
'md5'
hexdigest() >>> m.digest_size
16
Return the hash value of this hashing object as a string >>> m.block_size
containing hexadecimal digits. Lowercase letters should be used 64
for the digits 'a' through 'f'. Like the .digest() method, this >>> m.update(b'abc')
method mustn't alter the object. >>> m.digest()
b'\x90\x01P\x98<\xd2O\xb0\xd6\x96?}(\xe1\x7fr'
update(string) >>> m.hexdigest()
'900150983cd24fb0d6963f7d28e17f72'
Hash bytes-like 'string' into the current state of the hashing >>> MD5.new(b'abc').digest()
object. update() can be called any number of times during a b'\x90\x01P\x98<\xd2O\xb0\xd6\x96?}(\xe1\x7fr'
hashing object's lifetime.
Hashing modules can define additional module-level functions or
object methods and still be compliant with this specification.
Here's an example, using a module named 'MD5':
>>> import hashlib
>>> from Crypto.Hash import MD5
>>> m = MD5.new()
>>> isinstance(m, hashlib.CryptoHash)
True
>>> m.name
'md5'
>>> m.digest_size
16
>>> m.block_size
64
>>> m.update(b'abc')
>>> m.digest()
b'\x90\x01P\x98<\xd2O\xb0\xd6\x96?}(\xe1\x7fr'
>>> m.hexdigest()
'900150983cd24fb0d6963f7d28e17f72'
>>> MD5.new(b'abc').digest()
b'\x90\x01P\x98<\xd2O\xb0\xd6\x96?}(\xe1\x7fr'
Rationale Rationale
=========
The digest size is measured in bytes, not bits, even though hash The digest size is measured in bytes, not bits, even though hash
algorithm sizes are usually quoted in bits; MD5 is a 128-bit algorithm sizes are usually quoted in bits; MD5 is a 128-bit
algorithm and not a 16-byte one, for example. This is because, in algorithm and not a 16-byte one, for example. This is because, in
the sample code I looked at, the length in bytes is often needed the sample code I looked at, the length in bytes is often needed
(to seek ahead or behind in a file; to compute the length of an (to seek ahead or behind in a file; to compute the length of an
output string) while the length in bits is rarely used. output string) while the length in bits is rarely used.
Therefore, the burden will fall on the few people actually needing Therefore, the burden will fall on the few people actually needing
the size in bits, who will have to multiply digest_size by 8. the size in bits, who will have to multiply digest_size by 8.
It's been suggested that the update() method would be better named It's been suggested that the ``update()`` method would be better named
append(). However, that method is really causing the current ``append()``. However, that method is really causing the current
state of the hashing object to be updated, and update() is already state of the hashing object to be updated, and ``update()`` is already
used by the md5 and sha modules included with Python, so it seems used by the md5 and sha modules included with Python, so it seems
simplest to leave the name update() alone. simplest to leave the name ``update()`` alone.
The order of the constructor's arguments for keyed hashes was a The order of the constructor's arguments for keyed hashes was a
sticky issue. It wasn't clear whether the key should come first sticky issue. It wasn't clear whether the key should come first
or second. It's a required parameter, and the usual convention is or second. It's a required parameter, and the usual convention is
to place required parameters first, but that also means that the to place required parameters first, but that also means that the
'string' parameter moves from the first position to the second. 'string' parameter moves from the first position to the second.
It would be possible to get confused and pass a single argument to It would be possible to get confused and pass a single argument to
a keyed hash, thinking that you're passing an initial string to an a keyed hash, thinking that you're passing an initial string to an
unkeyed hash, but it doesn't seem worth making the interface unkeyed hash, but it doesn't seem worth making the interface
for keyed hashes more obscure to avoid this potential error. for keyed hashes more obscure to avoid this potential error.
Changes from Version 1.0 to Version 2.0 Changes from Version 1.0 to Version 2.0
=======================================
Version 2.0 of API for Cryptographic Hash Functions clarifies some Version 2.0 of API for Cryptographic Hash Functions clarifies some
aspects of the API and brings it up-to-date. It also formalized aspects aspects of the API and brings it up-to-date. It also formalized aspects
that were already de facto standards and provided by most that were already de facto standards and provided by most
implementations. implementations.
Version 2.0 introduces the following new attributes: Version 2.0 introduces the following new attributes:
name ``name``
The name property was made mandatory by `issue 18532`_.
The name property was made mandatory by :issue:`18532`. ``block_size``
The new version also specifies that the return value
``NotImplemented`` prevents HMAC support.
block_size Version 2.0 takes the separation of binary and text data in Python
3.0 into account. The 'string' argument to ``new()`` and ``update()`` as
The new version also specifies that the return value well as the 'key' argument must be bytes-like objects. On Python
``NotImplemented`` prevents HMAC support. 2.x a hashing object may also support ASCII-only unicode. The actual
name of argument is not changed as it is part of the public API.
Version 2.0 takes the separation of binary and text data in Python Code may depend on the fact that the argument is called 'string'.
3.0 into account. The 'string' argument to new() and update() as
well as the 'key' argument must be bytes-like objects. On Python
2.x a hashing object may also support ASCII-only unicode. The actual
name of argument is not changed as it is part of the public API.
Code may depend on the fact that the argument is called 'string'.
Recommended names for common hashing algorithms Recommended names for common hashing algorithms
===============================================
algorithm variant recommended name +------------+------------+-------------------+
---------- --------- ---------------- | algorithm | variant | recommended name |
MD5 md5 +============+============+===================+
RIPEMD-160 ripemd160 | MD5 | | md5 |
SHA-1 sha1 +------------+------------+-------------------+
SHA-2 SHA-224 sha224 | RIPEMD-160 | | ripemd160 |
SHA-256 sha256 +------------+------------+-------------------+
SHA-384 sha384 | SHA-1 | | sha1 |
SHA-512 sha512 +------------+------------+-------------------+
SHA-3 SHA-3-224 sha3_224 | SHA-2 | SHA-224 | sha224 |
SHA-3-256 sha3_256 + +------------+-------------------+
SHA-3-384 sha3_384 | | SHA-256 | sha256 |
SHA-3-512 sha3_512 + +------------+-------------------+
WHIRLPOOL whirlpool | | SHA-384 | sha384 |
+ +------------+-------------------+
| | SHA-512 | sha512 |
+------------+------------+-------------------+
| SHA-3 | SHA-3-224 | sha3_224 |
+ +------------+-------------------+
| | SHA-3-256 | sha3_256 |
+ +------------+-------------------+
| | SHA-3-384 | sha3_384 |
+ +------------+-------------------+
| | SHA-3-512 | sha3_512 |
+------------+------------+-------------------+
| WHIRLPOOL | | whirlpool |
+------------+------------+-------------------+
Changes Changes
=======
2001-09-17: Renamed clear() to reset(); added digest_size attribute * 2001-09-17: Renamed ``clear()`` to ``reset()``; added ``digest_size`` attribute
to objects; added .hexdigest() method. to objects; added ``.hexdigest()`` method.
2001-09-20: Removed reset() method completely. * 2001-09-20: Removed ``reset()`` method completely.
2001-09-28: Set digest_size to None for variable-size hashes. * 2001-09-28: Set ``digest_size`` to ``None`` for variable-size hashes.
2013-08-15: Added block_size and name attributes; clarified that * 2013-08-15: Added ``block_size`` and ``name`` attributes; clarified that
'string' actually referes to bytes-like objects. 'string' actually referes to bytes-like objects.
Acknowledgements Acknowledgements
================
Thanks to Aahz, Andrew Archibald, Rich Salz, Itamar Thanks to Aahz, Andrew Archibald, Rich Salz, Itamar
Shtull-Trauring, and the readers of the python-crypto list for Shtull-Trauring, and the readers of the python-crypto list for
their comments on this PEP. their comments on this PEP.
Copyright Copyright
=========
This document has been placed in the public domain. This document has been placed in the public domain.
.. _issue 18532: http://bugs.python.org/issue18532
Local Variables:
mode: indented-text
indent-tabs-mode: nil
End:
..
Local Variables:
mode: indented-text
indent-tabs-mode: nil
End: