reSTify PEP 452 (#327)
parent
7f8e97ca3f
commit
a5e937ea43
371
pep-0452.txt
|
@ -5,245 +5,258 @@ Last-Modified: $Date$
|
||||||
Author: A.M. Kuchling <amk@amk.ca>, Christian Heimes <christian@python.org>
|
Author: A.M. Kuchling <amk@amk.ca>, Christian Heimes <christian@python.org>
|
||||||
Status: Draft
|
Status: Draft
|
||||||
Type: Informational
|
Type: Informational
|
||||||
|
Content-Type: text/x-rst
|
||||||
Created: 15-Aug-2013
|
Created: 15-Aug-2013
|
||||||
Post-History:
|
Post-History:
|
||||||
Replaces: 247
|
Replaces: 247
|
||||||
|
|
||||||
Abstract
|
Abstract
|
||||||
|
========
|
||||||
|
|
||||||
There are several different modules available that implement
|
There are several different modules available that implement
|
||||||
cryptographic hashing algorithms such as MD5 or SHA. This
|
cryptographic hashing algorithms such as MD5 or SHA. This
|
||||||
document specifies a standard API for such algorithms, to make it
|
document specifies a standard API for such algorithms, to make it
|
||||||
easier to switch between different implementations.
|
easier to switch between different implementations.
|
||||||
|
|
||||||
|
|
||||||
Specification
|
Specification
|
||||||
|
=============
|
||||||
|
|
||||||
All hashing modules should present the same interface. Additional
|
All hashing modules should present the same interface. Additional
|
||||||
methods or variables can be added, but those described in this
|
methods or variables can be added, but those described in this
|
||||||
document should always be present.
|
document should always be present.
|
||||||
|
|
||||||
Hash function modules define one function:
|
Hash function modules define one function:
|
||||||
|
|
||||||
new([string]) (unkeyed hashes)
|
``new([string]) (unkeyed hashes)``
|
||||||
new(key, [string], [digestmod]) (keyed hashes)
|
|
||||||
|
|
||||||
Create a new hashing object and return it. The first form is
|
``new(key, [string], [digestmod]) (keyed hashes)``
|
||||||
for hashes that are unkeyed, such as MD5 or SHA. For keyed
|
Create a new hashing object and return it. The first form is
|
||||||
hashes such as HMAC, 'key' is a required parameter containing
|
for hashes that are unkeyed, such as MD5 or SHA. For keyed
|
||||||
a string giving the key to use. In both cases, the optional
|
hashes such as HMAC, 'key' is a required parameter containing
|
||||||
'string' parameter, if supplied, will be immediately hashed
|
a string giving the key to use. In both cases, the optional
|
||||||
into the object's starting state, as if obj.update(string) was
|
'string' parameter, if supplied, will be immediately hashed
|
||||||
called.
|
into the object's starting state, as if ``obj.update(string)`` was
|
||||||
|
called.
|
||||||
|
|
||||||
After creating a hashing object, arbitrary bytes can be fed
|
After creating a hashing object, arbitrary bytes can be fed
|
||||||
into the object using its update() method, and the hash value
|
into the object using its ``update()`` method, and the hash value
|
||||||
can be obtained at any time by calling the object's digest()
|
can be obtained at any time by calling the object's ``digest()``
|
||||||
method.
|
method.
|
||||||
|
|
||||||
Although the parameter is called 'string', hashing objects operate
|
Although the parameter is called 'string', hashing objects operate
|
||||||
on 8-bit data only. Both 'key' and 'string' must be a bytes-like
|
on 8-bit data only. Both 'key' and 'string' must be a bytes-like
|
||||||
object (bytes, bytearray...). A hashing object may support
|
object (bytes, bytearray...). A hashing object may support
|
||||||
one-dimensional, contiguous buffers as argument, too. Text
|
one-dimensional, contiguous buffers as argument, too. Text
|
||||||
(unicode) is no longer supported in Python 3.x. Python 2.x
|
(unicode) is no longer supported in Python 3.x. Python 2.x
|
||||||
implementations may take ASCII-only unicode as argument, but
|
implementations may take ASCII-only unicode as argument, but
|
||||||
portable code should not rely on the feature.
|
portable code should not rely on the feature.
|
||||||
|
|
||||||
Arbitrary additional keyword arguments can be added to this
|
Arbitrary additional keyword arguments can be added to this
|
||||||
function, but if they're not supplied, sensible default values
|
function, but if they're not supplied, sensible default values
|
||||||
should be used. For example, 'rounds' and 'digest_size'
|
should be used. For example, 'rounds' and 'digest_size'
|
||||||
keywords could be added for a hash function which supports a
|
keywords could be added for a hash function which supports a
|
||||||
variable number of rounds and several different output sizes,
|
variable number of rounds and several different output sizes,
|
||||||
and they should default to values believed to be secure.
|
and they should default to values believed to be secure.
|
||||||
|
|
||||||
Hash function modules define one variable:
|
Hash function modules define one variable:
|
||||||
|
|
||||||
digest_size
|
``digest_size``
|
||||||
|
An integer value; the size of the digest produced by the
|
||||||
|
hashing objects created by this module, measured in bytes.
|
||||||
|
You could also obtain this value by creating a sample object
|
||||||
|
and accessing its 'digest_size' attribute, but it can be
|
||||||
|
convenient to have this value available from the module.
|
||||||
|
Hashes with a variable output size will set this variable to
|
||||||
|
None.
|
||||||
|
|
||||||
An integer value; the size of the digest produced by the
|
Hashing objects require the following attribute:
|
||||||
hashing objects created by this module, measured in bytes.
|
|
||||||
You could also obtain this value by creating a sample object
|
|
||||||
and accessing its 'digest_size' attribute, but it can be
|
|
||||||
convenient to have this value available from the module.
|
|
||||||
Hashes with a variable output size will set this variable to
|
|
||||||
None.
|
|
||||||
|
|
||||||
Hashing objects require the following attribute:
|
``digest_size``
|
||||||
|
This attribute is identical to the module-level digest_size
|
||||||
|
variable, measuring the size of the digest produced by the
|
||||||
|
hashing object, measured in bytes. If the hash has a variable
|
||||||
|
output size, this output size must be chosen when the hashing
|
||||||
|
object is created, and this attribute must contain the
|
||||||
|
selected size. Therefore, ``None`` is **not** a legal value for this
|
||||||
|
attribute.
|
||||||
|
|
||||||
digest_size
|
``block_size``
|
||||||
|
An integer value or ``NotImplemented``; the internal block size
|
||||||
|
of the hash algorithm in bytes. The block size is used by the
|
||||||
|
HMAC module to pad the secret key to ``block_size`` or to hash the
|
||||||
|
secret key if it is longer than ``block_size``. If no HMAC
|
||||||
|
algorithm is standardized for the hash algorithm, return
|
||||||
|
``NotImplemented`` instead.
|
||||||
|
|
||||||
This attribute is identical to the module-level digest_size
|
``name``
|
||||||
variable, measuring the size of the digest produced by the
|
A text string value; the canonical, lowercase name of the hashing
|
||||||
hashing object, measured in bytes. If the hash has a variable
|
algorithm. The name should be a suitable parameter for
|
||||||
output size, this output size must be chosen when the hashing
|
``hashlib.new``.
|
||||||
object is created, and this attribute must contain the
|
|
||||||
selected size. Therefore, None is *not* a legal value for this
|
|
||||||
attribute.
|
|
||||||
|
|
||||||
block_size
|
Hashing objects require the following methods:
|
||||||
|
|
||||||
An integer value or ``NotImplemented``; the internal block size
|
``copy()``
|
||||||
of the hash algorithm in bytes. The block size is used by the
|
Return a separate copy of this hashing object. An update to
|
||||||
HMAC module to pad the secret key to block_size or to hash the
|
this copy won't affect the original object.
|
||||||
secret key if it is longer than block_size. If no HMAC
|
|
||||||
algorithm is standardized for the hash algorithm, return
|
|
||||||
``NotImplemented`` instead.
|
|
||||||
|
|
||||||
name
|
``digest()``
|
||||||
|
Return the hash value of this hashing object as a bytes
|
||||||
|
containing 8-bit data. The object is not altered in any way
|
||||||
|
by this function; you can continue updating the object after
|
||||||
|
calling this function.
|
||||||
|
|
||||||
A text string value; the canonical, lowercase name of the hashing
|
``hexdigest()``
|
||||||
algorithm. The name should be a suitable parameter for
|
Return the hash value of this hashing object as a string
|
||||||
:func:`hashlib.new`.
|
containing hexadecimal digits. Lowercase letters should be used
|
||||||
|
for the digits 'a' through 'f'. Like the ``.digest()`` method, this
|
||||||
|
method mustn't alter the object.
|
||||||
|
|
||||||
Hashing objects require the following methods:
|
``update(string)``
|
||||||
|
Hash bytes-like 'string' into the current state of the hashing
|
||||||
|
object. ``update()`` can be called any number of times during a
|
||||||
|
hashing object's lifetime.
|
||||||
|
|
||||||
copy()
|
Hashing modules can define additional module-level functions or
|
||||||
|
object methods and still be compliant with this specification.
|
||||||
|
|
||||||
Return a separate copy of this hashing object. An update to
|
Here's an example, using a module named 'MD5'::
|
||||||
this copy won't affect the original object.
|
|
||||||
|
|
||||||
digest()
|
>>> import hashlib
|
||||||
|
>>> from Crypto.Hash import MD5
|
||||||
Return the hash value of this hashing object as a bytes
|
>>> m = MD5.new()
|
||||||
containing 8-bit data. The object is not altered in any way
|
>>> isinstance(m, hashlib.CryptoHash)
|
||||||
by this function; you can continue updating the object after
|
True
|
||||||
calling this function.
|
>>> m.name
|
||||||
|
'md5'
|
||||||
hexdigest()
|
>>> m.digest_size
|
||||||
|
16
|
||||||
Return the hash value of this hashing object as a string
|
>>> m.block_size
|
||||||
containing hexadecimal digits. Lowercase letters should be used
|
64
|
||||||
for the digits 'a' through 'f'. Like the .digest() method, this
|
>>> m.update(b'abc')
|
||||||
method mustn't alter the object.
|
>>> m.digest()
|
||||||
|
b'\x90\x01P\x98<\xd2O\xb0\xd6\x96?}(\xe1\x7fr'
|
||||||
update(string)
|
>>> m.hexdigest()
|
||||||
|
'900150983cd24fb0d6963f7d28e17f72'
|
||||||
Hash bytes-like 'string' into the current state of the hashing
|
>>> MD5.new(b'abc').digest()
|
||||||
object. update() can be called any number of times during a
|
b'\x90\x01P\x98<\xd2O\xb0\xd6\x96?}(\xe1\x7fr'
|
||||||
hashing object's lifetime.
|
|
||||||
|
|
||||||
Hashing modules can define additional module-level functions or
|
|
||||||
object methods and still be compliant with this specification.
|
|
||||||
|
|
||||||
Here's an example, using a module named 'MD5':
|
|
||||||
|
|
||||||
>>> import hashlib
|
|
||||||
>>> from Crypto.Hash import MD5
|
|
||||||
>>> m = MD5.new()
|
|
||||||
>>> isinstance(m, hashlib.CryptoHash)
|
|
||||||
True
|
|
||||||
>>> m.name
|
|
||||||
'md5'
|
|
||||||
>>> m.digest_size
|
|
||||||
16
|
|
||||||
>>> m.block_size
|
|
||||||
64
|
|
||||||
>>> m.update(b'abc')
|
|
||||||
>>> m.digest()
|
|
||||||
b'\x90\x01P\x98<\xd2O\xb0\xd6\x96?}(\xe1\x7fr'
|
|
||||||
>>> m.hexdigest()
|
|
||||||
'900150983cd24fb0d6963f7d28e17f72'
|
|
||||||
>>> MD5.new(b'abc').digest()
|
|
||||||
b'\x90\x01P\x98<\xd2O\xb0\xd6\x96?}(\xe1\x7fr'
|
|
||||||
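The interface walked through in the example can also be exercised without PyCrypto. The following is a minimal sketch, assuming Python 3 and the standard library's ``hashlib.sha256`` constructor standing in for the module-level ``new()``; the ``hashlib.CryptoHash`` check from the example is left out because it is not assumed to exist in any given installation.

```python
import hashlib

# Unkeyed constructor: the optional argument is hashed immediately,
# as if update() had been called with it.
m = hashlib.sha256(b"a")

# Attributes required of every hashing object.
name = m.name                # canonical lowercase algorithm name
digest_size = m.digest_size  # size of the digest in bytes
block_size = m.block_size    # internal block size in bytes

# copy() returns an independent object: updating one side
# does not affect the other.
fork = m.copy()
m.update(b"bc")
fork.update(b"XYZ")

# digest() and hexdigest() leave the object unchanged, so repeated
# calls return the same value.
first = m.hexdigest()
second = m.hexdigest()
raw = m.digest()

# Feeding the data in pieces is equivalent to hashing it in one go.
one_shot = hashlib.sha256(b"abc").hexdigest()
```

Hashing ``b"a"`` and then ``b"bc"`` gives the same digest as hashing ``b"abc"`` at once, while the copy that received different data diverges.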
|
|
||||||
|
|
||||||
Rationale
|
Rationale
|
||||||
|
=========
|
||||||
|
|
||||||
The digest size is measured in bytes, not bits, even though hash
|
The digest size is measured in bytes, not bits, even though hash
|
||||||
algorithm sizes are usually quoted in bits; MD5 is a 128-bit
|
algorithm sizes are usually quoted in bits; MD5 is a 128-bit
|
||||||
algorithm and not a 16-byte one, for example. This is because, in
|
algorithm and not a 16-byte one, for example. This is because, in
|
||||||
the sample code I looked at, the length in bytes is often needed
|
the sample code I looked at, the length in bytes is often needed
|
||||||
(to seek ahead or behind in a file; to compute the length of an
|
(to seek ahead or behind in a file; to compute the length of an
|
||||||
output string) while the length in bits is rarely used.
|
output string) while the length in bits is rarely used.
|
||||||
Therefore, the burden will fall on the few people actually needing
|
Therefore, the burden will fall on the few people actually needing
|
||||||
the size in bits, who will have to multiply digest_size by 8.
|
the size in bits, who will have to multiply digest_size by 8.
|
||||||
|
|
||||||
It's been suggested that the update() method would be better named
|
It's been suggested that the ``update()`` method would be better named
|
||||||
append(). However, that method is really causing the current
|
``append()``. However, that method is really causing the current
|
||||||
state of the hashing object to be updated, and update() is already
|
state of the hashing object to be updated, and ``update()`` is already
|
||||||
used by the md5 and sha modules included with Python, so it seems
|
used by the md5 and sha modules included with Python, so it seems
|
||||||
simplest to leave the name update() alone.
|
simplest to leave the name ``update()`` alone.
|
||||||
|
|
||||||
The order of the constructor's arguments for keyed hashes was a
|
The order of the constructor's arguments for keyed hashes was a
|
||||||
sticky issue. It wasn't clear whether the key should come first
|
sticky issue. It wasn't clear whether the key should come first
|
||||||
or second. It's a required parameter, and the usual convention is
|
or second. It's a required parameter, and the usual convention is
|
||||||
to place required parameters first, but that also means that the
|
to place required parameters first, but that also means that the
|
||||||
'string' parameter moves from the first position to the second.
|
'string' parameter moves from the first position to the second.
|
||||||
It would be possible to get confused and pass a single argument to
|
It would be possible to get confused and pass a single argument to
|
||||||
a keyed hash, thinking that you're passing an initial string to an
|
a keyed hash, thinking that you're passing an initial string to an
|
||||||
unkeyed hash, but it doesn't seem worth making the interface
|
unkeyed hash, but it doesn't seem worth making the interface
|
||||||
for keyed hashes more obscure to avoid this potential error.
|
for keyed hashes more obscure to avoid this potential error.
|
||||||
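The argument order for keyed hashes and the single-argument pitfall can be illustrated with the standard library's ``hmac`` module. This is a sketch under the assumption of Python 3 with ``hmac.new(key, msg, digestmod)``; the key and message values are made up for illustration.

```python
import hashlib
import hmac

key = b"secret-key"   # hypothetical key, for illustration only
message = b"abc"

# Keyed form: the required key comes first, the optional data second.
one_shot = hmac.new(key, message, digestmod="sha256")

# Equivalent incremental form: only the key is supplied up front.
incremental = hmac.new(key, digestmod="sha256")
incremental.update(message)
same = hmac.compare_digest(one_shot.digest(), incremental.digest())

# The pitfall: a single positional argument to a keyed hash is taken
# as the key, not as initial data, so the result is unrelated to the
# unkeyed hash of the same bytes.
key_only = hmac.new(message, digestmod="sha256")
unkeyed = hashlib.sha256(message)
differs = key_only.digest() != unkeyed.digest()

# The size in bits is obtained by multiplying digest_size by 8.
bits = one_shot.digest_size * 8
```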
|
|
||||||
|
|
||||||
Changes from Version 1.0 to Version 2.0
|
Changes from Version 1.0 to Version 2.0
|
||||||
|
=======================================
|
||||||
|
|
||||||
Version 2.0 of API for Cryptographic Hash Functions clarifies some
|
Version 2.0 of API for Cryptographic Hash Functions clarifies some
|
||||||
aspects of the API and brings it up-to-date. It also formalized aspects
|
aspects of the API and brings it up-to-date. It also formalized aspects
|
||||||
that were already de facto standards and provided by most
|
that were already de facto standards and provided by most
|
||||||
implementations.
|
implementations.
|
||||||
|
|
||||||
Version 2.0 introduces the following new attributes:
|
Version 2.0 introduces the following new attributes:
|
||||||
|
|
||||||
name
|
``name``
|
||||||
|
The name property was made mandatory by `issue 18532`_.
|
||||||
|
|
||||||
The name property was made mandatory by :issue:`18532`.
|
``block_size``
|
||||||
|
The new version also specifies that the return value
|
||||||
|
``NotImplemented`` prevents HMAC support.
|
||||||
|
|
||||||
block_size
|
Version 2.0 takes the separation of binary and text data in Python
|
||||||
|
3.0 into account. The 'string' argument to ``new()`` and ``update()`` as
|
||||||
The new version also specifies that the return value
|
well as the 'key' argument must be bytes-like objects. On Python
|
||||||
``NotImplemented`` prevents HMAC support.
|
2.x a hashing object may also support ASCII-only unicode. The actual
|
||||||
|
name of argument is not changed as it is part of the public API.
|
||||||
Version 2.0 takes the separation of binary and text data in Python
|
Code may depend on the fact that the argument is called 'string'.
|
||||||
3.0 into account. The 'string' argument to new() and update() as
|
|
||||||
well as the 'key' argument must be bytes-like objects. On Python
|
|
||||||
2.x a hashing object may also support ASCII-only unicode. The actual
|
|
||||||
name of argument is not changed as it is part of the public API.
|
|
||||||
Code may depend on the fact that the argument is called 'string'.
|
|
||||||
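Both Version 2.0 points can be checked against the standard library. The sketch below assumes Python 3 with ``hashlib`` and ``hmac``; the long key is a made-up value. It relies on the HMAC definition in RFC 2104, where a key longer than one block of the underlying hash is hashed before use.

```python
import hashlib
import hmac

# update() accepts any bytes-like object, not only bytes.
h = hashlib.sha256()
for chunk in (b"a", bytearray(b"b"), memoryview(b"c")):
    h.update(chunk)
bytes_like_ok = h.digest() == hashlib.sha256(b"abc").digest()

# Text is rejected on Python 3 and has to be encoded explicitly.
try:
    hashlib.sha256().update("abc")
    text_rejected = False
except TypeError:
    text_rejected = True

# RFC 2104 hashes a key that is longer than one block before using it,
# so the long key and its digest yield the same MAC.
block_size = hashlib.sha256().block_size   # internal block size in bytes
long_key = b"k" * (block_size + 1)         # hypothetical over-long key
hashed_key = hashlib.sha256(long_key).digest()
message = b"abc"
long_key_is_hashed = hmac.compare_digest(
    hmac.new(long_key, message, "sha256").digest(),
    hmac.new(hashed_key, message, "sha256").digest(),
)
```

An HMAC implementation needs ``block_size`` to make this decision, which is why a hash that reports ``NotImplemented`` cannot be used for HMAC.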
|
|
||||||
|
|
||||||
Recommended names for common hashing algorithms
|
Recommended names for common hashing algorithms
|
||||||
|
===============================================
|
||||||
|
|
||||||
algorithm variant recommended name
|
+------------+------------+-------------------+
|
||||||
---------- --------- ----------------
|
| algorithm | variant | recommended name |
|
||||||
MD5 md5
|
+============+============+===================+
|
||||||
RIPEMD-160 ripemd160
|
| MD5 | | md5 |
|
||||||
SHA-1 sha1
|
+------------+------------+-------------------+
|
||||||
SHA-2 SHA-224 sha224
|
| RIPEMD-160 | | ripemd160 |
|
||||||
SHA-256 sha256
|
+------------+------------+-------------------+
|
||||||
SHA-384 sha384
|
| SHA-1 | | sha1 |
|
||||||
SHA-512 sha512
|
+------------+------------+-------------------+
|
||||||
SHA-3 SHA-3-224 sha3_224
|
| SHA-2 | SHA-224 | sha224 |
|
||||||
SHA-3-256 sha3_256
|
+ +------------+-------------------+
|
||||||
SHA-3-384 sha3_384
|
| | SHA-256 | sha256 |
|
||||||
SHA-3-512 sha3_512
|
+ +------------+-------------------+
|
||||||
WHIRLPOOL whirlpool
|
| | SHA-384 | sha384 |
|
||||||
|
+ +------------+-------------------+
|
||||||
|
| | SHA-512 | sha512 |
|
||||||
|
+------------+------------+-------------------+
|
||||||
|
| SHA-3 | SHA-3-224 | sha3_224 |
|
||||||
|
+ +------------+-------------------+
|
||||||
|
| | SHA-3-256 | sha3_256 |
|
||||||
|
+ +------------+-------------------+
|
||||||
|
| | SHA-3-384 | sha3_384 |
|
||||||
|
+ +------------+-------------------+
|
||||||
|
| | SHA-3-512 | sha3_512 |
|
||||||
|
+------------+------------+-------------------+
|
||||||
|
| WHIRLPOOL | | whirlpool |
|
||||||
|
+------------+------------+-------------------+
|
||||||
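Since ``name`` should be a suitable parameter for ``hashlib.new``, the recommended names can be probed against a given Python build. This is a rough sketch, assuming Python 3; ``probe`` is a hypothetical helper, and which names succeed depends on the interpreter and the OpenSSL library it is linked against.

```python
import hashlib

# Recommended names from the table above.
RECOMMENDED = [
    "md5", "ripemd160", "sha1",
    "sha224", "sha256", "sha384", "sha512",
    "sha3_224", "sha3_256", "sha3_384", "sha3_512",
    "whirlpool",
]


def probe(names):
    """Map each name to the .name reported by hashlib.new(name),
    or to None when this build cannot construct that algorithm."""
    found = {}
    for algorithm in names:
        try:
            found[algorithm] = hashlib.new(algorithm).name
        except ValueError:
            # hashlib.new() raises ValueError for unsupported algorithms.
            found[algorithm] = None
    return found


supported = probe(RECOMMENDED)
```

Entries mapped to ``None`` are simply unavailable in the running build; that does not make the recommended name wrong.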
|
|
||||||
|
|
||||||
Changes
|
Changes
|
||||||
|
=======
|
||||||
|
|
||||||
2001-09-17: Renamed clear() to reset(); added digest_size attribute
|
* 2001-09-17: Renamed ``clear()`` to ``reset()``; added ``digest_size`` attribute
|
||||||
to objects; added .hexdigest() method.
|
to objects; added ``.hexdigest()`` method.
|
||||||
2001-09-20: Removed reset() method completely.
|
* 2001-09-20: Removed ``reset()`` method completely.
|
||||||
2001-09-28: Set digest_size to None for variable-size hashes.
|
* 2001-09-28: Set ``digest_size`` to ``None`` for variable-size hashes.
|
||||||
2013-08-15: Added block_size and name attributes; clarified that
|
* 2013-08-15: Added ``block_size`` and ``name`` attributes; clarified that
|
||||||
'string' actually refers to bytes-like objects.
|
'string' actually refers to bytes-like objects.
|
||||||
|
|
||||||
|
|
||||||
Acknowledgements
|
Acknowledgements
|
||||||
|
================
|
||||||
|
|
||||||
Thanks to Aahz, Andrew Archibald, Rich Salz, Itamar
|
Thanks to Aahz, Andrew Archibald, Rich Salz, Itamar
|
||||||
Shtull-Trauring, and the readers of the python-crypto list for
|
Shtull-Trauring, and the readers of the python-crypto list for
|
||||||
their comments on this PEP.
|
their comments on this PEP.
|
||||||
|
|
||||||
|
|
||||||
Copyright
|
Copyright
|
||||||
|
=========
|
||||||
|
|
||||||
This document has been placed in the public domain.
|
This document has been placed in the public domain.
|
||||||
|
|
||||||
|
|
||||||
|
.. _issue 18532: http://bugs.python.org/issue18532
|
||||||
Local Variables:
|
|
||||||
mode: indented-text
|
|
||||||
indent-tabs-mode: nil
|
|
||||||
End:
|
|
||||||
|
|
||||||
|
..
|
||||||
|
Local Variables:
|
||||||
|
mode: indented-text
|
||||||
|
indent-tabs-mode: nil
|
||||||
|
End:
|
||||||
|
|