PEP: 247
Title: API for Cryptographic Hash Functions
Version: $Revision$
Author: A.M. Kuchling <amk@amk.ca>
Status: Final
Type: Informational
Created: 23-Mar-2001
Post-History: 20-Sep-2001

Abstract

    There are several different modules available that implement
    cryptographic hashing algorithms such as MD5 or SHA. This
    document specifies a standard API for such algorithms, to make it
    easier to switch between different implementations.

Specification

    All hashing modules should present the same interface. Additional
    methods or variables can be added, but those described in this
    document should always be present.

    Hash function modules define one function:

    new([string])            (unkeyed hashes)
    new(key, [string])       (keyed hashes)

        Create a new hashing object and return it. The first form is
        for hashes that are unkeyed, such as MD5 or SHA. For keyed
        hashes such as HMAC, 'key' is a required parameter containing
        a string giving the key to use. In both cases, the optional
        'string' parameter, if supplied, will be immediately hashed
        into the object's starting state, as if obj.update(string) had
        been called.

        After creating a hashing object, arbitrary strings can be fed
        into the object using its update() method, and the hash value
        can be obtained at any time by calling the object's digest()
        method.

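The two forms of new() can be sketched with the standard library's hashlib and hmac modules. This is only an illustration under stated assumptions: the names new_unkeyed and new_keyed are hypothetical (a real module would expose a single new()), SHA-256 is an arbitrary choice, and the code is written for Python 3, where the 'string' argument is a bytes object.

```python
import hashlib
import hmac


def new_unkeyed(string=None):
    """Unkeyed form, new([string]); SHA-256 is an arbitrary choice here."""
    h = hashlib.sha256()
    if string is not None:
        # Same effect as calling obj.update(string) right after creation.
        h.update(string)
    return h


def new_keyed(key, string=None):
    """Keyed form, new(key, [string]); the key is required and comes first."""
    return hmac.new(key, string, hashlib.sha256)


# Passing the initial string to new() is equivalent to a later update().
a = new_unkeyed(b"abc")
b = new_unkeyed()
b.update(b"abc")

k1 = new_keyed(b"secret", b"abc")
k2 = new_keyed(b"secret")
k2.update(b"abc")
```

In both pairs the two objects end up in the same state, so their digest() values match.
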
        Arbitrary additional keyword arguments can be added to this
        function, but if they're not supplied, sensible default values
        should be used. For example, 'rounds' and 'digest_size'
        keywords could be added for a hash function which supports a
        variable number of rounds and several different output sizes,
        and they should default to values believed to be secure.

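As a rough sketch of an extra keyword argument with a default, the wrapper below forwards a 'digest_size' keyword to hashlib.blake2b, which accepts output sizes from 1 to 64 bytes. The wrapper itself and its default of 32 bytes are assumptions made for illustration, not a recommendation.

```python
import hashlib


def new(string=b"", digest_size=32):
    """Hypothetical new() with an optional 'digest_size' keyword.

    The default of 32 bytes is an illustrative assumption.
    """
    return hashlib.blake2b(string, digest_size=digest_size)


default_hash = new(b"abc")                 # caller relies on the default
short_hash = new(b"abc", digest_size=16)   # caller picks a smaller output
```

Callers that never mention 'digest_size' still get a usable object, which is the point of requiring defaults.
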
    Hash function modules define one variable:

    digest_size

        An integer value; the size of the digest produced by the
        hashing objects created by this module, measured in bytes.
        You could also obtain this value by creating a sample object
        and accessing its 'digest_size' attribute, but it can be
        convenient to have this value available from the module.
        Hashes with a variable output size will set this variable to
        None.

    Hashing objects require a single attribute:

    digest_size

        This attribute has the same meaning as the module-level
        digest_size variable: the size in bytes of the digest produced
        by the hashing object. If the hash has a variable output size,
        this output size must be chosen when the hashing object is
        created, and this attribute must contain the selected size.
        Therefore None is *not* a legal value for this attribute.

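The difference between the module-level variable and the object attribute shows up with a variable-size hash. The sketch below wraps hashlib.shake_128 in a hypothetical ShakeHash class; the class, its default size of 32 bytes, and the module layout are all assumptions for illustration.

```python
import hashlib

# Module-level variable: None, because the output size is chosen per object.
digest_size = None


class ShakeHash:
    """Hypothetical hashing object with a size fixed at creation time."""

    def __init__(self, string=b"", digest_size=32):
        self.digest_size = digest_size      # never None on the object
        self._h = hashlib.shake_128(string)

    def update(self, string):
        self._h.update(string)

    def digest(self):
        return self._h.digest(self.digest_size)

    def hexdigest(self):
        return self._h.hexdigest(self.digest_size)

    def copy(self):
        other = ShakeHash.__new__(ShakeHash)
        other.digest_size = self.digest_size
        other._h = self._h.copy()
        return other


def new(string=b"", digest_size=32):
    return ShakeHash(string, digest_size)


h = new(b"abc", digest_size=20)
```

Here the module reports None while every object reports the size it was created with.
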
    Hashing objects require the following methods:

    copy()

        Return a separate copy of this hashing object. An update to
        this copy won't affect the original object.

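One common use of copy() is hashing several strings that share a prefix without rehashing the prefix each time. A minimal sketch, using hashlib.sha256 as a stand-in for any compliant module; the data values are made up:

```python
import hashlib

# Hash the shared prefix once.
prefix = hashlib.sha256(b"common header|")

# Each copy continues from the prefix state independently.
first = prefix.copy()
first.update(b"first")

second = prefix.copy()
second.update(b"second")
```

The original object still holds only the prefix, so it can be copied again later.
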
    digest()

        Return the hash value of this hashing object as a string
        containing 8-bit data. The object is not altered in any way
        by this method; you can continue updating the object after
        calling it.

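The following sketch shows digest() being called partway through hashing, again using hashlib.sha256 as an assumed stand-in for a compliant module (in Python 3 the "string containing 8-bit data" is a bytes object):

```python
import hashlib

h = hashlib.sha256()
h.update(b"abc")

first = h.digest()   # intermediate value
again = h.digest()   # calling it twice changes nothing

h.update(b"def")     # hashing can continue afterwards
final = h.digest()
```
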
    hexdigest()

        Return the hash value of this hashing object as a string
        containing hexadecimal digits. Lowercase letters should be used
        for the digits 'a' through 'f'. Like the .digest() method, this
        method mustn't alter the object.

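The relationship between digest() and hexdigest() can be checked directly. This sketch assumes hashlib.sha256 as the hashing object and uses binascii.hexlify to produce the hexadecimal form independently:

```python
import binascii
import hashlib

h = hashlib.sha256(b"abc")

hex_value = h.hexdigest()

# hexlify() gives lowercase hex as bytes; decode it for comparison.
expected = binascii.hexlify(h.digest()).decode("ascii")
```

Each byte of the digest becomes two hexadecimal characters, so the hex string is twice digest_size in length.
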
    update(string)

        Hash 'string' into the current state of the hashing object.
        update() can be called any number of times during a hashing
        object's lifetime.

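Because update() can be called repeatedly, large inputs can be hashed in pieces. A minimal sketch, assuming hashlib.sha256 and a hypothetical helper named hash_stream with an arbitrary chunk size:

```python
import hashlib
import io


def hash_stream(stream, chunk_size=4096):
    """Feed a binary stream into a hashing object one chunk at a time."""
    h = hashlib.sha256()
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        h.update(chunk)
    return h


data = b"x" * 10000
streamed = hash_stream(io.BytesIO(data), chunk_size=1024)
```

Feeding the data in chunks gives the same result as hashing it in a single call.
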
    Hashing modules can define additional module-level functions or
    object methods and still be compliant with this specification.

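The requirements above can be gathered into a small conformance check. This is a sketch for unkeyed hashes only; check_interface and the stand-in sha256_module are hypothetical names, and the stand-in is built from hashlib purely for demonstration.

```python
import hashlib
import types


def check_interface(module, sample=b"abc"):
    """Return True if 'module' offers the unkeyed interface described above."""
    obj = module.new(sample)

    # The object attribute must be a real size, never None.
    assert isinstance(obj.digest_size, int)
    # The module variable is either None (variable-size) or the same size.
    assert module.digest_size in (None, obj.digest_size)

    before = obj.digest()
    assert len(before) == obj.digest_size
    assert obj.hexdigest() == obj.hexdigest().lower()

    # Updating a copy must leave the original untouched.
    clone = obj.copy()
    clone.update(b"more")
    assert obj.digest() == before
    assert clone.digest() != before

    # new(string) must behave like new() followed by update(string).
    fresh = module.new()
    fresh.update(sample)
    assert fresh.digest() == before
    return True


# Hypothetical stand-in module built from hashlib.sha256.
sha256_module = types.SimpleNamespace(
    new=lambda string=b"": hashlib.sha256(string),
    digest_size=hashlib.sha256().digest_size,
)

result = check_interface(sha256_module)
```

A module with extra functions or methods still passes, since only the required names are inspected.
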
    Here's an example, using a module named 'MD5':

        >>> from Crypto.Hash import MD5
        >>> m = MD5.new()
        >>> m.digest_size
        16
        >>> m.update('abc')
        >>> m.digest()
        '\x90\x01P\x98<\xd2O\xb0\xd6\x96?}(\xe1\x7fr'
        >>> m.hexdigest()
        '900150983cd24fb0d6963f7d28e17f72'
        >>> MD5.new('abc').digest()
        '\x90\x01P\x98<\xd2O\xb0\xd6\x96?}(\xe1\x7fr'

Rationale

    The digest size is measured in bytes, not bits, even though hash
    algorithm sizes are usually quoted in bits; MD5 is a 128-bit
    algorithm and not a 16-byte one, for example. This is because, in
    the sample code I looked at, the length in bytes is often needed
    (to seek ahead or behind in a file; to compute the length of an
    output string) while the length in bits is rarely used.
    Therefore, the burden will fall on the few people actually needing
    the size in bits, who will have to multiply digest_size by 8.

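As an illustration of why bytes are the handier unit, the sketch below seeks backwards by digest_size to read a digest stored at the end of a stream. The layout (payload followed by its digest) and the helper name split_trailing_digest are assumptions invented for this example.

```python
import hashlib
import io


def split_trailing_digest(stream, digest_size):
    """Split a seekable binary stream into (payload, trailing digest)."""
    # digest_size is already in bytes, so it can be used as a seek offset.
    stream.seek(-digest_size, io.SEEK_END)
    payload_length = stream.tell()
    stored_digest = stream.read(digest_size)
    stream.seek(0)
    payload = stream.read(payload_length)
    return payload, stored_digest


payload = b"hello, world"
h = hashlib.sha256(payload)
blob = io.BytesIO(payload + h.digest())

recovered, stored = split_trailing_digest(blob, h.digest_size)

# The rarer case: the size in bits is one multiplication away.
size_in_bits = h.digest_size * 8
```
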
    It's been suggested that the update() method would be better named
    append(). However, that method is really causing the current
    state of the hashing object to be updated, and update() is already
    used by the md5 and sha modules included with Python, so it seems
    simplest to leave the name update() alone.

    The order of the constructor's arguments for keyed hashes was a
    sticky issue. It wasn't clear whether the key should come first
    or second. It's a required parameter, and the usual convention is
    to place required parameters first, but that also means that the
    'string' parameter moves from the first position to the second.
    It would be possible to get confused and pass a single argument to
    a keyed hash, thinking that you're passing an initial string to an
    unkeyed hash, but it doesn't seem worth making the interface
    for keyed hashes more obscure to avoid this potential error.

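The potential confusion can be seen with the standard library's hmac module, whose new() also takes the key first. The wrapper below is a hypothetical keyed new() that fixes SHA-256 as the underlying hash; the key and message values are made up.

```python
import hashlib
import hmac


def new(key, string=None):
    """Hypothetical keyed new(): required key first, optional string second."""
    return hmac.new(key, string, hashlib.sha256)


# Intended use: key, then the initial string.
intended = new(b"secret-key", b"message")

# The mistake: a single argument is taken as the key, and nothing is hashed.
mistaken = new(b"message")
```

The mistaken call raises no error; it simply yields the MAC of an empty message under the wrong key.
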
Changes

    2001-09-17: Renamed clear() to reset(); added digest_size attribute
                to objects; added .hexdigest() method.
    2001-09-20: Removed reset() method completely.
    2001-09-28: Set digest_size to None for variable-size hashes.

Acknowledgements

    Thanks to Aahz, Andrew Archibald, Rich Salz, Itamar
    Shtull-Trauring, and the readers of the python-crypto list for
    their comments on this PEP.

Copyright

    This document has been placed in the public domain.

Local Variables:
mode: indented-text
indent-tabs-mode: nil
End: