174 lines
6.0 KiB
ReStructuredText
174 lines
6.0 KiB
ReStructuredText
PEP: 247
|
|
Title: API for Cryptographic Hash Functions
|
|
Version: $Revision$
|
|
Last-Modified: $Date$
|
|
Author: A.M. Kuchling <amk@amk.ca>
|
|
Status: Final
|
|
Type: Informational
|
|
Content-Type: text/x-rst
|
|
Created: 23-Mar-2001
|
|
Post-History: 20-Sep-2001
|
|
|
|
|
|
Abstract
|
|
========
|
|
|
|
There are several different modules available that implement cryptographic
|
|
hashing algorithms such as MD5 or SHA. This document specifies a standard API
|
|
for such algorithms, to make it easier to switch between different
|
|
implementations.
|
|
|
|
|
|
Specification
|
|
=============
|
|
|
|
All hashing modules should present the same interface. Additional methods or
|
|
variables can be added, but those described in this document should always be
|
|
present.
|
|
|
|
Hash function modules define one function:
|
|
|
|
| ``new([string]) (unkeyed hashes)``
|
|
| ``new([key] , [string]) (keyed hashes)``
|
|
|
|
Create a new hashing object and return it. The first form is for hashes
|
|
that are unkeyed, such as MD5 or SHA. For keyed hashes such as HMAC, *key*
|
|
is a required parameter containing a string giving the key to use. In both
|
|
cases, the optional *string* parameter, if supplied, will be immediately
|
|
hashed into the object's starting state, as if ``obj.update(string)``
|
|
was called.
|
|
|
|
After creating a hashing object, arbitrary strings can be fed into the
|
|
object using its ``update()`` method, and the hash value can be obtained at
|
|
any time by calling the object's ``digest()`` method.
|
|
|
|
Arbitrary additional keyword arguments can be added to this function, but if
|
|
they're not supplied, sensible default values should be used. For example,
|
|
``rounds`` and ``digest_size`` keywords could be added for a hash function
|
|
which supports a variable number of rounds and several different output
|
|
sizes, and they should default to values believed to be secure.
|
|
|
|
Hash function modules define one variable:
|
|
|
|
| ``digest_size``
|
|
|
|
An integer value; the size of the digest produced by the hashing objects
|
|
created by this module, measured in bytes. You could also obtain this value
|
|
by creating a sample object and accessing its ``digest_size`` attribute, but
|
|
it can be convenient to have this value available from the module. Hashes
|
|
with a variable output size will set this variable to ``None``.
|
|
|
|
Hashing objects require a single attribute:
|
|
|
|
| ``digest_size``
|
|
|
|
This attribute is identical to the module-level ``digest_size`` variable,
|
|
measuring the size of the digest produced by the hashing object, measured in
|
|
bytes. If the hash has a variable output size, this output size must be
|
|
chosen when the hashing object is created, and this attribute must contain
|
|
the selected size. Therefore, ``None`` is *not* a legal value for this
|
|
attribute.
|
|
|
|
|
|
Hashing objects require the following methods:
|
|
|
|
| ``copy()``
|
|
|
|
Return a separate copy of this hashing object. An update to this copy won't
|
|
affect the original object.
|
|
|
|
| ``digest()``
|
|
|
|
Return the hash value of this hashing object as a string containing 8-bit
|
|
data. The object is not altered in any way by this function; you can
|
|
continue updating the object after calling this function.
|
|
|
|
| ``hexdigest()``
|
|
|
|
Return the hash value of this hashing object as a string containing
|
|
hexadecimal digits. Lowercase letters should be used for the digits ``a``
|
|
through ``f``. Like the ``.digest()`` method, this method mustn't alter the
|
|
object.
|
|
|
|
| ``update(string)``
|
|
|
|
Hash *string* into the current state of the hashing object. ``update()`` can
|
|
be called any number of times during a hashing object's lifetime.
|
|
|
|
Hashing modules can define additional module-level functions or object methods
|
|
and still be compliant with this specification.
|
|
|
|
Here's an example, using a module named ``MD5``::
|
|
|
|
>>> from Crypto.Hash import MD5
|
|
>>> m = MD5.new()
|
|
>>> m.digest_size
|
|
16
|
|
>>> m.update('abc')
|
|
>>> m.digest()
|
|
'\x90\x01P\x98<\xd2O\xb0\xd6\x96?}(\xe1\x7fr'
|
|
>>> m.hexdigest()
|
|
'900150983cd24fb0d6963f7d28e17f72'
|
|
>>> MD5.new('abc').digest()
|
|
'\x90\x01P\x98<\xd2O\xb0\xd6\x96?}(\xe1\x7fr'
|
|
|
|
|
|
Rationale
|
|
=========
|
|
|
|
The digest size is measured in bytes, not bits, even though hash algorithm
|
|
sizes are usually quoted in bits; MD5 is a 128-bit algorithm and not a 16-byte
|
|
one, for example. This is because, in the sample code I looked at, the length
|
|
in bytes is often needed (to seek ahead or behind in a file; to compute the
|
|
length of an output string) while the length in bits is rarely used. Therefore,
|
|
the burden will fall on the few people actually needing the size in bits, who
|
|
will have to multiply ``digest_size`` by 8.
|
|
|
|
It's been suggested that the ``update()`` method would be better named
|
|
``append()``. However, that method is really causing the current state of the
|
|
hashing object to be updated, and ``update()`` is already used by the md5 and
|
|
sha modules included with Python, so it seems simplest to leave the name
|
|
``update()`` alone.
|
|
|
|
The order of the constructor's arguments for keyed hashes was a sticky issue.
|
|
It wasn't clear whether the *key* should come first or second. It's a required
|
|
parameter, and the usual convention is to place required parameters first, but
|
|
that also means that the *string* parameter moves from the first position to
|
|
the second. It would be possible to get confused and pass a single argument to
|
|
a keyed hash, thinking that you're passing an initial string to an unkeyed
|
|
hash, but it doesn't seem worth making the interface for keyed hashes more
|
|
obscure to avoid this potential error.
|
|
|
|
|
|
Changes
|
|
=======
|
|
|
|
2001-09-17: Renamed ``clear()`` to ``reset()``; added ``digest_size`` attribute
|
|
to objects; added ``.hexdigest()`` method.
|
|
|
|
2001-09-20: Removed ``reset()`` method completely.
|
|
|
|
2001-09-28: Set ``digest_size`` to ``None`` for variable-size hashes.
|
|
|
|
|
|
Acknowledgements
|
|
================
|
|
|
|
Thanks to Aahz, Andrew Archibald, Rich Salz, Itamar Shtull-Trauring, and the
|
|
readers of the python-crypto list for their comments on this PEP.
|
|
|
|
|
|
Copyright
|
|
=========
|
|
|
|
This document has been placed in the public domain.
|
|
|
|
|
|
|
|
..
|
|
Local Variables:
|
|
mode: indented-text
|
|
indent-tabs-mode: nil
|
|
End:
|
|
|