475 lines
18 KiB
Plaintext
475 lines
18 KiB
Plaintext
PEP: 506
|
||
Title: Adding A Secrets Module To The Standard Library
|
||
Version: $Revision$
|
||
Last-Modified: $Date$
|
||
Author: Steven D'Aprano <steve@pearwood.info>
|
||
Status: Final
|
||
Type: Standards Track
|
||
Content-Type: text/x-rst
|
||
Created: 19-Sep-2015
|
||
Python-Version: 3.6
|
||
Post-History:
|
||
|
||
|
||
Abstract
|
||
========
|
||
|
||
This PEP proposes the addition of a module for common security-related
|
||
functions such as generating tokens to the Python standard library.
|
||
|
||
|
||
Definitions
|
||
===========
|
||
|
||
Some common abbreviations used in this proposal:
|
||
|
||
* PRNG:
|
||
|
||
Pseudo Random Number Generator. A deterministic algorithm used
|
||
to produce random-looking numbers with certain desirable
|
||
statistical properties.
|
||
|
||
* CSPRNG:
|
||
|
||
Cryptographically Strong Pseudo Random Number Generator. An
|
||
algorithm used to produce random-looking numbers which are
|
||
resistant to prediction.
|
||
|
||
* MT:
|
||
|
||
Mersenne Twister. An extensively studied PRNG which is currently
|
||
used by the ``random`` module as the default.
|
||
|
||
|
||
Rationale
|
||
=========
|
||
|
||
This proposal is motivated by concerns that Python's standard library
|
||
makes it too easy for developers to inadvertently make serious security
|
||
errors. Theo de Raadt, the founder of OpenBSD, contacted Guido van Rossum
|
||
and expressed some concern [#]_ about the use of MT for generating sensitive
|
||
information such as passwords, secure tokens, session keys and similar.
|
||
|
||
Although the documentation for the ``random`` module explicitly states that
|
||
the default is not suitable for security purposes [#]_, it is strongly
|
||
believed that this warning may be missed, ignored or misunderstood by
|
||
many Python developers. In particular:
|
||
|
||
* developers may not have read the documentation and consequently
|
||
not seen the warning;
|
||
|
||
* they may not realise that their specific use of the module has security
|
||
implications; or
|
||
|
||
* not realising that there could be a problem, they have copied code
|
||
(or learned techniques) from websites which don't offer best
|
||
practises.
|
||
|
||
The first [#]_ hit when searching for "python how to generate passwords" on
|
||
Google is a tutorial that uses the default functions from the ``random``
|
||
module [#]_. Although it is not intended for use in web applications, it is
|
||
likely that similar techniques find themselves used in that situation.
|
||
The second hit is to a StackOverflow question about generating
|
||
passwords [#]_. Most of the answers given, including the accepted one, use
|
||
the default functions. When one user warned that the default could be
|
||
easily compromised, they were told "I think you worry too much." [#]_
|
||
|
||
This strongly suggests that the existing ``random`` module is an attractive
|
||
nuisance when it comes to generating (for example) passwords or secure
|
||
tokens.
|
||
|
||
Additional motivation (of a more philosophical bent) can be found in the
|
||
post which first proposed this idea [#]_.
|
||
|
||
|
||
Proposal
|
||
========
|
||
|
||
Alternative proposals have focused on the default PRNG in the ``random``
|
||
module, with the aim of providing "secure by default" cryptographically
|
||
strong primitives that developers can build upon without thinking about
|
||
security. (See Alternatives below.) This proposes a different approach:
|
||
|
||
* The standard library already provides cryptographically strong
|
||
primitives, but many users don't know they exist or when to use them.
|
||
|
||
* Instead of requiring crypto-naive users to write secure code, the
|
||
standard library should include a set of ready-to-use "batteries" for
|
||
the most common needs, such as generating secure tokens. This code
|
||
will both directly satisfy a need ("How do I generate a password reset
|
||
token?"), and act as an example of acceptable practises which
|
||
developers can learn from [#]_.
|
||
|
||
To do this, this PEP proposes that we add a new module to the standard
|
||
library, with the suggested name ``secrets``. This module will contain a
|
||
set of ready-to-use functions for common activities with security
|
||
implications, together with some lower-level primitives.
|
||
|
||
The suggestion is that ``secrets`` becomes the go-to module for dealing
|
||
with anything which should remain secret (passwords, tokens, etc.)
|
||
while the ``random`` module remains backward-compatible.
|
||
|
||
|
||
API and Implementation
|
||
======================
|
||
|
||
This PEP proposes the following functions for the ``secrets`` module:
|
||
|
||
* Functions for generating tokens suitable for use in (e.g.) password
|
||
recovery, as session keys, etc., in the following formats:
|
||
|
||
- as bytes, ``secrets.token_bytes``;
|
||
- as text, using hexadecimal digits, ``secrets.token_hex``;
|
||
- as text, using URL-safe base-64 encoding, ``secrets.token_urlsafe``.
|
||
|
||
* A limited interface to the system CSPRNG, using either ``os.urandom``
|
||
directly or ``random.SystemRandom``. Unlike the ``random`` module, this
|
||
does not need to provide methods for seeding, getting or setting the
|
||
state, or any non-uniform distributions. It should provide the
|
||
following:
|
||
|
||
- A function for choosing items from a sequence, ``secrets.choice``.
|
||
- A function for generating a given number of random bits and/or bytes
|
||
as an integer, ``secrets.randbits``.
|
||
- A function for returning a random integer in the half-open range
|
||
0 to the given upper limit, ``secrets.randbelow`` [#]_.
|
||
|
||
* A function for comparing text or bytes digests for equality while being
|
||
resistant to timing attacks, ``secrets.compare_digest``.
|
||
|
||
The consensus appears to be that there is no need to add a new CSPRNG to
|
||
the ``random`` module to support these uses, ``SystemRandom`` will be
|
||
sufficient.
|
||
|
||
Some illustrative implementations have been given by Nick Coghlan [#]_
|
||
and a minimalist API by Tim Peters [#]_. This idea has also been discussed
|
||
on the issue tracker for the "cryptography" module [#]_. The following
|
||
pseudo-code should be taken as the starting point for the real
|
||
implementation::
|
||
|
||
from random import SystemRandom
|
||
from hmac import compare_digest
|
||
|
||
_sysrand = SystemRandom()
|
||
|
||
randbits = _sysrand.getrandbits
|
||
choice = _sysrand.choice
|
||
|
||
def randbelow(exclusive_upper_bound):
|
||
return _sysrand._randbelow(exclusive_upper_bound)
|
||
|
||
DEFAULT_ENTROPY = 32 # bytes
|
||
|
||
def token_bytes(nbytes=None):
|
||
if nbytes is None:
|
||
nbytes = DEFAULT_ENTROPY
|
||
return os.urandom(nbytes)
|
||
|
||
def token_hex(nbytes=None):
|
||
return binascii.hexlify(token_bytes(nbytes)).decode('ascii')
|
||
|
||
def token_urlsafe(nbytes=None):
|
||
tok = token_bytes(nbytes)
|
||
return base64.urlsafe_b64encode(tok).rstrip(b'=').decode('ascii')
|
||
|
||
|
||
The ``secrets`` module itself will be pure Python, and other Python
|
||
implementations can easily make use of it unchanged, or adapt it as
|
||
necessary. An implementation can be found on BitBucket [#]_.
|
||
|
||
Default arguments
|
||
~~~~~~~~~~~~~~~~~
|
||
|
||
One difficult question is "How many bytes should my token be?". We can
|
||
help with this question by providing a default amount of entropy for the
|
||
"token_*" functions. If the ``nbytes`` argument is None or not given, the
|
||
default entropy will be used. This default value should be large enough
|
||
to be expected to be secure for medium-security uses, but is expected to
|
||
change in the future, possibly even in a maintenance release [#]_.
|
||
|
||
Naming conventions
|
||
~~~~~~~~~~~~~~~~~~
|
||
|
||
One question is the naming conventions used in the module [#]_, whether to
|
||
use C-like naming conventions such as "randrange" or more Pythonic names
|
||
such as "random_range".
|
||
|
||
Functions which are simply bound methods of the private ``SystemRandom``
|
||
instance (e.g. ``randrange``), or a thin wrapper around such, should keep
|
||
the familiar names. Those which are something new (such as the various
|
||
``token_*`` functions) will use more Pythonic names.
|
||
|
||
Alternatives
|
||
============
|
||
|
||
One alternative is to change the default PRNG provided by the ``random``
|
||
module [#]_. This received considerable scepticism and outright opposition:
|
||
|
||
* There is fear that a CSPRNG may be slower than the current PRNG (which
|
||
in the case of MT is already quite slow).
|
||
|
||
* Some applications (such as scientific simulations, and replaying
|
||
gameplay) require the ability to seed the PRNG into a known state,
|
||
which a CSPRNG lacks by design.
|
||
|
||
* Another major use of the ``random`` module is for simple "guess a number"
|
||
games written by beginners, and many people are loath to make any
|
||
change to the ``random`` module which may make that harder.
|
||
|
||
* Although there is no proposal to remove MT from the ``random`` module,
|
||
there was considerable hostility to the idea of having to opt-in to
|
||
a non-CSPRNG or any backwards-incompatible changes.
|
||
|
||
* Demonstrated attacks against MT are typically against PHP applications.
|
||
It is believed that PHP's version of MT is a significantly softer target
|
||
than Python's version, due to a poor seeding technique [#]_. Consequently,
|
||
without a proven attack against Python applications, many people object
|
||
to a backwards-incompatible change.
|
||
|
||
Nick Coghlan made an :pep:`earlier suggestion <504>`
|
||
for a globally configurable PRNG
|
||
which uses the system CSPRNG by default, but has since withdrawn it
|
||
in favour of this proposal.
|
||
|
||
|
||
Comparison To Other Languages
|
||
=============================
|
||
|
||
* PHP
|
||
|
||
PHP includes a function ``uniqid`` [#]_ which by default returns a
|
||
thirteen character string based on the current time in microseconds.
|
||
Translated into Python syntax, it has the following signature::
|
||
|
||
def uniqid(prefix='', more_entropy=False)->str
|
||
|
||
The PHP documentation warns that this function is not suitable for
|
||
security purposes. Nevertheless, various mature, well-known PHP
|
||
applications use it for that purpose (citation needed).
|
||
|
||
PHP 5.3 and better also includes a function ``openssl_random_pseudo_bytes``
|
||
[#]_. Translated into Python syntax, it has roughly the following
|
||
signature::
|
||
|
||
def openssl_random_pseudo_bytes(length:int)->Tuple[str, bool]
|
||
|
||
This function returns a pseudo-random string of bytes of the given
|
||
length, and a boolean flag giving whether the string is considered
|
||
cryptographically strong. The PHP manual suggests that returning
|
||
anything but True should be rare except for old or broken platforms.
|
||
|
||
* JavaScript
|
||
|
||
Based on a rather cursory search [#]_, there do not appear to be any
|
||
well-known standard functions for producing strong random values in
|
||
JavaScript. ``Math.random`` is often used, despite serious weaknesses
|
||
making it unsuitable for cryptographic purposes [#]_. In recent years
|
||
the majority of browsers have gained support for ``window.crypto.getRandomValues`` [#]_.
|
||
|
||
Node.js offers a rich cryptographic module, ``crypto`` [#]_, most of
|
||
which is beyond the scope of this PEP. It does include a single function
|
||
for generating random bytes, ``crypto.randomBytes``.
|
||
|
||
* Ruby
|
||
|
||
The Ruby standard library includes a module ``SecureRandom`` [#]_
|
||
which includes the following methods:
|
||
|
||
* base64 - returns a Base64 encoded random string.
|
||
|
||
* hex - returns a random hexadecimal string.
|
||
|
||
* random_bytes - returns a random byte string.
|
||
|
||
* random_number - depending on the argument, returns either a random
|
||
integer in the range(0, n), or a random float between 0.0 and 1.0.
|
||
|
||
* urlsafe_base64 - returns a random URL-safe Base64 encoded string.
|
||
|
||
* uuid - return a version 4 random Universally Unique IDentifier.
|
||
|
||
|
||
What Should Be The Name Of The Module?
|
||
======================================
|
||
|
||
There was a proposal to add a "random.safe" submodule, quoting the Zen
|
||
of Python "Namespaces are one honking great idea" koan. However, the
|
||
author of the Zen, Tim Peters, has come out against this idea [#]_, and
|
||
recommends a top-level module.
|
||
|
||
In discussion on the python-ideas mailing list so far, the name "secrets"
|
||
has received some approval, and no strong opposition.
|
||
|
||
There is already an existing third-party module with the same name [#]_,
|
||
but it appears to be unused and abandoned.
|
||
|
||
|
||
Frequently Asked Questions
|
||
==========================
|
||
|
||
* Q: Is this a real problem? Surely MT is random enough that nobody can
|
||
predict its output.
|
||
|
||
A: The consensus among security professionals is that MT is not safe
|
||
in security contexts. It is not difficult to reconstruct the internal
|
||
state of MT [#]_ [#]_ and so predict all past and future values. There
|
||
are a number of known, practical attacks on systems using MT for
|
||
randomness [#]_.
|
||
|
||
* Q: Attacks on PHP are one thing, but are there any known attacks on
|
||
Python software?
|
||
|
||
A: Yes. There have been vulnerabilities in Zope and Plone at the very
|
||
least. Hanno Schlichting commented [#]_::
|
||
|
||
"In the context of Plone and Zope a practical attack was
|
||
demonstrated, but I can't find any good non-broken links about
|
||
this anymore. IIRC Plone generated a random number and exposed
|
||
this on each error page along the lines of 'Sorry, you encountered
|
||
an error, your problem has been filed as <random number>, please
|
||
include this when you contact us'. This allowed anyone to do large
|
||
numbers of requests to this page and get enough random values to
|
||
reconstruct the MT state. A couple of security related modules used
|
||
random instead of system random (cookie session ids, password reset
|
||
links, auth token), so the attacker could break all of those."
|
||
|
||
Christian Heimes reported this issue to the Zope security team in 2012 [#]_,
|
||
there are at least two related CVE vulnerabilities [#]_, and at least one
|
||
work-around for this issue in Django [#]_.
|
||
|
||
* Q: Is this an alternative to specialist cryptographic software such as SSL?
|
||
|
||
A: No. This is a "batteries included" solution, not a full-featured
|
||
"nuclear reactor". It is intended to mitigate against some basic
|
||
security errors, not be a solution to all security-related issues. To
|
||
quote Nick Coghlan referring to his earlier proposal [#]_::
|
||
|
||
"...folks really are better off learning to use things like
|
||
cryptography.io for security sensitive software, so this change
|
||
is just about harm mitigation given that it's inevitable that a
|
||
non-trivial proportion of the millions of current and future
|
||
Python developers won't do that."
|
||
|
||
* Q: What about a password generator?
|
||
|
||
A: The consensus is that the requirements for password generators are too
|
||
variable for it to be a good match for the standard library [#]_. No password
|
||
generator will be included in the initial release of the module, instead it
|
||
will be given in the documentation as a recipe (à la the recipes in the
|
||
``itertools`` module) [#]_.
|
||
|
||
* Q: Will ``secrets`` use /dev/random (which blocks) or /dev/urandom (which
|
||
doesn't block) on Linux? What about other platforms?
|
||
|
||
A: ``secrets`` will be based on ``os.urandom`` and ``random.SystemRandom``,
|
||
which are interfaces to your operating system's best source of cryptographic
|
||
randomness. On Linux, that may be ``/dev/urandom`` [#]_, on Windows it may be
|
||
``CryptGenRandom()``, but see the documentation and/or source code for the
|
||
detailed implementation details.
|
||
|
||
|
||
References
|
||
==========
|
||
|
||
.. [#] https://mail.python.org/pipermail/python-ideas/2015-September/035820.html
|
||
|
||
.. [#] https://docs.python.org/3/library/random.html
|
||
|
||
.. [#] As of the date of writing. Also, as Google search terms may be
|
||
automatically customised for the user without their knowledge, some
|
||
readers may see different results.
|
||
|
||
.. [#] http://interactivepython.org/runestone/static/everyday/2013/01/3_password.html
|
||
|
||
.. [#] http://stackoverflow.com/questions/3854692/generate-password-in-python
|
||
|
||
.. [#] http://stackoverflow.com/questions/3854692/generate-password-in-python/3854766#3854766
|
||
|
||
.. [#] https://mail.python.org/pipermail/python-ideas/2015-September/036238.html
|
||
|
||
.. [#] At least those who are motivated to read the source code and documentation.
|
||
|
||
.. [#] After considerable discussion, Guido ruled that the module need only
|
||
provide ``randbelow``, and not similar functions ``randrange`` or
|
||
``randint``. http://code.activestate.com/lists/python-dev/138375/
|
||
|
||
.. [#] https://mail.python.org/pipermail/python-ideas/2015-September/036271.html
|
||
|
||
.. [#] https://mail.python.org/pipermail/python-ideas/2015-September/036350.html
|
||
|
||
.. [#] https://github.com/pyca/cryptography/issues/2347
|
||
|
||
.. [#] https://bitbucket.org/sdaprano/secrets
|
||
|
||
.. [#] https://mail.python.org/pipermail/python-ideas/2015-September/036517.html
|
||
https://mail.python.org/pipermail/python-ideas/2015-September/036515.html
|
||
|
||
.. [#] https://mail.python.org/pipermail/python-ideas/2015-September/036474.html
|
||
|
||
.. [#] Link needed.
|
||
|
||
.. [#] By default PHP seeds the MT PRNG with the time (citation needed),
|
||
which is exploitable by attackers, while Python seeds the PRNG with
|
||
output from the system CSPRNG, which is believed to be much harder to
|
||
exploit.
|
||
|
||
.. [#] http://php.net/manual/en/function.uniqid.php
|
||
|
||
.. [#] http://php.net/manual/en/function.openssl-random-pseudo-bytes.php
|
||
|
||
.. [#] Volunteers and patches are welcome.
|
||
|
||
.. [#] http://ifsec.blogspot.fr/2012/05/cross-domain-mathrandom-prediction.html
|
||
|
||
.. [#] https://developer.mozilla.org/en-US/docs/Web/API/RandomSource/getRandomValues
|
||
|
||
.. [#] https://nodejs.org/api/crypto.html
|
||
|
||
.. [#] http://ruby-doc.org/stdlib-2.1.2/libdoc/securerandom/rdoc/SecureRandom.html
|
||
|
||
.. [#] https://mail.python.org/pipermail/python-ideas/2015-September/036254.html
|
||
|
||
.. [#] https://pypi.python.org/pypi/secrets
|
||
|
||
.. [#] https://jazzy.id.au/2010/09/22/cracking_random_number_generators_part_3.html
|
||
|
||
.. [#] https://mail.python.org/pipermail/python-ideas/2015-September/036077.html
|
||
|
||
.. [#] https://media.blackhat.com/bh-us-12/Briefings/Argyros/BH_US_12_Argyros_PRNG_WP.pdf
|
||
|
||
.. [#] Personal communication, 2016-08-24.
|
||
|
||
.. [#] https://bugs.launchpad.net/zope2/+bug/1071067
|
||
|
||
.. [#] http://www.cvedetails.com/cve/CVE-2012-5508/
|
||
http://www.cvedetails.com/cve/CVE-2012-6661/
|
||
|
||
.. [#] https://github.com/django/django/commit/1525874238fd705ec17a066291935a9316bd3044
|
||
|
||
.. [#] https://mail.python.org/pipermail/python-ideas/2015-September/036157.html
|
||
|
||
.. [#] https://mail.python.org/pipermail/python-ideas/2015-September/036476.html
|
||
https://mail.python.org/pipermail/python-ideas/2015-September/036478.html
|
||
|
||
.. [#] https://mail.python.org/pipermail/python-ideas/2015-September/036488.html
|
||
|
||
.. [#] http://sockpuppet.org/blog/2014/02/25/safely-generate-random-numbers/
|
||
http://www.2uo.de/myths-about-urandom/
|
||
|
||
|
||
Copyright
|
||
=========
|
||
|
||
This document has been placed in the public domain.
|
||
|
||
|
||
|
||
..
|
||
Local Variables:
|
||
mode: indented-text
|
||
indent-tabs-mode: nil
|
||
sentence-end-double-space: t
|
||
fill-column: 70
|
||
coding: utf-8
|
||
End:
|