Major simplification of PEP 504
- drop the submodule idea
- call random.ensure_repeatable() to opt in to the PRNG
- seed(), getstate(), setstate() all call ensure_repeatable()
parent 2702cd15b3
commit 107927ee7a
241
pep-0504.txt
|
@@ -24,22 +24,19 @@ Unfortunately, this approach has resulted in a situation where developers that
|
||||||
aren't aware that they're doing security sensitive work use the default module
|
aren't aware that they're doing security sensitive work use the default module
|
||||||
level APIs, and thus expose their users to unnecessary risks.
|
level APIs, and thus expose their users to unnecessary risks.
|
||||||
|
|
||||||
This isn't an acute problem, but it is a chronic one, and if documentation and
|
This isn't an acute problem, but it is a chronic one, and the often long
|
||||||
developer education were going to solve it, they would have done so by now.
|
delays between the introduction of security flaws and their exploitation mean
|
||||||
|
that it is difficult for developers to naturally learn from experience.
|
||||||
|
|
||||||
In order to provide an eventually pervasive solution to the problem, this PEP
|
In order to provide an eventually pervasive solution to the problem, this PEP
|
||||||
proposes that Python switch to using the system random number generator by
|
proposes that Python switch to using the system random number generator by
|
||||||
default in Python 3.6, and require developers to opt-in to using the
|
default in Python 3.6, and require developers to opt-in to using the
|
||||||
deterministic random number generator.
|
deterministic random number generator process wide either by using a new
|
||||||
|
``random.ensure_repeatable()`` API, or by explicitly creating their own
|
||||||
|
``random.Random()`` instance.
|
||||||
|
|
||||||
To minimise the compatibility break, calling any of the following module level
|
To minimise the impact on existing code, module level APIs that require
|
||||||
functions will count as opting in to using the deterministic random number
|
determinism will implicitly switch to the deterministic PRNG.
|
||||||
generator for all future calls to module level functions in the random
|
|
||||||
module in the same process:
|
|
||||||
|
|
||||||
* ``random.seed``
|
|
||||||
* ``random.getstate``
|
|
||||||
* ``random.setstate``
|
|
||||||
|
|
||||||
Proposal
|
Proposal
|
||||||
========
|
========
|
||||||
|
@@ -48,94 +45,130 @@ Currently, it is never correct to use the module level functions in the
|
||||||
``random`` module for security sensitive applications. This PEP proposes to
|
``random`` module for security sensitive applications. This PEP proposes to
|
||||||
change that admonition in Python 3.6+ to instead be that it is not correct to
|
change that admonition in Python 3.6+ to instead be that it is not correct to
|
||||||
use the module level functions in the ``random`` module for security sensitive
|
use the module level functions in the ``random`` module for security sensitive
|
||||||
applications if ``random.seed``, ``random.getstate``, or ``random.setstate``
|
applications if ``random.ensure_repeatable()`` is ever called (directly or
|
||||||
are ever called in that process.
|
indirectly) in that process.
|
||||||
|
|
||||||
This PEP further proposes to make it easier to explicitly opt in to using
|
To achieve this, rather than being bound methods of a ``random.Random``
|
||||||
either the system random number generator or Python's deterministic PRNG by
|
instance as they are today, the module level callables in ``random`` would
|
||||||
converting the random module to a package that exposes the same top-level API,
|
change to be functions that delegate to the corresponding method of the
|
||||||
and offering two new subpackages:
|
existing ``random._inst`` module attribute.
|
||||||
|
|
||||||
* ``random.system``
|
By default, this attribute will be bound to a ``random.SystemRandom`` instance.
|
||||||
* ``random.seedable``
|
|
||||||
|
|
||||||
The ``random.system`` submodule would provide the following bound methods of a
|
A new ``random.ensure_repeatable()`` API will then rebind the ``random._inst``
|
||||||
module global ``random.SystemRandom`` instance as module attributes:
|
attribute to a ``random.Random`` instance, restoring the same module level
|
||||||
``betavariate``, ``choice``, ``expovariate``, ``gammavariate``, ``gauss``, ``getrandbits``, ``lognormvariate``, ``normalvariate``, ``paretovariate``,
|
API behaviour as existed in previous Python versions (aside from the
|
||||||
``randint``, ``random``, ``randrange``, ``sample``, ``shuffle``,
|
additional level of indirection)::
|
||||||
``triangular``, ``uniform``, ``vonmisesvariate``, ``weibullvariate``
|
|
||||||
|
|
||||||
The ``random.seedable`` submodule would provide the same operations, but as
|
def ensure_repeatable():
|
||||||
methods of a ``random.Random`` instance. In addition, it would provide the
|
    """Switch to using random.Random() for the module level APIs"""
    global _inst
|
||||||
following additional methods which are only meaningful when using a
|
    # SystemRandom is a subclass of Random, so check for it explicitly
    if isinstance(_inst, SystemRandom):
|
||||||
deterministic random number generator: ``seed``, ``getstate``, ``setstate``.
|
        _inst = Random()
|
||||||
|
|
||||||
Rather than being bound methods of a ``random.Random`` instance as they are
|
To minimise the impact on existing code, calling any of the following module
|
||||||
today, the module level callables in ``random`` itself would change to be
|
level functions will implicitly call ``random.ensure_repeatable()``:
|
||||||
functions that, by default, delegated to the ``random.SystemRandom`` instance
|
|
||||||
in ``random.system``.
|
|
||||||
|
|
||||||
Calling any one of ``random.seed``, ``random.getstate``, or ``random.setstate``
|
* ``random.seed``
|
||||||
would change the delegation to instead refer to the ``random.Random`` instance
|
* ``random.getstate``
|
||||||
in ``random.seedable``.
|
* ``random.setstate``
|
||||||
|
|
||||||
|
There are no changes proposed to the ``random.Random`` or
|
||||||
|
``random.SystemRandom`` class APIs - applications that explicitly instantiate
|
||||||
|
their own random number generators will be entirely unaffected by this
|
||||||
|
proposal.
|
||||||
|
|
||||||
Warning on implicit opt-in
|
Warning on implicit opt-in
|
||||||
--------------------------
|
--------------------------
|
||||||
|
|
||||||
In Python 3.6, implicitly opting in to the use of the seedable PRNG will emit a
|
In Python 3.6, implicitly opting in to the use of the deterministic PRNG will
|
||||||
deprecation warning. This warning will suggest explicitly opting in to either
|
emit a deprecation warning using the following check::
|
||||||
the system RNG or the seedable PRNG. Possible wording:
|
|
||||||
|
|
||||||
"DeprecationWarning: Implicitly switching to the seedable PRNG. Consider
|
if isinstance(_inst, SystemRandom):
|
||||||
importing from random.system or random.seedable as appropriate"
|
    warnings.warn("Implicitly ensuring repeatability. "
                  "See help(random.ensure_repeatable) for details",
                  DeprecationWarning)
|
||||||
|
    ensure_repeatable()
|
||||||
|
|
||||||
Whatever precise wording is chosen should have an answer added to Stack
|
The specific wording of the warning should have a suitable answer added to
|
||||||
Overflow as was done for the custom error message that was added for missing
|
Stack Overflow as was done for the custom error message that was added for
|
||||||
parentheses in a call to print [#print]_.
|
missing parentheses in a call to print [#print]_.
|
||||||
|
|
||||||
In the first Python 3 release after Python 2.7 switches to security fix only
|
In the first Python 3 release after Python 2.7 switches to security fix only
|
||||||
mode, the deprecation warning will be upgraded to a RuntimeWarning so it is
|
mode, the deprecation warning will be upgraded to a RuntimeWarning so it is
|
||||||
visible by default.
|
visible by default.
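
As a rough illustration of why the warning category matters, the sketch below emits a ``DeprecationWarning`` with the wording proposed above and records it with the ``warnings`` module. The ``implicit_opt_in`` helper is hypothetical and exists only for this example. Under Python's default filters a ``DeprecationWarning`` is usually hidden unless it is triggered from ``__main__`` or a filter enables it, whereas a ``RuntimeWarning`` is shown by default.

```python
import warnings


def implicit_opt_in():
    """Hypothetical helper mimicking the proposed implicit opt-in warning."""
    warnings.warn(
        "Implicitly ensuring repeatability. "
        "See help(random.ensure_repeatable) for details",
        DeprecationWarning,
        stacklevel=2,
    )


# Record every warning, regardless of the interpreter's default filters.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    implicit_opt_in()

for entry in caught:
    print(entry.category.__name__, "-", entry.message)
```

Test suites can use the same pattern, or run with ``-W error::DeprecationWarning``, to find implicit opt-ins before the warning becomes visible by default.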
|
||||||
|
|
||||||
This PEP does *not* propose removing the ability to seed the default RNG used
|
This PEP does *not* propose ever removing the ability to ensure the default RNG
|
||||||
process wide - it's not a good idea relative to the alternative of explicitly
|
used process wide is a deterministic PRNG that will produce the same series of
|
||||||
importing from the appropriate submodule (hence the eventually
|
outputs given a specific seed. That capability is widely used in modelling
|
||||||
visible-by-default warning), but it's also a concern that can be more
|
and simulation scenarios, and requiring that ``ensure_repeatable()`` be called
|
||||||
readily addressed on a project-by-project basis.
|
either directly or indirectly is a sufficient enhancement to address the cases
|
||||||
|
where the module level random API is used for security sensitive tasks in web
|
||||||
|
applications without due consideration for the potential security implications
|
||||||
|
of using a deterministic PRNG.
|
||||||
|
|
||||||
|
Performance impact
|
||||||
|
------------------
|
||||||
|
|
||||||
|
Due to the large performance difference between ``random.Random`` and
|
||||||
|
``random.SystemRandom``, applications ported to Python 3.6 will encounter a
|
||||||
|
significant performance regression in cases where:
|
||||||
|
|
||||||
|
* the application is using the module level random API
|
||||||
|
* cryptographic quality randomness isn't needed
|
||||||
|
* the application doesn't already implicitly opt back in to the deterministic
|
||||||
|
PRNG by calling ``random.seed``, ``random.getstate``, or ``random.setstate``
|
||||||
|
* the application isn't updated to explicitly call ``random.ensure_repeatable``
|
||||||
|
|
||||||
|
This would be noted in the Porting section of the Python 3.6 What's New guide,
|
||||||
|
with the recommendation to include the following code in the ``__main__``
|
||||||
|
module of affected applications::
|
||||||
|
|
||||||
|
import random
if hasattr(random, "ensure_repeatable"):
    random.ensure_repeatable()
|
||||||
|
|
||||||
|
Applications that do need cryptographic quality randomness should be using the
|
||||||
|
system random number generator regardless of speed considerations, so in those
|
||||||
|
cases the change proposed in this PEP will fix a previously latent security
|
||||||
|
defect.
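
One way to gauge the size of the difference on a particular machine is a quick ``timeit`` comparison. This is a minimal sketch rather than a rigorous benchmark: the call count is an arbitrary choice, and the absolute numbers depend on the platform and its system RNG.

```python
import random
import timeit

deterministic = random.Random()
system = random.SystemRandom()

CALLS = 10_000  # arbitrary sample size for this sketch

# Time the same operation against each generator.
det_seconds = timeit.timeit(deterministic.random, number=CALLS)
sys_seconds = timeit.timeit(system.random, number=CALLS)

print(f"random.Random:       {det_seconds:.4f}s for {CALLS} calls")
print(f"random.SystemRandom: {sys_seconds:.4f}s for {CALLS} calls")
```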
|
||||||
|
|
||||||
Documentation changes
|
Documentation changes
|
||||||
---------------------
|
---------------------
|
||||||
|
|
||||||
The ``random`` module documentation would be updated to move the documentation
|
The ``random`` module documentation would be updated to move the documentation
|
||||||
of the ``seed``, ``getstate`` and ``setstate`` interfaces later in the module,
|
of the ``seed``, ``getstate`` and ``setstate`` interfaces later in the module,
|
||||||
along with the associated security warning.
|
along with the documentation of the new ``ensure_repeatable`` function and the
|
||||||
|
associated security warning.
|
||||||
|
|
||||||
The docs would gain a discussion of the respective use cases for the seedable
|
That section of the module documentation would also gain a discussion of the
|
||||||
PRNG (games, modelling & simulation, software testing) and the system RNG
|
respective use cases for the deterministic PRNG enabled by
|
||||||
(cryptography, security token generation).
|
``ensure_repeatable`` (games, modelling & simulation, software testing) and the
|
||||||
|
system RNG that is used by default (cryptography, security token generation).
|
||||||
|
This discussion will also recommend the use of third party security libraries
|
||||||
|
for the latter task.
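
The two use cases can already be kept apart today by creating explicit instances, which this proposal leaves unchanged. The sketch below is illustrative only: ``make_token`` and ``simulate_rolls`` are hypothetical helper names, and the token alphabet and length are arbitrary choices for the example.

```python
import random
import string

ALPHABET = string.ascii_letters + string.digits


def make_token(length=32):
    """Security token: always draw from the system RNG."""
    rng = random.SystemRandom()
    return "".join(rng.choice(ALPHABET) for _ in range(length))


def simulate_rolls(seed, count=5):
    """Simulation: a private, seeded PRNG gives a repeatable series."""
    rng = random.Random(seed)
    return [rng.randint(1, 6) for _ in range(count)]


print(make_token())
print(simulate_rolls(2015) == simulate_rolls(2015))
```

Neither helper touches the module level functions, so neither is affected by which generator the module level API happens to be using.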
|
||||||
|
|
||||||
Rationale
|
Rationale
|
||||||
=========
|
=========
|
||||||
|
|
||||||
Writing secure software under deadline and budget pressures is a hard problem.
|
Writing secure software under deadline and budget pressures is a hard problem.
|
||||||
This is reflected in ongoing problems with data breaches involving personally
|
This is reflected in regular notifications of data breaches involving personally
|
||||||
identifiable information [#breaches]_, as well as with failures to take
|
identifiable information [#breaches]_, as well as with failures to take
|
||||||
security considerations into account when new systems, like motor vehicles
|
security considerations into account when new systems, like motor vehicles
|
||||||
[#uconnect]_, are connected to the internet. Compounding the issue is the fact
|
[#uconnect]_, are connected to the internet. It's also the case that a lot of
|
||||||
that a lot of the programming advice readily available on the internet [#search]_
|
the programming advice readily available on the internet [#search]_ simply
|
||||||
simply doesn't take the mathematical arcana of computer security into account,
|
doesn't take the mathematical arcana of computer security into account.
|
||||||
and the fact that defenders have to cover *all* of their potential
|
Compounding these issues is the fact that defenders have to cover *all* of
|
||||||
vulnerabilities, as a single mistake can make it possible to subvert other
|
their potential vulnerabilities, as a single mistake can make it possible to
|
||||||
defences [#bcrypt]_.
|
subvert other defences [#bcrypt]_.
|
||||||
|
|
||||||
One of the factors that contributes to making this last aspect particularly
|
One of the factors that contributes to making this last aspect particularly
|
||||||
difficult is APIs where using them inappropriately creates a *silent* security
|
difficult is APIs where using them inappropriately creates a *silent* security
|
||||||
failure - one where the only way to find out that what you're doing is
|
failure - one where the only way to find out that what you're doing is
|
||||||
incorrect is for someone reviewing your code to say "that's a potential
|
incorrect is for someone reviewing your code to say "that's a potential
|
||||||
security problem", or for a system you're responsible for to be compromised
|
security problem", or for a system you're responsible for to be compromised
|
||||||
through such an oversight (and your intrusion detection and auditing mechanisms
|
through such an oversight (and you're not only still responsible for that
|
||||||
are good enough for you to be able to figure out after the event how the
|
system when it is compromised, but your intrusion detection and auditing
|
||||||
compromise took place).
|
mechanisms are good enough for you to be able to figure out after the event
|
||||||
|
how the compromise took place).
|
||||||
|
|
||||||
This kind of situation is a significant contributor to "security fatigue",
|
This kind of situation is a significant contributor to "security fatigue",
|
||||||
where developers (often rightly [#owasptopten]_) feel that security engineers
|
where developers (often rightly [#owasptopten]_) feel that security engineers
|
||||||
|
@@ -151,8 +184,8 @@ threats, and less time fighting with default language behaviours.
|
||||||
Discussion
|
Discussion
|
||||||
==========
|
==========
|
||||||
|
|
||||||
Why "seedable" over "deterministic"?
|
Why "ensure_repeatable" over "ensure_deterministic"?
|
||||||
------------------------------------
|
----------------------------------------------------
|
||||||
|
|
||||||
This is a case where the meaning of a word as specialist jargon conflicts with
|
This is a case where the meaning of a word as specialist jargon conflicts with
|
||||||
the typical meaning of the word, even though it's *technically* the same.
|
the typical meaning of the word, even though it's *technically* the same.
|
||||||
|
@@ -163,16 +196,17 @@ future states.
|
||||||
|
|
||||||
The problem is that "deterministic" on its own doesn't convey those qualifiers,
|
The problem is that "deterministic" on its own doesn't convey those qualifiers,
|
||||||
so it's likely to instead be interpreted as "predictable" or "not random" by
|
so it's likely to instead be interpreted as "predictable" or "not random" by
|
||||||
folks that aren't familiar with the technical meaning.
|
folks that are familiar with the conventional meaning, but aren't familiar with
|
||||||
|
the additional qualifiers on the technical meaning.
|
||||||
|
|
||||||
The other problem with "deterministic" as a description for the traditional RNG
|
A second problem with "deterministic" as a description for the traditional RNG
|
||||||
is that it doesn't tell you what you can *do* with the traditional RNG that you
|
is that it doesn't really tell you what you can *do* with the traditional RNG
|
||||||
can't do with the system one.
|
that you can't do with the system one.
|
||||||
|
|
||||||
"seedable" aims to address both those problems, as it doesn't have a misleading
|
"ensure_repeatable" aims to address both of those problems, as its common
|
||||||
common meaning, and it's a word form that means "you can seed this", which then
|
meaning accurately describes the main reason for preferring the deterministic
|
||||||
leads naturally into an exploration of what it means to "seed" a random number
|
PRNG over the system RNG: ensuring you can repeat the same series of outputs
|
||||||
generator.
|
by providing the same seed value, or by restoring a previously saved PRNG state.
|
||||||
|
|
||||||
Only changing the default for Python 3.6+
|
Only changing the default for Python 3.6+
|
||||||
-----------------------------------------
|
-----------------------------------------
|
||||||
|
@@ -184,9 +218,9 @@ change to all currently supported versions of Python.
|
||||||
|
|
||||||
The difference in this case is one of degree - the additional benefits from
|
The difference in this case is one of degree - the additional benefits from
|
||||||
rolling out this particular change a couple of years earlier than will
|
rolling out this particular change a couple of years earlier than will
|
||||||
otherwise be the case aren't sufficient to justify the additional effort and
|
otherwise be the case aren't sufficient to justify either the additional effort
|
||||||
stability risks involved in making such an intrusive change in a maintenance
|
or the stability risks involved in making such an intrusive change in a
|
||||||
release.
|
maintenance release.
|
||||||
|
|
||||||
Keeping the module level functions
|
Keeping the module level functions
|
||||||
----------------------------------
|
----------------------------------
|
||||||
|
@@ -198,18 +232,24 @@ of the current ``random`` module API. Accordingly, this proposal ensures that
|
||||||
most of the public API can continue to be used not only without modification,
|
most of the public API can continue to be used not only without modification,
|
||||||
but without generating any new warnings.
|
but without generating any new warnings.
|
||||||
|
|
||||||
Implicitly opting in to the deterministic RNG
|
Warning when implicitly opting in to the deterministic RNG
|
||||||
---------------------------------------------
|
----------------------------------------------------------
|
||||||
|
|
||||||
Python is widely used for modelling and simulation purposes, and in many cases,
|
It's necessary to implicitly opt in to the deterministic PRNG as Python is
|
||||||
these software models won't have a dedicated maintenance team tasked with
|
widely used for modelling and simulation purposes where this is the right
|
||||||
ensuing they keep working on the latest versions of Python.
|
thing to do, and in many cases, these software models won't have a dedicated
|
||||||
|
maintenance team tasked with ensuring they keep working on the latest versions
|
||||||
|
of Python.
|
||||||
|
|
||||||
|
Unfortunately, explicitly calling ``random.seed`` with data from ``os.urandom``
|
||||||
|
is also a mistake that appears in a number of the flawed "how to generate a
|
||||||
|
security token in Python" guides readily available online.
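
A small demonstration of why seeding from ``os.urandom`` does not help: the generator is still the deterministic PRNG, so anyone who obtains or reconstructs its internal state can reproduce its output. This sketch uses a private ``random.Random`` instance and simply replays a saved state; it is not an attack, only an illustration of the repeatability property.

```python
import os
import random

# Seeding with OS entropy still produces the deterministic PRNG.
rng = random.Random(os.urandom(16))

saved_state = rng.getstate()
token_a = rng.getrandbits(128)

# Restoring the saved state replays exactly the same "token".
rng.setstate(saved_state)
token_b = rng.getrandbits(128)

print(token_a == token_b)
```

A ``random.SystemRandom`` instance has no such state to save or restore, which is the property security tokens rely on.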
|
||||||
|
|
||||||
Using first DeprecationWarning, and then eventually a RuntimeWarning, to
|
Using first DeprecationWarning, and then eventually a RuntimeWarning, to
|
||||||
advise against implicitly switching to the deterministic PRNG, preserves
|
advise against implicitly switching to the deterministic PRNG aims to
|
||||||
compatibility with this existing software, while still nudging future users
|
nudge future users that need a cryptographically secure RNG away from
|
||||||
that need a deterministic generator towards importing ``random.seedable``
|
calling ``random.seed()`` and those that genuinely need a deterministic
|
||||||
explicitly.
|
generator towards explicitly calling ``random.ensure_repeatable()``.
|
||||||
|
|
||||||
Avoiding the introduction of a userspace CSPRNG
|
Avoiding the introduction of a userspace CSPRNG
|
||||||
-----------------------------------------------
|
-----------------------------------------------
|
||||||
|
@@ -224,23 +264,9 @@ point of failure in security sensitive situations, for the sake of applications
|
||||||
where the random number generation may not even be on a critical performance
|
where the random number generation may not even be on a critical performance
|
||||||
path.
|
path.
|
||||||
|
|
||||||
What about the performance impact?
|
Applications that do need cryptographic quality randomness should be using the
|
||||||
----------------------------------
|
system random number generator regardless of speed considerations, so in those
|
||||||
|
cases the change proposed in this PEP will fix a previously latent security
defect.
|
||||||
Rather than introducing a userspace CSPRNG, this PEP instead proposes that we
|
|
||||||
accept the performance regression in cases where:
|
|
||||||
|
|
||||||
* an application is using the module level random API
|
|
||||||
* cryptographic quality randomness isn't needed
|
|
||||||
* the application doesn't already implicitly opt back in to the deterministic
|
|
||||||
PRNG by calling ``random.seed``, ``random.getstate``, or ``random.setstate``
|
|
||||||
* the application isn't updated to explicitly import from ``random.seedable``
|
|
||||||
rather than ``random``
|
|
||||||
|
|
||||||
Applications that need cryptographic quality randomness should be using the
|
|
||||||
system random number generator regardless of speed considerations, while other
|
|
||||||
applications where speed is a more important consideration are better off with
|
|
||||||
the current PRNG implementation than they would be with a new CSPRNG.
|
|
||||||
|
|
||||||
Isn't the deterministic PRNG "secure enough"?
|
Isn't the deterministic PRNG "secure enough"?
|
||||||
---------------------------------------------
|
---------------------------------------------
|
||||||
|
@@ -252,6 +278,11 @@ studies of PHP's random number generator [#php]_ have demonstrated the ability
|
||||||
to use weaknesses in that subsystem to facilitate a practical attack on
|
to use weaknesses in that subsystem to facilitate a practical attack on
|
||||||
password recovery tokens in popular PHP web applications.
|
password recovery tokens in popular PHP web applications.
|
||||||
|
|
||||||
|
However, one of the rules of secure software development is that "attacks only
|
||||||
|
get better, never worse", so it may be that by the time Python 3.6 is released
|
||||||
|
we will actually see a practical attack on Python's deterministic PRNG publicly
|
||||||
|
documented.
|
||||||
|
|
||||||
Security fatigue in the Python ecosystem
|
Security fatigue in the Python ecosystem
|
||||||
----------------------------------------
|
----------------------------------------
|
||||||
|
|
||||||
|
@@ -264,9 +295,9 @@ on Linux systems in general, a fair share of that burden has fallen on the
|
||||||
Python ecosystem, which is understandably frustrating for Pythonistas using
|
Python ecosystem, which is understandably frustrating for Pythonistas using
|
||||||
Python in other contexts where these issues aren't of as great a concern.
|
Python in other contexts where these issues aren't of as great a concern.
|
||||||
|
|
||||||
This consideration is one of the primary factors driving the backwards
|
This consideration is one of the primary factors driving the substantial
|
||||||
compatibility improvements in this proposal relative to the initial draft
|
backwards compatibility improvements in this proposal relative to the initial
|
||||||
concept posted to python-ideas [#draft]_.
|
draft concept posted to python-ideas [#draft]_.
|
||||||
|
|
||||||
Acknowledgements
|
Acknowledgements
|
||||||
================
|
================
|
||||||
|
@@ -284,6 +315,8 @@ Acknowledgements
|
||||||
experts that suggested the introduction of a userspace CSPRNG would mean
|
experts that suggested the introduction of a userspace CSPRNG would mean
|
||||||
additional complexity for insufficient gain relative to just using the
|
additional complexity for insufficient gain relative to just using the
|
||||||
system RNG directly
|
system RNG directly
|
||||||
|
* Paul Moore for eloquently making the case for the current level of security
|
||||||
|
fatigue in the Python ecosystem
|
||||||
|
|
||||||
References
|
References
|
||||||
==========
|
==========
|
||||||
|
|