PEP 578: Updated text for clarity (#959)
parent
f350734f80
commit
cd42999322
117
pep-0578.rst
|
@ -28,27 +28,27 @@ are unspecified here to allow implementations the freedom to determine
|
||||||
how best to provide information to their users. Some examples likely
|
how best to provide information to their users. Some examples likely
|
||||||
to be used in CPython are provided for explanatory purposes.
|
to be used in CPython are provided for explanatory purposes.
|
||||||
|
|
||||||
See PEP-551 for discussion and recommendations on enhancing the
|
See PEP 551 for discussion and recommendations on enhancing the
|
||||||
security of a Python runtime making use of these auditing APIs.
|
security of a Python runtime making use of these auditing APIs.
|
||||||
|
|
||||||
Background
|
Background
|
||||||
==========
|
==========
|
||||||
|
|
||||||
Python provides access to a wide range of low-level functionality on
|
Python provides access to a wide range of low-level functionality on
|
||||||
many common operating systems in a consistent manner. While this is
|
many common operating systems. While this is incredibly useful for
|
||||||
incredibly useful for "write-once, run-anywhere" scripting, it also
|
"write-once, run-anywhere" scripting, it also makes monitoring of
|
||||||
makes monitoring of software written in Python difficult. Because
|
software written in Python difficult. Because Python uses native system
|
||||||
Python uses native system APIs directly, existing monitoring
|
APIs directly, existing monitoring tools either suffer from limited
|
||||||
tools either suffer from limited context or auditing bypass.
|
context or auditing bypass.
|
||||||
|
|
||||||
Limited context occurs when system monitoring can report that an
|
Limited context occurs when system monitoring can report that an
|
||||||
action occurred, but cannot explain the sequence of events leading to
|
action occurred, but cannot explain the sequence of events leading to
|
||||||
it. For example, network monitoring at the OS level may be able to
|
it. For example, network monitoring at the OS level may be able to
|
||||||
report "listening started on port 5678", but may not be able to
|
report "listening started on port 5678", but may not be able to
|
||||||
provide the process ID, command line or parent process, or the local
|
provide the process ID, command line, parent process, or the local
|
||||||
state in the program at the point that triggered the action. Firewall
|
state in the program at the point that triggered the action. Firewall
|
||||||
controls to prevent such an action are similarly limited, typically
|
controls to prevent such an action are similarly limited, typically
|
||||||
to a process name or some global state such as the current user, and
|
to process names or some global state such as the current user, and
|
||||||
in any case rarely provide a useful log file correlated with other
|
in any case rarely provide a useful log file correlated with other
|
||||||
application messages.
|
application messages.
|
||||||
|
|
||||||
|
@ -73,6 +73,10 @@ same name as the module they intend to use - for example, a
|
||||||
``random.py`` file that attempts to import the standard library
|
``random.py`` file that attempts to import the standard library
|
||||||
``random`` module.
|
``random`` module.
|
||||||
|
|
||||||
|
This is not sandboxing, as this proposal does not attempt to prevent
|
||||||
|
malicious behavior (though it enables some new options to do so).
|
||||||
|
See the `Why Not A Sandbox`_ section below for further discussion.
|
||||||
|
|
||||||
Overview of Changes
|
Overview of Changes
|
||||||
===================
|
===================
|
||||||
|
|
||||||
|
@ -84,12 +88,14 @@ We propose two API changes to enable this: an Audit Hook and Verified
|
||||||
Open Hook. Both are available from Python and native code, allowing
|
Open Hook. Both are available from Python and native code, allowing
|
||||||
applications and frameworks written in pure Python code to take
|
applications and frameworks written in pure Python code to take
|
||||||
advantage of the extra messages, while also allowing embedders or
|
advantage of the extra messages, while also allowing embedders or
|
||||||
system administrators to deploy "always-on" builds of Python.
|
system administrators to deploy builds of Python where auditing is
|
||||||
|
always enabled.
|
||||||
|
|
||||||
Only CPython is bound to provide the native APIs as described here.
|
Only CPython is bound to provide the native APIs as described here.
|
||||||
Other implementations should provide the pure Python APIs, and
|
Other implementations should provide the pure Python APIs, and
|
||||||
may provide native versions as appropriate for their underlying
|
may provide native versions as appropriate for their underlying
|
||||||
runtimes.
|
runtimes. Auditing events are likewise considered implementation
|
||||||
|
specific, but are bound by normal feature compatibility guarantees.
|
||||||
|
|
||||||
Audit Hook
|
Audit Hook
|
||||||
----------
|
----------
|
||||||
|
@ -132,9 +138,9 @@ When events of interest are occurring, code can either call
|
||||||
``PySys_Audit()`` from C (while the GIL is held) or ``sys.audit()``. The
|
``PySys_Audit()`` from C (while the GIL is held) or ``sys.audit()``. The
|
||||||
string argument is the name of the event, and the tuple contains
|
string argument is the name of the event, and the tuple contains
|
||||||
arguments. A given event name should have a fixed schema for arguments,
|
arguments. A given event name should have a fixed schema for arguments,
|
||||||
which should be considered a public API (for a given x.y version
|
which should be considered a public API (for each x.y version release),
|
||||||
release), and thus should only change between feature releases with
|
and thus should only change between feature releases with updated
|
||||||
updated documentation.
|
documentation.
|
||||||
|
|
||||||
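A minimal sketch of the fixed-schema idea, assuming a CPython version that ships ``sys.audit()`` and ``sys.addaudithook()`` (3.8 or later). The event name ``example.custom_event`` and its two-argument schema are hypothetical and chosen only for illustration:

```python
import sys

# Events observed by the hook, stored as (event_name, args_tuple) pairs.
seen = []


def record_hook(event, args):
    # Hooks receive every event, so filter on the (hypothetical) name we own.
    if event == "example.custom_event":
        seen.append((event, args))


sys.addaudithook(record_hook)

# Raise the event. The assumed schema is (label: str, count: int); keeping it
# stable lets hook authors unpack the args tuple without defensive checks.
sys.audit("example.custom_event", "payload", 42)
```

The hook sees the event name as a string and the remaining arguments packed into a tuple, which is why a documented, stable schema matters to hook authors.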
For maximum compatibility, events using the same name as an event in
|
For maximum compatibility, events using the same name as an event in
|
||||||
the reference interpreter CPython should make every attempt to use
|
the reference interpreter CPython should make every attempt to use
|
||||||
|
@ -152,7 +158,7 @@ log the event, abort the operation with an exception, or to immediately
|
||||||
terminate the process with an operating system exit call.
|
terminate the process with an operating system exit call.
|
||||||
|
|
||||||
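As an illustration of aborting an operation with an exception, the sketch below assumes CPython 3.8 or later. The event name ``example.delete_record``, the ``OperationBlocked`` exception, and the ``delete_record`` helper are all hypothetical:

```python
import sys


class OperationBlocked(RuntimeError):
    """Hypothetical exception type used by the hook to veto an operation."""


def deny_hook(event, args):
    # Raising from a hook propagates out of sys.audit() to the caller.
    if event == "example.delete_record":
        raise OperationBlocked(f"blocked {event} with args {args!r}")


sys.addaudithook(deny_hook)


def delete_record(record_id, store):
    # Audit first, so a hook can stop the operation before any state changes.
    sys.audit("example.delete_record", record_id)
    del store[record_id]


store = {1: "a"}
try:
    delete_record(1, store)
    blocked = False
except OperationBlocked:
    blocked = True
```

Placing the audit call before the side effect is the design choice that makes the veto effective: the record is still present after the exception is raised.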
When an event is audited but no hooks have been set, the ``audit()``
|
When an event is audited but no hooks have been set, the ``audit()``
|
||||||
function should include minimal overhead. Ideally, each argument is a
|
function should impose minimal overhead. Ideally, each argument is a
|
||||||
reference to existing data rather than a value calculated just for the
|
reference to existing data rather than a value calculated just for the
|
||||||
auditing call.
|
auditing call.
|
||||||
|
|
||||||
|
@ -160,15 +166,14 @@ As hooks may be Python objects, they need to be freed during
|
||||||
``Py_Finalize()``. To do this, we add an internal API
|
``Py_Finalize()``. To do this, we add an internal API
|
||||||
``_Py_ClearAuditHooks()`` that releases any Python hooks and any
|
``_Py_ClearAuditHooks()`` that releases any Python hooks and any
|
||||||
memory held. This is an internal function with no public export, and
|
memory held. This is an internal function with no public export, and
|
||||||
we recommend it should raise its own audit event for all current hooks
|
we recommend it raise its own audit event for all current hooks to
|
||||||
to ensure that unexpected calls are observed.
|
ensure that unexpected calls are observed.
|
||||||
|
|
||||||
Below in `Suggested Audit Hook Locations`_, we recommend some important
|
Below in `Suggested Audit Hook Locations`_, we recommend some important
|
||||||
operations that should raise audit events. In PEP 551, more audited
|
operations that should raise audit events.
|
||||||
operations are recommended with a view to security transparency.
|
|
||||||
|
|
||||||
Python implementations should document which operations will raise
|
Python implementations should document which operations will raise
|
||||||
audit events, along with the event schema. It is intended that
|
audit events, along with the event schema. It is intentional that
|
||||||
``sys.addaudithook(print)`` be a trivial way to display all messages.
|
``sys.addaudithook(print)`` be a trivial way to display all messages.
|
||||||
|
|
||||||
Verified Open Hook
|
Verified Open Hook
|
||||||
|
@ -176,11 +181,12 @@ Verified Open Hook
|
||||||
|
|
||||||
Most operating systems have a mechanism to distinguish between files
|
Most operating systems have a mechanism to distinguish between files
|
||||||
that can be executed and those that can not. For example, this may be an
|
that can be executed and those that can not. For example, this may be an
|
||||||
execute bit in the permissions field, or a verified hash of the file
|
execute bit in the permissions field, a verified hash of the file
|
||||||
contents to detect potential code tampering. These are an important
|
contents to detect potential code tampering, or file system path
|
||||||
security mechanism for preventing execution of data or code that is not
|
restrictions. These are an important security mechanism for preventing
|
||||||
approved for a given environment. Currently, Python has no way to
|
execution of data or code that is not approved for a given environment.
|
||||||
integrate with these when launching scripts or importing modules.
|
Currently, Python has no way to integrate with these when launching
|
||||||
|
scripts or importing modules.
|
||||||
|
|
||||||
The new public C API for the verified open hook is::
|
The new public C API for the verified open hook is::
|
||||||
|
|
||||||
|
@ -201,6 +207,7 @@ The ``importlib.util.open_for_import()`` function is a drop-in
|
||||||
replacement for ``open(str(pathlike), 'rb')``. Its default behaviour is
|
replacement for ``open(str(pathlike), 'rb')``. Its default behaviour is
|
||||||
to open a file for raw, binary access. To change the behaviour a new
|
to open a file for raw, binary access. To change the behaviour a new
|
||||||
handler should be set. Handler functions only accept ``str`` arguments.
|
handler should be set. Handler functions only accept ``str`` arguments.
|
||||||
|
The C API ``PyImport_OpenForImport`` function assumes UTF-8 encoding.
|
||||||
|
|
||||||
A custom handler may be set by calling ``PyImport_SetOpenForImportHook()``
|
A custom handler may be set by calling ``PyImport_SetOpenForImportHook()``
|
||||||
from C at any time, including before ``Py_Initialize()``. However, if a
|
from C at any time, including before ``Py_Initialize()``. However, if a
|
||||||
|
@ -209,9 +216,7 @@ hook has already been set then the call will fail. When
|
||||||
the path and its return value will be returned directly. The returned
|
the path and its return value will be returned directly. The returned
|
||||||
object should be an open file-like object that supports reading raw
|
object should be an open file-like object that supports reading raw
|
||||||
bytes. This is explicitly intended to allow a ``BytesIO`` instance if
|
bytes. This is explicitly intended to allow a ``BytesIO`` instance if
|
||||||
the open handler has already had to read the file into memory in order
|
the open handler has already read the entire file into memory.
|
||||||
to perform whatever verification is necessary to determine whether the
|
|
||||||
content is permitted to be executed.
|
|
||||||
|
|
||||||
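The following is a sketch of what such a handler might look like, not the API itself. The names ``make_verified_open`` and ``verified_open`` and the SHA-256 allow-list policy are assumptions made for illustration; the code deliberately does not call any hook registration function and only shows a handler that accepts a ``str`` path and returns a ``BytesIO`` after reading the whole file:

```python
import hashlib
import io
import os
import tempfile


def make_verified_open(allowed_digests):
    """Build a hypothetical handler that only opens approved files."""

    def verified_open(path):
        # Read the whole file once so the bytes that were verified are
        # exactly the bytes that get returned to the caller.
        with open(path, "rb") as f:
            data = f.read()
        digest = hashlib.sha256(data).hexdigest()
        if digest not in allowed_digests:
            raise PermissionError(f"{path} is not approved for execution")
        return io.BytesIO(data)

    return verified_open


with tempfile.TemporaryDirectory() as tmp:
    good = os.path.join(tmp, "good.py")
    bad = os.path.join(tmp, "bad.py")
    with open(good, "wb") as f:
        f.write(b"x = 1\n")
    with open(bad, "wb") as f:
        f.write(b"x = 2\n")

    # Approve only the contents of good.py.
    handler = make_verified_open({hashlib.sha256(b"x = 1\n").hexdigest()})

    good_bytes = handler(good).read()
    try:
        handler(bad)
        rejected = False
    except PermissionError:
        rejected = True
```

Returning the in-memory buffer rather than reopening the file avoids a window in which the file could change between verification and use.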
Note that these hooks can import and call the ``_io.open()`` function on
|
Note that these hooks can import and call the ``_io.open()`` function on
|
||||||
CPython without triggering themselves. They can also use ``_io.BytesIO``
|
CPython without triggering themselves. They can also use ``_io.BytesIO``
|
||||||
|
@ -301,6 +306,11 @@ see which operations provide audit events.
|
||||||
file. All arguments other than the module name may be ``None`` if
|
file. All arguments other than the module name may be ``None`` if
|
||||||
they are not used or available.
|
they are not used or available.
|
||||||
"
|
"
|
||||||
|
"``open``", ``open``, "``(path, mode, flags)``", "Detect when a file
|
||||||
|
is about to be opened. *path* and *mode* are the usual parameters to
|
||||||
|
``open`` if available, while *flags* is provided instead of *mode*
|
||||||
|
in some cases.
|
||||||
|
"
|
||||||
``PyEval_SetProfile``, ``sys.setprofile``, "", "Detect when code is
|
``PyEval_SetProfile``, ``sys.setprofile``, "", "Detect when code is
|
||||||
injecting trace functions. Because of the implementation, exceptions
|
injecting trace functions. Because of the implementation, exceptions
|
||||||
raised from the hook will abort the operation, but will not be
|
raised from the hook will abort the operation, but will not be
|
||||||
|
@ -387,10 +397,9 @@ Performance Impact
|
||||||
|
|
||||||
The important performance impact is the case where events are being
|
The important performance impact is the case where events are being
|
||||||
raised but there are no hooks attached. This is the unavoidable case -
|
raised but there are no hooks attached. This is the unavoidable case -
|
||||||
once a distributor begins adding audit hooks they have explicitly
|
once a developer has added audit hooks they have explicitly chosen to
|
||||||
chosen to trade performance for functionality. Performance import
|
trade performance for functionality. Performance impact with hooks added
|
||||||
with hooks added are not of interest here, since this is considered
|
are not of interest here, since this is opt-in functionality.
|
||||||
opt-in functionality.
|
|
||||||
|
|
||||||
Analysis using the Python Performance Benchmark Suite [1]_ shows no
|
Analysis using the Python Performance Benchmark Suite [1]_ shows no
|
||||||
significant impact, with the vast majority of benchmarks showing
|
significant impact, with the vast majority of benchmarks showing
|
||||||
|
@ -415,13 +424,13 @@ always be present. The nature of these hooks is that they must be
|
||||||
callable without condition, as any conditional imports or calls provide
|
callable without condition, as any conditional imports or calls provide
|
||||||
opportunities to intercept and suppress or modify events.
|
opportunities to intercept and suppress or modify events.
|
||||||
|
|
||||||
Given its nature as one of the most core modules, the ``sys`` module is
|
Given it is one of the most core modules, the ``sys`` module is somewhat
|
||||||
somewhat protected against module shadowing attacks. Replacing ``sys``
|
protected against module shadowing attacks. Replacing ``sys`` with a
|
||||||
with a sufficiently functional module that the application can still run
|
sufficiently functional module that the application can still run is a
|
||||||
is a much more complicated task than replacing a module with only one
|
much more complicated task than replacing a module with only one
|
||||||
function of interest. An attacker that has the ability to shadow the
|
function of interest. An attacker that has the ability to shadow the
|
||||||
``sys`` module is already capable of running arbitrary code from files,
|
``sys`` module is already capable of running arbitrary code from files,
|
||||||
whereas an ``audit`` module can be replaced with a single line in a
|
whereas an ``audit`` module could be replaced with a single line in a
|
||||||
``.pth`` file anywhere on the search path::
|
``.pth`` file anywhere on the search path::
|
||||||
|
|
||||||
import sys; sys.modules['audit'] = type('audit', (object,),
|
import sys; sys.modules['audit'] = type('audit', (object,),
|
||||||
|
@ -431,8 +440,8 @@ Multiple layers of protection already exist for monkey patching attacks
|
||||||
against either ``sys`` or ``audit``, but assignments or insertions to
|
against either ``sys`` or ``audit``, but assignments or insertions to
|
||||||
``sys.modules`` are not audited.
|
``sys.modules`` are not audited.
|
||||||
|
|
||||||
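To make the concern concrete, here is a small sketch of how a separate ``audit`` module could be silenced. It assumes a hypothetical standalone ``audit`` module exposing ``audit()`` and ``addaudithook()``; no such module exists, which is why the stub below is the only ``audit`` the import can find:

```python
import sys
import types

# Build a stand-in module whose functions silently do nothing.
stub = types.ModuleType("audit")
stub.audit = lambda *args: None
stub.addaudithook = lambda hook: None

# A single assignment is enough: later imports resolve to the stub.
sys.modules["audit"] = stub

import audit  # picks up the stub from sys.modules

calls = []
audit.addaudithook(lambda *args: calls.append(args))
audit.audit("example.event", 1)  # hypothetical event; never reaches the hook
```

Every caller that imports ``audit`` after the assignment gets the no-op version, so no event is ever delivered.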
This idea is rejected because it makes substituting ``audit`` calls
|
This idea is rejected because it makes it trivial to suppress all calls
|
||||||
throughout all callers trivial.
|
to ``audit``.
|
||||||
|
|
||||||
Flag in sys.flags to indicate "audited" mode
|
Flag in sys.flags to indicate "audited" mode
|
||||||
--------------------------------------------
|
--------------------------------------------
|
||||||
|
@ -465,6 +474,34 @@ This idea is rejected because there are no appropriate reasons for an
|
||||||
application to change its behaviour based on whether these APIs are in
|
application to change its behaviour based on whether these APIs are in
|
||||||
use.
|
use.
|
||||||
|
|
||||||
|
Why Not A Sandbox
|
||||||
|
=================
|
||||||
|
|
||||||
|
Sandboxing CPython has been attempted many times in the past, and each
|
||||||
|
past attempt has failed. Fundamentally, the problem is that certain
|
||||||
|
functionality has to be restricted when executing the sandboxed code,
|
||||||
|
but otherwise needs to be available for normal operation of Python. For
|
||||||
|
example, completely removing the ability to compile strings into
|
||||||
|
bytecode also breaks the ability to import modules from source code, and
|
||||||
|
if it is not completely removed then there are too many ways to get
|
||||||
|
access to that functionality indirectly. There is not yet any feasible
|
||||||
|
way to generically determine whether a given operation is "safe" or not.
|
||||||
|
Further information and references are available at [2]_.
|
||||||
|
|
||||||
|
This proposal does not attempt to restrict functionality, but simply
|
||||||
|
exposes the fact that the functionality is being used. Particularly for
|
||||||
|
intrusion scenarios, detection is significantly more important than
|
||||||
|
early prevention (as early prevention will generally drive attackers to
|
||||||
|
use an alternate, less-detectable, approach). The availability of audit
|
||||||
|
hooks alone does not change the attack surface of Python in any way, but
|
||||||
|
they enable defenders to integrate Python into their environment in ways
|
||||||
|
that are currently not possible.
|
||||||
|
|
||||||
|
Since audit hooks have the ability to safely prevent an operation
|
||||||
|
occurring, this feature does enable the ability to provide some level of
|
||||||
|
sandboxing. In most cases, however, the intention is to enable logging
|
||||||
|
rather than creating a sandbox.
|
||||||
|
|
||||||
Relationship to PEP 551
|
Relationship to PEP 551
|
||||||
=======================
|
=======================
|
||||||
|
|
||||||
|
@ -483,10 +520,12 @@ References
|
||||||
|
|
||||||
.. [1] Python Performance Benchmark Suite `<https://github.com/python/performance>`_
|
.. [1] Python Performance Benchmark Suite `<https://github.com/python/performance>`_
|
||||||
|
|
||||||
|
.. [2] Python Security model - Sandbox `<https://python-security.readthedocs.io/security.html#sandbox>`_
|
||||||
|
|
||||||
Copyright
|
Copyright
|
||||||
=========
|
=========
|
||||||
|
|
||||||
Copyright (c) 2018 by Microsoft Corporation. This material may be
|
Copyright (c) 2019 by Microsoft Corporation. This material may be
|
||||||
distributed only subject to the terms and conditions set forth in the
|
distributed only subject to the terms and conditions set forth in the
|
||||||
Open Publication License, v1.0 or later (the latest version is presently
|
Open Publication License, v1.0 or later (the latest version is presently
|
||||||
available at http://www.opencontent.org/openpub/).
|
available at http://www.opencontent.org/openpub/).
|
||||||
|
|