PEP 578: Updated text for clarity (#959)
parent
f350734f80
commit
cd42999322
117
pep-0578.rst
|
@ -28,27 +28,27 @@ are unspecified here to allow implementations the freedom to determine
|
|||
how best to provide information to their users. Some examples likely
|
||||
to be used in CPython are provided for explanatory purposes.
|
||||
|
||||
See PEP-551 for discussion and recommendations on enhancing the
|
||||
See PEP 551 for discussion and recommendations on enhancing the
|
||||
security of a Python runtime making use of these auditing APIs.
|
||||
|
||||
Background
|
||||
==========
|
||||
|
||||
Python provides access to a wide range of low-level functionality on
|
||||
many common operating systems in a consistent manner. While this is
|
||||
incredibly useful for "write-once, run-anywhere" scripting, it also
|
||||
makes monitoring of software written in Python difficult. Because
|
||||
Python uses native system APIs directly, existing monitoring
|
||||
tools either suffer from limited context or auditing bypass.
|
||||
many common operating systems. While this is incredibly useful for
|
||||
"write-once, run-anywhere" scripting, it also makes monitoring of
|
||||
software written in Python difficult. Because Python uses native system
|
||||
APIs directly, existing monitoring tools either suffer from limited
|
||||
context or auditing bypass.
|
||||
|
||||
Limited context occurs when system monitoring can report that an
|
||||
action occurred, but cannot explain the sequence of events leading to
|
||||
it. For example, network monitoring at the OS level may be able to
|
||||
report "listening started on port 5678", but may not be able to
|
||||
provide the process ID, command line or parent process, or the local
|
||||
provide the process ID, command line, parent process, or the local
|
||||
state in the program at the point that triggered the action. Firewall
|
||||
controls to prevent such an action are similarly limited, typically
|
||||
to a process name or some global state such as the current user, and
|
||||
to process names or some global state such as the current user, and
|
||||
in any case rarely provide a useful log file correlated with other
|
||||
application messages.
|
||||
|
||||
|
@ -73,6 +73,10 @@ same name as the module they intend to use - for example, a
|
|||
``random.py`` file that attempts to import the standard library
|
||||
``random`` module.
|
||||
|
||||
This is not sandboxing, as this proposal does not attempt to prevent
|
||||
malicious behavior (though it enables some new options to do so).
|
||||
See the `Why Not A Sandbox`_ section below for further discussion.
|
||||
|
||||
Overview of Changes
|
||||
===================
|
||||
|
||||
|
@ -84,12 +88,14 @@ We propose two API changes to enable this: an Audit Hook and Verified
|
|||
Open Hook. Both are available from Python and native code, allowing
|
||||
applications and frameworks written in pure Python code to take
|
||||
advantage of the extra messages, while also allowing embedders or
|
||||
system administrators to deploy "always-on" builds of Python.
|
||||
system administrators to deploy builds of Python where auditing is
|
||||
always enabled.
|
||||
|
||||
Only CPython is bound to provide the native APIs as described here.
|
||||
Other implementations should provide the pure Python APIs, and
|
||||
may provide native versions as appropriate for their underlying
|
||||
runtimes.
|
||||
runtimes. Auditing events are likewise considered implementation
|
||||
specific, but are bound by normal feature compatibility guarantees.
|
||||
|
||||
Audit Hook
|
||||
----------
|
||||
|
@ -132,9 +138,9 @@ When events of interest are occurring, code can either call
|
|||
``PySys_Audit()`` from C (while the GIL is held) or ``sys.audit()``. The
|
||||
string argument is the name of the event, and the tuple contains
|
||||
arguments. A given event name should have a fixed schema for arguments,
|
||||
which should be considered a public API (for a given x.y version
|
||||
release), and thus should only change between feature releases with
|
||||
updated documentation.
|
||||
which should be considered a public API (for each x.y version release),
|
||||
and thus should only change between feature releases with updated
|
||||
documentation.
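
As a minimal sketch of the fixed-schema idea, assuming an interpreter that provides ``sys.audit()`` and ``sys.addaudithook()`` (CPython 3.8 or later), the event name ``example.copy_file`` and its ``(src, dst)`` schema below are hypothetical and exist only for illustration:

```python
import sys

received = []

def hook(event, args):
    # Hooks receive every event, so filter on the name we care about.
    if event == "example.copy_file":
        # Hypothetical fixed schema for this event: (src, dst).
        src, dst = args
        received.append((src, dst))

sys.addaudithook(hook)

def copy_file(src, dst):
    # Raise the audit event before performing the operation.
    sys.audit("example.copy_file", src, dst)
    # ... the real work would happen here ...

copy_file("a.txt", "b.txt")
```

Because the hook unpacks ``args`` positionally, changing the number or order of arguments would break it, which is why the schema is treated as public API.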
|
||||
|
||||
For maximum compatibility, events using the same name as an event in
|
||||
the reference interpreter CPython should make every attempt to use
|
||||
|
@ -152,7 +158,7 @@ log the event, abort the operation with an exception, or to immediately
|
|||
terminate the process with an operating system exit call.
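
The "abort the operation with an exception" option can be sketched as follows, assuming CPython 3.8 or later. The event name ``example.blocked`` and the function ``guarded_operation`` are hypothetical names chosen for this illustration:

```python
import sys

def deny_hook(event, args):
    # Abort only the hypothetical event; let everything else through.
    if event == "example.blocked":
        raise PermissionError(f"operation denied by audit hook: {args!r}")

sys.addaudithook(deny_hook)

def guarded_operation():
    # The exception raised by the hook propagates out of sys.audit(),
    # so the code after this call never runs.
    sys.audit("example.blocked", "resource")
    return "performed"

try:
    outcome = guarded_operation()
except PermissionError:
    outcome = "aborted"
```

Note that hooks cannot be removed once added, so a hook like this one stays active for the rest of the process.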
|
||||
|
||||
When an event is audited but no hooks have been set, the ``audit()``
|
||||
function should include minimal overhead. Ideally, each argument is a
|
||||
function should impose minimal overhead. Ideally, each argument is a
|
||||
reference to existing data rather than a value calculated just for the
|
||||
auditing call.
|
||||
|
||||
|
@ -160,15 +166,14 @@ As hooks may be Python objects, they need to be freed during
|
|||
``Py_Finalize()``. To do this, we add an internal API
|
||||
``_Py_ClearAuditHooks()`` that releases any Python hooks and any
|
||||
memory held. This is an internal function with no public export, and
|
||||
we recommend it should raise its own audit event for all current hooks
|
||||
to ensure that unexpected calls are observed.
|
||||
we recommend it raise its own audit event for all current hooks to
|
||||
ensure that unexpected calls are observed.
|
||||
|
||||
Below in `Suggested Audit Hook Locations`_, we recommend some important
|
||||
operations that should raise audit events. In PEP 551, more audited
|
||||
operations are recommended with a view to security transparency.
|
||||
operations that should raise audit events.
|
||||
|
||||
Python implementations should document which operations will raise
|
||||
audit events, along with the event schema. It is intended that
|
||||
audit events, along with the event schema. It is intentional that
|
||||
``sys.addaudithook(print)`` be a trivial way to display all messages.
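
A slightly more selective variant of that idea, sketched under the assumption of CPython 3.8 or later, collects event names in a list instead of printing them. The ``recording`` flag is an illustrative device, needed here because hooks cannot be removed once added:

```python
import os
import sys
import tempfile

events = []
recording = True

def record(event, args):
    # Collect event names only while recording is switched on.
    if recording:
        events.append(event)

sys.addaudithook(record)

# Perform an audited operation: create and open a temporary file.
fd, path = tempfile.mkstemp()
os.close(fd)
with open(path, "rb"):
    pass
os.remove(path)

recording = False
```

On CPython the collected names include ``open``. The complete set depends on the interpreter version and implementation.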
|
||||
|
||||
Verified Open Hook
|
||||
|
@ -176,11 +181,12 @@ Verified Open Hook
|
|||
|
||||
Most operating systems have a mechanism to distinguish between files
|
||||
that can be executed and those that cannot. For example, this may be an
|
||||
execute bit in the permissions field, or a verified hash of the file
|
||||
contents to detect potential code tampering. These are an important
|
||||
security mechanism for preventing execution of data or code that is not
|
||||
approved for a given environment. Currently, Python has no way to
|
||||
integrate with these when launching scripts or importing modules.
|
||||
execute bit in the permissions field, a verified hash of the file
|
||||
contents to detect potential code tampering, or file system path
|
||||
restrictions. These are an important security mechanism for preventing
|
||||
execution of data or code that is not approved for a given environment.
|
||||
Currently, Python has no way to integrate with these when launching
|
||||
scripts or importing modules.
|
||||
|
||||
The new public C API for the verified open hook is::
|
||||
|
||||
|
@ -201,6 +207,7 @@ The ``importlib.util.open_for_import()`` function is a drop-in
|
|||
replacement for ``open(str(pathlike), 'rb')``. Its default behaviour is
|
||||
to open a file for raw, binary access. To change the behaviour a new
|
||||
handler should be set. Handler functions only accept ``str`` arguments.
|
||||
The C API ``PyImport_OpenForImport`` function assumes UTF-8 encoding.
|
||||
|
||||
A custom handler may be set by calling ``PyImport_SetOpenForImportHook()``
|
||||
from C at any time, including before ``Py_Initialize()``. However, if a
|
||||
|
@ -209,9 +216,7 @@ hook has already been set then the call will fail. When
|
|||
the path and its return value will be returned directly. The returned
|
||||
object should be an open file-like object that supports reading raw
|
||||
bytes. This is explicitly intended to allow a ``BytesIO`` instance if
|
||||
the open handler has already had to read the file into memory in order
|
||||
to perform whatever verification is necessary to determine whether the
|
||||
content is permitted to be executed.
|
||||
the open handler has already read the entire file into memory.
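
The following sketch shows what such a handler might look like. It checks a file against a hypothetical allow-list of SHA-256 digests (``APPROVED_DIGESTS``) and returns a ``BytesIO`` because the content has already been read for verification. Registering the handler is deliberately omitted, and the name ``verified_open`` is an assumption made for illustration:

```python
import hashlib
import io

# Hypothetical allow-list of SHA-256 hex digests for approved files.
APPROVED_DIGESTS = set()

def verified_open(path):
    """Return a readable binary file object if the file is approved."""
    # Read the whole file once so it can be hashed.
    with open(path, "rb") as f:
        data = f.read()
    digest = hashlib.sha256(data).hexdigest()
    if digest not in APPROVED_DIGESTS:
        raise PermissionError(f"{path!r} is not approved for execution")
    # Return the verified bytes rather than reopening the file, so the
    # caller executes exactly the content that was checked.
    return io.BytesIO(data)
```

Returning the in-memory copy also avoids the window in which the file could be modified between verification and use.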
|
||||
|
||||
Note that these hooks can import and call the ``_io.open()`` function on
|
||||
CPython without triggering themselves. They can also use ``_io.BytesIO``
|
||||
|
@ -301,6 +306,11 @@ see which operations provide audit events.
|
|||
file. All arguments other than the module name may be ``None`` if
|
||||
they are not used or available.
|
||||
"
|
||||
"``open``", ``open``, "``(path, mode, flags)``", "Detect when a file
|
||||
is about to be opened. *path* and *mode* are the usual parameters to
|
||||
``open`` if available, while *flags* is provided instead of *mode*
|
||||
in some cases.
|
||||
"
|
||||
``PyEval_SetProfile``, ``sys.setprofile``, "", "Detect when code is
|
||||
injecting trace functions. Because of the implementation, exceptions
|
||||
raised from the hook will abort the operation, but will not be
|
||||
|
@ -387,10 +397,9 @@ Performance Impact
|
|||
|
||||
The important performance impact is the case where events are being
|
||||
raised but there are no hooks attached. This is the unavoidable case -
|
||||
once a distributor begins adding audit hooks they have explicitly
|
||||
chosen to trade performance for functionality. Performance import
|
||||
with hooks added are not of interest here, since this is considered
|
||||
opt-in functionality.
|
||||
once a developer has added audit hooks they have explicitly chosen to
|
||||
trade performance for functionality. Performance impact with hooks added
|
||||
is not of interest here, since this is opt-in functionality.
|
||||
|
||||
Analysis using the Python Performance Benchmark Suite [1]_ shows no
|
||||
significant impact, with the vast majority of benchmarks showing
|
||||
|
@ -415,13 +424,13 @@ always be present. The nature of these hooks is that they must be
|
|||
callable without condition, as any conditional imports or calls provide
|
||||
opportunities to intercept and suppress or modify events.
|
||||
|
||||
Given its nature as one of the most core modules, the ``sys`` module is
|
||||
somewhat protected against module shadowing attacks. Replacing ``sys``
|
||||
with a sufficiently functional module that the application can still run
|
||||
is a much more complicated task than replacing a module with only one
|
||||
Given it is one of the most core modules, the ``sys`` module is somewhat
|
||||
protected against module shadowing attacks. Replacing ``sys`` with a
|
||||
sufficiently functional module that the application can still run is a
|
||||
much more complicated task than replacing a module with only one
|
||||
function of interest. An attacker that has the ability to shadow the
|
||||
``sys`` module is already capable of running arbitrary code from files,
|
||||
whereas an ``audit`` module can be replaced with a single line in a
|
||||
whereas an ``audit`` module could be replaced with a single line in a
|
||||
``.pth`` file anywhere on the search path::
|
||||
|
||||
import sys; sys.modules['audit'] = type('audit', (object,),
|
||||
|
@ -431,8 +440,8 @@ Multiple layers of protection already exist for monkey patching attacks
|
|||
against either ``sys`` or ``audit``, but assignments or insertions to
|
||||
``sys.modules`` are not audited.
|
||||
|
||||
This idea is rejected because it makes substituting ``audit`` calls
|
||||
throughout all callers trivial.
|
||||
This idea is rejected because it makes it trivial to suppress all calls
|
||||
to ``audit``.
|
||||
|
||||
Flag in sys.flags to indicate "audited" mode
|
||||
--------------------------------------------
|
||||
|
@ -465,6 +474,34 @@ This idea is rejected because there are no appropriate reasons for an
|
|||
application to change its behaviour based on whether these APIs are in
|
||||
use.
|
||||
|
||||
Why Not A Sandbox
|
||||
=================
|
||||
|
||||
Sandboxing CPython has been attempted many times in the past, and each
|
||||
past attempt has failed. Fundamentally, the problem is that certain
|
||||
functionality has to be restricted when executing the sandboxed code,
|
||||
but otherwise needs to be available for normal operation of Python. For
|
||||
example, completely removing the ability to compile strings into
|
||||
bytecode also breaks the ability to import modules from source code, and
|
||||
if it is not completely removed then there are too many ways to get
|
||||
access to that functionality indirectly. There is not yet any feasible
|
||||
way to generically determine whether a given operation is "safe" or not.
|
||||
Further information and references are available at [2]_.
|
||||
|
||||
This proposal does not attempt to restrict functionality, but simply
|
||||
exposes the fact that the functionality is being used. Particularly for
|
||||
intrusion scenarios, detection is significantly more important than
|
||||
early prevention (as early prevention will generally drive attackers to
|
||||
use an alternate, less-detectable, approach). The availability of audit
|
||||
hooks alone does not change the attack surface of Python in any way, but
|
||||
they enable defenders to integrate Python into their environment in ways
|
||||
that are currently not possible.
|
||||
|
||||
Since audit hooks have the ability to safely prevent an operation
|
||||
from occurring, this feature does make it possible to provide some level of
|
||||
sandboxing. In most cases, however, the intention is to enable logging
|
||||
rather than creating a sandbox.
|
||||
|
||||
Relationship to PEP 551
|
||||
=======================
|
||||
|
||||
|
@ -483,10 +520,12 @@ References
|
|||
|
||||
.. [1] Python Performance Benchmark Suite `<https://github.com/python/performance>`_
|
||||
|
||||
.. [2] Python Security model - Sandbox `<https://python-security.readthedocs.io/security.html#sandbox>`_
|
||||
|
||||
Copyright
|
||||
=========
|
||||
|
||||
Copyright (c) 2018 by Microsoft Corporation. This material may be
|
||||
Copyright (c) 2019 by Microsoft Corporation. This material may be
|
||||
distributed only subject to the terms and conditions set forth in the
|
||||
Open Publication License, v1.0 or later (the latest version is presently
|
||||
available at http://www.opencontent.org/openpub/).
|
||||
|
|