Splits PEP 551 into PEP 578 (#674)
parent
9900d8d696
commit
87fb9ab25a
601
pep-0551.rst
|
@@ -4,28 +4,37 @@ Version: $Revision$
|
|||
Last-Modified: $Date$
|
||||
Author: Steve Dower <steve.dower@python.org>
|
||||
Status: Draft
|
||||
Type: Standards Track
|
||||
Type: Informational
|
||||
Content-Type: text/x-rst
|
||||
Created: 23-Aug-2017
|
||||
Python-Version: 3.7
|
||||
Post-History: 24-Aug-2017 (security-sig), 28-Aug-2017 (python-dev)
|
||||
|
||||
Status
|
||||
======
|
||||
|
||||
This PEP is currently in the process of being split into two.
|
||||
|
||||
See PEP 578 for the new auditing APIs proposed for addition to the next
|
||||
version of Python.
|
||||
|
||||
PEP 551 is now a draft informational PEP, providing guidance to those
|
||||
planning to integrate Python into their secure or audited environments.
|
||||
|
||||
Abstract
|
||||
========
|
||||
|
||||
This PEP describes additions to the Python API and specific behaviors
|
||||
for the CPython implementation that make actions taken by the Python
|
||||
runtime visible to security and auditing tools. The goals in order of
|
||||
increasing importance are to prevent malicious use of Python, to detect
|
||||
and report on malicious use, and most importantly to detect attempts to
|
||||
bypass detection. Most of the responsibility for implementation is
|
||||
required from users, who must customize and build Python for their own
|
||||
environment.
|
||||
This PEP describes the concept of security transparency and how it
|
||||
applies to the Python runtime. Visibility into actions taken by the
|
||||
runtime is invaluable in integrating Python into an otherwise secure
|
||||
and/or monitored environment.
|
||||
|
||||
We propose two small sets of public APIs to enable users to reliably
|
||||
build their copy of Python without having to modify the core runtime,
|
||||
protecting future maintainability. We also discuss recommendations for
|
||||
users to help them develop and configure their copy of Python.
|
||||
The audit hooks described in PEP 578 are an essential component in
|
||||
detecting, identifying and analyzing misuse of Python. While the hooks
|
||||
themselves are neutral (in that not every reported event is inherently
|
||||
misuse), they provide essential context to those who are responsible
|
||||
for monitoring an overall system or network. With enough transparency,
|
||||
attackers are no longer able to hide.
|
||||
|
||||
Background
|
||||
==========
|
||||
|
@@ -126,14 +135,14 @@ tools, most network access and DNS resolution, and attempts to create
|
|||
and hide files or configuration settings on the local machine.
|
||||
|
||||
To summarize, defenders have a need to audit specific uses of Python in
|
||||
order to detect abnormal or malicious usage. Currently, the Python
|
||||
runtime does not provide any ability to do this, which (anecdotally) has
|
||||
led to organizations switching to other languages. The aim of this PEP
|
||||
is to enable system administrators to deploy a security transparent copy
|
||||
of Python that can integrate with their existing auditing and protection
|
||||
systems.
|
||||
order to detect abnormal or malicious usage. With PEP 578, the Python
|
||||
runtime gains the ability to provide this. The aim of this PEP is to
|
||||
assist system administrators with deploying a security transparent
|
||||
version of Python that can integrate with their existing auditing and
|
||||
protection systems.
|
||||
|
||||
On Windows, some specific features that may be enabled by this include:
|
||||
On Windows, some specific features that may be integrated through the
|
||||
hooks added by PEP 578 include:
|
||||
|
||||
* Script Block Logging [3]_
|
||||
* DeviceGuard [4]_
|
||||
|
@@ -151,7 +160,7 @@ On Linux, some specific features that may be integrated are:
|
|||
* SELinux labels [13]_
|
||||
* check execute bit on imported modules
|
||||
|
||||
On macOS, some features that may be used with the expanded APIs are:
|
||||
On macOS, some features that may be integrated are:
|
||||
|
||||
* OpenBSM [10]_
|
||||
* syslog [11]_
|
||||
|
@@ -161,9 +170,6 @@ production machines is highly appealing to system administrators and
|
|||
will make Python a more trustworthy dependency for application
|
||||
developers.
|
||||
|
||||
Overview of Changes
|
||||
===================
|
||||
|
||||
True security transparency is not fully achievable by Python in
|
||||
isolation. The runtime can audit as many events as it likes, but unless
|
||||
the logs are reviewed and analyzed there is no value. Python may impose
|
||||
|
@@ -173,340 +179,64 @@ implementations of certain security features, and organizations with the
|
|||
resources to fully customize their runtime should be encouraged to do
|
||||
so.
|
||||
|
||||
The aim of these changes is to enable system administrators to integrate
|
||||
Python into their existing security systems, without dictating what
|
||||
those systems look like or how they should behave. We propose two API
|
||||
changes to enable this: an Audit Hook and a Verified Open Hook. Neither is
set by default, and both require modifications to the entry point
|
||||
binary to enable any functionality. For the purposes of validation and
|
||||
example, we propose a new ``spython``/``spython.exe`` entry point
|
||||
program that enables some basic functionality using these hooks.
|
||||
**However, security-conscious organizations are expected to create their
|
||||
own entry points to meet their own needs.**
|
||||
Summary Recommendations
|
||||
=======================
|
||||
|
||||
Audit Hook
|
||||
----------
|
||||
These are discussed in greater detail in later sections, but are
|
||||
presented here to frame the overall discussion.
|
||||
|
||||
In order to achieve security transparency, an API is required to raise
|
||||
messages from within certain operations. These operations are typically
|
||||
deep within the Python runtime or standard library, such as dynamic code
|
||||
compilation, module imports, DNS resolution, or use of certain modules
|
||||
such as ``ctypes``.
|
||||
Sysadmins should provide and use an alternate entry point (besides
|
||||
``python.exe`` or ``pythonX.Y``) in order to reduce surface area and
|
||||
securely enable audit hooks. A discussion of what could be restricted
|
||||
is below in `Restricting the Entry Point`_.
|
||||
|
||||
The new C APIs required for audit hooks are::
|
||||
Sysadmins should use all available measures provided by their operating
|
||||
system to prevent modifications to their Python installation, such as
|
||||
file permissions, access control lists and signature validation.
|
||||
|
||||
# Add an auditing hook
|
||||
typedef int (*hook_func)(const char *event, PyObject *args,
|
||||
void *userData);
|
||||
int PySys_AddAuditHook(hook_func hook, void *userData);
|
||||
Sysadmins should log everything and collect logs to a central location
|
||||
as quickly as possible - avoid keeping logs on outer-ring machines.
|
||||
|
||||
# Raise an event with all auditing hooks
|
||||
int PySys_Audit(const char *event, PyObject *args);
|
||||
|
||||
# Internal API used during Py_Finalize() - not publicly accessible
|
||||
void _Py_ClearAuditHooks(void);
|
||||
|
||||
The new Python APIs for audit hooks are::
|
||||
|
||||
# Add an auditing hook
|
||||
sys.addaudithook(hook: Callable[[str, tuple], None]) -> None
|
||||
|
||||
# Raise an event with all auditing hooks
|
||||
sys.audit(str, *args) -> None
|
||||
Sysadmins should prioritize _detection_ of misuse over _prevention_ of
|
||||
misuse.
|
||||
|
||||
|
||||
Hooks are added by calling ``PySys_AddAuditHook()`` from C at any time,
|
||||
including before ``Py_Initialize()``, or by calling
|
||||
``sys.addaudithook()`` from Python code. Hooks are never removed or
|
||||
replaced, and existing hooks have an opportunity to refuse to allow new
|
||||
hooks to be added (adding an audit hook is audited, and so preexisting
|
||||
hooks can raise an exception to block the new addition).
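
As a rough illustration of the Python-level API described above, the following sketch registers one hook and shows it refusing a later registration. It assumes the behaviour of CPython 3.8 and later, where these functions eventually shipped and where a ``RuntimeError`` raised for the ``sys.addaudithook`` event is suppressed and the new hook is simply not added. The hook names and the ``example.event`` event name are hypothetical.

```python
import sys

seen = []        # events recorded by the first hook
late_calls = []  # would be filled by the second hook, if it were ever added


def first_hook(event, args):
    """Record example events and refuse any later hook registration."""
    if event == "sys.addaudithook":
        # Assumption: CPython suppresses RuntimeError raised for this
        # event and does not add the new hook.
        raise RuntimeError("no further audit hooks are permitted")
    if event.startswith("example."):
        seen.append((event, args))


def late_hook(event, args):
    """A hook that arrives too late and should never be called."""
    late_calls.append(event)


sys.addaudithook(first_hook)  # hooks cannot be removed once added
sys.addaudithook(late_hook)   # audited, and blocked by first_hook

# Raise a custom event: a name plus positional arguments, which the
# hook receives as a tuple.
sys.audit("example.event", "payload", 42)

print(seen)
print(late_calls)
```

Because the registration itself is audited, callers cannot assume that their hook was added unless they control every hook that came before it.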
|
||||
Restricting the Entry Point
|
||||
===========================
|
||||
|
||||
When events of interest are occurring, code can either call
|
||||
``PySys_Audit()`` from C (while the GIL is held) or ``sys.audit()``. The
|
||||
string argument is the name of the event, and the tuple contains
|
||||
arguments. A given event name should have a fixed schema for arguments,
|
||||
and both arguments are considered a public API (for a given x.y version
|
||||
of Python), and thus should only change between feature releases with
|
||||
updated documentation.
|
||||
One of the primary vulnerabilities exposed by the presence of Python
|
||||
on a machine is the ability to execute arbitrary code without
|
||||
detection or verification by the system. This is made significantly
|
||||
easier because the default entry point (``python.exe`` on Windows and
|
||||
``pythonX.Y`` on other platforms) allows execution from the command
|
||||
line, from standard input, and does not have any hooks enabled by
|
||||
default.
|
||||
|
||||
When an event is audited, each hook is called in the order it was added
|
||||
with the event name and tuple. If any hook returns with an exception
|
||||
set, later hooks are ignored and *in general* the Python runtime should
|
||||
terminate. This is intentional to allow hook implementations to decide
|
||||
how to respond to any particular event. The typical responses will be to
|
||||
log the event, abort the operation with an exception, or to immediately
|
||||
terminate the process with an operating system exit call.
|
||||
Our recommendation is that production machines should use a modified
|
||||
entry point instead of the default. Once outside of the development
|
||||
environment, there is rarely a need for the flexibility offered by the
|
||||
default entry point.
|
||||
|
||||
When an event is audited but no hooks have been set, the ``audit()``
|
||||
function should include minimal overhead. Ideally, each argument is a
|
||||
reference to existing data rather than a value calculated just for the
|
||||
auditing call.
|
||||
In this section, we describe a hypothetical ``spython`` entry point
|
||||
(``spython.exe`` on Windows; ``spythonX.Y`` on other platforms) that
|
||||
provides a level of security transparency recommended for production
|
||||
machines. An associated example implementation shows many of the
|
||||
features described here, though with a number of concessions for the
|
||||
sake of avoiding platform-specific code. A sufficient implementation
|
||||
will inherently require some integration with platform-specific
|
||||
security features.
|
||||
|
||||
As hooks may be Python objects, they need to be freed during
|
||||
``Py_Finalize()``. To do this, we add an internal API
|
||||
``_Py_ClearAuditHooks()`` that releases any ``PyObject*`` hooks that are
|
||||
held, as well as any heap memory used. This is an internal function with
|
||||
no public export, but it triggers an event for all audit hooks to ensure
|
||||
that unexpected calls are logged.
|
||||
Official distributions will not include any ``spython`` by default, but
|
||||
third party distributions may include appropriately modified entry
|
||||
points that use the same name.
|
||||
|
||||
See `Audit Hook Locations`_ for proposed audit hook points and schemas,
|
||||
and the `Recommendations`_ section for discussion on
|
||||
appropriate responses.
|
||||
|
||||
Verified Open Hook
|
||||
------------------
|
||||
|
||||
Most operating systems have a mechanism to distinguish between files
|
||||
that can be executed and those that cannot. For example, this may be an
|
||||
execute bit in the permissions field, or a verified hash of the file
|
||||
contents to detect potential code tampering. These are an important
|
||||
security mechanism for preventing execution of data or code that is not
|
||||
approved for a given environment. Currently, Python has no way to
|
||||
integrate with these when launching scripts or importing modules.
|
||||
|
||||
The new public C API for the verified open hook is::
|
||||
|
||||
# Set the handler
|
||||
typedef PyObject *(*hook_func)(PyObject *path)
|
||||
int PyImport_SetOpenForImportHook(void *handler)
|
||||
|
||||
# Open a file using the handler
|
||||
PyObject *PyImport_OpenForImport(const char *path)
|
||||
|
||||
The new public Python API for the verified open hook is::
|
||||
|
||||
# Open a file using the handler
|
||||
_imp.open_for_import(path)
|
||||
|
||||
The ``_imp.open_for_import()`` function is a drop-in replacement for
|
||||
``open(str(pathlike), 'rb')``. Its default behaviour is to open a file
|
||||
for raw, binary access - any more restrictive behaviour requires the
|
||||
use of a custom handler. Only ``str`` arguments are accepted.
|
||||
|
||||
A custom handler may be set by calling ``PyImport_SetOpenForImportHook()``
|
||||
from C at any time, including before ``Py_Initialize()``. However, if a
|
||||
hook has already been set then the call will fail. When
|
||||
``open_for_import()`` is called with a hook set, the hook will be passed
|
||||
the path and its return value will be returned directly. The returned
|
||||
object should be an open file-like object that supports reading raw
|
||||
bytes. This is explicitly intended to allow a ``BytesIO`` instance if
|
||||
the open handler has already had to read the file into memory in order
|
||||
to perform whatever verification is necessary to determine whether the
|
||||
content is permitted to be executed.
|
||||
|
||||
Note that these hooks can import and call the ``_io.open()`` function on
|
||||
CPython without triggering themselves.
|
||||
|
||||
If the hook determines that the file is not suitable for execution, it
|
||||
should raise an exception of its choice, as well as raising any other
|
||||
auditing events or notifications.
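
The handler contract described above (accept a ``str`` path, verify the content, return a readable binary object or raise) can be sketched in pure Python. This is only an approximation under stated assumptions: the real hook is installed from C, and ``verified_open`` and ``APPROVED_DIGESTS`` are hypothetical names standing in for whatever verification an organization actually uses.

```python
import hashlib
import io
import os
import tempfile

# Hypothetical allow-list of SHA-256 digests for approved source files.
APPROVED_DIGESTS = set()


def verified_open(path):
    """Return a binary file-like object only if the content is approved."""
    if not isinstance(path, str):
        raise TypeError("path must be a str")
    with open(path, "rb") as f:
        data = f.read()  # read once so the checked bytes are the bytes used
    digest = hashlib.sha256(data).hexdigest()
    if digest not in APPROVED_DIGESTS:
        raise PermissionError(f"{path!r} is not approved for import")
    # Hand back the verified bytes rather than reopening the file.
    return io.BytesIO(data)


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "approved.py")
    with open(path, "wb") as f:
        f.write(b"print('hello')\n")
    APPROVED_DIGESTS.add(hashlib.sha256(b"print('hello')\n").hexdigest())
    first_read = verified_open(path).read()

    with open(path, "ab") as f:
        f.write(b"import os\n")  # simulate tampering after approval
    try:
        verified_open(path)
        tamper_detected = False
    except PermissionError:
        tamper_detected = True
```

Returning a ``BytesIO`` over the bytes that were hashed avoids a gap between the check and the later use of the file.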
|
||||
|
||||
All import and execution functionality involving code from a file will
|
||||
be changed to use ``open_for_import()`` unconditionally. It is important
|
||||
to note that calls to ``compile()``, ``exec()`` and ``eval()`` do not go
|
||||
through this function - an audit hook that includes the code from these
|
||||
calls will be added and is the best opportunity to validate code that is
|
||||
read from the file. Given the current decoupling between import and
|
||||
execution in Python, most imported code will go through both
|
||||
``open_for_import()`` and the log hook for ``compile``, and so care
|
||||
should be taken to avoid repeating verification steps.
|
||||
|
||||
.. note::
|
||||
The use of ``open_for_import()`` by ``importlib`` is a valuable
|
||||
first defence, but should not be relied upon to prevent misuse. In
|
||||
particular, it is easy to monkeypatch ``importlib`` in order to
|
||||
bypass the call. Auditing hooks are the primary way to achieve
|
||||
security transparency, and are essential for detecting attempts to
|
||||
bypass other functionality.
|
||||
|
||||
API Availability
|
||||
----------------
|
||||
|
||||
While all the functions added here are considered public and stable API,
|
||||
the behavior of the functions is implementation specific. The
|
||||
descriptions here refer to the CPython implementation, and while other
|
||||
implementations should provide the functions, there is no requirement
|
||||
that they behave the same.
|
||||
|
||||
For example, ``sys.addaudithook()`` and ``sys.audit()`` should exist but
|
||||
may do nothing. This allows code to make calls to ``sys.audit()``
|
||||
without having to test for existence, but it should not assume that its
|
||||
call will have any effect. (Including existence tests in
|
||||
security-critical code allows another vector to bypass auditing, so it
|
||||
is preferable that the function always exist.)
|
||||
|
||||
``_imp.open_for_import(path)`` should at a minimum always return
|
||||
``_io.open(path, 'rb')``. Code using the function should make no further
|
||||
assumptions about what may occur, and implementations other than CPython
|
||||
are not required to let developers override the behavior of this
|
||||
function with a hook.
|
||||
|
||||
Audit Hook Locations
|
||||
====================
|
||||
|
||||
Calls to ``sys.audit()`` or ``PySys_Audit()`` will be added to the
|
||||
following operations with the schema in Table 1. Unless otherwise
|
||||
specified, the ability for audit hooks to abort any listed operation
|
||||
should be considered part of the rationale for including the hook.
|
||||
|
||||
.. csv-table:: Table 1: Audit Hooks
|
||||
:header: "API Function", "Event Name", "Arguments", "Rationale"
|
||||
:widths: 2, 2, 3, 6
|
||||
|
||||
``PySys_AddAuditHook``, ``sys.addaudithook``, "", "Detect when new
|
||||
audit hooks are being added.
|
||||
"
|
||||
``_PySys_ClearAuditHooks``, ``sys._clearaudithooks``, "", "Notifies
|
||||
hooks they are being cleaned up, mainly in case the event is
|
||||
triggered unexpectedly. This event cannot be aborted.
|
||||
"
|
||||
``PyImport_SetOpenForImportHook``, ``setopenforimporthook``, "", "
|
||||
Detects any attempt to set the ``open_for_import`` hook.
|
||||
"
|
||||
"``compile``, ``exec``, ``eval``, ``PyAst_CompileString``,
|
||||
``PyAST_obj2mod``", ``compile``, "``(code, filename_or_none)``", "
|
||||
Detect dynamic code compilation, where ``code`` could be a string or
|
||||
AST. Note that this will be called for regular imports of source
|
||||
code, including those that were opened with ``open_for_import``.
|
||||
"
|
||||
"``exec``, ``eval``, ``run_mod``", ``exec``, "``(code_object,)``", "
|
||||
Detect dynamic execution of code objects. This only occurs for
|
||||
explicit calls, and is not raised for normal function invocation.
|
||||
"
|
||||
``import``, ``import``, "``(module, filename, sys.path,
|
||||
sys.meta_path, sys.path_hooks)``", "Detect when modules are
|
||||
imported. This is raised before the module name is resolved to a
|
||||
file. All arguments other than the module name may be ``None`` if
|
||||
they are not used or available.
|
||||
"
|
||||
``code_new``, ``code.__new__``, "``(bytecode, filename, name)``", "
|
||||
Detect dynamic creation of code objects. This only occurs for
|
||||
direct instantiation, and is not raised for normal compilation.
|
||||
"
|
||||
``func_new_impl``, ``function.__new__``, "``(code,)``", "Detect
|
||||
dynamic creation of function objects. This only occurs for direct
|
||||
instantiation, and is not raised for normal compilation.
|
||||
"
|
||||
"``_ctypes.dlopen``, ``_ctypes.LoadLibrary``", ``ctypes.dlopen``, "
|
||||
``(module_or_path,)``", "Detect when native modules are used.
|
||||
"
|
||||
``_ctypes._FuncPtr``, ``ctypes.dlsym``, "``(lib_object, name)``", "
|
||||
Collect information about specific symbols retrieved from native
|
||||
modules.
|
||||
"
|
||||
``_ctypes._CData``, ``ctypes.cdata``, "``(ptr_as_int,)``", "Detect
|
||||
when code is accessing arbitrary memory using ``ctypes``.
|
||||
"
|
||||
``id``, ``id``, "``(id_as_int,)``", "Detect when code is accessing
|
||||
the id of objects, which in CPython reveals information about
|
||||
memory layout.
|
||||
"
|
||||
``sys._getframe``, ``sys._getframe``, "``(frame_object,)``", "Detect
|
||||
when code is accessing frames directly.
|
||||
"
|
||||
``sys._current_frames``, ``sys._current_frames``, "", "Detect when
|
||||
code is accessing frames directly.
|
||||
"
|
||||
``PyEval_SetProfile``, ``sys.setprofile``, "", "Detect when code is
|
||||
injecting trace functions. Because of the implementation, exceptions
|
||||
raised from the hook will abort the operation, but will not be
|
||||
raised in Python code. Note that ``threading.setprofile`` eventually
|
||||
calls this function, so the event will be audited for each thread.
|
||||
"
|
||||
``PyEval_SetTrace``, ``sys.settrace``, "", "Detect when code is
|
||||
injecting trace functions. Because of the implementation, exceptions
|
||||
raised from the hook will abort the operation, but will not be
|
||||
raised in Python code. Note that ``threading.settrace`` eventually
|
||||
calls this function, so the event will be audited for each thread.
|
||||
"
|
||||
``_PyEval_SetAsyncGenFirstiter``, ``sys.set_async_gen_firstiter``, "
|
||||
", "Detect changes to async generator hooks.
|
||||
"
|
||||
``_PyEval_SetAsyncGenFinalizer``, ``sys.set_async_gen_finalizer``, "
|
||||
", "Detect changes to async generator hooks.
|
||||
"
|
||||
``_PyEval_SetCoroutineWrapper``, ``sys.set_coroutine_wrapper``, "
|
||||
", "Detect changes to the coroutine wrapper.
|
||||
"
|
||||
"``socket.bind``, ``socket.connect``, ``socket.connect_ex``,
|
||||
``socket.getaddrinfo``, ``socket.getnameinfo``, ``socket.sendmsg``,
|
||||
``socket.sendto``", ``socket.address``, "``(address,)``", "Detect
|
||||
access to network resources. The address is unmodified from the
|
||||
original call.
|
||||
"
|
||||
``socket.__init__``, "socket()", "``(family, type, proto)``", "
|
||||
Detect creation of sockets. The arguments will be int values.
|
||||
"
|
||||
``socket.gethostname``, ``socket.gethostname``, "", "Detect attempts
|
||||
to retrieve the current host name.
|
||||
"
|
||||
``socket.sethostname``, ``socket.sethostname``, "``(name,)``", "
|
||||
Detect attempts to change the current host name. The name argument
|
||||
is passed as a bytes object.
|
||||
"
|
||||
"``socket.gethostbyname``, ``socket.gethostbyname_ex``",
|
||||
"``socket.gethostbyname``", "``(name,)``", "Detect host name
|
||||
resolution. The name argument is a str or bytes object.
|
||||
"
|
||||
``socket.gethostbyaddr``, ``socket.gethostbyaddr``, "
|
||||
``(address,)``", "Detect host resolution. The address argument is a
|
||||
str or bytes object.
|
||||
"
|
||||
``socket.getservbyname``, ``socket.getservbyname``, "``(name,
|
||||
protocol)``", "Detect service resolution. The arguments are str
|
||||
objects.
|
||||
"
|
||||
"``socket.getservbyport``", ``socket.getservbyport``, "``(port,
|
||||
protocol)``", "Detect service resolution. The port argument is an
|
||||
int and protocol is a str.
|
||||
"
|
||||
"``member_get``, ``func_get_code``, ``func_get_[kw]defaults``
|
||||
",``object.__getattr__``,"``(object, attr)``","Detect access to
|
||||
restricted attributes. This event is raised for any built-in
|
||||
members that are marked as restricted, and members that may allow
|
||||
bypassing imports.
|
||||
"
|
||||
"``_PyObject_GenericSetAttr``, ``check_set_special_type_attr``,
|
||||
``object_set_class``, ``func_set_code``, ``func_set_[kw]defaults``","
|
||||
``object.__setattr__``","``(object, attr, value)``","Detect monkey
|
||||
patching of types and objects. This event
|
||||
is raised for the ``__class__`` attribute and any attribute on
|
||||
``type`` objects.
|
||||
"
|
||||
"``_PyObject_GenericSetAttr``",``object.__delattr__``,"``(object,
|
||||
attr)``","Detect deletion of object attributes. This event is raised
|
||||
for any attribute on ``type`` objects.
|
||||
"
|
||||
"``Unpickler.find_class``",``pickle.find_class``,"``(module_name,
|
||||
global_name)``","Detect imports and global name lookup when
|
||||
unpickling.
|
||||
"
|
||||
"``array_new``",``array.__new__``,"``(typecode, initial_value)``", "
|
||||
Detects creation of array objects.
|
||||
"
|
||||
|
||||
TODO - more hooks in ``_socket``, ``_ssl``, others?
|
||||
|
||||
SPython Entry Point
|
||||
===================
|
||||
|
||||
A new entry point binary will be added, called ``spython.exe`` on
|
||||
Windows and ``spythonX.Y`` on other platforms. This entry point is
|
||||
intended primarily as an example, as we expect most users of this
|
||||
functionality to implement their own entry point and hooks (see
|
||||
`Recommendations`_). It will also be used for tests.
|
||||
|
||||
Source builds will build ``spython`` by default, but distributions
|
||||
should not include it except as a test binary. The python.org managed
|
||||
binary distributions will not include ``spython``.
|
||||
|
||||
**Do not accept most command-line arguments**
|
||||
**Remove most command-line arguments**
|
||||
|
||||
The ``spython`` entry point requires a script file be passed as the
|
||||
first argument, and does not allow any options. This prevents arbitrary
|
||||
code execution from in-memory data or non-script files (such as pickles,
|
||||
which can be executed using ``-m pickle <path>``).
|
||||
first argument, and does not allow any options to precede it. This
|
||||
prevents arbitrary code execution from in-memory data or non-script
|
||||
files (such as pickles, which could be executed using
|
||||
``-m pickle <path>``).
|
||||
|
||||
Options ``-B`` (do not write bytecode), ``-E`` (ignore environment
|
||||
variables) and ``-s`` (no user site) are assumed.
|
||||
|
@@ -517,38 +247,57 @@ will be used to initialize ``sys.path`` following the rules currently
|
|||
described `for Windows
|
||||
<https://docs.python.org/3/using/windows.html#finding-modules>`_.
|
||||
|
||||
When built with ``Py_DEBUG``, the ``spython`` entry point will allow a
|
||||
``-i`` option with no other arguments to enter into interactive mode,
|
||||
with audit messages being written to standard error rather than a file.
|
||||
This is intended for testing and debugging only.
|
||||
For the sake of demonstration, the example implementation of
|
||||
``spython`` also allows the ``-i`` option to start in interactive mode.
|
||||
This is not recommended for restricted entry points.
|
||||
|
||||
**Log security events to a file**
|
||||
**Log audited events**
|
||||
|
||||
Before initialization, ``spython`` will set an audit hook that writes
|
||||
events to a local file. By default, this file is the full path of the
|
||||
process with a ``.log`` suffix, but may be overridden with the
|
||||
``SPYTHONLOG`` environment variable (despite such overrides being
|
||||
explicitly discouraged in `Recommendations`_).
|
||||
Before initialization, ``spython`` sets an audit hook that writes all
|
||||
audited events to an OS-managed log file. On Windows, this is the Event
|
||||
Tracing functionality [7]_, and on other platforms events go to
|
||||
syslog [11]_. Logs are copied from the machine as frequently as possible
|
||||
to prevent loss of information should an attacker attempt to clear
|
||||
local logs or prevent legitimate access to the machine.
|
||||
|
||||
The audit hook will also abort all ``sys.addaudithook`` events,
|
||||
preventing any other hooks from being added.
|
||||
|
||||
The logging hook is written in native code and configured before the
|
||||
interpreter is initialized. This is the only opportunity to ensure that
|
||||
no Python code executes without auditing, and that Python code cannot
|
||||
prevent registration of the hook.
|
||||
|
||||
Our primary aim is to record all actions taken by all Python processes,
|
||||
so that detection may be performed offline against logged events.
|
||||
Having all events recorded also allows for deeper analysis and the use
|
||||
of machine learning algorithms. These are useful for detecting
|
||||
persistent attacks, where the attacker is intending to remain within
|
||||
the protected machines for some period of time, as well as for later
|
||||
analysis to determine the impact and exposure caused by a successful
|
||||
attack.
|
||||
|
||||
The example implementation of ``spython`` writes to a log file on the
|
||||
local machine, for the sake of demonstration. When started with ``-i``,
|
||||
the example implementation writes all audit events to standard error
|
||||
instead of the log file. The ``SPYTHONLOG`` environment variable can be
|
||||
used to specify the log file location.
|
||||
|
||||
**Restrict importable modules**
|
||||
|
||||
Also before initialization, ``spython`` will set an open-for-import
|
||||
hook that validates all files opened with ``os.open_for_import``. This
|
||||
implementation will require all files to have a ``.py`` suffix (thereby
|
||||
blocking the use of cached bytecode), and will raise a custom audit
|
||||
event ``spython.open_for_import`` containing ``(filename,
|
||||
True_if_allowed)``.
|
||||
Also before initialization, ``spython`` sets an open-for-import hook
|
||||
that validates all files opened with ``os.open_for_import``. This
|
||||
implementation requires all files to have a ``.py`` suffix (preventing
|
||||
the use of cached bytecode), and will raise a custom audit event
|
||||
``spython.open_for_import`` containing ``(filename, True_if_allowed)``.
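
A minimal Python sketch of the suffix check and custom audit event described here might look like the following. The real hook is native code set before initialization; ``restricted_open`` and ``record`` are hypothetical names, and ``sys.audit`` is assumed to behave as it does in CPython 3.8 and later.

```python
import io
import os
import sys
import tempfile

events = []  # stand-in for the real audit log


def record(event, args):
    """Collect only the custom open-for-import events."""
    if event == "spython.open_for_import":
        events.append(args)


sys.addaudithook(record)


def restricted_open(path):
    """Allow only ``.py`` files, auditing every decision."""
    allowed = path.endswith(".py")
    sys.audit("spython.open_for_import", path, allowed)
    if not allowed:
        raise OSError(f"refusing to import from {path!r}")
    with open(path, "rb") as f:
        return io.BytesIO(f.read())


with tempfile.TemporaryDirectory() as tmp:
    source = os.path.join(tmp, "mod.py")
    cached = os.path.join(tmp, "mod.pyc")
    for p in (source, cached):
        with open(p, "wb") as f:
            f.write(b"x = 1\n")
    ok = restricted_open(source).read()
    try:
        restricted_open(cached)
        rejected = False
    except OSError:
        rejected = True
```

The event is raised for both outcomes, so the log shows refused attempts as well as permitted ones.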
|
||||
|
||||
On Windows, the hook will also open the file with flags that prevent any
|
||||
other process from opening it with write access, which allows the hook
|
||||
to perform additional validation on the contents with confidence that it
|
||||
will not be modified between the check and use. Compilation will later
|
||||
trigger a ``compile`` event, so there is no need to read the contents
|
||||
now for AMSI, but other validation mechanisms such as DeviceGuard [4]_
|
||||
should be performed here.
|
||||
After opening the file, the entire contents are read into memory in a
|
||||
single buffer and the file is closed.
|
||||
|
||||
Compilation will later trigger a ``compile`` event, so there is no need
|
||||
to validate the contents now using mechanisms that also apply to
|
||||
dynamically generated code. However, if a whitelist of source files or
|
||||
file hashes is available, then this is the point
|
||||
|
||||
**Restrict globals in pickles**
|
||||
|
||||
|
@@ -556,35 +305,37 @@ The ``spython`` entry point will abort all ``pickle.find_class`` events
|
|||
that use the default implementation. Overrides will not raise audit
|
||||
events unless explicitly added, and so they will continue to be allowed.
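
One way to picture this restriction is with a Python-level hook, sketched below. It assumes the ``pickle.find_class`` event as implemented in CPython 3.8 and later, with ``(module_name, global_name)`` arguments; the hook name is hypothetical, and a real entry point would install the equivalent check from native code.

```python
import pickle
import sys


def block_pickle_globals(event, args):
    """Abort any global lookup made by the default unpickler."""
    if event == "pickle.find_class":
        module, name = args
        raise pickle.UnpicklingError(
            f"global {module}.{name} is not permitted"
        )


sys.addaudithook(block_pickle_globals)

# Plain data contains no global references, so it still loads.
plain = pickle.loads(pickle.dumps({"numbers": [1, 2, 3]}))

# A pickled reference to a builtin requires find_class and is refused.
try:
    pickle.loads(pickle.dumps(len))
    refused = False
except pickle.UnpicklingError:
    refused = True
```

An ``Unpickler`` subclass that overrides ``find_class`` without calling the default implementation raises no such event and so keeps working, matching the behaviour described above.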
|
||||
|
||||
Performance Impact
|
||||
==================
|
||||
**Prevent os.system**
|
||||
|
||||
The important performance impact is the case where events are being
|
||||
raised but there are no hooks attached. This is the unavoidable case -
|
||||
once a distributor or sysadmin begins adding audit hooks they have
|
||||
explicitly chosen to trade performance for functionality. Performance
|
||||
impact using ``spython`` or with hooks added is not of interest here,
|
||||
since this is considered opt-in functionality.
|
||||
The ``spython`` entry point aborts all ``os.system`` calls.
|
||||
|
||||
Analysis using the ``performance`` tool shows no significant impact,
|
||||
with the vast majority of benchmarks showing between 1.05x faster and
|
||||
1.05x slower.
|
||||
It should be noted here that ``subprocess.Popen(shell=True)`` is
|
||||
allowed (though logged via the platform-specific process creation
|
||||
events). This tradeoff is made because it is much simpler to induce a
|
||||
running application to call ``os.system`` with a single string argument
|
||||
than a function with multiple arguments, and so it is more likely to be
|
||||
used as part of an exploit. There is also little justification for
|
||||
using ``os.system`` in production code, while ``subprocess.Popen`` has
|
||||
a large number of legitimate uses. However, logs indicating the use of
|
||||
the ``shell=True`` argument should be more carefully scrutinised.
|
||||
|
||||
In our opinion, the performance impact of the set of auditing points
|
||||
described in this PEP is negligible.
|
||||
Sysadmins are encouraged to make these kinds of tradeoffs between
|
||||
restriction and detection, and generally should prefer detection.
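
The tradeoff above could be expressed as a single hook along the following lines. This is a sketch only, assuming the ``os.system`` and ``subprocess.Popen`` event names used by CPython 3.8 and later; ``policy_hook`` and ``audit_log`` are hypothetical, and the list stands in for a real OS-managed log.

```python
import os
import sys

audit_log = []  # stand-in for a real, centrally collected log


def policy_hook(event, args):
    """Block os.system outright; allow but record subprocess.Popen."""
    if event == "os.system":
        audit_log.append(("blocked", event, args))
        raise PermissionError("os.system is disabled by policy")
    if event == "subprocess.Popen":
        # Detection rather than prevention: keep the executable and
        # argument list for later review.
        audit_log.append(("allowed", event, args[:2]))


sys.addaudithook(policy_hook)

try:
    os.system("echo should-not-run")
    blocked = False
except PermissionError:
    blocked = True
```

The demonstration exercises only the ``os.system`` branch. The ``subprocess.Popen`` branch records each process launch without interfering with it.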
|
||||
|
||||
Recommendations
|
||||
===============
|
||||
General Recommendations
|
||||
=======================
|
||||
|
||||
Specific recommendations are difficult to make, as the ideal
|
||||
configuration for any environment will depend on the user's ability to
|
||||
manage, monitor, and respond to activity on their own network. However,
|
||||
many of the proposals here do not appear to be of value without deeper
|
||||
illustration. This section provides recommendations using the terms
|
||||
**should** (or **should not**), indicating that we consider it dangerous
|
||||
to ignore the advice, and **may**, indicating that the advice ought
|
||||
to be considered for high value systems. The term **sysadmins** refers
|
||||
to whoever is responsible for deploying Python throughout your network;
|
||||
Recommendations beyond those suggested in the previous section are
|
||||
difficult, as the ideal configuration for any environment depends on
|
||||
the sysadmin's ability to manage, monitor, and respond to activity on
|
||||
their own network. Nonetheless, here we attempt to provide some context
|
||||
and guidance for integrating Python into a complete system.
|
||||
|
||||
This section provides recommendations using the terms **should** (or
|
||||
**should not**), indicating that we consider it risky to ignore the
|
||||
advice, and **may**, indicating that the advice ought to be
|
||||
considered for high value systems. The term **sysadmin** refers to
|
||||
whoever is responsible for deploying Python throughout the network;
|
||||
different organizations may have an alternative title for the
|
||||
responsible people.
|
||||
|
||||
|
@ -668,72 +419,6 @@ attribute changes on type objects.
|
|||
|
||||
[TODO: more good advice; less bad advice]
|
||||
|
||||
Rejected Ideas
|
||||
==============
|
||||
|
||||
Separate module for audit hooks
|
||||
-------------------------------
|
||||
|
||||
The proposal is to add a new module for audit hooks, hypothetically
|
||||
``audit``. This would separate the API and implementation from the
|
||||
``sys`` module, and allow naming the C functions ``PyAudit_AddHook`` and
|
||||
``PyAudit_Audit`` rather than the current variations.
|
||||
|
||||
Any such module would need to be a built-in module that is guaranteed to
|
||||
always be present. The nature of these hooks is that they must be
|
||||
callable without condition, as any conditional imports or calls provide
|
||||
more opportunities to intercept and suppress or modify events.
|
||||
|
||||
Given its nature as one of the most core modules, the ``sys`` module is
|
||||
somewhat protected against module shadowing attacks. Replacing ``sys``
|
||||
with a sufficiently functional module that the application can still run
|
||||
is a much more complicated task than replacing a module with only one
|
||||
function of interest. An attacker that has the ability to shadow the
|
||||
``sys`` module is already capable of running arbitrary code from files,
|
||||
whereas an ``audit`` module can be replaced with a single statement::
|
||||
|
||||
import sys; sys.modules['audit'] = type('audit', (object,),
|
||||
{'audit': lambda *a: None, 'addhook': lambda *a: None})
|
||||
|
||||
Multiple layers of protection already exist for monkey patching attacks
|
||||
against either ``sys`` or ``audit``, but assignments or insertions to
|
||||
``sys.modules`` are not audited.
|
||||
|
||||
This idea is rejected because it makes substituting ``audit`` calls
|
||||
throughout all callers near trivial.
|
||||
|
||||
Flag in sys.flags to indicate "secure" mode
|
||||
-------------------------------------------
|
||||
|
||||
The proposal is to add a value in ``sys.flags`` to indicate when Python
|
||||
is running in a "secure" mode. This would allow applications to detect
|
||||
when some features are enabled and modify their behaviour appropriately.
|
||||
|
||||
Currently there are no guarantees made about security by this PEP - this
|
||||
section is the first time the word "secure" has been used. Security
|
||||
**transparency** does not result in any changed behaviour, so there is
|
||||
no appropriate reason for applications to modify their behaviour.
|
||||
|
||||
Both application-level APIs ``sys.audit`` and ``_imp.open_for_import``
|
||||
are always present and functional, regardless of whether the regular
|
||||
``python`` entry point or some alternative entry point is used. Callers
|
||||
cannot determine whether any hooks have been added (except by performing
|
||||
side-channel analysis), nor do they need to. The calls should be fast
|
||||
enough that callers do not need to avoid them, and the sysadmin is
|
||||
responsible for ensuring their added hooks are fast enough to not affect
|
||||
application performance.
|
||||
|
||||
The argument that this is "security by obscurity" is valid, but
|
||||
irrelevant. Security by obscurity is only an issue when there are no
|
||||
other protective mechanisms; obscurity as the first step in avoiding
|
||||
attack is strongly recommended (see `this article
|
||||
<https://danielmiessler.com/study/security-by-obscurity/>`_ for
|
||||
discussion).
|
||||
|
||||
This idea is rejected because there are no appropriate reasons for an
|
||||
application to change its behaviour based on whether these APIs are in
|
||||
use.
|
||||
|
||||
Further Reading
|
||||
===============
|
||||
|
||||
|
@ -820,7 +505,7 @@ discussions.
|
|||
Copyright
|
||||
=========
|
||||
|
||||
Copyright (c) 2017 by Microsoft Corporation. This material may be
|
||||
Copyright (c) 2017-2018 by Microsoft Corporation. This material may be
|
||||
distributed only subject to the terms and conditions set forth in the
|
||||
Open Publication License, v1.0 or later (the latest version is presently
|
||||
available at http://www.opencontent.org/openpub/).
|
||||
|
|
|
@ -0,0 +1,479 @@
|
|||
PEP: 578
|
||||
Title: Python Runtime Audit Hooks
|
||||
Version: $Revision$
|
||||
Last-Modified: $Date$
|
||||
Author: Steve Dower <steve.dower@python.org>
|
||||
Status: Draft
|
||||
Type: Standards Track
|
||||
Content-Type: text/x-rst
|
||||
Created: 16-Jun-2018
|
||||
Python-Version: 3.8
|
||||
Post-History:
|
||||
|
||||
Abstract
|
||||
========
|
||||
|
||||
This PEP describes additions to the Python API and specific behaviors
|
||||
for the CPython implementation that make actions taken by the Python
|
||||
runtime visible to auditing tools. Visibility into these actions
|
||||
provides opportunities for test frameworks, logging frameworks, and
|
||||
security tools to monitor and optionally limit actions taken by the
|
||||
runtime.
|
||||
|
||||
This PEP proposes adding two APIs to provide insights into a running
|
||||
Python application: one for arbitrary events, and another specific to
|
||||
the module import system. The APIs are intended to be available in all
|
||||
Python implementations, though the specific messages and values used
|
||||
are unspecified here to allow implementations the freedom to determine
|
||||
how best to provide information to their users. Some examples likely
|
||||
to be used in CPython are provided for explanatory purposes.
|
||||
|
||||
See PEP 551 for discussion and recommendations on enhancing the
|
||||
security of a Python runtime making use of these auditing APIs.
|
||||
|
||||
Background
|
||||
==========
|
||||
|
||||
Python provides access to a wide range of low-level functionality on
|
||||
many common operating systems in a consistent manner. While this is
|
||||
incredibly useful for "write-once, run-anywhere" scripting, it also
|
||||
makes monitoring of software written in Python difficult. Because
|
||||
Python uses native system APIs directly, existing monitoring
|
||||
tools either suffer from limited context or auditing bypass.
|
||||
|
||||
Limited context occurs when system monitoring can report that an
|
||||
action occurred, but cannot explain the sequence of events leading to
|
||||
it. For example, network monitoring at the OS level may be able to
|
||||
report "listening started on port 5678", but may not be able to
|
||||
provide the process ID, command line or parent process, or the local
|
||||
state in the program at the point that triggered the action. Firewall
|
||||
controls to prevent such an action are similarly limited, typically
|
||||
to a process name or some global state such as the current user, and
|
||||
in any case rarely provide a useful log file correlated with other
|
||||
application messages.
|
||||
|
||||
Auditing bypass can occur when the typical system tool used for an
|
||||
action would ordinarily report its use, but accessing the APIs via
|
||||
Python does not trigger this. For example, invoking "curl" to make HTTP
|
||||
requests may be specifically monitored in an audited system, but
|
||||
Python's "urlretrieve" function is not.
|
||||
|
||||
Within a long-running Python application, particularly one that
|
||||
processes user-provided information such as a web app, there is a risk
|
||||
of unexpected behavior. This may be due to bugs in the code, or
|
||||
deliberately induced by a malicious user. In both cases, normal
|
||||
application logging may be bypassed resulting in no indication that
|
||||
anything out of the ordinary has occurred.
|
||||
|
||||
Additionally, and somewhat unique to Python, it is very easy to affect
|
||||
the code that is run in an application by manipulating either the
|
||||
import system's search path or placing files earlier on the path than
|
||||
intended. This is often seen when developers create a script with the
|
||||
same name as the module they intend to use - for example, a
|
||||
``random.py`` file that attempts to import the standard library
|
||||
``random`` module.
|
||||
|
||||
Overview of Changes
|
||||
===================
|
||||
|
||||
The aim of these changes is to enable both application developers and
|
||||
system administrators to integrate Python into their existing
|
||||
monitoring systems without dictating how those systems look or behave.
|
||||
|
||||
We propose two API changes to enable this: an Audit Hook and Verified
|
||||
Open Hook. Both are available from Python and native code, allowing
|
||||
applications and frameworks written in pure Python code to take
|
||||
advantage of the extra messages, while also allowing embedders or
|
||||
system administrators to deploy "always-on" builds of Python.
|
||||
|
||||
Only CPython is bound to provide the native APIs as described here.
|
||||
Other implementations should provide the pure Python APIs, and
|
||||
may provide native versions as appropriate for their underlying
|
||||
runtimes.
|
||||
|
||||
Audit Hook
|
||||
----------
|
||||
|
||||
In order to observe actions taken by the runtime (on behalf of the
|
||||
caller), an API is required to raise messages from within certain
|
||||
operations. These operations are typically deep within the Python
|
||||
runtime or standard library, such as dynamic code compilation, module
|
||||
imports, DNS resolution, or use of certain modules such as ``ctypes``.
|
||||
|
||||
The following new C APIs allow embedders and CPython implementors to
|
||||
send and receive audit hook messages::
|
||||
|
||||
    /* Add an auditing hook */
|
||||
typedef int (*hook_func)(const char *event, PyObject *args,
|
||||
void *userData);
|
||||
int PySys_AddAuditHook(hook_func hook, void *userData);
|
||||
|
||||
    /* Raise an event with all auditing hooks */
|
||||
int PySys_Audit(const char *event, PyObject *args);
|
||||
|
||||
    /* Internal API used during Py_Finalize() - not publicly accessible */
|
||||
void _Py_ClearAuditHooks(void);
|
||||
|
||||
The new Python APIs for receiving and raising audit hooks are::
|
||||
|
||||
# Add an auditing hook
|
||||
sys.addaudithook(hook: Callable[[str, tuple]])
|
||||
|
||||
# Raise an event with all auditing hooks
|
||||
sys.audit(str, *args)
|
||||
|
||||
|
||||
Hooks are added by calling ``PySys_AddAuditHook()`` from C at any time,
|
||||
including before ``Py_Initialize()``, or by calling
|
||||
``sys.addaudithook()`` from Python code. Hooks cannot be removed or
|
||||
replaced.
|
||||
|
||||
When events of interest are occurring, code can either call
|
||||
``PySys_Audit()`` from C (while the GIL is held) or ``sys.audit()``. The
|
||||
string argument is the name of the event, and the tuple contains
|
||||
arguments. A given event name should have a fixed schema for arguments,
|
||||
which should be considered a public API (for a given x.y version
|
||||
release), and thus should only change between feature releases with
|
||||
updated documentation.
|
||||
|
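As a minimal sketch of the Python-level API described above, the following registers a hook and raises one event. The event name ``example.open_report`` and its ``(path, mode)`` schema are hypothetical, chosen only for illustration, and the snippet assumes an interpreter that provides ``sys.addaudithook`` and ``sys.audit`` (CPython 3.8 or later).

```python
import sys

received = []  # events seen by the hook, as (name, args) pairs


def record_demo_events(event, args):
    # Hooks see every event raised in the process, so filter by name.
    if event.startswith("example."):
        received.append((event, args))


sys.addaudithook(record_demo_events)

# Hypothetical event with a fixed schema: (path, mode).
sys.audit("example.open_report", "/tmp/report.txt", "r")
```

The hook receives the event name as a string and the remaining arguments packed into a tuple, which is why a stable per-event schema matters to hook authors.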
||||
For maximum compatibility, events using the same name as an event in
|
||||
the reference interpreter CPython should make every attempt to use
|
||||
compatible arguments. Including the name or an abbreviation of the
|
||||
implementation in implementation-specific event names will also help
|
||||
prevent collisions. For example, a ``pypy.jit_invoked`` event is clearly
|
||||
distinguished from an ``ipy.jit_invoked`` event.
|
||||
|
||||
When an event is audited, each hook is called in the order it was added
|
||||
with the event name and tuple. If any hook returns with an exception
|
||||
set, later hooks are ignored and *in general* the Python runtime should
|
||||
terminate. This is intentional to allow hook implementations to decide
|
||||
how to respond to any particular event. The typical responses will be to
|
||||
log the event, abort the operation with an exception, or to immediately
|
||||
terminate the process with an operating system exit call.
|
||||
|
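The ordering rule can be illustrated with two hooks. This sketch reflects the behavior of CPython 3.8 and later, where an exception raised by a hook propagates out of ``sys.audit()`` and hooks added afterwards are skipped for that event. The event names and hook names are hypothetical.

```python
import sys

seen_by_second = []


def deny_hook(event, args):
    # Added first, so it runs first for every event.
    if event == "example.denied":
        raise PermissionError("example.denied is not allowed here")


def logging_hook(event, args):
    # Added second, so it only runs if deny_hook returned normally.
    if event.startswith("example."):
        seen_by_second.append(event)


sys.addaudithook(deny_hook)
sys.addaudithook(logging_hook)

sys.audit("example.allowed")

try:
    sys.audit("example.denied")
except PermissionError:
    aborted = True
else:
    aborted = False
```

A hook that must always log should therefore be added before any hook that may abort the operation.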
||||
When an event is audited but no hooks have been set, the ``audit()``
|
||||
function should incur minimal overhead. Ideally, each argument is a
|
||||
reference to existing data rather than a value calculated just for the
|
||||
auditing call.
|
||||
|
||||
As hooks may be Python objects, they need to be freed during
|
||||
``Py_Finalize()``. To do this, we add an internal API
|
||||
``_Py_ClearAuditHooks()`` that releases any Python hooks and any
|
||||
memory held. This is an internal function with no public export, and
|
||||
we recommend it should raise its own audit event for all current hooks
|
||||
to ensure that unexpected calls are observed.
|
||||
|
||||
Below in `Suggested Audit Hook Locations`_, we recommend some important
|
||||
operations that should raise audit events. In PEP 551, more audited
|
||||
operations are recommended with a view to security transparency.
|
||||
|
||||
Python implementations should document which operations will raise
|
||||
audit events, along with the event schema. It is intended that
|
||||
``sys.addaudithook(print)`` be a trivial way to display all messages.
|
||||
|
||||
Verified Open Hook
|
||||
------------------
|
||||
|
||||
Most operating systems have a mechanism to distinguish between files
|
||||
that can be executed and those that cannot. For example, this may be an
|
||||
execute bit in the permissions field, or a verified hash of the file
|
||||
contents to detect potential code tampering. These are an important
|
||||
security mechanism for preventing execution of data or code that is not
|
||||
approved for a given environment. Currently, Python has no way to
|
||||
integrate with these when launching scripts or importing modules.
|
||||
|
||||
The new public C API for the verified open hook is::
|
||||
|
||||
    /* Set the handler */
|
||||
    typedef PyObject *(*hook_func)(PyObject *path, void *userData);
|
||||
    int PyImport_SetOpenForImportHook(hook_func handler, void *userData);
|
||||
|
||||
    /* Open a file using the handler */
|
||||
    PyObject *PyImport_OpenForImport(const char *path);
|
||||
|
||||
The new public Python API for the verified open hook is::
|
||||
|
||||
# Open a file using the handler
|
||||
importlib.util.open_for_import(path : str) -> io.IOBase
|
||||
|
||||
|
||||
The ``importlib.util.open_for_import()`` function is a drop-in
|
||||
replacement for ``open(str(pathlike), 'rb')``. Its default behaviour is
|
||||
to open a file for raw, binary access. To change the behaviour a new
|
||||
handler should be set. Handler functions only accept ``str`` arguments.
|
||||
|
||||
A custom handler may be set by calling ``PyImport_SetOpenForImportHook()``
|
||||
from C at any time, including before ``Py_Initialize()``. However, if a
|
||||
hook has already been set then the call will fail. When
|
||||
``open_for_import()`` is called with a hook set, the hook will be passed
|
||||
the path and its return value will be returned directly. The returned
|
||||
object should be an open file-like object that supports reading raw
|
||||
bytes. This is explicitly intended to allow a ``BytesIO`` instance if
|
||||
the open handler has already had to read the file into memory in order
|
||||
to perform whatever verification is necessary to determine whether the
|
||||
content is permitted to be executed.
|
||||
|
||||
Note that these hooks can import and call the ``_io.open()`` function on
|
||||
CPython without triggering themselves. They can also use ``_io.BytesIO``
|
||||
to return a compatible result using an in-memory buffer.
|
||||
|
||||
If the hook determines that the file should not be loaded, it should
|
||||
raise an exception of its choice, as well as performing any other
|
||||
logging.
|
||||
|
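The logic of a verifying handler might look like the sketch below, written in Python for readability even though the proposal only allows the hook to be set from C. The function ``verified_open`` and the allow-list ``APPROVED_SHA256`` are hypothetical, and the snippet does not call the proposed ``open_for_import()`` API. It only shows the read, verify, and return-``BytesIO`` pattern described above.

```python
import hashlib
import io
import os
import tempfile

# Hypothetical allow-list mapping file paths to approved SHA-256 digests.
APPROVED_SHA256 = {}


def verified_open(path):
    """Return an in-memory binary stream for an approved file, else raise."""
    # Plain open() keeps the sketch portable; the text above suggests
    # _io.open() on CPython so that a real handler does not trigger itself.
    with open(path, "rb") as f:
        data = f.read()
    digest = hashlib.sha256(data).hexdigest()
    if APPROVED_SHA256.get(path) != digest:
        raise PermissionError(f"{path!r} is not approved for execution")
    # Return the verified bytes so later edits on disk cannot change them.
    return io.BytesIO(data)


# Demonstration with a throwaway file.
with tempfile.TemporaryDirectory() as tmp:
    module_path = os.path.join(tmp, "approved_module.py")
    with open(module_path, "wb") as f:
        f.write(b"VALUE = 1\n")
    APPROVED_SHA256[module_path] = hashlib.sha256(b"VALUE = 1\n").hexdigest()

    approved_source = verified_open(module_path).read()

    # Tampering with the file after approval makes the handler refuse it.
    with open(module_path, "ab") as f:
        f.write(b"import os\n")
    try:
        verified_open(module_path)
    except PermissionError:
        tamper_rejected = True
    else:
        tamper_rejected = False
```

Returning the bytes that were actually hashed, rather than reopening the file, avoids a window in which the file could change between verification and use.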
||||
All import and execution functionality involving code from a file will
|
||||
be changed to use ``open_for_import()`` unconditionally. It is important
|
||||
to note that calls to ``compile()``, ``exec()`` and ``eval()`` do not go
|
||||
through this function - an audit hook that includes the code from these
|
||||
calls is the best opportunity to validate code that is read from the
|
||||
file. Given the current decoupling between import and execution in
|
||||
Python, most imported code will go through both ``open_for_import()``
|
||||
and the audit hook for ``compile``, and so care should be taken to avoid
|
||||
repeating verification steps.
|
||||
|
||||
There is no Python API provided for changing the open hook. To modify
|
||||
import behavior from Python code, use the existing functionality
|
||||
provided by ``importlib``.
|
||||
|
||||
API Availability
|
||||
----------------
|
||||
|
||||
While all the functions added here are considered public and stable API,
|
||||
the behavior of the functions is implementation specific. Most
|
||||
descriptions here refer to the CPython implementation, and while other
|
||||
implementations should provide the functions, there is no requirement
|
||||
that they behave the same.
|
||||
|
||||
For example, ``sys.addaudithook()`` and ``sys.audit()`` should exist but
|
||||
may do nothing. This allows code to make calls to ``sys.audit()``
|
||||
without having to test for existence, but it should not assume that its
|
||||
call will have any effect. (Including existence tests in
|
||||
security-critical code allows another vector to bypass auditing, so it
|
||||
is preferable that the function always exist.)
|
||||
|
||||
``importlib.util.open_for_import(path)`` should at a minimum always
|
||||
return ``_io.open(path, 'rb')``. Code using the function should make no
|
||||
further assumptions about what may occur, and implementations other than
|
||||
CPython are not required to let developers override the behavior of this
|
||||
function with a hook.
|
||||
|
||||
Suggested Audit Hook Locations
|
||||
==============================
|
||||
|
||||
The locations and parameters in calls to ``sys.audit()`` or
|
||||
``PySys_Audit()`` are to be determined by individual Python
|
||||
implementations. This is to allow maximum freedom for implementations
|
||||
to expose the operations that are most relevant to their platform,
|
||||
and to avoid or ignore potentially expensive or noisy events.
|
||||
|
||||
Table 1 acts as both suggestions of operations that should trigger
|
||||
audit events on all implementations, and examples of event schemas.
|
||||
|
||||
Table 2 provides further examples that are not required, but are
|
||||
likely to be available in CPython.
|
||||
|
||||
Refer to the documentation associated with your version of Python to
|
||||
see which operations provide audit events.
|
||||
|
||||
.. csv-table:: Table 1: Suggested Audit Hooks
|
||||
:header: "API Function", "Event Name", "Arguments", "Rationale"
|
||||
:widths: 2, 2, 3, 6
|
||||
|
||||
``PySys_AddAuditHook``, ``sys.addaudithook``, "", "Detect when new
|
||||
audit hooks are being added.
|
||||
"
|
||||
``PyImport_SetOpenForImportHook``, ``setopenforimporthook``, "", "
|
||||
Detects any attempt to set the ``open_for_import`` hook.
|
||||
"
|
||||
"``compile``, ``exec``, ``eval``, ``PyAst_CompileString``,
|
||||
``PyAST_obj2mod``", ``compile``, "``(code, filename_or_none)``", "
|
||||
Detect dynamic code compilation, where ``code`` could be a string or
|
||||
AST. Note that this will be called for regular imports of source
|
||||
code, including those that were opened with ``open_for_import``.
|
||||
"
|
||||
"``exec``, ``eval``, ``run_mod``", ``exec``, "``(code_object,)``", "
|
||||
Detect dynamic execution of code objects. This only occurs for
|
||||
explicit calls, and is not raised for normal function invocation.
|
||||
"
|
||||
``import``, ``import``, "``(module, filename, sys.path,
|
||||
sys.meta_path, sys.path_hooks)``", "Detect when modules are
|
||||
imported. This is raised before the module name is resolved to a
|
||||
file. All arguments other than the module name may be ``None`` if
|
||||
they are not used or available.
|
||||
"
|
||||
``PyEval_SetProfile``, ``sys.setprofile``, "", "Detect when code is
|
||||
injecting trace functions. Because of the implementation, exceptions
|
||||
raised from the hook will abort the operation, but will not be
|
||||
raised in Python code. Note that ``threading.setprofile`` eventually
|
||||
calls this function, so the event will be audited for each thread.
|
||||
"
|
||||
``PyEval_SetTrace``, ``sys.settrace``, "", "Detect when code is
|
||||
injecting trace functions. Because of the implementation, exceptions
|
||||
raised from the hook will abort the operation, but will not be
|
||||
raised in Python code. Note that ``threading.settrace`` eventually
|
||||
calls this function, so the event will be audited for each thread.
|
||||
"
|
||||
"``_PyObject_GenericSetAttr``, ``check_set_special_type_attr``,
|
||||
``object_set_class``, ``func_set_code``, ``func_set_[kw]defaults``","
|
||||
``object.__setattr__``","``(object, attr, value)``","Detect monkey
|
||||
patching of types and objects. This event
|
||||
is raised for the ``__class__`` attribute and any attribute on
|
||||
``type`` objects.
|
||||
"
|
||||
"``_PyObject_GenericSetAttr``",``object.__delattr__``,"``(object,
|
||||
attr)``","Detect deletion of object attributes. This event is raised
|
||||
for any attribute on ``type`` objects.
|
||||
"
|
||||
"``Unpickler.find_class``",``pickle.find_class``,"``(module_name,
|
||||
global_name)``","Detect imports and global name lookup when
|
||||
unpickling.
|
||||
"
|
||||
|
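As an illustration of consuming the ``compile`` and ``exec`` events from Table 1, the sketch below records the filename associated with each. It assumes CPython 3.8 or later, where these two events are raised with the schemas shown in the table. The hook name and the ``<audit-demo>`` filename are hypothetical.

```python
import sys

observed = []


def watch_dynamic_code(event, args):
    if event == "compile":
        # Schema per Table 1: (code, filename_or_none)
        observed.append(("compile", args[1]))
    elif event == "exec":
        # Schema per Table 1: (code_object,)
        observed.append(("exec", args[0].co_filename))


sys.addaudithook(watch_dynamic_code)

code = compile("answer = 6 * 7", "<audit-demo>", "exec")
namespace = {}
exec(code, namespace)
```

The hook only inspects the filename because the form of the ``code`` argument (string, bytes, or AST) may vary between call sites and implementations.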
||||
|
||||
.. csv-table:: Table 2: Potential CPython Audit Hooks
|
||||
:header: "API Function", "Event Name", "Arguments", "Rationale"
|
||||
:widths: 2, 2, 3, 6
|
||||
|
||||
``_PySys_ClearAuditHooks``, ``sys._clearaudithooks``, "", "Notifies
|
||||
hooks they are being cleaned up, mainly in case the event is
|
||||
triggered unexpectedly. This event cannot be aborted.
|
||||
"
|
||||
``code_new``, ``code.__new__``, "``(bytecode, filename, name)``", "
|
||||
Detect dynamic creation of code objects. This only occurs for
|
||||
direct instantiation, and is not raised for normal compilation.
|
||||
"
|
||||
``func_new_impl``, ``function.__new__``, "``(code,)``", "Detect
|
||||
dynamic creation of function objects. This only occurs for direct
|
||||
instantiation, and is not raised for normal compilation.
|
||||
"
|
||||
"``_ctypes.dlopen``, ``_ctypes.LoadLibrary``", ``ctypes.dlopen``, "
|
||||
``(module_or_path,)``", "Detect when native modules are used.
|
||||
"
|
||||
``_ctypes._FuncPtr``, ``ctypes.dlsym``, "``(lib_object, name)``", "
|
||||
Collect information about specific symbols retrieved from native
|
||||
modules.
|
||||
"
|
||||
``_ctypes._CData``, ``ctypes.cdata``, "``(ptr_as_int,)``", "Detect
|
||||
when code is accessing arbitrary memory using ``ctypes``.
|
||||
"
|
||||
``sys._getframe``, ``sys._getframe``, "``(frame_object,)``", "Detect
|
||||
when code is accessing frames directly.
|
||||
"
|
||||
``sys._current_frames``, ``sys._current_frames``, "", "Detect when
|
||||
code is accessing frames directly.
|
||||
"
|
||||
"``socket.bind``, ``socket.connect``, ``socket.connect_ex``,
|
||||
``socket.getaddrinfo``, ``socket.getnameinfo``, ``socket.sendmsg``,
|
||||
``socket.sendto``", ``socket.address``, "``(address,)``", "Detect
|
||||
access to network resources. The address is unmodified from the
|
||||
original call.
|
||||
"
|
||||
"``member_get``, ``func_get_code``, ``func_get_[kw]defaults``
|
||||
",``object.__getattr__``,"``(object, attr)``","Detect access to
|
||||
restricted attributes. This event is raised for any built-in
|
||||
members that are marked as restricted, and members that may allow
|
||||
bypassing imports.
|
||||
"
|
||||
|
||||
|
||||
Performance Impact
|
||||
==================
|
||||
|
||||
The important performance impact is the case where events are being
|
||||
raised but there are no hooks attached. This is the unavoidable case -
|
||||
once a distributor or sysadmin begins adding audit hooks they have
|
||||
explicitly chosen to trade performance for functionality. Performance
|
||||
impact using ``spython`` or with hooks added is not of interest here,
|
||||
since this is considered opt-in functionality.
|
||||
|
||||
Analysis using the ``performance`` tool shows no significant impact,
|
||||
with the vast majority of benchmarks showing between 1.05x faster and
|
||||
1.05x slower.
|
||||
|
||||
In our opinion, the performance impact of the set of auditing points
|
||||
described in this PEP is negligible.
|
||||
|
||||
Rejected Ideas
|
||||
==============
|
||||
|
||||
Separate module for audit hooks
|
||||
-------------------------------
|
||||
|
||||
The proposal is to add a new module for audit hooks, hypothetically
|
||||
``audit``. This would separate the API and implementation from the
|
||||
``sys`` module, and allow naming the C functions ``PyAudit_AddHook`` and
|
||||
``PyAudit_Audit`` rather than the current variations.
|
||||
|
||||
Any such module would need to be a built-in module that is guaranteed to
|
||||
always be present. The nature of these hooks is that they must be
|
||||
callable without condition, as any conditional imports or calls provide
|
||||
more opportunities to intercept and suppress or modify events.
|
||||
|
||||
Given its nature as one of the most core modules, the ``sys`` module is
|
||||
somewhat protected against module shadowing attacks. Replacing ``sys``
|
||||
with a sufficiently functional module that the application can still run
|
||||
is a much more complicated task than replacing a module with only one
|
||||
function of interest. An attacker that has the ability to shadow the
|
||||
``sys`` module is already capable of running arbitrary code from files,
|
||||
whereas an ``audit`` module can be replaced with a single statement::
|
||||
|
||||
import sys; sys.modules['audit'] = type('audit', (object,),
|
||||
{'audit': lambda *a: None, 'addhook': lambda *a: None})
|
||||
|
||||
Multiple layers of protection already exist for monkey patching attacks
|
||||
against either ``sys`` or ``audit``, but assignments or insertions to
|
||||
``sys.modules`` are not audited.
|
||||
|
||||
This idea is rejected because it makes substituting ``audit`` calls
|
||||
throughout all callers near trivial.
|
||||
|
||||
Flag in sys.flags to indicate "secure" mode
|
||||
-------------------------------------------
|
||||
|
||||
The proposal is to add a value in ``sys.flags`` to indicate when Python
|
||||
is running in a "secure" mode. This would allow applications to detect
|
||||
when some features are enabled and modify their behaviour appropriately.
|
||||
|
||||
Currently there are no guarantees made about security by this PEP - this
|
||||
section is the first time the word "secure" has been used. Security
|
||||
**transparency** does not result in any changed behaviour, so there is
|
||||
no appropriate reason for applications to modify their behaviour.
|
||||
|
||||
Both application-level APIs ``sys.audit`` and ``_imp.open_for_import``
|
||||
are always present and functional, regardless of whether the regular
|
||||
``python`` entry point or some alternative entry point is used. Callers
|
||||
cannot determine whether any hooks have been added (except by performing
|
||||
side-channel analysis), nor do they need to. The calls should be fast
|
||||
enough that callers do not need to avoid them, and the sysadmin is
|
||||
responsible for ensuring their added hooks are fast enough to not affect
|
||||
application performance.
|
||||
|
||||
The argument that this is "security by obscurity" is valid, but
|
||||
irrelevant. Security by obscurity is only an issue when there are no
|
||||
other protective mechanisms; obscurity as the first step in avoiding
|
||||
attack is strongly recommended (see `this article
|
||||
<https://danielmiessler.com/study/security-by-obscurity/>`_ for
|
||||
discussion).
|
||||
|
||||
This idea is rejected because there are no appropriate reasons for an
|
||||
application to change its behaviour based on whether these APIs are in
|
||||
use.
|
||||
|
||||
|
||||
Acknowledgments
|
||||
===============
|
||||
|
||||
Thanks to all the people from Microsoft involved in helping make the
|
||||
Python runtime safer for production use, and especially to James Powell
|
||||
for doing much of the initial research, analysis and implementation, Lee
|
||||
Holmes for invaluable insights into the info-sec field and PowerShell's
|
||||
responses, and Brett Cannon for the restraining and grounding
|
||||
discussions.
|
||||
|
||||
Copyright
|
||||
=========
|
||||
|
||||
Copyright (c) 2018 by Microsoft Corporation. This material may be
|
||||
distributed only subject to the terms and conditions set forth in the
|
||||
Open Publication License, v1.0 or later (the latest version is presently
|
||||
available at http://www.opencontent.org/openpub/).
|