
[[configuring-auditing]]
== Configuring Auditing

[IMPORTANT]
====
Audit logs are **disabled** by default. To enable auditing, add the following setting to the
`elasticsearch.yml` file:

[source,yaml]
----------------------------
shield.audit.enabled: true
----------------------------
====

Auditing keeps track of important events occurring in Elasticsearch, primarily around security
concerns. Tracking and persisting these events is essential for any secured environment and can provide
evidence of suspicious or malicious activity on the Elasticsearch cluster.

Shield provides two ways to output these events: in a dedicated `access.log` file stored on the host's file system, or
in an Elasticsearch index on the same or separate cluster. These options are not mutually exclusive. For example, both
options can be enabled through an entry in the `elasticsearch.yml` file:

[source,yaml]
----------------------------
shield.audit.outputs: [index, logfile]
----------------------------

The `index` output type is expected to be used in conjunction with the `logfile` output type, because
the `index` output type can lose messages if the target index is unavailable. If auditing is enabled,
use the `logfile` output type as the official record of events. The `index` output type can be enabled
as a convenience for browsing historical events.

Please also note that, because audit events are batched together before being indexed, they may not appear immediately.
Please refer to the `shield.audit.index.flush_interval` setting below for instructions on how to modify the frequency
with which batched events are flushed.
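
As a minimal sketch, the batching behavior could be tuned in `elasticsearch.yml` as follows. The values `5s` and `500`
are illustrative assumptions, not recommendations. Both settings are described in the index configuration table later
in this section.

[source,yaml]
----------------------------
# Illustrative value only: flush buffered audit events every 5 seconds (default is 1s)
shield.audit.index.flush_interval: 5s
# Illustrative value only: batch up to 500 events into a single write (default is 1000)
shield.audit.index.bulk_size: 500
----------------------------

With a longer flush interval, indexed events would be expected to take correspondingly longer to appear.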

[float]
=== Log Entry Types

Each audit-related event is represented by a single log entry of a specific type (the type identifies the
kind of event that occurred). The possible log entry types are:

* `anonymous_access_denied`          is logged when the request is denied due to a missing authentication token.
* `authentication_failed`            is logged when the authentication token cannot be matched to a known user.
* `authentication_failed [<realm>]`  is logged for every realm that fails to present a valid authentication token.
                                     The value of _<realm>_ is the realm type.
* `access_denied`                    is logged when an authenticated user attempts an action the user does not have the
                                     <<reference,privilege>> to perform.
* `access_granted`                   is logged when an authenticated user attempts an action the user has the correct
                                     privilege to perform. At the `TRACE` level, all system (internal) actions are logged as
                                     well (at all other levels they are not logged, to avoid cluttering the logs).
* `tampered_request`                 is logged when a request is detected to have been tampered with (typically relates to
                                     `search/scroll` requests when the scroll ID is believed to have been tampered with).
* `connection_granted`               is logged when an incoming TCP connection passes the IP filtering for a specific profile.
* `connection_denied`                is logged when an incoming TCP connection does not pass the IP filtering for a specific profile.

To avoid needless proliferation of log entries, Shield enables you to control what entry types should be logged. This can
be done by setting the logging level. The following table lists the log entry types that will be logged for each of the
possible log levels:

.Log Entry Types and Levels
[options="header"]
|======
| Log Level | Entry Type
| `ERROR`   | `authentication_failed`, `access_denied`, `tampered_request`, `connection_denied`
| `WARN`    | `authentication_failed`, `access_denied`, `tampered_request`, `connection_denied`, `anonymous_access_denied`
| `INFO`    | `authentication_failed`, `access_denied`, `tampered_request`, `connection_denied`, `anonymous_access_denied`, `access_granted`
| `DEBUG`   | Doesn't output additional entry types beyond `INFO`, but extends the information emitted for each entry (see <<audit-log-entry-format, Log Entry Format>> below).
| `TRACE`   | `authentication_failed`, `access_denied`, `tampered_request`, `connection_denied`, `anonymous_access_denied`, `access_granted`, `connection_granted`, `authentication_failed [<realm>]`. In addition, internal system requests (self-management requests triggered by Elasticsearch itself) will also be logged for `access_granted` entry type.
|======
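
For instance, to log only the entry types listed for the `WARN` level in the table above, the logger line in `logging.yml`
might be changed as sketched below. This assumes the logger and appender names from the default `logging.yml` file
(`shield.audit.logfile` and `access_log`) are still in use.

[source,yaml]
----------------------------
logger:
  # WARN replaces the default INFO level, so access_granted entries are no longer logged
  shield.audit.logfile: WARN, access_log
----------------------------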


[float]
[[audit-log-entry-format]]
=== Log Entry Format

As mentioned above, every log entry represents an event that occurred in the system. As such, each entry is associated with
a timestamp (at which the event occurred), the component/layer the event is associated with, and the entry/event type. In
addition, every log entry (depending on its type) carries additional information about the event.

The format of a log entry is shown below:

[source,txt]
----------------------------------------------------------------------------
[<timestamp>] [<local_node_info>] [<layer>] [<entry_type>] <attribute_list>
----------------------------------------------------------------------------

Where:

* `<timestamp>` -           the timestamp of the entry (in the format configured in `logging.yml`, shown in the Audit Logs Settings section below)
* `<local_node_info>` -     additional information about the local node that this log entry is printed from (the <<audit-log-entry-local-node-info, table below>> shows how this information can be controlled via settings)
* `<layer>` -               the layer this entry relates to. Can be one of `rest`, `transport`, or `ip_filter`
* `<entry_type>` -          the type of the entry as discussed above. Can be one of `anonymous_access_denied`, `authentication_failed`,
                            `access_denied`, `access_granted`, `tampered_request`, `connection_granted`, or `connection_denied`.
* `<attribute_list>` -      a comma-separated list of attributes carrying data relevant to the event that occurred (formatted as `attr1=[val1], attr2=[val2],...`)

[[audit-log-entry-local-node-info]]
.Local Node Info Settings
[options="header"]
|======
| Name                                                   | Default   | Description
| `shield.audit.logfile.prefix.emit_node_name`           | true      | When set to `true`, the local node's name will be emitted
| `shield.audit.logfile.prefix.emit_node_host_address`   | false     | When set to `true`, the local node's IP address will be emitted
| `shield.audit.logfile.prefix.emit_node_host_name`      | false     | When set to `true`, the local node's host name will be emitted
|======
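
As a sketch, the following would prefix each log entry with both the node name and the node's IP address. Placing these
settings in `elasticsearch.yml` is an assumption, based on where the other `shield.audit` settings in this section are
configured.

[source,yaml]
----------------------------
# Default is true; shown explicitly for clarity
shield.audit.logfile.prefix.emit_node_name: true
# Default is false; adds the local node's IP address to the prefix
shield.audit.logfile.prefix.emit_node_host_address: true
----------------------------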

The following tables describe the possible attributes each entry type can carry (the attributes that will be available depend on the configured log level):

.`[rest] [anonymous_access_denied]` attributes
[options="header"]
|======
| Attribute             | Minimum Log Level     | Description
| `origin_address`      | WARN                  | The address the REST request originates from
| `uri`                 | WARN                  | The REST endpoint URI
| `request_body`        | DEBUG                 | The body of the request
|======

.`[rest] [authentication_failed]` attributes
[options="header"]
|======
| Attribute             | Minimum Log Level     | Description
| `origin_address`      | ERROR                 | The address the REST request originates from
| `principal`           | ERROR                 | The principal (username) that failed to authenticate
| `uri`                 | ERROR                 | The REST endpoint URI
| `request_body`        | DEBUG                 | The body of the request
| `realm`               | TRACE                 | The realm that failed to authenticate the user. NOTE: A separate entry will be printed for each of the consulted realms
|======

.`[transport] [anonymous_access_denied]` attributes
[options="header"]
|======
| Attribute             | Minimum Log Level     | Description
| `origin_type`         | WARN                  | The type of origin the request came from. Can be `rest` (the request originated from a REST API request), `transport` (the request was received on the transport channel), or `local_node` (the local node issued the request)
| `origin_address`      | WARN                  | The address the request originates from
| `action`              | WARN                  | The name of the action that was executed
| `request`             | DEBUG                 | The type of the request that was executed
| `indices`             | WARN                  | A comma-separated list of indices this request relates to (when applicable)
|======

.`[transport] [authentication_failed]` attributes
[options="header"]
|======
| Attribute             | Minimum Log Level      | Description
| `origin_type`         | ERROR                  | The type of origin the request came from. Can be `rest` (the request originated from a REST API request), `transport` (the request was received on the transport channel), or `local_node` (the local node issued the request)
| `origin_address`      | ERROR                  | The address the request originates from
| `principal`           | ERROR                  | The principal (username) that failed to authenticate
| `action`              | ERROR                  | The name of the action that was executed
| `request`             | DEBUG                  | The type of the request that was executed
| `indices`             | ERROR                  | A comma-separated list of indices this request relates to (when applicable)
| `realm`               | TRACE                  | The realm that failed to authenticate the user. NOTE: A separate entry will be printed for each of the consulted realms
|======

.`[transport] [access_granted]` attributes
[options="header"]
|======
| Attribute             | Minimum Log Level     | Description
| `origin_type`         | INFO                  | The type of origin the request came from. Can be `rest` (the request originated from a REST API request), `transport` (the request was received on the transport channel), or `local_node` (the local node issued the request)
| `origin_address`      | INFO                  | The address the request originates from
| `principal`           | INFO                  | The principal (username) that was granted access
| `action`              | INFO                  | The name of the action that was executed
| `request`             | DEBUG                 | The type of the request that was executed
| `indices`             | INFO                  | A comma-separated list of indices this request relates to (when applicable)
|======

.`[transport] [access_denied]` attributes
[options="header"]
|======
| Attribute             | Minimum Log Level     | Description
| `origin_type`         | ERROR                 | The type of origin the request came from. Can be `rest` (the request originated from a REST API request), `transport` (the request was received on the transport channel), or `local_node` (the local node issued the request)
| `origin_address`      | ERROR                 | The address the request originates from
| `principal`           | ERROR                 | The principal (username) that was denied access
| `action`              | ERROR                 | The name of the action that was executed
| `request`             | DEBUG                 | The type of the request that was executed
| `indices`             | ERROR                 | A comma-separated list of indices this request relates to (when applicable)
|======

.`[transport] [tampered_request]` attributes
[options="header"]
|======
| Attribute             | Minimum Log Level     | Description
| `origin_type`         | ERROR                 | The type of origin the request came from. Can be `rest` (the request originated from a REST API request), `transport` (the request was received on the transport channel), or `local_node` (the local node issued the request)
| `origin_address`      | ERROR                 | The address the request originates from
| `principal`           | ERROR                 | The principal (username) associated with the tampered request
| `action`              | ERROR                 | The name of the action that was executed
| `request`             | DEBUG                 | The type of the request that was executed
| `indices`             | ERROR                 | A comma-separated list of indices this request relates to (when applicable)
|======

.`[ip_filter] [connection_granted]` attributes
[options="header"]
|======
| Attribute             | Minimum Log Level     | Description
| `origin_address`      | TRACE                 | The address the request originates from
| `transport_profile`   | TRACE                 | The transport profile the request targeted
| `rule`                | TRACE                 | The IP filtering rule that granted the request
|======

.`[ip_filter] [connection_denied]` attributes
[options="header"]
|======
| Attribute             | Minimum Log Level     | Description
| `origin_address`      | ERROR                 | The address the request originates from
| `transport_profile`   | ERROR                 | The transport profile the request targeted
| `rule`                | ERROR                 | The IP filtering rule that denied the request
|======


[float]
=== Audit Logs Settings

As mentioned above, the audit logs are configured in the `logging.yml` file located in `CONFIG_DIR/shield`. The following snippet shows the default logging configuration:

[[logging-file]]

.Default `logging.yml` File
[source,yaml]
----
logger:
  shield.audit.logfile: INFO, access_log

additivity:
  shield.audit.logfile: false

appender:

  access_log:
    type: dailyRollingFile
    file: ${path.logs}/${cluster.name}-access.log
    datePattern: "'.'yyyy-MM-dd"
    layout:
      type: pattern
      conversionPattern: "[%d{ISO8601}] %m%n"
----

As can be seen above, by default audit information is appended to the `access.log` file located in the
standard Elasticsearch `logs` directory (typically located at `$ES_HOME/logs`).

[float]
[[audit-index]]
=== Storing Audit Logs in an Elasticsearch Index

It is possible to store audit logs in an Elasticsearch index. This index can be either on the same cluster, or on
a different cluster (see below). Several settings in `elasticsearch.yml` control this behavior.

.`audit log indexing configuration`
[options="header"]
|======
| Attribute                           | Default Setting    | Description
| `shield.audit.outputs`              | `logfile`          | Must be set to *index* or *[index, logfile]* to enable
| `shield.audit.index.bulk_size`      | `1000`             | Controls how many audit events will be batched into a single write
| `shield.audit.index.flush_interval` | `1s`               | Controls how often to flush buffered events into the index
| `shield.audit.index.rollover`       | `daily`            | Controls how often to roll over to a new index: `hourly`, `daily`, `weekly`, or `monthly`.
| `shield.audit.index.events.include` | `anonymous_access_denied, authentication_failed, access_granted, access_denied, tampered_request, connection_granted, connection_denied`| The audit events to be indexed. Valid values are `anonymous_access_denied`, `authentication_failed`, `access_granted`, `access_denied`, `tampered_request`, `connection_granted`, `connection_denied`, and `system_access_granted`. `_all` is a special value that includes all types.
| `shield.audit.index.events.exclude` | `system_access_granted`  | The audit events to exclude from indexing. By default, `system_access_granted` events are excluded; enabling these events results in every internal node communication being indexed, which will make the index size much larger.
|======
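
As a minimal sketch, the following `elasticsearch.yml` fragment enables both outputs, rolls the audit index over weekly,
and indexes only failure and denial events. The choice of events is illustrative, and the bracketed list syntax for
`shield.audit.index.events.include` is an assumption modeled on the `shield.audit.outputs` example shown earlier.

[source,yaml]
----------------------------
# Keep the log file as the record of events and add the index for browsing
shield.audit.outputs: [index, logfile]
# Roll over to a new audit index every week instead of the default daily
shield.audit.index.rollover: weekly
# Illustrative selection: index failures and denials only (list syntax is an assumption)
shield.audit.index.events.include: [anonymous_access_denied, authentication_failed, access_denied, tampered_request, connection_denied]
----------------------------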

.audit index settings
The settings for the index in which the events are stored can also be configured. Place the index settings under
the `shield.audit.index.settings` namespace. For example, the following sets the number of shards and replicas to 1 for
the audit indices:

[source,yaml]
----------------------------
shield.audit.index.settings:
  index:
    number_of_shards: 1
    number_of_replicas: 1
----------------------------

[float]
=== Forwarding Audit Logs to a Remote Cluster

To store audit events in a remote Elasticsearch cluster, the following additional options are available.

.`remote audit log indexing configuration`
[options="header"]
|======
| Attribute                           | Default Setting    | Description
| `shield.audit.index.client.hosts`   | None        | Comma-separated list of `host:port` pairs. These hosts should be nodes in the cluster to which you want to index.
| `shield.audit.index.client.cluster.name` | None   | The name of the remote cluster.
| `shield.audit.index.client.shield.user`  | None   | The username:password pair used to authenticate with the remote cluster.
|======

Additional settings may be passed to the remote client by placing them under the `shield.audit.index.client` namespace.
For example, to allow the remote client to discover all of the nodes in the remote cluster, you could set
the `client.transport.sniff` option:

[source,yaml]
----------------------------
shield.audit.index.client.transport.sniff: true
----------------------------
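
Putting the remote settings together, a configuration might look like the sketch below. The host names, cluster name,
and credentials are hypothetical placeholders, and port `9300` is assumed to be the transport port of the remote nodes.

[source,yaml]
----------------------------
# The index output must be enabled for events to be forwarded
shield.audit.outputs: [index, logfile]
# Hypothetical remote nodes, given as host:port pairs
shield.audit.index.client.hosts: "audit-node1.example.com:9300, audit-node2.example.com:9300"
# Hypothetical name of the remote cluster
shield.audit.index.client.cluster.name: audit-cluster
# Hypothetical username:password pair used to authenticate with the remote cluster
shield.audit.index.client.shield.user: "audit_user:changeme"
----------------------------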