Log ingestion provides a way to transform unstructured log data into structured data and ingest into OpenSearch. Structured log data allows for improved queries and filtering based on the data format when searching logs for an event.
OpenSearch Log Ingestion consists of three components---[Data Prepper]({{site.url}}{{site.baseurl}}/clients/data-prepper/index/), [OpenSearch]({{site.url}}{{site.baseurl}}/quickstart/) and [OpenSearch Dashboards]({{site.url}}{{site.baseurl}}/dashboards/index/)---that fit into the OpenSearch ecosystem. The Data Prepper repository has several [sample applications](https://github.com/opensearch-project/data-prepper/tree/main/examples) to help you get started.
![Log data flow diagram from a distributed application to OpenSearch]({{site.url}}{{site.baseurl}}/images/la.png)
1. Log Ingestion relies on you adding log collection to your application's environment to gather and send log data.
(In the [example](#example) below, [FluentBit](https://docs.fluentbit.io/manual/) is used as a log collector that collects log data from a file and sends the log data to Data Prepper).
2. [Data Prepper]({{site.url}}{{site.baseurl}}/clients/data-prepper/index/) receives the log data, transforms the data into a structure format, and indexes it on an OpenSearch cluster.
3. The data can then be explored through OpenSearch search queries or the **Discover** page in OpenSearch Dashboards.
### Example
This example mimics the writing of log entries to a log file that are then processed by Data Prepper and stored in OpenSearch.
Download or clone the [Data Prepper repository](https://github.com/opensearch-project/data-prepper). Then navigate to `examples/log-ingestion/` and open `docker-compose.yml` in a text editor. This file contains a container for:
Close the file and run `docker-compose up --build` to start the containers.
After the containers start, your ingestion pipeline is set up and ready to ingest log data. The `fluent-bit` container is configured to read log data from `test.log`. Run the following command to generate log data to send to the log ingestion pipeline.
2021-12-02T15:35:44,499 [log-pipeline-processor-worker-1-thread-1] INFO com.amazon.dataprepper.pipeline.ProcessWorker - log-pipeline Worker: Processing 1 records from buffer
```
This should result in a single document being written to the OpenSearch cluster in the `apache-logs` index as defined in the `log_pipeline.yaml` file.
Run the following command to see one of the raw documents in the OpenSearch cluster:
```bash
curl -X GET -u 'admin:admin' -k 'https://localhost:9200/apache_logs/_search?pretty&size=1'
The same data can be viewed in OpenSearch Dashboards by visiting the **Discover** page and searching the `apache_logs` index. Remember, you must create the index in OpenSearch Dashboards if this is your first time searching for the index.