Commit Graph

241 Commits

Author SHA1 Message Date
Martijn van Groningen 90850f4ea0
Backport: Introduce on_failure_pipeline ingest metadata inside on_failure block (#49596)
Backport of #49076

In case an exception occurs inside a pipeline processor,
the pipeline stack is kept around as header in the exception.
Then in the on_failure processor the id of the pipeline the
exception occurred is made accessible via the `on_failure_pipeline`
ingest metadata.

Closes #44920
2019-11-27 07:52:08 +01:00
Lisa Cawley 26beb486c7 [DOCS] Fixes security links (#49563) 2019-11-25 13:02:26 -08:00
James Rodewig 62a3154d0e
[DOCS] [7.x] Add high-level docs for enrich processor and policies (#49194) (#49331) 2019-11-19 16:38:13 -05:00
James Rodewig 0b062bbc82 [DOCS] Correct required file ext for user agent ingest processor (#48688)
For the user agent ingest processor, custom regex files must end
with the `.yml` file extension.

This corrects the docs which said the `.yaml` extension was required.
2019-10-30 11:11:29 -04:00
Dan Hermann dbc05cd808
Add option to split processor for preserving trailing empty fields (#48685) 2019-10-30 08:25:03 -05:00
Shaunak Kashyap d27a307379 [DOCS] Remove extraneous comma in Enrich Stats API's JSON response (#48539) 2019-10-25 12:35:50 -04:00
James Rodewig 19afe3f84c [DOCS] Remove duplicate links for ingest processor overview (#48394) 2019-10-23 10:55:49 -05:00
Martijn van Groningen c09b62d5bf
Backport: also validate source index at put enrich policy time (#48311)
Backport of: #48254

This changes tests to create a valid
source index prior to creating the enrich policy.
2019-10-22 07:38:16 +02:00
Alexander Reelsen 66581d8158
update ingest-user-agent regexes.yml (#47807)
This new regexes are from:
154eba17f5/regexes.yaml
2019-10-18 16:26:48 +02:00
James Rodewig 3a7c2a4d17 [DOCS] Add `wait_for_completion` parm to execute enrich policy API docs (#48077) 2019-10-15 13:47:30 -04:00
Martijn van Groningen 7fc9198d46
Change how `max_matches` affects `target_field` option. (#47982)
Prior to this change the `target_field` would always be a json array
field in the document being ingested. This to take into account that
multiple enrich documents could be inserted into the `target_field`.

However the default `max_matches` is `1`. Meaning that by default
only a single enrich document would be added to `target_field` json
array field.

This commit changes this; if `max_matches` is set to `1` then the single
document would be added as a json object to the `target_field` and
if it is configured to a higher value then the enrich documents will be
added as a json array (even if a single enrich document happens to be
enriched).
2019-10-14 21:09:48 +02:00
James Rodewig 65f8294378 [DOCS] Add docs for `geo_match` enrich policy type (#47745) 2019-10-09 09:02:52 -04:00
Martijn van Groningen da1e2ea461
Merge remote-tracking branch 'es/7.x' into enrich-7.x 2019-10-09 09:06:13 +02:00
Jake Landis a6b0ae7f69
Fix bug in ingest node documentation (#45589) (#47750)
The "Conditionals with the Pipeline Processor" incorrectly documents
how to create a pipeline of pipelines with a failure condition. The 
example as-is will always execute the fail processor. The change here
updates the documentation to correct guard the fail processor with an
if condition.
2019-10-08 17:23:38 -05:00
Martijn van Groningen f2f2304c75
Merge remote-tracking branch 'es/7.x' into enrich-7.x 2019-10-07 10:07:56 +02:00
James Rodewig 0179f93544
[DOCS] Reformat simulate pipeline API (#47301) (#47398) 2019-10-01 14:49:14 -04:00
James Rodewig aeb4edce3a
[DOCS] Reformat put pipeline API (#47171) (#47395) 2019-10-01 14:48:18 -04:00
James Rodewig 024d1f2ab9
[DOCS] Reformat delete pipeline API (#47172) (#47294) 2019-09-30 11:38:46 -04:00
Martijn van Groningen fe937ea4b8
Add config namespace in get policy api response (#47162)
Currently the policy config is placed directly in the json object
of the toplevel `policies` array field. For example:

```
{
    "policies": [
        {
            "match": {
                "name" : "my-policy",
                "indices" : ["users"],
                "match_field" : "email",
                "enrich_fields" : [
                    "first_name",
                    "last_name",
                    "city",
                    "zip",
                    "state"
                ]
            }
        }
    ]
}
```

This change adds a `config` field in each policy json object:

```
{
    "policies": [
        {
            "config": {
                "match": {
                    "name" : "my-policy",
                    "indices" : ["users"],
                    "match_field" : "email",
                    "enrich_fields" : [
                        "first_name",
                        "last_name",
                        "city",
                        "zip",
                        "state"
                    ]
                }
            }
        }
    ]
}
```

This allows us in the future to add other information about policies
in the get policy api response.

The UI will consume this API to build an overview of all policies.
The UI may in the future include additional information about a policy
and the plan is to include that in the get policy api, so that this
information can be gathered in a single api call.

An example of the information that is likely to be added is:
* Last policy execution time
* The status of a policy (executing, executed, unexecuted)
* Information about the last failure if exists
2019-09-30 14:37:23 +02:00
Martijn van Groningen 36215bd33e
fixed docs issue 2019-09-30 08:04:18 +02:00
Martijn van Groningen 7ffe2e7e63
Merge remote-tracking branch 'es/7.x' into enrich-7.x 2019-09-27 14:42:11 +02:00
James Rodewig 3b626c2d56
[DOCS] Reformat get pipeline API (#47131) (#47163) 2019-09-26 08:51:12 -04:00
James Rodewig 618fb31be8 [DOCS] Minor editorial changes to enrich docs 2019-09-23 13:25:34 -04:00
Martijn van Groningen 0cfddca61d
Merge remote-tracking branch 'es/7.x' into enrich-7.x 2019-09-23 09:46:05 +02:00
Michael Basnight f1c7ed647b Allow comma separated ids in get enrich policy API (#46351)
This commit changes the GET REST api so it will accept an optional comma
separated list of enrich policy ids. This change also modifies the
behavior of the GET API in that it will not error if it is passed a bad
enrich id anymore, but will instead just return an empty list.
2019-09-20 10:06:58 -05:00
James Rodewig b6da5fa2f7 [DOCS] Correct `<enrich-policy>` parm description for comma-sep list (#46682) 2019-09-18 08:30:50 -04:00
Alexander Reelsen 011496ed5f Expose cache setting in UserAgentPlugin (#46533)
The setting was not registered. Also documentation has been added.
2019-09-16 11:30:38 +02:00
James Rodewig 411d4e9a93 [DOCS] Change // CONSOLE comments to [source,console] (#46669) 2019-09-12 10:27:35 -04:00
James Rodewig 35bf92cdac [DOCS] Reformat enrich stats API (#46600) 2019-09-11 13:52:50 -04:00
Martijn van Groningen a4b0f66919
Add enrich stats api (#46462)
The enrich api returns enrich coordinator stats and
information about currently executing enrich policies.

The coordinator stats include per ingest node:
* The current number of search requests in the queue.
* The total number of outstanding remote requests that
  have been executed since node startup. Each remote
  request is likely to include multiple search requests.
  This depends on how much search requests are in the
  queue at the time when the remote request is performed.
* The number of current outstanding remote requests.
* The total number of search requests that `enrich`
  processors have executed since node startup.

The current execution policies stats include:
* The name of policy that is executing
* A full blow task info object that is executing the policy.

Relates to #32789
2019-09-11 13:40:24 +02:00
James Rodewig a27d075db4
[DOCS] Update "Enrich your data" tutorials (#46417)
* Move enrich docs to separate file

* Rewrite enrich processor tutorial
2019-09-11 13:08:48 +02:00
James Rodewig d74d995382
[DOCS] Separate Enrich API Docs (#46286)
* Add enrich policy common parameter

* Add enrich APIs to REST APIs index

* Add put enrich policy API docs

* Add get enrich policy API docs

* Add delete enrich policy API docs

* Add execute enrich policy API docs
2019-09-11 13:08:28 +02:00
Martijn van Groningen c057fce978
Merge remote-tracking branch 'es/7.x' into enrich-7.x 2019-09-09 08:40:54 +02:00
James Rodewig f04573f8e8
[DOCS] [5 of 5] Change // TESTRESPONSE comments to [source,console-results] (#46449) (#46459) 2019-09-06 16:09:09 -04:00
James Rodewig c46c57d439
[DOCS] Change // CONSOLE comments to [source,console] (#46441) (#46451) 2019-09-06 11:31:13 -04:00
James Rodewig bb7bff5e30
[DOCS] Replace "// TESTRESPONSE" magic comments with "[source,console-result] (#46295) (#46418) 2019-09-06 09:22:08 -04:00
Martijn van Groningen ded98e50b7
Change exact match processor to match processor. (#46041)
Besides a rename, this changes allows to processor to attach multiple
enrich docs to the document being ingested.

Also in order to control the maximum number of enrich docs to be
included in the document being ingested, the `max_matches` setting
is added to the enrich processor.

Relates #32789
2019-09-04 18:05:12 +02:00
Martijn van Groningen 555b630160
Merge remote-tracking branch 'es/7.x' into enrich-7.x 2019-09-02 09:16:55 +02:00
Tal Levy a356bcff41
Add Circle Processor (#43851) (#46097)
add circle-processor that translates circles to polygons
2019-08-28 14:44:08 -07:00
Martijn van Groningen 1157224a6b
Merge remote-tracking branch 'es/7.x' into enrich-7.x 2019-08-28 10:14:07 +02:00
James Rodewig f3825767f4 [DOCS] Relocate Ingest API docs to REST API section (#45812) 2019-08-23 11:55:01 -04:00
Martijn van Groningen cb42e19a32
Change how type is stored in an enrich policy. (#45789)
A policy type controls how the enrich index is created and
the query executed against the match field. Currently there
is a single policy type (`exact_match`). In the near future
more policy types will be added and different policy may have
different configuration options.

For this reason type should be a json object instead of a string field:

```
{
   "exact_match": {
      ...
   }
}
```

instead of:

```
{
  "type": "exact_match",
  ...
}
```

This will make streaming parsing of enrich policies easier as in the
new format, the parsing code can know ahead what configuration fields
to expect. In the latter format that is not possible if the type field
appears not as the first field.

Relates to #32789
2019-08-23 13:43:38 +02:00
Martijn van Groningen 33972423e9
Enrich processor configuration changes (#45466)
Enrich processor configuration changes:
* Renamed `enrich_key` option to `field` option.
* Replaced `set_from` and `targets` options with `target_field`.

The `target_field` option behaves different to how `set_from` and
`targets` worked. The `target_field` is the field that will contain
the looked up document.

Relates to #32789
2019-08-22 09:49:22 +02:00
Michael Basnight e3373d349b Consolidate enrich list all and get by name APIs (#45705)
The get and list APIs are a single API in this commit. Whether
requesting one named policy or all policies, a list of policies is
returened. The list API code has all been removed and the GET api is
what remains, which contains much of the list response code.
2019-08-20 10:29:59 -05:00
Martijn van Groningen 5ea0985711
Merge remote-tracking branch 'es/7.x' into enrich-7.x 2019-08-16 09:47:11 +02:00
Michael Basnight 52a094b177 Fail delete policy if pipeline exists (#44438)
If a pipeline that refrences the policy exists, we should not allow the
policy to be deleted. The user will need to remove the processor from
the pipeline before deleting the policy. This commit adds a check to
ensure that the policy cannot be deleted if it is referenced by any
pipeline in the system.
2019-08-14 13:51:10 -05:00
Martijn van Groningen 43b8ab607d
Improve naming of enrich policy fields. (#45494)
Renamed `enrich_key` to `match_field` and
renamed `enrich_values` to `enrich_fields`.

Relates #32789
2019-08-14 11:45:22 +02:00
István Zoltán Szabó 356a632b95 [DOCS] Reformats cluster node info API (#45446)
Co-Authored-By: James Rodewig <james.rodewig@elastic.co>
2019-08-13 13:33:15 +02:00
István Zoltán Szabó 4ee7ac25ae [DOCS] Reformats cluster node stats API (#45441)
Co-Authored-By: James Rodewig <james.rodewig@elastic.co>
2019-08-13 12:48:07 +02:00
Martijn van Groningen 04626de6ae
Add initial version of enrich processor docs. (#45084)
Relates to #32789
2019-08-12 20:36:54 +02:00
Alexander Reelsen 210593d8e5 Add back lowercase processor in docs (#45090)
This got lost in a refactoring in 9137d92ca6
2019-08-06 09:23:13 -04:00
Jason Tedor bf74d38782
Fix GeoIP custom database directory in docs (#43383)
These docs were misleading for package installations of
Elasticsearch. Instead, we should refer to $ES_CONFIG/ingest-geoip as
the path to place the custom database files. For non-package
installations, this is the same as $ES_HOME/config, but for package
installations this is not the case as the config directory for package
installations is /etc/elasticsearch, and is not relative to
$ES_HOME. This commit corrects the docs.
2019-06-19 13:26:07 -04:00
Marios Trivyzas 3b42dde64f
[Docs] Add note for date patterns used for index search. (#42810)
Add an explanatory NOTE section to draw attention to the difference
between small and capital letters used for the index date patterns.
e.g.: HH vs hh, MM vs mm.

Closes: #22322
(cherry picked from commit c8125417dc33215651f9bb76c9b1ffaf25f41caf)
2019-06-03 22:27:19 +02:00
Jack Conradson 813db163d8 Reorganize Painless doc structure (#42303) 2019-05-21 10:50:21 -07:00
Alexander Reelsen 8e33a5292a Add HTML strip processor (#41888)
This processor uses the lucene HTMLStripCharFilter class to remove HTML
entities from a field. This adds to the char filter, so that there is
possibility to store the stripped version as well.

Note, that the characeter filter replaces tags with a newline, so that
the produced HTML will look slightly different than the incoming HTML
with regards to newlines.
2019-05-09 13:01:07 +02:00
Flavio Pompermaier 83fef23fd1
Fix wrong property name (#40636) 2019-05-09 08:53:05 +02:00
James Rodewig b65ceb36bc [DOCS] Escape quotes to avoid smart quotes in Asciidoctor (#41603) 2019-04-30 16:31:20 -04:00
James Rodewig 53702efddd [DOCS] Add anchors for Asciidoctor migration (#41648) 2019-04-30 10:20:17 -04:00
Jason Tedor ac58b9bded
Fix date index name processor default date_formats (#40915)
This commit is a correction of a doc bug in the docs for the ingest
date-index-name processor. The correct pattern is
yyyy-MM-dd'T'HH:mm:ss.SSSXX. This is due to the transition from Joda
time to Java time where Z does not mean the same thing between the two.
2019-04-05 17:45:57 -04:00
Tal Levy 9ab2410436
Adding an example in the Set processor documentation to address #30604 (#39941) (#39969)
* Added an example of using set to copy values from one field to another.

* Modified the document type to match the test.
2019-03-12 11:14:41 -07:00
Jake Landis 797d6b8a66
Execute ingest node pipeline before creating the index (#39607) (#39796)
Prior to this commit (and after 6.5.0), if an ingest node changes
the _index in a pipeline, the original target index would be created.
For daily indexes this could create an extra, empty index per day.

This commit changes the TransportBulkAction to execute the ingest node
pipeline before attempting to create the index. This ensures that the 
only index created is the original or one set by the ingest node pipeline. 
This was the execution order prior to 6.5.0 (#32786). 

The execution order was changed in 6.5 to better support default pipelines. 
Specifically the execution order was changed to be able to read the settings
from the index meta data. This commit also includes a change in logic such 
that if the target index does not exist when ingest node pipeline runs, it 
will now pull the default pipeline (if one exists) from the settings of the 
best matched of the index template. 

Relates #32786
Relates #32758 
Closes #36545
2019-03-07 13:31:41 -06:00
Alexander Reelsen 8e5e48319e
Add documentation about breaking java time changes (#38886)
In addition remove joda time mentions across the docs, make 
sure links are updated to java time javadocs.

Forward port of #38720
2019-02-14 10:18:12 +01:00
Jake Landis 46bb663a09
Make 7.x like 6.7 user agent ecs, but default to true (#38828)
Forward port of https://github.com/elastic/elasticsearch/pull/38757

This change reverts the initial 7.0 commits and replaces them
with the 6.7 variant that still allows for the ecs flag. 
This commit differs from the 6.7 variants in that ecs flag will 
now default to true. 

6.7: `ecs` : default `false`
7.x: `ecs` : default `true`
8.0: no option, but behaves as `true`

* Revert "Ingest node - user agent, move device to an object (#38115)"
This reverts commit 5b008a34aa.

* Revert "Add ECS schema for user-agent ingest processor (#37727) (#37984)"
This reverts commit cac6b8e06f.

* cherry-pick 5dfe1935345da3799931fd4a3ebe0b6aa9c17f57 
Add ECS schema for user-agent ingest processor (#37727)

* cherry-pick ec8ddc890a34853ee8db6af66f608b0ad0cd1099 
Ingest node - user agent, move device to an object (#38115) (#38121)
  
* cherry-pick f63cbdb9b426ba24ee4d987ca767ca05a22f2fbb (with manual merge fixes)
Dep. check for ECS changes to User Agent processor (#38362)

* make true the default for the ecs option, and update 7.0 references and tests
2019-02-13 10:28:01 -06:00
Jake Landis 46bd04959e
fix dissect doc "ip" --> "clientip" (#38544)
Forward port of #38512.
2019-02-08 16:51:58 -06:00
Lee Hinman 70956f6f34
bad formatted JSON object (#38515) (#38526)
It just need to replace the wrong " , " to " : "

Backport of #38515
2019-02-06 13:01:45 -07:00
Gordon Brown 292e0f6fb7
Deprecate `_type` in simulate pipeline requests (#37949)
As mapping types are being removed throughout Elasticsearch, the use of
`_type` in pipeline simulation requests is deprecated. Additionally, the
default `_type` used if one is not supplied has been changed to `_doc` for
consistency with the rest of Elasticsearch.
2019-02-04 16:11:44 -07:00
Jake Landis 5b008a34aa
Ingest node - user agent, move device to an object (#38115)
When the ingest node user agent parses the device field, it
will result in a string value. To match the ecs schema
this commit moves the value of the parsed device to an
object with an inner field named 'name'. There are not
any passivity concerns since this modifies an unreleased change.

closes #38094
relates #37329
2019-01-31 13:54:34 -06:00
Lee Hinman cac6b8e06f
Add ECS schema for user-agent ingest processor (#37727) (#37984)
* Add ECS schema for user-agent ingest processor (#37727)

This switches the format of the user agent processor to use the schema from [ECS](https://github.com/elastic/ecs).
So rather than something like this:

```
{
  "patch" : "3538",
  "major" : "70",
  "minor" : "0",
  "os" : "Mac OS X 10.14.1",
  "os_minor" : "14",
  "os_major" : "10",
  "name" : "Chrome",
  "os_name" : "Mac OS X",
  "device" : "Other"
}
```

The structure is now like this:

```
{
  "name" : "Chrome",
  "original" : "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_14_1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/70.0.3538.102 Safari/537.36",
  "os" : {
    "name" : "Mac OS X",
    "version" : "10.14.1",
    "full" : "Mac OS X 10.14.1"
  },
  "device" : "Other",
  "version" : "70.0.3538.102"
}
```

This is now the default for 7.0. The deprecated `ecs` setting in 6.x is not
supported.

Resolves #37329

* Remove `ecs` setting from docs
2019-01-30 11:24:18 -07:00
Christoph Büscher 34f2d2ec91
Remove remaining occurances of "include_type_name=true" in docs (#37646) 2019-01-22 15:13:52 +01:00
Julie Tibshirani 36a3b84fc9
Update the default for include_type_name to false. (#37285)
* Default include_type_name to false for get and put mappings.

* Default include_type_name to false for get field mappings.

* Add a constant for the default include_type_name value.

* Default include_type_name to false for get and put index templates.

* Default include_type_name to false for create index.

* Update create index calls in REST documentation to use include_type_name=true.

* Some minor clean-ups around the get index API.

* In REST tests, use include_type_name=true by default for index creation.

* Make sure to use 'expression == false'.

* Clarify the different IndexTemplateMetaData toXContent methods.

* Fix FullClusterRestartIT#testSnapshotRestore.

* Fix the ml_anomalies_default_mappings test.

* Fix GetFieldMappingsResponseTests and GetIndexTemplateResponseTests.

We make sure to specify include_type_name=true during xContent parsing,
so we continue to test the legacy typed responses. XContent generation
for the typeless responses is currently only covered by REST tests,
but we will be adding unit test coverage for these as we implement
each typeless API in the Java HLRC.

This commit also refactors GetMappingsResponse to follow the same appraoch
as the other mappings-related responses, where we read include_type_name
out of the xContent params, instead of creating a second toXContent method.
This gives better consistency in the response parsing code.

* Fix more REST tests.

* Improve some wording in the create index documentation.

* Add a note about types removal in the create index docs.

* Fix SmokeTestMonitoringWithSecurityIT#testHTTPExporterWithSSL.

* Make sure to mention include_type_name in the REST docs for affected APIs.

* Make sure to use 'expression == false' in FullClusterRestartIT.

* Mention include_type_name in the REST templates docs.
2019-01-14 13:08:01 -08:00
Josh Soref edb48321ba [DOCS] Various spelling corrections (#37046) 2019-01-07 14:44:12 +01:00
Adam Thomson ac4aecc92d [Docs] Update ingest-node.asciidoc (#37116) 2019-01-04 19:33:06 +01:00
Jason Tedor 9137d92ca6
Refactor ingest node API docs (#36962)
This commit is a simple refactoring of the ingest node API docs,
breaking each API into a single file for ease of maintaining.
2018-12-23 08:59:18 -05:00
Jason Tedor 7562768bd6
Fix ingest cross-doc links
This commit fixes some cross-doc links from the old ingest plugins page
to the new ingest processor pages that arose after converting
ingest-geoip and ingest-user-agent to modules.
2018-12-22 20:51:18 -05:00
Jason Tedor e14f27c033
Fix titles of GeoIP and User Agent processor docs
This commit makes the titles of the new GeoIP and User Agent processor
docs look more like the titles of the docs for other processors.
2018-12-22 20:31:07 -05:00
Jason Tedor 1f574bd17a
Package ingest-user-agent as a module (#36956)
This commit moves ingest-user-agent from being a plugin to being a
module that is packaged with Elasticsearch distributions.
2018-12-22 20:20:53 -05:00
Jason Tedor 434021c3ec
Add placeholder ingest-geoip plugin page (#36958)
This commit adds a placeholder ingest-geoip plugin page as there are
other components in the Elastic Stack that still refer to these
pages. These docs would be broken without this placeholder page forcing
teams responsible for those docs to scramble to fix the build over the
weekend before a holiday period. Instead, we add a placeholder page so
the docs build continues to function, and those teams can fix their docs
without the constraint of a broken build. We also cleanup a few minor
docs issues that were missed during the initial changes to convert
ingest-geoip to a module.
2018-12-22 09:49:56 -05:00
Jason Tedor e1717df0ac
Package ingest-geoip as a module (#36898)
This commit moves ingest-geoip from being a plugin to being a module
that is packaged with Elasticsearch distributions.
2018-12-22 07:21:49 -05:00
Jason Tedor 35911d8dd7
Split the ingest processor docs into multiple files (#36887)
This commit breaks the single ingest docs file into multiple files,
factoring out the processor docs into a documentation file per
processor. This will help make this content easier to maintain.
2018-12-20 08:04:54 -05:00
Boaz Leskes e356b8cb95
Add doc's sequence number + primary term to GetResult and use it for updates (#36680)
This commit adds the last sequence number and primary term of the last operation that have
modified a document to `GetResult` and uses it to power the Update API.

Relates #36148 
Relates #10708
2018-12-17 15:22:13 +01:00
Jake Landis 4b99a663c1
ingest: fix broken doc link 2018-11-26 10:34:42 -06:00
Jake Landis 7f7b31723e
ingest: extended `if` documentation (#35044)
part of #33188
2018-11-26 09:35:45 -06:00
Chris Cho e572a21c4b [Docs] Improve Convert Processor description (#35280)
Sometimes users are confused about whether they can use the Convert Processor
for changing an existing fields type to other types even if the existing one is already
ingested. This confusion is from the first line of description. Changing this and also
adding a some detail to the code snippet.
2018-11-07 17:01:35 +01:00
Jake Landis c2766b65cf
ingest: raise visibility of ingest plugin documentation (#35048)
* move the set security user processor to the main documentation
* link to plugin processors

part of #33188
2018-11-05 11:44:10 -06:00
Jake Landis 77fab62ebe
ingest: add common options to each processor's documentation (#35091)
* adds `if`, `on_failure`, `tag`, and `ignore_failure` to table for each processor

part of #33188

* added ingore_failure

* fix whitespace noise
2018-11-01 11:08:04 -05:00
Armin Braun f79bdec58a INGEST: Document Pipeline Processor (#33418)
* Added documentation for Pipeline Processor
* Relates #33188
2018-10-23 15:36:57 -05:00
Jake Landis a8e1ee34ca
ingest: document fields that support templating (#34536)
This change also updates many of the examples to use ecs as the example.
Some additional minor improvements are also included.

Part of #33188
2018-10-23 13:28:44 -05:00
Jake Landis c447fc258a
ingest: documentation for the drop processor (#34570) 2018-10-23 12:30:23 -05:00
Armin Braun f0f732908e
INGEST: Document Processor Conditional (#33388)
* INGEST: Document Processor Conditional

Relates #33188
2018-10-23 17:37:30 +02:00
Jake Landis 79b507dbf5
ingest: Introduce the dissect processor (#32884)
* ingest: Introduce the dissect processor

The ingest node dissect processor is an alternative to Grok
to split a string based on a pattern. Dissect differs from
Grok such that regular expressions are not used to split the
string.

Dissect can be used to parse a source text field with a
simpler pattern, and is often faster the Grok for basic string
parsing. This processor uses the dissect library which
does most of the work.
2018-08-28 07:11:20 -07:00
Jake Landis 3d4c84f7ca
ingest: doc: move Dot Expander Processor doc to correct position (#31743)
No changes to the content.
2018-08-03 07:21:05 -07:00
Armin Braun 7aa8a0a927
INGEST: Extend KV Processor (#31789) (#32232)
* INGEST: Extend KV Processor (#31789)

Added more capabilities supported by LS to the KV processor:
* Stripping of brackets and quotes from values (`include_brackets` in corresponding LS filter)
* Adding key prefixes
* Trimming specified chars from keys and values

Refactored the way the filter is configured to avoid conditionals during execution.
Refactored Tests a little to not have to add more redundant getters for new parameters.

Relates #31786
* Add documentation
2018-07-20 22:32:50 +02:00
Armin Braun e46ed73379
Ingest: Add ignore_missing option to RemoveProc (#31693)
Added `ignore_missing` setting to the RemoveProcessor to fix #23086
2018-07-09 10:24:34 +02:00
Jake Landis c0056cddd8
ingest: Introduction of a bytes processor (#31733)
ingest: Introduction of a bytes processor

This processor allows for human readable byte values (e.g. 1kb) to be converted to value in bytes (e.g. 1024). Internally this processor re-uses "ByteSizeValue.parseBytesSizeValue" which supports conversions up to Long.MAX_VALUE and the following units: "b", "kb", "mb", "gb", "tb", pb".

This change also introduces a generic return type for the AbstractStringProcessor to allow for code reuse while supporting a String -> T conversion. (String -> Long in this case).
2018-07-03 10:40:56 -05:00
Armin Braun 13e1cf6191
ingest: Add ignore_missing property to foreach filter (#22147) (#31578) 2018-06-26 20:04:41 +02:00
Martijn van Groningen 6030d4be1e
[INGEST] Interrupt the current thread if evaluation grok expressions take too long (#31024)
This adds a thread interrupter that allows us to encapsulate calls to org.joni.Matcher#search()
This method can hang forever if the regex expression is too complex.

The thread interrupter in the background checks every 3 seconds whether there are threads
execution the org.joni.Matcher#search() method for longer than 5 seconds and
if so interrupts these threads.

Joni has checks that that for every 30k iterations it checks if the current thread is interrupted and
if so returns org.joni.Matcher#INTERRUPTED

Closes #28731
2018-06-12 07:49:03 +02:00
Tanguy Leroux 42608881b0
[Docs] Remove mention pattern files in Grok processor (#31170)
Pattern files have been removed in 
16fa3e546e
2018-06-11 09:32:12 +02:00
rzmf 080cefec73 Fix missing comma in ingest-node.asciidoc (#29343) 2018-04-03 11:33:44 +01:00
Nik Everett 762226bee9
Docs: Support triple quotes (#28915)
Adds support for triple quoted strings to the documentation test
generator. Kibana's CONSOLE tool has supported them for a year but we
were unable to use them in Elasticsearch's docs because the process that
converts example snippets into tests couldn't handle this. This change
adds code to convert them into standard JSON so we can pass them to
Elasticsearch.
2018-03-16 12:46:39 -04:00
Jiri Tyr c713d62f88 [Docs] Fix link to Grok patterns (#29088) 2018-03-16 14:13:17 +01:00