149 Commits

Author SHA1 Message Date
Tal Levy
f34ce9ddf4 add on_failure context to ingest metadata during executeOnFailure 2016-01-04 11:24:27 -08:00
javanna
8de31b3c64 make index type and id optional in simulate api
Default values are _index, _type and _id.

Closes #15711
2016-01-04 13:11:44 +01:00
Martijn van Groningen
f2fce0edca Removed ingest runner main class, gradle run --debug-jvm should be used instead. 2015-12-29 13:07:23 +01:00
Martijn van Groningen
9fa3f469e9 fix test after merging in master branch 2015-12-29 13:00:27 +01:00
Martijn van Groningen
1936d6118d Added index template for the '.ingest' index and added logic that ensure the index template is installed.
An index template for the '.ingest' index is required because:
* We don't want arbitrary fields in pipeline documents, because that can turn into upgrade problems if we add more properties to the pipeline dsl.
* We know what are the usages are of the '.ingest' index, so we can optimize for that and prevent that this index is used for different purposes.

Closes #15001
2015-12-24 15:18:07 +01:00
Martijn van Groningen
8b671d7d93 added note 2015-12-22 22:52:23 +01:00
Martijn van Groningen
dbbb296322 added a node.ingest setting that controls whether ingest is active or not. Defaults to false.
If `node.ingest` isn't active then ingest related API calls fail and if the `pipeline_id` parameter is set then index and bulk requests fail.
2015-12-22 22:38:49 +01:00
Tal Levy
44d64c8a45 rename pipeline_id param to pipeline 2015-12-22 12:30:04 -08:00
Tal Levy
0bf4c8fb82 Add on_failure field to processors and pipelines.
both processors and pipelines now have the ability to define
a separate list of processors to be executed if the original line
of execution throws an Exception.

processors without an on_failure parameter defined will throw an
exception and exit the pipeline immediately. processors with on_failure
defined will catch the exception and allow for further processors to
run. Exceptions within the on_failure block will be treated the same as
the top-level.
2015-12-22 10:30:42 -08:00
javanna
46f99a11a0 Add append processor
The append processor allows to append one or more values to an existing list; add a new list with the provided values if the field doesn't exist yet, or convert an existing scalar into a list and add the provided values to the newly created  list.

This required adapting of IngestDocument#appendFieldValue behaviour, also added support for templating to it.

Closes #14324
2015-12-22 16:11:43 +01:00
Martijn van Groningen
72fc34731e applied feedback 2015-12-22 12:20:16 +01:00
Martijn van Groningen
7e99ee4edf ExecutionService should be able to process multiple index requests at a time. 2015-12-22 12:02:36 +01:00
Martijn van Groningen
3fcb8ecd54 Upgraded geoip2 and its dependencies 2015-12-22 11:39:43 +01:00
javanna
188953922a adapt to upstream changes
Removed wildcard imports and fixed two broken license headers
2015-12-22 11:22:47 +01:00
Martijn van Groningen
66d8b5342f s/PipelineStoreBootstrapper/IngestBootstrapper 2015-12-20 20:18:46 +01:00
Martijn van Groningen
6dfcee6937 Added an internal reload pipeline api that makes sure pipeline changes are visible on all ingest nodes after modifcations have been made.
* Pipeline store can now only start when there is no .ingest index or all primary shards of .ingest have been started
* IngestPlugin adds`node.ingest` setting to `true`. This is used to figure out to what nodes to send the refresh request too. This setting isn't yet configurable. This will be done in a follow up issue.
* Removed the background pipeline updater and added added logic to deal with specific scenarious to reload all pipelines.
* Ingest services are no longer be managed by Guice. Only the bootstrapper gets managed by guice and that contructs
all the services/components ingest will need.
2015-12-18 18:24:27 +01:00
Martijn van Groningen
e8a8e22e09 Add template infrastructure, removed meta processor and added template support to set and remove processor.
Added ingest wide template infrastructure to IngestDocument
Added a TemplateService interface that the ingest framework uses
Added a TemplateService implementation that the ingest plugin provides that delegates to the ES' script service
Cut SetProcessor over to use the template infrastructure for the `field` and `value` settings.
Removed the MetaDataProcessor
Removed dependency on mustache library
Added qa  ingest mustache rest test so that the ingest and mustache integration can be tested.
2015-12-18 17:35:53 +01:00
Martijn van Groningen
a56902567e don't register rest actions on transport clients 2015-12-18 15:49:00 +01:00
Martijn van Groningen
2c5bb84851 fix copyDefaultGeoIp2DatabaseFiles task to work again 2015-12-18 15:45:56 +01:00
javanna
3e155f7b54 adapt to upstream changes: thirdPartyAudit.missingClasses set to true
geoip depends on asm and google http client which we don't need
2015-12-18 11:14:39 +01:00
javanna
f349669071 adapt to upstream changes: enableMockModules => getMockPlugins 2015-12-18 11:13:34 +01:00
javanna
8bae93eee1 adapt to upstream changes: StringText => Text 2015-12-18 11:13:18 +01:00
javanna
885b01fb49 adapt to upstream changes: RestModule -> NetworkModule 2015-12-18 10:36:29 +01:00
Martijn van Groningen
07951fc731 added comment why 'accessDeclaredMembers' permission is needed 2015-12-16 10:11:46 +01:00
Martijn van Groningen
e87709f593 fix ingest runner 2015-12-15 10:18:19 +01:00
Martijn van Groningen
d38cccb8a1 Fix issues after merging in master 2015-12-11 14:51:45 +01:00
javanna
a8382de09d add comment to clarifiy why metadata fields can be set all the time to IndexRequest in PipelineExecutionService 2015-12-09 18:52:43 +01:00
javanna
a95f81c015 avoid stripping out _source prefix when in _ingest context 2015-12-09 18:42:20 +01:00
javanna
57d6971252 Streamline support for get/set/remove of metadata fields and ingest metadata fields
Unify metadata map and source, add also support for _ingest prefix. Depending on the prefix, either _source, nothing or _ingest, we will figure out which map to use for values retrieval, but also modifications.
2015-12-09 18:36:44 +01:00
javanna
744d2908a8 avoid null values in simulate serialization prototypes, use empty maps instead 2015-12-09 18:36:44 +01:00
javanna
6b7446beb9 Remove sourceModified flag from IngestDocument
If one is using the ingest plugin and providing a pipeline id with the request, the chance that the source is going to be modified is 99%. We shouldn't worry about keeping track of whether something changed. That seemed useful at first so we can save the resources for setting back the source (map to bytes) when not needed. Also, we are trying to unify metadata fields and source in the same map and that is going to complicate how we keep track of changes that happen in the source only. Best solution is to remove the flag.
2015-12-09 18:36:43 +01:00
javanna
b0d7d604ff Add support for transient metadata to IngestDocument
IngestDocument now holds an additional map of transient metadata. The only field that gets added automatically is `timestamp`, which contains the timestamp of ingestion in ISO8601 format. In the future it will be possible to eventually add or modify these fields, which will not get indexed, but they will be available via templates to all of the processors.

Transient metadata will be visualized by the simulate api, although they will never get indexed. Moved WriteableIngestDocument to the simulate package as it's only used by simulate and it's now modelled for that specific usecase.

 Also taken the chance to remove one IngestDocument constructor used only for testing (accepting only a subset of es metadata fields). While doing that introduced some more randomizations to some existing processor tests.

Closes #15036
2015-12-09 18:36:01 +01:00
javanna
5bc1e46113 setFieldValue for list to replace when an index is specified
It used to do add instead, which is not consistent with the behaviour of set, which always replaces.
2015-12-09 18:28:07 +01:00
Martijn van Groningen
233de434a0 Merge pull request #15310 from martijnvg/ingest/stream_put_and_delete_responses
Streamline put & delete pipeline responses with index & delete responses
2015-12-09 11:30:57 +01:00
Martijn van Groningen
a2cda4e3f2 Streamline the put and delete pipelines responses with the index and delete response in core. 2015-12-08 14:01:28 +01:00
Tal Levy
45f48ac126 update all processors to only operate on one field at a time when possible 2015-12-07 08:30:00 -08:00
javanna
d7c3b51b9c [TEST] adapt to upstream changes 2015-12-04 16:35:53 +01:00
javanna
73986cc54f adapt to upstream changes 2015-12-04 14:17:07 +01:00
Tal Levy
ffa8998f36 Merge pull request #15181 from martijnvg/ingest_geoip_only_read_mmdb_files
[Ingest] The geoip processor should only try to read *.mmdb files from the geoip config directory
2015-12-03 12:17:09 -08:00
Tal Levy
56da7b32ed add ability to define custom grok patterns within processor config 2015-12-03 08:24:07 -08:00
Tal Levy
cf1c393d70 Merge pull request #15166 from talevy/remove_pattern_utils
move PatternUtils#loadBankFromStream into GrokProcessor.Factory
2015-12-03 08:08:00 -08:00
Martijn van Groningen
6acf8ec263 Removed pipeline tests with a simpler tests
The PipelineTests tried to test if the configured map/list in set processor wasn't modified while documents were ingested. Creating a pipeline programmatically created more noise than the test needed. The new tests in IngestDocumentTests have the same goal, but is much smaller and clearer by directly testing against IngestDocument.
2015-12-03 15:19:04 +01:00
Martijn van Groningen
9ab765b851 The geoip processor should only try to read *.mmdb files from the geoip config directory 2015-12-02 14:38:49 +01:00
Martijn van Groningen
270a3977bc Removed the lazy cache in DatabaseReaderService and eagerly build all available databases. 2015-12-02 11:16:02 +01:00
Tal Levy
767bd1d4d5 move PatternUtils#loadBankFromStream into GrokProcessor.Factory 2015-12-01 15:46:02 -08:00
javanna
6c0510b01d Make rename processor less error prone
Rename processor now checks whether the field to rename exists and throws exception if it doesn't. It also checks that the new field to rename to doesn't exist yet, and throws exception otherwise. Also we make sure that the rename operation is atomic, otherwise things may break between the remove and the set and we'd leave the document in an inconsistent state.

Note that the requirement for the new field name to not exist simplifies the usecase for e.g. { "rename" : { "list.1": "list.2"} } as such a rename wouldn't be accepted if list is actually a list given that either list.2 already exists or the index is out of bounds for the existing list. If one really wants to replace an existing field, that field needs to be removed first through remove processor and then rename can be used.
2015-12-01 19:58:24 +01:00
Martijn van Groningen
15b6708a5d and now make use of the lifecycle infrastructure 2015-12-01 18:20:25 +01:00
Tal Levy
8e4c288b5c Merge pull request #15132 from talevy/no_match_for_grok
[Ingest] No match for grok
2015-12-01 09:13:11 -08:00
Tal Levy
2c1effdd41 throw exception when grok processor does not match 2015-12-01 08:58:58 -08:00
Martijn van Groningen
9dd52ad7d3 Removed pollution from the Processor.Factory interface.
1) It no longer extends from Closeable.
2) Removed the config directory setter. Implementation that relied on it, now get the location to the config dir via their constructors.
2015-12-01 17:32:37 +01:00