This PR enables Ingest plugins to leverage processor-scoped REST
endpoints. First of which being the Grok endpoint that retrieves
Grok Patterns for users to retrieve all the built-in patterns.
Example usage: Kibana Grok Autocomplete!
The Log4j dependency is separated into two artifacts, the API and the
core implementation. This is to enable replacing Log4j on the backend
through the SLF4J bridge with another logging implementation. For this
reason, the dependencies are marked as optional. This causes confusion
amongst users as to use the bridge, the API should be non-optional since
it is needed for the bridge to function correctly. While they could pull
it into their application directly, it would be clearer if we simply
marked this depdendency as non-optional. Note that this does not mean
that users have to use Log4j for logging in their application, so we are
not marking core as required, it only clarifies what they need to be
able to plug in a different logging implementation.
Relates #25136
This commit refactors the query phase in order to be able
to automatically detect queries that can be early terminated.
If the index sort matches the query sort, the top docs collection is early terminated
on each segment and the computing of the total number of hits that match the query is delegated to a simple TotalHitCountCollector.
This change also adds a new parameter to the search request called `track_total_hits`.
It indicates if the total number of hits that match the query should be tracked.
If false, queries sorted by the index sort will not try to compute this information and
and will limit the collection to the first N documents per segment.
Aggregations are not impacted and will continue to see every document
even when the index sort matches the query sort and `track_total_hits` is false.
Relates #6720
During package install on systemd-based systems, some sysctl settings
should be set (e.g. vm.max_map_count).
In some environments, changing sysctl settings plainly does not work;
previously a global environment variable named
ES_SKIP_SET_KERNEL_PARAMETERS was introduced to skip calling sysctl, but
this causes trouble for:
- configuration management systems, which usually cannot apply an env
var when running a package manager
- package upgrades, which will not have the env var set any more, and
thus leaving the package management system in a bad state (possibly
half-way upgraded, can be very hard to recover)
This removes the env var again and instead of calling systemd-sysctl
manually, tells systemd to restart the wrapper unit - which itself can
be masked by system administrators or management tools if it is known
that sysctl does not work in a given environment.
The restart is not silent on systems in their default configuration, but
is ignored if the unit is masked.
Relates #24234
The index parameter in the update-aliases, put-alias, and delete-alias APIs no longer accepts alias names. Instead, it accepts only index names (or wildcards which will expand to matching indices).
Closes#23960
This removes the parsing of things like `GET /idx/_aliases,_mappings`, instead,
a user must choose between retriving all index metadata with `GET /idx`, or only
a specific form such as `GET /idx/_settings`.
Relates to (and is a prerequisite of) #24437
Some response classes in the java api expose both `getTook()` which returns a `TimeValue` and `getTookInMillis` which returns a `long` value. `getTook()` is enough as one can do `getTook().millis()` to obtain the same result as `getTookInMillis()`, which can be removed.
* Adds nodes usage API to monitor usages of actions
The nodes usage API has 2 main endpoints
/_nodes/usage and /_nodes/{nodeIds}/usage return the usage statistics
for all nodes and the specified node(s) respectively.
At the moment only one type of usage statistics is available, the REST
actions usage. This records the number of times each REST action class is
called and when the nodes usage api is called will return a map of rest
action class name to long representing the number of times each of the action
classes has been called.
Still to do:
* [x] Create usage service to store usage statistics
* [x] Record usage in REST layer
* [x] Add Transport Actions
* [x] Add REST Actions
* [x] Tests
* [x] Documentation
* Rafactors UsageService so counts are done by the handlers
* Fixing up docs tests
* Adds a name to all rest actions
* Addresses review comments
This commit adds a new bg_count field to the REST response of
SignificantTerms aggregations. Similarly to the bg_count that already
exists in significant terms buckets, this new bg_count field is set at
the aggregation level and is populated with the superset size value.
Currently global ordinals are documented under `fielddata`. It moves them to
their own file since they also work with doc values and fielddata is on the way
out.
Closes#23101
By default, the remove plugin CLI command preserves configuration
files. This is so that if a user is upgrading the plugin (which is done
by first removing the old version and then installing the new version)
they do not lose their configuration file. Yet, there are circumstances
where preserving the configuration file is not desired. This commit adds
a purge option to the remove plugin CLI command.
Relates #24981
This commit adds a `doc_count` field to the response body of Matrix
Stats aggregation. It exposes the number of documents involved in
the computation of statistics, a value that can already be retrieved using
the method MatrixStats.getDocCount() in the Java API.
This commit fixes an issue with the plugin docs incorrectly specifying
how to set a custom configuration directory. The correct way is to use
the environment variable CONF_DIR.
* SignificantText aggregation - like significant_terms but doesn’t require fielddata=true, recommended used with `sampler` agg to limit expense of tokenizing docs and takes optional `filter_duplicate_text`:true setting to avoid stats skew from repeated sections of text in search results.
Closes#23674
Adds a "magic" key to the yaml testing stash mostly for use with
documentation tests. When unstashing an object, `$_path` is the
path into the current position in the object you are unstashing.
This means that in docs tests you can use
`// TESTRESPONSEs/somevalue/$body.${_path}/` to mean "replace
`somevalue` with whatever is the response in the same position."
Compare how you must carefully mock out all the numbers in the profile
response without this change:
```
// TESTRESPONSE[s/"id": "\[2aE02wS1R8q_QFnYu6vDVQ\]\[twitter\]\[1\]"/"id": $body.profile.shards.0.id/]
// TESTRESPONSE[s/"rewrite_time": 51443/"rewrite_time": $body.profile.shards.0.searches.0.rewrite_time/]
// TESTRESPONSE[s/"score": 51306/"score": $body.profile.shards.0.searches.0.query.0.breakdown.score/]
// TESTRESPONSE[s/"time_in_nanos": "1873811"/"time_in_nanos": $body.profile.shards.0.searches.0.query.0.time_in_nanos/]
// TESTRESPONSE[s/"build_scorer": 2935582/"build_scorer": $body.profile.shards.0.searches.0.query.0.breakdown.build_scorer/]
// TESTRESPONSE[s/"create_weight": 919297/"create_weight": $body.profile.shards.0.searches.0.query.0.breakdown.create_weight/]
// TESTRESPONSE[s/"next_doc": 53876/"next_doc": $body.profile.shards.0.searches.0.query.0.breakdown.next_doc/]
// TESTRESPONSE[s/"time_in_nanos": "391943"/"time_in_nanos": $body.profile.shards.0.searches.0.query.0.children.0.time_in_nanos/]
// TESTRESPONSE[s/"score": 28776/"score": $body.profile.shards.0.searches.0.query.0.children.0.breakdown.score/]
// TESTRESPONSE[s/"build_scorer": 784451/"build_scorer": $body.profile.shards.0.searches.0.query.0.children.0.breakdown.build_scorer/]
// TESTRESPONSE[s/"create_weight": 1669564/"create_weight": $body.profile.shards.0.searches.0.query.0.children.0.breakdown.create_weight/]
// TESTRESPONSE[s/"next_doc": 10111/"next_doc": $body.profile.shards.0.searches.0.query.0.children.0.breakdown.next_doc/]
// TESTRESPONSE[s/"time_in_nanos": "210682"/"time_in_nanos": $body.profile.shards.0.searches.0.query.0.children.1.time_in_nanos/]
// TESTRESPONSE[s/"score": 4552/"score": $body.profile.shards.0.searches.0.query.0.children.1.breakdown.score/]
// TESTRESPONSE[s/"build_scorer": 42602/"build_scorer": $body.profile.shards.0.searches.0.query.0.children.1.breakdown.build_scorer/]
// TESTRESPONSE[s/"create_weight": 89323/"create_weight": $body.profile.shards.0.searches.0.query.0.children.1.breakdown.create_weight/]
// TESTRESPONSE[s/"next_doc": 2852/"next_doc": $body.profile.shards.0.searches.0.query.0.children.1.breakdown.next_doc/]
// TESTRESPONSE[s/"time_in_nanos": "304311"/"time_in_nanos": $body.profile.shards.0.searches.0.collector.0.time_in_nanos/]
// TESTRESPONSE[s/"time_in_nanos": "32273"/"time_in_nanos": $body.profile.shards.0.searches.0.collector.0.children.0.time_in_nanos/]
```
To how you can cavalierly mock all the numbers at once with this change:
```
// TESTRESPONSE[s/(?<=[" ])\d+(\.\d+)?/$body.$_path/]
```
Currently a `delete document` request against a non-existing index actually **creates** this index.
With this change the `delete document` no longer creates the previously non-existing index and throws an `index_not_found` exception instead.
However as discussed in https://github.com/elastic/elasticsearch/pull/15451#issuecomment-165772026, if an external version is explicitly used, the current behavior is preserved and the index is still created and the document is marked for deletion.
Fixes#15425
Native scripts have been replaced in documentation by implementing
a ScriptEngine and they were deprecated in 5.5.0. This commit
removes the native script infrastructure for 6.0.
closes#19966
This commit adds gcs credential settings to the elasticsearch keystore.
The setting name follows the same pattern as the s3 client settings,
beginning with `gcs.client.`, followed by the client name, and then the
setting name, in this case, `credentials_file`. Using the legacy service
file setting is also deprecated.