In the HL REST client we replace the License object with a string, because of
complexity of this class. It is also not really needed on the client side since
end-users are not interacting with the license besides passing it as a string
to the server.
Relates #29827
Adds a new single-value metrics aggregation that computes the weighted
average of numeric values that are extracted from the aggregated
documents. These values can be extracted from specific numeric
fields in the documents.
When calculating a regular average, each datapoint has an equal "weight"; it
contributes equally to the final value. In contrast, weighted averages
scale each datapoint differently. The amount that each datapoint contributes
to the final value is extracted from the document, or provided by a script.
As a formula, a weighted average is the `∑(value * weight) / ∑(weight)`
A regular average can be thought of as a weighted average where every value has
an implicit weight of `1`.
Closes#15731
The notion of "quality" is an overloaded term in the search ranking evaluation
context. Its usually used to decribe certain levels of "good" vs. "bad" of a
seach result with respect to the users information need. We currently report the
result of the ranking evaluation as `quality_level` which is a bit missleading.
This changes the response parameter name to `metric_score` which fits better.
* INGEST: Extend KV Processor (#31789)
Added more capabilities supported by LS to the KV processor:
* Stripping of brackets and quotes from values (`include_brackets` in corresponding LS filter)
* Adding key prefixes
* Trimming specified chars from keys and values
Refactored the way the filter is configured to avoid conditionals during execution.
Refactored Tests a little to not have to add more redundant getters for new parameters.
Relates #31786
* Add documentation
With this commit we remove the documentation for the setting
`xpack.monitoring.collection.indices.stats.timeout` which has already
been removed in code.
Closes#32133
Relates #32229
Currently the ranking evaluation response contains a 'unknown_docs' section
for each search use case in the evaluation set. It contains document ids for
results in the search hits that currently don't have a quality rating.
This change renames it to `unrated_docs`, which better reflects its purpose.
Resolving wildcards in aliases expression is challenging as we may end
up with no aliases to replace the original expression with, but if we
replace with an empty array that means _all which is quite the opposite.
Now that we support and serialize the original requested aliases,
whenever aliases are replaced we will be able to know what was
initially requested. `MetaData#findAliases` can then be updated to not
return anything in case it gets empty aliases, but the original aliases
were not empty. That means that empty aliases are interpreted as _all
only if they were originally requested that way.
Relates to #31516
Throw an exception for doc['field'].value
if this document is missing a value for the field.
After deprecation changes have been backported to 6.x,
make this a default behaviour in 7.0
Closes#29286
We do not support intra-cluster connections on multiple interfaces, but the
documentation indicates that we will in future. In fact there is currently no
plan to support this, so the forward-looking documentation is misleading. This
commit
- removes the misleading sentence
- fixes that a transport profile affects outbound connections, not inbound ones
- tidies up some nearby text
Relates #29827
This implementation behaves like the current transport client, that you basically cannot configure a Watch POJO representation as an argument to the put watch API, but only a bytes reference. You can use the the `WatchSourceBuilder` from the `org.elasticsearch.plugin:x-pack-core` dependency to build watches.
This commit also changes the license type to trial, so that watcher is available in high level rest client tests.
/cc @hub-cap
* Add basic support for field aliases in index mappings. (#31287)
* Allow for aliases when fetching stored fields. (#31411)
* Add tests around accessing field aliases in scripts. (#31417)
* Add documentation around field aliases. (#31538)
* Add validation for field alias mappings. (#31518)
* Return both concrete fields and aliases in DocumentFieldMappers#getMapper. (#31671)
* Make sure that field-level security is enforced when using field aliases. (#31807)
* Add more comprehensive tests for field aliases in queries + aggregations. (#31565)
* Remove the deprecated method DocumentFieldMappers#getFieldMapper. (#32148)
Today it is unclear what guarantees are offered by the search preference
feature, and we claim a guarantee that is stronger than what we really offer:
> A custom value will be used to guarantee that the same shards will be used
> for the same custom value.
This commit clarifies this documentation.
Forward-port of #32098 to `master`.
This change adds two contexts the execute scripts against:
* SEARCH_SCRIPT: Allows to run scripts in a search script context.
This context is used in `function_score` query's script function,
script fields, script sorting and `terms_set` query.
* FILTER_SCRIPT: Allows to run scripts in a filter script context.
This context is used in the `script` query.
In both contexts a index name needs to be specified and a sample document.
The document is needed to create an in-memory index that the script can
access via the `doc[...]` and other notations. The index name is needed
because a mapping is needed to index the document.
Examples:
```
POST /_scripts/painless/_execute
{
"script": {
"source": "doc['field'].value.length()"
},
"context" : {
"search_script": {
"document": {
"field": "four"
},
"index": "my-index"
}
}
}
```
Returns:
```
{
"result": 4
}
```
POST /_scripts/painless/_execute
{
"script": {
"source": "doc['field'].value.length() <= params.max_length",
"params": {
"max_length": 4
}
},
"context" : {
"filter_script": {
"document": {
"field": "four"
},
"index": "my-index"
}
}
}
Returns:
```
{
"result": true
}
```
Also changed PainlessExecuteAction.TransportAction to use TransportSingleShardAction
instead of HandledAction, because now in case score or filter contexts are used
the request needs to be redirected to a node that has an active IndexService
for the index being referenced (a node with a shard copy for that index).
The current docs of the put-mapping Java API is currently broken. It its current
form, it creates an index and uses the whole mapping definition given as a JSON
string as the type name. Since we didn't check the index created in the
IndicesDocumentationIT so far this went unnoticed.
This change adds test to catch this error to the documentation test, changes the
documentation so it works correctly now and adds an input validation to
PutMappingRequest#buildFromSimplifiedDef() which was used internally to reject
calls where no mapping definition is given.
Closes#31906
Currently the `keep_types` token filter includes all token types specified using
its `types` parameter. Lucenes TypeTokenFilter also provides a second mode where
instead of keeping the specified tokens (include) they are filtered out
(exclude). This change exposes this option as a new `mode` parameter that can
either take the values `include` (the default, if not specified) or `exclude`.
Closes#29277
Make SnapshotInfo and CreateSnapshotResponse parsers lenient for backwards compatibility. Remove extraneous fields from CreateSnapshotRequest toXContent.
* Adds a new auto-interval date histogram
This change adds a new type of histogram aggregation called `auto_date_histogram` where you can specify the target number of buckets you require and it will find an appropriate interval for the returned buckets. The aggregation works by first collecting documents in buckets at second interval, when it has created more than the target number of buckets it merges these buckets into minute interval bucket and continues collecting until it reaches the target number of buckets again. It will keep merging buckets when it exceeds the target until either collection is finished or the highest interval (currently years) is reached. A similar process happens at reduce time.
This aggregation intentionally does not support min_doc_count, offest and extended_bounds to keep the already complex logic from becoming more complex. The aggregation accepts sub-aggregations but will always operate in `breadth_first` mode deferring the computation of sub-aggregations until the final buckets from the shard are known. min_doc_count is effectively hard-coded to zero meaning that we will insert empty buckets where necessary.
Closes#9572
* Adds documentation
* Added sub aggregator test
* Fixes failing docs test
* Brings branch up to date with master changes
* trying to get tests to pass again
* Fixes multiBucketConsumer accounting
* Collects more buckets than needed on shards
This gives us more options at reduce time in terms of how we do the
final merge of the buckeets to produce the final result
* Revert "Collects more buckets than needed on shards"
This reverts commit 993c782d117892af9a3c86a51921cdee630a3ac5.
* Adds ability to merge within a rounding
* Fixes nonn-timezone doc test failure
* Fix time zone tests
* iterates on tests
* Adds test case and documentation changes
Added some notes in the documentation about the intervals that can bbe
returned.
Also added a test case that utilises the merging of conseecutive buckets
* Fixes performance bug
The bug meant that getAppropriate rounding look a huge amount of time
if the range of the data was large but also sparsely populated. In
these situations the rounding would be very low so iterating through
the rounding values from the min key to the max keey look a long time
(~120 seconds in one test).
The solution is to add a rough estimate first which chooses the
rounding based just on the long values of the min and max keeys alone
but selects the rounding one lower than the one it thinks is
appropriate so the accurate method can choose the final rounding taking
into account the fact that intervals are not always fixed length.
Thee commit also adds more tests
* Changes to only do complex reduction on final reduce
* merge latest with master
* correct tests and add a new test case for 10k buckets
* refactor to perform bucket number check in innerBuild
* correctly derive bucket setting, update tests to increase bucket threshold
* fix checkstyle
* address code review comments
* add documentation for default buckets
* fix typo
This commit adds the _xpack/usage api to the high level rest client.
Currently in the transport api, the usage data is exposed in a limited
fashion, at most giving one level of helper methods for the inner keys
of data, but then exposing thos subobjects as maps of objects. Rather
than making parsers for every set of usage data from each feature, this
PR exposes the entire set of usage data as a map of maps.
Because this is a static method on a public API, and one that we encourage
plugin authors to use, the method with the typo is deprecated in 6.x
rather than just renamed.
With this commit we introduce a new circuit-breaking strategy to the parent
circuit breaker. Contrary to the current implementation which only accounts for
memory reserved via child circuit breakers, the new strategy measures real heap
memory usage at the time of reservation. This allows us to be much more
aggressive with the circuit breaker limit so we bump it to 95% by default. The
new strategy is turned on by default and can be controlled with the new cluster
setting `indices.breaker.total.userealmemory`.
Note that we turn it off for all integration tests with an internal test cluster
because it leads to spurious test failures which are of no value (we cannot
fully control heap memory usage in tests). All REST tests, however, will make
use of the real memory circuit breaker.
Relates #31767
It looks like we weren't clear on when and why you should close the high
level client and folks were closing it after every request which is not
efficient. This explains why you should close the client and when so
this shouldn't be as common.
Closes#32001
* Added lenient flag for synonym-tokenfilter.
Relates to #30968
* added docs for synonym-graph-tokenfilter
-- Also made lenient final
-- changed from !lenient to lenient == false
* Changes after review (1)
-- Renamed to ElasticsearchSynonymParser
-- Added explanation for ElasticsearchSynonymParser::add method
-- Changed ElasticsearchSynonymParser::logger instance to static
* Added lenient option for WordnetSynonymParser
-- also added more documentation
* Added additional documentation
* Improved documentation
* Handle missing values in painless
Throw an exception for `doc['field'].value`
if this document is missing a value for the `field`.
For 7.0:
This is the default behaviour from 7.0
For 6.x:
To enable this behavior from 6.x, a user can set a jvm.option:
`-Des.script.exception_for_missing_value=true` on a node.
If a user does not enable this behavior, a deprecation warning is logged on start up.
Closes#29286
This commit removes the link to an oss-MSI; there is only one version of the MSI, which includes X-Pack.
(cherry picked from commit d2e5db8a806ec8a25162f79db5209aceed4f30f7)
This is the first x-pack API we're adding to the high level REST client
so there is a lot to talk about here!
= Open source
The *client* for these APIs is open source. We're taking the previously
Elastic licensed files used for the `Request` and `Response` objects and
relicensing them under the Apache 2 license.
The implementation of these features is staying under the Elastic
license. This lines up with how the rest of the Elasticsearch language
clients work.
= Location of the new files
We're moving all of the `Request` and `Response` objects that we're
relicensing to the `x-pack/protocol` directory. We're adding a copy of
the Apache 2 license to the root fo the `x-pack/protocol` directory to
line up with the language in the root `LICENSE.txt` file. All files in
this directory will have the Apache 2 license header as well. We don't
want there to be any confusion. Even though the files are under the
`x-pack` directory, they are Apache 2 licensed.
We chose this particular directory layout because it keeps the X-Pack
stuff together and easier to think about.
= Location of the API in the REST client
We've been following the layout of the rest-api-spec files for other
APIs and we plan to do this for the X-Pack APIs with one exception:
we're dropping the `xpack` from the name of most of the APIs. So
`xpack.graph.explore` will become `graph().explore()` and
`xpack.license.get` will become `license().get()`.
`xpack.info` and `xpack.usage` are special here though because they
don't belong to any proper category. For now I'm just calling
`xpack.info` `xPackInfo()` and intend to call usage `xPackUsage` though
I'm not convinced that this is the final name for them. But it does get
us started.
= Jars, jars everywhere!
This change makes the `xpack:protocol` project a `compile` scoped
dependency of the `x-pack:plugin:core` and `client:rest-high-level`
projects. I intend to keep it a compile scoped dependency of
`x-pack:plugin:core` but I intend to bundle the contents of the protocol
jar into the `client:rest-high-level` jar in a follow up. This change
has grown large enough at this point.
In that followup I'll address javadoc issues as well.
= Breaking-Java
This breaks that transport client by a few classes around. We've
traditionally been ok with doing this to the transport client.
For historical reasons SQL restricts GROUP BY to only one field.
This commit removes the restriction and improves the test suite with
multi group by tests.
Close#31793
There have been at least two PRs trying to fix the spelling of "lazi" because it
isn't very clear from the example that the english analyzer will stem each token
in the example. This adds a short description of the analysis process to make
this clearer.
Relates to #31797
This is a followup to #31537. It makes a number of changes requested by
a review that came after the PR was merged. These are mostly cleanups
and doc improvements.
Only the shards that receive the bulk request will be affected by
`refresh`. Imagine a `_bulk?refresh=wait_for` request with three
documents in it that happen to be routed to different shards in an index
with five shards. The request will only wait for those three shards to
refresh. The other two shards of that make up the index do not
participate in the `_bulk` request at all.
Relates to #31819
Removes support for storing scripts without the usual json around the
script. So You can no longer do:
```
POST _scripts/<templatename>
{
"query": {
"match": {
"title": "{{query_string}}"
}
}
}
```
and must instead do:
```
POST _scripts/<templatename>
{
"script": {
"lang": "mustache",
"source": {
"query": {
"match": {
"title": "{{query_string}}"
}
}
}
}
}
```
This improves error reporting when you attempt to store a script but don't
quite get the syntax right. Before, there was a good chance that we'd
think of it as a "raw" template and just store it. Now we won't do that.
Nice.
* Upgrade bouncycastle
Required to fix
`bcprov-jdk15on-1.55.jar; invalid manifest format `
on jdk 11
* Downgrade bouncycastle to avoid invalid manifest
* Add checksum for new jars
* Update tika permissions for jdk 11
* Mute test failing on jdk 11
* Add JDK11 to CI
* Thread#stop(Throwable) was removed
http://mail.openjdk.java.net/pipermail/core-libs-dev/2018-June/053536.html
* Disable failing tests #31456
* Temprorarily disable doc tests
To see if there are other failures on JDK11
* Only blacklist specific doc tests
* Disable only failing tests in ingest attachment plugin
* Mute failing HDFS tests #31498
* Mute failing lang-painless tests #31500
* Fix backwards compatability builds
Fix JAVA version to 10 for ES 6.3
* Add 6.x to bwx -> java10
* Prefix out and err from buildBwcVersion for readability
```
> Task :distribution:bwc:next-bugfix-snapshot:buildBwcVersion
[bwc] :buildSrc:compileJava
[bwc] WARNING: An illegal reflective access operation has occurred
[bwc] WARNING: Illegal reflective access by org.codehaus.groovy.reflection.CachedClass (file:/home/alpar/.gradle/wrapper/dists/gradle-4.5-all/cg9lyzfg3iwv6fa00os9gcgj4/gradle-4.5/lib/groovy-all-2.4.12.jar) to method java.lang.Object.finalize()
[bwc] WARNING: Please consider reporting this to the maintainers of org.codehaus.groovy.reflection.CachedClass
[bwc] WARNING: Use --illegal-access=warn to enable warnings of further illegal reflective access operations
[bwc] WARNING: All illegal access operations will be denied in a future release
[bwc] :buildSrc:compileGroovy
[bwc] :buildSrc:writeVersionProperties
[bwc] :buildSrc:processResources
[bwc] :buildSrc:classes
[bwc] :buildSrc:jar
```
* Also set RUNTIME_JAVA_HOME for bwcBuild
So that we can make sure it's not too new for the build to understand.
* Align bouncycastle dependency
* fix painles array tets
closes#31500
* Update jar checksums
* Keep 8/10 runtime/compile untill consensus builds on 11
* Only skip failing tests if running on Java 11
* Failures are dependent of compile java version not runtime
* Condition doc test exceptions on compiler java version as well
* Disable hdfs tests based on runtime java
* Set runtime java to minimum supported for bwc
* PR review
* Add comment with ticket for forbidden apis
ingest: Introduction of a bytes processor
This processor allows for human readable byte values (e.g. 1kb) to be converted to value in bytes (e.g. 1024). Internally this processor re-uses "ByteSizeValue.parseBytesSizeValue" which supports conversions up to Long.MAX_VALUE and the following units: "b", "kb", "mb", "gb", "tb", pb".
This change also introduces a generic return type for the AbstractStringProcessor to allow for code reuse while supporting a String -> T conversion. (String -> Long in this case).
Significantly improve the example snippets in the documentation.
The examples are part of the test suite and checked nightly.
To help readability, the existing dataset was extended (test_emp renamed
to emp plus library).
Improve output of JDBC tests to be consistent with the CLI
Add lenient flag to JDBC asserts to allow type widening (a long is
equivalent to a integer as long as the value is the same).
AWS supports the creation and use of credentials that are only valid for a
fixed period of time. These credentials comprise three parts: the usual access
key and secret key, together with a session token. This commit adds support for
these three-part credentials to the EC2 discovery plugin and the S3 repository
plugin.
Note that session tokens are only valid for a limited period of time and yet
there is no mechanism for refreshing or rotating them when they expire without
restarting Elasticsearch. Nonetheless, this feature is already useful for
nodes that need only run for a few days, such as for training, testing or
evaluation. #29135 tracks the work towards allowing these credentials to be
refreshed at runtime.
Resolves#16428
So far the in-flight request circuit breaker has only accounted for the
on-the-wire representation of a request. However, we convert the raw
request into XContent internally which increases the overhead.
Therefore, we increase the value of the corresponding setting
`network.breaker.inflight_requests.overhead` from one to two. While this
value is still rather conservative (we assume that the representation as
structured objects has no overhead compared to the byte[]), it is closer
to reality than the current value.
Relates #31613
The range docs had an introductory section that described how to set up
and index *and* a test setup section in `docs/build.gradle` that
duplicated that section. This is bad because these section can (and do)
drift from one another. This change removes the setup in build.gradle
and marks the introductor snippet with `// TESTSETUP` so it is used on
all the snippets.
Skips tests the require xpack if we run the doc build without xpack. So
this should work:
```
./gradlew -p docs check -Dtests.distribution=oss-zip
```
This is implemented by detecting parts of the doc that look like:
```
[testenv="basic"]
```
Relates to #30665
Added support to the high-level rest client for the create snapshot API call. This required
several changes to toXContent which may need to be cleaned up in a later PR. Also
added several parsers for fromXContent to be able to retrieve appropriate responses
along with tests.
* Migrate scripted metric aggregation scripts to ScriptContext design #29328
* Rename new script context container class and add clarifying comments to remaining references to params._agg(s)
* Misc cleanup: make mock metric agg script inner classes static
* Move _score to an accessor rather than an arg for scripted metric agg scripts
This causes the score to be evaluated only when it's used.
* Documentation changes for params._agg -> agg
* Migration doc addition for scripted metric aggs _agg object change
* Rename "agg" Scripted Metric Aggregation script context variable to "state"
* Rename a private base class from ...Agg to ...State that I missed in my last commit
* Clean up imports after merge
We have made node selectors configurable per request, but all
of other language clients don't allow for that.
A good reason not to do so, is that having a different node selector
per request breaks round-robin. This commit makes NodeSelector
configurable only at client initialization. It also improves the docs
on this matter, important given that a single node selector can still
affect round-robin.