#28245 has introduced the utility class`EngineDiskUtils` with a set of methods to prepare/change
translog and lucene commit points. That util class bundled everything that's needed to create and
empty shard, bootstrap a shard from a lucene index that was just restored etc.
In order to safely do these manipulations, the util methods acquired the IndexWriter's lock. That
would sometime fail due to concurrent shard store fetching or other short activities that require the
files not to be changed while they read from them.
Since there is no way to wait on the index writer lock, the `Store` class has other locks to make
sure that once we try to acquire the IW lock, it will succeed. To side step this waiting problem, this
PR folds `EngineDiskUtils` into `Store`. Sadly this comes with a price - the store class doesn't and
shouldn't know about the translog. As such the logic is slightly less tight and callers have to do the
translog manipulations on their own.
Some source files seem to have the execute bit (a+x) set, which doesn't
really seem to hurt but is a bit odd. This change removes those, making
the permissions similar to other source files in the repository.
This change refactors the composite aggregation to add an execution mode that visits documents in the order of the values
present in the leading source of the composite definition. This mode does not need to visit all documents since it can early terminate
the collection when the leading source value is greater than the lowest value in the queue.
Instead of collecting the documents in the order of their doc_id, this mode uses the inverted lists (or the bkd tree for numerics) to collect documents
in the order of the values present in the leading source.
For instance the following aggregation:
```
"composite" : {
"sources" : [
{ "value1": { "terms" : { "field": "timestamp", "order": "asc" } } }
],
"size": 10
}
```
... can use the field `timestamp` to collect the documents with the 10 lowest values for the field instead of visiting all documents.
For composite aggregation with more than one source the execution can early terminate as soon as one of the 10 lowest values produces enough
composite buckets. For instance if visiting the first two lowest timestamp created 10 composite buckets we can early terminate the collection since it
is guaranteed that the third lowest timestamp cannot create a composite key that compares lower than the one already visited.
This mode can execute iff:
* The leading source in the composite definition uses an indexed field of type `date` (works also with `date_histogram` source), `integer`, `long` or `keyword`.
* The query is a match_all query or a range query over the field that is used as the leading source in the composite definition.
* The sort order of the leading source is the natural order (ascending since postings and numerics are sorted in ascending order only).
If these conditions are not met this aggregation visits each document like any other agg.
The rank_eval documentation was missing an explanation of the parameter
`k` that controls the number of top hits that are used in the ranking evaluation.
Closes#29205
Adds docs for `HighLevelRestClient#multiSearch`. Unlike the `multiGet`
docs these are much more sparse because multi-search doesn't support
setting many options on the `MultiSearchRequest` and instead just wraps
a list of `SearchRequest`s.
Closes#28389
This enhancement adds Z value support (source only) to geo_shape fields. If vertices are provided with a third dimension, the third dimension is ignored for indexing but returned as part of source. Like beofre, any values greater than the 3rd dimension are ignored.
closes#23747
This commit removes type-casts in logging in the server component (other
components will be done later). This also adds a parameterized message
test which would catch breaking-changes related to lambdas in Log4J.
While working on #27799, we find that it might make sense to change BroadcastResponse from ToXContentFragment to ToXContentObject, seeing that it's rather a complete XContent object and also the other Responses are normally ToXContentObject.
By doing this, we can also move the XContent build logic of BroadcastResponse's subclasses, from Rest Layer to the concrete classes themselves.
Relates to #3889
This commit adds a note to the low-level REST client docs regarding the
possibility of being impacted by the JVM DNS cache policy under a
default security manager policy.
In #28350, we fixed an endless flushing loop which may happen on
replicas by tightening the relation between the flush action and the
periodically flush condition.
1. The periodically flush condition is enabled only if it is disabled
after a flush.
2. If the periodically flush condition is enabled then a flush will
actually happen regardless of Lucene state.
(1) and (2) guarantee that a flushing loop will be terminated. Sadly,
the condition 1 can be violated in edge cases as we used two different
algorithms to evaluate the current and future uncommitted translog size.
- We use method `uncommittedSizeInBytes` to calculate current
uncommitted size. It is the sum of translogs whose generation at least
the minGen (determined by a given seqno). We pick a continuous range of
translogs since the minGen to evaluate the current uncommitted size.
- We use method `sizeOfGensAboveSeqNoInBytes` to calculate the future
uncommitted size. It is the sum of translogs whose maxSeqNo at least
the given seqNo. Here we don't pick a range but select translog one by
one.
Suppose we have 3 translogs `gen1={#1,#2}, gen2={}, gen3={#3} and
seqno=#1`, `uncommittedSizeInBytes` is the sum of gen1, gen2, and gen3
while `sizeOfGensAboveSeqNoInBytes` is the sum of gen1 and gen3. Gen2 is
excluded because its maxSeqno is still -1.
This commit removes both `sizeOfGensAboveSeqNoInBytes` and
`uncommittedSizeInBytes` methods, then enforces an engine to use only
`sizeInBytesByMinGen` method to evaluate the periodically flush condition.
Closes#29097
Relates ##28350
This commit removes some parameters deprecated in 6.x (or 5.x):
`use_dismax`, `split_on_whitespace`, `all_fields` and `lowercase_expanded_terms`.
Closes#25551
Remove license information from README.textile in preparation for the
the inclusion of X-Pack into the main elasticsearch repository.
This is necessary to avoid confusion around licensing, as the entire
repository will no longer be Apache 2.0.
This commit decouples `BytesRef`, `Releaseable`, and `TimeValue` from
XContentBuilder, and paves the way for doupling `ByteSizeValue` as well. It
moves much of the Lucene and Joda encoding into a new SPI extension that is
loaded by XContentBuilder to know how to encode these values.
Part of doing this also allows us to make JSON encoding strict, as we no longer
allow just any old object to be passed (in the past it was possible to get json
that was `"field": "java.lang.Object@d8355a8"` if no one was careful about what
was passed in).
Relates to #28504
This commit adds a new setting `cluster.persistent_tasks.allocation.enable`
that can be used to enable or disable the allocation of persistent tasks.
The setting accepts the values `all` (default) or `none`. When set to
none, the persistent tasks that are created (or that must be reassigned)
won't be assigned to a node but will reside in the cluster state with
a no "executor node" and a reason describing why it is not assigned:
```
"assignment" : {
"executor_node" : null,
"explanation" : "persistent task [foo/bar] cannot be assigned [no
persistent task assignments are allowed due to cluster settings]"
}
```
We discussed and agreed to include the synced-flush change in 6.3.0+ but
not in 5.6.9. We will re-evaluate the urgency and importance of the
issue then decide which versions that the change should be included.
This commit reenables the Javadoc tasks on JDK 10. To reenable these
tasks, we have to workaround a bug in JDK 10 which trips on some deeply
nested anonymous classes that we have in the codebase (and are fine
as-is, this is not a problem with this code). The workaround is to
remove the compiled classes from the classpath. This has been reported
upstream and the workaround was suggested there (see the code comment).
The assumption here is that we will no longer be making a release from
the 6.1 branch. Since we assume that all versions on this branch are
actually released, we do not want to leave behind any versions that
would require a snapshot build. We do have a test that verifies that all
released versions are present here, so if another release is performed
from the 6.1 branch, that test will fail and we will know to add the
version constant at that time.
This will reject mapping updates to the `_default_` mapping with 7.x indices
and still emit a deprecation warning with 6.x indices.
Relates #15613
Supersedes #28248
I misunderstood how the bwc versions works. If we backport to 5.x, we
need to backport to all supported 6.*. This commit corrects the BWC
versions for PreSyncedFlushResponse.
Relates #29103
* Remove BytesArray and BytesReference usage from XContentFactory
This removes the usage of `BytesArray` and `BytesReference` from
`XContentFactory`. Instead, a regular `byte[]` should be passed. To assist with
this a helper has been added to `XContentHelper` that will preserve the offset
and length from the underlying BytesReference.
This is part of ongoing work to separate the XContent parts from ES so they can
be factored into their own jar.
Relates to #28504
* Add pluggable XContentBuilder writers and human readable writers
This adds the ability to use SPI to plug in writers for XContentBuilder. By
implementing the XContentBuilderProvider class we can allow Elasticsearch to
plug in different ways to encode types to JSON.
Important caveat for this, we should always try to have the class implement
`ToXContentFragment` first, however, in the case of classes from our
dependencies (think Joda classes or Lucene classes) we need a way to specify
writers for these classes.
This also makes the human-readable field writers generic and pluggable, so that
we no longer need to tie XContentBuilder to things like `TimeValue` and
`ByteSizeValue`. Contained as part of this moves all the TimeValue human
readable fields to the new `humanReadableField` method. A future commit will
move the `ByteSizeValue` calls over to this method.
Relates to #28504
We need to configure the Java 9 checkstyle task to depend on the
checkstyle configuration task or the task could run before the
checkstyle conf has been copied leading to runtime failures. We have to
do this after projects have been evaluated because the configuration of
these tasks can occur before the Java 9 source set has been added to a
project.
This commit fixes the directory name bundled plugins are added under
within a meta plugin to be the configured name of the bundled plugin,
instead of the project name.
This commit creates the copyRestSpec task for rest integ tests
immediately on creation of the RestIntegTestTask instead of lazily in
afterEvaluate. This allows other projects to add additional rest specs
to be copied, instead of needing to create another parallel copy task.