This commit makes creators of GetField split the fields into document fields and metadata fields. It is part of larger refactoring that aims to remove the calls to static methods of MapperService related to metadata fields, as discussed in #24422.
This change moves the construction of the result
HashMap in Grok.captures() into the branch that
actually needs it.
This probably will not make a measurable difference
for ingest pipelines, but it is beneficial to the
ML find_file_structure endpoint, as it tries out
many Grok patterns that will fail to match.
`org.elasticsearch.action.bulk.BulkProcessor` is a threadsafe class that
allows for simple semantics to deal with sending bulk requests. Once a
bulk reaches it's pre-defined size, documents, or flush interval it will
execute sending the bulk. One configurable option is the number of concurrent
outstanding bulk requests. That concurrency is implemented in
`org.elasticsearch.action.bulk.BulkRequestHandler` via a semaphore. However,
the only code that currently calls into this code is blocked by `synchronized`
methods. This results in the in-ability for the BulkProcessor to behave concurrently
despite supporting configurable amounts of concurrent requests.
This change removes the `synchronized` method in favor an explicit
lock around the non-thread safe parts of the method. The call into
`org.elasticsearch.action.bulk.BulkRequestHandler` is no longer blocking, which
allows `org.elasticsearch.action.bulk.BulkRequestHandler` to handle it's own concurrency.
Allow querying of FROZEN indices both through dedicated SQL grammar
extension:
> SELECT field FROM FROZEN index
and also through driver configuration parameter, namely:
> index.include.frozen: true/false
Fix#39390Fix#39377
(cherry picked from commit 2445a933915f420c7f51e8505afa0a7978ce6b0f)
Lucene 8 has the ability to skip blocks of non-competitive documents.
However some queries don't track their maximum score (`script_score`, `span`, ...)
so they always return Float.POSITIVE_INFINITY as maximum score. This can slow down
some boolean queries if other clauses have bounded max scores. This commit disables
the max score optimization when we detect a mandatory scoring clause with unbounded
max scores. Optional clauses are not checked since they can still skip documents
when the unbounded clause is after the current document.
Currently AnalysisRegistry#processNormalizerFactory creates a normalizer and
only later checks whether it should be added to the normalizer map passed in. In
case we throw an exception it isn't closed. This can be prevented by moving the
check that throws the exception earlier.
When asserting on the checkpoint value if the DF has completed the checkpoint will be 1 else 0.
Similarly state may be started or indexing. Closes#42309
Allow for SimpleQueryString, QueryString and MultiMatchQuery
to set the `fields` parameter to the wildcard `*`. If so, set
the leniency to `true`, to achieve the same behaviour as from the
`"default_field" : "*" setting.
Furthermore, check if `*` is in the list of the `default_field` but
not necessarily as the 1st element.
Closes: #39577
(cherry picked from commit e75ff0c748e6b68232c2b08e19ac4a4934918264)
AbstractDisruptionTestCase set a lower global checkpoint sync interval setting, but this was ignored by
testAckedIndexing, which has led to spurious test failures
Relates #41068, #38931
ShardId already implements Writeable so there is no need for it to implement Streamable too. Also the readShardId static method can be
easily replaced with direct usages of the constructor that takes a
StreamInput as argument.
In case a search request holds only the suggest section, the query phase
is skipped and only the suggest phase is executed instead. There will
never be hits returned, and in case the explain flag is set to true, the
explain sub phase throws a null pointer exception as the query is null.
Usually a null query is replaced with a match all query as part of SearchContext#preProcess which is though skipped as well with suggest
only searches. To address this, we skip the explain sub fetch phase
for search requests that only requested suggestions.
Closes#31260
As a follow-up to #38540 we can use lambda functions and method
references where convenient in the low-level REST client.
Also, we need to update the docs to state that the minimum java version
required is 1.8.
Removing of payload in BulkRequest (#39843) had a side effect of making
`BulkRequest.add(DocWriteRequest<?>...)` (with varargs) recursive, thus
leading to StackOverflowError. This PR adds a small change in
RequestConvertersTests to show the error and the corresponding fix in
`BulkRequest`.
Fixes#41668
* Remove IndexShard dependency from Repository
In order to simplify repository testing especially for BlobStoreRepository
it's important to remove the dependency on IndexShard and reduce it to
Store and MapperService (in the snapshot case). This significantly reduces
the dependcy footprint for Repository and allows unittesting without starting
nodes or instantiate entire shard instances. This change deprecates the old
method signatures and adds a unittest for FileRepository to show the advantage
of this change.
In addition, the unittesting surfaced a bug where the internal file names that
are private to the repository were used in the recovery stats instead of the
target file names which makes it impossible to relate to the actual lucene files
in the recovery stats.
* don't delegate deprecated methods
* apply comments
* test
When using High Level Rest Client Java API to produce search query, using AggregationBuilders.topHits("th").sort("_score", SortOrder.DESC)
caused query to contain duplicate sort clauses.
A shard that is undergoing peer recovery is subject to logging warnings of the form
org.elasticsearch.action.FailedNodeException: Failed node [XYZ]
...
Caused by: org.apache.lucene.index.IndexNotFoundException: no segments* file found in ...
These failures are actually harmless, and expected to happen while a peer recovery is ongoing (i.e.
there is an IndexShard instance, but no proper IndexCommit just yet).
As these failures are currently bubbled up to the master, they cause unnecessary reroutes and
confusion amongst users due to being logged as warnings.
Closes #40107
rp.client_secret is a required secure setting. Make sure we fail with
a SettingsException and a clear, actionable message when building
the realm, if the setting is missing.
Enhance the handling of merging the claims sets of the
ID Token and the UserInfo response. JsonObject#merge would throw a
runtime exception when attempting to merge two objects with the
same key and different values. This could happen for an OP that
returns different vales for the same claim in the ID Token and the
UserInfo response ( Google does that for profile claim ).
If a claim is contained in both sets, we attempt to merge the
values if they are objects or arrays, otherwise the ID Token claim
value takes presedence and overwrites the userinfo response.
This adds the node name where we fail to start a process via the native
controller to facilitate debugging as otherwise it might not be known
to which node the job was allocated.
Moves the test infrastructure away from using node.max_local_storage_nodes, allowing us in a
follow-up PR to deprecate this setting in 7.x and to remove it in 8.0.
This also changes the behavior of InternalTestCluster so that starting up nodes will not automatically
reuse data folders of previously stopped nodes. If this behavior is desired, it needs to be explicitly
done by passing the data path from the stopped node to the new node that is started.
This commit reworks and clarifies the docs for the `discovery-ec2` plugin:
- folds the tiny "Getting started with AWS" into the page on configuration
- spells out the name of each setting in full instead of noting the
`discovery.ec2` prefix at the top of the page.
- replaces each `(Secure)` marker with a sentence describing what that means in
situ
- notes some missing defaults
- clarifies the behaviour of `discovery.ec2.groups` (dependent on `.any_group`)
- clarifies what `discovery.ec2.host_type` is for
- adds `discovery.ec2.tag.TAGNAME` as a (meta-)setting rather than describing
it in a separate section
- notes that the tags mentioned in `discovery.ec2.tag.TAGNAME` cannot contain
colons (see #38406)
- clarifies the EC2-specific interface names and what they're for
- reorders and rewords the recommendations for storage
- expands on why you should not span a cluster across regions
- adds a suggestion on protecting instances against termination during scale-in
- reformat to 80 columns where possible
Fixes#38406
Simplifies the voting configuration reconfiguration logic by switching to an explicit Comparator for
the priorities. Does not make changes to the behavior of the component.