We used to double write the translog operation which is not needed except
of for recovery. This commit cuts over to a big-array based temporary serialiation
and removes the crazy double writing.
Now that mapping updates are synchronous, it is not necessary to send mappings
to the master node during the recovery process anymore: they will already be on
the master node since we ensure mappings are on the master node before indexing.
Mappings conflicts should not be ignored. If I read the history correctly, this
option was added when a mapping update to an existing field was considered a
conflict, even if the new mapping was exactly the same. Now that mapping updates
are smart enough to detect conflicting options, we don't need an option to
ignore conflicts.
Whenever a query parser (or any other component) issues another
request as part of a request, the headers and the context has to
be supplied as well.
In order to do this, the `SearchContext` has to have those headers
available, which in turn means, the shard level request needs to
copy those from the original `SearchRequest`
This commit introduces two new interface to supply the needed methods
to work with context and headers.
Closes#10979
This is a leftover from the times where we failed a flush when
recoveries are ongoing. This code is really not needed anymore and
we can luckily flush the translog all the time as well.
ShardOperationFailureException implementations alread provide structured
exception support but it's not yet exposed on the interface. This change
allows nice rendering of structured REST exceptions also if searches fail on
only a subset of the shards etc.
Closes#11017
This commit fixes the name of the upated_snapshot task from something like "update_snapshot [org.elasticsearch.cluster.metadata.SnapshotMetaData$Entry@de00bc50]" to a more readable "update_snapshot [test-snap]"
When in a shared filesystem environment and recovering the primary to
any node. We should respect the allocation deciders if possible (still
force-allocting to another node if there aren't any "YES" decisions).
The AllocationDeciders should take precedence over the shard state
version when force-allocating an unassigned primary shard.
Closes#11192 which I accidentally already closed.
Squashed commit of the following:
commit f23faccddc2a77a880841da4c89c494edaa2aa46
Author: Robert Muir <rmuir@apache.org>
Date: Fri May 15 16:04:55 2015 -0400
Simplify this FileUtils even more: its either from the filesystem, or the classpath,
not both. Its already trying 4 different combinations of crazy paths for either of these anyway.
commit c7016c8a2b5a6043e2ded4b48b160821ba196974
Author: Robert Muir <rmuir@apache.org>
Date: Fri May 15 14:21:37 2015 -0400
include rest tests in test-framework jar
Closes#11192 which I accidentally already closed.
Squashed commit of the following:
commit f23faccddc2a77a880841da4c89c494edaa2aa46
Author: Robert Muir <rmuir@apache.org>
Date: Fri May 15 16:04:55 2015 -0400
Simplify this FileUtils even more: its either from the filesystem, or the classpath,
not both. Its already trying 4 different combinations of crazy paths for either of these anyway.
commit c7016c8a2b5a6043e2ded4b48b160821ba196974
Author: Robert Muir <rmuir@apache.org>
Date: Fri May 15 14:21:37 2015 -0400
include rest tests in test-framework jar
Now that lucene provides a way to identify if the warming reader is
the first initial opened reader we can detach this class from the
enclosing and make it static. This is important since it might access
not fully initialized members of the enclosing class since it's initialized
and used during constructor invocation.
The default `false` for `require_field_match` is a bit odd and confusing for users, given that field names get ignored by default and every field gets highlighted if it contains terms extracted out of the query, regardless of which fields were queries. Changed the default to `true`, it can always be changed per request.
Closes#10627Closes#11067
Our own fork of the lucene PostingsHighlighter is not easy to maintain and doesn't give us any added value at this point. In particular, it was introduced to support the require_field_match option and discrete per value highlighting, used in case one wants to highlight the whole content of a field, but get back one snippet per value. These two features won't
make it into lucene as they slow things down and shouldn't have been supported from day one on our end probably.
One other customization we had was support for a wider range of queries via custom rewrite etc. (yet another way to slow
things down), which got added to lucene and works much much better than what we used to do (instead of or rewrite, term
s are pulled out of the automata for multi term queries).
Removing our fork means the following in terms of features:
- dropped support for require_field_match: the postings highlighter will only highlight fields that were queried
- some custom es queries won't be supported anymore, meaning they won't be highlighted. The only one I found up until now is the phrase_prefix. Postings highlighter rewrites against an empty reader to avoid slow operations (like the ones that we were performing with the fork that we are removing here), thus the prefix will not be expanded to any term. What the postings highlighter does instead is pulling the automata out of multi term queries, but this is not supported at the moment with our MultiPhrasePrefixQuery.
Closes#10625Closes#11077
The underlying automaton-backed implementation throws an error if there are too many states.
This fix changes to using an implementation based on Set lookups for lists of excluded terms.
If the global-ordinals execution mode is in effect this implementation also addresses the slowness identified in issue 11181 which is caused by traversing the TermsEnum - instead the excluded terms’ global ordinals are looked up individually and unset the bits of acceptable terms. This is significantly faster.
Closes#11176
When scrolling, SCAN previously collected documents until it reached where it
had stopped on the previous iteration. This makes pagination slower and slower
as you request deep pages. With this change, SCAN now directly jumps to the
doc ID where is had previously stopped.