This adds a new feature to the Term Vectors API which allows for filtering of
terms based on their tf-idf scores. With `dfs` option on, this could be useful
for finding out a good characteric vector of a document or a set of documents.
The parameters are similar to the ones used in the MLT Query.
Closes#9561
Using ThreadLocalRandom only prevents reproducibilty but doesn't buy us
anything. In production different datapaths won't have the same since
anyway or at least with a low likelyhood.
We need to preserve settings (yet transient) even though the engine is not yet
started. This commit moves back to a single EngineConfig to simplify IndexShard
and settings state.
Closes#10584
Local execution of transport messages failures can create a more detailed remote transport exceptions. Also, when failing to handle an exception, the error should be logged, and not call the handler again with another exception
closes#10554
This commit adds a `rewrite` parameter to the validate API in order to shown
how the given query is re-written into primitive queries. For example, an MLT
query is re-written into a disjunction of the selected terms. Other use cases
include `fuzzy`, `common_terms`, or `match` query especially with a
`cutoff_frequency` parameter. Note that the explanation is only given for a
single randomly chosen shard only, so the output may vary from one shard to
another.
Relates #1412Closes#10147
Today the engine writes the transaction log itself as well as manages
all the commit / translog mapping internally. Yet, if an engine is closed
and reopend it doesn't replay it's translog or does anything to be consistent
with it's latest state again.
This change moves the transaction log replay code into the Engine / InternalEngine
and adds unittests for replaying and consistency.
Closes#10452
At the moment, we are very strict when handling data folders containing corrupted shards and will fail any recovery attempt into it. Typically this wouldn't be a problem as the shard will be assigned to another node (which we try first anyway when a shard fails). However, it has been proven to be too strict for smaller clusters which may not have an extra node available (either because of allocation filtering, disk space issues etc.). This commit changes the behavior to force a full recovery. Once all the new files are verified we remove the old corrupted data and start the shard.
This also fixes a small issue where the shard state file wasn't deleted on an engine failure (we had a protection against deleting the state file on an active shard, but in this case the shard is still active but will be removed). The state deletion is also moved to before the failure handlers are called, to avoid race conditions when calling the master (it will potentially try to read it when allocating the shard)
Closes#10558
ShapeBuilder's coordinate parser expected 2 double values for every coordinate array. If > 2 doubles were provided the parser terminated parsing of the coordinate array. This resulted in an invalid Shape state leaving LineStrings, LinearRings, and Polygons with a single coordinate. An incorrect parse exception was thrown. This corrects the parser to ignore those values in the 3rd+ dimension, correctly parsing the rest of the coordinate array.
Unit tests have been updated to verify the fix.
closes#10510
Prevents the user from changing strategies, tree, tree_level or precision. distance_error_pct changes are allowed as they do not compromise the integrity of the index. A separate issue is open for allowing users to change tree_level or precision.
OGC SFA 2.1.10 assertion 3 allows interior boundaries to touch exterior boundaries provided they intersect at a single point. Issue #9511 provides an example where a valid shape is incorrectly interpreted as invalid (a false violation of assertion 3). When the intersecting point appears as the first and last coordinate of the interior boundary in a polygon, the ShapeBuilder incorrectly counted this as multiple intersecting vertices. The fix required a little more than just a logic check. Passing the duplicate vertices resulted in a connected component in the edge graph causing an invalid self crossing polygon. This required additional logic to the edge assignment in order to correctly segment the connected components. Finally, an additional hole validation has been added along with proper unit tests for testing valid and invalid conditions (including dateline crossing polys).
closes#9511
Update Eclipse core prefs
Eclipse Luna overwrites the prefs file, putting all the settings
in alphabetical order and removing comments. This causes the prefs
files to be modified in the git workspace.
Update the file with the version generated by Eclipse to prevent it
from being modified every time.
No settings values are modified by this change.
This also adds another plugin to the lifecycle mapping in the pom.xml which was missed in https://github.com/elastic/elasticsearch/pull/10524.
This is really a Collector instead of a filter. This commit deprecates the
`limit` filter, makes it a no-op and recommends to use the `terminate_after`
parameter instead that we introduced in the meantime.
Most tests don't "really" need to fsync, and this is costly (makes
tests slower, wears out our SSDs).
This change makes it uncommon to actually fsync when Lucene asks for
it. It's just a workaround (in MockDirectoryHelper) until we can
cutover Elasticseach to use MockFileSystem like Lucene.
Closes#10516