`text` fields will have fielddata disabled by default. Fielddata can still be
enabled on an existing index by setting `fielddata=true` in the mappings.
If a terms aggregation was ordered by a metric nested in a single bucket aggregator which did not collect any documents (e.g. a filters aggregation which did not match in that term bucket) an ArrayOutOfBoundsException would be thrown when the ordering code tried to retrieve the value for the metric. This fix fixes all numeric metric aggregators so they return their default value when a bucket ordinal is requested which was not collected.
Closes#17225
readFrom is confusing because it requires an instance of the type that it
is reading but it doesn't modify it. But we also have (deprecated) methods
named readFrom that *do* modify the instance. The "right" way to implement
the non-modifying readFrom is to delegate to a constructor that takes a
StreamInput so that the read object can be immutable. Now that we have
`@FunctionalInterface`s it is fairly easy to register things by referring
directly to the constructor.
This change modifying NamedWriteableRegistry so that it does that. It keeps
supporting `registerPrototype` which registers objects to be read by
readFrom but deprecates it and delegates it to a new `register` method
that allows passing a simple functional interface. It also cuts Task.Status
subclasses over to using that method.
The start of #17085
The test was checking that we'd set the headers properly but in some cases
the request had yet to come in because it was running on another thread.
Now we wait for the headers to show up before failing the test.
Closes#17299
If the user asks for a refresh but their reindex or update-by-query
operation touched no indexes we should just skip the resfresh call
entirely. Without this commit we refresh *all* indexes which is totally
wrong.
Closes#17296
The fielddata settings in mappings have been refatored so that:
- text and string have a `fielddata` (boolean) setting that tells whether it
is ok to load in-memory fielddata. It is true by default for now but the
plan is to make it default to false for text fields.
- text and string have a `fielddata_frequency_filter` which contains the same
thing as `fielddata.filter.frequency` used to (but validated at parsing time
instead of being unchecked settings)
- regex fielddata filtering is not supported anymore and will be dropped from
mappings automatically on upgrade.
- text, string and _parent fields have an `eager_global_ordinals` (boolean)
setting that tells whether to load global ordinals eagerly on refresh.
- in-memory fielddata is not supported on keyword fields anymore at all.
- the `fielddata` setting is not supported on other fields that text and string
and will be dropped when upgrading if specified.
Currently, both Gsub and Grok parse regex strings during
Pipeline creation. Thrown parsing exceptions were leaking out, this
commit wraps those exceptions in ElasticsearchParseExceptions.
We current have a ClusterService interface, implemented by InternalClusterService and a couple of test classes. Since the decoupling of the transport service and the cluster service, one can construct a ClusterService fairly easily, so we don't need this extra indirection.
Closes#17183
Also replaced the PercolatorQueryRegistry with the new PercolatorQueryCache.
The PercolatorFieldMapper stores the rewritten form of each percolator query's xcontext
in a binary doc values field. This make sure that the query rewrite happens only during
indexing (some queries for example fetch shapes, terms in remote indices) and
the speed up the loading of the queries in the percolator query cache.
Because the percolator now works inside the search infrastructure a number of features
(sorting fields, pagination, fetch features) are available out of the box.
The following feature requests are automatically implemented via this refactoring:
Closes#10741Closes#7297Closes#13176Closes#13978Closes#11264Closes#10741Closes#4317
Today we allow to set all kinds of index level settings on the node level which
is error prone and difficult to get right in a consistent manner.
For instance if some analyzers are setup in a yaml config file some nodes might
not have these analyzers and then index creation fails.
Nevertheless, this change allows some selected settings to be specified on a node level
for instance:
* `index.codec` which is used in a hot/cold node architecture and it's value is really per node or per index
* `index.store.fs.fs_lock` which is also dependent on the filesystem a node uses
All other index level setting must be specified on the index level. For existing clusters the index must be closed
and all settings must be updated via the API on each of the indices.
Closes#16799
Without this commit fetching the status of a reindex from a node that isn't
coordinating the reindex will fail. This commit properly registers reindex's
status so this doesn't happen. To do so it moves all task status registration
into NetworkModule and creates a method to register other statuses which the
reindex plugin calls.
The build currently uses the old maven support in gradle. This commit
switches to use the newer maven-publish plugin. This will allow future
changes, for example, easily publishing to artifactory.
An additional part of this change makes publishing of build-tools part
of the normal publishing, instead of requiring a separate upload step
from within buildSrc. That also sets us up for a follow up to enable
precomit checks on the buildSrc code itself.
Add infrastructure to run REST tests on a multi-version cluster
This change adds the infrastructure to run the rest tests on a multi-node
cluster that users 2 different minor versions of elasticsearch. It doesn't implement
any dedicated BWC tests but rather leverages the existing REST tests.
Since we don't have a real version to test against, the tests uses the current version
until the first minor / RC is released to ensure the infrastructure works.
Given the amount of problems this change already found I think it's worth having this run with our test suite by default. The structure of this infra will likely change over time but for now it's a step into the right direction. We will likely want to split it up into integTests and integBwcTests etc. so each plugin can have it's own bwc tests but that's left for future refactoring.
Today index names are often resolved lazily, only when they are really
needed. This can be problematic especially when it gets to mapping updates
etc. when a node sends a mapping update to the master but while the request
is in-flight the index changes for whatever reason we would still apply the update
since we use the name of the index to identify the index in the clusterstate.
The problem is that index names can be reused which happens in practice and sometimes
even in a automated way rendering this problem as realistic.
In this change we resolve the index including it's UUID as early as possible in places
where changes to the clusterstate are possible. For instance mapping updates on a node use a
concrete index rather than it's name and the master will fail the mapping update iff
the index can't be found by it's <name, uuid> tuple.
Closes#17048
Today, certain bootstrap properties are set and read via system
properties. This action-at-distance way of managing these properties is
rather confusing, and completely unnecessary. But another problem exists
with setting these as system properties. Namely, these system properties
are interpreted as Elasticsearch settings, not all of which are
registered. This leads to Elasticsearch failing to startup if any of
these special properties are set. Instead, these properties should be
kept as local as possible, and passed around as method parameters where
needed. This eliminates the action-at-distance way of handling these
properties, and eliminates the need to register these non-setting
properties. This commit does exactly that.
Additionally, today we use the "-D" command line flag to set the
properties, but this is confusing because "-D" is a special flag to the
JVM for setting system properties. This creates confusion because some
"-D" properties should be passed via arguments to the JVM (so via
ES_JAVA_OPTS), and some should be passed as arguments to
Elasticsearch. This commit changes the "-D" flag for Elasticsearch
settings to "-E".
This change adds the infrastructure to run the rest tests on a multi-node
cluster that users 2 different minor versions of elasticsearch. It doesn't implement
any dedicated BWC tests but rather leverages the existing REST tests.
Since we don't have a real version to test against, the tests uses the current version
until the first minor / RC is released to ensure the infrastructure works.
Relates to #14406Closes#17072
The refresh tests were failing rarely due to refreshes happening
automatically on indexes with -1 refresh intervals. This commit moves
the refresh test into a unit test where we can check if it was attempted
so we never get false failures from background refreshes.
It also stopped refresh from being run if the reindex request was canceled.
Indexing failures have caused the reindex http request to fail for a while
now. Both search and indexing failures cause it to abort. But search
failures didn't cause a non-200 response code from the http api. This
fixes that.
Also slips in a fix to some infrequently failing rest tests.
Closes#16037
Implements CompositeIndicesRequest on UpdateByQueryRequest and
ReindexRequest so that plugins can reason about the request. In both cases
this implementation is imperfect but useful because instead of listing
all requests that make up the request it instead attempts to make dummy
requests that represent the requests that it will later make.
Use index UUID to lookup indices on IndicesService
Today we use the index name to lookup index instances on the IndicesService
which applied to search requests but also to index deletion etc. This commit
moves the interface to expect an Index instance which is a tuple
and looks up the index by uuid rather than by name. This prevents accidental modification
of the wrong index if and index is recreated or searching from the wrong index in such a case.
Accessing an index that has the same name but different UUID will now result in an IndexNotFoundException.
Today we use the index name to lookup index instances on the IndicesService
which applied to search reqeusts but also to index deletion etc. This commit
moves the interface to expcet and `Index` instance which is a <name, uuid> tuple
and looks up the index by uuid rather than by name. This prevents accidential modificaiton
of the wrong index if and index is recreated or searching from the _wrong_ index in such a case.
Accessing an index that has the same name but different UUID will now result in an IndexNotFoundException.
Closes#17001
_wait_for_completion defaults to false. If set to true then the API will
wait for all the tasks that it finds to stop running before returning. You
can use the timeout parameter to prevent it from waiting forever. If you
don't set a timeout parameter it'll default to 30 seconds.
Also adds a log message to rest tests if any tasks overrun the test. This
is just a log (instead of failing the test) because lots of tasks are run
by the cluster on its own and they shouldn't cause the test to fail. Things
like fetching disk usage from the other nodes, for example.
Switches the request to getter/setter style methods as we're going that
way in the Elasticsearch code base. Reindex is all getter/setter style.
Closes#16906
Closes#16964
Squashed commit of the following:
commit a23f9d2d29220991aa498214530753d7a5a148c6
Merge: eec9c4e 0b0a251
Author: Robert Muir <rmuir@apache.org>
Date: Mon Mar 7 04:12:02 2016 -0500
Merge branch 'master' into lucene6
commit eec9c4e5cd11e9c3e0b426f04894bb2a6dae4f21
Merge: bc67205 675d940
Author: Robert Muir <rmuir@apache.org>
Date: Fri Mar 4 13:45:00 2016 -0500
Merge branch 'master' into lucene6
commit bc67205bdfe1526eae277ab7856fc050ecbdb7b2
Author: Robert Muir <rmuir@apache.org>
Date: Fri Mar 4 09:56:31 2016 -0500
fix test bug
commit a60723b007ff12d97b1810cef473bd7b553a0327
Author: Simon Willnauer <simonw@apache.org>
Date: Fri Mar 4 15:35:35 2016 +0100
Fix SimpleValidateQueryIT to put braces around boosted terms
commit ae3a49d7ba7ced448d2a5262e5d8ec98671a9090
Author: Simon Willnauer <simonw@apache.org>
Date: Fri Mar 4 15:27:25 2016 +0100
fix multimatchquery
commit ae23fdb88a8f6d3fb7ba60fd1aaf3fd72d899aa5
Author: Simon Willnauer <simonw@apache.org>
Date: Fri Mar 4 15:20:49 2016 +0100
Rewrite DecayFunctionScoreIT to be independent of the similarity used
This test relied a lot on the term scoring and compared scores
that are dependent on the similarity. This commit changes the base query
to be a predictable constant score query.
commit 366c2d518c35d31251033f1b6f6a93f6e2ae327d
Author: Simon Willnauer <simonw@apache.org>
Date: Fri Mar 4 14:06:14 2016 +0100
Fix scoring in tests due to changes to idf calculation.
Lucene 6 uses a different default similarity as well as a different
way to calculate IDF. In contrast to older version lucene 6 uses docCount per field
to calculate the IDF not the # of docs in the index to overcome the sparse field
cases.
commit dac99fd64ac2fa71b8d8d106fe68825e574c49f8
Author: Robert Muir <rmuir@apache.org>
Date: Fri Mar 4 08:21:57 2016 -0500
don't hardcoded expected termquery score
commit 6e9f340ba49ab10eed512df86d52a121aa775b0f
Author: Robert Muir <rmuir@apache.org>
Date: Fri Mar 4 08:04:45 2016 -0500
suppress deprecation warning until migrated to points
commit 3ac8908424b3fdad44a90a4f7bdb3eff7efd077d
Author: Robert Muir <rmuir@apache.org>
Date: Fri Mar 4 07:21:43 2016 -0500
Remove invalid test: all commits have IDs, and its illegal to do this.
commit c12976288124ad1a26467e7e848fb810548e7eab
Author: Robert Muir <rmuir@apache.org>
Date: Fri Mar 4 07:06:14 2016 -0500
don't test with unsupported back compat
commit 18bbfe76128570bc70883bf91ff4c44c82d27817
Author: Robert Muir <rmuir@apache.org>
Date: Fri Mar 4 07:02:18 2016 -0500
remove now invalid lucene 4 backcompat test
commit 7e730e572886f0ef2d3faba712e4256216ff01ec
Author: Robert Muir <rmuir@apache.org>
Date: Fri Mar 4 06:58:52 2016 -0500
remove now invalid lucene 4 backwards test
commit 244d2ab6868ba5ac9e0bcde3c2833743751a25ec
Author: Robert Muir <rmuir@apache.org>
Date: Fri Mar 4 06:47:23 2016 -0500
use 6.0 codec
commit 5f64d4a431a6fdaa1234adca23f154c2a1de8284
Author: Robert Muir <rmuir@apache.org>
Date: Fri Mar 4 06:43:08 2016 -0500
compile, javadocs, forbidden-apis, etc
commit 1f273cd62a7fe9ca8f8944acbbfc5cbdd3d81ccb
Merge: cd33921 29e3443
Author: Simon Willnauer <simonw@apache.org>
Date: Fri Mar 4 10:45:29 2016 +0100
Merge branch 'master' into lucene6
commit cd33921ac742ef9fb351012eff35f3c7dbda7264
Author: Robert Muir <rmuir@apache.org>
Date: Thu Mar 3 23:58:37 2016 -0500
fix hunspell dictionary loading
commit c7fdbd837b01f7defe9cb1c24e2ec65604b0dc96
Merge: 4d4190f d8948ba
Author: Robert Muir <rmuir@apache.org>
Date: Thu Mar 3 23:41:53 2016 -0500
Merge branch 'master' into lucene6
commit 4d4190fd82601aaafac6b8254ccb3edf218faa34
Author: Robert Muir <rmuir@apache.org>
Date: Thu Mar 3 23:39:14 2016 -0500
remove nocommit
commit 77ca69e288b1a41aa9595c921ed166c272a00ea8
Author: Robert Muir <rmuir@apache.org>
Date: Thu Mar 3 23:38:24 2016 -0500
clean up numericutils vs legacynumericutils
commit a466d696fbaad04b647ffbc0857a9439b583d0bf
Author: Robert Muir <rmuir@apache.org>
Date: Thu Mar 3 23:32:43 2016 -0500
upgrade spatial4j
commit 5412c747a8cfe638bacedbc8233163cb75cc3dc5
Author: Robert Muir <rmuir@apache.org>
Date: Thu Mar 3 23:19:28 2016 -0500
move to 6.0.0-snapshot-8eada27
commit b32bfe924626b87e540692375ece09e7c2edb189
Author: Adrien Grand <jpountz@gmail.com>
Date: Thu Mar 3 11:30:09 2016 +0100
Fix some test compile errors.
commit 6ccde35e9840b03c68d1a2cd47c7923a06edf64a
Author: Adrien Grand <jpountz@gmail.com>
Date: Thu Mar 3 11:25:51 2016 +0100
Current Lucene version is 6.0.0.
commit f62e1015d931b4cc04c778298a8fa1ba65e97ad9
Author: Adrien Grand <jpountz@gmail.com>
Date: Thu Mar 3 11:20:48 2016 +0100
Fix compile errors in NGramTokenFilterFactory.
commit 6837c6eabf96075f743649da9b9b52dd39611c58
Author: Adrien Grand <jpountz@gmail.com>
Date: Thu Mar 3 10:50:59 2016 +0100
Fix the edge ngram tokenizer/filter.
commit ccd7f070de5efcdfbeb34b9555c65c4990bf1ba6
Author: Adrien Grand <jpountz@gmail.com>
Date: Thu Mar 3 10:42:44 2016 +0100
The missing value is now accessible through a getter.
commit bd3b77f9b28e5b05daa3d49683a9922a6baf2963
Author: Adrien Grand <jpountz@gmail.com>
Date: Thu Mar 3 10:41:51 2016 +0100
Remove IndexCacheableQuery.
commit 05f3091c347aeae80eeb16349ac51d2b53cf86f7
Author: Adrien Grand <jpountz@gmail.com>
Date: Thu Mar 3 10:39:43 2016 +0100
Fix compilation of function_score queries.
commit 81cda79a2431ac78f56b0cc5a5765387f662d801
Author: Adrien Grand <jpountz@gmail.com>
Date: Thu Mar 3 10:35:02 2016 +0100
Fix compile errors in BlendedTermQuery.
commit 70994ce8dd1eca0b995870974a38e20f26f96a7b
Author: Robert Muir <rmuir@apache.org>
Date: Wed Mar 2 23:33:03 2016 -0500
add bug ID
commit 29d4f1a71f36f646b5a6060bed3db019564a279d
Author: Robert Muir <rmuir@apache.org>
Date: Wed Mar 2 21:02:32 2016 -0500
easy .store changes
commit 5e1a1e6fd665fa455e88d3a8987362fad5f44bb1
Author: Robert Muir <rmuir@apache.org>
Date: Wed Mar 2 20:47:24 2016 -0500
cleanups mostly around boosting
commit 333a669ec6c305ada5645d13ed1da0e19ec1d053
Author: Robert Muir <rmuir@apache.org>
Date: Wed Mar 2 20:27:56 2016 -0500
more simple fixes
commit bd5cd98a1e089c866b6b4a5e159400b110140ce6
Author: Robert Muir <rmuir@apache.org>
Date: Wed Mar 2 19:49:38 2016 -0500
more easy fixes and removal of ancient cruft
commit a68f419ee47da5f9c9ce5b372f01d707e902474c
Author: Robert Muir <rmuir@apache.org>
Date: Wed Mar 2 19:35:02 2016 -0500
cutover numerics
commit 4ca5dc1fa47dd5892db00899032133318fff3116
Author: Robert Muir <rmuir@apache.org>
Date: Wed Mar 2 18:34:18 2016 -0500
fix some constants
commit 88710a17817086e477c6c021ec346d0534b7fb88
Author: Robert Muir <rmuir@apache.org>
Date: Wed Mar 2 18:14:25 2016 -0500
Add spatial-extras jar as a core dependency
commit c8cd6726583e5ce3f546ed355d4eca037164a30d
Author: Robert Muir <rmuir@apache.org>
Date: Wed Mar 2 18:03:33 2016 -0500
update to lucene 6 jars
No changes were really needed in our test infra as it didn't use `node.client`. Yet it didn't take into account ingest nodes, what we used to call client nodes in InternalTestCluster were actually ingest only nodes, which now become coordinating only nodes.
Also renamed some method to get rid of the node client terminology as much as possible in favour or coordinating only node.
The field name is a required argument for all suggesters, but
it was specified via a field() setter in SuggestionBuilder so far.
This changes field name to being a mandatory constructor argument
and lets suggestion builders throw an error if field name is missing
or the empty string.
This commit removes the ability to use string fields on indices created on or
after 5.0. Dynamic mappings now generate text fields by default for strings
but there are plans to also add a sub keyword field (in a future PR).
Most of the changes in this commit are just about replacing string with
keyword or text. Some tests have been removed because they existed because of
corner cases of string mappings like setting ignore-above on a text field or
enabling term vectors on a keyword field which are now impossible.
The plan is to remove strings entirely in 6.0.
Currently each suggestion keeps track of its own name. This has
the disadvantage of having to pass down the parsed name property
in the suggestions fromXContent() and the serialization methods
as an argument, since we need it in the ctor.
This change moves the naming of the suggestions to the surrounding
SuggestBuilder and by this eliminates the need for passind down
the names in the parsing and serialization methods. By making
`name` a required argument in SuggestBuilder#addSuggestion() we
also make sure it is always set and prevent using the same name twice,
which wasn't possible before.
The suffix TransportAction is misleading as it may make think that it extends TransportAction, but it does not. This class makes accessible the different search operations exposed by SearchService through the transport layer. Also resolved few compiler warnings in the class itself.
Also renamed histogram.AbstractBuilcer to AbstractHistogramBuilder, range.AbstractBuilder to AbstractRangeBuilder and org.elasticsearch.search.aggregations.pipeline.having to org.elasticsearch.search.aggregations.pipeline.bucketselector
The `keyword` field is intended to replace `not_analyzed` string fields. It is
indexed and has doc values by default, and doesn't support enabling term
vectors.
Although it doesn't support setting an analyzer for now, there are plans for
it to support basic normalization in the future such as case folding.
QueryBuilders today do all their heavy lifting in toQuery() which
can be too late for several operations. For instance if we want to fetch geo shapes
on the coordinating node we need to do all this before we create the actual lucene query
which happens on the shard itself. Also optimizations for request caching need to be done
to the query builder rather than the query which then in-turn needs to be serialized again.
This commit adds the basic infrastructure for query rewriting and moves the heavy lifting into
the rewrite method for the following queries:
* `WrapperQueryBuilder`
* `GeoShapeQueryBuilder`
* `TermsQueryBuilder`
* `TemplateQueryBuilder`
Other queries like `MoreLikeThisQueryBuilder` still need to be fixed / converted. The nice
sideeffect of this is that queries like template queries will now also match the request cache
if their non-template equivalent has been cached befoore. In the future this will allow to
add optimizataion like rewriting time-based queries into primitives like `match_all_docs` or `match_no_docs`
based on the currents shards bounds. This is especially appealing for indices that are read-only ie. never change.
This adds a build method for the SuggestionContext to the PhraseSuggestionBuilder
and another one that creates the SuggestionSearchContext to the top level
SuggestBuilder. Also adding tests that make sure the current way of parsing
xContent to a SuggestionContext is reflected in the output the builders create.
Groovy uses reflection to invoke closures. These reflective calls are optimized by the JVM after "sun.reflect.inflationThreshold" number of invocations.
After inflation, access to sun.reflect.MethodAccessorImpl is required from the security manager.
Closes#16536
There is no need for IndicesWarmer to be a global accessible class. All it needs
access to is inside IndexService. It also doesn't need to be mutable once it's not a per node
instance. This commit move IndicesWarmer to IndexWarmer and makes the default impls like field data and
norms warming an impl detail. Also the IndexShard doesn't depend on this class anymore, instead it accepts
an Engine.Warmer as a ctor argument which delegates to the actual warmer from the index.
This processor is useful when all elements of a json array need to be processed in the same way.
This avoids that a processor needs to be defined for each element in an array.
Also it is very likely that it is unknown how many elements are inside an json array.
Modules are an internal implementation detail of our build, and there is
no need to publish them with gradle. This change disables publishing of
all modules.
This change rewrites the entire settings filtering mechanism to be immutable.
All filters must be registered up-front in the SettingsModule. Filters that are comma-sparated are
not allowed anymore and check on registration.
This commit also adds settings filtering to the default settings recently added to ensure we don't render
filtered settings.
This splits the geo distance and geo distance sorting tests marked messy:
Test cases that don't really need Groovy support are moved back to the
core test suite closer to the code they actually test.
Relates to #15178
Renamed `ingest-with-mustache` to `smoke-test-ingest-with-all-dependencies`
Also renamed `ingest-disabled` to `smoke-test-ingest-disabled` so that the name is more inline with other qa smoke test modules.
This commit enableds strict settings validation on node startup. All settings
passed to elasticsearch either through system properties, yaml files or any other
way to pass settings must be registered and valid. Settings that are unknown ie. due to
typos or due to deprecation or removal will cause the node to NOT start up. Plugins
have to declare all their settings on the `SettingsModule#registerSetting` and settings for
plugins that are not installed must be removed.
This commit also removes the ability to specify the nodes name via `-Des.name` or just `name` in the
configuration files. The node name must be prefixed with the node prexif like `node.name: Boom`. Left over
usage of `name` will also cause startup to fail.
Adds to GeoDistanceSortBuilder:
* equals
* hashcode
* writeto/readfrom
* moves xcontent parsing logic over
* adds roundtrip tests
* fixes roundtrip test for xcontent by keeping points just as geopoints not geohashes internally
* fixes xcontent parsing of ignore_malformed if coerce is set/unset
* adds exception to sortMode setter to avoid setting invalid sort modes
Relates to #15178
As a prerequisite for refactoring the whole PhraseSuggestionBuilder
to be able to be parsed and streamed from the coordinating node, the
DirectCandidateGenerator must implement Writeable, be able to parse
a new instance (fromXContent()) and later when transported to the
shard to generate a PhraseSuggestionContext.DirectCandidateGenerator.
Also adding equals/hashCode and tests and moving DirectCandidateGenerator
to its own DirectCandidateGeneratorBuilder class.
This commit removes and forbids the use of lowercase ells ('l') in long
literals because they are often hard to distinguish from the digit
representing one ('1').
Closes#16329
When there is an exception thrown during pipeline creation within
Rest calls (in put pipeline, and simulate) We now return a structured
error response to the user with details around which processor's
configuration is the cause of the issue, or which configuration property
is misconfigured, etc.
Now MasterNodeOperations, ReplicationAllShards, ReplicationSingleShard, BroadcastReplication and BroadcastByNode actions keep track of their parent tasks.
In the early days Elasticsearch used to use the index name as the index identity. Around 1.0.0 we introduced a unique index uuid which is stored in the index setting. Since then we used that uuid in a few places but it is by far not the main identifier when working with indices, partially because it's not always readily available in all places.
This PR start to make a move in the direction of using uuids instead of name by making sure that the uuid is available on the Index class (currently just a wrapper around the name) and as such also available via ShardRouting and ShardId.
Note that this is by no means an attempt to do the right thing with the uuid in all places. In almost all places it falls back to the name based comparison that was done before. It is meant as a first step towards slowly improving the situation.
Closes#16217
Adding initial serialization methods (readFrom, writeTo) to the
PhraseSuggestionBuilder, also adding the base test framework for
serialiazation testing, equals and hashCode. Moving SuggestionBuilder
out of the global SuggestBuilder for better readability.
This commit method renames the ScriptEngineService interface methods
types, extensions, and sandboxed to getTypes, getExtensions, and
isSandboxed, respectively.
This commit converts the script mode settings to the new settings
infrastructure. This is a major refactoring of the handling of script
mode settings. This refactoring is necessary because these settings are
determined at runtime based on the registered script engines and the
registered script contexts.
Doc values can now only be enabled by setting `doc_values: true` in the
mappings. Removing this feature also means that we can now fail mapping updates
that try to disable doc values.
Parsing is currently very lenient, which has the bad side-effect that if you
have a typo and pass eg. `store: fasle` this will actually be interpreted as
`store: true`. Since mappings can't be changed after the fact, it is quite bad
if it happens on an index that already contains data.
Note that this does not cover all settings that accept a boolean, but since the
PR was quite hard to build and already covers some main settirgs like `store`
or `doc_values` this would already be a good incremental improvement.
This adds support for returning the size of an array
or an collection, in addition to access fields via
`array.0` you can specify `array.size` to get its size.
* Added a `content_type` option at compile time to decide how variable values are encoded. Possible values are `application/json` and `plain/text`. Defaults to `application/json`.
* Added support for variable placeholders to lookup values from specific slots in arrays/lists.
The rest test framework, because it used to be tightly integrated with
ESIntegTestCase, currently expects the addresses for the test cluster to
be passed using the transport protocol port. However, it only uses this
to then find the http address.
This change makes ESRestTestCase extend from ESTestCase instead of
ESIntegTestCase, and changes the sysprop used to tests.rest.cluster,
which now takes the http address.
closes#15459
We have two similar tests with the same name, ContextAndHeaderTransportTests.
They shared lots of common code so I extracted much of it into
ActionRecordingPlugin, a plugin which records all action requests for later
inspection.
I also removed all the warnings from both tests. That made lang-mustache
compile cleanly without any custom -Xlint so I removed those. To remove
the warnings I had to add type parameters to ActionFilter which seemed
like a good idea anyway.
1. Gets guice out of the business of building ScoreFunctionParsers and
QueryParsers.
2. Moves QueryParser registration to SearchModule
3. Moves NamedWriteableRegistry construction out of guice and into Node and
TransportClient.
4. Moves shape registration into SearchModule so now all named writeable
registration is done in the SearchModule.
This is breaking for plugin authors. Instead of declaring new QueryParser
like:
```java
public void onModule(IndicesModule module) {
module.registerQueryParser(NewQueryParser.class);
}
```
you do it like:
```java
public void onModule(SearchModule module) {
module.registerQueryParser(NewQueryParser::new);
}
```
The QueryParser's argument no longer come from @Inject, now they come from
the declaration in the plugin. The above example is for a no-arg QueryParser.
Most of the QueryParsers in Elasticsearch are no-arg.
ScoreFunctionParsers have a similar but slightly different change. This:
```java
public void onModule(SearchModule module) {
module.registerFunctionScoreParser(NewFunctionScoreParser.class);
}
```
becomes
```java
public void onModule(SearchModule module) {
module.registerFunctionScoreParser(new NewFunctionScoreParser());
}
```
Since all known ScoreFunctionParsers have no arg constructors its simpler to
just build them at registration time rather than specify a supplier that is
used to build them later.
Site plugins used to be used for things like kibana and marvel, but
there is no longer a need since kibana (and marvel as a kibana plugin)
uses node.js. This change removes site plugins, as well as the flag for
jvm plugins. Now all plugins are jvm plugins.
the environment is now available through NodeModule#getNode#getEnvironment and can be retrieved during onModule(NodeModule), no need for this indirection anymore using the BiFunction
* Folded IngestModule into NodeModule
* Renamed IngestBootstrapper to IngestService
* Let NodeService construct IngestService and removed the Guice annotations
* Let IngestService implement Closable
ContextAndHeaders has a massive impact on the core infrastructure since it has to
be manually passed on to all relevant places across threads/network calls etc. For the same reason
it's also very error prone and easily forgotten on potentially relevant APIs.
The new ThreadContext is associated with a ThreadPool (node or transport client) and ensures that
headers and context registered on a current thread are inherited to new threads spawned, send across
the network to be deserialized on the receiver end as well as restored on the response handling thread
once the response is received.
* Added percolator field mapper that extracts the query terms and indexes these terms with the percolator query.
* At percolate time these extracted terms are used to query percolator queries that are like to be evaluated. This can significantly cut down the time it takes to percolate. Whereas before all percolator queries were evaluated if they matches with the document being percolated.
* Changes made to percolator queries are no longer immediately visible, a refresh needs to happen before the changes are visible.
* By default the percolate api only returns upto 10 matches instead of returning all matching percolator queries.
* Made percolate more modular, so that it is easier to add unit tests.
* Added unit tests for the percolator.
Closes#12664Closes#13646
Adds task manager class and enables all activities to register with the task manager. Currently, the immutable Transport*Activity class represents activity itself shared across all requests. This PR adds and an additional structure Task that keeps track of currently running requests and can be used to communicate with these requests using TransportTaskAction.
Related to #15117
This removes the backward compatibility layer with pre-2.0 indices, notably
the extraction of _id, _routing or _timestamp from the source document when a
path is defined.
We have the Text API, which is essentially a wrapper around a String and a
BytesReference and then we have 3 implementations depending on whether the
String view should be cached, the BytesReference view should be cached, or both
should be cached.
This commit merges everything into a single Text that is essentially the old
StringAndBytesText impl.
Long term we should look into whether this API has any performance benefit or
if we could just use plain strings. This would greatly simplify all our other
APIs that currently use Text.
This fixes the `lenient` parameter to be `missingClasses`. I will remove this boolean and we can handle them via the normal whitelist.
It also adds a check for sheisty classes (jar hell with the jdk).
This is inspired by the lucene "sheisty" classes check, but it has false positives. This check is more evil, it validates every class file against the extension classloader as a resource, to see if it exists there. If so: jar hell.
This jar hell is a problem for several reasons:
1. causes insanely-hard-to-debug problems (like bugs in forbidden-apis)
2. hides problems (like internal api access)
3. the code you think is executing, is not really executing
4. security permissions are not what you think they are
5. brings in unnecessary dependencies
6. its jar hell
The more difficult problems are stuff like jython, where these classes are simply 'uberjared' directly in, so you cant just fix them by removing a bogus dependency. And there is a legit reason for them to do that, they want to support java 1.4.