- "boost" should be "boost_factor"
- "mult" should be "multiply"
Also, store combine function names in ImmutableMap instead of iterating
over all possible names each time.
closes#3872 for master
SynonymFilters produces token streams with stacked tokens such that
conjunction queries need to be parsed in a special way such that the
stacked tokens are added as an innner disjuncition.
Closes#3881
The array holding the payloads (TermVectorFields.payloads) is reused for each token. If the
previous token had payloads but the current token had not, then the payloads of the previous
token were returned, because the payloads of the previous token were never invalidated.
For example, for a field only contained two tokens each occurring once, the first having a
payload and the second not, then for the second token, the payload of the first was returned.
closes#3873
Extended the specialized simplified mapping source method to support metadata mapping fields. These fields can just specified as normal fields, but will automatically be placed as top level mapping field.
Closes#3818
Return a merge_id element in each segment of the segments API, allowing to group segments that are being merged as part of a single merge and indicate which ones are being merged now.
closes#3904
Some tests use AbstractIntegrationTest#indexRandom which sometimes uses async
indexing. This can easily run into queue size based rejections on a slow
box. In that case we should retry blocked indexing.
Now that we properly fixed the ability to set the queue size on the index / bulk thread pool, we should actually set them to a somehow reasonable value to protect from users potentially overflowing our system.
I suggest defaults to be 50 for bulk, and 200 for indexing.
Also, set the thread pool for get, which we should set (in a similar value to a "read" queue size we have today).
closes#3888
We want to make sure that Plugin Manager still downloading plugins from internet.
New tests requires internet access (`@Network` annotation has been added).
By default, tests annotated with `@Network` are not launched.
If you need to run these tests, use `-Dtests.network=true` option.
Closes#3894.
This commit adds support for random merge policies set for every
index created in an AbstractIntegrationTest. It will either set
'logbyte', 'logdoc' or 'tiered' merge policy as well as a random
value for compound files.
Added checks to IndexRequest#source(Object...) to ensure and even number
of parameters. This method otherwise throws an AIOOBException which is
confusing to users and doesn't report the root cause of the problem.
This commit allows for using Lucene doc values as a backend for field data,
moving the cost of building field data from the refresh operation to indexing.
In addition, Lucene doc values can be stored on disk (partially, or even
entirely), so that memory management is done at the operating system level
(file-system cache) instead of the JVM, avoiding long pauses during major
collections due to large heaps.
So far doc values are supported on numeric types and non-analyzed strings
(index:no or index:not_analyzed). Under the hood, it uses SORTED_SET doc values
which is the only type to support multi-valued fields. Since the field data API
set is a bit wider than the doc values API set, some operations are not
supported:
- field data filtering: this will fail if doc values are enabled,
- field data cache clearing, even for memory-based doc values formats,
- getting the memory usage for a specific field,
- knowing whether a field is actually multi-valued.
This commit also allows for configuring doc-values formats on a per-field basis
similarly to postings formats. In particular the doc values format of the
_version field can be configured through its own field mapper (it used to be
handled in UidFieldMapper previously).
Closes#3806
Migrate SearchContext.Rewrite to Releasable. All the parent child queries are now implemented as Lucene queries and the state is kept in the Weight.
Completely disable caching for `has_child` and `has_parent` filters, this has never worked and it also can also never work. The matching docIds are cached per segment while the collection of parent ids is top level.
Closes#3822
* Merged segments are now warmed-up at the end of the merge operation instead
of _refresh, so that _refresh doesn't pay the price for the warm-up of merged
segments, which is often higher than flushed segments because of their size.
* Even when no _warmer is registered, some basic warm-up of the segments is
performed: norms, doc values (_version). This should help a bit people who
forget to register warmers.
* Eager loading support for the parent id cache and field data: when one
can't predict what terms will be present in the index, it is tempting to use
a match_all query in a warmer, but in that case, query execution might not be
much faster than field data loading so having a warmer that only loads field
data without running a query can be useful.
Closes#3819
I tried to install a plugin directly from disk, but used the wrong arguments. Totally my fault for not RTFM properly.
However, by following the instructions printed out by `bin/plugin`, I ended up deleting my $ES/bin directory.
```
$ pwd
/tmp/elasticsearch-0.90.5
$ ls
LICENSE.txt NOTICE.txt README.textile bin config lib
$ bin/plugin --install file:///tmp/foo.zip
-> Installing file:///tmp/foo.zip...
Failed to install file:///tmp/foo.zip, reason: plugin directory /tmp/elasticsearch-0.90.5/plugins already exists. To update the plugin, uninstall it first using -remove file:///tmp/foo.zip command
$ bin/plugin -remove file:///tmp/foo.zip
-> Removing file:///tmp/foo.zip
Removed file:///tmp/foo.zip
$ ls
LICENSE.txt NOTICE.txt README.textile config lib
```
I reproduced the problem in 0.90.5 and the latest master.
Closes#3847.
Adding the mappers first could cause the wrong FieldMappers to be used during object parsing
Also:
Removed MergeContext.newFieldMappers, MergeContext.newObjectMappers and relatives as they are not used anymore.
Similarly removed MergeContext.addFieldDataChange & fieldMapperChanges as they were not used.
Closes#3844
Due to a change in 0.90 some exceptions may show up when starting a river,
even though this is not a problem as the shard has not been initialized
yet and a retry is scheduled.
Only valid fields (input, output, weight and payload) are now allowed inside a completion field.
This makes sure that typos like "inputs" are caught on indexing.
When parsing a filter, we effectively have a case where we need to ignore a filter. For example, when an "and" filter has no elements. Ignoring a filter is different compared to matching on all or matching on none, because an and filter with no elements should simply be ignored in the context of a bool filter for example, regardless if its used within a must or a must_not filter.
We should be consistent in our codebase when handling these use case. See also #3809closes#3838
Added a load_average_format=array|hash param to OsStats to allow serializing load averages into a hash (with keys 1m,5m, 15m).
Added node_info_format to NodeStats, allowing to suppress node info output
Also:
We always return a valid json back
If there are no templates define we return 200 OK ( as opposed to a request for a specific template id which returns 404)
Closes#3812
If 'pattern_capture' tokenfilter is create / mapped without a 'patterns'
settings we now throw an exception since this is a misconfiguration and
likely due to the similar settings on related token filters.
Closes#3808
If nodes are shutting down we close thread pools and throw
'EsRejectedExcutionException'. This commit handles these exceptions
gracefully if throw during Ping execution.
Index settings can now override default behavior for 'double write'
and 'no delete open files' on MockDirectoryWrapper if tests use these
options in a legit. way. ie. in a restore situation a double write
is a legit operation.
This can go wrong if indices with the same name are repeatably created and deleted.
UUIDs can not be null anymore. If UUID is not available `_na_` will be used as a value.
Also - some minor clean up in ShardStateAction where shard started events could be added twice to the to-be-applied list where the second instance will be ignored.
Closes#3783
instead of relying on the current cluster state from the cluster service, make sure to rely on the cluster state we get from the change event, this will allow us to move processing of the cluster event around potentially to before the local cluster state has been updated
Currently we fail tests is any searcher reference is pending. Yet,
on a slow machine the freeContext calls that are async could still be
in flight so if there are pending searchers we wait for a bit to make
sure we don't fail if a freeContext call is in flight.
The MockEngine now also contains the stack trace of the first close call
if a searcher is closed twice.
DateHistogramFacets can be used on fields that store dates in a numeric
(long) field using different resolutions like seconds instead of
milliseconds. This commit adds a test that checks if the factor is
applied correctly to scale it up to milliseconds.