In context of mapper attachment and other mapper plugins, when dealing with multi fields, sub fields never get the `externalValue` although it was set.
Here is a full script which reproduce the issue when used with mapper attachment plugin:
```
DELETE /test
PUT /test
{
"mappings": {
"test": {
"properties": {
"f": {
"type": "attachment",
"fields": {
"f": {
"analyzer": "english",
"fields": {
"no_stemming": {
"type": "string",
"store": "yes",
"analyzer": "standard"
}
}
}
}
}
}
}
}
}
PUT /test/test/1
{
"f": "VGhlIHF1aWNrIGJyb3duIGZveGVz"
}
GET /test/_search
{
"query": {
"match": {
"f": "quick"
}
}
}
GET /test/_search
{
"query": {
"match": {
"f.no_stemming": "quick"
}
}
}
GET /test/test/1?fields=f.no_stemming
```
Related to https://github.com/elasticsearch/elasticsearch-mapper-attachments/issues/57Closes#5402.
This is only applicable when the order is set to _count. The upper bound of the error in the doc count is calculated by summing the doc count of the last term on each shard which did not return the term. The implementation calculates the error by summing the doc count for the last term on each shard for which the term IS returned and then subtracts this value from the sum of the doc counts for the last term from ALL shards.
Closes#6696
This commit adds regular expression support for the allow-origin
header depending on the value of the request `Origin` header.
The existing HttpRequestBuilder is also extended to support the
OPTIONS HTTP method.
Relates #5601Closes#6891
The HTTP client implementation used by the Elasticsearch REST tests is
backed by apache http client instead of a self written helper class,
that uses HttpUrlConnection. This commit removes the old simple HttpClient
class and uses the more powerful and reliable one for all tests.
It also fixes a minor bug, that when sending a 301 redirect, a Location
header needs to be added as well, which was uncovered by the switching
to the new client.
Closes#7003
The bug reproduces when the point under test for the placement of the hole of the polygon has an x coordinate which only intersects with the ends of edges in the main polygon. The previous code threw out these cases as not relevant but an intersect at 1.0 of the distance from the start to the end of an edge is just as valid as an intersect at any other point along the edge. The fix corrects this and adds a test.
Closes#5773
This commit adds the ability to force blocking on the flush operaition
to make sure all files have been written and synced to disk. Without
this option a flush might be executing at the same time causing the
current flush to fail and return before all files being synced.
Closes#6996
Added two new interfaces:
1) IndicesRequest that allows to retrieve the indices the request relates to in a generic manner, together with the indices options that tell how they are going to get resolved and expanded
2) CompositeIndicesRequest for compound requests that hold multiple indices request like MultiSearchRequest, MultiGetRequest, MultiTermVectorsRequest, BulkRequest, BenchmarkRequest, PercolateRequest, MultiPercolateRequest and MoreLikeThisRequest
Taken the chance to streamline the indices options and add them to every request where it makes sense (although they can't be changed from the outside), rather than leaving them implicit in the related TransportAction when indices get expanded (tipycally MetaData#concreteIndices or MetaData#concreteSingleIndex). Added IndicesOptions parameter to MetaData#concreteSingleIndex to make sure it is taken from the request, where the information belongs, instead of hardcoded within MetaData. The concreteSingleIndex method remains but it's just a utility method that returns a single index instead of an array and complains otherwise.
Also made sure NPE is never thrown when setting indices(null) to IndicesAliasesRequest, similar to what SearchRequest does.
Closes#6933
The benchmark indexes 200 unique full-width longs. For uninverted field data
we try to use the most memory-efficient storage, and in that case it would use
two arrays: one for the doc->ordinals mapping and one for the ordinal->value
mapping. Which is slower than what doc values do by storing directly the
mapping from docs to values.
Before this change each aggregation had to output an object field with its name and write its JSON inside that object. This allowed for badly behaved aggregations which could write JSON content in the root of the 'aggs' object. this change move the writing of the aggregation name to a level above the aggregation itself, ensuring that aggregations can only write within there own scope in the JSON output.
Closes#7004
Today if the user supplies a custom missing value for a string sort,
we do it in an extremely slow way, not using ordinals but dereferencing
bytes for every document. Ordinals are only used if the missing value
is _first or _last.
Instead, use ordinals with custom missing values too.
Closes#7005
Single index operations to use the newly added IndexClosedException introduced with #6475. This way we can also fail faster when we are trying to execute operations on closed indices and their use is not allowed (depending on indices options). Indices blocks are still checked but we can already throw error while resolving indices (MetaData#concreteIndices).
Effectively this change also affects what we return when using one of the following apis: analyze, bulk, index, update, delete, explain, get, multi_get, mlt, term vector, multi_term vector. We now return `{"error":"IndexClosedException[[test] closed]","status":403}` instead of `{"error":"ClusterBlockException[blocked by: [FORBIDDEN/4/index closed];]","status":403}`.
Closes#6988
The default shard size in the terms aggregation now uses BucketUtils.suggestShardSideQueueSize() to set the shard size if the user does not specify it as a parameter.
Closes#6857
Allow users to control document collection termination, if a specified terminate_after number is
set. Upon setting the newly added parameter, the response will include a boolean terminated_early
flag, indicating if the document collection for any shard terminated early.
closes#6876
instead of a custom encoding in BINARY.
In low level benchmarks this is 2x to 5x faster: its also optimized
for the common case where fields actually only contain at most one
value for each document.
Additionally SORTED_NUMERIC doesn't lose values if they appear more
than once, so mathematical computations such as averages are correct.
Closes#6967
The recycling happening in facets is done manually and arrays are sometimes not
released. Aggregations do it in a less error-prone way by registering on to the
SearchContext.
This commit removes custom comparators in favor of the ones that are in Lucene.
The major change is for nested documents: instead of having a comparator wrapper
that deals with nested documents, this is done at the fielddata level by having
a selector that returns the value to use for comparison.
Sorting with custom missing string values might be slower since it is using
TermValComparator since Lucene's TermOrdValComparator only supports sorting
missing values first or last. But other than this particular case, this change
will allow us to benefit from improvements on comparators from the Lucene side.
Close#5980
This change just changes the default for index.codec.bloom.load to
false: with recent performance improvements to ID lookup, such as
#6298, bloom filters don't give much of a performance gain anymore,
and they can consume non-trivial RAM when there are many tiny
documents.
For now, we still index the bloom filters, so if a given app wants
them back, it can just update the index.codec.bloom.load to true.
Closes#6959