When a RestClient instance is closed, its connection manager is shut down and becomes unusable. Since the connection manager is currently static, no RestClient will work anymore from that moment on. Each RestClient instance should have its own connection manager instance.
This method is heavy as it builds a bitset out of a DocIdSet in order to be
able to provide random access. Now that Lucene has removed out-of-order scoring,
true random access is very rarely needed and we could instead return a Bits
instance that wraps the iterator. Ideally, we would use the DISI API directly
but I have to admit that the Bits API is more friendly.
Close#9546
We had a REST test that relied on matching a json response against a regex. It worked, but the match wasn't done against the actual json object; it was done against its java map representation converted into a string by calling `toString`. Since all the other clients' test runners don't support this case, as they try to match a json object against a regex, we should do the same and prevent it from working.
The current "checkindex" on startup is very very expensive. This is
like running one of the old school hard drive diagnostic checkers and
usually not a good idea.
But we can do a CRC32 verification of files. We don't even need to
open an indexreader to do this, it's much more lightweight.
This option (as well as the existing true/false values) is randomized in
tests to find problems.
Also fix a bug where use of the current option would always leak
an indexwriter lock.
Closes#9183
Histogram aggregation supports an 'offset' option to move bucket boundaries.
In a histogram with buckets of size X, the boundaries can be moved from
0, X, 2X, 3X, ... to Y, X+Y, 2X+Y, 3X+Y, ... by an offset value of Y using the 'offset' option.
The previous 'pre_offset' and 'post_offset' options are removed in favour of
the simplified 'offset' option.
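For example, a histogram over a hypothetical `price` field with interval 50 and bucket boundaries shifted by 5 might be requested like this (a sketch, not taken from the change itself):
```js
{
  "aggs" : {
    "prices" : {
      "histogram" : {
        "field" : "price",
        "interval" : 50,
        "offset" : 5
      }
    }
  }
}
```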
Closes#9417 Closes#9505
Until now, there was no way to expose information about configured
transport profiles. This commit adds the ability to expose that
information in the TransportInfo class.
The channel as well as the netty pipeline handler now also contain
the profile they were configured for, as this information cannot be
extracted elsewhere.
In addition, each profile now can set its own publish host and port,
which might be needed in case of port forwarding or when using docker.
Closes#9134
This is similar to refresh: if we fail to commit the data we have to fail the
engine since the in-ram data is likely discarded. It is still in the translog and might
be recoverable when the node is restarted, but we have to treat the engine as failed.
This callback is executed only once, on the master node during an
index's creation. An exception thrown by this listener will cancel
the index creation.
This also adds checks in `IndicesClusterStateService` for the
indexService being null, as well as for the case where
`indicesService.createIndex` throws an exception on data nodes after
an index has already been created.
#8720 introduced a timeout mechanism for ongoing recoveries, based on a last access time variable. In the many iterations on that PR the update of the access time was lost. This adds it back, including a test that should have been there in the first place.
Closes#9506
The query cache has an optimization to not deserialize the bytes at the shard
level. However this is a bit fragile since it assumes that serialized streams
can be concatenated (which is not the case with shared strings) and it also does
not update the QueryResult object that is held by the SearchContext, so you
need to make sure to use the right one.
With this change, the query cache just deserializes bytes into the QueryResult
object from the context.
Close#9500
Due to some unreleased refactorings we lost the persistence of
previously set values in MergePolicyProvider. This commit adds this
back and adds a simple unit test.
Closes#8890
Currently, doing a field lookup within a terms agg will restrict the
fields available to those within the types passed into the search
request. However, when doing sub aggs within a children agg, the
fields available should not be restricted to those of the search.
This change makes the field lookup use the index level mapper service.
The optimization we do in the HandlesStreamInput / Output
adds a lot of complexity with a rather unknown benefit. It tries
to compress commonly used strings and write ids instead. This
should rather be done on a lower level, if at all necessary, for
the small messages we send over the network.
Today we have a dirty flag indicating that a refresh must
be executed. We also allow users to bypass this by setting
a force=true boolean on the refresh request / command. All
these flags are unneeded since the SearcherManager has all
the information to do the right thing, whether it's dirty or not.
BytesStreamOutput allows to pass the expected size but by default uses
BigArrays.PAGE_SIZE_IN_BYTES, which is 16k. A common cached result, i.e.
a date histogram with 3 buckets, is ~100 bytes, so 16k might be very wasteful
since we don't shrink to the actual size once we are done serializing.
By passing 512 as the expected size we will resize the byte array in the stream
slowly until we hit the page size and don't waste too much memory for small query
results.
This pr removes the optimization for auto generated ids.
Previously, when ids were auto generated by elasticsearch, there was no
check to see if a document with the same id already existed; instead the new
document was simply appended. However, due to lucene improvements this
optimization does not add much value. In addition, under rare circumstances it
might cause duplicate documents:
When an indexing request is retried (due to connection loss, node closed etc),
a flag 'canHaveDuplicates' is set to true for the indexing request
that is sent the second time. This was to make sure that even
when an indexing request for a document with an autogenerated id comes in,
we do not have to update unless this flag is set, and instead only append.
However, it might happen that for a retry or for the replication the
indexing request that has canHaveDuplicates set to true (the retried request)
arrives at the destination before the original request that has it set to false.
In this case both requests add a document and we end up with a duplicated
document.
This commit adds a workaround: remove the optimization for auto
generated ids and always update the document.
The assumption is that this will not slow down indexing by more than 10 percent,
see: http://benchmarks.elasticsearch.org/
closes#8788 closes#9468
today we use the length of the BytesReference which is misleading since
the reference is paged such that length != ramBytesUsed. This can lead
to a way higher memory consumption than expected if query results are tiny,
since each query result requires at least 16kb. Yet, we should rethink this
strategy for query results that are very small, i.e. less than 20% of the ramBytesUsed,
but this commit first tries to make the accounting correct.
The `analyzer` setting is now the base setting, and `search_analyzer`
is simply an override of the search time analyzer. When setting
`search_analyzer`, `analyzer` must be set.
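A minimal mapping sketch (hypothetical field name) of the new relationship between the two settings:
```js
{
  "type1" : {
    "properties" : {
      "title" : {
        "type" : "string",
        "analyzer" : "standard",
        "search_analyzer" : "whitespace"
      }
    }
  }
}
```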
closes#9371
Using the `script.groovy.sandbox.method_blacklist_patch` setting, the
blacklist can be dynamically *added* to by specifying a comma-separated
list of methods (for example, "toString,size" would add .toString and
.size to the blacklist).
When the `script.groovy.sandbox.method_blacklist_patch` setting is
changed, the script cache is cleared to force new scripts to be
recompiled. Additionally the on-disk cache is cleared so that scripts in
the `config/scripts` directory are re-compiled as well.
This setting can also be specified in elasticsearch.yml if desired, to
pre-populate the list of methods to be added to the default blacklist.
When making a change to this setting dynamically, the entire blacklist
is logged as well.
This also fixes an issue where script engines were injected more than
once, which can cause multiple instances of the script engine per node.
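As a sketch, the patch could be applied dynamically through the cluster settings API (values illustrative):
```js
PUT /_cluster/settings
{
  "transient" : {
    "script.groovy.sandbox.method_blacklist_patch" : "toString,size"
  }
}
```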
Extended_stats now displays the upper and lower bounds on standard deviations (e.g. avg +/- std).
The default is to show 2 std above/below, but this can be changed using the `sigma` parameter,
which accepts non-negative doubles.
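A request sketch asking for bounds at three standard deviations (field name hypothetical):
```js
{
  "aggs" : {
    "grade_stats" : {
      "extended_stats" : {
        "field" : "grade",
        "sigma" : 3
      }
    }
  }
}
```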
Closes#9356
This change makes InternalHistogram the only InternalAggregation used by the Histogram Aggregator. There is still a separate Bucket implementation and Factory implementation. All buckets are created through the factory passed into the InternalHistogram, meaning the correct factory implementation is serialised as part of the aggregation to make sure the correct bucket types are always generated.
This is needed by the Transformers (namely the derivative transformer) to allow it to generate buckets of the right type without having to know what the underlying bucket implementation is.
To properly replicate, we currently stop flushing during recovery so we can replay the translog once copying files is done. Once recovery is done, the translog will be flushed by a background thread that, by default, kicks in every 5s. In case of a recovery failure and a quick re-assignment of a new shard copy, we may fail to flush before starting a new recovery, causing it to deal with a potentially even longer translog. This commit makes sure we flush immediately when the ongoing recovery count goes to 0.
I also added a simple recovery benchmark.
Closes#9439
If an index is deleted during the initial state of the snapshot operation, the entire snapshot can fail with an NPE. This commit improves handling of this situation and allows the snapshot to continue if partial snapshots are allowed.
Closes#9024
PR #8672 addresses ambiguous polygons - those that either cross the dateline or span the map - by complying with the OGC standard right-hand rule. Since ```GeoPolygonFilter``` is self contained logic, the fix in #8672 did not address the issue for the ```GeoPolygonFilter```. This was identified in issue #5968
This fixes the ambiguous polygon issue in ```GeoPolygonFilter``` by moving the dateline crossing code from ```ShapeBuilder``` to ```GeoUtils``` and reusing the logic inside the ```pointInPolygon``` method. Unit tests are added to ensure support for coordinates specified in either standard lat/lon or great-circle coordinate systems.
closes#5968 closes#9304
The InternalClusterInfoService reaches out to the nodes to get information about their disk usage and shard store size. Upon a node level error we currently remove the node info from the local cache. We should also clear the cache when we run into an error on the action level (excluding any info from all nodes).
This also adds settings for the timeout used when waiting for nodes.
Closes#9449
When we renamed all of the transport actions in #7105, shard started and failed were flipped around by mistake. This commit fixes their naming.
Closes#9440
This change makes the response API object for Histogram Aggregations the same for all types of Histogram, and does the same for all types of Ranges.
The change removes getBucketByKey() from all aggregations except filters and terms. It also reduces the methods on the Bucket class to just getKey() and getKeyAsString().
The getKey() method returns Object, and the actual type it returns will be appropriate for the type of aggregation being run, e.g. date_histogram will return a DateTime for this method and Histogram will return a Number.
Apparently some filesystems such as ZFS and occasionally NTFS can report
filesystem usages that are negative, or above the maximum total size of
the filesystem. This relaxes the constraints on `DiskUsage` so that an
exception is not thrown.
If 0 is passed as the totalBytes, `.getFreeDiskAsPercentage()` will
always return 100.0% free (to ensure the disk threshold decider fails
open).
Fixes#9249
Relates to #9260
This bug was introduced by #8454 which allowed the childFilter to only be consumed once. By adding the child docid buffering multiple buckets can now be emitted by the same doc id. This child docid buffering only happens in the scope of the current root document, so the amount of child doc ids buffered is small.
Closes#9317 Closes#9346
The fix is to move the parent filter resolution from the nextReader(...) method to the collect(...) method, because only then is any parent nested filter's parent filter properly instantiated.
Closes#9280 Closes#9335
We want to check if at least the primaries succeeded if we do not
wait for green and not if all succeeded if we wait for green.
That was a misconception in c617af37e8
Closes#9402.
Squashed commit of the following:
commit 85c71b6478441a73738c81f02257193f9837f3ba
Author: Robert Muir <rmuir@apache.org>
Date: Sat Jan 24 11:24:36 2015 -0500
upgrade to lucene r1654549 snapshot
Requests are sent to two shard copies in case a shard is relocating.
This will show up in the _shards header. Therefore we must check
with greaterThanOrEqualTo(..).
Adding missing support for the multi-index query parameters 'ignore_unavailable',
'allow_no_indices' and 'expand_wildcards' to the '_cluster/state' API. These
parameters are supposed to be supported for APIs that work across multiple indices.
So far, overriding the default settings per REST call was not possible, which is
fixed here.
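For example (index pattern hypothetical), the defaults can now be overridden per call:
```js
GET /_cluster/state/metadata/logs-*?ignore_unavailable=true&allow_no_indices=true&expand_wildcards=open
```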
Closes#5229 Closes#9295
These two tests are confusing because they have the same class name in
different packages. This results in accidentally looking at the wrong
file when trying to open the test by class name. They are also
not "simple"..
Related to #9049.
By default, the value for `_timestamp` is `now`, which means the date the document was processed by the indexing chain.
You can now reject documents which do not provide a `_timestamp` value by setting `ignore_missing` to false (defaults to `true`):
```js
{
"tweet" : {
"_timestamp" : {
"enabled" : true,
"ignore_missing" : false
}
}
}
```
When you update the cluster to 1.5 or master, an index created with 1.4 is automatically migrated to the 1.5 syntax.
Let's say you have defined this in elasticsearch 1.4.x:
```js
DELETE test
PUT test
{
"settings": {
"number_of_shards": 1,
"number_of_replicas": 0
}
}
PUT test/type/_mapping
{
"type" : {
"_timestamp" : {
"enabled" : true,
"default" : null
}
}
}
```
After migration, the mapping becomes:
```js
{
"test": {
"mappings": {
"type": {
"_timestamp": {
"enabled": true,
"store": false,
"ignore_missing": false
},
"properties": {}
}
}
}
}
```
Closes#8882.
At startup the tribe node ignores closed indices, but if an index that was part of the tribe node cluster state got closed, that state change was not handled. A NullPointerException could be seen in the logs instead, as the routing table for the closed index was null. As a result, the index stayed in the tribe node cluster state in open state, although that didn't reflect reality. Also, subsequent cluster state updates happening in the tribe node kept failing, affecting updates related to any other index. The only way to recover from this was to restart the tribe node every time an index was closed on any tribe.
This commit properly handles index state changes, making sure that when an index gets closed it gets removed from the tribe node cluster state. Note that it makes little sense to keep the closed index around in the tribe node, as from the tribe node you can't do anything with it. The tribe node simply doesn't see any closed indices; it's the same as if they didn't exist.
Closes#6411 Closes#9334
This allows a plugin or user that registers a listener to be able to
stop actions like creating an index or starting a shard by throwing an
exception. Previously all exceptions were logged without being rethrown.
Before, if both a filter and a query were defined for function_score, then the
filter was silently ignored. Now, if both are defined, the function score
query wraps them in a filtered_query.
closes#8638 closes#8675
ShapeBuilder threw an NPE when a polygon coordinate array consisted of a single LinearRing. This PR fixes the error handling to throw a more useful ElasticsearchParseException, to provide the user with better insight into the problem.
This adds a new boolean (index.merge.scheduler.auto_throttle) dynamic
setting, default true (matching Lucene), to adaptively set the IO rate
limit for merges over time.
This is more flexible than the previous fixed rate throttling because
it responds depending on the incoming merge rate, so search-heavy
applications that are not doing much indexing will see merges heavily
throttled while indexing-heavy cases will lighten the throttle so
merges can keep up with incoming indexing.
The fixed rate throttling is still available as a fallback if things
go horribly wrong.
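A sketch of turning the adaptive throttle off dynamically for one index (index name hypothetical):
```js
PUT /test/_settings
{
  "index.merge.scheduler.auto_throttle" : false
}
```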
Closes#9243 Closes#9133
previous push was partial by mistake; we still need the wrapped dirs around after being closed for the test infra. For now, explicitly clear them in the leak test (which is still bad apple).
I ran the bad apple test for index memory leaks and still saw leaks. It seems like we don't properly clean the dirs from the static mock test dir wrapper.
`cluster.routing.allocation.disk.include_relocations` and
`cluster.routing.allocation.disk.reroute_interval` are both boolean
settings, so they should have the `Validator.BOOLEAN` applied.
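With the validator in place, a dynamic update such as this sketch (value illustrative) is checked to be a proper boolean before being accepted:
```js
PUT /_cluster/settings
{
  "transient" : {
    "cluster.routing.allocation.disk.include_relocations" : true
  }
}
```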
Fixes#9309
Add validation tests for Version constants.
Currently the snapshot flag for Version constants is only set to true
for CURRENT. However, this means that the snapshot state changes from
branch to branch. Instead, snapshot should be "is this version
released?". This change also adds a validation test checking that
ID -> constant and vice versa are correct, and fixes one bug found there
(for an unreleased version).
- don't allow for soft references anymore in the recycler
- remove some abusive thread locals
- don't recycle independently float/double and int/long pages, they are the
same and just interpret bits differently.
Close#9272
BucketAggregationMode used to be part of the framework, now it's only an
implementation detail of the terms, histogram, geohash grid and scripted
aggregators.
Aggregator.estimatedBucketCount() was a complicated way to do the initial sizing
of the data structures, but it did not work very well in practice and was rather
likely to lead to over-sized data-structures causing OOMEs. It's removed now and
all data-structures start with a size of 1 and grow exponentially.
Aggregator.preCollection() is now symmetric with postCollection(): it exists on
all aggregation objects where postCollection() also is and recursively calls its
children.
Fixed other minor issues related to generics and exceptions.
Close#9097
The query cache has a mechanism that disables it automatically when
SearchContext.nowInMillis() is used. One issue with that is that the date math
parser always evaluates the current timestamp when parsing a date, even if it
is not needed. As a consequence, whenever you use a date expression in your
queries, the query cache would not be used.
Close#9225
Some of our Java API requests have public setters but their corresponding getters are package private only. This commit makes those getters public as well.
Closes#9273
This change fixes _timestamp's serialization method to write out
`doc_values` and `doc_values_format`, which could already be set,
but would not be written out.
closes#8893 closes#8967
Today we give the HTTP status back within the HTTP response itself and within the JSON response as well:
```sh
curl localhost:9200/
```
```js
{
"status" : 200,
"name" : "Red Wolf",
"version" : {
"number" : "2.0.0",
"build_hash" : "6837a61d8a646a2ac7dc8da1ab3c4ab85d60882d",
"build_timestamp" : "2014-08-19T13:55:56Z",
"build_snapshot" : true,
"lucene_version" : "4.9"
},
"tagline" : "You Know, for Search"
}
```
This commit adds a test that simulates disconnecting nodes and dropping requests during the various stages of recovery, and solves all the issues that were raised by it. In short:
1) Ongoing recoveries will be scheduled for retry upon network disconnect. The default retry period is 5s (cross node connections are checked every 10s by default).
2) Sometimes the disconnect happens after the target engine has started (but the shard is still in recovery). For simplicity, I opted to restart the recovery from scratch (where little to no files will be copied again, because they were just synced).
3) To protect against dropped requests, a Recovery Monitor was added that fails a recovery if no progress has been made in the last 30m (by default), which is equivalent to the long timeouts we use in recovery requests.
4) When a shard fails on a node, we try to assign it to another node. If no such node is available, the shard will remain unassigned, causing the target node to clean any in-memory state for it (files on disk remain). At the moment the shard will remain unassigned until another cluster state change happens, which will re-assign it to the node in question, but if no such change happens the shard will remain stuck at unassigned. The commit adds an extra delayed reroute in such cases to make sure the shard will be reassigned.
5) Moved all recovery related settings to RecoverySettings.
Closes#8720
You can now specify `format` in the request definition for most numeric metric aggregations. The exceptions are percentile_ranks, cardinality and value_count, as their response type can be different from the field type, so the formatter won't work.
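A request sketch (field name and format pattern hypothetical) of the new option on an avg aggregation:
```js
{
  "aggs" : {
    "avg_price" : {
      "avg" : {
        "field" : "price",
        "format" : "#,##0.00"
      }
    }
  }
}
```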
Closes#6812
Today the internal engine closes itself if it hits an exception
it can not recover from. This complicates a lot of refcounting issues
if such an exception happens during engine creation. This commit
only marks the engine as failed and lets the user close it once the exception
bubbles up. Additionally it rolls back the indexwriter to prevent any changes after
the engine is failed.
I have a field with a `null` [default `_timestamp` value](http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/mapping-timestamp-field.html#mapping-timestamp-field-default) and when I try to update the mapping I get a server error caused by a `NullPointerException`
```
[2015-01-08 17:28:56,040][DEBUG][action.admin.indices.mapping.put] [...] failed to put mappings on indices [[feed_170_v1, feed_204_v1, feed_229_v1, feed_232_v1, feed_239_v1, feed_248_v1, feed_268_v1, feed_256_v1, feed_272_v1, feed_159_v1, feed_255_v1, feed_164_v1, feed_259_v1, feed_266_v1, feed_188_v1, feed_240_v1, feed_233_v1, feed_13_v1, feed_184_v1, feed_261_v1, feed_267_v1, feed_271_v1, feed_257_v1, feed_172_v1, feed_238_v1, feed_254_v1, feed_223_v1, feed_274_v1, feed_203_v1, feed_269_v1, feed_262_v1, feed_205_v1, feed_168_v1, feed_219_v1, feed_253_v1, feed_251_v1, feed_173_v1, feed_252_v1, feed_210_v1, feed_216_v1, feed_218_v1, feed_118_v1, feed_273_v1, feed_227_v1, feed_166_v1, feed_213_v1, feed_226_v1]], type [history]
java.lang.NullPointerException
at org.elasticsearch.index.mapper.internal.TimestampFieldMapper.merge(TimestampFieldMapper.java:287)
at org.elasticsearch.index.mapper.object.ObjectMapper.merge(ObjectMapper.java:936)
at org.elasticsearch.index.mapper.DocumentMapper.merge(DocumentMapper.java:693)
at org.elasticsearch.cluster.metadata.MetaDataMappingService$4.execute(MetaDataMappingService.java:508)
at org.elasticsearch.cluster.service.InternalClusterService$UpdateTask.run(InternalClusterService.java:329)
at org.elasticsearch.common.util.concurrent.PrioritizedEsThreadPoolExecutor$TieBreakingPrioritizedRunnable.run(PrioritizedEsThreadPoolExecutor.java:153)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
```
https://github.com/elasticsearch/elasticsearch/blob/v1.4.2/src/main/java/org/elasticsearch/index/mapper/internal/TimestampFieldMapper.java#L286
Looks like the existence of the default timestamp is not checked before use. The next line also has the same issue -- it uses the default timestamp without checking whether it's null.
To reproduce:
```
$ curl -XPUT localhost:9200/twitter2
$ curl -XPUT localhost:9200/twitter2/tweet/_mapping -d '{
"tweet" : {
"_timestamp" : {
"enabled" : true,
"default" : null
}
}
}'
$ curl -XPUT localhost:9200/twitter2/tweet/_mapping -d '{
"tweet" : {
"_timestamp" : {
"enabled" : true,
"default" : null
},
"properties": {
"user": {"type": "string"}
}
}
}'
```
Closes#9204.
(cherry picked from commit 62c6d63)
Before Elasticsearch 1.0, the type was allowed to be passed as the root
element when uploading a document. However, this was ambiguous if the
mappings also contained a field with the same name as the type. The
behavior was changed in 1.0 to not allow this, but a setting was added
for backwards compatibility. This change removes the setting for 2.0.
The header indicates how many shard copies (primary and replica shards) the write was supposed to go to, on how many
shard copies it succeeded, and potentially captures shard failures if writing to a replica shard fails.
For async writes it also includes the number of shards for which the write is still pending.
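Roughly, the header in a write response looks like this (values illustrative):
```js
{
  "_shards" : {
    "total" : 2,
    "successful" : 2,
    "failed" : 0
  }
}
```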
Closes#7994
This commit removes most of the Engine abstractions and removes
Engine exposure via dependency injection. It also removes the Holder
abstraction and makes the engine itself start at construction time.
It removes the start method from the engine entirely, which means no
engine instance exists that is not started. There is also no way to stop the
engine and restart it; it needs to be an entirely new Engine.