* In code, we mark `River`, `AbstractRiverComponent`, `RiverComponent` and `RiverName` classes as deprecated
* We log that information when a cluster is still using it
* We add this information in the plugins list as well
Today we check every regular expression eagerly against every possible term.
This can be very slow if you have lots of unique terms, and even the bottleneck
if your query is selective.
This commit switches to Lucene regular expressions instead of Java (not exactly
the same syntax yet most existing regular expressions should keep working) and
uses the same logic as RegExpQuery to intersect the regular expression with the
terms dictionary. I wrote a quick benchmark (in the PR) to make sure it made
things faster and the same request that took 750ms on master now takes 74ms with
this change.
Close #7526
The refactoring in #9544 introduced a regression that broke multi-level
aggregations using breadth-first. This was due to sub-aggregators creating
deferred collectors before their parent aggregator and then the parent
aggregator trying to collect sub aggregators directly instead of going through
the deferred wrapper.
This commit fixes the issue but we should try to simplify all the pre/post
collection logic that we have.
Also `breadth_first` is now automatically ignored if the sub aggregators need
scores (just like we ignore `execution_mode` when the value does not make sense
like using ordinals on a script).
Close #9823
To ensure subclasses like MockInternalEngine, which is in a different
package (test.engine), are logging under the same logger name, this commit
moves to a static logger class to determine the logger name. This way
all subclasses of engine will log under `index.engine`, which also plays
nicely with `@TestLogging`, where log messages sometimes disappeared since
they were enabled for the `index.engine` package but not for `test.engine`.
If the translog is buffered we must make sure everything is synced to disk
before we roll back the writer, otherwise we open a window for potential data loss due
to stupid errors preventing the translog from being closed.
Allows the user to calculate a Moving Average over a histogram of buckets. Provides four different
moving averages:
- Simple
- Linear weighted
- Single Exponentially weighted (aka EWMA)
- Double Exponentially weighted (aka Holt-winters)
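As a rough illustration of the first three variants, here is a minimal sketch using the textbook definitions. The class and method names are hypothetical and this is not necessarily how the aggregation itself computes its values; the Holt (double exponential) variant is left out for brevity.

```java
// Hypothetical helper, not part of the aggregation code: textbook moving averages over one window.
final class MovingAverages {
    // Simple: unweighted mean of the window.
    static double simple(double[] window) {
        double sum = 0;
        for (double v : window) {
            sum += v;
        }
        return sum / window.length;
    }

    // Linear weighted: the oldest value gets weight 1, the newest gets weight n.
    static double linearWeighted(double[] window) {
        double weighted = 0;
        double weights = 0;
        for (int i = 0; i < window.length; i++) {
            int w = i + 1;
            weighted += w * window[i];
            weights += w;
        }
        return weighted / weights;
    }

    // Single exponential (EWMA): seeded with the first value, then smoothed with factor alpha.
    static double ewma(double[] window, double alpha) {
        double avg = window[0];
        for (int i = 1; i < window.length; i++) {
            avg = alpha * window[i] + (1 - alpha) * avg;
        }
        return avg;
    }
}
```

All three assume a non-empty window; a larger `alpha` makes the EWMA react faster to recent values.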
Closes #10024
For backwards compatibility reasons, routing_nodes were previously printed out together with the actual routing_table whenever routing_table was requested. Now they are printed out only when requested through the `routing_nodes` flag.
Relates to #10412
Closes #10486
The cluster state API returns both the routing_table and routing_nodes sections whenever routing_table is requested. That is pretty much the same info, just grouped differently. This commit makes it possible to differentiate between the two. Yet, routing_table still returns both for bw comp reasons.
Closes #10352
Closes #10412
Removed the following methods from `ScriptService`, which don't require the `ScriptContext` argument:
```
public CompiledScript compile(String lang, String script, ScriptType scriptType)
public ExecutableScript executable(String lang, String script, ScriptType scriptType, Map<String, Object> vars)
public SearchScript search(SearchLookup lookup, String lang, String script, ScriptType scriptType, @Nullable Map<String, Object> vars)
```
Also removed the ScriptContext.Standard.GENERIC_PLUGIN enum value, as it was used only for backwards compatibility.
Plugins that make use of scripts should declare their own script contexts through `ScriptModule#registerScriptContext` and use them when compiling/executing scripts.
Closes #10476
Plugins can now define multiple operations/contexts that they use scripts for. Fine-grained settings can then be used to enable/disable scripts based on each single registered context.
Also added a new generic category called `plugin`, which will be used as a default when the context is not specified. This allows us to restore backwards compatibility for plugins on `ScriptService` by restoring the old methods that don't require the script context and making them internally use the `plugin` context, as they can only be called from plugins.
Closes #10347
Closes #10419
Align get indexed scripts and get search template apis to our get api, which returns a response body when the document is not found, with a found boolean flag. Also, return metadata info all the time too.
Closes #7325
Closes #10396
This test adds a mapping with {"fielddata": {"format": "doc_values"}} but the
default mapping has {"doc_values": false} so when the document mapper parsing
logic merges both we have {"doc_values": false,"fielddata": {"format": "doc_values"}}
and {"doc_values": false} wins, so the test is not using doc values while it
thought it would.
In several places in the code we need to notify a node (typically the master) that it needs to do something. When that node is the local node, we have an optimization in several places that runs the execution code immediately instead of sending the request through the wire to itself. This is a shame as we need to implement the same pattern again and again. On top of that we may forget (see note below) to do so, and we might have to write some cruft if the code needs to run under another thread pool.
This commit folds the optimization into the TransportService, shortcutting wire serialization if the target node is local.
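The idea can be sketched roughly as follows. This is a simplified stand-in, not the actual TransportService code: the class, its fields and the string-based node ids are all hypothetical, and the real service also has to deal with thread pools and response handling.

```java
import java.util.function.Consumer;

// Hypothetical, simplified model of a transport layer with a local-node shortcut.
final class LocalShortcutTransport {
    private final String localNodeId;
    int wireSends = 0; // counts requests that would be serialized and sent over the network

    LocalShortcutTransport(String localNodeId) {
        this.localNodeId = localNodeId;
    }

    void sendRequest(String targetNodeId, String request, Consumer<String> localHandler) {
        if (localNodeId.equals(targetNodeId)) {
            // Target is the local node: invoke the handler directly, no serialization.
            localHandler.accept(request);
            return;
        }
        // Stand-in for wire serialization and the actual network send.
        wireSends++;
    }
}
```

With the check living in one place, callers no longer need to special-case the local node themselves.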
Note: this was discovered by #10247, which tries to import a dangling index quickly after the cluster forms. When sending an import dangling request to master, the code didn't take into account the fact that the local node may be the master. If this happens quickly enough, one would get a NodeNotConnected exception causing the dangling indices not to be imported. This will succeed after 10s when InternalClusterService.ReconnectToNodes runs and actively connects the local node to itself (which is not needed), potentially after another cluster state update.
Closes #10350
The exceptionCaught method had default access, which imposes a requirement
for subclasses that need to override this method to be in a specific package. This
change simply makes the method protected, which removes the package requirement.
We still have a lot of APIs that use setNextReader in order to change the
current segment that should be considered. This commit moves such APIs to
getLeafXXX() instead to be more in-line with Lucene 5's collector API.
I also renamed setDocId to setDocument to be more in-line with the doc values
APIs.
Close #10389
These tests create artificial hash collisions in order to make sure that they
can be resolved correctly. But this also makes the tests very slow if there
are too many collisions because insertions/deletions become linear in such
cases. The tests have been modified to not do too many iterations when
collisions are likely.
Close #10442
Closes #10435.
Squashed commit of the following:
commit aa1935c790b2731fc2bbc7de6142b09e3fe8bd4a
Author: Ryan Ernst <ryan@iernst.net>
Date: Mon Apr 6 13:44:40 2015 -0700
fix index lookup
commit bb6373595ff62ffc56fdf0cba3ac9c0ebe679946
Merge: 916962b eb3a170
Author: Robert Muir <rmuir@apache.org>
Date: Mon Apr 6 14:24:38 2015 -0400
Merge branch 'lucene_r1671277' of github.com:elasticsearch/elasticsearch into lucene_r1671277
commit 916962b82d192a53add471b4cc4a1396bc30eb0e
Merge: 197b3a2 21f72fe
Author: Robert Muir <rmuir@apache.org>
Date: Mon Apr 6 07:09:41 2015 -0400
Merge branch 'master' into lucene_r1671277
commit eb3a1703f7932ddd0cf3e83bec0e86131d255407
Author: Ryan Ernst <ryan@iernst.net>
Date: Sat Apr 4 11:06:03 2015 -0700
re-enable index lookup tests
commit 80d65d5eab39062dd8364687da74ddbb87ebcb76
Author: Ryan Ernst <ryan@iernst.net>
Date: Sat Apr 4 10:39:52 2015 -0700
update pom to point to new snapshot repo
commit 197b3a21ac2c2d70c9f740fe53e58632a22d1aad
Author: Robert Muir <rmuir@apache.org>
Date: Sat Apr 4 12:51:22 2015 -0400
fix postingsenum usage
commit 0e2b7a00cd07d068f755c51185ac521aa1eb0326
Author: Robert Muir <rmuir@apache.org>
Date: Sat Apr 4 12:21:23 2015 -0400
upgrade to lucene r1671277 (have not yet run tests or looked at postings changes)
The current implementation of AbstractBlobContainer.deleteByPrefix() calls AbstractBlobContainer.deleteBlobsByFilter() which calls BlobContainer.listBlobs() for deleting files, resulting in loading all files in order to delete few of them. This can be improved by calling BlobContainer.listBlobsByPrefix() directly.
This problem happened in #10344 when the repository verification process tries to delete a blob prefixed by "tests-" to ensure that the repository is accessible for the node. When doing so we have the following calling graph: BlobStoreRepository.endVerification() -> BlobContainer.deleteByPrefix() -> AbstractBlobContainer.deleteByPrefix() -> AbstractBlobContainer.deleteBlobsByFilter() -> BlobContainer.listBlobs()... and boom.
Also, AbstractBlobContainer.listBlobsByPrefix() and BlobContainer.deleteBlobsByFilter() can be removed because they have the same drawbacks as AbstractBlobContainer.deleteByPrefix() and also list all blobs. Listing blobs by prefix can be done at the FsBlobContainer level.
Related to #10344
The static old index tests currently take a long time to run because
each index version essentially recreates the cluster, and spins up
new nodes. This PR instead loads each old version into the existing
cluster as a dangling index. It also removes the intermediate
"StaticIndexBackwardCompatibilityTest" which was an extra layer
with no purpose, and moves a shared version of a commonly found
function to get an http client.
The test now takes between 40 and 60 seconds for me. I also ran it
"under stress" by running all ES tests in one shell, while
simultaneously running 10 iterations of the old index tests. Each
iteration took on average about 90 seconds, which is much better
than the 20+ minutes we see in master on jenkins.
closes #10247
When doc values are explicitly set to the default value serialization
is skipped. This means the alternate way of specifying doc values,
through `fielddata.format: doc_values`, will take precedence if
present.
This change fixes doc values to always be serialized when an explicit value
was passed, so that it continues to take precedence over
`fielddata.format`.
closes #10297
closes #10302
The current version is normally a snapshot while in development.
However, when the release process changes the snapshot flag to false,
this causes the static bwc tests to fail because they cannot
find an index for the current version. Instead, this change
skips the current version, because there is no need to test
a version's bwc against itself.
closes #10292
closes #10293
We had an undocumented parameter called `numeric_resolution` which allows
configuring how to deal with dates when provided as a number. The default is to
handle them as milliseconds, but you can also opt in for e.g. seconds.
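To make the effect of the resolution concrete, here is a minimal sketch of how a numeric date could be normalized to epoch milliseconds. The class and method are hypothetical and only illustrate the unit conversion, not the mapper's actual parsing code.

```java
import java.util.concurrent.TimeUnit;

// Hypothetical helper: interprets a numeric date according to a configured resolution.
final class NumericDateResolution {
    // Converts the provided number to epoch milliseconds given the unit it was expressed in.
    static long toEpochMillis(long value, TimeUnit resolution) {
        return resolution.toMillis(value);
    }
}
```

With a resolution of seconds the same number lands 1000 times further along the timeline than with the default of milliseconds.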
Close #10072
We recently increased the size of bw indexes and backward compatibility tests
are now taking more time so it makes sense to ask them to do a bit less. This
commit changes the number of replicas we try to copy primaries to from (2 or 3)
to (1 or 2).
Separate repository registration to make sure that failure in registering one repository doesn't cause failures to register other repositories.
Closes #10351
1.1.0 is affected by #5817 which prevents merges from keeping up with the
indexing rate. As a consequence it generates lots of segments and makes bw
compat tests slow. So I added a special case for this version to index fewer
documents.
Many scripts are used to start/stop and install/uninstall elasticsearch. These scripts share a lot of configuration properties like directory paths, max value for a setting, default user etc. Most of the values are identical but some of them differ depending on the platform (Debian-based or Redhat-based OS), on the way elasticsearch is started (shell script, systemd, sysv-init...) or on the way it is installed (zip, rpm, deb...). Today the values are duplicated in multiple places, making it difficult to maintain the scripts or to update a value.
This pull request makes this more uniform: values used in scripts must be defined in a common packaging.properties file. Each value can be overridden in another specific packaging.properties file for Debian or Redhat. All startup and installation scripts are filtered with the common then the custom packaging.properties files before being packaged as a zip/tar.gz/rpm/dpkg archive.
Follow-up of #9135. We initially decreased the stack size because it would end
up using a lot of memory when there are many threads. But we also have some thread
pools that may be oversized, in particular the search thread pool.
This commit proposes to decrease the default search thread pool size from
`3 * num_procs` to `3 * num_procs / 2 + 1`. This is large enough to be sure
that we can use all the machine resources even with a search-only work load
but not too large in order to not consume too much memory because of the stack
size and thread locals.
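For a quick feel of the numbers, the two sizing rules can be compared with a small sketch. The class and method names are hypothetical, and integer division is an assumption about how the expression is evaluated.

```java
// Hypothetical helper comparing the old and proposed default search thread pool sizes.
final class SearchPoolSizing {
    // Previous default: three threads per processor.
    static int oldDefault(int numProcs) {
        return 3 * numProcs;
    }

    // Proposed default: 3 * num_procs / 2 + 1 (integer division assumed).
    static int newDefault(int numProcs) {
        return 3 * numProcs / 2 + 1;
    }

    public static void main(String[] args) {
        int procs = Runtime.getRuntime().availableProcessors();
        System.out.println("old=" + oldDefault(procs) + " new=" + newDefault(procs));
    }
}
```

On an 8-core machine this goes from 24 threads down to 13.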
This pull request makes booleans handled like dates and ipv4 addresses: things
are stored as numerics under the hood and aggregations add some special
formatting logic in order to return true/false in addition to 1/0.
For example, here is an output of a terms aggregation on a boolean field:
```
"aggregations": {
  "top_f": {
    "doc_count_error_upper_bound": 0,
    "buckets": [
      {
        "key": 0,
        "key_as_string": "false",
        "doc_count": 2
      },
      {
        "key": 1,
        "key_as_string": "true",
        "doc_count": 1
      }
    ]
  }
}
```
Sorted numeric doc values are used under the hood.
Close #4678
Close #7851
A shard recovery response might serialize a shard state at the same time that it is
modified by the recovery process. The test
RelocationTests.testMoveShardsWhileRelocation
failed because of this with a ConcurrentModificationException.
closes #10381
RoutingTables activePrimaryShardsGrouped(), allActiveShardsGrouped() and
allAssignedShardsGrouped() methods treated empty index array input
parameters as meaning "all" indices and expanded to the routing maps
keyset. However, the expansion of index names is now already done in
MetaData#concreteIndices(). Returning an empty index name list here
when a wildcard pattern didn't match any index name could lead to
problems like #9081 because the RoutingTable still expanded this
list of names to "_all". In case of e.g. the recovery endpoint this
could lead to problems.
Closes #9081
Closes #10148
This fixes an issue where this was logged:
```
[node_t1] [test][0] flush with org.elasticsearch.action.admin.indices.flush.FlushRequest@65f6f1e
```
by adding a .toString() method to FlushRequest.
It also changes:
```
creating Index [test], shards [1]/[2]
```
to:
```
creating Index [test], shards [1]/[2s]
```
if shadow replicas are being used.
Today we reuse the UUID of the source index on restore. This can create conflicts
with existing shard state on disk. This also causes multiple indices with the same
UUID. This commit preserves the UUID of an existing index or creates a new UUID for
a newly created index.
For quite some time now, our networking layer makes sure to create safe messages as in not using the shared buffers. This is great, and we should remove the old support for "unsafe" notion in our codebase.
closes #10360
Prevents an edge case when resolving concrete aliases or index names in cluster MetaData
that could potentially lead to a NullPointerException when the IndicesOptions don't allow
wildcard expansion and the method is called with the aliasesOrIndices argument null or an empty list.
This change adds a check for that and introduces a randomized test that catches this.
Closes #10342
Closes #10339
For optimization purposes a function score query with an empty function
will just result in the original sub query. However, sometimes one might
want to use the function_score query to actually filter out docs within, for example,
bool clauses by using the min_score functionality.
Therefore the sub query should only be used without wrapping inside
a function_score query if min_score was also not set.
closes #10253
closes #10326
When deleting a shard, the node that deletes the shard first checks if all shard copies are
started on other nodes. A message is sent to each node and each node checks locally for
STARTED or RELOCATED.
However, it might happen that the shard is still in state POST_RECOVERY, like this:
shard is relocating from node1 to node2
1. relocated shard on node2 goes in POST_RECOVERY and node2 sends shard started to master
2. master updates routing table and sends new cluster state to node1 and node2
3. node1 processes the cluster state and asks node2 if it has the active shard
before node2 processes the new cluster state (which would cause it to set the shard to started)
4. node2 sends back it does not have the shard started and so node1 does not delete it
This can be avoided by waiting until cluster state that sets the shard to started is actually processed.
closes #10018
Today there is a chance that the state version for shard, index or cluster
state goes backwards or is reset on a full restart etc. depending on
several factors not related to the state. To prevent any collisions
with already existing state files and to maintain write-once properties
this change introduces an incremental state ID instead of using the plain
state version. This also fixes a bug when the previous legacy state had a
greater version than the current state which causes an exception on node
startup or if left-over files are present.
Closes #10316
Now that fine-grained script settings are supported (#10116) we can remove support for the script.disable_dynamic setting.
The same result as `script.disable_dynamic: false` can be obtained as follows:
```
script.inline: on
script.indexed: on
```
An exception is thrown at startup when the old setting is set, so we make sure we tell users they have to change it rather than ignoring the setting.
Closes #10286
The query cache is disabled on dfs_query_then_fetch so we need to enforce
query_then_fetch instead of relying on the randomized search type set by the
test framework.
This commit brings the benefits of the `count` search type to search requests
that have a `size` of 0:
- a single round-trip to shards (no fetch phase)
- ability to use the query cache
Since `count` now provides no benefits over `query_then_fetch`, it has been
deprecated.
Close #7630
Even if there is a background thread that periodically closes search contexts
that seem unused (every minute by default), it is important to close search
contexts as soon as possible in order to not keep unnecessary open files or
to prevent segments from being deleted.
This check would help ensure that refactorings of the SearchContext management
like #9296 are correct.
Adds a getter for the actual netty channel in NettyTransportChannel. The
channel can be used by plugins that need access into netty when processing
requests.
FakeRestRequest is used by a few tests and can also be leveraged by
tests outside of elasticsearch. Moving the package will mean the class
gets exported as part of the test jar.
We already force a refresh in index/create ops, to clear version map
when it's using too much RAM, but we were failing to do this for
deletes, so an app that does tons of deletes with no indexing, and has
set refresh_interval to -1, would have version map using unbounded
RAM.
Closes #10312
After processing mapping updates from the master, we compare the resulting binary representation of them to the one the cluster state has. If different, we send a refresh mapping request to master, asking it to reparse the mappings and serialize them again. This mechanism is used to update the mapping after a format change caused by a version upgrade.
The very same process can also be triggered when an old master leaves the cluster, triggering a local cluster state update. If that update contains the old mapping format, the local node will again signal the need to refresh, but this time there is no master to accept the request. Instead of failing (which we now do because of #10283), we should just skip the notification and wait for the next elected master to publish a new mapping (triggering another refresh if needed).
Closes #10311
When the index service (which holds shards) fails to be created as a result of a shard being allocated on a node, we should fail the relevant shard, otherwise, it will remain stuck.
The same goes when there is a failure to process updated mappings from the master.
Note, both failures typically happen when the node is misconfigured (e.g. missing plugins, ...), since they get created and processed on the master node before being published.
closes #10283
Doc values significantly reduce heap usage, which results in faster
GCs. This change makes the default for doc values dynamic: any
field that is indexed but not analyzed now has doc values. This only
affects fields on indexes created with 2.0+.
closes #8312
closes #10209
The local DocumentMapper is updated while parsing and dynamic fields are added before
parsing has finished. If parsing fails after a dynamic field has been added already
then the field was not added to the cluster state but was present in the local mapper of this
node. New documents with the same field would not necessarily cause an update either and
after restarting the node the mapping for these fields was lost. Instead the new fields
should always be updated.
closes #9851
closes #9874
We currently have a single bw comp test (FunctionScoreBackwardCompatibilityTests) that requires inline scripts on.
After introducing fine-grained script settings, we moved the internal cluster to use the newer settings, but they are not supported by older nodes started as part of the bw comp tests. Moved script settings out of the default settings, so they won't be part of the ordinary settings when running bw comp tests.
Added logic in FunctionScoreBackwardCompatibilityTests to enable dynamic scripts using the proper setting, depending on the version of the node.
Allow scripting to be turned on or off based on the script source (where scripts get loaded from), the operation that executes them and their language.
The settings cover the following combinations:
- mode: on, off, sandbox
- source: indexed, dynamic, file
- engine: groovy, expressions, mustache, etc
- operation: update, search, aggs, mapping
The following settings are supported for every engine:
script.engine.groovy.indexed.update: sandbox/on/off
script.engine.groovy.indexed.search: sandbox/on/off
script.engine.groovy.indexed.aggs: sandbox/on/off
script.engine.groovy.indexed.mapping: sandbox/on/off
script.engine.groovy.dynamic.update: sandbox/on/off
script.engine.groovy.dynamic.search: sandbox/on/off
script.engine.groovy.dynamic.aggs: sandbox/on/off
script.engine.groovy.dynamic.mapping: sandbox/on/off
script.engine.groovy.file.update: sandbox/on/off
script.engine.groovy.file.search: sandbox/on/off
script.engine.groovy.file.aggs: sandbox/on/off
script.engine.groovy.file.mapping: sandbox/on/off
For ease of use, the following more generic settings are supported too:
script.indexed: sandbox/on/off
script.dynamic: sandbox/on/off
script.file: sandbox/on/off
script.update: sandbox/on/off
script.search: sandbox/on/off
script.aggs: sandbox/on/off
script.mapping: sandbox/on/off
These will be used to calculate the more specific settings, using the stricter setting of each combination. Operation based settings have precedence over conflicting source based ones.
Note that the `mustache` engine is affected by generic settings applied to any language, while native scripts aren't as they are static by definition.
Also, the previous `script.disable_dynamic` setting can now be deprecated.
Closes #6418
Closes #10116
Closes #10274
In #9893, an enabled flag was added for _field_names. However,
backcompat for indexes created before 1.3.0 (when _field_names
was added) was lost. This change corrects the mapper
to always be disabled when used with older indexes that
cannot have _field_names.
closes #10268
Deleting a type from an index is inherently dangerous because
the type can be recreated with new mappings which may conflict
with existing segments still using the old mappings. This
removes the ability to delete a type (similar to how deleting
fields within a type is not allowed, for the same reason).
closes #8877
closes #10231
This adds the Explanation to the explain score again. It is needed
because the explanation of script functions will otherwise not contain
an explanation of _score if boost mode is set to replace.
closes #9826
Fail merge if validate_lat or validate_lon values are not equal. This will prevent inconsistencies between geo_points in a merged index, and parse exceptions for bounding_box and distance filters.
Also merged separate GeoPoint test classes into a single GeoPointFieldMapperTest to be consistent with GeoShapeFieldMapperTests.
closes #10164
* add compiler workarounds for JDK bug JI-9019884
* remove permgen specification during tests (this results in an error on java 9)
* fix threadpool grow/shrink to call methods in the right order (this results in IAE with java 9)
This reverts commit 166fd04239.
Turns out that having log.warn produces a duplicated warn log, as the same message is already logged at warn level in NettyTransport#exceptionCaught.
If a request comes in at the same moment the timeout handler for it runs, we may leak a timeoutInfoHolder and erroneously log "Transport response handler not found of id". The same issue could cause the request tracer to fire a traceUnresolvedResponse call instead of traceReceivedResponse, causing a failure of testTracerLog (see #10187).
This commit makes sure timeoutInfoHolder is visible before removing the corresponding RequestHolder. It also unifies the TransportService.Adapter#remove(requestId) with TransportService.Adapter#onResponseReceived(requestId), as they are always called together to indicate a response was received.
Closes #10187
Closes #10220
When a primary moves to another node, we cancel ongoing recoveries and retry from the primary's new home. At the moment this happens when the primary relocation *starts*. It's a shame as we cancel recoveries that may be close to completion and will finish before the primary has been fully relocated. This commit only triggers the cancelation once the primary relocation is completed.
Next to this, it fixes a race condition between recovery cancellation and recovery completion. At the moment we may trigger the removal of a recovered shard just after its recovery was completed. Instead, we should use the recovery cancellation logic to make sure only one code path is followed.
All of the above caused the recoverWhileUnderLoadWithNodeShutdown test to fail (see http://build-us-00.elastic.co/job/es_core_15_debian/32/ ). The test creates an index and then increasingly disallows nodes for it, until only 1 node is left in the allocation filtering rules. Normally, this means we stay in green, but the premature recovery cancellation plus the race condition mentioned above caused a shard to be failed and stay unassigned and the test asserts to fail. This happens due to the following sequence:
- The shard has finished recovering and sent the master a shard started command.
- The recovery is cancelled locally, removing the index shard.
- Master starts shard (deleting its other copy).
- Local node gets a cluster state with the shard started in it, which causes it to send a shard failed (to make the master aware).
- Shard is failed and can't be re-assigned due to the allocation filter.
The recoverWhileUnderLoadWithNodeShutdown is also adapted a bit to fit the current behavior of allocation filtering (in the past it used to really shut down nodes). Last, all tests in that class are given better names to fit the current terminology.
Closes #10218
Today we simply fetch the shards metadata without verifying the
index UUID the shard belongs to. We recently added this UUID
to the shard state metadata. This commit adds verification
to the shard metadata fetching to prevent bringing shards
back into an index it doesn't belong to due to name collisions.
Note, Jackson 2.5 is less lenient when it comes to not starting an object before starting to add fields on a fresh builder, fixed where applicable.
closes #10210
Sometimes, when using transport client for example, through a load balancer, there is a need to send a scheduled ping message to keep each channel alive.
Add support for `transport.ping_schedule`, which controls the schedule (-1 for disabled) at which a ping message will be sent. For the transport client case, it gets enabled automatically since this is almost always the desired behavior.
We use the same 6 bytes header format for the ping message, with ES header and -1 for data length for ping message, and simply continue to process the next messages once this is encountered.
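The framing described above can be sketched as follows. This is only an illustration: the class name is hypothetical, and the big-endian 4-byte length (the `ByteBuffer` default) is an assumption about the wire format rather than something stated here.

```java
import java.nio.ByteBuffer;

// Hypothetical sketch of a 6-byte ping frame: 'E', 'S', then a 4-byte length of -1.
final class PingFrame {
    static final int HEADER_SIZE = 6;
    static final int PING_LENGTH = -1; // marker data length for a ping message

    // Builds the 6-byte ping header.
    static byte[] encodePing() {
        ByteBuffer buf = ByteBuffer.allocate(HEADER_SIZE);
        buf.put((byte) 'E').put((byte) 'S').putInt(PING_LENGTH);
        return buf.array();
    }

    // Returns true if the header carries the ES marker and a data length of -1.
    static boolean isPing(byte[] header) {
        if (header.length < HEADER_SIZE) {
            return false;
        }
        ByteBuffer buf = ByteBuffer.wrap(header);
        return buf.get() == 'E' && buf.get() == 'S' && buf.getInt() == PING_LENGTH;
    }
}
```

A reader that sees such a header has no payload to consume and can move straight on to the next message.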
closes #10189
The Update Settings API tries to merge the query_string params with the settings sent as body and excluding some "well known params" such as `pretty`, `timeout` etc. Those well known params do not include the params used by IndicesOptions though, so they get merged resulting in invalid settings that get rejected.
Closes #10030
`setConsistencyLevel` setter is already present in the base class `ShardReplicationOperationRequestBuilder`. It is not needed in `DeleteRequestBuilder` and `IndexRequestBuilder`.
Closes #10188
Relates to https://github.com/elastic/elasticsearch-ruby/issues/29, the parsing logic is incorrect as comma is used twice as a separator (e.g. indices_boost=index1,5,index,10 is expected but will never be parsed correctly). Anyway, `indices_boost` makes more sense in the request body only, where it is properly parsed and supported.
Closes #6281
Detect the worst-offenders, all IBM versions and several known hotspot
versions that can cause index corruption, and fail on startup.
Provide/detect compiler workarounds when they exist, but warn about
performance degradation.
In all cases the check can be bypassed completely with a safety
switch via undocumented system property (es.bypass.vm.check=true)
Closes #7580
CBOR has an optional special header that, if present, allows for exact detection. Also, since we know which formats we support in ES, we can support the object major type case.
closes #7640
Today we leave the shard state behind even if a recovery is half finished.
In rare conditions this causes shards that have never been fully recovered
to be recovered and promoted as primaries.
Closes #10053
This commit changes the behaviour of the delete api when processing a delete request that refers to a type that has routing set to required in the mapping, and the routing is missing in the request. Up until now the delete api sent a broadcast delete request to all of the shards that belong to the index, making sure that the document could be found although the routing value wasn't specified. This was probably not the best choice: if the routing is set to required, an error should be thrown instead.
A `RoutingMissingException` is now thrown instead, as happens in the same situation with every other api (index, update, get etc.). Last but not least, this change makes it possible to get rid of a couple of `TransportAction`s, `Request`s and `Response`s and simplify the codebase.
Closes #9123
Closes #10136
This makes the assertion a bit more flexible and removes the
`ensureGreen` in favor of `ensureYellow`, which is really all that is
needed to perform a search. On slow machines the relocations can take a
while and time out the `ensureGreen`.
This commit allows code to be executed before or after a shard's content
is deleted from disk. This is only executed if the shard owns the
content, i.e. on a shared file system only a primary shard will execute
these operations.
Several issues were reported showing truncated files where footers
didn't match and checksums read past EOF. This test reproduces the issue
on the latest 1.4 branch but passes on all versions above.
Closes#10155
Fixing geo_shape field mapper to persist the orientation parameter. Also adding parsing and integration tests to ensure persistence across cluster restarts.
Adds a setting to disable detailed error messages and full exception stack traces
in HTTP responses. When set to false, the error_trace request parameter will result
in an HTTP 400 response. When the error_trace parameter is not present, the message
of the first ElasticsearchException will be output and no nested exception messages
will be output.
In case an exception was caught by the repeat rule, the retry mechanism would kick in only if the exception was the expected one. If not, an NPE was thrown, whereas we should just bubble the original exception up to the caller. This makes `NettyTransportMultiPortTests` run from a plane. An assumption would kick in to make sure that the test gets ignored, but the `AssumptionViolationException` was caught and not properly re-thrown.
By default we won't allow rebalance operations if not all shards are active.
If this is the case we don't need to worry about costly rebalance calculations at all.
To enable download servers to send the correct plugin version for the
node that is installing it, this PR sends the current version as a header
to the server.
In case an HTTP client connects to the transport protocol and issues an
HTTP method followed by a space, we can just try to be smart and return
a string back to the client to point the user to the fact that the wrong
port has been used.
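A small sketch of the detection idea: look at the first bytes received on the transport port and check whether they spell an HTTP method followed by a space. The class name, the method list and the hint text are assumptions for this example, not the strings used by the actual change.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical sketch: recognise an HTTP request arriving on the transport port.
final class HttpOnTransportSketch {
    private static final String[] METHODS =
        {"GET", "POST", "PUT", "HEAD", "DELETE", "OPTIONS", "PATCH", "TRACE"};

    /** True if the received bytes begin with an HTTP method followed by a space. */
    static boolean looksLikeHttp(byte[] firstBytes) {
        String prefix = new String(firstBytes, StandardCharsets.US_ASCII);
        for (String method : METHODS) {
            if (prefix.startsWith(method + " ")) {
                return true;
            }
        }
        return false;
    }

    /** Illustrative message to send back before closing the connection. */
    static String hint() {
        return "This is not an HTTP port";
    }
}
```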
Closes #2139
Closes #10108
This commit adds the current total number of translog operations to the recovery reporting API. We also expose the recovered / total percentage:
```
"translog": {
"recovered": 536,
"total": 986,
"percent": "54.3%",
"total_time": "2ms",
"total_time_in_millis": 2
},
```
Closes #9368
Closes #10042
The filter from an indexed alias is as if you would filter on the metadata of a percolator query, but then the filter is defined in the index alias instead of the percolate request.
Closes#6241
The behaviour is better in the case where someone has multiple levels of nested object fields defined in the mapping and would like to define a single inner_hits definition that is two or more levels deep.
If someone wants inner hits on a nested field that is 2 levels deep the following would need to be defined:
```
{
  ...
  "inner_hits" : {
    "path" : {
      "level1" : {
        "inner_hits" : {
          "path" : {
            "level2" : {
              "query" : { .... }
            }
          }
        }
      }
    }
  }
}
```
With this change the above can be defined as:
```
{
  ...
  "inner_hits" : {
    "path" : {
      "level1.level2" : {
        "query" : { .... }
      }
    }
  }
}
```
Closes#9251
The BWC tests also run against a snapshot build of previous release branches. Upon a failure it's important to know what commit exactly was used.
Closes#10111
PageCacheRecycler was mistakenly using the maximum number of items per page
instead of the byte size of a page. This could result in higher memory usage
than configured.
Close#10077
This solves a problem in the time zone rounding classes where dates that
fall into a DST gap cause the Joda-Time library to throw an exception.
Changing the conversion methods' 'strict' option to false prevents this.
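To illustrate the difference between strict and lenient conversion of a local time that falls into a DST gap, here is a sketch using `java.time` rather than Joda-Time (a swapped-in library, chosen only so the example is self-contained). The class and method names are hypothetical.

```java
import java.time.DateTimeException;
import java.time.LocalDateTime;
import java.time.ZoneId;
import java.time.ZonedDateTime;

// Hypothetical sketch using java.time as a stand-in for the Joda-Time conversion.
final class DstGapSketch {
    /** True if the local time does not exist in the zone (it falls into a DST gap). */
    static boolean inGap(LocalDateTime local, ZoneId zone) {
        return zone.getRules().getValidOffsets(local).isEmpty();
    }

    /** Strict conversion: refuses local times that fall into a gap. */
    static ZonedDateTime strict(LocalDateTime local, ZoneId zone) {
        if (inGap(local, zone)) {
            throw new DateTimeException(local + " does not exist in " + zone);
        }
        return ZonedDateTime.of(local, zone);
    }

    /** Lenient conversion: ZonedDateTime.of shifts the time forward by the gap length. */
    static ZonedDateTime lenient(LocalDateTime local, ZoneId zone) {
        return ZonedDateTime.of(local, zone);
    }
}
```

For example, 02:30 on 2015-03-29 does not exist in Europe/Berlin; the strict variant throws, while the lenient one resolves it to 03:30 local time.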
Closes#10025
To support real time gets, the engine keeps an in-memory map of recently indexed docs and their location in the translog. This is needed until documents become visible in Lucene. With 1.3.0, we improved this map and tightly integrated it with refresh cycles in Lucene in order to keep the memory signature to a bare minimum. On top of that, if the version map grows above 25% of the index buffer size, we proactively refresh in order to be able to trim the version map back to 0 (see #6363). In the same release, we fixed an issue where an update to the indexing buffer could result in an unwanted exception during recovery (#6667). We solved this by waiting with updating the size of the index buffer until the shard was fully recovered. Sadly, these two together can have a negative impact on the speed of translog recovery.
During the second phase of recovery we replay all operations that happened on the shard during the first phase of copying files. In parallel we start indexing new documents into the newly created shard. At some point (phase 3 in the recovery), the translog replay starts to send operations which have already been indexed into the shard. The version map is crucial in being able to quickly detect this and skip the replayed operations, without hitting Lucene. Sadly #6667 (only updating the index memory buffer once the shard is started) means that a shard is using the default 64MB for its index buffer, and thus only 16MB (25%) for the version map. This is much less than the default index buffer size of 10% of machine memory (shared between shards).
Since we don't flush anymore when updating the memory buffer, we can remove #6667 and update recovering shards as well. Also, we make the version map max size configurable, with the same default of 25% of the current index buffer.
Closes#10046
The analysis chain should be used instead of relying on this, as it is
confusing when dealing with different per-field analysers.
The `locale` option was only used for `lowercase_expanded_terms`, which,
once removed, is no longer needed, so it was removed as well.
Fixes#9978
Relates to #9973
The file scripts cache key should include the language of the script to prevent conflicts between scripts with the same name but a different extension (hence lang). Note that script engines can register multiple acronyms that may be used as lang at execution time (e.g. javascript and js are synonyms). We then need to make sure that the same script gets loaded no matter which of the acronyms is used at execution time. The problem didn't exist before this change as the lang was simply ignored, while now we take it into account.
This change has also some positive effect on inline scripts caching. Up until now, the same script referred to with different acronyms would be compiled and cached multiple times, once per acronym. After this change every script gets compiled and cached only once, as we chose internally the acronym used as part of the cache key, no matter which one the user provides.
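The idea can be sketched as a cache whose key is built from one canonical acronym per engine plus the script, so that `js` and `javascript` hit the same entry. Everything below (class name, methods, key layout) is an assumption for illustration and not the actual `ScriptService` code.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: cache key = canonical lang + script, regardless of the acronym used.
final class ScriptCacheSketch {
    private final Map<String, String> canonicalLang = new HashMap<>();
    private final Map<String, Object> compiled = new HashMap<>();
    private int compilations;

    /** Registers an engine under its canonical name and any additional acronyms. */
    void registerEngine(String canonical, String... aliases) {
        canonicalLang.put(canonical, canonical);
        for (String alias : aliases) {
            canonicalLang.put(alias, canonical);
        }
    }

    /** Builds the cache key from the canonical lang, never from the acronym the user passed. */
    String cacheKey(String lang, String script) {
        String canonical = canonicalLang.get(lang);
        if (canonical == null) {
            throw new IllegalArgumentException("unknown script lang [" + lang + "]");
        }
        return canonical + ":" + script;
    }

    /** Compiles the script at most once per (canonical lang, script) pair. */
    Object compile(String lang, String script) {
        return compiled.computeIfAbsent(cacheKey(lang, script), key -> {
            compilations++;
            return new Object(); // stand-in for a compiled script
        });
    }

    int compilations() {
        return compilations;
    }
}
```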
Closes#10033
Remove the settings around dangling indices, such as no import and timeout for deletion. We always want to import dangling indices for safety, and we should not allow changing the behavior. This also cleans up the code quite a bit.
closes#10016
We want to make sure the recovery finished all the way to post recovery. The current check, which validates that the shard is either in POST_RECOVERY or STARTED, is not good because the shard could also be closed if things go fast enough (like in our tests). The assertion is changed to check the shard is not left in CREATED or RECOVERING.
Closes#10028
- Added NAME constants for each script language, avoiding to repeat the same strings all over the place.
- Simplified `compile` method signatures by removing a couple of variants. Note that all of these signatures are going to change again with #6418 as in order to compile/execute a script the caller will need to specify which operation is attempting to execute the script, info that will be provided as an additional mandatory argument.
- Removed double call to ScriptService#verifyDynamicScripting for every indexed or dynamic script.
- Decreased ScriptService inner classes visibility to private (CacheKey, IndexedScript, ApplySettings)
- Moved ScriptService inner classes to the bottom of the class, I think it makes it more readable.
- Resolved some compiler warnings
Closes#9992
The tribe node, at startup, sets up the tribe clients that will join their corresponding tribes. All of the tribe.* settings are properly forwarded to the corresponding tribe client. System properties and global configuration settings must not be forwarded to the tribe client though or they will end up overriding per tribe settings with same name causing issues.
For instance, if you set the transport.tcp.port to some defined value for the tribe node, via system property or configuration file, that same value must not be forwarded to the tribe clients, otherwise they will try to use the same port, which will already be occupied by the tribe node itself, resulting in a failed startup. The same goes for cluster.name, which would cause the tribe clients not to join their tribes.
Closes #9576
Closes #9721
This commit adds scripting capability to significant_terms.
Custom heuristics can be implemented with a script that provides
parameters subset_freq, superset_freq, subset_size, superset_size.
closes#7850
The local DocumentMapper is updated while parsing and dynamic fields are added before
parsing has finished. If parsing fails after a dynamic field has been added already
then the field was not added to the cluster state but was present in the local mapper of this
node. New documents with the same field would not necessarily cause an update either and
after restarting the node the mapping for these fields was lost. Instead the new fields
should always be updated.
closes #9851
closes #9874
Since the method can be called in an #execute event of the cluster service, we need the ability to use the cluster state that will be provided in the ClusterChangedEvent, so the ClusterState is now provided as a parameter.
Previously it was ignored and the publish cluster state timeout would kick in. In that case a stale master node would just wait for the inevitable and waste valuable time.
This issue was discovered by the DiscoveryWithServiceDisruptionsTests#testStaleMasterNotHijackingMajority test.
Also only perform cluster state versions and wrong master node check inside cluster state update task.
If a tragic event happens while we are reading the segments info from the
store, the store might have been closed concurrently. We had this
behavior before and it was lost in a refactoring.
Will allow many reducers to share the same helper functionality without repeating code. Chose
to put these in static helpers instead of adding to Reducer base class. I can imagine other reducers
that aren't time-based (or don't care about contiguous buckets), which would make things like
gap policy useless.
Since these seemed more like helpers than inherent traits of a Reducer, they went into their own
static class.
Closes#9954
If a folder for an index was created that folder is never deleted from that node unless the index is deleted.
Data only nodes therefore can have empty folders for indices that they do not even have shards for.
This commit makes sure empty folders are cleaned up after all shards have moved away from a data only
node. The behavior is unchanged for master eligible nodes.
closes#9985
Delete-by-query is incredibly costly because it forces a refresh each
time, so if you are also indexing this can cause massive segment
explosion.
This change throttles delete-by-query when merges can't keep up. It's
likely not enough (#7052 is the long-term solution) but can only
help.
Closes#9986
In TransportSearchTypeAction one of the logger calls was passing the throwable in as a parameter for the message rather than as a throwable to be printed as a stack trace. This change fixes it so the throwable is printed properly.
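The same pitfall exists in most logging APIs. The sketch below reproduces it with `java.util.logging` as a stand-in (the project uses its own logger class, so the calls here are not the ones touched by the commit); the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.logging.Handler;
import java.util.logging.Level;
import java.util.logging.LogRecord;
import java.util.logging.Logger;

// Hypothetical sketch using java.util.logging to show the two call shapes.
final class LogThrowableSketch {
    /** Collects log records so the difference between the two calls can be inspected. */
    static final class Capture extends Handler {
        final List<LogRecord> records = new ArrayList<>();
        @Override public void publish(LogRecord record) { records.add(record); }
        @Override public void flush() { }
        @Override public void close() { }
    }

    /** Wrong: the throwable is just a message parameter, so no stack trace is attached. */
    static void asParameter(Logger logger, Throwable t) {
        logger.log(Level.WARNING, "search failed: {0}", new Object[] {t});
    }

    /** Right: the throwable is passed as the throwable, so handlers can print its stack trace. */
    static void asThrowable(Logger logger, Throwable t) {
        logger.log(Level.WARNING, "search failed", t);
    }
}
```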
We used to handle truncated translogs in a better manner (assuming that
the node was killed halfway through writing an operation and discarding
the last operation). This brings back that behavior by catching an
`EOFException` during the stream reading and throwing a
`TruncatedTranslogException` which can be safely ignored in
`IndexShardGateway`.
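A minimal sketch of the pattern, assuming a simplified translog made of length-prefixed operations (4-byte length followed by the payload). The format, the class name and the nested exception are stand-ins for illustration; only the idea of turning an `EOFException` into a `TruncatedTranslogException` that the caller ignores comes from this commit.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: tolerate a translog whose last operation was only partially written.
final class TruncatedTranslogSketch {
    /** Stand-in for the exception named in the commit message. */
    static final class TruncatedTranslogException extends IOException {
        TruncatedTranslogException(String message, Throwable cause) {
            super(message, cause);
        }
    }

    /** Reads one operation; an EOF in the middle of it means the translog is truncated. */
    static byte[] readOperation(DataInputStream in) throws IOException {
        try {
            int length = in.readInt();
            byte[] payload = new byte[length];
            in.readFully(payload);
            return payload;
        } catch (EOFException e) {
            throw new TruncatedTranslogException("translog ended in the middle of an operation", e);
        }
    }

    /** Replays all complete operations and discards a truncated last one. */
    static List<byte[]> recover(byte[] translog) {
        List<byte[]> operations = new ArrayList<>();
        DataInputStream in = new DataInputStream(new ByteArrayInputStream(translog));
        try {
            while (in.available() > 0) {
                operations.add(readOperation(in));
            }
        } catch (TruncatedTranslogException e) {
            // The node was probably killed halfway through writing: safe to ignore.
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return operations;
    }
}
```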
Fixes#9699
We've been relying on URI for url encoding, but it turns out it has some problems. For instance '+' stays as is while it should be encoded to `%2B`. If we go and manually encode query params we have to be careful though not to run into double encoding ('+'=>'%2B'=>'%252B'). The applied solution relies on URI encoding for the url path, but manual url encoding for the query parameters. We prevent URI from double encoding query params by using its single argument constructor that leaves everything as is.
We can also revert back the expression script REST test that revealed this to its original content (which contains an addition).
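The encoding strategy can be sketched as follows: let the multi-argument `URI` constructor quote the path, encode the query parameter by hand, and then parse the final string with the single-argument constructor, which leaves `%2B` untouched. The class and method names are hypothetical, and `URLEncoder` (form encoding, which writes a space as `+`) is an assumption here, not necessarily the encoder used by the fix.

```java
import java.io.UnsupportedEncodingException;
import java.net.URI;
import java.net.URISyntaxException;
import java.net.URLEncoder;

// Hypothetical sketch: URI quoting for the path, manual encoding for query params.
final class QueryEncodingSketch {
    /** Manually encodes a query parameter value, e.g. '+' becomes %2B. */
    static String encodeParam(String value) {
        try {
            return URLEncoder.encode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new AssertionError("UTF-8 is always supported", e);
        }
    }

    /** Path is quoted by URI; the pre-encoded query is parsed as-is by the single-arg constructor. */
    static URI build(String host, int port, String path, String name, String value) {
        try {
            String base = new URI("http", null, host, port, path, null, null).toASCIIString();
            return new URI(base + "?" + name + "=" + encodeParam(value));
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException(e);
        }
    }

    /** What to avoid: the multi-arg constructor always quotes '%', so %2B becomes %252B. */
    static URI buildDoubleEncoded(String host, int port, String path, String name, String value) {
        try {
            return new URI("http", null, host, port, path, name + "=" + encodeParam(value), null);
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException(e);
        }
    }
}
```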
Closes #9769
Closes #9946
It may take some time for the old master node to step down and rejoin, and for all nodes to have it in their nodes list.
By waiting for the old master node to have stepped down, we can again rely on assertDiscoveryCompleted() to make sure that it has joined.
If the isolated unicast host is also a master node then its local cluster state becomes unusable as a source for pinging when the disruption stops.
All the nodes in the cluster state node list can be removed and at that time it will only ping itself and never find out about the other nodes.
(these nodes will not ping, because they are already following a new master)
CurrentTestFailedMarker is a RunListener that gets notified whenever a test fails, and we were using it to be able to restart the suite cluster after each failure. We were checking whether a test had failed in the @After method though, which runs before the listener gets notified, so the failed flag would always be false.
This commit makes sure that the suite cluster gets restarted not only when there are problems in the afterInternal method, but also after each test failure. In order to achieve this, we need to reset the cluster afterwards, when we get to know about both of the events (problem in afterInternal and test failure), and before resetting the currentCluster. Introduced a TestRule that keeps track of test failures and allows to execute arbitrary tasks when a test fails and when a test is completed (regardless of its result). Allows also to force the execution of the failure task (used in case of afterInternal issues rather than actual test failure).
Also updated ElasticsearchRestTests to make sure that the RestClient gets re-initialized in case we restart the suite cluster, otherwise all the subsequent tests fail. Improved this mechanism also to relate it directly to the restart of the cluster instead of checking whether the addresses have changed, which doesn't work anyway as the new cluster will use the same addresses but the client needs to be recreated anyway.
Closes#9015
This commit modifies the Kernel32Library to use direct mapping instead of a proxy class when doing native calls on Windows platforms. It also adds the "createSecurityManager" permission to the tests.policy file, and adds unit tests that should have failed when the Java security manager is enabled.
Closes#9802
Previously a few methods and many class members were package-private
or private and could not be referenced from overriding classes. This changes
the member variables and a few methods to have protected access, so they
can be referenced or overridden from a subclass.
We keep track of the current stage of recovery using an instance of RecoveryState which is stored on the relevant IndexShard. At the moment changes to this object are made in many places in the code, which are responsible for doing it in the right order, keeping track of timers and much more. Also the changes to shard state are decoupled from the recovery stages, which caused #9503.
This PR refactors this and brings all of the changes into IndexShard. It also makes all recovery follow the exact same stages and shortcut some. This is in order to keep things simple and always the same (those shortcuts didn't add anything, we ended up doing it all anyway).
Also, all timer management is now folded into RecoveryState and unit tests are added.
This closes#9503 by moving the shard to post recovery only once the recovery is done (before they were decoupled), meaning that master promotion of the target shard to started can not cancel the recovery.
Closes#9902
#9760 was a fix for translog leaking due to missing a delete flag. This is not needed here as we have a better solution to not lose the flag. This commit takes the changes from 1x in order to keep the code base similar and enjoy the extra tests.
Closes#9760
While the parser allowed changing field type settings, these would never
have been serialized. So this change simply removes parsing using
parseField. Backcompat will still work if a user uploads old settings
(they just would never have worked anyways, so we continue ignoring
them with 1.x, and 2.x will now error).
see #8143
closes #9914
This setting is used by the release script to run rest tests against
the version being released. It used to work only for tests using
the global cluster. Now it supersedes both SUITE and TEST scope
test clusters.
closes#9916
Closes#9915.
Squashed commit of the following:
commit cfa59f5a3f03d9d1b432980dcee6495447c1e7ea
Author: Robert Muir <rmuir@apache.org>
Date: Fri Feb 27 12:10:16 2015 -0500
add missing null check
commit 62fe5403068c730c0e0b6fd1ab1a0246eeef6220
Author: Robert Muir <rmuir@apache.org>
Date: Fri Feb 27 11:31:53 2015 -0500
Disable ExtrasFS for now, until we hook all this in properly in a separate issue.
commit 822795c57c5cf846423fad443c2327c4ed0094ac
Author: Adrien Grand <jpountz@gmail.com>
Date: Fri Feb 27 10:12:02 2015 +0100
Fix PercolatorTests.
commit 98b2a0a7d8298648125c9a367cb7e31b3ec7d51b
Author: Adrien Grand <jpountz@gmail.com>
Date: Fri Feb 27 09:27:11 2015 +0100
Fix ChildrenQueryTests.
commit 9b99656fc56bbd01c9afe22baffae3c37bb48a71
Author: Robert Muir <rmuir@apache.org>
Date: Thu Feb 26 20:50:02 2015 -0500
cutover apis, no work on test failures yet.
Some test failures are seen when a node attempts to use a port that is already bound
by some other process on the test machine. This commit adds a bind to test port availability
and iterates over the port range until an available port is found. This reduces the likelihood
of a test node failing to start up due to the port already being bound.
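A rough sketch of the probing approach: try to bind a server socket to each port in the range and return the first one that succeeds. The class and method names are invented for this example. Note that the probe socket is closed before returning, so another process could still grab the port afterwards; this only reduces the likelihood of a clash.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.ServerSocket;

// Hypothetical sketch: test port availability with a bind and walk the range.
final class PortProbeSketch {
    /** True if a server socket can currently be bound to the port on the loopback address. */
    static boolean isAvailable(int port) {
        try (ServerSocket socket = new ServerSocket()) {
            socket.bind(new InetSocketAddress(InetAddress.getLoopbackAddress(), port));
            return true;
        } catch (IOException e) {
            return false; // already bound by some other process (or not permitted)
        }
    }

    /** Returns the first available port in [from, to], or throws if none is free. */
    static int firstAvailable(int from, int to) {
        for (int port = from; port <= to; port++) {
            if (isAvailable(port)) {
                return port;
            }
        }
        throw new IllegalStateException("no available port in range [" + from + "-" + to + "]");
    }
}
```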
Today we have two ways of getting a setting, either with the full settings key or with only
the last part of the key where the prefix is implicit depending on the package the class is in via
component settings. This is trappy as well as confusing for users and can break easily if a class is moved
to a new package since the prefix then implicitly changes.
This commit removes the component settings from the codebase.
Almost all of our meta fields that allow enabling/disabling have an `enabled`
setting. However, _field_names is enabled by default, and disabling
requires setting `index=no`. This change adds a flag similar to that
with other meta fields.
closes#9893
Random geo shape testing periodically fails on a known issue within Spatial4j core. A simple patch in ES will fix the issue. For now this random test will be disabled until the patch can be applied.
The request tracer logs in TRACE level under the `transport.tracer` log and is dynamically configurable with include and exclude arrays to filter out unneeded info. By default all requests are logged with the exception of fault detection pings (fired every second).
Add the notion of tracers in the MockTransportService for testing purposes.
Closes#9286
Currently rounding in DateMathParser is always done in UTC, even
when another time zone is specified. This is fixed by passing the time zone
down to the rounding logic when it is specified.
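The effect of passing the time zone down can be illustrated with `java.time` (used here instead of Joda-Time purely to keep the example self-contained; the class and method names are hypothetical). Rounding the same instant down to the day gives different results depending on the zone in which "the day" is evaluated.

```java
import java.time.Instant;
import java.time.ZoneId;
import java.time.temporal.ChronoUnit;

// Hypothetical sketch using java.time: day rounding must happen in the requested zone.
final class ZoneRoundingSketch {
    /** Rounds the instant down to the start of its day as seen in the given zone. */
    static Instant roundDownToDay(Instant instant, ZoneId zone) {
        return instant.atZone(zone).truncatedTo(ChronoUnit.DAYS).toInstant();
    }
}
```

For 2015-02-27T23:30:00Z, rounding in UTC yields 2015-02-27T00:00:00Z, while rounding in Europe/Berlin (where it is already 00:30 on the 28th) yields 2015-02-27T23:00:00Z.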
Closes #9814
Closes #9885
To support the `_recovery` API, the recovery process keeps track of current progress in a class called RecoveryState. This class currently has some issues, mostly around concurrency (see #6644). This PR cleans it up as well as other issues around it:
- Make the Index subsection API cleaner:
  - remove redundant information - all calculation is done based on the underlying file map
  - clearer definition of what is what: total files, vs reused files (local files that match the source) vs recovered files (copied over). % based progress is reported based on recovered files only.
  - cleaned up json response to match other API (sadly this breaks the structure). We now properly report human values for dates and other units.
- Add more robust unit testing
- Detail flag was passed along as state (it's now a ToXContent param)
- State lookup during reporting is now always done via the IndexShard , no more fall backs to many other classes.
- Cleanup APIs around time and move the little computations to the state class as opposed to doing them out of the API
I also improved error messages out of the REST testing infra for things I ran into.
Closes #6644
Closes #9811
Together with #8782 it should help in situations similar to #8887 by adding the ability to get information about a currently running snapshot without accessing the repository itself.
Closes#8887
The number of current pending tasks is useful to detect an overloaded master. This commit adds it to the cluster health API. The complete list can be retrieved from the dedicated pending tasks API.
It also adds rest tests for the cluster health variants.
Closes#9877
Today we fail the shard if we need to upgrade a replica to a primary on shadow replicas
on shared filesystem. Yet, this commit allows promotion by re-initializing on the master preventing
reallocation of all replicas.
We try to lock all shards when an index is deleted but are likely not
succeeding since shards are still active. To ensure that shards
that used to be allocated on that node get cleaned up as well we have
to retry or block on the delete until we get the locks. This is not desirable
since the delete happens on the cluster state processing thread. Instead of blocking
this commit schedules a pending delete for the index just like if we can't delete shards.
There are two implications to this change.
First, percolator now uses _uid internally, extracting the id portion
when needed. Second, sorting on _id is no longer possible, since you
can no longer index _id. However, _uid can still be used to sort, and
is better anyways as indexing _id just to make it available to
fielddata for sorting is wasteful.
see #8143
closes #9842
Today if we delete files from the index directory we never acquire the
write lock. Yet, for safety reasons we should get the write lock before
we modify / delete any files. Where we can we should leave the deletion
to the index writer and only delete the files that we have to delete ourselves.
Refactor how settings filters are handled. Instead of specifying settings filters as a filtering class, settings filters are now specified as a list of settings that need to be filtered out. Regex syntax is supported. This is a breaking change and will require a small change in plugins that are using settingsFilters. This change is needed in order to simplify the cluster state diff implementation.
Contributes to #6295
When indexing of a document with a type that is not in the mappings fails,
for example because "dynamic": "strict" but doc contains a new field,
then the type is still created on the node that executed the indexing request.
However, the change was never added to the cluster state.
This commit makes sure mapping updates are always added to the cluster state
even if indexing of a document fails.
closes#8692
relates to #8650
Now that the global cluster is gone, we shouldn't need to ignore
thread leaks across tests. We unfortunately still need suite level
scope, since most tests are using suite scope clusters (although
test scope clusters should really switch back to test scope thread
leaks).
closes#9843
These help a lot when refactoring, upgrading lucene, etc, and
can prevent code duplication (as you get a compile error for outdated stuff).
Closes#9832.
This change removes the deprecated script parameter names ('file', 'id', and 'scriptField').
It also removes the ability to load file scripts using the 'script' parameter. File scripts should be loaded using the 'script_file' parameter only.
If an elected master node goes into a long gc then other nodes' fault detection will notice this and a new master election is started and eventually a new master node is elected. If the previous master node comes out of the long gc it can still have pending tasks which can result in new cluster state updates. Nodes that are still in the nodes list of this previously elected master node can get these cluster state updates. This commit makes sure that these dated cluster states are not accepted by these nodes.
This issue can temporarily lead to non-elected master nodes switching to the previously elected master node. The newly elected master node also gets the same dated cluster state, but rejects it and tells the previously elected master node to step down and rejoin. Because the newly elected master is the only master node, the previously elected master node will follow the newly elected master node. Any nodes that follow the previously elected master node (by accident) will also rejoin and follow the newly elected master node because their master fault detection will fail. So all in all this isn't a severe problem, because the problem fixes itself eventually.
Closes#9632
In #6636 we switched to a default FileSwitchDirectory that made
.listAll run twice on the same underlying file system directory.
This fixes listAll to do a single directory listing again.
Closes#9666
Currently many meta field mappers do not take index settings in their
simple constructor that DocumentMapper uses, and instead pass null or
empty settings to the parent abstract mapper. This change fixes them to
pass through index settings, and adds an assertion in AbstractFieldMapper
that settings are not null.
closes#9780
This was previously attempted in #8854. I revived that branch and did
some performance testing as was suggested in the comments there.
I fixed all the errors, mostly just the rest tests, which
needed to have http enabled on the node settings (the global cluster
previously had this always enabled). I also addressed the comments from
that issue.
My performance tests involved running the entire test suite on my
desktop which has 6 cores, 16GB of ram, and nothing else was being
run on the box at the time. I ran each set of settings 3 times and
took the average time.
| mode | master | patch | diff |
| ------- | ------ | ----- | ---- |
| local | 409s | 417s | +2% |
| network | 368s | 380s | +3% |
This increase in average time is clearly worthwhile to pay to achieve
isolation of tests. One caveat is the way I fixed the rest tests
is still to have one cluster for the entire suite, so all the rest
tests can still potentially affect each other, but this is an
issue for another day.
There were some oddities that I noticed while running these tests
that I would like to point out, as they probably deserve some
investigation (but orthogonal to this PR):
* The total test run times are highly variable (more than a minute between the min and max)
* Running in network mode is on average actually *faster* than local mode. How is this possible!?
Files.exists(f) && Files.isDirectory(f) -> Files.isDirectory(f)
if (Files.exists(f)) Files.delete(f) -> Files.deleteIfExists(f)
if (!Files.exists(f)) Files.createDirectories(f) -> Files.createDirectories(f)
In a few places where successive i/o ops are done against the same file, convert
to Files.readAttributes().
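A short sketch of why the guards are redundant, using only `java.nio.file` calls: `Files.isDirectory` already returns false for a missing path, `Files.deleteIfExists` reports whether anything was deleted, `Files.createDirectories` is a no-op for an existing directory, and `Files.readAttributes` fetches several attributes in one call. The wrapper class and method names are made up for this example.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.attribute.BasicFileAttributes;

// Hypothetical sketch: single-call replacements for guarded filesystem operations.
final class FsOpsSketch {
    /** No separate exists() check needed: isDirectory is false for a missing path. */
    static boolean isDir(Path f) {
        return Files.isDirectory(f);
    }

    /** Deletes the path if present; returns false instead of throwing when it is missing. */
    static boolean delete(Path f) {
        try {
            return Files.deleteIfExists(f);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Creates the directory and any missing parents; does nothing if it already exists. */
    static Path ensureDir(Path f) {
        try {
            return Files.createDirectories(f);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** One metadata read instead of separate isDirectory/size/lastModified calls. */
    static String describe(Path f) {
        try {
            BasicFileAttributes attrs = Files.readAttributes(f, BasicFileAttributes.class);
            return (attrs.isDirectory() ? "dir" : "file") + " size=" + attrs.size()
                + " modified=" + attrs.lastModifiedTime();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```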
Closes#9807.
Today locking all shards only locks the shards that are present on
the node or that still have a shard directory. This can lead to odd
behavior if another shard that doesn't exist yet is allocated while
all shards are supposed to be locked.
Adds RandomShapeGenerator for creating random shape types. This adds a level of randomized testing to the Geospatial logic. An initial randomized GeometryCollection test is added to the GeoShapeIntegrationTest suite for validating and verifying geo_shape filter behavior. The RandomShapeGenerator can/should be used in Unit and Integration testing to avoid biased testing.
closes#9588
Squashed commit of the following:
commit 07391388715ed1f737e8acc391cea0bce5d79db9
Merge: a71cc45 b61b021
Author: Robert Muir <rmuir@apache.org>
Date: Fri Feb 20 06:58:11 2015 -0500
Git really sucks
Merge branch 'lucene_r1660560' of github.com:elasticsearch/elasticsearch into lucene_r1660560
commit b61b02163f62ad8ddd9906cedb3d57fed75eb52d
Author: Adrien Grand <jpountz@gmail.com>
Date: Wed Feb 18 19:03:49 2015 +0100
Try to improve TopDocs.merge usage.
commit bf8e4ac46d7fdaf9ae128606d96328a59784f126
Author: Ryan Ernst <ryan@iernst.net>
Date: Wed Feb 18 07:43:37 2015 -0800
reenable scripting test for accessing postings pieces. commented out
parts that fail because of bad assumptions
commit 6d4d635b1a23b33c437a6bae70beea70ad52d91c
Author: Robert Muir <rmuir@apache.org>
Date: Wed Feb 18 09:41:46 2015 -0500
add some protection against broken asserts, but, also disable crappy test
commit c735bbb11f38782dfea9c4200fcf732564126bf5
Author: Robert Muir <rmuir@apache.org>
Date: Wed Feb 18 02:21:30 2015 -0500
cutover remaining stuff from old postings api
commit 11c9c2bea3db3ff1cd2807bd43e77b500b167aed
Author: Robert Muir <rmuir@apache.org>
Date: Wed Feb 18 01:46:04 2015 -0500
cut over most DocsEnum usage
commit bc18017662f6abddf3f074078f74e582494c88e2
Author: Robert Muir <rmuir@apache.org>
Date: Wed Feb 18 01:19:35 2015 -0500
upgrade to lucene_r1660560, modulo one test fail
Today if a shard deletion fails we simply ignore it and move on. On systems like
Windows, where a virus scanner or any other process (i.e. the admin's
explorer window) can hold on to files, we fail to delete shards, leaving a large amount of data behind. We should make a
best effort to clean those shards up before we ack the delete.
Today we restore files by running through the directory and removing all files
not in the snapshot. Some files in that directory might belong there even though
we remove them. This commit moves the responsibility of cleaning up pending files
to Lucene by utilizing IndexWriter#IndexFileDeleter
Today we use Directory#listAll() to find all the files we recovered. Yet,
this is not accurate since there might be leftovers etc. It's better to
only iterate over the known files from the segments info that we recovered.
This commit makes the `postings_format` and `doc_values_format` options of
mappings illegal on 2.0 and ignored on 1.x (meaning that the default postings
and doc values formats from the codec will be used in such a case).
This removes a fair amount of code.
Close #8746 #9741
Squashed commit of the following:
commit 20835037c98e7d2fac4206c372717a05a27c4790
Author: Lee Hinman <lee@writequit.org>
Date: Wed Feb 18 15:27:17 2015 -0700
Use Enum for "_primary" preference
commit 325acbe4585179190a959ba3101ee63b99f1931a
Author: Lee Hinman <lee@writequit.org>
Date: Wed Feb 18 14:32:41 2015 -0700
Use ?preference=_primary automatically for realtime GET operations
commit edd49434af5de7e55928f27a1c9ed0fddb1fb133
Author: Lee Hinman <lee@writequit.org>
Date: Wed Feb 18 14:32:06 2015 -0700
Move engine creation into protected createNewEngine method
commit 67a797a9235d4aa376ff4af16f3944d907df4577
Author: Lee Hinman <lee@writequit.org>
Date: Wed Feb 18 13:14:01 2015 -0700
Factor out AssertingSearcher so it can be used by mock Engines
commit 62b0c28df8c23cc0b8205b33f7595c68ff940e2b
Author: Lee Hinman <lee@writequit.org>
Date: Wed Feb 18 11:43:17 2015 -0700
Use IndexMetaData.isIndexUsingShadowReplicas helper
commit 1a0d45629457578a60ae5bccbeba05acf5d79ddd
Author: Lee Hinman <lee@writequit.org>
Date: Wed Feb 18 09:59:31 2015 -0700
Rename usesSharedFilesystem -> isOnSharedFilesystem
commit 73c62df4fc7da8a5ed557620a83910d89b313aa1
Author: Lee Hinman <lee@writequit.org>
Date: Wed Feb 18 09:58:02 2015 -0700
Add MockShadowEngine and hook it up to be used
commit c8e8db473830fce1bdca3c4df80a685e782383bc
Author: Lee Hinman <lee@writequit.org>
Date: Wed Feb 18 09:45:50 2015 -0700
Clarify comment about pre-defined mappings
commit 60a4d5374af5262bd415f4ef40f635278ed12a03
Author: Lee Hinman <lee@writequit.org>
Date: Wed Feb 18 09:18:22 2015 -0700
Add a test for shadow replicas that uses field data
commit 7346f9f382f83a21cd2445b3386fe67472bc3184
Author: Lee Hinman <lee@writequit.org>
Date: Wed Feb 18 08:37:14 2015 -0700
Revert changes to RecoveryTarget.java
commit d90d6980c9b737bd8c0f4339613a5373b1645e95
Author: Lee Hinman <lee@writequit.org>
Date: Wed Feb 18 08:35:44 2015 -0700
Rename `ownsShard` to `canDeleteShardContent`
commit 23001af834d66278ac84d9a72c37b5d1f3a10a7b
Author: Lee Hinman <lee@writequit.org>
Date: Wed Feb 18 08:35:25 2015 -0700
Remove ShadowEngineFactory, add .newReadOnlyEngine method in EngineFactory
commit b64fef1d2c5e167713e869b22d388ff479252173
Author: Lee Hinman <lee@writequit.org>
Date: Wed Feb 18 08:25:19 2015 -0700
Add warning that predefined mappings should be used
commit a1b8b8cf0db49d1bd1aeb84e51491f7f0de43b59
Author: Lee Hinman <lee@writequit.org>
Date: Tue Feb 17 14:31:50 2015 -0700
Remove unused import and fix index creation example in docs
commit 0b1b852365ceafc0df86866ac3a4ffb6988b08e4
Merge: b9d1fed a22bd49
Author: Lee Hinman <lee@writequit.org>
Date: Tue Feb 17 10:56:02 2015 -0700
Merge remote-tracking branch 'refs/remotes/origin/master' into shadow-replicas
commit b9d1fed25ae472a9dce1904eb806702fba4d9786
Merge: 4473e63 41fd4d8
Author: Lee Hinman <lee@writequit.org>
Date: Tue Feb 17 09:02:27 2015 -0700
Merge remote-tracking branch 'refs/remotes/origin/master' into shadow-replicas
commit 4473e630460e2f0ca2a2e2478f3712f39a64c919
Author: Lee Hinman <lee@writequit.org>
Date: Tue Feb 17 09:00:39 2015 -0700
Add asciidoc documentation for shadow replicas
commit eb699c19f04965952ae45e2caf107124837c4654
Author: Simon Willnauer <simonw@apache.org>
Date: Tue Feb 17 16:15:39 2015 +0100
remove last nocommit
commit c5ece6d16d423fbdd36f5d789bd8daa5724d77b0
Author: Simon Willnauer <simonw@apache.org>
Date: Tue Feb 17 16:13:12 2015 +0100
simplify shadow engine
commit 45cd34a12a442080477da3ef14ab2fe7947ea97e
Author: Simon Willnauer <simonw@apache.org>
Date: Tue Feb 17 11:32:57 2015 +0100
fix tests
commit 744f228c192602a6737051571e040731d413ba8b
Author: Simon Willnauer <simonw@apache.org>
Date: Tue Feb 17 11:28:12 2015 +0100
revert changes to IndexShardGateway - these are leftovers from previous iterations
commit 11886b7653dabc23655ec76d112f291301f98f4a
Author: Simon Willnauer <simonw@apache.org>
Date: Tue Feb 17 11:26:48 2015 +0100
Back out non-shared FS code. this will go in in a second iteration
commit 77fba571f150a0ca7fb340603669522c3ed65363
Merge: e8ad614 2e3c6a9
Author: Simon Willnauer <simonw@apache.org>
Date: Tue Feb 17 11:16:46 2015 +0100
Merge branch 'master' into shadow-replicas
Conflicts:
src/main/java/org/elasticsearch/index/engine/Engine.java
commit e8ad61467304e6d175257e389b8406d2a6cf8dba
Merge: 48a700d 1b8d8da
Author: Simon Willnauer <simonw@apache.org>
Date: Tue Feb 17 10:54:20 2015 +0100
Merge branch 'master' into shadow-replicas
commit 48a700d23cff117b8e4851d4008364f92b8272a0
Author: Simon Willnauer <simonw@apache.org>
Date: Tue Feb 17 10:50:59 2015 +0100
add test for failing shadow engine / remove nocommit
commit d77414c5e7b2cde830a8e3f70fe463ccc904d4d0
Author: Simon Willnauer <simonw@apache.org>
Date: Tue Feb 17 10:27:56 2015 +0100
remove nocommits in IndexMetaData
commit abb696563a9e418d3f842a790fcb832f91150be2
Author: Simon Willnauer <simonw@apache.org>
Date: Mon Feb 16 17:05:02 2015 +0100
remove nocommit and simplify delete logic
commit 82b9f0449108cd4741568d9b4495bf6c10a5b019
Author: Simon Willnauer <simonw@apache.org>
Date: Mon Feb 16 16:45:27 2015 +0100
reduce the changes compared to master
commit 28f069b6d99a65e285ac8c821e6a332a1d8eb315
Author: Simon Willnauer <simonw@apache.org>
Date: Mon Feb 16 16:43:46 2015 +0100
fix primary relocation
commit c4c999dd61a44a7a0db9798275a622f2b85b1039
Merge: 2ae80f9 455a85d
Author: Simon Willnauer <simonw@apache.org>
Date: Mon Feb 16 15:04:26 2015 +0100
Merge branch 'master' into shadow-replicas
commit 2ae80f9689346f8fd346a0d3775a6341874d8bef
Author: Lee Hinman <lee@writequit.org>
Date: Fri Feb 13 16:25:34 2015 -0700
throw UnsupportedOperationException on write operations in ShadowEngine
commit 740c28dd9ef987bf56b670fa1a8bcc6de2845819
Merge: e5bc047 305ba33
Author: Lee Hinman <lee@writequit.org>
Date: Fri Feb 13 15:38:39 2015 -0700
Merge branch 'master' into shadow-replicas
commit e5bc047d7c872ae960d397b1ae7b4b78d6a1ea10
Author: Lee Hinman <lee@writequit.org>
Date: Fri Feb 13 11:38:09 2015 -0700
Don't replicate document request when using shadow replicas
commit 213292e0679d8ae1492ea11861178236f4abd8ea
Author: Simon Willnauer <simonw@apache.org>
Date: Fri Feb 13 13:58:05 2015 +0100
add one more nocommit
commit 83d171cf632f9b77cca9de58505f7db8fcda5599
Merge: aea9692 09eb8d1
Author: Simon Willnauer <simonw@apache.org>
Date: Fri Feb 13 13:52:29 2015 +0100
Merge branch 'master' into shadow-replicas
commit aea96920d995dacef294e48e719ba18f1ecf5860
Author: Simon Willnauer <simonw@apache.org>
Date: Fri Feb 13 09:56:41 2015 +0100
revert unneeded changes on Store
commit ea4e3e58dc6959a92c06d5990276268d586735f3
Author: Lee Hinman <lee@writequit.org>
Date: Thu Feb 12 14:26:30 2015 -0700
Add documentation to ShadowIndexShard, remove nocommit
commit 4f71c8d9f706a0c1c39aa3a370efb1604559d928
Author: Lee Hinman <lee@writequit.org>
Date: Thu Feb 12 14:17:22 2015 -0700
Add documentation to ShadowEngine
commit 28a9d1842722acba7ea69e0fa65200444532a30c
Author: Lee Hinman <lee@writequit.org>
Date: Thu Feb 12 14:08:25 2015 -0700
Remove nocommit, document canDeleteIndexContents
commit d8d59dbf6d0525cd823d97268d035820e5727ac9
Author: Lee Hinman <lee@writequit.org>
Date: Thu Feb 12 10:34:32 2015 -0700
Refactor more shared methods into the abstract Engine
commit a7eb53c1e8b8fbfd9281b43ae39eacbe3cd1a0a6
Author: Simon Willnauer <simonw@apache.org>
Date: Thu Feb 12 17:38:59 2015 +0100
Simplify shared filesystem recovery by using a dedicated recovery handler that skips
most phases and enforces shard closing on the source before the target opens its engine
commit a62b9a70adad87d7492c526f4daf868cb05018d9
Author: Simon Willnauer <simonw@apache.org>
Date: Thu Feb 12 15:59:54 2015 +0100
fix compile error after upstream changes
commit abda7807bc3328a89fd783ca7ad8c6deac35f16f
Merge: f229719 35f6496
Author: Simon Willnauer <simonw@apache.org>
Date: Thu Feb 12 15:57:28 2015 +0100
Merge branch 'master' into shadow-replicas
Conflicts:
src/main/java/org/elasticsearch/index/engine/Engine.java
commit f2297199b7dd5d3f9f1f109d0ddf3dd83390b0d1
Author: Simon Willnauer <simonw@apache.org>
Date: Thu Feb 12 12:41:32 2015 +0100
first cut at catchup from primary
make flush to a refresh
factor out ShadowIndexShard to have IndexShard be identical to the master and least intrusive
cleanup abstractions
commit 4a367c07505b84b452807a58890f1cbe21711f27
Author: Simon Willnauer <simonw@apache.org>
Date: Thu Feb 12 09:50:36 2015 +0100
fix primary promotion
commit cf2fb807e7e243f1ad603a79bc9d5f31a499b769
Author: Lee Hinman <lee@writequit.org>
Date: Wed Feb 11 16:45:41 2015 -0700
Make assertPathHasBeenCleared recursive
commit 5689b7d2f84ca1c41e4459030af56cb9c0151eff
Author: Lee Hinman <lee@writequit.org>
Date: Wed Feb 11 15:58:19 2015 -0700
Add testShadowReplicaNaturalRelocation
commit fdbe4133537eaeb768747c2200cfc91878afeb97
Author: Lee Hinman <lee@writequit.org>
Date: Wed Feb 11 15:28:57 2015 -0700
Use check for shared filesystem in primary -> primary relocation
Also adds a nocommit
commit 06e2eb4496762130af87ce68a47d360962091697
Author: Lee Hinman <lee@writequit.org>
Date: Wed Feb 11 15:21:32 2015 -0700
Add a test checking that indices with shadow replicas clean up after themselves
commit e4dbfb09a689b449f0edf6ee24222d7eaba2a215
Author: Lee Hinman <lee@writequit.org>
Date: Wed Feb 11 15:08:18 2015 -0700
Fix segment info for ShadowEngine, remove test nocommit
commit 80cf0e884c66eda7d59ac5d59235e1ce215af8f5
Author: Lee Hinman <lee@writequit.org>
Date: Wed Feb 11 14:30:13 2015 -0700
Remove nocommit in ShadowEngineTests#testFailStart()
commit 5e33eeaca971807b342f9be51a6a566eee005251
Author: Lee Hinman <lee@writequit.org>
Date: Wed Feb 11 14:22:59 2015 -0700
Remove overly-complex test
commit 2378fbb917b467e79c0262d7a41c23321bbeb147
Author: Lee Hinman <lee@writequit.org>
Date: Wed Feb 11 13:45:44 2015 -0700
Fix missing import
commit 52e9cd1b8334a5dd228d5d68bd03fd0040e9c8e9
Author: Lee Hinman <lee@writequit.org>
Date: Wed Feb 11 13:45:05 2015 -0700
Add a test for replica -> primary promotion
commit a95adbeded426d7f69f6ddc4cbd6712b6f6380b4
Author: Lee Hinman <lee@writequit.org>
Date: Wed Feb 11 12:54:14 2015 -0700
Remove tests that don't apply to ShadowEngine
commit 1896feda9de69e4f9cf774ef6748a5c50e953946
Author: Lee Hinman <lee@writequit.org>
Date: Wed Feb 11 10:29:12 2015 -0700
Add testShadowEngineIgnoresWriteOperations and testSearchResultRelease
commit 67d7df41eac5e10a1dd63ddb31de74e326e9d38b
Author: Lee Hinman <lee@writequit.org>
Date: Wed Feb 11 10:06:05 2015 -0700
Add start of ShadowEngine unit tests
commit ca9beb2d93d9b5af9aa6c75dbc0ead4ef57e220d
Merge: 2d42736 57a4646
Author: Simon Willnauer <simonw@apache.org>
Date: Wed Feb 11 18:03:53 2015 +0100
Merge branch 'master' into shadow-replicas
commit 2d42736fed3ed8afda7e4aff10b65d292e1c6f92
Author: Simon Willnauer <simonw@apache.org>
Date: Wed Feb 11 17:51:22 2015 +0100
shortcut recovery if we are on a shared FS - no need to compare files etc.
commit 24d36c92dd82adce650e7ac8e9f0b43c83b2dc53
Author: Simon Willnauer <simonw@apache.org>
Date: Wed Feb 11 17:08:08 2015 +0100
utilize the new delete code
commit 2a2eed10f58825aae29ffe4cf01aefa5743a97c7
Merge: 343dc0b 173cfc1
Author: Simon Willnauer <simonw@apache.org>
Date: Wed Feb 11 16:07:41 2015 +0100
Merge branch 'master' into shadow-replicas
Conflicts:
src/main/java/org/elasticsearch/gateway/GatewayMetaState.java
commit 343dc0b527a7052acdc783ac5abcaad1ef78dbda
Author: Simon Willnauer <simonw@apache.org>
Date: Wed Feb 11 16:05:28 2015 +0100
long adder is not available in java7
commit be02cabfeebaea74b51b212957a2a466cfbfb716
Author: Lee Hinman <lee@writequit.org>
Date: Tue Feb 10 22:04:24 2015 -0700
Add test that restarts nodes to ensure shadow replicas recover
commit 7fcb373f0617050ca1a5a577b8cf32e32dc612b0
Author: Simon Willnauer <simonw@apache.org>
Date: Tue Feb 10 23:19:21 2015 +0100
make test more evil
commit 38135af0c1991b88f168ece0efb72ffe9498ff59
Author: Simon Willnauer <simonw@apache.org>
Date: Tue Feb 10 22:25:11 2015 +0100
make tests pass
commit 05975af69e6db63cb95f3e40d25bfa7174e006ea
Author: Lee Hinman <lee@writequit.org>
Date: Mon Jan 12 18:44:29 2015 +0100
Add ShadowEngine
Change bucket key_as_string to reflect the `time_zone` parameter. Currently `time_zone`
shifts bucket boundaries to another time zone, but keys are displayed in UTC, so e.g.
daily buckets in the "+01:00" time zone have a key_as_string like "2014-01-01T23:00:00Z". With this
change the default is to format these dates according to the local time zone, so the
above bucket key would be "2014-01-02T00:00:00+01:00".
Closes #9710
Closes #9744
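The formatting difference described above can be reproduced with plain `java.time`. This is only an illustration of the two renderings of the same bucket key; the class and method names are hypothetical and this is not how the aggregation code itself formats keys.

```java
import java.time.Instant;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;

// Hypothetical illustration only; not the aggregation's formatting code.
class BucketKeyFormat {

    // Old behaviour: the key is always rendered in UTC.
    static String keyAsStringUtc(Instant key) {
        return DateTimeFormatter.ISO_INSTANT.format(key);
    }

    // New default: the key is rendered in the offset given by `time_zone`.
    static String keyAsStringInZone(Instant key, ZoneOffset offset) {
        return DateTimeFormatter.ISO_OFFSET_DATE_TIME.format(key.atOffset(offset));
    }

    public static void main(String[] args) {
        // Start of the daily bucket for 2014-01-02 in the "+01:00" time zone.
        Instant key = Instant.parse("2014-01-01T23:00:00Z");
        System.out.println(keyAsStringUtc(key));
        // prints 2014-01-01T23:00:00Z
        System.out.println(keyAsStringInZone(key, ZoneOffset.ofHours(1)));
        // prints 2014-01-02T00:00:00+01:00
    }
}
```

Both strings denote the same instant; only the rendering changes.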
Today we trash everything that has been indexed but not flushed to disk
if the engine is closed. This might not be desired if we are shutting down a
node for restart / upgrade or if we close / archive an index. In such a
case we would like to flush the transaction log and commit everything to
disk. This commit adds a flag to the close method that is set on close
and shutdown but not when we remove the shard due to relocations.
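As a rough sketch of the behaviour described above, assuming a toy engine that keeps uncommitted operations in an in-memory list: the `ToyEngine` class, its methods, and the `flush` parameter name are hypothetical and only mirror the idea of a flag on close, not the real Engine API.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical illustration only; not the real Engine implementation.
class ToyEngine {
    private final List<String> uncommitted = new ArrayList<>(); // indexed, not yet flushed
    private final List<String> committed = new ArrayList<>();   // stands in for data on disk
    private boolean closed;

    void index(String doc) {
        if (closed) {
            throw new IllegalStateException("engine is closed");
        }
        uncommitted.add(doc);
    }

    // Moves everything that was indexed into the committed state.
    void flush() {
        committed.addAll(uncommitted);
        uncommitted.clear();
    }

    // flush == true:  close / shutdown, commit pending operations first.
    // flush == false: shard removed due to relocation, pending operations are dropped.
    void close(boolean flush) {
        if (closed) {
            return;
        }
        if (flush) {
            flush();
        }
        uncommitted.clear();
        closed = true;
    }

    List<String> committed() {
        return Collections.unmodifiableList(committed);
    }
}
```

For example, `index("a")` followed by `close(true)` leaves `"a"` in the committed list, while `close(false)` discards it.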
The nested scope is set by any nested feature, so that sub nested queries and filters know about their context and these sub nested queries and filters can construct the right parent filter.
Removed the LateBindingParentFilter workaround in the nested query parser in favour of the nested scope maintained in the query parse context.
Due to this change nested queries and filters can now also be included in nested sorting and inner hits, because those features also now use the nested scope.
This change doesn't fix the usage of nested filters in nested and reverse_nested aggregations. The `nested` filter shouldn't be used inside these aggregations and instead the `nested` and `reverse_nested` aggs should be used to query on the right level. In a different change `nested` inside a `nested` and `reverse_nested` aggregation should result in a parse error.
Closes #9305