In 5.x we allowed this with a deprecation warning. This removes the code
added for that deprecation, requiring the cluster name to not be in the
data path.
Resolves#20391
This was an error-prone version type that allowed overriding previous
version semantics. It could cause primaries and replicas to be out of
sync however, so it has been removed.
Resolves#19769
Previous versions of Elasticsearch permitted unquoted JSON field names even though this is against the JSON spec. This leniency was disabled by default in the 5.x series of Elasticsearch but a backwards compatibility layer was added via a system property with the intention of removing this layer in 6.0.0. This commit removes this backwards compatibility layer.
Relates #20388
This includes:
- All regular numeric types such as int, long, scaled-float, double, etc
- IP addresses
- Dates
- Geopoints and Geoshapes
Relates to #19784
The collect_payloads parameter of the span_near query was previously
deprecated with the intention to be removed. This commit removes this
parameter.
Relates #20385
This was an error-prone version type that allowed overriding previous
version semantics. It could cause primaries and replicas to be out of
sync however, so it has been removed.
Resolves#19769
** The default script language is now maintained in `Script` class.
* Added `script.legacy.default_lang` setting that controls the default language for scripts that are stored inside documents (for example percolator queries). This defaults to groovy.
** Added `QueryParseContext#getDefaultScriptLanguage()` that manages the default scripting language. Returns always `painless`, unless loading query/search request in legacy mode then the returns what is configured in `script.legacy.default_lang` setting.
** In the aggregation parsing code added `ParserContext` that also holds the default scripting language like `QueryParseContext`. Most parser don't have access to `QueryParseContext`. This is for scripts in aggregations.
* The `lang` script field is always serialized (toXContent).
Closes#20122
The mem section was buggy in cluster stats and removed. It is now added back with the same structure as in node stats, containing total memory, available memory, used memory and percentages. All the values are the sum of all the nodes across the cluster (or at least the ones that we were able to get the values from).
If elasticsearch controls the ID values as well as the documents
version we can optimize the code that adds / appends the documents
to the index. Essentially we an skip the version lookup for all
documents unless the same document is delivered more than once.
On the lucene level we can simply call IndexWriter#addDocument instead
of #updateDocument but on the Engine level we need to ensure that we deoptimize
the case once we see the same document more than once.
This is done as follows:
1. Mark every request with a timestamp. This is done once on the first node that
receives a request and is fixed for this request. This can be even the
machine local time (see why later). The important part is that retry
requests will have the same value as the original one.
2. In the engine we make sure we keep the highest seen time stamp of "retry" requests.
This is updated while the retry request has its doc id lock. Call this `maxUnsafeAutoIdTimestamp`
3. When the engine runs an "optimized" request comes, it compares it's timestamp with the
current `maxUnsafeAutoIdTimestamp` (but doesn't update it). If the the request
timestamp is higher it is safe to execute it as optimized (no retry request with the same
timestamp has been run before). If not we fall back to "non-optimzed" mode and run the request as a retry one
and update the `maxUnsafeAutoIdTimestamp` unless it's been updated already to a higher value
Relates to #19813
* master:
Avoid NPE in LoggingListener
Randomly use Netty 3 plugin in some tests
Skip smoke test client on JDK 9
Revert "Don't allow XContentBuilder#writeValue(TimeValue)"
[docs] Remove coming in 2.0.0
Don't allow XContentBuilder#writeValue(TimeValue)
[doc] Remove leftover from CONSOLE conversion
Parameter improvements to Cluster Health API wait for shards (#20223)
Add 2.4.0 to packaging tests list
Docs: clarify scale is applied at origin+offest (#20242)
* Params improvements to Cluster Health API wait for shards
Previously, the cluster health API used a strictly numeric value
for `wait_for_active_shards`. However, with the introduction of
ActiveShardCount and the removal of write consistency level for
replication operations, `wait_for_active_shards` is used for
write operations to represent values for ActiveShardCount. This
commit moves the cluster health API's usage of `wait_for_active_shards`
to be consistent with its usage in the write operation APIs.
This commit also changes `wait_for_relocating_shards` from a
numeric value to a simple boolean value `wait_for_no_relocating_shards`
to set whether the cluster health operation should wait for
all relocating shards to complete relocation.
* Addresses code review comments
* Don't be lenient if `wait_for_relocating_shards` is set
While removing an index isn't actually an alias action, if we add
an alias action that deletes an index then we can delete and index
and add an alias with the same name as the index atomically, in
the same cluster state update.
Closes#20064
Today we do a lot of accounting inside the engine to maintain locations
of documents inside the transaction log. This is only needed to ensure
we can return the documents source from the engine if it hasn't been refreshed.
Aside of the added complexity to be able to read from the currently writing translog,
maintainance of pointers into the translog this also caused inconsistencies like different values
of the `_ttl` field if it was read from the tlog or not. TermVectors are totally different if
the document is fetched from the tranlog since copy fields are ignored etc.
This chance will simply call `refresh` if the documents latest version is not in the index. This
streamlines the semantics of the `_get` API and allows for more optimizations inside the engine
and on the transaction log. Note: `_refresh` is only called iff the requested document is not refreshed
yet but has recently been updated or added.
#Relates to #19787
This includes:
- All regular numeric types such as int, long, scaled-float, double, etc
- IP addresses
- Dates
- Geopoints and Geoshapes
Relates to #19784
Previously this was possible, which was problematic when issuing a
request like `DELETE /-myindex`, which was interpretted as "delete
everything except for myindex".
Resolves#19800
Currently both `PUT` and `POST` can be used to create indices. This commit
removes support for `POST index_name` so that we can use it to index documents
with auto-generated ids once types are removed.
Relates #15613
This commit defaults the max local storage nodes to one. The motivation
for this change is that a default value greather than one is dangerous
as users sometimes end up unknowingly starting a second node and start
thinking that they have encountered data loss.
Relates #19964
This commit rewords the expect header bug notice to provide the precise
details for the bug arising. In particular, the bug does not impact any
request over 1024 bytes, but instead impacts any request with a body
that is sent in two requests, the first with an Expect: 100-continue
header. The size is irrelevant, and requests with bodies larger than
1024 bytes are okay as long as the Expect: 100-continue header is not
also sent.
Relates #19911
With #19140 we started persisting the node ID across node restarts. Now that we have a "stable" anchor, we can use it to generate a stable default node name and make it easier to track nodes over a restarts. Sadly, this means we will not have those random fun Marvel characters but we feel this is the right tradeoff.
On the implementation side, this requires a bit of juggling because we now need to read the node id from disk before we can log as the node node is part of each log message. The PR move the initialization of NodeEnvironment as high up in the starting sequence as possible, with only one logging message before it to indicate we are initializing. Things look now like this:
```
[2016-07-15 19:38:39,742][INFO ][node ] [_unset_] initializing ...
[2016-07-15 19:38:39,826][INFO ][node ] [aAmiW40] node name set to [aAmiW40] by default. set the [node.name] settings to change it
[2016-07-15 19:38:39,829][INFO ][env ] [aAmiW40] using [1] data paths, mounts [[ /(/dev/disk1)]], net usable_space [5.5gb], net total_space [232.6gb], spins? [unknown], types [hfs]
[2016-07-15 19:38:39,830][INFO ][env ] [aAmiW40] heap size [1.9gb], compressed ordinary object pointers [true]
[2016-07-15 19:38:39,837][INFO ][node ] [aAmiW40] version[5.0.0-alpha5-SNAPSHOT], pid[46048], build[473d3c0/2016-07-15T17:38:06.771Z], OS[Mac OS X/10.11.5/x86_64], JVM[Oracle Corporation/Java HotSpot(TM) 64-Bit Server VM/1.8.0_51/25.51-b03]
[2016-07-15 19:38:40,980][INFO ][plugins ] [aAmiW40] modules [percolator, lang-mustache, lang-painless, reindex, aggs-matrix-stats, lang-expression, ingest-common, lang-groovy, transport-netty], plugins []
[2016-07-15 19:38:43,218][INFO ][node ] [aAmiW40] initialized
```
Needless to say, settings `node.name` explicitly still works as before.
The commit also contains some clean ups to the relationship between Environment, Settings and Plugins. The previous code suggested the path related settings could be changed after the initial Environment was changed. This did not have any effect as the security manager already locked things down.
Add parser for anonymous char_filters/tokenizer/token_filters
Using Settings in AnalyzeRequest for anonymous definition
Add breaking changes document
Closed#8878
Remove `ParseField` constants used for names where there are no deprecated
names and just use the `String` version of the registration method instead.
This is step 2 in cleaning up the plugin interface for extending
search time actions. Aggregations are next.
This is breaking for plugins because those that register a new query should
now implement `SearchPlugin` rather than `onModule(SearchModule)`.
* Removed `Template` class and unified script & template parsing logic. Templates are scripts, so they should be defined as a script. Unless there will be separate template infrastructure, templates should share as much code as possible with scripts.
* Removed ScriptParseException in favour for ElasticsearchParseException
* Moved TemplateQueryBuilder to lang-mustache module because this query is hard coded to work with mustache only
Today `node.mode` and `node.local` serve almost the same purpose, they
are a shortcut for `discovery.type` and `transport.type`. If `node.local: true`
or `node.mode: local` is set elasticsearch will start in _local_ mode which means
only nodes within the same JVM are discovered and a non-network based transport
is used. The _local_ mode it only really used in tests or if nodes are embedded.
For both, embedding and tests explicit configuration via `discovery.type` and `transport.type`
should be preferred.
This change removes all the usage of these settings and by-default doesn't
configure a default transport implemenation since netty is now a module. Yet, to make
the user expericence flawless, plugins or modules can set a `http.type.default` and
`transport.type.default`. Plugins set this via `PluginService#additionalSettings()`
which enforces _set-once_ which prevents node startup if set multiple times. This means
that our distributions will just startup with netty transport since it's packaged as a
module unless `transport.type` or `http.transport.type` is explicitly set.
This change also found a bunch of bugs since several NamedWriteables were not registered if a
transport client is used. Now that we don't rely on the `node.mode` leniency which is inherited
instead of using explicit settings, `TransportClient` uses `AssertingLocalTransport` which detects these problems since it serializes all messages.
Closes#16234
This commit removes support for properties syntax and config files:
- removed support for elasticsearch.properties
- removed support for logging.properties
- removed support for properties content detection in REST APIs
- removed support for properties content detection in Java API
Relates #19398
Switches most search behavior extensions from push (`onModule(SearchModule)`)
to pull (`implements SearchPlugin`). This effort in general gives plugin
authors a much cleaner view of how to extend Elasticsearch and starts to
set up portions of Elasticsearch as "the plugin API". This commit in
particular does that for search-time behavior like customized suggesters,
highlighters, score functions, and significance heuristics.
It also switches most such customization to being done at search module
construction time which is much, much easier to reason about from a testing
perspective. It also helps significantly in the process of de-guice-ing
Elasticsearch's startup.
There are at least two major search time extensions that aren't covered in
this commit that will simply have to wait for the next commit on the topic
because this one has already grown large: custom aggregations and custom
queries. These will likely live in the same SearchPlugin interface as well.
If there are percolator queries containing `range` queries with ranges based on the current time then this can lead to incorrect results if the `percolate` query gets cached. These ranges are changing each time the `percolate` query gets executed and if this query gets cached then the results will be based on how the range was at the time when the `percolate` query got cached.
The ExtractQueryTermsService has been renamed `QueryAnalyzer` and now only deals with analyzing the query (extracting terms and deciding if the entire query is a verified match) . The `PercolatorFieldMapper` is responsible for adding the right fields based on the analysis the `QueryAnalyzer` has performed, because this is highly dependent on the field mappings. Also the `PercolatorFieldMapper` is responsible for creating the percolate query.
Today when a thread encounters a fatal unrecoverable error that
threatens the stability of the JVM, Elasticsearch marches on. This
includes out of memory errors, stack overflow errors and other errors
that leave the JVM in a questionable state. Instead, the Elasticsearch
JVM should die when these errors are encountered. This commit causes
this to be the case.
Relates #19272
This change activates the doc_values on the _size field for indices created after 5.0.0-alpha4.
It also adds a note in the breaking changes that explain the situation and how to get around it.
Closes#18334
Node IDs are currently randomly generated during node startup. That means they change every time the node is restarted. While this doesn't matter for ES proper, it makes it hard for external services to track nodes. Another, more minor, side effect is that indexing the output of, say, the node stats API results in creating new fields due to node ID being used as keys.
The first approach I considered was to use the node's published address as the base for the id. We already [treat nodes with the same address as the same](https://github.com/elastic/elasticsearch/blob/master/core/src/main/java/org/elasticsearch/discovery/zen/NodeJoinController.java#L387) so this is a simple change (see [here](https://github.com/elastic/elasticsearch/compare/master...bleskes:node_persistent_id_based_on_address)). While this is simple and it works for probably most cases, it is not perfect. For example, if after a node restart, the node is not able to bind to the same port (because it's not yet freed by the OS), it will cause the node to still change identity. Also in environments where the host IP can change due to a host restart, identity will not be the same.
Due to those limitation, I opted to go with a different approach where the node id will be persisted in the node's data folder. This has the upside of connecting the id to the nodes data. It also means that the host can be adapted in any way (replace network cards, attach storage to a new VM). I
It does however also have downsides - we now run the risk of two nodes having the same id, if someone copies clones a data folder from one node to another. To mitigate this I changed the semantics of the protection against multiple nodes with the same address to be stricter - it will now reject the incoming join if a node exists with the same id but a different address. Note that if the existing node doesn't respond to pings (i.e., it's not alive) it will be removed and the new node will be accepted when it tries another join.
Last, and most importantly, this change requires that *all* nodes persist data to disk. This is a change from current behavior where only data & master nodes store local files. This is the main reason for marking this PR as breaking.
Other less important notes:
- DummyTransportAddress is removed as we need a unique network address per node. Use `LocalTransportAddress.buildUnique()` instead.
- I renamed `node.add_lid_to_custom_path` to `node.add_lock_id_to_custom_path` to avoid confusion with the node ID which is now part of the `NodeEnvironment` logic.
- I removed the `version` paramater from `MetaDataStateFormat#write` , it wasn't really used and was just in the way :)
- TribeNodes are special in the sense that they do start multiple sub-nodes (previously known as client nodes). Those sub-nodes do not store local files but derive their ID from the parent node id, so they are generated consistently.
Rename `fields` to `stored_fields` and add `docvalue_fields`
`stored_fields` parameter will no longer try to retrieve fields from the _source but will only return stored fields.
`fields` will throw an exception if the user uses it.
Add `docvalue_fields` as an adjunct to `fielddata_fields` which is deprecated. `docvalue_fields` will try to load the value from the docvalue and fallback to fielddata cache if docvalues are not enabled on that field.
Closes#18943
As discussed at https://github.com/elastic/elasticsearch-cloud-azure/issues/91#issuecomment-229113595, we know that the current `discovery-azure` plugin only works with Azure Classic VMs / Services (which is somehow Legacy now).
The proposal here is to rename `discovery-azure` to `discovery-azure-classic` in case some users are using it.
And deprecate it for 5.0.
Closes#19144.
`RestHandler`s are highly tied to actions so registering them in the
same place makes sense.
Removes the need to for plugins to check if they are in transport client
mode before registering a RestHandler - `getRestHandlers` isn't called
at all in transport client mode.
This caused guice to throw a massive fit about the circular dependency
between NodeClient and the allocation deciders. I broke the circular
dependency by registering the actions map with the node client after
instantiation.
Instead of implementing onModule(ActionModule) to register actions,
this has plugins implement ActionPlugin to declare actions. This is
yet another step in cleaning up the plugin infrastructure.
While I was in there I switched AutoCreateIndex and DestructiveOperations
to be eagerly constructed which makes them easier to use when
de-guice-ing the code base.
This commit modifies TimeValue parsing to keep the input time unit. This
enables round-trip parsing from instances of String to instances of
TimeValue and vice-versa. With this, this commit removes support for the
unit "w" representing weeks, and also removes support for fractional
values of units (e.g., 0.5s).
Relates #19102
Instead of plugins calling `registerTokenizer` to extend the analyzer
they now instead have to implement `AnalysisPlugin` and override
`getTokenizer`. This lines up extending plugins in with extending
scripts. This allows `AnalysisModule` to construct the `AnalysisRegistry`
immediately as part of its constructor which makes testing anslysis
much simpler.
This also moves the default analysis configuration into `AnalysisModule`
which is how search is setup.
Like `ScriptModule`, `AnalysisModule` no longer extends `AbstractModule`.
Instead it is only responsible for building `AnslysisRegistry`. We still
bind `AnalysisRegistry` but we only do so in `Node`. This is means it
is available at module construction time so we slowly remove the need to
bind it in guice.
This commit moves template support out of the Search API to its own dedicated Search Template API in the lang-mustache module. It provides a new SearchTemplateAction that can be used to render templates before it gets delegated to the usual Search API. The current REST endpoint are identical, but the Render Search Template endpoint now uses the same Search Template API with a new "simulate" option. When this option is enabled, the Search Template API only renders template and returns immediatly, without executing the search.
Closes#17906
We aren't able to actually create an index with _timestamp enabled
to test the migration, or, at least, we won't be able to after #18980
is re-merged. But the docs are still ok.
Closes#19007
If a plugin declares `onModule(SomethingThatIsntAModule)` then refuse
to start. Before this commit we just logged a warning that flies by in
the console and is easy to miss. You can't miss refusing to start!
`stored_fields` parameter will no longer try to retrieve fields from the _source but will only return stored fields.
`fields` will throw an exception if the user uses it.
Add `docvalue_fields` as an adjunct to `fielddata_fields` which is deprecated. `docvalue_fields` will try to load the value from the docvalue and fallback to fielddata cache if docvalues are not enabled on that field.
Closes#18943
The default similarity was set to `classic` which refers to TFIDF and has not been moved after the upgrade to Lucene 6.
Though moving to BM25 could have some downside for queries that relies on coordination factor (match_query, multi_match_query) ?
relates #18944
This commit removes the search preference _only_node as the same
functionality can be obtained by using the search preference
_only_nodes. This commit also adds a test that ensures that _only_nodes
will continue to support specifying node IDs.
Relates #18875
This commit adds a note to the breaking changes docs that since commit
da74323141, thread pool settings are no
longer cluster-level settings and thus not dynamically updatable.
The search preference _prefer_node allows specifying a single node to
prefer when routing a request. This functionality can be enhanced by
permitting multiple nodes to be preferred. This commit replaces the
search preference _prefer_node with the search preference _prefer_nodes
which supplants the former by specifying a single node and otherwise
adds functionality.
Relates #18872
Previously Elasticsearch used $DATA_DIR/$CLUSTER_NAME/nodes for the path
where data is stored, this commit changes that to be $DATA_DIR/nodes.
On startup, if the old folder structure is detected it will be used.
This behavior will be removed in Elasticsearch 6.0
Resolves#17810
This commit refactors the handling of thread pool settings so that the
individual settings can be registered rather than registering the top
level group. With this refactoring, individual plugins must now register
their own settings for custom thread pools that they need, but a
dedicated API is provided for this in the thread pool module. This
commit also renames the prefix on the thread pool settings from
"threadpool" to "thread_pool". This enables a hard break on the settings
so that:
- some of the settings can be given more sensible names (e.g., the max
number of threads in a scaling thread pool is now named "max" instead
of "size")
- change the soft limit on the number of threads in the bulk and
indexing thread pools to a hard limit
- the settings names for custom plugins for thread pools can be
prefixed (e.g., "xpack.watcher.thread_pool.size")
- remove dynamic thread pool settings
Relates #18674
This adds support for setting the refresh request parameter to
`wait_for` in the `index`, `delete`, `update`, and `bulk` APIs. When
`refresh=wait_for` is set those APIs will not return until their
results have been made visible to search by a refresh.
Also it adds a `forced_refresh` field to the response of `index`,
`delete`, `update`, and to each item in a bulk response. This will
be true for requests with `?refresh` or `?refresh=true` and will be
true for some requests (see below) with `refresh=wait_for` but ought
to otherwise always be false.
`refresh=wait_for` is implemented as a list of
`Tuple<Translog.Location, Consumer<Boolean>>`s in the new `RefreshListeners`
class that is managed by `IndexShard`. The dynamic, index scoped
`index.max_refresh_listeners` setting controls a maximum number of
listeners allowed in any shard. If more than that many listeners
accumulate in the engine then a refresh will be forced, the thread that
adds the listener will be blocked until the refresh completes, and then the
listener will be called with a `forcedRefresh` flag so it knows that it was
the "straw that broke the camel's back". These listeners are only used by
`refresh=wait_for` and that flag manifests itself as `forced_refresh` being
`true` in the response.
About half of this change comes from piping async-ness down to the appropriate
layer in a way that is compatible with the ongoing with with sequence ids.
Closes#1063
You can look up the winding story of all the commits here:
https://github.com/elastic/elasticsearch/pull/17986
Here are the commit messages in case they are intersting to you:
commit 59a753b89109828d2b8f0de05cb104fc663cf95e
Author: Nik Everett <nik9000@gmail.com>
Date: Mon Jun 6 10:18:23 2016 -0400
Replace a method reference with implementing an interface
Saves a single allocation and forces more commonality
between the WriteResults.
commit 31f7861a85b457fb7378a6f27fa0a0c171538f68
Author: Nik Everett <nik9000@gmail.com>
Date: Mon Jun 6 10:07:55 2016 -0400
Revert "Replace static method that takes consumer with delegate class that takes an interface"
This reverts commit 777e23a6592c75db0081a53458cc760f4db69507.
commit 777e23a6592c75db0081a53458cc760f4db69507
Author: Nik Everett <nik9000@gmail.com>
Date: Mon Jun 6 09:29:35 2016 -0400
Replace static method that takes consumer with delegate class that takes an interface
Same number of allocations, much less code duplication.
commit 9b49a480ca9587a0a16ebe941662849f38289644
Author: Nik Everett <nik9000@gmail.com>
Date: Mon Jun 6 08:25:38 2016 -0400
Patch from boaz
commit c2bc36524fda119fd0514415127e8901d94409c8
Author: Nik Everett <nik9000@gmail.com>
Date: Thu Jun 2 14:46:27 2016 -0400
Fix docs
After updating to master we are actually testing them.
commit 03975ac056e44954eb0a371149d410dcf303e212
Author: Nik Everett <nik9000@gmail.com>
Date: Thu Jun 2 14:20:11 2016 -0400
Cleanup after merge from master
commit 9c9a1deb002c5bebb2a997c89fa12b3d7978e02e
Author: Nik Everett <nik9000@gmail.com>
Date: Thu Jun 2 14:09:14 2016 -0400
Breaking changes notes
commit 1c3e64ae06c07a85f7af80534fab88279adb30b4
Merge: 9e63ad6 f67e580
Author: Nik Everett <nik9000@gmail.com>
Date: Thu Jun 2 14:00:05 2016 -0400
Merge branch 'master' into block_until_refresh2
commit 9e63ad6de52d0b28f0b6d7203721baf1ebf6f56b
Author: Nik Everett <nik9000@gmail.com>
Date: Thu Jun 2 13:21:27 2016 -0400
Test for TransportWriteAction
commit 522ecb59d39b3c9e8df0d3b8df34b9e7aeaf0ce9
Author: Nik Everett <nik9000@gmail.com>
Date: Thu Jun 2 10:30:18 2016 -0400
Document deprecation
commit 0cd67b947f58867e704a1f0e66928a6fb5a11f11
Author: Nik Everett <nik9000@gmail.com>
Date: Thu Jun 2 10:26:23 2016 -0400
Deprecate setRefresh(boolean)
Users should use `setRefresh(RefreshPolicy)` instead.
commit aeb1be3f2c501990b33fb1f8230d496035f498ef
Author: Nik Everett <nik9000@gmail.com>
Date: Thu Jun 2 10:12:27 2016 -0400
Remove checkstyle suppression
It is fixed
commit 00d09a9caa638b6f90f4896b5502dd98d8fad56e
Author: Nik Everett <nik9000@gmail.com>
Date: Thu Jun 2 10:08:28 2016 -0400
Improve comment
commit 788164b898a6ee2878a273961230122b7386c3c9
Author: Nik Everett <nik9000@gmail.com>
Date: Thu Jun 2 10:01:01 2016 -0400
S/ReplicatedWriteResponse/WriteResponse/
Now it lines up with WriteRequest.
commit b74cf3fe778352b140355afcaa08d3d4412d749d
Author: Nik Everett <nik9000@gmail.com>
Date: Wed Jun 1 18:27:52 2016 -0400
Preserve `?refresh` behavior
`?refresh` means the same things as `?refresh=true`.
commit 30f972bdaeaaa0de6fe67746cdb8628aa86f5a8c
Author: Nik Everett <nik9000@gmail.com>
Date: Wed Jun 1 17:39:05 2016 -0400
Handle hanging documents
If a document is added to the index during a refresh we weren't properly
firing its refresh listener. This happened because the way we detect
whether a refresh makes something visible or not is imperfect. It is
ok because it always errs on the side of thinking that something isn't
yet visible.
So when a document arrives during a refresh the refresh listeners
won't think it made it into a refresh when, often, it does. The way
we work around this is by telling Elasticsearch that it ought to
trigger a refresh if there are any pending refresh listeners even
if there aren't pending documents to update. Lucene short circuits
the refresh so it doesn't take that much effort, but the refresh
listeners still get the signal that a refresh has come in and they
still pick up the change and notify the listener.
This means that the time that a listener can wait is actually slightly
longer than the refresh interval.
commit d523b5702b60c7ba309fb0dcf3cd3a4798f11960
Author: Nik Everett <nik9000@gmail.com>
Date: Wed Jun 1 14:34:01 2016 -0400
Explain Integer.MAX_VALUE
commit 4ffb7c0e954343cc1c04b3d7be2ebad66d3a016b
Author: Nik Everett <nik9000@gmail.com>
Date: Wed Jun 1 14:27:39 2016 -0400
Fire all refresh listeners in a single thread
Rather than queueing a runnable each.
commit 19606ec3bbe612095df45eba734c5b7eb2709c01
Author: Nik Everett <nik9000@gmail.com>
Date: Wed Jun 1 14:09:52 2016 -0400
Assert translog ordering
commit 6bb4e5c75e850f4a42518f06fbc955f7ec76d245
Author: Nik Everett <nik9000@gmail.com>
Date: Wed Jun 1 13:17:44 2016 -0400
Support null RefreshListeners in InternalEngine
Just skip using it.
commit 74be1480d6e44af2b354ff9ea47c234d4870b6c2
Author: Nik Everett <nik9000@gmail.com>
Date: Tue May 31 18:02:03 2016 -0400
Move funny ShardInfo hack for bulk into bulk
This should make it easier to understand because it is closer to where it
matters....
commit 2b771f8dabd488e056cfdc9989608d18264ddfb0
Author: Nik Everett <nik9000@gmail.com>
Date: Tue May 31 17:39:46 2016 -0400
Pull listener out into an inner class with javadoc and stuff
commit 058481ad72019c0492b03a7a4ac32a48673697d3
Author: Nik Everett <nik9000@gmail.com>
Date: Tue May 31 17:33:42 2016 -0400
Fix javadoc links
commit d2123b1cabf29bce8ff561d4a4c1c1d5b42bccad
Author: Nik Everett <nik9000@gmail.com>
Date: Tue May 31 17:28:09 2016 -0400
Make more stuff final
commit 8453fc4f7850f6a02fb5971c17a942a3e3fd9f7b
Author: Nik Everett <nik9000@gmail.com>
Date: Tue May 31 17:26:48 2016 -0400
Javadoc
commit fb16d2fc7016c1e8e1621d481e8781c7ef43326c
Author: Nik Everett <nik9000@gmail.com>
Date: Tue May 31 16:14:48 2016 -0400
Rewrite refresh docs
commit 5797d1b1c4d233c0db918c0d08c21731ddccd05e
Author: Nik Everett <nik9000@gmail.com>
Date: Tue May 31 15:02:34 2016 -0400
Fix forced_refresh flag
It wasn't being set.
commit 43ce50a1de250a9e073a2ca6cbf55c1b4c74b11b
Author: Nik Everett <nik9000@gmail.com>
Date: Tue May 31 14:02:56 2016 -0400
Delay translog sync and flush until after refresh
The sync might have occurred for us during the refresh so we
have less work to do. Maybe.
commit bb2739202e084703baf02cfa58f09517598cf14e
Author: Nik Everett <nik9000@gmail.com>
Date: Tue May 31 13:08:08 2016 -0400
Remove duplication in WritePrimaryResult and WriteReplicaResult
commit 2f579f89b4867a880396f2e7fcffc508449ff2de
Author: Nik Everett <nik9000@gmail.com>
Date: Tue May 31 12:19:05 2016 -0400
Clean up registration of RefreshListeners
commit 87ab6e60ca5ba945bf0fba84784b2bbe53506abf
Author: Nik Everett <nik9000@gmail.com>
Date: Tue May 31 11:28:30 2016 -0400
Shorten lock time in RefreshListeners
Also use null to represent no listeners rather than an empty list.
This saves allocating a new ArrayList every refresh cycle on every
index.
commit 0d49d9c5720dadfb67da3fa760397bf6d874601c
Author: Nik Everett <nik9000@gmail.com>
Date: Tue May 24 10:46:18 2016 -0400
Flip relationship between RefreshListeners and Engine
Now RefreshListeners comes to Engine from EngineConfig.
commit b2704b8a39382953f8f91a9743e894ee289f7514
Author: Nik Everett <nik9000@gmail.com>
Date: Tue May 24 09:37:58 2016 -0400
Remove unused imports
Maybe I added them?
commit 04343a22647f19304d9dc716b3fac9b183227f63
Author: Nik Everett <nik9000@gmail.com>
Date: Tue May 24 09:37:52 2016 -0400
Javadoc
commit da1e765678890a02d61d8a29aa433274beb5e00c
Author: Nik Everett <nik9000@gmail.com>
Date: Tue May 24 09:26:35 2016 -0400
Reply with non-null
Also move the fsync and flush to before the refresh listener stuff.
commit 5d8eecd0d904b497844b4c81c46477bd6178ed3a
Author: Nik Everett <nik9000@gmail.com>
Date: Tue May 24 08:58:47 2016 -0400
Remove funky synchronization in AsyncReplicaAction
commit 1ec71eea0f4e1228ae1497d982307be818ef4b65
Author: Nik Everett <nik9000@gmail.com>
Date: Tue May 24 08:01:14 2016 -0400
s/LinkedTransferQueue/ArrayList/
commit 7da36a4ceed2ccf7955138c3b005237fa41efcb4
Author: Nik Everett <nik9000@gmail.com>
Date: Tue May 24 07:46:38 2016 -0400
More cleanup for RefreshListeners
commit 957e9b77007c32ee75dde152c6622bab065d5993
Author: Nik Everett <nik9000@gmail.com>
Date: Tue May 24 07:34:13 2016 -0400
/Consumer<Runnable>/Executor/
commit 4d8bf5d4a70dcc56150c8d8d14165cd23d308b3c
Author: Nik Everett <nik9000@gmail.com>
Date: Mon May 23 22:20:42 2016 -0400
explain
commit 15d948a348089bb2937eec5ac4e96f3ec67dbe32
Author: Nik Everett <nik9000@gmail.com>
Date: Mon May 23 22:17:59 2016 -0400
Better....
commit dc28951d02973fc03b4d51913b5f96de14b75607
Author: Nik Everett <nik9000@gmail.com>
Date: Mon May 23 21:09:20 2016 -0400
Javadocs and compromises
commit 8eebaa89c0a1ee74982fbe0d56d1485ca2ae09db
Author: Nik Everett <nik9000@gmail.com>
Date: Mon May 23 20:52:49 2016 -0400
Take boaz's changes to their logic conclusion and unbreak important stuff like bulk
commit 7056b96ea412f275005b93e3570bcff895859ed5
Author: Nik Everett <nik9000@gmail.com>
Date: Mon May 23 15:49:32 2016 -0400
Patch from boaz
commit 87be7eaed09a274cc6a99d1a3da81d2d7bf9dd64
Author: Nik Everett <nik9000@gmail.com>
Date: Mon May 23 15:49:13 2016 -0400
Revert "Move async parts of replica operation outside of the lock"
This reverts commit 13807ad10b6f5ecd39f98c9f20874f9f352c5bc2.
commit 13807ad10b6f5ecd39f98c9f20874f9f352c5bc2
Author: Nik Everett <nik9000@gmail.com>
Date: Fri May 20 22:53:15 2016 -0400
Move async parts of replica operation outside of the lock
commit b8cadcef565908b276484f7f5f988fd58b38d8b6
Author: Nik Everett <nik9000@gmail.com>
Date: Fri May 20 16:17:20 2016 -0400
Docs
commit 91149e0580233bf79c2273b419fe9374ca746648
Author: Nik Everett <nik9000@gmail.com>
Date: Fri May 20 15:17:40 2016 -0400
Finally!
commit 1ff50c2faf56665d221f00a18d9ac88745904bf5
Author: Nik Everett <nik9000@gmail.com>
Date: Fri May 20 15:01:53 2016 -0400
Remove Translog#lastWriteLocation
I wasn't being careful enough with locks so it wasn't right anyway.
Instead this builds a synthetic Tranlog.Location when you call
getWriteLocation with much more relaxed equality guarantees. Rather
than being equal to the last Translog.Location returned it is
simply guaranteed to be greater than the last translog returned
and less than the next.
commit 55596ea68b5484490c3637fbad0d95564236478b
Author: Nik Everett <nik9000@gmail.com>
Date: Fri May 20 14:40:06 2016 -0400
Remove listener from shardOperationOnPrimary
Create instead asyncShardOperationOnPrimary which is called after
all of the replica operations are started to handle any async
operations.
commit 3322e26211bf681b37132274ee158ae330afc28b
Author: Nik Everett <nik9000@gmail.com>
Date: Tue May 17 17:20:02 2016 -0400
Increase default maximum number of listeners to 1000
commit 88171a8322a424e624d48960fb4c98dd43e4d671
Author: Nik Everett <nik9000@gmail.com>
Date: Tue May 17 16:40:57 2016 -0400
Rename test
commit 179c27c4f829f2c6ded65967652cf85adaf2ae52
Author: Nik Everett <nik9000@gmail.com>
Date: Tue May 17 16:35:27 2016 -0400
Move refresh listeners into their own class
They still live at the IndexShard level but they live on their
own in RefreshListeners which interacts with IndexShard using a
couple of callbacks and a registration method. This lets us test
the listeners without standing up an entire IndexShard. We still
test the listeners against an InternalEngine, because the interplay
between InternalEngine, Translog, and RefreshListeners is complex
and important to get right.
commit d8926d5fc1d24b4da8ccff7e0f0907b98c583c41
Author: Nik Everett <nik9000@gmail.com>
Date: Tue May 17 11:02:38 2016 -0400
Move refresh listeners into IndexShard
commit df91cde398eb720143a85a8c6fa19bdc3a74e07d
Author: Nik Everett <nik9000@gmail.com>
Date: Mon May 16 16:01:03 2016 -0400
unused import
commit 066da45b08148b266e4173166662fc1b3f66ed53
Author: Nik Everett <nik9000@gmail.com>
Date: Mon May 16 15:54:11 2016 -0400
Remove RefreshListener interface
Just pass a Translog.Location and a Consumer<Boolean> when registering.
commit b971d6d3301c7522b2e7eb90d5d8dd96a77fa625
Author: Nik Everett <nik9000@gmail.com>
Date: Mon May 16 14:41:06 2016 -0400
Docs for setForcedRefresh
commit 6c43be821eaf61141d3ec520f988aad3a96a3941
Author: Nik Everett <nik9000@gmail.com>
Date: Mon May 16 14:34:39 2016 -0400
Rename refresh setter and getter
commit e61b7391f91263a4c4d6107bfbc2a828bbcc805c
Author: Nik Everett <nik9000@gmail.com>
Date: Mon Apr 25 22:48:09 2016 -0400
Trigger listeners even when there is no refresh
Each refresh gives us an opportunity to pick up any listeners we may
have left behind.
commit 0c9b0477085c021f503db775640d25668e02f635
Author: Nik Everett <nik9000@gmail.com>
Date: Mon Apr 25 20:30:06 2016 -0400
REST
commit 8250343240de7e63118c663a230a7a314807a754
Author: Nik Everett <nik9000@gmail.com>
Date: Mon Apr 25 19:34:22 2016 -0400
Switch to estimated count
We don't need a linear time count of the number of listeners - a volatile
variable is good enough to guess. It probably undercounts more than it
overcounts but it isn't a huge problem.
commit bd531167fe54f1bde6f6d4ddb0a8de5a7bcc18a2
Author: Nik Everett <nik9000@gmail.com>
Date: Mon Apr 25 18:21:02 2016 -0400
Don't try and set forced refresh on bulk items without a response
NullPointerExceptions are bad. If the entire request fails then the user
has worse problems then "did these force a refresh".
commit bcfded11515af5e0b3c3e36f3c2f73f5cd26512e
Author: Nik Everett <nik9000@gmail.com>
Date: Mon Apr 25 18:14:20 2016 -0400
Replace LinkedList and synchronized with LinkedTransferQueue
commit 8a80cc70a76375a7593745884cb987535b37ca80
Author: Nik Everett <nik9000@gmail.com>
Date: Mon Apr 25 17:38:24 2016 -0400
Support for update
commit 1f36966742f851b7328015151ef6fc8f95299af2
Author: Nik Everett <nik9000@gmail.com>
Date: Mon Apr 25 15:46:06 2016 -0400
Cleanup translog tests
commit 8d121bf35eb265b8a0aee9710afeb1b054a113d4
Author: Nik Everett <nik9000@gmail.com>
Date: Mon Apr 25 15:40:53 2016 -0400
Cleanup listener implementation
Much more testing too!
commit 2058f4a808762c4588309f21b13b677245832f2c
Author: Nik Everett <nik9000@gmail.com>
Date: Mon Apr 25 11:45:55 2016 -0400
Pass back information about whether we refreshed
commit e445cb0cb91ebdbcfdbf566696edb2bf1c84a882
Author: Nik Everett <nik9000@gmail.com>
Date: Mon Apr 25 11:03:31 2016 -0400
Javadoc
commit 611cbeeaeb458f4b428bfc43a1ee6652adf4baff
Author: Nik Everett <nik9000@gmail.com>
Date: Mon Apr 25 11:01:40 2016 -0400
Move ReplicationResponse
now it is in the same package as its request
commit 9919758b644fd73895fb88cd6a4909a8387eb2e2
Author: Nik Everett <nik9000@gmail.com>
Date: Mon Apr 25 11:00:14 2016 -0400
Oh boy that wasn't working
commit 247cb483c4459dea8e95e0e3bd2e4bf8d452c598
Author: Nik Everett <nik9000@gmail.com>
Date: Mon Apr 25 10:29:37 2016 -0400
Basic block_until_refresh exposed to java client
and basic "is it plugged in" style tests.
commit 46c855c9971cb2b748206d2afa6a2d88724be3ba
Author: Nik Everett <nik9000@gmail.com>
Date: Mon Apr 25 10:11:10 2016 -0400
Move test to own class
commit a5ffd892d0a352ae7e9757f2640fc2a1fa656bf2
Author: Nik Everett <nik9000@gmail.com>
Date: Mon Apr 25 07:44:25 2016 -0400
WIP
commit 213bebb6ece11b85d17e44af9a54fc2e5e332d39
Author: Nik Everett <nik9000@gmail.com>
Date: Fri Apr 22 21:35:52 2016 -0400
Add refresh listeners
commit a2bc7f30e6d4857a1224ef5a89909b36c8f33731
Author: Nik Everett <nik9000@gmail.com>
Date: Fri Apr 22 21:11:55 2016 -0400
Return last written location from refresh
commit 85033a87551da89f36a23d4dfd5016db218e08ee
Author: Nik Everett <nik9000@gmail.com>
Date: Fri Apr 22 20:28:21 2016 -0400
Never reply to replica actions while you have the operation lock
This last thing was causing periodic test failures because we were
replying while we had the operation lock. Now, we probably could get
away with that in most cases but the tests don't like it and it isn't
a good idea to do network io while you have a lock anyway. So this
prevents it.
commit 1f25cf35e796835b3827b8a4110e09e5de61784c
Author: Nik Everett <nik9000@gmail.com>
Date: Fri Apr 22 19:56:18 2016 -0400
Cleanup
commit 52c5f7c3f04710901f503334239a611c0e21c85a
Author: Nik Everett <nik9000@gmail.com>
Date: Fri Apr 22 19:33:00 2016 -0400
Add a listener to shard operations
commit 5b142dc331214c8eef90587144f4b3f959f9eced
Author: Nik Everett <nik9000@gmail.com>
Date: Fri Apr 22 18:03:52 2016 -0400
Cleanup
commit 3d22b2d7ceb473db339259452a7c4f117ce86069
Author: Nik Everett <nik9000@gmail.com>
Date: Fri Apr 22 17:59:55 2016 -0400
Push the listener into shardOperationOnPrimary
commit 34b378943b8185451acf6350f661c0ad33b5836d
Author: Nik Everett <nik9000@gmail.com>
Date: Fri Apr 22 17:48:47 2016 -0400
Doc
commit b42b8da968d42cc7414020c7b199606a5dcce50a
Author: Nik Everett <nik9000@gmail.com>
Date: Fri Apr 22 17:45:40 2016 -0400
Don't finish early if the primary finishes early
We use a "fake" pending shard that we resolve when the replicas have
all started.
commit 0fc045b56e1e02a48c30383ac50a281d5af7e0b6
Author: Nik Everett <nik9000@gmail.com>
Date: Fri Apr 22 17:30:06 2016 -0400
Make performOnPrimary asyncS
Instead of returning Tuple<Response, ReplicaRequest> it returns
ReplicaRequest and takes a ActionListener<Response> as an argument.
We call the listener immediately to preserve backwards compatibility
for now.
commit 80119b9a26ede96a865af45904c3ac69d5b19b59
Author: Nik Everett <nik9000@gmail.com>
Date: Fri Apr 22 16:51:53 2016 -0400
Factor out common code in shardOperationOnPrimary
commit 0642083676702618f900fa842c08802a04c1a53e
Author: Nik Everett <nik9000@gmail.com>
Date: Fri Apr 22 16:32:29 2016 -0400
Factor out common code from shardOperationOnReplica
commit 8bdc415fedaaa9f2d0c555590a13ec4699a7c3f7
Author: Nik Everett <nik9000@gmail.com>
Date: Fri Apr 22 16:23:28 2016 -0400
Create ReplicatedMutationRequest
Superclass for index, delete, and bulkShard requests.
commit 0f8fa846a2822c4293df32fed18c9b99660b39ff
Author: Nik Everett <nik9000@gmail.com>
Date: Fri Apr 22 16:10:30 2016 -0400
Create TransportReplicatedMutationAction
It is the superclass of replication actions that mutate data: index, delete,
and shardBulk. shardFlush and shardRefresh are replication actions but they
do not extend TransportReplicatedMutationAction because they don't change
the data, only shuffle it around.
The setting bootstrap.mlockall is useful on both POSIX-like systems
(POSIX mlockall) and Windows (Win32 VirtualLock). But mlockall is really
a POSIX only thing so the name should not be tied POSIX. This commit
renames the setting to "bootstrap.memory_lock".
Relates #18669
Like on other places in the query dsl the full field name should be used.
Before this change this wasn't the case for nested inner hits when source filtering was used.
Highlighting has a workaround, which is now removed as the source of nested inner hits can only be refered by the full name.
Closes#16653
This commit removes the ability to specify a custom plugins
path. Instead, the plugins path will always be a subdirectory called
"plugins" off of the home directory.
The syntax highlighter only supports [source,js].
Also adds a check to the rest test generator that runs during
the build that'll fail the build if it sees `[source,json]`.