This adds a get task API that supports GET /_tasks/${taskId} and
removes that responsibility from the list tasks API. The get task
API supports wait_for_complation just as the list tasks API does
but doesn't support any of the list task API's filters. In exchange,
it supports falling back to the .results index when the task isn't
running any more. Like any good GET API it 404s when it doesn't
find the task.
Then we change reindex, update-by-query, and delete-by-query to
persist the task result when wait_for_completion=false. The leads
to the neat behavior that, once you start a reindex with
wait_for_completion=false, you can fetch the result of the task by
using the get task API and see the result when it has finished.
Also rename the .results index to .tasks.
By default the number of searches msearch executes is capped by the number of
nodes multiplied with the default size of the search threadpool. This default can be
overwritten by using the newly added `max_concurrent_searches` parameter.
Before the msearch api would concurrently execute all searches concurrently. If many large
msearch requests would be executed this could lead to some searches being rejected
while other searches in the msearch request would succeed.
The goal of this change is to avoid this exhausting of the search TP.
Closes#17926
Folded grok processor into ingest-common module.
The rest tests have been moved to ingest-common module as well, because these tests don't run in the rest-api-spec module but in the distribution:integ-test-zip module
and adding a test plugin there felt just wrong to me. I think this is ok. I left a tiny ingest rest test behind in that tests with an empty pipeline.
Removed messy tests, these tests were already covered in the rest tests
Added ingest test plugin in test infra so that each module testing integration with ingest doesn't need write its own plugin
Moved reindex ingest tests to qa module
Closes#18490
This adds support for setting the refresh request parameter to
`wait_for` in the `index`, `delete`, `update`, and `bulk` APIs. When
`refresh=wait_for` is set those APIs will not return until their
results have been made visible to search by a refresh.
Also it adds a `forced_refresh` field to the response of `index`,
`delete`, `update`, and to each item in a bulk response. This will
be true for requests with `?refresh` or `?refresh=true` and will be
true for some requests (see below) with `refresh=wait_for` but ought
to otherwise always be false.
`refresh=wait_for` is implemented as a list of
`Tuple<Translog.Location, Consumer<Boolean>>`s in the new `RefreshListeners`
class that is managed by `IndexShard`. The dynamic, index scoped
`index.max_refresh_listeners` setting controls a maximum number of
listeners allowed in any shard. If more than that many listeners
accumulate in the engine then a refresh will be forced, the thread that
adds the listener will be blocked until the refresh completes, and then the
listener will be called with a `forcedRefresh` flag so it knows that it was
the "straw that broke the camel's back". These listeners are only used by
`refresh=wait_for` and that flag manifests itself as `forced_refresh` being
`true` in the response.
About half of this change comes from piping async-ness down to the appropriate
layer in a way that is compatible with the ongoing with with sequence ids.
Closes#1063
You can look up the winding story of all the commits here:
https://github.com/elastic/elasticsearch/pull/17986
Here are the commit messages in case they are intersting to you:
commit 59a753b89109828d2b8f0de05cb104fc663cf95e
Author: Nik Everett <nik9000@gmail.com>
Date: Mon Jun 6 10:18:23 2016 -0400
Replace a method reference with implementing an interface
Saves a single allocation and forces more commonality
between the WriteResults.
commit 31f7861a85b457fb7378a6f27fa0a0c171538f68
Author: Nik Everett <nik9000@gmail.com>
Date: Mon Jun 6 10:07:55 2016 -0400
Revert "Replace static method that takes consumer with delegate class that takes an interface"
This reverts commit 777e23a6592c75db0081a53458cc760f4db69507.
commit 777e23a6592c75db0081a53458cc760f4db69507
Author: Nik Everett <nik9000@gmail.com>
Date: Mon Jun 6 09:29:35 2016 -0400
Replace static method that takes consumer with delegate class that takes an interface
Same number of allocations, much less code duplication.
commit 9b49a480ca9587a0a16ebe941662849f38289644
Author: Nik Everett <nik9000@gmail.com>
Date: Mon Jun 6 08:25:38 2016 -0400
Patch from boaz
commit c2bc36524fda119fd0514415127e8901d94409c8
Author: Nik Everett <nik9000@gmail.com>
Date: Thu Jun 2 14:46:27 2016 -0400
Fix docs
After updating to master we are actually testing them.
commit 03975ac056e44954eb0a371149d410dcf303e212
Author: Nik Everett <nik9000@gmail.com>
Date: Thu Jun 2 14:20:11 2016 -0400
Cleanup after merge from master
commit 9c9a1deb002c5bebb2a997c89fa12b3d7978e02e
Author: Nik Everett <nik9000@gmail.com>
Date: Thu Jun 2 14:09:14 2016 -0400
Breaking changes notes
commit 1c3e64ae06c07a85f7af80534fab88279adb30b4
Merge: 9e63ad6 f67e580
Author: Nik Everett <nik9000@gmail.com>
Date: Thu Jun 2 14:00:05 2016 -0400
Merge branch 'master' into block_until_refresh2
commit 9e63ad6de52d0b28f0b6d7203721baf1ebf6f56b
Author: Nik Everett <nik9000@gmail.com>
Date: Thu Jun 2 13:21:27 2016 -0400
Test for TransportWriteAction
commit 522ecb59d39b3c9e8df0d3b8df34b9e7aeaf0ce9
Author: Nik Everett <nik9000@gmail.com>
Date: Thu Jun 2 10:30:18 2016 -0400
Document deprecation
commit 0cd67b947f58867e704a1f0e66928a6fb5a11f11
Author: Nik Everett <nik9000@gmail.com>
Date: Thu Jun 2 10:26:23 2016 -0400
Deprecate setRefresh(boolean)
Users should use `setRefresh(RefreshPolicy)` instead.
commit aeb1be3f2c501990b33fb1f8230d496035f498ef
Author: Nik Everett <nik9000@gmail.com>
Date: Thu Jun 2 10:12:27 2016 -0400
Remove checkstyle suppression
It is fixed
commit 00d09a9caa638b6f90f4896b5502dd98d8fad56e
Author: Nik Everett <nik9000@gmail.com>
Date: Thu Jun 2 10:08:28 2016 -0400
Improve comment
commit 788164b898a6ee2878a273961230122b7386c3c9
Author: Nik Everett <nik9000@gmail.com>
Date: Thu Jun 2 10:01:01 2016 -0400
S/ReplicatedWriteResponse/WriteResponse/
Now it lines up with WriteRequest.
commit b74cf3fe778352b140355afcaa08d3d4412d749d
Author: Nik Everett <nik9000@gmail.com>
Date: Wed Jun 1 18:27:52 2016 -0400
Preserve `?refresh` behavior
`?refresh` means the same things as `?refresh=true`.
commit 30f972bdaeaaa0de6fe67746cdb8628aa86f5a8c
Author: Nik Everett <nik9000@gmail.com>
Date: Wed Jun 1 17:39:05 2016 -0400
Handle hanging documents
If a document is added to the index during a refresh we weren't properly
firing its refresh listener. This happened because the way we detect
whether a refresh makes something visible or not is imperfect. It is
ok because it always errs on the side of thinking that something isn't
yet visible.
So when a document arrives during a refresh the refresh listeners
won't think it made it into a refresh when, often, it does. The way
we work around this is by telling Elasticsearch that it ought to
trigger a refresh if there are any pending refresh listeners even
if there aren't pending documents to update. Lucene short circuits
the refresh so it doesn't take that much effort, but the refresh
listeners still get the signal that a refresh has come in and they
still pick up the change and notify the listener.
This means that the time that a listener can wait is actually slightly
longer than the refresh interval.
commit d523b5702b60c7ba309fb0dcf3cd3a4798f11960
Author: Nik Everett <nik9000@gmail.com>
Date: Wed Jun 1 14:34:01 2016 -0400
Explain Integer.MAX_VALUE
commit 4ffb7c0e954343cc1c04b3d7be2ebad66d3a016b
Author: Nik Everett <nik9000@gmail.com>
Date: Wed Jun 1 14:27:39 2016 -0400
Fire all refresh listeners in a single thread
Rather than queueing a runnable each.
commit 19606ec3bbe612095df45eba734c5b7eb2709c01
Author: Nik Everett <nik9000@gmail.com>
Date: Wed Jun 1 14:09:52 2016 -0400
Assert translog ordering
commit 6bb4e5c75e850f4a42518f06fbc955f7ec76d245
Author: Nik Everett <nik9000@gmail.com>
Date: Wed Jun 1 13:17:44 2016 -0400
Support null RefreshListeners in InternalEngine
Just skip using it.
commit 74be1480d6e44af2b354ff9ea47c234d4870b6c2
Author: Nik Everett <nik9000@gmail.com>
Date: Tue May 31 18:02:03 2016 -0400
Move funny ShardInfo hack for bulk into bulk
This should make it easier to understand because it is closer to where it
matters....
commit 2b771f8dabd488e056cfdc9989608d18264ddfb0
Author: Nik Everett <nik9000@gmail.com>
Date: Tue May 31 17:39:46 2016 -0400
Pull listener out into an inner class with javadoc and stuff
commit 058481ad72019c0492b03a7a4ac32a48673697d3
Author: Nik Everett <nik9000@gmail.com>
Date: Tue May 31 17:33:42 2016 -0400
Fix javadoc links
commit d2123b1cabf29bce8ff561d4a4c1c1d5b42bccad
Author: Nik Everett <nik9000@gmail.com>
Date: Tue May 31 17:28:09 2016 -0400
Make more stuff final
commit 8453fc4f7850f6a02fb5971c17a942a3e3fd9f7b
Author: Nik Everett <nik9000@gmail.com>
Date: Tue May 31 17:26:48 2016 -0400
Javadoc
commit fb16d2fc7016c1e8e1621d481e8781c7ef43326c
Author: Nik Everett <nik9000@gmail.com>
Date: Tue May 31 16:14:48 2016 -0400
Rewrite refresh docs
commit 5797d1b1c4d233c0db918c0d08c21731ddccd05e
Author: Nik Everett <nik9000@gmail.com>
Date: Tue May 31 15:02:34 2016 -0400
Fix forced_refresh flag
It wasn't being set.
commit 43ce50a1de250a9e073a2ca6cbf55c1b4c74b11b
Author: Nik Everett <nik9000@gmail.com>
Date: Tue May 31 14:02:56 2016 -0400
Delay translog sync and flush until after refresh
The sync might have occurred for us during the refresh so we
have less work to do. Maybe.
commit bb2739202e084703baf02cfa58f09517598cf14e
Author: Nik Everett <nik9000@gmail.com>
Date: Tue May 31 13:08:08 2016 -0400
Remove duplication in WritePrimaryResult and WriteReplicaResult
commit 2f579f89b4867a880396f2e7fcffc508449ff2de
Author: Nik Everett <nik9000@gmail.com>
Date: Tue May 31 12:19:05 2016 -0400
Clean up registration of RefreshListeners
commit 87ab6e60ca5ba945bf0fba84784b2bbe53506abf
Author: Nik Everett <nik9000@gmail.com>
Date: Tue May 31 11:28:30 2016 -0400
Shorten lock time in RefreshListeners
Also use null to represent no listeners rather than an empty list.
This saves allocating a new ArrayList every refresh cycle on every
index.
commit 0d49d9c5720dadfb67da3fa760397bf6d874601c
Author: Nik Everett <nik9000@gmail.com>
Date: Tue May 24 10:46:18 2016 -0400
Flip relationship between RefreshListeners and Engine
Now RefreshListeners comes to Engine from EngineConfig.
commit b2704b8a39382953f8f91a9743e894ee289f7514
Author: Nik Everett <nik9000@gmail.com>
Date: Tue May 24 09:37:58 2016 -0400
Remove unused imports
Maybe I added them?
commit 04343a22647f19304d9dc716b3fac9b183227f63
Author: Nik Everett <nik9000@gmail.com>
Date: Tue May 24 09:37:52 2016 -0400
Javadoc
commit da1e765678890a02d61d8a29aa433274beb5e00c
Author: Nik Everett <nik9000@gmail.com>
Date: Tue May 24 09:26:35 2016 -0400
Reply with non-null
Also move the fsync and flush to before the refresh listener stuff.
commit 5d8eecd0d904b497844b4c81c46477bd6178ed3a
Author: Nik Everett <nik9000@gmail.com>
Date: Tue May 24 08:58:47 2016 -0400
Remove funky synchronization in AsyncReplicaAction
commit 1ec71eea0f4e1228ae1497d982307be818ef4b65
Author: Nik Everett <nik9000@gmail.com>
Date: Tue May 24 08:01:14 2016 -0400
s/LinkedTransferQueue/ArrayList/
commit 7da36a4ceed2ccf7955138c3b005237fa41efcb4
Author: Nik Everett <nik9000@gmail.com>
Date: Tue May 24 07:46:38 2016 -0400
More cleanup for RefreshListeners
commit 957e9b77007c32ee75dde152c6622bab065d5993
Author: Nik Everett <nik9000@gmail.com>
Date: Tue May 24 07:34:13 2016 -0400
/Consumer<Runnable>/Executor/
commit 4d8bf5d4a70dcc56150c8d8d14165cd23d308b3c
Author: Nik Everett <nik9000@gmail.com>
Date: Mon May 23 22:20:42 2016 -0400
explain
commit 15d948a348089bb2937eec5ac4e96f3ec67dbe32
Author: Nik Everett <nik9000@gmail.com>
Date: Mon May 23 22:17:59 2016 -0400
Better....
commit dc28951d02973fc03b4d51913b5f96de14b75607
Author: Nik Everett <nik9000@gmail.com>
Date: Mon May 23 21:09:20 2016 -0400
Javadocs and compromises
commit 8eebaa89c0a1ee74982fbe0d56d1485ca2ae09db
Author: Nik Everett <nik9000@gmail.com>
Date: Mon May 23 20:52:49 2016 -0400
Take boaz's changes to their logic conclusion and unbreak important stuff like bulk
commit 7056b96ea412f275005b93e3570bcff895859ed5
Author: Nik Everett <nik9000@gmail.com>
Date: Mon May 23 15:49:32 2016 -0400
Patch from boaz
commit 87be7eaed09a274cc6a99d1a3da81d2d7bf9dd64
Author: Nik Everett <nik9000@gmail.com>
Date: Mon May 23 15:49:13 2016 -0400
Revert "Move async parts of replica operation outside of the lock"
This reverts commit 13807ad10b6f5ecd39f98c9f20874f9f352c5bc2.
commit 13807ad10b6f5ecd39f98c9f20874f9f352c5bc2
Author: Nik Everett <nik9000@gmail.com>
Date: Fri May 20 22:53:15 2016 -0400
Move async parts of replica operation outside of the lock
commit b8cadcef565908b276484f7f5f988fd58b38d8b6
Author: Nik Everett <nik9000@gmail.com>
Date: Fri May 20 16:17:20 2016 -0400
Docs
commit 91149e0580233bf79c2273b419fe9374ca746648
Author: Nik Everett <nik9000@gmail.com>
Date: Fri May 20 15:17:40 2016 -0400
Finally!
commit 1ff50c2faf56665d221f00a18d9ac88745904bf5
Author: Nik Everett <nik9000@gmail.com>
Date: Fri May 20 15:01:53 2016 -0400
Remove Translog#lastWriteLocation
I wasn't being careful enough with locks so it wasn't right anyway.
Instead this builds a synthetic Tranlog.Location when you call
getWriteLocation with much more relaxed equality guarantees. Rather
than being equal to the last Translog.Location returned it is
simply guaranteed to be greater than the last translog returned
and less than the next.
commit 55596ea68b5484490c3637fbad0d95564236478b
Author: Nik Everett <nik9000@gmail.com>
Date: Fri May 20 14:40:06 2016 -0400
Remove listener from shardOperationOnPrimary
Create instead asyncShardOperationOnPrimary which is called after
all of the replica operations are started to handle any async
operations.
commit 3322e26211bf681b37132274ee158ae330afc28b
Author: Nik Everett <nik9000@gmail.com>
Date: Tue May 17 17:20:02 2016 -0400
Increase default maximum number of listeners to 1000
commit 88171a8322a424e624d48960fb4c98dd43e4d671
Author: Nik Everett <nik9000@gmail.com>
Date: Tue May 17 16:40:57 2016 -0400
Rename test
commit 179c27c4f829f2c6ded65967652cf85adaf2ae52
Author: Nik Everett <nik9000@gmail.com>
Date: Tue May 17 16:35:27 2016 -0400
Move refresh listeners into their own class
They still live at the IndexShard level but they live on their
own in RefreshListeners which interacts with IndexShard using a
couple of callbacks and a registration method. This lets us test
the listeners without standing up an entire IndexShard. We still
test the listeners against an InternalEngine, because the interplay
between InternalEngine, Translog, and RefreshListeners is complex
and important to get right.
commit d8926d5fc1d24b4da8ccff7e0f0907b98c583c41
Author: Nik Everett <nik9000@gmail.com>
Date: Tue May 17 11:02:38 2016 -0400
Move refresh listeners into IndexShard
commit df91cde398eb720143a85a8c6fa19bdc3a74e07d
Author: Nik Everett <nik9000@gmail.com>
Date: Mon May 16 16:01:03 2016 -0400
unused import
commit 066da45b08148b266e4173166662fc1b3f66ed53
Author: Nik Everett <nik9000@gmail.com>
Date: Mon May 16 15:54:11 2016 -0400
Remove RefreshListener interface
Just pass a Translog.Location and a Consumer<Boolean> when registering.
commit b971d6d3301c7522b2e7eb90d5d8dd96a77fa625
Author: Nik Everett <nik9000@gmail.com>
Date: Mon May 16 14:41:06 2016 -0400
Docs for setForcedRefresh
commit 6c43be821eaf61141d3ec520f988aad3a96a3941
Author: Nik Everett <nik9000@gmail.com>
Date: Mon May 16 14:34:39 2016 -0400
Rename refresh setter and getter
commit e61b7391f91263a4c4d6107bfbc2a828bbcc805c
Author: Nik Everett <nik9000@gmail.com>
Date: Mon Apr 25 22:48:09 2016 -0400
Trigger listeners even when there is no refresh
Each refresh gives us an opportunity to pick up any listeners we may
have left behind.
commit 0c9b0477085c021f503db775640d25668e02f635
Author: Nik Everett <nik9000@gmail.com>
Date: Mon Apr 25 20:30:06 2016 -0400
REST
commit 8250343240de7e63118c663a230a7a314807a754
Author: Nik Everett <nik9000@gmail.com>
Date: Mon Apr 25 19:34:22 2016 -0400
Switch to estimated count
We don't need a linear time count of the number of listeners - a volatile
variable is good enough to guess. It probably undercounts more than it
overcounts but it isn't a huge problem.
commit bd531167fe54f1bde6f6d4ddb0a8de5a7bcc18a2
Author: Nik Everett <nik9000@gmail.com>
Date: Mon Apr 25 18:21:02 2016 -0400
Don't try and set forced refresh on bulk items without a response
NullPointerExceptions are bad. If the entire request fails then the user
has worse problems then "did these force a refresh".
commit bcfded11515af5e0b3c3e36f3c2f73f5cd26512e
Author: Nik Everett <nik9000@gmail.com>
Date: Mon Apr 25 18:14:20 2016 -0400
Replace LinkedList and synchronized with LinkedTransferQueue
commit 8a80cc70a76375a7593745884cb987535b37ca80
Author: Nik Everett <nik9000@gmail.com>
Date: Mon Apr 25 17:38:24 2016 -0400
Support for update
commit 1f36966742f851b7328015151ef6fc8f95299af2
Author: Nik Everett <nik9000@gmail.com>
Date: Mon Apr 25 15:46:06 2016 -0400
Cleanup translog tests
commit 8d121bf35eb265b8a0aee9710afeb1b054a113d4
Author: Nik Everett <nik9000@gmail.com>
Date: Mon Apr 25 15:40:53 2016 -0400
Cleanup listener implementation
Much more testing too!
commit 2058f4a808762c4588309f21b13b677245832f2c
Author: Nik Everett <nik9000@gmail.com>
Date: Mon Apr 25 11:45:55 2016 -0400
Pass back information about whether we refreshed
commit e445cb0cb91ebdbcfdbf566696edb2bf1c84a882
Author: Nik Everett <nik9000@gmail.com>
Date: Mon Apr 25 11:03:31 2016 -0400
Javadoc
commit 611cbeeaeb458f4b428bfc43a1ee6652adf4baff
Author: Nik Everett <nik9000@gmail.com>
Date: Mon Apr 25 11:01:40 2016 -0400
Move ReplicationResponse
now it is in the same package as its request
commit 9919758b644fd73895fb88cd6a4909a8387eb2e2
Author: Nik Everett <nik9000@gmail.com>
Date: Mon Apr 25 11:00:14 2016 -0400
Oh boy that wasn't working
commit 247cb483c4459dea8e95e0e3bd2e4bf8d452c598
Author: Nik Everett <nik9000@gmail.com>
Date: Mon Apr 25 10:29:37 2016 -0400
Basic block_until_refresh exposed to java client
and basic "is it plugged in" style tests.
commit 46c855c9971cb2b748206d2afa6a2d88724be3ba
Author: Nik Everett <nik9000@gmail.com>
Date: Mon Apr 25 10:11:10 2016 -0400
Move test to own class
commit a5ffd892d0a352ae7e9757f2640fc2a1fa656bf2
Author: Nik Everett <nik9000@gmail.com>
Date: Mon Apr 25 07:44:25 2016 -0400
WIP
commit 213bebb6ece11b85d17e44af9a54fc2e5e332d39
Author: Nik Everett <nik9000@gmail.com>
Date: Fri Apr 22 21:35:52 2016 -0400
Add refresh listeners
commit a2bc7f30e6d4857a1224ef5a89909b36c8f33731
Author: Nik Everett <nik9000@gmail.com>
Date: Fri Apr 22 21:11:55 2016 -0400
Return last written location from refresh
commit 85033a87551da89f36a23d4dfd5016db218e08ee
Author: Nik Everett <nik9000@gmail.com>
Date: Fri Apr 22 20:28:21 2016 -0400
Never reply to replica actions while you have the operation lock
This last thing was causing periodic test failures because we were
replying while we had the operation lock. Now, we probably could get
away with that in most cases but the tests don't like it and it isn't
a good idea to do network io while you have a lock anyway. So this
prevents it.
commit 1f25cf35e796835b3827b8a4110e09e5de61784c
Author: Nik Everett <nik9000@gmail.com>
Date: Fri Apr 22 19:56:18 2016 -0400
Cleanup
commit 52c5f7c3f04710901f503334239a611c0e21c85a
Author: Nik Everett <nik9000@gmail.com>
Date: Fri Apr 22 19:33:00 2016 -0400
Add a listener to shard operations
commit 5b142dc331214c8eef90587144f4b3f959f9eced
Author: Nik Everett <nik9000@gmail.com>
Date: Fri Apr 22 18:03:52 2016 -0400
Cleanup
commit 3d22b2d7ceb473db339259452a7c4f117ce86069
Author: Nik Everett <nik9000@gmail.com>
Date: Fri Apr 22 17:59:55 2016 -0400
Push the listener into shardOperationOnPrimary
commit 34b378943b8185451acf6350f661c0ad33b5836d
Author: Nik Everett <nik9000@gmail.com>
Date: Fri Apr 22 17:48:47 2016 -0400
Doc
commit b42b8da968d42cc7414020c7b199606a5dcce50a
Author: Nik Everett <nik9000@gmail.com>
Date: Fri Apr 22 17:45:40 2016 -0400
Don't finish early if the primary finishes early
We use a "fake" pending shard that we resolve when the replicas have
all started.
commit 0fc045b56e1e02a48c30383ac50a281d5af7e0b6
Author: Nik Everett <nik9000@gmail.com>
Date: Fri Apr 22 17:30:06 2016 -0400
Make performOnPrimary asyncS
Instead of returning Tuple<Response, ReplicaRequest> it returns
ReplicaRequest and takes a ActionListener<Response> as an argument.
We call the listener immediately to preserve backwards compatibility
for now.
commit 80119b9a26ede96a865af45904c3ac69d5b19b59
Author: Nik Everett <nik9000@gmail.com>
Date: Fri Apr 22 16:51:53 2016 -0400
Factor out common code in shardOperationOnPrimary
commit 0642083676702618f900fa842c08802a04c1a53e
Author: Nik Everett <nik9000@gmail.com>
Date: Fri Apr 22 16:32:29 2016 -0400
Factor out common code from shardOperationOnReplica
commit 8bdc415fedaaa9f2d0c555590a13ec4699a7c3f7
Author: Nik Everett <nik9000@gmail.com>
Date: Fri Apr 22 16:23:28 2016 -0400
Create ReplicatedMutationRequest
Superclass for index, delete, and bulkShard requests.
commit 0f8fa846a2822c4293df32fed18c9b99660b39ff
Author: Nik Everett <nik9000@gmail.com>
Date: Fri Apr 22 16:10:30 2016 -0400
Create TransportReplicatedMutationAction
It is the superclass of replication actions that mutate data: index, delete,
and shardBulk. shardFlush and shardRefresh are replication actions but they
do not extend TransportReplicatedMutationAction because they don't change
the data, only shuffle it around.
If this option is enabled on a processor it silently catches any processor related failure and continues executing the rest of the pipeline.
Closes#18493
This adds a low level primitive operations to shrink an existing
index into a new index with a single shard. This primitive expects
all shards of the source index to allocated on a single node. Once the target index is initializing on the shrink node it takes a snapshot of the source index shards and copies all files into the target indices data folder. An [optimization](https://issues.apache.org/jira/browse/LUCENE-7300) coming in Lucene 6.1 will also allow for optional constant time copy if hard-links are supported by the filesystem. All mappings are merged into the new indexes metadata once the snapshots have been taken on the merge node.
To shrink an existing index all shards must be moved to a single node (one instance of each shard) and the index must be read-only:
```BASH
$ curl -XPUT 'http://localhost:9200/logs/_settings' -d '{
"settings" : {
"index.routing.allocation.require._name" : "shrink_node_name",
"index.blocks.write" : true
}
}
```
once all shards are started on the shrink node. the new index can be created via:
```BASH
$ curl -XPUT 'http://localhost:9200/logs/_shrink/logs_single_shard' -d '{
"settings" : {
"index.codec" : "best_compression",
"index.number_of_replicas" : 1
}
}'
```
This API will perform all needed check before the new index is created and selects the shrink node based on the allocation of the source index. This call returns immediately, to monitor shrink progress the recovery API should be used since all copy operations are reflected in the recovery API with byte copy progress etc.
The shrink operation does not modify the source index, if a shrink operation should
be canceled or if the shrink failed, the target index can simply be deleted and
all resources are released.
Closed indices are already displayed when no indices are explicitly selected. This commit ensures that closed indices are also shown when wildcard filtering is used. It also addresses another issue that is caused by the fact that the cat action is based internally on 3 different cluster states (one when we query the cluster state to get all indices, one when we query cluster health, and one when we query indices stats). We currently fail the cat request when the user specifies a concrete index as parameter that does not exist. The implementation works as intended in that regard. It checks this not only for the first cluster state request, but also the subsequent indices stats one. This means that if the index is deleted before the cat action has queried the indices stats, it rightfully fails. In case the user provides wildcards (or no parameter at all), however, we fail the indices stats as we pass the resolved concrete indices to the indices stats request and fail to distinguish whether these indices have been resolved by wildcards or explicitly requested by the user. This means that if an index has been deleted before the indices stats request gets to execute, we fail the overall cat request. The fix is to let the indices stats request do the resolving again and not pass the concrete indices.
Closes#16419Closes#17395
Significant changes:
* AbstractQueryTestCase has moved to the test framework module, in order for query builder tests in modules and plugins
* Added support to AbstractQueryTestCase to register plugins
* Lift the restriction that only one percolator could be added per index. This validation existed in MapperService, but because the percolator moved to a module it could no longer exist there. Instead of bringing it back it was removed. This validation existed since the percolator cache only supported one percolator query per document, since the percolator cache has been removed this restriction could removed as well.
* While moving percolator tests to the new module, also removed a couple of tests for the deprecated percolate and mpercolate api. These APIs are now sugar APIs for bwc and rediect to the searvh and msearvh APIs. Some tests were still testing as if percolate and mpercolate API did the percolation, but this no longer the case and these tests could be removed.
Today if a shard fails during initialization phase due to misconfiguration, broken disks,
missing analyzers, not installed plugins etc. elasticsaerch keeps on trying to initialize
or rather allocate that shard. Yet, in the worst case scenario this ends in an endless
allocation loop. To prevent this loop and all it's sideeffects like spamming log files over
and over again this commit adds an allocation decider that stops allocating a shard that
failed more than N times in a row to allocate. The number or retries can be configured via
`index.allocation.max_retry` and it's default is set to `5`. Once the setting is updated
shards with less failures than the number set per index will be allowed to allocate again.
Internally we maintain a counter on the UnassignedInfo that is reset to `0` once the shards
has been started.
Relates to #18417
Before 5.0 for it was required that the percolator queries were cached in jvm heap as Lucene queries for two reasons:
1) Performance. The percolator evaluated all percolator queries all the time. There was no pre-selecting queries that are likely to match like we have today.
2) Updates made to percolator queries were visible in realtime, Today these changes are visible in near realtime. So updating no longer requires the percolator to have the queries in jvm heap.
So having the percolator queries in jvm heap via the percolator cache is now less attractive. Especially when there are many percolator queries then these queries can consume many GBs of jvm heap.
Removing the percolator cache does make the percolate query slower compared to how the execution time in 5.0.0-alpha1 and alpha2, but it is still faster compared to 2.x and before.
Sorts an array of values in ascending or descending order. If all elements are numerics, they will be sorted numerically. If values are strings, or mixtures of strings/numbers, the elements will be sorted lexicographically.
Currently terms on an ip address try to put their binary representation in the
json response. With this commit, they would return a formatted ip address:
```
"buckets": [
{
"key": "192.168.1.7",
"doc_count": 1
}
]
```
All other values are errors.
Add java test for throttling. We had a REST test but it only ran against
one node so it didn't catch serialization errors.
Add Simple round trip test for rethrottle request
* Add isSearchable and isAggregatable (collapsed to true if any of the instances of that field are searchable or aggregatable).
* Accept wildcards in field names.
* Add a section named conflicts for fields with the same name but with incompatible types (instead of throwing an exception).
The url that takes an id has a trailing forward slash, not really an
error but as its the only url in the whole spec that does this it
triggered my OCD :)
* `rename` processor, renamed `to` to `target_field`
* `date` processor, renamed `match_field` to `field` and renamed `match_formats` to `formats`
* `geoip` processor, renamed `source_field` to `field` and renamed `fields` to `properties`
* `attachment` processor, renamed `source_field` to `field` and renamed `fields` to `properties`
Closes#17835
* Added an extra `field` parameter to the `percolator` query to indicate what percolator field should be used. This must be an existing field in the mapping of type `percolator`.
* The `.percolator` type is now forbidden. (just like any type that starts with a `.`)
This only applies for new indices created on 5.0 and later. Indices created on previous versions the .percolator type is still allowed to exist.
The new `percolator` field type isn't active in such indices and the `PercolatorQueryCache` knows how to load queries from these legacy indices.
The `PercolatorQueryBuilder` will not enforce that the `field` parameter is of type `percolator`.
We advertise in our documentation that byte units are like `kb`, `mb`... But we actually only support the simple notation `k` or `m`.
This commit adds support for the documented form and keeps the non documented options to avoid any breaking change.
It also adds support for `micros`, `nanos` and `d` as a time unit in `_cat` API.
Remove the support for `b` as a SizeValue unit. Actually, for numbers, when using raw numbers without unit, there is no text to add/parse after the number. For example, you don't write `10` as `10b`. We support option like `size=` in `_cat` API which means that we want to display raw data without unit (singles).
Documentation updated accordingly.
Add test for the empty size option.
Fix missing TimeValues options for some cat APIs
Running `gradle install` on the rest-api-spec fails because there is no available install
task. This change applies the nexus plugin so that we can install and should also enable
publishing as part of the uploadArchives task.
If ts=0, cat health disable epoch and timestamp
Be Constant String timestamp and epoch
Move timestamp and epoch to Table
Add rest-api test and test
Closes#10109
By default, tasks are grouped by node. However, task execution in elasticsearch can be quite complex and an individual task that runs on a coordinating node can have many subtasks running on other nodes in the cluster. This commit makes it possible to list task grouped by common parents instead of by node. When this option is enabled all subtask are grouped under the coordinating node task that started all subtasks in the group. To group tasks by common parents, use the following syntax:
GET /tasks?group_by=parents
This allows the user to update the reindex throttle on the fly, with changes
that speed up the throttling being applied immediately and changes that
slow down the throttling being applied during the next batch. This means
that if a user throttles reindex in such a way that it tries to sleep for
16 years and then realizes that they've done something wrong then they
can change the throttle and reindex will wake up again. We don't apply
slow downs immediately so we never get in danger of losing the scan context.
Also, if reindex is canceled while it is sleeping (how it honor throttling)
then it'll immediately wake up and cancel itself.
`text` fields will have fielddata disabled by default. Fielddata can still be
enabled on an existing index by setting `fielddata=true` in the mappings.
This adds a new `/_cluster/allocation/explain` API that explains why a
shard can or cannot be allocated to nodes in the cluster. Additionally,
it will show where the master *desires* to put the shard, according to
the `ShardsAllocator`.
It looks like this:
```
GET /_cluster/allocation/explain?pretty
{
"index": "only-foo",
"shard": 0,
"primary": false
}
```
Though, you can optionally send an empty body, which means "explain the
allocation for the first unassigned shard you find".
The output when a shard is unassigned looks like this:
```
{
"shard" : {
"index" : "only-foo",
"index_uuid" : "KnW0-zELRs6PK84l0r38ZA",
"id" : 0,
"primary" : false
},
"assigned" : false,
"unassigned_info" : {
"reason" : "INDEX_CREATED",
"at" : "2016-03-22T20:04:23.620Z"
},
"nodes" : {
"V-Spi0AyRZ6ZvKbaI3691w" : {
"node_name" : "Susan Storm",
"node_attributes" : {
"bar" : "baz"
},
"final_decision" : "NO",
"weight" : 0.06666675,
"decisions" : [ {
"decider" : "filter",
"decision" : "NO",
"explanation" : "node does not match index include filters [foo:\"bar\"]"
} ]
},
"Qc6VL8c5RWaw1qXZ0Rg57g" : {
"node_name" : "Slipstream",
"node_attributes" : {
"bar" : "baz",
"foo" : "bar"
},
"final_decision" : "NO",
"weight" : -1.3833332,
"decisions" : [ {
"decider" : "same_shard",
"decision" : "NO",
"explanation" : "the shard cannot be allocated on the same node id [Qc6VL8c5RWaw1qXZ0Rg57g] on which it already exists"
} ]
},
"PzdyMZGXQdGhqTJHF_hGgA" : {
"node_name" : "The Symbiote",
"node_attributes" : { },
"final_decision" : "NO",
"weight" : 2.3166666,
"decisions" : [ {
"decider" : "filter",
"decision" : "NO",
"explanation" : "node does not match index include filters [foo:\"bar\"]"
} ]
}
}
}
```
And when the shard *is* assigned, the output looks like:
```
{
"shard" : {
"index" : "only-foo",
"index_uuid" : "KnW0-zELRs6PK84l0r38ZA",
"id" : 0,
"primary" : true
},
"assigned" : true,
"assigned_node_id" : "Qc6VL8c5RWaw1qXZ0Rg57g",
"nodes" : {
"V-Spi0AyRZ6ZvKbaI3691w" : {
"node_name" : "Susan Storm",
"node_attributes" : {
"bar" : "baz"
},
"final_decision" : "NO",
"weight" : 1.4499999,
"decisions" : [ {
"decider" : "filter",
"decision" : "NO",
"explanation" : "node does not match index include filters [foo:\"bar\"]"
} ]
},
"Qc6VL8c5RWaw1qXZ0Rg57g" : {
"node_name" : "Slipstream",
"node_attributes" : {
"bar" : "baz",
"foo" : "bar"
},
"final_decision" : "CURRENTLY_ASSIGNED",
"weight" : 0.0,
"decisions" : [ {
"decider" : "same_shard",
"decision" : "NO",
"explanation" : "the shard cannot be allocated on the same node id [Qc6VL8c5RWaw1qXZ0Rg57g] on which it already exists"
} ]
},
"PzdyMZGXQdGhqTJHF_hGgA" : {
"node_name" : "The Symbiote",
"node_attributes" : { },
"final_decision" : "NO",
"weight" : 3.6999998,
"decisions" : [ {
"decider" : "filter",
"decision" : "NO",
"explanation" : "node does not match index include filters [foo:\"bar\"]"
} ]
}
}
}
```
Only "NO" decisions are returned by default, but all decisions can be
shown by specifying the `?include_yes_decisions=true` parameter in the
request.
Resolves#14593
In #17198, we removed suggest transport action, which
used the `suggest` threadpool to execute requests. Now
`suggest` threadpool is unused and suggest requests are
executed on the `search` threadpool.
In 5.0 we don't allow index settings to be specified on the node level ie.
in yaml files or via commandline argument. This can cause problems during
upgrade if this was used extensively. For instance if analyzers where
specified on a node level this might cause the index to be closed when
imported (see #17187). In such a case all indices relying on this
must be updated via `PUT /${index}/_settings`. Yet, this API has slightly
different semantics since it overrides existing settings. To make this less
painful this change adds a `preserve_existing` parameter on that API to ensure
we have the same semantics as if the setting was applied on the node level.
This change also adds a better error message and a change to the migration guide
to ensure upgrades are smooth if index settings are specified on the node level.
If a index setting is detected this change fails the node startup and prints a message
like this:
```
*************************************************************************************
Found index level settings on node level configuration.
Since elasticsearch 5.x index level settings can NOT be set on the nodes
configuration like the elasticsearch.yaml, in system properties or command line
arguments.In order to upgrade all indices the settings must be updated via the
/${index}/_settings API. Unless all settings are dynamic all indices must be closed
in order to apply the upgradeIndices created in the future should use index templates
to set default values.
Please ensure all required values are updated on all indices by executing:
curl -XPUT 'http://localhost:9200/_all/_settings?preserve_existing=true' -d '{
"index.number_of_shards" : "1",
"index.query.default_field" : "main_field",
"index.translog.durability" : "async",
"index.ttl.disable_purge" : "true"
}'
*************************************************************************************
```
Also replaced the PercolatorQueryRegistry with the new PercolatorQueryCache.
The PercolatorFieldMapper stores the rewritten form of each percolator query's xcontext
in a binary doc values field. This make sure that the query rewrite happens only during
indexing (some queries for example fetch shapes, terms in remote indices) and
the speed up the loading of the queries in the percolator query cache.
Because the percolator now works inside the search infrastructure a number of features
(sorting fields, pagination, fetch features) are available out of the box.
The following feature requests are automatically implemented via this refactoring:
Closes#10741Closes#7297Closes#13176Closes#13978Closes#11264Closes#10741Closes#4317
This change adds the infrastructure to run the rest tests on a multi-node
cluster that users 2 different minor versions of elasticsearch. It doesn't implement
any dedicated BWC tests but rather leverages the existing REST tests.
Since we don't have a real version to test against, the tests uses the current version
until the first minor / RC is released to ensure the infrastructure works.
Relates to #14406Closes#17072
This commit adds fields bytes_recovered and files_recovered to the cat
recovery API. These fields, respectively, indicate the total number of
bytes and files recovered. Additionally, for consistency, some totals
fields and translog recovery fields have been renamed.
Closes#17064
The ingest stats include the following statistics:
* `ingest.total.count`- The total number of document ingested during the lifetime of this node
* `ingest.total.time_in_millis` - The total time spent on ingest preprocessing documents during the lifetime of this node
* `ingest.total.current` - The total number of documents currently being ingested.
* `ingest.total.failed` - The total number ingest preprocessing operations failed during the lifetime of this node
Also these stats are returned on a per pipeline basis.
_wait_for_completion defaults to false. If set to true then the API will
wait for all the tasks that it finds to stop running before returning. You
can use the timeout parameter to prevent it from waiting forever. If you
don't set a timeout parameter it'll default to 30 seconds.
Also adds a log message to rest tests if any tasks overrun the test. This
is just a log (instead of failing the test) because lots of tasks are run
by the cluster on its own and they shouldn't cause the test to fail. Things
like fetching disk usage from the other nodes, for example.
Switches the request to getter/setter style methods as we're going that
way in the Elasticsearch code base. Reindex is all getter/setter style.
Closes#16906
Move some test methods from AnalylzeActionIT to RestAnalyzeActionTest
Allow string explain param if it can parse
Fix wrong param name in rest-api-spec
Closes#16925
Internally the put pipeline API uses this information in node info API to validate if all specified processors in a pipeline exist on all nodes in the cluster.
The cluster stats api now returns counts for each node role. The `master_data`, `master_only`, `data_only` and `client` fields have been removed from the response in favour of `master`, `data`, `ingest` and `coordinating_only`. The same node can have multiple roles, hence contribute to multiple roles counts. Every node is implicitly a coordinating node, so whenever a node has no explicit roles, it will be counted as coordinating only.
_cat/nodes used to return `c` for client node or `d` for data node as part of the node.role column. This commit changes it to return `m` for master eligible, `d` for data and/or `i` for ingest. A node with no explicit roles will be a coordinating only node and marked with `-`. A node can obviously have multiple roles. The master column has been adapted to return only whether a node is the current master (`*`) or not (`-`).
Elasticsearch 5.0 doesn't support indices wiht legacy checksums anymore.
The last time we write legacy checksums was in 1.3.0 which was based
on lucene 4.9 already which means that all files have CRC32 checksums.
All indices that Elasticsearch can read today must be written with
lucene version >= 4.8 anyway so we can drop this layer of backwards
compatibility entirely.
Since we are close to upgrading to Lucene 6.0 we should get rid of this
in a more contiained change than the lucene upgrade.