We try to get the index shard instance again from the index service on a different
threads while that shard might have already been closed or removed which can cause a NPE
instead of another expected expecption.
This commit shuffels and rewrites some code in RecoverySourceHandler to make it
simpler and more unittestable. This commit doesn't change all parts of this class
neither is it fully tested yet. It's an important part of the infrastrucutre so I started
to make it better tested but I don't want to change everything in one go since it makes
review simpler and more detailed. Future commits will continue cleaning up the class and
add more tests.
If a document is indexed into ES with no id, ES will generate one for it. We used to have an optimization for this case where the engine will not try to resolve the ids of these request in the existing index but immediately try to index them. This optimization has proven to be the source of brittle bugs (solved!) and we disabled it in 1.5, preparing for it to be removed if no performance degradation was found. Since we haven't seen any such degradation we can remove it.
Along with the removal of the optmization, we can remove the autogenerate id flag on indexing requests and the can have duplicate flag. The only downside of the removal of the canHaveDuplicate flag is that we can't make sure any more that when we retry an autogenerated id create operation we will ignore any document already exists exception (See #9125 for background and discussion). To work around this, we don't set the operation to CREATE any more when we generate an id, so the resulting request will never fail when it finds an existing doc but do return a version of 2. I think that's acceptable.
Closes#13857
We have two unneded heavy dependencies on IndexShard that are unneeded and only cause
trouble if you try to mock index shard. This commit removes IndexSettingsService as well as
ClusterSerivce from IndexShard to simplify future mocking and construction.
While refactoring has_child and has_parent query we lost an important detail around types. The types that the inner query gets executed against shouldn't be the main types of the search request but the parent or child type set to the parent query. We used to use QueryParseContext#setTypesWithPrevious as part of XContentStructure class which has been deleted, without taking care though of setting the types and restoring them as part of the innerQuery#toQuery call.
Meanwhile also we make sure that the original context types are restored in PercolatorQueriesRegistry
Closes#13863
Closes#13854
Squashed commit of the following:
commit 42c1166efc55adda0d13fed77de583c0973e44b3
Author: Robert Muir <rmuir@apache.org>
Date: Tue Sep 29 11:59:43 2015 -0400
Add paranoia
Groovy holds on to a classloader, so check it before compilation too.
I have not reviewed yet what Rhino is doing, but just be safe.
commit b58668a81428e964dd5ffa712872c0a34897fc91
Author: Robert Muir <rmuir@apache.org>
Date: Tue Sep 29 11:46:06 2015 -0400
Add SpecialPermission to guard exceptions to security policy.
In some cases (e.g. buggy cloud libraries, scripting engines), we must
grant dangerous permissions to contained cases. Those AccessController blocks
are dangerous, since they truncate the stack, and can allow privilege escalation.
This PR adds a simple permission to check before each one, so that unprivileged code
like groovy scripts, can't do anything they shouldn't be allowed to do otherwise.
StoreRecoveryService used to be a pretty heavy class with lots of dependencies.
This class was basically not testable in isolation and had an async API with a listener.
This commit refactors this class to be a simple utility classs with a sync API hidden behind
the IndexShard interface. It includes single node tests and moves all the async properities to
the caller side.
Note, this change also removes the mapping update on master from the store recovery code since
it's not needed anymore in 3.0 because all stores have been subject to sync mapping updates such
that the master already has all the mappings for documents that made it into the transaction log.
Closes#13766
Now that groovy is factored out, we contain this dangerous stuff there.
TODO: look into those test hacks inspecting class protection domains, maybe we can
clean that one up too.
TODO: generalize the GroovyCodeSourcePermission to something all script engines check,
before entering accesscontrollerblocks. this way e.g. groovy script cannot coerce
python engine into creating something with more privs if it gets ahold of it... we
should probably protect the aws/gce hacks in the same way.
Previous to this change if to equal geohash cell query builders were created and then toQuery was called on one, they would no longer be equal.
This change also adds a test to AbstractQueryTestCase to make sure calling toQuery on any query builder does not affect the query builder's equality
* Add OS X support via "seatbelt" mechanism. This gives consistency across dev and prod, since many devs use OS X.
* block execveat system call: it may be new, but we should not allow it.
When a shard becomes in active we trigger a sync flush in order to speed up future recoveries. The sync flush causes a new translog generation to be made, which in turn confuses the IndexingMemoryController making it think that the shard is active. If no documents comes along in the next 5m, the shard is made inactive again , triggering a sync flush and so forth.
To avoid this, the IndexingMemoryController is changed to ignore empty translogs when checking if a shard became active. This comes with the price of potentially missing indexing operations which are followed by a flush. This is acceptable as if no more index operation come in, it's OK to leave the shard in active.
A new unit test is introduced and comparable integration tests are removed.
Closes#13802
Instead just log the same thing we print to the startup console for that case (magic logic),
it sucks to do this, but guice exceptions are too much.
All other non-guice exceptions will still be fully logged.
Closes#13782
This commit adds explicit ids for managing ElasticsearchException
serialization. By adding explicit ids and unit tests for them, the ids
are less brittle and breakage can be more clearly detected.
Today we use reflection where it's not needed anymore since java8 can
pass ctors around. This commit replaces runtime checks with compile time
checks which is always preferrable.
These classes are really duplicates and are just here for historical reasons.
We don't need these anymore since the same classes exist in lucene today.
This also removes the guice injection for DeletionPolicy and make them shard private.
TermsLookupQueryBuilder was left around only for bw comp reasons, but TermsQueryBuilder is its replacement. We can remove it now that it is clear query refactoring goes in master (3.0).
There is no need to have term vectors service on the shard level where it's
created for every shard. This commit moves it to a higher level which makes
shard creation slightly simpler and reduces the number of long living objects.
Before the refactoring we didn't check any invalid settings for strategy and relation
in the GeoShapeQueryBuilder. However, using SpatialStrategy.TERM and ShapeRelation.INTERSECTS
together is invalid and we tried to protect against that in the validate() method.
This PR moves these checks to setter for strategy and relation and adds tests for the new
behaviour.
Relates to #10217
Block execve(), fork(), and vfork() system calls, returning EACCES instead,
on kernels that support seccomp-bpf: either via seccomp() or falling back
to prctl().
Only linux/amd64 is supported. This feature can be disabled (in case
of problems) with bootstrap.seccomp=false.
Closes#13753
Squashed commit of the following:
commit 92cee05c72b49e532d41be7b16709e1c9f919fa9
Author: Robert Muir <rmuir@apache.org>
Date: Thu Sep 24 10:12:51 2015 -0400
Add a note about why we don't parse uname() or anything
commit b427971f45cbda4d0b964ddc4a55fae638880335
Author: Robert Muir <rmuir@apache.org>
Date: Thu Sep 24 09:44:31 2015 -0400
style only: we already pull errno into a local, use it for catch-all case
commit ddf93305525ed1546baf91f7902148a8f5b1ad06
Author: Robert Muir <rmuir@apache.org>
Date: Thu Sep 24 08:36:01 2015 -0400
add TODO
commit f29d1b7b809a9d4c1fcf15f6064e43f7d1b24696
Author: Robert Muir <rmuir@apache.org>
Date: Thu Sep 24 08:33:28 2015 -0400
Add full stacktrace at debug level always
commit a3c991ff8b0b16dc5e128af8fb3dfa6346c6d6f1
Author: Robert Muir <rmuir@apache.org>
Date: Thu Sep 24 00:08:19 2015 -0400
Add missing check just in case.
commit 628ed9c77603699aa9c67890fe7632b0e662a911
Author: Robert Muir <rmuir@apache.org>
Date: Wed Sep 23 22:47:16 2015 -0400
Add public getter, for stats or whatever if they need to know this
commit 3e2265b5f89d42043d9a07d4525ce42e2cb1c727
Author: Robert Muir <rmuir@apache.org>
Date: Wed Sep 23 22:43:06 2015 -0400
Enable use of seccomp(2) on Linux 3.17+ which provides more protection.
Add nice errors.
Add all kinds of checks and paranoia.
Add documentation.
Add boolean switch.
commit 0e421f7fa2d5236c8fa2cd073bcb616f5bcd2d23
Author: Robert Muir <rmuir@apache.org>
Date: Wed Sep 23 21:36:32 2015 -0400
Add defensive checks and nice error messages
commit 6231c3b7c96a81af8460cde30135e077f60a3f39
Author: Robert Muir <rmuir@apache.org>
Date: Wed Sep 23 20:52:40 2015 -0400
clean up JNA and BPF. block fork and vfork too.
commit bb31e8a6ef03ceeb1d5137c84d50378c270af85a
Author: Robert Muir <rmuir@apache.org>
Date: Wed Sep 23 19:00:32 2015 -0400
order is LE already for the JNA buffer, but be explicit about it
commit 10456d2f08f12ddc3d60989acb86b37be6a4b12b
Author: Robert Muir <rmuir@apache.org>
Date: Wed Sep 23 17:47:07 2015 -0400
block process execution with seccomp on linux/amd64
Some methods had default implementation while queries were going to be refactored, now that they are all refactored all those methods can be made abstract.
Remove ParseFieldMatcher duplication query in both contexts. QueryParseContext is still contained in QueryShardContext, as parsing still happens in the shards here and there. Most of the norelease comments have been removed simply because the scope of the refactoring has become smaller. Some could only be removed once everything, the whole search request, gets parsed on the coordinating node. We will get there eventually.
SimpleIndexQueryParserTests was the main responsible: deleted lots of duplicated tests, moved the ones that made sense to keep to their corresponding unit tests (note they were ESSingleNode tests before while are now converted to unit tests).
Closes#13750
After all queries now have a `toQuery` method and the parsers all
support `fromXContent` it is possible to remove the following
workarounds and deprecated methods we kept around while doing the
refactoring:
* remove the BaseQueryParser and BaseQueryParserTemp. All parsers
implement QueryParser directly now
* remove deprecated methods in QueryParseContext that either returned
a Query or a Filter.
* remove the temporary QueryWrapperQueryBuilder
Relates to #10217
The IndexingMemoryController checks periodically if there is any indexing activity on the shard. If no activity is sean for 5m (default) the shard is marked as inactive allowing it's indexing buffer quota to given to other active shards.
Sadly the current check is bad as it checks for 0 translog operation. This makes the inactive wait for a flush to happen - which used to take 30m and since #13707 doesn't happen at all (as we rely on the synced flush triggered by inactivity). This commit fixes the check so it will work with any translog size.
Closes#13759
This commit fixes ping timeout settings inconsistencies in
ZenDiscovery. In particular, the documentation refers to the ping
timeout setting as discovery.zen.ping_timeout but the code was
ultimately using discovery.zen.ping.timeout if this was set.
This commit also changes all instances of the raw string
“discovery.zen.ping_timeout” to the constant
o.e.d.z.ZenDiscovery.SETTING_PING_TIMEOUT.
Finally, this commit removes the legacy setting
"discovery.zen.initial_ping_timeout".
Closes#6579, #9581, #9908
The current MoreLikeThisQueryBuilder validation checks for existence of at
least one `like` text or item. This is hard to check in setters, so this PR
tries to change the construction of the query so that we can do these checks
already at construction time.
Changing to using arrays for fieldnames, likeTexts, likeItems, unlikeTexts
and unlikeItems. `likeTexts` and/or `likeItems` need to be specified at
construction time to validate we have at least one item there.
Relates to #10217
Banning `ImmutableSet` outright is too much to do all at once - this starts
the process by banning `ImmutableMap#entrySet` - one of the more common ways
that `ImmutableSet`s come up. It then starts to remove calls to
`ImmutableMap#entrySet` by changing declarations from `ImmutableMap` to `Map`.
Unfortunately this process is like pulling on a long, windy string and one
declaration change requires another which requires 5 more which in turn
require another few. So this change is rather large.
As such, to keep the changes manageable they only remove `ImmutableMap` from
the signatures that are needed for `entrySet` and make little effort to stop
using `ImmutableMap` internally. Removing the usages of `ImmutableMap`
complicates immutability guarantees and will be done separately.
In #12942, the NettyTransport and NettyHttpServerTransport were updated to allow for binding
to multiple addresses. However, the BoundTransportAddress holder only exposed the first address
that the transport was bound to and this object is used to populate the values returned to the user
via our APIs.
This change exposes all of the bound addresses in the BoundTransportAddress holder, which allows
for an accurate representation of all interfaces that elasticsearch is bound to and listening on.
This commit addresses a confusing error message that arises when a
property parameter (e.g. -D) is after a double-dash parameter. The
current error message reports to the user that the parameter does not
start with “--". Adding the second dash as the error message suggests
causes the parameter to be silently ignored. This is confusing for the
user. With this commit, the user is now informed that the parameter
order is violated.
Relates e27ede48ce
These exceptions are useless and unused, since we are on a major verison we should remove
them. This commit also makes it easier to remove excepitons in the future.
In the past ClusterStateUpdateTask was an interface and we had various derived marker interfaces to control behavior. Since then we moved ClusterStateUpdateTask to be an abstract class but we kept the old hierarchy of implementations. All of those (but the AckedClusterStateUpdateTask) can be folded into ClusterStateUpdateTask, adding correct default behavior.
Closes#13735