Similarly to what we did with the search api, we can now also move query parsing on the coordinating node for the explain api. Given that the explain api is a single shard operation (compared to search which is instead a broadcast operation), this doesn't change a lot in how the api works internally. The main benefit is that we can simplify the java api by requiring a structured query object to be provided rather than a bytes array that will get parsed on the data node. Previously if you specified a QueryBuilder it would be serialized in json format and would get reparsed on the data node, while now it doesn't go through parsing anymore (as expected), given that after the query-refactoring we are able to properly stream queries natively.
Closes#14270
We have two types of parse methods for queries: one for the inner query, to be used once the parser is positioned within the query element, and one for the whole query source, including the query element that wraps the actual query.
With the search refactoring we ended up using the former in count, cat count and delete by query, whereas we should have used the former. It ends up working properly given that we have a registered (deprecated) query called "query", which used to allow to wrap a filter into a query, but this has the following downsides:
1) prevents us from removing the deprecated "query" query
2) we end up supporting a top level query that is not wrapped within a query element (pre 1.0 syntax iirc that shouldn't be supported anymore)
This commit finally removes the "query" query and fixes the related parsing bugs. We also had some tests that were providing queries in the wrong format, those have been fixed too.
Closes#13326Closes#14304
This commit brings all the registration etc. from IndexCacheModule into
IndexModule. As a side-effect to remove a circular dependency between
IndicesService and IndicesWarmer this commit also cleans up IndicesWarmer and
separates the Engine from the warmer.
* Allow for multiple host specifications (e.g. _en0_,192.168.1.2,_site_).
* Add _site_ and _global_ scopes as counterparts to _local_.
* Warn on heuristic selection of publish address.
* Remove the arbitrary _non_loopback_ setting.
Closes#13954
The @IndexSettings annoationat has been used to differentiate between node-level
and index level settings. It was also decoupled from realtime-updates such that
the settings object that a class got injected when it was created was static and
not subject to change when an update was applied. This change removes the annoation
and replaces it with a full-fledged class that adds type-safety and encapsulates additional
functionality as well as checks on the settings.
Numeric and boolean fields have doc values enabled by default as of
elasticsearch 2.0. This commit removes support for uninverted/in-memory
fielddata, as well as numeric fields encoded in binary doc values which was
the way that elasticsearch stored doc values in a Lucene index before the
1.4 release.
As a consequence, you will only be able to sort and aggregate on numeric and
boolean fields in Elasticsearch 3.0 if doc values have not been switched off.
There are three ways `@Test` was used. Way one:
```java
@Test
public void flubTheBlort() {
```
This way was always replaced with:
```java
public void testFlubTheBlort() {
```
Or, maybe with a better method name if I was feeling generous.
Way two:
```java
@Test(throws=IllegalArgumentException.class)
public void testFoo() {
methodThatThrows();
}
```
This way of using `@Test` is actually pretty OK, but to get the tools to ban
`@Test` entirely it can't be used. Instead:
```java
public void testFoo() {
try {
methodThatThrows();
fail("Expected IllegalArgumentException");
} catch (IllegalArgumentException e ) {
assertThat(e.getMessage(), containsString("something"));
}
}
```
This is longer but tests more than the old ways and is much more precise.
Compare:
```java
@Test(throws=IllegalArgumentException.class)
public void testFoo() {
some();
copy();
and();
pasted();
methodThatThrows();
code(); // <---- This was left here by mistake and is never called
}
```
to:
```java
@Test(throws=IllegalArgumentException.class)
public void testFoo() {
some();
copy();
and();
pasted();
try {
methodThatThrows();
fail("Expected IllegalArgumentException");
} catch (IllegalArgumentException e ) {
assertThat(e.getMessage(), containsString("something"));
}
}
```
The final use of test is:
```java
@Test(timeout=1000)
public void testFoo() {
methodThatWasSlow();
}
```
This is the most insidious use of `@Test` because its tempting but tragically
flawed. Its flaws are:
1. Hard and fast timeouts can look like they are asserting that something is
faster and even do an ok job of it when you compare the timings on the same
machine but as soon as you take them to another machine they start to be
invalid. On a slow VM both the new and old methods fail. On a super-fast
machine the slower and faster ways succeed.
2. Tests often contain slow `assert` calls so the performance of tests isn't
sure to predict the performance of non-test code.
3. These timeouts are rude to debuggers because the test just drops out from
under it after the timeout.
Confusingly, timeouts are useful in tests because it'd be rude for a broken
test to cause CI to abort the whole build after it hits a global timeout. But
those timeouts should be very very long "backstop" timeouts and aren't useful
assertions about speed.
For all its flaws `@Test(timeout=1000)` doesn't have a good replacement __in__
__tests__. Nightly benchmarks like http://benchmarks.elasticsearch.org/ are
useful here because they run on the same machine but they aren't quick to check
and it takes lots of time to figure out the regressions. Sometimes its useful
to compare dueling implementations but that requires keeping both
implementations around. All and all we don't have a satisfactory answer to the
question "what do you replace `@Test(timeout=1000)`" with. So we handle each
occurrence on a case by case basis.
For files with `@Test` this also:
1. Removes excess blank lines. They don't help anything.
2. Removes underscores from method names. Those would fail any code style
checks we ever care to run and don't add to readability. Since I did this manually
I didn't do it consistently.
3. Make sure all test method names start with `test`. Some used to end in `Test` or start
with `verify` or `check` and they were picked up using the annotation. Without the
annotation they always need to start with `test`.
4. Organizes imports using the rules we generate for Eclipse. For the most part
this just removes `*` imports which is a win all on its own. It was "required"
to quickly remove `@Test`.
5. Removes unneeded casts. This is just a setting I have enabled in Eclipse and
forgot to turn off before I did this work. It probably isn't hurting anything.
6. Removes trailing whitespace. Again, another Eclipse setting I forgot to turn
off that doesn't hurt anything. Hopefully.
7. Swaps some tests override superclass tests to make them empty with
`assumeTrue` so that the reasoning for the skips is logged in the test run and
it doesn't "look like" that thing is being tested when it isn't.
8. Adds an oxford comma to an error message.
The total test count doesn't change. I know. I counted.
```bash
git checkout master && mvn clean && mvn install | tee with_test
git no_test_annotation master && mvn clean && mvn install | tee not_test
grep 'Tests summary' with_test > with_test_summary
grep 'Tests summary' not_test > not_test_summary
diff with_test_summary not_test_summary
```
These differ somewhat because some tests are skipped based on the random seed.
The total shouldn't differ. But it does!
```
1c1
< [INFO] Tests summary: 564 suites (1 ignored), 3171 tests, 31 ignored (31 assumptions)
---
> [INFO] Tests summary: 564 suites (1 ignored), 3167 tests, 17 ignored (17 assumptions)
```
These are the core unit tests. So we dig further:
```bash
cat with_test | perl -pe 's/\n// if /^Suite/;s/.*\n// if /IGNOR/;s/.*\n// if /Assumption #/;s/.*\n// if /HEARTBEAT/;s/Completed .+?,//' | grep Suite > with_test_suites
cat not_test | perl -pe 's/\n// if /^Suite/;s/.*\n// if /IGNOR/;s/.*\n// if /Assumption #/;s/.*\n// if /HEARTBEAT/;s/Completed .+?,//' | grep Suite > not_test_suites
diff <(sort with_test_suites) <(sort not_test_suites)
```
The four tests with lower test numbers are all extend `AbstractQueryTestCase`
and all have a method that looks like this:
```java
@Override
public void testToQuery() throws IOException {
assumeTrue("test runs only when at least a type is registered", getCurrentTypes().length > 0);
super.testToQuery();
}
```
It looks like this method was being double counted on master and isn't anymore.
Closes#14028
The NotQueryBuilder has been deprecated on the 2.x branches
and can be removed with the next major version. It can be
replaced by boolean query with added mustNot() clause.
Closes#13761
This adds an API for force merging lucene segments. The `/_optimize` API is now
deprecated and replaced by the `/_forcemerge` API, which has all the same flags
and action, just a different name.
This commit removes some build output files from the
burn_maven_with_fire_branch that appear to have been mistakenly
committed to master in bfb9054a11.
* Add ability for plugins to declare additional permissions with a custom plugin-security.policy file and corresponding AccessController logic. See the plugin author's guide for more information.
* Add warning messages to users for extra plugin permissions in bin/plugin.
* When bin/plugin is run interactively (stdin is a controlling terminal and -b/--batch not supplied), require user confirmation.
* Improve unit test and IDE support for plugins with additional permissions by exposing plugin's metadata as a maven test resource.
Closes#14108
Squashed commit of the following:
commit cf8ace65a7397aaccd356bf55f95d6fbb8bb571c
Author: Robert Muir <rmuir@apache.org>
Date: Wed Oct 14 13:36:05 2015 -0400
fix new unit test from master merge
commit 9be3c5aa38f2d9ae50f3d54924a30ad9cddeeb65
Merge: 2f168b8 7368231
Author: Robert Muir <rmuir@apache.org>
Date: Wed Oct 14 12:58:31 2015 -0400
Merge branch 'master' into off_my_back
commit 2f168b8038e32672f01ad0279fb5db77ba902ae8
Author: Robert Muir <rmuir@apache.org>
Date: Wed Oct 14 12:56:04 2015 -0400
improve plugin author documentation
commit 6e6c2bfda68a418d92733ac22a58eec35508b2d0
Author: Robert Muir <rmuir@apache.org>
Date: Wed Oct 14 12:52:14 2015 -0400
move security confirmation after 'plugin already installed' check, to prevent user from answering unnecessary questions.
commit 08233a2972554afef2a6a7521990283102e20d92
Author: Robert Muir <rmuir@apache.org>
Date: Wed Oct 14 05:36:42 2015 -0400
Add documentation and pluginmanager support
commit 05dad86c51488ba43ccbd749f0164f3fbd3aee62
Author: Robert Muir <rmuir@apache.org>
Date: Wed Oct 14 02:22:24 2015 -0400
Decentralize plugin permissions (modulo docs and pluginmanager work)
This test had to be moved to lang-groovy when groovy has been made a plugin.
I refactored it a bit to use mock plugins instead so that groovy is not
necessary anymore and it can come back to core.
When running in GCE platform, an instance has access to:
http://metadata.google.internal/computeMetadata/v1/instance/network-interfaces/0/ip
Which gives back the private IP address, for example `10.240.0.2`.
http://metadata.google.internal/computeMetadata/v1/instance/network-interfaces/0/externalIp
Gives back the public Ip address, for example `130.211.108.21`.
As we have for `ec2`, we can support new network host settings:
* `_gce:privateIp:X_`: The private IP address of the machine for a given network interface.
* `_gce:hostname_`: The hostname of the machine.
* `_gce_`: Same as `_gce:privateIp:0_` (recommended).
Closes#13605.
Closes#13590.
BTW resolveIfPossible now throws IOException so code is also updated for ec2 discovery and
some basic tests have been added.
The `_create` API is handy way to specify an index operation should only be done if the document doesn't exist. This is currently implemented in explicit code paths all the way down to the engine. However, conceptually this is no different than any other versioned operation - instead of requiring a document is on a specific version, we require it to be deleted (or non-existent). This PR removes Engine.Create in favor of a slight extension in the VersionType logic.
There are however a couple of side effects:
- DocumentAlreadyExistsException is removed and VersionConflictException is used instead (with an improved error message)
- Update will reject version parameters if the upsert option is used (it doesn't compute anyway).
- Translog.Create is also removed infavor of Translog.Index (that's OK because their binary format was the same, so we can just read Translog.Index of the translog file)
Closes#13955
This test used indexRandom to index 4 documents. indexRandom introduces also bogus documents which were getting on the way now that we use fieldValueFactor instead of a script to determine the script of the documents. Taken out indexRandom to simplify things and make the tests more predictable.
Types are still optional, but if you do provide them, they can't be null. Split the existing constructor that accepted nnull into two, one that accepts no arguments, and another one that accepts the types argument, which must be not null.
Also trimmed down different ways of setting ids, some were misleading as they would always add the ids to the existing ones and not set them, the add prefix makes that clear. Left `addIds` method that accepts a varargs argument. Added check for ids not be null.
This commit removes and now forbids all uses of
com.google.common.hash.HashCode, com.google.common.hash.HashFunction,
and com.google.common.hash.Hashing across the codebase. This is one of
the few remaining steps in the eventual removal of Guava as a
dependency.
Relates #13224
While refactoring has_child and has_parent query we lost an important detail around types. The types that the inner query gets executed against shouldn't be the main types of the search request but the parent or child type set to the parent query. We used to use QueryParseContext#setTypesWithPrevious as part of XContentStructure class which has been deleted, without taking care though of setting the types and restoring them as part of the innerQuery#toQuery call.
Meanwhile also we make sure that the original context types are restored in PercolatorQueriesRegistry
Closes#13863
Closes#13854
Squashed commit of the following:
commit 42c1166efc55adda0d13fed77de583c0973e44b3
Author: Robert Muir <rmuir@apache.org>
Date: Tue Sep 29 11:59:43 2015 -0400
Add paranoia
Groovy holds on to a classloader, so check it before compilation too.
I have not reviewed yet what Rhino is doing, but just be safe.
commit b58668a81428e964dd5ffa712872c0a34897fc91
Author: Robert Muir <rmuir@apache.org>
Date: Tue Sep 29 11:46:06 2015 -0400
Add SpecialPermission to guard exceptions to security policy.
In some cases (e.g. buggy cloud libraries, scripting engines), we must
grant dangerous permissions to contained cases. Those AccessController blocks
are dangerous, since they truncate the stack, and can allow privilege escalation.
This PR adds a simple permission to check before each one, so that unprivileged code
like groovy scripts, can't do anything they shouldn't be allowed to do otherwise.
GCE plugin tries to start immediately gce discovery even if we don't
set discovery.type. This commmit adds check `discovery.type` and
other required parameters before loading gce plugin.
closes#13614
Now that groovy is factored out, we contain this dangerous stuff there.
TODO: look into those test hacks inspecting class protection domains, maybe we can
clean that one up too.
TODO: generalize the GroovyCodeSourcePermission to something all script engines check,
before entering accesscontrollerblocks. this way e.g. groovy script cannot coerce
python engine into creating something with more privs if it gets ahold of it... we
should probably protect the aws/gce hacks in the same way.
This commit fixes ping timeout settings inconsistencies in
ZenDiscovery. In particular, the documentation refers to the ping
timeout setting as discovery.zen.ping_timeout but the code was
ultimately using discovery.zen.ping.timeout if this was set.
This commit also changes all instances of the raw string
“discovery.zen.ping_timeout” to the constant
o.e.d.z.ZenDiscovery.SETTING_PING_TIMEOUT.
Finally, this commit removes the legacy setting
"discovery.zen.initial_ping_timeout".
Closes#6579, #9581, #9908
I want to refactor scripting engines so we can contain dangerous "God-like" permissions
like createClassloader/sun.reflect. These are used for dynamic class generation (scripts, mocks).
This will mean some refactoring to ES core.
But first lets get the plugins in order first. I removed those permissions globally, and
fixed grants for lang-javascript, lang-python, securemock so that everything works.
lang-javascript needs no code changes, because rhino is properly written :)
lang-python needs accesscontroller blocks. securemock was already working as of 1.1
This is just a baby step, to try to do some of this incrementally! It doesn't yet provide
us anything.
Until now we had a cloud-azure plugin which is providing 3 distinct features:
* discovery on Azure
* snapshot/restore on Aure
* SMB store
This commit splits the plugin by feature so people can use either one or the other or both features.
Doc is updated accordingly.
We moved a lot of repositories into elasticsearch, but in their new
location they retained their LICENSE.txt and NOTICE.txt files. These are
all the same, and having the license and notice and the root of the
repository should be sufficient.
Plugin tests require having rest-api tests, and currently copy that spec
from a directory in the root of the plugin source into the test
resources. This change moves the rest-api-spec dir into test resources
so it is like any other test resources. It also removes unnecessary
configuration for resources from the shared plugin pom.
This commit removes unnecesssary use of ExceptionHelpers where we actually
should serialize / deserialize the actual exception. This commit also
fixes one of the oddest problems where the actual exception was never
rendered / printed if `all shards failed` due to a missing cause.
This commit unfortunately doesn't fix Snapshot/Restore which is almost
unfixable since it has to serialize XContent and read from it which can't
transport exceptions.
When running a RestIT test from the IDE, you actually start an internal node which does not automatically load the plugin you would like to test.
We need to add:
```java
@Override
protected Collection<Class<? extends Plugin>> nodePlugins() {
return pluginList(PLUGIN_HERE.class);
}
```
Everything works fine when running from maven because each test basically:
* installs elasticsearch
* installs one plugin
* starts elasticsearch with this plugin loaded
* runs the test
Note that this PR only fixes the fact we run an internal cluster with the expected plugin.
Cloud tests will still fail when run from the IDE because is such a case you actually start an internal node with many mock plugins.
And REST test suite for cloud plugins basically checks if the plugin is running by checking the output of NodesInfo API.
And we check:
```yml
- match: { nodes.$master.plugins.0.name: cloud-azure }
- match: { nodes.$master.plugins.0.jvm: true }
```
But in that case, this condition is certainly false as we started also `mock-transport-service`, `mock-index-store`, `mock-engine-factory`, `node-mocks`, `asserting-local-transport`, `mock-search-service`.
Closes#13479
This commit removes and now forbids all uses of
Function, Charsets, Collections2 across the codebase. This
is one of many steps in the eventual removal of Guava as a dependency.
Relates #13224
Java 8 allows for method references which in-turn will cause
compile errors if a method is not visible while reflection fails late
and maybe too late. We can now register Request instances via FooRequest::new
instead of passing FooRequest.class and call it's ctor via reflection.
In addition to being a big security problem, setAccessible is a risk
for java 9 migration. We need to clean up our code so we can ban it
and eventually enforce this with security manager for third-party code, too,
or we may have problems.
Instead of using setAccessible, use the correct modifier (e.g. public).
TODO: ban in tests
TODO: ban in security manager at runtime
Instead of asking blob store to create output for posting blob content, this change provides that content of the blob to the blob store for writing. This will significantly simplify the interface for S3 and Azure plugins.
The semantics of the `boost` parameter for `function_score` changed. This is
due to the fact that Lucene now requires that query boosts and top-level boosts
are applied the same way.
This commit removes and now forbids all uses of
com.google.common.collect.Maps across the codebase. This is one of many
steps in the eventual removal of Guava as a dependency.
Relates #13224
As elasticsearch is marked as provided we don't need to explicitly exclude it from the assembly descriptor.
We get a warning today for all plugins, the following:
```
[INFO] --- maven-assembly-plugin:2.5.5:single (default) @ repository-s3 ---
[INFO] Reading assembly descriptor: /path/to/plugin-assembly.xml
[WARNING] The following patterns were never triggered in this artifact exclusion filter:
o 'org.elasticsearch:elasticsearch'
[INFO] Building zip: /path/to/target/releases/repository-s3-3.0.0-SNAPSHOT.zip
[INFO]
```
It now gives:
```
[INFO] --- maven-assembly-plugin:2.5.5:single (default) @ repository-s3 ---
[INFO] Reading assembly descriptor: /path/to/plugin-assembly.xml
[INFO] Building zip: /path/to/target/releases/repository-s3-3.0.0-SNAPSHOT.zip
[INFO]
```
This commit removes and now forbids all uses of
com.google.common.base.Predicate and com.google.common.base.Predicates
across the codebase. This is one of the many steps in the eventual
removal of Guava as a dependency. This was enabled by #13314.
Relates #13224
Until now we had a cloud-aws plugin which is providing 2 disctinct features:
* discovery on EC2
* snapshot/restore on S3
This commit splits the plugin by feature so people can use either one or the other or both features.
Doc is updated accordingly.