Removes the mapping transform feature which when used made debugging very
difficult. Users should transform their documents on the way into
Elasticsearch rather than having Elasticsearch do it.
Closes#12674
Most query parsers throw a ParsingException when they trying
to parse a field with an unknown name. This adds a generic
check for this to the AbstractQueryTestCase so the behaviour
gets tested for all query parsers. The test works by first
making sure the test query has a `boost` field and then
changing this to an unknown field name and checking for an
error.
There are exceptions to this for WrapperQueryBuilder
and QueryFilterBuilder, because here the parser only expects
the wrapped `query` element. MatchNoneQueryBuilder and
MatchAllQueryBuilder so far had setters for boost() and
queryName() but didn't render them, which is added here for
consistency.
GeoDistance, GeoDistanceRange and GeoHashCellQuery so far
treat unknown field names in the json as the target field name
of the center point of the query, which so far was handled by
overwriting points previously specified in the query. This
is changed here so that an attempt to use two different field names
to specify the central point of the query throws a
ParsingException
Relates to #10974
This change moves all the analysis component registration to the node level
and removes the significant API overhead to register tokenfilter, tokenizer,
charfilter and analyzer. All registration is done without guice interaction such
that real factories via functional interfaces are passed instead of class objects
that are instantiated at runtime.
This change also hides the internal analyzer caching that was done previously in the
IndicesAnalysisService entirely and decouples all analysis registration and creation
from dependency injection.
This change removes the leftover pom files. A couple files were left for
reference, namely in qa tests that have not yet been migrated (vagrant
and multinode). The deb and rpm assemblies also still exist for
reference when finishing their setup in gradle.
See #13930
The test jar was previously built in maven by copying class files. With
gradle we now have a proper test framework artifact. This change moves
the classes used by the test framework into the test-framework module.
See #13930
Closes#14353
Squashed commit of the following:
commit edae0729f71ea3d3f9fa9c0d27c9effc042eb5a9
Author: Robert Muir <rmuir@apache.org>
Date: Thu Oct 29 14:13:42 2015 -0400
update sha1 and simplify test
commit 635c4f245d66ad353a16267c810e02b725553fad
Author: Robert Muir <rmuir@apache.org>
Date: Thu Oct 29 07:01:26 2015 -0400
Add threadgroup isolation.
Code with `modifyThread` and `modifyThreadGroup` may only modify
its own threadgroup (or an ancestor of that). This enforces
what is intended by the ThreadGroup class.
This has two immediate implications:
1. Code without these permissions (scripts) may not create or mess with threads
2. ES application threads cannot mess with Java system threads
ES puts all application threads in one single group today, but in the future
this can be organized better, and we will have more isolation in the system.
This commit fixes two issues that could arise when a loader throws an
exception during a load in Cache#computeIfAbsent.
The underlying issue is that if the loader throws an exception,
Cache#computeIfAbsent would attempt to remove the polluted entry from
the cache. However, this cleanup was performed outside of the segment
lock. This means another thread could race and expire the polluted
entry (leading to NPEs) or get a polluted entry out of the cache before
the loading thread had a chance to cleanup (leading to ISEs).
The solution to the initial problem of correctly handling failed cached
loads is to check for failed loads in all places where entries are
retrieved from the map backing the segment. In such cases, we treat it
as if there was no entry in the cache, and we clean up the cache on a
best-effort basis. All of this is done outside of the segment lock to
avoid reintroducing the deadlock that was initially a problem when
loads were executed under a segment lock.
This commit adds a unit test for a deadlock issue that existed prior to
commit 1d0b93f766. While commit
1d0b93f766 seems to have addressed the
deadlock issue it would be more robust to have a unit test for it and a
unit test will reduce the risk that future maintenance on Cache will
reintroduce the deadlock issue. This test reliably fails prior to but
passes after commit 1d0b93f766.
Currently a `simple_query_string` query with one term and multiple fields
gets parsed to a BooleanQuery where the number of clauses is determined
by the number of fields, which lead to wrong calculation of `minimum_should_match`.
This PR adds checks to detect this case and wrap the resulting BooleanQuery into
another BooleanQuery with just one should-clause, so `minimum_should_match`
calculation is corrected.
In order to differentiate between the case where one term is queried across
multiple fields and the case where multiple terms are queried on one field,
we override a simplification step in Lucenes SimpleQueryParser that reduces
a one-clause BooleanQuery to the clause itself.
Closes#13884
This commit adds a listener mechanism for executing callbacks when
exceptional situations occur sending a shard failure message to the
master. The two types of exceptional situations that can occur are if
the master is not known and if the transport request exception handler
is invoked for any reason after sending the shard failed request to the
master. This commit only adds the infrastructure for executing
callbacks when one of these exceptional situations occur; no effort is
made to properly handle the exceptional situations. Some unit tests are
added for ShardStateAction to test that the listener infrastructure is
correct.
Relates #14252
The only way to refer to the plain highlighter is now `plain`, the only way to refer to the fast vector highlighter is `fvh` and the only way to refer to the postings highlighter is `postings`. The name variants like `highlighter`, `postings-highlighter` and `fast-vector-highlighter` have been removed.
Similarly to what we did with the search api, we can now also move query parsing on the coordinating node for the explain api. Given that the explain api is a single shard operation (compared to search which is instead a broadcast operation), this doesn't change a lot in how the api works internally. The main benefit is that we can simplify the java api by requiring a structured query object to be provided rather than a bytes array that will get parsed on the data node. Previously if you specified a QueryBuilder it would be serialized in json format and would get reparsed on the data node, while now it doesn't go through parsing anymore (as expected), given that after the query-refactoring we are able to properly stream queries natively.
Closes#14270
This is not needed: full mvn verify passes.
Furthermore, there are all kinds of checks for this case
(rejected while shutting down) in the actual code, so there
is no need to have it here. If its supposed to be non-fatal,
then we add the missing places to the actual code, not globally to all threads.
Some jenkins servers have this, but our codebase normalization doesn't
follow symlinks. Add this so that its correct.
Only really impacts tests, i suppose it helps if someone has a symlinked plugins/
but that is not recommended :)
* plugin authors can use full policy syntax, including codebase substitution
properties like core syntax.
* simplify test logic.
* move out test-framework permissions to separate file.
Closes#14311
On _lastWriteNanos_ we use System.nanoTime() to initialize this since:
* we use the value for figuring out if the shard / engine is active so if we startup and no write has happened yet we still consider it active
for the duration of the configured active to inactive period. If we initialize to 0 or Long.MAX_VALUE we either immediately or never mark it
inactive if no writes at all happen to the shard.
* we also use this to flush big-ass merges on an inactive engine / shard but if we we initialize 0 or Long.MAX_VALUE we either immediately or never
commit merges even though we shouldn't from a user perspective (this can also have funky sideeffects in tests when we open indices with lots of segments
and suddenly merges kick in.
This method needs special permission and can cause all kinds of other problems
if we are creating lots of theads. Also the reason why we added this are fixed
long ago, no need to maintain this code.
This commit fixes a regression introduced with #12058. This causes failures with the delete index api when providing the same index name multiple times in the request, or aliases/wildcard expressions that end up pointing to the same concrete index. The bug was revealed after merging #11258 as we delete indices in batch rather than one by one. The master node will expect too many acknowledgements based on the number of indices that it's trying to delete, hence the request will never be acknowledged by all nodes.
Closes#14316
We have two types of parse methods for queries: one for the inner query, to be used once the parser is positioned within the query element, and one for the whole query source, including the query element that wraps the actual query.
With the search refactoring we ended up using the former in count, cat count and delete by query, whereas we should have used the former. It ends up working properly given that we have a registered (deprecated) query called "query", which used to allow to wrap a filter into a query, but this has the following downsides:
1) prevents us from removing the deprecated "query" query
2) we end up supporting a top level query that is not wrapped within a query element (pre 1.0 syntax iirc that shouldn't be supported anymore)
This commit finally removes the "query" query and fixes the related parsing bugs. We also had some tests that were providing queries in the wrong format, those have been fixed too.
Closes#13326Closes#14304
This commit makes QueryCache and SearcherWrappoer registration public
otherwise plugins can't access those extension points due to security restrictions.
This commit brings all the registration etc. from IndexCacheModule into
IndexModule. As a side-effect to remove a circular dependency between
IndicesService and IndicesWarmer this commit also cleans up IndicesWarmer and
separates the Engine from the warmer.
This changes how `min_score` is implemented both for `function_score` and the
search request parameter to confirm whether the minimum score is met in the
`matches()` phase of a TwoPhaseIterator.
Today if a shard is marked as inactive after a heavy indexing period
large merges are very likely. Yet, those merges are never committed today
since time-based flush has been removed and unless the shard becomes active
again we won't revisit it.
Yet, inactive shards have very likely been sync-flushed before such that we need
to maintain the sync id if possible which this change tries on a per shard basis.
The test spawns up 3 nodes, waits for a master to be elected and starts network disruptions. If those kick in too early we may have a cluster with a state not recovered block which causes failure during clean ups. http://build-us-00.elastic.co/job/es_core_master_oracle_6/3034/
* Allow for multiple host specifications (e.g. _en0_,192.168.1.2,_site_).
* Add _site_ and _global_ scopes as counterparts to _local_.
* Warn on heuristic selection of publish address.
* Remove the arbitrary _non_loopback_ setting.
Closes#13954
We should not implement this method, it is a real problem. But I think
it is ok to workaround the JDK bug (https://bugs.openjdk.java.net/browse/JDK-8014008).
This allows jconsole/visualvm to work in the meantime.
The @IndexSettings annoationat has been used to differentiate between node-level
and index level settings. It was also decoupled from realtime-updates such that
the settings object that a class got injected when it was created was static and
not subject to change when an update was applied. This change removes the annoation
and replaces it with a full-fledged class that adds type-safety and encapsulates additional
functionality as well as checks on the settings.
Today IndicesLifecycle is a per-node class that allows to register
listeners at any time. It also requires to de-register if listeners
are not needed anymore ie. if classes are created per-index / shard etc.
They also cause issues where listeners are registered more than once as in #13259
This commit removes the per-node class and replaces it with an well defined
extension point that allows listeners to be registered at index creation time
without the need to unregister since listeners are go out of scope if the index
goes out of scope. Yet, this still allows to share instances across indices as before
but without the risk of double registering them etc.
All data-structures used for event notifications are now immuatble and can only changes
on index creation time. This removes flexibility to some degree but increases maintainability
of the interface and the code itself dramatically especially with the step by step removal of
the index level dependency injection.
Closes#13259
The ExtensionPoint.ClassSet binds adds the extension classes to a a Multibinder and binds
the classes and calls the asEagerSingleton method on the multibinder. This does not actually
create a singleton. Instead we first bind the class as a singleton and add then add the class
to the multibinder.
Closes#14194
Numeric and boolean fields have doc values enabled by default as of
elasticsearch 2.0. This commit removes support for uninverted/in-memory
fielddata, as well as numeric fields encoded in binary doc values which was
the way that elasticsearch stored doc values in a Lucene index before the
1.4 release.
As a consequence, you will only be able to sort and aggregate on numeric and
boolean fields in Elasticsearch 3.0 if doc values have not been switched off.
With #13691 we introduced some custom logic to make sure that date math expressions like <logstash-{now/D}> don't get broken up into two where the slash appears in the expression. That said the only correct way to provide such a date math expression as part of the uri would be to properly escape the '/' instead. This fix also introduced a regression, as it would make sure that unescaped '/' are handled only in the case of date math expressions, but it removed support for properly escaped slashes anywhere else. The solution is to keep supporting escaped slashes only and require client libraries to properly escape them.
This commit reverts 93ad696 and makes sure that our REST tests runner supports escaping of path parts, which was more involving than expected as each single part of the path needs to be properly escaped. I am not too happy with the current solution but it's the best I could do for now, maybe not that concerning anyway given that it's just test code. I do find uri encoding quite frustrating in java.
Relates to #13691
Relates to #13665Closes#14177Closes#14216
There are three ways `@Test` was used. Way one:
```java
@Test
public void flubTheBlort() {
```
This way was always replaced with:
```java
public void testFlubTheBlort() {
```
Or, maybe with a better method name if I was feeling generous.
Way two:
```java
@Test(throws=IllegalArgumentException.class)
public void testFoo() {
methodThatThrows();
}
```
This way of using `@Test` is actually pretty OK, but to get the tools to ban
`@Test` entirely it can't be used. Instead:
```java
public void testFoo() {
try {
methodThatThrows();
fail("Expected IllegalArgumentException");
} catch (IllegalArgumentException e ) {
assertThat(e.getMessage(), containsString("something"));
}
}
```
This is longer but tests more than the old ways and is much more precise.
Compare:
```java
@Test(throws=IllegalArgumentException.class)
public void testFoo() {
some();
copy();
and();
pasted();
methodThatThrows();
code(); // <---- This was left here by mistake and is never called
}
```
to:
```java
@Test(throws=IllegalArgumentException.class)
public void testFoo() {
some();
copy();
and();
pasted();
try {
methodThatThrows();
fail("Expected IllegalArgumentException");
} catch (IllegalArgumentException e ) {
assertThat(e.getMessage(), containsString("something"));
}
}
```
The final use of test is:
```java
@Test(timeout=1000)
public void testFoo() {
methodThatWasSlow();
}
```
This is the most insidious use of `@Test` because its tempting but tragically
flawed. Its flaws are:
1. Hard and fast timeouts can look like they are asserting that something is
faster and even do an ok job of it when you compare the timings on the same
machine but as soon as you take them to another machine they start to be
invalid. On a slow VM both the new and old methods fail. On a super-fast
machine the slower and faster ways succeed.
2. Tests often contain slow `assert` calls so the performance of tests isn't
sure to predict the performance of non-test code.
3. These timeouts are rude to debuggers because the test just drops out from
under it after the timeout.
Confusingly, timeouts are useful in tests because it'd be rude for a broken
test to cause CI to abort the whole build after it hits a global timeout. But
those timeouts should be very very long "backstop" timeouts and aren't useful
assertions about speed.
For all its flaws `@Test(timeout=1000)` doesn't have a good replacement __in__
__tests__. Nightly benchmarks like http://benchmarks.elasticsearch.org/ are
useful here because they run on the same machine but they aren't quick to check
and it takes lots of time to figure out the regressions. Sometimes its useful
to compare dueling implementations but that requires keeping both
implementations around. All and all we don't have a satisfactory answer to the
question "what do you replace `@Test(timeout=1000)`" with. So we handle each
occurrence on a case by case basis.
For files with `@Test` this also:
1. Removes excess blank lines. They don't help anything.
2. Removes underscores from method names. Those would fail any code style
checks we ever care to run and don't add to readability. Since I did this manually
I didn't do it consistently.
3. Make sure all test method names start with `test`. Some used to end in `Test` or start
with `verify` or `check` and they were picked up using the annotation. Without the
annotation they always need to start with `test`.
4. Organizes imports using the rules we generate for Eclipse. For the most part
this just removes `*` imports which is a win all on its own. It was "required"
to quickly remove `@Test`.
5. Removes unneeded casts. This is just a setting I have enabled in Eclipse and
forgot to turn off before I did this work. It probably isn't hurting anything.
6. Removes trailing whitespace. Again, another Eclipse setting I forgot to turn
off that doesn't hurt anything. Hopefully.
7. Swaps some tests override superclass tests to make them empty with
`assumeTrue` so that the reasoning for the skips is logged in the test run and
it doesn't "look like" that thing is being tested when it isn't.
8. Adds an oxford comma to an error message.
The total test count doesn't change. I know. I counted.
```bash
git checkout master && mvn clean && mvn install | tee with_test
git no_test_annotation master && mvn clean && mvn install | tee not_test
grep 'Tests summary' with_test > with_test_summary
grep 'Tests summary' not_test > not_test_summary
diff with_test_summary not_test_summary
```
These differ somewhat because some tests are skipped based on the random seed.
The total shouldn't differ. But it does!
```
1c1
< [INFO] Tests summary: 564 suites (1 ignored), 3171 tests, 31 ignored (31 assumptions)
---
> [INFO] Tests summary: 564 suites (1 ignored), 3167 tests, 17 ignored (17 assumptions)
```
These are the core unit tests. So we dig further:
```bash
cat with_test | perl -pe 's/\n// if /^Suite/;s/.*\n// if /IGNOR/;s/.*\n// if /Assumption #/;s/.*\n// if /HEARTBEAT/;s/Completed .+?,//' | grep Suite > with_test_suites
cat not_test | perl -pe 's/\n// if /^Suite/;s/.*\n// if /IGNOR/;s/.*\n// if /Assumption #/;s/.*\n// if /HEARTBEAT/;s/Completed .+?,//' | grep Suite > not_test_suites
diff <(sort with_test_suites) <(sort not_test_suites)
```
The four tests with lower test numbers are all extend `AbstractQueryTestCase`
and all have a method that looks like this:
```java
@Override
public void testToQuery() throws IOException {
assumeTrue("test runs only when at least a type is registered", getCurrentTypes().length > 0);
super.testToQuery();
}
```
It looks like this method was being double counted on master and isn't anymore.
Closes#14028
The NotQueryBuilder has been deprecated on the 2.x branches
and can be removed with the next major version. It can be
replaced by boolean query with added mustNot() clause.
Closes#13761
This commit replaces instances of manually computing a hash code for
primitive longs by XORing the upper bits with the lower bits with a
built-in method for doing the same.
This adds an API for force merging lucene segments. The `/_optimize` API is now
deprecated and replaced by the `/_forcemerge` API, which has all the same flags
and action, just a different name.
This commit removes some cache concurrency level settings that were
applicable when the cache was backed by the Guava cache implementation,
but no longer apply with the cache implementation completed in #13717.
Relates #7836, relates #13224, relates #13717
Today we leak the notion of an engine outside of the shard abstraction
which is not desirable. This commit refactors the infrastrucutre to use
use already existing interfaces to communicate if a shard has failed and
prevents engine private classes to be implemented on a higher level.
This change is purely cosmentical...
This commit removes some build output files from the
burn_maven_with_fire_branch that appear to have been mistakenly
committed to master in bfb9054a11.
This is very simple to do and recommended by `privileges(5)` documentation:
```
Daemons that never need to exec subprocesses should remove the PRIV_PROC_EXEC privilege from their permitted and limit sets.
```
Closes#14200
Adds *Exception(Throwable cause) constructors and calls them where appropriate
thus getting rid of 16 instances of calling getMessage and eliminating the risk
of loosing exception context.
Fixes ElasticsearchTimeoutException along the way (used to discard the
parameter args in the (String message, Object... args) constructor, passes it
up to super now.
Relates to #10021
Geopoint's equals method was modified to consider two points equal if they are within a threshold. This change was done to accept round-off error introduced from GeoHash encoding methods. This commit removes this trappy leniency from the GeoPoint equals method and instead forces round-off error to be handled at the encoding source.
This commit renames ShardReplicationTests to
TransportReplicationActionTests. This rename is to reflect the fact
that the tests contained in this test suite are for testing
TransportReplicationAction. This class was previously renamed but the
test suite was not.
Does so by improving the error message passed to MapperParsingException.
The error messages for mapping conflicts now look like:
```
{
"error" : {
"root_cause" : [ {
"type" : "mapper_parsing_exception",
"reason" : "Failed to parse mapping [type_one]: Mapper for [text] conflicts with existing mapping in other types:\n[mapper [text] has different [analyzer], mapper [text] is used by multiple types. Set update_all_types to true to update [search_analyzer] across all types., mapper [text] is used by multiple types. Set update_all_types to true to update [search_quote_analyzer] across all types.]"
} ],
"type" : "mapper_parsing_exception",
"reason" : "Failed to parse mapping [type_one]: Mapper for [text] conflicts with existing mapping in other types:\n[mapper [text] has different [analyzer], mapper [text] is used by multiple types. Set update_all_types to true to update [search_analyzer] across all types., mapper [text] is used by multiple types. Set update_all_types to true to update [search_quote_analyzer] across all types.]",
"caused_by" : {
"type" : "illegal_argument_exception",
"reason" : "Mapper for [text] conflicts with existing mapping in other types:\n[mapper [text] has different [analyzer], mapper [text] is used by multiple types. Set update_all_types to true to update [search_analyzer] across all types., mapper [text] is used by multiple types. Set update_all_types to true to update [search_quote_analyzer] across all types.]"
}
},
"status" : 400
}
```
Closes#12839
Change implementation
Rather than make a new exception this improves the error message of the old
exception.