1831 Commits

Author SHA1 Message Date
Adrien Grand
8238f497d8 Expose Lucene's new TopTermsBlendedFreqScoringRewrite.
This rewrite method is interesting because it computes scores as if all terms
had the same frequencies, which avoids disappointments with ranking when a fuzzy
query ranks typos first given that they are less frequent than the correct term.
2015-07-08 16:01:47 +02:00
Tanguy Leroux
1c5d8efd47 Process Stats: remove sigar specific stats from APIs and add JMX implementation 2015-07-08 15:12:45 +02:00
Adrien Grand
fbab48e451 Clean up handling of missing values when merging shard results on the coordinating node.
Today shards are responsible for producing one sort value per document, which
is later used on the coordinating node to resolve the global top documents.
However, this is problematic on string fields with
`missing: _first, order: desc` or `missing: _last, order: asc` given that there
is no such thing as a string that compares greater than any other string. Today
we use a string containing a single code point which is the maximum allowed code
point but this is a hack: instead we should inform the coordinating node that
the document had no value and let it figure out how it should be sorted
depending on whether missing values should be sorted first or last.
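
A minimal sketch of the idea, not the code from this commit: each shard reports `null` for a document without a value, and the coordinating node decides where such documents go. The `MissingAwareSort` and `ShardDoc` names are hypothetical and exist only for this illustration.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

final class MissingAwareSort {
    /** Hypothetical per-document sort key sent by a shard; value is null when the field is missing. */
    static final class ShardDoc {
        final String id;
        final String value;

        ShardDoc(String id, String value) {
            this.id = id;
            this.value = value;
        }
    }

    /** Builds the merge comparator from the sort order and the missing-value policy. */
    static Comparator<ShardDoc> comparator(boolean descending, boolean missingFirst) {
        Comparator<String> byValue = Comparator.naturalOrder();
        if (descending) {
            byValue = byValue.reversed();
        }
        // Missing values are placed explicitly instead of being faked with a "maximum" string.
        Comparator<String> withMissing = missingFirst
                ? Comparator.nullsFirst(byValue)
                : Comparator.nullsLast(byValue);
        return Comparator.comparing((ShardDoc d) -> d.value, withMissing);
    }

    /** Returns the document ids in merged order. */
    static List<String> sortedIds(List<ShardDoc> docs, boolean descending, boolean missingFirst) {
        List<ShardDoc> copy = new ArrayList<>(docs);
        copy.sort(comparator(descending, missingFirst));
        List<String> ids = new ArrayList<>();
        for (ShardDoc d : copy) {
            ids.add(d.id);
        }
        return ids;
    }
}
```

With `missing: _first, order: desc`, a document without a value sorts ahead of every string, and no sentinel code point is needed.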

Close #9155
2015-07-08 14:48:35 +02:00
Simon Willnauer
f9a45fd605 Cleanup ShardRoutingState uses and hide implementation details of ClusterInfo 2015-07-08 14:36:26 +02:00
Shay Banon
097b132238 Consolidate ShardRouting construction
Simplify and consolidate ShardRouting construction. Make sure that there is really only one place it gets created, when a shard is first created in unassigned state, and from there on, it is either copy constructed or built internally as a target for relocation.
This change helps ensure that, within our codebase, data carried by the ShardRouting is not lost as the shard goes through transitions, and it can simplify adding more data to it (like uuid).
For testing, a centralized TestShardRouting allows creating testable versions of ShardRouting that need not be as strict as the non-test codebase. This can be cleaned up further later on, but it is a good start.
closes #12125
2015-07-08 14:15:28 +02:00
Christoph Büscher
fc1b178dc4 Merge branch 'master' into feature/query-refactoring
Conflicts:
	core/src/main/java/org/elasticsearch/index/query/FuzzyQueryBuilder.java
	core/src/main/java/org/elasticsearch/index/query/FuzzyQueryParser.java
	core/src/main/java/org/elasticsearch/index/query/RegexpQueryBuilder.java
	core/src/main/java/org/elasticsearch/index/query/RegexpQueryParser.java
2015-07-08 13:11:25 +02:00
Isabel Drost-Fromm
0202c99e50 Separates JSON parsing from Lucene query creation, adds support for streaming, hashCode and equals as well as unit tests.
Relates to #10217
2015-07-08 10:56:23 +02:00
Alex Ksikes
a6c0007325 Fix FuzzyQuery to properly handle Object, number, dates or String.
This makes FuzzyQueryBuilder and Parser take an Object as a value using the
same logic as termQuery, so that numbers, dates or Strings would be properly
handled.

Relates #11865
Closes #12020
2015-07-08 10:41:03 +02:00
Simon Willnauer
b5452074a3 [TEST] Only sanity check time values in stats
Testing the actual time value even with lowerbounds is very tricky
and fails very often. We should really just sanity check the values.
2015-07-08 10:02:24 +02:00
Ryan Ernst
8d9053a841 Merge pull request #12089 from rjernst/refactor/field-mapper-collapse
Remove AbstractFieldMapper
2015-07-07 21:36:47 -07:00
Ryan Ernst
8c45c7f482 Internal: Change JarHell to operate on Path instead of URL
This converts the tracking of jars and classes in JarHell to use
Path objects, instead of URL. This makes for nicer printing
of the underlying path when an error does occur.
2015-07-07 20:14:23 -07:00
Ryan Ernst
bab1323d1e Fix JarHell check to properly convert URL to Path so it can be compared
to java.home
2015-07-07 19:44:15 -07:00
Jason Tedor
83f6587e61 Default fuzzy transpositions to true
This commit defaults fuzzy_transpositions on fuzzy queries to true. This means that by default, transpositions will now count as a single
edit.
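
To illustrate what counting a transposition as a single edit means, here is a small sketch of an edit distance with an optional transposition rule (the optimal string alignment variant). It is an illustration only: Lucene's fuzzy matching is implemented differently, and the `EditDistance` class is a hypothetical name.

```java
final class EditDistance {
    /**
     * Edit distance between a and b. When transpositions is true, swapping two
     * adjacent characters costs one edit instead of two.
     */
    static int distance(String a, String b, boolean transpositions) {
        int[][] d = new int[a.length() + 1][b.length() + 1];
        for (int i = 0; i <= a.length(); i++) {
            d[i][0] = i; // delete all of a's first i characters
        }
        for (int j = 0; j <= b.length(); j++) {
            d[0][j] = j; // insert all of b's first j characters
        }
        for (int i = 1; i <= a.length(); i++) {
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                int best = Math.min(
                        Math.min(d[i - 1][j] + 1, d[i][j - 1] + 1),
                        d[i - 1][j - 1] + cost);
                if (transpositions && i > 1 && j > 1
                        && a.charAt(i - 1) == b.charAt(j - 2)
                        && a.charAt(i - 2) == b.charAt(j - 1)) {
                    // Adjacent swap counted as a single edit.
                    best = Math.min(best, d[i - 2][j - 2] + 1);
                }
                d[i][j] = best;
            }
        }
        return d[a.length()][b.length()];
    }
}
```

Here `distance("lucene", "lucnee", true)` is 1, while the same call with `false` is 2.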

Closes #9278
2015-07-07 20:13:56 -04:00
Ryan Ernst
6eacbf764d Merge pull request #12106 from rjernst/tests/jar-hell
Tests: Add unit tests for JarHell
2015-07-07 16:58:31 -07:00
Ryan Ernst
35b76ca081 Tests: Add unit tests for JarHell 2015-07-07 16:54:03 -07:00
Robert Muir
46c89f006d Allow use of bouncycastle 2015-07-07 17:43:35 -04:00
Robert Muir
27b8e59c24 remove temporary leniency 2015-07-07 17:35:16 -04:00
Jack Conradson
6dbf56fe99 Simplify CacheKey used for scripts
Replaced the CacheKey class with a static method that returns a String.
The class was overkill.

closes #12092
2015-07-07 14:20:03 -07:00
Areek Zillur
71a6d6d5e9 Merge branch 'master' of github.com:elasticsearch/elasticsearch 2015-07-07 17:00:04 -04:00
Tanguy Leroux
44efbf2770 Renaming FsStats to FsInfo 2015-07-07 22:50:15 +02:00
Areek Zillur
4849e76275 Currently when an engine is failed, it is marked as corrupted regardless of
the failure type. This change marks the engine as corrupted only when the failure
is caused by an actual index corruption. When an engine is failed for other
reasons, the engine is only closed without removing the shard state.
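
One way to make that distinction, sketched under assumptions: walk the cause chain of the failure and look for a corruption exception. `CorruptionException` below is a hypothetical stand-in for the real exception type so that the example compiles on its own; whether the actual change inspects the cause chain this way is not stated in the commit message.

```java
import java.io.IOException;

final class EngineFailure {
    /** Hypothetical stand-in for an index corruption exception. */
    static final class CorruptionException extends IOException {
        CorruptionException(String message) {
            super(message);
        }
    }

    /** Returns true only if the failure, or one of its causes, signals index corruption. */
    static boolean causedByCorruption(Throwable failure) {
        int depth = 0;
        // The depth limit guards against pathological cyclic cause chains.
        for (Throwable t = failure; t != null && depth < 32; t = t.getCause(), depth++) {
            if (t instanceof CorruptionException) {
                return true;
            }
        }
        return false;
    }
}
```

A caller would mark the shard as corrupted only when `causedByCorruption` returns true, and simply close the engine otherwise.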

closes #11788
2015-07-07 16:48:18 -04:00
Tanguy Leroux
fbcf4dbbf7 FS Stats: remove sigar specific stats from APIs:
- fs.*.disk_reads
- fs.*.disk_writes
- fs.*.disk_io_op
- fs.*.disk_read_size_in_bytes
- fs.*.disk_write_size_in_bytes
- fs.*.disk_io_size_in_bytes
- fs.*.disk_queue
- fs.*.disk_service_time
2015-07-07 22:16:39 +02:00
Robert Muir
7dbc5c7ab9 Merge pull request #12093 from rmuir/no_fucking_way
Give a better exception when a jar contains same classfile twice.
2015-07-07 16:13:12 -04:00
Tanguy Leroux
30892c4129 Remove network stats & info 2015-07-07 21:16:42 +02:00
Alex Ksikes
de277d99d9 Make MultiTermQueryBuilder an interface again
This PR is against the query-refactoring branch.

Closes #12074
2015-07-07 20:58:16 +02:00
Robert Muir
1994dbde15 Give a better exception when a jar contains same classfile twice.
And ignore the known issue with xmlbeans for now... though it may
cause us issues ultimately: https://issues.apache.org/jira/browse/XMLBEANS-499
2015-07-07 13:26:54 -04:00
David Pilato
d57de59158 Simplify Plugin Manager for official plugins
Plugin Manager can now use another simplified form when a user wants to install an official plugin hosted at elasticsearch download service.

The form we use is:

```sh
bin/plugin install pluginname
```

As plugins now share the same version as elasticsearch, we can automatically guess the exact current version of the plugin manager script.

Also, download service will now use `/org.elasticsearch.plugins/pluginName/pluginName-version.zip` URL path to download a plugin.

If the older form is provided (`user/plugin/version` or `user/plugin`), we will still use:

 * elasticsearch download service at `/user/plugin/plugin-version.zip`
 * maven central with groupId=user, artifactId=plugin and version=version
 * github with user=user, repoName=plugin and tag=version
 * github with user=user, repoName=plugin and branch=master if no version is set

Note that community plugin providers can use other download services by using `--url` option.

If you try to use the new form with a non core elasticsearch plugin, the plugin manager will reject
it and will give you all known core plugins.

```
Usage:
    -u, --url     [plugin location]   : Set exact URL to download the plugin from
    -i, --install [plugin name]       : Downloads and installs listed plugins [*]
    -t, --timeout [duration]          : Timeout setting: 30s, 1m, 1h... (infinite by default)
    -r, --remove  [plugin name]       : Removes listed plugins
    -l, --list                        : List installed plugins
    -v, --verbose                     : Prints verbose messages
    -s, --silent                      : Run in silent mode
    -h, --help                        : Prints this help message

 [*] Plugin name could be:
     elasticsearch-plugin-name    for Elasticsearch 2.0 Core plugin (download from download.elastic.co)
     elasticsearch/plugin/version for elasticsearch commercial plugins (download from download.elastic.co)
     groupId/artifactId/version   for community plugins (download from maven central or oss sonatype)
     username/repository          for site plugins (download from github master)

Elasticsearch Core plugins:
 - elasticsearch-analysis-icu
 - elasticsearch-analysis-kuromoji
 - elasticsearch-analysis-phonetic
 - elasticsearch-analysis-smartcn
 - elasticsearch-analysis-stempel
 - elasticsearch-cloud-aws
 - elasticsearch-cloud-azure
 - elasticsearch-cloud-gce
 - elasticsearch-delete-by-query
 - elasticsearch-lang-javascript
 - elasticsearch-lang-python
```
2015-07-07 18:27:40 +02:00
Ryan Ernst
4aecd37e57 Mappings: Remove AbstractFieldMapper
AbstractFieldMapper is the only direct base class of FieldMapper.
This change moves all AbstractFieldMapper functionality into
FieldMapper, since there is no need for 2 levels of abstraction.
2015-07-07 08:43:38 -07:00
Jason Tedor
c563d68872 Failure during the fetch phase of scan should invoke the failed fetch phase handler.
This commit fixes an issue where during a failure in the fetch phase of a scan the wrong failure handler was invoked.

Closes #12086
2015-07-07 11:35:03 -04:00
Ryan Ernst
2cc0382cf0 Merge pull request #12068 from rjernst/fix/mapper-names-conflict
Mappings: Enforce field names do not contain dot
2015-07-07 08:34:04 -07:00
Martijn van Groningen
f7ac2a7e1c test: check node count on all nodes before checking if cluster state is the same on all nodes 2015-07-07 16:46:24 +02:00
Jason Tedor
b2d8a1fd1b Count scans in search stats and add metrics for scrolls
Each scroll on a scan causes a query to be executed. This commit adds support for these indirect queries to count against the search stats.
Additionally, this commit adds three new search stats: scroll_count, scroll_time_in_millis, and scroll_current. scroll_count tracks the
number of completed scrolls. scroll_time_in_millis tracks the total time that scrolls were held open. scroll_current tracks the number of
scrolls currently open.

Closes #9109
2015-07-07 10:20:45 -04:00
Alexander Reelsen
21b4f9b6f8 Plugins: Ensure logging configuration is loaded in plugin manager
This prevents log4j warnings from being printed when installing a plugin,
due to the JarHell class using an ESLogger.

Closes #12064
2015-07-07 14:56:06 +02:00
Alex Ksikes
5e023848de Properly fix the default regex flag to ALL for RegexpQueryBuilder and Parser
Relates to #11896
Closes #12067
2015-07-07 14:26:46 +02:00
Boaz Leskes
67318ce7ba Tests: Faster recovery from simulated disruptions
In testing infra, one can simulate node GCs, network issues and other problems by adding a disruption to the test cluster. Those disruptions are automatically removed after the test is done. At the moment each disruption indicates how long it will take the cluster to heal once the disruption is removed and the test cluster waits for this amount of time. However, more often than not this is an upper bound, causing a much longer wait than needed. Instead we should push the responsibility of healing to the disruption itself, where we can be smarter about what we wait for.

Closes #12071
2015-07-07 14:16:31 +02:00
Alex Ksikes
4f9855261a Revert "fix RegexpQueryBuilder#maxDeterminizedStates"
This reverts commit b7e26fae3ff9779caec251b7490d71eeaa297161.
2015-07-07 14:06:07 +02:00
Alex Ksikes
b7e26fae3f fix RegexpQueryBuilder#maxDeterminizedStates
Value was improperly set to `true`.

Relates to #11896
2015-07-07 13:44:05 +02:00
David Pilato
af1dc6d809 [test] awaitBusy: add a ceiling to max sleep time
When using `awaitBusy`, you might not always want to keep doubling the time between two runs indefinitely.

For example, let's say it will probably take 30 seconds to run a test.
When doubling every time, you will most likely wait longer than needed:

|iteration|ms           |s            |duration (ms)|duration (s)|
|-----------|-------------|-----------|-----------|-----------|
|1|1|0,001|1|0,001|
|2|2|0,002|3|0,003|
|3|4|0,004|7|0,007|
|4|8|0,008|15|0,015|
|5|16|0,016|31|0,031|
|6|32|0,032|63|0,063|
|7|64|0,064|127|0,127|
|8|128|0,128|255|0,255|
|9|256|0,256|511|0,511|
|10|512|0,512|1023|1,023|
|11|1024|1,024|2047|2,047|
|12|2048|2,048|4095|4,095|
|13|4096|4,096|8191|8,191|
|14|8192|8,192|16383|16,383|
|15|16384|16,384|32767|32,767|
|16|32768|32,768|65535|65,535|
|17|65536|65,536|131071|131,071|
|18|131072|131,072|262143|262,143|
|19|262144|262,144|524287|524,287|
|20|524288|524,288|1048575|1048,575|
|21|1048576|1048,576|2097151|2097,151|

For example here, if the task is successful after 35 seconds, we will most likely have to wait for 32s more before the Predicate is run again.

With this patch, the maximum sleep time is now set to 1 second.
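
The effect of the ceiling can be sketched as follows. This is an illustration under assumptions, not the test framework's actual `awaitBusy`: the `AwaitBusySketch` class, its method signatures, and the `firstSuccessfulCheck` simulation helper are hypothetical.

```java
import java.util.function.BooleanSupplier;

final class AwaitBusySketch {
    /** Doubles the sleep time, but never beyond the ceiling. */
    static long nextSleep(long sleep, long ceiling) {
        return sleep > ceiling / 2 ? ceiling : sleep * 2;
    }

    /** Polls the condition with capped exponential backoff until it holds or time runs out. */
    static boolean awaitBusy(BooleanSupplier condition, long maxWaitMillis, long maxSleepMillis)
            throws InterruptedException {
        long ceiling = Math.max(1, maxSleepMillis);
        long sleep = 1;
        long waited = 0;
        while (waited < maxWaitMillis) {
            if (condition.getAsBoolean()) {
                return true;
            }
            long step = Math.min(sleep, maxWaitMillis - waited);
            Thread.sleep(step);
            waited += step;
            sleep = nextSleep(sleep, ceiling);
        }
        return condition.getAsBoolean();
    }

    /**
     * Simulation without real sleeping: returns the elapsed time (ms) of the first
     * check that happens at or after the moment the condition becomes true.
     */
    static long firstSuccessfulCheck(long readyAtMillis, long maxSleepMillis) {
        long ceiling = Math.max(1, maxSleepMillis);
        long elapsed = 0;
        long sleep = 1;
        while (elapsed < readyAtMillis) {
            elapsed += sleep;
            sleep = nextSleep(sleep, ceiling);
        }
        return elapsed;
    }
}
```

For a condition that becomes true after 35 seconds, `firstSuccessfulCheck(35_000, Long.MAX_VALUE)` returns 65535, whereas `firstSuccessfulCheck(35_000, 1_000)` returns 35023.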
2015-07-07 12:04:16 +02:00
Christoph Büscher
d8e56e9a6d Merge pull request #12073 from cbuescher/feature/query-refactoring-spanfirst
Query refactoring: SpanFirstQueryBuilder and Parser
2015-07-07 11:58:28 +02:00
Colin Goodheart-Smithe
1d7fc6b4f2 Aggregations: Pipeline Aggregation to filter buckets based on a script
This pipeline aggregation runs a script on each bucket in the parent aggregation to determine whether the bucket is kept in the final aggregation tree. If the script returns true the bucket is retained; if it returns false the bucket is dropped.
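
Conceptually, the filtering step looks like the sketch below, where a Java `Predicate` stands in for the script and a map of bucket keys to metric values stands in for the parent aggregation's buckets. All names here are hypothetical; the real aggregation works on Elasticsearch's internal bucket types.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Predicate;

final class BucketFilterSketch {
    /** Keeps only the buckets for which the "script" returns true, preserving bucket order. */
    static Map<String, Double> keepBuckets(Map<String, Double> buckets, Predicate<Double> script) {
        Map<String, Double> kept = new LinkedHashMap<>();
        for (Map.Entry<String, Double> bucket : buckets.entrySet()) {
            if (script.test(bucket.getValue())) {
                kept.put(bucket.getKey(), bucket.getValue());
            }
        }
        return kept;
    }
}
```

For instance, with monthly sales buckets and the predicate `v -> v > 100`, only the months whose total exceeds 100 remain in the result.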
2015-07-07 09:51:16 +01:00
Christoph Büscher
51a27ab082 Query Refactoring: Make EmptyQueryBuilder implement QueryBuilder directly
By extending AbstractQueryBuilder, EmptyQueryBuilder had setters for boost and
queryname, which defeats its original purpose of being a stand-in
singleton for empty queries. Directly implementing QueryBuilder (and
temporarily also extending ToXContentToBytes) prevents this.
2015-07-07 10:41:10 +02:00
Alexander Reelsen
b612cab96a Dates: More strict parsing of ISO dates
If you are using the default date or the named identifiers of dates,
the current implementation was allowed to read a year with only one
digit. In order to make this more strict, this fixes a year to be at
least 4 digits. Same applies for month, day, hour, minute, seconds.
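
The difference between lenient and strict parsing can be illustrated with `java.time`. Elasticsearch's date formats are implemented differently, so this is only a sketch of the rule; the class name and both patterns are assumptions made for the example.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeParseException;

final class StrictDateSketch {
    // Strict: exactly two digits for month and day, at least four for the year.
    private static final DateTimeFormatter STRICT = DateTimeFormatter.ofPattern("uuuu-MM-dd");
    // Lenient: accepts a single digit for any component.
    private static final DateTimeFormatter LENIENT = DateTimeFormatter.ofPattern("u-M-d");

    /** Returns true if the text parses as a date under the chosen policy. */
    static boolean parses(String text, boolean strict) {
        try {
            LocalDate.parse(text, strict ? STRICT : LENIENT);
            return true;
        } catch (DateTimeParseException e) {
            return false;
        }
    }
}
```

Under the lenient pattern `5-07-08` is read as a date in year 5, which is rarely what the user meant; the strict pattern rejects it.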

Also the new default is `strictDateOptionalTime` for indices created
with Elasticsearch 2.0 or newer.

In addition a couple of not exposed date formats have been exposed, as they
have been mentioned in the documentation.

Closes #6158
2015-07-07 09:34:37 +02:00
Christoph Büscher
35ddc749b1 Merge pull request #12060 from cbuescher/fix/9821
Fix: Use correct OpType on Failure in BulkItemResponse
2015-07-07 09:29:29 +02:00
Christoph Büscher
53db46b560 Query refactoring: SpanFirstQueryBuilder and Parser
Moving the query building functionality from the parser to the builders
new toQuery() method analogous to other recent query refactorings.

Relates to #10217
2015-07-07 09:13:28 +02:00
Robert Muir
d732c0d19f Add symlink permissions test 2015-07-07 02:38:11 -04:00
Simon Willnauer
9e196c3a0b [TEST] Wait for test thread to join before 2015-07-07 07:24:50 +02:00
Ryan Ernst
aed1f68e49 Mappings: Enforce field names do not contain dot
Field names containing dots can cause problems. For example, @jpountz
made this recreation which causes no error, but can result in a
serialization exception if the type already exists:
https://gist.github.com/jpountz/8c66817e00a322b81f85

But this is not just a potential conflict. It also has larger problems,
since only the leaf mapper is created. The intermediate "foo" object
field would not exist if only "foo.bar" was in the mappings.

This change forbids the use of dots in field names. It also
fixes an issue with passing through the update_all_types setting,
which was always set to true whenever a type already existed (!).
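
The check itself is simple. A hedged sketch follows; the class name, method names, and error message are hypothetical and not taken from the actual mapper code.

```java
final class FieldNameCheck {
    /** Rejects field names that contain a dot. */
    static void validate(String fieldName) {
        if (fieldName.indexOf('.') >= 0) {
            throw new IllegalArgumentException(
                    "Field name [" + fieldName + "] cannot contain '.'");
        }
    }

    /** Convenience wrapper that reports validity instead of throwing. */
    static boolean isValid(String fieldName) {
        try {
            validate(fieldName);
            return true;
        } catch (IllegalArgumentException e) {
            return false;
        }
    }
}
```

With such a check, `foo` is accepted and `foo.bar` is rejected at mapping time rather than failing later.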

I do not think we should worry about backwards compatibility here. This
should be a hard break (and added to the migration plugin).
2015-07-06 18:22:06 -07:00
Simon Willnauer
3906ff950c Don't use forbidden API in test 2015-07-06 22:56:32 +02:00
Simon Willnauer
04c5dab3d9 Add basic recovery prioritization to GatewayAllocator
This commit adds logic to prefer shards with higher priority
or from newer indices to be allocated first if they are unallocated post API.

This commit allows users to set `index.priority` to a non-negative integer to
prioritize index recovery for certain indices. This setting is dynamically updateable
and defaults to `0`. If two indices have the same priority this change takes the creation
date into account to prioritize shards from newer indices which is important in the time-based
indices use case.
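
The ordering described above can be sketched as a comparator: higher `index.priority` first, then newer creation date. `IndexInfo` and `RecoveryPrioritySketch` are hypothetical types for illustration; the real allocator works on Elasticsearch's own metadata classes.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

final class RecoveryPrioritySketch {
    /** Hypothetical view of the index settings that matter for recovery ordering. */
    static final class IndexInfo {
        final String name;
        final int priority;      // value of index.priority, defaults to 0
        final long creationDate; // creation time in epoch milliseconds

        IndexInfo(String name, int priority, long creationDate) {
            this.name = name;
            this.priority = priority;
            this.creationDate = creationDate;
        }
    }

    /** Higher priority first; ties are broken in favour of the newer index. */
    static final Comparator<IndexInfo> RECOVERY_ORDER =
            Comparator.comparingInt((IndexInfo i) -> i.priority).reversed()
                    .thenComparing(Comparator.comparingLong((IndexInfo i) -> i.creationDate).reversed());

    /** Returns index names in the order their shards would be considered for recovery. */
    static List<String> orderedNames(List<IndexInfo> indices) {
        List<IndexInfo> copy = new ArrayList<>(indices);
        copy.sort(RECOVERY_ORDER);
        List<String> names = new ArrayList<>();
        for (IndexInfo i : copy) {
            names.add(i.name);
        }
        return names;
    }
}
```

With time-based indices that all keep the default priority, the most recent index recovers first, which is usually the one receiving writes.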

Closes #11787
2015-07-06 22:51:34 +02:00
Robert Muir
546e99f072 Merge pull request #12061 from rmuir/plugin-integration-tests
Add integration test harness for plugins
2015-07-06 16:03:56 -04:00