4362 Commits

Author SHA1 Message Date
Martijn van Groningen
1cfb6a79f1 Parent/child: refactored _parent field mapper and parent/child queries
* Cut the `has_child` and `has_parent` queries over to use Lucene's query time global ordinal join. The main benefit of this change is that parent/child queries can now execute efficiently when they are wrapped in a bigger boolean query. If the rest of the query only hits a few documents, the `has_child` and `has_parent` queries no longer need to evaluate all parent or child documents.
* Cut the `_parent` field over to use doc values. This significantly reduces the on-heap memory footprint of parent/child, because the parent id values are never loaded into memory.

Breaking changes:
* The `type` option on the `_parent` field can only point to a parent type that doesn't exist yet, so this means that an existing type/mapping can't become a parent type any longer.
* The `has_child` and `has_parent` queries can no longer be used in alias filters.

All these changes, improvements and breaks in compatibility only apply to indices created with ES version 2.0 or higher. For indices created with ES < 2.0 the older implementation is used.

It is highly recommended to re-index all your indices with parent and child documents to benefit from all the improvements that come with this refactoring. The easiest way to achieve this is by using the scan and bulk apis using a simple script.

Closes #6107
Closes #8134
2015-05-29 21:44:17 +02:00
Colin Goodheart-Smithe
a9ee78dd08 [TEST] muted ElasticsearchRestTestCase
This is because commit 35a58d874ef56be50a0ad1d7bfb13edb4204d0a3 causes the following REST tests to fail and reverting the commit causes conflicts:

update/15_script/Script
script/10_basic/Indexed script
2015-05-29 18:52:29 +01:00
Areek Zillur
fb8cd53582 This commit removes the ability to use filter for PhraseSuggester collate.
Only `query` can be used for collation.

Internally, a collate query is executed as an exists query. So specifying a
filter does not have any benefits.
2015-05-29 12:26:08 -04:00
Colin Goodheart-Smithe
35a58d874e Scripting: Unify script and template requests across codebase
This change unifies the way scripts and templates are specified for all instances in the codebase. It builds on the Script class added previously and adds request building and parsing support as well as the ability to transfer script objects between nodes. It also adds a Template class which aims to provide the same functionality for template APIs.

Closes #11091
2015-05-29 16:52:04 +01:00
Adrien Grand
0f3206e60c Merge pull request #11279 from jpountz/fix/simplify_compression
Internal: tighten up our compression framework.
2015-05-29 17:23:07 +02:00
Adrien Grand
b6a3952036 Internal: Use DEFLATE instead of LZF for compression.
LZF only stays for backward-compatibility reasons and can only read, not write.
DEFLATE is configured to use level=3, which is a nice trade-off between speed
and compression ratio and is the same as we use for Lucene's high compression
codec.
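
As a rough illustration of the level=3 setting mentioned above, here is a minimal sketch using the JDK's `java.util.zip` classes. The class `DeflateLevel3Sketch` and its methods are hypothetical names for this example only and are not part of the Elasticsearch compression framework.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.zip.Deflater;
import java.util.zip.DeflaterOutputStream;
import java.util.zip.InflaterInputStream;

// Hypothetical helper for illustration; not an Elasticsearch class.
final class DeflateLevel3Sketch {

    // Compress with DEFLATE at level 3 (valid levels run from 0 to 9).
    static byte[] compress(byte[] raw) {
        Deflater deflater = new Deflater(3);
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DeflaterOutputStream out = new DeflaterOutputStream(bytes, deflater)) {
            out.write(raw);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } finally {
            deflater.end(); // a caller-supplied Deflater must be released explicitly
        }
        return bytes.toByteArray();
    }

    // Decompress a stream produced by compress().
    static byte[] decompress(byte[] compressed) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (InflaterInputStream in =
                 new InflaterInputStream(new ByteArrayInputStream(compressed))) {
            byte[] buffer = new byte[4096];
            int read;
            while ((read = in.read(buffer)) != -1) {
                bytes.write(buffer, 0, read);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    public static void main(String[] args) {
        byte[] raw = "some xcontent bytes, some xcontent bytes".getBytes(StandardCharsets.UTF_8);
        System.out.println(Arrays.equals(raw, decompress(compress(raw))));
    }
}
```

Only streaming compression is shown here, which matches the direction of the related framework cleanup, but the actual Elasticsearch classes and their wire format are not reproduced.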
2015-05-29 17:01:45 +02:00
Christoph Büscher
29fbcd225b Merge pull request #11382 from cbuescher/fix/10825
Fix typed parameters in IndexRequestBuilder and CreateIndexRequestBuilder
2015-05-29 14:56:55 +02:00
Christoph Büscher
c7ca64cc08 Fix typed parameters in IndexRequestBuilder and CreateIndexRequestBuilder
IndexRequestBuilder#setSource as well as CreateIndexRequestBuilder#setSettings and
CreateIndexRequestBuilder#setSource() will not work with a Map<String, String> argument,
although the API looks like it should. This PR fixes the problem by introducing correct
wildcard parameters and adds tests.
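
The underlying Java generics issue can be sketched as follows. The method names and signatures are hypothetical and only stand in for the builder methods named above; the commit text does not show the exact signatures before or after the fix.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-ins for the builder methods; signatures are assumptions.
final class WildcardMapSketch {

    // Too strict: Map<String, String> is not a subtype of Map<String, Object>,
    // so callers holding a Map<String, String> cannot pass it here.
    static int strictSize(Map<String, Object> source) {
        return source.size();
    }

    // A wildcard value type accepts Map<String, String>, Map<String, Integer>, ...
    static int lenientSize(Map<String, ?> source) {
        return source.size();
    }

    public static void main(String[] args) {
        Map<String, String> settings = new HashMap<>();
        settings.put("index.number_of_shards", "1");
        // strictSize(settings); // would not compile: incompatible types
        System.out.println(lenientSize(settings));
    }
}
```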

Closes #10825
2015-05-29 14:42:58 +02:00
Simon Willnauer
5a9694783b Consolidate shard level modules without logic into IndexShardModule
We have a lot of module classes that don't contain any actual logic,
only declarative bind actions. These classes are unnecessary and can
be consolidated into the already existing IndexShardModule.
2015-05-29 14:16:34 +02:00
Spyros Kapnissis
784a26321b Query DSL: throw an exception if array passed to term query.
Closes #11246
Closes #11384
2015-05-29 13:40:15 +02:00
Adrien Grand
08ee4a87b3 Internal: tighten up our compression framework.
We have a compression framework that we use internally, mainly to compress some
xcontent bytes. However it is quite lenient: for instance it relies on the
assumption that detection of the compression format can only be called on either
some compressed xcontent bytes or some raw xcontent bytes, but nothing checks
this. By the way, we are misusing it in BinaryFieldMapper so that if someone
indexes a binary field which happens to have the same header as a LZF stream,
then at read time, we will try to decompress it.

This commit also simplifies the API by removing block compression (only streaming) and
some code duplication caused by some methods accepting a byte[] and other
methods a BytesReference.
2015-05-29 12:13:18 +02:00
Simon Willnauer
e98b68a665 Prevent changing the number of replicas on a closed index
Setting the number of replicas on a closed index can leave the index
in an unopenable state since we might not be able to recover a quorum.
This commit simply prevents updating this setting on a closed index.

Closes #9566
2015-05-29 11:15:37 +02:00
Simon Willnauer
5cd6ced7ee Close ShardFilterCache after Store is closed
The ShardFilterCache relies on the fact that it's
closed once the last reader on the shard is closed.
This is only guaranteed once the Store and all its
references are closed. This commit moves the closing
into the internal callback mechanism we use for deleting
shard data etc. to close the cache once we have all
searchers released.
2015-05-29 10:58:34 +02:00
Britta Weber
87a0c76e9c Merge remote-tracking branch 'boaz/index_seal_to_flush_sync'
2015-05-29 10:31:03 +02:00
Adrien Grand
6f002ffca8 Merge pull request #11381 from jpountz/fix/remove_unused_code
Internal: remove unused code.
2015-05-29 10:11:36 +02:00
Adrien Grand
1bf2a44044 Merge pull request #11308 from jpountz/fix/term_vs_terms_query
Search: Do not specialize TermQuery vs. TermsQuery.
2015-05-29 09:45:09 +02:00
Igor Motov
503f844a05 Tests: make randomRepoPath work with bwc tests
2015-05-28 12:37:51 -10:00
Igor Motov
3db9caf7a1 Tests: Increase timeout waiting for snapshot to complete in batchingShardUpdateTaskTest
When this test picks a large number of shards, the snapshot doesn't always manage to complete in 10 seconds.
2015-05-28 08:56:30 -10:00
Igor Motov
d955461f58 Tests: fix NPE in UpgradeTest
2015-05-28 07:25:00 -10:00
Igor Motov
55fc3a727b Core: refactor upgrade API to use transport and write minimum compatible version that the index was upgraded to
In #11072 we are adding a check that will prevent opening of old indices. However, this check doesn't take into consideration the fact that indices can be made compatible with the current version through the upgrade API. To make the compatibility check aware of the upgrade, the upgrade API should write two new settings: `index.version.minimum_compatible`, which indicates the minimum version of Lucene this index is compatible with, and `index.version.upgraded`, which indicates the version of Elasticsearch that performed the upgrade.

Closes #11095
2015-05-28 05:23:49 -10:00
markharwood
283b0931ff Aggregations fix: queries with size=0 broke aggregations that require scores.
Aggregations like Sampler and TopHits that require access to scores did not work if the query has size param set to zero. The assumption was that the Lucene query scoring logic was not required in these cases.
Added a JUnit test to demonstrate the issue and a fix which relies on earlier creation of Collector wrappers so that Collector.needsScores() calls work for all search operations.

Closes #11119
2015-05-28 14:45:23 +01:00
jaymode
105f4dd512 Test: filter out colons in test section names
On Windows, colons ':' are illegal in file names and since we use a Path to
check if the test is blacklisted, tests with a colon in the test section name
will fail. This change simply removes the colon from the name when matching
against the blacklist.
2015-05-28 06:39:51 -04:00
Britta Weber
91e9caabd7 [TEST] add path.home to settings
2015-05-28 11:54:56 +02:00
Britta Weber
334763acef Merge pull request #10909 from aleph-zero/issues/9706
Read configuration file with .yaml suffix
2015-05-28 11:54:05 +02:00
javanna
2f57ae9345 Internal: deduplicate field names returned by simpleMatchToFullName & simpleMatchToIndexNames in FieldMappersLookup
Relates to #10916
Closes #11377
2015-05-28 09:14:39 +02:00
Zachary Tong
491afbe01c Aggregations: Add Holt-Winters model to moving_avg pipeline aggregation
Closes #11043
2015-05-27 14:45:45 -04:00
jaymode
e54dd688a1 make JNA optional for tests and move classes to bootstrap package
Today, JNA is an optional dependency in the build but when running tests or running
with mlockall set to true, JNA must be on the classpath for Windows systems since
we always try to load JNA classes when using mlockall.

The old Natives class was renamed to JNANatives, and a new Natives class is
introduced without any direct imports on JNA classes. The Natives class checks to
see if JNA classes are available at startup. If the classes are available the Natives
class will delegate to the JNANatives class. If the classes are not available the
Natives class will not use the JNANatives class, which results in no additional attempts
to load JNA classes.

Additionally, all of the JNA classes were moved to the bootstrap package and made
package private as this is the only place they should be called from.
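
The delegation pattern described above can be sketched roughly as follows. `OptionalNativesSketch` and its methods are hypothetical; only the idea of checking for JNA classes before touching a JNA-backed class comes from the commit message, and `com.sun.jna.Native` is used as the probe class on the assumption that it is representative of JNA being on the classpath.

```java
// Hypothetical sketch of an "optional dependency" guard; not the actual Natives class.
final class OptionalNativesSketch {

    // Returns true if the named class can be located, without initializing it.
    static boolean isClassAvailable(String className) {
        try {
            Class.forName(className, false, OptionalNativesSketch.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    // Only reach for the JNA-backed code path when JNA is actually present.
    static String tryMlockall() {
        if (!isClassAvailable("com.sun.jna.Native")) {
            return "JNA not found, skipping mlockall";
        }
        // A real implementation would delegate to the JNA-backed class here
        // (JNANatives in the commit); this sketch only reports the decision.
        return "delegating to JNA-backed implementation";
    }

    public static void main(String[] args) {
        System.out.println(tryMlockall());
    }
}
```

Because the guard class has no direct imports of JNA types, loading it never triggers an attempt to load JNA classes, which is the property the commit message describes.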

Closes #11360
2015-05-27 13:06:23 -04:00
Colin Goodheart-Smithe
95faa35853 Aggregations: Sibling Pipeline Aggregations can now be nested in SingleBucketAggregations
Closes #11379
2015-05-27 17:35:21 +01:00
Adrien Grand
098c01d86c Internal: remove unused code.
2015-05-27 18:25:38 +02:00
Tanguy Leroux
acb07c72b9 Bulk: throw exception if unrecognized parameter in action/metadata line
Closes #10977
2015-05-27 18:03:58 +02:00
Alexander Reelsen
9d5e789508 Cat API: Do not rely on hashmap for sorted entries
The tests for the recently added wildcard feature were
relying on the order of the hashmap being used, which could be
different.

The implementation now ensures that the header fields are
parsed in the order they have been added.
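
A minimal sketch of the ordering concern, assuming an insertion-ordered map is one acceptable way to get this guarantee. The commit message does not say which data structure the fix uses, and `HeaderOrderSketch` is a hypothetical name.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration; not the Cat API code.
final class HeaderOrderSketch {

    // Collect header names in the order they were added, dropping repeats.
    // A plain HashMap makes no promise about iteration order; LinkedHashMap
    // iterates in insertion order, and re-adding a key keeps its first position.
    static List<String> headerOrder(String... names) {
        Map<String, Boolean> headers = new LinkedHashMap<>();
        for (String name : names) {
            headers.put(name, Boolean.TRUE);
        }
        return new ArrayList<>(headers.keySet());
    }

    public static void main(String[] args) {
        System.out.println(headerOrder("index", "shard", "node", "index"));
    }
}
```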
2015-05-27 17:46:22 +02:00
Britta Weber
ceb0782ebd Merge pull request #11364 from brwe/highlighter-wildcard
Wildcard field names in highlighting should only return fields that can be highlighted
2015-05-27 17:11:50 +02:00
Britta Weber
37610548f8 highlighting: don't fail search request when name of highlighted field contains wildcards
When highlighting on fields using wildcards, fields might match that cannot
be highlighted by the specified highlighter, and the whole search request then
failed. Instead, check that the field can be highlighted and ignore the field
if it can't.
In addition, ignore the exception thrown by the plain highlighter if a field contains
terms larger than 32766.

closes #9881
2015-05-27 17:10:35 +02:00
Colin Goodheart-Smithe
7fbd86aa97 Aggregations: Fixed Moving Average prediction to calculate the correct keys
The moving average prediction code generated incorrect keys if the key for the first bucket of the histogram was < 0. This fix makes the moving average use the rounding class from the histogram to generate the keys for the new buckets.
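
To show why negative keys are a classic trouble spot, here is a small sketch. The commit message does not say how the old keys were computed, so the truncating-division variant below is only an assumed example of how negative keys can go wrong, and `BucketKeySketch` is a hypothetical class, not the histogram's rounding class.

```java
// Hypothetical illustration of bucket-key rounding for negative values.
final class BucketKeySketch {

    // Integer division truncates toward zero, so -5 with interval 10 lands in bucket 0.
    static long truncatingKey(long value, long interval) {
        return (value / interval) * interval;
    }

    // Floor division rounds toward negative infinity, so -5 lands in bucket -10.
    static long flooringKey(long value, long interval) {
        return Math.floorDiv(value, interval) * interval;
    }

    // Keys for `predict` buckets that follow the last real bucket key.
    static long[] predictedKeys(long lastKey, long interval, int predict) {
        long[] keys = new long[predict];
        for (int i = 0; i < predict; i++) {
            keys[i] = flooringKey(lastKey + (i + 1) * interval, interval);
        }
        return keys;
    }

    public static void main(String[] args) {
        System.out.println(truncatingKey(-5, 10) + " vs " + flooringKey(-5, 10));
    }
}
```

Reusing a single rounding function for both the real and the predicted buckets, as the fix does with the histogram's rounding class, keeps the two sets of keys on the same grid.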

Closes #11369
2015-05-27 15:25:11 +01:00
Alexander Reelsen
fc224a0de8 Cat API: Add wildcard support for header names
This adds wildcard support (simple regexes) for specifying header names.
Aliases are supported as well.

Closes #10811
2015-05-27 16:09:31 +02:00
Tanguy Leroux
340b7ef6ef Add common SystemD file for RPM/DEB package
2015-05-27 11:51:58 +02:00
Boaz Leskes
6d269cbf4d feedback
2015-05-27 10:29:37 +03:00
Boaz Leskes
7451b4708e Simplify Transport*OperationAction names
As a follow up to #11332, this commit simplifies more class names by removing the superfluous Operation:

TransportBroadcastOperationAction -> TransportBroadcastAction
TransportMasterNodeOperationAction -> TransportMasterNodeAction
TransportMasterNodeReadOperationAction -> TransportMasterNodeReadAction
TransportShardSingleOperationAction -> TransportSingleShardAction

Closes #11349
2015-05-27 09:25:58 +03:00
Simon Willnauer
fcccd45601 Be more lenient in EIT#waitForDocs
The count request now acts like search and barfs if all shards fail.
This behavior changed, and some tests like RecoveryWhileUnderLoadTests
relied on the lenient behavior of the old count API. This might be
a temporary solution to stop current test failures.

Relates to #11198
2015-05-26 21:39:20 +02:00
javanna
6c81a8daf3 Internal: count api to become a shortcut to the search api
The count API used to have its own execution path, although it did the same thing as the search API (up to bugs!). This commit makes it a shortcut to the search API with size set to 0. The change is backwards compatible: all of the Java API code stays in place, since you may not want to get back a whole SearchResponse when asking only for the number of hits matching a query, and because migrating from countResponse.getCount() to searchResponse.getHits().totalHits() doesn't look great from a user perspective. We can always decide to drop more code around the count API if we want to break backwards compatibility on the Java API, making it a shortcut on the REST layer only.

Closes #9117
Closes #11198
2015-05-26 19:12:11 +02:00
Alexander Reelsen
045f01c085 Infra for deprecation logging
Add support for dedicated deprecation logging that can be turned
on in order to notify users that a specific feature, flag, setting,
parameter, ... is deprecated.

The deprecation logger logs with a "deprecation." prefixed logger
(or "org.elasticsearch.deprecation." if the full name is used), and outputs
the logging to a dedicated deprecation log file.

Deprecation messages are logged under the DEBUG category. The idea is not to
enable them by default (under WARN or ERROR) when running embedded in
another application.

By default they are turned off (INFO). To turn them on, the
"deprecation" category needs to be set to DEBUG. This can be set in the
logging file or using the cluster update settings API; see the documentation.

Closes #11033
2015-05-26 17:44:52 +02:00
Michael McCandless
9d1f6f7615 a few more ImmutableSettings -> Settings
2015-05-26 09:54:44 -04:00
javanna
44fe99a3a8 [TEST] make filter_path a default parameter in java rest runner
Closes #11351
2015-05-26 15:34:45 +02:00
Tanguy Leroux
ce63590bd6 API: Add response filtering with filter_path parameter
This change adds a new "filter_path" parameter that can be used to filter and reduce the responses returned by the REST API of elasticsearch.

For example, returning only the shards that failed to be optimized:
```
curl -XPOST 'localhost:9200/beer/_optimize?filter_path=_shards.failed'
{"_shards":{"failed":0}}
```

It supports multiple filters (separated by a comma):
```
curl -XGET 'localhost:9200/_mapping?pretty&filter_path=*.mappings.*.properties.name,*.mappings.*.properties.title'
```

It also supports the YAML response format. Here it returns only the `_id` field of a newly indexed document:
```
curl -XPOST 'localhost:9200/library/book?filter_path=_id' -d '---hello:\n  world: 1\n'
---
_id: "AU0j64-b-stVfkvus5-A"
```

It also supports wildcards. Here it returns only the host name of every node in the cluster:
```
curl -XGET 'http://localhost:9200/_nodes/stats?filter_path=nodes.*.host*'
{"nodes":{"lvJHed8uQQu4brS-SXKsNA":{"host":"portable"}}}
```

And "**" can be used to include sub fields without knowing the exact path. Here it returns only the Lucene version of every segment:
```
curl 'http://localhost:9200/_segments?pretty&filter_path=indices.**.version'
{
  "indices" : {
    "beer" : {
      "shards" : {
        "0" : [ {
          "segments" : {
            "_0" : {
              "version" : "5.2.0"
            },
            "_1" : {
              "version" : "5.2.0"
            }
          }
        } ]
      }
    }
  }
}
```

Note that elasticsearch sometimes directly returns the raw value of a field, like the _source field. If you want to filter _source fields, you should consider combining the already existing _source parameter (see Get API for more details) with the filter_path parameter like this:

```
curl -XGET 'localhost:9200/_search?pretty&filter_path=hits.hits._source&_source=title'
{
  "hits" : {
    "hits" : [ {
      "_source":{"title":"Book #2"}
    }, {
      "_source":{"title":"Book #1"}
    }, {
      "_source":{"title":"Book #3"}
    } ]
  }
}
```
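
To make the matching rules above more concrete, here is a minimal sketch of path filtering over a parsed response held as nested maps and lists. It is not the Elasticsearch implementation: `FilterPathSketch` is a hypothetical class, and its semantics are assumptions made for this example (filters are split on commas and dots, `*` matches any run of characters within one key, `**` matches zero or more levels, and list items share the path of the list that contains them).

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Pattern;

// Hypothetical sketch; this is not the Elasticsearch implementation.
final class FilterPathSketch {

    // Returns a pruned copy of `response` holding only values selected by `filterPath`.
    static Map<String, Object> filter(Map<String, Object> response, String filterPath) {
        List<String[]> filters = new ArrayList<>();
        for (String single : filterPath.split(",")) {
            filters.add(single.split("\\."));
        }
        Object kept = prune(response, new ArrayList<>(), filters);
        Map<String, Object> result = new LinkedHashMap<>();
        if (kept instanceof Map) {
            for (Map.Entry<?, ?> entry : ((Map<?, ?>) kept).entrySet()) {
                result.put(String.valueOf(entry.getKey()), entry.getValue());
            }
        }
        return result;
    }

    // Returns the part of `value` to keep, or null if nothing below it matched.
    private static Object prune(Object value, List<String> path, List<String[]> filters) {
        for (String[] filter : filters) {
            if (matches(filter, 0, path, 0)) {
                return value; // full match: keep the whole subtree
            }
        }
        if (value instanceof Map) {
            Map<String, Object> kept = new LinkedHashMap<>();
            for (Map.Entry<?, ?> entry : ((Map<?, ?>) value).entrySet()) {
                String key = String.valueOf(entry.getKey());
                path.add(key);
                Object child = prune(entry.getValue(), path, filters);
                path.remove(path.size() - 1);
                if (child != null) {
                    kept.put(key, child);
                }
            }
            return kept.isEmpty() ? null : kept;
        }
        if (value instanceof List) {
            List<Object> kept = new ArrayList<>();
            for (Object item : (List<?>) value) {
                Object child = prune(item, path, filters); // items share the list's path
                if (child != null) {
                    kept.add(child);
                }
            }
            return kept.isEmpty() ? null : kept;
        }
        return null; // a leaf that no filter selected
    }

    // Glob-style match of filter segments against the key path.
    private static boolean matches(String[] filter, int f, List<String> path, int p) {
        if (f == filter.length) {
            return p == path.size();
        }
        if (filter[f].equals("**")) {
            return matches(filter, f + 1, path, p)
                    || (p < path.size() && matches(filter, f, path, p + 1));
        }
        return p < path.size()
                && segmentMatches(filter[f], path.get(p))
                && matches(filter, f + 1, path, p + 1);
    }

    // Matches one key against one segment, where '*' stands for any run of characters.
    private static boolean segmentMatches(String pattern, String key) {
        String[] literals = pattern.split("\\*", -1);
        for (int i = 0; i < literals.length; i++) {
            literals[i] = Pattern.quote(literals[i]);
        }
        return key.matches(String.join(".*", literals));
    }
}
```

Calling `FilterPathSketch.filter(response, "nodes.*.host*")` on a parsed nodes response would keep only the `host` entries under each node, in the spirit of the wildcard example above. Filtering an already-built map is a simplification chosen for readability; it says nothing about how the real feature processes responses.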
2015-05-26 13:51:04 +02:00
Britta Weber
802b7b88fa [TEST] fix expected error message
2015-05-26 11:49:33 +02:00
Britta Weber
7c6869d875 Merge pull request #11303 from brwe/custom_analyzer_name
analyzers: custom analyzers names and aliases must not start with _
2015-05-26 11:44:10 +02:00
Britta Weber
37782c1745 analyzers: custom analyzers names and aliases must not start with _
closes #9596
2015-05-26 11:38:15 +02:00
Michael McCandless
4334404a20 Don't truncate TopDocs after rescoring
We were previously over-trimming the TopDocs such that you get
size-from hits instead of size, which is wrong when from != 0.
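
The arithmetic can be sketched as follows. The commit message does not show the code, so the order of operations in `overTrimmed` (trim to `size` first, then skip `from`) is an assumption chosen because it reproduces the size-from count described above. `RescoreWindowSketch` is a hypothetical class, and plain integers stand in for hits.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration of the paging arithmetic; not the actual rescore code.
final class RescoreWindowSketch {

    // Assumed buggy order: keep only `size` hits, then skip `from` of them.
    static List<Integer> overTrimmed(List<Integer> rescored, int from, int size) {
        List<Integer> trimmed = rescored.subList(0, Math.min(size, rescored.size()));
        int start = Math.min(from, trimmed.size());
        return new ArrayList<>(trimmed.subList(start, trimmed.size()));
    }

    // Intended behavior: keep from + size hits, then skip `from`, leaving `size` hits.
    static List<Integer> page(List<Integer> rescored, int from, int size) {
        int end = Math.min(from + size, rescored.size());
        int start = Math.min(from, end);
        return new ArrayList<>(rescored.subList(start, end));
    }

    public static void main(String[] args) {
        List<Integer> rescored = new ArrayList<>();
        for (int i = 0; i < 7; i++) {
            rescored.add(i);
        }
        // With from=2 and size=5 the first call keeps 3 hits, the second keeps 5.
        System.out.println(overTrimmed(rescored, 2, 5).size() + " vs " + page(rescored, 2, 5).size());
    }
}
```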

Closes #11127

Closes #11342
2015-05-26 04:43:18 -04:00
Michael McCandless
8958096754 don't truncate TopDocs after rescoring
2015-05-26 04:06:21 -04:00
Britta Weber
e97353e84a [TEST] don't check shard operations counter in ExceptionRetryTests
2015-05-26 09:00:16 +02:00