Commit Graph

20934 Commits

Author SHA1 Message Date
David Pilato e907b7c11e Check that S3 setting `buffer_size` is always lower than `chunk_size`
We can be better at checking `buffer_size` and `chunk_size` for S3 repositories.
For example, we know that:

* `buffer_size` should be more than `5mb`
* `chunk_size` should be no more than `5tb`
* `buffer_size` should be lower than `chunk_size`

Otherwise, setting `buffer_size` is useless.

For the record:

`chunk_size` is a Snapshot setting, regardless of the implementation.
`buffer_size` is an S3 implementation setting.

Let's say that you are snapshotting a 500mb file. If you set `chunk_size` to `200mb`, then the Snapshot service will call the S3 repository to snapshot 3 files with the following sizes:

* `200mb`
* `200mb`
* `100mb`

If you set `buffer_size` to `100mb` (AWS maximum size recommendation), the first file of `200mb` will be uploaded to S3 using the multipart feature in 2 parts, and the workflow is basically the following:

* create the multipart request and get back an `id` from the AWS S3 platform
* upload part1: `100mb`
* upload part2: `100mb`
* "commit" the full upload using the `id`.
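
The arithmetic above can be sketched as follows. This is only an illustration, not the repository code: `split_sizes` and `upload_plan` are hypothetical helper names, and sizes are plain integers in megabytes.

```python
def split_sizes(total_mb, piece_mb):
    """Return the sizes of consecutive pieces of at most piece_mb covering total_mb."""
    if piece_mb <= 0:
        raise ValueError("piece size must be positive")
    full, rest = divmod(total_mb, piece_mb)
    return [piece_mb] * full + ([rest] if rest else [])


def upload_plan(file_mb, chunk_size_mb, buffer_size_mb):
    """Pair each snapshot chunk with the part sizes implied by buffer_size."""
    return [
        (chunk, split_sizes(chunk, buffer_size_mb))
        for chunk in split_sizes(file_mb, chunk_size_mb)
    ]


# The 500mb example from the message: chunk_size=200mb, buffer_size=100mb.
plan = upload_plan(500, 200, 100)
for chunk, parts in plan:
    print(chunk, parts)
```

With a `buffer_size` larger than `chunk_size`, every chunk would map to a single part, which is why the setting would be useless in that case.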

Closes #17244.
2016-03-23 10:39:54 +01:00
Adrien Grand a3bb409f03 Upgrade string fields to text/keyword also if `ignore_above` is set. #17273
Since this parameter is used in the logstash default template, it would be nice
to handle it.
2016-03-23 10:31:10 +01:00
Colin Goodheart-Smithe b8ac05149d Merge pull request #17264 from pjo256/master
Setting 'other' bucket on empty aggregation
2016-03-23 09:19:42 +00:00
Simon Willnauer 2f1af552a9 Bring back operation rollback on unexpected mapping change during recovery
We lost some accounting code in the translog recovery code during refactoring,
which triggers a very rare assertion. If we fail on a recovery target with an
illegal mapping update (which can happen if the clusterstate is behind), then
we fail to roll back the number of processed ops in that batch, and once we
resume the batch we trip an assertion that the stats are off.
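
A minimal sketch of this kind of rollback accounting, assuming a simple counter. `RecoveryStats` and `apply_batch` are hypothetical names and do not mirror the actual translog recovery classes.

```python
class RecoveryStats:
    """Hypothetical stats holder counting recovered translog operations."""

    def __init__(self):
        self.recovered_ops = 0


def apply_batch(ops, stats, apply_op):
    """Apply a batch; on failure, undo the count for ops processed in this batch."""
    processed = 0
    try:
        for op in ops:
            apply_op(op)
            stats.recovered_ops += 1
            processed += 1
    except Exception:
        # Roll back so that a retry of the same batch does not double count.
        stats.recovered_ops -= processed
        raise


stats = RecoveryStats()
mapping_ready = False


def apply_op(op):
    # Simulate an illegal mapping update on the third op until the mapping arrives.
    if op == 3 and not mapping_ready:
        raise RuntimeError("illegal mapping update")


try:
    apply_batch([1, 2, 3, 4], stats, apply_op)
except RuntimeError:
    pass
ops_after_failure = stats.recovered_ops

mapping_ready = True
apply_batch([1, 2, 3, 4], stats, apply_op)
ops_after_retry = stats.recovered_ops
```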

This commit brings back the code lost in 8bc2332d9a
and improves the comment that explains why we need this rollback logic.
2016-03-23 10:15:53 +01:00
pengqiuyuan 80ef18c3b2 Update template-query.asciidoc 2016-03-23 17:14:31 +08:00
Adrien Grand 252ae5f15a Upgrade dynamic templates that use a dynamic type. #17254
Now that string has been split into text and keyword, we use text as a
dynamic type when encountering string fields in a json document. However
this does not play well with existing templates that look like

```
{
  "mapping": {
    "index": "not_analyzed",
    "type": "{dynamic_type}"
  },
  "match": "*"
}
```

Since we want existing templates to keep working as much as possible in 5.0,
this commit adds a hack to dynamic templates so that elasticsearch will create
a keyword field if the `index` property is set and is either `no` or
`not_analyzed`, similarly to what was done in #16991.
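
The rule described above could look roughly like this. It is a sketch under stated assumptions: `resolve_dynamic_type` is a hypothetical helper, and only the `type` placeholder is resolved; the real upgrade logic lives in the mapper code and may differ.

```python
def resolve_dynamic_type(mapping, detected_type):
    """Resolve "{dynamic_type}" in a dynamic template mapping (hypothetical helper).

    detected_type is the type detected from the JSON value, e.g. "string" or "long".
    """
    resolved = dict(mapping)
    if resolved.get("type") == "{dynamic_type}":
        # String fields are now mapped as text by default.
        dynamic_type = "text" if detected_type == "string" else detected_type
        # The hack: a legacy `index` of `no` or `not_analyzed` means keyword.
        if dynamic_type == "text" and resolved.get("index") in ("no", "not_analyzed"):
            dynamic_type = "keyword"
        resolved["type"] = dynamic_type
    return resolved


legacy = {"index": "not_analyzed", "type": "{dynamic_type}"}
print(resolve_dynamic_type(legacy, "string")["type"])
```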

While this will make upgrades easier, we still need to figure out a way to
allow users to create keyword fields when using dynamic types.
2016-03-23 09:54:19 +01:00
Adrien Grand e50eeeaffb Refactor fielddata mappings. #17148
The fielddata settings in mappings have been refactored so that:
 - text and string have a `fielddata` (boolean) setting that tells whether it
   is ok to load in-memory fielddata. It is true by default for now but the
   plan is to make it default to false for text fields.
 - text and string have a `fielddata_frequency_filter` which contains the same
   thing as `fielddata.filter.frequency` used to (but validated at parsing time
   instead of being unchecked settings)
 - regex fielddata filtering is not supported anymore and will be dropped from
   mappings automatically on upgrade.
 - text, string and _parent fields have an `eager_global_ordinals` (boolean)
   setting that tells whether to load global ordinals eagerly on refresh.
 - in-memory fielddata is not supported on keyword fields anymore at all.
 - the `fielddata` setting is not supported on fields other than text and string
   and will be dropped when upgrading if specified.
2016-03-23 09:48:13 +01:00
Adrien Grand 435558a5c0 Also map floating-point numbers as floats when numeric detection is on. #17104
I overlooked it in #15319 since numeric detection triggers a totally different
path in the code of dynamic mappings.
2016-03-23 08:20:22 +01:00
Philip Ottesen 1dff3a8210 Setting 'other' bucket on empty aggregation 2016-03-22 20:23:35 -04:00
Tal Levy 534caa8927 Handle regex parsing errors in Gsub and Grok Processors
Currently, both Gsub and Grok parse regex strings during
Pipeline creation. Thrown parsing exceptions were leaking out; this
commit wraps those exceptions in ElasticsearchParseExceptions.
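
The wrapping pattern, sketched in Python rather than the Java used by the processors. `PipelineParseError` is a hypothetical stand-in for ElasticsearchParseException, and `compile_processor_pattern` is an illustrative name.

```python
import re


class PipelineParseError(ValueError):
    """Hypothetical stand-in for ElasticsearchParseException."""


def compile_processor_pattern(processor_type, pattern):
    """Compile a regex at pipeline creation, wrapping low-level parse errors."""
    try:
        return re.compile(pattern)
    except re.error as exc:
        # Re-raise as a pipeline-level error, keeping the original as the cause.
        raise PipelineParseError(
            f"[{processor_type}] invalid regex [{pattern}]: {exc}"
        ) from exc


try:
    compile_processor_pattern("gsub", "(")
except PipelineParseError as err:
    print(err)
```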
2016-03-22 15:06:29 -07:00
Jason Tedor d5e408b273 Mock rlimit infinity in virtual memory size test
This commit mocks the value of rlimit infinity in the max size virtual
memory check test. This is to avoid attempting to load the native C
library during the test on Windows which would lead to a permissions
violation (the native C library needs to be loaded before the security
manager is set up).
2016-03-22 17:03:46 -04:00
Honza Král f8e84f0bbb [TEST] fix incorrect indent in ingest/70_bulk.yaml 2016-03-22 20:53:23 +01:00
Honza Král ca4b8667bb [TEST] Move yaml test requiring header, add skip:headers 2016-03-22 20:53:23 +01:00
Simon Willison fdac0c7c6c Link to named queries docs from bool query page
The named queries feature only makes sense with bool queries, but was not cross-referenced from the bool query documentation page.
2016-03-22 12:07:57 -07:00
Areek Zillur 866a350599 Merge pull request #17232 from areek/cleanup/handling_index_state
Cleanup writing upgraded index state
2016-03-22 14:57:49 -04:00
Adrien Grand d514977c75 Make dynamic template parsing less lenient. #17249
Today unknown parameters are ignored yet carried through serialization.
2016-03-22 18:52:25 +01:00
Boaz Leskes 20644666e5 RecoveryWhileUnderLoadIT: output specific missing doc ids and their shard routing on failure
Also increase logging levels to see when a doc was indexed
2016-03-22 18:29:09 +01:00
Christoph Büscher 64d362ab9d Add parsing of list of sort builders to SortBuilder
Moving the current parsing code for the whole "sort" element
in the search source over to a static "fromXContent" method in
SortBuilder.
2016-03-22 18:07:08 +01:00
Simon Willnauer 3ed4ff054f Merge pull request #17246 from s1monw/archive_persistent_settings
Archive cluster level settings if unknown or broken

We already archive index level settings if we find an unknown or invalid/broken
value for a setting on node startup. The same could potentially happen for persistent
cluster level settings if we remove a setting or if we add validation to a setting that
didn't exist in the past. To ensure that only valid settings are recovered into the cluster
state we archive them (prefix them with `archive.`) and log a warning. Tools that check the
cluster settings can then warn users that they have broken settings in their clusterstate that
got archived.
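
A rough sketch of the archiving idea, assuming settings are a flat dictionary and that each known setting has a validation predicate. `archive_broken_settings` and the `validators` mapping are hypothetical and not the actual settings API.

```python
ARCHIVE_PREFIX = "archive."


def archive_broken_settings(settings, validators):
    """Return (recovered settings, warnings); unknown or invalid keys get archived.

    validators maps each known setting name to a predicate on its value
    (an assumption of this sketch).
    """
    recovered, warnings = {}, []
    for key, value in settings.items():
        validator = validators.get(key)
        if key.startswith(ARCHIVE_PREFIX):
            recovered[key] = value  # already archived earlier, keep as-is
        elif validator is None:
            recovered[ARCHIVE_PREFIX + key] = value
            warnings.append(f"archived unknown setting [{key}]")
        elif not validator(value):
            recovered[ARCHIVE_PREFIX + key] = value
            warnings.append(f"archived invalid setting [{key}] with value [{value}]")
        else:
            recovered[key] = value
    return recovered, warnings


# Hypothetical setting names, for illustration only.
validators = {"cluster.example.enabled": lambda v: v in ("true", "false")}
persistent = {"cluster.example.enabled": "maybe", "cluster.removed.setting": "1"}
recovered, warnings = archive_broken_settings(persistent, validators)
```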
2016-03-22 17:35:08 +01:00
Nik Everett da96b6e41d [reindex] Add throttling support
The throttle is applied when starting the next scroll request so that its
timeout can include the throttle time.
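
One way to picture this, as a sketch only: the delay formula, the parameter names and the helper functions below are assumptions, not the reindex implementation.

```python
def throttle_delay_seconds(batch_size, requests_per_second):
    """Seconds to wait before the next batch; a rate <= 0 means no throttle (assumed)."""
    if requests_per_second <= 0:
        return 0.0
    return batch_size / requests_per_second


def next_scroll_keep_alive(base_keep_alive_seconds, batch_size, requests_per_second):
    """Keep-alive for the next scroll request, extended by the throttle delay."""
    delay = throttle_delay_seconds(batch_size, requests_per_second)
    return base_keep_alive_seconds + delay


# A 1000-document batch limited to 500 documents per second waits 2 seconds,
# so the scroll context is asked to stay alive 2 seconds longer.
print(next_scroll_keep_alive(300, 1000, 500))
```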
2016-03-22 12:34:14 -04:00
Simon Willnauer c0ef3189b7 add javadocs for isPrivate() 2016-03-22 17:33:51 +01:00
Colin Goodheart-Smithe ea93b803d2 Rewrite to unbounded range query if relation to query is WITHIN 2016-03-22 16:14:47 +00:00
Colin Goodheart-Smithe d6fe7515fd Merge pull request #17243 from colings86/docs/searchRequestBreakingChanges
added breaking changes for the Java API to the breaking changes doc for 5.0
2016-03-22 15:58:40 +00:00
Colin Goodheart-Smithe 25c4446942 iter 2016-03-22 15:58:12 +00:00
Jason Tedor 8004c51c17 Add max size virtual memory check
This commit adds a bootstrap check on Linux and OS X for the max size of
virtual memory (address space) to the user running the Elasticsearch
process.
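
A sketch of such a check using Python's Unix-only `resource` module. Comparing the soft `RLIMIT_AS` limit against infinity is an assumption about what the check inspects. The `limit` and `infinity` parameters exist only so a test can inject values instead of touching the native limits.

```python
import resource


def max_size_virtual_memory_ok(limit=None, infinity=resource.RLIM_INFINITY):
    """Return True if the address-space limit for this process is unlimited."""
    if limit is None:
        # Soft limit of the maximum address space; the hard limit is ignored here.
        limit, _hard = resource.getrlimit(resource.RLIMIT_AS)
    return limit == infinity


if not max_size_virtual_memory_ok():
    print("max size virtual memory is too low, increase to unlimited")
```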

Closes #16935
2016-03-22 11:52:36 -04:00
Colin Goodheart-Smithe ee7e84acc3 review comments 2016-03-22 15:34:47 +00:00
Adrien Grand c52b1f3a7c An `exists` query on an object should query a single term.
Currently if you run an `exists` query on an object, it will resolve all sub
fields and create a disjunction for all those fields. However the `_field_names`
mapper indexes paths for objects so we could query object paths directly.

I also changed the query parser to reject `exists` queries if the `_field_names`
field is disabled since it would be a big performance trap.
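
The difference can be sketched with plain data structures. Both functions are hypothetical, and each query clause is modelled as a `(field, term)` tuple rather than a real Lucene query.

```python
def exists_clauses_before(path, leaf_fields):
    """Described old behaviour: one clause per sub field of the object."""
    prefix = path + "."
    return [("_field_names", name) for name in leaf_fields if name.startswith(prefix)]


def exists_clauses_after(path, field_names_enabled=True):
    """Described new behaviour: a single term on the object path itself."""
    if not field_names_enabled:
        raise ValueError("exists query requires the _field_names field to be enabled")
    return [("_field_names", path)]


leaf_fields = ["user.name.first", "user.name.last", "user.age", "title"]
print(len(exists_clauses_before("user", leaf_fields)))
print(exists_clauses_after("user"))
```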
2016-03-22 16:26:45 +01:00
Adrien Grand b42f66c8ac Document 5.0 mapping changes. 2016-03-22 16:22:58 +01:00
Simon Willnauer 68d07fc01f Archive cluster level settings if unknown or broken
We already archive index level settings if we find an unknown or invalid/broken
value for a setting on node startup. The same could potentially happen for persistent
cluster level settings if we remove a setting or if we add validation to a setting that
didn't exist in the past. To ensure that only valid settings are recovered into the cluster
state we archive them (prefix them with `archive.`) and log a warning. Tools that check the
cluster settings can then warn users that they have broken settings in their clusterstate that
got archived.
2016-03-22 16:17:06 +01:00
Colin Goodheart-Smithe b8a96d9a65 added breaking changes for the Java API to the breaking changes doc for 5.0 2016-03-22 14:39:16 +00:00
Luca Cavanna 3764b3ff80 Merge pull request #17145 from alexshadow007/fix-17101
Fix column aliases in _cat/indices, _cat/nodes and _cat/shards APIs
2016-03-22 15:37:21 +01:00
Jason Tedor 5dc48e71d0 Use mock filesystem during install plugins tests
This commit sets up the default filesystem used during install plugins
tests. A hack is needed to handle the temporary directory because the
system property "java.io.tmpdir" will have been initialized to a value
that is sensible for the default filesystem, but not necessarily to a
value that makes sense for the mock filesystem in use during the
tests. This property is restored after each test.
2016-03-22 10:25:27 -04:00
Boaz Leskes 533c967a2d Revert "Removed index level metadata election #17233"
This reverts commit 1264ee79b6.
2016-03-22 14:35:42 +01:00
Christoph Büscher 14f45c1784 Merge pull request #17146 from cbuescher/sort-add-build
For the refactoring of SortBuilders related to #10217, each SortBuilder needs to get a build()
method that produces a SortField according to the SortBuilder parameters on the shard.
2016-03-22 13:46:50 +01:00
Simon Willnauer 1988b8b387 [TEST] Reuse EsTestCase#createAnalysisService in KuromojiAnalysisTests 2016-03-22 13:45:20 +01:00
Simon Willnauer 75d5b83367 Improve error message if resource files have illegal encoding
This commit fixes string formatting issues in the error handling and
provides a better error message if malformed input is detected.
This commit also adds tests for both situations.

Relates to #17212
2016-03-22 13:29:07 +01:00
Christoph Büscher 697174dcb0 Make sure to use nestedScope levels when building nested filters 2016-03-22 13:28:40 +01:00
Christoph Büscher 25da6b2f2e Merge branch 'master' into sort-add-build 2016-03-22 12:20:56 +01:00
Christoph Büscher ff021c60d9 Merge pull request #17238 from cbuescher/simplify-nestedInnerQueryParseSupport
Remove unused methods and fields in NestedInnerQueryParseSupport
2016-03-22 12:16:44 +01:00
Jun Ohtani a9a0f262af Analysis Kuromoji: Add nbest option and NumberFilter
Add nbest_cost and nbest_examples parameter to KuromojiTokenizerFactory
Add KuromojiNumberFilterFactory
2016-03-22 20:09:56 +09:00
Boaz Leskes b07a8185a7 Wait for metadata to stabilize before checking for it after opening indices in testMetaWrittenWhenIndexIsClosedAndMetaUpdated 2016-03-22 11:36:42 +01:00
Christoph Büscher 20417262e2 Remove unused methods and fields in NestedInnerQueryParseSupport 2016-03-22 11:32:24 +01:00
Simon Willnauer 33521fc27c Detach IndexShard from node services
This is the last step to remove node level services from IndexShard.
This means that tests can now more easily create an IndexShard instance
without starting a node and removes the dependency between IndexShard and Client/ScriptService
2016-03-22 11:02:04 +01:00
javanna eebd0cfccd Merge branch 'master' into enhancement/remove_node_client_setting 2016-03-22 10:34:40 +01:00
Martijn van Groningen 8f22a01bbd ingest: Give the `foreach` processor access to the rest of the document.
Closes #17147
2016-03-22 10:32:13 +01:00
Boaz Leskes 1264ee79b6 Removed index level metadata election #17233
When a master is elected, it reaches out to all master nodes for their cluster state, selecting the one with the highest version. At the moment, we do another round to select the index metadata with the highest version as well. This is not needed - the election of a cluster state is enough - we should just use whatever indices are in it.
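
In sketch form, the simplified election amounts to the following. Cluster states are modelled as plain dictionaries, which is an assumption of this illustration.

```python
def elect_cluster_state(states):
    """Pick the cluster state with the highest version and use its indices as-is."""
    if not states:
        raise ValueError("no cluster states received")
    return max(states, key=lambda state: state["version"])


states = [
    {"version": 7, "indices": {"logs": {"version": 3}}},
    {"version": 9, "indices": {"logs": {"version": 2}}},
]
elected = elect_cluster_state(states)
# No second round: the index metadata comes from the elected state,
# even if another node reported a higher index metadata version.
print(elected["indices"])
```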

Closes #17233
2016-03-22 10:28:01 +01:00
Simon Willnauer 47f0e6e8f4 [TEST] Disable InternalClusterInfoService in messy tests, it sends IndicesStatsRequest periodically which messes with the messy test 2016-03-22 10:12:03 +01:00
Areek Zillur ec5419048e cleanup writing upgraded index state
In #17187, we upgrade index state after upgrading
index folder structure. As we don't have to write
the upgraded state in the old index folder structure,
we can cleanup how we write upgraded index state.
2016-03-21 18:59:37 -04:00
Simon Willnauer a0c68c281c Improve error message if setting is not found
We can do better than just throwing an error when we don't find a
setting. It's actually trivial to leverage Lucene's slow LD StringDistance
to find possible candidates for a setting to detect misspellings and suggest
a possible setting.
This commit adds error messages like:

 * `unknown setting [index.numbe_of_replica] did you mean [index.number_of_replicas]?`

rather than just reporting the setting as unknown
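
A self-contained sketch of the suggestion logic. It uses a plain Levenshtein distance written out by hand, not Lucene's StringDistance, and the `0.7` similarity threshold and function names are assumptions of this example.

```python
def levenshtein(a, b):
    """Classic dynamic-programming edit distance between two strings."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                 # deletion
                           cur[j - 1] + 1,              # insertion
                           prev[j - 1] + (ca != cb)))   # substitution
        prev = cur
    return prev[-1]


def unknown_setting_message(key, known_settings, min_similarity=0.7):
    """Build an error message, suggesting the closest known setting if any."""
    def similarity(other):
        longest = max(len(key), len(other)) or 1
        return 1.0 - levenshtein(key, other) / longest

    scored = sorted(((similarity(k), k) for k in known_settings), reverse=True)
    message = f"unknown setting [{key}]"
    if scored and scored[0][0] > min_similarity:
        message += f" did you mean [{scored[0][1]}]?"
    return message


known = ["index.number_of_replicas", "index.number_of_shards", "index.refresh_interval"]
print(unknown_setting_message("index.numbe_of_replica", known))
```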
2016-03-21 23:13:24 +01:00
Simon Willnauer 8127a06b2e Recover broken IndexMetaData as closed
Today if something is wrong with the IndexMetaData we detect it very
late and most of the time if that happens we already allocated the index
and get endless loops and full log files on data-nodes. This change tries
to verify IndexService creattion during initial state recovery on the master
and if the recovery fails the index is imported as `closed` and won't be allocated
at all.
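
The recovery rule can be sketched as below. `recover_indices` and the `verify` callback are hypothetical stand-ins for the IndexService creation check, and index metadata is modelled as dictionaries.

```python
def recover_indices(index_metadata, verify):
    """Import each index; if verification fails, import it as closed instead."""
    recovered = []
    for meta in index_metadata:
        try:
            verify(meta)  # stand-in for trying to create the IndexService
            recovered.append(meta)
        except Exception as exc:
            print(f"importing [{meta['name']}] as closed: {exc}")
            recovered.append({**meta, "state": "closed"})
    return recovered


def verify(meta):
    # Hypothetical check: reject metadata that carries a broken analyzer.
    if meta.get("analyzer") == "broken":
        raise ValueError("unknown analyzer")


indices = [
    {"name": "good", "state": "open"},
    {"name": "bad", "state": "open", "analyzer": "broken"},
]
result = recover_indices(indices, verify)
```

Closed indices are never allocated, so a broken index no longer causes allocation loops on the data nodes.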

Closes #17187
2016-03-21 22:50:58 +01:00