Commit Graph

659 Commits

Author SHA1 Message Date
Lee Hinman a4e8b5d952 Throw an exception on unrecognized "match_mapping_type"
When using dynamic templates, ES will now throw an exception if a
`match_mapping_type` is used that doesn't correspond to an actual type.

Relates to #17285
2016-12-12 09:59:48 -07:00
Luca Cavanna 6d987a9b69 Remove support for empty queries (#22092)
Our query DSL supports empty queries (`{}`), which have a different meaning depending on the query that holds it, either ignored, match_all or match_none. We deprecated the support for empty queries in 5.0, where we log a deprecation warning wherever they are used.

The way we supported it once we moved query parsing to the coordinating node was having an Optional<QueryBuilder> return type in all of our parse methods (called fromXContent). See #17624. The central place for this was QueryParseContext#parseInnerQueryBuilder. We can now remove all the optional return types and simply throw an exception whenever an empty query is found.
2016-12-12 12:37:12 +01:00
Luca Cavanna 103984a4a1 Remove indices query (#21837)
The indices query is deprecated since 5.0.0 (#17710). It can now be removed in master (future 6.0 version).
2016-11-30 19:37:01 +01:00
Adrien Grand c5b9c98b99 Remove the `default` store type. (#21616)
It used to be a hybrid store between `niofs` and `mmapfs`, which we removed when
we switched to `fs` by default (which is `mmapfs` on 64-bits systems).
2016-11-30 15:33:26 +01:00
Adrien Grand 90ab477f19 The `terms` query should always map to a Lucene `TermsQuery`. (#21786)
Currently, the `terms` query is just syctactic sugar for a `bool` query when
used in a query context. This change proposes to always generate the same query
in query and filter contexts, which is less confusing.
2016-11-30 15:29:09 +01:00
Adrien Grand 6231009a8f Remove 2.x backward compatibility of mappings. (#21670)
For the record, I also had to remove the geo-hash cell and geo-distance range
queries to make the code compile. These queries already throw an exception in
all cases with 5.x indices, so that does not hurt any more.

I also had to rename all 2.x bwc indices from `index-${version}` to
`unsupported-${version}` to make `OldIndexBackwardCompatibilityIT`
happy.
2016-11-30 13:34:46 +01:00
javanna bbebc644f9 [DOCS] document breaking changes added with #21852 2016-11-29 20:05:30 +01:00
Ryan Ernst 6940b2b8c7 Remove groovy scripting language (#21607)
* Scripting: Remove groovy scripting language

Groovy was deprecated in 5.0. This change removes it, along with the
legacy default language infrastructure in scripting.
2016-11-22 19:24:12 -08:00
Luca Cavanna db8b2dceea Remove ignored type parameter in search_shards api (#21688)
The `type` parameter has always been accepted by the search_shards api, probably to make the api and its urls the same as search. Truth is that the type never had any effect, it's been ignored from day one while accepting it may make users think that we actually do something with it.

This commit removes support for the type parameter from the REST layer and the Java API. Backwards compatibility is maintained on the transport layer though.

The new added serialization test also uncovered a bug in the java API where the `ClusterSearchShardsRequest` could be created with no arguments, but the indices were required to be not null otherwise the request couldn't be serialized as `writeTo` would throw NPE. Fixed by setting a default value (empty array) for indices.
2016-11-22 17:22:33 +01:00
Adrien Grand c7fc688096 Add information about the removal of store throttling to the migration guide.
Relates to #21573
2016-11-21 15:03:07 +01:00
Lee Hinman 96122aa518 Be strict when parsing values searching for booleans (#21555)
This changes only the query parsing behavior to be strict when searching on
boolean values. We continue to accept the variety of values during index time,
but searches will only be parsed using `"true"` or `"false"`.

Resolves #21545
2016-11-15 10:36:57 -07:00
Nik Everett 7c3886769f Breaking changes docs for template index_patterns
0219a211d3 added support for templates
to have multiple patterns and renamed `template` to `index_patterns`.
This adds the breaking changes docs for that.
2016-11-11 10:13:19 -05:00
Jason Tedor b743ab0b07 Remove 5.x references from cat API migration doc
This commit removes some references to 5.x that were picked up when the
migration docs for the cat API were migrated from 5.x to master.

Relates #21342
2016-11-09 07:09:59 -05:00
Jason Tedor 168c54fa6e Add migration docs for cat API
This commit adds migration docs for the cat API, including a note
regarding the change in response in the cat thread pool API for
unbounded queue sizes.

Relates #21342
2016-11-09 07:07:52 -05:00
Luca Cavanna c2160a88b5 Remove support for controversial ignore_unavailable and allow_no_indices from indices exists api (#20712)
Exist requests are supposed to never throw an exception, but rather return true or false depending on whether some resource exists or not. Indices exists does that for indices and accepts wildcard expressions too. The way the api works internally is by resolving indices and catching IndexNotFoundException: if an exception is thrown the index does not exist hence it returns false, otherwise it returns true. That works ok only if ignore_unavailable and allow_no_indices indices options are both set to false, meaning that they are strict and any missing index or wildcard expressions that resolves to no indices will lead to an exception that can be thrown and cause false to be returned.

Unfortunately the indices options have  been configurable up until now for this request, meaning that one can set ignore_unavailable or allow_no_indices to true and have the indices exist request return true for indices that really don't exist, which makes very little sense in the context of this api.

This commit removes the indicesOptions setter from the IndicesExistsRequest and makes settable only expandWildcardsOpen and expandWildcardsClosed, hence a subset of the available indices options. This way we can guarantee more consistent behaviour of the indices exists api. We can then remove the ignore_unavailable and allow_no_indices option from indices exists api spec
2016-11-04 19:26:37 +01:00
Jun Ohtani a66c76eb44 Merge pull request #20704 from johtani/remove_request_params_in_analyze_api
Removing request parameters in _analyze API
2016-10-27 17:43:18 +09:00
Igor Motov a541f0187d Docs: add documentation about removal of cluster.routing.allocation.snapshot.relocation_enabled 2016-10-20 14:19:12 -10:00
Pascal Borreli fcb01deb34 Fixed typos (#20843) 2016-10-10 14:51:47 -06:00
Jun Ohtani 945fa499d2 Deprecating request parameters in _analyze API
Remove params in indices.analyze.json
Fix REST changes

Closes #20246
2016-10-07 16:23:24 +09:00
Jun Ohtani 99236b7627 Removing request parameters in _analyze API
Fix small English issue in breaking changes

Closes #20246
2016-10-07 16:23:24 +09:00
Jun Ohtani eca9894c5f Removing request parameters in _analyze API
Remove unused imports
Replace POST method by GET method in docs
 Add breaking changes explanation
 Fix small issue in Kuromoji docs

Closes #20246
2016-10-07 16:23:24 +09:00
Lee Hinman 85402d5220 [DOCS] Remove non-valid link to mapping migration document 2016-09-28 09:09:19 -06:00
Lee Hinman 3f77eacab1 Revert "Default `include_in_all` for numeric-like types to false"
This reverts commit 6666892038.
2016-09-28 07:07:46 -06:00
David Pilato dfd1eebdd0 Remove mapper attachments plugin
We now have in 5.0.0 `ingest-attachment` plugin.
We can remove `mapper-attachments` plugin for 6.0.

Closes #18837.
2016-09-19 09:01:16 +02:00
Lee Hinman 94625d74e4 No longer allow cluster name in data path
In 5.x we allowed this with a deprecation warning. This removes the code
added for that deprecation, requiring the cluster name to not be in the
data path.

Resolves #20391
2016-09-12 15:47:01 -06:00
Lee Hinman 49695af2ac Remove FORCE version_type
This was an error-prone version type that allowed overriding previous
version semantics. It could cause primaries and replicas to be out of
sync however, so it has been removed.

Resolves #19769
2016-09-08 15:29:47 -06:00
Jason Tedor 27ff4f327c Remove allow unquoted JSON
Previous versions of Elasticsearch permitted unquoted JSON field names even though this is against the JSON spec. This leniency was disabled by default in the 5.x series of Elasticsearch but a backwards compatibility layer was added via a system property with the intention of removing this layer in 6.0.0. This commit removes this backwards compatibility layer.

Relates #20388
2016-09-08 13:36:31 -04:00
Clinton Gormley 891a15db22 Add breaking changes for 6.0 2016-09-08 18:27:01 +02:00
Lee Hinman 6666892038 Default `include_in_all` for numeric-like types to false
This includes:

- All regular numeric types such as int, long, scaled-float, double, etc
- IP addresses
- Dates
- Geopoints and Geoshapes

Relates to #19784
2016-09-08 09:09:48 -06:00
Jason Tedor 8e7dfae7d1 Remove collect payloads parameter
The collect_payloads parameter of the span_near query was previously
deprecated with the intention to be removed. This commit removes this
parameter.

Relates #20385
2016-09-08 09:37:36 -04:00
Lee Hinman b4cc3cd35d Remove FORCE version_type
This was an error-prone version type that allowed overriding previous
version semantics. It could cause primaries and replicas to be out of
sync however, so it has been removed.

Resolves #19769
2016-09-07 13:05:18 -06:00
Martijn van Groningen 245882cde3 * Removed `script.default_lang` setting and made `painless` the hardcoded default script language.
** The default script language is now maintained in `Script` class.
* Added `script.legacy.default_lang` setting that controls the default language for scripts that are stored inside documents (for example percolator queries).  This defaults to groovy.
** Added `QueryParseContext#getDefaultScriptLanguage()` that manages the default scripting language. Returns always `painless`, unless loading query/search request in legacy mode then the returns what is configured in `script.legacy.default_lang` setting.
** In the aggregation parsing code added `ParserContext` that also holds the default scripting language like `QueryParseContext`. Most parser don't have access to `QueryParseContext`. This is for scripts in aggregations.
* The `lang` script field is always serialized (toXContent).

Closes #20122
2016-09-06 18:44:48 +02:00
Jun Ohtani c4759bcc02 Merge pull request #20285 from johtani/fix/remove_token_filter_param_in_analyze_api
Remove `token_filter` in _analyze API
2016-09-03 02:03:51 +09:00
Areek Zillur c92f82e624 Merge pull request #20169 from areek/doc/fix_completion_breaking_changes
Update breaking changes for completion suggester
2016-09-02 12:39:16 -04:00
Areek Zillur af215b528f move completion performance tips from migration docs to completion docs 2016-09-02 12:37:56 -04:00
Jun Ohtani aef2e5d90e Remove `token_filter` in _analyze API
Fix wording in docs
Refactoring RestAnalyzeActionTests using expectThrows()

Closes #20283
2016-09-02 15:08:28 +09:00
Areek Zillur c869ca18eb Update breaking changes for completion suggester
To indicate we removed completion payload option
in favour of returning document source with
completion suggestions
2016-09-01 21:48:30 -04:00
Jun Ohtani 3d9f8ed764 Remove `token_filter` in _analyze API
Remove the param and change docs

Closes #20283
2016-09-02 01:36:45 +09:00
javanna 5f299ff46f add mem section back to cluster stats
The mem section was buggy in cluster stats and removed. It is now added back with the same structure as in node stats, containing total memory, available memory, used memory and percentages. All the values are the sum of all the nodes across the cluster (or at least the ones that we were able to get the values from).
2016-09-01 11:26:03 +02:00
Simon Willnauer a0becd26b1 Optimize indexing for the autogenerated ID append-only case (#20211)
If elasticsearch controls the ID values as well as the documents
version we can optimize the code that adds / appends the documents
to the index. Essentially we an skip the version lookup for all
documents unless the same document is delivered more than once.

On the lucene level we can simply call IndexWriter#addDocument instead
of #updateDocument but on the Engine level we need to ensure that we deoptimize
the case once we see the same document more than once.

This is done as follows:

1. Mark every request with a timestamp. This is done once on the first node that
receives a request and is fixed for this request. This can be even the
machine local time (see why later). The important part is that retry
requests will have the same value as the original one.

2. In the engine we make sure we keep the highest seen time stamp of "retry" requests.
This is updated while the retry request has its doc id lock. Call this `maxUnsafeAutoIdTimestamp`

3. When the engine runs an "optimized" request comes, it compares it's timestamp with the
current `maxUnsafeAutoIdTimestamp` (but doesn't update it). If the the request
timestamp is higher it is safe to execute it as optimized (no retry request with the same
timestamp has been run before). If not we fall back to "non-optimzed" mode and run the request as a retry one
and update the `maxUnsafeAutoIdTimestamp` unless it's been updated already to a higher value

Relates to #19813
2016-09-01 10:39:40 +02:00
Jason Tedor 76ab02e002 Merge branch 'master' into log4j2
* master:
  Avoid NPE in LoggingListener
  Randomly use Netty 3 plugin in some tests
  Skip smoke test client on JDK 9
  Revert "Don't allow XContentBuilder#writeValue(TimeValue)"
  [docs] Remove coming in 2.0.0
  Don't allow XContentBuilder#writeValue(TimeValue)
  [doc] Remove leftover from CONSOLE conversion
  Parameter improvements to Cluster Health API wait for shards (#20223)
  Add 2.4.0 to packaging tests list
  Docs: clarify scale is applied at origin+offest (#20242)
2016-08-31 16:37:55 -04:00
Jason Tedor 750033dc4b Update docs for Log4j 2
This commit updates the logging docs for Elasticsearch to reflect the
migration to Log4j 2.
2016-08-31 15:51:52 -04:00
Ali Beyad 4641254ea6 Parameter improvements to Cluster Health API wait for shards (#20223)
* Params improvements to Cluster Health API wait for shards

Previously, the cluster health API used a strictly numeric value
for `wait_for_active_shards`. However, with the introduction of
ActiveShardCount and the removal of write consistency level for
replication operations, `wait_for_active_shards` is used for
write operations to represent values for ActiveShardCount. This
commit moves the cluster health API's usage of `wait_for_active_shards`
to be consistent with its usage in the write operation APIs.

This commit also changes `wait_for_relocating_shards` from a
numeric value to a simple boolean value `wait_for_no_relocating_shards`
to set whether the cluster health operation should wait for
all relocating shards to complete relocation.

* Addresses code review comments

* Don't be lenient if `wait_for_relocating_shards` is set
2016-08-31 11:58:19 -04:00
Nik Everett df73292256 Add an alias action to delete an index
While removing an index isn't actually an alias action, if we add
an alias action that deletes an index then we can delete and index
and add an alias with the same name as the index atomically, in
the same cluster state update.

Closes #20064
2016-08-30 10:15:21 -04:00
Jun Ohtani 450f47d5b5 Validate blank field name
add validation and validate only 5.0+
Add tests before 5.0

Closes #19251
2016-08-26 20:10:33 +09:00
Simon Willnauer c499427166 Use _refresh instead of reading from Translog in the RT GET case (#20102)
Today we do a lot of accounting inside the engine to maintain locations
of documents inside the transaction log. This is only needed to ensure
we can return the documents source from the engine if it hasn't been refreshed.
Aside of the added complexity to be able to read from the currently writing translog,
maintainance of pointers into the translog this also caused inconsistencies like different values
of the `_ttl` field if it was read from the tlog or not. TermVectors are totally different if
the document is fetched from the tranlog since copy fields are ignored etc.

This chance will simply call `refresh` if the documents latest version is not in the index. This
streamlines the semantics of the `_get` API and allows for more optimizations inside the engine
and on the transaction log. Note: `_refresh` is only called iff the requested document is not refreshed
yet but has recently been updated or added.

#Relates to #19787
2016-08-24 15:30:08 +02:00
Lee Hinman 3298a4ed38 Revert "Merge remote-tracking branch 'dakrone/exclude-numerics-from-all'"
This reverts commit 514585290c, reversing
changes made to 8563c8d897.
2016-08-23 09:24:33 -06:00
Jack Conradson 131e370a16 Make Painless the default scripting language.
Closes #20017
2016-08-22 17:38:02 -07:00
Lee Hinman d7e516c0b4 Default `include_in_all` for numeric-like types to false
This includes:

- All regular numeric types such as int, long, scaled-float, double, etc
- IP addresses
- Dates
- Geopoints and Geoshapes

Relates to #19784
2016-08-19 15:50:38 -06:00
Adrien Grand a4ea7e7223 Switch indices.exists_type from `{index}/{type}` to `{index}/_mapping/{type}`. #20055
This will help remove types as we will need `{index}/{id}` to tell whether a
document exists.

Relates #15613
2016-08-19 09:18:24 +02:00
Lee Hinman 6030acb43b Disallow creating indices starting with '-' or '+'
Previously this was possible, which was problematic when issuing a
request like `DELETE /-myindex`, which was interpretted as "delete
everything except for myindex".

Resolves #19800
2016-08-17 15:13:03 -06:00
Adrien Grand d894db1590 Only use `PUT` for index creation, not POST. #20001
Currently both `PUT` and `POST` can be used to create indices. This commit
removes support for `POST index_name` so that we can use it to index documents
with auto-generated ids once types are removed.

Relates #15613
2016-08-17 10:15:42 +02:00
Nicholas Knize a93af8651c add geo distance script breaking changes to migration docs 2016-08-15 19:12:24 -05:00
Jason Tedor 1f0673c9bd Default max local storage nodes to one
This commit defaults the max local storage nodes to one. The motivation
for this change is that a default value greather than one is dangerous
as users sometimes end up unknowingly starting a second node and start
thinking that they have encountered data loss.

Relates #19964
2016-08-12 09:26:20 -04:00
Nik Everett 9f8f2ea54b Remove ESIntegTestCase#pluginList
It was a useful method in 1.7 when javac's type inference wasn't as
good, but now we can just replace it with `Arrays.asList`.
2016-08-11 15:44:02 -04:00
Jason Tedor e899e8b4e0 Reword expect header bug notice
This commit rewords the expect header bug notice to provide the precise
details for the bug arising. In particular, the bug does not impact any
request over 1024 bytes, but instead impacts any request with a body
that is sent in two requests, the first with an Expect: 100-continue
header. The size is irrelevant, and requests with bodies larger than
1024 bytes are okay as long as the Expect: 100-continue header is not
also sent.

Relates #19911
2016-08-10 10:42:58 -04:00
Clinton Gormley 83532ac377 Documented netty4 Expect bug in release notes and breaking changes 2016-08-10 10:16:25 +02:00
Ali Beyad 4f70ee521f Migration Guide changes for BlobContainer (#19731)
Adds a notice in the migration guide for removing
two deleteBlobs and one writeBlob method from the
BlobContainer interface.
2016-08-03 10:41:25 -04:00
Ali Beyad a21dd80f1b Documentation changes for wait_for_active_shards (#19581)
Documentation changes and migration doc changes for introducing 
wait_for_active_shards and removing write consistency level.

Closes #19581
2016-08-02 09:15:01 -04:00
Ali Beyad a7e68e36cf Fixes header level for migration doc entry 2016-08-01 16:15:12 -04:00
Colin Goodheart-Smithe 3f13f02575 [DOCS] updated documentation for transport client changes
Updated dependency in Java API docs and added section in breaking
changes
2016-07-29 14:25:12 +01:00
Nik Everett f159156931 [docs] Deprecate found and created (#19633)
These parts of delete and index response have been replaced with the
operation field.
2016-07-28 10:20:48 -04:00
Martijn van Groningen 2fdf79d8d4 Deprecate template query.
Closes #19390
2016-07-27 09:50:44 +02:00
Boaz Leskes cd596772ee Persistent Node Names (#19456)
With #19140 we started persisting the node ID across node restarts. Now that we have a "stable" anchor, we can use it to generate a stable default node name and make it easier to track nodes over a restarts. Sadly, this means we will not have those random fun Marvel characters but we feel this is the right tradeoff.

On the implementation side, this requires a bit of juggling because we now need to read the node id from disk before we can log as the node node is part of each log message. The PR move the initialization of NodeEnvironment as high up in the starting sequence as possible, with only one logging message before it to indicate we are initializing. Things look now like this:

```
[2016-07-15 19:38:39,742][INFO ][node                     ] [_unset_] initializing ...
[2016-07-15 19:38:39,826][INFO ][node                     ] [aAmiW40] node name set to [aAmiW40] by default. set the [node.name] settings to change it
[2016-07-15 19:38:39,829][INFO ][env                      ] [aAmiW40] using [1] data paths, mounts [[ /(/dev/disk1)]], net usable_space [5.5gb], net total_space [232.6gb], spins? [unknown], types [hfs]
[2016-07-15 19:38:39,830][INFO ][env                      ] [aAmiW40] heap size [1.9gb], compressed ordinary object pointers [true]
[2016-07-15 19:38:39,837][INFO ][node                     ] [aAmiW40] version[5.0.0-alpha5-SNAPSHOT], pid[46048], build[473d3c0/2016-07-15T17:38:06.771Z], OS[Mac OS X/10.11.5/x86_64], JVM[Oracle Corporation/Java HotSpot(TM) 64-Bit Server VM/1.8.0_51/25.51-b03]
[2016-07-15 19:38:40,980][INFO ][plugins                  ] [aAmiW40] modules [percolator, lang-mustache, lang-painless, reindex, aggs-matrix-stats, lang-expression, ingest-common, lang-groovy, transport-netty], plugins []
[2016-07-15 19:38:43,218][INFO ][node                     ] [aAmiW40] initialized
```

Needless to say, settings `node.name` explicitly still works as before.

The commit also contains some clean ups to the relationship between Environment, Settings and Plugins. The previous code suggested the path related settings could be changed after the initial Environment was changed. This did not have any effect as the security manager already locked things down.
2016-07-23 22:46:48 +02:00
Jun Ohtani cebad703fe Analyze: Specify anonymous char_filters/tokenizer/token_filters in the analyze API
Add parser for anonymous char_filters/tokenizer/token_filters
Using Settings in AnalyzeRequest for anonymous definition
Add breaking changes document

Closed #8878
2016-07-21 11:06:36 +09:00
Nik Everett 3a82c613e4 Migrate query registration from push to pull
Remove `ParseField` constants used for names where there are no deprecated
names and just use the `String` version of the registration method instead.

This is step 2 in cleaning up the plugin interface for extending
search time actions. Aggregations are next.

This is breaking for plugins because those that register a new query should
now implement `SearchPlugin` rather than `onModule(SearchModule)`.
2016-07-20 12:33:51 -04:00
Martijn van Groningen e0ebf5da1c Template cleanup:
* Removed `Template` class and unified script & template parsing logic. Templates are scripts, so they should be defined as a script. Unless there will be separate template infrastructure, templates should share as much code as possible with scripts.
* Removed ScriptParseException in favour for ElasticsearchParseException
* Moved TemplateQueryBuilder to lang-mustache module because this query is hard coded to work with mustache only
2016-07-18 10:16:01 +02:00
Nik Everett 7aeea764ba Remove wait_for_status=yellow from the docs
It is no longer required after 687e2e12b3.
2016-07-15 16:02:07 -04:00
Simon Willnauer 5616251f22 Remove `node.mode` and `node.local` settings (#19428)
Today `node.mode` and `node.local` serve almost the same purpose, they
are a shortcut for `discovery.type` and `transport.type`. If `node.local: true`
or `node.mode: local` is set elasticsearch will start in _local_ mode which means
only nodes within the same JVM are discovered and a non-network based transport
is used. The _local_ mode it only really used in tests or if nodes are embedded.
For both, embedding and tests explicit configuration via `discovery.type` and `transport.type`
should be preferred.

This change removes all the usage of these settings and by-default doesn't
configure a default transport implemenation since netty is now a module. Yet, to make
the user expericence flawless, plugins or modules can set a `http.type.default` and
`transport.type.default`. Plugins set this via `PluginService#additionalSettings()`
which enforces _set-once_ which prevents node startup if set multiple times. This means
that our distributions will just startup with netty transport since it's packaged as a
module unless `transport.type` or `http.transport.type` is explicitly set.

This change also found a bunch of bugs since several NamedWriteables were not registered if a
transport client is used. Now that we don't rely on the `node.mode` leniency which is inherited
instead of using explicit settings, `TransportClient` uses `AssertingLocalTransport` which detects these problems since it serializes all messages.

Closes #16234
2016-07-14 13:21:10 +02:00
Boaz Leskes ef33183a19 update migration docs to include removal of `netty.epollBugWorkaround` 2016-07-14 12:20:35 +02:00
Clinton Gormley 1e2d0c1000 More bad asciidoc 2016-07-13 16:30:49 +02:00
Clinton Gormley 599727e38f Fixed bad ASCIIDOC 2016-07-13 16:09:41 +02:00
Martijn van Groningen 2c3165d080 Removed deprecated 1.x script and template syntax
Closes #13729
2016-07-13 15:07:36 +02:00
Jason Tedor ce5a382c69 Remove support for properties
This commit removes support for properties syntax and config files:
 - removed support for elasticsearch.properties
 - removed support for logging.properties
 - removed support for properties content detection in REST APIs
 - removed support for properties content detection in Java API

Relates #19398
2016-07-12 17:55:18 -04:00
Nik Everett 8263873783 Switch search extension from push to pull
Switches most search behavior extensions from push (`onModule(SearchModule)`)
to pull (`implements SearchPlugin`). This effort in general gives plugin
authors a much cleaner view of how to extend Elasticsearch and starts to
set up portions of Elasticsearch as "the plugin API". This commit in
particular does that for search-time behavior like customized suggesters,
highlighters, score functions, and significance heuristics.

It also switches most such customization to being done at search module
construction time which is much, much easier to reason about from a testing
perspective. It also helps significantly in the process of de-guice-ing
Elasticsearch's startup.

There are at least two major search time extensions that aren't covered in
this commit that will simply have to wait for the next commit on the topic
because this one has already grown large: custom aggregations and custom
queries. These will likely live in the same SearchPlugin interface as well.
2016-07-11 18:49:05 -04:00
Martijn van Groningen ff5527f037 percolator: Forbid the usage or `range` queries with a range based on the current time
If there are percolator queries containing `range` queries with ranges based on the current time then this can lead to incorrect results if the `percolate` query gets cached.  These ranges are changing each time the `percolate` query gets executed and if this query gets cached then the results will be based on how the range was at the time when the `percolate` query got cached.

The ExtractQueryTermsService has been renamed `QueryAnalyzer` and now only deals with analyzing the query (extracting terms and deciding if the entire query is a verified match) . The `PercolatorFieldMapper` is responsible for adding the right fields based on the analysis the `QueryAnalyzer` has performed, because this is highly dependent on the field mappings. Also the `PercolatorFieldMapper` is responsible for creating the percolate query.
2016-07-08 14:20:56 +02:00
Jason Tedor e86aa29f67 Die with dignity
Today when a thread encounters a fatal unrecoverable error that
threatens the stability of the JVM, Elasticsearch marches on. This
includes out of memory errors, stack overflow errors and other errors
that leave the JVM in a questionable state. Instead, the Elasticsearch
JVM should die when these errors are encountered. This commit causes
this to be the case.

Relates #19272
2016-07-07 14:44:03 -04:00
Jim Ferenczi dcf6a96725 Add doc values support to the _size field in the mapper-size plugin
This change activates the doc_values on the _size field for indices created after 5.0.0-alpha4.
It also adds a note in the breaking changes that explain the situation and how to get around it.

Closes #18334
2016-07-05 14:47:58 +02:00
Boaz Leskes 6861d3571e Persistent Node Ids (#19140)
Node IDs are currently randomly generated during node startup. That means they change every time the node is restarted. While this doesn't matter for ES proper, it makes it hard for external services to track nodes. Another, more minor, side effect is that indexing the output of, say, the node stats API results in creating new fields due to node ID being used as keys.

The first approach I considered was to use the node's published address as the base for the id. We already [treat nodes with the same address as the same](https://github.com/elastic/elasticsearch/blob/master/core/src/main/java/org/elasticsearch/discovery/zen/NodeJoinController.java#L387) so this is a simple change (see [here](https://github.com/elastic/elasticsearch/compare/master...bleskes:node_persistent_id_based_on_address)). While this is simple and it works for probably most cases, it is not perfect. For example, if after a node restart, the node is not able to bind to the same port (because it's not yet freed by the OS), it will cause the node to still change identity. Also in environments where the host IP can change due to a host restart, identity will not be the same. 

Due to those limitation, I opted to go with a different approach where the node id will be persisted in the node's data folder. This has the upside of connecting the id to the nodes data. It also means that the host can be adapted in any way (replace network cards, attach storage to a new VM). I

It does however also have downsides - we now run the risk of two nodes having the same id, if someone copies clones a data folder from one node to another. To mitigate this I changed the semantics of the protection against multiple nodes with the same address to be stricter - it will now reject the incoming join if a node exists with the same id but a different address. Note that if the existing node doesn't respond to pings (i.e., it's not alive) it will be removed and the new node will be accepted when it tries another join.

Last, and most importantly, this change requires that *all* nodes persist data to disk. This is a change from current behavior where only data & master nodes store local files. This is the main reason for marking this PR as breaking.

Other less important notes:
- DummyTransportAddress is removed as we need a unique network address per node. Use `LocalTransportAddress.buildUnique()` instead.
- I renamed `node.add_lid_to_custom_path` to `node.add_lock_id_to_custom_path` to avoid confusion with the node ID which is now part of the `NodeEnvironment` logic.
- I removed the `version` paramater from `MetaDataStateFormat#write` , it wasn't really used and was just in the way :)
- TribeNodes are special in the sense that they do start multiple sub-nodes (previously known as client nodes). Those sub-nodes do not store local files but derive their ID from the parent node id, so they are generated consistently.
2016-07-04 21:09:25 +02:00
Jim Ferenczi afe99fcdcd Restore reverted change now that alpha4 is out:
Rename `fields` to `stored_fields` and add `docvalue_fields`

`stored_fields` parameter will no longer try to retrieve fields from the _source but will only return stored fields.
`fields` will throw an exception if the user uses it.
Add `docvalue_fields` as an adjunct to `fielddata_fields` which is deprecated. `docvalue_fields` will try to load the value from the docvalue and fallback to fielddata cache if docvalues are not enabled on that field.

Closes #18943
2016-07-04 10:39:49 +02:00
David Pilato 527a9c7f48 Deprecate discovery-azure and rename it to discovery-azure-classic
As discussed at https://github.com/elastic/elasticsearch-cloud-azure/issues/91#issuecomment-229113595, we know that the current `discovery-azure` plugin only works with Azure Classic VMs / Services (which is somehow Legacy now).

The proposal here is to rename `discovery-azure` to `discovery-azure-classic` in case some users are using it.
And deprecate it for 5.0.

Closes #19144.
2016-06-30 14:42:40 +02:00
Nik Everett 8db43c0107 Move RestHandler registration to ActionModule and ActionPlugin
`RestHandler`s are highly tied to actions so registering them in the
same place makes sense.

Removes the need to for plugins to check if they are in transport client
mode before registering a RestHandler - `getRestHandlers` isn't called
at all in transport client mode.

This caused guice to throw a massive fit about the circular dependency
between NodeClient and the allocation deciders. I broke the circular
dependency by registering the actions map with the node client after
instantiation.
2016-06-29 18:31:44 -04:00
Jason Tedor 00356edd33 Clarify time units usage in docs
This commit clarifies the distinction between supported time units for
durations and supported time units for durations in the docs.

Relates #19159
2016-06-29 17:02:15 -04:00
Nik Everett fa4844c3f4 Pull actions from plugins
Instead of implementing onModule(ActionModule) to register actions,
this has plugins implement ActionPlugin to declare actions. This is
yet another step in cleaning up the plugin infrastructure.

While I was in there I switched AutoCreateIndex and DestructiveOperations
to be eagerly constructed which makes them easier to use when
de-guice-ing the code base.
2016-06-28 08:36:24 -04:00
Jason Tedor 2f638b5a23 Keep input time unit when parsing TimeValues
This commit modifies TimeValue parsing to keep the input time unit. This
enables round-trip parsing from instances of String to instances of
TimeValue and vice-versa. With this, this commit removes support for the
unit "w" representing weeks, and also removes support for fractional
values of units (e.g., 0.5s).

Relates #19102
2016-06-27 18:41:18 -04:00
Ryan Ernst a07a3a9333 Add migration docs for MapperPlugin 2016-06-27 11:22:07 -07:00
Jim Ferenczi eb1e231a63 Revert "Rename `fields` to `stored_fields` and add `docvalue_fields`"
This reverts commit 2f46f53dc8.
2016-06-27 17:20:32 +02:00
Damien Alexandre fec4a18835 Rename plainless into painless in migration doc
The scripting language was wrongly named.
2016-06-26 17:41:34 +02:00
Nik Everett 71b95fb63c Switch analysis from push to pull
Instead of plugins calling `registerTokenizer` to extend the analyzer
they now instead have to implement `AnalysisPlugin` and override
`getTokenizer`. This lines up extending plugins in with extending
scripts. This allows `AnalysisModule` to construct the `AnalysisRegistry`
immediately as part of its constructor which makes testing anslysis
much simpler.

This also moves the default analysis configuration into `AnalysisModule`
which is how search is setup.

Like `ScriptModule`, `AnalysisModule` no longer extends `AbstractModule`.
Instead it is only responsible for building `AnslysisRegistry`. We still
bind `AnalysisRegistry` but we only do so in `Node`. This is means it
is available at module construction time so we slowly remove the need to
bind it in guice.
2016-06-26 07:15:42 -04:00
Clinton Gormley 5a08e36f9c Update migrate_5_0.asciidoc
Updated breaking changes to state that upgraded indices still need to be reindexed,
and to mention the migration plugin
2016-06-23 13:10:50 +02:00
Tanguy Leroux 04da1bda0d Move templates out of the Search API, into lang-mustache module
This commit moves template support out of the Search API to its own dedicated Search Template API in the lang-mustache module. It provides a new SearchTemplateAction that can be used to render templates before it gets delegated to the usual Search API. The current REST endpoint are identical, but the Render Search Template endpoint now uses the same Search Template API with a new "simulate" option. When this option is enabled, the Search Template API only renders template and returns immediatly, without executing the search.

Closes #17906
2016-06-23 09:30:53 +02:00
Nik Everett 02761f5fe0 Docs: migration notes for _timestamp and _ttl
We aren't able to actually create an index with _timestamp enabled
to test the migration, or, at least, we won't be able to after #18980
is re-merged. But the docs are still ok.

Closes #19007
2016-06-22 14:43:12 -04:00
Nik Everett 6574243077 Fail to start if plugin tries broken onModule
If a plugin declares `onModule(SomethingThatIsntAModule)` then refuse
to start. Before this commit we just logged a warning that flies by in
the console and is easy to miss. You can't miss refusing to start!
2016-06-22 12:20:52 -04:00
Jim Ferenczi 2f46f53dc8 Rename `fields` to `stored_fields` and add `docvalue_fields`
`stored_fields` parameter will no longer try to retrieve fields from the _source but will only return stored fields.
`fields` will throw an exception if the user uses it.
Add `docvalue_fields` as an adjunct to `fielddata_fields` which is deprecated. `docvalue_fields` will try to load the value from the docvalue and fallback to fielddata cache if docvalues are not enabled on that field.

Closes #18943
2016-06-22 17:38:30 +02:00
Adrien Grand db9af54ec0 Remove `_timestamp` and `_ttl` on 5.x indices. #18980
This removes the ability to use `_timestamp` and `_ttl` on indices created on
or after 5.0.

Closes #18280
2016-06-22 08:35:54 +02:00
Martijn van Groningen 5dc88ffd26 docs: added note the inner hits migrate section 2016-06-22 08:29:50 +02:00
Clinton Gormley 2f2ea0c280 Improved docs explaining the index upgrade process in breaking changes 2016-06-21 18:03:19 +02:00
Clinton Gormley 70482d1e39 Update java.asciidoc
Fixed asciidoc
2016-06-21 16:02:25 +02:00
Martijn van Groningen 5ad2fdaa8e inner_hits: Don't include `_id`, `_type` and `_index` keys in search response for inner hits
Closes #18091
2016-06-21 14:13:38 +02:00
Jim Ferenczi 423291b6bc Change default similarity to BM25
The default similarity was set to `classic` which refers to TFIDF and has not been moved after the upgrade to Lucene 6.

Though moving to BM25 could have some downside for queries that relies on coordination factor (match_query, multi_match_query) ?

relates #18944
2016-06-21 11:29:36 +02:00