Commit Graph

3185 Commits

Author SHA1 Message Date
Nik Everett 8263873783 Switch search extension from push to pull
Switches most search behavior extensions from push (`onModule(SearchModule)`)
to pull (`implements SearchPlugin`). This effort in general gives plugin
authors a much cleaner view of how to extend Elasticsearch and starts to
set up portions of Elasticsearch as "the plugin API". This commit in
particular does that for search-time behavior like customized suggesters,
highlighters, score functions, and significance heuristics.

It also switches most such customization to being done at search module
construction time which is much, much easier to reason about from a testing
perspective. It also helps significantly in the process of de-guice-ing
Elasticsearch's startup.

There are at least two major search time extensions that aren't covered in
this commit that will simply have to wait for the next commit on the topic
because this one has already grown large: custom aggregations and custom
queries. These will likely live in the same SearchPlugin interface as well.
2016-07-11 18:49:05 -04:00
Sho Minagawa 6aa598e3fb Fix typo on analyze.asciidoc (#19354) 2016-07-11 15:49:39 +02:00
Yannick Welsch 7dff8fbb1d Update resiliency docs (#19303)
Adds clarifications about Jepsen tests and new section on issues with versioning.
2016-07-08 17:30:46 +02:00
Clinton Gormley 982e01d463 Update network.asciidoc
`network.publish_host` defaults to `network.host`, not `network.bind_host`

Closes #19304
2016-07-08 17:13:10 +02:00
Jason Tedor 527980c995 Fix nesting of stopping docs
This commit fixes errant nesting of the stopping docs due to using a
section header instead of a chapter header at the top of the stopping
docs.
2016-07-08 10:43:35 -04:00
Martijn van Groningen ff5527f037 percolator: Forbid the usage or `range` queries with a range based on the current time
If there are percolator queries containing `range` queries with ranges based on the current time then this can lead to incorrect results if the `percolate` query gets cached.  These ranges are changing each time the `percolate` query gets executed and if this query gets cached then the results will be based on how the range was at the time when the `percolate` query got cached.

The ExtractQueryTermsService has been renamed `QueryAnalyzer` and now only deals with analyzing the query (extracting terms and deciding if the entire query is a verified match) . The `PercolatorFieldMapper` is responsible for adding the right fields based on the analysis the `QueryAnalyzer` has performed, because this is highly dependent on the field mappings. Also the `PercolatorFieldMapper` is responsible for creating the percolate query.
2016-07-08 14:20:56 +02:00
Glen Smith d7099f05b9 slight clarification 2016-07-07 20:46:18 -04:00
Jason Tedor e86aa29f67 Die with dignity
Today when a thread encounters a fatal unrecoverable error that
threatens the stability of the JVM, Elasticsearch marches on. This
includes out of memory errors, stack overflow errors and other errors
that leave the JVM in a questionable state. Instead, the Elasticsearch
JVM should die when these errors are encountered. This commit causes
this to be the case.

Relates #19272
2016-07-07 14:44:03 -04:00
Jason Tedor c05f818160 Fix casing of "Elasticsearch" in how-to docs 2016-07-07 12:33:27 -04:00
Adrien Grand 873661df17 Fix typo. 2016-07-07 17:49:01 +02:00
Adrien Grand f295a218a0 Add notes about sparsity. 2016-07-07 17:47:19 +02:00
Clinton Gormley ee86a9f634 Update field-stats.asciidoc
Change use of index constraints to correctly identify any indices containing relevant docs

Closes #19232
2016-07-07 14:56:40 +02:00
Martijn van Groningen b4defafcb2 ingest: Renamed from `ingest-useragent` to `ingest-user-agent` and processor from `useragent` to `user_agent`
and on some other places did similar renaming. This is consistent with ES naming. Also made sure that the docs are navigable from the reference guide.
2016-07-07 09:43:43 +02:00
Clinton Gormley 3e6769f237 Add otto-de/flummi client to plugins
Closes #19266
2016-07-06 14:31:58 +02:00
Nik Everett b3c015e2bb Reindex from remote
This adds a remote option to reindex that looks like

```
curl -POST 'localhost:9200/_reindex?pretty' -d'{
  "source": {
    "remote": {
      "host": "http://otherhost:9200"
    },
    "index": "target",
    "query": {
      "match": {
        "foo": "bar"
      }
    }
  },
  "dest": {
    "index": "target"
  }
}'
```

This reindex has all of the features of local reindex:
* Using queries to filter what is copied
* Retry on rejection
* Throttle/rethottle
The big advantage of this version is that it goes over the HTTP API
which can be made backwards compatible.

Some things are different:

The query field is sent directly to the other node rather than parsed
on the coordinating node. This should allow it to support constructs
that are invalid on the coordinating node but are valid on the target
node. Mostly, that means old syntax.
2016-07-05 16:13:17 -04:00
Christoph Wurm c9da56dc80 Reword Refresh API reference (#19270) 2016-07-05 18:37:28 +02:00
Britta Weber f36c1b4e60 Update fielddata.asciidoc 2016-07-05 16:21:52 +02:00
Jim Ferenczi e3fe5c9625 Add missing footer notes in mapper size docs 2016-07-05 15:12:18 +02:00
Jim Ferenczi dcf6a96725 Add doc values support to the _size field in the mapper-size plugin
This change activates the doc_values on the _size field for indices created after 5.0.0-alpha4.
It also adds a note in the breaking changes that explain the situation and how to get around it.

Closes #18334
2016-07-05 14:47:58 +02:00
Christoph Wurm 768beea6c7 Update refresh.asciidoc
Fix grammar and example
2016-07-05 13:49:25 +02:00
Christoph Wurm d1727653dd Update shrink-index.asciidoc
Fix half-finished sentence
2016-07-05 13:34:58 +02:00
Boaz Leskes 6861d3571e Persistent Node Ids (#19140)
Node IDs are currently randomly generated during node startup. That means they change every time the node is restarted. While this doesn't matter for ES proper, it makes it hard for external services to track nodes. Another, more minor, side effect is that indexing the output of, say, the node stats API results in creating new fields due to node ID being used as keys.

The first approach I considered was to use the node's published address as the base for the id. We already [treat nodes with the same address as the same](https://github.com/elastic/elasticsearch/blob/master/core/src/main/java/org/elasticsearch/discovery/zen/NodeJoinController.java#L387) so this is a simple change (see [here](https://github.com/elastic/elasticsearch/compare/master...bleskes:node_persistent_id_based_on_address)). While this is simple and it works for probably most cases, it is not perfect. For example, if after a node restart, the node is not able to bind to the same port (because it's not yet freed by the OS), it will cause the node to still change identity. Also in environments where the host IP can change due to a host restart, identity will not be the same. 

Due to those limitation, I opted to go with a different approach where the node id will be persisted in the node's data folder. This has the upside of connecting the id to the nodes data. It also means that the host can be adapted in any way (replace network cards, attach storage to a new VM). I

It does however also have downsides - we now run the risk of two nodes having the same id, if someone copies clones a data folder from one node to another. To mitigate this I changed the semantics of the protection against multiple nodes with the same address to be stricter - it will now reject the incoming join if a node exists with the same id but a different address. Note that if the existing node doesn't respond to pings (i.e., it's not alive) it will be removed and the new node will be accepted when it tries another join.

Last, and most importantly, this change requires that *all* nodes persist data to disk. This is a change from current behavior where only data & master nodes store local files. This is the main reason for marking this PR as breaking.

Other less important notes:
- DummyTransportAddress is removed as we need a unique network address per node. Use `LocalTransportAddress.buildUnique()` instead.
- I renamed `node.add_lid_to_custom_path` to `node.add_lock_id_to_custom_path` to avoid confusion with the node ID which is now part of the `NodeEnvironment` logic.
- I removed the `version` paramater from `MetaDataStateFormat#write` , it wasn't really used and was just in the way :)
- TribeNodes are special in the sense that they do start multiple sub-nodes (previously known as client nodes). Those sub-nodes do not store local files but derive their ID from the parent node id, so they are generated consistently.
2016-07-04 21:09:25 +02:00
Clinton Gormley f572f8cc17 Bad asciidoc link 2016-07-04 11:02:06 +02:00
Jim Ferenczi afe99fcdcd Restore reverted change now that alpha4 is out:
Rename `fields` to `stored_fields` and add `docvalue_fields`

`stored_fields` parameter will no longer try to retrieve fields from the _source but will only return stored fields.
`fields` will throw an exception if the user uses it.
Add `docvalue_fields` as an adjunct to `fielddata_fields` which is deprecated. `docvalue_fields` will try to load the value from the docvalue and fallback to fielddata cache if docvalues are not enabled on that field.

Closes #18943
2016-07-04 10:39:49 +02:00
Christoph Wurm 42addb5692 Add ingest-useragent plugin (#19074) 2016-07-01 15:49:43 +02:00
Leon Weidauer 1297a707da non-binary gender option in term aggr. example (#19188)
* non-binary gender option in term aggr. example

* replace gender with music genre for term aggregation docs
2016-07-01 14:59:03 +02:00
javanna 62462f5d9b [TEST] replace ResponseBodyAssertion with existing MatchAssertion
We introduced a special response_body assertion to test our docs snippets. The match assertion does the same job though and can be reused and adapted where needed. ResponseBodyAssertion contains provides much better and accurate errors though, which can be now utilized in MatchAssertion so that many more REST tests can benefit from readable error messages.

 Each response body gets always stashed and can be retrieved for later evaluations already. Instead of providing the response body as strings that get parsed to json objects separately, then converted to maps as ResponseBodyAssertion did, we parse everything once, the json is part of the yaml test, which is supported. The only downside is that json comments cannot be used, rather yaml comments should be used (// C style vs # ). There were only two docs tests that were using comments in ingest-node.asciidoc where I went ahead and remove the comments which didn't seem that useful anyways.
2016-07-01 11:13:10 +02:00
Clinton Gormley e1ab3f16fd Add link to alpha4 release notes 2016-06-30 18:32:15 +02:00
Boaz Leskes 09ca6d6ed2 Add a BridgePartition to be used by testAckedIndexing (#19172)
We have long worked to capture different partitioning scenarios in our testing infra. This PR adds a new variant, inspired by the Jepsen blogs, which was forgotten far - namely a partition where one node can still see and be seen by all other nodes. It also updates the resiliency page to better reflect all the work that was done in this area.
2016-06-30 17:58:12 +02:00
jalvar08 dbf1f61c5b Fixing typo for path.conf location (#19098)
Changing -Ees.path.conf to -Epath.conf
2016-06-30 16:42:01 +02:00
Tanguy Leroux 5903966dc8 Merge pull request #19180 from tlrx/doc-version-number-zero-with-dbq-and-ubq
[Doc] Document Update/Delete-By-Query with version number zero
2016-06-30 15:51:46 +02:00
Tanguy Leroux dc53ce929d Document Update/Delete-By-Query with version number zero
Update-By-Query and Delete-By-Query use internal versioning to update/delete documents. But documents can have a version number equal to zero using the external versioning... making the UBQ/DBQ request fail because zero is not a valid version number and they only support internal versioning for now. Sequence numbers might help to solve this issue in the future.
2016-06-30 15:45:14 +02:00
David Pilato 535157474e Merge branch 'pr/19144-discovery-azure-classic' 2016-06-30 15:44:28 +02:00
David Pilato 72c220b1df Add deprecation notice 2016-06-30 15:29:29 +02:00
David Pilato f3ddccad17 Fix documentation filenames 2016-06-30 15:26:54 +02:00
Clinton Gormley b5bb27cf90 Bumped version to 5.0.0-alpha4 2016-06-30 15:20:59 +02:00
David Pilato 8a2b27076e Merge branch 'master' into pr/19144-discovery-azure-classic
# Conflicts:
#	plugins/discovery-azure-classic/LICENSE.txt
2016-06-30 14:46:21 +02:00
David Pilato 527a9c7f48 Deprecate discovery-azure and rename it to discovery-azure-classic
As discussed at https://github.com/elastic/elasticsearch-cloud-azure/issues/91#issuecomment-229113595, we know that the current `discovery-azure` plugin only works with Azure Classic VMs / Services (which is somehow Legacy now).

The proposal here is to rename `discovery-azure` to `discovery-azure-classic` in case some users are using it.
And deprecate it for 5.0.

Closes #19144.
2016-06-30 14:42:40 +02:00
David Pilato 8c6c00ff15 Update documentation for cat/plugins API
Cat API for plugins doesn't display anymore url or jvm/site flag
2016-06-30 13:57:43 +02:00
Colin Goodheart-Smithe 0d7c11ea1d [DOCS] put profiling performance and limitations section on same page 2016-06-30 12:28:46 +01:00
Britta Weber 57a734e641 [doc] explain avg in function_score better (#19154)
* [doc] explain avg in function_score better
2016-06-30 11:52:53 +02:00
Nik Everett 8db43c0107 Move RestHandler registration to ActionModule and ActionPlugin
`RestHandler`s are highly tied to actions so registering them in the
same place makes sense.

Removes the need to for plugins to check if they are in transport client
mode before registering a RestHandler - `getRestHandlers` isn't called
at all in transport client mode.

This caused guice to throw a massive fit about the circular dependency
between NodeClient and the allocation deciders. I broke the circular
dependency by registering the actions map with the node client after
instantiation.
2016-06-29 18:31:44 -04:00
Jason Tedor 00356edd33 Clarify time units usage in docs
This commit clarifies the distinction between supported time units for
durations and supported time units for durations in the docs.

Relates #19159
2016-06-29 17:02:15 -04:00
Nik Everett 57f413e851 More changes to java update-by-query api docs 2016-06-29 11:10:02 -04:00
Nik Everett ccab85835a Rework java update-by-query docs 2016-06-29 11:10:02 -04:00
Paul Echeverri 83d7f199c7 Partial draft for Java Update-by-Query 2016-06-29 11:10:02 -04:00
Clinton Gormley 4f82b2de1a Fixed bad asciidoc in azure discovery 2016-06-29 16:57:57 +02:00
Alexander Reelsen 56fa751928 Plugins: Add status bar on download (#18695)
As some plugins are becoming big now, it is hard for the user to know, if the plugin
is being downloaded or just nothing happens.

This commit adds a progress bar during download, which can be disabled by using the `-q`
parameter.

In addition this updates to jimfs 1.1, which allows us to test the batch mode, as adding
security policies are now supported due to having jimfs:// protocol support in URL stream
handlers.
2016-06-29 16:44:12 +02:00
Jim Ferenczi 6d2df0dc18 Fix docs example for the _id field, the field is not accessible in scripts 2016-06-29 15:25:51 +02:00
Clinton Gormley 6b7acc0ca2 Update index.asciidoc
In-flight requests circuit breaker is done
2016-06-29 10:24:43 +02:00