193 Commits

Author SHA1 Message Date
Matthew Haugen
ca8d89af5a Correct typo in "detect_noop" documentation (#22039)
The documentation reads:

> You can disable this behavior by setting "detect_noop": false like this:

Followed by a code example, that originally set `"detect_noop": true`.

Please correct me if I got the change backwards (i.e. the paragraph should be changed to `true`), but this seems like it makes the most sense.
2016-12-09 15:24:08 -05:00
Nik Everett
2087234d74 Timeout improvements for rest client and reindex (#21741)
Changes the default socket and connection timeouts for the rest
client from 10 seconds to the more generous 30 seconds.

Defaults reindex-from-remote to those timeouts and make the
timeouts configurable like so:
```
POST _reindex
{
  "source": {
    "remote": {
      "host": "http://otherhost:9200",
      "socket_timeout": "1m",
      "connect_timeout": "10s"
    },
    "index": "source",
    "query": {
      "match": {
        "test": "data"
      }
    }
  },
  "dest": {
    "index": "dest"
  }
}
```

Closes #21707
2016-12-05 10:54:51 -05:00
Itamar Syn-Hershko
c3a95a6666 Fixing cut-in-middle paragraph (#21850) 2016-11-29 14:00:26 +01:00
Luca Cavanna
c25f9b5fba [DOCS] add source filtering example to reindex docs (#21835) 2016-11-29 09:22:54 +01:00
Adrin Jalali
3bb9317ca2 clarify ambiguous sentence. (#21734) 2016-11-24 16:47:14 +01:00
Nik Everett
76a804e589 Revert "it's a noop operation, not a none operation. (#21736)"
This reverts commit 7f77214cedbc927347241670be140b54185752c6.

`none` is indeed how you trigger the `noop` operation in the `_update`
API.
2016-11-23 09:26:53 -05:00
Adrin Jalali
7f77214ced it's a noop operation, not a none operation. (#21736)
It works I guess cause it's ignored as an invalid operation.
2016-11-22 10:41:24 -05:00
Adrin Jalali
982f7cb067 fixing an ambiguous sentence. (#21729) 2016-11-22 16:35:58 +01:00
Adrin Jalali
4bb7091f64 force is deprecated be mentioned at the end. (#21731) 2016-11-22 16:35:57 +01:00
Jason Tedor
d06a8903fd Merge branch 'master' into feature/seq_no
* master: (22 commits)
  Add proper toString() method to UpdateTask (#21582)
  Fix `InternalEngine#isThrottled` to not always return `false`. (#21592)
  add `ignore_missing` option to SplitProcessor (#20982)
  fix trace_match behavior for when there is only one grok pattern (#21413)
  Remove dead code from GetResponse.java
  Fixes date range query using epoch with timezone (#21542)
  Do not cache term queries. (#21566)
  Updated dynamic mapper section
  Docs: Clarify date_histogram bucket sizes for DST time zones
  Handle release of 5.0.1
  Fix skip reason for stats API parameters test
  Reduce skip version for stats API parameter tests
  Strict level parsing for indices stats
  Remove cluster update task when task times out (#21578)
  [DOCS] Mention "all-fields" mode doesn't search across nested documents
  InternalTestCluster: when restarting a node we should validate the cluster is formed via the node we just restarted
  Fixed bad asciidoc in boolean mapping docs
  Fixed bad asciidoc ID in node stats
  Be strict when parsing values searching for booleans (#21555)
  Fix time zone rounding edge case for DST overlaps
  ...
2016-11-16 09:10:35 -05:00
David Pilato
2842e2752a Updated dynamic mapper section
Backport of #21574 in master (6.0)
2016-11-16 09:52:08 +01:00
Jason Tedor
33f7cd5a16 Remove shard ID from doc write response
This commit removes the shard ID from doc write response; this was
useful for debugging but its time has passed.

Relates #21508
2016-11-11 15:18:25 -05:00
Jason Tedor
1e7c424479 Merge branch 'master' into feature/seq_no
* master:
  ShardActiveResponseHandler shouldn't hold to an entire cluster state
  Ensures cleanup of temporary index-* generational blobs during snapshotting (#21469)
  Remove (again) test uses of onModule (#21414)
  [TEST] Add assertBusy when checking for pending operation counter after tests
  Revert "Add trace logging when aquiring and releasing operation locks for replication requests"
  Allows multiple patterns to be specified for index templates (#21009)
  [TEST] fixes rebalance single shard check as it isn't guaranteed that a rebalance makes sense and the method only tests if rebalance is allowed
  Document _reindex with random_score
2016-11-11 11:25:27 -05:00
Jason Tedor
d3417fb022 Merge branch 'master' into feature/seq_no
* master: (516 commits)
  Avoid angering Log4j in TransportNodesActionTests
  Add trace logging when aquiring and releasing operation locks for replication requests
  Fix handler name on message not fully read
  Remove accidental import.
  Improve log message in TransportNodesAction
  Clean up of Script.
  Update Joda Time to version 2.9.5 (#21468)
  Remove unused ClusterService dependency from SearchPhaseController (#21421)
  Remove max_local_storage_nodes from elasticsearch.yml (#21467)
  Wait for all reindex subtasks before rethrottling
  Correcting a typo-Maan to Man-in README.textile (#21466)
  Fix InternalSearchHit#hasSource to return the proper boolean value (#21441)
  Replace all index date-math examples with the URI encoded form
  Fix typos (#21456)
  Adapt ES_JVM_OPTIONS packaging test to ubuntu-1204
  Add null check in InternalSearchHit#sourceRef to prevent NPE (#21431)
  Add VirtualBox version check (#21370)
  Export ES_JVM_OPTIONS for SysV init
  Skip reindex rethrottle tests with workers
  Make forbidden APIs be quieter about classpath warnings (#21443)
  ...
2016-11-10 23:40:33 -05:00
Nik Everett
eeb6602c98 Document _reindex with random_score
You can use `_reindex` and `random_score` to extract a random
subset of an index but you have to be careful to sort by `_score`
or it won't work.

Closes #21432
2016-11-10 16:14:30 -05:00
Nik Everett
7ff9ba1604 Fix asciidoc structure for sliced reindex
Asciidoc likes headings just so and will complain and fail the
docs build without it.

Related to #20767
2016-11-04 21:59:19 -04:00
Nik Everett
a13a050271 Add automatic parallelization support to reindex and friends (#20767)
Adds support for `?slices=N` to reindex which automatically
parallelizes the process using parallel scrolls on `_uid`. Performance
testing sees a 3x performance improvement for simple docs
on decent hardware, maybe 30% performance improvement
for more complex docs. Still compelling, especially because
clusters should be able to get closer to the 3x than the 30%
number.

Closes #20624
2016-11-04 20:59:15 -04:00
Nik Everett
a612e5988e Bump reindex-from-remote's buffer to 200mb
It was 10mb and that was causing trouble when folks reindex-from-remoted
with large documents.

We also improve the error reporting so it tells folks to use a smaller
batch size if they hit a buffer size exception. Finally, adds some docs
to reindex-from-remote mentioning the buffer and giving an example of
lowering the size.

Closes #21185
2016-11-01 13:19:28 -04:00
Stanislav Mamontov
7738af27e8 Fix malformed JSON in Delete API example (#21168)
Obviously, there should be

    "result": "deleted"

instead of

    "result: deleted"
2016-10-31 09:13:46 -06:00
Nik Everett
acf7c7430b Add "simple match" support for reindex-from-remote whitelist
This allows you to whitelist `localhost:*` or `127.0.10.*:9200`.
It explicitly checks for patterns like `*` in the whitelist and
refuses to start if the whitelist would match everything. Beyond
that the user is on their own designing a secure whitelist.
2016-10-18 21:47:21 -04:00
Thibaud BARDIN
1bcd26627c [DOCS] Fix typo in "Wait For Active Shards" part (#20900)
Add missing closing backtick
2016-10-13 08:53:30 +02:00
Lee Hinman
f0a2726dcd [DOCS] Remove documentation for force version-type
This option should not be recommended to anyone, and should never be
used, upon chance of primary/replica divergence.

Relates to #19769
2016-10-11 11:21:32 -06:00
Clinton Gormley
82e2f6e747 Document the ctx._now variable in the update API
Relates to #20835
2016-10-11 13:13:03 +02:00
Clinton Gormley
02a739d3c9 Added upgrade docs explaining how to reindex in place or reindex from remote
Closes #20675
2016-10-11 12:14:35 +02:00
Shane Connelly
3164917fd4 Adds a note that reindex does not set up mappings, etc. Closes #20783 2016-10-06 12:27:08 -07:00
Jason Tedor
51d53791fe Remove lenient URL parameter parsing
Today when parsing a request, Elasticsearch silently ignores incorrect
(including parameters with typos) or unused parameters. This is bad as
it leads to requests having unintended behavior (e.g., if a user hits
the _analyze API and misspell the "tokenizer" then Elasticsearch will
just use the standard analyzer, completely against intentions).

This commit removes lenient URL parameter parsing. The strategy is
simple: when a request is handled and a parameter is touched, we mark it
as such. Before the request is actually executed, we check to ensure
that all parameters have been consumed. If there are remaining
parameters yet to be consumed, we fail the request with a list of the
unconsumed parameters. An exception has to be made for parameters that
format the response (as opposed to controlling the request); for this
case, handlers are able to provide a list of parameters that should be
excluded from tripping the unconsumed parameters check because those
parameters will be used in formatting the response.

Additionally, some inconsistencies between the parameters in the code
and in the docs are corrected.

Relates #20722
2016-10-04 12:45:29 -04:00
Jason Tedor
8879360f66 Fix failing doc tests in feature/seq_no
This commit fixes failing doc tests in feature/seq_no after merging
master into this branch.
2016-09-29 03:58:02 +02:00
Jason Tedor
25fd9e26c4 Merge branch 'master' into feature/seq_no
* master: (1199 commits)
  [DOCS] Remove non-valid link to mapping migration document
  Revert "Default `include_in_all` for numeric-like types to false"
  test: add a test with ipv6 address
  docs: clearify that both ip4 and ip6 addresses are supported
  Include complex settings in settings requests
  Add production warning for pre-release builds
  Clean up confusing error message on unhandled endpoint
  [TEST] Increase logging level in testDelayShards()
  change health from string to enum (#20661)
  Provide error message when plugin id is missing
  Document that sliced scroll works for reindex
  Make reindex-from-remote ignore unknown fields
  Remove NoopGatewayAllocator in favor of a more realistic mock (#20637)
  Remove Marvel character reference from guide
  Fix documentation for setting Java I/O temp dir
  Update client benchmarks to log4j2
  Changes the API of GatewayAllocator#applyStartedShards and (#20642)
  Removes FailedRerouteAllocation and StartedRerouteAllocation
  IndexRoutingTable.initializeEmpty shouldn't override supplied primary RecoverySource (#20638)
  Smoke tester: Adjust to latest changes (#20611)
  ...
2016-09-29 00:22:31 +02:00
Nik Everett
560fba1b28 Document that sliced scroll works for reindex
Surprise! You can use sliced scroll to easily parallelize reindex
and friend. They support it because they use the same infrastructure
as a regular search to parse the search request. While we would like
to make an "automatic" option for parallelizing reindex, this manual
option works right now and is pretty convenient!
2016-09-26 05:27:44 +02:00
Jim Ferenczi
f98d5b6261 Add CONSOLE tests for snippets in get and bulk API docs (#20473)
* Add CONSOLE tests for snippets in get and bulk API docs

This change adds tests for the snippets in the get and bulk API documentation.
2016-09-20 09:18:08 +02:00
Tanguy Leroux
656596c2a9 [DOC] Remove obsolete node names from documentation
Funny node names have been removed in #19456 and replaced by UUID. This commit removes these obsolete node names and replace them by real UUIDs in the documentation.

closes #20065
2016-09-19 11:56:28 +02:00
Jim Ferenczi
1764ec56b3 Fixed naming inconsistency for fields/stored_fields in the APIs (#20166)
This change replaces the fields parameter with stored_fields when it makes sense.
This is dictated by the renaming we made in #18943 for the search API.

The following list of endpoint has been changed to use `stored_fields` instead of `fields`:
* get
* mget
* explain

The documentation and the rest API spec has been updated to cope with the changes for the following APIs:
* delete_by_query
* get
* mget
* explain

The `fields` parameter has been deprecated for the following APIs (it is replaced by _source filtering):
* update: the fields are extracted from the _source directly.
* bulk: the fields parameter is used but fields are extracted from the source directly so it is allowed to have non-stored fields.

Some APIs still have the `fields` parameter for various reasons:
* cat.fielddata: the fields paramaters relates to the fielddata fields that should be printed.
* indices.clear_cache: used to indicate which fielddata fields should be cleared.
* indices.get_field_mapping: used to filter fields in the mapping.
* indices.stats: get stats on fields (stored or not stored).
* termvectors: fields are retrieved from the stored fields if possible and extracted from the _source otherwise.
* mtermvectors:
* nodes.stats: the fields parameter is used to concatenate completion_fields and fielddata_fields so it's not related to stored_fields at all.

Fixes #20155
2016-09-13 20:54:41 +02:00
Florian Hopf
359e76f7e7 Fixed wording 2016-09-01 11:22:44 -06:00
Jim Ferenczi
accb636824 Merge pull request #20213 from jimferenczi/painless_list_add
Fix docs that uses += to add an element in a list even though painless does not accept it.
2016-08-30 09:33:59 +02:00
Jim Ferenczi
dc663a432b Fix docs that uses += to add an element in a list even though painless does not accept it. 2016-08-29 16:00:11 +02:00
Nicolas Ruflin
4ab1093564 Add reindex example on how to reindex daily indices (#18654)
This can be a common case with beats in case the template changes between two versions and the old data should be reindex with the new templates.
2016-08-26 13:08:52 -04:00
Simon Willnauer
c499427166 Use _refresh instead of reading from Translog in the RT GET case (#20102)
Today we do a lot of accounting inside the engine to maintain locations
of documents inside the transaction log. This is only needed to ensure
we can return the documents source from the engine if it hasn't been refreshed.
Aside of the added complexity to be able to read from the currently writing translog,
maintainance of pointers into the translog this also caused inconsistencies like different values
of the `_ttl` field if it was read from the tlog or not. TermVectors are totally different if
the document is fetched from the tranlog since copy fields are ignored etc.

This chance will simply call `refresh` if the documents latest version is not in the index. This
streamlines the semantics of the `_get` API and allows for more optimizations inside the engine
and on the transaction log. Note: `_refresh` is only called iff the requested document is not refreshed
yet but has recently been updated or added.

#Relates to #19787
2016-08-24 15:30:08 +02:00
ddddn
6228b002c5 Update index_.asciidoc (#20125) 2016-08-23 20:08:52 +02:00
javanna
73d0a1b777 [DOCS] clarify behaviour when routing is required and no routing value is specified
This note in the delete api about broadcasting to all shards is a leftover that should have been removed when the broadcasting feature was removed

Relates to #10136
2016-08-08 10:41:59 +02:00
Ali Beyad
a21dd80f1b Documentation changes for wait_for_active_shards (#19581)
Documentation changes and migration doc changes for introducing 
wait_for_active_shards and removing write consistency level.

Closes #19581
2016-08-02 09:15:01 -04:00
Alexander Lin
9ac6389e43 Rename operation to result and reworking responses
* Rename operation to result and reworking responses
* Rename DocWriteResponse.Operation enum to DocWriteResponse.Result

These are just easier to interpret names.

Closes #19664
2016-08-01 10:42:58 -04:00
Nik Everett
bdebd02d8c Only write forced_refresh if we forced a refresh
Otherwise it just adds noise to the response.

Closes #19629
2016-07-29 15:00:30 -04:00
Alexander Lin
8f2882a442 Add _operation field to index, update, delete responses
Performing the bulk request shown in #19267 now results in the following:
```
{"_index":"test","_type":"test","_id":"1","_version":1,"_operation":"create","forced_refresh":false,"_shards":{"total":2,"successful":1,"failed":0},"status":201}
{"_index":"test","_type":"test","_id":"1","_version":1,"_operation":"noop","forced_refresh":false,"_shards":{"total":2,"successful":1,"failed":0},"status":200}
```
2016-07-26 11:16:19 -04:00
Nik Everett
d573541f66 Support requests_per_second=-1 to mean no throttling in reindex
This is entirely on the REST level, Float.POSITIVE_INFINITY is still
how you get no throttling over the transport api.

Closes #19089
2016-07-18 13:05:06 -04:00
Nik Everett
7aeea764ba Remove wait_for_status=yellow from the docs
It is no longer required after 687e2e12b31ed3c12ef4c411333bff9da58fc808.
2016-07-15 16:02:07 -04:00
Jason Tedor
d0765d0761 Merge branch 'master' into feature/seq_no
* master: (192 commits)
  [TEST] Fix rare OBOE in AbstractBytesReferenceTestCase
  Reindex from remote
  Rename writeThrowable to writeException
  Start transport client round-robin randomly
  Reword Refresh API reference (#19270)
  Update fielddata.asciidoc
  Fix stored_fields message
  Add missing footer notes in mapper size docs
  Remote BucketStreams
  Add doc values support to the _size field in the mapper-size plugin
  Bump version to 5.0.0-alpha5.
  Update refresh.asciidoc
  Update shrink-index.asciidoc
  Change Debian repository for Vagrant debian-8 box
  [TEST] fix test to account for internal empyt reference optimization
  Upgrade to netty 3.10.6.Final (#19235)
  [TEST] fix histogram test when extended bounds overlaps data
  Remove redundant modifier
  Simplify TcpTransport interface by reducing send code to a single send method (#19223)
  Fix style violation in InstallPluginCommand.java
  ...
2016-07-05 22:01:07 -04:00
Nik Everett
b3c015e2bb Reindex from remote
This adds a remote option to reindex that looks like

```
curl -POST 'localhost:9200/_reindex?pretty' -d'{
  "source": {
    "remote": {
      "host": "http://otherhost:9200"
    },
    "index": "target",
    "query": {
      "match": {
        "foo": "bar"
      }
    }
  },
  "dest": {
    "index": "target"
  }
}'
```

This reindex has all of the features of local reindex:
* Using queries to filter what is copied
* Retry on rejection
* Throttle/rethottle
The big advantage of this version is that it goes over the HTTP API
which can be made backwards compatible.

Some things are different:

The query field is sent directly to the other node rather than parsed
on the coordinating node. This should allow it to support constructs
that are invalid on the coordinating node but are valid on the target
node. Mostly, that means old syntax.
2016-07-05 16:13:17 -04:00
Christoph Wurm
c9da56dc80 Reword Refresh API reference (#19270) 2016-07-05 18:37:28 +02:00
Christoph Wurm
768beea6c7 Update refresh.asciidoc
Fix grammar and example
2016-07-05 13:49:25 +02:00
Tanguy Leroux
dc53ce929d Document Update/Delete-By-Query with version number zero
Update-By-Query and Delete-By-Query use internal versioning to update/delete documents. But documents can have a version number equal to zero using the external versioning... making the UBQ/DBQ request fail because zero is not a valid version number and they only support internal versioning for now. Sequence numbers might help to solve this issue in the future.
2016-06-30 15:45:14 +02:00