Commit Graph

120 Commits

Author SHA1 Message Date
Simon Willnauer 75e816400c Remove TranslogService and fold it into synchronous IndexShard API
This commit moves the size and ops based flush into a synchronous API into
IndexShard and removes the time-based flush alltogether since it' basically
covered by the inactive async flush API we have today. The functionality doesn't
need to be covered by scheduled task and async APIs while we can actually make all
the decisions in a sync manner which is way easier to control and to test.

Closes #13707
2015-09-23 12:39:06 +02:00
David Pilato 35049a05c3 Allocation: add support for filtering by transport IP address
Allocation filtering by IP only works today using the node host address. But in some cases, you might want to filter using the publish address which could be different.
2015-09-09 15:15:53 +02:00
Michael McCandless 1c85b68674 Don't document expert segment merge settings 2015-08-29 17:21:46 -04:00
Nik Everett 79d9f5b775 Logging: Log less source in slowlog
Instead of logging the entire `_source` in the indexing slowlog we log by
default just the first 1000 characters - this is controlled by the
`index.indexing.slowlog.source` settings and can be set to `true` to log the
whole `_source`, `false` to log none of it, and a number to log at most that
many characters.

Closes #4485
2015-08-11 13:16:04 -07:00
Clinton Gormley c22e179e87 Docs: Documented cancelation of shard recovery
Relates to #12421
2015-08-07 19:44:34 +02:00
Clinton Gormley ac2b8951c6 Docs: Mapping docs completely rewritten for 2.0 2015-08-06 17:24:51 +02:00
Clinton Gormley c56ce0e242 Docs: Refactored the mapping meta-fields docs 2015-07-20 01:26:27 +02:00
Clinton Gormley dbc0b45896 Docs: Documented index prioritization 2015-07-15 18:05:42 +02:00
Clinton Gormley 2b512f1f29 Docs: Use "js" instead of "json" and "sh" instead of "shell" for source highlighting 2015-07-14 18:14:09 +02:00
Shay Banon e598f16b58 Default delayed allocation timeout to 1m from 0
Change the default delayed allocation timeout from 0 (no delayed allocation) to 1m. The value came from a test of having a node with 50 shards being indexed into (so beefy translog requiring flush on shutdown), then shutting it down and starting it back up and waiting for it to join the cluster. This took, on a slow machine, about 30s.
The value is conservatively low and does not try to address a virtual machine / OS restart for now, in order to not have the affect of node going away and users being concerned that shards are not being allocated to the rest of the cluster as a result of that. The setting can always be changed in order to increase the delayed allocation if needed.
closes #12166
2015-07-14 11:31:16 +02:00
Clinton Gormley aaf1d14b21 Docs: Fixed bad links 2015-07-07 16:08:10 +02:00
Clinton Gormley 93fe8f8910 Docs: Updated the translog docs to reflect the new behaviour/settings in master
Closes #11287
2015-06-30 19:08:31 +02:00
Clinton Gormley 84acb65ca1 Docs: Documented delayed allocation settings
Relates to: #11712
2015-06-30 13:53:04 +02:00
Clinton Gormley f123a53d72 Docs: Refactored modules and index modules sections 2015-06-22 23:49:45 +02:00
Adrien Grand 14c9c239bc Remove non-default fielddata formats.
Now that doc values are the default for fielddata, specialized in-memory
formats are becoming an esoteric option. This commit removes such formats:
 - `fst` on string fields,
 - `compressed` on geo points.

I also removed documentation and tests that the fielddata cache is shared if
you change the format, since this is only true for in-memory fielddata formats
(given that for doc values, the caching is done directly in Lucene).
2015-06-15 14:05:23 +02:00
Clinton Gormley 0216dfd3b6 Docs: Removed left over table header from merge.asciidoc 2015-06-11 13:26:34 +02:00
Simon Willnauer f77804dad3 Bake in TieredMergePolicy
Today we provide the ability to plug in MergePolicy and
we provide the once lucene ships with. We do not recommend to change
the default and even only a small number of expert users would ever touch
this. This commit removes the ancient log byte size and log doc count
merge policy providers, simplifies the MergePolicy wiring and makes the
tiered MP the one and only default. All notions of a merge policy has been
removed from the docs and should be deprecated in the previous version.

Closes #11588
2015-06-11 11:58:30 +02:00
Simon Willnauer 657d6dd9cf Remove MergeScheduler pluggability
Nobody should really plug in a different merge scheduler for elasticsearch.
This is too expert and might cause catastrophic failures.
2015-06-10 20:28:30 +02:00
Clinton Gormley 60c7e0eb91 Update merge.asciidoc
Corrected typo in merge docs
2015-06-08 16:45:59 +02:00
Martijn van Groningen 359d9ac0d0 docs: added missing ids 2015-05-29 22:45:01 +02:00
Clinton Gormley 603a0c193b Docs: More translog doc improvements 2015-05-05 22:01:58 +02:00
Clinton Gormley a60251068c Docs: Improved the translog docs 2015-05-05 21:32:52 +02:00
Simon Willnauer fe5a35b68e Merge branch 'master' into pr-10624
Conflicts:
	src/main/java/org/elasticsearch/index/shard/IndexShard.java
2015-05-05 11:46:02 +02:00
Pascal Borreli af6d890ad5 Docs: Fixed typos
Closes #10973
2015-05-05 10:38:05 +02:00
Simon Willnauer 7e5f9d5628 Merge branch 'master' into pr-10624
Conflicts:
	src/main/java/org/elasticsearch/index/engine/EngineConfig.java
	src/main/java/org/elasticsearch/index/shard/IndexShard.java
	src/test/java/org/elasticsearch/index/engine/InternalEngineTests.java
	src/test/java/org/elasticsearch/index/engine/ShadowEngineTests.java
2015-05-04 11:37:54 +02:00
Ryan Ernst 4ef9f3ca63 Mappings: Remove file based default mappings
Using files that must be specified on each node is an anti-pattern
from the API based goal of ES. This change removes the ability
to specify the default mapping with a file on each node.

closes #10620
2015-04-30 13:50:35 -07:00
Boaz Leskes d596f5cc45 Decouple recoveries from engine flush
In order to safely complete recoveries / relocations we have to keep all operation done since the recovery start at available for replay. At the moment we do so by preventing the engine from flushing and thus making sure that the operations are kept in the translog. A side effect of this is that the translog keeps on growing until the recovery is done. This is not a problem as we do need these operations but if the another recovery starts concurrently it may have an unneededly long translog to replay. Also, if we shutdown the engine for some reason at this point (like when a node is restarted)  we have to recover a long translog when we come back.

To void this, the translog is changed to be based on multiple files instead of a single one. This allows recoveries to keep hold to the files they need while allowing the engine to flush and do a lucene commit (which will create a new translog files bellow the hood).

Change highlights:
- Refactor Translog file management to allow for multiple files.
- Translog maintains a list of referenced files, both by outstanding recoveries and files containing operations not yet committed to Lucene.
- A new Translog.View concept is introduced, allowing recoveries to get a reference to all currently uncommitted translog files plus all future translog files created until the view is closed. They can use this view to iterate over operations.
- Recovery phase3 is removed. That phase was replaying operations while preventing new writes to the engine. This is unneeded as standard indexing also send all operations from the start of the recovery  to the recovering shard. Replay all ops in the view acquired in recovery start is enough to guarantee no operation is lost.
- IndexShard now creates the translog together with the engine. The translog is closed by the engine on close. ShadowIndexShards do not open the translog.
- Moved the ownership of translog fsyncing to the translog it self, changing the responsible setting to `index.translog.sync_interval` (was `index.gateway.local.sync`)

Closes #10624
2015-04-30 23:42:50 +03:00
Clinton Gormley 37ed61807f Docs: Updated the experimental annotations in the docs as follows:
* Removed the docs for `index.compound_format` and `index.compound_on_flush` - these are expert settings which should probably be removed (see https://github.com/elastic/elasticsearch/issues/10778)
* Removed the docs for `index.index_concurrency` - another expert setting
* Labelled the segments verbose output as experimental
* Marked the `compression`, `precision_threshold` and `rehash` options as experimental in the cardinality and percentile aggs
* Improved the experimental text on `significant_terms`, `execution_hint` in the terms agg, and `terminate_after` param on count and search
* Removed the experimental flag on the `geobounds` agg
* Marked the settings in the `merge` and `store` modules as experimental, rather than the modules themselves

Closes #10782
2015-04-26 18:49:15 +02:00
Lee Hinman a4f98e7400 [DOCS] Add example of setting disk threshold decider settings
Fixes #10686
2015-04-22 11:53:19 -06:00
Adrien Grand a608db122d Search: Remove the `count` search type.
This commit brings the benefits of the `count` search type to search requests
that have a `size` of 0:
 - a single round-trip to shards (no fetch phase)
 - ability to use the query cache

Since `count` now provides no benefits over `query_then_fetch`, it has been
deprecated.

Close #7630
2015-03-31 11:31:49 +02:00
Joshua Rich db2caa54cd Small grammar fix. 2015-03-17 11:27:13 -07:00
Adrien Grand 95f46f1212 Docs: Use the new experimental annotation.
We now have a very useful annotation to mark features or parameters as
experimental. Let's use it! This commit replaces some custom text warnings with
this annotation and adds this annotation to some existing features/parameters:
 - inner_hits (unreleased yet)
 - terminate_after (released in 1.4)
 - per-bucket doc count errors in the terms agg (released in 1.4)

I also tagged with this annotation settings which should either be not needed
(like the ability to evict entries from the filter cache based on time) or that
are too deep into the way that Elasticsearch works like the Directory
implementation or merge settings.

Close #9563
2015-02-05 15:29:45 +01:00
Masaru Hasegawa b4f7d26723 Fielddata: Change threshold value of fielddata.filter.frequency.max/min
Make it consider 1.0 as 100% instead of aboslute count 1.

Closes: #9327
2015-02-05 13:27:42 +09:00
Michael McCandless 3c0d2081cf Core: change default xlog size from 200 MB to 512 MB
Closes #9341
2015-01-19 15:52:29 -05:00
Michael McCandless def2d34f80 don't mention fixed throttling in the docs 2015-01-14 10:13:10 -05:00
Michael McCandless 107099affa put back fixed throttling, but off by default 2015-01-14 05:35:09 -05:00
Michael McCandless 1aad275c55 expose current CMS throttle in merge stats; fix tests, docs; also log per-merge stop/throttle/rate 2015-01-11 05:52:43 -05:00
Michael McCandless 31e6acf3f2 first cut 2015-01-10 16:38:56 -05:00
Itamar Syn-Hershko cb042cd662 Fixing typo
Closes #8713
2014-12-01 10:52:00 +01:00
Simon Willnauer 0fcb466555 [STORE] Remove `memory`/ `ram` store
The RAM store is discuraged for production usage anyway and
we don't test it in our randomized infrastructure. This commit
removes it for `2.0`
2014-11-20 14:47:19 +01:00
Israel Tsadok 7590629531 Docs: note about confusing disk threshold settings 2014-11-12 09:24:03 +01:00
Henrik Nordvik fdbb62b1ab Docs: Fix curl statements in query-cache.asciidoc
Closes #7989
2014-10-15 13:16:20 +02:00
Suyog Rao 82b16ae0ad Doc: Clarify that index.routing.allocation.total_shards_per_node means both primary and replica shards
Closes #8002
2014-10-13 18:08:23 -07:00
Adrien Grand 491a48e55b Docs: Remove the note that fielddata doesn't support filtering.
This particular note was about fielddata filtering but could cause confusion
that fields that have doc values enabled cannot be used for filtering (as in
a `filtered_query`).
2014-10-10 10:50:47 +02:00
Clinton Gormley cb00d4a542 Docs: Removed all the added/deprecated tags from 1.x 2014-09-26 21:04:42 +02:00
Lee Hinman 4185566e93 Add option to take currently relocating shards' sizes into account
When using the DiskThresholdDecider, it's possible that shards could
already be marked as relocating to the node being evaluated. This commit
adds a new setting `cluster.routing.allocation.disk.include_relocations`
which adds the size of the shards currently being relocated to this node
to the node's used disk space.

This new option defaults to `true`, however it's possible to
over-estimate the usage for a node if the relocation is already
partially complete, for instance:

A node with a 10gb shard that's 45% of the way through a relocation
would add 10gb + (.45 * 10) = 14.5gb to the node's disk usage before
examining the watermarks to see if a new shard can be allocated.

Fixes #7753
Relates to #6168
2014-09-19 12:36:51 +02:00
Nik Everett 2bc58d5f77 Docs: Fix misnamed setting
The settings is `index.merge.policy.reclaim_deletes_weight` not
`index.reclaim_deletes_weight`.

Closes #7676
2014-09-11 10:41:23 +02:00
Tanvir Alam c7d0c3ea18 Docs: fixed typo
Closes #7544
2014-09-08 10:50:59 +02:00
Alexander Reelsen f2aa4a38bc Docs: Added link to clarify meaning of filtering in fielddata context 2014-08-26 12:00:06 +02:00
Adrien Grand ea96359d82 Facets: Removal from master.
Close #7337
2014-08-21 10:34:39 +02:00