119 Commits

Author SHA1 Message Date
Clinton Gormley
e7f1aa4f4f Documented the query cache module
Related to #7161 and #7167
2014-08-06 11:55:11 +02:00
Clinton Gormley
4b0a89d4fb Update translog.asciidoc
Documented `index.gateway.local.sync`
2014-07-31 14:06:24 +02:00
Lee Hinman
6abe4c951d Add HierarchyCircuitBreakerService
Adds a breaker for request BigArrays, which are used for parent/child
queries as well as some aggregations. Certain operations like Netty HTTP
responses and transport responses increment the breaker, but will not
trip.

This also changes the output of the nodes' stats endpoint to show the
parent breaker as well as the fielddata and request breakers.

There are a number of new settings for breakers now:

`indices.breaker.total.limit`: starting limit for all memory-use breaker,
defaults to 70%

`indices.breaker.fielddata.limit`: starting limit for fielddata breaker,
defaults to 60%
`indices.breaker.fielddata.overhead`: overhead for fielddata breaker
estimations, defaults to 1.03

(the fielddata breaker settings also use the backwards-compatible
setting `indices.fielddata.breaker.limit` and
`indices.fielddata.breaker.overhead`)

`indices.breaker.request.limit`: starting limit for request breaker,
defaults to 40%
`indices.breaker.request.overhead`: request breaker estimation overhead,
defaults to 1.0

The breaker service infrastructure is now generic and opens the path to
adding additional circuit breakers in the future.

Fixes #6129

Conflicts:
	src/main/java/org/elasticsearch/index/fielddata/IndexFieldData.java
	src/main/java/org/elasticsearch/index/fielddata/IndexFieldDataService.java
	src/main/java/org/elasticsearch/index/fielddata/RamAccountingTermsEnum.java
	src/main/java/org/elasticsearch/index/fielddata/ordinals/GlobalOrdinalsBuilder.java
	src/main/java/org/elasticsearch/index/fielddata/ordinals/InternalGlobalOrdinalsBuilder.java
	src/main/java/org/elasticsearch/index/fielddata/plain/AbstractIndexOrdinalsFieldData.java
	src/main/java/org/elasticsearch/index/fielddata/plain/DisabledIndexFieldData.java
	src/main/java/org/elasticsearch/index/fielddata/plain/IndexIndexFieldData.java
	src/main/java/org/elasticsearch/index/fielddata/plain/NonEstimatingEstimator.java
	src/main/java/org/elasticsearch/index/fielddata/plain/PackedArrayIndexFieldData.java
	src/main/java/org/elasticsearch/index/fielddata/plain/ParentChildIndexFieldData.java
	src/main/java/org/elasticsearch/index/fielddata/plain/SortedSetDVOrdinalsIndexFieldData.java
	src/main/java/org/elasticsearch/node/internal/InternalNode.java
	src/test/java/org/elasticsearch/index/aliases/IndexAliasesServiceTests.java
	src/test/java/org/elasticsearch/index/codec/CodecTests.java
	src/test/java/org/elasticsearch/index/fielddata/AbstractFieldDataTests.java
	src/test/java/org/elasticsearch/index/fielddata/IndexFieldDataServiceTests.java
	src/test/java/org/elasticsearch/index/mapper/MapperTestUtils.java
	src/test/java/org/elasticsearch/index/query/IndexQueryParserFilterCachingTests.java
	src/test/java/org/elasticsearch/index/query/SimpleIndexQueryParserTests.java
	src/test/java/org/elasticsearch/index/query/guice/IndexQueryParserModuleTests.java
	src/test/java/org/elasticsearch/index/search/FieldDataTermsFilterTests.java
	src/test/java/org/elasticsearch/index/search/child/ChildrenConstantScoreQueryTests.java
	src/test/java/org/elasticsearch/index/similarity/SimilarityTests.java
2014-07-28 11:27:33 +02:00
mikemccand
96ecec34d1 Docs: fix documentation for bloom filter defaults 2014-07-27 18:39:29 -04:00
Simon Willnauer
5bfea56457 [DOCS] move all coming tags to added in master 2014-07-23 16:37:19 +02:00
Peter Johnson @insertcoffee
9a4abc2620 Docs: typo
example fails in bash

Closes #6977
2014-07-23 12:43:43 +02:00
mikemccand
cc4d7c6272 Core: don't load bloom filters by default
This change just changes the default for index.codec.bloom.load to
false: with recent performance improvements to ID lookup, such as
#6298, bloom filters don't give much of a performance gain anymore,
and they can consume non-trivial RAM when there are many tiny
documents.

For now, we still index the bloom filters, so if a given app wants
them back, it can just update the index.codec.bloom.load to true.

Closes #6959
2014-07-23 05:58:41 -04:00
mikemccand
63cab559e3 Docs: explain that SerialMergeScheduler just maps to CMS for back compat
Closes #6878
2014-07-15 11:38:43 -04:00
mikemccand
6c78147f5f Docs: remove orphan comma 2014-07-11 08:26:08 -04:00
mikemccand
b4e80999a7 Docs: fix merge docs to match the code (the max_thread_count default is 'aggressive' (favor SSDs)) 2014-07-11 07:00:57 -04:00
Simon Willnauer
154bd0309c [DOCS] Fix typo in reference 2014-07-10 08:47:18 +02:00
Simon Willnauer
d82a434d10 [STORE] Make a hybrid directory default using mmapfs and niofs
`mmapfs` is really good for random access but can have sideeffects if
memory maps are large depending on the operating system etc. A hybrid
solution where only selected files are actually memory mapped but others
mostly consumed sequentially brings the best of both worlds and
minimizes the memory map impact.
This commit mmaps only the `dvd` and `tim` file for fast random access
on docvalues and term dictionaries.

Closes #6636
2014-07-10 00:01:43 +02:00
Clinton Gormley
d3f8c66e26 Updated cache.asciidoc
The index level filter cache was removed a long time ago

Closes #6455
2014-07-04 14:26:20 +02:00
Ian Babrou
698eb7de9b Fixed JSON in fielddata docs 2014-07-01 12:53:10 +02:00
Adrien Grand
7a34702925 [DOCS] Clarify the trade-off of the disk doc values format. 2014-06-13 13:24:53 +02:00
Lee Hinman
3a3f81d59b Enable DiskThresholdDecider by default, change default limits to 85/90%
Fixes #6200
Fixes #6201
2014-06-12 16:35:29 +02:00
Clinton Gormley
c41e63c2f9 Docs: Updated index-modules/store and setup/configuration
Explain how to set different index storage types, and
added the vm settings required to stop mmapfs from running
out of memory

Closes #6327
2014-06-12 13:56:06 +02:00
Israel Tsadok
1a58016ea1 [DOCS] Add special attributes for indices allocation filtering 2014-06-05 10:38:07 +02:00
Simon Willnauer
9d5507047f Update Documentation Feature Flags [1.2.0] 2014-05-22 15:06:42 +02:00
Simon Willnauer
85a0b76dbb Upgrade to Lucene 4.8.1
This commit upgrades to the latest Lucene 4.8.1 release including the
following bugfixes:

 * An IndexThrottle now kicks in when merges start falling behind
   limiting index threads to 1 until merges caught up. Closes #6066
 * RateLimiter now kicks in at the configured rate where previously
   the limiter was limiting at ~8MB/sec almost all the time. Closes #6018
2014-05-19 20:47:55 +02:00
mikemccand
00fcf4d560 #6081: set IO throttling back to 20 MB/sec now that #6018 is fixed 2014-05-12 14:42:26 -04:00
mikemccand
b6ae7fbadb #5882: fix docs 2014-05-12 14:16:27 -04:00
mikemccand
254ebc2f88 #6120 Remove SerialMergeScheduler (master only)
It's dangerous to expose SerialMergeScheduler as an option: since it only allows one merge at a time, it can easily cause merging to fall behind.

Closes #6120
2014-05-12 14:06:20 -04:00
Ivan Brusic
bac0627c5e Update fielddata.asciidoc
Spelling correction
2014-05-08 10:59:24 +02:00
Ivan Brusic
59e0c34cdb Update fielddata.asciidoc
Fixed default value for circuit breaker
2014-05-08 10:58:10 +02:00
mikemccand
9daaae27b3 clarify that CMS defaults change is coming in 1.2 2014-05-07 13:49:54 -04:00
Adrien Grand
fc78dd2f13 [DOC] Fix default values for filter cache size and field data circuit breaker.
Relates to #5990
2014-05-06 10:13:05 +02:00
mikemccand
07563379dc fix docs for merging and throttling 2014-05-05 16:22:00 -04:00
Simon Willnauer
b4f0603169 Change default merge throttling to 50MB / sec
The current setting of 20MB/sec seems to be too conservative given
the capabilities of modern hardware. Even on cloud infrastructure this
seems to be too lowish. A 50MB default should provide better out of the box
performance
2014-04-22 21:08:40 +02:00
Simon Willnauer
1cf62e7782 Use unlimited flush_threshold_ops for translog
Currently we use 5k operations as a flush threshold. Indexing 5k documents
per second is rather common which would cause the index to be committed on
the lucene level each time the flush logic runs which is 5 seconds by default.
We should rather use a size based threshold similar to the lucene index writer
that doesn't cause such agressive commits which can slow down indexing significantly
especially since they cause the underlying devices to fsync their data.
2014-04-22 16:37:07 +02:00
Christoph Frick
e3e631eca5 Update allocation.asciidoc 2014-04-17 14:42:58 +02:00
Kouhei Sutou
de59cde926 Remove garbage 2014-04-15 17:57:25 +02:00
Simon Willnauer
9898eed30c [DOCS] Update merge docs to reflect the max_merge_at_once property 2014-04-15 16:42:23 +02:00
Simon Willnauer
320a206352 Switch back to ConcurrentMergeScheduler
Load tests showed that SerialMS has problems to keep up with
the merges under high load. We should switch back to CMS
until we have a better story to balance merge
threads / efforts across shards on a single node.

Closes #5817
2014-04-15 16:42:23 +02:00
Kevin Wang
ecab74fe6c add lucene language model similarities (Dirichlet & JelinekMercer) 2014-04-07 10:48:03 +02:00
Martijn van Groningen
ade1d0ef57 Added global ordinals (unique incremental numbering for terms) to fielddata.
Added a terms aggregation implementations that work on global ordinals, which is also the default.

Closes #5672
2014-04-07 11:06:41 +07:00
Lee Hinman
211f740100 Add getAsRatio to Settings class, allow DiskThresholdDecider to take percentages
Adds new RatioValue class that parses ratios between 0-100% expressed in
either floating-point (0.13) or percentage (51.12%) notation.

Closes #5690
2014-04-04 13:19:35 -06:00
Lee Hinman
c3089701f2 [DOCS] remove extraneous ` from cache page 2014-04-02 16:07:00 -06:00
Shay Banon
0ef3b03be1 Move to use serial merge schedule by default
Today, we use ConcurrentMergeScheduler, and this can be painful since it is concurrent on a shard level, with a max of 3 threads doing concurrent merges. If there are several shards being indexed, then there will be a minor explosion of threads trying to do merges, all being throttled by our merge throttling.
Moving to serial merge scheduler will still maintain concurrency of merges across shards, as we have the merge thread pool that schedules those merges. It will just be a serial one on a specific shard.
Also, on serial merge scheduler, we now have a limit of how many merges it will do at one go, so it will let other shards get their fair chance of merging. We use the pending merges on IW to check if merges are needed or not for it.
Note, that if a merge is happening, it will not block due to a sync on the maybeMerge call at indexing (flush) time, since we wrap our merge scheduler with the EnabledMergeScheduler, where maybeMerge is not activated during indexing, only with explicit calls to IW#maybeMerge (see Merges).
closes #5447
2014-03-18 13:17:00 +01:00
Konrad Feldmeier
d7b0d547d4 [DOCS] Multiple doc fixes
Closes #5047
2014-03-07 14:24:58 +01:00
Oleg Anashkin
eb0e1aa38f Fix typo in similarity docs
DRF similarity -> DFR similarity
2014-02-13 07:45:30 -08:00
Clinton Gormley
93930d6dc7 Removed 0.90.* deprecation and addition notifications
Closes #5052
2014-02-07 20:52:49 +01:00
Shay Banon
d36e345f1f fix docs to reflect removal of byte buffer memory 2014-02-03 09:54:30 -05:00
Brusic
d9b71a8083 [DOCS] various docs fixes
Removed unused misc.asciidoc file
Added plugins directory to directory layout
Fixed transport.tcp.connect_timeout value to match the code found in NetworkService.TcpSettings
Clarified that phrase query does not preserve order of terms
Clarified merge page
Added instructions on how to build documentation to docs/README
2014-01-23 10:52:13 +01:00
Clinton Gormley
faddd66e87 [DOCS] Added breaking changes in 1.0 2014-01-15 17:50:24 +01:00
Lee Hinman
3062e59f51 [DOCS] Fix default setting in circuit breaker documentation 2014-01-15 07:05:05 -07:00
Clinton Gormley
f8a427e266 [DOCS] Moved fielddata circuit breaker higher up the page 2014-01-15 14:00:08 +01:00
Shay Banon
4aa5ef139e randomize flush interval so multiple shards won't flush at the sam time
- also, allow to update interval using update settings on an index
2014-01-07 19:58:28 +01:00
Simon Willnauer
fa16969360 Cleanup comments and class names s/ElasticSearch/Elasticsearch
* Clean up s/ElasticSearch/Elasticsearch on docs/*
 * Clean up s/ElasticSearch/Elasticsearch on src/* bin/* & pom.xml
 * Clean up s/ElasticSearch/Elasticsearch on NOTICE.txt and README.textile

Closes #4634
2014-01-07 11:21:51 +01:00
Lee Hinman
47607a69a1 Default the circuit breaker limit to 80% of the maximum JVM heap 2014-01-03 16:21:55 -07:00