Commit Graph

993 Commits

Author SHA1 Message Date
Adrien Grand ce11e0ee6d Filter cache: add a `_cache: auto` option and make it the default.
Up to now, all filters could be cached using the `_cache` flag that could be
set to `true` or `false` and the default was set depending on the type of the
`filter`. For instance, `script` filters are not cached by default while
`terms` are. For some filters the default is more complicated: for example, date
range filters are cached unless they use `now` in a non-rounded fashion.

This commit adds a 3rd option called `auto`, which becomes the default for
all filters. So for all filters a cache wrapper will be returned, and the
decision will be made at caching time, per-segment. Here is the default logic:
 - if there is already a cache entry for this filter in the current segment,
   then return the cache entry.
 - else if the doc id set cannot iterate (e.g. script filter) then do not cache.
 - else if the doc id set is already cacheable and it has been used twice or
   more in the last 1000 filters then cache it.
 - else if the filter is costly (e.g. multi-term) and has been used twice or more
   in the last 1000 filters then cache it.
 - else if the doc id set is not cacheable and it has been used 5 times or more
   in the last 1000 filters, then load it into a cacheable set and cache it.
 - else return the uncached set.
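
As a rough illustration of the decision logic above, here is a minimal Python
sketch. It is not the actual implementation: `FilterInfo`, `AutoCachePolicy`
and their fields are hypothetical names, and it assumes that the current use
counts towards the "used N times in the last 1000 filters" thresholds.

```python
from collections import deque
from dataclasses import dataclass


@dataclass(frozen=True)
class FilterInfo:
    """Hypothetical per-filter facts the policy needs."""
    key: str             # identity of the filter, e.g. its serialized form
    can_iterate: bool    # False for filters such as `script`
    cacheable_set: bool  # doc id set is already in a cacheable form
    costly: bool         # e.g. multi-term filters


class AutoCachePolicy:
    """Toy model of `_cache: auto`; not Elasticsearch's real classes."""

    def __init__(self, history_size=1000):
        self.history = deque(maxlen=history_size)  # keys of the last N filters
        self.cache = set()                         # (segment, key) pairs

    def record_use(self, f):
        """Call once per filter use, before the per-segment decisions."""
        self.history.append(f.key)

    def decide(self, segment, f):
        """Return what happens for filter `f` on `segment`."""
        if (segment, f.key) in self.cache:
            return "cache_hit"
        if not f.can_iterate:
            return "uncached"
        uses = self.history.count(f.key)
        if (f.cacheable_set or f.costly) and uses >= 2:
            self.cache.add((segment, f.key))
            return "cached"
        if not f.cacheable_set and uses >= 5:
            # The set would first be loaded into a cacheable form.
            self.cache.add((segment, f.key))
            return "loaded_and_cached"
        return "uncached"
```

With this model, a filter whose doc id set is already cacheable gets cached on
its second use, one whose set is not cacheable on its fifth use, and a filter
that cannot iterate never does.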

So for instance geo-distance filters and script filters are going to use this
new default and are not going to be cached because of their iterators.

Similarly, date range filters are going to use this default all the time, but
it is very unlikely that those that use `now` in a non-rounded fashion will get
reused, so in practice they won't be cached.

`terms`, `range`, ... filters produce cacheable doc id sets with good iterators
so they will be cached as soon as they have been used twice.

Filters that don't produce cacheable doc id sets such as the `term` filter will
need to be used 5 times before being cached. This ensures that we don't spend
CPU iterating over all documents matching such filters unless we have good
evidence of reuse.

One last interesting point about this change is that it also applies to compound
filters. So if you keep on repeating the same `bool` filter with the same
underlying clauses, it will be cached on its own while up to now it used to
never be cached by default.

`_cache: true` has been changed to only cache on large segments, in order to not
pollute the cache since small segments should not be the bottleneck anyway.
However `_cache: false` still has the same semantics.

Close #8449
2014-12-18 15:51:36 +01:00
Michael McCandless 242e631e95 Core: ignore known idle threads by default in /_nodes/hot_threads
Add a new ignore_idle_threads boolean option (default true) to
/_nodes/hot_threads, to filter out threads in known idle places like
waiting on a socket select or on pulling the next task from an empty
queue.
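
A minimal Python sketch of the filtering idea, under stated assumptions: the
frame names in `KNOWN_IDLE_FRAMES` are illustrative examples only (not the list
used by this commit), and `hot_threads` is a hypothetical helper, not the API
itself.

```python
# Illustrative frame names only; the real list lives in the Elasticsearch source.
KNOWN_IDLE_FRAMES = {
    "sun.nio.ch.EPollArrayWrapper.epollWait",           # waiting on a socket select
    "java.util.concurrent.ThreadPoolExecutor.getTask",  # waiting for the next task
}


def is_idle(stack):
    """A thread counts as idle if any frame of its stack is a known idle place."""
    return any(frame in KNOWN_IDLE_FRAMES for frame in stack)


def hot_threads(threads, ignore_idle_threads=True):
    """Filter a list of (thread_name, stack_frames) tuples.

    Idle threads are dropped unless ignore_idle_threads is False.
    """
    if not ignore_idle_threads:
        return list(threads)
    return [(name, stack) for name, stack in threads if not is_idle(stack)]
```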

Closes #8985

Closes #8908
2014-12-17 11:59:31 -05:00
Yasir Bamarni 5059d6fe1c Update percolate.asciidoc
wrong type used in the -GET request

Closes #8942
2014-12-17 14:05:27 +01:00
Pablo Díaz-López adb1a5b43b Update getting-started.asciidoc
Missing -X flag in the curl template

Closes #8977
2014-12-17 14:03:38 +01:00
Peter Johnson a.k.a. insertcoffee 4b5e6b2de0 [docs] pedantry
Closes #8982
2014-12-17 13:46:39 +01:00
Nicholas Knize ac0e37449e Adding unit test for self-intersecting polygons. Relevant to #7751 even/odd discussion
Updating documentation to describe polygon ambiguity and vertex ordering.
2014-12-16 10:54:39 -06:00
Ryan Ernst 37287284e6 Settings: Remove `mapping.date.round_ceil` setting for date math parsing
The setting `mapping.date.round_ceil` (and the undocumented setting
`index.mapping.date.parse_upper_inclusive`) affect how date ranges using
`lte` are parsed. In #8556 the semantics of date rounding were
solidified, eliminating the need to have different parsing functions
depending on whether the date is inclusive or exclusive.

This change removes these legacy settings and improves the tests
for the date math parser (now at 100% coverage!). It also removes the
unnecessary function `DateMathParser.parseTimeZone` for which
the existing `DateTimeZone.forID` handles all use cases.

Any user previously using these settings can refer to the changed
semantics and change their query accordingly. This is a breaking change
because even dates without datemath previously used the different
parsing functions depending on context.

closes #8598
closes #8889
2014-12-15 13:13:45 -08:00
Timothy Perisho ceafde41e9 Docs: typo on "frequent"
I replaced "high frequent terms" with "high frequency terms" and "low frequent terms" with "low frequency terms".

Alternatively, we could write, "highly frequent terms" and "minimally frequent terms" (or just "rare terms").

Closes #8962
2014-12-15 19:59:50 +01:00
Clinton Gormley fcb83055de Update repositories.asciidoc
Update formatting of PGP key
2014-12-15 18:04:17 +01:00
Simon Willnauer 1247774ff1 Remove Gateway abstraction
We have only had a single gateway since ES 1.3. There is no need to keep all
these abstractions and nested packages. We can fold most of it into simpler
structures.
2014-12-15 15:53:02 +01:00
spapin ad747ba67f Docs: fix a typo in cluster stats documentation example
Closes #8898
2014-12-15 14:14:38 +01:00
Ayush 23dbecf3e7 Update percolate.asciidoc
Updating the `associated` spelling

Closes #8907
2014-12-15 14:12:03 +01:00
Alexander Reelsen 544ef8cb17 Packaging: Add java7/8 java-package paths to debian init script
If you use the java-package tool to create java packages, those
paths also should be added to the debian init script.

Also updated the docs to say that it is OK to install Java 8.

Closes #7383
2014-12-11 16:15:00 +01:00
Peter Fabian Mitchell b2bab05c29 HTTP: Add 'http.publish_port' setting to the HTTP module
This change adds a 'http.publish_port' setting to the HTTP module to configure
the port which HTTP clients should use when communicating with the node. This
is useful when running on a bridged network interface or when running behind
a proxy or firewall.

Closes #8807
Closes #8137
2014-12-11 16:10:07 +01:00
Robert Muir a2ffe494ae [core] add best_compression option for Lucene 5.0
Upgrades Lucene to the latest version and exposes the BEST_COMPRESSION
parameter that Lucene now supports (with backwards compatibility, etc.).
This option uses deflate, tuned for highly compressible data.

index.codec::
The default value compresses stored data with LZ4 compression, but
this can be set to best_compression for a higher compression ratio,
at the expense of slower stored fields performance.
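
For illustration, a small Python sketch that builds an index settings body
using this option. Only the `index.codec` key and the `best_compression` value
come from this change; the helper name `create_index_body` and the surrounding
`settings` wrapper are assumptions about how a client might send it.

```python
import json


def create_index_body(best_compression=False):
    """Build a (hypothetical) create-index request body.

    When best_compression is False, index.codec is left unset so the
    default (LZ4-compressed stored fields) applies.
    """
    settings = {}
    if best_compression:
        settings["index.codec"] = "best_compression"
    return {"settings": settings}


body = create_index_body(best_compression=True)
print(json.dumps(body, indent=2))
```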

IMO it's safest to implement as a named codec here, because ES already
has logic to handle this correctly, and because it's unrealistic to have
a plethora of options to Lucene's default codec... we are practically
limited in Lucene to what we can support with back compat, so I don't
think we should overengineer this and add additional unnecessary plumbing.

See also:
https://issues.apache.org/jira/browse/LUCENE-5914
https://issues.apache.org/jira/browse/LUCENE-6089
https://issues.apache.org/jira/browse/LUCENE-6090
https://issues.apache.org/jira/browse/LUCENE-6100

Closes #8863
2014-12-10 22:13:09 -05:00
Alexander Clausen 633905161a Docs: use https to download the gpg public key
Closes #8818
2014-12-10 18:14:07 +01:00
Adam Menges 3a3030e217 Docs: Fix the wording for inner hits a bit
Closes #8747
2014-12-09 13:36:26 +01:00
Ashraf Sarhan 24f8807cb5 Docs: Update repositories.asciidoc
1. Enable the repository using "add-apt-repository" to avoid this error "No command 'deb' found".
2. Add "sudo" to the update and install commands.

Closes #8691
2014-12-09 13:23:16 +01:00
Kevin Kluge 63ac4614f4 docs: add pgp key to repositories page
2014-12-08 15:41:09 +01:00
Jun Ohtani d78d2ff93d Docs: add randomizedtesting-runner to testing-framework.asciidoc
Close #8450
2014-12-07 01:30:58 +09:00
Adrien Grand 344bbf2ced Docs: Add instructions to start elasticsearch on bootup on RHEL/Fedora.
2014-12-05 11:14:13 +01:00
tristanbob 0a09f1ea13 Docs: Added a command to start elasticsearch on bootup on Debian.
Close #8600
2014-12-05 11:03:32 +01:00
David Pilato d2a2d1bb53 java: QueryBuilders cleanup: remove deprecated
Related to #8667:

Some QueryBuilders have been deprecated in 1.x branches. We removed them in 2.0.

Removed
-------

* `textPhrase(...)`
* `textPhrasePrefix(...)`
* `textPhrasePrefixQuery(...)`
* `filtered(...)`
* `inQuery(...)`
* `commonTerms(...)`
* `queryString(...)`
* `simpleQueryString(...)`

Closes #8721.
2014-12-03 16:07:34 +01:00
Peter Johnson a.k.a. insertcoffee ac71f1b70a [docs] formatting and general pedantry
I'm not sure if the `distance-units` section is totally clear: when using the 'Geohash Cell Filter' and omitting a unit, the default is to interpret the integer as the 'length of the geohash prefix', not to default it to 'meter'. Maybe I'm being pedantic.

Closes #8744
2014-12-02 19:23:48 +01:00
John Michael Luy 01ef80a33d Update range-filter.asciidoc
Closes #8741
2014-12-02 18:00:38 +01:00
John Michael Luy f20f6ffe22 Docs: Update range-query.asciidoc
Closes #8740
2014-12-02 12:55:44 +01:00
Martijn van Groningen d7e224da04 Added `inner_hits` feature that allows including nested hits.
Inner hits allow embedding the nested inner objects, child documents or parent document that contributed to the matching of the returned search hit, which would otherwise be hidden.

Closes #8153
Closes #3022
Closes #3152
2014-12-02 12:01:01 +01:00
Itamar Syn-Hershko cb042cd662 Fixing typo
Closes #8713
2014-12-01 10:52:00 +01:00
Dan Tuffery 3b5fa9075a Docs: Grammar correction
Closes #8702
2014-11-29 14:06:04 +01:00
Clinton Gormley 88e06cba80 Update daterange-aggregation.asciidoc
Clarified the date-math expressions on date range aggregations

Closes #8703
2014-11-28 16:53:33 +01:00
Alex Ksikes 256712640f MLT Query: Support for ignore docs
Adds an `ignore_like` parameter to the MLT Query, which simply tells the
algorithm to skip all the terms from the given documents. This could be useful
in order to better guide nearest neighbor search by telling the algorithm to
never explore the space spanned by the given `ignore_like` docs. In essence we
are interested in the characteristics of a given item, but not in those of the
items provided by `ignore_like`, thereby forcing the algorithm to go deeper in
its selection of terms. Note that this is different from simply performing a
must not boolean query on the unliked items. The syntax is exactly the same as
the `like` parameter.
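
A toy Python sketch of the term-skipping idea, under loud assumptions: the real
MLT query selects terms from the Lucene index using its own scoring, whereas
this sketch uses whitespace tokenization and raw term counts, and
`select_terms` is a hypothetical helper.

```python
from collections import Counter


def select_terms(like_docs, ignore_docs, max_terms=3):
    """Pick the most frequent terms of like_docs, skipping every term
    that occurs in any of the ignore_docs."""
    ignored = {term for doc in ignore_docs for term in doc.lower().split()}
    counts = Counter(
        term
        for doc in like_docs
        for term in doc.lower().split()
        if term not in ignored
    )
    return [term for term, _ in counts.most_common(max_terms)]


like = ["red fast car", "red fast bike"]
print(select_terms(like, ignore_docs=["red truck"], max_terms=1))
```

Because `red` appears in the ignored document it is never selected, so the
selection has to fall back to the remaining terms of the liked documents.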

Closes #8674
2014-11-28 14:48:43 +01:00
pmamat 9e2eaeece4 Docs: Additional info about _score calculation
Description taken from http://www.elasticsearch.org/guide/en/elasticsearch/guide/current/multi-query-strings.html / 110_Multi_Field_Search/05_Multiple_query_strings.asciidoc

Closes #8635
2014-11-28 13:54:45 +01:00
Britta Weber 59507cf793 function_score: match only document with score above custom score threshold
function_score matched each document regardless of the computed score.
This commit adds a query parameter `min_score` (-Float.MAX_VALUE default).
Documents that have a score lower than this threshold will not be matched.
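
A minimal Python sketch of the thresholding described above. The helper
`matching_docs` is hypothetical, and `-sys.float_info.max` merely stands in for
Java's `-Float.MAX_VALUE` (Python floats are doubles).

```python
import sys

# Stand-in for the -Float.MAX_VALUE default: effectively "no threshold".
NO_MIN_SCORE = -sys.float_info.max


def matching_docs(scored_docs, min_score=NO_MIN_SCORE):
    """Keep only (doc_id, score) pairs whose score is not below min_score."""
    return [(doc, score) for doc, score in scored_docs if score >= min_score]


docs = [("a", 0.2), ("b", 1.5), ("c", -3.0)]
print(matching_docs(docs))                 # default: every document matches
print(matching_docs(docs, min_score=1.0))  # only documents scoring >= 1.0
```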

closes #6952
2014-11-28 12:35:26 +01:00
David Pilato 43a1435d3b [Docs] fix consistency between examples
2014-11-27 20:29:34 +01:00
David Pilato 40f0e07db3 [Docs] Fix missing new line
2014-11-27 19:39:12 +01:00
Britta Weber f00b431c18 [docs] explain default settings for parameters of decay functions
relates to #8624
2014-11-27 19:18:55 +01:00
David Pilato da27c2104a [Docs] Fix missing comma in mapping
2014-11-27 11:03:19 +01:00
Clinton Gormley 818b9b7563 Updated docs to use v1.4.1 as current
2014-11-26 17:18:37 +01:00
Sebastian Ziebell 3a6c6f4b26 Docs: Adds documentation for indices.exists_template
Closes: #8657
2014-11-25 19:36:01 +01:00
tristanbob 807f363d6d Added note that ES packages automatically change vm.max_map_count
Closes #8601
2014-11-25 18:25:46 +01:00
Matt Hughes afba977e80 Docs: Added swift openstack repository
Closes #8583
2014-11-25 13:49:15 +01:00
David Haney 2c429452e9 Typo: changed "5% or the real words" to "5% of the real words"
Closes #8582
2014-11-25 13:15:33 +01:00
Michael McCandless 856b294441 Core: let Lucene kick off merges
Today, Elasticsearch has a separate merge thread pool checking once
per second (by default) if any merges are necessary, but this is no
longer necessary since we can and do now tell Lucene's
ConcurrentMergeScheduler never to "hard pause" threads when merges
fall behind, since we do our own index throttling.

This change goes back to letting Lucene launch merges as needed, and
removes these two expert settings:

  index.merge.force_async_merge
  index.merge.async_interval

Now merges kick off immediately instead of waiting up to 1 second
before running.

Closes #8643
2014-11-25 04:13:57 -05:00
Martijn van Groningen 1d7cdd7d22 Applied PR, changed the way defaults are handled and updated the docs.
Closes #4452
2014-11-24 13:32:41 +01:00
Lee Hinman 45408844e7 Remove NoneGateway, NoneGatewayAllocator, & NoneGatewayModule
Always use the LocalGateway* equivalents

We already check in the LocalGateway whether a node is a client node, or
is not master-eligible, and skip writing the state there. This allows us
to remove this code that was previously used only for tribe nodes (which
are not master eligible anyway and wouldn't write state) and in
tests (which can shake more bugs out)
2014-11-24 12:22:05 +01:00
dw ad408eee85 Docs: Reword note regarding _source for accuracy
Previously it suggested _source was always present, when that is not the case.

Closes #8491
2014-11-24 12:19:44 +01:00
Laurent Broudoux feb465f26f Docs: Update plugins.asciidoc on river plugins section
Adding links to Amazon S3 and Google Drive river plugins

Closes #8544
2014-11-24 12:15:12 +01:00
Michael McCandless dfb6d6081c Core: upgrade to current Lucene 5.0.0 snapshot
Elasticsearch no longer unlocks the Lucene index on startup (this was
dangerous, and could possibly lead to corruption).

Added the new serbian_normalization TokenFilter from Lucene.

NoLockFactory is no longer supported (index.store.fs.fs_lock = none),
and if you have a typo in your fs_lock you'll now hit a StoreException
instead of silently using NoLockFactory.

Closes #8588
2014-11-24 05:08:42 -05:00
Adrien Grand 8346e92ebb Core: Fix script fields to be returned as a multivalued field when they produce a list.
This change is essentially the same as #3015 but on script fields.

Close #8592
2014-11-24 09:41:16 +01:00
mdzor bc52ccfd33 Docs: Update update-settings.asciidoc
Inconsistent indentation

Closes #8525
2014-11-23 14:45:56 +01:00