698 Commits

Author SHA1 Message Date
Nik Everett da5fb34163 Mappings: Add transform to document before index.
Closes #6566
2014-07-15 18:40:46 +02:00
mikemccand 63cab559e3 Docs: explain that SerialMergeScheduler just maps to CMS for back compat
Closes #6878
2014-07-15 11:38:43 -04:00
Ryan Ernst 64ab22816c Scripting: Add script engine for lucene expressions.
These are javascript expressions, which can only access numeric
fielddata, parameters, and _score. They can only be used for searches (not document updates).

closes #6818
2014-07-15 07:49:01 -07:00
Areek Zillur d0d1b98d23 Stats: Expose IndexWriter and VersionMap RAM usage to ShardStats and _cat endpoint
This commit adds the RAM usage of IndexWriter and VersionMap

Closes #6483
2014-07-14 19:46:12 -04:00
Areek Zillur 76343899ea Phrase Suggester: Add collate option to PhraseSuggester
The newly added collate option will let the user provide a template query/filter which will be executed for every phrase suggestion generated, to ensure that the suggestion matches at least one document for the filter/query.
The user can also add a routing preference `preference` to route the collate query/filter and additional `params` to inject into the collate template.

Closes #3482
2014-07-14 16:07:52 -04:00
Malte Schirnacher 647a2a64a1 Docs: Update query-string-syntax.asciidoc
Closes #6853
2014-07-14 16:35:17 +02:00
Clinton Gormley 6e70edb0a4 Analysis: Improve Hunspell error messages
The Hunspell service would throw a confusing error message if more than
one affix file was present.  This commit distinguishes between the two
error cases: where there are no affix files and when there are too many
affix files.

Also implements lazy dictionary loading, which was used in the tests
but not implemented.

Closes #6850
2014-07-14 12:13:32 +02:00
Britta Weber 74927adced significant terms: infrastructure for easily changing the significance heuristic
This commit adds the infrastructure to allow plugging in different
measures for computing the significance of a term.
Significance measures can be provided externally by overriding

- SignificanceHeuristic
- SignificanceHeuristicBuilder
- SignificanceHeuristicParser

closes #6561
2014-07-14 11:00:50 +02:00
Igor Motov 60b317caa4 Snapshot/Restore: Add ability to restore indices without their aliases
Closes #6457
2014-07-13 17:52:41 +09:00
Florian Hopf 3689f67a76 Docs: Fixed invalid word count in geodistance agg doc
Closes #6838
2014-07-11 18:35:36 +02:00
mikemccand 6c78147f5f Docs: remove orphan comma 2014-07-11 08:26:08 -04:00
mikemccand b4e80999a7 Docs: fix merge docs to match the code (the max_thread_count default is 'aggressive' (favor SSDs)) 2014-07-11 07:00:57 -04:00
Boaz Leskes f480969503 [Gateway] set a default of 5m to `recover_after_time` when any of the `expected*Nodes` is set
The `recover_after_time` tells the gateway to wait before starting recovery from disk. The goal here is to allow for more nodes to join the cluster and thus not start potentially unneeded replications. The `expectedNodes` setting (and friends) tells the gateway when it can start recovering even if the `recover_after_time` has not yet elapsed. However, `expectedNodes` is useless if one doesn't set `recover_after_time`. This commit changes that by setting a sensible default of 5m for `recover_after_time` *if* an `expectedNodes` setting is present.

Closes #6742
2014-07-11 11:28:45 +02:00
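
The defaulting rule described in f480969503 can be sketched as follows. This is a minimal illustration, not the gateway's actual code: the function name, the plain-dict settings, and the exact key names (`expected_nodes`, `expected_master_nodes`, `expected_data_nodes`) are assumptions standing in for the `expected*Nodes` settings the commit mentions.

```python
# Hypothetical key names standing in for the `expected*Nodes` settings.
EXPECTED_KEYS = ("expected_nodes", "expected_master_nodes", "expected_data_nodes")
DEFAULT_WHEN_EXPECTED_SET = "5m"


def resolve_recover_after_time(settings):
    """Return the effective recover_after_time, or None if no wait applies."""
    explicit = settings.get("recover_after_time")
    if explicit is not None:
        # An explicit value always wins over the implied default.
        return explicit
    if any(key in settings for key in EXPECTED_KEYS):
        # An expected-nodes setting is only useful with a wait, so imply 5m.
        return DEFAULT_WHEN_EXPECTED_SET
    return None
```

With this sketch, a configuration that sets only `expected_nodes` resolves to `"5m"`, while an empty configuration resolves to `None`.
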
Iulia Pasov eed3513c37 Docs: Update plugins.asciidoc to fix typo
Changed the name of the European Environment Agency (from European Environmental Agency)

Closes #6807
2014-07-10 14:04:26 +02:00
Simon Willnauer 154bd0309c [DOCS] Fix typo in reference 2014-07-10 08:47:18 +02:00
Simon Willnauer d82a434d10 [STORE] Make a hybrid directory default using `mmapfs` and `niofs`
`mmapfs` is really good for random access but can have side effects if
memory maps are large depending on the operating system etc. A hybrid
solution where only selected files are actually memory mapped but others
mostly consumed sequentially brings the best of both worlds and
minimizes the memory map impact.
This commit mmaps only the `dvd` and `tim` files for fast random access
on docvalues and term dictionaries.

Closes #6636
2014-07-10 00:01:43 +02:00
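
The per-file choice described in d82a434d10 can be illustrated roughly as below. This is a sketch of the idea only, not the store implementation: the function name and the example file names are hypothetical, and the real directory wraps Lucene classes rather than returning a string.

```python
import os

# Extensions the commit message says are memory mapped (doc values, term dictionary).
MMAP_EXTENSIONS = {"dvd", "tim"}


def choose_store_type(file_name):
    """Pick `mmapfs` for random-access files and `niofs` for everything else."""
    extension = os.path.splitext(file_name)[1].lstrip(".")
    return "mmapfs" if extension in MMAP_EXTENSIONS else "niofs"
```

For example, a hypothetical segment file `_0.tim` would be memory mapped, while `_0.fdt` would be read through `niofs`.
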
Shay Banon 8910e09beb Disable JSONP by default
By default, disable the option to use JSONP in our REST layer
closes #6795
2014-07-09 21:17:17 +02:00
Iulia Pasov a79d0744d3 Docs: Update plugins.asciidoc
Closes #6683
2014-07-09 16:15:59 +02:00
Clinton Gormley b6baa4be4a Update preference.asciidoc
Clarify that `preference` is a query string parameter only
and provide an example.
2014-07-09 11:13:17 +02:00
Clinton Gormley 6c30ad1ce6 Docs: Improved the docs for nested mapping
Closes #1643
2014-07-08 15:54:11 +02:00
Clinton Gormley feb81e228b Docs: Rewrote the scroll/scan docs
Closes #6774
2014-07-08 11:54:53 +02:00
Andrii Gakhov 80321d89d9 Docs: Update histogram-aggregation.asciidoc
filter in a filtered query should be under "filter" key

Closes #6738
2014-07-07 10:44:11 +02:00
Carsten Brandt bd4699da7e Docs: fixed a typo in the docs
Closes: #6718
2014-07-07 10:41:36 +02:00
Clinton Gormley e4baa56f4b Docs: Language analyzers
Clarified the use of stem_exclusion and the keyword_marker
token filter

Closes #6613
2014-07-07 10:06:18 +02:00
Clinton Gormley 54790eea10 Update lang-analyzer.asciidoc
Clarified the use of the `stem_exclusion` token filter.

Closes #6613
2014-07-04 17:50:43 +02:00
Shinsuke Sugaya 4bddb4e346 Update plugins.asciidoc 2014-07-05 00:44:02 +09:00
Shikhar Bhushan 1e894111b0 Docs: Link to eskka discovery plugin from doc
Closes #6721
2014-07-04 17:06:51 +02:00
Clinton Gormley d3f8c66e26 Updated cache.asciidoc
The index level filter cache was removed a long time ago

Closes #6455
2014-07-04 14:26:20 +02:00
David Pilato 162c62dbcc [DOCS] Add information regarding _type parameter requirement for _mget
Change ID to `[[mget-type]]`

Closes #6670.
2014-07-03 15:38:06 +02:00
David Pilato de48d7f94c [DOCS] Add information regarding _type parameter requirement for _mget
Closes #6670.
2014-07-03 15:23:35 +02:00
Jun Ohtani 0c6a859357 Docs: fixed ICU plugin documentation
add ICU Normalization CharFilter to docs

Closes #6711
2014-07-03 15:21:51 +02:00
Mikhail Korobov 955473f475 Docs: unescape regexes in Pattern Tokenizer docs
Currently regexes in Pattern Tokenizer docs are escaped (it seems according to Java rules). I think it is better not to escape them because JSON escaping should be automatic in client libraries, and string escaping depends on a client language used. The default pattern is `\W+`, not `\\W+`.

Closes #6615
2014-07-03 13:34:13 +02:00
hanneskaeufler 6e6f4def5d Docs: Fix typo in timestamp-field.asciidoc
Closes #6661
2014-07-03 13:27:37 +02:00
Robert Muir 2935b751e9 Fix doc formatting. Norwegian stemmers and Scandinavian normalizers
were missing commas between entries.
2014-07-03 07:08:33 -04:00
Robert Muir b9a09c2b06 Analysis: Add additional Analyzers, Tokenizers, and TokenFilters from Lucene
Add `irish` analyzer
Add `sorani` analyzer (Kurdish)

Add `classic` tokenizer: specific to english text and tries to recognize hostnames, companies, acronyms, etc.
Add `thai` tokenizer: segments thai text into words.

Add `classic` tokenfilter: cleans up acronyms and possessives from classic tokenizer
Add `apostrophe` tokenfilter: removes text after apostrophe and the apostrophe itself
Add `german_normalization` tokenfilter: umlaut/sharp S normalization
Add `hindi_normalization` tokenfilter: accounts for hindi spelling differences
Add `indic_normalization` tokenfilter: accounts for different unicode representations in Indian languages
Add `sorani_normalization` tokenfilter: normalizes kurdish text
Add `scandinavian_normalization` tokenfilter: normalizes Norwegian, Danish, Swedish text
Add `scandinavian_folding` tokenfilter: much more aggressive form of `scandinavian_normalization`
Add additional languages to stemmer tokenfilter: `galician`, `minimal_galician`, `irish`, `sorani`, `light_nynorsk`, `minimal_nynorsk`

Add support for access to the default Thai stopword set "_thai_"

Fix some bugs and broken links in documentation.

Closes #5935
2014-07-03 05:47:49 -04:00
Matthew L Daniel 53f2301eea Docs: Add clarifying text about regexp and terms
For the casual reader, the reference to "term queries" may be glossed over, yielding an unexpected result when using `regexp` queries.
This attempts to make that distinction more prominent.

Closes #6698
2014-07-03 11:39:57 +02:00
jnguyenx 1883f74cc0 Docs: Fixed missing comma in multi match query example 2014-07-03 08:17:09 +02:00
Ian Babrou 698eb7de9b Fixed JSON in fielddata docs 2014-07-01 12:53:10 +02:00
Duncan Angus Wilkie 60a8515fb7 Update histogram-facet.asciidoc
Spotted a typo, which I've fixed.
2014-07-01 10:49:43 +02:00
Igor Motov 1425e28639 Add ability to restore partial snapshots
Closes #5742
2014-06-30 20:18:02 -04:00
Lee Hinman b43b56a6a8 Add a transformer to translate constant BigDecimal to double 2014-06-26 10:52:28 +02:00
mahdeto e78f1edca3 DOC: Added field data circuit breaker settings 2014-06-26 10:29:41 +02:00
Clinton Gormley 30c80319c0 Match query with operator and, cutoff_frequency and stacked tokens
If the match query with cutoff_frequency encounters stacked tokens,
like synonyms in the same position, it returns a boolean query instead
of a common terms query.  However, if the original operator was set
to "and", it was ignoring that and resetting the operator to "or".

In fact, if operator is "and" then there is little benefit in using
a common terms query as a must query is already
executed efficiently.
2014-06-25 17:53:43 +02:00
Lee Hinman 5c6d28240f Switch to Groovy as the default scripting language
This is a breaking change to move from MVEL -> Groovy
2014-06-25 12:15:12 +02:00
Clinton Gormley 64a4acc49b Docs: Added IDs to the highlighters for linking 2014-06-22 16:46:42 +02:00
Clinton Gormley cf059378d1 Docs: Updated stop token filter docs 2014-06-21 18:42:38 +02:00
Clinton Gormley fac724cc99 Docs: Updated the explanation about memory usage with parent/child 2014-06-21 16:32:29 +02:00
Clinton Gormley e52364a95a Docs: Updated cluster health docs 2014-06-20 18:05:46 +02:00
Clinton Gormley adf6e794b6 Docs: Rewrote the filtered query docs to be clearer
Closes #1688
2014-06-19 16:34:26 +02:00
Adrien Grand 703dbff83d Index field names of documents.
The `exists` and `missing` filters need to merge postings lists of all existing
terms, which can be very costly, especially on high-cardinality fields. This
commit indexes the field names of a document under `_field_names` and reuses it
to speed up the `exists` and `missing` filters.

This is only enabled for indices that are created on or after Elasticsearch
1.3.0.

Close #5659
2014-06-19 11:50:06 +02:00
Fitblip d18fb8bfbd REST API: Allow to configure JSONP/callback support
Added the http.jsonp.enable option to configure disabling of JSONP responses, as those
might pose a security risk, and can be disabled if unused.

This also fixes bugs in NettyHttpChannel
* JSONP responses were never setting application/javascript as the content-type
* The content-type and content-length headers were being overwritten even if they were set before

Closes #6164
2014-06-19 08:34:38 +02:00
Chris 011e20678d [DOCS] Fixed json example in nested-aggregation.asciidoc 2014-06-18 19:38:02 +02:00
Colin Goodheart-Smithe 7423ce0560 Aggregations: Added percentile rank aggregation
Percentile Rank Aggregation is the reverse of the Percentiles aggregation.  It determines the percentile rank (the proportion of values less than a given value) of the provided array of values.

Closes #6386
2014-06-18 12:02:08 +01:00
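
The definition used in 7423ce0560 (the proportion of values less than a given value) can be worked through on a small list. The sketch below computes the rank exactly over an in-memory list and is only meant to illustrate the definition; it is not the aggregation's implementation, and the function name is hypothetical.

```python
from bisect import bisect_left


def percentile_rank(values, threshold):
    """Percentage of `values` that are strictly less than `threshold`."""
    if not values:
        raise ValueError("values must not be empty")
    ordered = sorted(values)
    # bisect_left returns how many sorted entries are strictly below threshold.
    below = bisect_left(ordered, threshold)
    return 100.0 * below / len(ordered)
```

For the values 10, 20, 30, 40, 50, a threshold of 25 has two values below it, so its percentile rank is 40.0.
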
Clinton Gormley 69350dc426 Update stemmer-override-tokenfilter.asciidoc 2014-06-18 11:34:20 +02:00
Clinton Gormley 3eb291f334 Docs: tidied configuration.asciidoc 2014-06-17 17:37:07 +02:00
Shay Banon f450c3ea30 update docs to reflect how default write consistency with 1 replica behaves 2014-06-17 14:25:04 +02:00
Matt Janssen 946dde287a [DOCS] Fixed is/if typo in Api Conventions doc 2014-06-16 15:44:47 +02:00
Volker Fröhlich 06192686a2 [DOCS] Fixed typo in http.asciidoc 2014-06-16 10:42:34 +02:00
stephlag 13d910f016 Added missing comma in suggester example 2014-06-13 16:01:04 +02:00
Adrien Grand 7a34702925 [DOCS] Clarify the trade-off of the `disk` doc values format. 2014-06-13 13:24:53 +02:00
Adrien Grand 01327d7136 Facets: deprecation.
Users are encouraged to move to the new aggregation framework that was
introduced in Elasticsearch 1.0.

Close #6485
2014-06-13 13:13:44 +02:00
Clinton Gormley eb6c9fe111 Docs: Linked to fielddata formats from core types
Closes #6489
2014-06-13 12:58:03 +02:00
Boaz Leskes 7fb16c783d Added caching support to geohash_filter
Caching is turned off by default.

Closes #6478
2014-06-12 22:19:34 +02:00
Shay Banon 2330421816 Wait till node is part of cluster state for join process
When a node sends a join request to the master, only send back the response after it has been added to the master cluster state and published.
This will fix the rare cases where today, a join request can return, and the master, since it's under load, has not yet added the node to its cluster state, and the node that joined will start a fault detect against the master, failing since it's not part of the cluster state.
Since the join request now takes longer, also increase the default join request timeout.
closes #6480
2014-06-12 18:15:51 +02:00
Lee Hinman 3a3f81d59b Enable DiskThresholdDecider by default, change default limits to 85/90%
Fixes #6200
Fixes #6201
2014-06-12 16:35:29 +02:00
Clinton Gormley c41e63c2f9 Docs: Updated index-modules/store and setup/configuration
Explain how to set different index storage types, and
added the vm settings required to stop mmapfs from running
out of memory

Closes #6327
2014-06-12 13:56:06 +02:00
shadow000fire 1b45b216fd Update nested-query.asciidoc
Added note that fields inside a nested query must be fully qualified.
2014-06-12 12:48:23 +02:00
Luke Fender f9da5259bc [DOCS] Fixed typo in post-filter.asciidoc
Remove 'be' where it is not needed
2014-06-12 12:09:19 +02:00
Igor Motov 56a264cf6d [DOCS] Snapshot/restore: add more information about snapshot and restore monitoring 2014-06-11 20:52:45 -04:00
Clinton Gormley f546662e8f Docs: Hunspell tidied
Tidied some formatting
2014-06-11 21:49:02 +02:00
Clinton Gormley 04dacaaf27 Docs: Use the "stemmer" token filter for the english analyzer, to be consistent 2014-06-11 13:47:07 +02:00
Clinton Gormley 8a94b71b75 Docs: Corrected the use of keyword_marker on the lang analyzers 2014-06-11 13:43:02 +02:00
Clinton Gormley 673ef3db3f The StemmerTokenFilter had a number of issues:
* `english` returned the slow snowball English stemmer
* `porter2` returned the snowball Porter stemmer (v1)
* `portuguese` was used twice, preventing the second version from working

Changes:

* `english` now returns the fast PorterStemmer (for indices created from v1.3.0 onwards)
* `porter2` now returns the snowball English stemmer (for indices created from v1.3.0 onwards)
* `light_english` now returns the `kstem` stemmer (`kstem` still works)
* `portuguese_rslp` returns the PortugueseStemmer
* `dutch_kp` is a synonym for `kp`

Tests and docs updated

Fixes #6345
Fixes #6213
Fixes #6330
2014-06-11 12:30:16 +02:00
Martijn van Groningen 5e408f3d40 Change the top_hits to be a metric aggregation instead of a bucket aggregation (which can't have any sub aggs)
Closes #6395
Closes #6434
2014-06-10 09:09:50 +02:00
Clinton Gormley e323e577e8 Docs: Fixed bad ref on cjk_width/bigram pages 2014-06-09 23:36:58 +02:00
Clinton Gormley 5e40868f44 Docs: Fixed a bad ref on lang analyzers page 2014-06-09 23:03:12 +02:00
Clinton Gormley 5c5c1da06c Docs: Fixed some errors on the language analyzers page 2014-06-09 22:51:28 +02:00
Clinton Gormley 585b0ef730 Docs: Added custom-analyzer equivalents of all the language analyzers 2014-06-09 22:41:25 +02:00
Clinton Gormley bc402d5f87 Docs: Documented the cjk_width and cjk_bigram token filters 2014-06-09 22:40:58 +02:00
Matthew L Daniel b0a85f6ca3 Guard against improper auto_expand_replica values
Previously if the user provided a non-conforming string, it would blow up with
`java.lang.StringIndexOutOfBoundsException: String index out of range: -1`
which is not a *helpful* error message.

Also updated the documentation to make the possible setting values more clear.

Close #5752
2014-06-07 01:19:06 +02:00
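
The kind of guard described in b0a85f6ca3 can be sketched as below. It assumes the setting accepts either `false` or a dash-delimited range such as `0-5` or `0-all`; the function name, return shape, and error wording are hypothetical and do not reproduce the actual Java change.

```python
def parse_auto_expand_replicas(value):
    """Return (min, max) for a 'min-max' range, with max None for 'all'.

    Returns None when auto-expansion is disabled with 'false'.
    """
    if value == "false":
        return None
    dash = value.find("-")
    if dash == -1:
        # Without this check, slicing at index -1 is what produces the
        # unhelpful "index out of range" style of failure.
        raise ValueError(
            f"auto_expand_replicas must be 'false' or 'min-max', got {value!r}"
        )
    min_text, max_text = value[:dash], value[dash + 1:]
    try:
        minimum = int(min_text)
        maximum = None if max_text == "all" else int(max_text)
    except ValueError:
        raise ValueError(
            f"auto_expand_replicas bounds must be integers or 'all', got {value!r}"
        ) from None
    return minimum, maximum
```

A non-conforming string such as `"5"` then fails with a message naming the expected format instead of an index error.
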
markharwood 724129e6ce Aggregations optimisation for memory usage. Added changes to core Aggregator class to support a new mode of deferred collection.
A new "breadth_first" results collection mode allows upper branches of aggregation tree to be calculated and then pruned
to a smaller selection before advancing into executing collection on child branches.

Closes #6128
2014-06-06 15:59:51 +01:00
fransflippo cdbde4a578 [DOCS] Reworded note about shorthand suggest syntax
The existing Note about the shorthand suggest syntax was poorly worded and confusing. Please check whether the way I've phrased it now is still correct as to what the shorthand form actually does and doesn't do: the original wording did not provide me enough information to be sure.
Thanks!
2014-06-06 10:21:01 +02:00
Evgeniy Sokovikov 1383ab77b6 [DOCS] Fixed typo in put-mapping docs
split backwardscompatibility to backwards compatibility
2014-06-05 19:55:11 +02:00
Yervand Aghababyan cb22417cc1 [DOCS] Fixed the fuzzy query docs with the correct default value for the max_expansion option 2014-06-05 19:52:12 +02:00
Steve Fuller e991c1f717 [DOCS] fixed typo in date-format.asciidoc 2014-06-05 19:49:20 +02:00
Jad Naous 5aa84c9aab [DOCS] Fixed typos in aggregations.asciidoc
Fix plural/singular forms.
2014-06-05 19:47:01 +02:00
gseng 7b5807fe4a [DOCS] Fixed typo in object-type.asciidoc 2014-06-05 19:34:50 +02:00
Philip Stevens 4998c0928f [DOCS] Replace facets example with aggregations in warmers docs 2014-06-05 19:22:16 +02:00
Israel Tsadok 1a58016ea1 [DOCS] Add special attributes for indices allocation filtering 2014-06-05 10:38:07 +02:00
Rob Young 07a6143386 [DOCS] Fix grammar in dynamic mappings 2014-06-04 08:56:15 +02:00
Colin Goodheart-Smithe b9f4d44b14 Aggregations: Adds GeoBounds Aggregation
The GeoBounds Aggregation is a new single bucket aggregation which outputs the coordinates of a bounding box containing all the points from all the documents passed to the aggregation as well as the doc count. The GeoBounds Aggregation also uses a wrap_longitude parameter which specifies whether the resulting bounding box is permitted to overlap the international date line.  This option defaults to true.

This aggregation introduces the idea of MetricsAggregations which do not return double values and cannot be used for sorting.  The existing MetricsAggregation has been renamed to NumericMetricsAggregation and is a subclass of MetricsAggregation.  MetricsAggregations do not store doc counts and do not support child aggregations.

Closes #5634
2014-06-03 15:59:56 +01:00
violuke 4f99f0c6f1 [DOCS] Improved readability of multi-match query docs 2014-06-03 14:23:34 +02:00
darkwarriors d8765a8f1d [DOCS] fixed urls in nodes-stats docs 2014-06-03 13:48:42 +02:00
Patrik Ragnarsson 9a3368b937 [DOCS] Fix minor error in cluster stats example 2014-06-03 13:38:37 +02:00
Gaurav Arora 4a3837acf0 [DOCS] fix typo in network module docs 2014-06-03 13:19:36 +02:00
James Yu 8994eed82b [DOCS] Update elasticsearch version in repositories.asciidoc 2014-06-03 12:30:51 +02:00
Steve Fuller b800be891f [DOCS] fixed typo in function-score query docs 2014-06-03 12:05:59 +02:00
violuke 0020e5fc0a [DOCS] Improved grammar in multi-match query docs 2014-06-03 11:50:41 +02:00
javanna 5a1ad7b42e [DOCS] fixed curl requests in benchmark docs 2014-06-03 11:47:13 +02:00
leonardo menezes f3eca05c3b [DOCS] removed slowest on single query benchmark requests
Relates to #5904
2014-06-03 11:47:13 +02:00