Commit Graph

364 Commits

Author SHA1 Message Date
Adrien Grand 5821fa042c Cardinality aggregation.
This aggregation computes unique term counts using the hyperloglog++ algorithm
which uses linear counting to estimate low cardinalities and hyperloglog on
higher cardinalities.

Since this algorithm works on hashes, it is useful for high-cardinality fields
to store the hash of values directly in the index, which is the purpose of
the new `murmur3` field type. This is less necessary on low-cardinality
string fields because the aggregator is smart enough to only compute the hash
once per unique value per segment thanks to ordinals, or on numeric fields
since hashing them is very fast.

Close #5426
2014-03-13 19:19:56 +01:00
Florian Schilling 81e537bd5e ContextSuggester
================

This commit extends the `CompletionSuggester` by context
informations. In example such a context informations can
be a simple string representing a category reducing the
suggestions in order to this category.

Three base implementations of these context informations
have been setup in this commit.

- a Category Context
- a Geo Context

All the mapping for these context informations are
specified within a context field in the completion
field that should use this kind of information.
2014-03-13 11:24:46 +01:00
Kurt Hurtado ca6a2bb790 [DOCS] Various aggregation doc fixes 2014-03-13 09:05:25 +01:00
Costin Leau 9624b215fb Add docs for plugin isolation 2014-03-11 12:32:58 +02:00
Boaz Leskes b7a95d11a7 Introduced VersionType.FORCE & VersionType.EXTERNAL_GTE
Also added "external_gt" as an alias name for VersionType.EXTERNAL , accessible for the rest layer.

Closes #4213 , Closes #2946
2014-03-10 21:07:17 +01:00
javanna d5aaa90f34 [TEST] Randomized number of shards used for indices created during tests
Introduced two levels of randomization for the number of shards (between 1 and 10) when running tests:

1) through the existing random index template, which now sets a random number of shards that is shared across all the indices created in the same test method unless overwritten

2) through `createIndex` and `prepareCreate` methods, similar to what happens using the `indexSettings` method, which changes for every `createIndex` or `prepareCreate` unless overwritten (overwrites index template for what concerns the number of shards)

Added the following facilities to deal with the random number of shards:
- `getNumShards` to retrieve the number of shards of a given existing index, useful when doing comparisons based on the number of shards and we can avoid specifying a static number. The method returns an object containing the number of primaries, number of replicas and the total number of shards for the existing index

- added `assertFailures` that checks that a shard failure happened during a search request, either partial failure or total (all shards failed). Checks also the error code and the error message related to the failure. This is needed as without knowing the number of shards upfront, when simulating errors we can run into either partial (search returns partial results and failures) or total failures (search returns an error)

- added common methods similar to `indexSettings`, to be used in combination with `createIndex` and `prepareCreate` method and explicitly control the second level of randomization: `numberOfShards`, `minimumNumberOfShards` and `maximumNumberOfShards`. Added also `numberOfReplicas` despite the number of replicas is not randomized (default not specified but can be overwritten by tests)

Tests that specified the number of shards have been reviewed and the results follow:
- removed number_of_shards in node settings, ignored anyway as it would be overwritten by both mechanisms above
- remove specific number of shards when not needed
- removed manual shards randomization where present, replaced with ordinary one that's now available
- adapted tests that didn't need a specific number of shards to the new random behaviour
- fixed a couple of test bugs (e.g. 3 levels parent child test could only work on a single shard as the routing key used for grand-children wasn't correct)
- also done some cleanup, shared code through shard size facets and aggs tests and used common methods like `assertAcked`, `ensureGreen`, `refresh`, `flush` and `refreshAndFlush` where possible
- made sure that `indexSettings()` is always used as a basis when using `prepareCreate` to inject specific settings
- converted indexRandom(false, ...) + refresh to indexRandom(true, ...)
2014-03-10 13:01:52 +01:00
Simon Willnauer fbb8c0fafa [DOCS] Add `coming` tag to multiple rescores
Closes #5365
2014-03-10 09:27:44 +01:00
Andrew Raines 2f48be597e Display all available endpoints by default at /_cat
Closes #5106
2014-03-07 13:21:43 -06:00
Konrad Feldmeier d7b0d547d4 [DOCS] Multiple doc fixes
Closes #5047
2014-03-07 14:24:58 +01:00
Benjamin Devèze 2affa5004f Fix small typo in percentiles doc 2014-03-07 10:10:19 +01:00
Adrien Grand f359b7f38b [DOC] The percentiles aggregation is coming in 1.1.0. 2014-03-07 10:03:15 +01:00
Brusic 95274c18c5 Added support for char filters in the analyze API
Closes #5148
2014-03-06 12:23:51 +01:00
James Brook a93d6d55a5 Added support for aliases to index templates
Adapted existing PR (#2739) to updated code (post #4920), added tests and docs (@javanna)

Closes #1825
2014-03-06 11:11:07 +01:00
uboness 9d0fc76f54 Added support for sorting buckets based on sub aggregations
Supports sorting on sub-aggs down the current hierarchy. This is supported as long as the aggregation in the specified order path are of a single-bucket type, where the last aggregation in the path points to either a single-bucket aggregation or a metrics one. If it's a single-bucket aggregation, the sort will be applied on the document count in the bucket (i.e. doc_count), and if it is a metrics type, the sort will be applied on the pointed out metric (in case of a single-metric aggregations, such as avg, the sort will be applied on the single metric value)

 NOTE: this commit adds a constraint on what should be considered a valid aggregation name. Aggregations names must be alpha-numeric and may contain '-' and '_'.

 Closes #5253
2014-03-06 00:05:27 +01:00
Igor Motov b723ee0d20 [DOCS] Update boolean mapping docs with a full list of values that are treated as false
Closes #5337
2014-03-05 15:33:59 -05:00
Clinton Gormley 98ecf80f07 [DOCS] Formatting error
Closes #5346
2014-03-05 17:40:51 +01:00
Kevin 2c7a3a49c5 [DOCS] add Elasticsearch Image Plugin 2014-03-05 14:16:56 +01:00
Zachary Tong 7b16c5857d Percentiles aggregation.
A new metric aggregation that can compute approximate values of arbitrary
percentiles.

Close #5323
2014-03-03 18:06:14 +01:00
Martijn van Groningen dcb590398d [DOCS] Better document the limitation of nested objects. 2014-03-03 14:12:18 +01:00
Binh Ly 7e49848697 Clarify range aggregations 2014-02-28 14:38:57 -05:00
Clinton Gormley 53ce0e8e27 [DOCS] Fixed added[] tag version number 2014-02-28 15:29:43 +01:00
Lee Hinman e53a43800e Add `explain` flag support to the reroute API
By specifying the `explain` flag, an explanation for the reason a
command can or cannot be executed is returned. No allocation commands
are actually performed.

Returns a response similar to:

{
  "state": {...cluster state...},
  "acknowledged": true,
  "explanations" : [ {
    "command" : "cancel",
      "parameters" : {
        "index" : "decide",
        "shard" : 0,
        "node" : "IvpoKRdtRiGrQ_WKtt4_4w",
        "allow_primary" : false
      },
      "decisions" : [ {
        "decider" : "cancel_allocation_command",
        "decision" : "YES",
        "explanation" : "..."
        } ]
     }, {
      "command" : "move",
      "parameters" : {
        "index" : "decide",
        "shard" : 0,
        "from_node" : "IvpoKRdtRiGrQ_WKtt4_4w",
        "to_node" : "IvpoKRdtRiGrQ_WKtt4_4w"
       },
       "decisions" : [ {
         "decider" : "same_shard",
         "decision" : "NO",
         "explanation" : "shard cannot be allocated on same node [IvpoKRdtRiGrQ_WKtt4_4w] it already exists on"
       },
       etc
       ]
  }]
}

also removes AllocationExplanation from cluster state

Closes #2483
Closes #5169
2014-02-27 09:48:51 -07:00
Simon Willnauer 9160516b28 Expose `filler_token` via ShingleTokenFilterFactory
Lucene 4.7 supports a setter for the `filler_token` that is
inserted if there are gaps in the token stream. This change exposes
this setting.

Closes #4307
2014-02-26 22:21:10 +01:00
Martijn van Groningen 1441fec068 [DOCS] Updated memory considerations for p/c queries and filters. 2014-02-26 22:16:51 +01:00
Simon Willnauer 90e57c15e8 [DOCS]: fixed small problem in example json 2014-02-26 16:40:04 +01:00
Clinton Gormley 03ad168b24 [DOCS] Added note about dely in clearing filter cache.
Closes #5231
2014-02-24 11:36:22 +01:00
hura 818f8c0e2b [DOCS] Fix wrong explanation in configuration.asciidoc
Replaced network.host with node.name to match config file
2014-02-24 11:29:50 +01:00
Luca Cavanna 4e6610a798 Fixed multi term queries support in postings highlighter for non top-level queries
In #4052 we added support for highlighting multi term queries using the postings highlighter. That worked only for top-level queries though, and not for multi term queries that are nested for instance within a bool query, or filtered query, or a constant score query.

The way we make this work is by walking the query structure and temporarily overriding the query rewrite method with a method that allows for multi terms extraction.

Closes #5102
2014-02-21 21:43:40 +01:00
Adrien Grand edb854d952 Document the indices segments response format. 2014-02-21 12:01:32 +01:00
Lee Hinman 8f8cc7205d Add "locale" parameter to query_string and simple_query_string
Fixes #5128

Remove java 7 specific Locale functions, add "coming[1.1.0]" to documentation

add LocaleUtils utility class for dealing with Locale functions
2014-02-20 15:53:08 -07:00
Martijn van Groningen a81a4a5efe [DOCS] Included the `_percolator` index breaking change to migration docs. 2014-02-20 16:43:06 +01:00
Isabel Drost-Fromm 48004ff8a5 Add mustache templating to query execution.
Adds support for storing mustache based query templates that can later be filled
with query parameter values at execution time. Templates may be both quoted,
non-quoted and referencing templates stored in config/scripts/*.mustache by file
name.

See docs/reference/query-dsl/queries/template-query.asciidoc for templating
examples.

Implementation detail: mustache itself is being shaded as it depends directly on
guava - so having it marked optional but included in the final distribution
raises chances of version conflicts downstream.

Fixes #4879
2014-02-20 12:21:59 +01:00
javanna 419db6ee12 [DOCS] Fixed typo in create index api 2014-02-19 17:49:38 +01:00
Boaz Leskes e379f419e6 [DOCS] Remove clear flag from node-stats as it is not used anymore 2014-02-17 15:20:12 +01:00
Luca Cavanna 3afdf4a872 Added support for aliases to create index api
It is now possible to specify aliases during index creation:

curl -XPUT 'http://localhost:9200/test' -d '
{
    "aliases" : {
        "alias1" : {},
        "alias2" : {
            "filter" : { "term" : {"field":"value"}}
        }
    }
}'

Closes #4920
2014-02-17 14:54:21 +01:00
Britta Weber db3c6c2a8e Enable percolation for nested documents
closes #5082
2014-02-14 22:42:33 +01:00
Lee Hinman c97bcc3602 Add support for `lowercase_expanded_terms` flag to simple_query_string
Default the flag to true, making simple_query_string behave similarly to
query_string

Fixes #5008
2014-02-14 11:51:23 -07:00
Nik Everett 5c3f4ceafb Add preserve original token option to ASCIIFolding
Closes #4931
2014-02-14 19:37:00 +01:00
Luca Cavanna 6abd0a76bd [DOCS] improved get docs
- added _version to response
- exists call use -XHEAD with -i flag to include headers in the output
2014-02-14 13:11:10 +01:00
Lars Francke 2a765415c8 Update get.asciidoc
Minor improvements.

curl -XHEAD doesn't actually print anything so I've changed to use -I which actually prints the headers received.
2014-02-14 13:11:10 +01:00
Brian Yoder 41dba68bda Added the `DistanceUnit.NAUTICALMILES` enumeration
label with the corresponding *NM* and *nmi* unit
suffixes. Update the docs to match.

Closes #5085
2014-02-14 19:48:58 +09:00
uboness d335630e57 [docs] fixed errors in aggs docs
- error in nested aggs example
- error in terms aggs example
2014-02-13 20:36:02 +01:00
Oleg Anashkin eb0e1aa38f Fix typo in similarity docs
DRF similarity -> DFR similarity
2014-02-13 07:45:30 -08:00
Luca Cavanna 179750f0f5 [DOCS] fixed count docs, it now requires a top-level query object, same as other apis
Relates to #4074
2014-02-13 13:36:20 +01:00
Luca Cavanna 9902f04033 [DOCS] rephrased delete by query docs 2014-02-13 11:44:51 +01:00
Luca Cavanna 01abea5945 [DOCS] fixed count and validate query docs, they now require a top-level query object, same as other apis
Relates to #4074
Closes #5111
2014-02-13 11:42:04 +01:00
Kevin 99942089a8 [DOCS] add DynamoDB river plugin 2014-02-13 10:38:04 +01:00
James Yu 699fe5e929 fixed markup and typo 2014-02-13 10:33:15 +01:00
Clinton Gormley 80c7619591 [DOCS] Changed coming[] to added[] for 1.0.0* 2014-02-12 17:17:25 +02:00
Luca Cavanna 1d8d58391f [DOCS] added coming tags for `zen.discovery.publish_timeout` made dynamic 2014-02-12 15:24:38 +01:00