472 Commits

Author SHA1 Message Date
Martijn van Groningen 5705537ecf Added field stats api
The field stats api returns field level statistics such as lowest, highest values and number of documents that have at least one value for a field.

An api like this can be useful to explore a data set you don't know much about. For example, you can find out what the lowest and highest response times are, so that you can create a histogram or range aggregation with sane settings.

This api doesn't run a search to figure these statistics out, but rather uses the Lucene index to look them up (using the Terms class in Lucene). So finding out these stats for fields is cheap and quick.

The min/max values are based on the type of the field. For a numeric field the min/max are numbers, for a date field they are dates, and for other fields they are term based.

Closes #10523
2015-04-23 08:52:34 +02:00
Zachary Tong e08e45cee8 [DOCS] Add link to movavg page 2015-04-22 18:59:39 -04:00
Zachary Tong a03cefcece [DOCS] Add documentation for moving average 2015-04-22 18:59:39 -04:00
Clinton Gormley a60571c597 Docs: Removed some unused callout from the scroll docs 2015-04-22 12:49:06 +02:00
Jun Ohtani 0955c127c0 Rest: Add json in request body to scroll, clear scroll, and analyze API
Change analyze.asciidoc and scroll.asciidoc
Add json support to Analyze, Scroll, and clear scroll API
Add rest-api-spec/test

Closes #5866
2015-04-22 17:53:20 +09:00
Colin Goodheart-Smithe bd28c9c44e Documentation for the max_bucket reducer 2015-04-21 15:06:20 +01:00
Colin Goodheart-Smithe be647a89d3 Documentation for the derivative reducer 2015-04-21 15:06:20 +01:00
Colin Goodheart-Smithe 0f4b7f3b5c Added section for reducer aggregations in the main aggregation docs page 2015-04-21 15:06:19 +01:00
markharwood 63db34f649 New feature - Sampler aggregation used to limit any nested aggregations' processing to a sample of the top-scoring documents.
Optionally, a “diversify” setting can limit the number of collected matches that share a common value such as an "author".

Closes #8108
2015-04-21 10:22:05 +01:00
Adrien Grand f4d5914511 Docs: Warn about the fact that min_doc_count=0 might return terms that only belong to different types. 2015-04-21 00:57:57 +02:00
Honza Král e929c1560d [DOCS] Be explicit about scan doing no scoring 2015-04-20 18:05:45 +02:00
Alex Ksikes c347dfe91c Validate API: support for verbose explanation of successfully validated queries
This commit adds a `rewrite` parameter to the validate API in order to show
how the given query is re-written into primitive queries. For example, an MLT
query is re-written into a disjunction of the selected terms. Other use cases
include `fuzzy`, `common_terms`, or `match` query especially with a
`cutoff_frequency` parameter. Note that the explanation is given for a
single randomly chosen shard only, so the output may vary from one shard to
another.

Relates #1412
Closes #10147
2015-04-13 19:17:58 +02:00
Clinton Gormley abc7de96ae Docs: Updated version annotations in master 2015-04-09 14:50:11 +02:00
Adrien Grand aecd9ac515 Aggregations: Speed up include/exclude in terms aggregations with regexps.
Today we check every regular expression eagerly against every possible term.
This can be very slow if you have lots of unique terms, and even the bottleneck
if your query is selective.

This commit switches to Lucene regular expressions instead of Java (not exactly
the same syntax yet most existing regular expressions should keep working) and
uses the same logic as RegExpQuery to intersect the regular expression with the
terms dictionary. I wrote a quick benchmark (in the PR) to make sure it made
things faster and the same request that took 750ms on master now takes 74ms with
this change.

Close #7526
2015-04-09 12:12:56 +02:00
marko asplund 5585175173 Docs: fix typos in example JSON data
Closes #10479
2015-04-08 13:40:35 +02:00
Adrien Grand a608db122d Search: Remove the `count` search type.
This commit brings the benefits of the `count` search type to search requests
that have a `size` of 0:
 - a single round-trip to shards (no fetch phase)
 - ability to use the query cache

Since `count` now provides no benefits over `query_then_fetch`, it has been
deprecated.

Close #7630
2015-03-31 11:31:49 +02:00
olivier bourgain 00a9db73ae [DOCS] Fix multi percolate response sample in percolate.asciidoc 2015-03-30 11:32:41 +02:00
javanna d9d1e6a67a Scripting: add support for fine-grained settings
Allow scripting to be turned on or off based on the source of the script (where it gets loaded from), the operation that executes it, and its language.

The settings cover the following combinations:

- mode: on, off, sandbox
- source: indexed, dynamic, file
- engine: groovy, expressions, mustache, etc
- operation: update, search, aggs, mapping

The following settings are supported for every engine:

script.engine.groovy.indexed.update:    sandbox/on/off
script.engine.groovy.indexed.search:    sandbox/on/off
script.engine.groovy.indexed.aggs:      sandbox/on/off
script.engine.groovy.indexed.mapping:   sandbox/on/off
script.engine.groovy.dynamic.update:    sandbox/on/off
script.engine.groovy.dynamic.search:    sandbox/on/off
script.engine.groovy.dynamic.aggs:      sandbox/on/off
script.engine.groovy.dynamic.mapping:   sandbox/on/off
script.engine.groovy.file.update:       sandbox/on/off
script.engine.groovy.file.search:       sandbox/on/off
script.engine.groovy.file.aggs:         sandbox/on/off
script.engine.groovy.file.mapping:      sandbox/on/off
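
The twelve `groovy` keys above are every combination of source and operation. A minimal sketch that enumerates them, assuming the key pattern `script.engine.<engine>.<source>.<operation>` seen in the list (the helper name `fine_grained_keys` is hypothetical and not part of this commit):

```python
from itertools import product

SOURCES = ("indexed", "dynamic", "file")
OPERATIONS = ("update", "search", "aggs", "mapping")


def fine_grained_keys(engine):
    """Return every script.engine.<engine>.<source>.<operation> key."""
    return [
        f"script.engine.{engine}.{source}.{operation}"
        for source, operation in product(SOURCES, OPERATIONS)
    ]


keys = fine_grained_keys("groovy")
```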

For ease of use, the following more generic settings are supported too:

script.indexed: sandbox/on/off
script.dynamic: sandbox/on/off
script.file:    sandbox/on/off

script.update:  sandbox/on/off
script.search:  sandbox/on/off
script.aggs:    sandbox/on/off
script.mapping: sandbox/on/off

These will be used to calculate the more specific settings, using the stricter setting of each combination. Operation based settings have precedence over conflicting source based ones.

Note that the `mustache` engine is affected by generic settings applied to any language, while native scripts aren't as they are static by definition.

Also, the previous `script.disable_dynamic` setting can now be deprecated.

Closes #6418
Closes #10116
Closes #10274
2015-03-26 19:56:55 +01:00
Boaz Leskes 4970e3e225 Revert "Rest: Add json in request body to scroll, clear scroll, and analyze API"
This reverts commit 16083d454c.
2015-03-23 12:57:19 +01:00
Jun Ohtani 16083d454c Rest: Add json in request body to scroll, clear scroll, and analyze API
Add json support to scroll, clear scroll, and analyze

Closes #5866
2015-03-23 15:35:38 +09:00
Simon Willnauer 7257345db9 Revert Benchmark API
The benchmark api is being worked on feature/bench branch and will be merged from there when ready.
2015-03-21 10:36:04 +01:00
Asimov4 649e3aa4c5 [DOCS] Fix typos in percolate.asciidoc 2015-03-21 10:23:15 +01:00
Martijn van Groningen 4393939f5e inner_hits: Nested parent field should be resolved based on the parent inner hit definition, instead of the nested parent field in the mapping.
The behaviour is better in the case where someone has multiple levels of nested object fields defined in the mapping and would like to define a single inner_hits definition that is two or more levels deep.

If someone wants inner hits on a nested field that is 2 levels deep the following would need to be defined:

```
{
  ...
  "inner_hits" : {
     "path" : {
        "level1" : {
            "inner_hits" : {
               "path" : {
                  "level2" : {
                     "query" : { .... }
                  }
               }
            }
        }
     }
  }
}
```

With this change the above can be defined as:

```
{
  ...
  "inner_hits" : {
     "path" : {
        "level1.level2" : {
            "query" : { .... }
        }
     }
  }
}
```

Closes #9251
2015-03-16 16:31:03 -07:00
Lee Hinman 6aec68cd29 Revert "[QUERY] Remove lowercase_expanded_terms and locale options"
This reverts commit d1f7bd97cb.

Ryan pointed out that this needs to work with the multi term query, so
additional analysis and tests should be added.
2015-03-13 13:51:44 -06:00
Lee Hinman d1f7bd97cb [QUERY] Remove lowercase_expanded_terms and locale options
The analysis chain should be used instead of relying on this, as it is
confusing when dealing with different per-field analysers.

The `locale` option was only used for `lowercase_expanded_terms`, which,
once removed, is no longer needed, so it was removed as well.

Fixes #9978
Relates to #9973
2015-03-13 13:17:27 -06:00
olivier bourgain bcb4decca9 [DOCS] add missing comma in percentile_rank aggregation example 2015-03-10 08:21:06 -07:00
olivier bourgain fb7cd2ea9a [DOCS] Adjusted geo_distance aggregation example
unit is not returned in the response, but we have key and an implicit from starting at 0 for the first bucket
2015-03-10 08:20:20 -07:00
olivier bourgain eaeddc6bd4 [DOCS] missing curly brace in ip_range aggregation example 2015-03-10 08:19:57 -07:00
Britta Weber 580728dfd6 significant terms: add scriptable significance heuristic
This commit adds scripting capability to significant_terms.
Custom heuristics can be implemented with a script that provides
parameters subset_freq, superset_freq, subset_size, superset_size.

closes #7850
2015-03-06 17:06:04 +01:00
Clinton Gormley c223ed0db4 Update search-type.asciidoc
Changed search_type docs to reflect that the `(dfs_)query_and_fetch` modes are an internal optimization and should not be specified explicitly by the user.

Relates to #9606
2015-03-02 10:55:22 +01:00
Geoff Bourne 0e09c02c56 Spelling out the sort order options
Closes #9768
2015-03-01 21:05:52 +01:00
Clinton Gormley e194fb3a07 Docs: Default distance unit in geo distance agg is metres, not km
Closes #9812
2015-02-28 01:45:29 +01:00
Colin Goodheart-Smithe 2520dc78ec [DOCS] added a note for the default shard_size value 2015-02-25 11:00:55 +00:00
markharwood 29b1902cfb New aggregations feature - “PercentageScore” heuristic for significant_terms aggregation provides simple “per-capita” type measures.
Closes #9720
2015-02-20 13:22:08 +00:00
Christoph Büscher 30fd70f07b Aggregations: Simplify time zone option in `date_histogram`
Removed the existing `pre_zone` and `post_zone` option in `date_histogram` in favor of
the simpler `time_zone` option. Previously, specifying different values for these could
lead to confusing scenarios where ES would return bucket keys that are not UTC.
Now `time_zone` is the only setting: date buckets are calculated in the
preferred time zone, and after rounding the bucket key values are converted back to UTC.
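
The round-in-local-time, report-in-UTC behaviour can be sketched as follows. This is an illustration only, not the Elasticsearch implementation: the function name `day_bucket_key` is hypothetical, the interval is fixed to one day, and a fixed +01:00 offset stands in for a real time zone.

```python
from datetime import datetime, timedelta, timezone


def day_bucket_key(timestamp_utc, zone):
    """Round down to the start of the local day, then report the key in UTC."""
    local = timestamp_utc.astimezone(zone)
    local_midnight = local.replace(hour=0, minute=0, second=0, microsecond=0)
    return local_midnight.astimezone(timezone.utc)


# A fixed +01:00 offset stands in for the preferred time zone.
plus_one = timezone(timedelta(hours=1))
key = day_bucket_key(datetime(2015, 2, 16, 0, 30, tzinfo=timezone.utc), plus_one)
```

Here 00:30 UTC is 01:30 local time, so the bucket starts at local midnight, which is 23:00 UTC on the previous day.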

Closes #9062
Closes #9637
2015-02-16 16:54:06 +01:00
Clinton Gormley 6fadeeca56 Updated doc annotations for 1.4.3 2015-02-11 17:54:53 +01:00
Christoph Büscher d2f852a274 Aggregations: Add 'offset' option to date_histogram, replacing 'pre_offset' and 'post_offset'
Add offset option to 'date_histogram' replacing and simplifying the previous 'pre_offset' and 'post_offset' options.
This change is part of a larger clean up task for `date_histogram` from issue #9062.
2015-02-09 14:03:28 +01:00
Adrien Grand 95f46f1212 Docs: Use the new experimental annotation.
We now have a very useful annotation to mark features or parameters as
experimental. Let's use it! This commit replaces some custom text warnings with
this annotation and adds this annotation to some existing features/parameters:
 - inner_hits (unreleased yet)
 - terminate_after (released in 1.4)
 - per-bucket doc count errors in the terms agg (released in 1.4)

I also tagged with this annotation settings which should either be not needed
(like the ability to evict entries from the filter cache based on time) or that
are too deep into the way that Elasticsearch works like the Directory
implementation or merge settings.

Close #9563
2015-02-05 15:29:45 +01:00
Adrien Grand 3a486066fd Docs: Remove the experimental status of the cardinality and percentiles(-ranks) aggregations
These aggregations are not experimental anymore but some of their parameters
still are:
 - `precision_threshold` and `rehash` on `cardinality`
 - `compression` on percentiles(-ranks)

Close #9560
2015-02-05 15:18:40 +01:00
Christoph Büscher 44193e7ba5 Aggregations: Add 'offset' option to histogram aggregation
Histogram aggregation supports an 'offset' option to move bucket boundaries.
In a histogram with buckets of size X these can be moved from 0, X, 2X, 3X,...
by an offset value of Y to Y, X+Y, 2X+Y, 3X+Y... by using the 'offset' option.
The previous 'pre_offset' and 'post_offset' options are removed in favour of
the simplified 'offset' option.
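
As a rough sketch of the boundary shift described above (the function name `bucket_key` is hypothetical; this is plain arithmetic, not the aggregation code itself):

```python
import math


def bucket_key(value, interval, offset=0):
    """Lower boundary of the bucket containing value.

    With interval X and offset Y the boundaries fall at
    ..., Y - X, Y, X + Y, 2X + Y, ...
    """
    return math.floor((value - offset) / interval) * interval + offset
```

With an interval of 5 and an offset of 2, the value 6 lands in the bucket starting at 2 and the value 7 in the bucket starting at 7.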

Closes #9417
Closes #9505
2015-02-02 18:23:01 +01:00
Oliver e412dab63a Docs: Fix sample query
Closes #9472
2015-01-29 15:56:24 +01:00
Ryan Ernst afcedb94ed Mappings: Remove `index_analyzer` setting to simplify analyzer logic
The `analyzer` setting is now the base setting, and `search_analyzer`
is simply an override of the search time analyzer.  When setting
`search_analyzer`, `analyzer` must be set.

closes #9371
2015-01-28 13:43:15 -08:00
Zachary Tong a4eb1d5505 Aggregations: Add standard deviation bounds to extended_stats
Extended_stats now displays the upper and lower bounds on standard deviations (e.g. avg +/- std).
Default is to show 2 std above/below, but can be changed using the `sigma` parameter.
Accepts non-negative doubles
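
A minimal sketch of the bounds calculation, assuming the population standard deviation is used (that choice, and the function name `std_deviation_bounds`, are assumptions of this example rather than something stated in the commit):

```python
import math


def std_deviation_bounds(values, sigma=2.0):
    """Return (lower, upper) = avg -/+ sigma * std for the given values."""
    if sigma < 0:
        raise ValueError("sigma must be non-negative")
    avg = sum(values) / len(values)
    # Population standard deviation; an assumption of this sketch.
    std = math.sqrt(sum((v - avg) ** 2 for v in values) / len(values))
    return avg - sigma * std, avg + sigma * std


lower, upper = std_deviation_bounds([2, 4, 4, 4, 5, 5, 7, 9])
```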

Closes #9356
2015-01-28 11:47:20 -05:00
eBuildy 85ef44fd73 Docs: Fix missing comma and boolean true
Closes #9350
2015-01-19 21:31:29 +01:00
Martijn van Groningen 8e0292b1aa docs: fix inner hits snippet 2015-01-19 18:56:45 +01:00
sweetest eaa1674d6d Introduce index option named 'index.percolator.map_unmapped_fields_as_string', that handles unmapped fields in percolator queries as type string.
Closes #9053
Closes #9054
2015-01-19 09:51:10 +01:00
David Pilato fc7a0d3a4a [Docs] fix three to four 2015-01-12 12:13:23 +01:00
Martijn van Groningen d8054ec299 inner_hits: Added another more compact syntax for inner hits.
Closes #8770
2014-12-24 17:41:35 +01:00
Ryan Ernst 39b3613420 Fix date histogram docs grammar. 2014-12-23 10:19:55 -08:00
Yasir Bamarni 5059d6fe1c Update percolate.asciidoc
wrong type used in the -GET request

Closes #8942
2014-12-17 14:05:27 +01:00
Ayush 23dbecf3e7 Update percolate.asciidoc
Updating the `associated` spelling

Closes #8907
2014-12-15 14:12:03 +01:00
Adam Menges 3a3030e217 Docs: Fix the wording for inner hits a bit
Closes #8747
2014-12-09 13:36:26 +01:00
Martijn van Groningen d7e224da04 Added `inner_hits` feature that allows including nested hits.
Inner hits allow embedding nested inner objects, child documents or the parent document that contributed to the matching of the returned search hit as inner hits, which would otherwise be hidden.

Closes #8153
Closes #3022
Closes #3152
2014-12-02 12:01:01 +01:00
Clinton Gormley 88e06cba80 Update daterange-aggregation.asciidoc
Clarified the date-math expressions on date range aggregations

Closes #8703
2014-11-28 16:53:33 +01:00
David Pilato 43a1435d3b [Docs] fix consistency between examples 2014-11-27 20:29:34 +01:00
David Pilato 40f0e07db3 [Docs] Fix missing new line 2014-11-27 19:39:12 +01:00
David Pilato da27c2104a [Docs] Fix missing comma in mapping 2014-11-27 11:03:19 +01:00
David Haney 2c429452e9 Typo: changed "5% or the real words" to "5% of the real words"
Closes #8582
2014-11-25 13:15:33 +01:00
barbasa fd6c41bfbf Missing quote in the example 2014-11-23 14:03:58 +01:00
Boaz Leskes 1e16375d04 Docs: Update execution hint docs for Significant terms agg
copied over the relevant pieces from the terms agg

Closes #8532
2014-11-18 20:54:26 +01:00
Joel Taddei 7e72800c83 [DOCS] Corrected syntax error in search curl cmd
Closes #8447
2014-11-12 17:21:19 +01:00
Clinton Gormley cff544dcc2 Docs: Removed old coming/added tags 2014-11-10 14:41:24 +01:00
Veres Lajos 4059e4ac86 typo fixes - https://github.com/vlajos/misspell_fixer
Closes #8323
2014-11-08 18:55:57 +01:00
Clinton Gormley 08aa715d2e Update datehistogram-aggregation.asciidoc
Clarified use of fractional time units in the date histo agg.

Closes #7957
2014-11-08 17:49:34 +01:00
Martijn Laarman 82278bb7bc [Aggregations] Meta data support
This commit adds the ability to associate a bit of state with each
individual aggregation.

The aggregation response can be hard to stitch back together without
having a reference to the aggregation request. In many cases this is not
available, many json serializer frameworks cache types globally or have a
static deserialisation override mechanism. In these cases making the
original request available, if at all possible, would be a hack.

The old facets returned `_type` which was just enough metadata to know
what the originating facet type in the request was.

This PR takes `_type` one step further by introducing ANY arbitrary meta
data. This could be further <strike>ab</strike>used for instance by
generic/automated aggregations that include UI state (color information,
thresholds, user input states, etc) per aggregation.
2014-11-03 22:32:23 +01:00
Clinton Gormley e56d85439c Update search-template.asciidoc
Clarified using the conditional clause template example as a string
2014-10-31 15:32:14 +01:00
Clinton Gormley 2569188d25 Update search-template.asciidoc
Fixed asciidoc typo

Closes #8308
2014-10-31 14:40:32 +01:00
Areek Zillur 96f1606cdc Completion Suggester: Fix CompletionFieldMapper to correctly parse weight
- Allows weight to be defined as a string representation of a positive integer

closes #8090
2014-10-28 18:39:02 -04:00
Adrien Grand 7ea490dfd1 Aggregations: Return the sum of the doc counts of other buckets.
This commit adds a new field to the response of the terms aggregation called
`sum_other_doc_count` which is equal to the sum of the doc counts of the buckets
that did not make it to the list of top buckets. It is typically useful to have
a sector called e.g. `other` when using terms aggregations to build pie charts.

Example query and response:

```json
GET test/_search?search_type=count
{
  "aggs": {
    "colors": {
      "terms": {
        "field": "color",
        "size": 3
      }
    }
  }
}
```

```json
{
   [...],
   "aggregations": {
      "colors": {
         "doc_count_error_upper_bound": 0,
         "sum_other_doc_count": 4,
         "buckets": [
            {
               "key": "blue",
               "doc_count": 65
            },
            {
               "key": "red",
               "doc_count": 14
            },
            {
               "key": "brown",
               "doc_count": 3
            }
         ]
      }
   }
}
```
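
The `sum_other_doc_count` of 4 in the response above can be reproduced with a small single-node sketch. The terms `green` and `yellow` are made up to stand in for the buckets that were cut, and `terms_with_other` is a hypothetical helper, not an Elasticsearch API:

```python
def terms_with_other(doc_counts, size):
    """Return the top `size` buckets and the summed doc count of the rest."""
    ranked = sorted(doc_counts.items(), key=lambda item: item[1], reverse=True)
    top, rest = ranked[:size], ranked[size:]
    buckets = [{"key": key, "doc_count": count} for key, count in top]
    return buckets, sum(count for _, count in rest)


# "green" and "yellow" are invented terms for the buckets outside the top 3.
counts = {"blue": 65, "red": 14, "brown": 3, "green": 2, "yellow": 2}
buckets, sum_other_doc_count = terms_with_other(counts, size=3)
```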

Close #8213
2014-10-27 12:11:26 +01:00
Brian Kim 58086dd08b Docs: missing quote
fix missing quote

Closes #8176
2014-10-21 12:52:12 +02:00
Michael McCandless 85065f9c8e Core: cutover to Lucene's query rescorer
This is functionally equivalent to before, so there should be no
user-visible impact, except I added a NOTE in the docs warning about
the interaction of pagination and rescoring.

Closes #6232

Closes #7707
2014-10-18 05:25:50 -04:00
Sergii Golubev 028a2b732a Docs: Percolate reference - a typo and a misused word
Closes #8116
2014-10-17 15:26:29 +02:00
Sergii Golubev ae923a81b9 Docs: Percolate `_score` reference
Added missing `_score` word, made the sentence less ambiguous.

Closes #8115
2014-10-17 15:25:02 +02:00
Andrew O'Brien 33097d901b Docs: Typo: s/by/be/
Closes #8114
2014-10-16 20:51:58 +02:00
Son 6f3227db01 Docs: Fix order for PUT _mapping docs
Closes #8083
2014-10-16 18:49:36 +02:00
Clinton Gormley 6a180d1803 Docs: Update highlighting.asciidoc
Added note about how to highlight on the `_all` field

Closes #7991
2014-10-15 13:45:56 +02:00
Clinton Gormley 7e916d0b8b Update completion-suggest.asciidoc
Documented the `size` parameter in the completion suggester query
2014-10-14 18:47:32 +02:00
Martijn van Groningen 5763b24686 Core: Make fetch phase nested doc aware
By letting the fetch phase understand the nested docs structure we can serve nested docs as hits.
The `top_hits` aggregation can because of this commit be placed in a `nested` or `reverse_nested` aggregation.

Closes #7164
2014-10-08 22:21:30 +02:00
Colin Goodheart-Smithe 6cf371395a Aggregations: makes script params consistent with other APIs in scripted_metric
This change removes the script_type parameter from the Scripted Metric Aggregation and adds support for _file and _id suffixes to the init_script, map_script, combine_script and reduce_script parameters to make defining the source of the script consistent with the other APIs which use the ScriptService.
2014-10-06 09:07:25 +01:00
mdzor 4b3f66e585 Update suggesters.asciidoc
A request was malformed

Closes #7867
2014-09-28 11:04:28 +02:00
Clinton Gormley cb00d4a542 Docs: Removed all the added/deprecated tags from 1.x 2014-09-26 21:04:42 +02:00
Colin Goodheart-Smithe 8a70b115f2 Aggregations: More consistent response format for scripted metrics aggregation
Changes the name of the field in the scripted metrics aggregation from 'aggregation' to 'value' to be more in line with the other metrics aggregations like 'avg'
2014-09-17 11:46:26 +01:00
Jordan Snodgrass 6246aac9ab Docs: Indicate that the Children Aggregation is coming in 1.4.0 2014-09-17 09:22:02 +02:00
Colin Goodheart-Smithe d4e83df3b8 Aggregations: Adds ability to sort on multiple criteria
The terms aggregation can now support sorting on multiple criteria by replacing the sort object with an array of sort objects whose order signifies the priority of the sort. The existing syntax for sorting on a single criterion also still works.
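
The priority semantics can be sketched outside Elasticsearch as a multi-key sort. Everything here is illustrative: `sort_buckets`, the `(field, direction)` pair format and the `avg_price` field are hypothetical and do not mirror the request syntax.

```python
def sort_buckets(buckets, order):
    """Sort buckets by a list of (field, direction) pairs, highest priority first."""
    ranked = list(buckets)
    # Apply the lowest-priority criterion first; Python's sort is stable,
    # so earlier passes survive as tie-breakers for later ones.
    for field, direction in reversed(order):
        ranked.sort(key=lambda bucket: bucket[field], reverse=(direction == "desc"))
    return ranked


buckets = [
    {"key": "a", "doc_count": 5, "avg_price": 10.0},
    {"key": "b", "doc_count": 7, "avg_price": 10.0},
    {"key": "c", "doc_count": 5, "avg_price": 30.0},
]
ranked = sort_buckets(buckets, [("doc_count", "desc"), ("avg_price", "asc")])
```

Buckets `a` and `c` tie on `doc_count`, so the second criterion decides their relative order.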

Contributes to #6917
Replaces #7588
2014-09-15 11:08:29 +01:00
markharwood 3c8f8cc090 Aggs enhancement - allow Include/Exclude clauses to use array of terms as alternative to a regex
Closes #6782
2014-09-12 15:28:03 +01:00
Lee Hinman 1dd26888f6 [DOCS] Additional documentation for _score accessing
Closes #7043
2014-09-11 12:53:25 +02:00
smayzak 65a0ca021d The description was incorrect
Looked like a copy and paste from another aggregation
2014-09-10 16:05:03 +02:00
smayzak 6416f5d3d0 Fixing some grammar 2014-09-10 16:05:03 +02:00
David Pilato 7fdd3651fa [docs] Fix typo: resonable - reasonable 2014-09-10 15:57:57 +02:00
Martijn van Groningen 52f1ab6e16 Core: Added the `index.query.parse.allow_unmapped_fields` setting to fail queries if they refer to unmapped fields.
The percolator and filters in aliases by default enforce strict query parsing.

Closes #7335
2014-09-09 15:00:47 +02:00
Colin Goodheart-Smithe b127b52fd3 Revert "Aggregations: Adds ability to sort on multiple criteria"
This reverts commit bfedd11ffa.
2014-09-08 20:27:19 +01:00
Colin Goodheart-Smithe bfedd11ffa Aggregations: Adds ability to sort on multiple criteria
The terms aggregation can now support sorting on multiple criteria by replacing the sort object with an array of sort objects whose order signifies the priority of the sort. The existing syntax for sorting on a single criterion also still works.

Contributes to #6917
2014-09-08 15:20:33 +01:00
Clinton Gormley 1bdf79e527 Docs: Added explanation of how to do multi-field terms agg
Closes #5100
2014-09-07 11:09:52 +02:00
shrinidhichaudhari 13e3a5e99c Docs: Update cardinality-aggregation.asciidoc
Closes #7516
2014-09-06 20:45:45 +02:00
Adrien Grand 4bfad644b3 Aggregations: Forbid usage of aggregations in conjunction with search_type=SCAN.
Aggregations are collection-wide statistics, which is incompatible with the
collection mode of search_type=SCAN since it doesn't collect all matches on
calls to the search API.

Close #7429
2014-09-03 09:03:01 +02:00
Adrien Grand 203e80e650 Aggregations: Only return aggregations on the first page when scrolling.
Aggregations are collection-wide statistics so they would always be the same.
In order to save CPU/bandwidth, we can just return them on the first page.

Same as #1642 but for aggregations.
2014-09-03 09:03:01 +02:00
Clinton Gormley a059a6574a Update reverse-nested-aggregation.asciidoc
Fixed reverse nested example

Closes #7463
2014-09-02 11:40:41 +01:00
Adrien Grand 8e1d3d56b3 Docs: Replace added[1.4.0] with coming[1.4.0] since 1.4 is not released yet. 2014-08-29 11:57:22 +02:00
londocr 1213eec834 Spelling error of aggregation 2014-08-28 08:57:12 +02:00
Adrien Grand ea96359d82 Facets: Removal from master.
Close #7337
2014-08-21 10:34:39 +02:00
Colin Goodheart-Smithe 7f943f0296 Aggregations: Scriptable Metrics Aggregation
A metrics aggregation which runs specified scripts at the init, collect, combine, and reduce phases

Closes #5923
2014-08-20 18:17:27 +01:00
Martijn van Groningen 383e64bd5c Aggregations: Add `children` bucket aggregator that is able to map buckets between parent types and child types using the already builtin parent/child support.
Closes #6936
2014-08-19 12:40:51 +02:00
Britta Weber 639692943f Docs: Document distance type and sort mode for many to many geo_points
closes #7280
2014-08-18 16:15:55 +02:00
Konrad Feldmeier 3b3e2ed5e9 Docs: Remove the 'Factor' paragraph to reflect #6490
The current implementation of 'date_histogram' does not understand
the `factor` parameter. Since the docs shouldn't raise false hopes,
I removed the section.

Closes #7277
2014-08-18 13:02:15 +02:00
Mpampis Kostas 55b642abc5 Docs: Fix typo in phrase-suggest.asciidoc
Closes #7262
2014-08-18 13:00:30 +02:00
Clinton Gormley 9dfede8cbb Update search-template.asciidoc
Remove extra commas in template query ;-)

Closes #7033
2014-08-18 12:35:18 +02:00
Clinton Gormley 6477e13c77 Typo 2014-08-18 12:30:49 +02:00
smayzak 8449128032 error in code
The top-tags and terms were reversed.
2014-08-18 12:28:53 +02:00
Areek Zillur 0b6734aa40 [DOCS] Clarify Completion Suggester output deduplication 2014-08-13 11:09:18 -04:00
Colin Goodheart-Smithe 36083cb27f [DOCS] Added section describing how to return only agg results
Closes #5875
2014-08-11 11:31:01 +01:00
Britta Weber d49ed93488 Docs: md -> asciidoc 2014-08-08 11:25:14 +02:00
Colin Goodheart-Smithe e6632ec63e [DOCS] fixed title for filters aggregation documentation 2014-08-07 08:37:43 +01:00
Clinton Gormley 7b0b315b71 Tidied up the filters agg docs and added a coming[] tag 2014-08-07 09:03:23 +02:00
Clinton Gormley e7f1aa4f4f Documented the query cache module
Related to #7161 and #7167
2014-08-06 11:55:11 +02:00
Britta Weber a3cefd919e significant terms: add google normalized distance, add chi square
closes #6858
2014-08-04 08:15:26 +02:00
uboness 3c9c9f33e2 Aggregations Added Filters aggregation
A multi-bucket aggregation where multiple filters can be defined (each filter defines a bucket). The buckets will collect all the documents that match their associated filter.

This aggregation can be very useful when one wants to compare analytics between different criteria. It can also be accomplished using multiple definitions of the single filter aggregation, but here, the user will need to define the sub-aggregations only once.

Closes #6118
2014-08-01 16:01:08 +01:00
Adrien Grand d9d5b35be9 Sort: Make `ignore_unmapped` work for cross-index queries.
Close #2255
2014-08-01 15:30:17 +02:00
Stefan Antoni 8e862f15c1 [DOCS] fixed small typo in percolate.asciidoc 2014-08-01 12:38:35 +02:00
Britta Weber d6a18ab2ba Docs: add 1.4.0 label to many to many geo distance sort 2014-08-01 12:30:08 +02:00
Kurt Hurtado 66560acebb Update fielddata-fields.asciidoc 2014-08-01 09:20:19 +02:00
Areek Zillur 1d581e6286 Search Exists API: Checks if any matching documents exist for a given query
Implements a new Exists API allowing users to do a fast exists check on any matched documents for a given query.
This API should be faster than using the Count API as it will:
 - early terminate the search execution once any document is found to exist
 - return the response as soon as the first shard reports matched documents

closes #6995
2014-07-31 15:42:30 -04:00
Britta Weber fe86c8bc88 _geo_distance sort: allow many to many geo point distance
Add computation of distance to many geo points. Example request:

```
{
  "sort": [
    {
      "_geo_distance": {
        "location": [
          {
            "lat":1.2,
            "lon":3
          },
          {
            "lat":1.2,
            "lon":3
          }
        ],
        "order": "desc",
        "unit": "km",
        "sort_mode": "max"
      }
    }
  ]
}
```

closes #3926
2014-07-31 17:33:45 +02:00
Clinton Gormley 36e1c7928c Rewrote post-filter.asciidoc
Closes #5166
2014-07-31 12:56:11 +02:00
Adrien Grand 1fe76b891b Docs: Add links to the equivalent aggs in facets documentation. 2014-07-28 15:22:49 +02:00
Clinton Gormley be86556946 Update request-body.asciidoc
Added link from `timeout` to time-units

Closes #6361
2014-07-28 11:08:59 +02:00
Clinton Gormley 10b4177def Docs: Fixed path to search-shards 2014-07-26 15:05:53 +02:00
Clinton Gormley 88c8754a3c Docs: Removed search-shards from request-body 2014-07-26 14:52:50 +02:00
Colin Goodheart-Smithe 655157c83a Aggregations: Added an option to show the upper bound of the error for the terms aggregation.
This is only applicable when the order is set to _count.  The upper bound of the error in the doc count is calculated by summing the doc count of the last term on each shard which did not return the term.  The implementation calculates the error by summing the doc count for the last term on each shard for which the term IS returned and then subtracts this value from the sum of the doc counts for the last term from ALL shards.
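
The subtraction described above can be sketched as follows. This is a simplified model, assuming each shard returns a non-empty list of its top terms ordered by descending doc count; the function and the sample data are hypothetical.

```python
def doc_count_error_upper_bound(term, shard_results):
    """Upper bound on the doc count a term may be missing.

    shard_results: one list per shard of (term, doc_count) pairs,
    ordered by descending doc_count and truncated to the shard's top terms.
    """
    last_counts_all = sum(buckets[-1][1] for buckets in shard_results)
    last_counts_returned = sum(
        buckets[-1][1]
        for buckets in shard_results
        if any(name == term for name, _ in buckets)
    )
    # What remains is the last-term count of every shard that omitted the term.
    return last_counts_all - last_counts_returned


shards = [
    [("blue", 30), ("red", 10)],
    [("blue", 25), ("green", 8)],
    [("red", 12), ("brown", 4)],
]
```

For `blue`, only the third shard omitted the term, so the bound is that shard's last doc count.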

Closes #6696
2014-07-25 14:24:24 +01:00
Areek Zillur 5487c56c70 Search & Count: Add option to early terminate doc collection
Allow users to control document collection termination, if a specified terminate_after number is
set. Upon setting the newly added parameter, the response will include a boolean terminated_early
flag, indicating if the document collection for any shard terminated early.
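
A rough sketch of the idea, with lists standing in for the matches on each shard. The function name `collect` is hypothetical, and how a shard holding exactly `terminate_after` matches is reported is a choice of this sketch, not something the commit specifies.

```python
def collect(shard_matches, terminate_after):
    """Collect matches per shard, stopping a shard once it reaches the limit."""
    total = 0
    terminated_early = False
    for matches in shard_matches:
        collected = 0
        for _ in matches:
            collected += 1
            if collected >= terminate_after:
                # This shard stops collecting; the flag covers any shard.
                terminated_early = True
                break
        total += collected
    return total, terminated_early
```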

closes #6876
2014-07-23 15:10:15 -04:00
Clinton Gormley 0f943850a0 Update named-queries-and-filters.asciidoc 2014-07-23 17:28:49 +02:00
Simon Willnauer 5bfea56457 [DOCS] move all coming tags to added in master 2014-07-23 16:37:19 +02:00
Areek Zillur f39d4e1f89 PhraseSuggester: Collate option should allow returning phrases with no matching docs
A new option `prune` has been added to allow users to control phrase suggestion pruning when `collate`
is set. If the new option is set, the phrase suggestion option will contain a boolean `collate_match`
indicating whether the respective result had hits in collation.
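
The effect of `prune` can be sketched like this, assuming a callable `has_hits` that stands in for running the collate query (both `collate` and `has_hits` are hypothetical names used only for this illustration):

```python
def collate(suggestions, has_hits, prune=False):
    """Filter or annotate phrase suggestions using a collate check.

    has_hits: hypothetical callable standing in for running the collate
    query; returns True when the phrase matches at least one document.
    """
    results = []
    for phrase in suggestions:
        matched = has_hits(phrase)
        if prune:
            # Keep every suggestion and report whether it matched.
            results.append({"text": phrase, "collate_match": matched})
        elif matched:
            # Without prune, only matching suggestions are returned.
            results.append({"text": phrase})
    return results
```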

Closes #6927
2014-07-22 17:17:15 -04:00
Adrien Grand abeefbddea Docs: Update documentation about execution hints for the terms aggregation. 2014-07-21 11:55:57 +02:00
Clinton Gormley 6a7a77eada Docs: Add links to client helper classes for bulk/scroll/reindexing 2014-07-18 13:55:47 +02:00
Simon Willnauer f9a9348508 [DOCS] Move benchmark API to 1.4 2014-07-16 15:02:20 +02:00
Brian Murphy d6cd2c2b73 [DOCS][FIX] Fix reference check in indexed scripts/templates doc. 2014-07-16 11:24:18 +01:00
Brian Murphy bc570919ee [DOCS][FIX] Fix doc parsing, broken closing block 2014-07-16 11:18:21 +01:00
Brian Murphy cbd2a97abd [DOCS] : Indexed scripts/templates
These are the docs for the indexed scripts/templates feature.
Also moved the namespace for the REST endpoints.

Closes #6851
2014-07-16 10:49:02 +01:00
Areek Zillur 76343899ea Phrase Suggester: Add collate option to PhraseSuggester
The newly added collate option will let the user provide a template query/filter which will be executed for every phrase suggestion generated to ensure that the suggestion matches at least one document for the filter/query.
The user can also add routing preference `preference` to route the collate query/filter and additional `params` to inject into the collate template.

Closes #3482
2014-07-14 16:07:52 -04:00
Britta Weber 74927adced significant terms: infrastructure for changing easily the significance heuristic
This commit adds the infrastructure to allow plugging in different
measures for computing the significance of a term.
Significance measures can be provided externally by overriding

- SignificanceHeuristic
- SignificanceHeuristicBuilder
- SignificanceHeuristicParser

closes #6561
2014-07-14 11:00:50 +02:00
Florian Hopf 3689f67a76 Docs: Fixed invalid word count in geodistance agg doc
Closes #6838
2014-07-11 18:35:36 +02:00
Clinton Gormley b6baa4be4a Update preference.asciidoc
Clarify that `preference` is a query string parameter only
and provide an example.
2014-07-09 11:13:17 +02:00
Clinton Gormley feb81e228b Docs: Rewrote the scroll/scan docs
Closes #6774
2014-07-08 11:54:53 +02:00
Andrii Gakhov 80321d89d9 Docs: Update histogram-aggregation.asciidoc
filter in a filtered query should be under "filter" key

Closes #6738
2014-07-07 10:44:11 +02:00
Carsten Brandt bd4699da7e Docs: fixed a typo in the docs
Closes: #6718
2014-07-07 10:41:36 +02:00
Duncan Angus Wilkie 60a8515fb7 Update histogram-facet.asciidoc
Spotted a typo, which I've fixed.
2014-07-01 10:49:43 +02:00
Clinton Gormley 64a4acc49b Docs: Added IDs to the highlighters for linking 2014-06-22 16:46:42 +02:00
Chris 011e20678d [DOCS] Fixed json example in nested-aggregation.asciidoc 2014-06-18 19:38:02 +02:00
Colin Goodheart-Smithe 7423ce0560 Aggregations: Added percentile rank aggregation
Percentile Rank Aggregation is the reverse of the Percentiles aggregation.  It determines the percentile rank (the proportion of values less than a given value) of the provided array of values.

Closes #6386
2014-06-18 12:02:08 +01:00
stephlag 13d910f016 Added missing comma in suggester example 2014-06-13 16:01:04 +02:00
Adrien Grand 01327d7136 Facets: deprecation.
Users are encouraged to move to the new aggregation framework that was
introduced in Elasticsearch 1.0.

Close #6485
2014-06-13 13:13:44 +02:00
Luke Fender f9da5259bc [DOCS] Fixed typo in post-filter.asciidoc
Remove 'be' where it is not needed
2014-06-12 12:09:19 +02:00
Martijn van Groningen 5e408f3d40 Change the top_hits to be a metric aggregation instead of a bucket aggregation (which can't have any sub aggs)
Closes #6395
Closes #6434
2014-06-10 09:09:50 +02:00
markharwood 724129e6ce Aggregations optimisation for memory usage. Added changes to core Aggregator class to support a new mode of deferred collection.
A new "breadth_first" results collection mode allows upper branches of aggregation tree to be calculated and then pruned
to a smaller selection before advancing into executing collection on child branches.

Closes #6128
2014-06-06 15:59:51 +01:00
fransflippo cdbde4a578 [DOCS] Reworded note about shorthand suggest syntax
The existing Note about the shorthand suggest syntax was poorly worded and confusing. Please check whether the way I've phrased it now is still correct as to what the shorthand form actually does and doesn't do: the original wording did not provide me enough information to be sure.
Thanks!
2014-06-06 10:21:01 +02:00
Jad Naous 5aa84c9aab [DOCS] Fixed typos in aggregations.asciidoc
Fix plural/singular forms.
2014-06-05 19:47:01 +02:00
Colin Goodheart-Smithe b9f4d44b14 Aggregations: Adds GeoBounds Aggregation
The GeoBounds Aggregation is a new single bucket aggregation which outputs the coordinates of a bounding box containing all the points from all the documents passed to the aggregation as well as the doc count. The GeoBounds Aggregation also uses a wrap_longitude parameter which specifies whether the resulting bounding box is permitted to overlap the international date line.  This option defaults to true.

This aggregation introduces the idea of MetricsAggregation which do not return double values and cannot be used for sorting.  The existing MetricsAggregation has been renamed to NumericMetricsAggregation and is a subclass of MetricsAggregation.  MetricsAggregations do not store doc counts and do not support child aggregations.

Closes #5634
2014-06-03 15:59:56 +01:00
javanna 5a1ad7b42e [DOCS] fixed curl requests in benchmark docs 2014-06-03 11:47:13 +02:00
leonardo menezes f3eca05c3b [DOCS] removed slowest on single query benchmark requests
Relates to #5904
2014-06-03 11:47:13 +02:00
Clinton Gormley 7fff6f1f43 Docs: Tidied percolate.asciidoc 2014-05-30 11:56:06 +02:00
Martijn van Groningen aab38fb2e6 Aggregations: added pagination support to `top_hits` aggregation by adding `from` option.
Closes #6299
2014-05-30 11:45:31 +02:00
Martijn van Groningen 5fafd2451a Added `top_hits` aggregation that keeps track of the most relevant document being aggregated per bucket.
Closes #6124
2014-05-23 16:01:18 +02:00
Nik Everett 3573822b7e Highlight fields in request order
Because json objects are unordered this also adds an explicit order syntax
that looks like
    "highlight": {
        "fields": [
            {"title":{ /*params*/ }},
            {"text":{ /*params*/ }}
        ]
    }

This is not useful for any of the builtin highlighters but will be useful
in plugins.

Closes #4649
2014-05-22 16:44:14 +02:00
Simon Willnauer 9d5507047f Update Documentation Feature Flags [1.2.0] 2014-05-22 15:06:42 +02:00
Clinton Gormley f950344546 [DOCS] Fixed title levels in context suggester 2014-05-21 20:47:25 +02:00
Simon Willnauer ec3b1c57ac Move Benchmark release to 1.3 2014-05-21 10:17:59 +02:00
Britta Weber 08e57890f8 use shard_min_doc_count also in TermsAggregation
This was discussed in issue #6041 and #5998 .

closes #6143
2014-05-14 14:10:04 +02:00
Clinton Gormley ff12585fea Improved wording in search-type.asciidoc
Closes #5951
2014-05-14 12:15:48 +02:00
David Pilato 1cb2c3bdd3 [DOCS] reverse-nested aggs are added in 1.2.0 2014-05-13 20:00:42 +02:00
Tiago Alves Macambira a8242e6c8c Clarify `missing` behavior. 2014-05-13 15:49:46 +02:00
Adrien Grand cc530b9037 Use t-digest as a dependency.
Our improvements to t-digest have been pushed upstream and t-digest also got
some additional nice improvements around memory usage and speedups of quantile
estimation. So it makes sense to use it as a dependency now.

This also allows removing the test dependency on Apache Mahout.

Close #6142
2014-05-13 10:38:08 +02:00
Clinton Gormley 3aac594503 [DOCS] Fix typos in context suggest 2014-05-13 10:34:16 +02:00
markharwood 1e560b0d92 Significant_terms agg: added option for a background_filter to define background context for analysis of term frequencies
Closes #5944
2014-05-13 09:10:30 +01:00
Clinton Gormley 5b93255ec8 [DOCS] Added "Aggregation" to all aggs titles 2014-05-13 01:35:58 +02:00
Rashid Khan 233aaa63c9 Change key to keyed 2014-05-12 13:15:07 -07:00
Alex Ksikes dae48d9fe8 Added the ability to include the queried document for More Like This API.
By default More Like This API excludes the queried document from the response.
However, when debugging or when comparing scores across different queries, it
could be useful to have the best possible matched hit. So this option lets users
explicitly specify the desired behavior.

Closes #6067
2014-05-09 12:59:39 +02:00
Alex Ksikes 48b7172ee7 Provided some insights as to how More Like This works internally.
In the Google Groups forum there appears to be some confusion as to what mlt
does. This documentation update should hopefully help demystifying this
feature, and provide some understanding as to how to use its parameters.

Closes #6092
2014-05-09 12:13:29 +02:00
Andrew Selden f23274523a Integration tests for benchmark API.
- Randomized integration tests for the benchmark API.
- Negative tests for cases where the cluster cannot run benchmarks.
- Return 404 on missing benchmark name.
- Allow to specify 'types' as an array in the JSON syntax when describing a benchmark competition.
- Don't record slowest for single-request competitions.

Closes #6003, #5906, #5903, #5904
2014-05-07 14:14:54 -07:00
uboness fc52db1209 Changed the response structure of the percentiles aggregation where now all the percentiles are placed under a `values` object (or `values` array in case the `keyed` flag is set to `false`)
Closes #5870
2014-05-07 18:35:24 +02:00
Britta Weber 7944369fd1 Add `shard_min_doc_count` parameter for significant terms similar to `shard_size`
Significant terms internally maintain a priority queue per shard with a size potentially
lower than the number of terms. This queue uses the score as criterion to determine if
a bucket is kept or not. If many terms with low subsetDF score very high
but the `min_doc_count` is set high, this might result in no terms being
returned because the pq is filled with low frequent terms which are all sorted
out in the end.

This can be avoided by increasing the `shard_size` parameter to a higher value.
However, it is not immediately clear to which value this parameter must be set
because we can not know how many terms with low frequency are scored higher than
the high frequent terms that we are actually interested in.

On the other hand, if there is no routing of docs to shards involved, we can maybe
assume that the documents of classes and also the terms therein are distributed evenly
across shards. In that case it might be easier to not add documents to the pq that have
subsetDF <= `shard_min_doc_count` which can be set to something like
`min_doc_count`/number of shards  because we would assume that even when summing up
the subsetDF across shards `min_doc_count` will not be reached.

closes #5998
closes #6041
2014-05-07 18:02:56 +02:00
gabriel-tessier 7b0efcbd96 fix typo 2014-05-06 15:54:36 +02:00
Audrey 52d2f2d229 [DOCS] Update phrase-suggest.asciidoc
Grammatical error

Close #5993
2014-05-06 10:28:13 +02:00
Martijn van Groningen 013b319415 Added `reverse_nested` aggregation.
The `reverse_nested` aggregation allows aggregating on properties outside of the nested scope of a `nested` aggregation.

Closes #5507
2014-05-01 00:23:05 +07:00
Lee Hinman 57bee03193 [DOCS] Add /_search_shards documentation 2014-04-22 08:54:32 -06:00
Clinton Gormley 3ba8fbbef8 Update benchmark.asciidoc
Fixed incorrect parameter spec for benchmark nodes
2014-04-22 14:16:10 +02:00
Clinton Gormley 0e782331be Update benchmark.asciidoc 2014-04-21 20:39:33 +02:00
David Pilato f3fe50aac4 [DOCS] fix typo 2014-04-19 22:44:44 +02:00
Scott Wilkerson 9ea0e3a95b Update percolate.asciidoc
fix typo
2014-04-15 16:01:44 +02:00
Andrew Selden 2cf66c4115 Benchmark documentation
Moving benchmark documentation under the search section.

Closes #5786
2014-04-14 14:08:41 -07:00
Malte Schirnacher 8ce3bba010 Fix typos in percolate.asciidoc
Close #5762 #5763 #5764
2014-04-11 18:09:16 +02:00
Andrew O'Brien 48031b6236 Fixes typo in "Scan" search type documentation 2014-04-07 16:01:37 -06:00
gabriel-tessier 000c33aac3 fix typo 2014-04-07 09:23:46 +02:00
Martijn van Groningen ade1d0ef57 Added global ordinals (unique incremental numbering for terms) to fielddata.
Added a terms aggregation implementations that work on global ordinals, which is also the default.

Closes #5672
2014-04-07 11:06:41 +07:00
Karl Meisterheim 6d993bc810 [DOCS] A few grammar and word use corrections 2014-04-04 19:26:38 +02:00
Alexander Reelsen e547e113e1 Geo context suggester: Require precision in mapping
The default precision was way too exact and could lead people to
think that geo context suggestions are not working. This patch now
requires you to set the precision in the mapping, as elasticsearch itself
can never tell exactly what the required precision for the user's
suggestions is.

Closes #5621
2014-04-02 23:51:14 +02:00
Hannes Korte c11293ad78 Fix some typos in documentation. 2014-03-31 13:48:17 +02:00
bleskes 5d832374dd Update Documentation Feature Flags [1.1.0] 2014-03-25 17:51:30 +01:00
Boaz Leskes fc8dc3f733 [Docs] updated the search template and query template docs 2014-03-25 15:25:02 +01:00
Alexander Reelsen 4fc461a97c [DOCS] Moved the template query documentation into search section 2014-03-25 10:01:41 +01:00
Simon Willnauer b4e504df99 [Docs] Add coming tag for context suggester docs 2014-03-25 09:46:49 +01:00