OpenSearch

Author	SHA1	Message	Date
markharwood	fe623acf66	Docs - removed experimental/beta markers from adjacency matrix aggregation (#34599 )	2018-10-19 09:33:59 +01:00
markharwood	2a413abb0b	Docs - remove experimental marker from significant_text aggregation (#34598 )	2018-10-19 09:32:02 +01:00
Jim Ferenczi	36557469f6	[DOCS] Removes beta label from composite aggregation (#34329 )	2018-10-05 19:46:20 +02:00
Nik Everett	dc2cf28fde	Docs: Allow skipping response assertions (#34240 ) We generate tests from our documentation, including assertions about the responses returned by a particular API. But sometimes we can't assert that the response is correct because of some deficiency in our tooling. Previously we marked the response `// NOTCONSOLE` to skip it, but this is kind of odd because `// NOTCONSOLE` is really to mark snippets that are json but aren't requests or responses. This introduces a new construct to skip response assertions: ``` // TESTRESPONSE[skip:reason we skipped this] ```	2018-10-04 08:03:38 -04:00
Serge Populov	13af5d5d7f	Docs: Fix typo in field name in aggregations (#34223 )	2018-10-02 10:54:29 -04:00
Ryan Ernst	3046656ab1	Scripting: Rework joda time backcompat (#33486 ) This commit switches the joda time backcompat in scripting to use augmentation over ZonedDateTime. The augmentation methods provide compatibility with the missing methods between joda's DateTime and java's ZonedDateTime. Due to getDayOfWeek returning an enum in the java API, ZonedDateTime is wrapped so that the method can return int like the joda time does. The java time api version is renamed to getDayOfWeekEnum, which will be kept through 7.x for compatibility while users switch back to getDayOfWeek once joda compatibility is removed.	2018-09-16 19:18:00 -07:00
Christoph Büscher	fe478c23b7	[Docs] Fix heading in composite-aggregation.asciidoc (#33627 ) The heading for the "Missing buckets" should be on the same level as the "Order" section.	2018-09-12 16:56:03 +02:00
Paul Sanwald	c303006e6b	Add interval response parameter to AutoDateInterval histogram (#33254 ) Adds the interval used to the aggregation response.	2018-09-05 07:35:59 -04:00
lipsill	b7c0d2830a	[Docs] Remove repeating words (#33087 )	2018-08-28 13:16:43 +02:00
Luca Cavanna	393eec1482	Set maxScore for empty TopDocs to NaN rather than 0 (#32938 ) We used to set `maxScore` to `0` within `TopDocs` in situations where there is really no score as the size was set to `0` and scores were not even tracked. In such scenarios, `Float.NaN` is more appropriate, which gets converted to `max_score: null` on the REST layer. That's also more consistent with Lucene, which sets `maxScore` to `Float.NaN` when merging empty `TopDocs` (see `TopDocs#merge`).	2018-08-22 17:23:54 +02:00
Ryan Ernst	478f6d6cf1	Scripting: Conditionally use java time api in scripting (#31441 ) This commit adds a boolean system property, `es.scripting.use_java_time`, which controls the concrete return type used by doc values within scripts. The return type of accessing doc values for a date field is changed to Object, essentially duck typing the type to allow co-existence during the transition from joda time to java time.	2018-08-01 08:58:49 -07:00
Colm O'Shea	97b379e0d4	fix no=>not typo (#32463 ) Found a tiny typo while reading the docs	2018-07-31 13:33:23 +01:00
Paul Sanwald	feb07559aa	fix typo	2018-07-13 14:59:11 -04:00
Colin Goodheart-Smithe	0edb096eb4	Adds a new auto-interval date histogram (#28993 ) * Adds a new auto-interval date histogram This change adds a new type of histogram aggregation called `auto_date_histogram` where you can specify the target number of buckets you require and it will find an appropriate interval for the returned buckets. The aggregation works by first collecting documents in buckets at second interval; when it has created more than the target number of buckets, it merges these buckets into minute-interval buckets and continues collecting until it reaches the target number of buckets again. It will keep merging buckets when it exceeds the target until either collection is finished or the highest interval (currently years) is reached. A similar process happens at reduce time. This aggregation intentionally does not support min_doc_count, offset and extended_bounds to keep the already complex logic from becoming more complex. The aggregation accepts sub-aggregations but will always operate in `breadth_first` mode deferring the computation of sub-aggregations until the final buckets from the shard are known. min_doc_count is effectively hard-coded to zero meaning that we will insert empty buckets where necessary. Closes #9572 * Adds documentation * Added sub aggregator test * Fixes failing docs test * Brings branch up to date with master changes * trying to get tests to pass again * Fixes multiBucketConsumer accounting * Collects more buckets than needed on shards This gives us more options at reduce time in terms of how we do the final merge of the buckets to produce the final result * Revert "Collects more buckets than needed on shards" This reverts commit 993c782d117892af9a3c86a51921cdee630a3ac5. * Adds ability to merge within a rounding * Fixes non-timezone doc test failure * Fix time zone tests * iterates on tests * Adds test case and documentation changes Added some notes in the documentation about the intervals that can be returned. Also added a test case that utilises the merging of consecutive buckets * Fixes performance bug The bug meant that getAppropriate rounding took a huge amount of time if the range of the data was large but also sparsely populated. In these situations the rounding would be very low so iterating through the rounding values from the min key to the max key took a long time (~120 seconds in one test). The solution is to add a rough estimate first which chooses the rounding based just on the long values of the min and max keys alone but selects the rounding one lower than the one it thinks is appropriate so the accurate method can choose the final rounding taking into account the fact that intervals are not always fixed length. The commit also adds more tests * Changes to only do complex reduction on final reduce * merge latest with master * correct tests and add a new test case for 10k buckets * refactor to perform bucket number check in innerBuild * correctly derive bucket setting, update tests to increase bucket threshold * fix checkstyle * address code review comments * add documentation for default buckets * fix typo	2018-07-13 13:08:35 -04:00
Jimi Ford	e955ffc38d	Docs: fix typo in datehistogram (#31972 )	2018-07-11 15:04:57 -04:00
Sue Gallagher	357a07e7a2	[DOCS] Fix heading format errors (#31483 ) * [DOCS] Fix heading format errors. Closes #31327 * [DOCS] Fix heading format errors. Closes #31327	2018-06-25 17:25:32 -07:00
Jim Ferenczi	e33d107f84	Add missing_bucket option in the composite agg (#29465 ) This change adds a new option to the composite aggregation named `missing_bucket`. This option can be set by source and dictates whether documents without a value for the source should be ignored. When set to true, documents without a value for a field emit an explicit `null` value which is then added in the composite bucket. The `missing` option that allows to set an explicit value (instead of `null`) is deprecated in this change and will be removed in a follow up (only in 7.x). This commit also changes how the big arrays are allocated, instead of reserving the provided `size` for all sources they are created with a small initial size and they grow depending on the number of buckets created by the aggregation. Closes #29380	2018-05-30 09:48:40 +02:00
Julie Tibshirani	638a719370	Ensure that ip_range aggregations always return bucket keys. (#30701 )	2018-05-24 08:55:14 -07:00
Piotr Prądzyński	a0a8c4f186	filters agg docs duplicated 'bucket' word removal (#30677 ) In one place word 'bucket' was duplicated.	2018-05-17 15:21:50 +01:00
Jason Tedor	4a4e3d70d5	Default to one shard (#30539 ) This commit changes the default out-of-the-box configuration for the number of shards from five to one. We think this will help address a common problem of oversharding. For users with time-based indices that need a different default, this can be managed with index templates. For users with non-time-based indices that find they need to re-shard with the split API in place they no longer need to resort only to reindexing. Since this has the impact of changing the default number of shards used in REST tests, we want to ensure that we still have coverage for issues that could arise from multiple shards. As such, we randomize (rarely) the default number of shards in REST tests to two. This is managed via a global index template. However, some tests check the templates that are in the cluster state during the test. Since this template is randomly there, we need a way for tests to skip adding the template used to set the number of shards to two. For this we add the default_shards feature skip. To avoid having to write our docs in a complicated way because sometimes they might be behind one shard, and sometimes they might be behind two shards we apply the default_shards feature skip to all docs tests. That is, these tests will always run with the default number of shards (one).	2018-05-14 12:22:35 -04:00
Christoph Büscher	21dbf9fab0	Document time unit limitations for date histograms (#30177 ) Adding some allowed abbreviated values for intervals in date histograms as well as documenting the limitations of intervals larger than days. Closes #23294	2018-04-26 19:44:21 +02:00
Adrien Grand	ebd6b5b7ba	Deprecate filtering on `_type`. (#29468 ) As indices are only allowed to have one type now, and types are going away in the future, we should deprecate filtering by `_type`. Relates #15613	2018-04-13 09:07:51 +02:00
Jim Ferenczi	5288235ca3	Optimize the composite aggregation for match_all and range queries (#28745 ) This change refactors the composite aggregation to add an execution mode that visits documents in the order of the values present in the leading source of the composite definition. This mode does not need to visit all documents since it can early terminate the collection when the leading source value is greater than the lowest value in the queue. Instead of collecting the documents in the order of their doc_id, this mode uses the inverted lists (or the bkd tree for numerics) to collect documents in the order of the values present in the leading source. For instance the following aggregation: ``` "composite" : { "sources" : [ { "value1": { "terms" : { "field": "timestamp", "order": "asc" } } } ], "size": 10 } ``` ... can use the field `timestamp` to collect the documents with the 10 lowest values for the field instead of visiting all documents. For composite aggregation with more than one source the execution can early terminate as soon as one of the 10 lowest values produces enough composite buckets. For instance if visiting the two lowest timestamps created 10 composite buckets we can early terminate the collection since it is guaranteed that the third lowest timestamp cannot create a composite key that compares lower than the one already visited. This mode can execute iff: * The leading source in the composite definition uses an indexed field of type `date` (works also with `date_histogram` source), `integer`, `long` or `keyword`. * The query is a match_all query or a range query over the field that is used as the leading source in the composite definition. * The sort order of the leading source is the natural order (ascending since postings and numerics are sorted in ascending order only). If these conditions are not met this aggregation visits each document like any other agg.	2018-03-26 09:51:37 +02:00
Paul Sanwald	6dae955b6a	Document and test date_range "missing" support (#28983 ) * Add a REST integration test that documents date_range support Add a test case that exercises date_range aggregations using the missing option. Addresses #17597 * Test cleanup and correction Adding a document with a null date to exercise `missing` option, update test name to something reasonable. * Update documentation to explain how the "missing" parameter works for date_range aggregations. * Wrap lines at 80 chars in docs. * Change format of test to YAML for readability.	2018-03-13 12:58:30 -07:00
Menno Oudshoorn	d018a0008e	Add a usage example of the JLH score (#28905 ) Adds a usage example of the JLH score used in significant terms aggregation. All other methods to calculate significance score have such an example Closes #28513	2018-03-06 15:37:18 +01:00
Tim Roes	5689dc1182	[Docs] Fix typo in composite aggregation (#28891 )	2018-03-04 11:47:24 -08:00
Clinton Gormley	45c1e37740	Add defined ID to terms agg size header	2018-02-02 13:43:20 +01:00
Jim Ferenczi	c4e0a84344	Mark the composite aggregation as a beta feature (#28431 ) The `composite` aggregation should be marked as beta (rather than experimental) in the documentation.	2018-02-02 09:24:10 +01:00
Jim Ferenczi	c26d4ac6c1	Always return the after_key in composite aggregation response (#28358 ) This change adds the `after_key` of a composite aggregation directly in the response. It is redundant when all buckets are not filtered/removed by a pipeline aggregation since in this case the `after_key` is always the last bucket in the response. Though when using a pipeline aggregation to filter composite buckets, the `after_key` can be lost if the last bucket is filtered. This commit fixes this situation by always returning the `after_key` in a dedicated section.	2018-01-25 09:15:27 +01:00
Jim Ferenczi	65184d0b5b	Adds a note in the `terms` aggregation docs regarding pagination (#28360 ) This change adds a note in the `terms` aggregation that explains how to retrieve all terms (or all combinations of terms in a nested agg) using the `composite` aggregation.	2018-01-25 08:59:41 +01:00
Alex Moros Marco	090ac3c2a2	[Doc] Fixes typo in reverse-nested-aggregation.asciidoc (#28348 )	2018-01-24 17:54:02 +01:00
Jim Ferenczi	b2ce994be7	[Docs] Fix asciidoc style in composite agg docs	2018-01-23 16:41:32 +01:00
Jim Ferenczi	19cfc25873	Adds the ability to specify a format on composite date_histogram source (#28310 ) This commit adds the ability to specify a date format on the `date_histogram` composite source. If the format is defined, the key for the source is returned as a formatted date. Closes #27923	2018-01-23 15:14:49 +01:00
Christoph Büscher	556d77c9ad	[Docs] Add note on limitation for significant_text with nested objects (#28052 ) Add section to `significant_text` documentation mentioning that it currently does not support use on nested objects. Relates to #28050	2018-01-03 16:28:23 +01:00
Shaunak Kashyap	da0ed578b2	Fixing typo in param name: values => sources (#28016 )	2017-12-28 18:18:30 +01:00
Adrien Grand	1b660821a2	Allow `_doc` as a type. (#27816 ) Allowing `_doc` as a type will enable users to make the transition to 7.0 smoother since the index APIs will be `PUT index/_doc/id` and `POST index/_doc`. This also moves most of the documentation to `_doc` as a type name. Closes #27750 Closes #27751	2017-12-14 17:47:53 +01:00
Christoph Büscher	0d11b9fe34	[Docs] Unify spelling of Elasticsearch (#27567 ) Removes occurrences of "elasticsearch" or "ElasticSearch" in favour of "Elasticsearch" where appropriate.	2017-11-29 09:44:25 +01:00
Clinton Gormley	d1b1d711df	Update composite-aggregation.asciidoc Fixed asciidoc typo	2017-11-23 15:05:14 +01:00
Takumasa Ochi	eed8d1aee5	[DOC] Fix mathematical representation on interval (range) (#27450 )	2017-11-21 17:06:26 +00:00
Jim Ferenczi	d1093bd2fa	#26800 : Fix docs rendering	2017-11-20 08:41:02 +01:00
Jim Ferenczi	623367d793	Add composite aggregator (#26800 ) * This change adds a module called `aggs-composite` that defines a new aggregation named `composite`. The `composite` aggregation is a multi-buckets aggregation that creates composite buckets made of multiple sources. The sources for each bucket can be defined as: * A `terms` source, values are extracted from a field or a script. * A `date_histogram` source, values are extracted from a date field and rounded to the provided interval. This aggregation can be used to retrieve all buckets of a deeply nested aggregation by flattening the nested aggregation in composite buckets. A composite bucket is composed of one value per source and is built for each document as the combinations of values in the provided sources. For instance the following aggregation: ```` "test_agg": { "terms": { "field": "field1" }, "aggs": { "nested_test_agg": { "terms": { "field": "field2" } } } } ```` ... which retrieves the top N terms for `field1` and for each top term in `field1` the top N terms for `field2`, can be replaced by a `composite` aggregation in order to retrieve all the combinations of `field1`, `field2` in the matching documents: ```` "composite_agg": { "composite": { "sources": [ { "field1": { "terms": { "field": "field1" } } }, { "field2": { "terms": { "field": "field2" } } } ] } } ```` The response of the aggregation looks like this: ```` "aggregations": { "composite_agg": { "buckets": [ { "key": { "field1": "alabama", "field2": "almanach" }, "doc_count": 100 }, { "key": { "field1": "alabama", "field2": "calendar" }, "doc_count": 1 }, { "key": { "field1": "arizona", "field2": "calendar" }, "doc_count": 1 } ] } } ```` By default this aggregation returns 10 buckets sorted in ascending order of the composite key. Pagination can be achieved by providing `after` values, the values of the composite key to aggregate after. For instance the following aggregation will aggregate all composite keys that sort after `alabama, calendar`: ```` "composite_agg": { "composite": { "after": {"field1": "alabama", "field2": "calendar"}, "size": 100, "sources": [ { "field1": { "terms": { "field": "field1" } } }, { "field2": { "terms": { "field": "field2" } } } ] } } ```` This aggregation is optimized for indices that set an index sorting that matches the composite source definition. For instance the aggregation above could run faster on indices that define an index sorting like this: ```` "settings": { "index.sort.field": ["field1", "field2"] } ```` In this case the `composite` aggregation can early terminate on each segment. This aggregation also accepts multi-valued fields but disables early termination for these fields even if index sorting matches the sources definition. This is mandatory because index sorting picks only one value per document to perform the sort.	2017-11-16 15:13:36 +01:00
shaulzorea	9db21cd23f	fixing typo in datehistogram-aggregation.asciidoc (#26924 )	2017-10-08 15:12:43 +02:00
shaulzorea	666cf4b872	fixing typo in nested-aggregation.asciidoc (#26481 )	2017-09-04 06:42:44 +02:00
Tanguy Leroux	643eb286dc	[Docs] Convert remaining code snippets in docs (#26422 ) This commit converts the last remaining code snippets so that they are now testable.	2017-08-30 12:11:10 +02:00
Jim Ferenczi	977dcfe789	Deprecate global_ordinals_hash and global_ordinals_low_cardinality (#26173 ) * Deprecate global_ordinals_hash and global_ordinals_low_cardinality This change deprecates the `global_ordinals_hash` and `global_ordinals_low_cardinality` and makes the `global_ordinals` execution hint choose internally if global ords should be remapped or use the segment ord directly. These hints are too sensitive and expert to be exposed and we should be able to take the right decision internally based on the agg tree.	2017-08-21 19:12:27 +02:00
Christoph Büscher	5dae277bb2	Support distance units in GeoHashGrid aggregation precision (#26291 ) Currently the `precision` parameter must be a precision level in the range of [1,12]. In #5042 it was suggested also supporting distance units like "1km" to automatically approximate the needed precision level. This change adds this support to the Rest API by making use of GeoUtils#geoHashLevelsForPrecision. Plain integer values without a unit are still treated as precision levels like before. Distance values that are too small to be represented by a precision level of 12 (values approx. less than 0.056m) are rejected. Closes #5042	2017-08-21 17:29:28 +02:00
Nik Everett	7e76b2a8c3	Docs: fold section into current chapter In #25602 we added a new chapter on aggregating by day of the week. We intended to add a new section but we were missing a single `=`.	2017-08-17 11:19:02 -04:00
Nik Everett	6d2c40e546	Enforce that responses in docs are valid json (#26249 ) All of the snippets in our docs marked with `// TESTRESPONSE` are checked against the response from Elasticsearch but, due to the way they are implemented they are actually parsed as YAML instead of JSON. Luckily, all valid JSON is valid YAML! Unfortunately that means that invalid JSON has snuck into the examples! This adds a step during the build to parse them as JSON and fail the build if they don't parse. But no! It isn't quite that simple. The displayed text of some of these responses looks like: ``` { ... "aggregations": { "range": { "buckets": [ { "to": 1.4436576E12, "to_as_string": "10-2015", "doc_count": 7, "key": "-10-2015" }, { "from": 1.4436576E12, "from_as_string": "10-2015", "doc_count": 0, "key": "10-2015-" } ] } } } ``` Note the `...` which isn't valid json but we like it anyway and want it in the output. We use substitution rules to convert the `...` into the response we expect. That yields a response that looks like: ``` { "took": $body.took,"timed_out": false,"_shards": $body._shards,"hits": $body.hits, "aggregations": { "range": { "buckets": [ { "to": 1.4436576E12, "to_as_string": "10-2015", "doc_count": 7, "key": "-10-2015" }, { "from": 1.4436576E12, "from_as_string": "10-2015", "doc_count": 0, "key": "10-2015-" } ] } } } ``` That is what the tests consume but it isn't valid JSON! Oh no! We don't want to go update all the substitution rules because that'd be huge and, ultimately, wouldn't buy much. So we quote the `$body.took` bits before parsing the JSON. Note the responses that we use for the `_cat` APIs are all converted into regexes and there is no expectation that they are valid JSON. Closes #26233	2017-08-17 09:02:10 -04:00
Zachary Tong	829f7cb658	CONSOLEify ip-range bucket agg docs Related #18160	2017-08-03 17:19:54 -04:00
Clinton Gormley	ff4a2519f2	Update experimental labels in the docs (#25727 ) Relates https://github.com/elastic/elasticsearch/issues/19798 Removed experimental label from: * Painless * Diversified Sampler Agg * Sampler Agg * Significant Terms Agg * Terms Agg document count error and execution_hint * Cardinality Agg precision_threshold * Pipeline Aggregations * index.shard.check_on_startup * index.store.type (added warning) * Preloading data into the file system cache * foreach ingest processor * Field caps API * Profile API Added experimental label to: * Moving Average Agg Prediction Changed experimental to beta for: * Adjacency matrix agg * Normalizers * Tasks API * Index sorting Labelled experimental in Lucene: * ICU plugin custom rules file * Flatten graph token filter * Synonym graph token filter * Word delimiter graph token filter * Simple pattern tokenizer * Simple pattern split tokenizer Replaced experimental label with warning that details may change in the future: * Analysis explain output format * Segments verbose output format * Percentile Agg compression and HDR Histogram * Percentile Rank Agg HDR Histogram	2017-07-18 14:06:22 +02:00
Simon Willnauer	e81804cfa4	Add a shard filter search phase to pre-filter shards based on query rewriting (#25658 ) Today if we search across a large amount of shards we hit every shard. Yet, it's quite common to search across an index pattern for time based indices but filtering will exclude all results outside a certain time range ie. `now-3d`. While the search can potentially hit hundreds of shards the majority of the shards might yield 0 results since there is no document that is within this date range. Kibana for instance does this regularly but used `_field_stats` to optimize the indexes they need to query. Now with the deprecation of `_field_stats` and its upcoming removal a single dashboard in kibana can potentially turn into searches hitting hundreds or thousands of shards and that can easily cause search rejections even though most of the requests are very likely super cheap and only need a query rewriting to early terminate with 0 results. This change adds a pre-filter phase for searches that can, if the number of shards is higher than the `pre_filter_shard_size` threshold (defaults to 128 shards), fan out to the shards and check if the query can potentially match any documents at all. While false positives are possible, a negative response means that no matches are possible. These requests are not subject to rejection and can greatly reduce the number of shards a request needs to hit. The approach here is preferable to the kibana approach with field stats since it correctly handles aliases and uses the correct threadpools to execute these requests. Further it's completely transparent to the user and improves scalability of elasticsearch in general on large clusters.	2017-07-12 22:19:20 +02:00
matarrese	2eafbaf759	Document aggregating by day of the week (#25602 ) Add documentation for aggregating by day of the week. Closes #24660	2017-07-07 14:16:53 -04:00
Clinton Gormley	0170e0e8d3	Remove usage of multi-types from the docs and added a page explaining type removal (#25543 ) Closes #25401	2017-07-05 12:30:19 +02:00
Ryan Ernst	a03b6c2fa5	Scripting: Change keys for inline/stored scripts to source/id (#25127 ) This commit adds back "id" as the key within a script to specify a stored script (which with file scripts now gone is no longer ambiguous). It also adds "source" as a replacement for "code". This is in an attempt to normalize how scripts are specified across both put stored scripts and script usages, including search template requests. This also deprecates the old inline/stored keys.	2017-06-09 08:29:25 -07:00
Tanguy Leroux	528bd25fa7	Add superset size to Significant Term REST response (#24865 ) This commit adds a new bg_count field to the REST response of SignificantTerms aggregations. Similarly to the bg_count that already exists in significant terms buckets, this new bg_count field is set at the aggregation level and is populated with the superset size value.	2017-06-02 09:45:15 +02:00
markharwood	b7197f5e21	SignificantText aggregation - like significant_terms, but for text (#24432 ) * SignificantText aggregation - like significant_terms but doesn’t require fielddata=true, recommended for use with `sampler` agg to limit expense of tokenizing docs and takes optional `filter_duplicate_text`:true setting to avoid stats skew from repeated sections of text in search results. Closes #23674	2017-05-24 13:46:43 +01:00
Ryan Ernst	463fe2f4d4	Scripting: Remove file scripts (#24627 ) This commit removes file scripts, which were deprecated in 5.5. closes #21798	2017-05-17 14:42:25 -07:00
Zachary Tong	a2845c86fe	CONSOLEify some more aggregation docs Related #18160	2017-05-16 17:25:24 -04:00
qwerty4030	e7d352b489	Compound order for histogram aggregations. (#22343 ) This commit adds support for histogram and date_histogram agg compound order by refactoring and reusing terms agg order code. The major change is that the Terms.Order and Histogram.Order classes have been replaced/refactored into a new class BucketOrder. This is a breaking change for the Java Transport API. For backward compatibility with previous ES versions the (date)histogram compound order will use the first order. Also the _term and _time aggregation order keys have been deprecated; replaced by _key. Relates to #20003: now that all these aggregations use the same order code, it should be easier to move validation to parse time (as a follow up PR). Relates to #14771: histogram and date_histogram aggregation order will now be validated at reduce time. Closes #23613: if a single BucketOrder that is not a tie-breaker is added with the Java Transport API, it will be converted into a CompoundOrder with a tie-breaker.	2017-05-11 18:06:26 +01:00
Adrien Grand	1be2800120	Only allow one type on 7.0 indices (#24317 ) This adds the `index.mapping.single_type` setting, which enforces that indices have at most one type when it is true. The default value is true for 6.0+ indices and false for old indices. Relates #15613	2017-04-27 08:43:20 +02:00
Suhas Karanth	cee76295ca	Update aggs reference documentation for 'keyed' options (#23758 ) Add 'keyed' parameter documentation for following: - Date Histogram Aggregation - Date Range Aggregation - Geo Distance Aggregation - Histogram Aggregation - IP range aggregation - Percentiles Aggregation - Percentile Ranks Aggregation	2017-04-18 15:57:50 +02:00
Ulugbek Baymuradov	9cb477d387	Update filter-aggregation.asciidoc (#24138 ) Fix a discrepancy between the example and the prose.	2017-04-17 18:46:13 -04:00
Nik Everett	5f91241f57	CONSOLEify geo aggregation docs Turns the top example in each of the geo aggregation docs into a working example that can be opened in CONSOLE. Subsequent examples can all also be opened in console and will work after you've run the first example. All examples are tested as part of the build.	2017-03-30 21:28:52 -04:00
Christoph Büscher	413bf05956	Docs: Add comma to reverse nested agg snippet	2017-03-17 14:07:18 +01:00
Randall Britten	98e19cced4	Docs: Corrected definition of type param of children agg (#23377 )	2017-02-27 14:38:28 -05:00
Nik Everett	0c011cb290	Docs: CONSOLEify histogram aggregation docs This adds the `COPY AS CURL` and `VIEW IN CONSOLE` links to the docs and causes the snippets to be tested during Elasticsearch's build. Relates to #18160	2017-02-07 16:09:32 -05:00
Jun Ohtani	7ea457955d	Merge pull request #22879 from johtani/fix_documentation_error_in_date_histogram [Doc]Not support "M" time unit in offset param	2017-02-03 16:40:08 +09:00
Nicholas Knize	b41d5747f0	Reduce GeoDistance insanity GeoDistance query, sort, and scripts make use of a crazy GeoDistance enum for handling 4 different ways of computing geo distance: SLOPPY_ARC, ARC, FACTOR, and PLANE. Only two of these are necessary: ARC, PLANE. This commit removes SLOPPY_ARC, and FACTOR and cleans up the way Geo distance is computed.	2017-02-02 12:39:42 -06:00
markharwood	9e8e556b08	Build fix for broken docs build	2017-01-31 10:27:06 +00:00
markharwood	c0d525b108	[DOCS] [TEST] enhancement - added CONSOLE scripts for sampler aggs (#22869 ) Added missing CONSOLE scripts to documentation for sampler and diversified_sampler aggs. Includes new StackOverflow index setup in build.gradle Closes #22746 * Formatting tweaks	2017-01-31 09:45:25 +00:00
Jun Ohtani	94933f9d19	[Doc]Not support "M" time unit in offset param	2017-01-31 18:23:38 +09:00
Mathieu Berube	e0b8e45cc5	Fix typo - mergins to margins (#22839 )	2017-01-30 13:52:32 +01:00
Nik Everett	a99bddcc7e	CONSOLE-ify filter aggregation docs This adds the `VIEW IN CONSOLE` and `COPY AS CURL` links to the snippet and causes the build to execute the snippet as a test. Relates to #18160	2017-01-23 01:32:56 -05:00
Nik Everett	40e2645177	CONSOLE-ify date_range aggregation docs This adds the `VIEW IN CONSOLE` and `COPY AS CURL` links to the snippets in the docs for the `date_range` aggregation and tests those snippets as part of the build. Relates to #18160	2017-01-22 23:38:45 -05:00
Nik Everett	f7524fbdef	CONSOLE-ify date histogram docs This adds the `VIEW IN SENSE` and `COPY AS CURL` links and has the build automatically execute the snippets and verify that they work. Relates to #18160	2017-01-20 16:23:28 -05:00
Nik Everett	8c856eaa9f	CONSOLE-ify global-aggregation.asciidoc Adds the `VIEW IN CONSOLE` and `COPY AS CURL` links to the example `global` aggregation. Also improves the example by adding a non-`global` aggregation to compare it to. Relates to #18160	2017-01-20 14:36:51 -05:00
markharwood	f01784205f	New AdjacencyMatrix aggregation Similar to the Filters aggregation but only supports "keyed" filter buckets and automatically "ANDs" pairs of filters to produce a form of adjacency matrix. The intersection of buckets "A" and "B" is named "A&B" (the choice of separator is configurable). Empty intersection buckets are removed from the final results. Closes #22169	2017-01-20 15:49:31 +00:00
Jim Ferenczi	433c822d4f	Promote longs to doubles when a terms agg mixes decimal and non-decimal numbers (#22449 ) * Promote longs to doubles when a terms agg mixes decimal and non-decimal number This change makes the terms aggregation work when the buckets coming from different indices are a mix of decimal numbers and non-decimal numbers. In this case non-decimal number (longs) are promoted to decimal (double) which can result in a loss of precision for big numbers. Fixes #22232	2017-01-10 11:50:56 +01:00
Adrien Grand	787519ee4c	Fix `other_bucket` on the `filters` agg to be enabled if a key is set. (#21994 ) Closes #21951	2016-12-09 09:48:48 +01:00
Colin Goodheart-Smithe	8006b105f3	Update order examples to use max instead of avg (#22032 ) The use of the avg aggregation for sorting the terms aggregation is not encouraged since it has unbounded error. This changes the examples to use the max aggregation which does not suffer the same issues	2016-12-07 16:00:24 +00:00
markharwood	aa60e5cc07	Aggregations - support for partitioning set of terms used in aggregations so that multiple requests can be done without trying to compute everything in one request. Closes #21487	2016-11-24 15:10:46 +00:00
Chris Fritz	546fa92d61	Fix typo in filters aggregation docs (#21690 )	2016-11-21 12:52:45 +01:00
Christoph Büscher	4ccd8e79c1	Docs: Clarify date_histogram bucket sizes for DST time zones Added a warning note that clarifies bucket sizes diverging from the intended `interval` size when using a time zone that has DST changes. Closes #18805	2016-11-16 09:40:07 +01:00
Sumit Gupta	e53405f4f3	Update geohashgrid-aggregation.asciidoc (#21530 )	2016-11-15 10:49:02 +01:00
Clinton Gormley	30d342c87c	Update significantterms-aggregation.asciidoc Fix scripted significant terms example to use `params.` prefix for painless	2016-11-14 09:40:04 +01:00
markharwood	dd21aa41be	Docs fix - Diversified sampler agg had incorrect title and example Closes #21347	2016-11-07 10:46:22 +00:00
Robin Clarke	bbe6555b7a	Docs: your -> you're (#20883 )	2016-10-12 11:09:34 -04:00
Pascal Borreli	fcb01deb34	Fixed typos (#20843 )	2016-10-10 14:51:47 -06:00
Nik Everett	9271c0302f	CONSOLEify some aggs docs Cleans up the example result in `children-aggregation` so that it matches the example data. Relates to #18160	2016-10-03 09:22:56 -04:00
Nik Everett	5cff2a046d	Remove most of the need for `// NOTCONSOLE` and be much more stingy about what we consider a console candidate. * Add `// CONSOLE` to check-running * Fix version in some snippets * Mark groovy snippets as groovy * Fix versions in plugins * Fix language marker errors * Fix language parsing in snippets This adds support for snippets whose language is written like `[source, txt]` and `["source","js",subs="attributes,callouts"]`. This also makes language required for snippets which is nice because then we can be sure we can grep for snippets in a particular language.	2016-09-06 10:32:54 -04:00
Clinton Gormley	de208cf78c	Fixed bad asciidoc	2016-08-18 14:08:58 +02:00
Clinton Gormley	31e5e0b17f	Document that pipeline aggs cannot be used for sorting Closes #20037	2016-08-18 13:52:45 +02:00
Adrien Grand	a0818d3b87	Split regular histograms from date histograms. #19551 Currently both aggregations really share the same implementation. This commit splits the implementations so that regular histograms can support decimal intervals/offsets and compute correct buckets for negative decimal values. However the response API is still the same. So for instance both regular histograms and date histograms will produce an `org.elasticsearch.search.aggregations.bucket.histogram.Histogram` aggregation. The optimization to compute an identifier of the rounded value and the rounded value itself has been removed since it was only used by regular histograms, which now do the rounding themselves instead of relying on the Rounding abstraction. Closes #8082 Closes #4847	2016-08-03 08:39:48 +02:00
Adrien Grand	dcc598c414	Make the heuristic to compute the default shard size less aggressive. The current heuristic to compute a default shard size is pretty aggressive, it returns `max(10, number_of_shards * size)` as a value for the shard size. I think making it less aggressive has the benefit that it would reduce the likelihood of running into OOME when there are many shards (yearly aggregations with time-based indices can make numbers of shards in the thousands) and make the use of breadth-first more likely/efficient. This commit replaces the heuristic with `size * 1.5 + 10`, which is enough to have good accuracy on zipfian distributions.	2016-07-29 09:59:29 +02:00
Jared McQueen	d97b3fd817	[docs] missing a comma in the terms aggregation example	2016-07-27 12:59:38 -04:00
Leon Weidauer	1297a707da	non-binary gender option in term aggr. example (#19188 ) * non-binary gender option in term aggr. example * replace gender with music genre for term aggregation docs	2016-07-01 14:59:03 +02:00
Jason Tedor	00356edd33	Clarify time units usage in docs This commit clarifies the distinction between supported time units for durations and supported time units for durations in the docs. Relates #19159	2016-06-29 17:02:15 -04:00
Robert Muir	6fc1a22977	cutover some docs to painless	2016-06-27 09:55:16 -04:00
Jim Ferenczi	fb2a48d0f0	Revert "Remove support for sorting terms aggregation by ascending count" This is delayed after alpha4 since Kibana relies on it.	2016-06-17 17:14:01 +02:00
Jim Ferenczi	755721953b	Remove support for sorting terms aggregation by ascending count closes #17614	2016-06-17 15:06:49 +02:00

195 Commits