OpenSearch

mirror of https://github.com/honeymoose/OpenSearch.git synced 2025-03-02 17:09:18 +00:00

Author	SHA1	Message	Date
Dimitris Athanasiou	66bef26495	Aggregations: bucket_sort pipeline aggregation (#27152 ) This commit adds a parent pipeline aggregation that allows sorting the buckets of a parent multi-bucket aggregation. The aggregation also offers [from] and [size] parameters in order to truncate the result as desired. Closes #14928	2017-11-09 17:59:57 +00:00
Dimitrios Athanasiou	3796471ac4	[Docs] Fix note in bucket_selector	2017-10-30 15:20:46 +00:00
Martijn van Groningen	87c9b79b10	Return the _source of inner hit nested as is without wrapping it into its full path context Due to a change made in #26102 to make the nested source consistent with or without source filtering, the _source of a nested inner hit was always wrapped in the parent path. This turned out to be not ideal for users relying on the nested source, as it would require additional parsing on the client side. This change fixes this: the _source of nested inner hits is no longer wrapped by parent json objects, regardless of whether the _source is included as is or source filtering is used. Internally, source filtering and highlighting rely on the fact that the _source of nested inner hits is accessible by its full field path, so in order to not break this, the conversion of the _source into its binary form is performed in FetchSourceSubPhase, after any potential source filtering is performed, to make sure the structure of the _source of the nested inner hit is consistent regardless of whether source filtering is performed. PR for #26944 Closes #26944	2017-10-19 12:04:56 +02:00
shaulzorea	9db21cd23f	fixing typo in datehistogram-aggregation.asciidoc (#26924 )	2017-10-08 15:12:43 +02:00
Ryan Ernst	c0c5d5488f	Docs: Remove remaining references to file and native scripts (#26580 ) relates #25690	2017-09-11 11:39:29 -07:00
shaulzorea	666cf4b872	fixing typo in nested-aggregation.asciidoc (#26481 )	2017-09-04 06:42:44 +02:00
Tanguy Leroux	3d07bce504	[Docs] Fix tophits-aggregation.asciidoc	2017-08-30 13:06:44 +02:00
Tanguy Leroux	643eb286dc	[Docs] Convert remaining code snippets in docs (#26422 ) This commit converts the last remaining code snippets so that they are now testable.	2017-08-30 12:11:10 +02:00
Jim Ferenczi	977dcfe789	Deprecate global_ordinals_hash and global_ordinals_low_cardinality (#26173 ) * Deprecate global_ordinals_hash and global_ordinals_low_cardinality This change deprecates the `global_ordinals_hash` and `global_ordinals_low_cardinality` and makes the `global_ordinals` execution hint choose internally if global ords should be remapped or use the segment ord directly. These hints are too sensitive and expert to be exposed and we should be able to take the right decision internally based on the agg tree.	2017-08-21 19:12:27 +02:00
Christoph Büscher	5dae277bb2	Support distance units in GeoHashGrid aggregation precision (#26291 ) Currently the `precision` parameter must be a precision level in the range of [1,12]. In #5042 it was suggested to also support distance units like "1km" to automatically approximate the needed precision level. This change adds this support to the Rest API by making use of GeoUtils#geoHashLevelsForPrecision. Plain integer values without a unit are still treated as precision levels like before. Distance values that are too small to be represented by a precision level of 12 (values approx. less than 0.056m) are rejected. Closes #5042	2017-08-21 17:29:28 +02:00
Nik Everett	7e76b2a8c3	Docs: fold section into current chapter In #25602 we added a new chapter on aggregating by day of the week. We intended to add a new section but we were missing a single `=`.	2017-08-17 11:19:02 -04:00
Nik Everett	6d2c40e546	Enforce that responses in docs are valid json (#26249 ) All of the snippets in our docs marked with `// TESTRESPONSE` are checked against the response from Elasticsearch but, due to the way they are implemented, they are actually parsed as YAML instead of JSON. Luckily, all valid JSON is valid YAML! Unfortunately that means that invalid JSON has snuck into the examples! This adds a step during the build to parse them as JSON and fail the build if they don't parse. But no! It isn't quite that simple. The displayed text of some of these responses looks like: ``` { ... "aggregations": { "range": { "buckets": [ { "to": 1.4436576E12, "to_as_string": "10-2015", "doc_count": 7, "key": "-10-2015" }, { "from": 1.4436576E12, "from_as_string": "10-2015", "doc_count": 0, "key": "10-2015-" } ] } } } ``` Note the `...` which isn't valid json but we like it anyway and want it in the output. We use substitution rules to convert the `...` into the response we expect. That yields a response that looks like: ``` { "took": $body.took,"timed_out": false,"_shards": $body._shards,"hits": $body.hits, "aggregations": { "range": { "buckets": [ { "to": 1.4436576E12, "to_as_string": "10-2015", "doc_count": 7, "key": "-10-2015" }, { "from": 1.4436576E12, "from_as_string": "10-2015", "doc_count": 0, "key": "10-2015-" } ] } } } ``` That is what the tests consume but it isn't valid JSON! Oh no! We don't want to go update all the substitution rules because that'd be huge and, ultimately, wouldn't buy much. So we quote the `$body.took` bits before parsing the JSON. Note the responses that we use for the `_cat` APIs are all converted into regexes and there is no expectation that they are valid JSON. Closes #26233	2017-08-17 09:02:10 -04:00
Zachary Tong	829f7cb658	CONSOLEify ip-range bucket agg docs Related #18160	2017-08-03 17:19:54 -04:00
Zachary Tong	e7eda5e1be	CONSOLEify scripted-metric agg docs Related #18160	2017-08-03 17:19:54 -04:00
Zachary Tong	d8414ffa29	CONSOLEify percentile and percentile-ranks docs Related #18160	2017-08-02 17:47:27 -04:00
Zachary Tong	268923ebdc	CONSOLEify extended_stats docs Related #18160	2017-08-02 16:13:30 -04:00
Clinton Gormley	ff4a2519f2	Update experimental labels in the docs (#25727 ) Relates https://github.com/elastic/elasticsearch/issues/19798 Removed experimental label from: * Painless * Diversified Sampler Agg * Sampler Agg * Significant Terms Agg * Terms Agg document count error and execution_hint * Cardinality Agg precision_threshold * Pipeline Aggregations * index.shard.check_on_startup * index.store.type (added warning) * Preloading data into the file system cache * foreach ingest processor * Field caps API * Profile API Added experimental label to: * Moving Average Agg Prediction Changed experimental to beta for: * Adjacency matrix agg * Normalizers * Tasks API * Index sorting Labelled experimental in Lucene: * ICU plugin custom rules file * Flatten graph token filter * Synonym graph token filter * Word delimiter graph token filter * Simple pattern tokenizer * Simple pattern split tokenizer Replaced experimental label with warning that details may change in the future: * Analysis explain output format * Segments verbose output format * Percentile Agg compression and HDR Histogram * Percentile Rank Agg HDR Histogram	2017-07-18 14:06:22 +02:00
Simon Willnauer	e81804cfa4	Add a shard filter search phase to pre-filter shards based on query rewriting (#25658 ) Today if we search across a large number of shards we hit every shard. Yet, it's quite common to search across an index pattern for time based indices but filtering will exclude all results outside a certain time range, i.e. `now-3d`. While the search can potentially hit hundreds of shards, the majority of the shards might yield 0 results since there is no document that is within this date range. Kibana for instance does this regularly but used `_field_stats` to optimize the indexes they need to query. Now with the deprecation of `_field_stats` and its upcoming removal, a single dashboard in Kibana can potentially turn into searches hitting hundreds or thousands of shards, and that can easily cause search rejections even though most of the requests are very likely super cheap and only need a query rewriting to early terminate with 0 results. This change adds a pre-filter phase for searches that can, if the number of shards is higher than the `pre_filter_shard_size` threshold (defaults to 128 shards), fan out to the shards and check if the query can potentially match any documents at all. While false positives are possible, a negative response means that no matches are possible. These requests are not subject to rejection and can greatly reduce the number of shards a request needs to hit. The approach here is preferable to the Kibana approach with field stats since it correctly handles aliases and uses the correct threadpools to execute these requests. Further, it's completely transparent to the user and improves scalability of Elasticsearch in general on large clusters.	2017-07-12 22:19:20 +02:00
matarrese	2eafbaf759	Document aggregating by day of the week (#25602 ) Add documentation for aggregating by day of the week. Closes #24660	2017-07-07 14:16:53 -04:00
Clinton Gormley	0170e0e8d3	Remove usage of multi-types from the docs and added a page explaining type removal (#25543 ) Closes #25401	2017-07-05 12:30:19 +02:00
Alexander Kazakov	64abc47ab0	[Docs] Fix documentation for percentiles bucket aggregation (#25229 )	2017-06-15 10:16:32 +02:00
Ryan Ernst	a03b6c2fa5	Scripting: Change keys for inline/stored scripts to source/id (#25127 ) This commit adds back "id" as the key within a script to specify a stored script (which with file scripts now gone is no longer ambiguous). It also adds "source" as a replacement for "code". This is in an attempt to normalize how scripts are specified across both put stored scripts and script usages, including search template requests. This also deprecates the old inline/stored keys.	2017-06-09 08:29:25 -07:00
Colin Goodheart-Smithe	5e7a79636d	[DOCS] Clarify behaviour of scripted-metric arg with empty parent buckets	2017-06-02 11:00:27 +01:00
Tanguy Leroux	528bd25fa7	Add superset size to Significant Term REST response (#24865 ) This commit adds a new bg_count field to the REST response of SignificantTerms aggregations. Similarly to the bg_count that already exists in significant terms buckets, this new bg_count field is set at the aggregation level and is populated with the superset size value.	2017-06-02 09:45:15 +02:00
Tanguy Leroux	28d97df67c	Add document count to Matrix Stats aggregation response (#24776 ) This commit adds a `doc_count` field to the response body of Matrix Stats aggregation. It exposes the number of documents involved in the computation of statistics, a value that can already be retrieved using the method MatrixStats.getDocCount() in the Java API.	2017-05-30 09:39:41 +02:00
markharwood	b7197f5e21	SignificantText aggregation - like significant_terms, but for text (#24432 ) * SignificantText aggregation - like significant_terms but doesn’t require fielddata=true, recommended for use with the `sampler` agg to limit the expense of tokenizing docs, and takes an optional `filter_duplicate_text`:true setting to avoid stats skew from repeated sections of text in search results. Closes #23674	2017-05-24 13:46:43 +01:00
Ryan Ernst	463fe2f4d4	Scripting: Remove file scripts (#24627 ) This commit removes file scripts, which were deprecated in 5.5. closes #21798	2017-05-17 14:42:25 -07:00
Zachary Tong	a2845c86fe	CONSOLEify some more aggregation docs Related #18160	2017-05-16 17:25:24 -04:00
Vlad Holubiev	557390d7d1	Fix typo in example (grades_count -> types_count) (#24635 ) Looks like `doc.grade` was used for examples before. But not anymore - https://www.elastic.co/guide/en/elasticsearch/reference/2.4/search-aggregations-metrics-valuecount-aggregation.html	2017-05-15 14:08:46 -04:00
qwerty4030	e7d352b489	Compound order for histogram aggregations. (#22343 ) This commit adds support for histogram and date_histogram agg compound order by refactoring and reusing terms agg order code. The major change is that the Terms.Order and Histogram.Order classes have been replaced/refactored into a new class BucketOrder. This is a breaking change for the Java Transport API. For backward compatibility with previous ES versions the (date)histogram compound order will use the first order. Also the _term and _time aggregation order keys have been deprecated; replaced by _key. Relates to #20003: now that all these aggregations use the same order code, it should be easier to move validation to parse time (as a follow up PR). Relates to #14771: histogram and date_histogram aggregation order will now be validated at reduce time. Closes #23613: if a single BucketOrder that is not a tie-breaker is added with the Java Transport API, it will be converted into a CompoundOrder with a tie-breaker.	2017-05-11 18:06:26 +01:00
Suhas Karanth	09c5fbfd00	Docs: Correct description of example (#24541 ) Copy and paste error.	2017-05-09 15:18:43 -04:00
Zachary Tong	4e49c618f2	CONSOLEify Stats Aggregation docs (#24373 )	2017-05-01 13:33:24 -04:00
Zachary Tong	130f1a56f1	Re-enable doc testing for Pipeline Aggregations (#24374 ) * Re-enable doc testing for Pipeline Aggregations Also adds a response + test for movavg pipeline	2017-05-01 13:30:51 -04:00
Christoph Büscher	16a7cbe463	Add `count` value to rest output of `geo_centroid` (#24387 ) Currently we don't write the count value to the geo_centroid aggregation rest response, but it is provided via the java api and the count() method in the GeoCentroid interface. We should add this parameter to the rest output and also provide it via the getProperty() method.	2017-04-28 16:25:22 +02:00
Adrien Grand	1be2800120	Only allow one type on 7.0 indices (#24317 ) This adds the `index.mapping.single_type` setting, which enforces that indices have at most one type when it is true. The default value is true for 6.0+ indices and false for old indices. Relates #15613	2017-04-27 08:43:20 +02:00
Suhas Karanth	cee76295ca	Update aggs reference documentation for 'keyed' options (#23758 ) Add 'keyed' parameter documentation for following: - Date Histogram Aggregation - Date Range Aggregation - Geo Distance Aggregation - Histogram Aggregation - IP range aggregation - Percentiles Aggregation - Percentile Ranks Aggregation	2017-04-18 15:57:50 +02:00
Adrien Grand	4632661bc7	Upgrade to a Lucene 7 snapshot (#24089 ) We want to upgrade to Lucene 7 ahead of time in order to be able to check whether it causes any trouble to Elasticsearch before Lucene 7.0 gets released. From a user perspective, the main benefit of this upgrade is the enhanced support for sparse fields, whose resource consumption is now a function of the number of docs that have a value rather than the total number of docs in the index. Some notes about the change: - it includes the deprecation of the `disable_coord` parameter of the `bool` and `common_terms` queries: Lucene has removed support for coord factors - it includes the deprecation of the `index.similarity.base` expert setting, since it was only useful to configure coords and query norms, which have both been removed - two tests have been marked with `@AwaitsFix` because of #23966, which we intend to address after the merge	2017-04-18 15:17:21 +02:00
Andrew Selden	f8b15abe9a	Update reference docs for geocentroid aggregation. (#24141 ) This includes a link to the Wikipedia page explaining what a centroid is. Closes #24140	2017-04-17 21:27:43 -04:00
Ulugbek Baymuradov	9cb477d387	Update filter-aggregation.asciidoc (#24138 ) Fix a discrepancy between the example and the prose.	2017-04-17 18:46:13 -04:00
Suhas Karanth	777b5a3c16	Correct documentation for Min Bucket Aggregation (#23867 )	2017-04-05 12:39:37 +02:00
Nik Everett	5f91241f57	CONSOLEify geo aggregation docs Turns the top example in each of the geo aggregation docs into a working example that can be opened in CONSOLE. Subsequent examples can all also be opened in console and will work after you've run the first example. All examples are tested as part of the build.	2017-03-30 21:28:52 -04:00
Christoph Büscher	413bf05956	Docs: Add comma to reverse nested agg snippet	2017-03-17 14:07:18 +01:00
msancho	a37c759ba2	Fixed typo in documentation (#23406 ) * Fixed typo in documentation The option in "gap_policy" "insert_zeros" was missing a trailing "s" * Update movavg-aggregation.asciidoc	2017-03-01 15:22:26 +01:00
Randall Britten	c54fa177ef	Docs: Fixed Parameters tables to use defaults col (#23396 ) Occurred in a few places for pipeline aggregates.	2017-03-01 14:47:21 +01:00
Randall Britten	05fd2eca6f	Docs: corrected "and" --> "an" (#23376 )	2017-02-27 14:38:29 -05:00
Randall Britten	98e19cced4	Docs: Corrected definition of type param of children agg (#23377 )	2017-02-27 14:38:28 -05:00
Tanguy Leroux	e2e5937455	Use `typed_keys` parameter to prefix suggester names by type in search responses (#23080 ) This pull request reuses the typed_keys parameter added in #22965, but this time it applies it to suggesters. When set to true, the suggester names in the search response will be prefixed with a prefix that reflects their type.	2017-02-10 10:53:38 +01:00
Tanguy Leroux	3553522328	Add parameter to prefix aggs name with type in search responses (#22965 ) This pull request adds a new parameter to the REST Search API named `typed_keys`. When set to true, the aggregation names in the search response will be prefixed with a prefix that reflects the internal type of the aggregation. Here is a simple example: ``` GET /_search?typed_keys { "aggs": { "tweets_per_user": { "terms": { "field": "user" } } }, "size": 0 } ``` And the response: ``` { "aggs": { "sterms:tweets_per_user": { ... } } } ``` This parameter is intended to make life easier for REST clients that could parse back the prefix and could detect the type of the aggregation to parse. It could also be implemented for suggesters.	2017-02-09 11:19:04 +01:00
Nik Everett	0c011cb290	Docs: CONSOLEify histogram aggregation docs This adds the `COPY AS CURL` and `VIEW IN CONSOLE` links to the docs and causes the snippets to be tested during Elasticsearch's build. Relates to #18160	2017-02-07 16:09:32 -05:00
Nik Everett	245aa0404a	Docs: CONSOLEify sum aggregation docs This adds the `COPY AS CURL` and `VIEW IN CONSOLE` buttons to the docs and makes the build execute the snippets as part of `docs:check`. Relates to #18160	2017-02-07 14:18:54 -05:00
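
Commit 6d2c40e546 above describes quoting the `$body.took` bits of a substituted doc response before parsing it as JSON. The sketch below illustrates that idea only; it is not the build's actual code. The regular expression and the function names `quote_placeholders` and `parse_doc_response` are assumptions made for illustration.

```python
import json
import re

# Assumed placeholder shape: "$body" followed by dotted path segments,
# e.g. $body.took or $body._shards. The real build may match differently.
PLACEHOLDER = re.compile(r"\$body(?:\.\w+)*")


def quote_placeholders(text: str) -> str:
    """Wrap each unquoted $body... placeholder in double quotes."""
    return PLACEHOLDER.sub(lambda m: '"' + m.group(0) + '"', text)


def parse_doc_response(text: str):
    """Parse a substituted response as JSON; raises ValueError if it is invalid."""
    return json.loads(quote_placeholders(text))


# Shortened form of the substituted response quoted in the commit message.
snippet = (
    '{ "took": $body.took,"timed_out": false,"_shards": $body._shards,'
    '"hits": $body.hits, "aggregations": { "range": { "buckets": [] } } }'
)
parsed = parse_doc_response(snippet)
```

The sketch assumes placeholders never appear inside an existing JSON string; a placeholder that is already quoted would be quoted a second time and fail to parse.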

210 Commits