OpenSearch

Commit Graph

Author	SHA1	Message	Date
Andy Bristol	b8280ea7cc	median absolute deviation agg (#34482 ) This commit adds a new single value metric aggregation that calculates the statistic called median absolute deviation, which is a measure of variability that works on more types of data than standard deviation. Our calculation of MAD is approximated using t-digests. In the collect phase, we collect each value visited into a t-digest. In the reduce phase, we merge all value t-digests, then create a t-digest of deviations using the first t-digest's median and centroids.	2018-10-30 07:22:52 -07:00
Gordon Brown	794d4fa879	Label required scripts in Scripted Metric Agg docs (#35051 ) When combine_script and reduce_script were made into required parameters for Scripted Metric aggregations in #33452, the docs were not updated to reflect that. This marks those parameters as required in the documentation.	2018-10-29 15:13:14 -06:00
Julie Tibshirani	f854330e06	Make sure to use the type _doc in the REST documentation. (#34662 ) * Replace custom type names with _doc in REST examples. * Avoid using two mapping types in the percolator docs. * Rename doc -> _doc in the main repository README. * Also replace some custom type names in the HLRC docs.	2018-10-22 11:54:04 -07:00
Zachary Tong	d981746142	[Docs] clarification about cardinality accuracy (#34616 ) Adds a bit more clarification about how accuracy is dependent on the dataset in question. Closes #18231	2018-10-22 13:15:45 -04:00
ben5556	012b9c7539	Corrected aggregation name to match the example (#33786 )	2018-09-17 18:24:43 -07:00
Jim Ferenczi	7ad71f906a	Upgrade to a Lucene 8 snapshot (#33310 ) The main benefit of the upgrade for users is the search optimization for top scored documents when the total hit count is not needed. However, this optimization is not activated in this change; another issue has been opened to discuss how it should be integrated smoothly. Some comments about the change: * Tests that can produce negative scores have been adapted but we need to forbid them completely: #33309 Closes #32899	2018-09-06 14:42:06 +02:00
Sandeep Kanabar	7ad16ffd84	Docs: Correcting a typo in tophits (#32359 )	2018-07-26 13:30:01 -04:00
Zachary Tong	6ba144ae31	Add WeightedAvg metric aggregation (#31037 ) Adds a new single-value metrics aggregation that computes the weighted average of numeric values that are extracted from the aggregated documents. These values can be extracted from specific numeric fields in the documents. When calculating a regular average, each datapoint has an equal "weight"; it contributes equally to the final value. In contrast, weighted averages scale each datapoint differently. The amount that each datapoint contributes to the final value is extracted from the document, or provided by a script. As a formula, a weighted average is the `∑(value * weight) / ∑(weight)` A regular average can be thought of as a weighted average where every value has an implicit weight of `1`. Closes #15731	2018-07-23 18:33:15 -04:00
Peter Evers	ea15284230	Docs: Match the examples in the description (#31710 ) Prose drifted from snippet.	2018-07-02 14:12:49 -04:00
Peter Evers	050fbc8f3d	Docs: Fix description of percentile ranks example (#31652 )	2018-06-28 09:29:56 -04:00
Jonathan Little	8e4768890a	Migrate scripted metric aggregation scripts to ScriptContext design (#30111 ) * Migrate scripted metric aggregation scripts to ScriptContext design #29328 * Rename new script context container class and add clarifying comments to remaining references to params._agg(s) * Misc cleanup: make mock metric agg script inner classes static * Move _score to an accessor rather than an arg for scripted metric agg scripts This causes the score to be evaluated only when it's used. * Documentation changes for params._agg -> agg * Migration doc addition for scripted metric aggs _agg object change * Rename "agg" Scripted Metric Aggregation script context variable to "state" * Rename a private base class from ...Agg to ...State that I missed in my last commit * Clean up imports after merge	2018-06-25 12:01:33 +01:00
Colin Goodheart-Smithe	58e9446e00	Removes experimental tag from scripted_metric aggregation (#31298 )	2018-06-13 17:24:32 +01:00
Christoph Büscher	4777d8a2df	[Docs] Fix typo in Min Aggregation reference (#30899 )	2018-05-31 15:05:03 +02:00
Piotr Prądzyński	cefbd29db3	top_hits doc example description update (#30676 ) Example description does not fit example code.	2018-05-17 15:21:25 +01:00
Jason Tedor	4a4e3d70d5	Default to one shard (#30539 ) This commit changes the default out-of-the-box configuration for the number of shards from five to one. We think this will help address a common problem of oversharding. For users with time-based indices that need a different default, this can be managed with index templates. For users with non-time-based indices that find they need to re-shard with the split API in place they no longer need to resort only to reindexing. Since this has the impact of changing the default number of shards used in REST tests, we want to ensure that we still have coverage for issues that could arise from multiple shards. As such, we randomize (rarely) the default number of shards in REST tests to two. This is managed via a global index template. However, some tests check the templates that are in the cluster state during the test. Since this template is randomly there, we need a way for tests to skip adding the template used to set the number of shards to two. For this we add the default_shards feature skip. To avoid having to write our docs in a complicated way because sometimes they might be behind one shard, and sometimes they might be behind two shards we apply the default_shards feature skip to all docs tests. That is, these tests will always run with the default number of shards (one).	2018-05-14 12:22:35 -04:00
Karim Frenn	3acca0b35c	[Docs] Fix typo in cardinality-aggregation.asciidoc (#30434 )	2018-05-08 16:12:36 +02:00
Adrien Grand	ebd6b5b7ba	Deprecate filtering on `_type`. (#29468 ) As indices are only allowed to have one type now, and types are going away in the future, we should deprecate filtering by `_type`. Relates #15613	2018-04-13 09:07:51 +02:00
Ke Li	fc406c9a5a	Upgrade t-digest to 3.2 (#28295 ) (#28305 )	2018-02-15 08:23:20 +00:00
Islam Heggo	f562c7f15a	Correct the explanation of load time percentiles (#28510 ) * Correct the explanation of load time percentiles * Adjusting the percentile clarification Eliminating the false sentence about majority of load time	2018-02-08 16:29:43 -08:00
Jin Liang	66c81e7f5e	[Docs] Update tophits-aggregation.asciidoc (#28273 )	2018-01-18 18:06:20 +01:00
akadko	6a5807ad8f	[DOCS] Removed differences between text and code (#27993 )	2018-01-12 10:36:48 -05:00
Adrien Grand	1b660821a2	Allow `_doc` as a type. (#27816 ) Allowing `_doc` as a type will enable users to make the transition to 7.0 smoother since the index APIs will be `PUT index/_doc/id` and `POST index/_doc`. This also moves most of the documentation to `_doc` as a type name. Closes #27750 Closes #27751	2017-12-14 17:47:53 +01:00
Christoph Büscher	0d11b9fe34	[Docs] Unify spelling of Elasticsearch (#27567 ) Removes occurrences of "elasticsearch" or "ElasticSearch" in favour of "Elasticsearch" where appropriate.	2017-11-29 09:44:25 +01:00
Martijn van Groningen	cb1204774b	Include the _index, _type and _id to nested search hits in the top_hits and inner_hits response. Also include _type and _id for parent/child hits inside inner hits. In the case of top_hits aggregation the nested search hits are directly returned and are not grouped by a root or parent document, so it is important to include the _id and _index attributes in order to know to what documents these nested search hits belong to. Closes #27053	2017-11-28 14:05:29 +01:00
Simon Willnauer	fadbe0de08	Automatically prepare indices for splitting (#27451 ) Today we require users to prepare their indices for split operations. Yet, we can do this automatically when an index is created which would make the split feature a much more appealing option since it doesn't have any 3rd party prerequisites anymore. This change automatically sets the number of routing shards such that an index is guaranteed to be able to split once into twice as many shards. The number of routing shards is scaled towards the default shard limit per index such that indices with a smaller number of shards can be split more often than larger ones. For instance an index with 1 or 2 shards can be split 10x (until it approaches 1024 shards) while an index created with 128 shards can only be split 3x by a factor of 2. Please note this is just a default value and users can still prepare their indices with `index.number_of_routing_shards` for custom splitting. NOTE: this change has an impact on the document distribution since we are changing the hash space. Documents are still uniformly distributed across all shards but since we are artificially changing the number of buckets in the consistent hashing space, documents might be hashed into different shards compared to previous versions. This is a 7.0 only change.	2017-11-23 09:48:54 +01:00
Martijn van Groningen	87c9b79b10	Return the _source of inner hit nested as is without wrapping it into its full path context. Due to a change made in #26102 to make the nested source consistent with or without source filtering, the _source of a nested inner hit was always wrapped in the parent path. This turned out to be not ideal for users relying on the nested source, as it would require additional parsing on the client side. This change fixes this: the _source of nested inner hits is no longer wrapped by parent json objects, regardless of whether the _source is included as is or source filtering is used. Internally, source filtering and highlighting rely on the fact that the _source of nested inner hits is accessible by its full field path, so in order to not break this, the conversion of the _source into its binary form is performed in FetchSourceSubPhase, after any potential source filtering is performed, to make sure the structure of the _source of the nested inner hit is consistent regardless of whether source filtering is performed. PR for #26944 Closes #26944	2017-10-19 12:04:56 +02:00
Ryan Ernst	c0c5d5488f	Docs: Remove remaining references to file and native scripts (#26580 ) relates #25690	2017-09-11 11:39:29 -07:00
Tanguy Leroux	3d07bce504	[Docs] Fix tophits-aggregation.asciidoc	2017-08-30 13:06:44 +02:00
Tanguy Leroux	643eb286dc	[Docs] Convert remaining code snippets in docs (#26422 ) This commit converts the last remaining code snippets so that they are now testable.	2017-08-30 12:11:10 +02:00
Zachary Tong	e7eda5e1be	CONSOLEify scripted-metric agg docs Related #18160	2017-08-03 17:19:54 -04:00
Zachary Tong	d8414ffa29	CONSOLEify percentile and percentile-ranks docs Related #18160	2017-08-02 17:47:27 -04:00
Zachary Tong	268923ebdc	CONSOLEify extended_stats docs Related #18160	2017-08-02 16:13:30 -04:00
Clinton Gormley	ff4a2519f2	Update experimental labels in the docs (#25727 ) Relates https://github.com/elastic/elasticsearch/issues/19798 Removed experimental label from: * Painless * Diversified Sampler Agg * Sampler Agg * Significant Terms Agg * Terms Agg document count error and execution_hint * Cardinality Agg precision_threshold * Pipeline Aggregations * index.shard.check_on_startup * index.store.type (added warning) * Preloading data into the file system cache * foreach ingest processor * Field caps API * Profile API Added experimental label to: * Moving Average Agg Prediction Changed experimental to beta for: * Adjacency matrix agg * Normalizers * Tasks API * Index sorting Labelled experimental in Lucene: * ICU plugin custom rules file * Flatten graph token filter * Synonym graph token filter * Word delimiter graph token filter * Simple pattern tokenizer * Simple pattern split tokenizer Replaced experimental label with warning that details may change in the future: * Analysis explain output format * Segments verbose output format * Percentile Agg compression and HDR Histogram * Percentile Rank Agg HDR Histogram	2017-07-18 14:06:22 +02:00
Ryan Ernst	a03b6c2fa5	Scripting: Change keys for inline/stored scripts to source/id (#25127 ) This commit adds back "id" as the key within a script to specify a stored script (which with file scripts now gone is no longer ambiguous). It also adds "source" as a replacement for "code". This is in an attempt to normalize how scripts are specified across both put stored scripts and script usages, including search template requests. This also deprecates the old inline/stored keys.	2017-06-09 08:29:25 -07:00
Colin Goodheart-Smithe	5e7a79636d	[DOCS] Clarify behaviour of scripted-metric arg with empty parent buckets	2017-06-02 11:00:27 +01:00
Ryan Ernst	463fe2f4d4	Scripting: Remove file scripts (#24627 ) This commit removes file scripts, which were deprecated in 5.5. closes #21798	2017-05-17 14:42:25 -07:00
Zachary Tong	a2845c86fe	CONSOLEify some more aggregation docs Related #18160	2017-05-16 17:25:24 -04:00
Vlad Holubiev	557390d7d1	Fix typo in example (grades_count -> types_count) (#24635 ) Looks like `doc.grade` was used for examples before. But not anymore - https://www.elastic.co/guide/en/elasticsearch/reference/2.4/search-aggregations-metrics-valuecount-aggregation.html	2017-05-15 14:08:46 -04:00
Zachary Tong	4e49c618f2	CONSOLEify Stats Aggregation docs (#24373 )	2017-05-01 13:33:24 -04:00
Christoph Büscher	16a7cbe463	Add `count` value to rest output of `geo_centroid` (#24387 ) Currently we don't write the count value to the geo_centroid aggregation rest response, but it is provided via the java api and the count() method in the GeoCentroid interface. We should add this parameter to the rest output and also provide it via the getProperty() method.	2017-04-28 16:25:22 +02:00
Suhas Karanth	cee76295ca	Update aggs reference documentation for 'keyed' options (#23758 ) Add 'keyed' parameter documentation for following: - Date Histogram Aggregation - Date Range Aggregation - Geo Distance Aggregation - Histogram Aggregation - IP range aggregation - Percentiles Aggregation - Percentile Ranks Aggregation	2017-04-18 15:57:50 +02:00
Adrien Grand	4632661bc7	Upgrade to a Lucene 7 snapshot (#24089 ) We want to upgrade to Lucene 7 ahead of time in order to be able to check whether it causes any trouble to Elasticsearch before Lucene 7.0 gets released. From a user perspective, the main benefit of this upgrade is the enhanced support for sparse fields, whose resource consumption is now a function of the number of docs that have a value rather than the total number of docs in the index. Some notes about the change: - it includes the deprecation of the `disable_coord` parameter of the `bool` and `common_terms` queries: Lucene has removed support for coord factors - it includes the deprecation of the `index.similarity.base` expert setting, since it was only useful to configure coords and query norms, which have both been removed - two tests have been marked with `@AwaitsFix` because of #23966, which we intend to address after the merge	2017-04-18 15:17:21 +02:00
Andrew Selden	f8b15abe9a	Update reference docs for geocentroid aggregation. (#24141 ) This includes a link to the Wikipedia page explaining what a centroid is. Closes #24140	2017-04-17 21:27:43 -04:00
Nik Everett	5f91241f57	CONSOLEify geo aggregation docs Turns the top example in each of the geo aggregation docs into a working example that can be opened in CONSOLE. Subsequent examples can all also be opened in console and will work after you've run the first example. All examples are tested as part of the build.	2017-03-30 21:28:52 -04:00
Randall Britten	05fd2eca6f	Docs: corrected "and" --> "an" (#23376 )	2017-02-27 14:38:29 -05:00
Nik Everett	245aa0404a	Docs: CONSOLEify sum aggregation docs This adds the `COPY AS CURL` and `VIEW IN CONSOLE` buttons to the docs and makes the build execute the snippets as part of `docs:check`. Relates to #18160	2017-02-07 14:18:54 -05:00
Nik Everett	274ee30d34	Docs: CONSOLEify the avg aggregation docs This creates the `COPY AS CURL` and `VIEW IN CONSOLE` buttons and makes the build test the examples. Relates to #18160	2017-02-07 13:48:27 -05:00
Nik Everett	d704a880e7	Add tests for top_hits aggregation (#22754 ) Add unit tests for `TopHitsAggregator` and convert some snippets in docs for `top_hits` aggregation to `// CONSOLE`. Relates to #22278 Relates to #18160	2017-01-25 16:15:50 -05:00
Nik Everett	da8740128b	Docs: CONSOLE-ify value_count aggregation docs Adds the `VIEW IN CONSOLE` and `COPY AS CURL` links to the snippets in the `value_count` docs and causes the build to execute the snippets for testing. Release #18160	2017-01-23 10:07:29 -05:00
Nik Everett	c2a580304b	CONSOLE-ify min and max aggregation docs Adds the `VIEW IN CONSOLE` and `COPY AS CURL` links to the docs and makes the build automatically test them. Relates to #18160	2017-01-20 15:33:00 -05:00
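
Two of the entries above describe statistics that are easy to illustrate: the weighted average from 6ba144ae31, given there as `∑(value * weight) / ∑(weight)`, and the median absolute deviation from b8280ea7cc. The sketch below computes both exactly over in-memory lists. It is an illustration only: the function names are hypothetical, and it does not reproduce the t-digest approximation or the collect/reduce phases that the MAD commit describes.

```python
from statistics import median


def weighted_avg(values, weights):
    """Exact weighted average: sum(value * weight) / sum(weight)."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    total_weight = sum(weights)
    if total_weight == 0:
        raise ValueError("sum of weights must be non-zero")
    return sum(v * w for v, w in zip(values, weights)) / total_weight


def median_absolute_deviation(values):
    """Exact MAD: the median of absolute deviations from the median.

    Assumption: computed directly on the full list, not approximated
    with t-digests as in the commit message.
    """
    center = median(values)
    return median([abs(v - center) for v in values])


if __name__ == "__main__":
    # With every weight equal to 1, this reduces to a regular average.
    print(weighted_avg([1, 2, 3], [1, 1, 1]))
    print(weighted_avg([10, 20], [3, 1]))
    print(median_absolute_deviation([1, 1, 2, 2, 4, 6, 9]))
```

Setting every weight to `1` recovers the regular average, which matches the remark in the WeightedAvg commit message.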

79 Commits