OpenSearch

Commit Graph

Author	SHA1	Message	Date
Jim Ferenczi	18866c4c0b	Make hits.total an object in the search response (#35849) This commit changes the format of the `hits.total` in the search response to be an object with a `value` and a `relation`. The `value` indicates the number of hits that match the query and the `relation` indicates whether the number is accurate (in which case the relation is equal to `eq`) or a lower bound of the total (in which case it is equal to `gte`). This change also adds a parameter called `rest_total_hits_as_int` that can be used in the search APIs to opt out of this change (retrieve the total hits as a number in the REST response). Note that currently all search responses are accurate (`track_total_hits: true`) or they don't contain `hits.total` (`track_total_hits: false`). We'll add a way to get a lower bound of the total hits in a follow-up (to allow numbers to be passed to `track_total_hits`). Relates #33028	2018-12-05 19:49:06 +01:00
Andy Bristol	b8280ea7cc	median absolute deviation agg (#34482) This commit adds a new single-value metric aggregation that calculates the statistic called median absolute deviation, which is a measure of variability that works on more types of data than standard deviation. Our calculation of MAD is approximated using t-digests. In the collect phase, we collect each value visited into a t-digest. In the reduce phase, we merge all value t-digests, then create a t-digest of deviations using the first t-digest's median and centroids.	2018-10-30 07:22:52 -07:00
Gordon Brown	794d4fa879	Label required scripts in Scripted Metric Agg docs (#35051 ) When combine_script and reduce_script were made into required parameters for Scripted Metric aggregations in #33452, the docs were not updated to reflect that. This marks those parameters as required in the documentation.	2018-10-29 15:13:14 -06:00
Julie Tibshirani	f854330e06	Make sure to use the type _doc in the REST documentation. (#34662 ) * Replace custom type names with _doc in REST examples. * Avoid using two mapping types in the percolator docs. * Rename doc -> _doc in the main repository README. * Also replace some custom type names in the HLRC docs.	2018-10-22 11:54:04 -07:00
Zachary Tong	d981746142	[Docs] clarification about cardinality accuracy (#34616 ) Adds a bit more clarification about how accuracy is dependent on the dataset in question. Closes #18231	2018-10-22 13:15:45 -04:00
ben5556	012b9c7539	Corrected aggregation name to match the example (#33786 )	2018-09-17 18:24:43 -07:00
Jim Ferenczi	7ad71f906a	Upgrade to a Lucene 8 snapshot (#33310 ) The main benefit of the upgrade for users is the search optimization for top scored documents when the total hit count is not needed. However this optimization is not activated in this change, there is another issue opened to discuss how it should be integrated smoothly. Some comments about the change: * Tests that can produce negative scores have been adapted but we need to forbid them completely: #33309 Closes #32899	2018-09-06 14:42:06 +02:00
Sandeep Kanabar	7ad16ffd84	Docs: Correcting a typo in tophits (#32359 )	2018-07-26 13:30:01 -04:00
Zachary Tong	6ba144ae31	Add WeightedAvg metric aggregation (#31037 ) Adds a new single-value metrics aggregation that computes the weighted average of numeric values that are extracted from the aggregated documents. These values can be extracted from specific numeric fields in the documents. When calculating a regular average, each datapoint has an equal "weight"; it contributes equally to the final value. In contrast, weighted averages scale each datapoint differently. The amount that each datapoint contributes to the final value is extracted from the document, or provided by a script. As a formula, a weighted average is the `∑(value * weight) / ∑(weight)` A regular average can be thought of as a weighted average where every value has an implicit weight of `1`. Closes #15731	2018-07-23 18:33:15 -04:00
Peter Evers	ea15284230	Docs: Match the examples in the description (#31710 ) Prose drifted from snippet.	2018-07-02 14:12:49 -04:00
Peter Evers	050fbc8f3d	Docs: Fix description of percentile ranks example (#31652)	2018-06-28 09:29:56 -04:00
Jonathan Little	8e4768890a	Migrate scripted metric aggregation scripts to ScriptContext design (#30111 ) * Migrate scripted metric aggregation scripts to ScriptContext design #29328 * Rename new script context container class and add clarifying comments to remaining references to params._agg(s) * Misc cleanup: make mock metric agg script inner classes static * Move _score to an accessor rather than an arg for scripted metric agg scripts This causes the score to be evaluated only when it's used. * Documentation changes for params._agg -> agg * Migration doc addition for scripted metric aggs _agg object change * Rename "agg" Scripted Metric Aggregation script context variable to "state" * Rename a private base class from ...Agg to ...State that I missed in my last commit * Clean up imports after merge	2018-06-25 12:01:33 +01:00
Colin Goodheart-Smithe	58e9446e00	Removes experimental tag from scripted_metric aggregation (#31298 )	2018-06-13 17:24:32 +01:00
Christoph Büscher	4777d8a2df	[Docs] Fix typo in Min Aggregation reference (#30899 )	2018-05-31 15:05:03 +02:00
Piotr Prądzyński	cefbd29db3	top_hits doc example description update (#30676 ) Example description does not fit example code.	2018-05-17 15:21:25 +01:00
Jason Tedor	4a4e3d70d5	Default to one shard (#30539 ) This commit changes the default out-of-the-box configuration for the number of shards from five to one. We think this will help address a common problem of oversharding. For users with time-based indices that need a different default, this can be managed with index templates. For users with non-time-based indices that find they need to re-shard with the split API in place they no longer need to resort only to reindexing. Since this has the impact of changing the default number of shards used in REST tests, we want to ensure that we still have coverage for issues that could arise from multiple shards. As such, we randomize (rarely) the default number of shards in REST tests to two. This is managed via a global index template. However, some tests check the templates that are in the cluster state during the test. Since this template is randomly there, we need a way for tests to skip adding the template used to set the number of shards to two. For this we add the default_shards feature skip. To avoid having to write our docs in a complicated way because sometimes they might be behind one shard, and sometimes they might be behind two shards we apply the default_shards feature skip to all docs tests. That is, these tests will always run with the default number of shards (one).	2018-05-14 12:22:35 -04:00
Karim Frenn	3acca0b35c	[Docs] Fix typo in cardinality-aggregation.asciidoc (#30434 )	2018-05-08 16:12:36 +02:00
Adrien Grand	ebd6b5b7ba	Deprecate filtering on `_type`. (#29468 ) As indices are only allowed to have one type now, and types are going away in the future, we should deprecate filtering by `_type`. Relates #15613	2018-04-13 09:07:51 +02:00
Ke Li	fc406c9a5a	Upgrade t-digest to 3.2 (#28295 ) (#28305 )	2018-02-15 08:23:20 +00:00
Islam Heggo	f562c7f15a	Correct the explanation of load time percentiles (#28510 ) * Correct the explanation of load time percentiles * Adjusting the percentile clarification Eliminating the false sentence about majority of load time	2018-02-08 16:29:43 -08:00
Jin Liang	66c81e7f5e	[Docs] Update tophits-aggregation.asciidoc (#28273 )	2018-01-18 18:06:20 +01:00
akadko	6a5807ad8f	[DOCS] Removed differences between text and code (#27993)	2018-01-12 10:36:48 -05:00
Adrien Grand	1b660821a2	Allow `_doc` as a type. (#27816 ) Allowing `_doc` as a type will enable users to make the transition to 7.0 smoother since the index APIs will be `PUT index/_doc/id` and `POST index/_doc`. This also moves most of the documentation to `_doc` as a type name. Closes #27750 Closes #27751	2017-12-14 17:47:53 +01:00
Christoph Büscher	0d11b9fe34	[Docs] Unify spelling of Elasticsearch (#27567) Removes occurrences of "elasticsearch" or "ElasticSearch" in favour of "Elasticsearch" where appropriate.	2017-11-29 09:44:25 +01:00
Martijn van Groningen	cb1204774b	Include the _index, _type and _id in nested search hits in the top_hits and inner_hits response. Also include _type and _id for parent/child hits inside inner hits. In the case of top_hits aggregation the nested search hits are directly returned and are not grouped by a root or parent document, so it is important to include the _id and _index attributes in order to know which documents these nested search hits belong to. Closes #27053	2017-11-28 14:05:29 +01:00
Simon Willnauer	fadbe0de08	Automatically prepare indices for splitting (#27451) Today we require users to prepare their indices for split operations. Yet, we can do this automatically when an index is created which would make the split feature a much more appealing option since it doesn't have any 3rd party prerequisites anymore. This change automatically sets the number of routing shards such that an index is guaranteed to be able to split once into twice as many shards. The number of routing shards is scaled towards the default shard limit per index such that indices with a smaller number of shards can be split more often than larger ones. For instance an index with 1 or 2 shards can be split 10x (until it approaches 1024 shards) while an index created with 128 shards can only be split 3x by a factor of 2. Please note this is just a default value and users can still prepare their indices with `index.number_of_routing_shards` for custom splitting. NOTE: this change has an impact on the document distribution since we are changing the hash space. Documents are still uniformly distributed across all shards but since we are artificially changing the number of buckets in the consistent hashing space documents might be hashed into different shards compared to previous versions. This is a 7.0 only change.	2017-11-23 09:48:54 +01:00
Martijn van Groningen	87c9b79b10	Return the _source of inner hit nested as is without wrapping it into its full path context. Due to a change made in #26102 to make the nested source consistent with or without source filtering, the _source of a nested inner hit was always wrapped in the parent path. This turned out to be not ideal for users relying on the nested source, as it would require additional parsing on the client side. This change fixes this: the _source of nested inner hits is no longer wrapped by parent JSON objects, regardless of whether the _source is included as is or source filtering is used. Internally, source filtering and highlighting rely on the fact that the _source of nested inner hits is accessible by its full field path, so in order to not break this, the conversion of the _source into its binary form is performed in FetchSourceSubPhase, after any potential source filtering is performed, to make sure the structure of the _source of the nested inner hit is consistent regardless of whether source filtering is performed. PR for #26944 Closes #26944	2017-10-19 12:04:56 +02:00
Ryan Ernst	c0c5d5488f	Docs: Remove remaining references to file and native scripts (#26580 ) relates #25690	2017-09-11 11:39:29 -07:00
Tanguy Leroux	3d07bce504	[Docs] Fix tophits-aggregation.asciidoc	2017-08-30 13:06:44 +02:00
Tanguy Leroux	643eb286dc	[Docs] Convert remaining code snippets in docs (#26422 ) This commit converts the last remaining code snippets so that they are now testable.	2017-08-30 12:11:10 +02:00
Zachary Tong	e7eda5e1be	CONSOLEify scripted-metric agg docs Related #18160	2017-08-03 17:19:54 -04:00
Zachary Tong	d8414ffa29	CONSOLEify percentile and percentile-ranks docs Related #18160	2017-08-02 17:47:27 -04:00
Zachary Tong	268923ebdc	CONSOLEify extended_stats docs Related #18160	2017-08-02 16:13:30 -04:00
Clinton Gormley	ff4a2519f2	Update experimental labels in the docs (#25727 ) Relates https://github.com/elastic/elasticsearch/issues/19798 Removed experimental label from: * Painless * Diversified Sampler Agg * Sampler Agg * Significant Terms Agg * Terms Agg document count error and execution_hint * Cardinality Agg precision_threshold * Pipeline Aggregations * index.shard.check_on_startup * index.store.type (added warning) * Preloading data into the file system cache * foreach ingest processor * Field caps API * Profile API Added experimental label to: * Moving Average Agg Prediction Changed experimental to beta for: * Adjacency matrix agg * Normalizers * Tasks API * Index sorting Labelled experimental in Lucene: * ICU plugin custom rules file * Flatten graph token filter * Synonym graph token filter * Word delimiter graph token filter * Simple pattern tokenizer * Simple pattern split tokenizer Replaced experimental label with warning that details may change in the future: * Analysis explain output format * Segments verbose output format * Percentile Agg compression and HDR Histogram * Percentile Rank Agg HDR Histogram	2017-07-18 14:06:22 +02:00
Ryan Ernst	a03b6c2fa5	Scripting: Change keys for inline/stored scripts to source/id (#25127 ) This commit adds back "id" as the key within a script to specify a stored script (which with file scripts now gone is no longer ambiguous). It also adds "source" as a replacement for "code". This is in an attempt to normalize how scripts are specified across both put stored scripts and script usages, including search template requests. This also deprecates the old inline/stored keys.	2017-06-09 08:29:25 -07:00
Colin Goodheart-Smithe	5e7a79636d	[DOCS] Clarify behaviour of scripted-metric agg with empty parent buckets	2017-06-02 11:00:27 +01:00
Ryan Ernst	463fe2f4d4	Scripting: Remove file scripts (#24627 ) This commit removes file scripts, which were deprecated in 5.5. closes #21798	2017-05-17 14:42:25 -07:00
Zachary Tong	a2845c86fe	CONSOLEify some more aggregation docs Related #18160	2017-05-16 17:25:24 -04:00
Vlad Holubiev	557390d7d1	Fix typo in example (grades_count -> types_count) (#24635 ) Looks like `doc.grade` was used for examples before. But not anymore - https://www.elastic.co/guide/en/elasticsearch/reference/2.4/search-aggregations-metrics-valuecount-aggregation.html	2017-05-15 14:08:46 -04:00
Zachary Tong	4e49c618f2	CONSOLEify Stats Aggregation docs (#24373 )	2017-05-01 13:33:24 -04:00
Christoph Büscher	16a7cbe463	Add `count` value to rest output of `geo_centroid` (#24387 ) Currently we don't write the count value to the geo_centroid aggregation rest response, but it is provided via the java api and the count() method in the GeoCentroid interface. We should add this parameter to the rest output and also provide it via the getProperty() method.	2017-04-28 16:25:22 +02:00
Suhas Karanth	cee76295ca	Update aggs reference documentation for 'keyed' options (#23758 ) Add 'keyed' parameter documentation for following: - Date Histogram Aggregation - Date Range Aggregation - Geo Distance Aggregation - Histogram Aggregation - IP range aggregation - Percentiles Aggregation - Percentile Ranks Aggregation	2017-04-18 15:57:50 +02:00
Adrien Grand	4632661bc7	Upgrade to a Lucene 7 snapshot (#24089) We want to upgrade to Lucene 7 ahead of time in order to be able to check whether it causes any trouble to Elasticsearch before Lucene 7.0 gets released. From a user perspective, the main benefit of this upgrade is the enhanced support for sparse fields, whose resource consumption is now a function of the number of docs that have a value rather than the total number of docs in the index. Some notes about the change: - it includes the deprecation of the `disable_coord` parameter of the `bool` and `common_terms` queries: Lucene has removed support for coord factors - it includes the deprecation of the `index.similarity.base` expert setting, since it was only useful to configure coords and query norms, which have both been removed - two tests have been marked with `@AwaitsFix` because of #23966, which we intend to address after the merge	2017-04-18 15:17:21 +02:00
Andrew Selden	f8b15abe9a	Update reference docs for geocentroid aggregation. (#24141 ) This includes a link to the Wikipedia page explaining what a centroid is. Closes #24140	2017-04-17 21:27:43 -04:00
Nik Everett	5f91241f57	CONSOLEify geo aggregation docs Turns the top example in each of the geo aggregation docs into a working example that can be opened in CONSOLE. Subsequent examples can all also be opened in console and will work after you've run the first example. All examples are tested as part of the build.	2017-03-30 21:28:52 -04:00
Randall Britten	05fd2eca6f	Docs: corrected "and" --> "an" (#23376 )	2017-02-27 14:38:29 -05:00
Nik Everett	245aa0404a	Docs: CONSOLEify sum aggregation docs This adds the `COPY AS CURL` and `VIEW IN CONSOLE` buttons to the docs and makes the build execute the snippets as part of `docs:check`. Relates to #18160	2017-02-07 14:18:54 -05:00
Nik Everett	274ee30d34	Docs: CONSOLEify the avg aggregation docs This creates the `COPY AS CURL` and `VIEW IN CONSOLE` buttons and makes the build test the examples. Relates to #18160	2017-02-07 13:48:27 -05:00
Nik Everett	d704a880e7	Add tests for top_hits aggregation (#22754 ) Add unit tests for `TopHitsAggregator` and convert some snippets in docs for `top_hits` aggregation to `// CONSOLE`. Relates to #22278 Relates to #18160	2017-01-25 16:15:50 -05:00
Nik Everett	da8740128b	Docs: CONSOLE-ify value_count aggregation docs Adds the `VIEW IN CONSOLE` and `COPY AS CURL` links to the snippets in the `value_count` docs and causes the build to execute the snippets for testing. Relates to #18160	2017-01-23 10:07:29 -05:00
Nik Everett	c2a580304b	CONSOLE-ify min and max aggregation docs Adds the `VIEW IN CONSOLE` and `COPY AS CURL` links to the docs and makes the build automatically test them. Relates to #18160	2017-01-20 15:33:00 -05:00
Johannes Kanavin	27c57aeebe	Fixed id's of 'worked example' in scripted metric aggs docs (#22430 )	2017-01-05 14:37:27 -05:00
Adrin Jalali	235e6acd73	typo fix (and -> any) (#21860 )	2016-11-30 12:56:00 +01:00
Carney Wu	2c0db3909f	`include` no longer works in 5.x (#21815) `include` no longer works in 5.x; use `includes` instead.	2016-11-28 11:02:59 +01:00
Adrien Grand	4c46ffcecf	Document that min/max operate on the double representation of the data. Relates #9545	2016-11-28 10:34:43 +01:00
Nik Everett	7dcff27aea	Update docs for scripted metric agg Now that the default language is painless the examples didn't work at all. This fixes them. Closes #21536	2016-11-15 11:47:17 -05:00
Clinton Gormley	5ec2ba3166	Update scripted-metric-aggregation.asciidoc Removed docs for `reduce_params` Closes #20917	2016-10-17 19:31:30 +02:00
Nik Everett	5cff2a046d	Remove most of the need for `// NOTCONSOLE` and be much more stingy about what we consider a console candidate. * Add `// CONSOLE` to check-running * Fix version in some snippets * Mark groovy snippets as groovy * Fix versions in plugins * Fix language marker errors * Fix language parsing in snippets This adds support for snippets whose language is written like `[source, txt]` and `["source","js",subs="attributes,callouts"]`. This also makes language required for snippets which is nice because then we can be sure we can grep for snippets in a particular language.	2016-09-06 10:32:54 -04:00
Jim Ferenczi	4682fc34ae	Add the ability to disable the retrieval of the stored fields entirely. This change adds a special field named _none_ that disables the retrieval of the stored fields in a search request or in a TopHitsAggregation. To completely disable stored fields retrieval (including disabling metadata fields retrieval such as _id or _type) use _none_ like this: `POST _search { "stored_fields": "_none_" }`	2016-08-24 16:40:08 +02:00
Ryan Biesemeyer	9f1525255a	Update link to mapper-murmur3 plugin in card docs (#19788 )	2016-08-04 15:56:59 +02:00
Colin Goodheart-Smithe	3f344d3154	[DOCS] fix documentation for selecting algorithm for percentiles agg	2016-07-27 08:48:51 +01:00
Adrien Grand	1ed6c5d110	Docs: Add more points to the chart that gives accuracy for the cardinality aggregation. This also adds instructions how to regenerate the chart.	2016-07-20 10:37:12 +02:00
Adrien Grand	bde99bad2e	Use a static default precision for the cardinality aggregation. #19215 Today the default precision for the cardinality aggregation depends on how many parent bucket aggregations it had. The reasoning was that the more parent bucket aggregations, the more buckets the cardinality had to be computed on. And this number could be huge depending on what the parent aggregations actually are. However now that we run terms aggregations in breadth-first mode by default when there are sub aggregations, it is less likely that we have to run the cardinality aggregation on kagilions of buckets. So we could use a static default, which will be less confusing to users.	2016-07-18 11:30:41 +02:00
Jim Ferenczi	afe99fcdcd	Restore reverted change now that alpha4 is out: Rename `fields` to `stored_fields` and add `docvalue_fields`. The `stored_fields` parameter will no longer try to retrieve fields from the _source but will only return stored fields. `fields` will throw an exception if the user uses it. Add `docvalue_fields` as an adjunct to `fielddata_fields` which is deprecated. `docvalue_fields` will try to load the value from the docvalue and fall back to fielddata cache if docvalues are not enabled on that field. Closes #18943	2016-07-04 10:39:49 +02:00
Robert Muir	6d52cec2a0	Merge pull request #19092 from rmuir/more_painless_docs cutover some docs to painless	2016-06-28 13:40:25 -04:00
Jim Ferenczi	eb1e231a63	Revert "Rename `fields` to `stored_fields` and add `docvalue_fields`" This reverts commit `2f46f53dc8`.	2016-06-27 17:20:32 +02:00
Robert Muir	6fc1a22977	cutover some docs to painless	2016-06-27 09:55:16 -04:00
Jerry Liu	1863ab95f8	fixed typo 'if' -> 'is' (#19051 )	2016-06-27 14:20:23 +02:00
Jim Ferenczi	2f46f53dc8	Rename `fields` to `stored_fields` and add `docvalue_fields`. The `stored_fields` parameter will no longer try to retrieve fields from the _source but will only return stored fields. `fields` will throw an exception if the user uses it. Add `docvalue_fields` as an adjunct to `fielddata_fields` which is deprecated. `docvalue_fields` will try to load the value from the docvalue and fall back to fielddata cache if docvalues are not enabled on that field. Closes #18943	2016-06-22 17:38:30 +02:00
Martijn van Groningen	8e63ce00f0	docs: removed confusing statement.	2016-04-19 11:49:51 +02:00
Sergii Golubev	5ce3eb96b0	tophits-aggregation.asciidoc: fix a typo	2016-04-18 09:23:39 +02:00
Nicholas Knize	b31d3ddd3e	Adds geo_centroid metric aggregator This commit adds a new metric aggregator for computing the geo_centroid over a set of geo_point fields. This can be combined with other aggregators (e.g., geohash_grid, significant_terms) for computing the geospatial centroid based on the document sets from other aggregation results.	2015-10-14 16:19:09 -05:00
Adrien Grand	86f1b07df0	Docs: Remove docs for the `filtered`, `and`, `or` and `(f)query` queries.	2015-09-11 11:00:54 +02:00
Colin Goodheart-Smithe	1d9905a798	[DOCS] Added note about valid return types for scripts in the scripted_metric aggregation	2015-09-02 12:13:15 +01:00
Adrien Grand	a91b3fcbb9	Move the `murmur3` field to a plugin and fix defaults. This moves the `murmur3` field to the `mapper-murmur3` plugin and fixes its defaults so that values will not be indexed by default, as the only purpose of this field is to speed up `cardinality` aggregations on high-cardinality string fields, which only requires doc values. I also removed the `rehash` option from the `cardinality` aggregation as it doesn't bring much value (rehashing is cheap), which made it possible to remove the coupling between the `cardinality` aggregation and the `murmur3` field. Close #12874	2015-08-18 11:41:52 +02:00
Clinton Gormley	ac2b8951c6	Docs: Mapping docs completely rewritten for 2.0	2015-08-06 17:24:51 +02:00
Colin Goodheart-Smithe	3e0532a0c5	Aggregations: Add HDRHistogram as an option in percentiles and percentile_ranks aggregations HDRHistogram has been added as an option in the percentiles and percentile_ranks aggregation. It has one option `number_significant_digits` which controls the accuracy and memory size for the algorithm Closes #8324	2015-07-24 17:55:36 +01:00
Colin Goodheart-Smithe	35a58d874e	Scripting: Unify script and template requests across codebase This change unifies the way scripts and templates are specified for all instances in the codebase. It builds on the Script class added previously and adds request building and parsing support as well as the ability to transfer script objects between nodes. It also adds a Template class which aims to provide the same functionality for template APIs Closes #11091	2015-05-29 16:52:04 +01:00
Adrien Grand	32e23b9100	Aggs: Make it possible to configure missing values. Most aggregations (terms, histogram, stats, percentiles, geohash-grid) now support a new `missing` option which defines the value to consider when a field does not have a value. This can be handy if, for example, you want a terms aggregation to treat documents that have "N/A" and documents that have no value for a `tag` field the same way. This works in a very similar way to the `missing` option on the `sort` element. One known issue is that this option sometimes cannot make the right decision in the unmapped case: it needs to replace all values with the `missing` value but might not know what kind of values source should be produced (numerics, strings, geo points?). For this reason, we might want to add an `unmapped_type` option in the future like we did for sorting. Related to #5324	2015-05-15 16:26:58 +02:00
Zachary Tong	e3ae1df6f0	[DOCS] Restructure Aggs documentation	2015-05-01 16:04:55 -04:00
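
Two of the commits above describe statistics that are easy to check by hand: the weighted average from 6ba144ae31 (`∑(value * weight) / ∑(weight)`) and the median absolute deviation from b8280ea7cc. The Python sketch below is only an illustration of those two definitions. The function names are hypothetical, and the MAD here is computed exactly on an in-memory list, not approximated with t-digests as the commit describes.

```python
from statistics import median


def weighted_average(values, weights):
    """Return sum(value * weight) / sum(weight), as in the commit's formula."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    total_weight = sum(weights)
    if total_weight == 0:
        raise ValueError("sum of weights must be non-zero")
    return sum(v * w for v, w in zip(values, weights)) / total_weight


def median_absolute_deviation(values):
    """Exact MAD: the median of absolute deviations from the median.

    Assumption: this is the textbook definition on a small list; the
    aggregation in the commit approximates it with t-digests instead.
    """
    center = median(values)
    deviations = [abs(v - center) for v in values]
    return median(deviations)


if __name__ == "__main__":
    # A regular average is a weighted average with every weight equal to 1.
    print(weighted_average([1, 2, 3], [1, 1, 1]))
    # Weighting the first value three times as heavily pulls the result toward it.
    print(weighted_average([1, 3], [3, 1]))
    # A single outlier barely moves the MAD, unlike the standard deviation.
    print(median_absolute_deviation([1, 2, 3, 4, 100]))
```

Passing all weights as `1` reproduces the ordinary mean, which matches the commit's remark that a regular average is a weighted average with an implicit weight of `1`.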


130 Commits