OpenSearch

Commit Graph

Author	SHA1	Message	Date
Julie Tibshirani	36a3b84fc9	Update the default for include_type_name to false. (#37285 ) * Default include_type_name to false for get and put mappings. * Default include_type_name to false for get field mappings. * Add a constant for the default include_type_name value. * Default include_type_name to false for get and put index templates. * Default include_type_name to false for create index. * Update create index calls in REST documentation to use include_type_name=true. * Some minor clean-ups around the get index API. * In REST tests, use include_type_name=true by default for index creation. * Make sure to use 'expression == false'. * Clarify the different IndexTemplateMetaData toXContent methods. * Fix FullClusterRestartIT#testSnapshotRestore. * Fix the ml_anomalies_default_mappings test. * Fix GetFieldMappingsResponseTests and GetIndexTemplateResponseTests. We make sure to specify include_type_name=true during xContent parsing, so we continue to test the legacy typed responses. XContent generation for the typeless responses is currently only covered by REST tests, but we will be adding unit test coverage for these as we implement each typeless API in the Java HLRC. This commit also refactors GetMappingsResponse to follow the same approach as the other mappings-related responses, where we read include_type_name out of the xContent params, instead of creating a second toXContent method. This gives better consistency in the response parsing code. * Fix more REST tests. * Improve some wording in the create index documentation. * Add a note about types removal in the create index docs. * Fix SmokeTestMonitoringWithSecurityIT#testHTTPExporterWithSSL. * Make sure to mention include_type_name in the REST docs for affected APIs. * Make sure to use 'expression == false' in FullClusterRestartIT. * Mention include_type_name in the REST templates docs.	2019-01-14 13:08:01 -08:00
Igor Motov	d6acd8e15f	Docs: add clarification about geohash use in geohashgrid agg (#36901 ) Adds an example on translating geohashes returned by geohashgrid agg as bucket keys into geo bounding box filters in elasticsearch as well as 3rd party applications. Closes #36413	2019-01-03 15:40:48 -05:00
Luca Cavanna	42ea644903	Remove single shard optimization when suggesting shard_size (#37041 ) When executing terms aggregations we set the shard_size, meaning the number of buckets to collect on each shard, to a value that's higher than the number of requested buckets, to guarantee some basic level of precision. We have an optimization in place so that we leave shard_size set to size whenever we are searching against a single shard, in which case maximum precision is guaranteed by definition. Such optimization requires us access to the total number of shards that the search is executing against. In the context of cross-cluster search, once we will introduce multiple reduction steps (one per cluster) each cluster will only know the number of local shards, which is problematic as we should only optimize if we are searching against a single shard in a single cluster. It could be that we are searching against one shard per cluster in which case the current code would optimize number of terms causing a loss of precision. While discussing how to address the CCS scenario, we decided that we do not want to introduce further complexity caused by this single shard optimization, as it benefits only a minority of cases, especially when the benefits are not so great. This commit removes the single shard optimization, meaning that we will always have heuristic enabled on how many number of buckets to collect on the shards, even when searching against a single shard. This will cause more buckets to be collected when searching against a single shard compared to before. If that becomes a problem for some users, they can work around that by setting the shard_size equal to the size. Relates to #32125	2019-01-02 17:45:49 +01:00
Jim Ferenczi	18866c4c0b	Make hits.total an object in the search response (#35849 ) This commit changes the format of the `hits.total` in the search response to be an object with a `value` and a `relation`. The `value` indicates the number of hits that match the query and the `relation` indicates whether the number is accurate (in which case the relation is equal to `eq`) or a lower bound of the total (in which case it is equal to `gte`). This change also adds a parameter called `rest_total_hits_as_int` that can be used in the search APIs to opt out from this change (retrieve the total hits as a number in the rest response). Note that currently all search responses are accurate (`track_total_hits: true`) or they don't contain `hits.total` (`track_total_hits: false`). We'll add a way to get a lower bound of the total hits in a follow up (to allow numbers to be passed to `track_total_hits`). Relates #33028	2018-12-05 19:49:06 +01:00
Jeff Hajewski	49087f16f5	Adds deprecation logging to ScriptDocValues#getValues. (#34279 ) `ScriptDocValues#getValues` was added for backwards compatibility but is no longer needed. Scripts using the syntax `doc['foo'].values` when `doc['foo']` is a list should be using `doc['foo']` instead. Closes #22919	2018-11-27 14:30:13 -05:00
William Desportes	a204d1cdff	[Docs] Fix typo in datehistogram-aggregation.asciidoc (#35855 )	2018-11-23 15:16:53 +01:00
Jim Ferenczi	d96202a282	[DOCS] Fix missing callouts	2018-11-08 15:40:01 +01:00
Dominik Stadler	d351422215	Add parent-aggregation to parent-join module (#34210 ) Add `parent` aggregation, a special single bucket aggregation that joins children documents to their parent.	2018-11-08 14:13:00 +01:00
Sue Gallagher	1ce3c92a2d	[DOCS] Add info on calendar vs fixed interval. (#31638 ) Extensive edit to add additional information on the difference between calendar intervals and fixed-length intervals.	2018-10-31 10:16:36 -04:00
Julie Tibshirani	f854330e06	Make sure to use the type _doc in the REST documentation. (#34662 ) * Replace custom type names with _doc in REST examples. * Avoid using two mapping types in the percolator docs. * Rename doc -> _doc in the main repository README. * Also replace some custom type names in the HLRC docs.	2018-10-22 11:54:04 -07:00
markharwood	fe623acf66	Docs - removed experimental/beta markers from adjacency matrix aggregation (#34599 )	2018-10-19 09:33:59 +01:00
markharwood	2a413abb0b	Docs - remove experimental marker from significant_text aggregation (#34598 )	2018-10-19 09:32:02 +01:00
Jim Ferenczi	36557469f6	[DOCS] Removes beta label from composite aggregation (#34329 )	2018-10-05 19:46:20 +02:00
Nik Everett	dc2cf28fde	Docs: Allow skipping response assertions (#34240 ) We generate tests from our documentation, including assertions about the responses returned by a particular API. But sometimes we can't assert that the response is correct because of some deficiency in our tooling. Previously we marked the response `// NOTCONSOLE` to skip it, but this is kind of odd because `// NOTCONSOLE` is really to mark snippets that are json but aren't requests or responses. This introduces a new construct to skip response assertions: ``` // TESTRESPONSE[skip:reason we skipped this] ```	2018-10-04 08:03:38 -04:00
Serge Populov	13af5d5d7f	Docs: Fix typo in field name in aggregations (#34223 )	2018-10-02 10:54:29 -04:00
Ryan Ernst	3046656ab1	Scripting: Rework joda time backcompat (#33486 ) This commit switches the joda time backcompat in scripting to use augmentation over ZonedDateTime. The augmentation methods provide compatibility with the missing methods between joda's DateTime and java's ZonedDateTime. Due to getDayOfWeek returning an enum in the java API, ZonedDateTime is wrapped so that the method can return int like the joda time does. The java time api version is renamed to getDayOfWeekEnum, which will be kept through 7.x for compatibility while users switch back to getDayOfWeek once joda compatibility is removed.	2018-09-16 19:18:00 -07:00
Christoph Büscher	fe478c23b7	[Docs] Fix heading in composite-aggregation.asciidoc (#33627 ) The heading for the "Missing buckets" section should be on the same level as the "Order" section.	2018-09-12 16:56:03 +02:00
Paul Sanwald	c303006e6b	Add interval response parameter to AutoDateInterval histogram (#33254 ) Adds the interval used to the aggregation response.	2018-09-05 07:35:59 -04:00
lipsill	b7c0d2830a	[Docs] Remove repeating words (#33087 )	2018-08-28 13:16:43 +02:00
Luca Cavanna	393eec1482	Set maxScore for empty TopDocs to NaN rather than 0 (#32938 ) We used to set `maxScore` to `0` within `TopDocs` in situations where there is really no score as the size was set to `0` and scores were not even tracked. In such scenarios, `Float.NaN` is more appropriate, which gets converted to `max_score: null` on the REST layer. That's also more consistent with Lucene, which sets `maxScore` to `Float.NaN` when merging empty `TopDocs` (see `TopDocs#merge`).	2018-08-22 17:23:54 +02:00
Ryan Ernst	478f6d6cf1	Scripting: Conditionally use java time api in scripting (#31441 ) This commit adds a boolean system property, `es.scripting.use_java_time`, which controls the concrete return type used by doc values within scripts. The return type of accessing doc values for a date field is changed to Object, essentially duck typing the type to allow co-existence during the transition from joda time to java time.	2018-08-01 08:58:49 -07:00
Colm O'Shea	97b379e0d4	fix no=>not typo (#32463 ) Found a tiny typo while reading the docs	2018-07-31 13:33:23 +01:00
Paul Sanwald	feb07559aa	fix typo	2018-07-13 14:59:11 -04:00
Colin Goodheart-Smithe	0edb096eb4	Adds a new auto-interval date histogram (#28993 ) * Adds a new auto-interval date histogram This change adds a new type of histogram aggregation called `auto_date_histogram` where you can specify the target number of buckets you require and it will find an appropriate interval for the returned buckets. The aggregation works by first collecting documents in buckets at second interval, when it has created more than the target number of buckets it merges these buckets into minute interval bucket and continues collecting until it reaches the target number of buckets again. It will keep merging buckets when it exceeds the target until either collection is finished or the highest interval (currently years) is reached. A similar process happens at reduce time. This aggregation intentionally does not support min_doc_count, offset and extended_bounds to keep the already complex logic from becoming more complex. The aggregation accepts sub-aggregations but will always operate in `breadth_first` mode deferring the computation of sub-aggregations until the final buckets from the shard are known. min_doc_count is effectively hard-coded to zero meaning that we will insert empty buckets where necessary. Closes #9572 * Adds documentation * Added sub aggregator test * Fixes failing docs test * Brings branch up to date with master changes * trying to get tests to pass again * Fixes multiBucketConsumer accounting * Collects more buckets than needed on shards This gives us more options at reduce time in terms of how we do the final merge of the buckets to produce the final result * Revert "Collects more buckets than needed on shards" This reverts commit 993c782d117892af9a3c86a51921cdee630a3ac5. * Adds ability to merge within a rounding * Fixes non-timezone doc test failure * Fix time zone tests * iterates on tests * Adds test case and documentation changes Added some notes in the documentation about the intervals that can be returned. Also added a test case that utilises the merging of consecutive buckets * Fixes performance bug The bug meant that getAppropriate rounding took a huge amount of time if the range of the data was large but also sparsely populated. In these situations the rounding would be very low so iterating through the rounding values from the min key to the max key took a long time (~120 seconds in one test). The solution is to add a rough estimate first which chooses the rounding based just on the long values of the min and max keys alone but selects the rounding one lower than the one it thinks is appropriate so the accurate method can choose the final rounding taking into account the fact that intervals are not always fixed length. The commit also adds more tests * Changes to only do complex reduction on final reduce * merge latest with master * correct tests and add a new test case for 10k buckets * refactor to perform bucket number check in innerBuild * correctly derive bucket setting, update tests to increase bucket threshold * fix checkstyle * address code review comments * add documentation for default buckets * fix typo	2018-07-13 13:08:35 -04:00
Jimi Ford	e955ffc38d	Docs: fix typo in datehistogram (#31972 )	2018-07-11 15:04:57 -04:00
Sue Gallagher	357a07e7a2	[DOCS] Fix heading format errors (#31483 ) * [DOCS] Fix heading format errors. Closes #31327 * [DOCS] Fix heading format errors. Closes #31327	2018-06-25 17:25:32 -07:00
Jim Ferenczi	e33d107f84	Add missing_bucket option in the composite agg (#29465 ) This change adds a new option to the composite aggregation named `missing_bucket`. This option can be set by source and dictates whether documents without a value for the source should be ignored. When set to true, documents without a value for a field emit an explicit `null` value which is then added in the composite bucket. The `missing` option that allows to set an explicit value (instead of `null`) is deprecated in this change and will be removed in a follow up (only in 7.x). This commit also changes how the big arrays are allocated, instead of reserving the provided `size` for all sources they are created with a small initial size and they grow depending on the number of buckets created by the aggregation. Closes #29380	2018-05-30 09:48:40 +02:00
Julie Tibshirani	638a719370	Ensure that ip_range aggregations always return bucket keys. (#30701 )	2018-05-24 08:55:14 -07:00
Piotr Prądzyński	a0a8c4f186	filters agg docs duplicated 'bucket' word removal (#30677 ) In one place word 'bucket' was duplicated.	2018-05-17 15:21:50 +01:00
Jason Tedor	4a4e3d70d5	Default to one shard (#30539 ) This commit changes the default out-of-the-box configuration for the number of shards from five to one. We think this will help address a common problem of oversharding. For users with time-based indices that need a different default, this can be managed with index templates. For users with non-time-based indices that find they need to re-shard with the split API in place they no longer need to resort only to reindexing. Since this has the impact of changing the default number of shards used in REST tests, we want to ensure that we still have coverage for issues that could arise from multiple shards. As such, we randomize (rarely) the default number of shards in REST tests to two. This is managed via a global index template. However, some tests check the templates that are in the cluster state during the test. Since this template is randomly there, we need a way for tests to skip adding the template used to set the number of shards to two. For this we add the default_shards feature skip. To avoid having to write our docs in a complicated way because sometimes they might be behind one shard, and sometimes they might be behind two shards we apply the default_shards feature skip to all docs tests. That is, these tests will always run with the default number of shards (one).	2018-05-14 12:22:35 -04:00
Christoph Büscher	21dbf9fab0	Document time unit limitations for date histograms (#30177 ) Adding some allowed abbreviated values for intervals in date histograms as well as documenting the limitations of intervals larger than days. Closes #23294	2018-04-26 19:44:21 +02:00
Adrien Grand	ebd6b5b7ba	Deprecate filtering on `_type`. (#29468 ) As indices are only allowed to have one type now, and types are going away in the future, we should deprecate filtering by `_type`. Relates #15613	2018-04-13 09:07:51 +02:00
Jim Ferenczi	5288235ca3	Optimize the composite aggregation for match_all and range queries (#28745 ) This change refactors the composite aggregation to add an execution mode that visits documents in the order of the values present in the leading source of the composite definition. This mode does not need to visit all documents since it can early terminate the collection when the leading source value is greater than the lowest value in the queue. Instead of collecting the documents in the order of their doc_id, this mode uses the inverted lists (or the bkd tree for numerics) to collect documents in the order of the values present in the leading source. For instance the following aggregation: ``` "composite" : { "sources" : [ { "value1": { "terms" : { "field": "timestamp", "order": "asc" } } } ], "size": 10 } ``` ... can use the field `timestamp` to collect the documents with the 10 lowest values for the field instead of visiting all documents. For composite aggregation with more than one source the execution can early terminate as soon as one of the 10 lowest values produces enough composite buckets. For instance if visiting the first two lowest timestamp created 10 composite buckets we can early terminate the collection since it is guaranteed that the third lowest timestamp cannot create a composite key that compares lower than the one already visited. This mode can execute iff: * The leading source in the composite definition uses an indexed field of type `date` (works also with `date_histogram` source), `integer`, `long` or `keyword`. * The query is a match_all query or a range query over the field that is used as the leading source in the composite definition. * The sort order of the leading source is the natural order (ascending since postings and numerics are sorted in ascending order only). If these conditions are not met this aggregation visits each document like any other agg.	2018-03-26 09:51:37 +02:00
Paul Sanwald	6dae955b6a	Document and test date_range "missing" support (#28983 ) * Add a REST integration test that documents date_range support Add a test case that exercises date_range aggregations using the missing option. Addresses #17597 * Test cleanup and correction Adding a document with a null date to exercise `missing` option, update test name to something reasonable. * Update documentation to explain how the "missing" parameter works for date_range aggregations. * Wrap lines at 80 chars in docs. * Change format of test to YAML for readability.	2018-03-13 12:58:30 -07:00
Menno Oudshoorn	d018a0008e	Add a usage example of the JLH score (#28905 ) Adds a usage example of the JLH score used in significant terms aggregation. All other methods to calculate significance score have such an example Closes #28513	2018-03-06 15:37:18 +01:00
Tim Roes	5689dc1182	[Docs] Fix typo in composite aggregation (#28891 )	2018-03-04 11:47:24 -08:00
Clinton Gormley	45c1e37740	Add defined ID to terms agg size header	2018-02-02 13:43:20 +01:00
Jim Ferenczi	c4e0a84344	Mark the composite aggregation as a beta feature (#28431 ) The `composite` aggregation should be marked as beta (rather than experimental) in the documentation.	2018-02-02 09:24:10 +01:00
Jim Ferenczi	c26d4ac6c1	Always return the after_key in composite aggregation response (#28358 ) This change adds the `after_key` of a composite aggregation directly in the response. It is redundant when all buckets are not filtered/removed by a pipeline aggregation since in this case the `after_key` is always the last bucket in the response. Though when using a pipeline aggregation to filter composite buckets, the `after_key` can be lost if the last bucket is filtered. This commit fixes this situation by always returning the `after_key` in a dedicated section.	2018-01-25 09:15:27 +01:00
Jim Ferenczi	65184d0b5b	Adds a note in the `terms` aggregation docs regarding pagination (#28360 ) This change adds a note in the `terms` aggregation that explains how to retrieve all terms (or all combinations of terms in a nested agg) using the `composite` aggregation.	2018-01-25 08:59:41 +01:00
Alex Moros Marco	090ac3c2a2	[Doc] Fix typo in reverse-nested-aggregation.asciidoc (#28348 )	2018-01-24 17:54:02 +01:00
Jim Ferenczi	b2ce994be7	[Docs] Fix asciidoc style in composite agg docs	2018-01-23 16:41:32 +01:00
Jim Ferenczi	19cfc25873	Adds the ability to specify a format on composite date_histogram source (#28310 ) This commit adds the ability to specify a date format on the `date_histogram` composite source. If the format is defined, the key for the source is returned as a formatted date. Closes #27923	2018-01-23 15:14:49 +01:00
Christoph Büscher	556d77c9ad	[Docs] Add note on limitation for significant_text with nested objects (#28052 ) Add section to `significant_text` documentation mentioning that it currently does not support use on nested objects. Relates to #28050	2018-01-03 16:28:23 +01:00
Shaunak Kashyap	da0ed578b2	Fixing typo in param name: values => sources (#28016 )	2017-12-28 18:18:30 +01:00
Adrien Grand	1b660821a2	Allow `_doc` as a type. (#27816 ) Allowing `_doc` as a type will enable users to make the transition to 7.0 smoother since the index APIs will be `PUT index/_doc/id` and `POST index/_doc`. This also moves most of the documentation to `_doc` as a type name. Closes #27750 Closes #27751	2017-12-14 17:47:53 +01:00
Christoph Büscher	0d11b9fe34	[Docs] Unify spelling of Elasticsearch (#27567 ) Removes occurrences of "elasticsearch" or "ElasticSearch" in favour of "Elasticsearch" where appropriate.	2017-11-29 09:44:25 +01:00
Clinton Gormley	d1b1d711df	Update composite-aggregation.asciidoc Fixed asciidoc typo	2017-11-23 15:05:14 +01:00
Takumasa Ochi	eed8d1aee5	[DOC] Fix mathematical representation on interval (range) (#27450 )	2017-11-21 17:06:26 +00:00
Jim Ferenczi	d1093bd2fa	#26800 : Fix docs rendering	2017-11-20 08:41:02 +01:00
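
Commit 18866c4c0b above changes `hits.total` from a number to an object with `value` and `relation`, with `rest_total_hits_as_int` as an opt-out. A client that has to work against both formats might normalize them roughly as follows. This is only a sketch: the helper name `read_total_hits` is hypothetical, and treating a legacy integer total as `eq` is an assumption based on the commit's note that responses are currently accurate.

```python
from collections.abc import Mapping
from typing import Any, Optional, Tuple


def read_total_hits(response: Mapping) -> Optional[Tuple[int, str]]:
    """Return (value, relation) from a parsed search response body.

    Hypothetical helper, not part of any client library. Accepts both
    the object form ({"value": ..., "relation": ...}) and the legacy
    integer form of hits.total.
    """
    total: Any = response.get("hits", {}).get("total")
    if total is None:
        # hits.total is absent, e.g. when total hits are not tracked.
        return None
    if isinstance(total, Mapping):
        # Object form: "relation" is "eq" (accurate) or "gte" (lower bound).
        return int(total["value"]), str(total["relation"])
    # Legacy integer form; assumed here to be an accurate count.
    return int(total), "eq"


new_style = {"hits": {"total": {"value": 5, "relation": "gte"}}}
old_style = {"hits": {"total": 7}}
print(read_total_hits(new_style))
print(read_total_hits(old_style))
```

Returning `None` when the total is missing keeps "not tracked" distinguishable from a count of zero.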

155 Commits