OpenSearch

Commit Graph

Author	SHA1	Message	Date
Jim Ferenczi	c340814b34	Fix highlighting of overlapping terms in the unified highlighter (#47227 ) The passage formatter that the unified highlighter use doesn't handle terms with overlapping offsets. For tokenizer that provides multiple segmentation of the same terms (edge ngram for instance) the formatter should select the largest span in order to highlight the term only once. This change implements this logic.	2019-10-02 16:34:12 +02:00
Jim Ferenczi	08f28e642b	Replace SearchContext with QueryShardContext in query builder tests (#46978 ) This commit replaces the SearchContext used in AbstractQueryTestCase with a QueryShardContext in order to reduce the visibility of search contexts. Relates #46523	2019-09-23 20:24:02 +02:00
Jim Ferenczi	79a1390935	Add mapper-extras and the RankFeatureQuery in the hlrc (#43713 ) This change adds the support for the RankFeatureQuery in the HLRC by providing an extra dependency on mapper-extras-client. It also removes the dependency on lang-painless in mapper-extras which is not needed anymore since the move of the vector field into a dedicated module. Closes #43634	2019-08-14 18:41:39 +02:00
Julie Tibshirani	cc0ff3aa71	Ensure field caps doesn't error on rank feature fields. (#44370 ) The contract for MappedFieldType#fielddataBuilder is to throw an IllegalArgumentException if fielddata is not supported. The rank feature mappers were instead throwing an UnsupportedOperationException, which caused MappedFieldType#isAggregatable to fail.	2019-07-16 15:56:50 -07:00
Mayya Sharipova	aa6248d4d7	Move dense_vector and sparse_vector to module (#43280 ) (#43333 )	2019-06-18 11:56:04 -04:00
Mayya Sharipova	a7bdea8a15	BWC tests - move vector distance functions to 7.3	2019-06-14 12:41:27 -04:00
Jason Tedor	7f3ab4524f	Bump 7.x branch to version 7.2.0 This commit adds the 7.2.0 version constant to the 7.x branch, and bumps BWC logic accordingly.	2019-05-01 13:38:57 -04:00
Jim Ferenczi	67820f9da1	Fix search_as_you_type's sub-fields to pick their names from the full path of the root field (#41541 ) The subfields of the search_as_you_type are prefixed with the name of their root field. However they should used the full path of the root field rather than just the name since these fields can appear in a multi-`fields` definition or under an object field. Since this field type is not released yet, this should be considered as a non-issue.	2019-04-26 10:19:20 +02:00
Alpar Torok	25944c4317	convert modules to use testclusters (#40804 ) * convert modules to use testclusters * Eliminate PluginPropertiesTask and move logic in plugin where it belongs	2019-04-04 11:45:40 +03:00
Jim Ferenczi	e256eb361a	Fix merging of search_as_you_type field mapper (#40593 ) The merge of the `search_as_you_type` field mapper uses the wrong prefix field and does not update the underlying field types.	2019-03-29 09:02:40 +01:00
Jeff Hajewski	6c13ed7db8	Update max dims for vectors to 1024. (#40597 )	2019-03-28 17:08:14 -04:00
Andy Bristol	23395a9b9f	search as you type fieldmapper (#35600 ) Adds the search_as_you_type field type that acts like a text field optimized for as-you-type search completion. It creates a couple subfields that analyze the indexed terms as shingles, against which full terms are queried, and a prefix subfield that analyze terms as the largest shingle size used and edge-ngrams, against which partial terms are queried Adds a match_bool_prefix query type that creates a boolean clause of a term query for each term except the last, for which a boolean clause with a prefix query is created. The match_bool_prefix query is the recommended way of querying a search as you type field, which will boil down to term queries for each shingle of the input text on the appropriate shingle field, and the final (possibly partial) term as a term query on the prefix field. This field type also supports phrase and phrase prefix queries however	2019-03-27 13:29:13 -07:00
Julie Tibshirani	419cf1c02f	Fix an off-by-one error in the vector field dimension limit. (#40489 ) Previously only vectors up to 499 dimensions were accepted, whereas the stated limit is 500.	2019-03-27 11:17:58 -07:00
Mayya Sharipova	e80284231d	Backport distance functions vectors (#39330 ) Distance functions for dense and sparse vectors Backport for #37947, #39313	2019-02-23 11:52:43 -05:00
Julie Tibshirani	c2e9d13ebd	Default include_type_name to false in the yml test harness. (#38058 ) This PR removes the temporary change we made to the yml test harness in #37285 to automatically set `include_type_name` to `true` in index creation requests if it's not already specified. This is possible now that the vast majority of index creation requests were updated to be typeless in #37611. A few additional tests also needed updating here. Additionally, this PR updates the test harness to set `include_type_name` to `false` in index creation requests when communicating with 6.x nodes. This mirrors the logic added in #37611 to allow for typeless document write requests in test set-up code. With this update in place, we can remove many references to `include_type_name: false` from the yml tests.	2019-02-01 11:44:13 -08:00
Colin Goodheart-Smithe	21e392e95e	Removes typed calls from YAML REST tests (#37611 ) This PR attempts to remove all typed calls from our YAML REST tests. The PR adds include_type_name: false to create index requests that use a mapping and also to put mapping requests. It also removes _type from index requests where they haven't already been removed. The PR ignores tests named *_with_types.yml since this are specifically testing typed API behaviour. The change also includes changing the test harness to add the type _doc to index, update, get and bulk requests that do not specify the document type when the test is running against a mixed 7.x/6.x cluster.	2019-01-30 16:32:58 +00:00
Mayya Sharipova	a30ce6a00a	Rename feature, feature_vector and feature_query (#37794 ) Ranaming as follows: feature -> rank_feature feature_vector -> rank_features feature query -> rank_feature query Ranaming is done to distinguish from other vector types. Closes #36723	2019-01-24 19:18:48 -05:00
Alexander Reelsen	daa2ec8a60	Switch mapping/aggregations over to java time (#36363 ) This commit moves the aggregation and mapping code from joda time to java time. This includes field mappers, root object mappers, aggregations with date histograms, query builders and a lot of changes within tests. The cut-over to java time is a requirement so that we can support nanoseconds properly in a future field mapper. Relates #27330	2019-01-23 10:40:05 +01:00
Armin Braun	860a8a7b23	Improve Precision for scaled_float (#37169 ) * Use `toString` and `Bigdecimal` parsing to get intuitive behaviour for `scaled_float` as discussed in #32570 * Closes #32570	2019-01-11 08:07:55 +01:00
Nhat Nguyen	7580d9d925	Make SourceToParse immutable (#36971 ) Today the routing of a SourceToParse is assigned in a separate step after the object is created. We can easily forget to set the routing. With this commit, the routing must be provided in the constructor of SourceToParse. Relates #36921	2018-12-24 14:06:50 -05:00
Mayya Sharipova	b5d532f9e3	Vector field (#33022 ) 1. Dense vector PUT dindex { "mappings": { "_doc": { "properties": { "my_vector": { "type": "dense_vector" }, "my_text" : { "type" : "keyword" } } } } } PUT dinex/_doc/1 { "my_text" : "text1", "my_vector" : [ 0.5, 10, 6 ] } 2. Sparse vector PUT sindex { "mappings": { "_doc": { "properties": { "my_vector": { "type": "sparse_vector" }, "my_text" : { "type" : "keyword" } } } } } PUT sindex/_doc/1 { "my_text" : "text1", "my_vector" : {"1": 0.5, "99": -0.5, "5": 1} }	2018-12-12 21:20:53 -05:00
Jim Ferenczi	18866c4c0b	Make hits.total an object in the search response (#35849 ) This commit changes the format of the `hits.total` in the search response to be an object with a `value` and a `relation`. The `value` indicates the number of hits that match the query and the `relation` indicates whether the number is accurate (in which case the relation is equals to `eq`) or a lower bound of the total (in which case it is equals to `gte`). This change also adds a parameter called `rest_total_hits_as_int` that can be used in the search APIs to opt out from this change (retrieve the total hits as a number in the rest response). Note that currently all search responses are accurate (`track_total_hits: true`) or they don't contain `hits.total` (`track_total_hits: true`). We'll add a way to get a lower bound of the total hits in a follow up (to allow numbers to be passed to `track_total_hits`). Relates #33028	2018-12-05 19:49:06 +01:00
Nick Knize	a5e1f4d3a2	Upgrade to lucene-8.0.0-snapshot-31d7dfe6b1 (#35224 )	2018-11-06 11:55:23 +01:00
Julie Tibshirani	78df00ff24	Simplify the return type of FieldMapper#parse. (#32654 )	2018-09-04 01:15:19 +00:00
Jim Ferenczi	8e5f281b27	AbstractQueryTestCase should run without type less often (#28936 ) This commit changes the randomization to always create an index with a type. It also adds a way to create a query shard context that maps to an index with no type registered in order to explicitely test cases where there is no type.	2018-07-26 20:29:05 +02:00
Nick Peihl	ac63408655	Add region ISO code to GeoIP Ingest plugin (#31669 )	2018-07-20 11:23:29 -07:00
Tanguy Leroux	bf58660482	Remove all unused imports and fix CRLF (#31207 ) The X-Pack opening and the recent other refactorings left a lot of unused imports in the codebase. This commit removes them all.	2018-06-11 15:12:12 +02:00
Julie Tibshirani	8f607071b6	Remove DocumentFieldMappers#smartNameFieldMapper, as it is no longer needed. (#31018 )	2018-06-08 09:24:09 -07:00
Martijn van Groningen	07a57cc131	Move number of language analyzers to analysis-common module (#31143 ) The following analyzers were moved from server module to analysis-common module: `snowball`, `arabic`, `armenian`, `basque`, `bengali`, `brazilian`, `bulgarian`, `catalan`, `chinese`, `cjk`, `czech`, `danish`, `dutch`, `english`, `finnish`, `french`, `galician` and `german`. Relates to #23658	2018-06-08 08:58:46 +02:00
Adrien Grand	458bca11bc	Add a `feature_vector` field. (#31102 ) This field is similar to the `feature` field but is better suited to index sparse feature vectors. A use-case for this field could be to record topics associated with every documents alongside a metric that quantifies how well the topic is connected to this document, and then boost queries based on the topics that the logged user is interested in. Relates #27552	2018-06-07 10:05:37 +02:00
Christoph Büscher	3f87c79500	Change ObjectParser exception (#31030 ) ObjectParser should throw XContentParseExceptions, not IAE. A dedicated parsing exception can includes the place where the error occurred. Closes #30605	2018-06-04 20:20:37 +02:00
Julie Tibshirani	cd0a375414	Remove unused query methods from MappedFieldType. (#30987 ) * Remove MappedFieldType#nullValueQuery, as it is now unused. * Remove MappedFieldType#queryStringTermQuery, as it is never overridden.	2018-05-31 12:47:52 -07:00
Adrien Grand	886db84ad2	Expose Lucene's FeatureField. (#30618 ) Lucene has a new `FeatureField` which gives the ability to record numeric features as term frequencies. Its main benefit is that it allows to boost queries with the values of these features and efficiently skip non-competitive documents at the same time using block-max WAND and indexed impacts.	2018-05-23 08:55:21 +02:00
Adrien Grand	4918924fae	Remove legacy mapping code. (#29224 ) Some features have been deprecated since `6.0` like the `_parent` field or the ability to have multiple types per index. This allows to remove quite some code, which in-turn will hopefully make it easier to proceed with the removal of types.	2018-04-11 09:41:37 +02:00
Lee Hinman	8e8fdc4f0e	Decouple XContentBuilder from BytesReference (#28972 ) * Decouple XContentBuilder from BytesReference This commit removes all mentions of `BytesReference` from `XContentBuilder`. This is needed so that we can completely decouple the XContent code and move it into its own dependency. While this change appears large, it is due to two main changes, moving `.bytes()` and `.string()` out of XContentBuilder itself into static methods `BytesReference.bytes` and `Strings.toString` respectively. The rest of the change is code reacting to these changes (the majority of it in tests). Relates to #28504	2018-03-14 13:47:57 -06:00
Jason Tedor	5904d936fa	Copy Lucene IOUtils (#29012 ) As we have factored Elasticsearch into smaller libraries, we have ended up in a situation that some of the dependencies of Elasticsearch are not available to code that depends on these smaller libraries but not server Elasticsearch. This is a good thing, this was one of the goals of separating Elasticsearch into smaller libraries, to shed some of the dependencies from other components of the system. However, this now means that simple utility methods from Lucene that we rely on are no longer available everywhere. This commit copies IOUtils (with some small formatting changes for our codebase) into the fold so that other components of the system can rely on these methods where they no longer depend on Lucene.	2018-03-13 12:49:33 -04:00
Adrien Grand	700d9ecc95	Remove the `update_all_types` option. (#28288 ) This option is not useful in 7.x since no indices may have more than one type anymore.	2018-01-22 12:03:07 +01:00
Adrien Grand	7d88851766	Upgrade beats templates that we use for bwc testing. (#27929 ) These templates were generated with 5.0. We need those generated with 6.0 since we do not guarantee compatibility with previous versions of the template anyway. I removed the winlogbeat template which is a bit harder to generate as it requires Windows, since we do not aim to be exhaustive.	2017-12-21 08:50:14 +01:00
Jason Tedor	75c0cd0672	Move range field mapper back to core This commit moves the range field mapper back to core so that we can remove the compile-time dependency of percolator on mapper-extras which compilcates dependency management for the percolator client JAR, and modules should not be intertwined like this anyway. Relates #27854	2017-12-17 14:27:10 -05:00
Adrien Grand	1b660821a2	Allow `_doc` as a type. (#27816 ) Allowing `_doc` as a type will enable users to make the transition to 7.0 smoother since the index APIs will be `PUT index/_doc/id` and `POST index/_doc`. This also moves most of the documentation to `_doc` as a type name. Closes #27750 Closes #27751	2017-12-14 17:47:53 +01:00
Alan Woodward	77617c8e62	[TEST] Add test for _range fields in query_string queries (#27756 ) [TEST] Add test for _range fields in query_string queries Closes #26555	2017-12-12 13:33:37 +00:00
Adrien Grand	996990ad1f	Upgrade to lucene-7.2.0-snapshot-8c94404. (#27496 ) The main highlight of this new snapshot is that it introduces the opportunity for queries to opt out of caching. In case a query opts out of caching, not only will it never be cached, but also no compound query that wraps it will be cached.	2017-11-28 14:52:42 +01:00
Armin Braun	3deba0ed1f	#26260 Allow ip_range to accept CIDR notation (#27192 ) * #26260 Allow ip_range to accept CIDR notation * #26260 added non-byte-alligned cidr test cases	2017-11-03 13:34:48 -06:00
Armin Braun	8f0f024507	#27189 Fixed rounding of bounds in scaled float comparison (#27207 ) * #27189 Fixed rounding of bounds in scaled float comparison * #27189 more assertions from CR	2017-11-03 13:23:07 -06:00
Colin Goodheart-Smithe	99aca9cdfc	Enhances exists queries to reduce need for `_field_names` (#26930 ) * Enhances exists queries to reduce need for `_field_names` Before this change we wrote the name all the fields in a document to a `_field_names` field and then implemented exists queries as a term query on this field. The problem with this approach is that it bloats the index and also affects indexing performance. This change adds a new method `existsQuery()` to `MappedFieldType` which is implemented by each sub-class. For most field types if doc values are available a `DocValuesFieldExistsQuery` is used, falling back to using `_field_names` if doc values are disabled. Note that only fields where no doc values are available are written to `_field_names`. Closes #26770 * Addresses review comments * Addresses more review comments * implements existsQuery explicitly on every mapper * Reinstates ability to perform term query on `_field_names` * Added bwc depending on index created version * Review Comments * Skips tests that are not supported in 6.1.0 These values will need to be changed after backporting this PR to 6.x	2017-11-01 10:46:59 +00:00
Christoph Büscher	6189c54c84	Reject the `index_options` parameter for numeric fields (#26668 ) Numeric fields no longer support the index_options parameter. This changes the parameter to be rejected in numeric field types after it was deprecated in 6.0. Closes #21475	2017-09-25 23:43:14 +02:00
Adrien Grand	93da7720ff	Move non-core mappers to a module. (#26549 ) Today we have all non-plugin mappers in core. I'd like to start moving those that neither map to json datatypes nor are very frequently used like `date` or `ip` to a module. This commit creates a new module called `mappers-extra` and moves the `scaled_float` and `token_count` mappers to it. I'd like to eventually move `range` fields there but it's more complicated due to their intimate relationship with range queries. Relates #10368	2017-09-13 17:58:53 +02:00

47 Commits