OpenSearch

Commit Graph

Author	SHA1	Message	Date
Alan Woodward	4b9cbfca64	Remove test backported in error	2020-07-09 21:45:41 +01:00
Alan Woodward	f4caadd239	MappedFieldType no longer requires equals/hashCode/clone (#59212 ) With the removal of mapping types and the immutability of FieldTypeLookup in #58162, we no longer have any cause to compare MappedFieldType instances. This means that we can remove all equals and hashCode implementations, and in addition we no longer need the clone implementations which were required for equals/hashcode testing. This greatly simplifies implementing new MappedFieldTypes, which will be particularly useful for the runtime fields project.	2020-07-09 21:05:10 +01:00
Alan Woodward	8ebd341710	Add text search information to MappedFieldType (#58230 ) (#58432 ) Now that MappedFieldType no longer extends lucene's FieldType, we need to have a way of getting the index information about a field necessary for building text queries, building term vectors, highlighting, etc. This commit introduces a new TextSearchInfo abstraction that holds this information, and a getTextSearchInfo() method to MappedFieldType to make it available. Field types that do not support text search can just return null here. This allows us to remove the MapperService.getLuceneFieldType() shim method.	2020-06-23 14:37:26 +01:00
Alan Woodward	4b8cf2af6a	Add serialization test for FieldMappers when include_defaults=true (#58235 ) (#58328 ) Fixes a bug in TextFieldMapper serialization when index is false, and adds a base-class test to ensure that all field mappers are tested against all variations with defaults both included and excluded. Fixes #58188	2020-06-18 15:46:04 +01:00
Alan Woodward	ca2d12d039	Remove Settings parameter from FieldMapper base class (#58237 ) This is currently used to set the indexVersionCreated parameter on FieldMapper. However, this parameter is only actually used by two implementations, and clutters the API considerably. We should just remove it, and use it directly in the implementations that require it.	2020-06-18 12:53:54 +01:00
Alan Woodward	12a3f6dfca	MappedFieldType should not extend FieldType (#58160 ) MappedFieldType is a combination of two concerns: * an extension of lucene's FieldType, defining how a field should be indexed * a set of query factory methods, defining how a field should be searched We want to break these two concerns apart. This commit is a first step to doing this, breaking the inheritance relationship between MappedFieldType and FieldType. MappedFieldType instead has a series of boolean flags defining whether or not the field is searchable or aggregatable, and FieldMapper has a separate FieldType passed to its constructor defining how indexing should be done. Relates to #56814	2020-06-16 16:56:43 +01:00
Julie Tibshirani	e0a15e8dc4	Remove the 'array value parser' marker interface. (#57571 ) (#57622 ) This PR replaces the marker interface with the method FieldMapper#parsesArrayValue. I find this cleaner and it will help with the fields retrieval work (#55363). The refactor also ensures that only field mappers can declare they parse array values. Previously other types like ObjectMapper could implement the marker interface and be passed array values, which doesn't make sense.	2020-06-03 11:30:14 -07:00
Mark Tozzi	e50f514092	IndexFieldData should hold the ValuesSourceType (#57373 ) (#57532 )	2020-06-02 12:16:53 -04:00
Alan Woodward	18bfbeda29	Move merge compatibility logic from MappedFieldType to FieldMapper (#56915 ) Merging logic is currently split between FieldMapper, with its merge() method, and MappedFieldType, which checks for merging compatibility. The compatibility checks are called from a third class, MappingMergeValidator. This makes it difficult to reason about what is or is not compatible in updates, and even what is in fact updateable - we have a number of tests that check compatibility on changes in mapping configuration that are not in fact possible. This commit refactors the compatibility logic so that it all sits on FieldMapper, and makes it called at merge time. It adds a new FieldMapperTestCase base class that FieldMapper tests can extend, and moves the compatibility testing machinery from FieldTypeTestCase to here. Relates to #56814	2020-05-20 09:43:13 +01:00
Alan Woodward	d33d13f2be	Simplify generics on Mapper.Builder (#56747 ) Mapper.Builder currently has some complex generics on it to allow fluent builder construction. However, the second parameter, a return type from the build() method, is unnecessary, as we can use covariant return types. This commit removes this second generic parameter.	2020-05-15 12:14:49 +01:00
Mark Tozzi	b718193a01	Clean up DocValuesIndexFieldData (#56372 ) (#56684 )	2020-05-14 12:42:37 -04:00
Julie Tibshirani	e852bb29b7	Simplify signature of FieldMapper#parseCreateField. (#56144 ) `FieldMapper#parseCreateField` accepts the parse context, plus a list of fields as an output parameter. These fields are immediately added to the document through `ParseContext#doc()`. This commit simplifies the signature by removing the list of fields, and having the mappers add the fields directly to `ParseContext#doc()`. I think this is nicer for implementors, because previously fields could be added either through the list, or the context (through `add`, `addWithKey`, etc.)	2020-05-06 11:12:09 -07:00
William Brafford	3499fa917c	Deprecated xpack "enable" settings should be no-ops (#55416 ) (#56167 ) The following settings are now no-ops: * xpack.flattened.enabled * xpack.logstash.enabled * xpack.rollup.enabled * xpack.slm.enabled * xpack.sql.enabled * xpack.transform.enabled * xpack.vectors.enabled Since these settings no longer need to be checked, we can remove settings parameters from a number of constructors and methods, and do so in this commit. We also update documentation to remove references to these settings.	2020-05-05 10:40:49 -04:00
Ryan Ernst	52b9d8d15e	Convert remaining license methods to isAllowed (#55908 ) (#55991 ) This commit converts the remaining isXXXAllowed methods to instead of use isAllowed with a Feature value. There are a couple other methods that are static, as well as some licensed features that check the license directly, but those will be dealt with in other followups.	2020-04-30 15:52:22 -07:00
William Brafford	49e30b15a2	Deprecate disabling basic-license features (#54816 ) (#55405 ) We believe there's no longer a need to be able to disable basic-license features completely using the "xpack..enabled" settings. If users don't want to use those features, they simply don't need to use them. Having such features always available lets us build more complex features that assume basic-license features are present. This commit deprecates settings of the form "xpack..enabled" for basic-license features, excluding "security", which is a special case. It also removes deprecated settings from integration tests and unit tests where they're not directly relevant; e.g. monitoring and ILM are no longer disabled in many integration tests.	2020-04-17 15:04:17 -04:00
Jason Tedor	5fcda57b37	Rename MetaData to Metadata in all of the places (#54519 ) This is a simple naming change PR, to fix the fact that "metadata" is a single English word, and for too long we have not followed general naming conventions for it. We are also not consistent about it, for example, METADATA instead of META_DATA if we were trying to be consistent with MetaData (although METADATA is correct when considered in the context of "metadata"). This was a simple find and replace across the code base, only taking a few minutes to fix this naming issue forever.	2020-03-31 17:24:38 -04:00
Zachary Tong	c9db2de41d	[7.x] Comprehensively test supported/unsupported field type:agg combinations (#54451 ) * Comprehensively test supported/unsupported field type:agg combinations (#52493) This adds a test to AggregatorTestCase that allows us to programmatically verify that an aggregator supports or does not support a particular field type. It fetches the list of registered field type parsers, creates a MappedFieldType from the parser and then attempts to run a basic agg against the field. A supplied list of supported VSTypes are then compared against the output (success or exception) and suceeds or fails the test accordingly. Co-Authored-By: Mark Tozzi <mark.tozzi@gmail.com> * Skip fields that are not aggregatable * Use newIndexSearcher() to avoid incompatible readers (#52723) Lucene's `newSearcher()` can generate readers like ParallelCompositeReader which we can't use. We need to instead use our helper `newIndexSearcher`	2020-03-31 14:35:03 -04:00
Alan Woodward	71b703edd1	Rename AtomicFieldData to LeafFieldData (#53554 ) This conforms with lucene's LeafReader naming convention, and matches other per-segment structures in elasticsearch.	2020-03-17 12:30:12 +00:00
Nik Everett	1d1956ee93	Add size support to `top_metrics` (backport of #52662 ) (#52914 ) This adds support for returning the top "n" metrics instead of just the very top. Relates to #51813	2020-02-27 16:12:52 -05:00
Nik Everett	146def8caa	Implement top_metrics agg (#51155 ) (#52366 ) The `top_metrics` agg is kind of like `top_hits` but it only works on doc values so it should be faster. At this point it is fairly limited in that it only supports a single, numeric sort and a single, numeric metric. And it only fetches the "very topest" document worth of metric. We plan to support returning a configurable number of top metrics, requesting more than one metric and more than one sort. And, eventually, non-numeric sorts and metrics. The trick is doing those things fairly efficiently. Co-Authored by: Zachary Tong <zach@elastic.co>	2020-02-14 11:19:11 -05:00
Mayya Sharipova	e3da60c23d	Increase the number of vector dims to 2048 (#46895 )	2019-11-20 07:47:33 -05:00
Julie Tibshirani	ae1ef5fd92	Refactor unit tests for vector functions. (#48662 ) This PR performs the following changes: * Split `ScoreScriptUtilsTests` into `DenseVectorFunctionTests` and `SparseVectorFunctionTests`. This will make it easier to delete all sparse vector function tests once we remove support on 8.x. * As much as possible, break up the large test methods into individual tests for each vector function (`cosineSimilarity`, `l2norm`, etc.).	2019-10-30 15:36:06 -07:00
Julie Tibshirani	89c65752dc	Update the signature of vector script functions. (#48653 ) Previously the functions accepted a doc values reference, whereas they now accept the name of the vector field. Here's an example of how a vector function was called before and after the change. ``` Before: cosineSimilarity(params.query_vector, doc['field']) After: cosineSimilarity(params.query_vector, 'field') ``` This seems more intuitive, since we don't allow direct access to vector doc values and the the meaning of `doc['field']` is unclear. The PR makes the following changes (broken into distinct commits): * Add new function signatures of the form `function(params.query_vector, 'field')` and deprecates the old ones. Because Painless doesn't allow two methods with the same name and number of arguments, we allow a generic `Object` to be passed in to the function and decide on the behavior through an `instanceof` check. * Refactor the class bindings so that the document field is passed to the constructor instead of the instance method. This allows us to avoid retrieving the vector doc values on every function invocation, which gives a tiny speed-up in benchmarks. Note that this PR adds new signatures for the sparse vector functions too, even though sparse vectors are deprecated. It seemed simplest to understand (for both us and users) to keep everything symmetric between dense and sparse vectors.	2019-10-29 15:46:05 -07:00
Julie Tibshirani	2664cbd20b	Deprecate the sparse_vector field type. (#48368 ) We have not seen much adoption of this experimental field type, and don't see a clear use case as it's currently designed. This PR deprecates the field type in 7.x. It will be removed from 8.0 in a follow-up PR.	2019-10-23 16:35:03 -07:00
Henning Andersen	42453aec96	Fix XPackPlugin usages in tests (#47252 ) XPackPlugin holds data in statics and can only be initialized once. This caused tests to fail primarily when running with a low max-workers. Replaced usages with the LocalStateCompositeXPackPlugin, which handles this properly for testing.	2019-10-02 12:36:02 +02:00
Julie Tibshirani	40c3225d26	First round of optimizations for vector functions. (#46294 ) This PR merges the `vectors-optimize-brute-force` feature branch, which makes the following changes to how vector functions are computed: * Precompute the L2 norm of each vector at indexing time. (#45390) * Switch to ByteBuffer for vector encoding. (#45936) * Decode vectors and while computing the vector function. (#46103) * Use an array instead of a List for the query vector. (#46155) * Precompute the normalized query vector when using cosine similarity. (#46190) Co-authored-by: Mayya Sharipova <mayya.sharipova@elastic.co>	2019-09-04 14:45:57 -07:00
Julie Tibshirani	d94c4dcffb	Use float instead of double for query vectors. (#46004 ) Currently, when using script_score functions like cosineSimilarity, the query vector is treated as an array of doubles. Since the stored document vectors use floats, it seems like the least surprising behavior for the query vectors to also be float arrays. In addition to improving consistency, this change may help with some optimizations we have been considering around vector dot product.	2019-08-28 11:03:14 -07:00
Mayya Sharipova	0c68765088	Adds usage stats for vectors (#45023 ) Example of usage: _xpack/usage "vectors": { "available": true, "enabled": true, "dense_vector_fields_count" : 1, "sparse_vector_fields_count" : 1, "dense_vector_dims_avg_count" : 100 } Backport for #44512	2019-07-31 12:32:41 -04:00
Mayya Sharipova	32cb47b91c	Add l1norm and l2norm distances for vectors (#44116 ) Add L1norm - Manhattan distance Add L2norm - Euclidean distance relates to #37947	2019-07-11 14:30:02 -04:00
Mayya Sharipova	37e1ad7062	Forbid empty doc values on vector functions (#43944 ) Currently when a document misses a vector value, vector function returns 0 as a score for this document. We think this is incorrect behaviour. With this change, an error will be thrown if vector functions are used with docs that are missing vector doc values. Also VectorScriptDocValues is modified to allow size() function, which can be used to check if a document has a value for the vector field.	2019-07-05 18:09:06 -04:00
Mayya Sharipova	756c42f99f	Add dims parameter to dense_vector mapping (#43444 ) (#43895 ) Typically, dense vectors of both documents and queries must have the same number of dimensions. Different number of dimensions among documents or query vector indicate an error. This PR enforces that all vectors for the same field have the same number of dimensions. It also enforces that query vectors have the same number of dimensions.	2019-07-02 21:14:16 -04:00
Mayya Sharipova	aa6248d4d7	Move dense_vector and sparse_vector to module (#43280 ) (#43333 )	2019-06-18 11:56:04 -04:00

32 Commits