OpenSearch

Commit Graph

Author	SHA1	Message	Date
Julie Tibshirani	40c3225d26	First round of optimizations for vector functions. (#46294) This PR merges the `vectors-optimize-brute-force` feature branch, which makes the following changes to how vector functions are computed: * Precompute the L2 norm of each vector at indexing time. (#45390) * Switch to ByteBuffer for vector encoding. (#45936) * Decode vectors while computing the vector function. (#46103) * Use an array instead of a List for the query vector. (#46155) * Precompute the normalized query vector when using cosine similarity. (#46190) Co-authored-by: Mayya Sharipova <mayya.sharipova@elastic.co>	2019-09-04 14:45:57 -07:00
Julie Tibshirani	d94c4dcffb	Use float instead of double for query vectors. (#46004) Currently, when using script_score functions like cosineSimilarity, the query vector is treated as an array of doubles. Since the stored document vectors use floats, it seems like the least surprising behavior for the query vectors to also be float arrays. In addition to improving consistency, this change may help with some optimizations we have been considering around vector dot product.	2019-08-28 11:03:14 -07:00
Mayya Sharipova	0c68765088	Adds usage stats for vectors (#45023) Example of usage: _xpack/usage "vectors": { "available": true, "enabled": true, "dense_vector_fields_count": 1, "sparse_vector_fields_count": 1, "dense_vector_dims_avg_count": 100 } Backport for #44512	2019-07-31 12:32:41 -04:00
Mayya Sharipova	32cb47b91c	Add l1norm and l2norm distances for vectors (#44116) Add L1norm - Manhattan distance. Add L2norm - Euclidean distance. Relates to #37947	2019-07-11 14:30:02 -04:00
Mayya Sharipova	37e1ad7062	Forbid empty doc values on vector functions (#43944) Currently, when a document is missing a vector value, the vector function returns 0 as the score for this document. We think this is incorrect behaviour. With this change, an error will be thrown if vector functions are used with docs that are missing vector doc values. VectorScriptDocValues is also modified to allow the size() function, which can be used to check whether a document has a value for the vector field.	2019-07-05 18:09:06 -04:00
Mayya Sharipova	756c42f99f	Add dims parameter to dense_vector mapping (#43444) (#43895) Typically, dense vectors of both documents and queries must have the same number of dimensions. A different number of dimensions among documents or in the query vector indicates an error. This PR enforces that all vectors for the same field have the same number of dimensions. It also enforces that query vectors have the same number of dimensions.	2019-07-02 21:14:16 -04:00
Mayya Sharipova	813551e070	Fix eclipse build gradle for vectors project. Closes #43496	2019-06-24 09:22:48 -04:00
Mayya Sharipova	aa6248d4d7	Move dense_vector and sparse_vector to module (#43280) (#43333)	2019-06-18 11:56:04 -04:00

8 Commits