LUCENE-10146: Add note that dot product is preferred over cosine (#400)

While VectorSimilarityFunction#COSINE is helpful when you need to preserve the
original vectors, it is significantly slower than DOT_PRODUCT. This commit adds
javadocs to COSINE explaining that dot product is the fastest option.
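
A minimal sketch of why the two functions agree on unit-length vectors, assuming plain float[] inputs. The class and method names below are hypothetical and are not part of Lucene; the sketch only illustrates normalizing once in advance and then scoring with a plain dot product, which skips the per-comparison norm computation that cosine needs.

```java
// Hypothetical helper for illustration only; not Lucene's implementation.
final class UnitVectorSketch {

  /** Returns a copy of v scaled to unit (L2) length. */
  static float[] normalize(float[] v) {
    double sumSquares = 0;
    for (float x : v) {
      sumSquares += (double) x * x;
    }
    if (sumSquares == 0) {
      throw new IllegalArgumentException("cannot normalize a zero vector");
    }
    float invNorm = (float) (1.0 / Math.sqrt(sumSquares));
    float[] out = new float[v.length];
    for (int i = 0; i < v.length; i++) {
      out[i] = v[i] * invNorm;
    }
    return out;
  }

  /** Plain dot product: one multiply-add per dimension. */
  static float dotProduct(float[] a, float[] b) {
    float sum = 0;
    for (int i = 0; i < a.length; i++) {
      sum += a[i] * b[i];
    }
    return sum;
  }

  /** Cosine similarity: also accumulates both squared norms on every call. */
  static float cosine(float[] a, float[] b) {
    float dot = 0;
    float normA = 0;
    float normB = 0;
    for (int i = 0; i < a.length; i++) {
      dot += a[i] * b[i];
      normA += a[i] * a[i];
      normB += b[i] * b[i];
    }
    return (float) (dot / Math.sqrt((double) normA * normB));
  }

  public static void main(String[] args) {
    float[] a = {3f, 4f};
    float[] b = {4f, 3f};
    // Cosine on the original vectors.
    System.out.println(cosine(a, b));
    // Dot product on vectors normalized in advance; agrees up to rounding.
    System.out.println(dotProduct(normalize(a), normalize(b)));
  }
}
```

Normalization happens once per vector at indexing time, so the extra norm work in cosine is paid on every comparison while the dot product path pays it only once.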
Julie Tibshirani 2021-10-20 09:50:25 -07:00 committed by GitHub
parent 5b8f0a5eb5
commit 6bb2bbcd6a
1 changed file with 6 additions and 1 deletion

@@ -56,7 +56,12 @@ public enum VectorSimilarityFunction {
     }
   },
 
-  /** Cosine similarity */
+  /**
+   * Cosine similarity. NOTE: the preferred way to perform cosine similarity is to normalize all
+   * vectors to unit length, and instead use {@link VectorSimilarityFunction#DOT_PRODUCT}. You
+   * should only use this function if you need to preserve the original vectors and cannot normalize
+   * them in advance.
+   */
   COSINE {
     @Override
     public float compare(float[] v1, float[] v2) {