mirror of https://github.com/apache/lucene.git
LUCENE-3749: replace SimilarityProvider with PerFieldSimilarityWrapper
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1241001 13f79535-47bb-0310-9956-ffa450edef68
parent dfce8dd760
commit 4710d32447
|
@ -216,7 +216,7 @@ Changes in backwards compatibility policy
|
||||||
and clone() no longer take readOnly booleans or IndexDeletionPolicy
|
and clone() no longer take readOnly booleans or IndexDeletionPolicy
|
||||||
instances. Furthermore, IndexReader.setNorm() was removed. If you need
|
instances. Furthermore, IndexReader.setNorm() was removed. If you need
|
||||||
customized norm values, the recommended way to do this is by modifying
|
customized norm values, the recommended way to do this is by modifying
|
||||||
SimilarityProvider to use an external byte[] or one of the new DocValues
|
Similarity to use an external byte[] or one of the new DocValues
|
||||||
fields (LUCENE-3108). Alternatively, to dynamically change norms (boost
|
fields (LUCENE-3108). Alternatively, to dynamically change norms (boost
|
||||||
*and* length norm) at query time, wrap your IndexReader using
|
*and* length norm) at query time, wrap your IndexReader using
|
||||||
FilterIndexReader, overriding FilterIndexReader.norms(). To persist the
|
FilterIndexReader, overriding FilterIndexReader.norms(). To persist the
|
||||||
|
@ -583,16 +583,12 @@ New features
|
||||||
for plugging in new ranking algorithms without dealing with all of the
|
for plugging in new ranking algorithms without dealing with all of the
|
||||||
nuances and implementation details of Lucene.
|
nuances and implementation details of Lucene.
|
||||||
|
|
||||||
- Added a new helper class BasicSimilarityProvider that just applies one
|
- For example, to use BM25 for all fields:
|
||||||
scoring algorithm to all fields, with queryNorm() and coord() returning 1.
|
searcher.setSimilarity(new BM25Similarity());
|
||||||
In general, it is recommended to disable coord() when using the new models.
|
|
||||||
For example, to use BM25 for all fields:
|
|
||||||
searcher.setSimilarityProvider(
|
|
||||||
new BasicSimilarityProvider(new BM25Similarity()));
|
|
||||||
|
|
||||||
If you instead want to apply different similarities (e.g. ones with
|
If you instead want to apply different similarities (e.g. ones with
|
||||||
different parameter values or different algorithms entirely) to different
|
different parameter values or different algorithms entirely) to different
|
||||||
fields, implement SimilarityProvider with your per-field logic.
|
fields, implement PerFieldSimilarityWrapper with your per-field logic.
|
||||||
|
|
||||||
(David Mark Nemeskey via Robert Muir)
|
(David Mark Nemeskey via Robert Muir)
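As a rough illustration of the "per-field logic" this entry refers to, the sketch below models the pattern with stand-in types. `PerFieldSketch`, `Sim`, and `PerFieldWrapper` are hypothetical names, not Lucene classes. The only detail taken from this commit is the shape used in the SweetSpotSimilarityTest change: a wrapper subclass overrides `get(String field)` and returns the similarity to use for that field.

```java
/** Stand-in model only: these types are hypothetical and are NOT Lucene's classes. */
final class PerFieldSketch {

  /** Hypothetical stand-in for a scoring model; the label just identifies it. */
  static final class Sim {
    final String label;

    Sim(String label) {
      this.label = label;
    }
  }

  /** Hypothetical stand-in for the wrapper: subclasses choose a Sim per field. */
  abstract static class PerFieldWrapper {
    abstract Sim get(String field);
  }

  /** Builds a wrapper that uses one model for "title" and another for everything else. */
  static PerFieldWrapper example() {
    final Sim title = new Sim("title-model");
    final Sim fallback = new Sim("default-model");
    return new PerFieldWrapper() {
      @Override
      Sim get(String field) {
        // Per-field logic lives here: compare the field name and pick a model.
        return field.equals("title") ? title : fallback;
      }
    };
  }

  public static void main(String[] args) {
    PerFieldWrapper wrapper = example();
    System.out.println(wrapper.get("title").label);
    System.out.println(wrapper.get("body").label);
  }
}
```

Creating the per-field instances once and returning them from `get` (rather than constructing a new one per call) mirrors how the test in this commit returns pre-built `SweetSpotSimilarity` instances.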
|
||||||
|
|
||||||
|
@ -774,7 +770,7 @@ API Changes
|
||||||
query time, wrap your IndexReader using FilterIndexReader, overriding
|
query time, wrap your IndexReader using FilterIndexReader, overriding
|
||||||
FilterIndexReader.norms(). To persist the changes on disk, copy the
|
FilterIndexReader.norms(). To persist the changes on disk, copy the
|
||||||
FilteredIndexReader to a new index using IndexWriter.addIndexes().
|
FilteredIndexReader to a new index using IndexWriter.addIndexes().
|
||||||
In Lucene 4.0, SimilarityProvider will allow you to customize scoring
|
In Lucene 4.0, Similarity will allow you to customize scoring
|
||||||
using external norms, too. (Uwe Schindler, Robert Muir)
|
using external norms, too. (Uwe Schindler, Robert Muir)
|
||||||
|
|
||||||
* LUCENE-3735: PayloadProcessorProvider was changed to return a
|
* LUCENE-3735: PayloadProcessorProvider was changed to return a
|
||||||
|
|
|
@ -351,13 +351,9 @@ LUCENE-1458, LUCENE-2111: Flexible Indexing
|
||||||
|
|
||||||
* LUCENE-2236, LUCENE-2912: DefaultSimilarity can no longer be set statically
|
* LUCENE-2236, LUCENE-2912: DefaultSimilarity can no longer be set statically
|
||||||
(and dangerously) for the entire JVM.
|
(and dangerously) for the entire JVM.
|
||||||
Instead, IndexWriterConfig and IndexSearcher now take a SimilarityProvider.
|
Similarity can now be configured on a per-field basis (via PerFieldSimilarityWrapper)
|
||||||
Similarity can now be configured on a per-field basis.
|
Similarity has a lower-level API, if you want the higher-level vector-space API
|
||||||
Similarity retains only the field-specific relevance methods such as tf() and idf().
|
like in previous Lucene releases, then look at TFIDFSimilarity.
|
||||||
Previously some (but not all) of these methods, such as computeNorm and scorePayload took
|
|
||||||
field as a parameter, this is removed due to the fact the entire Similarity (all methods)
|
|
||||||
can now be configured per-field.
|
|
||||||
Methods that apply to the entire query such as coord() and queryNorm() exist in SimilarityProvider.
|
|
||||||
|
|
||||||
* LUCENE-1076: TieredMergePolicy is now the default merge policy.
|
* LUCENE-1076: TieredMergePolicy is now the default merge policy.
|
||||||
It's able to merge non-contiguous segments; this may cause problems
|
It's able to merge non-contiguous segments; this may cause problems
|
||||||
|
|
|
@ -81,7 +81,7 @@ API Changes
|
||||||
for API use. (Andrzej Bialecki)
|
for API use. (Andrzej Bialecki)
|
||||||
|
|
||||||
* LUCENE-2912: The field-specific hashmaps in SweetSpotSimilarity were removed.
|
* LUCENE-2912: The field-specific hashmaps in SweetSpotSimilarity were removed.
|
||||||
Instead, use SimilarityProvider to return different SweetSpotSimilaritys
|
Instead, use PerFieldSimilarityWrapper to return different SweetSpotSimilaritys
|
||||||
for different fields, this way all parameters (such as TF factors) can be
|
for different fields, this way all parameters (such as TF factors) can be
|
||||||
customized on a per-field basis. (Robert Muir)
|
customized on a per-field basis. (Robert Muir)
|
||||||
|
|
||||||
|
|
|
@ -56,7 +56,6 @@ import org.apache.lucene.search.IndexSearcher;
|
||||||
import org.apache.lucene.search.Query;
|
import org.apache.lucene.search.Query;
|
||||||
import org.apache.lucene.search.Scorer;
|
import org.apache.lucene.search.Scorer;
|
||||||
import org.apache.lucene.search.similarities.Similarity;
|
import org.apache.lucene.search.similarities.Similarity;
|
||||||
import org.apache.lucene.search.similarities.SimilarityProvider;
|
|
||||||
import org.apache.lucene.store.RAMDirectory; // for javadocs
|
import org.apache.lucene.store.RAMDirectory; // for javadocs
|
||||||
import org.apache.lucene.util.ArrayUtil;
|
import org.apache.lucene.util.ArrayUtil;
|
||||||
import org.apache.lucene.util.Bits;
|
import org.apache.lucene.util.Bits;
|
||||||
|
@ -1085,9 +1084,9 @@ public class MemoryIndex {
|
||||||
}
|
}
|
||||||
}
|
}
|
||||||
|
|
||||||
private SimilarityProvider getSimilarityProvider() {
|
private Similarity getSimilarity() {
|
||||||
if (searcher != null) return searcher.getSimilarityProvider();
|
if (searcher != null) return searcher.getSimilarity();
|
||||||
return IndexSearcher.getDefaultSimilarityProvider();
|
return IndexSearcher.getDefaultSimilarity();
|
||||||
}
|
}
|
||||||
|
|
||||||
private void setSearcher(IndexSearcher searcher) {
|
private void setSearcher(IndexSearcher searcher) {
|
||||||
|
@ -1131,21 +1130,20 @@ public class MemoryIndex {
|
||||||
/** performance hack: cache norms to avoid repeated expensive calculations */
|
/** performance hack: cache norms to avoid repeated expensive calculations */
|
||||||
private DocValues cachedNormValues;
|
private DocValues cachedNormValues;
|
||||||
private String cachedFieldName;
|
private String cachedFieldName;
|
||||||
private SimilarityProvider cachedSimilarity;
|
private Similarity cachedSimilarity;
|
||||||
|
|
||||||
@Override
|
@Override
|
||||||
public DocValues normValues(String field) throws IOException {
|
public DocValues normValues(String field) throws IOException {
|
||||||
DocValues norms = cachedNormValues;
|
DocValues norms = cachedNormValues;
|
||||||
SimilarityProvider sim = getSimilarityProvider();
|
Similarity sim = getSimilarity();
|
||||||
if (!field.equals(cachedFieldName) || sim != cachedSimilarity) { // not cached?
|
if (!field.equals(cachedFieldName) || sim != cachedSimilarity) { // not cached?
|
||||||
Info info = getInfo(field);
|
Info info = getInfo(field);
|
||||||
Similarity fieldSim = sim.get(field);
|
|
||||||
int numTokens = info != null ? info.numTokens : 0;
|
int numTokens = info != null ? info.numTokens : 0;
|
||||||
int numOverlapTokens = info != null ? info.numOverlapTokens : 0;
|
int numOverlapTokens = info != null ? info.numOverlapTokens : 0;
|
||||||
float boost = info != null ? info.getBoost() : 1.0f;
|
float boost = info != null ? info.getBoost() : 1.0f;
|
||||||
FieldInvertState invertState = new FieldInvertState(0, numTokens, numOverlapTokens, 0, boost);
|
FieldInvertState invertState = new FieldInvertState(field, 0, numTokens, numOverlapTokens, 0, boost);
|
||||||
Norm norm = new Norm();
|
Norm norm = new Norm();
|
||||||
fieldSim.computeNorm(invertState, norm);
|
sim.computeNorm(invertState, norm);
|
||||||
SingleValueSource singleByteSource = new SingleValueSource(norm);
|
SingleValueSource singleByteSource = new SingleValueSource(norm);
|
||||||
norms = new MemoryIndexNormDocValues(singleByteSource);
|
norms = new MemoryIndexNormDocValues(singleByteSource);
|
||||||
// cache it for future reuse
|
// cache it for future reuse
|
||||||
|
|
|
@ -19,9 +19,8 @@
|
||||||
package org.apache.lucene.misc;
|
package org.apache.lucene.misc;
|
||||||
|
|
||||||
import org.apache.lucene.search.similarities.DefaultSimilarity;
|
import org.apache.lucene.search.similarities.DefaultSimilarity;
|
||||||
import org.apache.lucene.search.similarities.DefaultSimilarityProvider;
|
import org.apache.lucene.search.similarities.PerFieldSimilarityWrapper;
|
||||||
import org.apache.lucene.search.similarities.Similarity;
|
import org.apache.lucene.search.similarities.Similarity;
|
||||||
import org.apache.lucene.search.similarities.SimilarityProvider;
|
|
||||||
import org.apache.lucene.search.similarities.TFIDFSimilarity;
|
import org.apache.lucene.search.similarities.TFIDFSimilarity;
|
||||||
import org.apache.lucene.util.LuceneTestCase;
|
import org.apache.lucene.util.LuceneTestCase;
|
||||||
import org.apache.lucene.index.Norm;
|
import org.apache.lucene.index.Norm;
|
||||||
|
@ -53,7 +52,7 @@ public class SweetSpotSimilarityTest extends LuceneTestCase {
|
||||||
|
|
||||||
|
|
||||||
// base case, should degrade
|
// base case, should degrade
|
||||||
final FieldInvertState invertState = new FieldInvertState();
|
FieldInvertState invertState = new FieldInvertState("bogus");
|
||||||
invertState.setBoost(1.0f);
|
invertState.setBoost(1.0f);
|
||||||
for (int i = 1; i < 1000; i++) {
|
for (int i = 1; i < 1000; i++) {
|
||||||
invertState.setLength(i);
|
invertState.setLength(i);
|
||||||
|
@ -102,7 +101,8 @@ public class SweetSpotSimilarityTest extends LuceneTestCase {
|
||||||
final SweetSpotSimilarity ssB = new SweetSpotSimilarity();
|
final SweetSpotSimilarity ssB = new SweetSpotSimilarity();
|
||||||
ssB.setLengthNormFactors(5,8,0.1f, false);
|
ssB.setLengthNormFactors(5,8,0.1f, false);
|
||||||
|
|
||||||
SimilarityProvider sp = new DefaultSimilarityProvider() {
|
Similarity sp = new PerFieldSimilarityWrapper() {
|
||||||
|
@Override
|
||||||
public Similarity get(String field) {
|
public Similarity get(String field) {
|
||||||
if (field.equals("bar"))
|
if (field.equals("bar"))
|
||||||
return ssBar;
|
return ssBar;
|
||||||
|
@ -116,53 +116,68 @@ public class SweetSpotSimilarityTest extends LuceneTestCase {
|
||||||
return ss;
|
return ss;
|
||||||
}
|
}
|
||||||
};
|
};
|
||||||
|
|
||||||
|
invertState = new FieldInvertState("foo");
|
||||||
|
invertState.setBoost(1.0f);
|
||||||
for (int i = 3; i <=10; i++) {
|
for (int i = 3; i <=10; i++) {
|
||||||
invertState.setLength(i);
|
invertState.setLength(i);
|
||||||
assertEquals("f: 3,10: spot i="+i,
|
assertEquals("f: 3,10: spot i="+i,
|
||||||
1.0f,
|
1.0f,
|
||||||
computeAndDecodeNorm(ss, sp.get("foo"), invertState),
|
computeAndDecodeNorm(ss, sp, invertState),
|
||||||
0.0f);
|
0.0f);
|
||||||
}
|
}
|
||||||
|
|
||||||
for (int i = 10; i < 1000; i++) {
|
for (int i = 10; i < 1000; i++) {
|
||||||
invertState.setLength(i-9);
|
invertState.setLength(i-9);
|
||||||
final byte normD = computeAndGetNorm(d, invertState);
|
final byte normD = computeAndGetNorm(d, invertState);
|
||||||
invertState.setLength(i);
|
invertState.setLength(i);
|
||||||
final byte normS = computeAndGetNorm(sp.get("foo"), invertState);
|
final byte normS = computeAndGetNorm(sp, invertState);
|
||||||
assertEquals("f: 3,10: 10<x : i="+i,
|
assertEquals("f: 3,10: 10<x : i="+i,
|
||||||
normD,
|
normD,
|
||||||
normS,
|
normS,
|
||||||
0.0f);
|
0.0f);
|
||||||
}
|
}
|
||||||
|
|
||||||
|
invertState = new FieldInvertState("bar");
|
||||||
|
invertState.setBoost(1.0f);
|
||||||
for (int i = 8; i <=13; i++) {
|
for (int i = 8; i <=13; i++) {
|
||||||
invertState.setLength(i);
|
invertState.setLength(i);
|
||||||
assertEquals("f: 8,13: spot i="+i,
|
assertEquals("f: 8,13: spot i="+i,
|
||||||
1.0f,
|
1.0f,
|
||||||
computeAndDecodeNorm(ss, sp.get("bar"), invertState),
|
computeAndDecodeNorm(ss, sp, invertState),
|
||||||
0.0f);
|
0.0f);
|
||||||
}
|
}
|
||||||
|
|
||||||
|
invertState = new FieldInvertState("yak");
|
||||||
|
invertState.setBoost(1.0f);
|
||||||
for (int i = 6; i <=9; i++) {
|
for (int i = 6; i <=9; i++) {
|
||||||
invertState.setLength(i);
|
invertState.setLength(i);
|
||||||
assertEquals("f: 6,9: spot i="+i,
|
assertEquals("f: 6,9: spot i="+i,
|
||||||
1.0f,
|
1.0f,
|
||||||
computeAndDecodeNorm(ss, sp.get("yak"), invertState),
|
computeAndDecodeNorm(ss, sp, invertState),
|
||||||
0.0f);
|
0.0f);
|
||||||
}
|
}
|
||||||
|
|
||||||
|
invertState = new FieldInvertState("bar");
|
||||||
|
invertState.setBoost(1.0f);
|
||||||
for (int i = 13; i < 1000; i++) {
|
for (int i = 13; i < 1000; i++) {
|
||||||
invertState.setLength(i-12);
|
invertState.setLength(i-12);
|
||||||
final byte normD = computeAndGetNorm(d, invertState);
|
final byte normD = computeAndGetNorm(d, invertState);
|
||||||
invertState.setLength(i);
|
invertState.setLength(i);
|
||||||
final byte normS = computeAndGetNorm(sp.get("bar"), invertState);
|
final byte normS = computeAndGetNorm(sp, invertState);
|
||||||
assertEquals("f: 8,13: 13<x : i="+i,
|
assertEquals("f: 8,13: 13<x : i="+i,
|
||||||
normD,
|
normD,
|
||||||
normS,
|
normS,
|
||||||
0.0f);
|
0.0f);
|
||||||
}
|
}
|
||||||
|
|
||||||
|
invertState = new FieldInvertState("yak");
|
||||||
|
invertState.setBoost(1.0f);
|
||||||
for (int i = 9; i < 1000; i++) {
|
for (int i = 9; i < 1000; i++) {
|
||||||
invertState.setLength(i-8);
|
invertState.setLength(i-8);
|
||||||
final byte normD = computeAndGetNorm(d, invertState);
|
final byte normD = computeAndGetNorm(d, invertState);
|
||||||
invertState.setLength(i);
|
invertState.setLength(i);
|
||||||
final byte normS = computeAndGetNorm(sp.get("yak"), invertState);
|
final byte normS = computeAndGetNorm(sp, invertState);
|
||||||
assertEquals("f: 6,9: 9<x : i="+i,
|
assertEquals("f: 6,9: 9<x : i="+i,
|
||||||
normD,
|
normD,
|
||||||
normS,
|
normS,
|
||||||
|
@ -173,9 +188,14 @@ public class SweetSpotSimilarityTest extends LuceneTestCase {
|
||||||
// steepness
|
// steepness
|
||||||
|
|
||||||
for (int i = 9; i < 1000; i++) {
|
for (int i = 9; i < 1000; i++) {
|
||||||
|
invertState = new FieldInvertState("a");
|
||||||
|
invertState.setBoost(1.0f);
|
||||||
invertState.setLength(i);
|
invertState.setLength(i);
|
||||||
final byte normSS = computeAndGetNorm(sp.get("a"), invertState);
|
final byte normSS = computeAndGetNorm(sp, invertState);
|
||||||
final byte normS = computeAndGetNorm(sp.get("b"), invertState);
|
invertState = new FieldInvertState("b");
|
||||||
|
invertState.setBoost(1.0f);
|
||||||
|
invertState.setLength(i);
|
||||||
|
final byte normS = computeAndGetNorm(sp, invertState);
|
||||||
assertTrue("s: i="+i+" : a="+normSS+
|
assertTrue("s: i="+i+" : a="+normSS+
|
||||||
" < b="+normS,
|
" < b="+normS,
|
||||||
normSS < normS);
|
normSS < normS);
|
||||||
|
|
|
@ -37,8 +37,6 @@ final class DocInverter extends DocFieldConsumer {
|
||||||
|
|
||||||
final DocumentsWriterPerThread.DocState docState;
|
final DocumentsWriterPerThread.DocState docState;
|
||||||
|
|
||||||
final FieldInvertState fieldState = new FieldInvertState();
|
|
||||||
|
|
||||||
final SingleTokenAttributeSource singleToken = new SingleTokenAttributeSource();
|
final SingleTokenAttributeSource singleToken = new SingleTokenAttributeSource();
|
||||||
|
|
||||||
static class SingleTokenAttributeSource extends AttributeSource {
|
static class SingleTokenAttributeSource extends AttributeSource {
|
||||||
|
|
|
@ -42,7 +42,7 @@ final class DocInverterPerField extends DocFieldConsumerPerField {
|
||||||
public DocInverterPerField(DocInverter parent, FieldInfo fieldInfo) {
|
public DocInverterPerField(DocInverter parent, FieldInfo fieldInfo) {
|
||||||
this.fieldInfo = fieldInfo;
|
this.fieldInfo = fieldInfo;
|
||||||
docState = parent.docState;
|
docState = parent.docState;
|
||||||
fieldState = parent.fieldState;
|
fieldState = new FieldInvertState(fieldInfo.name);
|
||||||
this.consumer = parent.consumer.addField(this, fieldInfo);
|
this.consumer = parent.consumer.addField(this, fieldInfo);
|
||||||
this.endConsumer = parent.endConsumer.addField(this, fieldInfo);
|
this.endConsumer = parent.endConsumer.addField(this, fieldInfo);
|
||||||
}
|
}
|
||||||
|
|
|
@ -31,7 +31,7 @@ import org.apache.lucene.index.DocumentsWriterPerThread.IndexingChain;
|
||||||
import org.apache.lucene.index.DocumentsWriterPerThreadPool.ThreadState;
|
import org.apache.lucene.index.DocumentsWriterPerThreadPool.ThreadState;
|
||||||
import org.apache.lucene.index.FieldInfos.FieldNumberBiMap;
|
import org.apache.lucene.index.FieldInfos.FieldNumberBiMap;
|
||||||
import org.apache.lucene.search.Query;
|
import org.apache.lucene.search.Query;
|
||||||
import org.apache.lucene.search.similarities.SimilarityProvider;
|
import org.apache.lucene.search.similarities.Similarity;
|
||||||
import org.apache.lucene.store.AlreadyClosedException;
|
import org.apache.lucene.store.AlreadyClosedException;
|
||||||
import org.apache.lucene.store.Directory;
|
import org.apache.lucene.store.Directory;
|
||||||
import org.apache.lucene.util.InfoStream;
|
import org.apache.lucene.util.InfoStream;
|
||||||
|
@ -106,7 +106,7 @@ final class DocumentsWriter {
|
||||||
private volatile boolean closed;
|
private volatile boolean closed;
|
||||||
|
|
||||||
final InfoStream infoStream;
|
final InfoStream infoStream;
|
||||||
SimilarityProvider similarityProvider;
|
Similarity similarity;
|
||||||
|
|
||||||
List<String> newFiles;
|
List<String> newFiles;
|
||||||
|
|
||||||
|
@ -140,7 +140,7 @@ final class DocumentsWriter {
|
||||||
this.directory = directory;
|
this.directory = directory;
|
||||||
this.indexWriter = writer;
|
this.indexWriter = writer;
|
||||||
this.infoStream = config.getInfoStream();
|
this.infoStream = config.getInfoStream();
|
||||||
this.similarityProvider = config.getSimilarityProvider();
|
this.similarity = config.getSimilarity();
|
||||||
this.perThreadPool = config.getIndexerThreadPool();
|
this.perThreadPool = config.getIndexerThreadPool();
|
||||||
this.chain = config.getIndexingChain();
|
this.chain = config.getIndexingChain();
|
||||||
this.perThreadPool.initialize(this, globalFieldNumbers, config);
|
this.perThreadPool.initialize(this, globalFieldNumbers, config);
|
||||||
|
|
|
@ -26,7 +26,7 @@ import java.text.NumberFormat;
|
||||||
import org.apache.lucene.analysis.Analyzer;
|
import org.apache.lucene.analysis.Analyzer;
|
||||||
import org.apache.lucene.codecs.Codec;
|
import org.apache.lucene.codecs.Codec;
|
||||||
import org.apache.lucene.index.DocumentsWriterDeleteQueue.DeleteSlice;
|
import org.apache.lucene.index.DocumentsWriterDeleteQueue.DeleteSlice;
|
||||||
import org.apache.lucene.search.similarities.SimilarityProvider;
|
import org.apache.lucene.search.similarities.Similarity;
|
||||||
import org.apache.lucene.store.Directory;
|
import org.apache.lucene.store.Directory;
|
||||||
import org.apache.lucene.store.FlushInfo;
|
import org.apache.lucene.store.FlushInfo;
|
||||||
import org.apache.lucene.store.IOContext;
|
import org.apache.lucene.store.IOContext;
|
||||||
|
@ -88,7 +88,7 @@ public class DocumentsWriterPerThread {
|
||||||
final DocumentsWriterPerThread docWriter;
|
final DocumentsWriterPerThread docWriter;
|
||||||
Analyzer analyzer;
|
Analyzer analyzer;
|
||||||
InfoStream infoStream;
|
InfoStream infoStream;
|
||||||
SimilarityProvider similarityProvider;
|
Similarity similarity;
|
||||||
int docID;
|
int docID;
|
||||||
Iterable<? extends IndexableField> doc;
|
Iterable<? extends IndexableField> doc;
|
||||||
String maxTermPrefix;
|
String maxTermPrefix;
|
||||||
|
@ -188,8 +188,7 @@ public class DocumentsWriterPerThread {
|
||||||
this.infoStream = parent.infoStream;
|
this.infoStream = parent.infoStream;
|
||||||
this.codec = parent.codec;
|
this.codec = parent.codec;
|
||||||
this.docState = new DocState(this, infoStream);
|
this.docState = new DocState(this, infoStream);
|
||||||
this.docState.similarityProvider = parent.indexWriter.getConfig()
|
this.docState.similarity = parent.indexWriter.getConfig().getSimilarity();
|
||||||
.getSimilarityProvider();
|
|
||||||
bytesUsed = Counter.newCounter();
|
bytesUsed = Counter.newCounter();
|
||||||
byteBlockAllocator = new DirectTrackingAllocator(bytesUsed);
|
byteBlockAllocator = new DirectTrackingAllocator(bytesUsed);
|
||||||
consumer = indexingChain.getChain(this);
|
consumer = indexingChain.getChain(this);
|
||||||
|
|
|
@ -26,6 +26,7 @@ import org.apache.lucene.util.AttributeSource;
|
||||||
* @lucene.experimental
|
* @lucene.experimental
|
||||||
*/
|
*/
|
||||||
public final class FieldInvertState {
|
public final class FieldInvertState {
|
||||||
|
String name;
|
||||||
int position;
|
int position;
|
||||||
int length;
|
int length;
|
||||||
int numOverlap;
|
int numOverlap;
|
||||||
|
@ -35,10 +36,12 @@ public final class FieldInvertState {
|
||||||
float boost;
|
float boost;
|
||||||
AttributeSource attributeSource;
|
AttributeSource attributeSource;
|
||||||
|
|
||||||
public FieldInvertState() {
|
public FieldInvertState(String name) {
|
||||||
|
this.name = name;
|
||||||
}
|
}
|
||||||
|
|
||||||
public FieldInvertState(int position, int length, int numOverlap, int offset, float boost) {
|
public FieldInvertState(String name, int position, int length, int numOverlap, int offset, float boost) {
|
||||||
|
this.name = name;
|
||||||
this.position = position;
|
this.position = position;
|
||||||
this.length = length;
|
this.length = length;
|
||||||
this.numOverlap = numOverlap;
|
this.numOverlap = numOverlap;
|
||||||
|
@ -134,4 +137,11 @@ public final class FieldInvertState {
|
||||||
public AttributeSource getAttributeSource() {
|
public AttributeSource getAttributeSource() {
|
||||||
return attributeSource;
|
return attributeSource;
|
||||||
}
|
}
|
||||||
|
|
||||||
|
/**
|
||||||
|
* Return the field's name
|
||||||
|
*/
|
||||||
|
public String getName() {
|
||||||
|
return name;
|
||||||
|
}
|
||||||
}
|
}
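Carrying the field name in FieldInvertState is what lets callers in this commit invoke `sim.computeNorm(invertState, norm)` directly instead of `sim.get(field).computeNorm(...)`. The following is a minimal sketch of how a per-field wrapper could use that name to delegate, assuming it forwards to `get(state.getName())`; the wrapper's actual implementation is not shown in this part of the diff. All types here (`NormDispatchSketch`, `State`, `Sim`, `PerFieldWrapper`) are hypothetical stand-ins, and the norm formulas are invented for illustration only.

```java
/** Stand-in model only: none of these types are Lucene's classes. */
final class NormDispatchSketch {

  /** Hypothetical stand-in for FieldInvertState: a field name plus a length. */
  static final class State {
    private final String name;
    private int length;

    State(String name) {
      this.name = name;
    }

    String getName() {
      return name;
    }

    int getLength() {
      return length;
    }

    void setLength(int length) {
      this.length = length;
    }
  }

  /** Hypothetical stand-in for a similarity that computes a length norm. */
  interface Sim {
    float computeNorm(State state);
  }

  /** Assumed delegation: pick the per-field Sim from the name carried by the state. */
  abstract static class PerFieldWrapper implements Sim {
    abstract Sim get(String field);

    @Override
    public final float computeNorm(State state) {
      return get(state.getName()).computeNorm(state);
    }
  }

  /** Field "bar" gets a flat norm; every other field gets 1/sqrt(length). */
  static Sim example() {
    final Sim flat = state -> 1.0f;
    final Sim inverseSqrt = state -> (float) (1.0 / Math.sqrt(state.getLength()));
    return new PerFieldWrapper() {
      @Override
      Sim get(String field) {
        return field.equals("bar") ? flat : inverseSqrt;
      }
    };
  }

  public static void main(String[] args) {
    Sim sim = example();
    State bar = new State("bar");
    bar.setLength(4);
    State foo = new State("foo");
    foo.setLength(4);
    System.out.println(sim.computeNorm(bar));
    System.out.println(sim.computeNorm(foo));
  }
}
```

Because the caller only sees `Sim`, code such as the norms consumer does not need to know whether it holds a single similarity or a per-field wrapper.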
|
||||||
|
|
|
@ -24,7 +24,7 @@ import org.apache.lucene.codecs.Codec;
|
||||||
import org.apache.lucene.index.DocumentsWriterPerThread.IndexingChain;
|
import org.apache.lucene.index.DocumentsWriterPerThread.IndexingChain;
|
||||||
import org.apache.lucene.index.IndexWriter.IndexReaderWarmer;
|
import org.apache.lucene.index.IndexWriter.IndexReaderWarmer;
|
||||||
import org.apache.lucene.search.IndexSearcher;
|
import org.apache.lucene.search.IndexSearcher;
|
||||||
import org.apache.lucene.search.similarities.SimilarityProvider;
|
import org.apache.lucene.search.similarities.Similarity;
|
||||||
import org.apache.lucene.util.InfoStream;
|
import org.apache.lucene.util.InfoStream;
|
||||||
import org.apache.lucene.util.PrintStreamInfoStream;
|
import org.apache.lucene.util.PrintStreamInfoStream;
|
||||||
import org.apache.lucene.util.Version;
|
import org.apache.lucene.util.Version;
|
||||||
|
@ -116,7 +116,7 @@ public final class IndexWriterConfig implements Cloneable {
|
||||||
private volatile IndexDeletionPolicy delPolicy;
|
private volatile IndexDeletionPolicy delPolicy;
|
||||||
private volatile IndexCommit commit;
|
private volatile IndexCommit commit;
|
||||||
private volatile OpenMode openMode;
|
private volatile OpenMode openMode;
|
||||||
private volatile SimilarityProvider similarityProvider;
|
private volatile Similarity similarity;
|
||||||
private volatile int termIndexInterval; // TODO: this should be private to the codec, not settable here
|
private volatile int termIndexInterval; // TODO: this should be private to the codec, not settable here
|
||||||
private volatile MergeScheduler mergeScheduler;
|
private volatile MergeScheduler mergeScheduler;
|
||||||
private volatile long writeLockTimeout;
|
private volatile long writeLockTimeout;
|
||||||
|
@ -154,7 +154,7 @@ public final class IndexWriterConfig implements Cloneable {
|
||||||
delPolicy = new KeepOnlyLastCommitDeletionPolicy();
|
delPolicy = new KeepOnlyLastCommitDeletionPolicy();
|
||||||
commit = null;
|
commit = null;
|
||||||
openMode = OpenMode.CREATE_OR_APPEND;
|
openMode = OpenMode.CREATE_OR_APPEND;
|
||||||
similarityProvider = IndexSearcher.getDefaultSimilarityProvider();
|
similarity = IndexSearcher.getDefaultSimilarity();
|
||||||
termIndexInterval = DEFAULT_TERM_INDEX_INTERVAL; // TODO: this should be private to the codec, not settable here
|
termIndexInterval = DEFAULT_TERM_INDEX_INTERVAL; // TODO: this should be private to the codec, not settable here
|
||||||
mergeScheduler = new ConcurrentMergeScheduler();
|
mergeScheduler = new ConcurrentMergeScheduler();
|
||||||
writeLockTimeout = WRITE_LOCK_TIMEOUT;
|
writeLockTimeout = WRITE_LOCK_TIMEOUT;
|
||||||
|
@ -258,23 +258,23 @@ public final class IndexWriterConfig implements Cloneable {
|
||||||
}
|
}
|
||||||
|
|
||||||
/**
|
/**
|
||||||
* Expert: set the {@link SimilarityProvider} implementation used by this IndexWriter.
|
* Expert: set the {@link Similarity} implementation used by this IndexWriter.
|
||||||
* <p>
|
* <p>
|
||||||
* <b>NOTE:</b> the similarity provider cannot be null. If <code>null</code> is passed,
|
* <b>NOTE:</b> the similarity cannot be null. If <code>null</code> is passed,
|
||||||
* the similarity provider will be set to the default implementation (unspecified).
|
* the similarity will be set to the default implementation (unspecified).
|
||||||
*
|
*
|
||||||
* <p>Only takes effect when IndexWriter is first created. */
|
* <p>Only takes effect when IndexWriter is first created. */
|
||||||
public IndexWriterConfig setSimilarityProvider(SimilarityProvider similarityProvider) {
|
public IndexWriterConfig setSimilarity(Similarity similarity) {
|
||||||
this.similarityProvider = similarityProvider == null ? IndexSearcher.getDefaultSimilarityProvider() : similarityProvider;
|
this.similarity = similarity == null ? IndexSearcher.getDefaultSimilarity() : similarity;
|
||||||
return this;
|
return this;
|
||||||
}
|
}
|
||||||
|
|
||||||
/**
|
/**
|
||||||
* Expert: returns the {@link SimilarityProvider} implementation used by this
|
* Expert: returns the {@link Similarity} implementation used by this
|
||||||
* IndexWriter.
|
* IndexWriter.
|
||||||
*/
|
*/
|
||||||
public SimilarityProvider getSimilarityProvider() {
|
public Similarity getSimilarity() {
|
||||||
return similarityProvider;
|
return similarity;
|
||||||
}
|
}
|
||||||
|
|
||||||
/**
|
/**
|
||||||
|
@ -718,7 +718,7 @@ public final class IndexWriterConfig implements Cloneable {
|
||||||
sb.append("delPolicy=").append(delPolicy.getClass().getName()).append("\n");
|
sb.append("delPolicy=").append(delPolicy.getClass().getName()).append("\n");
|
||||||
sb.append("commit=").append(commit == null ? "null" : commit).append("\n");
|
sb.append("commit=").append(commit == null ? "null" : commit).append("\n");
|
||||||
sb.append("openMode=").append(openMode).append("\n");
|
sb.append("openMode=").append(openMode).append("\n");
|
||||||
sb.append("similarityProvider=").append(similarityProvider.getClass().getName()).append("\n");
|
sb.append("similarity=").append(similarity.getClass().getName()).append("\n");
|
||||||
sb.append("termIndexInterval=").append(termIndexInterval).append("\n"); // TODO: this should be private to the codec, not settable here
|
sb.append("termIndexInterval=").append(termIndexInterval).append("\n"); // TODO: this should be private to the codec, not settable here
|
||||||
sb.append("mergeScheduler=").append(mergeScheduler.getClass().getName()).append("\n");
|
sb.append("mergeScheduler=").append(mergeScheduler.getClass().getName()).append("\n");
|
||||||
sb.append("default WRITE_LOCK_TIMEOUT=").append(WRITE_LOCK_TIMEOUT).append("\n");
|
sb.append("default WRITE_LOCK_TIMEOUT=").append(WRITE_LOCK_TIMEOUT).append("\n");
|
||||||
|
|
|
@ -39,7 +39,7 @@ public class NormsConsumerPerField extends InvertedDocEndConsumerPerField implem
|
||||||
this.parent = parent;
|
this.parent = parent;
|
||||||
docState = docInverterPerField.docState;
|
docState = docInverterPerField.docState;
|
||||||
fieldState = docInverterPerField.fieldState;
|
fieldState = docInverterPerField.fieldState;
|
||||||
similarity = docState.similarityProvider.get(fieldInfo.name);
|
similarity = docState.similarity;
|
||||||
norm = new Norm();
|
norm = new Norm();
|
||||||
}
|
}
|
||||||
|
|
||||||
|
|
|
@ -28,8 +28,8 @@ import org.apache.lucene.index.TermsEnum;
|
||||||
import org.apache.lucene.search.BooleanClause.Occur;
|
import org.apache.lucene.search.BooleanClause.Occur;
|
||||||
import org.apache.lucene.search.ConjunctionTermScorer.DocsAndFreqs;
|
import org.apache.lucene.search.ConjunctionTermScorer.DocsAndFreqs;
|
||||||
import org.apache.lucene.search.TermQuery.TermWeight;
|
import org.apache.lucene.search.TermQuery.TermWeight;
|
||||||
|
import org.apache.lucene.search.similarities.Similarity;
|
||||||
import org.apache.lucene.search.similarities.Similarity.ExactSimScorer;
|
import org.apache.lucene.search.similarities.Similarity.ExactSimScorer;
|
||||||
import org.apache.lucene.search.similarities.SimilarityProvider;
|
|
||||||
import org.apache.lucene.util.Bits;
|
import org.apache.lucene.util.Bits;
|
||||||
import org.apache.lucene.util.ToStringUtils;
|
import org.apache.lucene.util.ToStringUtils;
|
||||||
|
|
||||||
|
@ -79,18 +79,18 @@ public class BooleanQuery extends Query implements Iterable<BooleanClause> {
|
||||||
|
|
||||||
/** Constructs an empty boolean query.
|
/** Constructs an empty boolean query.
|
||||||
*
|
*
|
||||||
* {@link SimilarityProvider#coord(int,int)} may be disabled in scoring, as
|
* {@link Similarity#coord(int,int)} may be disabled in scoring, as
|
||||||
* appropriate. For example, this score factor does not make sense for most
|
* appropriate. For example, this score factor does not make sense for most
|
||||||
* automatically generated queries, like {@link WildcardQuery} and {@link
|
* automatically generated queries, like {@link WildcardQuery} and {@link
|
||||||
* FuzzyQuery}.
|
* FuzzyQuery}.
|
||||||
*
|
*
|
||||||
* @param disableCoord disables {@link SimilarityProvider#coord(int,int)} in scoring.
|
* @param disableCoord disables {@link Similarity#coord(int,int)} in scoring.
|
||||||
*/
|
*/
|
||||||
public BooleanQuery(boolean disableCoord) {
|
public BooleanQuery(boolean disableCoord) {
|
||||||
this.disableCoord = disableCoord;
|
this.disableCoord = disableCoord;
|
||||||
}
|
}
|
||||||
|
|
||||||
/** Returns true iff {@link SimilarityProvider#coord(int,int)} is disabled in
|
/** Returns true iff {@link Similarity#coord(int,int)} is disabled in
|
||||||
* scoring for this query instance.
|
* scoring for this query instance.
|
||||||
* @see #BooleanQuery(boolean)
|
* @see #BooleanQuery(boolean)
|
||||||
*/
|
*/
|
||||||
|
@ -169,7 +169,7 @@ public class BooleanQuery extends Query implements Iterable<BooleanClause> {
|
||||||
*/
|
*/
|
||||||
protected class BooleanWeight extends Weight {
|
protected class BooleanWeight extends Weight {
|
||||||
/** The Similarity implementation. */
|
/** The Similarity implementation. */
|
||||||
protected SimilarityProvider similarityProvider;
|
protected Similarity similarity;
|
||||||
protected ArrayList<Weight> weights;
|
protected ArrayList<Weight> weights;
|
||||||
protected int maxCoord; // num optional + num required
|
protected int maxCoord; // num optional + num required
|
||||||
private final boolean disableCoord;
|
private final boolean disableCoord;
|
||||||
|
@ -177,7 +177,7 @@ public class BooleanQuery extends Query implements Iterable<BooleanClause> {
|
||||||
|
|
||||||
public BooleanWeight(IndexSearcher searcher, boolean disableCoord)
|
public BooleanWeight(IndexSearcher searcher, boolean disableCoord)
|
||||||
throws IOException {
|
throws IOException {
|
||||||
this.similarityProvider = searcher.getSimilarityProvider();
|
this.similarity = searcher.getSimilarity();
|
||||||
this.disableCoord = disableCoord;
|
this.disableCoord = disableCoord;
|
||||||
weights = new ArrayList<Weight>(clauses.size());
|
weights = new ArrayList<Weight>(clauses.size());
|
||||||
boolean termConjunction = clauses.isEmpty() || minNrShouldMatch != 0 ? false : true;
|
boolean termConjunction = clauses.isEmpty() || minNrShouldMatch != 0 ? false : true;
|
||||||
|
@ -213,7 +213,7 @@ public class BooleanQuery extends Query implements Iterable<BooleanClause> {
|
||||||
}
|
}
|
||||||
|
|
||||||
public float coord(int overlap, int maxOverlap) {
|
public float coord(int overlap, int maxOverlap) {
|
||||||
return similarityProvider.coord(overlap, maxOverlap);
|
return similarity.coord(overlap, maxOverlap);
|
||||||
}
|
}
|
||||||
|
|
||||||
@Override
|
@Override
|
||||||
|
|
|
@ -40,8 +40,8 @@ import org.apache.lucene.index.IndexReaderContext;
|
||||||
import org.apache.lucene.index.StoredFieldVisitor;
|
import org.apache.lucene.index.StoredFieldVisitor;
|
||||||
import org.apache.lucene.index.Term;
|
import org.apache.lucene.index.Term;
|
||||||
import org.apache.lucene.index.Terms;
|
import org.apache.lucene.index.Terms;
|
||||||
import org.apache.lucene.search.similarities.DefaultSimilarityProvider;
|
import org.apache.lucene.search.similarities.DefaultSimilarity;
|
||||||
import org.apache.lucene.search.similarities.SimilarityProvider;
|
import org.apache.lucene.search.similarities.Similarity;
|
||||||
import org.apache.lucene.store.NIOFSDirectory; // javadoc
|
import org.apache.lucene.store.NIOFSDirectory; // javadoc
|
||||||
import org.apache.lucene.util.ReaderUtil;
|
import org.apache.lucene.util.ReaderUtil;
|
||||||
import org.apache.lucene.util.TermContext;
|
import org.apache.lucene.util.TermContext;
|
||||||
|
@ -86,22 +86,22 @@ public class IndexSearcher {
|
||||||
// These are only used for multi-threaded search
|
// These are only used for multi-threaded search
|
||||||
private final ExecutorService executor;
|
private final ExecutorService executor;
|
||||||
|
|
||||||
// the default SimilarityProvider
|
// the default Similarity
|
||||||
private static final SimilarityProvider defaultProvider = new DefaultSimilarityProvider();
|
private static final Similarity defaultSimilarity = new DefaultSimilarity();
|
||||||
|
|
||||||
/**
|
/**
|
||||||
* Expert: returns a default SimilarityProvider instance.
|
* Expert: returns a default Similarity instance.
|
||||||
* In general, this method is only called to initialize searchers and writers.
|
* In general, this method is only called to initialize searchers and writers.
|
||||||
* User code and query implementations should respect
|
* User code and query implementations should respect
|
||||||
* {@link IndexSearcher#getSimilarityProvider()}.
|
* {@link IndexSearcher#getSimilarity()}.
|
||||||
* @lucene.internal
|
* @lucene.internal
|
||||||
*/
|
*/
|
||||||
public static SimilarityProvider getDefaultSimilarityProvider() {
|
public static Similarity getDefaultSimilarity() {
|
||||||
return defaultProvider;
|
return defaultSimilarity;
|
||||||
}
|
}
|
||||||
|
|
||||||
/** The SimilarityProvider implementation used by this searcher. */
|
/** The Similarity implementation used by this searcher. */
|
||||||
private SimilarityProvider similarityProvider = defaultProvider;
|
private Similarity similarity = defaultSimilarity;
|
||||||
|
|
||||||
/** Creates a searcher searching the provided index. */
|
/** Creates a searcher searching the provided index. */
|
||||||
public IndexSearcher(IndexReader r) {
|
public IndexSearcher(IndexReader r) {
|
||||||
|
@ -193,15 +193,15 @@ public class IndexSearcher {
|
||||||
return reader.document(docID, fieldsToLoad);
|
return reader.document(docID, fieldsToLoad);
|
||||||
}
|
}
|
||||||
|
|
||||||
/** Expert: Set the SimilarityProvider implementation used by this Searcher.
|
/** Expert: Set the Similarity implementation used by this Searcher.
|
||||||
*
|
*
|
||||||
*/
|
*/
|
||||||
public void setSimilarityProvider(SimilarityProvider similarityProvider) {
|
public void setSimilarity(Similarity similarity) {
|
||||||
this.similarityProvider = similarityProvider;
|
this.similarity = similarity;
|
||||||
}
|
}
|
||||||
|
|
||||||
public SimilarityProvider getSimilarityProvider() {
|
public Similarity getSimilarity() {
|
||||||
return similarityProvider;
|
return similarity;
|
||||||
}
|
}
|
||||||
|
|
||||||
/** @lucene.internal */
|
/** @lucene.internal */
|
||||||
|
@ -583,7 +583,7 @@ public class IndexSearcher {
|
||||||
query = rewrite(query);
|
query = rewrite(query);
|
||||||
Weight weight = query.createWeight(this);
|
Weight weight = query.createWeight(this);
|
||||||
float v = weight.getValueForNormalization();
|
float v = weight.getValueForNormalization();
|
||||||
float norm = getSimilarityProvider().queryNorm(v);
|
float norm = getSimilarity().queryNorm(v);
|
||||||
if (Float.isInfinite(norm) || Float.isNaN(norm))
|
if (Float.isInfinite(norm) || Float.isNaN(norm))
|
||||||
norm = 1.0f;
|
norm = 1.0f;
|
||||||
weight.normalize(norm, 1.0f);
|
weight.normalize(norm, 1.0f);
|
||||||
|
|
|
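The normalization hunk above can be traced with a small stand-alone sketch. It is only an illustration, not Lucene code: `QueryNormSketch`, `classicQueryNorm` and `guardedNorm` are hypothetical names, and the formula is assumed to be the `1/sqrt(sumOfSquaredWeights)` computation that this commit moves into DefaultSimilarity. It shows why the searcher falls back to `1.0f` when the computed norm is infinite or NaN.

```java
// Hypothetical stand-alone model of the normalization step; not Lucene code.
class QueryNormSketch {

  // Assumed to mirror the DefaultSimilarity formula: 1/sqrt(sumOfSquaredWeights).
  static float classicQueryNorm(float sumOfSquaredWeights) {
    return (float) (1.0 / Math.sqrt(sumOfSquaredWeights));
  }

  // Mirrors the guard in the hunk above: an unusable norm is replaced by 1.0f.
  static float guardedNorm(float valueForNormalization) {
    float norm = classicQueryNorm(valueForNormalization);
    if (Float.isInfinite(norm) || Float.isNaN(norm)) {
      norm = 1.0f;
    }
    return norm;
  }

  public static void main(String[] args) {
    System.out.println(guardedNorm(4.0f)); // prints 0.5
    System.out.println(guardedNorm(0.0f)); // prints 1.0 (1/sqrt(0) is infinite)
  }
}
```

A value of zero for normalization is the typical trigger for the fallback, since dividing by `sqrt(0)` yields infinity.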
@ -142,7 +142,7 @@ public class MultiPhraseQuery extends Query {
|
||||||
|
|
||||||
public MultiPhraseWeight(IndexSearcher searcher)
|
public MultiPhraseWeight(IndexSearcher searcher)
|
||||||
throws IOException {
|
throws IOException {
|
||||||
this.similarity = searcher.getSimilarityProvider().get(field);
|
this.similarity = searcher.getSimilarity();
|
||||||
final IndexReaderContext context = searcher.getTopReaderContext();
|
final IndexReaderContext context = searcher.getTopReaderContext();
|
||||||
|
|
||||||
// compute idf
|
// compute idf
|
||||||
|
|
|
@ -188,7 +188,7 @@ public class PhraseQuery extends Query {
|
||||||
|
|
||||||
public PhraseWeight(IndexSearcher searcher)
|
public PhraseWeight(IndexSearcher searcher)
|
||||||
throws IOException {
|
throws IOException {
|
||||||
this.similarity = searcher.getSimilarityProvider().get(field);
|
this.similarity = searcher.getSimilarity();
|
||||||
final IndexReaderContext context = searcher.getTopReaderContext();
|
final IndexReaderContext context = searcher.getTopReaderContext();
|
||||||
states = new TermContext[terms.size()];
|
states = new TermContext[terms.size()];
|
||||||
TermStatistics termStats[] = new TermStatistics[terms.size()];
|
TermStatistics termStats[] = new TermStatistics[terms.size()];
|
||||||
|
|
|
@ -23,7 +23,7 @@ import java.util.concurrent.ExecutorService; // javadocs
|
||||||
import org.apache.lucene.index.IndexReader;
|
import org.apache.lucene.index.IndexReader;
|
||||||
import org.apache.lucene.index.IndexWriter; // javadocs
|
import org.apache.lucene.index.IndexWriter; // javadocs
|
||||||
import org.apache.lucene.index.IndexWriterConfig; // javadocs
|
import org.apache.lucene.index.IndexWriterConfig; // javadocs
|
||||||
import org.apache.lucene.search.similarities.SimilarityProvider; // javadocs
|
import org.apache.lucene.search.similarities.Similarity; // javadocs
|
||||||
|
|
||||||
/**
|
/**
|
||||||
* Factory class used by {@link SearcherManager} and {@link NRTManager} to
|
* Factory class used by {@link SearcherManager} and {@link NRTManager} to
|
||||||
|
@ -38,7 +38,7 @@ import org.apache.lucene.search.similarities.SimilarityProvider; // javadocs
|
||||||
*
|
*
|
||||||
* You can pass your own factory instead if you want custom behavior, such as:
|
* You can pass your own factory instead if you want custom behavior, such as:
|
||||||
* <ul>
|
* <ul>
|
||||||
* <li>Setting a custom scoring model: {@link IndexSearcher#setSimilarityProvider(SimilarityProvider)}
|
* <li>Setting a custom scoring model: {@link IndexSearcher#setSimilarity(Similarity)}
|
||||||
* <li>Parallel per-segment search: {@link IndexSearcher#IndexSearcher(IndexReader, ExecutorService)}
|
* <li>Parallel per-segment search: {@link IndexSearcher#IndexSearcher(IndexReader, ExecutorService)}
|
||||||
* <li>Return custom subclasses of IndexSearcher (for example that implement distributed scoring)
|
* <li>Return custom subclasses of IndexSearcher (for example that implement distributed scoring)
|
||||||
* <li>Run queries to warm your IndexSearcher before it is used. Note: when using near-realtime search
|
* <li>Run queries to warm your IndexSearcher before it is used. Note: when using near-realtime search
|
||||||
|
|
|
@ -52,7 +52,7 @@ public class TermQuery extends Query {
|
||||||
throws IOException {
|
throws IOException {
|
||||||
assert termStates != null : "TermContext must not be null";
|
assert termStates != null : "TermContext must not be null";
|
||||||
this.termStates = termStates;
|
this.termStates = termStates;
|
||||||
this.similarity = searcher.getSimilarityProvider().get(term.field());
|
this.similarity = searcher.getSimilarity();
|
||||||
this.stats = similarity.computeWeight(
|
this.stats = similarity.computeWeight(
|
||||||
getBoost(),
|
getBoost(),
|
||||||
searcher.collectionStatistics(term.field()),
|
searcher.collectionStatistics(term.field()),
|
||||||
|
|
|
@ -22,7 +22,7 @@ import java.io.IOException;
|
||||||
import org.apache.lucene.index.AtomicReader; // javadocs
|
import org.apache.lucene.index.AtomicReader; // javadocs
|
||||||
import org.apache.lucene.index.AtomicReaderContext;
|
import org.apache.lucene.index.AtomicReaderContext;
|
||||||
import org.apache.lucene.index.IndexReaderContext; // javadocs
|
import org.apache.lucene.index.IndexReaderContext; // javadocs
|
||||||
import org.apache.lucene.search.similarities.SimilarityProvider;
|
import org.apache.lucene.search.similarities.Similarity;
|
||||||
import org.apache.lucene.util.Bits;
|
import org.apache.lucene.util.Bits;
|
||||||
|
|
||||||
/**
|
/**
|
||||||
|
@ -46,7 +46,7 @@ import org.apache.lucene.util.Bits;
|
||||||
* <code>IndexSearcher</code> ({@link Query#createWeight(IndexSearcher)}).
|
* <code>IndexSearcher</code> ({@link Query#createWeight(IndexSearcher)}).
|
||||||
* <li>The {@link #getValueForNormalization()} method is called on the
|
* <li>The {@link #getValueForNormalization()} method is called on the
|
||||||
* <code>Weight</code> to compute the query normalization factor
|
* <code>Weight</code> to compute the query normalization factor
|
||||||
* {@link SimilarityProvider#queryNorm(float)} of the query clauses contained in the
|
* {@link Similarity#queryNorm(float)} of the query clauses contained in the
|
||||||
* query.
|
* query.
|
||||||
* <li>The query normalization factor is passed to {@link #normalize(float, float)}. At
|
* <li>The query normalization factor is passed to {@link #normalize(float, float)}. At
|
||||||
* this point the weighting is complete.
|
* this point the weighting is complete.
|
||||||
|
|
|
@ -1,54 +0,0 @@
|
||||||
package org.apache.lucene.search.similarities;
|
|
||||||
|
|
||||||
/**
|
|
||||||
* Licensed to the Apache Software Foundation (ASF) under one or more
|
|
||||||
* contributor license agreements. See the NOTICE file distributed with
|
|
||||||
* this work for additional information regarding copyright ownership.
|
|
||||||
* The ASF licenses this file to You under the Apache License, Version 2.0
|
|
||||||
* (the "License"); you may not use this file except in compliance with
|
|
||||||
* the License. You may obtain a copy of the License at
|
|
||||||
*
|
|
||||||
* http://www.apache.org/licenses/LICENSE-2.0
|
|
||||||
*
|
|
||||||
* Unless required by applicable law or agreed to in writing, software
|
|
||||||
* distributed under the License is distributed on an "AS IS" BASIS,
|
|
||||||
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
|
||||||
* See the License for the specific language governing permissions and
|
|
||||||
* limitations under the License.
|
|
||||||
*/
|
|
||||||
|
|
||||||
/**
|
|
||||||
* A simple {@link Similarity} provider that returns in
|
|
||||||
* {@code get(String field)} the object passed to its constructor. This class
|
|
||||||
* is aimed at non-VSM models, and therefore both the {@link #coord} and
|
|
||||||
* {@link #queryNorm} methods return {@code 1}. Use
|
|
||||||
* {@link DefaultSimilarityProvider} for {@link DefaultSimilarity}.
|
|
||||||
* @lucene.experimental
|
|
||||||
*/
|
|
||||||
public class BasicSimilarityProvider implements SimilarityProvider {
|
|
||||||
private final Similarity sim;
|
|
||||||
|
|
||||||
public BasicSimilarityProvider(Similarity sim) {
|
|
||||||
this.sim = sim;
|
|
||||||
}
|
|
||||||
|
|
||||||
@Override
|
|
||||||
public float coord(int overlap, int maxOverlap) {
|
|
||||||
return 1f;
|
|
||||||
}
|
|
||||||
|
|
||||||
@Override
|
|
||||||
public float queryNorm(float sumOfSquaredWeights) {
|
|
||||||
return 1f;
|
|
||||||
}
|
|
||||||
|
|
||||||
@Override
|
|
||||||
public Similarity get(String field) {
|
|
||||||
return sim;
|
|
||||||
}
|
|
||||||
|
|
||||||
@Override
|
|
||||||
public String toString() {
|
|
||||||
return "BasicSimilarityProvider(" + sim + ")";
|
|
||||||
}
|
|
||||||
}
|
|
|
@ -24,6 +24,16 @@ import org.apache.lucene.util.BytesRef;
|
||||||
/** Expert: Default scoring implementation. */
|
/** Expert: Default scoring implementation. */
|
||||||
public class DefaultSimilarity extends TFIDFSimilarity {
|
public class DefaultSimilarity extends TFIDFSimilarity {
|
||||||
|
|
||||||
|
/** Implemented as <code>overlap / maxOverlap</code>. */
|
||||||
|
public float coord(int overlap, int maxOverlap) {
|
||||||
|
return overlap / (float)maxOverlap;
|
||||||
|
}
|
||||||
|
|
||||||
|
/** Implemented as <code>1/sqrt(sumOfSquaredWeights)</code>. */
|
||||||
|
public float queryNorm(float sumOfSquaredWeights) {
|
||||||
|
return (float)(1.0 / Math.sqrt(sumOfSquaredWeights));
|
||||||
|
}
|
||||||
|
|
||||||
/** Implemented as
|
/** Implemented as
|
||||||
* <code>state.getBoost()*lengthNorm(numTerms)</code>, where
|
* <code>state.getBoost()*lengthNorm(numTerms)</code>, where
|
||||||
* <code>numTerms</code> is {@link FieldInvertState#getLength()} if {@link
|
* <code>numTerms</code> is {@link FieldInvertState#getLength()} if {@link
|
||||||
|
|
|
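The effect of moving `coord()` into the Similarity hierarchy can be sketched as follows. This is a simplified model under stated assumptions: `BaseSim` and `ClassicSim` are hypothetical stand-ins for Similarity and DefaultSimilarity, reduced to the one method discussed here. The base class returns 1 (coord disabled), while the classic model overrides it with `overlap / maxOverlap`.

```java
// Hypothetical stand-ins for Similarity and DefaultSimilarity; not Lucene code.
class CoordSketch {

  static class BaseSim {
    // Base-class default in this commit: coord is disabled and always returns 1.
    float coord(int overlap, int maxOverlap) {
      return 1f;
    }
  }

  static class ClassicSim extends BaseSim {
    // Classic override: the fraction of query terms that the document matched.
    @Override
    float coord(int overlap, int maxOverlap) {
      return overlap / (float) maxOverlap;
    }
  }

  public static void main(String[] args) {
    System.out.println(new BaseSim().coord(3, 4));    // prints 1.0
    System.out.println(new ClassicSim().coord(3, 4)); // prints 0.75
  }
}
```

With this layout a model that does not want coordinate-level matching needs no extra code, which appears to be the motivation for dropping the separate provider class.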
@ -1,42 +0,0 @@
|
||||||
package org.apache.lucene.search.similarities;
|
|
||||||
|
|
||||||
|
|
||||||
/**
|
|
||||||
* Licensed to the Apache Software Foundation (ASF) under one or more
|
|
||||||
* contributor license agreements. See the NOTICE file distributed with
|
|
||||||
* this work for additional information regarding copyright ownership.
|
|
||||||
* The ASF licenses this file to You under the Apache License, Version 2.0
|
|
||||||
* (the "License"); you may not use this file except in compliance with
|
|
||||||
* the License. You may obtain a copy of the License at
|
|
||||||
*
|
|
||||||
* http://www.apache.org/licenses/LICENSE-2.0
|
|
||||||
*
|
|
||||||
* Unless required by applicable law or agreed to in writing, software
|
|
||||||
* distributed under the License is distributed on an "AS IS" BASIS,
|
|
||||||
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
|
||||||
* See the License for the specific language governing permissions and
|
|
||||||
* limitations under the License.
|
|
||||||
*/
|
|
||||||
|
|
||||||
/**
|
|
||||||
* Expert: Default scoring provider.
|
|
||||||
* <p>
|
|
||||||
* Returns {@link DefaultSimilarity} for every field
|
|
||||||
*/
|
|
||||||
public class DefaultSimilarityProvider implements SimilarityProvider {
|
|
||||||
private static final Similarity impl = new DefaultSimilarity();
|
|
||||||
|
|
||||||
/** Implemented as <code>overlap / maxOverlap</code>. */
|
|
||||||
public float coord(int overlap, int maxOverlap) {
|
|
||||||
return overlap / (float)maxOverlap;
|
|
||||||
}
|
|
||||||
|
|
||||||
/** Implemented as <code>1/sqrt(sumOfSquaredWeights)</code>. */
|
|
||||||
public float queryNorm(float sumOfSquaredWeights) {
|
|
||||||
return (float)(1.0 / Math.sqrt(sumOfSquaredWeights));
|
|
||||||
}
|
|
||||||
|
|
||||||
public Similarity get(String field) {
|
|
||||||
return impl;
|
|
||||||
}
|
|
||||||
}
|
|
|
@ -0,0 +1,82 @@
|
||||||
|
package org.apache.lucene.search.similarities;
|
||||||
|
|
||||||
|
/**
|
||||||
|
* Licensed to the Apache Software Foundation (ASF) under one or more
|
||||||
|
* contributor license agreements. See the NOTICE file distributed with
|
||||||
|
* this work for additional information regarding copyright ownership.
|
||||||
|
* The ASF licenses this file to You under the Apache License, Version 2.0
|
||||||
|
* (the "License"); you may not use this file except in compliance with
|
||||||
|
* the License. You may obtain a copy of the License at
|
||||||
|
*
|
||||||
|
* http://www.apache.org/licenses/LICENSE-2.0
|
||||||
|
*
|
||||||
|
* Unless required by applicable law or agreed to in writing, software
|
||||||
|
* distributed under the License is distributed on an "AS IS" BASIS,
|
||||||
|
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||||
|
* See the License for the specific language governing permissions and
|
||||||
|
* limitations under the License.
|
||||||
|
*/
|
||||||
|
|
||||||
|
import java.io.IOException;
|
||||||
|
|
||||||
|
import org.apache.lucene.index.AtomicReaderContext;
|
||||||
|
import org.apache.lucene.index.FieldInvertState;
|
||||||
|
import org.apache.lucene.index.Norm;
|
||||||
|
import org.apache.lucene.search.CollectionStatistics;
|
||||||
|
import org.apache.lucene.search.TermStatistics;
|
||||||
|
|
||||||
|
/**
|
||||||
|
* Provides the ability to use a different {@link Similarity} for different fields.
|
||||||
|
* <p>
|
||||||
|
* Subclasses should implement {@link #get(String)} to return an appropriate
|
||||||
|
* Similarity (for example, using field-specific parameter values) for the field.
|
||||||
|
*
|
||||||
|
* @lucene.experimental
|
||||||
|
*/
|
||||||
|
public abstract class PerFieldSimilarityWrapper extends Similarity {
|
||||||
|
|
||||||
|
@Override
|
||||||
|
public final void computeNorm(FieldInvertState state, Norm norm) {
|
||||||
|
get(state.getName()).computeNorm(state, norm);
|
||||||
|
}
|
||||||
|
|
||||||
|
@Override
|
||||||
|
public final SimWeight computeWeight(float queryBoost, CollectionStatistics collectionStats, TermStatistics... termStats) {
|
||||||
|
PerFieldSimWeight weight = new PerFieldSimWeight();
|
||||||
|
weight.delegate = get(collectionStats.field());
|
||||||
|
weight.delegateWeight = weight.delegate.computeWeight(queryBoost, collectionStats, termStats);
|
||||||
|
return weight;
|
||||||
|
}
|
||||||
|
|
||||||
|
@Override
|
||||||
|
public final ExactSimScorer exactSimScorer(SimWeight weight, AtomicReaderContext context) throws IOException {
|
||||||
|
PerFieldSimWeight perFieldWeight = (PerFieldSimWeight) weight;
|
||||||
|
return perFieldWeight.delegate.exactSimScorer(perFieldWeight.delegateWeight, context);
|
||||||
|
}
|
||||||
|
|
||||||
|
@Override
|
||||||
|
public final SloppySimScorer sloppySimScorer(SimWeight weight, AtomicReaderContext context) throws IOException {
|
||||||
|
PerFieldSimWeight perFieldWeight = (PerFieldSimWeight) weight;
|
||||||
|
return perFieldWeight.delegate.sloppySimScorer(perFieldWeight.delegateWeight, context);
|
||||||
|
}
|
||||||
|
|
||||||
|
/**
|
||||||
|
* Returns a {@link Similarity} for scoring a field.
|
||||||
|
*/
|
||||||
|
public abstract Similarity get(String name);
|
||||||
|
|
||||||
|
static class PerFieldSimWeight extends SimWeight {
|
||||||
|
Similarity delegate;
|
||||||
|
SimWeight delegateWeight;
|
||||||
|
|
||||||
|
@Override
|
||||||
|
public float getValueForNormalization() {
|
||||||
|
return delegateWeight.getValueForNormalization();
|
||||||
|
}
|
||||||
|
|
||||||
|
@Override
|
||||||
|
public void normalize(float queryNorm, float topLevelBoost) {
|
||||||
|
delegateWeight.normalize(queryNorm, topLevelBoost);
|
||||||
|
}
|
||||||
|
}
|
||||||
|
}
|
|
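The delegation pattern used by the new wrapper can be illustrated with a reduced, self-contained model. Everything in it is an assumption made for illustration: `BoostSim`, `PerFieldWrapper` and `example()` are hypothetical names, and the scoring formula is a placeholder rather than anything Lucene computes. The point is only that a subclass decides which model a field gets in `get(String)`, and every scoring call is routed through that lookup.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical, simplified model of per-field delegation; not Lucene code.
class PerFieldSketch {

  // Stand-in for a field-level scoring model with a constant boost parameter.
  static class BoostSim {
    final float boost;

    BoostSim(float boost) {
      this.boost = boost;
    }

    // Placeholder formula, chosen only to make the delegation visible.
    float score(float freq) {
      return boost * freq;
    }
  }

  // Stand-in for the wrapper: subclasses only choose the model for a field.
  abstract static class PerFieldWrapper {
    abstract BoostSim get(String field);

    // Final, like the wrapper's methods in the diff: always delegates via get(field).
    final float score(String field, float freq) {
      return get(field).score(freq);
    }
  }

  // Builds a wrapper that boosts "title" and uses a fallback for other fields.
  static PerFieldWrapper example() {
    final Map<String, BoostSim> perField = new HashMap<>();
    perField.put("title", new BoostSim(2f));
    final BoostSim fallback = new BoostSim(1f);
    return new PerFieldWrapper() {
      @Override
      BoostSim get(String field) {
        return perField.getOrDefault(field, fallback);
      }
    };
  }

  public static void main(String[] args) {
    PerFieldWrapper wrapper = example();
    System.out.println(wrapper.score("title", 3f)); // prints 6.0
    System.out.println(wrapper.score("body", 3f));  // prints 3.0
  }
}
```

Callers hold a single object and never ask for a field-specific instance themselves, which matches the call-site change from `getSimilarityProvider().get(field)` to `getSimilarity()` in the query classes above.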
@ -74,8 +74,8 @@ import org.apache.lucene.util.SmallFloat; // javadoc
|
||||||
* Finally, using index-time boosts (either via folding into the normalization byte or
|
* Finally, using index-time boosts (either via folding into the normalization byte or
|
||||||
* via DocValues), is an inefficient way to boost the scores of different fields if the
|
* via DocValues), is an inefficient way to boost the scores of different fields if the
|
||||||
* boost will be the same for every document, instead the Similarity can simply take a constant
|
* boost will be the same for every document, instead the Similarity can simply take a constant
|
||||||
* boost parameter <i>C</i>, and the SimilarityProvider can return different instances with
|
* boost parameter <i>C</i>, and {@link PerFieldSimilarityWrapper} can return different
|
||||||
* different boosts depending upon field name.
|
* instances with different boosts depending upon field name.
|
||||||
* <p>
|
* <p>
|
||||||
* <a name="querytime"/>
|
* <a name="querytime"/>
|
||||||
* At query-time, Queries interact with the Similarity via these steps:
|
* At query-time, Queries interact with the Similarity via these steps:
|
||||||
|
@ -87,7 +87,7 @@ import org.apache.lucene.util.SmallFloat; // javadoc
|
||||||
* of statistics without causing any additional I/O. Lucene makes no assumption about what is
|
* of statistics without causing any additional I/O. Lucene makes no assumption about what is
|
||||||
* stored in the returned {@link Similarity.SimWeight} object.
|
* stored in the returned {@link Similarity.SimWeight} object.
|
||||||
* <li>The query normalization process occurs a single time: {@link Similarity.SimWeight#getValueForNormalization()}
|
* <li>The query normalization process occurs a single time: {@link Similarity.SimWeight#getValueForNormalization()}
|
||||||
* is called for each query leaf node, {@link SimilarityProvider#queryNorm(float)} is called for the top-level
|
* is called for each query leaf node, {@link Similarity#queryNorm(float)} is called for the top-level
|
||||||
* query, and finally {@link Similarity.SimWeight#normalize(float, float)} passes down the normalization value
|
* query, and finally {@link Similarity.SimWeight#normalize(float, float)} passes down the normalization value
|
||||||
* and any top-level boosts (e.g. from enclosing {@link BooleanQuery}s).
|
* and any top-level boosts (e.g. from enclosing {@link BooleanQuery}s).
|
||||||
* <li>For each segment in the index, the Query creates a {@link #exactSimScorer(SimWeight, AtomicReaderContext)}
|
* <li>For each segment in the index, the Query creates a {@link #exactSimScorer(SimWeight, AtomicReaderContext)}
|
||||||
|
@ -101,12 +101,43 @@ import org.apache.lucene.util.SmallFloat; // javadoc
|
||||||
* explanation of how it computed its score. The query passes in the document id and an explanation of how the frequency
|
* explanation of how it computed its score. The query passes in the document id and an explanation of how the frequency
|
||||||
* was computed.
|
* was computed.
|
||||||
*
|
*
|
||||||
* @see org.apache.lucene.index.IndexWriterConfig#setSimilarityProvider(SimilarityProvider)
|
* @see org.apache.lucene.index.IndexWriterConfig#setSimilarity(Similarity)
|
||||||
* @see IndexSearcher#setSimilarityProvider(SimilarityProvider)
|
* @see IndexSearcher#setSimilarity(Similarity)
|
||||||
* @lucene.experimental
|
* @lucene.experimental
|
||||||
*/
|
*/
|
||||||
public abstract class Similarity {
|
public abstract class Similarity {
|
||||||
|
|
||||||
|
/** Hook to integrate coordinate-level matching.
|
||||||
|
* <p>
|
||||||
|
* By default this is disabled (returns <code>1</code>), as with
|
||||||
|
* most modern models this will only skew performance, but some
|
||||||
|
* implementations such as {@link TFIDFSimilarity} override this.
|
||||||
|
*
|
||||||
|
* @param overlap the number of query terms matched in the document
|
||||||
|
* @param maxOverlap the total number of terms in the query
|
||||||
|
* @return a score factor based on term overlap with the query
|
||||||
|
*/
|
||||||
|
public float coord(int overlap, int maxOverlap) {
|
||||||
|
return 1f;
|
||||||
|
}
|
||||||
|
|
||||||
|
/** Computes the normalization value for a query given the sum of the
|
||||||
|
* normalized weights {@link SimWeight#getValueForNormalization()} of
|
||||||
|
* each of the query terms. This value is passed back to the
|
||||||
|
* weight ({@link SimWeight#normalize(float, float)} of each query
|
||||||
|
* term, to provide a hook to attempt to make scores from different
|
||||||
|
* queries comparable.
|
||||||
|
* <p>
|
||||||
|
* By default this is disabled (returns <code>1</code>), but some
|
||||||
|
* implementations such as {@link TFIDFSimilarity} override this.
|
||||||
|
*
|
||||||
|
* @param valueForNormalization the sum of the term normalization values
|
||||||
|
* @return a normalization factor for query weights
|
||||||
|
*/
|
||||||
|
public float queryNorm(float valueForNormalization) {
|
||||||
|
return 1f;
|
||||||
|
}
|
||||||
|
|
||||||
/**
|
/**
|
||||||
* Computes the normalization value for a field, given the accumulated
|
* Computes the normalization value for a field, given the accumulated
|
||||||
* state of term processing for this field (see {@link FieldInvertState}).
|
* state of term processing for this field (see {@link FieldInvertState}).
|
||||||
|
|
|
@ -1,68 +0,0 @@
|
||||||
package org.apache.lucene.search.similarities;
|
|
||||||
|
|
||||||
import org.apache.lucene.search.BooleanQuery;
|
|
||||||
|
|
||||||
/**
|
|
||||||
* Licensed to the Apache Software Foundation (ASF) under one or more
|
|
||||||
* contributor license agreements. See the NOTICE file distributed with
|
|
||||||
* this work for additional information regarding copyright ownership.
|
|
||||||
* The ASF licenses this file to You under the Apache License, Version 2.0
|
|
||||||
* (the "License"); you may not use this file except in compliance with
|
|
||||||
* the License. You may obtain a copy of the License at
|
|
||||||
*
|
|
||||||
* http://www.apache.org/licenses/LICENSE-2.0
|
|
||||||
*
|
|
||||||
* Unless required by applicable law or agreed to in writing, software
|
|
||||||
* distributed under the License is distributed on an "AS IS" BASIS,
|
|
||||||
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
|
||||||
* See the License for the specific language governing permissions and
|
|
||||||
* limitations under the License.
|
|
||||||
*/
|
|
||||||
|
|
||||||
/**
|
|
||||||
* Expert: Scoring API.
|
|
||||||
*
|
|
||||||
* Provides top-level scoring functions that aren't specific to a field,
|
|
||||||
* and work across multi-field queries (such as {@link BooleanQuery}).
|
|
||||||
*
|
|
||||||
* Field-specific scoring is accomplished through {@link Similarity}.
|
|
||||||
*
|
|
||||||
* @lucene.experimental
|
|
||||||
*/
|
|
||||||
public interface SimilarityProvider {
|
|
||||||
|
|
||||||
/** Computes a score factor based on the fraction of all query terms that a
|
|
||||||
* document contains. This value is multiplied into scores.
|
|
||||||
*
|
|
||||||
* <p>The presence of a large portion of the query terms indicates a better
|
|
||||||
* match with the query, so implementations of this method usually return
|
|
||||||
* larger values when the ratio between these parameters is large and smaller
|
|
||||||
* values when the ratio between them is small.
|
|
||||||
*
|
|
||||||
* @param overlap the number of query terms matched in the document
|
|
||||||
* @param maxOverlap the total number of terms in the query
|
|
||||||
* @return a score factor based on term overlap with the query
|
|
||||||
*/
|
|
||||||
public abstract float coord(int overlap, int maxOverlap);
|
|
||||||
|
|
||||||
/** Computes the normalization value for a query given the sum of the squared
|
|
||||||
* weights of each of the query terms. This value is multiplied into the
|
|
||||||
* weight of each query term. While the classic query normalization factor is
|
|
||||||
* computed as 1/sqrt(sumOfSquaredWeights), other implementations might
|
|
||||||
* completely ignore sumOfSquaredWeights (ie return 1).
|
|
||||||
*
|
|
||||||
* <p>This does not affect ranking, but the default implementation does make scores
|
|
||||||
* from different queries more comparable than they would be by eliminating the
|
|
||||||
* magnitude of the Query vector as a factor in the score.
|
|
||||||
*
|
|
||||||
* @param sumOfSquaredWeights the sum of the squares of query term weights
|
|
||||||
* @return a normalization factor for query weights
|
|
||||||
*/
|
|
||||||
public abstract float queryNorm(float sumOfSquaredWeights);
|
|
||||||
|
|
||||||
/** Returns a {@link Similarity} for scoring a field
|
|
||||||
* @param field field name.
|
|
||||||
* @return a field-specific Similarity.
|
|
||||||
*/
|
|
||||||
public abstract Similarity get(String field);
|
|
||||||
}
|
|
|
@ -366,8 +366,8 @@ import org.apache.lucene.util.SmallFloat;
|
||||||
* Typically, a document that contains more of the query's terms will receive a higher score
|
* Typically, a document that contains more of the query's terms will receive a higher score
|
||||||
* than another document with fewer query terms.
|
* than another document with fewer query terms.
|
||||||
* This is a search time factor computed in
|
* This is a search time factor computed in
|
||||||
* {@link SimilarityProvider#coord(int, int) coord(q,d)}
|
* {@link #coord(int, int) coord(q,d)}
|
||||||
* by the SimilarityProvider in effect at search time.
|
* by the Similarity in effect at search time.
|
||||||
* <br> <br>
|
* <br> <br>
|
||||||
* </li>
|
* </li>
|
||||||
*
|
*
|
||||||
|
@ -381,14 +381,14 @@ import org.apache.lucene.util.SmallFloat;
|
||||||
* This is a search time factor computed by the Similarity in effect at search time.
|
* This is a search time factor computed by the Similarity in effect at search time.
|
||||||
*
|
*
|
||||||
* The default computation in
|
* The default computation in
|
||||||
* {@link org.apache.lucene.search.similarities.DefaultSimilarityProvider#queryNorm(float) DefaultSimilarityProvider}
|
* {@link org.apache.lucene.search.similarities.DefaultSimilarity#queryNorm(float) DefaultSimilarity}
|
||||||
* produces a <a href="http://en.wikipedia.org/wiki/Euclidean_norm#Euclidean_norm">Euclidean norm</a>:
|
* produces a <a href="http://en.wikipedia.org/wiki/Euclidean_norm#Euclidean_norm">Euclidean norm</a>:
|
||||||
* <br> <br>
|
* <br> <br>
|
||||||
* <table cellpadding="1" cellspacing="0" border="0" align="center">
|
* <table cellpadding="1" cellspacing="0" border="0" align="center">
|
||||||
* <tr>
|
* <tr>
|
||||||
* <td valign="middle" align="right" rowspan="1">
|
* <td valign="middle" align="right" rowspan="1">
|
||||||
* queryNorm(q) =
|
* queryNorm(q) =
|
||||||
* {@link org.apache.lucene.search.similarities.DefaultSimilarityProvider#queryNorm(float) queryNorm(sumOfSquaredWeights)}
|
* {@link org.apache.lucene.search.similarities.DefaultSimilarity#queryNorm(float) queryNorm(sumOfSquaredWeights)}
|
||||||
* =
|
* =
|
||||||
* </td>
|
* </td>
|
||||||
* <td valign="middle" align="center" rowspan="1">
|
* <td valign="middle" align="center" rowspan="1">
|
||||||
|
@ -520,11 +520,42 @@ import org.apache.lucene.util.SmallFloat;
|
||||||
* </li>
|
* </li>
|
||||||
* </ol>
|
* </ol>
|
||||||
*
|
*
|
||||||
* @see org.apache.lucene.index.IndexWriterConfig#setSimilarityProvider(SimilarityProvider)
|
* @see org.apache.lucene.index.IndexWriterConfig#setSimilarity(Similarity)
|
||||||
* @see IndexSearcher#setSimilarityProvider(SimilarityProvider)
|
* @see IndexSearcher#setSimilarity(Similarity)
|
||||||
*/
|
*/
|
||||||
public abstract class TFIDFSimilarity extends Similarity {
|
public abstract class TFIDFSimilarity extends Similarity {
|
||||||
|
|
||||||
|
/** Computes a score factor based on the fraction of all query terms that a
|
||||||
|
* document contains. This value is multiplied into scores.
|
||||||
|
*
|
||||||
|
* <p>The presence of a large portion of the query terms indicates a better
|
||||||
|
* match with the query, so implementations of this method usually return
|
||||||
|
* larger values when the ratio between these parameters is large and smaller
|
||||||
|
* values when the ratio between them is small.
|
||||||
|
*
|
||||||
|
* @param overlap the number of query terms matched in the document
|
||||||
|
* @param maxOverlap the total number of terms in the query
|
||||||
|
* @return a score factor based on term overlap with the query
|
||||||
|
*/
|
||||||
|
@Override
|
||||||
|
public abstract float coord(int overlap, int maxOverlap);
|
||||||
|
|
||||||
|
/** Computes the normalization value for a query given the sum of the squared
|
||||||
|
* weights of each of the query terms. This value is multiplied into the
|
||||||
|
* weight of each query term. While the classic query normalization factor is
|
||||||
|
* computed as 1/sqrt(sumOfSquaredWeights), other implementations might
|
||||||
|
* completely ignore sumOfSquaredWeights (i.e. return 1).
|
||||||
|
*
|
||||||
|
* <p>This does not affect ranking, but the default implementation does make scores
|
||||||
|
* from different queries more comparable than they would be by eliminating the
|
||||||
|
* magnitude of the Query vector as a factor in the score.
|
||||||
|
*
|
||||||
|
* @param sumOfSquaredWeights the sum of the squares of query term weights
|
||||||
|
* @return a normalization factor for query weights
|
||||||
|
*/
|
||||||
|
@Override
|
||||||
|
public abstract float queryNorm(float sumOfSquaredWeights);
|
||||||
|
|
||||||
/** Computes a score factor based on a term or phrase's frequency in a
|
/** Computes a score factor based on a term or phrase's frequency in a
|
||||||
* document. This value is multiplied by the {@link #idf(long, long)}
|
* document. This value is multiplied by the {@link #idf(long, long)}
|
||||||
* factor for each term in the query and these products are then summed to
|
* factor for each term in the query and these products are then summed to
|
||||||
|
|
|
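The two query-level hooks added to TFIDFSimilarity in the hunk above, coord(overlap, maxOverlap) and queryNorm(sumOfSquaredWeights), can be illustrated with a minimal standalone sketch. It does not use any Lucene classes: `ScoreFactors` and its method names are hypothetical. The queryNorm formula is the classic 1/sqrt(sumOfSquaredWeights) quoted in the Javadoc; the coord formula (matched terms divided by total query terms) is an assumption about the classic TF-IDF behavior and is not shown in this diff.

```java
/** Hypothetical stand-in illustrating the two query-level score factors; not a Lucene class. */
final class ScoreFactors {
  private ScoreFactors() {}

  /** Assumed classic coord: the fraction of query terms that the document matched. */
  static float coord(int overlap, int maxOverlap) {
    return overlap / (float) maxOverlap;
  }

  /** Classic query normalization as described in the Javadoc: 1/sqrt(sumOfSquaredWeights). */
  static float queryNorm(float sumOfSquaredWeights) {
    return (float) (1.0 / Math.sqrt(sumOfSquaredWeights));
  }

  /** Variant that ignores its input and returns 1, leaving query weights unchanged. */
  static float queryNormDisabled(float sumOfSquaredWeights) {
    return 1.0f;
  }
}
```

Under these assumptions, a document matching 2 of 4 query terms gets a coord factor of 0.5, and a query whose squared term weights sum to 4 gets a queryNorm of 0.5.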
@ -30,7 +30,6 @@ package.
|
||||||
<p>
|
<p>
|
||||||
<ol>
|
<ol>
|
||||||
<li><a href="#sims">Summary of the Ranking Methods</a></li>
|
<li><a href="#sims">Summary of the Ranking Methods</a></li>
|
||||||
<li><a href="#providers">Similarity Providers<a/></li>
|
|
||||||
<li><a href="#changingSimilarity">Changing the Similarity</a></li>
|
<li><a href="#changingSimilarity">Changing the Similarity</a></li>
|
||||||
</ol>
|
</ol>
|
||||||
</p>
|
</p>
|
||||||
|
@ -69,27 +68,6 @@ performance is to be expected when using the methods listed above. However,
|
||||||
optimizations can always be implemented in subclasses; see
|
optimizations can always be implemented in subclasses; see
|
||||||
<a href="#changingSimilarity">below</a>.</p>
|
<a href="#changingSimilarity">below</a>.</p>
|
||||||
|
|
||||||
|
|
||||||
<a name="providers"></a>
|
|
||||||
<h2>Similarity Providers</h2>
|
|
||||||
|
|
||||||
<p>{@link org.apache.lucene.search.similarities.SimilarityProvider}s are factories
|
|
||||||
that return Similarities per-field and compute coordination factors and normalization
|
|
||||||
values for the query.
|
|
||||||
{@link org.apache.lucene.search.similarities.DefaultSimilarityProvider} is the
|
|
||||||
default implementation used by Lucene, geared towards vector-spaced search: it returns
|
|
||||||
{@link org.apache.lucene.search.similarities.DefaultSimilarity} for every field,
|
|
||||||
and implements coordination-level matching and query normalization.
|
|
||||||
{@link org.apache.lucene.search.similarities.BasicSimilarityProvider} is geared towards
|
|
||||||
non-vector-space models and does not implement coordination-level matching or query
|
|
||||||
normalization. It is a convenience implementation that returns an arbitrary
|
|
||||||
{@link org.apache.lucene.search.similarities.Similarity} for every field.
|
|
||||||
You can write your own SimilarityProvider to return different Similarities for different
|
|
||||||
fields: for example you might want to use different parameter values for different fields,
|
|
||||||
or maybe even entirely different ranking algorithms.
|
|
||||||
</p>
|
|
||||||
|
|
||||||
|
|
||||||
<a name="changingSimilarity"></a>
|
<a name="changingSimilarity"></a>
|
||||||
<h2>Changing Similarity</h2>
|
<h2>Changing Similarity</h2>
|
||||||
|
|
||||||
|
@ -110,13 +88,11 @@ or maybe even entirely different ranking algorithms.
|
||||||
<p>To make this change, implement your own {@link org.apache.lucene.search.similarities.Similarity} (likely
|
<p>To make this change, implement your own {@link org.apache.lucene.search.similarities.Similarity} (likely
|
||||||
you'll want to simply subclass an existing method, be it
|
you'll want to simply subclass an existing method, be it
|
||||||
{@link org.apache.lucene.search.similarities.DefaultSimilarity} or a descendant of
|
{@link org.apache.lucene.search.similarities.DefaultSimilarity} or a descendant of
|
||||||
{@link org.apache.lucene.search.similarities.SimilarityBase}) and
|
{@link org.apache.lucene.search.similarities.SimilarityBase}), and
|
||||||
{@link org.apache.lucene.search.similarities.SimilarityProvider} (or use
|
|
||||||
{@link org.apache.lucene.search.similarities.BasicSimilarityProvider}), and
|
|
||||||
then register the new class by calling
|
then register the new class by calling
|
||||||
{@link org.apache.lucene.index.IndexWriterConfig#setSimilarityProvider(SimilarityProvider)}
|
{@link org.apache.lucene.index.IndexWriterConfig#setSimilarity(Similarity)}
|
||||||
before indexing and
|
before indexing and
|
||||||
{@link org.apache.lucene.search.IndexSearcher#setSimilarityProvider(SimilarityProvider)}
|
{@link org.apache.lucene.search.IndexSearcher#setSimilarity(Similarity)}
|
||||||
before searching.
|
before searching.
|
||||||
</p>
|
</p>
|
||||||
|
|
||||||
|
|
|
@ -42,7 +42,7 @@ public class SpanWeight extends Weight {
|
||||||
|
|
||||||
public SpanWeight(SpanQuery query, IndexSearcher searcher)
|
public SpanWeight(SpanQuery query, IndexSearcher searcher)
|
||||||
throws IOException {
|
throws IOException {
|
||||||
this.similarity = searcher.getSimilarityProvider().get(query.getField());
|
this.similarity = searcher.getSimilarity();
|
||||||
this.query = query;
|
this.query = query;
|
||||||
|
|
||||||
termContexts = new HashMap<Term,TermContext>();
|
termContexts = new HashMap<Term,TermContext>();
|
||||||
|
|
|
@ -33,7 +33,7 @@ import org.apache.lucene.document.StoredField;
|
||||||
import org.apache.lucene.document.StringField;
|
import org.apache.lucene.document.StringField;
|
||||||
import org.apache.lucene.document.TextField;
|
import org.apache.lucene.document.TextField;
|
||||||
import org.apache.lucene.index.FieldInfo.IndexOptions;
|
import org.apache.lucene.index.FieldInfo.IndexOptions;
|
||||||
import org.apache.lucene.search.similarities.SimilarityProvider;
|
import org.apache.lucene.search.similarities.Similarity;
|
||||||
import org.apache.lucene.store.Directory;
|
import org.apache.lucene.store.Directory;
|
||||||
|
|
||||||
import static org.apache.lucene.util.LuceneTestCase.TEST_VERSION_CURRENT;
|
import static org.apache.lucene.util.LuceneTestCase.TEST_VERSION_CURRENT;
|
||||||
|
@ -276,9 +276,9 @@ class DocHelper {
|
||||||
* @param doc
|
* @param doc
|
||||||
* @throws IOException
|
* @throws IOException
|
||||||
*/
|
*/
|
||||||
public static SegmentInfo writeDoc(Random random, Directory dir, Analyzer analyzer, SimilarityProvider similarity, Document doc) throws IOException {
|
public static SegmentInfo writeDoc(Random random, Directory dir, Analyzer analyzer, Similarity similarity, Document doc) throws IOException {
|
||||||
IndexWriter writer = new IndexWriter(dir, new IndexWriterConfig( /* LuceneTestCase.newIndexWriterConfig(random, */
|
IndexWriter writer = new IndexWriter(dir, new IndexWriterConfig( /* LuceneTestCase.newIndexWriterConfig(random, */
|
||||||
TEST_VERSION_CURRENT, analyzer).setSimilarityProvider(similarity));
|
TEST_VERSION_CURRENT, analyzer).setSimilarity(similarity));
|
||||||
//writer.setUseCompoundFile(false);
|
//writer.setUseCompoundFile(false);
|
||||||
writer.addDocument(doc);
|
writer.addDocument(doc);
|
||||||
writer.commit();
|
writer.commit();
|
||||||
|
|
|
@ -168,7 +168,7 @@ public class QueryUtils {
|
||||||
0 < edge ? r : emptyReaders[0])
|
0 < edge ? r : emptyReaders[0])
|
||||||
};
|
};
|
||||||
IndexSearcher out = LuceneTestCase.newSearcher(new MultiReader(readers));
|
IndexSearcher out = LuceneTestCase.newSearcher(new MultiReader(readers));
|
||||||
out.setSimilarityProvider(s.getSimilarityProvider());
|
out.setSimilarity(s.getSimilarity());
|
||||||
return out;
|
return out;
|
||||||
}
|
}
|
||||||
|
|
||||||
|
|
|
@ -38,7 +38,6 @@ import org.apache.lucene.search.similarities.BasicModelIne;
|
||||||
import org.apache.lucene.search.similarities.BasicModelP;
|
import org.apache.lucene.search.similarities.BasicModelP;
|
||||||
import org.apache.lucene.search.similarities.DFRSimilarity;
|
import org.apache.lucene.search.similarities.DFRSimilarity;
|
||||||
import org.apache.lucene.search.similarities.DefaultSimilarity;
|
import org.apache.lucene.search.similarities.DefaultSimilarity;
|
||||||
import org.apache.lucene.search.similarities.DefaultSimilarityProvider;
|
|
||||||
import org.apache.lucene.search.similarities.Distribution;
|
import org.apache.lucene.search.similarities.Distribution;
|
||||||
import org.apache.lucene.search.similarities.DistributionLL;
|
import org.apache.lucene.search.similarities.DistributionLL;
|
||||||
import org.apache.lucene.search.similarities.DistributionSPL;
|
import org.apache.lucene.search.similarities.DistributionSPL;
|
||||||
|
@ -53,9 +52,11 @@ import org.apache.lucene.search.similarities.NormalizationH1;
|
||||||
import org.apache.lucene.search.similarities.NormalizationH2;
|
import org.apache.lucene.search.similarities.NormalizationH2;
|
||||||
import org.apache.lucene.search.similarities.NormalizationH3;
|
import org.apache.lucene.search.similarities.NormalizationH3;
|
||||||
import org.apache.lucene.search.similarities.NormalizationZ;
|
import org.apache.lucene.search.similarities.NormalizationZ;
|
||||||
|
import org.apache.lucene.search.similarities.PerFieldSimilarityWrapper;
|
||||||
import org.apache.lucene.search.similarities.Similarity;
|
import org.apache.lucene.search.similarities.Similarity;
|
||||||
|
|
||||||
public class RandomSimilarityProvider extends DefaultSimilarityProvider {
|
public class RandomSimilarityProvider extends PerFieldSimilarityWrapper {
|
||||||
|
final DefaultSimilarity defaultSim = new DefaultSimilarity();
|
||||||
final List<Similarity> knownSims;
|
final List<Similarity> knownSims;
|
||||||
Map<String,Similarity> previousMappings = new HashMap<String,Similarity>();
|
Map<String,Similarity> previousMappings = new HashMap<String,Similarity>();
|
||||||
final int perFieldSeed;
|
final int perFieldSeed;
|
||||||
|
@ -73,7 +74,7 @@ public class RandomSimilarityProvider extends DefaultSimilarityProvider {
|
||||||
@Override
|
@Override
|
||||||
public float coord(int overlap, int maxOverlap) {
|
public float coord(int overlap, int maxOverlap) {
|
||||||
if (shouldCoord) {
|
if (shouldCoord) {
|
||||||
return super.coord(overlap, maxOverlap);
|
return defaultSim.coord(overlap, maxOverlap);
|
||||||
} else {
|
} else {
|
||||||
return 1.0f;
|
return 1.0f;
|
||||||
}
|
}
|
||||||
|
@ -82,7 +83,7 @@ public class RandomSimilarityProvider extends DefaultSimilarityProvider {
|
||||||
@Override
|
@Override
|
||||||
public float queryNorm(float sumOfSquaredWeights) {
|
public float queryNorm(float sumOfSquaredWeights) {
|
||||||
if (shouldQueryNorm) {
|
if (shouldQueryNorm) {
|
||||||
return super.queryNorm(sumOfSquaredWeights);
|
return defaultSim.queryNorm(sumOfSquaredWeights);
|
||||||
} else {
|
} else {
|
||||||
return 1.0f;
|
return 1.0f;
|
||||||
}
|
}
|
||||||
|
|
|
@ -49,7 +49,8 @@ import org.apache.lucene.search.FieldCache.CacheEntry;
|
||||||
import org.apache.lucene.search.AssertingIndexSearcher;
|
import org.apache.lucene.search.AssertingIndexSearcher;
|
||||||
import org.apache.lucene.search.IndexSearcher;
|
import org.apache.lucene.search.IndexSearcher;
|
||||||
import org.apache.lucene.search.RandomSimilarityProvider;
|
import org.apache.lucene.search.RandomSimilarityProvider;
|
||||||
import org.apache.lucene.search.similarities.SimilarityProvider;
|
import org.apache.lucene.search.similarities.DefaultSimilarity;
|
||||||
|
import org.apache.lucene.search.similarities.Similarity;
|
||||||
import org.apache.lucene.store.Directory;
|
import org.apache.lucene.store.Directory;
|
||||||
import org.apache.lucene.store.FSDirectory;
|
import org.apache.lucene.store.FSDirectory;
|
||||||
import org.apache.lucene.store.FlushInfo;
|
import org.apache.lucene.store.FlushInfo;
|
||||||
|
@ -213,7 +214,7 @@ public abstract class LuceneTestCase extends Assert {
|
||||||
|
|
||||||
private static InfoStream savedInfoStream;
|
private static InfoStream savedInfoStream;
|
||||||
|
|
||||||
private static SimilarityProvider similarityProvider;
|
private static Similarity similarity;
|
||||||
|
|
||||||
private static Locale locale;
|
private static Locale locale;
|
||||||
private static Locale savedLocale;
|
private static Locale savedLocale;
|
||||||
|
@ -330,7 +331,7 @@ public abstract class LuceneTestCase extends Assert {
|
||||||
savedTimeZone = TimeZone.getDefault();
|
savedTimeZone = TimeZone.getDefault();
|
||||||
timeZone = TEST_TIMEZONE.equals("random") ? randomTimeZone(random) : TimeZone.getTimeZone(TEST_TIMEZONE);
|
timeZone = TEST_TIMEZONE.equals("random") ? randomTimeZone(random) : TimeZone.getTimeZone(TEST_TIMEZONE);
|
||||||
TimeZone.setDefault(timeZone);
|
TimeZone.setDefault(timeZone);
|
||||||
similarityProvider = new RandomSimilarityProvider(random);
|
similarity = random.nextBoolean() ? new DefaultSimilarity() : new RandomSimilarityProvider(random);
|
||||||
testsFailed = false;
|
testsFailed = false;
|
||||||
}
|
}
|
||||||
|
|
||||||
|
@ -407,7 +408,7 @@ public abstract class LuceneTestCase extends Assert {
|
||||||
/** print some useful debugging information about the environment */
|
/** print some useful debugging information about the environment */
|
||||||
private static void printDebuggingInformation(String codecDescription) {
|
private static void printDebuggingInformation(String codecDescription) {
|
||||||
System.err.println("NOTE: test params are: codec=" + codecDescription +
|
System.err.println("NOTE: test params are: codec=" + codecDescription +
|
||||||
", sim=" + similarityProvider +
|
", sim=" + similarity +
|
||||||
", locale=" + locale +
|
", locale=" + locale +
|
||||||
", timezone=" + (timeZone == null ? "(null)" : timeZone.getID()));
|
", timezone=" + (timeZone == null ? "(null)" : timeZone.getID()));
|
||||||
System.err.println("NOTE: all tests run in this JVM:");
|
System.err.println("NOTE: all tests run in this JVM:");
|
||||||
|
@ -911,7 +912,7 @@ public abstract class LuceneTestCase extends Assert {
|
||||||
/** create a new index writer config with random defaults using the specified random */
|
/** create a new index writer config with random defaults using the specified random */
|
||||||
public static IndexWriterConfig newIndexWriterConfig(Random r, Version v, Analyzer a) {
|
public static IndexWriterConfig newIndexWriterConfig(Random r, Version v, Analyzer a) {
|
||||||
IndexWriterConfig c = new IndexWriterConfig(v, a);
|
IndexWriterConfig c = new IndexWriterConfig(v, a);
|
||||||
c.setSimilarityProvider(similarityProvider);
|
c.setSimilarity(similarity);
|
||||||
if (r.nextBoolean()) {
|
if (r.nextBoolean()) {
|
||||||
c.setMergeScheduler(new SerialMergeScheduler());
|
c.setMergeScheduler(new SerialMergeScheduler());
|
||||||
}
|
}
|
||||||
|
@ -1261,7 +1262,7 @@ public abstract class LuceneTestCase extends Assert {
|
||||||
r = SlowCompositeReaderWrapper.wrap(r);
|
r = SlowCompositeReaderWrapper.wrap(r);
|
||||||
}
|
}
|
||||||
IndexSearcher ret = random.nextBoolean() ? new AssertingIndexSearcher(random, r) : new AssertingIndexSearcher(random, r.getTopReaderContext());
|
IndexSearcher ret = random.nextBoolean() ? new AssertingIndexSearcher(random, r) : new AssertingIndexSearcher(random, r.getTopReaderContext());
|
||||||
ret.setSimilarityProvider(similarityProvider);
|
ret.setSimilarity(similarity);
|
||||||
return ret;
|
return ret;
|
||||||
} else {
|
} else {
|
||||||
int threads = 0;
|
int threads = 0;
|
||||||
|
@ -1282,7 +1283,7 @@ public abstract class LuceneTestCase extends Assert {
|
||||||
IndexSearcher ret = random.nextBoolean()
|
IndexSearcher ret = random.nextBoolean()
|
||||||
? new AssertingIndexSearcher(random, r, ex)
|
? new AssertingIndexSearcher(random, r, ex)
|
||||||
: new AssertingIndexSearcher(random, r.getTopReaderContext(), ex);
|
: new AssertingIndexSearcher(random, r.getTopReaderContext(), ex);
|
||||||
ret.setSimilarityProvider(similarityProvider);
|
ret.setSimilarity(similarity);
|
||||||
return ret;
|
return ret;
|
||||||
}
|
}
|
||||||
}
|
}
|
||||||
|
|
|
@ -27,9 +27,8 @@ import org.apache.lucene.document.TextField;
|
||||||
import org.apache.lucene.index.DocValues.Source;
|
import org.apache.lucene.index.DocValues.Source;
|
||||||
import org.apache.lucene.index.DocValues.Type;
|
import org.apache.lucene.index.DocValues.Type;
|
||||||
import org.apache.lucene.search.similarities.DefaultSimilarity;
|
import org.apache.lucene.search.similarities.DefaultSimilarity;
|
||||||
import org.apache.lucene.search.similarities.DefaultSimilarityProvider;
|
import org.apache.lucene.search.similarities.PerFieldSimilarityWrapper;
|
||||||
import org.apache.lucene.search.similarities.Similarity;
|
import org.apache.lucene.search.similarities.Similarity;
|
||||||
import org.apache.lucene.search.similarities.SimilarityProvider;
|
|
||||||
import org.apache.lucene.store.MockDirectoryWrapper;
|
import org.apache.lucene.store.MockDirectoryWrapper;
|
||||||
import org.apache.lucene.util.BytesRef;
|
import org.apache.lucene.util.BytesRef;
|
||||||
import org.apache.lucene.util.LineFileDocs;
|
import org.apache.lucene.util.LineFileDocs;
|
||||||
|
@ -59,8 +58,8 @@ public class TestCustomNorms extends LuceneTestCase {
|
||||||
dir.setCheckIndexOnClose(false); // can't set sim to checkindex yet
|
dir.setCheckIndexOnClose(false); // can't set sim to checkindex yet
|
||||||
IndexWriterConfig config = newIndexWriterConfig(TEST_VERSION_CURRENT,
|
IndexWriterConfig config = newIndexWriterConfig(TEST_VERSION_CURRENT,
|
||||||
new MockAnalyzer(random));
|
new MockAnalyzer(random));
|
||||||
SimilarityProvider provider = new MySimProvider();
|
Similarity provider = new MySimProvider();
|
||||||
config.setSimilarityProvider(provider);
|
config.setSimilarity(provider);
|
||||||
RandomIndexWriter writer = new RandomIndexWriter(random, dir, config);
|
RandomIndexWriter writer = new RandomIndexWriter(random, dir, config);
|
||||||
final LineFileDocs docs = new LineFileDocs(random);
|
final LineFileDocs docs = new LineFileDocs(random);
|
||||||
int num = atLeast(100);
|
int num = atLeast(100);
|
||||||
|
@ -100,8 +99,8 @@ public class TestCustomNorms extends LuceneTestCase {
|
||||||
dir.setCheckIndexOnClose(false); // can't set sim to checkindex yet
|
dir.setCheckIndexOnClose(false); // can't set sim to checkindex yet
|
||||||
IndexWriterConfig config = newIndexWriterConfig(TEST_VERSION_CURRENT,
|
IndexWriterConfig config = newIndexWriterConfig(TEST_VERSION_CURRENT,
|
||||||
new MockAnalyzer(random));
|
new MockAnalyzer(random));
|
||||||
SimilarityProvider provider = new MySimProvider();
|
Similarity provider = new MySimProvider();
|
||||||
config.setSimilarityProvider(provider);
|
config.setSimilarity(provider);
|
||||||
RandomIndexWriter writer = new RandomIndexWriter(random, dir, config);
|
RandomIndexWriter writer = new RandomIndexWriter(random, dir, config);
|
||||||
final LineFileDocs docs = new LineFileDocs(random);
|
final LineFileDocs docs = new LineFileDocs(random);
|
||||||
int num = atLeast(100);
|
int num = atLeast(100);
|
||||||
|
@ -130,12 +129,11 @@ public class TestCustomNorms extends LuceneTestCase {
|
||||||
|
|
||||||
}
|
}
|
||||||
|
|
||||||
public class MySimProvider implements SimilarityProvider {
|
public class MySimProvider extends PerFieldSimilarityWrapper {
|
||||||
SimilarityProvider delegate = new DefaultSimilarityProvider();
|
Similarity delegate = new DefaultSimilarity();
|
||||||
|
|
||||||
@Override
|
@Override
|
||||||
public float queryNorm(float sumOfSquaredWeights) {
|
public float queryNorm(float sumOfSquaredWeights) {
|
||||||
|
|
||||||
return delegate.queryNorm(sumOfSquaredWeights);
|
return delegate.queryNorm(sumOfSquaredWeights);
|
||||||
}
|
}
|
||||||
|
|
||||||
|
@ -146,7 +144,7 @@ public class TestCustomNorms extends LuceneTestCase {
|
||||||
} else if (exceptionTestField.equals(field)) {
|
} else if (exceptionTestField.equals(field)) {
|
||||||
return new RandomTypeSimilarity(random);
|
return new RandomTypeSimilarity(random);
|
||||||
} else {
|
} else {
|
||||||
return delegate.get(field);
|
return delegate;
|
||||||
}
|
}
|
||||||
}
|
}
|
||||||
|
|
||||||
|
|
|
@ -28,14 +28,14 @@ import org.apache.lucene.codecs.Codec;
|
||||||
import org.apache.lucene.index.DocumentsWriterPerThread.IndexingChain;
|
import org.apache.lucene.index.DocumentsWriterPerThread.IndexingChain;
|
||||||
import org.apache.lucene.index.IndexWriterConfig.OpenMode;
|
import org.apache.lucene.index.IndexWriterConfig.OpenMode;
|
||||||
import org.apache.lucene.search.IndexSearcher;
|
import org.apache.lucene.search.IndexSearcher;
|
||||||
import org.apache.lucene.search.similarities.DefaultSimilarityProvider;
|
import org.apache.lucene.search.similarities.DefaultSimilarity;
|
||||||
import org.apache.lucene.util.InfoStream;
|
import org.apache.lucene.util.InfoStream;
|
||||||
import org.apache.lucene.util.LuceneTestCase;
|
import org.apache.lucene.util.LuceneTestCase;
|
||||||
import org.junit.Test;
|
import org.junit.Test;
|
||||||
|
|
||||||
public class TestIndexWriterConfig extends LuceneTestCase {
|
public class TestIndexWriterConfig extends LuceneTestCase {
|
||||||
|
|
||||||
private static final class MySimilarityProvider extends DefaultSimilarityProvider {
|
private static final class MySimilarity extends DefaultSimilarity {
|
||||||
// Does not implement anything - used only for type checking on IndexWriterConfig.
|
// Does not implement anything - used only for type checking on IndexWriterConfig.
|
||||||
}
|
}
|
||||||
|
|
||||||
|
@ -58,7 +58,7 @@ public class TestIndexWriterConfig extends LuceneTestCase {
|
||||||
assertEquals(ConcurrentMergeScheduler.class, conf.getMergeScheduler().getClass());
|
assertEquals(ConcurrentMergeScheduler.class, conf.getMergeScheduler().getClass());
|
||||||
assertEquals(OpenMode.CREATE_OR_APPEND, conf.getOpenMode());
|
assertEquals(OpenMode.CREATE_OR_APPEND, conf.getOpenMode());
|
||||||
// we don't need to assert this, it should be unspecified
|
// we don't need to assert this, it should be unspecified
|
||||||
assertTrue(IndexSearcher.getDefaultSimilarityProvider() == conf.getSimilarityProvider());
|
assertTrue(IndexSearcher.getDefaultSimilarity() == conf.getSimilarity());
|
||||||
assertEquals(IndexWriterConfig.DEFAULT_TERM_INDEX_INTERVAL, conf.getTermIndexInterval());
|
assertEquals(IndexWriterConfig.DEFAULT_TERM_INDEX_INTERVAL, conf.getTermIndexInterval());
|
||||||
assertEquals(IndexWriterConfig.getDefaultWriteLockTimeout(), conf.getWriteLockTimeout());
|
assertEquals(IndexWriterConfig.getDefaultWriteLockTimeout(), conf.getWriteLockTimeout());
|
||||||
assertEquals(IndexWriterConfig.WRITE_LOCK_TIMEOUT, IndexWriterConfig.getDefaultWriteLockTimeout());
|
assertEquals(IndexWriterConfig.WRITE_LOCK_TIMEOUT, IndexWriterConfig.getDefaultWriteLockTimeout());
|
||||||
|
@ -83,7 +83,7 @@ public class TestIndexWriterConfig extends LuceneTestCase {
|
||||||
getters.add("getMaxFieldLength");
|
getters.add("getMaxFieldLength");
|
||||||
getters.add("getMergeScheduler");
|
getters.add("getMergeScheduler");
|
||||||
getters.add("getOpenMode");
|
getters.add("getOpenMode");
|
||||||
getters.add("getSimilarityProvider");
|
getters.add("getSimilarity");
|
||||||
getters.add("getTermIndexInterval");
|
getters.add("getTermIndexInterval");
|
||||||
getters.add("getWriteLockTimeout");
|
getters.add("getWriteLockTimeout");
|
||||||
getters.add("getDefaultWriteLockTimeout");
|
getters.add("getDefaultWriteLockTimeout");
|
||||||
|
@ -185,11 +185,11 @@ public class TestIndexWriterConfig extends LuceneTestCase {
|
||||||
|
|
||||||
// Test Similarity:
|
// Test Similarity:
|
||||||
// we shouldn't assert what the default is, just that it's not null.
|
// we shouldn't assert what the default is, just that it's not null.
|
||||||
assertTrue(IndexSearcher.getDefaultSimilarityProvider() == conf.getSimilarityProvider());
|
assertTrue(IndexSearcher.getDefaultSimilarity() == conf.getSimilarity());
|
||||||
conf.setSimilarityProvider(new MySimilarityProvider());
|
conf.setSimilarity(new MySimilarity());
|
||||||
assertEquals(MySimilarityProvider.class, conf.getSimilarityProvider().getClass());
|
assertEquals(MySimilarity.class, conf.getSimilarity().getClass());
|
||||||
conf.setSimilarityProvider(null);
|
conf.setSimilarity(null);
|
||||||
assertTrue(IndexSearcher.getDefaultSimilarityProvider() == conf.getSimilarityProvider());
|
assertTrue(IndexSearcher.getDefaultSimilarity() == conf.getSimilarity());
|
||||||
|
|
||||||
// Test IndexingChain
|
// Test IndexingChain
|
||||||
assertTrue(DocumentsWriterPerThread.defaultIndexingChain == conf.getIndexingChain());
|
assertTrue(DocumentsWriterPerThread.defaultIndexingChain == conf.getIndexingChain());
|
||||||
|
|
|
@ -28,8 +28,6 @@ import org.apache.lucene.document.Document;
|
||||||
import org.apache.lucene.document.Field;
|
import org.apache.lucene.document.Field;
|
||||||
import org.apache.lucene.document.TextField;
|
import org.apache.lucene.document.TextField;
|
||||||
import org.apache.lucene.search.similarities.DefaultSimilarity;
|
import org.apache.lucene.search.similarities.DefaultSimilarity;
|
||||||
import org.apache.lucene.search.similarities.DefaultSimilarityProvider;
|
|
||||||
import org.apache.lucene.search.similarities.Similarity;
|
|
||||||
import org.apache.lucene.store.Directory;
|
import org.apache.lucene.store.Directory;
|
||||||
import org.apache.lucene.util.LuceneTestCase;
|
import org.apache.lucene.util.LuceneTestCase;
|
||||||
import org.apache.lucene.util._TestUtil;
|
import org.apache.lucene.util._TestUtil;
|
||||||
|
@ -49,12 +47,7 @@ public class TestMaxTermFrequency extends LuceneTestCase {
|
||||||
dir = newDirectory();
|
dir = newDirectory();
|
||||||
IndexWriterConfig config = newIndexWriterConfig(TEST_VERSION_CURRENT,
|
IndexWriterConfig config = newIndexWriterConfig(TEST_VERSION_CURRENT,
|
||||||
new MockAnalyzer(random, MockTokenizer.SIMPLE, true)).setMergePolicy(newLogMergePolicy());
|
new MockAnalyzer(random, MockTokenizer.SIMPLE, true)).setMergePolicy(newLogMergePolicy());
|
||||||
config.setSimilarityProvider(new DefaultSimilarityProvider() {
|
config.setSimilarity(new TestSimilarity());
|
||||||
@Override
|
|
||||||
public Similarity get(String field) {
|
|
||||||
return new TestSimilarity();
|
|
||||||
}
|
|
||||||
});
|
|
||||||
RandomIndexWriter writer = new RandomIndexWriter(random, dir, config);
|
RandomIndexWriter writer = new RandomIndexWriter(random, dir, config);
|
||||||
Document doc = new Document();
|
Document doc = new Document();
|
||||||
Field foo = newField("foo", "", TextField.TYPE_UNSTORED);
|
Field foo = newField("foo", "", TextField.TYPE_UNSTORED);
|
||||||
|
|
|
@ -26,9 +26,8 @@ import org.apache.lucene.document.TextField;
|
||||||
import org.apache.lucene.index.DocValues.Source;
|
import org.apache.lucene.index.DocValues.Source;
|
||||||
import org.apache.lucene.index.DocValues.Type;
|
import org.apache.lucene.index.DocValues.Type;
|
||||||
import org.apache.lucene.search.similarities.DefaultSimilarity;
|
import org.apache.lucene.search.similarities.DefaultSimilarity;
|
||||||
import org.apache.lucene.search.similarities.DefaultSimilarityProvider;
|
import org.apache.lucene.search.similarities.PerFieldSimilarityWrapper;
|
||||||
import org.apache.lucene.search.similarities.Similarity;
|
import org.apache.lucene.search.similarities.Similarity;
|
||||||
import org.apache.lucene.search.similarities.SimilarityProvider;
|
|
||||||
import org.apache.lucene.store.Directory;
|
import org.apache.lucene.store.Directory;
|
||||||
import org.apache.lucene.store.MockDirectoryWrapper;
|
import org.apache.lucene.store.MockDirectoryWrapper;
|
||||||
import org.apache.lucene.util.LineFileDocs;
|
import org.apache.lucene.util.LineFileDocs;
|
||||||
|
@ -62,12 +61,7 @@ public class TestNorms extends LuceneTestCase {
|
||||||
public void testCustomEncoder() throws Exception {
|
public void testCustomEncoder() throws Exception {
|
||||||
Directory dir = newDirectory();
|
Directory dir = newDirectory();
|
||||||
IndexWriterConfig config = newIndexWriterConfig(TEST_VERSION_CURRENT, new MockAnalyzer(random));
|
IndexWriterConfig config = newIndexWriterConfig(TEST_VERSION_CURRENT, new MockAnalyzer(random));
|
||||||
config.setSimilarityProvider(new DefaultSimilarityProvider() {
|
config.setSimilarity(new CustomNormEncodingSimilarity());
|
||||||
@Override
|
|
||||||
public Similarity get(String field) {
|
|
||||||
return new CustomNormEncodingSimilarity();
|
|
||||||
}
|
|
||||||
});
|
|
||||||
RandomIndexWriter writer = new RandomIndexWriter(random, dir, config);
|
RandomIndexWriter writer = new RandomIndexWriter(random, dir, config);
|
||||||
Document doc = new Document();
|
Document doc = new Document();
|
||||||
Field foo = newField("foo", "", TextField.TYPE_UNSTORED);
|
Field foo = newField("foo", "", TextField.TYPE_UNSTORED);
|
||||||
|
@ -182,8 +176,8 @@ public class TestNorms extends LuceneTestCase {
|
||||||
CorruptIndexException {
|
CorruptIndexException {
|
||||||
IndexWriterConfig config = newIndexWriterConfig(TEST_VERSION_CURRENT,
|
IndexWriterConfig config = newIndexWriterConfig(TEST_VERSION_CURRENT,
|
||||||
new MockAnalyzer(random));
|
new MockAnalyzer(random));
|
||||||
SimilarityProvider provider = new MySimProvider(writeNorms);
|
Similarity provider = new MySimProvider(writeNorms);
|
||||||
config.setSimilarityProvider(provider);
|
config.setSimilarity(provider);
|
||||||
RandomIndexWriter writer = new RandomIndexWriter(random, dir, config);
|
RandomIndexWriter writer = new RandomIndexWriter(random, dir, config);
|
||||||
final LineFileDocs docs = new LineFileDocs(random);
|
final LineFileDocs docs = new LineFileDocs(random);
|
||||||
int num = atLeast(100);
|
int num = atLeast(100);
|
@@ -205,8 +199,8 @@ public class TestNorms extends LuceneTestCase {
 }
 
 
-public class MySimProvider implements SimilarityProvider {
-  SimilarityProvider delegate = new DefaultSimilarityProvider();
+public class MySimProvider extends PerFieldSimilarityWrapper {
+  Similarity delegate = new DefaultSimilarity();
   private boolean writeNorms;
   public MySimProvider(boolean writeNorms) {
     this.writeNorms = writeNorms;
@@ -222,7 +216,7 @@ public class TestNorms extends LuceneTestCase {
   if (byteTestField.equals(field)) {
     return new ByteEncodingBoostSimilarity(writeNorms);
   } else {
-    return delegate.get(field);
+    return delegate;
   }
 }
 
@@ -31,28 +31,23 @@ import org.apache.lucene.document.TextField;
 import org.apache.lucene.search.*;
 import org.apache.lucene.search.BooleanClause.Occur;
 import org.apache.lucene.search.similarities.Similarity;
-import org.apache.lucene.search.similarities.SimilarityProvider;
 import org.apache.lucene.search.similarities.TFIDFSimilarity;
 import org.apache.lucene.store.Directory;
 
 
 public class TestOmitTf extends LuceneTestCase {
 
-  public static class SimpleSimilarityProvider implements SimilarityProvider {
+  public static class SimpleSimilarity extends TFIDFSimilarity {
     public float queryNorm(float sumOfSquaredWeights) { return 1.0f; }
     public float coord(int overlap, int maxOverlap) { return 1.0f; }
-    public Similarity get(String field) {
-      return new TFIDFSimilarity() {
-        @Override public void computeNorm(FieldInvertState state, Norm norm) { norm.setByte(encodeNormValue(state.getBoost())); }
-        @Override public float tf(float freq) { return freq; }
-        @Override public float sloppyFreq(int distance) { return 2.0f; }
-        @Override public float idf(long docFreq, long numDocs) { return 1.0f; }
-        @Override public Explanation idfExplain(CollectionStatistics collectionStats, TermStatistics[] termStats) {
-          return new Explanation(1.0f, "Inexplicable");
-        }
-        @Override public float scorePayload(int doc, int start, int end, BytesRef payload) { return 1.0f; }
-      };
+    @Override public void computeNorm(FieldInvertState state, Norm norm) { norm.setByte(encodeNormValue(state.getBoost())); }
+    @Override public float tf(float freq) { return freq; }
+    @Override public float sloppyFreq(int distance) { return 2.0f; }
+    @Override public float idf(long docFreq, long numDocs) { return 1.0f; }
+    @Override public Explanation idfExplain(CollectionStatistics collectionStats, TermStatistics[] termStats) {
+      return new Explanation(1.0f, "Inexplicable");
     }
+    @Override public float scorePayload(int doc, int start, int end, BytesRef payload) { return 1.0f; }
   }
 
   private static final FieldType omitType = new FieldType(TextField.TYPE_UNSTORED);
@@ -257,7 +252,7 @@ public class TestOmitTf extends LuceneTestCase {
   dir,
   newIndexWriterConfig(TEST_VERSION_CURRENT, analyzer).
       setMaxBufferedDocs(2).
-      setSimilarityProvider(new SimpleSimilarityProvider()).
+      setSimilarity(new SimpleSimilarity()).
       setMergePolicy(newLogMergePolicy(2))
 );
 
@@ -286,7 +281,7 @@ public class TestOmitTf extends LuceneTestCase {
  */
 IndexReader reader = IndexReader.open(dir);
 IndexSearcher searcher = new IndexSearcher(reader);
-searcher.setSimilarityProvider(new SimpleSimilarityProvider());
+searcher.setSimilarity(new SimpleSimilarity());
 
 Term a = new Term("noTf", term);
 Term b = new Term("tf", term);
@@ -26,7 +26,6 @@ import org.apache.lucene.document.Document;
 import org.apache.lucene.document.Field;
 import org.apache.lucene.document.TextField;
 import org.apache.lucene.search.similarities.DefaultSimilarity;
-import org.apache.lucene.search.similarities.DefaultSimilarityProvider;
 import org.apache.lucene.search.similarities.Similarity;
 import org.apache.lucene.store.Directory;
 import org.apache.lucene.util.LuceneTestCase;
@@ -47,12 +46,7 @@ public class TestUniqueTermCount extends LuceneTestCase {
 dir = newDirectory();
 IndexWriterConfig config = newIndexWriterConfig(TEST_VERSION_CURRENT,
     new MockAnalyzer(random, MockTokenizer.SIMPLE, true)).setMergePolicy(newLogMergePolicy());
-config.setSimilarityProvider(new DefaultSimilarityProvider() {
-  @Override
-  public Similarity get(String field) {
-    return new TestSimilarity();
-  }
-});
+config.setSimilarity(new TestSimilarity());
 RandomIndexWriter writer = new RandomIndexWriter(random, dir, config);
 Document doc = new Document();
 Field foo = newField("foo", "", TextField.TYPE_UNSTORED);
@@ -22,7 +22,6 @@ import java.io.IOException;
 import org.apache.lucene.index.AtomicReaderContext;
 import org.apache.lucene.index.Norm;
 import org.apache.lucene.search.similarities.Similarity;
-import org.apache.lucene.search.similarities.SimilarityProvider;
 import org.apache.lucene.util.Bits;
 import org.apache.lucene.util.BytesRef;
 import org.apache.lucene.index.FieldInvertState;
@@ -266,21 +265,6 @@ final class JustCompileSearch {
     throw new UnsupportedOperationException(UNSUPPORTED_MSG);
   }
 }
 
-static final class JustCompileSimilarityProvider implements SimilarityProvider {
-
-  public float queryNorm(float sumOfSquaredWeights) {
-    throw new UnsupportedOperationException(UNSUPPORTED_MSG);
-  }
-
-  public float coord(int overlap, int maxOverlap) {
-    throw new UnsupportedOperationException(UNSUPPORTED_MSG);
-  }
-
-  public Similarity get(String field) {
-    throw new UnsupportedOperationException(UNSUPPORTED_MSG);
-  }
-}
-
 static final class JustCompileTopDocsCollector extends TopDocsCollector<ScoreDoc> {
 
@@ -26,8 +26,8 @@ import org.apache.lucene.document.TextField;
 import org.apache.lucene.index.RandomIndexWriter;
 import org.apache.lucene.index.Term;
 import org.apache.lucene.index.IndexReader;
-import org.apache.lucene.search.similarities.DefaultSimilarityProvider;
-import org.apache.lucene.search.similarities.SimilarityProvider;
+import org.apache.lucene.search.similarities.DefaultSimilarity;
+import org.apache.lucene.search.similarities.Similarity;
 import org.apache.lucene.store.Directory;
 import org.apache.lucene.store.IOContext;
 import org.apache.lucene.store.MockDirectoryWrapper;
@@ -230,9 +230,9 @@ public class TestBoolean2 extends LuceneTestCase {
 query.add(new TermQuery(new Term(field, "zz")), BooleanClause.Occur.SHOULD);
 
 int[] expDocNrs = {2, 3};
-SimilarityProvider oldSimilarity = searcher.getSimilarityProvider();
+Similarity oldSimilarity = searcher.getSimilarity();
 try {
-  searcher.setSimilarityProvider(new DefaultSimilarityProvider(){
+  searcher.setSimilarity(new DefaultSimilarity(){
     @Override
     public float coord(int overlap, int maxOverlap) {
       return overlap / ((float)maxOverlap - 1);
@@ -240,7 +240,7 @@ public class TestBoolean2 extends LuceneTestCase {
     });
     queriesTest(query, expDocNrs);
   } finally {
-    searcher.setSimilarityProvider(oldSimilarity);
+    searcher.setSimilarity(oldSimilarity);
   }
 }
 
@@ -30,9 +30,8 @@ import org.apache.lucene.index.IndexReader;
 import org.apache.lucene.index.MultiReader;
 import org.apache.lucene.index.RandomIndexWriter;
 import org.apache.lucene.index.Term;
-import org.apache.lucene.search.similarities.DefaultSimilarityProvider;
+import org.apache.lucene.search.similarities.DefaultSimilarity;
 import org.apache.lucene.search.similarities.Similarity;
-import org.apache.lucene.search.similarities.SimilarityProvider;
 import org.apache.lucene.store.Directory;
 import org.apache.lucene.util.LuceneTestCase;
 import org.apache.lucene.util.NamedThreadFactory;
@@ -81,18 +80,7 @@ public class TestBooleanQuery extends LuceneTestCase {
 IndexSearcher s = newSearcher(r);
 // this test relies upon coord being the default implementation,
 // otherwise scores are different!
-final SimilarityProvider delegate = s.getSimilarityProvider();
-s.setSimilarityProvider(new DefaultSimilarityProvider() {
-  @Override
-  public float queryNorm(float sumOfSquaredWeights) {
-    return delegate.queryNorm(sumOfSquaredWeights);
-  }
-
-  @Override
-  public Similarity get(String field) {
-    return delegate.get(field);
-  }
-});
+s.setSimilarity(new DefaultSimilarity());
 
 BooleanQuery q = new BooleanQuery();
 q.add(new TermQuery(new Term("field", "a")), BooleanClause.Occur.SHOULD);
@@ -19,7 +19,7 @@ package org.apache.lucene.search;
 
 import org.apache.lucene.index.Term;
 import org.apache.lucene.search.BooleanClause.Occur;
-import org.apache.lucene.search.similarities.DefaultSimilarityProvider;
+import org.apache.lucene.search.similarities.DefaultSimilarity;
 import org.apache.lucene.search.spans.*;
 
 /**
@@ -36,18 +36,18 @@ public class TestComplexExplanations extends TestExplanations {
 @Override
 public void setUp() throws Exception {
   super.setUp();
-  searcher.setSimilarityProvider(createQnorm1Similarity());
+  searcher.setSimilarity(createQnorm1Similarity());
 }
 
 @Override
 public void tearDown() throws Exception {
-  searcher.setSimilarityProvider(IndexSearcher.getDefaultSimilarityProvider());
+  searcher.setSimilarity(IndexSearcher.getDefaultSimilarity());
   super.tearDown();
 }
 
 // must be static for weight serialization tests
-private static DefaultSimilarityProvider createQnorm1Similarity() {
-  return new DefaultSimilarityProvider() {
+private static DefaultSimilarity createQnorm1Similarity() {
+  return new DefaultSimilarity() {
     @Override
     public float queryNorm(float sumOfSquaredWeights) {
       return 1.0f; // / (float) Math.sqrt(1.0f + sumOfSquaredWeights);
@@ -23,7 +23,7 @@ import org.apache.lucene.index.AtomicReaderContext;
 import org.apache.lucene.index.IndexReader;
 import org.apache.lucene.index.RandomIndexWriter;
 import org.apache.lucene.index.Term;
-import org.apache.lucene.search.similarities.DefaultSimilarityProvider;
+import org.apache.lucene.search.similarities.DefaultSimilarity;
 import org.apache.lucene.store.Directory;
 import org.apache.lucene.util.LuceneTestCase;
 
@@ -98,7 +98,7 @@ public class TestConstantScoreQuery extends LuceneTestCase {
 searcher = newSearcher(reader);
 
 // set a similarity that does not normalize our boost away
-searcher.setSimilarityProvider(new DefaultSimilarityProvider() {
+searcher.setSimilarity(new DefaultSimilarity() {
   @Override
   public float queryNorm(float sumOfSquaredWeights) {
     return 1.0f;
@@ -30,9 +30,7 @@ import org.apache.lucene.index.FieldInvertState;
 import org.apache.lucene.index.RandomIndexWriter;
 import org.apache.lucene.index.Term;
 import org.apache.lucene.search.similarities.DefaultSimilarity;
-import org.apache.lucene.search.similarities.DefaultSimilarityProvider;
 import org.apache.lucene.search.similarities.Similarity;
-import org.apache.lucene.search.similarities.SimilarityProvider;
 import org.apache.lucene.store.Directory;
 
 import java.text.DecimalFormat;
@@ -78,12 +76,7 @@ public class TestDisjunctionMaxQuery extends LuceneTestCase {
   }
 }
 
-public SimilarityProvider sim = new DefaultSimilarityProvider() {
-  @Override
-  public Similarity get(String field) {
-    return new TestSimilarity();
-  }
-};
+public Similarity sim = new TestSimilarity();
 public Directory index;
 public IndexReader r;
 public IndexSearcher s;
@@ -100,7 +93,7 @@ public class TestDisjunctionMaxQuery extends LuceneTestCase {
 index = newDirectory();
 RandomIndexWriter writer = new RandomIndexWriter(random, index,
     newIndexWriterConfig( TEST_VERSION_CURRENT, new MockAnalyzer(random))
-    .setSimilarityProvider(sim).setMergePolicy(newLogMergePolicy()));
+    .setSimilarity(sim).setMergePolicy(newLogMergePolicy()));
 
 // hed is the most important field, dek is secondary
 
@@ -159,7 +152,7 @@ public class TestDisjunctionMaxQuery extends LuceneTestCase {
   r = SlowCompositeReaderWrapper.wrap(writer.getReader());
   writer.close();
   s = newSearcher(r);
-  s.setSimilarityProvider(sim);
+  s.setSimilarity(sim);
 }
 
 @Override
@@ -32,8 +32,8 @@ import org.apache.lucene.index.IndexReader;
 import org.apache.lucene.index.Norm;
 import org.apache.lucene.index.RandomIndexWriter;
 import org.apache.lucene.index.Term;
+import org.apache.lucene.search.similarities.PerFieldSimilarityWrapper;
 import org.apache.lucene.search.similarities.Similarity;
-import org.apache.lucene.search.similarities.SimilarityProvider;
 import org.apache.lucene.store.Directory;
 import org.apache.lucene.util.BytesRef;
 import org.apache.lucene.util.LuceneTestCase;
@@ -74,14 +74,15 @@ public class TestDocValuesScoring extends LuceneTestCase {
 
 // no boosting
 IndexSearcher searcher1 = newSearcher(ir);
-final SimilarityProvider base = searcher1.getSimilarityProvider();
+final Similarity base = searcher1.getSimilarity();
 // boosting
 IndexSearcher searcher2 = newSearcher(ir);
-searcher2.setSimilarityProvider(new SimilarityProvider() {
-  final Similarity fooSim = new BoostingSimilarity(base.get("foo"), "foo_boost");
+searcher2.setSimilarity(new PerFieldSimilarityWrapper() {
+  final Similarity fooSim = new BoostingSimilarity(base, "foo_boost");
 
+  @Override
   public Similarity get(String field) {
-    return "foo".equals(field) ? fooSim : base.get(field);
+    return "foo".equals(field) ? fooSim : base;
   }
 
   @Override
@@ -22,7 +22,7 @@ import org.apache.lucene.document.Document;
 import org.apache.lucene.document.TextField;
 import org.apache.lucene.index.*;
 import org.apache.lucene.search.FieldValueHitQueue.Entry;
-import org.apache.lucene.search.similarities.DefaultSimilarityProvider;
+import org.apache.lucene.search.similarities.DefaultSimilarity;
 import org.apache.lucene.store.*;
 import org.apache.lucene.util.LuceneTestCase;
 import org.apache.lucene.util.BytesRef;
@@ -42,7 +42,7 @@ public class TestElevationComparator extends LuceneTestCase {
   newIndexWriterConfig(TEST_VERSION_CURRENT, new MockAnalyzer(random)).
       setMaxBufferedDocs(2).
       setMergePolicy(newLogMergePolicy(1000)).
-      setSimilarityProvider(new DefaultSimilarityProvider())
+      setSimilarity(new DefaultSimilarity())
 );
 writer.addDocument(adoc(new String[] {"id", "a", "title", "ipod", "str_s", "a"}));
 writer.addDocument(adoc(new String[] {"id", "b", "title", "ipod ipod", "str_s", "b"}));
@@ -55,7 +55,7 @@ public class TestElevationComparator extends LuceneTestCase {
 writer.close();
 
 IndexSearcher searcher = newSearcher(r);
-searcher.setSimilarityProvider(new DefaultSimilarityProvider());
+searcher.setSimilarity(new DefaultSimilarity());
 
 runTest(searcher, true);
 runTest(searcher, false);
@@ -29,9 +29,7 @@ import org.apache.lucene.document.TextField;
 import org.apache.lucene.index.IndexReader;
 import org.apache.lucene.index.RandomIndexWriter;
 import org.apache.lucene.index.Term;
-import org.apache.lucene.search.similarities.DefaultSimilarityProvider;
-import org.apache.lucene.search.similarities.Similarity;
-import org.apache.lucene.search.similarities.SimilarityProvider;
+import org.apache.lucene.search.similarities.DefaultSimilarity;
 import org.apache.lucene.store.Directory;
 import org.apache.lucene.util.LuceneTestCase;
 
@@ -107,18 +105,7 @@ public class TestFuzzyQuery2 extends LuceneTestCase {
 }
 // even though this uses a boost-only rewrite, this test relies upon queryNorm being the default implementation,
 // otherwise scores are different!
-final SimilarityProvider delegate = searcher.getSimilarityProvider();
-searcher.setSimilarityProvider(new DefaultSimilarityProvider() {
-  @Override
-  public float coord(int overlap, int maxOverlap) {
-    return delegate.coord(overlap, maxOverlap);
-  }
-
-  @Override
-  public Similarity get(String field) {
-    return delegate.get(field);
-  }
-});
+searcher.setSimilarity(new DefaultSimilarity());
 
 writer.close();
 String line;
@@ -42,8 +42,6 @@ import org.apache.lucene.index.Term;
 import org.apache.lucene.index.TermsEnum;
 import org.apache.lucene.search.IndexSearcher;
 import org.apache.lucene.search.similarities.DefaultSimilarity;
-import org.apache.lucene.search.similarities.DefaultSimilarityProvider;
-import org.apache.lucene.search.similarities.Similarity;
 import org.apache.lucene.store.Directory;
 import org.apache.lucene.store.RAMDirectory;
 import org.apache.lucene.util.BytesRef;
@@ -307,17 +305,11 @@ public class TestMultiPhraseQuery extends LuceneTestCase {
 
 IndexReader reader = writer.getReader();
 IndexSearcher searcher = newSearcher(reader);
-searcher.setSimilarityProvider(new DefaultSimilarityProvider() {
+searcher.setSimilarity(new DefaultSimilarity() {
   @Override
-  public Similarity get(String field) {
-    return new DefaultSimilarity() {
-
-      @Override
-      public Explanation idfExplain(CollectionStatistics collectionStats, TermStatistics termStats[]) {
-        return new Explanation(10f, "just a test");
-      }
-    };
-  }
+  public Explanation idfExplain(CollectionStatistics collectionStats, TermStatistics termStats[]) {
+    return new Explanation(10f, "just a test");
+  }
 });
 
 MultiPhraseQuery query = new MultiPhraseQuery();
@@ -26,9 +26,8 @@ import org.apache.lucene.index.AtomicReaderContext;
 import org.apache.lucene.index.IndexReader;
 import org.apache.lucene.index.RandomIndexWriter;
 import org.apache.lucene.index.Term;
-import org.apache.lucene.search.similarities.DefaultSimilarityProvider;
+import org.apache.lucene.search.similarities.DefaultSimilarity;
 import org.apache.lucene.search.similarities.Similarity;
-import org.apache.lucene.search.similarities.SimilarityProvider;
 import org.apache.lucene.store.Directory;
 import org.junit.AfterClass;
 import org.junit.BeforeClass;
@@ -172,18 +171,7 @@ public class TestMultiTermConstantScore extends BaseTestRangeFilter {
 // test for correct application of query normalization
 // must use a non score normalizing method for this.
 
-final SimilarityProvider delegate = search.getSimilarityProvider();
-search.setSimilarityProvider(new DefaultSimilarityProvider() {
-  @Override
-  public float coord(int overlap, int maxOverlap) {
-    return delegate.coord(overlap, maxOverlap);
-  }
-
-  @Override
-  public Similarity get(String field) {
-    return delegate.get(field);
-  }
-});
+search.setSimilarity(new DefaultSimilarity());
 Query q = csrq("data", "1", "6", T, T);
 q.setBoost(100);
 search.search(q, null, new Collector() {
@@ -23,7 +23,7 @@ import org.apache.lucene.analysis.tokenattributes.*;
 import org.apache.lucene.document.*;
 import org.apache.lucene.index.*;
 import org.apache.lucene.index.IndexWriterConfig.OpenMode;
-import org.apache.lucene.search.similarities.DefaultSimilarityProvider;
+import org.apache.lucene.search.similarities.DefaultSimilarity;
 import org.apache.lucene.store.*;
 import org.apache.lucene.util.Version;
 import org.apache.lucene.util._TestUtil;
@@ -342,7 +342,7 @@ public class TestPhraseQuery extends LuceneTestCase {
 RandomIndexWriter writer = new RandomIndexWriter(random, directory,
     newIndexWriterConfig(TEST_VERSION_CURRENT, new MockAnalyzer(random))
     .setMergePolicy(newLogMergePolicy())
-    .setSimilarityProvider(new DefaultSimilarityProvider()));
+    .setSimilarity(new DefaultSimilarity()));
 
 Document doc = new Document();
 doc.add(newField("field", "foo firstname lastname foo", TextField.TYPE_STORED));
@@ -360,7 +360,7 @@ public class TestPhraseQuery extends LuceneTestCase {
 writer.close();
 
 IndexSearcher searcher = newSearcher(reader);
-searcher.setSimilarityProvider(new DefaultSimilarityProvider());
+searcher.setSimilarity(new DefaultSimilarity());
 PhraseQuery query = new PhraseQuery();
 query.add(new Term("field", "firstname"));
 query.add(new Term("field", "lastname"));
@@ -29,7 +29,6 @@ import org.apache.lucene.index.RandomIndexWriter;
 import org.apache.lucene.index.Term;
 import org.apache.lucene.search.similarities.DefaultSimilarity;
 import org.apache.lucene.search.similarities.Similarity;
-import org.apache.lucene.search.similarities.SimilarityProvider;
 import org.apache.lucene.store.Directory;
 import org.apache.lucene.analysis.MockAnalyzer;
 import org.apache.lucene.document.Document;
@@ -41,19 +40,15 @@ import org.apache.lucene.document.TextField;
  */
 public class TestSimilarity extends LuceneTestCase {
 
-  public static class SimpleSimilarityProvider implements SimilarityProvider {
+  public static class SimpleSimilarity extends DefaultSimilarity {
     public float queryNorm(float sumOfSquaredWeights) { return 1.0f; }
     public float coord(int overlap, int maxOverlap) { return 1.0f; }
-    public Similarity get(String field) {
-      return new DefaultSimilarity() {
-        @Override public void computeNorm(FieldInvertState state, Norm norm) { norm.setByte(encodeNormValue(state.getBoost())); }
-        @Override public float tf(float freq) { return freq; }
-        @Override public float sloppyFreq(int distance) { return 2.0f; }
-        @Override public float idf(long docFreq, long numDocs) { return 1.0f; }
-        @Override public Explanation idfExplain(CollectionStatistics collectionStats, TermStatistics[] stats) {
-          return new Explanation(1.0f, "Inexplicable");
-        }
-      };
+    @Override public void computeNorm(FieldInvertState state, Norm norm) { norm.setByte(encodeNormValue(state.getBoost())); }
+    @Override public float tf(float freq) { return freq; }
+    @Override public float sloppyFreq(int distance) { return 2.0f; }
+    @Override public float idf(long docFreq, long numDocs) { return 1.0f; }
+    @Override public Explanation idfExplain(CollectionStatistics collectionStats, TermStatistics[] stats) {
+      return new Explanation(1.0f, "Inexplicable");
     }
   }
 
||||||
|
@@ -61,7 +56,7 @@ public class TestSimilarity extends LuceneTestCase {
|
||||||
Directory store = newDirectory();
|
Directory store = newDirectory();
|
||||||
RandomIndexWriter writer = new RandomIndexWriter(random, store,
|
RandomIndexWriter writer = new RandomIndexWriter(random, store,
|
||||||
newIndexWriterConfig( TEST_VERSION_CURRENT, new MockAnalyzer(random))
|
newIndexWriterConfig( TEST_VERSION_CURRENT, new MockAnalyzer(random))
|
||||||
.setSimilarityProvider(new SimpleSimilarityProvider()));
|
.setSimilarity(new SimpleSimilarity()));
|
||||||
|
|
||||||
Document d1 = new Document();
|
Document d1 = new Document();
|
||||||
d1.add(newField("field", "a c", TextField.TYPE_STORED));
|
d1.add(newField("field", "a c", TextField.TYPE_STORED));
|
||||||
|
@@ -75,7 +70,7 @@ public class TestSimilarity extends LuceneTestCase {
|
||||||
writer.close();
|
writer.close();
|
||||||
|
|
||||||
IndexSearcher searcher = newSearcher(reader);
|
IndexSearcher searcher = newSearcher(reader);
|
||||||
searcher.setSimilarityProvider(new SimpleSimilarityProvider());
|
searcher.setSimilarity(new SimpleSimilarity());
|
||||||
|
|
||||||
Term a = new Term("field", "a");
|
Term a = new Term("field", "a");
|
||||||
Term b = new Term("field", "b");
|
Term b = new Term("field", "b");
|
||||||
|
|
|
@@ -28,8 +28,8 @@ import org.apache.lucene.index.MultiDocValues;
|
||||||
import org.apache.lucene.index.Norm;
|
import org.apache.lucene.index.Norm;
|
||||||
import org.apache.lucene.index.RandomIndexWriter;
|
import org.apache.lucene.index.RandomIndexWriter;
|
||||||
import org.apache.lucene.index.Term;
|
import org.apache.lucene.index.Term;
|
||||||
|
import org.apache.lucene.search.similarities.PerFieldSimilarityWrapper;
|
||||||
import org.apache.lucene.search.similarities.Similarity;
|
import org.apache.lucene.search.similarities.Similarity;
|
||||||
import org.apache.lucene.search.similarities.SimilarityProvider;
|
|
||||||
import org.apache.lucene.search.similarities.TFIDFSimilarity;
|
import org.apache.lucene.search.similarities.TFIDFSimilarity;
|
||||||
import org.apache.lucene.store.Directory;
|
import org.apache.lucene.store.Directory;
|
||||||
import org.apache.lucene.util.BytesRef;
|
import org.apache.lucene.util.BytesRef;
|
||||||
|
@@ -44,9 +44,9 @@ public class TestSimilarityProvider extends LuceneTestCase {
|
||||||
public void setUp() throws Exception {
|
public void setUp() throws Exception {
|
||||||
super.setUp();
|
super.setUp();
|
||||||
directory = newDirectory();
|
directory = newDirectory();
|
||||||
SimilarityProvider sim = new ExampleSimilarityProvider();
|
PerFieldSimilarityWrapper sim = new ExampleSimilarityProvider();
|
||||||
IndexWriterConfig iwc = newIndexWriterConfig(TEST_VERSION_CURRENT,
|
IndexWriterConfig iwc = newIndexWriterConfig(TEST_VERSION_CURRENT,
|
||||||
new MockAnalyzer(random)).setSimilarityProvider(sim);
|
new MockAnalyzer(random)).setSimilarity(sim);
|
||||||
RandomIndexWriter iw = new RandomIndexWriter(random, directory, iwc);
|
RandomIndexWriter iw = new RandomIndexWriter(random, directory, iwc);
|
||||||
Document doc = new Document();
|
Document doc = new Document();
|
||||||
Field field = newField("foo", "", TextField.TYPE_UNSTORED);
|
Field field = newField("foo", "", TextField.TYPE_UNSTORED);
|
||||||
|
@@ -63,7 +63,7 @@ public class TestSimilarityProvider extends LuceneTestCase {
|
||||||
reader = iw.getReader();
|
reader = iw.getReader();
|
||||||
iw.close();
|
iw.close();
|
||||||
searcher = newSearcher(reader);
|
searcher = newSearcher(reader);
|
||||||
searcher.setSimilarityProvider(sim);
|
searcher.setSimilarity(sim);
|
||||||
}
|
}
|
||||||
|
|
||||||
@Override
|
@Override
|
||||||
|
@@ -90,18 +90,10 @@ public class TestSimilarityProvider extends LuceneTestCase {
|
||||||
assertTrue(foodocs.scoreDocs[0].score < bardocs.scoreDocs[0].score);
|
assertTrue(foodocs.scoreDocs[0].score < bardocs.scoreDocs[0].score);
|
||||||
}
|
}
|
||||||
|
|
||||||
private class ExampleSimilarityProvider implements SimilarityProvider {
|
private class ExampleSimilarityProvider extends PerFieldSimilarityWrapper {
|
||||||
private Similarity sim1 = new Sim1();
|
private Similarity sim1 = new Sim1();
|
||||||
private Similarity sim2 = new Sim2();
|
private Similarity sim2 = new Sim2();
|
||||||
|
|
||||||
public float coord(int overlap, int maxOverlap) {
|
|
||||||
return 1f;
|
|
||||||
}
|
|
||||||
|
|
||||||
public float queryNorm(float sumOfSquaredWeights) {
|
|
||||||
return 1f;
|
|
||||||
}
|
|
||||||
|
|
||||||
public Similarity get(String field) {
|
public Similarity get(String field) {
|
||||||
if (field.equals("foo")) {
|
if (field.equals("foo")) {
|
||||||
return sim1;
|
return sim1;
|
||||||
|
@@ -112,6 +104,14 @@ public class TestSimilarityProvider extends LuceneTestCase {
|
||||||
}
|
}
|
||||||
|
|
||||||
private class Sim1 extends TFIDFSimilarity {
|
private class Sim1 extends TFIDFSimilarity {
|
||||||
|
|
||||||
|
public float coord(int overlap, int maxOverlap) {
|
||||||
|
return 1f;
|
||||||
|
}
|
||||||
|
|
||||||
|
public float queryNorm(float sumOfSquaredWeights) {
|
||||||
|
return 1f;
|
||||||
|
}
|
||||||
|
|
||||||
@Override
|
@Override
|
||||||
public void computeNorm(FieldInvertState state, Norm norm) {
|
public void computeNorm(FieldInvertState state, Norm norm) {
|
||||||
|
@@ -141,6 +141,14 @@ public class TestSimilarityProvider extends LuceneTestCase {
|
||||||
|
|
||||||
private class Sim2 extends TFIDFSimilarity {
|
private class Sim2 extends TFIDFSimilarity {
|
||||||
|
|
||||||
|
public float coord(int overlap, int maxOverlap) {
|
||||||
|
return 1f;
|
||||||
|
}
|
||||||
|
|
||||||
|
public float queryNorm(float sumOfSquaredWeights) {
|
||||||
|
return 1f;
|
||||||
|
}
|
||||||
|
|
||||||
@Override
|
@Override
|
||||||
public void computeNorm(FieldInvertState state, Norm norm) {
|
public void computeNorm(FieldInvertState state, Norm norm) {
|
||||||
norm.setByte(encodeNormValue(10f));
|
norm.setByte(encodeNormValue(10f));
|
||||||
|
|
|
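Reviewer note: the TestSimilarityProvider hunks above turn ExampleSimilarityProvider into a subclass of PerFieldSimilarityWrapper, so the per-field dispatcher is itself handed to setSimilarity(). The delegation idea can be sketched in plain Java as follows. This is only an illustration: ToySimilarity, ToyPerFieldWrapper, and ExampleWrapper are hypothetical stand-ins invented for this note, not Lucene classes, and the real Similarity API is much larger.

```java
// Hypothetical stand-ins only; none of these types exist in Lucene.
class PerFieldSketch {
  public static void main(String[] args) {
    ToyPerFieldWrapper sim = new ExampleWrapper();
    // The caller sees one object; the field name decides which delegate scores.
    System.out.println("foo -> " + sim.score("foo", 3f));
    System.out.println("bar -> " + sim.score("bar", 3f));
  }
}

/** Toy scoring contract (assumption: a single method is enough for the sketch). */
abstract class ToySimilarity {
  abstract float score(String field, float freq);
}

/** A wrapper that is itself a ToySimilarity and forwards to a per-field delegate. */
abstract class ToyPerFieldWrapper extends ToySimilarity {
  /** Subclasses choose the delegate for a field, like get(String field) in the diff. */
  abstract ToySimilarity get(String field);

  @Override
  final float score(String field, float freq) {
    return get(field).score(field, freq);
  }
}

/** Mirrors the shape of the test class: one delegate for "foo", another for the rest. */
class ExampleWrapper extends ToyPerFieldWrapper {
  private final ToySimilarity sim1 = new ToySimilarity() {
    @Override
    float score(String field, float freq) {
      return freq; // low constant weight
    }
  };
  private final ToySimilarity sim2 = new ToySimilarity() {
    @Override
    float score(String field, float freq) {
      return 10f * freq; // higher constant weight
    }
  };

  @Override
  ToySimilarity get(String field) {
    return field.equals("foo") ? sim1 : sim2;
  }
}
```

Because the wrapper extends the same base type as its delegates, callers need only one setter, which is the shape the setSimilarityProvider to setSimilarity renames in this commit rely on.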
@@ -29,7 +29,7 @@ import org.apache.lucene.index.IndexReader;
|
||||||
import org.apache.lucene.index.RandomIndexWriter;
|
import org.apache.lucene.index.RandomIndexWriter;
|
||||||
import org.apache.lucene.index.SlowCompositeReaderWrapper;
|
import org.apache.lucene.index.SlowCompositeReaderWrapper;
|
||||||
import org.apache.lucene.index.Term;
|
import org.apache.lucene.index.Term;
|
||||||
import org.apache.lucene.search.similarities.DefaultSimilarityProvider;
|
import org.apache.lucene.search.similarities.DefaultSimilarity;
|
||||||
import org.apache.lucene.store.Directory;
|
import org.apache.lucene.store.Directory;
|
||||||
import org.apache.lucene.util.LuceneTestCase;
|
import org.apache.lucene.util.LuceneTestCase;
|
||||||
|
|
||||||
|
@@ -50,7 +50,7 @@ public class TestTermScorer extends LuceneTestCase {
|
||||||
RandomIndexWriter writer = new RandomIndexWriter(random, directory,
|
RandomIndexWriter writer = new RandomIndexWriter(random, directory,
|
||||||
newIndexWriterConfig(TEST_VERSION_CURRENT, new MockAnalyzer(random))
|
newIndexWriterConfig(TEST_VERSION_CURRENT, new MockAnalyzer(random))
|
||||||
.setMergePolicy(newLogMergePolicy())
|
.setMergePolicy(newLogMergePolicy())
|
||||||
.setSimilarityProvider(new DefaultSimilarityProvider()));
|
.setSimilarity(new DefaultSimilarity()));
|
||||||
for (int i = 0; i < values.length; i++) {
|
for (int i = 0; i < values.length; i++) {
|
||||||
Document doc = new Document();
|
Document doc = new Document();
|
||||||
doc
|
doc
|
||||||
|
@@ -60,7 +60,7 @@ public class TestTermScorer extends LuceneTestCase {
|
||||||
indexReader = SlowCompositeReaderWrapper.wrap(writer.getReader());
|
indexReader = SlowCompositeReaderWrapper.wrap(writer.getReader());
|
||||||
writer.close();
|
writer.close();
|
||||||
indexSearcher = newSearcher(indexReader);
|
indexSearcher = newSearcher(indexReader);
|
||||||
indexSearcher.setSimilarityProvider(new DefaultSimilarityProvider());
|
indexSearcher.setSimilarity(new DefaultSimilarity());
|
||||||
}
|
}
|
||||||
|
|
||||||
@Override
|
@Override
|
||||||
|
|
|
@@ -29,7 +29,7 @@ import org.apache.lucene.document.FieldType;
|
||||||
import org.apache.lucene.document.TextField;
|
import org.apache.lucene.document.TextField;
|
||||||
import org.apache.lucene.index.*;
|
import org.apache.lucene.index.*;
|
||||||
import org.apache.lucene.index.IndexWriterConfig.OpenMode;
|
import org.apache.lucene.index.IndexWriterConfig.OpenMode;
|
||||||
import org.apache.lucene.search.similarities.DefaultSimilarityProvider;
|
import org.apache.lucene.search.similarities.DefaultSimilarity;
|
||||||
import org.apache.lucene.store.Directory;
|
import org.apache.lucene.store.Directory;
|
||||||
import org.apache.lucene.util.English;
|
import org.apache.lucene.util.English;
|
||||||
import org.apache.lucene.util.IOUtils;
|
import org.apache.lucene.util.IOUtils;
|
||||||
|
@@ -242,7 +242,7 @@ public class TestTermVectors extends LuceneTestCase {
|
||||||
newIndexWriterConfig(TEST_VERSION_CURRENT, new MockAnalyzer(random, MockTokenizer.SIMPLE, true))
|
newIndexWriterConfig(TEST_VERSION_CURRENT, new MockAnalyzer(random, MockTokenizer.SIMPLE, true))
|
||||||
.setOpenMode(OpenMode.CREATE)
|
.setOpenMode(OpenMode.CREATE)
|
||||||
.setMergePolicy(newLogMergePolicy())
|
.setMergePolicy(newLogMergePolicy())
|
||||||
.setSimilarityProvider(new DefaultSimilarityProvider()));
|
.setSimilarity(new DefaultSimilarity()));
|
||||||
writer.addDocument(testDoc1);
|
writer.addDocument(testDoc1);
|
||||||
writer.addDocument(testDoc2);
|
writer.addDocument(testDoc2);
|
||||||
writer.addDocument(testDoc3);
|
writer.addDocument(testDoc3);
|
||||||
|
@@ -250,7 +250,7 @@ public class TestTermVectors extends LuceneTestCase {
|
||||||
IndexReader reader = writer.getReader();
|
IndexReader reader = writer.getReader();
|
||||||
writer.close();
|
writer.close();
|
||||||
IndexSearcher knownSearcher = newSearcher(reader);
|
IndexSearcher knownSearcher = newSearcher(reader);
|
||||||
knownSearcher.setSimilarityProvider(new DefaultSimilarityProvider());
|
knownSearcher.setSimilarity(new DefaultSimilarity());
|
||||||
FieldsEnum fields = MultiFields.getFields(knownSearcher.reader).iterator();
|
FieldsEnum fields = MultiFields.getFields(knownSearcher.reader).iterator();
|
||||||
|
|
||||||
DocsEnum docs = null;
|
DocsEnum docs = null;
|
||||||
|
|
|
@@ -29,7 +29,7 @@ import org.apache.lucene.util.English;
|
||||||
import org.apache.lucene.util.LuceneTestCase;
|
import org.apache.lucene.util.LuceneTestCase;
|
||||||
import org.apache.lucene.index.IndexReader;
|
import org.apache.lucene.index.IndexReader;
|
||||||
import org.apache.lucene.search.IndexSearcher;
|
import org.apache.lucene.search.IndexSearcher;
|
||||||
import org.apache.lucene.search.similarities.SimilarityProvider;
|
import org.apache.lucene.search.similarities.Similarity;
|
||||||
import org.apache.lucene.store.Directory;
|
import org.apache.lucene.store.Directory;
|
||||||
import org.apache.lucene.store.MockDirectoryWrapper;
|
import org.apache.lucene.store.MockDirectoryWrapper;
|
||||||
import org.apache.lucene.store.RAMDirectory;
|
import org.apache.lucene.store.RAMDirectory;
|
||||||
|
@@ -115,13 +115,13 @@ public class PayloadHelper {
|
||||||
* @throws IOException
|
* @throws IOException
|
||||||
*/
|
*/
|
||||||
// TODO: randomize
|
// TODO: randomize
|
||||||
public IndexSearcher setUp(Random random, SimilarityProvider similarity, int numDocs) throws IOException {
|
public IndexSearcher setUp(Random random, Similarity similarity, int numDocs) throws IOException {
|
||||||
Directory directory = new MockDirectoryWrapper(random, new RAMDirectory());
|
Directory directory = new MockDirectoryWrapper(random, new RAMDirectory());
|
||||||
PayloadAnalyzer analyzer = new PayloadAnalyzer();
|
PayloadAnalyzer analyzer = new PayloadAnalyzer();
|
||||||
|
|
||||||
// TODO randomize this
|
// TODO randomize this
|
||||||
IndexWriter writer = new IndexWriter(directory, new IndexWriterConfig(
|
IndexWriter writer = new IndexWriter(directory, new IndexWriterConfig(
|
||||||
TEST_VERSION_CURRENT, analyzer).setSimilarityProvider(similarity));
|
TEST_VERSION_CURRENT, analyzer).setSimilarity(similarity));
|
||||||
// writer.infoStream = System.out;
|
// writer.infoStream = System.out;
|
||||||
for (int i = 0; i < numDocs; i++) {
|
for (int i = 0; i < numDocs; i++) {
|
||||||
Document doc = new Document();
|
Document doc = new Document();
|
||||||
|
@@ -134,7 +134,7 @@ public class PayloadHelper {
|
||||||
writer.close();
|
writer.close();
|
||||||
|
|
||||||
IndexSearcher searcher = LuceneTestCase.newSearcher(reader);
|
IndexSearcher searcher = LuceneTestCase.newSearcher(reader);
|
||||||
searcher.setSimilarityProvider(similarity);
|
searcher.setSimilarity(similarity);
|
||||||
return searcher;
|
return searcher;
|
||||||
}
|
}
|
||||||
|
|
||||||
|
|
|
@@ -19,7 +19,6 @@ package org.apache.lucene.search.payloads;
|
||||||
|
|
||||||
import org.apache.lucene.index.Term;
|
import org.apache.lucene.index.Term;
|
||||||
import org.apache.lucene.search.similarities.DefaultSimilarity;
|
import org.apache.lucene.search.similarities.DefaultSimilarity;
|
||||||
import org.apache.lucene.search.similarities.DefaultSimilarityProvider;
|
|
||||||
import org.apache.lucene.search.similarities.Similarity;
|
import org.apache.lucene.search.similarities.Similarity;
|
||||||
import org.apache.lucene.search.TestExplanations;
|
import org.apache.lucene.search.TestExplanations;
|
||||||
import org.apache.lucene.search.spans.SpanQuery;
|
import org.apache.lucene.search.spans.SpanQuery;
|
||||||
|
@@ -38,15 +37,10 @@ public class TestPayloadExplanations extends TestExplanations {
|
||||||
@Override
|
@Override
|
||||||
public void setUp() throws Exception {
|
public void setUp() throws Exception {
|
||||||
super.setUp();
|
super.setUp();
|
||||||
searcher.setSimilarityProvider(new DefaultSimilarityProvider() {
|
searcher.setSimilarity(new DefaultSimilarity() {
|
||||||
@Override
|
@Override
|
||||||
public Similarity get(String field) {
|
public float scorePayload(int doc, int start, int end, BytesRef payload) {
|
||||||
return new DefaultSimilarity() {
|
return 1 + (payload.hashCode() % 10);
|
||||||
@Override
|
|
||||||
public float scorePayload(int doc, int start, int end, BytesRef payload) {
|
|
||||||
return 1 + (payload.hashCode() % 10);
|
|
||||||
}
|
|
||||||
};
|
|
||||||
}
|
}
|
||||||
});
|
});
|
||||||
}
|
}
|
||||||
|
|
|
@@ -37,7 +37,6 @@ import org.apache.lucene.search.TermStatistics;
|
||||||
import org.apache.lucene.search.TopDocs;
|
import org.apache.lucene.search.TopDocs;
|
||||||
import org.apache.lucene.search.similarities.DefaultSimilarity;
|
import org.apache.lucene.search.similarities.DefaultSimilarity;
|
||||||
import org.apache.lucene.search.similarities.Similarity;
|
import org.apache.lucene.search.similarities.Similarity;
|
||||||
import org.apache.lucene.search.similarities.SimilarityProvider;
|
|
||||||
import org.apache.lucene.search.spans.SpanQuery;
|
import org.apache.lucene.search.spans.SpanQuery;
|
||||||
import org.apache.lucene.search.spans.SpanNearQuery;
|
import org.apache.lucene.search.spans.SpanNearQuery;
|
||||||
import org.apache.lucene.search.spans.SpanTermQuery;
|
import org.apache.lucene.search.spans.SpanTermQuery;
|
||||||
|
@@ -53,7 +52,7 @@ public class TestPayloadNearQuery extends LuceneTestCase {
|
||||||
private static IndexSearcher searcher;
|
private static IndexSearcher searcher;
|
||||||
private static IndexReader reader;
|
private static IndexReader reader;
|
||||||
private static Directory directory;
|
private static Directory directory;
|
||||||
private static BoostingSimilarityProvider similarityProvider = new BoostingSimilarityProvider();
|
private static BoostingSimilarity similarity = new BoostingSimilarity();
|
||||||
private static byte[] payload2 = new byte[]{2};
|
private static byte[] payload2 = new byte[]{2};
|
||||||
private static byte[] payload4 = new byte[]{4};
|
private static byte[] payload4 = new byte[]{4};
|
||||||
|
|
||||||
|
@@ -112,7 +111,7 @@ public class TestPayloadNearQuery extends LuceneTestCase {
|
||||||
directory = newDirectory();
|
directory = newDirectory();
|
||||||
RandomIndexWriter writer = new RandomIndexWriter(random, directory,
|
RandomIndexWriter writer = new RandomIndexWriter(random, directory,
|
||||||
newIndexWriterConfig(TEST_VERSION_CURRENT, new PayloadAnalyzer())
|
newIndexWriterConfig(TEST_VERSION_CURRENT, new PayloadAnalyzer())
|
||||||
.setSimilarityProvider(similarityProvider));
|
.setSimilarity(similarity));
|
||||||
//writer.infoStream = System.out;
|
//writer.infoStream = System.out;
|
||||||
for (int i = 0; i < 1000; i++) {
|
for (int i = 0; i < 1000; i++) {
|
||||||
Document doc = new Document();
|
Document doc = new Document();
|
||||||
|
@@ -125,7 +124,7 @@ public class TestPayloadNearQuery extends LuceneTestCase {
|
||||||
writer.close();
|
writer.close();
|
||||||
|
|
||||||
searcher = newSearcher(reader);
|
searcher = newSearcher(reader);
|
||||||
searcher.setSimilarityProvider(similarityProvider);
|
searcher.setSimilarity(similarity);
|
||||||
}
|
}
|
||||||
|
|
||||||
@AfterClass
|
@AfterClass
|
||||||
|
@@ -307,8 +306,7 @@ public class TestPayloadNearQuery extends LuceneTestCase {
|
||||||
assertTrue(doc.score + " does not equal: " + 3, doc.score == 3);
|
assertTrue(doc.score + " does not equal: " + 3, doc.score == 3);
|
||||||
}
|
}
|
||||||
|
|
||||||
// must be static for weight serialization tests
|
static class BoostingSimilarity extends DefaultSimilarity {
|
||||||
static class BoostingSimilarityProvider implements SimilarityProvider {
|
|
||||||
|
|
||||||
public float queryNorm(float sumOfSquaredWeights) {
|
public float queryNorm(float sumOfSquaredWeights) {
|
||||||
return 1.0f;
|
return 1.0f;
|
||||||
|
@@ -318,39 +316,34 @@ public class TestPayloadNearQuery extends LuceneTestCase {
|
||||||
return 1.0f;
|
return 1.0f;
|
||||||
}
|
}
|
||||||
|
|
||||||
public Similarity get(String field) {
|
@Override
|
||||||
return new DefaultSimilarity() {
|
public float scorePayload(int docId, int start, int end, BytesRef payload) {
|
||||||
|
//we know it is size 4 here, so ignore the offset/length
|
||||||
|
return payload.bytes[payload.offset];
|
||||||
|
}
|
||||||
|
|
||||||
@Override
|
//!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
|
||||||
public float scorePayload(int docId, int start, int end, BytesRef payload) {
|
//Make everything else 1 so we see the effect of the payload
|
||||||
//we know it is size 4 here, so ignore the offset/length
|
//!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
|
||||||
return payload.bytes[payload.offset];
|
@Override
|
||||||
}
|
public void computeNorm(FieldInvertState state, Norm norm) {
|
||||||
|
norm.setByte(encodeNormValue(state.getBoost()));
|
||||||
//!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
|
}
|
||||||
//Make everything else 1 so we see the effect of the payload
|
|
||||||
//!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
|
|
||||||
@Override
|
|
||||||
public void computeNorm(FieldInvertState state, Norm norm) {
|
|
||||||
norm.setByte(encodeNormValue(state.getBoost()));
|
|
||||||
}
|
|
||||||
|
|
||||||
@Override
|
@Override
|
||||||
public float sloppyFreq(int distance) {
|
public float sloppyFreq(int distance) {
|
||||||
return 1.0f;
|
return 1.0f;
|
||||||
}
|
}
|
||||||
|
|
||||||
@Override
|
@Override
|
||||||
public float tf(float freq) {
|
public float tf(float freq) {
|
||||||
return 1.0f;
|
return 1.0f;
|
||||||
}
|
}
|
||||||
|
|
||||||
// idf used for phrase queries
|
// idf used for phrase queries
|
||||||
@Override
|
@Override
|
||||||
public Explanation idfExplain(CollectionStatistics collectionStats, TermStatistics[] termStats) {
|
public Explanation idfExplain(CollectionStatistics collectionStats, TermStatistics[] termStats) {
|
||||||
return new Explanation(1.0f, "Inexplicable");
|
return new Explanation(1.0f, "Inexplicable");
|
||||||
}
|
|
||||||
};
|
|
||||||
}
|
}
|
||||||
}
|
}
|
||||||
}
|
}
|
||||||
|
|
|
@@ -28,9 +28,7 @@ import org.apache.lucene.search.CheckHits;
|
||||||
import org.apache.lucene.search.BooleanClause;
|
import org.apache.lucene.search.BooleanClause;
|
||||||
import org.apache.lucene.search.BooleanQuery;
|
import org.apache.lucene.search.BooleanQuery;
|
||||||
import org.apache.lucene.search.similarities.DefaultSimilarity;
|
import org.apache.lucene.search.similarities.DefaultSimilarity;
|
||||||
import org.apache.lucene.search.similarities.DefaultSimilarityProvider;
|
|
||||||
import org.apache.lucene.search.similarities.Similarity;
|
import org.apache.lucene.search.similarities.Similarity;
|
||||||
import org.apache.lucene.search.similarities.SimilarityProvider;
|
|
||||||
import org.apache.lucene.search.spans.MultiSpansWrapper;
|
import org.apache.lucene.search.spans.MultiSpansWrapper;
|
||||||
import org.apache.lucene.search.spans.SpanTermQuery;
|
import org.apache.lucene.search.spans.SpanTermQuery;
|
||||||
import org.apache.lucene.search.spans.Spans;
|
import org.apache.lucene.search.spans.Spans;
|
||||||
|
@@ -59,7 +57,7 @@ import java.io.IOException;
|
||||||
public class TestPayloadTermQuery extends LuceneTestCase {
|
public class TestPayloadTermQuery extends LuceneTestCase {
|
||||||
private static IndexSearcher searcher;
|
private static IndexSearcher searcher;
|
||||||
private static IndexReader reader;
|
private static IndexReader reader;
|
||||||
private static SimilarityProvider similarityProvider = new BoostingSimilarityProvider();
|
private static Similarity similarity = new BoostingSimilarity();
|
||||||
private static final byte[] payloadField = new byte[]{1};
|
private static final byte[] payloadField = new byte[]{1};
|
||||||
private static final byte[] payloadMultiField1 = new byte[]{2};
|
private static final byte[] payloadMultiField1 = new byte[]{2};
|
||||||
private static final byte[] payloadMultiField2 = new byte[]{4};
|
private static final byte[] payloadMultiField2 = new byte[]{4};
|
||||||
|
@@ -122,7 +120,7 @@ public class TestPayloadTermQuery extends LuceneTestCase {
|
||||||
directory = newDirectory();
|
directory = newDirectory();
|
||||||
RandomIndexWriter writer = new RandomIndexWriter(random, directory,
|
RandomIndexWriter writer = new RandomIndexWriter(random, directory,
|
||||||
newIndexWriterConfig(TEST_VERSION_CURRENT, new PayloadAnalyzer())
|
newIndexWriterConfig(TEST_VERSION_CURRENT, new PayloadAnalyzer())
|
||||||
.setSimilarityProvider(similarityProvider).setMergePolicy(newLogMergePolicy()));
|
.setSimilarity(similarity).setMergePolicy(newLogMergePolicy()));
|
||||||
//writer.infoStream = System.out;
|
//writer.infoStream = System.out;
|
||||||
for (int i = 0; i < 1000; i++) {
|
for (int i = 0; i < 1000; i++) {
|
||||||
Document doc = new Document();
|
Document doc = new Document();
|
||||||
|
@@ -137,7 +135,7 @@ public class TestPayloadTermQuery extends LuceneTestCase {
|
||||||
writer.close();
|
writer.close();
|
||||||
|
|
||||||
searcher = newSearcher(reader);
|
searcher = newSearcher(reader);
|
||||||
searcher.setSimilarityProvider(similarityProvider);
|
searcher.setSimilarity(similarity);
|
||||||
}
|
}
|
||||||
|
|
||||||
@AfterClass
|
@AfterClass
|
||||||
|
@@ -234,12 +232,7 @@ public class TestPayloadTermQuery extends LuceneTestCase {
|
||||||
|
|
||||||
IndexReader reader = IndexReader.open(directory);
|
IndexReader reader = IndexReader.open(directory);
|
||||||
IndexSearcher theSearcher = new IndexSearcher(reader);
|
IndexSearcher theSearcher = new IndexSearcher(reader);
|
||||||
theSearcher.setSimilarityProvider(new DefaultSimilarityProvider() {
|
theSearcher.setSimilarity(new FullSimilarity());
|
||||||
@Override
|
|
||||||
public Similarity get(String field) {
|
|
||||||
return new FullSimilarity();
|
|
||||||
}
|
|
||||||
});
|
|
||||||
TopDocs hits = searcher.search(query, null, 100);
|
TopDocs hits = searcher.search(query, null, 100);
|
||||||
assertTrue("hits is null and it shouldn't be", hits != null);
|
assertTrue("hits is null and it shouldn't be", hits != null);
|
||||||
assertTrue("hits Size: " + hits.totalHits + " is not: " + 100, hits.totalHits == 100);
|
assertTrue("hits Size: " + hits.totalHits + " is not: " + 100, hits.totalHits == 100);
|
||||||
|
@@ -301,8 +294,7 @@ public class TestPayloadTermQuery extends LuceneTestCase {
|
||||||
CheckHits.checkHitCollector(random, query, PayloadHelper.NO_PAYLOAD_FIELD, searcher, results);
|
CheckHits.checkHitCollector(random, query, PayloadHelper.NO_PAYLOAD_FIELD, searcher, results);
|
||||||
}
|
}
|
||||||
|
|
||||||
// must be static for weight serialization tests
|
static class BoostingSimilarity extends DefaultSimilarity {
|
||||||
static class BoostingSimilarityProvider implements SimilarityProvider {
|
|
||||||
|
|
||||||
public float queryNorm(float sumOfSquaredWeights) {
|
public float queryNorm(float sumOfSquaredWeights) {
|
||||||
return 1;
|
return 1;
|
||||||
|
@@ -312,39 +304,34 @@ public class TestPayloadTermQuery extends LuceneTestCase {
|
||||||
return 1;
|
return 1;
|
||||||
}
|
}
|
||||||
|
|
||||||
public Similarity get(String field) {
|
// TODO: Remove warning after API has been finalized
|
||||||
return new DefaultSimilarity() {
|
@Override
|
||||||
|
public float scorePayload(int docId, int start, int end, BytesRef payload) {
|
||||||
// TODO: Remove warning after API has been finalized
|
//we know it is size 4 here, so ignore the offset/length
|
||||||
@Override
|
return payload.bytes[payload.offset];
|
||||||
public float scorePayload(int docId, int start, int end, BytesRef payload) {
|
}
|
||||||
//we know it is size 4 here, so ignore the offset/length
|
|
||||||
return payload.bytes[payload.offset];
|
|
||||||
}
|
|
||||||
|
|
||||||
//!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
|
//!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
|
||||||
//Make everything else 1 so we see the effect of the payload
|
//Make everything else 1 so we see the effect of the payload
|
||||||
//!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
|
//!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
|
||||||
@Override
|
@Override
|
||||||
public void computeNorm(FieldInvertState state, Norm norm) {
|
public void computeNorm(FieldInvertState state, Norm norm) {
|
||||||
norm.setByte(encodeNormValue(state.getBoost()));
|
norm.setByte(encodeNormValue(state.getBoost()));
|
||||||
}
|
}
|
||||||
|
|
||||||
@Override
|
@Override
|
||||||
public float sloppyFreq(int distance) {
|
public float sloppyFreq(int distance) {
|
||||||
return 1;
|
return 1;
|
||||||
}
|
}
|
||||||
|
|
||||||
@Override
|
@Override
|
||||||
public float idf(long docFreq, long numDocs) {
|
public float idf(long docFreq, long numDocs) {
|
||||||
return 1;
|
return 1;
|
||||||
}
|
}
|
||||||
|
|
||||||
@Override
|
@Override
|
||||||
public float tf(float freq) {
|
public float tf(float freq) {
|
||||||
return freq == 0 ? 0 : 1;
|
return freq == 0 ? 0 : 1;
|
||||||
}
|
|
||||||
};
|
|
||||||
}
|
}
|
||||||
}
|
}
|
||||||
|
|
||||||
|
|
|
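Reviewer note: the BoostingSimilarity hunks above keep the comment "Make everything else 1 so we see the effect of the payload". A toy illustration of why that works is sketched below. It assumes a purely multiplicative score, which is a simplification made up for this note and not Lucene's actual scoring formula; ToyScorer and NeutralScorer are hypothetical names.

```java
// Toy model only: assumes score = tf * idf * norm * payload (NOT Lucene's real formula).
class PayloadSketch {
  public static void main(String[] args) {
    ToyScorer neutral = new NeutralScorer();
    // With every other factor pinned to 1, the score is just the payload byte.
    System.out.println(neutral.score(5f, 2, 100, new byte[] {4}, 0));
  }
}

/** Hypothetical scorer whose default factors all influence the result. */
class ToyScorer {
  float tf(float freq) {
    return (float) Math.sqrt(freq);
  }

  float idf(long docFreq, long numDocs) {
    return (float) (Math.log(numDocs / (double) (docFreq + 1)) + 1.0);
  }

  float norm() {
    return 0.5f;
  }

  float scorePayload(byte[] bytes, int offset) {
    return 1f; // payload ignored by default
  }

  final float score(float freq, long docFreq, long numDocs, byte[] payload, int offset) {
    return tf(freq) * idf(docFreq, numDocs) * norm() * scorePayload(payload, offset);
  }
}

/** Mirrors the overrides in the diff: every factor except the payload becomes 1. */
class NeutralScorer extends ToyScorer {
  @Override
  float tf(float freq) {
    return freq == 0 ? 0 : 1;
  }

  @Override
  float idf(long docFreq, long numDocs) {
    return 1;
  }

  @Override
  float norm() {
    return 1;
  }

  @Override
  float scorePayload(byte[] bytes, int offset) {
    return bytes[offset]; // read the single payload byte at the given offset
  }
}
```

Under this toy model a test can assert on exact scores, because the only varying input is the payload byte. That is the same reason the tests in this commit neutralize tf, idf, and the norm.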
@@ -48,32 +48,32 @@ import org.apache.lucene.util.LuceneTestCase;
|
||||||
* Tests against all the similarities we have
|
* Tests against all the similarities we have
|
||||||
*/
|
*/
|
||||||
public class TestSimilarity2 extends LuceneTestCase {
|
public class TestSimilarity2 extends LuceneTestCase {
|
||||||
List<SimilarityProvider> simProviders;
|
List<Similarity> sims;
|
||||||
|
|
||||||
@Override
|
@Override
|
||||||
public void setUp() throws Exception {
|
public void setUp() throws Exception {
|
||||||
super.setUp();
|
super.setUp();
|
||||||
simProviders = new ArrayList<SimilarityProvider>();
|
sims = new ArrayList<Similarity>();
|
||||||
simProviders.add(new BasicSimilarityProvider(new DefaultSimilarity()));
|
sims.add(new DefaultSimilarity());
|
||||||
simProviders.add(new BasicSimilarityProvider(new BM25Similarity()));
|
sims.add(new BM25Similarity());
|
||||||
// TODO: not great that we dup this all with TestSimilarityBase
|
// TODO: not great that we dup this all with TestSimilarityBase
|
||||||
for (BasicModel basicModel : TestSimilarityBase.BASIC_MODELS) {
|
for (BasicModel basicModel : TestSimilarityBase.BASIC_MODELS) {
|
||||||
for (AfterEffect afterEffect : TestSimilarityBase.AFTER_EFFECTS) {
|
for (AfterEffect afterEffect : TestSimilarityBase.AFTER_EFFECTS) {
|
||||||
for (Normalization normalization : TestSimilarityBase.NORMALIZATIONS) {
|
for (Normalization normalization : TestSimilarityBase.NORMALIZATIONS) {
|
||||||
simProviders.add(new BasicSimilarityProvider(new DFRSimilarity(basicModel, afterEffect, normalization)));
|
sims.add(new DFRSimilarity(basicModel, afterEffect, normalization));
|
||||||
}
|
}
|
||||||
}
|
}
|
||||||
}
|
}
|
||||||
for (Distribution distribution : TestSimilarityBase.DISTRIBUTIONS) {
|
for (Distribution distribution : TestSimilarityBase.DISTRIBUTIONS) {
|
||||||
for (Lambda lambda : TestSimilarityBase.LAMBDAS) {
|
for (Lambda lambda : TestSimilarityBase.LAMBDAS) {
|
||||||
for (Normalization normalization : TestSimilarityBase.NORMALIZATIONS) {
|
for (Normalization normalization : TestSimilarityBase.NORMALIZATIONS) {
|
||||||
simProviders.add(new BasicSimilarityProvider(new IBSimilarity(distribution, lambda, normalization)));
|
sims.add(new IBSimilarity(distribution, lambda, normalization));
|
||||||
}
|
}
|
||||||
}
|
}
|
||||||
}
|
}
|
||||||
simProviders.add(new BasicSimilarityProvider(new LMDirichletSimilarity()));
|
sims.add(new LMDirichletSimilarity());
|
||||||
simProviders.add(new BasicSimilarityProvider(new LMJelinekMercerSimilarity(0.1f)));
|
sims.add(new LMJelinekMercerSimilarity(0.1f));
|
||||||
simProviders.add(new BasicSimilarityProvider(new LMJelinekMercerSimilarity(0.7f)));
|
sims.add(new LMJelinekMercerSimilarity(0.7f));
|
||||||
}
|
}
|
||||||
|
|
||||||
/** because of stupid things like querynorm, its possible we computeStats on a field that doesnt exist at all
|
/** because of stupid things like querynorm, its possible we computeStats on a field that doesnt exist at all
|
||||||
|
@@ -86,8 +86,8 @@ public class TestSimilarity2 extends LuceneTestCase {
|
||||||
iw.close();
|
iw.close();
|
||||||
IndexSearcher is = newSearcher(ir);
|
IndexSearcher is = newSearcher(ir);
|
||||||
|
|
||||||
for (SimilarityProvider simProvider : simProviders) {
|
for (Similarity sim : sims) {
|
||||||
is.setSimilarityProvider(simProvider);
|
is.setSimilarity(sim);
|
||||||
assertEquals(0, is.search(new TermQuery(new Term("foo", "bar")), 10).totalHits);
|
assertEquals(0, is.search(new TermQuery(new Term("foo", "bar")), 10).totalHits);
|
||||||
}
|
}
|
||||||
ir.close();
|
ir.close();
|
||||||
|
@@ -105,8 +105,8 @@ public class TestSimilarity2 extends LuceneTestCase {
|
||||||
iw.close();
|
iw.close();
|
||||||
IndexSearcher is = newSearcher(ir);
|
IndexSearcher is = newSearcher(ir);
|
||||||
|
|
||||||
for (SimilarityProvider simProvider : simProviders) {
|
for (Similarity sim : sims) {
|
||||||
is.setSimilarityProvider(simProvider);
|
is.setSimilarity(sim);
|
||||||
BooleanQuery query = new BooleanQuery(true);
|
BooleanQuery query = new BooleanQuery(true);
|
||||||
query.add(new TermQuery(new Term("foo", "bar")), BooleanClause.Occur.SHOULD);
|
query.add(new TermQuery(new Term("foo", "bar")), BooleanClause.Occur.SHOULD);
|
||||||
query.add(new TermQuery(new Term("bar", "baz")), BooleanClause.Occur.SHOULD);
|
query.add(new TermQuery(new Term("bar", "baz")), BooleanClause.Occur.SHOULD);
|
||||||
|
@ -127,8 +127,8 @@ public class TestSimilarity2 extends LuceneTestCase {
|
||||||
iw.close();
|
iw.close();
|
||||||
IndexSearcher is = newSearcher(ir);
|
IndexSearcher is = newSearcher(ir);
|
||||||
|
|
||||||
for (SimilarityProvider simProvider : simProviders) {
|
for (Similarity sim : sims) {
|
||||||
is.setSimilarityProvider(simProvider);
|
is.setSimilarity(sim);
|
||||||
BooleanQuery query = new BooleanQuery(true);
|
BooleanQuery query = new BooleanQuery(true);
|
||||||
query.add(new TermQuery(new Term("foo", "bar")), BooleanClause.Occur.SHOULD);
|
query.add(new TermQuery(new Term("foo", "bar")), BooleanClause.Occur.SHOULD);
|
||||||
query.add(new TermQuery(new Term("foo", "baz")), BooleanClause.Occur.SHOULD);
|
query.add(new TermQuery(new Term("foo", "baz")), BooleanClause.Occur.SHOULD);
|
||||||
|
@ -152,8 +152,8 @@ public class TestSimilarity2 extends LuceneTestCase {
|
||||||
iw.close();
|
iw.close();
|
||||||
IndexSearcher is = newSearcher(ir);
|
IndexSearcher is = newSearcher(ir);
|
||||||
|
|
||||||
for (SimilarityProvider simProvider : simProviders) {
|
for (Similarity sim : sims) {
|
||||||
is.setSimilarityProvider(simProvider);
|
is.setSimilarity(sim);
|
||||||
BooleanQuery query = new BooleanQuery(true);
|
BooleanQuery query = new BooleanQuery(true);
|
||||||
query.add(new TermQuery(new Term("foo", "bar")), BooleanClause.Occur.SHOULD);
|
query.add(new TermQuery(new Term("foo", "bar")), BooleanClause.Occur.SHOULD);
|
||||||
assertEquals(1, is.search(query, 10).totalHits);
|
assertEquals(1, is.search(query, 10).totalHits);
|
||||||
|
@ -177,8 +177,8 @@ public class TestSimilarity2 extends LuceneTestCase {
|
||||||
iw.close();
|
iw.close();
|
||||||
IndexSearcher is = newSearcher(ir);
|
IndexSearcher is = newSearcher(ir);
|
||||||
|
|
||||||
for (SimilarityProvider simProvider : simProviders) {
|
for (Similarity sim : sims) {
|
||||||
is.setSimilarityProvider(simProvider);
|
is.setSimilarity(sim);
|
||||||
BooleanQuery query = new BooleanQuery(true);
|
BooleanQuery query = new BooleanQuery(true);
|
||||||
query.add(new TermQuery(new Term("foo", "bar")), BooleanClause.Occur.SHOULD);
|
query.add(new TermQuery(new Term("foo", "bar")), BooleanClause.Occur.SHOULD);
|
||||||
assertEquals(1, is.search(query, 10).totalHits);
|
assertEquals(1, is.search(query, 10).totalHits);
|
||||||
|
@ -203,8 +203,8 @@ public class TestSimilarity2 extends LuceneTestCase {
|
||||||
iw.close();
|
iw.close();
|
||||||
IndexSearcher is = newSearcher(ir);
|
IndexSearcher is = newSearcher(ir);
|
||||||
|
|
||||||
for (SimilarityProvider simProvider : simProviders) {
|
for (Similarity sim : sims) {
|
||||||
is.setSimilarityProvider(simProvider);
|
is.setSimilarity(sim);
|
||||||
BooleanQuery query = new BooleanQuery(true);
|
BooleanQuery query = new BooleanQuery(true);
|
||||||
query.add(new TermQuery(new Term("foo", "bar")), BooleanClause.Occur.SHOULD);
|
query.add(new TermQuery(new Term("foo", "bar")), BooleanClause.Occur.SHOULD);
|
||||||
assertEquals(1, is.search(query, 10).totalHits);
|
assertEquals(1, is.search(query, 10).totalHits);
|
||||||
|
@ -229,8 +229,8 @@ public class TestSimilarity2 extends LuceneTestCase {
|
||||||
iw.close();
|
iw.close();
|
||||||
IndexSearcher is = newSearcher(ir);
|
IndexSearcher is = newSearcher(ir);
|
||||||
|
|
||||||
for (SimilarityProvider simProvider : simProviders) {
|
for (Similarity sim : sims) {
|
||||||
is.setSimilarityProvider(simProvider);
|
is.setSimilarity(sim);
|
||||||
SpanTermQuery s1 = new SpanTermQuery(new Term("foo", "bar"));
|
SpanTermQuery s1 = new SpanTermQuery(new Term("foo", "bar"));
|
||||||
SpanTermQuery s2 = new SpanTermQuery(new Term("foo", "baz"));
|
SpanTermQuery s2 = new SpanTermQuery(new Term("foo", "baz"));
|
||||||
Query query = new SpanOrQuery(s1, s2);
|
Query query = new SpanOrQuery(s1, s2);
|
||||||
|
@ -238,7 +238,7 @@ public class TestSimilarity2 extends LuceneTestCase {
|
||||||
assertEquals(1, td.totalHits);
|
assertEquals(1, td.totalHits);
|
||||||
float score = td.scoreDocs[0].score;
|
float score = td.scoreDocs[0].score;
|
||||||
assertTrue(score >= 0.0f);
|
assertTrue(score >= 0.0f);
|
||||||
assertFalse("inf score for " + simProvider, Float.isInfinite(score));
|
assertFalse("inf score for " + sim, Float.isInfinite(score));
|
||||||
}
|
}
|
||||||
ir.close();
|
ir.close();
|
||||||
dir.close();
|
dir.close();
|
||||||
|
|
|
@ -551,7 +551,7 @@ public class TestSimilarityBase extends LuceneTestCase {
|
||||||
Query q = new TermQuery(new Term(FIELD_BODY, "heart"));
|
Query q = new TermQuery(new Term(FIELD_BODY, "heart"));
|
||||||
|
|
||||||
for (SimilarityBase sim : sims) {
|
for (SimilarityBase sim : sims) {
|
||||||
searcher.setSimilarityProvider(new BasicSimilarityProvider(sim));
|
searcher.setSimilarity(sim);
|
||||||
TopDocs topDocs = searcher.search(q, 1000);
|
TopDocs topDocs = searcher.search(q, 1000);
|
||||||
assertEquals("Failed: " + sim.toString(), 3, topDocs.totalHits);
|
assertEquals("Failed: " + sim.toString(), 3, topDocs.totalHits);
|
||||||
}
|
}
|
||||||
|
@ -565,7 +565,7 @@ public class TestSimilarityBase extends LuceneTestCase {
|
||||||
Query q = new TermQuery(new Term(FIELD_BODY, "heart"));
|
Query q = new TermQuery(new Term(FIELD_BODY, "heart"));
|
||||||
|
|
||||||
for (SimilarityBase sim : sims) {
|
for (SimilarityBase sim : sims) {
|
||||||
searcher.setSimilarityProvider(new BasicSimilarityProvider(sim));
|
searcher.setSimilarity(sim);
|
||||||
TopDocs topDocs = searcher.search(q, 1000);
|
TopDocs topDocs = searcher.search(q, 1000);
|
||||||
assertEquals("Failed: " + sim.toString(), "2", reader.document(topDocs.scoreDocs[0].doc).get(FIELD_ID));
|
assertEquals("Failed: " + sim.toString(), "2", reader.document(topDocs.scoreDocs[0].doc).get(FIELD_ID));
|
||||||
}
|
}
|
||||||
|
|
|
@ -242,7 +242,7 @@ public class TestFieldMaskingSpanQuery extends LuceneTestCase {
|
||||||
|
|
||||||
public void testSimple2() throws Exception {
|
public void testSimple2() throws Exception {
|
||||||
assumeTrue("Broken scoring: LUCENE-3723",
|
assumeTrue("Broken scoring: LUCENE-3723",
|
||||||
searcher.getSimilarityProvider().get("id") instanceof TFIDFSimilarity);
|
searcher.getSimilarity() instanceof TFIDFSimilarity);
|
||||||
SpanQuery q1 = new SpanTermQuery(new Term("gender", "female"));
|
SpanQuery q1 = new SpanTermQuery(new Term("gender", "female"));
|
||||||
SpanQuery q2 = new SpanTermQuery(new Term("last", "smith"));
|
SpanQuery q2 = new SpanTermQuery(new Term("last", "smith"));
|
||||||
SpanQuery q = new SpanNearQuery(new SpanQuery[]
|
SpanQuery q = new SpanNearQuery(new SpanQuery[]
|
||||||
|
@ -314,7 +314,7 @@ public class TestFieldMaskingSpanQuery extends LuceneTestCase {
|
||||||
|
|
||||||
public void testSpans2() throws Exception {
|
public void testSpans2() throws Exception {
|
||||||
assumeTrue("Broken scoring: LUCENE-3723",
|
assumeTrue("Broken scoring: LUCENE-3723",
|
||||||
searcher.getSimilarityProvider().get("id") instanceof TFIDFSimilarity);
|
searcher.getSimilarity() instanceof TFIDFSimilarity);
|
||||||
SpanQuery qA1 = new SpanTermQuery(new Term("gender", "female"));
|
SpanQuery qA1 = new SpanTermQuery(new Term("gender", "female"));
|
||||||
SpanQuery qA2 = new SpanTermQuery(new Term("first", "james"));
|
SpanQuery qA2 = new SpanTermQuery(new Term("first", "james"));
|
||||||
SpanQuery qA = new SpanOrQuery(qA1, new FieldMaskingSpanQuery(qA2, "gender"));
|
SpanQuery qA = new SpanOrQuery(qA1, new FieldMaskingSpanQuery(qA2, "gender"));
|
||||||
|
|
|
@ -39,15 +39,15 @@ import org.apache.lucene.search.TermQuery;
|
||||||
import org.apache.lucene.search.TopDocs;
|
import org.apache.lucene.search.TopDocs;
|
||||||
import org.apache.lucene.search.payloads.PayloadHelper;
|
import org.apache.lucene.search.payloads.PayloadHelper;
|
||||||
import org.apache.lucene.search.payloads.PayloadSpanUtil;
|
import org.apache.lucene.search.payloads.PayloadSpanUtil;
|
||||||
import org.apache.lucene.search.similarities.DefaultSimilarityProvider;
|
import org.apache.lucene.search.similarities.DefaultSimilarity;
|
||||||
import org.apache.lucene.search.similarities.SimilarityProvider;
|
import org.apache.lucene.search.similarities.Similarity;
|
||||||
import org.apache.lucene.store.Directory;
|
import org.apache.lucene.store.Directory;
|
||||||
import org.apache.lucene.store.LockObtainFailedException;
|
import org.apache.lucene.store.LockObtainFailedException;
|
||||||
import org.apache.lucene.util.LuceneTestCase;
|
import org.apache.lucene.util.LuceneTestCase;
|
||||||
|
|
||||||
public class TestPayloadSpans extends LuceneTestCase {
|
public class TestPayloadSpans extends LuceneTestCase {
|
||||||
private IndexSearcher searcher;
|
private IndexSearcher searcher;
|
||||||
private SimilarityProvider similarity = new DefaultSimilarityProvider();
|
private Similarity similarity = new DefaultSimilarity();
|
||||||
protected IndexReader indexReader;
|
protected IndexReader indexReader;
|
||||||
private IndexReader closeIndexReader;
|
private IndexReader closeIndexReader;
|
||||||
private Directory directory;
|
private Directory directory;
|
||||||
|
@ -107,7 +107,7 @@ public class TestPayloadSpans extends LuceneTestCase {
|
||||||
|
|
||||||
Directory directory = newDirectory();
|
Directory directory = newDirectory();
|
||||||
RandomIndexWriter writer = new RandomIndexWriter(random, directory,
|
RandomIndexWriter writer = new RandomIndexWriter(random, directory,
|
||||||
newIndexWriterConfig(TEST_VERSION_CURRENT, new PayloadAnalyzer()).setSimilarityProvider(similarity));
|
newIndexWriterConfig(TEST_VERSION_CURRENT, new PayloadAnalyzer()).setSimilarity(similarity));
|
||||||
|
|
||||||
Document doc = new Document();
|
Document doc = new Document();
|
||||||
doc.add(newField(PayloadHelper.FIELD, "one two three one four three", TextField.TYPE_STORED));
|
doc.add(newField(PayloadHelper.FIELD, "one two three one four three", TextField.TYPE_STORED));
|
||||||
|
@ -366,7 +366,7 @@ public class TestPayloadSpans extends LuceneTestCase {
|
||||||
public void testPayloadSpanUtil() throws Exception {
|
public void testPayloadSpanUtil() throws Exception {
|
||||||
Directory directory = newDirectory();
|
Directory directory = newDirectory();
|
||||||
RandomIndexWriter writer = new RandomIndexWriter(random, directory,
|
RandomIndexWriter writer = new RandomIndexWriter(random, directory,
|
||||||
newIndexWriterConfig(TEST_VERSION_CURRENT, new PayloadAnalyzer()).setSimilarityProvider(similarity));
|
newIndexWriterConfig(TEST_VERSION_CURRENT, new PayloadAnalyzer()).setSimilarity(similarity));
|
||||||
|
|
||||||
Document doc = new Document();
|
Document doc = new Document();
|
||||||
doc.add(newField(PayloadHelper.FIELD,"xx rr yy mm pp", TextField.TYPE_STORED));
|
doc.add(newField(PayloadHelper.FIELD,"xx rr yy mm pp", TextField.TYPE_STORED));
|
||||||
|
@ -426,7 +426,7 @@ public class TestPayloadSpans extends LuceneTestCase {
|
||||||
directory = newDirectory();
|
directory = newDirectory();
|
||||||
String[] docs = new String[]{"xx rr yy mm pp","xx yy mm rr pp", "nopayload qq ss pp np", "one two three four five six seven eight nine ten eleven", "nine one two three four five six seven eight eleven ten"};
|
String[] docs = new String[]{"xx rr yy mm pp","xx yy mm rr pp", "nopayload qq ss pp np", "one two three four five six seven eight nine ten eleven", "nine one two three four five six seven eight eleven ten"};
|
||||||
RandomIndexWriter writer = new RandomIndexWriter(random, directory,
|
RandomIndexWriter writer = new RandomIndexWriter(random, directory,
|
||||||
newIndexWriterConfig(TEST_VERSION_CURRENT, new PayloadAnalyzer()).setSimilarityProvider(similarity));
|
newIndexWriterConfig(TEST_VERSION_CURRENT, new PayloadAnalyzer()).setSimilarity(similarity));
|
||||||
|
|
||||||
Document doc = null;
|
Document doc = null;
|
||||||
for(int i = 0; i < docs.length; i++) {
|
for(int i = 0; i < docs.length; i++) {
|
||||||
|
|
|
@ -24,9 +24,7 @@ import org.apache.lucene.search.Scorer;
|
||||||
import org.apache.lucene.search.TermQuery;
|
import org.apache.lucene.search.TermQuery;
|
||||||
import org.apache.lucene.search.IndexSearcher;
|
import org.apache.lucene.search.IndexSearcher;
|
||||||
import org.apache.lucene.search.similarities.DefaultSimilarity;
|
import org.apache.lucene.search.similarities.DefaultSimilarity;
|
||||||
import org.apache.lucene.search.similarities.DefaultSimilarityProvider;
|
|
||||||
import org.apache.lucene.search.similarities.Similarity;
|
import org.apache.lucene.search.similarities.Similarity;
|
||||||
import org.apache.lucene.search.similarities.SimilarityProvider;
|
|
||||||
import org.apache.lucene.store.Directory;
|
import org.apache.lucene.store.Directory;
|
||||||
import org.apache.lucene.analysis.MockAnalyzer;
|
import org.apache.lucene.analysis.MockAnalyzer;
|
||||||
import org.apache.lucene.index.AtomicReaderContext;
|
import org.apache.lucene.index.AtomicReaderContext;
|
||||||
|
@ -411,21 +409,17 @@ public class TestSpans extends LuceneTestCase {
|
||||||
for (int i = 0; i < leaves.length; i++) {
|
for (int i = 0; i < leaves.length; i++) {
|
||||||
|
|
||||||
|
|
||||||
final SimilarityProvider sim = new DefaultSimilarityProvider() {
|
final Similarity sim = new DefaultSimilarity() {
|
||||||
public Similarity get(String field) {
|
@Override
|
||||||
return new DefaultSimilarity() {
|
public float sloppyFreq(int distance) {
|
||||||
@Override
|
return 0.0f;
|
||||||
public float sloppyFreq(int distance) {
|
|
||||||
return 0.0f;
|
|
||||||
}
|
|
||||||
};
|
|
||||||
}
|
}
|
||||||
};
|
};
|
||||||
|
|
||||||
final SimilarityProvider oldSim = searcher.getSimilarityProvider();
|
final Similarity oldSim = searcher.getSimilarity();
|
||||||
Scorer spanScorer;
|
Scorer spanScorer;
|
||||||
try {
|
try {
|
||||||
searcher.setSimilarityProvider(sim);
|
searcher.setSimilarity(sim);
|
||||||
SpanNearQuery snq = new SpanNearQuery(
|
SpanNearQuery snq = new SpanNearQuery(
|
||||||
new SpanQuery[] {
|
new SpanQuery[] {
|
||||||
makeSpanTermQuery("t1"),
|
makeSpanTermQuery("t1"),
|
||||||
|
@ -435,7 +429,7 @@ public class TestSpans extends LuceneTestCase {
|
||||||
|
|
||||||
spanScorer = searcher.createNormalizedWeight(snq).scorer(leaves[i], true, false, leaves[i].reader().getLiveDocs());
|
spanScorer = searcher.createNormalizedWeight(snq).scorer(leaves[i], true, false, leaves[i].reader().getLiveDocs());
|
||||||
} finally {
|
} finally {
|
||||||
searcher.setSimilarityProvider(oldSim);
|
searcher.setSimilarity(oldSim);
|
||||||
}
|
}
|
||||||
if (i == subIndex) {
|
if (i == subIndex) {
|
||||||
assertTrue("first doc", spanScorer.nextDoc() != DocIdSetIterator.NO_MORE_DOCS);
|
assertTrue("first doc", spanScorer.nextDoc() != DocIdSetIterator.NO_MORE_DOCS);
|
||||||
|
|
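Note on the TestSpans hunk above: the test now subclasses the similarity directly to override sloppyFreq, installs it on the searcher, and restores the previous one in a finally block. A minimal, self-contained sketch of that swap-and-restore pattern follows. DefaultSim and Searcher are hypothetical stand-ins written for illustration, not the Lucene classes; only the 1/(distance+1) default and the get/set/finally shape mirror the diff.

```java
public class SwapSimilaritySketch {

  // Hypothetical stand-in for DefaultSimilarity; only sloppyFreq is modeled.
  static class DefaultSim {
    float sloppyFreq(int distance) {
      return 1.0f / (distance + 1);
    }
  }

  // Hypothetical stand-in for IndexSearcher; it only holds a similarity.
  static class Searcher {
    private DefaultSim sim = new DefaultSim();

    DefaultSim getSimilarity() {
      return sim;
    }

    void setSimilarity(DefaultSim sim) {
      this.sim = sim;
    }
  }

  // Temporarily installs a similarity whose sloppyFreq is always 0,
  // reads a value through the searcher, then restores the old similarity.
  static float sloppyFreqWithOverride(Searcher searcher, int distance) {
    final DefaultSim zeroSloppy = new DefaultSim() {
      @Override
      float sloppyFreq(int d) {
        return 0.0f;
      }
    };
    final DefaultSim oldSim = searcher.getSimilarity();
    try {
      searcher.setSimilarity(zeroSloppy);
      return searcher.getSimilarity().sloppyFreq(distance);
    } finally {
      // Runs even if the body throws, so the searcher is never left modified.
      searcher.setSimilarity(oldSim);
    }
  }

  public static void main(String[] args) {
    Searcher searcher = new Searcher();
    DefaultSim before = searcher.getSimilarity();
    System.out.println(sloppyFreqWithOverride(searcher, 3));
    System.out.println(searcher.getSimilarity() == before);
  }
}
```

With a single Similarity on the searcher there is no per-field provider to wrap, which is why the anonymous class in the hunk loses one level of nesting.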
|
@ -31,7 +31,7 @@ import org.apache.lucene.index.IndexReader;
|
||||||
import org.apache.lucene.index.RandomIndexWriter;
|
import org.apache.lucene.index.RandomIndexWriter;
|
||||||
import org.apache.lucene.index.Term;
|
import org.apache.lucene.index.Term;
|
||||||
import org.apache.lucene.search.*;
|
import org.apache.lucene.search.*;
|
||||||
import org.apache.lucene.search.similarities.DefaultSimilarityProvider;
|
import org.apache.lucene.search.similarities.DefaultSimilarity;
|
||||||
import org.apache.lucene.store.Directory;
|
import org.apache.lucene.store.Directory;
|
||||||
|
|
||||||
/*******************************************************************************
|
/*******************************************************************************
|
||||||
|
@ -61,7 +61,7 @@ public class TestSpansAdvanced extends LuceneTestCase {
|
||||||
final RandomIndexWriter writer = new RandomIndexWriter(random, mDirectory,
|
final RandomIndexWriter writer = new RandomIndexWriter(random, mDirectory,
|
||||||
newIndexWriterConfig(TEST_VERSION_CURRENT,
|
newIndexWriterConfig(TEST_VERSION_CURRENT,
|
||||||
new MockAnalyzer(random, MockTokenizer.SIMPLE, true, MockTokenFilter.ENGLISH_STOPSET, true))
|
new MockAnalyzer(random, MockTokenizer.SIMPLE, true, MockTokenFilter.ENGLISH_STOPSET, true))
|
||||||
.setMergePolicy(newLogMergePolicy()).setSimilarityProvider(new DefaultSimilarityProvider()));
|
.setMergePolicy(newLogMergePolicy()).setSimilarity(new DefaultSimilarity()));
|
||||||
addDocument(writer, "1", "I think it should work.");
|
addDocument(writer, "1", "I think it should work.");
|
||||||
addDocument(writer, "2", "I think it should work.");
|
addDocument(writer, "2", "I think it should work.");
|
||||||
addDocument(writer, "3", "I think it should work.");
|
addDocument(writer, "3", "I think it should work.");
|
||||||
|
@ -69,7 +69,7 @@ public class TestSpansAdvanced extends LuceneTestCase {
|
||||||
reader = writer.getReader();
|
reader = writer.getReader();
|
||||||
writer.close();
|
writer.close();
|
||||||
searcher = newSearcher(reader);
|
searcher = newSearcher(reader);
|
||||||
searcher.setSimilarityProvider(new DefaultSimilarityProvider());
|
searcher.setSimilarity(new DefaultSimilarity());
|
||||||
}
|
}
|
||||||
|
|
||||||
@Override
|
@Override
|
||||||
|
|
|
@ -27,7 +27,7 @@ import org.apache.lucene.index.RandomIndexWriter;
|
||||||
import org.apache.lucene.index.Term;
|
import org.apache.lucene.index.Term;
|
||||||
import org.apache.lucene.index.IndexWriterConfig.OpenMode;
|
import org.apache.lucene.index.IndexWriterConfig.OpenMode;
|
||||||
import org.apache.lucene.search.*;
|
import org.apache.lucene.search.*;
|
||||||
import org.apache.lucene.search.similarities.DefaultSimilarityProvider;
|
import org.apache.lucene.search.similarities.DefaultSimilarity;
|
||||||
|
|
||||||
/*******************************************************************************
|
/*******************************************************************************
|
||||||
* Some expanded tests to make sure my patch doesn't break other SpanTermQuery
|
* Some expanded tests to make sure my patch doesn't break other SpanTermQuery
|
||||||
|
@ -50,7 +50,7 @@ public class TestSpansAdvanced2 extends TestSpansAdvanced {
|
||||||
newIndexWriterConfig(TEST_VERSION_CURRENT, new MockAnalyzer(random,
|
newIndexWriterConfig(TEST_VERSION_CURRENT, new MockAnalyzer(random,
|
||||||
MockTokenizer.SIMPLE, true, MockTokenFilter.ENGLISH_STOPSET, true))
|
MockTokenizer.SIMPLE, true, MockTokenFilter.ENGLISH_STOPSET, true))
|
||||||
.setOpenMode(OpenMode.APPEND).setMergePolicy(newLogMergePolicy())
|
.setOpenMode(OpenMode.APPEND).setMergePolicy(newLogMergePolicy())
|
||||||
.setSimilarityProvider(new DefaultSimilarityProvider()));
|
.setSimilarity(new DefaultSimilarity()));
|
||||||
addDocument(writer, "A", "Should we, could we, would we?");
|
addDocument(writer, "A", "Should we, could we, would we?");
|
||||||
addDocument(writer, "B", "It should. Should it?");
|
addDocument(writer, "B", "It should. Should it?");
|
||||||
addDocument(writer, "C", "It shouldn't.");
|
addDocument(writer, "C", "It shouldn't.");
|
||||||
|
@ -60,7 +60,7 @@ public class TestSpansAdvanced2 extends TestSpansAdvanced {
|
||||||
|
|
||||||
// re-open the searcher since we added more docs
|
// re-open the searcher since we added more docs
|
||||||
searcher2 = newSearcher(reader2);
|
searcher2 = newSearcher(reader2);
|
||||||
searcher2.setSimilarityProvider(new DefaultSimilarityProvider());
|
searcher2.setSimilarity(new DefaultSimilarity());
|
||||||
}
|
}
|
||||||
|
|
||||||
@Override
|
@Override
|
||||||
|
|
|
@ -20,6 +20,7 @@ package org.apache.lucene.queries.function.valuesource;
|
||||||
import org.apache.lucene.index.*;
|
import org.apache.lucene.index.*;
|
||||||
import org.apache.lucene.queries.function.FunctionValues;
|
import org.apache.lucene.queries.function.FunctionValues;
|
||||||
import org.apache.lucene.search.IndexSearcher;
|
import org.apache.lucene.search.IndexSearcher;
|
||||||
|
import org.apache.lucene.search.similarities.PerFieldSimilarityWrapper;
|
||||||
import org.apache.lucene.search.similarities.Similarity;
|
import org.apache.lucene.search.similarities.Similarity;
|
||||||
import org.apache.lucene.search.similarities.TFIDFSimilarity;
|
import org.apache.lucene.search.similarities.TFIDFSimilarity;
|
||||||
import org.apache.lucene.util.BytesRef;
|
import org.apache.lucene.util.BytesRef;
|
||||||
|
@ -41,13 +42,25 @@ public class IDFValueSource extends DocFreqValueSource {
|
||||||
@Override
|
@Override
|
||||||
public FunctionValues getValues(Map context, AtomicReaderContext readerContext) throws IOException {
|
public FunctionValues getValues(Map context, AtomicReaderContext readerContext) throws IOException {
|
||||||
IndexSearcher searcher = (IndexSearcher)context.get("searcher");
|
IndexSearcher searcher = (IndexSearcher)context.get("searcher");
|
||||||
Similarity sim = searcher.getSimilarityProvider().get(field);
|
TFIDFSimilarity sim = asTFIDF(searcher.getSimilarity(), field);
|
||||||
if (!(sim instanceof TFIDFSimilarity)) {
|
if (sim == null) {
|
||||||
throw new UnsupportedOperationException("requires a TFIDFSimilarity (such as DefaultSimilarity)");
|
throw new UnsupportedOperationException("requires a TFIDFSimilarity (such as DefaultSimilarity)");
|
||||||
}
|
}
|
||||||
int docfreq = searcher.getIndexReader().docFreq(new Term(indexedField, indexedBytes));
|
int docfreq = searcher.getIndexReader().docFreq(new Term(indexedField, indexedBytes));
|
||||||
float idf = ((TFIDFSimilarity)sim).idf(docfreq, searcher.getIndexReader().maxDoc());
|
float idf = sim.idf(docfreq, searcher.getIndexReader().maxDoc());
|
||||||
return new ConstDoubleDocValues(idf, this);
|
return new ConstDoubleDocValues(idf, this);
|
||||||
}
|
}
|
||||||
|
|
||||||
|
// tries extra hard to cast the sim to TFIDFSimilarity
|
||||||
|
static TFIDFSimilarity asTFIDF(Similarity sim, String field) {
|
||||||
|
while (sim instanceof PerFieldSimilarityWrapper) {
|
||||||
|
sim = ((PerFieldSimilarityWrapper)sim).get(field);
|
||||||
|
}
|
||||||
|
if (sim instanceof TFIDFSimilarity) {
|
||||||
|
return (TFIDFSimilarity)sim;
|
||||||
|
} else {
|
||||||
|
return null;
|
||||||
|
}
|
||||||
|
}
|
||||||
}
|
}
|
||||||
|
|
||||||
|
|
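Note on the IDFValueSource hunk above: asTFIDF keeps unwrapping while the similarity is a PerFieldSimilarityWrapper, then returns the result only if it is a TFIDFSimilarity, else null. The sketch below reproduces that control flow with hypothetical stand-in types (Sim, TfIdfSim, OtherSim, PerFieldSim) so it runs without Lucene on the classpath. The idf formula is only an illustrative placeholder, and none of these names are the real API.

```java
import java.util.HashMap;
import java.util.Map;

public class UnwrapSketch {

  // Hypothetical stand-in for Similarity.
  interface Sim {}

  // Hypothetical stand-in for TFIDFSimilarity; the formula is illustrative only.
  static class TfIdfSim implements Sim {
    float idf(long docFreq, long numDocs) {
      return (float) (Math.log(numDocs / (double) (docFreq + 1)) + 1.0);
    }
  }

  // Hypothetical stand-in for any non-TF/IDF model (for example BM25).
  static class OtherSim implements Sim {}

  // Hypothetical stand-in for PerFieldSimilarityWrapper: delegates by field name.
  static class PerFieldSim implements Sim {
    final Map<String, Sim> perField;
    final Sim fallback;

    PerFieldSim(Map<String, Sim> perField, Sim fallback) {
      this.perField = perField;
      this.fallback = fallback;
    }

    Sim get(String field) {
      return perField.getOrDefault(field, fallback);
    }
  }

  // Same shape as asTFIDF in the diff: unwrap per-field wrappers (possibly
  // nested), then return the TF/IDF similarity or null if it is something else.
  static TfIdfSim asTfIdf(Sim sim, String field) {
    while (sim instanceof PerFieldSim) {
      sim = ((PerFieldSim) sim).get(field);
    }
    return (sim instanceof TfIdfSim) ? (TfIdfSim) sim : null;
  }

  public static void main(String[] args) {
    Map<String, Sim> inner = new HashMap<>();
    inner.put("title", new TfIdfSim());
    // A wrapper inside a wrapper, to exercise the loop more than once.
    Sim nested = new PerFieldSim(new HashMap<>(), new PerFieldSim(inner, new OtherSim()));

    System.out.println(asTfIdf(nested, "title") != null); // true
    System.out.println(asTfIdf(nested, "body") != null);  // false
  }
}
```

Returning null rather than throwing leaves the error message to the caller, which matches how the three value sources in this commit each throw their own UnsupportedOperationException.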
|
@ -23,7 +23,6 @@ import org.apache.lucene.queries.function.FunctionValues;
|
||||||
import org.apache.lucene.queries.function.ValueSource;
|
import org.apache.lucene.queries.function.ValueSource;
|
||||||
import org.apache.lucene.queries.function.docvalues.FloatDocValues;
|
import org.apache.lucene.queries.function.docvalues.FloatDocValues;
|
||||||
import org.apache.lucene.search.IndexSearcher;
|
import org.apache.lucene.search.IndexSearcher;
|
||||||
import org.apache.lucene.search.similarities.Similarity;
|
|
||||||
import org.apache.lucene.search.similarities.TFIDFSimilarity;
|
import org.apache.lucene.search.similarities.TFIDFSimilarity;
|
||||||
|
|
||||||
import java.io.IOException;
|
import java.io.IOException;
|
||||||
|
@ -52,11 +51,10 @@ public class NormValueSource extends ValueSource {
|
||||||
@Override
|
@Override
|
||||||
public FunctionValues getValues(Map context, AtomicReaderContext readerContext) throws IOException {
|
public FunctionValues getValues(Map context, AtomicReaderContext readerContext) throws IOException {
|
||||||
IndexSearcher searcher = (IndexSearcher)context.get("searcher");
|
IndexSearcher searcher = (IndexSearcher)context.get("searcher");
|
||||||
Similarity sim = searcher.getSimilarityProvider().get(field);
|
final TFIDFSimilarity similarity = IDFValueSource.asTFIDF(searcher.getSimilarity(), field);
|
||||||
if (!(sim instanceof TFIDFSimilarity)) {
|
if (similarity == null) {
|
||||||
throw new UnsupportedOperationException("requires a TFIDFSimilarity (such as DefaultSimilarity)");
|
throw new UnsupportedOperationException("requires a TFIDFSimilarity (such as DefaultSimilarity)");
|
||||||
}
|
}
|
||||||
final TFIDFSimilarity similarity = (TFIDFSimilarity) sim;
|
|
||||||
DocValues dv = readerContext.reader().normValues(field);
|
DocValues dv = readerContext.reader().normValues(field);
|
||||||
|
|
||||||
if (dv == null) {
|
if (dv == null) {
|
||||||
|
|
|
@ -22,7 +22,6 @@ import org.apache.lucene.queries.function.FunctionValues;
|
||||||
import org.apache.lucene.queries.function.docvalues.FloatDocValues;
|
import org.apache.lucene.queries.function.docvalues.FloatDocValues;
|
||||||
import org.apache.lucene.search.DocIdSetIterator;
|
import org.apache.lucene.search.DocIdSetIterator;
|
||||||
import org.apache.lucene.search.IndexSearcher;
|
import org.apache.lucene.search.IndexSearcher;
|
||||||
import org.apache.lucene.search.similarities.Similarity;
|
|
||||||
import org.apache.lucene.search.similarities.TFIDFSimilarity;
|
import org.apache.lucene.search.similarities.TFIDFSimilarity;
|
||||||
import org.apache.lucene.util.BytesRef;
|
import org.apache.lucene.util.BytesRef;
|
||||||
|
|
||||||
|
@ -43,11 +42,11 @@ public class TFValueSource extends TermFreqValueSource {
|
||||||
public FunctionValues getValues(Map context, AtomicReaderContext readerContext) throws IOException {
|
public FunctionValues getValues(Map context, AtomicReaderContext readerContext) throws IOException {
|
||||||
Fields fields = readerContext.reader().fields();
|
Fields fields = readerContext.reader().fields();
|
||||||
final Terms terms = fields.terms(field);
|
final Terms terms = fields.terms(field);
|
||||||
final Similarity sim = ((IndexSearcher)context.get("searcher")).getSimilarityProvider().get(field);
|
IndexSearcher searcher = (IndexSearcher)context.get("searcher");
|
||||||
if (!(sim instanceof TFIDFSimilarity)) {
|
final TFIDFSimilarity similarity = IDFValueSource.asTFIDF(searcher.getSimilarity(), field);
|
||||||
|
if (similarity == null) {
|
||||||
throw new UnsupportedOperationException("requires a TFIDFSimilarity (such as DefaultSimilarity)");
|
throw new UnsupportedOperationException("requires a TFIDFSimilarity (such as DefaultSimilarity)");
|
||||||
}
|
}
|
||||||
final TFIDFSimilarity similarity = (TFIDFSimilarity) sim;
|
|
||||||
|
|
||||||
return new FloatDocValues(this) {
|
return new FloatDocValues(this) {
|
||||||
DocsEnum docs ;
|
DocsEnum docs ;
|
||||||
|
|
|
@ -32,7 +32,7 @@ import org.apache.lucene.search.BooleanClause;
|
||||||
import org.apache.lucene.search.BooleanQuery;
|
import org.apache.lucene.search.BooleanQuery;
|
||||||
import org.apache.lucene.search.Query;
|
import org.apache.lucene.search.Query;
|
||||||
import org.apache.lucene.search.BooleanQuery.TooManyClauses;
|
import org.apache.lucene.search.BooleanQuery.TooManyClauses;
|
||||||
import org.apache.lucene.search.similarities.SimilarityProvider;
|
import org.apache.lucene.search.similarities.Similarity;
|
||||||
|
|
||||||
/**
|
/**
|
||||||
* This builder does the same as the {@link BooleanQueryNodeBuilder}, but this
|
* This builder does the same as the {@link BooleanQueryNodeBuilder}, but this
|
||||||
|
@ -41,7 +41,7 @@ import org.apache.lucene.search.similarities.SimilarityProvider;
|
||||||
*
|
*
|
||||||
* @see BooleanQueryNodeBuilder
|
* @see BooleanQueryNodeBuilder
|
||||||
* @see BooleanQuery
|
* @see BooleanQuery
|
||||||
* @see SimilarityProvider#coord(int, int)
|
* @see Similarity#coord(int, int)
|
||||||
*/
|
*/
|
||||||
public class StandardBooleanQueryNodeBuilder implements StandardQueryBuilder {
|
public class StandardBooleanQueryNodeBuilder implements StandardQueryBuilder {
|
||||||
|
|
||||||
|
|
|
@ -22,14 +22,14 @@ import java.util.List;
|
||||||
import org.apache.lucene.queryparser.flexible.core.nodes.BooleanQueryNode;
|
import org.apache.lucene.queryparser.flexible.core.nodes.BooleanQueryNode;
|
||||||
import org.apache.lucene.queryparser.flexible.core.nodes.QueryNode;
|
import org.apache.lucene.queryparser.flexible.core.nodes.QueryNode;
|
||||||
import org.apache.lucene.search.BooleanQuery;
|
import org.apache.lucene.search.BooleanQuery;
|
||||||
import org.apache.lucene.search.similarities.SimilarityProvider;
|
import org.apache.lucene.search.similarities.Similarity;
|
||||||
|
|
||||||
/**
|
/**
|
||||||
* A {@link StandardBooleanQueryNode} has the same behavior as
|
* A {@link StandardBooleanQueryNode} has the same behavior as
|
||||||
* {@link BooleanQueryNode}. It only indicates if the coord should be enabled or
|
* {@link BooleanQueryNode}. It only indicates if the coord should be enabled or
|
||||||
* not for this boolean query. <br/>
|
* not for this boolean query. <br/>
|
||||||
*
|
*
|
||||||
* @see SimilarityProvider#coord(int, int)
|
* @see Similarity#coord(int, int)
|
||||||
* @see BooleanQuery
|
* @see BooleanQuery
|
||||||
*/
|
*/
|
||||||
public class StandardBooleanQueryNode extends BooleanQueryNode {
|
public class StandardBooleanQueryNode extends BooleanQueryNode {
|
||||||
|
|
|
@ -502,9 +502,6 @@ public abstract class FieldType extends FieldProperties {
|
||||||
* has no custom similarity associated with it.
|
* has no custom similarity associated with it.
|
||||||
* </p>
|
* </p>
|
||||||
*
|
*
|
||||||
* This method exists to internally support SolrSimilarityProvider.
|
|
||||||
* Custom application code interested in a field's Similarity should
|
|
||||||
* instead query via the searcher's SimilarityProvider.
|
|
||||||
* @lucene.internal
|
* @lucene.internal
|
||||||
*/
|
*/
|
||||||
public Similarity getSimilarity() {
|
public Similarity getSimilarity() {
|
||||||
|
|
|
@ -97,7 +97,7 @@ public final class FieldTypePluginLoader
|
||||||
// a custom similarity[Factory]
|
// a custom similarity[Factory]
|
||||||
expression = "./similarity";
|
expression = "./similarity";
|
||||||
anode = (Node)xpath.evaluate(expression, node, XPathConstants.NODE);
|
anode = (Node)xpath.evaluate(expression, node, XPathConstants.NODE);
|
||||||
Similarity similarity = IndexSchema.readSimilarity(loader, anode);
|
SimilarityFactory simFactory = IndexSchema.readSimilarity(loader, anode);
|
||||||
|
|
||||||
if (queryAnalyzer==null) queryAnalyzer=analyzer;
|
if (queryAnalyzer==null) queryAnalyzer=analyzer;
|
||||||
if (analyzer==null) analyzer=queryAnalyzer;
|
if (analyzer==null) analyzer=queryAnalyzer;
|
||||||
|
@ -110,8 +110,8 @@ public final class FieldTypePluginLoader
|
||||||
if (ft instanceof TextField)
|
if (ft instanceof TextField)
|
||||||
((TextField)ft).setMultiTermAnalyzer(multiAnalyzer);
|
((TextField)ft).setMultiTermAnalyzer(multiAnalyzer);
|
||||||
}
|
}
|
||||||
if (similarity!=null) {
|
if (simFactory!=null) {
|
||||||
ft.setSimilarity(similarity);
|
ft.setSimilarity(simFactory.getSimilarity());
|
||||||
}
|
}
|
||||||
if (ft instanceof SchemaAware){
|
if (ft instanceof SchemaAware){
|
||||||
schemaAware.add((SchemaAware) ft);
|
schemaAware.add((SchemaAware) ft);
|
||||||
|
|
|
@ -22,7 +22,6 @@ import org.apache.lucene.analysis.AnalyzerWrapper;
|
||||||
import org.apache.lucene.index.IndexableField;
|
import org.apache.lucene.index.IndexableField;
|
||||||
import org.apache.lucene.search.similarities.DefaultSimilarity;
|
import org.apache.lucene.search.similarities.DefaultSimilarity;
|
||||||
import org.apache.lucene.search.similarities.Similarity;
|
import org.apache.lucene.search.similarities.Similarity;
|
||||||
import org.apache.lucene.search.similarities.SimilarityProvider;
|
|
||||||
import org.apache.lucene.util.Version;
|
import org.apache.lucene.util.Version;
|
||||||
import org.apache.solr.common.ResourceLoader;
|
import org.apache.solr.common.ResourceLoader;
|
||||||
import org.apache.solr.common.SolrException;
|
import org.apache.solr.common.SolrException;
|
||||||
|
@ -33,7 +32,7 @@ import org.apache.solr.common.util.SystemIdResolver;
|
||||||
import org.apache.solr.core.SolrConfig;
|
import org.apache.solr.core.SolrConfig;
|
||||||
import org.apache.solr.core.Config;
|
import org.apache.solr.core.Config;
|
||||||
import org.apache.solr.core.SolrResourceLoader;
|
import org.apache.solr.core.SolrResourceLoader;
|
||||||
import org.apache.solr.search.SolrSimilarityProvider;
|
import org.apache.solr.search.similarities.DefaultSimilarityFactory;
|
||||||
import org.apache.solr.util.plugin.SolrCoreAware;
|
import org.apache.solr.util.plugin.SolrCoreAware;
|
||||||
import org.w3c.dom.*;
|
import org.w3c.dom.*;
|
||||||
import org.xml.sax.InputSource;
|
import org.xml.sax.InputSource;
|
||||||
|
@ -181,22 +180,12 @@ public final class IndexSchema {
|
||||||
*/
|
*/
|
||||||
public Collection<SchemaField> getRequiredFields() { return requiredFields; }
|
public Collection<SchemaField> getRequiredFields() { return requiredFields; }
|
||||||
|
|
||||||
private SimilarityProviderFactory similarityProviderFactory;
|
private Similarity similarity;
|
||||||
|
|
||||||
/**
|
/**
|
||||||
* Returns the SimilarityProvider used for this index
|
* Returns the Similarity used for this index
|
||||||
*/
|
*/
|
||||||
public SimilarityProvider getSimilarityProvider() { return similarityProviderFactory.getSimilarityProvider(this); }
|
public Similarity getSimilarity() { return similarity; }
|
||||||
|
|
||||||
/**
|
|
||||||
* Returns the SimilarityProviderFactory used for this index
|
|
||||||
*/
|
|
||||||
public SimilarityProviderFactory getSimilarityProviderFactory() { return similarityProviderFactory; }
|
|
||||||
|
|
||||||
private Similarity fallbackSimilarity;
|
|
||||||
|
|
||||||
/** fallback similarity, in the case a field doesnt specify */
|
|
||||||
public Similarity getFallbackSimilarity() { return fallbackSimilarity; }
|
|
||||||
|
|
||||||
/**
|
/**
|
||||||
* Returns the Analyzer used when indexing documents for this index
|
* Returns the Analyzer used when indexing documents for this index
|
||||||
|
@ -438,31 +427,14 @@ public final class IndexSchema {
|
||||||
dynamicFields = dFields.toArray(new DynamicField[dFields.size()]);
|
dynamicFields = dFields.toArray(new DynamicField[dFields.size()]);
|
||||||
|
|
||||||
Node node = (Node) xpath.evaluate("/schema/similarity", document, XPathConstants.NODE);
|
Node node = (Node) xpath.evaluate("/schema/similarity", document, XPathConstants.NODE);
|
||||||
Similarity similarity = readSimilarity(loader, node);
|
SimilarityFactory simFactory = readSimilarity(loader, node);
|
||||||
fallbackSimilarity = similarity == null ? new DefaultSimilarity() : similarity;
|
if (simFactory == null) {
|
||||||
|
simFactory = new DefaultSimilarityFactory();
|
||||||
node = (Node) xpath.evaluate("/schema/similarityProvider", document, XPathConstants.NODE);
|
|
||||||
if (node==null) {
|
|
||||||
final SolrSimilarityProvider provider = new SolrSimilarityProvider(this);
|
|
||||||
similarityProviderFactory = new SimilarityProviderFactory() {
|
|
||||||
@Override
|
|
||||||
public SolrSimilarityProvider getSimilarityProvider(IndexSchema schema) {
|
|
||||||
return provider;
|
|
||||||
}
|
|
||||||
};
|
|
||||||
log.debug("using default similarityProvider");
|
|
||||||
} else {
|
|
||||||
final Object obj = loader.newInstance(((Element) node).getAttribute("class"), "search.similarities.");
|
|
||||||
// just like always, assume it's a SimilarityProviderFactory and get a ClassCastException - reasonable error handling
|
|
||||||
// configure a factory, get a similarity back
|
|
||||||
NamedList<?> args = DOMUtil.childNodesToNamedList(node);
|
|
||||||
similarityProviderFactory = (SimilarityProviderFactory)obj;
|
|
||||||
similarityProviderFactory.init(args);
|
|
||||||
if (similarityProviderFactory instanceof SchemaAware){
|
|
||||||
schemaAware.add((SchemaAware) similarityProviderFactory);
|
|
||||||
}
|
|
||||||
log.debug("using similarityProvider factory" + similarityProviderFactory.getClass().getName());
|
|
||||||
}
|
}
|
||||||
|
if (simFactory instanceof SchemaAware) {
|
||||||
|
((SchemaAware)simFactory).inform(this);
|
||||||
|
}
|
||||||
|
similarity = simFactory.getSimilarity();
|
||||||
|
|
||||||
node = (Node) xpath.evaluate("/schema/defaultSearchField/text()", document, XPathConstants.NODE);
|
node = (Node) xpath.evaluate("/schema/defaultSearchField/text()", document, XPathConstants.NODE);
|
||||||
if (node==null) {
|
if (node==null) {
|
||||||
|
@ -686,7 +658,7 @@ public final class IndexSchema {
|
||||||
return newArr;
|
return newArr;
|
||||||
}
|
}
|
||||||
|
|
||||||
static Similarity readSimilarity(ResourceLoader loader, Node node) throws XPathExpressionException {
|
static SimilarityFactory readSimilarity(ResourceLoader loader, Node node) throws XPathExpressionException {
|
||||||
if (node==null) {
|
if (node==null) {
|
||||||
return null;
|
return null;
|
||||||
} else {
|
} else {
|
||||||
|
@ -706,7 +678,7 @@ public final class IndexSchema {
|
||||||
}
|
}
|
||||||
};
|
};
|
||||||
}
|
}
|
||||||
return similarityFactory.getSimilarity();
|
return similarityFactory;
|
||||||
}
|
}
|
||||||
}
|
}
|
||||||
|
|
||||||
|
|
|
@ -1,34 +0,0 @@
|
||||||
package org.apache.solr.schema;
|
|
||||||
|
|
||||||
/**
|
|
||||||
* Licensed to the Apache Software Foundation (ASF) under one or more
|
|
||||||
* contributor license agreements. See the NOTICE file distributed with
|
|
||||||
* this work for additional information regarding copyright ownership.
|
|
||||||
* The ASF licenses this file to You under the Apache License, Version 2.0
|
|
||||||
* (the "License"); you may not use this file except in compliance with
|
|
||||||
* the License. You may obtain a copy of the License at
|
|
||||||
*
|
|
||||||
* http://www.apache.org/licenses/LICENSE-2.0
|
|
||||||
*
|
|
||||||
* Unless required by applicable law or agreed to in writing, software
|
|
||||||
* distributed under the License is distributed on an "AS IS" BASIS,
|
|
||||||
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
|
||||||
* See the License for the specific language governing permissions and
|
|
||||||
* limitations under the License.
|
|
||||||
*/
|
|
||||||
|
|
||||||
import org.apache.solr.common.util.NamedList;
|
|
||||||
import org.apache.solr.search.SolrSimilarityProvider;
|
|
||||||
|
|
||||||
/**
|
|
||||||
* Expert: Factory to provide a {@link SolrSimilarityProvider}.
|
|
||||||
* <p>
|
|
||||||
* Usually you would implement this if you want to customize the
|
|
||||||
* scoring routines that are not field-specific, such as coord() and queryNorm().
|
|
||||||
* Most scoring customization happens in the fieldtype's Similarity
|
|
||||||
*/
|
|
||||||
public abstract class SimilarityProviderFactory {
|
|
||||||
public void init(NamedList<?> args) {}
|
|
||||||
|
|
||||||
public abstract SolrSimilarityProvider getSimilarityProvider(IndexSchema schema);
|
|
||||||
}
|
|
|
@ -141,7 +141,7 @@ public class SolrIndexSearcher extends IndexSearcher implements Closeable,SolrIn
|
||||||
}
|
}
|
||||||
|
|
||||||
this.closeReader = closeReader;
|
this.closeReader = closeReader;
|
||||||
setSimilarityProvider(schema.getSimilarityProvider());
|
setSimilarity(schema.getSimilarity());
|
||||||
|
|
||||||
SolrConfig solrConfig = core.getSolrConfig();
|
SolrConfig solrConfig = core.getSolrConfig();
|
||||||
queryResultWindowSize = solrConfig.queryResultWindowSize;
|
queryResultWindowSize = solrConfig.queryResultWindowSize;
|
||||||
|
|
|
@ -1,56 +0,0 @@
|
||||||
package org.apache.solr.search;
|
|
||||||
|
|
||||||
/**
|
|
||||||
* Licensed to the Apache Software Foundation (ASF) under one or more
|
|
||||||
* contributor license agreements. See the NOTICE file distributed with
|
|
||||||
* this work for additional information regarding copyright ownership.
|
|
||||||
* The ASF licenses this file to You under the Apache License, Version 2.0
|
|
||||||
* (the "License"); you may not use this file except in compliance with
|
|
||||||
* the License. You may obtain a copy of the License at
|
|
||||||
*
|
|
||||||
* http://www.apache.org/licenses/LICENSE-2.0
|
|
||||||
*
|
|
||||||
* Unless required by applicable law or agreed to in writing, software
|
|
||||||
* distributed under the License is distributed on an "AS IS" BASIS,
|
|
||||||
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
|
||||||
* See the License for the specific language governing permissions and
|
|
||||||
* limitations under the License.
|
|
||||||
*/
|
|
||||||
|
|
||||||
import org.apache.lucene.search.similarities.DefaultSimilarityProvider;
|
|
||||||
import org.apache.lucene.search.similarities.Similarity;
|
|
||||||
import org.apache.solr.schema.FieldType;
|
|
||||||
import org.apache.solr.schema.IndexSchema;
|
|
||||||
|
|
||||||
/**
|
|
||||||
* SimilarityProvider that uses the default Lucene implementation, unless
|
|
||||||
* otherwise specified by the fieldtype.
|
|
||||||
* <p>
|
|
||||||
* You can extend this class to customize the behavior of the parts
|
|
||||||
* of lucene's ranking system that are not field-specific, such as
|
|
||||||
* {@link #coord(int, int)} and {@link #queryNorm(float)}.
|
|
||||||
*/
|
|
||||||
public class SolrSimilarityProvider extends DefaultSimilarityProvider {
|
|
||||||
private final IndexSchema schema;
|
|
||||||
|
|
||||||
public SolrSimilarityProvider(IndexSchema schema) {
|
|
||||||
this.schema = schema;
|
|
||||||
}
|
|
||||||
|
|
||||||
/**
|
|
||||||
* Solr implementation delegates to the fieldtype's similarity.
|
|
||||||
* If this does not exist, uses the schema's default similarity.
|
|
||||||
*/
|
|
||||||
// note: this is intentionally final, to maintain consistency with
|
|
||||||
// whatever is specified in the the schema!
|
|
||||||
@Override
|
|
||||||
public final Similarity get(String field) {
|
|
||||||
FieldType fieldType = schema.getFieldTypeNoEx(field);
|
|
||||||
if (fieldType == null) {
|
|
||||||
return schema.getFallbackSimilarity();
|
|
||||||
} else {
|
|
||||||
Similarity similarity = fieldType.getSimilarity();
|
|
||||||
return similarity == null ? schema.getFallbackSimilarity() : similarity;
|
|
||||||
}
|
|
||||||
}
|
|
||||||
}
|
|
|
@ -1,47 +0,0 @@
|
||||||
package org.apache.solr.search.similarities;
|
|
||||||
|
|
||||||
/**
|
|
||||||
* Licensed to the Apache Software Foundation (ASF) under one or more
|
|
||||||
* contributor license agreements. See the NOTICE file distributed with
|
|
||||||
* this work for additional information regarding copyright ownership.
|
|
||||||
* The ASF licenses this file to You under the Apache License, Version 2.0
|
|
||||||
* (the "License"); you may not use this file except in compliance with
|
|
||||||
* the License. You may obtain a copy of the License at
|
|
||||||
*
|
|
||||||
* http://www.apache.org/licenses/LICENSE-2.0
|
|
||||||
*
|
|
||||||
* Unless required by applicable law or agreed to in writing, software
|
|
||||||
* distributed under the License is distributed on an "AS IS" BASIS,
|
|
||||||
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
|
||||||
* See the License for the specific language governing permissions and
|
|
||||||
* limitations under the License.
|
|
||||||
*/
|
|
||||||
|
|
||||||
import org.apache.lucene.search.similarities.SimilarityProvider; // javadoc
|
|
||||||
import org.apache.solr.schema.IndexSchema;
|
|
||||||
import org.apache.solr.schema.SimilarityProviderFactory;
|
|
||||||
import org.apache.solr.search.SolrSimilarityProvider;
|
|
||||||
|
|
||||||
/**
|
|
||||||
* This class is aimed at non-VSM models, and therefore both the
|
|
||||||
* {@link SimilarityProvider#coord} and
|
|
||||||
* {@link SimilarityProvider#queryNorm} methods return {@code 1}.
|
|
||||||
* @lucene.experimental
|
|
||||||
*/
|
|
||||||
public class BasicSimilarityProviderFactory extends SimilarityProviderFactory {
|
|
||||||
|
|
||||||
@Override
|
|
||||||
public SolrSimilarityProvider getSimilarityProvider(IndexSchema schema) {
|
|
||||||
return new SolrSimilarityProvider(schema) {
|
|
||||||
@Override
|
|
||||||
public float coord(int overlap, int maxOverlap) {
|
|
||||||
return 1f;
|
|
||||||
}
|
|
||||||
|
|
||||||
@Override
|
|
||||||
public float queryNorm(float sumOfSquaredWeights) {
|
|
||||||
return 1f;
|
|
||||||
}
|
|
||||||
};
|
|
||||||
}
|
|
||||||
}
|
|
|
@ -0,0 +1,58 @@
|
||||||
|
package org.apache.solr.search.similarities;
|
||||||
|
|
||||||
|
/**
|
||||||
|
* Licensed to the Apache Software Foundation (ASF) under one or more
|
||||||
|
* contributor license agreements. See the NOTICE file distributed with
|
||||||
|
* this work for additional information regarding copyright ownership.
|
||||||
|
* The ASF licenses this file to You under the Apache License, Version 2.0
|
||||||
|
* (the "License"); you may not use this file except in compliance with
|
||||||
|
* the License. You may obtain a copy of the License at
|
||||||
|
*
|
||||||
|
* http://www.apache.org/licenses/LICENSE-2.0
|
||||||
|
*
|
||||||
|
* Unless required by applicable law or agreed to in writing, software
|
||||||
|
* distributed under the License is distributed on an "AS IS" BASIS,
|
||||||
|
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||||
|
* See the License for the specific language governing permissions and
|
||||||
|
* limitations under the License.
|
||||||
|
*/
|
||||||
|
|
||||||
|
import org.apache.lucene.search.similarities.DefaultSimilarity;
|
||||||
|
import org.apache.lucene.search.similarities.PerFieldSimilarityWrapper;
|
||||||
|
import org.apache.lucene.search.similarities.Similarity;
|
||||||
|
import org.apache.solr.schema.FieldType;
|
||||||
|
import org.apache.solr.schema.IndexSchema;
|
||||||
|
import org.apache.solr.schema.SchemaAware;
|
||||||
|
import org.apache.solr.schema.SimilarityFactory;
|
||||||
|
|
||||||
|
/**
|
||||||
|
* SimilarityFactory that returns a PerFieldSimilarityWrapper
|
||||||
|
* that delegates to the fieldtype, if it's configured, otherwise
|
||||||
|
* {@link DefaultSimilarity}.
|
||||||
|
*/
|
||||||
|
public class SchemaSimilarityFactory extends SimilarityFactory implements SchemaAware {
|
||||||
|
private Similarity similarity;
|
||||||
|
private Similarity defaultSimilarity = new DefaultSimilarity();
|
||||||
|
|
||||||
|
@Override
|
||||||
|
public void inform(final IndexSchema schema) {
|
||||||
|
similarity = new PerFieldSimilarityWrapper() {
|
||||||
|
@Override
|
||||||
|
public Similarity get(String name) {
|
||||||
|
FieldType fieldType = schema.getFieldTypeNoEx(name);
|
||||||
|
if (fieldType == null) {
|
||||||
|
return defaultSimilarity;
|
||||||
|
} else {
|
||||||
|
Similarity similarity = fieldType.getSimilarity();
|
||||||
|
return similarity == null ? defaultSimilarity : similarity;
|
||||||
|
}
|
||||||
|
}
|
||||||
|
};
|
||||||
|
}
|
||||||
|
|
||||||
|
@Override
|
||||||
|
public Similarity getSimilarity() {
|
||||||
|
assert similarity != null : "inform must be called first";
|
||||||
|
return similarity;
|
||||||
|
}
|
||||||
|
}
|
|
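The SchemaSimilarityFactory added above resolves a similarity per field name and falls back to a default when the field type is unknown or has no similarity configured. A minimal, dependency-free sketch of that lookup-with-fallback pattern follows. `Scorer` and `PerFieldScorer` are hypothetical stand-ins for illustration only, not Lucene or Solr classes, and the map replaces the schema lookup.

```java
import java.util.HashMap;
import java.util.Map;

public class PerFieldFallbackSketch {
  /** Hypothetical stand-in for a Similarity; it only carries a name. */
  static final class Scorer {
    final String name;
    Scorer(String name) { this.name = name; }
  }

  /** Hypothetical stand-in for a per-field wrapper: delegates by field name. */
  static final class PerFieldScorer {
    // field name -> configured scorer; a null value means "nothing configured"
    private final Map<String, Scorer> byField;
    private final Scorer fallback;

    PerFieldScorer(Map<String, Scorer> byField, Scorer fallback) {
      this.byField = byField;
      this.fallback = fallback;
    }

    Scorer get(String field) {
      // null both for unknown fields and for fields without a configured scorer
      Scorer configured = byField.get(field);
      return configured == null ? fallback : configured;
    }
  }

  public static void main(String[] args) {
    Map<String, Scorer> byField = new HashMap<>();
    byField.put("sim1text", new Scorer("sweetspot"));
    byField.put("sim3text", null); // known field, no scorer configured
    PerFieldScorer wrapper = new PerFieldScorer(byField, new Scorer("default"));

    System.out.println(wrapper.get("sim1text").name);
    System.out.println(wrapper.get("sim3text").name);
    System.out.println(wrapper.get("no_such_field").name);
  }
}
```

In this sketch both the unconfigured field and the nonexistent field resolve to the fallback, which is the behaviour the updated TestPerFieldSimilarity assertions check against DefaultSimilarity.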
@ -152,7 +152,7 @@ public class SolrIndexConfig {
|
||||||
if (writeLockTimeout != -1)
|
if (writeLockTimeout != -1)
|
||||||
iwc.setWriteLockTimeout(writeLockTimeout);
|
iwc.setWriteLockTimeout(writeLockTimeout);
|
||||||
|
|
||||||
iwc.setSimilarityProvider(schema.getSimilarityProvider());
|
iwc.setSimilarity(schema.getSimilarity());
|
||||||
iwc.setMergePolicy(buildMergePolicy(schema));
|
iwc.setMergePolicy(buildMergePolicy(schema));
|
||||||
iwc.setMergeScheduler(buildMergeScheduler(schema));
|
iwc.setMergeScheduler(buildMergeScheduler(schema));
|
||||||
|
|
||||||
|
|
|
@ -48,6 +48,5 @@
|
||||||
<defaultSearchField>text</defaultSearchField>
|
<defaultSearchField>text</defaultSearchField>
|
||||||
<uniqueKey>id</uniqueKey>
|
<uniqueKey>id</uniqueKey>
|
||||||
|
|
||||||
<!-- when using non-vector space models, its recommended to disable coord() etc -->
|
<similarity class="solr.SchemaSimilarityFactory"/>
|
||||||
<similarityProvider class="solr.BasicSimilarityProviderFactory"/>
|
|
||||||
</schema>
|
</schema>
|
||||||
|
|
|
@ -66,6 +66,5 @@
|
||||||
<defaultSearchField>text</defaultSearchField>
|
<defaultSearchField>text</defaultSearchField>
|
||||||
<uniqueKey>id</uniqueKey>
|
<uniqueKey>id</uniqueKey>
|
||||||
|
|
||||||
<!-- when using non-vector space models, its recommended to disable coord() etc -->
|
<similarity class="solr.SchemaSimilarityFactory"/>
|
||||||
<similarityProvider class="solr.BasicSimilarityProviderFactory"/>
|
|
||||||
</schema>
|
</schema>
|
||||||
|
|
|
@ -54,6 +54,5 @@
|
||||||
<defaultSearchField>text</defaultSearchField>
|
<defaultSearchField>text</defaultSearchField>
|
||||||
<uniqueKey>id</uniqueKey>
|
<uniqueKey>id</uniqueKey>
|
||||||
|
|
||||||
<!-- when using non-vector space models, its recommended to disable coord() etc -->
|
<similarity class="solr.SchemaSimilarityFactory"/>
|
||||||
<similarityProvider class="solr.BasicSimilarityProviderFactory"/>
|
|
||||||
</schema>
|
</schema>
|
||||||
|
|
|
@ -47,6 +47,5 @@
|
||||||
<defaultSearchField>text</defaultSearchField>
|
<defaultSearchField>text</defaultSearchField>
|
||||||
<uniqueKey>id</uniqueKey>
|
<uniqueKey>id</uniqueKey>
|
||||||
|
|
||||||
<!-- when using non-vector space models, its recommended to disable coord() etc -->
|
<similarity class="solr.SchemaSimilarityFactory"/>
|
||||||
<similarityProvider class="solr.BasicSimilarityProviderFactory"/>
|
|
||||||
</schema>
|
</schema>
|
||||||
|
|
|
@ -47,6 +47,5 @@
|
||||||
<defaultSearchField>text</defaultSearchField>
|
<defaultSearchField>text</defaultSearchField>
|
||||||
<uniqueKey>id</uniqueKey>
|
<uniqueKey>id</uniqueKey>
|
||||||
|
|
||||||
<!-- when using non-vector space models, its recommended to disable coord() etc -->
|
<similarity class="solr.SchemaSimilarityFactory"/>
|
||||||
<similarityProvider class="solr.BasicSimilarityProviderFactory"/>
|
|
||||||
</schema>
|
</schema>
|
||||||
|
|
|
@ -0,0 +1,68 @@
|
||||||
|
<?xml version="1.0" ?>
|
||||||
|
<!--
|
||||||
|
Licensed to the Apache Software Foundation (ASF) under one or more
|
||||||
|
contributor license agreements. See the NOTICE file distributed with
|
||||||
|
this work for additional information regarding copyright ownership.
|
||||||
|
The ASF licenses this file to You under the Apache License, Version 2.0
|
||||||
|
(the "License"); you may not use this file except in compliance with
|
||||||
|
the License. You may obtain a copy of the License at
|
||||||
|
|
||||||
|
http://www.apache.org/licenses/LICENSE-2.0
|
||||||
|
|
||||||
|
Unless required by applicable law or agreed to in writing, software
|
||||||
|
distributed under the License is distributed on an "AS IS" BASIS,
|
||||||
|
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||||
|
See the License for the specific language governing permissions and
|
||||||
|
limitations under the License.
|
||||||
|
-->
|
||||||
|
|
||||||
|
<!-- Per-field similarity example for testing -->
|
||||||
|
|
||||||
|
<schema name="test" version="1.0">
|
||||||
|
<types>
|
||||||
|
<fieldType name="int" class="solr.TrieIntField" precisionStep="0" omitNorms="true" positionIncrementGap="0"/>
|
||||||
|
<!-- some per-field similarity examples -->
|
||||||
|
<!-- specify a Similarity classname directly -->
|
||||||
|
<fieldType name="sim1" class="solr.TextField">
|
||||||
|
<analyzer>
|
||||||
|
<tokenizer class="solr.MockTokenizerFactory"/>
|
||||||
|
</analyzer>
|
||||||
|
<similarity class="org.apache.lucene.misc.SweetSpotSimilarity"/>
|
||||||
|
</fieldType>
|
||||||
|
|
||||||
|
<!-- specify a Similarity factory -->
|
||||||
|
<fieldType name="sim2" class="solr.TextField">
|
||||||
|
<analyzer>
|
||||||
|
<tokenizer class="solr.MockTokenizerFactory"/>
|
||||||
|
</analyzer>
|
||||||
|
<similarity class="solr.CustomSimilarityFactory">
|
||||||
|
<str name="echo">is there an echo?</str>
|
||||||
|
</similarity>
|
||||||
|
</fieldType>
|
||||||
|
|
||||||
|
<!-- don't specify any sim at all: get the default -->
|
||||||
|
<fieldType name="sim3" class="solr.TextField">
|
||||||
|
<analyzer>
|
||||||
|
<tokenizer class="solr.MockTokenizerFactory"/>
|
||||||
|
</analyzer>
|
||||||
|
</fieldType>
|
||||||
|
</types>
|
||||||
|
|
||||||
|
<fields>
|
||||||
|
<field name="id" type="int" indexed="true" stored="true" multiValued="false" required="false"/>
|
||||||
|
<field name="sim1text" type="sim1" indexed="true" stored="true"/>
|
||||||
|
<field name="sim2text" type="sim2" indexed="true" stored="true"/>
|
||||||
|
<field name="sim3text" type="sim3" indexed="true" stored="true"/>
|
||||||
|
|
||||||
|
<!-- make sure custom sims work with dynamic fields -->
|
||||||
|
<dynamicField name="*_sim1" type="sim1" indexed="true" stored="true"/>
|
||||||
|
<dynamicField name="*_sim2" type="sim2" indexed="true" stored="true"/>
|
||||||
|
<dynamicField name="*_sim3" type="sim3" indexed="true" stored="true"/>
|
||||||
|
</fields>
|
||||||
|
|
||||||
|
<defaultSearchField>sim1text</defaultSearchField>
|
||||||
|
<uniqueKey>id</uniqueKey>
|
||||||
|
|
||||||
|
<!-- default similarity, defers to the fieldType -->
|
||||||
|
<similarity class="solr.SchemaSimilarityFactory"/>
|
||||||
|
</schema>
|
|
@ -680,17 +680,7 @@
|
||||||
<!-- dynamic destination -->
|
<!-- dynamic destination -->
|
||||||
<copyField source="*_dynamic" dest="dynamic_*"/>
|
<copyField source="*_dynamic" dest="dynamic_*"/>
|
||||||
|
|
||||||
<!-- expert: SimilarityProvider contains scoring routines that are not field-specific,
|
<!-- example of a custom similarity -->
|
||||||
such as coord() and queryNorm(). most scoring customization happens in the fieldtype.
|
|
||||||
A custom similarity provider may be specified here, but the default is fine
|
|
||||||
for most applications.
|
|
||||||
-->
|
|
||||||
<similarityProvider class="solr.CustomSimilarityProviderFactory">
|
|
||||||
<str name="echo">is there an echo?</str>
|
|
||||||
</similarityProvider>
|
|
||||||
|
|
||||||
<!-- default similarity, unless otherwise specified by the fieldType
|
|
||||||
-->
|
|
||||||
<similarity class="solr.CustomSimilarityFactory">
|
<similarity class="solr.CustomSimilarityFactory">
|
||||||
<str name="echo">I am your default sim</str>
|
<str name="echo">I am your default sim</str>
|
||||||
</similarity>
|
</similarity>
|
||||||
|
|
|
@ -17,14 +17,12 @@
|
||||||
|
|
||||||
package org.apache.solr.schema;
|
package org.apache.solr.schema;
|
||||||
|
|
||||||
import org.apache.lucene.search.similarities.SimilarityProvider;
|
|
||||||
import org.apache.solr.SolrTestCaseJ4;
|
import org.apache.solr.SolrTestCaseJ4;
|
||||||
import org.apache.solr.common.params.CommonParams;
|
import org.apache.solr.common.params.CommonParams;
|
||||||
import org.apache.solr.common.params.MapSolrParams;
|
import org.apache.solr.common.params.MapSolrParams;
|
||||||
import org.apache.solr.core.SolrCore;
|
import org.apache.solr.core.SolrCore;
|
||||||
import org.apache.solr.request.LocalSolrQueryRequest;
|
import org.apache.solr.request.LocalSolrQueryRequest;
|
||||||
import org.apache.solr.request.SolrQueryRequest;
|
import org.apache.solr.request.SolrQueryRequest;
|
||||||
import org.apache.solr.search.similarities.MockConfigurableSimilarityProvider;
|
|
||||||
import org.junit.BeforeClass;
|
import org.junit.BeforeClass;
|
||||||
import org.junit.Test;
|
import org.junit.Test;
|
||||||
|
|
||||||
|
@ -80,14 +78,6 @@ public class IndexSchemaTest extends SolrTestCaseJ4 {
|
||||||
clearIndex();
|
clearIndex();
|
||||||
}
|
}
|
||||||
|
|
||||||
@Test
|
|
||||||
public void testSimilarityProviderFactory() {
|
|
||||||
SolrCore core = h.getCore();
|
|
||||||
SimilarityProvider similarityProvider = core.getSchema().getSimilarityProvider();
|
|
||||||
assertTrue("wrong class", similarityProvider instanceof MockConfigurableSimilarityProvider);
|
|
||||||
assertEquals("is there an echo?", ((MockConfigurableSimilarityProvider)similarityProvider).getPassthrough());
|
|
||||||
}
|
|
||||||
|
|
||||||
@Test
|
@Test
|
||||||
public void testIsDynamicField() throws Exception {
|
public void testIsDynamicField() throws Exception {
|
||||||
SolrCore core = h.getCore();
|
SolrCore core = h.getCore();
|
||||||
|
|
|
@ -344,7 +344,7 @@ public class TestFunctionQuery extends SolrTestCaseJ4 {
|
||||||
"//float[@name='score']='" + similarity.idf(3,6) + "'");
|
"//float[@name='score']='" + similarity.idf(3,6) + "'");
|
||||||
assertQ(req("fl","*,score","q", "{!func}tf(a_t,cow)", "fq","id:6"),
|
assertQ(req("fl","*,score","q", "{!func}tf(a_t,cow)", "fq","id:6"),
|
||||||
"//float[@name='score']='" + similarity.tf(5) + "'");
|
"//float[@name='score']='" + similarity.tf(5) + "'");
|
||||||
FieldInvertState state = new FieldInvertState();
|
FieldInvertState state = new FieldInvertState("a_t");
|
||||||
state.setBoost(1.0f);
|
state.setBoost(1.0f);
|
||||||
state.setLength(4);
|
state.setLength(4);
|
||||||
Norm norm = new Norm();
|
Norm norm = new Norm();
|
||||||
|
|
|
@ -17,8 +17,8 @@ package org.apache.solr.search.similarities;
|
||||||
* limitations under the License.
|
* limitations under the License.
|
||||||
*/
|
*/
|
||||||
|
|
||||||
|
import org.apache.lucene.search.similarities.PerFieldSimilarityWrapper;
|
||||||
import org.apache.lucene.search.similarities.Similarity;
|
import org.apache.lucene.search.similarities.Similarity;
|
||||||
import org.apache.lucene.search.similarities.SimilarityProvider;
|
|
||||||
import org.apache.solr.SolrTestCaseJ4;
|
import org.apache.solr.SolrTestCaseJ4;
|
||||||
import org.apache.solr.core.SolrCore;
|
import org.apache.solr.core.SolrCore;
|
||||||
import org.apache.solr.search.SolrIndexSearcher;
|
import org.apache.solr.search.SolrIndexSearcher;
|
||||||
|
@ -30,17 +30,11 @@ public abstract class BaseSimilarityTestCase extends SolrTestCaseJ4 {
|
||||||
protected Similarity getSimilarity(String field) {
|
protected Similarity getSimilarity(String field) {
|
||||||
SolrCore core = h.getCore();
|
SolrCore core = h.getCore();
|
||||||
RefCounted<SolrIndexSearcher> searcher = core.getSearcher();
|
RefCounted<SolrIndexSearcher> searcher = core.getSearcher();
|
||||||
Similarity sim = searcher.get().getSimilarityProvider().get(field);
|
Similarity sim = searcher.get().getSimilarity();
|
||||||
searcher.decref();
|
searcher.decref();
|
||||||
|
while (sim instanceof PerFieldSimilarityWrapper) {
|
||||||
|
sim = ((PerFieldSimilarityWrapper)sim).get(field);
|
||||||
|
}
|
||||||
return sim;
|
return sim;
|
||||||
}
|
}
|
||||||
|
|
||||||
/** returns the (Solr)SimilarityProvider */
|
|
||||||
protected SimilarityProvider getSimilarityProvider() {
|
|
||||||
SolrCore core = h.getCore();
|
|
||||||
RefCounted<SolrIndexSearcher> searcher = core.getSearcher();
|
|
||||||
SimilarityProvider prov = searcher.get().getSimilarityProvider();
|
|
||||||
searcher.decref();
|
|
||||||
return prov;
|
|
||||||
}
|
|
||||||
}
|
}
|
||||||
|
|
|
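The updated getSimilarity(String) helper in BaseSimilarityTestCase keeps asking a wrapper for the field's similarity until it reaches a non-wrapper implementation. A small sketch of that unwrapping loop is given here under stated assumptions: `Sim`, `Leaf`, and `Wrapper` are hypothetical stand-ins and do not mirror the real Lucene types.

```java
public class UnwrapSketch {
  /** Hypothetical stand-in for Similarity. */
  interface Sim {}

  /** A concrete, non-delegating implementation. */
  static final class Leaf implements Sim {
    final String name;
    Leaf(String name) { this.name = name; }
  }

  /** Hypothetical stand-in for a per-field wrapper. */
  interface Wrapper extends Sim {
    Sim get(String field);
  }

  /** Follows wrappers until a non-wrapper implementation is reached. */
  static Sim unwrap(Sim sim, String field) {
    while (sim instanceof Wrapper) {
      sim = ((Wrapper) sim).get(field);
    }
    return sim;
  }

  public static void main(String[] args) {
    Leaf leaf = new Leaf("default");
    Wrapper inner = field -> leaf;   // wrapper that resolves to the leaf
    Wrapper outer = field -> inner;  // wrapper around another wrapper
    Sim resolved = unwrap(outer, "sim3text");
    System.out.println(((Leaf) resolved).name);
  }
}
```

A loop rather than a single cast handles wrappers nested inside wrappers; it assumes the chain eventually ends in a non-wrapper, since a wrapper that returns itself would never terminate.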
@ -1,36 +0,0 @@
|
||||||
/**
|
|
||||||
* Licensed to the Apache Software Foundation (ASF) under one or more
|
|
||||||
* contributor license agreements. See the NOTICE file distributed with
|
|
||||||
* this work for additional information regarding copyright ownership.
|
|
||||||
* The ASF licenses this file to You under the Apache License, Version 2.0
|
|
||||||
* (the "License"); you may not use this file except in compliance with
|
|
||||||
* the License. You may obtain a copy of the License at
|
|
||||||
*
|
|
||||||
* http://www.apache.org/licenses/LICENSE-2.0
|
|
||||||
*
|
|
||||||
* Unless required by applicable law or agreed to in writing, software
|
|
||||||
* distributed under the License is distributed on an "AS IS" BASIS,
|
|
||||||
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
|
||||||
* See the License for the specific language governing permissions and
|
|
||||||
* limitations under the License.
|
|
||||||
*/
|
|
||||||
package org.apache.solr.search.similarities;
|
|
||||||
|
|
||||||
import org.apache.solr.common.util.NamedList;
|
|
||||||
import org.apache.solr.schema.IndexSchema;
|
|
||||||
import org.apache.solr.schema.SimilarityProviderFactory;
|
|
||||||
import org.apache.solr.search.SolrSimilarityProvider;
|
|
||||||
|
|
||||||
public class CustomSimilarityProviderFactory extends SimilarityProviderFactory {
|
|
||||||
String echoParam;
|
|
||||||
|
|
||||||
@Override
|
|
||||||
public void init(NamedList<?> args) {
|
|
||||||
echoParam = (String) args.get("echo");
|
|
||||||
}
|
|
||||||
|
|
||||||
@Override
|
|
||||||
public SolrSimilarityProvider getSimilarityProvider(IndexSchema schema) {
|
|
||||||
return new MockConfigurableSimilarityProvider(schema, echoParam);
|
|
||||||
}
|
|
||||||
}
|
|
|
@ -1,33 +0,0 @@
|
||||||
/**
|
|
||||||
* Licensed to the Apache Software Foundation (ASF) under one or more
|
|
||||||
* contributor license agreements. See the NOTICE file distributed with
|
|
||||||
* this work for additional information regarding copyright ownership.
|
|
||||||
* The ASF licenses this file to You under the Apache License, Version 2.0
|
|
||||||
* (the "License"); you may not use this file except in compliance with
|
|
||||||
* the License. You may obtain a copy of the License at
|
|
||||||
*
|
|
||||||
* http://www.apache.org/licenses/LICENSE-2.0
|
|
||||||
*
|
|
||||||
* Unless required by applicable law or agreed to in writing, software
|
|
||||||
* distributed under the License is distributed on an "AS IS" BASIS,
|
|
||||||
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
|
||||||
* See the License for the specific language governing permissions and
|
|
||||||
* limitations under the License.
|
|
||||||
*/
|
|
||||||
package org.apache.solr.search.similarities;
|
|
||||||
|
|
||||||
import org.apache.solr.schema.IndexSchema;
|
|
||||||
import org.apache.solr.search.SolrSimilarityProvider;
|
|
||||||
|
|
||||||
public class MockConfigurableSimilarityProvider extends SolrSimilarityProvider {
|
|
||||||
private String passthrough;
|
|
||||||
|
|
||||||
public MockConfigurableSimilarityProvider(IndexSchema schema, String passthrough) {
|
|
||||||
super(schema);
|
|
||||||
this.passthrough = passthrough;
|
|
||||||
}
|
|
||||||
|
|
||||||
public String getPassthrough() {
|
|
||||||
return passthrough;
|
|
||||||
}
|
|
||||||
}
|
|
|
@ -18,11 +18,9 @@ package org.apache.solr.search.similarities;
|
||||||
*/
|
*/
|
||||||
|
|
||||||
import org.apache.lucene.misc.SweetSpotSimilarity;
|
import org.apache.lucene.misc.SweetSpotSimilarity;
|
||||||
|
import org.apache.lucene.search.similarities.DefaultSimilarity;
|
||||||
import org.apache.lucene.search.similarities.Similarity;
|
import org.apache.lucene.search.similarities.Similarity;
|
||||||
import org.apache.lucene.search.similarities.SimilarityProvider;
|
|
||||||
import org.apache.solr.core.SolrCore;
|
|
||||||
import org.junit.BeforeClass;
|
import org.junit.BeforeClass;
|
||||||
import org.junit.Test;
|
|
||||||
|
|
||||||
/**
|
/**
|
||||||
* Tests per-field similarity support in the schema
|
* Tests per-field similarity support in the schema
|
||||||
|
@ -31,7 +29,7 @@ public class TestPerFieldSimilarity extends BaseSimilarityTestCase {
|
||||||
|
|
||||||
@BeforeClass
|
@BeforeClass
|
||||||
public static void beforeClass() throws Exception {
|
public static void beforeClass() throws Exception {
|
||||||
initCore("solrconfig.xml","schema.xml");
|
initCore("solrconfig-basic.xml","schema-sim.xml");
|
||||||
}
|
}
|
||||||
|
|
||||||
/** test a field where the sim is specified directly */
|
/** test a field where the sim is specified directly */
|
||||||
|
@ -61,29 +59,18 @@ public class TestPerFieldSimilarity extends BaseSimilarityTestCase {
|
||||||
/** test a field where no similarity is specified */
|
/** test a field where no similarity is specified */
|
||||||
public void testDefaults() throws Exception {
|
public void testDefaults() throws Exception {
|
||||||
Similarity sim = getSimilarity("sim3text");
|
Similarity sim = getSimilarity("sim3text");
|
||||||
assertEquals(MockConfigurableSimilarity.class, sim.getClass());
|
assertEquals(DefaultSimilarity.class, sim.getClass());;
|
||||||
assertEquals("I am your default sim", ((MockConfigurableSimilarity)sim).getPassthrough());
|
|
||||||
}
|
}
|
||||||
|
|
||||||
/** ... and for a dynamic field */
|
/** ... and for a dynamic field */
|
||||||
public void testDefaultsDynamic() throws Exception {
|
public void testDefaultsDynamic() throws Exception {
|
||||||
Similarity sim = getSimilarity("text_sim3");
|
Similarity sim = getSimilarity("text_sim3");
|
||||||
assertEquals(MockConfigurableSimilarity.class, sim.getClass());
|
assertEquals(DefaultSimilarity.class, sim.getClass());
|
||||||
assertEquals("I am your default sim", ((MockConfigurableSimilarity)sim).getPassthrough());
|
|
||||||
}
|
}
|
||||||
|
|
||||||
/** test a field that does not exist */
|
/** test a field that does not exist */
|
||||||
public void testNonexistent() throws Exception {
|
public void testNonexistent() throws Exception {
|
||||||
Similarity sim = getSimilarity("sdfdsfdsfdswr5fsdfdsfdsfs");
|
Similarity sim = getSimilarity("sdfdsfdsfdswr5fsdfdsfdsfs");
|
||||||
assertEquals(MockConfigurableSimilarity.class, sim.getClass());
|
assertEquals(DefaultSimilarity.class, sim.getClass());
|
||||||
assertEquals("I am your default sim", ((MockConfigurableSimilarity)sim).getPassthrough());
|
|
||||||
}
|
|
||||||
|
|
||||||
@Test
|
|
||||||
public void testSimilarityProviderFactory() {
|
|
||||||
SolrCore core = h.getCore();
|
|
||||||
SimilarityProvider similarityProvider = core.getSchema().getSimilarityProvider();
|
|
||||||
assertTrue("wrong class", similarityProvider instanceof MockConfigurableSimilarityProvider);
|
|
||||||
assertEquals("is there an echo?", ((MockConfigurableSimilarityProvider)similarityProvider).getPassthrough());
|
|
||||||
}
|
}
|
||||||
}
|
}
|
||||||
|
|
|
@ -640,17 +640,7 @@
|
||||||
<!-- dynamic destination -->
|
<!-- dynamic destination -->
|
||||||
<copyField source="*_dynamic" dest="dynamic_*"/>
|
<copyField source="*_dynamic" dest="dynamic_*"/>
|
||||||
|
|
||||||
<!-- expert: SimilarityProvider contains scoring routines that are not field-specific,
|
<!-- example of a custom similarity -->
|
||||||
such as coord() and queryNorm(). most scoring customization happens in the fieldtype.
|
|
||||||
A custom similarity provider may be specified here, but the default is fine
|
|
||||||
for most applications.
|
|
||||||
-->
|
|
||||||
<similarityProvider class="org.apache.solr.schema.CustomSimilarityProviderFactory">
|
|
||||||
<str name="echo">is there an echo?</str>
|
|
||||||
</similarityProvider>
|
|
||||||
|
|
||||||
<!-- default similarity, unless otherwise specified by the fieldType
|
|
||||||
-->
|
|
||||||
<similarity class="org.apache.solr.schema.CustomSimilarityFactory">
|
<similarity class="org.apache.solr.schema.CustomSimilarityFactory">
|
||||||
<str name="echo">I am your default sim</str>
|
<str name="echo">I am your default sim</str>
|
||||||
</similarity>
|
</similarity>
|
||||||
|
|