mirror of
https://github.com/honeymoose/OpenSearch.git
synced 2025-03-08 03:49:38 +00:00
Term Vectors: Support for artificial documents
This adds the ability to the Term Vector API to generate term vectors for artificial documents, that is, for documents not present in the index. Following a syntax similar to the Percolator API, a new 'doc' parameter is used, instead of '_id', to specify the document of interest. The parameters '_index' and '_type' determine the mapping, and therefore the analyzers to apply to each field value. Closes #7530
This commit is contained in:
parent
b49853a619
commit
07d741c2cb
@@ -1,7 +1,10 @@
 [[docs-multi-termvectors]]
 == Multi termvectors API
 
-Multi termvectors API allows to get multiple termvectors based on an index, type and id. The response includes a `docs`
+Multi termvectors API allows to get multiple termvectors at once. The
+documents from which to retrieve the term vectors are specified by an index,
+type and id. But the documents could also be artificially provided coming[1.4.0].
+The response includes a `docs`
 array with all the fetched termvectors, each element having the structure
 provided by the <<docs-termvectors,termvectors>>
 API. Here is an example:
@@ -89,4 +92,31 @@ curl 'localhost:9200/testidx/test/_mtermvectors' -d '{
 }'
 --------------------------------------------------
 
 Parameters can also be set by passing them as uri parameters (see <<docs-termvectors,termvectors>>). uri parameters are the default parameters and are overwritten by any parameter setting defined in the body.
+Additionally coming[1.4.0], just like for the <<docs-termvectors,termvectors>>
+API, term vectors could be generated for user provided documents. The syntax
+is similar to the <<search-percolate,percolator>> API. The mapping used is
+determined by `_index` and `_type`.
+
+[source,js]
+--------------------------------------------------
+curl 'localhost:9200/_mtermvectors' -d '{
+   "docs": [
+      {
+         "_index": "testidx",
+         "_type": "test",
+         "doc" : {
+            "fullname" : "John Doe",
+            "text" : "twitter test test test"
+         }
+      },
+      {
+         "_index": "testidx",
+         "_type": "test",
+         "doc" : {
+            "fullname" : "Jane Doe",
+            "text" : "Another twitter test ..."
+         }
+      }
+   ]
+}'
+--------------------------------------------------
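For readers scripting against this endpoint, the request body for artificial documents can also be assembled programmatically. A minimal Python sketch, using only the field names shown in the curl example above (the index/type values are just the example's):

```python
import json

def mtermvectors_body(docs):
    """Build a _mtermvectors request body for artificial documents.

    Each entry carries the target mapping (_index, _type) and the document
    itself under 'doc' instead of an '_id'.
    """
    return json.dumps({"docs": [
        {"_index": index, "_type": doc_type, "doc": doc}
        for (index, doc_type, doc) in docs
    ]})

body = mtermvectors_body([
    ("testidx", "test", {"fullname": "John Doe", "text": "twitter test test test"}),
    ("testidx", "test", {"fullname": "Jane Doe", "text": "Another twitter test ..."}),
])
parsed = json.loads(body)
```

The resulting JSON string can be POSTed to `/_mtermvectors` exactly like the curl example.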
@@ -3,10 +3,11 @@
 added[1.0.0.Beta1]
 
-Returns information and statistics on terms in the fields of a
-particular document as stored in the index. Note that this is a
-near realtime API as the term vectors are not available until the
-next refresh.
+Returns information and statistics on terms in the fields of a particular
+document. The document could be stored in the index or artificially provided
+by the user coming[1.4.0]. Note that for documents stored in the index, this
+is a near realtime API as the term vectors are not available until the next
+refresh.
 
 [source,js]
 --------------------------------------------------
@@ -41,10 +42,10 @@ statistics are returned for all fields but no term statistics.
 * term payloads (`payloads` : true), as base64 encoded bytes
 
 If the requested information wasn't stored in the index, it will be
-computed on the fly if possible. See <<mapping-types,type mapping>>
-for how to configure your index to store term vectors.
+computed on the fly if possible. Additionally, term vectors could be computed
+for documents not even existing in the index, but instead provided by the user.
 
-coming[1.4.0,The ability to computed term vectors on the fly is only available from 1.4.0 onwards (see below)]
+coming[1.4.0,The ability to compute term vectors on the fly, as well as support for artificial documents, is only available from 1.4.0 onwards (see examples 2 and 3 below, respectively)]
 
 [WARNING]
 ======
@@ -86,7 +87,9 @@ The term and field statistics are not accurate. Deleted documents
 are not taken into account. The information is only retrieved for the
 shard the requested document resides in. The term and field statistics
 are therefore only useful as relative measures whereas the absolute
-numbers have no meaning in this context.
+numbers have no meaning in this context. By default, when requesting
+term vectors of artificial documents, a shard to get the statistics from
+is randomly selected. Use `routing` only to hit a particular shard.
 
 [float]
 === Example 1
@@ -231,7 +234,7 @@ Response:
 [float]
 === Example 2 coming[1.4.0]
 
-Additionally, term vectors which are not explicitly stored in the index are automatically
+Term vectors which are not explicitly stored in the index are automatically
 computed on the fly. The following request returns all information and statistics for the
 fields in document `1`, even though the terms haven't been explicitly stored in the index.
 Note that for the field `text`, the terms are not re-generated.
@@ -246,3 +249,29 @@ curl -XGET 'http://localhost:9200/twitter/tweet/1/_termvector?pretty=true' -d '{
 "field_statistics" : true
 }'
 --------------------------------------------------
+
+[float]
+=== Example 3 coming[1.4.0]
+
+Additionally, term vectors can also be generated for artificial documents,
+that is for documents not present in the index. The syntax is similar to the
+<<search-percolate,percolator>> API. For example, the following request would
+return the same results as in example 1. The mapping used is determined by the
+`index` and `type`.
+
+[WARNING]
+======
+If dynamic mapping is turned on (default), the document fields not in the original
+mapping will be dynamically created.
+======
+
+[source,js]
+--------------------------------------------------
+curl -XGET 'http://localhost:9200/twitter/tweet/_termvector' -d '{
+  "doc" : {
+    "fullname" : "John Doe",
+    "text" : "twitter test test test"
+  }
+}'
+--------------------------------------------------
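The shard-selection behaviour documented above (a random shard backs the statistics for an artificial document unless `routing` is given) can be sketched as follows. This is only an illustration of the documented behaviour, not the actual Elasticsearch routing code:

```python
import random

def pick_shard(num_shards, routing=None):
    """Pick the shard whose statistics will back the term vectors.

    With a routing value the choice is deterministic (hash-based, as for any
    routed request); without one, an artificial document's statistics come
    from a randomly selected shard.
    """
    if routing is not None:
        return hash(routing) % num_shards  # stable for a given routing value
    return random.randrange(num_shards)

routed = pick_shard(5, routing="user1")
unrouted = pick_shard(5)
```

The practical consequence is that repeated artificial-document requests without `routing` may report slightly different statistics, since each request may land on a different shard.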
@@ -90,7 +90,6 @@ public class MultiTermVectorsRequest extends ActionRequest<MultiTermVectorsReque
             if (token == XContentParser.Token.FIELD_NAME) {
                 currentFieldName = parser.currentName();
             } else if (token == XContentParser.Token.START_ARRAY) {
-
                 if ("docs".equals(currentFieldName)) {
                     while ((token = parser.nextToken()) != XContentParser.Token.END_ARRAY) {
                         if (token != XContentParser.Token.START_OBJECT) {
@@ -26,12 +26,17 @@ import org.elasticsearch.action.ActionRequestValidationException;
 import org.elasticsearch.action.ValidateActions;
 import org.elasticsearch.action.get.MultiGetRequest;
 import org.elasticsearch.action.support.single.shard.SingleShardOperationRequest;
+import org.elasticsearch.common.bytes.BytesReference;
 import org.elasticsearch.common.io.stream.StreamInput;
 import org.elasticsearch.common.io.stream.StreamOutput;
+import org.elasticsearch.common.xcontent.XContentBuilder;
 import org.elasticsearch.common.xcontent.XContentParser;

 import java.io.IOException;
 import java.util.*;
+import java.util.concurrent.atomic.AtomicInteger;
+
+import static org.elasticsearch.common.xcontent.XContentFactory.jsonBuilder;

 /**
  * Request returning the term vector (doc frequency, positions, offsets) for a
@@ -46,10 +51,14 @@ public class TermVectorRequest extends SingleShardOperationRequest<TermVectorReq

     private String id;

+    private BytesReference doc;
+
     private String routing;

     protected String preference;

+    private static final AtomicInteger randomInt = new AtomicInteger(0);
+
     // TODO: change to String[]
     private Set<String> selectedFields;

@@ -129,6 +138,23 @@ public class TermVectorRequest extends SingleShardOperationRequest<TermVectorReq
         return this;
     }

+    /**
+     * Returns the artificial document from which term vectors are requested.
+     */
+    public BytesReference doc() {
+        return doc;
+    }
+
+    /**
+     * Sets an artificial document from which term vectors are requested.
+     */
+    public TermVectorRequest doc(XContentBuilder documentBuilder) {
+        // assign a random id to this artificial document, for routing
+        this.id(String.valueOf(randomInt.getAndAdd(1)));
+        this.doc = documentBuilder.bytes();
+        return this;
+    }
+
     /**
      * @return The routing for this request.
      */
@@ -281,8 +307,8 @@ public class TermVectorRequest extends SingleShardOperationRequest<TermVectorReq
         if (type == null) {
             validationException = ValidateActions.addValidationError("type is missing", validationException);
         }
-        if (id == null) {
-            validationException = ValidateActions.addValidationError("id is missing", validationException);
+        if (id == null && doc == null) {
+            validationException = ValidateActions.addValidationError("id or doc is missing", validationException);
         }
         return validationException;
     }
@@ -303,6 +329,12 @@ public class TermVectorRequest extends SingleShardOperationRequest<TermVectorReq
         }
         type = in.readString();
         id = in.readString();
+
+        if (in.getVersion().onOrAfter(Version.V_1_4_0)) {
+            if (in.readBoolean()) {
+                doc = in.readBytesReference();
+            }
+        }
         routing = in.readOptionalString();
         preference = in.readOptionalString();
         long flags = in.readVLong();
@@ -331,6 +363,13 @@ public class TermVectorRequest extends SingleShardOperationRequest<TermVectorReq
         }
         out.writeString(type);
         out.writeString(id);
+
+        if (out.getVersion().onOrAfter(Version.V_1_4_0)) {
+            out.writeBoolean(doc != null);
+            if (doc != null) {
+                out.writeBytesReference(doc);
+            }
+        }
         out.writeOptionalString(routing);
         out.writeOptionalString(preference);
         long longFlags = 0;
@@ -389,7 +428,15 @@ public class TermVectorRequest extends SingleShardOperationRequest<TermVectorReq
             } else if ("_type".equals(currentFieldName)) {
                 termVectorRequest.type = parser.text();
             } else if ("_id".equals(currentFieldName)) {
+                if (termVectorRequest.doc != null) {
+                    throw new ElasticsearchParseException("Either \"id\" or \"doc\" can be specified, but not both!");
+                }
                 termVectorRequest.id = parser.text();
+            } else if ("doc".equals(currentFieldName)) {
+                if (termVectorRequest.id != null) {
+                    throw new ElasticsearchParseException("Either \"id\" or \"doc\" can be specified, but not both!");
+                }
+                termVectorRequest.doc(jsonBuilder().copyCurrentStructure(parser));
             } else if ("_routing".equals(currentFieldName) || "routing".equals(currentFieldName)) {
                 termVectorRequest.routing = parser.text();
             } else {
@@ -398,7 +445,6 @@ public class TermVectorRequest extends SingleShardOperationRequest<TermVectorReq
                 }
             }
         }
-
         if (fields.size() > 0) {
             String[] fieldsAsArray = new String[fields.size()];
             termVectorRequest.selectedFields(fields.toArray(fieldsAsArray));
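The request is now valid when either an `_id` or a `doc` is present, but never both; the parser rejects bodies that set the two together. A Python sketch of that rule, mirroring the Java `validate()` and `parseRequest()` changes above (illustrative only, not the actual implementation):

```python
def validate(index, doc_type, doc_id=None, doc=None):
    """Collect validation errors; a 'doc' now satisfies the id requirement."""
    errors = []
    if index is None:
        errors.append("index is missing")
    if doc_type is None:
        errors.append("type is missing")
    if doc_id is None and doc is None:
        errors.append("id or doc is missing")
    return errors

def parse_body(body):
    """Reject request bodies that specify both '_id' and 'doc'."""
    if "_id" in body and "doc" in body:
        raise ValueError('Either "id" or "doc" can be specified, but not both!')
    return body.get("_id"), body.get("doc")
```

Note that when a `doc` is supplied, the Java code still assigns an internal random id to the request, purely so that routing has something to hash.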
@@ -22,6 +22,7 @@ package org.elasticsearch.action.termvector;
 import org.elasticsearch.action.ActionListener;
 import org.elasticsearch.action.ActionRequestBuilder;
 import org.elasticsearch.client.Client;
+import org.elasticsearch.common.xcontent.XContentBuilder;

 /**
  */
@@ -35,6 +36,38 @@ public class TermVectorRequestBuilder extends ActionRequestBuilder<TermVectorReq
         super(client, new TermVectorRequest(index, type, id));
     }

+    /**
+     * Sets the index where the document is located.
+     */
+    public TermVectorRequestBuilder setIndex(String index) {
+        request.index(index);
+        return this;
+    }
+
+    /**
+     * Sets the type of the document.
+     */
+    public TermVectorRequestBuilder setType(String type) {
+        request.type(type);
+        return this;
+    }
+
+    /**
+     * Sets the id of the document.
+     */
+    public TermVectorRequestBuilder setId(String id) {
+        request.id(id);
+        return this;
+    }
+
+    /**
+     * Sets the artificial document from which to generate term vectors.
+     */
+    public TermVectorRequestBuilder setDoc(XContentBuilder xContent) {
+        request.doc(xContent);
+        return this;
+    }
+
     /**
      * Sets the routing. Required if routing isn't id based.
      */
@@ -81,10 +81,11 @@ public class TermVectorResponse extends ActionResponse implements ToXContent {
     private String id;
     private long docVersion;
     private boolean exists = false;
+    private boolean artificial = false;

     private boolean sourceCopied = false;

-    int[] curentPositions = new int[0];
+    int[] currentPositions = new int[0];
     int[] currentStartOffset = new int[0];
     int[] currentEndOffset = new int[0];
     BytesReference[] currentPayloads = new BytesReference[0];
@@ -156,7 +157,6 @@ public class TermVectorResponse extends ActionResponse implements ToXContent {
             }
         };
     }
-
     }

     @Override
@@ -166,7 +166,9 @@ public class TermVectorResponse extends ActionResponse implements ToXContent {
         assert id != null;
         builder.field(FieldStrings._INDEX, index);
         builder.field(FieldStrings._TYPE, type);
-        builder.field(FieldStrings._ID, id);
+        if (!isArtificial()) {
+            builder.field(FieldStrings._ID, id);
+        }
         builder.field(FieldStrings._VERSION, docVersion);
         builder.field(FieldStrings.FOUND, isExists());
         if (!isExists()) {
@@ -181,7 +183,6 @@ public class TermVectorResponse extends ActionResponse implements ToXContent {
         }
         builder.endObject();
         return builder;
-
     }

     private void buildField(XContentBuilder builder, final CharsRef spare, Fields theFields, Iterator<String> fieldIter) throws IOException {
@@ -237,7 +238,7 @@ public class TermVectorResponse extends ActionResponse implements ToXContent {
         for (int i = 0; i < termFreq; i++) {
             builder.startObject();
             if (curTerms.hasPositions()) {
-                builder.field(FieldStrings.POS, curentPositions[i]);
+                builder.field(FieldStrings.POS, currentPositions[i]);
             }
             if (curTerms.hasOffsets()) {
                 builder.field(FieldStrings.START_OFFSET, currentStartOffset[i]);
@@ -249,14 +250,13 @@ public class TermVectorResponse extends ActionResponse implements ToXContent {
             builder.endObject();
         }
         builder.endArray();
-
     }

     private void initValues(Terms curTerms, DocsAndPositionsEnum posEnum, int termFreq) throws IOException {
         for (int j = 0; j < termFreq; j++) {
             int nextPos = posEnum.nextPosition();
             if (curTerms.hasPositions()) {
-                curentPositions[j] = nextPos;
+                currentPositions[j] = nextPos;
             }
             if (curTerms.hasOffsets()) {
                 currentStartOffset[j] = posEnum.startOffset();
@@ -269,7 +269,6 @@ public class TermVectorResponse extends ActionResponse implements ToXContent {
             } else {
                 currentPayloads[j] = null;
             }
-
         }
     }
     }
@@ -277,7 +276,7 @@ public class TermVectorResponse extends ActionResponse implements ToXContent {
     private void initMemory(Terms curTerms, int termFreq) {
         // init memory for performance reasons
         if (curTerms.hasPositions()) {
-            curentPositions = ArrayUtil.grow(curentPositions, termFreq);
+            currentPositions = ArrayUtil.grow(currentPositions, termFreq);
         }
         if (curTerms.hasOffsets()) {
             currentStartOffset = ArrayUtil.grow(currentStartOffset, termFreq);
@@ -336,7 +335,6 @@ public class TermVectorResponse extends ActionResponse implements ToXContent {

     public void setHeader(BytesReference header) {
         headerRef = header;
-
     }

     public void setDocVersion(long version) {
@@ -356,4 +354,11 @@ public class TermVectorResponse extends ActionResponse implements ToXContent {
         return id;
     }

+    public boolean isArtificial() {
+        return artificial;
+    }
+
+    public void setArtificial(boolean artificial) {
+        this.artificial = artificial;
+    }
 }
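On the response side, artificial documents are flagged and the `_id` field is suppressed, since the id is only a transient, internally assigned routing value. A Python sketch of the `toXContent` conditional above (illustrative; field names follow the response format in the docs):

```python
def render_term_vector_header(index, doc_type, doc_id, version, found, artificial=False):
    """Mirror the toXContent change: omit '_id' for artificial documents."""
    out = {"_index": index, "_type": doc_type}
    if not artificial:
        out["_id"] = doc_id  # a real, user-visible document id
    out["_version"] = version
    out["found"] = found
    return out
```

Clients should therefore not rely on `_id` being present when they requested term vectors for a document they supplied themselves.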
@@ -46,7 +46,6 @@ final class TermVectorWriter {
     }

     void setFields(Fields termVectorsByField, Set<String> selectedFields, EnumSet<Flag> flags, Fields topLevelFields) throws IOException {
-
         int numFieldsWritten = 0;
         TermsEnum iterator = null;
         DocsAndPositionsEnum docsAndPosEnum = null;
@@ -60,6 +59,11 @@ final class TermVectorWriter {
             Terms fieldTermVector = termVectorsByField.terms(field);
             Terms topLevelTerms = topLevelFields.terms(field);

+            // if no terms found, take the retrieved term vector fields for stats
+            if (topLevelTerms == null) {
+                topLevelTerms = fieldTermVector;
+            }
+
             topLevelIterator = topLevelTerms.iterator(topLevelIterator);
             boolean positions = flags.contains(Flag.Positions) && fieldTermVector.hasPositions();
             boolean offsets = flags.contains(Flag.Offsets) && fieldTermVector.hasOffsets();
@@ -75,7 +79,6 @@ final class TermVectorWriter {
             // get the doc frequency
             BytesRef term = iterator.term();
             boolean foundTerm = topLevelIterator.seekExact(term);
-            assert (foundTerm);
             startTerm(term);
             if (flags.contains(Flag.TermStatistics)) {
                 writeTermStatistics(topLevelIterator);
@@ -533,7 +533,6 @@ public interface Client extends ElasticsearchClient<Client>, Releasable {
      */
     MoreLikeThisRequestBuilder prepareMoreLikeThis(String index, String type, String id);

-
     /**
      * An action that returns the term vectors for a specific document.
      *
@@ -550,6 +549,10 @@ public interface Client extends ElasticsearchClient<Client>, Releasable {
      */
     void termVector(TermVectorRequest request, ActionListener<TermVectorResponse> listener);

+    /**
+     * Builder for the term vector request.
+     */
+    TermVectorRequestBuilder prepareTermVector();

     /**
      * Builder for the term vector request.
@@ -560,7 +563,6 @@ public interface Client extends ElasticsearchClient<Client>, Releasable {
      */
     TermVectorRequestBuilder prepareTermVector(String index, String type, String id);

-
     /**
      * Multi get term vectors.
      */
@@ -576,7 +578,6 @@ public interface Client extends ElasticsearchClient<Client>, Releasable {
      */
     MultiTermVectorsRequestBuilder prepareMultiTermVectors();

-
     /**
      * Percolates a request returning the matches documents.
      */
@@ -441,6 +441,11 @@ public abstract class AbstractClient implements Client {
         execute(TermVectorAction.INSTANCE, request, listener);
     }

+    @Override
+    public TermVectorRequestBuilder prepareTermVector() {
+        return new TermVectorRequestBuilder(this);
+    }
+
     @Override
     public TermVectorRequestBuilder prepareTermVector(String index, String type, String id) {
         return new TermVectorRequestBuilder(this, index, type, id);
@@ -126,6 +126,26 @@ public abstract class ParseContext {
         return f.toArray(new IndexableField[f.size()]);
     }

+    /**
+     * Returns an array of values of the field specified as the method parameter.
+     * This method returns an empty array when there are no
+     * matching fields. It never returns null.
+     * For {@link org.apache.lucene.document.IntField}, {@link org.apache.lucene.document.LongField}, {@link
+     * org.apache.lucene.document.FloatField} and {@link org.apache.lucene.document.DoubleField} it returns the string value of the number.
+     * If you want the actual numeric field instances back, use {@link #getFields}.
+     * @param name the name of the field
+     * @return a <code>String[]</code> of field values
+     */
+    public final String[] getValues(String name) {
+        List<String> result = new ArrayList<>();
+        for (IndexableField field : fields) {
+            if (field.name().equals(name) && field.stringValue() != null) {
+                result.add(field.stringValue());
+            }
+        }
+        return result.toArray(new String[result.size()]);
+    }
+
     public IndexableField getField(String name) {
         for (IndexableField field : fields) {
             if (field.name().equals(name)) {
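The new `getValues` helper collects every non-null string value of a named field and returns an empty array rather than null. The same contract in Python (a sketch; the `(name, value)` pairs stand in for Lucene `IndexableField`s):

```python
def get_values(fields, name):
    """Return all non-null string values for `name`; empty list when absent."""
    return [value for (field_name, value) in fields
            if field_name == name and value is not None]

# A multi-valued document field produces one entry per value, in order.
doc_fields = [("text", "twitter test"), ("text", "another"), ("num", "42"), ("text", None)]
```

The never-null guarantee matters for the term vector service below, which iterates the result without a null check.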
@@ -25,18 +25,20 @@ import org.apache.lucene.index.memory.MemoryIndex;
 import org.elasticsearch.ElasticsearchException;
 import org.elasticsearch.action.termvector.TermVectorRequest;
 import org.elasticsearch.action.termvector.TermVectorResponse;
+import org.elasticsearch.cluster.action.index.MappingUpdatedAction;
 import org.elasticsearch.common.Strings;
 import org.elasticsearch.common.bytes.BytesReference;
+import org.elasticsearch.common.collect.Tuple;
 import org.elasticsearch.common.inject.Inject;
 import org.elasticsearch.common.lucene.uid.Versions;
 import org.elasticsearch.common.regex.Regex;
 import org.elasticsearch.common.settings.Settings;
 import org.elasticsearch.index.engine.Engine;
 import org.elasticsearch.index.get.GetField;
 import org.elasticsearch.index.get.GetResult;
-import org.elasticsearch.index.mapper.FieldMapper;
-import org.elasticsearch.index.mapper.Uid;
+import org.elasticsearch.index.mapper.*;
 import org.elasticsearch.index.mapper.core.StringFieldMapper;
 import org.elasticsearch.index.mapper.internal.UidFieldMapper;
+import org.elasticsearch.index.service.IndexService;
 import org.elasticsearch.index.settings.IndexSettings;
 import org.elasticsearch.index.shard.AbstractIndexShardComponent;
 import org.elasticsearch.index.shard.ShardId;
@@ -45,16 +47,20 @@ import org.elasticsearch.index.shard.service.IndexShard;
 import java.io.IOException;
 import java.util.*;

+import static org.elasticsearch.index.mapper.SourceToParse.source;
+
 /**
  */
-
 public class ShardTermVectorService extends AbstractIndexShardComponent {

     private IndexShard indexShard;
+    private final MappingUpdatedAction mappingUpdatedAction;

     @Inject
-    public ShardTermVectorService(ShardId shardId, @IndexSettings Settings indexSettings) {
+    public ShardTermVectorService(ShardId shardId, @IndexSettings Settings indexSettings, MappingUpdatedAction mappingUpdatedAction) {
         super(shardId, indexSettings);
+        this.mappingUpdatedAction = mappingUpdatedAction;
     }

     // sadly, to overcome cyclic dep, we need to do this and inject it ourselves...
@@ -67,23 +73,39 @@ public class ShardTermVectorService extends AbstractIndexShardComponent {
         final Engine.Searcher searcher = indexShard.acquireSearcher("term_vector");
         IndexReader topLevelReader = searcher.reader();
         final TermVectorResponse termVectorResponse = new TermVectorResponse(concreteIndex, request.type(), request.id());
-        final Term uidTerm = new Term(UidFieldMapper.NAME, Uid.createUidAsBytes(request.type(), request.id()));
+
+        /* handle potential wildcards in fields */
+        if (request.selectedFields() != null) {
+            handleFieldWildcards(request);
+        }

         try {
             Fields topLevelFields = MultiFields.getFields(topLevelReader);
-            Versions.DocIdAndVersion docIdAndVersion = Versions.loadDocIdAndVersion(topLevelReader, uidTerm);
-            if (docIdAndVersion != null) {
-                /* handle potential wildcards in fields */
-                if (request.selectedFields() != null) {
-                    handleFieldWildcards(request);
-                }
-                /* generate term vectors if not available */
-                Fields termVectorsByField = docIdAndVersion.context.reader().getTermVectors(docIdAndVersion.docId);
-                if (request.selectedFields() != null) {
-                    termVectorsByField = generateTermVectorsIfNeeded(termVectorsByField, request, uidTerm, false);
+            /* from an artificial document */
+            if (request.doc() != null) {
+                Fields termVectorsByField = generateTermVectorsFromDoc(request);
+                // if no document indexed in shard, take the queried document itself for stats
+                if (topLevelFields == null) {
+                    topLevelFields = termVectorsByField;
+                }
+                termVectorResponse.setFields(termVectorsByField, request.selectedFields(), request.getFlags(), topLevelFields);
+                termVectorResponse.setExists(true);
+                termVectorResponse.setArtificial(true);
+                return termVectorResponse;
+            }
+            /* or from an existing document */
+            final Term uidTerm = new Term(UidFieldMapper.NAME, Uid.createUidAsBytes(request.type(), request.id()));
+            Versions.DocIdAndVersion docIdAndVersion = Versions.loadDocIdAndVersion(topLevelReader, uidTerm);
+            if (docIdAndVersion != null) {
+                // fields with stored term vectors
+                Fields termVectorsByField = docIdAndVersion.context.reader().getTermVectors(docIdAndVersion.docId);
+                // fields without term vectors
+                if (request.selectedFields() != null) {
+                    termVectorsByField = addGeneratedTermVectors(termVectorsByField, request, uidTerm, false);
                 }
                 termVectorResponse.setFields(termVectorsByField, request.selectedFields(), request.getFlags(), topLevelFields);
                 termVectorResponse.setDocVersion(docIdAndVersion.version);
                 termVectorResponse.setExists(true);
             } else {
                 termVectorResponse.setExists(false);
             }
@@ -103,39 +125,52 @@ public class ShardTermVectorService extends AbstractIndexShardComponent {
         request.selectedFields(fieldNames.toArray(Strings.EMPTY_ARRAY));
     }

-    private Fields generateTermVectorsIfNeeded(Fields termVectorsByField, TermVectorRequest request, Term uidTerm, boolean realTime) throws IOException {
-        List<String> validFields = new ArrayList<>();
+    private boolean isValidField(FieldMapper field) {
+        // must be a string
+        if (!(field instanceof StringFieldMapper)) {
+            return false;
+        }
+        // and must be indexed
+        if (!field.fieldType().indexed()) {
+            return false;
+        }
+        return true;
+    }
+
+    private Fields addGeneratedTermVectors(Fields termVectorsByField, TermVectorRequest request, Term uidTerm, boolean realTime) throws IOException {
+        /* only keep valid fields */
+        Set<String> validFields = new HashSet<>();
         for (String field : request.selectedFields()) {
             FieldMapper fieldMapper = indexShard.mapperService().smartNameFieldMapper(field);
-            if (!(fieldMapper instanceof StringFieldMapper)) {
+            if (!isValidField(fieldMapper)) {
                 continue;
             }
             // already retrieved
             if (fieldMapper.fieldType().storeTermVectors()) {
                 continue;
             }
-            // only disallow fields which are not indexed
-            if (!fieldMapper.fieldType().indexed()) {
-                continue;
-            }
             validFields.add(field);
         }

         if (validFields.isEmpty()) {
             return termVectorsByField;
         }

+        /* generate term vectors from fetched document fields */
         Engine.GetResult get = indexShard.get(new Engine.Get(realTime, uidTerm));
         Fields generatedTermVectors;
         try {
             if (!get.exists()) {
                 return termVectorsByField;
             }
             // TODO: support for fetchSourceContext?
             GetResult getResult = indexShard.getService().get(
                     get, request.id(), request.type(), validFields.toArray(Strings.EMPTY_ARRAY), null, false);
             generatedTermVectors = generateTermVectors(getResult.getFields().values(), request.offsets());
         } finally {
             get.release();
         }

         /* merge with existing Fields */
         if (termVectorsByField == null) {
             return generatedTermVectors;
         } else {
@@ -144,7 +179,7 @@ public class ShardTermVectorService extends AbstractIndexShardComponent {
     }

     private Fields generateTermVectors(Collection<GetField> getFields, boolean withOffsets) throws IOException {
-        // store document in memory index
+        /* store document in memory index */
         MemoryIndex index = new MemoryIndex(withOffsets);
         for (GetField getField : getFields) {
             String field = getField.getName();
@@ -156,10 +191,51 @@ public class ShardTermVectorService extends AbstractIndexShardComponent {
                 index.addField(field, text.toString(), analyzer);
             }
         }
-        // and read vectors from it
+        /* and read vectors from it */
         return MultiFields.getFields(index.createSearcher().getIndexReader());
     }

+    private Fields generateTermVectorsFromDoc(TermVectorRequest request) throws IOException {
+        // parse the document, at the moment we do update the mapping, just like percolate
+        ParsedDocument parsedDocument = parseDocument(indexShard.shardId().getIndex(), request.type(), request.doc());
+
+        // select the right fields and generate term vectors
+        ParseContext.Document doc = parsedDocument.rootDoc();
+        Collection<String> seenFields = new HashSet<>();
+        Collection<GetField> getFields = new HashSet<>();
+        for (IndexableField field : doc.getFields()) {
+            FieldMapper fieldMapper = indexShard.mapperService().smartNameFieldMapper(field.name());
+            if (seenFields.contains(field.name())) {
+                continue;
+            } else {
+                seenFields.add(field.name());
+            }
+            if (!isValidField(fieldMapper)) {
+                continue;
+            }
+            if (request.selectedFields() != null && !request.selectedFields().contains(field.name())) {
+                continue;
+            }
+            String[] values = doc.getValues(field.name());
+            getFields.add(new GetField(field.name(), Arrays.asList((Object[]) values)));
+        }
+        return generateTermVectors(getFields, request.offsets());
+    }
+
+    private ParsedDocument parseDocument(String index, String type, BytesReference doc) {
+        MapperService mapperService = indexShard.mapperService();
+        IndexService indexService = indexShard.indexService();
+
+        // TODO: make parsing not dynamically create fields not in the original mapping
+        Tuple<DocumentMapper, Boolean> docMapper = mapperService.documentMapperWithAutoCreate(type);
+        ParsedDocument parsedDocument = docMapper.v1().parse(source(doc).type(type).flyweight(true)).setMappingsModified(docMapper);
+        if (parsedDocument.mappingsModified()) {
+            mappingUpdatedAction.updateMappingOnMaster(index, docMapper.v1(), indexService.indexUUID());
+        }
+        return parsedDocument;
+    }
+
     private Fields mergeFields(String[] fieldNames, Fields... fieldsObject) throws IOException {
         ParallelFields parallelFields = new ParallelFields();
         for (Fields fieldObject : fieldsObject) {
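`generateTermVectorsFromDoc` walks the parsed document once per distinct field, keeping only valid (indexed string) fields and, when a field selection is present, only the requested ones. The filtering logic as a Python sketch (field metadata is simplified to a set of valid names; illustrative only):

```python
def select_fields(doc_fields, valid, selected=None):
    """Mirror the dedup + validity + selection filtering of generateTermVectorsFromDoc.

    doc_fields: ordered field names from the parsed document (may repeat)
    valid:      names that map to indexed string fields
    selected:   optional explicit field selection from the request
    """
    seen, kept = set(), []
    for name in doc_fields:
        if name in seen:
            continue  # each field is processed once, like seenFields
        seen.add(name)
        if name not in valid:
            continue  # not a valid (indexed string) field
        if selected is not None and name not in selected:
            continue  # not requested
        kept.append(name)
    return kept
```

All values of a kept field are then analyzed into an in-memory index, from which the term vectors are read back.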
@@ -48,6 +48,8 @@ public class RestTermVectorAction extends BaseRestHandler {

    @Inject
    public RestTermVectorAction(Settings settings, Client client, RestController controller) {
        super(settings, client);
        controller.registerHandler(GET, "/{index}/{type}/_termvector", this);
        controller.registerHandler(POST, "/{index}/{type}/_termvector", this);
        controller.registerHandler(GET, "/{index}/{type}/{id}/_termvector", this);
        controller.registerHandler(POST, "/{index}/{type}/{id}/_termvector", this);
    }
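The two newly registered routes omit `{id}`, since an artificial document is not addressed by its `_id` but carried in the request body under a `doc` key (mirroring the percolator syntax described in the docs). A sketch of building such a request — the field values are illustrative only:

```python
import json

def artificial_termvector_request(index, doc_type, doc, offsets=True, positions=True):
    """Build the URL path and JSON body for a term vector request on a
    document that is not present in the index (no {id} in the path)."""
    path = "/{0}/{1}/_termvector".format(index, doc_type)
    body = {"doc": doc, "offsets": offsets, "positions": positions}
    return path, json.dumps(body)

path, body = artificial_termvector_request("test", "type1", {"field1": "the quick brown fox"})
print(path)  # /test/type1/_termvector
```

The `_index` and `_type` in the path determine which mapping, and therefore which analyzers, are applied to each field value.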
@@ -31,8 +31,9 @@ import org.elasticsearch.action.index.IndexRequestBuilder;

import org.elasticsearch.common.settings.ImmutableSettings;
import org.elasticsearch.common.xcontent.ToXContent;
import org.elasticsearch.common.xcontent.XContentBuilder;
import org.elasticsearch.common.xcontent.XContentFactory;
import org.elasticsearch.index.mapper.core.AbstractFieldMapper;
import org.elasticsearch.index.service.IndexService;
import org.elasticsearch.indices.IndicesService;
import org.junit.Test;

import java.io.IOException;

@@ -43,6 +44,7 @@ import java.util.Map;

import java.util.concurrent.ExecutionException;

import static org.elasticsearch.common.settings.ImmutableSettings.settingsBuilder;
import static org.elasticsearch.common.xcontent.XContentFactory.jsonBuilder;
import static org.elasticsearch.test.hamcrest.ElasticsearchAssertions.assertAcked;
import static org.elasticsearch.test.hamcrest.ElasticsearchAssertions.assertThrows;
import static org.hamcrest.Matchers.*;
@@ -51,7 +53,7 @@ public class GetTermVectorTests extends AbstractTermVectorTests {

    @Test
    public void testNoSuchDoc() throws Exception {
        XContentBuilder mapping = jsonBuilder().startObject().startObject("type1")
                .startObject("properties")
                        .startObject("field")
                            .field("type", "string")

@@ -72,13 +74,13 @@ public class GetTermVectorTests extends AbstractTermVectorTests {
            assertThat(actionGet.getIndex(), equalTo("test"));
            assertThat(actionGet.isExists(), equalTo(false));
            // check response is nevertheless serializable to json
            actionGet.toXContent(jsonBuilder(), ToXContent.EMPTY_PARAMS);
        }
    }

    @Test
    public void testExistingFieldWithNoTermVectorsNoNPE() throws Exception {
        XContentBuilder mapping = jsonBuilder().startObject().startObject("type1")
                .startObject("properties")
                        .startObject("existingfield")
                            .field("type", "string")

@@ -107,7 +109,7 @@ public class GetTermVectorTests extends AbstractTermVectorTests {

    @Test
    public void testExistingFieldButNotInDocNPE() throws Exception {
        XContentBuilder mapping = jsonBuilder().startObject().startObject("type1")
                .startObject("properties")
                        .startObject("existingfield")
                            .field("type", "string")

@@ -179,7 +181,7 @@ public class GetTermVectorTests extends AbstractTermVectorTests {

    @Test
    public void testSimpleTermVectors() throws ElasticsearchException, IOException {
        XContentBuilder mapping = jsonBuilder().startObject().startObject("type1")
                .startObject("properties")
                        .startObject("field")
                            .field("type", "string")

@@ -197,7 +199,7 @@ public class GetTermVectorTests extends AbstractTermVectorTests {
        ensureYellow();
        for (int i = 0; i < 10; i++) {
            client().prepareIndex("test", "type1", Integer.toString(i))
                    .setSource(jsonBuilder().startObject().field("field", "the quick brown fox jumps over the lazy dog")
                            // 0the3 4quick9 10brown15 16fox19 20jumps25 26over30
                            // 31the34 35lazy39 40dog43
                            .endObject()).execute().actionGet();
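The offset annotations in the comments above (`0the3 4quick9 …`) can be reproduced with simple whitespace analysis — a rough sketch only, since the standard analyzer also lowercases and strips punctuation, which whitespace splitting alone does not:

```python
def whitespace_term_vectors(text):
    """Return one (term, position, start_offset, end_offset) tuple per token."""
    out = []
    offset = 0
    for position, token in enumerate(text.split()):
        start = text.index(token, offset)  # next occurrence at or after the last token
        end = start + len(token)
        out.append((token, position, start, end))
        offset = end
    return out

tv = whitespace_term_vectors("the quick brown fox jumps over the lazy dog")
print(tv[0])   # ('the', 0, 0, 3)
print(tv[-1])  # ('dog', 8, 40, 43)
```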
@@ -268,7 +270,7 @@ public class GetTermVectorTests extends AbstractTermVectorTests {
        ft.setStoreTermVectorPositions(storePositions);

        String optionString = AbstractFieldMapper.termVectorOptionsToString(ft);
        XContentBuilder mapping = jsonBuilder().startObject().startObject("type1")
                .startObject("properties")
                        .startObject("field")
                            .field("type", "string")

@@ -284,7 +286,7 @@ public class GetTermVectorTests extends AbstractTermVectorTests {
        ensureYellow();
        for (int i = 0; i < 10; i++) {
            client().prepareIndex("test", "type1", Integer.toString(i))
                    .setSource(jsonBuilder().startObject().field("field", "the quick brown fox jumps over the lazy dog")
                            // 0the3 4quick9 10brown15 16fox19 20jumps25 26over30
                            // 31the34 35lazy39 40dog43
                            .endObject()).execute().actionGet();

@@ -423,7 +425,7 @@ public class GetTermVectorTests extends AbstractTermVectorTests {
        String delimiter = createRandomDelimiter(tokens);
        String queryString = createString(tokens, payloads, encoding, delimiter.charAt(0));
        // create the mapping
        XContentBuilder mapping = jsonBuilder().startObject().startObject("type1").startObject("properties")
                .startObject("field").field("type", "string").field("term_vector", "with_positions_offsets_payloads")
                .field("analyzer", "payload_test").endObject().endObject().endObject().endObject();
        assertAcked(prepareCreate("test").addMapping("type1", mapping).setSettings(

@@ -437,7 +439,7 @@ public class GetTermVectorTests extends AbstractTermVectorTests {
        ensureYellow();

        client().prepareIndex("test", "type1", Integer.toString(1))
                .setSource(jsonBuilder().startObject().field("field", queryString).endObject()).execute().actionGet();
        refresh();
        TermVectorRequestBuilder resp = client().prepareTermVector("test", "type1", Integer.toString(1)).setPayloads(true).setOffsets(true)
                .setPositions(true).setSelectedFields();

@@ -579,8 +581,8 @@ public class GetTermVectorTests extends AbstractTermVectorTests {
            fieldNames[i] = "field" + String.valueOf(i);
        }

        XContentBuilder mapping = jsonBuilder().startObject().startObject("type1").startObject("properties");
        XContentBuilder source = jsonBuilder().startObject();
        for (String field : fieldNames) {
            mapping.startObject(field)
                    .field("type", "string")

@@ -764,8 +766,8 @@ public class GetTermVectorTests extends AbstractTermVectorTests {
    public void testSimpleWildCards() throws ElasticsearchException, IOException {
        int numFields = 25;

        XContentBuilder mapping = jsonBuilder().startObject().startObject("type1").startObject("properties");
        XContentBuilder source = jsonBuilder().startObject();
        for (int i = 0; i < numFields; i++) {
            mapping.startObject("field" + i)
                    .field("type", "string")
@@ -788,6 +790,142 @@ public class GetTermVectorTests extends AbstractTermVectorTests {
        assertThat("All term vectors should have been generated", response.getFields().size(), equalTo(numFields));
    }

    @Test
    public void testArtificialVsExisting() throws ElasticsearchException, ExecutionException, InterruptedException, IOException {
        // setup indices
        ImmutableSettings.Builder settings = settingsBuilder()
                .put(indexSettings())
                .put("index.analysis.analyzer", "standard");
        assertAcked(prepareCreate("test")
                .setSettings(settings)
                .addMapping("type1", "field1", "type=string,term_vector=with_positions_offsets"));
        ensureGreen();

        // index existing documents
        String[] content = new String[]{
                "Generating a random permutation of a sequence (such as when shuffling cards).",
                "Selecting a random sample of a population (important in statistical sampling).",
                "Allocating experimental units via random assignment to a treatment or control condition.",
                "Generating random numbers: see Random number generation."};

        List<IndexRequestBuilder> indexBuilders = new ArrayList<>();
        for (int i = 0; i < content.length; i++) {
            indexBuilders.add(client().prepareIndex()
                    .setIndex("test")
                    .setType("type1")
                    .setId(String.valueOf(i))
                    .setSource("field1", content[i]));
        }
        indexRandom(true, indexBuilders);

        for (int i = 0; i < content.length; i++) {
            // request tvs from existing document
            TermVectorResponse respExisting = client().prepareTermVector("test", "type1", String.valueOf(i))
                    .setOffsets(true)
                    .setPositions(true)
                    .setFieldStatistics(true)
                    .setTermStatistics(true)
                    .get();
            assertThat("doc with index: test, type1 and id: " + String.valueOf(i), respExisting.isExists(), equalTo(true));

            // request tvs from artificial document
            TermVectorResponse respArtificial = client().prepareTermVector()
                    .setIndex("test")
                    .setType("type1")
                    .setRouting(String.valueOf(i)) // ensure we get the stats from the same shard as the existing doc
                    .setDoc(jsonBuilder()
                            .startObject()
                                .field("field1", content[i])
                            .endObject())
                    .setOffsets(true)
                    .setPositions(true)
                    .setFieldStatistics(true)
                    .setTermStatistics(true)
                    .get();
            assertThat("doc with index: test, type1 and id: " + String.valueOf(i), respArtificial.isExists(), equalTo(true));

            // compare existing tvs with artificial
            compareTermVectors("field1", respExisting.getFields(), respArtificial.getFields());
        }
    }
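The `compareTermVectors` assertion above boils down to checking that both responses expose identical terms with identical positions and offsets. A hypothetical Python equivalent over already-extracted vectors (the dict shape, term → list of `(position, start, end)`, is an assumption for illustration):

```python
def same_term_vectors(a, b):
    """Compare two term-vector dicts: term -> list of (position, start, end)."""
    if set(a) != set(b):
        return False  # different term sets
    # order of entries per term should not matter
    return all(sorted(a[t]) == sorted(b[t]) for t in a)

existing   = {"fox": [(3, 16, 19)], "the": [(0, 0, 3), (6, 31, 34)]}
artificial = {"the": [(0, 0, 3), (6, 31, 34)], "fox": [(3, 16, 19)]}
print(same_term_vectors(existing, artificial))  # True
```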

    @Test
    public void testArtificialNoDoc() throws IOException {
        // setup indices
        ImmutableSettings.Builder settings = settingsBuilder()
                .put(indexSettings())
                .put("index.analysis.analyzer", "standard");
        assertAcked(prepareCreate("test")
                .setSettings(settings)
                .addMapping("type1", "field1", "type=string"));
        ensureGreen();

        // request tvs from artificial document
        String text = "the quick brown fox jumps over the lazy dog";
        TermVectorResponse resp = client().prepareTermVector()
                .setIndex("test")
                .setType("type1")
                .setDoc(jsonBuilder()
                        .startObject()
                            .field("field1", text)
                        .endObject())
                .setOffsets(true)
                .setPositions(true)
                .setFieldStatistics(true)
                .setTermStatistics(true)
                .get();
        assertThat(resp.isExists(), equalTo(true));
        checkBrownFoxTermVector(resp.getFields(), "field1", false);
    }

    @Test
    public void testArtificialNonExistingField() throws Exception {
        // setup indices
        ImmutableSettings.Builder settings = settingsBuilder()
                .put(indexSettings())
                .put("index.analysis.analyzer", "standard");
        assertAcked(prepareCreate("test")
                .setSettings(settings)
                .addMapping("type1", "field1", "type=string"));
        ensureGreen();

        // index just one doc
        List<IndexRequestBuilder> indexBuilders = new ArrayList<>();
        indexBuilders.add(client().prepareIndex()
                .setIndex("test")
                .setType("type1")
                .setId("1")
                .setRouting("1")
                .setSource("field1", "some text"));
        indexRandom(true, indexBuilders);

        // request tvs from artificial document
        XContentBuilder doc = jsonBuilder()
                .startObject()
                        .field("field1", "the quick brown fox jumps over the lazy dog")
                        .field("non_existing", "the quick brown fox jumps over the lazy dog")
                .endObject();

        for (int i = 0; i < 2; i++) {
            TermVectorResponse resp = client().prepareTermVector()
                    .setIndex("test")
                    .setType("type1")
                    .setDoc(doc)
                    .setRouting("" + i)
                    .setOffsets(true)
                    .setPositions(true)
                    .setFieldStatistics(true)
                    .setTermStatistics(true)
                    .get();
            assertThat(resp.isExists(), equalTo(true));
            checkBrownFoxTermVector(resp.getFields(), "field1", false);
            // we should have created a mapping for this field
            waitForMappingOnMaster("test", "type1", "non_existing");
            // and return the generated term vectors
            checkBrownFoxTermVector(resp.getFields(), "non_existing", false);
        }
    }

    private static String indexOrAlias() {
        return randomBoolean() ? "test" : "alias";
    }