mirror of https://github.com/apache/lucene.git
Merge branch 'master' into feature/autoscaling
commit 88d9b28492
@@ -134,7 +134,7 @@ def update_example_solrconfigs(new_version):
   print('  updating example solrconfig.xml files')
   matcher = re.compile('<luceneMatchVersion>')
 
-  paths = ['solr/server/solr/configsets', 'solr/example']
+  paths = ['solr/server/solr/configsets', 'solr/example', 'solr/core/src/test-files/solr/configsets/_default']
   for path in paths:
     if not os.path.isdir(path):
       raise RuntimeError("Can't locate configset dir (layout change?) : " + path)

@@ -58,8 +58,11 @@ New Features
 ----------------------
 
 * SOLR-11019: Add addAll Stream Evaluator (Joel Bernstein)
 
 * SOLR-10996: Implement TriggerListener API (ab, shalin)
 
+* SOLR-11046: Add residuals Stream Evaluator (Joel Bernstein)
+
 Bug Fixes
 ----------------------
 
@@ -68,7 +71,8 @@ Bug Fixes
 Optimizations
 ----------------------
 
-(No Changes)
+* SOLR-10985: Remove unnecessary toString() calls in solr-core's search package's debug logging.
+  (Michael Braun via Christine Poerschke)
 
 Other Changes
 ----------------------
@@ -80,6 +84,8 @@ Other Changes
 
 * SOLR-10748: Make stream.body configurable and disabled by default (janhoy)
 
+* SOLR-10964: Reduce SolrIndexSearcher casting in LTRRescorer. (Christine Poerschke)
+
 ================== 7.0.0 ==================
 
 Versions of Major Components
@@ -289,6 +295,8 @@ New Features
 * SOLR-10965: New ExecutePlanAction for autoscaling which executes the operations computed by ComputePlanAction
   against the cluster. (shalin)
 
+* SOLR-10282: bin/solr support for enabling Kerberos authentication (Ishan Chattopadhyaya)
+
 Bug Fixes
 ----------------------
 * SOLR-9262: Connection and read timeouts are being ignored by UpdateShardHandler after SOLR-4509.
@@ -347,6 +355,13 @@ Bug Fixes
 
 * SOLR-10826: Fix CloudSolrClient to expand the collection parameter correctly (Tim Owen via Varun Thacker)
 
+* SOLR-11039: Next button in Solr admin UI for collection list pagination does not work. (janhoy)
+
+* SOLR-11041: MoveReplicaCmd do not specify ulog dir in case of HDFS (Cao Manh Dat)
+
+* SOLR-11045: The new replica created by MoveReplica will have to have same name and coreName as the
+  old one in case of HDFS (Cao Manh Dat)
+
 Optimizations
 ----------------------
 
@@ -482,6 +497,10 @@ Other Changes
 - SOLR-10977: Randomize the usage of Points based numerics in schema15.xml and all impacted tests (hossman)
 - SOLR-10979: Randomize PointFields in schema-docValues*.xml and all affected tests (hossman)
 - SOLR-10989: Randomize PointFields and general cleanup in schema files where some Trie fields were unused (hossman)
+- SOLR-11048: Randomize PointsFields in schema-add-schema-fields-update-processor.xml in solr-core collection1 and
+  all affected tests (Anshum Gupta)
+- SOLR-11059: Randomize PointFields in schema-blockjoinfacetcomponent.xml and all related tests (Anshum Gupta)
+- SOLR-11060: Randomize PointFields in schema-custom-field.xml and all related tests (Anshum Gupta)
 
 * SOLR-6807: Changed requestDispatcher's handleSelect to default to false, thus ignoring "qt".
   Simplified configs to not refer to handleSelect or "qt". Switch all tests that assumed true to assume false
@@ -498,6 +517,13 @@ Other Changes
 
 * SOLR-11016: Fix TestCloudJSONFacetJoinDomain test-only bug (hossman)
 
+* SOLR-11021: The elevate.xml config-file is made optional in the ElevationComponent.
+  The default configset doesn't ship with a elevate.xml file anymore (Varun Thacker)
+
+* SOLR-10898: Fix SOLR-10898 to not deterministicly fail 1/512 runs (hossman)
+
+* SOLR-10796: TestPointFields: increase randomized testing of non-trivial values. (Steve Rowe)
+
 ================== 6.7.0 ==================
 
 Consult the LUCENE_CHANGES.txt file for additional, low level, changes in this release.
@@ -630,6 +656,8 @@ when using one of Exact*StatsCache (Mikhail Khludnev)
 
 * SOLR-10914: RecoveryStrategy's sendPrepRecoveryCmd can get stuck for 5 minutes if leader is unloaded. (shalin)
 
+* SOLR-11024: ParallelStream should set the StreamContext when constructing SolrStreams (Joel Bernstein)
+
 Optimizations
 ----------------------
 * SOLR-10634: JSON Facet API: When a field/terms facet will retrieve all buckets (i.e. limit:-1)

@@ -555,20 +555,23 @@ function print_usage() {
   echo ""
   echo "Usage: solr auth enable [-type basicAuth] -credentials user:pass [-blockUnknown <true|false>] [-updateIncludeFileOnly <true|false>]"
   echo "       solr auth enable [-type basicAuth] -prompt <true|false> [-blockUnknown <true|false>] [-updateIncludeFileOnly <true|false>]"
+  echo "       solr auth enable -type kerberos -config \"<kerberos configs>\" [-updateIncludeFileOnly <true|false>]"
   echo "       solr auth disable [-updateIncludeFileOnly <true|false>]"
   echo ""
-  echo "  -type <type>               The authentication mechanism to enable. Defaults to 'basicAuth'."
+  echo "  -type <type>               The authentication mechanism (basicAuth or kerberos) to enable. Defaults to 'basicAuth'."
   echo ""
-  echo "  -credentials <user:pass>   The username and password of the initial user"
+  echo "  -credentials <user:pass>   The username and password of the initial user. Applicable for basicAuth only."
   echo "                             Note: only one of -prompt or -credentials must be provided"
   echo ""
-  echo "  -prompt <true|false>       Prompts the user to provide the credentials"
+  echo "  -config \"<configs>\"        Configuration parameters (Solr startup parameters). Required and applicable only for Kerberos"
+  echo ""
+  echo "  -prompt <true|false>       Prompts the user to provide the credentials. Applicable for basicAuth only."
   echo "                             Note: only one of -prompt or -credentials must be provided"
   echo ""
   echo "  -blockUnknown <true|false> When true, this blocks out access to unauthenticated users. When not provided,"
   echo "                             this defaults to false (i.e. unauthenticated users can access all endpoints, except the"
   echo "                             operations like collection-edit, security-edit, core-admin-edit etc.). Check the reference"
-  echo "                             guide for Basic Authentication for more details."
+  echo "                             guide for Basic Authentication for more details. Applicable for basicAuth only."
   echo ""
   echo "  -updateIncludeFileOnly <true|false> Only update the solr.in.sh or solr.in.cmd file, and skip actual enabling/disabling"
   echo "                                      authentication (i.e. don't update security.json)"

@@ -975,6 +978,14 @@ if [[ "$SCRIPT_CMD" == "create" || "$SCRIPT_CMD" == "create_core" || "$SCRIPT_CM
     exit 1
   fi
 
+  if [ "$CREATE_CONFDIR" == "_default" ]; then
+    echo "WARNING: Using _default configset. Data driven schema functionality is enabled by default, which is"
+    echo "         NOT RECOMMENDED for production use."
+    echo
+    echo "         To turn it off:"
+    echo "            curl http://$SOLR_TOOL_HOST:$CREATE_PORT/solr/$CREATE_NAME/config -d '{\"set-user-property\": {\"update.autoCreateFields\":\"false\"}}'"
+  fi
+
   if [[ "$(whoami)" == "root" ]] && [[ "$FORCE" == "false" ]] ; then
     echo "WARNING: Creating cores as the root user can cause Solr to fail and is not advisable. Exiting."
     echo "         If you started Solr as root (not advisable either), force core creation by adding argument -force"

@@ -1242,6 +1253,11 @@ if [[ "$SCRIPT_CMD" == "auth" ]]; then
         AUTH_PARAMS=("${AUTH_PARAMS[@]}" "-credentials" "$AUTH_CREDENTIALS")
         shift 2
         ;;
+      -config)
+        AUTH_CONFIG="`echo $2| base64`"
+        AUTH_PARAMS=("${AUTH_PARAMS[@]}" "-config" "$AUTH_CONFIG")
+        shift 2
+        ;;
      -solrIncludeFile)
         SOLR_INCLUDE="$2"
         shift 2

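Note on the new -config flag: the value is piped through base64 before being handed to SolrCLI, so arbitrary JVM startup parameters survive shell word-splitting and quoting; the Java side (see the handleKerberos hunk in SolrCLI further down) strips whitespace and decodes. A minimal round-trip sketch in Java -- the JAAS path is an illustrative value, not something from this commit:

    // Encode roughly the way bin/solr does (`echo $2 | base64`), then
    // decode the way SolrCLI does, via java.util.Base64.
    String raw = "-Djava.security.auth.login.config=/path/to/jaas-client.conf"; // illustrative
    String encoded = java.util.Base64.getEncoder()
        .encodeToString(raw.getBytes(java.nio.charset.StandardCharsets.UTF_8));
    String decoded = new String(java.util.Base64.getDecoder().decode(encoded),
        java.nio.charset.StandardCharsets.UTF_8);
    assert raw.equals(decoded);
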
@@ -1426,6 +1426,14 @@ if "!CREATE_PORT!"=="" (
   goto err
 )
 
+
+if "!CREATE_CONFDIR!"=="_default" (
+  echo WARNING: Using _default configset. Data driven schema functionality is enabled by default, which is
+  echo          NOT RECOMMENDED for production use.
+  echo          To turn it off:
+  echo             curl http://%SOLR_TOOL_HOST%:!CREATE_PORT!/solr/!CREATE_NAME!/config -d '{"set-user-property": {"update.autoCreateFields":"false"}}'
+)
+
 if "%SCRIPT_CMD%"=="create_core" (
   "%JAVA%" %SOLR_SSL_OPTS% %AUTHC_OPTS% %SOLR_ZK_CREDS_AND_ACLS% -Dsolr.install.dir="%SOLR_TIP%" ^
     -Dlog4j.configuration="file:%DEFAULT_SERVER_DIR%\scripts\cloud-scripts\log4j.properties" ^

@@ -116,8 +116,7 @@ public class LTRRescorer extends Rescorer {
     final LTRScoringQuery.ModelWeight modelWeight = (LTRScoringQuery.ModelWeight) searcher
         .createNormalizedWeight(scoringQuery, true);
 
-    final SolrIndexSearcher solrIndexSearch = (SolrIndexSearcher) searcher;
-    scoreFeatures(solrIndexSearch, firstPassTopDocs,topN, modelWeight, hits, leaves, reranked);
+    scoreFeatures(searcher, firstPassTopDocs,topN, modelWeight, hits, leaves, reranked);
     // Must sort all documents that we reranked, and then select the top
     Arrays.sort(reranked, new Comparator<ScoreDoc>() {
       @Override
@@ -138,7 +137,7 @@ public class LTRRescorer extends Rescorer {
     return new TopDocs(firstPassTopDocs.totalHits, reranked, reranked[0].score);
   }
 
-  public void scoreFeatures(SolrIndexSearcher solrIndexSearch, TopDocs firstPassTopDocs,
+  public void scoreFeatures(IndexSearcher indexSearcher, TopDocs firstPassTopDocs,
       int topN, LTRScoringQuery.ModelWeight modelWeight, ScoreDoc[] hits, List<LeafReaderContext> leaves,
       ScoreDoc[] reranked) throws IOException {
 
@@ -183,8 +182,8 @@ public class LTRRescorer extends Rescorer {
         reranked[hitUpto] = hit;
         // if the heap is not full, maybe I want to log the features for this
         // document
-        if (featureLogger != null) {
-          featureLogger.log(hit.doc, scoringQuery, solrIndexSearch,
+        if (featureLogger != null && indexSearcher instanceof SolrIndexSearcher) {
+          featureLogger.log(hit.doc, scoringQuery, (SolrIndexSearcher)indexSearcher,
               modelWeight.getFeaturesInfo());
         }
       } else if (hitUpto == topN) {
@@ -200,8 +199,8 @@ public class LTRRescorer extends Rescorer {
         if (hit.score > reranked[0].score) {
          reranked[0] = hit;
           heapAdjust(reranked, topN, 0);
-          if (featureLogger != null) {
-            featureLogger.log(hit.doc, scoringQuery, solrIndexSearch,
+          if (featureLogger != null && indexSearcher instanceof SolrIndexSearcher) {
+            featureLogger.log(hit.doc, scoringQuery, (SolrIndexSearcher)indexSearcher,
                 modelWeight.getFeaturesInfo());
           }
         }

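This is the SOLR-10964 cleanup recorded in the CHANGES entries above: scoreFeatures now accepts a plain Lucene IndexSearcher, and the only remaining cast to SolrIndexSearcher is fenced behind an instanceof check at the feature-logging call sites. The guard pattern, extracted from the hunk for clarity:

    // Only feature logging needs Solr-specific APIs, so the cast is
    // performed exactly where it is needed and skipped entirely for
    // plain Lucene searchers.
    if (featureLogger != null && indexSearcher instanceof SolrIndexSearcher) {
      featureLogger.log(hit.doc, scoringQuery, (SolrIndexSearcher) indexSearcher,
          modelWeight.getFeaturesInfo());
    }
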
@@ -31,6 +31,7 @@ import org.apache.solr.common.cloud.Replica;
 import org.apache.solr.common.cloud.Slice;
 import org.apache.solr.common.cloud.SolrZkClient;
 import org.apache.solr.common.cloud.ZkStateReader;
+import org.apache.solr.common.params.CoreAdminParams;
 import org.apache.solr.core.CoreContainer;
 import org.apache.solr.core.CoreDescriptor;
 import org.apache.solr.core.SolrResourceLoader;
@@ -64,10 +65,11 @@ public class CloudUtil {
 
       String cnn = replica.getName();
       String baseUrl = replica.getStr(ZkStateReader.BASE_URL_PROP);
+      boolean isSharedFs = replica.getStr(CoreAdminParams.DATA_DIR) != null;
       log.debug("compare against coreNodeName={} baseUrl={}", cnn, baseUrl);
 
       if (thisCnn != null && thisCnn.equals(cnn)
-          && !thisBaseUrl.equals(baseUrl)) {
+          && !thisBaseUrl.equals(baseUrl) && isSharedFs) {
         if (cc.getLoadedCoreNames().contains(desc.getName())) {
           cc.unload(desc.getName());
         }

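checkSharedFSFailoverReplaced now unloads a superseded core only when the replica advertises an explicit dataDir, i.e. lives on shared storage such as HDFS; the ZkController hunk further down drops its autoAddReplicas pre-check because this guard makes the call safe unconditionally. The effective condition, condensed from the hunk above (the contains() check is elided here):

    // Unload the stale local core only if another node took over the same
    // coreNodeName AND the index lives on a shared filesystem.
    boolean isSharedFs = replica.getStr(CoreAdminParams.DATA_DIR) != null;
    if (thisCnn != null && thisCnn.equals(cnn)
        && !thisBaseUrl.equals(baseUrl) && isSharedFs) {
      cc.unload(desc.getName());
    }
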
@@ -294,6 +294,15 @@ public class CreateCollectionCmd implements Cmd {
         ocmh.forwardToAutoScaling(AutoScaling.AUTO_ADD_REPLICAS_TRIGGER_DSL);
       }
       log.debug("Finished create command on all shards for collection: {}", collectionName);
+
+      // Emit a warning about production use of data driven functionality
+      boolean defaultConfigSetUsed = message.getStr(COLL_CONF) == null ||
+          message.getStr(COLL_CONF).equals(ConfigSetsHandlerApi.DEFAULT_CONFIGSET_NAME);
+      if (defaultConfigSetUsed) {
+        results.add("warning", "Using _default configset. Data driven schema functionality"
+            + " is enabled by default, which is NOT RECOMMENDED for production use. To turn it off:"
+            + " curl http://{host:port}/solr/" + collectionName + "/config -d '{\"set-user-property\": {\"update.autoCreateFields\":\"false\"}}'");
+      }
     } catch (SolrException ex) {
       throw ex;

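The warning is attached to the command results, so it travels back to the caller inside the collection-admin response; the CollectionsAPISolrJTest change near the end of this commit asserts on exactly this. A client-side sketch, assuming a create request has already been issued and response is the resulting CollectionAdminResponse:

    // getWarning() is the accessor the updated test relies on.
    if (response.getWarning() != null
        && response.getWarning().contains("NOT RECOMMENDED for production use")) {
      // The collection was created with the _default configset, so
      // data-driven schema (field guessing) is active.
    }
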
@@ -34,6 +34,8 @@ import org.apache.solr.common.cloud.ZkNodeProps;
 import org.apache.solr.common.params.CoreAdminParams;
 import org.apache.solr.common.util.NamedList;
 import org.apache.solr.common.util.Utils;
+import org.apache.solr.update.UpdateLog;
+import org.apache.solr.util.TimeOut;
 import org.slf4j.Logger;
 import org.slf4j.LoggerFactory;
 
@@ -105,18 +107,15 @@ public class MoveReplicaCmd implements Cmd{
     }
     assert slice != null;
     Object dataDir = replica.get("dataDir");
-    final String ulogDir = replica.getStr("ulogDir");
     if (dataDir != null && dataDir.toString().startsWith("hdfs:/")) {
-      moveHdfsReplica(clusterState, results, dataDir.toString(), ulogDir, targetNode, async, coll, replica, slice, timeout);
+      moveHdfsReplica(clusterState, results, dataDir.toString(), targetNode, async, coll, replica, slice, timeout);
     } else {
       moveNormalReplica(clusterState, results, targetNode, async, coll, replica, slice, timeout);
     }
   }
 
-  private void moveHdfsReplica(ClusterState clusterState, NamedList results, String dataDir, String ulogDir, String targetNode, String async,
+  private void moveHdfsReplica(ClusterState clusterState, NamedList results, String dataDir, String targetNode, String async,
                                DocCollection coll, Replica replica, Slice slice, int timeout) throws Exception {
-    String newCoreName = Assign.buildCoreName(coll, slice.getName(), replica.getType());
 
     ZkNodeProps removeReplicasProps = new ZkNodeProps(
         COLLECTION_PROP, coll.getName(),
         SHARD_ID_PROP, slice.getName(),
@@ -135,16 +134,32 @@ public class MoveReplicaCmd implements Cmd{
       return;
     }
 
+    TimeOut timeOut = new TimeOut(20L, TimeUnit.SECONDS);
+    while (!timeOut.hasTimedOut()) {
+      coll = ocmh.zkStateReader.getClusterState().getCollection(coll.getName());
+      if (coll.getReplica(replica.getName()) != null) {
+        Thread.sleep(100);
+      } else {
+        break;
+      }
+    }
+    if (timeOut.hasTimedOut()) {
+      results.add("failure", "Still see deleted replica in clusterstate!");
+      return;
+    }
+
+    String ulogDir = replica.getStr(CoreAdminParams.ULOG_DIR);
     ZkNodeProps addReplicasProps = new ZkNodeProps(
         COLLECTION_PROP, coll.getName(),
         SHARD_ID_PROP, slice.getName(),
         CoreAdminParams.NODE, targetNode,
-        CoreAdminParams.NAME, newCoreName,
-        CoreAdminParams.DATA_DIR, dataDir,
-        CoreAdminParams.ULOG_DIR, ulogDir);
+        CoreAdminParams.CORE_NODE_NAME, replica.getName(),
+        CoreAdminParams.NAME, replica.getCoreName(),
+        CoreAdminParams.ULOG_DIR, ulogDir.substring(0, ulogDir.lastIndexOf(UpdateLog.TLOG_NAME)),
+        CoreAdminParams.DATA_DIR, dataDir);
     if(async!=null) addReplicasProps.getProperties().put(ASYNC, async);
     NamedList addResult = new NamedList();
-    ocmh.addReplica(clusterState, addReplicasProps, addResult, null);
+    ocmh.addReplica(ocmh.zkStateReader.getClusterState(), addReplicasProps, addResult, null);
     if (addResult.get("failure") != null) {
       String errorString = String.format(Locale.ROOT, "Failed to create replica for collection=%s shard=%s" +
           " on node=%s", coll.getName(), slice.getName(), targetNode);
@@ -153,7 +168,7 @@ public class MoveReplicaCmd implements Cmd{
       return;
     } else {
       String successString = String.format(Locale.ROOT, "MOVEREPLICA action completed successfully, moved replica=%s at node=%s " +
-          "to replica=%s at node=%s", replica.getCoreName(), replica.getNodeName(), newCoreName, targetNode);
+          "to replica=%s at node=%s", replica.getCoreName(), replica.getNodeName(), replica.getCoreName(), targetNode);
       results.add("success", successString);
     }
   }

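Two fixes land here for SOLR-11041 and SOLR-11045: the ulog dir is now read from the replica properties (with the trailing tlog segment stripped via UpdateLog.TLOG_NAME, since the new core re-creates it), and the replacement HDFS replica keeps the old coreNodeName and core name. The delete is also confirmed before re-adding, using the polling pattern below -- lifted from the hunk above and shown in isolation:

    // Wait up to 20s for the deleted replica to disappear from cluster
    // state before creating its successor under the same names.
    TimeOut timeOut = new TimeOut(20L, TimeUnit.SECONDS);
    while (!timeOut.hasTimedOut()) {
      coll = ocmh.zkStateReader.getClusterState().getCollection(coll.getName());
      if (coll.getReplica(replica.getName()) == null) {
        break;               // gone; safe to proceed
      }
      Thread.sleep(100);     // still visible; poll again
    }
    if (timeOut.hasTimedOut()) {
      results.add("failure", "Still see deleted replica in clusterstate!");
      return;
    }
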
@@ -208,9 +208,9 @@ public class Overseer implements Closeable {
         @Override
         public void onEnqueue() throws Exception {
           if (!itemWasMoved[0]) {
+            workQueue.offer(data);
             stateUpdateQueue.poll();
             itemWasMoved[0] = true;
-            workQueue.offer(data);
           }
         }
 

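The reorder is the whole point of this hunk: the item is copied onto the work queue before it is removed from the state-update queue, so there is no window in which it exists in neither queue if the Overseer dies between the two operations.

    // Crash-safe ordering: duplicate first, then delete. A failure after
    // offer() leaves the item in both queues (harmless to reprocess);
    // the old order could leave it in neither.
    workQueue.offer(data);
    stateUpdateQueue.poll();
    itemWasMoved[0] = true;
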
@@ -2250,13 +2250,10 @@ public class ZkController {
       DocCollection collection = clusterState.getCollectionOrNull(desc
           .getCloudDescriptor().getCollectionName());
       if (collection != null) {
-        boolean autoAddReplicas = ClusterStateUtil.isAutoAddReplicas(getZkStateReader(), collection.getName());
-        if (autoAddReplicas) {
-          CloudUtil.checkSharedFSFailoverReplaced(cc, desc);
-        }
+        CloudUtil.checkSharedFSFailoverReplaced(cc, desc);
       }
     }
   }
 
   /**
    * Add a listener to be notified once there is a new session created after a ZooKeeper session expiration occurs;

@@ -217,7 +217,7 @@ public class StreamHandler extends RequestHandlerBase implements SolrCoreAware,
       .withFunctionName("scale", ScaleEvaluator.class)
       .withFunctionName("sequence", SequenceEvaluator.class)
       .withFunctionName("addAll", AddAllEvaluator.class)
+      .withFunctionName("residuals", ResidualsEvaluator.class)
 
       // Boolean Stream Evaluators
       .withFunctionName("and", AndEvaluator.class)

@@ -126,6 +126,7 @@ import static org.apache.solr.common.params.CoreAdminParams.DELETE_DATA_DIR;
 import static org.apache.solr.common.params.CoreAdminParams.DELETE_INDEX;
 import static org.apache.solr.common.params.CoreAdminParams.DELETE_INSTANCE_DIR;
 import static org.apache.solr.common.params.CoreAdminParams.INSTANCE_DIR;
+import static org.apache.solr.common.params.CoreAdminParams.ULOG_DIR;
 import static org.apache.solr.common.params.ShardParams._ROUTE_;
 import static org.apache.solr.common.util.StrUtils.formatString;
 
@@ -633,6 +634,7 @@ public class CollectionsHandler extends RequestHandlerBase implements Permission
           CoreAdminParams.NAME,
           INSTANCE_DIR,
           DATA_DIR,
+          ULOG_DIR,
           REPLICA_TYPE);
       return copyPropertiesWithPrefix(req.getParams(), props, COLL_PROP_PREFIX);
     }),

@@ -204,15 +204,12 @@ public class QueryElevationComponent extends SearchComponent implements SolrCore
     }
     core.addTransformerFactory(markerName, elevatedMarkerFactory);
     forceElevation = initArgs.getBool(QueryElevationParams.FORCE_ELEVATION, forceElevation);
 
+    String f = initArgs.get(CONFIG_FILE);
+    if (f != null) {
       try {
         synchronized (elevationCache) {
           elevationCache.clear();
-          String f = initArgs.get(CONFIG_FILE);
-          if (f == null) {
-            throw new SolrException(SolrException.ErrorCode.SERVER_ERROR,
-                "QueryElevationComponent must specify argument: '" + CONFIG_FILE
-                    + "' -- path to elevate.xml");
-          }
           boolean exists = false;
 
           // check if using ZooKeeper
@@ -253,6 +250,7 @@ public class QueryElevationComponent extends SearchComponent implements SolrCore
             "Error initializing QueryElevationComponent.", ex);
       }
     }
+  }
 
   //get the elevation map from the data dir
   Map<String, ElevationObj> getElevationMap(IndexReader reader, SolrCore core) throws Exception {

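With this change (SOLR-11021) the component tolerates a missing config-file argument: instead of throwing SERVER_ERROR at init time, the whole load block is skipped, which is what lets the _default configset drop elevate.xml entirely (see the deleted file and the solrconfig.xml hunk below). The new shape of the method, reduced to its skeleton:

    // Sketch of the new control flow; the body is unchanged apart from
    // being wrapped in the null check.
    String f = initArgs.get(CONFIG_FILE);
    if (f != null) {
      // resolve elevate.xml from ZooKeeper, the config dir, or the data
      // dir, and populate elevationCache -- exactly as before
    }
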
@@ -179,15 +179,14 @@ public class ExactStatsCache extends StatsCache {
         String termStatsString = StatsUtil.termStatsMapToString(statsMap);
         rb.rsp.add(TERM_STATS_KEY, termStatsString);
         if (LOG.isDebugEnabled()) {
-          LOG.debug("termStats=" + termStatsString + ", terms=" + terms + ", numDocs=" + searcher.maxDoc());
+          LOG.debug("termStats={}, terms={}, numDocs={}", termStatsString, terms, searcher.maxDoc());
         }
       }
       if (colMap.size() != 0){
         String colStatsString = StatsUtil.colStatsMapToString(colMap);
         rb.rsp.add(COL_STATS_KEY, colStatsString);
         if (LOG.isDebugEnabled()) {
-          LOG.debug("collectionStats="
-              + colStatsString + ", terms=" + terms + ", numDocs=" + searcher.maxDoc());
+          LOG.debug("collectionStats={}, terms={}, numDocs={}", colStatsString, terms, searcher.maxDoc());
         }
       }
     } catch (IOException e) {

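This is the SOLR-10985 cleanup: string concatenation in debug statements is replaced with SLF4J's {} placeholders, which defer message formatting until the level is known to be enabled. The LRUStatsCache and LocalStatsCache hunks below apply the same idea, with LocalStatsCache additionally wrapping a loop of debug calls in an isDebugEnabled() guard. The idiom:

    // Concatenation builds the string even when debug is off:
    //   LOG.debug("termStats=" + termStatsString + ", terms=" + terms);
    // The parameterized form formats lazily, and an explicit guard also
    // skips any expensive argument computation:
    if (LOG.isDebugEnabled()) {
      LOG.debug("termStats={}, terms={}, numDocs={}",
          termStatsString, terms, searcher.maxDoc());
    }
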
@@ -136,7 +136,7 @@ public class LRUStatsCache extends ExactStatsCache {
       throws IOException {
     TermStats termStats = termStatsCache.get(term.toString());
     if (termStats == null) {
-      LOG.debug("## Missing global termStats info: {}, using local", term.toString());
+      LOG.debug("## Missing global termStats info: {}, using local", term);
       return localSearcher.localTermStatistics(term, context);
     } else {
       return termStats.toTermStatistics();

@@ -38,7 +38,7 @@ public class LocalStatsCache extends StatsCache {
 
   @Override
   public StatsSource get(SolrQueryRequest req) {
-    LOG.debug("## GET {}", req.toString());
+    LOG.debug("## GET {}", req);
     return new LocalStatsSource();
   }
 
@@ -49,31 +49,33 @@ public class LocalStatsCache extends StatsCache {
   // by returning null we don't create additional round-trip request.
   @Override
   public ShardRequest retrieveStatsRequest(ResponseBuilder rb) {
-    LOG.debug("## RDR {}", rb.req.toString());
+    LOG.debug("## RDR {}", rb.req);
     return null;
   }
 
   @Override
   public void mergeToGlobalStats(SolrQueryRequest req,
       List<ShardResponse> responses) {
-    LOG.debug("## MTGD {}", req.toString());
+    if (LOG.isDebugEnabled()) {
+      LOG.debug("## MTGD {}", req);
       for (ShardResponse r : responses) {
         LOG.debug(" - {}", r);
       }
+    }
   }
 
   @Override
   public void returnLocalStats(ResponseBuilder rb, SolrIndexSearcher searcher) {
-    LOG.debug("## RLD {}", rb.req.toString());
+    LOG.debug("## RLD {}", rb.req);
   }
 
   @Override
   public void receiveGlobalStats(SolrQueryRequest req) {
-    LOG.debug("## RGD {}", req.toString());
+    LOG.debug("## RGD {}", req);
   }
 
   @Override
   public void sendGlobalStats(ResponseBuilder rb, ShardRequest outgoing) {
-    LOG.debug("## SGD {}", outgoing.toString());
+    LOG.debug("## SGD {}", outgoing);
   }
 }

@@ -43,6 +43,7 @@ import java.time.Instant;
 import java.time.Period;
 import java.util.ArrayList;
 import java.util.Arrays;
+import java.util.Base64;
 import java.util.Collection;
 import java.util.Enumeration;
 import java.util.HashMap;
@@ -115,6 +116,7 @@ import org.apache.solr.common.params.CommonParams;
 import org.apache.solr.common.params.ModifiableSolrParams;
 import org.apache.solr.common.util.ContentStreamBase;
 import org.apache.solr.common.util.NamedList;
+import org.apache.solr.common.util.StrUtils;
 import org.apache.solr.security.Sha256AuthenticationProvider;
 import org.apache.solr.util.configuration.SSLConfigurationsFactory;
 import org.noggit.CharArr;
@@ -3548,7 +3550,7 @@ public class SolrCLI {
         OptionBuilder
             .withArgName("type")
             .hasArg()
-            .withDescription("The authentication mechanism to enable. Defaults to 'basicAuth'.")
+            .withDescription("The authentication mechanism to enable (basicAuth or kerberos). Defaults to 'basicAuth'.")
             .create("type"),
         OptionBuilder
             .withArgName("credentials")
@@ -3561,6 +3563,11 @@ public class SolrCLI {
             .withDescription("Prompts the user to provide the credentials. Use either -credentials or -prompt, not both")
             .create("prompt"),
         OptionBuilder
+            .withArgName("config")
+            .hasArgs()
+            .withDescription("Configuration parameters (Solr startup parameters). Required for Kerberos authentication")
+            .create("config"),
+        OptionBuilder
             .withArgName("blockUnknown")
             .withDescription("Blocks all access for unknown users (requires authentication for all endpoints)")
             .hasArg()

@@ -3603,11 +3610,141 @@ public class SolrCLI {
       }
 
       String type = cli.getOptionValue("type", "basicAuth");
-      if (type.equalsIgnoreCase("basicAuth") == false) {
-        System.out.println("Only type=basicAuth supported at the moment.");
+      switch (type) {
+        case "basicAuth":
+          return handleBasicAuth(cli);
+        case "kerberos":
+          return handleKerberos(cli);
+        default:
+          System.out.println("Only type=basicAuth or kerberos supported at the moment.");
+          exit(1);
+      }
+      return 1;
+    }
+
+    private int handleKerberos(CommandLine cli) throws Exception {
+      String cmd = cli.getArgs()[0];
+      boolean updateIncludeFileOnly = Boolean.parseBoolean(cli.getOptionValue("updateIncludeFileOnly", "false"));
+      String securityJson = "{" +
+          "\n  \"authentication\":{" +
+          "\n   \"class\":\"solr.KerberosPlugin\"" +
+          "\n  }" +
+          "\n}";
+
+
+      switch (cmd) {
+        case "enable":
+          String zkHost = null;
+          boolean zkInaccessible = false;
+
+          if (!updateIncludeFileOnly) {
+            try {
+              zkHost = getZkHost(cli);
+            } catch (Exception ex) {
+              System.out.println("Unable to access ZooKeeper. Please add the following security.json to ZooKeeper (in case of SolrCloud):\n"
+                  + securityJson + "\n");
+              zkInaccessible = true;
+            }
+            if (zkHost == null) {
+              if (zkInaccessible == false) {
+                System.out.println("Unable to access ZooKeeper. Please add the following security.json to ZooKeeper (in case of SolrCloud):\n"
+                    + securityJson + "\n");
+                zkInaccessible = true;
+              }
+            }
+
+            // check if security is already enabled or not
+            if (!zkInaccessible) {
+              try (SolrZkClient zkClient = new SolrZkClient(zkHost, 10000)) {
+                if (zkClient.exists("/security.json", true)) {
+                  byte oldSecurityBytes[] = zkClient.getData("/security.json", null, null, true);
+                  if (!"{}".equals(new String(oldSecurityBytes, StandardCharsets.UTF_8).trim())) {
+                    System.out.println("Security is already enabled. You can disable it with 'bin/solr auth disable'. Existing security.json: \n"
+                        + new String(oldSecurityBytes, StandardCharsets.UTF_8));
+                    exit(1);
+                  }
+                }
+              } catch (Exception ex) {
+                if (zkInaccessible == false) {
+                  System.out.println("Unable to access ZooKeeper. Please add the following security.json to ZooKeeper (in case of SolrCloud):\n"
+                      + securityJson + "\n");
+                  zkInaccessible = true;
+                }
+              }
+            }
+          }
+
+          if (!updateIncludeFileOnly) {
+            if (!zkInaccessible) {
+              System.out.println("Uploading following security.json: " + securityJson);
+              try (SolrZkClient zkClient = new SolrZkClient(zkHost, 10000)) {
+                zkClient.setData("/security.json", securityJson.getBytes(StandardCharsets.UTF_8), true);
+              } catch (Exception ex) {
+                if (zkInaccessible == false) {
+                  System.out.println("Unable to access ZooKeeper. Please add the following security.json to ZooKeeper (in case of SolrCloud):\n"
+                      + securityJson);
+                  zkInaccessible = true;
+                }
+              }
+            }
+          }
+
+          String config = StrUtils.join(Arrays.asList(cli.getOptionValues("config")), ' ');
+          // config is base64 encoded (to get around parsing problems), decode it
+          config = config.replaceAll(" ", "");
+          config = new String(Base64.getDecoder().decode(config.getBytes("UTF-8")), "UTF-8");
+          config = config.replaceAll("\n", "").replaceAll("\r", "");
+
+          String solrIncludeFilename = cli.getOptionValue("solrIncludeFile");
+          File includeFile = new File(solrIncludeFilename);
+          if (includeFile.exists() == false || includeFile.canWrite() == false) {
+            System.out.println("Solr include file " + solrIncludeFilename + " doesn't exist or is not writeable.");
+            printAuthEnablingInstructions(config);
+            System.exit(0);
+          }
+
+          // update the solr.in.sh file to contain the necessary authentication lines
+          updateIncludeFileEnableAuth(includeFile, null, config);
+          System.out.println("Please restart any running Solr nodes.");
+          return 0;
+
+        case "disable":
+          if (!updateIncludeFileOnly) {
+            zkHost = getZkHost(cli);
+            if (zkHost == null) {
+              stdout.print("ZK Host not found. Solr should be running in cloud mode");
               exit(1);
             }
+
+            System.out.println("Uploading following security.json: {}");
+
+            try (SolrZkClient zkClient = new SolrZkClient(zkHost, 10000)) {
+              zkClient.setData("/security.json", "{}".getBytes(StandardCharsets.UTF_8), true);
+            }
+          }
+
+          solrIncludeFilename = cli.getOptionValue("solrIncludeFile");
+          includeFile = new File(solrIncludeFilename);
+          if (!includeFile.exists() || !includeFile.canWrite()) {
+            System.out.println("Solr include file " + solrIncludeFilename + " doesn't exist or is not writeable.");
+            System.out.println("Security has been disabled. Please remove any SOLR_AUTH_TYPE or SOLR_AUTHENTICATION_OPTS configuration from solr.in.sh/solr.in.cmd.\n");
+            System.exit(0);
+          }
+
+          // update the solr.in.sh file to comment out the necessary authentication lines
+          updateIncludeFileDisableAuth(includeFile);
+          return 0;
+
+        default:
+          System.out.println("Valid auth commands are: enable, disable");
+          exit(1);
+      }
+
+      System.out.println("Options not understood.");
+      new HelpFormatter().printHelp("bin/solr auth <enable|disable> [OPTIONS]", getToolOptions(this));
+      return 1;
+    }
+
+    private int handleBasicAuth(CommandLine cli) throws Exception {
       String cmd = cli.getArgs()[0];
       boolean prompt = Boolean.parseBoolean(cli.getOptionValue("prompt", "false"));
       boolean updateIncludeFileOnly = Boolean.parseBoolean(cli.getOptionValue("updateIncludeFileOnly", "false"));

@@ -3715,7 +3852,7 @@ public class SolrCLI {
               "httpBasicAuthUser=" + username + "\nhttpBasicAuthPassword=" + password, StandardCharsets.UTF_8);
 
           // update the solr.in.sh file to contain the necessary authentication lines
-          updateIncludeFileEnableAuth(includeFile, basicAuthConfFile.getAbsolutePath());
+          updateIncludeFileEnableAuth(includeFile, basicAuthConfFile.getAbsolutePath(), null);
           return 0;
 
         case "disable":
@@ -3754,7 +3891,6 @@ public class SolrCLI {
       new HelpFormatter().printHelp("bin/solr auth <enable|disable> [OPTIONS]", getToolOptions(this));
       return 1;
     }
 
     private void printAuthEnablingInstructions(String username, String password) {
       if (SystemUtils.IS_OS_WINDOWS) {
         System.out.println("\nAdd the following lines to the solr.in.cmd file so that the solr.cmd script can use subsequently.\n");
@@ -3766,8 +3902,26 @@ public class SolrCLI {
             + "SOLR_AUTHENTICATION_OPTS=\"-Dbasicauth=" + username + ":" + password + "\"\n");
       }
     }
+
+    private void printAuthEnablingInstructions(String kerberosConfig) {
+      if (SystemUtils.IS_OS_WINDOWS) {
+        System.out.println("\nAdd the following lines to the solr.in.cmd file so that the solr.cmd script can use subsequently.\n");
+        System.out.println("set SOLR_AUTH_TYPE=kerberos\n"
+            + "set SOLR_AUTHENTICATION_OPTS=\"" + kerberosConfig + "\"\n");
+      } else {
+        System.out.println("\nAdd the following lines to the solr.in.sh file so that the ./solr script can use subsequently.\n");
+        System.out.println("SOLR_AUTH_TYPE=\"kerberos\"\n"
+            + "SOLR_AUTHENTICATION_OPTS=\"" + kerberosConfig + "\"\n");
+      }
+    }
 
-    private void updateIncludeFileEnableAuth(File includeFile, String basicAuthConfFile) throws IOException {
+    /**
+     * This will update the include file (e.g. solr.in.sh / solr.in.cmd) with the authentication parameters.
+     * @param includeFile The include file
+     * @param basicAuthConfFile If basicAuth, the path of the file containing credentials. If not, null.
+     * @param kerberosConfig If kerberos, the config string containing startup parameters. If not, null.
+     */
+    private void updateIncludeFileEnableAuth(File includeFile, String basicAuthConfFile, String kerberosConfig) throws IOException {
+      assert !(basicAuthConfFile != null && kerberosConfig != null); // only one of the two needs to be populated
       List<String> includeFileLines = FileUtils.readLines(includeFile, StandardCharsets.UTF_8);
       for (int i=0; i<includeFileLines.size(); i++) {
         String line = includeFileLines.get(i);
@@ -3780,6 +3934,8 @@ public class SolrCLI {
         }
       }
       includeFileLines.add(""); // blank line
+
+      if (basicAuthConfFile != null) { // for basicAuth
         if (SystemUtils.IS_OS_WINDOWS) {
           includeFileLines.add("REM The following lines added by solr.cmd for enabling BasicAuth");
           includeFileLines.add("set SOLR_AUTH_TYPE=basic");
@@ -3789,9 +3945,23 @@ public class SolrCLI {
           includeFileLines.add("SOLR_AUTH_TYPE=\"basic\"");
           includeFileLines.add("SOLR_AUTHENTICATION_OPTS=\"-Dsolr.httpclient.config=" + basicAuthConfFile + "\"");
         }
+      } else { // for kerberos
+        if (SystemUtils.IS_OS_WINDOWS) {
+          includeFileLines.add("REM The following lines added by solr.cmd for enabling BasicAuth");
+          includeFileLines.add("set SOLR_AUTH_TYPE=kerberos");
+          includeFileLines.add("set SOLR_AUTHENTICATION_OPTS=\"-Dsolr.httpclient.config=" + basicAuthConfFile + "\"");
+        } else {
+          includeFileLines.add("# The following lines added by ./solr for enabling BasicAuth");
+          includeFileLines.add("SOLR_AUTH_TYPE=\"kerberos\"");
+          includeFileLines.add("SOLR_AUTHENTICATION_OPTS=\"" + kerberosConfig + "\"");
+        }
+      }
       FileUtils.writeLines(includeFile, StandardCharsets.UTF_8.name(), includeFileLines);
 
-      System.out.println("Written out credentials file: " + basicAuthConfFile + ", updated Solr include file: " + includeFile.getAbsolutePath() + ".");
+      if (basicAuthConfFile != null) {
+        System.out.println("Written out credentials file: " + basicAuthConfFile);
+      }
+      System.out.println("Updated Solr include file: " + includeFile.getAbsolutePath());
     }
 
     private void updateIncludeFileDisableAuth(File includeFile) throws IOException {

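updateIncludeFileEnableAuth now serves both schemes: exactly one of basicAuthConfFile and kerberosConfig may be non-null, as the new assert documents. A usage sketch of the two call shapes introduced in this commit (variable names as in the hunks above):

    // basicAuth: point SOLR_AUTHENTICATION_OPTS at a credentials file.
    updateIncludeFileEnableAuth(includeFile, basicAuthConfFile.getAbsolutePath(), null);
    // kerberos: write the decoded startup parameters verbatim.
    updateIncludeFileEnableAuth(includeFile, null, config);
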
@@ -18,14 +18,14 @@
 
 <schema name="add-schema-fields-update-processor" version="1.6">
 
-  <fieldType name="tint" class="solr.TrieIntField" precisionStep="8" multiValued="true" positionIncrementGap="0"/>
-  <fieldType name="tfloat" class="solr.TrieFloatField" precisionStep="8" multiValued="true" positionIncrementGap="0"/>
-  <fieldType name="tlong" class="solr.TrieLongField" precisionStep="8" multiValued="true" positionIncrementGap="0"/>
-  <fieldType name="tdouble" class="solr.TrieDoubleField" precisionStep="8" multiValued="true" positionIncrementGap="0"/>
-  <fieldType name="tdate" class="solr.TrieDateField" precisionStep="6" multiValued="true" positionIncrementGap="0"/>
+  <fieldType name="tint" class="${solr.tests.IntegerFieldType}" docValues="${solr.tests.numeric.dv}" precisionStep="8" multiValued="true" positionIncrementGap="0"/>
+  <fieldType name="tfloat" class="${solr.tests.FloatFieldType}" docValues="${solr.tests.numeric.dv}" precisionStep="8" multiValued="true" positionIncrementGap="0"/>
+  <fieldType name="tlong" class="${solr.tests.LongFieldType}" docValues="${solr.tests.numeric.dv}" precisionStep="8" multiValued="true" positionIncrementGap="0"/>
+  <fieldType name="tdouble" class="${solr.tests.DoubleFieldType}" docValues="${solr.tests.numeric.dv}" precisionStep="8" multiValued="true" positionIncrementGap="0"/>
+  <fieldType name="tdate" class="${solr.tests.DateFieldType}" docValues="${solr.tests.numeric.dv}" precisionStep="6" multiValued="true" positionIncrementGap="0"/>
   <fieldType name="boolean" class="solr.BoolField" sortMissingLast="true" multiValued="true"/>
   <fieldType name="string" class="solr.StrField" sortMissingLast="true"/>
-  <fieldType name="long" class="solr.TrieLongField" precisionStep="0" positionIncrementGap="0"/>
+  <fieldType name="long" class="${solr.tests.LongFieldType}" docValues="${solr.tests.numeric.dv}" precisionStep="0" positionIncrementGap="0"/>
   <fieldType name="text" class="solr.TextField" multiValued="true" positionIncrementGap="100">
     <analyzer>
       <tokenizer class="solr.StandardTokenizerFactory"/>

@@ -17,9 +17,9 @@
 -->
 
 <schema name="test" version="1.0">
-  <fieldType name="int" class="solr.TrieIntField" precisionStep="0" positionIncrementGap="0"/>
-  <fieldType name="float" class="solr.TrieFloatField" precisionStep="0" positionIncrementGap="0"/>
-  <fieldType name="long" class="solr.TrieLongField" precisionStep="0" omitNorms="true" positionIncrementGap="0"/>
+  <fieldType name="int" class="${solr.tests.IntegerFieldType}" docValues="${solr.tests.numeric.dv}" precisionStep="0" positionIncrementGap="0"/>
+  <fieldType name="float" class="${solr.tests.FloatFieldType}" docValues="${solr.tests.numeric.dv}" precisionStep="0" positionIncrementGap="0"/>
+  <fieldType name="long" class="${solr.tests.LongFieldType}" docValues="${solr.tests.numeric.dv}" precisionStep="0" omitNorms="true" positionIncrementGap="0"/>
   <fieldtype name="string" class="solr.StrField" sortMissingLast="true"/>
 
   <field name="id" type="string" indexed="true" stored="true" multiValued="false" required="false"/>

@@ -17,8 +17,8 @@
 -->
 
 <schema name="test-custom-field-sort" version="1.6">
-  <fieldType name="int" class="solr.TrieIntField" precisionStep="0" omitNorms="true" positionIncrementGap="0"/>
-  <fieldType name="long" class="solr.TrieLongField" precisionStep="0" omitNorms="true" positionIncrementGap="0"/>
+  <fieldType name="int" class="${solr.tests.IntegerFieldType}" docValues="${solr.tests.numeric.dv}" precisionStep="0" omitNorms="true" positionIncrementGap="0"/>
+  <fieldType name="long" class="${solr.tests.LongFieldType}" docValues="${solr.tests.numeric.dv}" precisionStep="0" omitNorms="true" positionIncrementGap="0"/>
   <fieldType name="string" class="solr.StrField" sortMissingLast="true"/>
   <fieldType name="text" class="solr.TextField">
     <analyzer>

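These schema hunks implement the SOLR-11048/11059/11060 randomization noted in the CHANGES entries: hard-coded Trie classes become ${...} property placeholders that the test framework resolves before the schema is parsed, randomly choosing Trie or Point numerics per run. A sketch of how such properties could be seeded -- the property names come straight from the placeholders above, but the concrete values are illustrative, since the framework picks them randomly:

    // Before core/schema initialization in a test run:
    System.setProperty("solr.tests.IntegerFieldType", "solr.IntPointField"); // or solr.TrieIntField
    System.setProperty("solr.tests.LongFieldType", "solr.LongPointField");   // or solr.TrieLongField
    System.setProperty("solr.tests.numeric.dv", "true"); // Point fields typically need docValues
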
@@ -1,42 +0,0 @@
-<?xml version="1.0" encoding="UTF-8" ?>
-<!--
- Licensed to the Apache Software Foundation (ASF) under one or more
- contributor license agreements.  See the NOTICE file distributed with
- this work for additional information regarding copyright ownership.
- The ASF licenses this file to You under the Apache License, Version 2.0
- (the "License"); you may not use this file except in compliance with
- the License.  You may obtain a copy of the License at
-
-     http://www.apache.org/licenses/LICENSE-2.0
-
- Unless required by applicable law or agreed to in writing, software
- distributed under the License is distributed on an "AS IS" BASIS,
- WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
- See the License for the specific language governing permissions and
- limitations under the License.
--->
-
-<!-- If this file is found in the config directory, it will only be
-     loaded once at startup.  If it is found in Solr's data
-     directory, it will be re-loaded every commit.
-
-     See http://wiki.apache.org/solr/QueryElevationComponent for more info
-
--->
-<elevate>
- <!-- Query elevation examples
-  <query text="foo bar">
-    <doc id="1" />
-    <doc id="2" />
-    <doc id="3" />
-  </query>
-
-  for use with techproducts example
-
-  <query text="ipod">
-    <doc id="MA147LL/A" /> put the actual ipod at the top
-    <doc id="IW-02" exclude="true" /> exclude this cable
-  </query>
--->
-
-</elevate>

@@ -1004,7 +1004,6 @@
 <searchComponent name="elevator" class="solr.QueryElevationComponent" >
   <!-- pick a fieldType to analyze queries -->
   <str name="queryFieldType">string</str>
-  <str name="config-file">elevate.xml</str>
 </searchComponent>

 <!-- A request handler for demonstrating the elevator component -->
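For orientation, the component trimmed here is still driven per request by the standard elevation parameters. A hedged SolrJ sketch — the URL and collection are assumptions from the guide's running examples, while `enableElevation`/`forceElevation` are the component's documented knobs:

[source,java]
----
import java.io.IOException;
import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.SolrServerException;
import org.apache.solr.client.solrj.impl.HttpSolrClient;
import org.apache.solr.client.solrj.response.QueryResponse;

public class ElevationQueryExample {
  public static void main(String[] args) throws SolrServerException, IOException {
    try (HttpSolrClient client = new HttpSolrClient.Builder("http://localhost:8983/solr/techproducts").build()) {
      SolrQuery q = new SolrQuery("ipod");
      q.set("enableElevation", "true"); // apply configured elevations
      q.set("forceElevation", "true");  // elevate even when an explicit sort is given
      QueryResponse rsp = client.query(q);
      System.out.println("Top hit: " + rsp.getResults().get(0).getFieldValue("id"));
    }
  }
}
----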
@@ -74,9 +74,10 @@ public class CollectionsAPISolrJTest extends SolrCloudTestCase {
       assertEquals(0, (int)status.get("status"));
       assertTrue(status.get("QTime") > 0);
     }

+    // Use of _default configset should generate a warning for data-driven functionality in production use
+    assertTrue(response.getWarning() != null && response.getWarning().contains("NOT RECOMMENDED for production use"));

     response = CollectionAdminRequest.deleteCollection(collectionName).process(cluster.getSolrClient());

     assertEquals(0, response.getStatus());
     assertTrue(response.isSuccess());
     Map<String,NamedList<Integer>> nodesStatus = response.getCollectionNodesStatus();
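The assertion added above keys off the warning Solr returns when a collection is created without naming a configset. A hedged sketch of the behavior under test, inside a `SolrCloudTestCase` context where `cluster` is available (the collection name and sizing are illustrative):

[source,java]
----
// Hedged sketch: create a collection with no explicit configset, so _default is used.
CollectionAdminResponse response = CollectionAdminRequest
    .createCollection("myColl", 1, 1)
    .process(cluster.getSolrClient());
// The response is expected to carry a "NOT RECOMMENDED for production use"
// warning flagging the data-driven (schemaless) defaults.
assertNotNull(response.getWarning());
----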
@@ -54,7 +54,6 @@ public class MoveReplicaHDFSTest extends MoveReplicaTest {
     dfsCluster = null;
   }
-
   public static class ForkJoinThreadsFilter implements ThreadFilter {

     @Override
@@ -0,0 +1,142 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.solr.cloud;
+
+import java.io.IOException;
+
+import com.carrotsearch.randomizedtesting.annotations.ThreadLeakFilters;
+import org.apache.hadoop.hdfs.MiniDFSCluster;
+import org.apache.solr.client.solrj.SolrClient;
+import org.apache.solr.client.solrj.SolrQuery;
+import org.apache.solr.client.solrj.SolrServerException;
+import org.apache.solr.client.solrj.request.CollectionAdminRequest;
+import org.apache.solr.client.solrj.response.CollectionAdminResponse;
+import org.apache.solr.cloud.hdfs.HdfsTestUtil;
+import org.apache.solr.common.SolrInputDocument;
+import org.apache.solr.common.cloud.ClusterStateUtil;
+import org.apache.solr.common.cloud.DocCollection;
+import org.apache.solr.common.cloud.Replica;
+import org.apache.solr.common.cloud.ZkConfigManager;
+import org.apache.solr.common.cloud.ZkStateReader;
+import org.apache.solr.util.BadHdfsThreadsFilter;
+import org.junit.AfterClass;
+import org.junit.BeforeClass;
+import org.junit.Test;
+
+@ThreadLeakFilters(defaultFilters = true, filters = {
+    BadHdfsThreadsFilter.class, // hdfs currently leaks thread(s)
+    MoveReplicaHDFSTest.ForkJoinThreadsFilter.class
+})
+public class MoveReplicaHDFSUlogDirTest extends SolrCloudTestCase {
+  private static MiniDFSCluster dfsCluster;
+
+  @BeforeClass
+  public static void setupClass() throws Exception {
+    configureCluster(2)
+        .addConfig("conf1", TEST_PATH().resolve("configsets").resolve("cloud-dynamic").resolve("conf"))
+        .configure();
+
+    System.setProperty("solr.hdfs.blockcache.enabled", "false");
+    dfsCluster = HdfsTestUtil.setupClass(createTempDir().toFile().getAbsolutePath());
+
+    ZkConfigManager configManager = new ZkConfigManager(zkClient());
+    configManager.uploadConfigDir(configset("cloud-hdfs"), "conf1");
+
+    System.setProperty("solr.hdfs.home", HdfsTestUtil.getDataDir(dfsCluster, "data"));
+  }
+
+  @AfterClass
+  public static void teardownClass() throws Exception {
+    cluster.shutdown(); // need to close before the MiniDFSCluster
+    HdfsTestUtil.teardownClass(dfsCluster);
+    dfsCluster = null;
+  }
+
+  @Test
+  public void testDataDirAndUlogAreMaintained() throws Exception {
+    String coll = "movereplicatest_coll2";
+    CollectionAdminRequest.createCollection(coll, "conf1", 1, 1)
+        .setCreateNodeSet("")
+        .process(cluster.getSolrClient());
+    String hdfsUri = HdfsTestUtil.getURI(dfsCluster);
+    String dataDir = hdfsUri + "/dummyFolder/dataDir";
+    String ulogDir = hdfsUri + "/dummyFolder2/ulogDir";
+    CollectionAdminResponse res = CollectionAdminRequest
+        .addReplicaToShard(coll, "shard1")
+        .setDataDir(dataDir)
+        .setUlogDir(ulogDir)
+        .setNode(cluster.getJettySolrRunner(0).getNodeName())
+        .process(cluster.getSolrClient());
+
+    ulogDir += "/tlog";
+    ZkStateReader zkStateReader = cluster.getSolrClient().getZkStateReader();
+    assertTrue(ClusterStateUtil.waitForAllActiveAndLiveReplicas(zkStateReader, 120000));
+
+    DocCollection docCollection = zkStateReader.getClusterState().getCollection(coll);
+    Replica replica = docCollection.getReplicas().iterator().next();
+    assertTrue(replica.getStr("ulogDir"), replica.getStr("ulogDir").equals(ulogDir) || replica.getStr("ulogDir").equals(ulogDir + '/'));
+    assertTrue(replica.getStr("dataDir"), replica.getStr("dataDir").equals(dataDir) || replica.getStr("dataDir").equals(dataDir + '/'));
+
+    new CollectionAdminRequest.MoveReplica(coll, replica.getName(), cluster.getJettySolrRunner(1).getNodeName())
+        .process(cluster.getSolrClient());
+    assertTrue(ClusterStateUtil.waitForAllActiveAndLiveReplicas(zkStateReader, 120000));
+    docCollection = zkStateReader.getClusterState().getCollection(coll);
+    assertEquals(1, docCollection.getSlice("shard1").getReplicas().size());
+    Replica newReplica = docCollection.getReplicas().iterator().next();
+    assertEquals(newReplica.getNodeName(), cluster.getJettySolrRunner(1).getNodeName());
+    assertTrue(newReplica.getStr("ulogDir"), newReplica.getStr("ulogDir").equals(ulogDir) || newReplica.getStr("ulogDir").equals(ulogDir + '/'));
+    assertTrue(newReplica.getStr("dataDir"), newReplica.getStr("dataDir").equals(dataDir) || newReplica.getStr("dataDir").equals(dataDir + '/'));
+
+    assertEquals(replica.getName(), newReplica.getName());
+    assertEquals(replica.getCoreName(), newReplica.getCoreName());
+    assertFalse(replica.getNodeName().equals(newReplica.getNodeName()));
+    final int numDocs = 100;
+    addDocs(coll, numDocs); // indexed but not committed
+
+    cluster.getJettySolrRunner(1).stop();
+    Thread.sleep(5000);
+    new CollectionAdminRequest.MoveReplica(coll, newReplica.getName(), cluster.getJettySolrRunner(0).getNodeName())
+        .process(cluster.getSolrClient());
+    assertTrue(ClusterStateUtil.waitForAllActiveAndLiveReplicas(zkStateReader, 120000));
+
+    // assert that the old core will be removed on startup
+    cluster.getJettySolrRunner(1).start();
+    assertTrue(ClusterStateUtil.waitForAllActiveAndLiveReplicas(zkStateReader, 120000));
+    docCollection = zkStateReader.getClusterState().getCollection(coll);
+    assertEquals(1, docCollection.getReplicas().size());
+    newReplica = docCollection.getReplicas().iterator().next();
+    assertEquals(newReplica.getNodeName(), cluster.getJettySolrRunner(0).getNodeName());
+    assertTrue(newReplica.getStr("ulogDir"), newReplica.getStr("ulogDir").equals(ulogDir) || newReplica.getStr("ulogDir").equals(ulogDir + '/'));
+    assertTrue(newReplica.getStr("dataDir"), newReplica.getStr("dataDir").equals(dataDir) || newReplica.getStr("dataDir").equals(dataDir + '/'));
+
+    assertEquals(0, cluster.getJettySolrRunner(1).getCoreContainer().getCores().size());
+
+    cluster.getSolrClient().commit(coll);
+    assertEquals(numDocs, cluster.getSolrClient().query(coll, new SolrQuery("*:*")).getResults().getNumFound());
+  }
+
+  private void addDocs(String collection, int numDocs) throws SolrServerException, IOException {
+    SolrClient solrClient = cluster.getSolrClient();
+    for (int docId = 1; docId <= numDocs; docId++) {
+      SolrInputDocument doc = new SolrInputDocument();
+      doc.addField("id", docId);
+      solrClient.add(collection, doc);
+    }
+  }
+}
@@ -27,7 +27,7 @@ import java.util.List;
 import java.util.Locale;
 import java.util.Map;
 import java.util.concurrent.TimeUnit;
+import org.apache.solr.SolrTestCaseJ4.SuppressObjectReleaseTracker;
 import org.apache.solr.SolrTestCaseJ4.SuppressSSL;
 import org.apache.solr.client.solrj.SolrClient;
 import org.apache.solr.client.solrj.SolrQuery;
@@ -52,6 +52,7 @@ import org.slf4j.Logger;
 import org.slf4j.LoggerFactory;

 @SuppressSSL(bugUrl = "https://issues.apache.org/jira/browse/SOLR-5776")
+@SuppressObjectReleaseTracker(bugUrl="Testing purposes")
 public class TestPullReplicaErrorHandling extends SolrCloudTestCase {

   private final static int REPLICATION_TIMEOUT_SECS = 10;
@@ -19,9 +19,12 @@ package org.apache.solr.cloud;
 import java.lang.invoke.MethodHandles;
 import java.util.ArrayList;
 import java.util.Collection;
-import java.util.HashMap;
+import java.util.LinkedHashMap;
+import java.util.LinkedHashSet;
 import java.util.List;
 import java.util.Map;
+import java.util.Set;

 import com.codahale.metrics.Counter;
 import org.apache.lucene.util.TestUtil;
@@ -41,7 +44,6 @@ import org.apache.solr.common.util.Utils;
 import org.apache.solr.core.CoreContainer;
 import org.apache.solr.core.SolrCore;
 import org.apache.solr.metrics.SolrMetricManager;
-import org.apache.solr.request.SolrRequestHandler;
 import org.junit.Test;
 import org.slf4j.Logger;
 import org.slf4j.LoggerFactory;
@@ -86,6 +88,25 @@ public class TestRandomRequestDistribution extends AbstractFullDistribZkTestBase

     cloudClient.getZkStateReader().forceUpdateCollection("b1x1");

+    // get direct access to the metrics counters for each core/replica we're interested in, so we can monitor them
+    final Map<String,Counter> counters = new LinkedHashMap<>();
+    for (JettySolrRunner runner : jettys) {
+      CoreContainer container = runner.getCoreContainer();
+      SolrMetricManager metricManager = container.getMetricManager();
+      for (SolrCore core : container.getCores()) {
+        if ("a1x2".equals(core.getCoreDescriptor().getCollectionName())) {
+          String registry = core.getCoreMetricManager().getRegistryName();
+          Counter cnt = metricManager.counter(null, registry, "requests", "QUERY./select");
+          // sanity check
+          assertEquals(core.getName() + " has already received some requests?",
+              0, cnt.getCount());
+          counters.put(core.getName(), cnt);
+        }
+      }
+    }
+    assertEquals("Sanity Check: we know there should be 2 replicas", 2, counters.size());
+
+    // send queries to the node that doesn't host any core/replica and see where it routes them
     ClusterState clusterState = cloudClient.getZkStateReader().getClusterState();
     DocCollection b1x1 = clusterState.getCollection("b1x1");
     Collection<Replica> replicas = b1x1.getSlice("shard1").getReplicas();
@@ -94,29 +115,30 @@ public class TestRandomRequestDistribution extends AbstractFullDistribZkTestBase
     if (!baseUrl.endsWith("/")) baseUrl += "/";
     try (HttpSolrClient client = getHttpSolrClient(baseUrl + "a1x2", 2000, 5000)) {
+
+      long expectedTotalRequests = 0;
+      Set<String> uniqueCoreNames = new LinkedHashSet<>();
+
       log.info("Making requests to " + baseUrl + "a1x2");
-      for (int i = 0; i < 10; i++) {
+      while (uniqueCoreNames.size() < counters.keySet().size() && expectedTotalRequests < 1000L) {
+        expectedTotalRequests++;
         client.query(new SolrQuery("*:*"));
       }
-    }
-
-    Map<String, Integer> shardVsCount = new HashMap<>();
-    for (JettySolrRunner runner : jettys) {
-      CoreContainer container = runner.getCoreContainer();
-      SolrMetricManager metricManager = container.getMetricManager();
-      for (SolrCore core : container.getCores()) {
-        String registry = core.getCoreMetricManager().getRegistryName();
-        Counter cnt = metricManager.counter(null, registry, "requests", "QUERY./select");
-        SolrRequestHandler select = core.getRequestHandler("");
-        // long c = (long) select.getStatistics().get("requests");
-        shardVsCount.put(core.getName(), (int) cnt.getCount());
+
+      long actualTotalRequests = 0;
+      for (Map.Entry<String,Counter> e : counters.entrySet()) {
+        final long coreCount = e.getValue().getCount();
+        actualTotalRequests += coreCount;
+        if (0 < coreCount) {
+          uniqueCoreNames.add(e.getKey());
+        }
       }
-    }
-
-    log.info("Shard count map = " + shardVsCount);
-
-    for (Map.Entry<String, Integer> entry : shardVsCount.entrySet()) {
-      assertTrue("Shard " + entry.getKey() + " received all 10 requests", entry.getValue() != 10);
+      assertEquals("Sanity Check: Num Queries So Far Doesn't Match Total????",
+          expectedTotalRequests, actualTotalRequests);
+      log.info("Total requests: " + expectedTotalRequests);
+      assertEquals("either request randomization code is broken or this test seed is really unlucky, " +
+          "gave up waiting for requests to hit every core at least once after " +
+          expectedTotalRequests + " requests",
+          uniqueCoreNames.size(), counters.size());
     }
   }
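The rewritten loop above leans entirely on the per-core request counters registered with Solr's metrics subsystem. Pulled out of the test, the lookup is just the following — the registry and metric names mirror the code above, while the helper itself is an illustrative assumption:

[source,java]
----
// Illustrative helper: fetch the '/select' request counter for one core,
// exactly as the test does (imports as in the test above).
static long selectRequestCount(SolrMetricManager metricManager, SolrCore core) {
  String registry = core.getCoreMetricManager().getRegistryName();
  Counter selects = metricManager.counter(null, registry, "requests", "QUERY./select");
  return selects.getCount();
}
----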
File diff suppressed because it is too large
@@ -160,6 +160,19 @@ public class TestUseDocValuesAsStored extends AbstractBadConfigTestBase {
         + "]");
   }

+  @Test
+  public void testDuplicateMultiValued() throws Exception {
+    doTest("strTF", dvStringFieldName(3,true,false), "str", "X", "X", "Y");
+    doTest("strTT", dvStringFieldName(3,true,true), "str", "X", "X", "Y");
+    doTest("strFF", dvStringFieldName(3,false,false), "str", "X", "X", "Y");
+    doTest("int", "test_is_dvo", "int", "42", "42", "-666");
+    doTest("float", "test_fs_dvo", "float", "4.2", "4.2", "-66.666");
+    doTest("long", "test_ls_dvo", "long", "420", "420", "-6666666");
+    doTest("double", "test_ds_dvo", "double", "0.0042", "0.0042", "-6.6666E-5");
+    doTest("date", "test_dts_dvo", "date", "2016-07-04T03:02:01Z", "2016-07-04T03:02:01Z", "1999-12-31T23:59:59Z");
+    doTest("enum", "enums_dvo", "str", SEVERITY[0], SEVERITY[0], SEVERITY[1]);
+  }
+
   @Test
   public void testRandomSingleAndMultiValued() throws Exception {
     for (int c = 0 ; c < 10 * RANDOM_MULTIPLIER ; ++c) {
@@ -318,9 +331,14 @@ public class TestUseDocValuesAsStored extends AbstractBadConfigTestBase {
           xpaths[i] = "//arr[@name='" + field + "']/" + type + "[.='" + value[i] + "']";
         }

-        // Docvalues are sets, but stored values are ordered multisets, so cardinality depends on the value source
-        xpaths[value.length] = "*[count(//arr[@name='" + field + "']/" + type + ") = "
-            + (isStoredField(field) ? value.length : valueSet.size()) + "]";
+        // Trie/String based Docvalues are sets, but stored values & Point DVs are ordered multisets,
+        // so cardinality depends on the value source
+        // See SOLR-10924...
+        final int expectedCardinality =
+            (isStoredField(field) || (Boolean.getBoolean(NUMERIC_POINTS_SYSPROP)
+                && ! (field.startsWith("enum") || field.startsWith("test_s"))))
+            ? value.length : valueSet.size();
+        xpaths[value.length] = "*[count(//arr[@name='"+field+"']/"+type+")="+expectedCardinality+"]";
         assertU(adoc(fieldAndValues));

       } else {
@@ -1,42 +0,0 @@
-<?xml version="1.0" encoding="UTF-8" ?>
-<!--
- Licensed to the Apache Software Foundation (ASF) under one or more
- contributor license agreements.  See the NOTICE file distributed with
- this work for additional information regarding copyright ownership.
- The ASF licenses this file to You under the Apache License, Version 2.0
- (the "License"); you may not use this file except in compliance with
- the License.  You may obtain a copy of the License at
-
-     http://www.apache.org/licenses/LICENSE-2.0
-
- Unless required by applicable law or agreed to in writing, software
- distributed under the License is distributed on an "AS IS" BASIS,
- WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
- See the License for the specific language governing permissions and
- limitations under the License.
--->
-
-<!-- If this file is found in the config directory, it will only be
-     loaded once at startup.  If it is found in Solr's data
-     directory, it will be re-loaded every commit.
-
-   See http://wiki.apache.org/solr/QueryElevationComponent for more info
-
--->
-<elevate>
- <!-- Query elevation examples
-  <query text="foo bar">
-    <doc id="1" />
-    <doc id="2" />
-    <doc id="3" />
-  </query>
-
-for use with techproducts example
-
-  <query text="ipod">
-    <doc id="MA147LL/A" />  put the actual ipod at the top
-    <doc id="IW-02" exclude="true" />  exclude this cable
-  </query>
--->
-</elevate>
@@ -1004,7 +1004,6 @@
 <searchComponent name="elevator" class="solr.QueryElevationComponent" >
   <!-- pick a fieldType to analyze queries -->
   <str name="queryFieldType">string</str>
-  <str name="config-file">elevate.xml</str>
 </searchComponent>

 <!-- A request handler for demonstrating the elevator component -->
@@ -1,6 +1,7 @@
 = About This Guide
 :page-shortname: about-this-guide
 :page-permalink: about-this-guide.html
+:page-toc: false
 // Licensed to the Apache Software Foundation (ASF) under one
 // or more contributor license agreements.  See the NOTICE file
 // distributed with this work for additional information
@@ -26,38 +27,13 @@ Designed to provide high-level documentation, this guide is intended to be more

 The material as presented assumes that you are familiar with some basic search concepts and that you can read XML. It does not assume that you are a Java programmer, although knowledge of Java is helpful when working directly with Lucene or when developing custom extensions to a Lucene/Solr installation.

-[[AboutThisGuide-SpecialInlineNotes]]
-== Special Inline Notes
-
-Special notes are included throughout these pages. There are several types of notes:
-
-Information blocks::
-+
-NOTE: These provide additional information that's useful for you to know.
-
-Important::
-+
-IMPORTANT: These provide information that is critical for you to know.
-
-Tip::
-+
-TIP: These provide helpful tips.
-
-Caution::
-+
-CAUTION: These provide details on scenarios or configurations you should be careful with.
-
-Warning::
-+
-WARNING: These are meant to warn you from a possibly dangerous change or action.
-
-[[AboutThisGuide-HostsandPortExamples]]
 == Hosts and Port Examples

-The default port when running Solr is 8983. The samples, URLs and screenshots in this guide may show different ports, because the port number that Solr uses is configurable. If you have not customized your installation of Solr, please make sure that you use port 8983 when following the examples, or configure your own installation to use the port numbers shown in the examples. For information about configuring port numbers, see the section <<managing-solr.adoc#managing-solr,Managing Solr>>.
+The default port when running Solr is 8983. The samples, URLs and screenshots in this guide may show different ports, because the port number that Solr uses is configurable.

-Similarly, URL examples use 'localhost' throughout; if you are accessing Solr from a location remote to the server hosting Solr, replace 'localhost' with the proper domain or IP where Solr is running.
+If you have not customized your installation of Solr, please make sure that you use port 8983 when following the examples, or configure your own installation to use the port numbers shown in the examples. For information about configuring port numbers, see the section <<managing-solr.adoc#managing-solr,Managing Solr>>.
+
+Similarly, URL examples use `localhost` throughout; if you are accessing Solr from a location remote to the server hosting Solr, replace `localhost` with the proper domain or IP where Solr is running.

 For example, we might provide a sample query like:
@@ -67,7 +43,32 @@ There are several items in this URL you might need to change locally. First, if

 `\http://www.example.com/solr/mycollection/select?q=brown+cow`

-[[AboutThisGuide-Paths]]
 == Paths

-Path information is given relative to `solr.home`, which is the location under the main Solr installation where Solr's collections and their `conf` and `data` directories are stored. When running the various examples mentioned through out this tutorial (i.e., `bin/solr -e techproducts`) the `solr.home` will be a sub-directory of `example/` created for you automatically.
+Path information is given relative to `solr.home`, which is the location under the main Solr installation where Solr's collections and their `conf` and `data` directories are stored.
+
+When running the various examples mentioned throughout this tutorial (i.e., `bin/solr -e techproducts`), `solr.home` will be a sub-directory of `example/` created for you automatically.
+
+== Special Inline Notes
+
+Special notes are included throughout these pages. There are several types of notes:
+
+=== Information blocks
+
+NOTE: These provide additional information that's useful for you to know.
+
+=== Important
+
+IMPORTANT: These provide information that is critical for you to know.
+
+=== Tip
+
+TIP: These provide helpful tips.
+
+=== Caution
+
+CAUTION: These provide details on scenarios or configurations you should be careful with.
+
+=== Warning
+
+WARNING: These are meant to warn you away from a possibly dangerous change or action.
@@ -37,7 +37,6 @@ A `TypeTokenFilterFactory` is available that creates a `TypeTokenFilter` that fi

 For a complete list of the available TokenFilters, see the section <<tokenizers.adoc#tokenizers,Tokenizers>>.

-[[AboutTokenizers-WhenTouseaCharFiltervs.aTokenFilter]]
 == When To use a CharFilter vs. a TokenFilter

 There are several pairs of CharFilters and TokenFilters that have related (ie: `MappingCharFilter` and `ASCIIFoldingFilter`) or nearly identical (ie: `PatternReplaceCharFilterFactory` and `PatternReplaceFilterFactory`) functionality and it may not always be obvious which is the best choice.
@@ -30,12 +30,10 @@ In addition to requiring that Solr by running in <<solrcloud.adoc#solrcloud,Solr
 Before enabling this feature, users should carefully consider the issues discussed in the <<Securing Runtime Libraries>> section below.
 ====

-[[AddingCustomPluginsinSolrCloudMode-UploadingJarFiles]]
 == Uploading Jar Files

 The first step is to use the <<blob-store-api.adoc#blob-store-api,Blob Store API>> to upload your jar files. This will put your jars in the `.system` collection and distribute them across your SolrCloud nodes. These jars are added to a separate classloader and only accessible to components that are configured with the property `runtimeLib=true`. These components are loaded lazily because the `.system` collection may not be loaded when a particular core is loaded.

-[[AddingCustomPluginsinSolrCloudMode-ConfigAPICommandstouseJarsasRuntimeLibraries]]
 == Config API Commands to use Jars as Runtime Libraries

 The runtime library feature uses a special set of commands for the <<config-api.adoc#config-api,Config API>> to add, update, or remove jar files currently available in the blob store to the list of runtime libraries.
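As a sketch of what such a command looks like on the wire, the following uses only JDK HTTP classes to POST an `add-runtimelib` command. The blob name and version are placeholders, and the `postConfig` helper is an assumption for illustration, not a Solr API:

[source,java]
----
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.StandardCharsets;

public class RuntimeLibExample {
  // Illustrative helper: POST a JSON command body to a Solr Config API endpoint.
  static int postConfig(String endpoint, String body) throws Exception {
    HttpURLConnection conn = (HttpURLConnection) new URL(endpoint).openConnection();
    conn.setRequestMethod("POST");
    conn.setRequestProperty("Content-Type", "application/json");
    conn.setDoOutput(true);
    try (OutputStream os = conn.getOutputStream()) {
      os.write(body.getBytes(StandardCharsets.UTF_8));
    }
    return conn.getResponseCode();
  }

  public static void main(String[] args) throws Exception {
    String body = "{\"add-runtimelib\": {\"name\":\"myblob\", \"version\":1}}";
    int status = postConfig("http://localhost:8983/solr/techproducts/config", body);
    System.out.println("Config API returned HTTP " + status);
  }
}
----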
@@ -74,14 +72,12 @@ curl http://localhost:8983/solr/techproducts/config -H 'Content-type:application
 }'
 ----

-[[AddingCustomPluginsinSolrCloudMode-SecuringRuntimeLibraries]]
 == Securing Runtime Libraries

 A drawback of this feature is that it could be used to load malicious executable code into the system. However, it is possible to restrict the system to load only trusted jars using http://en.wikipedia.org/wiki/Public_key_infrastructure[PKI] to verify that the executables loaded into the system are trustworthy.

 The following steps will allow you to enable security for this feature. The instructions assume you have started all your Solr nodes with `-Denable.runtime.lib=true`.

-[[Step1_GenerateanRSAPrivateKey]]
 === Step 1: Generate an RSA Private Key

 The first step is to generate an RSA private key. The example below uses a 512-bit key, but you should use the strength appropriate to your needs.
@@ -91,7 +87,6 @@ The first step is to generate an RSA private key. The example below uses a 512-b
 $ openssl genrsa -out priv_key.pem 512
 ----

-[[Step2_OutputthePublicKey]]
 === Step 2: Output the Public Key

 The public portion of the key should be output in DER format so Java can read it.
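The openssl command for this appears in the next hunk; if you would rather stay inside the JVM, a hedged, JDK-only sketch covering Steps 1 and 2 together is below. The file name matches the openssl examples:

[source,java]
----
import java.nio.file.Files;
import java.nio.file.Paths;
import java.security.KeyPair;
import java.security.KeyPairGenerator;

public class GenerateKeyExample {
  public static void main(String[] args) throws Exception {
    KeyPairGenerator gen = KeyPairGenerator.getInstance("RSA");
    gen.initialize(512); // matches the 512-bit example; use a stronger key in practice
    KeyPair pair = gen.generateKeyPair();
    // getEncoded() on an RSA public key yields X.509 SubjectPublicKeyInfo in DER,
    // the same format the openssl command in Step 2 produces.
    Files.write(Paths.get("pub_key.der"), pair.getPublic().getEncoded());
  }
}
----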
@@ -101,7 +96,6 @@ The public portion of the key should be output in DER format so Java can read it
 $ openssl rsa -in priv_key.pem -pubout -outform DER -out pub_key.der
 ----

-[[Step3_LoadtheKeytoZooKeeper]]
 === Step 3: Load the Key to ZooKeeper

 The `.der` files that are output from Step 2 should then be loaded to ZooKeeper under a node `/keys/exe` so they are available throughout every node. You can load any number of public keys to that node and all are valid. If a key is removed from the directory, the signatures of that key will cease to be valid. So, before removing a key, make sure to update your runtime library configurations with valid signatures with the `update-runtimelib` command.
@@ -130,7 +124,6 @@ $ .bin/zkCli.sh -server localhost:9983

 After this, any attempt to load a jar will fail. All your jars must be signed with one of your private keys for Solr to trust it. The process to sign your jars and use the signature is outlined in Steps 4-6.
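The key upload can also be done programmatically. A hedged sketch with SolrJ's `SolrZkClient`, assuming the embedded ZooKeeper at `localhost:9983` and the `/keys/exe` node described in Step 3 (error handling omitted):

[source,java]
----
import java.nio.file.Files;
import java.nio.file.Paths;
import org.apache.solr.common.cloud.SolrZkClient;

public class UploadKeyExample {
  public static void main(String[] args) throws Exception {
    byte[] der = Files.readAllBytes(Paths.get("pub_key.der"));
    SolrZkClient zkClient = new SolrZkClient("localhost:9983", 10000);
    try {
      // Creates intermediate nodes as needed and stores the key bytes.
      zkClient.makePath("/keys/exe/pub_key.der", der, true);
    } finally {
      zkClient.close();
    }
  }
}
----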
-[[Step4_SignthejarFile]]
 === Step 4: Sign the jar File

 Next you need to sign the sha1 digest of your jar file and get the base64 string.
@@ -142,7 +135,6 @@ $ openssl dgst -sha1 -sign priv_key.pem myjar.jar | openssl enc -base64

 The output of this step will be a string that you will need in Step 6 below, when you add the jar to your classpath.

-[[Step5_LoadthejartotheBlobStore]]
 === Step 5: Load the jar to the Blob Store

 Load your jar to the Blob store, using the <<blob-store-api.adoc#blob-store-api,Blob Store API>>. This step does not require a signature; you will need the signature in Step 6 to add it to your classpath.
@@ -155,7 +147,6 @@ http://localhost:8983/solr/.system/blob/{blobname}

 The blob name that you give the jar file in this step will be used as the name in the next step.

-[[Step6_AddthejartotheClasspath]]
 === Step 6: Add the jar to the Classpath

 Finally, add the jar to the classpath using the Config API as detailed above. In this step, you will need to provide the signature of the jar that you got in Step 4.
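Tying the steps together: the classpath update is the same Config API call sketched earlier, now carrying the signature. The `update-runtimelib` command and `sig` member are the feature's documented ones; the `postConfig` helper and placeholder values are assumptions carried over from the earlier sketch:

[source,java]
----
// Reusing the hypothetical postConfig(...) helper from the earlier sketch.
String sig = "<base64 signature from Step 4>";
String body = "{\"update-runtimelib\": {\"name\":\"myblob\", \"version\":1, \"sig\":\"" + sig + "\"}}";
postConfig("http://localhost:8983/solr/techproducts/config", body);
----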
@@ -60,7 +60,6 @@ In this case, no Analyzer class was specified on the `<analyzer>` element. Rathe
 The output of an Analyzer affects the _terms_ indexed in a given field (and the terms used when parsing queries against those fields) but it has no impact on the _stored_ value for the fields. For example: an analyzer might split "Brown Cow" into two indexed terms "brown" and "cow", but the stored value will still be a single String: "Brown Cow"
 ====

-[[Analyzers-AnalysisPhases]]
 == Analysis Phases

 Analysis takes place in two contexts. At index time, when a field is being created, the token stream that results from analysis is added to an index and defines the set of terms (including positions, sizes, and so on) for the field. At query time, the values being searched for are analyzed and the terms that result are matched against those that are stored in the field's index.
@@ -89,7 +88,6 @@ In this theoretical example, at index time the text is tokenized, the tokens are

 At query time, the only normalization that happens is to convert the query terms to lowercase. The filtering and mapping steps that occur at index time are not applied to the query terms. Queries must then, in this example, be very precise, using only the normalized terms that were stored at index time.

-[[Analyzers-AnalysisforMulti-TermExpansion]]
 === Analysis for Multi-Term Expansion

 In some types of queries (ie: Prefix, Wildcard, Regex, etc...) the input provided by the user is not natural language intended for Analysis. Things like Synonyms or Stop word filtering do not work in a logical way in these types of Queries.
@@ -27,7 +27,6 @@ All authentication and authorization plugins can work with Solr whether they are

 The following section describes how to enable plugins with `security.json` and place them in the proper locations for your mode of operation.

-[[AuthenticationandAuthorizationPlugins-EnablePluginswithsecurity.json]]
 == Enable Plugins with security.json

 All of the information required to initialize either type of security plugin is stored in a `security.json` file. This file contains 2 sections, one each for authentication and authorization.
@@ -45,7 +44,7 @@ All of the information required to initialize either type of security plugin is
 }
 ----

-The `/security.json` file needs to be in the proper location before a Solr instance comes up so Solr starts with the security plugin enabled. See the section <<AuthenticationandAuthorizationPlugins-Usingsecurity.jsonwithSolr,Using security.json with Solr>> below for information on how to do this.
+The `/security.json` file needs to be in the proper location before a Solr instance comes up so Solr starts with the security plugin enabled. See the section <<Using security.json with Solr>> below for information on how to do this.

 Depending on the plugin(s) in use, other information will be stored in `security.json` such as user information or rules to create roles and permissions. This information is added through the APIs for each plugin provided by Solr, or, in the case of a custom plugin, the approach designed by you.
@@ -66,10 +65,8 @@ Here is a more detailed `security.json` example. In this, the Basic authenticati
 }}
 ----

-[[AuthenticationandAuthorizationPlugins-Usingsecurity.jsonwithSolr]]
 == Using security.json with Solr

-[[AuthenticationandAuthorizationPlugins-InSolrCloudmode]]
 === In SolrCloud Mode

 While configuring Solr to use an authentication or authorization plugin, you will need to upload a `security.json` file to ZooKeeper. The following command writes the file as it uploads it - you could also upload a file that you have already created locally.
@@ -91,7 +88,6 @@ Depending on the authentication and authorization plugin that you use, you may h

 Once `security.json` has been uploaded to ZooKeeper, you should use the appropriate APIs for the plugins you're using to update it. You can edit it manually, but you must take care to remove any version data so it will be properly updated across all ZooKeeper nodes. The version data is found at the end of the `security.json` file, and will appear as the letter "v" followed by a number, such as `{"v":138}`.

-[[AuthenticationandAuthorizationPlugins-InStandaloneMode]]
 === In Standalone Mode

 When running Solr in standalone mode, you need to create the `security.json` file and put it in the `$SOLR_HOME` directory for your installation (this is the same place you have located `solr.xml` and is usually `server/solr`).
@@ -100,8 +96,7 @@ If you are using <<legacy-scaling-and-distribution.adoc#legacy-scaling-and-distr

 You can use the authentication and authorization APIs, but if you are using the legacy scaling model, you will need to make the same API requests on each node separately. You can also edit `security.json` by hand if you prefer.

-[[AuthenticationandAuthorizationPlugins-Authentication]]
-== Authentication
+== Authentication Plugins

 Authentication plugins help in securing the endpoints of Solr by authenticating incoming requests. A custom plugin can be implemented by extending the AuthenticationPlugin class.
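A hedged skeleton of such a custom plugin: the overridden methods are those of the real `AuthenticationPlugin` base class, while the class name and pass-through logic are illustrative assumptions.

[source,java]
----
import java.io.IOException;
import java.util.Map;
import javax.servlet.FilterChain;
import javax.servlet.ServletRequest;
import javax.servlet.ServletResponse;
import org.apache.solr.security.AuthenticationPlugin;

public class MyAuthPlugin extends AuthenticationPlugin {
  @Override
  public void init(Map<String, Object> pluginConfig) {
    // read settings from the authentication block of security.json
  }

  @Override
  public boolean doAuthenticate(ServletRequest request, ServletResponse response,
      FilterChain filterChain) throws Exception {
    // Inspect credentials on the request here; on success, continue the chain.
    filterChain.doFilter(request, response);
    return true; // true = request was authenticated and forwarded
  }

  @Override
  public void close() throws IOException {
  }
}
----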
|
@ -110,7 +105,6 @@ An authentication plugin consists of two parts:
|
||||||
. Server-side component, which intercepts and authenticates incoming requests to Solr using a mechanism defined in the plugin, such as Kerberos, Basic Auth or others.
|
. Server-side component, which intercepts and authenticates incoming requests to Solr using a mechanism defined in the plugin, such as Kerberos, Basic Auth or others.
|
||||||
. Client-side component, i.e., an extension of `HttpClientConfigurer`, which enables a SolrJ client to make requests to a secure Solr instance using the authentication mechanism which the server understands.
|
. Client-side component, i.e., an extension of `HttpClientConfigurer`, which enables a SolrJ client to make requests to a secure Solr instance using the authentication mechanism which the server understands.
|
||||||
|
|
||||||
[[AuthenticationandAuthorizationPlugins-EnablingaPlugin]]
|
|
||||||
=== Enabling a Plugin
|
=== Enabling a Plugin
|
||||||
|
|
||||||
* Specify the authentication plugin in `/security.json` as in this example:
|
* Specify the authentication plugin in `/security.json` as in this example:
|
||||||
|
@@ -126,7 +120,6 @@ An authentication plugin consists of two parts:
 * All of the content in the authentication block of `security.json` would be passed on as a map to the plugin during initialization.
 * An authentication plugin can also be used with a standalone Solr instance by passing in `-DauthenticationPlugin=<plugin class name>` during startup.

-[[AuthenticationandAuthorizationPlugins-AvailableAuthenticationPlugins]]
 === Available Authentication Plugins

 Solr has the following implementations of authentication plugins:
@@ -135,12 +128,10 @@ Solr has the following implementations of authentication plugins:
 * <<basic-authentication-plugin.adoc#basic-authentication-plugin,Basic Authentication Plugin>>
 * <<hadoop-authentication-plugin.adoc#hadoop-authentication-plugin,Hadoop Authentication Plugin>>

-[[AuthenticationandAuthorizationPlugins-Authorization]]
 == Authorization

 An authorization plugin can be written for Solr by extending the {solr-javadocs}/solr-core/org/apache/solr/security/AuthorizationPlugin.html[AuthorizationPlugin] interface.

-[[AuthenticationandAuthorizationPlugins-LoadingaCustomPlugin]]
 === Loading a Custom Plugin

 * Make sure that the plugin implementation is in the classpath.
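Stepping back to the `AuthorizationPlugin` interface mentioned under Authorization above, a hedged skeleton — the method names come from the interface, while the class name and permissive logic are illustrative assumptions:

[source,java]
----
import java.io.IOException;
import java.util.Map;
import org.apache.solr.security.AuthorizationContext;
import org.apache.solr.security.AuthorizationPlugin;
import org.apache.solr.security.AuthorizationResponse;

public class MyAuthorizationPlugin implements AuthorizationPlugin {
  @Override
  public void init(Map<String, Object> initInfo) {
    // read settings from the authorization block of security.json
  }

  @Override
  public AuthorizationResponse authorize(AuthorizationContext context) {
    // Inspect context.getUserPrincipal(), context.getResource(), etc.
    return AuthorizationResponse.OK; // permit everything in this sketch
  }

  @Override
  public void close() throws IOException {
  }
}
----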
@@ -162,21 +153,16 @@ All of the content in the `authorization` block of `security.json` would be pass
 The authorization plugin is only supported in SolrCloud mode. Also, reloading the plugin isn't yet supported and requires a restart of the Solr installation (meaning, the JVM should be restarted, not simply a core reload).
 ====

-[[AuthenticationandAuthorizationPlugins-AvailableAuthorizationPlugins]]
 === Available Authorization Plugins

 Solr has one implementation of an authorization plugin:

 * <<rule-based-authorization-plugin.adoc#rule-based-authorization-plugin,Rule-Based Authorization Plugin>>

-[[AuthenticationandAuthorizationPlugins-PKISecuringInter-NodeRequests]]
-
-[[AuthenticationandAuthorizationPlugins-PKI]]
 == Securing Inter-Node Requests

 There are a lot of requests that originate from the Solr nodes themselves: for example, requests from the overseer to nodes, recovery threads, etc. Each Authentication plugin declares whether it is capable of securing inter-node requests or not. If not, Solr will fall back to using a special internode authentication mechanism where each Solr node is a super user and is fully trusted by other Solr nodes, as described below.

-[[AuthenticationandAuthorizationPlugins-PKIAuthenticationPlugin]]
 === PKIAuthenticationPlugin

 The PKIAuthenticationPlugin is used when there is any request going on between two Solr nodes, and the configured Authentication plugin does not wish to handle inter-node security.
@@ -22,10 +22,9 @@ Solr can support Basic authentication for users with the use of the BasicAuthPlu

 An authorization plugin is also available to configure Solr with permissions to perform various activities in the system. The authorization plugin is described in the section <<rule-based-authorization-plugin.adoc#rule-based-authorization-plugin,Rule-Based Authorization Plugin>>.

-[[BasicAuthenticationPlugin-EnableBasicAuthentication]]
 == Enable Basic Authentication

-To use Basic authentication, you must first create a `security.json` file. This file and where to put it is described in detail in the section <<authentication-and-authorization-plugins.adoc#AuthenticationandAuthorizationPlugins-EnablePluginswithsecurity.json,Enable Plugins with security.json>>.
+To use Basic authentication, you must first create a `security.json` file. This file and where to put it is described in detail in the section <<authentication-and-authorization-plugins.adoc#enable-plugins-with-security-json,Enable Plugins with security.json>>.

 For Basic authentication, the `security.json` file must have an `authentication` part which defines the class being used for authentication. Usernames and passwords (as a sha256(password+salt) hash) could be added when the file is created, or can be added later with the Basic authentication API, described below.
@@ -68,7 +67,6 @@ If you are using SolrCloud, you must upload `security.json` to ZooKeeper. You ca
 bin/solr zk cp file:path_to_local_security.json zk:/security.json -z localhost:9983
 ----

-[[BasicAuthenticationPlugin-Caveats]]
 === Caveats

 There are a few things to keep in mind when using the Basic authentication plugin.
@ -77,19 +75,16 @@ There are a few things to keep in mind when using the Basic authentication plugi
|
||||||
* A user who has access to write permissions to `security.json` will be able to modify all the permissions and how users have been assigned permissions. Special care should be taken to only grant access to editing security to appropriate users.
|
* A user who has access to write permissions to `security.json` will be able to modify all the permissions and how users have been assigned permissions. Special care should be taken to only grant access to editing security to appropriate users.
|
||||||
* Your network should, of course, be secure. Even with Basic authentication enabled, you should not unnecessarily expose Solr to the outside world.
|
* Your network should, of course, be secure. Even with Basic authentication enabled, you should not unnecessarily expose Solr to the outside world.
|
||||||
|
|
||||||

== Editing Authentication Plugin Configuration

An Authentication API allows modifying user IDs and passwords. The API provides an endpoint with specific commands to set user details or delete a user.

=== API Entry Point

`admin/authentication`

This endpoint is not collection-specific, so users are created for the entire Solr cluster. If users need to be restricted to a specific collection, that can be done with the authorization rules.

=== Add a User or Edit a Password

The `set-user` command allows you to add users and change their passwords. For example, the following defines two users and their passwords:

@@ -101,7 +96,6 @@ curl --user solr:SolrRocks http://localhost:8983/solr/admin/authentication -H 'C

"harry":"HarrysSecret"}}'
----
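
For reference, the request shown truncated above takes this shape (a sketch; `solr:SolrRocks` and the user names are the example values used throughout this page):

[source,bash]
----
curl --user solr:SolrRocks http://localhost:8983/solr/admin/authentication \
  -H 'Content-type:application/json' \
  -d '{"set-user": {"tom":"TomIsCool", "harry":"HarrysSecret"}}'
----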

=== Delete a User

The `delete-user` command allows you to remove a user; the user's password does not need to be sent. In the following example, we've asked that user IDs 'tom' and 'harry' be removed from the system.

@@ -112,7 +106,6 @@ curl --user solr:SolrRocks http://localhost:8983/solr/admin/authentication -H 'C

"delete-user": ["tom","harry"]}'
----
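
As with `set-user`, the full request is of this form (a sketch using the same example credentials):

[source,bash]
----
curl --user solr:SolrRocks http://localhost:8983/solr/admin/authentication \
  -H 'Content-type:application/json' \
  -d '{"delete-user": ["tom","harry"]}'
----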

=== Set a Property

Set arbitrary properties for the authentication plugin. The only supported property is `blockUnknown`.

@@ -123,7 +116,6 @@ curl --user solr:SolrRocks http://localhost:8983/solr/admin/authentication -H 'C

"set-property": {"blockUnknown":false}}'
----
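
The full request takes this shape (a sketch; setting `blockUnknown` to `false` lets unauthenticated requests through):

[source,bash]
----
curl --user solr:SolrRocks http://localhost:8983/solr/admin/authentication \
  -H 'Content-type:application/json' \
  -d '{"set-property": {"blockUnknown":false}}'
----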

=== Using BasicAuth with SolrJ

In SolrJ, the basic authentication credentials need to be set for each request as in this example:

@@ -144,7 +136,6 @@ req.setBasicAuthCredentials(userName, password);

QueryResponse rsp = req.process(solrClient);
----
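
Put together, a minimal sketch looks like the following (the base URL and query are illustrative; `setBasicAuthCredentials` is available on any `SolrRequest`):

[source,java]
----
import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.impl.HttpSolrClient;
import org.apache.solr.client.solrj.request.QueryRequest;
import org.apache.solr.client.solrj.response.QueryResponse;

// Build a client pointing at a core or collection (URL is illustrative).
HttpSolrClient solrClient =
    new HttpSolrClient.Builder("http://localhost:8983/solr/techproducts").build();

// Wrap the query in a request object so credentials can be attached per request.
QueryRequest req = new QueryRequest(new SolrQuery("*:*"));
req.setBasicAuthCredentials("solr", "SolrRocks");
QueryResponse rsp = req.process(solrClient);
----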

=== Using Command Line scripts with BasicAuth

Add the following line to the `solr.in.sh` or `solr.in.cmd` file. This example tells the `bin/solr` command line to use "basic" as the type of authentication, and to pass credentials with the username "solr" and password "SolrRocks":
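
The settings are along these lines (a sketch, assuming the `SOLR_AUTH_TYPE` and `SOLR_AUTHENTICATION_OPTS` variables supported by the `bin/solr` scripts; check the comments in your own `solr.in.sh` for the exact form):

[source,bash]
----
SOLR_AUTH_TYPE="basic"
SOLR_AUTHENTICATION_OPTS="-Dbasicauth=solr:SolrRocks"
----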

@@ -28,7 +28,6 @@ When using the blob store, note that the API does not delete or overwrite a prev

The blob store API is implemented as a requestHandler. A special collection named `.system` is used to store the blobs. This collection can be created in advance, but if it does not exist it will be created automatically.

== About the .system Collection

Before uploading blobs to the blob store, a special collection must be created and it must be named `.system`. Solr will automatically create this collection if it does not already exist, but you can also create it manually if you choose.

@@ -46,7 +45,6 @@ curl http://localhost:8983/solr/admin/collections?action=CREATE&name=.system&rep

IMPORTANT: The `bin/solr` script cannot be used to create the `.system` collection.

== Upload Files to Blob Store

After the `.system` collection has been created, files can be uploaded to the blob store with a request similar to the following:
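
A typical upload posts the raw bytes to a blob name under the `.system` collection (a sketch; the jar name and blob name are illustrative):

[source,bash]
----
curl -X POST -H 'Content-Type: application/octet-stream' \
  --data-binary @test1.jar \
  http://localhost:8983/solr/.system/blob/test1
----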

@@ -132,7 +130,6 @@ For the latest version of a blob, the \{version} can be omitted,

curl http://localhost:8983/solr/.system/blob/{blobname}?wt=filestream > {outputfilename}
----

== Use a Blob in a Handler or Component

To use the blob as the class for a request handler or search component, you create a request handler in `solrconfig.xml` as usual. You will need to define the following parameters:
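
As a rough sketch, the registration marks the component as coming from a runtime library and pins a blob version (the handler name and class are illustrative):

[source,xml]
----
<requestHandler name="/myhandler" class="my.package.MyRequestHandler" runtimeLib="true" version="1"/>
----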
@@ -42,7 +42,7 @@ This example shows how you could add this search components to `solrconfig.xml`

This component can be added into any search request handler, and it works with distributed search in SolrCloud mode.
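
For instance, registering the component might look like this (a sketch; the component name is illustrative, and the class name should be checked against your Solr version):

[source,xml]
----
<searchComponent name="bjqFacetComponent" class="org.apache.solr.search.join.BlockJoinFacetComponent"/>
----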

Documents should be added in children-parent blocks as described in <<uploading-data-with-index-handlers.adoc#nested-child-documents,indexing nested child documents>>. Examples:

.Sample document
[source,xml]
@@ -95,7 +95,7 @@ Documents should be added in children-parent blocks as described in <<uploading-

</add>
----

Queries are constructed the same way as for a <<other-parsers.adoc#block-join-query-parsers,Parent Block Join query>>. For example:

[source,text]
----
@@ -22,7 +22,6 @@ CharFilter is a component that pre-processes input characters.

CharFilters can be chained like Token Filters and placed in front of a Tokenizer. CharFilters can add, change, or remove characters while preserving the original character offsets to support features like highlighting.

== solr.MappingCharFilterFactory

This filter creates `org.apache.lucene.analysis.MappingCharFilter`, which can be used for changing one string to another (for example, for normalizing `é` to `e`).
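
A typical configuration references a mapping file in the analyzer chain (a sketch; the mapping file name is illustrative):

[source,xml]
----
<analyzer>
  <charFilter class="solr.MappingCharFilterFactory" mapping="mapping-FoldToASCII.txt"/>
  <tokenizer class="solr.StandardTokenizerFactory"/>
</analyzer>
----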

@@ -65,7 +64,6 @@ Mapping file syntax:

|===
** A backslash followed by any other character is interpreted as if the character were present without the backslash.

== solr.HTMLStripCharFilterFactory

This filter creates `org.apache.solr.analysis.HTMLStripCharFilter`. This CharFilter strips HTML from the input stream and passes the result to another CharFilter or a Tokenizer.

@@ -114,7 +112,6 @@ Example:

</analyzer>
----

== solr.ICUNormalizer2CharFilterFactory

This filter performs pre-tokenization Unicode normalization using http://site.icu-project.org[ICU4J].

@@ -138,7 +135,6 @@ Example:

</analyzer>
----

== solr.PatternReplaceCharFilterFactory

This filter uses http://www.regular-expressions.info/reference.html[regular expressions] to replace or change character patterns.
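
A minimal configuration supplies `pattern` and `replacement` attributes (a sketch; the regex shown is illustrative):

[source,xml]
----
<analyzer>
  <charFilter class="solr.PatternReplaceCharFilterFactory" pattern="([nN][oO]\.)\s*(\d+)" replacement="$1$2"/>
  <tokenizer class="solr.StandardTokenizerFactory"/>
</analyzer>
----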
@@ -24,10 +24,9 @@ The Collapsing query parser groups documents (collapsing the result set) accordi

[IMPORTANT]
====
In order to use these features with SolrCloud, the documents must be located on the same shard. To ensure document co-location, you can define the `router.name` parameter as `compositeId` when creating the collection. For more information on this option, see the section <<shards-and-indexing-data-in-solrcloud.adoc#document-routing,Document Routing>>.
====

== Collapsing Query Parser

The `CollapsingQParser` is really a _post filter_ that provides more performant field collapsing than Solr's standard approach when the number of distinct groups in the result set is high. This parser collapses the result set to a single document per group before it forwards the result set to the rest of the search components, so all downstream components (faceting, highlighting, etc.) will work with the collapsed result set.
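
In its simplest form, collapsing is requested with a filter query (the field name is a placeholder):

[source,text]
----
fq={!collapse field=group_field}
----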

@@ -121,7 +120,6 @@ fq={!collapse field=group_field hint=top_fc}

The CollapsingQParserPlugin fully supports the QueryElevationComponent.

== Expand Component

The ExpandComponent can be used to expand the groups that were collapsed by the http://heliosearch.org/the-collapsingqparserplugin-solrs-new-high-performance-field-collapsing-postfilter/[CollapsingQParserPlugin].
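
Expansion is enabled per request with the `expand` parameter alongside a collapse filter (a sketch; the field name is again a placeholder):

[source,text]
----
q=*:*&fq={!collapse field=group_field}&expand=true
----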
@@ -24,7 +24,7 @@ The Collections API is used to create, remove, or reload collections.

In the context of SolrCloud you can use it to create collections with a specific number of shards and replicas, move replicas or shards, and create or delete collection aliases.

[[create]]
== CREATE: Create a Collection

`/admin/collections?action=CREATE&name=_name_`

@@ -45,7 +45,7 @@ The `compositeId` router hashes the value in the uniqueKey field and looks up th

+
When using the `implicit` router, the `shards` parameter is required. When using the `compositeId` router, the `numShards` parameter is required.
+
For more information, see also the section <<shards-and-indexing-data-in-solrcloud.adoc#document-routing,Document Routing>>.

`numShards`::
The number of shards to be created as part of the collection. This is a required parameter when the `router.name` is `compositeId`.

@@ -68,7 +68,7 @@ Allows defining the nodes to spread the new collection across. The format is a c

+
If not provided, the CREATE operation will create shard-replicas spread across all live Solr nodes.
+
Alternatively, use the special value of `EMPTY` to initially create no shard-replica within the new collection and then later use the <<addreplica,ADDREPLICA>> operation to add shard-replicas when and where required.

`createNodeSet.shuffle`::
Controls whether or not the shard-replicas created for this collection will be assigned to the nodes specified by the `createNodeSet` in a sequential manner, or if the list of nodes should be shuffled prior to creating individual replicas.

@@ -89,10 +89,10 @@ Please note that <<realtime-get.adoc#realtime-get,RealTime Get>> or retrieval by

Set core property _name_ to _value_. See the section <<defining-core-properties.adoc#defining-core-properties,Defining core.properties>> for details on supported properties and values.

`autoAddReplicas`::
When set to `true`, enables automatic addition of replicas on shared file systems (such as HDFS) only. See the section <<running-solr-on-hdfs.adoc#automatically-add-replicas-in-solrcloud,autoAddReplicas Settings>> for more details on settings and overrides. The default is `false`.

`async`::
Request ID to track this action which will be <<Asynchronous Calls,processed asynchronously>>.

`rule`::
Replica placement rules. See the section <<rule-based-replica-placement.adoc#rule-based-replica-placement,Rule-based Replica Placement>> for details.

@@ -141,7 +141,7 @@ http://localhost:8983/solr/admin/collections?action=CREATE&name=newCollection&nu

</response>
----

[[modifycollection]]
== MODIFYCOLLECTION: Modify Attributes of a Collection

`/admin/collections?action=MODIFYCOLLECTION&collection=_<collection-name>&<attribute-name>=<attribute-value>&<another-attribute-name>=<another-value>_`

@@ -165,10 +165,9 @@ The attributes that can be modified are:

* rule
* snitch
+
See the <<create,CREATE action>> section above for details on these attributes.

[[reload]]
== RELOAD: Reload a Collection

`/admin/collections?action=RELOAD&name=_name_`

@@ -177,11 +176,11 @@ The RELOAD action is used when you have changed a configuration in ZooKeeper.

=== RELOAD Parameters

`name`::
The name of the collection to reload. This parameter is required.

`async`::
Request ID to track this action which will be <<Asynchronous Calls,processed asynchronously>>.

=== RELOAD Response

@@ -222,7 +221,7 @@ http://localhost:8983/solr/admin/collections?action=RELOAD&name=newCollection

</response>
----

[[splitshard]]
== SPLITSHARD: Split a Shard

`/admin/collections?action=SPLITSHARD&collection=_name_&shard=_shardID_`

@@ -233,7 +232,7 @@ This command allows for seamless splitting and requires no downtime. A shard bei

The split is performed by dividing the original shard's hash range into two equal partitions and dividing up the documents in the original shard according to the new sub-ranges. Two parameters discussed below, `ranges` and `split.key`, provide further control over how the split occurs.

Shard splitting can be a long-running process. In order to avoid timeouts, you should run this as an <<Asynchronous Calls,asynchronous call>>.

=== SPLITSHARD Parameters

@@ -259,7 +258,7 @@ For example, suppose `split.key=A!` hashes to the range `12-15` and belongs to s

Set core property _name_ to _value_. See the section <<defining-core-properties.adoc#defining-core-properties,Defining core.properties>> for details on supported properties and values.

`async`::
Request ID to track this action which will be <<Asynchronous Calls,processed asynchronously>>.

=== SPLITSHARD Response

@@ -338,7 +337,7 @@ http://localhost:8983/solr/admin/collections?action=SPLITSHARD&collection=anothe

</response>
----

[[createshard]]
== CREATESHARD: Create a Shard

Shards can only be created with this API for collections that use the 'implicit' router (i.e., when the collection was created, `router.name=implicit`). A new shard with a name can be created for an existing 'implicit' collection.

@@ -364,7 +363,7 @@ The format is a comma-separated list of node_names, such as `localhost:8983_solr

Set core property _name_ to _value_. See the section <<defining-core-properties.adoc#defining-core-properties,Defining core.properties>> for details on supported properties and values.

`async`::
Request ID to track this action which will be <<Asynchronous Calls,processed asynchronously>>.

=== CREATESHARD Response

@@ -393,7 +392,7 @@ http://localhost:8983/solr/admin/collections?action=CREATESHARD&collection=anImp

</response>
----

[[deleteshard]]
== DELETESHARD: Delete a Shard

Deleting a shard will unload all replicas of the shard, remove them from `clusterstate.json`, and (by default) delete the instanceDir and dataDir for each replica. It will only remove shards that are inactive, or which have no range given for custom sharding.

@@ -418,7 +417,7 @@ By default Solr will delete the dataDir of each replica that is deleted. Set thi

By default Solr will delete the index of each replica that is deleted. Set this to `false` to prevent the index directory from being deleted.

`async`::
Request ID to track this action which will be <<Asynchronous Calls,processed asynchronously>>.

=== DELETESHARD Response

@@ -455,7 +454,7 @@ http://localhost:8983/solr/admin/collections?action=DELETESHARD&collection=anoth

</response>
----

[[createalias]]
== CREATEALIAS: Create or Modify an Alias for a Collection

The `CREATEALIAS` action will create a new alias pointing to one or more collections. If an alias by the same name already exists, this action will replace the existing alias, effectively acting like an atomic "MOVE" command.

@@ -471,14 +470,12 @@ The alias name to be created. This parameter is required.

A comma-separated list of collections to be aliased. The collections must already exist in the cluster. This parameter is required.

`async`::
Request ID to track this action which will be <<Asynchronous Calls,processed asynchronously>>.

=== CREATEALIAS Response

The output will simply be a responseHeader with details of the time it took to process the request. To confirm the creation of the alias, you can look in the Solr Admin UI, under the Cloud section, and find the `aliases.json` file.

=== Examples using CREATEALIAS

*Input*

@@ -502,7 +499,7 @@ http://localhost:8983/solr/admin/collections?action=CREATEALIAS&name=testalias&c

</response>
----

[[listaliases]]
== LISTALIASES: List of all aliases in the cluster

`/admin/collections?action=LISTALIASES`

@@ -531,7 +528,7 @@ The output will contain a list of aliases with the corresponding collection name

</response>
----

[[deletealias]]
== DELETEALIAS: Delete a Collection Alias

`/admin/collections?action=DELETEALIAS&name=_name_`

@@ -542,7 +539,7 @@ The output will contain a list of aliases with the corresponding collection name

The name of the alias to delete. This parameter is required.

`async`::
Request ID to track this action which will be <<Asynchronous Calls,processed asynchronously>>.

=== DELETEALIAS Response

@@ -571,7 +568,7 @@ http://localhost:8983/solr/admin/collections?action=DELETEALIAS&name=testalias

</response>
----

[[delete]]
== DELETE: Delete a Collection

`/admin/collections?action=DELETE&name=_collection_`

@@ -582,7 +579,7 @@ http://localhost:8983/solr/admin/collections?action=DELETEALIAS&name=testalias

The name of the collection to delete. This parameter is required.

`async`::
Request ID to track this action which will be <<Asynchronous Calls,processed asynchronously>>.

=== DELETE Response

@@ -625,7 +622,7 @@ http://localhost:8983/solr/admin/collections?action=DELETE&name=newCollection

</response>
----

[[deletereplica]]
== DELETEREPLICA: Delete a Replica

Deletes a named replica from the specified collection and shard.

@@ -665,7 +662,7 @@ By default Solr will delete the index of the replica that is deleted. Set this t

When set to `true`, no action will be taken if the replica is active. Default `false`.

`async`::
Request ID to track this action which will be <<Asynchronous Calls,processed asynchronously>>.

=== Examples using DELETEREPLICA

@@ -688,7 +685,7 @@ http://localhost:8983/solr/admin/collections?action=DELETEREPLICA&collection=tes

</response>
----

[[addreplica]]
== ADDREPLICA: Add Replica

Add a replica to a shard in a collection. The node name can be specified if the replica is to be created in a specific node.

@@ -722,7 +719,8 @@ The directory in which the core should be created

`property._name_=_value_`::
Set core property _name_ to _value_. See <<defining-core-properties.adoc#defining-core-properties,Defining core.properties>> for details about supported properties and values.

`async`::
Request ID to track this action which will be <<Asynchronous Calls,processed asynchronously>>.

=== Examples using ADDREPLICA

@@ -754,7 +752,7 @@ http://localhost:8983/solr/admin/collections?action=ADDREPLICA&collection=test2&

</response>
----

[[clusterprop]]
== CLUSTERPROP: Cluster Properties

Add, edit or delete a cluster-wide property.

@@ -794,7 +792,7 @@ http://localhost:8983/solr/admin/collections?action=CLUSTERPROP&name=urlScheme&v

</response>
----

[[migrate]]
== MIGRATE: Migrate Documents to Another Collection

`/admin/collections?action=MIGRATE&collection=_name_&split.key=_key1!_&target.collection=_target_collection_&forward.timeout=60`

@@ -827,7 +825,7 @@ The timeout, in seconds, until which write requests made to the source collectio

Set core property _name_ to _value_. See the section <<defining-core-properties.adoc#defining-core-properties,Defining core.properties>> for details on supported properties and values.

`async`::
Request ID to track this action which will be <<Asynchronous Calls,processed asynchronously>>.

=== MIGRATE Response

@@ -988,7 +986,7 @@ http://localhost:8983/solr/admin/collections?action=MIGRATE&collection=test1&spl

</response>
----

[[addrole]]
== ADDROLE: Add a Role

`/admin/collections?action=ADDROLE&role=_roleName_&node=_nodeName_`

@@ -1003,7 +1001,7 @@ Use this command to dedicate a particular node as Overseer. Invoke it multiple t

The name of the role. The only supported role as of now is `overseer`. This parameter is required.

`node`::
The name of the node that will be assigned the role. It is possible to assign a role even before that node is started. This parameter is required.

=== ADDROLE Response

@@ -1030,7 +1028,7 @@ http://localhost:8983/solr/admin/collections?action=ADDROLE&role=overseer&node=1

</response>
----

[[removerole]]
== REMOVEROLE: Remove Role

Remove an assigned role. This API is used to undo the roles assigned using the ADDROLE operation.

@@ -1046,7 +1044,6 @@ The name of the role. The only supported role as of now is `overseer`. This para

The name of the node where the role should be removed.

=== REMOVEROLE Response

The response will include the status of the request and the properties that were updated or removed. If the status is anything other than "0", an error message will explain why the request failed.

@@ -1072,7 +1069,7 @@ http://localhost:8983/solr/admin/collections?action=REMOVEROLE&role=overseer&nod

</response>
----

[[overseerstatus]]
== OVERSEERSTATUS: Overseer Status and Statistics

Returns the current status of the overseer, performance statistics of various overseer APIs, and the last 10 failures per operation type.

@@ -1146,7 +1143,7 @@ http://localhost:8983/solr/admin/collections?action=OVERSEERSTATUS&wt=json

}
----

[[clusterstatus]]
== CLUSTERSTATUS: Cluster Status

Fetch the cluster status including collections, shards, replicas, and configuration name, as well as collection aliases and cluster properties.

@@ -1168,7 +1165,6 @@ This can be used if you need the details of the shard where a particular documen

The response will include the status of the request and the status of the cluster.
The response will include the status of the request and the status of the cluster.
|
||||||
|
|
||||||
[[CollectionsAPI-Examples.15]]
|
|
||||||
=== Examples using CLUSTERSTATUS
|
=== Examples using CLUSTERSTATUS
|
||||||
|
|
||||||
*Input*
|
*Input*
|
||||||
|
@ -1247,10 +1243,10 @@ http://localhost:8983/solr/admin/collections?action=clusterstatus&wt=json
|
||||||
}
|
}
|
||||||
----
|
----
|
||||||

[[requeststatus]]
== REQUESTSTATUS: Request Status of an Async Call

Request the status and response of an already submitted <<Asynchronous Calls,Asynchronous Collection API>> (below) call. This call is also used to clear up the stored statuses.

`/admin/collections?action=REQUESTSTATUS&requestid=_request-id_`

@@ -1307,10 +1303,10 @@ http://localhost:8983/solr/admin/collections?action=REQUESTSTATUS&requestid=1004

</response>
----

[[deletestatus]]
== DELETESTATUS: Delete Status

Deletes the stored response of an already failed or completed <<Asynchronous Calls,Asynchronous Collection API>> call.

`/admin/collections?action=DELETESTATUS&requestid=_request-id_`

@@ -1384,7 +1380,7 @@ http://localhost:8983/solr/admin/collections?action=DELETESTATUS&flush=true

</response>
----

[[list]]
== LIST: List Collections

Fetch the names of the collections in the cluster.

@@ -1413,7 +1409,7 @@ http://localhost:8983/solr/admin/collections?action=LIST&wt=json

"example2"]}
----

[[addreplicaprop]]
== ADDREPLICAPROP: Add Replica Property

Assign an arbitrary property to a particular replica and give it the value specified. If the property already exists, it will be overwritten with the new value.

@@ -1501,7 +1497,7 @@ http://localhost:8983/solr/admin/collections?action=ADDREPLICAPROP&shard=shard1&

http://localhost:8983/solr/admin/collections?action=ADDREPLICAPROP&shard=shard1&collection=collection1&replica=core_node3&property=testprop&property.value=value2&shardUnique=true
----

[[deletereplicaprop]]
== DELETEREPLICAPROP: Delete Replica Property

Deletes an arbitrary property from a particular replica.

@@ -1555,7 +1551,7 @@ http://localhost:8983/solr/admin/collections?action=DELETEREPLICAPROP&shard=shar

</response>
----

[[balanceshardunique]]
== BALANCESHARDUNIQUE: Balance a Property Across Nodes

`/admin/collections?action=BALANCESHARDUNIQUE&collection=_collectionName_&property=_propertyName_`

@@ -1607,7 +1603,7 @@ http://localhost:8983/solr/admin/collections?action=BALANCESHARDUNIQUE&collectio

Examining the clusterstate after issuing this call should show exactly one replica in each shard that has this property.

[[rebalanceleaders]]
== REBALANCELEADERS: Rebalance Leaders

Reassigns leaders in a collection according to the preferredLeader property across active nodes.

@@ -1709,10 +1705,7 @@ The replica in the "inactivePreferreds" section had the `preferredLeader` proper

Examining the clusterstate after issuing this call should show that every live node that has the `preferredLeader` property also has the "leader" property set to _true_.

[[forceleader]]
== FORCELEADER: Force Shard Leader

In the unlikely event of a shard losing its leader, this command can be invoked to force the election of a new leader.

@@ -1729,7 +1722,7 @@ The name of the shard where leader election should occur. This parameter is requ

WARNING: This is an expert-level command, and should be invoked only when regular leader election is not working. This may potentially lead to loss of data in the event that the new leader doesn't have certain updates, possibly recent ones, which were acknowledged by the old leader before going down.

[[migratestateformat]]
== MIGRATESTATEFORMAT: Migrate Cluster State

An expert-level utility API to move a collection from the shared `clusterstate.json` ZooKeeper node (created with `stateFormat=1`, the default in all Solr releases prior to 5.0) to the per-collection `state.json` stored in ZooKeeper (created with `stateFormat=2`, the current default), seamlessly and without any application down-time.

@@ -1742,11 +1735,11 @@ A expert level utility API to move a collection from shared `clusterstate.json`

The name of the collection to be migrated from `clusterstate.json` to its own `state.json` ZooKeeper node. This parameter is required.

`async`::
Request ID to track this action which will be <<Asynchronous Calls,processed asynchronously>>.

This API is useful in migrating any collections created prior to Solr 5.0 to the more scalable cluster state format now used by default. If a collection was created in any Solr 5.x version or higher, then executing this command is not necessary.

[[backup]]
== BACKUP: Backup Collection

Backs up Solr collections and associated configurations to a shared filesystem - for example, a Network File System.

@@ -1761,15 +1754,15 @@ The BACKUP command will backup Solr indexes and configurations for a specified c

The name of the collection to be backed up. This parameter is required.

`location`::
The location on a shared drive for the backup command to write to. Alternatively, it can be set as a <<clusterprop,cluster property>>.

`async`::
Request ID to track this action which will be <<Asynchronous Calls,processed asynchronously>>.

`repository`::
The name of a repository to be used for the backup. If no repository is specified then the local filesystem repository will be used automatically.
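
A backup call looks roughly like this (a sketch; the backup name, collection name, and path are illustrative):

[source,text]
----
http://localhost:8983/solr/admin/collections?action=BACKUP&name=myBackupName&collection=myCollection&location=/path/to/my/shared/drive
----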

[[restore]]
== RESTORE: Restore Collection

Restores Solr indexes and associated configurations.

@@ -1782,7 +1775,7 @@ The collection created will be have the same number of shards and replicas as th

While restoring, if a configSet with the same name exists in ZooKeeper then Solr will reuse it; otherwise it will upload the backed-up configSet to ZooKeeper and use that.

You can use the collection <<createalias,CREATEALIAS>> command to make sure clients don't need to change the endpoint to query or index against the newly restored collection.

=== RESTORE Parameters

@@ -1790,10 +1783,10 @@ You can use the collection <<CollectionsAPI-createalias,CREATEALIAS>> command to

The collection into which the indexes will be restored. This parameter is required.

`location`::
The location on a shared drive for the RESTORE command to read from. Alternatively, it can be set as a <<clusterprop,cluster property>>.

`async`::
Request ID to track this action which will be <<Asynchronous Calls,processed asynchronously>>.

`repository`::
The name of a repository to be used for the backup. If no repository is specified then the local filesystem repository will be used automatically.

@@ -1814,12 +1807,11 @@ When creating collections, the shards and/or replicas are spread across all avai

If a node is not live when the CREATE operation is called, it will not get any parts of the new collection, which could lead to too many replicas being created on a single live node. Defining `maxShardsPerNode` sets a limit on the number of replicas CREATE will spread to each node. If the entire collection cannot fit on the live nodes, no collection will be created at all.

`autoAddReplicas`::
When set to `true`, enables auto addition of replicas on shared file systems. See the section <<running-solr-on-hdfs.adoc#automatically-add-replicas-in-solrcloud,Automatically Add Replicas in SolrCloud>> for more details on settings and overrides.

`property._name_=_value_`::
Set core property _name_ to _value_. See the section <<defining-core-properties.adoc#defining-core-properties,Defining core.properties>> for details on supported properties and values.

== DELETENODE: Delete Replicas in a Node

Deletes all replicas of all collections in that node. Please note that the node itself will remain as a live node after this operation.

@@ -1828,12 +1820,12 @@ Deletes all replicas of all collections in that node. Please note that the node

=== DELETENODE Parameters

`node`::
The node to be removed. This parameter is required.

`async`::
Request ID to track this action which will be <<Asynchronous Calls,processed asynchronously>>.

== REPLACENODE: Move All Replicas in a Node to Another

This command recreates the replicas of one node (the source) on another node (the target). After each replica is copied, the replicas in the source node are deleted.

@@ -1854,7 +1846,7 @@ The target node where replicas will be copied. This parameter is required.

If this flag is set to `true`, all replicas are created in separate threads. Keep in mind that this can lead to very high network and disk I/O if the replicas have very large indices. The default is `false`.
|
|
||||||
`async`::
|
`async`::
|
||||||
Request ID to track this action which will be <<CollectionsAPI-async,processed asynchronously>>.
|
Request ID to track this action which will be <<Asynchronous Calls,processed asynchronously>>.
|
||||||
|
|
||||||
`timeout`::
|
`timeout`::
|
||||||
Time in seconds to wait until new replicas are created, and until leader replicas are fully recovered. The default is `300`, or 5 minutes.
|
Time in seconds to wait until new replicas are created, and until leader replicas are fully recovered. The default is `300`, or 5 minutes.
|
||||||
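A REPLACENODE request might look like the following sketch (the `source` and `target` parameter names and the node addresses are illustrative assumptions, not confirmed here):

[source,bash]
----
curl 'http://localhost:8983/solr/admin/collections?action=REPLACENODE&source=localhost:8983_solr&target=localhost:8984_solr&parallel=false'
----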
|
@@ -1864,7 +1856,6 @@ Time in seconds to wait until new replicas are created, and until leader replica
|
||||||
This operation does not hold necessary locks on the replicas that belong to the source node, so do not perform other collection operations during this period.
|
This operation does not hold necessary locks on the replicas that belong to the source node, so do not perform other collection operations during this period.
|
||||||
====
|
====
|
||||||
|
|
||||||
[[CollectionsAPI-movereplica]]
|
|
||||||
== MOVEREPLICA: Move a Replica to a New Node
|
== MOVEREPLICA: Move a Replica to a New Node
|
||||||
|
|
||||||
This command moves a replica from one node to a new node. In case of shared filesystems the `dataDir` will be reused.
|
This command moves a replica from one node to a new node. In case of shared filesystems the `dataDir` will be reused.
|
||||||
|
@@ -1889,12 +1880,11 @@ The name of the node that contains the replica. This parameter is required.
|
||||||
The name of the destination node. This parameter is required.
|
The name of the destination node. This parameter is required.
|
||||||
|
|
||||||
`async`::
|
`async`::
|
||||||
Request ID to track this action which will be <<CollectionsAPI-async,processed asynchronously>>.
|
Request ID to track this action which will be <<Asynchronous Calls,processed asynchronously>>.
|
||||||
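A MOVEREPLICA request might look like the following sketch (the `collection`, `replica`, and `targetNode` parameter names and their values are illustrative assumptions):

[source,bash]
----
curl 'http://localhost:8983/solr/admin/collections?action=MOVEREPLICA&collection=test&replica=core_node6&targetNode=localhost:8984_solr'
----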
|
|
||||||
[[CollectionsAPI-async]]
|
|
||||||
== Asynchronous Calls
|
== Asynchronous Calls
|
||||||
|
|
||||||
Since some collection API calls can be long-running tasks (such as SPLITSHARD), you can optionally have the calls run asynchronously. Specifying `async=<request-id>` enables you to make an asynchronous call, the status of which can be requested using the <<CollectionsAPI-requeststatus,REQUESTSTATUS>> call at any time.
|
Since some collection API calls can be long-running tasks (such as SPLITSHARD), you can optionally have the calls run asynchronously. Specifying `async=<request-id>` enables you to make an asynchronous call, the status of which can be requested using the <<requeststatus,REQUESTSTATUS>> call at any time.
|
||||||
|
|
||||||
As of now, REQUESTSTATUS does not automatically clean up the tracking data structures, meaning the status of completed or failed tasks stays stored in ZooKeeper unless cleared manually. DELETESTATUS can be used to clear the stored statuses. However, there is a limit of 10,000 on the number of async call responses stored in a cluster.
|
As of now, REQUESTSTATUS does not automatically clean up the tracking data structures, meaning the status of completed or failed tasks stays stored in ZooKeeper unless cleared manually. DELETESTATUS can be used to clear the stored statuses. However, there is a limit of 10,000 on the number of async call responses stored in a cluster.
|
||||||
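As a sketch, a long-running SPLITSHARD call could be tagged with a hypothetical request ID of `1000` and later polled with REQUESTSTATUS (collection and shard names here are placeholders):

[source,bash]
----
curl 'http://localhost:8983/solr/admin/collections?action=SPLITSHARD&collection=collection1&shard=shard1&async=1000'
curl 'http://localhost:8983/solr/admin/collections?action=REQUESTSTATUS&requestid=1000'
----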
|
|
||||||
|
|
|
@@ -36,6 +36,6 @@ image::images/collections-core-admin/collection-admin.png[image,width=653,height
|
||||||
|
|
||||||
Replicas can be deleted by clicking the red "X" next to the replica name.
|
Replicas can be deleted by clicking the red "X" next to the replica name.
|
||||||
|
|
||||||
If the shard is inactive, for example after a <<collections-api.adoc#CollectionsAPI-splitshard,SPLITSHARD action>>, an option to delete the shard will appear as a red "X" next to the shard name.
|
If the shard is inactive, for example after a <<collections-api.adoc#splitshard,SPLITSHARD action>>, an option to delete the shard will appear as a red "X" next to the shard name.
|
||||||
|
|
||||||
image::images/collections-core-admin/DeleteShard.png[image,width=486,height=250]
|
image::images/collections-core-admin/DeleteShard.png[image,width=486,height=250]
|
||||||
|
|
|
@@ -36,7 +36,6 @@ The `zkcli.sh` provided by Solr is not the same as the https://zookeeper.apache.
|
||||||
ZooKeeper's `zkCli.sh` provides a completely general, application-agnostic shell for manipulating data in ZooKeeper. Solr's `zkcli.sh` – discussed in this section – is specific to Solr, and has command line arguments specific to dealing with Solr data in ZooKeeper.
|
ZooKeeper's `zkCli.sh` provides a completely general, application-agnostic shell for manipulating data in ZooKeeper. Solr's `zkcli.sh` – discussed in this section – is specific to Solr, and has command line arguments specific to dealing with Solr data in ZooKeeper.
|
||||||
====
|
====
|
||||||
|
|
||||||
[[CommandLineUtilities-UsingSolr_sZooKeeperCLI]]
|
|
||||||
== Using Solr's ZooKeeper CLI
|
== Using Solr's ZooKeeper CLI
|
||||||
|
|
||||||
Use the `help` option to get a list of available commands from the script itself, as in `./server/scripts/cloud-scripts/zkcli.sh help`.
|
Use the `help` option to get a list of available commands from the script itself, as in `./server/scripts/cloud-scripts/zkcli.sh help`.
|
||||||
|
@@ -91,23 +90,20 @@ The short form parameter options may be specified with a single dash (eg: `-c my
|
||||||
The long form parameter options may be specified using either a single dash (eg: `-collection mycollection`) or a double dash (eg: `--collection mycollection`).
|
The long form parameter options may be specified using either a single dash (eg: `-collection mycollection`) or a double dash (eg: `--collection mycollection`).
|
||||||
====
|
====
|
||||||
|
|
||||||
[[CommandLineUtilities-ZooKeeperCLIExamples]]
|
|
||||||
== ZooKeeper CLI Examples
|
== ZooKeeper CLI Examples
|
||||||
|
|
||||||
Below are some examples of using the `zkcli.sh` CLI, which assume you have already started the SolrCloud example (`bin/solr -e cloud -noprompt`).
|
Below are some examples of using the `zkcli.sh` CLI, which assume you have already started the SolrCloud example (`bin/solr -e cloud -noprompt`).
|
||||||
|
|
||||||
If you are on a Windows machine, simply replace `zkcli.sh` with `zkcli.bat` in these examples.
|
If you are on a Windows machine, simply replace `zkcli.sh` with `zkcli.bat` in these examples.
|
||||||
|
|
||||||
[[CommandLineUtilities-Uploadaconfigurationdirectory]]
|
=== Upload a Configuration Directory
|
||||||
=== Upload a configuration directory
|
|
||||||
|
|
||||||
[source,bash]
|
[source,bash]
|
||||||
----
|
----
|
||||||
./server/scripts/cloud-scripts/zkcli.sh -zkhost 127.0.0.1:9983 -cmd upconfig -confname my_new_config -confdir server/solr/configsets/_default/conf
|
./server/scripts/cloud-scripts/zkcli.sh -zkhost 127.0.0.1:9983 -cmd upconfig -confname my_new_config -confdir server/solr/configsets/_default/conf
|
||||||
----
|
----
|
||||||
|
|
||||||
[[CommandLineUtilities-BootstrapZooKeeperfromexistingSOLR_HOME]]
|
=== Bootstrap ZooKeeper from an Existing solr.home
|
||||||
=== Bootstrap ZooKeeper from existing SOLR_HOME
|
|
||||||
|
|
||||||
[source,bash]
|
[source,bash]
|
||||||
----
|
----
|
||||||
|
@@ -120,32 +116,28 @@ If you are on Windows machine, simply replace `zkcli.sh` with `zkcli.bat` in the
|
||||||
Using the bootstrap command with a ZooKeeper chroot in the `-zkhost` parameter, e.g. `-zkhost 127.0.0.1:2181/solr`, will automatically create the chroot path before uploading the configs.
|
Using the bootstrap command with a ZooKeeper chroot in the `-zkhost` parameter, e.g. `-zkhost 127.0.0.1:2181/solr`, will automatically create the chroot path before uploading the configs.
|
||||||
====
|
====
|
||||||
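A typical bootstrap invocation might look like the following sketch (the `-solrhome` path is a placeholder for your actual Solr home directory):

[source,bash]
----
./server/scripts/cloud-scripts/zkcli.sh -zkhost 127.0.0.1:2181 -cmd bootstrap -solrhome /var/solr/data
----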
|
|
||||||
[[CommandLineUtilities-PutarbitrarydataintoanewZooKeeperfile]]
|
=== Put Arbitrary Data into a New ZooKeeper file
|
||||||
=== Put arbitrary data into a new ZooKeeper file
|
|
||||||
|
|
||||||
[source,bash]
|
[source,bash]
|
||||||
----
|
----
|
||||||
./server/scripts/cloud-scripts/zkcli.sh -zkhost 127.0.0.1:9983 -cmd put /my_zk_file.txt 'some data'
|
./server/scripts/cloud-scripts/zkcli.sh -zkhost 127.0.0.1:9983 -cmd put /my_zk_file.txt 'some data'
|
||||||
----
|
----
|
||||||
|
|
||||||
[[CommandLineUtilities-PutalocalfileintoanewZooKeeperfile]]
|
=== Put a Local File into a New ZooKeeper File
|
||||||
=== Put a local file into a new ZooKeeper file
|
|
||||||
|
|
||||||
[source,bash]
|
[source,bash]
|
||||||
----
|
----
|
||||||
./server/scripts/cloud-scripts/zkcli.sh -zkhost 127.0.0.1:9983 -cmd putfile /my_zk_file.txt /tmp/my_local_file.txt
|
./server/scripts/cloud-scripts/zkcli.sh -zkhost 127.0.0.1:9983 -cmd putfile /my_zk_file.txt /tmp/my_local_file.txt
|
||||||
----
|
----
|
||||||
|
|
||||||
[[CommandLineUtilities-Linkacollectiontoaconfigurationset]]
|
=== Link a Collection to a ConfigSet
|
||||||
=== Link a collection to a configuration set
|
|
||||||
|
|
||||||
[source,bash]
|
[source,bash]
|
||||||
----
|
----
|
||||||
./server/scripts/cloud-scripts/zkcli.sh -zkhost 127.0.0.1:9983 -cmd linkconfig -collection gettingstarted -confname my_new_config
|
./server/scripts/cloud-scripts/zkcli.sh -zkhost 127.0.0.1:9983 -cmd linkconfig -collection gettingstarted -confname my_new_config
|
||||||
----
|
----
|
||||||
|
|
||||||
[[CommandLineUtilities-CreateanewZooKeeperpath]]
|
=== Create a New ZooKeeper Path
|
||||||
=== Create a new ZooKeeper path
|
|
||||||
|
|
||||||
This can be useful to create a chroot path in ZooKeeper before the first cluster start.
|
This can be useful to create a chroot path in ZooKeeper before the first cluster start.
|
||||||
|
|
||||||
|
@@ -154,13 +146,11 @@ This can be useful to create a chroot path in ZooKeeper before first cluster sta
|
||||||
./server/scripts/cloud-scripts/zkcli.sh -zkhost 127.0.0.1:2181 -cmd makepath /solr
|
./server/scripts/cloud-scripts/zkcli.sh -zkhost 127.0.0.1:2181 -cmd makepath /solr
|
||||||
----
|
----
|
||||||
|
|
||||||
|
=== Set a Cluster Property
|
||||||
[[CommandLineUtilities-Setaclusterproperty]]
|
|
||||||
=== Set a cluster property
|
|
||||||
|
|
||||||
This command will add or modify a single cluster property in `clusterprops.json`. Use this command instead of the usual getfile \-> edit \-> putfile cycle.
|
This command will add or modify a single cluster property in `clusterprops.json`. Use this command instead of the usual getfile \-> edit \-> putfile cycle.
|
||||||
|
|
||||||
Unlike the CLUSTERPROP command on the <<collections-api.adoc#CollectionsAPI-clusterprop,Collections API>>, this command does *not* require a running Solr cluster.
|
Unlike the CLUSTERPROP command on the <<collections-api.adoc#clusterprop,Collections API>>, this command does *not* require a running Solr cluster.
|
||||||
|
|
||||||
[source,bash]
|
[source,bash]
|
||||||
----
|
----
|
||||||
|
|
|
@@ -20,7 +20,7 @@
|
||||||
|
|
||||||
Several query parsers share supported query parameters.
|
Several query parsers share supported query parameters.
|
||||||
|
|
||||||
The table below summarizes Solr's common query parameters, which are supported by the <<requesthandlers-and-searchcomponents-in-solrconfig#RequestHandlersandSearchComponentsinSolrConfig-SearchHandlers,Search RequestHandlers>>
|
The table below summarizes Solr's common query parameters, which are supported by the <<requesthandlers-and-searchcomponents-in-solrconfig#searchhandlers,Search RequestHandlers>>
|
||||||
|
|
||||||
// TODO: Change column width to %autowidth.spread when https://github.com/asciidoctor/asciidoctor-pdf/issues/599 is fixed
|
// TODO: Change column width to %autowidth.spread when https://github.com/asciidoctor/asciidoctor-pdf/issues/599 is fixed
|
||||||
|
|
||||||
|
@@ -249,7 +249,7 @@ As this check is periodically performed, the actual time for which a request can
|
||||||
|
|
||||||
This parameter may be set to either true or false.
|
This parameter may be set to either true or false.
|
||||||
|
|
||||||
If set to true, and if <<indexconfig-in-solrconfig.adoc#IndexConfiginSolrConfig-mergePolicyFactory,the mergePolicyFactory>> for this collection is a {solr-javadocs}/solr-core/org/apache/solr/index/SortingMergePolicyFactory.html[`SortingMergePolicyFactory`] which uses a `sort` option which is compatible with <<CommonQueryParameters-ThesortParameter,the sort parameter>> specified for this query, then Solr will attempt to use an {lucene-javadocs}/core/org/apache/lucene/search/EarlyTerminatingSortingCollector.html[`EarlyTerminatingSortingCollector`].
|
If set to true, and if <<indexconfig-in-solrconfig.adoc#mergepolicyfactory,the mergePolicyFactory>> for this collection is a {solr-javadocs}/solr-core/org/apache/solr/index/SortingMergePolicyFactory.html[`SortingMergePolicyFactory`] which uses a `sort` option which is compatible with <<CommonQueryParameters-ThesortParameter,the sort parameter>> specified for this query, then Solr will attempt to use an {lucene-javadocs}/core/org/apache/lucene/search/EarlyTerminatingSortingCollector.html[`EarlyTerminatingSortingCollector`].
|
||||||
|
|
||||||
If early termination is used, a `segmentTerminatedEarly` header will be included in the `responseHeader`.
|
If early termination is used, a `segmentTerminatedEarly` header will be included in the `responseHeader`.
|
||||||
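For example, a query opting in to early termination might look like this sketch (the `segmentTerminateEarly` parameter name and the sort field are assumptions not shown on this page):

[source,bash]
----
curl 'http://localhost:8983/solr/techproducts/select?q=*:*&sort=timestamp+desc&segmentTerminateEarly=true'
----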
|
|
||||||
|
|
|
@@ -24,15 +24,13 @@ This feature is enabled by default and works similarly in both SolrCloud and sta
|
||||||
|
|
||||||
When using this API, `solrconfig.xml` is not changed. Instead, all edited configuration is stored in a file called `configoverlay.json`. The values in `configoverlay.json` override the values in `solrconfig.xml`.
|
When using this API, `solrconfig.xml` is not changed. Instead, all edited configuration is stored in a file called `configoverlay.json`. The values in `configoverlay.json` override the values in `solrconfig.xml`.
|
||||||
|
|
||||||
[[ConfigAPI-APIEntryPoints]]
|
== Config API Entry Points
|
||||||
== API Entry Points
|
|
||||||
|
|
||||||
* `/config`: retrieve or modify the config. GET to retrieve and POST for executing commands
|
* `/config`: retrieve or modify the config. GET to retrieve and POST for executing commands
|
||||||
* `/config/overlay`: retrieve the details in the `configoverlay.json` alone
|
* `/config/overlay`: retrieve the details in the `configoverlay.json` alone
|
||||||
* `/config/params`: allows creating parameter sets that can override or take the place of parameters defined in `solrconfig.xml`. See the <<request-parameters-api.adoc#request-parameters-api,Request Parameters API>> section for more details.
|
* `/config/params`: allows creating parameter sets that can override or take the place of parameters defined in `solrconfig.xml`. See the <<request-parameters-api.adoc#request-parameters-api,Request Parameters API>> section for more details.
|
||||||
|
|
||||||
[[ConfigAPI-Retrievingtheconfig]]
|
== Retrieving the Config
|
||||||
== Retrieving the config
|
|
||||||
|
|
||||||
All configuration items can be retrieved by sending a GET request to the `/config` endpoint; the results will be the effective configuration resulting from merging settings in `configoverlay.json` with those in `solrconfig.xml`:
|
All configuration items can be retrieved by sending a GET request to the `/config` endpoint; the results will be the effective configuration resulting from merging settings in `configoverlay.json` with those in `solrconfig.xml`:
|
||||||
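For example, to fetch the full effective configuration of the `techproducts` collection used in the examples below:

[source,bash]
----
curl http://localhost:8983/solr/techproducts/config
----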
|
|
||||||
|
@@ -55,18 +53,16 @@ To further restrict returned results to a single component within a top level se
|
||||||
curl http://localhost:8983/solr/techproducts/config/requestHandler?componentName=/select
|
curl http://localhost:8983/solr/techproducts/config/requestHandler?componentName=/select
|
||||||
----
|
----
|
||||||
|
|
||||||
[[ConfigAPI-Commandstomodifytheconfig]]
|
== Commands to Modify the Config
|
||||||
== Commands to modify the config
|
|
||||||
|
|
||||||
This API uses specific commands to tell Solr what property or type of property to add to `configoverlay.json`. The commands are passed as part of the data sent with the request.
|
This API uses specific commands to tell Solr what property or type of property to add to `configoverlay.json`. The commands are passed as part of the data sent with the request.
|
||||||
|
|
||||||
The config commands are categorized into 3 different sections which manipulate various data structures in `solrconfig.xml`. Each of these is described below.
|
The config commands are categorized into 3 different sections which manipulate various data structures in `solrconfig.xml`. Each of these is described below.
|
||||||
|
|
||||||
* <<ConfigAPI-CommandsforCommonProperties,Common Properties>>
|
* <<Commands for Common Properties,Common Properties>>
|
||||||
* <<ConfigAPI-CommandsforCustomHandlersandLocalComponents,Components>>
|
* <<Commands for Custom Handlers and Local Components,Components>>
|
||||||
* <<ConfigAPI-CommandsforUser-DefinedProperties,User-defined properties>>
|
* <<Commands for User-Defined Properties,User-defined properties>>
|
||||||
|
|
||||||
[[ConfigAPI-CommandsforCommonProperties]]
|
|
||||||
=== Commands for Common Properties
|
=== Commands for Common Properties
|
||||||
|
|
||||||
The common properties are those that frequently need to be customized in a Solr instance. They are manipulated with two commands:
|
The common properties are those that frequently need to be customized in a Solr instance. They are manipulated with two commands:
|
||||||
|
@@ -120,7 +116,6 @@ The properties that are configured with these commands are predefined and listed
|
||||||
* `requestDispatcher.requestParsers.enableStreamBody`
|
* `requestDispatcher.requestParsers.enableStreamBody`
|
||||||
* `requestDispatcher.requestParsers.addHttpRequestToContext`
|
* `requestDispatcher.requestParsers.addHttpRequestToContext`
|
||||||
|
|
||||||
[[ConfigAPI-CommandsforCustomHandlersandLocalComponents]]
|
|
||||||
=== Commands for Custom Handlers and Local Components
|
=== Commands for Custom Handlers and Local Components
|
||||||
|
|
||||||
Custom request handlers, search components, and other types of localized Solr components (such as custom query parsers, update processors, etc.) can be added, updated and deleted with specific commands for the component being modified.
|
Custom request handlers, search components, and other types of localized Solr components (such as custom query parsers, update processors, etc.) can be added, updated and deleted with specific commands for the component being modified.
|
||||||
|
@@ -133,7 +128,6 @@ Settings removed from `configoverlay.json` are not removed from `solrconfig.xml`
|
||||||
|
|
||||||
The full list of available commands follows below:
|
The full list of available commands follows below:
|
||||||
|
|
||||||
[[ConfigAPI-GeneralPurposeCommands]]
|
|
||||||
==== General Purpose Commands
|
==== General Purpose Commands
|
||||||
|
|
||||||
These commands are the most commonly used:
|
These commands are the most commonly used:
|
||||||
|
@@ -151,7 +145,6 @@ These commands are the most commonly used:
|
||||||
* `update-queryresponsewriter`
|
* `update-queryresponsewriter`
|
||||||
* `delete-queryresponsewriter`
|
* `delete-queryresponsewriter`
|
||||||
|
|
||||||
[[ConfigAPI-AdvancedCommands]]
|
|
||||||
==== Advanced Commands
|
==== Advanced Commands
|
||||||
|
|
||||||
These commands allow registering more advanced customizations to Solr:
|
These commands allow registering more advanced customizations to Solr:
|
||||||
|
@@ -179,9 +172,8 @@ These commands allow registering more advanced customizations to Solr:
|
||||||
* `update-runtimelib`
|
* `update-runtimelib`
|
||||||
* `delete-runtimelib`
|
* `delete-runtimelib`
|
||||||
|
|
||||||
See the section <<ConfigAPI-CreatingandUpdatingRequestHandlers,Creating and Updating Request Handlers>> below for examples of using these commands.
|
See the section <<Creating and Updating Request Handlers>> below for examples of using these commands.
|
||||||
|
|
||||||
[[ConfigAPI-Whatabout_updateRequestProcessorChain_]]
|
|
||||||
==== What about updateRequestProcessorChain?
|
==== What about updateRequestProcessorChain?
|
||||||
|
|
||||||
The Config API does not let you create or edit `updateRequestProcessorChain` elements. However, it is possible to create `updateProcessor` entries and use them by name to create a chain.
|
The Config API does not let you create or edit `updateRequestProcessorChain` elements. However, it is possible to create `updateProcessor` entries and use them by name to create a chain.
|
||||||
|
@@ -198,7 +190,6 @@ curl http://localhost:8983/solr/techproducts/config -H 'Content-type:application
|
||||||
|
|
||||||
You can use this directly in your request by adding a parameter in the `updateRequestProcessorChain` for the specific update processor called `processor=firstFld`.
|
You can use this directly in your request by adding a parameter in the `updateRequestProcessorChain` for the specific update processor called `processor=firstFld`.
|
||||||
|
|
||||||
[[ConfigAPI-CommandsforUser-DefinedProperties]]
|
|
||||||
=== Commands for User-Defined Properties
|
=== Commands for User-Defined Properties
|
||||||
|
|
||||||
Solr lets users templatize `solrconfig.xml` using the placeholder format `${variable_name:default_val}`. You could set the values using system properties, for example, `-Dvariable_name=my_customvalue`. The same can be achieved during runtime using these commands:
|
Solr lets users templatize `solrconfig.xml` using the placeholder format `${variable_name:default_val}`. You could set the values using system properties, for example, `-Dvariable_name=my_customvalue`. The same can be achieved during runtime using these commands:
|
||||||
|
@@ -208,11 +199,10 @@ Solr lets users templatize the `solrconfig.xml` using the place holder format `$
|
||||||
|
|
||||||
The structure of the request is similar to the structure of requests using other commands, in the format of `"command":{"variable_name":"property_value"}`. You can add more than one variable at a time if necessary.
|
The structure of the request is similar to the structure of requests using other commands, in the format of `"command":{"variable_name":"property_value"}`. You can add more than one variable at a time if necessary.
|
||||||
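As a sketch, setting a single user-defined variable might look like this (`variable_name` and `some_value` are placeholders):

[source,bash]
----
curl http://localhost:8983/solr/techproducts/config -H 'Content-type:application/json' -d '{"set-user-property": {"variable_name": "some_value"}}'
----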
|
|
||||||
For more information about user-defined properties, see the section <<configuring-solrconfig-xml.adoc#Configuringsolrconfig.xml-Userdefinedpropertiesfromcore.properties,User defined properties from core.properties>>.
|
For more information about user-defined properties, see the section <<configuring-solrconfig-xml.adoc#user-defined-properties-in-core-properties,User defined properties in core.properties>>.
|
||||||
|
|
||||||
See also the section <<ConfigAPI-CreatingandUpdatingUser-DefinedProperties,Creating and Updating User-Defined Properties>> below for examples of how to use this type of command.
|
See also the section <<Creating and Updating User-Defined Properties>> below for examples of how to use this type of command.
|
||||||
|
|
||||||
[[ConfigAPI-HowtoMapsolrconfig.xmlPropertiestoJSON]]
|
|
||||||
== How to Map solrconfig.xml Properties to JSON
|
== How to Map solrconfig.xml Properties to JSON
|
||||||
|
|
||||||
By using this API, you will be generating JSON representations of properties defined in `solrconfig.xml`. To understand how properties should be represented with the API, let's take a look at a few examples.
|
By using this API, you will be generating JSON representations of properties defined in `solrconfig.xml`. To understand how properties should be represented with the API, let's take a look at a few examples.
|
||||||
|
@@ -364,15 +354,12 @@ Define the same properties with the Config API:
|
||||||
}
|
}
|
||||||
----
|
----
|
||||||
|
|
||||||
[[ConfigAPI-NameComponentsfortheConfigAPI]]
|
|
||||||
=== Name Components for the Config API
|
=== Name Components for the Config API
|
||||||
|
|
||||||
The Config API always allows changing the configuration of any component by name. However, some configurations such as `listener` or `initParams` do not require a name in `solrconfig.xml`. In order to be able to `update` and `delete` the same item in `configoverlay.json`, the name attribute becomes mandatory.
|
The Config API always allows changing the configuration of any component by name. However, some configurations such as `listener` or `initParams` do not require a name in `solrconfig.xml`. In order to be able to `update` and `delete` the same item in `configoverlay.json`, the name attribute becomes mandatory.
|
||||||
|
|
||||||
[[ConfigAPI-Examples]]
|
== Config API Examples
|
||||||
== Examples
|
|
||||||
|
|
||||||
[[ConfigAPI-CreatingandUpdatingCommonProperties]]
|
|
||||||
=== Creating and Updating Common Properties
|
=== Creating and Updating Common Properties
|
||||||
|
|
||||||
This change sets the `query.filterCache.autowarmCount` to 1000 items and unsets the `query.filterCache.size`.
|
This change sets the `query.filterCache.autowarmCount` to 1000 items and unsets the `query.filterCache.size`.
|
||||||
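A request making that change might look like the following sketch:

[source,bash]
----
curl http://localhost:8983/solr/techproducts/config -H 'Content-type:application/json' -d '{
  "set-property": {"query.filterCache.autowarmCount": 1000},
  "unset-property": "query.filterCache.size"
}'
----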
|
@@ -403,7 +390,6 @@ And you should get a response like this:
|
||||||
"size":25}}}}}
|
"size":25}}}}}
|
||||||
----
|
----
|
||||||
|
|
||||||
[[ConfigAPI-CreatingandUpdatingRequestHandlers]]
|
|
||||||
=== Creating and Updating Request Handlers
|
=== Creating and Updating Request Handlers
|
||||||
|
|
||||||
To create a request handler, we can use the `add-requesthandler` command:
|
To create a request handler, we can use the `add-requesthandler` command:
|
||||||
|
@@ -471,7 +457,6 @@ curl http://localhost:8983/solr/techproducts/config -H 'Content-type:application
|
||||||
}'
|
}'
|
||||||
----
|
----
|
||||||
|
|
||||||
[[ConfigAPI-CreatingandUpdatingUser-DefinedProperties]]
|
|
||||||
=== Creating and Updating User-Defined Properties
|
=== Creating and Updating User-Defined Properties
|
||||||
|
|
||||||
This command sets a user property.
|
This command sets a user property.
|
||||||
|
@@ -507,14 +492,12 @@ To unset the variable, issue a command like this:
|
||||||
curl http://localhost:8983/solr/techproducts/config -H'Content-type:application/json' -d '{"unset-user-property" : "variable_name"}'
|
curl http://localhost:8983/solr/techproducts/config -H'Content-type:application/json' -d '{"unset-user-property" : "variable_name"}'
|
||||||
----
|
----
|
||||||
|
|
||||||
[[ConfigAPI-HowItWorks]]
|
== How the Config API Works
|
||||||
== How It Works
|
|
||||||
|
|
||||||
Every core watches the ZooKeeper directory for the configset being used with that core. In standalone mode, however, there is no watch (because ZooKeeper is not running). If there are multiple cores in the same node using the same configset, only one ZooKeeper watch is used. For instance, if the configset 'myconf' is used by a core, the node would watch `/configs/myconf`. Every write operation performed through the API 'touches' the directory (sets an empty byte[] to trigger watches), and all watchers are notified. Every core then checks whether the schema file, `solrconfig.xml`, or `configoverlay.json` has been modified by comparing `znode` versions, and if so, the core is reloaded.
|
Every core watches the ZooKeeper directory for the configset being used with that core. In standalone mode, however, there is no watch (because ZooKeeper is not running). If there are multiple cores in the same node using the same configset, only one ZooKeeper watch is used. For instance, if the configset 'myconf' is used by a core, the node would watch `/configs/myconf`. Every write operation performed through the API 'touches' the directory (sets an empty byte[] to trigger watches), and all watchers are notified. Every core then checks whether the schema file, `solrconfig.xml`, or `configoverlay.json` has been modified by comparing `znode` versions, and if so, the core is reloaded.
|
||||||
|
|
||||||
If `params.json` is modified, the params object is just updated without a core reload (see the section <<request-parameters-api.adoc#request-parameters-api,Request Parameters API>> for more information about `params.json`).
|
If `params.json` is modified, the params object is just updated without a core reload (see the section <<request-parameters-api.adoc#request-parameters-api,Request Parameters API>> for more information about `params.json`).
|
||||||
|
|
||||||
[[ConfigAPI-EmptyCommand]]
|
|
||||||
=== Empty Command
|
=== Empty Command
|
||||||
|
|
||||||
If an empty command is sent to the `/config` endpoint, the watch is triggered on all cores using this configset. For example:
|
If an empty command is sent to the `/config` endpoint, the watch is triggered on all cores using this configset. For example:
|
||||||
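A sketch of such an empty command:

[source,bash]
----
curl http://localhost:8983/solr/techproducts/config -H 'Content-type:application/json' -d '{}'
----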
|
@ -528,7 +511,6 @@ Directly editing any files without 'touching' the directory *will not* make it v
|
||||||
|
|
||||||
It is possible for components to watch for the configset 'touch' events by registering a listener using `SolrCore#registerConfListener()`.
|
It is possible for components to watch for the configset 'touch' events by registering a listener using `SolrCore#registerConfListener()`.
|
||||||
|
|
||||||
[[ConfigAPI-ListeningtoconfigChanges]]
|
|
||||||
=== Listening to config Changes
|
=== Listening to config Changes
|
||||||
|
|
||||||
Any component can register a listener using:
|
Any component can register a listener using:
|
||||||
|
|
|
@@ -1,6 +1,7 @@
|
||||||
= ConfigSets API
|
= ConfigSets API
|
||||||
:page-shortname: configsets-api
|
:page-shortname: configsets-api
|
||||||
:page-permalink: configsets-api.html
|
:page-permalink: configsets-api.html
|
||||||
|
:page-toclevels: 1
|
||||||
// Licensed to the Apache Software Foundation (ASF) under one
|
// Licensed to the Apache Software Foundation (ASF) under one
|
||||||
// or more contributor license agreements. See the NOTICE file
|
// or more contributor license agreements. See the NOTICE file
|
||||||
// distributed with this work for additional information
|
// distributed with this work for additional information
|
||||||
|
@@ -24,45 +25,40 @@ To use a ConfigSet created with this API as the configuration for a collection,
|
||||||
|
|
||||||
This API can only be used with Solr running in SolrCloud mode. If you are not running Solr in SolrCloud mode but would still like to use shared configurations, please see the section <<config-sets.adoc#config-sets,Config Sets>>.
|
This API can only be used with Solr running in SolrCloud mode. If you are not running Solr in SolrCloud mode but would still like to use shared configurations, please see the section <<config-sets.adoc#config-sets,Config Sets>>.
|
||||||
|
|
||||||
[[ConfigSetsAPI-APIEntryPoints]]
|
== ConfigSets API Entry Points
|
||||||
== API Entry Points
|
|
||||||
|
|
||||||
The base URL for all API calls is `\http://<hostname>:<port>/solr`.
|
The base URL for all API calls is `\http://<hostname>:<port>/solr`.
|
||||||
|
|
||||||
* `/admin/configs?action=CREATE`: <<ConfigSetsAPI-create,create>> a ConfigSet, based on an existing ConfigSet
|
* `/admin/configs?action=CREATE`: <<configsets-create,create>> a ConfigSet, based on an existing ConfigSet
|
||||||
* `/admin/configs?action=DELETE`: <<ConfigSetsAPI-delete,delete>> a ConfigSet
|
* `/admin/configs?action=DELETE`: <<configsets-delete,delete>> a ConfigSet
|
||||||
* `/admin/configs?action=LIST`: <<ConfigSetsAPI-list,list>> all ConfigSets
|
* `/admin/configs?action=LIST`: <<configsets-list,list>> all ConfigSets
|
||||||
* `/admin/configs?action=UPLOAD`: <<ConfigSetsAPI-upload,upload>> a ConfigSet
|
* `/admin/configs?action=UPLOAD`: <<configsets-upload,upload>> a ConfigSet
|
||||||
|
|
||||||
[[ConfigSetsAPI-createCreateaConfigSet]]
|
[[configsets-create]]
|
||||||
|
|
||||||
[[ConfigSetsAPI-create]]
|
|
||||||
== Create a ConfigSet
|
== Create a ConfigSet
|
||||||
|
|
||||||
`/admin/configs?action=CREATE&name=_name_&baseConfigSet=_baseConfigSet_`
|
`/admin/configs?action=CREATE&name=_name_&baseConfigSet=_baseConfigSet_`
|
||||||
|
|
||||||
Create a ConfigSet, based on an existing ConfigSet.
|
Create a ConfigSet, based on an existing ConfigSet.
|
||||||
|
|
||||||
[[ConfigSetsAPI-Input]]
|
=== Create ConfigSet Parameters
|
||||||
=== Input
|
|
||||||
|
|
||||||
The following parameters are supported when creating a ConfigSet.
|
The following parameters are supported when creating a ConfigSet.
|
||||||
|
|
||||||
name:: The ConfigSet to be created. This parameter is required.
|
name::
|
||||||
|
The ConfigSet to be created. This parameter is required.
|
||||||
|
|
||||||
baseConfigSet:: The ConfigSet to copy as a base. This parameter is required.
|
baseConfigSet::
|
||||||
|
The ConfigSet to copy as a base. This parameter is required.
|
||||||
|
|
||||||
configSetProp._name_=_value_:: Any ConfigSet property from base to override.
|
configSetProp._name_=_value_::
|
||||||
|
Any ConfigSet property from base to override.
|
||||||
|
|
||||||
[[ConfigSetsAPI-Output]]
|
=== Create ConfigSet Response
|
||||||
=== Output
|
|
||||||
|
|
||||||
*Output Content*
|
The response will include the status of the request. If the status is anything other than "success", an error message will explain why the request failed.
|
||||||
|
|
||||||
The output will include the status of the request. If the status is anything other than "success", an error message will explain why the request failed.
|
=== Create ConfigSet Examples
|
||||||
|
|
||||||
[[ConfigSetsAPI-Examples]]
|
|
||||||
=== Examples
|
|
||||||
|
|
||||||
*Input*
|
*Input*
|
||||||
|
|
||||||
|
@@ -85,31 +81,23 @@ http://localhost:8983/solr/admin/configs?action=CREATE&name=myConfigSet&baseConf
|
||||||
</response>
|
</response>
|
||||||
----
|
----
|
||||||
|
|
||||||
[[ConfigSetsAPI-deleteDeleteaConfigSet]]
|
[[configsets-delete]]
|
||||||
|
|
||||||
[[ConfigSetsAPI-delete]]
|
|
||||||
== Delete a ConfigSet
|
== Delete a ConfigSet
|
||||||
|
|
||||||
`/admin/configs?action=DELETE&name=_name_`
|
`/admin/configs?action=DELETE&name=_name_`
|
||||||
|
|
||||||
Delete a ConfigSet
|
Delete a ConfigSet
|
||||||
|
|
||||||
[[ConfigSetsAPI-Input.1]]
|
=== Delete ConfigSet Parameters
|
||||||
=== Input
|
|
||||||
|
|
||||||
*Query Parameters*
|
name::
|
||||||
|
The ConfigSet to be deleted. This parameter is required.
|
||||||
|
|
||||||
name:: The ConfigSet to be deleted. This parameter is required.
|
=== Delete ConfigSet Response
|
||||||
|
|
||||||
[[ConfigSetsAPI-Output.1]]
|
|
||||||
=== Output
|
|
||||||
|
|
||||||
*Output Content*
|
|
||||||
|
|
||||||
The output will include the status of the request. If the status is anything other than "success", an error message will explain why the request failed.
|
The output will include the status of the request. If the status is anything other than "success", an error message will explain why the request failed.
|
||||||
|
|
||||||
[[ConfigSetsAPI-Examples.1]]
|
=== Delete ConfigSet Examples
|
||||||
=== Examples
|
|
||||||
|
|
||||||
*Input*
|
*Input*
|
||||||
|
|
||||||
|
@@ -132,15 +120,14 @@ http://localhost:8983/solr/admin/configs?action=DELETE&name=myConfigSet
|
||||||
</response>
|
</response>
|
||||||
----
|
----
|
||||||
|
|
||||||
[[ConfigSetsAPI-list]]
|
[[configsets-list]]
|
||||||
== List ConfigSets
|
== List ConfigSets
|
||||||
|
|
||||||
`/admin/configs?action=LIST`
|
`/admin/configs?action=LIST`
|
||||||
|
|
||||||
Fetch the names of the ConfigSets in the cluster.
|
Fetch the names of the ConfigSets in the cluster.
|
||||||
|
|
||||||
[[ConfigSetsAPI-Examples.2]]
|
=== List ConfigSet Examples
|
||||||
=== Examples
|
|
||||||
|
|
||||||
*Input*
|
*Input*
|
||||||
|
|
||||||
|
@@ -161,7 +148,7 @@ http://localhost:8983/solr/admin/configs?action=LIST&wt=json
|
||||||
"myConfig2"]}
|
"myConfig2"]}
|
||||||
----
|
----
|
||||||
|
|
||||||
[[ConfigSetsAPI-upload]]
|
[[configsets-upload]]
|
||||||
== Upload a ConfigSet
|
== Upload a ConfigSet
|
||||||
|
|
||||||
`/admin/configs?action=UPLOAD&name=_name_`
|
`/admin/configs?action=UPLOAD&name=_name_`
|
||||||
|
@@ -173,22 +160,18 @@ Upload a ConfigSet, sent in as a zipped file. Please note that a ConfigSet is up
|
||||||
* XSLT transformer (tr parameter) cannot be used at request processing time.
|
* XSLT transformer (tr parameter) cannot be used at request processing time.
|
||||||
* StatelessScriptUpdateProcessor does not initialize if specified in the ConfigSet.
|
* StatelessScriptUpdateProcessor does not initialize if specified in the ConfigSet.
|
||||||
|
|
||||||
[[ConfigSetsAPI-Input.3]]
|
=== Upload ConfigSet Parameters
|
||||||
=== Input
|
|
||||||
|
|
||||||
name:: The ConfigSet to be created when the upload is complete. This parameter is required.
|
name::
|
||||||
|
The ConfigSet to be created when the upload is complete. This parameter is required.
|
||||||
|
|
||||||
The body of the request should contain a zipped config set.
|
The body of the request should contain a zipped config set.
|
||||||
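For example, a zipped ConfigSet could be uploaded like this (a sketch; `myconfigset.zip` is a placeholder for a zip of your configuration directory):

[source,bash]
----
curl -X POST --header "Content-Type:application/octet-stream" --data-binary @myconfigset.zip "http://localhost:8983/solr/admin/configs?action=UPLOAD&name=myConfigSet"
----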
|
|
||||||
[[ConfigSetsAPI-Output.3]]
|
=== Upload ConfigSet Response
|
||||||
=== Output
|
|
||||||
|
|
||||||
*Output Content*
|
|
||||||
|
|
||||||
The output will include the status of the request. If the status is anything other than "success", an error message will explain why the request failed.
|
The output will include the status of the request. If the status is anything other than "success", an error message will explain why the request failed.
|
||||||
|
|
||||||
[[ConfigSetsAPI-Examples.3]]
|
=== Upload ConfigSet Examples
|
||||||
=== Examples
|
|
||||||
|
|
||||||
Create a ConfigSet named 'myConfigSet' based on a 'predefinedTemplate' ConfigSet, overriding the immutable property to false.
|
Create a ConfigSet named 'myConfigSet' based on a 'predefinedTemplate' ConfigSet, overriding the immutable property to false.
|
||||||
|
|
||||||
|
|
|
@@ -25,7 +25,6 @@ Solr logs are a key way to know what's happening in the system. There are severa
|
||||||
In addition to the logging options described below, there is a way to configure which request parameters (such as parameters sent as part of queries) are logged with an additional request parameter called `logParamsList`. See the section on <<common-query-parameters.adoc#CommonQueryParameters-ThelogParamsListParameter,Common Query Parameters>> for more information.
|
In addition to the logging options described below, there is a way to configure which request parameters (such as parameters sent as part of queries) are logged with an additional request parameter called `logParamsList`. See the section on <<common-query-parameters.adoc#CommonQueryParameters-ThelogParamsListParameter,Common Query Parameters>> for more information.
|
||||||
====
|
====
|
||||||
|
|
||||||
[[ConfiguringLogging-TemporaryLoggingSettings]]
|
|
||||||
== Temporary Logging Settings
|
== Temporary Logging Settings
|
||||||
|
|
||||||
You can control the amount of logging output in Solr by using the Admin Web interface. Select the *LOGGING* link. Note that this page only lets you change settings in the running system and is not saved for the next run. (For more information about the Admin Web interface, see <<using-the-solr-administration-user-interface.adoc#using-the-solr-administration-user-interface,Using the Solr Administration User Interface>>.)
|
You can control the amount of logging output in Solr by using the Admin Web interface. Select the *LOGGING* link. Note that this page only lets you change settings in the running system and is not saved for the next run. (For more information about the Admin Web interface, see <<using-the-solr-administration-user-interface.adoc#using-the-solr-administration-user-interface,Using the Solr Administration User Interface>>.)
|
||||||
|
@@ -59,7 +58,6 @@ Log levels settings are as follows:
|
||||||
|
|
||||||
Multiple settings at one time are allowed.
|
Multiple settings at one time are allowed.
|
||||||
|
|
||||||
[[ConfiguringLogging-LoglevelAPI]]
|
|
||||||
=== Log level API
|
=== Log level API
|
||||||
|
|
||||||
There is also a way of sending REST commands to the logging endpoint to do the same. Example:
|
There is also a way of sending REST commands to the logging endpoint to do the same. Example:
|
||||||
|
@@ -70,7 +68,6 @@ There is also a way of sending REST commands to the logging endpoint to do the s
|
||||||
curl -s http://localhost:8983/solr/admin/info/logging --data-binary "set=root:WARN&wt=json"
|
curl -s http://localhost:8983/solr/admin/info/logging --data-binary "set=root:WARN&wt=json"
|
||||||
----
|
----
|
||||||
|
|
||||||
[[ConfiguringLogging-ChoosingLogLevelatStartup]]
|
|
||||||
== Choosing Log Level at Startup
|
== Choosing Log Level at Startup
|
||||||
|
|
||||||
You can temporarily choose a different logging level as you start Solr. There are two ways:
|
You can temporarily choose a different logging level as you start Solr. There are two ways:
|
||||||
|
@@ -87,7 +84,6 @@ bin/solr start -f -v
|
||||||
bin/solr start -f -q
|
bin/solr start -f -q
|
||||||
----
|
----
|
||||||
|
|
||||||
[[ConfiguringLogging-PermanentLoggingSettings]]
|
|
||||||
== Permanent Logging Settings
|
== Permanent Logging Settings
|
||||||
|
|
||||||
Solr uses http://logging.apache.org/log4j/1.2/[Log4J version 1.2] for logging which is configured using `server/resources/log4j.properties`. Take a moment to inspect the contents of the `log4j.properties` file so that you are familiar with its structure. By default, Solr log messages will be written to `SOLR_LOGS_DIR/solr.log`.
|
Solr uses http://logging.apache.org/log4j/1.2/[Log4J version 1.2] for logging which is configured using `server/resources/log4j.properties`. Take a moment to inspect the contents of the `log4j.properties` file so that you are familiar with its structure. By default, Solr log messages will be written to `SOLR_LOGS_DIR/solr.log`.
|
||||||
|
@@ -109,7 +105,6 @@ On every startup of Solr, the start script will clean up old logs and rotate the
|
||||||
|
|
||||||
You can disable the automatic log rotation at startup by changing the setting `SOLR_LOG_PRESTART_ROTATION` found in `bin/solr.in.sh` or `bin/solr.in.cmd` to false.
|
You can disable the automatic log rotation at startup by changing the setting `SOLR_LOG_PRESTART_ROTATION` found in `bin/solr.in.sh` or `bin/solr.in.cmd` to false.
|
||||||
|
|
||||||
[[ConfiguringLogging-LoggingSlowQueries]]
|
|
||||||
== Logging Slow Queries
|
== Logging Slow Queries
|
||||||
|
|
||||||
For high-volume search applications, logging every query can generate a large amount of logs and, depending on the volume, potentially impact performance. If you mine these logs for additional insights into your application, then logging every query request may be useful.
|
For high-volume search applications, logging every query can generate a large amount of logs and, depending on the volume, potentially impact performance. If you mine these logs for additional insights into your application, then logging every query request may be useful.
|
||||||
|
|
|
@@ -51,14 +51,12 @@ We've covered the options in the following sections:
|
||||||
* <<update-request-processors.adoc#update-request-processors,Update Request Processors>>
|
* <<update-request-processors.adoc#update-request-processors,Update Request Processors>>
|
||||||
* <<codec-factory.adoc#codec-factory,Codec Factory>>
|
* <<codec-factory.adoc#codec-factory,Codec Factory>>
|
||||||
|
|
||||||
[[Configuringsolrconfig.xml-SubstitutingPropertiesinSolrConfigFiles]]
|
|
||||||
== Substituting Properties in Solr Config Files
|
== Substituting Properties in Solr Config Files
|
||||||
|
|
||||||
Solr supports variable substitution of property values in config files, which allows runtime specification of various configuration options in `solrconfig.xml`. The syntax is `${propertyname[:option default value]}`. This allows defining a default that can be overridden when Solr is launched. If a default value is not specified, then the property _must_ be specified at runtime or the configuration file will generate an error when parsed.
|
Solr supports variable substitution of property values in config files, which allows runtime specification of various configuration options in `solrconfig.xml`. The syntax is `${propertyname[:option default value]}`. This allows defining a default that can be overridden when Solr is launched. If a default value is not specified, then the property _must_ be specified at runtime or the configuration file will generate an error when parsed.
|
||||||
|
|
||||||
There are multiple methods for specifying properties that can be used in configuration files. Of those below, strongly consider "config overlay" as the preferred approach, as it stays local to the config set and because it's easy to modify.
|
There are multiple methods for specifying properties that can be used in configuration files. Of those below, strongly consider "config overlay" as the preferred approach, as it stays local to the config set and because it's easy to modify.
|
||||||
|
|
||||||
[[Configuringsolrconfig.xml-JVMSystemProperties]]
|
|
||||||
=== JVM System Properties
|
=== JVM System Properties
|
||||||
|
|
||||||
Any JVM System properties, usually specified using the `-D` flag when starting the JVM, can be used as variables in any XML configuration file in Solr.
|
Any JVM System properties, usually specified using the `-D` flag when starting the JVM, can be used as variables in any XML configuration file in Solr.
|
||||||
|
@@ -79,8 +77,7 @@ bin/solr start -Dsolr.lock.type=none
|
||||||
|
|
||||||
In general, any Java system property that you want to set can be passed through the `bin/solr` script using the standard `-Dproperty=value` syntax. Alternatively, you can add common system properties to the `SOLR_OPTS` environment variable defined in the Solr include file (`bin/solr.in.sh` or `bin/solr.in.cmd`). For more information about how the Solr include file works, refer to: <<taking-solr-to-production.adoc#taking-solr-to-production,Taking Solr to Production>>.
|
In general, any Java system property that you want to set can be passed through the `bin/solr` script using the standard `-Dproperty=value` syntax. Alternatively, you can add common system properties to the `SOLR_OPTS` environment variable defined in the Solr include file (`bin/solr.in.sh` or `bin/solr.in.cmd`). For more information about how the Solr include file works, refer to: <<taking-solr-to-production.adoc#taking-solr-to-production,Taking Solr to Production>>.
|
||||||
|
|
||||||
[[Configuringsolrconfig.xml-ConfigAPI]]
|
=== Config API to Override solrconfig.xml
|
||||||
=== Config API
|
|
||||||
|
|
||||||
The <<config-api.adoc#config-api,Config API>> allows you to use an API to modify Solr's configuration, specifically user defined properties. Changes made with this API are stored in a file named `configoverlay.json`. This file should only be edited with the API, but will look like this example:
|
The <<config-api.adoc#config-api,Config API>> allows you to use an API to modify Solr's configuration, specifically user defined properties. Changes made with this API are stored in a file named `configoverlay.json`. This file should only be edited with the API, but will look like this example:
|
||||||
|
|
||||||
|
@@ -94,7 +91,6 @@ The <<config-api.adoc#config-api,Config API>> allows you to use an API to modify
|
||||||
|
|
||||||
For more details, see the section <<config-api.adoc#config-api,Config API>>.
|
For more details, see the section <<config-api.adoc#config-api,Config API>>.
|
||||||
|
|
||||||
[[Configuringsolrconfig.xml-solrcore.properties]]
|
|
||||||
=== solrcore.properties
|
=== solrcore.properties
|
||||||
|
|
||||||
If the configuration directory for a Solr core contains a file named `solrcore.properties`, that file can contain arbitrary user-defined property names and values using the Java standard https://en.wikipedia.org/wiki/.properties[properties file format], and those properties can be used as variables in the XML configuration files for that Solr core.
|
If the configuration directory for a Solr core contains a file named `solrcore.properties`, that file can contain arbitrary user-defined property names and values using the Java standard https://en.wikipedia.org/wiki/.properties[properties file format], and those properties can be used as variables in the XML configuration files for that Solr core.
|
||||||
|
@@ -120,7 +116,6 @@ The path and name of the `solrcore.properties` file can be overridden using the
|
||||||
|
|
||||||
====
|
====
|
||||||
|
|
||||||
[[Configuringsolrconfig.xml-Userdefinedpropertiesfromcore.properties]]
|
|
||||||
=== User-Defined Properties in core.properties
|
=== User-Defined Properties in core.properties
|
||||||
|
|
||||||
Every Solr core has a `core.properties` file, automatically created when using the APIs. When you create a SolrCloud collection, you can pass custom parameters into each `core.properties` file that will be created, by prefixing the parameter name with `property.` as a URL parameter. Example:
|
Every Solr core has a `core.properties` file, automatically created when using the APIs. When you create a SolrCloud collection, you can pass custom parameters into each `core.properties` file that will be created, by prefixing the parameter name with `property.` as a URL parameter. Example:
|
||||||
|
@@ -148,7 +143,6 @@ The `my.custom.prop` property can then be used as a variable, such as in `solrco
|
||||||
</requestHandler>
|
</requestHandler>
|
||||||
----
|
----
|
||||||
|
|
||||||
[[Configuringsolrconfig.xml-ImplicitCoreProperties]]
|
|
||||||
=== Implicit Core Properties
|
=== Implicit Core Properties
|
||||||
|
|
||||||
Several attributes of a Solr core are available as "implicit" properties that can be used in variable substitution, independent of where or how the underlying value is initialized. For example: regardless of whether the name for a particular Solr core is explicitly configured in `core.properties` or inferred from the name of the instance directory, the implicit property `solr.core.name` is available for use as a variable in that core's configuration file...
|
Several attributes of a Solr core are available as "implicit" properties that can be used in variable substitution, independent of where or how the underlying value is initialized. For example: regardless of whether the name for a particular Solr core is explicitly configured in `core.properties` or inferred from the name of the instance directory, the implicit property `solr.core.name` is available for use as a variable in that core's configuration file...
|
||||||
|
|
|
@@ -22,8 +22,7 @@ Content streams are bulk data passed with a request to Solr.
|
||||||
|
|
||||||
When Solr RequestHandlers are accessed using path based URLs, the `SolrQueryRequest` object containing the parameters of the request may also contain a list of ContentStreams containing bulk data for the request. (The name SolrQueryRequest is a bit misleading: it is involved in all requests, regardless of whether it is a query request or an update request.)
|
When Solr RequestHandlers are accessed using path based URLs, the `SolrQueryRequest` object containing the parameters of the request may also contain a list of ContentStreams containing bulk data for the request. (The name SolrQueryRequest is a bit misleading: it is involved in all requests, regardless of whether it is a query request or an update request.)
|
||||||
|
|
||||||
[[ContentStreams-StreamSources]]
|
== Content Stream Sources
|
||||||
== Stream Sources
|
|
||||||
|
|
||||||
Currently request handlers can get content streams in a variety of ways:
|
Currently request handlers can get content streams in a variety of ways:
|
||||||
|
|
||||||
|
@@ -34,7 +33,6 @@ Currently request handlers can get content streams in a variety of ways:
|
||||||
|
|
||||||
By default, curl sends a `contentType="application/x-www-form-urlencoded"` header. If you need to test a SolrContentHeader content stream, you will need to set the content type with curl's `-H` flag.
|
By default, curl sends a `contentType="application/x-www-form-urlencoded"` header. If you need to test a SolrContentHeader content stream, you will need to set the content type with curl's `-H` flag.
|
||||||
|
|
||||||
[[ContentStreams-RemoteStreaming]]
|
|
||||||
== RemoteStreaming
|
== RemoteStreaming
|
||||||
|
|
||||||
Remote streaming lets you send the contents of a URL as a stream to a given SolrRequestHandler. You could use remote streaming to send a remote or local file to an update plugin.
|
Remote streaming lets you send the contents of a URL as a stream to a given SolrRequestHandler. You could use remote streaming to send a remote or local file to an update plugin.
|
||||||
|
@@ -65,10 +63,9 @@ curl -d '
|
||||||
|
|
||||||
[IMPORTANT]
|
[IMPORTANT]
|
||||||
====
|
====
|
||||||
If `enableRemoteStreaming="true"` is used, be aware that this allows _anyone_ to send a request to any URL or local file. If <<ContentStreams-DebuggingRequests,DumpRequestHandler>> is enabled, it will allow anyone to view any file on your system.
|
If `enableRemoteStreaming="true"` is used, be aware that this allows _anyone_ to send a request to any URL or local file. If the <<Debugging Requests,DumpRequestHandler>> is enabled, it will allow anyone to view any file on your system.
|
||||||
====
|
====
|
||||||
|
|
||||||
[[ContentStreams-DebuggingRequests]]
|
|
||||||
== Debugging Requests
|
== Debugging Requests
|
||||||
|
|
||||||
The implicit "dump" RequestHandler (see <<implicit-requesthandlers.adoc#implicit-requesthandlers,Implicit RequestHandlers>>) simply outputs the contents of the SolrQueryRequest using the specified writer type `wt`. This is a useful tool to help understand what streams are available to the RequestHandlers.
|
The implicit "dump" RequestHandler (see <<implicit-requesthandlers.adoc#implicit-requesthandlers,Implicit RequestHandlers>>) simply outputs the contents of the SolrQueryRequest using the specified writer type `wt`. This is a useful tool to help understand what streams are available to the RequestHandlers.
|
||||||
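As a sketch (the collection name and the query parameters are placeholders, and the `/debug/dump` path assumes the handler's implicit registration), echoing the parameters of a request might look like:

[source,bash]
----
curl 'http://localhost:8983/solr/techproducts/debug/dump?wt=json&param=value'
----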
|
|
|
@@ -29,7 +29,7 @@ CoreAdmin actions can be executed via HTTP requests that specify an `action`
|
||||||
|
|
||||||
All action names are uppercase, and are defined in depth in the sections below.
|
All action names are uppercase, and are defined in depth in the sections below.
|
||||||
|
|
||||||
[[CoreAdminAPI-STATUS]]
|
[[coreadmin-status]]
|
||||||
== STATUS
|
== STATUS
|
||||||
|
|
||||||
The `STATUS` action returns the status of all running Solr cores, or status for only the named core.
|
The `STATUS` action returns the status of all running Solr cores, or status for only the named core.
|
||||||
|
@@ -44,7 +44,7 @@ The name of a core, as listed in the "name" attribute of a `<core>` element in `
|
||||||
`indexInfo`::
|
`indexInfo`::
|
||||||
If `false`, information about the index will not be returned with a core STATUS request. In Solr implementations with a large number of cores (hundreds or more), retrieving the index information for each core can take a lot of time and isn't always required. The default is `true`.
|
If `false`, information about the index will not be returned with a core STATUS request. In Solr implementations with a large number of cores (hundreds or more), retrieving the index information for each core can take a lot of time and isn't always required. The default is `true`.
|
||||||
|
|
||||||
[[CoreAdminAPI-CREATE]]
|
[[coreadmin-create]]
|
||||||
== CREATE
|
== CREATE
|
||||||
|
|
||||||
The `CREATE` action creates a new core and registers it.
|
The `CREATE` action creates a new core and registers it.
|
||||||
|
@@ -102,7 +102,7 @@ WARNING: While it's possible to create a core for a non-existent collection, thi
|
||||||
The shard id this core represents. Normally you want to be auto-assigned a shard id.
|
The shard id this core represents. Normally you want to be auto-assigned a shard id.
|
||||||
|
|
||||||
`property._name_=_value_`::
|
`property._name_=_value_`::
|
||||||
Sets the core property _name_ to _value_. See the section on defining <<defining-core-properties.adoc#Definingcore.properties-core.properties_files,core.properties file contents>>.
|
Sets the core property _name_ to _value_. See the section on defining <<defining-core-properties.adoc#defining-core-properties-files,core.properties file contents>>.
|
||||||
|
|
||||||
`async`::
|
`async`::
|
||||||
Request ID to track this action which will be processed asynchronously.
|
Request ID to track this action which will be processed asynchronously.
|
||||||
|
@@ -115,7 +115,7 @@ Use `collection.configName=_configname_` to point to the config for a new collec
http://localhost:8983/solr/admin/cores?action=CREATE&name=my_core&collection=my_collection&shard=shard2

[[coreadmin-reload]]
== RELOAD

The RELOAD action loads a new core from the configuration of an existing, registered Solr core. While the new core is initializing, the existing one will continue to handle requests. When the new Solr core is ready, it takes over and the old core is unloaded.
@ -134,7 +134,7 @@ RELOAD performs "live" reloads of SolrCore, reusing some existing objects. Some
`core`::
The name of the core, as listed in the "name" attribute of a `<core>` element in `solr.xml`. This parameter is required.
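
For example, a RELOAD request for a core assumed to be named `my_core` might look like this:

[source,bash]
----
http://localhost:8983/solr/admin/cores?action=RELOAD&core=my_core
----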

[[coreadmin-rename]]
== RENAME

The `RENAME` action changes the name of a Solr core.
@ -153,7 +153,7 @@ The new name for the Solr core. If the persistent attribute of `<solr>` is `true
Request ID to track this action which will be processed asynchronously.
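
As a sketch, a RENAME request names both the existing core and its new name (here `my_core` and `my_core_v2` are assumptions):

[source,bash]
----
http://localhost:8983/solr/admin/cores?action=RENAME&core=my_core&other=my_core_v2
----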

[[coreadmin-swap]]
== SWAP

`SWAP` atomically swaps the names used to access two existing Solr cores. This can be used to swap new content into production. The prior core remains available and can be swapped back, if necessary. Each core will be known by the name of the other, after the swap.
@ -162,9 +162,7 @@ Request ID to track this action which will be processed asynchronously.
[IMPORTANT]
====
Do not use `SWAP` with a SolrCloud node. It is not supported and can result in the core being unusable.
====

=== SWAP Parameters
@ -179,7 +177,7 @@ The name of one of the cores to be swapped. This parameter is required.
Request ID to track this action which will be processed asynchronously.
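
For illustration, a SWAP request naming the two cores to exchange (the names `core1` and `core2` are assumptions) might look like this:

[source,bash]
----
http://localhost:8983/solr/admin/cores?action=SWAP&core=core1&other=core2
----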

[[coreadmin-unload]]
== UNLOAD

The `UNLOAD` action removes a core from Solr. Active requests will continue to be processed, but no new requests will be sent to the named core. If a core is registered under more than one name, only the given name is removed.
@ -210,8 +208,7 @@ If `true`, removes everything related to the core, including the index directory
`async`::
Request ID to track this action which will be processed asynchronously.
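
As a sketch, an UNLOAD request that also deletes the index (the core name `my_core` is an assumption) might look like this:

[source,bash]
----
http://localhost:8983/solr/admin/cores?action=UNLOAD&core=my_core&deleteIndex=true
----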

[[coreadmin-mergeindexes]]
== MERGEINDEXES

The `MERGEINDEXES` action merges one or more indexes into another index. The indexes must have completed commits, and should be locked against writes until the merge is complete, or the resulting merged index may become corrupted. The target core index must already exist and have a schema compatible with the one or more indexes that will be merged into it. Another commit on the target core should also be performed after the merge is complete.
@ -243,7 +240,7 @@ Multi-valued, source cores that would be merged.
Request ID to track this action which will be processed asynchronously.
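
As a sketch, a MERGEINDEXES request merging two on-disk indexes into a target core (the core name and index paths are assumptions) might look like this:

[source,bash]
----
http://localhost:8983/solr/admin/cores?action=MERGEINDEXES&core=core0&indexDir=/opt/solr/core1/data/index&indexDir=/opt/solr/core2/data/index
----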

[[coreadmin-split]]
== SPLIT

The `SPLIT` action splits an index into two or more indexes. The index being split can continue to handle requests. The split pieces can be placed into a specified directory on the server's filesystem, or they can be merged into running Solr cores.
@ -270,7 +267,6 @@ The key to be used for splitting the index. If this parameter is used, `ranges`
`async`::
Request ID to track this action which will be processed asynchronously.

=== SPLIT Examples

The `core` index will be split into as many pieces as the number of `path` or `targetCore` parameters.
@ -305,9 +301,9 @@ This example uses the `ranges` parameter with hash ranges 0-500, 501-1000 and 10
The `targetCore` must already exist and must have a schema compatible with the `core` index. A commit is automatically called on the `core` index before it is split.

This command is used as part of the <<collections-api.adoc#splitshard,SPLITSHARD>> command but it can be used for non-cloud Solr cores as well. When used against a non-cloud core without the `split.key` parameter, this action will split the source index and distribute its documents alternately so that each split piece contains an equal number of documents. If the `split.key` parameter is specified then only documents having the same route key will be split from the source index.

[[coreadmin-requeststatus]]
== REQUESTSTATUS

Request the status of an already submitted asynchronous CoreAdmin API call.
@ -326,7 +322,7 @@ The call below will return the status of an already submitted asynchronous CoreA
[source,bash]
http://localhost:8983/solr/admin/cores?action=REQUESTSTATUS&requestid=1

[[coreadmin-requestrecovery]]
== REQUESTRECOVERY

The `REQUESTRECOVERY` action manually asks a core to recover by syncing with the leader. This should be considered an "expert" level command and should be used in situations where the node (SolrCloud replica) is unable to become active automatically.
@ -338,7 +334,6 @@ The `REQUESTRECOVERY` action manually asks a core to recover by synching with th
`core`::
The name of the core to re-sync. This parameter is required.

=== REQUESTRECOVERY Examples

[source,bash]
@ -140,8 +140,6 @@ The CDCR replication logic requires modification to the maintenance logic of the
If the communication with one of the target data centers is slow, the Updates Log on the source data center can grow to a substantial size. In such a scenario, it is necessary for the Updates Log to be able to efficiently find a given update operation by its identifier. Since the identifier is an incremental number, it is possible to implement an efficient search strategy. Each transaction log file contains, as part of its filename, the version number of its first element. This makes it possible to quickly traverse all the transaction log files and find the one containing a specific version number.

=== Monitoring

CDCR provides the following monitoring capabilities over the replication operations:
@ -155,24 +153,19 @@ Information about the lifecycle and statistics will be provided on a per-shard b
The CDC Replicator is a background thread responsible for replicating updates from a Source data center to one or more target data centers. It is also responsible for providing monitoring information on a per-shard basis. As there can be a large number of collections and shards in a cluster, a fixed-size pool of CDC Replicator threads is shared across shards.

=== CDCR Limitations

The current design of CDCR has some limitations. CDCR will continue to evolve over time and many of these limitations will be addressed. Among them are:

* CDCR is unlikely to be satisfactory for bulk-load situations where the update rate is high, especially if the bandwidth between the Source and target clusters is restricted. In this scenario, the initial bulk load should be performed, the Source and target data centers synchronized, and CDCR used for incremental updates.
* CDCR is currently only active-passive; data is pushed from the Source cluster to the target cluster. There is active work being done in this area in the 6x code line to remove this limitation.
* CDCR works most robustly with the same number of shards in the Source and target collection. The shards in the two collections may have different numbers of replicas.
* Running CDCR with the indexes on HDFS is not currently supported; see the https://issues.apache.org/jira/browse/SOLR-9861[Solr CDCR over HDFS] JIRA issue.

== CDCR Configuration

The source and target configurations differ in the case of the data centers being in separate clusters. "Cluster" here means separate ZooKeeper ensembles controlling disjoint Solr instances. Whether these data centers are physically separated or not is immaterial for this discussion.

=== Source Configuration

Here is a sample of a source configuration file, a section in `solrconfig.xml`. The presence of the `<replica>` section causes CDCR to use this cluster as the Source, and it should not be present in the target collections in the cluster-to-cluster case. Details about each setting are after the two examples:
@ -211,8 +204,6 @@ Here is a sample of a source configuration file, a section in `solrconfig.xml`.
</updateHandler>
----

=== Target Configuration

Here is a typical target configuration.
@ -256,7 +247,6 @@ The configuration details, defaults and options are as follows:
CDCR can be configured to forward update requests to one or more replicas. A replica is defined with a “replica” list as follows:

`zkHost`::
The host address for ZooKeeper of the target SolrCloud. Usually this is a comma-separated list of addresses to each node in the target ZooKeeper ensemble. This parameter is required.
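
A minimal sketch of such a “replica” list inside the `/cdcr` request handler (the ZooKeeper address and collection names are assumptions):

[source,xml]
----
<requestHandler name="/cdcr" class="solr.CdcrRequestHandler">
  <lst name="replica">
    <str name="zkHost">10.240.18.211:2181</str>
    <str name="source">collection1</str>
    <str name="target">collection1</str>
  </lst>
</requestHandler>
----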
@ -303,41 +293,27 @@ Monitor actions are performed at a core level, i.e., by using the following base
Currently, none of the CDCR API calls have parameters.

=== API Entry Points (Control)

* `<collection>/cdcr?action=STATUS`: <<CDCR STATUS,Returns the current state>> of CDCR.
* `<collection>/cdcr?action=START`: <<CDCR START,Starts CDCR>> replication.
* `<collection>/cdcr?action=STOP`: <<CDCR STOP,Stops CDCR>> replication.
* `<collection>/cdcr?action=ENABLEBUFFER`: <<ENABLEBUFFER,Enables the buffering>> of updates.
* `<collection>/cdcr?action=DISABLEBUFFER`: <<DISABLEBUFFER,Disables the buffering>> of updates.

=== API Entry Points (Monitoring)

* `core/cdcr?action=QUEUES`: <<QUEUES,Fetches statistics about the queue>> for each replica and about the update logs.
* `core/cdcr?action=OPS`: <<OPS,Fetches statistics about the replication performance>> (operations per second) for each replica.
* `core/cdcr?action=ERRORS`: <<ERRORS,Fetches statistics and other information about replication errors>> for each replica.

=== Control Commands

==== CDCR STATUS

`/collection/cdcr?action=STATUS`

===== CDCR Status Example

*Input*
@ -362,22 +338,15 @@ The current state of the CDCR, which includes the state of the replication proce
}
----

==== ENABLEBUFFER

`/collection/cdcr?action=ENABLEBUFFER`

===== Enable Buffer Response

The status of the process and an indication of whether the buffer is enabled.

===== Enable Buffer Example

*Input*
@ -402,20 +371,15 @@ The status of the process and an indication of whether the buffer is enabled
}
----

==== DISABLEBUFFER

`/collection/cdcr?action=DISABLEBUFFER`

===== Disable Buffer Response

The status of CDCR and an indication that the buffer is disabled.

===== Disable Buffer Example

*Input*
@ -440,20 +404,15 @@ http://host:8983/solr/<collection_name>/cdcr?action=DISABLEBUFFER
}
----

==== CDCR START

`/collection/cdcr?action=START`

===== CDCR Start Response

Confirmation that CDCR is started and the status of buffering.

===== CDCR Start Examples

*Input*
@ -478,20 +437,15 @@ http://host:8983/solr/<collection_name>/cdcr?action=START
}
----

==== CDCR STOP

`/collection/cdcr?action=STOP`

===== CDCR Stop Response

The status of CDCR, including the confirmation that CDCR is stopped.

===== CDCR Stop Examples

*Input*
@ -517,19 +471,13 @@ http://host:8983/solr/<collection_name>/cdcr?action=START
----

=== CDCR Monitoring Commands

==== QUEUES

`/core/cdcr?action=QUEUES`

===== QUEUES Response

*Output Content*
@ -537,7 +485,7 @@ The output is composed of a list “queues” which contains a list of (ZooKeepe
The “queues” object also contains information about the updates log, such as the size (in bytes) of the updates log on disk (“tlogTotalSize”), the number of transaction log files (“tlogTotalCount”) and the status of the updates log synchronizer (“updateLogSynchronizer”).

===== QUEUES Examples

*Input*
@ -569,20 +517,15 @@ The “queues” object also contains information about the updates log, such as
}
----

==== OPS

`/core/cdcr?action=OPS`

===== OPS Response

The output is composed of `operationsPerSecond` which contains a list of (ZooKeeper) target hosts, themselves containing a list of target collections. For each collection, the average number of processed operations per second since the start of the replication process is provided. The operations are further broken down into two groups: add and delete operations.

===== OPS Examples

*Input*
@ -612,20 +555,15 @@ The “queues” object also contains information about the updates log, such as
}
----

==== ERRORS

`/core/cdcr?action=ERRORS`

===== ERRORS Response

The output is composed of a list “errors” which contains a list of (ZooKeeper) target hosts, themselves containing a list of target collections. For each collection, information about errors encountered during the replication is provided, such as the number of consecutive errors encountered by the replicator thread, the number of bad requests or internal errors since the start of the replication process, and a list of the last errors encountered ordered by timestamp.

===== ERRORS Examples

*Input*
@ -728,7 +666,6 @@ http://host:port/solr/collection_name/cdcr?action=DISABLEBUFFER
+
* Re-enable indexing

== Monitoring

. Network and disk space monitoring are essential. Ensure that the system has plenty of available storage to queue up changes if there is a disconnect between the Source and Target. A network outage between the two data centers can cause your disk usage to grow.
@ -763,8 +700,3 @@ curl http://<Source>/solr/cloud1/update -H 'Content-type:application/json' -d '[
#check the Target
curl "http://<Target>:8983/solr/<collection_name>/select?q=SKU:ABC&wt=json&indent=true"
----
@ -35,7 +35,6 @@ If you are using replication to replicate the Solr index (as described in <<lega
NOTE: If the environment variable `SOLR_DATA_HOME` is defined, or if `solr.data.home` is configured for your DirectoryFactory, the location of the data directory will be `<SOLR_DATA_HOME>/<instance_name>/data`.

== Specifying the DirectoryFactory For Your Index

The default {solr-javadocs}/solr-core/org/apache/solr/core/StandardDirectoryFactory.html[`solr.StandardDirectoryFactory`] is filesystem based, and tries to pick the best implementation for the current JVM and platform. You can force a particular implementation and/or config options by specifying {solr-javadocs}/solr-core/org/apache/solr/core/MMapDirectoryFactory.html[`solr.MMapDirectoryFactory`], {solr-javadocs}/solr-core/org/apache/solr/core/NIOFSDirectoryFactory.html[`solr.NIOFSDirectoryFactory`], or {solr-javadocs}/solr-core/org/apache/solr/core/SimpleFSDirectoryFactory.html[`solr.SimpleFSDirectoryFactory`].
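
As a sketch, forcing the memory-mapped implementation in `solrconfig.xml` might look like this:

[source,xml]
----
<directoryFactory name="DirectoryFactory"
                  class="solr.MMapDirectoryFactory"/>
----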
@ -57,7 +56,5 @@ The {solr-javadocs}/solr-core/org/apache/solr/core/RAMDirectoryFactory.html[`sol
[NOTE]
====
If you are using Hadoop and would like to store your indexes in HDFS, you should use the {solr-javadocs}/solr-core/org/apache/solr/core/HdfsDirectoryFactory.html[`solr.HdfsDirectoryFactory`] instead of either of the above implementations. For more details, see the section <<running-solr-on-hdfs.adoc#running-solr-on-hdfs,Running Solr on HDFS>>.
====
@ -23,7 +23,6 @@ The Dataimport screen shows the configuration of the DataImportHandler (DIH) and
.The Dataimport Screen
image::images/dataimport-screen/dataimport.png[image,width=485,height=250]

This screen also lets you adjust various options to control how the data is imported to Solr, and view the data import configuration file that controls the import.

For more information about data importing with DIH, see the section on <<uploading-structured-data-store-data-with-the-data-import-handler.adoc#uploading-structured-data-store-data-with-the-data-import-handler,Uploading Structured Data Store Data with the Data Import Handler>>.
@ -26,7 +26,6 @@ Preventing duplicate or near duplicate documents from entering an index or taggi
* Lookup3Signature: 64-bit hash used for exact duplicate detection. This is much faster than MD5 and smaller to index.
* http://wiki.apache.org/solr/TextProfileSignature[TextProfileSignature]: Fuzzy hashing implementation from Apache Nutch for near duplicate detection. It's tunable but works best on longer text.

Other, more sophisticated algorithms for fuzzy/near hashing can be added later.

[IMPORTANT]
@ -36,12 +35,10 @@ Adding in the de-duplication process will change the `allowDups` setting so that
Of course, the `signatureField` could be the unique field, but generally you want the unique field to be unique. When a document is added, a signature will automatically be generated and attached to the document in the specified `signatureField`.
====

== Configuration Options

There are two places in Solr to configure de-duplication: in `solrconfig.xml` and in `schema.xml`.

=== In solrconfig.xml

The `SignatureUpdateProcessorFactory` has to be registered in `solrconfig.xml` as part of an <<update-request-processors.adoc#update-request-processors,Update Request Processor Chain>>, as in this example:
@ -84,8 +81,6 @@ Set to *false* to disable de-duplication processing. The default is *true*.
overwriteDupes::
If `true` (the default), when a document already exists that matches this signature, it will be overwritten.

=== In schema.xml

If you are using a separate field for storing the signature, you must have it indexed:
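
A minimal sketch of such a field definition (the field name `signatureField` is an assumption):

[source,xml]
----
<field name="signatureField" type="string" stored="true" indexed="true" multiValued="false"/>
----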
@ -29,7 +29,6 @@ A minimal `core.properties` file looks like the example below. However, it can a
name=my_core_name
----

== Placement of core.properties

Solr cores are configured by placing a file named `core.properties` in a sub-directory under `solr.home`. There are no a priori limits to the depth of the tree, nor are there limits to the number of cores that can be defined. Cores may be anywhere in the tree, except that cores may _not_ be defined under an existing core. That is, the following is not allowed:
@ -61,11 +60,8 @@ Your `core.properties` file can be empty if necessary. Suppose `core.properties`
You can run Solr without configuring any cores.
====

== Defining core.properties Files

The minimal `core.properties` file is an empty file, in which case all of the properties are defaulted appropriately.

Java properties files allow the hash (`#`) or bang (`!`) characters to specify comment-to-end-of-line.
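
For illustration, a small commented `core.properties` file might look like this (the property values are assumptions):

[source,properties]
----
# the name under which Solr registers this core
name=my_core_name
! an optional, explicit data directory for this core
dataDir=/var/solr/data/my_core_name
----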
@ -98,4 +94,4 @@ The following properties are available:
`roles`:: Future parameter for SolrCloud or a way for users to mark nodes for their own use.

Additional user-defined properties may be specified for use as variables. For more information on how to define local properties, see the section <<configuring-solrconfig-xml.adoc#substituting-properties-in-solr-config-files,Substituting Properties in Solr Config Files>>.
@ -20,8 +20,7 @@
Fields are defined in the fields element of `schema.xml`. Once you have the field types set up, defining the fields themselves is simple.

== Example Field Definition

The following example defines a field named `price` with a type named `float` and a default value of `0.0`; the `indexed` and `stored` properties are explicitly set to `true`, while any other properties specified on the `float` field type are inherited.
|
@ -30,7 +29,6 @@ The following example defines a field named `price` with a type named `float` an
|
||||||
<field name="price" type="float" default="0.0" indexed="true" stored="true"/>
|
<field name="price" type="float" default="0.0" indexed="true" stored="true"/>
|
||||||
----
|
----
|
||||||
|
|
||||||
[[DefiningFields-FieldProperties]]
|
|
||||||
== Field Properties
|
== Field Properties
|
||||||
|
|
||||||
Field definitions can have the following properties:
|
Field definitions can have the following properties:
|
||||||
|
@ -44,7 +42,6 @@ The name of the `fieldType` for this field. This will be found in the `name` att
`default`::
A default value that will be added automatically to any document that does not have a value in this field when it is indexed. If this property is not specified, there is no default.

== Optional Field Type Override Properties

Fields can have many of the same properties as field types. Properties from the table below which are specified on an individual field will override any explicit value for that property specified on the `fieldType` of the field, or any implicit default property value provided by the underlying `fieldType` implementation. The table below is reproduced from <<field-type-definitions-and-properties.adoc#field-type-definitions-and-properties,Field Type Definitions and Properties>>, which has more details:
@ -31,12 +31,10 @@ For specific information on each of these language identification implementation
For more information about language analysis in Solr, see <<language-analysis.adoc#language-analysis,Language Analysis>>.

== Configuring Language Detection

You can configure the `langid` UpdateRequestProcessor in `solrconfig.xml`. Both implementations take the same parameters, which are described in the following section. At a minimum, you must specify the fields for language identification and a field for the resulting language code.

=== Configuring Tika Language Detection

Here is an example of a minimal Tika `langid` configuration in `solrconfig.xml`:
@ -51,7 +49,6 @@ Here is an example of a minimal Tika `langid` configuration in `solrconfig.xml`:
</processor>
----

=== Configuring LangDetect Language Detection

Here is an example of a minimal LangDetect `langid` configuration in `solrconfig.xml`:
@ -66,7 +63,6 @@ Here is an example of a minimal LangDetect `langid` configuration in `solrconfig
</processor>
----

== langid Parameters

As previously mentioned, both implementations of the `langid` UpdateRequestProcessor take the same parameters.
@ -22,10 +22,9 @@ When a Solr node receives a search request, the request is routed behind the sce
The chosen replica acts as an aggregator: it creates internal requests to randomly chosen replicas of every shard in the collection, coordinates the responses, issues any subsequent internal requests as needed (for example, to refine facet values or request additional stored fields), and constructs the final response for the client.

== Limiting Which Shards are Queried

While one of the advantages of using SolrCloud is the ability to query very large collections distributed among various shards, in some cases <<shards-and-indexing-data-in-solrcloud.adoc#document-routing,you may know that you are only interested in results from a subset of your shards>>. You have the option of searching over all of your data or just parts of it.

Querying all shards for a collection should look familiar; it's as though SolrCloud didn't even come into play:
@ -71,7 +70,6 @@ And of course, you can specify a list of shards (seperated by commas) each defin
http://localhost:8983/solr/gettingstarted/select?q=*:*&shards=shard1,localhost:7574/solr/gettingstarted|localhost:7500/solr/gettingstarted
----

== Configuring the ShardHandlerFactory

You can directly configure aspects of the concurrency and thread-pooling used within distributed search in Solr. This allows for finer-grained control, and you can tune it to target your own specific requirements. The default configuration favors throughput over latency.
@ -118,7 +116,6 @@ If specified, the thread pool will use a backing queue instead of a direct hando
`fairnessPolicy`::
Chooses the JVM specifics dealing with fair policy queuing. If enabled, distributed searches will be handled in a first-in-first-out fashion at a cost to throughput. If disabled, throughput will be favored over latency. The default is `false`.

== Configuring statsCache (Distributed IDF)

Document and term statistics are needed in order to calculate relevancy. Solr provides four implementations out of the box when it comes to document stats calculation:
@ -135,15 +132,13 @@ The implementation can be selected by setting `<statsCache>` in `solrconfig.xml`
<statsCache class="org.apache.solr.search.stats.ExactStatsCache"/>
----

== Avoiding Distributed Deadlock

Each shard serves top-level query requests and then makes sub-requests to all of the other shards. Care should be taken to ensure that the max number of threads serving HTTP requests is greater than the possible number of requests from both top-level clients and other shards. If this is not the case, the configuration may result in a distributed deadlock.

For example, a deadlock might occur in the case of two shards, each with just a single thread to service HTTP requests. Both threads could receive a top-level request concurrently, and make sub-requests to each other. Because there are no more remaining threads to service requests, the incoming requests will be blocked until the other pending requests are finished, but they will not finish since they are waiting for the sub-requests. By ensuring that Solr is configured to handle a sufficient number of threads, you can avoid deadlock situations like this.

== preferLocalShards Parameter

Solr allows you to pass an optional boolean parameter named `preferLocalShards` to indicate that a distributed query should prefer local replicas of a shard when available. In other words, if a query includes `preferLocalShards=true`, then the query controller will look for local replicas to service the query instead of selecting replicas at random from across the cluster. This is useful when a query requests many fields or large fields to be returned per document because it avoids moving large amounts of data over the network when it is available locally. In addition, this feature can be useful for minimizing the impact of a problematic replica with degraded performance, as it reduces the likelihood that the degraded replica will be hit by other healthy replicas.
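
For illustration (the collection name is an assumption), such a query might look like this:

[source,bash]
----
http://localhost:8983/solr/gettingstarted/select?q=*:*&preferLocalShards=true
----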
@ -26,14 +26,12 @@ Everything on this page is specific to legacy setup of distributed search. Users
Update reorders are handled (i.e., replica A may see update X then Y, and replica B may see update Y then X). *deleteByQuery* also handles reorders the same way, to ensure replicas are consistent. All replicas of a shard are consistent, even if the updates arrive in a different order on different replicas.

== Distributing Documents across Shards

When not using SolrCloud, it is up to you to get all your documents indexed on each shard of your server farm. Solr supports distributed indexing (routing) in its true form only in SolrCloud mode.

In the legacy distributed mode, Solr does not calculate universal term/doc frequencies. For most large-scale implementations, it is not likely to matter that Solr calculates TF/IDF at the shard level. However, if your collection is heavily skewed in its distribution across servers, you may find misleading relevancy results in your searches. In general, it is probably best to randomly distribute documents to your shards.

== Executing Distributed Searches with the shards Parameter

If a query request includes the `shards` parameter, the Solr server distributes the request across all the shards listed as arguments to the parameter. The `shards` parameter uses this syntax:
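
As a sketch, a search across two shards (hosts, ports, and core names are assumptions) might look like this:

[source,bash]
----
http://localhost:8983/solr/core1/select?q=*:*&shards=localhost:8983/solr/core1,localhost:7574/solr/core2
----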
@ -63,7 +61,6 @@ The following components support distributed search:
* The *Stats* component, which returns simple statistics for numeric fields within the DocSet.
* The *Debug* component, which helps with debugging.

== Limitations to Distributed Search

Distributed searching in Solr has the following limitations:
@@ -78,12 +75,10 @@ Distributed searching in Solr has the following limitations:

Formerly a limitation was that TF/IDF relevancy computations only used shard-local statistics. This is still the case by default. If your data isn't randomly distributed, or if you want more exact statistics, then remember to configure the ExactStatsCache.

== Avoiding Distributed Deadlock with Distributed Search

Like in SolrCloud mode, inter-shard requests could lead to a distributed deadlock. It can be avoided by following the instructions in the section <<distributed-requests.adoc#distributed-requests,Distributed Requests>>.

== Testing Index Sharding on Two Local Servers

For simple functional testing, it's easiest to just set up two local Solr servers on different ports. (In a production environment, of course, these servers would be deployed on separate machines.)

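A minimal sketch of that setup, assuming you have already created two Solr home directories (each containing a `solr.xml`) under hypothetical `node1/` and `node2/` directories:

[source,bash]
----
# Start two standalone Solr nodes on different ports (run from the Solr install dir)
bin/solr start -s node1 -p 8983
bin/solr start -s node2 -p 7574
----
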
@@ -42,28 +42,24 @@ The first step is to define the RequestHandler to use (aka, 'qt'). By default `/

Then choose the Document Type to define the type of document to load. The remaining parameters will change depending on the document type selected.

== JSON Documents

When using the JSON document type, the functionality is similar to using a requestHandler on the command line. Instead of putting the documents in a curl command, they can instead be input into the Document entry box. The document structure should still be in proper JSON format.

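For instance, the command-line equivalent of pasting a small JSON document into the Document entry box might look like this (the collection and field names are illustrative assumptions):

[source,bash]
----
# POST one JSON document to /update; commitWithin asks Solr to commit within 1000 ms
curl -X POST -H 'Content-type:application/json' \
  'http://localhost:8983/solr/techproducts/update?commitWithin=1000' \
  --data-binary '[{"id": "book1", "title": "A Book Title"}]'
----
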
Then you can choose when documents should be added to the index (Commit Within), and whether existing documents should be overwritten with incoming documents with the same id (if this is not *true*, then the incoming documents will be dropped).

This option will only add or overwrite documents to the index; for other update tasks, see the <<Solr Command>> option.

== CSV Documents

When using the CSV document type, the functionality is similar to using a requestHandler on the command line. Instead of putting the documents in a curl command, they can instead be input into the Document entry box. The document structure should still be in proper CSV format, with columns delimited and one row per document.

Then you can choose when documents should be added to the index (Commit Within), and whether existing documents should be overwritten with incoming documents with the same id (if this is not *true*, then the incoming documents will be dropped).

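As a sketch of the equivalent command-line form (the collection and field names are again assumptions), note the header row followed by one row per document:

[source,bash]
----
# POST CSV to /update: the first line names the columns, each following line is a document
curl -X POST -H 'Content-type:application/csv' \
  'http://localhost:8983/solr/techproducts/update?commitWithin=1000' \
  --data-binary 'id,title
book2,Another Book Title'
----
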
== Document Builder

The Document Builder provides a wizard-like interface to enter the fields of a document.

== File Upload

The File Upload option allows choosing a prepared file and uploading it. If using only `/update` for the Request-Handler option, you will be limited to XML, CSV, and JSON.

@@ -72,18 +68,16 @@ However, to use the ExtractingRequestHandler (aka Solr Cell), you can modify the

Then you can choose when documents should be added to the index (Commit Within), and whether existing documents should be overwritten with incoming documents with the same id (if this is not *true*, then the incoming documents will be dropped).

== Solr Command

The Solr Command option allows you to use XML or JSON to perform specific actions on documents, such as defining documents to be added or deleted, updating only certain fields of documents, or issuing commit and optimize commands on the index.

The documents should be structured as they would be if using `/update` on the command line.

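For example, a sketch of the command-line equivalent of a typical Solr Command, deleting documents by query and then committing (the collection name and query are illustrative assumptions):

[source,bash]
----
# A JSON update command: delete everything matching the query, then commit
curl -X POST -H 'Content-type:application/json' \
  'http://localhost:8983/solr/techproducts/update' \
  --data-binary '{"delete": {"query": "title:obsolete"}, "commit": {}}'
----
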
== XML Documents

When using the XML document type, the functionality is similar to using a requestHandler on the command line. Instead of putting the documents in a curl command, they can instead be input into the Document entry box. The document structure should still be in proper Solr XML format, with each document separated by `<doc>` tags and each field defined.

Then you can choose when documents should be added to the index (Commit Within), and whether existing documents should be overwritten with incoming documents with the same id (if this is not **true**, then the incoming documents will be dropped).

This option will only add or overwrite documents to the index; for other update tasks, see the <<Solr Command>> option.

@@ -28,7 +28,6 @@ For other features that we now commonly associate with search, such as sorting,

In Lucene 4.0, a new approach was introduced. DocValue fields are now column-oriented fields with a document-to-value mapping built at index time. This approach promises to relieve some of the memory requirements of the fieldCache and make lookups for faceting, sorting, and grouping much faster.

== Enabling DocValues

To use docValues, you only need to enable it for a field that you will use it with. As with all schema design, you need to define a field type and then define fields of that type with docValues enabled. All of these actions are done in `schema.xml`.

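The same effect can also be achieved at runtime via the Schema API; a minimal sketch (the field name, type, and collection are assumptions for illustration):

[source,bash]
----
# Add a string field with docValues enabled, without editing schema.xml by hand
curl -X POST -H 'Content-type:application/json' \
  'http://localhost:8983/solr/techproducts/schema' \
  --data-binary '{"add-field": {"name": "manu_exact", "type": "string", "docValues": true}}'
----
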
@@ -76,7 +75,6 @@ Lucene index back-compatibility is only supported for the default codec. If you

If `docValues="true"` for a field, then DocValues will automatically be used any time the field is used for <<common-query-parameters.adoc#CommonQueryParameters-ThesortParameter,sorting>>, <<faceting.adoc#faceting,faceting>> or <<function-queries.adoc#function-queries,function queries>>.

=== Retrieving DocValues During Search

Field values retrieved during search queries are typically returned from stored values. However, non-stored docValues fields will also be returned along with other stored fields when all fields (or pattern-matching globs) are specified to be returned (e.g., `fl=*`) for search queries, depending on the effective value of the `useDocValuesAsStored` parameter for each field. For schema versions >= 1.6, the implicit default is `useDocValuesAsStored="true"`. See <<field-type-definitions-and-properties.adoc#field-type-definitions-and-properties,Field Type Definitions and Properties>> and <<defining-fields.adoc#defining-fields,Defining Fields>> for more details.

@@ -24,10 +24,8 @@ This section describes enabling SSL using a self-signed certificate.

For background on SSL certificates and keys, see http://www.tldp.org/HOWTO/SSL-Certificates-HOWTO/.

== Basic SSL Setup

=== Generate a Self-Signed Certificate and a Key

To generate a self-signed certificate and a single key that will be used to authenticate both the server and the client, we'll use the JDK https://docs.oracle.com/javase/8/docs/technotes/tools/unix/keytool.html[`keytool`] command and create a separate keystore. This keystore will also be used as a truststore below. It's possible to use the keystore that comes with the JDK for these purposes, and to use a separate truststore, but those options aren't covered here.

@@ -45,7 +43,6 @@ keytool -genkeypair -alias solr-ssl -keyalg RSA -keysize 2048 -keypass secret -s

The above command will create a keystore file named `solr-ssl.keystore.jks` in the current directory.

=== Convert the Certificate and Key to PEM Format for Use with cURL

cURL isn't capable of using JKS formatted keystores, so the JKS keystore needs to be converted to PEM format, which cURL understands.

@@ -73,7 +70,6 @@ If you want to use cURL on OS X Yosemite (10.10), you'll need to create a certif

openssl pkcs12 -nokeys -in solr-ssl.keystore.p12 -out solr-ssl.cacert.pem
----

=== Set Common SSL-Related System Properties

The Solr Control Script is already set up to pass SSL-related Java system properties to the JVM. To activate the SSL settings, uncomment and update the set of properties beginning with `SOLR_SSL_*` in `bin/solr.in.sh` (or `bin\solr.in.cmd` on Windows).

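A sketch of what the uncommented properties might look like in `bin/solr.in.sh`, using the example keystore created above (the paths and passwords are this page's example values; adjust them for your environment):

[source,bash]
----
# Point Solr at the keystore/truststore and enable HTTPS
SOLR_SSL_KEY_STORE=etc/solr-ssl.keystore.jks
SOLR_SSL_KEY_STORE_PASSWORD=secret
SOLR_SSL_TRUST_STORE=etc/solr-ssl.keystore.jks
SOLR_SSL_TRUST_STORE_PASSWORD=secret
# Whether clients must (NEED) or merely may (WANT) present a certificate
SOLR_SSL_NEED_CLIENT_AUTH=false
SOLR_SSL_WANT_CLIENT_AUTH=false
----
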
@@ -116,7 +112,6 @@ REM Enable clients to authenticate (but not require)

set SOLR_SSL_WANT_CLIENT_AUTH=false
----

=== Run Single Node Solr using SSL

Start Solr using the command shown below; by default clients will not be required to authenticate:

@@ -133,12 +128,10 @@ bin/solr -p 8984

bin\solr.cmd -p 8984
----

== SSL with SolrCloud

This section describes how to run a two-node SolrCloud cluster with no initial collections and a single-node external ZooKeeper. The commands below assume you have already created the keystore described above.

=== Configure ZooKeeper

NOTE: ZooKeeper does not support encrypted communication with clients like Solr. There are several related JIRA tickets where SSL support is being planned/worked on: https://issues.apache.org/jira/browse/ZOOKEEPER-235[ZOOKEEPER-235]; https://issues.apache.org/jira/browse/ZOOKEEPER-236[ZOOKEEPER-236]; https://issues.apache.org/jira/browse/ZOOKEEPER-1000[ZOOKEEPER-1000]; and https://issues.apache.org/jira/browse/ZOOKEEPER-2120[ZOOKEEPER-2120].

@@ -161,12 +154,10 @@ server/scripts/cloud-scripts/zkcli.sh -zkhost localhost:2181 -cmd clusterprop -n

server\scripts\cloud-scripts\zkcli.bat -zkhost localhost:2181 -cmd clusterprop -name urlScheme -val https
----

If you have set up your ZooKeeper cluster to use a <<taking-solr-to-production.adoc#zookeeper-chroot,chroot for Solr>>, make sure you use the correct `zkhost` string with `zkcli`, e.g., `-zkhost localhost:2181/solr`.

=== Run SolrCloud with SSL

==== Create Solr Home Directories for Two Nodes

Create two copies of the `server/solr/` directory which will serve as the Solr home directories for each of your two SolrCloud nodes:

@@ -187,7 +178,6 @@ xcopy /E server\solr cloud\node1\

xcopy /E server\solr cloud\node2\
----

==== Start the First Solr Node

Next, start the first Solr node on port 8984. Be sure to stop the standalone server first if you started it when working through the previous section on this page.

@@ -220,7 +210,6 @@ bin/solr -cloud -s cloud/node1 -z localhost:2181 -p 8984 -Dsolr.ssl.checkPeerNam

bin\solr.cmd -cloud -s cloud\node1 -z localhost:2181 -p 8984 -Dsolr.ssl.checkPeerName=false
----

==== Start the Second Solr Node

Finally, start the second Solr node on port 7574. Again, to skip hostname verification, add `-Dsolr.ssl.checkPeerName=false`:

@@ -237,14 +226,13 @@ bin/solr -cloud -s cloud/node2 -z localhost:2181 -p 7574

bin\solr.cmd -cloud -s cloud\node2 -z localhost:2181 -p 7574
----

== Example Client Actions

[IMPORTANT]
====
cURL on OS X Mavericks (10.9) has degraded SSL support. For more information and workarounds to allow one-way SSL, see http://curl.haxx.se/mail/archive-2013-10/0036.html. cURL on OS X Yosemite (10.10) is improved: 2-way SSL is possible; see http://curl.haxx.se/mail/archive-2014-10/0053.html.

The cURL commands in the following sections will not work with the system `curl` on OS X Yosemite (10.10). Instead, the certificate supplied with the `-E` param must be in PKCS12 format, and the file supplied with the `--cacert` param must contain only the CA certificate, and no key (see <<Convert the Certificate and Key to PEM Format for Use with cURL,above>> for instructions on creating this file):

[source,bash]
curl -E solr-ssl.keystore.p12:secret --cacert solr-ssl.cacert.pem ...

|
@ -271,7 +259,6 @@ bin\solr.cmd create -c mycollection -shards 2
|
||||||
|
|
||||||
The `create` action will pass the `SOLR_SSL_*` properties set in your include file to the SolrJ code used to create the collection.
|
The `create` action will pass the `SOLR_SSL_*` properties set in your include file to the SolrJ code used to create the collection.
|
||||||
|
|
||||||
[[EnablingSSL-RetrieveSolrCloudclusterstatususingcURL]]
|
|
||||||
=== Retrieve SolrCloud Cluster Status using cURL
|
=== Retrieve SolrCloud Cluster Status using cURL
|
||||||
|
|
||||||
To get the resulting cluster status (again, if you have not enabled client authentication, remove the `-E solr-ssl.pem:secret` option):
|
To get the resulting cluster status (again, if you have not enabled client authentication, remove the `-E solr-ssl.pem:secret` option):
|
||||||
|
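A sketch of such a request (the exact command may differ on the full page; it assumes the PEM files created above are in the current directory):

[source,bash]
----
# Ask the Collections API for the cluster status over HTTPS
curl -E solr-ssl.pem:secret --cacert solr-ssl.pem "https://localhost:8984/solr/admin/collections?action=CLUSTERSTATUS&indent=on"
----
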
@@ -317,7 +304,6 @@ You should get a response that looks like this:

"properties":{"urlScheme":"https"}}}
----

=== Index Documents using post.jar

Use `post.jar` to index some example documents to the SolrCloud collection created above:

@@ -329,7 +315,6 @@ cd example/exampledocs

java -Djavax.net.ssl.keyStorePassword=secret -Djavax.net.ssl.keyStore=../../server/etc/solr-ssl.keystore.jks -Djavax.net.ssl.trustStore=../../server/etc/solr-ssl.keystore.jks -Djavax.net.ssl.trustStorePassword=secret -Durl=https://localhost:8984/solr/mycollection/update -jar post.jar *.xml
----

=== Query Using cURL

Use cURL to query the SolrCloud collection created above, from a directory containing the PEM formatted certificate and key created above (e.g., `example/etc/`). If you have not enabled client authentication (system property `-Djetty.ssl.clientAuth=true`), then you can remove the `-E solr-ssl.pem:secret` option:

@@ -339,8 +324,7 @@ Use cURL to query the SolrCloud collection created above, from a directory conta

curl -E solr-ssl.pem:secret --cacert solr-ssl.pem "https://localhost:8984/solr/mycollection/select?q=*:*&wt=json&indent=on"
----

=== Index a Document using CloudSolrClient

From a Java client using SolrJ, index a document. In the code below, the `javax.net.ssl.*` system properties are set programmatically, but you could instead specify them on the Java command line, as in the `post.jar` example above:

@@ -18,14 +18,12 @@

// specific language governing permissions and limitations
// under the License.

== Errata For This Documentation

Any mistakes found in this documentation after its release will be listed on the on-line version of this page:

https://lucene.apache.org/solr/guide/{solr-docs-version}/errata.html

== Errata For Past Versions of This Documentation

Any known mistakes in past releases of this documentation will be noted below.

@@ -25,19 +25,16 @@ This feature uses a stream sorting technique that begins to send records within

The cases where this functionality may be useful include: session analysis, distributed merge joins, time series roll-ups, aggregations on high cardinality fields, fully distributed field collapsing, and sort-based stats.

== Field Requirements

All the fields being sorted and exported must have docValues set to true. For more information, see the section on <<docvalues.adoc#docvalues,DocValues>>.

== The /export RequestHandler

The `/export` request handler with the appropriate configuration is one of Solr's out-of-the-box request handlers; see <<implicit-requesthandlers.adoc#implicit-requesthandlers,Implicit RequestHandlers>> for more information.

Note that this request handler's properties are defined as "invariants", which means they cannot be overridden by other properties passed at another time (such as at query time).

== Requesting Results Export

You can use `/export` to make requests to export the result set of a query.

@@ -53,19 +50,16 @@ Here is an example of an export request of some indexed log data:

http://localhost:8983/solr/core_name/export?q=my-query&sort=severity+desc,timestamp+desc&fl=severity,timestamp,msg
----

=== Specifying the Sort Criteria

The `sort` property defines how documents will be sorted in the exported result set. Results can be sorted by any field that has a field type of int, long, float, double, or string. The sort fields must be single-valued fields.

Up to four sort fields can be specified per request, with the 'asc' or 'desc' properties.

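For example, a sketch of a request with two sort fields (the field names are illustrative assumptions):

[source,bash]
----
# Export sorted by day ascending, then by response time descending
curl "http://localhost:8983/solr/core_name/export?q=*:*&sort=day+asc,rt+desc&fl=day,rt,username"
----
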
=== Specifying the Field List

The `fl` property defines the fields that will be exported with the result set. Any of the field types that can be sorted (i.e., int, long, float, double, string, date, boolean) can be used in the field list. The fields can be single or multi-valued. However, returning scores and using wildcards are not supported at this time.

== Distributed Support

See the section <<streaming-expressions.adoc#streaming-expressions,Streaming Expressions>> for distributed support.

@@ -21,7 +21,7 @@

Faceting is the arrangement of search results into categories based on indexed terms.

Searchers are presented with the indexed terms, along with numerical counts of how many matching documents were found for each term. Faceting makes it easy for users to explore search results, narrowing in on exactly the results they are looking for.

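For example, a minimal faceted request might look like this (the collection and field names are illustrative assumptions):

[source,bash]
----
# Facet on the "cat" field; per-category counts come back under facet_counts
curl "http://localhost:8983/solr/techproducts/select?q=*:*&rows=0&facet=true&facet.field=cat"
----
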
[[Faceting-GeneralParameters]]
== General Parameters

@@ -351,7 +351,7 @@ The `facet.mincount` parameter, the same one as used in field faceting is also a

[NOTE]
====

Range faceting on date fields is a common situation where the <<working-with-dates.adoc#tz,`TZ`>> parameter can be useful to ensure that the "facet counts per day" or "facet counts per month" are based on a meaningful definition of when a given day/month "starts" relative to a particular TimeZone.

For more information, see the examples in the <<working-with-dates.adoc#working-with-dates,Working with Dates>> section.

@@ -27,7 +27,6 @@ A field type definition can include four types of information:

* If the field type is `TextField`, a description of the field analysis for the field type.
* Field type properties - depending on the implementation class, some properties may be mandatory.

== Field Type Definitions in schema.xml

Field types are defined in `schema.xml`. Each field type is defined between `fieldType` elements. They can optionally be grouped within a `types` element. Here is an example of a field type definition for a type called `text_general`:

@@ -91,9 +90,9 @@ For multivalued fields, specifies a distance between multiple values, which prev

`autoGeneratePhraseQueries`:: For text fields. If `true`, Solr automatically generates phrase queries for adjacent terms. If `false`, terms must be enclosed in double-quotes to be treated as phrases.

`enableGraphQueries`::
For text fields, applicable when querying with <<the-standard-query-parser.adoc#TheStandardQueryParser-StandardQueryParserParameters,`sow=false`>>. Use `true` (the default) for field types with query analyzers including graph-aware filters, e.g., <<filter-descriptions.adoc#synonym-graph-filter,Synonym Graph Filter>> and <<filter-descriptions.adoc#word-delimiter-graph-filter,Word Delimiter Graph Filter>>.
+
Use `false` for field types with query analyzers including filters that can match docs when some tokens are missing, e.g., <<filter-descriptions.adoc#shingle-filter,Shingle Filter>>.

[[FieldTypeDefinitionsandProperties-docValuesFormat]]
`docValuesFormat`::

@@ -137,9 +136,8 @@ The default values for each property depend on the underlying `FieldType` class,

// TODO: SOLR-10655 END

== Field Type Similarity

A field type may optionally specify a `<similarity/>` that will be used when scoring documents that refer to fields with this type, as long as the "global" similarity for the collection allows it.

By default, any field type which does not define a similarity uses `BM25Similarity`. For more details, and examples of configuring both global and per-type similarities, please see <<other-schema-elements.adoc#similarity,Other Schema Elements>>.

@@ -27,17 +27,17 @@ The following table lists the field types that are available in Solr. The `org.a

|Class |Description
|BinaryField |Binary data.
|BoolField |Contains either true or false. Values of "1", "t", or "T" in the first character are interpreted as true. Any other values in the first character are interpreted as false.
|CollationField |Supports Unicode collation for sorting and range queries. ICUCollationField is a better choice if you can use ICU4J. See the section <<language-analysis.adoc#unicode-collation,Unicode Collation>>.
|CurrencyField |Deprecated in favor of CurrencyFieldType.
|CurrencyFieldType |Supports currencies and exchange rates. See the section <<working-with-currencies-and-exchange-rates.adoc#working-with-currencies-and-exchange-rates,Working with Currencies and Exchange Rates>>.
|DateRangeField |Supports indexing date ranges, to include point in time date instances as well (single-millisecond durations). See the section <<working-with-dates.adoc#working-with-dates,Working with Dates>> for more detail on using this field type. Consider using this field type even if it's just for date instances, particularly when the queries typically fall on UTC year/month/day/hour, etc., boundaries.
|ExternalFileField |Pulls values from a file on disk. See the section <<working-with-external-files-and-processes.adoc#working-with-external-files-and-processes,Working with External Files and Processes>>.
|EnumField |Allows defining an enumerated set of values which may not be easily sorted by either alphabetic or numeric order (such as a list of severities, for example). This field type takes a configuration file, which lists the proper order of the field values. See the section <<working-with-enum-fields.adoc#working-with-enum-fields,Working with Enum Fields>> for more information.
|ICUCollationField |Supports Unicode collation for sorting and range queries. See the section <<language-analysis.adoc#unicode-collation,Unicode Collation>>.
|LatLonPointSpatialField |<<spatial-search.adoc#spatial-search,Spatial Search>>: a latitude/longitude coordinate pair; possibly multi-valued for multiple points. Usually it's specified as "lat,lon" order with a comma.
|LatLonType |(deprecated) <<spatial-search.adoc#spatial-search,Spatial Search>>: a single-valued latitude/longitude coordinate pair. Usually it's specified as "lat,lon" order with a comma.
|PointType |<<spatial-search.adoc#spatial-search,Spatial Search>>: A single-valued n-dimensional point. It's both for sorting spatial data that is _not_ lat-lon, and for some more rare use-cases. (NOTE: this is _not_ related to the "Point" based numeric fields)
|PreAnalyzedField |Provides a way to send to Solr serialized token streams, optionally with independent stored values of a field, and have this information stored and indexed without any additional text processing. Configuration and usage of PreAnalyzedField is documented on the <<working-with-external-files-and-processes.adoc#the-preanalyzedfield-type,Working with External Files and Processes>> page.
|RandomSortField |Does not contain a value. Queries that sort on this field type will return results in random order. Use a dynamic field to use this feature.
|SpatialRecursivePrefixTreeFieldType |(RPT for short) <<spatial-search.adoc#spatial-search,Spatial Search>>: Accepts latitude comma longitude strings or other shapes in WKT format.
|StrField |String (UTF-8 encoded string or Unicode). Strings are intended for small fields and are _not_ tokenized or analyzed in any way. They have a hard limit of slightly less than 32K.

@@ -50,7 +50,6 @@ The following sections describe the filter factories that are included in this r

For user tips about Solr's filters, see http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters.

== ASCII Folding Filter

This filter converts alphabetic, numeric, and symbolic Unicode characters which are not in the Basic Latin Unicode block (the first 127 ASCII characters) to their ASCII equivalents, if one exists. This filter converts characters from the following Unicode blocks:

@@ -92,10 +91,9 @@ This filter converts alphabetic, numeric, and symbolic Unicode characters which

*Out:* "a" (ASCII character 97)

== Beider-Morse Filter

Implements the Beider-Morse Phonetic Matching (BMPM) algorithm, which allows identification of similar names, even if they are spelled differently or in different languages. More information about how this works is available in the section on <<phonetic-matching.adoc#beider-morse-phonetic-matching-bmpm,Phonetic Matching>>.

[IMPORTANT]
====

@@ -125,10 +123,9 @@ BeiderMorseFilter changed its behavior in Solr 5.0 due to an update to version 3

</analyzer>
----

== Classic Filter

This filter takes the output of the <<tokenizers.adoc#classic-tokenizer,Classic Tokenizer>> and strips periods from acronyms and "'s" from possessives.

*Factory class:* `solr.ClassicFilterFactory`

@@ -150,7 +147,6 @@ This filter takes the output of the <<tokenizers.adoc#Tokenizers-ClassicTokenize

*Out:* "IBM", "cat", "can't"

== Common Grams Filter

This filter creates word shingles by combining common tokens such as stop words with regular tokens. This is useful for creating phrase queries containing common words, such as "the cat." Solr normally ignores stop words in queried phrases, so searching for "the cat" would return all matches for the word "cat."

@@ -181,12 +177,10 @@ This filter creates word shingles by combining common tokens such as stop words

*Out:* "the_cat"

== Collation Key Filter

Collation allows sorting of text in a language-sensitive way. It is usually used for sorting, but can also be used with advanced searches. We've covered this in much more detail in the section on <<language-analysis.adoc#unicode-collation,Unicode Collation>>.

== Daitch-Mokotoff Soundex Filter

Implements the Daitch-Mokotoff Soundex algorithm, which allows identification of similar names, even if they are spelled differently. More information about how this works is available in the section on <<phonetic-matching.adoc#phonetic-matching,Phonetic Matching>>.

@@ -207,7 +201,6 @@ Implements the Daitch-Mokotoff Soundex algorithm, which allows identification of

</analyzer>
----

== Double Metaphone Filter

This filter creates tokens using the http://commons.apache.org/codec/apidocs/org/apache/commons/codec/language/DoubleMetaphone.html[`DoubleMetaphone`] encoding algorithm from commons-codec. For more information, see the <<phonetic-matching.adoc#phonetic-matching,Phonetic Matching>> section.

@@ -260,7 +253,6 @@ Discard original token (`inject="false"`).

Note that "Kuczewski" has two encodings, which are added at the same position.

== Edge N-Gram Filter

This filter generates edge n-gram tokens of sizes within the given range.

@@ -327,7 +319,6 @@ A range of 4 to 6.

*Out:* "four", "scor", "score", "twen", "twent", "twenty"

== English Minimal Stem Filter

This filter stems plural English words to their singular form.

@@ -352,7 +343,6 @@ This filter stems plural English words to their singular form.

*Out:* "dog", "cat"

== English Possessive Filter

This filter removes singular possessives (trailing *'s*) from words. Note that plural possessives, e.g. the *s'* in "divers' snorkels", are not removed by this filter.

@@ -377,7 +367,6 @@ This filter removes singular possessives (trailing *'s*) from words. Note that p

*Out:* "Man", "dog", "bites", "dogs'", "man"

== Fingerprint Filter

This filter outputs a single token which is a concatenation of the sorted and de-duplicated set of input tokens. This can be useful for clustering/linking use cases.

@@ -406,7 +395,6 @@ This filter outputs a single token which is a concatenation of the sorted and de

*Out:* "brown_dog_fox_jumped_lazy_over_quick_the"

== Flatten Graph Filter

This filter must be included on index-time analyzer specifications that include at least one graph-aware filter, including Synonym Graph Filter and Word Delimiter Graph Filter.

@ -417,7 +405,6 @@ This filter must be included on index-time analyzer specifications that include
|
||||||
|
|
||||||
See the examples below for <<Synonym Graph Filter>> and <<Word Delimiter Graph Filter>>.
|
See the examples below for <<Synonym Graph Filter>> and <<Word Delimiter Graph Filter>>.
|
||||||
|
|
||||||
[[FilterDescriptions-HunspellStemFilter]]
== Hunspell Stem Filter

The `Hunspell Stem Filter` provides support for several languages. You must provide the dictionary (`.dic`) and rules (`.aff`) files for each language you wish to use with the Hunspell Stem Filter. You can download those language files http://wiki.services.openoffice.org/wiki/Dictionaries[here].

Be aware that your results will vary widely based on the quality of the provided dictionary and rules files.

*Out:* "jump", "jump", "jump"
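A minimal analyzer sketch using `solr.HunspellStemFilterFactory` (the `en_GB` file names are placeholders for whichever language files you have installed):

[source,xml]
----
<analyzer>
  <tokenizer class="solr.WhitespaceTokenizerFactory"/>
  <filter class="solr.HunspellStemFilterFactory"
          dictionary="en_GB.dic"
          affix="en_GB.aff"
          ignoreCase="true"/>
</analyzer>
----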
[[FilterDescriptions-HyphenatedWordsFilter]]
== Hyphenated Words Filter

This filter reconstructs hyphenated words that have been tokenized as two tokens because of a line break or other intervening whitespace in the field text. If a token ends with a hyphen, it is joined with the following token and the hyphen is discarded.

Note that for this filter to work properly, the upstream tokenizer must not remove trailing hyphen characters.

*Out:* "A", "hyphenated", "word"
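A minimal index-time analyzer sketch using `solr.HyphenatedWordsFilterFactory`; the whitespace tokenizer is used because it preserves the trailing hyphens the filter needs:

[source,xml]
----
<analyzer type="index">
  <tokenizer class="solr.WhitespaceTokenizerFactory"/>
  <filter class="solr.HyphenatedWordsFilterFactory"/>
</analyzer>
----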
[[FilterDescriptions-ICUFoldingFilter]]
== ICU Folding Filter

This filter is a custom Unicode normalization form that applies the foldings specified in http://www.unicode.org/reports/tr30/tr30-4.html[Unicode Technical Report 30] in addition to the `NFKC_Casefold` normalization form as described in <<ICU Normalizer 2 Filter>>. This filter is a better substitute for the combined behavior of the <<ASCII Folding Filter>>, <<Lower Case Filter>>, and <<ICU Normalizer 2 Filter>>.

To use this filter, see `solr/contrib/analysis-extras/README.txt` for instructions on which jars you need to add to your `solr_home/lib`. For more information about adding jars, see the section <<lib-directives-in-solrconfig.adoc#lib-directives-in-solrconfig,Lib Directives in Solrconfig>>.

For detailed information on this normalization form, see http://www.unicode.org/reports/tr30/tr30-4.html.
[[FilterDescriptions-ICUNormalizer2Filter]]
== ICU Normalizer 2 Filter

This filter factory normalizes text according to one of five Unicode Normalization Forms as described in http://unicode.org/reports/tr15/[Unicode Standard Annex #15]:

* NFC (`name="nfc" mode="compose"`): Normalization Form C, canonical decomposition, followed by canonical composition
* NFD (`name="nfc" mode="decompose"`): Normalization Form D, canonical decomposition
* NFKC (`name="nfkc" mode="compose"`): Normalization Form KC, compatibility decomposition, followed by canonical composition
* NFKD (`name="nfkc" mode="decompose"`): Normalization Form KD, compatibility decomposition
* NFKC_Casefold (`name="nfkc_cf" mode="compose"`): Normalization Form KC, with additional Unicode case folding

For detailed information about these Unicode Normalization Forms, see http://unicode.org/reports/tr15/.

To use this filter, see `solr/contrib/analysis-extras/README.txt` for instructions on which jars you need to add to your `solr_home/lib`.
[[FilterDescriptions-ICUTransformFilter]]
== ICU Transform Filter

This filter applies http://userguide.icu-project.org/transforms/general[ICU Transforms] to text. This filter supports only ICU System Transforms. Custom rule sets are not supported.

For detailed information about ICU Transforms, see http://userguide.icu-project.org/transforms/general.

To use this filter, see `solr/contrib/analysis-extras/README.txt` for instructions on which jars you need to add to your `solr_home/lib`.
[[FilterDescriptions-KeepWordFilter]]
== Keep Word Filter

This filter discards all tokens except those that are listed in the given word list. This is the inverse of the Stop Words Filter. This filter can be useful for building specialized indices for a constrained set of terms.

*Example:* Using LowerCaseFilterFactory before filtering for keep words, no `ignoreCase` flag.

*Out:* "happy", "funny"
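A sketch of that configuration, assuming a keep list file named `keepwords.txt` in the `conf` directory:

[source,xml]
----
<analyzer>
  <tokenizer class="solr.StandardTokenizerFactory"/>
  <filter class="solr.LowerCaseFilterFactory"/>
  <filter class="solr.KeepWordFilterFactory" words="keepwords.txt"/>
</analyzer>
----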
[[FilterDescriptions-KStemFilter]]
== KStem Filter

KStem is an alternative to the Porter Stem Filter for developers looking for a less aggressive stemmer. KStem was written by Bob Krovetz, ported to Lucene by Sergio Guzman-Lara (UMASS Amherst). This stemmer is only appropriate for English language text.

*Out:* "jump", "jump", "jump"
[[FilterDescriptions-LengthFilter]]
== Length Filter

This filter passes tokens whose length falls within the min/max limit specified. All other tokens are discarded.

*Out:* "turn", "right"
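A minimal analyzer sketch using `solr.LengthFilterFactory` (the `min` and `max` values shown are illustrative bounds consistent with the output above):

[source,xml]
----
<analyzer>
  <tokenizer class="solr.StandardTokenizerFactory"/>
  <filter class="solr.LengthFilterFactory" min="3" max="7"/>
</analyzer>
----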
[[FilterDescriptions-LimitTokenCountFilter]]
== Limit Token Count Filter

This filter limits the number of accepted tokens, typically useful for index analysis.

By default, this filter ignores any tokens in the wrapped `TokenStream` once the limit has been reached.

*Out:* "1", "2", "3", "4", "5", "6", "7", "8", "9", "10"
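A minimal index-time sketch using `solr.LimitTokenCountFilterFactory`; `maxTokenCount="10"` matches the ten tokens in the output above:

[source,xml]
----
<analyzer type="index">
  <tokenizer class="solr.WhitespaceTokenizerFactory"/>
  <filter class="solr.LimitTokenCountFilterFactory" maxTokenCount="10" consumeAllTokens="false"/>
</analyzer>
----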
[[FilterDescriptions-LimitTokenOffsetFilter]]
== Limit Token Offset Filter

This filter limits tokens to those before a configured maximum start character offset. This can be useful to limit highlighting, for example.

By default, this filter ignores any tokens in the wrapped `TokenStream` once the limit has been reached.

*Out:* "0", "2", "4", "6", "8", "A"
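A minimal analyzer sketch using `solr.LimitTokenOffsetFilterFactory`; `maxStartOffset="10"` is consistent with the output above, where the last emitted token starts at character offset 10:

[source,xml]
----
<analyzer>
  <tokenizer class="solr.WhitespaceTokenizerFactory"/>
  <filter class="solr.LimitTokenOffsetFilterFactory" maxStartOffset="10"/>
</analyzer>
----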
[[FilterDescriptions-LimitTokenPositionFilter]]
== Limit Token Position Filter

This filter limits tokens to those before a configured maximum token position.

By default, this filter ignores any tokens in the wrapped `TokenStream` once the limit has been reached.

*Out:* "1", "2", "3"
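A minimal analyzer sketch using `solr.LimitTokenPositionFilterFactory`; `maxTokenPosition="3"` matches the three tokens in the output above:

[source,xml]
----
<analyzer>
  <tokenizer class="solr.WhitespaceTokenizerFactory"/>
  <filter class="solr.LimitTokenPositionFilterFactory" maxTokenPosition="3"/>
</analyzer>
----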
[[FilterDescriptions-LowerCaseFilter]]
== Lower Case Filter

Converts any uppercase letters in a token to the equivalent lowercase token. All other characters are left unchanged.

*Out:* "down", "with", "camelcase"
[[FilterDescriptions-ManagedStopFilter]]
== Managed Stop Filter

This is a specialized version of the <<Stop Filter,Stop Words Filter Factory>> that uses a set of stop words that are <<managed-resources.adoc#managed-resources,managed from a REST API.>>

*Arguments:*

`managed`:: The name that should be used for this set of stop words in the managed REST API.

*Example:* With this configuration the set of words is named "english" and can be managed via the REST API.

[source,xml]
----
<analyzer>
  <tokenizer class="solr.StandardTokenizerFactory"/>
  <filter class="solr.ManagedStopFilterFactory" managed="english"/>
</analyzer>
----

See <<Stop Filter>> for example input/output.
[[FilterDescriptions-ManagedSynonymFilter]]
== Managed Synonym Filter

This is a specialized version of the <<Synonym Filter>> that uses a mapping on synonyms that is <<managed-resources.adoc#managed-resources,managed from a REST API.>>

.Managed Synonym Filter has been Deprecated
[WARNING]
====
Managed Synonym Filter has been deprecated in favor of Managed Synonym Graph Filter, which is required for multi-term synonym support.
====

*Factory class:* `solr.ManagedSynonymFilterFactory`

For arguments and examples, see the <<Managed Synonym Graph Filter>> below.
[[FilterDescriptions-ManagedSynonymGraphFilter]]
== Managed Synonym Graph Filter

This is a specialized version of the <<Synonym Graph Filter>> that uses a mapping on synonyms that is <<managed-resources.adoc#managed-resources,managed from a REST API.>>

This filter maps single- or multi-token synonyms, producing a fully correct graph output. This filter is a replacement for the Managed Synonym Filter, which produces incorrect graphs for multi-token synonyms.

*Example:* With this configuration the set of mappings is named "english" and can be managed via the REST API.

[source,xml]
----
<analyzer type="index">
  <tokenizer class="solr.StandardTokenizerFactory"/>
  <filter class="solr.ManagedSynonymGraphFilterFactory" managed="english"/>
  <!-- graph-aware filters on index analyzers must be followed by the Flatten Graph Filter -->
  <filter class="solr.FlattenGraphFilterFactory"/>
</analyzer>
----

See <<Managed Synonym Filter>> for example input/output.
[[FilterDescriptions-N-GramFilter]]
== N-Gram Filter

Generates n-gram tokens of sizes in the given range. Note that tokens are ordered by position and then by gram size.

*Example:* A range of 3 to 5.

*Out:* "fou", "four", "our", "sco", "scor", "score", "cor", "core", "ore"
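A sketch of that configuration using `solr.NGramFilterFactory`:

[source,xml]
----
<analyzer>
  <tokenizer class="solr.StandardTokenizerFactory"/>
  <filter class="solr.NGramFilterFactory" minGramSize="3" maxGramSize="5"/>
</analyzer>
----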
[[FilterDescriptions-NumericPayloadTokenFilter]]
== Numeric Payload Token Filter

This filter adds a numeric floating point payload value to tokens that match a given type. Refer to the Javadoc for the `org.apache.lucene.analysis.Token` class for more information about token types and payloads.

*Out:* "bing"[0.75], "bang"[0.75], "boom"[0.75]
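A minimal analyzer sketch using `solr.NumericPayloadTokenFilterFactory`; the `payload` and `typeMatch` values match the output above, where every "word"-typed token receives a payload of 0.75:

[source,xml]
----
<analyzer>
  <tokenizer class="solr.WhitespaceTokenizerFactory"/>
  <filter class="solr.NumericPayloadTokenFilterFactory" payload="0.75" typeMatch="word"/>
</analyzer>
----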
[[FilterDescriptions-PatternReplaceFilter]]
== Pattern Replace Filter

This filter applies a regular expression to each token and, for those that match, substitutes the given replacement string in place of the matched pattern. Tokens which do not match are passed through unchanged.

*Example:* More complex pattern with capture group reference in the replacement. Tokens that start with non-numeric characters and end with digits will have an underscore inserted before the numbers.

*Out:* "cat", "foo_1234", "9987", "blah1234foo"
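A sketch of a configuration consistent with that output (the exact regular expression is illustrative):

[source,xml]
----
<analyzer>
  <tokenizer class="solr.WhitespaceTokenizerFactory"/>
  <filter class="solr.PatternReplaceFilterFactory"
          pattern="([^0-9]+)([0-9]+)$"
          replacement="$1_$2"/>
</analyzer>
----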
[[FilterDescriptions-PhoneticFilter]]
== Phonetic Filter

This filter creates tokens using one of the phonetic encoding algorithms in the `org.apache.commons.codec.language` package. For more information, see the section on <<phonetic-matching.adoc#phonetic-matching,Phonetic Matching>>.

*Example:* Default Soundex encoder.

*Out:* "four"(1), "F600"(1), "score"(2), "S600"(2), "and"(3), "A530"(3), "twenty"(4), "T530"(4)
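A sketch of that configuration; with `inject` left at its default of "true", the original tokens are kept at the same positions as the phonetic codes, as in the output above:

[source,xml]
----
<analyzer>
  <tokenizer class="solr.StandardTokenizerFactory"/>
  <filter class="solr.PhoneticFilterFactory" encoder="Soundex"/>
</analyzer>
----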
[[FilterDescriptions-PorterStemFilter]]
== Porter Stem Filter

This filter applies the Porter Stemming Algorithm for English. The results are similar to using the Snowball Porter Stemmer with the `language="English"` argument. But this stemmer is coded directly in Java and is not based on Snowball. It does not accept a list of protected words and is only appropriate for English language text. However, it has been benchmarked as http://markmail.org/thread/d2c443z63z37rwf6[four times faster] than the English Snowball stemmer, so can provide a performance enhancement.

*Out:* "jump", "jump", "jump"
[[FilterDescriptions-RemoveDuplicatesTokenFilter]]
== Remove Duplicates Token Filter

The filter removes duplicate tokens in the stream. Tokens are considered to be duplicates ONLY if they have the same text and position values.

[[FilterDescriptions-ReversedWildcardFilter]]
== Reversed Wildcard Filter

This filter reverses tokens to provide faster leading wildcard and prefix queries.

*Out:* "oof*", "rab*"
[[FilterDescriptions-ShingleFilter]]
== Shingle Filter

This filter constructs shingles, which are token n-grams, from the token stream. It combines runs of tokens into a single token.

*Example:* A shingle size of four, do not include original token.

*Out:* "To be"(1), "To be or"(1), "To be or not"(1), "be or"(2), "be or not"(2), "be or not to"(2), "or not"(3), "or not to"(3), "or not to be"(3), "not to"(4), "not to be"(4), "to be"(5)
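A sketch of that configuration using `solr.ShingleFilterFactory`:

[source,xml]
----
<analyzer>
  <tokenizer class="solr.StandardTokenizerFactory"/>
  <filter class="solr.ShingleFilterFactory" maxShingleSize="4" outputUnigrams="false"/>
</analyzer>
----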
[[FilterDescriptions-SnowballPorterStemmerFilter]]
== Snowball Porter Stemmer Filter

This filter factory instantiates a language-specific stemmer generated by Snowball. Snowball is a software package that generates pattern-based word stemmers. This type of stemmer is not as accurate as a table-based stemmer, but is faster and less complex. Table-driven stemmers are labor intensive to create and maintain and so are typically commercial products.

*Example:* Spanish stemmer, Spanish words.

*Out:* "cant", "cant"
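A sketch of that configuration using `solr.SnowballPorterFilterFactory`:

[source,xml]
----
<analyzer>
  <tokenizer class="solr.StandardTokenizerFactory"/>
  <filter class="solr.SnowballPorterFilterFactory" language="Spanish"/>
</analyzer>
----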
[[FilterDescriptions-StandardFilter]]
== Standard Filter

This filter removes dots from acronyms and the substring "'s" from the end of tokens. This filter depends on the tokens being tagged with the appropriate term-type to recognize acronyms and words with apostrophes.

[WARNING]
====
This filter is no longer operational in Solr when the `luceneMatchVersion` (in `solrconfig.xml`) is higher than "3.1".
====
[[FilterDescriptions-StopFilter]]
== Stop Filter

This filter discards, or _stops_ analysis of, tokens that are on the given stop words list. A standard stop words list is included in the Solr `conf` directory, named `stopwords.txt`, which is appropriate for typical English language text.

*Example:* Case-sensitive matching, capitalized words not stopped. Token positions skip stopped words.

*Out:* "what"(4)
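A sketch of that case-sensitive configuration (`ignoreCase` defaults to "false"):

[source,xml]
----
<analyzer>
  <tokenizer class="solr.StandardTokenizerFactory"/>
  <filter class="solr.StopFilterFactory" words="stopwords.txt"/>
</analyzer>
----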
[[FilterDescriptions-SuggestStopFilter]]
== Suggest Stop Filter

Like <<Stop Filter>>, this filter discards, or _stops_ analysis of, tokens that are on the given stop words list.

Suggest Stop Filter differs from Stop Filter in that it will not remove the last token unless it is followed by a token separator. For example, a query `"find the"` would preserve the `'the'` since it was not followed by a space, punctuation etc., and mark it as a `KEYWORD` so that following filters will not change or remove it.

By contrast, a query like "`find the popsicle`" would remove '`the`' as a stopword, since it's followed by a space.

*Out:* "the"(2)
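A query-analyzer sketch using `solr.SuggestStopFilterFactory` (the stop words file name is illustrative):

[source,xml]
----
<analyzer type="query">
  <tokenizer class="solr.WhitespaceTokenizerFactory"/>
  <filter class="solr.LowerCaseFilterFactory"/>
  <filter class="solr.SuggestStopFilterFactory"
          ignoreCase="true"
          words="stopwords.txt"
          format="wordset"/>
</analyzer>
----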
[[FilterDescriptions-SynonymFilter]]
== Synonym Filter

This filter does synonym mapping. Each token is looked up in the list of synonyms and if a match is found, then the synonym is emitted in place of the token. The position value of the new tokens are set such that they all occur at the same position as the original token.

[WARNING]
====
Synonym Filter has been deprecated in favor of Synonym Graph Filter, which is required for multi-term synonym support.
====

For arguments and examples, see the <<Synonym Graph Filter>> below.
[[FilterDescriptions-SynonymGraphFilter]]
== Synonym Graph Filter

This filter maps single- or multi-token synonyms, producing a fully correct graph output. This filter is a replacement for the Synonym Filter, which produces incorrect graphs for multi-token synonyms.

*Example:* A synonyms file may contain rules like `small => tiny,teeny,weeny`.

*Out:* "the"(1), "large"(2), "large"(3), "couch"(4), "sofa"(4), "divan"(4)
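An index-time sketch using `solr.SynonymGraphFilterFactory`; note the trailing Flatten Graph Filter, which index analyzers require after graph-aware filters (the synonyms file name is illustrative):

[source,xml]
----
<analyzer type="index">
  <tokenizer class="solr.StandardTokenizerFactory"/>
  <filter class="solr.SynonymGraphFilterFactory" synonyms="mysynonyms.txt"/>
  <filter class="solr.FlattenGraphFilterFactory"/>
</analyzer>
----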
[[FilterDescriptions-TokenOffsetPayloadFilter]]
== Token Offset Payload Filter

This filter adds the numeric character offsets of the token as a payload value for that token.

*Out:* "bing"[0,4], "bang"[5,9], "boom"[10,14]
[[FilterDescriptions-TrimFilter]]
== Trim Filter

This filter trims leading and/or trailing whitespace from tokens. Most tokenizers break tokens at whitespace, so this filter is most often used for special situations.

*Example:* The PatternTokenizerFactory configuration used here splits the input on simple commas, it does not remove whitespace.

*Out:* "one", "two", "three", "four"
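A sketch of that configuration:

[source,xml]
----
<analyzer>
  <tokenizer class="solr.PatternTokenizerFactory" pattern=","/>
  <filter class="solr.TrimFilterFactory"/>
</analyzer>
----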
[[FilterDescriptions-TypeAsPayloadFilter]]
== Type As Payload Filter

This filter adds the token's type, as an encoded byte sequence, as its payload.

*Out:* "Pay"[<ALPHANUM>], "Bob's"[<APOSTROPHE>], "I.O.U."[<ACRONYM>]
[[FilterDescriptions-TypeTokenFilter]]
== Type Token Filter

This filter blacklists or whitelists a specified list of token types, assuming the tokens have type metadata associated with them. For example, the <<tokenizers.adoc#uax29-url-email-tokenizer,UAX29 URL Email Tokenizer>> emits "<URL>" and "<EMAIL>" typed tokens, as well as other types. This filter would allow you to pull out only e-mail addresses from text as tokens, if you wish.

*Factory class:* `solr.TypeTokenFilterFactory`

*Example:*

[source,xml]
----
<analyzer>
  <tokenizer class="solr.UAX29URLEmailTokenizerFactory"/>
  <!-- the types file name is illustrative; it lists one token type (such as <EMAIL>) per line -->
  <filter class="solr.TypeTokenFilterFactory" types="email_type.txt" useWhitelist="true"/>
</analyzer>
----
[[FilterDescriptions-WordDelimiterFilter]]
== Word Delimiter Filter

This filter splits tokens at word delimiters.

[WARNING]
====
Word Delimiter Filter has been deprecated in favor of Word Delimiter Graph Filter, which is required to produce a correct token graph so that e.g. phrase queries can work correctly.
====

For a full description, including arguments and examples, see the <<Word Delimiter Graph Filter>> below.
[[FilterDescriptions-WordDelimiterGraphFilter]]
== Word Delimiter Graph Filter

This filter splits tokens at word delimiters.
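An index-time sketch using `solr.WordDelimiterGraphFilterFactory`; as with other graph-aware filters, index analyzers should follow it with the Flatten Graph Filter:

[source,xml]
----
<analyzer type="index">
  <tokenizer class="solr.WhitespaceTokenizerFactory"/>
  <filter class="solr.WordDelimiterGraphFilterFactory"/>
  <filter class="solr.FlattenGraphFilterFactory"/>
</analyzer>
----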
= Function Queries

Function queries are supported by the <<the-dismax-query-parser.adoc#the-dismax-query-parser,DisMax>>, Extended DisMax, and standard query parsers.

Function queries use _functions_. The functions can be a constant (numeric or string literal), a field, another function or a parameter substitution argument. You can use these functions to modify the ranking of results for users. These could be used to change the ranking of results based on a user's location, or some other calculation.
[[FunctionQueries-UsingFunctionQuery]]
== Using Function Query

Functions must be expressed as function calls (for example, `sum(a,b)` instead of simply `a+b`).

There are several ways of using function queries in a Solr query:

* Via an explicit QParser that expects function arguments, such as <<other-parsers.adoc#function-query-parser,`func`>> or <<other-parsers.adoc#function-range-query-parser,`frange`>>. For example:
+
[source,text]
----
q={!func}div(popularity,price)&fq={!frange l=1000}customer_ratings
----
* Embedded in a regular query, via the `_val_` hook:
+
[source,text]
----
q=_val_:mynumericfield _val_:"recip(rord(myfield),1,2,3)"
----

Only functions with fast random access are recommended.
[[FunctionQueries-AvailableFunctions]]
== Available Functions

The table below summarizes the functions available for function queries.

=== abs Function
Returns the absolute value of the specified value or function.

*Syntax Examples*

* `abs(x)` `abs(-5)`
=== childfield(field) Function
Returns the value of the given field for one of the matched child docs when searching by <<other-parsers.adoc#block-join-parent-query-parser,{!parent}>>. It can be used only in the `sort` parameter.

*Syntax Examples*

* `sort=childfield(name) asc`
=== docfreq(field,val) Function
Returns the number of documents that contain the term in the field. This is a constant (the same value for all documents in the index).

You can quote the term if it's more complex, or do parameter substitution for the term value.

*Syntax Examples*

* `docfreq(text,'solr')`
* `...&defType=func` `&q=docfreq(text,$myterm)&myterm=solr`
[[FunctionQueries-field]]
=== field Function
Returns the numeric docValues or indexed value of the field with the specified name. In its simplest (single argument) form, this function can only be used on single valued fields, and can be called using the name of the field as a string, or for most conventional field names simply use the field name by itself without using the `field(...)` syntax.
=== map Function
`map(x,min,max,target)` maps any values of the function `x` that fall within `min` and `max` inclusive to the specified `target`. If the value of `x` does not fall between `min` and `max`, then either the value of `x` is returned, or a default value is returned if specified as a 5th argument.

=== max Function
Returns the maximum numeric value of multiple nested functions or constants, which are specified as arguments: `max(x,y,...)`. The `max` function can also be useful for "bottoming out" another function or field at some specified constant.

Use the `field(myfield,max)` syntax for <<field Function,selecting the maximum value of a single multivalued field>>.

*Syntax Example*

* `max(myfield,myotherfield,0)`
=== maxdoc Function
Returns the number of documents in the index, including those that are marked as deleted but have not yet been purged. This is a constant (the same value for all documents in the index).

=== min Function
Returns the minimum numeric value of multiple nested functions or constants, which are specified as arguments: `min(x,y,...)`. The `min` function can also be useful for providing an "upper bound" on a function using a constant.

Use the `field(myfield,min)` <<field Function,syntax for selecting the minimum value of a single multivalued field>>.

*Syntax Example*

* `min(myfield,myotherfield,0)`
=== exists Function
Returns `true` if any member of the field exists.

=== Comparison Functions: gt, gte, lt, lte, eq
Comparison functions compare two values and can be combined with the `if` function.

*Syntax Example*

* `if(lt(ms(mydatefield),315569259747),0.8,1)` translates to this pseudocode: `if mydatefield < 315569259747 then 0.8 else 1`
[[FunctionQueries-ExampleFunctionQueries]]
== Example Function Queries

To give you a better understanding of how function queries can be used in Solr, suppose an index stores the dimensions in meters x,y,z of some hypothetical boxes with arbitrary names stored in field `boxname`. Suppose we want to search for a box matching the name `findbox`, but ranked according to the volumes of the boxes. The query parameters would be:

[source,text]
----
http://localhost:8983/solr/collection_name/select?q=boxname:findbox _val_:"product(x,y,z)"&fl=boxname x y z score
----

Suppose that you also have a field storing the weight of the box as `weight`. To rank the results by the density of the boxes, the query would be:

[source,text]
----
http://localhost:8983/solr/collection_name/select?q=boxname:findbox _val_:"div(weight,product(x,y,z))"&fl=boxname x y z weight score
----
[[FunctionQueries-SortByFunction]]
== Sort By Function

You can sort your query results by the output of a function. For example, to sort results by distance, you could enter:
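A sketch of such a request (the collection and point field names are illustrative):

[source,text]
----
http://localhost:8983/solr/collection_name/select?q=*:*&sort=dist(2, point1, point2) asc
----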
= Getting Started with SolrCloud

In this section you will learn how to start a SolrCloud cluster using startup scripts and a specific configset.

[NOTE]
====
This tutorial assumes that you're already familiar with the basics of using Solr. If you need a refresher, please see the <<getting-started.adoc#getting-started,Getting Started section>> to get a grounding in Solr concepts. If you load documents as part of that exercise, you should start over with a fresh Solr installation for these SolrCloud tutorials.
====
[[GettingStartedwithSolrCloud-SolrCloudExample]]
== SolrCloud Example

[[GettingStartedwithSolrCloud-InteractiveStartup]]
=== Interactive Startup

The `bin/solr` script makes it easy to get started with SolrCloud as it walks you through the process of launching Solr nodes in cloud mode and adding a collection. To get started, simply run `bin/solr -e cloud`.
To stop Solr in SolrCloud mode, you would use the `bin/solr` script and issue the stop command:

[source,bash]
----
bin/solr stop -all
----
[[GettingStartedwithSolrCloud-Startingwith-noprompt]]
=== Starting with -noprompt

You can also get SolrCloud started with all the defaults instead of the interactive session using the following command:

[source,bash]
----
bin/solr -e cloud -noprompt
----
[[GettingStartedwithSolrCloud-RestartingNodes]]
=== Restarting Nodes

You can restart your SolrCloud nodes using the `bin/solr` script. For instance, to restart node1 running on port 8983 (with an embedded ZooKeeper server), you would do:

[source,bash]
----
bin/solr restart -c -p 8983 -s example/cloud/node1/solr
----

To restart node2 running on port 7574, you can do:

[source,bash]
----
bin/solr restart -c -p 7574 -z localhost:9983 -s example/cloud/node2/solr
----

Notice that you need to specify the ZooKeeper address (`-z localhost:9983`) when starting node2 so that it can join the cluster with node1.
[[GettingStartedwithSolrCloud-Addinganodetoacluster]]
=== Adding a node to a cluster

Adding a node to an existing cluster is a bit advanced and involves a little more understanding of Solr. Once you startup a SolrCloud cluster using the startup scripts, you can add a new node to it by creating a new Solr home directory, copying an existing `solr.xml` into it, and starting Solr in cloud mode pointing at the cluster's ZooKeeper ensemble.
= Graph Traversal

Graph traversal with streaming expressions uses the `nodes` function to perform a breadth-first graph traversal. The `nodes` function can be combined with the `scoreNodes` function to provide recommendations.

[NOTE]
====
This document assumes a basic understanding of graph terminology and streaming expressions. You can begin exploring graph traversal concepts with this https://en.wikipedia.org/wiki/Graph_traversal[Wikipedia article]. More details about streaming expressions are available in this Guide, in the section <<streaming-expressions.adoc#streaming-expressions,Streaming Expressions>>.
====
[[GraphTraversal-BasicSyntax]]
== Basic Syntax

We'll start with the most basic syntax and slowly build up more complexity. The most basic syntax for `nodes` is:
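A sketch of the shape of the most basic call, using the email graph that the rest of this section's examples are built on:

[source,plain]
----
nodes(emails,
      walk="johndoe@apache.org->from",
      gather="to")
----

This expression searches the `emails` collection for records where the `from` field is "johndoe@apache.org" and gathers the values of the `to` field.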
When scattering both branches and leaves, the level 0 root node is included in the output.
[[GraphTraversal-Aggregations]]
== Aggregations

`nodes` also supports aggregations. For example:
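A sketch of the same email traversal with a `count(*)` aggregation added:

[source,plain]
----
nodes(emails,
      walk="johndoe@apache.org->from",
      gather="to",
      count(*))
----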
Edges are uniqued as part of the traversal so the count will *not* reflect the number of times an edge appears in the underlying index.

The aggregation functions supported are `count(*)`, `sum(field)`, `min(field)`, `max(field)`, and `avg(field)`. The fields being aggregated should be present in the edges collected during the traversal. Later examples (below) will show that aggregations can be a powerful tool for providing recommendations and limiting the scope of traversals.
[[GraphTraversal-Nestingnodesfunctions]]
== Nesting nodes Functions

The `nodes` function can be nested to traverse deeper into the graph. For example:
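A sketch of a two-hop traversal; the inner `nodes` expression provides the roots for the outer one, which walks the `node` field emitted by the inner traversal:

[source,plain]
----
nodes(emails,
      nodes(emails,
            walk="johndoe@apache.org->from",
            gather="to"),
      walk="node->from",
      gather="to")
----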
Put more simply, the inner expression gathers all the people that "\johndoe@apache.org" has emailed. The outer expression then gathers all the people that those people have emailed.

This construct of nesting `nodes` functions is the basic technique for doing a controlled traversal through the graph.
[[GraphTraversal-CycleDetection]]
== Cycle Detection

The `nodes` function performs cycle detection across the entire traversal. This ensures that nodes that have already been visited are not traversed again. Cycle detection is important for both limiting the size of traversals and gathering accurate aggregations. Without cycle detection the size of the traversal could grow exponentially with each hop in the traversal. With cycle detection only new nodes encountered are traversed.

Cycle detection *does not* cross collection boundaries. This is because internally the collection name is part of the node ID. For example the node ID "\johndoe@apache.org" is really `emails/johndoe@apache.org`. When traversing to another collection "\johndoe@apache.org" will be traversed.
[[GraphTraversal-FilteringtheTraversal]]
== Filtering the Traversal

Each level in the traversal can be filtered with a filter query. For example:

[source,plain]
----
nodes(emails,
      walk="johndoe@apache.org->from",
      fq="body:(solr rocks)",
      gather="to")
----

In the example above only emails that match the filter query will be included in the traversal. Any Solr query can be included here. So you can do fun things like <<spatial-search.adoc#spatial-search,geospatial queries>>, apply any of the available <<query-syntax-and-parsing.adoc#query-syntax-and-parsing,query parsers>>, or even write custom query parsers to limit the traversal.
[[GraphTraversal-RootStreams]]
== Root Streams

Any streaming expression can be used to provide the root nodes for a traversal. For example:
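A sketch of a traversal rooted by a `search` expression (the query and parameters are illustrative); note how `walk="to->from"` maps the `to` field from the inner stream to the `from` field in the index:

[source,plain]
----
nodes(emails,
      search(emails, q="body:(solr rocks)", fl="to", sort="score desc", rows="20"),
      walk="to->from",
      gather="to")
----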
The example above provides the root nodes through a search expression. You can also provide the root nodes through any other streaming expression.

Notice that the `walk` parameter maps a field from the tuples generated by the inner stream. In this case it maps the `to` field from the inner stream to the `from` field.
[[GraphTraversal-SkippingHighFrequencyNodes]]
== Skipping High Frequency Nodes

It's often desirable to skip traversing high frequency nodes in the graph. This is similar in nature to a search term stop list. The best way to describe this is through an example use case.

The `nodes` function has the `maxDocFreq` param to allow for filtering out high frequency nodes.
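A sketch of such a traversal over a `logs` collection, matching the walkthrough below (exact query parameters are illustrative):

[source,plain]
----
nodes(logs,
      search(logs, q="userID:user1", fl="articleID", sort="articleID asc", fq="action:view"),
      walk="articleID->articleID",
      gather="userID",
      fq="action:view",
      maxDocFreq="10000",
      count(*))
----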
In the example above, the inner search expression searches the `logs` collection and returns all the articles viewed by "user1". The outer `nodes` expression takes all the articles emitted from the inner search expression and finds all the records in the logs collection for those articles. It then gathers and aggregates the users that have read the articles. The `maxDocFreq` parameter limits the articles returned to those that appear in no more than 10,000 log records (per shard). This guards against returning articles that have been viewed by millions of users.
[[GraphTraversal-TrackingtheTraversal]]
== Tracking the Traversal

By default the `nodes` function only tracks enough information to do cycle detection. This provides enough information to output the nodes and aggregations in the graph.

Setting the `trackTraversal` flag to "true" tells `nodes` to track the details of the traversal:

[source,plain]
----
nodes(emails,
      walk="johndoe@apache.org->from",
      trackTraversal="true",
      gather="to")
----
[[GraphTraversal-Cross-CollectionTraversals]]
== Cross-Collection Traversals

Nested `nodes` functions can operate on different SolrCloud collections. This allows traversals to "walk" from one collection to another to gather nodes. Cycle detection does not cross collection boundaries, so nodes collected in one collection will be traversed in a different collection. This was done deliberately to support cross-collection traversals. Note that the output from a cross-collection traversal will likely contain duplicate nodes with different collection attributes.

[source,plain]
----
nodes(logs,
      nodes(emails,
            search(emails, q="body:(solr rocks)", fl="from", sort="score desc", rows="20"),
            walk="from->from",
            gather="to"),
      walk="node->userID",
      fq="action:edit",
      gather="contentID")
----

The example above finds all people who sent emails with a body that contains "solr rocks". It then finds all the people these people have emailed. Then it traverses to the logs collection and gathers all the content IDs that these people have edited.
[[GraphTraversal-CombiningnodesWithOtherStreamingExpressions]]
== Combining nodes With Other Streaming Expressions

The `nodes` function can act as both a stream source and a stream decorator. The connection with the wider stream expression library provides tremendous power and flexibility when performing graph traversals. Here is an example of using the streaming expression library to intersect two friend networks:
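A sketch of intersecting two friend networks with `intersect` and `sort` (the email addresses follow the examples above):

[source,plain]
----
intersect(sort(by="node asc",
               nodes(emails,
                     walk="johndoe@apache.org->from",
                     gather="to")),
          sort(by="node asc",
               nodes(emails,
                     walk="janedoe@apache.org->from",
                     gather="to")),
          on="node")
----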
The example above gathers two separate friend networks, one rooted with "\johndoe@apache.org" and another rooted with "\janedoe@apache.org". The friend networks are then sorted by the `node` field, and intersected. The resulting node set will be the intersection of the two friend networks.
[[GraphTraversal-SampleUseCases]]
== Sample Use Cases for Graph Traversal

[[GraphTraversal-CalculateMarketBasketCo-occurrence]]
=== Calculate Market Basket Co-occurrence

It is often useful to know which products are most frequently purchased with a particular product. This example uses a simple market basket table (indexed in Solr) to store past shopping baskets. The schema for the table is very simple with each row containing a `basketID` and a `productID`. This can be seen as a graph with each row in the table representing an edge. And it can be traversed very quickly to calculate basket co-occurrence, even when the graph contains billions of edges.
|
||||||
|
|
||||||
In a nutshell this expression finds the products that most frequently co-occur with product "ABC" in past shopping baskets.
|
In a nutshell this expression finds the products that most frequently co-occur with product "ABC" in past shopping baskets.
|
||||||
|
|
||||||
[[GraphTraversal-UsingthescoreNodesFunctiontoMakeaRecommendation]]
=== Using the scoreNodes Function to Make a Recommendation

This use case builds on the market basket example <<Calculate Market Basket Co-occurrence,above>> that calculates which products co-occur most frequently with productID:ABC. The ranked co-occurrence counts provide candidates for a recommendation. The `scoreNodes` function can be used to score the candidates to find the best recommendation.

Before diving into the syntax of the `scoreNodes` function it's useful to understand why the raw co-occurrence counts may not produce the best recommendation. The reason is that raw co-occurrence counts favor items that occur frequently across all baskets. A better recommendation would find the product that has the most significant relationship with productID ABC. The `scoreNodes` function uses a term frequency-inverse document frequency (TF-IDF) algorithm to find the most significant relationship.
[[GraphTraversal-HowItWorks]]
==== How scoreNodes Works

The `scoreNodes` function assigns a score to each node emitted by the nodes expression. By default the `scoreNodes` function uses the `count(*)` aggregation, which is the co-occurrence count, as the TF value. The IDF value for each node is fetched from the collection where the node was gathered. Each node is then scored using the TF*IDF formula, which provides a boost to nodes with a lower frequency across all market baskets.
@ -394,8 +380,7 @@ Combining the co-occurrence count with the IDF provides a score that shows how i

The `scoreNodes` function adds the score to each node in the `nodeScore` field.

[[GraphTraversal-ExampleSyntax]]
==== *Example Syntax*
==== Example scoreNodes Syntax

[source,plain]
----

@ -417,7 +402,6 @@ This example builds on the earlier example "Calculate market basket co-occurrenc

. The `scoreNodes` function then assigns a score to the candidates based on the TF*IDF of each node.
. The outer `top` expression selects the highest scoring node. This is the recommendation (a full sketch follows below).

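Putting those pieces together, a request along these lines would run the full recommendation. This is a minimal sketch, not the original example: the `baskets` collection name is an assumption, while the `basketID` and `productID` fields and product "ABC" come from the market basket walkthrough above.

[source,bash]
----
# Hedged sketch: gather co-occurrence candidates for product "ABC", score them
# with TF*IDF via scoreNodes, and keep the single best node. The collection
# name "baskets" is an assumption.
curl --data-urlencode 'expr=top(n="1",
        sort="nodeScore desc",
        scoreNodes(nodes(baskets,
                         search(baskets, q="productID:ABC", fl="basketID", sort="basketID asc"),
                         walk="basketID->basketID",
                         fq="-productID:ABC",
                         gather="productID",
                         count(*))))' \
     http://localhost:8983/solr/baskets/stream
----
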
[[GraphTraversal-RecommendContentBasedonCollaborativeFilter]]
=== Recommend Content Based on Collaborative Filter

In this example we'll recommend content for a user based on a collaborative filter. This recommendation is made using log records that contain the `userID` and `articleID` and the action performed. In this scenario each log record can be viewed as an edge in a graph. The userID and articleID are the nodes and the action is an edge property used to filter the traversal.

@ -458,7 +442,6 @@ Note that it skips high frequency nodes using the `maxDocFreq` param to filter o

Any article selected in step 1 (user1's reading list) will not appear in this step due to cycle detection. So this step returns the articles read by the users with the most similar reading habits to "user1" that "user1" has not read yet. It also counts the number of times each article has been read across this user group.

. The outer `top` expression takes the top articles emitted from step 4. This is the recommendation (a compressed sketch follows below).

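A compressed sketch of the whole traversal could look like the following; the `logs` collection name and the numeric limits are assumptions, while the field names, the `read` action filter, and the `maxDocFreq` usage come from the walkthrough above.

[source,bash]
----
# Hedged sketch: find users whose reading overlaps user1's, then gather and
# rank the articles they read that user1 has not. "logs" and the limits are
# assumptions.
curl --data-urlencode 'expr=top(n="5",
        sort="count(*) desc",
        nodes(logs,
              top(n="30",
                  sort="count(*) desc",
                  nodes(logs,
                        search(logs, q="userID:user1 AND action:read", fl="articleID", sort="articleID asc"),
                        walk="articleID->articleID",
                        fq="action:read",
                        maxDocFreq="10000",
                        gather="userID",
                        count(*))),
              walk="node->userID",
              fq="action:read",
              gather="articleID",
              count(*)))' \
     http://localhost:8983/solr/logs/stream
----
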

[[GraphTraversal-ProteinPathwayTraversal]]
=== Protein Pathway Traversal

In recent years, scientists have become increasingly able to rationally design drugs that target the mutated proteins, called oncogenes, responsible for some cancers. Proteins typically act through long chains of chemical interactions between multiple proteins, called pathways, and, while the oncogene in the pathway may not have a corresponding drug, another protein in the pathway may. Graph traversal on a protein collection that records protein interactions and drugs may yield possible candidates. (Thanks to Lewis Geer of the NCBI for providing this example.)

@ -481,7 +464,6 @@ Let's break down exactly what this traversal is doing.

. The outer `nodes` expression also works with the `proteins` collection. It gathers all the drugs that correspond to proteins emitted from step 1.
. Using this stepwise approach you can gather the drugs along the pathway of interactions any number of steps away from the root protein (an illustrative sketch follows below).

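As an illustration only, a two-step traversal of this kind could be shaped as below. Everything here is an assumption for the sketch: the `proteins` collection layout, the `protein`, `interacts_with`, and `drug` field names, and the root protein used as the starting query.

[source,bash]
----
# Hedged sketch over an assumed schema: the inner nodes() gathers proteins
# that interact with the root protein; the outer nodes() gathers the drugs
# recorded for those proteins.
curl --data-urlencode 'expr=nodes(proteins,
        nodes(proteins,
              search(proteins, q="protein:NRAS", fl="protein", sort="protein asc"),
              walk="protein->interacts_with",
              gather="protein"),
        walk="node->protein",
        gather="drug")' \
     http://localhost:8983/solr/proteins/stream
----
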
[[GraphTraversal-ExportingGraphMLtoSupportGraphVisualization]]
== Exporting GraphML to Support Graph Visualization

In the examples above, the `nodes` expression was sent to Solr's `/stream` handler like any other streaming expression. This approach outputs the nodes in the same JSON tuple format as other streaming expressions so that it can be treated like any other streaming expression. You can use the `/stream` handler when you need to operate directly on the tuples, such as in the recommendation use cases above.

@ -496,8 +478,7 @@ There are a few things to keep mind when exporting a graph in GraphML:

. The `/graph` handler currently accepts an arbitrarily complex streaming expression which includes a `nodes` expression. If the streaming expression doesn't include a `nodes` expression, the `/graph` handler will not properly output GraphML.
. The `/graph` handler currently accepts a single arbitrarily complex, nested `nodes` expression per request. This means you cannot send in a streaming expression that joins or intersects the node sets from multiple `nodes` expressions. The `/graph` handler does support any level of nesting within a single `nodes` expression. The `/stream` handler does support joining and intersecting node sets, but the `/graph` handler currently does not.

[[GraphTraversal-SampleRequest]]
=== Sample Request
=== Sample GraphML Request

[source,bash]
----
@ -512,7 +493,6 @@ curl --data-urlencode 'expr=nodes(enron_emails,
gather="to")' http://localhost:8983/solr/enron_emails/graph
----

[[GraphTraversal-SampleGraphMLOutput]]
=== Sample GraphML Output

[source,xml]

@ -30,7 +30,7 @@ For some of the authentication schemes (e.g., Kerberos), Solr provides a native

There are two plugin classes:

* `HadoopAuthPlugin`: This can be used with standalone Solr as well as SolrCloud with <<authentication-and-authorization-plugins.adoc#AuthenticationandAuthorizationPlugins-PKI,PKI authentication>> for internode communication.
* `HadoopAuthPlugin`: This can be used with standalone Solr as well as SolrCloud with <<authentication-and-authorization-plugins.adoc#securing-inter-node-requests,PKI authentication>> for internode communication.
* `ConfigurableInternodeAuthHadoopPlugin`: This is an extension of HadoopAuthPlugin that allows you to configure the authentication scheme for internode communication.

[TIP]
@ -38,7 +38,6 @@ There are two plugin classes:

For most SolrCloud or standalone Solr setups, the `HadoopAuthPlugin` should suffice.
====

[[HadoopAuthenticationPlugin-PluginConfiguration]]
== Plugin Configuration

`class`::
@ -70,11 +69,8 @@ Configures proxy users for the underlying Hadoop authentication mechanism. This

`clientBuilderFactory`::
The `HttpClientBuilderFactory` implementation used for the Solr internal communication. Only applicable for `ConfigurableInternodeAuthHadoopPlugin`.

[[HadoopAuthenticationPlugin-ExampleConfigurations]]
== Example Configurations

[[HadoopAuthenticationPlugin-KerberosAuthenticationusingHadoopAuthenticationPlugin]]
=== Kerberos Authentication using Hadoop Authentication Plugin

This example lets you configure Solr to use Kerberos Authentication, similar to how you would use the <<kerberos-authentication-plugin.adoc#kerberos-authentication-plugin,Kerberos Authentication Plugin>>.

@ -105,7 +101,6 @@ To setup this plugin, use the following in your `security.json` file.

}
----

[[HadoopAuthenticationPlugin-SimpleAuthenticationwithDelegationTokens]]
=== Simple Authentication with Delegation Tokens

Similar to the previous example, this is an example of setting up a Solr cluster that uses delegation tokens. Refer to the parameters in the Hadoop authentication library's https://hadoop.apache.org/docs/stable/hadoop-auth/Configuration.html[documentation] or to the section <<kerberos-authentication-plugin.adoc#kerberos-authentication-plugin,Kerberos Authentication Plugin>> for further details. Please note that this example does not use Kerberos and the requests made to Solr must contain valid delegation tokens.

@ -24,7 +24,6 @@ The fragments are included in a special section of the query response (the `high

Highlighting is extremely configurable, perhaps more than any other part of Solr. There are many parameters each for fragment sizing, formatting, ordering, backup/alternate behavior, and more options that are hard to categorize. Nonetheless, highlighting is very simple to use.

[[Highlighting-Usage]]
== Usage

=== Common Highlighter Parameters

@ -36,7 +35,7 @@ Use this parameter to enable or disable highlighting. The default is `false`. If

`hl.method`::
The highlighting implementation to use. Acceptable values are: `unified`, `original`, `fastVector`. The default is `original`.
+
See the <<Highlighting-ChoosingaHighlighter,Choosing a Highlighter>> section below for more details on the differences between the available highlighters.
See the <<Choosing a Highlighter>> section below for more details on the differences between the available highlighters.

`hl.fl`::
Specifies a list of fields to highlight. Accepts a comma- or space-delimited list of fields for which Solr should generate highlighted snippets.

@ -92,7 +91,6 @@ The default is `51200` characters.

More parameters are supported as well, depending on the highlighter chosen (via `hl.method`).

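For example, a minimal request that turns highlighting on might look like this; the collection name and query terms are assumptions for illustration, while the parameters are the ones described above.

[source,bash]
----
# Hedged sketch: highlight query matches in the "manu" and "name" fields using
# the unified implementation. Collection and query are assumptions.
curl "http://localhost:8983/solr/techproducts/select?q=apple&hl=true&hl.fl=manu,name&hl.method=unified"
----
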
[[Highlighting-HighlightingintheQueryResponse]]
=== Highlighting in the Query Response

In the response to a query, Solr includes highlighting data in a section separate from the documents. It is up to a client to determine how to process this response and display the highlights to users.

@ -136,7 +134,6 @@ Note the two sections `docs` and `highlighting`. The `docs` section contains the

The `highlighting` section includes the ID of each document, and the field that contains the highlighted portion. In this example, we used the `hl.fl` parameter to say we wanted query terms highlighted in the "manu" field. When there is a match to the query term in that field, it will be included for each document ID in the list.

[[Highlighting-ChoosingaHighlighter]]
== Choosing a Highlighter

Solr provides a `HighlightComponent` (a `SearchComponent`) and it's in the default list of components for search handlers. It offers a somewhat unified API over multiple actual highlighting implementations (or simply "highlighters") that do the business of highlighting.

@ -173,7 +170,6 @@ The Unified Highlighter is exclusively configured via search parameters. In cont

In addition to the information below, more detail can be found in the {solr-javadocs}/solr-core/org/apache/solr/highlight/package-summary.html[Solr javadocs].

[[Highlighting-SchemaOptionsandPerformanceConsiderations]]
=== Schema Options and Performance Considerations

Fundamental to the internals of highlighting is detecting the _offsets_ of the individual words that match the query. Some of the highlighters can run the stored text through the analysis chain defined in the schema, some can look them up from _postings_, and some can look them up from _term vectors._ These choices have different trade-offs:

@ -198,7 +194,6 @@ This is definitely the fastest option for highlighting wildcard queries on large

+
This adds substantial weight to the index – similar in size to the compressed stored text. If you are using the Unified Highlighter then this is not a recommended configuration since it's slower and heavier than postings with light term vectors. However, this could make sense if full term vectors are already needed for another use-case.

[[Highlighting-TheUnifiedHighlighter]]
== The Unified Highlighter

The Unified Highlighter supports the following parameters in addition to those listed earlier:

@ -243,7 +238,6 @@ Indicates which character to break the text on. Use only if you have defined `hl

This is useful when the text has already been manipulated in advance to have a special delineation character at desired highlight passage boundaries. This character will still appear in the text as the last character of a passage.


[[Highlighting-TheOriginalHighlighter]]
== The Original Highlighter

The Original Highlighter supports the following parameters in addition to those listed earlier:

@ -314,7 +308,6 @@ If this may happen and you know you don't need them for highlighting (i.e. your

The Original Highlighter has a plugin architecture that enables new functionality to be registered in `solrconfig.xml`. The "```techproducts```" configset shows most of these settings explicitly. You can use it as a guide to provide your own components, including a `SolrFormatter`, `SolrEncoder`, and `SolrFragmenter`.

[[Highlighting-TheFastVectorHighlighter]]
== The FastVector Highlighter

The FastVector Highlighter (FVH) can be used in conjunction with the Original Highlighter if not all fields should be highlighted with the FVH. In such a mode, set `hl.method=original` and `f.yourTermVecField.hl.method=fastVector` for all fields that should use the FVH. One annoyance to keep in mind is that the Original Highlighter uses `hl.simple.pre` whereas the FVH (and other highlighters) use `hl.tag.pre`.

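A quick sketch of such a mixed setup (collection, query, and field names are assumptions):

[source,bash]
----
# Hedged sketch: Original Highlighter overall, FVH for one field with term
# vectors. All names other than the hl.* parameters are assumptions.
curl "http://localhost:8983/solr/techproducts/select?q=memory&hl=true&hl.fl=manu,features&hl.method=original&f.features.hl.method=fastVector"
----
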
@ -349,15 +342,12 @@ The maximum number of phrases to analyze when searching for the highest-scoring

`hl.multiValuedSeparatorChar`::
Text to use to separate one value from the next for a multi-valued field. The default is " " (a space).


[[Highlighting-UsingBoundaryScannerswiththeFastVectorHighlighter]]
=== Using Boundary Scanners with the FastVector Highlighter

The FastVector Highlighter will occasionally truncate highlighted words. To prevent this, implement a boundary scanner in `solrconfig.xml`, then use the `hl.boundaryScanner` parameter to specify the boundary scanner for highlighting.

Solr supports two boundary scanners: `breakIterator` and `simple`.

[[Highlighting-ThebreakIteratorBoundaryScanner]]
==== The breakIterator Boundary Scanner

The `breakIterator` boundary scanner offers excellent performance right out of the box by taking locale and boundary type into account. In most cases you will want to use the `breakIterator` boundary scanner. To implement the `breakIterator` boundary scanner, add this code to the `highlighting` section of your `solrconfig.xml` file, adjusting the type, language, and country values as appropriate to your application:

@ -375,7 +365,6 @@ The `breakIterator` boundary scanner offers excellent performance right out of t

Possible values for the `hl.bs.type` parameter are WORD, LINE, SENTENCE, and CHARACTER.

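Once the scanner is configured, the same knob can also be supplied per request; a hedged sketch (names outside the `hl.*` parameters are assumptions):

[source,bash]
----
# Hedged sketch: ask the FastVector Highlighter to break passages on sentence
# boundaries via the breakIterator scanner.
curl "http://localhost:8983/solr/techproducts/select?q=apple&hl=true&hl.fl=features&hl.method=fastVector&hl.boundaryScanner=breakIterator&hl.bs.type=SENTENCE"
----
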
[[Highlighting-ThesimpleBoundaryScanner]]
==== The simple Boundary Scanner

The `simple` boundary scanner scans term boundaries for a specified maximum character value (`hl.bs.maxScan`) and for common delimiters such as punctuation marks (`hl.bs.chars`). The `simple` boundary scanner may be useful for some custom use cases. To implement the `simple` boundary scanner, add this code to the `highlighting` section of your `solrconfig.xml` file, adjusting the values as appropriate to your application:

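As with the `breakIterator` scanner, the `simple` scanner's parameters can also be exercised per request; another hedged sketch (names outside the `hl.*` parameters are assumptions):

[source,bash]
----
# Hedged sketch: select the simple boundary scanner and tune its scan window
# and delimiter set per request.
curl 'http://localhost:8983/solr/techproducts/select?q=apple&hl=true&hl.fl=features&hl.method=fastVector&hl.boundaryScanner=simple&hl.bs.maxScan=10&hl.bs.chars=.,!?'
----
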
@ -27,13 +27,11 @@ The following sections cover provide general information about how various SolrC

If you are already familiar with SolrCloud concepts and basic functionality, you can skip to the section covering <<solrcloud-configuration-and-parameters.adoc#solrcloud-configuration-and-parameters,SolrCloud Configuration and Parameters>>.

[[HowSolrCloudWorks-KeySolrCloudConcepts]]
== Key SolrCloud Concepts

A SolrCloud cluster consists of some "logical" concepts layered on top of some "physical" concepts.

[[HowSolrCloudWorks-Logical]]
=== Logical
=== Logical Concepts

* A Cluster can host multiple Collections of Solr Documents.
* A collection can be partitioned into multiple Shards, which contain a subset of the Documents in the Collection.

@ -41,8 +39,7 @@ A SolrCloud cluster consists of some "logical" concepts layered on top of some "

** The theoretical limit to the number of Documents that Collection can reasonably contain.
** The amount of parallelization that is possible for an individual search request.

[[HowSolrCloudWorks-Physical]]
=== Physical
=== Physical Concepts

* A Cluster is made up of one or more Solr Nodes, which are running instances of the Solr server process.
* Each Node can host multiple Cores.

@ -20,7 +20,6 @@

Solr ships with many out-of-the-box RequestHandlers, which are called implicit because they are not configured in `solrconfig.xml`.

[[ImplicitRequestHandlers-ListofImplicitlyAvailableEndpoints]]
== List of Implicitly Available Endpoints

// TODO: Change column width to %autowidth.spread when https://github.com/asciidoctor/asciidoctor-pdf/issues/599 is fixed

@ -44,19 +43,18 @@ Solr ships with many out-of-the-box RequestHandlers, which are called implicit b

|`/debug/dump` |{solr-javadocs}/solr-core/org/apache/solr/handler/DumpRequestHandler.html[DumpRequestHandler] |`_DEBUG_DUMP` |Echo the request contents back to the client.
|<<exporting-result-sets.adoc#exporting-result-sets,`/export`>> |{solr-javadocs}/solr-core/org/apache/solr/handler/component/SearchHandler.html[SearchHandler] |`_EXPORT` |Export full sorted result sets.
|<<realtime-get.adoc#realtime-get,`/get`>> |{solr-javadocs}/solr-core/org/apache/solr/handler/RealTimeGetHandler.html[RealTimeGetHandler] |`_GET` |Real-time get: low-latency retrieval of the latest version of a document.
|<<graph-traversal.adoc#GraphTraversal-ExportingGraphMLtoSupportGraphVisualization,`/graph`>> |{solr-javadocs}/solr-core/org/apache/solr/handler/GraphHandler.html[GraphHandler] |`_ADMIN_GRAPH` |Return http://graphml.graphdrawing.org/[GraphML] formatted output from a <<graph-traversal.adoc#graph-traversal,`gather` `Nodes` streaming expression>>.
|<<graph-traversal.adoc#exporting-graphml-to-support-graph-visualization,`/graph`>> |{solr-javadocs}/solr-core/org/apache/solr/handler/GraphHandler.html[GraphHandler] |`_ADMIN_GRAPH` |Return http://graphml.graphdrawing.org/[GraphML] formatted output from a <<graph-traversal.adoc#graph-traversal,`gather` `Nodes` streaming expression>>.
|<<index-replication.adoc#index-replication,`/replication`>> |{solr-javadocs}/solr-core/org/apache/solr/handler/ReplicationHandler.html[ReplicationHandler] |`_REPLICATION` |Replicate indexes for SolrCloud recovery and Master/Slave index distribution.
|<<schema-api.adoc#schema-api,`/schema`>> |{solr-javadocs}/solr-core/org/apache/solr/handler/SchemaHandler.html[SchemaHandler] |`_SCHEMA` |Retrieve/modify Solr schema.
|<<parallel-sql-interface.adoc#sql-request-handler,`/sql`>> |{solr-javadocs}/solr-core/org/apache/solr/handler/SQLHandler.html[SQLHandler] |`_SQL` |Front end of the Parallel SQL interface.
|<<streaming-expressions.adoc#StreamingExpressions-StreamingRequestsandResponses,`/stream`>> |{solr-javadocs}/solr-core/org/apache/solr/handler/StreamHandler.html[StreamHandler] |`_STREAM` |Distributed stream processing.
|<<streaming-expressions.adoc#streaming-requests-and-responses,`/stream`>> |{solr-javadocs}/solr-core/org/apache/solr/handler/StreamHandler.html[StreamHandler] |`_STREAM` |Distributed stream processing.
|<<the-terms-component.adoc#TheTermsComponent-UsingtheTermsComponentinaRequestHandler,`/terms`>> |{solr-javadocs}/solr-core/org/apache/solr/handler/component/SearchHandler.html[SearchHandler] |`_TERMS` |Return a field's indexed terms and the number of documents containing each term.
|<<the-terms-component.adoc#using-the-terms-component-in-a-request-handler,`/terms`>> |{solr-javadocs}/solr-core/org/apache/solr/handler/component/SearchHandler.html[SearchHandler] |`_TERMS` |Return a field's indexed terms and the number of documents containing each term.
|<<uploading-data-with-index-handlers.adoc#uploading-data-with-index-handlers,`/update`>> |{solr-javadocs}/solr-core/org/apache/solr/handler/UpdateRequestHandler.html[UpdateRequestHandler] |`_UPDATE` |Add, delete and update indexed documents formatted as SolrXML, CSV, SolrJSON or javabin.
|<<uploading-data-with-index-handlers.adoc#UploadingDatawithIndexHandlers-CSVUpdateConveniencePaths,`/update/csv`>> |{solr-javadocs}/solr-core/org/apache/solr/handler/UpdateRequestHandler.html[UpdateRequestHandler] |`_UPDATE_CSV` |Add and update CSV-formatted documents.
|<<uploading-data-with-index-handlers.adoc#csv-update-convenience-paths,`/update/csv`>> |{solr-javadocs}/solr-core/org/apache/solr/handler/UpdateRequestHandler.html[UpdateRequestHandler] |`_UPDATE_CSV` |Add and update CSV-formatted documents.
|<<uploading-data-with-index-handlers.adoc#UploadingDatawithIndexHandlers-CSVUpdateConveniencePaths,`/update/json`>> |{solr-javadocs}/solr-core/org/apache/solr/handler/UpdateRequestHandler.html[UpdateRequestHandler] |`_UPDATE_JSON` |Add, delete and update SolrJSON-formatted documents.
|<<uploading-data-with-index-handlers.adoc#csv-update-convenience-paths,`/update/json`>> |{solr-javadocs}/solr-core/org/apache/solr/handler/UpdateRequestHandler.html[UpdateRequestHandler] |`_UPDATE_JSON` |Add, delete and update SolrJSON-formatted documents.
|<<transforming-and-indexing-custom-json.adoc#transforming-and-indexing-custom-json,`/update/json/docs`>> |{solr-javadocs}/solr-core/org/apache/solr/handler/UpdateRequestHandler.html[UpdateRequestHandler] |`_UPDATE_JSON_DOCS` |Add and update custom JSON-formatted documents.
|===

[[ImplicitRequestHandlers-HowtoViewtheConfiguration]]
== How to View the Configuration

You can see configuration for all request handlers, including the implicit request handlers, via the <<config-api.adoc#config-api,Config API>>. E.g. for the `gettingstarted` collection:

@ -71,7 +69,6 @@ To include the expanded paramset in the response, as well as the effective param

`curl "http://localhost:8983/solr/gettingstarted/config/requestHandler?componentName=/export&expandParams=true"`

[[ImplicitRequestHandlers-HowtoEdittheConfiguration]]
== How to Edit the Configuration

Because implicit request handlers are not present in `solrconfig.xml`, configuration of their associated `default`, `invariant` and `appends` parameters may be edited via the <<request-parameters-api.adoc#request-parameters-api,Request Parameters API>> using the paramset listed in the above table. However, other parameters, including SearchHandler components, may not be modified. The invariants and appends specified in the implicit configuration cannot be overridden.

@ -26,7 +26,6 @@ The figure below shows a Solr configuration using index replication. The master

image::images/index-replication/worddav2b7e14725d898b4104cdd9c502fc77cd.png[image,width=159,height=235]

[[IndexReplication-IndexReplicationinSolr]]
== Index Replication in Solr

Solr includes a Java implementation of index replication that works over HTTP:

@ -46,7 +45,6 @@ Although there is no explicit concept of "master/slave" nodes in a <<solrcloud.a

When using SolrCloud, the `ReplicationHandler` must be available via the `/replication` path. Solr does this implicitly unless overridden explicitly in your `solrconfig.xml`, but if you wish to override the default behavior, make certain that you do not explicitly set any of the "master" or "slave" configuration options mentioned below, or they will interfere with normal SolrCloud operation.
====

[[IndexReplication-ReplicationTerminology]]
== Replication Terminology

The table below defines the key terms associated with Solr replication.

@ -79,15 +77,13 @@ Snapshot::

A directory containing hard links to the data files of an index. Snapshots are distributed from the master nodes when the slaves pull them, "smart copying" any segments the slave node does not already have; the snapshot directory contains the hard links to the most recent index data files.

[[IndexReplication-ConfiguringtheReplicationHandler]]
== Configuring the ReplicationHandler

In addition to `ReplicationHandler` configuration options specific to the master/slave roles, there are a few special configuration options that are generally supported (even when using SolrCloud).

* `maxNumberOfBackups`: an integer value dictating the maximum number of backups this node will keep on disk as it receives `backup` commands (see the sketch below).
* Similar to most other request handlers in Solr you may configure a set of <<requesthandlers-and-searchcomponents-in-solrconfig.adoc#RequestHandlersandSearchComponentsinSolrConfig-SearchHandlers,defaults, invariants, and/or appends>> parameters corresponding with any request parameters supported by the `ReplicationHandler` when <<IndexReplication-HTTPAPICommandsfortheReplicationHandler,processing commands>>.
* Similar to most other request handlers in Solr you may configure a set of <<requesthandlers-and-searchcomponents-in-solrconfig.adoc#searchhandlers,defaults, invariants, and/or appends>> parameters corresponding with any request parameters supported by the `ReplicationHandler` when <<HTTP API Commands for the ReplicationHandler,processing commands>>.

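For example, a backup can be triggered through the handler's HTTP API (the core name here is an assumption); with `maxNumberOfBackups` set, backups beyond that limit are removed as new ones arrive.

[source,bash]
----
# Hedged sketch: trigger a backup via the ReplicationHandler. "core1" is an
# assumed core name.
curl "http://localhost:8983/solr/core1/replication?command=backup"
----
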
[[IndexReplication-ConfiguringtheReplicationRequestHandleronaMasterServer]]
=== Configuring the Replication RequestHandler on a Master Server

Before running a replication, you should set the following parameters on initialization of the handler:

@ -125,7 +121,6 @@ The example below shows a possible 'master' configuration for the `ReplicationHa

</requestHandler>
----

[[IndexReplication-Replicatingsolrconfig.xml]]
==== Replicating solrconfig.xml

In the configuration file on the master server, include a line like the following:

@ -139,7 +134,6 @@ This ensures that the local configuration `solrconfig_slave.xml` will be saved a

On the master server, the file name of the slave configuration file can be anything, as long as the name is correctly identified in the `confFiles` string; then it will be saved as whatever file name appears after the colon ':'.

[[IndexReplication-ConfiguringtheReplicationRequestHandleronaSlaveServer]]
=== Configuring the Replication RequestHandler on a Slave Server

The code below shows how to configure a ReplicationHandler on a slave.

@ -188,7 +182,6 @@ The code below shows how to configure a ReplicationHandler on a slave.

</requestHandler>
----

[[IndexReplication-SettingUpaRepeaterwiththeReplicationHandler]]
== Setting Up a Repeater with the ReplicationHandler

A master may be able to serve only so many slaves without affecting performance. Some organizations have deployed slave servers across multiple data centers. If each slave downloads the index from a remote data center, the resulting download may consume too much network bandwidth. To avoid performance degradation in cases like this, you can configure one or more slaves as repeaters. A repeater is simply a node that acts as both a master and a slave.

@ -213,7 +206,6 @@ Here is an example of a ReplicationHandler configuration for a repeater:

</requestHandler>
----

[[IndexReplication-CommitandOptimizeOperations]]
== Commit and Optimize Operations

When a commit or optimize operation is performed on the master, the RequestHandler reads the list of file names which are associated with each commit point. This relies on the `replicateAfter` parameter in the configuration to decide which types of events should trigger replication.

@ -233,7 +225,6 @@ The `replicateAfter` parameter can accept multiple arguments. For example:

<str name="replicateAfter">optimize</str>
----

[[IndexReplication-SlaveReplication]]
== Slave Replication

The master is totally unaware of the slaves.

@ -246,7 +237,6 @@ The slave continuously keeps polling the master (depending on the `pollInterval`

* After the download completes, all the new files are moved to the live index directory and each file's timestamp is the same as its counterpart on the master.
* A commit command is issued on the slave by the Slave's ReplicationHandler and the new index is loaded.

[[IndexReplication-ReplicatingConfigurationFiles]]
=== Replicating Configuration Files

To replicate configuration files, list them using the `confFiles` parameter. Only files found in the `conf` directory of the master's Solr instance will be replicated.

@ -259,7 +249,6 @@ As a precaution when replicating configuration files, Solr copies configuration

If a replication involved downloading of at least one configuration file, the ReplicationHandler issues a core-reload command instead of a commit command.

[[IndexReplication-ResolvingCorruptionIssuesonSlaveServers]]
=== Resolving Corruption Issues on Slave Servers

If documents are added to the slave, then the slave is no longer in sync with its master. However, the slave will not undertake any action to put itself in sync, until the master has new index data.

@ -268,7 +257,6 @@ When a commit operation takes place on the master, the index version of the mast

To correct this problem, the slave then copies all the index files from master to a new index directory and asks the core to load the fresh index from the new directory.

[[IndexReplication-HTTPAPICommandsfortheReplicationHandler]]
== HTTP API Commands for the ReplicationHandler

You can use the HTTP commands below to control the ReplicationHandler's operations.

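As a brief sketch (core name assumed), commands are passed via the `command` request parameter:

[source,bash]
----
# Hedged sketch: report replication details, then force this slave to fetch
# the latest index from its master. "core1" is an assumed core name.
curl "http://localhost:8983/solr/core1/replication?command=details"
curl "http://localhost:8983/solr/core1/replication?command=fetchindex"
----
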
@ -355,7 +343,6 @@ There are two supported parameters:

* `location`: Location where the snapshot is created.

[[IndexReplication-DistributionandOptimization]]
== Distribution and Optimization

Optimizing an index is not something most users should generally worry about, but users should be aware of the impacts of optimizing an index when using the `ReplicationHandler`.

@ -29,10 +29,8 @@ By default, the settings are commented out in the sample `solrconfig.xml` includ

</indexConfig>
----

[[IndexConfiginSolrConfig-WritingNewSegments]]
== Writing New Segments

[[IndexConfiginSolrConfig-ramBufferSizeMB]]
=== ramBufferSizeMB

Once accumulated document updates exceed this much memory space (defined in megabytes), then the pending updates are flushed. This can also create new segments or trigger a merge. Using this setting is generally preferable to `maxBufferedDocs`. If both `maxBufferedDocs` and `ramBufferSizeMB` are set in `solrconfig.xml`, then a flush will occur when either limit is reached. The default is 100 MB.

@ -42,7 +40,6 @@ Once accumulated document updates exceed this much memory space (defined in mega

<ramBufferSizeMB>100</ramBufferSizeMB>
----

[[IndexConfiginSolrConfig-maxBufferedDocs]]
=== maxBufferedDocs

Sets the number of document updates to buffer in memory before they are flushed as a new segment. This may also trigger a merge. The default Solr configuration flushes by RAM usage (`ramBufferSizeMB`).

@ -52,20 +49,17 @@ Sets the number of document updates to buffer in memory before they are flushed

<maxBufferedDocs>1000</maxBufferedDocs>
----

[[IndexConfiginSolrConfig-useCompoundFile]]
=== useCompoundFile

Controls whether newly written (and not yet merged) index segments should use the <<IndexConfiginSolrConfig-CompoundFileSegments,Compound File Segment>> format. The default is false.
Controls whether newly written (and not yet merged) index segments should use the <<Compound File Segments>> format. The default is false.

[source,xml]
----
<useCompoundFile>false</useCompoundFile>
----

[[IndexConfiginSolrConfig-MergingIndexSegments]]
== Merging Index Segments

[[IndexConfiginSolrConfig-mergePolicyFactory]]
=== mergePolicyFactory

Defines how merging segments is done.

@ -99,7 +93,6 @@ Choosing the best merge factors is generally a trade-off of indexing speed vs. s

Conversely, keeping more segments can accelerate indexing, because merges happen less often, making an update less likely to trigger a merge. But searches become more computationally expensive and will likely be slower, because search terms must be looked up in more index segments. Faster index updates also mean shorter commit turnaround times, which means more timely search results.

[[IndexConfiginSolrConfig-CustomizingMergePolicies]]
=== Customizing Merge Policies

If the configuration options for the built-in merge policies do not fully suit your use case, you can customize them: either by creating a custom merge policy factory that you specify in your configuration, or by configuring a {solr-javadocs}/solr-core/org/apache/solr/index/WrapperMergePolicyFactory.html[merge policy wrapper] which uses a `wrapped.prefix` configuration option to control how the factory it wraps will be configured:

@ -117,7 +110,6 @@ If the configuration options for the built-in merge policies do not fully suit y

The example above shows Solr's {solr-javadocs}/solr-core/org/apache/solr/index/SortingMergePolicyFactory.html[`SortingMergePolicyFactory`] being configured to sort documents in merged segments by `"timestamp desc"`, and wrapped around a `TieredMergePolicyFactory` configured to use the values `maxMergeAtOnce=10` and `segmentsPerTier=10` via the `inner` prefix defined by `SortingMergePolicyFactory`'s `wrapped.prefix` option. For more information on using `SortingMergePolicyFactory`, see <<common-query-parameters.adoc#CommonQueryParameters-ThesegmentTerminateEarlyParameter,the segmentTerminateEarly parameter>>.

[[IndexConfiginSolrConfig-mergeScheduler]]
=== mergeScheduler

The merge scheduler controls how merges are performed. The default `ConcurrentMergeScheduler` performs merges in the background using separate threads. The alternative, `SerialMergeScheduler`, does not perform merges with separate threads.

@ -127,7 +119,6 @@ The merge scheduler controls how merges are performed. The default `ConcurrentMe
|
||||||
[source,xml]
----
<mergeScheduler class="org.apache.lucene.index.ConcurrentMergeScheduler"/>
----

=== mergedSegmentWarmer

When using Solr for <<near-real-time-searching.adoc#near-real-time-searching,Near Real Time Searching>> a merged segment warmer can be configured to warm the reader on the newly merged segment, before the merge commits. This is not required for near real-time search, but will reduce search latency on opening a new near real-time reader after a merge completes.

[source,xml]
----
<mergedSegmentWarmer class="org.apache.lucene.index.SimpleMergedSegmentWarmer"/>
----

== Compound File Segments

Each Lucene segment typically consists of a dozen or so files. Lucene can be configured to bundle all of the files for a segment into a single compound file, using a file extension of `.cfs`, an abbreviation for Compound File Segment.

On systems where the number of open files allowed per process is limited, CFS may avoid hitting that limit.

.CFS: New Segments vs Merged Segments
[NOTE]
====
To configure whether _newly written segments_ should use CFS, see the <<useCompoundFile,`useCompoundFile`>> setting described above. To configure whether _merged segments_ use CFS, review the Javadocs for your <<mergePolicyFactory,`mergePolicyFactory`>>.

Many <<Merging Index Segments,Merge Policy>> implementations support `noCFSRatio` and `maxCFSSegmentSizeMB` settings with default values that prevent compound files from being used for large segments, but do use compound files for small segments.
====

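For example, a sketch of tuning these settings through merge policy factory arguments (`noCFSRatio` and `maxCFSSegmentSizeMB` correspond to the underlying Lucene `MergePolicy` setters; check the Javadocs of the factory you use to confirm they are accepted):

[source,xml]
----
<mergePolicyFactory class="org.apache.solr.index.TieredMergePolicyFactory">
  <!-- use compound files only for segments up to 10% of the total index size -->
  <double name="noCFSRatio">0.1</double>
  <!-- never write a compound file for a segment larger than this -->
  <double name="maxCFSSegmentSizeMB">512.0</double>
</mergePolicyFactory>
----
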
== Index Locks

=== lockType

The LockFactory options specify the locking implementation to use.

[source,xml]
----
<lockType>native</lockType>
----

=== writeLockTimeout

The maximum time to wait for a write lock on an IndexWriter. The default is 1000, expressed in milliseconds.

[source,xml]
----
<writeLockTimeout>1000</writeLockTimeout>
----

== Other Indexing Settings

There are a few other parameters that may be important to configure for your implementation. These settings affect how or when updates are made to an index.

This section describes how Solr adds data to its index. It covers the following topics:

* *<<uima-integration.adoc#uima-integration,UIMA Integration>>*: Information about integrating Solr with Apache's Unstructured Information Management Architecture (UIMA). UIMA lets you define custom pipelines of Analysis Engines that incrementally add metadata to your documents as annotations.

== Indexing Using Client APIs

Using client APIs, such as <<using-solrj.adoc#using-solrj,SolrJ>>, from your applications is an important option for updating Solr indexes. See the <<client-apis.adoc#client-apis,Client APIs>> section for more information.

For example, if an `<initParams>` section has the name "myParams", you can call it by that name when defining a request handler:

[source,xml]
<requestHandler name="/dump1" class="DumpRequestHandler" initParams="myParams"/>

== Wildcards in initParams

An `<initParams>` section can support wildcards to define nested paths that should use the parameters defined. A single asterisk (`*`) denotes that a nested path one level deeper should use the parameters. Double asterisks (`**`) denote that all nested paths, no matter how deep, should use the parameters.

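For instance, a sketch (the handler path and parameter here are illustrative):

[source,xml]
----
<!-- applies to /update and every handler nested below it, e.g. /update/json/docs -->
<initParams path="/update/**">
  <lst name="defaults">
    <str name="df">_text_</str>
  </lst>
</initParams>
----
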
For more information on indexing in Solr, see the https://wiki.apache.org/solr/FrontPage[Solr Wiki].

== The Solr Example Directory

When starting Solr with the "-e" option, the `example/` directory will be used as the base directory for the example Solr instances that are created. This directory also includes an `example/exampledocs/` subdirectory containing sample documents in a variety of formats that you can use to experiment with indexing into the various examples.

== The curl Utility for Transferring Files

Many of the instructions and examples in this section make use of the `curl` utility for transferring content through a URL. `curl` posts and retrieves data over HTTP, FTP, and many other protocols. Most Linux distributions include a copy of `curl`. You'll find curl downloads for Linux, Windows, and many other operating systems at http://curl.haxx.se/download.html. Documentation for `curl` is available here: http://curl.haxx.se/docs/manpage.html.

Configuring your JVM can be a complex topic and a full discussion is beyond the scope of this document.

For more general information about improving Solr performance, see https://wiki.apache.org/solr/SolrPerformanceFactors.

== Choosing Memory Heap Settings

The most important JVM configuration settings are those that determine the amount of memory it is allowed to allocate. There are two primary command-line options that set memory limits for the JVM. These are `-Xms`, which sets the initial size of the JVM's memory heap, and `-Xmx`, which sets the maximum size to which the heap is allowed to grow.

When setting the maximum heap size, be careful not to let the JVM consume all available physical memory.

On systems with many CPUs/cores, it can also be beneficial to tune the layout of the heap and/or the behavior of the garbage collector. Adjusting the relative sizes of the generational pools in the heap can affect how often GC sweeps occur and whether they run concurrently. Configuring the various settings of how the garbage collector should behave can greatly reduce the overall performance impact when it does run. There is a lot of good information on this topic available on Oracle's website. A good place to start is here: http://www.oracle.com/technetwork/java/javase/tech/index-jsp-140228.html[Oracle's Java HotSpot Garbage Collection].

== Use the Server HotSpot VM

If you are using Sun's JVM, add the `-server` command-line option when you start Solr. This tells the JVM that it should optimize for a long-running, server process. If the Java runtime on your system is a JRE, rather than a full JDK distribution (including `javac` and other development tools), then it is possible that it may not support the `-server` JVM option. Test this by running `java -help` and looking for `-server` as an available option in the displayed usage message.

== Checking JVM Settings

A great way to see what JVM settings your server is using, along with other useful information, is to use the admin RequestHandler, `solr/admin/system`. This request handler will display a wealth of server statistics and settings.

Support for the Kerberos authentication plugin is available in SolrCloud mode or standalone mode.

If you are using Solr with a Hadoop cluster secured with Kerberos and intend to store your Solr indexes in HDFS, also see the section <<running-solr-on-hdfs.adoc#running-solr-on-hdfs,Running Solr on HDFS>> for additional steps to configure Solr for that purpose. The instructions on this page apply only to scenarios where Solr will be secured with Kerberos. If you only need to store your indexes in a Kerberized HDFS system, please see the other section referenced above.

== How Solr Works With Kerberos

When setting up Solr to use Kerberos, configurations are put in place for Solr to use a _service principal_, or a Kerberos username, which is registered with the Key Distribution Center (KDC) to authenticate requests. The configurations define the service principal name and the location of the keytab file that contains the credentials.

=== security.json

The Solr authentication model uses a file called `security.json`. A description of this file and how it is created and maintained is covered in the section <<authentication-and-authorization-plugins.adoc#authentication-and-authorization-plugins,Authentication and Authorization Plugins>>. If this file is created after an initial startup of Solr, a restart of each node of the system is required.

=== Service Principals and Keytab Files

Each Solr node must have a service principal registered with the Key Distribution Center (KDC). The Kerberos plugin uses SPNego to negotiate authentication.

Along with the service principal, each Solr node needs a keytab file which should contain the credentials of the service principal.

Since a Solr cluster requires internode communication, each node must also be able to make Kerberos-enabled requests to other nodes. By default, Solr uses the same service principal and keytab as a 'client principal' for internode communication. You may configure a distinct client principal explicitly, but doing so is not recommended and is not covered in the examples below.

=== Kerberized ZooKeeper

When setting up a kerberized SolrCloud cluster, it is recommended to enable Kerberos security for ZooKeeper as well.

See the <<ZooKeeper Configuration>> section below for an example of starting ZooKeeper in Kerberos mode.

=== Browser Configuration

In order for your browser to access the Solr Admin UI after enabling Kerberos authentication, it must be able to negotiate with the Kerberos authenticator service to allow you access. Each browser supports this differently, and some (like Chrome) do not support it at all. If you see 401 errors when trying to access the Solr Admin UI after enabling Kerberos authentication, it's likely your browser has not been configured properly to know how or where to negotiate the authentication request.

Detailed information on how to set up your browser is beyond the scope of this documentation; please consult your Kerberos system administrators for details on how to configure your browser.

== Kerberos Authentication Configuration

.Consult Your Kerberos Admins!
[WARNING]
====
Before attempting to configure Solr with Kerberos authentication, consult your local Kerberos administrators to be sure you know the correct values for each configuration parameter.
====

We'll walk through each of these steps below.

To use host names instead of IP addresses, use the `SOLR_HOST` configuration in `bin/solr.in.sh` or pass a `-Dhost=<hostname>` system parameter during Solr startup. This guide uses IP addresses. If you specify a hostname, replace all the IP addresses in the guide with the Solr hostname as appropriate.

=== Get Service Principals and Keytabs

Before configuring Solr, make sure you have a Kerberos service principal for each Solr host and ZooKeeper (if ZooKeeper has not already been configured) available in the KDC server, and generate a keytab file as shown below.

Copy the keytab file from the KDC server's `/tmp/107.keytab` location to the Solr host.

You might need to take similar steps to create a ZooKeeper service principal and keytab if it has not already been set up. In that case, the example below shows a different service principal for ZooKeeper, so the above might be repeated with `zookeeper/host1` as the service principal for one of the nodes.

=== ZooKeeper Configuration

If you are using a ZooKeeper that has already been configured to use Kerberos, you can skip the ZooKeeper-related steps shown here.

Once all of the pieces are in place, start ZooKeeper with the following parameters:

[source,bash]
----
bin/zkServer.sh start -Djava.security.auth.login.config=/etc/zookeeper/conf/jaas-client.conf
----

=== Create security.json

Create the `security.json` file.

More details on how to use a `/security.json` file in Solr are available in the section <<authentication-and-authorization-plugins.adoc#authentication-and-authorization-plugins,Authentication and Authorization Plugins>>.

If you already have a `/security.json` file in ZooKeeper, download the file, add or modify the authentication section, and upload it back to ZooKeeper using the <<command-line-utilities.adoc#command-line-utilities,Command Line Utilities>> available in Solr.

=== Define a JAAS Configuration File

The JAAS configuration file defines the properties to use for authentication, such as the service principal and the location of the keytab file. Other properties can also be set to ensure ticket caching and other features.

The main properties we are concerned with are the `keyTab` and `principal` properties.

* `debug`: this boolean property will output debug messages for help in troubleshooting.
* `principal`: the name of the service principal to be used.

=== Solr Startup Parameters

While starting up Solr, the following host-specific parameters need to be passed. These parameters can be passed at the command line with the `bin/solr` start command (see <<solr-control-script-reference.adoc#solr-control-script-reference,Solr Control Script Reference>> for details on how to pass system parameters) or defined in `bin/solr.in.sh` or `bin/solr.in.cmd` as appropriate for your operating system.

`java.security.auth.login.config`::
Path to the JAAS configuration file for configuring a Solr client for internode communication. This parameter is required.

These parameters can be defined in `bin/solr.in.sh`; make sure to use the right hostname and the keytab file path for your installation.

If your Kerberos setup uses AES-256 encryption, your JVM needs the JCE Unlimited Strength Jurisdiction Policy Files. Replace the `local_policy.jar` present in `JAVA_HOME/jre/lib/security/` with the new `local_policy.jar` from the downloaded package and restart the Solr node.

=== Using Delegation Tokens

The Kerberos plugin can be configured to use delegation tokens, which allow an application to reuse the authentication of an end-user or another application.

`solr.kerberos.delegation.token.secret.manager.znode.working.path`::
The ZooKeeper path where token information is stored. This is in the form of the path + `/security/zkdtsm`. The path can include the chroot, or the chroot can be omitted if you are not using it. This example includes the chroot: `server1:9983,server2:9983,server3:9983/solr/security/zkdtsm`.

=== Start Solr

Once the configuration is complete, you can start Solr with the `bin/solr` script, as in the example below, which is for users in SolrCloud mode only. This example assumes you modified `bin/solr.in.sh` or `bin/solr.in.cmd` with the proper values, but if you did not, you would pass the system parameters along with the start command. Note you also need to customize the `-z` property as appropriate for the location of your ZooKeeper nodes.

[source,bash]
----
bin/solr -c -z server1:2181,server2:2181,server3:2181/solr
----

=== Test the Configuration

. Do a `kinit` with your username. For example, `kinit \user@EXAMPLE.COM`.
. Make a request to Solr using `curl` (the `--negotiate` option enables SPNego authentication):

[source,bash]
----
curl --negotiate -u : "http://192.168.0.107:8983/solr/"
----

== Using SolrJ with a Kerberized Solr

To use Kerberos authentication in a SolrJ application, you must point the JVM at a JAAS configuration file and configure the SolrJ HTTP client for SPNego before you create a SolrClient. The JAAS configuration file defines the client principal and keytab that SolrJ will use.

=== Delegation Tokens with SolrJ

Delegation tokens are also supported with SolrJ.

For information about language detection at index time, see <<detecting-languages-during-indexing.adoc#detecting-languages-during-indexing,Detecting Languages During Indexing>>.

== KeywordMarkerFilterFactory

Protects words from being modified by stemmers. A customized protected word list may be specified with the "protected" attribute in the schema. Any words in the protected word list will not be modified by any stemmer in Solr.

A sample Solr `protwords.txt` with comments can be found in the `sample_techproducts_configs` configset directory.

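A minimal sketch of a field type using this factory (the tokenizer and stemmer choices here are illustrative):

[source,xml]
----
<fieldtype name="myfieldtype" class="solr.TextField">
  <analyzer>
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <!-- tokens listed in protwords.txt are marked as keywords and skipped by the stemmer -->
    <filter class="solr.KeywordMarkerFilterFactory" protected="protwords.txt"/>
    <filter class="solr.PorterStemFilterFactory"/>
  </analyzer>
</fieldtype>
----
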
== KeywordRepeatFilterFactory

Emits each token twice, once with the `KEYWORD` attribute and once without.

A sample fieldType configuration could look like this:

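The tokenizer and stemmer below are illustrative; `RemoveDuplicatesTokenFilterFactory` collapses the pair wherever stemming left the token unchanged:

[source,xml]
----
<fieldtype name="myfieldtype" class="solr.TextField">
  <analyzer>
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <!-- emit each token twice: once marked as KEYWORD (protected from stemming) and once not -->
    <filter class="solr.KeywordRepeatFilterFactory"/>
    <filter class="solr.PorterStemFilterFactory"/>
    <!-- remove the duplicate when the stemmed and unstemmed forms are identical -->
    <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
  </analyzer>
</fieldtype>
----
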
IMPORTANT: When adding the same token twice, it will also score twice (double), so you may have to re-tune your ranking rules.

== StemmerOverrideFilterFactory

Overrides stemming algorithms by applying a custom mapping, then protecting these terms from being modified by stemmers.

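A sketch of a field type using a custom mapping file (the `dictionary` file name and the surrounding tokenizer and stemmer are illustrative; the dictionary is a tab-separated list of word/stem pairs):

[source,xml]
----
<fieldtype name="myfieldtype" class="solr.TextField">
  <analyzer>
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <!-- tokens found in stemdict.txt are replaced by their mapped stems and protected from further stemming -->
    <filter class="solr.StemmerOverrideFilterFactory" dictionary="stemdict.txt"/>
    <filter class="solr.PorterStemFilterFactory"/>
  </analyzer>
</fieldtype>
----
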
== Dictionary Compound Word Token Filter

This filter splits, or _decompounds_, compound words into individual words using a dictionary of the component words. Each input token is passed through unchanged. If it can also be decompounded into subwords, each subword is also added to the stream at the same logical position.

Assume that `germanwords.txt` contains at least the following words: `dumm kopf donau dampf schiff`.

*In:* "Donaudampfschiff dummkopf"

*Out:* "Donaudampfschiff"(1), "Donau"(1), "dampf"(1), "schiff"(1), "dummkopf"(2), "dumm"(2), "kopf"(2)

== Unicode Collation

Unicode Collation is a language-sensitive method of sorting text that can also be used for advanced search purposes.

Expert options:

`variableTop`:: Single character or contraction. Controls what is variable for `alternate`.

=== Sorting Text for a Specific Language

In this example, text is sorted according to the default German rules provided by ICU4J.

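A sketch of the field type such sorting relies on (the type name is illustrative; `locale="de"` selects the German rules, and `strength="primary"` ignores case and accent differences):

[source,xml]
----
<fieldType name="collatedGERMAN" class="solr.ICUCollationField"
           locale="de"
           strength="primary"/>
----
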
An example using the "city_sort" field to sort:

----
q=*:*&fl=city&sort=city_sort+asc
----

=== Sorting Text for Multiple Languages

There are two approaches to supporting multiple languages: if there is a small list of languages you wish to support, consider defining collated fields for each language and using `copyField`. However, adding a large number of sort fields can increase disk and indexing costs. An alternative approach is to use the Unicode `default` collator.

The Unicode `default` or `ROOT` locale has rules that are designed to work well for most languages.

[source,xml]
----
<fieldType name="collatedROOT" class="solr.ICUCollationField"
           locale=""
           strength="primary" />
----

=== Sorting Text with Custom Rules

You can define your own set of sorting rules. It's easiest to take existing rules that are close to what you want and customize them.

This rule set can now be used for custom collation in Solr:

[source,xml]
----
<fieldType name="collatedCUSTOM" class="solr.ICUCollationField"
           custom="customRules.dat"
           strength="primary" />
----

=== JDK Collation

As mentioned above, ICU Unicode Collation is better in several ways than JDK Collation, but if you cannot use ICU4J for some reason, you can use `solr.CollationField`.

== ASCII & Decimal Folding Filters

=== ASCII Folding

This filter converts alphabetic, numeric, and symbolic Unicode characters which are not in the first 127 ASCII characters (the "Basic Latin" Unicode block) into their ASCII equivalents, if one exists. Only those characters with reasonable ASCII alternatives are converted.

This can increase recall by causing more matches. On the other hand, it can reduce precision because language-specific character differences may be lost.

*In:* "Björn Ångström"

*Out:* "Bjorn", "Angstrom"

=== Decimal Digit Folding

This filter converts any character in the Unicode "Decimal Number" general category (`Nd`) into its equivalent Basic Latin digit (0-9).

This can increase recall by causing more matches. On the other hand, it can reduce precision because language-specific character differences may be lost.

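A sketch of an analyzer using this filter (the tokenizer choice is illustrative):

[source,xml]
----
<analyzer>
  <tokenizer class="solr.StandardTokenizerFactory"/>
  <!-- folds Unicode decimal digits from any script into 0-9 -->
  <filter class="solr.DecimalDigitFilterFactory"/>
</analyzer>
----
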
== Language-Specific Factories

These factories are each designed to work with specific languages. The languages covered here are:

* <<Catalan>>
* <<Traditional Chinese>>
* <<Simplified Chinese>>
* <<Czech>>
* <<Danish>>
* <<Dutch>>
* <<Finnish>>
* <<Galician>>
* <<German>>
* <<Greek>>
* <<hebrew-lao-myanmar-khmer,Hebrew, Lao, Myanmar, Khmer>>
* <<Hindi>>
* <<Indonesian>>
* <<Italian>>
* <<Turkish>>
* <<Ukrainian>>

=== Arabic

Solr provides support for the http://www.mtholyoke.edu/~lballest/Pubs/arab_stem05.pdf[Light-10] (PDF) stemming algorithm, and Lucene includes an example stopword list.

This algorithm defines both character normalization and stemming, so these are split into two filters to provide more flexibility.

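A sketch of an Arabic analyzer combining the two filters (the tokenizer choice is illustrative):

[source,xml]
----
<analyzer>
  <tokenizer class="solr.StandardTokenizerFactory"/>
  <!-- normalization is applied before stemming -->
  <filter class="solr.ArabicNormalizationFilterFactory"/>
  <filter class="solr.ArabicStemFilterFactory"/>
</analyzer>
----
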
=== Brazilian Portuguese

This is a Java filter written specifically for stemming the Brazilian dialect of the Portuguese language. It uses the Lucene class `org.apache.lucene.analysis.br.BrazilianStemmer`. Although that stemmer can be configured to use a list of protected words (which should not be stemmed), this factory does not accept any arguments to specify such a list.

*Out:* "pra", "pra"

=== Bulgarian

Solr includes a light stemmer for Bulgarian, following http://members.unine.ch/jacques.savoy/Papers/BUIR.pdf[this algorithm] (PDF), and Lucene includes an example stopword list.

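A sketch of a Bulgarian analyzer (the tokenizer choice is illustrative; the stemmer expects lowercased input):

[source,xml]
----
<analyzer>
  <tokenizer class="solr.StandardTokenizerFactory"/>
  <filter class="solr.LowerCaseFilterFactory"/>
  <filter class="solr.BulgarianStemFilterFactory"/>
</analyzer>
----
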
=== Catalan

Solr can stem Catalan using the Snowball Porter Stemmer with an argument of `language="Catalan"`. Solr includes a set of contractions for Catalan, which can be stripped using `solr.ElisionFilterFactory`.

*Out:* "llengu"(1), "llengu"(2)

=== Traditional Chinese

The default configuration of the <<tokenizers.adoc#icu-tokenizer,ICU Tokenizer>> is suitable for Traditional Chinese text. It follows the Word Break rules from the Unicode Text Segmentation algorithm for non-Chinese text, and uses a dictionary to segment Chinese words. To use this tokenizer, you must add additional .jars to Solr's classpath (as described in the section <<lib-directives-in-solrconfig.adoc#lib-directives-in-solrconfig,Lib Directives in SolrConfig>>). See the `solr/contrib/analysis-extras/README.txt` for information on which jars you need to add to your `SOLR_HOME/lib`.

<<tokenizers.adoc#standard-tokenizer,Standard Tokenizer>> can also be used to tokenize Traditional Chinese text. Following the Word Break rules from the Unicode Text Segmentation algorithm, it produces one token per Chinese character. When combined with <<CJK Bigram Filter>>, overlapping bigrams of Chinese characters are formed.

<<CJK Width Filter>> folds fullwidth ASCII variants into the equivalent Basic Latin forms.

*Examples:*

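A sketch of an analyzer for Traditional Chinese built on the ICU Tokenizer (the filter choices are illustrative):

[source,xml]
----
<analyzer>
  <tokenizer class="solr.ICUTokenizerFactory"/>
  <!-- fold fullwidth ASCII variants into Basic Latin forms -->
  <filter class="solr.CJKWidthFilterFactory"/>
  <filter class="solr.LowerCaseFilterFactory"/>
</analyzer>
----
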
=== CJK Bigram Filter

Forms bigrams (overlapping 2-character sequences) of CJK characters that are generated from <<tokenizers.adoc#standard-tokenizer,Standard Tokenizer>> or <<tokenizers.adoc#icu-tokenizer,ICU Tokenizer>>.

By default, all CJK characters produce bigrams, but finer grained control is available by specifying orthographic type arguments `han`, `hiragana`, `katakana`, and `hangul`. When set to `false`, characters of the corresponding type will be passed through as unigrams, and will not be included in any bigrams.

In all cases, all non-CJK input is passed through unmodified.

`outputUnigrams`:: (true/false) If true, in addition to forming bigrams, all characters are also passed through as unigrams. Default is false.

See the example under <<Traditional Chinese>>.

=== Simplified Chinese

For Simplified Chinese, Solr provides support for Chinese sentence and word segmentation with the <<HMM Chinese Tokenizer>>. This component includes a large dictionary and segments Chinese text into words with the Hidden Markov Model. To use this tokenizer, you must add additional .jars to Solr's classpath (as described in the section <<lib-directives-in-solrconfig.adoc#lib-directives-in-solrconfig,Lib Directives in SolrConfig>>). See the `solr/contrib/analysis-extras/README.txt` for information on which jars you need to add to your `SOLR_HOME/lib`.

The default configuration of the <<tokenizers.adoc#icu-tokenizer,ICU Tokenizer>> is also suitable for Simplified Chinese text. It follows the Word Break rules from the Unicode Text Segmentation algorithm for non-Chinese text, and uses a dictionary to segment Chinese words. To use this tokenizer, you must add additional .jars to Solr's classpath (as described in the section <<lib-directives-in-solrconfig.adoc#lib-directives-in-solrconfig,Lib Directives in SolrConfig>>). See the `solr/contrib/analysis-extras/README.txt` for information on which jars you need to add to your `SOLR_HOME/lib`.

Also useful for Chinese analysis:

<<CJK Width Filter>> folds fullwidth ASCII variants into the equivalent Basic Latin forms, and folds halfwidth Katakana variants into their equivalent fullwidth forms.

*Examples:*

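A sketch of a Simplified Chinese analyzer built on the HMM Chinese Tokenizer (the filter choices are illustrative):

[source,xml]
----
<analyzer>
  <tokenizer class="solr.HMMChineseTokenizerFactory"/>
  <filter class="solr.CJKWidthFilterFactory"/>
  <filter class="solr.LowerCaseFilterFactory"/>
</analyzer>
----
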
=== HMM Chinese Tokenizer

For Simplified Chinese, Solr provides support for Chinese sentence and word segmentation with the `solr.HMMChineseTokenizerFactory` in the `analysis-extras` contrib module. This component includes a large dictionary and segments Chinese text into words with the Hidden Markov Model. To use this tokenizer, see `solr/contrib/analysis-extras/README.txt` for instructions on which jars you need to add to your `solr_home/lib`.

To use the default setup with fallback to English Porter stemmer for English words, use:

`<analyzer class="org.apache.lucene.analysis.cn.smart.SmartChineseAnalyzer"/>`

Or to configure your own analysis setup, use the `solr.HMMChineseTokenizerFactory` along with your custom filter setup. See an example of this in the <<Simplified Chinese>> section.

=== Czech

Solr includes a light stemmer for Czech, following https://dl.acm.org/citation.cfm?id=1598600[this algorithm], and Lucene includes an example stopword list.

*Out:* "preziden", "preziden", "preziden"

=== Danish

Solr can stem Danish using the Snowball Porter Stemmer with an argument of `language="Danish"`.

Also relevant are the <<Scandinavian,Scandinavian normalization filters>>.

*Factory class:* `solr.SnowballPorterFilterFactory`

*Out:* "undersøg"(1), "undersøg"(2)

=== Dutch

Solr can stem Dutch using the Snowball Porter Stemmer with an argument of `language="Dutch"`.

*Out:* "kanal", "kanal"

=== Finnish

Solr includes support for stemming Finnish, and Lucene includes an example stopword list.

*Out:* "kala", "kala"

=== French

==== Elision Filter

Removes article elisions from a token stream. This filter can be useful for languages such as French, Catalan, Italian, and Irish.

*Out:* "histoire", "art"

[[LanguageAnalysis-FrenchLightStemFilter]]
|
|
||||||
==== French Light Stem Filter
|
==== French Light Stem Filter
|
||||||
|
|
||||||
Solr includes three stemmers for French: one in the `solr.SnowballPorterFilterFactory`, a lighter stemmer called `solr.FrenchLightStemFilterFactory`, and an even less aggressive stemmer called `solr.FrenchMinimalStemFilterFactory`. Lucene includes an example stopword list.
|
Solr includes three stemmers for French: one in the `solr.SnowballPorterFilterFactory`, a lighter stemmer called `solr.FrenchLightStemFilterFactory`, and an even less aggressive stemmer called `solr.FrenchMinimalStemFilterFactory`. Lucene includes an example stopword list.
|
||||||
|
@ -800,7 +771,6 @@ Solr includes three stemmers for French: one in the `solr.SnowballPorterFilterFa
|
||||||
*Out:* "le", "chat", "le", "chat"
|
*Out:* "le", "chat", "le", "chat"
|
||||||
|
|
||||||
|
|
||||||
[[LanguageAnalysis-Galician]]
|
|
||||||
=== Galician
|
=== Galician
|
||||||
|
|
||||||
Solr includes a stemmer for Galician following http://bvg.udc.es/recursos_lingua/stemming.jsp[this algorithm], and Lucene includes an example stopword list.
|
Solr includes a stemmer for Galician following http://bvg.udc.es/recursos_lingua/stemming.jsp[this algorithm], and Lucene includes an example stopword list.
|
||||||
|
@ -826,8 +796,6 @@ Solr includes a stemmer for Galician following http://bvg.udc.es/recursos_lingua
|
||||||
|
|
||||||
*Out:* "feliz", "luz"
|
*Out:* "feliz", "luz"
|
||||||
|
|
||||||
|
|
||||||
[[LanguageAnalysis-German]]
|
|
||||||
=== German
|
=== German
|
||||||
|
|
||||||
Solr includes four stemmers for German: one in the `solr.SnowballPorterFilterFactory language="German"`, a stemmer called `solr.GermanStemFilterFactory`, a lighter stemmer called `solr.GermanLightStemFilterFactory`, and an even less aggressive stemmer called `solr.GermanMinimalStemFilterFactory`. Lucene includes an example stopword list.
|
Solr includes four stemmers for German: one in the `solr.SnowballPorterFilterFactory language="German"`, a stemmer called `solr.GermanStemFilterFactory`, a lighter stemmer called `solr.GermanLightStemFilterFactory`, and an even less aggressive stemmer called `solr.GermanMinimalStemFilterFactory`. Lucene includes an example stopword list.
|
||||||
|
@ -868,8 +836,6 @@ Solr includes four stemmers for German: one in the `solr.SnowballPorterFilterFac
|
||||||
|
|
||||||
*Out:* "haus", "haus"
|
*Out:* "haus", "haus"
|
||||||
|
|
||||||
|
|
||||||
[[LanguageAnalysis-Greek]]
|
|
||||||
=== Greek
|
=== Greek
|
||||||
|
|
||||||
This filter converts uppercase letters in the Greek character set to the equivalent lowercase character.
|
This filter converts uppercase letters in the Greek character set to the equivalent lowercase character.
|
||||||
|
@ -893,7 +859,6 @@ Use of custom charsets is no longer supported as of Solr 3.1. If you need to ind
|
||||||
</analyzer>
|
</analyzer>
|
||||||
----
|
----
|
||||||
|
|
||||||
[[LanguageAnalysis-Hindi]]
|
|
||||||
=== Hindi
|
=== Hindi
|
||||||
|
|
||||||
Solr includes support for stemming Hindi following http://computing.open.ac.uk/Sites/EACLSouthAsia/Papers/p6-Ramanathan.pdf[this algorithm] (PDF), support for common spelling differences through the `solr.HindiNormalizationFilterFactory`, support for encoding differences through the `solr.IndicNormalizationFilterFactory` following http://ldc.upenn.edu/myl/IndianScriptsUnicode.html[this algorithm], and Lucene includes an example stopword list.
|
Solr includes support for stemming Hindi following http://computing.open.ac.uk/Sites/EACLSouthAsia/Papers/p6-Ramanathan.pdf[this algorithm] (PDF), support for common spelling differences through the `solr.HindiNormalizationFilterFactory`, support for encoding differences through the `solr.IndicNormalizationFilterFactory` following http://ldc.upenn.edu/myl/IndianScriptsUnicode.html[this algorithm], and Lucene includes an example stopword list.
|
||||||
|
@ -914,8 +879,6 @@ Solr includes support for stemming Hindi following http://computing.open.ac.uk/S
|
||||||
</analyzer>
|
</analyzer>
|
||||||
----
|
----
|
||||||
|
|
||||||
|
|
||||||
[[LanguageAnalysis-Indonesian]]
|
|
||||||
=== Indonesian
|
=== Indonesian
|
||||||
|
|
||||||
Solr includes support for stemming Indonesian (Bahasa Indonesia) following http://www.illc.uva.nl/Publications/ResearchReports/MoL-2003-02.text.pdf[this algorithm] (PDF), and Lucene includes an example stopword list.
|
Solr includes support for stemming Indonesian (Bahasa Indonesia) following http://www.illc.uva.nl/Publications/ResearchReports/MoL-2003-02.text.pdf[this algorithm] (PDF), and Lucene includes an example stopword list.
|
||||||
|
@ -941,7 +904,6 @@ Solr includes support for stemming Indonesian (Bahasa Indonesia) following http:
|
||||||
|
|
||||||
*Out:* "bagai", "bagai"
|
*Out:* "bagai", "bagai"
|
||||||
|
|
||||||
[[LanguageAnalysis-Italian]]
|
|
||||||
=== Italian
|
=== Italian
|
||||||
|
|
||||||
Solr includes two stemmers for Italian: one in the `solr.SnowballPorterFilterFactory language="Italian"`, and a lighter stemmer called `solr.ItalianLightStemFilterFactory`. Lucene includes an example stopword list.
|
Solr includes two stemmers for Italian: one in the `solr.SnowballPorterFilterFactory language="Italian"`, and a lighter stemmer called `solr.ItalianLightStemFilterFactory`. Lucene includes an example stopword list.
|
||||||
|
@ -969,7 +931,6 @@ Solr includes two stemmers for Italian: one in the `solr.SnowballPorterFilterFac
|
||||||
|
|
||||||
*Out:* "propag", "propag", "propag"
|
*Out:* "propag", "propag", "propag"
|
||||||
|
|
||||||
[[LanguageAnalysis-Irish]]
|
|
||||||
=== Irish
|
=== Irish
|
||||||
|
|
||||||
Solr can stem Irish using the Snowball Porter Stemmer with an argument of `language="Irish"`. Solr includes `solr.IrishLowerCaseFilterFactory`, which can handle Irish-specific constructs. Solr also includes a set of contractions for Irish which can be stripped using `solr.ElisionFilterFactory`.
|
Solr can stem Irish using the Snowball Porter Stemmer with an argument of `language="Irish"`. Solr includes `solr.IrishLowerCaseFilterFactory`, which can handle Irish-specific constructs. Solr also includes a set of contractions for Irish which can be stripped using `solr.ElisionFilterFactory`.
|
||||||
|
@ -999,22 +960,20 @@ Solr can stem Irish using the Snowball Porter Stemmer with an argument of `langu
|
||||||
|
|
||||||
*Out:* "siopadóir", "síceapaite", "fearr", "athair"
|
*Out:* "siopadóir", "síceapaite", "fearr", "athair"
|
||||||
|
|
||||||
[[LanguageAnalysis-Japanese]]
|
|
||||||
=== Japanese
|
=== Japanese
|
||||||
|
|
||||||
Solr includes support for analyzing Japanese, via the Lucene Kuromoji morphological analyzer, which includes several analysis components - more details on each below:
|
Solr includes support for analyzing Japanese, via the Lucene Kuromoji morphological analyzer, which includes several analysis components - more details on each below:
|
||||||
|
|
||||||
* <<LanguageAnalysis-JapaneseIterationMarkCharFilter,`JapaneseIterationMarkCharFilter`>> normalizes Japanese horizontal iteration marks (odoriji) to their expanded form.
|
* <<Japanese Iteration Mark CharFilter,`JapaneseIterationMarkCharFilter`>> normalizes Japanese horizontal iteration marks (odoriji) to their expanded form.
|
||||||
* <<LanguageAnalysis-JapaneseTokenizer,`JapaneseTokenizer`>> tokenizes Japanese using morphological analysis, and annotates each term with part-of-speech, base form (a.k.a. lemma), reading and pronunciation.
|
* <<Japanese Tokenizer,`JapaneseTokenizer`>> tokenizes Japanese using morphological analysis, and annotates each term with part-of-speech, base form (a.k.a. lemma), reading and pronunciation.
|
||||||
* <<LanguageAnalysis-JapaneseBaseFormFilter,`JapaneseBaseFormFilter`>> replaces original terms with their base forms (a.k.a. lemmas).
|
* <<Japanese Base Form Filter,`JapaneseBaseFormFilter`>> replaces original terms with their base forms (a.k.a. lemmas).
|
||||||
* <<LanguageAnalysis-JapanesePartOfSpeechStopFilter,`JapanesePartOfSpeechStopFilter`>> removes terms that have one of the configured parts-of-speech.
|
* <<Japanese Part Of Speech Stop Filter,`JapanesePartOfSpeechStopFilter`>> removes terms that have one of the configured parts-of-speech.
|
||||||
* <<LanguageAnalysis-JapaneseKatakanaStemFilter,`JapaneseKatakanaStemFilter`>> normalizes common katakana spelling variations ending in a long sound character (U+30FC) by removing the long sound character.
|
* <<Japanese Katakana Stem Filter,`JapaneseKatakanaStemFilter`>> normalizes common katakana spelling variations ending in a long sound character (U+30FC) by removing the long sound character.
|
||||||
|
|
||||||
Also useful for Japanese analysis, from lucene-analyzers-common:
|
Also useful for Japanese analysis, from lucene-analyzers-common:
|
||||||
|
|
||||||
* <<LanguageAnalysis-CJKWidthFilter,`CJKWidthFilter`>> folds fullwidth ASCII variants into the equivalent Basic Latin forms, and folds halfwidth Katakana variants into their equivalent fullwidth forms.
|
* <<CJK Width Filter,`CJKWidthFilter`>> folds fullwidth ASCII variants into the equivalent Basic Latin forms, and folds halfwidth Katakana variants into their equivalent fullwidth forms.
|
||||||
|
|
||||||
[[LanguageAnalysis-JapaneseIterationMarkCharFilter]]
|
|
||||||
==== Japanese Iteration Mark CharFilter
|
==== Japanese Iteration Mark CharFilter
|
||||||
|
|
||||||
Normalizes horizontal Japanese iteration marks (odoriji) to their expanded form. Vertical iteration marks are not supported.
|
Normalizes horizontal Japanese iteration marks (odoriji) to their expanded form. Vertical iteration marks are not supported.
|
||||||
|
@ -1027,7 +986,6 @@ Normalizes horizontal Japanese iteration marks (odoriji) to their expanded form.
|
||||||
|
|
||||||
`normalizeKana`:: set to `false` to not normalize kana iteration marks (default is `true`)
|
`normalizeKana`:: set to `false` to not normalize kana iteration marks (default is `true`)
|
||||||
|
|
||||||
[[LanguageAnalysis-JapaneseTokenizer]]
|
|
||||||
==== Japanese Tokenizer
|
==== Japanese Tokenizer
|
||||||
|
|
||||||
Tokenizer for Japanese that uses morphological analysis, and annotates each term with part-of-speech, base form (a.k.a. lemma), reading and pronunciation.
|
Tokenizer for Japanese that uses morphological analysis, and annotates each term with part-of-speech, base form (a.k.a. lemma), reading and pronunciation.
|
||||||
|
@ -1052,7 +1010,6 @@ For some applications it might be good to use `search` mode for indexing and `no
|
||||||
|
|
||||||
`discardPunctuation`:: set to `false` to keep punctuation, `true` to discard (the default)
|
`discardPunctuation`:: set to `false` to keep punctuation, `true` to discard (the default)
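
Putting these components together, a Japanese field type might be wired roughly as follows (a sketch; the `mode` and resource names are illustrative assumptions):

[source,xml]
----
<analyzer>
  <!-- expand horizontal iteration marks before tokenization -->
  <charFilter class="solr.JapaneseIterationMarkCharFilterFactory"/>
  <!-- "search" mode favors recall by also emitting compound segments -->
  <tokenizer class="solr.JapaneseTokenizerFactory" mode="search"/>
  <!-- replace inflected forms with their base forms (lemmas) -->
  <filter class="solr.JapaneseBaseFormFilterFactory"/>
  <filter class="solr.JapanesePartOfSpeechStopFilterFactory" tags="lang/stoptags_ja.txt"/>
  <filter class="solr.CJKWidthFilterFactory"/>
  <filter class="solr.JapaneseKatakanaStemFilterFactory" minimumLength="4"/>
  <filter class="solr.LowerCaseFilterFactory"/>
</analyzer>
----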

==== Japanese Base Form Filter

Replaces original terms' text with the corresponding base form (lemma). (`JapaneseTokenizer` annotates each term with its base form.)

@ -1061,7 +1018,6 @@ Replaces original terms' text with the corresponding base form (lemma). (`Japane

(no arguments)

==== Japanese Part Of Speech Stop Filter

Removes terms with one of the configured parts-of-speech. `JapaneseTokenizer` annotates terms with parts-of-speech.

@ -1074,12 +1030,11 @@ Removes terms with one of the configured parts-of-speech. `JapaneseTokenizer` an

`enablePositionIncrements`:: if `luceneMatchVersion` is `4.3` or earlier and `enablePositionIncrements="false"`, no position holes will be left by this filter when it removes tokens. *This argument is invalid if `luceneMatchVersion` is `5.0` or later.*

==== Japanese Katakana Stem Filter

Normalizes common katakana spelling variations ending in a long sound character (U+30FC) by removing the long sound character.

<<CJK Width Filter,`solr.CJKWidthFilterFactory`>> should be specified prior to this filter to normalize half-width katakana to full-width.

*Factory class:* `JapaneseKatakanaStemFilterFactory`

@ -1087,7 +1042,6 @@ Normalizes common katakana spelling variations ending in a long sound character

`minimumLength`:: terms below this length will not be stemmed. Default is 4; the value must be 2 or more.

==== CJK Width Filter

Folds fullwidth ASCII variants into the equivalent Basic Latin forms, and folds halfwidth Katakana variants into their equivalent fullwidth forms.

@ -1115,14 +1069,13 @@ Example:

</fieldType>
----

[[hebrew-lao-myanmar-khmer]]
=== Hebrew, Lao, Myanmar, Khmer

Lucene provides support, in addition to UAX#29 word break rules, for Hebrew's use of the double and single quote characters, and for segmenting Lao, Myanmar, and Khmer into syllables with the `solr.ICUTokenizerFactory` in the `analysis-extras` contrib module. To use this tokenizer, see `solr/contrib/analysis-extras/README.txt` for instructions on which jars you need to add to your `solr_home/lib`.

See <<tokenizers.adoc#icu-tokenizer,the ICUTokenizer>> for more information.
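
A minimal field type using this tokenizer might look like the following sketch:

[source,xml]
----
<analyzer>
  <!-- ICU-based segmentation; requires the analysis-extras jars -->
  <tokenizer class="solr.ICUTokenizerFactory"/>
</analyzer>
----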

=== Latvian

Solr includes support for stemming Latvian, and Lucene includes an example stopword list.

@ -1150,16 +1103,14 @@ Solr includes support for stemming Latvian, and Lucene includes an example stopw

*Out:* "tirg", "tirg"

=== Norwegian

Solr includes two classes for stemming Norwegian, `NorwegianLightStemFilterFactory` and `NorwegianMinimalStemFilterFactory`. Lucene includes an example stopword list.

Another option is to use the Snowball Porter Stemmer with an argument of `language="Norwegian"`.

Also relevant are the <<Scandinavian,Scandinavian normalization filters>>.
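
For example, an analyzer using the light stemmer might be configured as follows (a sketch):

[source,xml]
----
<analyzer>
  <tokenizer class="solr.StandardTokenizerFactory"/>
  <filter class="solr.LowerCaseFilterFactory"/>
  <filter class="solr.NorwegianLightStemFilterFactory"/>
</analyzer>
----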

==== Norwegian Light Stemmer

The `NorwegianLightStemFilterFactory` requires a "two-pass" sort for the -dom and -het endings. This means that in the first pass the word "kristendom" is stemmed to "kristen", and then all the general rules apply so it will be further stemmed to "krist". The effect of this is that "kristen," "kristendom," "kristendommen," and "kristendommens" will all be stemmed to "krist."

@ -1209,7 +1160,6 @@ The second pass is to pick up -dom and -het endings. Consider this example:

*Out:* "forelske"

==== Norwegian Minimal Stemmer

The `NorwegianMinimalStemFilterFactory` stems plural forms of Norwegian nouns only.

@ -1244,10 +1194,8 @@ The `NorwegianMinimalStemFilterFactory` stems plural forms of Norwegian nouns on

*Out:* "bil"

=== Persian

==== Persian Filter Factories

Solr includes support for normalizing Persian, and Lucene includes an example stopword list.

@ -1267,7 +1215,6 @@ Solr includes support for normalizing Persian, and Lucene includes an example st

</analyzer>
----

=== Polish

Solr provides support for Polish stemming with the `solr.StempelPolishStemFilterFactory`, and `solr.MorfologikFilterFactory` for lemmatization, in the `contrib/analysis-extras` module. The `solr.StempelPolishStemFilterFactory` component includes an algorithmic stemmer with tables for Polish. To use either of these filters, see `solr/contrib/analysis-extras/README.txt` for instructions on which jars you need to add to your `solr_home/lib`.

@ -1308,7 +1255,6 @@ Note the lower case filter is applied _after_ the Morfologik stemmer; this is be

The Morfologik dictionary parameter value is a constant specifying which dictionary to choose. The dictionary resource must be named `path/to/_language_.dict` and have an associated `.info` metadata file. See http://morfologik.blogspot.com/[the Morfologik project] for details. If the dictionary attribute is not provided, the Polish dictionary is loaded and used by default.
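
As an illustration, a Stempel-based Polish analyzer might be configured like this (a sketch):

[source,xml]
----
<analyzer>
  <tokenizer class="solr.StandardTokenizerFactory"/>
  <filter class="solr.LowerCaseFilterFactory"/>
  <!-- algorithmic stemming with Stempel's Polish tables -->
  <filter class="solr.StempelPolishStemFilterFactory"/>
</analyzer>
----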

=== Portuguese

Solr includes four stemmers for Portuguese: one in the `solr.SnowballPorterFilterFactory`, an alternative stemmer called `solr.PortugueseStemFilterFactory`, a lighter stemmer called `solr.PortugueseLightStemFilterFactory`, and an even less aggressive stemmer called `solr.PortugueseMinimalStemFilterFactory`. Lucene includes an example stopword list.

@ -1352,8 +1298,6 @@ Solr includes four stemmers for Portuguese: one in the `solr.SnowballPorterFilte

*Out:* "pra", "pra"

=== Romanian

Solr can stem Romanian using the Snowball Porter Stemmer with an argument of `language="Romanian"`.

@ -1375,11 +1319,8 @@ Solr can stem Romanian using the Snowball Porter Stemmer with an argument of `la

</analyzer>
----

=== Russian

==== Russian Stem Filter

Solr includes two stemmers for Russian: one in the `solr.SnowballPorterFilterFactory language="Russian"`, and a lighter stemmer called `solr.RussianLightStemFilterFactory`. Lucene includes an example stopword list.

@ -1399,11 +1340,9 @@ Solr includes two stemmers for Russian: one in the `solr.SnowballPorterFilterFac

</analyzer>
----

=== Scandinavian

Scandinavian is a language group spanning three very similar languages: <<Norwegian>>, <<Swedish>>, and <<Danish>>.

Swedish å, ä, ö are in fact the same letters as Norwegian and Danish å, æ, ø and thus interchangeable when used between these languages. They are however folded differently when people type them on a keyboard lacking these characters.

@ -1413,7 +1352,6 @@ There are two filters for helping with normalization between Scandinavian langua

See also each language section for other relevant filters.

==== Scandinavian Normalization Filter

This filter normalizes use of the interchangeable Scandinavian characters æÆäÄöÖøØ and folded variants (aa, ao, ae, oe and oo) by transforming them to åÅæÆøØ.

@ -1441,7 +1379,6 @@ It's a semantically less destructive solution than `ScandinavianFoldingFilter`,

*Out:* "blåbærsyltetøj", "blåbærsyltetøj", "blåbærsyltetøj", "blabarsyltetoj"

==== Scandinavian Folding Filter

This filter folds Scandinavian characters åÅäæÄÆ\->a and öÖøØ\->o. It also discriminates against use of double vowels aa, ae, ao, oe and oo, leaving just the first one.

@ -1469,10 +1406,8 @@ It's a semantically more destructive solution than `ScandinavianNormalizationFil

*Out:* "blabarsyltetoj", "blabarsyltetoj", "blabarsyltetoj", "blabarsyltetoj"

=== Serbian

==== Serbian Normalization Filter

Solr includes a filter that normalizes Serbian Cyrillic and Latin characters. Note that this filter only works with lowercased input.

@ -1499,7 +1434,6 @@ See the Solr wiki for tips & advice on using this filter: https://wiki.apache.or

</analyzer>
----
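
Since the filter expects lowercased input, a typical chain (a sketch) places a lowercase filter first:

[source,xml]
----
<analyzer>
  <tokenizer class="solr.StandardTokenizerFactory"/>
  <!-- the normalization filter only works on lowercased tokens -->
  <filter class="solr.LowerCaseFilterFactory"/>
  <filter class="solr.SerbianNormalizationFilterFactory"/>
</analyzer>
----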

=== Spanish

Solr includes two stemmers for Spanish: one in the `solr.SnowballPorterFilterFactory language="Spanish"`, and a lighter stemmer called `solr.SpanishLightStemFilterFactory`. Lucene includes an example stopword list.

@ -1526,15 +1460,13 @@ Solr includes two stemmers for Spanish: one in the `solr.SnowballPorterFilterFac

*Out:* "tor", "tor", "tor"

=== Swedish

==== Swedish Stem Filter

Solr includes two stemmers for Swedish: one in the `solr.SnowballPorterFilterFactory language="Swedish"`, and a lighter stemmer called `solr.SwedishLightStemFilterFactory`. Lucene includes an example stopword list.

Also relevant are the <<Scandinavian,Scandinavian normalization filters>>.

*Factory class:* `solr.SwedishStemFilterFactory`

@ -1557,8 +1489,6 @@ Also relevant are the <<LanguageAnalysis-Scandinavian,Scandinavian normalization

*Out:* "klok", "klok", "klok"

=== Thai

This filter converts sequences of Thai characters into individual Thai words. Unlike European languages, Thai does not use whitespace to delimit words.

@ -1577,7 +1507,6 @@ This filter converts sequences of Thai characters into individual Thai words. Un

</analyzer>
----
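
For reference, a minimal Thai analyzer might be (a sketch):

[source,xml]
----
<analyzer>
  <!-- segments Thai text into words; no whitespace is required in the input -->
  <tokenizer class="solr.ThaiTokenizerFactory"/>
  <filter class="solr.LowerCaseFilterFactory"/>
</analyzer>
----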

=== Turkish

Solr includes support for stemming Turkish with the `solr.SnowballPorterFilterFactory`; support for case-insensitive search with the `solr.TurkishLowerCaseFilterFactory`; support for stripping apostrophes and following suffixes with `solr.ApostropheFilterFactory` (see http://www.ipcsit.com/vol57/015-ICNI2012-M021.pdf[Role of Apostrophes in Turkish Information Retrieval]); support for a form of stemming that truncates tokens at a configurable maximum length through the `solr.TruncateTokenFilterFactory` (see http://www.users.muohio.edu/canf/papers/JASIST2008offPrint.pdf[Information Retrieval on Turkish Texts]); and Lucene includes an example stopword list.

@ -1613,10 +1542,6 @@ Solr includes support for stemming Turkish with the `solr.SnowballPorterFilterFa

</analyzer>
----
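
A typical combination (a sketch) of the pieces described above:

[source,xml]
----
<analyzer>
  <tokenizer class="solr.StandardTokenizerFactory"/>
  <!-- strip apostrophes and the suffixes that follow them -->
  <filter class="solr.ApostropheFilterFactory"/>
  <!-- Turkish-aware lowercasing (dotted/dotless i) -->
  <filter class="solr.TurkishLowerCaseFilterFactory"/>
  <filter class="solr.SnowballPorterFilterFactory" language="Turkish"/>
</analyzer>
----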

=== Ukrainian

Solr provides support for Ukrainian lemmatization with the `solr.MorfologikFilterFactory`, in the `contrib/analysis-extras` module. To use this filter, see `solr/contrib/analysis-extras/README.txt` for instructions on which jars you need to add to your `solr_home/lib`.
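
A lemmatizing analyzer might then be configured along these lines (a sketch; the dictionary resource name is an assumption based on the Lucene Ukrainian analysis module and may need adjusting):

[source,xml]
----
<analyzer>
  <tokenizer class="solr.StandardTokenizerFactory"/>
  <filter class="solr.LowerCaseFilterFactory"/>
  <!-- dictionary path is an assumption; adjust to your packaging -->
  <filter class="solr.MorfologikFilterFactory" dictionary="org/apache/lucene/analysis/uk/ukrainian.dict"/>
</analyzer>
----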

@ -22,21 +22,17 @@ With the *Learning To Rank* (or *LTR* for short) contrib module you can configur

The module also supports feature extraction inside Solr. The only thing you need to do outside Solr is train your own ranking model.

== Learning to Rank Concepts

=== Re-Ranking

Re-Ranking allows you to run a simple query for matching documents and then re-rank the top N documents using the scores from a different, more complex query. This page describes the use of *LTR* complex queries; information on other rank queries included in the Solr distribution can be found on the <<query-re-ranking.adoc#query-re-ranking,Query Re-Ranking>> page.

=== Learning To Rank Models

In information retrieval systems, https://en.wikipedia.org/wiki/Learning_to_rank[Learning to Rank] is used to re-rank the top N retrieved documents using trained machine learning models. The hope is that such sophisticated models can make more nuanced ranking decisions than standard ranking functions like https://en.wikipedia.org/wiki/Tf%E2%80%93idf[TF-IDF] or https://en.wikipedia.org/wiki/Okapi_BM25[BM25].

==== Ranking Model

A ranking model computes the scores used to rerank documents. Irrespective of any particular algorithm or implementation, a ranking model's computation can use three types of inputs:

@ -44,27 +40,23 @@ A ranking model computes the scores used to rerank documents. Irrespective of an

* features that represent the document being scored
* features that represent the query for which the document is being scored

==== Feature

A feature is a value, a number, that represents some quantity or quality of the document being scored or of the query for which documents are being scored. For example, documents often have a 'recency' quality, and 'number of past purchases' might be a quantity that is passed to Solr as part of the search query.

==== Normalizer

Some ranking models expect features on a particular scale. A normalizer can be used to translate arbitrary feature values into normalized values, e.g., on a 0..1 or 0..100 scale.
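
For example, a min/max style normalizer configured with `min=0` and `max=100` would map a raw feature value of 25 to (25 - 0) / (100 - 0) = 0.25; the module's {solr-javadocs}/solr-ltr/org/apache/solr/ltr/norm/MinMaxNormalizer.html[MinMaxNormalizer] provides this kind of linear rescaling.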

=== Training Models

==== Feature Engineering

The LTR contrib module includes several feature classes as well as support for custom features. Each feature class's javadocs contain an example to illustrate use of that class. The process of https://en.wikipedia.org/wiki/Feature_engineering[feature engineering] itself is then entirely up to your domain expertise and creativity.

[cols=",,,",options="header",]
|===
|Feature |Class |Example parameters |<<External Feature Information>>
|field length |{solr-javadocs}/solr-ltr/org/apache/solr/ltr/feature/FieldLengthFeature.html[FieldLengthFeature] |`{"field":"title"}` |not (yet) supported
|field value |{solr-javadocs}/solr-ltr/org/apache/solr/ltr/feature/FieldValueFeature.html[FieldValueFeature] |`{"field":"hits"}` |not (yet) supported
|original score |{solr-javadocs}/solr-ltr/org/apache/solr/ltr/feature/OriginalScoreFeature.html[OriginalScoreFeature] |`{}` |not applicable

@ -84,12 +76,10 @@ The LTR contrib module includes several feature classes as well as support for c

|(custom) |(custom class extending {solr-javadocs}/solr-ltr/org/apache/solr/ltr/norm/Normalizer.html[Normalizer]) |
|===

==== Feature Extraction

The ltr contrib module includes a <<transforming-result-documents.adoc#transforming-result-documents,`[features]`>> transformer to support the calculation and return of feature values for https://en.wikipedia.org/wiki/Feature_extraction[feature extraction] purposes, especially when you do not yet have an actual reranking model.

==== Feature Selection and Model Training

Feature selection and model training take place offline and outside Solr. The ltr contrib module supports two generalized forms of models as well as custom models. Each model class's javadocs contain an example to illustrate configuration of that class. Your trained model or models (e.g., different models for different customer geographies) can then be uploaded directly into Solr as JSON files using the provided REST APIs.

@ -102,8 +92,7 @@ Feature selection and model training take place offline and outside Solr. The lt

|(custom) |(custom class extending {solr-javadocs}/solr-ltr/org/apache/solr/ltr/model/LTRScoringModel.html[LTRScoringModel]) |(not applicable)
|===

== Quick Start with LTR

The `"techproducts"` example included with Solr is pre-configured with the plugins required for learning-to-rank, but they are disabled by default.

@ -114,7 +103,6 @@ To enable the plugins, please specify the `solr.ltr.enabled` JVM System Property

bin/solr start -e techproducts -Dsolr.ltr.enabled=true
----

=== Uploading Features

To upload features in a `/path/myFeatures.json` file, please run:

@ -154,7 +142,6 @@ To view the features you just uploaded please open the following URL in a browse

]
----

=== Extracting Features

To extract features as part of a query, add `[features]` to the `fl` parameter, for example:

@ -184,7 +171,6 @@ The output XML will include feature values as a comma-separated list, resembling

}}
----

=== Uploading a Model

To upload the model in a `/path/myModel.json` file, please run:

@ -219,7 +205,6 @@ To view the model you just uploaded please open the following URL in a browser:

}
----

=== Running a Rerank Query

To rerank the results of a query, add the `rq` parameter to your search, for example:
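
For instance (a sketch reusing the `myModel` name from the upload step; `reRankDocs` sets how many of the top documents are re-scored):

[source,text]
http://localhost:8983/solr/techproducts/query?q=test&rq={!ltr model=myModel reRankDocs=100}&fl=id,score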
|
||||||
|
@ -258,12 +243,10 @@ The output XML will include feature values as a comma-separated list, resembling
|
||||||
}}
|
}}
|
||||||
----
|
----
|
||||||
|
|
||||||
[[LearningToRank-ExternalFeatureInformation]]
|
|
||||||
=== External Feature Information
|
=== External Feature Information
|
||||||
|
|
||||||
The {solr-javadocs}/solr-ltr/org/apache/solr/ltr/feature/ValueFeature.html[ValueFeature] and {solr-javadocs}/solr-ltr/org/apache/solr/ltr/feature/SolrFeature.html[SolrFeature] classes support the use of external feature information, `efi` for short.
|
The {solr-javadocs}/solr-ltr/org/apache/solr/ltr/feature/ValueFeature.html[ValueFeature] and {solr-javadocs}/solr-ltr/org/apache/solr/ltr/feature/SolrFeature.html[SolrFeature] classes support the use of external feature information, `efi` for short.
|
||||||
|
|
||||||
[[LearningToRank-Uploadingfeatures.1]]
|
|
||||||
==== Uploading Features
|
==== Uploading Features
|
||||||
|
|
||||||
To upload features in a `/path/myEfiFeatures.json` file, please run:
|
To upload features in a `/path/myEfiFeatures.json` file, please run:
|
||||||
|
@ -308,9 +291,8 @@ To view the features you just uploaded please open the following URL in a browse
|
||||||
]
|
]
|
||||||
----
|
----
|
||||||
|
|
||||||
As an aside, you may have noticed that the `myEfiFeatures.json` example uses `"store":"myEfiFeatureStore"` attributes: read more about feature `store` in the <<Lifecycle>> section of this page.
|
As an aside, you may have noticed that the `myEfiFeatures.json` example uses `"store":"myEfiFeatureStore"` attributes: read more about feature `store` in the <<LTR Lifecycle>> section of this page.
|
||||||
|
|
||||||
[[LearningToRank-Extractingfeatures.1]]
|
|
||||||
==== Extracting Features
|
==== Extracting Features
|
||||||
|
|
||||||
To extract `myEfiFeatureStore` features as part of a query, add `efi.*` parameters to the `[features]` part of the `fl` parameter, for example:
|
To extract `myEfiFeatureStore` features as part of a query, add `efi.*` parameters to the `[features]` part of the `fl` parameter, for example:
|
||||||
|
@ -321,7 +303,6 @@ http://localhost:8983/solr/techproducts/query?q=test&fl=id,cat,manu,score,[featu
|
||||||
[source,text]
|
[source,text]
|
||||||
http://localhost:8983/solr/techproducts/query?q=test&fl=id,cat,manu,score,[features store=myEfiFeatureStore efi.text=test efi.preferredManufacturer=Apache efi.fromMobile=0 efi.answer=13]
|
http://localhost:8983/solr/techproducts/query?q=test&fl=id,cat,manu,score,[features store=myEfiFeatureStore efi.text=test efi.preferredManufacturer=Apache efi.fromMobile=0 efi.answer=13]
|
||||||
|
|
||||||
[[LearningToRank-Uploadingamodel.1]]
|
|
||||||
==== Uploading a Model
|
==== Uploading a Model
|
||||||
|
|
||||||
To upload the model in a `/path/myEfiModel.json` file, please run:
|
To upload the model in a `/path/myEfiModel.json` file, please run:
|
||||||
|
@ -359,7 +340,6 @@ To view the model you just uploaded please open the following URL in a browser:
|
||||||
}
|
}
|
||||||
----
|
----
|
||||||
|
|
||||||
[[LearningToRank-Runningarerankquery.1]]
|
|
||||||
==== Running a Rerank Query
|
==== Running a Rerank Query
|
||||||
|
|
||||||
To obtain the feature values computed during reranking, add `[features]` to the `fl` parameter and `efi.*` parameters to the `rq` parameter, for example:
|
To obtain the feature values computed during reranking, add `[features]` to the `fl` parameter and `efi.*` parameters to the `rq` parameter, for example:
|
||||||
|
@ -368,39 +348,34 @@ To obtain the feature values computed during reranking, add `[features]` to the
|
||||||
http://localhost:8983/solr/techproducts/query?q=test&rq=\{!ltr model=myEfiModel efi.text=test efi.preferredManufacturer=Apache efi.fromMobile=1}&fl=id,cat,manu,score,[features]] link:[]
|
http://localhost:8983/solr/techproducts/query?q=test&rq=\{!ltr model=myEfiModel efi.text=test efi.preferredManufacturer=Apache efi.fromMobile=1}&fl=id,cat,manu,score,[features]] link:[]
|
||||||
|
|
||||||
[source,text]
|
[source,text]
|
||||||
http://localhost:8983/solr/techproducts/query?q=test&rq=\{!ltr model=myEfiModel efi.text=test efi.preferredManufacturer=Apache efi.fromMobile=0 efi.answer=13}&fl=id,cat,manu,score,[features]]
|
http://localhost:8983/solr/techproducts/query?q=test&rq={!ltr model=myEfiModel efi.text=test efi.preferredManufacturer=Apache efi.fromMobile=0 efi.answer=13}&fl=id,cat,manu,score,[features]
|
||||||
|
|
||||||
Notice the absence of `efi.*` parameters in the `[features]` part of the `fl` parameter.
|
Notice the absence of `efi.*` parameters in the `[features]` part of the `fl` parameter.
|
||||||
|
|
||||||
[[LearningToRank-Extractingfeatureswhilstreranking]]
|
|
||||||
==== Extracting Features While Reranking
|
==== Extracting Features While Reranking
|
||||||
|
|
||||||
To extract features for `myEfiFeatureStore` features while still reranking with `myModel`:
|
To extract features for `myEfiFeatureStore` features while still reranking with `myModel`:
|
||||||
|
|
||||||
[source,text]
|
[source,text]
|
||||||
http://localhost:8983/solr/techproducts/query?q=test&rq=\{!ltr model=myModel}&fl=id,cat,manu,score,[features store=myEfiFeatureStore efi.text=test efi.preferredManufacturer=Apache efi.fromMobile=1]] link:[]
|
http://localhost:8983/solr/techproducts/query?q=test&rq={!ltr model=myModel}&fl=id,cat,manu,score,[features store=myEfiFeatureStore efi.text=test efi.preferredManufacturer=Apache efi.fromMobile=1]
|
||||||
|
|
||||||
Notice the absence of `efi.*` parameters in the `rq` parameter (because `myModel` does not use `efi` feature) and the presence of `efi.*` parameters in the `[features]` part of the `fl` parameter (because `myEfiFeatureStore` contains `efi` features).
|
Notice the absence of `efi.\*` parameters in the `rq` parameter (because `myModel` does not use `efi` feature) and the presence of `efi.*` parameters in the `[features]` part of the `fl` parameter (because `myEfiFeatureStore` contains `efi` features).
|
||||||
|
|
||||||
Read more about model evolution in the <<Lifecycle>> section of this page.
|
Read more about model evolution in the <<LTR Lifecycle>> section of this page.
|
||||||
|
|
||||||
[[LearningToRank-Trainingexample]]
|
|
||||||
=== Training Example
|
=== Training Example
|
||||||
|
|
||||||
Example training data and a demo 'train and upload model' script can be found in the `solr/contrib/ltr/example` folder in the https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git[Apache lucene-solr git repository] which is mirrored on https://github.com/apache/lucene-solr/tree/releases/lucene-solr/6.4.0/solr/contrib/ltr/example[github.com] (the `solr/contrib/ltr/example` folder is not shipped in the solr binary release).
|
Example training data and a demo 'train and upload model' script can be found in the `solr/contrib/ltr/example` folder in the https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git[Apache lucene-solr git repository] which is mirrored on https://github.com/apache/lucene-solr/tree/releases/lucene-solr/6.4.0/solr/contrib/ltr/example[github.com] (the `solr/contrib/ltr/example` folder is not shipped in the solr binary release).
|
||||||
|
|
||||||
[[LearningToRank-Installation]]
|
== Installation of LTR
|
||||||
== Installation
|
|
||||||
|
|
||||||
The ltr contrib module requires the `dist/solr-ltr-*.jar` JARs.
|
The ltr contrib module requires the `dist/solr-ltr-*.jar` JARs.
|
||||||
|
|
||||||
[[LearningToRank-Configuration]]
|
== LTR Configuration
|
||||||
== Configuration
|
|
||||||
|
|
||||||
Learning-To-Rank is a contrib module and therefore its plugins must be configured in `solrconfig.xml`.
|
Learning-To-Rank is a contrib module and therefore its plugins must be configured in `solrconfig.xml`.
|
||||||
|
|
||||||
[[LearningToRank-Minimumrequirements]]
|
=== Minimum Requirements
|
||||||
=== Minimum requirements
|
|
||||||
|
|
||||||
* Include the required contrib JARs. Note that by default paths are relative to the Solr core so they may need adjustments to your configuration, or an explicit specification of the `$solr.install.dir`.
|
* Include the required contrib JARs. Note that by default paths are relative to the Solr core so they may need adjustments to your configuration, or an explicit specification of the `$solr.install.dir`.
|
||||||
+
|
+
|
||||||
|
@ -437,15 +412,12 @@ Learning-To-Rank is a contrib module and therefore its plugins must be configure
|
||||||
</transformer>
|
</transformer>
|
||||||
----
|
----
|
||||||
|
|
||||||
[[LearningToRank-Advancedoptions]]
|
|
||||||
=== Advanced Options
|
=== Advanced Options
|
||||||
|
|
||||||
[[LearningToRank-LTRThreadModule]]
|
|
||||||
==== LTRThreadModule
|
==== LTRThreadModule
|
||||||
|
|
||||||
A thread module can be configured for the query parser and/or the transformer to parallelize the creation of feature weights. For details, please refer to the {solr-javadocs}/solr-ltr/org/apache/solr/ltr/LTRThreadModule.html[LTRThreadModule] javadocs.
|
A thread module can be configured for the query parser and/or the transformer to parallelize the creation of feature weights. For details, please refer to the {solr-javadocs}/solr-ltr/org/apache/solr/ltr/LTRThreadModule.html[LTRThreadModule] javadocs.
|
||||||
|
|
||||||
[[LearningToRank-Featurevectorcustomization]]
|
|
||||||
==== Feature Vector Customization
|
==== Feature Vector Customization
|
||||||
|
|
||||||
The features transformer returns dense CSV values such as `featureA=0.1,featureB=0.2,featureC=0.3,featureD=0.0`.
|
The features transformer returns dense CSV values such as `featureA=0.1,featureB=0.2,featureC=0.3,featureD=0.0`.
|
||||||
|
@ -462,7 +434,6 @@ For sparse CSV output such as `featureA:0.1 featureB:0.2 featureC:0.3` you can c
|
||||||
</transformer>
|
</transformer>
|
||||||
----
|
----
|
||||||
|
|
||||||
[[LearningToRank-Implementationandcontributions]]
|
|
||||||
==== Implementation and Contributions
|
==== Implementation and Contributions
|
||||||
|
|
||||||
.How does Solr Learning-To-Rank work under the hood?
|
.How does Solr Learning-To-Rank work under the hood?
|
||||||
|
@ -481,10 +452,8 @@ Contributions for further models, features and normalizers are welcome. Related
|
||||||
* http://wiki.apache.org/lucene-java/HowToContribute
|
* http://wiki.apache.org/lucene-java/HowToContribute
|
||||||
====
|
====
|
||||||
|
|
||||||
[[LearningToRank-Lifecycle]]
|
== LTR Lifecycle
|
||||||
== Lifecycle
|
|
||||||
|
|
||||||
[[LearningToRank-Featurestores]]
|
|
||||||
=== Feature Stores
|
=== Feature Stores
|
||||||
|
|
||||||
It is recommended that you organise all your features into stores which are akin to namespaces:
|
It is recommended that you organise all your features into stores which are akin to namespaces:
|
||||||
|
@ -501,7 +470,6 @@ To inspect the content of the `commonFeatureStore` feature store:
|
||||||
|
|
||||||
`\http://localhost:8983/solr/techproducts/schema/feature-store/commonFeatureStore`
|
`\http://localhost:8983/solr/techproducts/schema/feature-store/commonFeatureStore`
|
||||||
|
|
||||||
[[LearningToRank-Models]]
|
|
||||||
=== Models
|
=== Models
|
||||||
|
|
||||||
* A model uses features from exactly one feature store.
|
* A model uses features from exactly one feature store.
|
||||||
|
@ -537,13 +505,11 @@ To delete the `currentFeatureStore` feature store:
|
||||||
curl -XDELETE 'http://localhost:8983/solr/techproducts/schema/feature-store/currentFeatureStore'
|
curl -XDELETE 'http://localhost:8983/solr/techproducts/schema/feature-store/currentFeatureStore'
|
||||||
----
|
----
|
||||||
|
|
||||||
[[LearningToRank-Applyingchanges]]
|
|
||||||
=== Applying Changes
|
=== Applying Changes
|
||||||
|
|
||||||
The feature store and the model store are both <<managed-resources.adoc#managed-resources,Managed Resources>>. Changes made to managed resources are not applied to the active Solr components until the Solr collection (or Solr core in single server mode) is reloaded.
|
The feature store and the model store are both <<managed-resources.adoc#managed-resources,Managed Resources>>. Changes made to managed resources are not applied to the active Solr components until the Solr collection (or Solr core in single server mode) is reloaded.
|
||||||
|
|
||||||
[[LearningToRank-Examples]]
|
=== LTR Examples
|
||||||
=== Examples
|
|
||||||
|
|
||||||
==== One Feature Store, Multiple Ranking Models
|
==== One Feature Store, Multiple Ranking Models
|
||||||
|
|
||||||
|
@ -628,7 +594,6 @@ The feature store and the model store are both <<managed-resources.adoc#managed-
|
||||||
}
|
}
|
||||||
----
|
----
|
||||||
|
|
||||||
[[LearningToRank-Modelevolution]]
|
|
||||||
==== Model Evolution
|
==== Model Evolution
|
||||||
|
|
||||||
* `linearModel201701` uses features from `featureStore201701`
|
* `linearModel201701` uses features from `featureStore201701`
|
||||||
|
@ -752,8 +717,7 @@ The feature store and the model store are both <<managed-resources.adoc#managed-
|
||||||
}
|
}
|
||||||
----
|
----
|
||||||
|
|
||||||
[[LearningToRank-AdditionalResources]]
|
== Additional LTR Resources
|
||||||
== Additional Resources
|
|
||||||
|
|
||||||
* "Learning to Rank in Solr" presentation at Lucene/Solr Revolution 2015 in Austin:
|
* "Learning to Rank in Solr" presentation at Lucene/Solr Revolution 2015 in Austin:
|
||||||
** Slides: http://www.slideshare.net/lucidworks/learning-to-rank-in-solr-presented-by-michael-nilsson-diego-ceccarelli-bloomberg-lp
|
** Slides: http://www.slideshare.net/lucidworks/learning-to-rank-in-solr-presented-by-michael-nilsson-diego-ceccarelli-bloomberg-lp
|
||||||
|
|
|
@@ -32,7 +32,6 @@ We can prefix this query string with local parameters to provide more informatio
 
 These local parameters would change the query to require a match on both "solr" and "rocks" while searching the "title" field by default.
 
-[[LocalParametersinQueries-BasicSyntaxofLocalParameters]]
 == Basic Syntax of Local Parameters
 
 To specify a local parameter, insert the following before the argument to be modified:
@@ -45,7 +44,6 @@ To specify a local parameter, insert the following before the argument to be mod
 
 You may specify only one local parameters prefix per argument. Values in the key-value pairs may be quoted via single or double quotes, and backslash escaping works within quoted strings.
 
-[[LocalParametersinQueries-QueryTypeShortForm]]
 == Query Type Short Form
 
 If a local parameter value appears without a name, it is given the implicit name of "type". This allows short-form representation for the type of query parser to use when parsing a query string. Thus
@@ -74,7 +72,6 @@ is equivalent to
 
 `q={!type=dismax qf=myfield v='solr rocks'}`
 
-[[LocalParametersinQueries-ParameterDereferencing]]
 == Parameter Dereferencing
 
 Parameter dereferencing, or indirection, lets you use the value of another argument rather than specifying it directly. This can be used to simplify queries, decouple user input from query parameters, or decouple front-end GUI parameters from defaults set in `solrconfig.xml`.
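To make dereferencing concrete, a small sketch; the `qq` parameter name is an arbitrary illustration, and the `name` field is assumed from the techproducts schema:

[source,bash]
----
# The main query pulls its value from the separate qq parameter via $qq;
# single quotes keep the shell from expanding $qq itself
curl -G 'http://localhost:8983/solr/techproducts/select' \
  --data-urlencode 'q={!dismax qf=name v=$qq}' \
  --data-urlencode 'qq=solr rocks'
----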
@@ -27,7 +27,6 @@ image::images/logging/logging.png[image,width=621,height=250]
 
 While this example shows logged messages for only one core, if you have multiple cores in a single instance, they will each be listed, with the level for each.
 
-[[Logging-SelectingaLoggingLevel]]
 == Selecting a Logging Level
 
 When you select the *Level* link on the left, you see the hierarchy of classpaths and classnames for your instance. A row highlighted in yellow indicates that the class has logging capabilities. Click on a highlighted row, and a menu will appear to allow you to change the log level for that class. Characters in boldface indicate that the class will not be affected by level changes to root.
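The same change can be made over HTTP rather than through the UI; a sketch assuming the standard logging endpoint, with an illustrative logger name:

[source,bash]
----
# Raise the log level for one class/package hierarchy to DEBUG
curl "http://localhost:8983/solr/admin/info/logging?set=org.apache.solr.core:DEBUG"
----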
@@ -48,7 +48,7 @@ Replication across data centers is now possible with <<cross-data-center-replica
 
 === Graph QueryParser
 
-A new <<other-parsers.adoc#OtherParsers-GraphQueryParser,`graph` query parser>> makes it possible to do graph traversal queries of Directed (Cyclic) Graphs modelled using Solr documents.
+A new <<other-parsers.adoc#graph-query-parser,`graph` query parser>> makes it possible to do graph traversal queries of Directed (Cyclic) Graphs modelled using Solr documents.
 
 [[major-5-6-docvalues]]
 === DocValues
@@ -28,12 +28,12 @@ Support for backups when running SolrCloud is provided with the <<collections-ap
 
 Two commands are available:
 
-* `action=BACKUP`: This command backs up Solr indexes and configurations. More information is available in the section <<collections-api.adoc#CollectionsAPI-backup,Backup Collection>>.
-* `action=RESTORE`: This command restores Solr indexes and configurations. More information is available in the section <<collections-api.adoc#CollectionsAPI-restore,Restore Collection>>.
+* `action=BACKUP`: This command backs up Solr indexes and configurations. More information is available in the section <<collections-api.adoc#backup,Backup Collection>>.
+* `action=RESTORE`: This command restores Solr indexes and configurations. More information is available in the section <<collections-api.adoc#restore,Restore Collection>>.
 
 == Standalone Mode Backups
 
-Backups and restoration use Solr's replication handler. Out of the box, Solr includes implicit support for replication so this API can be used. Configuration of the replication handler can, however, be customized by defining your own replication handler in `solrconfig.xml`. For details on configuring the replication handler, see the section <<index-replication.adoc#IndexReplication-ConfiguringtheReplicationHandler,Configuring the ReplicationHandler>>.
+Backups and restoration use Solr's replication handler. Out of the box, Solr includes implicit support for replication so this API can be used. Configuration of the replication handler can, however, be customized by defining your own replication handler in `solrconfig.xml`. For details on configuring the replication handler, see the section <<index-replication.adoc#configuring-the-replicationhandler,Configuring the ReplicationHandler>>.
 
 === Backup API
 
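A hedged sketch of both SolrCloud commands (the backup name and location below are placeholders; the location must be reachable by every node):

[source,bash]
----
# Back up the techproducts collection to a shared location
curl "http://localhost:8983/solr/admin/collections?action=BACKUP&name=myBackup&collection=techproducts&location=/path/to/shared/drive"

# Restore that backup into a new collection
curl "http://localhost:8983/solr/admin/collections?action=RESTORE&name=myBackup&collection=techproducts_restored&location=/path/to/shared/drive"
----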
@@ -58,7 +58,7 @@ The path where the backup will be created. If the path is not absolute then the
 
 `name`::
 The snapshot will be created in a directory called `snapshot.<name>`. If a name is not specified then the directory name would have the following format: `snapshot.<yyyyMMddHHmmssSSS>`.
 
 `numberToKeep`::
-The number of backups to keep. If `maxNumberOfBackups` has been specified on the replication handler in `solrconfig.xml`, `maxNumberOfBackups` is always used and attempts to use `numberToKeep` will cause an error. Also, this parameter is not taken into consideration if the backup name is specified. More information about `maxNumberOfBackups` can be found in the section <<index-replication.adoc#IndexReplication-ConfiguringtheReplicationHandler,Configuring the ReplicationHandler>>.
+The number of backups to keep. If `maxNumberOfBackups` has been specified on the replication handler in `solrconfig.xml`, `maxNumberOfBackups` is always used and attempts to use `numberToKeep` will cause an error. Also, this parameter is not taken into consideration if the backup name is specified. More information about `maxNumberOfBackups` can be found in the section <<index-replication.adoc#configuring-the-replicationhandler,Configuring the ReplicationHandler>>.
 
 `repository`::
 The name of the repository to be used for the backup. If no repository is specified then the local filesystem repository will be used automatically.
@@ -33,15 +33,13 @@ All of the examples in this section assume you are running the "techproducts" So
 bin/solr -e techproducts
 ----
 
-[[ManagedResources-Overview]]
-== Overview
+== Managed Resources Overview
 
 Let's begin learning about managed resources by looking at a couple of examples provided by Solr for managing stop words and synonyms using a REST API. After reading this section, you'll be ready to dig into the details of how managed resources are implemented in Solr so you can start building your own implementation.
 
-[[ManagedResources-Stopwords]]
-=== Stop Words
+=== Managing Stop Words
 
-To begin, you need to define a field type that uses the <<filter-descriptions.adoc#FilterDescriptions-ManagedStopFilter,ManagedStopFilterFactory>>, such as:
+To begin, you need to define a field type that uses the <<filter-descriptions.adoc#managed-stop-filter,ManagedStopFilterFactory>>, such as:
 
 [source,xml,subs="verbatim,callouts"]
 ----
@@ -56,7 +54,7 @@ To begin, you need to define a field type that uses the <<filter-descriptions.ad
 
 There are two important things to notice about this field type definition:
 
-<1> The filter implementation class is `solr.ManagedStopFilterFactory`. This is a special implementation of the <<filter-descriptions.adoc#FilterDescriptions-StopFilter,StopFilterFactory>> that uses a set of stop words that are managed from a REST API.
+<1> The filter implementation class is `solr.ManagedStopFilterFactory`. This is a special implementation of the <<filter-descriptions.adoc#stop-filter,StopFilterFactory>> that uses a set of stop words that are managed from a REST API.
 
 <2> The `managed="english"` attribute gives a name to the set of managed stop words, in this case indicating the stop words are for English text.
 
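For orientation, a minimal sketch of the REST calls this page goes on to describe, assuming the techproducts example and the `english` set named above:

[source,bash]
----
# List the current terms in the "english" managed stop word set
curl "http://localhost:8983/solr/techproducts/schema/analysis/stopwords/english"

# Append two terms to the set (PUT/POST adds; it does not replace the list)
curl -X PUT -H 'Content-type:application/json' \
  --data-binary '["foo","bar"]' \
  "http://localhost:8983/solr/techproducts/schema/analysis/stopwords/english"
----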
@@ -134,8 +132,7 @@ curl -X DELETE "http://localhost:8983/solr/techproducts/schema/analysis/stopword
 
 NOTE: PUT/POST is used to add terms to an existing list instead of replacing the list entirely. This is because it is more common to add a term to an existing list than it is to replace a list altogether, so the API favors the more common approach of incrementally adding terms, especially since deleting individual terms is also supported.
 
-[[ManagedResources-Synonyms]]
-=== Synonyms
+=== Managing Synonyms
 
 For the most part, the API for managing synonyms behaves similarly to the API for stop words, except instead of working with a list of words, it uses a map, where the value for each entry in the map is a set of synonyms for a term. As with stop words, the `sample_techproducts_configs` <<config-sets.adoc#config-sets,configset>> includes a pre-built set of synonym mappings suitable for the sample data that is activated by the following field type definition in schema.xml:
 
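A short sketch of adding one mapping (the terms are arbitrary; the `english` resource name is an assumption matching the field type definition):

[source,bash]
----
# Map "mad" to a set of synonyms in the "english" managed synonym resource
curl -X PUT -H 'Content-type:application/json' \
  --data-binary '{"mad":["angry","upset"]}' \
  "http://localhost:8983/solr/techproducts/schema/analysis/synonyms/english"
----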
@@ -209,8 +206,7 @@ Note that the expansion is performed when processing the PUT request so the unde
 
 Lastly, you can delete a mapping by sending a DELETE request to the managed endpoint.
 
-[[ManagedResources-ApplyingChanges]]
-== Applying Changes
+== Applying Managed Resource Changes
 
 Changes made to managed resources via this REST API are not applied to the active Solr components until the Solr collection (or Solr core in single server mode) is reloaded.
 
@@ -227,7 +223,6 @@ However, the intent of this API implementation is that changes will be applied u
 Changing things like stop words and synonym mappings typically requires re-indexing existing documents if they are used by index-time analyzers. The RestManager framework does not guard you from this; it simply makes it possible to programmatically build up a set of stop words, synonyms, etc.
 ====
 
-[[ManagedResources-RestManagerEndpoint]]
 == RestManager Endpoint
 
 Metadata about registered ManagedResources is available using the `/schema/managed` endpoint for each collection.
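A one-line sketch of querying that endpoint for the techproducts collection:

[source,bash]
----
# List metadata for every managed resource registered in the collection
curl "http://localhost:8983/solr/techproducts/schema/managed"
----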
@@ -34,8 +34,7 @@ Specifies whether statistics are returned with results. You can override the `st
 `wt`::
 The output format. This operates the same as the <<response-writers.adoc#response-writers,`wt` parameter in a query>>. The default is `xml`.
 
-[[MBeanRequestHandler-Examples]]
-== Examples
+== MBeanRequestHandler Examples
 
 The following examples assume you are running Solr's `techproducts` example configuration:
 
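For instance, a sketch that narrows the output to cache statistics (the `cat` value is one of the MBean categories):

[source,bash]
----
# Return statistics for all cache MBeans in JSON
curl "http://localhost:8983/solr/techproducts/admin/mbeans?stats=true&cat=CACHE&wt=json"
----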
@@ -27,7 +27,6 @@ To merge indexes, they must meet these requirements:
 
 Optimally, the two indexes should be built using the same schema.
 
-[[MergingIndexes-UsingIndexMergeTool]]
 == Using IndexMergeTool
 
 To merge the indexes, do the following:
@@ -43,9 +42,8 @@ java -cp $SOLR/server/solr-webapp/webapp/WEB-INF/lib/lucene-core-VERSION.jar:$SO
 This will create a new index at `/path/to/newindex` that contains both index1 and index2.
 . Copy this new directory to the location of your application's Solr index (move the old one aside first, of course) and start Solr.
 
-[[MergingIndexes-UsingCoreAdmin]]
 == Using CoreAdmin
 
-The `MERGEINDEXES` command of the <<coreadmin-api.adoc#CoreAdminAPI-MERGEINDEXES,CoreAdminHandler>> can be used to merge indexes into a new core – either from one or more arbitrary `indexDir` directories or by merging from one or more existing `srcCore` core names.
+The `MERGEINDEXES` command of the <<coreadmin-api.adoc#coreadmin-mergeindexes,CoreAdminHandler>> can be used to merge indexes into a new core – either from one or more arbitrary `indexDir` directories or by merging from one or more existing `srcCore` core names.
 
-See the <<coreadmin-api.adoc#CoreAdminAPI-MERGEINDEXES,CoreAdminHandler>> section for details.
+See the <<coreadmin-api.adoc#coreadmin-mergeindexes,CoreAdminHandler>> section for details.
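A hedged sketch of the `indexDir` variant (the paths and core name are placeholders; the target core must already exist):

[source,bash]
----
# Merge two existing index directories into the core named new_core
curl "http://localhost:8983/solr/admin/cores?action=MERGEINDEXES&core=new_core&indexDir=/solr_home/core1_data/index&indexDir=/solr_home/core2_data/index"
----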
@@ -28,7 +28,6 @@ The second is to use it as a search component. This is less desirable since it p
 
 The final approach is to use it as a request handler but with externally supplied text. This case, also referred to as the MoreLikeThisHandler, will supply information about similar documents in the index based on the text of the input document.
 
-[[MoreLikeThis-HowMoreLikeThisWorks]]
 == How MoreLikeThis Works
 
 `MoreLikeThis` constructs a Lucene query based on terms in a document. It does this by pulling terms from the defined list of fields (see the `mlt.fl` parameter, below). For best results, the fields should have stored term vectors in `schema.xml`. For example:
@@ -42,7 +41,6 @@ If term vectors are not stored, `MoreLikeThis` will generate terms from stored f
 
 The next phase filters terms from the original document using thresholds defined with the MoreLikeThis parameters. Finally, a query is run with these terms, and any other query parameters that have been defined (see the `mlt.qf` parameter, below), and a new document set is returned.
 
-[[MoreLikeThis-CommonParametersforMoreLikeThis]]
 == Common Parameters for MoreLikeThis
 
 The table below summarizes the `MoreLikeThis` parameters supported by Lucene/Solr. These parameters can be used with any of the three possible MoreLikeThis approaches.
@@ -77,8 +75,6 @@ Specifies if the query will be boosted by the interesting term relevance. It can
 `mlt.qf`::
 Query fields and their boosts using the same format as that used by the <<the-dismax-query-parser.adoc#the-dismax-query-parser,DisMax Query Parser>>. These fields must also be specified in `mlt.fl`.
 
-
-[[MoreLikeThis-ParametersfortheMoreLikeThisComponent]]
 == Parameters for the MoreLikeThisComponent
 
 Using MoreLikeThis as a search component returns similar documents for each document in the response set. In addition to the common parameters, these additional options are available:
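A minimal sketch of the component in action against the techproducts example (the query, field, and count are illustrative choices, not required values):

[source,bash]
----
# Return up to 3 similar documents for each hit, drawing terms from the name field
curl "http://localhost:8983/solr/techproducts/select?q=apple&mlt=true&mlt.fl=name&mlt.count=3"
----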
@@ -89,8 +85,6 @@ If set to `true`, activates the `MoreLikeThis` component and enables Solr to ret
 `mlt.count`::
 Specifies the number of similar documents to be returned for each result. The default value is 5.
 
-
-[[MoreLikeThis-ParametersfortheMoreLikeThisHandler]]
 == Parameters for the MoreLikeThisHandler
 
 The table below summarizes parameters accessible through the `MoreLikeThisHandler`. It supports faceting, paging, and filtering using common query parameters, but does not work well with alternate query parsers.
@@ -105,7 +99,6 @@ Specifies an offset into the main query search results to locate the document on
 Controls how the `MoreLikeThis` component presents the "interesting" terms (the top TF/IDF terms) for the query. Supports three settings: `list` lists the terms, `none` lists no terms, and `details` lists the terms along with the boost value used for each term. Unless `mlt.boost=true`, all terms will have `boost=1.0`.
 
-
-[[MoreLikeThis-MoreLikeThisQueryParser]]
 == MoreLikeThis Query Parser
 
 The `mlt` query parser provides a mechanism to retrieve documents similar to a given document, like the handler. More information on the usage of the mlt query parser can be found in the section <<other-parsers.adoc#other-parsers,Other Parsers>>.
@@ -26,7 +26,6 @@ With NRT, you can modify a `commit` command to be a *soft commit*, which avoids
 
 However, pay special attention to cache and autowarm settings as they can have a significant impact on NRT performance.
 
-[[NearRealTimeSearching-CommitsandOptimizing]]
 == Commits and Optimizing
 
 A commit operation makes index changes visible to new search requests. A *hard commit* uses the transaction log to get the id of the latest document changes, and also calls `fsync` on the index files to ensure they have been flushed to stable storage and no data loss will result from a power failure. The current transaction log is closed and a new one is opened. See the "transaction log" discussion below for data loss issues.
@@ -45,7 +44,6 @@ The number of milliseconds to wait before pushing documents to the index. It wor
 
 Use `maxDocs` and `maxTime` judiciously to fine-tune your commit strategies.
 
-[[NearRealTimeSearching-TransactionLogs]]
 === Transaction Logs (tlogs)
 
 Transaction logs are a "rolling window" of at least the last `N` (default 100) documents indexed. Tlogs are configured in solrconfig.xml, including the value of `N`. The current transaction log is closed and a new one opened each time any variety of hard commit occurs. Soft commits have no effect on the transaction log.
@@ -54,7 +52,6 @@ When tlogs are enabled, documents being added to the index are written to the tl
 
 When Solr is shut down gracefully (i.e., using the `bin/solr stop` command and the like) Solr will close the tlog file and index segments so no replay will be necessary on startup.
 
-[[NearRealTimeSearching-AutoCommits]]
 === AutoCommits
 
 An autocommit also uses the parameters `maxDocs` and `maxTime`. However, it's useful in many strategies to use both a hard `autocommit` and `autosoftcommit` to achieve more flexible commits.
@@ -72,7 +69,6 @@ For example:
 
 It's better to use `maxTime` rather than `maxDocs` to modify an `autoSoftCommit`, especially when indexing a large number of documents through the commit operation. It's also better to turn off `autoSoftCommit` for bulk indexing.
 
-[[NearRealTimeSearching-OptionalAttributesforcommitandoptimize]]
 === Optional Attributes for commit and optimize
 
 `waitSearcher`::
@@ -99,7 +95,6 @@ Example of `commit` and `optimize` with optional attributes:
 <optimize waitSearcher="false"/>
 ----
 
-[[NearRealTimeSearching-PassingcommitandcommitWithinparametersaspartoftheURL]]
 === Passing commit and commitWithin Parameters as Part of the URL
 
 Update handlers can also get `commit`-related parameters as part of the update URL, if the `stream.body` feature is enabled. This example adds a small test document and causes an explicit commit to happen immediately afterwards:
@@ -132,10 +127,9 @@ curl http://localhost:8983/solr/my_collection/update?commitWithin=10000
 -H "Content-Type: text/xml" --data-binary '<add><doc><field name="id">testdoc</field></doc></add>'
 ----
 
-WARNING: While the `stream.body` feature is great for development and testing, it should normally not be enabled in production systems, as it lets a user with READ permissions post data that may alter the system state. The feature is disabled by default. See <<requestdispatcher-in-solrconfig.adoc#RequestDispatcherinSolrConfig-requestParsersElement,RequestDispatcher in SolrConfig>> for details.
+WARNING: While the `stream.body` feature is great for development and testing, it should normally not be enabled in production systems, as it lets a user with READ permissions post data that may alter the system state. The feature is disabled by default. See <<requestdispatcher-in-solrconfig.adoc#requestparsers-element,RequestDispatcher in SolrConfig>> for details.
 
-[[NearRealTimeSearching-ChangingdefaultcommitWithinBehavior]]
-=== Changing default commitWithin Behavior
+=== Changing Default commitWithin Behavior
 
 The `commitWithin` settings allow forcing document commits to happen in a defined time period. This is used most frequently with <<near-real-time-searching.adoc#near-real-time-searching,Near Real Time Searching>>, and for that reason the default is to perform a soft commit. This does not, however, replicate new documents to slave servers in a master/slave environment. If that's a requirement for your implementation, you can force a hard commit by adding a parameter, as in this example:
 
@@ -24,7 +24,6 @@ This section details the other parsers, and gives examples for how they might be
 
 Many of these parsers are expressed the same way as <<local-parameters-in-queries.adoc#local-parameters-in-queries,Local Parameters in Queries>>.
 
-[[OtherParsers-BlockJoinQueryParsers]]
 == Block Join Query Parsers
 
 There are two query parsers that support block joins. These parsers allow indexing and searching for relational content that has been <<uploading-data-with-index-handlers.adoc#uploading-data-with-index-handlers,indexed as nested documents>>.
@@ -55,7 +54,6 @@ The example usage of the query parsers below assumes these two documents and eac
 </add>
 ----
 
-[[OtherParsers-BlockJoinChildrenQueryParser]]
 === Block Join Children Query Parser
 
 This parser takes a query that matches some parent documents and returns their children.
@@ -80,16 +78,16 @@ Using the example documents above, we can construct a query such as `q={!child o
 
 Note that the query for `someParents` should match only parent documents passed by `allParents` or you may get an exception:
 
-....
+[literal]
 Parent query must not match any docs besides parent filter. Combine them as must (+) and must-not (-) clauses to find a problem doc.
-....
+
 In older versions the error is:
-....
+
+[literal]
 Parent query yields document which is not matched by parents filter.
-....
+
 You can search for `q=+(someParents) -(allParents)` to find a cause.
 
-[[OtherParsers-BlockJoinParentQueryParser]]
 === Block Join Parent Query Parser
 
 This parser takes a query that matches child documents and returns their parents.
@@ -101,13 +99,15 @@ The parameter `allParents` is a filter that matches *only parent documents*; her
 The parameter `someChildren` is a query that matches some or all of the child documents.
 
 Note that the query for `someChildren` should match only child documents or you may get an exception:
-....
+
+[literal]
 Child query must not match same docs with parent filter. Combine them as must clauses (+) to find a problem doc.
-....
-In older version it's:
-....
+
+In older versions the error is:
+
+[literal]
 child query must only match non-parent docs.
-....
+
 You can search for `q=+(parentFilter) +(someChildren)` to find a cause.
 
 Again using the example documents above, we can construct a query such as `q={!parent which="content_type:parentDocument"}comments:SolrCloud`. We get this document in response:
@@ -133,20 +133,17 @@ A common mistake is to try to filter parents with a `which` filter, as in this b
 Instead, you should use a sibling mandatory clause as a filter:
 
 `q= *+title:join* +{!parent which="*content_type:parentDocument*"}comments:SolrCloud`
 
 ====
 
-[[OtherParsers-Scoring]]
-=== Scoring
+=== Scoring with the Block Join Parent Query Parser
 
 You can optionally use the `score` local parameter to return scores of the subordinate query. The values to use for this parameter define the type of aggregation, which are `avg` (average), `max` (maximum), `min` (minimum), or `total` (sum). The implicit default is `none`, which returns `0.0`.
 
-[[OtherParsers-BoostQueryParser]]
 == Boost Query Parser
 
 `BoostQParser` extends the `QParserPlugin` and creates a boosted query from the input value. The main value is the query to be boosted. Parameter `b` is the function query to use as the boost. The query to be boosted may be of any type.
 
-Examples:
+=== Boost Query Parser Examples
 
 Creates a query "foo" which is boosted (scores are multiplied) by the function query `log(popularity)`:
 
@@ -162,7 +159,7 @@ Creates a query "foo" which is boosted by the date boosting function referenced
 {!boost b=recip(ms(NOW,mydatefield),3.16e-11,1,1)}foo
 ----
 
-[[OtherParsers-CollapsingQueryParser]]
+[[other-collapsing]]
 == Collapsing Query Parser
 
 The `CollapsingQParser` is really a _post filter_ that provides more performant field collapsing than Solr's standard approach when the number of distinct groups in the result set is high.
@@ -171,7 +168,6 @@ This parser collapses the result set to a single document per group before it fo
 
 Details about using the `CollapsingQParser` can be found in the section <<collapse-and-expand-results.adoc#collapse-and-expand-results,Collapse and Expand Results>>.
 
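A small sketch of the parser in use (the `manu_id_s` collapse field is an assumption based on the techproducts sample schema):

[source,bash]
----
# Keep only the highest-scoring document per manufacturer group
curl -G 'http://localhost:8983/solr/techproducts/select' \
  --data-urlencode 'q=memory' \
  --data-urlencode 'fq={!collapse field=manu_id_s}'
----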
-[[OtherParsers-ComplexPhraseQueryParser]]
 == Complex Phrase Query Parser
 
 The `ComplexPhraseQParser` provides support for wildcards, ORs, etc., inside phrase queries using Lucene's {lucene-javadocs}/queryparser/org/apache/lucene/queryparser/complexPhrase/ComplexPhraseQueryParser.html[`ComplexPhraseQueryParser`].
@@ -204,15 +200,13 @@ A mix of ordered and unordered complex phrase queries:
 +_query_:"{!complexphrase inOrder=true}manu:\"a* c*\"" +_query_:"{!complexphrase inOrder=false df=name}\"bla* pla*\""
 ----
 
-[[OtherParsers-Limitations]]
-=== Limitations
+=== Complex Phrase Parser Limitations
 
 Performance is sensitive to the number of unique terms that are associated with a pattern. For instance, searching for "a*" will form a large OR clause (technically a SpanOr with many terms) for all of the terms in your index for the indicated field that start with the single letter 'a'. It may be prudent to restrict wildcards to at least two or preferably three letters as a prefix. Allowing very short prefixes may result in too many low-quality documents being returned.
 
 Notice that it also supports leading wildcards "*a", with consequent performance implications. Applying <<filter-descriptions.adoc#reversed-wildcard-filter,ReversedWildcardFilterFactory>> in index-time analysis is usually a good idea.
 
-[[OtherParsers-MaxBooleanClauses]]
-==== MaxBooleanClauses
+==== MaxBooleanClauses with Complex Phrase Parser
 
 You may need to increase MaxBooleanClauses in `solrconfig.xml` as a result of the term expansion above:
 
@@ -221,10 +215,9 @@ You may need to increase MaxBooleanClauses in `solrconfig.xml` as a result of th
 <maxBooleanClauses>4096</maxBooleanClauses>
 ----
 
-This property is described in more detail in the section <<query-settings-in-solrconfig.adoc#QuerySettingsinSolrConfig-QuerySizingandWarming,Query Sizing and Warming>>.
+This property is described in more detail in the section <<query-settings-in-solrconfig.adoc#query-sizing-and-warming,Query Sizing and Warming>>.
 
-[[OtherParsers-Stopwords]]
-==== Stopwords
+==== Stopwords with Complex Phrase Parser
 
 It is recommended not to use stopword elimination with this query parser.
 
@@ -246,12 +239,10 @@ the document is returned. The next query that _does_ use the Complex Phrase Quer
 
 does _not_ return that document because SpanNearQuery has no good way to handle stopwords in a way analogous to PhraseQuery. If you must remove stopwords for your use case, use a custom filter factory or perhaps a customized synonyms filter that reduces given stopwords to some impossible token.
 
-[[OtherParsers-Escaping]]
-==== Escaping
+==== Escaping with Complex Phrase Parser
 
 Special care has to be taken when escaping: clauses between double quotes (usually the whole query) are parsed twice, so these parts have to be escaped twice, e.g., `"foo\\: bar\\^"`.
 
-[[OtherParsers-FieldQueryParser]]
 == Field Query Parser
 
 The `FieldQParser` extends the `QParserPlugin` and creates a field query from the input value, applying text analysis and constructing a phrase query if appropriate. The parameter `f` is the field to be queried.
@@ -265,7 +256,6 @@ Example:
 
 This example creates a phrase query with "foo" followed by "bar" (assuming the analyzer for `myfield` is a text field with an analyzer that splits on whitespace and lowercases terms). This is generally equivalent to the Lucene query parser expression `myfield:"Foo Bar"`.
 
-[[OtherParsers-FunctionQueryParser]]
 == Function Query Parser
 
 The `FunctionQParser` extends the `QParserPlugin` and creates a function query from the input value. This is only one way to use function queries in Solr; for another, more integrated, approach, see the section on <<function-queries.adoc#function-queries,Function Queries>>.
@@ -277,7 +267,6 @@ Example:
 {!func}log(foo)
 ----
 
-[[OtherParsers-FunctionRangeQueryParser]]
 == Function Range Query Parser
 
 The `FunctionRangeQParser` extends the `QParserPlugin` and creates a range query over a function. This is also referred to as `frange`, as seen in the examples below.
@@ -312,15 +301,13 @@ Both of these examples restrict the results by a range of values found in a decl
 
 For more information about range queries over functions, see Yonik Seeley's introductory blog post https://lucidworks.com/2009/07/06/ranges-over-functions-in-solr-14/[Ranges over Functions in Solr 1.4].
 
-[[OtherParsers-GraphQueryParser]]
 == Graph Query Parser
 
 The `graph` query parser does a breadth-first, cyclic-aware graph traversal of all documents that are "reachable" from a starting set of root documents identified by a wrapped query.
 
 The graph is built according to linkages between documents based on the terms found in `from` and `to` fields that you specify as part of the query.
 
-[[OtherParsers-Parameters]]
-=== Parameters
+=== Graph Query Parameters
 
 `to`::
 The field name of matching documents to inspect to identify outgoing edges for graph traversal. Defaults to `edge_ids`.
@@ -342,17 +329,15 @@ Boolean that indicates if the results of the query should be filtered so that on
 
 `useAutn`:: Boolean that indicates if Automatons should be compiled for each iteration of the breadth-first search, which may be faster for some graphs. Defaults to `false`.
 
-[[OtherParsers-Limitations.1]]
-=== Limitations
+=== Graph Query Limitations
 
 The `graph` parser only works in single node Solr installations, or with <<solrcloud.adoc#solrcloud,SolrCloud>> collections that use exactly 1 shard.
 
-[[OtherParsers-Examples]]
-=== Examples
+=== Graph Query Examples
 
 To understand how the graph parser works, consider the following Directed Cyclic Graph, containing 8 nodes (A to H) and 9 edges (1 to 9):
 
-image::images/other-parsers/graph_qparser_example.png[image,height=200]
+image::images/other-parsers/graph_qparser_example.png[image,height=100]
 
 One way to model this graph as Solr documents would be to create one document per node, with multivalued fields identifying the incoming and outgoing edges for each node:
 
@@ -426,7 +411,6 @@ http://localhost:8983/solr/my_graph/query?fl=id&q={!graph+from=in_edge+to=out_ed
 }
 ----
 
-[[OtherParsers-SimplifiedModels]]
 === Simplified Models
 
 The Document & Field modeling used in the above examples enumerated all of the outgoing and incoming edges for each node explicitly, to help demonstrate exactly how the "from" and "to" params work, and to give you an idea of what is possible. With multiple sets of fields like these for identifying incoming and outgoing edges, it's possible to model many independent Directed Graphs that contain some or all of the documents in your collection.
@@ -469,7 +453,6 @@ http://localhost:8983/solr/alt_graph/query?fl=id&q={!graph+from=id+to=out_edge+m
 }
 ----
 
-[[OtherParsers-JoinQueryParser]]
 == Join Query Parser
 
 `JoinQParser` extends the `QParserPlugin`. It allows normalizing relationships between documents with a join operation. This is different from the concept of a join in a relational database because no information is being truly joined. An appropriate SQL analogy would be an "inner query".
@@ -493,8 +476,7 @@ fq = price:[* TO 12]
 
 The join operation is done on a term basis, so the "from" and "to" fields must use compatible field types. For example: joining between a `StrField` and a `TrieIntField` will not work; likewise, joining between a `StrField` and a `TextField` that uses `LowerCaseFilterFactory` will only work for values that are already lower cased in the string field.
 
-[[OtherParsers-Scoring.1]]
-=== Scoring
+=== Join Parser Scoring
 
 You can optionally use the `score` parameter to return scores of the subordinate query. The values to use for this parameter define the type of aggregation, which are `avg` (average), `max` (maximum), `min` (minimum), `total`, or `none`.
 
@@ -504,7 +486,6 @@ You can optionally use the `score` parameter to return scores of the subordinate
 Specifying the `score` local parameter switches the join algorithm. This might have performance implications on large indices; more importantly, the scoring algorithm will not work for single-valued numeric fields starting from 7.0. Users are encouraged to change field types to string and rebuild indexes during migration.
 ====
 
-[[OtherParsers-JoiningAcrossCollections]]
 === Joining Across Collections
 
 You can also specify a `fromIndex` parameter to join with a field from another core or collection. If running in SolrCloud mode, then the collection specified in the `fromIndex` parameter must have a single shard and a replica on all Solr nodes where the collection you're joining to has a replica.
@@ -548,7 +529,6 @@ At query time, the `JoinQParser` will access the local replica of the *movie_dir
 
 For more information about join queries, see the Solr Wiki page on http://wiki.apache.org/solr/Join[Joins]. Erick Erickson has also written a blog post about join performance titled https://lucidworks.com/2012/06/20/solr-and-joins/[Solr and Joins].
 
-[[OtherParsers-LuceneQueryParser]]
 == Lucene Query Parser
 
 The `LuceneQParser` extends the `QParserPlugin` by parsing Solr's variant on the Lucene QueryParser syntax. This is effectively the same query parser that is used in Lucene. It uses the parameters `q.op`, the default operator ("OR" or "AND"), and `df`, the default field name.
@ -562,7 +542,6 @@ Example:
|
||||||
|
|
||||||

For more information about the syntax for the Lucene Query Parser, see the {lucene-javadocs}/queryparser/org/apache/lucene/queryparser/classic/package-summary.html[Classic QueryParser javadocs].

== Learning To Rank Query Parser

The `LTRQParserPlugin` is a special-purpose parser for reranking the top results of a simple query using a more complex ranking query based on a machine-learned model.
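
Example (a sketch, assuming a model named `myModel` has already been uploaded and configured):

[source,text]
----
rq={!ltr model=myModel reRankDocs=100}
----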

Details about using the `LTRQParserPlugin` can be found in the <<learning-to-rank.adoc#learning-to-rank,Learning To Rank>> section.

== Max Score Query Parser

The `MaxScoreQParser` extends the `LuceneQParser` but returns the maximum score from its clauses. It does this by wrapping all `SHOULD` clauses in a `DisjunctionMaxQuery` with `tie=1.0`. Any `MUST` or `PROHIBITED` clauses are passed through as-is. Non-boolean queries (e.g., numeric range queries) fall through to the `LuceneQParser` behavior.

Example:

[source,text]
----
{!maxscore tie=0.01}C OR (D AND E)
----

== More Like This Query Parser

`MLTQParser` enables retrieving documents that are similar to a given document. It uses Lucene's existing `MoreLikeThis` logic and also works in SolrCloud mode. The document identifier used here is the unique id value and not the Lucene internal document id. The list of returned documents excludes the queried document.
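
A minimal invocation might look like this (assuming `name` is the field to mine for similarity and `1` is a document's unique id):

[source,text]
----
{!mlt qf=name}1
----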

Adding more constraints to what qualifies as similar, using `mintf` (minimum term frequency) and `mindf` (minimum document frequency):

[source,text]
----
{!mlt qf=name mintf=2 mindf=3}1
----

== Nested Query Parser

The `NestedParser` extends the `QParserPlugin` and creates a nested query, with the ability for that query to redefine its type via local parameters. This is useful in specifying defaults in configuration and letting clients indirectly reference them.
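
A sketch of how this might be wired up in `solrconfig.xml` (the handler name and `q1` parameter reference here are illustrative):

[source,xml]
----
<requestHandler name="/select" class="solr.SearchHandler">
  <lst name="defaults">
    <str name="df">text</str>
    <!-- the real query is read indirectly from the q1 request parameter -->
    <str name="q">{!query defType=func v=$q1}</str>
  </lst>
</requestHandler>
----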

If the `q1` parameter is price, then the query would be a function query on the price field.

For more information about the possibilities of nested queries, see Yonik Seeley's blog post https://lucidworks.com/2009/03/31/nested-queries-in-solr/[Nested Queries in Solr].

== Payload Query Parsers

These query parsers utilize payloads encoded on terms during indexing. Two payload-aware parsers are available:

* `PayloadScoreQParser`
* `PayloadCheckQParser`

=== Payload Score Parser

`PayloadScoreQParser` incorporates each matching term's numeric (integer or float) payloads into the scores.

Example:

[source,text]
----
{!payload_score f=my_field_dpf v=some_term func=max}
----

=== Payload Check Parser

`PayloadCheckQParser` only matches when the matching terms also have the specified payloads.

Each specified payload will be encoded using the encoder determined from the field type. Example:

[source,text]
----
{!payload_check f=words_dps payloads="VERB NOUN"}searching stuff
----

== Prefix Query Parser

`PrefixQParser` extends the `QParserPlugin` by creating a prefix query from the input value. Currently no analysis or value transformation is done to create this prefix query.
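
Example (`myfield` is an illustrative field name, consistent with the equivalence noted below):

[source,text]
----
{!prefix f=myfield}foo
----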

This would be generally equivalent to the Lucene query parser expression `myfield:foo*`.

== Raw Query Parser

`RawQParser` extends the `QParserPlugin` by creating a term query from the input value without any text analysis or transformation. This is useful in debugging, or when raw terms are returned from the terms component (this is not the default).
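
Example (a sketch of an input consistent with the constructed query described below):

[source,text]
----
{!raw f=myfield}Foo Bar
----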

This example constructs the query: `TermQuery(Term("myfield","Foo Bar"))`.

For easy filter construction to drill down in faceting, the <<Term Query Parser,TermQParserPlugin>> is recommended.

For full analysis on all fields, including text fields, you may want to use the <<Field Query Parser,FieldQParserPlugin>>.

== Re-Ranking Query Parser

The `ReRankQParserPlugin` is a special-purpose parser for re-ranking the top results of a simple query using a more complex ranking query.

Details about using the `ReRankQParserPlugin` can be found in the <<query-re-ranking.adoc#query-re-ranking,Query Re-Ranking>> section.

== Simple Query Parser

The Simple query parser in Solr is based on Lucene's SimpleQueryParser. This query parser is designed to allow users to enter queries however they want, and it will do its best to interpret the query and return results.

Any errors in syntax are ignored and the query parser will interpret queries as best it can. However, this can lead to odd results in some cases.

== Spatial Query Parsers

There are two spatial QParsers in Solr: `geofilt` and `bbox`. There are also other ways to query spatially: using the `frange` parser with a distance function, using the standard (lucene) query parser with range syntax to pick the corners of a rectangle, or, with RPT and BBoxField, using the standard query parser with a special syntax within quotes that lets you pick the spatial predicate.
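
For instance, a radius filter with `geofilt` might look like this (the `store` field and point values are illustrative):

[source,text]
----
fq={!geofilt sfield=store pt=45.15,-93.85 d=5}
----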

All these options are documented further in the section <<spatial-search.adoc#spatial-search,Spatial Search>>.

== Surround Query Parser

The `SurroundQParser` enables the Surround query syntax, which provides proximity search functionality. There are two positional operators: `w` creates an ordered span query and `n` creates an unordered one. Both operators take a numeric value to indicate distance between two terms. The default is 1, and the maximum is 99.
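
For example, the following sketch finds documents where two terms occur within three positions of each other, in either order (the terms themselves are illustrative):

[source,text]
----
{!surround} 3n(apache, jakarta)
----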

This query parser will also accept boolean operators (`AND`, `OR`, and `NOT`, in either upper or lower case).

The non-unary operators (everything but `NOT`) support both infix `(a AND b AND c)` and prefix `AND(a, b, c)` notation.

== Switch Query Parser

`SwitchQParser` is a `QParserPlugin` that acts like a "switch" or "case" statement.
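
A sketch of the idea (the `in_stock` parameter and `inStock` field are illustrative): the parser compares the input value against `case.*` parameters and substitutes the matching query.

[source,text]
----
fq={!switch case.all=*:* case.yes=inStock:true case.no=inStock:false v=$in_stock}
----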

== Term Query Parser

`TermQParser` extends the `QParserPlugin` by creating a single term query from the input value (equivalent to `readableToIndexed()`). This is useful for generating filter queries from the external human-readable terms returned by the faceting or terms components. The only parameter is `f`, for the field.

Example:

[source,text]
----
{!term f=weight}1.5
----

For text fields, no analysis is done since raw terms are already returned from the faceting and terms components. To apply analysis to text fields as well, see the <<Field Query Parser>>, above.

If no analysis or transformation is desired for any type of field, see the <<Raw Query Parser>>, above.

== Terms Query Parser

`TermsQParser` functions similarly to the <<Term Query Parser,Term Query Parser>> but takes in multiple values separated by commas and returns documents matching any of the specified values.

This can be useful for generating filter queries from the external human-readable terms returned by the faceting or terms components, and may be more efficient in some cases than using the <<the-standard-query-parser.adoc#the-standard-query-parser,Standard Query Parser>> to generate a boolean query, since the default `method` implementation avoids scoring.

`separator`::
Separator to use when parsing the input. If set to " " (a single blank space), it will trim additional white space from the input terms.

`method`::
The internal query-building implementation: `termsFilter`, `booleanQuery`, `automaton`, or `docValuesTermsFilter`. Defaults to `termsFilter`.

*Examples*

[source,text]
----
{!terms f=categoryId method=booleanQuery separator=" "}8 6 7 5309
----

== XML Query Parser

The {solr-javadocs}/solr-core/org/apache/solr/search/XmlQParserPlugin.html[XmlQParserPlugin] extends the {solr-javadocs}/solr-core/org/apache/solr/search/QParserPlugin.html[QParserPlugin] and supports the creation of queries from XML. Example:
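
A sketch (the field name and term are illustrative):

[source,text]
----
q={!xmlparser v='<TermQuery fieldName="text">apache</TermQuery>'}
----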

Among the supported XML elements, note that `<LegacyNumericRangeQuery>` is supported but `LegacyNumericRangeQuery(Builder)` is deprecated.

=== Customizing XML Query Parser

You can configure your own custom query builders for additional XML elements. The custom builders need to extend the {solr-javadocs}/solr-core/org/apache/solr/search/SolrQueryBuilder.html[SolrQueryBuilder] or the {solr-javadocs}/solr-core/org/apache/solr/search/SolrSpanQueryBuilder.html[SolrSpanQueryBuilder] class. Example solrconfig.xml snippet:
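
A sketch of such a registration (the `<MyCustomQuery>` element name and builder class are hypothetical):

[source,xml]
----
<queryParser name="xmlparser" class="XmlQParserPlugin">
  <!-- maps the <MyCustomQuery> XML element to a custom builder class -->
  <str name="MyCustomQuery">com.example.solr.search.MyCustomQueryBuilder</str>
</queryParser>
----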

This section describes several other important elements of `schema.xml` not covered in earlier sections.

== Unique Key

The `uniqueKey` element specifies which field is a unique identifier for documents. Although `uniqueKey` is not required, it is nearly always warranted by your application design. For example, `uniqueKey` should be used if you will ever update a document in the index.
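
For example, with a field named `id` as the unique key, the declaration in `schema.xml` is a single element:

[source,xml]
----
<uniqueKey>id</uniqueKey>
----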

Schema defaults and `copyFields` cannot be used to populate the `uniqueKey` field. Further, the operation will fail if the `uniqueKey` field is used but is multivalued (or inherits the multivalue-ness from the `fieldType`). However, `uniqueKey` will continue to work, as long as the field is properly used.

== Similarity

Similarity is a Lucene class used to score a document in searching.
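
As a sketch, a similarity can be declared per field type in `schema.xml` (the parameter values shown are illustrative BM25 defaults):

[source,xml]
----
<similarity class="solr.BM25SimilarityFactory">
  <float name="k1">1.2</float>
  <float name="b">0.75</float>
</similarity>
----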

Faceting makes use of fields defined when the search applications were indexed.

Solr also supports a feature called <<morelikethis.adoc#morelikethis,MoreLikeThis>>, which enables users to submit new queries that focus on particular terms returned in an earlier query. MoreLikeThis queries can make use of faceting or clustering to provide additional aid to users.

A Solr component called a <<response-writers.adoc#response-writers,*response writer*>> manages the final presentation of the query response. Solr includes a variety of response writers, including an <<response-writers.adoc#standard-xml-response-writer,XML Response Writer>> and a <<response-writers.adoc#json-response-writer,JSON Response Writer>>.

The diagram below summarizes some key elements of the search process.

In many applications, the UI displays these sorted results to the user in "pages" containing a fixed number of matching results, and users don't typically look at results past the first few pages.

== Basic Pagination

In Solr, this basic paginated searching is supported using the `start` and `rows` parameters, and performance of this common behaviour can be tuned by utilizing the <<query-settings-in-solrconfig.adoc#queryresultcache,`queryResultCache`>> and adjusting the <<query-settings-in-solrconfig.adoc#queryresultwindowsize,`queryResultWindowSize`>> configuration options based on your expected page sizes.

=== Basic Pagination Examples
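
For instance, a request for the second page of ten results might look like this (the query term is illustrative):

[source,text]
----
q=apache&start=10&rows=10
----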

There are a few important constraints to be aware of when using `cursorMark` parameters:

* If `id` is your uniqueKey field, then sort params like `id asc` and `name asc, id desc` would both work fine, but `name asc` by itself would not

. Sorts including <<working-with-dates.adoc#working-with-dates,Date Math>> based functions that involve calculations relative to `NOW` will cause confusing results, since every document will get a new sort value on every subsequent request. This can easily result in cursors that never end, and constantly return the same documents over and over – even if the documents are never updated.
+
In this situation, choose & re-use a fixed value for the <<working-with-dates.adoc#now,`NOW` request param>> in all of your cursor requests.

Cursor mark values are computed based on the sort values of each document in the result, which means multiple documents with identical sort values will produce identical cursor mark values if one of them is the last document on a page of results. In that situation, the subsequent request using that `cursorMark` would not know which of the documents with the identical mark values should be skipped. Requiring that the uniqueKey field be used as a clause in the sort criteria guarantees that a deterministic ordering will be returned, and that every `cursorMark` value will identify a unique point in the sequence of documents.

These statistics are per core. When you are running in SolrCloud mode, these statistics correspond to the performance of an individual replica.

== Request Handler Statistics

=== Update Request Handler

Among the statistics reported for the update request handler, `transaction_logs_total_size` gives the total size of all the tlogs created since the beginning of the Solr instance.

== Cache Statistics

=== Document Cache

Phonetic matching algorithms may be used to encode tokens so that two different spellings that are pronounced similarly will match.

For overviews of and comparisons between algorithms, see http://en.wikipedia.org/wiki/Phonetic_algorithm and http://ntz-develop.blogspot.com/2011/03/phonetic-algorithms.html

== Beider-Morse Phonetic Matching (BMPM)

For examples of how to use this encoding in your analyzer, see <<filter-descriptions.adoc#beider-morse-filter,Beider Morse Filter>> in the Filter Descriptions section.

Beider-Morse Phonetic Matching (BMPM) is a "soundalike" tool that lets you search using a new phonetic matching system. BMPM helps you search for personal names (or just surnames) in a Solr/Lucene index, and is far superior to the existing phonetic codecs, such as regular soundex, metaphone, caverphone, etc.

== Daitch-Mokotoff Soundex

To use this encoding in your analyzer, see <<filter-descriptions.adoc#daitch-mokotoff-soundex-filter,Daitch-Mokotoff Soundex Filter>> in the Filter Descriptions section.

The Daitch-Mokotoff Soundex algorithm is a refinement of the Russel and American Soundex algorithms, yielding greater accuracy in matching especially Slavic and Yiddish surnames with similar pronunciation but differences in spelling.

== Double Metaphone

To use this encoding in your analyzer, see <<filter-descriptions.adoc#double-metaphone-filter,Double Metaphone Filter>> in the Filter Descriptions section. Alternatively, you may specify `encoding="DoubleMetaphone"` with the <<filter-descriptions.adoc#phonetic-filter,Phonetic Filter>>, but note that the Phonetic Filter version will *not* provide the second ("alternate") encoding that is generated by the Double Metaphone Filter for some tokens.

Encodes tokens using the double metaphone algorithm by Lawrence Philips. See the original article at http://www.drdobbs.com/the-double-metaphone-search-algorithm/184401251?pgno=2

== Metaphone

To use this encoding in your analyzer, specify `encoding="Metaphone"` with the <<filter-descriptions.adoc#phonetic-filter,Phonetic Filter>>.
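
A sketch of such an analyzer in `schema.xml` (the tokenizer choice and `inject` setting are illustrative):

[source,xml]
----
<analyzer>
  <tokenizer class="solr.StandardTokenizerFactory"/>
  <!-- inject="true" keeps the original token alongside the encoded one -->
  <filter class="solr.PhoneticFilterFactory" encoding="Metaphone" inject="true"/>
</analyzer>
----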

Encodes tokens using the Metaphone algorithm by Lawrence Philips, described in "Hanging on the Metaphone" in Computer Language, Dec. 1990.

== Soundex

To use this encoding in your analyzer, specify `encoding="Soundex"` with the <<filter-descriptions.adoc#phonetic-filter,Phonetic Filter>>.

Encodes tokens using the Soundex algorithm, which is used to relate similar names, but can also be used as a general purpose scheme to find words with similar phonemes.

See also http://en.wikipedia.org/wiki/Soundex.

== Refined Soundex

To use this encoding in your analyzer, specify `encoding="RefinedSoundex"` with the <<filter-descriptions.adoc#phonetic-filter,Phonetic Filter>>.

Encodes tokens using an improved version of the Soundex algorithm.

See http://en.wikipedia.org/wiki/Soundex.

== Caverphone

To use this encoding in your analyzer, specify `encoding="Caverphone"` with the <<filter-descriptions.adoc#phonetic-filter,Phonetic Filter>>.

Caverphone is an algorithm created by the Caversham Project at the University of Otago. The algorithm is optimised for accents present in the southern part of the city of Dunedin, New Zealand.

See http://en.wikipedia.org/wiki/Caverphone and the Caverphone 2.0 specification.

== Kölner Phonetik a.k.a. Cologne Phonetic

To use this encoding in your analyzer, specify `encoding="ColognePhonetic"` with the <<filter-descriptions.adoc#phonetic-filter,Phonetic Filter>>.

The Kölner Phonetik, an algorithm published by Hans Joachim Postel in 1969, is optimized for the German language.

See http://de.wikipedia.org/wiki/K%C3%B6lner_Phonetik

== NYSIIS

To use this encoding in your analyzer, specify `encoding="Nysiis"` with the <<filter-descriptions.adoc#phonetic-filter,Phonetic Filter>>.

NYSIIS is an encoding used to relate similar names, but can also be used as a general purpose scheme to find words with similar phonemes.