mirror of https://github.com/apache/lucene.git
Ref Guide: several minor typos and format fixes for 6.6 Guide
This commit is contained in:
parent
f0bd43582a
commit
d411dceaec
@ -113,7 +113,7 @@ curl --user solr:SolrRocks http://localhost:8983/solr/admin/authentication -H 'C
----

[[BasicAuthenticationPlugin-Setaproperty]]
=== Set a property
=== Set a Property

Set arbitrary properties for the authentication plugin. The only supported property is `blockUnknown`.
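For instance, a minimal sketch of setting that property through the authentication API, reusing the `solr:SolrRocks` credentials from the example above:

[source,bash]
----
curl --user solr:SolrRocks http://localhost:8983/solr/admin/authentication \
  -H 'Content-type:application/json' \
  -d '{"set-property": {"blockUnknown": true}}'
----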
@ -32,7 +32,7 @@ The Collections API is used to enable you to create, remove, or reload collectio
// TODO: Change column width to %autowidth.spread when https://github.com/asciidoctor/asciidoctor-pdf/issues/599 is fixed

[cols="20,15,10,15,40",options="header"]
[cols="25,10,10,15,40",options="header"]
|===
|Key |Type |Required |Default |Description
|name |string |Yes | |The name of the collection to be created.
@ -225,7 +225,7 @@ Shard splitting can be a long running process. In order to avoid timeouts, you s
// TODO: Change column width to %autowidth.spread when https://github.com/asciidoctor/asciidoctor-pdf/issues/599 is fixed

[cols="20,15,10,55",options="header"]
[cols="25,15,10,50",options="header"]
|===
|Key |Type |Required |Description
|collection |string |Yes |The name of the collection that includes the shard to be split.
@ -329,7 +329,7 @@ Shards can only created with this API for collections that use the 'implicit' ro
// TODO: Change column width to %autowidth.spread when https://github.com/asciidoctor/asciidoctor-pdf/issues/599 is fixed

[cols="20,15,10,55",options="header"]
[cols="25,15,10,50",options="header"]
|===
|Key |Type |Required |Description
|collection |string |Yes |The name of the collection that includes the shard that will be split.
@ -380,7 +380,7 @@ Deleting a shard will unload all replicas of the shard, remove them from `cluste
*Query Parameters*

[cols=",,,",options="header",]
[cols="25,15,10,50",options="header",]
|===
|Key |Type |Required |Description
|collection |string |Yes |The name of the collection that includes the shard to be deleted.
@ -692,7 +692,7 @@ Add a replica to a shard in a collection. The node name can be specified if the
// TODO: Change column width to %autowidth.spread when https://github.com/asciidoctor/asciidoctor-pdf/issues/599 is fixed

[cols="20,15,10,55",options="header"]
[cols="25,15,10,50",options="header"]
|===
|Key |Type |Required |Description
|collection |string |Yes |The name of the collection.
@ -814,7 +814,7 @@ Please note that the migrate API does not perform any de-duplication on the docu
// TODO: Change column width to %autowidth.spread when https://github.com/asciidoctor/asciidoctor-pdf/issues/599 is fixed

[cols="20,15,10,55",options="header"]
[cols="25,15,10,50",options="header"]
|===
|Key |Type |Required |Description
|collection |string |Yes |The name of the source collection from which documents will be split.
@ -1474,10 +1474,10 @@ The property to add. Note: this will have the literal 'property.' prepended to d
and

`property=property.special`

There is one pre-defined property "preferredLeader" for which shardUnique is forced to 'true' and an error returned if shardUnique is explicitly set to 'false'. PreferredLeader is a boolean property, any value assigned that is not equal (case insensitive) to 'true' will be interpreted as 'false' for preferredLeader.
|property.value |string |Yes |The value to assign to the property.
|shardUnique (1) |Boolean |No |default: false. If true, then setting this property in one replica will remove the property from all other replicas in that shard.
|shardUnique |Boolean |No |default: false. If true, then setting this property in one replica will remove the property from all other replicas in that shard.

There is one pre-defined property `preferredLeader` for which `shardUnique` is forced to 'true' and an error returned if `shardUnique` is explicitly set to 'false'. `PreferredLeader` is a boolean property, any value assigned that is not equal (case insensitive) to 'true' will be interpreted as 'false' for `preferredLeader`.
|===

[[CollectionsAPI-Output.13]]
@ -1559,7 +1559,6 @@ The property to add. Note: this will have the literal 'property.' prepended to d
and

`property=property.special`
|===

[[CollectionsAPI-Output.14]]
@ -1854,7 +1853,7 @@ Additionally, there are several parameters that can be overridden:
// TODO: Change column width to %autowidth.spread when https://github.com/asciidoctor/asciidoctor-pdf/issues/599 is fixed

[cols="20,15,10,55",options="header"]
[cols="25,15,10,50",options="header"]
|===
|Key |Type |Required |Description
|collection.configName |String |No |Defines the name of the configurations to use for this collection. These must already be stored in ZooKeeper. If not provided, Solr will default to the collection name as the configuration name.
@ -48,7 +48,7 @@ Create a ConfigSet, based on an existing ConfigSet.
// TODO: Change column width to %autowidth.spread when https://github.com/asciidoctor/asciidoctor-pdf/issues/599 is fixed

[cols="20,15,10,15,40",options="header"]
[cols="25,10,10,10,45",options="header"]
|===
|Key |Type |Required |Default |Description
|name |String |Yes | |ConfigSet to be created
@ -125,6 +125,7 @@ The path and name of the `solrcore.properties` file can be overridden using the
Every Solr core has a `core.properties` file, automatically created when using the APIs. When you create a SolrCloud collection, you can pass through custom parameters to go into each `core.properties` file that will be created, by prefixing the parameter name with "property." as a URL parameter. Example:

[source,bash]
http://localhost:8983/solr/admin/collections?action=CREATE&name=gettingstarted&numShards=1&property.my.custom.prop=edismax

That would create a `core.properties` file that has at least the following properties (others omitted for brevity):
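A sketch of what that file might then contain (the header comment and the exact core name below are illustrative):

[source,properties]
----
#Written by CorePropertiesLocator
name=gettingstarted_shard1_replica1
my.custom.prop=edismax
----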
@ -73,7 +73,7 @@ As previously mentioned, both implementations of the `langid` UpdateRequestProce
// TODO: Change column width to %autowidth.spread when https://github.com/asciidoctor/asciidoctor-pdf/issues/599 is fixed

[cols="20,10,10,10,50",options="header"]
[cols="30,10,10,10,40",options="header"]
|===
|Parameter |Type |Default |Required |Description
|langid |Boolean |true |no |Enables and disables language detection.
@ -74,7 +74,7 @@ openssl pkcs12 -nokeys -in solr-ssl.keystore.p12 -out solr-ssl.cacert.pem
|
|||
----
|
||||
|
||||
[[EnablingSSL-SetcommonSSLrelatedsystemproperties]]
|
||||
=== Set common SSL related system properties
|
||||
=== Set Common SSL-Related System Properties
|
||||
|
||||
The Solr Control Script is already setup to pass SSL-related Java system properties to the JVM. To activate the SSL settings, uncomment and update the set of properties beginning with SOLR_SSL_* in `bin/solr.in.sh`. (or `bin\solr.in.cmd` on Windows).
|
||||
|
||||
|
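A sketch of the relevant settings in `bin/solr.in.sh`, assuming the keystore created in the steps above (paths and passwords are illustrative):

[source,bash]
----
SOLR_SSL_KEY_STORE=etc/solr-ssl.keystore.jks
SOLR_SSL_KEY_STORE_PASSWORD=secret
SOLR_SSL_TRUST_STORE=etc/solr-ssl.keystore.jks
SOLR_SSL_TRUST_STORE_PASSWORD=secret
# Optional client-authentication settings
SOLR_SSL_NEED_CLIENT_AUTH=false
SOLR_SSL_WANT_CLIENT_AUTH=false
----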
@ -134,7 +134,7 @@ bin\solr.cmd -p 8984
----

[[EnablingSSL-SolrCloud]]
== SolrCloud
== SSL with SolrCloud

This section describes how to run a two-node SolrCloud cluster with no initial collections and a single-node external ZooKeeper. The commands below assume you have already created the keystore described above.
@ -167,7 +167,7 @@ If you have set up your ZooKeeper cluster to use a <<taking-solr-to-production.a
=== Run SolrCloud with SSL

[[EnablingSSL-CreateSolrhomedirectoriesfortwonodes]]
==== Create Solr home directories for two nodes
==== Create Solr Home Directories for Two Nodes

Create two copies of the `server/solr/` directory which will serve as the Solr home directories for each of your two SolrCloud nodes:
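One way to do this, as a quick sketch (the directory names are illustrative):

[source,bash]
----
mkdir cloud
cp -r server/solr cloud/node1
cp -r server/solr cloud/node2
----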
@ -679,9 +679,9 @@ For example:
* [1,10) -> will include values greater than or equal to 1 and lower than 10
* [1,10] -> will include values greater than or equal to 1 and lower than or equal to 10

The initial and end values cannot be empty. If the interval needs to be unbounded, the special character `*` can be used for both, start and end limit.
The initial and end values cannot be empty.

When using `\*`, `(` and `[`, and `)` and `]` will be treated equal. `[*,*]` will include all documents with a value in the field.
If the interval needs to be unbounded, the special character `*` can be used for both the start and end limits. When using this special character, the start syntax options (`(` and `[`), and end syntax options (`)` and `]`) will be treated the same. `[*,*]` will include all documents with a value in the field.

The interval limits may be strings but there is no need to add quotes. All the text until the comma will be treated as the start limit, and the text after that will be the end limit. For example: `[Buenos Aires,New York]`. Keep in mind that a string-like comparison will be done to match documents in string intervals (case-sensitive). The comparator can't be changed.
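A hedged request sketch showing interval facets over a numeric field (the collection and field name `price` are illustrative):

[source,bash]
----
http://localhost:8983/solr/techproducts/select?q=*:*&facet=true&facet.interval=price&f.price.facet.interval.set=[0,10]&f.price.facet.interval.set=(10,*]
----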
@ -86,16 +86,36 @@ The table below summarizes the functions available for function queries.
|
|||
|===
|
||||
|Function |Description |Syntax Examples
|
||||
|abs |Returns the absolute value of the specified value or function. |`abs(x)` `abs(-5)`
|
||||
|childfield(field)|Returns the value of the given field for one of the matched child docs when searching by <<other-parsers.adoc#OtherParsers-BlockJoinParentQueryParser,{!parent}>>. It can be used only in `sort` parameter|
|
||||
|
||||
|childfield(field) |Returns the value of the given field for one of the matched child docs when searching by <<other-parsers.adoc#OtherParsers-BlockJoinParentQueryParser,{!parent}>>. It can be used only in `sort` parameter. a|
|
||||
* `sort=childfield(name) asc` implies `$q` as a second argument and therefor it assumes `q={!parent ..}..`;
|
||||
* `sort=childfield(field,$bjq) asc` refers to a separate parameter `bjq={!parent ..}..`;
|
||||
* `sort=childfield(field,{!parent of=...}...) desc` allows to inline block join parent query
|
||||
|
||||
|concat(v,f..)|concatenates the given string fields, literals and other functions |`concat(name," ",$param,def(opt,"-"))`
|
||||
|
||||
|"constant" |Specifies a floating point constant. |`1.5`
|
||||
|
||||
|def |`def` is short for default. Returns the value of field "field", or if the field does not exist, returns the default value specified. and yields the first value where `exists()==true`.) |`def(rating,5):` This `def()` function returns the rating, or if no rating specified in the doc, returns 5 `def(myfield, 1.0):` equivalent to `if(exists(myfield),myfield,1.0)`
|
||||
|div |Divides one value or function by another. div(x,y) divides x by y. |`div(1,y)` `div(sum(x,100),max(y,1))`
|
||||
|dist |Return the distance between two vectors (points) in an n-dimensional space. Takes in the power, plus two or more ValueSource instances and calculates the distances between the two vectors. Each ValueSource must be a number. There must be an even number of ValueSource instances passed in and the method assumes that the first half represent the first vector and the second half represent the second vector. |`dist(2, x, y, 0, 0):` calculates the Euclidean distance between (0,0) and (x,y) for each document `dist(1, x, y, 0, 0)`: calculates the Manhattan (taxicab) distance between (0,0) and (x,y) for each document `dist(2, x,y,z,0,0,0):` Euclidean distance between (0,0,0) and (x,y,z) for each document. `dist(1,x,y,z,e,f,g)`: Manhattan distance between (x,y,z) and (e,f,g) where each letter is a field name
|
||||
|docfreq(field,val) |Returns the number of documents that contain the term in the field. This is a constant (the same value for all documents in the index). You can quote the term if it's more complex, or do parameter substitution for the term value. |`docfreq(text,'solr')` `...&defType=func` `&q=docfreq(text,$myterm)&myterm=solr`
|
||||
|
||||
|div |Divides one value or function by another. div(x,y) divides x by y. a|`div(1,y)`
|
||||
|
||||
`div(sum(x,100),max(y,1))`
|
||||
|
||||
|dist |Return the distance between two vectors (points) in an n-dimensional space. Takes in the power, plus two or more ValueSource instances and calculates the distances between the two vectors. Each ValueSource must be a number. There must be an even number of ValueSource instances passed in and the method assumes that the first half represent the first vector and the second half represent the second vector. a|`dist(2, x, y, 0, 0):` calculates the Euclidean distance between (0,0) and (x,y) for each document.
|
||||
|
||||
`dist(1, x, y, 0, 0)`: calculates the Manhattan (taxicab) distance between (0,0) and (x,y) for each document.
|
||||
|
||||
`dist(2, x,y,z,0,0,0):` Euclidean distance between (0,0,0) and (x,y,z) for each document.
|
||||
|
||||
`dist(1,x,y,z,e,f,g)`: Manhattan distance between (x,y,z) and (e,f,g) where each letter is a field name.
|
||||
|
||||
|docfreq(field,val) |Returns the number of documents that contain the term in the field. This is a constant (the same value for all documents in the index).
|
||||
|
||||
You can quote the term if it's more complex, or do parameter substitution for the term value. a|`docfreq(text,'solr')`
|
||||
|
||||
`...&defType=func` `&q=docfreq(text,$myterm)&myterm=solr`
|
||||
|
||||
|field[[FunctionQueries-field]] a|
|
||||
Returns the numeric docValues or indexed value of the field with the specified name. In its simplest (single argument) form, this function can only be used on single valued fields, and can be called using the name of the field as a string, or for most conventional field names simply use the field name by itself with out using the `field(...)` syntax.
|
||||
|
||||
|
@ -120,7 +140,9 @@ For multivalued docValues fields:
* `field(myMultiValuedFloatField,max)`

|hsin |The Haversine distance calculates the distance between two points on a sphere when traveling along the sphere. The values must be in radians. `hsin` also takes a Boolean argument to specify whether the function should convert its output to radians. |`hsin(2, true, x, y, 0, 0)`
|idf |Inverse document frequency; a measure of whether the term is common or rare across all documents. Obtained by dividing the total number of documents by the number of documents containing the term, and then taking the logarithm of that quotient. See also `tf`. |`idf(fieldName,'solr')`: measures the inverse of the frequency of the occurrence of the term `'solr'` in` fieldName`.

|idf |Inverse document frequency; a measure of whether the term is common or rare across all documents. Obtained by dividing the total number of documents by the number of documents containing the term, and then taking the logarithm of that quotient. See also `tf`. |`idf(fieldName,'solr')`: measures the inverse of the frequency of the occurrence of the term `'solr'` in `fieldName`.

|if a|
Enables conditional function queries. In `if(test,value1,value2)`:
@ -130,8 +152,12 @@ Enables conditional function queries. In `if(test,value1,value2)`:
An expression can be any function which outputs boolean values, or even functions returning numeric values, in which case value 0 will be interpreted as false, or strings, in which case empty string is interpreted as false.

|`if(termfreq` `(cat,'electronics'),` `popularity,42)` : This function checks each document for the to see if it contains the term "```electronics```" in the `cat` field. If it does, then the value of the `popularity` field is returned, otherwise the value of `42` is returned.
|linear |Implements `m*x+c` where `m` and `c` are constants and `x` is an arbitrary function. This is equivalent to `sum(product(m,x),c)`, but slightly more efficient as it is implemented as a single function. |`linear(x,m,c)` `linear(x,2,4)` returns `2*x+4`
a|`if(termfreq` `(cat,'electronics'),` `popularity,42)`: This function checks each document to see if it contains the term "```electronics```" in the `cat` field. If it does, then the value of the `popularity` field is returned, otherwise the value of `42` is returned.

|linear |Implements `m*x+c` where `m` and `c` are constants and `x` is an arbitrary function. This is equivalent to `sum(product(m,x),c)`, but slightly more efficient as it is implemented as a single function. a|`linear(x,m,c)`

`linear(x,2,4)` returns `2*x+4`

|log |Returns the log base 10 of the specified function. a|
`log(x)`
@ -150,23 +176,37 @@ Returns the maximum numeric value of multiple nested functions or constants, whi
(Use the `field(myfield,max)` syntax for <<FunctionQueries-field,selecting the maximum value of a single multivalued field>>)

|`max(myfield,myotherfield,0)`

|maxdoc |Returns the number of documents in the index, including those that are marked as deleted but have not yet been purged. This is a constant (the same value for all documents in the index). |`maxdoc()`

|min a|
Returns the minimum numeric value of multiple nested functions or constants, which are specified as arguments: `min(x,y,...)`. The min function can also be useful for providing an "upper bound" on a function using a constant.

(Use the `field(myfield,min)` <<FunctionQueries-field,syntax for selecting the minimum value of a single multivalued field>>)

|`min(myfield,myotherfield,0)`

|ms a|
Returns milliseconds of difference between its arguments. Dates are relative to the Unix or POSIX time epoch, midnight, January 1, 1970 UTC. Arguments may be the name of an indexed `TrieDateField`, or date math based on a <<working-with-dates.adoc#working-with-dates,constant date or `NOW`>>.

* `ms()`: Equivalent to `ms(NOW)`, number of milliseconds since the epoch.
* `ms(a)`: Returns the number of milliseconds since the epoch that the argument represents.
* `ms(a,b)` : Returns the number of milliseconds that b occurs before a (that is, a - b)
* `ms(a,b)`: Returns the number of milliseconds that b occurs before a (that is, a - b) a|`ms(NOW/DAY)`

`ms(2000-01-01T00:00:00Z)`

`ms(mydatefield)`

`ms(NOW,mydatefield)`

`ms(mydatefield, 2000-01-01T00:00:00Z)`

`ms(datefield1, datefield2)`

|`ms(NOW/DAY)` `ms(2000-01-01T00:00:00Z)` `ms(mydatefield)` `ms(NOW,mydatefield)` `ms(mydatefield,` `2000-01-01T00:00:00Z)` `ms(datefield1,` `datefield2)`
|norm(_field_) |Returns the "norm" stored in the index for the specified field. This is the product of the index time boost and the length normalization factor, according to the {lucene-javadocs}/core/org/apache/lucene/search/similarities/Similarity.html[Similarity] for the field. |`norm(fieldName)`

|numdocs |Returns the number of documents in the index, not including those that are marked as deleted but have not yet been purged. This is a constant (the same value for all documents in the index). |`numdocs()`

|ord a|
Returns the ordinal of the indexed field value within the indexed list of terms for that field in Lucene index order (lexicographically ordered by unicode value), starting at 1. In other words, for a given field, all values are ordered lexicographically; this function then returns the offset of a particular value in that ordering. The field must have a maximum of one value per document (not multi-valued). 0 is returned for documents without a value in the field.
@ -177,7 +217,10 @@ Returns the ordinal of the indexed field value within the indexed list of terms
See also `rord` below.

|`ord(myIndexedField)` Example: If there were only three values ("apple","banana","pear") for a particular field X, then: `ord(X) `would be 1 for documents containing "apple", 2 for documnts containing "banana", etc...
a|`ord(myIndexedField)`

Example: If there were only three values ("apple","banana","pear") for a particular field X, then `ord(X)` would be `1` for documents containing "apple", `2` for documents containing "banana", etc.

|payload a|
Returns the float value computed from the decoded payloads of the term specified. The return value is computed using the `min`, `max`, or `average` of the decoded payloads. A special `first` function can be used instead of the others, to short-circuit term enumeration and return only the decoded payload of the first term. The field specified must have float or integer payload encoding capability (via `DelimitedPayloadTokenFilter` or `NumericPayloadTokenFilter`). If no payload is found for the term, the default value is returned.
@ -185,36 +228,83 @@ Returns the float value computed from the decoded payloads of the term specified
* `payload(field_name,term,default_value)`: default value can be a constant, field name, or another float returning function. The `average` function is used.
* `payload(field_name,term,default_value,function)`: function values can be `min`, `max`, `average`, or `first`. |`payload(payloaded_field_dpf,term,0.0,first)`

|pow |Raises the specified base to the specified power. `pow(x,y)` raises x to the power of y. |`pow(x,y)` `pow(x,log(y))` `pow(x,0.5):` the same as `sqrt`
|pow |Raises the specified base to the specified power. `pow(x,y)` raises x to the power of y. a|`pow(x,y)`

`pow(x,log(y))`

`pow(x,0.5)`: the same as `sqrt`

|product |Returns the product of multiple values or functions, which are specified in a comma-separated list. `mul(...)` may also be used as an alias for this function. |`product(x,y,...)` `product(x,2)` `product(x,y)` `mul(x,y)`
|query |Returns the score for the given subquery, or the default value for documents not matching the query. Any type of subquery is supported through either parameter de-referencing `$otherparam` or direct specification of the query string in the <<local-parameters-in-queries.adoc#local-parameters-in-queries,Local Parameters>> through the `v` key. |`query(subquery, default)` `q=product` `(popularity,` ` query({!dismax v='solr rocks'})`: returns the product of the popularity and the score of the DisMax query. `q=product` `(popularity,` ` query($qq))&qq={!dismax}solr rocks`: equivalent to the previous query, using parameter de-referencing. `q=product` `(popularity,` ` query($qq,0.1))` `&qq={!dismax}` `solr rocks`: specifies a default score of 0.1 for documents that don't match the DisMax query.

|query |Returns the score for the given subquery, or the default value for documents not matching the query. Any type of subquery is supported through either parameter de-referencing `$otherparam` or direct specification of the query string in the <<local-parameters-in-queries.adoc#local-parameters-in-queries,Local Parameters>> through the `v` key. a|`query(subquery, default)`

`q=product` `(popularity,` `query({!dismax v='solr rocks'})`: returns the product of the popularity and the score of the DisMax query.

`q=product` `(popularity,` `query($qq))&qq={!dismax}solr rocks`: equivalent to the previous query, using parameter de-referencing.

`q=product` `(popularity,` `query($qq,0.1))` `&qq={!dismax}` `solr rocks`: specifies a default score of 0.1 for documents that don't match the DisMax query.

|recip a|
Performs a reciprocal function with `recip(x,m,a,b)` implementing `a/(m*x+b)` where `m,a,b` are constants, and `x` is any arbitrarily complex function.

When a and b are equal, and x>=0, this function has a maximum value of 1 that drops as x increases. Increasing the value of a and b together results in a movement of the entire function to a flatter part of the curve. These properties can make this an ideal function for boosting more recent documents when x is `rord(datefield)`.

|`recip(myfield,m,a,b)` `recip(rord` `(creationDate),` `1,1000,1000)`
a|`recip(myfield,m,a,b)`

`recip(rord` `(creationDate), 1,1000,1000)`

|rord |Returns the reverse ordering of that returned by `ord`. |`rord(myDateField)`

|scale a|
Scales values of the function x such that they fall between the specified `minTarget` and `maxTarget` inclusive. The current implementation traverses all of the function values to obtain the min and max, so it can pick the correct scale.

The current implementation cannot distinguish when documents have been deleted or documents that have no value. It uses 0.0 values for these cases. This means that if values are normally all greater than 0.0, one can still end up with 0.0 as the min value to map from. In these cases, an appropriate map() function could be used as a workaround to change 0.0 to a value in the real range, as shown here: scale(map(x,0,0,5),1,2)
The current implementation cannot distinguish when documents have been deleted or documents that have no value. It uses 0.0 values for these cases. This means that if values are normally all greater than 0.0, one can still end up with 0.0 as the min value to map from. In these cases, an appropriate map() function could be used as a workaround to change 0.0 to a value in the real range, as shown here: `scale(map(x,0,0,5),1,2)` a|`scale(x, minTarget, maxTarget)`

`scale(x,1,2)`: scales the values of x such that all values will be between 1 and 2 inclusive.

|`scale(x,` `minTarget,` `maxTarget)` `scale(x,1,2)`: scales the values of x such that all values will be between 1 and 2 inclusive.
|sqedist |The Square Euclidean distance calculates the 2-norm (Euclidean distance) but does not take the square root, thus saving a fairly expensive operation. It is often the case that applications that care about Euclidean distance do not need the actual distance, but instead can use the square of the distance. There must be an even number of ValueSource instances passed in and the method assumes that the first half represent the first vector and the second half represent the second vector. |`sqedist(x_td, y_td, 0, 0)`

|sqrt |Returns the square root of the specified value or function. |`sqrt(x)` `sqrt(100)` `sqrt(sum(x,100))`
|strdist |Calculate the distance between two strings. Uses the Lucene spell checker `StringDistance` interface and supports all of the implementations available in that package, plus allows applications to plug in their own via Solr's resource loading capabilities. `strdist` takes (string1, string2, distance measure). Possible values for distance measure are: jw: Jaro-Winkler edit: Levenstein or Edit distance ngram: The NGramDistance, if specified, can optionally pass in the ngram size too. Default is 2. FQN: Fully Qualified class Name for an implementation of the StringDistance interface. Must have a no-arg constructor. |`strdist("SOLR",id,edit)`
|sub |Returns x-y from sub(x,y). |`sub(myfield,myfield2)` `sub(100,` `sqrt(myfield))`
|sum |Returns the sum of multiple values or functions, which are specified in a comma-separated list. `add(...)` may be used as an alias for this function |`sum(x,y,...) sum(x,1)` `sum(x,y)` `sum(sqrt(x),log(y),z,0.5)add(x,y)`
|sumtotaltermfreq |Returns the sum of `totaltermfreq` values for all terms in the field in the entire index (i.e., the number of indexed tokens for that field). (Aliases `sumtotaltermfreq` to `sttf`.) |If doc1:(fieldX:A B C) and doc2:(fieldX:A A A A): `docFreq(fieldX:A)` = 2 (A appears in 2 docs) `freq(doc1, fieldX:A)` = 4 (A appears 4 times in doc 2) `totalTermFreq(fieldX:A)` = 5 (A appears 5 times across all docs) `sumTotalTermFreq(fieldX)` = 7 in `fieldX`, there are 5 As, 1 B, 1 C

|strdist a|Calculate the distance between two strings. Uses the Lucene spell checker `StringDistance` interface and supports all of the implementations available in that package, plus allows applications to plug in their own via Solr's resource loading capabilities. `strdist` takes (string1, string2, distance measure).

Possible values for distance measure are:

* jw: Jaro-Winkler
* edit: Levenstein or Edit distance
* ngram: The NGramDistance, if specified, can optionally pass in the ngram size too. Default is 2.
* FQN: Fully Qualified class Name for an implementation of the StringDistance interface. Must have a no-arg constructor. |`strdist("SOLR",id,edit)`

|sub |Returns `x-y` from `sub(x,y)`. a|`sub(myfield,myfield2)`

`sub(100, sqrt(myfield))`

|sum |Returns the sum of multiple values or functions, which are specified in a comma-separated list. `add(...)` may be used as an alias for this function. a|`sum(x,y,...) sum(x,1)`

`sum(x,y)`

`sum(sqrt(x),log(y),z,0.5)`

`add(x,y)`

|sumtotaltermfreq |Returns the sum of `totaltermfreq` values for all terms in the field in the entire index (i.e., the number of indexed tokens for that field). (Aliases `sumtotaltermfreq` to `sttf`.) a|If doc1:(fieldX:A B C) and doc2:(fieldX:A A A A):

`docFreq(fieldX:A)` = 2 (A appears in 2 docs)

`freq(doc1, fieldX:A)` = 4 (A appears 4 times in doc 2)

`totalTermFreq(fieldX:A)` = 5 (A appears 5 times across all docs)

`sumTotalTermFreq(fieldX)` = 7 in `fieldX`, there are 5 As, 1 B, 1 C

|termfreq |Returns the number of times the term appears in the field for that document. |`termfreq(text,'memory')`

|tf |Term frequency; returns the term frequency factor for the given term, using the {lucene-javadocs}/core/org/apache/lucene/search/similarities/Similarity.html[Similarity] for the field. The `tf-idf` value increases proportionally to the number of times a word appears in the document, but is offset by the frequency of the word in the corpus, which helps to control for the fact that some words are generally more common than others. See also `idf`. |`tf(text,'solr')`

|top a|
Causes the function query argument to derive its values from the top-level IndexReader containing all parts of an index. For example, the ordinal of a value in a single segment will be different from the ordinal of that same value in the complete index.

The `ord()` and `rord()` functions implicitly use `top()`, and hence `ord(foo)` is equivalent to `top(ord(foo))`.
The `ord()` and `rord()` functions implicitly use `top()`, and hence `ord(foo)` is equivalent to `top(ord(foo))`. |

|totaltermfreq |Returns the number of times the term appears in the field in the entire index. (Aliases `totaltermfreq` to `ttf`.) |`ttf(text,'memory')`
|===
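As a quick usage sketch, any of the functions above can be embedded in a request via the function query parser (the collection and field names are illustrative):

[source,bash]
----
http://localhost:8983/solr/techproducts/select?q={!func}div(popularity,price)&fl=*,score
----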
@ -51,4 +51,4 @@ If you are using Sun's JVM, add the `-server` command-line option when you start
A great way to see what JVM settings your server is using, along with other useful information, is to use the admin RequestHandler, `solr/admin/system`. This request handler will display a wealth of server statistics and settings.
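For example, a quick sketch of querying it (the `wt` parameter is optional):

[source,bash]
----
curl "http://localhost:8983/solr/admin/system?wt=json"
----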
You can also use any of the tools that are compatible with the Java Management Extensions (JMX). See the section _Using JMX with Solr_ in <<managing-solr.adoc#managing-solr,Managing Solr>> for more information.
You can also use any of the tools that are compatible with the Java Management Extensions (JMX). See the section <<using-jmx-with-solr.adoc#using-jmx-with-solr,Using JMX with Solr>> for more information.
@ -234,7 +234,7 @@ While starting up Solr, the following host-specific parameters need to be passed
// TODO: Change column width to %autowidth.spread when https://github.com/asciidoctor/asciidoctor-pdf/issues/599 is fixed

[cols="30,10,60",options="header"]
[cols="35,10,55",options="header"]
|===
|Parameter Name |Required |Description
|`solr.kerberos.name.rules` |No |Used to map Kerberos principals to short names. Default value is `DEFAULT`. Example of a name rule: `RULE:[1:$1@$0](.\*EXAMPLE.COM)s/@.*//`
@ -281,7 +281,7 @@ To enable delegation tokens, several parameters must be defined. These parameter
// TODO: Change column width to %autowidth.spread when https://github.com/asciidoctor/asciidoctor-pdf/issues/599 is fixed

[cols="30,10,60",options="header"]
[cols="40,10,50",options="header"]
|===
|Parameter Name |Required |Description
|`solr.kerberos.delegation.token.enabled` |Yes, to enable tokens |False by default, set to true to enable delegation tokens.
@ -565,7 +565,7 @@ See the example under <<LanguageAnalysis-TraditionalChinese,Traditional Chinese>
[[LanguageAnalysis-SimplifiedChinese]]
=== Simplified Chinese

For Simplified Chinese, Solr provides support for Chinese sentence and word segmentation with the <<LanguageAnalysis-HMMChineseTokenizer,HMM Chinese Tokenizer`>>. This component includes a large dictionary and segments Chinese text into words with the Hidden Markov Model. To use this tokenizer, you must add additional .jars to Solr's classpath (as described in the section <<lib-directives-in-solrconfig.adoc#lib-directives-in-solrconfig,Lib Directives in SolrConfig>>). See the `solr/contrib/analysis-extras/README.txt` for information on which jars you need to add to your `SOLR_HOME/lib`.
For Simplified Chinese, Solr provides support for Chinese sentence and word segmentation with the <<LanguageAnalysis-HMMChineseTokenizer,HMM Chinese Tokenizer>>. This component includes a large dictionary and segments Chinese text into words with the Hidden Markov Model. To use this tokenizer, you must add additional .jars to Solr's classpath (as described in the section <<lib-directives-in-solrconfig.adoc#lib-directives-in-solrconfig,Lib Directives in SolrConfig>>). See the `solr/contrib/analysis-extras/README.txt` for information on which jars you need to add to your `SOLR_HOME/lib`.

The default configuration of the <<tokenizers.adoc#Tokenizers-ICUTokenizer,ICU Tokenizer>> is also suitable for Simplified Chinese text. It follows the Word Break rules from the Unicode Text Segmentation algorithm for non-Chinese text, and uses a dictionary to segment Chinese words. To use this tokenizer, you must add additional .jars to Solr's classpath (as described in the section <<lib-directives-in-solrconfig.adoc#lib-directives-in-solrconfig,Lib Directives in SolrConfig>>). See the `solr/contrib/analysis-extras/README.txt` for information on which jars you need to add to your `SOLR_HOME/lib`.
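A sketch of the kind of `<lib>` directive this involves in `solrconfig.xml` (the relative path and regex below are illustrative):

[source,xml]
----
<lib dir="${solr.install.dir:../../../..}/contrib/analysis-extras/lib" regex=".*\.jar" />
----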
@ -22,7 +22,7 @@ If you are worried about data loss, and of course you _should_ be, you need a wa
Solr provides two approaches to backing up and restoring Solr cores or collections, depending on how you are running Solr. If you run in SolrCloud mode, you will use the Collections API. If you run Solr in standalone mode, you will use the replication handler.

== [[cloud-backups]]SolrCloud
== SolrCloud Backups

Support for backups when running SolrCloud is provided with the <<collections-api.adoc#collections-api,Collections API>>. This allows the backups to be generated across multiple shards, and restored to the same number of shards and replicas as the original collection.
@ -31,7 +31,7 @@ Two commands are available:
* `action=BACKUP`: This command backs up Solr indexes and configurations. More information is available in the section <<collections-api.adoc#CollectionsAPI-backup,Backup Collection>>. A request sketch is shown below.
* `action=RESTORE`: This command restores Solr indexes and configurations. More information is available in the section <<collections-api.adoc#CollectionsAPI-restore,Restore Collection>>.
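A hedged sketch of a backup request (the collection name, backup name, and location are illustrative; the `location` must be a path or repository visible to all nodes):

[source,bash]
----
http://localhost:8983/solr/admin/collections?action=BACKUP&name=myBackupName&collection=gettingstarted&location=/path/to/shared/drive
----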
== Standalone Mode
== Standalone Mode Backups

Backups and restoration use Solr's replication handler. Out of the box, Solr includes implicit support for replication so this API can be used. Configuration of the replication handler can, however, be customized by defining your own replication handler in `solrconfig.xml`. For details on configuring the replication handler, see the section <<index-replication.adoc#IndexReplication-ConfiguringtheReplicationHandler,Configuring the ReplicationHandler>>.
@ -96,4 +96,4 @@ The table below summarizes parameters accessible through the `MoreLikeThisHandle
[[MoreLikeThis-MoreLikeThisQueryParser]]
== More Like This Query Parser

The `mlt` query parser provides a mechanism to retrieve documents similar to a given document, like the handler. More information on the usage of the mlt query parser can be found int the section <<other-parsers.adoc#other-parsers,Other Parsers>>.
The `mlt` query parser provides a mechanism to retrieve documents similar to a given document, like the handler. More information on the usage of the mlt query parser can be found in the section <<other-parsers.adoc#other-parsers,Other Parsers>>.
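A hedged query sketch using the `mlt` parser (the query field and document id are illustrative):

[source,bash]
----
http://localhost:8983/solr/techproducts/select?q={!mlt qf=name mintf=1 mindf=1}doc-id-1
----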
@ -942,14 +942,33 @@ The {solr-javadocs}/solr-core/org/apache/solr/search/XmlQParserPlugin.html[XmlQP
|defType |`xmlparser`
|q a|
[source,xml]
<BooleanQuery fieldName="description"> <Clause occurs="must"> <TermQuery>shirt</TermQuery> </Clause> <Clause occurs="mustnot"> <TermQuery>plain</TermQuery> </Clause> <Clause occurs="should"> <TermQuery>cotton</TermQuery> </Clause> <Clause occurs="must"> <BooleanQuery fieldName="size"> <Clause occurs="should"> <TermsQuery>S M L</TermsQuery> </Clause> </BooleanQuery> </Clause> </BooleanQuery>
----
<BooleanQuery fieldName="description">
   <Clause occurs="must">
      <TermQuery>shirt</TermQuery>
   </Clause>
   <Clause occurs="mustnot">
      <TermQuery>plain</TermQuery>
   </Clause>
   <Clause occurs="should">
      <TermQuery>cotton</TermQuery>
   </Clause>
   <Clause occurs="must">
      <BooleanQuery fieldName="size">
         <Clause occurs="should">
            <TermsQuery>S M L</TermsQuery>
         </Clause>
      </BooleanQuery>
   </Clause>
</BooleanQuery>
----
|===

The XmlQParser implementation uses the {solr-javadocs}/solr-core/org/apache/solr/search/SolrCoreParser.html[SolrCoreParser] class which extends Lucene's {lucene-javadocs}/queryparser/org/apache/lucene/queryparser/xml/CoreParser.html[CoreParser] class. XML elements are mapped to {lucene-javadocs}/queryparser/org/apache/lucene/queryparser/xml/QueryBuilder.html[QueryBuilder] classes as follows:

// TODO: Change column width to %autowidth.spread when https://github.com/asciidoctor/asciidoctor-pdf/issues/599 is fixed

[width="50%",cols="30,70",options="header"]
[width="100%",cols="30,70",options="header"]
|===
|XML element |QueryBuilder class
|<BooleanQuery> |{lucene-javadocs}/queryparser/org/apache/lucene/queryparser/xml/builders/BooleanQueryBuilder.html[BooleanQueryBuilder]
@ -28,10 +28,8 @@ The Replication screen shows you the current replication state for the core you
[IMPORTANT]
====
When using <<getting-started-with-solrcloud.adoc#getting-started-with-solrcloud,SolrCloud>>, do not attempt to disable replication via this screen.

image::images/replication-screen/replication.png[image,width=412,height=250]
====

.Sample Replication Screen
image::images/replication-screen/replication.png[image,width=412,height=250]

More details on how to configure replication are available in the section called <<index-replication.adoc#index-replication,Index Replication>>.
@ -358,5 +358,4 @@ Some of these techniques are described in _Apache SOLR and Carrot2 integration s
The following resources provide additional information about the clustering component in Solr and its potential applications.

* Apache Solr and Carrot2 integration strategies: http://carrot2.github.io/solr-integration-strategies
* Apache Solr Wiki (covers previous Solr versions, may be inaccurate): http://carrot2.github.io/solr-integration-strategies
* Clustering and Visualization of Solr search results (video from Berlin BuzzWords conference, 2011): http://vimeo.com/26616444
@ -617,7 +617,7 @@ The query parameters can be added to the API request after a '?'.
// TODO: Change column width to %autowidth.spread when https://github.com/asciidoctor/asciidoctor-pdf/issues/599 is fixed

[cols="15,15,10,20,40",options="header"]
[cols="20,15,10,15,40",options="header"]
|===
|Key |Type |Required |Default |Description
|wt |string |No |json |Defines the format of the response. The options are *json* or *xml*. If not specified, JSON will be returned by default.
@ -69,7 +69,7 @@ You can use the `/schema/fields` <<schema-api.adoc#schema-api,Schema API>> to co
[TIP]
====
Because the `data_driven_schema_configs` config set includes a `copyField` directive that causes all content to be indexed in a predefined "catch-all" `\_text_` field, to enable single-field search that includes all fields' content, the index will be larger than it would be without the `copyField`. When you nail down your schema, consider removing the `\_text_` field and the corresponding `copyField` directive if you don't need it.
The `data_driven_schema_configs` configset includes a `copyField` directive that causes all content to be indexed in a predefined "catch-all" `\_text_` field, which is used to enable single-field search that includes all fields' content. This will cause the index to be larger than it would be without this "catch-all" `copyField`. When you nail down your schema, consider removing the `\_text_` field and the corresponding `copyField` directive if you don't need it.
====
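For reference, a sketch of the directive in question as it appears in that configset's schema (quoted from memory, so treat as illustrative):

[source,xml]
----
<copyField source="*" dest="_text_"/>
----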
[[SchemalessMode-ConfiguringSchemalessMode]]
@ -416,7 +416,7 @@ hashJoin(
== innerJoin

Wraps two streams Left and Right and for every tuple in Left which exists in Right will emit a tuple containing the fields of both tuples. This supports one-one, one-many, many-one, and many-many inner join scenarios. The tuples are emitted in the order in which they appear in the Left stream. Both streams must be sorted by the fields being used to determine equality (the 'on' parameter). If both tuples contain a field of the same name then the value from the Right stream will be used in the emitted tuple. You can wrap the incoming streams with a select(...) to be specific about which field values are included in the emitted tuple.
Wraps two streams, Left and Right. For every tuple in Left which exists in Right a tuple containing the fields of both tuples will be emitted. This supports one-to-one, one-to-many, many-to-one, and many-to-many inner join scenarios. The tuples are emitted in the order in which they appear in the Left stream. Both streams must be sorted by the fields being used to determine equality (the 'on' parameter). If both tuples contain a field of the same name then the value from the Right stream will be used in the emitted tuple. You can wrap the incoming streams with a `select(...)` expression to be specific about which field values are included in the emitted tuple.
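A hedged expression sketch (the collection names, fields, and join key are illustrative):

[source,text]
----
innerJoin(
  search(people, q=*:*, fl="personId,name", sort="personId asc"),
  search(pets, q=type:cat, fl="ownerId,petName", sort="ownerId asc"),
  on="personId=ownerId"
)
----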
=== innerJoin Parameters
@ -732,7 +732,7 @@ See section in <<graph-traversal.adoc#GraphTraversal-UsingthescoreNodesFunctiont
== select

The `select` function wraps a streaming expression and outputs tuples containing a subset or modified set of fields from the incoming tuples. The list of fields included in the output tuple can contain aliases to effectively rename fields. The select stream supports both operations and evaluators. One can provide a list of operations and evaluators to perform on any fields, such as `replace, add, if`, etc....
The `select` function wraps a streaming expression and outputs tuples containing a subset or modified set of fields from the incoming tuples. The list of fields included in the output tuple can contain aliases to effectively rename fields. The `select` stream supports both operations and evaluators. One can provide a list of operations and evaluators to perform on any fields, such as `replace, add, if`, etc....
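A hedged expression sketch showing a rename alias and a `replace` operation (field names and values are illustrative):

[source,text]
----
select(
  search(collection1, q=*:*, fl="personId,rating", sort="personId asc"),
  personId as id,
  replace(rating, null, withValue=0)
)
----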
=== select Parameters
@ -26,7 +26,7 @@ This section provides guidance on how to setup Solr to run in production on *nix
Solr includes a service installation script (`bin/install_solr_service.sh`) to help you install Solr as a service on Linux. Currently, the script only supports CentOS, Debian, Red Hat, SUSE and Ubuntu Linux distributions. Before running the script, you need to determine a few parameters about your setup. Specifically, you need to decide where to install Solr and which system user should be the owner of the Solr files and process.

[[TakingSolrtoProduction-Planningyourdirectorystructure]]
=== Planning your directory structure
=== Planning Your Directory Structure

We recommend separating your live Solr files, such as logs and index files, from the files included in the Solr distribution bundle, as that makes it easier to upgrade Solr and is considered a good practice to follow as a system administrator.
@ -49,7 +49,7 @@ Using a symbolic link insulates any scripts from being dependent on the specific
You should also separate writable Solr files into a different directory; by default, the installation script uses `/var/solr`, but you can override this location using the `-d` option. With this approach, the files in `/opt/solr` will remain untouched and all files that change while Solr is running will live under `/var/solr`.

[[TakingSolrtoProduction-CreatetheSolruser]]
=== Create the Solr user
=== Create the Solr User

Running Solr as `root` is not recommended for security reasons, and the <<solr-control-script-reference.adoc#solr-control-script-reference,control script>> start command will refuse to do so. Consequently, you should determine the username of a system user that will own all of the Solr files and the running Solr process. By default, the installation script will create the *solr* user, but you can override this setting using the -u option. If your organization has specific requirements for creating new user accounts, then you should create the user before running the script. The installation script will make the Solr user the owner of the `/opt/solr` and `/var/solr` directories.
@ -103,7 +103,7 @@ We'll cover some additional configuration settings you can make to fine-tune you
The Solr home directory (not to be confused with the Solr installation directory) is where Solr manages core directories with index files. By default, the installation script uses `/var/solr/data`. If the `-d` option is used on the install script, then this will change to the `data` subdirectory in the location given to the `-d` option. Take a moment to inspect the contents of the Solr home directory on your system. If you do not <<using-zookeeper-to-manage-configuration-files.adoc#using-zookeeper-to-manage-configuration-files,store `solr.xml` in ZooKeeper>>, the home directory must contain a `solr.xml` file. When Solr starts up, the Solr Control Script passes the location of the home directory using the `-Dsolr.solr.home=...` system property.

[[TakingSolrtoProduction-Environmentoverridesincludefile]]
==== Environment overrides include file
==== Environment Overrides Include File

The service installation script creates an environment-specific include file that overrides defaults used by the `bin/solr` script. The main advantage of using an include file is that it provides a single location where all of your environment-specific overrides are defined. Take a moment to inspect the contents of the `/etc/default/solr.in.sh` file, which is the default path set up by the installation script. If you used the `-s` option on the install script to change the name of the service, then the first part of the filename will be different. For a service named `solr-demo`, the file will be named `/etc/default/solr-demo.in.sh`. There are many settings that you can override using this file. However, at a minimum, this script needs to define the `SOLR_PID_DIR` and `SOLR_HOME` variables, such as:
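A minimal sketch of those two settings (the paths match the installation defaults discussed above):

[source,bash]
----
SOLR_PID_DIR=/var/solr
SOLR_HOME=/var/solr/data
----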
@ -116,7 +116,7 @@ SOLR_HOME=/var/solr/data
The `SOLR_PID_DIR` variable sets the directory where the <<solr-control-script-reference.adoc#solr-control-script-reference,control script>> will write out a file containing the Solr server’s process ID.

[[TakingSolrtoProduction-Logsettings]]
==== Log settings
==== Log Settings

Solr uses Apache Log4J for logging. The installation script copies `/opt/solr/server/resources/log4j.properties` to `/var/solr/log4j.properties`. Take a moment to verify that the Solr include file is configured to send logs to the correct location by checking the following settings in `/etc/default/solr.in.sh`:
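A sketch of the two settings to check (the paths match the installation defaults described above):

[source,bash]
----
LOG4J_PROPS=/var/solr/log4j.properties
SOLR_LOGS_DIR=/var/solr/logs
----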
@ -129,7 +129,7 @@ SOLR_LOGS_DIR=/var/solr/logs
For more information about Log4J configuration, please see: <<configuring-logging.adoc#configuring-logging,Configuring Logging>>

[[TakingSolrtoProduction-init.dscript]]
==== init.d script
==== init.d Script

When running a service like Solr on Linux, it’s common to set up an init.d script so that system administrators can control Solr using the service tool, such as: `service solr start`. The installation script creates a very basic init.d script to help you get started. Take a moment to inspect the `/etc/init.d/solr` file, which is the default script name setup by the installation script. If you used the `-s` option on the install script to change the name of the service, then the filename will be different. Notice that the following variables are setup for your environment based on the parameters passed to the installation script:
@ -175,7 +175,7 @@ Solr process PID running on port 8983
If the `status` command is not successful, look for error messages in `/var/solr/logs/solr.log`.

[[TakingSolrtoProduction-Finetuneyourproductionsetup]]
== Fine tune your production setup
== Fine-Tune Your Production Setup

[[TakingSolrtoProduction-MemoryandGCSettings]]
=== Memory and GC Settings
@ -243,7 +243,7 @@ SOLR_HOST=solr1.example.com
Setting the hostname of the Solr server is recommended, especially when running in SolrCloud mode, as this determines the address of the node when it registers with ZooKeeper.

[[TakingSolrtoProduction-Overridesettingsinsolrconfig.xml]]
=== Override settings in solrconfig.xml
=== Override Settings in solrconfig.xml

Solr allows configuration properties to be overridden using Java system properties passed at startup using the `-Dproperty=value` syntax. For instance, in `solrconfig.xml`, the default auto soft commit settings are set to:
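A sketch of the stock element this refers to (the property name matches the `SOLR_OPTS` override shown in the following snippet):

[source,xml]
----
<autoSoftCommit>
  <maxTime>${solr.autoSoftCommit.maxTime:-1}</maxTime>
</autoSoftCommit>
----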
@ -269,14 +269,13 @@ SOLR_OPTS="$SOLR_OPTS -Dsolr.autoSoftCommit.maxTime=10000"
----

[[TakingSolrtoProduction-RunningmultipleSolrnodesperhost]]
== Running multiple Solr nodes per host
== Running Multiple Solr Nodes Per Host

The `bin/solr` script is capable of running multiple instances on one machine, but for a *typical* installation, this is not a recommended setup. Extra CPU and memory resources are required for each additional instance. A single instance is easily capable of handling multiple indexes.

.When to ignore the recommendation
[NOTE]
====
For every recommendation, there are exceptions. For the recommendation above, that exception is mostly applicable when discussing extreme scalability. The best reason for running multiple Solr nodes on one host is decreasing the need for extremely large heaps.

When the Java heap gets very large, it can result in extremely long garbage collection pauses, even with the GC tuning that the startup script provides by default. The exact point at which the heap is considered "very large" will vary depending on how Solr is used. This means that there is no hard number that can be given as a threshold, but if your heap is reaching the neighborhood of 16 to 32 gigabytes, it might be time to consider splitting nodes. Ideally this would mean more machines, but budget constraints might make that impossible.
@ -284,7 +283,6 @@ When the Java heap gets very large, it can result in extremely long garbage coll
There is another issue once the heap reaches 32GB. Below 32GB, Java is able to use compressed pointers, but above that point, larger pointers are required, which uses more memory and slows down the JVM.

Because of the potential garbage collection issues and the particular issues that happen at 32GB, if a single instance would require a 64GB heap, performance is likely to improve greatly if the machine is set up with two nodes that each have a 31GB heap.
====

If your use case requires multiple instances, at a minimum you will need unique Solr home directories for each node you want to run; ideally, each home should be on a different physical disk so that multiple Solr nodes don’t have to compete with each other when accessing files on disk. Having different Solr home directories implies that you’ll need a different include file for each node. Moreover, if using the `/etc/init.d/solr` script to control Solr as a service, then you’ll need a separate script for each node. The easiest approach is to use the service installation script to add multiple services on the same host, such as:
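A sketch of such a command (the archive version, service name, and port are illustrative):

[source,bash]
----
sudo bash ./install_solr_service.sh solr-6.6.0.tgz -s solr2 -p 8984
----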
@ -22,8 +22,8 @@
The following sections describe how Solr breaks down and works with textual data. There are three main concepts to understand: analyzers, tokenizers, and filters.

* <<analyzers.adoc#analyzers,Field analyzers>> are used both during ingestion, when a document is indexed, and at query time. An analyzer examines the text of fields and generates a token stream. Analyzers may be a single class or they may be composed of a series of tokenizer and filter classes.
* <<about-tokenizers.adoc#about-tokenizers,Tokenizers>> break field data into lexical units, or __tokens__.
* <<about-filters.adoc#about-filters,Filters>> examine a stream of tokens and keep them, transform or discard them, or create new ones. Tokenizers and filters may be combined to form pipelines, or __chains__, where the output of one is input to the next. Such a sequence of tokenizers and filters is called an _analyzer_ and the resulting output of an analyzer is used to match query results or build indices.
* <<about-tokenizers.adoc#about-tokenizers,Tokenizers>> break field data into lexical units, or _tokens_.
* <<about-filters.adoc#about-filters,Filters>> examine a stream of tokens and keep them, transform or discard them, or create new ones. Tokenizers and filters may be combined to form pipelines, or _chains_, where the output of one is input to the next. Such a sequence of tokenizers and filters is called an _analyzer_ and the resulting output of an analyzer is used to match query results or build indices.

[[UnderstandingAnalyzers_Tokenizers_andFilters-UsingAnalyzers_Tokenizers_andFilters]]
@ -31,11 +31,11 @@ The following sections describe how Solr breaks down and works with textual data
|
|||
|
||||
Although the analysis process is used for both indexing and querying, the same analysis process need not be used for both operations. For indexing, you often want to simplify, or normalize, words: for example, setting all letters to lowercase, eliminating punctuation and accents, mapping words to their stems, and so on. Doing so can increase recall because, for example, "ram", "Ram" and "RAM" would all match a query for "ram". To increase query-time precision, a filter could be employed to narrow the matches by, for example, ignoring all-cap acronyms if you're interested in male sheep, but not Random Access Memory.
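
A hedged sketch of what separate index-time and query-time chains look like in the schema; the field type name and stopword file name are assumptions for this example:

[source,xml]
----
<fieldType name="text_example" class="solr.TextField" positionIncrementGap="100">
  <!-- index time: normalize aggressively to increase recall -->
  <analyzer type="index">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
  <!-- query time: a lighter chain can preserve precision -->
  <analyzer type="query">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>
----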

The tokens output by the analysis process define the values, or __terms__, of that field and are used either to build an index of those terms when a new document is added, or to identify which documents contain the terms you are querying for.
The tokens output by the analysis process define the values, or _terms_, of that field and are used either to build an index of those terms when a new document is added, or to identify which documents contain the terms you are querying for.

[[UnderstandingAnalyzers_Tokenizers_andFilters-ForMoreInformation]]
== For More Information
=== For More Information

These sections will show you how to configure field analyzers and also serve as a reference for the details of configuring each of the available tokenizer and filter classes. They are also a guide for configuring your own analysis classes if you have special needs that cannot be met with the included filters or tokenizers.

@ -129,7 +129,7 @@ Three additional expert-level configuration settings affect indexing performance

// TODO: Change column width to %autowidth.spread when https://github.com/asciidoctor/asciidoctor-pdf/issues/599 is fixed

[cols="20,10,10,60",options="header"]
[cols="25,10,10,55",options="header"]
|===
|Setting Name |Type |Default |Description
|numRecordsToKeep |int |100 |The number of update records to keep per log
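
For orientation, a sketch of where this setting lives in `solrconfig.xml`; the value shown is an illustrative assumption, not a recommendation:

[source,xml]
----
<updateHandler class="solr.DirectUpdateHandler2">
  <updateLog>
    <str name="dir">${solr.ulog.dir:}</str>
    <!-- keep more update records per log than the default of 100 -->
    <int name="numRecordsToKeep">500</int>
  </updateLog>
</updateHandler>
----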

@ -178,16 +178,16 @@ value ::= text

Special characters in "text" values can be escaped using the escape character `\` . The following escape sequences are recognized:

[width="30%",options="header",]
[width="60%",options="header",]
|===
|EscapeSequence |Description
|"`\` " |literal space character
|"`\,`" |literal `,` character
|"`\=`" |literal `=` character
|"`\\`" |literal `\` character
|"`\n`" |newline
|"`\r`" |carriage return
|"`\t`" |horizontal tab
|`\` |literal space character
|`\,` |literal `,` character
|`\=` |literal `=` character
|`\\` |literal `\` character
|`\n` |newline
|`\r` |carriage return
|`\t` |horizontal tab
|===

Please note that Unicode sequences (e.g. `\u0001`) are not supported.

@ -249,7 +249,7 @@ Token positions are tracked and implicitly added to the token stream - the start

* stored: null
* token: (term=`one`,positionIncrement=22,startOffset=123,endOffset=128)
* token: (term=`two`,positionIncrement=1,startOffset=5,endOffset=8)
* token: (term=three,positionIncrement=1,startOffset=20,endOffset=22)
* token: (term=`three`,positionIncrement=1,startOffset=20,endOffset=22)

[source,text]
----

@ -260,7 +260,7 @@ Token positions are tracked and implicitly added to the token stream - the start

* version: 1
* stored: null
* token: (term=` one ,`,positionIncrement=22,startOffset=0,endOffset=6)
* token: (term=`one ,`,positionIncrement=22,startOffset=0,endOffset=6)
* token: (term=`two=` ,positionIncrement=1,startOffset=7,endOffset=15)
* token: (term=`\`,positionIncrement=1,startOffset=17,endOffset=18)

@ -284,12 +284,12 @@ Note that unknown attributes and their values are ignored, so in this example, t

----

* version: 1
* stored: "`This is the stored part with = \t escapes.`"
* stored: `This is the stored part with = \t escapes.`
* token: (term=`one`,startOffset=0,endOffset=3)
* token: (term=`two`,startOffset=4,endOffset=7)
* token: (term=`three`,startOffset=8,endOffset=13)

Note that the "`\t`" in the above stored value is not literal; it's shown that way to visually indicate the actual tab char that is in the stored value.
Note that the `\t` in the above stored value is not literal; it's shown that way to visually indicate the actual tab char that is in the stored value.

[source,text]
----

@ -306,5 +306,5 @@ Note that the "`\t`" in the above stored value is not literal; it's shown that w

----

* version: 1
* stored: "this is a test."
* stored: `this is a test.`
* (no tokens)

@ -60,13 +60,13 @@ Solr nodes, clients and tools (e.g. ZkCLI) always use a java class called {solr-

You control which credentials provider will be used by configuring the `zkCredentialsProvider` property in `solr.xml`'s `<solrcloud>` section to the name of a class (on the classpath) implementing the {solr-javadocs}/solr-solrj/org/apache/solr/common/cloud/ZkCredentialsProvider[`ZkCredentialsProvider`] interface. `server/solr/solr.xml` in the Solr distribution defines the `zkCredentialsProvider` such that it will take on the value of the same-named `zkCredentialsProvider` system property if it is defined (e.g. by uncommenting the `SOLR_ZK_CREDS_AND_ACLS` environment variable definition in `solr.in.sh/.cmd` - see below), or if not, default to the `DefaultZkCredentialsProvider` implementation.
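
As a sketch of that property-substitution pattern (check `server/solr/solr.xml` in your distribution for the exact line):

[source,xml]
----
<solrcloud>
  <!-- use the zkCredentialsProvider system property if set, else the default -->
  <str name="zkCredentialsProvider">${zkCredentialsProvider:org.apache.solr.common.cloud.DefaultZkCredentialsProvider}</str>
</solrcloud>
----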

*Out of the Box Implementations*
==== Out of the Box Credential Implementations

You can always make your own implementation, but Solr comes with two implementations:

* `org.apache.solr.common.cloud.DefaultZkCredentialsProvider`: Its `getCredentials()` returns a list of length zero, or "no credentials used". This is the default.
* `org.apache.solr.common.cloud.VMParamsSingleSetCredentialsDigestZkCredentialsProvider`: This lets you define your credentials using system properties. It supports at most one set of credentials.
** The schema is "digest". The username and password are defined by system properties "```zkDigestUsername```" and "```zkDigestPassword```", respectively. This set of credentials will be added to the list of credentials returned by `getCredentials()` if both username and password are provided.
** The schema is "digest". The username and password are defined by system properties `zkDigestUsername` and `zkDigestPassword`. This set of credentials will be added to the list of credentials returned by `getCredentials()` if both username and password are provided.
** If the one set of credentials above is not added to the list, this implementation will fall back to default behavior and use the (empty) credentials list from `DefaultZkCredentialsProvider`.
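
A minimal sketch of wiring this up in the include file; the username and password are placeholder assumptions you must replace:

[source,bash]
----
# solr.in.sh: select the provider and supply one set of digest credentials
SOLR_ZK_CREDS_AND_ACLS="-DzkCredentialsProvider=org.apache.solr.common.cloud.VMParamsSingleSetCredentialsDigestZkCredentialsProvider \
  -DzkDigestUsername=admin-user -DzkDigestPassword=CHANGEME-ADMIN-PASSWORD"
SOLR_OPTS="$SOLR_OPTS $SOLR_ZK_CREDS_AND_ACLS"
----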

[[ZooKeeperAccessControl-ControllingACLs]]

@ -75,19 +75,19 @@ You can always make you own implementation, but Solr comes with two implementati

You control which ACLs will be added by configuring the `zkACLProvider` property in `solr.xml`'s `<solrcloud>` section to the name of a class (on the classpath) implementing the {solr-javadocs}/solr-solrj/org/apache/solr/common/cloud/ZkACLProvider[`ZkACLProvider`] interface. `server/solr/solr.xml` in the Solr distribution defines the `zkACLProvider` such that it will take on the value of the same-named `zkACLProvider` system property if it is defined (e.g. by uncommenting the `SOLR_ZK_CREDS_AND_ACLS` environment variable definition in `solr.in.sh/.cmd` - see below), or if not, default to the `DefaultZkACLProvider` implementation.
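
Analogous to the credentials provider, a sketch of the corresponding `solr.xml` entry (again, consult your distribution for the exact form):

[source,xml]
----
<str name="zkACLProvider">${zkACLProvider:org.apache.solr.common.cloud.DefaultZkACLProvider}</str>
----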

[[ZooKeeperAccessControl-OutoftheBoxImplementations]]
==== Out of the Box Implementations
==== Out of the Box ACL Implementations

You can always make your own implementation, but Solr comes with:

* `org.apache.solr.common.cloud.DefaultZkACLProvider`: It returns a list of length one for all `zNodePath`-s. The single ACL entry in the list is "open-unsafe". This is the default.
* `org.apache.solr.common.cloud.VMParamsAllAndReadonlyDigestZkACLProvider`: This lets you define your ACLs using system properties. Its `getACLsToAdd()` implementation does not use `zNodePath` for anything, so all znodes will get the same set of ACLs. It supports adding one or both of these options:
** A user that is allowed to do everything.
*** The permission is "```ALL```" (corresponding to all of `CREATE`, `READ`, `WRITE`, `DELETE`, and `ADMIN`), and the schema is "digest".
*** The username and password are defined by system properties "```zkDigestUsername```" and "```zkDigestPassword```", respectively.
*** The permission is `ALL` (corresponding to all of `CREATE`, `READ`, `WRITE`, `DELETE`, and `ADMIN`), and the schema is "digest".
*** The username and password are defined by system properties `zkDigestUsername` and `zkDigestPassword`, respectively.
*** This ACL will not be added to the list of ACLs unless both username and password are provided.
** A user that is only allowed to perform read operations.
*** The permission is "```READ```" and the schema is "digest".
*** The username and password are defined by system properties "```zkDigestReadonlyUsername```" and "```zkDigestReadonlyPassword```", respectively.
*** The permission is `READ` and the schema is `digest`.
*** The username and password are defined by system properties `zkDigestReadonlyUsername` and `zkDigestReadonlyPassword`, respectively.
*** This ACL will not be added to the list of ACLs unless both username and password are provided.
* `org.apache.solr.common.cloud.SaslZkACLProvider`: Requires SASL authentication. Gives all permissions for the user specified in system property `solr.authorization.superuser` (default: `solr`) when using SASL, and gives read permissions for anyone else. Designed for a setup where configurations have already been set up and will not be modified, or where configuration changes are controlled via Solr APIs. This provider will be useful for administration in a Kerberos environment. In such an environment, the administrator wants Solr to authenticate to ZooKeeper using SASL, since this is the only way to authenticate with ZooKeeper via Kerberos.
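
With the SASL provider, the superuser can be changed from the default; a hedged sketch, where the value `zkadmin` is an illustrative assumption:

[source,bash]
----
# run Solr's ZooKeeper administration as a different principal
SOLR_OPTS="$SOLR_OPTS -Dsolr.authorization.superuser=zkadmin"
----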

@ -102,6 +102,7 @@ You can give the readonly credentials to "clients" of your SolrCloud cluster - e

=== ZooKeeper ACLs in Solr Scripts

There are two scripts that impact ZooKeeper ACLs:

* For *nix systems: `bin/solr` & `server/scripts/cloud-scripts/zkcli.sh`
* For Windows systems: `bin/solr.cmd` & `server/scripts/cloud-scripts/zkcli.bat`
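
The zkcli scripts honor the same kind of variable as the include file, so a sketch for them mirrors the earlier example (placeholder credentials again):

[source,bash]
----
# server/scripts/cloud-scripts/zkcli.sh: define the ACL and credentials
# providers so command-line ZooKeeper operations authenticate as the admin user
SOLR_ZK_CREDS_AND_ACLS="-DzkACLProvider=org.apache.solr.common.cloud.VMParamsAllAndReadonlyDigestZkACLProvider \
  -DzkCredentialsProvider=org.apache.solr.common.cloud.VMParamsSingleSetCredentialsDigestZkCredentialsProvider \
  -DzkDigestUsername=admin-user -DzkDigestPassword=CHANGEME-ADMIN-PASSWORD \
  -DzkDigestReadonlyUsername=readonly-user -DzkDigestReadonlyPassword=CHANGEME-READONLY-PASSWORD"
----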