mirror of https://github.com/apache/lucene.git
SOLR-13885: various Ref Guide typos. This closes #990
parent eb3a4757ff
commit de1c9fb9e8
@@ -215,6 +215,8 @@ Other Changes
 
 * SOLR-12193: Move some log messages to TRACE level (gezapeti, janhoy)
 
+* SOLR-13885: Typos in the documentation. (KoenDG via Cassandra Targett)
+
 ================== 8.3.1 ==================
 
 Consult the LUCENE_CHANGES.txt file for additional, low level, changes in this release.
@@ -124,7 +124,7 @@ RAUP first reads TRA configuration from the alias properties when it is initiali
 within the target collection.
 
 * If it belongs in the current collection (which is usually the case if processing events as they occur), the document
-passes through to DUP. DUP does it's normal collection-level processing that may involve routing the document
+passes through to DUP. DUP does its normal collection-level processing that may involve routing the document
 to another shard & replica.
 
 * If the timestamp on the document is more recent than the most recent TRA segment, then a new collection needs to be
@@ -178,7 +178,7 @@ You can refer <<solr-tutorial.adoc#exercise-1,Solr Tutorial>> for an extensive w
 
 If you want to configure an external ZooKeeper ensemble to avoid using the embedded single-instance ZooKeeper that runs in the same JVM as the Solr node, you need to make few tweaks in the above listed steps as follows.
 
-* When creating the security group, instead of opening port `9983` for ZooKeeper, you'll open `2181` (or whatever port you are using for ZooKeeper: it's default is 2181).
+* When creating the security group, instead of opening port `9983` for ZooKeeper, you'll open `2181` (or whatever port you are using for ZooKeeper: its default is 2181).
 * When configuring the number of instances to launch, choose to open 3 instances instead of 2.
 * When modifying the `/etc/hosts` on each machine, add a third line for the 3rd instance and give it a recognizable name:
 +
@@ -276,7 +276,7 @@ fq=quantity_in_stock:[5 TO *]
 fq={!frange l=10 u=100}mul(popularity,price)
 fq={!frange cost=200 l=0}pow(mul(sum(1, query('tag:smartphone')), div(1,avg_rating)), 2.3)
 
-These are the same filters run w/o caching. The simple range query on the `quantity_in_stock` field will be run in parallel with the main query like a traditional lucene filter, while the 2 `frange` filters will only be checked against each document has already matched the main query and the `quantity_in_stock` range query -- first the simpler `mul(popularity,price)` will be checked (because of it's implicit `cost=100`) and only if it matches will the final very complex filter (with it's higher `cost=200`) be checked.
+These are the same filters run w/o caching. The simple range query on the `quantity_in_stock` field will be run in parallel with the main query like a traditional lucene filter, while the 2 `frange` filters will only be checked against each document has already matched the main query and the `quantity_in_stock` range query -- first the simpler `mul(popularity,price)` will be checked (because of its implicit `cost=100`) and only if it matches will the final very complex filter (with its higher `cost=200`) be checked.
 
 [source,text]
 q=some keywords
@@ -170,7 +170,7 @@ The "alternate" fallback options are more primitive.
 <<The Original Highlighter,Original Highlighter>>:: (`hl.method=original`, the default)
 +
 The Original Highlighter, sometimes called the "Standard Highlighter" or "Default Highlighter", is Lucene's original highlighter – a venerable option with a high degree of customization options.
-It's query accuracy is good enough for most needs, although it's not quite as good/perfect as the Unified Highlighter.
+Its query accuracy is good enough for most needs, although it's not quite as good/perfect as the Unified Highlighter.
 +
 The Original Highlighter will normally analyze stored text on the fly in order to highlight. It will use full term vectors if available.
 +
@@ -909,7 +909,7 @@ ____
 
 The `relatedness(...)` function is used to "score" these relationships, relative to "Foreground" and "Background" sets of documents, specified in the function params as queries.
 
-Unlike most aggregation functions, the `relatedness(...)` function is aware of whether and how it's used in <<nested-facets,Nested Facets>>. It evaluates the query defining the current bucket _independently_ from it's parent/ancestor buckets, and intersects those documents with a "Foreground Set" defined by the foreground query _combined with the ancestor buckets_. The result is then compared to a similar intersection done against the "Background Set" (defined exclusively by background query) to see if there is a positive, or negative, correlation between the current bucket and the Foreground Set, relative to the Background Set.
+Unlike most aggregation functions, the `relatedness(...)` function is aware of whether and how it's used in <<nested-facets,Nested Facets>>. It evaluates the query defining the current bucket _independently_ from its parent/ancestor buckets, and intersects those documents with a "Foreground Set" defined by the foreground query _combined with the ancestor buckets_. The result is then compared to a similar intersection done against the "Background Set" (defined exclusively by background query) to see if there is a positive, or negative, correlation between the current bucket and the Foreground Set, relative to the Background Set.
 
 NOTE: While it's very common to define the Background Set as `\*:*`, or some other super-set of the Foreground Query, it is not strictly required. The `relatedness(...)` function can be used to compare the statistical relatedness of sets of documents to orthogonal foreground/background queries.
 
@@ -235,7 +235,7 @@ Example:
 
 A `graph` domain change option works similarly to the `join` domain option, but can do traversal multiple hops `from` the existing domain `to` other documents.
 
-This works very similar to the <<other-parsers.adoc#graph-query-parser,Graph Query Parser>>, supporting all of it's optional parameters, and has the same limitations when dealing with multi-shard collections.
+This works very similar to the <<other-parsers.adoc#graph-query-parser,Graph Query Parser>>, supporting all of its optional parameters, and has the same limitations when dealing with multi-shard collections.
 
 Example:
 [source,json]
@@ -73,7 +73,7 @@ Several changes have been made to configsets that ship with Solr; not only their
 * The `data_driven_configset` and `basic_configset` have been removed, and replaced by the `_default` configset. The `sample_techproducts_configset` also remains, and is designed for use with the example documents shipped with Solr in the `example/exampledocs` directory.
 * When creating a new collection, if you do not specify a configset, the `_default` will be used.
 ** If you use SolrCloud, the `_default` configset will be automatically uploaded to ZooKeeper.
-** If you use standalone mode, the instanceDir will be created automatically, using the `_default` configset as it's basis.
+** If you use standalone mode, the instanceDir will be created automatically, using the `_default` configset as its basis.
 
 === Schemaless Improvements
 
@@ -92,7 +92,7 @@ When building the Guide, the `solr-root-path` attribute will be automatically se
 
 In order for editors (such as ATOM) to be able to offer "live preview" of the `*.adoc` files using these includes, the `solr-root-path` attribute must also be set as a document level attribute in each file, with the correct relative path.
 
-For example, `using-solrj.adoc` sets `solr-root-path` in it's header, along with an `example-source-dir` attribute (that depends on `solr-root-path`) in order to reduce redundancy in the many `include::` directives it specifies...
+For example, `using-solrj.adoc` sets `solr-root-path` in its header, along with an `example-source-dir` attribute (that depends on `solr-root-path`) in order to reduce redundancy in the many `include::` directives it specifies...
 
 [source]
 --
@@ -390,7 +390,7 @@ The graph is built according to linkages between documents based on the terms fo
 Supported field types are point fields with docValues enabled, or string fields with `indexed=true` or `docValues=true`.
 
 TIP: For string fields which are `indexed=false` and `docValues=true`, please refer to the javadocs for {lucene-javadocs}sandbox/org/apache/lucene/search/DocValuesTermsQuery.html[`DocValuesTermsQuery`]
-for it's performance characteristics so `indexed=true` will perform better for most use-cases.
+for its performance characteristics so `indexed=true` will perform better for most use-cases.
 
 === Graph Query Parameters
 
@@ -46,7 +46,7 @@ These issues are not a problem for most users. However, some use cases would per
 
 Solr accomplishes this by allowing you to set the replica type when creating a new collection or when adding a replica. The available types are:
 
-* *NRT*: This is the default. A NRT replica (NRT = NearRealTime) maintains a transaction log and writes new documents to it's indexes locally. Any replica of this type is eligible to become a leader. Traditionally, this was the only type supported by Solr.
+* *NRT*: This is the default. A NRT replica (NRT = NearRealTime) maintains a transaction log and writes new documents to its indexes locally. Any replica of this type is eligible to become a leader. Traditionally, this was the only type supported by Solr.
 * *TLOG*: This type of replica maintains a transaction log but does not index document changes locally. This type helps speed up indexing since no commits need to occur in the replicas. When this type of replica needs to update its index, it does so by replicating the index from the leader. This type of replica is also eligible to become a shard leader; it would do so by first processing its transaction log. If it does become a leader, it will behave the same as if it was a NRT type of replica.
 * *PULL*: This type of replica does not maintain a transaction log nor index document changes locally. It only replicates the index from the shard leader. It is not eligible to become a shard leader and doesn't participate in shard leader election at all.
 
@@ -415,7 +415,7 @@ The format, either `ints2D` (default) or `png`.
 
 [TIP]
 ====
-You'll experiment with different `distErrPct` values (probably 0.10 - 0.20) with various input geometries till the default size is what you're looking for. The specific details of how it's computed isn't important. For high-detail grids used in point-plotting (loosely one cell per pixel), set `distErr` to be the number of decimal-degrees of several pixels or so of the map being displayed. Also, you probably don't want to use a geohash-based grid because the cell orientation between grid levels flip-flops between being square and rectangle. Quad is consistent and has more levels, albeit at the expense of a larger index.
+You'll experiment with different `distErrPct` values (probably 0.10 - 0.20) with various input geometries till the default size is what you're looking for. The specific details of how it's computed aren't important. For high-detail grids used in point-plotting (loosely one cell per pixel), set `distErr` to be the number of decimal-degrees of several pixels or so of the map being displayed. Also, you probably don't want to use a geohash-based grid because the cell orientation between grid levels flip-flops between being square and rectangle. Quad is consistent and has more levels, albeit at the expense of a larger index.
 ====
 
 Here's some sample output in JSON (with "..." inserted for brevity):
@@ -925,7 +925,7 @@ merge(
 
 == null
 
-The null expression is a useful utility function for understanding bottlenecks when performing parallel relational algebra (joins, intersections, rollups etc.). The null function reads all the tuples from an underlying stream and returns a single tuple with the count and processing time. Because the null stream adds minimal overhead of it's own, it can be used to isolate the performance of Solr's /export handler. If the /export handlers performance is not the bottleneck, then the bottleneck is likely occurring in the workers where the stream decorators are running.
+The null expression is a useful utility function for understanding bottlenecks when performing parallel relational algebra (joins, intersections, rollups etc.). The null function reads all the tuples from an underlying stream and returns a single tuple with the count and processing time. Because the null stream adds minimal overhead of its own, it can be used to isolate the performance of Solr's /export handler. If the /export handlers performance is not the bottleneck, then the bottleneck is likely occurring in the workers where the stream decorators are running.
 
 The null expression can be wrapped by the parallel function and sent to worker nodes. In this scenario each worker will return one tuple with the count of tuples processed on the worker and the timing information for that worker. This gives valuable information such as:
 
@@ -1080,7 +1080,7 @@ plist(tuple(a=search(collection1, q="*:*", fl="id, prod_ss", sort="id asc")),
 
 == priority
 
-The `priority` function is a simple priority scheduler for the <<executor>> function. The `executor` function doesn't directly have a concept of task prioritization; instead it simply executes tasks in the order that they are read from it's underlying stream. The `priority` function provides the ability to schedule a higher priority task ahead of lower priority tasks that were submitted earlier.
+The `priority` function is a simple priority scheduler for the <<executor>> function. The `executor` function doesn't directly have a concept of task prioritization; instead it simply executes tasks in the order that they are read from its underlying stream. The `priority` function provides the ability to schedule a higher priority task ahead of lower priority tasks that were submitted earlier.
 
 The `priority` function wraps two <<stream-source-reference.adoc#topic,topics>> that are both emitting tuples that contain streaming expressions to execute. The first topic is considered the higher priority task queue.
 
@@ -26,7 +26,7 @@ If you are unfamiliar with JMX, you may find the following overview useful: htt
 
 JMX support is configured by defining a metrics reporter, as described in the section the section <<metrics-reporting.adoc#jmx-reporter,JMX Reporter>>.
 
-If you have an existing MBean server running in Solr's JVM, or if you start Solr with the system property `-Dcom.sun.management.jmxremote`, Solr will automatically identify it's location on startup even if you have not defined a reporter explicitly in `solr.xml`. You can also define the location of the MBean server with parameters defined in the reporter definition.
+If you have an existing MBean server running in Solr's JVM, or if you start Solr with the system property `-Dcom.sun.management.jmxremote`, Solr will automatically identify its location on startup even if you have not defined a reporter explicitly in `solr.xml`. You can also define the location of the MBean server with parameters defined in the reporter definition.
 
 == Configuring MBean Servers
 