SOLR-8998: ref guide update.

2018-05-02 18:23:08 +03:00 · 2018-05-02 18:23:08 +03:00 · df713fc700
parent 46ecb73976
commit df713fc700
3 changed files with 70 additions and 3 deletions
--- a/solr/solr-ref-guide/src/blockjoin-faceting.adoc
+++ b/solr/solr-ref-guide/src/blockjoin-faceting.adoc
@@ -20,6 +20,8 @@ BlockJoin facets allow you to aggregate children facet counts by their parents.

 It is a common requirement that if a parent document has several children documents, all of them need to increment facet value count only once. This functionality is provided by `BlockJoinDocSetFacetComponent`, and `BlockJoinFacetComponent` just an alias for compatibility.

+CAUTION: This functionality is considered deprecated. Users are encouraged to use the `uniqueBlock(\_root_)` aggregation under a terms facet in the <<json-facet-api.adoc#Blockjoinfacetexample,JSON Facet API>> instead.
+
 CAUTION: This component is considered experimental, and must be explicitly enabled for a request handler in `solrconfig.xml`, in the same way as any other <<requesthandlers-and-searchcomponents-in-solrconfig.adoc#requesthandlers-and-searchcomponents-in-solrconfig,search component>>.

 This example shows how you could add this search components to `solrconfig.xml` and define it in request handler:
--- a/solr/solr-ref-guide/src/json-facet-api.adoc
+++ b/solr/solr-ref-guide/src/json-facet-api.adoc
@@ -59,7 +59,7 @@ The response to the facet request above will start with documents matching the r
 [[BucketingFacetExample]]
 === Bucketing Facet Example

-Here's an example of a bucketing facet, that partitions documents into bucket based on the `cat` field (short for category), and returns the top 5 buckets:
+Here's an example of a bucketing facet that partitions documents into buckets based on the `cat` field (short for category) and returns the top 3 buckets:

 [source,bash]
 ----
@@ -342,7 +342,8 @@ Aggregation functions, also called *facet functions, analytic functions,* or **m
 |avg |avg(popularity) |average of numeric values
 |min |min(salary) |minimum value
 |max |max(mul(price,popularity)) |maximum value
-|unique |unique(author) |number of unique values
+|unique |unique(author) |number of unique values of the given field. Beyond 100 values, it yields an estimate rather than an exact count.
+|uniqueBlock |uniqueBlock(\_root_) |same as above with a smaller footprint, but strictly requires a <<uploading-data-with-index-handlers.adoc#nested-child-documents, block index>>. The given field is expected to be unique across blocks; currently only single-valued string fields are supported, and docValues are recommended.
 |hll |hll(author) |distributed cardinality estimate via hyper-log-log algorithm
 |percentile |percentile(salary,50,75,99,99.9) |Percentile estimates via t-digest algorithm. When sorting by this metric, the first percentile listed is used as the sort value.
 |sumsq |sumsq(rent) |sum of squares of field or function
@@ -449,6 +450,70 @@ And the response will look something like:

 By default "top authors" is defined by simple document count descending, but we could use our aggregation functions to sort by more interesting metrics.

+
+[[BlockJoinFacets]]
+== Block Join Facets
+
+Block Join Facets allow bucketing <<uploading-data-with-index-handlers.adoc#nested-child-documents, child documents>> as attributes of their parents.
+
+[[Blockjoinfacetexample]]
+=== Block Join Facet example
+
+Suppose we have products with multiple SKUs, and we want to count products for each color.
+
+[source,json]
+----
+{
+  "id": "1", "type": "product", "name": "Solr T-Shirt",
+  "_childDocuments_": [
+    { "id": "11", "type": "SKU", "color": "Red",  "size": "L" },
+    { "id": "12", "type": "SKU", "color": "Blue", "size": "L" },
+    { "id": "13", "type": "SKU", "color": "Red",  "size": "M" },
+    { "id": "14", "type": "SKU", "color": "Blue", "size": "S" }
+  ]
+}
+
+----
+
+For the *SKU domain*, we can request:
+
+[source,java]
+----
+  color: {
+    type: terms,
+    field: color,
+    limit: -1,
+    facet: {
+      productsCount: "uniqueBlock(_root_)"
+    }
+  }
+
+
+----
+
+and get:
+
+[source,java]
+----
+
+  [...]
+  color:{
+     buckets:[
+        {
+          val:Red, count:2, productsCount:1
+        },
+        {
+          val:Blue, count:2, productsCount:1
+        }
+     ]
+  }
+----
+
+Note that `\_root_` is an internal field that Solr adds to each child document to reference its parent.
+The `uniqueBlock(\_root_)` aggregation is functionally equivalent to `unique(\_root_)`, but is optimized for the block structure of nested documents.
+It's recommended to set `limit: -1` when calculating `uniqueBlock`, as in the example above,
+since the default value of the `limit` parameter is `10`, and `uniqueBlock` is expected to be much faster with `-1`.
+
 [[References]]
 == References

--- a/solr/solr-ref-guide/src/other-parsers.adoc
+++ b/solr/solr-ref-guide/src/other-parsers.adoc
@@ -24,7 +24,7 @@ Many of these parsers are expressed the same way as <<local-parameters-in-querie

 == Block Join Query Parsers

-There are two query parsers that support block joins. These parsers allow indexing and searching for relational content that has been <<uploading-data-with-index-handlers.adoc#uploading-data-with-index-handlers,indexed as nested documents>>.
+There are two query parsers that support block joins. These parsers allow indexing and searching for relational content that has been <<uploading-data-with-index-handlers.adoc#nested-child-documents, indexed as nested documents>>.

 The example usage of the query parsers below assumes these two documents and each of their child documents have been indexed: