From f0ed8a9168bae2394af02f1b312f750e5962e286 Mon Sep 17 00:00:00 2001 From: Cassandra Targett Date: Fri, 1 Sep 2017 09:11:22 -0500 Subject: [PATCH] SOLR-11305: TrieField deprecation cleanup in several pages --- solr/solr-ref-guide/src/docvalues.adoc | 24 ++++++++++--------- .../src/field-types-included-with-solr.adoc | 10 ++++---- solr/solr-ref-guide/src/function-queries.adoc | 2 +- .../src/the-extended-dismax-query-parser.adoc | 21 ++++------------ .../src/the-standard-query-parser.adoc | 6 +++-- 5 files changed, 27 insertions(+), 36 deletions(-) diff --git a/solr/solr-ref-guide/src/docvalues.adoc b/solr/solr-ref-guide/src/docvalues.adoc index 4077d1a66b2..4d6ea83c8b9 100644 --- a/solr/solr-ref-guide/src/docvalues.adoc +++ b/solr/solr-ref-guide/src/docvalues.adoc @@ -44,16 +44,18 @@ If you have already indexed data into your Solr index, you will need to complete DocValues are only available for specific field types. The types chosen determine the underlying Lucene docValue type that will be used. The available Solr field types are: -* `StrField` and `UUIDField`. -** If the field is single-valued (i.e., multi-valued is false), Lucene will use the SORTED type. -** If the field is multi-valued, Lucene will use the SORTED_SET type. -* Any `Trie*` numeric fields, date fields and `EnumFieldType`. -** If the field is single-valued (i.e., multi-valued is false), Lucene will use the NUMERIC type. -** If the field is multi-valued, Lucene will use the SORTED_SET type. -* Boolean fields -* Int|Long|Float|Double|Date PointField -** If the field is single-valued (i.e., multi-valued is false), Lucene will use the NUMERIC type. -** If the field is multi-valued, Lucene will use the SORTED_NUMERIC type. +* `StrField` and `UUIDField`: +** If the field is single-valued (i.e., multi-valued is false), Lucene will use the `SORTED` type. +** If the field is multi-valued, Lucene will use the `SORTED_SET` type. 
+* `BoolField`: +** If the field is single-valued (i.e., multi-valued is false), Lucene will use the `SORTED` type. +** If the field is multi-valued, Lucene will use the `SORTED_BINARY` type. +* Any `*PointField` Numeric or Date fields, `EnumFieldType`, and `CurrencyFieldType`: +** If the field is single-valued (i.e., multi-valued is false), Lucene will use the `NUMERIC` type. +** If the field is multi-valued, Lucene will use the `SORTED_NUMERIC` type. +* Any of the deprecated `Trie*` Numeric or Date fields, `EnumField` and `CurrencyField`: +** If the field is single-valued (i.e., multi-valued is false), Lucene will use the `NUMERIC` type. +** If the field is multi-valued, Lucene will use the `SORTED_SET` type. These Lucene types are related to how the {lucene-javadocs}/core/org/apache/lucene/index/DocValuesType.html[values are sorted and stored]. @@ -86,4 +88,4 @@ In cases where the query is returning _only_ docValues fields performance may im When retrieving fields from their docValues form (using the <>, <> or if the field is requested in the `fl` parameter), two important differences between regular stored fields and docValues fields must be understood: 1. Order is _not_ preserved. For simply retrieving stored fields, the insertion order is the return order. For docValues, it is the _sorted_ order. -2. Multiple identical entries are collapsed into a single value. Thus if I insert values 4, 5, 2, 4, 1, my return will be 1, 2, 4, 5. +2. For field types using `SORTED_SET`, multiple identical entries are collapsed into a single value. Thus if I insert values 4, 5, 2, 4, 1, my return will be 1, 2, 4, 5. 
diff --git a/solr/solr-ref-guide/src/field-types-included-with-solr.adoc b/solr/solr-ref-guide/src/field-types-included-with-solr.adoc index 5463552d2b5..39ddb955fbd 100644 --- a/solr/solr-ref-guide/src/field-types-included-with-solr.adoc +++ b/solr/solr-ref-guide/src/field-types-included-with-solr.adoc @@ -37,9 +37,9 @@ The following table lists the field types that are available in Solr. The `org.a |DateRangeField |Supports indexing date ranges, to include point in time date instances as well (single-millisecond durations). See the section <> for more detail on using this field type. Consider using this field type even if it's just for date instances, particularly when the queries typically fall on UTC year/month/day/hour, etc., boundaries. -|DatePointField |Date field. Represents a point in time with millisecond precision. See the section <>. This class functions similarly to TrieDateField, but using a "Dimensional Points" based data structure instead of indexed terms, and doesn't require configuration of a precision step. For single valued fields, `docValues="true"` must be used to enable sorting. +|DatePointField |Date field. Represents a point in time with millisecond precision, encoded using a "Dimensional Points" based data structure that allows for very efficient searches for specific values, or ranges of values. See the section <> for more details on the supported syntax. For single valued fields, `docValues="true"` must be used to enable sorting. -|DoublePointField |Double field (64-bit IEEE floating point). This class functions similarly to TrieDoubleField, but using a "Dimensional Points" based data structure instead of indexed terms, and doesn't require configuration of a precision step. For single valued fields, `docValues="true"` must be used to enable sorting. +|DoublePointField |Double field (64-bit IEEE floating point). 
This class encodes double values using a "Dimensional Points" based data structure that allows for very efficient searches for specific values, or ranges of values. For single valued fields, `docValues="true"` must be used to enable sorting. |ExternalFileField |Pulls values from a file on disk. See the section <> for more information. @@ -47,17 +47,17 @@ The following table lists the field types that are available in Solr. The `org.a |EnumFieldType |Allows defining an enumerated set of values which may not be easily sorted by either alphabetic or numeric order (such as a list of severities, for example). This field type takes a configuration file, which lists the proper order of the field values. See the section <> for more information. -|FloatPointField |Floating point field (32-bit IEEE floating point). This class functions similarly to TrieFloatField, but using a "Dimensional Points" based data structure instead of indexed terms, and doesn't require configuration of a precision step. For single valued fields, `docValues="true"` must be used to enable sorting. +|FloatPointField |Floating point field (32-bit IEEE floating point). This class encodes float values using a "Dimensional Points" based data structure that allows for very efficient searches for specific values, or ranges of values. For single valued fields, `docValues="true"` must be used to enable sorting. |ICUCollationField |Supports Unicode collation for sorting and range queries. See the section <> for more information. -|IntPointField |Integer field (32-bit signed integer). This class functions similarly to TrieIntField, but using a "Dimensional Points" based data structure instead of indexed terms, and doesn't require configuration of a precision step. For single valued fields, `docValues="true"` must be used to enable sorting. +|IntPointField |Integer field (32-bit signed integer). 
This class encodes int values using a "Dimensional Points" based data structure that allows for very efficient searches for specific values, or ranges of values. For single valued fields, `docValues="true"` must be used to enable sorting. |LatLonPointSpatialField |A latitude/longitude coordinate pair; possibly multi-valued for multiple points. Usually it's specified as "lat,lon" order with a comma. See the section <> for more information. |LatLonType |*Deprecated*. Consider using the LatLonPointSpatialField instead. A single-valued latitude/longitude coordinate pair. Usually it's specified as "lat,lon" order with a comma. See the section <> for more information. -|LongPointField |Long field (64-bit signed integer). This class functions similarly to TrieLongField, but using a "Dimensional Points" based data structure instead of indexed terms, and doesn't require configuration of a precision step. For single valued fields, `docValues="true"` must be used to enable sorting. +|LongPointField |Long field (64-bit signed integer). This class encodes long values using a "Dimensional Points" based data structure that allows for very efficient searches for specific values, or ranges of values. For single valued fields, `docValues="true"` must be used to enable sorting. |PointType |A single-valued n-dimensional point. It's both for sorting spatial data that is _not_ lat-lon, and for some more rare use-cases. (NOTE: this is _not_ related to the "Point" based numeric fields). See <> for more information. diff --git a/solr/solr-ref-guide/src/function-queries.adoc b/solr/solr-ref-guide/src/function-queries.adoc index 11dfb08f301..f1684c9e229 100644 --- a/solr/solr-ref-guide/src/function-queries.adoc +++ b/solr/solr-ref-guide/src/function-queries.adoc @@ -254,7 +254,7 @@ Use the `field(myfield,min)` <>. +Arguments may be the name of a `DatePointField`, `TrieDateField`, or date math based on a <>. * `ms()`: Equivalent to `ms(NOW)`, number of milliseconds since the epoch. 
* `ms(a):` Returns the number of milliseconds since the epoch that the argument represents. diff --git a/solr/solr-ref-guide/src/the-extended-dismax-query-parser.adoc b/solr/solr-ref-guide/src/the-extended-dismax-query-parser.adoc index 4b042bdd7c1..ea465e2cfdd 100644 --- a/solr/solr-ref-guide/src/the-extended-dismax-query-parser.adoc +++ b/solr/solr-ref-guide/src/the-extended-dismax-query-parser.adoc @@ -22,10 +22,10 @@ The Extended DisMax (eDisMax) query parser is an improved version of the <>. -* supports queries such as AND, OR, NOT, -, and +. -* optionally treats "and" and "or" as "AND" and "OR" in Lucene syntax mode. -* respects the 'magic field' names `\_val_` and `\_query_`. These are not a real fields in the Schema, but if used it helps do special things (like a function query in the case of `\_val_` or a nested query in the case of `\_query_`). If `\_val_` is used in a term or phrase query, the value is parsed as a function. +* supports the full Lucene query parser syntax with the same enhancements as <>. +** supports queries such as AND, OR, NOT, -, and +. +** optionally treats "and" and "or" as "AND" and "OR" in Lucene syntax mode. +** respects the 'magic field' names `\_val_` and `\_query_`. These are not real fields in the Schema, but if used they help do special things (like a function query in the case of `\_val_` or a nested query in the case of `\_query_`). If `\_val_` is used in a term or phrase query, the value is parsed as a function. * includes improved smart partial escaping in the case of syntax errors; fielded queries, +/-, and phrase queries are still supported in this mode. * improves proximity boosting by using word shingles; you do not need the query to match all words in the document before proximity boosting is applied. * includes advanced stopword handling: stopwords are not required in the mandatory part of the query but are still used in the proximity boosting part. 
If a query consists of all stopwords, such as "to be or not to be", then all words are required. @@ -218,16 +218,3 @@ _val_:"recip(rord(myfield),1,2,3)" _query_:"{!dismax qf=myfield}how now brown cow" ---- -Although not technically a syntax difference, note that if you use the Solr {solr-javadocs}/solr-core/org/apache/solr/schema/TrieDateField.html[`TrieDateField`] type, any queries on those fields (typically range queries) should use either the Complete ISO 8601 Date syntax that field supports, or the {solr-javadocs}/solr-core/org/apache/solr/util/DateMathParser.html[DateMath Syntax] to get relative dates. For example: - -[source,text] ----- -timestamp:[* TO NOW] -createdate:[1976-03-06T23:59:59.999Z TO *] -createdate:[1995-12-31T23:59:59.999Z TO 2007-03-06T00:00:00Z] -pubdate:[NOW-1YEAR/DAY TO NOW/DAY+1DAY] -createdate:[1976-03-06T23:59:59.999Z TO 1976-03-06T23:59:59.999Z+1YEAR] -createdate:[1976-03-06T23:59:59.999Z/YEAR TO 1976-03-06T23:59:59.999Z] ----- - -IMPORTANT: `TO` must be uppercase, or Solr will report a 'Range Group' error. diff --git a/solr/solr-ref-guide/src/the-standard-query-parser.adoc b/solr/solr-ref-guide/src/the-standard-query-parser.adoc index 7c49d623c9a..b2db25cc8f2 100644 --- a/solr/solr-ref-guide/src/the-standard-query-parser.adoc +++ b/solr/solr-ref-guide/src/the-standard-query-parser.adoc @@ -350,11 +350,13 @@ This can even be used to cache individual clauses of complex filter queries. In === Specifying Dates and Times -Queries against fields using the `TrieDateField` type (typically range queries) should use the <>: +Queries against date based fields must use the <>. 
Queries for exact date values will require quoting or escaping since `:` is the parser syntax used to denote a field query: -* `timestamp:[* TO NOW]` +* `createdate:1976-03-06T23\:59\:59.999Z` +* `createdate:"1976-03-06T23:59:59.999Z"` * `createdate:[1976-03-06T23:59:59.999Z TO *]` * `createdate:[1995-12-31T23:59:59.999Z TO 2007-03-06T00:00:00Z]` +* `timestamp:[* TO NOW]` * `pubdate:[NOW-1YEAR/DAY TO NOW/DAY+1DAY]` * `createdate:[1976-03-06T23:59:59.999Z TO 1976-03-06T23:59:59.999Z+1YEAR]` * `createdate:[1976-03-06T23:59:59.999Z/YEAR TO 1976-03-06T23:59:59.999Z]`