LUCENE-3065, SOLR-2497: When a NumericField is retrieved from a Document loaded from IndexReader (or IndexSearcher), it will now come back as NumericField. Solr now uses NumericField solely (no more magic).

git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1100526 13f79535-47bb-0310-9956-ffa450edef68
Uwe Schindler 2011-05-07 13:14:38 +00:00
parent 350cafab80
commit 400639f54e
49 changed files with 598 additions and 536 deletions

View File

@ -472,11 +472,29 @@ Changes in backwards compatibility policy
a method getHeapArray() was added to retrieve the internal heap array as a
non-generic Object[]. (Uwe Schindler, Yonik Seeley)
Changes in runtime behavior
* LUCENE-3065: When a NumericField is retrieved from a Document loaded
from IndexReader (or IndexSearcher), it will now come back as
NumericField, not as a Field with a string-ified version of the
numeric value you had indexed. Note that this only applies to
newly-indexed Documents; older indices will still return Field
with the string-ified numeric value. If you call Document.get(),
the value still comes back as a String, but Document.getFieldable()
returns NumericField instances. (Uwe Schindler, Ryan McKinley,
Mike McCandless)
Optimizations
* LUCENE-2990: ArrayUtil/CollectionUtil.*Sort() methods now exit early
on empty or one-element lists/arrays. (Uwe Schindler)
API Changes
* LUCENE-3065: Document.getField() was deprecated, as it throws
ClassCastException when loading lazy fields or NumericFields.
(Uwe Schindler, Ryan McKinley, Mike McCandless)
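
The ClassCastException risk behind this deprecation can be illustrated without Lucene on the classpath. The sketch below uses hypothetical stand-in types (FieldableLike, PlainField, NumberField, StoredDocSketch) instead of the real Fieldable, Field, NumericField and Document classes. It only shows why an unconditional cast fails and why checking the runtime type first is safer; it is not the Lucene implementation.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-ins for Lucene's Fieldable, Field and NumericField.
interface FieldableLike {
    String stringValue();
}

final class PlainField implements FieldableLike {
    private final String value;
    PlainField(String value) { this.value = value; }
    public String stringValue() { return value; }
}

final class NumberField implements FieldableLike {
    private final Number value;
    NumberField(Number value) { this.value = value; }
    Number numericValue() { return value; }
    public String stringValue() { return value.toString(); }
}

// Hypothetical stand-in for a loaded Document.
final class StoredDocSketch {
    private final Map<String, FieldableLike> fields = new LinkedHashMap<>();

    void add(String name, FieldableLike f) { fields.put(name, f); }

    // General-purpose accessor: no cast, so it never throws.
    FieldableLike getFieldable(String name) { return fields.get(name); }

    // Shaped like the deprecated accessor: the unconditional cast is the hazard.
    PlainField getField(String name) { return (PlainField) fields.get(name); }

    // Safer pattern: inspect the runtime type before casting.
    static Number numberOrNull(FieldableLike f) {
        return (f instanceof NumberField) ? ((NumberField) f).numericValue() : null;
    }
}
```

With these stand-ins, getField("price") on a document holding a NumberField under "price" throws ClassCastException, while numberOrNull(doc.getFieldable("price")) returns the number.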
Bug fixes
* LUCENE-3024: Index with more than 2.1B terms was hitting AIOOBE when

View File

@ -3,7 +3,7 @@
<head>
<META http-equiv="Content-Type" content="text/html; charset=UTF-8">
<meta content="Apache Forrest" name="Generator">
<meta name="Forrest-version" content="0.8">
<meta name="Forrest-version" content="0.9">
<meta name="Forrest-skin-name" content="lucene">
<title>
Apache Lucene - Contributions
@ -275,7 +275,7 @@ document.write("Last Published: " + document.lastModified);
<a href="#PDFTextStream -- PDF text and metadata extraction">PDFTextStream -- PDF text and metadata extraction</a>
</li>
<li>
<a href="#PJ Classic & PJ Professional - PDF Document Conversion">PJ Classic &amp; PJ Professional - PDF Document Conversion</a>
<a href="#PJ Classic &amp; PJ Professional - PDF Document Conversion">PJ Classic &amp; PJ Professional - PDF Document Conversion</a>
</li>
</ul>
</li>
@ -403,7 +403,7 @@ document.write("Last Published: " + document.lastModified);
URL
</th>
<td>
<a href="http://marc.theaimsgroup.com/?l=lucene-dev&m=100723333506246&w=2">
<a href="http://marc.theaimsgroup.com/?l=lucene-dev&amp;m=100723333506246&amp;w=2">
http://marc.theaimsgroup.com/?l=lucene-dev&amp;m=100723333506246&amp;w=2
</a>
</td>
@ -538,7 +538,7 @@ document.write("Last Published: " + document.lastModified);
</tr>
</table>
<a name="N10124"></a><a name="PJ Classic & PJ Professional - PDF Document Conversion"></a>
<a name="N10124"></a><a name="PJ Classic &amp; PJ Professional - PDF Document Conversion"></a>
<h3 class="boxed">PJ Classic &amp; PJ Professional - PDF Document Conversion</h3>
<table class="ForrestTable" cellspacing="1" cellpadding="4">

Binary file not shown.

View File

@ -3,7 +3,7 @@
<head>
<META http-equiv="Content-Type" content="text/html; charset=UTF-8">
<meta content="Apache Forrest" name="Generator">
<meta name="Forrest-version" content="0.8">
<meta name="Forrest-version" content="0.9">
<meta name="Forrest-skin-name" content="lucene">
<title>
Apache Lucene - Building and Installing the Basic Demo

Binary file not shown.

View File

@ -3,7 +3,7 @@
<head>
<META http-equiv="Content-Type" content="text/html; charset=UTF-8">
<meta content="Apache Forrest" name="Generator">
<meta name="Forrest-version" content="0.8">
<meta name="Forrest-version" content="0.9">
<meta name="Forrest-skin-name" content="lucene">
<title>
Apache Lucene - Basic Demo Sources Walk-through

Binary file not shown.

View File

@ -3,7 +3,7 @@
<head>
<META http-equiv="Content-Type" content="text/html; charset=UTF-8">
<meta content="Apache Forrest" name="Generator">
<meta name="Forrest-version" content="0.8">
<meta name="Forrest-version" content="0.9">
<meta name="Forrest-skin-name" content="lucene">
<title>
Apache Lucene - Index File Formats
@ -429,10 +429,15 @@ document.write("Last Published: " + document.lastModified);
Additionally segments track explicitly whether or
not they have term vectors. See LUCENE-2811 for details.
</p>
<p>
In version 3.2, numeric fields are written natively
to the stored fields file; previously they were stored in
text format only.
</p>
</div>
<a name="N10037"></a><a name="Definitions"></a>
<a name="N1003A"></a><a name="Definitions"></a>
<h2 class="boxed">Definitions</h2>
<div class="section">
<p>
@ -473,7 +478,7 @@ document.write("Last Published: " + document.lastModified);
strings, the first naming the field, and the second naming text
within the field.
</p>
<a name="N10057"></a><a name="Inverted Indexing"></a>
<a name="N1005A"></a><a name="Inverted Indexing"></a>
<h3 class="boxed">Inverted Indexing</h3>
<p>
The index stores statistics about terms in order
@ -483,7 +488,7 @@ document.write("Last Published: " + document.lastModified);
it. This is the inverse of the natural relationship, in which
documents list terms.
</p>
<a name="N10063"></a><a name="Types of Fields"></a>
<a name="N10066"></a><a name="Types of Fields"></a>
<h3 class="boxed">Types of Fields</h3>
<p>
In Lucene, fields may be <i>stored</i>, in which
@ -497,7 +502,7 @@ document.write("Last Published: " + document.lastModified);
to be indexed literally.
</p>
<p>See the <a href="api/core/org/apache/lucene/document/Field.html">Field</a> java docs for more information on Fields.</p>
<a name="N10080"></a><a name="Segments"></a>
<a name="N10083"></a><a name="Segments"></a>
<h3 class="boxed">Segments</h3>
<p>
Lucene indexes may be composed of multiple sub-indexes, or
@ -523,7 +528,7 @@ document.write("Last Published: " + document.lastModified);
Searches may involve multiple segments and/or multiple indexes, each
index potentially composed of a set of segments.
</p>
<a name="N1009E"></a><a name="Document Numbers"></a>
<a name="N100A1"></a><a name="Document Numbers"></a>
<h3 class="boxed">Document Numbers</h3>
<p>
Internally, Lucene refers to documents by an integer <i>document
@ -578,7 +583,7 @@ document.write("Last Published: " + document.lastModified);
</div>
<a name="N100C5"></a><a name="Overview"></a>
<a name="N100C8"></a><a name="Overview"></a>
<h2 class="boxed">Overview</h2>
<div class="section">
<p>
@ -677,7 +682,7 @@ document.write("Last Published: " + document.lastModified);
</div>
<a name="N10108"></a><a name="File Naming"></a>
<a name="N1010B"></a><a name="File Naming"></a>
<h2 class="boxed">File Naming</h2>
<div class="section">
<p>
@ -704,7 +709,7 @@ document.write("Last Published: " + document.lastModified);
</p>
</div>
<a name="N10117"></a><a name="file-names"></a>
<a name="N1011A"></a><a name="file-names"></a>
<h2 class="boxed">Summary of File Extensions</h2>
<div class="section">
<p>The following table summarizes the names and extensions of the files in Lucene:
@ -846,10 +851,10 @@ document.write("Last Published: " + document.lastModified);
</div>
<a name="N10201"></a><a name="Primitive Types"></a>
<a name="N10204"></a><a name="Primitive Types"></a>
<h2 class="boxed">Primitive Types</h2>
<div class="section">
<a name="N10206"></a><a name="Byte"></a>
<a name="N10209"></a><a name="Byte"></a>
<h3 class="boxed">Byte</h3>
<p>
The most primitive type
@ -857,7 +862,7 @@ document.write("Last Published: " + document.lastModified);
other data types are defined as sequences
of bytes, so file formats are byte-order independent.
</p>
<a name="N1020F"></a><a name="UInt32"></a>
<a name="N10212"></a><a name="UInt32"></a>
<h3 class="boxed">UInt32</h3>
<p>
32-bit unsigned integers are written as four
@ -867,7 +872,7 @@ document.write("Last Published: " + document.lastModified);
UInt32 --&gt; &lt;Byte&gt;<sup>4</sup>
</p>
<a name="N1021E"></a><a name="Uint64"></a>
<a name="N10221"></a><a name="Uint64"></a>
<h3 class="boxed">Uint64</h3>
<p>
64-bit unsigned integers are written as eight
@ -876,7 +881,7 @@ document.write("Last Published: " + document.lastModified);
<p>UInt64 --&gt; &lt;Byte&gt;<sup>8</sup>
</p>
<a name="N1022D"></a><a name="VInt"></a>
<a name="N10230"></a><a name="VInt"></a>
<h3 class="boxed">VInt</h3>
<p>
A variable-length format for positive integers is
@ -1426,13 +1431,13 @@ document.write("Last Published: " + document.lastModified);
This provides compression while still being
efficient to decode.
</p>
<a name="N10512"></a><a name="Chars"></a>
<a name="N10515"></a><a name="Chars"></a>
<h3 class="boxed">Chars</h3>
<p>
Lucene writes unicode
character sequences as UTF-8 encoded bytes.
</p>
<a name="N1051B"></a><a name="String"></a>
<a name="N1051E"></a><a name="String"></a>
<h3 class="boxed">String</h3>
<p>
Lucene writes strings as UTF-8 encoded bytes.
@ -1445,10 +1450,10 @@ document.write("Last Published: " + document.lastModified);
</div>
<a name="N10528"></a><a name="Compound Types"></a>
<a name="N1052B"></a><a name="Compound Types"></a>
<h2 class="boxed">Compound Types</h2>
<div class="section">
<a name="N1052D"></a><a name="MapStringString"></a>
<a name="N10530"></a><a name="MapStringString"></a>
<h3 class="boxed">Map&lt;String,String&gt;</h3>
<p>
In a couple places Lucene stores a Map
@ -1461,13 +1466,13 @@ document.write("Last Published: " + document.lastModified);
</div>
<a name="N1053D"></a><a name="Per-Index Files"></a>
<a name="N10540"></a><a name="Per-Index Files"></a>
<h2 class="boxed">Per-Index Files</h2>
<div class="section">
<p>
The files in this section exist one-per-index.
</p>
<a name="N10545"></a><a name="Segments File"></a>
<a name="N10548"></a><a name="Segments File"></a>
<h3 class="boxed">Segments File</h3>
<p>
The active segments in the index are stored in the
@ -1640,7 +1645,7 @@ document.write("Last Published: " + document.lastModified);
<p> HasVectors is 1 if this segment stores term vectors,
else it's 0.
</p>
<a name="N105D0"></a><a name="Lock File"></a>
<a name="N105D3"></a><a name="Lock File"></a>
<h3 class="boxed">Lock File</h3>
<p>
The write lock, which is stored in the index
@ -1654,14 +1659,14 @@ document.write("Last Published: " + document.lastModified);
documents). This lock file ensures that only one
writer is modifying the index at a time.
</p>
<a name="N105D9"></a><a name="Deletable File"></a>
<a name="N105DC"></a><a name="Deletable File"></a>
<h3 class="boxed">Deletable File</h3>
<p>
A writer dynamically computes
the files that are deletable, instead, so no file
is written.
</p>
<a name="N105E2"></a><a name="Compound Files"></a>
<a name="N105E5"></a><a name="Compound Files"></a>
<h3 class="boxed">Compound Files</h3>
<p>Starting with Lucene 1.4 the compound file format became default. This
is simply a container for all files described in the next section
@ -1688,14 +1693,14 @@ document.write("Last Published: " + document.lastModified);
</div>
<a name="N1060A"></a><a name="Per-Segment Files"></a>
<a name="N1060D"></a><a name="Per-Segment Files"></a>
<h2 class="boxed">Per-Segment Files</h2>
<div class="section">
<p>
The remaining files are all per-segment, and are
thus defined by suffix.
</p>
<a name="N10612"></a><a name="Fields"></a>
<a name="N10615"></a><a name="Fields"></a>
<h3 class="boxed">Fields</h3>
<p>
@ -1868,13 +1873,29 @@ document.write("Last Published: " + document.lastModified);
<li>third bit is one for fields with compression option enabled
(if compression is enabled, the algorithm used is ZLIB),
only available for indexes until Lucene version 2.9.x</li>
<li>4th to 6th bits (mask: 0x7&lt;&lt;3) define the type of a
numeric field: <ul>
<li>all bits in the mask are cleared if the field is not numeric</li>
<li>1&lt;&lt;3: Value is Int</li>
<li>2&lt;&lt;3: Value is Long</li>
<li>3&lt;&lt;3: Value is Int as Float (decoded via Float.intBitsToFloat)</li>
<li>4&lt;&lt;3: Value is Long as Double (decoded via Double.longBitsToDouble)</li>
</ul>
</li>
</ul>
</p>
<p>Value --&gt;
String | BinaryValue (depending on Bits)
String | BinaryValue | Int | Long (depending on Bits)
</p>
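
The numeric-type bits described above can be decoded with a mask and a switch. The following is a minimal sketch, assuming the bit layout listed in this section; the class and constant names (NumericBitsSketch, NUMERIC_MASK, and so on) are illustrative and are not the names used in the Lucene source. The 4-byte and 8-byte widths assume that Int/Float values occupy four bytes and Long/Double values occupy eight.

```java
// Sketch of decoding the numeric-type bits of the stored-field "Bits" byte.
// All names here are illustrative assumptions, not Lucene identifiers.
final class NumericBitsSketch {
    static final int NUMERIC_MASK = 0x7 << 3;   // 4th to 6th bits
    static final int NUMERIC_INT = 1 << 3;
    static final int NUMERIC_LONG = 2 << 3;
    static final int NUMERIC_FLOAT = 3 << 3;
    static final int NUMERIC_DOUBLE = 4 << 3;

    /** Returns a label for the numeric type encoded in the byte, or "NONE". */
    static String numericType(byte rawBits) {
        int numeric = (rawBits & 0xFF) & NUMERIC_MASK; // treat the byte as unsigned
        switch (numeric) {
            case 0: return "NONE";
            case NUMERIC_INT: return "INT";
            case NUMERIC_LONG: return "LONG";
            case NUMERIC_FLOAT: return "FLOAT";
            case NUMERIC_DOUBLE: return "DOUBLE";
            default:
                throw new IllegalArgumentException(
                    "Invalid numeric type: " + Integer.toHexString(numeric));
        }
    }

    /** Fixed width in bytes of the numeric value, or -1 if the field is not numeric. */
    static int valueWidth(byte rawBits) {
        String type = numericType(rawBits);
        if (type.equals("INT") || type.equals("FLOAT")) return 4;
        if (type.equals("LONG") || type.equals("DOUBLE")) return 8;
        return -1; // String or BinaryValue: length is stored with the value
    }
}
```

Because the lower bits are masked out first, other flags in the same byte do not affect the decoded numeric type.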
<p>BinaryValue --&gt;
@ -1889,7 +1910,7 @@ document.write("Last Published: " + document.lastModified);
</li>
</ol>
<a name="N106B9"></a><a name="Term Dictionary"></a>
<a name="N106D0"></a><a name="Term Dictionary"></a>
<h3 class="boxed">Term Dictionary</h3>
<p>
The term dictionary is represented as two files:
@ -2081,7 +2102,7 @@ document.write("Last Published: " + document.lastModified);
</li>
</ol>
<a name="N1073D"></a><a name="Frequencies"></a>
<a name="N10754"></a><a name="Frequencies"></a>
<h3 class="boxed">Frequencies</h3>
<p>
The .frq file contains the lists of documents
@ -2209,7 +2230,7 @@ document.write("Last Published: " + document.lastModified);
entry in level-1. In the example, entry 15 on level 1 has a pointer to entry 15 on level 0, and entry 31 on level 1 has a pointer
to entry 31 on level 0.
</p>
<a name="N107C5"></a><a name="Positions"></a>
<a name="N107DC"></a><a name="Positions"></a>
<h3 class="boxed">Positions</h3>
<p>
The .prx file contains the lists of positions that
@ -2279,7 +2300,7 @@ document.write("Last Published: " + document.lastModified);
Payload. If PayloadLength is not stored, then this Payload has the same
length as the Payload at the previous position.
</p>
<a name="N10801"></a><a name="Normalization Factors"></a>
<a name="N10818"></a><a name="Normalization Factors"></a>
<h3 class="boxed">Normalization Factors</h3>
<p>There's a single .nrm file containing all norms:
</p>
@ -2359,7 +2380,7 @@ document.write("Last Published: " + document.lastModified);
</p>
<p>Separate norm files are created (when adequate) for both compound and non compound segments.
</p>
<a name="N10852"></a><a name="Term Vectors"></a>
<a name="N10869"></a><a name="Term Vectors"></a>
<h3 class="boxed">Term Vectors</h3>
<p>
Term Vector support is optional on a field by
@ -2495,7 +2516,7 @@ document.write("Last Published: " + document.lastModified);
</li>
</ol>
<a name="N108EE"></a><a name="Deleted Documents"></a>
<a name="N10905"></a><a name="Deleted Documents"></a>
<h3 class="boxed">Deleted Documents</h3>
<p>The .del file is
optional, and only exists when a segment contains deletions.
@ -2559,7 +2580,7 @@ document.write("Last Published: " + document.lastModified);
</div>
<a name="N10928"></a><a name="Limitations"></a>
<a name="N1093F"></a><a name="Limitations"></a>
<h2 class="boxed">Limitations</h2>
<div class="section">
<p>

Binary file not shown.

View File

@ -3,7 +3,7 @@
<head>
<META http-equiv="Content-Type" content="text/html; charset=UTF-8">
<meta content="Apache Forrest" name="Generator">
<meta name="Forrest-version" content="0.8">
<meta name="Forrest-version" content="0.9">
<meta name="Forrest-skin-name" content="lucene">
<title>
Apache Lucene - Getting Started Guide
@ -268,15 +268,13 @@ may wish to skip sections.
<li>
<a href="demo.html">About the command-line Lucene demo and its usage</a>. This section
is intended for anyone who wants to use the command-line Lucene demo.</li>
<p></p>
is intended for anyone who wants to use the command-line Lucene demo.</li>
<li>
<a href="demo2.html">About the sources and implementation for the command-line Lucene
demo</a>. This section walks through the implementation details (sources) of the
command-line Lucene demo. This section is intended for developers.</li>
<p></p>
command-line Lucene demo. This section is intended for developers.</li>
</ul>
</div>

Binary file not shown.

View File

@ -3,7 +3,7 @@
<head>
<META http-equiv="Content-Type" content="text/html; charset=UTF-8">
<meta content="Apache Forrest" name="Generator">
<meta name="Forrest-version" content="0.8">
<meta name="Forrest-version" content="0.9">
<meta name="Forrest-skin-name" content="lucene">
<title>Lucene Java Documentation</title>
<link type="text/css" href="skin/basic.css" rel="stylesheet">

Binary file not shown.

View File

@ -3,7 +3,7 @@
<head>
<META http-equiv="Content-Type" content="text/html; charset=UTF-8">
<meta content="Apache Forrest" name="Generator">
<meta name="Forrest-version" content="0.8">
<meta name="Forrest-version" content="0.9">
<meta name="Forrest-skin-name" content="lucene">
<title>Site Linkmap Table of Contents</title>
<link type="text/css" href="skin/basic.css" rel="stylesheet">

Binary file not shown.

View File

@ -3,7 +3,7 @@
<head>
<META http-equiv="Content-Type" content="text/html; charset=UTF-8">
<meta content="Apache Forrest" name="Generator">
<meta name="Forrest-version" content="0.8">
<meta name="Forrest-version" content="0.9">
<meta name="Forrest-skin-name" content="lucene">
<title>
Apache Lucene - Lucene Contrib

Binary file not shown.

View File

@ -3,7 +3,7 @@
<head>
<META http-equiv="Content-Type" content="text/html; charset=UTF-8">
<meta content="Apache Forrest" name="Generator">
<meta name="Forrest-version" content="0.8">
<meta name="Forrest-version" content="0.9">
<meta name="Forrest-skin-name" content="lucene">
<title>
Apache Lucene - Query Parser Syntax

Binary file not shown.

View File

@ -3,7 +3,7 @@
<head>
<META http-equiv="Content-Type" content="text/html; charset=UTF-8">
<meta content="Apache Forrest" name="Generator">
<meta name="Forrest-version" content="0.8">
<meta name="Forrest-version" content="0.9">
<meta name="Forrest-skin-name" content="lucene">
<title>
Apache Lucene - Scoring

Binary file not shown.

Binary file not shown.

Size: 4.7 KiB

Binary file not shown.

Size: 2.2 KiB

View File

@ -3,7 +3,7 @@
<head>
<META http-equiv="Content-Type" content="text/html; charset=UTF-8">
<meta content="Apache Forrest" name="Generator">
<meta name="Forrest-version" content="0.8">
<meta name="Forrest-version" content="0.9">
<meta name="Forrest-skin-name" content="lucene">
<title>Apache Lucene - System Requirements</title>
<link type="text/css" href="skin/basic.css" rel="stylesheet">

Binary file not shown.

View File

@ -131,8 +131,13 @@ public final class Document {
/** Returns a field with the given name if any exist in this document, or
* null. If multiple fields exist with this name, this method returns the
* first value added.
* Do not use this method with lazy loaded fields.
* Do not use this method with lazy loaded fields or {@link NumericField}.
* @deprecated use {@link #getFieldable} instead and cast depending on
* data type.
* @throws ClassCastException if you try to retrieve a numerical or
* lazy loaded field.
*/
@Deprecated
public final Field getField(String name) {
return (Field) getFieldable(name);
}
@ -154,6 +159,8 @@ public final class Document {
* this document, or null. If multiple fields exist with this name, this
* method returns the first value added. If only binary fields with this name
* exist, returns null.
* For {@link NumericField} it returns the string value of the number. If you want
* the actual {@code NumericField} instance back, use {@link #getFieldable}.
*/
public final String get(String name) {
for (Fieldable field : fields) {
@ -177,13 +184,18 @@ public final class Document {
/**
* Returns an array of {@link Field}s with the given name.
* Do not use with lazy loaded fields.
* This method returns an empty array when there are no
* matching fields. It never returns null.
* Do not use this method with lazy loaded fields or {@link NumericField}.
*
* @param name the name of the field
* @return a <code>Field[]</code> array
* @deprecated use {@link #getFieldable} instead and cast depending on
* data type.
* @throws ClassCastException if you try to retrieve a numerical or
* lazy loaded field.
*/
@Deprecated
public final Field[] getFields(String name) {
List<Field> result = new ArrayList<Field>();
for (Fieldable field : fields) {
@ -230,6 +242,8 @@ public final class Document {
* Returns an array of values of the field specified as the method parameter.
* This method returns an empty array when there are no
* matching fields. It never returns null.
* For {@link NumericField}s it returns the string value of the number. If you want
* the actual {@code NumericField} instances back, use {@link #getFieldables}.
* @param name the name of the field
* @return a <code>String[]</code> of field values
*/

View File

@ -127,18 +127,18 @@ import org.apache.lucene.search.FieldCache; // javadocs
* class is a wrapper around this token stream type for
* easier, more intuitive usage.</p>
*
* <p><b>NOTE:</b> This class is only used during
* indexing. When retrieving the stored field value from a
* {@link Document} instance after search, you will get a
* conventional {@link Fieldable} instance where the numeric
* values are returned as {@link String}s (according to
* <code>toString(value)</code> of the used data type).
*
* @since 2.9
*/
public final class NumericField extends AbstractField {
private final NumericTokenStream numericTS;
/** Data type of the value in {@link NumericField}.
* @since 3.2
*/
public static enum DataType { INT, LONG, FLOAT, DOUBLE }
private transient NumericTokenStream numericTS;
private DataType type;
private final int precisionStep;
/**
* Creates a field for numeric values using the default <code>precisionStep</code>
@ -158,8 +158,8 @@ public final class NumericField extends AbstractField {
* a numeric value, before indexing a document containing this field,
* set a value using the various set<em>???</em>Value() methods.
* @param name the field name
* @param store if the field should be stored in plain text form
* (according to <code>toString(value)</code> of the used data type)
* @param store if the field should be stored, {@link Document#getFieldable}
* then returns {@code NumericField} instances on search results.
* @param index if the field should be indexed using {@link NumericTokenStream}
*/
public NumericField(String name, Field.Store store, boolean index) {
@ -186,19 +186,43 @@ public final class NumericField extends AbstractField {
* set a value using the various set<em>???</em>Value() methods.
* @param name the field name
* @param precisionStep the used <a href="../search/NumericRangeQuery.html#precisionStepDesc">precision step</a>
* @param store if the field should be stored in plain text form
* (according to <code>toString(value)</code> of the used data type)
* @param store if the field should be stored, {@link Document#getFieldable}
* then returns {@code NumericField} instances on search results.
* @param index if the field should be indexed using {@link NumericTokenStream}
*/
public NumericField(String name, int precisionStep, Field.Store store, boolean index) {
super(name, store, index ? Field.Index.ANALYZED_NO_NORMS : Field.Index.NO, Field.TermVector.NO);
this.precisionStep = precisionStep;
setOmitTermFreqAndPositions(true);
numericTS = new NumericTokenStream(precisionStep);
}
/** Returns a {@link NumericTokenStream} for indexing the numeric value. */
public TokenStream tokenStreamValue() {
return isIndexed() ? numericTS : null;
if (!isIndexed())
return null;
if (numericTS == null) {
// lazily init the TokenStream, as it is heavy to instantiate (attributes, ...)
// and is not needed when only loading stored fields
numericTS = new NumericTokenStream(precisionStep);
// initialize value in TokenStream
if (fieldsData != null) {
assert type != null;
final Number val = (Number) fieldsData;
switch (type) {
case INT:
numericTS.setIntValue(val.intValue()); break;
case LONG:
numericTS.setLongValue(val.longValue()); break;
case FLOAT:
numericTS.setFloatValue(val.floatValue()); break;
case DOUBLE:
numericTS.setDoubleValue(val.doubleValue()); break;
default:
assert false : "Should never get here";
}
}
}
return numericTS;
}
/** Always returns <code>null</code> for numeric fields */
@ -212,7 +236,10 @@ public final class NumericField extends AbstractField {
return null;
}
/** Returns the numeric value as a string (how it is stored, when {@link Field.Store#YES} is chosen). */
/** Returns the numeric value as a string. This format is also returned if you call {@link Document#get(String)}
* on search results. It is recommended to use {@link Document#getFieldable} instead,
* which returns {@code NumericField} instances. You can then use {@link #getNumericValue}
* to return the stored value. */
public String stringValue() {
return (fieldsData == null) ? null : fieldsData.toString();
}
@ -224,7 +251,14 @@ public final class NumericField extends AbstractField {
/** Returns the precision step. */
public int getPrecisionStep() {
return numericTS.getPrecisionStep();
return precisionStep;
}
/** Returns the data type of the current value, {@code null} if not yet set.
* @since 3.2
*/
public DataType getDataType() {
return type;
}
/**
@ -234,8 +268,9 @@ public final class NumericField extends AbstractField {
* <code>document.add(new NumericField(name, precisionStep).setLongValue(value))</code>
*/
public NumericField setLongValue(final long value) {
numericTS.setLongValue(value);
if (numericTS != null) numericTS.setLongValue(value);
fieldsData = Long.valueOf(value);
type = DataType.LONG;
return this;
}
@ -246,8 +281,9 @@ public final class NumericField extends AbstractField {
* <code>document.add(new NumericField(name, precisionStep).setIntValue(value))</code>
*/
public NumericField setIntValue(final int value) {
numericTS.setIntValue(value);
if (numericTS != null) numericTS.setIntValue(value);
fieldsData = Integer.valueOf(value);
type = DataType.INT;
return this;
}
@ -258,8 +294,9 @@ public final class NumericField extends AbstractField {
* <code>document.add(new NumericField(name, precisionStep).setDoubleValue(value))</code>
*/
public NumericField setDoubleValue(final double value) {
numericTS.setDoubleValue(value);
if (numericTS != null) numericTS.setDoubleValue(value);
fieldsData = Double.valueOf(value);
type = DataType.DOUBLE;
return this;
}
@ -270,8 +307,9 @@ public final class NumericField extends AbstractField {
* <code>document.add(new NumericField(name, precisionStep).setFloatValue(value))</code>
*/
public NumericField setFloatValue(final float value) {
numericTS.setFloatValue(value);
if (numericTS != null) numericTS.setFloatValue(value);
fieldsData = Float.valueOf(value);
type = DataType.FLOAT;
return this;
}
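
The lazy-initialization pattern used for the token stream in this change can be shown in isolation. This is a simplified sketch with hypothetical stand-in classes (ExpensiveStream, LazyNumericSketch), not the NumericField code itself: the expensive helper is created only on first request, and a value set beforehand is replayed into it at that point.

```java
// Hypothetical stand-in for an object that is costly to construct.
final class ExpensiveStream {
    static int created = 0;   // counts constructions, for demonstration only
    long value;
    ExpensiveStream() { created++; }
}

// Hypothetical stand-in for a field that owns such an object.
final class LazyNumericSketch {
    private ExpensiveStream stream;   // built on first use only
    private Long data;                // the value, if one has been set

    LazyNumericSketch setLongValue(long v) {
        if (stream != null) stream.value = v;   // keep an existing stream in sync
        data = v;
        return this;
    }

    Long storedValue() { return data; }

    ExpensiveStream stream() {
        if (stream == null) {
            stream = new ExpensiveStream();
            if (data != null) stream.value = data;   // replay the value set earlier
        }
        return stream;
    }
}
```

A caller that only reads storedValue() never pays for the ExpensiveStream, which mirrors the stored-field loading case mentioned in the code comment.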

View File

@ -24,10 +24,11 @@ import org.apache.lucene.document.Field;
import org.apache.lucene.document.FieldSelector;
import org.apache.lucene.document.FieldSelectorResult;
import org.apache.lucene.document.Fieldable;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.IndexInput;
import org.apache.lucene.document.NumericField;
import org.apache.lucene.store.AlreadyClosedException;
import org.apache.lucene.store.BufferedIndexInput;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.IndexInput;
import org.apache.lucene.util.CloseableThreadLocal;
import java.io.IOException;
@ -212,40 +213,39 @@ public final class FieldsReader implements Cloneable {
Document doc = new Document();
int numFields = fieldsStream.readVInt();
for (int i = 0; i < numFields; i++) {
out: for (int i = 0; i < numFields; i++) {
int fieldNumber = fieldsStream.readVInt();
FieldInfo fi = fieldInfos.fieldInfo(fieldNumber);
FieldSelectorResult acceptField = fieldSelector == null ? FieldSelectorResult.LOAD : fieldSelector.accept(fi.name);
byte bits = fieldsStream.readByte();
assert bits <= FieldsWriter.FIELD_IS_TOKENIZED + FieldsWriter.FIELD_IS_BINARY;
int bits = fieldsStream.readByte() & 0xFF;
assert bits <= (FieldsWriter.FIELD_IS_NUMERIC_MASK | FieldsWriter.FIELD_IS_TOKENIZED | FieldsWriter.FIELD_IS_BINARY): "bits=" + Integer.toHexString(bits);
boolean tokenize = (bits & FieldsWriter.FIELD_IS_TOKENIZED) != 0;
boolean binary = (bits & FieldsWriter.FIELD_IS_BINARY) != 0;
//TODO: Find an alternative approach here if this list continues to grow beyond the
//list of 5 or 6 currently here. See Lucene 762 for discussion
if (acceptField.equals(FieldSelectorResult.LOAD)) {
addField(doc, fi, binary, tokenize);
}
else if (acceptField.equals(FieldSelectorResult.LOAD_AND_BREAK)){
addField(doc, fi, binary, tokenize);
break;//Get out of this loop
}
else if (acceptField.equals(FieldSelectorResult.LAZY_LOAD)) {
addFieldLazy(doc, fi, binary, tokenize, true);
}
else if (acceptField.equals(FieldSelectorResult.LATENT)) {
addFieldLazy(doc, fi, binary, tokenize, false);
}
else if (acceptField.equals(FieldSelectorResult.SIZE)){
skipField(addFieldSize(doc, fi, binary));
}
else if (acceptField.equals(FieldSelectorResult.SIZE_AND_BREAK)){
addFieldSize(doc, fi, binary);
break;
}
else {
skipField();
final int numeric = bits & FieldsWriter.FIELD_IS_NUMERIC_MASK;
switch (acceptField) {
case LOAD:
addField(doc, fi, binary, tokenize, numeric);
break;
case LOAD_AND_BREAK:
addField(doc, fi, binary, tokenize, numeric);
break out; //Get out of this loop
case LAZY_LOAD:
addFieldLazy(doc, fi, binary, tokenize, true, numeric);
break;
case LATENT:
addFieldLazy(doc, fi, binary, tokenize, false, numeric);
break;
case SIZE:
skipFieldBytes(addFieldSize(doc, fi, binary, numeric));
break;
case SIZE_AND_BREAK:
addFieldSize(doc, fi, binary, numeric);
break out; //Get out of this loop
default:
skipField(numeric);
}
}
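
The "out:" label matters because a plain "break" inside a "switch" leaves only the switch, not the enclosing loop. The sketch below demonstrates that behavior with a hypothetical enum and method (Action, countLoaded); it is a standalone illustration of the Java idiom, not part of FieldsReader.

```java
final class LabeledBreakSketch {
    // Hypothetical, reduced set of selector results.
    enum Action { LOAD, LOAD_AND_BREAK, SKIP }

    /** Counts loaded items, stopping after the first LOAD_AND_BREAK. */
    static int countLoaded(Action[] actions) {
        int loaded = 0;
        out: for (Action a : actions) {
            switch (a) {
                case LOAD:
                    loaded++;
                    break;        // leaves the switch only; the loop continues
                case LOAD_AND_BREAK:
                    loaded++;
                    break out;    // leaves the labeled for loop
                default:
                    // SKIP: nothing to load
            }
        }
        return loaded;
    }
}
```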
@ -282,72 +282,121 @@ public final class FieldsReader implements Cloneable {
* Skip the field. We still have to read some of the information about the field, but can skip past the actual content.
* This will have the most payoff on large fields.
*/
private void skipField() throws IOException {
skipField(fieldsStream.readVInt());
private void skipField(int numeric) throws IOException {
final int numBytes;
switch(numeric) {
case 0:
numBytes = fieldsStream.readVInt();
break;
case FieldsWriter.FIELD_IS_NUMERIC_INT:
case FieldsWriter.FIELD_IS_NUMERIC_FLOAT:
numBytes = 4;
break;
case FieldsWriter.FIELD_IS_NUMERIC_LONG:
case FieldsWriter.FIELD_IS_NUMERIC_DOUBLE:
numBytes = 8;
break;
default:
throw new FieldReaderException("Invalid numeric type: " + Integer.toHexString(numeric));
}
skipFieldBytes(numBytes);
}
private void skipField(int toRead) throws IOException {
private void skipFieldBytes(int toRead) throws IOException {
fieldsStream.seek(fieldsStream.getFilePointer() + toRead);
}
private void addFieldLazy(Document doc, FieldInfo fi, boolean binary, boolean tokenize, boolean cacheResult) throws IOException {
private NumericField loadNumericField(FieldInfo fi, int numeric) throws IOException {
assert numeric != 0;
switch(numeric) {
case FieldsWriter.FIELD_IS_NUMERIC_INT:
return new NumericField(fi.name, Field.Store.YES, fi.isIndexed).setIntValue(fieldsStream.readInt());
case FieldsWriter.FIELD_IS_NUMERIC_LONG:
return new NumericField(fi.name, Field.Store.YES, fi.isIndexed).setLongValue(fieldsStream.readLong());
case FieldsWriter.FIELD_IS_NUMERIC_FLOAT:
return new NumericField(fi.name, Field.Store.YES, fi.isIndexed).setFloatValue(Float.intBitsToFloat(fieldsStream.readInt()));
case FieldsWriter.FIELD_IS_NUMERIC_DOUBLE:
return new NumericField(fi.name, Field.Store.YES, fi.isIndexed).setDoubleValue(Double.longBitsToDouble(fieldsStream.readLong()));
default:
throw new FieldReaderException("Invalid numeric type: " + Integer.toHexString(numeric));
}
}
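
The float and double cases above rebuild the value from its raw bit pattern. The round trip can be sketched with the JDK alone, assuming the writing side stores Float.floatToIntBits(value) as a 4-byte int and Double.doubleToLongBits(value) as an 8-byte long (the writer is not shown in this part of the diff, so that is an assumption). The class name RawBitsRoundTrip and the use of java.nio.ByteBuffer in place of Lucene's stream classes are illustrative choices.

```java
import java.nio.ByteBuffer;

// Sketch: store floating-point values as raw int/long bit patterns and read them back.
final class RawBitsRoundTrip {
    static byte[] writeFloat(float v) {
        // 4 bytes: the IEEE 754 bit pattern of the float, written as an int
        return ByteBuffer.allocate(4).putInt(Float.floatToIntBits(v)).array();
    }

    static float readFloat(byte[] bytes) {
        return Float.intBitsToFloat(ByteBuffer.wrap(bytes).getInt());
    }

    static byte[] writeDouble(double v) {
        // 8 bytes: the IEEE 754 bit pattern of the double, written as a long
        return ByteBuffer.allocate(8).putLong(Double.doubleToLongBits(v)).array();
    }

    static double readDouble(byte[] bytes) {
        return Double.longBitsToDouble(ByteBuffer.wrap(bytes).getLong());
    }
}
```

Storing the bit pattern keeps the value exact and gives every float and double a fixed width, which is what allows a reader to skip such a field without a length prefix.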
private void addFieldLazy(Document doc, FieldInfo fi, boolean binary, boolean tokenize, boolean cacheResult, int numeric) throws IOException {
final AbstractField f;
if (binary) {
int toRead = fieldsStream.readVInt();
long pointer = fieldsStream.getFilePointer();
//was: doc.add(new Fieldable(fi.name, b, Fieldable.Store.YES));
doc.add(new LazyField(fi.name, Field.Store.YES, toRead, pointer, binary, cacheResult));
f = new LazyField(fi.name, Field.Store.YES, toRead, pointer, binary, cacheResult);
//Need to move the pointer ahead by toRead positions
fieldsStream.seek(pointer + toRead);
} else if (numeric != 0) {
f = loadNumericField(fi, numeric);
} else {
Field.Store store = Field.Store.YES;
Field.Index index = Field.Index.toIndex(fi.isIndexed, tokenize);
Field.TermVector termVector = Field.TermVector.toTermVector(fi.storeTermVector, fi.storeOffsetWithTermVector, fi.storePositionWithTermVector);
AbstractField f;
int length = fieldsStream.readVInt();
long pointer = fieldsStream.getFilePointer();
//Skip ahead of where we are by the length of what is stored
fieldsStream.seek(pointer+length);
f = new LazyField(fi.name, store, index, termVector, length, pointer, binary, cacheResult);
f.setOmitNorms(fi.omitNorms);
f.setOmitTermFreqAndPositions(fi.omitTermFreqAndPositions);
doc.add(f);
}
f.setOmitNorms(fi.omitNorms);
f.setOmitTermFreqAndPositions(fi.omitTermFreqAndPositions);
doc.add(f);
}
private void addField(Document doc, FieldInfo fi, boolean binary, boolean tokenize) throws CorruptIndexException, IOException {
private void addField(Document doc, FieldInfo fi, boolean binary, boolean tokenize, int numeric) throws CorruptIndexException, IOException {
final AbstractField f;
if (binary) {
int toRead = fieldsStream.readVInt();
final byte[] b = new byte[toRead];
fieldsStream.readBytes(b, 0, b.length);
doc.add(new Field(fi.name, b));
f = new Field(fi.name, b);
} else if (numeric != 0) {
f = loadNumericField(fi, numeric);
} else {
Field.Store store = Field.Store.YES;
Field.Index index = Field.Index.toIndex(fi.isIndexed, tokenize);
Field.TermVector termVector = Field.TermVector.toTermVector(fi.storeTermVector, fi.storeOffsetWithTermVector, fi.storePositionWithTermVector);
AbstractField f;
f = new Field(fi.name, // name
false,
fieldsStream.readString(), // read value
store,
index,
termVector);
f.setOmitTermFreqAndPositions(fi.omitTermFreqAndPositions);
f.setOmitNorms(fi.omitNorms);
doc.add(f);
false,
fieldsStream.readString(), // read value
Field.Store.YES,
index,
termVector);
}
f.setOmitTermFreqAndPositions(fi.omitTermFreqAndPositions);
f.setOmitNorms(fi.omitNorms);
doc.add(f);
}
// Add the size of field as a byte[] containing the 4 bytes of the integer byte size (high order byte first; char = 2 bytes)
// Read just the size -- caller must skip the field content to continue reading fields
// Return the size in bytes or chars, depending on field type
private int addFieldSize(Document doc, FieldInfo fi, boolean binary) throws IOException {
int size = fieldsStream.readVInt(), bytesize = binary ? size : 2*size;
private int addFieldSize(Document doc, FieldInfo fi, boolean binary, int numeric) throws IOException {
final int bytesize, size;
switch(numeric) {
case 0:
size = fieldsStream.readVInt();
bytesize = binary ? size : 2*size;
break;
case FieldsWriter.FIELD_IS_NUMERIC_INT:
case FieldsWriter.FIELD_IS_NUMERIC_FLOAT:
size = bytesize = 4;
break;
case FieldsWriter.FIELD_IS_NUMERIC_LONG:
case FieldsWriter.FIELD_IS_NUMERIC_DOUBLE:
size = bytesize = 8;
break;
default:
throw new FieldReaderException("Invalid numeric type: " + Integer.toHexString(numeric));
}
byte[] sizebytes = new byte[4];
sizebytes[0] = (byte) (bytesize>>>24);
sizebytes[1] = (byte) (bytesize>>>16);
@ -358,7 +407,7 @@ public final class FieldsReader implements Cloneable {
}
/**
* A Lazy implementation of Fieldable that differs loading of fields until asked for, instead of when the Document is
* A Lazy implementation of Fieldable that defers loading of fields until asked for, instead of when the Document is
* loaded.
*/
private class LazyField extends AbstractField implements Fieldable {

View File

@ -21,22 +21,40 @@ import java.util.List;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Fieldable;
import org.apache.lucene.document.NumericField;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.IndexInput;
import org.apache.lucene.store.IndexOutput;
import org.apache.lucene.util.IOUtils;
final class FieldsWriter {
static final byte FIELD_IS_TOKENIZED = 0x1;
static final byte FIELD_IS_BINARY = 0x2;
static final int FIELD_IS_TOKENIZED = 1 << 0;
static final int FIELD_IS_BINARY = 1 << 1;
// the old bit 1 << 2 was compressed, is now left out
private static final int _NUMERIC_BIT_SHIFT = 3;
static final int FIELD_IS_NUMERIC_MASK = 0x07 << _NUMERIC_BIT_SHIFT;
static final int FIELD_IS_NUMERIC_INT = 1 << _NUMERIC_BIT_SHIFT;
static final int FIELD_IS_NUMERIC_LONG = 2 << _NUMERIC_BIT_SHIFT;
static final int FIELD_IS_NUMERIC_FLOAT = 3 << _NUMERIC_BIT_SHIFT;
static final int FIELD_IS_NUMERIC_DOUBLE = 4 << _NUMERIC_BIT_SHIFT;
// currently unused: static final int FIELD_IS_NUMERIC_SHORT = 5 << _NUMERIC_BIT_SHIFT;
// currently unused: static final int FIELD_IS_NUMERIC_BYTE = 6 << _NUMERIC_BIT_SHIFT;
// the next possible bits are: 1 << 6; 1 << 7
// Lucene 3.0: Removal of compressed fields
static final int FORMAT_LUCENE_3_0_NO_COMPRESSED_FIELDS = 2;
// Lucene 3.2: NumericFields are stored in binary format
static final int FORMAT_LUCENE_3_2_NUMERIC_FIELDS = 3;
// NOTE: if you introduce a new format, make it 1 higher
// than the current one, and always change this if you
// switch to a new format!
static final int FORMAT_CURRENT = FORMAT_LUCENE_3_0_NO_COMPRESSED_FIELDS;
static final int FORMAT_CURRENT = FORMAT_LUCENE_3_2_NUMERIC_FIELDS;
// when removing support for old versions, leave the last supported version here
static final int FORMAT_MINIMUM = FORMAT_LUCENE_3_0_NO_COMPRESSED_FIELDS;
@ -121,13 +139,26 @@ final class FieldsWriter {
final void writeField(int fieldNumber, Fieldable field) throws IOException {
fieldsStream.writeVInt(fieldNumber);
byte bits = 0;
int bits = 0;
if (field.isTokenized())
bits |= FieldsWriter.FIELD_IS_TOKENIZED;
bits |= FIELD_IS_TOKENIZED;
if (field.isBinary())
bits |= FieldsWriter.FIELD_IS_BINARY;
fieldsStream.writeByte(bits);
bits |= FIELD_IS_BINARY;
if (field instanceof NumericField) {
switch (((NumericField) field).getDataType()) {
case INT:
bits |= FIELD_IS_NUMERIC_INT; break;
case LONG:
bits |= FIELD_IS_NUMERIC_LONG; break;
case FLOAT:
bits |= FIELD_IS_NUMERIC_FLOAT; break;
case DOUBLE:
bits |= FIELD_IS_NUMERIC_DOUBLE; break;
default:
assert false : "Should never get here";
}
}
fieldsStream.writeByte((byte) bits);
if (field.isBinary()) {
final byte[] data;
@ -139,8 +170,22 @@ final class FieldsWriter {
fieldsStream.writeVInt(len);
fieldsStream.writeBytes(data, offset, len);
}
else {
} else if (field instanceof NumericField) {
final NumericField nf = (NumericField) field;
final Number n = nf.getNumericValue();
switch (nf.getDataType()) {
case INT:
fieldsStream.writeInt(n.intValue()); break;
case LONG:
fieldsStream.writeLong(n.longValue()); break;
case FLOAT:
fieldsStream.writeInt(Float.floatToIntBits(n.floatValue())); break;
case DOUBLE:
fieldsStream.writeLong(Double.doubleToLongBits(n.doubleValue())); break;
default:
assert false : "Should never get here";
}
} else {
fieldsStream.writeString(field.stringValue());
}
}
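
The numeric branch of `writeField` above stores floats and doubles as their raw IEEE 754 bit patterns, and `loadNumericField` reverses this with `Float.intBitsToFloat` and `Double.longBitsToDouble`. The following is a minimal sketch of that round trip, not the Lucene implementation: the class `NumericRoundTrip` and its method names are hypothetical, and `java.nio.ByteBuffer` stands in for the index stream under the assumption that the stream writes ints and longs high-order byte first.

```java
import java.nio.ByteBuffer;

// Hypothetical stand-in for the stored-fields stream; only the bit-level
// conversion calls are taken from the diff above.
final class NumericRoundTrip {

  // Encode a float as 4 bytes holding its IEEE 754 bit pattern.
  static byte[] encodeFloat(float value) {
    return ByteBuffer.allocate(4).putInt(Float.floatToIntBits(value)).array();
  }

  // Decode 4 stored bytes back into the original float.
  static float decodeFloat(byte[] stored) {
    return Float.intBitsToFloat(ByteBuffer.wrap(stored).getInt());
  }

  // Encode a double as 8 bytes holding its IEEE 754 bit pattern.
  static byte[] encodeDouble(double value) {
    return ByteBuffer.allocate(8).putLong(Double.doubleToLongBits(value)).array();
  }

  // Decode 8 stored bytes back into the original double.
  static double decodeDouble(byte[] stored) {
    return Double.longBitsToDouble(ByteBuffer.wrap(stored).getLong());
  }

  public static void main(String[] args) {
    System.out.println(decodeFloat(encodeFloat(1.5f)));
    System.out.println(decodeDouble(encodeDouble(-2.25d)));
  }
}
```

Because the value travels as a fixed-width bit pattern, no length prefix is needed, which is consistent with `addFieldSize` reporting a constant 4 or 8 bytes for numeric fields.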

View File

@ -94,6 +94,11 @@
Additionally segments track explicitly whether or
not they have term vectors. See LUCENE-2811 for details.
</p>
<p>
In version 3.2, numeric fields are written natively
to the stored fields file; previously they were stored in
text format only.
</p>
</section>
<section id="Definitions"><title>Definitions</title>
@ -1300,10 +1305,18 @@
<li>third bit is one for fields with compression option enabled
(if compression is enabled, the algorithm used is ZLIB),
only available for indexes until Lucene version 2.9.x</li>
<li>4th to 6th bits (mask: 0x7&lt;&lt;3) define the type of a
numeric field: <ul>
<li>all bits in mask are cleared if no numeric field at all</li>
<li>1&lt;&lt;3: Value is Int</li>
<li>2&lt;&lt;3: Value is Long</li>
<li>3&lt;&lt;3: Value is Int as Float (as of Float.intBitsToFloat)</li>
<li>4&lt;&lt;3: Value is Long as Double (as of Double.longBitsToDouble)</li>
</ul></li>
</ul>
</p>
<p>Value --&gt;
String | BinaryValue (depending on Bits)
String | BinaryValue | Int | Long (depending on Bits)
</p>
<p>BinaryValue --&gt;
ValueSize, &lt;Byte&gt;^ValueSize
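
As a rough illustration of the bit layout documented in this hunk, the Java sketch below decodes the flag byte of a stored field. The class `StoredFieldBits` and its methods are hypothetical names introduced only for this example; the constant values mirror those declared in the `FieldsWriter` hunk of this commit.

```java
// Hypothetical decoder for the per-field flag byte; not part of Lucene.
final class StoredFieldBits {
  static final int FIELD_IS_TOKENIZED = 1 << 0;
  static final int FIELD_IS_BINARY = 1 << 1;
  // Bit 1 << 2 was the old "compressed" flag and is no longer used.
  static final int NUMERIC_BIT_SHIFT = 3;
  static final int FIELD_IS_NUMERIC_MASK = 0x07 << NUMERIC_BIT_SHIFT;

  static boolean isTokenized(int bits) {
    return (bits & FIELD_IS_TOKENIZED) != 0;
  }

  static boolean isBinary(int bits) {
    return (bits & FIELD_IS_BINARY) != 0;
  }

  // Returns a label for the numeric type encoded in bits 4 to 6.
  static String numericType(int bits) {
    final int numeric = bits & FIELD_IS_NUMERIC_MASK;
    switch (numeric >>> NUMERIC_BIT_SHIFT) {
      case 0: return "none";
      case 1: return "int";
      case 2: return "long";
      case 3: return "float";   // stored as Int, read via Float.intBitsToFloat
      case 4: return "double";  // stored as Long, read via Double.longBitsToDouble
      default:
        throw new IllegalArgumentException(
            "Invalid numeric type: " + Integer.toHexString(numeric));
    }
  }

  public static void main(String[] args) {
    int bits = FIELD_IS_TOKENIZED | (4 << NUMERIC_BIT_SHIFT);
    System.out.println(numericType(bits)); // prints "double"
  }
}
```

Values 5 and 6 within the mask are reserved in the writer for short and byte types that are currently unused, so this sketch rejects them in the same way as any other unknown value.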

View File

@ -28,11 +28,11 @@ may wish to skip sections.
<ul>
<li><a href="demo.html">About the command-line Lucene demo and its usage</a>. This section
is intended for anyone who wants to use the command-line Lucene demo.</li> <p/>
is intended for anyone who wants to use the command-line Lucene demo.</li>
<li><a href="demo2.html">About the sources and implementation for the command-line Lucene
demo</a>. This section walks through the implementation details (sources) of the
command-line Lucene demo. This section is intended for developers.</li> <p/>
command-line Lucene demo. This section is intended for developers.</li>
</ul>
</section>

View File

@ -90,6 +90,8 @@ public class TestBackwardsCompatibility extends LuceneTestCase {
"30.nocfs",
"31.cfs",
"31.nocfs",
"32.cfs",
"32.nocfs",
};
final String[] unsupportedNames = {"19.cfs",

View File

@ -24,12 +24,14 @@ import java.util.*;
import org.apache.lucene.analysis.MockAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.document.NumericField;
import org.apache.lucene.document.FieldSelector;
import org.apache.lucene.document.FieldSelectorResult;
import org.apache.lucene.document.Fieldable;
import org.apache.lucene.document.LoadFirstFieldSelector;
import org.apache.lucene.document.SetBasedFieldSelector;
import org.apache.lucene.index.IndexWriterConfig.OpenMode;
import org.apache.lucene.search.FieldCache;
import org.apache.lucene.store.AlreadyClosedException;
import org.apache.lucene.store.BufferedIndexInput;
import org.apache.lucene.store.Directory;
@ -511,4 +513,69 @@ public class TestFieldsReader extends LuceneTestCase {
}
}
public void testNumericField() throws Exception {
Directory dir = newDirectory();
RandomIndexWriter w = new RandomIndexWriter(random, dir);
final int numDocs = _TestUtil.nextInt(random, 500, 1000) * RANDOM_MULTIPLIER;
final Number[] answers = new Number[numDocs];
final NumericField.DataType[] typeAnswers = new NumericField.DataType[numDocs];
for(int id=0;id<numDocs;id++) {
Document doc = new Document();
NumericField nf = new NumericField("nf", Field.Store.YES, false);
doc.add(nf);
final Number answer;
final NumericField.DataType typeAnswer;
if (random.nextBoolean()) {
// float/double
if (random.nextBoolean()) {
final float f = random.nextFloat();
nf.setFloatValue(f);
answer = Float.valueOf(f);
typeAnswer = NumericField.DataType.FLOAT;
} else {
final double d = random.nextDouble();
nf.setDoubleValue(d);
answer = Double.valueOf(d);
typeAnswer = NumericField.DataType.DOUBLE;
}
} else {
// int/long
if (random.nextBoolean()) {
final int i = random.nextInt();
nf.setIntValue(i);
answer = Integer.valueOf(i);
typeAnswer = NumericField.DataType.INT;
} else {
final long l = random.nextLong();
nf.setLongValue(l);
answer = Long.valueOf(l);
typeAnswer = NumericField.DataType.LONG;
}
}
answers[id] = answer;
typeAnswers[id] = typeAnswer;
doc.add(new NumericField("id", Integer.MAX_VALUE, Field.Store.NO, true).setIntValue(id));
w.addDocument(doc);
}
final IndexReader r = w.getReader();
w.close();
assertEquals(numDocs, r.numDocs());
for(IndexReader sub : r.getSequentialSubReaders()) {
final int[] ids = FieldCache.DEFAULT.getInts(sub, "id");
for(int docID=0;docID<sub.numDocs();docID++) {
final Document doc = sub.document(docID);
final Fieldable f = doc.getFieldable("nf");
assertTrue("got f=" + f, f instanceof NumericField);
final NumericField nf = (NumericField) f;
assertEquals(answers[ids[docID]], nf.getNumericValue());
assertSame(typeAnswers[ids[docID]], nf.getDataType());
}
}
r.close();
dir.close();
}
}

View File

@ -208,7 +208,7 @@ public class TermVectorComponent extends SearchComponent implements SolrCoreAwar
if (keyField != null) {
Document document = reader.document(docId, fieldSelector);
Fieldable uniqId = document.getField(uniqFieldName);
Fieldable uniqId = document.getFieldable(uniqFieldName);
String uniqVal = null;
if (uniqId != null) {
uniqVal = keyField.getType().storedToReadable(uniqId);

View File

@ -401,13 +401,24 @@ public class DefaultSolrHighlighter extends SolrHighlighter implements PluginInf
private void doHighlightingByHighlighter( Query query, SolrQueryRequest req, NamedList docSummaries,
int docId, Document doc, String fieldName ) throws IOException {
final SolrIndexSearcher searcher = req.getSearcher();
final IndexSchema schema = searcher.getSchema();
// TODO: Currently in trunk highlighting numeric fields is broken (Lucene) -
// so we disable them until fixed (see LUCENE-3080)!
// BEGIN: Hack
final SchemaField schemaField = schema.getFieldOrNull(fieldName);
if (schemaField != null && (
(schemaField.getType() instanceof org.apache.solr.schema.TrieField) ||
(schemaField.getType() instanceof org.apache.solr.schema.TrieDateField)
)) return;
// END: Hack
SolrParams params = req.getParams();
String[] docTexts = doc.getValues(fieldName);
// according to Document javadoc, doc.getValues() never returns null. check empty instead of null
if (docTexts.length == 0) return;
SolrIndexSearcher searcher = req.getSearcher();
IndexSchema schema = searcher.getSchema();
TokenStream tstream = null;
int numFragments = getMaxSnippets(fieldName, params);
boolean mergeContiguousFragments = isMergeContiguousFragments(fieldName, params);

View File

@ -19,7 +19,6 @@ package org.apache.solr.schema;
import org.apache.solr.common.SolrException;
import org.apache.solr.common.SolrException.ErrorCode;
import org.apache.lucene.document.Field;
import org.apache.lucene.document.Fieldable;
import org.apache.lucene.search.SortField;
import org.apache.solr.search.QParser;

View File

@ -18,210 +18,125 @@
package org.apache.solr.schema;
import org.apache.noggit.CharArr;
import org.apache.solr.common.SolrException;
import org.apache.solr.analysis.CharFilterFactory;
import org.apache.solr.analysis.TokenFilterFactory;
import org.apache.solr.analysis.TokenizerChain;
import org.apache.solr.analysis.TrieTokenizerFactory;
import org.apache.solr.search.function.*;
import org.apache.solr.search.function.ValueSource;
import org.apache.solr.search.QParser;
import org.apache.solr.response.TextResponseWriter;
import org.apache.lucene.document.Fieldable;
import org.apache.lucene.document.Field;
import org.apache.lucene.search.SortField;
import org.apache.lucene.search.FieldCache;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.NumericRangeQuery;
import org.apache.lucene.search.cache.CachedArrayCreator;
import org.apache.lucene.search.cache.LongValuesCreator;
import org.apache.lucene.util.BytesRef;
import org.apache.lucene.util.NumericUtils;
import org.apache.lucene.analysis.TokenStream;
import org.apache.lucene.analysis.NumericTokenStream;
import java.util.Map;
import java.util.Date;
import java.io.IOException;
public class TrieDateField extends DateField {
protected int precisionStepArg = TrieField.DEFAULT_PRECISION_STEP; // the one passed in or defaulted
protected int precisionStep = precisionStepArg; // normalized
final TrieField wrappedField = new TrieField() {{
type = TrieTypes.DATE;
}};
@Override
protected void init(IndexSchema schema, Map<String, String> args) {
String p = args.remove("precisionStep");
if (p != null) {
precisionStepArg = Integer.parseInt(p);
}
// normalize the precisionStep
precisionStep = precisionStepArg;
if (precisionStep<=0 || precisionStep>=64) precisionStep=Integer.MAX_VALUE;
CharFilterFactory[] filterFactories = new CharFilterFactory[0];
TokenFilterFactory[] tokenFilterFactories = new TokenFilterFactory[0];
analyzer = new TokenizerChain(filterFactories, new TrieTokenizerFactory(TrieField.TrieTypes.DATE, precisionStep), tokenFilterFactories);
// for query time we only need one token, so we use the biggest possible precisionStep:
queryAnalyzer = new TokenizerChain(filterFactories, new TrieTokenizerFactory(TrieField.TrieTypes.DATE, Integer.MAX_VALUE), tokenFilterFactories);
wrappedField.init(schema, args);
analyzer = wrappedField.analyzer;
queryAnalyzer = wrappedField.queryAnalyzer;
}
@Override
public Date toObject(Fieldable f) {
byte[] arr = f.getBinaryValue();
if (arr==null) throw new SolrException(SolrException.ErrorCode.SERVER_ERROR,TrieField.badFieldString(f));
return new Date(TrieFieldHelper.toLong(arr));
return (Date) wrappedField.toObject(f);
}
@Override
public Object toObject(SchemaField sf, BytesRef term) {
return new Date(NumericUtils.prefixCodedToLong(term));
return wrappedField.toObject(sf, term);
}
@Override
public SortField getSortField(SchemaField field, boolean top) {
field.checkSortability();
int flags = CachedArrayCreator.CACHE_VALUES_AND_BITS;
boolean sortMissingLast = field.sortMissingLast();
boolean sortMissingFirst = field.sortMissingFirst();
Object missingValue = null;
if( sortMissingLast ) {
missingValue = top ? Long.MIN_VALUE : Long.MAX_VALUE;
} else if( sortMissingFirst ) {
missingValue = top ? Long.MAX_VALUE : Long.MIN_VALUE;
}
return new SortField(new LongValuesCreator(field.getName(), FieldCache.NUMERIC_UTILS_LONG_PARSER, flags), top).setMissingValue(missingValue);
return wrappedField.getSortField(field, top);
}
@Override
public ValueSource getValueSource(SchemaField field, QParser parser) {
field.checkFieldCacheSource(parser);
return new TrieDateFieldSource( new LongValuesCreator( field.getName(), FieldCache.NUMERIC_UTILS_LONG_PARSER, CachedArrayCreator.CACHE_VALUES_AND_BITS ));
}
@Override
public void write(TextResponseWriter writer, String name, Fieldable f) throws IOException {
byte[] arr = f.getBinaryValue();
if (arr==null) {
writer.writeStr(name, TrieField.badFieldString(f),true);
return;
}
writer.writeDate(name,new Date(TrieFieldHelper.toLong(arr)));
}
@Override
public boolean isTokenized() {
return true;
return wrappedField.getValueSource(field, parser);
}
/**
* @return the precisionStep used to index values into the field
*/
public int getPrecisionStep() {
return precisionStepArg;
return wrappedField.getPrecisionStep();
}
@Override
public void write(TextResponseWriter writer, String name, Fieldable f) throws IOException {
wrappedField.write(writer, name, f);
}
@Override
public boolean isTokenized() {
return wrappedField.isTokenized();
}
@Override
public boolean multiValuedFieldCache() {
return wrappedField.multiValuedFieldCache();
}
@Override
public String storedToReadable(Fieldable f) {
return toExternal(f);
return wrappedField.storedToReadable(f);
}
@Override
public String readableToIndexed(String val) {
// TODO: Numeric should never be handled as String, that may break in future lucene versions! Change to use BytesRef for term texts!
BytesRef bytes = new BytesRef(NumericUtils.BUF_SIZE_LONG);
NumericUtils.longToPrefixCoded(super.parseMath(null, val).getTime(), 0, bytes);
return bytes.utf8ToString();
return wrappedField.readableToIndexed(val);
}
@Override
public String toInternal(String val) {
return readableToIndexed(val);
return wrappedField.toInternal(val);
}
@Override
public String toExternal(Fieldable f) {
byte[] arr = f.getBinaryValue();
if (arr==null) return TrieField.badFieldString(f);
return super.toExternal(new Date(TrieFieldHelper.toLong(arr)));
return wrappedField.toExternal(f);
}
@Override
public String indexedToReadable(String _indexedForm) {
final BytesRef indexedForm = new BytesRef(_indexedForm);
return super.toExternal( new Date(NumericUtils.prefixCodedToLong(indexedForm)) );
return wrappedField.indexedToReadable(_indexedForm);
}
@Override
public void indexedToReadable(BytesRef input, CharArr out) {
String ext = super.toExternal( new Date(NumericUtils.prefixCodedToLong(input)) );
out.write(ext);
wrappedField.indexedToReadable(input, out);
}
@Override
public String storedToIndexed(Fieldable f) {
// TODO: optimize to remove redundant string conversion
return readableToIndexed(storedToReadable(f));
return wrappedField.storedToIndexed(f);
}
@Override
public Fieldable createField(SchemaField field, Object value, float boost) {
boolean indexed = field.indexed();
boolean stored = field.stored();
if (!indexed && !stored) {
if (log.isTraceEnabled())
log.trace("Ignoring unindexed/unstored field: " + field);
return null;
}
int ps = precisionStep;
byte[] arr=null;
TokenStream ts=null;
long time = (value instanceof Date)
? ((Date)value).getTime()
: super.parseMath(null, value.toString()).getTime();
if (stored) arr = TrieFieldHelper.toArr(time);
if (indexed) ts = new NumericTokenStream(ps).setLongValue(time);
Field f;
if (stored) {
f = new Field(field.getName(), arr);
if (indexed) f.setTokenStream(ts);
} else {
f = new Field(field.getName(), ts);
}
// term vectors aren't supported
f.setOmitNorms(field.omitNorms());
f.setOmitTermFreqAndPositions(field.omitTf());
f.setBoost(boost);
return f;
return wrappedField.createField(field, value, boost);
}
@Override
public Query getRangeQuery(QParser parser, SchemaField field, String min, String max, boolean minInclusive, boolean maxInclusive) {
return getRangeQuery(parser, field,
min==null ? null : super.parseMath(null,min),
max==null ? null : super.parseMath(null,max),
minInclusive, maxInclusive);
return wrappedField.getRangeQuery(parser, field, min, max, minInclusive, maxInclusive);
}
@Override
public Query getRangeQuery(QParser parser, SchemaField sf, Date min, Date max, boolean minInclusive, boolean maxInclusive) {
int ps = precisionStep;
Query query = NumericRangeQuery.newLongRange(sf.getName(), ps,
return NumericRangeQuery.newLongRange(sf.getName(), wrappedField.precisionStep,
min == null ? null : min.getTime(),
max == null ? null : max.getTime(),
minInclusive, maxInclusive);
return query;
}
}

View File

@ -17,6 +17,8 @@
package org.apache.solr.schema;
import org.apache.lucene.document.Fieldable;
import org.apache.lucene.document.Field;
import org.apache.lucene.document.NumericField;
import org.apache.lucene.search.*;
import org.apache.lucene.search.cache.CachedArrayCreator;
import org.apache.lucene.search.cache.DoubleValuesCreator;
@ -40,17 +42,17 @@ import java.util.Map;
import java.util.Date;
/**
* Provides field types to support for Lucene's Trie Range Queries.
* Provides field types to support Lucene's {@link NumericField}.
* See {@link org.apache.lucene.search.NumericRangeQuery} for more details.
* It supports integer, float, long, double and date types.
* <p/>
* For each number being added to this field, multiple terms are generated as per the algorithm described in the above
* link. The possible number of terms increases dramatically with higher precision steps (factor 2^precisionStep). For
* link. The possible number of terms increases dramatically with lower precision steps. For
* the fast range search to work, trie fields must be indexed.
* <p/>
* Trie fields are sortable in numerical order and can be used in function queries.
* <p/>
* Note that if you use a precisionStep of 32 for int/float and 64 for long/double, then multiple terms will not be
* Note that if you use a precisionStep of 32 for int/float and 64 for long/double/date, then multiple terms will not be
* generated, range search will be no faster than any other number field, but sorting will still be possible.
*
* @version $Id$
@ -101,21 +103,28 @@ public class TrieField extends FieldType {
@Override
public Object toObject(Fieldable f) {
byte[] arr = f.getBinaryValue();
if (arr==null) return badFieldString(f);
switch (type) {
case INTEGER:
return TrieFieldHelper.toInt(arr);
case FLOAT:
return TrieFieldHelper.toFloat(arr);
case LONG:
return TrieFieldHelper.toLong(arr);
case DOUBLE:
return TrieFieldHelper.toDouble(arr);
case DATE:
return new Date(TrieFieldHelper.toLong(arr));
default:
throw new SolrException(SolrException.ErrorCode.SERVER_ERROR, "Unknown type for trie field: " + f.name());
if (f instanceof NumericField) {
final Number val = ((NumericField) f).getNumericValue();
if (val==null) return badFieldString(f);
return (type == TrieTypes.DATE) ? new Date(val.longValue()) : val;
} else {
// the following code is "deprecated" and only to support pre-3.2 indexes using the old BinaryField encoding:
final byte[] arr = f.getBinaryValue();
if (arr==null) return badFieldString(f);
switch (type) {
case INTEGER:
return toInt(arr);
case FLOAT:
return Float.intBitsToFloat(toInt(arr));
case LONG:
return toLong(arr);
case DOUBLE:
return Double.longBitsToDouble(toLong(arr));
case DATE:
return new Date(toLong(arr));
default:
throw new SolrException(SolrException.ErrorCode.SERVER_ERROR, "Unknown type for trie field: " + f.name());
}
}
}
@ -198,30 +207,7 @@ public class TrieField extends FieldType {
@Override
public void write(TextResponseWriter writer, String name, Fieldable f) throws IOException {
byte[] arr = f.getBinaryValue();
if (arr==null) {
writer.writeStr(name, badFieldString(f),true);
return;
}
switch (type) {
case INTEGER:
writer.writeInt(name,TrieFieldHelper.toInt(arr));
break;
case FLOAT:
writer.writeFloat(name,TrieFieldHelper.toFloat(arr));
break;
case LONG:
writer.writeLong(name,TrieFieldHelper.toLong(arr));
break;
case DOUBLE:
writer.writeDouble(name,TrieFieldHelper.toDouble(arr));
break;
case DATE:
writer.writeDate(name,new Date(TrieFieldHelper.toLong(arr)));
break;
default:
throw new SolrException(SolrException.ErrorCode.SERVER_ERROR, "Unknown type for trie field: " + f.name());
}
writer.writeVal(name, toObject(f));
}
@Override
@ -290,6 +276,17 @@ public class TrieField extends FieldType {
return query;
}
@Deprecated
static int toInt(byte[] arr) {
return (arr[0]<<24) | ((arr[1]&0xff)<<16) | ((arr[2]&0xff)<<8) | (arr[3]&0xff);
}
@Deprecated
static long toLong(byte[] arr) {
int high = (arr[0]<<24) | ((arr[1]&0xff)<<16) | ((arr[2]&0xff)<<8) | (arr[3]&0xff);
int low = (arr[4]<<24) | ((arr[5]&0xff)<<16) | ((arr[6]&0xff)<<8) | (arr[7]&0xff);
return (((long)high)<<32) | (low&0x0ffffffffL);
}
@Override
public String storedToReadable(Fieldable f) {
@ -341,22 +338,9 @@ public class TrieField extends FieldType {
@Override
public String toExternal(Fieldable f) {
byte[] arr = f.getBinaryValue();
if (arr==null) return badFieldString(f);
switch (type) {
case INTEGER:
return Integer.toString(TrieFieldHelper.toInt(arr));
case FLOAT:
return Float.toString(TrieFieldHelper.toFloat(arr));
case LONG:
return Long.toString(TrieFieldHelper.toLong(arr));
case DOUBLE:
return Double.toString(TrieFieldHelper.toDouble(arr));
case DATE:
return dateField.formatDate(new Date(TrieFieldHelper.toLong(arr)));
default:
throw new SolrException(SolrException.ErrorCode.SERVER_ERROR, "Unknown type for trie field: " + f.name());
}
return (type == TrieTypes.DATE)
? dateField.toExternal((Date) toObject(f))
: toObject(f).toString();
}
@Override
@ -372,7 +356,7 @@ public class TrieField extends FieldType {
case DOUBLE:
return Double.toString( NumericUtils.sortableLongToDouble(NumericUtils.prefixCodedToLong(indexedForm)) );
case DATE:
return dateField.formatDate( new Date(NumericUtils.prefixCodedToLong(indexedForm)) );
return dateField.toExternal( new Date(NumericUtils.prefixCodedToLong(indexedForm)) );
default:
throw new SolrException(SolrException.ErrorCode.SERVER_ERROR, "Unknown type for trie field: " + type);
}
@ -397,7 +381,7 @@ public class TrieField extends FieldType {
s = Double.toString( NumericUtils.sortableLongToDouble(NumericUtils.prefixCodedToLong(indexedForm)) );
break;
case DATE:
s = dateField.formatDate( new Date(NumericUtils.prefixCodedToLong(indexedForm)) );
s = dateField.toExternal( new Date(NumericUtils.prefixCodedToLong(indexedForm)) );
break;
default:
throw new SolrException(SolrException.ErrorCode.SERVER_ERROR, "Unknown type for trie field: " + type);
@ -426,59 +410,117 @@ public class TrieField extends FieldType {
@Override
public String storedToIndexed(Fieldable f) {
// TODO: optimize to remove redundant string conversion
return readableToIndexed(storedToReadable(f));
final BytesRef bytes = new BytesRef(NumericUtils.BUF_SIZE_LONG);
if (f instanceof NumericField) {
final Number val = ((NumericField) f).getNumericValue();
if (val==null)
throw new SolrException(SolrException.ErrorCode.SERVER_ERROR, "Invalid field contents: "+f.name());
switch (type) {
case INTEGER:
NumericUtils.intToPrefixCoded(val.intValue(), 0, bytes);
break;
case FLOAT:
NumericUtils.intToPrefixCoded(NumericUtils.floatToSortableInt(val.floatValue()), 0, bytes);
break;
case LONG: //fallthrough!
case DATE:
NumericUtils.longToPrefixCoded(val.longValue(), 0, bytes);
break;
case DOUBLE:
NumericUtils.longToPrefixCoded(NumericUtils.doubleToSortableLong(val.doubleValue()), 0, bytes);
break;
default:
throw new SolrException(SolrException.ErrorCode.SERVER_ERROR, "Unknown type for trie field: " + f.name());
}
} else {
// the following code is "deprecated" and only to support pre-3.2 indexes using the old BinaryField encoding:
final byte[] arr = f.getBinaryValue();
if (arr==null)
throw new SolrException(SolrException.ErrorCode.SERVER_ERROR, "Invalid field contents: "+f.name());
switch (type) {
case INTEGER:
NumericUtils.intToPrefixCoded(toInt(arr), 0, bytes);
break;
case FLOAT: {
// WARNING: Code Duplication! Keep in sync with o.a.l.util.NumericUtils!
// copied from NumericUtils to not convert to/from float two times
// code in next 2 lines is identical to: int v = NumericUtils.floatToSortableInt(Float.intBitsToFloat(toInt(arr)));
int v = toInt(arr);
if (v<0) v ^= 0x7fffffff;
NumericUtils.intToPrefixCoded(v, 0, bytes);
break;
}
case LONG: //fallthrough!
case DATE:
NumericUtils.longToPrefixCoded(toLong(arr), 0, bytes);
break;
case DOUBLE: {
// WARNING: Code Duplication! Keep in sync with o.a.l.util.NumericUtils!
// copied from NumericUtils to not convert to/from double two times
// code in next 2 lines is identical to: long v = NumericUtils.doubleToSortableLong(Double.longBitsToDouble(toLong(arr)));
long v = toLong(arr);
if (v<0) v ^= 0x7fffffffffffffffL;
NumericUtils.longToPrefixCoded(v, 0, bytes);
break;
}
default:
throw new SolrException(SolrException.ErrorCode.SERVER_ERROR, "Unknown type for trie field: " + f.name());
}
}
return bytes.utf8ToString();
}
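
The legacy FLOAT and DOUBLE branches of `storedToIndexed` above flip the lower bits of negative values so that plain signed-integer comparison matches floating-point order. The sketch below re-implements that transform for floats to show the effect. It is an illustration under stated assumptions, not Lucene's `NumericUtils`: the class name `SortableFloatBits` is hypothetical, and only the `if (v<0) v ^= 0x7fffffff` step is taken from the diff.

```java
// Hypothetical re-implementation of the sortable-int transform shown in the diff.
final class SortableFloatBits {

  // Map a float to an int whose signed ordering matches the float ordering.
  static int floatToSortableInt(float value) {
    int bits = Float.floatToIntBits(value);
    // Negative floats have the sign bit set; flipping the other 31 bits
    // reverses their order so that "more negative" sorts first.
    if (bits < 0) bits ^= 0x7fffffff;
    return bits;
  }

  // Inverse of floatToSortableInt: the XOR is its own inverse.
  static float sortableIntToFloat(int sortable) {
    if (sortable < 0) sortable ^= 0x7fffffff;
    return Float.intBitsToFloat(sortable);
  }

  public static void main(String[] args) {
    float[] values = {-2f, -1f, 0f, 1f};
    for (float v : values) {
      System.out.println(v + " -> " + floatToSortableInt(v));
    }
  }
}
```

Without the flip, the raw bit patterns of negative floats sort in reverse, because IEEE 754 stores a sign bit plus a magnitude rather than a two's-complement value.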
@Override
public Fieldable createField(SchemaField field, Object value, float boost) {
TrieFieldHelper.FieldInfo info = new TrieFieldHelper.FieldInfo();
info.index = field.indexed();
info.store = field.stored();
info.precisionStep = precisionStep;
info.omitNorms = field.omitNorms();
info.omitTF = field.omitTf();
if (!info.index && !info.store) {
boolean indexed = field.indexed();
boolean stored = field.stored();
if (!indexed && !stored) {
if (log.isTraceEnabled())
log.trace("Ignoring unindexed/unstored field: " + field);
return null;
}
final NumericField f = new NumericField(field.getName(), precisionStep, stored ? Field.Store.YES : Field.Store.NO, indexed);
switch (type) {
case INTEGER:
int i = (value instanceof Number)
? ((Number)value).intValue()
: Integer.parseInt(value.toString());
return TrieFieldHelper.createIntField(field.getName(), i, info, boost);
f.setIntValue(i);
break;
case FLOAT:
float f = (value instanceof Number)
float fl = (value instanceof Number)
? ((Number)value).floatValue()
: Float.parseFloat(value.toString());
return TrieFieldHelper.createFloatField(field.getName(), f, info, boost);
f.setFloatValue(fl);
break;
case LONG:
long l = (value instanceof Number)
? ((Number)value).longValue()
: Long.parseLong(value.toString());
return TrieFieldHelper.createLongField(field.getName(), l, info, boost);
f.setLongValue(l);
break;
case DOUBLE:
double d = (value instanceof Number)
? ((Number)value).doubleValue()
: Double.parseDouble(value.toString());
return TrieFieldHelper.createDoubleField(field.getName(), d, info, boost);
f.setDoubleValue(d);
break;
case DATE:
Date date = (value instanceof Date)
? ((Date)value)
: dateField.parseMath(null, value.toString());
return TrieFieldHelper.createDateField(field.getName(), date, info, boost);
f.setLongValue(date.getTime());
break;
default:
throw new SolrException(SolrException.ErrorCode.SERVER_ERROR, "Unknown type for trie field: " + type);
}
f.setOmitNorms(field.omitNorms());
f.setOmitTermFreqAndPositions(field.omitTf());
f.setBoost(boost);
return f;
}
public enum TrieTypes {
@ -498,14 +540,12 @@ public class TrieField extends FieldType {
* that indexes multiple precisions per value.
*/
public static String getMainValuePrefix(FieldType ft) {
if (ft instanceof TrieDateField) {
int step = ((TrieDateField)ft).getPrecisionStep();
if (step <= 0 || step >=64) return null;
return LONG_PREFIX;
} else if (ft instanceof TrieField) {
TrieField trie = (TrieField)ft;
if (trie.precisionStep == Integer.MAX_VALUE) return null;
if (ft instanceof TrieDateField)
ft = ((TrieDateField) ft).wrappedField;
if (ft instanceof TrieField) {
final TrieField trie = (TrieField)ft;
if (trie.precisionStep == Integer.MAX_VALUE)
return null;
switch (trie.type) {
case INTEGER:
case FLOAT:
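
For reference, the value coercion that the rewritten createField applies to each trie type (use the Number accessor when the value is already a Number, otherwise parse its string form) can be sketched on its own. This is a minimal sketch, not Solr code: the class name NumericCoercionSketch and its helper methods are hypothetical and only mirror the ternaries shown in the hunk above.

```java
// Hypothetical standalone sketch; not part of Solr or Lucene.
final class NumericCoercionSketch {
    private NumericCoercionSketch() {}

    // Same shape as the INTEGER case: Number.intValue() or Integer.parseInt().
    static int toInt(Object value) {
        return (value instanceof Number)
            ? ((Number) value).intValue()
            : Integer.parseInt(value.toString());
    }

    // Same shape as the LONG case.
    static long toLong(Object value) {
        return (value instanceof Number)
            ? ((Number) value).longValue()
            : Long.parseLong(value.toString());
    }

    // Same shape as the DOUBLE case.
    static double toDouble(Object value) {
        return (value instanceof Number)
            ? ((Number) value).doubleValue()
            : Double.parseDouble(value.toString());
    }

    public static void main(String[] args) {
        System.out.println(toInt("42"));      // parsed from its string form
        System.out.println(toInt(3.9d));      // Number path: narrowing truncates toward zero
        System.out.println(toDouble("2.5"));  // parsed from its string form
    }
}
```

Note that a non-numeric string makes the parse methods throw NumberFormatException, which this sketch does not catch.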

@ -1,166 +0,0 @@
/**
* Copyright 2005 The Apache Software Foundation
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/
package org.apache.solr.schema;
import java.util.Date;
import org.apache.lucene.analysis.NumericTokenStream;
import org.apache.lucene.analysis.TokenStream;
import org.apache.lucene.document.Field;
import org.apache.lucene.document.Fieldable;
/**
* Helper class to make TrieFields compatible with ones written in solr
*
* TODO -- Something like this should be in lucene
* see: LUCENE-3001
*/
public class TrieFieldHelper {
private TrieFieldHelper() {}
public static class FieldInfo {
public int precisionStep = 8; // same as solr default
public boolean store = true;
public boolean index = true;
public boolean omitNorms = true;
public boolean omitTF = true;
}
//----------------------------------------------
// Create Field
//----------------------------------------------
private static Fieldable createField(String name, byte[] arr, TokenStream ts, FieldInfo info, float boost) {
Field f;
if (info.store) {
f = new Field(name, arr);
if (info.index) f.setTokenStream(ts);
} else {
f = new Field(name, ts);
}
// term vectors aren't supported
f.setOmitNorms(info.omitNorms);
f.setOmitTermFreqAndPositions(info.omitTF);
f.setBoost(boost);
return f;
}
public static Fieldable createIntField(String name, int value, FieldInfo info, float boost) {
byte[] arr=null;
TokenStream ts=null;
if (info.store) arr = TrieFieldHelper.toArr(value);
if (info.index) ts = new NumericTokenStream(info.precisionStep).setIntValue(value);
return createField(name, arr, ts, info, boost);
}
public static Fieldable createFloatField(String name, float value, FieldInfo info, float boost) {
byte[] arr=null;
TokenStream ts=null;
if (info.store) arr = TrieFieldHelper.toArr(value);
if (info.index) ts = new NumericTokenStream(info.precisionStep).setFloatValue(value);
return createField(name, arr, ts, info, boost);
}
public static Fieldable createLongField(String name, long value, FieldInfo info, float boost) {
byte[] arr=null;
TokenStream ts=null;
if (info.store) arr = TrieFieldHelper.toArr(value);
if (info.index) ts = new NumericTokenStream(info.precisionStep).setLongValue(value);
return createField(name, arr, ts, info, boost);
}
public static Fieldable createDoubleField(String name, double value, FieldInfo info, float boost) {
byte[] arr=null;
TokenStream ts=null;
if (info.store) arr = TrieFieldHelper.toArr(value);
if (info.index) ts = new NumericTokenStream(info.precisionStep).setDoubleValue(value);
return createField(name, arr, ts, info, boost);
}
public static Fieldable createDateField(String name, Date value, FieldInfo info, float boost) {
// TODO, make sure the date is within long range!
return createLongField(name, value.getTime(), info, boost);
}
//----------------------------------------------
// number <=> byte[]
//----------------------------------------------
public static int toInt(byte[] arr) {
return (arr[0]<<24) | ((arr[1]&0xff)<<16) | ((arr[2]&0xff)<<8) | (arr[3]&0xff);
}
public static long toLong(byte[] arr) {
int high = (arr[0]<<24) | ((arr[1]&0xff)<<16) | ((arr[2]&0xff)<<8) | (arr[3]&0xff);
int low = (arr[4]<<24) | ((arr[5]&0xff)<<16) | ((arr[6]&0xff)<<8) | (arr[7]&0xff);
return (((long)high)<<32) | (low&0x0ffffffffL);
}
public static float toFloat(byte[] arr) {
return Float.intBitsToFloat(toInt(arr));
}
public static double toDouble(byte[] arr) {
return Double.longBitsToDouble(toLong(arr));
}
public static byte[] toArr(int val) {
byte[] arr = new byte[4];
arr[0] = (byte)(val>>>24);
arr[1] = (byte)(val>>>16);
arr[2] = (byte)(val>>>8);
arr[3] = (byte)(val);
return arr;
}
public static byte[] toArr(long val) {
byte[] arr = new byte[8];
arr[0] = (byte)(val>>>56);
arr[1] = (byte)(val>>>48);
arr[2] = (byte)(val>>>40);
arr[3] = (byte)(val>>>32);
arr[4] = (byte)(val>>>24);
arr[5] = (byte)(val>>>16);
arr[6] = (byte)(val>>>8);
arr[7] = (byte)(val);
return arr;
}
public static byte[] toArr(float val) {
return toArr(Float.floatToRawIntBits(val));
}
public static byte[] toArr(double val) {
return toArr(Double.doubleToRawLongBits(val));
}
}
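
The removed helper stored numeric values as big-endian byte arrays (most significant byte first). The round trip can be sketched in isolation as follows. This is a minimal sketch under stated assumptions: the class name TrieBytesSketch is hypothetical, the long conversions are written as loops rather than the unrolled form in the deleted file, and only the int, long and float variants are shown.

```java
import java.util.Arrays;

// Hypothetical standalone sketch of the big-endian encoding used by the
// deleted TrieFieldHelper; not part of Solr or Lucene.
final class TrieBytesSketch {
    private TrieBytesSketch() {}

    // int -> 4 bytes, most significant byte first
    static byte[] toArr(int val) {
        return new byte[] {
            (byte) (val >>> 24), (byte) (val >>> 16), (byte) (val >>> 8), (byte) val
        };
    }

    // 4 bytes -> int; the low three bytes are masked to undo sign extension
    static int toInt(byte[] arr) {
        return (arr[0] << 24) | ((arr[1] & 0xff) << 16) | ((arr[2] & 0xff) << 8) | (arr[3] & 0xff);
    }

    // long -> 8 bytes, most significant byte first
    static byte[] toArr(long val) {
        byte[] arr = new byte[8];
        for (int i = 0; i < 8; i++) {
            arr[i] = (byte) (val >>> (56 - 8 * i));
        }
        return arr;
    }

    // 8 bytes -> long
    static long toLong(byte[] arr) {
        long v = 0L;
        for (int i = 0; i < 8; i++) {
            v = (v << 8) | (arr[i] & 0xffL);
        }
        return v;
    }

    // float -> raw IEEE 754 bits -> 4 bytes
    static byte[] toArr(float val) {
        return toArr(Float.floatToRawIntBits(val));
    }

    static float toFloat(byte[] arr) {
        return Float.intBitsToFloat(toInt(arr));
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(toArr(258)));  // 258 == 0x00000102
        System.out.println(toInt(toArr(-123456)));
        System.out.println(toLong(toArr(Long.MIN_VALUE)) == Long.MIN_VALUE);
        System.out.println(toFloat(toArr(1.5f)));
    }
}
```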

@ -18,7 +18,7 @@
package org.apache.solr.update;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.document.Fieldable;
import org.apache.lucene.index.Term;
import org.apache.solr.common.SolrInputDocument;
import org.apache.solr.common.SolrInputField;
@ -74,7 +74,7 @@ public class AddUpdateCommand extends UpdateCommand {
if (sf != null) {
if (doc != null) {
schema.getUniqueKeyField();
Field storedId = doc.getField(sf.getName());
Fieldable storedId = doc.getFieldable(sf.getName());
indexedId = sf.getType().storedToIndexed(storedId);
}
if (solrDoc != null) {

@ -159,7 +159,7 @@ public class DocumentBuilder {
// default value are defacto 'required' fields.
List<String> missingFields = null;
for (SchemaField field : schema.getRequiredFields()) {
if (doc.getField(field.getName() ) == null) {
if (doc.getFieldable(field.getName() ) == null) {
if (field.getDefaultValue() != null) {
addField(doc, field, field.getDefaultValue(), 1.0f);
} else {
@ -313,7 +313,7 @@ public class DocumentBuilder {
// Now validate required fields or add default values
// fields with default values are defacto 'required'
for (SchemaField field : schema.getRequiredFields()) {
if (out.getField(field.getName() ) == null) {
if (out.getFieldable(field.getName() ) == null) {
if (field.getDefaultValue() != null) {
addField(out, field, field.getDefaultValue(), 1.0f);
}
@ -339,8 +339,7 @@ public class DocumentBuilder {
*/
public SolrDocument loadStoredFields( SolrDocument doc, Document luceneDoc )
{
for( Object f : luceneDoc.getFields() ) {
Fieldable field = (Fieldable)f;
for( Fieldable field : luceneDoc.getFields() ) {
if( field.isStored() ) {
SchemaField sf = schema.getField( field.name() );
if( !schema.isCopyFieldTarget( sf ) ) {

@ -21,7 +21,6 @@ package org.apache.solr.update;
import org.apache.lucene.index.IndexReader.AtomicReaderContext;
import org.apache.lucene.index.Term;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.document.Fieldable;
import org.apache.lucene.search.Collector;
import org.apache.lucene.search.Scorer;
@ -125,7 +124,7 @@ public abstract class UpdateHandler implements SolrInfoMBean {
protected final String getIndexedIdOptional(Document doc) {
if (idField == null) return null;
Field f = doc.getField(idField.getName());
Fieldable f = doc.getFieldable(idField.getName());
if (f == null) return null;
return idFieldType.storedToIndexed(f);
}

@ -561,7 +561,7 @@ public class BasicFunctionalityTest extends SolrTestCaseJ4 {
DocList dl = ((ResultContext) rsp.getValues().get("response")).docs;
org.apache.lucene.document.Document d = req.getSearcher().doc(dl.iterator().nextDoc());
// ensure field is not lazy
// ensure field is not lazy, only works for Non-Numeric fields currently (if you change schema behind test, this may fail)
assertTrue( d.getFieldable("test_hlt") instanceof Field );
assertTrue( d.getFieldable("title") instanceof Field );
req.close();

@ -79,7 +79,7 @@ public class MoreLikeThisHandlerTest extends SolrTestCaseJ4 {
params.set(CommonParams.Q, "id:42");
params.set(MoreLikeThisParams.MLT, "true");
params.set(MoreLikeThisParams.SIMILARITY_FIELDS, "name,subword,foo_ti");
params.set(MoreLikeThisParams.SIMILARITY_FIELDS, "name,subword");
params.set(MoreLikeThisParams.INTERESTING_TERMS, "details");
params.set(MoreLikeThisParams.MIN_TERM_FREQ,"1");
params.set(MoreLikeThisParams.MIN_DOC_FREQ,"1");

@ -109,8 +109,8 @@ public class DocumentBuilderTest extends SolrTestCaseJ4 {
doc.addField( "home", "2.2,3.3", 1.0f );
Document out = DocumentBuilder.toDocument( doc, core.getSchema() );
assertNotNull( out.get( "home" ) );//contains the stored value and term vector, if there is one
assertNotNull( out.getField( "home_0" + FieldType.POLY_FIELD_SEPARATOR + "double" ) );
assertNotNull( out.getField( "home_1" + FieldType.POLY_FIELD_SEPARATOR + "double" ) );
assertNotNull( out.getFieldable( "home_0" + FieldType.POLY_FIELD_SEPARATOR + "double" ) );
assertNotNull( out.getFieldable( "home_1" + FieldType.POLY_FIELD_SEPARATOR + "double" ) );
}
}