mirror of https://github.com/apache/lucene.git
- Added useful info to Overview, added a section about Range Searches and a
section about Field Grouping. Fixed a few small grammatical errors.
git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@149937 13f79535-47bb-0310-9956-ffa450edef68
parent 51585c68da
commit e8a9993725
|
@ -119,8 +119,36 @@
|
||||||
</td></tr>
|
</td></tr>
|
||||||
<tr><td>
|
<tr><td>
|
||||||
<blockquote>
|
<blockquote>
|
||||||
<p>Although Lucene provides the ability to create your own query's though its API, it also provides a rich query language through the QueryParser.</p>
|
<p>Although Lucene provides the ability to create your own
|
||||||
<p>This page provides syntax of Lucene's Query Parser, a lexer which interprets a string into a Lucene Query using JavaCC.</p>
|
queries through its API, it also provides a rich query
|
||||||
|
language through the Query Parser.</p>
|
||||||
|
<p>This page
|
||||||
|
provides syntax of Lucene's Query Parser, a lexer which
|
||||||
|
interprets a string into a Lucene Query using JavaCC.</p>
|
||||||
|
<p>
|
||||||
|
Before choosing to use the provided Query Parser, please consider the following:
|
||||||
|
<ol>
|
||||||
|
<li>If you are programmatically generating a query string and then
|
||||||
|
parsing it with the query parser then you should seriously consider building
|
||||||
|
your queries directly with the query API. In other words, the query
|
||||||
|
parser is designed for human-entered text, not for program-generated
|
||||||
|
text.</li>
|
||||||
|
|
||||||
|
<li>Untokenized fields are best added directly to queries, and not
|
||||||
|
through the query parser. If a field's values are generated programmatically
|
||||||
|
by the application, then so should query clauses for this field.
|
||||||
|
Analyzers, like the query parser, are designed to convert human-entered
|
||||||
|
text to terms. Program-generated values, like dates, keywords, etc.,
|
||||||
|
should be consistently program-generated.</li>
|
||||||
|
|
||||||
|
<li>In a query form, fields which are general text should use the query
|
||||||
|
parser. All others, such as date ranges, keywords, etc. are better added
|
||||||
|
directly through the query API. A field with a limited set of values
|
||||||
|
that can be specified with a pull-down menu should not be added to a
|
||||||
|
query string which is subsequently parsed, but rather added as a
|
||||||
|
TermQuery clause.</li>
|
||||||
|
</ol>
|
||||||
|
</p>
|
||||||
</blockquote>
|
</blockquote>
|
||||||
</p>
|
</p>
|
||||||
</td></tr>
|
</td></tr>
|
||||||
|
@ -373,6 +401,63 @@
|
||||||
</blockquote>
|
</blockquote>
|
||||||
</td></tr>
|
</td></tr>
|
||||||
<tr><td><br/></td></tr>
|
<tr><td><br/></td></tr>
|
||||||
|
</table>
|
||||||
|
<table border="0" cellspacing="0" cellpadding="2" width="100%">
|
||||||
|
<tr><td bgcolor="#828DA6">
|
||||||
|
<font color="#ffffff" face="arial,helvetica,sanserif">
|
||||||
|
<a name="Range Searches"><strong>Range Searches</strong></a>
|
||||||
|
</font>
|
||||||
|
</td></tr>
|
||||||
|
<tr><td>
|
||||||
|
<blockquote>
|
||||||
|
<p>Range Queries allow one to match documents whose field(s) values
|
||||||
|
are between the lower and upper bound specified by the Range Query.
|
||||||
|
Range Queries are inclusive (i.e. the query includes the specified lower and upper bound).
|
||||||
|
Bounds are compared lexicographically (as strings), not numerically.</p>
|
||||||
|
<div align="left">
|
||||||
|
<table cellspacing="4" cellpadding="0" border="0">
|
||||||
|
<tr>
|
||||||
|
<td bgcolor="#023264" width="1" height="1"><img src="/images/void.gif" width="1" height="1" vspace="0" hspace="0" border="0"/></td>
|
||||||
|
<td bgcolor="#023264" height="1"><img src="/images/void.gif" width="1" height="1" vspace="0" hspace="0" border="0"/></td>
|
||||||
|
<td bgcolor="#023264" width="1" height="1"><img src="/images/void.gif" width="1" height="1" vspace="0" hspace="0" border="0"/></td>
|
||||||
|
</tr>
|
||||||
|
<tr>
|
||||||
|
<td bgcolor="#023264" width="1"><img src="/images/void.gif" width="1" height="1" vspace="0" hspace="0" border="0"/></td>
|
||||||
|
<td bgcolor="#ffffff"><pre>mod_date:[20020101 TO 20030101]</pre></td>
|
||||||
|
<td bgcolor="#023264" width="1"><img src="/images/void.gif" width="1" height="1" vspace="0" hspace="0" border="0"/></td>
|
||||||
|
</tr>
|
||||||
|
<tr>
|
||||||
|
<td bgcolor="#023264" width="1" height="1"><img src="/images/void.gif" width="1" height="1" vspace="0" hspace="0" border="0"/></td>
|
||||||
|
<td bgcolor="#023264" height="1"><img src="/images/void.gif" width="1" height="1" vspace="0" hspace="0" border="0"/></td>
|
||||||
|
<td bgcolor="#023264" width="1" height="1"><img src="/images/void.gif" width="1" height="1" vspace="0" hspace="0" border="0"/></td>
|
||||||
|
</tr>
|
||||||
|
</table>
|
||||||
|
</div>
|
||||||
|
<p>This will find documents whose mod_date fields have values between 20020101 and 20030101.
|
||||||
|
Note that Range Queries are not reserved for date fields. You could also use range queries with non-date fields:</p>
|
||||||
|
<div align="left">
|
||||||
|
<table cellspacing="4" cellpadding="0" border="0">
|
||||||
|
<tr>
|
||||||
|
<td bgcolor="#023264" width="1" height="1"><img src="/images/void.gif" width="1" height="1" vspace="0" hspace="0" border="0"/></td>
|
||||||
|
<td bgcolor="#023264" height="1"><img src="/images/void.gif" width="1" height="1" vspace="0" hspace="0" border="0"/></td>
|
||||||
|
<td bgcolor="#023264" width="1" height="1"><img src="/images/void.gif" width="1" height="1" vspace="0" hspace="0" border="0"/></td>
|
||||||
|
</tr>
|
||||||
|
<tr>
|
||||||
|
<td bgcolor="#023264" width="1"><img src="/images/void.gif" width="1" height="1" vspace="0" hspace="0" border="0"/></td>
|
||||||
|
<td bgcolor="#ffffff"><pre>title:[Aida TO Carmen]</pre></td>
|
||||||
|
<td bgcolor="#023264" width="1"><img src="/images/void.gif" width="1" height="1" vspace="0" hspace="0" border="0"/></td>
|
||||||
|
</tr>
|
||||||
|
<tr>
|
||||||
|
<td bgcolor="#023264" width="1" height="1"><img src="/images/void.gif" width="1" height="1" vspace="0" hspace="0" border="0"/></td>
|
||||||
|
<td bgcolor="#023264" height="1"><img src="/images/void.gif" width="1" height="1" vspace="0" hspace="0" border="0"/></td>
|
||||||
|
<td bgcolor="#023264" width="1" height="1"><img src="/images/void.gif" width="1" height="1" vspace="0" hspace="0" border="0"/></td>
|
||||||
|
</tr>
|
||||||
|
</table>
|
||||||
|
</div>
|
||||||
|
<p>This will find all documents whose titles are between Aida and Carmen.</p>
|
||||||
|
</blockquote>
|
||||||
|
</td></tr>
|
||||||
|
<tr><td><br/></td></tr>
|
||||||
</table>
|
</table>
|
||||||
<table border="0" cellspacing="0" cellpadding="2" width="100%">
|
<table border="0" cellspacing="0" cellpadding="2" width="100%">
|
||||||
<tr><td bgcolor="#828DA6">
|
<tr><td bgcolor="#828DA6">
|
||||||
|
@ -444,7 +529,7 @@
|
||||||
</tr>
|
</tr>
|
||||||
</table>
|
</table>
|
||||||
</div>
|
</div>
|
||||||
<p>By default, the boost factor is 1. Although, the boost factor must be positive, it can be less than 1 (i.e. .2)</p>
|
<p>By default, the boost factor is 1. Although the boost factor must be positive, it can be less than 1 (e.g. 0.2)</p>
|
||||||
</blockquote>
|
</blockquote>
|
||||||
</td></tr>
|
</td></tr>
|
||||||
<tr><td><br/></td></tr>
|
<tr><td><br/></td></tr>
|
||||||
|
@ -712,6 +797,40 @@
|
||||||
</p>
|
</p>
|
||||||
</td></tr>
|
</td></tr>
|
||||||
<tr><td><br/></td></tr>
|
<tr><td><br/></td></tr>
|
||||||
|
</table>
|
||||||
|
<table border="0" cellspacing="0" cellpadding="2" width="100%">
|
||||||
|
<tr><td bgcolor="#525D76">
|
||||||
|
<font color="#ffffff" face="arial,helvetica,sanserif">
|
||||||
|
<a name="Field Grouping"><strong>Field Grouping</strong></a>
|
||||||
|
</font>
|
||||||
|
</td></tr>
|
||||||
|
<tr><td>
|
||||||
|
<blockquote>
|
||||||
|
<p>Lucene supports using parentheses to group multiple clauses to a single field.</p>
|
||||||
|
<p>To search for a title that contains both the word "return" and the phrase "pink panther" use the query:</p>
|
||||||
|
<div align="left">
|
||||||
|
<table cellspacing="4" cellpadding="0" border="0">
|
||||||
|
<tr>
|
||||||
|
<td bgcolor="#023264" width="1" height="1"><img src="/images/void.gif" width="1" height="1" vspace="0" hspace="0" border="0"/></td>
|
||||||
|
<td bgcolor="#023264" height="1"><img src="/images/void.gif" width="1" height="1" vspace="0" hspace="0" border="0"/></td>
|
||||||
|
<td bgcolor="#023264" width="1" height="1"><img src="/images/void.gif" width="1" height="1" vspace="0" hspace="0" border="0"/></td>
|
||||||
|
</tr>
|
||||||
|
<tr>
|
||||||
|
<td bgcolor="#023264" width="1"><img src="/images/void.gif" width="1" height="1" vspace="0" hspace="0" border="0"/></td>
|
||||||
|
<td bgcolor="#ffffff"><pre>title:(+return +"pink panther")</pre></td>
|
||||||
|
<td bgcolor="#023264" width="1"><img src="/images/void.gif" width="1" height="1" vspace="0" hspace="0" border="0"/></td>
|
||||||
|
</tr>
|
||||||
|
<tr>
|
||||||
|
<td bgcolor="#023264" width="1" height="1"><img src="/images/void.gif" width="1" height="1" vspace="0" hspace="0" border="0"/></td>
|
||||||
|
<td bgcolor="#023264" height="1"><img src="/images/void.gif" width="1" height="1" vspace="0" hspace="0" border="0"/></td>
|
||||||
|
<td bgcolor="#023264" width="1" height="1"><img src="/images/void.gif" width="1" height="1" vspace="0" hspace="0" border="0"/></td>
|
||||||
|
</tr>
|
||||||
|
</table>
|
||||||
|
</div>
|
||||||
|
</blockquote>
|
||||||
|
</p>
|
||||||
|
</td></tr>
|
||||||
|
<tr><td><br/></td></tr>
|
||||||
</table>
|
</table>
|
||||||
<table border="0" cellspacing="0" cellpadding="2" width="100%">
|
<table border="0" cellspacing="0" cellpadding="2" width="100%">
|
||||||
<tr><td bgcolor="#525D76">
|
<tr><td bgcolor="#525D76">
|
||||||
|
|
|
@ -8,9 +8,37 @@
|
||||||
</properties>
|
</properties>
|
||||||
<body>
|
<body>
|
||||||
<section name="Overview">
|
<section name="Overview">
|
||||||
<p>Although Lucene provides the ability to create your own query's though its API, it also provides a rich query language through the QueryParser.</p>
|
<p>Although Lucene provides the ability to create your own
|
||||||
<p>This page provides syntax of Lucene's Query Parser, a lexer which interprets a string into a Lucene Query using JavaCC.</p>
|
queries through its API, it also provides a rich query
|
||||||
|
language through the Query Parser.</p> <p>This page
|
||||||
|
provides syntax of Lucene's Query Parser, a lexer which
|
||||||
|
interprets a string into a Lucene Query using JavaCC.</p>
|
||||||
|
<p>
|
||||||
|
Before choosing to use the provided Query Parser, please consider the following:
|
||||||
|
<ol>
|
||||||
|
<li>If you are programmatically generating a query string and then
|
||||||
|
parsing it with the query parser then you should seriously consider building
|
||||||
|
your queries directly with the query API. In other words, the query
|
||||||
|
parser is designed for human-entered text, not for program-generated
|
||||||
|
text.</li>
|
||||||
|
|
||||||
|
<li>Untokenized fields are best added directly to queries, and not
|
||||||
|
through the query parser. If a field's values are generated programmatically
|
||||||
|
by the application, then so should query clauses for this field.
|
||||||
|
Analyzers, like the query parser, are designed to convert human-entered
|
||||||
|
text to terms. Program-generated values, like dates, keywords, etc.,
|
||||||
|
should be consistently program-generated.</li>
|
||||||
|
|
||||||
|
<li>In a query form, fields which are general text should use the query
|
||||||
|
parser. All others, such as date ranges, keywords, etc. are better added
|
||||||
|
directly through the query API. A field with a limited set of values
|
||||||
|
that can be specified with a pull-down menu should not be added to a
|
||||||
|
query string which is subsequently parsed, but rather added as a
|
||||||
|
TermQuery clause.</li>
|
||||||
|
</ol>
|
||||||
|
</p>
|
||||||
</section>
|
</section>
|
||||||
|
|
||||||
<section name="Terms">
|
<section name="Terms">
|
||||||
<p>A query is broken up into terms and operators. There are two types of terms: Single Terms and Phrases.</p>
|
<p>A query is broken up into terms and operators. There are two types of terms: Single Terms and Phrases.</p>
|
||||||
<p>A Single Term is a single word such as "test" or "hello".</p>
|
<p>A Single Term is a single word such as "test" or "hello".</p>
|
||||||
|
@ -37,6 +65,7 @@
|
||||||
</section>
|
</section>
|
||||||
|
|
||||||
<section name="Term Modifiers">
|
<section name="Term Modifiers">
|
||||||
|
|
||||||
<p>Lucene supports modifying query terms to provide a wide range of searching options.</p>
|
<p>Lucene supports modifying query terms to provide a wide range of searching options.</p>
|
||||||
|
|
||||||
<subsection name="Wildcard Searches">
|
<subsection name="Wildcard Searches">
|
||||||
|
@ -62,14 +91,27 @@
|
||||||
<p>This search will find terms like foam and roams</p>
|
<p>This search will find terms like foam and roams</p>
|
||||||
<p>Note: Terms found by the fuzzy search will automatically get a boost factor of 0.2</p>
|
<p>Note: Terms found by the fuzzy search will automatically get a boost factor of 0.2</p>
|
||||||
</subsection>
|
</subsection>
|
||||||
|
|
||||||
|
|
||||||
<subsection name="Proximity Searches">
|
<subsection name="Proximity Searches">
|
||||||
<p>Lucene supports finding words that are within a specific distance of each other. To do a proximity search, use the tilde, "~", symbol at the end of a Phrase. For example, to search for "apache" and "jakarta" within 10 words of each other in a document, use the search: </p>
|
<p>Lucene supports finding words that are within a specific distance of each other. To do a proximity search, use the tilde, "~", symbol at the end of a Phrase. For example, to search for "apache" and "jakarta" within 10 words of each other in a document, use the search: </p>
|
||||||
|
|
||||||
<source>"jakarta apache"~10</source>
|
<source>"jakarta apache"~10</source>
|
||||||
|
|
||||||
</subsection>
|
</subsection>
|
||||||
|
|
||||||
|
|
||||||
|
<subsection name="Range Searches">
|
||||||
|
<p>Range Queries allow one to match documents whose field(s) values
|
||||||
|
are between the lower and upper bound specified by the Range Query.
|
||||||
|
Range Queries are inclusive (i.e. the query includes the specified lower and upper bound).
|
||||||
|
Bounds are compared lexicographically (as strings), not numerically.</p>
|
||||||
|
<source>mod_date:[20020101 TO 20030101]</source>
|
||||||
|
<p>This will find documents whose mod_date fields have values between 20020101 and 20030101.
|
||||||
|
Note that Range Queries are not reserved for date fields. You could also use range queries with non-date fields:</p>
|
||||||
|
<source>title:[Aida TO Carmen]</source>
|
||||||
|
<p>This will find all documents whose titles are between Aida and Carmen.</p>
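<p>The inclusive, lexicographic behaviour described above can be illustrated without Lucene at all. The following is only a sketch: the class and method names are hypothetical, and it compares raw strings, whereas Lucene compares the terms actually stored in the index (which an analyzer may have altered, for example by lowercasing).</p>

```java
// Hypothetical illustration only; this is not part of the Lucene API.
final class LexicographicRange {

    // Returns true when value lies between lower and upper, both bounds
    // included, using plain String ordering (character by character).
    static boolean inRange(String value, String lower, String upper) {
        return value.compareTo(lower) >= 0 && value.compareTo(upper) <= 0;
    }

    public static void main(String[] args) {
        // Dates written as yyyyMMdd sort the same way as strings and as dates.
        System.out.println(inRange("20020615", "20020101", "20030101"));
        // The upper bound itself matches because the range is inclusive.
        System.out.println(inRange("20030101", "20020101", "20030101"));
        // Non-date fields work the same way.
        System.out.println(inRange("Boris Godunov", "Aida", "Carmen"));
        // Unpadded numbers do not: "9" sorts after "20" as a string.
        System.out.println(inRange("9", "10", "20"));
    }
}
```

<p>The last call shows why numeric values generally need a fixed-width, zero-padded form before they are indexed if range queries over them are to behave as expected.</p>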
|
||||||
|
</subsection>
|
||||||
|
|
||||||
|
|
||||||
<subsection name="Boosting a Term">
|
<subsection name="Boosting a Term">
|
||||||
<p>Lucene provides the relevance level of matching documents based on the terms found. To boost a term use the caret, "^", symbol with a boost factor (a number) at the end of the term you are searching. The higher the boost factor, the more relevant the term will be.</p>
|
<p>Lucene provides the relevance level of matching documents based on the terms found. To boost a term use the caret, "^", symbol with a boost factor (a number) at the end of the term you are searching. The higher the boost factor, the more relevant the term will be.</p>
|
||||||
|
@ -82,11 +124,14 @@
|
||||||
<p>This will make documents with the term jakarta appear more relevant. You can also boost Phrase Terms as in the example: </p>
|
<p>This will make documents with the term jakarta appear more relevant. You can also boost Phrase Terms as in the example: </p>
|
||||||
|
|
||||||
<source>"jakarta apache"^4 "jakarta lucene"</source>
|
<source>"jakarta apache"^4 "jakarta lucene"</source>
|
||||||
<p>By default, the boost factor is 1. Although, the boost factor must be positive, it can be less than 1 (i.e. .2)</p>
|
<p>By default, the boost factor is 1. Although the boost factor must be positive, it can be less than 1 (e.g. 0.2)</p>
|
||||||
</subsection>
|
</subsection>
|
||||||
|
|
||||||
</section>
|
</section>
|
||||||
|
|
||||||
|
|
||||||
<section name="Boolean operators">
|
<section name="Boolean operators">
|
||||||
|
|
||||||
<p>Boolean operators allow terms to be combined through logic operators.
|
<p>Boolean operators allow terms to be combined through logic operators.
|
||||||
Lucene supports AND, "+", OR, NOT and "-" as Boolean operators (Note: Boolean operators must be ALL CAPS).</p>
|
Lucene supports AND, "+", OR, NOT and "-" as Boolean operators (Note: Boolean operators must be ALL CAPS).</p>
|
||||||
|
|
||||||
|
@ -145,6 +190,12 @@
|
||||||
<p>This eliminates any confusion and makes sure that website must exist and either term jakarta or apache may exist.</p>
|
<p>This eliminates any confusion and makes sure that website must exist and either term jakarta or apache may exist.</p>
|
||||||
</section>
|
</section>
|
||||||
|
|
||||||
|
<section name="Field Grouping">
|
||||||
|
<p>Lucene supports using parentheses to group multiple clauses to a single field.</p>
|
||||||
|
<p>To search for a title that contains both the word "return" and the phrase "pink panther" use the query:</p>
|
||||||
|
<source>title:(+return +"pink panther")</source>
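<p>One way to read a grouped query is as shorthand for repeating the field name on every clause, so the query above means the same as +title:return +title:"pink panther". The sketch below spells out that expansion as plain string handling; the class and method are hypothetical helpers written for this illustration, not something the Query Parser exposes.</p>

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration only; this is not part of the Lucene API.
final class FieldGroupingSketch {

    // Prefixes each clause with the field name, keeping a leading
    // "+" (required) or "-" (prohibited) in front of the field.
    static String expand(String field, List<String> clauses) {
        List<String> expanded = new ArrayList<>();
        for (String clause : clauses) {
            String operator = "";
            String body = clause;
            if (clause.startsWith("+") || clause.startsWith("-")) {
                operator = clause.substring(0, 1);
                body = clause.substring(1);
            }
            expanded.add(operator + field + ":" + body);
        }
        return String.join(" ", expanded);
    }

    public static void main(String[] args) {
        // Expands the clauses of title:(+return +"pink panther").
        System.out.println(expand("title", Arrays.asList("+return", "+\"pink panther\"")));
    }
}
```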
|
||||||
|
</section>
|
||||||
|
|
||||||
<section name="Escaping Special Characters">
|
<section name="Escaping Special Characters">
|
||||||
<p>Lucene supports escaping special characters that are part of the query syntax. The current list of special characters is:</p>
|
<p>Lucene supports escaping special characters that are part of the query syntax. The current list of special characters is:</p>
|
||||||
<p>+ - && || ! ( ) { } [ ] ^ " ~ * ? : \</p>
|
<p>+ - && || ! ( ) { } [ ] ^ " ~ * ? : \</p>
|
||||||
|
|