- Added useful info to Overview, added a section about Range Searches and a
section about Field Grouping.  Fixed a few small grammatical errors.


git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@149937 13f79535-47bb-0310-9956-ffa450edef68
Otis Gospodnetic 2003-01-23 16:10:59 +00:00
parent 51585c68da
commit e8a9993725
2 changed files with 177 additions and 7 deletions


@ -119,8 +119,36 @@
</td></tr> </td></tr>
<tr><td> <tr><td>
<blockquote> <blockquote>
<p>Although Lucene provides the ability to create your own query's though its API, it also provides a rich query language through the QueryParser.</p> <p>Although Lucene provides the ability to create your own
<p>This page provides syntax of Lucene's Query Parser, a lexer which interprets a string into a Lucene Query using JavaCC.</p> queries through its API, it also provides a rich query
language through the Query Parser.</p>
<p>This page
provides syntax of Lucene's Query Parser, a lexer which
interprets a string into a Lucene Query using JavaCC.</p>
<p>
Before choosing to use the provided Query Parser, please consider the following:
<ol>
<li>If you are programmatically generating a query string and then
parsing it with the query parser then you should seriously consider building
your queries directly with the query API. In other words, the query
parser is designed for human-entered text, not for program-generated
text.</li>
<li>Untokenized fields are best added directly to queries, and not
through the query parser. If a field's values are generated programmatically
by the application, then so should query clauses for this field.
Analyzers, like the query parser, are designed to convert human-entered
text to terms. Program-generated values, like dates, keywords, etc.,
should be consistently program-generated.</li>
<li>In a query form, fields which are general text should use the query
parser. All others, such as date ranges, keywords, etc. are better added
directly through the query API. A field with a limited set of values
that can be specified with a pull-down menu should not be added to a
query string which is subsequently parsed, but rather added as a
TermQuery clause.</li>
</ol>
</p>
</blockquote> </blockquote>
</p> </p>
</td></tr> </td></tr>
@ -373,6 +401,63 @@
</blockquote> </blockquote>
</td></tr> </td></tr>
<tr><td><br/></td></tr> <tr><td><br/></td></tr>
</table>
<table border="0" cellspacing="0" cellpadding="2" width="100%">
<tr><td bgcolor="#828DA6">
<font color="#ffffff" face="arial,helvetica,sanserif">
<a name="Range Searches"><strong>Range Searches</strong></a>
</font>
</td></tr>
<tr><td>
<blockquote>
<p>Range Queries allow one to match documents whose field values
fall between the lower and upper bounds specified by the Range Query.
Range Queries are inclusive (i.e. the query includes the specified lower and upper bounds).
Values are compared lexicographically, not numerically.</p>
<div align="left">
<table cellspacing="4" cellpadding="0" border="0">
<tr>
<td bgcolor="#023264" width="1" height="1"><img src="/images/void.gif" width="1" height="1" vspace="0" hspace="0" border="0"/></td>
<td bgcolor="#023264" height="1"><img src="/images/void.gif" width="1" height="1" vspace="0" hspace="0" border="0"/></td>
<td bgcolor="#023264" width="1" height="1"><img src="/images/void.gif" width="1" height="1" vspace="0" hspace="0" border="0"/></td>
</tr>
<tr>
<td bgcolor="#023264" width="1"><img src="/images/void.gif" width="1" height="1" vspace="0" hspace="0" border="0"/></td>
<td bgcolor="#ffffff"><pre>mod_date:[20020101 TO 20030101]</pre></td>
<td bgcolor="#023264" width="1"><img src="/images/void.gif" width="1" height="1" vspace="0" hspace="0" border="0"/></td>
</tr>
<tr>
<td bgcolor="#023264" width="1" height="1"><img src="/images/void.gif" width="1" height="1" vspace="0" hspace="0" border="0"/></td>
<td bgcolor="#023264" height="1"><img src="/images/void.gif" width="1" height="1" vspace="0" hspace="0" border="0"/></td>
<td bgcolor="#023264" width="1" height="1"><img src="/images/void.gif" width="1" height="1" vspace="0" hspace="0" border="0"/></td>
</tr>
</table>
</div>
<p>This will find documents whose mod_date fields have values between 20020101 and 20030101.
Note that Range Queries are not reserved for date fields. You could also use range queries with non-date fields:</p>
<div align="left">
<table cellspacing="4" cellpadding="0" border="0">
<tr>
<td bgcolor="#023264" width="1" height="1"><img src="/images/void.gif" width="1" height="1" vspace="0" hspace="0" border="0"/></td>
<td bgcolor="#023264" height="1"><img src="/images/void.gif" width="1" height="1" vspace="0" hspace="0" border="0"/></td>
<td bgcolor="#023264" width="1" height="1"><img src="/images/void.gif" width="1" height="1" vspace="0" hspace="0" border="0"/></td>
</tr>
<tr>
<td bgcolor="#023264" width="1"><img src="/images/void.gif" width="1" height="1" vspace="0" hspace="0" border="0"/></td>
<td bgcolor="#ffffff"><pre>title:[Aida TO Carmen]</pre></td>
<td bgcolor="#023264" width="1"><img src="/images/void.gif" width="1" height="1" vspace="0" hspace="0" border="0"/></td>
</tr>
<tr>
<td bgcolor="#023264" width="1" height="1"><img src="/images/void.gif" width="1" height="1" vspace="0" hspace="0" border="0"/></td>
<td bgcolor="#023264" height="1"><img src="/images/void.gif" width="1" height="1" vspace="0" hspace="0" border="0"/></td>
<td bgcolor="#023264" width="1" height="1"><img src="/images/void.gif" width="1" height="1" vspace="0" hspace="0" border="0"/></td>
</tr>
</table>
</div>
<p>This will find all documents whose titles are between Aida and Carmen.</p>
</blockquote>
</td></tr>
<tr><td><br/></td></tr>
</table> </table>
<table border="0" cellspacing="0" cellpadding="2" width="100%"> <table border="0" cellspacing="0" cellpadding="2" width="100%">
<tr><td bgcolor="#828DA6"> <tr><td bgcolor="#828DA6">
@ -444,7 +529,7 @@
</tr> </tr>
</table> </table>
</div> </div>
<p>By default, the boost factor is 1. Although, the boost factor must be positive, it can be less than 1 (i.e. .2)</p> <p>By default, the boost factor is 1. Although the boost factor must be positive, it can be less than 1 (e.g. 0.2)</p>
</blockquote> </blockquote>
</td></tr> </td></tr>
<tr><td><br/></td></tr> <tr><td><br/></td></tr>
@ -712,6 +797,40 @@
</p> </p>
</td></tr> </td></tr>
<tr><td><br/></td></tr> <tr><td><br/></td></tr>
</table>
<table border="0" cellspacing="0" cellpadding="2" width="100%">
<tr><td bgcolor="#525D76">
<font color="#ffffff" face="arial,helvetica,sanserif">
<a name="Field Grouping"><strong>Field Grouping</strong></a>
</font>
</td></tr>
<tr><td>
<blockquote>
<p>Lucene supports using parentheses to group multiple clauses to a single field.</p>
<p>To search for a title that contains both the word "return" and the phrase "pink panther" use the query:</p>
<div align="left">
<table cellspacing="4" cellpadding="0" border="0">
<tr>
<td bgcolor="#023264" width="1" height="1"><img src="/images/void.gif" width="1" height="1" vspace="0" hspace="0" border="0"/></td>
<td bgcolor="#023264" height="1"><img src="/images/void.gif" width="1" height="1" vspace="0" hspace="0" border="0"/></td>
<td bgcolor="#023264" width="1" height="1"><img src="/images/void.gif" width="1" height="1" vspace="0" hspace="0" border="0"/></td>
</tr>
<tr>
<td bgcolor="#023264" width="1"><img src="/images/void.gif" width="1" height="1" vspace="0" hspace="0" border="0"/></td>
<td bgcolor="#ffffff"><pre>title:(+return +&quot;pink panther&quot;)</pre></td>
<td bgcolor="#023264" width="1"><img src="/images/void.gif" width="1" height="1" vspace="0" hspace="0" border="0"/></td>
</tr>
<tr>
<td bgcolor="#023264" width="1" height="1"><img src="/images/void.gif" width="1" height="1" vspace="0" hspace="0" border="0"/></td>
<td bgcolor="#023264" height="1"><img src="/images/void.gif" width="1" height="1" vspace="0" hspace="0" border="0"/></td>
<td bgcolor="#023264" width="1" height="1"><img src="/images/void.gif" width="1" height="1" vspace="0" hspace="0" border="0"/></td>
</tr>
</table>
</div>
</blockquote>
</p>
</td></tr>
<tr><td><br/></td></tr>
</table> </table>
<table border="0" cellspacing="0" cellpadding="2" width="100%"> <table border="0" cellspacing="0" cellpadding="2" width="100%">
<tr><td bgcolor="#525D76"> <tr><td bgcolor="#525D76">
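The Range Searches text added in this commit states that range queries are inclusive and compare values lexicographically. The sketch below illustrates what that implies using plain Java string comparison. It does not use Lucene; `RangeSketch` and `inRange` are hypothetical names introduced only for illustration, and the sketch ignores any analysis (such as lowercasing) applied to indexed terms.

```java
// A minimal sketch, assuming plain String comparison stands in for
// Lucene's term ordering. Not part of the Lucene API.
final class RangeSketch {

    // Inclusive lexicographic range test: lower <= value <= upper.
    static boolean inRange(String value, String lower, String upper) {
        return value.compareTo(lower) >= 0 && value.compareTo(upper) <= 0;
    }

    public static void main(String[] args) {
        // Fixed-width dates (yyyymmdd) order the same way as the dates themselves.
        System.out.println(inRange("20020615", "20020101", "20030101")); // true
        // Both bounds are included.
        System.out.println(inRange("20030101", "20020101", "20030101")); // true
        // Non-date values work too, as in title:[Aida TO Carmen].
        System.out.println(inRange("Boris Godunov", "Aida", "Carmen"));  // true
        // Lexicographic, not numeric: "15" falls between "1" and "2".
        System.out.println(inRange("15", "1", "2"));                     // true
        // ...and "3" does not fall between "10" and "20".
        System.out.println(inRange("3", "10", "20"));                    // false
    }
}
```

The last two calls show why values meant to be searched by range are usually stored in a fixed-width, zero-padded form such as 20020101: only then does lexicographic order agree with numeric or chronological order.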


@ -8,9 +8,37 @@
</properties> </properties>
<body> <body>
<section name="Overview"> <section name="Overview">
<p>Although Lucene provides the ability to create your own query's though its API, it also provides a rich query language through the QueryParser.</p> <p>Although Lucene provides the ability to create your own
<p>This page provides syntax of Lucene's Query Parser, a lexer which interprets a string into a Lucene Query using JavaCC.</p> queries through its API, it also provides a rich query
language through the Query Parser.</p> <p>This page
provides syntax of Lucene's Query Parser, a lexer which
interprets a string into a Lucene Query using JavaCC.</p>
<p>
Before choosing to use the provided Query Parser, please consider the following:
<ol>
<li>If you are programmatically generating a query string and then
parsing it with the query parser then you should seriously consider building
your queries directly with the query API. In other words, the query
parser is designed for human-entered text, not for program-generated
text.</li>
<li>Untokenized fields are best added directly to queries, and not
through the query parser. If a field's values are generated programmatically
by the application, then so should query clauses for this field.
Analyzers, like the query parser, are designed to convert human-entered
text to terms. Program-generated values, like dates, keywords, etc.,
should be consistently program-generated.</li>
<li>In a query form, fields which are general text should use the query
parser. All others, such as date ranges, keywords, etc. are better added
directly through the query API. A field with a limited set of values
that can be specified with a pull-down menu should not be added to a
query string which is subsequently parsed, but rather added as a
TermQuery clause.</li>
</ol>
</p>
</section> </section>
<section name="Terms"> <section name="Terms">
<p>A query is broken up into terms and operators. There are two types of terms: Single Terms and Phrases.</p> <p>A query is broken up into terms and operators. There are two types of terms: Single Terms and Phrases.</p>
<p>A Single Term is a single word such as "test" or "hello".</p> <p>A Single Term is a single word such as "test" or "hello".</p>
@ -37,6 +65,7 @@
</section> </section>
<section name="Term Modifiers"> <section name="Term Modifiers">
<p>Lucene supports modifying query terms to provide a wide range of searching options.</p> <p>Lucene supports modifying query terms to provide a wide range of searching options.</p>
<subsection name="Wildcard Searches"> <subsection name="Wildcard Searches">
@ -62,14 +91,27 @@
<p>This search will find terms like foam and roams</p> <p>This search will find terms like foam and roams</p>
<p>Note:Terms found by the fuzzy search will automatically get a boost factor of 0.2</p> <p>Note:Terms found by the fuzzy search will automatically get a boost factor of 0.2</p>
</subsection> </subsection>
<subsection name="Proximity Searches"> <subsection name="Proximity Searches">
<p>Lucene supports finding words that are within a specific distance of each other. To do a proximity search, use the tilde, "~", symbol at the end of a Phrase. For example, to search for "apache" and "jakarta" within 10 words of each other in a document, use the search: </p> <p>Lucene supports finding words that are within a specific distance of each other. To do a proximity search, use the tilde, "~", symbol at the end of a Phrase. For example, to search for "apache" and "jakarta" within 10 words of each other in a document, use the search: </p>
<source>"jakarta apache"~10</source> <source>"jakarta apache"~10</source>
</subsection> </subsection>
<subsection name="Range Searches">
<p>Range Queries allow one to match documents whose field values
fall between the lower and upper bounds specified by the Range Query.
Range Queries are inclusive (i.e. the query includes the specified lower and upper bounds).
Values are compared lexicographically, not numerically.</p>
<source>mod_date:[20020101 TO 20030101]</source>
<p>This will find documents whose mod_date fields have values between 20020101 and 20030101.
Note that Range Queries are not reserved for date fields. You could also use range queries with non-date fields:</p>
<source>title:[Aida TO Carmen]</source>
<p>This will find all documents whose titles are between Aida and Carmen.</p>
</subsection>
<subsection name="Boosting a Term"> <subsection name="Boosting a Term">
<p>Lucene provides the relevance level of matching documents based on the terms found. To boost a term use the caret, "^", symbol with a boost factor (a number) at the end of the term you are searching. The higher the boost factor, the more relevant the term will be.</p> <p>Lucene provides the relevance level of matching documents based on the terms found. To boost a term use the caret, "^", symbol with a boost factor (a number) at the end of the term you are searching. The higher the boost factor, the more relevant the term will be.</p>
@ -82,11 +124,14 @@
<p>This will make documents with the term jakarta appear more relevant. You can also boost Phrase Terms as in the example: </p> <p>This will make documents with the term jakarta appear more relevant. You can also boost Phrase Terms as in the example: </p>
<source>"jakarta apache"^4 "jakarta lucene"</source> <source>"jakarta apache"^4 "jakarta lucene"</source>
<p>By default, the boost factor is 1. Although, the boost factor must be positive, it can be less than 1 (i.e. .2)</p> <p>By default, the boost factor is 1. Although the boost factor must be positive, it can be less than 1 (e.g. 0.2)</p>
</subsection> </subsection>
</section> </section>
<section name="Boolean operators"> <section name="Boolean operators">
<p>Boolean operators allow terms to be combined through logic operators. <p>Boolean operators allow terms to be combined through logic operators.
Lucene supports AND, "+", OR, NOT and "-" as Boolean operators(Note: Boolean operators must be ALL CAPS).</p> Lucene supports AND, "+", OR, NOT and "-" as Boolean operators(Note: Boolean operators must be ALL CAPS).</p>
@ -145,6 +190,12 @@
<p>This eliminates any confusion and makes sure that website must exist and either term jakarta or apache may exist.</p> <p>This eliminates any confusion and makes sure that website must exist and either term jakarta or apache may exist.</p>
</section> </section>
<section name="Field Grouping">
<p>Lucene supports using parentheses to group multiple clauses to a single field.</p>
<p>To search for a title that contains both the word "return" and the phrase "pink panther" use the query:</p>
<source>title:(+return +"pink panther")</source>
</section>
<section name="Escaping Special Characters"> <section name="Escaping Special Characters">
<p>Lucene supports escaping special characters that are part of the query syntax. The current list of special characters is</p> <p>Lucene supports escaping special characters that are part of the query syntax. The current list of special characters is</p>
<p>+ - &amp;&amp; || ! ( ) { } [ ] ^ " ~ * ? : \</p> <p>+ - &amp;&amp; || ! ( ) { } [ ] ^ " ~ * ? : \</p>