mirror of https://github.com/apache/lucene.git
194 lines
8.8 KiB
HTML
194 lines
8.8 KiB
HTML
<html>
|
|
<head>
|
|
<meta name="Author" content="Kelvin Tan">
|
|
<title>Lucene Query Constructor Demo and Introduction</title>
|
|
<script type="text/javascript" src="luceneQueryConstructor.js"></script>
|
|
<script type="text/javascript" src="../queryValidator/luceneQueryValidator.js"></script>
|
|
<script>
|
|
submitForm = false // necessary for luceneQueryConstructor not to submit the form upon query construction
|
|
function doSubmitForm(frm)
|
|
{
|
|
if(frm["noField-phrase-input"].value.length > 0)
|
|
frm["noField-phrase"].value = quote(frm["noField-phrase-input"].value)
|
|
else if(frm["noField-phrase"].value.length > 0)
|
|
frm["noField-phrase"].value = ''
|
|
doMakeQuery(frm.query);
|
|
}
|
|
</script>
|
|
</head>
|
|
|
|
<body>
|
|
<h2>Lucene Javascript Query Constructor</h2>
|
|
<p>luceneQueryConstructor.js is a Javascript framework for constructing queries using the "advanced" search features of lucene,
|
|
namely field-searching, boolean searching, phrase searching, group searching (via parentheses) and various combinations of the aforementioned.
|
|
<p>It also provides a convenient way to mimic a Google-like search, where all terms are ANDed, as opposed to Lucene's default OR modifier.
|
|
<p>
|
|
This HTML form provides examples on the usage of luceneQueryConstructor.js.
|
|
An interface similar to <a href="http://www.google.com/advanced_search">Google's Advanced Search</a> form is shown here.
|
|
<form>
|
|
<table width="100%" border="0" cellspacing="1" cellpadding="5">
|
|
<tr>
|
|
<th></th>
|
|
<td width="25%"></td>
|
|
<tr>
|
|
<th>
|
|
<input name="noField-andModifier" value="+|0" type="hidden"><b>Find results</b>
|
|
</th>
|
|
<td class="bodytext">With <b>all</b> of the words</td>
|
|
<td class="bodytext">
|
|
<input type="text" name="noField-and" size="25">
|
|
<select name="resultsPerPage">
|
|
<option value="10">10 results<option value="20">20 results
|
|
<option value="50" selected>50 results</select>
|
|
</td>
|
|
</tr>
|
|
<tr>
|
|
<th>
|
|
<input name="noField-phraseModifier" value="+|+" type="hidden">
|
|
</th>
|
|
<td class="bodytext">With the <b>exact phrase</b></td>
|
|
<td class="bodytext">
|
|
<input type="text" name="noField-phrase-input" size="25">
|
|
<input type="hidden" name="noField-phrase">
|
|
</td>
|
|
</tr>
|
|
<tr>
|
|
<th>
|
|
<input name="noField-orModifier" value=" |+" type="hidden">
|
|
</th>
|
|
<td class="bodytext">With <b>at least</b> one of the words</td>
|
|
<td class="bodytext">
|
|
<input type="text" name="noField-or" size="25">
|
|
</td>
|
|
</tr>
|
|
<tr>
|
|
<th>
|
|
<input name="noField-notModifier" value="-|0" type="hidden">
|
|
</th>
|
|
<td class="bodytext"><b>Without</b> the words</td>
|
|
<td class="bodytext">
|
|
<input type="text" name="noField-not" size="25">
|
|
</td>
|
|
</tr>
|
|
<tr>
|
|
<th>
|
|
<b>File Format</b>
|
|
</th>
|
|
<td class="bodytext">
|
|
<select name="fileNameModifier"><option value="And" selected>Only</option><option value="Not">Don't</option></select>
|
|
return results of the file format</td>
|
|
<td class="bodytext">
|
|
<select name="fileName"><option value="" selected>any format<option value="pdf">Adobe Acrobat PDF (.pdf)
|
|
<option value="doc">Microsoft Word (.doc)<option value="xls">Microsoft Excel (.xls)<option value="ppt">Microsoft Powerpoint (.ppt)</select>
|
|
</td>
|
|
</tr>
|
|
<tr>
|
|
<th>
|
|
<b>Date</b>
|
|
</th>
|
|
<td class="bodytext">
|
|
Return results updated in the </td>
|
|
<td class="bodytext">
|
|
<select name="date"><option value="" selected>anytime<option value="3">past 3 months
|
|
<option value="6">past 6 months<option value="12">past year</select>
|
|
<input type="hidden" name="fromDate">
|
|
<input type="hidden" name="toDate">
|
|
<input type="hidden" name="dateRangeField" value="lastModifiedDate">
|
|
</td>
|
|
</tr>
|
|
<tr>
|
|
|
|
<input type="hidden" name="query">
|
|
<tr><td> </tr>
|
|
<tr><th><p>Current Query:</th><td><pre id="curQuery"></pre><pre id="curQueryValid"></pre></td><td>
|
|
|
|
<input type="button" name="Update" value="Update Query" onClick="doSubmitForm(this.form); document.getElementById('curQuery').innerHTML = this.form.query.value" />
|
|
<input type="button" name="Validate" value="Validate" onClick="doCheckLuceneQuery(this.form.query); getElementById('curQueryValid').innerHTML = 'Query is valid'" />
|
|
</td>
|
|
</table>
|
|
</form>
|
|
<p>luceneQueryConstructor works by assuming a certain naming convention of form fields to obtain the necessary information to construct the query.<br>
|
|
<b>NB:</b>Unless otherwise specified, all uses of the word <code>field</code> should be assumed to mean form input fields and not Lucene document fields.
|
|
<p>The input form field is expected to be the <b>same name as the Lucene Field</b>. For example, if you have a Document with <i>fileName</i> as a Field, and
|
|
you'd like to provide field-searching on this field, then introduce a form field like so:
|
|
<center>
|
|
<p><code><input type="text" name="fileName"></code>
|
|
</p>
|
|
</center>
|
|
You are also expected to provide another field known as
|
|
this field's <b>modifier</b>. This modifier field tells luceneQueryConstructor how to convert the field and value into a Lucene query. The naming convention
|
|
of the modifier is <b><name of input field/Lucene field><modifier suffix as declared in luceneQueryConstructor.js></b>. So, for the <i>fileName</i> field
|
|
we introduced above, it's modifier field would be:
|
|
<center>
|
|
<p><code><input type="hidden" name="fileNameModifier" value="+|+"></code>
|
|
</p>
|
|
</center>
|
|
<p>The value of the modifier field is in the form <b><term modifier>|<group modifier></b>. Let me explain.
|
|
<p>Looking at the form above, we see fields that provide
|
|
<ol>
|
|
<li>AND search
|
|
<li>OR search
|
|
<li>NOT search
|
|
<li>and others which are unrelated to this discussion
|
|
</ol>
|
|
Given a value of <b><i>foo bar</i></b>, the AND search field must be converted to <b><i>+foo +bar</i></b> (luceneQueryConstructor only supports
|
|
using +/-, not AND/OR/NOT), the NOT search to <b><i>-foo -bar</i></b> and the OR search not at all.
|
|
<p>However, also consider the relationship <b>between</b> these groups of fields. Assuming Google's Advanced Search interface,
|
|
we're effectively saying that we want all of the terms in the AND search field <b>AND</b> at least one of the
|
|
terms in the OR search field <b>AND</b> none of the terms in the NOT search.
|
|
<p>So, if the AND, OR and NOT search fields all have the values of <b><i>foo bar</i></b>, then an appropriate search query
|
|
which fulfills the requirements would be
|
|
<center>
|
|
<p><code>+foo +bar +(foo bar) -foo -bar</code>
|
|
</p>
|
|
</center>
|
|
Well, to be more correct, it should be
|
|
<center>
|
|
<p><code>+(+foo +bar) +(foo bar) -foo -bar</code>
|
|
</p>
|
|
</center>
|
|
Hmmmm...if you're sharp, you would have noticed that the NOT terms aren't grouped.
|
|
You'll find that if you group them with an AND modifier, no results will be returned at all (though it's a valid query),
|
|
because the query constructed wouldn't make any sense at all. Lucene also implicitly ANDs NOT terms, it seems. In any case,
|
|
both queries as presented are correct, though I prefer the first one because it is less verbose.
|
|
<p>The following matrix provides modifiers and their effects on queries:<br><br>
|
|
<table border="1" align="center">
|
|
<tr>
|
|
<th>Boolean modifier
|
|
<th>Form value
|
|
<th>As term modifier
|
|
<th>As group modifier
|
|
</tr>
|
|
<tr align="center">
|
|
<td>AND
|
|
<td>+
|
|
<td>+term1 + term2 ...
|
|
<td>+(...)
|
|
</tr>
|
|
<tr align="center">
|
|
<td>OR
|
|
<td>(single space)
|
|
<td>term1 term2 ...
|
|
<td>(...)
|
|
</tr>
|
|
<tr align="center">
|
|
<td>NOT
|
|
<td>-
|
|
<td>-term1 -term2 ...
|
|
<td>-(...)
|
|
</tr>
|
|
<tr align="center">
|
|
<td>no modifier
|
|
<td>0
|
|
<td>term1 term2 ...
|
|
<td>no grouping
|
|
</tr>
|
|
</table>
|
|
<p>With this knowledge, we know that the value of the AND field modifier needs to be <b><i>+|0</i></b> for the first query and
|
|
<b><i>+|+</i></b> for the second query, the values of the NOT field modifier and the
|
|
OR field modifier are <b><i>-|0</i></b> and <b><i> |+</i></b> (it's an empty space before the |) in both queries respectively.
|
|
<p>
|
|
Well, that's all I have to say for now. There are more topics to be covered, such as construction of phrase searches, non-field searches,
|
|
multiple list box selections, radio buttons etc, but right now I'm not even sure if anyone will read this much! :-) Anyway, there's always the code.
|
|
</body>
|
|
</html> |