Lucene Javascript Query Constructor

luceneQueryConstructor.js is a JavaScript framework for constructing queries that use the "advanced" search features of Lucene: field searching, boolean searching, phrase searching, group searching (via parentheses), and combinations of these.

It also provides a convenient way to mimic a Google-like search, where all terms are ANDed, as opposed to Lucene's default OR modifier.

This HTML form shows how luceneQueryConstructor.js is used. It presents an interface similar to Google's Advanced Search form.

Find results:
  With all of the words
  With the exact phrase
  With at least one of the words
  Without the words
File Format: return results of the file format
Date: return results updated in the

luceneQueryConstructor relies on a naming convention for form fields to obtain the information it needs to construct the query.
NB: Unless otherwise specified, the word "field" means a form input field, not a Lucene document field.

The form input field is expected to have the same name as the Lucene Field. For example, if you have a Document with fileName as a Field, and you'd like to provide field-searching on it, introduce a form field like so:

<input type="text" name="fileName">

You are also expected to provide a second field, known as the first field's modifier. The modifier field tells luceneQueryConstructor how to convert the field and its value into a Lucene query. The naming convention of the modifier is <name of input field/Lucene field><modifier suffix as declared in luceneQueryConstructor.js>. So, for the fileName field we introduced above, its modifier field would be:

<input type="hidden" name="fileNameModifier" value="+|+">
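The naming convention can be sketched in a few lines of JavaScript. This is only an illustration, not the code in luceneQueryConstructor.js: the helper name pairFieldsWithModifiers is made up, the suffix "Modifier" is assumed from the fileNameModifier example above, and the form is represented as a plain object of input names and values rather than a real DOM form.

```javascript
// Assumption: the suffix is "Modifier", inferred from the fileNameModifier example.
// The real suffix is whatever luceneQueryConstructor.js declares.
const MODIFIER_SUFFIX = "Modifier";

// Hypothetical helper: pair every search field with its modifier field.
// formValues is a plain object keyed by form input name.
function pairFieldsWithModifiers(formValues) {
  const pairs = [];
  for (const name of Object.keys(formValues)) {
    // Modifier fields describe other fields; they are not search fields themselves.
    if (name.endsWith(MODIFIER_SUFFIX)) continue;
    const modifier = formValues[name + MODIFIER_SUFFIX];
    // A choice made for this sketch: fields without a modifier are skipped.
    if (modifier === undefined) continue;
    pairs.push({ field: name, value: formValues[name], modifier: modifier });
  }
  return pairs;
}
```

Calling pairFieldsWithModifiers({ fileName: "report", fileNameModifier: "+|+" }) yields a single entry that links the fileName field to its modifier value.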

The value of the modifier field is in the form <term modifier>|<group modifier>. Let me explain.

Looking at the form above, we see fields that provide

  1. AND search
  2. OR search
  3. NOT search
  4. and others which are unrelated to this discussion

Given a value of foo bar, the AND search field must be converted to +foo +bar (luceneQueryConstructor only supports +/-, not AND/OR/NOT) and the NOT search field to -foo -bar, while the OR search field is left unchanged as foo bar.
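The per-term conversion described here might look like the following. It is a minimal sketch under my own assumptions: applyTermModifier is a hypothetical name, and terms are assumed to be separated by whitespace (phrases are not handled).

```javascript
// Hypothetical helper, not part of luceneQueryConstructor.js.
// Prefixes every whitespace-separated term in value with the given modifier.
function applyTermModifier(value, termModifier) {
  // Split on runs of whitespace and drop the empty string left by an empty value.
  const terms = value.trim().split(/\s+/).filter(Boolean);
  // Only "+" (AND) and "-" (NOT) are prefixed; anything else (OR) leaves terms bare.
  const prefix = termModifier === "+" || termModifier === "-" ? termModifier : "";
  return terms.map((term) => prefix + term).join(" ");
}

console.log(applyTermModifier("foo bar", "+")); // +foo +bar
console.log(applyTermModifier("foo bar", "-")); // -foo -bar
console.log(applyTermModifier("foo bar", " ")); // foo bar
```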

However, also consider the relationship between these groups of fields. Assuming Google's Advanced Search interface, we're effectively saying that we want all of the terms in the AND search field AND at least one of the terms in the OR search field AND none of the terms in the NOT search.

So, if the AND, OR and NOT search fields all have the values of foo bar, then an appropriate search query which fulfills the requirements would be

+foo +bar +(foo bar) -foo -bar

Well, to be more correct, it should be

+(+foo +bar) +(foo bar) -foo -bar

Hmmmm... if you're sharp, you will have noticed that the NOT terms aren't grouped. If you group them with an AND modifier, no results are returned at all (though the query is valid), because the resulting query doesn't make sense: it requires a match on a group that contains nothing but excluded terms. Lucene also implicitly ANDs NOT terms, it seems. In any case, both queries as presented are correct, though I prefer the first one because it is less verbose.

The following matrix provides modifiers and their effects on queries:

Boolean modifier   Form value       As term modifier      As group modifier
AND                +                +term1 +term2 ...     +(...)
OR                 (single space)   term1 term2 ...       (...)
NOT                -                -term1 -term2 ...     -(...)
no modifier        0                term1 term2 ...       no grouping

With this knowledge, we know that the value of the AND field modifier needs to be +|0 for the first query and +|+ for the second query. In both queries, the value of the NOT field modifier is -|0 and the value of the OR field modifier is  |+ (that's a single space before the |).
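To check these modifier values against the two queries, here is a rough sketch that applies the matrix to each field and joins the results. It is not the actual luceneQueryConstructor.js code: buildClause and buildQuery are hypothetical names, the fields are passed in as plain objects instead of being read from a form, and Lucene field-name prefixes are left out, as they are in the example queries.

```javascript
// Hypothetical helper: turn one field value plus its "<term>|<group>" modifier
// into a query clause, following the modifier matrix.
function buildClause(value, modifier) {
  const [termMod, groupMod] = modifier.split("|");
  const terms = value.trim().split(/\s+/).filter(Boolean);
  if (terms.length === 0) return ""; // an empty field contributes nothing

  // Term modifier: "+" and "-" are prefixed; " " (OR) and "0" leave terms bare.
  const termPrefix = termMod === "+" || termMod === "-" ? termMod : "";
  const body = terms.map((term) => termPrefix + term).join(" ");

  // Group modifier: "0" means no grouping; otherwise wrap in parentheses,
  // prefixed with "+" or "-" where given (" " gives a plain group).
  if (groupMod === "0") return body;
  const groupPrefix = groupMod === "+" || groupMod === "-" ? groupMod : "";
  return groupPrefix + "(" + body + ")";
}

// Hypothetical helper: build the clauses in order and join the non-empty ones.
function buildQuery(fields) {
  return fields
    .map((field) => buildClause(field.value, field.modifier))
    .filter((clause) => clause !== "")
    .join(" ");
}

// AND, OR and NOT fields, all holding "foo bar".
const firstQuery = buildQuery([
  { value: "foo bar", modifier: "+|0" },
  { value: "foo bar", modifier: " |+" },
  { value: "foo bar", modifier: "-|0" },
]);
console.log(firstQuery); // +foo +bar +(foo bar) -foo -bar

const secondQuery = buildQuery([
  { value: "foo bar", modifier: "+|+" },
  { value: "foo bar", modifier: " |+" },
  { value: "foo bar", modifier: "-|0" },
]);
console.log(secondQuery); // +(+foo +bar) +(foo bar) -foo -bar
```

Only the AND field's group modifier differs between the two calls, which is exactly the difference between the two queries.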

Well, that's all I have to say for now. There are more topics to be covered, such as construction of phrase searches, non-field searches, multiple list box selections, radio buttons etc, but right now I'm not even sure if anyone will read this much! :-) Anyway, there's always the code.