diff --git a/docs/queryparsersyntax.html b/docs/queryparsersyntax.html
new file mode 100644
index 00000000000..92e89541340
--- /dev/null
+++ b/docs/queryparsersyntax.html
@@ -0,0 +1,680 @@
+ Copyright © 1999-2002, Apache Software Foundation
Although Lucene provides the ability to create your own queries through its API, it also provides a rich query language through the QueryParser.
+This page describes the syntax of Lucene's Query Parser, a lexer, generated using JavaCC, which interprets a string into a Lucene Query.
+A query is broken up into terms and operators. There are two types of terms: Single Terms and Phrases.
+A Single Term is a single word such as "test" or "hello".
+A Phrase is a group of words surrounded by double quotes such as "hello dolly".
+Multiple terms can be combined together with Boolean operators to form a more complex query (see below).
+Lucene supports fielded data. When performing a search you can either specify a field, or use the default field. The field names and the default field are implementation specific.
+You can search any field by typing the field name followed by a colon ":" and then the term you are looking for.
+As an example, let's assume a Lucene index contains two fields, title and text, and text is the default field. If you want to find the document entitled "The Right Way" which contains the text "don't go this way", you can enter:
+ + +or
+ +Since text is the default field, the field indicator is not required.
+ +Note: The field is only valid for the term that it directly precedes, so the query
+ +Will only find "Do" in the title field. It will find "it" and "right" in the default field (in this case the text field).
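The scoping rule above can be illustrated with a small Python sketch. This is not Lucene's parser: `assign_fields` is a hypothetical helper, the query string `title:Do it right` is inferred from the sentence above, and the sketch ignores phrases, operators, and escaping.

```python
def assign_fields(query, default_field="text"):
    """Pair each whitespace-separated term with the field it would be searched in.

    Hypothetical helper for illustration only: no phrases, operators, or escaping.
    """
    pairs = []
    for token in query.split():
        field, sep, term = token.partition(":")
        if sep and field and term:
            # An explicit "field:term" prefix applies to this one term only.
            pairs.append((field, term))
        else:
            # Every other term falls back to the default field.
            pairs.append((default_field, token))
    return pairs


print(assign_fields("title:Do it right"))
```

Only the first term is paired with `title`; the remaining terms are paired with the assumed default field `text`.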
+Lucene supports modifying query terms to provide a wide range of searching options.
+ +Lucene supports single and multiple character wildcard searches.
+To perform a single character wildcard search use the "?" symbol.
+To perform a multiple character wildcard search use the "*" symbol.
+A single character wildcard search looks for terms that match the pattern with the "?" replaced by any single character. For example, to search for "text" or "test" you can use the search:
+Multiple character wildcard searches look for 0 or more characters. For example, to search for test, tests or tester, you can use the search:
+ +You can also use the wildcard searches in the middle of a term.
+ +Note: You cannot use a * or ? symbol as the first character of a search.
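As a rough illustration of the wildcard rules above, the following Python sketch translates a wildcard pattern into a regular expression. It is not how Lucene evaluates wildcard queries; `wildcard_to_regex` and `matching_terms` are hypothetical names, and the patterns `te?t`, `test*`, and `te*t` are assumed from the examples described in the text.

```python
import re


def wildcard_to_regex(pattern):
    """Translate a '?'/'*' wildcard pattern into a compiled regular expression.

    Illustrative only; this is not Lucene's implementation.
    """
    if pattern[:1] in ("*", "?"):
        # Mirrors the rule that a search cannot start with a wildcard.
        raise ValueError("a wildcard cannot be the first character")
    parts = []
    for ch in pattern:
        if ch == "?":
            parts.append(".")       # exactly one character
        elif ch == "*":
            parts.append(".*")      # zero or more characters
        else:
            parts.append(re.escape(ch))
    return re.compile("".join(parts))


def matching_terms(pattern, terms):
    """Return the terms that match the whole wildcard pattern."""
    regex = wildcard_to_regex(pattern)
    return [t for t in terms if regex.fullmatch(t)]


terms = ["test", "text", "tests", "tester", "toast"]
print(matching_terms("te?t", terms))
print(matching_terms("test*", terms))
print(matching_terms("te*t", terms))
```

Matching against the whole term (`fullmatch`) is a design choice of this sketch, so that `te?t` does not match `tests`.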
+Lucene supports fuzzy searches based on the Levenshtein Distance, or Edit Distance algorithm. To do a fuzzy search use the tilde, "~", symbol at the end of a term. For example to search for a term similar in spelling to "roam" use the fuzzy search:
+This search will find terms like foam and roams.
+Note: Terms found by the fuzzy search will automatically get a boost factor of 0.2.
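To make the edit distance idea concrete, here is a minimal Python sketch of the Levenshtein distance. It only computes the distance between two strings; how Lucene turns that distance into a match decision is not described in this page, so no threshold is assumed here.

```python
def levenshtein(a, b):
    """Edit distance: the minimum number of single-character insertions,
    deletions, or substitutions needed to turn a into b."""
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # delete ca
                current[j - 1] + 1,      # insert cb
                previous[j - 1] + cost,  # substitute, or keep when equal
            ))
        previous = current
    return previous[-1]


for candidate in ("foam", "roams", "house"):
    print(candidate, levenshtein("roam", candidate))
```

Both foam and roams are one edit away from roam, which is consistent with them being found by the fuzzy search above, while an unrelated word such as house is much further away.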
+Lucene provides the relevance level of matching documents based on the terms found. To boost a term use the caret, "^", symbol with a boost factor (a number) at the end of the term you are searching. The higher the boost factor, the more relevant the term will be.
+Boosting allows you to control the relevance of a document by boosting its term. For example, if you are searching for
+and you want the term "IBM" to be more relevant, boost it using the ^ symbol along with the boost factor next to the term. You would type:
+ +This will make documents with the term IBM appear more relevant. You can also boost Phrase Terms as in the example:
+ + +By default, the boost factor is 1.
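The effect of a boost factor can be sketched in Python as follows. Everything here is hypothetical: `parse_boosts` and `toy_score` are illustrative helpers, the query `IBM^4 Microsoft` uses `Microsoft` as a placeholder second term, and the scoring is a toy sum, not Lucene's relevance formula.

```python
def parse_boosts(query):
    """Split a query into (term, boost) pairs; terms without '^' get boost 1.0.

    Hypothetical helper: handles single terms only, not quoted phrases.
    """
    pairs = []
    for token in query.split():
        term, sep, factor = token.rpartition("^")
        if sep and term and factor:
            pairs.append((term, float(factor)))
        else:
            # No boost given, so the default factor of 1 applies.
            pairs.append((token, 1.0))
    return pairs


def toy_score(doc_terms, boosted_terms):
    """Toy relevance: sum of the boosts of the query terms found in the document.

    This is NOT Lucene's scoring formula.
    """
    return sum(boost for term, boost in boosted_terms if term in doc_terms)


query = parse_boosts("IBM^4 Microsoft")
print(query)
print(toy_score({"IBM", "mainframe"}, query))
print(toy_score({"Microsoft", "Windows"}, query))
```

In this toy model, a document containing the boosted term scores higher than one containing only the unboosted term, which is the ordering the boost is meant to produce.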
+Boolean operators allow terms to be combined through logic operators. Lucene supports AND, "+", OR, NOT and "-" as Boolean operators. (Note: Boolean operators must be ALL CAPS.)
+The OR operator is the default conjunction operator. This means that if there is no Boolean operator between two terms, the OR operator is used. The OR operator links two terms and finds a matching document if either of the terms exists in a document. This is equivalent to a union using sets. For example, to search for documents that contain either "Microsoft Word" or just "Microsoft":
+ + + +or
+The AND operator matches documents where both terms exist anywhere in the text of a single document. This is equivalent to an intersection using sets. For example, to search for documents that contain "Microsoft Word" and "Microsoft Excel":
+The "+" or required operator requires that the term after the "+" symbol exist somewhere in a field of a single document. For example, to search for documents that must contain jakarta and may contain lucene:
+The NOT operator excludes documents that contain the term after NOT. This is equivalent to a difference using sets. For example, to search for documents that contain "Microsoft Word" but not "Microsoft Excel":
+The "-" or prohibit operator excludes documents that contain the term after the "-" symbol. For example, to search for documents that contain "Microsoft Word" but not "Microsoft Excel":
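The set analogies used for OR, AND, and NOT can be shown directly with Python sets. The index below is a hypothetical toy mapping from a term or phrase to the ids of the documents that contain it; it says nothing about how Lucene stores or evaluates queries.

```python
# Hypothetical toy index: term or phrase -> set of document ids containing it.
index = {
    "Microsoft Word": {1, 2, 3},
    "Microsoft Excel": {2, 4},
    "Microsoft": {1, 2, 3, 4, 5},
}
word = index["Microsoft Word"]
excel = index["Microsoft Excel"]
microsoft = index["Microsoft"]

either = word | microsoft   # OR: union
both = word & excel         # AND: intersection
word_only = word - excel    # NOT and "-": difference

print(sorted(either))
print(sorted(both))
print(sorted(word_only))
```

The "+" operator is left out of this sketch because it constrains which documents qualify without having a single set operation as its counterpart in the text above.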
+Lucene supports using parentheses to group clauses to form sub queries. This can be very useful if you want to control the boolean logic for a query. For example, to search for either "jakarta" or "apache" and "website":
+This eliminates any confusion and ensures that website must exist and that at least one of the terms jakarta or apache must also exist.
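Why grouping matters can be seen with the same set analogy. The sketch below assumes the grouped query is `(jakarta OR apache) AND website`, as the description above suggests, and uses hypothetical document ids. The ungrouped variant shows one other possible reading of the same three terms; it is not a statement about Lucene's operator precedence.

```python
# Hypothetical toy index: term -> set of document ids containing it.
jakarta = {1, 2}
apache = {2, 3}
website = {2, 3, 4}

# (jakarta OR apache) AND website
grouped = (jakarta | apache) & website

# jakarta OR (apache AND website), a different reading of the same terms
ungrouped = jakarta | (apache & website)

print(sorted(grouped))
print(sorted(ungrouped))
```

Document 1 contains jakarta but not website, so it appears only in the second result. The parentheses make it explicit that website is required in every match.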
+