Modified RegexTermEnum to have more generic logic, Character.isLetterOrDigit(), to determine the prefix for term enumeration.

Also added a commented-out test demonstrating where the prefix logic currently fails. Perhaps, like FuzzyQuery, we should push the
prefix calculation back to the user of the query for now?



git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@348692 13f79535-47bb-0310-9956-ffa450edef68
Erik Hatcher 2005-11-24 09:09:48 +00:00
parent cf1d106504
commit 729175f73a
2 changed files with 8 additions and 3 deletions

@@ -26,9 +26,7 @@ public class RegexTermEnum extends FilteredTermEnum {
 while (index < text.length()) {
   char c = text.charAt(index);
-  // TODO: improve the logic here. There are other types of patterns
-  // that could break this, such as "\d*" and "\*abc"
-  if (c == '*' || c == '[' || c == '?' || c == '.') break;
+  if (!Character.isLetterOrDigit(c)) break;
   index++;
 }

@@ -89,5 +89,12 @@ public class TestRegexQuery extends TestCase {
 public void testSpanRegex2() throws Exception {
   assertEquals(0, spanRegexQueryNrHits("q.[aeiou]c.*", "dog", 5, true));
 }
+// public void testPrefix() throws Exception {
+// This test currently fails because RegexTermEnum picks "r" as the prefix
+// but the following "?" makes the "r" optional and should be a hit for the
+// document matching "over".
+// assertEquals(1, regexQueryNrHits("r?over"));
+// }
 }
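
The failure described in the commented-out test can be reproduced outside Lucene. The sketch below is a standalone illustration, not the RegexTermEnum source: the class name PrefixSketch and both method names are hypothetical, java.util.regex stands in for whatever regex engine the query uses, and cautiousPrefix is only one possible adjustment that the commit does not make.

```java
import java.util.regex.Pattern;

public class PrefixSketch {
    // Mirrors the loop in the diff: the prefix is the leading run of
    // letters and digits in the regex text.
    public static String naivePrefix(String text) {
        int index = 0;
        while (index < text.length()) {
            char c = text.charAt(index);
            if (!Character.isLetterOrDigit(c)) break;
            index++;
        }
        return text.substring(0, index);
    }

    // Hypothetical adjustment (NOT part of the commit): if the character
    // right after the prefix is a quantifier that can make the preceding
    // character optional, drop that character from the prefix.
    public static String cautiousPrefix(String text) {
        String prefix = naivePrefix(text);
        if (!prefix.isEmpty() && prefix.length() < text.length()) {
            char next = text.charAt(prefix.length());
            if (next == '?' || next == '*' || next == '{') {
                return prefix.substring(0, prefix.length() - 1);
            }
        }
        return prefix;
    }

    public static void main(String[] args) {
        String regex = "r?over";
        String term = "over";
        // The regex itself matches the term ...
        System.out.println(Pattern.matches(regex, term));            // true
        // ... but enumeration restricted to the prefix "r" never sees it.
        System.out.println(term.startsWith(naivePrefix(regex)));     // false
        // Dropping the optional character yields an empty prefix here.
        System.out.println(term.startsWith(cautiousPrefix(regex)));  // true
    }
}
```

Even the adjusted version is incomplete: a pattern with top-level alternation such as "ab|cd" would still get the prefix "ab" and miss terms starting with "cd". That is one argument for the alternative raised in the commit message, letting the user of the query supply the prefix.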