Lucene-9336: Changes.txt and migrate.md addition for RegExp enhancements (#1515)

Added notes for new \w \s etc support
markharwood 2020-05-14 11:51:59 +01:00 committed by GitHub
parent 1efce5444d
commit 18bd29715a
2 changed files with 8 additions and 0 deletions


@ -60,6 +60,10 @@ API Changes
Improvements
* LUCENE-9336: RegExp query now supports \w \W \d \D \s \S expressions.
This is a break with previous behaviour where these were (mis)interpreted
as literally the characters w W d etc. (Mark Harwood)
* LUCENE-8757: When provided with an ExecutorService to run queries across
multiple threads, IndexSearcher now groups small segments together, up to
250k docs per slice. (Atri Sharma via Adrien Grand)


@ -1,5 +1,9 @@
# Apache Lucene Migration Guide
## RegExp: certain regular expressions now match differently (LUCENE-9336)
The commonly used regular expression character classes \w, \W, \d, \D, \s and \S now work the same way as in [Java Pattern](https://docs.oracle.com/javase/tutorial/essential/regex/pre_char_classes.html#CHART) matching. Previously these expressions were (mis)interpreted as searches for the literal characters w, d, s etc.
## NGramFilterFactory "keepShortTerm" option was fixed to "preserveOriginal" (LUCENE-9259)
The factory option name to output the original term was corrected in accordance with its Javadoc.
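
As an illustration of the LUCENE-9336 note, the sketch below shows the `java.util.regex.Pattern` semantics that the migration guide points to for \w, \W, \d, \D, \s and \S. It deliberately uses only the JDK and does not call Lucene's `RegExp` class. The class name `PredefinedClassDemo` and the helper `fullMatch` are hypothetical names chosen for this example, and the assumption that whole-input matching is the relevant comparison is ours, not the commit's.

```java
import java.util.regex.Pattern;

// Hypothetical demo class: shows JDK Pattern behaviour only, not Lucene's RegExp.
class PredefinedClassDemo {

    // True when the entire input matches the regex under java.util.regex rules.
    static boolean fullMatch(String regex, String input) {
        return Pattern.matches(regex, input);
    }

    public static void main(String[] args) {
        // Predefined classes match categories of characters.
        System.out.println(fullMatch("\\d+", "2020"));        // true: all digits
        System.out.println(fullMatch("\\w+", "lucene_9336")); // true: letters, digits, underscore
        System.out.println(fullMatch("\\s", " "));            // true: whitespace

        // Under the old literal reading, \d would have matched the character 'd'.
        System.out.println(fullMatch("\\d", "d"));            // false: 'd' is not a digit
        System.out.println(fullMatch("\\D", "d"));            // true: 'd' is a non-digit
        System.out.println(fullMatch("\\W", "w"));            // false: 'w' is a word character
    }
}
```

Queries that relied on the previous literal interpretation would need to use the plain characters w, d, s (without the backslash) to keep matching those letters.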