LUCENE-1390: Added ASCIIFoldingFilter, a Filter that converts alphabetic, numeric, and symbolic Unicode characters which are not in the first 128 Unicode characters (the "Basic Latin" Unicode block) into their ASCII equivalents, if any exist. ISOLatin1AccentFilter, which handles a subset of these characters, has been deprecated.

git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@724053 13f79535-47bb-0310-9956-ffa450edef68
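
For readers unfamiliar with the idea of folding, here is a minimal sketch of what "converting to an ASCII equivalent, if one exists" can look like. It does not use Lucene and is not the ASCIIFoldingFilter code from this commit: the class FoldingSketch and the method fold are hypothetical names, and the technique shown (JDK Unicode NFD decomposition followed by dropping combining marks) is a stand-in that only handles accented letters with a canonical decomposition.

```java
import java.text.Normalizer;

// Illustration only: a hypothetical helper, not part of Lucene.
public class FoldingSketch {

    // Decompose each character (NFD), then drop the combining marks,
    // so that "a with grave" becomes plain "a". Characters without a
    // canonical decomposition are passed through unchanged.
    static String fold(String input) {
        String decomposed = Normalizer.normalize(input, Normalizer.Form.NFD);
        StringBuilder out = new StringBuilder(decomposed.length());
        for (int i = 0; i < decomposed.length(); i++) {
            char c = decomposed.charAt(i);
            if (Character.getType(c) == Character.NON_SPACING_MARK) {
                continue; // skip the accent, keep the base letter
            }
            out.append(c);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(fold("\u00e0 la carte"));             // a la carte
        System.out.println(fold("cr\u00e8me br\u00fbl\u00e9e")); // creme brulee
    }
}
```

Characters such as U+00F8 (o with stroke) have no canonical decomposition, so this sketch leaves them untouched. Covering those cases needs an explicit mapping from each character to its ASCII replacement.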
2008-12-06 23:25:42 +00:00
parent 19c82a230d
commit 8660ccb303
4 changed files with 3942 additions and 1 deletions
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@ -1,4 +1,4 @@
-Lucene Change Log
+Lucene Change Log
 $Id$

 ======================= Trunk (not yet released) =======================
@@ -117,6 +117,13 @@ New features
    to allow subclasses to choose which DocIdSet implementation to use
    (Paul Elschot via Mike McCandless)
    
+ 9. LUCENE-1390: Added ASCIIFoldingFilter, a Filter that converts 
+    alphabetic, numeric, and symbolic Unicode characters which are not in 
+    the first 128 Unicode characters (the "Basic Latin" Unicode block) into 
+    their ASCII equivalents, if any exist. ISOLatin1AccentFilter, which
+    handles a subset of these characters, has been deprecated.
+    (Andi Vajda, Steven Rowe via Mark Miller)
+
 Optimizations

 1. LUCENE-1427: Fixed QueryWrapperFilter to not waste time computing
--- a/src/java/org/apache/lucene/analysis/ASCIIFoldingFilter.java
+++ b/src/java/org/apache/lucene/analysis/ASCIIFoldingFilter.java
--- a/src/java/org/apache/lucene/analysis/ISOLatin1AccentFilter.java
+++ b/src/java/org/apache/lucene/analysis/ISOLatin1AccentFilter.java
@@ -25,6 +25,9 @@ import org.apache.lucene.analysis.tokenattributes.TermAttribute;
 * <p>
 * For instance, '&agrave;' will be replaced by 'a'.
 * <p>
+ * 
+ * @deprecated in favor of {@link ASCIIFoldingFilter} which covers a superset 
+ * of Latin 1. This class will be removed in Lucene 3.0.
 */
 public class ISOLatin1AccentFilter extends TokenFilter {
  public ISOLatin1AccentFilter(TokenStream input) {
--- a/src/test/org/apache/lucene/analysis/TestASCIIFoldingFilter.java
+++ b/src/test/org/apache/lucene/analysis/TestASCIIFoldingFilter.java