mirror of https://github.com/apache/lucene.git
Add an example that builds a CharFilter chain in Analyzer
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1378593 13f79535-47bb-0310-9956-ffa450edef68
parent d447b1ae51
commit 71dd31de24
@@ -33,6 +33,9 @@ import java.io.Reader;
 * You can optionally provide more efficient implementations of additional methods
 * like {@link #read()}, {@link #read(char[])}, {@link #read(java.nio.CharBuffer)},
 * but this is not required.
 * <p>
 * For examples and integration with {@link Analyzer}, see the
 * {@link org.apache.lucene.analysis Analysis package documentation}.
 */
// the way java.io.FilterReader should work!
public abstract class CharFilter extends Reader {
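The Javadoc in this hunk notes that overriding the extra read methods is optional. The sketch below illustrates why, using only `java.io.Reader`: a subclass that implements just `read(char[], int, int)` and `close()` still supports `read()` and `read(char[])`, because the base class routes them through the one required method. `UpperCaseReader` and `UpperCaseReaderDemo` are hypothetical names made up for illustration; they are not Lucene classes.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;

// Hypothetical example class (not part of Lucene): upper-cases whatever the wrapped Reader returns.
class UpperCaseReader extends Reader {
  private final Reader input;

  UpperCaseReader(Reader input) {
    this.input = input;
  }

  // The only read method java.io.Reader requires a subclass to implement.
  @Override
  public int read(char[] cbuf, int off, int len) throws IOException {
    int n = input.read(cbuf, off, len);
    // When n is -1 (end of stream) the loop body never runs.
    for (int i = off; i < off + n; i++) {
      cbuf[i] = Character.toUpperCase(cbuf[i]);
    }
    return n;
  }

  @Override
  public void close() throws IOException {
    input.close();
  }
}

class UpperCaseReaderDemo {
  public static void main(String[] args) throws IOException {
    Reader r = new UpperCaseReader(new StringReader("abc"));
    // read() and read(char[]) are inherited; both end up in read(char[], int, int).
    System.out.println((char) r.read());
    char[] rest = new char[2];
    int n = r.read(rest);
    System.out.println(new String(rest, 0, n));
    r.close();
  }
}
```

Overriding the single-character `read()` directly would avoid the temporary one-element array the inherited version allocates per call, which is the kind of optional optimization the Javadoc refers to.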
@@ -817,5 +817,30 @@ As a small hint, this is how the new Attribute class could begin:

...
</pre>
<h4>Adding a CharFilter chain</h4>
Analyzers take Java {@link java.io.Reader}s as input. You can wrap your Readers with {@link java.io.FilterReader}s
to manipulate content, but this has a significant disadvantage: character offsets may no longer be consistent with
your original text.
<p>
{@link org.apache.lucene.analysis.CharFilter} lets you pre-process input as a FilterReader would, while also
preserving the original offsets associated with those characters, so mechanisms such as highlighting still work
correctly. CharFilters can be chained.
<p>
Example:
<pre class="prettyprint">
public class MyAnalyzer extends Analyzer {

  {@literal @Override}
  protected TokenStreamComponents createComponents(String fieldName, Reader reader) {
    return new TokenStreamComponents(new MyTokenizer(reader));
  }

  {@literal @Override}
  protected Reader initReader(String fieldName, Reader reader) {
    // wrap the Reader in a CharFilter chain: FirstCharFilter sees the raw input,
    // SecondCharFilter sees FirstCharFilter's output.
    return new SecondCharFilter(new FirstCharFilter(reader));
  }
}
</pre>
</body>
</html>
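To make the offset argument in the added documentation concrete, here is a minimal, self-contained sketch of how a chain of offset-correcting readers can map a position in the filtered text back to the original. It is a simplified stand-in written against `java.io.Reader` only and is not the Lucene implementation: `OffsetCorrectingReader`, `StripCharReader`, and `OffsetChainDemo` are hypothetical names, and the `correct`/`correctOffset` pair is only loosely modeled on the idea behind CharFilter.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

// Hypothetical, simplified stand-in for the CharFilter idea (not the Lucene class).
abstract class OffsetCorrectingReader extends Reader {
  protected final Reader input;

  OffsetCorrectingReader(Reader input) {
    this.input = input;
  }

  // Maps an offset in this reader's output to an offset in its direct input.
  protected abstract int correct(int currentOff);

  // Maps an offset in this reader's output back to the original text,
  // walking through every OffsetCorrectingReader further down the chain.
  public final int correctOffset(int currentOff) {
    int corrected = correct(currentOff);
    return (input instanceof OffsetCorrectingReader)
        ? ((OffsetCorrectingReader) input).correctOffset(corrected)
        : corrected;
  }

  @Override
  public void close() throws IOException {
    input.close();
  }
}

// Drops every occurrence of one character and remembers where the kept ones came from.
class StripCharReader extends OffsetCorrectingReader {
  private final char removed;
  private final List<Integer> inputOffsets = new ArrayList<>(); // input offset of each output char
  private int consumed = 0; // number of input chars read so far

  StripCharReader(Reader input, char removed) {
    super(input);
    this.removed = removed;
  }

  @Override
  public int read(char[] cbuf, int off, int len) throws IOException {
    int written = 0;
    while (written < len) {
      int c = input.read();
      if (c == -1) {
        break;
      }
      if (c != removed) {
        cbuf[off + written++] = (char) c;
        inputOffsets.add(consumed);
      }
      consumed++;
    }
    return (written == 0 && len > 0) ? -1 : written;
  }

  @Override
  protected int correct(int currentOff) {
    // Offsets at or past the end of the output map to the end of the consumed input.
    return currentOff < inputOffsets.size() ? inputOffsets.get(currentOff) : consumed;
  }
}

class OffsetChainDemo {
  public static void main(String[] args) throws IOException {
    // Same nesting order as initReader in the example: the inner filter runs first.
    StripCharReader first = new StripCharReader(new StringReader("a-b_c"), '-');
    StripCharReader second = new StripCharReader(first, '_');

    StringBuilder out = new StringBuilder();
    for (int c = second.read(); c != -1; c = second.read()) {
      out.append((char) c);
    }
    System.out.println(out);                     // abc
    System.out.println(second.correctOffset(2)); // 4: 'c' sits at offset 4 in "a-b_c"
  }
}
```

A plain FilterReader chain would produce the same filtered text but would have no way to report that the `c` at output offset 2 came from offset 4 of the input, which is the inconsistency the documentation warns about.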