mirror of https://github.com/apache/lucene.git
Add an example that builds a CharFilter chain in Analyzer
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1378593 13f79535-47bb-0310-9956-ffa450edef68
parent d447b1ae51
commit 71dd31de24
@@ -33,6 +33,9 @@ import java.io.Reader;
  * You can optionally provide more efficient implementations of additional methods
  * like {@link #read()}, {@link #read(char[])}, {@link #read(java.nio.CharBuffer)},
  * but this is not required.
+ * <p>
+ * For examples and integration with {@link Analyzer}, see the
+ * {@link org.apache.lucene.analysis Analysis package documentation}.
  */
 // the way java.io.FilterReader should work!
 public abstract class CharFilter extends Reader {
@@ -817,5 +817,30 @@ As a small hint, this is how the new Attribute class could begin:
 
 ...
 </pre>
+<h4>Adding a CharFilter chain</h4>
+Analyzers take Java {@link java.io.Reader}s as input. Of course you can wrap your Readers with {@link java.io.FilterReader}s
+to manipulate content, but this would have the big disadvantage that character offsets might be inconsistent with your original
+text.
+<p>
+{@link org.apache.lucene.analysis.CharFilter} is designed to allow you to pre-process input like a FilterReader would, but also
+preserve the original offsets associated with those characters. This way mechanisms like highlighting still work correctly.
+CharFilters can be chained.
+<p>
+Example:
+<pre class="prettyprint">
+public class MyAnalyzer extends Analyzer {
+
+  {@literal @Override}
+  protected TokenStreamComponents createComponents(String fieldName, Reader reader) {
+    return new TokenStreamComponents(new MyTokenizer(reader));
+  }
+
+  {@literal @Override}
+  protected Reader initReader(String fieldName, Reader reader) {
+    // wrap the Reader in a CharFilter chain.
+    return new SecondCharFilter(new FirstCharFilter(reader));
+  }
+}
+</pre>
 </body>
 </html>
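The added documentation says a CharFilter can change the text while keeping original offsets recoverable. The following JDK-only sketch illustrates that idea and nothing more. `OffsetCorrectingReader`, its `readAll` helper, and its bookkeeping are hypothetical names made up for this illustration; this is not Lucene's CharFilter implementation, and it assumes the filter only ever drops a single character.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical JDK-only illustration of offset correction; not Lucene's CharFilter. */
public class OffsetCorrectingReader extends Reader {
  private final Reader in;
  private final char dropped;
  private int emitted = 0; // characters handed to the caller so far
  private int skipped = 0; // characters dropped so far
  // Each entry is {filteredOffset, cumulativeDiff}: from filteredOffset onward,
  // add cumulativeDiff to get back to an offset in the original text.
  private final List<int[]> corrections = new ArrayList<>();

  public OffsetCorrectingReader(Reader in, char dropped) {
    this.in = in;
    this.dropped = dropped;
  }

  @Override
  public int read(char[] buf, int off, int len) throws IOException {
    int n = 0;
    while (n < len) {
      int c = in.read();
      if (c == -1) {
        break; // end of the wrapped input
      }
      if (c == dropped) {
        skipped++;
        corrections.add(new int[] {emitted, skipped});
        continue; // drop the character, remember the shift
      }
      buf[off + n] = (char) c;
      n++;
      emitted++;
    }
    return (n == 0 && len > 0) ? -1 : n;
  }

  @Override
  public void close() throws IOException {
    in.close();
  }

  /** Maps an offset in the filtered text back to an offset in the original text. */
  public int correctOffset(int filteredOffset) {
    int diff = 0;
    for (int[] entry : corrections) {
      if (entry[0] <= filteredOffset) {
        diff = entry[1]; // entries are recorded in order, so the last match wins
      }
    }
    return filteredOffset + diff;
  }

  /** Convenience helper for the demo: drains a Reader into a String. */
  public static String readAll(Reader reader) {
    try {
      StringBuilder sb = new StringBuilder();
      char[] buf = new char[64];
      int n;
      while ((n = reader.read(buf, 0, buf.length)) != -1) {
        sb.append(buf, 0, n);
      }
      return sb.toString();
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  public static void main(String[] args) {
    OffsetCorrectingReader r = new OffsetCorrectingReader(new StringReader("a-b-c"), '-');
    System.out.println(readAll(r));          // abc
    System.out.println(r.correctOffset(1));  // 2: 'b' sits at offset 2 in "a-b-c"
    System.out.println(r.correctOffset(2));  // 4: 'c' sits at offset 4 in "a-b-c"
  }
}
```

A plain `java.io.FilterReader` that dropped the same characters would hand downstream code the offsets 1 and 2 with no way back to 2 and 4, which is the inconsistency the package documentation warns about. The corrections here are only meaningful for text that has already been read through the filter.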