Analysis README file

INTRODUCTION

The Analysis Module provides analysis capabilities to Lucene and Solr
applications.

The Lucene web site is at:
  http://lucene.apache.org/

Please join the Lucene-User mailing list by sending a message to:
  java-user-subscribe@lucene.apache.org

FILES

lucene-analyzers-common-XX.jar
  The primary analysis module library, containing general-purpose analysis
  components and support for various languages.

lucene-analyzers-icu-XX.jar
  An add-on analysis library that provides improved Unicode support via
  International Components for Unicode (ICU). Note: this module depends on
  the ICU4J jar file (version >= 4.6.0).

lucene-analyzers-kuromoji-XX.jar
  An analyzer with morphological analysis for Japanese.

lucene-analyzers-morfologik-XX.jar
  An analyzer using the Morfologik stemming library.

lucene-analyzers-phonetic-XX.jar
  An add-on analysis library that provides phonetic encoders via Apache
  Commons-Codec. Note: this module depends on the commons-codec jar
  file (version >= 1.4).

lucene-analyzers-smartcn-XX.jar
  An add-on analysis library that provides word segmentation for Simplified
  Chinese.

lucene-analyzers-stempel-XX.jar
  An add-on analysis library that contains a universal algorithmic stemmer,
  including tables for the Polish language.

lucene-analyzers-uima-XX.jar
  An add-on analysis library that contains tokenizers and analyzers that use
  annotations extracted by Apache UIMA to identify tokens, types, and so on.

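The minimum dependency versions noted above for the icu and phonetic
modules can be checked mechanically. The following is a minimal sketch in
Python and is not part of Lucene: the MINIMUM_VERSIONS table and the function
names are hypothetical, and only the two minimum versions (4.6.0 and 1.4) come
from this README. It assumes purely numeric dotted version strings; a
qualifier such as "-SNAPSHOT" would need extra handling.

```python
# Minimum versions are taken from the notes in this README.
# The table name and helper names are illustrative assumptions.
MINIMUM_VERSIONS = {
    "icu4j": (4, 6, 0),
    "commons-codec": (1, 4),
}


def parse_version(text):
    """Turn a dotted numeric version such as '4.8.1' into a tuple of ints."""
    return tuple(int(part) for part in text.split("."))


def pad(version, width):
    """Right-pad a version tuple with zeros so (1, 4) compares like (1, 4, 0)."""
    return version + (0,) * (width - len(version))


def meets_minimum(library, version):
    """Return True if `version` is at least the minimum listed for `library`."""
    required = MINIMUM_VERSIONS[library]
    found = parse_version(version)
    width = max(len(required), len(found))
    return pad(found, width) >= pad(required, width)
```

Comparing tuples of integers rather than raw strings avoids the usual pitfall
where "1.10" sorts before "1.4" lexicographically.
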
common/src/java
icu/src/java
kuromoji/src/java
morfologik/src/java
phonetic/src/java
smartcn/src/java
stempel/src/java
uima/src/java
  The source code for the libraries.

common/src/test
icu/src/test
kuromoji/src/test
morfologik/src/test
phonetic/src/test
smartcn/src/test
stempel/src/test
uima/src/test
  Unit tests for the libraries.
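
The jar names and directories listed in this README follow one pattern per
module. As a rough illustration, the sketch below (Python, not part of the
Lucene build) derives the three paths for each module name. The function
module_layout and the MODULES list are hypothetical helpers, and "XX" stands
for the release version exactly as it does in the jar names above.

```python
# Module names as they appear in the jar and directory listings of this README.
MODULES = [
    "common", "icu", "kuromoji", "morfologik",
    "phonetic", "smartcn", "stempel", "uima",
]


def module_layout(module, version="XX"):
    """Return the jar name, source directory and test directory for a module."""
    return {
        "jar": f"lucene-analyzers-{module}-{version}.jar",
        "source": f"{module}/src/java",
        "tests": f"{module}/src/test",
    }


if __name__ == "__main__":
    # Print one line per module: jar, source directory, test directory.
    for module in MODULES:
        layout = module_layout(module)
        print(layout["jar"], layout["source"], layout["tests"])
```
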