lucene

Erick Erickson 2d93746254 SOLR-6826: fieldType capitalization is not consistent with the rest of case-sensitive field names git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1645349 13f79535-47bb-0310-9956-ffa450edef68		2014-12-13 20:49:03 +00:00
src	SOLR-6826: fieldType capitalization is not consistent with the rest of case-sensitive field names	2014-12-13 20:49:03 +00:00
README.txt	SOLR-1979: Updated README and CHANGES in trunk	2012-04-08 23:05:52 +00:00
build.xml	SOLR-6006: fix Solr contrib test dependencies by adding jcl-over-slf4j and retrieving it into each contrib's test-lib/ directory	2014-04-25 08:55:05 +00:00
ivy.xml	LUCENE-6007: Regularize ivy.xml files to use configurations that map to remote master configurations, so that Ivy won't try to download extraneous crap	2014-10-16 20:13:48 +00:00

README.txt

Apache Solr Language Identifier


Introduction
------------
This module is intended to be used while indexing documents.
It is implemented as an UpdateProcessor to be placed in an UpdateChain.
Its purpose is to identify the language of each document and tag the document with a language code.
The module can optionally map field names to their language-specific counterparts,
e.g. if the input field is "title" and the language is detected as "en", it maps to "title_en".
Language may be detected globally for the document, and/or individually per field.
Language detector implementations are pluggable.
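
The tagging and field-name mapping described above can be illustrated with a small
standalone sketch. This is not the module's code: the class LangFieldMapper, its
methods, and the "language" field name are hypothetical, the "_" separator is taken
only from the "title_en" example above, and the language code is passed in rather
than detected.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration only; not part of the Solr langid module.
public class LangFieldMapper {

    // Build the language-specific field name, e.g. ("title", "en") -> "title_en".
    public static String mapFieldName(String field, String lang) {
        return field + "_" + lang;
    }

    // Return a copy of the document that carries the language code in langField
    // and has each listed field renamed to its language-specific counterpart.
    // The language code is assumed to come from a detector that ran earlier.
    public static Map<String, Object> tagAndMap(Map<String, Object> doc,
                                                String lang,
                                                String langField,
                                                String... fieldsToMap) {
        Map<String, Object> out = new LinkedHashMap<>(doc);
        out.put(langField, lang);
        for (String field : fieldsToMap) {
            if (out.containsKey(field)) {
                // Move the value from "field" to "field_<lang>".
                Object value = out.remove(field);
                out.put(mapFieldName(field, lang), value);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        Map<String, Object> doc = new LinkedHashMap<>();
        doc.put("id", "1");
        doc.put("title", "The quick brown fox");
        // Assume the detector reported "en" for this document.
        System.out.println(tagAndMap(doc, "en", "language", "title"));
    }
}
```

In this sketch the input document is left unchanged and a modified copy is returned.
Fields that are not listed for mapping, such as "id", keep their original names.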

Getting Started
---------------
Please refer to the module documentation at http://wiki.apache.org/solr/LanguageDetection

Dependencies
------------
The Tika detector depends on Tika Core (which is part of the extraction contrib).
The Langdetect detector depends on the LangDetect library.