lucene/sandbox/projects/appex
Kelvin Tan bcfa0cbc60 Added simple README and GETTING STARTED docs. Hope it will ease the learning curve somewhat,
git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@150766 13f79535-47bb-0310-9956-ffa450edef68
2002-05-10 17:20:29 +00:00
legal Initial revision 2002-05-04 15:43:03 +00:00
lib Initial revision 2002-05-04 15:43:03 +00:00
src Initial commit of a system of retrieving Torque-persisted objects from Document objects. 2002-05-08 15:53:48 +00:00
tools Initial revision 2002-05-04 15:43:03 +00:00
GETTING STARTED.txt Added simple README and GETTING STARTED docs. Hope it will ease the learning curve somewhat, 2002-05-10 17:20:29 +00:00
README.txt Added simple README and GETTING STARTED docs. Hope it will ease the learning curve somewhat, 2002-05-10 17:20:29 +00:00
appendcp.bat Initial revision 2002-05-04 15:43:03 +00:00
build.bat Initial revision 2002-05-04 15:43:03 +00:00
build.number Initial revision 2002-05-04 15:43:03 +00:00
build.sh Initial revision 2002-05-04 15:43:03 +00:00
build.xml Initial revision 2002-05-04 15:43:03 +00:00
index.html Initial revision 2002-05-04 15:43:03 +00:00
layout.xml Initial revision 2002-05-04 15:43:03 +00:00
module.xml Initial revision 2002-05-04 15:43:03 +00:00
patch Initial revision 2002-05-04 15:43:03 +00:00
properties.xml Initial revision 2002-05-04 15:43:03 +00:00
status.xml Initial revision 2002-05-04 15:43:03 +00:00

README.txt

This is the README file for a search framework contributed to the Lucene Sandbox.

It is an attempt at constructing a framework around the Lucene search API. 
(Can I have a name for it?)

Three interesting features of this framework are:

datasource independence - a database table, an object, a filesystem directory,
or a website can each be indexed through a matching datasource implementation.

complex datasource support - complex datasources are containers that may hold
further datasources (a Zip archive, an HTML document containing links to
other HTML documents, a Java object which contains references to other objects
to be indexed, etc). The framework has basic support for complex datasources.

pluggable file content handlers - content handlers which 'know' how to index
various file formats (MS Word, Zip, Tar, etc) can be configured via an
XML configuration file.
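
The three features above might fit together roughly as in the sketch below.
It is an illustration only, not the framework's actual API: every type name
(DataSource, ObjectDataSource, CompositeDataSource, FileContentHandler,
ContentHandlerRegistry) is hypothetical, the code is written in present-day
Java, and handlers are registered in code where the framework would read them
from its XML configuration file.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// All names below are hypothetical; none are taken from the framework's source.

/** Anything that can yield field-name/value maps ready to become index documents. */
interface DataSource {
    List<Map<String, String>> getData();
}

/** A simple datasource that wraps a single in-memory object. */
final class ObjectDataSource implements DataSource {
    private final String id;
    private final String text;

    ObjectDataSource(String id, String text) {
        this.id = id;
        this.text = text;
    }

    @Override
    public List<Map<String, String>> getData() {
        Map<String, String> fields = new LinkedHashMap<>();
        fields.put("id", id);
        fields.put("contents", text);
        List<Map<String, String>> out = new ArrayList<>();
        out.add(fields);
        return out;
    }
}

/** A complex datasource: a container whose entries are themselves datasources. */
final class CompositeDataSource implements DataSource {
    private final List<DataSource> children = new ArrayList<>();

    CompositeDataSource add(DataSource child) {
        children.add(child);
        return this;
    }

    @Override
    public List<Map<String, String>> getData() {
        List<Map<String, String>> out = new ArrayList<>();
        for (DataSource child : children) {
            // A child may itself be a container, so nesting is flattened here.
            out.addAll(child.getData());
        }
        return out;
    }
}

/** Turns the raw content of one file format into indexable text. */
interface FileContentHandler {
    String extractText(String rawContent);
}

/** Maps file extensions to handlers; stands in for the XML-driven configuration. */
final class ContentHandlerRegistry {
    private final Map<String, FileContentHandler> byExtension = new HashMap<>();

    void register(String extension, FileContentHandler handler) {
        byExtension.put(extension.toLowerCase(Locale.ROOT), handler);
    }

    /** Returns the handler for the file's extension, or null if none is registered. */
    FileContentHandler forFile(String fileName) {
        int dot = fileName.lastIndexOf('.');
        if (dot < 0) {
            return null;
        }
        String extension = fileName.substring(dot + 1).toLowerCase(Locale.ROOT);
        return byExtension.get(extension);
    }
}
```

With these types, a nested container built with CompositeDataSource.add(...)
yields one flat list of field maps, and a lookup such as
forFile("report.DOC") finds a handler registered under "doc". Treating a
container as just another DataSource is one way to let the indexing code stay
unaware of how deeply the sources are nested.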