Indyo is a datasource-independent Lucene indexing framework.

This means that Indyo can feed data to the search engine for indexing
from a wide range of sources. Datasources can take the form of
traditional storage media (filesystem, database, web site, etc.),
objects, complex datasources that mix objects and storage media, and
pretty much anything that implements
com.relevanz.indyo.IndexDataSource. If a file is being indexed (via
com.relevanz.indyo.FSDataSource), its contents can be indexed by a
class that implements
com.relevanz.indyo.contenthandler.FileContentHandler (e.g.
TextHandler, ZIPHandler, etc.). Via the datasource, applications can
also associate a search result object with the object that was indexed
(or optionally use Peter's SearchBean contribution) for display
purposes.

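The pluggable content-handler idea can be sketched roughly as follows.
This is only an illustration: the method signatures of the real
com.relevanz.indyo.contenthandler.FileContentHandler are not shown in
this file, so SketchContentHandler, SketchTextHandler and
SketchHandlerRegistry are hypothetical names made up for the example,
and choosing a handler by file extension is an assumption.

```java
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical stand-in for a file content handler. This is NOT the
// real FileContentHandler interface, whose methods are not shown here.
interface SketchContentHandler {
    // Turns raw file bytes into the text that should be indexed.
    String extractText(byte[] raw);
}

// Plain text: decode the bytes as UTF-8 and index them unchanged.
final class SketchTextHandler implements SketchContentHandler {
    @Override
    public String extractText(byte[] raw) {
        return new String(raw, StandardCharsets.UTF_8);
    }
}

// Hypothetical registry that picks a handler by file extension.
final class SketchHandlerRegistry {
    private final Map<String, SketchContentHandler> byExtension = new HashMap<>();

    void register(String extension, SketchContentHandler handler) {
        byExtension.put(extension.toLowerCase(Locale.ROOT), handler);
    }

    // Returns the extracted text, or null when no handler is registered
    // for the file's extension (or the name has no extension at all).
    String extract(String fileName, byte[] raw) {
        int dot = fileName.lastIndexOf('.');
        if (dot < 0) {
            return null;
        }
        String ext = fileName.substring(dot + 1).toLowerCase(Locale.ROOT);
        SketchContentHandler handler = byExtension.get(ext);
        return handler == null ? null : handler.extractText(raw);
    }
}

final class SketchHandlerDemo {
    public static void main(String[] args) {
        SketchHandlerRegistry registry = new SketchHandlerRegistry();
        registry.register("txt", new SketchTextHandler());
        byte[] raw = "hello indyo".getBytes(StandardCharsets.UTF_8);
        System.out.println(registry.extract("notes.txt", raw));
    }
}
```

Supporting a new file type in this sketch means writing one more
SketchContentHandler and registering it; the code that walks the
datasource does not change.
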
To summarize, if you:

a) Want a way of indexing various sources of data, and even nested
datasources (like indexing an HTML file, which spawns a custom
datasource, say RemoteHTMLDataSource, for every link it encounters)

b) Simply want a pluggable system for indexing different types of file
content (currently the plain text, Zip, Tar, and GZip file formats are
supported, but writing new file content handlers is easy)

then Indyo may be worth checking out.
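
The nested-datasource case from point a) might look something like the
sketch below. It assumes a datasource can report child datasources,
which is a guess at the design rather than the actual
com.relevanz.indyo.IndexDataSource contract. SketchDataSource,
SketchPageSource and SketchIndexer are hypothetical names, and an
in-memory map stands in for fetching remote HTML.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical datasource shape; not the real IndexDataSource interface.
interface SketchDataSource {
    String id();                        // unique key for the indexed item
    String text();                      // content to hand to the indexer
    List<SketchDataSource> children();  // nested datasources spawned by this one
}

// An HTML page that spawns one child datasource per href it contains.
final class SketchPageSource implements SketchDataSource {
    private static final Pattern LINK = Pattern.compile("href=\"([^\"]+)\"");
    private final String url;
    private final Map<String, String> pages; // url -> body, stands in for the web

    SketchPageSource(String url, Map<String, String> pages) {
        this.url = url;
        this.pages = pages;
    }

    @Override public String id() { return url; }

    @Override public String text() { return pages.getOrDefault(url, ""); }

    @Override public List<SketchDataSource> children() {
        List<SketchDataSource> spawned = new ArrayList<>();
        Matcher m = LINK.matcher(text());
        while (m.find()) {
            spawned.add(new SketchPageSource(m.group(1), pages));
        }
        return spawned;
    }
}

final class SketchIndexer {
    // Walks the root and every nested datasource once; the id check
    // keeps pages that link back to each other from looping forever.
    static Map<String, String> indexAll(SketchDataSource root) {
        Map<String, String> indexed = new LinkedHashMap<>();
        Deque<SketchDataSource> pending = new ArrayDeque<>();
        pending.push(root);
        while (!pending.isEmpty()) {
            SketchDataSource source = pending.pop();
            if (indexed.containsKey(source.id())) {
                continue;
            }
            indexed.put(source.id(), source.text());
            for (SketchDataSource child : source.children()) {
                pending.push(child);
            }
        }
        return indexed;
    }

    public static void main(String[] args) {
        Map<String, String> pages = new LinkedHashMap<>();
        pages.put("a.html", "<a href=\"b.html\">b</a>");
        pages.put("b.html", "<a href=\"a.html\">back</a>");
        System.out.println(indexAll(new SketchPageSource("a.html", pages)).keySet());
    }
}
```

In a real setup the map returned by indexAll would be replaced by
calls that add documents to a Lucene index.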