Added simple README and GETTING STARTED docs. Hope it will ease the learning curve somewhat.

git-svn-id: https://svn.apache.org/repos/asf/lucene/java/trunk@150766 13f79535-47bb-0310-9956-ffa450edef68
2002-05-10 17:20:29 +00:00 · bcfa0cbc60
parent bb68f17330
commit bcfa0cbc60
2 changed files with 33 additions and 0 deletions
--- a/sandbox/projects/appex/GETTING
+++ b/sandbox/projects/appex/GETTING
@@ -0,0 +1,14 @@
+The fastest way to get started is to instantiate FSDataSource, passing a File or 
+Directory into the constructor. Then, in SearchIndexer, invoke indexDataSource and
+pass the FSDataSource in as a parameter. The other argument of indexDataSource,
+customFields, declares the types of the fields you want indexed.
+It is an optional argument.
+
+Next, you might want to write your own DataSource by writing a class that
+implements the DataSource interface. The only method you need to implement in the
+DataSource interface is getData, which returns an array of Maps.
+From the javadoc of this method: "Each map represents a document to be indexed.
+The key:value pairs of the map is the metadata of the document."
+That should be fairly self-explanatory. Essentially, the framework converts
+each key in the map into a Field, and the value stored under that key becomes the
+value of the Field. It's that simple!
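
Note on the getting-started text above: a custom DataSource could look roughly like the sketch below. This is only an illustration under stated assumptions. The `DataSource` interface here is a hypothetical stand-in declared locally (the real one may differ, for example in generics or declared exceptions), `InMemoryDataSource` and `FieldMappingSketch` are invented names, and the key-to-field conversion is mimicked with plain strings rather than Lucene Field objects.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in for the framework's DataSource interface.
// Assumption: getData takes no arguments and returns one Map per document.
interface DataSource {
    Map<String, Object>[] getData();
}

// A data source that serves two hard-coded documents.
class InMemoryDataSource implements DataSource {
    @Override
    @SuppressWarnings("unchecked")
    public Map<String, Object>[] getData() {
        // Each map is one document; each key:value pair is one piece of metadata.
        Map<String, Object> first = new LinkedHashMap<>();
        first.put("title", "Getting started");
        first.put("contents", "Instantiate a data source and index it.");

        Map<String, Object> second = new LinkedHashMap<>();
        second.put("title", "Writing a DataSource");
        second.put("contents", "Implement getData and return one map per document.");

        Map<String, Object>[] documents = new Map[2];
        documents[0] = first;
        documents[1] = second;
        return documents;
    }
}

class FieldMappingSketch {
    // Mimics the conversion described above: every key becomes a field name
    // and the value stored under that key becomes the field value.
    static String describeFields(Map<String, Object> document) {
        StringBuilder fields = new StringBuilder();
        for (Map.Entry<String, Object> entry : document.entrySet()) {
            if (fields.length() > 0) {
                fields.append("; ");
            }
            fields.append(entry.getKey()).append('=').append(entry.getValue());
        }
        return fields.toString();
    }

    public static void main(String[] args) {
        for (Map<String, Object> document : new InMemoryDataSource().getData()) {
            System.out.println(describeFields(document));
        }
    }
}
```

In the real framework, an instance of such a class would presumably be handed to indexDataSource in place of the FSDataSource; that call is left out here because its exact signature is not shown in this commit.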
--- a/sandbox/projects/appex/README.txt
+++ b/sandbox/projects/appex/README.txt
@@ -0,0 +1,19 @@
+This is the README file for a search framework contribution to Lucene Sandbox.
+
+It is an attempt at constructing a framework around the Lucene search API. 
+(Can I have a name for it?)
+
+Three interesting features of this framework are:
+
+datasource independence - through various datasource implementations,
+a database table, an object, a filesystem directory, or a website
+can all be indexed.
+
+complex datasource support - complex datasources are containers for what are 
+potentially new datasources (a Zip archive, an HTML document containing links to
+other HTML documents, a Java object which contains references to other objects 
+to be indexed, etc). The framework has basic support for complex datasources.
+
+pluggable file content handlers - content handlers which 'know' how to index 
+various file formats (MS Word, Zip, Tar, etc) can be easily configured via an 
+XML configuration file.
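
Note on the pluggable content handlers described above: the idea can be sketched as a registry that maps file extensions to handlers. Everything in this sketch is hypothetical: `ContentHandler`, `ContentHandlerRegistry`, and `ContentHandlerDemo` are invented names, the handlers only return a description instead of indexing anything, and the mappings are registered in code, whereas the framework reads them from its XML configuration file, whose format is not shown in this README.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical handler contract; the framework's real interface may differ.
interface ContentHandler {
    String describe(String fileName);
}

// Looks up a handler by file extension, falling back to a default handler.
class ContentHandlerRegistry {
    private final Map<String, ContentHandler> handlersByExtension = new HashMap<>();
    private final ContentHandler fallback;

    ContentHandlerRegistry(ContentHandler fallback) {
        this.fallback = fallback;
    }

    void register(String extension, ContentHandler handler) {
        // Extensions are stored in lower case so lookups ignore case.
        handlersByExtension.put(extension.toLowerCase(Locale.ROOT), handler);
    }

    ContentHandler handlerFor(String fileName) {
        int dot = fileName.lastIndexOf('.');
        String extension = dot < 0 ? "" : fileName.substring(dot + 1).toLowerCase(Locale.ROOT);
        return handlersByExtension.getOrDefault(extension, fallback);
    }
}

class ContentHandlerDemo {
    // Stands in for what a configuration file would declare.
    static ContentHandlerRegistry sampleRegistry() {
        ContentHandlerRegistry registry = new ContentHandlerRegistry(name -> "skip " + name);
        registry.register("doc", name -> "word handler for " + name);
        registry.register("zip", name -> "zip handler for " + name);
        return registry;
    }

    public static void main(String[] args) {
        ContentHandlerRegistry registry = sampleRegistry();
        System.out.println(registry.handlerFor("report.DOC").describe("report.DOC"));
        System.out.println(registry.handlerFor("notes.txt").describe("notes.txt"));
    }
}
```

Keeping the lookup separate from the handlers means a new file format only needs a new registration, which is the property a configuration-driven setup relies on.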