lucene/xdocs/lucene-sandbox/index.xml

<?xml version="1.0"?>
<document>
<properties>
<author>Otis Gospodnetic</author>
<title>Lucene Sandbox</title>
</properties>
<body>

<section name="Lucene Sandbox">
<p>
The Lucene project also contains a workspace, the Lucene Sandbox, that is open to all Lucene committers as well
as a few other developers.  The purpose of the Sandbox is to host various third-party contributions
and to serve as a place to try out new ideas and prepare them for inclusion in the core Lucene
distribution.<br/>
Users are free to experiment with the components developed in the Sandbox, but Sandbox components will
not necessarily be maintained, particularly in their current state.
</p>
<p>
You can access the Lucene Sandbox CVS repository at
<a href="http://cvs.apache.org/viewcvs/jakarta-lucene-sandbox/">http://cvs.apache.org/viewcvs/jakarta-lucene-sandbox/</a>.
</p>


<subsection name="Indyo">
<p>
Indyo is a datasource-independent Lucene indexing framework.
</p>
<p>
A tutorial for using Indyo can be found <a href="indyo/tutorial.html">here</a>.
</p>
</subsection>

<subsection name="LARM">
<p>
LARM is a web crawler optimized for large intranets with up to a couple of hundred hosts.
</p>
<p>
<a href="larm/overview.html">Technical Overview</a>
</p>

</subsection>
<subsection name="Snowball Stemmers for Lucene">
<p>
This project provides pre-compiled versions of the Snowball stemmers
for Lucene.
</p>

<p>
More information can be found 
<a href="http://jakarta.apache.org/lucene/docs/lucene-sandbox/snowball/">here</a>.
</p>

<p>
<a href="http://snowball.tartarus.org/">Background information on Snowball</a>,
which is a language for stemmers developed by Martin Porter.
</p>

</subsection>

<subsection name="Ant">
<p>
The Ant contribution provides an Ant task that creates a Lucene index from an Ant fileset.  It also
contains an example HTML parser that uses JTidy.
</p>
<p>
<a href="http://cvs.apache.org/viewcvs/jakarta-lucene-sandbox/contributions/ant/">The 
CVS repository for the Ant contribution.</a>
</p>
</subsection>

<subsection name="SearchBean">
<p>
SearchBean is a UI component for browsing the results of a Lucene search.
It searches the index for a given query string, retrieves the hits, and passes
them to the HitsIterator class, which can be used for paging through and sorting search results.
</p>
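<p>
The paging idea described above can be sketched without Lucene. The HitsPager class below is hypothetical and is
not the Sandbox HitsIterator API; it assumes a plain list of strings standing in for the hits of a search and only
shows how a result set might be split into fixed-size pages.
</p>
<source><![CDATA[
```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration only: this is NOT the Sandbox HitsIterator API.
// A list of strings stands in for the hits returned by a Lucene search.
final class HitsPager {
    private final List<String> hits;
    private final int pageSize;

    HitsPager(List<String> hits, int pageSize) {
        if (pageSize <= 0) {
            throw new IllegalArgumentException("pageSize must be positive");
        }
        this.hits = new ArrayList<>(hits);
        this.pageSize = pageSize;
    }

    // Number of pages needed to show every hit (0 when there are no hits).
    int pageCount() {
        return (hits.size() + pageSize - 1) / pageSize;
    }

    // Returns the hits on the given 1-based page, or an empty list if the
    // page number is out of range.
    List<String> page(int pageNumber) {
        if (pageNumber < 1 || pageNumber > pageCount()) {
            return new ArrayList<>();
        }
        int from = (pageNumber - 1) * pageSize;
        int to = Math.min(from + pageSize, hits.size());
        return new ArrayList<>(hits.subList(from, to));
    }

    public static void main(String[] args) {
        HitsPager pager = new HitsPager(
                Arrays.asList("doc1", "doc2", "doc3", "doc4", "doc5"), 2);
        for (int p = 1; p <= pager.pageCount(); p++) {
            System.out.println("page " + p + ": " + pager.page(p));
        }
    }
}
```
]]></source>
<p>
In a real application the list would be filled from the hits of a Lucene search, and sorting could be applied to
the list before it is paged.
</p>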
<p>
<a href="http://cvs.apache.org/viewcvs/jakarta-lucene-sandbox/contributions/searchBean/">The 
CVS repository for the SearchBean contribution.</a>
</p>

</subsection>

<subsection name="Lucene Service for Fulcrum">
<p>
Lucene can be run as a service inside <a href="http://jakarta.apache.org/turbine/fulcrum/index.html">Fulcrum</a>,
which is the services framework from the 
<a href="http://jakarta.apache.org/turbine/">Turbine</a> project.</p>
<p>
The implementation consists of a SearchService interface, a LuceneSearchSearchService implementation, and a
SearchResults object that gets an array of Document objects from a Hits object. Calls to the search methods on 
the service return the SearchResults object.
</p>
<p>
The service supports querying, but does not support indexing.  
</p>
<p>
<a href="http://cvs.apache.org/viewcvs/jakarta-lucene-sandbox/contributions/fulcrum/"> 
CVS repository for the Fulcrum Service.</a>
</p>
</subsection>

<subsection name="WordNet/Synonyms">
<p>
The Lucene WordNet code consists of a single class that parses a Prolog file
from the WordNet site containing a list of English words and their synonyms.
The class builds a Lucene index from the synonyms file.  Your querying code can
then look up this index to build a set of synonyms for the terms in the
search query.
</p>
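<p>
Query expansion of this kind can be sketched as follows. The SynonymExpander class is hypothetical and is not part
of the WordNet contribution; it assumes an in-memory map in place of the Lucene synonym index, so only the lookup
and merge step is shown.
</p>
<source><![CDATA[
```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch: an in-memory map stands in for the Lucene synonym
// index that the WordNet contribution builds.
final class SynonymExpander {
    private final Map<String, List<String>> synonyms;

    SynonymExpander(Map<String, List<String>> synonyms) {
        this.synonyms = synonyms;
    }

    // Returns each query term followed by its known synonyms, without
    // duplicates, in first-seen order.
    Set<String> expand(String query) {
        Set<String> expanded = new LinkedHashSet<>();
        for (String term : query.toLowerCase(Locale.ROOT).trim().split("\\s+")) {
            if (term.isEmpty()) {
                continue;
            }
            expanded.add(term);
            List<String> found = synonyms.get(term);
            if (found != null) {
                expanded.addAll(found);
            }
        }
        return expanded;
    }

    public static void main(String[] args) {
        Map<String, List<String>> index = new HashMap<>();
        index.put("fast", Arrays.asList("quick", "speedy"));
        index.put("car", Arrays.asList("auto", "automobile"));
        System.out.println(new SynonymExpander(index).expand("fast car"));
    }
}
```
]]></source>
<p>
The expanded terms could then be combined into a single query in which any of the terms may match.
</p>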
<p>
More information is available on the <a href="http://www.tropo.com/techno/java/lucene/wordnet.html">Lucene WordNet package</a> page.
<a href="http://www.cogsci.princeton.edu/~wn/">WordNet</a> is an online database of English words that contains
synonyms, definitions, and various relationships between synonym sets.
</p>
<p>
<a href="http://cvs.apache.org/viewcvs.cgi/jakarta-lucene-sandbox/contributions/WordNet/"> 
CVS for the WordNet module.</a>
</p>
</subsection>

<subsection name="SAX/DOM XML Indexing demo">
<p>
This contribution is sample code that demonstrates how to add simple XML documents to an index.  It creates
a new Document object for each file and then, recursively, populates the Document with a Field for each XML element.
Examples are included for both SAX and DOM.
</p>
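<p>
The recursive walk can be sketched with the DOM API from the Java standard library. The XmlFieldCollector class is
hypothetical and is not the demo's code; it assumes that a list of "name=text" strings stands in for the Fields that
would be added to a Lucene Document.
</p>
<source><![CDATA[
```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

// Hypothetical sketch: "name=text" strings stand in for Lucene Fields.
final class XmlFieldCollector {

    // Parses the XML and returns one entry per element that has direct text.
    static List<String> collect(String xml) {
        try {
            Element root = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)))
                    .getDocumentElement();
            List<String> fields = new ArrayList<>();
            visit(root, fields);
            return fields;
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
    }

    // Records the element's own text, then recurses into its child elements.
    private static void visit(Element element, List<String> fields) {
        StringBuilder text = new StringBuilder();
        NodeList children = element.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            Node child = children.item(i);
            if (child.getNodeType() == Node.TEXT_NODE) {
                text.append(child.getNodeValue());
            }
        }
        String value = text.toString().trim();
        if (!value.isEmpty()) {
            fields.add(element.getTagName() + "=" + value);
        }
        for (int i = 0; i < children.getLength(); i++) {
            Node child = children.item(i);
            if (child.getNodeType() == Node.ELEMENT_NODE) {
                visit((Element) child, fields);
            }
        }
    }

    public static void main(String[] args) {
        String xml = "<book><title>Lucene</title><author>Doug</author></book>";
        System.out.println(collect(xml));
    }
}
```
]]></source>
<p>
With Lucene on the classpath, each collected entry would instead become a Field on the Document created for the file.
</p>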
<p>

<a href="http://cvs.apache.org/viewcvs.cgi/jakarta-lucene-sandbox/contributions/XML-Indexing-Demo/"> 
CVS for the XML Indexing Demo.</a>
</p>
</subsection>

<subsection name="High Frequency Terms">
<p>
The miscellaneous package is for classes that don't fit anywhere else. The only class in it right now determines
which terms occur most frequently in a Lucene index.  This can help identify terms that may belong
in a custom stop word list for better search results.
</p>
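<p>
The underlying idea can be sketched in plain Java. The TopTerms class is hypothetical and is not the class in the
miscellaneous package; it assumes that a list of strings stands in for the indexed documents and counts raw term
occurrences rather than reading statistics from a Lucene index.
</p>
<source><![CDATA[
```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Hypothetical sketch: strings stand in for indexed documents, and terms
// are counted directly instead of being read from a Lucene index.
final class TopTerms {

    // Returns up to n terms ordered by descending count; ties are broken
    // alphabetically so the result is deterministic.
    static List<String> top(List<String> documents, int n) {
        Map<String, Integer> counts = new HashMap<>();
        for (String document : documents) {
            for (String term : document.toLowerCase(Locale.ROOT).split("\\W+")) {
                if (term.isEmpty()) {
                    continue;
                }
                Integer old = counts.get(term);
                counts.put(term, old == null ? 1 : old + 1);
            }
        }
        List<String> terms = new ArrayList<>(counts.keySet());
        terms.sort((a, b) -> {
            int byCount = counts.get(b).compareTo(counts.get(a));
            return byCount != 0 ? byCount : a.compareTo(b);
        });
        int limit = Math.max(0, Math.min(n, terms.size()));
        return new ArrayList<>(terms.subList(0, limit));
    }

    public static void main(String[] args) {
        List<String> documents = Arrays.asList(
                "the cat and the dog", "the bird", "a cat");
        System.out.println(top(documents, 2));
    }
}
```
]]></source>
<p>
Terms that dominate such a list, such as articles and conjunctions, are natural candidates for a stop word list.
</p>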
<p>

<a href="http://cvs.apache.org/viewcvs.cgi/jakarta-lucene-sandbox/contributions/miscellaneous/src/java/org/apache/lucene/misc/"> 
CVS for miscellaneous classes.</a>
</p>
</subsection>

</section>

</body>
</document>