@@ -59,24 +59,22 @@
Resources
Plans
@@ -117,11 +115,11 @@
Jakarta Lucene is a high-performance, full-featured text search engine
written entirely in Java. It is a technology suitable for nearly any
-application that requires full-text search, especially cross-platform.
+application that requires full-text search, especially cross-platform.
-Jakarta Lucene is an open source project available for
-free download from Apache Jakarta.
+Jakarta Lucene is an open source project available for
+free download from Apache Jakarta.
Please use the links on the left to access Lucene.
@@ -142,14 +140,14 @@ Please use the links on the left to access Lucene.
Download it here.
-Lucene v1.02 released - This release repackages Lucene as a product
-of the Apache Software Foundation. Download it
+Lucene v1.02 released - This release repackages Lucene as a product
+of the Apache Software Foundation. Download it
here.
-Lucene Joins Jakarta - The Lucene Team is happy to announce that
-Lucene is now a part of the Apache Jakarta Project. This move will
-help Lucene continue to grow, and enhance its position as the leading
+Lucene Joins Jakarta - The Lucene Team is happy to announce that
+Lucene is now a part of the Apache Jakarta Project. This move will
+help Lucene continue to grow, and enhance its position as the leading
server-side searching solution for Java.
@@ -166,7 +164,7 @@ server-side searching solution for Java.
|
-The goal of the Apache Jakarta Project
+The goal of the Apache Jakarta Project
is to provide commercial-quality server solutions, based on the Java Platform,
developed in an open and cooperative fashion.
@@ -191,3 +189,23 @@ developed in an open and cooperative fashion.
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
diff --git a/docs/luceneplan.html b/docs/luceneplan.html
index d76548dbb99..8c73a02c8dc 100644
--- a/docs/luceneplan.html
+++ b/docs/luceneplan.html
@@ -115,8 +115,8 @@
The best reference is
htDig; though it is not quite as sophisticated as
Lucene, it has a number of features that make it
- desireable. It however is a traditional c-compiled app
- which makes it somewhat unpleasent to install on some
+ desirable. However, it is a traditional C-compiled app,
+ which makes it somewhat unpleasant to install on some
platforms (like Solaris!).
@@ -124,12 +124,12 @@
community for an initial reaction, advice, feedback and
consent. Following this it will be submitted to the
Lucene user community for support. Although I'm (Andy
- Oliver) capable of providing these enhancements by
- myself, I'd of course prefer to work on them in concert
+ Oliver) capable of providing these enhancements by
+ myself, I'd of course prefer to work on them in concert
with others.
- While I'm outlaying a fairly large featureset, these can
+ While I'm laying out a fairly large feature set, these can
be implemented incrementally of course (and are probably
best if done that way).
@@ -148,27 +148,27 @@
The goal is to provide features to Lucene that allow it
- to be used as a dropin search engine. It should provide
+ to be used as a drop-in search engine. It should provide
many of the features of projects like htDig while surpassing
- them with unique Lucene features and capabillities such as
+ them with unique Lucene features and capabilities such as
easy installation on any Java-supporting platform,
- and support for document fields and field searches. And
+ and support for document fields and field searches. And
of course,
a pragmatic software license.
To reach this goal we'll implement code to support the
following objectives that augment but do not replace
- the current Lucene featureset.
+ the current Lucene feature set.
-
- Document Location Independance - meaning mapping
+ Document Location Independence - meaning mapping
real contexts to runtime contexts.
Essentially, if the document is at
/var/www/htdocs/mydoc.html, I probably want it
indexed as
- http://www.bigevilmegacorp.com/mydoc.html.
+ http://www.bigevilmegacorp.com/mydoc.html.
-
Standard methods of creating central indices -
@@ -176,21 +176,21 @@
many environments than is *remote* indexing (for
instance http). I would suggest that most folks
would prefer that general functionality be
- suppored by Lucene instead of having to write
+ supported by Lucene instead of having to write
code for every indexing project. Obviously, if
what they are doing is *special* they'll have to
- code, but general document indexing accross
- webservers would not qualify.
+ code, but general document indexing across
+ web servers would not qualify.
-
- Document interperatation abstraction - currently
+ Document interpretation abstraction - currently
one must handle document object construction via
custom code. A standard interface for plugging
- in format handlers should be supported.
+ in format handlers should be supported.
-
Mime and file-extension to document
- interperatation mapping.
+ interpretation mapping.
@@ -241,7 +241,7 @@
replacement type - the type of
- replacewith path: relative, url or
+ replace with path: relative, URL or
path.
@@ -266,8 +266,8 @@
0 - Long.MAX_VALUE.
- SleeptimeBetweenCalls - can be used to
- avoid flooding a machine with too many
+ SleeptimeBetweenCalls - can be used to
+ avoid flooding a machine with too many
requests
@@ -276,12 +276,12 @@
inactivity.
- IncludeFilter - include only items
- matching filter. (can occur mulitple
+ IncludeFilter - include only items
+ matching filter. (can occur multiple
times)
- ExcludeFilter - exclude only items
+ ExcludeFilter - exclude only items
matching filter. (can occur multiple
times)
@@ -309,9 +309,9 @@
(probably from the command line) read
this properties file and get them from
it. Command line options override
- the properties file in the case of
+ the properties file in the case of
duplicates. There should also be an
- enivironment variable or VM parameter to
+ environment variable or VM parameter to
set this.
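As a sketch, the crawler configuration scheme this hunk describes might look like the following properties file. Every key name here is hypothetical, invented for illustration; none of these are existing Lucene options:

```properties
# Hypothetical crawler.properties -- key names are illustrative only.
crawler.sleeptimeBetweenCalls=500
crawler.includeFilter=.*\.html$
crawler.excludeFilter=.*/private/.*
http.spanHosts=false
http.mapExtensions=fallback
```

Per the proposal, command-line options would override values from this file, and an environment variable or VM parameter would point the crawler at it.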
@@ -320,19 +320,19 @@
This should extend the AbstractCrawler and
- support any addtional options required for a
- filesystem index.
+ support any additional options required for a
+ file system index.
HTTP Crawler
- Supports the AbstractCrawler options as well as:
+ Supports the AbstractCrawler options as well as:
-
- span hosts - Wheter to span hosts or not,
- by default this should be no.
+ span hosts - Whether to span hosts or not,
+ by default this should be no.
-
restrict domains - (ignored if span
@@ -346,11 +346,11 @@
recurse and go to
/nextcontext/index.html this option says
to also try /nextcontext to get the dir
- lsiting)
+ listing)
-
map extensions -
- (always/default/never/fallback). Wether
+ (always/default/never/fallback). Whether
to always use extension mapping, by
default (fallback to mime type), NEVER
or fallback if mime is not available
@@ -376,8 +376,8 @@
A configurable registry of document types, their
- description, an identifyer, mime-type and file
- extension. This should map both MIME -> factory
+ description, an identifier, mime-type and file
+ extension. This should map both MIME -> factory
and extension -> factory.
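The document-type registry proposed in this hunk could be sketched roughly as follows in Java. The class and interface names are hypothetical, not part of Lucene's API:

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of the proposed document-type registry:
// maps both MIME type -> factory and file extension -> factory.
interface DocumentFactory { }

class DocumentTypeRegistry {
    private final Map<String, DocumentFactory> byMime = new HashMap<>();
    private final Map<String, DocumentFactory> byExtension = new HashMap<>();

    void register(String mimeType, String extension, DocumentFactory f) {
        byMime.put(mimeType, f);
        byExtension.put(extension, f);
    }

    // Prefer the MIME type; fall back to the file extension.
    DocumentFactory lookup(String mimeType, String extension) {
        DocumentFactory f = byMime.get(mimeType);
        return f != null ? f : byExtension.get(extension);
    }
}
```

The fallback order in `lookup` matches the "map extensions" option above: MIME type first, extension only when no MIME mapping exists.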
@@ -500,17 +500,17 @@
- A class taht maps standard fields from the
+ A class that maps standard fields from the
DocumentFactories into *fields* in the Document objects
they create. I suggest that a regular expression system
or xpath might be the most universal way to do this.
For instance if perhaps I had an XML factory that
represented XML elements as fields, I could map content
- from particular fields to ther fields or supress them
+ from particular fields to other fields or suppress them
entirely. We could even make this configurable.
-
+
for example:
@@ -533,11 +533,11 @@
title.suppress=false
-
- In this example we map html documents such that all
- fields are suppressed but author and title. We map
- author and title to anything in the content matching
- author: (and x characters). Okay my regular expresions
+
+ In this example we map html documents such that all
+ fields are suppressed but author and title. We map
+ author and title to anything in the content matching
+ author: (and x characters). Okay my regular expressions
suck but hopefully you get the idea.
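The regex-based field mapping sketched in this hunk, reduced to runnable Java. The `FieldMapper` class and the author pattern are illustrative assumptions, not existing Lucene code:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch of the field-mapping idea described above:
// pull an "author" field out of raw document content by regex.
class FieldMapper {
    private static final Pattern AUTHOR =
        Pattern.compile("author: (.{1,40})");

    // Returns the mapped author value, or null when nothing matches.
    static String mapAuthor(String content) {
        Matcher m = AUTHOR.matcher(content);
        return m.find() ? m.group(1).trim() : null;
    }
}
```

A real implementation would presumably load such patterns from the proposed mapping configuration rather than hard-coding them.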
@@ -554,35 +554,35 @@
|
- We might also consider eliminating the DocumentFactory
- entirely by making an AbstractDocument from which the
- current document object would inherit from. I
- experimented with this locally, and it was a relatively
- minor code change and there was of course no difference
- in performance. The Document Factory classes would
- instead be instances of various subclasses of
+ We might also consider eliminating the DocumentFactory
+ entirely by making an AbstractDocument from which the
+ current document object would inherit. I
+ experimented with this locally, and it was a relatively
+ minor code change and there was of course no difference
+ in performance. The Document Factory classes would
+ instead be instances of various subclasses of
AbstractDocument.
- My inspiration for this is HTDig (http://www.htdig.org/).
- While this goes slightly beyond what HTDig provides by
- providing field mapping (where HTDIG is just interested
- in Strings/numbers wherever they are found), it provides
- at least what I would need to use this as a dropin for
- most places I contract at (with the obvious exception of
- a default set of content handlers which would of course
+ My inspiration for this is HTDig (http://www.htdig.org/).
+ While this goes slightly beyond what HTDig provides by
+ providing field mapping (where HTDIG is just interested
+ in Strings/numbers wherever they are found), it provides
+ at least what I would need to use this as a drop-in for
+ most places I contract at (with the obvious exception of
+ a default set of content handlers which would of course
develop naturally over time).
- I am able to certainly contribute to this effort if the
- development community is open to it. I'd suggest we do
- it iteratively in stages and not aim for all of this at
+ I can certainly contribute to this effort if the
+ development community is open to it. I'd suggest we do
+ it iteratively in stages and not aim for all of this at
once (for instance leave out the field mapping at first).
-
- Anyhow, please give me some feedback, counter
- suggestions, let me know if I'm way off base or out of
+
+ Anyhow, please give me some feedback, counter
+ suggestions, let me know if I'm way off base or out of
line, etc. -Andy
diff --git a/docs/lucenesandbox.html b/docs/lucenesandbox.html
index df5704069d7..8b0799f3fc1 100644
--- a/docs/lucenesandbox.html
+++ b/docs/lucenesandbox.html
@@ -106,7 +106,11 @@
|
-
+
+You can access the Lucene Sandbox CVS repository at
+http://cvs.apache.org/viewcvs/jakarta-lucene-sandbox/.
+
+
|
|
diff --git a/docs/whoweare.html b/docs/whoweare.html
index 53b8ec47470..e3030702806 100644
--- a/docs/whoweare.html
+++ b/docs/whoweare.html
@@ -131,11 +131,10 @@ Palo Alto Research Center (PARC), Apple, and Excite@Home, and authored
several information retrieval papers and
patents.
-Doug currently works for Grand
-Central.
-
-Please do not email Doug directly about Lucene. Instead use
-the Jakarta-Lucene mailing lists.
+Recently Doug has worked on peer-to-peer search at Infrasearch
+(acquired by Sun's JXTA project) and on web services at Grand Central.
+Currently he continues to help develop Lucene and is available for
+contract work.
- Otis Gospodnetic (otis at apache.org)
diff --git a/xdocs/lucenesandbox.xml b/xdocs/lucenesandbox.xml
index ded1cc40f09..0a78523d2b1 100644
--- a/xdocs/lucenesandbox.xml
+++ b/xdocs/lucenesandbox.xml
@@ -16,7 +16,7 @@ not necessarily be maintained, particularly in their current state.
You can access the Lucene Sandbox CVS repository at
http://cvs.apache.org/viewcvs/jakarta-lucene-sandbox/.
-
+
|