mirror of https://github.com/apache/lucene.git
372 lines
18 KiB
HTML
372 lines
18 KiB
HTML
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
|
|
|
|
<!-- Content Stylesheet for Site -->
|
|
|
|
|
|
<!-- start the processing -->
|
|
<!-- ====================================================================== -->
|
|
<!-- GENERATED FILE, DO NOT EDIT, EDIT THE XML FILE IN xdocs INSTEAD! -->
|
|
<!-- Main Page Section -->
|
|
<!-- ====================================================================== -->
|
|
<html>
|
|
<head>
|
|
<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1"/>
|
|
|
|
|
|
|
|
|
|
|
|
<title>Jakarta Lucene - Lucene TODO List</title>
|
|
</head>
|
|
|
|
<body bgcolor="#ffffff" text="#000000" link="#525D76">
|
|
<table border="0" width="100%" cellspacing="0">
|
|
<!-- TOP IMAGE -->
|
|
<tr>
|
|
<td align="left">
|
|
<a href="http://jakarta.apache.org"><img src="http://jakarta.apache.org/images/jakarta-logo.gif" border="0"/></a>
|
|
</td>
|
|
<td align="right">
|
|
<a href="http://jakarta.apache.org/lucene/"><img src="./images/lucene_green_300.gif" alt="Jakarta Lucene" border="0"/></a>
|
|
</td>
|
|
</tr>
|
|
</table>
|
|
<table border="0" width="100%" cellspacing="4">
|
|
<tr><td colspan="2">
|
|
<hr noshade="" size="1"/>
|
|
</td></tr>
|
|
|
|
<tr>
|
|
<!-- LEFT SIDE NAVIGATION -->
|
|
<td width="20%" valign="top" nowrap="true">
|
|
<p><strong>About</strong></p>
|
|
<ul>
|
|
<li> <a href="./index.html">Overview</a>
|
|
</li>
|
|
<li> <a href="./powered.html">Powered by Lucene</a>
|
|
</li>
|
|
<li> <a href="./whoweare.html">Who We Are</a>
|
|
</li>
|
|
<li> <a href="http://jakarta.apache.org/site/mail.html">Mailing Lists</a>
|
|
</li>
|
|
</ul>
|
|
<p><strong>Resources</strong></p>
|
|
<ul>
|
|
<li> <a href="http://lucene.sourceforge.net/cgi-bin/faq/faqmanager.cgi">FAQ (Official)</a>
|
|
</li>
|
|
<li> <a href="http://www.jguru.com/faq/Lucene">jGuru FAQ</a>
|
|
</li>
|
|
<li> <a href="./gettingstarted.html">Getting Started</a>
|
|
</li>
|
|
<li> <a href="./queryparsersyntax.html">Query Syntax</a>
|
|
</li>
|
|
<li> <a href="./fileformats.html">File Formats</a>
|
|
</li>
|
|
<li> <a href="./api/index.html">Javadoc</a>
|
|
</li>
|
|
<li> <a href="./contributions.html">Contributions</a>
|
|
</li>
|
|
<li> <a href="./resources.html">Articles, etc.</a>
|
|
</li>
|
|
<li> <a href="./benchmarks.html">Benchmarks</a>
|
|
</li>
|
|
<li> <a href="./todo.html">TODO list</a>
|
|
</li>
|
|
<li> <a href="http://nagoya.apache.org/bugzilla/buglist.cgi?bug_status=NEW&bug_status=ASSIGNED&bug_status=REOPENED&email1=&emailtype1=substring&emailassigned_to1=1&email2=&emailtype2=substring&emailreporter2=1&bugidtype=include&bug_id=&changedin=&votes=&chfieldfrom=&chfieldto=Now&chfieldvalue=&product=Lucene&short_desc=%5BPATCH%5D&short_desc_type=allwordssubstr&long_desc=&long_desc_type=allwordssubstr&bug_file_loc=&bug_file_loc_type=allwordssubstr&keywords=&keywords_type=anywords&field0-0-0=noop&type0-0-0=noop&value0-0-0=&cmdtype=doit&order=%27Importance%27">Patches</a>
|
|
</li>
|
|
<li> <a href="http://jakarta.apache.org/site/bugs.html">Bugs</a>
|
|
</li>
|
|
<li> <a href="http://nagoya.apache.org/bugzilla/buglist.cgi?bug_status=NEW&bug_status=ASSIGNED&bug_status=REOPENED&email1=&emailtype1=substring&emailassigned_to1=1&email2=&emailtype2=substring&emailreporter2=1&bugidtype=include&bug_id=&changedin=&votes=&chfieldfrom=&chfieldto=Now&chfieldvalue=&product=Lucene&short_desc=&short_desc_type=allwordssubstr&long_desc=&long_desc_type=allwordssubstr&bug_file_loc=&bug_file_loc_type=allwordssubstr&keywords=&keywords_type=anywords&field0-0-0=noop&type0-0-0=noop&value0-0-0=&cmdtype=doit&order=%27Importance%27">Lucene Bugs</a>
|
|
</li>
|
|
<li> <a href="http://nagoya.apache.org/eyebrowse/SummarizeList?listId=30">Lucene-user</a>
|
|
</li>
|
|
<li> <a href="http://nagoya.apache.org/eyebrowse/SummarizeList?listId=29">Lucene-dev</a>
|
|
</li>
|
|
<li> <a href="./lucene-sandbox/">Lucene Sandbox</a>
|
|
</li>
|
|
</ul>
|
|
<p><strong>Download</strong></p>
|
|
<ul>
|
|
<li> <a href="http://jakarta.apache.org/site/binindex.html">Binaries</a>
|
|
</li>
|
|
<li> <a href="http://jakarta.apache.org/site/sourceindex.html">Source Code</a>
|
|
</li>
|
|
<li> <a href="http://jakarta.apache.org/site/cvsindex.html">CVS Repositories</a>
|
|
</li>
|
|
</ul>
|
|
<p><strong>Jakarta</strong></p>
|
|
<ul>
|
|
<li> <a href="http://jakarta.apache.org/site/getinvolved.html">Get Involved</a>
|
|
</li>
|
|
<li> <a href="http://jakarta.apache.org/site/acknowledgements.html">Acknowledgements</a>
|
|
</li>
|
|
<li> <a href="http://jakarta.apache.org/site/contact.html">Contact</a>
|
|
</li>
|
|
<li> <a href="http://jakarta.apache.org/site/legal.html">Legal</a>
|
|
</li>
|
|
</ul>
|
|
</td>
|
|
<td width="80%" align="left" valign="top">
|
|
<table border="0" cellspacing="0" cellpadding="2" width="100%">
|
|
<tr><td bgcolor="#525D76">
|
|
<font color="#ffffff" face="arial,helvetica,sanserif">
|
|
<a name="Purpose"><strong>Purpose</strong></a>
|
|
</font>
|
|
</td></tr>
|
|
<tr><td>
|
|
<blockquote>
|
|
<p>
|
|
This document describes the list of tasks on
|
|
the plates of the Lucene development team. Tasks are assigned into two
|
|
categories: core or non-core.
|
|
</p>
|
|
</blockquote>
|
|
</p>
|
|
</td></tr>
|
|
<tr><td><br/></td></tr>
|
|
</table>
|
|
<table border="0" cellspacing="0" cellpadding="2" width="100%">
|
|
<tr><td bgcolor="#525D76">
|
|
<font color="#ffffff" face="arial,helvetica,sanserif">
|
|
<a name="About Core vs. Non-Core Development"><strong>About Core vs. Non-Core Development</strong></a>
|
|
</font>
|
|
</td></tr>
|
|
<tr><td>
|
|
<blockquote>
|
|
<p>
|
|
Currently the Lucene development team is working on
|
|
categorizing change requests into <b>core</b> and <b>non-core</b>
|
|
changes.
|
|
</p>
|
|
<p>
|
|
Core changes would entail a change to the search engine
|
|
core itself. From Doug Cutting:
|
|
<blockquote><i>
|
|
"Examples include: file locking to make things
|
|
multi-process safe; adding an API for boosting individual documents and fields
|
|
values; making the scoring API extensible and public; etc."
|
|
</i></blockquote>
|
|
</p>
|
|
<p>
|
|
Non-core changes would not affect the search engine
|
|
itself, but would consist instead of projects or components that would
|
|
make useful
|
|
additions to the core framework. Again, from Doug
|
|
Cutting:
|
|
<blockquote><i>
|
|
"[Examples] include: support for more languages; query
|
|
parsers; database storage; crawlers, etc. Whether these belong in the
|
|
base distribution is a matter of debate (sometimes hot). My rule of
|
|
thumb for including them is their generality: if they are likely to be
|
|
useful to a large proportion of Lucene users then they should probably go
|
|
in the base distribution. Language support in particular is tricky.
|
|
Perhaps we should migrate to a model where the base distribution
|
|
includes no analyzers, and supply separate language packages."
|
|
</i></blockquote>
|
|
</p>
|
|
<p>
|
|
Change requests will be categorically defined by the
|
|
development team (committers) as core or non-core, and a committer will
|
|
be assigned responsibility for
|
|
coordinating development of the change request. All
|
|
change requests should be submitted to one of the Lucene mailing lists, or
|
|
through
|
|
the <a href="http://issues.apache.org/bugzilla/">Apache
|
|
Bugzilla</a> database.
|
|
</p>
|
|
</blockquote>
|
|
</p>
|
|
</td></tr>
|
|
<tr><td><br/></td></tr>
|
|
</table>
|
|
<table border="0" cellspacing="0" cellpadding="2" width="100%">
|
|
<tr><td bgcolor="#525D76">
|
|
<font color="#ffffff" face="arial,helvetica,sanserif">
|
|
<a name="Core Development Changes"><strong>Core Development Changes</strong></a>
|
|
</font>
|
|
</td></tr>
|
|
<tr><td>
|
|
<blockquote>
|
|
<p>
|
|
<i>No change requests classified as core yet!</i>
|
|
</p>
|
|
</blockquote>
|
|
</p>
|
|
</td></tr>
|
|
<tr><td><br/></td></tr>
|
|
</table>
|
|
<table border="0" cellspacing="0" cellpadding="2" width="100%">
|
|
<tr><td bgcolor="#525D76">
|
|
<font color="#ffffff" face="arial,helvetica,sanserif">
|
|
<a name="Non-Core Development Changes"><strong>Non-Core Development Changes</strong></a>
|
|
</font>
|
|
</td></tr>
|
|
<tr><td>
|
|
<blockquote>
|
|
<p>
|
|
<i>No change requests classified as non-core yet!</i>
|
|
</p>
|
|
</blockquote>
|
|
</p>
|
|
</td></tr>
|
|
<tr><td><br/></td></tr>
|
|
</table>
|
|
<table border="0" cellspacing="0" cellpadding="2" width="100%">
|
|
<tr><td bgcolor="#525D76">
|
|
<font color="#ffffff" face="arial,helvetica,sanserif">
|
|
<a name="Unclassified Changes"><strong>Unclassified Changes</strong></a>
|
|
</font>
|
|
</td></tr>
|
|
<tr><td>
|
|
<blockquote>
|
|
<p>
|
|
<table cellpadding="5">
|
|
<tr>
|
|
<th valign="top">Name</th>
|
|
<th valign="top">Description</th>
|
|
<th valign="top">Links</th>
|
|
</tr>
|
|
<tr>
|
|
<td valign="top">Term Vector support</td>
|
|
<td valign="top" />
|
|
<td valign="top">
|
|
<ul>
|
|
<li><a href="http://nagoya.apache.org/eyebrowse/ReadMsg?listName=lucene-dev@jakarta.apache.org&msgNo=273">http://nagoya.apache.org/eyebrowse/ReadMsg?listName=lucene-dev@jakarta.apache.org&msgNo=273</a></li>
|
|
<li><a href="http://nagoya.apache.org/eyebrowse/ReadMsg?listName=lucene-dev@jakarta.apache.org&msgNo=272">http://nagoya.apache.org/eyebrowse/ReadMsg?listName=lucene-dev@jakarta.apache.org&msgNo=272</a></li>
|
|
</ul>
|
|
</td>
|
|
</tr>
|
|
<tr>
|
|
<td valign="top">Support for Search Term Highlighting</td>
|
|
<td valign="top" />
|
|
<td valign="top">
|
|
<ul>
|
|
<li><a href="http://www.geocrawler.org/archives/3/2624/2001/9/50/6553088/">http://www.geocrawler.org/archives/3/2624/2001/9/50/6553088/</a></li>
|
|
<li><a href="http://nagoya.apache.org/eyebrowse/ReadMsg?listName=lucene-dev@jakarta.apache.org&msgId=115271">http://nagoya.apache.org/eyebrowse/ReadMsg?listName=lucene-dev@jakarta.apache.org&msgId=115271</a></li>
|
|
<li><a href="http://www.iq-computing.de/index.asp?menu=projekte-lucene-highlight">http://www.iq-computing.de/index.asp?menu=projekte-lucene-highlight</a></li>
|
|
<li><a href="http://nagoya.apache.org/eyebrowse/BrowseList?listName=lucene-dev@jakarta.apache.org&by=thread&from=56403">http://nagoya.apache.org/eyebrowse/BrowseList?listName=lucene-dev@jakarta.apache.org&by=thread&from=56403</a></li>
|
|
</ul>
|
|
</td>
|
|
</tr>
|
|
<tr>
|
|
<td valign="top">Better support for hits sorted by things other
|
|
than score.</td>
|
|
<td valign="top"> An easy, efficient case is to support results
|
|
sorted by the order documents were
|
|
added to the index. A little harder and less efficient is support
|
|
for
|
|
results sorted by an arbitrary field.
|
|
</td>
|
|
<td valign="top">
|
|
<ul>
|
|
<li><a href="http://nagoya.apache.org/eyebrowse/ReadMsg?listName=lucene-dev@jakarta.apache.org&msgId=114756">http://nagoya.apache.org/eyebrowse/ReadMsg?listName=lucene-dev@jakarta.apache.org&msgId=114756</a></li>
|
|
<li><a href="http://www.mail-archive.com/lucene-user@jakarta.apache.org/msg00228.html">http://www.mail-archive.com/lucene-user@jakarta.apache.org/msg00228.html</a></li>
|
|
</ul>
|
|
</td>
|
|
</tr>
|
|
<tr>
|
|
<td valign="top">Add some requested methods:
|
|
IndexReader.getIndexedFields</td>
|
|
<td valign="top">
|
|
String[] IndexReader.getIndexedFields();</td>
|
|
<td valign="top">
|
|
<ul>
|
|
<li><a href="http://nagoya.apache.org/eyebrowse/ReadMsg?listName=lucene-dev@jakarta.apache.org&msgId=330010">http://nagoya.apache.org/eyebrowse/ReadMsg?listName=lucene-dev@jakarta.apache.org&msgId=330010</a></li>
|
|
<li><a href="http://nagoya.apache.org/eyebrowse/ReadMsg?listName=lucene-dev@jakarta.apache.org&msgId=330009">http://nagoya.apache.org/eyebrowse/ReadMsg?listName=lucene-dev@jakarta.apache.org&msgId=330009</a></li>
|
|
</ul>
|
|
</td>
|
|
</tr>
|
|
<tr>
|
|
<td valign="top">Add lastModified() method to Directory,
|
|
FSDirectory and RamDirectory, so it could be cached in IndexWriter/Searcher
|
|
manager.</td>
|
|
<td valign="top" />
|
|
<td valign="top" />
|
|
</tr>
|
|
<tr>
|
|
<td valign="top">Support for adding more than 1 term to the same
|
|
position.</td>
|
|
<td valign="top">N.B. I think the Finnish lady already
|
|
implemented this. It required some pieces of Lucene to be modified. (OG).</td>
|
|
<td valign="top" />
|
|
</tr>
|
|
<tr>
|
|
<td valign="top">The ability to retrieve the number of occurrences
|
|
not only for a term but also for a Phrase.</td>
|
|
<td valign="top" />
|
|
<td valign="top">
|
|
<ul>
|
|
<li><a href="http://www.mail-archive.com/lucene-dev@jakarta.apache.org/msg00101.html">http://www.mail-archive.com/lucene-dev@jakarta.apache.org/msg00101.html</a></li>
|
|
</ul></td>
|
|
</tr>
|
|
<tr>
|
|
<td valign="top">Che Dong's CJKTokenizer for Chinese, Japanese,
|
|
and Korean.</td>
|
|
<td valign="top" />
|
|
<td valign="top">
|
|
<ul>
|
|
<li><a href="http://nagoya.apache.org/eyebrowse/ReadMsg?listName=lucene-dev@jakarta.apache.org&msgId=330905">http://nagoya.apache.org/eyebrowse/ReadMsg?listName=lucene-dev@jakarta.apache.org&msgId=330905</a></li>
|
|
</ul>
|
|
</td>
|
|
</tr>
|
|
<tr>
|
|
<td valign="top">Selecting a language-specific analyzer according
|
|
to a locale.</td>
|
|
<td valign="top"> Now we rewrite parts of Lucene code in order
|
|
to use another analyzer. It will be useful to select analyzer without
|
|
touching code.</td>
|
|
<td valign="top" />
|
|
</tr>
|
|
<tr>
|
|
<td valign="top">Adding "-encoding" option and encoding-sensitive
|
|
methods to tools.</td>
|
|
<td valign="top"> Current tools needs minor changes on a
|
|
Japanese (and other language) environment: adding an "-encode" option and
|
|
argument, using Reader/Writer classes instead of InputStream/OutputStream
|
|
classes, etc.</td>
|
|
<td valign="top" />
|
|
</tr>
|
|
</table>
|
|
</p>
|
|
</blockquote>
|
|
</p>
|
|
</td></tr>
|
|
<tr><td><br/></td></tr>
|
|
</table>
|
|
</td>
|
|
</tr>
|
|
|
|
<!-- FOOTER -->
|
|
<tr><td colspan="2">
|
|
<hr noshade="" size="1"/>
|
|
</td></tr>
|
|
<tr><td colspan="2">
|
|
<div align="center"><font color="#525D76" size="-1"><em>
|
|
Copyright © 1999-2002, Apache Software Foundation
|
|
</em></font></div>
|
|
</td></tr>
|
|
</table>
|
|
</body>
|
|
</html>
|
|
<!-- end the processing -->
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|