

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "">
<!-- Content Stylesheet for Site -->
<!-- start the processing -->
<!-- ====================================================================== -->
<!-- Main Page Section -->
<!-- ====================================================================== -->
<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1"/>
<title>Jakarta Lucene - Lucene TODO List</title>
<body bgcolor="#ffffff" text="#000000" link="#525D76">
<table border="0" width="100%" cellspacing="0">
<!-- TOP IMAGE -->
<td align="left">
<a href=""><img src="" border="0"/></a>
<td align="right">
<a href=""><img src="./images/lucene_green_300.gif" alt="Jakarta Lucene" border="0"/></a>
<table border="0" width="100%" cellspacing="4">
<tr><td colspan="2">
<hr noshade="" size="1"/>
<td width="20%" valign="top" nowrap="true">
<li> <a href="./index.html">Overview</a>
<li> <a href="./powered.html">Powered by Lucene</a>
<li> <a href="./whoweare.html">Who We Are</a>
<li> <a href="">Mailing Lists</a>
<li> <a href="">FAQ (Official)</a>
<li> <a href="">jGuru FAQ</a>
<li> <a href="./gettingstarted.html">Getting Started</a>
<li> <a href="./queryparsersyntax.html">Query Syntax</a>
<li> <a href="./fileformats.html">File Formats</a>
<li> <a href="./api/index.html">Javadoc</a>
<li> <a href="./contributions.html">Contributions</a>
<li> <a href="./resources.html">Articles, etc.</a>
<li> <a href="./benchmarks.html">Benchmarks</a>
<li> <a href="./todo.html">TODO list</a>
<li> <a href="">Patches</a>
<li> <a href="">Bugs</a>
<li> <a href="">Lucene Bugs</a>
<li> <a href="">Lucene-user</a>
<li> <a href="">Lucene-dev</a>
<li> <a href="./lucene-sandbox/">Lucene Sandbox</a>
<li> <a href="">Binaries</a>
<li> <a href="">Source Code</a>
<li> <a href="">CVS Repositories</a>
<li> <a href="">Get Involved</a>
<li> <a href="">Acknowledgements</a>
<li> <a href="">Contact</a>
<li> <a href="">Legal</a>
<td width="80%" align="left" valign="top">
<table border="0" cellspacing="0" cellpadding="2" width="100%">
<tr><td bgcolor="#525D76">
<font color="#ffffff" face="arial,helvetica,sanserif">
<a name="Purpose"><strong>Purpose</strong></a>
This document lists the tasks currently on
the plates of the Lucene development team. Tasks are divided into two
categories: core and non-core.
<table border="0" cellspacing="0" cellpadding="2" width="100%">
<tr><td bgcolor="#525D76">
<font color="#ffffff" face="arial,helvetica,sanserif">
<a name="About Core vs. Non-Core Development"><strong>About Core vs. Non-Core Development</strong></a>
Currently the Lucene development team is working on
categorizing change requests into <b>core</b> and <b>non-core</b>.
Core changes would entail a change to the search engine
core itself. From Doug Cutting:
"Examples include: file locking to make things
multi-process safe; adding an API for boosting individual documents and fields
values; making the scoring API extensible and public; etc."
Non-core changes would not affect the search engine
itself, but would instead consist of projects or components that
make useful additions to the core framework. Again, from Doug:
"[Examples] include: support for more languages; query
parsers; database storage; crawlers, etc. Whether these belong in the
base distribution is a matter of debate (sometimes hot). My rule of
thumb for including them is their generality: if they are likely to be
useful to a large proportion of Lucene users then they should probably go
in the base distribution. Language support in particular is tricky.
Perhaps we should migrate to a model where the base distribution
includes no analyzers, and supply separate language packages."
The development team (committers) will classify each change request
as core or non-core, and a committer will
be assigned responsibility for
coordinating development of the change request. All
change requests should be submitted to one of the Lucene mailing lists or to
the <a href="">Apache
Bugzilla</a> database.
<table border="0" cellspacing="0" cellpadding="2" width="100%">
<tr><td bgcolor="#525D76">
<font color="#ffffff" face="arial,helvetica,sanserif">
<a name="Core Development Changes"><strong>Core Development Changes</strong></a>
<i>No change requests classified as core yet!</i>
<table border="0" cellspacing="0" cellpadding="2" width="100%">
<tr><td bgcolor="#525D76">
<font color="#ffffff" face="arial,helvetica,sanserif">
<a name="Non-Core Development Changes"><strong>Non-Core Development Changes</strong></a>
<i>No change requests classified as non-core yet!</i>
<table border="0" cellspacing="0" cellpadding="2" width="100%">
<tr><td bgcolor="#525D76">
<font color="#ffffff" face="arial,helvetica,sanserif">
<a name="Unclassified Changes"><strong>Unclassified Changes</strong></a>
<table cellpadding="5">
<th valign="top">Name</th>
<th valign="top">Description</th>
<th valign="top">Links</th>
<td valign="top">Term Vector support</td>
<td valign="top" />
<td valign="top">
<li><a href=";msgNo=273">;msgNo=273</a></li>
<li><a href=";msgNo=272">;msgNo=272</a></li>
<td valign="top">Support for Search Term Highlighting</td>
<td valign="top" />
<td valign="top">
<li><a href=""></a></li>
<li><a href=";msgId=115271">;msgId=115271</a></li>
<li><a href=""></a></li>
<li><a href=";by=thread&amp;from=56403">;by=thread&amp;from=56403</a></li>
<td valign="top">Better support for hits sorted by things other
than score.</td>
<td valign="top"> An easy, efficient case is to support results
sorted by the order in which documents were
added to the index. A little harder and less efficient is support
for results sorted by an arbitrary field.
<td valign="top">
<li><a href=";msgId=114756">;msgId=114756</a></li>
<li><a href=""></a></li>
<td valign="top">Add some requested methods:
<td valign="top">
String[] IndexReader.getIndexedFields();</td>
<td valign="top">
<li><a href=";msgId=330010">;msgId=330010</a></li>
<li><a href=";msgId=330009">;msgId=330009</a></li>
<td valign="top">Add a lastModified() method to Directory,
FSDirectory and RAMDirectory, so that it can be cached in IndexWriter/Searcher.
<td valign="top" />
<td valign="top" />
<td valign="top">Support for adding more than 1 term to the same
<td valign="top">N.B. I think the Finnish lady already
implemented this. It required some pieces of Lucene to be modified. (OG).</td>
<td valign="top" />
<td valign="top">The ability to retrieve the number of occurrences
not only for a term but also for a Phrase.</td>
<td valign="top" />
<td valign="top">
<li><a href=""></a></li>
<td valign="top">Che Dong's CJKTokenizer for Chinese, Japanese,
and Korean.</td>
<td valign="top" />
<td valign="top">
<li><a href=";msgId=330905">;msgId=330905</a></li>
<td valign="top">Selecting a language-specific analyzer according
to a locale.</td>
<td valign="top"> Currently, parts of the Lucene code must be rewritten in order
to use another analyzer. It would be useful to select an analyzer without
touching code.</td>
<td valign="top" />
<td valign="top">Adding an "-encoding" option and encoding-sensitive
methods to tools.</td>
<td valign="top"> The current tools need minor changes to work in a
Japanese (or other non-English) environment: adding an "-encoding" option and
argument, using Reader/Writer classes instead of InputStream/OutputStream
classes, etc.</td>
<td valign="top" />
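As a rough illustration of the last item above, the following minimal sketch shows one way a command-line tool might accept an "-encoding" argument and decode its input through a Reader instead of reading raw bytes. The class name EncodingAwareTool, the helper methods encodingFrom and readAll, and the UTF-8 fallback are assumptions made for this example only; they are not part of Lucene or its tools.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;

// Hypothetical example class; not part of the Lucene distribution.
class EncodingAwareTool {

    /** Returns the value that follows "-encoding" in args, or fallback if the option is absent. */
    public static String encodingFrom(String[] args, String fallback) {
        for (int i = 0; i < args.length - 1; i++) {
            if ("-encoding".equals(args[i])) {
                return args[i + 1];
            }
        }
        return fallback;
    }

    /** Decodes the whole stream into text using the named character encoding. */
    public static String readAll(InputStream in, String encoding) throws IOException {
        StringBuilder text = new StringBuilder();
        // InputStreamReader performs the byte-to-character conversion.
        try (BufferedReader reader =
                 new BufferedReader(new InputStreamReader(in, encoding))) {
            char[] buffer = new char[4096];
            int count;
            while ((count = reader.read(buffer)) != -1) {
                text.append(buffer, 0, count);
            }
        }
        return text.toString();
    }

    public static void main(String[] args) throws IOException {
        // The UTF-8 fallback is an assumption made for this sketch.
        String encoding = encodingFrom(args, "UTF-8");
        // Three Japanese characters, written as escapes so the source file stays ASCII.
        byte[] sample = "\u65e5\u672c\u8a9e".getBytes(encoding);
        String decoded = readAll(new ByteArrayInputStream(sample), encoding);
        System.out.println(decoded.length() + " characters decoded as " + encoding);
    }
}
```

A tool written this way could be invoked with, for example, "-encoding EUC-JP" where the Java runtime provides that charset. Reading through InputStreamReader keeps the decoding step explicit rather than dependent on the platform default.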
<!-- FOOTER -->
<tr><td colspan="2">
<hr noshade="" size="1"/>
<tr><td colspan="2">
<div align="center"><font color="#525D76" size="-1"><em>
Copyright &#169; 1999-2002, Apache Software Foundation
<!-- end the processing -->