lucene/CHANGES.txt

Solr Change Log
$Id$

New Features
 1. added support for setting Lucene's positionIncrementGap
 2. Admin: new statistics for SolrIndexSearcher
 3. Admin: caches now show config params on stats page
 3. max() function added to FunctionQuery suite
 4. postOptimize hook, mirroring the functionallity of the postCommit hook,
    but only called on an index optimize.
 5. Ability to HTTP POST query requests to /select in addition to HTTP-GET
 6. The default search field may now be overridden by requests to the
    standard request handler using the df query parameter. (Erik Hatcher)
 7. Added DisMaxRequestHandler and SolrPluginUtils. (Chris Hostetter)
 8. Support for customizing the QueryResponseWriter per request
    (Mike Baranczak / SOLR-16 / hossman)
 9. Added KeywordTokenizerFactory (hossman)
10. copyField accepts dynamicfield-like names as the source.
    (Darren Erik Vengroff via yonik, SOLR-21)
11. new DocSet.andNot(), DocSet.andNotSize() (yonik)
12. Ability to store term vectors for fields. (Mike Klaas via yonik, SOLR-23)
13. New abstract BufferedTokenStream for people who want to write
    Tokenizers or TokenFilters that require arbitrary buffering of the
    stream. (SOLR-11 / yonik, hossman)
14. New RemoveDuplicatesToken - useful in situations where
    synonyms, stemming, or word-deliminater-ing produce identical tokens at
    the same position. (SOLR-11 / yonik, hossman)
15. Added highlighting to SolrPluginUtils and implemented in StandardRequestHandler
    and DisMaxRequestHandler (SOLR-24 / Mike Klaas via hossman,yonik)
16. SnowballPorterFilterFactory language is configurable via the "language"
    attribute, with the default being "English".  (Bertrand Delacretaz via yonik, SOLR-27)
17. ISOLatin1AccentFilterFactory, instantiates ISOLatin1AccentFilter to remove accents.
    (Bertrand Delacretaz via yonik, SOLR-28)
18. JSON, Python, Ruby QueryResponseWriters: use wt="json", "python" or "ruby"
    (yonik, SOLR-31)
19. Make web admin pages return UTF-8, change Content-type declaration to include a
    space between the mime-type and charset (Philip Jacob, SOLR-35)
20. Made query parser default operator configurable via schema.xml:
         <solrQueryParser defaultOperator="AND|OR"/>
    The default operator remains "OR".
21. JAVA API: new version of SolrIndexSearcher.getDocListAndSet() which takes
    flags (Greg Ludington via yonik, SOLR-39)
22. A HyphenatedWordsFilter, a text analysis filter used during indexing to rejoin
    words that were hyphenated and split by a newline. (Boris Vitez via yonik, SOLR-41)
23. Added a CompressableField base class which allows fields of derived types to
    be compressed using the compress=true setting.  The field type also gains the
    ability to specify a size threshold at which field data is compressed.
    (klaas, SOLR-45)
24. Simple faceted search support for fields (enumerating terms)
    and arbitrary queries added to both StandardRequestHandler and
    DisMaxRequestHandler. (hossman, SOLR-44)
25. In addition to specifying default RequestHandler params in the
    solrconfig.xml, support has been added for configuring values to be
    appended to the multi-val request params, as well as for configuring
    invariant params that can not overridden in the query. (hossman, SOLR-46)
26. Default operator for query parsing can now be specified with q.op=AND|OR
    from the client request, overriding the schema value. (ehatcher)
27. New XSLTResponseWriter does server side XSLT processing of XML Response.
    In the process, an init(NamedList) method was added to QueryResponseWriter
    which works the same way as SolrRequestHandler.
    (Bertrand Delacretaz / SOLR-49 / hossman)
28. json.wrf parameter adds a wrapper-function around the JSON response,
    useful in AJAX with dynamic script tags for specifying a JavaScript
    callback function. (Bertrand Delacretaz via yonik, SOLR-56)
29. autoCommit can be specified every so many documents added (klaas, SOLR-65)

Changes in runtime behavior
 1. classes reorganized into different packages, package names changed to Apache
 2. force read of document stored fields in QuerySenderListener
 3. Solr now looks in ./solr/conf for config, ./solr/data for data
    configurable via solr.solr.home system property
 4. Highlighter params changed to be prefixed with "hl."; allow fragmentsize
    customization and per-field overrides on many options
    (Andrew May via klaas, SOLR-37)
 5. Default param values for DisMaxRequestHandler should now be specified
    using a '<lst name="defaults">...</lst>' init param, for backwards
    compatability all init prams will be used as defaults if an init param
    with that name does not exist. (hossman, SOLR-43)
 6. The DisMaxRequestHandler now supports multiple occurances of the "fq"
    param. (hossman, SOLR-44)
 7. FunctionQuery.explain now uses ComplexExplanation to provide more
    accurate score explanations when composed in a BooleanQuery.
    (hossman, SOLR-25)
 8. Document update handling locking is much sparser, allowing performance gains
    through multiple threads.  Large commits also might be faster (klaas, SOLR-65)

Optimizations
 1. getDocListAndSet can now generate both a DocList and a DocSet from a
    single lucene query.
 2. BitDocSet.intersectionSize(HashDocSet) no longer generates an intermediate
    set
 3. OpenBitSet completed, replaces BitSet as the implementation for BitDocSet.
    Iteration is faster, and BitDocSet.intersectionSize(BitDocSet) and unionSize
    is between 3 and 4 times faster. (yonik, SOLR-15)
 4. much faster unionSize when one of the sets is a HashDocSet: O(smaller_set_size)
 5. Optimized getDocSet() for term queries resulting in a 36% speedup of facet.field
    queries where DocSets aren't cached (for example, if the number of terms in the field
    is larger than the filter cache.) (yonik)
 6. Optimized facet.field faceting by as much as 500 times when the field has
    a single token per document (not multiValued & not tokenized) by using the
    Lucene FieldCache entry for that field to tally term counts.  The first request
    utilizing the FieldCache will take longer than subsequent ones.

Bug Fixes
 1. Fixed delete-by-id for field types who's indexed form is different
    from the printable form (mainly sortable numeric types).
 2. Added escaping of attribute values in the XML response (Erik Hatcher)
 3. Added empty extractTerms() to FunctionQuery to enable use in
    a MultiSearcher (Yonik)
 4. WordDelimiterFilter sometimes lost token positionIncrement information
 5. Fix reverse sorting for fields were sortMissingFirst=true
    (Rob Staveley, yonik)
 6. Worked around a Jetty bug that caused invalid XML responses for fields
    containing non ASCII chars.  (Bertrand Delacretaz via yonik, SOLR-32)
 7. WordDelimiterFilter can throw exceptions if configured with both
    generate and catenate off.  (Mike Klaas via yonik, SOLR-34)
 8. Escape '>' in XML output (because ]]> is illegal in CharData)
 9. field boosts weren't being applied and doc boosts were being applied to fields (klaas)
10. Multiple-doc update generates well-formed xml (klaas, SOLR-65)

Other Changes
 1. Upgrade to Lucene 2.0 nightly build 2006-06-22, lucene SVN revision 416224,
    http://svn.apache.org/viewvc/lucene/java/trunk/CHANGES.txt?view=markup&pathrev=416224
 2. Modified admin styles to improve display in Internet Explorer (Greg Ludington via billa, SOLR-6)
 3. Upgrade to Lucene 2.0 nightly build 2006-07-15, lucene SVN revision 422302,
 4. Included unique key field name/value (if available) in log message of add (billa, SOLR-18)
 5. Updated to Lucene 2.0 nightly build 2006-09-07
 6. Added javascript to catch empty query in admin query forms (Tomislav Nakic-Alfirevic via billa, SOLR-48
 7. blackslash escape * in ssh command used in snappuller for zsh compatibility, SOLR-63
 8. check solr return code in admin scripts, SOLR-62

2006/01/17 Solr open sourced, moves to Apache Incubator