mirror of https://github.com/apache/lucene.git
$Id$

2002-06-18 (cmarschner)
* added an experimental version of Lucene storage; see FetcherMain.java for details on how to use it.
  LuceneStorage simply saves all fields as specified in WebDocument. Add a converter to the
  storage pipeline before LuceneStorage to do preprocessing.

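The entry above places a converter ahead of the final storage in a pipeline. A minimal sketch of that arrangement follows. Every type in it (SketchDocument, SketchStage, SketchPipeline, TrimConverter, RecordingStorage) is a hypothetical stand-in invented for illustration; none of them is the actual larm API, and the real WebDocument, StoragePipeline and LuceneStorage classes may look quite different.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical stand-in for a crawled document: a bag of named fields. */
final class SketchDocument {
    final Map<String, String> fields = new LinkedHashMap<>();
}

/** Hypothetical stage interface; returning null drops the document. */
interface SketchStage {
    SketchDocument store(SketchDocument doc);
}

/** Runs stages in insertion order, handing each result to the next stage. */
final class SketchPipeline implements SketchStage {
    private final List<SketchStage> stages = new ArrayList<>();

    void addStage(SketchStage stage) {
        stages.add(stage);
    }

    @Override
    public SketchDocument store(SketchDocument doc) {
        for (SketchStage stage : stages) {
            if (doc == null) {
                return null; // an earlier stage dropped the document
            }
            doc = stage.store(doc);
        }
        return doc;
    }
}

/** Example converter (assumed preprocessing): trims whitespace in every field. */
final class TrimConverter implements SketchStage {
    @Override
    public SketchDocument store(SketchDocument doc) {
        doc.fields.replaceAll((name, value) -> value.trim());
        return doc;
    }
}

/** Stand-in for the final storage: records a copy of all fields it is handed. */
final class RecordingStorage implements SketchStage {
    final List<Map<String, String>> saved = new ArrayList<>();

    @Override
    public SketchDocument store(SketchDocument doc) {
        saved.add(new LinkedHashMap<>(doc.fields));
        return doc;
    }
}
```

Because the converter is added before the final stage, that stage only ever sees preprocessed fields and can save them unchanged, which is the division of labour the entry describes.
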
2002-06-17 (cmarschner)
* moved HostInfo and HostManager to the larm.net package
* included URLNormalizer (todo: source code docs)
* changed filters to use normalized URLs when appropriate;
  logs now contain the normalized versions of referer and URL
  (todo: change description of log format in technical_overview.rtf)

2002-06-01 (cmarschner)
* divided Storage into LinkStorage and DocumentStorage
* introduced StoragePipeline, made MessageHandler a LinkStorage. Fetcher now stores everything in storages
* removed a couple of unused classes
  now everything's prepared for a LuceneStorage
* added build.xml by Mehran Mehr

2002-05-23 (cmarschner)
* removed 0x0d0d from the source files (Otis?)
* included the Apache License in all of the source files in de.lanlab.larm.* directories
* added anchor text deparsing to the Tokenizer
* split store.log into two files:
  - store.log contains the page file index: <referer> <URL> <ResultCode> <MimeType> <Size> <Title> <PageFileNo> <PageFileOffset>
  - links.log contains link information: <referer> <URL> <isFrame> <AnchorText>
* changed lib to libs in the startup scripts
* added .bat files for Windows
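
The two log layouts listed in the 2002-05-23 entry can be read back with a small parser. The sketch below is only an illustration: the class names are hypothetical, and it ASSUMES tab-separated fields, since the changelog does not name the delimiter. Fields whose encoding is not documented here (ResultCode, isFrame) are kept as raw strings rather than guessed at.

```java
/** Hypothetical reader for one store.log line; field order follows the changelog. */
final class StoreLogRecord {
    final String referer;
    final String url;
    final String resultCode;   // kept raw: its encoding is not documented here
    final String mimeType;
    final long size;
    final String title;
    final int pageFileNo;
    final long pageFileOffset;

    private StoreLogRecord(String[] f) {
        referer = f[0];
        url = f[1];
        resultCode = f[2];
        mimeType = f[3];
        size = Long.parseLong(f[4]);
        title = f[5];
        pageFileNo = Integer.parseInt(f[6]);
        pageFileOffset = Long.parseLong(f[7]);
    }

    /** ASSUMPTION: fields are separated by tabs. */
    static StoreLogRecord parse(String line) {
        String[] f = line.split("\t", -1); // -1 keeps empty trailing fields
        if (f.length != 8) {
            throw new IllegalArgumentException("expected 8 fields, got " + f.length);
        }
        return new StoreLogRecord(f);
    }
}

/** Hypothetical reader for one links.log line. */
final class LinksLogRecord {
    final String referer;
    final String url;
    final String isFrame;      // kept raw: its encoding is not documented here
    final String anchorText;

    private LinksLogRecord(String[] f) {
        referer = f[0];
        url = f[1];
        isFrame = f[2];
        anchorText = f[3];
    }

    /** ASSUMPTION: fields are separated by tabs. */
    static LinksLogRecord parse(String line) {
        String[] f = line.split("\t", -1);
        if (f.length != 4) {
            throw new IllegalArgumentException("expected 4 fields, got " + f.length);
        }
        return new LinksLogRecord(f);
    }
}
```

Titles and anchor texts can contain spaces, which is why the sketch does not split on whitespace; if the real logs use a different separator, only the split pattern needs to change.
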