$Id$

2002-06-18 (cmarschner)
  * added an experimental version of Lucene storage; see FetcherMain.java
    for details on how to use it. LuceneStorage simply saves all fields as
    specified in WebDocument. Add a converter to the storage pipeline before
    LuceneStorage to do preprocessing

2002-06-17 (cmarschner)
  * moved HostInfo and HostManager to the larm.net package
  * included URLNormalizer (todo: source code docs)
  * changed filters to use normalized URLs where appropriate; logs now
    contain the normalized versions of referer and URL
    (todo: change description of log format in technical_overview.rtf)

2002-06-01 (cmarschner)
  * divided Storage into LinkStorage and DocumentStorage
  * introduced StoragePipeline, made MessageHandler a LinkStorage.
    Fetcher now stores everything in storages
  * removed a couple of unused classes
    now everything is prepared for a LuceneStorage
  * added build.xml by Mehran Mehr

2002-05-23 (cmarschner)
  * removed 0x0d0d from the source files (Otis?)
  * included the Apache License in all of the source files in the
    de.lanlab.larm.* directories
  * added anchor text deparsing to the Tokenizer
  * split store.log into two files:
    - store.log contains the page file index:
      <PageFileNo> <PageFileOffset>
    - links.log contains link information:
      <referer> <URL> <isFrame> <AnchorText>
  * changed lib to libs in the startup scripts
  * added .bat files for Windows