mirror of https://github.com/apache/lucene.git
LUCENE-4008: Use pegdown to transform MIGRATE.txt and other text-only files to readable HTML. Please always run `ant documentation` when you have changed anything in those files to check the output.
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1328978 13f79535-47bb-0310-9956-ffa450edef68
parent b534190141
commit a20aa3e0c9
@@ -1,36 +1,37 @@
+# JRE Version Migration Guide
+
 If possible, use the same JRE major version at both index and search time.
 When upgrading to a different JRE major version, consider re-indexing.

 Different JRE major versions may implement different versions of Unicode,
 which will change the way some parts of Lucene treat your text.

-For example: with Java 1.4, LetterTokenizer will split around the character U+02C6,
+For example: with Java 1.4, `LetterTokenizer` will split around the character U+02C6,
 but with Java 5 it will not.
 This is because Java 1.4 implements Unicode 3, but Java 5 implements Unicode 4.

 For reference, JRE major versions with their corresponding Unicode versions:
-Java 1.4, Unicode 3.0
-Java 5, Unicode 4.0
-Java 6, Unicode 4.0
-Java 7, Unicode 6.0
+
+* Java 1.4, Unicode 3.0
+* Java 5, Unicode 4.0
+* Java 6, Unicode 4.0
+* Java 7, Unicode 6.0

 In general, whether or not you need to re-index largely depends upon the data that
 you are searching, and what was changed in any given Unicode version. For example,
 if you are completely sure that your content is limited to the "Basic Latin" range
 of Unicode, you can safely ignore this.

-Special Notes:
+## Special Notes: LUCENE 2.9 TO 3.0, JAVA 1.4 TO JAVA 5 TRANSITION

-LUCENE 2.9 TO 3.0, JAVA 1.4 TO JAVA 5 TRANSITION
-
-* StandardAnalyzer will return the same results under Java 5 as it did under
+* `StandardAnalyzer` will return the same results under Java 5 as it did under
 Java 1.4. This is because it is largely independent of the runtime JRE for
 Unicode support, (with the exception of lowercasing). However, no changes to
 casing have occurred in Unicode 4.0 that affect StandardAnalyzer, so if you are
 using this Analyzer you are NOT affected.

-* SimpleAnalyzer, StopAnalyzer, LetterTokenizer, LowerCaseFilter, and
-LowerCaseTokenizer may return different results, along with many other Analyzers
-and TokenStreams in Lucene's analysis modules. If you are using one of these
+* `SimpleAnalyzer`, `StopAnalyzer`, `LetterTokenizer`, `LowerCaseFilter`, and
+`LowerCaseTokenizer` may return different results, along with many other `Analyzer`s
+and `TokenStream`s in Lucene's analysis modules. If you are using one of these
 components, you may be affected.
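The guide in the hunk above attributes the `LetterTokenizer` difference to how each JRE classifies U+02C6. A minimal sketch of how one might check that classification on the running JRE follows. The class name `JreUnicodeCheck` and the method `describe` are hypothetical names chosen for illustration; only `java.lang.Character` and `System.getProperty` are standard API.

```java
// Hypothetical helper (not part of Lucene): reports how the running JRE
// classifies a single code point, which is what letter-based tokenizers rely on.
final class JreUnicodeCheck {

    // Returns e.g. "U+02C6 letter=<true|false> type=<int>" for the given code point.
    static String describe(int codePoint) {
        return String.format("U+%04X letter=%b type=%d",
                codePoint,
                Character.isLetter(codePoint),   // letter according to this JRE's Unicode tables
                Character.getType(codePoint));   // general category constant from java.lang.Character
    }

    public static void main(String[] args) {
        System.out.println("java.version=" + System.getProperty("java.version"));
        System.out.println(describe(0x02C6));
    }
}
```

On Java 5 or later, U+02C6 should be reported as a letter, which matches the guide's statement that `LetterTokenizer` no longer splits around it. Running the same check under both the indexing and the searching JRE is one rough way to spot a mismatch for characters you care about.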
File diff suppressed because it is too large
@@ -1,52 +1,21 @@
-Apache Lucene README file
+# Apache Lucene README file

-INTRODUCTION
+## Introduction

 Lucene is a Java full-text search engine. Lucene is not a complete
 application, but rather a code library and API that can easily be used
 to add search capabilities to applications.

-The Lucene web site is at:
-http://lucene.apache.org/
+* The Lucene web site is at: http://lucene.apache.org/
+* Please join the Lucene-User mailing list by sending a message to:
+java-user-subscribe@lucene.apache.org

-Please join the Lucene-User mailing list by sending a message to:
-java-user-subscribe@lucene.apache.org
-
-Files in a binary distribution:
+## Files in a binary distribution

 Files are organized by module, for example in core/:

-core/lucene-core-XX.jar
+* `core/lucene-core-XX.jar`:
 The compiled core Lucene library.

-Additional modules contain the same structure:
-
-analysis/common/: Analyzers for indexing content in different languages and domains
-analysis/icu/: Analysis integration with ICU (International Components for Unicode)
-analysis/kuromoji/: Analyzer for indexing Japanese
-analysis/morfologik/: Analyzer for indexing Polish
-analysis/phonetic/: Analyzer for indexing phonetic signatures (for sounds-alike search)
-analysis/smartcn/: Analyzer for indexing Chinese
-analysis/stempel/: Analyzer for indexing Polish
-analysis/uima/: Analysis integration with Apache UIMA
-benchmark/: System for benchmarking Lucene
-demo/: Simple example code
-facet/: Faceted indexing and search capabilities
-grouping/: Search result grouping
-highlighter/: Highlights search keywords in results
-join/: Index-time and Query-time joins for normalized content
-memory/: Single-document in memory index implementation
-misc/: Index tools and other miscellaneous code
-queries/: Filters and Queries that add to core Lucene
-queryparser/: Query parsers and parsing framework
-sandbox/: Various third party contributions and new ideas.
-spatial/: Geospatial search
-suggest/: Auto-suggest and Spellchecking support
-test-framework/: Test Framework for testing Lucene-based applications
-
-docs/index.html
-The contents of the Lucene website.
-
-docs/api/index.html
-The Javadoc Lucene API documentation. This includes the core library,
-the test framework, and the demo, as well as all other modules.
+To review the documentation, read the main documentation page, located at:
+`docs/index.html`
@@ -184,11 +184,11 @@
   </target>

   <target name="documentation" description="Generate all documentation"
-    depends="javadocs,changes-to-html,doc-index"/>
+    depends="javadocs,changes-to-html,process-webpages"/>
   <target name="javadoc" depends="javadocs"/>
   <target name="javadocs" description="Generate javadoc" depends="javadocs-lucene-core, javadocs-modules, javadocs-test-framework"/>

-  <target name="doc-index">
+  <target name="process-webpages" depends="resolve-pegdown">
     <pathconvert pathsep="|" dirsep="/" property="buildfiles">
       <fileset dir="." includes="**/build.xml" excludes="build.xml,analysis/*,build/**,tools/**,backwards/**,site/**"/>
     </pathconvert>
@@ -205,6 +205,12 @@
       <param name="buildfiles" expression="${buildfiles}"/>
       <param name="version" expression="${version}"/>
     </xslt>
+
+    <pegdown todir="${javadoc.dir}">
+      <fileset dir="." includes="MIGRATE.txt,JRE_VERSION_MIGRATION.txt"/>
+      <globmapper from="*.txt" to="*.html"/>
+    </pegdown>
+
     <copy todir="${javadoc.dir}">
       <fileset dir="site/html" includes="**/*"/>
     </copy>
@@ -1506,4 +1506,60 @@ ${tests-output}/junit4-*.suites - per-JVM executed suites
       </scp>
     </sequential>
   </macrodef>
+
+  <!-- PEGDOWN macro: Before using depend on the target "resolve-pegdown" -->
+
+  <target name="resolve-pegdown" unless="pegdown.loaded">
+    <ivy:cachepath organisation="org.pegdown" module="pegdown" revision="1.1.0"
+      inline="true" conf="default" type="jar" transitive="true" pathid="pegdown.classpath"/>
+    <property name="pegdown.loaded" value="true"/>
+  </target>
+
+  <macrodef name="pegdown">
+    <attribute name="todir"/>
+    <attribute name="flatten" default="false"/>
+    <attribute name="overwrite" default="false"/>
+    <element name="nested" optional="false" implicit="true"/>
+    <sequential>
+      <copy todir="@{todir}" flatten="@{flatten}" overwrite="@{overwrite}" verbose="true"
+        preservelastmodified="false" encoding="UTF-8" outputencoding="UTF-8"
+      >
+        <filterchain>
+          <tokenfilter>
+            <filetokenizer/>
+            <replaceregex pattern="\b(LUCENE|SOLR)\-\d+\b" replace="[\0](https://issues.apache.org/jira/browse/\0)" flags="gs"/>
+            <scriptfilter language="javascript" classpathref="pegdown.classpath"><![CDATA[
+              importClass(java.lang.StringBuilder);
+              importClass(org.pegdown.PegDownProcessor);
+              importClass(org.pegdown.Extensions);
+              importClass(org.pegdown.FastEncoder);
+              var markdownSource = self.getToken();
+              var title = undefined;
+              if (markdownSource.search(/^(#+\s*)?(.+)[\n\r]/) == 0) {
+                title = RegExp.$2;
+                // Convert the first line into a markdown heading, if it is not already:
+                if (RegExp.$1 == '') {
+                  markdownSource = '# ' + markdownSource;
+                }
+              }
+              var processor = new PegDownProcessor(
+                Extensions.ABBREVIATIONS | Extensions.AUTOLINKS |
+                Extensions.FENCED_CODE_BLOCKS | Extensions.SMARTS
+              );
+              var html = new StringBuilder('<html>\n<head>\n');
+              if (title) {
+                html.append('<title>').append(FastEncoder.encode(title)).append('</title>\n');
+              }
+              html.append('<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">\n')
+                .append('</head>\n<body>\n')
+                .append(processor.markdownToHtml(markdownSource))
+                .append('\n</body>\n</html>\n');
+              self.setToken(html.toString());
+            ]]></scriptfilter>
+          </tokenfilter>
+        </filterchain>
+        <nested/>
+      </copy>
+    </sequential>
+  </macrodef>
 </project>
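The `pegdown` macro above does two text-level steps before handing the source to pegdown: it turns issue keys into Markdown links, and it reads the first line as the page title, promoting it to a heading when it is not one already. The sketch below restates those two steps in plain Java so they can be tried in isolation. It is an approximation under stated assumptions, not the build's code: the class `MarkdownPreprocess` and its method names are hypothetical, and the pegdown call itself is left out so the example needs only the JDK.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical stand-alone restatement of the macro's regex steps (names are illustrative).
final class MarkdownPreprocess {
    // Same pattern as the <replaceregex> element: LUCENE-123 or SOLR-456 as whole words.
    private static final Pattern ISSUE = Pattern.compile("\\b(LUCENE|SOLR)-\\d+\\b");
    // Same pattern as the script filter: optional '#' prefix, then the first line's text.
    private static final Pattern FIRST_LINE = Pattern.compile("^(#+\\s*)?(.+)[\\n\\r]");

    // Wraps each issue key in a Markdown link to the Apache JIRA; $0 is the whole match.
    static String linkIssues(String text) {
        return ISSUE.matcher(text).replaceAll("[$0](https://issues.apache.org/jira/browse/$0)");
    }

    // Returns the first line without its heading marker, or null if the pattern does not match.
    static String title(String markdown) {
        Matcher m = FIRST_LINE.matcher(markdown);
        return m.lookingAt() ? m.group(2) : null;
    }

    // Prepends "# " when the first line is not already a Markdown heading.
    static String ensureHeading(String markdown) {
        Matcher m = FIRST_LINE.matcher(markdown);
        return (m.lookingAt() && m.group(1) == null) ? "# " + markdown : markdown;
    }

    public static void main(String[] args) {
        String source = "Apache Lucene README file\n\nSee LUCENE-4008 for details.\n";
        System.out.println(title(source));
        System.out.println(ensureHeading(linkIssues(source)));
    }
}
```

One design point worth noting: in the macro, the link replacement runs before the script filter, so an issue key on the first line would already be a Markdown link by the time the title is extracted.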
@@ -37,11 +37,14 @@
   <body>
     <div><img src="lucene_green_300.gif"/></div>
     <h1><xsl:text>Apache Lucene </xsl:text><xsl:value-of select="$version"/><xsl:text> Documentation</xsl:text></h1>
+    <p>Lucene is a Java full-text search engine. Lucene is not a complete application,
+    but rather a code library and API that can easily be used to add search capabilities
+    to applications.</p>
     <p>
     This is the official documentation for <b><xsl:text>Apache Lucene </xsl:text>
     <xsl:value-of select="$version"/></b>. Additional documentation is available in the
     <a href="http://wiki.apache.org/lucene-java">Wiki</a>.
-    </p>
+    </p>
     <h2>Getting Started</h2>
     <p>The following section is intended as a "getting started" guide. It has three
     audiences: first-time users looking to install Apache Lucene in their
@@ -60,6 +63,8 @@
     <h2>Reference Documents</h2>
     <ul>
       <li><a href="changes/Changes.html">Changes</a>: List of changes in this release.</li>
+      <li><a href="MIGRATE.html">Migration Guide</a>: What changed in Lucene 4; how to migrate code from Lucene 3.x.</li>
+      <li><a href="JRE_VERSION_MIGRATION.html">JRE Version Migration</a>: Information about upgrading between major JRE versions.</li>
       <li><a href="fileformats.html">File Formats</a>: Guide to the index format used by Lucene.</li>
       <li><a href="core/org/apache/lucene/search/package-summary.html#package_description">Search and Scoring in Lucene</a>: Introduction to how Lucene scores documents.</li>
       <li><a href="core/org/apache/lucene/search/similarities/TFIDFSimilarity.html">Classic Scoring Formula</a>: Formula of Lucene's classic <a href="http://en.wikipedia.org/wiki/Vector_Space_Model">Vector Space</a> implementation. (look <a href="core/org/apache/lucene/search/similarities/package-summary.html#package_description">here</a> for other models)</li>