[DOCS] Add attribute for Lucene analysis links (#51687)

Adds a `lucene-analysis-docs` attribute for the Lucene `/analysis/`
javadocs directory. This should prevent typos and keep the docs DRY.
James Rodewig 2020-01-30 11:22:30 -05:00
parent 2a2a0941af
commit 36b2663e98
24 changed files with 33 additions and 27 deletions
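For context, AsciiDoc substitutes an attribute wherever `{attribute-name}` appears, so links written against the new attribute expand to the full javadoc URL at build time. A minimal sketch of the pattern this commit applies across the analysis pages (the `tr/ApostropheFilter.html` target is one of the pages edited below, shown here for illustration):

```asciidoc
:lucene-analysis-docs: https://lucene.apache.org/core/{lucene_version_path}/analyzers-common/org/apache/lucene/analysis

// At build time this renders as a link to
// .../org/apache/lucene/analysis/tr/ApostropheFilter.html
This filter uses Lucene's
{lucene-analysis-docs}/tr/ApostropheFilter.html[ApostropheFilter].
```

Because every page resolves the same attribute, a path typo can only occur in the short per-filter suffix, not in the long shared prefix.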

@@ -1,6 +1,8 @@
 [[analysis]]
 = Text analysis
+:lucene-analysis-docs: https://lucene.apache.org/core/{lucene_version_path}/analyzers-common/org/apache/lucene/analysis
 [partintro]
 --

@@ -8,8 +8,8 @@ Strips all characters after an apostrophe, including the apostrophe itself.
 This filter is included in {es}'s built-in <<turkish-analyzer,Turkish language
 analyzer>>. It uses Lucene's
-https://lucene.apache.org/core/{lucene_version_path}/analyzers-common/org/apache/lucene/analysis/tr/ApostropheFilter.html[ApostropheFilter],
-which was built for the Turkish language.
+{lucene-analysis-docs}/tr/ApostropheFilter.html[ApostropheFilter], which was
+built for the Turkish language.
 [[analysis-apostrophe-tokenfilter-analyze-ex]]

@@ -9,7 +9,7 @@ Latin Unicode block (first 127 ASCII characters) to their ASCII equivalent, if
 one exists. For example, the filter changes `à` to `a`.
 This filter uses Lucene's
-https://lucene.apache.org/core/{lucene_version_path}/analyzers-common/org/apache/lucene/analysis/miscellaneous/ASCIIFoldingFilter.html[ASCIIFoldingFilter].
+{lucene-analysis-docs}/miscellaneous/ASCIIFoldingFilter.html[ASCIIFoldingFilter].
 [[analysis-asciifolding-tokenfilter-analyze-ex]]
 ==== Example

@@ -9,7 +9,7 @@ Japanese, and Korean) tokens.
 This filter is included in {es}'s built-in <<cjk-analyzer,CJK language
 analyzer>>. It uses Lucene's
-https://lucene.apache.org/core/{lucene_version_path}/analyzers-common/org/apache/lucene/analysis/cjk/CJKBigramFilter.html[CJKBigramFilter].
+{lucene-analysis-docs}/cjk/CJKBigramFilter.html[CJKBigramFilter].
 [[analysis-cjk-bigram-tokenfilter-analyze-ex]]

@@ -14,7 +14,7 @@ characters
 This filter is included in {es}'s built-in <<cjk-analyzer,CJK language
 analyzer>>. It uses Lucene's
-https://lucene.apache.org/core/{lucene_version_path}/analyzers-common/org/apache/lucene/analysis/cjk/CJKWidthFilter.html[CJKWidthFilter].
+{lucene-analysis-docs}/cjk/CJKWidthFilter.html[CJKWidthFilter].
 NOTE: This token filter can be viewed as a subset of NFKC/NFKD Unicode
 normalization. See the

@@ -9,7 +9,7 @@ Performs optional post-processing of terms generated by the
 This filter removes the english possessive (`'s`) from the end of words and
 removes dots from acronyms. It uses Lucene's
-https://lucene.apache.org/core/{lucene_version_path}/analyzers-common/org/apache/lucene/analysis/standard/ClassicFilter.html[ClassicFilter].
+{lucene-analysis-docs}/standard/ClassicFilter.html[ClassicFilter].
 [[analysis-classic-tokenfilter-analyze-ex]]
 ==== Example

@@ -16,7 +16,7 @@ You can use the `common_grams` filter in place of the
 completely ignore common words.
 This filter uses Lucene's
-https://lucene.apache.org/core/{lucene_version_path}/analyzers-common/org/apache/lucene/analysis/commongrams/CommonGramsFilter.html[CommonGramsFilter].
+{lucene-analysis-docs}/commongrams/CommonGramsFilter.html[CommonGramsFilter].
 [[analysis-common-grams-analyze-ex]]
 ==== Example

@@ -8,7 +8,7 @@ Applies a set of token filters to tokens that match conditions in a provided
 predicate script.
 This filter uses Lucene's
-https://lucene.apache.org/core/{lucene_version_path}/analyzers-common/org/apache/lucene/analysis/miscellaneous/ConditionalTokenFilter.html[ConditionalTokenFilter].
+{lucene-analysis-docs}/miscellaneous/ConditionalTokenFilter.html[ConditionalTokenFilter].
 [[analysis-condition-analyze-ex]]
 ==== Example

@@ -8,7 +8,7 @@ Converts all digits in the Unicode `Decimal_Number` General Category to `0-9`.
 For example, the filter changes the Bengali numeral `৩` to `3`.
 This filter uses Lucene's
-https://lucene.apache.org/core/{lucene_version_path}/analyzers-common/org/apache/lucene/analysiscore/DecimalDigitFilter.html[DecimalDigitFilter].
+{lucene-analysis-docs}/core/DecimalDigitFilter.html[DecimalDigitFilter].
 [[analysis-decimal-digit-tokenfilter-analyze-ex]]
 ==== Example

@@ -18,7 +18,7 @@ split `the|1 quick|2 fox|3` into the tokens `the`, `quick`, and `fox`
 with respective payloads of `1`, `2`, and `3`.
 This filter uses Lucene's
-https://lucene.apache.org/core/{lucene_version_path}/analyzers-common/org/apache/lucene/analysis/payloads/DelimitedPayloadTokenFilter.html[DelimitedPayloadTokenFilter].
+{lucene-analysis-docs}/payloads/DelimitedPayloadTokenFilter.html[DelimitedPayloadTokenFilter].
 [NOTE]
 .Payloads

@@ -17,7 +17,7 @@ Uses a specified list of words and a brute force approach to find subwords in
 compound words. If found, these subwords are included in the token output.
 This filter uses Lucene's
-https://lucene.apache.org/core/{lucene_version_path}/analyzers-common/org/apache/lucene/analysis/compound/DictionaryCompoundWordTokenFilter.html[DictionaryCompoundWordTokenFilter],
+{lucene-analysis-docs}/compound/DictionaryCompoundWordTokenFilter.html[DictionaryCompoundWordTokenFilter],
 which was built for Germanic languages.
 [[analysis-dict-decomp-tokenfilter-analyze-ex]]

@@ -13,7 +13,7 @@ For example, you can use the `edge_ngram` token filter to change `quick` to
 When not customized, the filter creates 1-character edge n-grams by default.
 This filter uses Lucene's
-https://lucene.apache.org/core/{lucene_version_path}/analyzers-common/org/apache/lucene/analysis/ngram/EdgeNGramTokenFilter.html[EdgeNGramTokenFilter].
+{lucene-analysis-docs}/ngram/EdgeNGramTokenFilter.html[EdgeNGramTokenFilter].
 [NOTE]
 ====

@@ -22,7 +22,7 @@ Customized versions of this filter are included in several of {es}'s built-in
 * <<italian-analyzer, Italian analyzer>>
 This filter uses Lucene's
-https://lucene.apache.org/core/{lucene_version_path}/analyzers-common/org/apache/lucene/analysis/util/ElisionFilter.html[ElisionFilter].
+{lucene-analysis-docs}/util/ElisionFilter.html[ElisionFilter].
 [[analysis-elision-tokenfilter-analyze-ex]]
 ==== Example

@@ -22,7 +22,7 @@ https://github.com/OpenRefine/OpenRefine/wiki/Clustering-In-Depth#fingerprint[Op
 project].
 This filter uses Lucene's
-https://lucene.apache.org/core/{lucene_version_path}/analyzers-common/org/apache/lucene//analysis/miscellaneous/FingerprintFilter.html[FingerprintFilter].
+{lucene-analysis-docs}/miscellaneous/FingerprintFilter.html[FingerprintFilter].
 [[analysis-fingerprint-tokenfilter-analyze-ex]]
 ==== Example

@@ -9,7 +9,7 @@ words. These subwords are then checked against the specified word list. Subwords
 in the list are excluded from the token output.
 This filter uses Lucene's
-https://lucene.apache.org/core/{lucene_version_path}/analyzers-common/org/apache/lucene/analysis/compound/HyphenationCompoundWordTokenFilter.html[HyphenationCompoundWordTokenFilter],
+{lucene-analysis-docs}/compound/HyphenationCompoundWordTokenFilter.html[HyphenationCompoundWordTokenFilter],
 which was built for Germanic languages.
 [[analysis-hyp-decomp-tokenfilter-analyze-ex]]

@@ -26,7 +26,7 @@ type.
 ====
 This filter uses Lucene's
-https://lucene.apache.org/core/{lucene_version_path}/analyzers-common/org/apache/lucene/analysis/core/TypeTokenFilter.html[TypeTokenFilter].
+{lucene-analysis-docs}/core/TypeTokenFilter.html[TypeTokenFilter].
 [[analysis-keep-types-tokenfilter-analyze-include-ex]]
 ==== Include example

@@ -7,7 +7,7 @@
 Keeps only tokens contained in a specified word list.
 This filter uses Lucene's
-https://lucene.apache.org/core/{lucene_version_path}/analyzers-common/org/apache/lucene/analysis/miscellaneous/KeepWordFilter.html[KeepWordFilter].
+{lucene-analysis-docs}/miscellaneous/KeepWordFilter.html[KeepWordFilter].
 [NOTE]
 ====

@@ -9,7 +9,7 @@ For example, you can use the `length` filter to exclude tokens shorter than 2
 characters and tokens longer than 5 characters.
 This filter uses Lucene's
-https://lucene.apache.org/core/{lucene_version_path}/analyzers-common/org/apache/lucene/analysis/miscellaneous/LengthFilter.html[LengthFilter].
+{lucene-analysis-docs}/miscellaneous/LengthFilter.html[LengthFilter].
 [TIP]
 ====

@@ -12,7 +12,7 @@ example, the filter can change the token stream `[ one, two, three ]` to
 `[ one ]`.
 This filter uses Lucene's
-https://lucene.apache.org/core/{lucene_version_path}/analyzers-common/org/apache/lucene/analysis/miscellaneous/LimitTokenCountFilter.html[LimitTokenCountFilter].
+{lucene-analysis-docs}/miscellaneous/LimitTokenCountFilter.html[LimitTokenCountFilter].
 [TIP]
 ====

@@ -104,13 +104,17 @@ PUT lowercase_example
 (Optional, string)
 Language-specific lowercase token filter to use. Valid values include:
-`greek`::: Uses Lucene's https://lucene.apache.org/core/{lucene_version_path}/analyzers-common/org/apache/lucene/analysis/el/GreekLowerCaseFilter.html[GreekLowerCaseFilter]
+`greek`::: Uses Lucene's
+{lucene-analysis-docs}/el/GreekLowerCaseFilter.html[GreekLowerCaseFilter]
-`irish`::: Uses Lucene's http://lucene.apache.org/core/{lucene_version_path}/analyzers-common/org/apache/lucene/analysis/ga/IrishLowerCaseFilter.html[IrishLowerCaseFilter]
+`irish`::: Uses Lucene's
+http://lucene.apache.org/core/{lucene_version_path}/analyzers-common/org/apache/lucene/analysis/ga/IrishLowerCaseFilter.html[IrishLowerCaseFilter]
-`turkish`::: Uses Lucene's https://lucene.apache.org/core/{lucene_version_path}/analyzers-common/org/apache/lucene/analysis/tr/TurkishLowerCaseFilter.html[TurkishLowerCaseFilter]
+`turkish`::: Uses Lucene's
+{lucene-analysis-docs}/tr/TurkishLowerCaseFilter.html[TurkishLowerCaseFilter]
-If not specified, defaults to Lucene's https://lucene.apache.org/core/{lucene_version_path}/analyzers-common/org/apache/lucene/analysis/core/LowerCaseFilter.html[LowerCaseFilter].
+If not specified, defaults to Lucene's
+{lucene-analysis-docs}/core/LowerCaseFilter.html[LowerCaseFilter].
 --
 [[analysis-lowercase-tokenfilter-customize]]

@@ -11,7 +11,7 @@ For example, you can use the `ngram` token filter to change `fox` to
 `[ f, fo, o, ox, x ]`.
 This filter uses Lucene's
-https://lucene.apache.org/core/{lucene_version_path}/analyzers-common/org/apache/lucene/analysis/ngram/NGramTokenFilter.html[NGramTokenFilter].
+{lucene-analysis-docs}/ngram/NGramTokenFilter.html[NGramTokenFilter].
 [NOTE]
 ====

@@ -12,7 +12,7 @@ such as finding words that end in `-ion` or searching file names by their
 extension.
 This filter uses Lucene's
-https://lucene.apache.org/core/{lucene_version_path}/analyzers-common/org/apache/lucene/analysis/reverse/ReverseStringFilter.html[ReverseStringFilter].
+{lucene-analysis-docs}/reverse/ReverseStringFilter.html[ReverseStringFilter].
 [[analysis-reverse-tokenfilter-analyze-ex]]
 ==== Example

@@ -11,7 +11,7 @@ For example, you can use the `truncate` filter to shorten all tokens to
 `3` characters or fewer, changing `jumping fox` to `jum fox`.
 This filter uses Lucene's
-https://lucene.apache.org/core/{lucene_version_path}/analyzers-common/org/apache/lucene/analysis/miscellaneous/TruncateTokenFilter.html[TruncateTokenFilter].
+{lucene-analysis-docs}/miscellaneous/TruncateTokenFilter.html[TruncateTokenFilter].
 [[analysis-truncate-tokenfilter-analyze-ex]]
 ==== Example

@@ -8,7 +8,7 @@ Changes token text to uppercase. For example, you can use the `uppercase` filter
 to change `the Lazy DoG` to `THE LAZY DOG`.
 This filter uses Lucene's
-https://lucene.apache.org/core/{lucene_version_path}/analyzers-common/org/apache/lucene/analysis/core/UpperCaseFilter.html[UpperCaseFilter].
+{lucene-analysis-docs}/core/UpperCaseFilter.html[UpperCaseFilter].
 [WARNING]
 ====
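Each of the token-filter pages touched by this commit pairs its Lucene javadoc link with an `_analyze` example. As a reminder of what the last hunk's page documents, a request in the docs' own `console` style (illustrative, not part of this commit) exercises the `uppercase` filter:

```console
GET /_analyze
{
  "tokenizer": "standard",
  "filter": ["uppercase"],
  "text": "the Lazy DoG"
}
```

Per the page's own description, this changes `the Lazy DoG` to the tokens `THE`, `LAZY`, and `DOG`.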