mirror of
https://github.com/apache/lucene.git
synced 2025-02-06 10:08:58 +00:00
be4680abb8
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1680973 13f79535-47bb-0310-9956-ffa450edef68
30 lines
1.3 KiB
Plaintext
#
# This is a sample user dictionary for Kuromoji (JapaneseTokenizer)
#
# Add entries to this file in order to override the statistical model in terms
# of segmentation, readings and part-of-speech tags. Notice that entries do
# not have weights since they are always used when found. This is by design
# in order to maximize ease of use.
#
# Entries are defined using the following CSV format:
# <text>,<token 1> ... <token n>,<reading 1> ... <reading n>,<part-of-speech tag>
#
# Notice that a single half-width space separates tokens and readings, and
# that the number of tokens and readings must match exactly.
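#
# As a hypothetical illustration of the format (commented out, so it is not
# loaded), an entry that splits one surface form into two tokens might look like:
#
#   東京スカイツリー,東京 スカイツリー,トウキョウ スカイツリー,カスタム名詞
#
# Here <text> is 東京スカイツリー, the two tokens 東京 and スカイツリー pair with
# the two readings トウキョウ and スカイツリー, and カスタム名詞 is the
# part-of-speech tag.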
#
# Also notice that the behavior of multiple entries with the same <text> is undefined.
#
# Whitespace-only lines are ignored. Comments are not allowed on entry lines.
#
# Custom segmentation for kanji compounds
日本経済新聞,日本 経済 新聞,ニホン ケイザイ シンブン,カスタム名詞
関西国際空港,関西 国際 空港,カンサイ コクサイ クウコウ,カスタム名詞
# Custom segmentation for compound katakana
トートバッグ,トート バッグ,トート バッグ,かずカナ名詞
ショルダーバッグ,ショルダー バッグ,ショルダー バッグ,かずカナ名詞
# Custom reading for former sumo wrestler
朝青龍,朝青龍,アサショウリュウ,カスタム人名