22 Commits

Author SHA1 Message Date
Jim Ferenczi
e37a0ef844
Upgrade to lucene-8.0.0-snapshot-67cdd21996 (#35816) 2018-11-22 15:42:59 +01:00
Nik Everett
a45626deb5
Analysis: Wrap at 140 columns (#34494)
Applies our standard column width to all analysis plugins.
2018-10-17 16:17:25 -04:00
Armin Braun
ed3b44fb4c
Handle TokenizerFactory TODOs (#32063)
* Don't replace Replace TokenizerFactory with Supplier, this approach was rejected in #32063 
* Remove unused parameter from constructor
2018-07-17 14:14:02 +02:00
Claudio Bley
7184cf8b5b Fix kuromoji default stoptags (#26600)
Initialize the default stop-tags in `KuromojiPartOfSpeechFilterFactory` if the
`stoptags` are not given in the config. Also adding a test which checks that 
part-of-speech tokens are removed when using the kuromoji_part_of_speech 
filter.
2017-09-15 12:25:09 +02:00
Adrien Grand
eb782492be Remove support for lenient booleans.
Closes #22298
2017-08-28 09:56:01 +02:00
desmorto
292dd8f992 (refactor) some opportunities to use diamond operator (#25585)
* (refactor) some opportunities to use diamond operator

* Update ExceptionRetryIT.java

update typo
2017-08-15 16:36:42 -06:00
Jason Tedor
9781b88a38 Fix deprecation logging for lenient booleans
This commit fixes an issue with deprecation logging for lenient
booleans. The underlying issue is that adding deprecation logging for
lenient booleans added a static deprecation logger to the Settings
class. However, the Settings class is initialized very early and in CLI
tools can be initialized before logging is initialized. This leads to
status logger error messages. Additionally, the deprecation logging for
a lot of the settings does not provide useful context (for example, in
the token filter factories, the deprecation logging only produces the
name of the setting, but gives no context which token filter factory it
comes from). This commit addresses both of these issues by changing the
call sites to push a deprecation logger through to the lenient boolean
parsing.

Relates #22696
2017-01-19 12:30:33 -05:00
Daniel Mitterdorfer
aece89d6a1 Make boolean conversion strict (#22200)
This PR removes all leniency in the conversion of Strings to booleans: "true"
is converted to the boolean value `true`, "false" is converted to the boolean
value `false`. Everything else raises an error.
2017-01-19 07:59:18 +01:00
Tanguy Leroux
44ac5d057a Remove empty javadoc (#20871)
This commit removes as many as empty javadocs comments my regexp has found
2016-10-12 10:27:09 +02:00
Mike McCandless
0ccfe69789 Upgrade to Lucene 6.2.0 2016-08-24 17:26:28 -04:00
Nik Everett
71b95fb63c Switch analysis from push to pull
Instead of plugins calling `registerTokenizer` to extend the analyzer
they now instead have to implement `AnalysisPlugin` and override
`getTokenizer`. This lines up extending plugins in with extending
scripts. This allows `AnalysisModule` to construct the `AnalysisRegistry`
immediately as part of its constructor which makes testing anslysis
much simpler.

This also moves the default analysis configuration into `AnalysisModule`
which is how search is setup.

Like `ScriptModule`, `AnalysisModule` no longer extends `AbstractModule`.
Instead it is only responsible for building `AnslysisRegistry`. We still
bind `AnalysisRegistry` but we only do so in `Node`. This is means it
is available at module construction time so we slowly remove the need to
bind it in guice.
2016-06-26 07:15:42 -04:00
Adrien Grand
7ba5bceebe Add a MultiTermAwareComponent marker interface to analysis factories. #19028
This is the same as what Lucene does for its analysis factories, and we hawe
tests that make sure that the elasticsearch factories are in sync with
Lucene's. This is a first step to move forward on #9978 and #18064.
2016-06-23 10:19:24 +02:00
Ryan Ernst
a4503c2aed Plugins: Remove name() and description() from api
In 2.0 we added plugin descriptors which require defining a name and
description for the plugin. However, we still have name() and
description() which must be overriden from the Plugin class. This still
exists for classpath plugins. But classpath plugins are mainly for
tests, and even then, referring to classpath plugins with their class is
a better idea. This change removes name() and description(), replacing
the name for classpath plugins with the full class name.
2016-06-15 17:12:22 -07:00
Jun Ohtani
a9a0f262af Analysis Kuromoji: Add nbest option and NumberFilter
Add nbest_cost and nbest_examples parameter to KuromojiTokenizerFactory
Add KuromojiNumberFilterFactory
2016-03-22 20:09:56 +09:00
Simon Willnauer
7a53a396e4 Remove Unneded @Inject annotations 2016-03-09 12:10:47 +01:00
Ryan Ernst
4ea19995cf Remove wildcard imports 2015-12-18 12:43:47 -08:00
Simon Willnauer
aa38d053d7 Simplify Analysis registration and configuration
This change moves all the analysis component registration to the node level
and removes the significant API overhead to register tokenfilter, tokenizer,
charfilter and analyzer. All registration is done without guice interaction such
that real factories via functional interfaces are passed instead of class objects
that are instantiated at runtime.

This change also hides the internal analyzer caching that was done previously in the
IndicesAnalysisService entirely and decouples all analysis registration and creation
from dependency injection.
2015-10-30 11:40:18 +01:00
Simon Willnauer
66d5d0c4f2 Replace IndexSettings annotation with a full-fledged class
The @IndexSettings annoationat has been used to differentiate between node-level
and index level settings. It was also decoupled from realtime-updates such that
the settings object that a class got injected when it was created was static and
not subject to change when an update was applied. This change removes the annoation
and replaces it with a full-fledged class that adds type-safety and encapsulates additional
functionality as well as checks on the settings.
2015-10-22 20:43:41 +02:00
Nik Everett
dd8edd4015 Remove one usage of MapBuilder#immutableMap
It was causing the build to fail.
2015-10-05 08:52:09 -04:00
Ryan Ernst
dc1fa6736a Merged AbstractPlugin and Plugin. Also added Settings back to
indexModules and shardModules
2015-08-18 02:46:32 -07:00
Ryan Ernst
2bf84593e0 Plugins: Simplify Plugin API for constructing modules
The Plugin interface currently contains 6 different methods for
adding modules. Elasticsearch has 3 different levels of injectors,
and for each of those, there are two methods. The first takes no
arguments and returns a collection of class objects to construct. The
second takes a Settings object and returns a collection of module
objects already constructed. The settings argument is unecessary because
the plugin can already get the settings from its constructor. Removing
that, the only difference between the two versions is returning an
already constructed Module, or a module Class, and there is no reason
the plugin can't construct all their modules themselves.

This change reduces the plugin api down to just 3 methods for adding
modules. Each returns a Collection<Module>. It also removes the
processModule method, which was unnecessary since onModule
implementations fullfill the same requirement. And finally, it renames
the modules() method to nodeModules() so it is clear these are created
once for each node.
2015-08-17 20:41:45 -07:00
Simon Willnauer
9b41b94459 add analysis-kuromoji module 2015-06-05 13:12:07 +02:00