Commit Graph

120 Commits

Author SHA1 Message Date
Simon Willnauer aab4655e63 Unify Settings xcontent reading and writing (#26739)
This change adds a fromXContent method to Settings that allows to read
the xcontent that is produced by toXContent. It also replaces the entire settings
loader infrastructure and removes the structured map representation. Future PRs will
also tackle the `getAsMap` that exposes the internal represenation of settings for
better encapsulation.
2017-09-25 13:23:01 +02:00
Jason Tedor e0db89bc35 Upgrade to Lucene 7.0.0
This commit upgrades to the GA release of Luence 7!

Relates #26744
2017-09-21 19:19:33 -04:00
Claudio Bley 7184cf8b5b Fix kuromoji default stoptags (#26600)
Initialize the default stop-tags in `KuromojiPartOfSpeechFilterFactory` if the
`stoptags` are not given in the config. Also adding a test which checks that 
part-of-speech tokens are removed when using the kuromoji_part_of_speech 
filter.
2017-09-15 12:25:09 +02:00
Adrien Grand 78681bc9e5 Upgrade to lucene-7.0.0-snapshot-d94a5f0. (#26441) 2017-08-31 09:06:40 +02:00
Adrien Grand eb782492be Remove support for lenient booleans.
Closes #22298
2017-08-28 09:56:01 +02:00
desmorto 292dd8f992 (refactor) some opportunities to use diamond operator (#25585)
* (refactor) some opportunities to use diamond operator

* Update ExceptionRetryIT.java

update typo
2017-08-15 16:36:42 -06:00
Adrien Grand f0c1e30544 Upgrade to lucene-7.0.0-snapshot-a128fcb. (#26090) 2017-08-08 13:03:19 +02:00
Adrien Grand 481d5d09b2 Upgrade to lucene-7.0.0-snapshot-00142c9. (#25641)
Lucene 7.0 is feature-frozen now, so there should not be many changes until GA.
2017-07-11 13:58:55 +02:00
Adrien Grand 44e9c0b947 Upgrade to lucene-7.0.0-snapshot-ad2cb77. (#25349)
Most notable changes:
 - better update concurrency: LUCENE-7868
 - TopDocs.totalHits is now a long: LUCENE-7872
 - QueryBuilder does not remove the boolean query around multi-term synonyms:
   LUCENE-7878
 - removal of Fields: LUCENE-7500

For the `TopDocs.totalHits` change, this PR relies on the fact that the encoding
of vInts and vLongs are compatible: you can write and read with any of them as
long as the value can be represented by a positive int.
2017-06-22 12:35:33 +02:00
Adrien Grand 0c117145f6 Upgrade to lucene-7.0.0-snapshot-92b1783. (#25222)
This snapshot has faster range queries on range fields (LUCENE-7828), more
accurate norms (LUCENE-7730) and the ability to use fake term frequencies
(LUCENE-7854).
2017-06-15 09:52:07 +02:00
Nicholas Knize deb7caf4d3 Upgrade to lucene-7.0.0-snapshot-a0aef2f
This commit upgrades master to a current lucene snapshot with commit id a0aef2f.
2017-05-19 10:20:55 -05:00
Ryan Ernst 2a65bed243 Tests: Change rest test extension from .yaml to .yml (#24659)
This commit renames all rest test files to use the .yml extension
instead of .yaml. This way the extension used within all of
elasticsearch for yaml is consistent.
2017-05-16 17:24:35 -07:00
Nik Everett bb06d8ec4f Allow plugins to build pre-configured token filters (#24223)
This changes the way we register pre-configured token filters so that
plugins can declare them and starts to move all of the pre-configured
token filters out of core. It doesn't finish the job because doing
so would make the change unreviewably large. So this PR includes
a shim that keeps the "old" way of registering pre-configured token
filters around.

The Lowercase token filter is special because there is a "special"
interaction between it and the lowercase tokenizer. I'm not sure
exactly what to do about it so for now I'm leaving it alone with
the intent of figuring out what to do with it in a followup.

This also renames these pre-configured token filters from
"pre-built" to "pre-configured" because that seemed like a more
descriptive name.

This is a part of #23658
2017-05-09 14:50:49 -04:00
Ryan Ernst 212f24aa27 Tests: Clean up rest test file handling (#21392)
This change simplifies how the rest test runner finds test files and
removes all leniency.  Previously multiple prefixes and suffixes would
be tried, and tests could exist inside or outside of the classpath,
although outside of the classpath never quite worked. Now only classpath
tests are supported, and only one resource prefix is supported,
`/rest-api-spec/tests`.

closes #20240
2017-04-18 15:07:08 -07:00
Adrien Grand 4632661bc7 Upgrade to a Lucene 7 snapshot (#24089)
We want to upgrade to Lucene 7 ahead of time in order to be able to check whether it causes any trouble to Elasticsearch before Lucene 7.0 gets released. From a user perspective, the main benefit of this upgrade is the enhanced support for sparse fields, whose resource consumption is now function of the number of docs that have a value rather than the total number of docs in the index.

Some notes about the change:
 - it includes the deprecation of the `disable_coord` parameter of the `bool` and `common_terms` queries: Lucene has removed support for coord factors
 - it includes the deprecation of the `index.similarity.base` expert setting, since it was only useful to configure coords and query norms, which have both been removed
 - two tests have been marked with `@AwaitsFix` because of #23966, which we intend to address after the merge
2017-04-18 15:17:21 +02:00
Jim Ferenczi 0e95c90e9f Upgrade to Lucene 6.5.0 (#23750) 2017-03-27 15:57:54 +02:00
Jim Ferenczi 5c84640126 Upgrade to lucene-6.5.0-snapshot-d00c5ca (#23385)
Lucene upgrade
2017-02-27 18:39:04 +01:00
Adrien Grand 709cc9ba65 Upgrade to lucene-6.5.0-snapshot-f919485. (#23087) 2017-02-10 15:08:47 +01:00
Adrien Grand c8496fc4f4 Upgrade to Lucene 6.4.1. (#22978) 2017-02-06 09:28:43 +01:00
Jim Ferenczi 8028578305 Upgrade to Lucene 6.4.0 (#22724)
* Upgrade to Lucene 6.4.0

`ValueSource`s are now converted to `DoubleValueSource`s using the Lucene adapter made for the migration to the new API in 6.4.0.
2017-01-21 04:48:01 +01:00
Jason Tedor 9781b88a38 Fix deprecation logging for lenient booleans
This commit fixes an issue with deprecation logging for lenient
booleans. The underlying issue is that adding deprecation logging for
lenient booleans added a static deprecation logger to the Settings
class. However, the Settings class is initialized very early and in CLI
tools can be initialized before logging is initialized. This leads to
status logger error messages. Additionally, the deprecation logging for
a lot of the settings does not provide useful context (for example, in
the token filter factories, the deprecation logging only produces the
name of the setting, but gives no context which token filter factory it
comes from). This commit addresses both of these issues by changing the
call sites to push a deprecation logger through to the lenient boolean
parsing.

Relates #22696
2017-01-19 12:30:33 -05:00
Daniel Mitterdorfer aece89d6a1 Make boolean conversion strict (#22200)
This PR removes all leniency in the conversion of Strings to booleans: "true"
is converted to the boolean value `true`, "false" is converted to the boolean
value `false`. Everything else raises an error.
2017-01-19 07:59:18 +01:00
Adrien Grand f8998fece5 Upgrade to lucene-6.4.0-snapshot-084f7a0. (#22413) 2017-01-04 19:03:52 +01:00
Nik Everett f5f2149ff2 Remove much ceremony from parsing client yaml test suites (#22311)
* Remove a checked exception, replacing it with `ParsingException`.
* Remove all Parser classes for the yaml sections, replacing them with static methods.
* Remove `ClientYamlTestFragmentParser`. Isn't used any more.
* Remove `ClientYamlTestSuiteParseContext`, replacing it with some static utility methods.

I did not rewrite the parsers using `ObjectParser` because I don't think it is worth it right now.
2016-12-22 11:00:34 -05:00
Jim Ferenczi d791ddf704 Upgrade to lucene-6.4.0-snapshot-ec38570 (#21853)
Set lucene version to 6.4.0-snapshot-ec38570 and update all the sha1s/license
Fix invalid combo after upgrade in query_string query. split_on_whitespace=false is disallowed if auto_generate_phrase_queries=true
Adapt the expectations of some tests to the new format of the Lucene explain output
2016-11-29 18:40:31 +01:00
Adrien Grand 1fd5c47e7f Upgrade to lucene-6.3.0. (#21464) 2016-11-14 09:36:45 +01:00
Ryan Ernst 7a2c984bcc Test: Remove multi process support from rest test runner (#21391)
At one point in the past when moving out the rest tests from core to
their own subproject, we had multiple test classes which evenly split up
the tests to run. However, we simplified this and went back to a single
test runner to have better reproduceability in tests. This change
removes the remnants of that multiplexing support.
2016-11-07 15:07:34 -08:00
Adrien Grand 2a70f6e7b1 Upgrade to lucene-6.3.0-snapshot-a66a445. (#21309)
This addresses a bug that was introduced with https://issues.apache.org/jira/browse/LUCENE-7501.
2016-11-04 10:34:04 +01:00
Adrien Grand b3cc54cf0d Upgrade to lucene-6.3.0-snapshot-ed102d6 (#21150)
Lucene 6.3 is expected to be released in the next weeks so it'd be good to give
it some integration testing. I had to upgrade randomized-testing too so that
both Lucene and Elasticsearch are on the same version.
2016-10-28 14:47:15 +02:00
Jun Ohtani a66c76eb44 Merge pull request #20704 from johtani/remove_request_params_in_analyze_api
Removing request parameters in _analyze API
2016-10-27 17:43:18 +09:00
Tanguy Leroux 44ac5d057a Remove empty javadoc (#20871)
This commit removes as many as empty javadocs comments my regexp has found
2016-10-12 10:27:09 +02:00
Jun Ohtani 370f0b885e Removing request parameters in _analyze API
Remove request params in _analyze API without index param
Change rest-api-test using JSON
Change docs using JSON

Closes #20246
2016-10-07 16:23:24 +09:00
Simon Willnauer fe1803c957 Remove AnalysisService and reduce it to a simple name to analyzer mapping (#20627)
Today we hold on to all possible tokenizers, tokenfilters etc. when we create
an index service on a node. This was mainly done to allow the `_analyze` API to
directly access all these primitive. We fixed this in #19827 and can now get rid of
the AnalysisService entirely and replace it with a simple map like class. This
ensures we don't create a gazillion long living objects that are entirely useless since
they are never used in most of the indices. Also those objects might consume a considerable
amount of memory since they might load stopwords or synonyms etc.

Closes #19828
2016-09-23 08:53:50 +02:00
Mike McCandless 0ccfe69789 Upgrade to Lucene 6.2.0 2016-08-24 17:26:28 -04:00
Nik Everett 9270e8b22b Rename client yaml test infrastructure
This makes it obvious that these tests are for running the client yaml
suites. Now that there are other ways of running tests using the REST
client against a running cluster we can't go on calling the shared
client yaml tests "REST tests". They are rest tests, but they aren't
**the** rest tests.
2016-07-26 13:53:44 -04:00
Nik Everett a95d4f4ee7 Add Location header and improve REST testing
This adds a header that looks like `Location: /test/test/1` to the
response for the index/create/update API. The requirement for the header
comes from https://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html

https://tools.ietf.org/html/rfc7231#section-7.1.2 claims that relative
URIs are OK. So we use an absolute path which should resolve to the
appropriate location.

Closes #19079

This makes large changes to our rest test infrastructure, allowing us
to write junit tests that test a running cluster via the rest client.
It does this by splitting ESRestTestCase into two classes:
* ESRestTestCase is the superclass of all tests that use the rest client
to interact with a running cluster.
* ESClientYamlSuiteTestCase is the superclass of all tests that use the
rest client to run the yaml tests. These tests are shared across all
official clients, thus the `ClientYamlSuite` part of the name.
2016-07-25 17:02:40 -04:00
Ali Beyad 19d0dbcd17 Removes waiting for yellow cluster health upon index (#19460)
creation in the REST tests, as we no longer need it due
to index creation now waiting for active shard copies
before returning (by default, it waits for the primary of
each shard, which is the same as ensuring yellow health).

Relates #19450
2016-07-15 17:18:34 -04:00
Nik Everett 71b95fb63c Switch analysis from push to pull
Instead of plugins calling `registerTokenizer` to extend the analyzer
they now instead have to implement `AnalysisPlugin` and override
`getTokenizer`. This lines up extending plugins in with extending
scripts. This allows `AnalysisModule` to construct the `AnalysisRegistry`
immediately as part of its constructor which makes testing anslysis
much simpler.

This also moves the default analysis configuration into `AnalysisModule`
which is how search is setup.

Like `ScriptModule`, `AnalysisModule` no longer extends `AbstractModule`.
Instead it is only responsible for building `AnslysisRegistry`. We still
bind `AnalysisRegistry` but we only do so in `Node`. This is means it
is available at module construction time so we slowly remove the need to
bind it in guice.
2016-06-26 07:15:42 -04:00
Adrien Grand 7ba5bceebe Add a MultiTermAwareComponent marker interface to analysis factories. #19028
This is the same as what Lucene does for its analysis factories, and we hawe
tests that make sure that the elasticsearch factories are in sync with
Lucene's. This is a first step to move forward on #9978 and #18064.
2016-06-23 10:19:24 +02:00
Adrien Grand 600cbb6ab0 Upgrade to Lucene 6.1.0. #18926 2016-06-17 09:03:00 +02:00
Ryan Ernst a4503c2aed Plugins: Remove name() and description() from api
In 2.0 we added plugin descriptors which require defining a name and
description for the plugin. However, we still have name() and
description() which must be overriden from the Plugin class. This still
exists for classpath plugins. But classpath plugins are mainly for
tests, and even then, referring to classpath plugins with their class is
a better idea. This change removes name() and description(), replacing
the name for classpath plugins with the full class name.
2016-06-15 17:12:22 -07:00
Adrien Grand 44c653f5a8 Upgrade to lucene-6.1.0-snapshot-3a57bea. 2016-06-10 16:18:12 +02:00
Adrien Grand d182e171a4 Upgrade to Lucene 6.0.1. 2016-06-01 10:31:10 +02:00
Jun Ohtani 9eb242a5fe Analyze API : Rename filters/token_filters/char_filter to filter/token_filter/char_filter
Closes #15189
2016-04-21 18:05:11 +09:00
Adrien Grand 496c7fbd84 Upgrade Lucene 6 Release
* upgrades numerics to new Point format
* updates geo api changes
  * adds GeoPointDistanceRangeQuery as XGeoPointDistanceRangeQuery
  * cuts over to ES GeoHashUtils
2016-04-11 16:50:04 -05:00
Adrien Grand 42526ac28e Remove Settings.settingsBuilder.
We have both `Settings.settingsBuilder` and `Settings.builder` that do exactly
the same thing, so we should keep only one. I kept `Settings.builder` since it
has my preference but also it is the one that we use in examples of the Java API.
2016-04-08 18:10:02 +02:00
Simon Willnauer 1988b8b387 [TEST] Reuse EsTestCase#createAnalysisService in KuromojiAnalysisTests 2016-03-22 13:45:20 +01:00
Jun Ohtani a9a0f262af Analysis Kuromoji: Add nbest option and NumberFilter
Add nbest_cost and nbest_examples parameter to KuromojiTokenizerFactory
Add KuromojiNumberFilterFactory
2016-03-22 20:09:56 +09:00
Simon Willnauer e91a141233 Prevent index level setting from being configured on a node level
Today we allow to set all kinds of index level settings on the node level which
is error prone and difficult to get right in a consistent manner.
For instance if some analyzers are setup in a yaml config file some nodes might
not have these analyzers and then index creation fails.

Nevertheless, this change allows some selected settings to be specified on a node level
for instance:
 * `index.codec` which is used in a hot/cold node architecture and it's value is really per node or per index
 * `index.store.fs.fs_lock` which is also dependent on the filesystem a node uses

All other index level setting must be specified on the index level. For existing clusters the index must be closed
and all settings must be updated via the API on each of the indices.

Closes #16799
2016-03-17 14:42:18 +01:00
Adrien Grand 5596e31068 Upgrade to lucene-6.0.0-f0aa4fc. #17075 2016-03-14 07:58:52 +01:00