Commit Graph

68 Commits

Author SHA1 Message Date
Jason Tedor 5fcda57b37
Rename MetaData to Metadata in all of the places (#54519)
This is a simple naming change PR, to fix the fact that "metadata" is a
single English word, and for too long we have not followed general
naming conventions for it. We are also not consistent about it, for
example, METADATA instead of META_DATA if we were trying to be
consistent with MetaData (although METADATA is correct when considered
in the context of "metadata"). This was a simple find and replace across
the code base, only taking a few minutes to fix this naming issue
forever.
2020-03-31 17:24:38 -04:00
Julie Tibshirani c2e9d13ebd
Default include_type_name to false in the yml test harness. (#38058)
This PR removes the temporary change we made to the yml test harness in #37285
to automatically set `include_type_name` to `true` in index creation requests
if it's not already specified. This is possible now that the vast majority of
index creation requests were updated to be typeless in #37611. A few additional
tests also needed updating here.

Additionally, this PR updates the test harness to set `include_type_name` to
`false` in index creation requests when communicating with 6.x nodes. This
mirrors the logic added in #37611 to allow for typeless document write requests
in test set-up code. With this update in place, we can remove many references
to `include_type_name: false` from the yml tests.
2019-02-01 11:44:13 -08:00
Colin Goodheart-Smithe 21e392e95e
Removes typed calls from YAML REST tests (#37611)
This PR attempts to remove all typed calls from our YAML REST tests. The PR adds include_type_name: false to create index requests that use a mapping and also to put mapping requests. It also removes _type from index requests where they haven't already been removed. The PR ignores tests named *_with_types.yml since this are specifically testing typed API behaviour.

The change also includes changing the test harness to add the type _doc to index, update, get and bulk requests that do not specify the document type when the test is running against a mixed 7.x/6.x cluster.
2019-01-30 16:32:58 +00:00
Jim Ferenczi 18866c4c0b
Make hits.total an object in the search response (#35849)
This commit changes the format of the `hits.total` in the search response to be an object with
a `value` and a `relation`. The `value` indicates the number of hits that match the query and the
`relation` indicates whether the number is accurate (in which case the relation is equals to `eq`)
or a lower bound of the total (in which case it is equals to `gte`).
This change also adds a parameter called `rest_total_hits_as_int` that can be used in the
search APIs to opt out from this change (retrieve the total hits as a number in the rest response).
Note that currently all search responses are accurate (`track_total_hits: true`) or they don't contain
`hits.total` (`track_total_hits: true`). We'll add a way to get a lower bound of the total hits in a
follow up (to allow numbers to be passed to `track_total_hits`).

Relates #33028
2018-12-05 19:49:06 +01:00
Alan Woodward a646f85a99
Ensure TokenFilters only produce single tokens when parsing synonyms (#34331)
A number of tokenfilters can produce multiple tokens at the same position.  This
is a problem when using token chains to parse synonym files, as the SynonymMap
requires that there are no stacked tokens in its input.

This commit ensures that when used to parse synonyms, these tokenfilters either produce
a single version of their input token, or that they throw an error when mappings are 
generated.  In indexes created in elasticsearch 6.x deprecation warnings are emitted in place 
of the error. 

* asciifolding and cjk_bigram produce only the folded or bigrammed token
* decompounders, synonyms and keyword_repeat are skipped
* n-grams, word-delimiter-filter, multiplexer, fingerprint and phonetic throw errors

Fixes #34298
2018-11-29 10:35:38 +00:00
Nik Everett a45626deb5
Analysis: Wrap at 140 columns (#34494)
Applies our standard column width to all analysis plugins.
2018-10-17 16:17:25 -04:00
Jim Ferenczi 7ad71f906a
Upgrade to a Lucene 8 snapshot (#33310)
The main benefit of the upgrade for users is the search optimization for top scored documents when the total hit count is not needed. However this optimization is not activated in this change, there is another issue opened to discuss how it should be integrated smoothly.
Some comments about the change:
* Tests that can produce negative scores have been adapted but we need to forbid them completely: #33309

Closes #32899
2018-09-06 14:42:06 +02:00
Russ Cam e2b665c2e6
Consistent encoder names (#29492)
This commit updates encoder names to be consistent within documentation
and align with snake casing convention.
2018-07-24 09:21:43 +10:00
Tanguy Leroux bf58660482
Remove all unused imports and fix CRLF (#31207)
The X-Pack opening and the recent other refactorings left a lot of 
unused imports in the codebase. This commit removes them all.
2018-06-11 15:12:12 +02:00
Jim Ferenczi b82017cbfe
Fix daitch_mokotoff phonetic filter to use the dedicated Lucene filter (#28225)
This commit changes the phonetic filter factory to use a DaitchMokotoffSoundexFilter
instead of a PhoneticFilter with a daitch_mokotoff encoder when daitch_mokotoff is selected.
The latter does not hanlde branching when computing the soundex and fails to encode multiple
variations when possible.

Closes #28211
2018-01-15 19:35:54 +01:00
Christoph Büscher 9253ea8aec Fix beidermorse phonetic token filter for unspecified `languageset` (#27112)
Currently, when we create a BeiderMorseFilter with an unspecified `languageset`,
the filter will not guess the language, which should be the default behaviour.
This change fixes this and adds a simple test for the cases with and without
provided `languageset` settings.

Closes #26771
2017-10-27 10:07:36 +02:00
Simon Willnauer cdd7c1e6c2 Return List instead of an array from settings (#26903)
Today we return a `String[]` that requires copying values for every
access. Yet, we already store the setting as a list so we can also directly
return the unmodifiable list directly. This makes list / array access in settings
a much cheaper operation especially if lists are large.
2017-10-09 09:52:08 +02:00
Simon Willnauer aab4655e63 Unify Settings xcontent reading and writing (#26739)
This change adds a fromXContent method to Settings that allows to read
the xcontent that is produced by toXContent. It also replaces the entire settings
loader infrastructure and removes the structured map representation. Future PRs will
also tackle the `getAsMap` that exposes the internal represenation of settings for
better encapsulation.
2017-09-25 13:23:01 +02:00
Adrien Grand eb782492be Remove support for lenient booleans.
Closes #22298
2017-08-28 09:56:01 +02:00
Ryan Ernst 2a65bed243 Tests: Change rest test extension from .yaml to .yml (#24659)
This commit renames all rest test files to use the .yml extension
instead of .yaml. This way the extension used within all of
elasticsearch for yaml is consistent.
2017-05-16 17:24:35 -07:00
Nik Everett bb06d8ec4f Allow plugins to build pre-configured token filters (#24223)
This changes the way we register pre-configured token filters so that
plugins can declare them and starts to move all of the pre-configured
token filters out of core. It doesn't finish the job because doing
so would make the change unreviewably large. So this PR includes
a shim that keeps the "old" way of registering pre-configured token
filters around.

The Lowercase token filter is special because there is a "special"
interaction between it and the lowercase tokenizer. I'm not sure
exactly what to do about it so for now I'm leaving it alone with
the intent of figuring out what to do with it in a followup.

This also renames these pre-configured token filters from
"pre-built" to "pre-configured" because that seemed like a more
descriptive name.

This is a part of #23658
2017-05-09 14:50:49 -04:00
Koen De Groote 0fef5acd01 Cleanup collections construction
This commit cleans up some cases where a list or map was being
constructed, and then an existing collection was copied into the new
collection. The clean is to instead use an appropriate constructor to
directly copy the existing collection in during collection
construction. The advantage of this is that the new collection is sized
appropriately.

Relates #24409
2017-04-30 21:26:51 -04:00
Ryan Ernst 212f24aa27 Tests: Clean up rest test file handling (#21392)
This change simplifies how the rest test runner finds test files and
removes all leniency.  Previously multiple prefixes and suffixes would
be tried, and tests could exist inside or outside of the classpath,
although outside of the classpath never quite worked. Now only classpath
tests are supported, and only one resource prefix is supported,
`/rest-api-spec/tests`.

closes #20240
2017-04-18 15:07:08 -07:00
Luca Cavanna cc65a94fd4 [TEST] improve yaml test sections parsing (#23407)
Throw error when skip or do sections are malformed, such as they don't start with the proper token (START_OBJECT). That signals bad indentation, which would be ignored otherwise. Thanks (or due to) our pull parsing code, we were still able to properly parse the sections, yet other runners weren't able to.

Closes #21980

* [TEST] fix indentation in matrix_stats yaml tests

* [TEST] fix indentation in painless yaml test

* [TEST] fix indentation in analysis yaml tests

* [TEST] fix indentation in generated docs yaml tests

* [TEST] fix indentation in multi_cluster_search yaml tests
2017-03-02 12:43:20 +01:00
Jason Tedor 9781b88a38 Fix deprecation logging for lenient booleans
This commit fixes an issue with deprecation logging for lenient
booleans. The underlying issue is that adding deprecation logging for
lenient booleans added a static deprecation logger to the Settings
class. However, the Settings class is initialized very early and in CLI
tools can be initialized before logging is initialized. This leads to
status logger error messages. Additionally, the deprecation logging for
a lot of the settings does not provide useful context (for example, in
the token filter factories, the deprecation logging only produces the
name of the setting, but gives no context which token filter factory it
comes from). This commit addresses both of these issues by changing the
call sites to push a deprecation logger through to the lenient boolean
parsing.

Relates #22696
2017-01-19 12:30:33 -05:00
Daniel Mitterdorfer aece89d6a1 Make boolean conversion strict (#22200)
This PR removes all leniency in the conversion of Strings to booleans: "true"
is converted to the boolean value `true`, "false" is converted to the boolean
value `false`. Everything else raises an error.
2017-01-19 07:59:18 +01:00
Nik Everett f5f2149ff2 Remove much ceremony from parsing client yaml test suites (#22311)
* Remove a checked exception, replacing it with `ParsingException`.
* Remove all Parser classes for the yaml sections, replacing them with static methods.
* Remove `ClientYamlTestFragmentParser`. Isn't used any more.
* Remove `ClientYamlTestSuiteParseContext`, replacing it with some static utility methods.

I did not rewrite the parsers using `ObjectParser` because I don't think it is worth it right now.
2016-12-22 11:00:34 -05:00
Ryan Ernst 7a2c984bcc Test: Remove multi process support from rest test runner (#21391)
At one point in the past when moving out the rest tests from core to
their own subproject, we had multiple test classes which evenly split up
the tests to run. However, we simplified this and went back to a single
test runner to have better reproduceability in tests. This change
removes the remnants of that multiplexing support.
2016-11-07 15:07:34 -08:00
Jun Ohtani a66c76eb44 Merge pull request #20704 from johtani/remove_request_params_in_analyze_api
Removing request parameters in _analyze API
2016-10-27 17:43:18 +09:00
Tanguy Leroux 44ac5d057a Remove empty javadoc (#20871)
This commit removes as many as empty javadocs comments my regexp has found
2016-10-12 10:27:09 +02:00
Jun Ohtani 370f0b885e Removing request parameters in _analyze API
Remove request params in _analyze API without index param
Change rest-api-test using JSON
Change docs using JSON

Closes #20246
2016-10-07 16:23:24 +09:00
Simon Willnauer fe1803c957 Remove AnalysisService and reduce it to a simple name to analyzer mapping (#20627)
Today we hold on to all possible tokenizers, tokenfilters etc. when we create
an index service on a node. This was mainly done to allow the `_analyze` API to
directly access all these primitive. We fixed this in #19827 and can now get rid of
the AnalysisService entirely and replace it with a simple map like class. This
ensures we don't create a gazillion long living objects that are entirely useless since
they are never used in most of the indices. Also those objects might consume a considerable
amount of memory since they might load stopwords or synonyms etc.

Closes #19828
2016-09-23 08:53:50 +02:00
Chris Earle 1cf694b63e Use StringBuilder in favor of StringBuffer
This removes all instances of StringBuffer that are removeable.

Uncontended synchronization in Java is pretty cheap, but it's unnecessary.
2016-08-25 16:20:03 -04:00
Nik Everett 9270e8b22b Rename client yaml test infrastructure
This makes it obvious that these tests are for running the client yaml
suites. Now that there are other ways of running tests using the REST
client against a running cluster we can't go on calling the shared
client yaml tests "REST tests". They are rest tests, but they aren't
**the** rest tests.
2016-07-26 13:53:44 -04:00
Nik Everett a95d4f4ee7 Add Location header and improve REST testing
This adds a header that looks like `Location: /test/test/1` to the
response for the index/create/update API. The requirement for the header
comes from https://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html

https://tools.ietf.org/html/rfc7231#section-7.1.2 claims that relative
URIs are OK. So we use an absolute path which should resolve to the
appropriate location.

Closes #19079

This makes large changes to our rest test infrastructure, allowing us
to write junit tests that test a running cluster via the rest client.
It does this by splitting ESRestTestCase into two classes:
* ESRestTestCase is the superclass of all tests that use the rest client
to interact with a running cluster.
* ESClientYamlSuiteTestCase is the superclass of all tests that use the
rest client to run the yaml tests. These tests are shared across all
official clients, thus the `ClientYamlSuite` part of the name.
2016-07-25 17:02:40 -04:00
Ali Beyad 19d0dbcd17 Removes waiting for yellow cluster health upon index (#19460)
creation in the REST tests, as we no longer need it due
to index creation now waiting for active shard copies
before returning (by default, it waits for the primary of
each shard, which is the same as ensuring yellow health).

Relates #19450
2016-07-15 17:18:34 -04:00
Tanguy Leroux 8c40b2b54e Fix order of modifiers 2016-07-01 16:57:14 +02:00
Nik Everett 71b95fb63c Switch analysis from push to pull
Instead of plugins calling `registerTokenizer` to extend the analyzer
they now instead have to implement `AnalysisPlugin` and override
`getTokenizer`. This lines up extending plugins in with extending
scripts. This allows `AnalysisModule` to construct the `AnalysisRegistry`
immediately as part of its constructor which makes testing anslysis
much simpler.

This also moves the default analysis configuration into `AnalysisModule`
which is how search is setup.

Like `ScriptModule`, `AnalysisModule` no longer extends `AbstractModule`.
Instead it is only responsible for building `AnslysisRegistry`. We still
bind `AnalysisRegistry` but we only do so in `Node`. This is means it
is available at module construction time so we slowly remove the need to
bind it in guice.
2016-06-26 07:15:42 -04:00
Adrien Grand 7ba5bceebe Add a MultiTermAwareComponent marker interface to analysis factories. #19028
This is the same as what Lucene does for its analysis factories, and we hawe
tests that make sure that the elasticsearch factories are in sync with
Lucene's. This is a first step to move forward on #9978 and #18064.
2016-06-23 10:19:24 +02:00
Ryan Ernst a4503c2aed Plugins: Remove name() and description() from api
In 2.0 we added plugin descriptors which require defining a name and
description for the plugin. However, we still have name() and
description() which must be overriden from the Plugin class. This still
exists for classpath plugins. But classpath plugins are mainly for
tests, and even then, referring to classpath plugins with their class is
a better idea. This change removes name() and description(), replacing
the name for classpath plugins with the full class name.
2016-06-15 17:12:22 -07:00
Adrien Grand 42526ac28e Remove Settings.settingsBuilder.
We have both `Settings.settingsBuilder` and `Settings.builder` that do exactly
the same thing, so we should keep only one. I kept `Settings.builder` since it
has my preference but also it is the one that we use in examples of the Java API.
2016-04-08 18:10:02 +02:00
Simon Willnauer e91a141233 Prevent index level setting from being configured on a node level
Today we allow to set all kinds of index level settings on the node level which
is error prone and difficult to get right in a consistent manner.
For instance if some analyzers are setup in a yaml config file some nodes might
not have these analyzers and then index creation fails.

Nevertheless, this change allows some selected settings to be specified on a node level
for instance:
 * `index.codec` which is used in a hot/cold node architecture and it's value is really per node or per index
 * `index.store.fs.fs_lock` which is also dependent on the filesystem a node uses

All other index level setting must be specified on the index level. For existing clusters the index must be closed
and all settings must be updated via the API on each of the indices.

Closes #16799
2016-03-17 14:42:18 +01:00
Simon Willnauer 7a53a396e4 Remove Unneded @Inject annotations 2016-03-09 12:10:47 +01:00
Adrien Grand eef19be072 Deprecate string in favor of text/keyword. #16877
This commit removes the ability to use string fields on indices created on or
after 5.0. Dynamic mappings now generate text fields by default for strings
but there are plans to also add a sub keyword field (in a future PR).

Most of the changes in this commit are just about replacing string with
keyword or text. Some tests have been removed because they existed because of
corner cases of string mappings like setting ignore-above on a text field or
enabling term vectors on a keyword field which are now impossible.

The plan is to remove strings entirely in 6.0.
2016-03-03 10:20:56 +01:00
Simon Willnauer e02d2e004e Rewrite SettingsFilter to be immutable
This change rewrites the entire settings filtering mechanism to be immutable.
All filters must be registered up-front in the SettingsModule. Filters that are comma-sparated are
not allowed anymore and check on registration.
This commit also adds settings filtering to the default settings recently added to ensure we don't render
filtered settings.
2016-02-03 20:05:55 +01:00
Boaz Leskes 2a137b5548 Make index uuid available in Index, ShardRouting & ShardId
In the early days Elasticsearch used to use the index name as the index identity. Around 1.0.0 we introduced a unique index uuid which is stored in the index setting. Since then we used that uuid in a few places but it is by far not the main identifier when working with indices, partially because it's not always readily available in all places.

This PR start to make a move in the direction of using uuids instead of name by making sure that the uuid is available on the Index class (currently just a wrapper around the name) and as such also available via ShardRouting and ShardId.

Note that this is by no means an attempt to do the right thing with the uuid in all places. In almost all places it falls back to the name based comparison that was done before. It is meant as a first step towards slowly improving the situation.

Closes #16217
2016-01-28 08:40:10 +01:00
Daniel Mitterdorfer e9bb3d31a3 Convert "path.*" and "pidfile" to new settings infra 2016-01-22 15:14:13 +01:00
Simon Willnauer fbfa9f4925 Merge branch 'master' into new_index_settings 2016-01-19 10:13:48 +01:00
Simon Willnauer 8e0390b09e Register index.version_created for several analysis plugin tests 2016-01-19 09:35:32 +01:00
Ryan Ernst ef4f0a8699 Test: Make rest test framework accept http directly for the test cluster
The rest test framework, because it used to be tightly integrated with
ESIntegTestCase, currently expects the addresses for the test cluster to
be passed using the transport protocol port. However, it only uses this
to then find the http address.

This change makes ESRestTestCase extend from ESTestCase instead of
ESIntegTestCase, and changes the sysprop used to tests.rest.cluster,
which now takes the http address.

closes #15459
2016-01-18 16:44:14 -08:00
Nik Everett 0786c506dc Remove a few more Xlint skips 2016-01-06 23:28:13 -05:00
Ryan Ernst 4ea19995cf Remove wildcard imports 2015-12-18 12:43:47 -08:00
Simon Willnauer 9f6598b18d Fix compile errors 2015-11-26 13:41:00 +01:00
David Pilato 52bf365013 Add support for `daitch_mokotoff`
[Daitch Mokotoff](https://en.wikipedia.org/wiki/Daitch%E2%80%93Mokotoff_Soundex) support has been added in Lucene 5.
So we can now support it as well.
2015-11-18 15:41:39 +01:00
Boaz Leskes ac0da91bf7 Extend usage of IndexSetting class
I decided to leave external listeners (used by plugins) alone, for now.

Closes #14731
2015-11-13 14:30:23 +01:00