2124 Commits

Author SHA1 Message Date
Nick Knize 9168f1fb43
[License] Add SPDX and OpenSearch Modification license header (#509)
This commit adds the SPDX Apache-2.0 license header along with an additional
copyright header for all modifications.

Signed-off-by: Nicholas Walter Knize <nknize@apache.org>
2021-04-09 14:28:18 -05:00
Rabi Panda 0bdd1293c1
Use alternate example data in OpenSearch test cases. (#454)
This commit updates some of the sample test data used in test cases in OpenSearch.

Signed-off-by: Rabi Panda <adnapibar@gmail.com>
2021-03-25 08:52:07 -07:00
Rabi Panda 30c88e7e04
Fix failing rest-api-spec tests as part of renaming. (#451)
This commit fixes the following two failing yaml tests in rest-api-spec.

- indices.create/10_basic/Create index without soft deletes
- indices.stats/20_translog/Translog stats on closed indices without soft-deletes

Signed-off-by: Rabi Panda <adnapibar@gmail.com>
2021-03-23 16:39:26 -07:00
Rabi Panda 8bba6603da [Rename] Replace more instances of Elasticsearch with OpenSearch. (#432)
This commit replaces more replaceable instances of Elasticsearch with OpenSearch.

Signed-off-by: Rabi Panda <adnapibar@gmail.com>
2021-03-21 20:56:34 -05:00
Nick Knize 7051167c83 [Rename] remaining elasticsearch pass 1 (#416)
This commit replaces instances of 'elasticsearch' with opensearch everywhere
except references to issues, and other places needed to test compatibility with
old elasticsearch clusters.

Signed-off-by: Nicholas Walter Knize <nknize@apache.org>
2021-03-21 20:56:34 -05:00
Rabi Panda d1e070c92b [Rename] Fix import issues in tests. (#414)
This commit fixes the import issues in already refactored test packages.

Signed-off-by: Rabi Panda <adnapibar@gmail.com>
2021-03-21 20:56:34 -05:00
Rabi Panda df11cc9de4 [Rename] Fix gradle build as part of the renaming process. (#397)
This commit fixes the currently broken gradle build that resulted from the renaming work. It reverts a few dependencies and comments out the `opensearch_distibutions` task, which is currently failing for some builds. We will address these separately in the future once we have a working build.

Signed-off-by: Rabi Panda <adnapibar@gmail.com>
2021-03-21 20:56:34 -05:00
Nick Knize 5b46a05702 [Rename] remaining packages and resources in test/fixture (#364)
This commit refactors the remaining o.e.index and o.e.test packages in the
test/fixtures module. References throughout the codebase are also refactored.

Signed-off-by: Nicholas Walter Knize <nknize@apache.org>
2021-03-21 20:56:34 -05:00
Harold Wang a71f725dd8 [Rename] Rename files under three server sub folders (#195)
Refactor the folders below as part of the Elasticsearch to OpenSearch renaming effort.
. server/src/internalClusterTest
. server/src/main/java11
. server/src/main/resources
. rest-api-spec


Signed-off-by: Harold Wang <harowang@amazon.com>
2021-03-21 20:56:34 -05:00
Tianli Feng 672d975f43
Remove tagline from server and RHLC (#427)
Signed-off-by: Tianli Feng <ftianli@amazon.com>
2021-03-19 18:17:10 -07:00
Nick Knize b360d4fb61 [TEST] Fix Feature Flags in Test Framework and SortTemplates yaml failure (#82)
This commit adds parse logic to correctly parse feature flags in the test framework. It also fixes a test failure in cat.templates/Sort Templates yaml test.

Signed-off-by: Peter Nied <petern@amazon.com>
2021-03-13 10:36:13 -06:00
Rabi Panda bb7dc316e4 Bring back the REST specs for data streams. (#78)
Add back the REST specs for data streams which were moved to x-pack as part of the commit fe12217c

Signed-off-by: Peter Nied <petern@amazon.com>
2021-03-13 10:36:13 -06:00
Nick Knize 755367f1c9 [PURIFY] Remove x-pack feature flag from yaml test (#68)
This commit removes the xpack no_xpack feature flag from the yaml test suite.

Signed-off-by: Peter Nied <petern@amazon.com>
2021-03-13 10:36:13 -06:00
Christoph Büscher 2d8cc4b154 Lower skip version for token_count yaml test (#68583)
Signed-off-by: Peter Nied <petern@amazon.com>
2021-03-13 10:36:12 -06:00
Julie Tibshirani 747e1cc71d Fix the skip version for inner hits REST test. 2021-01-06 11:46:17 -08:00
Tanguy Leroux ceea5b313f
Mute MixedClusterClientYamlTestSuiteIT search.inner_hits/10_basic/Inner hits with disabled _source (#67083)
Relates #67079
2021-01-06 12:51:29 +01:00
Julie Tibshirani ff67baac38 Make sure shared source always represents the top-level root document. (#66725)
We started passing down the root document's _source when processing
nested hits, to avoid reloading and reparsing the root source for each hit.
Unfortunately the approach did not work when there are multiple layers of
`inner_hits`. In this case, the second-layer inner hit received its immediate
parent's source instead of the root source. This parent source is filtered to
just contain the parts corresponding to the nested document, but the source
parsing logic is designed to always operate on the top-level root source. This
caused failures when loading the second-layer inner hits.

This PR makes sure to always pass the root document's _source when processing
inner hits, even if there are multiple layers.
2021-01-05 14:27:41 -08:00
Seth Michael Larson 8f3a6cd913
[7.10] Use single backslash for nested paths (#66794) 2020-12-23 12:58:43 -06:00
Julie Tibshirani 08b0920c6c Adjust test skips now that inner_hits fix is backported. 2020-12-21 15:01:54 -08:00
Julie Tibshirani d4039228ae Fix regressions around nested hits and disabled _source. (#66572)
This PR fixes two bugs that can arise when _source is disabled and we fetch nested documents:
* Fix exception when highlighting `inner_hits` with disabled _source.
* Fix exception in nested `top_hits` with disabled _source.
* Add more tests for highlighting `inner_hits`.
2020-12-18 15:21:37 -08:00
Gordon Brown df8c92cfef
Mute tests failing on Debian 8 due to memory reporting (#66648) 2020-12-18 15:27:07 -07:00
Julie Tibshirani 9fa80c1896 Fix failure in fvh REST tests. (#66192)
In general, we can't guarantee that a match_all query will return documents in
the order they were indexed. This PR adds an ID to each document to avoid
relying on document order.
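
A minimal sketch of the idea, in Python rather than the YAML REST test format: once each document carries an explicit ID, assertions can look hits up by ID instead of by position. The hit data and the `by_id` helper are hypothetical and only illustrate the approach; they are not taken from the actual tests.

```python
# Hypothetical hits, in whatever order a match_all query happened to return them.
hits = [
    {"_id": "2", "highlight": {"title": ["<em>brown</em> fox"]}},
    {"_id": "1", "highlight": {"title": ["<em>quick</em> fox"]}},
]


def by_id(hits):
    """Index hits by _id so assertions do not depend on hit order."""
    return {hit["_id"]: hit for hit in hits}


indexed = by_id(hits)
# These checks hold regardless of the order of the hits list.
assert indexed["1"]["highlight"]["title"] == ["<em>quick</em> fox"]
assert indexed["2"]["highlight"]["title"] == ["<em>brown</em> fox"]
```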
2020-12-15 12:01:25 -08:00
Jay Modi 01d54d222b
Fix cat tasks api params in spec and handler (#66294)
This commit fixes the cat tasks api parameter specification and the
handler so that the parameters are consumed during request preparation.

Closes #59493
Backport of #66272
2020-12-14 13:44:09 -07:00
Dimitrios Liappis 95dbc6d448
[7.10] Mute fvh REST tests (backport of #66149) #66152
Relates #66147
2020-12-10 12:33:15 +02:00
Julie Tibshirani d707f26eb9 Ensure consistent hit order in fvh REST tests. 2020-12-09 18:15:34 -08:00
Julie Tibshirani b2d3c3f6f9
Fix bug where fvh fragments could be loaded from wrong doc (#66142)
This PR fixes a regression where fvh fragments could be loaded from the wrong
document _source.

Some `FragmentsBuilder` implementations contain a `SourceLookup` to load from
_source. The lookup should be positioned to load from the current hit document.
However, since `FragmentsBuilder` are cached and shared across hits, the lookup
is never updated to load from the new documents. This means we accidentally
load _source from a different document.
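
The failure mode can be sketched with toy Python classes. The class names mirror `SourceLookup` and `FragmentsBuilder`, but everything here (the `Highlighter` wrapper, the `reposition_lookup` flag, the method names) is a hypothetical simplification and not the real Java code.

```python
class SourceLookup:
    """Toy stand-in: holds the _source of the document it is positioned on."""

    def __init__(self):
        self.source = None

    def set_document(self, source):
        self.source = source


class FragmentsBuilder:
    """Toy stand-in: builds a 'fragment' by reading a field from the lookup."""

    def __init__(self, lookup):
        self.lookup = lookup

    def fragment(self, field):
        return self.lookup.source[field]


class Highlighter:
    """Caches one builder per field and shares it across hits."""

    def __init__(self, reposition_lookup):
        self.reposition_lookup = reposition_lookup
        self._builders = {}

    def highlight(self, doc_source, field):
        builder = self._builders.get(field)
        if builder is None:
            lookup = SourceLookup()
            lookup.set_document(doc_source)
            builder = FragmentsBuilder(lookup)
            self._builders[field] = builder
        elif self.reposition_lookup:
            # The fix in this sketch: point the cached lookup at the current hit.
            builder.lookup.set_document(doc_source)
        return builder.fragment(field)


buggy = Highlighter(reposition_lookup=False)
buggy.highlight({"title": "first"}, "title")
stale = buggy.highlight({"title": "second"}, "title")  # still reads the first doc

fixed = Highlighter(reposition_lookup=True)
fixed.highlight({"title": "first"}, "title")
fresh = fixed.highlight({"title": "second"}, "title")
```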

The regression was introduced in #60179, which started storing `SourceLookup`
on `FragmentsBuilder`.

Fixes #65533.
2020-12-09 17:52:58 -08:00
Seth Michael Larson 5a89835025
Mark Task APIs as experimental in rest-api-spec 2020-12-03 15:11:21 -06:00
Daniel Mitterdorfer 723e14ab72
Mute field collapsing tests in MixedClusterClientYamlTestSuiteIT (#64914)
Relates #52416
2020-11-11 11:48:18 +01:00
Nik Everett 0c47d49784
Make sure non-collecting aggs include sub-aggs (backport of #64214) (#64247)
Now that we're consistently using `can_match` to filter which shards we
run on we can get this confusing case:
1. You have a search with, say, a range and a sub-agg.
2. That search has a query that `can_match` can recognize will match no
   docs. On *any* shard.
3. So we dutifully run it on a single shard so it can produce the
   "empty" aggs.
4. The shard we pick happens to not have the target of the range mapped.
5. This kicks in the special range aggregator that doesn't collect any
   documents.
6. Before this commit, that range aggregator *also* never produced any
   sub-aggs.

So, without this change, it was quite possible for a search that
happened to match no documents to "throw away" the sub-aggs of a range
and a few other aggs.

We've had this problem for a long, long time but it is more confusing
now because `can_match` is really kicking in and causing us to see cases
where it looks like you are targeting a lot of shards but you really are
only targeting a couple. It used to be that to get the "no sub-aggs"
behavior you had to explicitly target only shards that didn't map the
target field of the `range` agg. And, like, in that case it isn't too
bad because you targeted a sort of degenerate shard. But now that
`can_match` is doing its thing you can end up with the confusing steps
above. It took me several hours to track down what was happening. I know
how the individual pieces of all of this work. It took four hours to
figure out how they fit together in this case....

Anyway! This replaces all the aggregator implementations that throw out
the sub-aggregators with ones that keep them. I think this'll be less
confusing in the future.

Closes #64142
2020-10-28 08:38:05 -04:00
Armin Braun a697d5edae
Don't Generate an Index Setting History UUID unless it's Supported (#64164) (#64213)
In 7.x we can't just by default generate this setting as it might not be
supported by data nodes that are assigned shards for an older version in mixed version
clusters.

Closes #64152
2020-10-28 09:03:09 +01:00
Jason Tedor 04a9845a49
Adjust defaults for tiered data roles (#64015)
This commit adjusts the defaults for the tiered data roles so that they
are enabled by default, or if the node has the legacy data role. This
ensures that the default experience is that the tiered data roles are
enabled.

To fully specify the behavior for the tiered data roles then:
 - starting a new node with the defaults: enabled
 - starting a new node with node.roles configured: enabled if and only
   if the tiered data roles are explicitly configured, independently
   of the node having the data role
 - starting a new node with node.data enabled: enabled unless the
   tiered data roles are explicitly disabled
 - starting a new node with node.data disabled: disabled unless the
   tiered data roles are explicitly enabled
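
The four cases above can be read as a small decision function. This is only a Python sketch of that reading: the function and its parameters are hypothetical, and treating an unset `node.data` the same as `node.data` enabled is an assumption made here to cover the "defaults" case.

```python
from typing import Optional


def tiered_data_roles_enabled(
    node_roles_configured: bool = False,     # hypothetical: node.roles is set
    node_data: Optional[bool] = None,        # hypothetical: legacy node.data, None if unset
    tiered_explicit: Optional[bool] = None,  # hypothetical: explicit tiered-role setting
) -> bool:
    if node_roles_configured:
        # Enabled if and only if the tiered roles are explicitly configured,
        # independently of the node having the data role.
        return tiered_explicit is True
    if node_data is None or node_data:
        # Defaults, or node.data enabled: enabled unless explicitly disabled.
        # (Assumption: an unset node.data behaves like node.data enabled.)
        return tiered_explicit is not False
    # node.data disabled: disabled unless explicitly enabled.
    return tiered_explicit is True
```

For example, `tiered_data_roles_enabled()` returns `True` (pure defaults), while `tiered_data_roles_enabled(node_data=False)` returns `False`.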
2020-10-27 12:48:31 -04:00
David Kyle 4545779415 Mute MixedClusterClientYamlTestSuiteIT 'Create a snapshot and then restore it' (#64153)
For #64152
2020-10-26 11:58:17 +00:00
Armin Braun 1880bcdc09
Add REST Test for Snapshot Clone API (#63863) (#63881)
Adds snapshot clone REST tests and HLRC support for the API.
2020-10-20 09:48:03 +02:00
Julie Tibshirani 9e52513c7b
Add support for missing value fetchers. (#63585)
This PR implements value fetching for the following field types:
* `text` phrase and prefix subfields
* `search_as_you_type`, plus its subfields
* `token_count`, which is implemented by fetching doc values

Supporting these types helps ensure that retrieving all fields through
`"fields": ["*"]` doesn't fail because of unsupported value fetchers.
2020-10-12 17:34:21 -07:00
Alan Woodward 88b45dfa61
Convert TextFieldMapper to parametrized form (#63269) (#63392)
As a result of this, we can remove a chunk of code from TypeParsers as well. Tests
for search/index mode analyzers have moved into their own file. This commit also
rationalises the serialization checks for parameters into a single SerializerCheck
interface that takes the values includeDefaults, isConfigured and the value
itself.

Relates to #62988
2020-10-07 13:26:25 +01:00
Igor Motov 6a9cde2918
Add support for x_opaque_id to _cat/tasks (#63036) (#63135)
Adds an optional column with support for x_opaque_id to _cat/tasks API.

Closes #61118
2020-10-01 13:17:46 -04:00
Alan Woodward 63afc61b08 Introduce FetchContext (#62357)
We currently pass a SearchContext around to share configuration among
FetchSubPhases. With the introduction of runtime fields, it would be useful
to start storing some state on this context to be shared between different
subphases (for example, stored fields or search lookups can be loaded lazily
but referred to by many different subphases). However, SearchContext is a
very large and unwieldy class, and adding more methods or state here feels
like a bridge too far.

This commit introduces a new FetchContext class that exposes only those
methods on SearchContext that are required for fetch phases. This reduces
the API surface area for fetch phases considerably, and should give us some
leeway to add further state.
2020-09-17 09:57:43 +01:00
Nik Everett 24a24d050a
Implement fields fetch for runtime fields (backport of #61995) (#62416)
This implements the `fields` API in `_search` for runtime fields using
doc values. Most of that implementation is stolen from the
`docvalue_fields` fetch sub-phase, just moved into the same API that the
`fields` API uses. At this point the `docvalue_fields` fetch phase looks
like a special case of the `fields` API.

While I was at it I moved the "which doc values sub-implementation
should I use for fetching?" question from a bunch of `instanceof`s to a
method on `LeafFieldData` so we can be much more flexible with what is
returned and we're not forced to extend certain classes just to make the
fetch phase happy.

Relates to #59332
2020-09-15 20:24:10 -04:00
Nik Everett 771a8893a6
Add more debugging information for cardinality agg (#62317) (#62397)
This adds two extra bits of info to the profiler:
1. Count of the number of different types of collectors. This lets us figure
   out if we're using the optimization for segment ordinals. It adds a few
   more similar counters just for good measure.
2. Profiles the `getLeafCollector` and `postCollection` methods. These are
   non-trivial for some aggregations, like cardinality.
2020-09-15 13:21:11 -04:00
Julie Tibshirani 4a19bdb2ea
Support the 'fields' option in inner_hits and top_hits. (#62337)
This PR adds support for the 'fields' option in the following places:
* Anytime `inner_hits` is used, for both fetching nested/ child docs and field collapsing
* The `top_hits` aggregation

Addresses #61949.
2020-09-14 11:51:45 -07:00
Nhat Nguyen 3d69b5c41e Introduce point in time APIs in x-pack basic (#61062)
This commit introduces a new API that manages point-in-times in x-pack
basic. Elasticsearch pit (point in time) is a lightweight view into the
state of the data as it existed when initiated. A search request by
default executes against the most recent point in time. In some cases,
it is preferred to perform multiple search requests using the same point
in time. For example, if refreshes happen between search_after requests,
then the results of those requests might not be consistent as changes
happening between searches are only visible to the more recent point in
time.

A point in time must be opened before being used in search requests. The
`keep_alive` parameter tells Elasticsearch how long it should keep a
point in time around.

```
POST /my_index/_pit?keep_alive=1m
```

The response from the above request includes an `id`, which should be
passed to the `id` of the `pit` parameter of search requests.

```
POST /_search
{
    "query": {
        "match" : {
            "title" : "elasticsearch"
        }
    },
    "pit": {
            "id":  "46ToAwMDaWR4BXV1aWQxAgZub2RlXzEAAAAAAAAAAAEBYQNpZHkFdXVpZDIrBm5vZGVfMwAAAAAAAAAAKgFjA2lkeQV1dWlkMioGbm9kZV8yAAAAAAAAAAAMAWICBXV1aWQyAAAFdXVpZDEAAQltYXRjaF9hbGw_gAAAAA==",
            "keep_alive": "1m"
    }
}
```

Point-in-times are automatically closed when the `keep_alive` has
elapsed. However, keeping point-in-times has a cost; hence,
point-in-times should be closed as soon as they are no longer used in
search requests.

```
DELETE /_pit
{
    "id" : "46ToAwMDaWR4BXV1aWQxAgZub2RlXzEAAAAAAAAAAAEBYQNpZHkFdXVpZDIrBm5vZGVfMwAAAAAAAAAAKgFjA2lkeQV1dWlkMioGbm9kZV8yAAAAAAAAAAAMAWIBBXV1aWQyAAA="
}
```

#### Notable works in this change:

- Move the search state to the coordinating node: #52741
- Allow searches with a specific reader context: #53989
- Add the ability to acquire readers in IndexShard: #54966

Relates #46523
Relates #26472

Co-authored-by: Jim Ferenczi <jimczi@apache.org>
2020-09-10 19:25:47 -04:00
Jake Landis e80f68ed77
[7.x] Add test for item-level error when no write index defined for an alias in bulk API (#55503) (#61999)
Add test for item-level error when no write index defined for an alia…
Co-authored-by: Jake Landis <jake.landis@elastic.co>
Co-authored-by: bellengao <gbl_long@163.com>
2020-09-09 14:27:25 -05:00
Nik Everett 1104d65465
Fix bug with terms' min_doc_count (#62130) (#62177)
The `global_ordinals` implementation of `terms` had a bug when
`min_doc_count: 0` that'd cause sub-aggregations to have array index out
of bounds exceptions. Ooops. My fault. This fixes the bug by assigning
ordinals to those buckets.

Closes #62084
2020-09-09 13:04:51 -04:00
Luca Cavanna ab8f65a099
[TEST] Don't specify a type unless needed (#62011)
We have a couple of yaml tests that index documents under a 'test' type, while they could omit it. We do want to still test that specifying the type is still allowed in 7.x, but we already have specific tests for that, and other tests should use the endpoints that don't require specifying a type.
2020-09-05 09:27:00 +02:00
Dan Hermann d52ee17054
Adjust BWC after backport of #60818 2020-09-01 08:30:32 -05:00
Dan Hermann 88a448f1cd
Fix wrong result when executing bulk requests with and without pipeline (#60818) (#61777) 2020-09-01 07:05:25 -05:00
Julie Tibshirani 85ad328df7
Ensure fetch fields aren't dropped when rewriting search. (#61390)
Previously we didn't retain the requested fields when performing a shallow copy
of the search source. This meant that when a search was rewritten, we could drop
the requested fields and fail to return them in the response.
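
A toy Python illustration of this class of bug, assuming a simplified search source with only a query and a list of requested fields. The `SearchSource` dataclass and both copy methods are hypothetical and do not reflect the real Java class.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class SearchSource:
    """Hypothetical, heavily simplified search source."""

    query: Optional[dict] = None
    fetch_fields: List[str] = field(default_factory=list)

    def shallow_copy_buggy(self, query):
        # Forgets fetch_fields, so the rewritten search silently loses them.
        return SearchSource(query=query)

    def shallow_copy(self, query):
        # Carries the requested fields over to the rewritten search.
        return SearchSource(query=query, fetch_fields=self.fetch_fields)


original = SearchSource(query={"match_all": {}}, fetch_fields=["title", "date"])
rewritten_query = {"match_none": {}}

lost = original.shallow_copy_buggy(rewritten_query)
kept = original.shallow_copy(rewritten_query)
```

In this sketch `lost.fetch_fields` is empty while `kept.fetch_fields` still lists `title` and `date`.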
2020-08-20 14:58:58 -07:00
Nik Everett 1e6400285c
Some progress on failing runtime fields tests (bring #61098 to 7.x) (#61115)
* Some progress on failing runtime fields tests (bring #61098 to 7.x)

This breaks apart a test for the `terms` aggregation into one that
works for runtime fields and one that doesn't.
2020-08-17 09:56:55 -04:00
Nik Everett 639782da12
Break up a test for with runtime fields (brings #60931 to master) (#61103) (#61114)
Breaks up an integration test into one that runtime fields can run and
one that runtime fields have to skip. This is because runtime fields
don't have global ords and we assert things *about* global ords in the
test we have to skip.
2020-08-17 09:56:33 -04:00
Ryan Ernst bce93b93b2
Increase docs and client rest test timeouts for Darwin (#61075)
The Darwin CI hosts continue to struggle with timeouts. This commit
increases the timeouts for docs and client rest tests.

relates #58286
2020-08-13 21:22:06 -07:00