OpenSearch

mirror of https://github.com/honeymoose/OpenSearch.git synced 2025-02-06 21:18:31 +00:00

Author	SHA1	Message	Date
javanna	f2acf466aa	Convert script/template objects to json format Elasticsearch accepts multiple content-type formats, hence scripts can be stored/provided in json, yaml, cbor or smile. Yet the format that should be used internally is json. This is a problem mainly around search templates, as they only support json out of the four content-types, so instead of maintaining the content-type of the request we should rather convert the scripts/templates to json. Binary formats were not previously supported. If you stored a template in yaml format, you'd get back an error "No encoder found for MIME type [application/yaml]" when trying to execute it. With this commit the request content-type is independent from the template, which always gets converted to json internally. That is transparent to users and doesn't affect the content type of the response obtained when executing the template.	2017-02-22 16:20:53 +01:00
Simon Willnauer	5c1924ad19	Remove BWC layer for number of reduce phases (#23303) Both PRs below have been backported to 5.4 such that we can enable BWC tests of this feature as well as remove version-dependent serialization for search request / responses. Relates to #23288 Relates to #23253	2017-02-22 15:03:09 +01:00
mms-programming	d31e41547a	Handle BlobPath's trailing separator case (#23091)	2017-02-22 09:04:55 +01:00
Areek Zillur	148be11f26	Make document write requests immutable (#23038) * Make document write requests immutable Previously, write requests were mutated at the transport level to update request version, version type and sequence no before replication. Now that all write requests go through the shard bulk transport action, we can use the primary response stored in item level bulk requests to pass the updated version, sequence no. to replicas. * incorporate feedback * minor cleanup * Add bwc test to ensure correct index version propagates to replica * Fix bwc for propagating write operation versions * Add assertion on replica request version type * fix tests using internal version type for replica op * Fix assertions to assert version type in replica and recovery * add bwc tests for version checks in concurrent indexing * incorporate feedback	2017-02-21 17:41:22 -05:00
Simon Willnauer	ca38e88148	Remove assertion that relies on all shards being successful The assertion that if there are buffered aggs at least one incremental reduce phase should have happened doesn't hold if there are shard failures. This commit removes this assertion. Relates to #23288	2017-02-21 22:41:49 +01:00
Nik Everett	7475175957	Adds unit test for sampler aggregation (#23243) * Adds unit test for sampler aggregation Relates to #22278	2017-02-21 12:51:47 -05:00
Jim Ferenczi	0ff6356b7e	Revert "Never reduce the same agg twice" This change reverts 5e4ba4a60ecc1169e86c1dd517ed05a225c5278b Incremental reduction of aggs should also work with a single aggregation now that InternalTopHits.equals is fixed.	2017-02-21 18:48:28 +01:00
Simon Willnauer	ce625ebdcc	Expose `batched_reduce_size` via `_search` (#23288) In #23253 we added the ability to incrementally reduce search results. This change exposes the parameter to control the batch size and therefore the memory consumption of a large search request.	2017-02-21 18:36:59 +01:00
Jim Ferenczi	1ba9770037	Fix comparison of double in InternalTopHits InternalTopHits uses "==" to compare hit scores and fails when score is NaN. This commit changes the comparison to always use Double.compare. Relates #23253	2017-02-21 18:18:44 +01:00
Simon Willnauer	5e4ba4a60e	Never reduce the same agg twice Some randomization caused reduction of the same agg multiple times which causes issues on some aggregations. Relates to #23253	2017-02-21 17:55:44 +01:00
Simon Willnauer	489f38918d	Fix incremental reduce randomization in base test cases We can and should randomly reduce down to a single result before passing the aggs to the final reduce. This commit changes the logic to do that and ensures we don't trip the assertions the previous implementation tripped. Relates to #23253	2017-02-21 17:13:46 +01:00
Nik Everett	74c33823ab	Comment	2017-02-21 10:43:29 -05:00
Nik Everett	0dee1f85e6	Remove closeAgg	2017-02-21 10:31:42 -05:00
Tanguy Leroux	3a0fc526bb	UpdateRequest implements ToXContent (#23289) This commit changes UpdateRequest so that it implements the ToXContentObject interface.	2017-02-21 15:20:15 +01:00
Jim Ferenczi	cc865cbc96	Add unit tests for stats and extended stats aggregations (#23287) Add tests for InternalStats, InternalExtendedStats and StatsAggregator/ExtendedStatsAggregator Relates #22278	2017-02-21 15:14:54 +01:00
Simon Willnauer	f933f80902	First step towards incremental reduction of query responses (#23253) Today all query results are buffered up until we have received responses from all shards. This can hold on to a significant amount of memory if the number of shards is large. This commit adds a first step towards incrementally reducing aggregation results once a configurable (per search request) number of responses has been received. If enough query results have been received and buffered, all aggregation responses received so far will be reduced and released to be GCed.	2017-02-21 13:02:48 +01:00
Tanguy Leroux	39ed76c58b	Add parsing method to bulk response (#23234) This commit adds the `fromXContent()` parsing method to BulkResponse.	2017-02-21 10:49:40 +01:00
Tanguy Leroux	c88eb00b83	Add javadoc for DocWriteResponse.Builders (#23267)	2017-02-21 10:19:01 +01:00
Martin Scholz	24bf18b610	Upgrade HDRHistogram to 2.1.9 (#23254)	2017-02-21 08:50:26 +01:00
Martin Scholz	3e292d5245	Migrate TermsQuery to TermInSetQuery (#23229)	2017-02-21 08:49:43 +01:00
Jim Ferenczi	1ff5b318be	Fix for IpRangeAggregatorTests#testRanges Handle null from/to ranges. Closes #23272	2017-02-20 21:16:14 +01:00
Jason Tedor	4c2bd5feab	Introduce sequence-number-aware translog Today, the relationship between Lucene and the translog is rather simple: every document not in Lucene is guaranteed to be in the translog. We need a stronger guarantee from the translog though, namely that it can replay all operations after a certain sequence number. For this to be possible, the translog has to be made sequence-number aware. As a first step, we introduce the min and max sequence numbers into the translog so that each generation knows the possible range of operations contained in the generation. This will enable future work to keep around all generations containing operations after a certain sequence number (e.g., the global checkpoint). Relates #22822	2017-02-20 15:05:24 -05:00
Jason Tedor	15f5810774	Mark IP range aggregator test as awaits fix This test reliably fails with the seed 4AC319F8A6B0329B.	2017-02-20 14:42:16 -05:00
Christoph Büscher	ea7deace5d	Adding fromXContent to Suggest and Suggestion class (#23226) A follow up to #23202, this adds parsing from xContent and tests to the four Suggestion implementations and the top level suggest element to be used later when parsing the entire SearchResponse.	2017-02-20 15:45:10 +01:00
Christoph Büscher	ea9d51114c	Tests: Add unit test for InternalChildren (#23261) Relates to #22278	2017-02-20 14:02:56 +01:00
Jim Ferenczi	76d6b872dd	Add unit tests for GeoBoundsAggregator/InternalGeoBounds (#23259) * Add unit tests for GeoBoundsAggregator/InternalGeoBounds Relates #22278	2017-02-20 12:04:30 +01:00
Jim Ferenczi	69b1463f7c	Add unit tests for BinaryRangeAggregator/InternalBinaryRange (#23255) * Add unit tests for BinaryRangeAggregator/InternalBinaryRange Relates #22278	2017-02-20 11:55:48 +01:00
Tanguy Leroux	872412f645	[Tests] Cleans up DocWriteResponse parsing tests (#23233) This commit cleans up some parsing tests added from the High Level Rest Client: IndexResponseTests, DeleteResponseTests, UpdateResponseTests, BulkItemResponseTests. These tests are now more uniform with the other test-from-to-XContent tests we have: they now shuffle the XContent fields before parsing, the asserting method for parsed objects does not use a Map<String, Object> anymore, and buggy equals/hashCode methods in ShardInfo and ShardInfo.Failure have been removed.	2017-02-20 09:45:33 +01:00
Nik Everett	d9c37ce195	Adds unit test for sampler aggregation Relates to #22278	2017-02-17 16:16:04 -05:00
Nik Everett	d1de9574ea	Checkstyle: Fix link lengths in sampler aggregation	2017-02-17 15:03:57 -05:00
Jay Modi	b234644035	Enforce Content-Type requirement on the rest layer and remove deprecated methods (#23146) This commit enforces the requirement of Content-Type for the REST layer and removes the deprecated methods in transport requests and their usages. While doing this, it turns out that there are many places where *Entity classes are used from the apache http client libraries and many of these usages did not specify the content type. The methods that do not specify a content type explicitly have been added to forbidden apis to prevent more of these from entering our code base. Relates #19388	2017-02-17 14:45:41 -05:00
Adrien Grand	3bd1d46fc7	Add unit tests for terms aggregation objects. (#23149) Relates #22278	2017-02-17 18:01:40 +01:00
javanna	578853f264	Remove stale comment about setting routing before parent Order does not matter anymore since we merged #15371	2017-02-17 17:10:53 +01:00
Yuhao Bi	576e698613	Minor fix of _cat output (#23211) (#23213) One line was missing a trailing "\n"	2017-02-17 10:46:20 +01:00
Jason Tedor	00a8b8799f	Fix control group pattern The file /proc/self/cgroup lists the control groups to which the process belongs. This file is a colon separated list of three fields: 1. a hierarchy ID number 2. a comma-separated list of hierarchies 3. the pathname of the control group in the hierarchy The regex pattern for this contains a bug for the second field. It allows one or two entries in the comma-separated list, but not more. This commit fixes the pattern to allow one or more entries in the comma-separated list. Relates #23219	2017-02-16 15:31:18 -05:00
Christoph Büscher	268d15ec4c	Adding fromXContent to Suggestion.Entry and subclasses (#23202) This adds parsing from xContent to Suggestion.Entry and its subclasses for Terms-, Phrase- and CompletionSuggestion.Entry.	2017-02-16 17:59:55 +01:00
markharwood	1cd1ff6010	Test fix - faulty assumptions about when exceptions are thrown in relation to number of failing shards. (#23205) Search exceptions are thrown only when all shards report failure. Fix changes assertion logic to reflect this. Closes #23203	2017-02-16 13:48:17 +00:00
Jason Tedor	0a5917d182	Fix get HEAD requests Get HEAD requests incorrectly return a content-length header of 0. This commit addresses this by removing the special handling for get HEAD requests, and just relying on the general mechanism that exists for handling HEAD requests in the REST layer. Relates #23186	2017-02-15 13:07:29 -05:00
Christoph Büscher	458ca09e70	Fix checkstyle issue with modifier order in DocWriteResponse	2017-02-15 17:53:39 +01:00
Tanguy Leroux	e8d669f50c	Add parsing methods to BulkItemResponse (#22859) This commit adds a parsing method to the BulkItemResponse class. In order to do that, the way DocWriteResponses are parsed has to be changed: ConstructingObjectParser/ObjectParser is removed in favor of a simpler and more readable way to parse these objects. DocWriteResponse now provides the parseInnerToXContent() method that can be used by subclasses (IndexResponse, UpdateResponse and DeleteResponse) to parse the current token/field and potentially update a DocWriteResponseBuilder. The DocWriteResponseBuilder is a simple POJO used to contain parsed values. It can be passed around from one parsing method to another parsing method. For example, this is what is done in IndexResponse: an IndexResponseBuilder is created in IndexResponse.fromXContent(), it gets passed to IndexResponse.parseXContentFields() that parses fields specific to IndexResponse (like "created") and updates the context, delegating to DocWriteResponse.parseInnerToXContent() the parsing of any other field. Once all XContent is parsed, IndexResponse.fromXContent() uses the method IndexResponseBuilder.build() to create the new instance of IndexResponse. This approach allows reusing parsing code across the class hierarchy while keeping the current behavior. It also allows other objects like BulkItemResponse to reuse the same parsing code to parse DocWriteResponses. Finally, IndexResponseTests, UpdateResponseTests and DeleteResponseTests have been updated to introduce some random shuffling of fields before the XContent is parsed in order to ensure that the parsing code does not rely on field order.	2017-02-15 17:33:10 +01:00
Christoph Büscher	b963144254	Add xcontent parsing to completion suggestion option (#23071) This adds parsing from xContent to the CompletionSuggestion.Entry.Option. The completion suggestion option also inlines the xContent rendering of the contained SearchHit, so in order to reuse the SearchHit parser this also changes the way SearchHit is parsed from using a loop-based parser to using a ConstructingObjectParser that creates an intermediate map representation and then later uses this output to create either a single SearchHit or use it with additional fields defined in the parser for the completion suggestion option.	2017-02-15 16:52:17 +01:00
Jim Ferenczi	3c26754f87	Add BWC index for new released version 5.2.1	2017-02-15 11:14:37 +01:00
Jim Ferenczi	f1aaa71a7f	Create version constants for next bug fix version v5.2.2	2017-02-15 11:13:09 +01:00
Ryan Ernst	048c87d8a5	Improve setting deprecation message (#23156) This change modifies the deprecation log message emitted when a setting is found which is deprecated. The new message indicates docs for the deprecated settings can be found in the breaking changes docs for the next major version. closes #22849	2017-02-14 21:33:13 -08:00
Jason Tedor	6ac1cb660b	Cleanup RestGetIndicesAction.java This commit is just a code cleanup of RestGetIndicesAction.java. For example, we remove an unnecessary class, remove some unnecessary local variables, and simplify some code flow. Relates #23129	2017-02-14 16:51:27 -05:00
Jason Tedor	673754b1d5	Fix get source HEAD requests Get source HEAD requests incorrectly return a content-length header of 0. This commit addresses this by removing the special handling for get source HEAD requests, and just relying on the general mechanism that exists for handling HEAD requests in the REST layer. Relates #23151	2017-02-14 16:37:22 -05:00
Martijn van Groningen	cab43707dc	[percolator] Removed old 2.x bwc logic.	2017-02-14 22:17:17 +01:00
Areek Zillur	e178dc5493	Add request version asserting during replica operation (#23167)	2017-02-14 15:40:55 -05:00
Simon Willnauer	a7a3729596	Add ExpandSearchPhase as a successor for the FetchSearchPhase (#23165) Now that we have more flexible search phases we should move the rather hacky integration of the collapse feature into a real search phase that can be tested and used by itself. This commit adds a new ExpandSearchPhase including a unit test for the phase. It's integrated into the fetch phase as an optional successor.	2017-02-14 17:14:17 +01:00
Adrien Grand	8d6a41f671	Nested queries should avoid adding unnecessary filters when possible. (#23079) When nested objects are present in the mappings, many queries get deoptimized due to the need to exclude documents that are not in the right space. For instance, a filter is applied to all queries that prevents them from matching non-root documents (`+: -_type:__`). Moreover, a filter is applied to all child queries of `nested` queries in order to make sure that the child query only matches child documents (`_type:__nested_path`), which is required by `ToParentBlockJoinQuery` (the Lucene query behind Elasticsearch's `nested` queries). These additional filters slow down `nested` queries. In 1.7-, the cost was somehow amortized by the fact that we cached filters very aggressively. However, this has proven to be a significant source of slowdowns since 2.0 for users of `nested` mappings and queries, see #20797. This change makes the filtering a bit smarter. For instance if the query is a `match_all` query, then we need to exclude nested docs. However, if the query is `foo: bar` then it may only match root documents since `foo` is a top-level field, so no additional filtering is required. Another improvement is to use a `FILTER` clause on all types rather than a `MUST_NOT` clause on all nested paths when possible since `FILTER` clauses are more efficient. Here are some examples of queries and how they get rewritten: ``` "match_all": {} ``` This query gets rewritten to `ConstantScore(+:* -_type:__)` on master and `ConstantScore(_type:AutomatonQuery {\norg.apache.lucene.util.automaton.Automaton@4371da44})` with this change. The automaton is the complement of `_type:__` so it matches the same documents, but is faster since it is now a positive clause. Simplistic performance testing on a 10M index where each root document has 5 nested documents on average gave a latency of 420ms on master and 90ms with this change applied.
``` "term": { "foo": { "value": "0" } } ``` This query is rewritten to `+foo:0 #(ConstantScore(+: -_type:__))^0.0` on master and `foo:0` with this change: we do not need to filter nested docs out since the query cannot match nested docs. While doing performance testing in the same conditions as above, response times went from 250ms to 50ms. ``` "nested": { "path": "nested", "query": { "term": { "nested.foo": { "value": "0" } } } } ``` This query is rewritten to `+ToParentBlockJoinQuery (+nested.foo:0 #_type:__nested) #(ConstantScore(+:* -_type:__))^0.0` on master and `ToParentBlockJoinQuery (nested.foo:0)` with this change. The top-level filter (`-_type:__`) could be removed since `nested` queries only match documents of the parent space, as well as the child filter (`#_type:__nested`) since the child query may only match nested docs since the `nested` object has both `include_in_parent` and `include_in_root` set to `false`. While doing performance testing in the same conditions as above, response times went from 850ms to 270ms.	2017-02-14 16:05:19 +01:00
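
The control group fix in 00a8b8799f describes each /proc/self/cgroup line as three colon-separated fields, where the second field is a comma-separated list that must accept one or more entries. A minimal sketch of such a pattern follows. The class name, method name, and the regex itself are hypothetical and written for illustration; this is not the pattern used in the commit. It also assumes a non-empty second field, so lines with an empty controller list do not match.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration only; not the code from commit 00a8b8799f.
public class CgroupLineSketch {

    // Field 1: hierarchy ID (digits).
    // Field 2: one or more comma-separated entries (the "one or more" is the point of the fix).
    // Field 3: pathname of the control group, starting with '/'.
    static final Pattern LINE =
            Pattern.compile("^\\d+:([^:,]+(?:,[^:,]+)*):(/.*)$");

    // Returns {controllers, path} for a matching line, or null when the line does not match.
    static String[] parse(String line) {
        Matcher m = LINE.matcher(line);
        if (!m.matches()) {
            return null;
        }
        return new String[] { m.group(1), m.group(2) };
    }

    public static void main(String[] args) {
        String[] samples = {
            "4:cpu:/",                      // one entry
            "5:cpu,cpuacct:/user.slice",    // two entries
            "6:a,b,c:/docker/abc"           // three entries; a "one or two" pattern would reject this
        };
        for (String s : samples) {
            String[] parsed = parse(s);
            System.out.println(s + " -> "
                    + (parsed == null ? "no match" : parsed[0] + " at " + parsed[1]));
        }
    }
}
```

The repeated group `(?:,[^:,]+)*` is what lifts the limit on the number of entries; a pattern with a single optional `(?:,[^:,]+)?` would stop at two.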


7610 Commits