OpenSearch

Commit Graph

Author	SHA1	Message	Date
Adrien Grand	149ef74b26	Fix `missing` on aggs on `boolean` fields. (#22135 ) The creation of the `ValuesSource` used to pass `DateTimeZone.UTC` as a time zone all the time in case of empty fields in spite of the fact that all doc value formats but the date one reject this parameter. This commit centralizes the creation of the `ValuesSource` and adds unit tests to it. Closes #22009	2016-12-14 10:03:09 +01:00
Daniel Mitterdorfer	7e5058037b	Enable strict duplicate checks for JSON content With this commit we enable the Jackson feature 'STRICT_DUPLICATE_DETECTION' by default. This ensures that JSON keys are always unique. While this has a performance impact, benchmarking has indicated that the typical drop in indexing throughput is around 1 - 2%. As a last resort, we allow users to still disable strict duplicate checks by setting `-Des.json.strict_duplicate_detection=false` which is intentionally undocumented. Closes #19614	2016-12-14 09:35:53 +01:00
Nik Everett	49bdd29f91	Consolidate more parser creation into ESTestCase This will make it easier to add the forthcoming required argument, `NamedXContentRegistry`.	2016-12-13 20:28:41 -05:00
Nik Everett	872984d21a	Continue consolidating `XContentParser` construction in tests (#22145 ) Consolidate more parser creation in tests Moves more parser creation in tests to the `createParser` methods in `ESTestCase`.	2016-12-13 17:22:39 -05:00
Tal Levy	f56097b57a	Fixes GrokProcessor's ignorance of named-captures with same name. (#22131) Grok was originally ignoring potential matches to named-capture groups larger than one. For example, if you had two patterns containing the same named field, but only the second pattern matched, it would fail to pick this up. This PR fixes this by exploring all potential places where a named-capture was used and chooses the first one that matched. Fixes #22117.	2016-12-13 13:19:55 -08:00
Simon Willnauer	7a9b667e98	Introduce a low level protocol handshake (#22094 ) Today we rely on the version that the API user passes in together with the DiscoveryNode. This commit introduces a low level handshake where nodes exchange their version to be used with the transport protocol that is executed every time a connection to a node is established. This, on the one hand allows to change the wire protocol based on the version we are talking to even without a full cluster restart. Today we would need to carry on a BWC layer across major versions but with a handshake we can rely on the fact that the latest version of the previous minor executes a handshake and uses the latest protocol version across all communication with the N+1 version nodes. This change is yet fully backwards compatible, a followup PR will remove the BWC in 6.0 once this has been back-ported to the 5.x branch	2016-12-13 21:06:23 +01:00
Adrien Grand	049fd3991c	Remove `AggregationContext`. (#22124 ) This class is just a wrapper around `SearchContext`, so let's use `SearchContext` directly. The change is mechanical, except the `ValuesSourceConfig` class, where I moved the logic to get a `ValuesSource` given a config.	2016-12-13 09:09:40 +01:00
Luca Cavanna	6d987a9b69	Remove support for empty queries (#22092 ) Our query DSL supports empty queries (`{}`), which have a different meaning depending on the query that holds it, either ignored, match_all or match_none. We deprecated the support for empty queries in 5.0, where we log a deprecation warning wherever they are used. The way we supported it once we moved query parsing to the coordinating node was having an Optional<QueryBuilder> return type in all of our parse methods (called fromXContent). See #17624. The central place for this was QueryParseContext#parseInnerQueryBuilder. We can now remove all the optional return types and simply throw an exception whenever an empty query is found.	2016-12-12 12:37:12 +01:00
Simon Willnauer	01d67e09b9	Detach handshake from connect to node (#22037 ) Today we connect and publish the nodes connection before we execute a handshake with the node we connect to. In the case of connecting to a node that won't pass the handshake this connection is already `published` and other code paths can use it. This commit detaches the connection and the publish of the connection such that `TransportService` can do a handshake before actually connect and publish the connection.	2016-12-10 10:03:26 +01:00
Nik Everett	3adefb7b4a	Begin centralizing XContentParser creation into RestRequest (#22041 ) To get #22003 in cleanly we need to centralize as much `XContentParser` creation as possible into `RestRequest`. That'll mean we have to plumb the `NamedXContentRegistry` into fewer places. This removes `RestAction.hasBody`, `RestAction.guessBodyContentType`, and `RestActions.getRestContent`, moving callers over to `RestRequest.hasContentOrSourceParam`, `RestRequest.contentOrSourceParam`, and `RestRequest.contentOrSourceParamParser` and `RestRequest.withContentOrSourceParamParserOrNull`. The idea is to use `withContentOrSourceParamParserOrNull` if you need to handle requests without any sort of body content and to use `contentOrSourceParamParser` otherwise. I believe the vast majority of this PR to be purely mechanical but I know I've made the following behavioral change (I'll add more if I think of more): * If you make a request to an endpoint that requires a request body and has cut over to the new APIs instead of getting `Failed to derive xcontent` you'll get `Body required`. * Template parsing is now non-strict by default. This is important because we need to be able to deprecate things without requests failing.	2016-12-09 20:23:02 -05:00
Nik Everett	fc2060ba7e	Don't close rest client from its callback (#22061 ) If you try to close the rest client inside one of its callbacks then it blocks itself. The thread pool switches the status to one that requests a shutdown and then waits for the pool to shutdown. When another thread attempts to honor the shutdown request it waits for all the threads in the pool to finish what they are working on. Thus thread a is waiting on thread b while thread b is waiting on thread a. It isn't quite that simple, but it is close. Relates to #22027	2016-12-09 10:39:51 -05:00
Adrien Grand	36f598138a	Start using `ObjectParser` for aggs. (#22048 ) This is an attempt to start moving aggs parsing to `ObjectParser`. There is still A LOT to do, but ObjectParser is way better than the way aggregations parsing works today. For instance in most cases, we reject numbers that are provided as strings, which we are supposed to accept since some client languages (looking at you Perl) cannot make sure to use the appropriate types. Relates to #22009	2016-12-09 09:45:16 +01:00
Ryan Ernst	b1cef5fdf8	Remove 2.0 prerelease version constants (#22004 ) * Remove 2.0 prerelease version constants This is a start to addressing #21887. This removes: * pre 2.0 snapshot format support * automatic units addition to cluster settings * bwc check for delete by query in pre 2.0 indexes	2016-12-08 21:48:35 -08:00
Lee Hinman	ef64d230e7	Merge remote-tracking branch 'dakrone/index-seq-id-and-primary-term'	2016-12-08 19:47:21 -07:00
Lee Hinman	ee22a477df	Add internal _primary_term doc values field, fix _seq_no indexing This adds the `_primary_term` field internally to the mappings. This field is populated with the current shard's primary term. It is intended to be used for collision resolution when two document copies have the same sequence id, therefore, doc_values for the field are stored but the field itself is not indexed. This also fixes the `_seq_no` field so that doc_values are retrievable (they were previously stored but irretrievable) and changes the `stats` implementation to more efficiently use the points API to retrieve the min/max instead of iterating on each doc_value value. Additionally, even though we intend to be able to search on the field, it was previously not searchable. This commit makes it searchable. There is no user-visible `_primary_term` field. Instead, the fields are updated by calling: ```java index.parsedDoc().updateSeqID(seqNum, primaryTerm); ``` This includes example methods in `Versions` and `Engine` for retrieving the sequence id values from the index (see `Engine.getSequenceID`) that are only used in unit tests. These will be extended/replaced by actual implementations once we make use of sequence numbers as a conflict resolution measure. Relates to #10708 Supersedes #21480 P.S. As a side effect of this commit, `SlowCompositeReaderWrapper` cannot be used for documents that contain `_seq_no` because it is a Point value and SCRW cannot wrap documents with points, so the tests have been updated to loop through the `LeafReaderContext`s now instead.	2016-12-08 19:47:03 -07:00
Christoph Büscher	7454a9647b	Add fromXContent to HighlightField This adds a fromXContent method and unit test to the HighlightField class so we can parse it as part of a search response. This is part of the preparation for parsing search responses on the client side.	2016-12-07 16:32:44 +01:00
Nik Everett	ef83dbfbe6	Reindex: Better error message for pipeline in wrong place (#21985 ) `_update_by_query` supports specifying the `pipeline` to process the documents as a url parameter but `_reindex` doesn't. It doesn't because everything about the `_reindex` request that has to do with writing the documents is grouped under the `dest` object in the request body. This changes the response parameter from `request [_reindex] contains unrecognized parameter: [pipeline]` to `_reindex doesn't support [pipeline] as a query parmaeter. Specify it in the [dest] object instead.`	2016-12-06 14:55:46 -05:00
Ryan Ernst	c8f241f284	Plugins: Remove response action filters (#21950 ) Action filters currently have the ability to filter both the request and response. But the response side was not actually used. This change removes support for filtering responses with action filters.	2016-12-05 16:14:04 -08:00
Nik Everett	2087234d74	Timeout improvements for rest client and reindex (#21741 ) Changes the default socket and connection timeouts for the rest client from 10 seconds to the more generous 30 seconds. Defaults reindex-from-remote to those timeouts and make the timeouts configurable like so: ``` POST _reindex { "source": { "remote": { "host": "http://otherhost:9200", "socket_timeout": "1m", "connect_timeout": "10s" }, "index": "source", "query": { "match": { "test": "data" } } }, "dest": { "index": "dest" } } ``` Closes #21707	2016-12-05 10:54:51 -05:00
Igor Motov	c391b3fff6	Add proper descriptions to reindex, update-by-query and delete-by-query tasks. Related to #21768	2016-12-02 21:46:38 -05:00
Jack Conradson	0ecdef026d	Test fix for def equals test in Painless. (#21945 ) Closes #21801	2016-12-02 14:41:13 -08:00
Nik Everett	0c724b1878	Keep context during reindex's retries (#21941 ) * Keep context during reindex's retries This fixes reindex and friend's retries to keep the context. * Docs	2016-12-02 13:48:51 -05:00
Simon Willnauer	842e00c689	[TEST] Add back skip of external clusters	2016-12-02 11:53:33 +01:00
Simon Willnauer	572b4c3e72	Port assert from 5.x to master I added an assertion to Netty4/Netty3Transport in 5.x that is not in master yet. This commit ports the assert to ensure we consumed all connections in `connectToChannels`	2016-12-02 10:34:33 +01:00
Simon Willnauer	adf9bd90a4	Remove legacy BWC test infrastructure and tests (#21915 ) We don't use the test infra nor do we run the tests. They might all be entirely out of date. We also have a different BWC test infra in-place. This change removes all of the legacy infra.	2016-12-02 08:06:20 +01:00
Simon Willnauer	155de53fe3	Add a connect timeout to the ConnectionProfile to allow per node connect timeouts (#21847) Timeouts are global today across all connections. This commit allows specifying a connection timeout per node so that, depending on the context, connections can be established with different timeouts. Relates to #19719	2016-12-01 15:39:49 +01:00
Boaz Leskes	fe01c0f83b	fix TemplateQueryBuilderTests & Murmur3FieldMapperTests	2016-12-01 14:21:57 +01:00
Simon Willnauer	dd5256c324	Reduce number of connections per node depending on the nodes role (#21849 ) We currently treat every node equally when we establish connections to a node. Yet, if we are not master eligible or can't hold any data there is no point in creating a dedicated connection for sending the cluster state or running remote recoveries respectively. The usage of STATE and RECOVERY connections on non-master and/or non-data nodes will result in an IllegalStateException.	2016-12-01 08:00:48 +01:00
Jason Tedor	6c45695d52	Add version 5.1.1 This commit removes the version constant for 5.1.0 (due to an inadvertent release) and adds the version constant for 5.1.1. Relates #21890	2016-11-30 11:14:17 -05:00
Luca Cavanna	5b8bdba12e	Remove subrequests method from CompositeIndicesRequest (#21873 )	2016-11-30 15:03:58 +01:00
Adrien Grand	6231009a8f	Remove 2.x backward compatibility of mappings. (#21670 ) For the record, I also had to remove the geo-hash cell and geo-distance range queries to make the code compile. These queries already throw an exception in all cases with 5.x indices, so that does not hurt any more. I also had to rename all 2.x bwc indices from `index-${version}` to `unsupported-${version}` to make `OldIndexBackwardCompatibilityIT` happy.	2016-11-30 13:34:46 +01:00
Luca Cavanna	6eaff9432d	SearchTemplateRequest to implement CompositeIndicesRequest (#21865 ) SearchTemplateRequest to implement CompositeIndicesRequest Given that SearchTemplateRequest effectively delegates to search when a search is being executed, it should implement the CompositeIndicesRequest interface. The subrequests method should return a single search request. When a search is not going to be executed, because we are in simulate mode, there are no inner requests, and there are no corresponding indices to that request either. Closes #21747	2016-11-29 20:52:43 +01:00
Jim Ferenczi	d791ddf704	Upgrade to lucene-6.4.0-snapshot-ec38570 (#21853 ) Set lucene version to 6.4.0-snapshot-ec38570 and update all the sha1s/license Fix invalid combo after upgrade in query_string query. split_on_whitespace=false is disallowed if auto_generate_phrase_queries=true Adapt the expectations of some tests to the new format of the Lucene explain output	2016-11-29 18:40:31 +01:00
Nicholas Knize	af1ab68b64	Add RangeFieldMapper for numeric and date range types Lucene 6.2 added index and query support for numeric ranges. This commit adds a new RangeFieldMapper for indexing numeric (int, long, float, double) and date ranges and creating appropriate range and term queries. The design is similar to NumericFieldMapper in that it uses a RangeType enumerator for implementing the logic specific to each type. The following range types are supported by this field mapper: int_range, float_range, long_range, double_range, date_range. Lucene does not provide a DocValue field specific to RangeField types so the RangeFieldMapper implements a CustomRangeDocValuesField for handling doc value support. When executing a Range query over a Range field, the RangeQueryBuilder has been enhanced to accept a new relation parameter for defining the type of query as one of: WITHIN, CONTAINS, INTERSECTS. This provides support for finding all ranges that are related to a specific range in a desired way. As with other spatial queries, DISJOINT can be achieved as a MUST_NOT of an INTERSECTS query.	2016-11-29 10:10:14 -06:00
Simon Willnauer	f5ff69fabe	Remove connectToNodeLight and replace it with a connection profile (#21799) The Transport#connectToNodeLight concept is confusing, not very flexible, and not really testable at the unit-test level. This commit cleans up the code used to connect to nodes and simplifies transport implementations to share more code. This also allows connecting to nodes with custom profiles if needed; for instance, future improvements can be added to connect to/from nodes that are non-data nodes without dedicated bulk and recovery connections.	2016-11-29 09:35:07 +01:00
Jason Tedor	a6082eb563	Grant Netty permission to read system somaxconn When Netty listens on a socket, it specifies the established connection backlog for the socket. On Linux, Netty tries to read the system-wide configuration for this from /proc/sys/net/core/somaxconn and falls back to a default value when it can not read this value. This commit grants Netty permission to read this file so that it can honor the system-wide configuration for the connection backlog for sockets that it is listening on. This also removes an obnoxious stack trace that appears when Netty logging is set to debug logging. Relates #21840	2016-11-28 18:47:32 -05:00
Luca Cavanna	360b74eda8	[TEST] Don't reinitialize YamlTestClient and RestClient before each single test (#21807 ) In the past we ran yaml tests against an internal cluster, which would get restarted after each test failure, hence the client objects needed to eventually be refreshed before each test. That is why we had the initClient method to re-initialize the YamlTestClient in the execution context. We ended up though re-initializing the client unconditionally, which is not needed. Also, ESRestTestCase recreates the RestClient against the external cluster before each test, which is not needed given that nothing changes in the external cluster. This commit removes the initClient method from the yaml tests execution context. The YamlTestClient can be eagerly created before the first yaml test runs and then re-used in subsequent tests. Also api calls to check for nodes versions etc. are moved out of YamlTestClient to ESClientYamlSuiteTestCase. Also the RestClient is now initialized in ESRestTestCase before the first test runs, and kept around afterwards as a static member. Basically each subclass of EsRestTestCase will have its own RestClient instance, but the client will be shared across the different tests within the same class. The yaml test suite is just a special suite, composed of 600+ tests that are loaded from files, which will share the same client instance. This change should speed tests up as well, as we don't recreate the RestClient before each single test, and we don't call _cat/nodes either before each single test.	2016-11-28 18:43:27 +01:00
Jason Tedor	6f95261632	Remove unused imports from Netty4Utils This commit removes two unused imports from Netty4Utils that were leftover from a previous change.	2016-11-27 13:18:50 -05:00
Jason Tedor	5e73282bbc	Simplify handling of fatal network layer errors This commit simplifies the handling of fatal errors on the network layer. The simplification here is to remove the use of a StringWriter/PrintWriter pair to format the stack trace, removing the need for the method to declare that it throws a checked IOException.	2016-11-27 13:14:24 -05:00
Tanguy Leroux	28dc02f01a	[Test] Mute EqualsTests..testBranch(Not)EqualsDefAndPrimitive It fails regularly and it is tracked by https://github.com/elastic/elasticsearch/issues/21801	2016-11-25 17:21:59 +01:00
Ryan Ernst	c3ec8e22b8	Wrap VerifyError in ScriptException (#21769 ) If a bug occurs in painless compilation (not from a user, but from the painless infrastructure), a VerifyError may be thrown when compiling the broken generated class. This commit wraps VerifyErrors in ScriptException so that useful information is returned to the user, which can be passed on to the ES team for analysis.	2016-11-23 14:45:21 -08:00
Jack Conradson	ba2d772668	Fix a VerifyError bug in Painless (#21765 ) This bug would cause a VerifyError when scripts using the === operator were comparing a def type against a primitive type since the primitive type wasn't being appropriately boxed.	2016-11-23 13:57:14 -08:00
Jason Tedor	8416b16dfd	Improve handling of unreleased versions Today when handling unreleased versions for backwards compatibility support, we scatter version constants across the code base and add some asserts to support removing these constants when the version in question is actually released. This commit improves this situation, enabling us to just add a single unreleased version constant that can be renamed when the version is actually released. This should make maintenance of these versions simpler. Relates #21760	2016-11-23 15:49:05 -05:00
Nik Everett	434fa4bd26	Docs and tests for painless lack of boxing for ?: and ?. (#21756 ) NOTE: The result of `?.` and `?:` can't be assigned to primitives. So `int[] someArray = null; int l = someArray?.length` and `int s = params.size ?: 100` don't work. Do `def someArray = null; def l = someArray?.length` and `def s = params.size ?: 100` instead. Relates to #21748	2016-11-23 14:33:32 -05:00
Ryan Ernst	6940b2b8c7	Remove groovy scripting language (#21607 ) * Scripting: Remove groovy scripting language Groovy was deprecated in 5.0. This change removes it, along with the legacy default language infrastructure in scripting.	2016-11-22 19:24:12 -08:00
Nik Everett	dbdcf9e95c	Move painless yaml tests into painless dir They were in a directory named "plan_a", the old name for painless.	2016-11-22 20:27:14 -05:00
Nik Everett	457c2d8fb0	Add Debug.explain to painless You can use `Debug.explain(someObject)` in painless to throw an `Error` that can't be caught by painless code and contains an object's class. This is useful because painless's sandbox doesn't allow you to call `someObject.getClass()`. Closes #20263	2016-11-22 12:46:02 -05:00
Jason Tedor	446037ccb8	Die with dignity on the network layer When a fatal error is thrown on the network layer, such an error never makes its way to the uncaught exception handler. This prevents the node from being torn down if an out of memory error or other fatal error is thrown while handling HTTP or transport traffic. This commit adds logic to ensure that such errors bubble their way up to the uncaught exception handler, even though Netty tries really hard to swallow everything. Relates #21720	2016-11-21 22:14:30 -05:00
Nik Everett	f5c8c746e6	Implement toString in painless's AST This should make debugging painless' analysis and code generation a little easier. The `toString` implementations mirror the AST somewhat, and look like `(SSource (SReturn (ENumeric 1)))`.	2016-11-21 16:24:10 -05:00
Simon Willnauer	cb5c25ab4f	Add a StreamInput#readArraySize method that ensures sane array sizes (#21697) Today we read a vint from the stream to allocate the size of an array up-front before we start reading the values. This can be dangerous if, for instance, we read from a corrupted stream or if some manipulated bytes are sent, for instance, from an attacker or a fuzzer. In most cases we can apply some best effort and validate the array size to be _sane_ by ensuring we can read at least N bytes, where N is the expected size of the array.	2016-11-21 21:39:21 +01:00
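
The array-size check described in commit cb5c25ab4f can be illustrated with a minimal sketch. This is not the Elasticsearch `StreamInput` code: the class `ArraySizeCheck`, both method bodies, the exception types, and the use of `java.nio.ByteBuffer` are assumptions made for illustration, and the variable-length int layout shown (7 payload bits per byte, low-order group first) is assumed rather than taken from the commit. The only idea taken from the commit message is that a declared size N is rejected unless at least N bytes remain to be read.

```java
import java.nio.ByteBuffer;

// Hypothetical illustration only; not the Elasticsearch StreamInput implementation.
final class ArraySizeCheck {

    // Decodes a variable-length int: 7 payload bits per byte, low-order group
    // first, with the high bit set on every byte except the last (assumed layout).
    static int readVInt(ByteBuffer in) {
        int value = 0;
        for (int shift = 0; shift < 35; shift += 7) {
            byte b = in.get(); // throws BufferUnderflowException if the buffer is exhausted
            value |= (b & 0x7F) << shift;
            if ((b & 0x80) == 0) {
                return value;
            }
        }
        throw new IllegalStateException("vint is longer than 5 bytes");
    }

    // Reads the declared array size and rejects it unless at least that many
    // bytes remain, so a corrupted or hostile size cannot trigger a huge allocation.
    static int readArraySize(ByteBuffer in) {
        int size = readVInt(in);
        if (size < 0) {
            throw new IllegalStateException("negative array size: " + size);
        }
        if (size > in.remaining()) {
            throw new IllegalStateException(
                "array size " + size + " exceeds the " + in.remaining() + " bytes remaining");
        }
        return size;
    }

    public static void main(String[] args) {
        // Declared size 3, followed by exactly 3 bytes: accepted.
        ByteBuffer ok = ByteBuffer.wrap(new byte[] {3, 10, 20, 30});
        byte[] payload = new byte[readArraySize(ok)];
        ok.get(payload);
        System.out.println("read " + payload.length + " bytes");

        // Declared size 1,000,000, followed by only 2 bytes: rejected before allocating.
        ByteBuffer bad = ByteBuffer.wrap(new byte[] {(byte) 0xC0, (byte) 0x84, 0x3D, 1, 2});
        try {
            readArraySize(bad);
        } catch (IllegalStateException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

The check is best effort, as the commit message says: it assumes each element occupies at least one byte, so it bounds the allocation by the remaining input but does not prove the payload itself is well formed.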


3766 Commits