OpenSearch

Commit Graph

Author	SHA1	Message	Date
Jim Ferenczi	6319424e4a	Move composite aggregation to core (#27474 ) This change removes the module named aggs-composite and adds the `composite` aggs as a core aggregation. This allows other plugins to use this new aggregation and simplifies the integration in the HL rest client.	2017-11-21 13:31:01 +01:00
Tim Brooks	f37eb1b403	Remove tcp profile from low level nio channel (#27441 ) This is related to #27260. Currently every nio channel has a profile field. Profile is a concept that only relates to the tcp transport. Http channels will not have profiles. This commit moves the profile from the nio channel to the read context. The context is the level that protocol specific features and logic should live.	2017-11-20 12:20:42 -07:00
Tim Brooks	0a8f48d592	Transition transport apis to use void listeners (#27440 ) Currently we use ActionListener<TcpChannel> for connect, close, and send message listeners in TcpTransport. However, all of the listeners have to capture a reference to a channel in the case of the exception api being called. This commit changes these listeners to be type <Void> as passing the channel to onResponse is not necessary. Additionally, this change makes it easier to integrate with low level transports (which use different implementations of TcpChannel).	2017-11-20 10:47:47 -07:00
Michael Basnight	2949c53174	Remove config prompting for secrets and text (#27216 ) This commit removes the ability to use ${prompt.secret} and ${prompt.text} as valid config settings. Secure settings has obsoleted the need for this, and it cleans up some of the code in Bootstrap.	2017-11-19 22:33:17 -06:00
Michael Basnight	cb3e8f4763	Move the CLI into its own subproject (#27114 ) Projects that depend on the CLI currently depend on core. This should not always be the case. The EnvironmentAwareCommand will remain in :core, but the rest of the CLI components have been moved into their own subproject of :core, :core:cli.	2017-11-18 21:42:57 -06:00
Tim Brooks	ce45e29be7	Remove manual tracking of registered channels (#27445 ) This is related to #27260. Currently, every ESSelector keeps track of all channels that are registered with it. ESSelector is just an abstraction over a raw java nio selector. The java nio selector already tracks its own selection keys. This commit removes our tracking and relies on the java nio selector tracking.	2017-11-17 16:20:09 -07:00
David Turner	08a257327f	Remove newline from log message (#27425 ) It leads to harder-to-parse logs that look like this: ``` 1> [2017-11-16T20:46:21,804][INFO ][o.e.t.r.y.ClientYamlTestClient] Adding header Content-Type 1> with value application/json 1> [2017-11-16T20:46:21,812][INFO ][o.e.t.r.y.ClientYamlTestClient] Adding header Content-Type 1> with value application/json 1> [2017-11-16T20:46:21,820][INFO ][o.e.t.r.y.ClientYamlTestClient] Adding header Content-Type 1> with value application/json 1> [2017-11-16T20:46:21,966][INFO ][o.e.t.r.y.ClientYamlTestClient] Adding header Content-Type 1> with value application/json ```	2017-11-17 14:12:06 +00:00
Tim Brooks	f761a0e0e4	Remove unneeded Throwable handling in nio (#27412 ) This is related to #27260. In the nio transport work we do not catch or handle `Throwable`. There are a few places where we have exception handlers that accept `Throwable`. This commit removes those cases.	2017-11-16 18:24:06 -07:00
David Turner	9766b858d0	Prepare for bump to 6.0.1 on the master branch (#27391 ) An assortment of fixes, particularly to version number calculations, in preparation for the bump to 6.0.1.	2017-11-16 18:38:54 +00:00
Tim Brooks	80ef9bbdb1	Remove parameterization from TcpTransport (#27407 ) This commit is a follow up to the work completed in #27132. Essentially it transitions two more methods (sendMessage and getLocalAddress) from Transport to TcpChannel. With this change, there is no longer a need for TcpTransport to be aware of the specific type of channel a transport returns. So that class is no longer parameterized by channel type.	2017-11-16 11:19:36 -07:00
Tim Brooks	35a5922927	Delete unneeded nio client (#27408 ) This is a follow up to #27132. As that PR greatly simplified the connection logic inside a low level transport implementation, much of the functionality provided by the NioClient class is no longer necessary. This commit removes that class.	2017-11-16 09:22:40 -07:00
Jim Ferenczi	623367d793	Add composite aggregator (#26800 ) * This change adds a module called `aggs-composite` that defines a new aggregation named `composite`. The `composite` aggregation is a multi-bucket aggregation that creates composite buckets made of multiple sources. The sources for each bucket can be defined as: * A `terms` source, values are extracted from a field or a script. * A `date_histogram` source, values are extracted from a date field and rounded to the provided interval. This aggregation can be used to retrieve all buckets of a deeply nested aggregation by flattening the nested aggregation into composite buckets. A composite bucket is composed of one value per source and is built for each document as the combinations of values in the provided sources. For instance the following aggregation: ```` "test_agg": { "terms": { "field": "field1" }, "aggs": { "nested_test_agg": { "terms": { "field": "field2" } } } } ```` ... which retrieves the top N terms for `field1` and for each top term in `field1` the top N terms for `field2`, can be replaced by a `composite` aggregation in order to retrieve all the combinations of `field1`, `field2` in the matching documents: ```` "composite_agg": { "composite": { "sources": [ { "field1": { "terms": { "field": "field1" } } }, { "field2": { "terms": { "field": "field2" } } } ] } } ```` The response of the aggregation looks like this: ```` "aggregations": { "composite_agg": { "buckets": [ { "key": { "field1": "alabama", "field2": "almanach" }, "doc_count": 100 }, { "key": { "field1": "alabama", "field2": "calendar" }, "doc_count": 1 }, { "key": { "field1": "arizona", "field2": "calendar" }, "doc_count": 1 } ] } } ```` By default this aggregation returns 10 buckets sorted in ascending order of the composite key. Pagination can be achieved by providing `after` values, the values of the composite key to aggregate after. For instance the following aggregation will aggregate all composite keys that sort after `alabama, calendar`: ```` "composite_agg": { "composite": { "after": {"field1": "alabama", "field2": "calendar"}, "size": 100, "sources": [ { "field1": { "terms": { "field": "field1" } } }, { "field2": { "terms": { "field": "field2" } } } ] } } ```` This aggregation is optimized for indices that set an index sorting that matches the composite source definition. For instance the aggregation above could run faster on indices that define an index sorting like this: ```` "settings": { "index.sort.field": ["field1", "field2"] } ```` In this case the `composite` aggregation can early terminate on each segment. This aggregation also accepts multi-valued fields but disables early termination for these fields even if index sorting matches the sources definition. This is mandatory because index sorting picks only one value per document to perform the sort.	2017-11-16 15:13:36 +01:00
Tim Brooks	ca11085bb6	Add TcpChannel to unify Transport implementations (#27132 ) Right now our different transport implementations must duplicate functionality in order to stay compliant with the requirements of TcpTransport. They must all implement common logic to open channels, close channels, keep track of channels for eventual shutdown, etc. Additionally, there is a weird and complicated relationship between Transport and TransportService. We eventually want to start merging some of the functionality between these classes. This commit starts moving towards a world where TransportService retains all the application logic and channel state. Transport implementations in this world will only be tasked with returning a channel when one is requested, calling transport service when a channel is accepted from a server, and starting / stopping itself. Specifically this commit changes how channels are opened and closed. All Transport implementations now return a channel type that must comply with the new TcpChannel interface. This interface has the methods necessary for TcpTransport to completely manage the lifecycle of a channel. This includes setting the channel up, waiting for connection, adding close listeners, and eventually closing.	2017-11-15 12:38:39 -07:00
Luca Cavanna	382da0f227	REST spec: Validate that api name matches file name that contains it (#27366 ) This commit validates that each spec json file contains an API that has the same name as the file	2017-11-14 14:53:00 +01:00
Simon Willnauer	2299c70371	Allow affix settings to specify dependencies (#27161 ) We use affix settings to group settings / values under a certain namespace. In some cases like login information for instance a setting is only valid if one or more other settings are present. For instance `x.test.user` is only valid if there is an `x.test.passwd` present and vice versa. This change allows to specify such a dependency to prevent settings updates that leave settings in an inconsistent state.	2017-11-13 12:06:36 +01:00
Simon Willnauer	a34c2f0b8d	Ensure external refreshes will also refresh internal searcher to minimize segment creation (#27253 ) We cut over to internal and external IndexReader/IndexSearcher in #26972 which uses two independent searcher managers. This has the downside that refreshes of the external reader will never clear the internal version map, which in turn will trigger additional and potentially unnecessary segment flushes since memory must be freed. Under heavy indexing load with low refresh intervals this can cause excessive segment creation which causes high GC activity and significantly increases the required segment merges. This change adds a dedicated external reference manager that delegates refreshes to the internal reference manager and then `steals` the refreshed reader from the internal reference manager for external usage. This ensures that external and internal readers are consistent on an external refresh. As a side effect this also releases old segments referenced by the internal reference manager, which can potentially hold on to already merged away segments until it is refreshed due to a flush or indexing activity.	2017-11-09 08:40:22 +00:00
Tim Brooks	dc86b4c2ed	Decouple `ChannelFactory` from Tcp classes (#27286 ) * Decouple `ChannelFactory` from Tcp classes This is related to #27260. Currently `ChannelFactory` is tightly coupled to classes related to the elasticsearch Tcp binary protocol. This commit modifies the factory to be able to construct http or other protocol channels.	2017-11-08 14:30:00 -07:00
Jason Tedor	d5451b2037	Die with dignity while merging If an out of memory error is thrown while merging, today we quietly rewrap it into a merge exception and the out of memory error is lost. Instead, we need to rethrow out of memory errors, and in fact any fatal error here, and let those go uncaught so that the node is torn down. This commit causes this to be the case. Relates #27265	2017-11-06 17:55:11 -05:00
Jason Tedor	766d29e7cf	Correctly encode warning headers The warnings headers have a fairly limited set of valid characters (cf. quoted-text in RFC 7230). While we have assertions that we adhere to this set of valid characters ensuring that our warning messages do not violate the specification, we were neglecting the possibility that arbitrary user input would trickle into these warning headers. Thus, missing here were tests for these situations and encoding of characters that appear outside the set of valid characters. This commit addresses this by encoding any characters in a deprecation message that are not from the set of valid characters. Relates #27269	2017-11-06 13:20:30 -05:00
Simon Willnauer	bd7efa908a	Add ability to split shards (#26931 ) This change adds a new `_split` API that allows splitting indices into a new index with a power of two more shards than the source index. This API works alongside the `_shrink` API but doesn't require any shard relocation before indices can be split. The split operation is conceptually an inverse `_shrink` operation since we initialize the index with a _synthetic_ number of routing shards that are used for the consistent hashing at index time. Compared to indices created with earlier versions this might produce slightly different shard distributions but has no impact on the per-index backwards compatibility. For now, the user is required to prepare an index to be splittable by setting the `index.number_of_routing_shards` at index creation time. The setting allows the user to prepare the index to be splittable in factors of `index.number_of_routing_shards`, i.e. if the index is created with `index.number_of_routing_shards: 16` and `index.number_of_shards: 2` it can be split into `4, 8, 16` shards. This is an intermediate step until we can make this the default. This also allows us to safely backport this change to 6.x. The `_split` operation is implemented internally as a DeleteByQuery on the lucene level that is executed while the primary shards execute their initial recovery. Subsequent merges that are triggered due to this operation will not be executed immediately. All merges will be deferred until the shards are started and will then be throttled accordingly. This change is intended for the 6.1 feature release but will not support pre-6.1 indices to be split unless these indices have been shrunk before. In that case these indices can be split backwards into their original number of shards.	2017-11-06 11:37:55 +01:00
Tanguy Leroux	43e7a4a349	Upgrade to Jackson 2.8.10 (#27230 ) While it's not possible to upgrade the Jackson dependencies to their latest versions yet (see #27032 (comment) for more) it's still possible to upgrade to the latest 2.8.x version.	2017-11-06 10:20:05 +01:00
Jim Ferenczi	429275a773	Remove ElasticsearchQueryCachingPolicy (#27190 ) We have a hidden setting called `index.queries.cache.term_queries` that disables caching of term queries in the query cache. However, term queries have not been cached by the Lucene UsageTrackingQueryCachingPolicy since version 6.5. This makes the es policy useless but also makes it impossible to re-enable caching for term queries. This change appeared in Lucene 6.5, so this setting has been a no-op since version 5.4 of Elasticsearch. The change in this PR removes the setting and the custom policy.	2017-11-06 08:26:24 +01:00
David Roberts	749c3ec716	Remove the single argument Environment constructor (#27235 ) Only tests should use the single argument Environment constructor. To enforce this the single arg Environment constructor has been replaced with a test framework factory method. Production code (beyond initial Bootstrap) should always use the same Environment object that Node.getEnvironment() returns. This Environment is also available via dependency injection.	2017-11-04 13:25:09 +00:00
kel	0f21262b36	Do not create directories if repository is readonly (#26909 ) For FsBlobStore and HdfsBlobStore, if the repository is read only, the blob store should be aware of the readonly setting and should not create directories if they don't exist. Closes #21495	2017-11-03 13:10:50 +01:00
Jason Tedor	d6d830ff0b	Fix logic detecting unreleased versions When partitioning version constants into released and unreleased versions, today we have a bug in finding the last unreleased version. Namely, consider the following version constants on the 6.x branch: ..., 5.6.3, 5.6.4, 6.0.0-alpha1, ..., 6.0.0-rc1, 6.0.0-rc2, 6.0.0, 6.1.0. In this case, our convention dictates that: 5.6.4, 6.0.0, and 6.1.0 are unreleased. Today we correctly detect that 6.0.0 and 6.1.0 are unreleased, and then we say the previous patch version is unreleased too. The problem is the logic to remove that previous patch version is broken, it does not skip alphas/betas/RCs which have been released. This commit fixes this by skipping backwards over pre-release versions when finding the previous patch version to remove. Relates #27206	2017-11-01 13:01:45 -04:00
Colin Goodheart-Smithe	99aca9cdfc	Enhances exists queries to reduce need for `_field_names` (#26930 ) * Enhances exists queries to reduce need for `_field_names` Before this change we wrote the names of all the fields in a document to a `_field_names` field and then implemented exists queries as a term query on this field. The problem with this approach is that it bloats the index and also affects indexing performance. This change adds a new method `existsQuery()` to `MappedFieldType` which is implemented by each sub-class. For most field types if doc values are available a `DocValuesFieldExistsQuery` is used, falling back to using `_field_names` if doc values are disabled. Note that only fields where no doc values are available are written to `_field_names`. Closes #26770 * Addresses review comments * Addresses more review comments * implements existsQuery explicitly on every mapper * Reinstates ability to perform term query on `_field_names` * Added bwc depending on index created version * Review Comments * Skips tests that are not supported in 6.1.0 These values will need to be changed after backporting this PR to 6.x	2017-11-01 10:46:59 +00:00
kel	c3e2bdf20c	Raise IllegalArgumentException if query validation failed (#26811 ) Closes #26799	2017-10-31 12:17:27 +01:00
Adrien Grand	3812d3cb43	TopHitsAggregator must propagate calls to `setScorer`. (#27138 ) It is required in order to work correctly with bulk scorer implementations that change the scorer during the collection process. Otherwise sub collectors might call `Scorer.score()` on the wrong scorer. Closes #27131	2017-10-31 09:59:06 +01:00
Jason Tedor	a566942219	Refactor internal engine This commit is a minor refactoring of internal engine to move hooks for generating sequence numbers into the engine itself. As such, we refactor tests that relied on this hook to use the new hook, and remove the hook from the sequence number service itself. Relates #27082	2017-10-30 13:10:20 -04:00
Ryan Ernst	2a8452b513	Reindex: Fix headers in reindex action (#26937 ) The headers passed to reindex were skipped except for the last one. This commit fixes the copying of the headers, as well as adds a base test case for rest client builders to access the headers within the built rest client. relates #22976	2017-10-25 16:37:01 -07:00
olcbean	981b7f4d39	Make yaml test runner stricter by enforcing `required` for paths and parameters (#27035 ) Till now the yaml test runner was verifying that the provided path parts and parameters are supported. With this PR, yaml test runner also checks that all required path parts and parameters are provided.	2017-10-25 19:36:42 +00:00
Luca Cavanna	8caf7d4ff8	Decouple BulkProcessor from ThreadPool (#26727 ) Introduce minimal thread scheduler as a base class for `ThreadPool`. Such a class can be used from the `BulkProcessor` to schedule retries and the flush task. This allows to remove the `ThreadPool` dependency from `BulkProcessor`, which requires to provide settings that contain `node.name` and also needed log4j for logging. Instead, it needs now a `Scheduler` that is much lighter and gets automatically created and shut down on close. Closes #26028	2017-10-25 10:30:23 +02:00
Lee Hinman	fcfbdf1f37	Expose adaptive replica selection stats in /_nodes/stats API This exposes the collected metrics we store for ARS in the nodes stats, as well as the computed rank of nodes. Each node exposes its perspective about the cluster. Here's an example output (with `?human`): ```json ... "adaptive_selection" : { "_k6v1-wERxyUd5ke6s-D0g" : { "outgoing_searches" : 0, "avg_queue_size" : 0, "avg_service_time" : "7.8ms", "avg_service_time_ns" : 7896963, "avg_response_time" : "9ms", "avg_response_time_ns" : 9095598, "rank" : "9.1" }, "VJiCUFoiTpySGmO00eWmtQ" : { "outgoing_searches" : 0, "avg_queue_size" : 0, "avg_service_time" : "1.3ms", "avg_service_time_ns" : 1330240, "avg_response_time" : "4.5ms", "avg_response_time_ns" : 4524154, "rank" : "4.5" }, "DHNGTdzyT9iiaCpEUsIAKA" : { "outgoing_searches" : 0, "avg_queue_size" : 0, "avg_service_time" : "2.1ms", "avg_service_time_ns" : 2113164, "avg_response_time" : "6.3ms", "avg_response_time_ns" : 6375810, "rank" : "6.4" } } ... ```	2017-10-24 08:58:42 -06:00
Tim Brooks	277637f42f	Do not set SO_LINGER on server channels (#26997 ) Right now we are attempting to set SO_LINGER to 0 on server channels when we are stopping the tcp transport. This is not a supported socket option and throws an exception. This also prevents the channels from being closed. This commit 1. doesn't set SO_LINGER for server channels, 2. checks that it is a supported option in nio, and 3. changes the log message to warn for server channel close exceptions.	2017-10-13 13:06:38 -06:00
Jason Tedor	393e73612e	Fix formatting in channel close test This commit fixes the indentation in the transport test case for a channel closing while connecting.	2017-10-10 13:39:45 -04:00
Jason Tedor	4c06b8f1d2	Check for closed connection while opening While opening a connection to a node, a channel can subsequently close. If this happens, a future callback whose purpose is to close all other channels and disconnect from the node will fire. However, this future will not be ready to close all the channels because the connection will not be exposed to the future callback yet. Since this callback is run once, we will never try to disconnect from this node again and we will be left with a closed channel. This commit adds a check that all channels are open before exposing the channel and throws a general connection exception. In this case, the usual connection retry logic will take over. Relates #26932	2017-10-10 13:34:51 -04:00
Simon Willnauer	cdd7c1e6c2	Return List instead of an array from settings (#26903 ) Today we return a `String[]` that requires copying values for every access. Yet, we already store the setting as a list so we can also directly return the unmodifiable list directly. This makes list / array access in settings a much cheaper operation especially if lists are large.	2017-10-09 09:52:08 +02:00
Nhat	bf4c3642b2	remove _primary and _replica shard preferences (#26791 ) The shard preference _primary, _replica and its variants were useful for the asynchronous replication. However, with the current impl, they are no longer useful and should be removed. Closes #26335	2017-10-08 11:03:06 -04:00
Jason Tedor	470e5e7cfc	Add additional low-level logging handler () * Add additional low-level logging handler We have the trace handler which is useful for recording sent messages but there are times where it would be useful to have more low-level logging about the events occurring on a channel. This commit adds a logging handler that can be enabled by setting a certain log level (org.elasticsearch.transport.netty4.ESLoggingHandler) to trace that provides trace logging on low-level channel events and includes some information about the request/response read/write events on the channel as well. * Remove imports * License header * Remove redundant * Add test * More assertions	2017-10-05 12:10:58 -04:00
Martijn van Groningen	b27e408ed2	Removed void token filter entries and added two tests	2017-10-05 13:25:05 +02:00
Md. Abdulla-Al-Sun	a40c474e10	Added Bengali Analyzer to Elasticsearch with respect to the lucene update(PR#238)	2017-10-05 13:25:05 +02:00
Boaz Leskes	2a04118e88	Promote common rest test utility methods to ESRestTestCase We have duplicates in some classes and I was about to create one more.	2017-10-05 10:08:10 +02:00
Simon Willnauer	00dfdf50cf	Represent lists as actual lists inside Settings (#26878 ) Today we represent each value of a list setting with its own dedicated key that ends with the index of the value in the list. Aside from the obvious weirdness this has several issues, especially if lists are massive, since it causes massive runtime penalties when validating settings. For example, a list of 100k words will literally cause a create index call to time out and in turn cause a massive slowdown on all subsequent validation runs. With this change we use a simple string list to represent the list. This change also forbids adding a setting that ends with a .0, which was internally used to detect a list setting. Once this has been rolled out for an entire major version all the internal .0 handling can be removed since all settings will be converted. Relates to #26723	2017-10-05 09:27:08 +02:00
Martijn van Groningen	dca787ed8a	upgrade to Lucene 7.1.0 snapshot version	2017-10-05 09:06:56 +02:00
Simon Willnauer	d1533e2397	Remove Settings#getAsMap() (#26845 ) Since `#getAsMap` exposes internal representation we are trying to remove it step by step. This commit is cleaning up some xcontent writing as well as usage in tests	2017-10-04 01:21:38 -06:00
Boaz Leskes	a18bd9caa2	Increase ESRestTestCase.waitForClusterStateUpdatesToFinish time out to 30s It is set to 10 sec but sometimes it takes the cluster longer to settle.	2017-10-03 12:24:36 +02:00
Tim Brooks	d80ad7f097	Check channel is open before setting SO_LINGER (#26857 ) This commit fixes #26855. Right now we set SO_LINGER to 0 if we are stopping the transport. This can throw a ChannelClosedException if the raw channel is already closed. We have a number of scenarios where it is possible this could be called with a channel that is already closed. This commit fixes the issue by checking that the channel is not closed before attempting to set the socket option.	2017-10-02 15:09:52 -06:00
Tim Brooks	9ae7a80ba5	Move raw selector usage into ESSelector (#26825 ) Currently we only log generic messages about errors in logs from the nio event handler. This means that we do not know which channel had issues connection, reading, writing, etc. This commit changes the logs to include the local and remote addresses and profile for a channel.	2017-10-01 17:59:57 -06:00
Simon Willnauer	7b8d036ab5	Replace group map settings with affix setting (#26819 ) We use group settings historically instead of using a prefix setting which is more restrictive and type safe. The majority of the use cases need to access a key, value map based on the _leaf node_ of the setting, i.e. the setting `index.tag.*` might be used to tag an index with `index.tag.test=42` and `index.tag.staging=12` which then would be turned into a `{"test": 42, "staging": 12}` map. The group settings would always use `Settings#getAsMap` which loses type information and uses the internal representation of the settings. Using prefix settings now allows accessing such a map type-safe and natively.	2017-09-30 14:27:21 +02:00
Tim Brooks	bf403ae028	Add information about nio channels in logs (#26806 ) Currently we only log generic messages about errors in logs from the nio event handler. This means that we do not know which channel had issues connection, reading, writing, etc. This commit changes the logs to include the local and remote addresses and profile for a channel.	2017-09-28 17:11:26 -06:00
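
The `_split` commit listed above (bd7efa908a) gives one worked case: an index created with `index.number_of_routing_shards: 16` and `index.number_of_shards: 2` can be split into 4, 8, or 16 shards. The sketch below reproduces that arithmetic under an assumed rule, namely that a valid target is the source shard count doubled one or more times and must divide the routing shard count evenly. The function name `split_targets` is hypothetical and is not part of any Elasticsearch API; the real validation may differ.

```python
def split_targets(number_of_shards: int, number_of_routing_shards: int) -> list[int]:
    """List candidate shard counts for a split.

    Assumed rule (not taken from the Elasticsearch source): a target is the
    source shard count doubled one or more times, and it must be a factor of
    the routing shard count.
    """
    if number_of_shards < 1:
        raise ValueError("number_of_shards must be at least 1")
    if number_of_routing_shards < number_of_shards:
        raise ValueError("number_of_routing_shards must be >= number_of_shards")

    targets = []
    candidate = number_of_shards * 2
    while candidate <= number_of_routing_shards:
        # Keep only counts that divide the routing shards evenly.
        if number_of_routing_shards % candidate == 0:
            targets.append(candidate)
        candidate *= 2
    return targets


if __name__ == "__main__":
    # The example from the commit message: 2 shards, 16 routing shards.
    print(split_targets(2, 16))
```

With the values from the commit message, the candidates are 4, 8, and 16, matching the text. An index whose shard count already equals its routing shard count yields an empty list under this rule.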

1259 Commits