OpenSearch

Commit Graph

Author	SHA1	Message	Date
Nhat Nguyen	262d3c0783	Allow engine to recover from translog up to a seqno (#33032) This change allows an engine to recover from its local translog up to the given seqno. The extended API can be used in these use cases: When a replica starts following a new primary, it resets its index to the safe commit, then replays its local translog up to the current global checkpoint (see #32867). When a replica starts a peer-recovery, it can initialize the start_sequence_number to the persisted global checkpoint instead of the local checkpoint of the safe commit. A replica will then replay its local translog up to that global checkpoint before accepting remote translog from the primary. This change will increase the chance of operation-based recovery. I will make this in a follow-up. Relates #32867	2018-08-22 07:57:44 -04:00
David Turner	ab000323fa	Allow extension of CapturingTransport by subclasses (#33012) Today, CapturingTransport#createCapturingTransportService creates a transport service with a connection manager with reasonable default behaviours, but overriding this behaviour in a consumer is a little tricky. Additionally, the default behaviour for opening a connection duplicates the content of the CapturingTransport#openConnection() method. This change removes this duplication by delegating to openConnection() and introduces overridable nodeConnected() and onSendRequest() methods so that consumers can alter this behaviour more easily. Relates #32246 in which we test the mechanisms for opening connections to unknown (and possibly unreachable) nodes.	2018-08-22 09:09:08 +01:00
Alpar Torok	82d10b484a	Run forbidden api checks with runtimeJavaVersion (#32947) Run forbidden APIs checks with runtime Java version	2018-08-22 09:05:22 +03:00
Simon Willnauer	92076497e5	Use a dedicated ConnectionManager for RemoteClusterConnection (#32988) This change introduces a dedicated ConnectionManager for every RemoteClusterConnection such that there is no state shared with the TransportService internal ConnectionManager. All connections to a remote cluster are isolated from the TransportService but still use the TransportService and its internal properties like the Transport, tracing and internal listener actions on disconnects etc. This allows a remote cluster connection to have a different lifecycle than a local cluster connection, also local discovery code doesn't get notified if there is a disconnect from a remote cluster and each connection can use its own dedicated connection profile which allows a reduced set of connections per cluster without conflicting with the local cluster. Closes #31835	2018-08-21 12:43:25 +02:00
Tim Brooks	cd83ddcecc	Fix assertion in AbstractSimpleTransportTestCase (#32991 ) This is a follow-up to #32956. That commit incorrectly used assertBusy which led to a possible race in the test. This commit fixes it.	2018-08-20 16:09:22 -06:00
David Turner	cd6326b391	Introduce PreVoteCollector (#32847) An election requires a node to select a term that is higher than all previously-seen terms. If a node is too enthusiastic about starting elections then it can effectively exclude itself from the cluster until the leader can bump to a still-higher term, and if this process repeats then a single faulty node can prevent the cluster from making useful progress. The solution is to start the election with a pre-voting round to ensure that there is at least a quorum of nodes who believe there to be no leader. This also fixes up some merge issues.	2018-08-20 17:48:05 +01:00
Tim Brooks	faa42de66d	Pass DiscoveryNode to initiateChannel (#32958 ) This is related to #32517. This commit passes the DiscoveryNode to the initiateChannel method for different Transport implementation. This will allow additional attributes (besides just the socket address) to be used when opening channels.	2018-08-20 08:54:55 -06:00
David Turner	f6891cd222	Fixup after merge	2018-08-20 08:58:03 +01:00
David Turner	f317562c82	Merge branch 'master' into zen2	2018-08-20 08:33:55 +01:00
Alpar Torok	4b34b3f4aa	Set forbidden APIs target compatibility to compiler java version (#32935 ) Set forbidden apis target compatibility to compiler version Fix outstanding deprecation	2018-08-20 09:27:02 +03:00
Tim Brooks	de92d2ef1f	Move connection listener to ConnectionManager (#32956) This is a followup to #31886. After that commit the TransportConnectionListener had to be propagated to both the Transport and the ConnectionManager. This commit moves that listener to completely live in the ConnectionManager. The request and response related methods are moved to a TransportMessageListener. That listener continues to live in the Transport class.	2018-08-18 10:09:24 -06:00
Tim Brooks	2464b68613	Move connection profile into connection manager (#32858) This is related to #31835. It moves the default connection profile into the ConnectionManager class. This will allow us to have different connection managers with different profiles.	2018-08-15 09:08:33 -06:00
Lee Hinman	48281ac5bc	Use generic AcknowledgedResponse instead of extended classes (#32859 ) This removes custom Response classes that extend `AcknowledgedResponse` and do nothing, these classes are not needed and we can directly use the non-abstract super-class instead. While this appears to be a large PR, no code has actually changed, only class names have been changed and entire classes removed.	2018-08-15 08:06:14 -06:00
Ryan Ernst	0158b59a5a	Test: Fix forbidden uses in test framework (#32824 ) This commit fixes existing uses of forbidden apis in the test framework and re-enables the forbidden apis check. It was previously completely disabled and had missed a rename of the forbidden apis signatures files. closes #32772	2018-08-14 11:35:09 -07:00
Tim Brooks	10fddb62ee	Remove client connections from TcpTransport (#31886 ) This is related to #31835. This commit adds a connection manager that manages client connections to other nodes. This means that the TcpTransport no longer maintains a map of nodes that it is connected to.	2018-08-13 16:44:09 -06:00
Armin Braun	d412230cda	SCRIPTING: Support BucketAggScript return null (#32811 ) * As explained in #32790, `BucketAggregationScript` must support `null` as a return value * Closes #32790	2018-08-13 20:08:26 +02:00
Yannick Welsch	e122505a91	Zen2: Deterministic MasterService (#32493 ) Increases testability of MasterService and the discovery layer. Changes: - Async publish method - Moved a few interfaces/classes top-level to simplify imports - Deterministic MasterService implementation for tests	2018-08-13 18:03:08 +02:00
Nik Everett	f5ba801c6b	Test: Only sniff host metadata for node_selectors (#32750 ) Our rest testing framework has support for sniffing the host metadata on startup and, before this change, it'd sniff that metadata before running the first test. This prevents running these tests against elasticsearch installations that won't support sniffing like Elastic Cloud. This change allows tests to only sniff for metadata when they encounter a test with a `node_selector`. These selectors are the things that need the metadata anyway and they are super rare. Tests that use these won't be able to run against installations that don't support sniffing but we can just skip them. In the case of Elastic Cloud, these tests were never going to work against Elastic Cloud anyway.	2018-08-10 13:35:47 -04:00
Christoph Büscher	22f7b03430	Fix test reproducibility in AbstractBuilderTestCase setup (#32403) Currently AbstractBuilderTestCase generates certain random values in its `beforeTest()` method annotated with @Before only the first time that a test method in the suite is run while initializing the serviceHolder that we use for the rest of the test. This changes the values of subsequent random values and has the effect that when running single methods from a test suite with "-Dtests.method=*", the random values it sees are different from when the same test method is run as part of the whole test suite. This makes it hard to use the reproduction lines logged on failure. This change runs the initialization of the serviceHolder and the randomization connected to it using the test runner's master seed, so reproduction by running just one method is possible again. Closes #32400	2018-08-10 15:13:44 +02:00
Boaz Leskes	f58ed21720	Refactor TransportShardBulkAction to better support retries (#31821) Processing bulk request goes item by item. Sometimes during processing, we need to stop execution and wait for a new mapping update to be processed by the node. This is currently achieved by throwing a `RetryOnPrimaryException`, which is caught higher up. When the exception is caught, we wait for the next cluster state to arrive and process the request again. Sadly this is a problem because all operations that were already done until the mapping change was required are applied again and get new sequence numbers. This in turn means that the previously issued sequence numbers are never replicated to the replicas. That causes the local checkpoint of those shards to be stuck and with it all the seq# based infrastructure. This commit refactors how we deal with retries with the goal of removing `RetryOnPrimaryException` and `RetryOnReplicaException` (not done yet). It does so by introducing a class `BulkPrimaryExecutionContext` that is used to capture the execution state and allows continuing from where the execution stopped. The class also formalizes the steps each item has to go through: 1) A translation phase for updates 2) Execution phase (always index/delete) 3) Waiting for a mapping update to come in, if needed 4) Requires a retry (for updates and cases where the mappings are still not available after the put mapping call returns) 5) A finalization phase which allows updates to the index/delete result to an update result.	2018-08-10 10:15:01 +02:00
Alpar Torok	af8c23eb40	Java version reproduction (#32715) Enhance reproduction line with info about jdks Provide the ability to control compiler and Java versions just by passing a property. The actual java home comes from the `JAVA<major>_HOME` env vars that we already require. This works better with the Gradle daemon as well. Output is also changed a bit. for `-Druntime.java=8 -Dcompiler.java=9`: ``` ======================================= Elasticsearch Build Hamster says Hello! Gradle Version : 4.9 OS Info : Linux 4.17.8-1-ARCH (amd64) Compiler JDK Version : 11 (Oracle Corporation 11-ea [OpenJDK 64-Bit Server VM 11-ea+22]) Runtime JDK Version : 11 (Oracle Corporation 11-ea [OpenJDK 64-Bit Server VM 11-ea+22]) Gradle JDK Version : 10 (Oracle Corporation 10.0.1 [OpenJDK 64-Bit Server VM 10.0.1+10]) Compiler java.home : /home/alpar/opt/jdk-11-ea22/ Runtime java.home : /home/alpar/opt/jdk-11-ea22/ Gradle java.home : /usr/lib/jvm/java-10-openjdk Random Testing Seed : EA858533191E8DFB ======================================= ``` Without configuration: ``` ======================================= Elasticsearch Build Hamster says Hello! ======================================= Gradle Version : 4.9 OS Info : Linux 4.17.8-1-ARCH (amd64) JDK Version : 10 (Oracle Corporation 10.0.1 [OpenJDK 64-Bit Server VM 10.0.1+10]) JAVA_HOME : /usr/lib/jvm/java-10-openjdk Random Testing Seed : 4BD5B2A839C8FCA1 ======================================= ``` Here's what a reproduction line will look like (test made to fail): ``` ./gradlew :modules:lang-painless:test -Dtests.seed=2DA2379065A4EEAB -Dtests.class=org.elasticsearch.painless.AdditionTests -Dtests.method="testInt" -Dtests.security.manager=true -Dtests.locale=es-PE -Dtests.timezone=WET -Dcompiler.java=10 -Druntime.java=10 ```	2018-08-10 08:07:43 +00:00
Armin Braun	79375d35bb	Scripting: Replace Update Context (#32096 ) * SCRIPTING: Move Update Scripts to their own context * Added system property for backwards compatibility of change to `ctx.params`	2018-08-09 14:32:36 +02:00
Jason Tedor	dcc816427e	Expose whether or not the global checkpoint updated (#32659 ) It will be useful for future efforts to know if the global checkpoint was updated. To this end, we need to expose whether or not the global checkpoint was updated when the state of the replication tracker updates. For this, we add to the tracker a callback that is invoked whenever the global checkpoint is updated. For primaries this will be invoked when the computed global checkpoint is updated based on state changes to the tracker. For replicas this will be invoked when the local knowledge of the global checkpoint is advanced from the primary.	2018-08-07 15:10:09 -04:00
Tim Brooks	3d5e9114e3	Reduce connections used by MockNioTransport (#32620) The MockNioTransport (similar to the MockTcpTransport) is used for integ tests. The MockTcpTransport has always only opened a single connection for all of its work. The MockNioTransport has always opened the default number of connections (13). This means that every test where two transports connect requires 26 connections. This is more than is necessary. This commit modifies the MockNioTransport to only require 3 connections.	2018-08-07 12:52:28 -06:00
Lee Hinman	b3e15851a2	[TEST] Comment out account breaker assertion while diagnosing Relates to #30290	2018-08-07 09:36:37 -06:00
David Turner	289e34aeed	[Zen2] Add HandshakingTransportAddressConnector (#32643 ) The `PeerFinder`, introduced in #32246, needs to be able to identify, and connect to, a remote master node using only its `TransportAddress`. This can be done by opening a single-channel connection to the address, performing a handshake, and only then forming a full-blown connection to the node. This change implements this logic.	2018-08-07 13:34:07 +01:00
Armin Braun	0a67cb4133	LOGGING: Upgrade to Log4J 2.11.1 (#32616 ) * LOGGING: Upgrade to Log4J 2.11.1 * Upgrade to `2.11.1` to fix memory leaks in slow logger when logging large requests * This was caused by a bug in Log4J https://issues.apache.org/jira/browse/LOG4J2-2269 and is fixed in `2.11.1` via https://git-wip-us.apache.org/repos/asf?p=logging-log4j2.git;h=9496c0c * Fixes #32537 * Fixes #27300	2018-08-06 14:56:21 +02:00
Armin Braun	6fa7016bbf	SCRIPTING: Move Aggregation Scripts to their own context (#32068 ) * SCRIPTING: Move Aggregation Scripts to their own context	2018-08-04 10:37:07 +02:00
Yannick Welsch	0d60e8a029	Fix race between replica reset and primary promotion (#32442 ) We've recently seen a number of test failures that tripped an assertion in IndexShard (see issues linked below), leading to the discovery of a race between resetting a replica when it learns about a higher term and when the same replica is promoted to primary. This commit fixes the race by distinguishing between a cluster state primary term (called pendingPrimaryTerm) and a shard-level operation term. The former is set during the cluster state update or when a replica learns about a new primary. The latter is only incremented under the operation block, which can happen in a delayed fashion. It also solves the issue where a replica that's still adjusting to the new term receives a cluster state update that promotes it to primary, which can happen in the situation of multiple nodes being shut down in short succession. In that case, the cluster state update thread would call `asyncBlockOperations` in `updateShardState`, which in turn would throw an exception as blocking permits is not allowed while an ongoing block is in place, subsequently failing the shard. This commit therefore extends the IndexShardOperationPermits to allow it to queue multiple blocks (which will all take precedence over operations acquiring permits). Finally, it also moves the primary activation of the replication tracker under the operation block, so that the actual transition to primary only happens under the operation block. Relates to #32431, #32304 and #32118	2018-08-03 09:33:08 +02:00
Yannick Welsch	db6e8c736d	Remove cluster state initial customs (#32501 ) This infrastructure was introduced in #26144 and made obsolete in #30743	2018-08-02 15:49:59 +02:00
Jay Modi	f2f33f3149	Use hostname instead of IP with SPNEGO test (#32514 ) This change updates KerberosAuthenticationIT to resolve the host used to connect to the test cluster. This is needed because the host could be an IP address but SPNEGO requires a hostname to work properly. This is done by adding a hook in ESRestTestCase for building the HttpHost from the host and port. Additionally, the project now specifies the IPv4 loopback address as the http host. This is done because we need to be able to resolve the address used for the HTTP transport before the node starts up, but the http.ports file is not written until the node is started. Closes #32498	2018-08-01 12:57:33 +10:00
Nik Everett	22459576d7	Logging: Make node name consistent in logger (#31588) First, some background: we have 15 different methods to get a logger in Elasticsearch but they can be broken down into three broad categories based on what information is provided when building the logger. Just a class like: ``` private static final Logger logger = ESLoggerFactory.getLogger(ActionModule.class); ``` or: ``` protected final Logger logger = Loggers.getLogger(getClass()); ``` The class and settings: ``` this.logger = Loggers.getLogger(getClass(), settings); ``` Or more information like: ``` Loggers.getLogger("index.store.deletes", settings, shardId) ``` The goal of the "class and settings" variant is to attach the node name to the logger. Because we don't always have the settings available, we often use the "just a class" variant and get loggers without node names attached. There isn't any real consistency here. Some loggers get the node name because it is convenient and some do not. This change makes the node name available to all loggers all the time. Almost. There are some caveats around testing that I'll get to. But in production code the node name is now available to all loggers. This means we can stop using the "class and settings" variants to fetch loggers which was the real goal here, but a pleasant side effect is that the node name is now consistent on every log line and optional by editing the logging pattern. This is all powered by setting the node name statically on a logging formatter very early in initialization. Now to tests: tests can't set the node name statically because subclasses of `ESIntegTestCase` run many nodes in the same jvm, even in the same class loader. Also, lots of tests don't run with a real node so they don't have a node name at all. To support multiple nodes in the same JVM tests suss out the node name from the thread name which works surprisingly well and is easy to test in a nice way.
For those threads that are not part of an `ESIntegTestCase` node we stick whatever useful information we can get from the thread name in the place of the node name. This allows us to keep the logger format consistent.	2018-07-31 10:54:24 -04:00
Luca Cavanna	9a4d0069f6	REST high-level client: parse back _ignored meta field (#32362 ) `GetResult` and `SearchHit` have been adjusted to parse back the `_ignored` meta field whenever it gets printed out. Expanded the existing tests to make sure this is covered. Fixed also a small problem around highlighted fields in `SearchHitTests`.	2018-07-30 13:43:40 +02:00
Armin Braun	1628c833c7	TESTS: Move netty leak detection to paranoid level (#32354 )	2018-07-26 21:36:49 +02:00
Jim Ferenczi	8e5f281b27	AbstractQueryTestCase should run without type less often (#28936) This commit changes the randomization to always create an index with a type. It also adds a way to create a query shard context that maps to an index with no type registered in order to explicitly test cases where there is no type.	2018-07-26 20:29:05 +02:00
Jason Tedor	eb675a1c4d	Introduce index store plugins (#32375 ) Today we allow plugins to add index store implementations yet we are not doing this in our new way of managing plugins as pull versus push. That is, today we still allow plugins to push index store providers via an on index module call where they can turn around and add an index store. Aside from being inconsistent with how we manage plugins today where we would look to pull such implementations from plugins at node creation time, it also means that we do not know at a top-level (for example, in the indices service) which index stores are available. This commit addresses this by adding a dedicated plugin type for index store plugins, removing the index module hook for adding index stores, and by aggregating these into the top-level of the indices service.	2018-07-26 08:05:49 -04:00
Tim Vernum	387c3c7f1d	Introduce Application Privileges with support for Kibana RBAC (#32309 ) This commit introduces "Application Privileges" to the X-Pack security model. Application Privileges are managed within Elasticsearch, and can be tested with the _has_privileges API, but do not grant access to any actions or resources within Elasticsearch. Their purpose is to allow applications outside of Elasticsearch to represent and store their own privileges model within Elasticsearch roles. Access to manage application privileges is handled in a new way that grants permission to specific application names only. This lays the foundation for more OLS on cluster privileges, which is implemented by allowing a cluster permission to inspect not just the action being executed, but also the request to which the action is applied. To support this, a "conditional cluster privilege" is introduced, which is like the existing cluster privilege, except that it has a Predicate over the request as well as over the action name. Specifically, this adds - GET/PUT/DELETE actions for defining application level privileges - application privileges in role definitions - application privileges in the has_privileges API - changes to the cluster permission class to support checking of request objects - a new "global" element on role definition to provide cluster object level security (only for manage application privileges) - changes to `kibana_user`, `kibana_dashboard_only_user` and `kibana_system` roles to use and manage application privileges Closes #29820 Closes #31559	2018-07-24 10:34:46 -06:00
Yogesh Gaikwad	a525c36c60	[Kerberos] Add Kerberos authentication support (#32263 ) This commit adds support for Kerberos authentication with a platinum license. Kerberos authentication support relies on SPNEGO, which is triggered by challenging clients with a 401 response with the `WWW-Authenticate: Negotiate` header. A SPNEGO client will then provide a Kerberos ticket in the `Authorization` header. The tickets are validated using Java's built-in GSS support. The JVM uses a vm wide configuration for Kerberos, so there can be only one Kerberos realm. This is enforced by a bootstrap check that also enforces the existence of the keytab file. In many cases a fallback authentication mechanism is needed when SPNEGO authentication is not available. In order to support this, the DefaultAuthenticationFailureHandler now takes a list of failure response headers. For example, one realm can provide a `WWW-Authenticate: Negotiate` header as its default and another could provide `WWW-Authenticate: Basic` to indicate to the client that basic authentication can be used in place of SPNEGO. In order to test Kerberos, unit tests are run against an in-memory KDC that is backed by an in-memory ldap server. A QA project has also been added to test against an actual KDC, which is provided by the krb5kdc fixture. Closes #30243	2018-07-24 08:44:26 -06:00
Daniel Mitterdorfer	73a38895fd	Add Restore Snapshot High Level REST API With this commit we add the restore snapshot API to the Java high level REST client. Relates #27205 Relates #32155	2018-07-24 16:17:09 +02:00
Ioannis Kakavas	a2dbd83db1	Allow Integ Tests to run in a FIPS-140 JVM (#31989 ) * Complete changes for running IT in a fips JVM - Mute :x-pack:qa:sql:security:ssl:integTest as it cannot run in FIPS 140 JVM until the SQL CLI supports key/cert. - Set default JVM keystore/truststore password in top level build script for all integTest tasks in a FIPS 140 JVM - Changed top level x-pack build script to use keys and certificates for trust/key material when spinning up clusters for IT	2018-07-24 12:48:14 +03:00
Andrey Ershov	33f11e637d	Fail shard if IndexShard#storeStats runs into an IOException (#32241 ) Fail shard if IndexShard#storeStats runs into an IOException. Closes #29008	2018-07-23 16:38:55 +02:00
Christoph Büscher	ff87b7aba4	Remove unnecessary warning suppressions (#32250)	2018-07-23 11:31:04 +02:00
Armin Braun	24068a773d	TESTS: Check for Netty resource leaks (#31861 ) * Enabled advanced leak detection when loading `EsTestCase` * Added custom `Appender` to collect leak logs and check for logged errors in a way similar to what is done for the `StatusLogger` * Fixes #20398	2018-07-20 09:12:32 +02:00
Julie Tibshirani	15ff3da653	Add support for field aliases. (#32172 ) * Add basic support for field aliases in index mappings. (#31287) * Allow for aliases when fetching stored fields. (#31411) * Add tests around accessing field aliases in scripts. (#31417) * Add documentation around field aliases. (#31538) * Add validation for field alias mappings. (#31518) * Return both concrete fields and aliases in DocumentFieldMappers#getMapper. (#31671) * Make sure that field-level security is enforced when using field aliases. (#31807) * Add more comprehensive tests for field aliases in queries + aggregations. (#31565) * Remove the deprecated method DocumentFieldMappers#getFieldMapper. (#32148)	2018-07-18 09:33:09 -07:00
Boaz Leskes	5856c396dd	A replica can be promoted and started in one cluster state update (#32042) When a replica is fully recovered (i.e., in `POST_RECOVERY` state) we send a request to the master to start the shard. The master changes the state of the replica and publishes a cluster state to that effect. In certain cases, that cluster state can be processed on the node hosting the replica together with a cluster state that promotes that, now started, replica to a primary. This can happen due to cluster state batched processing or if the master died after having committed the cluster state that starts the shard but before publishing it to the node with the replica. If the master also held the primary shard, the new master node will remove the primary (as it failed) and will also immediately promote the replica (thinking it is started). Sadly our code in IndexShard didn't allow for this which caused [assertions](`13917162ad/server/src/main/java/org/elasticsearch/index/seqno/ReplicationTracker.java (L482)`) to be tripped in some of our test runs.	2018-07-18 11:30:44 +02:00
Boaz Leskes	93d7468f3a	ESIndexLevelReplicationTestCase doesn't support replicated failures but it's good to know what they are Sometimes we have a test failure that hits an `UnsupportedOperationException` in this infrastructure. When debugging you want to know what caused this unexpected failure, but right now we're silent about it. This commit adds some information to the `UnsupportedOperationException` Relates to #32127	2018-07-18 08:49:16 +02:00
Nhat Nguyen	df1380b8d3	Remove versionType from translog (#31945) With the introduction of sequence number, we no longer use versionType to resolve out of order collision in replication and recovery requests. This PR removes the versionType from translog. We can only remove it in 7.0 because it is still required in a mixed cluster between 6.x and 5.x.	2018-07-17 21:59:48 -04:00
Ioannis Kakavas	9e529d9d58	Enable testing in FIPS140 JVM (#31666) Ensure our tests can run in a FIPS JVM. JKS keystores cannot be used in a FIPS JVM as attempting to use one in order to init a KeyManagerFactory or a TrustManagerFactory is not allowed (JKS keystore algorithms for private key encryption are not FIPS 140 approved). This commit replaces JKS keystores in our tests with the corresponding PEM encoded key and certificates both for key and trust configurations. Whenever it's not possible to refactor the test, i.e. when we are testing that we can load a JKS keystore, etc. we attempt to mute the test when we are running in FIPS 140 JVM. Testing for the JVM is naive and is based on the name of the security provider as we would control the testing infrastructure and so this would be reliable enough. Other cases of tests being muted are the ones that involve custom TrustStoreManagers or KeyStoreManagers, null TLS Ciphers and the SAMLAuthenticator class as we cannot sign XML documents in the way we were doing. SAMLAuthenticator tests in a FIPS JVM can be reenabled with precomputed and signed SAML messages at a later stage. IT will be covered in a subsequent PR	2018-07-17 10:54:10 +03:00
Daniel Mitterdorfer	016e8760f0	Turn off real-mem breaker in single node tests With this commit we disable the real-memory circuit breaker in tests that inherit from `ESSingleNodeTestCase`. As this breaker is based on real memory usage over which we have no (full) control in tests and their purpose is also not to test the circuit breaker, we use the deterministic circuit breaker implementation that only accounts for explicitly reserved memory. Closes #32047 Relates #32071	2018-07-16 10:40:36 +02:00
Armin Braun	3679d00a74	Replace Ingest ScriptContext with Custom Interface (#32003 ) * Replace Ingest ScriptContext with Custom Interface * Make org.elasticsearch.ingest.common.ScriptProcessorTests#testScripting more precise * Don't mock script factory in ScriptProcessorTests * Adjust mock script plugin in IT for new API	2018-07-13 23:26:10 +02:00
Vladimir Dolzhenko	b1bf643e41	lazy snapshot repository initialization (#31606 ) lazy snapshot repository initialization	2018-07-13 20:05:49 +02:00
Colin Goodheart-Smithe	0edb096eb4	Adds a new auto-interval date histogram (#28993) * Adds a new auto-interval date histogram This change adds a new type of histogram aggregation called `auto_date_histogram` where you can specify the target number of buckets you require and it will find an appropriate interval for the returned buckets. The aggregation works by first collecting documents in buckets at second interval, when it has created more than the target number of buckets it merges these buckets into minute interval bucket and continues collecting until it reaches the target number of buckets again. It will keep merging buckets when it exceeds the target until either collection is finished or the highest interval (currently years) is reached. A similar process happens at reduce time. This aggregation intentionally does not support min_doc_count, offset and extended_bounds to keep the already complex logic from becoming more complex. The aggregation accepts sub-aggregations but will always operate in `breadth_first` mode deferring the computation of sub-aggregations until the final buckets from the shard are known. min_doc_count is effectively hard-coded to zero meaning that we will insert empty buckets where necessary. Closes #9572 * Adds documentation * Added sub aggregator test * Fixes failing docs test * Brings branch up to date with master changes * trying to get tests to pass again * Fixes multiBucketConsumer accounting * Collects more buckets than needed on shards This gives us more options at reduce time in terms of how we do the final merge of the buckets to produce the final result * Revert "Collects more buckets than needed on shards" This reverts commit 993c782d117892af9a3c86a51921cdee630a3ac5. * Adds ability to merge within a rounding * Fixes non-timezone doc test failure * Fix time zone tests * iterates on tests * Adds test case and documentation changes Added some notes in the documentation about the intervals that can be returned.
Also added a test case that utilises the merging of consecutive buckets * Fixes performance bug The bug meant that getAppropriate rounding took a huge amount of time if the range of the data was large but also sparsely populated. In these situations the rounding would be very low so iterating through the rounding values from the min key to the max key took a long time (~120 seconds in one test). The solution is to add a rough estimate first which chooses the rounding based just on the long values of the min and max keys alone but selects the rounding one lower than the one it thinks is appropriate so the accurate method can choose the final rounding taking into account the fact that intervals are not always fixed length. The commit also adds more tests * Changes to only do complex reduction on final reduce * merge latest with master * correct tests and add a new test case for 10k buckets * refactor to perform bucket number check in innerBuild * correctly derive bucket setting, update tests to increase bucket threshold * fix checkstyle * address code review comments * add documentation for default buckets * fix typo	2018-07-13 13:08:35 -04:00
Daniel Mitterdorfer	f174f72fee	Circuit-break based on real memory usage With this commit we introduce a new circuit-breaking strategy to the parent circuit breaker. Contrary to the current implementation which only accounts for memory reserved via child circuit breakers, the new strategy measures real heap memory usage at the time of reservation. This allows us to be much more aggressive with the circuit breaker limit so we bump it to 95% by default. The new strategy is turned on by default and can be controlled with the new cluster setting `indices.breaker.total.userealmemory`. Note that we turn it off for all integration tests with an internal test cluster because it leads to spurious test failures which are of no value (we cannot fully control heap memory usage in tests). All REST tests, however, will make use of the real memory circuit breaker. Relates #31767	2018-07-13 10:08:28 +02:00
olcbean	334c255516	XContentTests : Insert random fields at random positions (#30867 ) Currently AbstractXContentTestCase#testFromXContent appends random fields, but in a fixed position. This PR shuffles all fields after the random fields have been appended, hence the random fields are actually added to random positions.	2018-07-12 19:10:51 +02:00
Nik Everett	38e09a1508	Switch test framework to new style requests (#31939 ) In #29623 we added `Request` object flavored requests to the low level REST client and in #30315 we deprecated the old `performRequest`s. This changes all calls in the `test/framework` project to use the new versions.	2018-07-11 10:04:17 -04:00
Armin Braun	b4087d69d2	Fix assertIngestDocument wrongfully passing (#31913 ) * Fix assertIngestDocument wrongfully passing * Previously docA being subset of docB passed because iteration was over docA's keys only * Scalars in nested fields were not compared in all cases * Assertion errors were hard to interpret (message wasn't correct since it only mentioned the class type) * In cases where two paths contained different types a ClassCastException was thrown instead of an AssertionError * Fixes #28492	2018-07-11 10:24:21 +02:00
Alexander Reelsen	1c32497c44	Date: Add DateFormatters class that uses java.time (#31856 ) A newly added class called DateFormatters now contains java.time based builders for dates, which also intends to be fully backwards compatible, when the name based date formatters are picked. Also a new class named CompoundDateTimeFormatter for being able to parse multiple different formats has been added. A duelling test class has been added that ensures the same dates when parsing java or joda time formatted dates for the name based dates. Note, that java.time and joda time are not fully backwards compatible, which also means that old formats will currently not work with this setup.	2018-07-10 09:28:28 +02:00
Yannick Welsch	cce7dc20ad	Smaller aesthetic fixes to InternalTestCluster (#31831 ) Allows cluster to auto-reconfigure faster by starting up nodes in parallel.	2018-07-06 11:42:09 +02:00
Nik Everett	1099060735	Test: Do not remove xpack templates when cleaning (#31642 ) At the end of every `ESRestTestCase` we clean the cluster which includes deleting all of the templates. If xpack is installed it'll automatically recreate a few templates every time they are removed. Which is slow. This change stops the cleanup from removing the xpack templates. It cuts the time to run the docs tests more than in half and it probably saves a bit more time on other tests as well.	2018-07-05 09:43:43 -04:00
Christoph Büscher	bd1c513422	Reduce more raw types warnings (#31780 ) Similar to #31523.	2018-07-05 15:38:06 +02:00
Simon Willnauer	3f2a241b7f	Detach Transport from TransportService (#31727 ) Today TransportService is tightly coupled with Transport since it requires an instance of TransportService in order to receive responses and send requests. This is mainly due to the Request and Response handlers being maintained in TransportService but also because of the lack of a proper callback interface. This change moves request handler registry and response handler registration into Transport and adds all necessary methods to `TransportConnectionListener` in order to remove the `TransportService` dependency from `Transport` Transport now accepts one or more `TransportConnectionListener` instances that are executed sequentially in a blocking fashion.	2018-07-04 11:32:35 +02:00
Yannick Welsch	2bb4f38371	Add writeBlob option to replace existing blob (#31729 ) Adds a new parameter to the BlobContainer#writeBlob methods to specify whether the existing file should be overridden or not. For some metadata files in the repository, we actually want to replace the current file. This is currently implemented through an explicit blob delete and then a fresh write. In case of using a cloud provider (S3, GCS, Azure), this results in 2 API requests instead of just 1. This change will therefore allow us to achieve the same functionality using less API requests.	2018-07-03 09:13:50 +02:00
Christoph Büscher	31aabe4bf9	Clean up double semicolon code typos (#31687 )	2018-07-02 15:14:44 +02:00
Konrad Beiske	2971dd56ca	Enable setting client path prefix to / (#30119 ) Some proxies require all requests to have paths starting with / since there are no relative paths at the HTTP connection level. Elasticsearch assumes paths are absolute. In order to run rest tests against a cluster behind such a proxy, set the system property tests.rest.client_path_prefix to /.	2018-07-01 13:42:03 -04:00
Tanguy Leroux	d8b3f332ef	Remove extra check for object existence in repository-gcs read object (#31661 )	2018-06-29 13:52:31 +02:00
Tanguy Leroux	0ef22db844	[Test] Clean up some repository-s3 tests (#31601 ) This commit removes some tests in the repository-s3 plugin that have not been executed for 2+ years but have been maintained for nothing. Most of the tests in AbstractAwsTestCase were obsolete or superseded by fixture based integration tests.	2018-06-29 13:21:29 +02:00
Ryan Ernst	f924835265	Core: Require all actions have a Task (#31627 ) The TaskManager and TaskAwareRequest could return null when registering a task according to their javadocs, but no implementations ever actually did that. This commit removes that wording from the javadocs and ensures null is no longer allowed.	2018-06-28 08:24:03 -07:00
Alpar Torok	8557bbab28	Upgrade gradle wrapper to 4.8 (#31525 ) * Move to Gradle 4.8 RC1 * Use latest version of plugin The current does not work with Gradle 4.8 RC1 * Switch to Gradle GA * Add and configure build compare plugin * add work-around for https://github.com/gradle/gradle/issues/5692 * work around https://github.com/gradle/gradle/issues/5696 * Make use of Gradle build compare with reference project * Make the manifest more compare friendly * Clear the manifest in compare friendly mode * Remove animalsniffer from buildscript classpath * Fix javadoc errors * Fix doc issues * reference Gradle issues in comments * Conditionally configure build compare * Fix some more doclint issues * fix typo in build script * Add sanity check to make sure the test task was replaced Relates to #31324. It seems like Gradle has an inconsistent behavior and the task is not always replaced. * Include number of non conforming tasks in the exception. * No longer replace test task, create implicit instead Closes #31324. The issue has full context in comments. With this change the `test` task becomes nothing more than an alias for `utest`. Some of the stand alone tests that had a `test` task now have `integTest`, and a few of them that used to have `integTest` to run multiple tests now only have `check`. This will also help separate unit/micro tests from integration tests. * Revert "No longer replace test task, create implicit instead" This reverts commit f1ebaf7d93e4a0a19e751109bf620477dc35023c. * Fix replacement of the test task Based on information from gradle/gradle#5730 replace the task taking into account the task providers. Closes #31324. * Only apply build compare plugin if needed * Make sure test runs before integTest * Fix doclint after merge * PR review comments * Switch to Gradle 4.8.1 and remove workaround * PR review comments * Consolidate task ordering	2018-06-28 08:13:21 +03:00
Luca Cavanna	a35b5341c4	[TEST] call yaml client close method from test suite (#31591 ) We added a way to close the yaml test client with #31575. Such close method also needs to be called from the test suite though for the additional clients to be closed.	2018-06-27 08:23:53 +02:00
Luca Cavanna	823a9d34da	[TEST] Close additional clients created while running yaml tests (#31575 ) We recently introduced a mechanism that allows specifying a node selector as part of do sections (see #31471). When a node selector that is not the default one is configured, a new client will be initialized with the same properties as the default one, but with the specified node selector. This commit improves this mechanism by also closing the additional clients being created and adding an equals/hashcode impl to the custom node selector, as they are cached in a map.	2018-06-26 16:56:35 +02:00
Alpar Torok	08b8d11e30	Add support for switching distribution for all integration tests (#30874 ) * remove left-over comment * make sure of the property for plugins * skip installing modules if these exist in the distribution * Log the distribution being run * Don't allow running with integ-tests-zip passed externally * top level x-pack/qa can't run with oss distro * Add support for matching objects in lists Makes it possible to have a key that points to a list and assert that a certain object is present in the list. All keys have to be present and values have to match. The objects in the source list may have additional fields. example: ``` match: { 'nodes.$master.plugins': { name: ingest-attachment } } ``` * Update plugin and module tests to work with other distributions Some of the tests expected that the integration tests will always be run with the `integ-test-zip` distribution so that there will be no other plugins loaded. With this change, we check for the presence of the plugin without assuming exclusivity. * Allow modules to run on other distros as well To match the behavior of tests.distributions * Add and use a new `contains` assertion Replaces the previous changes that caused `match` to do a partial match. * Implement PR review comments	2018-06-26 06:49:03 -07:00
Nik Everett	232c71b6bf	QA: Create xpack yaml features (#31403 ) This creates YAML test "features" that indicate whether the cluster being tested has xpack installed (`xpack`) or does not have xpack installed (`no_xpack`). It uses those features to centralize skipping a few tests that fail if xpack is installed. The plan is to use this in a followup to skip docs tests that require xpack when xpack is not installed. We plan to use the declaration of required license level on the docs page to generate the required `skip`. Closes #30933.	2018-06-26 09:26:48 -04:00
Sohaib Iftikhar	ca4c857a90	Improve test times for tests using `RandomObjects::addFields` (#31556 ) Currently RandomObjects::addFields can potentially generate a large number of fields This commit decreases the chances that a new object or array is added as a new branch of an object, which lowers the probability of ending up with very big documents generated. It also reduces the number of documents generated for the SimulatePipelineResponseTests from 10 to 5 to reduce the testing time required for parsing.	2018-06-26 12:39:53 +02:00
Christoph Büscher	86ab3a2d1a	Reduce number of raw types warnings (#31523 ) A first attempt to reduce the number of raw type warnings, most of the time by using the unbounded wildcard.	2018-06-25 15:59:03 +02:00
Jonathan Little	8e4768890a	Migrate scripted metric aggregation scripts to ScriptContext design (#30111 ) * Migrate scripted metric aggregation scripts to ScriptContext design #29328 * Rename new script context container class and add clarifying comments to remaining references to params._agg(s) * Misc cleanup: make mock metric agg script inner classes static * Move _score to an accessor rather than an arg for scripted metric agg scripts This causes the score to be evaluated only when it's used. * Documentation changes for params._agg -> agg * Migration doc addition for scripted metric aggs _agg object change * Rename "agg" Scripted Metric Aggregation script context variable to "state" * Rename a private base class from ...Agg to ...State that I missed in my last commit * Clean up imports after merge	2018-06-25 12:01:33 +01:00
Vladimir Dolzhenko	f04c579203	IndexShard should not return null stats (#31528 ) IndexShard should not return null stats - empty stats or AlreadyClosedException if it's closed is better	2018-06-22 21:08:11 +02:00
Luca Cavanna	16e4e7a7cf	Node selector per client rather than per request (#31471 ) We have made node selectors configurable per request, but all of other language clients don't allow for that. A good reason not to do so, is that having a different node selector per request breaks round-robin. This commit makes NodeSelector configurable only at client initialization. It also improves the docs on this matter, important given that a single node selector can still affect round-robin.	2018-06-22 17:15:29 +02:00
Ryan Ernst	59e7c6411a	Core: Combine messageReceived methods in TransportRequestHandler (#31519 ) TransportRequestHandler currently contains 2 messageReceived methods, one which takes a Task, and one that does not. The first just delegates to the second. This commit changes all existing implementors of TransportRequestHandler to implement the version which takes Task, thus allowing the class to be a functional interface, and eliminating the need to throw exceptions when a task needs to be ensured.	2018-06-22 07:36:03 -07:00
Yannick Welsch	f22f91c57a	Allow multiple unicast host providers (#31509 ) Introduces support for multiple host providers, which allows the settings based hosts resolver to be treated just as any other UnicastHostsProvider. Also introduces the notion of a HostsResolver so that plugins such as FileBasedDiscovery do not need to create their own thread pool for resolving hosts, making it easier to add new similar kind of plugins.	2018-06-22 15:31:23 +02:00
Yannick Welsch	da69ab28c7	Return transport addresses from UnicastHostsProvider (#31426 ) With #20695 we removed local transport and there is just TransportAddress now. The UnicastHostsProvider currently returns DiscoveryNode instances, where, during pinging, we're actually only making use of the TransportAddress to establish a first connection to the possible new node. To simplify the interface, we can just return a list of transport addresses instead, which means that it's not necessary anymore to create fake node objects in each plugin just to return the address information.	2018-06-21 16:00:26 +02:00
Tim Brooks	86423f9563	Ensure local addresses aren't null (#31440 ) Currently we set local addresses on the creation time of a NioChannel. However, this may return null as the local address may not have been set yet. An example is the local address has not been set on a client channel as the connection process is not yet complete. This PR modifies the getter to set the local field if it is currently null.	2018-06-20 19:50:14 -06:00
Ryan Ernst	00283a61e1	Remove unused generic type for client execute method (#31444 ) This commit removes the request builder generic type for AbstractClient as it was unused.	2018-06-20 16:26:26 -07:00
Tim Brooks	9ab1325953	Introduce http and tcp server channels (#31446 ) Historically in TcpTransport server channels were represented by the same channel interface as socket channels. This was necessary as TcpTransport was parameterized by the channel type. This commit introduces TcpServerChannel and HttpServerChannel classes. Additionally, it adds the implementations for the various transports. This allows server channels to have unique functionality and not implement the methods they do not support (such as send and getRemoteAddress). Additionally, with the introduction of HttpServerChannel this commit extracts some of the storing and closing channel work to the abstract http server transport.	2018-06-20 16:34:56 -06:00
Nhat Nguyen	db1b97fd85	Remove QueryCachingPolicy#ALWAYS_CACHE (#31451 ) The QueryCachingPolicy#ALWAYS_CACHE was deprecated in Lucene-7.4 and will be removed in Lucene-8.0. This change replaces it with QueryCachingPolicy. This also makes INDEX_QUERY_CACHE_EVERYTHING_SETTING visible in testing only.	2018-06-20 10:34:08 -04:00
Tim Brooks	529e704b11	Unify http channels and exception handling (#31379 ) This is a general cleanup of channels and exception handling in http. This commit introduces a CloseableChannel that is a superclass of TcpChannel and HttpChannel. This allows us to unify the closing logic between tcp and http transports. Additionally, the normal http channels are extracted to the abstract server transport. Finally, this commit (mostly) unifies the exception handling between nio and netty4 http server transports.	2018-06-19 11:50:03 -06:00
Ryan Ernst	e67aa96c81	Core: Combine Action and GenericAction (#31405 ) Since #30966, Action no longer has anything but a call to the GenericAction super constructor. This commit renames GenericAction into Action, thus eliminating the Action class. Additionally, this commit removes the Request generic parameter of the class, since it was unused.	2018-06-18 23:53:04 +02:00
Simon Willnauer	3d5f113ada	Ensure we don't use a remote profile if cluster name matches (#31331 ) If we run into a race condition between a node being configured as a remote node for cross cluster search etc. and that node joining the cluster, we might connect to that node with a remote profile. If that node now joins the cluster it connected to as a CCS remote node, we use the wrong profile and can't use bulk connections etc. anymore. This change uses the remote profile only if we connect to a node that has a different cluster name than the local cluster. This is not a perfect fix for this situation but it is the safe option, and we potentially only lose a small optimization of using fewer connections per node, which is small anyway since we only connect to a small set of nodes. Closes #29321	2018-06-17 13:32:53 +02:00
Tal Levy	3b70e943eb	add is-write-index flag to aliases (#30942 ) This commit adds the is-write-index flag for aliases. It allows requests to set the flag, and responses to display the flag. It does not validate and/or affect any indexing/getting/updating behavior of Elasticsearch -- this will be done in a follow-up PR.	2018-06-15 08:45:29 -07:00
Nhat Nguyen	8453ca638d	Upgrade to Lucene-7.4.0-snapshot-518d303506 (#31360 )	2018-06-15 10:58:21 -04:00
Nik Everett	856936c286	REST Client: NodeSelector for node attributes (#31296 ) Add a `NodeSelector` so that users can filter the nodes that receive requests based on node attributes. I believe we'll need this to backport #30523 and we want it anyway. I also added a bash script to help with rebuilding the sniffer parsing test documents.	2018-06-15 08:04:54 -04:00
Nhat Nguyen	e5b7137508	TEST: getCapturedRequestsAndClear should be atomic (#31312 ) We might lose messages between getCapturedRequestsAndClear calls. This commit makes sure that both getCapturedRequestsAndClear and getCapturedRequestsByTargetNodeAndClear are atomic.	2018-06-14 21:32:07 -04:00
Tim Brooks	fcf1e41e42	Extract common http logic to server (#31311 ) This is related to #28898. With the addition of the http nio transport, we now have two different modules that provide http transports. Currently most of the http logic lives at the module level. However, some of this logic can live in server. In particular, some of the setting of headers, cors, and pipelining. This commit begins moving in that direction by introducing lower level abstractions (HttpChannel, HttpRequest, and HttpResponse) that are implemented by the modules. The higher level rest request and rest channel work can live entirely in server.	2018-06-14 15:10:02 -06:00
Tanguy Leroux	bbfe1eccc7	[Tests] Mutualize fixtures code in BaseHttpFixture (#31210 ) Many fixtures have similar code for writing the pid & ports files or for handling HTTP requests. This commit adds an AbstractHttpFixture class in the test framework that can be extended for specific testing purposes.	2018-06-14 14:09:56 +02:00
Tanguy Leroux	4d7447cb5e	Reenable Checkstyle's unused import rule (#31270 )	2018-06-14 09:52:46 +02:00
Nik Everett	77bb93557e	Test: Remove broken yml test feature (#31255 ) The `requires_replica` yaml test feature hasn't worked for years. This is what happens if you try to use it: ``` > Throwable #1: java.lang.NullPointerException > at __randomizedtesting.SeedInfo.seed([E6602FB306244B12:6E341069A8D826EA]:0) > at org.elasticsearch.test.rest.yaml.Features.areAllSupported(Features.java:58) > at org.elasticsearch.test.rest.yaml.section.SkipSection.skip(SkipSection.java:144) > at org.elasticsearch.test.rest.yaml.ESClientYamlSuiteTestCase.test(ESClientYamlSuiteTestCase.java:321) ``` None of our tests use it.	2018-06-13 09:33:06 -04:00
Tanguy Leroux	8b4d80ad09	Fix AntFixture waiting condition (#31272 ) The AntFixture waiting condition is evaluated to false but it should be true.	2018-06-13 12:40:22 +02:00
Ryan Ernst	a65b18f19d	Core: Remove plain execute method on TransportAction (#30998 ) TransportAction has many variants of execute. One of those variants executes by returning a future, which is then often blocked on by calling get(). This commit removes this variant of execute, instead using a helper method for tests that want to block, or having tests pass in a PlainActionFuture directly as a listener. Co-authored-by: Simon Willnauer <simonw@apache.org>	2018-06-13 09:58:13 +02:00
Van0SS	d5e8a5cd69	REST high-level client: add Cluster Health API (#29331 ) Relates to #27205	2018-06-12 13:34:06 +02:00
olcbean	7d7ead95b2	Add Get Aliases API to the high-level REST client (#28799 ) Given the weirdness of the response returned by the get alias API, we went for a client specific response, which allows us to hold the error message, exception and status returned as part of the response together with aliases. See #30536 . Relates to #27205	2018-06-12 10:26:17 +02:00
Nik Everett	0d9b78834f	LLClient: Support host selection (#30523 ) Allows users of the Low Level REST client to specify which hosts a request should be run on. They implement the `NodeSelector` interface or reuse a built in selector like `NOT_MASTER_ONLY` to choose which nodes are valid. Using it looks like: ``` Request request = new Request("POST", "/foo/_search"); RequestOptions.Builder options = request.getOptions().toBuilder(); options.setNodeSelector(NodeSelector.NOT_MASTER_ONLY); request.setOptions(options); ... ``` This introduces a new `Node` object which contains a `HttpHost` and the metadata about the host. At this point that metadata is just `version` and `roles` but I plan to add node attributes in a followup. The canonical way to get this metadata is to use the `Sniffer` to pull the information from the Elasticsearch cluster. I've marked this as "breaking-java" because it breaks custom implementations of `HostsSniffer` by renaming the interface to `NodesSniffer` and by changing it from returning a `List<HttpHost>` to a `List<Node>`. It shouldn't break anyone else though. Because we expect to find it useful, this also implements `host_selector` support to `do` statements in the yaml tests. Using it looks a little like: ``` --- "example test": - skip: features: host_selector - do: host_selector: version: " - 7.0.0" # same syntax as skip apiname: something: true ``` The `do` section parses the `version` string into a host selector that uses the same version comparison logic as the `skip` section. When the `do` section is executed it passes the host selector off to the `RestClient`, using the `ElasticsearchHostsSniffer` to sniff the required metadata. The idea is to use this in mixed version tests to target a specific version of Elasticsearch so we can be sure about the deprecation logging though we don't currently have any examples that need it. We do, however, have at least one open pull request that requires something like this to properly test it. 
Closes #21888	2018-06-11 17:07:27 -04:00
Nhat Nguyen	dda56fc0fc	Move ESIndexLevelReplicationTestCase to test framework (#31243 ) Other components might benefit from the testing infra provided by ESIndexLevelReplicationTestCase. This commit moves it to the test framework.	2018-06-11 12:47:38 -04:00
Lee Hinman	c064b507df	Encapsulate Translog in Engine (#31220 ) This removes the abstract `getTranslog` method in `Engine`, instead leaving it to the abstract implementations of the other methods that use the translog. This allows future Engines not to have a Translog, as instead they must implement the methods that use the translog pieces to return necessary values.	2018-06-11 09:44:50 -06:00
Tanguy Leroux	bf58660482	Remove all unused imports and fix CRLF (#31207 ) The X-Pack opening and the recent other refactorings left a lot of unused imports in the codebase. This commit removes them all.	2018-06-11 15:12:12 +02:00
Jason Tedor	65c107b47d	Fix unknown licenses (#31223 ) The goal of this commit is to address unknown licenses when producing the dependencies info report. We have two different checks that we run on licenses. The first check is whether or not we have stashed a copy of the license text for a dependency in the repository. The second is to map every dependency to a license type (e.g., BSD 3-clause). The problem here is that the way we were handling licenses in the second check differs from how we handle licenses in the first check. The first check works by finding a license file with the name of the artifact followed by the text -LICENSE.txt. Yet in some cases we allow mapping an artifact name to another name used to check for the license (e.g., we map lucene-.* to lucene, and opensaml-.* to shibboleth). The second check understood the first way of looking for a license file but not the second way. So in this commit we teach the second check about the mappings from artifact names to license names. We do this by copying the configuration from the dependencyLicenses task to the dependenciesInfo task and then reusing the code from the first check in the second check. There were some other challenges here though. For example, dependenciesInfo was checking too many dependencies. For now, we should only be checking direct dependencies and leaving transitive dependencies from another org.elasticsearch artifact to that artifact (we want to do this differently in a follow-up). We also want to disable dependenciesInfo for projects that we do not publish; users only care about licenses they might be exposed to if they use our assembled products. With all of the changes in this commit we have eliminated all unknown licenses. A follow-up will enforce that when we add a new dependency it does not get mapped to unknown; these will be forbidden in the future. 
Therefore, with this change and earlier changes, we are left with no unknown licenses and two custom licenses; custom here means it does not map to an SPDX license type. Those two licenses are xz and ldapsdk. A future change will not allow additional custom licenses unless they are explicitly whitelisted. This ensures that if a new dependency is added it is mapped to an SPDX license or mapped to custom because it does not have an SPDX license.	2018-06-09 07:28:41 -04:00
Lee Hinman	bdb0fb2555	Fully encapsulate LocalCheckpointTracker inside of the engine (#31213 ) * Fully encapsulate LocalCheckpointTracker inside of the engine This makes the Engine interface not expose the `LocalCheckpointTracker`, instead exposing the pieces needed (like retrieving the local checkpoint) as individual methods.	2018-06-08 17:19:41 -06:00
Julie Tibshirani	00b0e10063	Remove DocumentFieldMappers#simpleMatchToFullName. (#31041 ) * Remove DocumentFieldMappers#simpleMatchToFullName, as it is duplicative of MapperService#simpleMatchToIndexNames. * Rename MapperService#simpleMatchToIndexNames -> simpleMatchToFullName for consistency. * Simplify EsIntegTestCase#assertConcreteMappingsOnAll to accept concrete fields instead of wildcard patterns.	2018-06-08 13:53:35 -07:00
Jason Tedor	e481b860a1	Enable engine factory to be pluggable (#31183 ) This commit enables the engine factory to be pluggable based on index settings used when creating the index service for an index.	2018-06-07 17:01:06 -04:00
Tanguy Leroux	b5f05f676c	Remove BlobContainer.move() method (#31100 ) closes #30680	2018-06-07 10:48:31 +02:00
Tim Brooks	67e73b4df4	Combine accepting selector and socket selector (#31115 ) This is related to #27260. This commit combines the AcceptingSelector and SocketSelector classes into a single NioSelector. This change allows the same selector to handle both server and socket channels. This is valuable as we do not necessarily want a dedicated thread running for accepting channels. With this change, this commit removes the configuration for dedicated accepting selectors for the normal transport class. The accepting workload for new node connections is likely low, meaning that there is no need to dedicate a thread to this process.	2018-06-06 11:59:54 -06:00
Yannick Welsch	1dca00deb9	Remove extra checks from HdfsBlobContainer (#31126 ) This commit saves one network roundtrip when reading or deleting files from an HDFS repository.	2018-06-06 16:38:37 +02:00
Tanguy Leroux	9531b7bbcb	Add BlobContainer.writeBlobAtomic() (#30902 ) This commit adds a new writeBlobAtomic() method to the BlobContainer interface that can be implemented by repository implementations which support atomic writes operations. When the BlobContainer implementation does not provide a specific implementation of writeBlobAtomic(), then the writeBlob() method is used. Related to #30680	2018-06-05 13:00:43 +02:00
Christoph Büscher	3f87c79500	Change ObjectParser exception (#31030 ) ObjectParser should throw XContentParseExceptions, not IAE. A dedicated parsing exception can include the place where the error occurred. Closes #30605	2018-06-04 20:20:37 +02:00
Jason Tedor	be55da18c2	Enable customizing REST tests blacklist (#31074 ) This commit enables adding additional REST tests to the blacklist for builds that already define tests.rest.blacklist.	2018-06-04 13:35:49 -04:00
Boaz Leskes	a7ceefe93f	Make Persistent Tasks implementations version and feature aware (#31045 ) With #31020 we introduced the ability for transport clients to indicate what features they support in order to make sure we don't serialize object to them they don't support. This PR adapts the serialization logic of persistent tasks to be aware of those features and not serialize tasks that aren't supported. Also, a version check is added for the future where we may add new tasks implementations and need to be able to indicate they shouldn't be serialized both to nodes and clients. As the implementation relies on the interface of `PersistentTaskParams`, these are no longer optional. That's acceptable as all current implementation have them and we plan to make `PersistentTaskParams` more central in the future. Relates to #30731	2018-06-03 21:51:08 +02:00
Boaz Leskes	65d3f0efca	Adapt transport tests for the extra byte introduced in #31020 We now serialize a feature array, which takes an extra byte when empty.	2018-06-02 13:04:40 +02:00
Jason Tedor	4522b57e07	Introduce client feature tracking (#31020 ) This commit introduces the ability for a client to communicate to the server features that it can support and for these features to be used in influencing the decisions that the server makes when communicating with the client. To this end we carry the features from the client to the underlying stream as we carry the version of the client today. This enables us to enhance the logic where we make protocol decisions on the basis of the version on the stream to also make protocol decisions on the basis of the features on the stream. With such functionality, the client can communicate to the server if it is a transport client, or if it has, for example, X-Pack installed. This enables us to support rolling upgrades from the OSS distribution to the default distribution without breaking client connectivity as we can now elect to serialize customs in the cluster state depending on whether or not the client reports to us using the feature capabilities that it can under these customs. This means that we would avoid sending a client pieces of the cluster state that it can not understand. However, we want to take care and always send the full cluster state during node-to-node communication as otherwise we would end up with different understanding of what is in the cluster state across nodes depending on which features they reported to have. This is why when deciding whether or not to write out a custom we always send the custom if the client is not a transport client and otherwise do not send the custom if the client is transport client that does not report to have the feature required by the custom. Co-authored-by: Yannick Welsch <yannick@welsch.lu>	2018-06-01 11:45:35 -04:00
Jim Ferenczi	0791f93dbd	Add an option to split keyword field on whitespace at query time (#30691 ) This change adds an option named `split_queries_on_whitespace` to the `keyword` field type. When set to true full text queries (`match`, `multi_match`, `query_string`, ...) that target the field will split the input on whitespace to build the query terms. Defaults to `false`. Closes #30393	2018-06-01 09:47:03 +02:00
Nik Everett	b225f5e5c6	HLRest: Allow caller to set per request options (#30490 ) This modifies the high level rest client to allow calling code to customize per request options for the bulk API. You do the actual customization by passing a `RequestOptions` object to the API call which is set on the `Request` that is generated by the high level client. It also makes the `RequestOptions` a thing in the low level rest client. For now that just means you use it to customize the headers and the `httpAsyncResponseConsumerFactory` and we'll add node selectors and per request timeouts in a follow up. I only implemented this on the bulk API because it is the first one in the list alphabetically and I wanted to keep the change small enough to review. I'll convert the remaining APIs in a followup.	2018-05-31 13:59:52 -04:00
Ryan Ernst	46e8d97813	Core: Remove RequestBuilder from Action (#30966 ) This commit removes the RequestBuilder generic type from Action. It was needed to be used by the newRequest method, which in turn was used by client.prepareExecute. Both of these methods are now removed, along with the existing users of prepareExecute constructing the appropriate builder directly.	2018-05-31 16:15:00 +02:00
Martijn van Groningen	544822c78b	Moved keyword tokenizer to analysis-common module (#30642 ) Relates to #23658	2018-05-29 19:22:28 +02:00
Vladimir Dolzhenko	81eb8ba0f0	Include size of snapshot in snapshot metadata (#29602 ) Include size of snapshot in snapshot metadata Adds difference of number of files (and file sizes) between prev and current snapshot. Total number/size reflects total number/size of files in snapshot. Closes #18543	2018-05-25 21:04:50 +02:00
Martijn van Groningen	ae2f021f1c	Move score script context from SearchScript to its own class (#30816 )	2018-05-25 07:17:50 +02:00
Tim Brooks	e8b70273c1	Remove Throwable usage from transport modules (#30845 ) Currently nio and netty modules use the CompletableFuture class for managing listeners. This is unfortunate as that class accepts Throwable. This commit adds a class CompletableContext that wraps the CompletableFuture but does not accept Throwable. This allows the modification of netty and nio logic to no longer handle Throwable.	2018-05-24 17:33:29 -06:00
David Turner	ff0b6c795a	Decouple ClusterStateTaskListener & ClusterApplier (#30809 ) Today, the `ClusterApplier` and `MasterService` both use the `ClusterStateTaskListener` interface to notify their callers when asynchronous activities have completed. However, this is not wholly appropriate: none of the callers into the `ClusterApplier` care about the `ClusterState` arguments that they receive. This change introduces a dedicated ClusterApplyListener interface for callers into the `ClusterApplier`, to distinguish these listeners from the real `ClusterStateTaskListener`s that are waiting for responses from the `MasterService`.	2018-05-24 09:05:09 +01:00
Tim Brooks	d7040ad7b4	Reintroduce mandatory http pipelining support (#30820 ) This commit reintroduces `31251c9` and `63a5799`. These commits introduced a memory leak and were reverted. This commit brings those commits back and fixes the memory leak by removing unnecessary retain method calls.	2018-05-23 14:38:52 -06:00
Colin Goodheart-Smithe	4fd0a3e492	Revert "Make http pipelining support mandatory (#30695 )" (#30813 ) This reverts commit `31251c9` introduced in #30695. We suspect this commit is causing the OOME's reported in #30811 and we will use this PR to test this assertion.	2018-05-23 10:54:46 -06:00
Yannick Welsch	30b004f582	Use original settings on full-cluster restart (#30780 ) When doing a node restart using the test framework, the restarted node does not only use the settings provided to the original node, but also additional settings provided by plugin extensions, which does not correspond to the settings that a node would have on a true restart.	2018-05-23 09:02:01 +02:00
Tim Brooks	63a5799526	Remove http pipelining from integration test case (#30788 ) This is related to #29500. We are removing the ability to disable http pipelining. This PR removes the references to disabling pipelining in the integration test case.	2018-05-22 17:18:05 -06:00
Luca Cavanna	a17d6cab98	Replace Request#setHeaders with addHeader (#30588 ) Adding headers rather than setting them all at once seems more user-friendly and we already do it in a similar way for parameters (see Request#addParameter).	2018-05-22 20:32:30 +02:00
Nhat Nguyen	1918a30237	Upgrade to Lucene-7.4.0-snapshot-cc2ee23050 (#30778 ) The new snapshot includes LUCENE-8324 which fixes missing checkpoint after a fully deleted segment is dropped on flush. This snapshot should resolve failed tests in the CorruptedFileIT suite. Closes #30741 Closes #30577	2018-05-22 13:11:48 -04:00
Tim Brooks	31251c9a6d	Make http pipelining support mandatory (#30695 ) This is related to #29500 and #28898. This commit removes the ability to disable http pipelining. After this commit, any elasticsearch node will support pipelined requests from a client. Additionally, it extracts some of the http pipelining work to the server module. This extracted work is used to implement pipelining for the nio plugin.	2018-05-22 09:29:31 -06:00
Tim Brooks	abf8c56a37	Remove logging from elasticsearch-nio jar (#30761 ) This is related to #27260. The elasticsearch-nio jar is supposed to be a library as opposed to a framework. Currently it internally logs certain exceptions. This commit modifies it to not rely on logging. Instead exception handlers are passed by the applications that use the jar.	2018-05-21 20:18:12 -06:00
Nhat Nguyen	67d8fc222d	Upgrade to Lucene-7.4.0-snapshot-59f2b7aec2 (#30726 ) This snapshot resolves issues related to ShrinkIndexIT.	2018-05-18 18:21:39 -04:00
Ryan Ernst	b3f3a4312b	Plugins: Remove meta plugins (#30670 ) Meta plugins existed only for a short time, in order to enable breaking up x-pack into multiple plugins. However, now that x-pack is no longer installed as a plugin, the need for them has disappeared. This commit removes the meta plugins infrastructure.	2018-05-18 10:56:08 -07:00
Adrien Grand	28d4685d72	Mitigate date histogram slowdowns with non-fixed timezones. (#30534 ) Date histograms on non-fixed timezones such as `Europe/Paris` proved much slower than histograms on fixed timezones in #28727. This change mitigates the issue by using a fixed time zone instead when shard data doesn't cross a transition so that all timestamps share the same fixed offset. This should be a common case with daily indices. NOTE: Rewriting the aggregation doesn't work since the timezone is then also used on the coordinating node to create empty buckets, which might be out of the range of data that exists on the shard. NOTE: In order to be able to get a shard context in the tests, I reused code from the base query test case by creating a new parent test case for both queries and aggregations: `AbstractBuilderTestCase`. Mitigates #28727	2018-05-16 17:06:52 +02:00
Zachary Tong	df853c49c0	Add a MovingFunction pipeline aggregation, deprecate MovingAvg agg (#29594 ) This pipeline aggregation gives the user the ability to script functions that "move" across a window of data, instead of single data points. It is the scripted version of MovingAvg pipeline agg. Through custom script contexts, we expose a number of convenience methods: - MovingFunctions.max() - MovingFunctions.min() - MovingFunctions.sum() - MovingFunctions.unweightedAvg() - MovingFunctions.linearWeightedAvg() - MovingFunctions.ewma() - MovingFunctions.holt() - MovingFunctions.holtWinters() - MovingFunctions.stdDev() The user can also define any arbitrary logic via their own scripting, or combine with the above methods.	2018-05-16 10:57:00 -04:00
Van0SS	4478f10a2a	Rest High Level client: Add List Tasks (#29546 ) This change adds a `listTasks` method to the high level java ClusterClient which allows listing running tasks through the task management API. Related to #27205	2018-05-16 13:31:37 +02:00
Nik Everett	9b47e0508b	Fix compilation of test framework tests We accidentally broke the compilation of the test frameworks tests. All better now.	2018-05-15 22:56:41 -04:00
Jason Tedor	25c823da09	Skip shard deprecation messages in REST tests (#30630 ) A 6.x node can send a deprecation message that the default number of shards will change from five to one in 7.0.0. In a mixed cluster, whether or not a create index request sees five or one shard and produces a deprecation message depends on the version of the master node. This means that during BWC tests a test can see this deprecation message depending on the version of the master node. In 6.x when we introduced this deprecation message we assumed that wherever we see this deprecation message it is expected. However, in a mixed cluster test we need a similar mechanism but it would only apply if the version of the master node is earlier than 7.0.0. This commit takes advantage of a recent change to expose the version of the master node to do sections of REST tests. With this in hand, we can skip asserting on the deprecation message if the version of the master node is before 7.0.0 and otherwise seeing that deprecation message would be completely unexpected.	2018-05-15 21:07:32 -04:00
Tim Brooks	99b9ab58e2	Add nio http server transport (#29587 ) This commit is related to #28898. It adds an nio driven http server transport. Currently it only supports basic http features. CORS, pipelining, and read timeouts will need to be added in future PRs.	2018-05-15 16:37:14 -06:00
Jason Tedor	abc06d5b79	Expose master version in REST test context (#30623 ) This commit exposes the master version to the REST test context. This will be needed in a follow-up where the master version will be used to determine whether or not a certain warning header is expected.	2018-05-15 17:26:43 -04:00
Nik Everett	869b639d14	QA: System property to override distribution (#30591 ) This configures all `qa` projects to use the distribution contained in the `tests.distribution` system property if it is set. The goal is to create a simple way to run tests against the default distribution which has x-pack basic features enabled while not forcing these tests on all contributors. You run these tests by doing something like: ``` ./gradlew -p qa -Dtests.distribution=zip check ``` or ``` ./gradlew -p qa -Dtests.distribution=zip bwcTest ``` x-pack basic shouldn't get in the way of any of these tests but nothing is ever perfect so we have to disable a few when running with the zip distribution.	2018-05-15 17:16:16 -04:00
Julie Tibshirani	4f9dd37169	Add support for search templates to the high-level REST client. (#30473 )	2018-05-15 13:07:58 -07:00
Jason Tedor	4a4e3d70d5	Default to one shard (#30539 ) This commit changes the default out-of-the-box configuration for the number of shards from five to one. We think this will help address a common problem of oversharding. For users with time-based indices that need a different default, this can be managed with index templates. For users with non-time-based indices that find they need to re-shard with the split API in place they no longer need to resort only to reindexing. Since this has the impact of changing the default number of shards used in REST tests, we want to ensure that we still have coverage for issues that could arise from multiple shards. As such, we randomize (rarely) the default number of shards in REST tests to two. This is managed via a global index template. However, some tests check the templates that are in the cluster state during the test. Since this template is randomly there, we need a way for tests to skip adding the template used to set the number of shards to two. For this we add the default_shards feature skip. To avoid having to write our docs in a complicated way because sometimes they might be behind one shard, and sometimes they might be behind two shards we apply the default_shards feature skip to all docs tests. That is, these tests will always run with the default number of shards (one).	2018-05-14 12:22:35 -04:00
Martijn van Groningen	7b95470897	Moved tokenizers to analysis common module (#30538 ) The following tokenizers were moved: classic, edge_ngram, letter, lowercase, ngram, path_hierarchy, pattern, thai, uax_url_email and whitespace. Left keyword tokenizer factory in server module, because normalizers directly depend on it. This should be addressed in a follow-up change. Relates to #23658	2018-05-14 07:55:01 +02:00
Yannick Welsch	fc870fdb4c	Use simpler write-once semantics for HDFS repository (#30439 ) There's no need for an extra `blobExists()` call when writing a blob to the HDFS service. The writeBlob implementation for the HDFS repository already uses the `CreateFlag.CREATE` option on the file creation, which ensures that the blob that's uploaded does not already exist. This saves one network roundtrip.	2018-05-11 09:50:37 +02:00
Jay Modi	f733de8e67	Security: fix TokenMetaData equals and hashcode (#30347 ) The TokenMetaData equals method compared byte arrays using `.equals` on the arrays themselves, which is the equivalent of an `==` check. This means that a separate byte[] with the same contents would not be considered equivalent to the existing one, even though it should be. The method has been updated to use `Arrays#equals` and similarly the hashcode method has been updated to call `Arrays#hashCode` instead of calling hashcode on the array itself.	2018-05-10 13:12:11 -06:00
Jason Tedor	bf2365d13b	Remove BWC repository test (#30500 ) This commit removes a test that we can not restore from 1.x and 2.x repository files. This test is not needed, the version of Elasticsearch that this commit targets can not even read index files from those versions.	2018-05-09 23:24:54 -04:00
Nik Everett	f9dc86836d	Docs: Test examples that recreate lang analyzers (#29535 ) We have a pile of documentation describing how to rebuild the built in language analyzers and, previously, our documentation testing framework made sure that the examples successfully built an analyzer but they didn't assert that the analyzer built by the documentation matches the built in analyzer. Unsurprisingly, some of the examples aren't quite right. This adds a mechanism that tests the analyzers built by the docs. The mechanism is fairly simple and brutal but it seems to be working: build a hundred random unicode sequences and send them through the `_analyze` API with the rebuilt analyzer and then again through the built in analyzer. Then make sure both APIs return the same results. Each of these calls to `_analyze` takes about 20ms on my laptop which seems fine.	2018-05-09 09:23:10 -04:00
Stéphane Campinas	2f8905839f	Correct wording in log message (#30336 )	2018-05-07 12:00:06 +02:00
Tanguy Leroux	1987d6261f	Do not fail snapshot when deleting a missing snapshotted file (#30332 ) When deleting or creating a snapshot for a given shard, elasticsearch usually starts by listing all the existing snapshotted files in the repository. Then it computes a diff and deletes the snapshotted files that are not needed anymore. During this deletion, an exception is thrown if the file to be deleted does not exist anymore. This behavior is challenging with cloud based repository implementations like S3 where a file that has been deleted can still appear in the bucket for a few seconds/minutes (because the deletion can take some time to be fully replicated on S3). If the deleted file appears in the listing of files, then the following deletion will fail with a NoSuchFileException and the snapshot will be partially created/deleted. This pull request makes the deletion of these files a bit less strict, i.e. not failing if the file we want to delete does not exist anymore. It introduces a new BlobContainer.deleteIgnoringIfNotExists() method that can be used at some specific places where not failing when deleting a file is considered harmless. Closes #28322	2018-05-07 09:35:55 +02:00
Jim Ferenczi	dbd857341f	Upgrade to 7.4.0-snapshot-1ed95c097b (#30357 ) Upgrade to lucene-7.4.0-snapshot-1ed95c097b This version contains: * An Analyzer for Korean * An IntervalQuery and IntervalsSource that retrieve minimum intervals of positional queries. * A new API to retrieve matches (offsets and positions) of a query for a single document. * Support for soft deletes in the index writer. * A fixed shingle filter that handles index time synonyms. * Support for emoji sequence in ICUTokenizer (with an upgrade to icu 61.1)	2018-05-04 11:44:22 +02:00
Zachary Tong	3c2d2a7d4a	Fix NPE when CumulativeSum agg encounters null/empty bucket (#29641 ) Fix NPE when CumulativeSum agg encounters null/empty bucket If the cusum agg encounters a null value, it's because the value is missing (like the first value from a derivative agg), the path is not valid, or the bucket in the path was empty. Previously cusum would just explode on the null, but this changes it so we only increment the sum if the value is non-null and finite. This is safe because even if the cusum encounters all null or empty buckets, the cumulative sum is still zero (like how the sum agg returns zero even if all the docs were missing values) I went ahead and tweaked AggregatorTestCase to allow testing pipelines, so that I could delete the IT test and reimplement it as AggTests. Closes #27544	2018-05-02 12:22:55 -07:00
Ryan Ernst	fb0aa562a5	Network: Remove http.enabled setting (#29601 ) This commit removes the http.enabled setting. While all real nodes (started with bin/elasticsearch) will always have an http binding, there are many tests that rely on the quickness of not actually needing to bind to 2 ports. For this case, the MockHttpTransport.TestPlugin provides a dummy http transport implementation which is used by default in ESIntegTestCase. closes #12792	2018-05-02 11:42:05 -07:00
Ryan Ernst	f0e92676b1	Tests: Simplify VersionUtils released version splitting (#30322 ) This commit refactors VersionUtils.resolveReleasedVersions to be simpler, and in the process fixes the behavior to match that of VersionCollection.groovy. closes #30133	2018-05-02 09:29:35 -07:00
Adrien Grand	368ddc408f	Remove MapperService#types(). (#29617 ) This isn't necessary with a single type per index.	2018-05-02 11:35:12 +02:00
Boaz Leskes	4a537ef03c	Bulk operation fail to replicate operations when a mapping update times out (#30244 ) Starting with the refactoring in https://github.com/elastic/elasticsearch/pull/22778 (released in 5.3) we may fail to properly replicate operations when a mapping update on master fails. If a bulk operation needs a mapping update halfway, it will send a request to the master before continuing to index the operations. If that request times out or isn't acked (i.e., even one node in the cluster didn't process it within 30s), we end up throwing the exception and aborting the entire bulk. This is a problem because all operations that were processed so far are not replicated any more to the replicas. Although these operations were never "acked" to the user (we threw an error) it causes the local checkpoint on the replicas to lag (on 6.x) and the primary and replica to diverge. This PR does a couple of things: 1) Most importantly, treat any mapping update failure as a document level failure, meaning only the relevant indexing operation will fail. 2) Removes the mapping update callbacks from `IndexShard.applyIndexOperationOnPrimary` and similar methods for simpler execution. We don't use exceptions any more when a mapping update was successful. I think we need to do more work here (the fact that a single slow node can prevent those mappings updates from being acked and thus fail operations is bad), but I want to keep this as small as I can (it is already too big).	2018-05-01 08:15:02 +02:00
Nik Everett	50945051b6	HTML5ify Javadoc for core and test framework (#30234 ) `javadoc` will switch from defaulting to html4 to html5 in "a future release". We should get ahead of it so we're not surprised. Also, HTML5 is the future! Er, the present. Anyway, this follows up from #30220 to make the Javadoc for two of the four remaining projects HTML5 compatible.	2018-04-30 09:39:50 -04:00
Martijn van Groningen	621a1935b8	test: also assert deprecation warning after clusters have been closed.	2018-04-19 09:20:04 +02:00
Ryan Ernst	1cb3a9d9dc	Test: Guard deprecation check when 0 nodes created The internal test cluster can sometimes have 0 nodes. In this situation, the http.enabled flag will never be read, and thus no deprecation warning will be emitted. This commit guards the deprecation warning check in this case.	2018-04-18 21:18:56 -07:00
Ryan Ernst	98d776edaf	Networking: Deprecate http.enabled setting (#29591 ) This commit deprecates the http.enabled, in preparation for removing the feature in 7.0. relates #12792	2018-04-18 17:36:09 -07:00
Julie Tibshirani	c8209fa7b1	Fix the assertion message for an incorrect current version. (#29572 )	2018-04-17 19:27:02 -07:00
Nhat Nguyen	45c6c20467	Enforce translog access via engine (#29542 ) Today the translog of an engine is exposed and can be accessed directly. While this exposure offers much flexibility, it also causes these troubles: - Inconsistent behavior between translog method and engine method. For example, rolling a translog generation via an engine also trims unreferenced files, but translog's method does not. - An engine does not get notified when critical errors happen in translog as the access is direct. This change isolates translog of an engine and enforces all accesses to translog via the engine.	2018-04-17 08:03:41 -04:00
Jason Tedor	1dd0fd4874	Deprecate the index thread pool (#29540 ) The index thread pool is no longer needed as its primary use-case for single-document indexing requests has been relieved now that single-document indexing requests are converted to bulk indexing requests (with a single document payload).	2018-04-17 06:47:30 -04:00
olcbean	b3e3b80f1b	REST high-level client: add support for Indices Update Settings API [take 2] (#29327 ) Relates to #27205	2018-04-16 21:39:11 +02:00
Simon Willnauer	eab530ce11	Ensure flush happens on shard idle This adds 2 test cases that verify that when a shard goes idle, pending (uncommitted) segments are committed and unreferenced files are freed. Relates to #29482	2018-04-13 15:06:51 +02:00
Nhat Nguyen	f96e00badf	Add primary term to translog header (#29227 ) This change adds the current primary term to the header of the current translog file. Having a term in a translog header is a prerequisite step that allows us to trim translog operations given the max valid seq# for that term. This commit also updates tests to conform to the primary term invariant, which guarantees that all translog operations in a translog file have terms at most the term stored in the translog header.	2018-04-12 13:57:59 -04:00
Lee Hinman	d72d3f996e	Add a helper method to get a random java.util.TimeZone (#29487 ) * Add a helper method to get a random java.util.TimeZone This adds a helper method to ESTestCase that returns a randomized `java.util.TimeZone`. This can be used when transitioning code from Joda to the JDK's time classes.	2018-04-12 11:56:42 -06:00
Adrien Grand	4918924fae	Remove legacy mapping code. (#29224 ) Some features have been deprecated since `6.0` like the `_parent` field or the ability to have multiple types per index. This allows us to remove quite a lot of code, which in turn will hopefully make it easier to proceed with the removal of types.	2018-04-11 09:41:37 +02:00
tomcallahan	2574064e66	Enable rest tests via IDEs (#29439 ) Currently rest-based tests do not work from the IDE, as the security manager is configured to permit certain network operations when using the snapshot jars compiled by gradle. We have an existing workaround that explicitly associates a codebase with the path from which the classes are loaded (in this case, the IDE build directory). This PR adds the rest client to this workaround list.	2018-04-10 09:08:58 -04:00
Lee Hinman	a07ba9e400	Move Streams.copy into elasticsearch-core and make a multi-release jar (#29322 ) * Move Streams.copy into elasticsearch-core and make a multi-release jar This moves the method `Streams.copy(InputStream in, OutputStream out)` into the `elasticsearch-core` project (inside the `o.e.core.internal.io` package). It also makes this class into a multi-release class where the Java 9 equivalent uses `InputStream#transferTo`. This is a followup from https://github.com/elastic/elasticsearch/pull/29300#discussion_r178147495	2018-04-06 11:07:20 -06:00
Lee Hinman	a93c942927	Move ObjectParser into the x-content lib (#29373 ) * Move ObjectParser into the x-content lib This moves `ObjectParser`, `AbstractObjectParser`, and `ConstructingObjectParser` into the libs/x-content dependency. This decoupling allows them to be used for parsing for projects that don't want to depend on the entire Elasticsearch jar. Relates to #28504	2018-04-06 09:41:14 -06:00
Colin Goodheart-Smithe	55c8e80532	Fixes query_string query equals timezone check (#29406 ) * Fixes query_string query equals timezone check This change fixes a bug where two `QueryStringQueryBuilder`s were found to be equal if they had the same timezone set even if the query string in the builders were different Closes #29403 * Adds mutate function to QueryStringQueryBuilderTests * iter	2018-04-06 11:45:34 +01:00
Adrien Grand	569d0c0e89	Improve similarity integration. (#29187 ) This improves the way similarities are plugged in in order to: - reject the classic similarity on 7.x indices and emit a deprecation warning otherwise - reject unknown parameters on 7.x indices and emit a deprecation warning otherwise Even though this breaks the plugin API, I'd like to backport to 7.x so that users can get deprecation warnings when they are doing something that will become unsupported in the future. Closes #23208 Closes #29035	2018-04-03 16:45:25 +02:00
Adrien Grand	3bdfc8f3fb	Upgrade to lucene-7.3.0-snapshot-98a6b3d. (#29298 ) Most notable changes include: - this release doesn't have the 7.2.1 version constant so I had to create one - spatial4j and jts were upgraded	2018-04-03 09:27:14 +02:00
Lee Hinman	6b2167f462	Begin moving XContent to a separate lib/artifact (#29300 ) * Begin moving XContent to a separate lib/artifact This commit moves a large portion of the XContent code from the `server` project to the `libs/xcontent` project. For the pieces that have been moved, some helpers have been duplicated to allow them to be decoupled from ES helper classes. In addition, `Booleans` and `CheckedFunction` have been moved to the `elasticsearch-core` project. This decoupling is a move so that we can eventually make things like the high-level REST client not rely on the entire ES jar, only the parts it needs. There are some pieces that are still not decoupled, in particular some of the XContent tests still remain in the server project, this is because they test a large portion of the pluggable xcontent pieces through `XContentElasticsearchException`. They may be decoupled in future work. Additionally, there may be more pieces that we want to move to the xcontent lib in the future that are not part of this PR, this is a starting point. Relates to #28504	2018-04-02 15:58:31 -06:00
Mayya Sharipova	e70cd35bda	Revert "REST high-level client: add support for Indices Update Settings API (#28892 )" (#29323 ) This reverts commit `b67b5b1bbd`.	2018-03-30 16:26:46 -07:00
Andy Bristol	b7e6fb9ac5	[test] remove Streamable serde assertions (#29307 ) Removes a set of assertions in the test framework that verified that Streamable objects could be serialized and deserialized across different versions. When this was discussed the consensus was that this approach has not caught many bugs in a long time and that serialization testing of objects was best left to their respective unit and integration tests. This commit also removes a transport interceptor that was used in ESIntegTestCase tests to make these assertions about objects coming in or off the wire.	2018-03-30 14:09:26 -07:00
olcbean	b67b5b1bbd	REST high-level client: add support for Indices Update Settings API (#28892 ) Relates to #27205	2018-03-30 10:53:29 +02:00
Jason Tedor	4ef3de40bc	Fix handling of bad requests (#29249 ) Today we have a few problems with how we handle bad requests: - handling requests with bad encoding - handling requests with invalid value for filter_path/pretty/human - handling requests with a garbage Content-Type header There are two problems: - in every case, we give an empty response to the client - in most cases, we leak the byte buffer backing the request! These problems are caused by a broader problem: poor handling preparing the request for handling, or the channel to write to when the response is ready. This commit addresses these issues by taking a unified approach to all of them that ensures that: - we respond to the client with the exception that blew us up - we do not leak the byte buffer backing the request	2018-03-28 16:25:01 -04:00
Simon Willnauer	13e19e7428	Allow _update and upsert to read from the transaction log (#29264 ) We historically removed reading from the transaction log to get consistent results from _GET calls. There was also the motivation that the read-modify-update principle we apply should not be hidden from the user. We still agree on the fact that we should not hide these aspects but the impact on updates is quite significant especially if the same document is updated before it's written to disk and made searchable. This change adds back the ability to read from the transaction log but only for update calls. Calls to the _GET API will always do a refresh if necessary to return consistent results i.e. if stored fields or DocValues Fields are requested. Closes #26802	2018-03-28 18:03:34 +02:00
Yannick Welsch	cacf759213	Remove RELOCATED index shard state (#29246 ) as this information is already covered by ReplicationTracker.primaryMode.	2018-03-28 12:25:46 +02:00
Nhat Nguyen	87957603c0	Prune only gc deletes below local checkpoint (#28790 ) Once a document is deleted and Lucene is refreshed, we will not be able to look up the `version/seq#` associated with that delete in Lucene. As conflicting operations can still be indexed, we need another mechanism to remember these deletes. Therefore deletes should still be stored in the Version Map, even after Lucene is refreshed. Obviously, we can't remember all deletes forever so a trimming mechanism is needed. Currently, we remember deletes for at least 1 minute (the default GC deletes cycle) and clean them periodically. This is, at the moment, the best we can do on the primary for user facing APIs but this arbitrary time limit is problematic for replicas. Furthermore, we can't rely on the primary and replicas doing the trimming in a synchronized manner, and failing to do so results in the replica and primary making different decisions. The following scenario can cause inconsistency between primary and replica. 1. Primary index doc (index, id=1, v2) 2. Network packet issue causes index operation to back off and wait 3. Primary deletes doc (delete, id=1, v3) 4. Replica processes delete (delete, id=1, v3) 5. 1+ minute passes (GC deletes runs replica) 6. Indexing op is finally sent to the replica which no processes it because it forgot about the delete. We can reply on sequence-numbers to prevent this issue. If we prune only deletes whose seqno at most the local checkpoint, a replica will correctly remember what it needs. The correctness is explained as follows: Suppose o1 and o2 are two operations on the same document with seq#(o1) < seq#(o2), and o2 arrives before o1 on the replica. o2 is processed normally since it arrives first; when o1 arrives it should be discarded: 1. If seq#(o1) <= LCP, then it will be not be added to Lucene, as it was already previously added. 2. 
If seq#(o1) > LCP, then it depends on the nature of o2: - If o2 is a delete then its seq# is recorded in the VersionMap, since seq#(o2) > seq#(o1) > LCP, so a lookup can find it and determine that o1 is stale. - If o2 is an indexing then its seq# is either in Lucene (if refreshed) or the VersionMap (if not refreshed yet), so a real-time lookup can find it and determine that o1 is stale. In this PR, we prefer to deploy a single trimming strategy, which satisfies both requirements, on primary and replicas because: - It's simpler - no need to distinguish if an engine is running at primary mode or replica mode or being promoted. - If a replica subsequently is promoted, user experience is fully maintained as that replica remembers deletes for the last GC cycle. However, the version map may consume less memory if we deploy two different trimming strategies for primary and replicas.	2018-03-26 13:42:08 -04:00
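The pruning rule described in #28790 can be sketched in a few lines of Java. This is only an illustration: the class `DeleteTombstones` and its methods are hypothetical names, not the engine's actual version map, and the time-based GC cycle is left out so that only the "prune deletes at or below the local checkpoint" condition is shown.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical, simplified stand-in for the delete entries of a version map.
final class DeleteTombstones {
    // document id -> seq# of the most recent delete seen for that id
    private final Map<String, Long> deleteSeqNoById = new HashMap<>();

    // Remember a delete; keep the highest seq# if several deletes arrive.
    void recordDelete(String id, long seqNo) {
        deleteSeqNoById.merge(id, seqNo, Math::max);
    }

    // Drop only tombstones whose seq# is at most the local checkpoint (LCP).
    // Tombstones above the LCP are kept, however old they are.
    void prune(long localCheckpoint) {
        deleteSeqNoById.values().removeIf(seqNo -> seqNo <= localCheckpoint);
    }

    // An operation is stale if a delete with a higher seq# is still remembered.
    boolean isStale(String id, long seqNo) {
        Long deleteSeqNo = deleteSeqNoById.get(id);
        return deleteSeqNo != null && seqNo < deleteSeqNo;
    }
}
```

With a delete recorded at seq# 3 and a local checkpoint of 2, the tombstone survives pruning, so a late indexing operation with seq# 2 is still recognised as stale. Once the checkpoint reaches 3 the tombstone can go, because any operation with a seq# at or below the checkpoint is discarded by the first rule in the commit message.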
Boaz Leskes	f5d4550e93	Fold EngineDiskUtils into Store, for better lock semantics (#29156 ) #28245 has introduced the utility class `EngineDiskUtils` with a set of methods to prepare/change translog and lucene commit points. That util class bundled everything that's needed to create an empty shard, bootstrap a shard from a lucene index that was just restored etc. In order to safely do these manipulations, the util methods acquired the IndexWriter's lock. That would sometimes fail due to concurrent shard store fetching or other short activities that require the files not to be changed while they read from them. Since there is no way to wait on the index writer lock, the `Store` class has other locks to make sure that once we try to acquire the IW lock, it will succeed. To side step this waiting problem, this PR folds `EngineDiskUtils` into `Store`. Sadly this comes with a price - the store class doesn't and shouldn't know about the translog. As such the logic is slightly less tight and callers have to do the translog manipulations on their own.	2018-03-26 14:08:03 +02:00
Jim Ferenczi	5288235ca3	Optimize the composite aggregation for match_all and range queries (#28745 ) This change refactors the composite aggregation to add an execution mode that visits documents in the order of the values present in the leading source of the composite definition. This mode does not need to visit all documents since it can early terminate the collection when the leading source value is greater than the lowest value in the queue. Instead of collecting the documents in the order of their doc_id, this mode uses the inverted lists (or the bkd tree for numerics) to collect documents in the order of the values present in the leading source. For instance the following aggregation: ``` "composite" : { "sources" : [ { "value1": { "terms" : { "field": "timestamp", "order": "asc" } } } ], "size": 10 } ``` ... can use the field `timestamp` to collect the documents with the 10 lowest values for the field instead of visiting all documents. For composite aggregation with more than one source the execution can early terminate as soon as one of the 10 lowest values produces enough composite buckets. For instance if visiting the first two lowest timestamp created 10 composite buckets we can early terminate the collection since it is guaranteed that the third lowest timestamp cannot create a composite key that compares lower than the one already visited. This mode can execute iff: * The leading source in the composite definition uses an indexed field of type `date` (works also with `date_histogram` source), `integer`, `long` or `keyword`. * The query is a match_all query or a range query over the field that is used as the leading source in the composite definition. * The sort order of the leading source is the natural order (ascending since postings and numerics are sorted in ascending order only). If these conditions are not met this aggregation visits each document like any other agg.	2018-03-26 09:51:37 +02:00
Lee Hinman	b4af451ec5	Remove BytesArray and BytesReference usage from XContentFactory (#29151 ) * Remove BytesArray and BytesReference usage from XContentFactory This removes the usage of `BytesArray` and `BytesReference` from `XContentFactory`. Instead, a regular `byte[]` should be passed. To assist with this a helper has been added to `XContentHelper` that will preserve the offset and length from the underlying BytesReference. This is part of ongoing work to separate the XContent parts from ES so they can be factored into their own jar. Relates to #28504	2018-03-20 11:52:26 -06:00
Nik Everett	a813492fe3	Tests: Make $_path support dots in paths (#28917 ) `$_path` is used by documentation tests to ignore a value from a response, for example: ``` [source,js] ---- { "count": 1, "datafeeds": [ { "datafeed_id": "datafeed-total-requests", "state": "started", "node": { ... "attributes": { "ml.machine_memory": "17179869184", "ml.max_open_jobs": "20", "ml.enabled": "true" } }, "assignment_explanation": "" } ] } ---- // TESTRESPONSE[s/"17179869184"/$body.$_path/] ``` That example shows `17179869184` in the compiled docs but when it runs the tests generated by that doc it ignores `17179869184` and asserts instead that there is a value in that field. This is required because we can't predict things like "how many milliseconds will this take?" and "how much memory will this take?". Before this change it was impossible to use `$_path` when any component of the path contained a `.`. This fixes the `$_path` evaluator to properly escape `.`. Closes #28770	2018-03-19 14:17:09 -04:00
Christoph Büscher	312ccc05d5	[Tests] Fix GetResultTests and DocumentFieldTests failures (#29083 ) Changes made in #28972 seem to have changed some assumptions about how SMILE and CBOR write byte[] values and how this is tested. This changes the generation of the randomized DocumentField values back to BytesArray while expecting the JSON and YAML deserialisation to produce Base64 encoded strings and SMILE and CBOR to parse back BytesArray instances. Closes #29080	2018-03-15 16:42:26 +01:00
Boaz Leskes	bf65cb4914	Untangle Engine Constructor logic (#28245 ) Currently we have fairly complicated logic in the engine constructor to deal with all the various ways we want to mutate the lucene index and translog we're opening. We can: 1) Create an empty index 2) Use the lucene but create a new translog 3) Use both 4) Force a new history uuid in all cases. This leads to complicated code flows which make it harder and harder to make sure we cover all the corner cases. This PR tries to take another approach. Constructing an InternalEngine always opens things as they are and all needed modifications are done by static methods directly on the directory, one at a time.	2018-03-14 20:59:47 +01:00
Lee Hinman	8e8fdc4f0e	Decouple XContentBuilder from BytesReference (#28972 ) * Decouple XContentBuilder from BytesReference This commit removes all mentions of `BytesReference` from `XContentBuilder`. This is needed so that we can completely decouple the XContent code and move it into its own dependency. While this change appears large, it is due to two main changes, moving `.bytes()` and `.string()` out of XContentBuilder itself into static methods `BytesReference.bytes` and `Strings.toString` respectively. The rest of the change is code reacting to these changes (the majority of it in tests). Relates to #28504	2018-03-14 13:47:57 -06:00
Jason Tedor	5904d936fa	Copy Lucene IOUtils (#29012 ) As we have factored Elasticsearch into smaller libraries, we have ended up in a situation that some of the dependencies of Elasticsearch are not available to code that depends on these smaller libraries but not server Elasticsearch. This is a good thing, this was one of the goals of separating Elasticsearch into smaller libraries, to shed some of the dependencies from other components of the system. However, this now means that simple utility methods from Lucene that we rely on are no longer available everywhere. This commit copies IOUtils (with some small formatting changes for our codebase) into the fold so that other components of the system can rely on these methods where they no longer depend on Lucene.	2018-03-13 12:49:33 -04:00
Jason Tedor	8b6fbe2c11	Add test for dying with dignity (#28987 ) I have long wanted an actual test that dying with dignity works. It is tricky because if dying with dignity works, it means the test JVM dies which is usually an abnormal condition. And anyway, how does one force a fatal error to be thrown. I was motivated to investigate this again by the fact that I missed a backport to one branch leading to an issue where Elasticsearch would not successfully die with dignity. And now we have a solution: we install a plugin that throws an out of memory error when it receives a request. We hack the standalone test infrastructure to prevent this from failing the test. To do this, we bypass the security manager and remove the PID file for the node; this tricks the test infrastructure into thinking that it does not need to stop the node. We also bypass seccomp so that we can fork jps to make sure that Elasticsearch really died. And to be extra paranoid, we parse the logs of the dead Elasticsearch process to make sure it died with dignity. Never forget.	2018-03-12 23:20:07 -04:00
Luca Cavanna	184a8718d8	REST high-level client: add flush API (#28852 ) Relates to #27205	2018-03-01 10:56:03 +01:00
Luca Cavanna	cd3d9c9f80	[TEST] share code between streamable/writeable/xcontent base test classes (#28785 ) Today we have two test base classes that have a lot in common when it comes to testing wire and xcontent serialization: `AbstractSerializingTestCase` and `AbstractXContentStreamableTestCase`. There are subtle differences though between the two, in the way they work, what can be overridden and features that they support (e.g. insertion of random fields). This commit introduces a new base class called `AbstractWireTestCase` which holds all of the serialization test code in common between `Streamable` and `Writeable`. It has two minimal subclasses called `AbstractWireSerializingTestCase` and `AbstractStreamableTestCase` which are specialized for `Writeable` and `Streamable`. This commit also introduces a new test class called `AbstractXContentTestCase` for all of the xContent testing, which holds a testFromXContent method for parsing and rendering to xContent. This one can be delegated to from the existing `AbstractStreamableXContentTestCase` and `AbstractSerializingTestCase` so that we avoid code duplication as much as possible and all these base classes offer the same functionalities in the same way. Having this last base class decoupled from the serialization testing may also help with the REST high-level client testing, as there are some classes where it's hard to implement equals/hashcode and this makes it possible to override `assertEqualInstances` for custom equality comparisons (also this base class doesn't require implementing equals/hashcode as it doesn't test such methods).	2018-02-23 10:48:48 +01:00
Tim Brooks	5a8ec9b762	Selectors operate on channel contexts (#28468 ) This commit is related to #27260. Currently there is a weird relationship between channel contexts and nio channels. The selectors use the context for reading and writing. But the selector operates directly on the nio channel for registering, closing, and connecting. This commit works on improving this relationship. The selector operates directly on the context which wraps the low level java.nio.channels. The NioChannel class is simply an API that is used to interact with the channel (sending messages from outside the selector event loop, scheduling a close, adding listeners, etc). The context is only used internally by the channel to implement these apis and by the selector to perform these operations.	2018-02-22 09:44:52 -07:00
Luca Cavanna	8b4a298874	Migrate some *ResponseTests to AbstractStreamableXContentTestCase (#28749 ) This allows us to save a bit of code, but also adds more coverage as it tests serialization which was missing in some of the existing tests. Also it requires implementing equals/hashcode and we get the corresponding tests for them for free from the base test class.	2018-02-21 20:04:12 +01:00
Lee Hinman	d7eae4b90f	Pass InputStream when creating XContent parser (#28754 ) * Pass InputStream when creating XContent parser Rather than passing the raw `BytesReference` in when creating the xcontent parser, this passes the StreamInput (which is an InputStream), this allows us to decouple XContent from BytesReference. This also removes the use of `commons.Booleans` so it doesn't require more external commons classes. Related to #28504 * Undo boolean removal * Enhance deprecation javadoc	2018-02-21 11:03:25 -07:00
Yu	7d8fb69d50	version set in ingest pipeline (#27573 ) Add support for version and version_type in ingest pipelines. Add support for setting document version and version type in set processor of an ingest pipeline.	2018-02-21 09:34:51 +01:00
Lee Hinman	d4fddfa2a0	Remove log4j dependency from elasticsearch-core (#28705 ) * Remove log4j dependency from elasticsearch-core This removes the log4j dependency from our elasticsearch-core project. It was originally necessary only for our jar classpath checking. It is now replaced by a `Consumer<String>` so that the es-core dependency doesn't have external dependencies. The parts of #28191 which were moved in conjunction (like `ESLoggerFactory` and `Loggers`) have been moved back where appropriate, since they are not required in the core jar. This is tangentially related to #28504 * Add javadocs for `output` parameter * Change @code to @link	2018-02-20 09:15:54 -07:00
Luca Cavanna	8bbb3c9ffa	REST high-level client: add support for Rollover Index API (#28698 ) Relates to #27205	2018-02-20 15:58:58 +01:00
Simon Willnauer	779bc6fd5c	Simplify Engine.Searcher creation (#28728 ) Today we have several levels of indirection to acquire an Engine.Searcher. We first acquire the reference manager for the scope, then acquire an IndexSearcher and then create a searcher for the engine based on that. This change simplifies the creation into a single method call instead of 3 different ones.	2018-02-20 09:35:49 +01:00
Nhat Nguyen	84fd39f5bb	Separate acquiring safe commit and last commit (#28271 ) Previously we introduced a new parameter to `acquireIndexCommit` to allow acquiring either a safe commit or a last commit. However with the new parameters, callers can provide a nonsense combination - flush first but acquire the safe commit. This commit separates the acquireIndexCommit method into two different methods to avoid that problem. Moreover, this change should also improve the readability. Relates #28038	2018-02-16 21:25:58 -05:00
Luca Cavanna	ebe5e8e635	REST high-level client: encode path parts (#28663 ) The REST high-level client supports now encoding of path parts, so that for instance documents with valid ids, but containing characters that need to be encoded as part of urls (`#` etc.), are properly supported. We also make sure that each path part can contain `/` by encoding them properly too. Closes #28625	2018-02-15 17:22:45 +01:00
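A minimal sketch of the path-part encoding described in #28663, assuming the standard `java.net.URI` multi-argument constructor is used to percent-encode characters that are illegal in a path. The class `PathEncoder` and its methods are hypothetical names chosen for illustration; they are not taken from the client's source.

```java
import java.net.URI;
import java.net.URISyntaxException;

// Hypothetical helper: encodes each part of a REST endpoint separately.
final class PathEncoder {
    private PathEncoder() {}

    // Percent-encode one path part, including any '/' it contains.
    static String encodePart(String part) {
        try {
            // The multi-argument URI constructor quotes characters that are
            // illegal in a path (such as '#'), but leaves '/' untouched.
            String encoded = new URI(null, null, "/" + part, null).getRawPath();
            // Drop the leading '/' added above, then encode the remaining ones
            // so that a part cannot be mistaken for several path segments.
            return encoded.substring(1).replace("/", "%2F");
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException("Path part [" + part + "] could not be encoded", e);
        }
    }

    // Join already separated parts into an endpoint such as /index/type/id.
    static String endpoint(String... parts) {
        StringBuilder path = new StringBuilder();
        for (String part : parts) {
            path.append('/').append(encodePart(part));
        }
        return path.toString();
    }
}
```

For example, `PathEncoder.endpoint("index", "type", "a/b#c")` keeps the three parts distinct because the `/` and `#` inside the document id are encoded before the parts are joined.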
olcbean	02fc16f10e	Add Cluster Put Settings API to the high level REST client (#28633 ) Relates to #27205	2018-02-15 17:21:45 +01:00
Boaz Leskes	beb55d148a	Simplify the Translog constructor by always expecting an existing translog (#28676 ) Currently the Translog constructor is capable both of opening an existing translog and creating a new one (deleting existing files). This PR separates these two into separate code paths. The constructor opens files and a dedicated static method creates an empty translog.	2018-02-15 09:24:09 +01:00
Lee Hinman	b59b1cf59d	Move more XContent.createParser calls to non-deprecated version (#28672 ) * Move more XContent.createParser calls to non-deprecated version Part 2 This moves more of the callers to pass in the DeprecationHandler. Relates to #28504 * Use parser's deprecation handler where appropriate * Use logging handler in test that uses deprecated field on purpose	2018-02-14 11:24:48 -07:00
Lee Hinman	7c1f5f5054	Move more XContent.createParser calls to non-deprecated version (#28670 ) * Move more XContent.createParser calls to non-deprecated version This moves more of the callers to pass in the DeprecationHandler. Relates to #28504 * Use parser's deprecation handler where available	2018-02-14 09:01:40 -07:00
Michael Basnight	920dff7053	Add released major logic to version utils (#28644 ) Version Utils did not previously have logic that removed the last major's minor snapshot if there was a next bugfix and maintenance bugfix release. This adds the logic and fixes some broken assumptions in tests as well. Relates #28505	2018-02-12 18:33:45 -06:00
Michael Basnight	4e0c1463d5	Fix build.snapshot bug in version collection (#28641 ) The build.snapshot was mistakenly passed in to every snapshot version, so when release tests were run, these versions were mistaken as released entities and could not be found in maven, because they do not exist. This fix removes that bug in logic, and always makes them proper snapshots. This has a benefit of cleaning up the VersionUtilsTests because they no longer rely on different sets of versions to check against, which was also a bug.	2018-02-12 14:56:07 -06:00
Martijn van Groningen	c19d84012e	iter	2018-02-12 13:50:48 +01:00
Martijn van Groningen	21e5ee6551	[TEST] Changed how stash dumps are logged in yaml tests in case of failures Currently if a yaml test has a teardown and a test is failing then a stash dump of a request in the teardown is logged instead of a stash dump of a request in the test itself. By handling the logging of stash dumps separately for setup, tests and teardown yaml sections we shouldn't miss the stash dump of request/response that is actually causing the yaml test to fail.	2018-02-12 13:50:48 +01:00
Boaz Leskes	4aece92b2c	IndexShardOperationPermits: shouldn't use new Throwable to capture stack traces (#28598 ) This is a follow-up to #28567 changing the method used to capture stack traces, as requested during the review. Instead of creating a throwable, we explicitly capture the stack trace of the current thread. This should Make Jason Happy Again ™️.	2018-02-12 10:33:13 +01:00
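The approach mentioned in #28598, capturing the current thread's stack trace rather than creating a `Throwable`, can be illustrated with the standard `Thread.currentThread().getStackTrace()` call. The class `PermitTracker` below is a hypothetical sketch for illustration only and does not reproduce `IndexShardOperationPermits`.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical tracker that remembers where each outstanding permit was acquired.
final class PermitTracker {
    private final Map<Long, StackTraceElement[]> holders = new ConcurrentHashMap<>();
    private final AtomicLong nextId = new AtomicLong();

    // Hand out a permit id and record the caller's stack trace without
    // allocating a Throwable.
    long acquire() {
        long id = nextId.incrementAndGet();
        holders.put(id, Thread.currentThread().getStackTrace());
        return id;
    }

    // Forget the recorded trace once the permit is released.
    void release(long id) {
        holders.remove(id);
    }

    int pendingCount() {
        return holders.size();
    }

    // Render the acquisition site of every permit that was never released.
    String describePending() {
        StringBuilder message = new StringBuilder();
        for (StackTraceElement[] trace : holders.values()) {
            message.append("--> permit acquired at:\n");
            for (StackTraceElement frame : trace) {
                message.append("    at ").append(frame).append('\n');
            }
        }
        return message.toString();
    }
}
```

A caller that suspects a leak could assert that `pendingCount()` is zero at shutdown and include `describePending()` in the failure message to see which code path acquired the unreleased permit.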
Michael Basnight	e0bea70070	Generalize BWC logic (#28505 ) Generalizing BWC building so that there is less code to modify for a release. This ensures we do not need to think about what major or minor version is in the gradle code. It follows the general rules of the elastic release structure. For more information on the rules, see the VersionCollection's javadoc. This also removes the additional bwc snapshots that will never be released, such as 6.0.2, which were being built and tested against every time we ran bwc tests. Additionally, it creates 4 new projects that correspond to the different types of snapshots that may exist for a given version. It is now possible to run those individual tasks to work out bwc logic, whereas previously it was impossible and the entire suite of bwc tests had to be run to work out any logic changes in the build tools' bwc project. Please note that if the project does not make sense for the version that is current, an error will be thrown from that individual project if an attempt is made to run it. This should allow for automating the version bumps as well, since it removes all the hardcoded version logic from the configs.	2018-02-09 14:55:10 -06:00
Boaz Leskes	ba59cf1262	Capture stack traces while issuing IndexShard operations permits to ease debugging (#28567 ) Today we acquire a permit from the shard to coordinate between indexing operations, recoveries and other state transitions. When we leak a permit it's practically impossible to find who the culprit is. This PR adds stack trace capturing for each permit so we can identify which part of the code is responsible for acquiring the unreleased permit. This code is only active when assertions are active. The output is something like: ``` java.lang.AssertionError: shard [test][1] on node [node_s0] has pending operations: --> java.lang.RuntimeException: something helpful 2 at org.elasticsearch.index.shard.IndexShardOperationPermits.acquire(IndexShardOperationPermits.java:223) at org.elasticsearch.index.shard.IndexShard.<init>(IndexShard.java:322) at org.elasticsearch.index.IndexService.createShard(IndexService.java:382) at org.elasticsearch.indices.IndicesService.createShard(IndicesService.java:514) at org.elasticsearch.indices.IndicesService.createShard(IndicesService.java:143) at org.elasticsearch.indices.cluster.IndicesClusterStateService.createShard(IndicesClusterStateService.java:552) at org.elasticsearch.indices.cluster.IndicesClusterStateService.createOrUpdateShards(IndicesClusterStateService.java:529) at org.elasticsearch.indices.cluster.IndicesClusterStateService.applyClusterState(IndicesClusterStateService.java:231) at org.elasticsearch.cluster.service.ClusterApplierService.lambda$callClusterStateAppliers$6(ClusterApplierService.java:498) at java.base/java.lang.Iterable.forEach(Iterable.java:75) at org.elasticsearch.cluster.service.ClusterApplierService.callClusterStateAppliers(ClusterApplierService.java:495) at org.elasticsearch.cluster.service.ClusterApplierService.applyChanges(ClusterApplierService.java:482) at org.elasticsearch.cluster.service.ClusterApplierService.runTask(ClusterApplierService.java:432) at 
org.elasticsearch.cluster.service.ClusterApplierService$UpdateTask.run(ClusterApplierService.java:161) at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingRunnable.run(ThreadContext.java:566) at org.elasticsearch.common.util.concurrent.PrioritizedEsThreadPoolExecutor$TieBreakingPrioritizedRunnable.runAndClean(PrioritizedEsThreadPoolExecutor.java:244) at org.elasticsearch.common.util.concurrent.PrioritizedEsThreadPoolExecutor$TieBreakingPrioritizedRunnable.run(PrioritizedEsThreadPoolExecutor.java:207) at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1167) at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:641) at java.base/java.lang.Thread.run(Thread.java:844) --> java.lang.RuntimeException: something helpful at org.elasticsearch.index.shard.IndexShardOperationPermits.acquire(IndexShardOperationPermits.java:223) at org.elasticsearch.index.shard.IndexShard.<init>(IndexShard.java:311) at org.elasticsearch.index.IndexService.createShard(IndexService.java:382) at org.elasticsearch.indices.IndicesService.createShard(IndicesService.java:514) at org.elasticsearch.indices.IndicesService.createShard(IndicesService.java:143) at org.elasticsearch.indices.cluster.IndicesClusterStateService.createShard(IndicesClusterStateService.java:552) at org.elasticsearch.indices.cluster.IndicesClusterStateService.createOrUpdateShards(IndicesClusterStateService.java:529) at org.elasticsearch.indices.cluster.IndicesClusterStateService.applyClusterState(IndicesClusterStateService.java:231) at org.elasticsearch.cluster.service.ClusterApplierService.lambda$callClusterStateAppliers$6(ClusterApplierService.java:498) at java.base/java.lang.Iterable.forEach(Iterable.java:75) at org.elasticsearch.cluster.service.ClusterApplierService.callClusterStateAppliers(ClusterApplierService.java:495) at org.elasticsearch.cluster.service.ClusterApplierService.applyChanges(ClusterApplierService.java:482) at 
org.elasticsearch.cluster.service.ClusterApplierService.runTask(ClusterApplierService.java:432) at org.elasticsearch.cluster.service.ClusterApplierService$UpdateTask.run(ClusterApplierService.java:161) at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingRunnable.run(ThreadContext.java:566) at org.elasticsearch.common.util.concurrent.PrioritizedEsThreadPoolExecutor$TieBreakingPrioritizedRunnable.runAndClean(PrioritizedEsThreadPoolExecutor.java:244) at org.elasticsearch.common.util.concurrent.PrioritizedEsThreadPoolExecutor$TieBreakingPrioritizedRunnable.run(PrioritizedEsThreadPoolExecutor.java:207) at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1167) at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:641) at java.base/java.lang.Thread.run(Thread.java:844) ```	2018-02-08 22:59:02 +01:00
Tim Brooks	16f7e00514	Improve testTransportStatsWithException test (#28554 ) This commit modifies the transport stats with exception test to remove the requirement that we calculate the published address size when comparing bytes received. This is tricky and is currently broken as we also place the address string in the transport exception, however we do not adjust the bytes for that. The solution in this commit is to just serialize the transport exception in the test and use that for the calculation.	2018-02-07 14:31:42 -07:00
Lee Hinman	eebff4d2b3	Use non deprecated xcontenthelper (#28503 ) * Move to non-deprecated XContentHelper.createParser(...) This moves away from one of the now-deprecated XContentHelper.createParser methods in favor of specifying the deprecation logger at parser creation time. Relates to #28449 Note that this doesn't move all the `createParser` calls because some of them use the already-deprecated method that doesn't specify the XContentType. * Remove the deprecated (and now non-needed) createParser method	2018-02-05 16:18:18 -07:00
Lee Hinman	3ddea8d8d2	Start switching to non-deprecated ParseField.match method (#28488 ) This commit switches all the modules and server test code to use the non-deprecated `ParseField.match` method, passing in the parser's deprecation handler or the logging deprecation handler when a parser is not available (like in tests). Relates to #28449	2018-02-02 10:10:13 -07:00
Yannick Welsch	031415a5f6	Replicate writes only to fully initialized shards (#28049 ) The primary currently replicates writes to all other shard copies as soon as they're added to the routing table. Initially those shards are not even ready yet to receive these replication requests, for example when undergoing a file-based peer recovery. Based on the specific stage that the shard copies are in, they will throw different kinds of exceptions when they receive the replication requests. The primary then ignores responses from shards that match certain exception types. With this mechanism it's not possible for a primary to distinguish between a situation where a replication target shard is not allocated and ready yet to receive requests and a situation where the shard was successfully allocated and active but subsequently failed. This commit changes replication so that only initializing shards that have successfully opened their engine are used as replication targets. This removes the need to replicate requests to initializing shards that are not even ready yet to receive those requests. This saves on network bandwidth and enables features that rely on the distinction between a "not-yet-ready" shard and a failed shard.	2018-02-02 11:13:07 +01:00
Luca Cavanna	d860971572	REST high-level client: add support for split and shrink index API (#28425 ) Relates to #27205	2018-02-01 16:37:01 +01:00
Jim Ferenczi	dd40b984c4	Add a shallow copy method to aggregation builders (#28430 ) This change adds a shallow copy method for aggregation builders. This method returns a copy of the builder replacing the factoriesBuilder and metaData. This method is used when the builder is rewritten (AggregationBuilder#rewrite) in order to make sure that we create a new instance of the parent builder when sub aggregations are rewritten. Relates #27782	2018-02-01 09:22:32 +01:00
Jason Tedor	1b3d529bef	Introduce secure security manager to project This commit migrates SecureSM, our secure security manager implementation, from its own repository to being a sub-project of Elasticsearch.	2018-01-31 18:23:28 -05:00
Nhat Nguyen	5e0be61774	Add logging to index commit deletion policy (#28448 ) This would help us to figure out which index commit that an engine started with or used in peer-recovery. Relates #28405	2018-01-31 11:09:49 -05:00
markharwood	77d2dd203e	Search - add allow_partial_search_results flag with default setting true (#28440 ) Adds allow_partial_search_results flag to search requests with default setting = true. When false, will error if search either times out, has partial errors or has missing shards rather than returning partial search results. A cluster-level setting provides a default for search requests with no flag. Closes #27435	2018-01-31 15:51:29 +00:00
Jim Ferenczi	7edb978256	RandomDocumentPicks#randomFieldName can produce invalid field name (#28419 ) This change makes sure that this function does not create field names that end with a '.', more precisely it only allows alpha-numeric characters to compose the leaf field name. Closes #27373	2018-01-31 09:21:09 +01:00
Yannick Welsch	9dd0886265	Fix NullPointerException in MockUncasedHostProvider (#28424 ) The MockUncasedHostProvider accesses nodes that are not fully built yet, where TransportService.getNode() returns null, which means that the null entries end up in the list of seedNodes that UnicastZenPing then uses.	2018-01-30 10:44:19 +01:00
Simon Willnauer	43d1dcb919	Add a method that ensures that the cluster is yellow and has no initializing shards (#28416 )	2018-01-29 20:46:30 +01:00
Nik Everett	66ff1b2a59	Tests: Wipe cluster settings after every test (#28410 ) Cluster settings shouldn't leak into the next test. I played with failing the test if it left over any settings but that felt like it added more ceremony than it was worth. The advantage is that any test that intentionally wants to leave settings in place after the test would fail and require looking at but, so far as I can tell, we don't have any such tests.	2018-01-29 11:47:04 -05:00
Ryan Ernst	3dd833ca0a	Plugins: Use one confirmation of all meta plugin permissions (#28366 ) Currently meta plugins will ask for confirmation of security policy exceptions for each bundled plugin. This commit collects the necessary permissions of each bundled plugin, and asks for confirmation of all of them at the same time.	2018-01-26 15:44:44 -08:00
olcbean	9db23e48cd	Add Indices Aliases API to the high level REST client (#27876 ) Relates to #27205	2018-01-25 14:34:06 +01:00
Colin Goodheart-Smithe	75116a23cc	Adds test name to MockPageCacheRecycler exception (#28359 ) This change adds the test name to the exceptions thrown by the MockPageCacheRecycler and MockBigArrays. Also, if there is more than one page/array which are not released it will add the first one as the cause of the thrown exception and the others as suppressed exceptions. Relates to #21315	2018-01-25 08:13:33 +00:00
Alexander Reelsen	a87714aafc	Settings: Introduce settings updater for a list of settings (#28338 ) This introduces a settings updater that allows to specify a list of settings. Whenever one of those settings changes, the whole block of settings is passed to the consumer. This also fixes an issue with affix settings, when used in combination with group settings, which could result in no found settings when used to get a setting for a namespace. Lastly logging has been slightly changed, so that filtered settings now only log the setting key. Another bug has been fixed for the mock log appender, which did not work, when checking for the exact message. Closes #28047	2018-01-24 09:47:17 +01:00
Christoph Büscher	ba9e2e44cb	[Test] Re-Add integer_range and date_range field types for query builder tests (#28171 ) The tests for those field types were removed in #26549 because the range mapper was moved to a module, but later this mapper was moved back to core in #27854. This change adds back those two field types like before to the general setup in AbstractQueryTestCase and adds some specifics to the RangeQueryBuilder and TermsQueryBuilder tests. Also adding back an integration test in SearchQueryIT that has been removed before but that can be kept with the mapper back in core now. Relates to #28147	2018-01-23 13:08:54 +01:00
Luca Cavanna	0c83ee2a5d	Trim down usages of `ShardOperationFailedException` interface (#28312 ) In many cases we use the `ShardOperationFailedException` interface to abstract an exception that can only be of one type, namely `DefaultShardOperationException`. There is no need to use the interface in such cases, the concrete type should be used instead. That has the additional advantage of simplifying parsing such exceptions back from rest responses for the high-level REST client	2018-01-22 15:51:46 +01:00
kel	452c36c552	Calculate sum in Kahan summation algorithm in aggregations (#27807 ) (#27848 )	2018-01-22 12:42:56 +01:00
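The commit above names Kahan summation without describing it. As a rough illustration only, the following is the textbook form of the algorithm, not the aggregation code from the commit; the class and method names are hypothetical. A running compensation term captures the low-order bits that a plain floating-point addition would discard.

```java
// Textbook Kahan (compensated) summation. Hypothetical names; this is not
// the Elasticsearch aggregation code, only a sketch of the technique.
final class KahanSumSketch {

    /** Sums the values while carrying a compensation for lost low-order bits. */
    static double kahanSum(double[] values) {
        double sum = 0.0;
        double compensation = 0.0; // rounding error accumulated so far
        for (double value : values) {
            double corrected = value - compensation;     // re-apply the earlier error
            double updated = sum + corrected;            // low-order bits may be lost here
            compensation = (updated - sum) - corrected;  // recover what was lost
            sum = updated;
        }
        return sum;
    }

    /** Plain left-to-right summation, for comparison. */
    static double naiveSum(double[] values) {
        double sum = 0.0;
        for (double value : values) {
            sum += value;
        }
        return sum;
    }

    public static void main(String[] args) {
        // Near 1e16 adjacent doubles are 2 apart, so adding 1.0 twice naively
        // drops both addends, while the compensated sum keeps them.
        double[] values = {1e16, 1.0, 1.0};
        System.out.println("naive: " + naiveSum(values));
        System.out.println("kahan: " + kahanSum(values));
    }
}
```

This sketch ignores special values such as NaN and infinities, which a production aggregation would need to handle separately.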
Adrien Grand	700d9ecc95	Remove the `update_all_types` option. (#28288 ) This option is not useful in 7.x since no indices may have more than one type anymore.	2018-01-22 12:03:07 +01:00
Tim Brooks	a6a57a71d3	Implement socket and server ChannelContexts (#28275 ) This commit is related to #27260. Currently we have a channel context that implements reading and writing logic for socket channels. Additionally, we have exception contexts to handle exceptions and accepting contexts to handle accepted channels. This PR introduces a ChannelContext that handles close and exception handling for all channel types. Additionally, it has implementers that provide specific functionality for socket channels (reading and writing) and for server channels (accepting).	2018-01-18 13:06:40 -07:00
Tim Brooks	20fb7a6d87	Modify Abstract transport tests to use impls (#28270 ) There are a number of tests in `AbstractSimpleTransportTestCase` that create `MockTcpTransport` impls. This commit modifies two of these tests to use the transport implementation that is being tested.	2018-01-18 10:59:42 -07:00
Tim Brooks	4ea9ddb7d3	Unify nio read / write channel contexts (#28160 ) This commit is related to #27260. Right now we have separate read and write contexts for implementing specific protocol logic. However, some protocols require a closer relationship between read and write operations than is allowed by our current model. An example is HTTP which might require a write if some problem with request parsing was encountered. Additionally, some protocols require close messages to be sent when a channel is shutdown. This is also problematic in our current model, where we assume that channels should simply be queued for close and forgotten. This commit transitions to a single ChannelContext which implements all read, write, and close logic for protocols. It is the job of the context to tell the selector when to close the channel. A channel can still be manually queued for close with a selector. This is how server channels are closed for now. And this route allows timeout mechanisms on normal channel closes to be implemented.	2018-01-17 09:44:21 -07:00
Alexander Reelsen	d32cb8089b	Tests: Decrease log level for adding a header value (#28246 ) This logging message adds considerable noise to many REST tests, if you are using something like HTTP basic auth in every API call or set any custom header. The log level moves from info to debug, so can still be seen if wanted.	2018-01-17 09:14:44 +01:00
Jim Ferenczi	bd11e6c441	Fix NPE on composite aggregation with sub-aggregations that need scores (#28129 ) The composite aggregation defers the collection of sub-aggregations to a second pass that visits documents only if they appear in the top buckets. However, the scorer for sub-aggregations is not set on this second pass, which causes an NPE if any sub-aggregation tries to access the score. This change creates a scorer for the second pass and makes sure that sub-aggs can use it safely to check the score of the collected documents.	2018-01-15 18:30:38 +01:00
Tim Brooks	ee7eac8dc1	`MockTcpTransport` to connect asynchronously (#28203 ) The method `initiateChannel` on `TcpTransport` is explicit in that channels can be connected asynchronously. All production implementations do connect asynchronously. Only the blocking `MockTcpTransport` connects in a synchronous manner. This avoids testing some of the blocking code in `TcpTransport` that waits on connections to complete. Additionally, it requires a more extensive method signature than required for other transports. This commit modifies the `MockTcpTransport` to make these connections asynchronously on a different thread. Additionally, it simplifies the `initiateChannel` method signature.	2018-01-15 10:20:30 -07:00
Tim Brooks	3895add2ca	Introduce elasticsearch-core jar (#28191 ) This is related to #27933. It introduces a jar named elasticsearch-core in the lib directory. This commit moves the JarHell class from server to elasticsearch-core. Additionally, PathUtils and some of Loggers are moved as JarHell depends on them.	2018-01-15 09:59:01 -07:00
Igor Motov	c75ac319a6	Add ability to associate an ID with tasks (#27764 ) Adds support for capturing the X-Opaque-Id header from a REST request and storing its value in the tasks that this request started. It works for all user-initiated tasks (not only search). Closes #23250 Usage: ``` $ curl -H "X-Opaque-Id: imotov" -H "foo:bar" "localhost:9200/_tasks?pretty&group_by=parents" { "tasks" : { "7qrTVbiDQKiZfubUP7DPkg:6998" : { "node" : "7qrTVbiDQKiZfubUP7DPkg", "id" : 6998, "type" : "transport", "action" : "cluster:monitor/tasks/lists", "start_time_in_millis" : 1513029940042, "running_time_in_nanos" : 266794, "cancellable" : false, "headers" : { "X-Opaque-Id" : "imotov" }, "children" : [ { "node" : "V-PuCjPhRp2ryuEsNw6V1g", "id" : 6088, "type" : "netty", "action" : "cluster:monitor/tasks/lists[n]", "start_time_in_millis" : 1513029940043, "running_time_in_nanos" : 67785, "cancellable" : false, "parent_task_id" : "7qrTVbiDQKiZfubUP7DPkg:6998", "headers" : { "X-Opaque-Id" : "imotov" } }, { "node" : "7qrTVbiDQKiZfubUP7DPkg", "id" : 6999, "type" : "direct", "action" : "cluster:monitor/tasks/lists[n]", "start_time_in_millis" : 1513029940043, "running_time_in_nanos" : 98754, "cancellable" : false, "parent_task_id" : "7qrTVbiDQKiZfubUP7DPkg:6998", "headers" : { "X-Opaque-Id" : "imotov" } } ] } } } ```	2018-01-12 15:34:17 -05:00
Nhat Nguyen	626c3d1fda	Primary send safe commit in file-based recovery (#28038 ) Today a primary shard transfers the most recent commit point to a replica shard in a file-based recovery. However, the most recent commit may not be a "safe" commit; this leaves a replica shard without a safe commit point until it can retain a safe commit by itself. This commit collapses the snapshot deletion policy into the combined deletion policy and modifies the peer recovery source to send a safe commit. Relates #10708	2018-01-11 10:39:12 -05:00
Jason Tedor	2c24ac7426	Set watermarks in single-node test cases We set the watermarks to low values in other test cases to prevent test failures on nodes with low disk space (if the disk space is too low, the test will fail anyway but we should not prematurely fail). This commit sets the watermarks in the single-node test cases to avoid test failures in such situations. Relates #28134	2018-01-09 12:51:50 -05:00
Jim Ferenczi	36729d1c46	Add the ability to bundle multiple plugins into a meta plugin (#28022 ) This commit adds the ability to package multiple plugins in a single zip. The zip file for a meta plugin must contain the following structure: \|____elasticsearch/ \| \|____ <plugin1> <-- The plugin files for plugin1 (the content of the elasticsearch directory) \| \|____ <plugin2> <-- The plugin files for plugin2 \| \|____ meta-plugin-descriptor.properties <-- example contents below The meta plugin properties descriptor is mandatory and must contain the following properties: description: simple summary of the meta plugin. name: the meta plugin name The installation process installs each plugin in a sub-folder inside the meta plugin directory. The example above would create the following structure in the plugins directory: \|_____ plugins \| \|____ <name_of_the_meta_plugin> \| \| \|____ meta-plugin-descriptor.properties \| \| \|____ <plugin1> \| \| \|____ <plugin2> If the sub plugins contain a config or a bin directory, they are copied in a sub folder inside the meta plugin config/bin directory. \|_____ config \| \|____ <name_of_the_meta_plugin> \| \| \|____ <plugin1> \| \| \|____ <plugin2> \|_____ bin \| \|____ <name_of_the_meta_plugin> \| \| \|____ <plugin1> \| \| \|____ <plugin2> The sub-plugins are loaded at startup like normal plugins with the same restrictions; they have a separate class loader and a sub-plugin cannot have the same name as another plugin (or a sub-plugin inside another meta plugin). It is also not possible to remove a sub-plugin inside a meta plugin, only full removal of the meta plugin is allowed. Closes #27316	2018-01-09 18:28:43 +01:00
Tanguy Leroux	bba591bea0	Consistent updates of IndexShardSnapshotStatus (#28130 ) This commit changes IndexShardSnapshotStatus so that the Stage is updated coherently with any required information. It also provides an asCopy() method that returns the status of an IndexShardSnapshotStatus at a given point in time, ensuring that all information is coherent. Closes #26480	2018-01-09 14:01:57 +01:00
olcbean	fd45a46ce8	Deprecate `isShardsAcked()` in favour of `isShardsAcknowledged()` (#27819 ) Several responses include the shards_acknowledged flag (indicating whether the requisite number of shard copies started before the completion of the operation) and there are two different getters used : isShardsAcknowledged() and isShardsAcked(). This PR deprecates the isShardsAcked() in favour of isShardsAcknowledged() in CreateIndexResponse, RolloverResponse and CreateIndexClusterStateUpdateResponse. Closes #27784	2018-01-08 10:57:45 +01:00
Jason Tedor	eaa636d4bb	Clarify reproduce info on Windows This commit corrects the test failure reproduction line on Windows. Relates #28104	2018-01-06 22:49:14 -05:00
Jason Tedor	d712f581ca	Fix reproduction info to point to Gradle wrapper With the Gradle wrapper in place, we should point the reproduction info to specify using the Gradle wrapper too. Relates #28104	2018-01-06 08:47:23 -05:00
Tim Brooks	38701fb6ee	Create nio-transport plugin for NioTransport (#27949 ) This is related to #27260. This commit moves the NioTransport from :test:framework to a new nio-transport plugin. Additionally, supporting tcp decoding classes are moved to this plugin. Generic byte reading and writing contexts are moved to the nio library. Additionally, this commit adds a basic MockNioTransport to :test:framework that is a TcpTransport implementation for testing that is driven by nio.	2018-01-05 09:41:29 -07:00
Tim Brooks	be5da2815d	Set the elasticsearch-nio codebase for tests (#28067 ) This commit sets the elasticsearch-nio code base in the BootstrapForTesting class. This is necessary as that codebase needs socket permissions. Setting the codebase manually is necessary as intellij does not package our internal libraries when running tests.	2018-01-04 09:55:51 -07:00
Yannick Welsch	7cdbae2da8	Add Writeable.Reader support to TransportResponseHandler (#28010 ) Allows TransportResponse objects not to implement Streamable anymore. As an example, I've adapted the response handler for ShardActiveResponse, allowing the fields in that class to become final.	2018-01-04 10:27:08 +01:00
Ryan Ernst	d36ec18029	Plugins: Add plugin extension capabilities (#27881 ) This commit adds the infrastructure to plugin building and loading to allow one plugin to extend another. That is, one plugin may extend another by the "parent" plugin allowing itself to be extended through java SPI. When all plugins extending a plugin are finished loading, the "parent" plugin has a callback (through the ExtensiblePlugin interface) allowing it to reload SPI. This commit also adds an example plugin which uses as-yet implemented extensibility (adding to the painless whitelist).	2018-01-03 11:12:43 -08:00
Tim Brooks	c775374125	Disable nio test transport (#28028 ) This commit disables the nio transport as an option for the test transport in integration tests. This is because it does not currently run properly in intellij due to socket permissions. It should be reenabled once #27881 is merged (and the proper permissions are added).	2017-12-31 14:59:38 -07:00
Maxime Gréau	771defb97c	Build: Add 3rd party dependencies report generation (#27727 ) * Adds task dependenciesInfo to BuildPlugin to generate a CSV file with dependencies information (name,version,url,license) * Adds `ConcatFilesTask.groovy` to concatenate multiple files into one * Adds task `:distribution:generateDependenciesReport` to concatenate `dependencies.csv` files into a single file (`es-dependencies.csv` by default) # Examples: $ gradle dependenciesInfo :distribution:generateDependenciesReport ## Use `csv` system property to customize the output file path $ gradle dependenciesInfo :distribution:generateDependenciesReport -Dcsv=/tmp/elasticsearch-dependencies.csv ## When branch is not master, use `build.branch` system property to generate correct licenses URLs $ gradle dependenciesInfo :distribution:generateDependenciesReport -Dbuild.branch=6.x -Dcsv=/tmp/elasticsearch-dependencies.csv	2017-12-26 10:51:47 +01:00
Nhat Nguyen	6629f4ab0d	Rollback primary before recovering from translog (#27804 ) Today we always recover a primary from the last commit point. However with a new deletion policy, we keep multiple commit points in the existing store, thus we have a chance to find a good starting commit point. With a good starting commit point, we may be able to throw away stale operations. This PR rolls back a primary to a starting commit before recovering from translog. Relates #10708	2017-12-22 18:25:36 -05:00
Tim Brooks	06b313025c	Add elasticsearch-nio jar for base nio classes (#27801 ) This is related to #27802. This commit adds a jar called elasticsearch-nio that contains the base nio classes that will be used for the tcp nio transport and eventually the http nio transport. The jar does not depend on elasticsearch:core, so all references to core have been removed.	2017-12-20 16:29:16 -06:00
Nhat Nguyen	54b6885844	Check index under the store metadata lock (#27768 ) Today when we get a metadata snapshot directly from a store directory, we acquire a metadata lock, then acquire an IndexWriter lock. However, we create a CheckIndex in IndexShard without acquiring the metadata lock first. This causes recovery to fail because the IndexWriter lock can still be held by the method snapshotStoreMetadata. This commit makes sure to create a CheckIndex under the metadata lock. Closes #24481 Closes #27731 Relates #24787	2017-12-20 11:26:06 -05:00
Tanguy Leroux	0f80e7c5f6	[Test] Fix IndicesClientDocumentationIT (#27899 ) The last operation executed in IndicesClientDocumentationIT.testCreate() is an asynchronous index creation. Because nothing waits for its completion, on slow machines the index can sometimes be created after the testCreate() test is finished, and it can fail the following test. Closes #27754	2017-12-20 09:31:10 +01:00
Nik Everett	32669ca265	Test: Change randomValueOtherThan(null, supplier) (#27901 ) When the first parameter of `ESTestCase#randomValueOtherThan` is `null` then run the supplier until it returns non-`null`. Previously, `randomValueOtherThan` just ran the supplier one time which was confusing. Unexpectedly, it looks like no tests rely on the original `null` handling. Closes #27821	2017-12-19 10:23:38 -05:00
Boaz Leskes	bea9471b2f	Use port 0 InternalTestCluster nodes (#27859 ) We currently have a complicated port assignment scheme to make sure that the nodes spun up by the internal test cluster will be assigned fixed port ranges that will also not collide between clusters. The port ranges need to be fixed in advance so that the nodes will be able to find each other via `UnicastZenPing`. This approach worked well for the last few years but we are now at a point that our testing has grown beyond it and we exceed the 5 reusable ranges per JVM. This means that nodes are not always assigned the first 5 ports in their range which causes cluster formation issues. On top of that, most of the clusters that are spun up don't even rely on `UnicastZenPing` but rather `MockZenPings` that uses in memory maps for discovery (with the down side that they are not influenced by network disruption simulations). This PR changes `InternalTestCluster` to use port 0 as a fixed assignment. This will allow the OS to manage ports and will ensure we don't have collisions. For tests that need to simulate network disruptions (and thus can't use `MockZenPings`), a new `UnicastHostProvider` is introduced that is based on the current state of the test cluster. Since that is only resolved at run time, it is aware of the port assignments of the OS. Closes #27818 Closes #27762	2017-12-19 08:43:03 +01:00
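The behaviour described in the commit above can be sketched as a retry loop. This is a minimal sketch under assumptions: the class name and the equality check via `Objects.equals` are illustrative choices, not the `ESTestCase` source. Because `Objects.equals(null, null)` is true, passing `null` as the excluded value makes the loop run until the supplier returns a non-`null` value.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.Objects;
import java.util.function.Supplier;

// Hypothetical helper mirroring the described semantics; not the ESTestCase code.
final class RandomValueSketch {

    /** Calls the supplier until it yields a value that differs from input. */
    static <T> T randomValueOtherThan(T input, Supplier<T> supplier) {
        T candidate;
        do {
            candidate = supplier.get();
        } while (Objects.equals(input, candidate)); // null input rejects null candidates
        return candidate;
    }

    public static void main(String[] args) {
        // A deterministic stand-in for a random supplier: two nulls, then a value.
        Iterator<String> values = Arrays.asList(null, null, "value").iterator();
        String picked = randomValueOtherThan(null, values::next);
        System.out.println(picked);
    }
}
```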
Jason Tedor	aebdb2a646	Filter current version from compatible versions We need to filter the current version from the list of compatible versions to match how we calculate the list of compatible versions in Gradle.	2017-12-18 17:37:22 -05:00
Yannick Welsch	a5e8a221ec	Move GlobalCheckpointTracker and remove SequenceNumbersService (#27837 ) This commit moves GlobalCheckpointTracker from the engine to IndexShard, where it better fits logically: Tracking the global checkpoint based on the local checkpoints of all shards in the replication group is not a property of the engine, but rather a property fulfilled by the current primary shard. The LocalCheckpointTracker on the other hand is driven by the contents of the local translog. By moving GlobalCheckpointTracker to IndexShard, it makes little sense to keep the SequenceNumbersService class around - it would only wrap the LocalCheckpointTracker. This commit therefore removes the class and replaces occurrences of SequenceNumbersService in the engine directly by LocalCheckpointTracker.	2017-12-18 15:27:44 +01:00
Alan Woodward	af3f63616b	Allow TrimFilter to be used in custom normalizers (#27758 ) AnalysisFactoryTestCase checks that the ES custom token filter multi-term awareness matches the underlying lucene factory. For the trim filter this won't be the case until LUCENE-8093 is released in 7.3, so we add a temporary exclusion Closes #27310	2017-12-18 14:27:03 +00:00
Jason Tedor	76771242e8	Fix version tests for release tests This commit fixes the version tests for release tests. The problem here is that during release tests all versions should be treated as released so the assertions must be modified accordingly. Relates #27815	2017-12-18 08:51:37 -05:00
Boaz Leskes	9cd69e7ec1	recovery from snapshot should fill gaps (#27850 ) When snapshotting the primary we capture a lucene commit at an arbitrary moment from a sequence number perspective. This means that it is possible that the commit misses operations and that there is a gap between the local checkpoint in the commit and the maximum sequence number. When we restore, this will create a primary that "misses" operations and currently will mean that the sequence number system is stuck (i.e., the local checkpoint will be stuck). To fix this we should fill in gaps when we restore, in a similar fashion to normal store recovery.	2017-12-18 13:33:39 +01:00
David Turner	f0b21e3182	Make randomNonNegativeLong() draw from a uniform distribution (#27856 ) Currently randomNonNegativeLong() returns 0 half as often as any positive long, but random number generators are typically expected to return uniformly-distributed values unless otherwise specified. This fixes this issue by mapping Long.MIN_VALUE directly onto 0 rather than resampling.	2017-12-18 09:57:40 +00:00
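The counting argument behind the commit above: among all 2^64 `long` values, every positive result `n` has two sources under `Math.abs` (`n` and `-n`), while `0` has only one unless `Long.MIN_VALUE`, whose absolute value overflows, is also mapped to it. A minimal sketch of that mapping follows; the class and the use of `java.util.Random` are assumptions for illustration, not the test framework's code.

```java
import java.util.Random;

// Hypothetical sketch of the mapping described in the commit message.
final class NonNegativeLongSketch {

    /** Folds a raw long onto [0, Long.MAX_VALUE] with two sources per result. */
    static long toNonNegative(long raw) {
        // Math.abs(Long.MIN_VALUE) overflows and stays negative, so that one
        // value is mapped to 0, giving 0 the same weight as any positive long.
        return raw == Long.MIN_VALUE ? 0L : Math.abs(raw);
    }

    static long randomNonNegativeLong(Random random) {
        return toNonNegative(random.nextLong());
    }

    public static void main(String[] args) {
        Random random = new Random();
        for (int i = 0; i < 3; i++) {
            System.out.println(randomNonNegativeLong(random));
        }
    }
}
```

Resampling whenever `Long.MIN_VALUE` is drawn would leave `0` with a single source, which is why it would come up half as often as any positive value.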
Tim Brooks	916e7dbe29	Add NioGroup for use in different transports (#27737 ) This commit is related to #27260. It adds a base NioGroup for use in different transports. This class creates and starts the underlying selectors. Different protocols or transports are established by passing the ChannelFactory to the bindServerChannel or openChannel methods. This allows a TcpChannelFactory to be passed which will create and register channels that support the elasticsearch tcp binary protocol or a channel factory that will create http channels (or other).	2017-12-15 10:42:00 -07:00
Tim Brooks	f33f9612a7	Remove potential nio selector leak (#27825 ) When an ESSelector is created an underlying nio selector is opened. This selector is closed by the event loop after close has been signalled by another thread. However, there is a possibility that an ESSelector is created and some exception in the startup process prevents it from ever being started (however, close will still be called). This allows the selector to leak. This commit addresses this issue by having the signalling thread close the selector if the event loop is not running when close is signalled.	2017-12-14 14:37:41 -07:00
Adrien Grand	1b660821a2	Allow `_doc` as a type. (#27816 ) Allowing `_doc` as a type will enable users to make the transition to 7.0 smoother since the index APIs will be `PUT index/_doc/id` and `POST index/_doc`. This also moves most of the documentation to `_doc` as a type name. Closes #27750 Closes #27751	2017-12-14 17:47:53 +01:00
Daniel Mitterdorfer	d26b33dea2	Mute VersionUtilsTest#testGradleVersionsMatchVersionUtils Relates #27815	2017-12-14 12:33:41 +01:00
Nhat Nguyen	57fc705d5e	Keep commits and translog up to the global checkpoint (#27606 ) We need to keep index commits and translog operations up to the current global checkpoint to allow us to throw away unsafe operations and increase the operation-based recovery chance. This is achieved by a new index deletion policy. Relates #10708	2017-12-12 19:20:08 -05:00
Tim Brooks	d1acb7697b	Remove internal channel tracking in transports (#27711 ) This commit attempts to continue unifying the logic between different transport implementations. As transports call a `TcpTransport` callback when a new channel is accepted, there is no need to internally track channels accepted. Instead there is a set of accepted channels in `TcpTransport`. This set is used for metrics and shutting down channels.	2017-12-08 16:56:53 -07:00
Tim Brooks	d82c40d35c	Implement byte array reusage in `NioTransport` (#27696 ) This is related to #27563. This commit modifies the InboundChannelBuffer to support releasable byte pages. These byte pages are provided by the PageCacheRecycler. The PageCacheRecycler must be passed to the Transport with this change.	2017-12-08 10:39:30 -07:00
Tim Brooks	da5f52a2fc	Add test for writer operation buffer accounting (#27707 ) This is a follow up to #27695. This commit adds a test checking that across multiple writes using multiple buffers, a write operation properly keeps track of which buffers still need to be written.	2017-12-07 12:48:49 -07:00
Christoph Büscher	b83e14858a	Correcting some minor typos in comments	2017-12-07 16:39:23 +01:00
Tim Brooks	5b3230cbae	Fix issue where the incorrect buffers are written (#27695 ) This is a followup to #27551. That commit introduced a bug where the incorrect byte buffers would be returned when we attempted a write. This commit fixes the logic.	2017-12-06 20:57:46 -07:00
Tim Brooks	2aa62daed4	Introduce resizable inbound byte buffer (#27551 ) This is related to #27563. In order to interface with java nio, we must have buffers that are compatible with ByteBuffer. This commit introduces a basic ByteBufferReference to easily allow transferring bytes off the wire to usage in the application. Additionally it introduces an InboundChannelBuffer. This is a buffer that can internally expand as more space is needed. It is designed to be integrated with a page recycler so that it can internally reuse pages. The final piece is moving all of the index work for writing bytes to a channel into the WriteOperation.	2017-12-06 11:02:25 -07:00
Jim Ferenczi	caea6b70fa	Add a new cluster setting to limit the total number of buckets returned by a request (#27581 ) This commit adds a new dynamic cluster setting named `search.max_buckets` that can be used to limit the number of buckets created per shard or by the reduce phase. Each multi bucket aggregator can consume buckets during the final build of the aggregation at the shard level or during the reduce phase (final or not) in the coordinating node. When an aggregator consumes a bucket, a global count for the request is incremented and if this number is greater than the limit an exception is thrown (TooManyBuckets exception). This change adds the ability for multi bucket aggregator to "consume" buckets in the global limit, the default is 10,000. It's an opt-in consumer so each multi-bucket aggregator must explicitly call the consumer when a bucket is added in the response. Closes #27452 #26012	2017-12-06 09:15:28 +01:00
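The opt-in accounting described in the commit above can be pictured as a counter that aggregators call whenever they add buckets. The sketch below is an assumption-laden illustration: the class names, the exception type and the use of `IntConsumer` are hypothetical and do not reproduce the Elasticsearch classes. Only the rule "increment a per-request count and throw once it exceeds the limit" comes from the commit message.

```java
import java.util.function.IntConsumer;

// Hypothetical sketch of a per-request bucket limit; names are illustrative.
final class BucketLimitSketch {

    /** Stand-in for the "TooManyBuckets" exception mentioned in the commit. */
    static final class TooManyBucketsSketchException extends RuntimeException {
        TooManyBucketsSketchException(int count, int limit) {
            super("Trying to create [" + count + "] buckets, limit is [" + limit + "]");
        }
    }

    /** Aggregators call accept(n) each time they add n buckets to a response. */
    static final class BucketConsumer implements IntConsumer {
        private final int limit;
        private int count;

        BucketConsumer(int limit) {
            this.limit = limit;
        }

        @Override
        public void accept(int buckets) {
            count += buckets;
            if (count > limit) { // fail only once the limit is exceeded
                throw new TooManyBucketsSketchException(count, limit);
            }
        }

        int count() {
            return count;
        }
    }

    public static void main(String[] args) {
        BucketConsumer consumer = new BucketConsumer(10_000); // default named in the commit
        consumer.accept(9_999);
        consumer.accept(1); // exactly at the limit: still allowed
        try {
            consumer.accept(1); // one bucket over the limit
        } catch (TooManyBucketsSketchException e) {
            System.out.println(e.getMessage());
        }
    }
}
```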
Luca Cavanna	f4fb4d3bf5	Add support for filtering mappings fields (#27603 ) Add support for filtering fields returned as part of mappings in get index, get mappings, get field mappings and field capabilities API. Plugins can plug in their own function, which receives the index as argument, and return a predicate which controls whether each field is included or not in the returned output.	2017-12-05 20:31:29 +01:00
Jason Tedor	42a4ad35da	Add node name to thread pool executor name This commit adds the node name to the names of thread pool executors so that the node name is visible in rejected execution exception messages. Relates #27663	2017-12-05 07:45:40 -05:00
Lee Hinman	1ff5ef9055	[TEST] Check accounting breaker is equal to segment stats rather than 0 If there are existing indices, it may not be 0	2017-12-04 14:15:23 -07:00
Simon Willnauer	84ec472428	Include internal refreshes in refresh stats (#27615 ) Today we exclude internal refreshes in the refresh stats. Yet, it's very much confusing to not take these into account. This change includes internal refreshes into the stats until we have a dedicated stats for this.	2017-12-04 16:33:47 +01:00
Boaz Leskes	f58a3d0b96	testRelocationWithConcurrentIndexing: wait for green (on relevant index) and shard initialization to settle down before starting relocation	2017-12-04 13:18:42 +01:00
Boaz Leskes	1a976ea7a4	Cherry pick tests and seqNo recovery hardening from #27580	2017-12-04 13:15:40 +01:00
James Baiera	e16f1271b6	Fix SecurityException when HDFS Repository used against HA Namenodes (#27196 ) * Sense HA HDFS settings and remove permission restrictions during regular execution. This PR adds integration tests for HA-Enabled HDFS deployments, both regular and secured. The Mini HDFS fixture has been updated to optionally run in HA-Mode. A new test suite has been added for reproducing the effects of a Namenode failing over during regular repository usage. Going forward, the HDFS Repository will still be subject to its self imposed permission restrictions during normal use, but will no longer restrict them when running against an HA enabled HDFS cluster. Instead, the plugin will rely on the provided security policy and not further restrict the permissions so that the transparent operation to failover to a different Namenode in the client does not raise security exceptions. Additionally, we are now testing the secure mode with SASL based wire encryption of data between Elasticsearch and HDFS. This includes a missing library (commons codec) in order to support this change.	2017-12-01 14:26:05 -05:00
Lee Hinman	623d3700f0	Add accounting circuit breaker and track segment memory usage (#27116 ) * Add accounting circuit breaker and track segment memory usage This commit adds a new circuit breaker "accounting" that is used for tracking the memory usage of non-request-tied memory users. It also adds tracking for the amount of Lucene segment memory used by a shard as a user of the new circuit breaker. The Lucene segment memory is updated when the shard refreshes, and removed when the shard relocates away from a node or is deleted. It should also be noted that all tracking for segment memory uses `addWithoutBreaking` so as not to fail the shard if a limit is reached. The `accounting` breaker has a default limit of 100% and will contribute to the parent breaker limit. Resolves #27044	2017-12-01 07:59:45 -07:00
Luca Cavanna	3e8ca38fca	Deprecate the transport client in favour of the high-level REST client (#27085 )	2017-12-01 12:24:16 +01:00
Tim Brooks	b8557651aa	Add exception handling for write listeners (#27590 ) This potential issue was exposed when I saw this PR #27542. Essentially we currently execute the write listeners all over the place without consistently catching and handling exceptions. Some of these exceptions will be logged in different ways (including as low as `debug`). This commit adds a single location where these listeners are executed. If the listener throws an exception, the exception is caught and logged at the `warn` level.	2017-11-29 15:47:12 -07:00
David Turner	00867e618d	Transpose expected and actual, and remove duplicate info from message. (#27515 ) Previously: ``` > Throwable #1: java.lang.AssertionError: Expected all shards successful but got successful [8] total [9] > Expected: <8> > but: was <9> ``` Now: ``` > Throwable #1: java.lang.AssertionError: Expected all shards successful > Expected: <9> > but: was <8> ```	2017-11-24 17:45:34 +00:00
Tanguy Leroux	5dc5580eac	Delete shard store files before restoring a snapshot (#27476 ) Pull request #20220 added a change where the store files that have the same name but are different from the ones in the snapshot are deleted first before the snapshot is restored. This logic was based on the `Store.RecoveryDiff.different` set of files which works by computing a diff between an existing store and a snapshot. This works well when the files on the filesystem form a valid shard store, i.e. there's a `segments` file and store files are not corrupted. Otherwise, the existing store's snapshot metadata cannot be read (using Store#snapshotStoreMetadata()) and an exception is thrown (CorruptIndexException, IndexFormatTooOldException etc) which is later caught at the beginning of the restore process (see RestoreContext#restore()) and is translated into an empty store metadata (Store.MetadataSnapshot.EMPTY). This will make the deletion of different files introduced in #20220 useless as the set of files will always be empty even when store files exist on the filesystem. And if some files are present within the store directory, then restoring a snapshot with files with same names will fail with a FileAlreadyExistException. This is part of the #26865 issue. There are various cases where some files could exist in the store directory before a snapshot is restored. One that Igor identified is a restore attempt that failed on a node and only first files were restored, then the shard is allocated again to the same node and the restore starts again (but fails because of existing files). Another one is when some files of a closed index are corrupted / deleted and the index is restored. This commit adds a test that uses the infrastructure provided by IndexShardTestCase in order to test that restoring a shard succeeds even when files with same names exist on filesystem. Related to #26865	2017-11-24 13:15:34 +01:00
Martijn van Groningen	f1ebf366bf	unmuted test, this has been fixed by #27397 Closes #27497	2017-11-24 08:53:00 +01:00
David Turner	89ba8996c6	Consolidate version numbering semantics (#27397 ) Fixes to the build system, particularly around BWC testing, and to make future version bumps less painful.	2017-11-23 20:21:53 +00:00
Martijn van Groningen	ca9c476d88	muted test	2017-11-22 19:18:35 +01:00
Tim Brooks	ef34555b29	Decouple nio constructs from the tcp transport (#27484 ) This is related to #27260. Currently, basic nio constructs (nio channels, the channel factories, selector event handlers, etc) implement logic that is specific to the tcp transport. For example, NioChannel implements the TcpChannel interface. These nio constructs at some point will also need to support other protocols (ex: http). This commit separates the TcpTransport logic from the nio building blocks.	2017-11-22 11:39:31 -06:00
Jim Ferenczi	6319424e4a	Move composite aggregation to core (#27474 ) This change removes the module named aggs-composite and adds the `composite` aggs as a core aggregation. This allows other plugins to use this new aggregation and simplifies the integration in the HL rest client.	2017-11-21 13:31:01 +01:00
Tim Brooks	f37eb1b403	Remove tcp profile from low level nio channel (#27441 ) This is related to #27260. Currently every nio channel has a profile field. Profile is a concept that only relates to the tcp transport. Http channels will not have profiles. This commit moves the profile from the nio channel to the read context. The context is the level that protocol specific features and logic should live.	2017-11-20 12:20:42 -07:00
Tim Brooks	0a8f48d592	Transition transport apis to use void listeners (#27440 ) Currently we use ActionListener<TcpChannel> for connect, close, and send message listeners in TcpTransport. However, all of the listeners have to capture a reference to a channel in the case of the exception api being called. This commit changes these listeners to be type <Void> as passing the channel to onResponse is not necessary. Additionally, this change makes it easier to integrate with low level transports (which use different implementations of TcpChannel).	2017-11-20 10:47:47 -07:00
Michael Basnight	2949c53174	Remove config prompting for secrets and text (#27216 ) This commit removes the ability to use ${prompt.secret} and ${prompt.text} as valid config settings. Secure settings has obsoleted the need for this, and it cleans up some of the code in Bootstrap.	2017-11-19 22:33:17 -06:00
Michael Basnight	cb3e8f4763	Move the CLI into its own subproject (#27114 ) Projects that depend on the CLI currently depend on core. This should not always be the case. The EnvironmentAwareCommand will remain in :core, but the rest of the CLI components have been moved into their own subproject of :core, :core:cli.	2017-11-18 21:42:57 -06:00
Tim Brooks	ce45e29be7	Remove manual tracking of registered channels (#27445 ) This is related to #27260. Currently, every ESSelector keeps track of all channels that are registered with it. ESSelector is just an abstraction over a raw java nio selector. The java nio selector already tracks its own selection keys. This commit removes our tracking and relies on the java nio selector tracking.	2017-11-17 16:20:09 -07:00
David Turner	08a257327f	Remove newline from log message (#27425 ) It leads to harder-to-parse logs that look like this: ``` 1> [2017-11-16T20:46:21,804][INFO ][o.e.t.r.y.ClientYamlTestClient] Adding header Content-Type 1> with value application/json 1> [2017-11-16T20:46:21,812][INFO ][o.e.t.r.y.ClientYamlTestClient] Adding header Content-Type 1> with value application/json 1> [2017-11-16T20:46:21,820][INFO ][o.e.t.r.y.ClientYamlTestClient] Adding header Content-Type 1> with value application/json 1> [2017-11-16T20:46:21,966][INFO ][o.e.t.r.y.ClientYamlTestClient] Adding header Content-Type 1> with value application/json ```	2017-11-17 14:12:06 +00:00
Tim Brooks	f761a0e0e4	Remove unneeded Throwable handling in nio (#27412 ) This is related to #27260. In the nio transport work we do not catch or handle `Throwable`. There are a few places where we have exception handlers that accept `Throwable`. This commit removes those cases.	2017-11-16 18:24:06 -07:00
David Turner	9766b858d0	Prepare for bump to 6.0.1 on the master branch (#27391 ) An assortment of fixes, particularly to version number calculations, in preparation for the bump to 6.0.1.	2017-11-16 18:38:54 +00:00
Tim Brooks	80ef9bbdb1	Remove parameterization from TcpTransport (#27407 ) This commit is a follow up to the work completed in #27132. Essentially it transitions two more methods (sendMessage and getLocalAddress) from Transport to TcpChannel. With this change, there is no longer a need for TcpTransport to be aware of the specific type of channel a transport returns. So that class is no longer parameterized by channel type.	2017-11-16 11:19:36 -07:00
Tim Brooks	35a5922927	Delete unneeded nio client (#27408 ) This is a follow up to #27132. As that PR greatly simplified the connection logic inside a low level transport implementation, much of the functionality provided by the NioClient class is no longer necessary. This commit removes that class.	2017-11-16 09:22:40 -07:00
Jim Ferenczi	623367d793	Add composite aggregator (#26800 ) * This change adds a module called `aggs-composite` that defines a new aggregation named `composite`. The `composite` aggregation is a multi-buckets aggregation that creates composite buckets made of multiple sources. The sources for each bucket can be defined as: * A `terms` source, values are extracted from a field or a script. * A `date_histogram` source, values are extracted from a date field and rounded to the provided interval. This aggregation can be used to retrieve all buckets of a deeply nested aggregation by flattening the nested aggregation in composite buckets. A composite bucket is composed of one value per source and is built for each document as the combinations of values in the provided sources. For instance the following aggregation: ```` "test_agg": { "terms": { "field": "field1" }, "aggs": { "nested_test_agg": { "terms": { "field": "field2" } } } } ```` ... which retrieves the top N terms for `field1` and for each top term in `field1` the top N terms for `field2`, can be replaced by a `composite` aggregation in order to retrieve all the combinations of `field1`, `field2` in the matching documents: ```` "composite_agg": { "composite": { "sources": [ { "field1": { "terms": { "field": "field1" } } }, { "field2": { "terms": { "field": "field2" } } } ] } } ```` The response of the aggregation looks like this: ```` "aggregations": { "composite_agg": { "buckets": [ { "key": { "field1": "alabama", "field2": "almanach" }, "doc_count": 100 }, { "key": { "field1": "alabama", "field2": "calendar" }, "doc_count": 1 }, { "key": { "field1": "arizona", "field2": "calendar" }, "doc_count": 1 } ] } } ```` By default this aggregation returns 10 buckets sorted in ascending order of the composite key. Pagination can be achieved by providing `after` values, the values of the composite key to aggregate after. 
For instance the following aggregation will aggregate all composite keys that sort after `arizona, calendar`: ```` "composite_agg": { "composite": { "after": {"field1": "alabama", "field2": "calendar"}, "size": 100, "sources": [ { "field1": { "terms": { "field": "field1" } } }, { "field2": { "terms": { "field": "field2" } } } ] } } ```` This aggregation is optimized for indices that set an index sorting that matches the composite source definition. For instance the aggregation above could run faster on indices that define an index sorting like this: ```` "settings": { "index.sort.field": ["field1", "field2"] } ```` In this case the `composite` aggregation can early terminate on each segment. This aggregation also accepts multi-valued fields but disables early termination for these fields even if index sorting matches the sources definition. This is mandatory because index sorting picks only one value per document to perform the sort.	2017-11-16 15:13:36 +01:00
Tim Brooks	ca11085bb6	Add TcpChannel to unify Transport implementations (#27132 ) Right now our different transport implementations must duplicate functionality in order to stay compliant with the requirements of TcpTransport. They must all implement common logic to open channels, close channels, keep track of channels for eventual shutdown, etc. Additionally, there is a weird and complicated relationship between Transport and TransportService. We eventually want to start merging some of the functionality between these classes. This commit starts moving towards a world where TransportService retains all the application logic and channel state. Transport implementations in this world will only be tasked with returning a channel when one is requested, calling transport service when a channel is accepted from a server, and starting / stopping itself. Specifically this commit changes how channels are opened and closed. All Transport implementations now return a channel type that must comply with the new TcpChannel interface. This interface has the methods necessary for TcpTransport to completely manage the lifecycle of a channel. This includes setting the channel up, waiting for connection, adding close listeners, and eventually closing.	2017-11-15 12:38:39 -07:00
Luca Cavanna	382da0f227	REST spec: Validate that api name matches file name that contains it (#27366 ) This commit validates that each spec json file contains an API that has the same name as the file	2017-11-14 14:53:00 +01:00
Simon Willnauer	2299c70371	Allow affix settings to specify dependencies (#27161 ) We use affix settings to group settings / values under a certain namespace. In some cases like login information for instance a setting is only valid if one or more other settings are present. For instance `x.test.user` is only valid if there is an `x.test.passwd` present and vice versa. This change allows to specify such a dependency to prevent settings updates that leave settings in an inconsistent state.	2017-11-13 12:06:36 +01:00
Simon Willnauer	a34c2f0b8d	Ensure external refreshes will also refresh internal searcher to minimize segment creation (#27253 ) We cut over to internal and external IndexReader/IndexSearcher in #26972 which uses two independent searcher managers. This has the downside that refreshes of the external reader will never clear the internal version map which in turn will trigger additional and potentially unnecessary segment flushes since memory must be freed. Under heavy indexing load with low refresh intervals this can cause excessive segment creation which causes high GC activity and significantly increases the required segment merges. This change adds a dedicated external reference manager that delegates refreshes to the internal reference manager that then `steals` the refreshed reader from the internal reference manager for external usage. This ensures that external and internal readers are consistent on an external refresh. As a side effect this also releases old segments referenced by the internal reference manager which can potentially hold on to already merged away segments until it is refreshed due to a flush or indexing activity.	2017-11-09 08:40:22 +00:00
Tim Brooks	dc86b4c2ed	Decouple `ChannelFactory` from Tcp classes (#27286 ) * Decouple `ChannelFactory` from Tcp classes This is related to #27260. Currently `ChannelFactory` is tightly coupled to classes related to the elasticsearch Tcp binary protocol. This commit modifies the factory to be able to construct http or other protocol channels.	2017-11-08 14:30:00 -07:00
Jason Tedor	d5451b2037	Die with dignity while merging If an out of memory error is thrown while merging, today we quietly rewrap it into a merge exception and the out of memory error is lost. Instead, we need to rethrow out of memory errors, and in fact any fatal error here, and let those go uncaught so that the node is torn down. This commit causes this to be the case. Relates #27265	2017-11-06 17:55:11 -05:00
Jason Tedor	766d29e7cf	Correctly encode warning headers The warnings headers have a fairly limited set of valid characters (cf. quoted-text in RFC 7230). While we have assertions that we adhere to this set of valid characters ensuring that our warning messages do not violate the specification, we were neglecting the possibility that arbitrary user input would trickle into these warning headers. Thus, missing here were tests for these situations and encoding of characters that appear outside the set of valid characters. This commit addresses this by encoding any characters in a deprecation message that are not from the set of valid characters. Relates #27269	2017-11-06 13:20:30 -05:00
Simon Willnauer	bd7efa908a	Add ability to split shards (#26931 ) This change adds a new `_split` API that allows to split indices into a new index with a power of two more shards than the source index. This API works alongside the `_shrink` API but doesn't require any shard relocation before indices can be split. The split operation is conceptually an inverse `_shrink` operation since we initialize the index with a _synthetic_ number of routing shards that are used for the consistent hashing at index time. Compared to indices created with earlier versions this might produce slightly different shard distributions but has no impact on the per-index backwards compatibility. For now, the user is required to prepare an index to be splittable by setting the `index.number_of_routing_shards` at index creation time. The setting allows the user to prepare the index to be splittable in factors of `index.number_of_routing_shards`, i.e. if the index is created with `index.number_of_routing_shards: 16` and `index.number_of_shards: 2` it can be split into `4, 8, 16` shards. This is an intermediate step until we can make this the default. This also allows us to safely backport this change to 6.x. The `_split` operation is implemented internally as a DeleteByQuery on the lucene level that is executed while the primary shards execute their initial recovery. Subsequent merges that are triggered due to this operation will not be executed immediately. All merges will be deferred until the shards are started and will then be throttled accordingly. This change is intended for the 6.1 feature release but will not support pre-6.1 indices to be split unless these indices have been shrunk before. In that case these indices can be split backwards into their original number of shards.	2017-11-06 11:37:55 +01:00
Tanguy Leroux	43e7a4a349	Upgrade to Jackson 2.8.10 (#27230 ) While it's not possible to upgrade the Jackson dependencies to their latest versions yet (see #27032 (comment) for more) it's still possible to upgrade to the latest 2.8.x version.	2017-11-06 10:20:05 +01:00
Jim Ferenczi	429275a773	Remove ElasticsearchQueryCachingPolicy (#27190 ) We have a hidden setting called `index.queries.cache.term_queries` that disables caching of term queries in the query cache. However, term queries have not been cached in the Lucene UsageTrackingQueryCachingPolicy since version 6.5. This makes the es policy useless but also makes it impossible to re-enable caching for term queries. This change appeared in Lucene 6.5 so this setting has been a no-op since version 5.4 of Elasticsearch. The change in this PR removes the setting and the custom policy.	2017-11-06 08:26:24 +01:00
David Roberts	749c3ec716	Remove the single argument Environment constructor (#27235 ) Only tests should use the single argument Environment constructor. To enforce this the single arg Environment constructor has been replaced with a test framework factory method. Production code (beyond initial Bootstrap) should always use the same Environment object that Node.getEnvironment() returns. This Environment is also available via dependency injection.	2017-11-04 13:25:09 +00:00
kel	0f21262b36	Do not create directories if repository is readonly (#26909 ) For FsBlobStore and HdfsBlobStore, if the repository is read only, the blob store should be aware of the readonly setting and not create directories if they don't exist. Closes #21495	2017-11-03 13:10:50 +01:00
Jason Tedor	d6d830ff0b	Fix logic detecting unreleased versions When partitioning version constants into released and unreleased versions, today we have a bug in finding the last unreleased version. Namely, consider the following version constants on the 6.x branch: ..., 5.6.3, 5.6.4, 6.0.0-alpha1, ..., 6.0.0-rc1, 6.0.0-rc2, 6.0.0, 6.1.0. In this case, our convention dictates that: 5.6.4, 6.0.0, and 6.1.0 are unreleased. Today we correctly detect that 6.0.0 and 6.1.0 are unreleased, and then we say the previous patch version is unreleased too. The problem is the logic to remove that previous patch version is broken, it does not skip alphas/betas/RCs which have been released. This commit fixes this by skipping backwards over pre-release versions when finding the previous patch version to remove. Relates #27206	2017-11-01 13:01:45 -04:00
Colin Goodheart-Smithe	99aca9cdfc	Enhances exists queries to reduce need for `_field_names` (#26930 ) * Enhances exists queries to reduce need for `_field_names` Before this change we wrote the names of all the fields in a document to a `_field_names` field and then implemented exists queries as a term query on this field. The problem with this approach is that it bloats the index and also affects indexing performance. This change adds a new method `existsQuery()` to `MappedFieldType` which is implemented by each sub-class. For most field types if doc values are available a `DocValuesFieldExistsQuery` is used, falling back to using `_field_names` if doc values are disabled. Note that only fields where no doc values are available are written to `_field_names`. Closes #26770 * Addresses review comments * Addresses more review comments * implements existsQuery explicitly on every mapper * Reinstates ability to perform term query on `_field_names` * Added bwc depending on index created version * Review Comments * Skips tests that are not supported in 6.1.0 These values will need to be changed after backporting this PR to 6.x	2017-11-01 10:46:59 +00:00
kel	c3e2bdf20c	Raise IllegalArgumentException if query validation failed (#26811 ) Closes #26799	2017-10-31 12:17:27 +01:00
Adrien Grand	3812d3cb43	TopHitsAggregator must propagate calls to `setScorer`. (#27138 ) It is required in order to work correctly with bulk scorer implementations that change the scorer during the collection process. Otherwise sub collectors might call `Scorer.score()` on the wrong scorer. Closes #27131	2017-10-31 09:59:06 +01:00
Jason Tedor	a566942219	Refactor internal engine This commit is a minor refactoring of internal engine to move hooks for generating sequence numbers into the engine itself. As such, we refactor tests that relied on this hook to use the new hook, and remove the hook from the sequence number service itself. Relates #27082	2017-10-30 13:10:20 -04:00
Ryan Ernst	2a8452b513	Reindex: Fix headers in reindex action (#26937 ) The headers passed to reindex were skipped except for the last one. This commit fixes the copying of the headers, as well as adds a base test case for rest client builders to access the headers within the built rest client. relates #22976	2017-10-25 16:37:01 -07:00
olcbean	981b7f4d39	Make yaml test runner stricter by enforcing `required` for paths and parameters (#27035 ) Till now the yaml test runner was verifying that the provided path parts and parameters are supported. With this PR, yaml test runner also checks that all required path parts and parameters are provided.	2017-10-25 19:36:42 +00:00
Luca Cavanna	8caf7d4ff8	Decouple BulkProcessor from ThreadPool (#26727 ) Introduce minimal thread scheduler as a base class for `ThreadPool`. Such a class can be used from the `BulkProcessor` to schedule retries and the flush task. This allows to remove the `ThreadPool` dependency from `BulkProcessor`, which requires to provide settings that contain `node.name` and also needed log4j for logging. Instead, it needs now a `Scheduler` that is much lighter and gets automatically created and shut down on close. Closes #26028	2017-10-25 10:30:23 +02:00
Lee Hinman	fcfbdf1f37	Expose adaptive replica selection stats in /_nodes/stats API This exposes the collected metrics we store for ARS in the nodes stats, as well as the computed rank of nodes. Each node exposes its perspective about the cluster. Here's an example output (with `?human`): ```json ... "adaptive_selection" : { "_k6v1-wERxyUd5ke6s-D0g" : { "outgoing_searches" : 0, "avg_queue_size" : 0, "avg_service_time" : "7.8ms", "avg_service_time_ns" : 7896963, "avg_response_time" : "9ms", "avg_response_time_ns" : 9095598, "rank" : "9.1" }, "VJiCUFoiTpySGmO00eWmtQ" : { "outgoing_searches" : 0, "avg_queue_size" : 0, "avg_service_time" : "1.3ms", "avg_service_time_ns" : 1330240, "avg_response_time" : "4.5ms", "avg_response_time_ns" : 4524154, "rank" : "4.5" }, "DHNGTdzyT9iiaCpEUsIAKA" : { "outgoing_searches" : 0, "avg_queue_size" : 0, "avg_service_time" : "2.1ms", "avg_service_time_ns" : 2113164, "avg_response_time" : "6.3ms", "avg_response_time_ns" : 6375810, "rank" : "6.4" } } ... ```	2017-10-24 08:58:42 -06:00
Tim Brooks	277637f42f	Do not set SO_LINGER on server channels (#26997 ) Right now we are attempting to set SO_LINGER to 0 on server channels when we are stopping the tcp transport. This is not a supported socket option and throws an exception. This also prevents the channels from being closed. This commit 1. doesn't set SO_LINGER for server channels, 2. checks that it is a supported option in nio, and 3. changes the log message to warn for server channel close exceptions.	2017-10-13 13:06:38 -06:00
Jason Tedor	393e73612e	Fix formatting in channel close test This commit fixes the indentation in the transport test case for a channel closing while connecting.	2017-10-10 13:39:45 -04:00
Jason Tedor	4c06b8f1d2	Check for closed connection while opening While opening a connection to a node, a channel can subsequently close. If this happens, a future callback whose purpose is to close all other channels and disconnect from the node will fire. However, this future will not be ready to close all the channels because the connection will not be exposed to the future callback yet. Since this callback is run once, we will never try to disconnect from this node again and we will be left with a closed channel. This commit adds a check that all channels are open before exposing the channel and throws a general connection exception. In this case, the usual connection retry logic will take over. Relates #26932	2017-10-10 13:34:51 -04:00
Simon Willnauer	cdd7c1e6c2	Return List instead of an array from settings (#26903 ) Today we return a `String[]` that requires copying values for every access. Yet, we already store the setting as a list so we can also directly return the unmodifiable list directly. This makes list / array access in settings a much cheaper operation especially if lists are large.	2017-10-09 09:52:08 +02:00
Nhat	bf4c3642b2	remove _primary and _replica shard preferences (#26791 ) The shard preference _primary, _replica and its variants were useful for the asynchronous replication. However, with the current impl, they are no longer useful and should be removed. Closes #26335	2017-10-08 11:03:06 -04:00
Jason Tedor	470e5e7cfc	Add additional low-level logging handler () * Add additional low-level logging handler We have the trace handler which is useful for recording sent messages but there are times where it would be useful to have more low-level logging about the events occurring on a channel. This commit adds a logging handler that can be enabled by setting a certain log level (org.elasticsearch.transport.netty4.ESLoggingHandler) to trace that provides trace logging on low-level channel events and includes some information about the request/response read/write events on the channel as well. * Remove imports * License header * Remove redundant * Add test * More assertions	2017-10-05 12:10:58 -04:00
Martijn van Groningen	b27e408ed2	Removed void token filter entries and added two tests	2017-10-05 13:25:05 +02:00
Md. Abdulla-Al-Sun	a40c474e10	Added Bengali Analyzer to Elasticsearch with respect to the lucene update(PR#238)	2017-10-05 13:25:05 +02:00
Boaz Leskes	2a04118e88	Promote common rest test utility methods to ESRestTestCase We have duplicates in some classes and I was about to create one more.	2017-10-05 10:08:10 +02:00
Simon Willnauer	00dfdf50cf	Represent lists as actual lists inside Settings (#26878 ) Today we represent each value of a list setting with its own dedicated key that ends with the index of the value in the list. Aside from the obvious weirdness this has several issues especially if lists are massive since it causes massive runtime penalties when validating settings. Like a list of 100k words will literally cause a create index call to time out and in turn massive slowdown on all subsequent validation runs. With this change we use a simple string list to represent the list. This change also forbids adding a setting that ends with a .0 which was internally used to detect a list setting. Once this has been rolled out for an entire major version all the internal .0 handling can be removed since all settings will be converted. Relates to #26723	2017-10-05 09:27:08 +02:00
Martijn van Groningen	dca787ed8a	upgrade to Lucene 7.1.0 snapshot version	2017-10-05 09:06:56 +02:00
Simon Willnauer	d1533e2397	Remove Settings#getAsMap() (#26845 ) Since `#getAsMap` exposes internal representation we are trying to remove it step by step. This commit is cleaning up some xcontent writing as well as usage in tests	2017-10-04 01:21:38 -06:00
Boaz Leskes	a18bd9caa2	Increase ESRestTestCase.waitForClusterStateUpdatesToFinish time out to 30s It is set to 10 sec but sometimes it takes the cluster longer to settle.	2017-10-03 12:24:36 +02:00
Tim Brooks	d80ad7f097	Check channel is open before setting SO_LINGER (#26857 ) This commit fixes #26855. Right now we set SO_LINGER to 0 if we are stopping the transport. This can throw a ChannelClosedException if the raw channel is already closed. We have a number of scenarios where it is possible this could be called with a channel that is already closed. This commit fixes the issue by checking that the channel is not closed before attempting to set the socket option.	2017-10-02 15:09:52 -06:00
Tim Brooks	9ae7a80ba5	Move raw selector usage into ESSelector (#26825 ) Currently we only log generic messages about errors in logs from the nio event handler. This means that we do not know which channel had issues connecting, reading, writing, etc. This commit changes the logs to include the local and remote addresses and profile for a channel.	2017-10-01 17:59:57 -06:00
Simon Willnauer	7b8d036ab5	Replace group map settings with affix setting (#26819 ) We use group settings historically instead of using a prefix setting which is more restrictive and type safe. The majority of the use cases need to access a key, value map based on the _leaf node_ of the setting, i.e. the setting `index.tag.*` might be used to tag an index with `index.tag.test=42` and `index.tag.staging=12` which then would be turned into a `{"test": 42, "staging": 12}` map. The group settings would always use `Settings#getAsMap` which is losing type information and uses internal representation of the settings. Using prefix settings allows now to access such a method type-safe and natively.	2017-09-30 14:27:21 +02:00
Tim Brooks	bf403ae028	Add information about nio channels in logs (#26806 ) Currently we only log generic messages about errors in logs from the nio event handler. This means that we do not know which channel had issues connecting, reading, writing, etc. This commit changes the logs to include the local and remote addresses and profile for a channel.	2017-09-28 17:11:26 -06:00
Simon Willnauer	25d6778d31	Add comment to TCP transport impls why we set SO_LINGER on close	2017-09-28 13:07:01 +02:00
Armin Braun	af06231d4c	#26701 Close TcpTransport on RST in some Spots to Prevent Leaking TIME_WAIT Sockets (#26764 ) #26701 Added option to RST instead of FIN to TcpTransport#closeChannels	2017-09-26 19:58:11 +00:00
Simon Willnauer	a506ba8602	Remove `Settings.put(Map<String,String>)` (#26785 ) `Map<String,String>` is basically erasing the type while other methods on the `Settings.Builder` are type safe and have corresponding `get` methods.	2017-09-26 12:15:20 +02:00
Simon Willnauer	aab4655e63	Unify Settings xcontent reading and writing (#26739 ) This change adds a fromXContent method to Settings that allows to read the xcontent that is produced by toXContent. It also replaces the entire settings loader infrastructure and removes the structured map representation. Future PRs will also tackle the `getAsMap` that exposes the internal representation of settings for better encapsulation.	2017-09-25 13:23:01 +02:00
Jason Tedor	f35d1de502	Introduce global checkpoint background sync It is the exciting return of the global checkpoint background sync. Long, long ago, in snapshot version far, far away we had and only had a global checkpoint background sync. This sync would fire periodically and send the global checkpoint from the primary shard to the replicas so that they could update their local knowledge of the global checkpoint. Later in time, as we sped ahead towards finalizing the initial version of sequence IDs, we realized that we need the global checkpoint updates to be inline. This means that on a replication operation, the primary shard would piggy back the global checkpoint with the replication operation to the replicas. The replicas would update their local knowledge of the global checkpoint and reply with their local checkpoint. However, this could allow the global checkpoint on the primary to advance again and the replicas would fall behind in their local knowledge of the global checkpoint. If another replication operation never fired, then the replicas would be permanently behind. To account for this, we added one more sync that would fire when the primary shard fell idle. However, this has problems: - the shard idle timer defaults to five minutes, a long time to wait for the replicas to learn of the new global checkpoint - if a replica missed the sync, there was no follow-up sync to catch them up - there is an inherent race condition where the primary shard could fall idle mid-operation (after having sent the replication request to the replicas); in this case, there would never be a background sync after the operation completes - tying the global checkpoint sync to the idle timer was never natural To fix this, we add two additional changes for the global checkpoint to be synced to the replicas. The first is that we add a post-operation sync that only fires if there are no operations in flight and there is a lagging replica. 
This gives us a chance to sync the global checkpoint to the replicas immediately after an operation so that they are always kept up to date. The second is that we add back a global checkpoint background sync that fires on a timer. This timer fires every thirty seconds, and is not configurable (for simplicity). This background sync is smarter than what we had previously in the sense that it only sends a sync if the global checkpoint on at least one replica is lagging that of the primary. When the timer fires, we can compare the global checkpoint on the primary to its knowledge of the global checkpoint on the replicas and only send a sync if there is a shard behind. Relates #26591	2017-09-21 15:34:13 -04:00
James Baiera	c760eec054	Add permission checks before reading from HDFS stream (#26716 ) Add checks for special permissions before reading hdfs stream data. Also adds test from readonly repository fix. MiniHDFS will now start with an existing repository with a single snapshot contained within. Readonly Repository is created in tests and attempts to list the snapshots within this repo.	2017-09-21 11:55:07 -04:00
Michael Basnight	f385e0cf26	Add bad_request to the rest-api-spec catch params (#26539 ) This adds another request to the catch params. It also makes sure that the generic request param does not allow 400 either.	2017-09-14 14:24:03 -05:00
Christoph Büscher	c7c6443b10	[Docs] "The the" is a great band, but ... (#26644 ) Removing several occurrences of this typo in the docs and javadocs, seems to be a common mistake. Corrections turn up once in a while in PRs, better to correct some of this in one sweep.	2017-09-14 15:08:20 +02:00
Adrien Grand	93da7720ff	Move non-core mappers to a module. (#26549 ) Today we have all non-plugin mappers in core. I'd like to start moving those that neither map to json datatypes nor are very frequently used like `date` or `ip` to a module. This commit creates a new module called `mappers-extra` and moves the `scaled_float` and `token_count` mappers to it. I'd like to eventually move `range` fields there but it's more complicated due to their intimate relationship with range queries. Relates #10368	2017-09-13 17:58:53 +02:00
Simon Willnauer	42f3129d7b	Allow plugins to validate cluster-state on join (#26595 ) Today we don't have a pluggable way to validate if the cluster state is compatible with the node that joins. We already apply some checks for index compatibility that prevent nodes from joining a cluster with indices they don't support but for plugins this isn't possible. This change adds a cluster state validator that allows plugins to prevent a join if the cluster-state is incompatible.	2017-09-12 15:32:33 +02:00
Ryan Ernst	5c35bff1c3	Test: Remove leftover static bwc test case (#26584 ) This test case was leftover from the static bwc tests. There was still one use for checking we do not load old indices, but this PR moves the legacy code needed for that directly into the test. I also opened a follow up issue to completely remove the unsupported test: #26583.	2017-09-11 15:38:30 -07:00
Jason Tedor	b2e4bfa0a7	Snapshot fallback should consider build.snapshot When determining if a build is a snapshot build, we look for a field in the JAR manifest. However, when running tests, we are not running with a compiled core Elasticsearch JAR, we are running with the compiled core classes on the classpath. We have a fallback for this, we always assume such a situation is a snapshot build. However, when running builds with -Dbuild.snapshot=false, this is not the case. As such, we need to fallback to the value of build.snapshot. However, there are cases where we are not running with a compiled core Elasticsearch JAR (e.g., when the transport client is embedded in a web container) so we should only do this fallback if we are in tests. To verify we are in tests, we check if randomized runner is on the classpath. Relates #26554	2017-09-11 07:42:11 -04:00
Martijn van Groningen	b391425da1	Added support to the percolate query to percolate multiple documents The percolator will add a `_percolator_document_slot` field to all percolator hits to indicate with what document it has matched. This number matches with the order in which the documents have been specified in the percolate query. Also improved the support for multiple percolate queries in a search request.	2017-09-08 17:28:39 +02:00
Tim Brooks	c1a20f7e48	Merge tsa with ts (#26369 ) We currently have a weird relationship between Transport, TransportService, and TransportServiceAdaptor. At some point I think that we would like to collapse these all into one concept as we only support TCP transports. This commit moves in that direction by eliminating the adaptor and just passing the transport service to the transport.	2017-09-05 09:15:56 -06:00
Boaz Leskes	2fd4af82e4	Move `UNASSIGNED_SEQ_NO` and `NO_OPS_PERFORMED` to SequenceNumbers (#26494 ) Where they better belong.	2017-09-04 16:31:00 +02:00
Alexander Reelsen	80d0a32f8e	ScriptService: Replace max compilation per minute setting with max compilation rate (#26399 ) The current script service has a script compilation limit for a one minute window. This is set to a small default value of 15. Instead of increasing that default value, this commit introduces a new setting that allows to configure a rate per time unit, so that the script service can deal with bursts better. The new setting is named `script.max_compilations_rate`, requires a nonnegative number and a positive time value. The default is `75/5m`, which is equivalent to the existing 15 per minute.	2017-09-01 10:15:27 +02:00
Lee Hinman	c3da66d021	Implement adaptive replica selection (#26128 ) * Implement adaptive replica selection This implements the selection algorithm described in the C3 paper for determining which copy of the data a query should be routed to. By using the service time EWMA, response time EWMA, and queue size EWMA we calculate the score of a node by piggybacking these metrics with each search request. Since Elasticsearch lacks the "broadcast to every copy" behavior that Cassandra has (as mentioned in the C3 paper) to update metrics after a node has been highly weighted, this implementation adjusts a node's response stats using the average of its own and the "best" node's metrics. This is so that a long GC or other activity that may cause a node's rank to increase dramatically does not permanently keep a node from having requests routed to it; instead it will eventually lower its score back to the realm where it is a potential candidate for new queries. This feature is off by default and can be turned on with the dynamic setting `cluster.routing.use_adaptive_replica_selection`. 
Relates to #24915, however instead of `b=3` I used `b=4` (after benchmarking) * Randomly use adaptive replica selection for internal test cluster * Use an action name prefix for retrieving pending requests * Add unit test for replica selection * don't use adaptive replica selection in SearchPreferenceIT * Track client connections in a SearchTransportService instead of TransportService * Bind `entry` pieces in local variables * Add javadoc link to C3 paper and javadocs for stat adjustments * Bind entry's key and value to local variables * Remove unneeded actionNamePrefix parameter * Use conns.longValue() instead of cached Long * Add comments about removing entries from the map * Pull out bindings for `entry` in IndexShardRoutingTable * Use .compareTo instead of manually comparing * add assert for connections not being null and gte to 1 * Copy map for pending search connections instead of "live" map * Increase the number of pending search requests used for calculating rank when chosen When a node gets chosen, this increases the number of search counts for the winning node so that it will not be as likely to be chosen again for non-concurrent search requests. * Remove unused HashMap import * Rename rank -> rankShardsAndUpdateStats * Rename rankedActiveInitializingShardsIt -> activeInitializingShardsRankedIt * Instead of precalculating winning node, use "winning" shard from ranked list * Sort null ranked nodes before nodes that have a rank	2017-08-30 20:55:11 -06:00
Tal Levy	ed151d829d	Migrate Search requests to use Writeable reading strategies (#26428 ) Migrates many SearchRequest objects to use Writeable conventions and rejects usage of `readFrom` in these new classes.	2017-08-30 11:00:33 -07:00
Sergey Galkin	c075323522	Refactor create index service to be unit testable This commit refactors MetaDataCreateIndexService so that it is unit testable. Relates #25961	2017-08-29 16:55:44 -04:00
Michael Basnight	cfd14cd2b8	Revert shading for the low level rest client (#26367 ) Currently, we do not feel there is enough of a reason to shade the low level rest client. It caused problems with commons logging and IDEs during the brief time it was used. We did not know exactly how many users will need this, and decided that leaving shading out until we gather more information is best. Users can still shade the jar themselves. For information and feedback, see issue #26366. Closes #26328 This reverts commit `3a20922046`. This reverts commit `2c271f0f22`. This reverts commit `9d10dbea39`. This reverts commit `e816ef89a2`.	2017-08-25 14:13:12 -05:00
Nik Everett	b3edd11aa0	Allow plugins to plug rescore implementations (#26368 ) This allows plugins to plug rescore implementations into Elasticsearch. While this is a fairly expert thing to do I've done my best to point folks to the QueryRescorer as one that at least documents the tradeoffs that it makes. I've attempted to limit the API surface area by removing `SearchContext` from the exposed interface, instead exposing just the IndexSearcher and `QueryShardContext`. I also tried to make some of the class names more consistent and do some general cleanup while I was there. I entertained the notion of moving the `QueryRescorer` to a module. After all, it'd be a wonderful test to prove that you can plug a rescore implementation into Elasticsearch if the only built-in rescore implementation is in the module. But I decided against it because the new module would require a client jar and it'd require moving some more things around. I think if we really want to do it, we should do it as a followup. I did, on the other hand, create an "example" rescore plugin which should both be a nice example for anyone wanting to plug in their own rescore implementation and serve as a good integration test to make sure that you can indeed plug one in. Closes #26208	2017-08-25 13:46:57 -04:00
Yannick Welsch	0390c76f0a	Remove reinitShadowPrimary (#26349 ) With shadow replicas gone, there is no need to have this method anymore.	2017-08-25 10:37:51 +09:30
Tal Levy	6ab4b6b0ac	revamp TransportRequest handlers to support Writeable (#26315 ) This PR begins the long journey to deprecating Streamable. The idea here is to add additional method signatures that support Writeable.Reader, so that the work to migrate TransportMessage objects to implement Writeable rather than Streamable can proceed. One example conversion is done in this PR: SimulatePipelineRequest.	2017-08-22 15:47:05 -07:00
Yannick Welsch	3d8feff66e	Use Java 9 FilePermission model (#26302 ) This commit makes the security code aware of the Java 9 FilePermission changes (see #21534) and allows us to remove the `jdk.io.permissionsUseCanonicalPath` system property.	2017-08-22 11:22:00 +09:30
Ryan Ernst	96b0d3e0cc	Script: Convert script query to a dedicated script context (#26003 ) This commit converts script query to use a new FilterScript context. The new context returns a boolean, so the error that would have previously happened at runtime if a non boolean was returned would now happen at script compilation. Also, the leniency of supporting returning a number and 0 mapping to false, non-zero to true is gone, but it was never documented. With the new context compilation will now also fail if special variables are used at compilation time, instead of runtime, eg ctx.	2017-08-18 15:18:35 -07:00
Tim Brooks	5d7a78fcdb	Use PlainListenableActionFuture for CloseFuture (#26242 ) Right now we use a custom future for the CloseFuture associated with a channel. This is because we need special unwrapping logic to ensure that exceptions from a future failure are a certain type (opposed to an UncategorizedException). However, the current version is limiting because we can only attach one listener. This commit changes the CloseFuture to extend the PlainListenableActionFuture. This change allows us to attach multiple listeners.	2017-08-18 13:38:38 -05:00
Luca Cavanna	1309dfd44d	Add links to external classes in clients javadoc (#25998 ) The client sniffer depends on the low-level REST client, while the Java high-level REST client and the transport client depend on Elasticsearch itself. Javadoc are not that useful unless they have links to the Elasticsearch classes in the latter case, and to the low-level REST client in the sniffer javadoc. This commit adds those links.	2017-08-17 21:03:47 +02:00
Colin Goodheart-Smithe	a975f4e5d6	Moves more classes over to ToXContentObject/Fragment (#26234 ) * Moves more classes over to ToXContentObject/Fragment * review comments	2017-08-16 15:40:40 +01:00
Simon Willnauer	a9169e536b	Several internal improvements to internal test cluster infra (#26214 ) This change adds several random test infrastructure improvements that caused issues in on-going developments but are generally useful. For instance, it is impossible to restart a node with a secure setting source since we close it after the node is started. This change makes it cloneable such that we can reuse it for a restart.	2017-08-15 17:42:15 +02:00
Martijn van Groningen	1146a35870	Move more token filters to analysis-common module The following token filters were moved: arabic_stem, brazilian_stem, czech_stem, dutch_stem, french_stem, german_stem and russian_stem. Relates to #23658	2017-08-11 17:39:24 +02:00
Andy Bristol	7e3cd6a019	reindex: automatically choose the number of slices (#26030 ) In reindex APIs, when using the `slices` parameter to choose the number of slices, adds the option to specify `slices` as "auto" which will choose a reasonable number of slices. It uses the number of shards in the source index, up to a ceiling. If there is more than one source index, it uses the smallest number of shards among them. This gives users an easy way to use slicing in these APIs without having to make decisions about how to configure it, as it provides a good-enough configuration for them out of the box. This may become the default behavior for these APIs in the future.	2017-08-11 08:25:25 -07:00
Simon Willnauer	6f82b0c6e2	Allow `ClusterState.Custom` to be created on initial cluster states (#26144 ) Today we have a `null` invariant on all `ClusterState.Custom`. This makes several code paths complicated and requires complex state handling in some cases. This change allows to register a custom supplier that is used to initialize the initial clusterstate with these transient customs.	2017-08-11 09:51:49 +02:00
Nik Everett	99ac7beb8e	Teach the build about betas and rcs (#26066 ) The build was ignoring suffixes like "beta1" and "rc1" on the version numbers which was causing the backwards compatibility packaging tests to fail because they expected to be upgrading from 6.0.0 even though they were actually upgrading from 6.0.0-beta1. This adds the suffixes to the information that the build scrapes from Version.java. It then uses those suffixes when it resolves artifacts built from the bwc branch and for testing. Closes #26017	2017-08-10 14:30:00 -04:00
Colin Goodheart-Smithe	dfbaf90951	Adds ToXContentFragment (#25771 ) * Adds ToXContentFragment This interface is meant for objects that implement `ToXContent` but are not complete objects. It is basically the opposite of `ToXContentObject`. It means that it will be easier to track the migration of classes over to the fragment/not fragment ToXContent model as it will be clear which classes are not migrated. When no classes directly implement `ToXContent` we can make `ToXContent` package private to be sure that all new classes must implement `ToXContentObject` or `ToXContentFragment`. * review comments * more review comments * javadocs * iter * Adds tests * iter * adds toString test for aggs * improves tests following review comments * iter * iter	2017-08-09 15:53:30 +01:00
Simon Willnauer	82fa531ab4	Remove `_index` fielddata hack if cluster alias is present (#26082 ) We introduced a hack in #25885 to respect the cluster alias if available on the `_index` field. This is important if aggregations or other field data related operations are executed. Yet, we added a small hack that duplicated an implementation detail from the `_index` field data builder to make this work. This change adds a necessary but simple API change that allows us to remove the hack and only have a single implementation.	2017-08-08 09:24:24 +02:00
Adrien Grand	f0cba4fce5	Add a scripted similarity. (#25831 ) The goal of this similarity is to help users who would like to keep the functionality of the `tf-idf` similarity that we want to remove, or to allow for specific use-cases (disabling idf, disabling tf, disabling length norm, etc.) to not have to build a custom plugin and familiarize themselves with the low-level Lucene API.	2017-08-08 08:55:12 +02:00
Martijn van Groningen	99d79d5a0f	tests: do not generate random unicode strings for field names, but instead random alpha ascii strings. Should fix build failures like this one: https://elasticsearch-ci.elastic.co/job/elastic+elasticsearch+6.0+multijob-unix-compatibility/	2017-08-07 15:09:01 +02:00
Luca Cavanna	14ba36977e	[TEST] prevent yaml tests from using raw requests (#26044 ) Raw requests are supported only by the java yaml test runner and were introduced to test docs snippets. Some yaml tests ended up using them (see #23497) which causes failures for other language clients. This commit migrates those yaml tests to Java tests that send requests through the Java low-level REST client, and also moves the ability to send raw requests to a special client that's only available when testing docs snippets. Closes #25694	2017-08-07 11:02:16 +02:00
Boaz Leskes	e11cbed534	Adding a refresh listener to a recovering shard should be a noop (#26055 ) When `refresh=wait_for` is set on an indexing request, we register a listener on the shards that is called during the next refresh. During the recover translog phase, when the engine is open, we have a window of time when indexing operations succeed and they can add their listeners. Those listeners will only be called when the recovery finishes as we do not refresh during recoveries (unless the indexing buffer is full). Next to being a bad user experience, it can also cause deadlocks with an ongoing peer recovery that may wait for those operations to mark the replica in sync (details below). To fix this, this PR changes refresh listeners to be a noop when the shard is not yet serving reads (implicitly covering the recovery period). It doesn't matter anyway. Deadlock with recovery: When finalizing a peer recovery we mark the peer as "in sync". To do so we wait until the peer's local checkpoint is at least as high as the global checkpoint. If an operation with `refresh=wait_for` is added as a listener on that peer during recovery, it is not completed from the perspective of the primary. The primary then may wait for it to complete before advancing the local checkpoint for that peer. Since that peer is not considered in sync, the global checkpoint on the primary can be higher, causing a deadlock. Operation waits for recovery to finish and a refresh to happen. Recovery waits on the operation.	2017-08-04 19:51:15 +02:00
Tim Brooks	0401df81e0	Revert "Tests: Disable NIO transport mechanism in tests" This reverts commit `c24dbec6f5`.	2017-08-02 09:59:07 -05:00
Colin Goodheart-Smithe	87c6e63e73	Adds mutate function to various tests (#25999 ) * Adds mutate function to various tests Relates to #25929 * fix test * implements mutate function for all single bucket aggs * review comments * convert getMutateFunction to mutateInstance	2017-08-02 11:38:31 +01:00
Alexander Reelsen	c24dbec6f5	Tests: Disable NIO transport mechanism in tests Due to test instability the new transport mechanism is always disabled and does not randomly pick the new IO transport.	2017-08-02 11:18:12 +02:00
Adrien Grand	58feb5efa0	Fix `_exists_` in query_string on empty indices. (#25993 ) It currently fails if there are no mappings yet. Closes #25956	2017-08-02 10:06:34 +02:00
Luca Cavanna	e2d25c3c89	[TEST] Remove duplicated main response unit test (#25855 ) Also move MainResponseTests to extend AbstractStreamableXContentTestCase	2017-08-02 08:42:38 +02:00
Tim Brooks	58d2dcc54f	Ensure send listener is called on IOException Currently there is an issue where the send listener is not called in the nio transport when an exception is thrown during channel flush. This leads to memory leaks. This commit ensures that the listener is called.	2017-08-01 22:30:04 -05:00
Tim Brooks	0f4f49496f	Use nio transport in test clusters (#25986 ) This commit adds the nio transport as an option in place of the mock tcp transport for tests. Each test will only use one transport type. The transport type is decided by a random boolean generated inside of the `ESTestCase` class.	2017-08-01 16:19:31 -05:00
Ryan Ernst	072281d5aa	Update version to 7.0.0-alpha1 (#25876 ) This commit updates the version for master to 7.0.0-alpha1. It also adds the 6.1 version constant, and fixes many tests, as well as marking some as awaits fix. Closes #25893 Closes #25870	2017-08-01 15:47:48 -04:00
Luca Cavanna	4d589afbc2	AbstractQueryBuilder to no longer extend ToXContentBytes (#25948 ) ToXContentToBytes is used as a base class that adds toString and buildAsBytes method implementation to classes that implement ToXContent. With the ongoing cleanups, this class is limited and doesn't add a lot of value, given that buildAsBytes can be replaced with XContentHelper.toXContent and toString can be replaced with Strings.toString(this). The plan would be to remove ToXContentToBytes entirely, and AbstractQueryBuilder is the first place where we can remove its usage.	2017-07-31 17:38:24 +02:00
Boaz Leskes	9d10ffd547	Goodbye, Translog Views (#25962 ) During peer recoveries, we need to copy over lucene files and replay the operations they miss from the source translog. Guaranteeing that translog files are not cleaned up has seen many iterations over time. Back in the old 1.0 days, recoveries went through the Engine and actively prevented both translog cleaning and lucene commits. We then moved to a notion called Translog Views, which allowed the recovery code to "acquire" a view into the translog which is then guaranteed to be kept around until the view is closed. The Engine code was free to commit lucene and do whatever it wanted without coordinating with recoveries. Translog file deletion logic was based on reference counting on the file level. Those counters were incremented when a view was acquired but also when the view was used to create a `Snapshot` that allowed you to read operations from the files. At some point we removed the file based counting complexity in favor of constructs on the Translog level that just keep track of "open" views and the minimum translog generation they refer to. To do so, Views had to be kept around until the last snapshot that was made from them was consumed. This was fine in recovery code but led to [a subtle bug](https://github.com/elastic/elasticsearch/pull/25862) in the [Primary Replica Resyncer](https://github.com/elastic/elasticsearch/pull/25862). Concurrently, we have developed the notion of a `TranslogDeletionPolicy` which is responsible for the liveness aspect of translog files. This class makes it very simple to take translog Snapshots into account for keeping translog files around, allowing people that just need a snapshot to just take a snapshot and not worry about views and such. Recovery code which actually does need a view can now prevent trimming by acquiring a simple retention lock (a `Closeable`). This removes the need for the notion of a View.	2017-07-31 17:29:43 +02:00
Colin Goodheart-Smithe	7740cb54a5	Improves AbstractWireSerializingTestCase equals test (#25910 ) * Improves AbstractWireSerializingTestCase equals test `AbstractWireSerializingTestCase.testEqualsAndHashcode()` now uses `EqualsHashcodeTestUtils` to perform the hashCode and equals checks. To support this `AbstractWireSerializingTestCase` has two new methods: `getCopyFunction()` and `getMutateFunction` which are used when calling `EqualsHashcodeTestUtils` * Adds TODO * Makes equivalent changes to AbstractStreamableTestCase * corrects javadoc error	2017-07-31 14:46:58 +01:00
Martijn van Groningen	0b776a1de0	Move more token filters to analysis-common module The following token filters were moved: delimited_payload_filter, keep, keep_types, classic, apostrophe, decimal_digit, fingerprint, min_hash and scandinavian_folding. Relates to #23658	2017-07-31 15:15:04 +02:00
Martijn van Groningen	7c3735bdc4	percolator: Store the QueryBuilder's Writeable representation instead of its XContent representation. The Writeable representation is less heavy to parse and that will benefit percolate performance and throughput. The query builder's binary format now has the same bwc guarantees as the xcontent format. Added a qa test that verifies that percolator queries written in older versions are still readable by the current version.	2017-07-28 12:24:10 +02:00
Yannick Welsch	1a01514081	Move tribe to a module (#25778 ) This commit moves tribe to a module, stripping core from the tribe functionality.	2017-07-28 11:23:50 +02:00
Jason Tedor	1492ccd7ae	Fix environment-aware command tests This commit fixes tests for environment-aware commands. A previous change added a check that es.path.conf is not null. The problem is that this system property is not being set in tests so this check trips every single time. To fix this, we move the check into a method that can be overridden, and then override this method in relevant places in tests to avoid having to set the property in tests. We also add a test that this check works as expected.	2017-07-28 14:37:04 +09:00
Simon Willnauer	b72c71083c	Cleanup IndexFieldData visibility (#25900 ) Today we expose `IndexFieldDataService` outside of IndexService to do maintenance or lookup field data in different ways. Yet, we have a streamlined way to access IndexFieldData via `QueryShardContext` that should encapsulate all access to it. This also ensures that we control all other functionality like cache clearing etc. This change also removes the `recycler` option from `ClearIndicesCacheRequest` this option is a no-op and should have been removed long ago.	2017-07-26 20:03:42 +02:00
Tim Brooks	6d02b45f10	Support client-only mode for NioTransport (#25839 ) Currently, NioTransport does start normal socket selectors and the client when the network server setting is set to false. This commit makes it so that the client will be started even when the network server is not enabled. Additionally, it randomly introduces the NioTransport as an option for the MockTransportClient throughout tests.	2017-07-26 10:27:15 -05:00
Luca Cavanna	d8203f19fd	Remove XContentHelper#toString(ToXContent) in favour of Strings#toString(ToXContent) (#25866 ) These two methods do the same thing. The subtle difference between the two is that the former prints out pretty printed content by default while the latter doesn't. There are way more usages of the latter throughout the codebase hence I kept that variant although I do think that it would be much better to print out prettified content by default from a `toString`. That breaks quite some tests so I didn't make that change yet. Also XContentHelper#toString was outdated as it didn't check the ToXContent#isFragment method to decide whether a new anonymous object has to be created or not. It would simply fail with any ToXContentObject.	2017-07-26 16:00:59 +02:00
Simon Willnauer	634ce90dc0	Respect cluster alias in `_index` aggs and queries (#25885 ) Today when we aggregate on the `_index` field the cross cluster search alias is not taken into account. Neither is it respected when we search on the field. This change adds support for cluster alias when the cluster alias is present on the `_index` field. Closes #25606	2017-07-26 09:16:52 +02:00
Tim Brooks	2d22bad53f	Simplify selector close method (#25838 ) Currently we have an option to interrupt the selector thread on close. This option is not needed as we do not call this method and we should not be blocking on the network thread. Instead we only need to ever call wakeup() on the raw selector.	2017-07-25 10:52:15 -05:00
Michael Basnight	e816ef89a2	Shade external dependencies in the rest client jar This commit removes all external dependencies from the rest client jar and shades them in an 'org.elasticsearch.client' package within the jar using shadowJar gradle plugin. All projects that depended on the existing jar have been converted to using the 'org.elasticsearch.client' package prefixes to interact with the rest client. Closes #25208	2017-07-24 12:55:43 -05:00
Tim Brooks	0a4b38b60c	Close raw channel when bind / connect fails (#25840 ) Currently we are failing to close socket channels when the initial bind or connect operation fails. This leaves the file descriptor hanging around. This closes the channel when an exception occurs during bind or connect.	2017-07-22 13:55:33 -05:00
Tim Brooks	c7a7c69b2b	Simplify NioChannel creation and closing process (#25504 ) Currently an NioChannel is created and it is UNREGISTERED. At some point it is registered with a selector. From that point on, the channel can only be closed by the selector. The fact that a channel might not be associated with a selector has significant implications for concurrency and the channel shutdown process. The only thing that is simplified by allowing channels to be in a state independent of a selector is some testing scenarios. This PR modifies channels so that they are given a selector at creation time and are always associated with that selector. Only that selector can close that channel. This simplifies the channel lifecycle and closing intricacies.	2017-07-21 11:55:23 -05:00
Yannick Welsch	a2624dfcef	Move primary term from ReplicationRequest to ConcreteShardRequest (#25822 ) Removes the primary term from the replication request and pushes it into the transport envelope. This makes it possible to remove the term from the ReplicationOperation universe. The primary term that is to be used for a replication operation is now determined in the reroute phase when the node decides to execute a primary action (and validated once the primary action gets to execute). This makes it possible to validate that the primary action was sent to the correct primary shard instance that it was meant to be sent to (currently we only validate primary actions using the allocation id, which can be reused for failed and reallocated primaries).	2017-07-21 15:57:42 +02:00
Boaz Leskes	7488877d1a	Validate a joining node's version with version of existing cluster nodes (#25808 ) When a node tries to join a cluster, it goes through a validation step to make sure the node is compatible with the cluster. Currently we validate that the node can read the cluster state and that it is compatible with the indexes of the cluster. This PR adds validation that the joining node's version is compatible with the versions of existing nodes. Concretely we check that: 1) The node's min compatible version is higher or equal to any node in the cluster (this prevents a too-new node from joining) 2) The node's version is higher or equal to the min compat version of all cluster nodes (this prevents a too old join where, for example, the master is on 5.6, there's another 6.0 node in the cluster and a 5.4 node tries to join). 3) The node's major version is at least as high as the lowest node in the cluster. This is important as we use the minimum version in the cluster to stop executing bwc code for operations that require multiple nodes. If the nodes are already operating in "new cluster mode", we should prevent nodes from the previous major to join (even if they are wire level compatible). This does mean that if you have a very unlucky partition during the upgrade which partitions all old nodes which are also a minority / data nodes only, they may not be able to re-join the cluster. We feel this edge case risk is well worth the simplification it brings to BWC layers only going one way. This restriction only holds if the cluster state has been recovered (i.e., the cluster has properly formed). Also, the node join validation can now selectively fail specific nodes (previously the entire batch was failed). This is an important preparation for a follow up PR where we plan to have a rejected joining node die with dignity.	2017-07-20 20:11:29 +02:00
Simon Willnauer	5e629cfba0	Ensure query resources are fetched asynchronously during rewrite (#25791 ) The `QueryRewriteContext` used to provide a client object that can be used to fetch geo-shapes, terms or documents for percolation. Unfortunately all client calls used to be blocking calls which can have significant impact on the rewrite phase since it occupies an entire search thread until the resource is received. In the case that the index the resource is fetched from isn't on the local node this can have significant impact on query throughput. Note: this doesn't fix MLT since it fetches stuff in doQuery which is a different beast. Yet, it is a huge step in the right direction	2017-07-20 15:37:50 +02:00
Boaz Leskes	9989ac69a4	Revert "Validate a joining node's version with version of existing cluster nodes (#25770 )" This reverts commit `1e1f8e6376`.	2017-07-19 17:34:53 +02:00
Simon Willnauer	4d78935df7	Introduce a new Rewriteable interface to streamline rewriting (#25788 ) Today we have duplicated code that is quite complicated to iterate over rewriteables (`QueryBuilders` mainly). This change introduces a `Rewriteable` interface that allows sharing the rewriting code as well as encapsulation and composition of queries.	2017-07-19 15:06:49 +02:00
Adrien Grand	55ad318541	Reduce the overhead of timeouts and low-level search cancellation. (#25776 ) Setting a timeout or enforcing low-level search cancellation used to make us wrap the collector and check either the current time or whether the search task was cancelled for every collected document. This can be significant overhead on cheap queries that match many documents. This commit changes the approach to wrap the bulk scorer rather than the collector and exponentially increase the interval between two consecutive checks in order to reduce the overhead of those checks.	2017-07-19 14:15:53 +02:00
Boaz Leskes	1e1f8e6376	Validate a joining node's version with version of existing cluster nodes (#25770 ) When a node tries to join a cluster, it goes through a validation step to make sure the node is compatible with the cluster. Currently we validate that the node can read the cluster state and that it is compatible with the indexes of the cluster. This PR adds validation that the joining node's version is compatible with the versions of existing nodes. Concretely we check that: 1) The node's min compatible version is higher or equal to any node in the cluster (this prevents a too-new node from joining) 2) The node's version is higher or equal to the min compat version of all cluster nodes (this prevents a too old join where, for example, the master is on 5.6, there's another 6.0 node in the cluster and a 5.4 node tries to join). 3) The node's major version is at least as high as the lowest node in the cluster. This is important as we use the minimum version in the cluster to stop executing bwc code for operations that require multiple nodes. If the nodes are already operating in "new cluster mode", we should prevent nodes from the previous major from joining (even if they are wire level compatible). This does mean that if you have a very unlucky partition during the upgrade which partitions all old nodes which are also a minority / data nodes only, they may not be able to re-join the cluster. We feel this edge case risk is well worth the simplification it brings to BWC layers only going one way. Also, the node join validation can now selectively fail specific nodes (previously the entire batch was failed). This is an important preparation for a follow up PR where we plan to have a rejected joining node die with dignity.	2017-07-19 12:57:29 +02:00
Lee Hinman	610ba7e427	Register data node stats from info carried back in search responses (#25430 ) * Register data node stats from info carried back in search responses This is part of #24915, where we now calculate the EWMA of service time for tasks in the search threadpool, and send that as well as the current queue size back to the coordinating node. The coordinating node now tracks this information for each node in the cluster. This information will be used in the future the determining the best replica a search request should be routed to. This change has no user-visible difference. * Move response time timing into ResponseListenerWrapper * Move ResponseListenerWrapper to ActionListener instead of SearchActionListener Also removes the logger * Move `requestIndex` back to private * De-guice-ify ResponseCollectorService \o/ * Undo all changes to SearchQueryThenFetchAsyncAction * Remove unneeded response collector from TransportSearchAction * Undo all changes to SearchDfsQueryThenFetchAsyncAction * Completely rewrite the inside of ResponseCollectorService's record keeping * Documentation and cleanups for ResponseCollectorService * Add unit test for collection of queue size and service time * Fix Guice construction error * Add basic unit tests for ResponseCollectorService * Fix version constant for the master merge * Fix test compilation after master merge * Add a test for node removal on cluster changed event * Remove integration test as there are now unit tests * Rename ResponseListenerWrapper -> SearchExecutionStatsCollector * Fix line-length * Make classes private and final where appropriate * Pass nodeId into SearchExecutionStatsCollector and use only ActionListener * Get nodeId from connection so searchShardTarget can be private * Remove threadpool from SearchContext, get it from IndexShard instead * Add missing import * Use BiFunction for responseWrapper rather than passing in collector service	2017-07-17 11:04:51 -06:00
Adrien Grand	264088f1c4	Deprecate the `_default_` mapping. (#25652 ) Now that indices cannot have types anymore, this feature does not buy anything anymore. Closes #25500	2017-07-17 15:37:59 +02:00
Martijn van Groningen	8003171a0c	Move more token filters to analysis-common module The following token filters were moved: arabic_normalization, german_normalization, hindi_normalization, indic_normalization, persian_normalization, scandinavian_normalization, serbian_normalization, sorani_normalization, cjk_width and cjk_width Relates to #23658	2017-07-17 08:29:44 +02:00
Boaz Leskes	a6bea1bf97	testMockFailToSendNoConnectRule should wait for connection close to bubble up and disconnect the node #25521 changed channel closing to be handled async on anything but transport stop. This means it may take a while between calling `connection.close()` and the node being removed from the `connectedNodes` list (but the connection is immediately unusable). Fixes #25686	2017-07-15 09:28:17 +02:00
Yannick Welsch	8f0b357651	Let primary own its replication group (#25692 ) Currently replication and recovery are both coordinated through the latest cluster state available on the ClusterService as well as through the GlobalCheckpointTracker (to have consistent local/global checkpoint information), making it difficult to understand the relation between recovery and replication, and requiring some tricky checks in the recovery code to coordinate between the two. This commit makes the primary the single owner of its replication group, which simplifies the replication model and allows to clean up corner cases we have in our recovery code. It also reduces the dependencies in the code, so that neither RecoverySourceXXX nor ReplicationOperation need access to the latest state on ClusterService anymore. Finally, it gives us the property that in-sync shard copies won't receive global checkpoint updates which are above their local checkpoint (relates #25485).	2017-07-14 13:52:53 +02:00
Luca Cavanna	ec66d655b5	Rename client artifacts (#25693 ) It was brought up that our current client artifacts have generic names like 'rest' that may cause conflicts with other artifacts. This commit renames: - rest -> elasticsearch-rest-client - sniffer -> elasticsearch-rest-client-sniffer - rest-high-level -> elasticsearch-rest-high-level-client A couple of small changes are also preparing the high level client for its first release. Closes #20248	2017-07-13 09:44:25 +02:00
Simon Willnauer	b7bc790428	Use a non default port range in MockTransportService We already use a per JVM port range in MockTransportService. Yet, it's possible that if we are executing in the JVM with ordinal 0 that other clusters reuse ports from the mock transport service and some tests try to simulate disconnects etc. By using a non-default port range (starting at 10300) we prevent internal test clusters from reusing any of the mock impls ports Relates to #25301	2017-07-12 22:29:21 +02:00
Simon Willnauer	e81804cfa4	Add a shard filter search phase to pre-filter shards based on query rewriting (#25658 ) Today if we search across a large amount of shards we hit every shard. Yet, it's quite common to search across an index pattern for time based indices but filtering will exclude all results outside a certain time range ie. `now-3d`. While the search can potentially hit hundreds of shards the majority of the shards might yield 0 results since there is no document that is within this date range. Kibana for instance does this regularly but used `_field_stats` to optimize the indexes they need to query. Now with the deprecation of `_field_stats` and its upcoming removal a single dashboard in kibana can potentially turn into searches hitting hundreds or thousands of shards and that can easily cause search rejections even though most of the requests are very likely super cheap and only need a query rewriting to early terminate with 0 results. This change adds a pre-filter phase for searches that can, if the number of shards is higher than the `pre_filter_shard_size` threshold (defaults to 128 shards), fan out to the shards and check if the query can potentially match any documents at all. While false positives are possible, a negative response means that no matches are possible. These requests are not subject to rejection and can greatly reduce the number of shards a request needs to hit. The approach here is preferable to the kibana approach with field stats since it correctly handles aliases and uses the correct threadpools to execute these requests. Further it's completely transparent to the user and improves scalability of elasticsearch in general on large clusters.	2017-07-12 22:19:20 +02:00
Tim Brooks	a3ade99fcf	Fix BytesReferenceStreamInput#skip with offset (#25634 ) There is a bug when a call to `BytesReferenceStreamInput` skip is made on a `BytesReference` that has an initial offset. The offset for the current slice is added to the current index and then subtracted from the length. This introduces the possibility of a negative number of bytes to skip. This happens inside a loop, which leads to an infinite loop. This commit correctly subtracts the current slice index from the slice.length. Additionally, the `BytesArrayTests` are modified to test instances that include an offset.	2017-07-11 09:54:29 -05:00
Simon Willnauer	98c91a3bd0	Limit the number of concurrent shard requests per search request (#25632 ) This is a protection mechanism to prevent a single search request from hitting a large number of shards in the cluster concurrently. If a search is executed against all indices in the cluster this can easily overload the cluster causing rejections etc. which is not necessarily desirable. Instead this PR adds a per request limit of `max_concurrent_shard_requests` that throttles the number of concurrent initial phase requests to `256` by default. This limit can be increased per request and protects single search requests from overloading the cluster. Subsequent PRs can introduce additional improvements ie. limiting this on a `_msearch` level, making defaults a factor of the number of nodes or sort shards iters such that we gain the best concurrency across nodes.	2017-07-11 16:23:10 +02:00
Simon Willnauer	ec1afe30ea	Ensure remote cluster alias is preserved in inner hits aggs (#25627 ) We lost the cluster alias due to some special casing in inner hits and due to the fact that we didn't pass on the alias to the shard request. This change ensures that we have the cluster alias present on the shard to ensure all SearchShardTarget reads preserve the alias. Relates to #25606	2017-07-11 11:34:06 +02:00
Tim Brooks	b22bbf94da	Avoid blocking on channel close on network thread (#25521 ) Currently when we close a channel in Netty4Utils.closeChannels we block until the closing is complete. This introduces the possibility that a network selector thread will block while waiting until a separate network selector thread closes a channel. For instance: T1 closes channel 1 (which is assigned to a T1 selector). Channel 1's close listener executes the closing of the node. That means that T1 now tries to close channel 2. However, channel 2 is assigned to a selector that is running on T2. T1 now must wait until T2 closes that channel at some point in the future. This commit addresses this by adding a boolean to closeChannels indicating if we should block on close. We only set this boolean to true if we are closing down the server channels at shutdown. This call is never made from a network thread. When we call the closeChannels method with that boolean set to false, we do not block on close.	2017-07-10 10:50:51 -05:00
Colin Goodheart-Smithe	3a5a54e83e	Collapses package structure for some bucket aggs (#25579 ) This change collapses some of the packages for the bucket aggregations into their parent packages. This was done for the following aggregations: * The variants of the range aggregation (geo_distance, date and ip) were moved into the `o.e.s.a.bucket.range` package * The `o.e.s.a.bucket.terms.support` package was removed and the classes were moved to `o.e.s.a.bucket.terms` * The filter aggregation was moved to `o.e.s.a.bucket.filter` Since this PR is already relatively large with only the above changes subsequent PRs will do similar operations on relevant metric and pipeline aggregations Relates to #22868	2017-07-10 15:08:15 +01:00
Boaz Leskes	09378f48e4	Add a scheduled translog retention check (#25622 ) We currently check whether translog files can be trimmed whenever we create a new translog generation or close a view. However #25294 added a long translog retention period (12h, max 512MB by default), which means translog files should potentially be cleaned up long after there isn't any indexing activity to trigger flushes/the creation of new translog files. We therefore need a scheduled background check to clean up those files once they are no longer needed. Relates to #10708	2017-07-10 10:28:39 +02:00
Jason Tedor	c084542731	Bump version to 6.0.0-beta1 This commit does two things: - bumps the version from 6.0.0-alpha3 to 6.0.0-beta1 - renames the 6.0.0-alpha3 version constant to 6.0.0-beta1 Relates #25621	2017-07-09 18:12:50 -04:00
Jason Tedor	bc22c1c286	Add disk threshold settings validation This commit adds cross-settings validation for the low/high/flood stage disk watermark settings. This validation was enabled by the introduction of multiple settings validation. Relates #25600	2017-07-07 19:54:36 -04:00
Nik Everett	794257c421	Drop current from the list of released versions (#25187 ) It hasn't been released....	2017-07-07 15:59:57 -04:00
Yannick Welsch	baa87db5d1	Harden global checkpoint tracker This commit refactors the global checkpoint tracker to make it more resilient. The main idea is to make it more explicit what state is actually captured and how that state is updated through replication/cluster state updates etc. It also fixes the issue where the local checkpoint information is not being updated when a shard becomes primary. The primary relocation handoff becomes very simple too, we can just verbatim copy over the internal state. Relates #25468	2017-07-07 14:04:28 -04:00
Lee Hinman	8aa0a5c111	Improve REST error handling when endpoint does not support HTTP verb, add OPTIONS support (#24437 ) * Improved REST endpoint exception handling, see #15335 Also improved OPTIONS http method handling to better conform with the http spec. * Tidied up formatting and comments See #15335 * Tests for #15335 * Cleaned up comments, added section number * Swapped out tab indents for space indents * Test class now extends ESSingleNodeTestCase * Capture RestResponse so it can be examined in test cases Simple addition to surface the RestResponse object so we can run tests against it (see issue #15335). * Refactored class name, included feedback See #15335. * Unit test for REST error handling enhancements Randomizing unit test for enhanced REST response error handling. See issue #15335 for more details. * Cleaned up formatting * New constructor to set HTTP method Constructor added to support RestController test cases. * Refactored FakeRestRequest, streamlined test case. * Cleaned up conflicts * Tests for #15335 * Added functionality to ignore or include path wildcards See #15335 * Further enhancements to request handling Refactored executeHandler to prioritize explicit path matches. See #15335 for more information. * Cosmetic fixes * Refactored method handlers * Removed redundant import * Updated integration tests * Refactoring to address issue #17853 * Cleaned up test assertions * Fixed edge case if OPTIONS method randomly selected as invalid method In this test, an OPTIONS method request is valid, and should not return a 405 error. 
* Remove redundant static modifier * Hook the multiple PathTrie attempts into RestHandler.dispatchRequest * Add missing space * Correctly retrieve new handler for each Trie strategy * Only copy headers to threadcontext once * Fix test after REST header copying moved higher up * Restore original params when trying the next trie candidate * Remove OPTIONS for invalidHttpMethodArray so a 405 is guaranteed in tests * Re-add the fix I already added and got removed during merge :-/ * Add missing GET method to test * Add documentation to migration guide about breaking 404 -> 405 changes * Explain boolean response, pull into local var * fixup! Explain boolean response, pull into local var * Encapsulate multiple HTTP methods into PathTrie<MethodHandlers> * Add PathTrie.retrieveAll where all matching modes can be retrieved Then TrieMatchingMode can be package private and not leak into RestController * Include body of error with 405 responses to give hint about valid methods * Fix missing usageService handler addition I accidentally removed this :X * Initialize PathTrieIterator modes with Arrays.asList * Use "== false" instead of ! * Missing paren :-/	2017-07-07 09:01:23 -06:00
Adrien Grand	40bb1663ee	Index ids in binary form. (#25352 ) Indexing ids in binary form should help with indexing speed since we would have to compare fewer bytes upon sorting, should help with memory usage of the live version map since keys will be shorter, and might help with disk usage depending on how efficient the terms dictionary is at compressing terms. Since we can only expect base64 ids in the auto-generated case, this PR tries to use an encoding that makes the binary id equal to the base64-decoded id in the majority of cases (253 out of 256). It also specializes numeric ids, since this seems to be common when content that is stored in Elasticsearch comes from another database that uses eg. auto-increment ids. Another option could be to require base64 ids all the time. It would make things simpler but I'm not sure users would welcome this requirement. This PR should bring some benefits, but I expect it to be mostly useful when coupled with something like #24615. Closes #18154	2017-07-07 14:22:47 +02:00
Martijn van Groningen	6db708ef75	Move more token filters to analysis-common module The following token filters were moved: common grams, limit token, pattern capture and pattern replace. Relates to #23658	2017-07-07 10:02:52 +02:00
Simon Willnauer	1f67d079b1	Validate `transport.profiles.` settings (#25508 ) Transport profiles unfortunately have never been validated. Yet, it's very easy to make a mistake when configuring profiles which will most likely stay undetected since we don't validate the settings but allow almost everything based on the wildcard in `transport.profiles.`. This change removes the settings subset based parsing of profiles but rather uses concrete affix settings for the profiles which makes it easier to fall back to higher level settings since the fallback settings are present when the profile setting is parsed. Previously, it was unclear in the code which setting is used ie. if the profiles settings (with removed prefixes) or the global node setting. There is no distinction anymore since we don't pull prefix based settings.	2017-07-07 09:40:59 +02:00
Simon Willnauer	38a1df7da1	Use a port range per JVM in MockTransportService (#25565 ) Some tests use MockTransportService to do network based testing. Yet, we run tests in multiple JVMs, which means concurrent tests could claim a port that another JVM just released, and if that test tries to simulate a disconnect it might be smart enough to re-connect depending on what is tested. To reduce the risk, since this is very hard to debug, we use a different default port range per JVM unless the incoming settings override it. Closes #25301	2017-07-06 09:14:52 +02:00
Simon Willnauer	6e5cc424a8	Switch indices read-only if a node runs out of disk space (#25541 ) Today when we run out of disk all kinds of crazy things can happen and nodes are becoming hard to maintain once out of disk is hit. While we try to move shards away if we hit watermarks this might not be possible in many situations. Based on the discussion in #24299 this change monitors disk utilization and adds a flood-stage watermark that causes all indices that are allocated on a node hitting the flood-stage mark to be switched read-only (with the option to be deleted). This allows users to react on the low disk situation while subsequent write requests will be rejected. Users can switch individual indices read-write once the situation is sorted out. There is no automatic read-write switch once the node has enough space. This requires user interaction. The flood-stage watermark is set to `95%` utilization by default. Closes #24299	2017-07-05 22:18:23 +02:00
Christoph Büscher	3185eaece8	QueryBuilders should implement ToXContentObject (#25530 ) All query builders are written as self-contained xContent objects, so we should mark them accordingly using ToXContentObject. This also makes it possible to use things like XContentHelper#toXContent to render query builders in tests.	2017-07-05 09:50:10 +02:00
Christoph Büscher	f576c987ce	Remove QueryParseContext (#25486 ) QueryParseContext is currently only used as a wrapper for an XContentParser, so this change removes it entirely and changes the appropriate APIs that use it so far to only accept a parser instead.	2017-07-03 17:30:40 +02:00
Simon Willnauer	5a7c8bb04e	Cleanup network / transport related settings (#25489 ) This commit makes the use of the global network settings explicit instead of implicit within NetworkService. It cleans up several places where we fall back to the global settings while we should have used tcp or http ones. In addition this change also removes unnecessary settings classes	2017-07-02 10:16:50 +02:00
James Baiera	74f4a14d82	Upgrading HDFS Repository Plugin to use HDFS 2.8.1 Client (#25497 ) Hadoop 2.7.x libraries fail when running on JDK9 due to the version string changing to a single character. On Hadoop 2.8, this is no longer a problem, and it is unclear on whether the fix will be backported to the 2.7 branch. This commit upgrades our dependency of Hadoop for the HDFS Repository to 2.8.1.	2017-06-30 17:57:56 -04:00
Tim Brooks	cac2eec7d2	Add NioTransport threads to thread name checks (#25477 ) We have various assertions that check we never block on transport threads. This commit adds the thread names for the NioTransport to these assertions. With this change I had to fix two places where we were calling blocking methods from the transport threads.	2017-06-29 15:16:07 -05:00
Tim Brooks	dd5d165da1	Prevent channel enqueue after selector close (#25478 ) This commit adds additional protection to `ESSelector` and its implementations to ensure that channels are not enqueued after the selector is closed. After a channel has been added to the queue, we check that the selector is open. If it is not, then we remove the channel from the queue. If the channel is removed successfully, we throw an `IllegalStateException`.	2017-06-29 14:02:50 -05:00
Tim Brooks	6c58f0c4e6	Handle ping correctly in NioTransport (#25462 ) Our current TCPTransport logic assumes that we do not pass pings to the TCPTransport level. This commit fixes an issue where NioTransport was passing pings to TCPTransport and leading to exceptions.	2017-06-29 11:03:51 -05:00
Christoph Büscher	acade2b40a	Tests: Remove platform specific assertion in NioSocketChannelTests This check depends on the language settings on the system the test runs on, e.g. it fails on Ubuntu with LANG=de_DE.UTF-8.	2017-06-29 17:32:51 +02:00
Christoph Büscher	927111c91d	Remove QueryParseContext from parsing QueryBuilders (#25448 ) Currently QueryParseContext is only a thin wrapper around an XContentParser that adds little functionality of its own. It provides helpers for long deprecated field names which can be removed and two helper methods that can be made static and moved to other classes. This is a first step in helping to remove QueryParseContext entirely.	2017-06-29 17:10:20 +02:00
Tim Brooks	cad57959e1	Remove finicky exception message assertion In SimpleNioTransportTests we assert that an IOException has a certain message. This message does not appear to be dependable (and might change based on platform). Our other transport tests (mock and netty) do not make this assertion. Instead they only assert on our application exception message. This commit removes the IOException message assertion and retains the ConnectTransportException message assertion.	2017-06-28 14:16:04 -05:00
Tim Brooks	5f8be0e090	Introduce NioTransport into framework for testing (#24262 ) This commit introduces a nio based tcp transport into framework for testing. Currently Elasticsearch uses a simple blocking tcp transport for testing purposes (MockTcpTransport). This diverges from production where our current transport (netty) is non-blocking. The point of this commit is to introduce a testing variant that more closely matches the behavior of production instances.	2017-06-28 10:51:20 -05:00
Yannick Welsch	5a4a47332c	Use a single method to update shard state This commit refactors index shard to provide a single method for updating the shard state on an incoming cluster state update. Relates #25431	2017-06-28 09:48:47 -04:00
Jason Tedor	5a9fc8aa2a	Remove path.conf setting This commit removes path.conf as a valid setting and replaces it with a command-line flag for specifying a non-default path for configuration. Relates #25392	2017-06-26 15:18:29 -04:00
Martijn van Groningen	a34f5fa812	Move more token filters to analysis-common module The following token filters were moved: stemmer, stemmer_override, kstem, dictionary_decompounder, hyphenation_decompounder, reverse, elision and truncate. Relates to #23658	2017-06-26 09:02:16 +02:00
Ryan Ernst	1583f81047	Test: Allow merging mock secure settings (#25387 ) While real secure settings (ie an ES keystore) cannot be merged together, mocked secure settings can and need to be sometimes merged. This commit adds a merge method to allow tests to merge together multiple instances of secure settings.	2017-06-25 10:19:51 -07:00
Martijn van Groningen	9c511bc447	test: Replace OldIndexBackwardsCompatibilityIT#testOldClusterStates with a full cluster restart qa test OldIndexBackwardsCompatibilityIT#testOldClusterStates tested whether global and index metadata could be read from data directory, this can also be tested in full cluster qa test that checks cluster state via api. Relates to #24939	2017-06-23 09:54:05 +02:00
Boaz Leskes	d963882053	Enable a long translog retention policy by default (#25294 ) #25147 added the translog deletion policy but didn't enable it by default. This PR enables a default retention of 512MB (same maximum size of the current translog) and an age of 12 hours (i.e., after 12 hours all translog files will be deleted). This increases the chance to have an ops based recovery, even if the primary flushed or the replica was offline for a few hours. In order to see which parts of the translog are committed into lucene the translog stats are extended to include information about uncommitted operations. Views now include all translog ops and guarantee, as before, that those will not go away. Snapshotting a view allows to filter out generations that are not relevant based on a specific sequence number. Relates to #10708	2017-06-22 17:08:14 +02:00
Adrien Grand	44e9c0b947	Upgrade to lucene-7.0.0-snapshot-ad2cb77. (#25349 ) Most notable changes: - better update concurrency: LUCENE-7868 - TopDocs.totalHits is now a long: LUCENE-7872 - QueryBuilder does not remove the boolean query around multi-term synonyms: LUCENE-7878 - removal of Fields: LUCENE-7500 For the `TopDocs.totalHits` change, this PR relies on the fact that the encoding of vInts and vLongs are compatible: you can write and read with any of them as long as the value can be represented by a positive int.	2017-06-22 12:35:33 +02:00
Boaz Leskes	7013cbd927	Update MockTransportService to the age of Transport.Connection (#25320 ) MockTransportServices allows us to simulate network disruptions in our testing infra. Sadly it wasn't updated to the state of the art in Transport land. This PR brings it up to speed. Specifically: 1) Opening a connection is now also blocked (before only node connections were blocked) 2) Simplifies things using the latest connection based notification between TcpTransport and TransportService for when a disconnect happens. 3) By 2, it fixes a race condition where we may fail to respond to a sent request when it is sent concurrently with the closing of a connection. The old code relied on a node based bridge between tcp transport and transport service. Sadly, the following doesn't work any more: ``` if (transport.nodeConnected(node)) { // this a connected node, disconnecting from it will be up the exception transport.disconnectFromNode(node); <-- this may now be a noop and it doesn't mean that the transport service was notified of the disconnect between the nodeConnected check and here. } else { throw new ConnectTransportException(node, reason, e); } ```	2017-06-21 10:27:57 +02:00
Simon Willnauer	86a544de3b	Ensure we never read from a closed MockSecureSettings object (#25322 ) If secure settings are closed after the node has been constructed no key-store access is permitted. We should also try to be as close as possible to the real behavior if we mock secure settings. This change also adds the same behavior as bootstrap has to InternalTestCluster to ensure we fail if we try to read from secure settings after the node has been constructed.	2017-06-21 08:14:38 +02:00
Simon Willnauer	5abb7c4bec	Use IndexMetaData settings as a basis for new index settings (#25310 ) In MockFSDirectory we should use the actual indexes settings to build a new IndexMetaData settings object instead of the node settings. Relates to #25297	2017-06-20 15:44:19 +02:00
Nik Everett	3261586cac	Tweak reindex cancel logic and add many debug logs (#25256 ) I'm still trying to hunt down rare failures in the cancelation tests for reindex and friends. Here is the latest: https://elasticsearch-ci.elastic.co/job/elastic+elasticsearch+5.x+multijob-unix-compatibility/os=ubuntu/876/console It doesn't show much, other than that one of the tasks didn't kill itself when asked to cancel. So I'm going a bit crazy with debug logging so that the next time this comes up I can trace exactly what happened. Additionally, this tweaks the logic around how rethrottles were performed around cancel. Previously we set the `requestsPerSecond` to `0` when we cancelled the task. That was the "old way" to set them to infinity which was the intent. This switches that from `0` to `Float.MAX_VALUE` which is the "new way" to set the `requestsPerSecond` to infinity. I don't know that this is much better, but it feels better.	2017-06-19 18:46:42 -04:00
Jay Modi	1a6491bc54	Test: do not copy secure settings when creating random directory service (#25297 ) In tests, we sometimes create a random directory service and as part of that the IndexSettings get built again. When we build them again, we need to make sure we do not set the secure settings on the new IndexMetaData object that gets created as the node settings already have the secure settings and the index settings and node settings will be combined. If both have secure settings, the settings builder will throw an AlreadySetException.	2017-06-19 14:52:32 -06:00
Yannick Welsch	1a20760d79	Simplify IndexShard indexing and deletion methods (#25249 ) Indexing or deleting documents through the IndexShard interface is quite complex and error-prone. It requires multiple calls, e.g. first prepareIndexOnPrimary, then do some checks if mapping updates have occurred, then do the actual indexing using index(...) etc. Currently each consumer of the interface (local recovery, peer recovery, replication) has additional custom checks built around it to deal with mapping updates, some of which are even inconsistent. This commit aims at reducing the complexity by exposing a simpler interface on IndexShard. There are no more prepare*** methods and the mapping complexity is also hidden, but still giving callers a possibility to implement custom logic to deal with mapping updates.	2017-06-19 20:11:54 +02:00
Martijn van Groningen	bcaa413b0b	test: Port the remaining old indices search tests to full cluster restart qa module Also tweaked the qa module's gradle file to actually run bwc tests against all index compat versions. Relates to #24939	2017-06-19 12:27:24 +02:00
Simon Willnauer	dc02b32650	Simplify connection closing and cleanups in TcpTransport (#25250 ) Today we maintain a map of open connections in order to close them when a low level channel gets closed or handles a failure. We also spawn a thread due to some tricky concurrency issues especially with respect to netty since the listener might be called on a transport / boss thread. Executions on those threads must not be blocking since otherwise we will likely deadlock the event processing which adds to the complexity of the concurrency model in this class. This change associates the connection with the close callback that every channel invokes once it's closed which allows us to remove the connections map. A relaxed non-blocking concurrency model in the connection close listener allows cleaning up connected nodes without blocking on any lock.	2017-06-19 09:19:45 +02:00
Simon Willnauer	5f18791f1c	[TEST] assertBusy on transport stats since some implementations invoke listeners concurrently	2017-06-18 00:08:34 +02:00
Christoph Büscher	e99ced06cc	[Tests] Check that parsing aggregations works in a forward compatible way (#25219 ) This change adds tests for the aggregation parsing that try to simulate that we can parse existing aggregations in a forward compatible way in the future, ignoring potential newly added fields or substructures to the xContent response.	2017-06-17 13:06:31 +02:00
Nik Everett	21b1db2965	Remove assemble from build task when assemble removed Removes the `assemble` task from the `build` task when we have removed `assemble` from the project. We removed `assemble` from projects that aren't published so our releases will be faster. But that broke CI because CI builds with `gradle precommit build` and, it turns out, that `build` includes `check` and `assemble`. With this change CI will only run `check` for projects without an `assemble`.	2017-06-16 17:19:14 -04:00
Simon Willnauer	f18b0d293c	Move TransportStats accounting into TcpTransport (#25251 ) Today TcpTransport is the de-facto base-class for transport implementations. The need for all the callbacks we have in TransportServiceAdaptor are not necessary anymore since we can simply have the logic inside the base class itself. This change moves the stats metrics directly into TcpTransport removing the need for low level bytes send / received callbacks.	2017-06-16 22:34:11 +02:00
Nik Everett	7b358190d6	Remove assemble task when not used for publishing (#25228 ) Removes the `assemble` task from projects that are not published. This should speed up `gradle assemble` by skipping projects that don't need to be built. Which is useful because `gradle assemble` is how we cut releases.	2017-06-16 11:46:34 -04:00
Christoph Büscher	d3442f7d0c	Add unit test for PathHierarchyTokenizerFactory (#24984 )	2017-06-15 19:18:33 +02:00
Martijn van Groningen	428e70758a	Moved more token filters to analysis-common module. The following token filters were moved: `edge_ngram`, `ngram`, `uppercase`, `lowercase`, `length`, `flatten_graph` and `unique`. Relates to #23658	2017-06-15 18:28:31 +02:00
Boaz Leskes	648b4717a4	move assertBusy to use CheckException (#25246 ) We use assertBusy in many places where the underlying code throws exceptions. Currently we need to wrap those exceptions in a RuntimeException which is ugly.	2017-06-15 13:24:07 +02:00
Adrien Grand	0c117145f6	Upgrade to lucene-7.0.0-snapshot-92b1783. (#25222 ) This snapshot has faster range queries on range fields (LUCENE-7828), more accurate norms (LUCENE-7730) and the ability to use fake term frequencies (LUCENE-7854).	2017-06-15 09:52:07 +02:00
Ryan Ernst	caf7792db1	Scripting: Rename SearchScript.needsScores to needs_score (#25235 ) This commit renames the needsScores method so as to make it automatically generatable, based on the name of the `_score` variable which is available in search scripts. It also adds documentation to ScriptContext to explain the naming and signature of such methods.	2017-06-14 22:01:19 -07:00
Nik Everett	ce11b894b4	Extract the snapshot/restore full cluster restart tests from the translog full cluster restart tests (#25204 ) Extract the snapshot/restore full cluster restart tests from the translog full cluster restart tests. That way they are easier to read.	2017-06-14 13:03:59 -04:00
Jay Modi	ed76b9a518	Test: allow setting socket timeout for rest client (#25221 ) In #25201, a setting was added to allow setting the retry timeout for the rest client under the impression that this would allow requests to go longer than 30s. However, there is also a socket timeout that needs to be set to greater than 30s, which this change adds a setting for.	2017-06-14 08:21:56 -06:00
Andy Bristol	48696ab544	expose simple pattern tokenizers (#25159 ) Expose the experimental simplepattern and simplepatternsplit tokenizers in the common analysis plugin. They provide tokenization based on regular expressions, using Lucene's deterministic regex implementation that is usually faster than Java's and has protections against creating too-deep stacks during matching. Both have a not-very-useful default pattern of the empty string because all tokenizer factories must be able to be instantiated at index creation time. They should always be configured by the user in practice.	2017-06-13 12:46:59 -07:00
Jay Modi	190242fb1b	Test: add setting to change request timeout for rest client (#25201 ) This commit adds a setting to change the request timeout for the rest client. This is useful as the default timeout is 30s, which is also the same default for calls like cluster health. If both are the same then the response from the cluster health api will not be received as the client usually times out first making test failures harder to debug. Relates #25185	2017-06-13 12:19:17 -06:00
Simon Willnauer	186c16ea41	Ensure pending transport handlers are invoked for all channel failures (#25150 ) Today if a channel gets closed due to a disconnect we notify the response handler that the connection is closed and the node is disconnected. Unfortunately this is not a complete solution since it only works for published connections. Connections that are unpublished, i.e. for discovery, can indefinitely hang since we never invoke their handlers when we get a failure while a user is waiting for the response. This change adds connection tracking to TcpTransport that ensures we are notifying the corresponding connection if there is a failure on a channel.	2017-06-13 09:37:05 +02:00
Tal Levy	340909582f	remove Ingest's Internal Template Service (#25085 ) Ingest was using its own wrapper around TemplateScripts and the ScriptService. This commit removes that abstraction.	2017-06-08 15:24:03 -07:00
Lee Hinman	119f8ed9f0	Correctly enable _all for older 5.x indices When we disabled `_all` by default for indices created in 6.0, we missed adding a layer that would handle the situation where `_all` was not enabled in 5.x and then the cluster was updated to 6.0, this means that when the cluster was updated the `_all` field would be disabled for 5.x indices and field values would not be added to the `_all` field. This adds a compatibility layer for 5.x indices where we treat the default enabled value for the `_all` field to be `true` if unset on 5.x indices. Resolves #25068	2017-06-08 14:37:44 -06:00
Nik Everett	4a8c09c5f1	Make randomVersionBetween work with unreleased versions (#25042 ) Test: randomVersionBetween works with unreleased Modifies randomVersionBetween so that it works with unreleased versions. This should make switching a version from unreleased to released much simpler.	2017-06-08 10:19:06 -04:00
Yannick Welsch	cd57395c98	Use correct primary term for replicating NOOPs (#25128 ) NOOPs should be, same as for indexing operations, written on the replica using the original operation term instead of the current term of the replica.	2017-06-08 14:20:26 +02:00
Jim Ferenczi	36a5cf8f35	Automatically early terminate search query based on index sorting (#24864 ) This commit refactors the query phase in order to be able to automatically detect queries that can be early terminated. If the index sort matches the query sort, the top docs collection is early terminated on each segment and the computing of the total number of hits that match the query is delegated to a simple TotalHitCountCollector. This change also adds a new parameter to the search request called `track_total_hits`. It indicates if the total number of hits that match the query should be tracked. If false, queries sorted by the index sort will not try to compute this information and will limit the collection to the first N documents per segment. Aggregations are not impacted and will continue to see every document even when the index sort matches the query sort and `track_total_hits` is false. Relates #6720	2017-06-08 12:10:46 +02:00
Jim Ferenczi	21a57c1494	Always use DisjunctionMaxQuery to build cross fields disjunction (#25115 ) This commit modifies query_string, simple_query_string and multi_match queries to always use a DisjunctionMaxQuery when a disjunction over multiple fields is built. The tiebreaker is set to 1 in order to behave like the boolean query in terms of scoring. The removal of the coord factor in Lucene 7 made this change mandatory to correctly handle minimum_should_match. Closes #23966	2017-06-08 11:18:17 +02:00
David Roberts	f9503af0d5	[TEST] Move test skip/blacklist assumptions out of @Before method (#25100 ) This commit moves the assumeFalse() calls that implement test skipping and blacklisting out of the @Before method of ESClientYamlSuiteTestCase. The problem with having them in the @Before method is that if an assumption triggers then the @Before methods of classes that extend ESClientYamlSuiteTestCase will not run, but their @After methods will. This can lead to inconsistencies that cause assertions in the @After methods and fail the test even though it was skipped/blacklisted. Instead the assumeFalse() calls are now at the beginning of the test() method, which runs after all @Before methods (including those in classes that extend ESClientYamlSuiteTestCase) have completed. The only side effect is that overridden test() methods in classes that extend ESClientYamlSuiteTestCase which call super.test() and also do other things must now be designed not to consume any InternalAssumptionViolatedException that may be thrown by the super.test() call. Relates elastic/x-pack-elasticsearch#1650	2017-06-08 09:06:42 +01:00
Jack Conradson	d187fa78fd	Generate Painless Factory for Creating Script Instances (#25120 )	2017-06-07 16:06:11 -07:00
Christoph Büscher	9e741cd13d	Tests: Add ability to generate random new fields for xContent parsing test (#23437 ) For the response parsing we want to be lenient when it comes to parsing new xContent fields. In order to ensure this in our testing, this change adds a utility method to XContentTestUtils that takes xContent bytes representation as input and recursively adds a random field on each object level. Sometimes we also want to exclude a whole subtree from this treatment (e.g. skipping "_source"), other times an element (e.g. "fields", "highlight" in SearchHit) can have arbitrarily named objects. Those cases can be specified as exceptions.	2017-06-07 21:01:20 +02:00
Yannick Welsch	26ec89173b	Remove TranslogRecoveryPerformer (#24858 ) Splits TranslogRecoveryPerformer into three parts: - the translog operation to engine operation converter - the operation performer (that indexes the operation into the engine) - the translog statistics (for which there is already RecoveryState.Translog) This makes it possible for peer recovery to use the same IndexShard interface as bulk shard requests (i.e. Engine operations instead of Translog operations). It also pushes the "fail on bad mapping" logic outside of IndexShard. Future pull requests could unify the BulkShard and peer recovery path even more.	2017-06-07 17:11:27 +02:00
Tim Brooks	233c63fc63	Add version 5.6 to versions (#25084 ) * Add version 5.6 to versions * Fix test * Remove 5.4.2 constant	2017-06-07 09:59:27 -04:00
Tim Brooks	feca0a9f33	Bumping version to v6.0.0-alpha3 (#25077 )	2017-06-06 15:47:23 -05:00
Jim Ferenczi	7e60cf3e54	Move parent_id query to the parent-join module (#25072 ) This change moves the parent_id query to the parent-join module and handles the case when only the parent-join field can be declared on an index (index with single type on). If single type is off it uses the legacy parent join field mapper and switches to the new one otherwise (default in 6). Relates #20257	2017-06-06 19:35:14 +02:00
Nik Everett	73307a2144	Plugins can register pre-configured char filters (#25000 ) Fixes the plumbing so plugins can register char filters and moves the `html_strip` char filter into analysis-common. Relates to #23658	2017-06-05 09:25:15 -04:00
Nik Everett	190f5dce10	Test that gradle and Java version types match (#24943 ) Both gradle and java code attempt to infer the type of each Version constant in Version.java. It is super important that they infer that each constant has the same type. If they disagree we might accidentally not be testing backwards compatibility for some version. This adds a test to make sure that they agree, modulo known and accepted differences (mostly around alphas). It also changes the minimum wire compatible version from the released 5.4.0 to the unreleased 5.5.0 as that lines up with the gradle logic. Relates to #24798 Note that the gradle and java version logic doesn't actually match so this contains a hack to make it look like it matches. Since this is a start, I'm merging it and going to work on some followups to make the logic actually match.....	2017-06-02 21:30:47 -04:00
Ryan Ernst	0d8216d5af	Scripting: Convert CompiledTemplate to a ScriptContext (#25032 ) This commit creates TemplateScript and associated classes so that templates no longer need a special ScriptService.compileTemplate method. The execute() method is equivalent to the old run() method. relates #20426	2017-06-02 13:41:26 -07:00
Ali Beyad	e024c67561	Checks the circuit breaker before allocating bytes for a new big array (#25010 ) Previously, when allocating bytes for a BigArray, the array was created (or attempted to be created) and only then would the array be checked for the amount of RAM used to see if the circuit breaker should trip. This is problematic because for very large arrays, if creating or resizing the array, it is possible to attempt to create/resize and get an OOM error before the circuit breaker trips, because the allocation happens before checking with the circuit breaker. This commit ensures that the circuit breaker is checked before all big array allocations (note, this does not affect the array allocations that are less than 16kb which use the [Type]ArrayWrapper classes found in BigArrays.java). If such an allocation or resizing would cause the circuit breaker to trip, then the breaker trips before attempting to allocate and potentially running into an OOM error from the JVM. Closes #24790	2017-06-02 15:16:22 -04:00
Boaz Leskes	aa5b11687d	reduce the number of threads used by testNotBlockingUnsafeStackTraces It times out sometimes. Fixes #24936	2017-06-02 19:06:58 +02:00
Nik Everett	18f16ba555	Test: improve error message on leftover tasks After every REST test we wait for the list of pending cluster tasks to empty before moving on to the next task. If the list doesn't empty in 10 seconds we fail the test. This improves the error message when we fail the test to include the list of running tasks.	2017-06-02 11:02:44 -04:00
Christoph Büscher	a94ac30360	[Tests] Improve error message for failed xContentEquivalent() tests (#24828 ) For comparing actual and parsed object equality for the response parsing we currently rely on comparing the original xContent and the output of the parsed object. Currently we only have cryptic error messages if this comparison fails which are hard to read also because we recursively compare lists and maps of the xContent structures we compare. This commit leverages the existing NotEqualMessageBuilder for providing error messages that are more detailed and useful for debugging if an error occurs.	2017-06-01 14:12:26 +02:00

2005 Commits