OpenSearch

Commit Graph

Author	SHA1	Message	Date
Jason Tedor	d5451b2037	Die with dignity while merging If an out of memory error is thrown while merging, today we quietly rewrap it into a merge exception and the out of memory error is lost. Instead, we need to rethrow out of memory errors, and in fact any fatal error here, and let those go uncaught so that the node is torn down. This commit causes this to be the case. Relates #27265	2017-11-06 17:55:11 -05:00
Jason Tedor	766d29e7cf	Correctly encode warning headers The warnings headers have a fairly limited set of valid characters (cf. quoted-text in RFC 7230). While we have assertions that we adhere to this set of valid characters ensuring that our warning messages do not violate the specification, we were neglecting the possibility that arbitrary user input would trickle into these warning headers. Thus, missing here were tests for these situations and encoding of characters that appear outside the set of valid characters. This commit addresses this by encoding any characters in a deprecation message that are not from the set of valid characters. Relates #27269	2017-11-06 13:20:30 -05:00
David Roberts	749c3ec716	Remove the single argument Environment constructor (#27235 ) Only tests should use the single argument Environment constructor. To enforce this the single arg Environment constructor has been replaced with a test framework factory method. Production code (beyond initial Bootstrap) should always use the same Environment object that Node.getEnvironment() returns. This Environment is also available via dependency injection.	2017-11-04 13:25:09 +00:00
Martijn van Groningen	9e67cca987	build: Fix setting the incorrect bwc version in mixed cluster qa module Prior to this change if the `bwcTest` task is run then it would create task for each version, but each task in reality would use wireCompatVersions - 1 ES version. So we were not actually testing against 5.6.x versions in the 6.x and 6.0 branches.	2017-11-03 14:18:27 +01:00
Jason Tedor	8b4a92fbb7	Adjust assertions for sequence numbers BWC tests This commit adjusts the assertions for the sequence number BWC tests to account for the fact that sometimes these tests are run in mixed-clusters with 5.6 nodes (that do not understand sequence numbers), and sometimes these tests are run in mixed-cluster with 6.0+ nodes (that all understood sequence numbers). Relates #27251	2017-11-03 08:58:05 -04:00
Jason Tedor	77f87732ef	Adjust .DS_Store test assertions on Windows Windows handles trying to read a file that does not exist because a component of the path is not a directory differently than other OS handle this situation. This commit adjusts these assertions for Windows.	2017-10-25 22:36:53 -04:00
Jason Tedor	6722b9c4a2	Ignore .DS_Store files on macOS Finder creates these files if you browse a directory there. These files are really annoying, but it's an incredible pain for users that these files are created unbeknownst to them, and then they get in the way of Elasticsearch starting. This commit adds leniency on macOS only to skip these files. Relates #27108	2017-10-25 11:25:29 -04:00
Simon Willnauer	8dda827ff4	Don't refresh on `_flush` `_force_merge` and `_upgrade` (#27000 ) Today all these API calls have a side effect of making documents visible to search requests. While this is sometimes desired it's an unnecessary side effect and now that we have an internal (engine-private) index reader (#26972) we artificially add a refresh call for bwc. This change removes this side effect in 7.0.	2017-10-16 10:16:35 +02:00
Anton Pozhidaev	cee9640c20	Update by Query is modified to accept short `script` parameter. (#26841 ) Update by Query is modified to accept short `script` parameter. Closes issue #24898	2017-10-11 21:57:46 +00:00
kel	2e36f19051	Add support for parsing inline script (#23824 ) (#26846 ) * Add support for parsing inline script (#23824) * Fix test	2017-10-11 09:15:37 -07:00
Martijn van Groningen	19dc629e6d	Test query builder bwc against previous supported versions instead of just the current version. Relates to #25456	2017-10-09 13:22:01 +02:00
Yannick Welsch	a4436195f8	Set minimum_master_nodes on rolling-upgrade test (#26911 ) The rolling-upgrade test was only writing the "minimum_master_nodes" setting to the configuration file of the old nodes, but not the upgraded ones. Also changes the value of "minimum_master_nodes" from "number_of_nodes" to "(number_of_nodes / 2) + 1".	2017-10-09 10:45:03 +02:00
Simon Willnauer	cdd7c1e6c2	Return List instead of an array from settings (#26903 ) Today we return a `String[]` that requires copying values for every access. Yet, we already store the setting as a list so we can also directly return the unmodifiable list directly. This makes list / array access in settings a much cheaper operation especially if lists are large.	2017-10-09 09:52:08 +02:00
Nhat	bf4c3642b2	remove _primary and _replica shard preferences (#26791 ) The shard preference _primary, _replica and its variants were useful for the asynchronous replication. However, with the current impl, they are no longer useful and should be removed. Closes #26335	2017-10-08 11:03:06 -04:00
Boaz Leskes	c342cdeab5	Setup debug logging for qa.full-cluster-restart	2017-10-07 23:37:09 +02:00
Boaz Leskes	2d409a912f	full-cluster-restart tests: prevent shards from going inactive FullClusterRestartIT.testRecovery relies on the translogs not being flushed	2017-10-05 10:08:10 +02:00
Boaz Leskes	2a04118e88	Promote common rest test utility methods to ESRestTestCase We have duplicates in some classes and I was about to create one more.	2017-10-05 10:08:10 +02:00
Luca Cavanna	9b9cb81c41	Fix serialization errors when cross cluster search goes to a single shard (#26881 ) The single shard optimization that we have in our search api changes the type of response returned by the query transport action name based on the shard search request. If the request goes to one shard, we will do query and fetch at the same time, hence the response will be different. The proxying layer used in cross cluster search was not aware of this distinction, which causes serialization issues every time a cross cluster search request goes to a single shard and goes through a gateway node which has to forward the shard request to a data node. The coordinating node would then expect a QueryFetchSearchResult while the gateway would return a QuerySearchResult. Closes #26833	2017-10-04 22:39:14 +02:00
Simon Willnauer	d1533e2397	Remove Settings#getAsMap() (#26845 ) Since `#getAsMap` exposes internal representation we are trying to remove it step by step. This commit is cleaning up some xcontent writing as well as usage in tests	2017-10-04 01:21:38 -06:00
Boaz Leskes	4f8131026e	RecoveryIT.testHistoryUUIDIsGenerated should reduce unassigned shards delay instead of ensure green. The ensure green approach to avoid allocation delays caused problems with other indices created by other tests which didn't use ensure green in the various cluster stages. This aligns testHistoryUUIDIsGenerated to use the same approach used by the other test.	2017-09-30 16:48:23 +02:00
Boaz Leskes	5df77a8c91	enable debug logging for testHistoryUUIDIsGenerated (+1 squashed commit) Squashed commits: [1d4f268] enable debug logging for testHistoryUUIDIsGenerated	2017-09-26 14:49:47 +02:00
Jay Modi	b8cd82e5c2	Increase time to wait for green in rolling upgrade tests (#26781 ) This commit increases the amount of time to wait for green to account for unassigned shards that have been delayed. The default delay is 60s, so we need to wait longer than that. Previously, the wait would timeout at 30s due to the rest client and the default for the cluster health api. Closes #26742	2017-09-25 12:39:33 -06:00
Boaz Leskes	cd2a4372b4	RecoveryIT should wait for green when in mixed cluster to avoid unassigned shards The test starts with two old nodes and creates indices (without waiting for green, which is fixed here too). Then it restarts one of the nodes and waits for it to join the cluster. This wait condition only uses wait for yellow as our generic infra doesn't know how many nodes are there in total. Once the restarted node is part of the cluster (mixed mode) the second old node is restarted. If indices are not fully allocated when that happens, the shards will go into delayed unassigned mode. If the recovery of the replica never completed we may end up with corrupted / no secondary copy on the node. This will cause the shards to be delayed for 1m before being reassigned and the test will time out.	2017-09-24 22:38:20 +02:00
Boaz Leskes	2b6f75730e	RecoveryIT up client time out to 40s to see response in a 30s time	2017-09-24 21:33:20 +02:00
Jason Tedor	2e63a13c0a	Upgrade to Log4j 2.9.1 This commit upgrades the Log4j dependency, picking up a fix for an issue with handling stack traces on JDK 9. Relates #26750	2017-09-22 11:57:06 -04:00
Jason Tedor	f35d1de502	Introduce global checkpoint background sync It is the exciting return of the global checkpoint background sync. Long, long ago, in snapshot version far, far away we had and only had a global checkpoint background sync. This sync would fire periodically and send the global checkpoint from the primary shard to the replicas so that they could update their local knowledge of the global checkpoint. Later in time, as we sped ahead towards finalizing the initial version of sequence IDs, we realized that we need the global checkpoint updates to be inline. This means that on a replication operation, the primary shard would piggy back the global checkpoint with the replication operation to the replicas. The replicas would update their local knowledge of the global checkpoint and reply with their local checkpoint. However, this could allow the global checkpoint on the primary to advance again and the replicas would fall behind in their local knowledge of the global checkpoint. If another replication operation never fired, then the replicas would be permanently behind. To account for this, we added one more sync that would fire when the primary shard fell idle. However, this has problems: - the shard idle timer defaults to five minutes, a long time to wait for the replicas to learn of the new global checkpoint - if a replica missed the sync, there was no follow-up sync to catch them up - there is an inherent race condition where the primary shard could fall idle mid-operation (after having sent the replication request to the replicas); in this case, there would never be a background sync after the operation completes - tying the global checkpoint sync to the idle timer was never natural To fix this, we add two additional changes for the global checkpoint to be synced to the replicas. The first is that we add a post-operation sync that only fires if there are no operations in flight and there is a lagging replica. This gives us a chance to sync the global checkpoint to the replicas immediately after an operation so that they are always kept up to date. The second is that we add back a global checkpoint background sync that fires on a timer. This timer fires every thirty seconds, and is not configurable (for simplicity). This background sync is smarter than what we had previously in the sense that it only sends a sync if the global checkpoint on at least one replica is lagging that of the primary. When the timer fires, we can compare the global checkpoint on the primary to its knowledge of the global checkpoint on the replicas and only send a sync if there is a shard behind. Relates #26591	2017-09-21 15:34:13 -04:00
Christoph Büscher	86b00b84bc	Remove parse field deprecations in query builders (#26711 ) The `fielddata` field and the use of the `_name` field in the short syntax of the range query have been deprecated in 5.0 and can be removed. The same goes for the deprecated `score_mode` field in HasParentQueryBuilder, the deprecated `like_text`, `ids` and `docs` parameter in the `more_like_this` query, the deprecated query name in the short version of the `regexp` query, and several deprecated alternative field names in other query builders.	2017-09-20 16:22:21 +02:00
Yannick Welsch	ff1e26276d	Deguice ActionFilter (#26691 ) Allows to instantiate TransportAction instances without Guice.	2017-09-20 10:30:21 +02:00
Boaz Leskes	04385a9ce9	Restoring from snapshot should force generation of a new history uuid (#26694 ) Restoring a shard from snapshot throws the primary back in time violating assumptions and bringing the validity of global checkpoints in question. To avoid problems, we should make sure that a shard that was restored will never be the source of an ops based recovery to a shard that existed before the restore. To this end we have introduced the notion of `history_uuid` in #26577 and required that both source and target will have the same history to allow ops based recoveries. This PR makes sure that a shard gets a new uuid after restore. As suggested by @ywelsch , I derived the creation of a `history_uuid` from the `RecoverySource` of the shard. Store recovery will only generate a uuid if it doesn't already exist (we can make this stricter when we don't need to deal with 5.x indices). Peer recovery follows the same logic (note that this is different than the approach in #26557, I went this way as it means that shards always have a history uuid after being recovered on a 6.x node and will also mean that a rolling restart is enough for old indices to step over to the new seq no model). Local shards and snapshot force the generation of a new translog uuid. Relates #10708 Closes #26544	2017-09-19 15:58:36 +02:00
Michael Basnight	f385e0cf26	Add bad_request to the rest-api-spec catch params (#26539 ) This adds another request to the catch params. It also makes sure that the generic request param does not allow 400 either.	2017-09-14 14:24:03 -05:00
Boaz Leskes	1ca0b5e9e4	Introduce a History UUID as a requirement for ops based recovery (#26577 ) The new ops based recovery, introduced as part of #10708, is based on the assumption that all operations below the global checkpoint known to the replica do not need to be synced with the primary. This is based on the guarantee that all ops below it are available on primary and they are equal. Under normal operations this guarantee holds. Sadly, it can be violated when a primary is restored from an old snapshot. At that point the restored primary can miss operations below the replica's global checkpoint, or even worse may have totally different operations at the same spot. This PR introduces the notion of a history uuid to be able to capture the difference with the restored primary (in a follow up PR). The History UUID is generated by a primary when it is first created and is synced to the replicas which are recovered via a file based recovery. The PR adds a requirement to ops based recovery to make sure that the history uuid of the source and the target are equal. Under normal operations, all shard copies will stay with that history uuid for the rest of the index lifetime and thus this is a noop. However, it gives us a place to guarantee we fall back to file based syncing in special events like a restore from snapshot (to be done as a follow up) and when someone calls the truncate translog command which can go wrong when combined with primary recovery (this is done in this PR). We considered in the past to use the translog uuid for this function (i.e., sync it across copies) and thus avoid adding an extra identifier. This idea was rejected as it removes the ability to verify that a specific translog really belongs to a specific lucene index. We also feel that having a history uuid will serve us well in the future.	2017-09-14 21:25:02 +03:00
Christoph Büscher	c7c6443b10	[Docs] "The the" is a great band, but ... (#26644 ) Removing several occurrences of this typo in the docs and javadocs, seems to be a common mistake. Corrections turn up once in a while in PRs, better to correct some of this in one sweep.	2017-09-14 15:08:20 +02:00
Jason Tedor	ca6bce75da	Refactor bootstrap check results and error messages This commit refactors the bootstrap checks into a single result object that encapsulates whether or not the check passed, and a failure message if the check failed. This simplifies the checks, and enables the messages to more easily be based on the state used to discern whether or not the check passed. Relates #26637	2017-09-13 21:30:27 -04:00
Simon Willnauer	b4de2a6f28	Add BootstrapContext to expose settings and recovered state to bootstrap checks (#26628 ) This exposes the node settings and the persistent part of the cluster state to the bootstrap checks to allow plugins to enforce certain preconditions based on the recovered state.	2017-09-13 22:14:17 +02:00
Jason Tedor	19a2156d18	Skip some logging tests on JDK 9 There is a bug in Log4j on JDK 9 for walking the stack to find where a log line is coming from. This bug is impacting some of our testing, so this commit marks these tests as skippable only on JDK 9 until the bug is fixed upstream. Relates #26467	2017-09-01 12:38:22 -04:00
Alexander Reelsen	80d0a32f8e	ScriptService: Replace max compilation per minute setting with max compilation rate (#26399 ) The current script service has a script compilation limit for a one minute window. This is set to a small default value of 15. Instead of increasing that default value, this commit introduces a new setting that allows to configure a rate per time unit, so that the script service can deal with bursts better. The new setting is named `script.max_compilations_rate`, requires a nonnegative number and a positive time value. The default is `75/5m`, which is equivalent to the existing 15 per minute.	2017-09-01 10:15:27 +02:00
Ryan Ernst	6ffbb9dfc6	Test: Quiet failing java 9 test due to log4j upgrade See https://github.com/elastic/elasticsearch/issues/26464	2017-08-31 16:04:18 -07:00
Ryan Ernst	42e8940a3d	Build: Ensure build metadata is written (#26427 ) This commit adds writing build metadata to the `check` command for each bwc project. This ensures the files will be written if doing a general `gradle check`, which is what CI intake jobs do. In later jobs like bwcTest, the extra bwc-release-snapshot info is needed. Note this commit also has a little cleanup of the output for the bwc checkout, as it was plastering a git warning, instead of the real info we care about (the refspec and commit that were used).	2017-08-30 07:26:33 -07:00
Jason Tedor	7a035f5f84	setgid on /etc/elasticsearch on package install When creating the keystore explicitly (from executing elasticsearch-keystore create) or implicitly (for plugins that require the keystore to be created on install) on an Elasticsearch package installation, we are running as the root user. This leaves /etc/elasticsearch/elasticsearch.keystore having the wrong ownership (root:root) so that the elasticsearch user can not read the keystore on startup. This commit adds setgid to /etc/elasticsearch on package installation so that when executing this directory (as we would when creating the keystore), we will end up with the correct ownership (root:elasticsearch). Additionally, we set the permissions on the keystore to be 660 so that the elasticsearch user via its group can read this file on startup. Relates #26412	2017-08-28 20:47:42 -04:00
Michael Basnight	cfd14cd2b8	Revert shading for the low level rest client (#26367 ) At current, we do not feel there is enough of a reason to shade the low level rest client. It caused problems with commons logging and IDEs during the brief time it was used. We did not know exactly how many users will need this, and decided that leaving shading out until we gather more information is best. Users can still shade the jar themselves. For information and feedback, see issue #26366. Closes #26328 This reverts commit `3a20922046`. This reverts commit `2c271f0f22`. This reverts commit `9d10dbea39`. This reverts commit `e816ef89a2`.	2017-08-25 14:13:12 -05:00
Ryan Ernst	5202e7e93b	Settings: Move keystore creation to plugin installation (#26329 ) This commit removes the keystore creation on elasticsearch startup, and instead adds a plugin property which indicates the plugin needs the keystore to exist. It does still make sure the keystore.seed exists on ES startup, but through an "upgrade" method that loading the keystore in Bootstrap calls. closes #26309	2017-08-24 12:12:47 -07:00
Luca Cavanna	6d8e2c6d4c	Make RestHighLevelClient Closeable and simplify its creation (#26180 ) By making RestHighLevelClient Closeable, its close method will close the internal low-level REST client instance by default, which simplifies the way most users interact with the high-level client. Its constructor accepts now a RestClientBuilder, which clarifies that the low-level REST client is internally created and managed. It is still possible to provide an already built `RestClient` instance, but that can only be done by subclassing `RestHighLevelClient` and calling the protected constructor that accepts a `RestClient`. In such case a consumer has also to be provided, which controls what has to be done when the high-level client gets done. Closes #26086	2017-08-24 09:39:41 +02:00
Yannick Welsch	3d8feff66e	Use Java 9 FilePermission model (#26302 ) This commit makes the security code aware of the Java 9 FilePermission changes (see #21534) and allows us to remove the `jdk.io.permissionsUseCanonicalPath` system property.	2017-08-22 11:22:00 +09:30
Jason Tedor	4e97be02a9	Export HOSTNAME environment variable We previously explicitly set the HOSTNAME environment variable so that ${HOSTNAME} could be used as a placeholder for defining the node.name in elasticsearch.yml. We removed explicitly setting this because bash defines HOSTNAME. The problem is that bash defines HOSTNAME as a bash variable, not as an environment variable. Therefore, to restore the previous behavior, we export the bash value for HOSTNAME as an environment variable named HOSTNAME. For consistency between Windows and the Unix-like systems, we also define HOSTNAME with a value equal to the environment variable COMPUTERNAME on Windows. Relates #26262	2017-08-17 16:51:02 -04:00
Jason Tedor	7fb910599a	Add packaging test for systemd runtime directive We previously added a RuntimeDirectory directive to the systemd service file for Elasticsearch. This commit adds a packaging test for the situation that this directive was intended to address. Relates #26229	2017-08-16 04:35:02 -04:00
Nik Everett	d150884ded	Drop upgrade from full cluster restart tests (#26224 ) Our documentation for the API is: ``` The _upgrade API is no longer useful and will be removed. Instead, see Reindex to upgrade. ``` Given that, I don't think we need to test the API anymore. Closes #25311	2017-08-15 16:00:35 -04:00
Jason Tedor	e9687622bd	Rename CONF_DIR to ES_PATH_CONF The environment variable CONF_DIR was previously inconsistently used in our packaging to customize the location of Elasticsearch configuration files. The importance of this environment variable has increased starting in 6.0.0 as it's now used consistently to ensure Elasticsearch and all secondary scripts (e.g., elasticsearch-keystore) all use the same configuration. The name CONF_DIR is there for legacy reasons yet it's too generic. This commit renames CONF_DIR to ES_PATH_CONF. Relates #26197	2017-08-15 06:19:06 +09:00
Tim Brooks	0f4f49496f	Use nio transport in test clusters (#25986 ) This commit adds the nio transport as an option in place of the mock tcp transport for tests. Each test will only use one transport type. The transport type is decided by a random boolean generated inside of the `ESTestCase` class.	2017-08-01 16:19:31 -05:00
Ryan Ernst	072281d5aa	Update version to 7.0.0-alpha1 (#25876 ) This commit updates the version for master to 7.0.0-alpha1. It also adds the 6.1 version constant, and fixes many tests, as well as marking some as awaits fix. Closes #25893 Closes #25870	2017-08-01 15:47:48 -04:00
Boaz Leskes	9f1d116967	Node should start up despite a lingering `.es_temp_file` (#21210 ) When ES starts up we verify we can write to all data folders and that they support atomic moves. We do so by creating and deleting temp files. If for some reason the file was successfully created but not successfully deleted, we still shut down correctly but subsequent start attempts will fail with a file already exists exception. This commit makes sure to first clean any existing temporary files. Supersedes #21007	2017-08-01 15:41:27 +02:00
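
Commit 80d0a32f8e above describes `script.max_compilations_rate` as a count over a time window, with the default `75/5m` averaging out to the earlier limit of 15 per minute. The arithmetic can be sketched as follows. This is only an illustration: `parse_rate`, `per_minute`, and the `UNIT_SECONDS` table are hypothetical names made up for this sketch, only the `s`, `m`, and `h` suffixes are assumed, and none of it is Elasticsearch's actual parsing code.

```python
from fractions import Fraction

# Hypothetical unit table for this sketch; the real setting may accept other units.
UNIT_SECONDS = {"s": 1, "m": 60, "h": 3600}


def parse_rate(value):
    """Split a 'count/time' string such as '75/5m' into (count, window_seconds)."""
    count_text, _, window_text = value.partition("/")
    if not count_text or not window_text:
        raise ValueError("expected the form count/time, got %r" % value)
    count = int(count_text)
    amount, unit = int(window_text[:-1]), window_text[-1]
    # The commit message requires a nonnegative count and a positive time value.
    if count < 0 or amount <= 0 or unit not in UNIT_SECONDS:
        raise ValueError("invalid rate: %r" % value)
    return count, amount * UNIT_SECONDS[unit]


def per_minute(value):
    """Average number of compilations per minute permitted by the rate."""
    count, window_seconds = parse_rate(value)
    return Fraction(count * 60, window_seconds)


print(parse_rate("75/5m"))
print(per_minute("75/5m"))
```

The average is unchanged, but the wider window is what lets a burst of compilations land inside a single minute without hitting the limit, which is the motivation the commit message gives.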


809 Commits