Node IDs are currently randomly generated during node startup. That means they change every time the node is restarted. While this doesn't matter for ES proper, it makes it hard for external services to track nodes. Another, more minor, side effect is that indexing the output of, say, the node stats API results in creating new fields due to node ID being used as keys.
The first approach I considered was to use the node's published address as the base for the id. We already [treat nodes with the same address as the same](https://github.com/elastic/elasticsearch/blob/master/core/src/main/java/org/elasticsearch/discovery/zen/NodeJoinController.java#L387) so this is a simple change (see [here](https://github.com/elastic/elasticsearch/compare/master...bleskes:node_persistent_id_based_on_address)). While this is simple and it works for probably most cases, it is not perfect. For example, if after a node restart, the node is not able to bind to the same port (because it's not yet freed by the OS), it will cause the node to still change identity. Also in environments where the host IP can change due to a host restart, identity will not be the same.
Due to those limitation, I opted to go with a different approach where the node id will be persisted in the node's data folder. This has the upside of connecting the id to the nodes data. It also means that the host can be adapted in any way (replace network cards, attach storage to a new VM). I
It does however also have downsides - we now run the risk of two nodes having the same id, if someone copies clones a data folder from one node to another. To mitigate this I changed the semantics of the protection against multiple nodes with the same address to be stricter - it will now reject the incoming join if a node exists with the same id but a different address. Note that if the existing node doesn't respond to pings (i.e., it's not alive) it will be removed and the new node will be accepted when it tries another join.
Last, and most importantly, this change requires that *all* nodes persist data to disk. This is a change from current behavior where only data & master nodes store local files. This is the main reason for marking this PR as breaking.
Other less important notes:
- DummyTransportAddress is removed as we need a unique network address per node. Use `LocalTransportAddress.buildUnique()` instead.
- I renamed `node.add_lid_to_custom_path` to `node.add_lock_id_to_custom_path` to avoid confusion with the node ID which is now part of the `NodeEnvironment` logic.
- I removed the `version` paramater from `MetaDataStateFormat#write` , it wasn't really used and was just in the way :)
- TribeNodes are special in the sense that they do start multiple sub-nodes (previously known as client nodes). Those sub-nodes do not store local files but derive their ID from the parent node id, so they are generated consistently.
Today throughout the codebase, catch throwable is used with reckless
abandon. This is dangerous because the throwable could be a fatal
virtual machine error resulting from an internal error in the JVM, or an
out of memory error or a stack overflow error that leaves the virtual
machine in an unstable and unpredictable state. This commit removes
catch throwable from the codebase and removes the temptation to use it
by modifying listener APIs to receive instances of Exception instead of
the top-level Throwable.
Relates #19231
As discussed at https://github.com/elastic/elasticsearch-cloud-azure/issues/91#issuecomment-229113595, we know that the current `discovery-azure` plugin only works with Azure Classic VMs / Services (which is somehow Legacy now).
The proposal here is to rename `discovery-azure` to `discovery-azure-classic` in case some users are using it.
And deprecate it for 5.0.
Closes#19144.
As some plugins are becoming big now, it is hard for the user to know, if the plugin
is being downloaded or just nothing happens.
This commit adds a progress bar during download, which can be disabled by using the `-q`
parameter.
In addition this updates to jimfs 1.1, which allows us to test the batch mode, as adding
security policies are now supported due to having jimfs:// protocol support in URL stream
handlers.
This commit adds randomization for the packaging upgrade test. In
particular, we extract a list of the released version of Elasticsearch
from Maven Central and randomize the selection of the version to upgrade
from. The randomization is repeatable, and supports the tests.seed
property. Specific versions can be tested by setting the property
tests.packaging.upgrade.from.versions.
Relates #19033
Registering a script engine or native scripts still uses Guice today
and is much more complicated than needed. This change moves to a pull
based model where script plugins have to implement a dedicated interface
`ScriptPlugin` and defines simple getter returning instances rather than
classes.
The database files have been doubled in size compared to the previous files being used.
For this reason the database files are now gzip compressed, which required using
`GZIPInputStream` when loading database files.
Previously Elasticsearch used $DATA_DIR/$CLUSTER_NAME/nodes for the path
where data is stored, this commit changes that to be $DATA_DIR/nodes.
On startup, if the old folder structure is detected it will be used.
This behavior will be removed in Elasticsearch 6.0
Resolves#17810
Folded grok processor into ingest-common module.
The rest tests have been moved to ingest-common module as well, because these tests don't run in the rest-api-spec module but in the distribution:integ-test-zip module
and adding a test plugin there felt just wrong to me. I think this is ok. I left a tiny ingest rest test behind in that tests with an empty pipeline.
Removed messy tests, these tests were already covered in the rest tests
Added ingest test plugin in test infra so that each module testing integration with ingest doesn't need write its own plugin
Moved reindex ingest tests to qa module
Closes#18490
We have 3 evil tests for jarhell. They have been failing in java 9
because of how evil they are. The first checks the leniency we add for
jarhell in the jdk itself. This is unecessary, since if the leniency
wasn't there, we would already be failing all jarhell checks. The second
is checking the compile version is compatible with the jdk. This is
simpler since we don't need to fake the java version: we know 1.7 should
be compatibile with both java 8 and 9, so we can use that as a constant.
Finally the last test checks if the java version system property is
broken. This is simply something we should not check, we have to trust
that java specifies it correctly, and again, if it was broken, all
jarhell checks would be broken.
Lucene SuppressForbidden is marked lucene.internal and should not be
used outside of Lucene. This commit removes the uses of this class
within Elasticsearch. Instead,
org.elasticsearch.common.SuppressForbidden should be used, which was
already the case in most places.
This commit fixes an issue with the plugins directory being a symbolic
link. Namely, the install plugins command attempts to always create the
plugins directory just in case it does not exist. The JDK method used
here guarantees that the directory is created, and an exception is not
thrown if the directory could not be created because it already
exists. The problem is that this JDK method does not respect symlinks so
its internal existence checks fails, it proceeds to attempt to create
the directory, but the directory creation fails because the symlink
exists. This is documented as being not an issue. We work around this by
checking if there is a symlink where we expect the plugins directory to
be, and only attempt to create if not. We add a unit test that plugin
installation to a symlinked plugins directory works as expected.
This commit removes the ability to specify a custom plugins
path. Instead, the plugins path will always be a subdirectory called
"plugins" off of the home directory.
- now you can specify a list of grok patterns to match your field with
and the first one to successfully match wins.
- only non-null captures will be inserted into your matched document.
Fixes#17903.
This removes the ScriptMode class entirely, which was an enum with two
options (ON and OFF) which essentially boiled down to true and false.
Now the boolean values are used instead.
Today when parsing settings during bootstrap, we add a system property
for every Elasticsearch setting. Additionally, settings can be set via
system properties. This commit simplifies this situation.
- settings are no longer propogated to system properties
- system properties can not be used to set settings
- the "es." prefix on settings is no longer required (nor permitted)
- test logging has a dedicated system property (tests.logger.level)
Relates #18198
The plugin script parses command-line options looking for Java system
properties and extracts these arguments to pass to the java command when
starting the JVM. Since elasticsearch-plugin allows arbitrary user
arguments to the JVM via ES_JAVA_OPTS, this parsing is unnecessary. This
commit removes this unnecessary
Relates #18207
This commit modifies the packaging tests to account for the fact that
rpm behaves differently with respect to preserving directories marked as
"CONFIG | NOREPLACE" on older versions versus newer versions. Older
versions will leave the directory as-is while newer versions will append
the suffix ".rpmsave" to the directory name.
Relates #18216
This change makes the vagrant tasks extend LoggedExec, so that the
entire vagrant output can be dumped on failure (and completely logged
when using --info). It should help for debugging issues like #18122.
Exit with proper exit code (1) and an error message if elasticsearch
executable binary does not exists or has insufficient permissions to
execute.
Squashed commit of the following:
commit 9768d316303418ba4f9c96d3f87c376048a1b1bc
Author: Puru <tuladharpuru@gmail.com>
Date: Fri Apr 22 23:26:47 2016 +0545
Fixed ES_HOME typo
commit 79a2b0394297f8b02b6f71b71ba35ff79f1a684e
Author: Puru <tuladharpuru@gmail.com>
Date: Sun Apr 10 11:00:24 2016 +0545
Improve elasticsearch startup script test
Added improvement as per conversation in https://github.com/elastic/elasticsearch/pull/17082#issuecomment-206459613
commit 7be38e1fefd4baa6ccdbdc14745c00f6dc052e0c
Author: Puru <tuladharpuru@gmail.com>
Date: Wed Mar 23 13:23:52 2016 +0545
Add elasticsearch startup script test
The test ensures that elasticsearch startup script exists and is executable.
commit d10eed5c08260fa9c158a4487bbb3103a8d867ed
Author: Puru <tuladharpuru@gmail.com>
Date: Wed Mar 23 12:30:25 2016 +0545
Fixed IF syntax and failure message
commit 6dc66f616545572485b4d43bee05a4cbbf1bed72
Author: Puru <tuladharpuru@gmail.com>
Date: Sat Mar 12 11:08:11 2016 +0545
Fix exit code
Exit with proper exit code (1) and an error message if elasticsearch executable binary does not exists or has insufficient permissions to execute.
This changes our packaging to be explicit about the permissions of files
and directories in the tar.gz, rpm, and deb packages. This is to protect
against a user having an incorrectly set umask when installing.
Additionally, plugins that are installed now have their permissions set
by the plugin installation so that plugins that may have been packaged
with incorrect permissions are secured.
Resolves#17634
* `rename` processor, renamed `to` to `target_field`
* `date` processor, renamed `match_field` to `field` and renamed `match_formats` to `formats`
* `geoip` processor, renamed `source_field` to `field` and renamed `fields` to `properties`
* `attachment` processor, renamed `source_field` to `field` and renamed `fields` to `properties`
Closes#17835
In Elasticsearch 5.0.0, by default unquoted field names in JSON will be
rejected. This can cause issues, however, for documents that were
already indexed with unquoted field names. To alleviate this, a system
property has been added that can be enabled so migration can occur.
This system property will be removed in Elasticsearch 6.0.0
Resolves#17674
This commit adds a new configuration file jvm.options to centralize and
simplify management of JVM options. This separates the configuration of
the JVM from the packaging scripts (bin/elasticsearch*, bin/service.bat,
and init.d/elasticsearch) simplifying end-user operational management of
custom JVM options.
This change makes specifying which boxes to run vagrant tests on a
little easier. Previously there were two tasks, checkPackages and
checkPackagesAllDistros. With this change, there is a single
packagingTest task. The boxes to run on are specified using the
gradle property vagrant.boxes, which can be easily specified on the
command line, or in a gradle properties file. There are also two
alias names, 'sample' for a yum and apt box, and 'all' for all boxes.
This commit ensures that the data, logs, and config directories have the
proper ownership after the packages are installed. Additionally, this
commit ensures that the configs in /etc/elasticsearch are preserved
after removal of the RPM package.
Debian asks during installation, if the configuration file should be updated.
This is asked via a prompt and thus hangs.
This adds an option to always update to the newer config file, so automated
installation keeps working.
Change version, required a minor fix in the RPM building.
In case of a alpha/beta version, the release will contain alpha/beta
as the RPM version cannot contains dashes/tildes.
This commit adds a bootstrap check on Linux and OS X for the max size of
virtual memory (address space) to the user running the Elasticsearch
process.
Closes#16935
This commit sets up the default filesystem used during install plugins
tests. A hack is neeeded to handle the temporary directory because the
system property "java.io.tmpdir" will have been initialized to a value
that is sensible for the default filesystem, but not necessarily to a
value that makes sense for the mock filesystem in use during the
tests. This property is restored after each test.
This commit refactors the unit tests for installing plugins to test
against mock filesystems (as well as the native filesystem) for better
test coverage. This commit also adds tests that cover the POSIX
attributes handling when installing plugins (e.g., ensuring that the
plugins directory has the right permissions, the bin directory has
execute permissions, and the config directory has the same owner and
group as its parent).