This breaks on windows where TMP dir default to C:\Windows and startup
fails with a permission error.
I tried to create a tmp dir and pass in `TMP` env, but it lead to a
class not found error, and since testclusers is already independent of
the calling environment I stopped there.
The posix_spawn method of launching a process from Java
goes via an intermediate process called jspawnhelper
which lives in the lib directory rather than the bin
directory and hence got missed by the original chmod
loop. This change adds jspawnhelper as a special case.
It's the only program that's in the lib directory in a
macOS JDK 11.
* Bundle java in distributions
Setting up a jdk is currently a required external step when installing
elasticsearch. This is particularly problematic for the rpm/deb packages
as installing a jdk in the same package installation command does not
guarantee any order, so must be done in separate steps. Additionally,
JAVA_HOME must be set and often causes problems in selecting a correct
jdk when, for example, the system java is an older unsupported version.
This commit bundles platform specific openjdks into each distribution.
In addition to eliminating the issues above, it also presents future
possible improvements like using jlink to build jdk images only
containing modules that elasticsearch uses.
closes#31845
* Back port build changes from #39102
This back-ports how versions are determined and bwc test are set up from
#39102 without enabling the bwc from current version tests so it's
easier/possible to backmerge future buld changes.
It's expected that the tets are lacking many of the required fixes in
this version to enable them.
This commit adds a new build type (together with deb/rpm/tar/zip) to
represent the official Docker images. This build type will be displayed
in APIs such as the main and nodes info APIs.
With this commit we provide more info in an existing error message that is
raised when the file `jvm.dll` cannot be found on Windows when installing
Elasticsearch as a service.
This commit makes the rpm metadata indicate the pre 7.0 noarch packages
are obsoleted by this package. This fixes an issue where upgrading with
yum would cause an error thinking there was nothing to upgrade.
closes#39414
This commit sets the BWC projects to build in parallel if Gradle was
invoked with parallal project execution enabled. This substantially
speeds up the time of building the BWC projects since there are many
dependent projects needed to build a BWC version.
Finding java on the path is sometimes confusing for users and
unexpected, as well as leading to a different java being used than a
user expects. This commit adds warning messages when starting
elasticsearch (or any tools like the plugin cli) and using java found
on the PATH instead of via JAVA_HOME.
As the Dockerfile evolved we don't need anymore certain commands like
`unzip`, `which` and `wget` allowing us to slightly shrink the images.
Backport of: #39040
Currently init scripts fail when `/proc/sys/vm/max_map_count` is not present
with `-bash: [: too many arguments`.
Fix conditional logic to avoid trying to set the `max_map_count` sysctl if not
present.
Backport of: #35933
Relates: #27236
This commit enables the copyDockerfile task to render a Dockerfile that
sources the Elasticsearch binary from artifacts.elastic.co. This is
needed for reproducibility and transparency for the official Docker
images in the Docker library.
This commit adds the 7.1 version constant to the 7.x branch.
Co-authored-by: Andy Bristol <andy.bristol@elastic.co>
Co-authored-by: Tim Brooks <tim@uncontended.net>
Co-authored-by: Christoph Büscher <cbuescher@posteo.de>
Co-authored-by: Luca Cavanna <javanna@users.noreply.github.com>
Co-authored-by: markharwood <markharwood@gmail.com>
Co-authored-by: Ioannis Kakavas <ioannis@elastic.co>
Co-authored-by: Nhat Nguyen <nhat.nguyen@elastic.co>
Co-authored-by: David Roberts <dave.roberts@elastic.co>
Co-authored-by: Jason Tedor <jason@tedor.me>
Co-authored-by: Alpar Torok <torokalpar@gmail.com>
Co-authored-by: David Turner <david.turner@elastic.co>
Co-authored-by: Martijn van Groningen <martijn.v.groningen@gmail.com>
Co-authored-by: Tim Vernum <tim@adjective.org>
Co-authored-by: Albert Zaharovits <albert.zaharovits@gmail.com>
Renames the following settings to remove the mention of `zen` in their names:
- `discovery.zen.hosts_provider` -> `discovery.seed_providers`
- `discovery.zen.ping.unicast.concurrent_connects` -> `discovery.seed_resolver.max_concurrent_resolvers`
- `discovery.zen.ping.unicast.hosts.resolve_timeout` -> `discovery.seed_resolver.timeout`
- `discovery.zen.ping.unicast.hosts` -> `discovery.seed_addresses`
This commit adds classifiers to the distributions indicating the
OS (for archives) and platform. The current OSes are for windows, darwin (ie
macos) and linux. This change will allow future OS/architecture specific
changes to the distributions. Note the docs using distribution links
have been updated, but will be reworked in a followup to make OS
specific instructions for the archives.
In order to support JSON log format, a custom pattern layout was used and its configuration is enclosed in ESJsonLayout. Users are free to use their own patterns, but if smooth Beats integration is needed, they should use ESJsonLayout. EvilLoggerTests are left intact to make sure user's custom log patterns work fine.
To populate additional fields node.id and cluster.uuid which are not available at start time,
a cluster state update will have to be received and the values passed to log4j pattern converter.
A ClusterStateObserver.Listener is used to receive only one ClusteStateUpdate. Once update is received the nodeId and clusterUUid are set in a static field in a NodeAndClusterIdConverter.
Following fields are expected in JSON log lines: type, tiemstamp, level, component, cluster.name, node.name, node.id, cluster.uuid, message, stacktrace
see ESJsonLayout.java for more details and field descriptions
Docker log4j2 configuration is now almost the same as the one use for ES binary.
The only difference is that docker is using console appenders, whereas ES is using file appenders.
relates: #32850
The /etc/elasticsearch directory is currently configured as a config
file with noreplace. However, the directory itself is not config, and
can lead to an entire /etc/elasticsearch.rpmsave directory in some
situations. This commit fixes the ospackage config to not specify those
file bits for the directory itself, but only the files underneath it.
* Exit batch files explictly using ERRORLEVEL
This makes sure the exit code is preserved when calling the batch
files from different contexts other than DOS
Fixes#29582
This also fixes specific error codes being masked by an explict
exit /b 1
causing the useful exitcodes from ExitCodes to be lost.
* fix line breaks for calling cli to match the bash scripts
* indent size of bash files is 2, make sure editorconfig does the same for bat files
* update indenting to match bash files
* update elasticsearch-keystore.bat indenting
* Update elasticsearch-node.bat to exit outside of endlocal
elasticsearch-node tool helps to restore cluster if half or more of
master eligible nodes are lost. Of course, all bets are off, regarding
data consistency.
There are two parts of the tool: unsafe-bootstrap to be used when there
is still at least one master-eligible node alive and detach-cluster,
when there are no master-eligible nodes left.
This commit implements the first part.
Docs for the tool will be added separately as a part of #37812.
* Testing conventions now checks for tests in main
This is the last outstanding feature of the old NamingConventionsTask,
so time to remove it.
* PR review
This change adds a docker compose configuration that's used with
the `elasticsearch.test.fixtures` plugin to start up the image
and check that the TCP ports are up.
We can build on this to add other checks for culster health,
run REST tests, etc.
We can add multiple containers and configurations to the compose
file (e.x. test different env vars) and form clusters.
Currently integration tests which use either bwc snapshot versions or
the current version of elasticsearch depend on project substitutions to
link to the build of those artifacts. Likewise, vagrant tests use
dependency substitutions to get to bwc snapshots of rpm and debs.
This commit changes those to depend on the relevant project/configuration
and removes the dependency substitutions for distributions we do not
publish.
The integ tests currently use the raw zip project name as the
distribution type. This commit simplifies this specification to be
"default" or "oss". Whether zip or tar is used should be an internal
implementation detail of the integ test setup, which can (in the future)
be platform specific.
With the release of 11.0.2, the old URLs no longer work. This exposed a
few small bugs in the gradle config. One was that --no-cache was not
present in the docker build command, so it was not failing at
first. Then once only the ext.expansions was changed and the docker
build task was not, it was not executing it.
This commit updates the file docker's entrypoint script looks for when
deciding to process the ELASTIC_PASSWORD env var. The x-pack subdir
of bin no longer exists in 7.0, where the backcompat layer for x-pack
script locations was removed.
closes#37240
Some systems default to a nofile ulimit of 65535. To reduce the pain of
deploying Elasticsearch to such systems, this commit lowers the required
limit from 65536 to 65535.
This commit removes permission editing commands from the postinst
scriptlet. Instead, we now fully configure the owner/group (as well as
sticky bit) for these files and directories.
closes#37143
* Default include_type_name to false for get and put mappings.
* Default include_type_name to false for get field mappings.
* Add a constant for the default include_type_name value.
* Default include_type_name to false for get and put index templates.
* Default include_type_name to false for create index.
* Update create index calls in REST documentation to use include_type_name=true.
* Some minor clean-ups around the get index API.
* In REST tests, use include_type_name=true by default for index creation.
* Make sure to use 'expression == false'.
* Clarify the different IndexTemplateMetaData toXContent methods.
* Fix FullClusterRestartIT#testSnapshotRestore.
* Fix the ml_anomalies_default_mappings test.
* Fix GetFieldMappingsResponseTests and GetIndexTemplateResponseTests.
We make sure to specify include_type_name=true during xContent parsing,
so we continue to test the legacy typed responses. XContent generation
for the typeless responses is currently only covered by REST tests,
but we will be adding unit test coverage for these as we implement
each typeless API in the Java HLRC.
This commit also refactors GetMappingsResponse to follow the same appraoch
as the other mappings-related responses, where we read include_type_name
out of the xContent params, instead of creating a second toXContent method.
This gives better consistency in the response parsing code.
* Fix more REST tests.
* Improve some wording in the create index documentation.
* Add a note about types removal in the create index docs.
* Fix SmokeTestMonitoringWithSecurityIT#testHTTPExporterWithSSL.
* Make sure to mention include_type_name in the REST docs for affected APIs.
* Make sure to use 'expression == false' in FullClusterRestartIT.
* Mention include_type_name in the REST templates docs.
This commit makes the assemble tasks in the bwc projects noops by
setting the dependsOn directly. While we can not remove things from
dependsOn, we can still completely override the dependencies.
closes#33581
This commit adds a unique id to cluster blocks, so that they can be uniquely
identified if needed. This is important for the Close Index API where multiple
concurrent closing requests can be executed at the same time. By adding a
UUID to the cluster block, we can generate unique "closing block" that can
later be verified on shards and then checked again from the cluster state
before closing the index. When the verification on shard is done, the closing
block is replaced by the regular INDEX_CLOSED_BLOCK instance.
If something goes wrong, calling the Open Index API will remove the block.
Related to #33888
With this commit we instruct curl to retry with a backoff when
downloading the JDK for the Elasticsearch Docker image. This avoids
build failures on transient network issues. Note that this option
requires curl 7.12.3 or better.
Relates #37103
Relates #37113
We added some special handling for installing and removing the
ingest-geoip and ingest-user-agent plugins when we converted them to
modules. This special handling was done to minimize breaking users in a
minor release. However, do not want to maintain this behavior forever so
this commit removes that special handling in the master branch so that
starting with 7.0.0 this special handling will be gone.
* Deprecate types in index API
- deprecate type-based constructors of IndexRequest
- update tests to use typeless IndexRequest constructors
- no yaml tests as they have been already added in #35790
Relates to #35190
The following updates were made:
* Add deprecation warnings to `RestUpdateAction`, plus a test in `RestUpdateActionTests`.
* Deprecate relevant methods on the Java HLRC requests/ responses.
* Add HLRC integration tests for the typed APIs.
* Update documentation (for both the REST API and Java HLRC).
* Fix failing integration tests.
Because of an earlier PR, the REST yml tests were already updated (one version without types, and another legacy version that retains types).
The commit changes how indices are closed in the MetaDataIndexStateService.
It now uses a 3 steps process where writes are blocked on indices to be closed,
then some verifications are done on shards using the TransportVerifyShardBeforeCloseAction
added in #36249, and finally indices states are moved to CLOSE and their routing
tables removed.
The closing process also takes care of using the pre-7.0 way to close indices if the
cluster contains mixed version of nodes and a node does not support the TransportVerifyShardBeforeCloseAction. It also closes unassigned indices.
Related to #33888
When a security manager is present, the JVM will cache positive hostname
lookups indefinitely. This can be problematic, especially in the modern
world with cloud services where DNS addresses can change, or
environments using Docker containers where IP addresses could be
considered ephemeral. This behavior impacts cluster discovery,
cross-cluster replication and cross-cluster search, reindex from remote,
snapshot repositories, webhooks in Watcher, external authentication
mechanisms, and the Elastic Stack Monitoring Service. The experience of
watching a DNS lookup change yet not be reflected within Elasticsearch
is a poor experience for users. The reason the JVM has this is guard
against DNS cache posioning attacks. Yet, there is already a defense in
the modern world against such attacks: TLS. With proper certificate
validation, even if a resolver falls prey to a DNS cache poisoning
attack, using TLS would neuter the attack. Therefore we have a policy
with dubious security value that significantly impacts usability. As
such we make the usability/security tradeoff towards usability, since
the security risks are very low. This commit introduces new system
properties that Elasticsearch observes to override the JVM DNS cache
policy.
* Don't print download progress in batch mode
With this change we will no longer provide the progress bar in batch
mode.
Assuming that this is mode is mainly for consumption by tools which
will serialize the output, we shouldn't print a progress bar to be
for every percentile.
* PR review
For each API, the following updates were made:
- Add deprecation warnings to `Rest*Action`, plus tests in `Rest*ActionTests`.
- For each REST yml test, make sure there is one version without types, and another legacy version that retains types (called *_with_types.yml).
- Deprecate relevant methods on the Java HLRC requests/ responses.
- Update documentation (for both the REST API and Java HLRC).
This commit introduces the building of the Docker images as bonafide
packaging formats alongside our existing archive and packaging
distributions. This build is migrated from a dedicated repository, and
converted to Gradle in the process.
Currently is `java` is not in $PATH the preinst script fails
prematurely and prevents an appropriate message from getting displayed
to the user.
Make package installation more user friendly when java is not in
$PATH and add a test for it.
Also use a she-bang in the preinst script, as, at least in Debian,
maintainer scripts must start with the #! convention [1].
Relates #31845
[1] https://www.debian.org/doc/debian-policy/ch-maintainerscripts.html
In the long run we want to move all of startup to a Java program. This
will simplify our startup scripts and make maintenance of startup less
dependent on the underlying platform that we run on. This commit moves
the creation of the temporary directory off of system-dependent commands
and onto a simple Java program.
The list of official plugins accidentally included `qa` projects like,
well, `qa` and `amazon-ec2`. This changes the mechanism that we use to
build the list and adds a test to catch this.
Closes#35623
With this change, `Version` no longer carries information about the qualifier,
we still need a way to show the "display version" that does have both
qualifier and snapshot. This is now stored by the build and red from `META-INF`.
* Introduce property to set version qualifier
- VersionProperties.elasticsearch is now a string which can have qualifier
and snapshot too
- The Version class in the build no longer cares about snapshot and
qualifier.
This commit updates the procrun manager and service exes to 1.1.0. There
are a few bug fixes, including for a bug which can cause lingering
processes when removing the service.
Back in #32983 I broke running the integ-test-zip tests against an
external cluster by adding a test that reads the contents of the log
file. This fixes running against an external cluster by explicitly
skipping that test if running against an external cluster.
The BWC builds for the 6.x branch should be using JDK 11. This commit
fixes the BWC builds to specify that they use JDK 11 instead of JDK 10
which is now incompatible with the 6.x build.
To pass the HOSTNAME envrionment variable to the Windows service, we
have to add some command line flags to the service invocation. Namely,
we have to specify that we are passing HOSTNAME variable, and we will
pass for it the value of %%COMPUTERNAME%%. This ensures that if the
hostname is changed, we pick this up the next time that the service is
started. This change is needed for the service now that we use the
HOSTNAME as the default node name.
#32281 adds elasticsearch-shard to provide bwc version of elasticsearch-translog for 6.x; have to remove elasticsearch-translog for 7.0
Relates to #31389
When we implemented `refresh=wait_for` I added a test with the wrong
name. This caused us to not run it. The test asserted that running
several operations with `refresh=wait_for` did not fail if the index was
`_close`d while the operations were waiting. But to be honest, failure
here isn't that bad. The index being waited on is closed. You can't do
anything with it any way. The most important thing is actually that
these operations don't hang forever. Because hanging forever means that
the resources used by the operations aren't freed.
Anyway, when I noticed the error I reenabled the test. But they don't
pass consistently because *sometimes* the operations being tested fail.
They don't seem to hang and they always fail with "this index is closed
so you can't do anything with it" sorts of messages.
When the test started failing we disabled it again. This reenables the
test but causes it to ignore these "index is closed" failures. We'd
prefer they not happen at all but in the grand scheme of things they are
fine and making sure these operations don't hang is much more important.
This also updates the test to bring it more in line with my current
understanding of the "right" way to use the low level rest client.
* Add commented out JVM options for G1GC
These options are available now that we will be supporting G1GC for Java 10 and
above. They are also designed so that the CMS options don't have to be commented
out in order for the G1 options to take effect.
* Update wording
Changes the default of the `node.name` setting to the hostname of the
machine on which Elasticsearch is running. Previously it was the first 8
characters of the node id. This had the advantage of producing a unique
name even when the node name isn't configured but the disadvantage of
being unrecognizable and not being available until fairly late in the
startup process. Of particular interest is that it isn't available until
after logging is configured. This forces us to use a volatile read
whenever we add the node name to the log.
Using the hostname is available immediately on startup and is generally
recognizable but has the disadvantage of not being unique when run on
machines that don't set their hostname or when multiple elasticsearch
processes are run on the same host. I believe that, taken together, it
is better to default to the hostname.
1. Running multiple copies of Elasticsearch on the same node is a fairly
advanced feature. We do it all the as part of the elasticsearch build
for testing but we make sure to set the node name then.
2. That the node.name defaults to some flavor of "localhost" on an
unconfigured box feels like it isn't going to come up too much in
production. I expect most production deployments to at least set the
hostname.
As a bonus, production deployments need no longer set the node name in
most cases. At least in my experience most folks set it to the hostname
anyway.
I created a test a few days ago and declared a package that doesn't line
up with the directory structure. Oops. I a little surprised nothing
complained. But this fixes it.
I disabled one branch a few hours ago because it failed in CI. It looks
like other branches can also fail so I'll disable them as well and look
more closely on Monday.
Change the logging infrastructure to handle when the node name isn't
available in `elasticsearch.yml`. In that case the node name is not
available until long after logging is configured. The biggest change is
that the node name logging no longer fixed at pattern build time.
Instead it is read from a `SetOnce` on every print. If it is unset it is
printed as `unknown` so we have something that fits in the pattern.
On normal startup we don't log anything until the node name is available
so we never see the `unknown`s.
The main benefit of the upgrade for users is the search optimization for top scored documents when the total hit count is not needed. However this optimization is not activated in this change, there is another issue opened to discuss how it should be integrated smoothly.
Some comments about the change:
* Tests that can produce negative scores have been adapted but we need to forbid them completely: #33309Closes#32899
Gradle triggers the build of artifacts even if assemble is disabled.
Most users will not need bwc distributions after running `./gradlew
assemble` so instead of forcing them to add `-x buildBwcVersion`, we
detect this and skip the configuration of the artifacts.
- third party audit detects jar hell with JDK so we disable it
- jdk non portable in forbiddenapis detects classes being used from the
JDK ( for fips ) that are not portable, this is intended so we don't
scan for it on fips.
- different exclusion rules for third party audit on fips
Closes#33179
In #29623 we added `Request` object flavored requests to the low level
REST client and in #30315 we deprecated the old `performRequest`s. This
changes all calls in the `client` and `distribution` projects to use
the new versions.
On some Linux distributions tmpfiles.d cleans files and
directories under /tmp if they haven't been accessed for
10 days.
This can cause problems for ML as ML is currently the only
component that uses the temp directory more than a few
seconds after startup. If you didn't open an ML job for
10 days and then tried to open one then the temp directory
would have been deleted.
This commit prevents the problem occurring in the case of
Elasticsearch being managed by systemd, as systemd private
temp directories are not subject to periodic cleanup (by
default).
Additionally there are now some docs to warn people about
the risk and suggest a manual mitigation for .tar.gz users.
First, some background: we have 15 different methods to get a logger in
Elasticsearch but they can be broken down into three broad categories
based on what information is provided when building the logger.
Just a class like:
```
private static final Logger logger = ESLoggerFactory.getLogger(ActionModule.class);
```
or:
```
protected final Logger logger = Loggers.getLogger(getClass());
```
The class and settings:
```
this.logger = Loggers.getLogger(getClass(), settings);
```
Or more information like:
```
Loggers.getLogger("index.store.deletes", settings, shardId)
```
The goal of the "class and settings" variant is to attach the node name
to the logger. Because we don't always have the settings available, we
often use the "just a class" variant and get loggers without node names
attached. There isn't any real consistency here. Some loggers get the
node name because it is convenient and some do not.
This change makes the node name available to all loggers all the time.
Almost. There are some caveats are testing that I'll get to. But in
*production* code the node name is node available to all loggers. This
means we can stop using the "class and settings" variants to fetch
loggers which was the real goal here, but a pleasant side effect is that
the ndoe name is now consitent on every log line and optional by editing
the logging pattern. This is all powered by setting the node name
statically on a logging formatter very early in initialization.
Now to tests: tests can't set the node name statically because
subclasses of `ESIntegTestCase` run many nodes in the same jvm, even in
the same class loader. Also, lots of tests don't run with a real node so
they don't *have* a node name at all. To support multiple nodes in the
same JVM tests suss out the node name from the thread name which works
surprisingly well and easy to test in a nice way. For those threads
that are not part of an `ESIntegTestCase` node we stick whatever useful
information we can get form the thread name in the place of the node
name. This allows us to keep the logger format consistent.
Explicitly include all subdirectories of these folders in
/usr/share/elasticsearch in package distributions so that they are
managed by the package manager. This change does really have an
effect in the 7.x series, where there are no subdirectories in bin, and
we were already doing this in lib and modules. It does have an effect in
the 6.x series where the bin/x-pack subdirectory was not previously
tracked by the package manager and could be left behind on removal in
rpm distributions.
* Remove BouncyCastle dependency from runtime
This commit introduces a new gradle project that contains
the classes that have a dependency on BouncyCastle. For
the default distribution, It builds a jar from those and
in puts it in a subdirectory of lib
(/tools/security-cli) along with the BouncyCastle jars.
This directory is then passed in the
ES_ADDITIONAL_CLASSPATH_DIRECTORIES of the CLI tools
that use these classes.
BouncyCastle is removed as a runtime dependency (remains
as a compileOnly one) from x-pack core and x-pack security.
In #29623 we added `Request` object flavored requests to the low level
REST client and in #30315 we deprecated the old `performRequest`s. This
changes all calls in the `distribution/archives/integ-test-zip` project
to use the new versions.
The C2 compiler in JDK 10 appears to have an issue compiling to AVX-512
instructions (on hardware that supports such). As a workaround, this
commit adds a JVM flag on JDK 10+ to disable the use of AVX-512
instructions until a fix is introduced to the JDK. Instead, we use a
flag to enable AVX and AVX2 only.
Note: Based on my reading of the C2 code, this flag does not appear to
have any impact on hardware that does not support AVX2. I have tested
this manually on an Intel Atom C2538 processor that supports neither AVX
nor AVX2. I have also tested this manually on an Intel i5-3317U
processor that supports AVX but not AVX2.
Ensure our tests can run in a FIPS JVM
JKS keystores cannot be used in a FIPS JVM as attempting to use one
in order to init a KeyManagerFactory or a TrustManagerFactory is not
allowed.( JKS keystore algorithms for private key encryption are not
FIPS 140 approved)
This commit replaces JKS keystores in our tests with the
corresponding PEM encoded key and certificates both for key and trust
configurations.
Whenever it's not possible to refactor the test, i.e. when we are
testing that we can load a JKS keystore, etc. we attempt to
mute the test when we are running in FIPS 140 JVM. Testing for the
JVM is naive and is based on the name of the security provider as
we would control the testing infrastrtucture and so this would be
reliable enough.
Other cases of tests being muted are the ones that involve custom
TrustStoreManagers or KeyStoreManagers, null TLS Ciphers and the
SAMLAuthneticator class as we cannot sign XML documents in the
way we were doing. SAMLAuthenticator tests in a FIPS JVM can be
reenabled with precomputed and signed SAML messages at a later stage.
IT will be covered in a subsequent PR