elasticsearch-node tool helps to restore cluster if half or more of
master eligible nodes are lost. Of course, all bets are off, regarding
data consistency.
There are two parts of the tool: unsafe-bootstrap to be used when there
is still at least one master-eligible node alive and detach-cluster,
when there are no master-eligible nodes left.
This commit implements the first part.
Docs for the tool will be added separately as a part of #37812.
* Testing conventions now checks for tests in main
This is the last outstanding feature of the old NamingConventionsTask,
so time to remove it.
* PR review
This change adds a docker compose configuration that's used with
the `elasticsearch.test.fixtures` plugin to start up the image
and check that the TCP ports are up.
We can build on this to add other checks for culster health,
run REST tests, etc.
We can add multiple containers and configurations to the compose
file (e.x. test different env vars) and form clusters.
Currently integration tests which use either bwc snapshot versions or
the current version of elasticsearch depend on project substitutions to
link to the build of those artifacts. Likewise, vagrant tests use
dependency substitutions to get to bwc snapshots of rpm and debs.
This commit changes those to depend on the relevant project/configuration
and removes the dependency substitutions for distributions we do not
publish.
The integ tests currently use the raw zip project name as the
distribution type. This commit simplifies this specification to be
"default" or "oss". Whether zip or tar is used should be an internal
implementation detail of the integ test setup, which can (in the future)
be platform specific.
With the release of 11.0.2, the old URLs no longer work. This exposed a
few small bugs in the gradle config. One was that --no-cache was not
present in the docker build command, so it was not failing at
first. Then once only the ext.expansions was changed and the docker
build task was not, it was not executing it.
This commit updates the file docker's entrypoint script looks for when
deciding to process the ELASTIC_PASSWORD env var. The x-pack subdir
of bin no longer exists in 7.0, where the backcompat layer for x-pack
script locations was removed.
closes#37240
Some systems default to a nofile ulimit of 65535. To reduce the pain of
deploying Elasticsearch to such systems, this commit lowers the required
limit from 65536 to 65535.
This commit removes permission editing commands from the postinst
scriptlet. Instead, we now fully configure the owner/group (as well as
sticky bit) for these files and directories.
closes#37143
* Default include_type_name to false for get and put mappings.
* Default include_type_name to false for get field mappings.
* Add a constant for the default include_type_name value.
* Default include_type_name to false for get and put index templates.
* Default include_type_name to false for create index.
* Update create index calls in REST documentation to use include_type_name=true.
* Some minor clean-ups around the get index API.
* In REST tests, use include_type_name=true by default for index creation.
* Make sure to use 'expression == false'.
* Clarify the different IndexTemplateMetaData toXContent methods.
* Fix FullClusterRestartIT#testSnapshotRestore.
* Fix the ml_anomalies_default_mappings test.
* Fix GetFieldMappingsResponseTests and GetIndexTemplateResponseTests.
We make sure to specify include_type_name=true during xContent parsing,
so we continue to test the legacy typed responses. XContent generation
for the typeless responses is currently only covered by REST tests,
but we will be adding unit test coverage for these as we implement
each typeless API in the Java HLRC.
This commit also refactors GetMappingsResponse to follow the same appraoch
as the other mappings-related responses, where we read include_type_name
out of the xContent params, instead of creating a second toXContent method.
This gives better consistency in the response parsing code.
* Fix more REST tests.
* Improve some wording in the create index documentation.
* Add a note about types removal in the create index docs.
* Fix SmokeTestMonitoringWithSecurityIT#testHTTPExporterWithSSL.
* Make sure to mention include_type_name in the REST docs for affected APIs.
* Make sure to use 'expression == false' in FullClusterRestartIT.
* Mention include_type_name in the REST templates docs.
This commit makes the assemble tasks in the bwc projects noops by
setting the dependsOn directly. While we can not remove things from
dependsOn, we can still completely override the dependencies.
closes#33581
This commit adds a unique id to cluster blocks, so that they can be uniquely
identified if needed. This is important for the Close Index API where multiple
concurrent closing requests can be executed at the same time. By adding a
UUID to the cluster block, we can generate unique "closing block" that can
later be verified on shards and then checked again from the cluster state
before closing the index. When the verification on shard is done, the closing
block is replaced by the regular INDEX_CLOSED_BLOCK instance.
If something goes wrong, calling the Open Index API will remove the block.
Related to #33888
With this commit we instruct curl to retry with a backoff when
downloading the JDK for the Elasticsearch Docker image. This avoids
build failures on transient network issues. Note that this option
requires curl 7.12.3 or better.
Relates #37103
Relates #37113
We added some special handling for installing and removing the
ingest-geoip and ingest-user-agent plugins when we converted them to
modules. This special handling was done to minimize breaking users in a
minor release. However, do not want to maintain this behavior forever so
this commit removes that special handling in the master branch so that
starting with 7.0.0 this special handling will be gone.
* Deprecate types in index API
- deprecate type-based constructors of IndexRequest
- update tests to use typeless IndexRequest constructors
- no yaml tests as they have been already added in #35790
Relates to #35190
The following updates were made:
* Add deprecation warnings to `RestUpdateAction`, plus a test in `RestUpdateActionTests`.
* Deprecate relevant methods on the Java HLRC requests/ responses.
* Add HLRC integration tests for the typed APIs.
* Update documentation (for both the REST API and Java HLRC).
* Fix failing integration tests.
Because of an earlier PR, the REST yml tests were already updated (one version without types, and another legacy version that retains types).
The commit changes how indices are closed in the MetaDataIndexStateService.
It now uses a 3 steps process where writes are blocked on indices to be closed,
then some verifications are done on shards using the TransportVerifyShardBeforeCloseAction
added in #36249, and finally indices states are moved to CLOSE and their routing
tables removed.
The closing process also takes care of using the pre-7.0 way to close indices if the
cluster contains mixed version of nodes and a node does not support the TransportVerifyShardBeforeCloseAction. It also closes unassigned indices.
Related to #33888
When a security manager is present, the JVM will cache positive hostname
lookups indefinitely. This can be problematic, especially in the modern
world with cloud services where DNS addresses can change, or
environments using Docker containers where IP addresses could be
considered ephemeral. This behavior impacts cluster discovery,
cross-cluster replication and cross-cluster search, reindex from remote,
snapshot repositories, webhooks in Watcher, external authentication
mechanisms, and the Elastic Stack Monitoring Service. The experience of
watching a DNS lookup change yet not be reflected within Elasticsearch
is a poor experience for users. The reason the JVM has this is guard
against DNS cache posioning attacks. Yet, there is already a defense in
the modern world against such attacks: TLS. With proper certificate
validation, even if a resolver falls prey to a DNS cache poisoning
attack, using TLS would neuter the attack. Therefore we have a policy
with dubious security value that significantly impacts usability. As
such we make the usability/security tradeoff towards usability, since
the security risks are very low. This commit introduces new system
properties that Elasticsearch observes to override the JVM DNS cache
policy.
* Don't print download progress in batch mode
With this change we will no longer provide the progress bar in batch
mode.
Assuming that this is mode is mainly for consumption by tools which
will serialize the output, we shouldn't print a progress bar to be
for every percentile.
* PR review
For each API, the following updates were made:
- Add deprecation warnings to `Rest*Action`, plus tests in `Rest*ActionTests`.
- For each REST yml test, make sure there is one version without types, and another legacy version that retains types (called *_with_types.yml).
- Deprecate relevant methods on the Java HLRC requests/ responses.
- Update documentation (for both the REST API and Java HLRC).
This commit introduces the building of the Docker images as bonafide
packaging formats alongside our existing archive and packaging
distributions. This build is migrated from a dedicated repository, and
converted to Gradle in the process.
Currently is `java` is not in $PATH the preinst script fails
prematurely and prevents an appropriate message from getting displayed
to the user.
Make package installation more user friendly when java is not in
$PATH and add a test for it.
Also use a she-bang in the preinst script, as, at least in Debian,
maintainer scripts must start with the #! convention [1].
Relates #31845
[1] https://www.debian.org/doc/debian-policy/ch-maintainerscripts.html
In the long run we want to move all of startup to a Java program. This
will simplify our startup scripts and make maintenance of startup less
dependent on the underlying platform that we run on. This commit moves
the creation of the temporary directory off of system-dependent commands
and onto a simple Java program.
The list of official plugins accidentally included `qa` projects like,
well, `qa` and `amazon-ec2`. This changes the mechanism that we use to
build the list and adds a test to catch this.
Closes#35623
With this change, `Version` no longer carries information about the qualifier,
we still need a way to show the "display version" that does have both
qualifier and snapshot. This is now stored by the build and red from `META-INF`.
* Introduce property to set version qualifier
- VersionProperties.elasticsearch is now a string which can have qualifier
and snapshot too
- The Version class in the build no longer cares about snapshot and
qualifier.
This commit updates the procrun manager and service exes to 1.1.0. There
are a few bug fixes, including for a bug which can cause lingering
processes when removing the service.
Back in #32983 I broke running the integ-test-zip tests against an
external cluster by adding a test that reads the contents of the log
file. This fixes running against an external cluster by explicitly
skipping that test if running against an external cluster.
The BWC builds for the 6.x branch should be using JDK 11. This commit
fixes the BWC builds to specify that they use JDK 11 instead of JDK 10
which is now incompatible with the 6.x build.
To pass the HOSTNAME envrionment variable to the Windows service, we
have to add some command line flags to the service invocation. Namely,
we have to specify that we are passing HOSTNAME variable, and we will
pass for it the value of %%COMPUTERNAME%%. This ensures that if the
hostname is changed, we pick this up the next time that the service is
started. This change is needed for the service now that we use the
HOSTNAME as the default node name.
#32281 adds elasticsearch-shard to provide bwc version of elasticsearch-translog for 6.x; have to remove elasticsearch-translog for 7.0
Relates to #31389
When we implemented `refresh=wait_for` I added a test with the wrong
name. This caused us to not run it. The test asserted that running
several operations with `refresh=wait_for` did not fail if the index was
`_close`d while the operations were waiting. But to be honest, failure
here isn't that bad. The index being waited on is closed. You can't do
anything with it any way. The most important thing is actually that
these operations don't hang forever. Because hanging forever means that
the resources used by the operations aren't freed.
Anyway, when I noticed the error I reenabled the test. But they don't
pass consistently because *sometimes* the operations being tested fail.
They don't seem to hang and they always fail with "this index is closed
so you can't do anything with it" sorts of messages.
When the test started failing we disabled it again. This reenables the
test but causes it to ignore these "index is closed" failures. We'd
prefer they not happen at all but in the grand scheme of things they are
fine and making sure these operations don't hang is much more important.
This also updates the test to bring it more in line with my current
understanding of the "right" way to use the low level rest client.
* Add commented out JVM options for G1GC
These options are available now that we will be supporting G1GC for Java 10 and
above. They are also designed so that the CMS options don't have to be commented
out in order for the G1 options to take effect.
* Update wording
Changes the default of the `node.name` setting to the hostname of the
machine on which Elasticsearch is running. Previously it was the first 8
characters of the node id. This had the advantage of producing a unique
name even when the node name isn't configured but the disadvantage of
being unrecognizable and not being available until fairly late in the
startup process. Of particular interest is that it isn't available until
after logging is configured. This forces us to use a volatile read
whenever we add the node name to the log.
Using the hostname is available immediately on startup and is generally
recognizable but has the disadvantage of not being unique when run on
machines that don't set their hostname or when multiple elasticsearch
processes are run on the same host. I believe that, taken together, it
is better to default to the hostname.
1. Running multiple copies of Elasticsearch on the same node is a fairly
advanced feature. We do it all the as part of the elasticsearch build
for testing but we make sure to set the node name then.
2. That the node.name defaults to some flavor of "localhost" on an
unconfigured box feels like it isn't going to come up too much in
production. I expect most production deployments to at least set the
hostname.
As a bonus, production deployments need no longer set the node name in
most cases. At least in my experience most folks set it to the hostname
anyway.
I created a test a few days ago and declared a package that doesn't line
up with the directory structure. Oops. I a little surprised nothing
complained. But this fixes it.
I disabled one branch a few hours ago because it failed in CI. It looks
like other branches can also fail so I'll disable them as well and look
more closely on Monday.
Change the logging infrastructure to handle when the node name isn't
available in `elasticsearch.yml`. In that case the node name is not
available until long after logging is configured. The biggest change is
that the node name logging no longer fixed at pattern build time.
Instead it is read from a `SetOnce` on every print. If it is unset it is
printed as `unknown` so we have something that fits in the pattern.
On normal startup we don't log anything until the node name is available
so we never see the `unknown`s.
The main benefit of the upgrade for users is the search optimization for top scored documents when the total hit count is not needed. However this optimization is not activated in this change, there is another issue opened to discuss how it should be integrated smoothly.
Some comments about the change:
* Tests that can produce negative scores have been adapted but we need to forbid them completely: #33309Closes#32899
Gradle triggers the build of artifacts even if assemble is disabled.
Most users will not need bwc distributions after running `./gradlew
assemble` so instead of forcing them to add `-x buildBwcVersion`, we
detect this and skip the configuration of the artifacts.
- third party audit detects jar hell with JDK so we disable it
- jdk non portable in forbiddenapis detects classes being used from the
JDK ( for fips ) that are not portable, this is intended so we don't
scan for it on fips.
- different exclusion rules for third party audit on fips
Closes#33179
In #29623 we added `Request` object flavored requests to the low level
REST client and in #30315 we deprecated the old `performRequest`s. This
changes all calls in the `client` and `distribution` projects to use
the new versions.
On some Linux distributions tmpfiles.d cleans files and
directories under /tmp if they haven't been accessed for
10 days.
This can cause problems for ML as ML is currently the only
component that uses the temp directory more than a few
seconds after startup. If you didn't open an ML job for
10 days and then tried to open one then the temp directory
would have been deleted.
This commit prevents the problem occurring in the case of
Elasticsearch being managed by systemd, as systemd private
temp directories are not subject to periodic cleanup (by
default).
Additionally there are now some docs to warn people about
the risk and suggest a manual mitigation for .tar.gz users.
First, some background: we have 15 different methods to get a logger in
Elasticsearch but they can be broken down into three broad categories
based on what information is provided when building the logger.
Just a class like:
```
private static final Logger logger = ESLoggerFactory.getLogger(ActionModule.class);
```
or:
```
protected final Logger logger = Loggers.getLogger(getClass());
```
The class and settings:
```
this.logger = Loggers.getLogger(getClass(), settings);
```
Or more information like:
```
Loggers.getLogger("index.store.deletes", settings, shardId)
```
The goal of the "class and settings" variant is to attach the node name
to the logger. Because we don't always have the settings available, we
often use the "just a class" variant and get loggers without node names
attached. There isn't any real consistency here. Some loggers get the
node name because it is convenient and some do not.
This change makes the node name available to all loggers all the time.
Almost. There are some caveats are testing that I'll get to. But in
*production* code the node name is node available to all loggers. This
means we can stop using the "class and settings" variants to fetch
loggers which was the real goal here, but a pleasant side effect is that
the ndoe name is now consitent on every log line and optional by editing
the logging pattern. This is all powered by setting the node name
statically on a logging formatter very early in initialization.
Now to tests: tests can't set the node name statically because
subclasses of `ESIntegTestCase` run many nodes in the same jvm, even in
the same class loader. Also, lots of tests don't run with a real node so
they don't *have* a node name at all. To support multiple nodes in the
same JVM tests suss out the node name from the thread name which works
surprisingly well and easy to test in a nice way. For those threads
that are not part of an `ESIntegTestCase` node we stick whatever useful
information we can get form the thread name in the place of the node
name. This allows us to keep the logger format consistent.
Explicitly include all subdirectories of these folders in
/usr/share/elasticsearch in package distributions so that they are
managed by the package manager. This change does really have an
effect in the 7.x series, where there are no subdirectories in bin, and
we were already doing this in lib and modules. It does have an effect in
the 6.x series where the bin/x-pack subdirectory was not previously
tracked by the package manager and could be left behind on removal in
rpm distributions.
* Remove BouncyCastle dependency from runtime
This commit introduces a new gradle project that contains
the classes that have a dependency on BouncyCastle. For
the default distribution, It builds a jar from those and
in puts it in a subdirectory of lib
(/tools/security-cli) along with the BouncyCastle jars.
This directory is then passed in the
ES_ADDITIONAL_CLASSPATH_DIRECTORIES of the CLI tools
that use these classes.
BouncyCastle is removed as a runtime dependency (remains
as a compileOnly one) from x-pack core and x-pack security.
In #29623 we added `Request` object flavored requests to the low level
REST client and in #30315 we deprecated the old `performRequest`s. This
changes all calls in the `distribution/archives/integ-test-zip` project
to use the new versions.
The C2 compiler in JDK 10 appears to have an issue compiling to AVX-512
instructions (on hardware that supports such). As a workaround, this
commit adds a JVM flag on JDK 10+ to disable the use of AVX-512
instructions until a fix is introduced to the JDK. Instead, we use a
flag to enable AVX and AVX2 only.
Note: Based on my reading of the C2 code, this flag does not appear to
have any impact on hardware that does not support AVX2. I have tested
this manually on an Intel Atom C2538 processor that supports neither AVX
nor AVX2. I have also tested this manually on an Intel i5-3317U
processor that supports AVX but not AVX2.
Ensure our tests can run in a FIPS JVM
JKS keystores cannot be used in a FIPS JVM as attempting to use one
in order to init a KeyManagerFactory or a TrustManagerFactory is not
allowed.( JKS keystore algorithms for private key encryption are not
FIPS 140 approved)
This commit replaces JKS keystores in our tests with the
corresponding PEM encoded key and certificates both for key and trust
configurations.
Whenever it's not possible to refactor the test, i.e. when we are
testing that we can load a JKS keystore, etc. we attempt to
mute the test when we are running in FIPS 140 JVM. Testing for the
JVM is naive and is based on the name of the security provider as
we would control the testing infrastrtucture and so this would be
reliable enough.
Other cases of tests being muted are the ones that involve custom
TrustStoreManagers or KeyStoreManagers, null TLS Ciphers and the
SAMLAuthneticator class as we cannot sign XML documents in the
way we were doing. SAMLAuthenticator tests in a FIPS JVM can be
reenabled with precomputed and signed SAML messages at a later stage.
IT will be covered in a subsequent PR
We mistakenly enabled bundling of the default distribution's bin scripts
into the `integ-test-zip` artifact used by plugin authors to test plugins.
These didn't change the version of Elasticsearch used for testing but as
a side effect changed the LICENSE.txt from the Apache 2 license to the
Elastic license. We really didn't mean for that to happen. The bin script
and the elasticsearch-sql-cli jar file bundled into the distribution are
indeed governed by the Elastic license but we didn't intend for them to be
in the testing artifact in the first place. This removes them and fixes
the license of the `integ-test-zip` artifact.
* Upgrade bouncycastle
Required to fix
`bcprov-jdk15on-1.55.jar; invalid manifest format `
on jdk 11
* Downgrade bouncycastle to avoid invalid manifest
* Add checksum for new jars
* Update tika permissions for jdk 11
* Mute test failing on jdk 11
* Add JDK11 to CI
* Thread#stop(Throwable) was removed
http://mail.openjdk.java.net/pipermail/core-libs-dev/2018-June/053536.html
* Disable failing tests #31456
* Temprorarily disable doc tests
To see if there are other failures on JDK11
* Only blacklist specific doc tests
* Disable only failing tests in ingest attachment plugin
* Mute failing HDFS tests #31498
* Mute failing lang-painless tests #31500
* Fix backwards compatability builds
Fix JAVA version to 10 for ES 6.3
* Add 6.x to bwx -> java10
* Prefix out and err from buildBwcVersion for readability
```
> Task :distribution:bwc:next-bugfix-snapshot:buildBwcVersion
[bwc] :buildSrc:compileJava
[bwc] WARNING: An illegal reflective access operation has occurred
[bwc] WARNING: Illegal reflective access by org.codehaus.groovy.reflection.CachedClass (file:/home/alpar/.gradle/wrapper/dists/gradle-4.5-all/cg9lyzfg3iwv6fa00os9gcgj4/gradle-4.5/lib/groovy-all-2.4.12.jar) to method java.lang.Object.finalize()
[bwc] WARNING: Please consider reporting this to the maintainers of org.codehaus.groovy.reflection.CachedClass
[bwc] WARNING: Use --illegal-access=warn to enable warnings of further illegal reflective access operations
[bwc] WARNING: All illegal access operations will be denied in a future release
[bwc] :buildSrc:compileGroovy
[bwc] :buildSrc:writeVersionProperties
[bwc] :buildSrc:processResources
[bwc] :buildSrc:classes
[bwc] :buildSrc:jar
```
* Also set RUNTIME_JAVA_HOME for bwcBuild
So that we can make sure it's not too new for the build to understand.
* Align bouncycastle dependency
* fix painles array tets
closes#31500
* Update jar checksums
* Keep 8/10 runtime/compile untill consensus builds on 11
* Only skip failing tests if running on Java 11
* Failures are dependent of compile java version not runtime
* Condition doc test exceptions on compiler java version as well
* Disable hdfs tests based on runtime java
* Set runtime java to minimum supported for bwc
* PR review
* Add comment with ticket for forbidden apis
So the issue here is that we want to avoid setting vm.max_map_count if
it is already equal to the desired value (the bootstrap check requires
262144). The reason we want to avoid this is because in some use-cases
using sysctl to set this will fail. In this case, we want to enable
users to set this value externally and then allow that to cause using
sysctl to set the value to be skipped so that cases where using sysctl
will fail to no longer fail.
The package installation relies on java being in the path. If java is
not in the path, the tests fail at post-install time. This commit adds a
pre-install check to validate that java exists, and if it fails, the
package is never installed, and thus keeps a system clean, rather than
aborting at post-install and leaving behind a mess.
Closes#29665
With this commit we add the possibility to define further JVM options (and
system properties) based on the current environment. As a proof of concept, it
chooses Netty's allocator ergonomically based on the maximum defined heap size.
We switch to the unpooled allocator at 1GB heap size (value determined
experimentally, see #30684 for more details). We are also explicit about the
choice of the allocator in either case.
Relates #30684
This commit modifies the Sys V init startup scripts to only modify
vm.max_map_count if needed. In this case, needed means that the current
value is less than our default value of 262144 maps.
For 6.3 we renamed the `tar` and `zip` distributions to `oss-tar` and
`oss-zip`. Then we added new `tar` and `zip` distributions that contain
x-pack and are licensed under the Elastic License. Unfortunately we
accidentally generated POM files along side the new `tar` and `zip`
distributions that incorrectly claimed that they were Apache 2 licensed.
Oooops.
This fixes the license on the POMs generated for the `tar` and `zip`
distributions.
This was silly; Bouncy Castle has an armored input stream for reading
keys in ASCII armor format. This means that we do not need to strip the
header ourselves and base64 decode the key. This had problems anyway
because of discrepancies in the padding that Bouncy Castle would produce
and the JDK base64 decoder was expecting. Now that we armor input/output
the whole way during tests, we fix all random failures in test cases
too.
The java version checker requires being written with java 7 APIs.
In order to use java 8 apis in other launcher utilities, this commit
moves the java version checker back to its own jar.
This commit moves the default location of the full dependencies report
to be under the reports directory to align it with the location for the
dependenciesInfo task output.
A previous commit tried to add task dependencies for the
:distribution:generateDependenciesReport task so that a user did not
have to run "dependenciesInfo
:distribution:generateDependenciesReport". However this method did not
reliably add all task dependencies due to task ordering issues in
previous versions of Gradle and our build. This commit removes this for
now and a user will continue to have to run "dependenciesInfo
:distribution:generateDependenciesReport".
The goal of this commit is to address unknown licenses when producing
the dependencies info report. We have two different checks that we run
on licenses. The first check is whether or not we have stashed a copy of
the license text for a dependency in the repository. The second is to
map every dependency to a license type (e.g., BSD 3-clause). The problem
here is that the way we were handling licenses in the second check
differs from how we handle licenses in the first check. The first check
works by finding a license file with the name of the artifact followed
by the text -LICENSE.txt. Yet in some cases we allow mapping an artifact
name to another name used to check for the license (e.g., we map
lucene-.* to lucene, and opensaml-.* to shibboleth. The second check
understood the first way of looking for a license file but not the
second way. So in this commit we teach the second check about the
mappings from artifact names to license names. We do this by copying the
configuration from the dependencyLicenses task to the dependenciesInfo
task and then reusing the code from the first check in the second
check. There were some other challenges here though. For example,
dependenciesInfo was checking too many dependencies. For now, we should
only be checking direct dependencies and leaving transitive dependencies
from another org.elasticsearch artifact to that artifact (we want to do
this differently in a follow-up). We also want to disable
dependenciesInfo for projects that we do not publish, users only care
about licenses they might be exposed to if they use our assembled
products. With all of the changes in this commit we have eliminated all
unknown licenses. A follow-up will enforce that when we add a new
dependency it does not get mapped to unknown, these will be forbidden in
the future. Therefore, with this change and earlier changes are left
having no unknown licenses and two custom licenses; custom here means it
does not map to an SPDX license type. Those two licenses are xz and
ldapsdk. A future change will not allow additional custom licenses
unless they are explicitly whitelisted. This ensures that if a new
dependency is added it is mapped to an SPDX license or mapped to custom
because it does not have an SPDX license.
We no longer need animal sniffer because we use JDK functionality
(introduced in JDK 9) to target older versions of the JDK for
compilation. This functionality means that the JDK handles the problem
of ensuring that we do not use JDK APIs from the version that we are
compiling from that are not available in the version that we are
compiling to. A previous commit removed this for the REST client (where
we target JDK 7) but a few traces were left behind.
This commit adjusts the indentation in the CLI scripts to give a clear
visual indication that the line being indented is a continuation of the
previous line.
A previous refactoring of the CLI scripts migrated all of the CLI tools
to shell to a common script, elasticsearch-cli. This approach is fine in
Bash where it is easy to tear arguments apart but it doesn't work so
well on Windows where quoting is insane. To avoid having to tear the
arguments apart to separate the first argument to elasticsearch-cli from
the remaining arguments, we instead choose a strategy where we can avoid
tearing the arguments apart. To do this, we will instead pass the main
class by an environment variable and then we can pass the arguments
straight through. This will let us avoid awful quoting issues on
Windows. This is the Windows side of that effort and the Bash side was
in a previous commit.
A previous refactoring of the CLI scripts migrated all of the CLI tools
to shell to a common script, elasticsearch-cli. This approach is fine in
Bash where it is easy to tear arguments apart but it doesn't work so
well on Windows where quoting is insane. To avoid having to tear the
arguments apart to separate the first argument to elasticsearch-cli from
the remaining arguments, we instead choose a strategy where we can avoid
tearing the arguments apart. To do this, we will instead pass the main
class by an environment variable and then we can pass the arguments
straight through. This will let us avoid awful quoting issues on
Windows. This is the non-Windows side of that effort and the Windows
side will be in a follow-up.
If you invoke elasticsearch-plugin (or any other CLI script on Windows)
with a path that has a percent-encoded space (or any other
percent-encoded character) because the CLI scripts now shell into a
common shell script (elasticsearch-cli) the percent-encoded space ends
up being interpreted as a parameter. For example passing install --batch
file:/c:/encoded%20%space/analysis-icu-7.0.0.zip to elasticsearch-plugin
leads to the %20 being interpreted as %2 followed by a zero. Here, the
%2 is interpreted as the second parameter (--batch) and the
InstallPluginCommand class ends up seeing
file:/c/encoded--batch0space/analysis-icu-7.0.0.zip as the path which
will not exist. This commit addresses this by escaping the %* that is
used to pass the parameters to the common CLI script so that the common
script sees the correct parameters without the %2 being substituted.
Applies default file and directory permissions to zip distributions
similar to how they're set for the tar distributions. Previously zip
distributions would retain permissions they had on the build host's
working tree, which could vary depending on its umask
For #30799
A previous commit added the public key used for signing artifacts to the
plugin CLI. This commit is an iteration on that to add the header and
footer to the key so that it is clear what the key is. Instead, we strip
the header/footer on read. With this change we simplify our test where
keys already in this format are generated and we had to strip on the
test side.
We sign our official plugins yet this is not well-advertised and not at
all consumed during plugin installation. For plugins that are installed
over the intertubes, verifying that the downloaded artifact is signed by
our signing key would establish both integrity and validity of the
downloaded artifact. The chain of trust here is simple: our installable
artifacts (archive and package distributions) so that if a user trusts
our packages via their signatures, and our plugin installer (which would
be executing trusted code) verifies the downloaded plugin, then the user
can trust the downloaded plugin too. This commit adds verification of
official plugins downloaded during installation. We do not add
verification for offline plugin installs; a user can download our
signatures and verify the artifacts themselves.
This commit also needs to solve a few interesting challenges. One of
these is that we want the bouncy castle JARs on the classpath only for
the plugin installer, but not for the runtime
Elasticsearch. Additionally, we want these JARs to not be present for
the JAR hell checks. To address this, we shift these JARs into a
sub-directory of lib (lib/tools/plugin-cli) that is only loaded for the
plugin installer, and in the plugin installer we filter any JARs in this
directory from the JAR hell check.
If you have an unusual umask (e.g., 0002) and clone the GitHub
repository then files that we stick into our packages like the
README.textile and the license will have a file mode of 0664 on disk yet
we expect them to be 0644. Additionally, the same thing happens with
compiled artifacts like JARs. We try to set a default file mode yet it
does not seem to take everywhere. This commit adds explicit file modes
in some places that we were relying on the defaults to ensure that the
built artifacts have a consistent file mode regardless of the underlying
build host.
This commit reduces the Windows CLI scripts to one-liners by moving all
of the redundant logic to an elasticsearch-cli script. This commit is
only the Windows side, a previous commit covered the Linux side.
We post snapshot builds to snapshots.elastic.co yet the official plugin
installer will not let you install such plugins without manually
downloading them and installing them from a file URL. This commit adds
the ability for the plugin installer to use snapshots.elastic.co for
installing official plugins if a es.plugins.staging is set and the
current build is also a snapshot build. Otherwise, we continue to use
staging.elastic.co if the current build is a release build and
es.plugins.staging is set and, of course, use the release artifacts at
artifacts.elastic.co for release builds with es.plugins.staging unset.
This commit reduces the Linux CLI scripts to one-liners by moving all of
the redundant logic to an elasticsearch-cli script. This commit is only
the Linux side, a follow-up will do this for Windows too.
Meta plugins existed only for a short time, in order to enable breaking
up x-pack into multiple plugins. However, now that x-pack is no longer
installed as a plugin, the need for them has disappeared. This commit
removes the meta plugins infrastructure.
This commit removes xpack from being a meta-plugin-as-a-module.
It also fixes a couple tests which were missing task dependencies, which
failed once the gradle execution order changed.
This commit changes the default out-of-the-box configuration for the
number of shards from five to one. We think this will help address a
common problem of oversharding. For users with time-based indices that
need a different default, this can be managed with index templates. For
users with non-time-based indices that find they need to re-shard with
the split API in place they no longer need to resort only to
reindexing.
Since this has the impact of changing the default number of shards used
in REST tests, we want to ensure that we still have coverage for issues
that could arise from multiple shards. As such, we randomize (rarely)
the default number of shards in REST tests to two. This is managed via a
global index template. However, some tests check the templates that are
in the cluster state during the test. Since this template is randomly
there, we need a way for tests to skip adding the template used to set
the number of shards to two. For this we add the default_shards feature
skip. To avoid having to write our docs in a complicated way because
sometimes they might be behind one shard, and sometimes they might be
behind two shards we apply the default_shards feature skip to all docs
tests. That is, these tests will always run with the default number of
shards (one).
With the opening of xpack, we still retained a run task within
:x-pack:plugin. However, the root level run task also runs with the
default distribution. This change removes the extra run task inside
xpack in favor of using the root level task, and moves the
license/configuration code for run into the main run configuration.
This commit adds setting the homedir for the elasticsearch user to the
adduser command in the packaging preinstall script. While the
elasticsearch user is a system user, it is sometimes conventient to have
an existing homedir (even if it is not writeable). For example, running
cron as the elasticsearch user will try to change dir to the homedir.
closes#14453
Systemd overrides should happen through /etc/systemd/system, not
directly editing the service file. This commit removes marking the
service file as configuration for rpm and deb packages.
If the elasticsearch-env bash script chooses $ES_TMPDIR
then it also creates the directory. This change makes
elasticsearch-env.bat do the same thing: if %ES_TMPDIR%
is chosen by the script then the script will ensure it
exists, but if %ES_TMPDIR% is already set then the user
is responsible for creating it.
Relates #27609
Relates #28217
The overall NOTICE file for the ML X-Pack module should
include the notices from the 3rd party C++ components as
well as the 3rd party Java components.
This commit converts the deb package to use tildes in place of dash in
the internal package version. This is only relevant for prerelease
versions of elasticsearch. Previously, this was not possible due to
problems with the underlying library used by the ospackage plugin, but
since a recent upgrade, it now works.
closes#21139
Adds tasks that check that the all jars that we build have LICENSE.txt
and NOTICE.txt files and that the files are correct. Sets check to
depend on these task.
This is mostly there for extra parnoia because we automatically
configure all Jar tasks to include the LICENSE.txt and NOTICE.txt
files anyway. But it is quite possible to add configuration to those
tasks that would override either file.
This causes check to depend on several more things than it used to.
Take, for example, javadoc:
check depends on the new verifyJavadocJarNotice which depends on
extractJavadocJar which depends on javadocJar which depends on
javadoc, this check now depends on javadoc.
This commit adds some build time checks that the archive distributions
and package distributions contain the appropriate license and notice
files, and the package distributions contain the appropriate license
metadata.
This commit uses the customFields setting of the Deb task in ospackage
to work around the fact it does not know anything about the License
attribute natively.
THe deb distribution has a special copyright file instead of
LICENSE.txt, but the distributions were including the template file
instead of the rendered file (which includes the license name and text).
This commit adds the distribution type to the startup scripts so that we
can discern from log output and the main response the type of the
distribution (deb/rpm/tar/zip).
This commit moves the apache and elastic license files into a new
root level `licenses` directory and rewrites the top level LICENSE.txt
to clarify the repository has a mix of apache and elastic licensed code.
This commit adds license metadata to rpm and deb packages. Additionally,
it makes the copyright file for deb files follow the machine readable
specification, and sets the correct license text based on the oss vs
default deb packages.
X-Pack can no longer be installed as a plugin. This commit adds special
handling for when a user attempts to install X-Pack. This special
handling informs the user of the oss distribution that they should
download the default distribution and the user of the default
distribution that X-Pack does not require installation as it is included
by default.
This commit adds the distribution flavor (default versus oss) to the
build process which is passed through the startup scripts to
Elasticsearch. This change will be used to customize the message on
attempting to install/remove x-pack based on the distribution flavor.
This commit makes x-pack a module and adds it to the default
distrubtion. It also creates distributions for zip, tar, deb and rpm
which contain only oss code.
This commit moves the checks on JAVAX_HOME (where X is the java version
number) existing to the end of gradle's configuration phase, and based
on whether the tasks needing the java home are configured to execute.
relates #29519
This commit fixes plugin warning confirmation to include native
controller confirmation when no security policy exists. The case was
already covered for meta plugins, but not for normal plugins. Tests are
also added for all cases.
Some build tasks require older JDKs. For example, the BWC build tasks
for older versions of Elasticsearch require older JDKs. It is onerous to
require these be configured when merely compiling Elasticsearch, the
requirement that they be strictly set to appropriate values should only
be enforced if these tasks are going to be executed. To address this, we
lazy configure these tasks.
Today we have JAVA_HOME for the compiler Java home and RUNTIME_JAVA_HOME
for the test Java home. However, when we compile BWC nodes and run them,
neither of these Java homes might be the version that was suitable for
that BWC node (e.g., 5.6 requires JDK 8 to compile and to run). This
commit adds support for the environment variables JAVA\d+_HOME and uses
the appropriate Java home based on the version of the node being
started. We even do this for reindex-from-old which requires JDK 7 for
these very old nodes. Note that these environment variables are not
required if not running BWC tests, and they are strictly required if
running BWC tests.
The BWC builds always fetch the latest from the elastic/elasticsearch
repository for the BWC branches. Yet, there are use-cases for using the
local checkout without fetching the latest. This commit enables these
use-cases by adding a tests.bwc.git.fetch.latest property to skip the
fetches.
Today we have a silent batch mode in the install plugin command when
standard input is closed or there is no tty. It appears that
historically this was useful when running tests where we want to accept
plugin permissions without having to acknowledge them. Now that we have
an explicit batch mode flag, this use-case is removed. The motivation
for removing this now is that there is another place where silent batch
mode arises and that is when a user attempts to install a plugin inside
a Docker container without keeping standard input open and attaching a
tty. In this case, the install plugin command will treat the situation
as a silent batch mode and therefore the user will never have the chance
to acknowledge the additional permissions required by a plugin. This
commit removes this silent batch mode in favor of using the --batch flag
when running tests and requiring the user to take explicit action to
acknowledge the additional permissions (either by leaving standard input
open and attaching a tty, or by passing the --batch flags themselves).
Note that with this change the user will now see a null pointer
exception when they try to install a plugin in a Docker container
without keeping standard input open and attaching a tty. This will be
addressed in an immediate follow-up, but because the implications of
that change are larger, they should be handled separately from this one.
This commit changes the sysprop for overriding the branch bwc builds use
to be branch specific. There are 3 different bwc branches built, but all
of them currently read the exact same sysprop. For example, with this change
and current branches, you can now specify eg `-Dtests.bwc.refspec.6.x=my_6x`
and it will build only next-minor-snapshot with that branch, while
next-bugfix-snapshot will continue to use 5.6.
This is a follow up to a previous change which set the error file path
for the package distributions. The observation here is that we always
set the working directory of Elasticsearch to the root of the
installation (i.e., Elasticsearch home). Therefore, we can specify the
error file path relative to this directory and default it to the logs
directory, similar to the package distributions.
This is a follow up to a previous change which set the heap dump path
for the package distributions. The observation here is that we always
set the working directory of Elasticsearch to to the root of
installation (i.e., Elasticsearch home). Therefore, we can specify the
heap dump path relative to this directory and default it to the data
directory, similar to the package distributions.
When upgrading via the RPM package, we can run into a problem where
the keystore fails to be created. This arises because the %post script
on RPM runs after the new package files are installed but before the
removal of the old package files. This means that the contents of the
lib folder can contain files from the old package and the new package
and thus running the create keystore tool can encounter JAR hell
issues and fail. To solve this, we move creating the keystore to the
%posttrans script which runs after the old package files are
removed. We only need to do this on the RPM package, so we add a
switch in the shared post-install script.
The cd command on Windows has an oddity regarding changing
directories. If the drive of the current directory is a different drive
than than of the directory that was passed to the cd command, cd acts in
query mode and does not change the current directory. Instead, a flag is
needed to put the cd command into set mode so that the directory
actually changes. This causes a problem when starting Elasticsearch from
a directory different than the one where it is installed and this commit
fixes the issue.
Today we allow any other method of starting Elastisearch to override
jvm.options via ES_JAVA_OPTS. Yet, for some settings in the Windows
service, we do not allow this. This commit removes this in favor of
being consistent with other packaging choices.
Provide more actionable error message when installing an offline plugin
in the plugins directory, and the `plugins` directory for the node
contains plugin distribution.
Closes#27401
This commit adds a JVM flag to ensure that the JVM fatal error logs land
in the default log directory. Users that wish to use an alternative
location should change the path configured here.
As we have factored Elasticsearch into smaller libraries, we have ended
up in a situation that some of the dependencies of Elasticsearch are not
available to code that depends on these smaller libraries but not server
Elasticsearch. This is a good thing, this was one of the goals of
separating Elasticsearch into smaller libraries, to shed some of the
dependencies from other components of the system. However, this now
means that simple utility methods from Lucene that we rely on are no
longer available everywhere. This commit copies IOUtils (with some small
formatting changes for our codebase) into the fold so that other
components of the system can rely on these methods where they no longer
depend on Lucene.
We no longer source the environment file in the packaging scripts yet we
had leftover references to variables defined by those environment
files. This commit cleans these up.
Previously we allowed a lot of customization of Elasticsearch during
package installation (e.g., the username and group). This customization
was achieved by sourcing the env script (e.g.,
/etc/sysconfig/elasticsearch) during installation. Since we no longer
allow such flexibility, we do not need to source these env scripts
during package installation and removal.
This commit removes the ability to specify that a plugin requires the
keystore and instead creates the keystore on package installation or
when Elasticsearch is started for the first time. The reason that we opt
to create the keystore on package installation is to ensure that the
keystore has the correct permissions (the package installation scripts
run as root as opposed to Elasticsearch running as the elasticsearch
user) and to enable removing the keystore on package removal if the
keystore is not modified.