This commit moves the default location of the full dependencies report
to be under the reports directory to align it with the location for the
dependenciesInfo task output.
A previous commit tried to add task dependencies for the
:distribution:generateDependenciesReport task so that a user did not
have to run "dependenciesInfo
:distribution:generateDependenciesReport". However this method did not
reliably add all task dependencies due to task ordering issues in
previous versions of Gradle and our build. This commit removes this for
now and a user will continue to have to run "dependenciesInfo
:distribution:generateDependenciesReport".
The goal of this commit is to address unknown licenses when producing
the dependencies info report. We have two different checks that we run
on licenses. The first check is whether or not we have stashed a copy of
the license text for a dependency in the repository. The second is to
map every dependency to a license type (e.g., BSD 3-clause). The problem
here is that the way we were handling licenses in the second check
differs from how we handle licenses in the first check. The first check
works by finding a license file with the name of the artifact followed
by the text -LICENSE.txt. Yet in some cases we allow mapping an artifact
name to another name used to check for the license (e.g., we map
lucene-.* to lucene, and opensaml-.* to shibboleth. The second check
understood the first way of looking for a license file but not the
second way. So in this commit we teach the second check about the
mappings from artifact names to license names. We do this by copying the
configuration from the dependencyLicenses task to the dependenciesInfo
task and then reusing the code from the first check in the second
check. There were some other challenges here though. For example,
dependenciesInfo was checking too many dependencies. For now, we should
only be checking direct dependencies and leaving transitive dependencies
from another org.elasticsearch artifact to that artifact (we want to do
this differently in a follow-up). We also want to disable
dependenciesInfo for projects that we do not publish, users only care
about licenses they might be exposed to if they use our assembled
products. With all of the changes in this commit we have eliminated all
unknown licenses. A follow-up will enforce that when we add a new
dependency it does not get mapped to unknown, these will be forbidden in
the future. Therefore, with this change and earlier changes are left
having no unknown licenses and two custom licenses; custom here means it
does not map to an SPDX license type. Those two licenses are xz and
ldapsdk. A future change will not allow additional custom licenses
unless they are explicitly whitelisted. This ensures that if a new
dependency is added it is mapped to an SPDX license or mapped to custom
because it does not have an SPDX license.
We sign our official plugins yet this is not well-advertised and not at
all consumed during plugin installation. For plugins that are installed
over the intertubes, verifying that the downloaded artifact is signed by
our signing key would establish both integrity and validity of the
downloaded artifact. The chain of trust here is simple: our installable
artifacts (archive and package distributions) so that if a user trusts
our packages via their signatures, and our plugin installer (which would
be executing trusted code) verifies the downloaded plugin, then the user
can trust the downloaded plugin too. This commit adds verification of
official plugins downloaded during installation. We do not add
verification for offline plugin installs; a user can download our
signatures and verify the artifacts themselves.
This commit also needs to solve a few interesting challenges. One of
these is that we want the bouncy castle JARs on the classpath only for
the plugin installer, but not for the runtime
Elasticsearch. Additionally, we want these JARs to not be present for
the JAR hell checks. To address this, we shift these JARs into a
sub-directory of lib (lib/tools/plugin-cli) that is only loaded for the
plugin installer, and in the plugin installer we filter any JARs in this
directory from the JAR hell check.
If you have an unusual umask (e.g., 0002) and clone the GitHub
repository then files that we stick into our packages like the
README.textile and the license will have a file mode of 0664 on disk yet
we expect them to be 0644. Additionally, the same thing happens with
compiled artifacts like JARs. We try to set a default file mode yet it
does not seem to take everywhere. This commit adds explicit file modes
in some places that we were relying on the defaults to ensure that the
built artifacts have a consistent file mode regardless of the underlying
build host.
This commit removes xpack from being a meta-plugin-as-a-module.
It also fixes a couple tests which were missing task dependencies, which
failed once the gradle execution order changed.
With the opening of xpack, we still retained a run task within
:x-pack:plugin. However, the root level run task also runs with the
default distribution. This change removes the extra run task inside
xpack in favor of using the root level task, and moves the
license/configuration code for run into the main run configuration.
Adds tasks that check that the all jars that we build have LICENSE.txt
and NOTICE.txt files and that the files are correct. Sets check to
depend on these task.
This is mostly there for extra parnoia because we automatically
configure all Jar tasks to include the LICENSE.txt and NOTICE.txt
files anyway. But it is quite possible to add configuration to those
tasks that would override either file.
This causes check to depend on several more things than it used to.
Take, for example, javadoc:
check depends on the new verifyJavadocJarNotice which depends on
extractJavadocJar which depends on javadocJar which depends on
javadoc, this check now depends on javadoc.
This commit adds some build time checks that the archive distributions
and package distributions contain the appropriate license and notice
files, and the package distributions contain the appropriate license
metadata.
This commit adds the distribution type to the startup scripts so that we
can discern from log output and the main response the type of the
distribution (deb/rpm/tar/zip).
This commit moves the apache and elastic license files into a new
root level `licenses` directory and rewrites the top level LICENSE.txt
to clarify the repository has a mix of apache and elastic licensed code.
This commit adds license metadata to rpm and deb packages. Additionally,
it makes the copyright file for deb files follow the machine readable
specification, and sets the correct license text based on the oss vs
default deb packages.
This commit adds the distribution flavor (default versus oss) to the
build process which is passed through the startup scripts to
Elasticsearch. This change will be used to customize the message on
attempting to install/remove x-pack based on the distribution flavor.
This commit makes x-pack a module and adds it to the default
distrubtion. It also creates distributions for zip, tar, deb and rpm
which contain only oss code.
This is a follow up to a previous change which set the error file path
for the package distributions. The observation here is that we always
set the working directory of Elasticsearch to the root of the
installation (i.e., Elasticsearch home). Therefore, we can specify the
error file path relative to this directory and default it to the logs
directory, similar to the package distributions.
This is a follow up to a previous change which set the heap dump path
for the package distributions. The observation here is that we always
set the working directory of Elasticsearch to to the root of
installation (i.e., Elasticsearch home). Therefore, we can specify the
heap dump path relative to this directory and default it to the data
directory, similar to the package distributions.
This commit adds a JVM flag to ensure that the JVM fatal error logs land
in the default log directory. Users that wish to use an alternative
location should change the path configured here.
This commit moves the distribution specific tasks into the respective
archives and packages builds. The collocation of common and distribution
specific tasks make it much easier to reason about what is expected in a
particular distribution.
This commit adds intermediate gradle projects for archive based
distributions (zip, tar) and package based distributions (rpm, deb). The
grouping allows the common distribution build file to be considerably
shorter and clearly separated from the common zip/tar and rpm/deb
configuration.
In order to build a plugin that extends the painless whitelist, the spi
classes must be available to the plugin at compile time. This commit
moves the spi classes into a separate jar which will be published. Any
plugin authors whiching to extend painless through spi would then add a
compileOnly dependency on this jar.
Otherwise newer versions of Gradle will see the outputs as stale and
remove the directory between having created the directory and copying
files into the directory (leading to the directory being created again,
this time missing some sub-directories).
This commit adds the infrastructure to plugin building and loading to
allow one plugin to extend another. That is, one plugin may extend
another by the "parent" plugin allowing itself to be extended through
java SPI. When all plugins extending a plugin are finished loading, the
"parent" plugin has a callback (through the ExtensiblePlugin interface)
allowing it to reload SPI.
This commit also adds an example plugin which uses as-yet implemented
extensibility (adding to the painless whitelist).
* Adds task dependenciesInfo to BuildPlugin to generate a CSV file with dependencies information (name,version,url,license)
* Adds `ConcatFilesTask.groovy` to concatenates multiple files into one
* Adds task `:distribution:generateDependenciesReport` to concatenate `dependencies.csv` files into a single file (`es-dependencies.csv` by default)
# Examples:
$ gradle dependenciesInfo :distribution:generateDependenciesReport
## Use `csv` system property to customize the output file path
$ gradle dependenciesInfo :distribution:generateDependenciesReport -Dcsv=/tmp/elasticsearch-dependencies.csv
## When branch is not master, use `build.branch` system property to generate correct licenses URLs
$ gradle dependenciesInfo :distribution:generateDependenciesReport -Dbuild.branch=6.x -Dcsv=/tmp/elasticsearch-dependencies.csv
We have tests that manually unpackage the RPM and Debian package
distributions and start a cluster manually (not from the service) and
run a basic suite of integration tests against them. This is problematic
because it is not how the packages are intended to be used (instead,
they are intended to be installed using the package installation tools,
and started as services) and so violates assumptions that we make about
directory paths. This commit removes these integration tests, instead
relying on the packaging tests to ensure the packages are not
broken. Additionally, we add a sanity check that the package
distributions can be unpackaged. Finally, with this change we can remove
some leniency from elasticsearch-env about checking for the existence of
the environment file which the leniency was there solely for these
integration tests.
Relates #27725
JDK 9 has removed JVM options that were valid in JDK 8 (e.g., GC logging
flags) and replaced them with new flags that are not available in JDK
8. This means that a single JVM options file can no longer apply to JDK
8 and JDK 9, complicating development, complicating our packaging story,
and complicating operations. This commit extends the JVM options syntax
to specify the range of versions the option applies to. If the running
JVM matches the range of versions, the flag will be used to start the
JVM otherwise the flag will be ignored.
We implement this parser in Java for simplicity, and with this we start
our first step towards a Java launcher.
Relates #27675
The RPM and Debian packages depend on coreutils (for mktemp among
others). This commit adds an explicit package dependency on coreutils.
Relates #27660
For too long we have been groping around in the dark when faced with GC
issues because we rarely have GC logs at our disposal. This commit
enables GC logging by default out of the box.
Relates #27610
The JVM defaults to dumping the heap to the working directory of
Elasticsearch. For the RPM and Debian packages, this location is
/usr/share/elasticsearch. This directory is not writable by the
elasticsearch user, so by default heap dumps in this situation are
lost. This commit modifies the packaging for the RPM and Debian packages
to set the heap dump path to /var/lib/elasticsearch as the default
location for dumping the heap. This location is writable by the
elasticsearch user by default. We add documentation of this important
setting if /var/lib/elasticsearch is not suitable for receiving heap
dumps.
Relates #26755
The environment variable CONF_DIR was previously inconsistently used in
our packaging to customize the location of Elasticsearch configuration
files. The importance of this environment variable has increased
starting in 6.0.0 as it's now used consistently to ensure Elasticsearch
and all secondary scripts (e.g., elasticsearch-keystore) all use the
same configuration. The name CONF_DIR is there for legacy reasons yet
it's too generic. This commit renames CONF_DIR to ES_PATH_CONF.
Relates #26197
Today our shell scripts march on if they encounter an error during
execution. One place that this actually causes a problem is with the
Java version checker. What can happen is this: if the user botches their
installation so that the JavaVersionChecker can not be found on the
classpath, when we attempt to run the Java version checker, first an
error message that the class can not be found is displayed, and then we
print a message that their version of Java is not compatible; this
happens even if they are using a Java 8 installation. The problem is
that we should have immediately aborted when the class could not be
loaded. Since we do not exit when the shell script encounters an error,
we end up conflating failue to run the version check with a failed
version check. Instead, we really should abort the moment that one of
our scripts encounters an error. To do this, we make the following
changes:
- enable set -e and set -o pipefail
- make the Java version checker responsible for printing the error
message to the console
- remove the exit status check from the scripts
- actually on Windows, we still have to check the exit status because
there is no equivalent of set -e
- when we check for daemonization, we can no longer check the exit
status from grep because a failed grep will abort the script;
instead, we move the grep execution to be the condition for the if as
this does not trip the set -e failure conditions
- we should source elasticsearch-env before doing anything, so we move
the definition of parse_jvm_options below sourcing elasticsearch-env
- we make consistent all places where we use a subshell to use
backticks
Relates #26057
This commit introduces the elasticsearch-env script. The purpose of this
script is threefold:
- vastly simplify the various scripts used in Elasticsearch
- provide a script that can be included in other scripts in the
Elasticsearch ecosystem (e.g., plugins)
- correctly establish the environment for all scripts (e.g., so that
users can run `elasticsearch-keystore` from a package distribution
without having to worry about setting `CONF_DIR` first, otherwise the
keystore would be created in the wrong location)
Relates #25815
This commit changes the default heap size to 1 GB. Experimenting with
elasticsearch is often done on laptops, and 1 GB is much friendlier to
laptop memory. It does put more pressure on the gc, but the tradeoff is
a smaller default footprint. Users running in production can (and
should) adjust the heap size as necessary for their usecase.
This commit removes the default path settings for data and logs. With
this change, we now ship the packages with these settings set in the
elasticsearch.yml configuration file rather than going through the
default.path.data and default.path.logs dance that we went through in
the past.
Relates #25408
Removes the `distribution:bwc` project in favor of
`distribution:bwc-release-snapshot` and
`distribution:bwc-stable-snapshot`.
`distribution:bwc-release-snapshot` builds a snapshot of the
latest release branch (5.4 now) if needed for backwards
compatibility. `distribution:bwc-stable-snapshot` builds a
snapshot of the latest stable branch (5.x now) if needed for
backwards compatibility.
Some packaging tests depend on snapshot versions of packaging
distributions yet the build does not use a repository that includes such
distributions. While we could add such a repository, a better strategy
is to follow our approach for other BWC tests where we depend on a
locally-compiled archive distribution. This commit adds a local
compilation of packaging artifacts and substitutes these anywhere that
we would otherwise depend on a snapshot of these artifacts.
Relates #24861
When installing plugin permissions, we try to set the permissions on all
installed files ourselves because a umask from the user could violate
everything needed to get the permissions right. Sadly, directories were
not handled correctly at all and so we were still left with broken
installations with umasks like 0077. This commit fixes this issue, adds
a thorough unit test for the situation, and most importantly, adds a
test that sets the umask before installing the plugin.
Relates #24527
The plugin cli currently resides inside the elasticsearch jar. This
commit moves it into a plugin-cli jar. This is change alone is a no-op;
it does not change anything about what is loaded at runtime. But it will
allow easier testing (with fixtures in the future to test ES or maven
installation), as well as eventually not loading these classes when
starting elasticsearch.
After splitting integ tests into cluster configuration and the test
runner task, we still have dependencies of the test runner added as deps
of the cluster. This commit adds dependencies directly to the cluster,
so that the runner can have other dependencies independent of what is
needed for the cluster.
Adds the option for a plugin to specify extra directories containing notices
and licenses files to be incorporated into the overall notices file that is
generated for the plugin.
This can be useful, for example, where the plugin has a non-Java dependency
that itself incorporates many 3rd party components.
The current rest backcompat tests, which run against a mixed cluster of
5.x and 6.0 nodes, depend on snapshot builds of 5.x. However, this has
the potential for inconsistency that results in CI failures, and happens
quite often, whenever some backcompat logic is added to 5.x, but the bwc
test on master fails because the 5.x code has not yet been published as
a snapshot.
This change creates a git clone of the 5.x branch,
builds the zip distribution, and ties that into gradle substitutions for
the 5.x version.
Gradle's finalizedBy on tasks only ensures one task runs after another,
but not immediately after. This is problematic for our integration tests
since it allows multiple project's integ test clusters to be
simultaneously. While this has not been a problem thus far (gradle 2.13
happened to keep the finalizedBy tasks close enough that no clusters
were running in parallel), with gradle 3.3 the task graph generation has
changed, and numerous clusters may be running simultaneously, causing
memory pressure, and thus generally slower tests, or even failure if the
system has a limited amount of memory (eg in a vagrant host).
This commit reworks how integ tests are configured. It adds an
`integTestCluster` extension to gradle which is equivalent to the current
`integTest.cluster` and moves the rest test runner task to
`integTestRunner`. The `integTest` task is then just a dummy task,
which depends on the cluster runner task, as well as the cluster stop
task. This means running `integTest` in one project will both run the
rest tests, and shut down the cluster, before running `integTest` in
another project.
All plugins currently have their own licenses dir for the
dependencyLicenses task, but core disables this and has the check inside
distribution. This may have been better for maven, but for
gradle it makes more sense to just use the dependencyLicenses task that
automatically exists inside :core, and remove the hacked up version that
is inside distribution.
Today when running gradle clean
:distribution:(integ-test-zip|tar|zip):assemble, the created archive
distribution will be missing the empty plugins directory. This is
because the empty plugins folder created in the build folder for the
copy spec task is created during configuration and then is later wiped
away by the clean task. This commit addresses this issue, by pushing
creation of the directory out of the configuration phase.
Relates #21271
Today when installing Elasticsearch from an archive distribution (tar.gz
or zip), an empty plugins folder is not included. This means that if you
install Elasticsearch and immediately run elasticsearch-plugin list, you
will receive an error message about the plugins directory missing. While
the plugins directory would be created when starting Elasticsearch for
the first time, it would be better to just include an empty plugins
directory in the archive distributions. This commit makes this the
case. Note that the package distributions already include an empty
plugins folder.
Relates #21204
* Build: Remove old maven deploy support
This change removes the old maven deploy that we have in parallel to
maven-publish, and makes maven-publish fully work with publishing to
maven local. Using `gradle publishToMavenLocal` should be used to
publish to .m2.
Note that there is an unfortunate hack that means for
zip artifacts we must first create/publish a dummy pom file, and then
follow that with the real pom file. It would be nice to have the pom
file contains packaging=zip, but maven central then requires sources and
javadocs. But our zips are really just attached artifacts, so we already
set the packaging type to pom for our zip files. This change just works
around a limitation of the underlying maven publishing library which
silently skips attached artifacts when the packaging type is set to pom.
relates #20164closes#20375
* Remove unnecessary extra spacing
The integ-test-zip distribution did not specify a value for
path.conf. As such, it picked up the default value of
/etc/elasticsearch. This means that on machines that have this
directory, integration tests could fail because they would try to pick
up configuration from that directory rather than from the home directory
of the exploded distribution. This commit fixes this issue by specifying
a value of path.conf for the integ-test-zip distribution.
Relates #20271
This commit sets the default min heap equal to the default max
heap. This is to align the default out-of-box settings with the heap
size bootstrap check.
Some tests still start http implicitly or miss configuring the transport clients correctly.
This commit fixes all remaining tests and adds a depdenceny to `transport-netty` from
`qa/smoke-test-http` and `modules/reindex` since they need an http server running on the nodes.
This also moves all required permissions for netty into it's module and out of core.
Recent changes adds an extra bin/ directory that contains Windows bat files in the normal Bin folder of elasticsearch. For exemple the script elasticsearch.bat is located in bin/bin/elasticsearch.bat instead of bin/elasticsearch.bat.
The distributions had their own copies of these extra files, which was a
carry over from maven. This change removes the duplicate files and
copies them from the root of the project.
closes#18597
This change tweaks the metadata version associated with the deb and rpm
packages. For rpm, dashes are allowed in versions, so we can use the
version as it already exists. For rpm, dashes are not allowed, so this
uses underscores instead. This only affects prerelease versions (eg
alpha/beta/rc), and usually someone doesn't specify the version, so the
inconsistency doesn't matter that much.
This change makes `gradle run` use the full zip distribution. Arguably
we could make it do the same when run inside plugins, but I started out
with this simple change. I am open to moving it into the RunTask itself
so that plugins will do this as well.
After considerable discussion, we have elected to set the default min
heap to 256m and the default max heap to 2g. This is to balance the
desire for a good out-of-the-box performance experience (default max
heap of 2g) with a good out-of-the-box experience running on machines
with limited resources or running multiple instances on a single modern
developer laptop (default min heap of 1g).
Benchmarking with the geonames data set shows that Elasticsearch truly
needs 2g of heap or the heap is a bottleneck for indexing rates. If our
goal is to satisfy the out-of-the-box-experience, we should prioritize
the out-of-the-box performance experience. Thus, the default heap should
be 2g until smaller heaps are not the bottleneck for indexing small data
sets.
This commit fixes the Debian package requires clause for bash. For the
RPM it is okay to specify the binary, but for the Debian package the
package name must be specified.
Today we encourage users to set their minimum and maximum heap settings
equal to each other to prevent the heap from resizing. Yet, the default
heap settings do not start Elasticsearch in this fashion. This commit
addresses this discrepancy by setting the default min heap to '512m' and
the default max heap to the default min heap.
Relates #16334
This commit adds a hard requirement to the RPM and Debian packages for
/bin/bash to be present, and adds a note regarding this to the migration
docs.
Relates #18259
In preparation for a unified release process, we need to be able to
generate the pom files independently of trying to actually publish. This
change adds back the maven-publish plugin just for that purpose. The
nexus plugin still exists for now, so that we do not break snapshots,
but that can be removed at a later time once snapshots are happenign
through the unified tools. Note I also changed the dir jars are written
into so that all our artifacts are under build/distributions.
This changes our packaging to be explicit about the permissions of files
and directories in the tar.gz, rpm, and deb packages. This is to protect
against a user having an incorrectly set umask when installing.
Additionally, plugins that are installed now have their permissions set
by the plugin installation so that plugins that may have been packaged
with incorrect permissions are secured.
Resolves#17634
This allows for a local file based deploy without needed nexus
auth information.
Also signing of packages has been added, either via gradle.properties
or using system properties as a fallback.
The property build.repository allows to configure another endpoint if no
snapshot build is done.
Fix creation of .asc file for tar.gz distribution
Closes#17405
This commit adds a new configuration file jvm.options to centralize and
simplify management of JVM options. This separates the configuration of
the JVM from the packaging scripts (bin/elasticsearch*, bin/service.bat,
and init.d/elasticsearch) simplifying end-user operational management of
custom JVM options.
The build currently uses the old maven support in gradle. This commit
switches to use the newer maven-publish plugin. This will allow future
changes, for example, easily publishing to artifactory.
An additional part of this change makes publishing of build-tools part
of the normal publishing, instead of requiring a separate upload step
from within buildSrc. That also sets us up for a follow up to enable
precomit checks on the buildSrc code itself.
Removes all our logger wrappers except the wrapper for log4j1.2. If you
depend on Elasticsearch's jar in your application you'll need to declare
log4j 1.2 and/or some bridge to your favorite logger.
We did this to simplify our builds and code. No more commons-logging like
log implementation sniffing. No more optional dependency hacks in gradle.
We might one day want to use j.u.l instead of log4j. If we do want that
we can recover its wrapper by studying this commit. We didn't go directly
to j.u.l in this commit because that is a bigger change. Our logging
configuration is based on log4j1.2 and people are used to it. So it'd
be a much more fraught breaking change to do that conversion.
Both modules and integ-test-zip have integration tests (the latter being
the base rest tests). We can currently get odd behavior where
integ-test-zip's integ test does not shutdown its cluster before running
mdoule integ tests (and it then tries to shutdown all those clusters at
once after modules integ tests have run).
The underlying issue can be attributed to a bug in gradle with how cross project
mustRunAfter work with finalizers. This change works around this bug by
setting up mustRunAfter on the shutdown task itself.
We currently use the full suite of packaged rest tests for each
distribution. We also used to run rest tests within core integ tests,
but this stopped working when we split out the test-framework, since the
test files are in there.
This change simplifies the code to run packaged rest tests just once,
for the integ-test-zip, and removes the unused rest tests from
test-framework. Distributions rest tests now check that all modules
were loaded.
This change attempts to simplify the gradle tasks for precommit. One
major part of that is using a "less groovy style", as well as being more
consistent about how tasks are created and where they are configured. It
also allows the things creating the tasks to set up inter task
dependencies, instead of assuming them (ie decoupling from tasks
eleswhere in the build).