The `term` and `phrase` suggesters have different options to filter candidates
based on their frequencies. The `popular` mode for instance filters candidate
terms that occur in less docs than the original term. However when we compute this threshold
we use the total term frequency of a term instead of the document frequency. This is not inline
with the actual filtering which is always based on the document frequency. This change fixes
this discrepancy and clarifies the meaning of the different frequencies in use in the suggesters.
It also ensures that the threshold doesn't overflow the maximum allowed value (Integer.MAX_VALUE).
Closes#34282
Applies our line length guidance for all classes in the server in `lucene`
directories *except* `XMoreLikeThis`. The only long line in
`XMoreLikeThis` says "remove this when we upgrade to Lucene 5. Given
that we're on Lucene 8, this is a little terrifying and deserves another
look.
This drops checkstyle suppressions that refer to files that don't exist
since those suppressions don't do anything other than make us feel bad.
It also updates some suppressions to more closely match the path to the
file that they suppress. These suppressions are still needed but didn't
pass the "the file exists" test because they weren't precise. It is just
easier on future-me if they are precise.
We generate tests from our documentation, including assertions about the
responses returned by a particular API. But sometimes we *can't* assert
that the response is correct because of some defficiency in our tooling.
Previously we marked the response `// NOTCONSOLE` to skip it, but this
is kind of odd because `// NOTCONSOLE` is really to mark snippets that
are json but aren't requests or responses. This introduces a new
construct to skip response assertions:
```
// TESTRESPONSE[skip:reason we skipped this]
```
Now that JDK 11 is GA, we would switch our 6.x and master branches to
the JDK 11 compiler. This commit makes this change, as well as removes
JDK 10 from the CI configuration.
This slightly reworks the expert script plugin example so it fits on the
page when the docs are rendered. The box in which it is rendered is not
very wide so it took a bit of twisting to make it readable.
Mainly this fixes a warning by replacing the unchecked `new ActionListener`
with the checked `new ActionListener<Response>`, and it also fixes the line
length violations in this class.
We use wrap code in `// tag` and `//end` to include it in our docs. Our
current docs style wraps code snippets in a box that is only wide enough
for 76 characters and adds a horizontal scroll bar for wider snippets
which makes the snippet much harder to read. This adds a checkstyle check
that looks for java code that is included in the docs and is wider than
that 76 characters so all snippets fit into the box. It solves many of
the failures that this catches but suppresses many more. I will clean
those up in a follow up change.
This change adds the OneStatementPerLineCheck to our checkstyle precommit
checks. This rule restricts the number of statements per line to one. The
resoning behind this is that it is very difficult to read multiple statements on
one line. People seem to mostly use it in short lambdas and switch statements in
our code base, but just going through the changes already uncovered some actual
problems in randomization in test code, so I think its worth it.
Wraps all lines in our test framework at 140 characters because that is
our standard line length and removes all of the checkstyle suppressions
for the test framework.
Drops most of `ModuleTestCase` because it isn't used and we're moving
away from using guice in the way that it wants to test anyway. Also
switches a few classes that extend it but don't use it to extend
`ESTestCase` instead.
Gradle can sometimes emit mixed log lines due to how we spawn things in
separate processes. This commit changes the jarhell integ test to only
look for the exception and message, instead of including the outer part
about the exception in thread main.
closes#33774
Make sure that all java files have a package declaration and that all of
the package declarations line up with the directory structure. This
would have caught the bug that I caused in
190ea9a6de and fixed in
b6d68bd805.
This commit changes the sanity check which ensures the test task was
properly replaced with randomized testing to have a per project check,
isntead of a global one. The previous global check assumed all test
tasks within the root project and below should be randomized testing,
but that is not the case for a multi project in which only one project
is an elasticsearch plugin. While the new check is not able to emit all
of the failed replacements in one error message, the efficacy of the
check remains.
This commit switches the joda time backcompat in scripting to use
augmentation over ZonedDateTime. The augmentation methods provide
compatibility with the missing methods between joda's DateTime and
java's ZonedDateTime. Due to getDayOfWeek returning an enum in the java
API, ZonedDateTime is wrapped so that the method can return int like the
joda time does. The java time api version is renamed to
getDayOfWeekEnum, which will be kept through 7.x for compatibility while
users switch back to getDayOfWeek once joda compatibility is removed.
* LeafCollector.setScorer() now takes a Scorable
* Scorers may not have null Weights
* IndexWriter.getFlushingBytes() reports how much memory is being used by IW threads writing to disk
Today the FilterRoutingTests take the belt-and-braces approach of excluding
some node attribute values and including some others. This means that we don't
really test that both inclusion and exclusion work correctly: as long as one of
them works as expected then the test will pass. This change improves these
tests by only using one approach at once, demonstrating that both do indeed
work, and adds tests for various other scenarios too.
Change the logging infrastructure to handle when the node name isn't
available in `elasticsearch.yml`. In that case the node name is not
available until long after logging is configured. The biggest change is
that the node name logging no longer fixed at pattern build time.
Instead it is read from a `SetOnce` on every print. If it is unset it is
printed as `unknown` so we have something that fits in the pattern.
On normal startup we don't log anything until the node name is available
so we never see the `unknown`s.
This change collapses all metrics aggregations classes into a single package `org.elasticsearch.aggregations.metrics`.
It also restricts the visibility of some classes (aggregators and factories) that should not be used outside of the package.
Relates #22868
The main benefit of the upgrade for users is the search optimization for top scored documents when the total hit count is not needed. However this optimization is not activated in this change, there is another issue opened to discuss how it should be integrated smoothly.
Some comments about the change:
* Tests that can produce negative scores have been adapted but we need to forbid them completely: #33309Closes#32899
Solves all of the xpack line length suppressions and then merges the
remainder of the xpack checkstyle_suppressions.xml file into the core
checkstyle_suppressions.xml file. At this point that just means the
antlr generated files for sql.
It also adds an exclusion to the line length tests for javadocs that
are just a URL. We have one such javadoc and breaking up the line would
make the link difficult to use.
This commit introduces the formal notion of a private setting. This
enables us to register some settings that we had previously not
registered as fully-fledged settings to avoid them being exposed via
APIs such as the create index API. For example, we had hacks in the
codebase to allow index.version.created to be passed around inside of
settings objects, but was not registered as a setting so that if a user
tried to use the setting on any API then they would get an
exception. This prevented users from setting index.version.created on
index creation, or updating it via the index settings API. By
introducing private settings, we can continue to reject these attempts,
yet now we can represent these settings as actual settings. In this
change, we register index.version.created as an actual setting. We do
not cutover all settings that we had been treating as private in this
pull request, it is already quite large due to moving some tests around
to account for the fact that some tests need to be able to set the
index.version.created. This can be done in a follow-up change.
Exclude classes meant for newer versions than what we are auditing against, those classes won't be found. There's no reason to exclude JDK classes from newer versions, with this PR, we will not extract them in the first place.
The new implementation is functional equivalent with the old, ant based one.
It parses task standard error to get the missing classes and violations in the same way.
I considered re-using ForbiddenApisCliTask but Gradle makes it hard to build inheritance with tasks that have task actions , since the order of the task actions can't be controlled.
This inheritance isn't dully desired either as the third party audit task is much more opinionated and we don't want to expose some of the configuration.
We could probably extract a common base class without any task actions, but probably more trouble than it's worth.
Closes#31715
This commit removes the setting of the fork options maximum memory size
in our build plugin and instead adds the value in the gradle.properties
file to be alongside the value set in jvmArgs.
This change is necessary when using parallel compilation as 512m is not
sufficient for parallel compilation on some machines.
This reworks how we configure the `shadow` plugin in the build. The major
change is that we no longer bundle dependencies in the `compile` configuration,
instead we bundle dependencies in the new `bundle` configuration. This feels
more right because it is a little more "opt in" rather than "opt out" and the
name of the `bundle` configuration is a little more obvious.
As an neat side effect of this, the `runtimeElements` configuration used when
one project depends on another now contains exactly the dependencies needed
to run the project so you no longer need to reference projects that use the
shadow plugin like this:
```
testCompile project(path: ':client:rest-high-level', configuration: 'shadow')
```
You can instead use the much more normal:
```
testCompile "org.elasticsearch.client:elasticsearch-rest-high-level-client:${version}"
```
Elasticsearch versions earlier than 6.4.0 cannot properly run in a
FIPS 140 JVM. This commit ensures that we use a non-FIPS JVM for
nodes that we spin up in BWC tests even when we're testing FIPS.
* Scripted metric aggregations: add deprecation warning and system property to control legacy params
Scripted metric aggregation params._agg/_aggs are replaced by state/states context variables. By default the old params are still present, and a deprecation warning is emitted when Scripted Metric Aggregations are used. A new system property can be used to disable the legacy params. This functionality will be removed in a future revision.
* Fix minor style issue and docs test failure
* Disable deprecated params._agg/_aggs in tests and revise tests to use state/states instead
* Add integration test covering deprecated scripted metrics aggs params._agg/_aggs access
* Disable deprecated params._agg/_aggs in docs integration tests and revise stored scripts to use state/states instead
* Revert unnecessary migrations doc change
A relevant note should be added in the changes destined for 7.0; this PR is going to be backported to 6.x.
* Replace deprecated _agg param bwc integration test with a couple of unit tests
* Fix compatibility test after merge
* Rename backwards compatibility system property per code review feedback
* Tweak deprecation warning text per review feedback
Add tests for build-tools to make sure example plugins build stand-alone using it.
This will catch issues such as referencing files from the buildSrc directly, breaking external uses of build-tools.
* Implement Version in java
- This allows to move all all .java files from .groovy.
- Will prevent eclipse from tangling up in this setup
- make it possible to use Version from Java
* PR review comments
* Cluster formation plugin with reference counting
```
> Task :plugins:ingest-user-agent:listElasticSearchClusters
Starting cluster: myTestCluster
* myTestCluster: /home/alpar/work/elastic/elasticsearch/plugins/ingest-user-agent/foo
Asked to unClaimAndStop myTestCluster, since cluster still has 1 claim it will not be stopped
> Task :plugins:ingest-user-agent:testme UP-TO-DATE
Stopping myTestCluster, since no of claims is 0
```
- Meant to auto manage the clusters lifecycle
- Add integration test for cluster formation
* Fix rebase
* Change to `useCluster` method on task
* Add a task to run forbiddenapis using cli
Add a task that offers an equivalent check to the forbidden APIs plugin,
but runs it using the forbiddenAPIs CLI instead.
This isn't wired into precommit first, and doesn't work for projects
that require specific signatures, etc. It's meant to show how this can
be used. The next step is to make a custom task type and configure it
based on the project extension from the pugin and make some minor
adjustments to some build scripts as we can't bee 100% compatible with
that at least due to how additional signatures are passed.
Notes:
- there's no `--target` for the CLI version so we have to pass in
specific bundled signature names
- the cli task already wires to `runtimeJavaHome`
- no equivalent for `failOnUnsupportedJava = false` but doesn't seem to
be a problem. Tested with Java 8 to 11
- there's no way to pass additional signatures as URL, these will have
to be on the file system, and can't be resources on the cp unless we
rely on how forbiddenapis is implemented and mimic them as boundled
signatures.
- the average of 3 runs is 4% slower using the CLI for :server.
( `./gradlew clean :server:forbiddenApis` vs `./gradlew clean
:server:forbiddenApisCli`)
- up-to-date checks don't work on the cli task yet, that will happen
with the custom task.
See also: #31715
The upcoming ML log structure finder functionality will use these
libraries, and it makes sense to use the same versions that are
being used elsewhere in Elasticsearch. This is especially true
with icu4j, which is pretty big.
Enhance reproduction line with info about jdks
Provide the ability to control compiler and hava versions just by
passing a property. The actual java home comes from the
`JAVA<major>_HOME` env vars that we allready require.
This works better with the Gradle daemon as well.
Output is also changed a bit.
for `-Druntime.java=8 -Dcompiler.java=9`:
```
=======================================
Elasticsearch Build Hamster says Hello!
Gradle Version : 4.9
OS Info : Linux 4.17.8-1-ARCH (amd64)
Compiler JDK Version : 11 (Oracle Corporation 11-ea [OpenJDK 64-Bit Server VM 11-ea+22])
Runtime JDK Version : 11 (Oracle Corporation 11-ea [OpenJDK 64-Bit Server VM 11-ea+22])
Gradle JDK Version : 10 (Oracle Corporation 10.0.1 [OpenJDK 64-Bit Server VM 10.0.1+10])
Compiler java.home : /home/alpar/opt/jdk-11-ea22/
Runtime java.home : /home/alpar/opt/jdk-11-ea22/
Gradle java.home : /usr/lib/jvm/java-10-openjdk
Random Testing Seed : EA858533191E8DFB
=======================================
```
Without configuration:
```
=======================================
Elasticsearch Build Hamster says Hello!
=======================================
Gradle Version : 4.9
OS Info : Linux 4.17.8-1-ARCH (amd64)
JDK Version : 10 (Oracle Corporation 10.0.1 [OpenJDK 64-Bit Server VM 10.0.1+10])
JAVA_HOME : /usr/lib/jvm/java-10-openjdk
Random Testing Seed : 4BD5B2A839C8FCA1
=======================================
```
Here's how a reproduction line will look like (test made to fail):
```
./gradlew :modules:lang-painless:test -Dtests.seed=2DA2379065A4EEAB -Dtests.class=org.elasticsearch.painless.AdditionTests -Dtests.method="testInt" -Dtests.security.manager=true -Dtests.locale=es-PE -Dtests.timezone=WET -Dcompiler.java=10 -Druntime.java=10
```
This commit adds two pieces. The first is a small set of documentation providing
instructions on how to get setup to run context examples. This will require a download
similar to how Kibana works for some of the examples. The second is an ingest processor
example using the downloaded data. More examples will follow as ideally one per PR.
This also adds a set of tests to individually test each script as a unit test.
Currently, snippets in lists cannot be rendered correctly as a console command because the console command requires a line continuation '+'. This allows snippets to have a line continuation between the snippet and the // CONSOLE.
This commit adds the elastic repo, alongside maven central, to any
plugin using BuildPlugin. This is necessary now because the default
distribution, which the test framework uses by default, is now only
hosted on elastic maven. While inside the elasticsearch build this does
not matter, those that build external plugins with our build-tools could
have tests fail to find the distribution dependency.
This commit adds a boolean system property, `es.scripting.use_java_time`,
which controls the concrete return type used by doc values within
scripts. The return type of accessing doc values for a date field is
changed to Object, essentially duck typing the type to allow
co-existence during the transition from joda time to java time.
The main highlight is the removal of the reclaim_deletes_weight in the TieredMergePolicy.
The es setting index.merge.policy.reclaim_deletes_weight is deprecated in this commit and the value is ignored. The new merge policy setting setDeletesPctAllowed should be added in a follow up.
When we added the `java-gradle-plugin` to `buildSrc` it added a second
task to generate the pom that duplicates the publishing work that we
configure in `BuildPlugin`. Not only does it dupliciate the pom, it
creates a pom that is missing things like `name` and `description` which
are required for publishing to maven central.
This change disables the duplicate pom generation.
These are collected from a number of open PRs and are required to
improove existing and write more readable future tests.
I am extracting them to their own PR hoping to be able to merge and use
them sooner.
* Determine the minimum gradle version based on the wrapper
This is restrictive and forces users of the plugin to move together with
us, but without integration tests it's close to impossible to make sure
that the claimed compatability is really there.
If we do want to offer more flexibility, we should add those tests
first.
* Track gradle version in individual file
* PR review
This bundles the x-pack:protocol project into the x-pack:plugin:core
project because we'd like folks to consider it an implementation detail
of our build rather than a separate artifact to be managed and depended
on. It is now bundled into both x-pack:plugin:core and
client:rest-high-level. To make this work I had to fix a few things.
Firstly, I had to make PluginBuildPlugin work with the shadow plugin.
In that case we have to bundle only the `shadow` dependencies and the
shadow jar.
Secondly, every reference to x-pack:plugin:core has to use the `shadow`
configuration. Without that the reference is missing all of the
un-shadowed dependencies. I tried to make it so that applying the shadow
plugin automatically redefines the `default` configuration to mirror the
`shadow` configuration which would allow us to use bare project references
to the x-pack:plugin:core project but I couldn't make it work. It'd *look*
like it works but then fail for transitive dependencies anyway. I think
it is still a good thing to do but I don't have the willpower to do it
now.
Finally, I had to fix an issue where Eclipse and IntelliJ didn't properly
reference shadowed transitive dependencies. Neither IDE supports shadowing
natively so they have to reference the shadowed projects. We fix this by
detecting `shadow` dependencies when in "Intellij mode" or "Eclipse mode"
and adding `runtime` dependencies to the same target. This convinces
IntelliJ and Eclipse to play nice.
* Complete changes for running IT in a fips JVM
- Mute :x-pack:qa:sql:security:ssl:integTest as it
cannot run in FIPS 140 JVM until the SQL CLI supports key/cert.
- Set default JVM keystore/truststore password in top level build
script for all integTest tasks in a FIPS 140 JVM
- Changed top level x-pack build script to use keys and certificates
for trust/key material when spinning up clusters for IT
Throw an exception for doc['field'].value
if this document is missing a value for the field.
After deprecation changes have been backported to 6.x,
make this a default behaviour in 7.0
Closes#29286
In 1.x and 2.x, plugins were published to maven and the plugin
installer downloaded them from there. This was later changed to install
from the download service, and in 5.0 plugin zips were no longer
published to maven. However, the build still currently produces an
unused pom file. This is troublesome in the special case when the main
jar of a plugin needs to be published (and thus needs a pom file of
the same name).
closes#31946
This commit moves additional unit test runners from being dependencies
of the test task to dependencies of check. Without this change,
reproduce lines are incorrect due to the additional test runner not
matching any of the reproduce class/method info.
closes#31964
Moves the customizations to the build to produce nice shadow jars and
javadocs into common build code, mostly BuildPlugin with a little into
the root build.gradle file. This means that any project that applies the
shadow plugin will automatically be set up just like the high level rest
client:
* The non-shadow jar will not be built
* The shadow jar will not have a "classifier"
* Tests will run against the shadow jar
* Javadoc will include all of the shadowed classes
* Service files in `META-INF/services` will be merged
We have been encountering name mismatches between API defined in our
REST spec and method names that have been added to the high-level REST
client. We should check this automatically to prevent furher mismatches,
and correct all the current ones.
This commit adds a test for this and corrects the issues found by it.