Indexing ids in binary form should help with indexing speed since we would
have to compare fewer bytes upon sorting, should help with memory usage of
the live version map since keys will be shorter, and might help with disk
usage depending on how efficient the terms dictionary is at compressing
terms.
Since we can only expect base64 ids in the auto-generated case, this PR tries
to use an encoding that makes the binary id equal to the base64-decoded id in
the majority of cases (253 out of 256). It also specializes numeric ids, since
this seems to be common when content that is stored in Elasticsearch comes
from another database that uses e.g. auto-increment ids.
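A minimal sketch of such an encoding, to make the idea concrete. The marker byte values, the nibble packing of numeric ids and the class name `IdEncodingSketch` are assumptions made for illustration and are not taken from this PR. The sketch reserves three lead byte values, so a base64 id whose decoded form starts with any of the other 253 values is stored as its decoded bytes unchanged.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

final class IdEncodingSketch {
    // Hypothetical marker bytes: three reserved lead values out of 256.
    static final int UTF8 = 0xff;
    static final int NUMERIC = 0xfe;
    static final int BASE64_ESCAPE = 0xfd;

    static byte[] encode(String id) {
        if (isNumeric(id)) {
            return encodeNumeric(id);
        }
        byte[] decoded = tryDecodeBase64(id);
        if (decoded != null) {
            return encodeBase64(decoded);
        }
        return encodeUtf8(id);
    }

    static boolean isNumeric(String id) {
        if (id.isEmpty()) return false;
        for (int i = 0; i < id.length(); i++) {
            char c = id.charAt(i);
            if (c < '0' || c > '9') return false;
        }
        return true;
    }

    // Packs two decimal digits per byte; an odd trailing digit is padded with 0xf.
    static byte[] encodeNumeric(String id) {
        byte[] out = new byte[1 + (id.length() + 1) / 2];
        out[0] = (byte) NUMERIC;
        for (int i = 0; i < id.length(); i += 2) {
            int hi = id.charAt(i) - '0';
            int lo = i + 1 < id.length() ? id.charAt(i + 1) - '0' : 0x0f;
            out[1 + i / 2] = (byte) ((hi << 4) | lo);
        }
        return out;
    }

    // Returns the decoded bytes only if the id is canonical, unpadded, URL-safe base64.
    static byte[] tryDecodeBase64(String id) {
        if (id.isEmpty()) return null;
        try {
            byte[] decoded = Base64.getUrlDecoder().decode(id);
            String roundTrip = Base64.getUrlEncoder().withoutPadding().encodeToString(decoded);
            return roundTrip.equals(id) ? decoded : null;
        } catch (IllegalArgumentException e) {
            return null;
        }
    }

    // Stores the decoded bytes as-is unless the first byte collides with a marker.
    static byte[] encodeBase64(byte[] decoded) {
        if ((decoded[0] & 0xff) < BASE64_ESCAPE) return decoded;
        byte[] out = new byte[decoded.length + 1];
        out[0] = (byte) BASE64_ESCAPE;
        System.arraycopy(decoded, 0, out, 1, decoded.length);
        return out;
    }

    // Fallback for arbitrary ids: marker byte followed by the UTF-8 bytes.
    static byte[] encodeUtf8(String id) {
        byte[] utf8 = id.getBytes(StandardCharsets.UTF_8);
        byte[] out = new byte[utf8.length + 1];
        out[0] = (byte) UTF8;
        System.arraycopy(utf8, 0, out, 1, utf8.length);
        return out;
    }
}
```

In this sketch the numeric check runs first because a string of digits can also be valid base64.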
Another option could be to require base64 ids all the time. It would make things
simpler but I'm not sure users would welcome this requirement.
This PR should bring some benefits, but I expect it to be mostly useful when
coupled with something like #24615.
Closes #18154
This is a regression introduced in #25510, which removed the explicit fetching of upstream. Sadly this doesn't work if you don't have any local branch referring to `upstream` as an upstream branch.
In the past, global checkpoint syncing was done in the background based on an interval set by an index setting. In order to set that setting to something reasonable for a test, the master needed to know about the setting. Therefore the test didn't check global checkpoints if the master was old. These days the global checkpoint sync is inlined with indexing operations and that restriction is not needed.
Sometimes we need a fix / change to have two parts in two different branches (corresponding to two different ES releases). In order to be able to test these cases you need to run the BWC tests against a local branch rather than using a branch from `github.com/elastic/elasticsearch`.
This commit adds a system property called `tests.bwc.refspec` that allows you to do it. Note that I've chosen to go with the simplest code change for now, at the expense of some user friendliness.
Adds a unit test that checks the contents of the TermSuggestionContext produced
by TermSuggestionBuilder#build against the values the original builder contains.
Transport profiles unfortunately have never been validated. Yet, it's very
easy to make a mistake when configuring profiles which will most likely stay
undetected since we don't validate the settings but allow almost everything
based on the wildcard in `transport.profiles.*`. This change replaces the
settings-subset-based parsing of profiles with concrete affix settings, which
makes it easier to fall back to higher-level settings since the fallback
settings are present when the profile setting is parsed. Previously, it was
unclear in the code which setting was used, i.e. the profile setting (with its
prefix removed) or the global node setting. There is no such distinction
anymore since we no longer pull prefix-based settings.
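As a rough illustration of the idea (not the actual settings infrastructure), the sketch below validates keys under `transport.profiles.*` against a fixed set of suffixes and resolves a profile value with a fallback to a higher-level key. The suffix names, the fallback keys of the form `transport.<suffix>` and the class name are hypothetical.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

final class ProfileSettingsSketch {
    static final String PROFILE_PREFIX = "transport.profiles.";
    // Hypothetical set of suffixes that a profile is allowed to define.
    static final Set<String> KNOWN_SUFFIXES =
        new HashSet<>(Arrays.asList("port", "bind_host", "publish_host"));

    // Rejects any key under the profile prefix whose suffix is not known.
    static void validate(Map<String, String> settings) {
        for (String key : settings.keySet()) {
            if (!key.startsWith(PROFILE_PREFIX)) continue;
            String rest = key.substring(PROFILE_PREFIX.length()); // "<profile>.<suffix>"
            int dot = rest.indexOf('.');
            String suffix = dot < 0 ? "" : rest.substring(dot + 1);
            if (!KNOWN_SUFFIXES.contains(suffix)) {
                throw new IllegalArgumentException("unknown profile setting [" + key + "]");
            }
        }
    }

    // Looks up the concrete profile key first, then the higher-level key, then the default.
    static String get(Map<String, String> settings, String profile, String suffix, String defaultValue) {
        String value = settings.get(PROFILE_PREFIX + profile + "." + suffix);
        if (value != null) return value;
        value = settings.get("transport." + suffix); // hypothetical fallback key
        return value != null ? value : defaultValue;
    }
}
```

With concrete keys, a typo such as `transport.profiles.client.prot` fails validation instead of being silently accepted by the wildcard.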
In the rolling upgrade tests, there is a test to create an index with
replica shards and ensure that in the mixed cluster environment, the
cluster health is green before any other tests are executed. However,
there were two problems with this. First, if the replica shard was
residing on the restarted node, then delayed allocation will kick in and
cause the cluster health request to time out after 1m. The fix to this
was to drastically lower the delayed allocation setting. Second, if the
primary exists on the higher version node, then the replica cannot be
assigned to the lower version node because recovery cannot happen from
lower Lucene versions. The fix here was to wait for the cluster health
to be yellow instead of green in the mixed cluster environment. In the
fully upgraded cluster, the cluster health check waits for a green
cluster as before.
Closes #25185
This change adds validation to the RemoteClusterConnection to ensure
we always use seed nodes from the same cluster. While we still allow
an arbitrary cluster alias, once we have connected to a cluster for the first time
we always check against that initial cluster name when we execute a seed node handshake.
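The check can be pictured roughly as follows. This is a standalone sketch with hypothetical class and method names, not the code in RemoteClusterConnection: the first cluster name seen during a handshake is remembered, and later handshakes that report a different name are rejected.

```java
import java.util.concurrent.atomic.AtomicReference;

final class ClusterNameCheckSketch {
    private final String clusterAlias;
    // Name of the remote cluster as reported by the first successful handshake.
    private final AtomicReference<String> remoteClusterName = new AtomicReference<>();

    ClusterNameCheckSketch(String clusterAlias) {
        this.clusterAlias = clusterAlias;
    }

    // Called with the cluster name a seed node reported during its handshake.
    void onSeedHandshake(String reportedClusterName) {
        // Only the very first handshake sets the expected name.
        remoteClusterName.compareAndSet(null, reportedClusterName);
        String expected = remoteClusterName.get();
        if (!expected.equals(reportedClusterName)) {
            throw new IllegalStateException("handshake for alias [" + clusterAlias
                + "] failed: expected cluster [" + expected
                + "] but seed node belongs to [" + reportedClusterName + "]");
        }
    }
}
```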
sequence number data in Lucene commit points. Instead, the test
retrieves the _seq_no value from the commit point directly and converts
it to a Long value.
This change adds a basic unit test for the SuggestionSearchContext that is
created as output of SuggestionBuilder#build. The current test only adds checks
for the common fields (like text, prefix, fieldName, etc.).
Relates to #17118
We previously grouped all the license and notice files for httpcore, httpcore-nio, httpclient and httpasyncclient under the same license and notice file. There were, however, subtle differences between those which we didn't keep track of. For instance, the httpcore license file has changed slightly since 4.4, which we failed to track.
This commit goes back to having one license and notice file for each jar, to be completely sure that each dependency is associated with exactly the right license and notice file.
Closes #25567
Using the infra that we now have in place, we can convert the low-level REST client docs so that they extract code snippets from real Java classes. This way we make sure that all the snippets properly compile. Compared to the high-level REST client docs, in this case we don't run the tests themselves, as that would require depending on test-framework, which requires Java 8 while the low-level REST client is compatible with Java 7. I think that compiling snippets is enough for now.
Some tests use MockTransportService to do network-based testing.
Yet, we run tests in multiple JVMs, which means that a concurrent
test could claim a port that another JVM just released,
and if that test tries to simulate a disconnect it might be smart
enough to reconnect, depending on what is tested. To reduce the risk,
since this is very hard to debug, we use a different default
port range per JVM unless the incoming settings override it.
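One way to derive such a range, as a sketch only: the base port, the number of ports per JVM and the way the JVM ordinal is obtained are assumptions made for illustration and are not taken from this change.

```java
final class PortRangeSketch {
    static final int BASE_PORT = 10300;   // hypothetical base port
    static final int PORTS_PER_JVM = 100; // hypothetical width of each range

    // Gives every test JVM its own non-overlapping slice of ports.
    static String defaultPortRange(int jvmOrdinal) {
        int start = BASE_PORT + jvmOrdinal * PORTS_PER_JVM;
        return start + "-" + (start + PORTS_PER_JVM - 1);
    }

    // An explicitly configured range always wins over the per-JVM default.
    static String portRange(String configuredRange, int jvmOrdinal) {
        return configuredRange != null ? configuredRange : defaultPortRange(jvmOrdinal);
    }
}
```

Because the slices do not overlap, a port released by one JVM cannot be picked up by a test running in another JVM unless a test overrides the range itself.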
Closes #25301
* Refactor PathTrie and RestController to use a single trie for all methods
This changes `PathTrie` and `RestController` to use a single `PathTrie` for all
endpoints. It also allows retrieving the endpoints' supported HTTP methods more
easily.
This is a spin-off and prerequisite of #24437
* Use EnumSet instead of multiple if conditions
* Make MethodHandlers package-private and final
* Remove duplicate registerHandler method
* Remove public modifier
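To illustrate the single-trie idea from the first commit above, here is a heavily simplified sketch. It is not the real `PathTrie`: the class name, the `{param}` wildcard handling without backtracking and the `Method` enum are assumptions. Each trie node keeps its handlers keyed by HTTP method, so the supported methods of a path fall out of a single lookup.

```java
import java.util.EnumMap;
import java.util.EnumSet;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

final class SingleTrieSketch<H> {
    enum Method { GET, POST, PUT, DELETE, HEAD }

    private static final class Node<H> {
        final Map<String, Node<H>> children = new HashMap<>();
        Node<H> wildcard; // child used for a "{param}" segment
        final Map<Method, H> handlers = new EnumMap<>(Method.class);
    }

    private final Node<H> root = new Node<>();

    void register(Method method, String path, H handler) {
        Node<H> node = root;
        for (String segment : split(path)) {
            if (segment.startsWith("{") && segment.endsWith("}")) {
                if (node.wildcard == null) node.wildcard = new Node<>();
                node = node.wildcard;
            } else {
                node = node.children.computeIfAbsent(segment, s -> new Node<>());
            }
        }
        if (node.handlers.putIfAbsent(method, handler) != null) {
            throw new IllegalArgumentException("duplicate handler for " + method + " " + path);
        }
    }

    H retrieve(Method method, String path) {
        Node<H> node = find(path);
        return node == null ? null : node.handlers.get(method);
    }

    // All methods registered for the path, e.g. to build an "Allow" header.
    Set<Method> supportedMethods(String path) {
        Node<H> node = find(path);
        EnumSet<Method> result = EnumSet.noneOf(Method.class);
        if (node != null) result.addAll(node.handlers.keySet());
        return result;
    }

    // Literal segments win over the wildcard; this sketch does not backtrack.
    private Node<H> find(String path) {
        Node<H> node = root;
        for (String segment : split(path)) {
            Node<H> next = node.children.get(segment);
            node = next != null ? next : node.wildcard;
            if (node == null) return null;
        }
        return node;
    }

    private static String[] split(String path) {
        String trimmed = path.replaceAll("^/+|/+$", "");
        return trimmed.isEmpty() ? new String[0] : trimmed.split("/");
    }
}
```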
Today when we run out of disk all kinds of crazy things can happen,
and nodes become hard to maintain once they are out of disk.
While we try to move shards away if we hit watermarks, this might not
be possible in many situations. Based on the discussion in #24299,
this change monitors disk utilization and adds a flood-stage watermark
that causes all indices allocated on a node hitting the flood-stage
mark to be switched to read-only (with the option to be deleted). This allows users to react to the low disk
situation while subsequent write requests are rejected. Users can switch
individual indices back to read-write once the situation is sorted out. There is no
automatic switch back to read-write once the node has enough space; this requires
user interaction.
The flood-stage watermark is set to `95%` utilization by default.
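The decision itself can be sketched as below. This is an illustration only: the input maps, the class name and the use of `>=` for the comparison are assumptions, and the real change works on cluster state and disk usage stats rather than plain maps.

```java
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

final class FloodStageSketch {
    static final double FLOOD_STAGE_PERCENT = 95.0; // default from the description above

    // Collects every index that has a shard on a node at or above the flood stage.
    static Set<String> indicesToMarkReadOnly(Map<String, Double> usedPercentByNode,
                                             Map<String, List<String>> indicesByNode) {
        Set<String> result = new TreeSet<>();
        for (Map.Entry<String, Double> entry : usedPercentByNode.entrySet()) {
            if (entry.getValue() >= FLOOD_STAGE_PERCENT) {
                result.addAll(indicesByNode.getOrDefault(entry.getKey(), Collections.emptyList()));
            }
        }
        return result;
    }
}
```

Note that the sketch only ever adds blocks. Nothing removes them when usage drops, which mirrors the statement that switching back to read-write requires user interaction.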
Closes #24299
This commit causes a replica to throw back its local checkpoint to the
global checkpoint when learning of a new primary through a replica
operation.
Relates #25452
In 6.x we prevent multiple types and default to `index.mapping.single_type: true`.
This change removes the registered setting and ensures that it's preserved for
5.x indices.
Relates to #24961
Add an Important admonition for upgrading via the command line
using the Windows MSI Installer. This calls out the need to pass
the same command line options for an upgrade as were used for
the initial installation.
All query builders are written as self-contained xContent objects, so we should mark
them accordingly using ToXContentObject. This also makes it possible to use
things like XContentHelper#toXContent to render query builders in tests.
* Adds rewrite phase to aggregations
This change adds aggregations to the rewrite performed by the `SearchSourceBuilder`. This means that `AggregationBuilder`s are able to implement a `rewrite()` method where they can return a new `AggregationBuilder` which is functionally the same but in a more primitive form. This is exactly analogous to the rewrite done by the `QueryBuilder`s.
The first aggregations to implement the rewrite are the filter and filters aggregations, so that they can rewrite the filters they contain.
Closes #17676
* Removes rewrite from PipelineAggregationBuilder
Rewrite is based on shard-level information. Since pipeline aggregations are run in the reduce phase, it doesn’t make sense to rewrite them on the shards. In fact, eventually we shouldn’t be transporting them to the shards at all and should be retaining them on the coordinating node for execution in the reduce phase.
* Addresses review comments
* Addresses more review comments
* Fixed imports
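The aggregation rewrite described in the first commit above follows a simple pattern: `rewrite()` returns the same instance when nothing changes, or a new, more primitive builder otherwise, and the caller repeats until a fixed point is reached. The sketch below shows that pattern with hypothetical stand-in types (`Rewriteable`, `LookupFilter`, `TermFilter`, `FilterAgg`). None of them are the real Elasticsearch classes.

```java
final class AggRewriteSketch {
    // Returns this when nothing changes, otherwise a functionally equivalent, simpler object.
    interface Rewriteable<T> {
        T rewrite();
    }

    interface Filter extends Rewriteable<Filter> {}

    // Already primitive: rewriting is a no-op.
    static final class TermFilter implements Filter {
        final String term;
        TermFilter(String term) { this.term = term; }
        public Filter rewrite() { return this; }
    }

    // Hypothetical filter that resolves to a plain term filter when rewritten.
    static final class LookupFilter implements Filter {
        final String resolvedTerm;
        LookupFilter(String resolvedTerm) { this.resolvedTerm = resolvedTerm; }
        public Filter rewrite() { return new TermFilter(resolvedTerm); }
    }

    // Stand-in for a filter aggregation: it rewrites the filter it contains.
    static final class FilterAgg implements Rewriteable<FilterAgg> {
        final Filter filter;
        FilterAgg(Filter filter) { this.filter = filter; }
        public FilterAgg rewrite() {
            Filter rewritten = filter.rewrite();
            return rewritten == filter ? this : new FilterAgg(rewritten);
        }
    }

    // Applies rewrite() until the object stops changing.
    static <T extends Rewriteable<T>> T rewriteFully(T original) {
        T current = original;
        for (T next = current.rewrite(); next != current; next = current.rewrite()) {
            current = next;
        }
        return current;
    }
}
```

Returning `this` for the unchanged case keeps the fixed-point check a cheap identity comparison.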