This commit enhances the license detection that we have for various
licenses. Here we improve the detection for all licenses (especially the
Apache 2.0 License), the BSD 2-clause license, the MIT (with
attribution) license, and we add detection for the BSD 3-clause
license. One way that we achieved this improvement is by changing how
the license files are read so that rather than reading them as a
multi-line string which ended up represented as "[line1, line2, line3,
...]" internally, we read the full bytes of the license text and replace
all whitespace with a single space so the license text is now loaded as
"line1 line2 line3". For the MIT license we add the actual license text
and remove the "MIT" string as not all copies of the license clearly
indicate that the text is the MIT license. We take a similar strategy
for the BSD-2 and BSD-3 clause licenses. With this change, we reduce the
number of "custom" licenses in the codebase from 31 to 2. The two
remaining appear to be truly custom licenses, not carrying licenses
identifiable by SPDX. A follow-up will address "unknown" licenses.
The following analyzers were moved from server module to analysis-common module:
`snowball`, `arabic`, `armenian`, `basque`, `bengali`, `brazilian`, `bulgarian`,
`catalan`, `chinese`, `cjk`, `czech`, `danish`, `dutch`, `english`, `finnish`,
`french`, `galician` and `german`.
Relates to #23658
We moved to 1 shard by default which caused some issues in how many
concurrent shard requests we allow by default. For instance searching
a 5 shard index on a single node will now be executed serially per shard
while we want these cases to have a good concurrency out of the box. This
change moves to `numNodes * 5` which corresponds to the default we used to
have in the previous version.
Relates to #30783Closes#30994
Full restructure of the spec into new sections for operators, statements, scripts, functions, lambdas, and regexes. Split of operators into 6 sections, a table, reference, array, numeric, boolean, and general. Clean up of all operators sections. Sporadic clean up else where.
* Initial commit of rest high level exposure of cancel task
* fix javadocs
* address some code review comments
* update branch to use tasks namespace instead of cluster
* High-level client: list tasks failure to not lose nodeId
This commit reworks testing for `ListTasksResponse` so that random
fields insertion can be tested and xcontent equivalence can be checked
too. Proper exclusions need to be configured, and failures need to be
tested separately. This helped finding a little problem, whenever there
is a node failure returned, the nodeId was lost as it was never printed
out as part of the exception toXContent.
* added comment
* merge from master
* re-work CancelTasksResponseTests to separate XContent failure cases from non-failure cases
* remove duplication of logic in parser creation
* code review changes
* refactor TasksClient to support RequestOptions
* add tests for parent task id
* address final PR review comments, mostly formatting and such
We no longer need animal sniffer because we use JDK functionality
(introduced in JDK 9) to target older versions of the JDK for
compilation. This functionality means that the JDK handles the problem
of ensuring that we do not use JDK APIs from the version that we are
compiling from that are not available in the version that we are
compiling to. A previous commit removed this for the REST client (where
we target JDK 7) but a few traces were left behind.
Use all running nodes as unicast seeds in the rolling restart tests to
avoid a race between pinging and the tests. Without this if the tests
are too fast then when a new node comes up and pings its single
configured seed node that node *might* not have a ping from the other
running node.
This is related to #27260 and #28898. This commit adds the transport-nio
plugin as a random option when running the http smoke tests. As part of
this PR, I identified an issue where cors support was not properly
enabled causing these tests to fail when using transport-nio. This
commit also fixes that issue.
Adds support for `ignore_unmapped` parameter in geo distance sorting,
which is functionally equivalent to specifying an `unmapped_type` in
the field sort.
Closes#28152
Several AcknowledgedResponse implementations only parse the boolean acknowledged
flag and then create an instance of their class using that flag. This can be
simplified by adding this basic parser to the superclass, provide a common
helper method and call the appropriate ctor in the fromXContent methods.
This change moves an integration test that relies on setting
the value of a static variable (boolean max clause count) to
an unit test where we are sure that the same jvm is used to access
the static variable.
This field is similar to the `feature` field but is better suited to index
sparse feature vectors. A use-case for this field could be to record topics
associated with every documents alongside a metric that quantifies how well
the topic is connected to this document, and then boost queries based on the
topics that the logged user is interested in.
Relates #27552
By default span_multi query will limit term expansions = boolean max clause.
This will limit high heap usage in case of high cardinality term
expansions. This applies only if top_terms_N is not used in inner multi
query.
This commit adjusts the indentation in the CLI scripts to give a clear
visual indication that the line being indented is a continuation of the
previous line.
SSLTrustRestrictionsTests updates the restrictions YML file during the test run to change the set of restrictions. This update was small, but it wasn't atomic.
If the yml file is reloaded while empty or invalid, then it causes all SSL certificates to be considered invalid (until it is reloaded again), which could break the sniffing/administrative client that runs underneath the tests.
A previous refactoring of the CLI scripts migrated all of the CLI tools
to shell to a common script, elasticsearch-cli. This approach is fine in
Bash where it is easy to tear arguments apart but it doesn't work so
well on Windows where quoting is insane. To avoid having to tear the
arguments apart to separate the first argument to elasticsearch-cli from
the remaining arguments, we instead choose a strategy where we can avoid
tearing the arguments apart. To do this, we will instead pass the main
class by an environment variable and then we can pass the arguments
straight through. This will let us avoid awful quoting issues on
Windows. This is the Windows side of that effort and the Bash side was
in a previous commit.
A previous refactoring of the CLI scripts migrated all of the CLI tools
to shell to a common script, elasticsearch-cli. This approach is fine in
Bash where it is easy to tear arguments apart but it doesn't work so
well on Windows where quoting is insane. To avoid having to tear the
arguments apart to separate the first argument to elasticsearch-cli from
the remaining arguments, we instead choose a strategy where we can avoid
tearing the arguments apart. To do this, we will instead pass the main
class by an environment variable and then we can pass the arguments
straight through. This will let us avoid awful quoting issues on
Windows. This is the non-Windows side of that effort and the Windows
side will be in a follow-up.
This is related to #28898. This commit adds the acceptor thread name to
the method checking if this thread is a transport thread. Additionally,
it modifies the nio http transport to use the same worker name as the
netty4 http server transport.
With #30490 we have introduced a new way to provide request options
whenever sending a request using the high-level REST client. Before you
could provide headers as the last argument varargs of each API method,
now you can provide `RequestOptions` that in the future will allow to
provide more options which can be specified per request.
This commit deprecates all of the client methods that accept a `Header`
varargs argument in favour of new methods that accept `RequestOptions`
instead. For some API we don't even go through deprecation given that
they were not released since they were added, hence in that case we can
just move them to the new method.
This is related to #27260. This commit combines the AcceptingSelector
and SocketSelector classes into a single NioSelector. This change
allows the same selector to handle both server and socket channels. This
is valuable as we do not necessarily want a dedicated thread running for
accepting channels.
With this change, this commit removes the configuration for dedicated
accepting selectors for the normal transport class. The accepting
workload for new node connections is likely low, meaning that there is
no need to dedicate a thread to this process.
I pushed a test that `assertBusy`s for a whole hour accidentally. I was
testing something and forgot to revert my local hack but caught it on
backport. This removes it.
This is much more realistic and can find more issues. This causes the
"mixed cluster" tests to be run twice so I had to fix the tests to work
in that case. In most cases I did as little as possible to get them
working but in a few cases I went a little beyond that to make them
easier for me to debug while getting them to work. My test changes:
1. Remove the "basic indexing" tests and replace them with a copy of the
tests used in the OSS. We have no way of sharing code between these two
projects so for now I copy.
2. Skip the a few tests in the "one third" upgraded scenario:
* creating a scroll to be reused when the cluster is fully upgraded
* creating some ml data to be used when the cluster is fully ugpraded
3. Drop many "assert yellow and that the cluster has two nodes"
assertions. These assertions duplicate those made by the wait condition
and they fail now that we have three nodes.
4. Switch many "assert green and that the cluster has two nodes" to 3
nodes. These assertions are unique from the wait condition and, while
I imagine they aren't required in all cases, now is not the time to
find that out. Thus, I made them work.
5. Rework the index audit trail test so it is more obvious that it is
the same test expecting different numbers based on the shape of the
cluster. The conditions for which number are expected are fairly
complex because the index audit trail is shut down until the template
for it is upgraded and the template is upgraded when a master node is
elected that has the new version of the software.
6. Add some more information to debug the index audit trail test because
it helped me figure out what was going on.
I also dropped the `waitCondition` from the `rolling-upgrade-basic`
tests because it wasn't needed.
Closes#25336
Currently the engine is initialized with a hardcoded 256MB of RAM. Elasticsearch
may never use more than that for a given shard, `IndexingMemoryController` only
has the power to flush segments to disk earlier in case multiple shards are
actively indexing and use too much memory.
While this amount of memory is enough for an index with few fields and larger
RAM buffers are not expected to improve indexing speed, this might actually be
little for an index that has many fields.
Kudos to @bleskes for finding it out when looking into a user who was reporting
a **much** slower indexing speed when upgrading from 2.x to 5.6 with an index
that has about 20,000 fields.
In #19749 an extra check was added before writing each blob to ensure that we would not be
overriding an existing blob. Due to S3's weak consistency model, this check was best effort. To
make matters worse, however, this resulted in a HEAD request to be done before every PUT, in
particular also when PUTTING a new object. The approach taken in #19749 worsened our
consistency guarantees for follow-up snapshot actions, as it made it less likely for new files that
had been written to be available for reads.
This commit therefore removes this extra check. Due to the weak consistency model, this check
was a best effort thing anyway, and there's currently no way to prevent accidental overrides on S3.
The native realm's usage stats were previously pulled from the cache,
which only contains the number of users that had authenticated in the
past 20 minutes. This commit changes this so that we pull the current
value from the security index by executing a search request. In order
to support this, the usage stats for realms is now asynchronous so that
we do not block while waiting on the search to complete.