Rewrote the GROUP BY to no longer use the terms aggregation (and everything
that comes with it) but to instead rely on the composite aggregation.
This not only works better but also reduces code complexity, since
composite is a straight, two-level tree:
1. root/group-by/composite-keys
2. (metric) aggregations
This removes a lot of complexity from all stages that involve creating,
assembling and especially parsing the results.
By moving to the composite agg, the aggregation/GROUP BY is now pageable,
so the consumer/listener had to be extended to include a dedicated
cursor and specific (bucket) extractors, in line with the scroll requests.
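For illustration, a minimal sketch (field names, page size and class name are made up; this is not the SQL plugin's actual code) of such a two-level composite request built with the Java API, including how a cursor's after-key would be fed back in to fetch the next page:

```java
import java.util.Arrays;
import java.util.Map;

import org.elasticsearch.search.aggregations.AggregationBuilders;
import org.elasticsearch.search.aggregations.bucket.composite.CompositeAggregationBuilder;
import org.elasticsearch.search.aggregations.bucket.composite.CompositeValuesSourceBuilder;
import org.elasticsearch.search.aggregations.bucket.composite.TermsValuesSourceBuilder;
import org.elasticsearch.search.builder.SearchSourceBuilder;

class CompositeGroupBySketch {
    static SearchSourceBuilder nextPage(Map<String, Object> afterKey) {
        // level 1: the composite "group by" keys
        CompositeAggregationBuilder groupBy = new CompositeAggregationBuilder(
                "group_by",
                Arrays.<CompositeValuesSourceBuilder<?>>asList(
                        new TermsValuesSourceBuilder("gender").field("gender")))
                .size(1000);                      // page size of composite buckets
        if (afterKey != null) {
            groupBy.aggregateAfter(afterKey);     // resume from the cursor's last composite key
        }
        // level 2: metric aggregations hang directly off the composite keys
        groupBy.subAggregation(AggregationBuilders.avg("avg_salary").field("salary"));
        return new SearchSourceBuilder().size(0).aggregation(groupBy);
    }
}
```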
While at it, also improved the support for implicit GROUP BY by
formalizing it (previously it supported only counts and no other
agg).
In addition:
Fixed a JDBC bug that caused an incorrect timeout to be passed
Improved the returned RowSet a bit and added better naming
Pick up @Nullable move from core
Make sure to specify the TimeZone for DateTimeHistogram extraction
Add missing javadoc
To avoid delegating the NamedWriteableRegistry (NWR) and to keep the scope
clean, SQL writeables now handle their own serialization, keeping the
boundary with Elasticsearch's NWR in place.
Pass NamedWriteableRegistry only when looking at the next page
To keep in line with the existing pattern and simplify the code
bureaucracy, the deserialization happens directly.
Since the SearchSourceBuilder deserialization happens explicitly (and
it's otherwise opaque), the declarative invocation isn't necessary
anymore.
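As a rough illustration of a writeable handling its own serialization (hypothetical class and fields, not the actual SQL cursor), the state is read and written directly on the stream, with no NamedWriteableRegistry involved:

```java
import java.io.IOException;

import org.elasticsearch.common.io.stream.StreamInput;
import org.elasticsearch.common.io.stream.StreamOutput;
import org.elasticsearch.common.io.stream.Writeable;

class PagingCursorSketch implements Writeable {
    private final byte[] nextQuery;     // explicitly serialized search source for the next page
    private final String[] extractors;  // descriptors for the bucket extractors

    PagingCursorSketch(StreamInput in) throws IOException {
        nextQuery = in.readByteArray();
        extractors = in.readStringArray();
    }

    @Override
    public void writeTo(StreamOutput out) throws IOException {
        out.writeByteArray(nextQuery);
        out.writeStringArray(extractors);
    }
}
```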
Add a bit more randomization in tests
Original commit: elastic/x-pack-elasticsearch@f5af046386
This commit moves the `TimeValue` class into the elasticsearch-core project.
This allows us to use this class in many of our other projects without relying
on the entire `server` jar.
Relates to #28504
This commit switches from manually creating the keystore and adding files
to it to using the built-in support available in the integTestCluster
configuration closure.
This change removes the need to worry about creating the keystore and
possibly dealing with a prompt from the creation command.
Original commit: elastic/x-pack-elasticsearch@8a4026a096
This commit introduces built-in support for adding files to the
keystore when configuring the integration test cluster for a project.
In order to use this support, simply add `keystoreFile` followed by the
secure setting name and the path to the source file inside the
integTestCluster closure for a project. The built-in support will
handle the creation of the keystore and the addition of the file to the
keystore.
Currently numDocs() is computed lazily, but this doesn't help since
BaseCompositeReader calls numDocs() on its sub readers eagerly. This may cause
performance issues since every time we wrap a reader with DocumentSubSetReader
(which means for every query when DLS is enabled) we need to recompute the
number of live documents, which runs in linear time with the number of matches
of the role query.
Not computing numDocs() eagerly in DocumentSubSetReader might help, but it
would also be fragile since callers of this method still usually assume that
it runs in constant time. So I am proposing that we add a cache of the number
of live docs in order to decrease the performance hit of document-level
security. I would expect this cache to be efficient as it will not only reuse
entries in-between refreshes, but also across refreshes for segments that
haven't received any new updates.
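A rough sketch of that idea (hypothetical names, not the actual DocumentSubSetReader code) would cache the count per reader cache key so it is only recomputed when the segment actually changes:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.IntSupplier;

import org.apache.lucene.index.IndexReader;
import org.apache.lucene.index.LeafReader;

final class LiveDocsCountCache {
    // Keyed by the reader cache key, which stays stable across refreshes for segments
    // that have not received new deletes. A real cache would also have to account for
    // the role query applied by the wrapping reader.
    private final Map<IndexReader.CacheKey, Integer> cache = new ConcurrentHashMap<>();

    int numDocs(LeafReader segment, IntSupplier computeNumDocs) {
        IndexReader.CacheHelper helper = segment.getReaderCacheHelper();
        if (helper == null) {
            return computeNumDocs.getAsInt();     // no stable identity, cannot cache
        }
        return cache.computeIfAbsent(helper.getKey(), key -> {
            helper.addClosedListener(cache::remove);  // evict once the reader is closed
            return computeNumDocs.getAsInt();
        });
    }
}
```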
Original commit: elastic/x-pack-elasticsearch@5a3af1b174
* Decouple TimeValue from Elasticsearch server classes
This commit decouples the `TimeValue` class from the other server classes. This
is in preparation to move `TimeValue` into the `elasticsearch-core` jar,
allowing us to use it from projects that cannot depend on the entire
`server` jar.
Relates to #28504
This change removes the check for extra tokens when parsing a source generated by a templated
_msearch request. This check was added unintentionally in #29428, whose intent was to validate
simple _search requests only.
The skeleton of ElasticsearchMergePolicy is quite similar to
MergePolicyWrapper. This commit therefore makes ElasticsearchMergePolicy
inherit from MergePolicyWrapper instead of MergePolicy.
Currently, flush stats contain only the total flush count, which is the sum
of manual flushes (via the API) and periodic flushes (triggered asynchronously
when the uncommitted translog size exceeds the flush threshold). Sometimes,
it's useful to know these two numbers independently. This commit tracks
and returns a periodic flush count in the flush stats.
This adds an `include_type_name` option to the `indices.create`,
`indices.get_mapping` and `indices.put_mapping` APIs, which defaults to `true`.
When set to `false`, mappings will be returned directly in the body of
the `indices.get_mapping` API without keying them by the type name, the
`indices.create` API will expect mappings directly under the `mappings` key, and
the `indices.put_mapping` API will use `_doc` as the type name and fail if a `type`
is provided explicitly.
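For example, assuming the low-level Java REST client's `Request` API (index name and field are made up), a typeless create-index and get-mapping call might look like:

```java
import org.apache.http.HttpHost;
import org.elasticsearch.client.Request;
import org.elasticsearch.client.RestClient;

public class TypelessMappingExample {
    public static void main(String[] args) throws Exception {
        try (RestClient client = RestClient.builder(new HttpHost("localhost", 9200, "http")).build()) {
            // create the index with mappings directly under "mappings", no type name
            Request create = new Request("PUT", "/my-index");
            create.addParameter("include_type_name", "false");
            create.setJsonEntity("{\"mappings\": {\"properties\": {\"title\": {\"type\": \"text\"}}}}");
            client.performRequest(create);

            // the mappings come back without the type-name level as well
            Request getMapping = new Request("GET", "/my-index/_mapping");
            getMapping.addParameter("include_type_name", "false");
            System.out.println(client.performRequest(getMapping));
        }
    }
}
```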
Relates #15613
Today we expose a mutable list of documents in ParseContext via
ParseContext#docs(). This, on the one hand, places knowledge of how
to access nested documents in multiple places and, on the other,
allows for potential illegal access to nested-only docs after
the docs are reversed. This change restricts the access and
streamlines nested / non-root doc access.
This change validates that the `_search` request does not have trailing
tokens after the main object and fails the request with a parsing exception otherwise.
Closes #28995
Some features have been deprecated since `6.0`, like the `_parent` field or the
ability to have multiple types per index. This allows us to remove quite a bit of
code, which in turn will hopefully make it easier to proceed with the removal
of types.
Today when a user runs a CLI tool with standard input closed and no tty
attached, the result from reading is null and this usually leads to a
null pointer exception when we try to parse this input. This arises for
example when the user runs the plugin installer through a Docker
container without leaving standard input open and attaching a tty
(docker exec <container ID> bin/elasticsearch-plugin install). When we
try to read whether the user accepts the plugin requiring additional
security permissions, we will get back null. This commit addresses this
for all cases by throwing an illegal state exception. The solution for
the user is to leave standard input open and attach a tty (or, for some
tools, use batch mode).
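A minimal sketch of the guard (plain Java, not the actual CLI code): when standard input is closed and no tty is attached, the read returns null, and the tool now fails with a clear illegal state exception instead of a null pointer exception later on.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;

final class PromptSketch {
    static String readAnswer(String prompt) throws IOException {
        System.out.print(prompt);
        BufferedReader reader =
                new BufferedReader(new InputStreamReader(System.in, StandardCharsets.UTF_8));
        String answer = reader.readLine();
        if (answer == null) {
            // stdin closed / no tty attached: fail clearly instead of NPE-ing later
            throw new IllegalStateException(
                    "unable to read from standard input; is standard input open and is a tty attached?");
        }
        return answer;
    }
}
```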
#29409 removed the nearlyEquals() double comparison snippet, which
makes these tests very flaky because they can generate very large or
very small doubles which don't work well with absolute error comparison.
We need to either refactor these tests to guarantee they stay in a small
range (which could be difficult due to holt/holt-winters) or re-implement
the more robust double comparison.
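For reference, a rough sketch of a relative-error comparison (the tolerance value is an assumption) that copes with very large and very small doubles better than a plain absolute-error check:

```java
final class DoubleCompareSketch {
    // epsilon is a relative tolerance, e.g. 1e-10
    static boolean nearlyEquals(double a, double b, double epsilon) {
        if (Double.compare(a, b) == 0) {
            return true;                    // identical, including infinities and NaN
        }
        double diff = Math.abs(a - b);
        double scale = Math.max(Math.abs(a), Math.abs(b));
        if (scale < 1.0) {
            return diff <= epsilon;         // near zero, fall back to an absolute tolerance
        }
        return diff <= epsilon * scale;     // otherwise compare the relative error
    }
}
```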
Tracking issue: #29456
Renaming should hopefully make it more clear that this is the size
of pages to process during rolling up, nothing to do with the size
of the various groups, metrics, etc.
Original commit: elastic/x-pack-elasticsearch@8a0a44f04b
This commit simplifies the exception handling in
TranslogWriter#closeWithTragicEvent. When invoking this method, the
inner close method could throw an exception which we always catch and
suppress into the exception that led us to tragically close. This commit
moves that repeated logic into closeWithTragicEvent, and now callers
simply need to catch, invoke closeWithTragicEvent, and rethrow.
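A minimal sketch of the resulting pattern (hypothetical methods, not the actual TranslogWriter code):

```java
import java.io.IOException;

abstract class TragicCloseSketch {
    void closeWithTragicEvent(Exception tragedy) {
        try {
            close();                       // the repeated catch-and-suppress now lives here
        } catch (Exception inner) {
            tragedy.addSuppressed(inner);
        }
    }

    void write(byte[] data) throws IOException {
        try {
            doWrite(data);
        } catch (IOException e) {
            closeWithTragicEvent(e);       // callers: catch, close tragically, rethrow
            throw e;
        }
    }

    abstract void doWrite(byte[] data) throws IOException;

    abstract void close() throws IOException;
}
```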
This constructor was actually never used, other than in tests, and even then,
there is no need for a custom period type as the human-readable toString value
will suffice.
Original commit: elastic/x-pack-elasticsearch@fc666a04b9
Currently rest-based tests do not work from the IDE, as the security
manager is configured to permit certain network operations when
using the snapshot jars compiled by gradle. We have an existing
workaround that explicitly associates a codebase with the path
from which the classes are loaded (in this case, the IDE build
directory). This PR adds the rest client to this workaround list.
If a document with a percolator field is matched when using the `percolate` query, then
the fetch phase can fail because the percolator can't resolve any query from that document.
Closes #29429
It is common for users to wish to adjust the verification_mode in SSL
settings, usually with the intention of skipping hostname
verification. This has been supported for a long time, but the
relevant configuration setting was not clearly documented, which would
sometimes lead users to set `verification_mode` to `none`, and disable
more checks than they intended.
This commit adds clearer documentation regarding the options available
for `verification_mode` and actively discourages the use of `none`.
Original commit: elastic/x-pack-elasticsearch@2fdf53b42f
In case of a disjunction query with both range- and term-based clauses and
msm specified, the query analyzer needs to also reduce the msm if a
range-based clause for the same field is encountered. This did not happen.
Instead of fixing this bug, the logic has been simplified to just set a
percolator query's msm to 1 if a disjunction contains range clauses and
msm has been specified on the disjunction. The logic would otherwise just get
too complex, and the performance gain isn't that much for this kind of
percolator query.
In case a percolator query has clauses that contain duplicate terms or ranges, then
for disjunction clauses with a minimum should match, the query extraction of the
clause with the lowest msm should be used, and for conjunction queries with query
extractions that contain duplicate terms/ranges, the msm should be ignored. If this
is not done, then percolator queries that should match never match.
Example percolator query: value1 OR value2 OR value2 OR value3 OR value3 OR value3 OR value4 OR value5 (msm set to 3)
In the above example query the extracted msm would be 3
Example document1: value1 value2 value3
With the msm and extracted terms this would match and is expected behaviour
Example document2: value3
This document should match too (value3 appears in 3 clauses), but with msm set to 3 and the fact
that only distinct values are indexed in the extracted terms field, this document would not match.
Also added another random duel test.
Closes #29393
* Remove copy-pasted code
We had two instances of copy-pasted code with a bad license from
another website. The code was doing something rather simple, and
that functionality already exists within junit. This PR simply leverages
the junit functionality.
For the user cache, the crypt option rounds are actually the log2 of the number
of rounds. This commit updates the documentation to reflect this.
Original commit: elastic/x-pack-elasticsearch@d3cc2b7f29