Make SQL aware of missing and/or unmapped fields treating them as NULL
Make _all_ functions and operators null-safe aware, including when used
in filtering or sorting contexts
Add missing and null-safe doc value extractor
Modify dataset to have null fields spread around (in groups of 10)
Enforce missing last and unmapped_type inside sorting
Consolidate Predicate templating and declaration
Add support for Like/RLike in scripting
Generalize NULLS LAST/FIRST
Introduce early schema declaration for CSV spec tests: to keep the doc
snippets in place (introduce schema:: prefix for declaration)
upfront.
Fix#32079
A constant can now be used outside aggregation only queries.
Don't skip an ES query in case of constants-only selects.
Loosen the binary pipe restriction of being used only in aggregation queries.
Fixes https://github.com/elastic/elasticsearch/issues/31863
This moves the rollup cleanup code for http tests from the high level rest
client into the test framework and then entirely removes the rollup cleanup
code for http tests that lived in x-pack. This is nice because it
consolidates the cleanup into one spot, automatically invokes the cleanup
without the test having to know that it is "about rollup", and should allow
us to run the rollup docs tests.
Part of #34530
Security caches the result of role lookups and negative lookups are
cached indefinitely. In the case of transient failures this leads to a
bad experience as the roles could truly exist. The CompositeRolesStore
needs to know if a failure occurred in one of the roles stores in order
to make the appropriate decision as it relates to caching. In order to
provide this information to the CompositeRolesStore, the return type of
methods to retrieve roles has changed to a new class,
RoleRetrievalResult. This class provides the ability to pass back an
exception to the roles store. This exception does not mean that a
request should be failed but instead serves as a signal to the roles
store that missing roles should not be cached and neither should the
combined role if there are missing roles.
As part of this, the negative lookup cache was also changed from an
unbounded cache to a cache with a configurable limit.
Relates #33205
* New OCTET_LENGTH function
* Changed the way the FunctionRegistry stores functions, considering the alphabetic ordering by name
* Added documentation for the RANDOM function
Since all calls to `ESLoggerFactory` outside of the logging package were
deprecated, it seemed like it'd simplify things to migrate all of the
deprecated calls and declare `ESLoggerFactory` to be package private.
This does that.
Previously, parsing an arithmetic expression with `*` and no spaces,
e.g.: `2*i` threw a parsing exception as the grammar rule for
tableIdentifier was clashing with the rule for arithmetic operator `*`.
This issue comes already in the lexer and the left part of the
expression (in our example `2*`) was recognised as a
TABLE_IDENTIFIER token.
The solution adopted is to allow the `*` wildcard in the table name
only if it's surrounded with double quotes, e.g.: `"my*index"`
Closes: #33957
We use wrap code in `// tag` and `//end` to include it in our docs. Our
current docs style wraps code snippets in a box that is only wide enough
for 76 characters and adds a horizontal scroll bar for wider snippets
which makes the snippet much harder to read. This adds a checkstyle check
that looks for java code that is included in the docs and is wider than
that 76 characters so all snippets fit into the box. It solves many of
the failures that this catches but suppresses many more. I will clean
those up in a follow up change.
Centralize and simplify the script generation between operators and functions which are currently decoupled. As part of this process most predicates (<, <=, etc...) were made ScalarFunction as their purpose and functionality is quite similar (see % and MOD functions).
Renamed ProcessDefinition to Pipe
Add ScriptWeaver as a mixin-in interface for script customization
Add logic for equals/lte/lt
Improve BinaryOperator/expression toString
Minimize duplication across string functions
Close#33975
This change cleans up "unused variable" warnings. There are several cases were we
most likely want to suppress the warnings (especially in the client documentation test
where the snippets contain many unused variables). In a lot of cases the unused
variables can just be deleted though.
Two issues are resolved:
1. The `value_type` should be either long or double in case of numeric.
2. The key label for the aggregate filter (having) was duplicate of an aggr key label.
Fixes: #33520
* Changed the format of the String functions documentation page.
* Adopted the same format for Math functions, but completely changed the examples.
* Added missing documentation for Math functions.
This commit reverts most of #33157 as it introduces another race
condition and breaks a common case of watcher, when the first watch is
added to the system and the index does not exist yet.
This means, that the index will be created, which triggers a reload, but
during this time the put watch operation that triggered this is not yet
indexed, so that both processes finish roughly add the same time and
should not overwrite each other but act complementary.
This commit reverts the logic of cleaning out the ticker engine watches
on start up, as this is done already when the execution is paused -
which also gets paused on the cluster state listener again, as we can be
sure here, that the watches index has not yet been created.
This also adds a new test, that starts a one node cluster and emulates
the case of a non existing watches index and a watch being added, which
should result in proper execution.
Closes#33320
* Added TRUNCATE function, modified ROUND to accept two parameters instead of one. Made the second parameter optional for both functions.
* Added documentation for both functions.
Previously multiple comma separated lists of options where not
recognized correctly which resulted in only the last of them
to be taked into account, e.g.:
For the following query:
SELECT * FROM test WHERE QUERY('search', 'default_field=foo', 'default_operator=and')"
only the `default_operator=and` was finally passed to the ES query.
Fixes: #32602
Changes the format of log events in the audit logfile.
It also changes the filename suffix from `_access` to `_audit`.
The new entry format is consistent with Elastic Common Schema.
Entries are formatted as JSON with no nested objects and field
names have a dotted syntax. Moreover, log entries themselves
are not spaced by commas and there is exactly one entry per line.
In addition, entry fields are ordered, unlike a typical JSON doc,
such that a human would not strain his eyes over jumbled
fields from one line to the other; the order is defined in the log4j2
properties file.
The implementation utilizes the log4j2's `StringMapMessage`.
This means that the application builds the log event as a map
and the log4j logic (the appender's layout) handle the format
internally. The layout, such as the set of printed fields and their
order, can be changed at runtime without restarting the node.
We have a test dependency on Apache Mina when using SimpleKdcServer
for testing Kerberos. When checking for LDAP backend connectivity,
the code checks for deadlocks which require additional security
permissions accessClassInPackage.sun.reflect. As this is only for
test and we do not want to add security permissions to production,
this commit moves these tests and related classes to
x-pack evil-tests where they can run with security manager disabled.
The plan is to handle the security manager exception in the upstream issue
DIRMINA-1093
and then once the release is available to run these tests with security
manager enabled.
Closes#32739
This change removes the wrapping of the created field in the put user
response. The created field was added as a top level field in #32332,
while also still being wrapped within the `user` object of the
response. Since the value is available in both formats in 6.x, we can
remove the wrapped version for 7.0.
Previously, when an non-pruned cast (casting as a different
data type) got applied on a table column in the `SELECT` clause,
the name of the result column didn't contain the target data type
of the cast, e.g.:
SELECT CAST(MAX(salary) AS DOUBLE) FROM "test_emp"
returned as column name:
CAST(MAX(salary))
instead of:
CAST(MAX(salary) AS DOUBLE)
Closes#33571
* Added more tests for trivial casts that are pruned
This commit adds settings upgraders for the search.remote.* settings
that can be in the cluster state to automatically upgrade these settings
to cluster.remote.*. Because of the infrastructure that we have here,
these settings can be upgraded when recovering the cluster state, but
also when a user tries to make a dynamic update for these settings.
These logs are incredibly verbose, and it makes chasing normal failures
burdensome. This commit removes the debug logging, which can be
reenabled again if needed.
Previously, when an arithmetic function got applied on a
table column in the `SELECT` clause, the name of the result
column contained weird characters used internally when
processing the SQL statement e.g.:
SELECT CHAR(emp_no % 10000) FROM "test_emp"
returned:
CHAR((emp_no{f}#14) % 10000))
as the column name instead of:
CHAR((emp_no) % 10000))
Also, fix an issue that causes a ClassCastException to be thrown
when using functions where both arguments are literals.
Closes#31869Closes#33461
Split function section into multiple chapters
Add String functions
Add (small) section on Conversion/Cast functions
Add missing aggregation functions
Enable documentation testing (was disabled by accident). While at it,
fix failing tests
Improve spec tests to allow multi-line queries (useful for docs)
Add ability to ignore a spec test (name should end with -Ignore)
With features like CCR building on the CCS infrastructure, the settings
prefix search.remote makes less sense as the namespace for these remote
cluster settings than does a more general namespace like
cluster.remote. This commit replaces these settings with cluster.remote
with a fallback to the deprecated settings search.remote.
Extend SHOW TABLES, DESCRIBE and SHOW COLUMNS to support table
identifiers not just SQL LIKE pattern.
This allows both Elasticsearch-style multi-index patterns and SQL LIKE.
To disambiguate between the two (as the " vs ' can be easy to miss),
the grammar now requires LIKE keyword as a prefix for all LIKE-like
patterns.
Also added some docs comparing the two types of patterns.
Fix#33294
We need to wait for the job to fully initialize and start before
we can attempt to stop it. If we don't, it's possible for the stop
API to be called before the persistent task is fully loaded and it'll
throw an exception.
Closes#32773
In the previous commit where SmokeTestWatcherWithSecurityIT tests were
muted, I added the incorrect issue numbers. This commit fixes this. The
issue for the tests is #33320.
* Tests for JdbcResultSet
* Added VARCHAR conversion for different types
* Made error messages consistent: they now contain both the type that fails to be converted and the value itself
Authorization Realms allow an authenticating realm to delegate the task
of constructing a User object (with name, roles, etc) to one or more
other realms.
E.g. A client could authenticate using PKI, but then delegate to an LDAP
realm. The LDAP realm performs a "lookup" by principal, and then does
regular role-mapping from the discovered user.
This commit includes:
- authorization_realm support in the pki, ldap, saml & kerberos realms
- docs for authorization_realms
- checks that there are no "authorization chains"
(whereby "realm-a" delegates to "realm-b", but "realm-b" delegates to "realm-c")
Authorization realms is a platinum feature.