When trying to analyse a HAVING condition, we crate a temp Aggregate
Plan which wasn't created correctly (missing the expressions from the
SELECT clause) and as a result the ordinal (1, 2, etc) in the GROUP BY
couldn't be resolved correctly.
Also after successfully analysing the HAVING condition, still the
original plan was returned.
Fixes: #36059
Introduce Histogram grouping function for bucketing/grouping data based
on a given range. Both date and numeric histograms are supported using
the appropriate range declaration (numbers vs intervals).
SELECT HISTOGRAM(number, 50) AS h FROM index GROUP BY h
SELECT HISTOGRAM(date, INTERVAL 1 YEAR) AS h FROM index GROUP BY h
In addition add multiply operator for Intervals
Add docs for intervals and histogram
Fix#36509
Add CURRENT_TIMESTAMP as keyword as well function alongside NOW()
These return the current date/time for the given query, computed when
the statement reaches the server. For completeness, CURRENT_TIMESTAMP
also accepts precision as an optional parameter.
Fix#36534
Add missing `formatTemplate()` for conditional functions which
resulted in incomplete painless script. Moreover the specific
return type of Object in the painless signatures resulted in
casting exceptions when conditional functions are used in the
ORDER BY.
Fixes: #36631
Previously, Math.floorMod was used for integers and longs
which has different logic for negative numbers. Also, the
priority of data types check was wrong as if one of the args
is double the evaluation should be with doubles, then for floats,
then longs and finally integers.
Fixes: #36364
The getters and setters for useDisMax() have been deprecated since at least 6.0,
also there hasn't been any reference to the query parameter in the
documentation. Removing it from the builder and tests and replacing it with
`tieBreaker(1.0f)` where necessary.
* Renamed DAY_OF_WEEK and WEEK_OF_YEAR functions to their ISO version and
added the same functions with different functionality.
* Rewritten the datetime functions documentation to follow the format of the other
functions documentation pages.
Previously, we used a CamelCase to CAMEL_CASE transformation to get the
primary name of a function from its class name which led to some issues
since there are functions that we don't want to be registered this way
(e.g.: IFNULL). To simplify the logic and avoid and "magic"
transformations in the FunctionRegistry a primary name must be provided
explicitely for each function.
The same change is applied for the function resolution (when a function
is used in an SQL statement). There is no CamelCase to CAMEL_CASE
transformation but only upper-casing is applied (fuNcTiOn -> FUNCTION).
This commit changes the format of the `hits.total` in the search response to be an object with
a `value` and a `relation`. The `value` indicates the number of hits that match the query and the
`relation` indicates whether the number is accurate (in which case the relation is equals to `eq`)
or a lower bound of the total (in which case it is equals to `gte`).
This change also adds a parameter called `rest_total_hits_as_int` that can be used in the
search APIs to opt out from this change (retrieve the total hits as a number in the rest response).
Note that currently all search responses are accurate (`track_total_hits: true`) or they don't contain
`hits.total` (`track_total_hits: true`). We'll add a way to get a lower bound of the total hits in a
follow up (to allow numbers to be passed to `track_total_hits`).
Relates #33028
When the grouping key of a GROUP BY is a painless script (functions are
involved), the data type of the key was incorrect in certain cases
(Boolean, IP, Date). This resulted in returning wrong data type for this
columns in the query results. E.g.:
```
SELECT COUNT(*), a > 10 AS a FROM t GROUP BY a
```
Fixes: #35662
This change removes the deprecated useDisMax() and useAllFields() methods from
the QueryStringQueryBuilder and related tests. The disMax parameter has already
been a no-op since 6.0 and also the useAllFields has been deprecated since 6.0
and there is a direct replacement via defaultField.
`SIGN` and `RADIANS` where wrongly overriding `mathFunction()`.
Converted `mathFunction()` to private in `MathFunction` since it
shouldn't be overriden, as it uses the assigned `MathOperation`
to get the funtion name for painless scripts.
Fixes: #35654
Add special verifier rule to check that the arguments of conditional
functions are of the same or compatible types. This way the user gets
a descriptive error message with line number and column indicating
where is the offending argument.
Closes: #35907
Add GREATEST(expr1, expr2, ... exprN) and LEAST(expr1, expr2, exprN)
functions which are in the family of CONDITIONAL functions.
Implementation follows PostgreSQL behaviour, so the functions return
`NULL` when all of their arguments evaluate to `NULL`.
Renamed `CoalescePipe` and `CoalesceProcessor` to `ConditionalPipe` and
`ConditionalProcessor` respectively, to be able to reuse them for
`Greatest` and `Least` evaluations. To achieve that `ConditionalOperation`
has been added to differentiate between the functionalities at execution
time.
Closes: #35878
Due to some unresolvable type conflict between the expected definition
in JDBC vs ODBC, the driver mode is now passed over so that certain
command can change their results accordingly (in this case SYS COLUMNS)
Fix#35376
This operator handles nulls in different way than the normal `=`.
If one of the operants is `null` and the other not it returns `false`.
If both operants are `null` it returns `true`. Therefore in contrary to
`=`, which returns `null` if at least one of the operants is `null`, this one
never returns `null` as a result.
Closes: #35871
Move away from performing eager, fail-fast validation of mismatched
mapping to a lazy evaluation based on the fields actually used in the
query. This allows queries to run on the parts of the indices that
"work" which is not just convenient but also a necessity for large
mappings (like logging) where alignment is hard/impossible to achieve.
Fix#35659
Introduce INTERVAL as a DataType
Add INTERVAL to the grammar which supports the standard SQL declaration
(without precision):
> INTERVAL '1 23:45:01.123456789' DAY TO SECOND
but also number for single unit intervals:
> INTERVAL 1 YEAR
as well as the plurals of the units:
> INTERVAL 2 YEARS
Interval are internally supported as just another Literal being backed
by java.time.Period and java.time.Duration
Move JDBC away from JDBCType enum to SQLType interface
Refactor DataType by moving it into server core and adding dedicated (and
much simpler) JDBC driver type
Improve internal JDBC conversion by normalizing on the DataType
Rename JDBC columnInfo to JdbcColumnInfo to differentiate between it and
the SQL ColumnInfo
Fix#29990
This parameter in the `query_string` query was deprecated in 6.0 and ignored
since then. Its API methods and remaining uses can be removed in the upcoming
major version.
Relates to #35734
Fix bug in Analyzer that caused it to report unsupported fields only
when declared in projections. The rule has been extended to all field
declarations.
Fix#35673
Add `IsNull` node in parser to simplify expressions so that `<value> IS NULL` is
no longer translated internally to `NOT(<value> IS NOT NULL)`
Replace `IsNotNullProcessor` with `CheckNullProcessor` to encapsulate both
isNull and isNotNull functionality.
Closes: #34876Fixes: #35171
Change `nullable()` logic of AND and OR to false since in the Optimizer
we cannot fold to null as we might be handling and expression in the
SELECT clause.
Introduce folding of null for AND and OR expressions in PruneFilter()
since we now know that we are in HAVING or WHERE clause and we
can fold `null` to `false`
Fixes: #35088
Co-authored-by: Costin Leau <costin.leau@gmail.com>
Grammar's identifiers can be completely skipped from counting depths
as they just add another level to the tree and they are always children
of some other expression which gets counted.
Increased maximum depth from 100 to 200. After testing on production
configuration with -Xss1m, depths of at least 250 can be used, so being
conservative we put the limit lower.
Fixes: #35299
Override `process()` in `BinaryLogicProcessor` which doesn't immediately
return null if left or right argument is null, which is the behaviour of
`process()` of the parent class `BinaryProcessor`.
Also, add more tests for `AND` and `OR` in SELECT clause with literal.
Fixes: #35240
Add NotEquals node in parser to simplify expressions so that <value1> != <value2> is
no longer translated internally to NOT(<value1> = <value2>)
Closes: #35210Fixes: #35233
Include `null` literals when generating the painless script for `IN` expressions.
Previously, they were skipped, because of an issue that had been fixed with #35108.
Fixes: #35122
Replace standard `||` and `==` painless operators with
new `in` method introduced in `InternalSqlScriptUtils`.
This allows the list of values to become a script variable
which is replaced each time with the list of values provided
by the user.
Move In to the same package as InPipe & InProcessor
Follow up to #34750
Co-authored-by: Costin Leau <costin.leau@gmail.com>
Stop passing `Settings` to `AbstractComponent`'s ctor. This allows us to
stop passing around `Settings` in a *ton* of places. While this change
touches many files, it touches them all in fairly small, mechanical
ways, doing a few things per file:
1. Drop the `super(settings);` line on everything that extends
`AbstractComponent`.
2. Drop the `settings` argument to the ctor if it is no longer used.
3. If the file doesn't use `logger` then drop `extends
AbstractComponent` from it.
4. Clean up all compilation failure caused by the `settings` removal
and drop any now unused `settings` isntances and method arguments.
I've intentionally *not* removed the `settings` argument from a few
files:
1. TransportAction
2. AbstractLifecycleComponent
3. BaseRestHandler
These files don't *need* `settings` either, but this change is large
enough as is.
Relates to #34488
Extract data type verification for function arguments to a single place
so that NULL type can be treated as RESOLVED for all functions. Moreover
this enables to have consistent verification error messages for all functions.
Fixes: #34752Fixes: #33469
Previously, for some queries the validation for ORDER BY
fields didn't kick in since a HAVING close or an ORDER BY
with scalar function would add `Filter` and `Project` plans
between the `OrderBy` and the `Aggregate`.
Now the LogicalPlan tree beneath `OrderBy` is traversed and
the ORDER BY fields are properly verified.
Fixes: #34590
Previously, `Mapper` was returning an incorrect plan which resulted in an
```
SQLFeatureNotSupportedException: Found 1 problem(s)
line 1:8: Unexecutable item
```
Queries with a `WHERE` and/or `HAVING` clause which results in NO_MATCH
are now handled correctly and return 0 rows.
Fixes: #34613
Co-authored-by: Costin Leau <costin.leau@gmail.com>
Implemented null handling for both the value tested but also for
values inside the list of values tested against.
The null handling is implemented for local processors, painless scripts
and Lucene Terms queries making it available for `IN` expressions occuring
in `SELECT`, `WHERE` and `HAVING` clauses.
Closes: #34582
- Restrict visibility of Aggregators and Factories
- Move PipelineAggregatorBuilders up a level so it is consistent with
AggregatorBuilders
- Checkstyle line length fixes for a few classes
- Minor odds/ends (swapping to method references, formatting, etc)
Extend querying support on multiple indices from being strictly
identical to being just compatible.
Use FieldCapabilities API (extended through #33803) for mapping merging.
Close#31837#31611
Implement the functionality to translate the
`field IN (value1, value2,...)` expressions to proper Lucene queries
or painless script or local processors depending on the use case.
The `IN` expression can be used in SELECT, WHERE and HAVING clauses.
Closes: #32955
`CONVERT` works exactly like cast with slightly different syntax:
`CONVERT(<value>, <data_type)` as opposed to `CAST(<value> AS <data_type>)`
Moreover it support format of the MS-SQL data types `SQL_<type>`,
e.g.: `SQL_INTEGER`
Closes: #34513
Make SQL aware of missing and/or unmapped fields treating them as NULL
Make _all_ functions and operators null-safe aware, including when used
in filtering or sorting contexts
Add missing and null-safe doc value extractor
Modify dataset to have null fields spread around (in groups of 10)
Enforce missing last and unmapped_type inside sorting
Consolidate Predicate templating and declaration
Add support for Like/RLike in scripting
Generalize NULLS LAST/FIRST
Introduce early schema declaration for CSV spec tests: to keep the doc
snippets in place (introduce schema:: prefix for declaration)
upfront.
Fix#32079
A constant can now be used outside aggregation only queries.
Don't skip an ES query in case of constants-only selects.
Loosen the binary pipe restriction of being used only in aggregation queries.
Fixes https://github.com/elastic/elasticsearch/issues/31863
* New OCTET_LENGTH function
* Changed the way the FunctionRegistry stores functions, considering the alphabetic ordering by name
* Added documentation for the RANDOM function
The `-` and `+` as a number literal prefix are already
parsed by the rule in `valueExpression`. To accommodate
this, there are some code changes that enables the
`ExpressionBuilder` to parse Literal integers and decimals
together with the `-/+` prefix sign (if exists) and validate
them (wrong format, large numbers, etc.).
Follows: #33854
Previously, parsing an arithmetic expression with `*` and no spaces,
e.g.: `2*i` threw a parsing exception as the grammar rule for
tableIdentifier was clashing with the rule for arithmetic operator `*`.
This issue comes already in the lexer and the left part of the
expression (in our example `2*`) was recognised as a
TABLE_IDENTIFIER token.
The solution adopted is to allow the `*` wildcard in the table name
only if it's surrounded with double quotes, e.g.: `"my*index"`
Closes: #33957
Remove CamelCase to CAMEL_CASE conversion when resolving
a function. Only convert user input to upper case and then
try to match with aliases or primary names.
Keep the internal conversion FunctionName to FUNCTION__NAME
which provides flexibility when registering functions by their class
name.
Fixes: #34114
Fixes the equals and hash function to ignore the order of aggregations to ensure equality after serialization
and deserialization. This ensures storing configs with aggregation works properly.
This also addresses a potential issue in caching when the same query contains aggregations but in
different order. 1st it will not hit in the cache, 2nd cache objects which shall be equal might end up twice in
the cache.
Centralize and simplify the script generation between operators and functions which are currently decoupled. As part of this process most predicates (<, <=, etc...) were made ScalarFunction as their purpose and functionality is quite similar (see % and MOD functions).
Renamed ProcessDefinition to Pipe
Add ScriptWeaver as a mixin-in interface for script customization
Add logic for equals/lte/lt
Improve BinaryOperator/expression toString
Minimize duplication across string functions
Close#33975
This change cleans up "unused variable" warnings. There are several cases were we
most likely want to suppress the warnings (especially in the client documentation test
where the snippets contain many unused variables). In a lot of cases the unused
variables can just be deleted though.
Implement circuit breaker logic in the parser which catches expressions
that can blow up the tree and result in StackOverflowError being thrown.
Co-authored-by: Costin Leau <costin.leau@gmail.com>
Two issues are resolved:
1. The `value_type` should be either long or double in case of numeric.
2. The key label for the aggregate filter (having) was duplicate of an aggr key label.
Fixes: #33520
* Added TRUNCATE function, modified ROUND to accept two parameters instead of one. Made the second parameter optional for both functions.
* Added documentation for both functions.
Removed rules in the grammar that were superfluous, as they
are already "caught" other rules in the same context.
Also switched to exact ambig detection for debug mode
Fixes: #31885
Previously multiple comma separated lists of options where not
recognized correctly which resulted in only the last of them
to be taked into account, e.g.:
For the following query:
SELECT * FROM test WHERE QUERY('search', 'default_field=foo', 'default_operator=and')"
only the `default_operator=and` was finally passed to the ES query.
Fixes: #32602
DAYNAME and MONTHNAME functions tests will be skipped if the right JVM parameter (-Djava.locale.providers=COMPAT) is not used in unit testing environment
Previously, when an non-pruned cast (casting as a different
data type) got applied on a table column in the `SELECT` clause,
the name of the result column didn't contain the target data type
of the cast, e.g.:
SELECT CAST(MAX(salary) AS DOUBLE) FROM "test_emp"
returned as column name:
CAST(MAX(salary))
instead of:
CAST(MAX(salary) AS DOUBLE)
Closes#33571
* Added more tests for trivial casts that are pruned
* SQL: Make Literal a NamedExpression
Literal now is a NamedExpression reducing the need for Aliases for
folded expressions leading to simpler optimization rules.
Fix#33523
Previously, when an arithmetic function got applied on a
table column in the `SELECT` clause, the name of the result
column contained weird characters used internally when
processing the SQL statement e.g.:
SELECT CHAR(emp_no % 10000) FROM "test_emp"
returned:
CHAR((emp_no{f}#14) % 10000))
as the column name instead of:
CHAR((emp_no) % 10000))
Also, fix an issue that causes a ClassCastException to be thrown
when using functions where both arguments are literals.
Closes#31869Closes#33461
The main benefit of the upgrade for users is the search optimization for top scored documents when the total hit count is not needed. However this optimization is not activated in this change, there is another issue opened to discuss how it should be integrated smoothly.
Some comments about the change:
* Tests that can produce negative scores have been adapted but we need to forbid them completely: #33309Closes#32899
Extend SHOW TABLES, DESCRIBE and SHOW COLUMNS to support table
identifiers not just SQL LIKE pattern.
This allows both Elasticsearch-style multi-index patterns and SQL LIKE.
To disambiguate between the two (as the " vs ' can be easy to miss),
the grammar now requires LIKE keyword as a prefix for all LIKE-like
patterns.
Also added some docs comparing the two types of patterns.
Fix#33294
The incorrect NodeInfo is created when the optional parameter is not used, leading to the incorrect constructor being used. Simplified LocateFunctionProcessorDefinition by using one constructor instead of two.
Fixes https://github.com/elastic/elasticsearch/issues/32554
Skip the comparative tests using lowercasing/uppercasing against H2 (which considers the Locale).
ES-SQL is, so far, ignoring the Locale.
Still, the same queries are executed against ES-SQL alone and results asserted to be correct.
Added support for string manipulating functions with more than one parameter:
CONCAT, LEFT, RIGHT, REPEAT, POSITION, LOCATE, REPLACE, SUBSTRING, INSERT
Due to the way ANTLR works, any declared tokens need to be accounted for
manually inside function names (otherwise a different rule gets applied).
Fix#32046
The previous errors in compileJava were not cause by the brackets but my the
content of the @link section. Corrected this so its a working javadoc link again.
Added support for ASCII, BIT_LENGTH, CHAR, CHAR_LENGTH, LCASE, LENGTH, LTRIM, RTRIM, SPACE, UCASE functions.
Wherever Painless scripting is necessary (WHERE conditions, ORDER BY etc), those scripts are being used.
For historical reasons SQL restricts GROUP BY to only one field.
This commit removes the restriction and improves the test suite with
multi group by tests.
Close#31793
* Move to Gradle 4.8 RC1
* Use latest version of plugin
The current does not work with Gradle 4.8 RC1
* Switch to Gradle GA
* Add and configure build compare plugin
* add work-around for https://github.com/gradle/gradle/issues/5692
* work around https://github.com/gradle/gradle/issues/5696
* Make use of Gradle build compare with reference project
* Make the manifest more compare friendly
* Clear the manifest in compare friendly mode
* Remove animalsniffer from buildscript classpath
* Fix javadoc errors
* Fix doc issues
* reference Gradle issues in comments
* Conditionally configure build compare
* Fix some more doclint issues
* fix typo in build script
* Add sanity check to make sure the test task was replaced
Relates to #31324. It seems like Gradle has an inconsistent behavior and
the taks is not always replaced.
* Include number of non conforming tasks in the exception.
* No longer replace test task, create implicit instead
Closes#31324. The issue has full context in comments.
With this change the `test` task becomes nothing more than an alias for `utest`.
Some of the stand alone tests that had a `test` task now have `integTest`, and a
few of them that used to have `integTest` to run multiple tests now only
have `check`.
This will also help separarate unit/micro tests from integration tests.
* Revert "No longer replace test task, create implicit instead"
This reverts commit f1ebaf7d93e4a0a19e751109bf620477dc35023c.
* Fix replacement of the test task
Based on information from gradle/gradle#5730 replace the task taking
into account the task providres.
Closes#31324.
* Only apply build comapare plugin if needed
* Make sure test runs before integTest
* Fix doclint aftter merge
* PR review comments
* Switch to Gradle 4.8.1 and remove workaround
* PR review comments
* Consolidate task ordering
TransportAction currently contains 2 doExecute methods, one which takes
a the task, and one that does not. The latter is what some subclasses
implement, while the first one just calls the latter, dropping the given
task. This commit combines these methods, in favor of just always
assuming a task is present.
Most transport actions don't need the node ThreadPool. This commit
removes the ThreadPool as a super constructor parameter for
TransportAction. The actions that do need the thread pool then have a
member added to keep it from their own constructor.
Most transport actions don't need to resolve index names. This commit
removes the index name resolver as a super constructor parameter for
TransportAction. The actions that do need the resolver then have a
member added to keep the resolver from their own constructor.
This is related to #27260 and #28898. This commit adds the transport-nio
plugin as a random option when running the http smoke tests. As part of
this PR, I identified an issue where cors support was not properly
enabled causing these tests to fail when using transport-nio. This
commit also fixes that issue.
This commit removes the RequestBuilder generic type from Action. It was
needed to be used by the newRequest method, which in turn was used by
client.prepareExecute. Both of these methods are now removed, along with
the existing users of prepareExecute constructing the appropriate
builder directly.
This commit adds the ability to configure how a docvalue field should be
formatted, so that it would be possible eg. to return a date field
formatted as the number of milliseconds since Epoch.
Closes#27740
Make all bool constructs use match/should (that is a query context) as
that is controlled and changed to a filter context by ES automatically
based on the sort order (_doc, field vs _sort) and trackScores.
Fix#29685
Due to the way composite aggregation works, ordering in GROUP BY can be
applied only through grouped columns which now the analyzer verifier
enforces.
Fix 29900
Tweak the return data, in particular with regards for ODBC columns to
better align with the spec
Fix order for SYS TYPES and TABLES according to the JDBC/ODBC spec
Fix#30386Fix#30521
Dates internally contain milliseconds (which appear when converting them
to Strings) however parsing does not accept them (and is being strict).
The parser has been changed so that Date is mandatory but the time
(including its fractions such as millis) are optional.
Fix#30002
When dealing with filtering, a composite aggregation might return empty
buckets (which have been filtered) which gets sent as is to the client.
Unfortunately this interprets the response as no more data instead of
retrying.
This now has changed and the listener keeps retrying until either the
query has ended or data passes the filter.
Fix#30292
This commit removes the http.enabled setting. While all real nodes (started with bin/elasticsearch) will always have an http binding, there are many tests that rely on the quickness of not actually needing to bind to 2 ports. For this case, the MockHttpTransport.TestPlugin provides a dummy http transport implementation which is used by default in ESIntegTestCase.
closes#12792
* SQL: Reduce number of ranges generated for comparisons
Rewrote optimization rule for combining ranges by improving the
detection of binary comparisons in a tree to better combine
them in a range, regardless of their place inside an expression.
Additionally, improve the comparisons of Numbers of different types
Also, improve reassembly of conjunction/disjunction into balanced
trees.
Do not promote BinaryComparisons to Ranges since it introduces NULL
boundaries and thus a corner-case that needs too much handling
Compare BinaryComparisons directly between themselves and to Ranges
Fix#30017
We had a number of awaitsFix links that weren't updated after the xpack
merge.
Where possible I changed the links to the new locations, but in some
circumstances the original ticket was closed (suggesting the awaitsfix
should be removed) or was otherwise unclear the status.
Currently the test picks random java.util.TimeZone ids in some places.
Internally we still need to convert back to joda DateTimeZone by id
occassionally (e.g. when serializing to pre 6.3 versions). There are
some deprecated "SystemV/*" time zones that Jodas DateTimeZone refuses
to convert. This change excludes those rare cases from the set of
allowed random time zones. It would be quiet odd for them to appear in
practice.
Closes#30156
A few of the old style license got kept around because their comment
string did not start with a space. This caused the license check to not
see it as a license and skip it. This commit cleans it up.
This commit fixes the classpath for the SQL CLI tool on Windows. As the
x-pack bin folder was collapsed into the distribution bin folder, the
location of the classpath here needed to no longer contain the old
plugins directory.
This commit adds the distribution type to the startup scripts so that we
can discern from log output and the main response the type of the
distribution (deb/rpm/tar/zip).
This commit makes x-pack a module and adds it to the default
distrubtion. It also creates distributions for zip, tar, deb and rpm
which contain only oss code.