* Convert RunTask to use testclusers, remove ClusterFormationTasks
This PR adds a new RunTask and a way for it to start a
testclusters cluster out of band and block on it to replace
the old RunTask that used ClusterFormationTasks.
With this we can now remove ClusterFormationTasks.
DATE_PART(<datetime unit>, <date/datetime>) is a function that allows
the user to extract the specified unit from a date/datetime field
similar to the EXTRACT (<datetime unit> FROM <date/datetime>) but
with different names and aliases for the units and it also provides more
options like `DATE_PART('tzoffset', datetimeField)`.
Implemented following the SQL server's spec: https://docs.microsoft.com/en-us/sql/t-sql/functions/datepart-transact-sql?view=sql-server-2017
with the difference that the <datetime unit> argument is either a
literal single quoted string or gets a value from a table field, whereas
in SQL server keywords are used (unquoted identifiers) and it's not
possible to use a value coming for a table column.
Closes: #46372
(cherry picked from commit ead743d3579eb753fd314d4a58fae205e465d72e)
Add examples of failures for both sql and csv integeration
tests and instructions on how to mute them.
(cherry picked from commit 591bba46516d770f5fc95a4c536dd7448b74dd49)
When an integration test fails before the assertion of the results it's
missing information, like the file name and the line in the file where
the test resides.
(cherry picked from commit 683dc7213311d13c81e06829e08f3f9f80ebf73a)
Previously, if a column (field, scalar, alias) appeared more than once in the
SELECT list, the value was returned only once (1st appearance) in each row.
Fixes: #41811
(cherry picked from commit 097ea36581a751605fc4f2088319d954ce35b5d1)
To be on the safe side in terms of use cases also add the alias
DATETRUNC to the DATE_TRUNC function.
Follows: #46473
(cherry picked from commit 9ac223cb1fc66486f86e218fa785a32b61e9bacc)
In some cases, the fetch size affects the way the groups are returned
causing the last page to go beyond the limit. Add dedicated check to
prevent extra data from being returned.
Fix#47002
(cherry picked from commit f4c29646f097bbd29855300342823ef4cef61c05)
Enables support for Cartesian geometries shape type. We still need to
decide how to handle the distance function since it is currently using
the haversine distance formula and returns results in meters, which
doesn't make any sense for Cartesian geometries.
Closes#46412
Relates to #43644
Add initial PIVOT support for transforming a regular table into a
statistics table around an arbitrary pivoting column:
SELECT * FROM
(SELECT languages, country, salary, FROM mp)
PIVOT (AVG(salary) FOR countries IN ('NL', 'DE', 'ES', 'RO', 'US'))
In the current implementation PIVOT allows only one aggregation however
this restriction is likely to be lifted in the future.
Also not all aggregations are working, in particular MatrixStats are not yet supported.
(cherry picked from commit d91263746a222915c570d4a662ec48c1d6b4f583)
Since the `resolveAllDependencies` task resolves all the congfigurations
it can find, this was not caught by our testing, but it's required to be
configuraed specifically.
We should probably cut-over to the new configurations at some point to
avoid problems like this.
Closeselastic/infra#14580
When encountering only indices with empty mapping, the IndexResolver
throws an exception as it expects to find at least one entry.
This commit fixes this case so that an empty mapping is returned.
Fix#46757
(cherry picked from commit 5f4f5807acb93b5fab36718c092c328977a396b6)
Handle queries with implicit GROUP BY where the aggregation is not in
the projection/SELECT but inside the filter/HAVING such as:
SELECT 1 FROM x HAVING COUNT(*) > 0
The engine now properly identifies the case and handles it accordingly.
Fix#37051
(cherry picked from commit fa53ca05d8219c27079b50b4a5b7aeb220c7cde2)
Improve the defensive behavior of ResultSet when dealing with incorrect
API usage. In particular handle the case of dealing with no row
available (either because the cursor is before the first entry or after
the last).
Fix#46750
(cherry picked from commit 58fa38e4606625962e879265d35eacb0960c6cdb)
DATE_TRUNC(<truncate field>, <date/datetime>) is a function that allows
the user to truncate a timestamp to the specified field by zeroing out
the rest of the fields. The function is implemented according to the
spec from PostgreSQL: https://www.postgresql.org/docs/current/functions-datetime.html#FUNCTIONS-DATETIME-TRUNCCloses: #46319
(cherry picked from commit b37e96712db1aace09f17b574eb02ff6b942a297)
The sql project uses a common set of security tests, which are run in
subprojects. Currently these are shared through a shared directory, but
this is not setup correctly to ensure it is built before tests run. This
commit changes the test classes to be an artifact of the sql/qa/security
project and makes the test runner use the built artifact (a directory of
classes) for tests.
closes#45866
* Fix issue with painless scripting not being correctly generated when
datetime functions are used for GROUPing of an INTERVAL operation.
(cherry picked from commit cb92828e8ec9d9d241bd6189e5835fd99f8b9a44)
Previously, when the condition (1st argument) of the IIF function could
be evaluated (folded) to false, the `IfConditional` was eliminated which
caused `IndexOutOfBoundsException` to be thrown when `info()` and
`resolveType()` methods where called.
Fixes: #46268
(cherry picked from commit 9a885a3ac47bc8f52c07770d1d8d670ce0af1e59)
In internal test clusters tests we check that wiping all indices was acknowledged
but in REST tests we didn't.
This aligns the behavior in both kinds of tests.
Relates #45605 which might be caused by unacked deletes that were just slow.
Changes the order of parameters in Geometries from lat, lon to lon, lat
and moves all Geometry classes are moved to the
org.elasticsearch.geomtery package.
Backport of #45332Closes#45048
* Add format parameter to the range queries built for CURRENT_* functions
used in comparison conditions
* Use range queries for date fields equality/non-equality as well.
(cherry picked from commit c1e81e90f937ee5a002524d632bfce74d76962f9)
* Name each inner_hits section of nested queries differently and extract and combine the multiple values it generates into a single list.
This also introduces a limitation (its origin it's with Elasticsearch
though) on the sorting capabilities when the sorting is based on the
nested fields filtered: only one of the conditions applied to nested
documents will be used in the nested sorting.
(cherry picked from commit cfc5cf68f6e83b07bb9006986d0903d6be418ec6)
* Switch from using docvalue_fields to extracting values from _source
where applicable. Doing this means parsing the _source and handling the
numbers parsing just like Elasticsearch is doing it when it's indexing
a document.
* This also introduces a minor limitation: aliases type of fields that
are NOT part of a tree of sub-fields will not be able to be retrieved
anymore. field_caps API doesn't shed any light into a field being an
alias or not and at _source parsing time there is no way to know if a
root field is an alias or not. Fields of the type "a.b.c.alias" can be
extracted from docvalue_fields, only if the field they point to can be
extracted from docvalue_fields. Also, not all fields in a hierarchy of
fields can be evaluated to being an alias.
(cherry picked from commit 8bf8a055e38f00df5f49c8d97f632f69d6e00c2c)
Test clusters currently has its own set of logic for dealing with
finding different versions of Elasticsearch, downloading them, and
extracting them. This commit converts testclusters to use the
DistributionDownloadPlugin.
By default, we don't check ranges while indexing geo_shapes. As a
result, it is possible to index geoshapes that contain contain
coordinates outside of -90 +90 and -180 +180 ranges. Such geoshapes
will currently break SQL and ML retrieval mechanism. This commit removes
these restriction from the validator is used in SQL and ML retrieval.
* Add test for SQL not being available error message in JDBC.
* Add a new qa sub-project that explicitly disables SQL XPack module in Gradle.
(cherry picked from commit 8a1ac8a3a88a325ec9b99963e0fa288c18ee0ee5)
This commit removes some very old test logging annotations that appeared
to be added to investigate test failures that are long since closed. If
these are needed, they can be added back on a case-by-case basis with a
comment associating them to a test failure.
To be consistent with the `search.max_buckets` default setting,
set the hard limit of the PriorityQueue used for in memory sorting,
when sorting on an aggregate function, to 10000.
Fixes: #43168
(cherry picked from commit 079e012fdea68ea0a7daae078359495047e9c407)
- Previously, when shorting on an aggregate function the bucket
processing ended early when the explicit (LIMIT XXX) or the impliciti
limit of 512 was reached. As a consequence, only a set of grouping
buckets was processed and the results returned didn't reflect the global
ordering.
- Previously, the priority queue shorting method had an inverse
comparison check and the final response from the priority queue was also
returned in the inversed order because of the calls to the `pop()`
method.
Fixes: #42851
(cherry picked from commit 19909edcfdf5792b38c1363b07379783ebd0e6c4)
In AsciiDoc, `subs="attributes,callouts,macros"` options were required
to render `include-tagged::` in a code block.
With elastic/docs#827, Elasticsearch Reference documentation migrated
from AsciiDoc to Asciidoctor.
In Asciidoctor, the `subs="attributes,callouts,macros"` options are no
longer needed to render `include-tagged::` in a code block. This commit
removes those unneeded options.
Resolves#41589
Refactors the WKT and GeoJSON parsers from an utility class into an
instantiatable objects. This is a preliminary step in
preparation for moving out coordinate validators from Geometry
constructors. This should allow us to make validators plugable.
Allow querying of FROZEN indices both through dedicated SQL grammar
extension:
> SELECT field FROM FROZEN index
and also through driver configuration parameter, namely:
> index.include.frozen: true/false
Fix#39390Fix#39377
(cherry picked from commit 2445a933915f420c7f51e8505afa0a7978ce6b0f)
Due to a bug in JTS WKT parser, JTS cannot parse most of WKT shapes if
the shape type is written in the lower case. For examples `point (1 2)`
is causing JTS inside H2GIS to fail on tr-TR locale as a result of
case-insensitive comparison.
Interval * integer number is a valid operation which previously was
only supported for foldables (literals) and not when a field was
involved. That was because:
1. There was no common type returned for that combination
2. The `BinaryArithmeticOperation` was permitting the multiplication
(called by fold()) but the BinaryArithmeticProcessor didn't allow it
Moreover the error message for invalid arithmetic operations was wrong
because of the issue with the overloading methods of
`LoggerMessageFormat.format`.
Fixes: #41239Fixes: #41200
(cherry picked from commit 91039bab12d3ef27d6eac9cdc891a3b3ad0c694d)
Adds an initial limited implementations of geo features to SQL. This implementation is based on the [OpenGIS® Implementation Standard for Geographic information - Simple feature access](http://www.opengeospatial.org/standards/sfs), which is the current standard for GIS system implementation. This effort is concentrate on SQL option AKA ISO 19125-2.
Queries that are supported as a result of this initial implementation
Metadata commands
- `DESCRIBE table` - returns the correct column types `GEOMETRY` for geo shapes and geo points.
- `SHOW FUNCTIONS` - returns a list that includes supported `ST_` functions
- `SYS TYPES` and `SYS COLUMNS` display correct types `GEO_SHAPE` and `GEO_POINT` for geo shapes and geo points accordingly.
Returning geoshapes and geopoints from elasticsearch
- `SELECT geom FROM table` - returns the geoshapes and geo_points as libs/geo objects in JDBC or as WKT strings in console.
- `SELECT ST_AsWKT(geom) FROM table;` and `SELECT ST_AsText(geom) FROM table;`- returns the geoshapes ang geopoints in their WKT representation;
Using geopoints to elasticsearch
- The following functions will be supported for geopoints in queries, sorting and aggregations: `ST_GeomFromText`, `ST_X`, `ST_Y`, `ST_Z`, `ST_GeometryType`, and `ST_Distance`. In most cases when used in queries, sorting and aggregations, these function are translated into script. These functions can be used in the SELECT clause for both geopoints and geoshapes.
- `SELECT * FROM table WHERE ST_Distance(ST_GeomFromText(POINT(1 2), point) < 10;` - returns all records for which `point` is located within 10m from the `POINT(1 2)`. In this case the WHERE clause is translated into a range query.
Limitations:
Geoshapes cannot be used in queries, sorting and aggregations as part of this initial effort. In order to fully take advantage of geoshapes we would need to have access to geoshape doc values, which is coming in #37206. `ST_Z` cannot be used on geopoints in queries, sorting and aggregations since we don't store altitude in geo_point doc values.
Relates to #29872
Backport of #42031