OpenSearch/docs/reference/sql/limitations.asciidoc

[role="xpack"]
[testenv="basic"]
[[sql-limitations]]
== SQL Limitations

beta[]

[float]
=== Nested fields in `SYS COLUMNS` and `DESCRIBE TABLE`

{es} has a special type of relationship fields called `nested` fields. In {es-sql} they can be used by referencing their inner
sub-fields. Even though `SYS COLUMNS` and `DESCRIBE TABLE` will still display them as having the type `NESTED`, they cannot
be used in a query. One can only reference its sub-fields in the form:

[source, sql]
--------------------------------------------------
[nested_field_name].[sub_field_name]
--------------------------------------------------

For example:

[source, sql]
--------------------------------------------------
SELECT dep.dep_name.keyword FROM test_emp GROUP BY languages;
--------------------------------------------------

[float]
=== Multi-nested fields

{es-sql} doesn't support multi-nested documents, so a query cannot reference more than one nested field in an index.
This applies to multi-level nested fields, but also multiple nested fields defined on the same level. For example, for this index:

[source, sql]
----------------------------------------------------
       column         |     type      |    mapping
----------------------+---------------+-------------
nested_A              |STRUCT         |NESTED
nested_A.nested_X     |STRUCT         |NESTED
nested_A.nested_X.text|VARCHAR        |KEYWORD
nested_A.text         |VARCHAR        |KEYWORD
nested_B              |STRUCT         |NESTED
nested_B.text         |VARCHAR        |KEYWORD
----------------------------------------------------

`nested_A` and `nested_B` cannot be used at the same time, nor `nested_A`/`nested_B` and `nested_A.nested_X` combination.
For such situations, {es-sql} will display an error message.

[float]
=== Paginating nested inner hits

When SELECTing a nested field, pagination will not work as expected, {es-sql} will return __at least__ the page size records. 
This is because of the way nested queries work in {es}: the root nested field will be returned and it's matching inner nested fields as well,
pagination taking place on the **root nested document and not on its inner hits**.

[float]
=== Normalized `keyword` fields

`keyword` fields in {es} can be normalized by defining a `normalizer`. Such fields are not supported in {es-sql}.

[float]
=== Array type of fields

Array fields are not supported due to the "invisible" way in which {es} handles an array of values: the mapping doesn't indicate whether
a field is an array (has multiple values) or not, so without reading all the data, {es-sql} cannot know whether a field is a single or multi value.

[float]
=== Sorting by aggregation

When doing aggregations (`GROUP BY`) {es-sql} relies on {es}'s `composite` aggregation for its support for paginating results.
But this type of aggregation does come with a limitation: sorting can only be applied on the key used for the aggregation's buckets. This
means that queries like `SELECT * FROM test GROUP BY age ORDER BY COUNT(*)` are not possible.

[float]
=== Using aggregation functions on top of scalar functions

Aggregation functions like <<sql-functions-aggs-min,`MIN`>>, <<sql-functions-aggs-max,`MAX`>>, etc. can only be used
directly on fields, and so queries like `SELECT MAX(abs(age)) FROM test` are not possible.

[float]
=== Using a sub-select

Using sub-selects (`SELECT X FROM (SELECT Y)`) is **supported to a small degree**: any sub-select that can be "flattened" into a single
`SELECT` is possible with {es-sql}. For example:

["source","sql",subs="attributes,macros"]
--------------------------------------------------
include-tagged::{sql-specs}/docs.csv-spec[limitationSubSelect]
--------------------------------------------------

The query above is possible because it is equivalent with:

["source","sql",subs="attributes,macros"]
--------------------------------------------------
include-tagged::{sql-specs}/docs.csv-spec[limitationSubSelectRewritten]
--------------------------------------------------

But, if the sub-select would include a `GROUP BY` or `HAVING` or the enclosing `SELECT` would be more complex than `SELECT X
FROM (SELECT ...) WHERE [simple_condition]`, this is currently **un-supported**.

[float]
=== Using <<sql-functions-aggs-first, `FIRST`>>/<<sql-functions-aggs-last,`LAST`>> aggregation functions in `HAVING` clause

Using `FIRST` and `LAST` in the `HAVING` clause is not supported. The same applies to
<<sql-functions-aggs-min,`MIN`>> and <<sql-functions-aggs-max,`MAX`>> when their target column
is of type <<keyword, `keyword`>> as they are internally translated to `FIRST` and `LAST`.
SQL: documentation improvements and updates (#36918) * Added Limitations page * Made the aggregations page follow the common template for functions * Modified all tables to have the first row's cells content centered * Polishing in other various sections 2018-12-21 16:25:54 -05:00			`[role="xpack"]`
			`[testenv="basic"]`
			`[[sql-limitations]]`
			`== SQL Limitations`

			`beta[]`

			`[float]`
			=== Nested fields in `SYS COLUMNS` and `DESCRIBE TABLE`

			{es} has a special type of relationship fields called `nested` fields. In {es-sql} they can be used by referencing their inner
			sub-fields. Even though `SYS COLUMNS` and `DESCRIBE TABLE` will still display them as having the type `NESTED`, they cannot
			`be used in a query. One can only reference its sub-fields in the form:`

			`[source, sql]`
			`--------------------------------------------------`
			`[nested_field_name].[sub_field_name]`
			`--------------------------------------------------`

			`For example:`

			`[source, sql]`
			`--------------------------------------------------`
			`SELECT dep.dep_name.keyword FROM test_emp GROUP BY languages;`
			`--------------------------------------------------`

			`[float]`
			`=== Multi-nested fields`

			`{es-sql} doesn't support multi-nested documents, so a query cannot reference more than one nested field in an index.`
			`This applies to multi-level nested fields, but also multiple nested fields defined on the same level. For example, for this index:`

			`[source, sql]`
			`----------------------------------------------------`
			`column \| type \| mapping`
			`----------------------+---------------+-------------`
			`nested_A \|STRUCT \|NESTED`
			`nested_A.nested_X \|STRUCT \|NESTED`
			`nested_A.nested_X.text\|VARCHAR \|KEYWORD`
			`nested_A.text \|VARCHAR \|KEYWORD`
			`nested_B \|STRUCT \|NESTED`
			`nested_B.text \|VARCHAR \|KEYWORD`
			`----------------------------------------------------`

			`nested_A` and `nested_B` cannot be used at the same time, nor `nested_A`/`nested_B` and `nested_A.nested_X` combination.
			`For such situations, {es-sql} will display an error message.`

			`[float]`
			`=== Paginating nested inner hits`

			`When SELECTing a nested field, pagination will not work as expected, {es-sql} will return __at least__ the page size records.`
			`This is because of the way nested queries work in {es}: the root nested field will be returned and it's matching inner nested fields as well,`
			`pagination taking place on the root nested document and not on its inner hits.`

			`[float]`
			=== Normalized `keyword` fields

			`keyword` fields in {es} can be normalized by defining a `normalizer`. Such fields are not supported in {es-sql}.

			`[float]`
			`=== Array type of fields`

			`Array fields are not supported due to the "invisible" way in which {es} handles an array of values: the mapping doesn't indicate whether`
			`a field is an array (has multiple values) or not, so without reading all the data, {es-sql} cannot know whether a field is a single or multi value.`

			`[float]`
			`=== Sorting by aggregation`

			When doing aggregations (`GROUP BY`) {es-sql} relies on {es}'s `composite` aggregation for its support for paginating results.
			`But this type of aggregation does come with a limitation: sorting can only be applied on the key used for the aggregation's buckets. This`
			means that queries like `SELECT * FROM test GROUP BY age ORDER BY COUNT(*)` are not possible.
SQL: add sub-selects to the Limitations page (#37012) 2019-01-07 03:08:51 -05:00
SQL: [Docs] Add limitation for aggregate functions on scalars (#38186) Currently aggregate functions can operate only directly on fields. They cannot be used on top of scalar functions as painless scripting is currently not supported. 2019-02-01 09:13:51 -05:00			`[float]`
			`=== Using aggregation functions on top of scalar functions`

			Aggregation functions like <<sql-functions-aggs-min,`MIN`>>, <<sql-functions-aggs-max,`MAX`>>, etc. can only be used
			directly on fields, and so queries like `SELECT MAX(abs(age)) FROM test` are not possible.

SQL: add sub-selects to the Limitations page (#37012) 2019-01-07 03:08:51 -05:00			`[float]`
			`=== Using a sub-select`

			Using sub-selects (`SELECT X FROM (SELECT Y)`) is supported to a small degree: any sub-select that can be "flattened" into a single
			`SELECT` is possible with {es-sql}. For example:

			`["source","sql",subs="attributes,macros"]`
			`--------------------------------------------------`
			`include-tagged::{sql-specs}/docs.csv-spec[limitationSubSelect]`
			`--------------------------------------------------`

			`The query above is possible because it is equivalent with:`

			`["source","sql",subs="attributes,macros"]`
			`--------------------------------------------------`
			`include-tagged::{sql-specs}/docs.csv-spec[limitationSubSelectRewritten]`
			`--------------------------------------------------`

			But, if the sub-select would include a `GROUP BY` or `HAVING` or the enclosing `SELECT` would be more complex than `SELECT X
			FROM (SELECT ...) WHERE [simple_condition]`, this is currently un-supported.
SQL: Implement FIRST/LAST aggregate functions (#37936) FIRST and LAST can be used with one argument and work similarly to MIN and MAX but they are implemented using a Top Hits aggregation and therefore can also operate on keyword fields. When a second argument is provided then they return the first/last value of the first arg when its values are ordered ascending/descending (respectively) by the values of the second argument. Currently because of the usage of a Top Hits aggregation FIRST and LAST cannot be used in the HAVING clause of a GROUP BY query to filter on the results of the aggregation. Closes: #35639 2019-01-31 09:33:05 -05:00
			`[float]`
SQL: [Docs] Add limitation for aggregate functions on scalars (#38186) Currently aggregate functions can operate only directly on fields. They cannot be used on top of scalar functions as painless scripting is currently not supported. 2019-02-01 09:13:51 -05:00			=== Using <<sql-functions-aggs-first, `FIRST`>>/<<sql-functions-aggs-last,`LAST`>> aggregation functions in `HAVING` clause
SQL: Implement FIRST/LAST aggregate functions (#37936) FIRST and LAST can be used with one argument and work similarly to MIN and MAX but they are implemented using a Top Hits aggregation and therefore can also operate on keyword fields. When a second argument is provided then they return the first/last value of the first arg when its values are ordered ascending/descending (respectively) by the values of the second argument. Currently because of the usage of a Top Hits aggregation FIRST and LAST cannot be used in the HAVING clause of a GROUP BY query to filter on the results of the aggregation. Closes: #35639 2019-01-31 09:33:05 -05:00
			Using `FIRST` and `LAST` in the `HAVING` clause is not supported. The same applies to
			<<sql-functions-aggs-min,`MIN`>> and <<sql-functions-aggs-max,`MAX`>> when their target column
			is of type <<keyword, `keyword`>> as they are internally translated to `FIRST` and `LAST`.