mirror of
https://github.com/honeymoose/OpenSearch.git
synced 2025-02-17 02:14:54 +00:00
SQL: Doc on syntax (identifiers in particular) (#38662)
Add section on syntax, identifiers and literals and on single vs double quotes. (cherry picked from commit aafdb598082e451f36294bd174d0887a276d8c7f)
parent
eb7da4b90d
commit
9357c17288
@ -5,7 +5,7 @@
|
||||
|
||||
Table with reserved keywords that need to be quoted. Also provide an example to make it more obvious.
|
||||
|
||||
The following table lists all of the keywords that are reserved in Presto,
|
||||
The following table lists all of the keywords that are reserved in {es-sql},
|
||||
along with their status in the SQL standard. These reserved keywords must
|
||||
be quoted (using double quotes) in order to be used as an identifier, for example:
|
||||
|
||||
@ -31,43 +31,65 @@ s|SQL-92
|`BETWEEN` |reserved |reserved
|`BY` |reserved |reserved
|`CAST` |reserved |reserved
|`CATALOG` |reserved |reserved
|`CONVERT` |reserved |reserved
|`CURRENT_DATE` |reserved |reserved
|`CURRENT_TIMESTAMP` |reserved |reserved
|`DAY` |reserved |reserved
|`DAYS` | |
|`DESC` |reserved |reserved
|`DESCRIBE` |reserved |reserved
|`DISTINCT` |reserved |reserved
|`ESCAPE` |reserved |reserved
|`EXISTS` |reserved |reserved
|`EXPLAIN` |reserved |reserved
|`EXTRACT` |reserved |reserved
|`FALSE` |reserved |reserved
|`FIRST` |reserved |reserved
|`FROM` |reserved |reserved
|`FULL` |reserved |reserved
|`GROUP` |reserved |reserved
|`HAVING` |reserved |reserved
|`HOUR` |reserved |reserved
|`HOURS` | |
|`IN` |reserved |reserved
|`INNER` |reserved |reserved
|`INTERVAL` |reserved |reserved
|`IS` |reserved |reserved
|`JOIN` |reserved |reserved
|`LEFT` |reserved |reserved
|`LIKE` |reserved |reserved
|`LIMIT` |reserved |reserved
|`MATCH` |reserved |reserved
|`MINUTE` |reserved |reserved
|`MINUTES` | |
|`MONTH` |reserved |reserved
|`NATURAL` |reserved |reserved
|`NO` |reserved |reserved
|`NOT` |reserved |reserved
|`NULL` |reserved |reserved
|`NULLS` | |
|`ON` |reserved |reserved
|`OR` |reserved |reserved
|`ORDER` |reserved |reserved
|`OUTER` |reserved |reserved
|`RIGHT` |reserved |reserved
|`RLIKE` | |
|`QUERY` | |
|`SECOND` |reserved |reserved
|`SECONDS` | |
|`SELECT` |reserved |reserved
|`SESSION` | |reserved
|`TABLE` |reserved |reserved
|`TABLES` | |
|`THEN` |reserved |reserved
|`TO` |reserved |reserved
|`TRUE` |reserved |reserved
|`TYPE` | |
|`USING` |reserved |reserved
|`WHEN` |reserved |reserved
|`WHERE` |reserved |reserved
|`WITH` |reserved |reserved
|`YEAR` |reserved |reserved
|`YEARS` | |

|===
@ -12,9 +12,13 @@
|
||||
[partintro]
|
||||
--
|
||||
|
||||
X-Pack includes a SQL feature to execute SQL against Elasticsearch
|
||||
X-Pack includes a SQL feature to execute SQL queries against {es}
|
||||
indices and return results in tabular format.
|
||||
|
||||
The following chapters aim to cover everything from usage, to syntax and drivers.
|
||||
Experienced users or those in a hurry might want to jump directly to
|
||||
the list of SQL <<sql-commands, commands>> and <<sql-functions, functions>>.
|
||||
|
||||
<<sql-overview, Overview>>::
|
||||
Overview of {es-sql} and its features.
|
||||
<<sql-getting-started, Getting Started>>::
|
||||
@ -22,22 +26,19 @@ indices and return results in tabular format.
|
||||
<<sql-concepts, Concepts and Terminology>>::
|
||||
Language conventions across SQL and {es}.
|
||||
<<sql-security,Security>>::
|
||||
Securing {es-sql} and {es}.
|
||||
Secure {es-sql} and {es}.
|
||||
<<sql-rest,REST API>>::
|
||||
Accepts SQL in a JSON document, executes it, and returns the
|
||||
results.
|
||||
Execute SQL in JSON format over REST.
|
||||
<<sql-translate,Translate API>>::
|
||||
Accepts SQL in a JSON document and translates it into a native
|
||||
Elasticsearch query and returns that.
|
||||
Translate SQL in JSON format to {es} native query.
|
||||
<<sql-cli,CLI>>::
|
||||
Command-line application that connects to {es} to execute
|
||||
SQL and print tabular results.
|
||||
Command-line application for executing SQL against {es}.
|
||||
<<sql-jdbc,JDBC>>::
|
||||
A JDBC driver for {es}.
|
||||
JDBC driver for {es}.
|
||||
<<sql-odbc,ODBC>>::
|
||||
An ODBC driver for {es}.
|
||||
ODBC driver for {es}.
|
||||
<<sql-client-apps,Client Applications>>::
|
||||
Documentation for configuring various SQL/BI tools with {es-sql}.
|
||||
Set up various SQL/BI tools with {es-sql}.
|
||||
<<sql-spec,SQL Language>>::
|
||||
Overview of the {es-sql} language, such as supported data types, commands and
|
||||
syntax.
|
||||
|
@ -3,12 +3,14 @@
|
||||
[[sql-spec]]
|
||||
== SQL Language
|
||||
|
||||
This chapter describes the SQL semantics supported in X-Pack namely:
|
||||
This chapter describes the supported SQL syntax and semantics, namely:
|
||||
|
||||
<<sql-data-types>>:: Data types
|
||||
<<sql-lexical-structure>>:: Lexical structure
|
||||
<<sql-commands>>:: Commands
|
||||
<<sql-data-types>>:: Data types
|
||||
<<sql-index-patterns>>:: Index patterns
|
||||
|
||||
include::syntax/lexic/index.asciidoc[]
|
||||
include::syntax/commands/index.asciidoc[]
|
||||
include::data-types.asciidoc[]
|
||||
include::syntax/index.asciidoc[]
|
||||
include::index-patterns.asciidoc[]
|
||||
|
228
docs/reference/sql/language/syntax/lexic/index.asciidoc
Normal file
@ -0,0 +1,228 @@
|
||||
[role="xpack"]
|
||||
[testenv="basic"]
|
||||
[[sql-lexical-structure]]
|
||||
== Lexical Structure
|
||||
|
||||
This section covers the major lexical structure of SQL which, for the most part, resembles that of ANSI SQL itself; hence low-level details are not discussed in depth.
|
||||
|
||||
{es-sql} currently accepts only one _command_ at a time. A command is a sequence of _tokens_ terminated by the end of the input stream.
|
||||
|
||||
A token can be a __key word__, an _identifier_ (_quoted_ or _unquoted_), a _literal_ (or constant) or a special character symbol (typically a delimiter). Tokens are typically separated by whitespace (such as a space or a tab), though in some cases where there is no ambiguity (typically due to a character symbol) the whitespace is not needed. For readability, however, omitting it should be avoided.
|
||||
|
||||
[[sql-syntax-keywords]]
|
||||
[float]
|
||||
=== Key Words
|
||||
|
||||
Take the following example:
|
||||
|
||||
[source, sql]
|
||||
----
|
||||
SELECT * FROM table
|
||||
----
|
||||
|
||||
This query has four tokens: `SELECT`, `\*`, `FROM` and `table`. The first three, namely `SELECT`, `*` and `FROM`, are __key words__, meaning words that have a fixed meaning in SQL. The token `table` is an _identifier_, meaning it identifies (by name) an entity inside SQL such as a table (in this case), a column, etc.
|
||||
|
||||
As one can see, both key words and identifiers have the _same_ lexical structure and thus one cannot know whether a token is one or the other without knowing the SQL language; the complete list of key words is available in the <<sql-syntax-reserved, reserved appendix>>.
|
||||
Note that key words are case-insensitive, meaning the previous example can be written as:
|
||||
|
||||
[source, sql]
|
||||
----
|
||||
select * fRoM table;
|
||||
----
|
||||
|
||||
Identifiers, however, are not: as {es} is case sensitive, {es-sql} uses the received value verbatim.
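
As a further illustration (a minimal sketch; the `musicians` index and its field names are hypothetical), the two commands below use equivalent key words but refer to different fields, because identifier case is preserved. Each line is a separate command:

[source, sql]
----
SELECT first_name FROM musicians -- refers to the field first_name
select First_Name from musicians -- same key words, but a different field: First_Name
----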
|
||||
|
||||
To help differentiate between the two, throughout the documentation the SQL key words are upper-cased, a convention we find increases readability and thus recommend to others.
|
||||
|
||||
[[sql-syntax-identifiers]]
|
||||
[float]
|
||||
=== Identifiers
|
||||
|
||||
Identifiers can be of two types: __quoted__ and __unquoted__:
|
||||
|
||||
[source, sql]
|
||||
----
|
||||
SELECT ip_address FROM "hosts-*"
|
||||
----
|
||||
|
||||
This query has two identifiers, `ip_address` and `hosts-\*` (an <<multi-index,index pattern>>). As `ip_address` does not clash with any key words, it can be used verbatim. `hosts-*`, on the other hand, cannot, as it clashes with `-` (minus operation) and `*`, hence the double quotes.
|
||||
|
||||
Another example:
|
||||
|
||||
[source, sql]
|
||||
----
|
||||
SELECT "from" FROM "<logstash-{now/d}>"
|
||||
----
|
||||
|
||||
The first identifier, `from`, needs to be quoted as otherwise it clashes with the `FROM` key word (which is case insensitive and thus can be written as `from`), while the second identifier, which uses {es} <<date-math-index-names>>, would have otherwise confused the parser.
|
||||
|
||||
Hence, in general, and *especially* when dealing with user input, it is *highly* recommended to use quotes for identifiers. It adds minimal overhead to your queries and in return offers clarity and disambiguation.
|
||||
|
||||
[[sql-syntax-literals]]
|
||||
[float]
|
||||
=== Literals (Constants)
|
||||
|
||||
{es-sql} supports two kinds of __implicitly-typed__ literals: strings and numbers.
|
||||
|
||||
[[sql-syntax-string-literals]]
|
||||
[float]
|
||||
==== String Literals
|
||||
|
||||
A string literal is an arbitrary number of characters bounded by single quotes `'`: `'Giant Robot'`.
|
||||
To include a single quote in the string, escape it using another single quote: `'Captain EO''s Voyage'`.
|
||||
|
||||
NOTE: An escaped single quote is *not* a double quote (`"`), but a single quote `'` _repeated_ (`''`).
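
For instance, inside a query the escaped quote appears as follows (a sketch only; the `movies` index and `title` field are assumed names):

[source, sql]
----
SELECT title
  FROM movies
 WHERE title = 'Captain EO''s Voyage' -- matches the value: Captain EO's Voyage
----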
|
||||
|
||||
[[sql-syntax-numeric-literals]]
|
||||
[float]
|
||||
==== Numeric Literals
|
||||
|
||||
Numeric literals are accepted both in decimal and scientific notation with exponent marker (`e` or `E`), starting either with a digit or decimal point `.`:
|
||||
|
||||
[source, sql]
----
1969    -- integer notation
3.14    -- decimal notation
.1234   -- decimal notation starting with decimal point
4E5     -- scientific notation (with exponent marker)
1.2e-3  -- scientific notation with decimal point
----
|
||||
|
||||
Numeric literals that contain a decimal point are always interpreted as being of type `double`. Those without are considered `integer` if they fit, otherwise their type is `long` (or `BIGINT` in ANSI SQL types).
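
Applying this rule to a few literals gives the following (an illustrative sketch; the aliases are arbitrary, and it assumes that `integer` is a 32-bit signed type and that a `SELECT` without a `FROM` clause is accepted):

[source, sql]
----
SELECT 1969       AS small_number, -- no decimal point and fits: integer
       3000000000 AS large_number, -- no decimal point, too large for integer: long
       3.14       AS fraction      -- contains a decimal point: double
----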
|
||||
|
||||
[[sql-syntax-generic-literals]]
|
||||
[float]
|
||||
==== Generic Literals
|
||||
|
||||
When dealing with a literal of an arbitrary type, one creates the object by casting, typically, the string representation to the desired type. This can be achieved through the dedicated <<sql-operators-cast, cast operator>> and <<sql-functions-type-conversion, functions>>:
|
||||
|
||||
[source, sql]
----
123::LONG                                   -- cast 123 to a LONG
CAST('1969-05-13T12:34:56' AS TIMESTAMP)    -- cast the given string to datetime
CONVERT('10.0.0.1', IP)                     -- cast '10.0.0.1' to an IP
----
|
||||
|
||||
Note that {es-sql} provides functions that return popular literals out of the box (like `E()`) or provide dedicated parsing for certain strings.
|
||||
|
||||
[[sql-syntax-single-vs-double-quotes]]
|
||||
[float]
|
||||
=== Single vs Double Quotes
|
||||
|
||||
It is worth pointing out that in SQL, single quotes `'` and double quotes `"` have different meanings and *cannot* be used interchangeably.
|
||||
Single quotes are used to declare a <<sql-syntax-string-literals, string literal>> while double quotes are used for <<sql-syntax-identifiers, identifiers>>.
|
||||
|
||||
To wit:
|
||||
|
||||
[source, sql]
----
SELECT "first_name" <1>
  FROM "musicians"  <1>
 WHERE "last_name"  <1>
     = 'Carroll'    <2>
----

<1> Double quotes `"` used for column and table identifiers
<2> Single quotes `'` used for a string literal
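
Swapping the quotes changes the meaning of the query. In the sketch below, which reuses the names from the previous example, `"Carroll"` is read as an identifier, so the condition compares two columns rather than a column and a string:

[source, sql]
----
SELECT "first_name"
  FROM "musicians"
 WHERE "last_name" = "Carroll" -- "Carroll" is an identifier here, not the string 'Carroll'
----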
|
||||
|
||||
[[sql-syntax-special-chars]]
|
||||
[float]
|
||||
=== Special characters
|
||||
|
||||
A few characters that are not alphanumeric have a dedicated meaning different from that of an operator. For completeness these are specified below:
|
||||
|
||||
|
||||
[cols="^m,^15"]

|===

s|Char
s|Description

|* | The asterisk (or wildcard) is used in some contexts to denote all fields for a table. Can also be used as an argument to some aggregate functions.
|, | Commas are used to enumerate the elements of a list.
|. | Used in numeric constants or to separate identifier qualifiers (catalog, table, column names, etc.).
|()| Parentheses are used for specific SQL commands, function declarations or to enforce precedence.
|===
|
||||
|
||||
[[sql-syntax-operators]]
|
||||
[float]
|
||||
=== Operators
|
||||
|
||||
Most operators in {es-sql} have the same precedence and are left-associative. As this is done at parsing time, parentheses need to be used to enforce a different precedence.
|
||||
|
||||
The following table indicates the supported operators and their precedence (highest to lowest):
|
||||
|
||||
[cols="^2m,^,^3"]

|===

s|Operator/Element
s|Associativity
s|Description

|.
|left
|qualifier separator

|::
|left
|PostgreSQL-style type cast

|+ -
|right
|unary plus and minus (numeric literal sign)

|* / %
|left
|multiplication, division, modulo

|+ -
|left
|addition, subtraction

|BETWEEN IN LIKE
|
|range containment, string matching

|< > <= >= = <=> <> !=
|
|comparison

|NOT
|right
|logical negation

|AND
|left
|logical conjunction

|OR
|left
|logical disjunction

|===
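
To illustrate the effect of precedence (a sketch based on the operator table, assuming a `SELECT` without a `FROM` clause is accepted; the aliases are arbitrary), multiplication binds more tightly than addition unless parentheses are used:

[source, sql]
----
SELECT 1 + 2 * 3   AS a, -- parsed as 1 + (2 * 3)
       (1 + 2) * 3 AS b  -- parentheses enforce a different grouping
----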
|
||||
|
||||
|
||||
[[sql-syntax-comments]]
|
||||
[float]
|
||||
=== Comments
|
||||
|
||||
{es-sql} allows comments, which are sequences of characters ignored by the parser.
|
||||
|
||||
Two styles are supported:
|
||||
|
||||
Single Line:: Comments start with a double dash `--` and continue until the end of the line.
|
||||
Multi line:: Comments that start with `/\*` and end with `*/` (also known as C-style).
|
||||
|
||||
|
||||
[source, sql]
----
-- single line comment
/* multi
   line
   comment
   that supports /* nested comments */
   */
----
|
||||
|