This is mainly a promotion of Literal to Attribute to better handle folding expressions from extracted queries
Original commit: elastic/x-pack-elasticsearch@c3bb48bb61
Given that the Catalog was only ever used to hold a single index, the corresponding abstraction can be removed in favour of the abstraction that it holds, namely `GetIndexResult`.
Original commit: elastic/x-pack-elasticsearch@6932db642c
Given that we get now filtered mappings directly from the get index API (in case security is configured with FLS), we don't need the security filter nor the filtered catalog. That means we can remove the delayed action support also from AuthorizationService and rather make SQLAction a composite action like others. It will be authorized as an action, but its indices won't be checked while that will happen with its inner actions (get index and search) which need to be properly authorized.
Also, SQLGetIndicesAction is not needed anymore, as its purpose was to retrieve the indices access resolver put in the context by the security plugin for delayed actions, which are not supported anymore.
This commit kind of reverts elastic/x-pack-elasticsearch#2162, as it is now possible to integrate with security out-of-the-box
relates elastic/x-pack-elasticsearch#2934
Original commit: elastic/x-pack-elasticsearch@64d5044426
* SQL: GROUP BY with multiple fields are forbidden
The check is performed in the folder Verifier as the optimizer can eliminate some fields (like those with constants)
Original commit: elastic/x-pack-elasticsearch@8d49f4ab02
SQL: Extend HAVING support
Enhance Analyzer to support HAVING scalar functions over aggregates
Enhance Analyzer to push down undeclared aggs into the Aggregate
Fix bug in Analyzer$MissingRef that caused invalid groupings to still be resolved when pushed into an Aggregate
Preserve location information across the plan
Add AttributeMap as a backing for AttributeSet
Add Optimizer rule for combining projects
Add tz to DT functions toString
Change formatTemplate to not use String.format and thus to avoid
interfering with special % chars
Extend dataset with random salary and languages
Add unit tests for AttributeMap
Fix MathFunction scripting
Improve MissingRefs to enrich UnresolvedAttribute with metadata
During the Analysis unpushed attributes are automatically enriched to
provide more accurate error information
Enhance Verifier to deal with invalid (but resolved) ORDER/HAVING
Add OrderBy arithmetic tests
Improve Verifier to prevent GROUP BY on aggregations
Add tests on grouping by scalar functions
Original commit: elastic/x-pack-elasticsearch@5030d7a755
This commit fixes a bug in the testVersionHandling test and adds more randomization to serialized cursors.
Original commit: elastic/x-pack-elasticsearch@fab8d50518
While working on cursor cleanup, I realized that we still have two ways to serialize the cursor and the second way doesn't contain the cursor version (only client version, that can be potentially different from the cursor version). This commit switches to the unified way of serializing the cursor.
This is a follow up for elastic/x-pack-elasticsearch#3064.
Original commit: elastic/x-pack-elasticsearch@ef1a6427dd
SQL: Introduce PreAnalyze phase to resolve catalogs async
The new preanalyze phase collects all unresolved relations and tries
to resolve them as indices through typical async calls _before_ starting the analysis process.
The result is loaded into a catalog which is then passed to the analyzer.
While at it, the analyzer was made singleton and state across the engine
is done through SqlSession#currentContext().
Commit missing fix
Fix typo
Fix license
Fix line length
remove redundant static modifier
Remove redundant generics type
Rename catalogResolver instance member to indexResolver
Fix translate action to return a response through the listener, it hangs otherwise
IndexResolver improvements
Make sure that get index requests calls are locally executed by providing local flag.
Don't replace index/alias name with concrete index name in asCatalog response conversion. We need to preserve the original alias name for security, so it is reused in the subsequent search.
Update roles and actions names for security tests
Get index is now executed instead of sql get indices, and sql get indices has been removed.
Also made cluster privileges more restrictive to make sure that cluster state calls are no longer executed.
Fix most of the security IT tests
indices options are now unified, always lenient. The only situation where we get authorization exception back is when the user is not authorized for the sql action (besides for which indices).
Improve SessionContext handling
Fix context being invalid in non-executable phases
Make Explain & Debug command fully async
Resolve checkstyle error about redundant modifiers
Temporarily restore SqlGetIndicesAction
SqlGetIndicesAction action is still needed in RestSqlJdbcAction (metaTable and metaColumn methods), where we can't at the moment call IndexResolver directly, as security (FLS) needs index resolver to be called as part of the execution of an indices action. Once mappings are returned filtered, delayed action and the security filter will go away, as well as SqlGetIndicesAction.
SqlGetIndicesAction doesn't need to be a delayed action, my bad
[TEST] remove unused expectSqlWithAsyncLookup and rename expectSqlWithSyncLookup to expectSqlCompositeAction
Polish and feedback
Add unit test for PreAnalyzer
Original commit: elastic/x-pack-elasticsearch@57846ed613
SQL: Improve grammar to better handle quotes
Fix typo in handling (back)quoted identifiers
Clarify use of unquote (dedicated for literals) and text (generic)
Address feedback
clarify that ` are picked up but not supported/recommended
Fix merge and adjust json errors to work on windows
Original commit: elastic/x-pack-elasticsearch@67e0f3f38e
The /_sql endpoint now returns the results in the text format by default. Structured formats are also supported using the format parameter or accept header similar to _cat endpoints.
Original commit: elastic/x-pack-elasticsearch@4353793b83
Instead of returning "error response" objects and then translating them
into SQL exceptions this just throws the SQL exceptions directly. This
means the CLI catches exceptions and prints out the messages which isn't
ideal if this were hot code but it isn't and this is a much simpler way
of doing things.
Original commit: elastic/x-pack-elasticsearch@08431d3941
Adds the option to specify an elasticsearch filter in addition to the SQL query by introducing a filter parameter in the REST query which would create a boolean filter if the SQL query generates an elasticsearch query or a constant score query if SQL if the SQL query doesn't generates an elasticsearch query. Usage:
{
"query": "SELECT * FROM index",
"filter" : { "term" : { "tag" : "tech" } }
}
relates elastic/x-pack-elasticsearch#2895
Original commit: elastic/x-pack-elasticsearch@9a73813c7f
This commits also simplifies the serialization mechanism by remove 2 ways to serialize the cursor. Adding the version there was complicating things too much otherwise.
Original commit: elastic/x-pack-elasticsearch@4f2c541e0a
Now that we can parse Elasticsearch's standard error messages in the CLI
and JDBC client we can just let those standard error messages bubble out
of Elasticsearch rather than catch and encode them.
In a followup we can remove the encoding entirely.
Original commit: elastic/x-pack-elasticsearch@bad043b6f7
This teaches SQL to parse Elasticsearch's standard error responses
but doesn't change SQL to general Elasticsearch's standard error responses
in all cases. That can come in a followup. We do this parsing with
jackson-core, the same dependency Elasticsearch uses for parsing
json. We shade jackson-core in the JDBC driver so that users don't have to worry about
dependency clashes. We do not do so in the CLI because it is a standalone
application.
We get a few "bonus" changes along the way:
1. We save a copy operation. Before this change responses were spooled
into memory and then parsed. After this change they are parsed directly
from the response stream.
2. We had a few classes entirely to support the spooling operation that we
no longer need: `BytesArray`, `FastByteArrayInputStream`, and
`BasicByteArrayOutputStream`.
3. SQL's `Version` was incorrectly parsing the version from the jar manifest.
We didn't notice because the test was rigged to return `UNKNOWN` because
we *were* running the test from the compiled classes directory instead of the
jar. As part of shading jackson we moved running the tests to running against
the shaded jar. Now we can actually assert that we parse the version correctly.
It turns out we weren't. So I fixed it.
Original commit: elastic/x-pack-elasticsearch@2e8f397bf4
* Fix several NOCOMMITS
- renamed Assert to Check to make the intent clear
- clarify esMajor/Minor inside connection (thse are actually our own
methods, not part of JDBC API)
- wire pageTimeout into Cursor#nextPage
Original commit: elastic/x-pack-elasticsearch@7626c0a44a
The keywords inside SqlBase are now sorted alphabetically - much easier
to read and update the docs
Original commit: elastic/x-pack-elasticsearch@5aa89c5950
This prevents us from having to hack a fake schema together on the
second (and third and fourth, etc) page of results.
Original commit: elastic/x-pack-elasticsearch@7ba3119daa
The CLI and JDBC were meant to share the same named xcontent registry
but the cli action didn't reuse it by mistake.
Original commit: elastic/x-pack-elasticsearch@6c9af5a22a
The `AbstractSqlServer` class was used when SQL's CLI and JDBC were
transport actions. They are REST layer concepts now and it is unused so
it should go.
Original commit: elastic/x-pack-elasticsearch@25ec699564
Fix the name of the action the SQL uses to lookup index information from
the cluster state. The old name was silly.
Original commit: elastic/x-pack-elasticsearch@805fb29662
Organizes the SQL translate action to match the way that x-pack has been
organizing new actions for a while. All of the pieces are put into the
same class file.
Original commit: elastic/x-pack-elasticsearch@def911c0ab
Adds docs for the REST API, translate API, the CLI, and JDBC.
Next we need to add more example queries and documentation for our
extensions.
Original commit: elastic/x-pack-elasticsearch@ed6d1360d2
* Remove usage of Settings inside SqlSettings
Also hook client timeouts to the backend
Set UTC as default timezone when using CSV
As the JVM timezone changes, make sure to pin it to UTC since this is what the results are computed against
Original commit: elastic/x-pack-elasticsearch@3e7aad8c1f
Moves joda time handling into DocValueExtractor, that's the only place where it occurs at the moment.
Original commit: elastic/x-pack-elasticsearch@205e82990a
Firstly, data in H2 is now stored in TIMESTAMP WITH TIME ZONE since H2
does not allow a global TZ to be set and picks the JVM TZ when a record
is read.
JdbcAssert is now aware of this allows TIMESTAMP with TZ == TIMESTAMP
Discovered a serious bug in DateTimeFunction - unfortunately date
histogram is not useful except for year since most extract functions
avoid ordering which a histogram preserves.
Thus most DTF are now terms aggs with scripting.
Improved a bug that caused duplicate functions to not be detected because
of aliasing.
Moved some datetime tests to CSV but the aggs tests now are in sync with
H2
Fixed bug that caused arithmetic on aggs to not be properly resolved by
splitting the processor definition tree to aggName (unresolved) and
aggPath (resolved)
Original commit: elastic/x-pack-elasticsearch@e75ada68f1
We weren't returning errors correctly from the server
or catching them correctly in the CLI. This fixes that
and adds simple integration tests.
Original commit: elastic/x-pack-elasticsearch@259da0da6f
A JDBC driver should throw only checked SQLExceptions.
Introduce JdbcSQLException and fix some no-commits along the way.
Original commit: elastic/x-pack-elasticsearch@299fcf9ace
Better handling of SQL exceptions (result of incorrect queries) vs
unexpected ones (engine failure, ES...)
Original commit: elastic/x-pack-elasticsearch@2698402cdb
This commit removes ThrowableConsumer, WrappingException, ActionUtils and ObjectUtils by replacing them with core equivalents when needed.
Original commit: elastic/x-pack-elasticsearch@5a68418a3d
We switched most of the commands to asynchronous mode, this commit removes the synchronous version of exec to prevent accidental use of it in the future.
Original commit: elastic/x-pack-elasticsearch@03d0a6350d
Right now if you run `gradle regen` on Windows you'll get `CRLF` line
endings on all the ANTLR generated files because we run
```
ant.fixcrlf(srcdir: outputPath) {
patternset(includes: 'SqlBase*.java')
}
```
The docs for fixcrlf say that the default line endings that it
corrects to is based on the OS:
https://ant.apache.org/manual/Tasks/fixcrlf.html
This change locks it to `LF`.
Original commit: elastic/x-pack-elasticsearch@4396729e04
`RowSetCursor` was just like `RowSet` only it had methods that allowed
you to scroll to the next page. We now use `RowSet#nextPageCursor` to
get the next page in a way that doesn't require us to store state on
the server. So we can remove `RowSetCursor` entirely now.
Original commit: elastic/x-pack-elasticsearch@6a4a1efb20
* Switch `ResultSet#getFetchSize` from returning the *requested*
fetch size to returning the size of the current page of results.
For scrolling searches without parent/child this'll match the
requested fetch size but for other things it won't. The nice thing
about this is that it lets us tell users a thing to look at if
they are wondering why they are using a bunch of memory.
* Remove all the entire JDBC action and implement it on the REST
layer instead.
* Create some code that can be shared between the cli and jdbc
actions to properly handle errors and request deserialization so
there isn't as much copy and paste between the two. This helps
because it calls out explicitly the places where they are different.
* I have not moved the CLI REST handle to shared code because
I think it'd be clearer to make that change in a followup PR.
* There is now no more need for constructs that allow iterating
all of the results in the same request. I have not removed these
because I feel that that cleanup is big enough to deserve its own
PR.
Original commit: elastic/x-pack-elasticsearch@3b12afd11c
These wrap `DataInput` and `DataOutput` to add the protocol
version being serialized. This is similar to the mechanism
used by core and it has made adding and removing fields from
the serialization protocol fairly simple.
Original commit: elastic/x-pack-elasticsearch@90b3f1199a
* big refactor of Processor by introducing ProcessorDefinition an
immutable tree structure used for resolving multiple inputs across
folding (in particular for aggregations) which at runtime gets
translated into 'compiled' or small Processors
Add expression arithmetic, expression folding and type coercion
Folding
* for literals, scalars and inside the optimizer
Type validation happens per type hierarchy (numeric vs decimal) not type
Ceil/Floor/Round functions return long/int instead of double
ScalarFunction preserves ProcessorDefinition instead of functionId
Original commit: elastic/x-pack-elasticsearch@a703f8b455
Removes a few NOCOMMITs that are tracked other places and updates
a few with plans on how to work on them.
Original commit: elastic/x-pack-elasticsearch@8d1cfdf4ee
This renames that `write` and `read` methods in SQL to `writeTo` and
`readFrom` to line up with the names used in core. I don't have a
strong opinion whether or not any name is better than any other but
I figure there isn't a good reason for SQL to be different from the
rest of Elasticsearch.
Original commit: elastic/x-pack-elasticsearch@e5de9a4b81
* Move CLI to TransportSqlAction
* Moves REST endpoint from `/_cli` to `/_sql/cli`
* Removes the special purpose CLI transport action instead
implements the CLI entirely on the REST layer, delegating
all SQL stuff to the same action that backs the `/_sql` REST
API.
* Reworks "embedded testing mode" to use a `FilterClient` to
bounce capture the sql transport action and execute in embedded.
* Switches CLI formatting from consuming the entire response
to consuming just the first page of the response and returning
a `cursor` that can be used to read the next page. That read is
not yet implemented.
* Switch CLI formatting from the consuming the `RowSetCursor` to
consuming the `SqlResponse` object.
* Adds tests for CLI formatting.
* Support next page in the cli
* Rename cli's CommandRequest/CommandResponse to
QueryInitRequest/QueryInitResponse to line up with jdbc
* Implement QueryPageRequest/QueryPageResponse in cli
* Use `byte[]` to represent the cursor in the cli. Those bytes
mean something, but only to the server. The only reasonint that
the client does about them is "if length == 0 then there isn't a
next page."
* Pull common code from jdbc's QueryInitRequest, QueryPageRequest,
QueryInitResponse, and QueryPageResponse into the shared-proto
project
* By implication this switches jdbc's QueryPageRequest to using
the same cursor implementation as the cli
Original commit: elastic/x-pack-elasticsearch@193586f1ee
Instead of throwing and catching an exception for invalid
indices this returns *why* they are invalid in a convenient
object form that can be thrown as an exception when the index
is required or the entire index can be ignored when listing
indices.
Original commit: elastic/x-pack-elasticsearch@f45cbce647
This integrates SQL's metadata calls with security by creating
`SqlIndicesAction` and routing all of SQL's metadata calls through
it. Since it *does* know up from which indices it is working against
it can be an `IndicesRequest.Replaceable` and integrate with the
existing security infrastructure for filtering indices.
This request is implemented fairly similarly to the `GetIndexAction`
with the option to read from the master or from a local copy of
cluster state. Currently SQL forces it to run on the local copy
because the request doesn't properly support serialization. I'd
like to implement that in a followup.
Original commit: elastic/x-pack-elasticsearch@15f9512820
In core we prefer not to extend `AbstractLifecycleComponent` because
the reasons for the tradeoffs that it makes are lost to the sands of
time.
Original commit: elastic/x-pack-elasticsearch@ec1a32bbb3
Core doesn't go in for fancy collection utils in general and just
manipulates the required collections in line. In an effort to keep
SQL "more like the rest of Elasticsearch" I'm starting to remove
SQL's `CollectionUtils`.
Original commit: elastic/x-pack-elasticsearch@878ee181cb
Scrolling was only implemented for the `SqlAction` (not jdbc or cli)
and it was implemented by keeping request state on the server. On
principle we try to avoid adding extra state to elasticsearch where
possible because it creates extra points of failure and tends to
have lots of hidden complexity.
This replaces the state on the server with serializing state to the
client. This looks to the user like a "next_page" key with fairly
opaque content. It actually consists of an identifier for the *kind*
of scroll, the scroll id, and a base64 string containing the field
extractors.
Right now this only implements scrolling for `SqlAction`. The plan
is to reuse the same implementation for jdbc and cli in a followup.
This also doesn't implement all of the required serialization.
Specifically it doesn't implement serialization of
`ProcessingHitExtractor` because I haven't implemented serialization
for *any* `ColumnProcessors`.
Original commit: elastic/x-pack-elasticsearch@a8567bc5ec
Adds a granular licensing support to SQL. JDBC now requires a platinum license, everything else work with any non-expired license.
Original commit: elastic/x-pack-elasticsearch@a30470e2c9
This moves validating that each index contains only a single type
into EsCatalog which removes a race condition and removes a method
from the Catalog interface. Removing methods from the Catalog
interface is nice because the fewer methods that the interface has
the fewer have to be secured.
Original commit: elastic/x-pack-elasticsearch@85cd089e47
This adds support for field level security to SQL by creating a new type of flow for securing requests that look like sql requests. `AuthorizationService` verifies that the user can execute the request but doesn't check the indices in the request because they are not yet ready. Instead, it adds a `BiFunction` to the context that can be used to check permissions for an index while servicing the request. This allows requests to cooperatively secure themselves. SQL does this by implementing filtering on top of its `Catalog` abstraction and backing that filtering with security's filters. This minimizes the touch points between security and SQL.
Stuff I'd like to do in followups:
What doesn't work at all still:
1. `SHOW TABLES` is still totally unsecured
2. `DESCRIBE TABLE` is still totally unsecured
3. JDBC's metadata APIs are still totally unsecured
What kind of works but not well:
1. The audit trail doesn't show the index being authorized for SQL.
Original commit: elastic/x-pack-elasticsearch@86f88ba2f5
Indices discovery actively ignores indices with more than one type.
However queries made such indices throw an exception (assuming the user
by mistake or not, selects such an index).
Original commit: elastic/x-pack-elasticsearch@16855c7b8f
SQL relies on being able to fetch information about fields from
the cluster state and it'd be disasterous if that information
wasn't available. This should catch that.
Original commit: elastic/x-pack-elasticsearch@1a62747332
Running the sql rest action test inside the server caused a dependency
loop which was failing the build.
Original commit: elastic/x-pack-elasticsearch@43283671d8
This is a hack to remove a dependency cycle I added in elastic/x-pack-elasticsearch#2109. I think
it'd be cleaner to remove the cycle by making sql its own plugin that
doesn't depend on the rest of x-pack-elasticsearch but is still
included within x-pack-elasticsearch. But that is a broader change.
Original commit: elastic/x-pack-elasticsearch@47b7d69d80