mirror of https://github.com/apache/druid.git
Doc fixes for query from deep storage and MSQ (#15313)
Minor updates to the documentation. Added prerequisites. Removed a known issue in MSQ since it's no longer valid.

---------

Co-authored-by: 317brian <53799971+317brian@users.noreply.github.com>
This commit is contained in:
parent
9576fd3141
commit
5036af6fb3
@@ -42,10 +42,6 @@ an [UnknownError](./reference.md#error_UnknownError) with a message including "N

- `GROUPING SETS` are not implemented. Queries using these features return a
  [QueryNotSupported](reference.md#error_QueryNotSupported) error.
- For some `COUNT DISTINCT` queries, you'll encounter a [QueryNotSupported](reference.md#error_QueryNotSupported) error
  that includes `Must not have 'subtotalsSpec'` as one of its causes. This is caused by the planner attempting to use
  `GROUPING SETS`, which are not implemented.
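As a hedged illustration of a query shape that can hit this limitation: combining several exact distinct-count aggregations in one query is one case where the SQL planner may try to plan via `GROUPING SETS`. The datasource and column names below are hypothetical, and this assumes approximate distinct counts are disabled (`useApproximateCountDistinct: false`) in the query context.

```sql
-- Hypothetical sketch: multiple exact COUNT DISTINCT aggregations in one query
-- can lead the planner toward GROUPING SETS, surfacing the
-- "Must not have 'subtotalsSpec'" cause. "events", "user_id", and "session_id"
-- are placeholder names.
SELECT
  COUNT(DISTINCT user_id)    AS unique_users,
  COUNT(DISTINCT session_id) AS unique_sessions
FROM events
```

A common workaround is to run one distinct aggregation per query, or to allow approximate distinct counts where acceptable.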
- The numeric varieties of the `EARLIEST` and `LATEST` aggregators do not work properly. Attempting to use the numeric
  varieties of these aggregators leads to an error like
  `java.lang.ClassCastException: class java.lang.Double cannot be cast to class org.apache.druid.collections.SerializablePair`.
@@ -24,6 +24,10 @@ title: "Query from deep storage"

Druid can query segments that are stored only in deep storage. Running a query from deep storage is slower than querying segments loaded on Historical processes, but it's a useful tool for data that you access infrequently or that doesn't require the low-latency results typical Druid queries provide. Queries from deep storage increase the surface area of data available to query without requiring you to scale your Historical processes to accommodate more segments.

## Prerequisites
Query from deep storage requires the Multi-stage query (MSQ) task engine. If you don't already have it enabled, load the extension before you begin. See [enable MSQ](../multi-stage-query/index.md#load-the-extension) for more information.
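Concretely, enabling MSQ means adding its extension to the extension load list. A minimal sketch, assuming the default `common.runtime.properties` layout and the standard `druid-multi-stage-query` extension name:

```properties
# common.runtime.properties (sketch): add the MSQ extension alongside any
# extensions you already load, then restart the affected services.
druid.extensions.loadList=["druid-multi-stage-query"]
```

If your cluster already loads other extensions, append `druid-multi-stage-query` to the existing list rather than replacing it.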
## Keep segments in deep storage only
Any data you ingest into Druid is already stored in deep storage, so you don't need any additional configuration from that perspective. However, to take advantage of the cost savings that querying from deep storage provides, make sure that not all of your segments get loaded onto Historical processes.
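Which segments stay in deep storage only is controlled by your retention (load) rules. The fragment below is a hypothetical sketch, not a prescribed configuration: it assumes a Druid version whose load rules accept an empty `tieredReplicants` map to mean "keep in deep storage without loading replicas", and the period, tier name, and replica count are placeholders.

```json
[
  {
    "type": "loadByPeriod",
    "period": "P1M",
    "includeFuture": true,
    "tieredReplicants": { "_default_tier": 2 }
  },
  {
    "type": "loadForever",
    "tieredReplicants": {}
  }
]
```

With rules shaped like this, recent data stays loaded on Historicals for low-latency queries, while older segments remain queryable from deep storage only.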
@@ -31,6 +31,8 @@ To run the queries in this tutorial, replace `ROUTER:PORT` with the location of

For more general information, see [Query from deep storage](../querying/query-from-deep-storage.md).
If you are trying this feature on an existing cluster, make sure the query from deep storage [prerequisites](../querying/query-from-deep-storage.md#prerequisites) are met.
## Load example data
Use the **Load data** wizard or the following SQL query to ingest the `wikipedia` sample datasource bundled with Druid. If you use the wizard, make sure you set the partitioning to hour.
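The SQL route uses an MSQ `REPLACE` statement with `EXTERN` to read the bundled sample file. This is a sketch, not the tutorial's exact statement: the column list is trimmed for brevity, and it assumes the `wikipedia` sample is reachable at the public Druid docs URL.

```sql
-- Sketch of SQL-based ingestion for the wikipedia sample (column list abridged).
-- Assumes the sample file is available at the URL below; PARTITIONED BY HOUR
-- matches the hour partitioning the tutorial asks for in the wizard.
REPLACE INTO "wikipedia" OVERWRITE ALL
SELECT
  TIME_PARSE("timestamp") AS "__time",
  "page",
  "user",
  "added",
  "deleted"
FROM TABLE(
  EXTERN(
    '{"type":"http","uris":["https://druid.apache.org/data/wikipedia.json.gz"]}',
    '{"type":"json"}'
  )
) EXTEND ("timestamp" VARCHAR, "page" VARCHAR, "user" VARCHAR, "added" BIGINT, "deleted" BIGINT)
PARTITIONED BY HOUR
```

Submit the statement through the web console's query view or the SQL task API; MSQ runs it as an ingestion task.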