From 91cd5734727f2b0fc8aab999c711f3d5babaf82e Mon Sep 17 00:00:00 2001
From: Charles Smith
Date: Wed, 18 Aug 2021 08:37:05 -0700
Subject: [PATCH] fixes web console introduction and addresses linking issues (#11609)

* fixes web console introduction and addresses linking issues
* fix merge conflict
---
 docs/ingestion/data-formats.md       | 2 +-
 docs/operations/druid-console.md     | 2 +-
 docs/operations/high-availability.md | 3 +--
 docs/querying/aggregations.md        | 2 +-
 docs/querying/post-aggregations.md   | 3 +--
 docs/querying/sql.md                 | 4 ++--
 docs/querying/using-caching.md       | 2 +-
 7 files changed, 8 insertions(+), 10 deletions(-)

diff --git a/docs/ingestion/data-formats.md b/docs/ingestion/data-formats.md
index df0f165d879..2f07d5c2f63 100644
--- a/docs/ingestion/data-formats.md
+++ b/docs/ingestion/data-formats.md
@@ -661,7 +661,7 @@ the set of ingested dimensions, if missing the discovered fields will make up th
 `timeAndDims` parse spec must specify which fields will be extracted as dimensions through the `dimensionSpec`.
-[All column types](https://orc.apache.org/docs/types.md) are supported, with the exception of `union` types. Columns of
+[All column types](https://orc.apache.org/docs/types.html) are supported, with the exception of `union` types. Columns of
 `list` type, if filled with primitives, may be used as a multi-value dimension, or specific elements can be extracted with `flattenSpec` expressions. Likewise, primitive fields may be extracted from `map` and `struct` types in the same manner. Auto field discovery will automatically create a string dimension for every (non-timestamp) primitive or `list` of
diff --git a/docs/operations/druid-console.md b/docs/operations/druid-console.md
index 8d965c74be2..3c3ac77ce66 100644
--- a/docs/operations/druid-console.md
+++ b/docs/operations/druid-console.md
@@ -22,7 +22,7 @@ title: "Web console"
 ~ under the License.
 -->
-Druid include a console for managing datasources, segments, tasks, data processes (Historicals and MiddleManagers), and coordinator dynamic configuration. Users can also run SQL and native Druid queries in the console.
+Druid includes a console for managing datasources, segments, tasks, data processes (Historicals and MiddleManagers), and coordinator dynamic configuration. You can also run SQL and native Druid queries in the console.
 The Druid Console is hosted by the [Router](../design/router.md) process.
diff --git a/docs/operations/high-availability.md b/docs/operations/high-availability.md
index 4e8927f582f..7d142628fa7 100644
--- a/docs/operations/high-availability.md
+++ b/docs/operations/high-availability.md
@@ -29,8 +29,7 @@ Apache ZooKeeper, metadata store, the coordinator, the overlord, and brokers are
 We recommend either installing ZooKeeper on its own hardware, or running 3 or 5 Master servers (where overlords or coordinators are running) and configuring ZooKeeper on them appropriately. See the [ZooKeeper admin guide](https://zookeeper.apache.org/doc/current/zookeeperAdmin) for more details.
 - For highly-available metadata storage, we recommend MySQL or PostgreSQL with replication and failover enabled.
-See [MySQL HA/Scalability Guide](https://dev.mysql.com/doc/mysql-ha-scalability/en/)
-and [PostgreSQL's High Availability, Load Balancing, and Replication](https://www.postgresql.org/docs/current/high-availability.html) for MySQL and PostgreSQL, respectively.
+See [MySQL Enterprise High Availability](https://www.mysql.com/products/enterprise/high_availability.html) and [PostgreSQL's High Availability, Load Balancing, and Replication](https://www.postgresql.org/docs/current/high-availability.html) for more information.
 - For highly-available Apache Druid Coordinators and Overlords, we recommend to run multiple servers.
 If they are all configured to use the same ZooKeeper cluster and metadata storage, then they will automatically failover between each other as necessary.
diff --git a/docs/querying/aggregations.md b/docs/querying/aggregations.md
index fe3edc03cf8..f5cf05d8056 100644
--- a/docs/querying/aggregations.md
+++ b/docs/querying/aggregations.md
@@ -126,7 +126,7 @@ Computes and stores the sum of values as 32-bit floating point value. Similar to
 ### `doubleMean` aggregator
-Computes and returns arithmetic mean of a column values as 64 bit float value. `doubleMean` is a query time aggregator only. It is not available for indexing.
+Computes and returns the arithmetic mean of a column's values as a 64-bit floating point value. `doubleMean` is a query time aggregator only. It is not available for indexing.
 To accomplish mean aggregation on ingestion, refer to the [Quantiles aggregator](../development/extensions-core/datasketches-quantiles.md#aggregator) from the DataSketches extension.
diff --git a/docs/querying/post-aggregations.md b/docs/querying/post-aggregations.md
index 765aea8b8cc..2031309bb0a 100644
--- a/docs/querying/post-aggregations.md
+++ b/docs/querying/post-aggregations.md
@@ -96,8 +96,7 @@ The constant post-aggregator always returns the specified value.
 The difference between the `doubleMax` aggregator and the `doubleGreatest` post-aggregator is that `doubleMax` returns the highest value of all rows for one specific column while `doubleGreatest` returns the highest value of multiple columns in one row. These are similar to the
-SQL [MAX](https://dev.mysql.com/doc/refman/5.7/en/group-by-functions.html#function_max) and
-[GREATEST](https://dev.mysql.com/doc/refman/5.7/en/comparison-operators.html#function_greatest) functions.
+SQL `MAX` and `GREATEST` functions.
 Example:
diff --git a/docs/querying/sql.md b/docs/querying/sql.md
index 371832c16d2..228420aec5c 100644
--- a/docs/querying/sql.md
+++ b/docs/querying/sql.md
@@ -334,8 +334,8 @@ Only the COUNT, ARRAY_AGG, and STRING_AGG aggregations can accept the DISTINCT k
 |`MAX(expr)`|Takes the maximum of numbers.|`null` if `druid.generic.useDefaultValueForNull=false`, otherwise `-9223372036854775808` (minimum LONG value)|
 |`AVG(expr)`|Averages numbers.|`null` if `druid.generic.useDefaultValueForNull=false`, otherwise `0`|
 |`APPROX_COUNT_DISTINCT(expr)`|_Usage note:_ consider using `APPROX_COUNT_DISTINCT_DS_HLL` instead, which offers better accuracy in many cases.

Counts distinct values of expr, which can be a regular column or a hyperUnique column. This is always approximate, regardless of the value of "useApproximateCountDistinct". This uses Druid's built-in "cardinality" or "hyperUnique" aggregators. See also `COUNT(DISTINCT expr)`.|`0`|
-|`APPROX_COUNT_DISTINCT_DS_HLL(expr, [lgK, tgtHllType])`|Counts distinct values of `expr`, which can be a regular column or an [HLL sketch](../development/extensions-core/datasketches-hll.md) column. Results are always approximate, regardless of the value of [`useApproximateCountDistinct`](../querying/sql.html#connection-context). The `lgK` and `tgtHllType` parameters here are, like the equivalents in the [aggregator](../development/extensions-core/datasketches-hll.html#aggregators), described in the HLL sketch documentation. The [DataSketches extension](../development/extensions-core/datasketches-extension.md) must be loaded to use this function. See also `COUNT(DISTINCT expr)`. |`0`|
-|`APPROX_COUNT_DISTINCT_DS_THETA(expr, [size])`|Counts distinct values of expr, which can be a regular column or a [Theta sketch](../development/extensions-core/datasketches-theta.md) column. This is always approximate, regardless of the value of [`useApproximateCountDistinct`](../querying/sql.html#connection-context). The `size` parameter is described in the Theta sketch documentation. The [DataSketches extension](../development/extensions-core/datasketches-extension.md) must be loaded to use this function. See also `COUNT(DISTINCT expr)`. |`0`|
+|`APPROX_COUNT_DISTINCT_DS_HLL(expr, [lgK, tgtHllType])`|Counts distinct values of `expr`, which can be a regular column or an [HLL sketch](../development/extensions-core/datasketches-hll.md) column. Results are always approximate, regardless of the value of [`useApproximateCountDistinct`](#connection-context). The `lgK` and `tgtHllType` parameters here are, like the equivalents in the [aggregator](../development/extensions-core/datasketches-hll.md#aggregators), described in the HLL sketch documentation. The [DataSketches extension](../development/extensions-core/datasketches-extension.md) must be loaded to use this function. See also `COUNT(DISTINCT expr)`. |`0`|
+|`APPROX_COUNT_DISTINCT_DS_THETA(expr, [size])`|Counts distinct values of expr, which can be a regular column or a [Theta sketch](../development/extensions-core/datasketches-theta.md) column. This is always approximate, regardless of the value of [`useApproximateCountDistinct`](#connection-context). The `size` parameter is described in the Theta sketch documentation. The [DataSketches extension](../development/extensions-core/datasketches-extension.md) must be loaded to use this function. See also `COUNT(DISTINCT expr)`. |`0`|
 |`DS_HLL(expr, [lgK, tgtHllType])`|Creates an [HLL sketch](../development/extensions-core/datasketches-hll.md) on the values of expr, which can be a regular column or a column containing HLL sketches. The `lgK` and `tgtHllType` parameters are described in the HLL sketch documentation. The [DataSketches extension](../development/extensions-core/datasketches-extension.md) must be loaded to use this function.|`'0'` (STRING)|
 |`DS_THETA(expr, [size])`|Creates a [Theta sketch](../development/extensions-core/datasketches-theta.md) on the values of expr, which can be a regular column or a column containing Theta sketches. The `size` parameter is described in the Theta sketch documentation. The [DataSketches extension](../development/extensions-core/datasketches-extension.md) must be loaded to use this function.|`'0.0'` (STRING)|
 |`APPROX_QUANTILE(expr, probability, [resolution])`|_Deprecated._ Use `APPROX_QUANTILE_DS` instead, which provides a superior distribution-independent algorithm with formal error guarantees.

Computes approximate quantiles on numeric or [approxHistogram](../development/extensions-core/approximate-histograms.md#approximate-histogram-aggregator) exprs. The "probability" should be between 0 and 1 (exclusive). The "resolution" is the number of centroids to use for the computation. Higher resolutions will give more precise results but also have higher overhead. If not provided, the default resolution is 50. The [approximate histogram extension](../development/extensions-core/approximate-histograms.md) must be loaded to use this function.|`NaN`|
diff --git a/docs/querying/using-caching.md b/docs/querying/using-caching.md
index e72f4f7f1e5..8f85b04f384 100644
--- a/docs/querying/using-caching.md
+++ b/docs/querying/using-caching.md
@@ -51,7 +51,7 @@ druid.realtime.cache.useCache=true
 druid.realtime.cache.populateCache=true
 ```
-See [Peon caching](configuration/index.md#peon-caching) and [Indexer caching](configuration/index.md#indexer-caching) for a description of all available task executor service caching options.
+See [Peon caching](../configuration/index.md#peon-caching) and [Indexer caching](../configuration/index.md#indexer-caching) for a description of all available task executor service caching options.
 ## Enabling query caching on Brokers
 Brokers support both segment-level and whole-query result level caching.