docs: update future development blurbs (#16939)

Co-authored-by: Victoria Lim <vtlim@users.noreply.github.com>
2025-02-17 23:46:30 +00:00 · 2024-10-01 15:02:05 -07:00 · 1fc82a96bd
commit 1fc82a96bd
parent 878adff9aa
4 changed files with 11 additions and 17 deletions
--- a/docs/design/architecture.md
+++ b/docs/design/architecture.md
@@ -105,10 +105,9 @@ for reading from external data sources and publishing new Druid segments.
 [**Indexer**](../design/indexer.md) services are an alternative to Middle Managers and Peons. Instead of
 forking separate JVM processes per-task, the Indexer runs tasks as individual threads within a single JVM process.

-The Indexer is designed to be easier to configure and deploy compared to the Middle Manager + Peon system and to better enable resource sharing across tasks. The Indexer is a newer feature and is currently designated [experimental](../development/experimental.md) due to the fact that its memory management system is still under
-development. It will continue to mature in future versions of Druid.
+The Indexer is designed to be easier to configure and deploy compared to the MiddleManager + Peon system and to better enable resource sharing across tasks, which can help streaming ingestion. The Indexer is currently designated [experimental](../development/experimental.md).

-Typically, you would deploy either Middle Managers or Indexers, but not both.
+Typically, you would deploy one of the following: MiddleManagers, [MiddleManager-less ingestion using Kubernetes](../development/extensions-contrib/k8s-jobs.md), or Indexers. You wouldn't deploy more than one of these options.

 ## Colocation of services

--- a/docs/design/indexer.md
+++ b/docs/design/indexer.md
@@ -24,8 +24,7 @@ sidebar_label: "Indexer"
  -->

 :::info
- The Indexer is an optional and [experimental](../development/experimental.md) feature.
- Its memory management system is still under development and will be significantly enhanced in later releases.
+ The Indexer is an optional and experimental feature. If you're primarily performing batch ingestion, we recommend you use either the MiddleManager and Peon task execution system or [MiddleManager-less ingestion using Kubernetes](../development/extensions-contrib/k8s-jobs.md). If you're primarily doing streaming ingestion, you may want to try either [MiddleManager-less ingestion using Kubernetes](../development/extensions-contrib/k8s-jobs.md) or the Indexer service.
 :::

 The Apache Druid Indexer service is an alternative to the Middle Manager + Peon task execution system. Instead of forking a separate JVM process per-task, the Indexer runs tasks as separate threads within a single JVM process.
--- a/docs/development/extensions-core/kubernetes.md
+++ b/docs/development/extensions-core/kubernetes.md
@@ -54,7 +54,7 @@ Additionally, this extension has following configuration.

 ### Gotchas

-- Label/Annotation path in each pod spec MUST EXIST, which is easily satisfied if there is at least one label/annotation in the pod spec already. This limitation may be removed in future.
+- Label/Annotation path in each pod spec MUST EXIST, which is easily satisfied if there is at least one label/annotation in the pod spec already. 
 - All Druid Pods belonging to one Druid cluster must be inside same kubernetes namespace.
 - All Druid Pods need permissions to be able to add labels to self-pod, List and Watch other Pods, create and read ConfigMap for leader election. Assuming, "default" service account is used by Druid pods, you might need to add following or something similar Kubernetes Role and Role Binding.

--- a/docs/querying/datasource.md
+++ b/docs/querying/datasource.md
@@ -431,25 +431,21 @@ and how to detect it.
 3. One common reason for implicit subquery generation is if the types of the two halves of an equality do not match.
 For example, since lookup keys are always strings, the condition `druid.d JOIN lookup.l ON d.field = l.field` will
 perform best if `d.field` is a string.
-4. The join operator must evaluate the condition for each row. In the future, we expect
-to implement both early and deferred condition evaluation, which we expect to improve performance considerably for
-common use cases.
+4. The join operator must evaluate the condition for each row. 
 5. Currently, Druid does not support pushing down predicates (condition and filter) past a Join (i.e. into
 Join's children). Druid only supports pushing predicates into the join if they originated from
 above the join. Hence, the location of predicates and filters in your Druid SQL is very important.
 Also, as a result of this, comma joins should be avoided.

-#### Future work for joins
+#### Limitations for joins

-Joins are an area of active development in Druid. The following features are missing today but may appear in
-future versions:
+Joins in Druid have the following limitations:

-- Reordering of join operations to get the most performant plan.
-- Preloaded dimension tables that are wider than lookups (i.e. supporting more than a single key and single value).
-- RIGHT OUTER and FULL OUTER joins in the native query engine. Currently, they are partially implemented. Queries run
+- The order of joins is not entirely optimized. Join operations are not reordered to get the most performant plan.
+- Preloaded dimension tables that are wider than lookups (i.e. supporting more than a single key and single value) are not supported.
+- RIGHT OUTER and FULL OUTER joins in the native query engine are not fully implemented. Queries run
  but results are not always correct.
-- Performance-related optimizations as mentioned in the [previous section](#join-performance).
-- Join conditions on a column containing a multi-value dimension.
+- Join conditions on a column can't contain a multi-value dimension.

 ### `unnest`