druid

Commit Graph

Author	SHA1	Message	Date
Victoria Lim	a800dae87a	doc: List Protobuf as a supported format (#13640 )	2023-01-06 15:09:37 -08:00
317brian	6bbf4266b2	docs: documentation for unnest datasource (#13479 ) Co-authored-by: Victoria Lim <vtlim@users.noreply.github.com>	2023-01-06 11:41:11 -08:00
Gian Merlino	7f3c117e3a	SQL: Improve docs around casts. (#13466 ) Main change: clarify that the "default value" for casts only applies if druid.generic.useDefaultValueForNull = true. Secondary change: adjust a bunch of wording from future to present tense.	2022-12-15 15:01:40 -08:00
Katya Macedo	78c1a2bd66	Remove limit from timeseries (#13457 ) CI build failures seem unrelated to docs	2022-12-02 12:19:59 -08:00
Jill Osborne	138a6de507	Update nested columns docs (#13461 ) * Update nested columns docs (cherry picked from commit `04206c5179`) * Update nested-columns.md (cherry picked from commit `8085ee7217`)	2022-12-01 10:47:32 -08:00
317brian	cc2e4a80ff	doc: add a basic JDBC tutorial (#13343 ) * initial commit for jdbc tutorial (cherry picked from commit 04c4adad71e5436b76c3425fe369df03aaaf0acb) * add commentary * address comments from charles * add query context to example * fix typo * add links * Apply suggestions from code review Co-authored-by: Frank Chen <frankchen@apache.org> * fix datatype * address feedback * add parameterize to spelling file. the past tense version was already there Co-authored-by: Frank Chen <frankchen@apache.org>	2022-11-30 16:25:35 -08:00
Jill Osborne	100a2aa4a2	Update and document experimental features (#13348 ) * Update and document experimental features * Updated * Update experimental-features.md * Update docs/development/experimental-features.md Co-authored-by: Abhishek Agarwal <1477457+abhishekagarwal87@users.noreply.github.com> * Updated after review * Updated * Update materialized-view.md * Update experimental-features.md Co-authored-by: Abhishek Agarwal <1477457+abhishekagarwal87@users.noreply.github.com>	2022-11-29 08:01:28 +05:30
Andreas Maechler	03175a2b8d	Add missing MSQ error code fields to docs (#13308 ) * Fix typo * Fix some spacing * Add missing fields * Cleanup table spacing * Remove durable storage docs again Thanks Brian for pointing out previous discussions. * Update docs/multi-stage-query/reference.md Co-authored-by: Charles Smith <techdocsmith@gmail.com> * Mark codes as code * And even more codes as code * Another set of spaces * Combine `ColumnTypeNotSupported` Thanks Karan. * More whitespaces and typos * Add spelling and fix links Co-authored-by: Charles Smith <techdocsmith@gmail.com>	2022-11-10 21:03:04 +05:30
Jill Osborne	965e41538e	Update nested columns doc (#13314 ) * Updated nested columns doc * Update nested-columns.md * Update nested-columns.md	2022-11-10 09:53:28 +08:00
Gian Merlino	8f90589ce5	Always return sketches from DS_HLL, DS_THETA, DS_QUANTILES_SKETCH. (#13247 ) * Always return sketches from DS_HLL, DS_THETA, DS_QUANTILES_SKETCH. These aggregation functions are documented as creating sketches. However, they are planned into native aggregators that include finalization logic to convert the sketch to a number of some sort. This creates an inconsistency: the functions sometimes return sketches, and sometimes return numbers, depending on where they lie in the native query plan. This patch changes these SQL aggregators to _never_ finalize, by using the "shouldFinalize" feature of the native aggregators. It already existed for theta sketches. This patch adds the feature for hll and quantiles sketches. As to impact, Druid finalizes aggregators in two cases: - When they appear in the outer level of a query (not a subquery). - When they are used as input to an expression or finalizing-field-access post-aggregator (not any other kind of post-aggregator). With this patch, the functions will no longer be finalized in these cases. The second item is not likely to matter much. The SQL functions all declare return type OTHER, which would be usable as an input to any other function that makes sense and that would be planned into an expression. So, the main effect of this patch is the first item. To provide backwards compatibility with anyone that was depending on the old behavior, the patch adds a "sqlFinalizeOuterSketches" query context parameter that restores the old behavior. Other changes: 1) Move various argument-checking logic from runtime to planning time in DoublesSketchListArgBaseOperatorConversion, by adding an OperandTypeChecker. 2) Add various JsonIgnores to the sketches to simplify their JSON representations. 3) Allow chaining of ExpressionPostAggregators and other PostAggregators in the SQL layer. 4) Avoid unnecessary FieldAccessPostAggregator wrapping in the SQL layer, now that expressions can operate on complex inputs. 5) Adjust return type to thetaSketch (instead of OTHER) in ThetaSketchSetBaseOperatorConversion. * Fix benchmark class. * Fix compilation error. * Fix ThetaSketchSqlAggregatorTest. * Hopefully fix ITAutoCompactionTest. * Adjustment to ITAutoCompactionTest.	2022-11-03 09:43:00 -07:00
arvindanugula	42384d85e7	Update nested-columns.md (#13227 ) typo error corrected.	2022-10-14 16:15:46 -07:00
Victoria Lim	02ad62a08c	Docs: update description of query priority default value (#13191 ) * update description of default for query priority * update order * update terms * standardize to query context parameters	2022-10-14 14:28:04 -07:00
Jill Osborne	548d810baa	Correct nested columns example (#13150 )	2022-09-28 10:39:56 +05:30
Apoorv Gupta	c8f4d72fb1	Fix documentation bug about injective lookups (#13147 ) replace mapping to `unique keys` with mapping to `unique values`.	2022-09-27 10:16:48 +08:00
hosswald	5ed5c83aab	Clarified the behaviour of SQL COUNT(DISTINCT dim) on multi-value dimensions (#13128 ) * Clarified the behaviour of COUNT(DISTINCT column) on multi-value columns * Update docs/querying/sql-aggregations.md Co-authored-by: Charles Smith <techdocsmith@gmail.com> Co-authored-by: Vadim Ogievetsky <vadimon@gmail.com> Co-authored-by: Charles Smith <techdocsmith@gmail.com>	2022-09-20 18:03:34 -07:00
Vadim Ogievetsky	bb0b810b1d	fix html tags in docs (#13117 ) * fix html tags in docs * revert not null	2022-09-18 19:40:33 -07:00
Gian Merlino	d4967c38f8	Various documentation updates. (#13107 ) * Various documentation updates. 1) Split out "data management" from "ingestion". Break it into thematic pages. 2) Move "SQL-based ingestion" into the Ingestion category. Adjust content so all conceptual content is in concepts.md and all syntax content is in reference.md. Shorten the known issues page to the most interesting ones. 3) Add SQL-based ingestion to the ingestion method comparison page. Remove the index task, since index_parallel is just as good when maxNumConcurrentSubTasks: 1. 4) Rename various mentions of "Druid console" to "web console". 5) Add additional information to ingestion/partitioning.md. 6) Remove a mention of Tranquility. 7) Remove a note about upgrading to Druid 0.10.1. 8) Remove no-longer-relevant task types from ingestion/tasks.md. 9) Move ingestion/native-batch-firehose.md to the hidden section. It was previously deprecated. 10) Move ingestion/native-batch-simple-task.md to the hidden section. It is still linked in some places, but it isn't very useful compared to index_parallel, so it shouldn't take up space in the sidebar. 11) Make all br tags self-closing. 12) Certain other cosmetic changes. 13) Update to node-sass 7. * make travis use node12 for docs Co-authored-by: Vadim Ogievetsky <vadim@ogievetsky.com>	2022-09-16 21:58:11 -07:00
Jill Osborne	1f69140623	Nested columns documentation (#12946 ) Co-authored-by: Clint Wylie <cjwylie@gmail.com> Co-authored-by: Victoria Lim <vtlim@users.noreply.github.com> Co-authored-by: Charles Smith <techdocsmith@gmail.com> Co-authored-by: brian.le <brian.le@imply.io>	2022-09-06 14:42:18 -07:00
Vadim Ogievetsky	897689c03b	remove mentions of DruidQueryRel from docs (#13033 ) * remove mentions of DruidQueryRel * Update docs/querying/sql-translation.md Co-authored-by: Victoria Lim <vtlim@users.noreply.github.com> * Update docs/querying/sql-translation.md Co-authored-by: Victoria Lim <vtlim@users.noreply.github.com> Co-authored-by: Victoria Lim <vtlim@users.noreply.github.com>	2022-09-06 13:37:27 -07:00
Jill Osborne	7a1e1f88bb	Remove experimental note from stable features (#12973 ) * Removed experimental note for features that are no longer experimental * Updated native batch doc	2022-08-25 09:26:46 -07:00
Karan Kumar	f7c6316992	Setting useNativeQueryExplain to true (#12936 ) * Setting useNativeQueryExplain to true * Update docs/querying/sql-query-context.md Co-authored-by: Santosh Pingale <pingalesantosh@gmail.com> * Fixing tests * Fixing broken tests Co-authored-by: Santosh Pingale <pingalesantosh@gmail.com>	2022-08-24 17:39:55 +05:30
Petar Petrov	6fec1d4c95	Add useNativeQueryExplain in sql query context documentation (#12924 ) (#12934 ) Co-authored-by: Petar Petrov <petar.petrov@system73.com>	2022-08-22 16:31:15 +05:30
Clint Wylie	f8097ccfaa	basic docs for nested column query functions (#12922 ) * basic docs for nested column query functions	2022-08-19 17:12:19 -07:00
Clint Wylie	69fe1f04e5	document virtualColumns in native query documentation, fix some redirects (#12917 ) * document virtualColumns in native query documentation, fix some redirects * after all that, forgot to run spellcheck locally * review stuff	2022-08-18 20:49:23 -07:00
Rohan Garg	5394838030	Enable conversion of join to filter by default (#12868 )	2022-08-13 20:37:43 +05:30
Gian Merlino	01d555e47b	Adjust "in" filter null behavior to match "selector". (#12863 ) * Adjust "in" filter null behavior to match "selector". Now, both of them match numeric nulls if constructed with a "null" value. This is consistent as far as native execution goes, but doesn't match the behavior of SQL = and IN. So, to address that, this patch also updates the docs to clarify that the native filters do match nulls. This patch also updates the SQL docs to describe how Boolean logic is handled in addition to how NULL values are handled. Fixes #12856. * Fix test.	2022-08-08 09:08:36 -07:00
Frank Chen	a544aff761	Document missed simple granularities (#12768 ) * Document missed simple granularities * Update docs/querying/granularities.md Co-authored-by: Victoria Lim <vtlim@users.noreply.github.com> * Update docs/querying/granularities.md Co-authored-by: Victoria Lim <vtlim@users.noreply.github.com> Co-authored-by: Victoria Lim <vtlim@users.noreply.github.com>	2022-07-14 14:02:28 +08:00
Victoria Lim	d8f8c56f94	Docs: Index page with all SQL functions (#12771 ) * list of all functions * add function names to spelling file	2022-07-14 09:59:55 +08:00
Gian Merlino	97207cdcc7	Automatic sizing for GroupBy dictionaries. (#12763 ) * Automatic sizing for GroupBy dictionary sizes. Merging and selector dictionary sizes currently both default to 100MB. This is not optimal, because it can lead to OOM on small servers and insufficient resource utilization on larger servers. It also invites end users to try to tune it when queries run out of dictionary space, which can make things worse if the end user sets it to too high. So, this patch: - Adds automatic tuning for selector and merge dictionaries. Selectors use up to 15% of the heap and merge buffers use up to 30% of the heap (aggregate across all queries). - Updates out-of-memory error messages to emphasize enabling disk spilling vs. increasing memory parameters. With the memory parameters automatically sized, it is more likely that an end user will get benefit from enabling disk spilling. - Removes the query context parameters that allow lowering of configured dictionary sizes. These complicate the calculation, and I don't see a reasonable use case for them. * Adjust tests. * Review adjustments. * Additional comment. * Remove unused import.	2022-07-11 08:20:50 -07:00
Jill Osborne	682ea7f32d	IMPLY-12348: Update description of UNION ALL in SQL syntax doc (#12710 ) * IMPLY-12348: Updated description of UNION ALL * Update docs/querying/sql.md Co-authored-by: Charles Smith <techdocsmith@gmail.com> * Update docs/querying/sql.md Co-authored-by: Charles Smith <techdocsmith@gmail.com> * Update sql.md * Update docs/querying/sql.md Co-authored-by: Victoria Lim <vtlim@users.noreply.github.com> Co-authored-by: Charles Smith <techdocsmith@gmail.com> Co-authored-by: Victoria Lim <vtlim@users.noreply.github.com>	2022-07-05 13:08:01 -07:00
Gian Merlino	0099940808	Add TIME_IN_INTERVAL SQL operator. (#12662 ) * Add TIME_IN_INTERVAL SQL operator. The operator is implemented as a convertlet rather than an OperatorConversion, because this allows it to be equivalent to using the >= and < operators directly. * SqlParserPos cannot be null here. * Remove unused import. * Doc updates. * Add words to dictionary.	2022-06-21 13:05:37 -07:00
Jill Osborne	f050069767	Segments doc update (#12344 ) * Corrected heading levels in segments doc * IMPLY-18394: Updated Segments doc * Update docs/design/segments.md Co-authored-by: Charles Smith <techdocsmith@gmail.com> * Update docs/design/segments.md Co-authored-by: Charles Smith <techdocsmith@gmail.com> * Update docs/design/segments.md Co-authored-by: Charles Smith <techdocsmith@gmail.com> * Update docs/design/segments.md Co-authored-by: Charles Smith <techdocsmith@gmail.com> * Update docs/design/segments.md Co-authored-by: Charles Smith <techdocsmith@gmail.com> * Update docs/design/segments.md Co-authored-by: Charles Smith <techdocsmith@gmail.com> * Update segments.md * Updated links to changed headings in Segments doc * Corrected spelling error * Update segments.md Incorporated suggestions from Paul Rogers. * Update index.md * Update segments.md * Update segments.md * Update segments.md * Update compaction.md * Update docs/design/segments.md fix typo * Update docs/ingestion/compaction.md Co-authored-by: Victoria Lim <vtlim@users.noreply.github.com> * Update docs/design/segments.md Co-authored-by: Victoria Lim <vtlim@users.noreply.github.com> * Update docs/design/segments.md Co-authored-by: Victoria Lim <vtlim@users.noreply.github.com> * Update docs/design/segments.md Co-authored-by: Victoria Lim <vtlim@users.noreply.github.com> * Update docs/design/segments.md Co-authored-by: Victoria Lim <vtlim@users.noreply.github.com> * Update docs/design/segments.md Co-authored-by: Victoria Lim <vtlim@users.noreply.github.com> * Update docs/design/segments.md Co-authored-by: Victoria Lim <vtlim@users.noreply.github.com> * Update docs/design/segments.md Co-authored-by: Victoria Lim <vtlim@users.noreply.github.com> * Update docs/design/segments.md Co-authored-by: Victoria Lim <vtlim@users.noreply.github.com> * Update docs/design/segments.md Co-authored-by: Victoria Lim <vtlim@users.noreply.github.com> * Update docs/design/segments.md Co-authored-by: Victoria Lim <vtlim@users.noreply.github.com> * Update docs/design/segments.md Co-authored-by: Victoria Lim <vtlim@users.noreply.github.com> Co-authored-by: Charles Smith <techdocsmith@gmail.com> Co-authored-by: Victoria Lim <vtlim@users.noreply.github.com>	2022-06-16 13:25:17 -07:00
Jill Osborne	9c8e6bb000	Addition to Multitenancy considerations doc (#12567 ) * Small addition to Multitenancy considerations doc * Update docs/querying/multitenancy.md Co-authored-by: Charles Smith <techdocsmith@gmail.com> * Update multitenancy.md Edit suggested by @kfaraz Co-authored-by: Charles Smith <techdocsmith@gmail.com>	2022-06-02 10:32:14 -07:00
Gian Merlino	37853f8de4	ConcurrentGrouper: Add mergeThreadLocal option, fix bug around the switch to spilling. (#12513 ) * ConcurrentGrouper: Add option to always slice up merge buffers thread-locally. Normally, the ConcurrentGrouper shares merge buffers across processing threads until spilling starts, and then switches to a thread-local model. This minimizes memory use and reduces likelihood of spilling, which is good, but it creates thread contention. The new mergeThreadLocal option causes a query to start in thread-local mode immediately, and allows us to experiment with the relative performance of the two modes. * Fix grammar in docs. * Fix race in ConcurrentGrouper. * Fix issue with timeouts. * Remove unused import. * Add "tradeoff" to dictionary.	2022-05-21 10:28:54 -07:00
Gian Merlino	65a1375b67	SQL: Add is_active to sys.segments, update examples and docs. (#11550 ) * SQL: Add is_active to sys.segments, update examples and docs. is_active is short for: (is_published = 1 AND is_overshadowed = 0) OR is_realtime = 1 It's important because this represents "all the segments that should be queryable, whether or not they actually are right now". Most of the time, this is the set of segments that people will want to look at. The web console already adds this filter to a lot of its queries, proving its usefulness. This patch also reworks the caveat at the bottom of the sys.segments section, so its information is mixed into the description of each result field. This should make it more likely for people to see the information. * Wording updates. * Adjustments for spellcheck. * Adjust IT.	2022-05-19 14:23:28 -07:00
Charles Smith	3e8d7a6d9f	Sql docs items (#12530 ) * touch up sql refactor * brush up SQL refactor * incorporate feedback * reorder sql * Update docs/querying/sql.md Co-authored-by: Victoria Lim <vtlim@users.noreply.github.com> Co-authored-by: Victoria Lim <vtlim@users.noreply.github.com>	2022-05-17 16:56:31 -07:00
Hellmar Becker	985640f103	Clarify the use of the Lookup API (#12088 ) * Update lookups.md * Update docs/querying/lookups.md Co-authored-by: 317brian <53799971+317brian@users.noreply.github.com> * Update docs/querying/lookups.md Co-authored-by: Charles Smith <techdocsmith@gmail.com> Co-authored-by: 317brian <53799971+317brian@users.noreply.github.com>	2022-05-16 07:50:24 -07:00
Gian Merlino	5b6727f319	Enable vectorized virtual column processing by default. (#12520 ) In the majority of cases, this improves performance. There's only one case I'm aware of where this may be a net negative: for time_floor(__time, <period>) where there are many repeated __time values. In nonvectorized processing, SingleLongInputCachingExpressionColumnValueSelector implements an optimization to avoid computing the time_floor function on every row. There is no such optimization in vectorized processing. IMO, we shouldn't mention this in the docs. Rationale: It's too fiddly of a thing: it's not guaranteed that nonvectorized processing will be faster due to the optimization, because it would have to overcome the inherent speed advantage of vectorization. So it'd always require testing to determine the best setting for a specific dataset. It would be bad if users disabled vectorization thinking it would speed up their queries, and it actually slowed them down. And even if users do their own testing, at some point in the future we'll implement the optimization for vectorized processing too, and it's likely that users that explicitly disabled vectorization will continue to have it disabled. I'd like to avoid this outcome by encouraging all users to enable vectorization at all times. Really advanced users would be following development activity anyway, and can read this issue	2022-05-16 15:43:53 +05:30
Gian Merlino	ff253fd8a3	Add setProcessingThreadNames context parameter. (#12514 ) setting thread names takes a measurable amount of time in the case where segment scans are very quick. In high-QPS testing we found a slight performance boost from turning off processing thread renaming. This option makes that possible.	2022-05-16 13:42:00 +05:30
Rohan Garg	75836a5a06	Add feature flag for sql planning of TimeBoundary queries (#12491 ) * Add feature flag for sql planning of TimeBoundary queries * fixup! Add feature flag for sql planning of TimeBoundary queries * Add documentation for enableTimeBoundaryPlanning * fixup! Add documentation for enableTimeBoundaryPlanning	2022-05-10 15:23:42 +05:30
Gian Merlino	a2bad0b3a2	Reduce allocations due to Jackson serialization. (#12468 ) * Reduce allocations due to Jackson serialization. This patch attacks two sources of allocations during Jackson serialization: 1) ObjectMapper.writeValue and JsonGenerator.writeObject create a new DefaultSerializerProvider instance for each call. It has lots of fields and creates pressure on the garbage collector. So, this patch adds helper functions in JacksonUtils that enable reuse of SerializerProvider objects and updates various call sites to make use of this. 2) GroupByQueryToolChest copies the ObjectMapper for every query to install a special module that supports backwards compatibility with map-based rows. This isn't needed if resultAsArray is set and all servers are running Druid 0.16.0 or later. This release was a while ago. So, this patch disables backwards compatibility by default, which eliminates the need to copy the heavyweight ObjectMapper. The patch also introduces a configuration option that allows admins to explicitly enable backwards compatibility. * Add test. * Update additional call sites and add to forbidden APIs.	2022-04-27 14:17:26 -07:00
Victoria Lim	63a993c33a	stringFirst and stringLast supported in ingestion (#12466 )	2022-04-22 10:28:49 +08:00
Victoria Lim	f95447070e	updated docs for sql query context (#12406 )	2022-04-21 11:19:39 -07:00
jacobtolar	0edc22179c	Document expression post-aggregators (#11896 ) * Document expression post-aggregators * Update docs/querying/post-aggregations.md Co-authored-by: Frank Chen <frankchen@apache.org> Co-authored-by: Frank Chen <frankchen@apache.org>	2022-04-19 10:36:19 +08:00
Victoria Lim	c86c48203e	recommendation for comparing strings and numbers (#12442 )	2022-04-18 09:28:32 -07:00
Peter Marshall	5167d328b1	Docs - query caching (#11584 ) * Update caching.md Knowledge from https://the-asf.slack.com/archives/CJ8D1JTB8/p1597781107153900 Update caching.md A few additional updates OTBO https://the-asf.slack.com/archives/CJ8D1JTB8/p1608669046041300 * Update caching.md Typos * Amendments on the segment cache Significant updates on content around the segment cache, pull process, and in-memory cache * Update docs/design/historical.md Co-authored-by: Charles Smith <techdocsmith@gmail.com> * Update docs/design/historical.md Co-authored-by: Charles Smith <techdocsmith@gmail.com> * Update docs/design/historical.md Co-authored-by: Charles Smith <techdocsmith@gmail.com> * Update docs/design/historical.md Co-authored-by: Charles Smith <techdocsmith@gmail.com> * Update docs/design/historical.md Co-authored-by: Charles Smith <techdocsmith@gmail.com> * Update docs/design/historical.md Co-authored-by: Charles Smith <techdocsmith@gmail.com> * Update docs/querying/caching.md Co-authored-by: Charles Smith <techdocsmith@gmail.com> * Update docs/querying/caching.md Co-authored-by: Charles Smith <techdocsmith@gmail.com> * Update docs/querying/caching.md Co-authored-by: Charles Smith <techdocsmith@gmail.com> * Update docs/querying/caching.md Co-authored-by: Charles Smith <techdocsmith@gmail.com> * Update docs/design/historical.md Co-authored-by: Charles Smith <techdocsmith@gmail.com> * Update docs/design/historical.md Co-authored-by: Charles Smith <techdocsmith@gmail.com> * Update docs/design/historical.md Co-authored-by: Charles Smith <techdocsmith@gmail.com> * Update docs/operations/basic-cluster-tuning.md Co-authored-by: Charles Smith <techdocsmith@gmail.com> * Update docs/querying/caching.md Co-authored-by: Charles Smith <techdocsmith@gmail.com> * Update docs/querying/caching.md Co-authored-by: Charles Smith <techdocsmith@gmail.com> * Update docs/querying/caching.md Co-authored-by: Charles Smith <techdocsmith@gmail.com> * Update docs/querying/caching.md Co-authored-by: Charles Smith <techdocsmith@gmail.com> * Update docs/querying/caching.md Co-authored-by: Charles Smith <techdocsmith@gmail.com> * Update docs/operations/basic-cluster-tuning.md Co-authored-by: Charles Smith <techdocsmith@gmail.com> * Update basic-cluster-tuning.md typo * Update docs/querying/caching.md Co-authored-by: Charles Smith <techdocsmith@gmail.com> * Whole-query caching update Made more succinct and removed specific config to change. * Update docs/design/historical.md Co-authored-by: Charles Smith <techdocsmith@gmail.com>	2022-04-18 17:00:21 +08:00
Adarsh Sanjeev	ef45a1551e	Convert inQueryThreshold into query context parameter. (#12357 ) Added Calcites InQueryThreshold as a query context parameter. Setting this parameter appropriately reduces the time taken for queries with large number of values in their IN conditions.	2022-03-22 18:33:57 +05:30
Gian Merlino	875e0696e0	GroupBy: Cap dictionary-building selector memory usage. (#12309 ) * GroupBy: Cap dictionary-building selector memory usage. New context parameter "maxSelectorDictionarySize" controls when the per-segment processing code should return early and trigger a trip to the merge buffer. Includes: - Vectorized and nonvectorized implementations. - Adjustments to GroupByQueryRunnerTest to exercise this code in the v2SmallDictionary suite. (Both the selector dictionary and the merging dictionary will be small in that suite.) - Tests for the new config parameter. * Fix issues from tests. * Add "pre-existing" to dictionary. * Simplify GroupByColumnSelectorStrategy interface by removing one of the writeToKeyBuffer methods. * Adjustments from review comments.	2022-03-08 13:13:11 -08:00
Karan Kumar	5794331eb1	Adding new config for disabling group by on multiValue column (#12253 ) As part of #12078 one of the followup's was to have a specific config which does not allow accidental unnesting of multi value columns if such columns become part of the grouping key. Added a config groupByEnableMultiValueUnnesting which can be set in the query context. The default value of groupByEnableMultiValueUnnesting is true, therefore it does not change the current engine behavior. If groupByEnableMultiValueUnnesting is set to false, the query will fail if it encounters a multi-value column in the grouping key.	2022-02-16 20:53:26 +05:30
somu-imply	eae163a797	Moving in filter check to broker (#12195 ) * Moving in filter check to broker * Adding more unit tests, making error message meaningful * Spelling and doc changes * Updating default to -1 and making this feature hide by default. The number of IN filters can grow upto a max limit of 100 * Removing upper limit of 100, updated docs * Making documentation more meaningful * Moving check outside to PlannerConfig, updating test cases and adding back max limit * Updated with some additional code comments * Missed removing one line during the checkin * Addressing doc changes and one forbidden API correction * Final doc change * Adding a speling exception, correcting a testcase * Reading entire filter tree to address combinations of ANDs and ORs * Specifying in docs that, this case works only for ORs * Revert "Reading entire filter tree to address combinations of ANDs and ORs" This reverts commit `81ca8f8496`. * Covering a class cast exception and updating docs * Counting changed Co-authored-by: Jihoon Son <jihoonson@apache.org>	2022-02-15 20:45:07 -08:00

1 2 3 4

196 Commits