druid

Commit Graph

Author	SHA1	Message	Date
Jill Osborne	61d986a179	Filters doc fix (#17553 )	2024-12-10 09:34:43 -08:00
Charles Smith	0325f62af2	[Docs] Remove ambiguous advice regarding TopN correctness (#17522 )	2024-11-27 11:41:28 -08:00
Akshat Jain	17215cd677	Remove support for Java 8 (#17466 ) All JDK 8 based CI checks have been removed. Images used in Dockerfile(s) have been updated to Java 17 based images. Documentation has been updated accordingly.	2024-11-21 15:33:08 +05:30
Katya Macedo	75d9ece665	Docs: update descriptions and default values (#17473 )	2024-11-13 16:29:27 -08:00
Ashwin Tumma	d5bb7de5cf	Fix Map Lookup Introspection Endpoints and update doc for Globally Cached Lookups (#17436 ) Map Lookup Introspection API endpoints /keys and /values no longer return an invalid JSON object. Also, update documentation to clarify the version returned by the /version introspection endpoint. --------- Co-authored-by: Ashwin Tumma <ashwin.tumma@salesforce.com>	2024-10-30 08:23:22 -07:00
Shivam Garg	6898a5a359	Removed Microsecond from Extract function (#17247 )	2024-10-11 05:32:26 +02:00
Charles Smith	5ed68622c3	[Docs] Update known issues for window functions (#17097 ) * draft update to known issues * Update known issues Remove addressed known issues. Clarify the issue with SELECT * queries.	2024-10-08 08:47:13 -07:00
317brian	1fc82a96bd	docs: update future development blurbs (#16939 ) Co-authored-by: Victoria Lim <vtlim@users.noreply.github.com>	2024-10-01 15:02:05 -07:00
Sree Charan Manamala	661614129e	Window Functions : Context Parameter to Enable Transfer of RACs over wire (#17150 )	2024-09-28 08:04:22 +02:00
Sree Charan Manamala	67d361c9bf	Window Functions : Remove enable windowing flag (#17087 )	2024-09-23 08:24:26 +02:00
Abhishek Radhakrishnan	39723e5401	Update note about `sys.tasks` table (#17096 ) Co-authored-by: Victoria Lim <vtlim@users.noreply.github.com>	2024-09-18 11:02:45 -07:00
Edgar Melendrez	64a4d115c5	[Docs] adding admonition for div (#17093 ) Co-authored-by: Victoria Lim <vtlim@users.noreply.github.com>	2024-09-17 13:54:49 -07:00
Lasse Mammen	307b8e3357	feat: json_merge expression and sql function (#17081 )	2024-09-17 18:27:34 +05:30
Victoria Lim	2e2f3cf66a	docs: Refresh docs for SQL input source (#17031 ) Co-authored-by: Charles Smith <techdocsmith@gmail.com>	2024-09-16 15:52:37 -07:00
Edgar Melendrez	48a758ee08	[docs] reverting changes for sql-functions.md (#17019 )	2024-09-06 16:07:32 -07:00
Edgar Melendrez	2d9e92ce78	[docs] Batch11 date and time functions (#16926 ) * first draft of functions * minor improvments * Update docs/querying/sql-functions.md * Update docs/querying/sql-scalar.md * Apply suggestions from code review Accepted as is Co-authored-by: Katya Macedo <38017980+ektravel@users.noreply.github.com> * applying next round of suggestions * fixing missing column name * addressing floor and ceil functions * Apply suggestions from code review Co-authored-by: Katya Macedo <38017980+ektravel@users.noreply.github.com> * Apply suggestions from code review Co-authored-by: Katya Macedo <38017980+ektravel@users.noreply.github.com> * re-wording TIMESTAMPADD --------- Co-authored-by: Benedict Jin <asdf2014@apache.org> Co-authored-by: Katya Macedo <38017980+ektravel@users.noreply.github.com>	2024-09-06 12:20:47 -07:00
Edgar Melendrez	ed811262e3	[docs] Batch13 IP functions (#16947 ) * new datasource * reviewing before pr * Update docs/querying/sql-functions.md * Apply suggestions from code review Co-authored-by: Katya Macedo <38017980+ektravel@users.noreply.github.com> * Apply suggestions from code review Co-authored-by: Katya Macedo <38017980+ektravel@users.noreply.github.com> * Apply suggestions from code review Co-authored-by: Charles Smith <techdocsmith@gmail.com> * Applying suggestions to IPV4_PARSE --------- Co-authored-by: Benedict Jin <asdf2014@apache.org> Co-authored-by: Katya Macedo <38017980+ektravel@users.noreply.github.com> Co-authored-by: Charles Smith <techdocsmith@gmail.com>	2024-09-06 12:19:36 -07:00
Edgar Melendrez	c49dc83b22	[docs] batch 12: reduction functions (#16930 ) * [docs] batch 12: reduction functions * Update docs/querying/sql-functions.md * Update docs/querying/sql-functions.md * Update docs/querying/sql-functions.md * applying suggestions * Apply suggestions from code review Co-authored-by: Katya Macedo <38017980+ektravel@users.noreply.github.com> --------- Co-authored-by: Benedict Jin <asdf2014@apache.org> Co-authored-by: Katya Macedo <38017980+ektravel@users.noreply.github.com>	2024-09-05 17:02:45 -07:00
Jill Osborne	b4d83a86c2	Middle Manager wording update in docs (#17005 )	2024-09-05 10:25:30 -07:00
Jill Osborne	3e031b9dc2	Add dynamic query params example (#16964 ) Co-authored-by: Victoria Lim <vtlim@users.noreply.github.com>	2024-08-27 14:27:13 -07:00
317brian	418da92228	docs: update query from deepstorage segment requirement (#16842 ) Co-authored-by: Katya Macedo <38017980+ektravel@users.noreply.github.com> Co-authored-by: Rishabh Singh <6513075+findingrish@users.noreply.github.com> Co-authored-by: Katya Macedo <38017980+ektravel@users.noreply.github.com>	2024-08-23 11:59:29 -07:00
Hugh Evans	60d4317968	Linked back to query granularity docs (#16883 ) * Linked back to query granularity docs * Update ingestion-spec.md clairfy about query granularities in the spec. * Update docs/design/storage.md Co-authored-by: Charles Smith <techdocsmith@gmail.com> * Update docs/ingestion/ingestion-spec.md Co-authored-by: Charles Smith <techdocsmith@gmail.com> * Update docs/querying/granularities.md Co-authored-by: Charles Smith <techdocsmith@gmail.com> * Apply suggestions from code review --------- Co-authored-by: Charles Smith <techdocsmith@gmail.com>	2024-08-23 08:44:19 -07:00
Gian Merlino	0603d5153d	Segments sorted by non-time columns. (#16849 ) * Segments primarily sorted by non-time columns. Currently, segments are always sorted by __time, followed by the sort order provided by the user via dimensionsSpec or CLUSTERED BY. Sorting by __time enables efficient execution of queries involving time-ordering or granularity. Time-ordering is a simple matter of reading the rows in stored order, and granular cursors can be generated in streaming fashion. However, for various workloads, it's better for storage footprint and query performance to sort by arbitrary orders that do not start with __time. With this patch, users can sort segments by such orders. For spec-based ingestion, users add "useExplicitSegmentSortOrder: true" to dimensionsSpec. The "dimensions" list determines the sort order. To define a sort order that includes "__time", users explicitly include a dimension named "__time". For SQL-based ingestion, users set the context parameter "useExplicitSegmentSortOrder: true". The CLUSTERED BY clause is then used as the explicit segment sort order. In both cases, when the new "useExplicitSegmentSortOrder" parameter is false (the default), __time is implicitly prepended to the sort order, as it always was prior to this patch. The new parameter is experimental for two main reasons. First, such segments can cause errors when loaded by older servers, due to violating their expectations that timestamps are always monotonically increasing. Second, even on newer servers, not all queries can run on non-time-sorted segments. Scan queries involving time-ordering and any query involving granularity will not run. (To partially mitigate this, a currently-undocumented SQL feature "sqlUseGranularity" is provided. When set to false the SQL planner avoids using "granularity".) Changes on the write path: 1) DimensionsSpec can now optionally contain a __time dimension, which controls the placement of __time in the sort order. If not present, __time is considered to be first in the sort order, as it has always been. 2) IncrementalIndex and IndexMerger are updated to sort facts more flexibly; not always by time first. 3) Metadata (stored in metadata.drd) gains a "sortOrder" field. 4) MSQ can generate range-based shard specs even when not all columns are singly-valued strings. It merely stops accepting new clustering key fields when it encounters the first one that isn't a singly-valued string. This is useful because it enables range shard specs on "someDim" to be created for clauses like "CLUSTERED BY someDim, __time". Changes on the read path: 1) Add StorageAdapter#getSortOrder so query engines can tell how a segment is sorted. 2) Update QueryableIndexStorageAdapter, IncrementalIndexStorageAdapter, and VectorCursorGranularizer to throw errors when using granularities on non-time-ordered segments. 3) Update ScanQueryEngine to throw an error when using the time-ordering "order" parameter on non-time-ordered segments. 4) Update TimeBoundaryQueryRunnerFactory to perform a segment scan when running on a non-time-ordered segment. 5) Add "sqlUseGranularity" context parameter that causes the SQL planner to avoid using granularities other than ALL. Other changes: 1) Rename DimensionsSpec "hasCustomDimensions" to "hasFixedDimensions" and change the meaning subtly: it now returns true if the DimensionsSpec represents an unchanging list of dimensions, or false if there is some discovery happening. This is what call sites had expected anyway. * Fixups from CI. * Fixes. * Fix missing arg. * Additional changes. * Fix logic. * Fixes. * Fix test. * Adjust test. * Remove throws. * Fix styles. * Fix javadocs. * Cleanup. * Smoother handling of null ordering. * Fix tests. * Missed a spot on the merge. * Fixups. * Avoid needless Filters.and. * Add timeBoundaryInspector to test. * Fix tests. * Fix FrameStorageAdapterTest. * Fix various tests. * Use forceSegmentSortByTime instead of useExplicitSegmentSortOrder. * Pom fix. * Fix doc.	2024-08-23 08:24:43 -07:00
Edgar Melendrez	c4981e34c4	[docs] Batch10 date and time functions (#16900 ) * just starting * TIME_PARSE and TIME_FORMAT remaining * fixing typo * adding last two functions * review sql-functions.md * Apply suggestions from code review Suggestions that were accepted as is Co-authored-by: Katya Macedo <38017980+ektravel@users.noreply.github.com> * Update docs/querying/sql-functions.md Co-authored-by: Katya Macedo <38017980+ektravel@users.noreply.github.com> * Update docs/querying/sql-functions.md needed to confirm that it did indeed return as a number Co-authored-by: Katya Macedo <38017980+ektravel@users.noreply.github.com> * reviewing remaining suggestions * addressing review for time_format * Apply suggestions from code review Accepted as is Co-authored-by: Katya Macedo <38017980+ektravel@users.noreply.github.com> * addressing final suggestion * time_zone -> timezone * timezone fix --------- Co-authored-by: Katya Macedo <38017980+ektravel@users.noreply.github.com>	2024-08-22 20:25:27 -07:00
Edgar Melendrez	fda2d19b88	[Docs] Batch09: only `lookup` (#16878 ) * [Docs] Batch09: only `lookup` * slight changes * Apply suggestions from code review Co-authored-by: Katya Macedo <38017980+ektravel@users.noreply.github.com> * applying suggestiontions * Apply suggestions from code review Co-authored-by: Victoria Lim <vtlim@users.noreply.github.com> * otherwise null -> otherwise returns null * updating definition in sql-scalar.md * Apply suggestions from code review Co-authored-by: Charles Smith <techdocsmith@gmail.com> * hoping to re-run web checks * change replaceMissingValueWith -> defaultValue * Update docs/querying/sql-scalar.md Co-authored-by: Katya Macedo <38017980+ektravel@users.noreply.github.com> * acronym_to_name -> airportcode_to_name * shortens `airportcode_to_name` to `code_to_name` --------- Co-authored-by: Katya Macedo <38017980+ektravel@users.noreply.github.com> Co-authored-by: Victoria Lim <vtlim@users.noreply.github.com> Co-authored-by: Charles Smith <techdocsmith@gmail.com>	2024-08-22 11:11:16 -07:00
Edgar Melendrez	725695342c	[Docs] Batch07: adding examples to string functions (#16862 ) * Lower,Upper,Lpad,Rpad,Parse_long * up to REGEXP_EXTRACT * batch 07 ready for review * updated definitions in scalar * Apply suggestions from code review Co-authored-by: Charles Smith <techdocsmith@gmail.com> * rpad and lpad * addressing comments * minor fixes * improving examples based on suggestions * matched -> matches * correcting typo * Apply suggestions from code review Co-authored-by: Charles Smith <techdocsmith@gmail.com> --------- Co-authored-by: Charles Smith <techdocsmith@gmail.com>	2024-08-21 15:08:25 -07:00
Edgar Melendrez	5b94839d9d	[Docs] Batch08: adding examples to string functions (#16871 ) * batch08 completed * reviewing batch08 * apply corrections suggestions by @FrankChen021	2024-08-16 10:15:30 +08:00
Hugh Evans	6cfdeb3894	Added a topic listing reserved keywords (#16843 )	2024-08-15 10:25:09 -07:00
Sree Charan Manamala	1f6d2c41d2	Update doc for dynamic parameters supporting array (#16660 ) Update dynamic parameter docs to provide how it can used to replace an Array	2024-08-07 12:33:37 +05:30
Edgar Melendrez	83cf4dc554	[docs] fixes to sql-scalar.md (#16826 ) Co-authored-by: Victoria Lim <vtlim@users.noreply.github.com> Co-authored-by: Katya Macedo <38017980+ektravel@users.noreply.github.com> Co-authored-by: Benedict Jin <asdf2014@apache.org> Co-authored-by: Katya Macedo <38017980+ektravel@users.noreply.github.com>	2024-08-06 17:12:57 -07:00
Edgar Melendrez	ebea34a814	[Docs] Batch06: starting string functions (#16838 ) * batch06, starting string functions * addind space after Syntax * quick change * correcting spelling * Update docs/querying/sql-functions.md * Update sql-functions.md * applying suggestions * Update docs/querying/sql-functions.md * Update docs/querying/sql-functions.md --------- Co-authored-by: Benedict Jin <asdf2014@apache.org> Co-authored-by: Charles Smith <techdocsmith@gmail.com>	2024-08-06 11:32:26 -07:00
Edgar Melendrez	3bb6d40285	[docs] batch 5 updating functions (#16812 ) * batch 5 * Update docs/querying/sql-functions.md * applying suggestions --------- Co-authored-by: Benedict Jin <asdf2014@apache.org>	2024-07-30 17:30:01 -07:00
Edgar Melendrez	85a8a1d805	[Docs]Batch04 - Bitwise numeric functions (#16805 ) * Batch04 - Bitwise numeric functions * Batch04 - Bitwise numeric functions * minor fixes * rewording bitwise_shift functions * rewording bitwise_shift functions * Update docs/querying/sql-functions.md * applying suggestions --------- Co-authored-by: Benedict Jin <asdf2014@apache.org>	2024-07-30 10:53:59 -07:00
Edgar Melendrez	028ee23a1e	[Docs] batch 03 - trig functions (#16795 ) * batch 03 - trig functions * Apply suggestions from code review Co-authored-by: Charles Smith <techdocsmith@gmail.com> * applying suggestions and corrections --------- Co-authored-by: Charles Smith <techdocsmith@gmail.com>	2024-07-26 13:11:17 -07:00
Clint Wylie	5da69a01cb	change arrayIngestMode default to array (#16789 ) * change arrayIngestMode default to array * remove arrayIngestMode flag option none * fix space * fix test	2024-07-25 15:09:40 +08:00
Zoltan Haindrich	7e3fab5bf9	Make WindowFrames more specific (#16741 ) Changes the WindowFrame internals / representation a bit; introduces dedicated frametypes for rows and groups which corresponds to the implemented processing methods	2024-07-25 04:57:36 +02:00
Edgar Melendrez	ca787885c9	[docs] batch02 of updating functions (#16761 ) * applying changes * ensuring batch is updated * Update docs/querying/sql-functions.md * raise -> raises * addressing review * Apply suggestions from code review Co-authored-by: Charles Smith <techdocsmith@gmail.com> --------- Co-authored-by: Benedict Jin <asdf2014@apache.org> Co-authored-by: Charles Smith <techdocsmith@gmail.com>	2024-07-24 15:28:57 -07:00
Edgar Melendrez	934c10b1cd	docs: Adding admonition box to warn about MVD (#16712 ) Co-authored-by: Victoria Lim <vtlim@users.noreply.github.com> Co-authored-by: Benedict Jin <asdf2014@apache.org>	2024-07-22 17:32:23 -07:00
Clint Wylie	35b876436b	remove native scan query legacy mode (#16659 )	2024-07-18 23:33:27 -07:00
Edgar Melendrez	721a65046f	docs: add examples for SQL functions (#16745 ) * updating first batch of numeric functions * First batch of functions * addressing first few comments * alphabetize list * draft with suggestions applied * minor discrepency expr -> <NUMERIC> * changed raises to calculates * Update docs/querying/sql-functions.md * switch to underscore * changed to exp(1) to match slack message * adding html text for trademark symbol to .spelling * fixed discrepancy between description and example --------- Co-authored-by: Benedict Jin <asdf2014@apache.org>	2024-07-18 17:06:22 -07:00
Gian Merlino	dbed1b0f50	Defer more expressions in vectorized groupBy. (#16338 ) * Defer more expressions in vectorized groupBy. This patch adds a way for columns to provide GroupByVectorColumnSelectors, which controls how the groupBy engine operates on them. This mechanism is used by ExpressionVirtualColumn to provide an ExpressionDeferredGroupByVectorColumnSelector that uses the inputs of an expression as the grouping key. The actual expression evaluation is deferred until the grouped ResultRow is created. A new context parameter "deferExpressionDimensions" allows users to control when this deferred selector is used. The default is "fixedWidthNonNumeric", which is a behavioral change from the prior behavior. Users can get the prior behavior by setting this to "singleString". * Fix style. * Add deferExpressionDimensions to SqlExpressionBenchmark. * Fix style. * Fix inspections. * Add more testing. * Use valueOrDefault. * Compute exprKeyBytes a bit lighter-weight.	2024-06-26 17:28:36 -07:00
Victoria Lim	836cdb48a5	docs: Migration guide for MVDs to arrays (#16516 ) Co-authored-by: Clint Wylie <cjwylie@gmail.com> Co-authored-by: Katya Macedo <38017980+ektravel@users.noreply.github.com> Co-authored-by: Benedict Jin <asdf2014@apache.org> Co-authored-by: Katya Macedo <38017980+ektravel@users.noreply.github.com>	2024-06-13 13:05:58 -07:00
317brian	8e11adfc6f	docs: remove outdated druidversion var from a page (#16570 ) Co-authored-by: asdf2014 <asdf2014@apache.org>	2024-06-10 15:30:36 +08:00
Gian Merlino	b837ce565b	Simplify serialized form of JsonInputFormat. (#15691 ) * Simplify serialized form of JsonInputFormat. Use JsonInclude for keepNullColumns, assumeNewlineDelimited, and useJsonNodeReader. Because the default value of keepNullColumns is variable, we store the original configured value rather than the derived value, and include if the original value is nonnull. * Fix test.	2024-06-05 20:01:14 -07:00
Charles Smith	8f78c901e7	docs: add lookups to the sidebar (#16530 ) Co-authored-by: 317brian <53799971+317brian@users.noreply.github.com>	2024-06-03 16:04:15 -07:00
Vadim Ogievetsky	a124c6cbbd	fix typo in extension name (#16466 )	2024-05-20 09:47:22 +08:00
Gian Merlino	72432c2e78	Speed up SQL IN using SCALAR_IN_ARRAY. (#16388 ) * Speed up SQL IN using SCALAR_IN_ARRAY. Main changes: 1) DruidSqlValidator now includes a rewrite of IN to SCALAR_IN_ARRAY, when the size of the IN is above inFunctionThreshold. The default value of inFunctionThreshold is 100. Users can restore the prior behavior by setting it to Integer.MAX_VALUE. 2) SearchOperatorConversion now generates SCALAR_IN_ARRAY when converting to a regular expression, when the size of the SEARCH is above inFunctionExprThreshold. The default value of inFunctionExprThreshold is 2. Users can restore the prior behavior by setting it to Integer.MAX_VALUE. 3) ReverseLookupRule generates SCALAR_IN_ARRAY if the set of reverse-looked-up values is greater than inFunctionThreshold. * Revert test. * Additional coverage. * Update docs/querying/sql-query-context.md Co-authored-by: Benedict Jin <asdf2014@apache.org> * New test. --------- Co-authored-by: Benedict Jin <asdf2014@apache.org>	2024-05-14 08:09:27 -07:00
Misha	b5958b6b07	Feature configurable calcite bloat (#16248 ) * Configurable bloat for calcite ProjectMergeRule implemented * Comment added * Default bloat value increased to 1000 * Implemented bloat configuration from QueryContext * Code refactored, docs updated --------- Co-authored-by: sviatahorau <mikhail.sviatahorau@deep.bi>	2024-05-06 20:43:39 +05:30
Gian Merlino	db82adcdfd	SCALAR_IN_ARRAY: Optimization and behavioral follow-ups. (#16311 ) * Four changes to scalar_in_array as follow-ups to #16306: 1) Align behavior for `null` scalars to the behavior of the native `in` and `inType` filters: return `true` if the array itself contains null, else return `null`. 2) Rename the class to more closely match the function name. 3) Add a specialization for constant arrays, where we build a `HashSet`. 4) Use `castForEqualityComparison` to properly handle cross-type comparisons. Additional tests verify comparisons between LONG and DOUBLE are now handled properly. * Fix spelling. * Adjustments from review.	2024-04-26 16:01:17 -07:00
Sree Charan Manamala	ad5701e891	new SCALAR_IN_ARRAY function analogous to DRUID_IN (#16306 ) * scalar_in function * api doc * refactor	2024-04-18 21:15:15 -07:00

348 Commits