druid

Commit Graph

Author	SHA1	Message	Date
Gian Merlino	f81855d607	Add unauthorized errorCode to query docs. (#5691 )	2018-04-26 13:06:25 -07:00
scrawfor	15f4ab2b31	Expose noop filter to users (#5597 )	2018-04-18 07:57:07 -07:00
Gian Merlino	fbf3fc178e	Timeseries: Add "grandTotal" option. (#5640 ) * Timeseries: Add "grandTotal" option. * Modify whitespace. * Checkstyle workaround.	2018-04-16 18:22:19 -07:00
Caroline1000	afa75e04b7	change header in overlord console; minor querydoc change (#5625 ) * change header in overlord console; minor querydoc change * remove change to overlord console * address Gian comments	2018-04-11 12:57:22 -07:00
Arup Malakar	0c4598c1fe	Fix typo in avatica java client code documenation (#5553 )	2018-03-29 16:36:40 -05:00
Dyana Rose	db508cf3ca	[docs] fix invalid example json (#5547 ) https://github.com/druid-io/druid/issues/5546	2018-03-28 13:53:38 -07:00
Clint Wylie	50e0e7f97d	Correct lookup documentation (#5537 ) fixes #5536	2018-03-26 17:01:02 -07:00
Atul Mohan	ec17a44e09	Add result level caching to Brokers (#5028 ) * Add result level caching to Brokers * Minor doc changes * Simplify sequences * Move etag execution * Modify cacheLimit criteria * Fix incorrect etag computation * Fix docs * Add separate query runner for result level caching * Update docs * Add post aggregated results to result level cache * Fix indents * Check byte size for exceeding cache limit * Fix indents * Fix indents * Add flag for result caching * Remove logs * Make cache object generation synchronous * Avoid saving intermediate cache results to list * Fix changes that handle etag based response * Release bytestream after use * Address PR comments * Discard resultcache stream after use * Fix docs * Address comments * Add comment about fluent workflow issue	2018-03-23 19:11:52 -07:00
Gian Merlino	0851f2206c	Expanded documentation for DataSketches aggregators. (#5513 ) Originally written by @AlexanderSaydakov in druid-io/druid-io.github.io#448. I also added redirects and updated links to point to the new datasketches-extension.html landing page for the extension, rather than to the old page about theta sketches.	2018-03-21 18:19:27 -07:00
Gian Merlino	7416d1d02d	Add "joda" option to timeFormat extractionFn. (#5448 )	2018-03-02 19:59:26 -08:00
Gian Merlino	f3796bc81b	SQL: Lower default JDBC frame size. (#5409 ) The previous default of 100,000 was a bit excessive and could easily lead to OOM errors on "select *" style queries.	2018-02-21 10:00:48 -08:00
Stephanie Rivera	77bb2f9c9f	Update post-aggregations.md (#5237 ) * Update post-aggregations.md I think this is more clear. I am not sure how multiplying by 100 is involved in averaging... * Update post-aggregations.md adding additional aggregator * Update post-aggregations.md	2018-02-06 16:26:39 -08:00
Gian Merlino	ed47a1e1a9	Lookups: Inherit "injective" from registered lookups, improve docs. (#5316 ) Code changes: - In the lookup-based extractionFns, inherit injective property from the lookup itself if not specified. Doc changes: - Add a "Query execution" section to the lookups doc explaining how injective lookups and their optimizations work. - Remove scary warnings against using registeredLookup extractionFns. They are necessary and important since they work with filters and function cascades -- two things that the dimension specs do not do. They deserve to be first class citizens. - Move the "registeredLookup" fn above the "lookup" fn. It's probably more commonly used, so the docs read better this way.	2018-02-01 18:30:19 -08:00
Gian Merlino	53e3c7d1b2	SQL: Add additional unsupported features to the docs. (#5290 )	2018-01-24 11:27:47 -08:00
Jihoon Son	972b4d189a	Fix topN doc (#5240 )	2018-01-09 20:10:13 -08:00
Chuanlei Ni	368d03146b	assign granularity.all to SelectQuery by default (#5091 )	2017-11-21 17:10:19 -08:00
Daniel	22c49b0d33	docs: fix broken link to broker configuration (#5105 )	2017-11-21 13:32:00 +09:00
Gian Merlino	486159ba8c	SQL: Add TIMESTAMPADD. (#5079 )	2017-11-16 12:00:34 -08:00
Gian Merlino	4fd4444b42	SQL: Add "array" result format, and document result formats. (#5032 ) * SQL: Add "array" result format, and document result formats. * Code style.	2017-11-13 20:24:06 -08:00
Himanshu	bbb678efd7	fix lookups endpoint collisions (#5058 ) * fix lookups endpoint collissions * fix errors	2017-11-09 17:39:53 -08:00
Roman Leventov	a8dc056c09	Add retries for coordinator fetch and lookup start in LookupReferencesManager (#5029 ) * Add retries for coordinator fetch and lookup start in LookupReferencesManager * Fix LookupConfigTest * Address comments * Address more comments * And address more comments * Address comms * Recognize 'not found' lookups in LookupReferencesManager.tryGetLookupListFromCoordinator(), by @egor-ryashin	2017-11-09 02:30:36 -03:00
Jonathan Wei	6840eabd87	Add Router connection balancers for Avatica queries (#4983 ) * Add Router connection balancers for Avatica queries * PR comments * Adjust test bounds * PR comments * Add doc comments * PR comments * PR comment * Checkstyle fix	2017-11-01 14:01:13 -07:00
Gian Merlino	d5e83f9d50	Fix docs for MOD. (#4971 )	2017-10-18 16:43:28 -07:00
Jihoon Son	52d7f74226	Add streaming aggregation as the last step of ConcurrentGrouper if data are spilled (#4704 ) * Add steaming grouper * Fix doc * Use a single dictionary while combining * Revert GroupByBenchmark * Removed unused code * More cleanup * Remove unused config * Fix some typos and bugs * Refactor Groupers.mergeIterators() * Add comments for combining tree * Refactor buildCombineTree * Refactor iterator * Add ParallelCombiner * Add ParallelCombinerTest * Handle InterruptedException * use AbstractPrioritizedCallable * Address comments * [maven-release-plugin] prepare release druid-0.11.0-sg * [maven-release-plugin] prepare for next development iteration * Address comments * Revert "[maven-release-plugin] prepare for next development iteration" This reverts commit `5c6b31e488`. * Revert "[maven-release-plugin] prepare release druid-0.11.0-sg" This reverts commit `0f5c3a8b82`. * Fix build failure * Change list to array * rename sortableIds * Address comments * change to foreach loop * Fix comment * Revert keyEquals() * Remove loop * Address comments * Fix build fail * Address comments * Remove unused imports * Fix method name * Split intermediate and leaf combine degrees * Add comments to StreamingMergeSortedGrouper * Add more comments and fix overflow * Address comments * ConcurrentGrouperTest cleanup * add thread number configuration for parallel combining * improve doc * address comments * fix build	2017-10-17 23:24:08 -07:00
Gian Merlino	f51f346e36	SQL: Fix POWER doc, add test. (#4953 )	2017-10-13 14:38:15 -07:00
Gian Merlino	5cfc7f9ef7	Fix formatting of SQL TRIM docs. (#4951 )	2017-10-13 14:38:06 -07:00
Atul Mohan	c07678b143	Synchronization of lookups during startup of druid processes (#4758 ) * Changes for lookup synchronization * Refactor of Lookup classes * Minor refactors and doc update * Change coordinator instance to be retrieved by DruidLeaderClient * Wait before thread shutdown * Make disablelookups flag true by default * Update docs * Rename flag * Move executorservice shutdown to finally block * Update LookupConfig * Refactoring and doc changes * Remove lookup config constructor * Revert Lookupconfig constructor changes * Add tests to LookupConfig * Make executorservice local * Update LRM * Move ListeningScheduledExecutorService to ExecutorCompletionService * Move exception to outer block * Remove check to see future is done * Remove unnecessary assignment * Add logging	2017-10-12 21:22:24 -05:00
Gian Merlino	b20e3038b6	SQL: Upgrade to Calcite 1.14.0, some refactoring of internals. (#4889 ) * SQL: Upgrade to Calcite 1.14.0, some refactoring of internals. This brings benefits: - Ability to do GROUP BY and ORDER BY with ordinals. - Ability to support IN filters beyond 19 elements (fixes #4203). Some refactoring of druid-sql internals: - Builtin aggregators and operators are implemented as SqlAggregators and SqlOperatorConversions rather being special cases. This simplifies the Expressions and GroupByRules code, which were becoming complex. - SqlAggregator implementations are no longer responsible for filtering. Added new functions: - Expressions: strpos. - SQL: TRUNCATE, TRUNC, LENGTH, CHAR_LENGTH, STRLEN, STRPOS, SUBSTR, and DATE_TRUNC. * Add missing @Override annotation. * Adjustments for forbidden APIs. * Adjustments for forbidden APIs. * Disable GROUP BY alias. * Doc reword.	2017-10-10 12:44:05 -07:00
Gian Merlino	4e1d0f49d8	Docs: Fix link to broker configuration. (#4934 )	2017-10-10 11:18:46 -07:00
Gian Merlino	2ce8123bdb	Move scan-query from a contrib extension into core. (#4751 ) * Move scan-query from a contrib extension into core. Based on a proposal at: https://groups.google.com/d/topic/druid-development/ME_OatUDnbk/discussion This patch also adds support for virtual columns to the Scan query, and updates Druid SQL to use Scan instead of Select. This patch also makes some behavioral changes to handling of the __time column. In particular, it is now is returned as "__time" rather than "timestamp"; it is no longer included if you do not specifically ask for it in your "columns"; and it is returned as a long rather than a string. Users can revert time handling to the legacy extension behavior by setting "legacy" : true in their queries, or setting the property druid.query.scan.legacy = true. This is meant to provide a migration path for users that were formerly using the contrib extension. * Adjustments from review. * Add back Select query. * Adjust SQL docs. * Restore SelectQuery link.	2017-09-13 09:51:24 -07:00
Gian Merlino	4909c48b0c	SQL: Full TRIM support. (#4750 ) * SQL: Full TRIM support. - Support trimming arbitrary characters - Support BOTH, LEADING, and TRAILING * Remove unused import. * Fix tests, add RTRIM / LTRIM. * Remove unused imports. * BTRIM and docs. * Replace for with foreach.	2017-09-12 11:49:08 -07:00
Gian Merlino	9078925cab	Docs for finalizingFieldAccess post-aggregator. (#4737 )	2017-08-31 11:45:49 -07:00
Gian Merlino	daf3c5f927	Add "round" option to cardinality and hyperUnique aggregators. (#4720 ) * Add "round" option to cardinality and hyperUnique aggregators. Also turn it on by default in SQL, to make math on distinct counts work more as expected. * Fix some compile errors. * Fix test. * Formatting.	2017-08-28 14:52:11 -07:00
Jihoon Son	f3f2cd35e1	Array-based aggregation for groupBy query (#4576 ) * Array-based aggregation * Fix handling missing grouping key * Handle invalid offset * Fix compilation * Add cardinality check * Fix cardinality check * Address comments * Address comments * Address comments * Address comments * Cleanup GroupByQueryEngineV2.process * Change to Byte.SIZE * Add flatMap	2017-08-03 20:04:54 +03:00
Gian Merlino	d4ef0f6d94	Improved SQL support for floats and doubles. (#4598 ) * Improved SQL support for floats and doubles. - Use Druid FLOAT for SQL FLOAT, and Druid DOUBLE for SQL DOUBLE, REAL, and DECIMAL. - Use float* aggregators when appropriate. - Add tests involving both float and double columns. - Adjust documentation accordingly. * CR comments. * Fix braces.	2017-07-25 13:54:44 -07:00
Slim	71e7a4c054	Adding double colums supports (#4491 ) * add double columns support * Fix numbers and expected results in UTs * adding float aggregators * fix IT expected test results * fix comments * more fixes * fix comp * fix test * refactor double and float aggregator factories * fix * fix UTs * fix comments * clean unused code * fix more comments * undo unnecessary changes * fix null issue * refactor TopNColumnSelectorStrategyFactory * fix docs * refactor NumericTopNColumnSelectorStrategy * fix return * fix comments * handle the null case in DimesionIndexer * more null fixing * cosmetic changes	2017-07-20 10:14:14 +03:00
Gian Merlino	16817e408d	SQL + Expressions = Best friends forever. (#4360 ) * SQL + Expressions = Best friends forever. - Use expressions as a projection layer for anything that can't be expressed using traditional Druid extractionFns. Sometimes they're embedded directly (like "expression" filters, builtin aggregators, or "expression" post-aggregators). Sometimes they're referenced through virtual columns (like dimensionSpecs, which can't innately reference functions of more than one column without the virtual column layer). - Add many new functions and operators, taking advantage of the expression capability (see the querying/sql.md doc). - Improve consistency of constant reduction and of casting by using Druid expressions for this instead of Calcite's RexExecutor. * Fix casting bug, and other code review comments. * Fix docs.	2017-07-07 08:48:26 -07:00
jeffhartley	3e7f7720a1	update aggregations.md re: rollup (#4455 ) noted that rollup could be on or off	2017-06-23 14:28:59 -07:00
Jonathan Wei	3b70995bb3	Configurable row limit for JDBC frames (#4417 )	2017-06-16 17:07:40 -07:00
Amar Ramachandran	fc80df339e	Fix incorrect name (#4386 )	2017-06-09 13:32:17 -04:00
kaijianding	551a89bd67	serialize DateTime As Long to improve json serde performance (#4038 )	2017-06-06 10:08:51 -07:00
Jonathan Wei	b90c28e861	Support limit push down for GroupBy (#3873 ) * Support limit push down for GroupBy V2 * Use orderBy spec ordering when applying limit push down * PR Comments * Remove unused var * Checkstyle fixes * Fix test * Add comment on non-final variables, fix checkstyle * Address PR comments * PR comments * Remove unnecessary buffer reset * Fix missing @JsonProperty annotation	2017-06-02 15:39:04 -07:00
fanjieqi	2e933e1413	fix a bug in select-query.md which the property_form lack of the『granularity』 (#4327 ) There result would be {"error"=>"Unknown exception", "errorMessage"=>nil, "errorClass"=>"java.lang.NullPointerException", "host"=>nil} when the json lack of 『granularity』.	2017-05-30 17:04:39 -07:00
Kamal Gurala	dcb07d6958	Option to configure default analysis types in SegmentMetadataQuery (#4259 ) * Option to configure default analysis types * Updated Docs and renamed * Added serde tests and Null handling * Fixed Documentation * Updated implementation * Updated implementation * Updated implementation * Added usingDefaultIntervals in Builder * Updated implementation * Updated implementation and added failing test * filterSegments implementation updated * Updated imlementation * Padding * Add missing Override * Updated implementation * Fixed a naming bug * Fixed bug * Removed comment	2017-05-26 12:12:39 -07:00
zwang180	2c55a935f8	Delete a duplicate "Bucket Extraction Function" section at the bottom of "Querying"-"DimensionSpec" page (#4331 )	2017-05-25 14:16:00 -07:00
Himanshu	136b2fae72	improve query timeout handling and limit max scatter-gather bytes (#4229 ) * improve query timeout handling and limit max scatter-gather bytes * address review comments	2017-05-16 12:47:32 -05:00
Himanshu	417714d228	additional lookup status discovery http endpoints at coordinator (#4228 ) * additional lookup status discovery http endpoints at coordinator * more changes * jsonize the error msgs as well * fix tests	2017-05-04 11:15:30 -07:00
Himanshu	5a5a2749cd	improvements to coordinator lookups management (#3855 ) * coordinator lookups mgmt improvements * revert replaces removal, deprecate it instead * convert and use older specs stored in db * more tests and updates * review comments * add behavior for 0.10.0 to 0.9.2 downgrade * incorporating more review comments * remove explicit lock and use LifecycleLock in LookupReferencesManager. use LifecycleLock in LookupCoordinatorManager as well * wip on LookupCoordinatorManager * lifecycle lock * refactor thread creation into utility method * more review comments addressed * support smooth roll back of lookup snapshots from 0.10.0 to 0.9.2 * correctly use LifecycleLock in LookupCoordinatorManager and remove synchronization from start/stop * run lookup mgmt on leader coordinator only * wip: changes to do multiple start() and stop() on LookupCoordinatorManager * lifecycleLock fix usage in LookupReferencesManagerTest * add LifecycleLock back * fix license hdr * some fixes * make LookupReferencesManager.getAllLookupsState() consistent while still being lockless * address review comments * addressing leventov's comments * address charle's comments * add IOE.java * for safety in LookupReferencesManager mainThread check for lifecycle started state on each loop in addition to interrupt * move thread creation utility method to Execs * fix names * add tests for LookupCoordinatorManager.lookupManagementLoop() * add further tests for figuring out toBeLoaded and toBeDropped on LookupCoordinatorManager * address leventov comments * remove LookupsStateWithMap and parameterize LookupsState * address review comments * address more review comments * misc fixes	2017-04-28 08:41:38 -05:00
asrayousuf	e4fbc2bc5b	Updating the description of useCache (#4200 ) Updating the description of useCache Updating query-context doc based on Gian's comment Updating query-context doc based on Gian's comment Updating query-context doc based on Gian's comment Updating query-context doc based on Gian's comment	2017-04-25 10:26:15 -07:00
Jihoon Son	5b69f2eff2	Make timeout behavior consistent to document (#4134 ) * Make timeout behavior consistent to document * Refactoring BlockingPool and add more methods to QueryContexts * remove unused imports * Addressed comments * Address comments * remove unused method * Make default query timeout configurable * Fix test failure * Change timeout from period to millis	2017-04-19 09:47:53 +09:00
Gian Merlino	b2954d5fea	Better groupBy error messages and docs around resource limits. (#4162 ) * Better groupBy error messages and docs around resource limits. * Fix BufferGrouper test from datasketches. * Further clarify.	2017-04-13 10:38:53 -07:00
Gian Merlino	dd6c0ab509	Add SQL REGEXP_EXTRACT function; add "index" to "regex" extractionFn. (#4055 ) * Add SQL REGEXP_EXTRACT function; add "index" to "regex" extractionFn. * Fix tests.	2017-03-24 17:38:36 -07:00
Erik Dubbelboer	2cbc4764f8	Comparing dimensions to each other in a filter (#3928 ) Comparing dimensions to each other using a select filter	2017-03-23 18:23:46 -07:00
Gian Merlino	db15d494ca	Update docs for query filter HavingSpecs. (#4063 )	2017-03-15 13:59:09 -04:00
Gian Merlino	3216134f8c	SQL: Make row extractions extensible and add one for lookups. (#3991 ) This is a reopening of #3989, since that PR was merged to master prematurely and accidentally.	2017-03-13 21:56:16 -07:00
Gian Merlino	cab2e2f5d5	Add docs about filtering and indexes on numeric columns. (#4035 )	2017-03-10 12:48:59 -08:00
Gian Merlino	960769c583	SQL: Fix example INFORMATION_SCHEMA query. (#4017 )	2017-03-06 16:07:47 -08:00
Gian Merlino	4ca5270e88	Ignore chunkPeriod for groupBy v2, fix chunkPeriod for irregular periods. (#4004 ) * Ignore chunkPeriod for groupBy v2, fix chunkPeriod for irregular periods. Includes two fixes: - groupBy v2 now ignores chunkPeriod, since it wouldn't have helped anyway (its mergeResults returns a lazy sequence) and it generates incorrect results. - Fix chunkPeriod handling for periods of irregular length, like "P1M" or "P1Y". Also includes doc and test fixes: - groupBy v1 was no longer being tested by GroupByQueryRunnerTest since #3953, now it is once again. - chunkPeriod documentation was misleading due to its checkered past. Updated it to be more accurate. * Remove unused import. * Restore buffer size.	2017-03-06 12:27:02 -06:00
Gian Merlino	337f3870d8	Fix TimeFormatExtractionFn getCacheKey when tz, locale are not provided. (#4007 ) * Fix TimeFormatExtractionFn getCacheKey when tz, locale are not provided. * Remove unused import. * Use defaults in cache key.	2017-03-04 17:41:59 -08:00
Gian Merlino	af5a4cce3c	SQL: Clarify approximate distinct count behavior. (#4000 )	2017-03-03 13:42:30 -08:00
Himanshu	e7e3c2dc5a	support singleThreaded flag for groupBy-v2 as well (#3992 )	2017-03-03 23:43:06 +05:30
Gian Merlino	4a56d7d8a0	SQL: Ability to generate exact distinct count queries. (#3999 )	2017-03-03 23:40:36 +05:30
Gian Merlino	3e8dbd59f8	Fix groupBy docs to reflect that 'v2' is default. (#3993 )	2017-03-02 15:13:39 -08:00
Gian Merlino	e63eefd7ff	Revert "SQL: Make row extractions extensible and add one for lookups. (#3989 )" The PR was merged to master accidentally. This reverts commit `23927a3c96`.	2017-03-01 17:06:12 -08:00
Jonathan Wei	5fb1638534	Add default configuration for select query 'fromNext' parameter (#3986 ) * Add default configuration for select query 'fromNext' parameter * PR comments * Fix PagingSpec config injection * Injection fix for test	2017-03-01 17:05:35 -08:00
Gian Merlino	23927a3c96	SQL: Make row extractions extensible and add one for lookups. (#3989 ) * SQL: Make row extractions extensible and add one for lookups. * Fix QuantileSqlAggregatorTest.	2017-03-01 17:03:43 -08:00
Jihoon Son	7200dce112	Atomic merge buffer acquisition for groupBys (#3939 ) * Atomic merge buffer acquisition for groupBys * documentation * documentation * address comments * address comments * fix test failure * Addressed comments - Add InsufficientResourcesException - Renamed GroupByQueryBrokerResource to GroupByQueryResource * addressed comments * Add takeBatch() to BlockingPool	2017-02-22 14:49:37 -06:00
Gian Merlino	e7d01b67b6	Move SQL configs to sql.md. (#3959 ) This puts all the SQL stuff in one place. It also makes life easier by pointing out that configs be made in either common.runtime.properties or the broker runtime.properties.	2017-02-22 08:37:24 -08:00
Jonathan Wei	bc33b68b51	Use GroupBy V2 as default (#3953 ) * Use GroupBy V2 as default * Remove unused line * Change assert to exception propagation	2017-02-18 07:40:40 -08:00
Gian Merlino	ca6053d045	SQL: Resolve column type conflicts in favor of newer segments. (#3930 ) * SQL: Resolve column type conflicts in favor of newer segments. Helps with schema evolution from e.g. long -> float, which is supported on the query side. * Take columns from highest timestamp instead of max segment id. * Fixes and docs.	2017-02-15 17:48:49 -08:00
Gian Merlino	16ef513c7d	SQL: Add context and contextual functions to planner. (#3919 ) * SQL: Add context and contextual functions to planner. Added support for context parameters specified as JDBC connection properties or a JSON object for SQL-over-JSON-over-HTTP. Also added features that depend on context functionality: - Added CURRENT_DATE, CURRENT_TIME, CURRENT_TIMESTAMP functions. - Added support for time zones other than UTC via a "timeZone" context. - Pass down query context to Druid queries too. Also some bug fixes: - Fix DATE handling, it was largely done incorrectly before. - Fix CAST(__time TO DATE) which should do a floor-to-day. - Fix non-equality comparisons to FLOOR(__time TO X). - Fix maxQueryCount property. * Pass down context to nested queries too.	2017-02-15 14:09:14 -08:00
Jihoon Son	a459db68b6	Fine grained buffer management for groupby (#3863 ) * Fine-grained buffer management for group by queries * Remove maxQueryCount from GroupByRules * Fix code style * Merge master * Fix compilation failure * Address comments * Address comments - Revert Sequence - Add isInitialized() to Grouper - Initialize the grouper in RowBasedGrouperHelper.Accumulator - Simple refactoring RowBasedGrouperHelper.Accumulator - Add tests for checking the number of used merge buffers - Improve docs * Revert unnecessary changes * change to visible to testing * fix misspelling	2017-02-14 12:55:54 -08:00
DaimonPl	a2875a4d91	pre-computed HLL support for hyperUnique aggregator (#3909 )	2017-02-13 15:26:20 -08:00
Himanshu	9dfcf0763a	disable javascript execution by default (#3818 )	2017-02-13 15:11:18 -08:00
Jonathan Wei	ca2b04f0fd	Add long/float ColumnSelectorStrategy implementations (#3838 ) * Add long/float ColumnSelectorStrategy implementations * Address PR comments * Add String strategy with internal dictionary to V2 groupby, remove dict from numeric wrapping selectors, more tests * PR comments * Use BaseSingleValueDimensionSelector for long/float wrapping * remove unused import * Address PR comments * PR comments * PR comments * More PR comments * Fix failing calcite histogram subquery tests * ScanQuery test and comment about isInputRaw * Add outputType to extractionDimensionSpec, tweak SQL tests * Fix limit spec optimization for numerics * Add cardinality sanity checks to TopN * Fix import from merge * Add tests for filtered dimension spec outputType * Address PR comments * Allow filtered dimspecs on numerics * More comments	2017-02-08 20:39:29 -08:00
Gian Merlino	ac84a3e011	SQL: Add resolution parameter, fix filtering bug with APPROX_QUANTILE (#3868 ) * SQL: Add resolution parameter to quantile agg, rename to APPROX_QUANTILE. * Fix bug with re-use of filtered approximate histogram aggregators. Also add APPROX_QUANTILE tests for filtering and running on complex columns. Includes some slight refactoring to allow tests to make DruidTables that include complex columns. * Remove unused import	2017-01-25 18:39:26 -08:00
Gian Merlino	d51f5e058d	SQL: Ditch CalciteConnection layer and add DruidMeta, extension aggregators. (#3852 ) * SQL: Ditch CalciteConnection layer and add DruidMeta, extension aggregators. Switched from CalciteConnection to Planner, bringing benefits: - CalciteConnection's JDBC interface no longer sits between the SQL server (HTTP/Avatica) and Druid's query layer. Instead, the SQL servers can use Druid Sequence objects directly, reducing overhead in the query return path. - Implemented our own Planner-based Avatica Meta, letting us control connection timeouts and connection / statement limits. The previous CalciteConnection-based implementation didn't have any limits or timeouts. - The Planner interface lets us override the operator table, opening up SQL language extensions. This patch includes two: APPROX_COUNT_DISTINCT in core, and a QUANTILE aggregator in the druid-histogram extension. Also: - Added INFORMATION_SCHEMA metadata schema. - Added tests for Unicode literals and escapes. * Verify statement is actually open before closing it. * More detailed INFORMATION_SCHEMA docs.	2017-01-19 16:32:20 -08:00
Gian Merlino	e86859b228	SQL support for nested groupBys. (#3806 ) * SQL support for nested groupBys. Allows, for example, doing exact count distinct by writing: SELECT COUNT(*) FROM (SELECT DISTINCT col FROM druid.foo) Contrast with approximate count distinct, which is: SELECT COUNT(DISTINCT col) FROM druid.foo Add deeply-nested groupBy docs, tests, and maxQueryCount config. * Extract magic constants into statics. * Rework rules to put preconditions in the "matches" method.	2017-01-11 18:32:53 -08:00
Jihoon Son	c099977a5b	Add an option to SearchQuery to choose a search query execution strategy (#3792 ) * Add an option to SearchQuery to choose a search query execution strategy. Supported strategies are 1) Index-only query execution 2) Cursor-based scan 3) Auto: choose an efficient strategy for a given query * Add SearchStrategy and SearchQueryExecutor * Address comments * Rename strategies and set UseIndexesStrategy as the default strategy * Add a cost-based planner for auto strategy * Add document * Fix code style * apply code style * apply comments	2017-01-10 18:04:20 -08:00
Gian Merlino	dd63f54325	Built-in SQL. (#3682 )	2016-12-16 17:15:59 -08:00
Jonathan Wei	2bfcc8a592	First and Last Aggregator (#3566 ) * add first and last aggregator * add test and fix * moving around * separate aggregator valueType * address PR comment * add finalize inner query and adjust v1 inner indexing * better test and fixes * java-util import fixes * PR comments * Add first/last aggs to ITWikipediaQueryTest	2016-12-16 15:26:40 -08:00
Himanshu	ed322a4beb	remove size from default analysisTypes list for segmentMetadata query (#3773 )	2016-12-13 18:01:21 -08:00
Erik Dubbelboer	bb9e35e1af	Add Greatest and Least post aggregations (#3567 )	2016-12-07 17:58:23 -08:00
Himanshu	45da7e48f1	groupBy sort results by (dimensions,timestamp) instead of (timestamp,dimension) (#3672 ) * sortByDimsFirst flag for groupBy query * Remove need for KeyType in Grouper<KeyType> to be Comparable<KeyType> * fix review comments * fix review comments regarding removing code duplication of dim/time comparison * move comparator for KeyType object to KeySerdeFactory so that creation of comparator does not need KeySerde * remove unnecessary system.out.println * make access static var NATURAL_NULLS_FIRST directly * further review comments addressing	2016-12-06 09:48:56 -08:00
Gian Merlino	353fee79dd	Add "asMillis" option to "timeFormat" extractionFn. (#3733 ) This is useful for chaining extractionFns that all want to treat time as millis, such as having a javascript extractionFn after a timeFormat.	2016-12-02 13:45:16 -08:00
Gian Merlino	102375d9bb	Add "strlen" extractionFn. (#3731 )	2016-12-02 12:08:51 -08:00
Gian Merlino	4c5d10f8a3	Add DimFilterHavingSpec. (#3727 ) * Add DimFilterHavingSpec. * Add test for DimFilterHavingSpec with extractionFns.	2016-12-02 10:04:30 -08:00
Gian Merlino	e4465e63bd	Fix ordering of sections on dimensionspecs.md. (#3722 ) The Filtered and List DimensionSpecs were mixed in with the extraction functions.	2016-11-29 16:28:36 -08:00
Joan Viladrosa	2df98bcaa6	Fixed Missing commas in json example of Lookup (#3680 )	2016-11-15 14:56:18 +09:00
Gian Merlino	2c504b6258	Add "like" filter. (#3642 ) * Add "like" filter. * Addressed some PR comments. * Slight simplifications to LikeFilter. * Additional simplifications. * Fix comment in LikeFilter. * Clarify comment in LikeFilter. * Simplify LikeMatcher a bit. * No use going through the optimized path if prefix is empty. * Add more tests.	2016-11-04 23:25:03 +05:30
Gian Merlino	f8d71fc602	groupBy: Fix maxMergingDictionarySize config. (#3488 )	2016-09-22 10:02:33 -07:00
Gian Merlino	e0e28866ee	JavaScript docs: Fix links and typos, add to TOC. (#3457 )	2016-09-13 15:26:44 -07:00
Gian Merlino	76a24054e3	JavaScript docs, including docs for globals. (#3454 )	2016-09-13 13:46:55 -07:00
Gian Merlino	1344e3c3af	Clearer filter docs. (#3448 )	2016-09-09 13:47:13 -07:00
Gian Merlino	1e3f94237e	groupBy v2: Configurable load factor. (#3437 ) Also change defaults: - bufferGrouperMaxLoadFactor from 0.75 to 0.7. - maxMergingDictionarySize to 100MB from 25MB, should be more appropriate for most heaps.	2016-09-07 14:14:59 -05:00
Gian Merlino	e9050c2b4c	TimeFormatExtractionFn: Allow null formats (equivalent to ISO8601) and granular bucketing. (#3411 )	2016-08-31 20:58:53 +05:30
Jonathan Wei	4e91330a17	Use DimensionSpec in CardinalityAggregatorFactory (#3406 ) * Use DimensionSpec in CardinalityAggregatorFactory * Address PR comments * Fix requiredFields()	2016-08-30 15:54:02 -07:00
rajk-tetration	362b9266f8	Adding filters for TimeBoundary on backend (#3168 ) * Adding filters for TimeBoundary on backend Signed-off-by: Balachandar Kesavan <raj.ksvn@gmail.com> * updating TimeBoundaryQuery constructor in QueryHostFinderTest * add filter helpers * update filterSegments + test * Conditional filterSegment depending on whether a filter exists * Style changes * Trigger rebuild * Adding documentation for timeboundaryquery filtering * added filter serialization to timeboundaryquery cache * code style changes	2016-08-15 10:25:24 -07:00
Gian Merlino	8899affe48	Introduce standardized "Resource limit exceeded" error. (#3338 ) Fixes #3336.	2016-08-09 10:50:56 -07:00
Gian Merlino	21bce96c4c	More useful query errors. (#3335 ) Follow-up to #1773, which meant to add more useful query errors but did not actually do so. Since that patch, any error other than interrupt/cancel/timeout was reported as `{"error":"Unknown exception"}`. With this patch, the error fields are: - error, one of the specific strings "Query interrupted", "Query timeout", "Query cancelled", or "Unknown exception" (same behavior as before). - errorMessage, the message of the topmost non-QueryInterruptedException in the causality chain. - errorClass, the class of the topmost non-QueryInterruptedException in the causality chain. - host, the host that failed the query.	2016-08-09 16:14:52 +08:00
Jonathan Wei	decefb7477	Add time interval dim filter and retention analysis example (#3315 ) * Add time interval dim filter and retention analysis example * Use closed-open matching for intervals, update cache key generation * Fix time filtering tests for interval boundary change	2016-08-05 07:25:04 -07:00
kaijianding	50d52a24fc	ability to not rollup at index time, make pre aggregation an option (#3020 ) * ability to not rollup at index time, make pre aggregation an option * rename getRowIndexForRollup to getPriorIndex * fix doc misspelling * test query using no-rollup indexes * fix benchmark fail due to jmh bug	2016-08-02 11:13:05 -07:00
Dave Li	bc20658239	groupBy nested query using v2 strategy (#3269 ) * changed v2 nested query strategy * add test for #3239 * update for new ValueMatcher interface and add benchmarks * enable time filtering * address PR comments * add failing test for outer filter aggregator * add helper class for sharing code * update nested groupby doc * move temporary storage instantiation * address PR comment * address PR comment 2	2016-08-01 18:30:39 -07:00
Jonathan Wei	a6105cbb86	Add numeric StringComparator (#3270 ) * Add numeric StringComparator * Only use direct long comparison for numeric ordering in BoundFilter, add time filtering benchmark query * Address PR comments, add multithreaded BoundDimFilter test * Add comment on strlen tie handling * Add timeseries interval filter benchmark * Adjust docs * Use jackson for StringComparator, address PR comments * Add new TopNMetricSpec and SearchSortSpec with tests (WIP) * More TopNMetricSpec and SearchSortSpec tests * Fix NewSearchSortSpec serde * Update docs for new DimensionTopNMetricSpec * Delete NumericDimensionTopNMetricSpec * Delete old SearchSortSpec * Rename NewSearchSortSpec to SearchSortSpec * Add TopN numeric comparator benchmark, address PR comments * Refactor OrderByColumnSpec * Add null checks to NumericComparator and String->BigDecimal conversion function * Add more OrderByColumnSpec serde tests	2016-07-29 15:44:16 -07:00
Gian Merlino	2553997200	Associate groupBy v2 resources with the Sequence lifecycle. (#3296 ) This fixes a potential issue where groupBy resources could be allocated to create a Sequence, but then the Sequence is never used, and thus the resources are never freed. Also simplifies how groupBy handles config overrides (this made the new unit test easier to write).	2016-07-27 18:44:19 -07:00
kaijianding	3dc2974894	Add timestampSpec to metadata.drd and SegmentMetadataQuery (#3227 ) * save TimestampSpec in metadata.drd * add timestampSpec info in SegmentMetadataQuery	2016-07-25 15:45:30 -07:00
Jonathan Wei	a42ccb6d19	Support filtering on long columns (including __time) (#3180 ) * Support filtering on __time column * Rename DruidPredicate * Add docs for ValueMatcherFactory, add comment on getColumnCapabilities * Combine ValueMatcherFactory predicate methods to accept DruidCompositePredicate * Address PR comments (support filter on all long columns) * Use predicate factory instead of composite predicate * Address PR comments * Lazily initialize long handling in selector/in filter * Move long value parsing from InFilter to InDimFilter, make long value parsing thread-safe * Add multithreaded selector/in filter test * Fix non-final lock object in SelectorDimFilter	2016-07-20 17:08:49 -07:00
Gian Merlino	3ab4a4efbc	Fix formatting in granularities doc. (#3229 )	2016-07-08 09:29:58 -07:00
Gian Merlino	fdc7e88a7d	Allow queries with no aggregators. (#3216 ) This is actually reasonable for a groupBy or lexicographic topNs that is being used to do a "COUNT DISTINCT" kind of query. No aggregators are needed for that query, and including a dummy aggregator wastes 8 bytes per row. It's kind of silly for timeseries, but why not.	2016-07-06 20:38:54 +05:30
jaehong choi	efbcbf5315	Support alphanumeric sort in search query (#2593 ) * support alphanumeric sort in search query * address a comment about handling equals() and hashCode() * address comments * add Ut for string comparators * address a comment about space indentations.	2016-06-28 15:06:18 -07:00
Gian Merlino	4cc39b2ee7	Alternative groupBy strategy. (#2998 ) This patch introduces a GroupByStrategy concept and two strategies: "v1" is the current groupBy strategy and "v2" is a new one. It also introduces a merge buffers concept in DruidProcessingModule, to try to better manage memory used for merging. Both of these are described in more detail in #2987. There are two goals of this patch: 1. Make it possible for historical/realtime nodes to return larger groupBy result sets, faster, with better memory management. 2. Make it possible for brokers to merge streams when there are no order-by columns, avoiding materialization. This patch does not do anything to help with memory management on the broker when there are order-by columns or when there are nested queries. That could potentially be done in a future patch.	2016-06-24 18:06:09 -07:00
Dave Li	12be1c0a4b	Add bucket extraction function (#3033 ) * add bucket extraction function * add doc and header * updated doc and test	2016-06-17 09:24:27 -07:00
linbo.jin	8c76fe7b97	docs: change OR to AND inside query docs about multi-value dims (#3162 ) * docs: replace OR by AND inside topnquery docs about multi value dimensions * docs: replace OR by AND inside groupby docs about multi value dimensions	2016-06-17 08:54:18 -07:00
Kirill Kozlov	9f93be448e	Fix logical operator in example (#3093 )	2016-06-07 10:44:18 -07:00
Gian Merlino	99ee3f4dc3	Fixups, clarifications to lookup docs. (#3060 )	2016-06-07 10:43:35 -07:00
Charles Allen	fa41a6466a	Cleanup the base lookup cluster wide config docs (#3061 ) * Cleanup the base lookup cluster wide config docs * Add better examples in lookups-cached-global.md * Use actual valid stock lookups * Fixed maps with : * Add mix of lookups * Better examples in extension * Remove unneeded namespace requirement * Add extra line space * Add link to lookup tiers * Renamed header	2016-06-07 10:42:41 -07:00
Charles Allen	8cac710546	Async lookups-cached-global by default (#3074 ) * Async lookups-cached-global by default * Also better lookup docs * Fix test timeouts * Fix timing of deserialized test * Fix problem with 0 wait failing immediately	2016-06-03 15:58:10 -05:00
Gian Merlino	603fbbcc20	Fix docs for "contains" search spec. (#3066 )	2016-06-02 19:03:40 -07:00
Vadim Ogievetsky	13c267bfee	Added new line for site formatting (#3059 )	2016-06-02 11:36:45 -07:00
Vadim Ogievetsky	767190d5db	Clear up confusing wording (#3052 ) There is no such thing as a "Java aggregator" in Druid from a user's point of view, there are just native aggregator that happen to be implemented in Java.	2016-06-01 15:41:50 -07:00
Charles Allen	eaaad01de7	[QTL] Datasource as lookupTier (#2955 ) * Datasource as lookup tier * Adds an option to let indexing service tasks pull their lookup tier from the datasource they are working for. * Fix bad docs for lookups lookupTier * Add Datasource name holder * Move task and datasource to be pulled from Task file * Make LookupModule pull from bound dataSource * Fix test * Fix code style on imports * Fix formatting * Make naming better * Address code comments about naming	2016-05-17 15:44:42 -07:00
Parag Jain	e3ea842cd3	add available query granularity strings (#2960 )	2016-05-12 18:49:31 -07:00
Himanshu	8e2742b7e8	adding QueryGranularity to segment metadata and optionally expose same from segmentMetadata query (#2873 )	2016-05-03 11:31:10 -07:00
Slim	55785267e4	postAgg filedName must match name of AGG (#2874 )	2016-04-22 11:11:54 -07:00
Himanshu	3cfd9c64c9	make singleThreaded groupBy query config overridable at query time (#2828 ) * make isSingleThreaded groupBy query processing overridable at query time * refactor code in GroupByMergedQueryRunner to make processing of single threaded and parallel merging of runners consistent	2016-04-21 17:12:58 -07:00
Slim	984a518c9f	Merge pull request #2734 from b-slim/LookupIntrospection2 [QTL][Lookup] adding introspection endpoint	2016-04-21 12:15:57 -05:00
Gian Merlino	e320d13385	Fix various broken links in the docs. (#2833 )	2016-04-13 13:30:01 -07:00
Charles Allen	2b99f717e4	Move lookup config doc to proper location	2016-04-08 08:15:38 -07:00
Charles Allen	f915a59138	Merge pull request #2691 from metamx/lookupExtrFn Add ExtractionFn to LookupExtractor bridge	2016-04-06 09:13:08 -07:00
jon-wei	0e481d6f93	Allow filters to use extraction functions	2016-04-05 13:24:56 -07:00
Fangjin Yang	eea7a47870	Merge pull request #2576 from navis/paging-from-next Add option for select query to get next page without modifying returned paging identifiers	2016-04-01 13:50:36 -07:00
navis.ryu	077522a46f	stringFormat extractionFn should be able to return null on null values (Fix for #2706 )	2016-04-01 13:40:56 +09:00
navis.ryu	29bb00535b	Add option for select query to get next page without modifying returned paging identifiers	2016-04-01 09:03:03 +09:00
Gian Merlino	1853f36e9f	More consistent empty-set filtering behavior on multi-value columns. The behavior is now that filters on "null" will match rows with no values. The behavior in the past was inconsistent; sometimes these filters would match and sometimes they wouldn't. Adds tests for this behavior to SelectorFilterTest and BoundFilterTest, for query-level filters and filtered aggregates. Fixes #2750.	2016-03-29 15:32:13 -07:00
Charles Allen	4764e86409	Add docs for RegisteredDimensionExtractionFn	2016-03-28 13:27:49 -07:00
Charles Allen	ab324e4ac0	Move lookup docs that are in druid-proper back into lookups.md	2016-03-25 10:46:50 -07:00
Fangjin Yang	a5d5529749	Merge pull request #2711 from gianm/filtered-aggregator-impls All Filters should work with FilteredAggregators.	2016-03-23 13:37:21 -07:00
Gian Merlino	dd86198902	All Filters should work with FilteredAggregators. This removes Filter.makeMatcher(ColumnSelectorFactory) and adds a ValueMatcherFactory implementation to FilteredAggregatorFactory so it can take advantage of existing makeMatcher(ValueMatcherFactory) implementations. This patch also removes the Bound-based method from ValueMatcherFactory. Its only user was the SpatialFilter, which could use the Predicate-based method. Fixes #2604.	2016-03-23 12:24:01 -07:00
Gian Merlino	2dfd3877c0	Fix a bunch of broken links in the docs.	2016-03-23 10:21:28 -07:00
Fangjin Yang	d1f8f2b2fd	Merge pull request #2698 from druid-io/fix-ext-docs refactor extensions into their own docs	2016-03-22 22:04:12 -07:00
fjy	943cbe6e76	refactor extensions into their own docs	2016-03-22 18:54:10 -07:00
Fangjin Yang	041350c31b	Merge pull request #2701 from gianm/mvd-docs Improved docs for multi-value dimensions.	2016-03-22 18:09:37 -07:00
Gian Merlino	451c0bc6d8	Merge pull request #2702 from pjain1/improve_docs how to query in the querying section, correct default for select strategy, formatting	2016-03-22 16:40:35 -07:00
Parag Jain	39ecb9929d	how to query, correct default for select strategy, formatting	2016-03-22 17:06:15 -05:00
Gian Merlino	ff25325f3b	Improved docs for multi-value dimensions. - Add central doc for multi-value dimensions, with some content from other docs. - Link to multi-value dimension doc from topN and groupBy docs. - Fixes a broken link from dimensionspecs.md, which was presciently already linking to this nonexistent doc. - Resolve inconsistent naming in docs & code (sometimes "multi-valued", sometimes "multi-value") in favor of "multi-value".	2016-03-22 14:40:55 -07:00
Nishant	11b8d1ed70	Merge pull request #2686 from gianm/fix-analysistypes-docs Fix analysisTypes docs for SegmentMetadataQuery.	2016-03-18 16:15:38 -07:00
Gian Merlino	76ae30604e	Fix analysisTypes docs for SegmentMetadataQuery.	2016-03-18 13:17:33 -07:00
Charles Allen	5da9a280b6	Query Time Lookup - Dynamic Configuration	2016-03-18 09:45:05 -07:00
Slim	cf342d8d3c	Merge pull request #2517 from b-slim/adding_lookup_snapshot_utility [QTL][Lookup] lookup module with the snapshot utility	2016-03-17 11:39:47 -05:00
Slim Bouguerra	0c86b29ef0	lookup module with the snapshot utility	2016-03-17 09:20:41 -05:00

326 Commits