druid

Commit Graph

Author	SHA1	Message	Date
Gian Merlino	244046fda5	SQL: Fix too-long headers in http responses. (#6411 ) Fixes #6409 by moving column name info from HTTP headers into the result body.	2018-10-01 18:13:08 -07:00
Jihoon Son	cb14a43038	Remove ConvertSegmentTask, HadoopConverterTask, and ConvertSegmentBackwardsCompatibleTask (#6393 ) * Remove ConvertSegmentTask, HadoopConverterTask, and ConvertSegmentBackwardsCompatibleTask * update doc and remove auto conversion * remove remaining doc * fix teamcity	2018-10-01 12:03:35 -07:00
Shiv Toolsidass	a56ffe6ab2	Added backpressure metric to docs and defaultMetricDimensions (#6405 ) * Added backpressure metric to docs and defaultMetricDimensions.json * Reworded description for backpressure metric in docs	2018-09-29 17:57:29 -07:00
adursun	6f44e568db	Add missing comma (#6399 )	2018-09-28 09:02:36 -07:00
QiuMM	47a6cca013	Add TimestampSpec format for microsecond (#6395 )	2018-09-27 09:38:44 -07:00
Jihoon Son	6fb503c073	Deprecate task audit logging (#6368 ) * Deprecate task audit logging * fix test * fix it test	2018-09-26 16:28:02 -07:00
Nishant Bangarwa	c9d281a2e9	Add ability to pass in Bloom filter from Hive Queries (#6222 ) * Bloom filter initial implementation fix checkstyle review comments Fix wierd failure review comments Revert "Fix wierd failure" This reverts commit a13a83ad7887e679f6d539191b52aeaaea85b613. * fix test * review comment	2018-09-26 16:04:26 -07:00
Caroline1000	034d006d24	add to docs on including Drop rule with Load rule (#6378 )	2018-09-25 20:13:52 -07:00
Benedict Jin	e5d9fcfe8f	Add maven.exec.xxx.skip option for exec-maven-plugin (#6162 ) * Fix conflicts * Modify io.druid into org.apache.druid	2018-09-25 10:05:26 -07:00
Jihoon Son	99428e20d2	Deprecate dimensions / metrics APIs on brokers (#6361 ) * Deprecate dimensions / metrics APIs on brokers * add segmentMetadataQuery link * add more doc	2018-09-24 17:56:38 -07:00
Jonathan Wei	ee7b565469	Docs for ingestion stat reports and new parse exception handling (#6373 )	2018-09-24 17:45:05 -07:00
Alexander Saydakov	93345064b5	HllSketch module (#5712 ) * HllSketch module * updated license and imports * updated package name * implemented makeAggregateCombiner() * removed json marks * style fix * added module * removed unnecessary import, side effect of package renaming * use TreadLocalRandom * addressing code review points, mostly formatting and comments * javadoc * natural order with nulls * typo * factored out raw input value extraction * singleton * style fix * style fix * use Collections.singletonList instead of Arrays.asList * suppress warning	2018-09-24 08:41:56 -07:00
Jonathan Wei	f12ffd19a8	Add Kafka reset instructions for tutorial (#6362 )	2018-09-21 14:18:31 -07:00
Jonathan Wei	8972244c68	Mutual TLS support (#6076 ) * Mutual TLS support * Kafka test fixes * TeamCity fix * Split integration tests * Use localhost DOCKER_IP * Increase server thread count * Increase SSL handshake timeouts * Add broken pipe retries, use injected client config params * PR comments, Rat license check exclusion	2018-09-19 09:56:15 -07:00
Dayue Gao	edf0c13807	add a sql option to force user to specify time condition (#6246 ) * add a sql option to force user to specify time condition * rename forceTimeCondition to requireTimeCondition, refine error message	2018-09-17 13:52:24 -07:00
QiuMM	288aa4d504	Add missing metadata table information in docs (#6309 ) * Add missing metadata table information in doc file * address review comment	2018-09-14 12:17:05 -07:00
QiuMM	85391e9fb3	fix opentsdb emitter always be running and fail sending tags whose value contains colon (#6251 ) * fix opentsdb emitter always be running * check if emitter started * add more details about consumeDelay in doc * fix possible thread unsafe * fix fail sending tags whose value contain colon	2018-09-14 12:14:15 -07:00
QiuMM	87ccee05f7	Add ability to specify list of task ports and port range (#6263 ) * support specify list of task ports * fix typos * address comments * remove druid.indexer.runner.separateIngestionEndpoint config * tweak doc * fix doc * code cleanup * keep some useful comments	2018-09-13 19:36:04 -07:00
Jonathan Wei	fd6786ac6c	Fix endpoint permissions section in basic-security docs (#6331 )	2018-09-13 15:23:41 -07:00
Clint Wylie	91a37c692d	'suspend' and 'resume' support for supervisors (kafka indexing service, materialized views) (#6234 ) * 'suspend' and 'resume' support for kafka indexing service changes: * introduces `SuspendableSupervisorSpec` interface to describe supervisors which support suspend/resume functionality controlled through the `SupervisorManager`, which will gracefully shutdown the supervisor and it's tasks, update it's `SupervisorSpec` with either a suspended or running state, and update with the toggled spec. Spec updates are provided by `SuspendableSupervisorSpec.createSuspendedSpec` and `SuspendableSupervisorSpec.createRunningSpec` respectively. * `KafkaSupervisorSpec` extends `SuspendableSupervisorSpec` and now supports suspend/resume functionality. The difference in behavior between 'running' and 'suspended' state is whether the supervisor will attempt to ensure that indexing tasks are or are not running respectively. Behavior is identical otherwise. * `SupervisorResource` now provides `/druid/indexer/v1/supervisor/{id}/suspend` and `/druid/indexer/v1/supervisor/{id}/resume` which are used to suspend/resume suspendable supervisors * Deprecated `/druid/indexer/v1/supervisor/{id}/shutdown` and moved it's functionality to `/druid/indexer/v1/supervisor/{id}/terminate` since 'shutdown' is ambiguous verbage for something that effectively stops a supervisor forever * Added ability to get all supervisor specs from `/druid/indexer/v1/supervisor` by supplying the 'full' query parameter `/druid/indexer/v1/supervisor?full` which will return a list of json objects of the form `{"id":<id>, "spec":<SupervisorSpec>}` * Updated overlord console ui to enable suspend/resume, and changed 'shutdown' to 'terminate' * move overlord console status to own column in supervisor table so does not look like garbage * spacing * padding * other kind of spacing * fix rebase fail * fix more better * all supervisors now suspendable, updated materialized view supervisor to support suspend, more tests * fix log	2018-09-13 14:42:18 -07:00
Gian Merlino	d6cbdf86c2	Broker backpressure. (#6313 ) * Broker backpressure. Adds a new property "druid.broker.http.maxQueuedBytes" and a new context parameter "maxQueuedBytes". Both represent a maximum number of bytes queued per query before exerting backpressure on the channel to the data server. Fixes #4933. * Fix query context doc.	2018-09-10 09:33:29 -07:00
Gian Merlino	4669f0878f	SQL: UNION ALL operator. (#6314 ) * SQL: UNION ALL operator. * Remove unused import.	2018-09-09 22:32:56 -07:00
Clint Wylie	e6e068ce60	Add support for 'maxTotalRows' to incremental publishing kafka indexing task and appenderator based realtime task (#6129 ) * resolves #5898 by adding maxTotalRows to incremental publishing kafka index task and appenderator based realtime indexing task, as available in IndexTask * address review comments * changes due to review * merge fail	2018-09-07 13:17:49 -07:00
Jonathan Wei	60cbc64472	Use PasswordProvider, fix info on initial passwords in basic security extension docs (#6303 ) * Fix info on initial passwords in basic security extension docs * Use PasswordProvider * Compile fix	2018-09-05 17:07:16 -07:00
Jonathan Wei	4caa61d8fa	Fix tutorial sample data filename, fix logger classname in metrics docs (#6299 )	2018-09-04 21:47:12 -07:00
Eyal Yurman	10ca290d64	Correct file name typo in Quickstart tutorial (#6297 ) Correct name wikipedia-2015-09-12-sampled.json.gz to wikiticker-2015-09-12-sampled.json.gz	2018-09-04 14:20:17 -07:00
Jonathan Wei	180e3ccfad	Docs consistency cleanup (#6259 )	2018-09-04 12:54:41 -07:00
QiuMM	9b04846e6b	correct metric name in doc file (#6271 )	2018-08-30 10:57:35 -07:00
Gian Merlino	431d3d8497	Rename io.druid to org.apache.druid. (#6266 ) * Rename io.druid to org.apache.druid. * Fix META-INF files and remove some benchmark results. * MonitorsConfig update for metrics package migration. * Reorder some dimensions in inner queries for some reason. * Fix protobuf tests.	2018-08-30 09:56:26 -07:00
Himanshu	1fae6513e1	add "subtotalsSpec" attribute to groupBy query (#5280 ) * add subtotalsSpec attribute to groupBy query * dont sent subtotalsSpec to downstream nodes from broker and other updates * address review comment * fix checkstyle issues after merge to master * add docs for subtotalsSpec feature * address doc review comments	2018-08-28 17:46:38 -07:00
Jim Slattery	d957295b98	spelling: storage (#6248 )	2018-08-27 16:35:31 -07:00
Gian Merlino	0172326c62	SQL: Support more result formats, add columns header. (#6191 ) * SQL: Support more result formats, add columns header. - Add result formats for line-based JSON and CSV. - Add X-Druid-Sql-Columns header with a list of all columns that the response will contain. - Add more comprehensive documentation on what callers should expect when making Druid SQL queries. * Fix some tests. * Adjust tests. * Adjust trailer, add types header. * Fix trailers.	2018-08-26 23:00:14 -06:00
Susie	6e73ad6231	Fix bound query keys for Filtering on numeric values (#5881 ) It is currently showing the use of `lowerBound` and `upperBound` instead of `lower` and `upper` for the range.	2018-08-23 14:07:10 -07:00
QiuMM	ceb8f8e625	remove unnecessary tlsPortFinder to avoid potential port conflicts (#6194 )	2018-08-23 10:41:49 -07:00
Ryan Plessner	9c500fb69f	Add PostgreSQLConnectorConfig to expose SSL configuration options (#6181 ) * Add PostgreSQLConnectorConfig to expose SSL configuration options for the Postgres Metadata Storage module. * Fix checkstyle violations and add license header * Convert properties in the postgres docs to be the full property path and fix typo * Fix grammar in sslFactory docs	2018-08-21 16:45:27 -07:00
QiuMM	266f3dfbcb	remove duplicate link to operations/recommendations.html (#6193 )	2018-08-21 12:02:43 -07:00
QiuMM	b0cf8d0252	'shutdownAllTasks' API for a dataSource (#6185 ) * 'shutdownAllTasks' API for a dataSource Change-Id: I30d14390457d39e0427d23a48f4f224223dc5777 * fix api path and return Change-Id: Ib463f31ee2c4cb168cf2697f149be845b57c42e5 * optimize implementation Change-Id: I50a8dcd44dd9d36c9ecbfa78e103eb9bff32eab9	2018-08-17 12:57:09 -04:00
Jonathan Wei	0c3bb47558	Change hybrid cache default types in docs to caffeine (#6182 )	2018-08-17 12:17:43 -04:00
Caroline1000	f447b784de	update sigar link (#6175 )	2018-08-14 16:58:29 -07:00
QiuMM	69f555019b	convert all time-intervals in ISO 8601 format to uppercase in doc files (#6118 ) Change-Id: I904fed4cfb600a8a42664335557f611133a5078d	2018-08-13 12:58:47 -07:00
Jonathan Wei	94a937b5e8	New doc fixes (#6156 )	2018-08-13 11:11:32 -07:00
Atul Mohan	064c22c937	Fix redirects (#6151 )	2018-08-10 13:55:47 -07:00
Jonathan Wei	b0805540af	Fix kafka tutorial typo (#6141 )	2018-08-09 18:41:05 -07:00
Jonathan Wei	af0557c1f7	Unified configuration doc page (#6127 ) * Unified configuration doc page * Rename to index.md, update redirects * PR comments * PR comments * PR comment	2018-08-09 14:52:14 -07:00
Jonathan Wei	fea2ab7094	New docs intro (#6122 ) * New docs intro * PR comments * Fix arch diagram * PR comment * PR comment * PR comment	2018-08-09 14:19:11 -07:00
pdeva	c028d18d74	update redis-cache documentation (#6109 ) * update redis-cache documentation added clarifying info on setup and enablement * added link	2018-08-09 13:44:59 -07:00
Jonathan Wei	aa660b8751	Add docs for virtual columns and transform specs (#6119 ) * Add docs for virtual columns and transform specs * PR Comments * PR comment	2018-08-09 14:42:52 -06:00
Jonathan Wei	2b64025eaf	Separate hadoop and native batch docs more (#6120 ) * Separate hadoop and native batch docs more * Rebase with parallel batch * PR comments	2018-08-09 14:40:20 -06:00
Jonathan Wei	24f2e8ba26	New quickstart and tutorials (#6126 ) * New quickstart and tutorials * PR comments * Fix tranquility	2018-08-09 14:37:52 -06:00
Jonathan Wei	2b0f03acb9	Unified API doc page (#6128 ) * Unified API doc page * PR comments * Fix metadata endpoint	2018-08-09 14:27:42 -06:00
Gian Merlino	3525d4059e	Cache: Add maxEntrySize config, make groupBy cacheable by default. (#5108 ) * Cache: Add maxEntrySize config. The idea is this makes it more feasible to cache query types that can potentially generate large result sets, like groupBy and select, without fear of writing too much to the cache per query. Includes a refactor of cache population code in CachingQueryRunner and CachingClusteredClient, such that they now use the same CachePopulator interface with two implementations: one for foreground and one for background. The main reason for splitting the foreground / background impls is that the foreground impl can have a more effective implementation of maxEntrySize. It can stop retaining subvalues for the cache early. * Add CachePopulatorStats. * Fix whitespace. * Fix docs. * Fix various tests. * Add tests. * Fix tests. * Better tests * Remove conflict markers. * Fix licenses.	2018-08-07 10:23:15 -07:00
Jihoon Son	56ab4363ea	Native parallel batch indexing without shuffle (#5492 ) * Native parallel indexing without shuffle * fix build * fix ci * fix ingestion without intervals * fix retry * fix retry * add it test * use chat handler * fix build * add docs * fix ITUnionQueryTest * fix failures * disable metrics reporting * working * Fix split of static-s3 firehose * Add endpoints to supervisor task and a unit test for endpoints * increase timeout in test * Added doc * Address comments * Fix overlapping locks * address comments * Fix static s3 firehose * Fix test * fix build * fix test * fix typo in docs * add missing maxBytesInMemory to doc * address comments * fix race in test * fix test * Rename to ParallelIndexSupervisorTask * fix teamcity * address comments * Fix license * addressing comments * addressing comments * indexTaskClient-based segmentAllocator instead of CountingActionBasedSegmentAllocator * Fix race in TaskMonitor and move HTTP endpoints to supervisorTask from runner * Add more javadocs * use StringUtils.nonStrictFormat for logging * fix typo and remove unused class * fix tests * change package * fix strict build * tmp * Fix overlord api according to the recent change in master * Fix it test	2018-08-06 23:59:42 -07:00
Nishant Bangarwa	75c8a87ce1	Part 2 of changes for SQL Compatible Null Handling (#5958 ) * Part 2 of changes for SQL Compatible Null Handling * Review comments - break lines longer than 120 characters * review comments * review comments * fix license * fix test failure * fix CalciteQueryTest failure * Null Handling - Review comments * review comments * review comments * fix checkstyle * fix checkstyle * remove unrelated change * fix test failure * fix failing test * fix travis failures * Make StringLast and StringFirst aggregators nullable and fix travis failures	2018-08-02 08:20:25 -07:00
Andrés Gómez	e270362767	Add stringLast and stringFirst aggregators extension (#5789 ) * Add lastString and firstString aggregators extension * Remove duplicated class * Move first-last-string doc page to extensions-contrib * Fix ObjectStrategy compare method * Fix doc bad aggregatos type name * Create FoldingAggregatorFactory classes to fix SegmentMetadataQuery * Add getMaxStringBytes() method to support JSON serialization * Fix null pointer exception at segment creation phase when the string value is null * Control the valueSelector object class on BufferAggregators * Perform all improvements * Add java doc on SerializablePairLongStringSerde * Refactor ObjectStraty compare method * Remove unused ; * Add aggregateCombiner unit tests. Rename BufferAggregators unit tests * Remove unused imports * Add license header * Add class name to java doc class serde * Throw exception if value is unsupported class type * Move first-last-string extension into druid core * Update druid core docs * Fix null pointer exception when pair->string is null * Add null control unit tests * Remove unused imports * Add first/last string folding aggregator on AggregatorsModule to support segment metadata query * Change SerializablePairLongString to extend SerializablePair * Change vars from public to private * Convert vars to primitive type * Clarify compare comment * Change IllegalStateException to ISE * Remove TODO comments * Control possible null pointer exception * Add @Nullable annotation * Remove empty line * Remove unused parameter type * Improve AggregatorCombiner javadocs * Add filterNullValues option at StringLast and StringFirst aggregators * Add filterNullValues option at agg documentation * Fix checkstyle * Update header license * Fix StringFirstAggregatorFactory.VALUE_COMPARATOR * Fix StringFirstAggregatorCombiner * Fix if condition at StringFirstAggregateCombiner * Remove filterNullValues from string first/last aggregators * Add isReset flag in FirstAggregatorCombiner * Change Arrays.asList to Collections.singletonList	2018-08-01 10:52:54 -07:00
Caroline1000	7f89c72932	Add definition of 'NONE' to queryGranularity in ingestion.index doc (#6073 ) * Add meaning of granularity = None to queryGranularity * Fix format	2018-07-30 14:07:33 -07:00
Gian Merlino	63be028cee	CompactionTask: Reject empty intervals on construction. (#6059 ) * CompactionTask: Reject empty intervals on construction. They don't make sense anyway, and it's better to fail fast. * Switch API.	2018-07-30 08:52:50 -07:00
Eyal Yurman	94d6c9a0a5	Remove JDK 7 from build documentation. (#6031 ) See issue #6030	2018-07-26 17:05:07 -07:00
Jonathan Wei	efab3b0160	Add concat and textcat SQL functions (#6005 )	2018-07-20 11:21:04 -07:00
Gian Merlino	cd8ea3da8d	SQL: Add server-wide default time zone config. (#5993 ) * SQL: Add server-wide default time zone config. * Switch API.	2018-07-18 13:12:40 -07:00
Caroline1000	5f78a333ad	show that flatten will also work with avro extension (#5874 ) * show that flatten will also work with avro extension * fix url	2018-07-11 16:47:03 -07:00
Gian Merlino	04ea3c9f8c	Update license headers. (#5976 ) * Update license headers. For compliance with http://www.apache.org/legal/src-headers.html. * More license adjustments. * Fix mistakenly edited package line.	2018-07-11 09:55:18 -07:00
Caroline1000	b3976050ad	add definition of balancerComputeThreads (#5865 )	2018-07-05 09:54:36 -07:00
Caroline1000	ee4a5aafb0	add config values for GCS deep storage (#5875 ) * add config values for GCS deep storage * fix config values for GCS deep storage	2018-07-05 09:53:41 -07:00
Dylan Wylie	10642ef9ca	Fix filtered request logging docs (#5924 ) - Setting druid.request.logging.delegate has no effect. - The provider is injected based on a type parameter & this looks to be scoped to delegate for filtered loggers	2018-07-05 09:51:10 -07:00
scrawfor	bf2a31a5bc	Add new 'true' filter which always returns true. (#5711 ) * Add new 'true' filter which always returns true. * Add support for bitmap index. * Adds documentation. * Removes No-op Filter	2018-06-28 11:52:45 -07:00
Gian Merlino	a28314349c	Fix spelling of "propagate" in various places. (#5896 ) One of these is a configuration parameter (introduced in #5429), but it's never been in a release, so I think it's ok to rename it.	2018-06-25 09:18:08 -07:00
varaga	b4b1b2a020	Provisioning support for ZooKeeper Authorization (#5701 ) Review comments implemented	2018-06-15 14:02:01 -07:00
zhangxinyu	e43e5ebbcd	Materialized view implementation (#5556 ) * implement materialized view * modify code according to jihoonson's comments * modify code according to jihoonson's comments - 2 * add documentation about materialized view * use new HadoopTuningConfig in pr 5583 * add minDataLag and fix optimizer bug * correct value of DEFAULT_MIN_DATA_LAG_MS * modify code according to jihoonson's comments - 3 * use the boolean expression instead of if-else	2018-06-09 12:24:54 -07:00
Caroline1000	96feb479cd	add order change needed for KIS in 0.12.0 (#5760 )	2018-06-08 15:25:26 -07:00
Hongze Zhang	cfa94b747b	Update to jetty 9.4; Enable request decompression (#5624 ) * Update to jetty 9.4; Enable request decompression; Add http compression config options * Fix BadMessageException from jetty server at HttpGenerator.generateHeaders(...)	2018-06-08 14:53:08 -07:00
awelsh93	adbe22c05b	Security - add anonymous authenticator (#5842 ) * Anonymous authenticator that authenticates all requests and then directs them to an authorizer. * Adding documentation * Removed some fields from class AnonymousAuthenticator * Updating docs	2018-06-07 10:17:54 -07:00
Siddharth Subramanian	37409dc2f4	Fix minor documentation error (#5851 ) Adding a required `,` in the sample JSON	2018-06-06 12:51:56 -07:00
Ryan Plessner	ee45ee6915	Fix docs to reflect the correct default max total row count for the IndexTuningConfig (#5845 )	2018-06-05 13:15:12 -07:00
awelsh93	1a4707f09c	Remove extra slash in endpoint (#5822 )	2018-06-05 13:11:26 -07:00
Alexander Saydakov	d1cdcd4895	Datasketches doc correction (#5816 ) * func was renamed to operation during code review * added missing descriptions, some cleanup	2018-06-05 17:52:37 +05:30
Atul Mohan	50ad7a45ff	Fix authentication doc (#5813 )	2018-05-30 11:10:48 -07:00
Jihoon Son	67ff7dacbd	Support server-side encryption for s3 (#5740 ) * Support server-side encryption for s3 * fix teamcity * typo * address comments * Refactoring configuration injection * fix doc * fix doc	2018-05-28 20:22:08 -07:00
Joseph Glanville	5cbfb95e1f	docs: Document inputFormat on Hadoop InputSpecs (#5784 )	2018-05-24 21:44:37 -07:00
Gian Merlino	bc0ff251a3	Docs: Clarify the meaning of maxSplitSize. (#5803 )	2018-05-24 21:43:39 -07:00
Michael Schnupp	33b4eb624d	fix freeSpacePercent in segmentCache.locations (#5765 ) * fix freeSpacePercent in segmentCache.locations * the check should probably test the other way around * documentation should put the option in the right place * examples have a superfluous backslash * add test to verify correct behavior * switch to Path and test with jimfs Path allows to use different filesystems. Jimfs provides an actual (in memory) filesystem. This also allows more complex test scenarios. The behavior should be unchanged by this commit. * Revert "switch to Path and test with jimfs" This reverts commit `8b9a418d65`.	2018-05-24 11:15:30 +09:00
Atul Mohan	1b9611a60e	Local indexing from RDBMS (#5441 ) * Local indexing from RDBMS * Fix content * Remove pom changes * Remove extraneous space * Add tests and update documentation * Fix comments * Fix docs * Fix build related issue * Handle invalid strings * Make target database independent of metadata storage * Add firehose connector * Fix accessibility * Add docs * Remove unused def * Remove lazy instantiation of jsoniterator * Move unused changes * Move unused changes * Fix build * Make Sqlfirehose method private	2018-05-22 12:33:01 +09:00
Caroline1000	c73e3ea4f5	Provide examples to havingSpec filters (#5774 ) * expand examples * expand examples for filtered havingSpecs * expand other having examples * remove blank code block * add better AND/OR/NOT examples * fix indentation	2018-05-14 13:43:42 -07:00
Abhishek Kaushik	aa23fe6386	Typo fix in historical doc (#5753 )	2018-05-08 11:08:27 -07:00
Kirill Kozlov	67d0b0ee42	Add taskType dimension to task metrics (#5664 )	2018-05-07 09:42:26 -07:00
kaijianding	c12c16385e	support throw duplcate row during realtime ingestion in RealtimePlumber (#5693 )	2018-05-04 10:12:25 -07:00
Dylan Wylie	2c5f0038fd	Make lookup offheap buffer configurable (#5696 ) * Make lookup offheap buffer configurable Fixes #3663 * Address comments * Update docs * Update docs	2018-05-04 10:00:55 -07:00
Stuart McLean	c2b5e5ec95	Default caffeine cache size (#5738 ) * add default caffeine cache size based on runtime Xmx or max 1GB * update docs for caffeine cache * fix formatting * test caffeine size should never be less than 0 * set caffeine max default size to 1G not 1M * fix caffeine cache tests	2018-05-04 09:29:11 -07:00
Surekha	13c616ba24	'maxBytesInMemory' tuningConfig introduced for ingestion tasks (#5583 ) * This commit introduces a new tuning config called 'maxBytesInMemory' for ingestion tasks Currently a config called 'maxRowsInMemory' is present which affects how much memory gets used for indexing.If this value is not optimal for your JVM heap size, it could lead to OutOfMemoryError sometimes. A lower value will lead to frequent persists which might be bad for query performance and a higher value will limit number of persists but require more jvm heap space and could lead to OOM. 'maxBytesInMemory' is an attempt to solve this problem. It limits the total number of bytes kept in memory before persisting. * The default value is 1/3(Runtime.maxMemory()) * To maintain the current behaviour set 'maxBytesInMemory' to -1 * If both 'maxRowsInMemory' and 'maxBytesInMemory' are present, both of them will be respected i.e. the first one to go above threshold will trigger persist * Fix check style and remove a comment * Add overlord unsecured paths to coordinator when using combined service (#5579) * Add overlord unsecured paths to coordinator when using combined service * PR comment * More error reporting and stats for ingestion tasks (#5418) * Add more indexing task status and error reporting * PR comments, add support in AppenderatorDriverRealtimeIndexTask * Use TaskReport instead of metrics/context * Fix tests * Use TaskReport uploads * Refactor fire department metrics retrieval * Refactor input row serde in hadoop task * Refactor hadoop task loader names * Truncate error message in TaskStatus, add errorMsg to task report * PR comments * Allow getDomain to return disjointed intervals (#5570) * Allow getDomain to return disjointed intervals * Indentation issues * Adding feature thetaSketchConstant to do some set operation in PostAgg (#5551) * Adding feature thetaSketchConstant to do some set operation in PostAggregator * Updated review comments for PR #5551 - Adding thetaSketchConstant * Fixed CI build issue * Updated review comments 2 for PR #5551 - Adding thetaSketchConstant * Fix taskDuration docs for KafkaIndexingService (#5572) * With incremental handoff the changed line is no longer true. * Add doc for automatic pendingSegments (#5565) * Add missing doc for automatic pendingSegments * address comments * Fix indexTask to respect forceExtendableShardSpecs (#5509) * Fix indexTask to respect forceExtendableShardSpecs * add comments * Deprecate spark2 profile in pom.xml (#5581) Deprecated due to https://github.com/druid-io/druid/pull/5382 * CompressionUtils: Add support for decompressing xz, bz2, zip. (#5586) Also switch various firehoses to the new method. Fixes #5585. * This commit introduces a new tuning config called 'maxBytesInMemory' for ingestion tasks Currently a config called 'maxRowsInMemory' is present which affects how much memory gets used for indexing.If this value is not optimal for your JVM heap size, it could lead to OutOfMemoryError sometimes. A lower value will lead to frequent persists which might be bad for query performance and a higher value will limit number of persists but require more jvm heap space and could lead to OOM. 'maxBytesInMemory' is an attempt to solve this problem. It limits the total number of bytes kept in memory before persisting. * The default value is 1/3(Runtime.maxMemory()) * To maintain the current behaviour set 'maxBytesInMemory' to -1 * If both 'maxRowsInMemory' and 'maxBytesInMemory' are present, both of them will be respected i.e. the first one to go above threshold will trigger persist * Address code review comments * Fix the coding style according to druid conventions * Add more javadocs * Rename some variables/methods * Other minor issues * Address more code review comments * Some refactoring to put defaults in IndexTaskUtils * Added check for maxBytesInMemory in AppenderatorImpl * Decrement bytes in abandonSegment * Test unit test for multiple sinks in single appenderator * Fix some merge conflicts after rebase * Fix some style checks * Merge conflicts * Fix failing tests Add back check for 0 maxBytesInMemory in OnHeapIncrementalIndex * Address PR comments * Put defaults for maxRows and maxBytes in TuningConfig * Change/add javadocs * Refactoring and renaming some variables/methods * Fix TeamCity inspection warnings * Added maxBytesInMemory config to HadoopTuningConfig * Updated the docs and examples * Added maxBytesInMemory config in docs * Removed references to maxRowsInMemory under tuningConfig in examples * Set maxBytesInMemory to 0 until used Set the maxBytesInMemory to 0 if user does not set it as part of tuningConfing and set to part of max jvm memory when ingestion task starts * Update toString in KafkaSupervisorTuningConfig * Use correct maxBytesInMemory value in AppenderatorImpl * Update DEFAULT_MAX_BYTES_IN_MEMORY to 1/6 max jvm memory Experimenting with various defaults, 1/3 jvm memory causes OOM * Update docs to correct maxBytesInMemory default value * Minor to rename and add comment * Add more details in docs * Address new PR comments * Address PR comments * Fix spelling typo	2018-05-03 16:25:58 -07:00
Gian Merlino	739e347320	Allow Hadoop dataSource inputSpec to be specified multiple times. (#5717 ) * Allow Hadoop dataSource inputSpec to be specified multiple times. * Fix test	2018-05-03 13:51:57 -07:00
Stuart McLean	d2b8d880ea	include hybrid and caffeine in cache docs and show caffeine as default (#5737 )	2018-05-03 09:52:05 -07:00
Jihoon Son	d4311b4a5a	Support enablePathStyleAccess, disableChunkedEncoding, and forceGlobalBucketAccessEnabled for aws client (#5702 ) * Support enablePathStyleAccess and disableChunkedEncoding for aws client * add an option for forceGlobalBucketAccessEnabled * add missing doc	2018-05-02 10:45:38 -07:00
Jakub Kukul	e2431ae161	Update defaultHadoopCoordinates in documentation. (#5720 ) * Update defaultHadoopCoordinates in documentation. To match changes applied in #5382. * Remove a parameter with defaults from example configuration file. If it has reasonable defaults, then why would it be in an example config file? Also, it is yet another place that has been forgotten to be updated and will be forgotten in the future. Also, if someone is running different hadoop version, then there's much more work to be done than just changing this property, so why give users false hopes? * Fix typo in documentation.	2018-04-30 20:49:14 -07:00
Dylan Wylie	754c80e74a	Fix quickstart docs to specify that Java 8 is required. (#5722 ) See #4907 #5719	2018-04-30 13:25:59 -07:00
Gian Merlino	0f8493846e	Replace dev list references in docs. (#5723 )	2018-04-30 11:25:45 -07:00
David Lim	8ec2d2fe18	Use unique segment paths for Kafka indexing (#5692 ) * support unique segment file paths * forbiddenapis * code review changes * code review changes * code review changes * checkstyle fix	2018-04-29 21:59:48 -07:00
Gian Merlino	762f8829e4	Add task action metrics, add taskId metric dimension. (#5714 ) * Add task action metrics, add taskId metric dimension. Adds two new metrics: task/action/log/time and task/action/run/time. Also adds taskId as a dimension, to give us the ability to drill down into metrics for an individual task. Also standardizes metrics-attachment using two helper methods in IndexTaskUtils. * Fix typo	2018-04-29 21:24:06 -07:00
Joseph Glanville	90cd05696e	Document processing properties required for Middlemanager (#5660 )	2018-04-29 17:20:17 -07:00
Jihoon Son	86746f82d8	Use mergeBuffer instead of processingBuffer in parallelCombiner (#5634 ) * Use mergeBuffer instead of processingBuffer in parallelCombiner * Fix test * address comments * fix test * Fix test * Update comment * address comments * fix build * Fix test failure	2018-04-27 18:14:37 -07:00
Gian Merlino	f81855d607	Add unauthorized errorCode to query docs. (#5691 )	2018-04-26 13:06:25 -07:00
Caroline1000	fd76af9737	remove old prod cluster config link (#5676 )	2018-04-23 18:00:24 -07:00
scrawfor	15f4ab2b31	Expose noop filter to users (#5597 )	2018-04-18 07:57:07 -07:00
Gian Merlino	fbf3fc178e	Timeseries: Add "grandTotal" option. (#5640 ) * Timeseries: Add "grandTotal" option. * Modify whitespace. * Checkstyle workaround.	2018-04-16 18:22:19 -07:00
Jonathan Wei	d0b66a6af5	Fix HTTP OPTIONS request auth handling (#5638 ) * Fix HTTP OPTIONS request auth handling * PR comment * More PR comments * Fix * PR comment	2018-04-16 18:09:56 -07:00
Jihoon Son	6b3bde0143	Fix granularitySpec doc (#5647 )	2018-04-16 14:24:39 -04:00
Jonathan Wei	882b172318	Revert "Fix HTTP OPTIONS request auth handling (#5615 )" (#5637 ) This reverts commit `df51a7bcb7`.	2018-04-12 16:43:54 -07:00
Jonathan Wei	df51a7bcb7	Fix HTTP OPTIONS request auth handling (#5615 ) * Fix HTTP OPTIONS request auth handling * Flip configuration boolean	2018-04-12 14:02:20 -07:00
Caroline1000	48c1a1ef57	change header from Data Schema to Ingestion Spec (#5631 )	2018-04-11 21:42:54 -07:00
Nishant Bangarwa	e6efd75a3d	Add config to allow setting up custom unsecured paths for druid nodes. (#5614 ) * Add config to allow setting up custom unsecured paths for druid nodes. * return all resources for Unsecured paths * review comment - Add test * fix tests * fix test	2018-04-11 17:10:07 -07:00
Caroline1000	afa75e04b7	change header in overlord console; minor querydoc change (#5625 ) * change header in overlord console; minor querydoc change * remove change to overlord console * address Gian comments	2018-04-11 12:57:22 -07:00
Nishant Bangarwa	b32aad9ab4	Fix some broken links in druid docs (#5622 ) * Fix some broken links in druid docs * review comment	2018-04-11 10:27:33 -07:00
Nishant Bangarwa	80fa5094e8	Fix Kerberos Authentication failing requests without cookies and excludedPaths config. (#5596 ) * Fix Kerberos Authentication failing requests without cookies. KerberosAuthenticator was failing `First` request from the clients. After authentication we were setting the cookie properly but not setting the authenticated flag in the request. This PR fixed that. Additional Fixes - * Removing of Unused SpnegoFilterConfig - replaced by KerberosAuthenticator * Unused internalClientKeytab and principal from KerberosAuthenticator * Fix docs accordingly and add docs for configuring an escalated client. * Fix excluded path config behavior * spelling correction * Revert "spelling correction" This reverts commit `fb754b43d8`. * Revert "Fix excluded path config behavior" This reverts commit `3901047769`.	2018-04-09 20:45:35 -07:00
Alexander T	ad6f234e1e	Update lookups-cached-global.md (#5525 ) Update lookup creation example to work with version 0.12.0	2018-04-06 16:13:17 -07:00
Jihoon Son	723857699c	Add doc for automatic pendingSegments (#5565 ) * Add missing doc for automatic pendingSegments * address comments	2018-04-05 23:53:43 -07:00
Dylan Wylie	ddd23a11e6	Fix taskDuration docs for KafkaIndexingService (#5572 ) * With incremental handoff the changed line is no longer true.	2018-04-05 23:52:58 -07:00
Arup Malakar	0c4598c1fe	Fix typo in avatica java client code documentation (#5553 )	2018-03-29 16:36:40 -05:00
Dyana Rose	db508cf3ca	[docs] fix invalid example json (#5547 ) https://github.com/druid-io/druid/issues/5546	2018-03-28 13:53:38 -07:00
Clint Wylie	50e0e7f97d	Correct lookup documentation (#5537 ) fixes #5536	2018-03-26 17:01:02 -07:00
Nathan Hartwell	ea30c05355	Adding ParserSpec for Influx Line Protocol (#5440 ) * Adding ParserSpec for Influx Line Protocol * Addressing PR feedback - Remove extraneous TODO - Better handling of parse errors (e.g. invalid timestamp) - Handle sub-millisecond timestamps * Adding documentation for Influx parser * Fixing docs	2018-03-26 14:28:46 -07:00
Atul Mohan	ec17a44e09	Add result level caching to Brokers (#5028 ) * Add result level caching to Brokers * Minor doc changes * Simplify sequences * Move etag execution * Modify cacheLimit criteria * Fix incorrect etag computation * Fix docs * Add separate query runner for result level caching * Update docs * Add post aggregated results to result level cache * Fix indents * Check byte size for exceeding cache limit * Fix indents * Fix indents * Add flag for result caching * Remove logs * Make cache object generation synchronous * Avoid saving intermediate cache results to list * Fix changes that handle etag based response * Release bytestream after use * Address PR comments * Discard resultcache stream after use * Fix docs * Address comments * Add comment about fluent workflow issue	2018-03-23 19:11:52 -07:00
Charles Allen	ef21ce5a64	Add graceful shutdown timeout for Jetty (#5429 ) * Add graceful shutdown timeout * Handle interruptedException * Incorporate code review comments * Address code review comments * Poll for activeConnections to be zero * Use statistics handler to get active requests * Use native jetty shutdown gracefully * Move log line back to where it was * Add unannounce wait time * Make the default retain prior behavior * Update docs with new config defaults * Make duration handling on jetty shutdown more consistent * StatisticsHandler is a wrapper * Move jetty lifecycle error logging to error	2018-03-23 09:38:17 -07:00
Gian Merlino	0851f2206c	Expanded documentation for DataSketches aggregators. (#5513 ) Originally written by @AlexanderSaydakov in druid-io/druid-io.github.io#448. I also added redirects and updated links to point to the new datasketches-extension.html landing page for the extension, rather than to the old page about theta sketches.	2018-03-21 18:19:27 -07:00
Jihoon Son	1ad898bde2	Use the official aws-sdk instead of jet3t (#5382 ) * Use the official aws-sdk instead of jet3t * fix compile and serde tests * address comments and fix test * add http version string * remove redundant dependencies, fix potential NPE, and fix test * resolve TODOs * fix build * downgrade jackson version to 2.6.7 * fix test * resolve the last TODO * support proxy and endpoint configurations * fix build * remove debugging log * downgrade hadoop version to 2.8.3 * fix tests * remove unused log * fix it test * revert KerberosAuthenticator change * change hadoop-aws scope to provided in hdfs-storage * address comments * address comments	2018-03-21 15:36:54 -07:00
Slim	17c71a2a60	Make Doubles aggregators use 64bits by default (#5478 ) * use 64-bit float representation for double based aggregator Change-Id: Ia4f442037052add178f6ac68138c9d52f96c6e09 * review comments Change-Id: I5a588f7364f236bf22f2b138e9d743bfb27c67fe	2018-03-19 19:13:04 -07:00
Christoph Hösler	34f655599d	Let MySQLConnector accept all UTF charsets and recommend utf8mb4 (#5411 ) * Let MySQLConnector accept all UTF charsets and recommend utf8mb4 * Fix regex and remove newline in log statement	2018-03-13 01:16:10 -07:00
Himanshu	8fae0edc95	allow arbitrary aggregators for reindexing with hadoop (#5294 )	2018-03-07 17:13:56 -08:00
Hongze Zhang	b084075279	Add http/https proxy options to PullDependencies.java (#5450 )	2018-03-07 15:05:43 -08:00
Gian Merlino	7416d1d02d	Add "joda" option to timeFormat extractionFn. (#5448 )	2018-03-02 19:59:26 -08:00
Gian Merlino	e4eaee3806	Support for disabling bitmap indexes. (#5402 ) * Support for disabling bitmap indexes. Can save space for columns where bitmap indexes are pointless (like free-form text). * Remove import. * Fix CompactionTaskTest. * Update for review comments. * Review comments, tests. * Fix test.	2018-02-28 19:19:56 -08:00
Alexander Korablev	6a3a5350b8	Make memcached protocol and locator configurable. (#5438 ) * Make memcached protocol and locator configurable. * Style fix. * Style fix. * Style fix.	2018-02-28 17:16:43 -08:00
Gian Merlino	f3796bc81b	SQL: Lower default JDBC frame size. (#5409 ) The previous default of 100,000 was a bit excessive and could easily lead to OOM errors on "select *" style queries.	2018-02-21 10:00:48 -08:00
Parag Jain	fba13d8978	time based checkpointing for Kafka Indexing Service (#5255 ) * time based checkpointing * add test and fix issue * fix comments * fix formatting * update docs	2018-02-15 20:57:02 -08:00
David Lim	20a3164180	Support for router forwarding requests to active coordinator/overlord (#5369 ) * allow router to forward requests to coordinator and overlord * fix forbidden API * more forbidden api fixes * code review changes	2018-02-15 14:38:58 -08:00
Javier Collado	c45fe37611	Feature add coordinator servers endpoint documentation (#5392 ) * Add new servers section to the coordinator endpoints documentation * Remove trailing whitespace	2018-02-15 14:37:58 -08:00
Dan Suzuki	472ba14dfe	Support Map type in ORC extension (#5363 ) * Support map type in orc extension. Added getMapObject in OrcHadoopInputRowParser Updated parse tests to parse map-type field in OrcHadoopInputRowParserTest * changed from for-loop to foreach * added resolution of column names when map types are exploded to several columns. updated the document as well -- orc.md. * Update orc.md change from review	2018-02-15 13:03:15 -08:00
Parag Jain	b9b3be6965	fix segment info in Kafka indexing service docs (#5390 ) * fix segment info in Kafka indexing service docs * review updates	2018-02-15 09:57:30 -08:00
QiuMM	aa7aee53ce	Opentsdb emitter extension (#5380 ) * opentsdb emitter extension * doc for opentsdb emitter extension * update opentsdb emitter doc * add the ms unit to the constant name * add a configurable event limit * fix version to 0.13.0-SNAPSHOT * using a thread to consume metric event * rename method and parameter	2018-02-13 13:10:22 -08:00
Andrew	06f0067b6c	Fix typo: change partitioningSpec to partitionsSpec in design/segments (#5376 )	2018-02-12 11:03:40 -08:00
Stephanie Rivera	77bb2f9c9f	Update post-aggregations.md (#5237 ) * Update post-aggregations.md I think this is more clear. I am not sure how multiplying by 100 is involved in averaging... * Update post-aggregations.md adding additional aggregator * Update post-aggregations.md	2018-02-06 16:26:39 -08:00
Jihoon Son	2099b43e5f	Add a new config object for compactConfig (#5264 ) * add a new config object for compactConfig * fix test * address comments * Update doc	2018-02-06 12:13:52 -08:00
Gian Merlino	9a62b02cb7	Extensions: Option to load classes from extension jars first. (#5321 ) The behavior is configurable through druid.extensions.useExtensionClassloaderFirst. It is useful when extensions want to load a dependency different from one provided by Druid, for example a different version of geoip or protobuf.	2018-02-06 16:14:03 +05:30
Jihoon Son	0db696b7c9	Fix CompactionTask doc (#5351 ) * Fix CompactionTask doc * Update coordinator doc	2018-02-05 22:38:34 -08:00
Himanshu	222a13e401	Use httpRemote and not remoteHttp for using HTTP Tasks Mgmt (#5334 )	2018-02-02 14:16:43 -06:00
Gian Merlino	ed47a1e1a9	Lookups: Inherit "injective" from registered lookups, improve docs. (#5316 ) Code changes: - In the lookup-based extractionFns, inherit injective property from the lookup itself if not specified. Doc changes: - Add a "Query execution" section to the lookups doc explaining how injective lookups and their optimizations work. - Remove scary warnings against using registeredLookup extractionFns. They are necessary and important since they work with filters and function cascades -- two things that the dimension specs do not do. They deserve to be first class citizens. - Move the "registeredLookup" fn above the "lookup" fn. It's probably more commonly used, so the docs read better this way.	2018-02-01 18:30:19 -08:00
Jonathan Wei	2a892709e8	More memory limiting for HttpPostEmitter (#5300 ) * More memory limiting for HttpPostEmitter * Less aggressive large events test * Fix tests * Restrict batch queue size first, keep minimum of 2 queue items	2018-01-26 15:48:45 -08:00
Jonathan Wei	f6749f1229	Allow separate truststore conf for HttpEmitter (#5298 ) * Fix HttpEmitter TLS support, allow separate truststore conf * PR comment, fix tests	2018-01-26 10:46:06 -06:00
Jonathan Wei	80419752b5	Add metamx emitter, http clients, and metrics packages to druid java-util (#5289 ) * Add metamx java-util emitter, http clients, and metrics packages to druid java-util * Remove metamx java-util from pom.xml files * Checkstyle fixes * Import fix * TeamCity inspection fixes * Use slf4j, move some version defs to master pom.xml * Use parent jvm-attach-api and maven-surefire-plugin versions * Add ] to log msg, suppress inspection	2018-01-24 22:10:36 +01:00
Fokko Driesprong	cc32640642	Update the example of the dimensionsSpec (#5293 ) The example was outdated with the dateSpec	2018-01-24 11:28:54 -08:00
Gian Merlino	53e3c7d1b2	SQL: Add additional unsupported features to the docs. (#5290 )	2018-01-24 11:27:47 -08:00
Akash Dwivedi	d6932c1621	java-util version update + Add UnusedConnectionTimeout config. (#5239 ) * java-util version update + Add UnusedConnectionTimeout config. * warn if unusedConnectionTime >= readTimeout. * Doc update + addressed comment. * Use compareTo to compare duration. * remove unused variable. * addressed comments and default for unusedConnectionTimeout.	2018-01-17 15:54:18 -06:00
Jihoon Son	241efafbb2	Automatic compaction by coordinators (#5102 ) * Automatic compaction by coordinator * add links * skip compaction for very recent segments if they are small * fix finding search interval * fix finding search interval * fix TimelineHolder iteration * add test for newestSegmentFirstPolicy * add CompactionSegmentIterator * add numTargetCompactionSegments * add missing config * fix skipping huge shards * fix handling large number of segments per shard * fix test failure * change recursive call to loop * fix logging * fix build * fix test failure * address comments * change dataSources type * check running pendingTasks at each run * fix test * address comments * fix build * fix test * address comments * address comments * add doc for segment size optimization * address comment	2018-01-13 13:52:37 +09:00
Shen Liu	3c69717202	Fix typo in configuration/index.md (#5249 ) (#5250 ) * Fix #5212 - typo in auth.md. * Fix typo in configuration (#5249) * Add a backquote. * Fix typo from HttpEmitterMonitor to HttpEmittingMonitor.	2018-01-11 18:29:12 +09:00
Atul Mohan	3cc4a0ab19	Support for encryption of MySQL connections (#5122 ) * Encrypting MySQL connections * Update docs * Make verifyServerCertificate a configurable parameter * Change password parameter and doc update * Make server certificate verification disabled by default * Update tostring * Update docs * Add check for trust store passwords * Add warning for null password	2018-01-10 11:33:54 -08:00
Jihoon Son	972b4d189a	Fix topN doc (#5240 )	2018-01-09 20:10:13 -08:00
Jonathan Wei	02544f9197	Add missing auth doc links (#5224 )	2018-01-05 16:23:13 -06:00
Himanshu	a46d34daa2	HTTP based task/worker management. (#5104 ) * just renaming of SegmentChangeRequestHistory etc * additional change history refactoring changes * WorkerTaskManager a replica of WorkerTaskMonitor * HttpServerInventoryView refactoring to extract sync code and robustification * Introducing HttpRemoteTaskRunner * Additional Worker side updates	2018-01-04 19:19:35 -08:00
Jonathan Wei	935ac646f4	Upgrade to Calcite 1.15.0 (#5210 ) * Upgrade to Calcite 1.15.0 * Use Filtration.eternity()	2018-01-04 12:11:24 -08:00
Shen Liu	5a8ea5f8ab	Fix #5212 - typo in auth.md. (#5213 )	2018-01-04 12:09:42 -08:00
Nishant Bangarwa	4cc31e4e7a	Update Zookeeper version (#5184 )	2018-01-04 10:59:20 +08:00
Yuya Fujiwara	3d3b04e1b8	docs: fix broken link to ingestions and tasks on the Druid Concepts page (#5197 ) * fix broken links * add newline	2017-12-27 07:55:24 -08:00
Himanshu	0f5c7d1aec	Add freeSpacePercent config in segment location to enforce free space while storing segments (#5137 ) * Add freeSpacePercent config in segment location config to enforce free space while storing segments * address review comments * address review comments: more doc on freeSpacePercent and use Double for freeSpacePercent	2017-12-21 15:31:09 +03:00
Nishant Bangarwa	494e0b79ed	Allow configuring header size for druid requests (#5174 ) * Allow configuring header size for druid requests * fix configuration name in doc. * add more info to docs. * Add info to kerberos doc.	2017-12-20 18:51:40 -08:00
Jonathan Wei	f48c9d7be1	Basic auth extension (#5099 ) * Basic auth extension * Add auth configuration integration test * Fix missing authorizerName property * PR comments * Fix missing @JsonProperty annotation * PR comments * more PR comments	2017-12-14 10:36:04 -08:00
Roman Leventov	a7a6a0487e	Replace IOPeon with SegmentWriteOutMedium; Improve buffer compression (#4762 ) * Replace IOPeon with OutputMedium; Improve compression * Fix test * Cleanup CompressionStrategy * Javadocs * Add OutputBytesTest * Address comments * Random access in OutputBytes and GenericIndexedWriter * Fix bugs * Fixes * Test OutputBytes.readFully() * Address comments * Rename OutputMedium to SegmentWriteOutMedium and OutputBytes to WriteOutBytes * Add comments to ByteBufferInputStream * Remove unused declarations	2017-12-04 18:04:27 -08:00
Slim	790678f02c	Fix typo in docs (#5074 ) Small fix for the realtime pull docs fix issue #5072	2017-11-22 23:16:36 -03:00
Chuanlei Ni	368d03146b	assign granularity.all to SelectQuery by default (#5091 )	2017-11-21 17:10:19 -08:00
Daniel	22c49b0d33	docs: fix broken link to broker configuration (#5105 )	2017-11-21 13:32:00 +09:00
Roman Leventov	dbb37b727d	Add useL2 and populateL2 configs to HybridCache (#5088 ) * Add useL2 and populateL2 configs to HybridCache * typo	2017-11-20 16:57:05 -06:00
chaoqiang	50140ce820	StatsD Emitter Doc on blankHolder (#5101 ) * fix equalDistribution worker select strategy * replace anonymous Comparator * keep previous version sorting comment * fix code style * update comment * move JsonProperty * fix statsD emitter with blank character * Add blankHolder doc On statsD monitor	2017-11-18 12:00:47 -08:00
Parag Jain	cb03efeb14	Kafka Index Task that supports Incremental handoffs (#4815 ) * Kafka Index Task that supports Incremental handoffs - Incrementally handoff segments when they hit maxRowsPerSegment limit - Decouple segment partitioning from Kafka partitioning, all records from consumed partitions go to a single druid segment - Support for restoring task on middle manager restarts by check pointing end offsets for segments * take care of review comments * make getCurrentOffsets call async, keep track of publishing sequence, review comments * fix setEndoffset duplicate request handling, formatting * fix unit test * backward compatibility * make AppenderatorDriverMetadata backwards compatible * add unit test * fix deadlock between persist and push executors in AppenderatorImpl * fix formatting * use persist dir instead of work dir * review comments * fix deadlock * actually fix deadlock	2017-11-17 16:05:20 -06:00
Gian Merlino	486159ba8c	SQL: Add TIMESTAMPADD. (#5079 )	2017-11-16 12:00:34 -08:00
Gian Merlino	4fd4444b42	SQL: Add "array" result format, and document result formats. (#5032 ) * SQL: Add "array" result format, and document result formats. * Code style.	2017-11-13 20:24:06 -08:00
Jonathan Wei	9ac150c23a	Split internal client escalation from Authenticator interface (#5073 ) * Split internal client escalation from Authenticator interface * PR comments	2017-11-13 19:29:08 -08:00
Akash Dwivedi	c1538f29fc	maxQueryTimeout property in runtime properties. (#4852 ) * maxQueryTimeout property in runtime properties. * extra line * move withTimeoutAndMaxScatterGatherBytes method to QueryLifeCycle. * Fix initialize method. * remove unused import. * doc update. * some more details in doc about query failure.. * minor fix. * decorating QueryRunner to set and verify context. Added by servers. * remove whitespace.	2017-11-13 19:23:11 -06:00
Jonathan Wei	819700cbc5	Automatically insert authenticator/authorizer names into config properties (#5071 )	2017-11-13 13:12:31 -08:00
Gian Merlino	9444da5038	SQL: Improved behavior when implicitly casting strings to date/time literals. (#5023 ) * SQL: Improved behavior when implicitly casting strings to date/time literals. - Handle all flavors of ISO8601 and SQL literals. - Throw errors on other literals instead of silently transforming them to 0. * Respect timeZone when format is null.	2017-11-10 17:43:22 +09:00
Himanshu	bbb678efd7	fix lookups endpoint collisions (#5058 ) * fix lookups endpoint collisions * fix errors	2017-11-09 17:39:53 -08:00
Himanshu	2ecebb3173	Fix coordinator/overlord redirects when TLS is enabled (#5037 ) * Fix coordinator/overlord redirects when TLS is enabled * address review comment * fix UTs * workaround to not ignore URL instance to fix the teamcity build * update tls doc	2017-11-09 13:10:28 -08:00
Roman Leventov	a8dc056c09	Add retries for coordinator fetch and lookup start in LookupReferencesManager (#5029 ) * Add retries for coordinator fetch and lookup start in LookupReferencesManager * Fix LookupConfigTest * Address comments * Address more comments * And address more comments * Address comms * Recognize 'not found' lookups in LookupReferencesManager.tryGetLookupListFromCoordinator(), by @egor-ryashin	2017-11-09 02:30:36 -03:00
Gian Merlino	e6ec4310b1	IT: Switch to OpenJDK8 base image. (#5060 ) * IT: Switch to OpenJDK8 base image. Also split the Docker image into a base image and a child image, and build the base image ahead of time for efficiency's sake. Also upgrade ZK to 3.4.10. * Additional comments about ZK upgrades.	2017-11-08 19:56:31 -08:00
Jihoon Son	5f3c863d5e	Add compaction task (#4985 ) * Add compaction task * added doc * use combining aggregators * address comments * add support for dimensionsSpec * fix getUniqueDims and getUniqueMetics * find unique dimensionsSpec * fix compilation * add unit test * fix test * fix test * test for different dimension orderings and types, and doc for type and ordering * add control for custom ordering and type * update doc * fix compile * fix compile * add segments param * fix serde error * fix build	2017-11-03 21:55:27 -06:00
Roman Leventov	5eb08c27cb	Add Emitter monitoring (#4973 ) * Add Emitter monitoring * Fix typo * Fixes * testing new emitter * Fix failed test (#71) * testing new emitter * fix on failed test * Remove emitter's readTimeout from docs * Update docs * Add HttpEmittingMonitor * Update java-util to 1.3.2	2017-11-03 21:27:57 -06:00
Jiaqi Liu	7c8b14f18c	Fix doc link (#5040 )	2017-11-03 11:04:33 -07:00
Jonathan Wei	6840eabd87	Add Router connection balancers for Avatica queries (#4983 ) * Add Router connection balancers for Avatica queries * PR comments * Adjust test bounds * PR comments * Add doc comments * PR comments * PR comment * Checkstyle fix	2017-11-01 14:01:13 -07:00
Himanshu	654cdc07f5	Document HTTP based segment management and Deprecate classes to remove in future (#4997 ) * document http segment management * deprecated classes that shouldn't be used any further	2017-11-01 12:59:27 -04:00
Gian Merlino	0ce406bdf1	Introduce "transformSpec" at ingest-time. (#4890 ) * Introduce "transformSpec" at ingest-time. It accepts a "filter" (standard query filter object) and "transforms" (a list of objects with "name" and "expression"). These can be used to do filtering and single-row transforms without need for a separate data processing job. The "expression" fields use the same expression language as other expression-based feature. * Remove forbidden api. * Fix compile error. * Fix tests. * Some more changes. - Add nullable annotation to Firehose.nextRow. - Add tests for index task, realtime task, kafka task, hadoop mapper, and ingestSegment firehose. * Fix bad merge. * Adjust imports. * Adjust whitespace. * Make Transform into an interface. * Add missing annotation. * Switch logger. * Switch logger. * Adjust test. * Adjustment to handling for DatasourceIngestionSpec. * Fix test. * CR comments. * Remove unused method. * Add javadocs. * More javadocs, and always decorate. * Fix bug in TransformingStringInputRowParser. * Fix bad merge. * Fix ISFF tests. * Fix DORC test.	2017-10-30 17:38:52 -07:00
elloooooo	52a162e302	define earlyMessegeRejectPeriod as the period after the taskduration (#4990 )	2017-10-27 01:13:46 +05:30
Himanshu	ef4a8cb724	Optional segment load/drop management without zookeeper using http (#4966 ) * introducing CuratorLoadQueuePeon * HttpLoadQueuePeon based off of current code * Revert "Remove SegmentLoaderConfig.numLoadingThreads config (#4829)" This reverts commit `d8b3bfa63c`. * SegmentLoadDropHandler copy/pasted from ZkCoordinator * Revert "1-based counts in ZkCoordinator (#4917)" This reverts commit `e725ff4146`. * remove non-zk part from ZkCoordinator * remove zk part from SegmentLoadDropHandler * additional changes for segment load/drop management with http * address review comments * add some more logs * Execs class is moved	2017-10-19 12:41:23 -07:00
Darío	ce7bf3f325	Update batch-ingestion.md (#4963 ) I've had problems ingesting several S3 files with Druid. After checking I saw this: https://groups.google.com/forum/#!msg/druid-user/4L62vjor4NM/p8Z_R3lEAQAJ and realised that the docs hasn't been updated. This issue might have been solved with new Druid versions, but for those who are still using older ones (0.9.2), it's nice having this change made :)	2017-10-18 16:44:09 -07:00
Gian Merlino	d5e83f9d50	Fix docs for MOD. (#4971 )	2017-10-18 16:43:28 -07:00
Jihoon Son	52d7f74226	Add streaming aggregation as the last step of ConcurrentGrouper if data are spilled (#4704 ) * Add steaming grouper * Fix doc * Use a single dictionary while combining * Revert GroupByBenchmark * Removed unused code * More cleanup * Remove unused config * Fix some typos and bugs * Refactor Groupers.mergeIterators() * Add comments for combining tree * Refactor buildCombineTree * Refactor iterator * Add ParallelCombiner * Add ParallelCombinerTest * Handle InterruptedException * use AbstractPrioritizedCallable * Address comments * [maven-release-plugin] prepare release druid-0.11.0-sg * [maven-release-plugin] prepare for next development iteration * Address comments * Revert "[maven-release-plugin] prepare for next development iteration" This reverts commit `5c6b31e488`. * Revert "[maven-release-plugin] prepare release druid-0.11.0-sg" This reverts commit `0f5c3a8b82`. * Fix build failure * Change list to array * rename sortableIds * Address comments * change to foreach loop * Fix comment * Revert keyEquals() * Remove loop * Address comments * Fix build fail * Address comments * Remove unused imports * Fix method name * Split intermediate and leaf combine degrees * Add comments to StreamingMergeSortedGrouper * Add more comments and fix overflow * Address comments * ConcurrentGrouperTest cleanup * add thread number configuration for parallel combining * improve doc * address comments * fix build	2017-10-17 23:24:08 -07:00
Slim	af2bc5f814	Make float default representation for DoubleSum/Min/Max aggregators (#4944 ) * Introduce System wide property to select how to store double. Set the default to store as float Change-Id: Id85cca04ed0e7ecbce78624168c586dcc2adafaa * fix tests Change-Id: Ib42db724b8a8f032d204b58c366caaeabdd0d939 * Change the property name Change-Id: I3ed69f79fc56e3735bc8f3a097f52a9f932b4734 * add tests and make default distribution store doubles as 64bits Change-Id: I237b07829117ac61e247a6124423b03992f550f2 * adding mvn argument to parallel-test profile Change-Id: Iae5d1328f901c4876b133894fa37e0d9a4162b05 * move property name and helper function to io.druid.segment.column.Column Change-Id: I62ea903d332515de2b7ca45c02587a1b015cb065 * fix docs and clean style Change-Id: I726abb8f52d25dc9dc62ad98814c5feda5e4d065 * fix docs Change-Id: If10f4cf1e51a58285a301af4107ea17fe5e09b6d	2017-10-16 17:17:22 -07:00
Gian Merlino	f51f346e36	SQL: Fix POWER doc, add test. (#4953 )	2017-10-13 14:38:15 -07:00
Gian Merlino	5cfc7f9ef7	Fix formatting of SQL TRIM docs. (#4951 )	2017-10-13 14:38:06 -07:00
Atul Mohan	c07678b143	Synchronization of lookups during startup of druid processes (#4758 ) * Changes for lookup synchronization * Refactor of Lookup classes * Minor refactors and doc update * Change coordinator instance to be retrieved by DruidLeaderClient * Wait before thread shutdown * Make disablelookups flag true by default * Update docs * Rename flag * Move executorservice shutdown to finally block * Update LookupConfig * Refactoring and doc changes * Remove lookup config constructor * Revert Lookupconfig constructor changes * Add tests to LookupConfig * Make executorservice local * Update LRM * Move ListeningScheduledExecutorService to ExecutorCompletionService * Move exception to outer block * Remove check to see future is done * Remove unnecessary assignment * Add logging	2017-10-12 21:22:24 -05:00
Jihoon Son	dfa9cdc982	Prioritized locking (#4550 ) * Implementation of prioritized locking * Fix build failure * Fix tc fail * Fix typos * Fix IndexTaskTest * Addressed comments * Fix test * Fix spacing * Fix build error * Fix build error * Add lock status * Cleanup suspicious method * Add nullables * add doInCriticalSection to TaskLockBox and revert return type of task actions * fix build * refactor CriticalAction * make replaceLock transactional * fix formatting * fix javadoc * fix build	2017-10-11 23:16:31 -07:00
Roman Leventov	7a9940d624	Add /readiness to HistoricalResource (#4916 ) * Add /loadStatusCode to HistoricalResource * Address comments * Fixes	2017-10-11 20:35:52 -07:00
Gian Merlino	b20e3038b6	SQL: Upgrade to Calcite 1.14.0, some refactoring of internals. (#4889 ) * SQL: Upgrade to Calcite 1.14.0, some refactoring of internals. This brings benefits: - Ability to do GROUP BY and ORDER BY with ordinals. - Ability to support IN filters beyond 19 elements (fixes #4203). Some refactoring of druid-sql internals: - Builtin aggregators and operators are implemented as SqlAggregators and SqlOperatorConversions rather being special cases. This simplifies the Expressions and GroupByRules code, which were becoming complex. - SqlAggregator implementations are no longer responsible for filtering. Added new functions: - Expressions: strpos. - SQL: TRUNCATE, TRUNC, LENGTH, CHAR_LENGTH, STRLEN, STRPOS, SUBSTR, and DATE_TRUNC. * Add missing @Override annotation. * Adjustments for forbidden APIs. * Adjustments for forbidden APIs. * Disable GROUP BY alias. * Doc reword.	2017-10-10 12:44:05 -07:00
Gian Merlino	4e1d0f49d8	Docs: Fix link to broker configuration. (#4934 )	2017-10-10 11:18:46 -07:00
chunghochen	0614b92df1	adding new post aggregators for test statistics to druid-stats extension (#4532 ) * adding new post aggregators of test stats to druid-stats extension * changes to address code review comments * fix checkstyle violations using druid_intellij_formatting.xml after merge upstream/master * add @Override annotation per CI log * make changes per review comments/discussions * remove some blocks per review comments	2017-10-09 23:43:27 -07:00
Parag Jain	7cc18226cd	add more tls configs to enable/disable specific cipher suites and protocols (#4902 ) * add more tls configs to enable/disable specific cipher suites and protocols * fix doc, allow empty list	2017-10-09 13:53:12 -07:00
Himanshu	0e856ee806	add configs to enable fast request failure on broker and historical (#4540 ) * add configs to enable fast request failure on broker * address review comments * fix styling error * fix style error * have enableRequestLimit config instead of having user specify max limit * add comment * fix style error * add UT fo LimitRequestsFilter * address review comments * fix test * make LimitRequestsFilterTest more robust * fix JettyQosTest	2017-10-06 14:45:13 -05:00
Himanshu	f69c9280c4	remove ServerConfig from DruidNode as all information needs to be present in DruidNode serialized form (#4858 ) * remove ServerConfig from DruidNode as all information needs to be present in DruidNode serialized form * sanitize output of /druid/coordinator/v1/cluster endpoint	2017-09-28 10:40:59 -05:00
Gian Merlino	bf8fd4c203	Add flattenSpec support to the Avro parser. (#4832 ) * Add flattenSpec support to the Avro parser. Also: - Refactor the JSONPathParser a bit so it can share flattening code with Avro (see ObjectFlatteners). - Remove the JSONParser. It was only used in two places: by UriNamespaceExtractor, and as a base for JSONToLowerParser. Migrated the former to JSONPathParser and made the latter a standalone. - Move GenericRecordAsMap to the Parquet extension, since the Avro extension no longer uses it. * Fix indentation. * Fix equals/hashCode.	2017-09-26 09:26:06 -07:00
Roman Leventov	b56a907145	Add namespace extraction thread config (#4833 )	2017-09-25 09:52:36 -07:00
Charles Allen	a6470c1d03	Move caffeine out of extension and make it the default cache implementation. (#4810 ) * Move caffeine out of extension. * Remove `JsonTypeName` from the class itself * Fix bad docs * Fix distribution pom * Fix unused import * Make caffeine default * Address code comments * Add more description around the jre version in the readme * Add suggested comments	2017-09-22 10:46:55 -07:00
Jonathan Wei	09fcb75583	Add RequestLogEvent emitters config to graphite-emitter (#4678 ) * Add RequestLogEvent emitters config to graphite-emitter * eagerly compute emitter list * use lambdas * checkstyle	2017-09-22 06:14:32 -07:00
Roman Leventov	d8b3bfa63c	Remove SegmentLoaderConfig.numLoadingThreads config (#4829 )	2017-09-20 21:27:43 -07:00
Himanshu	a36adc63e4	[documentation] add more jvm and os guidelines (#4793 ) * add more jvm and os guidelines * address review comments * add not so general guidelines too * duplicate statement removal	2017-09-20 13:12:57 -07:00
Jonathan Wei	164c73f2b2	Fix kerberos authenticator docs (#4822 )	2017-09-19 14:32:22 -05:00
Jonathan Wei	c2a0e753b6	Extension points for authentication/authorization (#4271 ) * Extension points for authentication/authorization * Address some PR comments * Authorization result caching * Add unit tests for SecuritySanityCheckFilter and PreResponseAuthorizationCheckFilter * Use Set for auth caching, close outputstreams in filters * Don't close output stream on success in sanity check filter * Add ConfigResourceFilter to coordinator lookups * Fix filtering authorization check for empty resource list * HttpClient users must explicitly escalate the client * Remove response modification from PreResponseAuthorizationCheckFilter * Remove extraneous pom.xml * Fix unit test * Better lifecycle management * Rename AuthorizationManager to Authorizer * Fix authorization denials for empty supervisor list * Address some PR comments * Address more PR comments * Small cleanup * Add Jetty HttpClient wrapper to Authenticator * Remove Authorizer start/stop * Restore immutable context map in DruidConnection, UT fix * Fix/update docs * Add authorization checks to EventReceiverFirehose * Fix router authorization check failure, restore PreResponseAuthorizationFilter changes * Compile fixes * Test fixes * Update Authenticator/Authorizer doc comments * Merge fixes * PR comments * Fix test * Fix IT * More PR comments * PR comments * SSL fix	2017-09-15 23:45:48 -07:00
Yuya Fujiwara	0fe734805b	formatted table. (#4797 )	2017-09-15 17:39:06 -07:00
Roman Leventov	267f415dc3	Update emitter library and add support for ParametrizedUriEmitter (#4722 ) * Move emitters from io.druid.server.initialization to the dedicated io.druid.server.emitter package; Update emitter library to 0.6.0; Add support for ParametrizedUriEmitter; Support hierarchical properties in JsonConfigurator (was needed for ParametrizedUriEmitter) * Log created RequestLoggers * Fix forbidden API * Test fix * More Http and Parametrized Http Emitter docs * Switch to debug level	2017-09-13 17:17:19 -05:00
Gian Merlino	2ce8123bdb	Move scan-query from a contrib extension into core. (#4751 ) * Move scan-query from a contrib extension into core. Based on a proposal at: https://groups.google.com/d/topic/druid-development/ME_OatUDnbk/discussion This patch also adds support for virtual columns to the Scan query, and updates Druid SQL to use Scan instead of Select. This patch also makes some behavioral changes to handling of the __time column. In particular, it is now returned as "__time" rather than "timestamp"; it is no longer included if you do not specifically ask for it in your "columns"; and it is returned as a long rather than a string. Users can revert time handling to the legacy extension behavior by setting "legacy" : true in their queries, or setting the property druid.query.scan.legacy = true. This is meant to provide a migration path for users that were formerly using the contrib extension. * Adjustments from review. * Add back Select query. * Adjust SQL docs. * Restore SelectQuery link.	2017-09-13 09:51:24 -07:00
Kenji Noguchi	c0be050242	Add jq expression support in flattenSpec (#4171 ) * add jq expression in the flattenSpec * more tests * add benchmark * fix style * use JsonNode for both JSONPath and JQ * clean up * more clean up * add documentation * fix style * move jackson-jq version to dependencyManagement section. remove commented code * oops. revert wrong fix * throw IllegalArgumentException for JQ syntax error * remove e.printStackTrace() that is forbidden * touch	2017-09-12 14:18:34 -05:00
Gian Merlino	4909c48b0c	SQL: Full TRIM support. (#4750 ) * SQL: Full TRIM support. - Support trimming arbitrary characters - Support BOTH, LEADING, and TRAILING * Remove unused import. * Fix tests, add RTRIM / LTRIM. * Remove unused imports. * BTRIM and docs. * Replace for with foreach.	2017-09-12 11:49:08 -07:00
Parag Jain	b5e839b3db	injectable sslcontextfactory for jetty server and key manager factory algorithm (#4769 ) * injectable sslcontextfactory for jetty server key manager factory algorithm * explicitly set trustAll certificates to false in sslcontextfactory	2017-09-12 11:45:03 -07:00
dgolitsyn	752151f6cb	Add CachingCostBalancerStrategy (#4731 ) * Add CachingCostBalancerStrategy; Rename ServerView.ServerCallback to ServerRemovedCallback * Fix benchmark units * Style, forbidden-api, review, bug fixes * Add docs * Address comments	2017-09-08 12:23:04 -05:00
Gian Merlino	33c0928bed	Collapse worker select strategies, change default, add strong affinity. (#4534 ) * Collapse worker select strategies, change default, add strong affinity. - Change default worker select strategy to equalDistribution. It is more generally useful than fillCapacity. - Collapse the WithAffinity strategies into the regular ones. The WithAffinity strategies are retained for backwards compatibility. - Change WorkerSelectStrategy to return nullable instead of Optional. - Fix a couple of errors in the docs. * Fix test. * Review adjustments. * Remove unused imports. * Switch to DateTimes.nowUtc. * Simplify code. * Fix tests (worker assignment started off on a different foot)	2017-09-04 14:40:55 -07:00
Himanshu	06ac6678e6	DruidLeaderSelector interface for leader election and Curator based impl. (#4699 ) * DruidLeaderSelector interface for leader election and Curator based impl. DruidCoordinator/TaskMaster are updated to use the new interface. * add fake DruidNode binding in integration-tests module * add docs on DruidLeaderSelector interface * remove start/stop and keep register/unregister Listener in DruidLeaderSelector interface * updated comments on DruidLeaderSelector * cache the listener executor in CuratorDruidLeaderSelector * use same latch owner name that was used before * remove stuff related to druid.zk.paths.indexer.leaderLatchPath config * randomize the delay when giving up leadership and restarting leader latch	2017-09-01 09:49:04 -07:00
Gian Merlino	9078925cab	Docs for finalizingFieldAccess post-aggregator. (#4737 )	2017-08-31 11:45:49 -07:00
Bartosz Ługowski	8dddccc687	Graphite emitter - add plaintext protocol (#4265 ) * Graphite emitter - add plaintext protocol. Configurable option of replacing slash to dot in metric name. * Graphite emitter - fix misspelling in docs. * Graphite emitter - extend docs. * Graphite emitter - fix code style.	2017-08-29 06:23:06 -07:00
Gian Merlino	daf3c5f927	Add "round" option to cardinality and hyperUnique aggregators. (#4720 ) * Add "round" option to cardinality and hyperUnique aggregators. Also turn it on by default in SQL, to make math on distinct counts work more as expected. * Fix some compile errors. * Fix test. * Formatting.	2017-08-28 14:52:11 -07:00
Gian Merlino	9fbfc1be32	Add @ExtensionPoint and @PublicApi annotations. (#4433 ) * Add @ExtensionPoint and @PublicApi annotations. * Clean up wording. * Remove unused import. * Remove unused imports. * Only types can be extension points. * Adjust annotations some more. * Remove unused import. * Make ServletFilterHolder an extension point. * Add a couple extension points, and update docs.	2017-08-28 14:50:58 -07:00
zhangxinyu1	b04261e7a2	In indexing service flow chart, it should be middlemanager who writes task status to zookeeper (#4654 )	2017-08-27 10:17:15 -07:00
QiuMM	59a48a560a	Redis cache extension doc (#4702 ) * Redis cache extension doc * link redis cache doc in extensions.md	2017-08-24 09:53:51 -05:00
Akash Dwivedi	b43720c46d	Correction in indexing-service configuration doc. (#4700 )	2017-08-22 23:21:34 -05:00
Jonathan Wei	1bddfc089c	Additional docs/log for direct memory usage (#4631 ) * Additional docs/log for direct memory usage * Tweak docs * Doc rewording	2017-08-10 23:33:20 -07:00
Niketh Sabbineni	eb0deba54a	Fix NPE when locations are empty (#4667 ) * Fix NPE when locations are empty * Addressing comments	2017-08-10 23:31:28 -07:00
Yuewen Wang	c821bc9a5a	Implement "earlyMessageRejectionPeriod" config discussed in issue #4599 (#4607 ) * Implement "earlyMessageRejectionPeriod" config discussed in issue #4599 * implement the logics of this param * Added doc of this config * Added unit tests of it * Update KafkaSupervisor.java ameliorate comment * fix format * fix bug when rebasing	2017-08-11 09:12:08 +09:00
Peter Cunningham	ede7cf9eef	Added support for where clauses to JDBC lookups. (#4643 ) * Added support for where clauses to filter lookup values on ingestion. Added a filter field to the JDBC lookups that is used to generate a where clause so that only rows matching the filter value will be brought into Druid. Example being filter="SOMECOLUMN=1" * Required changes based on code review. * Required changes based on code review. * Added support for where clauses to filter lookup values on ingestion. Added a filter field to the JDBC lookups that is used to generate a where clause so that only rows matching the filter value will be brought into Druid. Example being filter="SOMECOLUMN=1" * Updates based on code review, mainly formatting and small refactor of the buildLookupQuery method. * Fixed broken buildLookupQuery method * Removed empty line. * Updates per review comments	2017-08-09 10:47:46 -07:00
Jihoon Son	d5606bc558	Passing lockTimeout as a parameter for TaskLockbox.lock() (#4549 ) * Passing lockTimeout as a parameter for TaskLockbox.lock() * Remove TIME_UNIT * Fix tc fail * Add taskLockTimeout to TaskContext * Add caution	2017-08-08 18:21:07 -07:00
David Lim	dd0b84e766	Fix bugs in RTR related to blacklisting, change default worker strategy (#4619 ) * fix bugs in RTR related to blacklisting, change default worker strategy to equalDistribution * code review and additional changes * fix errorprone * code review changes	2017-08-03 10:34:45 -07:00
Jihoon Son	f3f2cd35e1	Array-based aggregation for groupBy query (#4576 ) * Array-based aggregation * Fix handling missing grouping key * Handle invalid offset * Fix compilation * Add cardinality check * Fix cardinality check * Address comments * Address comments * Address comments * Address comments * Cleanup GroupByQueryEngineV2.process * Change to Byte.SIZE * Add flatMap	2017-08-03 20:04:54 +03:00
David Lim	f4ba7a68ab	fix missing column in caching documentation (#4612 )	2017-07-28 13:53:04 -07:00
Roman Leventov	684cfbf889	Upgrade to server-metrics 0.5.0 (#4480 ) * Upgrade to server-metrics 0.4.3 * Upgrade to 0.5.0 * Add CpuAcctDeltaMonitor description to docs	2017-07-26 08:56:00 -07:00
Gian Merlino	d4ef0f6d94	Improved SQL support for floats and doubles. (#4598 ) * Improved SQL support for floats and doubles. - Use Druid FLOAT for SQL FLOAT, and Druid DOUBLE for SQL DOUBLE, REAL, and DECIMAL. - Use float* aggregators when appropriate. - Add tests involving both float and double columns. - Adjust documentation accordingly. * CR comments. * Fix braces.	2017-07-25 13:54:44 -07:00
Himanshu	ae6780f62a	rolling upgrade order change to bring coordinator and overlord together (#4281 ) * rolling upgrade order change to bring coordinator and overlord together * mentioned merged Coordinator-Overlord in upgrade order doc * revert autoscaling doc change * auto scaling doc fix	2017-07-25 12:54:12 -05:00
Slim	71e7a4c054	Adding double columns support (#4491 ) * add double columns support * Fix numbers and expected results in UTs * adding float aggregators * fix IT expected test results * fix comments * more fixes * fix comp * fix test * refactor double and float aggregator factories * fix * fix UTs * fix comments * clean unused code * fix more comments * undo unnecessary changes * fix null issue * refactor TopNColumnSelectorStrategyFactory * fix docs * refactor NumericTopNColumnSelectorStrategy * fix return * fix comments * handle the null case in DimensionIndexer * more null fixing * cosmetic changes	2017-07-20 10:14:14 +03:00
Roman Leventov	b2865b7c7b	Make possible to start Peon without DI loading of any querying-related stuff (#4516 ) * Make QueryRunnerFactoryConglomerate injection lazy in TaskToolbox/TaskToolboxFactory * Extract QueryablePeonModule and add druid.modules.excludeList config * Typo	2017-07-12 13:18:25 -05:00
Jihoon Son	cc20260078	Early publishing segments in the middle of data ingestion (#4238 ) * Early publishing segments in the middle of data ingestion * Remove unnecessary logs * Address comments * Refactoring the patch according to #4292 and address comments * Set the total shard number of NumberedShardSpec to 0 * refactoring * Address comments * Fix tests * Address comments * Fix sync problem of committer and retry push only * Fix doc * Fix build failure * Address comments * Fix compilation failure * Fix transient test failure	2017-07-10 22:35:36 -07:00
Gian Merlino	16817e408d	SQL + Expressions = Best friends forever. (#4360 ) * SQL + Expressions = Best friends forever. - Use expressions as a projection layer for anything that can't be expressed using traditional Druid extractionFns. Sometimes they're embedded directly (like "expression" filters, builtin aggregators, or "expression" post-aggregators). Sometimes they're referenced through virtual columns (like dimensionSpecs, which can't innately reference functions of more than one column without the virtual column layer). - Add many new functions and operators, taking advantage of the expression capability (see the querying/sql.md doc). - Improve consistency of constant reduction and of casting by using Druid expressions for this instead of Calcite's RexExecutor. * Fix casting bug, and other code review comments. * Fix docs.	2017-07-07 08:48:26 -07:00
Parag Jain	6e2f78f552	TLS support (#4270 )	2017-07-06 17:40:12 -07:00
Roman Leventov	6173570425	Add ExtensionsConfig.excludeModules (#4438 ) * Add ExtensionsConfig.excludeModules * Add branch * Refactor Initialization.getFromExtensions() * excludeModules -> moduleExcludeList * Initialization.getFromExtensions() and getLoadedModules() should return Collection, not Set * Fix doc	2017-06-28 14:01:31 -07:00
Gian Merlino	4c33d0a00f	Add some new expression functions and macros. (#4442 ) * Add some new expression functions and macros. See misc/math-expr.md for the list of added functions, except for "like", which previously existed but was not documented. * Add easymock to datasketches tests. * Add easymock to distinctcount tests. * Add easymock to virtual-columns tests. * Code review comments. * Clean up code a bit. * Add easymock to scan-query tests. * Rework ExprMacros that have multiple impls. * Improve test coverage.	2017-06-28 10:15:58 -07:00
Roman Leventov	2fa4b10145	More fine-grained DI for management node types. Don't allocate processing resources on Router (#4429 ) * Remove DruidProcessingModule, QueryableModule and QueryRunnerFactoryModule from DI for coordinator, overlord, middle-manager. Add RouterDruidProcessing not to allocate processing resources on router * Fix examples * Fixes * Revert Peon configs and add comments * Remove qualifier	2017-06-27 22:58:01 -07:00
dgolitsyn	e04b8be52e	maxSegmentsInQueue in CoordinatorDynamicConfig (#4445 ) * Add maxSegmentsInQueue parameter to CoordinatorDynamicConfig and use it in LoadRule to improve segments loading and replication time * Rename maxSegmentsInQueue to maxSegmentsInNodeLoadingQueue * Make CoordinatorDynamicConfig constructor private; add/fix tests; set default maxSegmentsInNodeLoadingQueue to 0 (unbounded) * Docs added for maxSegmentsInNodeLoadingQueue parameter in CoordinatorDynamicConfig * More docs for maxSegmentsInNodeLoadingQueue and style fixes	2017-06-27 22:58:36 -05:00
Roman Leventov	05d58689ad	Remove the ability to create segments in v8 format (#4420 ) * Remove ability to create segments in v8 format * Fix IndexGeneratorJobTest * Fix parameterized test name in IndexMergerTest * Remove extra legacy merging stuff * Remove legacy serializer builders * Remove ConciseBitmapIndexMergerTest and RoaringBitmapIndexMergerTest	2017-06-26 13:21:39 -07:00
jeffhartley	3e7f7720a1	update aggregations.md re: rollup (#4455 ) noted that rollup could be on or off	2017-06-23 14:28:59 -07:00
Fokko Driesprong	ff501e8f13	Add Date support to the parquet reader (#4423 ) * Add Date support to the parquet reader Add support for the Date logical type. Currently this is not supported. Since the parquet date, which is the number of days since epoch, gets interpreted as seconds since epoch, it will fail on indexing the data because it will not map to the appropriate bucket. * Cleaned up code and tests Got rid of unused json files in the examples, cleaned up the tests by using try-with-resources. Now get the filenames from the json file instead of hard coding them and integrated general improvements from the feedback provided by leventov. * Got rid of the caching Remove the caching of the logical type of the time dimension column and cleaned up the code a bit.	2017-06-22 15:56:08 -05:00
Jonathan Wei	b333deae9d	Add script for getting milestone contributors (#4396 ) * Add script for getting milestone contributors * Use resp.text instead of resp.content * Add usage notes	2017-06-17 13:25:19 -07:00
Jonathan Wei	3b70995bb3	Configurable row limit for JDBC frames (#4417 )	2017-06-16 17:07:40 -07:00
Amar Ramachandran	fc80df339e	Fix incorrect name (#4386 )	2017-06-09 13:32:17 -04:00
Gian Merlino	1f2afccdf8	Expressions: Add ExprMacros. (#4365 ) * Expressions: Add ExprMacros, which have the same syntax as functions, but can convert themselves to any kind of Expr at parse-time. ExprMacroTable is an extension point for adding new ExprMacros. Anything that might need to parse expressions needs an ExprMacroTable, which can be injected through Guice. * Address code review comments.	2017-06-08 09:32:10 -04:00
Jordan Pittier	6697f3a62b	[Doc]Update configuration/historical.md: correct default numBootstrapThreads value (#4376 ) According to `4ace65a2af/server/src/main/java/io/druid/segment/loading/SegmentLoaderConfig.java (L81)` if numBootstrapThreads is not set, it defaults to numLoadingThreads.	2017-06-07 10:24:42 -07:00
kaijianding	551a89bd67	serialize DateTime As Long to improve json serde performance (#4038 )	2017-06-06 10:08:51 -07:00
Yuya Fujiwara	152d4e89ab	Fix typo in the avro.md. (#4370 )	2017-06-06 07:14:08 -07:00
David Lim	13ecf90923	Report Kafka lag information in supervisor status report (#4314 ) * refactor lag reporting and report lag at status endpoint * refactor offset reporting logic to fetch offsets periodically vs. at request time * remove JavaCompatUtils * code review changes * code review changes	2017-06-05 13:26:25 -07:00
Jonathan Wei	b90c28e861	Support limit push down for GroupBy (#3873 ) * Support limit push down for GroupBy V2 * Use orderBy spec ordering when applying limit push down * PR Comments * Remove unused var * Checkstyle fixes * Fix test * Add comment on non-final variables, fix checkstyle * Address PR comments * PR comments * Remove unnecessary buffer reset * Fix missing @JsonProperty annotation	2017-06-02 15:39:04 -07:00
Jonathan Wei	6daddf97c5	More documentation on expected interval format for coordinator endpoints (#4361 )	2017-06-02 15:21:44 -07:00
Jihoon Son	1150bf7a2c	Refactoring Appenderator Driver (#4292 ) * Refactoring Appenderator 1) Added publishExecutor and handoffExecutor for background publishing and handing segments off 2) Change add() to not move segments out in it * Address comments 1) Remove publishTimeout for KafkaIndexTask 2) Simplifying registerHandoff() 3) Add incremental handoff test * Remove unused variable * Add persist() to Appenderator and more tests for AppenderatorDriver * Remove unused imports * Fix strict build * Address comments	2017-06-02 07:09:11 +09:00
fanjieqi	2e933e1413	fix a bug in select-query.md in which the property_form lacks the『granularity』 (#4327 ) The result would be {"error"=>"Unknown exception", "errorMessage"=>nil, "errorClass"=>"java.lang.NullPointerException", "host"=>nil} when the json lacks『granularity』.	2017-05-30 17:04:39 -07:00
Kenji Noguchi	3400f601db	Protobuf extension (#4039 ) * move ProtoBufInputRowParser from processing module to protobuf extensions * Ported PR #3509 * add DynamicMessage * fix local test stuff that slipped in * add license header * removed redundant type name * removed commented code * fix code style * rename ProtoBuf -> Protobuf * pom.xml: shade protobuf classes, handle .desc resource file as binary file * clean up error messages * pick first message type from descriptor if not specified * fix protoMessageType null check. add test case * move protobuf-extension from contrib to core * document: add new configuration keys, and descriptions * update document. add examples * move protobuf-extension from contrib to core (2nd try) * touch * include protobuf extensions in the distribution * fix whitespace * include protobuf example in the distribution * example: create new pb obj everytime * document: use properly quoted json * fix whitespace * bump parent version to 0.10.1-SNAPSHOT * ignore Override check * touch	2017-05-30 13:11:58 -07:00
Kamal Gurala	dcb07d6958	Option to configure default analysis types in SegmentMetadataQuery (#4259 ) * Option to configure default analysis types * Updated Docs and renamed * Added serde tests and Null handling * Fixed Documentation * Updated implementation * Updated implementation * Updated implementation * Added usingDefaultIntervals in Builder * Updated implementation * Updated implementation and added failing test * filterSegments implementation updated * Updated implementation * Padding * Add missing Override * Updated implementation * Fixed a naming bug * Fixed bug * Removed comment	2017-05-26 12:12:39 -07:00
zwang180	2c55a935f8	Delete a duplicate "Bucket Extraction Function" section at the bottom of "Querying"-"DimensionSpec" page (#4331 )	2017-05-25 14:16:00 -07:00
Jihoon Son	11b7b1bea6	Add support for HttpFirehose (#4297 ) * Add support for HttpFirehose * Fix document * Add documents	2017-05-25 16:13:04 -05:00
李成露(StefanLee)	22977780aa	Doc (#4217 ) * Fixed (#4216) Modify the default value of `druid.server.http.numThreads` to `Math.max(10, (Runtime.getRuntime().availableProcessors() * 17) / 16 + 2) + 30` * Fixed(#4216) Modify the default value of `druid.server.http.numThreads` to `max(10, (Number of cores * 17) / 16 + 2) + 30` * Fixed(#4216) Modify the default value of `druid.server.http.numThreads` to `max(10, (Number of cores * 17) / 16 + 2) + 30`	2017-05-23 17:04:52 +09:00
Gian Merlino	adeecc0e72	Add /isLeader call to overlord and coordinator. (#4282 ) This is useful for putting them behind load balancers or proxies, as it lets the load balancer know which server is currently active through an http health check. Also makes the method naming a little more consistent between coordinator and overlord code.	2017-05-18 20:46:13 -05:00
Jihoon Son	733dfc9b30	Add PrefetchableTextFilesFirehoseFactory for cloud storage types (#4193 ) * Add PrefetchableTextFilesFirehoseFactory * fix comment * exception handling * Fix wrong json property * Remove ReplayableFirehoseFactory and fix misspelling * Defer object initialization * Add a temporaryDirectory parameter to FirehoseFactory.connect() * fix when cache and fetch are disabled * Address comments * Add more test * Increase timeout for test * Add wrapObjectStream * Move methods to Firehose from PrefetchableFirehoseFactory * Cleanup comment * add directory listing to s3 firehose * Rename a variable * Addressing comments * Update document * Support disabling prefetch * Fix race condition * Add fetchLock * Remove ReplayableFirehoseFactoryTest * Fix compilation error * Fix test failure * Address comments * Add default implementation for new method	2017-05-18 15:37:18 +09:00
David Lim	8333043b7b	add skipOffsetGaps flag (#4256 )	2017-05-16 12:19:28 -06:00
Himanshu	136b2fae72	improve query timeout handling and limit max scatter-gather bytes (#4229 ) * improve query timeout handling and limit max scatter-gather bytes * address review comments	2017-05-16 12:47:32 -05:00
Jihoon Son	50a4ec2b0b	Add support for headers and skipping thereof for CSV and TSV (#4254 ) * initial commit * small fixes * fix bug * fix bug * address code review * more cr * more cr * more cr * fix * Skip head rows for CSV and TSV * Move checking skipHeadRows to FileIteratingFirehose * Remove checking null iterators * Remove unused imports * Address comments * Fix compilation error * Address comments * Add more tests * Add a comment to ReplayableFirehose * Addressing comments * Add docs and fix typos	2017-05-15 22:57:31 -07:00
Himanshu	462f6482df	optionally add extensions to explicitly specified hadoopContainerClassPath (#4230 ) * optionally add extensions to explicitly specified hadoopContainerClassPath * note extensions always pushed in hadoop container when druid.extensions.hadoopContainerDruidClasspath is not provided explicitly	2017-05-08 14:24:14 -05:00
Himanshu	417714d228	additional lookup status discovery http endpoints at coordinator (#4228 ) * additional lookup status discovery http endpoints at coordinator * more changes * jsonize the error msgs as well * fix tests	2017-05-04 11:15:30 -07:00
Parag Jain	4502c207af	fix injection bug and documentation (#4243 )	2017-05-03 15:07:43 -05:00
hzy001	0c464f4a84	Fix docs (#4225 ) * Fix one typo Signed-off-by: Hao Ziyu <haoziyu@qiyi.com> * Fix deprecated links Signed-off-by: Hao Ziyu <haoziyu@qiyi.com>	2017-05-01 09:55:43 -07:00
Jihoon Son	7411b18df9	Add BroadcastDistributionRule (#4077 ) * Add BroadcastDistributionRule * Add missing null check * Rename variable 'colocateDataSource' to 'colocatedDatasource' * Address comments * Document for broadcast rules * Drop segments which are not co-located anymore * Remove duplicated segment loading and dropping * Add caveat * address comments	2017-05-01 09:55:17 -07:00
Himanshu	5a5a2749cd	improvements to coordinator lookups management (#3855 ) * coordinator lookups mgmt improvements * revert replaces removal, deprecate it instead * convert and use older specs stored in db * more tests and updates * review comments * add behavior for 0.10.0 to 0.9.2 downgrade * incorporating more review comments * remove explicit lock and use LifecycleLock in LookupReferencesManager. use LifecycleLock in LookupCoordinatorManager as well * wip on LookupCoordinatorManager * lifecycle lock * refactor thread creation into utility method * more review comments addressed * support smooth roll back of lookup snapshots from 0.10.0 to 0.9.2 * correctly use LifecycleLock in LookupCoordinatorManager and remove synchronization from start/stop * run lookup mgmt on leader coordinator only * wip: changes to do multiple start() and stop() on LookupCoordinatorManager * lifecycleLock fix usage in LookupReferencesManagerTest * add LifecycleLock back * fix license hdr * some fixes * make LookupReferencesManager.getAllLookupsState() consistent while still being lockless * address review comments * addressing leventov's comments * address charle's comments * add IOE.java * for safety in LookupReferencesManager mainThread check for lifecycle started state on each loop in addition to interrupt * move thread creation utility method to Execs * fix names * add tests for LookupCoordinatorManager.lookupManagementLoop() * add further tests for figuring out toBeLoaded and toBeDropped on LookupCoordinatorManager * address leventov comments * remove LookupsStateWithMap and parameterize LookupsState * address review comments * address more review comments * misc fixes	2017-04-28 08:41:38 -05:00
Gian Merlino	631068b099	Fix broken DataSketches link. (#4221 ) * Fix broken DataSketches link. * Better fixed link.	2017-04-27 17:37:12 -07:00
Himanshu	40057570f3	doc update on overlord console url when coordinator is acting as overlord (#4213 )	2017-04-26 15:03:54 -07:00
asrayousuf	e4fbc2bc5b	Updating the description of useCache (#4200 ) Updating the description of useCache Updating query-context doc based on Gian's comment	2017-04-25 10:26:15 -07:00
satishbhor	d51097c809	Fix lz4 library incompatibility in kafka-indexing-service extension (#4115 ) * Fix lz4 library incompatibility in kafka-indexing-service extension #3266 * Bumped Kafka version to 0.10.2.0 for : Fix lz4 library incompatibility in kafka-indexing-service extension #3266 * Replaced Lists.newArrayList() with Collections.singletonList() For Fix lz4 library incompatibility in kafka-indexing-service extension #4115	2017-04-25 12:23:51 +09:00
Jihoon Son	5b69f2eff2	Make timeout behavior consistent to document (#4134 ) * Make timeout behavior consistent to document * Refactoring BlockingPool and add more methods to QueryContexts * remove unused imports * Addressed comments * Address comments * remove unused method * Make default query timeout configurable * Fix test failure * Change timeout from period to millis	2017-04-19 09:47:53 +09:00
Gian Merlino	b2954d5fea	Better groupBy error messages and docs around resource limits. (#4162 ) * Better groupBy error messages and docs around resource limits. * Fix BufferGrouper test from datasketches. * Further clarify.	2017-04-13 10:38:53 -07:00
Xiuming Chen	7e4e5510e0	Outdated property names (#4146 ) Outdated property names?	2017-04-05 16:37:38 -07:00
Dongkyu Hwangbo	0d2e91ed50	Adding Kafka-emitter (#3860 ) * Initial commit * Apply another config: clustername * Rename variable * Fix bug * Add retry logic * Edit retry logic * Upgrade kafka-clients version to the most recent release * Make callback single object * Write documentation * Rewrite error message and emit logic * Handling AlertEvent * Override toString() * make clusterName more optional * bump up druid version * add producer.config option which lets users apply other optional config values of the kafka producer * remove potential blocking in emit() * using MemoryBoundLinkedBlockingQueue * Fixing coding convention * Remove logging every exception and just increment counting * refactoring * trivial modification * logging when callback has exception * Replace kafka-clients 0.10.1.1 with 0.10.2.0 * Resolve the problem related to classloader * adopt try statement * code reformatting * make variables final * rewrite toString	2017-04-04 14:07:43 -07:00
JackyWoo	a0f2cf05d5	Add EqualDistributionWithAffinityWorkerSelectStrategy which balance w… (#3998 ) * Add EqualDistributionWithAffinityWorkerSelectStrategy which balance work load within affinity workers. * add docs to equalDistributionWithAffinity	2017-03-25 19:15:49 -07:00
Gian Merlino	dd6c0ab509	Add SQL REGEXP_EXTRACT function; add "index" to "regex" extractionFn. (#4055 ) * Add SQL REGEXP_EXTRACT function; add "index" to "regex" extractionFn. * Fix tests.	2017-03-24 17:38:36 -07:00
Himanshu	de081c711b	RealtimeIndexTask to support alertTimeout in context (#4089 ) * RealtimeIndexTask to support alertTimeout in context and raise alert if task process exists after the timeout * move alertTimeout config to tuningConfig and document	2017-03-24 12:48:12 -07:00
Gian Merlino	b4289c0004	Remove "granularity" from IngestSegmentFirehose. (#4110 ) It wasn't doing anything useful (the sequences were being concatted, and cursor.getTime() wasn't being called) and it defaulted to Granularities.NONE. Changing it to Granularities.ALL gave me a 700x+ performance boost on a small dataset I was reindexing (2m27s to 365ms). Most of that was from avoiding making a lot of unnecessary column selectors.	2017-03-24 10:28:54 -07:00
Erik Dubbelboer	2cbc4764f8	Comparing dimensions to each other in a filter (#3928 ) Comparing dimensions to each other using a select filter	2017-03-23 18:23:46 -07:00
Gian Merlino	db15d494ca	Update docs for query filter HavingSpecs. (#4063 )	2017-03-15 13:59:09 -04:00
hzy001	c4f44c0590	Update the docs (#4059 ) Signed-off-by: Hao Ziyu <haoziyu@qiyi.com>	2017-03-15 10:32:29 -04:00
Gian Merlino	3216134f8c	SQL: Make row extractions extensible and add one for lookups. (#3991 ) This is a reopening of #3989, since that PR was merged to master prematurely and accidentally.	2017-03-13 21:56:16 -07:00
Gian Merlino	cab2e2f5d5	Add docs about filtering and indexes on numeric columns. (#4035 )	2017-03-10 12:48:59 -08:00
Gian Merlino	960769c583	SQL: Fix example INFORMATION_SCHEMA query. (#4017 )	2017-03-06 16:07:47 -08:00
Gian Merlino	4ca5270e88	Ignore chunkPeriod for groupBy v2, fix chunkPeriod for irregular periods. (#4004 ) * Ignore chunkPeriod for groupBy v2, fix chunkPeriod for irregular periods. Includes two fixes: - groupBy v2 now ignores chunkPeriod, since it wouldn't have helped anyway (its mergeResults returns a lazy sequence) and it generates incorrect results. - Fix chunkPeriod handling for periods of irregular length, like "P1M" or "P1Y". Also includes doc and test fixes: - groupBy v1 was no longer being tested by GroupByQueryRunnerTest since #3953, now it is once again. - chunkPeriod documentation was misleading due to its checkered past. Updated it to be more accurate. * Remove unused import. * Restore buffer size.	2017-03-06 12:27:02 -06:00
kaijianding	19ac1c7c2c	Add SameIntervalMergeTask for easier usage of MergeTask (#3981 ) * Add SameIntervalMergeTask for easier usage of MergeTask * fix a bug and add ut * remove same_interval_merge_sub from Task.java and remove other no needed code	2017-03-06 11:21:32 -06:00
Gian Merlino	337f3870d8	Fix TimeFormatExtractionFn getCacheKey when tz, locale are not provided. (#4007 ) * Fix TimeFormatExtractionFn getCacheKey when tz, locale are not provided. * Remove unused import. * Use defaults in cache key.	2017-03-04 17:41:59 -08:00
Gian Merlino	af5a4cce3c	SQL: Clarify approximate distinct count behavior. (#4000 )	2017-03-03 13:42:30 -08:00
Himanshu	e7e3c2dc5a	support singleThreaded flag for groupBy-v2 as well (#3992 )	2017-03-03 23:43:06 +05:30
Gian Merlino	4a56d7d8a0	SQL: Ability to generate exact distinct count queries. (#3999 )	2017-03-03 23:40:36 +05:30
Gian Merlino	3e8dbd59f8	Fix groupBy docs to reflect that 'v2' is default. (#3993 )	2017-03-02 15:13:39 -08:00
Gian Merlino	e63eefd7ff	Revert "SQL: Make row extractions extensible and add one for lookups. (#3989 )" The PR was merged to master accidentally. This reverts commit `23927a3c96`.	2017-03-01 17:06:12 -08:00
Jonathan Wei	5fb1638534	Add default configuration for select query 'fromNext' parameter (#3986 ) * Add default configuration for select query 'fromNext' parameter * PR comments * Fix PagingSpec config injection * Injection fix for test	2017-03-01 17:05:35 -08:00
Gian Merlino	23927a3c96	SQL: Make row extractions extensible and add one for lookups. (#3989 ) * SQL: Make row extractions extensible and add one for lookups. * Fix QuantileSqlAggregatorTest.	2017-03-01 17:03:43 -08:00
Aseem Bansal	b8ba237f78	Update toc.md (#3704 )	2017-03-01 14:33:39 -08:00
Fokko Driesprong	add17fa7db	Remove the metadataUpdateSpec from specfile (#3973 ) Get rid of the metadataUpdateSpec section in the json example to ingest parquet into druid. When this element is present, it will fail to start an indexing job.	2017-03-01 14:24:36 -08:00
Akash Dwivedi	94da5e80f9	Namespace optimization for hdfs data segments. (#3877 ) * NN optimization for hdfs data segments. * HdfsDataSegmentKiller, HdfsDataSegment finder changes to use new storage format.Docs update. * Common utility function in DataSegmentPusherUtil. * new static method `makeSegmentOutputPathUptoVersionForHdfs` in JobHelper * reuse getHdfsStorageDirUptoVersion in DataSegmentPusherUtil.getHdfsStorageDir() * Addressed comments. * Review comments. * HdfsDataSegmentKiller requested changes. * extra newline * Add maprfs.	2017-03-01 09:51:20 -08:00
Jonathan Wei	a08660a9ca	Support ingestion of long/float dimensions (#3966 ) * Support ingestion for long/float dimensions * Allow non-arrays for key components in indexing type strategy interfaces * Add numeric index merge test, fixes * Docs for numeric dims at ingestion * Remove unused import * Adjust docs, add aggregate on numeric dims tests * remove unused imports * Throw exception for bitmap method on numerics * Move typed selector creation to DimensionIndexer interface * unused imports * Fix * Remove unused DimensionSpec from indexer methods, check for dims first in inc index storage adapter * Remove spaces	2017-02-28 19:04:41 -08:00
kaijianding	ef6a19c81b	buildV9Directly in MergeTask and AppendTask (#3976 ) * buildV9Directly in MergeTask and AppendTask * add doc	2017-02-28 10:04:32 -08:00
praveev	c3bf40108d	One granularity (#3850 ) * Refactor Segment Granularity * Beginning of one granularity * Copy the fix for custom periods in segment-granularity over here. * Remove the custom serialization for now. * Compilation cleanup * Reformat code * Fixing unit tests * Unify to use a single iterable * Backward compatibility for rolling upgrade * Minor check style. Cosmetic changes. * Rename length and millis to duration * CR feedback * Minor changes.	2017-02-25 01:02:29 -06:00
Aseem Bansal	1098ba7a7f	Update toc.md (#3703 )	2017-02-23 09:39:06 -08:00
Jihoon Son	ebd100cbb0	Set default query granularity for null value (#3965 )	2017-02-22 17:38:43 -08:00
Jihoon Son	7200dce112	Atomic merge buffer acquisition for groupBys (#3939 ) * Atomic merge buffer acquisition for groupBys * documentation * documentation * address comments * address comments * fix test failure * Addressed comments - Add InsufficientResourcesException - Renamed GroupByQueryBrokerResource to GroupByQueryResource * addressed comments * Add takeBatch() to BlockingPool	2017-02-22 14:49:37 -06:00
Gian Merlino	e7d01b67b6	Move SQL configs to sql.md. (#3959 ) This puts all the SQL stuff in one place. It also makes life easier by pointing out that configs can be made in either common.runtime.properties or the broker runtime.properties.	2017-02-22 08:37:24 -08:00
Jonathan Wei	bc33b68b51	Use GroupBy V2 as default (#3953 ) * Use GroupBy V2 as default * Remove unused line * Change assert to exception propagation	2017-02-18 07:40:40 -08:00
michaelschiff	e5fb0e1ff5	New property for each metric that tells the StatsDEmitter to convert metric values from range 0-1 to 0-100. This (#3936 ) prevents rates and percentages expressed as Doubles (0.xx) from being rounded down to 0.	2017-02-16 13:55:56 -08:00
Gian Merlino	ca6053d045	SQL: Resolve column type conflicts in favor of newer segments. (#3930 ) * SQL: Resolve column type conflicts in favor of newer segments. Helps with schema evolution from e.g. long -> float, which is supported on the query side. * Take columns from highest timestamp instead of max segment id. * Fixes and docs.	2017-02-15 17:48:49 -08:00
Gian Merlino	16ef513c7d	SQL: Add context and contextual functions to planner. (#3919 ) * SQL: Add context and contextual functions to planner. Added support for context parameters specified as JDBC connection properties or a JSON object for SQL-over-JSON-over-HTTP. Also added features that depend on context functionality: - Added CURRENT_DATE, CURRENT_TIME, CURRENT_TIMESTAMP functions. - Added support for time zones other than UTC via a "timeZone" context. - Pass down query context to Druid queries too. Also some bug fixes: - Fix DATE handling, it was largely done incorrectly before. - Fix CAST(__time TO DATE) which should do a floor-to-day. - Fix non-equality comparisons to FLOOR(__time TO X). - Fix maxQueryCount property. * Pass down context to nested queries too.	2017-02-15 14:09:14 -08:00
Jihoon Son	a459db68b6	Fine grained buffer management for groupby (#3863 ) * Fine-grained buffer management for group by queries * Remove maxQueryCount from GroupByRules * Fix code style * Merge master * Fix compilation failure * Address comments * Address comments - Revert Sequence - Add isInitialized() to Grouper - Initialize the grouper in RowBasedGrouperHelper.Accumulator - Simple refactoring RowBasedGrouperHelper.Accumulator - Add tests for checking the number of used merge buffers - Improve docs * Revert unnecessary changes * change to visible to testing * fix misspelling	2017-02-14 12:55:54 -08:00
Gian Merlino	78b0d134ae	Require Java 8 and include some Java 8 dependencies. (#3914 ) * Require Java 8 and include some Java 8 dependencies. - Upgrade Jetty to 9.3.16.v20170120. - Upgrade DataSketches to 0.8.4. - Bundle caffeine-cache by default. - Still target Java 7 when compiling base Druid classes. * Update cluster, quickstart docs. * Remove oraclejdk7 from travis.yml.	2017-02-14 12:51:51 -08:00
DaimonPl	a2875a4d91	pre-computed HLL support for hyperUnique aggregator (#3909 )	2017-02-13 15:26:20 -08:00
Himanshu	9dfcf0763a	disable javascript execution by default (#3818 )	2017-02-13 15:11:18 -08:00
Himanshu	8cf7ad1e3a	druid.coordinator.asOverlord.enabled flag at coordinator to make it an overlord too (#3711 )	2017-02-13 15:03:59 -08:00
Jonathan Wei	ca2b04f0fd	Add long/float ColumnSelectorStrategy implementations (#3838 ) * Add long/float ColumnSelectorStrategy implementations * Address PR comments * Add String strategy with internal dictionary to V2 groupby, remove dict from numeric wrapping selectors, more tests * PR comments * Use BaseSingleValueDimensionSelector for long/float wrapping * remove unused import * Address PR comments * PR comments * PR comments * More PR comments * Fix failing calcite histogram subquery tests * ScanQuery test and comment about isInputRaw * Add outputType to extractionDimensionSpec, tweak SQL tests * Fix limit spec optimization for numerics * Add cardinality sanity checks to TopN * Fix import from merge * Add tests for filtered dimension spec outputType * Address PR comments * Allow filtered dimspecs on numerics * More comments	2017-02-08 20:39:29 -08:00
Erik Dubbelboer	2aa2fa57b5	Simple doc fix (#3907 )	2017-02-06 15:52:17 +05:30
Darío	8f4394ca49	Update segments.md (#3904 )	2017-02-03 10:31:14 -06:00
Nishant Bangarwa	a457cded28	Druid Extension to enable Authentication using Kerberos. (#3853 ) * Add extension for supporting kerberos security - This PR adds an extension for supporting druid authentication via Kerberos. - Working on the docs. * Add docs * review comments * more review comments * Block all paths by default * more review comments - use proper Oid * Allow extensions to override httpclient for integration tests * Add kerberos lock to prevent multithreaded issues. * review comment - remove enabled flag and fix router injection * Add Cookie Handling and more detailed docs * review comment - rename DruidKerberosConfig -> AuthKerberosConfig * review comments * fix travis failure on jdk7	2017-02-02 14:55:21 -06:00
Jonathan Wei	182261f713	Allow configurable temp directory for query processing (#3893 )	2017-02-02 10:22:28 -08:00
Gian Merlino	151ff6d064	flattenSpec: Document that "expr" is ignored for type "root". (#3884 )	2017-01-31 10:27:20 -08:00
Gian Merlino	ac84a3e011	SQL: Add resolution parameter, fix filtering bug with APPROX_QUANTILE (#3868 ) * SQL: Add resolution parameter to quantile agg, rename to APPROX_QUANTILE. * Fix bug with re-use of filtered approximate histogram aggregators. Also add APPROX_QUANTILE tests for filtering and running on complex columns. Includes some slight refactoring to allow tests to make DruidTables that include complex columns. * Remove unused import	2017-01-25 18:39:26 -08:00
Niketh Sabbineni	2b8d3c102b	Remove throttling on drop segments (#3736 ) * Remove throttling on drop * Throttle loadqueuepeon segment change requests to ZK * Make initial delay configurable, add docs, shutdown gracefully * Make loadqueuepeon repeat delay configurable	2017-01-20 10:02:19 -08:00
Gian Merlino	d51f5e058d	SQL: Ditch CalciteConnection layer and add DruidMeta, extension aggregators. (#3852 ) * SQL: Ditch CalciteConnection layer and add DruidMeta, extension aggregators. Switched from CalciteConnection to Planner, bringing benefits: - CalciteConnection's JDBC interface no longer sits between the SQL server (HTTP/Avatica) and Druid's query layer. Instead, the SQL servers can use Druid Sequence objects directly, reducing overhead in the query return path. - Implemented our own Planner-based Avatica Meta, letting us control connection timeouts and connection / statement limits. The previous CalciteConnection-based implementation didn't have any limits or timeouts. - The Planner interface lets us override the operator table, opening up SQL language extensions. This patch includes two: APPROX_COUNT_DISTINCT in core, and a QUANTILE aggregator in the druid-histogram extension. Also: - Added INFORMATION_SCHEMA metadata schema. - Added tests for Unicode literals and escapes. * Verify statement is actually open before closing it. * More detailed INFORMATION_SCHEMA docs.	2017-01-19 16:32:20 -08:00
kaijianding	33ae9dd485	streaming version of select query (#3307 ) * streaming version of select query * use columns instead of dimensions and metrics;prepare for valueVector;remove granularity * respect query limit within historical * use constant * fix thread name corrupted bug when using jetty qtp thread rather than processing thread while working with SpecificSegmentQueryRunner * add some test for scan query * add scan query document * fix merge conflicts * add compactedList resultFormat, this format is better for json ser/der * respect query timeout * respect query limit on broker * use static consts and remove unused code	2017-01-19 16:09:53 -06:00
David Lim	ff52581bd3	IndexTask improvements (#3611 ) * index task improvements * code review changes * add null check	2017-01-18 14:24:37 -08:00
Fokko Driesprong	31bea380eb	Updated Apache Zookeeper to the latest stable version (#3841 )	2017-01-12 13:39:29 -08:00
Gian Merlino	e86859b228	SQL support for nested groupBys. (#3806 ) * SQL support for nested groupBys. Allows, for example, doing exact count distinct by writing: SELECT COUNT(*) FROM (SELECT DISTINCT col FROM druid.foo) Contrast with approximate count distinct, which is: SELECT COUNT(DISTINCT col) FROM druid.foo Add deeply-nested groupBy docs, tests, and maxQueryCount config. * Extract magic constants into statics. * Rework rules to put preconditions in the "matches" method.	2017-01-11 18:32:53 -08:00
Gian Merlino	76620615a1	Properly respect the enableAvatica and enableJsonOverHttp options. (#3834 )	2017-01-11 14:43:34 -06:00
Jihoon Son	c099977a5b	Add an option to SearchQuery to choose a search query execution strategy (#3792 ) * Add an option to SearchQuery to choose a search query execution strategy. Supported strategies are 1) Index-only query execution 2) Cursor-based scan 3) Auto: choose an efficient strategy for a given query * Add SearchStrategy and SearchQueryExecutor * Address comments * Rename strategies and set UseIndexesStrategy as the default strategy * Add a cost-based planner for auto strategy * Add document * Fix code style * apply code style * apply comments	2017-01-10 18:04:20 -08:00
Vinh Tran	dddeae813a	Update caching.md typo (#3824 ) * Update caching.md Typo of Command vs Comma * Update index.md Fixing `Command` typo	2017-01-06 12:14:07 -08:00
Yuusaku Taniguchi	02519d5b64	Exhibitor Support (#3664 ) * allow JsonConfigTesterBase to treat the fields of collections * [Feature] Exhibitor Support (#3664) This patch provides the integration of Druid & Netflix Exhibitor. Druid currently uses Apache Curator as ZooKeeper client. Curator can be integrated with Exhibitor to achieve a live/updating list of the ZooKeeper ensemble. This patch enables Druid to use this feature.	2017-01-02 09:15:36 -08:00
Himanshu	4ca3b7f1e4	overlord helpers framework and tasklog auto cleanup (#3677 ) * overlord helpers framework and tasklog auto cleanup * review comment changes * further review comments addressed	2016-12-21 15:18:55 -08:00
Nishant	f576a0ff14	Contrib Extension for Ambari Metrics Emitter (#3767 ) * Contrib Extension for Ambari Metrics Emitter extension to enable druid to send metrics to ambari metrics server (https://cwiki.apache.org/confluence/display/AMBARI/Metrics) review comments switch to public repo * review comments * add docs * fix pom version * Add link for doc page in extensions.md * remove unused imports * review comments review comments remove unused dependency review comment	2016-12-19 11:12:47 -08:00
Nishant	35160e5595	Add metrics for Query Count statistics (#3470 ) * Add metrics for Query Count statistics This PR adds a new metrics monitor “QueryCountStatsMonitor” which emits three new metrics - 1) query/success/count - number of successful queries 2) query/failed/count - number of failed queries 3) query/interrupted/count - number of interrupted/timedout queries fix bindings * make fields final * fix imports * AsyncQueryForwardingServlet implement QueryStatsProvider * remove unused import	2016-12-19 09:47:58 -08:00
David Lim	8eee259629	add documentation on segments generated (#3785 )	2016-12-19 09:41:47 -08:00
Dongkyu Hwangbo	da007ca3c2	Replace caravel with superset (#3780 )	2016-12-16 20:47:52 -08:00
Gian Merlino	dd63f54325	Built-in SQL. (#3682 )	2016-12-16 17:15:59 -08:00
Jonathan Wei	2bfcc8a592	First and Last Aggregator (#3566 ) * add first and last aggregator * add test and fix * moving around * separate aggregator valueType * address PR comment * add finalize inner query and adjust v1 inner indexing * better test and fixes * java-util import fixes * PR comments * Add first/last aggs to ITWikipediaQueryTest	2016-12-16 15:26:40 -08:00
Nishant	8cfcb95fbc	Add Filtered and Composing request loggers (#3469 ) * Add Filtered and Composing request loggers Add Filtered and Composite Request loggers - enables users to filter request logs for slow queries. fix test * review comments * review comment * remove unused import	2016-12-16 11:18:32 -08:00
Himanshu	ed322a4beb	remove size from default analysisTypes list for segmentMetadata query (#3773 )	2016-12-13 18:01:21 -08:00

1945 Commits