druid

Commit Graph

Author	SHA1	Message	Date
Kevin Conaway	93fdbcb364	Change RealtimeIndexTask to use AppenderatorDriver (#5261 ) * Change RealtimeIndexTask to use AppenderatorDriver instead of RealtimePlumber. Related to #4774 * Remove unused throwableDuringPublishing * Fix usage of forbidden API * Update realtime index IT to account for not skipping older data any more * Separate out waiting on publish futures and handoff futures to avoid a race condition where the handoff timeout expires before the segment is published * #5261 Add separate AppenderatorDriverRealtimeIndexTask and revert changes to RealtimeIndexTask * #5261 Add separate AppenderatorDriverRealtimeIndexTask and revert changes to RealtimeIndexTask * #5261 Readability improvements in AppenderatorDriverRealtimeIndexTask. Combine publish and handoff futures in to single future * #5261 Add separate tuningConfig for RealtimeAppenderatorIndexTask. Revert changes to RealtimeTuningConfig * #5261 Change JSON type to realtime_appenderator to keep the same naming pattern as RealtimeIndexTask	2018-02-06 10:21:31 -08:00
Gian Merlino	9a62b02cb7	Extensions: Option to load classes from extension jars first. (#5321 ) The behavior is configurable through druid.extensions.useExtensionClassloaderFirst. It is useful when extensions want to load a dependency different from one provided by Druid, for example a different version of geoip or protobuf.	2018-02-06 16:14:03 +05:30
Gian Merlino	7e02408510	Update versions to 0.13.0-SNAPSHOT. (#5323 )	2018-02-02 12:06:38 -06:00
Clint Wylie	1fffc681d2	fix RemoteTaskRunner terminating lazy workers below autoscaler minNumWorkers value (#5310 ) * fix RemoteTaskRunner terminating lazy workers below autoscaler minNumWorkers value * add comment	2018-02-01 17:57:01 +01:00
Jihoon Son	3a69b0e513	Handle nullable taskTypes for rolling upgrade (#5309 )	2018-01-30 13:32:54 -08:00
Himanshu	59250cf19b	add taskType from announcement in HttpRemoteTaskRunnerWorkItem (#5301 )	2018-01-26 15:47:35 -08:00
Jonathan Wei	80419752b5	Add metamx emitter, http clients, and metrics packages to druid java-util (#5289 ) * Add metamx java-util emitter, http clients, and metrics packages to druid java-util * Remove metamx java-util from pom.xml files * Checkstyle fixes * Import fix * TeamCity inspection fixes * Use slf4j, move some version defs to master pom.xml * Use parent jvm-attach-api and maven-surefire-plugin versions * Add ] to log msg, suppress inspection	2018-01-24 22:10:36 +01:00
Roman Leventov	61e6878afd	Check Javadoc reference integrity (#5279 )	2018-01-22 13:51:28 -08:00
Jihoon Son	241efafbb2	Automatic compaction by coordinators (#5102 ) * Automatic compaction by coordinator * add links * skip compaction for very recent segments if they are small * fix finding search interval * fix finding search interval * fix TimelineHolder iteration * add test for newestSegmentFirstPolicy * add CompactionSegmentIterator * add numTargetCompactionSegments * add missing config * fix skipping huge shards * fix handling large number of segments per shard * fix test failure * change recursive call to loop * fix logging * fix build * fix test failure * address comments * change dataSources type * check running pendingTasks at each run * fix test * address comments * fix build * fix test * address comments * address comments * add doc for segment size optimization * address comment	2018-01-13 13:52:37 +09:00
Roman Leventov	8877ce38d6	Enforce modifier order with Checkstyle (#5246 )	2018-01-11 09:50:42 +01:00
Gian Merlino	7b8b0a96d6	IndexTask: Add summary stats line at the end. (#5241 )	2018-01-10 14:06:54 +09:00
Himanshu	a46d34daa2	HTTP based task/worker management. (#5104 ) * just renaming of SegmentChangeRequestHistory etc * additional change history refactoring changes * WorkerTaskManager a replica of WorkerTaskMonitor * HttpServerInventoryView refactoring to extract sync code and robustification * Introducing HttpRemoteTaskRunner * Additional Worker side updates	2018-01-04 19:19:35 -08:00
Roman Leventov	579f9fbedf	Add IndexedInts.debugToString() and AbstractIndex.toString(); Add Sequence.toList() and limit() (#5175 ) * Add IndexedInts.debugToString() and AbstractIndex.toString() * Fix AppenderatorTest	2018-01-04 09:56:47 +09:00
David Lim	a7967ade4d	Support replaceExisting parameter for segments pushers (#5187 ) * support replaceExisting parameter for segments pushers * code review changes * code review changes	2018-01-03 16:13:21 -08:00
Jihoon Son	b31abd03ad	Fix timeout in RemoteTaskRunnerTest (#5191 ) * Fix timeout in RemoteTaskRunnerTest * add message for npe	2017-12-22 17:40:11 +09:00
Jihoon Son	9199d61389	Automatic pendingSegments cleanup (#5149 ) * PendingSegments cleanup * fix build * address comments * address comments * fix potential npe * address comments * fix build * fix test * fix test	2017-12-20 14:46:34 -08:00
Roman Leventov	5787d04fad	Bump Druid version to 0.12.0 (#5138 )	2017-12-15 07:37:01 -08:00
Roman Leventov	64848c7ebf	DataSegment memory optimizations (#5094 ) * Deduplicate DataSegments contents (loadSpec's keys, dimensions and metrics lists as a whole) more aggressively; use ArrayMap instead of default LinkedHashMap for DataSegment.loadSpec, because they have only 3 entries on average; prune DataSegment.loadSpec on brokers * Fix DataSegmentTest * Refinements * Try to fix * Fix the second DataSegmentTest * Nullability * Fix tests * Fix tests, unify to use TestHelper.getJsonMapper() * Revert TestUtil as ServerTestHelper, fix tests * Add newline * Fix indexing tests * Fix s3 tests * Try to fix tests, remove lazy caching of ObjectMapper in TestHelper, rename TestHelper.getJsonMapper() to makeJsonMapper() * Fix HDFS tests * Fix HdfsDataSegmentPusherTest * Capitalize constant names	2017-12-12 11:41:40 -08:00
Gian Merlino	4f5e2b4549	Fix some unemitted alerts. (#5141 )	2017-12-06 18:37:01 -08:00
Roman Leventov	a7a6a0487e	Replace IOPeon with SegmentWriteOutMedium; Improve buffer compression (#4762 ) * Replace IOPeon with OutputMedium; Improve compression * Fix test * Cleanup CompressionStrategy * Javadocs * Add OutputBytesTest * Address comments * Random access in OutputBytes and GenericIndexedWriter * Fix bugs * Fixes * Test OutputBytes.readFully() * Address comments * Rename OutputMedium to SegmentWriteOutMedium and OutputBytes to WriteOutBytes * Add comments to ByteBufferInputStream * Remove unused declarations	2017-12-04 18:04:27 -08:00
Parag Jain	7c01f77b04	Parse Batch support (#5081 ) * add parseBatch and deprecate parse method in InputRowParser add addAll method, skip max rows in memory check for it remove parse method from implementations transform transformers add string multiplier input row parser fix withParseSpec fix kafka batch indexing fix isPersistRequired comments * add unit test * make persist async * review comments	2017-12-04 16:06:16 -06:00
Jihoon Son	645de02fb2	Remove Injector from IngestSegmentFirehoseFactory (#5045 ) * Add lock check to segment list actions * fix test * remove lock check	2017-11-20 15:54:35 -08:00
Parag Jain	cb03efeb14	Kafka Index Task that supports Incremental handoffs (#4815 ) * Kafka Index Task that supports Incremental handoffs - Incrementally handoff segments when they hit maxRowsPerSegment limit - Decouple segment partitioning from Kafka partitioning, all records from consumed partitions go to a single druid segment - Support for restoring task on middle manager restarts by check pointing end offsets for segments * take care of review comments * make getCurrentOffsets call async, keep track of publishing sequence, review comments * fix setEndoffset duplicate request handling, formatting * fix unit test * backward compatibility * make AppenderatorDriverMetadata backwards compatible * add unit test * fix deadlock between persist and push executors in AppenderatorImpl * fix formatting * use persist dir instead of work dir * review comments * fix deadlock * actually fix deadlock	2017-11-17 16:05:20 -06:00
Jihoon Son	93459f748f	Fix missing intervals after compacting intervals (#5092 ) * Fix missing intervals after compacting intervals * fix build	2017-11-16 11:42:38 -08:00
Jonathan Wei	af44d1142b	Add unsecured /health endpoint, remove auth checks from isLeader (#5087 ) * Add unsecured /health endpoint, remove auth checks from isLeader * PR comments	2017-11-15 14:41:30 -06:00
Akash Dwivedi	c1538f29fc	maxQueryTimeout property in runtime properties. (#4852 ) * maxQueryTimeout property in runtime properties. * extra line * move withTimeoutAndMaxScatterGatherBytes method to QueryLifeCycle. * Fix initialize method. * remove unused import. * doc update. * some more details in doc about query failure.. * minor fix. * decorating QueryRunner to set and verify context. Added by servers. * remove whitespace.	2017-11-13 19:23:11 -06:00
Himanshu	2ecebb3173	Fix coordinator/overlord redirects when TLS is enabled (#5037 ) * Fix coordinator/overlord redirects when TLS is enabled * address review comment * fix UTs * workaround to not ignore URL instance to fix the teamcity build * update tls doc	2017-11-09 13:10:28 -08:00
Jihoon Son	c11c71ab3e	Using ImmutableDruidDataSource as a key for map and set instead of DruidDataSource (#5054 ) * use ImmutableDruidDataSource for map and set * address comments * unused import * allow returning only ImmutableDruidDataSource in MetadataSegmentManager * address comments * remove TreeSet * revert to use TreeSet	2017-11-09 16:07:58 -03:00
Roman Leventov	3541b7544b	Prohibit and remove unused declarations in the processing module (#4930 ) * Prohibit and remove unused declarations in the processing module * Fix tests * Fix integration tests * Suppress unused * Try to remove SuppressWarnings unused in VirtualColumn * Remove reset 'false positives' * Annotate CliCommandCreator as ExtensionPoint * Unused import warning instead of error in IntelliJ * Fixes * Add comment * Fix AzureBlob * Fix CloudFilesBlob * Address comments * Add Project SDK section to INTELLIJ_SETUP.md * Fix image	2017-11-09 09:27:27 -08:00
Jihoon Son	5f3c863d5e	Add compaction task (#4985 ) * Add compaction task * added doc * use combining aggregators * address comments * add support for dimensionsSpec * fix getUniqueDims and getUniqueMetics * find unique dimensionsSpec * fix compilation * add unit test * fix test * fix test * test for different dimension orderings and types, and doc for type and ordering * add control for custom ordering and type * update doc * fix compile * fix compile * add segments param * fix serde error * fix build	2017-11-03 21:55:27 -06:00
Gian Merlino	6c725a7e06	Fix havingSpec on complex aggregators. (#5024 ) * Fix havingSpec on complex aggregators. - Uses the technique from #4883 on DimFilterHavingSpec too. - Also uses Transformers from #4890, necessitating a move of that and other related classes from druid-server to druid-processing. They probably make more sense there anyway. - Adds a SQL query test. Fixes #4957. * Remove unused import.	2017-11-01 12:58:08 -04:00
Jihoon Son	e96daa2593	Fix SQLMetadataSegmentManager (#5001 )	2017-10-31 08:02:41 -07:00
Gian Merlino	0ce406bdf1	Introduce "transformSpec" at ingest-time. (#4890 ) * Introduce "transformSpec" at ingest-time. It accepts a "filter" (standard query filter object) and "transforms" (a list of objects with "name" and "expression"). These can be used to do filtering and single-row transforms without need for a separate data processing job. The "expression" fields use the same expression language as other expression-based feature. * Remove forbidden api. * Fix compile error. * Fix tests. * Some more changes. - Add nullable annotation to Firehose.nextRow. - Add tests for index task, realtime task, kafka task, hadoop mapper, and ingestSegment firehose. * Fix bad merge. * Adjust imports. * Adjust whitespace. * Make Transform into an interface. * Add missing annotation. * Switch logger. * Switch logger. * Adjust test. * Adjustment to handling for DatasourceIngestionSpec. * Fix test. * CR comments. * Remove unused method. * Add javadocs. * More javadocs, and always decorate. * Fix bug in TransformingStringInputRowParser. * Fix bad merge. * Fix ISFF tests. * Fix DORC test.	2017-10-30 17:38:52 -07:00
Gian Merlino	5fc6891404	Reduce code duplication between test ExprMacroTables. (#4979 )	2017-10-18 15:57:49 -05:00
Roman Leventov	dc7cb117a1	Refactor ColumnSelectorFactory; Rely on ColumnValueSelector's polymorphism (#4886 ) * Refactor ColumnSelectorFactory; Rely on ColumnValueSelector's polymorphism * Fix MapVirtualColumn.makeColumnValueSelector() * Minor fixes * Fix IndexGeneratorCombinerTest * DimensionSelector to return zeros when treated as numeric ColumnValueSelector * Fix IncrementalIndexTest * Fix IncrementalIndex.makeColumnSelectorFactory() * Optimize MapBasedRow.getMetric() * Fix VarianceAggregatorTest * Simplify IncrementalIndex.makeColumnSelectorFactory() * Address comments * More comments * Test	2017-10-13 21:44:17 -05:00
Jihoon Son	8d9902831e	Refactoring PrefetchableTextFilesFirehoseFactory (#4836 ) * Refactoring prefetchable firehose * Fix to read cache when prefetch is disabled * More tests * Cleanup codes * Add Fetcher * Fix test failure * Count file size * Fix test * rename generic parameter * address comments * address comments * reuse buffer * move Execs to java-util * use execs * Fix build	2017-10-13 21:39:28 -05:00
Jihoon Son	675c6c00dd	Add checkstyle and intellij rule to prohibit unnecessary qualifiers in interfaces (#4958 ) * add checkstyle and intellij rule * fix tc fail	2017-10-13 07:56:19 -07:00
Atul Mohan	c07678b143	Synchronization of lookups during startup of druid processes (#4758 ) * Changes for lookup synchronization * Refactor of Lookup classes * Minor refactors and doc update * Change coordinator instance to be retrieved by DruidLeaderClient * Wait before thread shutdown * Make disablelookups flag true by default * Update docs * Rename flag * Move executorservice shutdown to finally block * Update LookupConfig * Refactoring and doc changes * Remove lookup config constructor * Revert Lookupconfig constructor changes * Add tests to LookupConfig * Make executorservice local * Update LRM * Move ListeningScheduledExecutorService to ExecutorCompletionService * Move exception to outer block * Remove check to see future is done * Remove unnecessary assignment * Add logging	2017-10-12 21:22:24 -05:00
Jihoon Son	dfa9cdc982	Prioritized locking (#4550 ) * Implementation of prioritized locking * Fix build failure * Fix tc fail * Fix typos * Fix IndexTaskTest * Addressed comments * Fix test * Fix spacing * Fix build error * Fix build error * Add lock status * Cleanup suspicious method * Add nullables * add doInCriticalSection to TaskLockBox and revert return type of task actions * fix build * refactor CriticalAction * make replaceLock transactional * fix formatting * fix javadoc * fix build	2017-10-11 23:16:31 -07:00
Jihoon Son	56fb11ce0b	Lazy initialization for JavaScript functions (#4871 ) * Lazy initialization of JavaScript functions * Fix test failure * Fix thread-safety and postpone js conf check * Fix test fail * Fix test * Fix KafkaIndexTaskTest * Move config check	2017-10-10 21:52:42 -07:00
Gian Merlino	797b54d283	DruidLeaderClient: Throw IOException on retryable errors. (#4913 ) * DruidLeaderClient: Throw IOException on retryable errors. Fixes #4911. * Adjustments.	2017-10-06 15:12:09 -05:00
Parag Jain	535c034c06	assume scheme to be http if not present (#4912 )	2017-10-06 14:50:48 -05:00
Gian Merlino	c19cd23e94	RTR: Demote chatty log message. (#4895 ) "No worker selection strategy set." would get logged any time tryAssignTask runs in the default configuration, which is often. It also doesn't provide much value.	2017-10-03 08:16:32 -07:00
Roman Leventov	3f1009aaa1	Make Overlord auto-scaling and provisioning extensible (#4730 ) * Make AutoScaler, ProvisioningStrategy and BaseWorkerBehaviorConfig extension points; More logging in PendingTaskBasedWorkerProvisioningStrategy * Address comments and fix a bug * Extract method * debug logging * Rename BaseWorkerBehaviorConfig to WorkerBehaviorConfig and WorkerBehaviorConfig to DefaultWorkerBehaviorConfig * Fixes	2017-10-02 20:12:23 -05:00
QiuMM	6f91d9ca1e	change WorkerSelectStrategy's defaultImpl from FillCapacityWorkerSelectStrategy to EqualDistributionWorkerSelectStrategy (#4777 )	2017-10-02 16:52:41 -07:00
Jonathan Wei	5e60ccade1	Add context map to AuthenticationResult (#4870 )	2017-10-02 17:08:14 -05:00
Jihoon Son	ee7eaccbab	Better logging for SegmentAllocateAction (#4884 ) * Better logging for SegmentAllocateAction * Split methods	2017-10-02 09:29:21 -07:00
Gian Merlino	1f2074c247	Bump versions in master to 0.11.1-SNAPSHOT. (#4878 ) * Bump versions in master to 0.11.1-SNAPSHOT. * Missed a few.	2017-09-28 17:09:51 -05:00
Himanshu	f69c9280c4	remove ServerConfig from DruidNode as all information needs to be present in DruidNode serialized form (#4858 ) * remove ServerConfig from DruidNode as all information needs to be present in DruidNode serialized form * sanitize output of /druid/coordinator/v1/cluster endpoint	2017-09-28 10:40:59 -05:00
Roman Leventov	9c126e2aa9	Forbid MapMaker (#4845 ) * Forbid MapMaker * Shorter syntax * Forbid Maps.newConcurrentMap()	2017-09-27 06:49:47 -07:00
Roman Leventov	e267f3901b	Enforce Indentation with Checkstyle (#4799 )	2017-09-21 13:06:48 -07:00
Jonathan Wei	3a4a483bb0	Single auth check for authorized resource filtering (#4818 ) * Single auth check for authorized resource filtering * PR comment * PR comments	2017-09-19 21:46:08 +05:30
Jonathan Wei	c2a0e753b6	Extension points for authentication/authorization (#4271 ) * Extension points for authentication/authorization * Address some PR comments * Authorization result caching * Add unit tests for SecuritySanityCheckFilter and PreResponseAuthorizationCheckFilter * Use Set for auth caching, close outputstreams in filters * Don't close output stream on success in sanity check filter * Add ConfigResourceFilter to coordinator lookups * Fix filtering authorization check for empty resource list * HttpClient users must explicitly escalate the client * Remove response modification from PreResponseAuthorizationCheckFilter * Remove extraneous pom.xml * Fix unit test * Better lifecycle management * Rename AuthorizationManager to Authorizer * Fix authorization denials for empty supervisor list * Address some PR comments * Address more PR comments * Small cleanup * Add Jetty HttpClient wrapper to Authenticator * Remove Authorizer start/stop * Restore immutable context map in DruidConnection, UT fix * Fix/update docs * Add authorization checks to EventReceiverFirehose * Fix router authorization check failure, restore PreResponseAuthorizationFilter changes * Compile fixes * Test fixes * Update Authenticator/Authorizer doc comments * Merge fixes * PR comments * Fix test * Fix IT * More PR comments * PR comments * SSL fix	2017-09-15 23:45:48 -07:00
Egor Riashin	6f3e52b3db	Make optional Peon "stdin" check (#4760 )	2017-09-11 16:37:01 -05:00
Himanshu	834e050bc4	Use internal-discovery and http for talking to overlord/coordinator leaders (#4735 ) * Use internal-discovery and http for talking to overlord/coordinator leaders * CuratorDruidNodeDiscovery.getAllNodes() best effort 30 sec wait for cache initialization * DruidLeaderClientProvider to eagerly instantiate DruidNodeDiscovery when needed so that DruidNodeDiscovery impl cache gets initialized well in time * Revert "DruidLeaderClientProvider to eagerly instantiate DruidNodeDiscovery when needed so that DruidNodeDiscovery impl cache gets initialized well in time" This reverts commit f1a2432614ba56ddc2d55fe47e990d17fcfd6129. * add lifecycle to DruidLeaderClient to early initialize DruidNodeDiscovery so that it has its cache update well in time	2017-09-11 11:18:01 -07:00
dgolitsyn	752151f6cb	Add CachingCostBalancerStrategy (#4731 ) * Add CachingCostBalancerStrategy; Rename ServerView.ServerCallback to ServerRemovedCallback * Fix benchmark units * Style, forbidden-api, review, bug fixes * Add docs * Address comments	2017-09-08 12:23:04 -05:00
Gian Merlino	33c0928bed	Collapse worker select strategies, change default, add strong affinity. (#4534 ) * Collapse worker select strategies, change default, add strong affinity. - Change default worker select strategy to equalDistribution. It is more generally useful than fillCapacity. - Collapse the WithAffinity strategies into the regular ones. The WithAffinity strategies are retained for backwards compatibility. - Change WorkerSelectStrategy to return nullable instead of Optional. - Fix a couple of errors in the docs. * Fix test. * Review adjustments. * Remove unused imports. * Switch to DateTimes.nowUtc. * Simplify code. * Fix tests (worker assignment started off on a different foot)	2017-09-04 14:40:55 -07:00
Himanshu	06ac6678e6	DruidLeaderSelector interface for leader election and Curator based impl. (#4699 ) * DruidLeaderSelector interface for leader election and Curator based impl. DruidCoordinator/TaskMaster are updated to use the new interface. * add fake DruidNode binding in integration-tests module * add docs on DruidLeaderSelector interface * remove start/stop and keep register/unregister Listener in DruidLeaderSelector interface * updated comments on DruidLeaderSelector * cache the listener executor in CuratorDruidLeaderSelector * use same latch owner name that was used before * remove stuff related to druid.zk.paths.indexer.leaderLatchPath config * randomize the delay when giving up leadership and restarting leader latch	2017-09-01 09:49:04 -07:00
Charles Allen	bdfc6fe25e	Move common TypeReference into JacksonUtils (#4738 )	2017-08-31 13:40:16 -07:00
Gian Merlino	daf3c5f927	Add "round" option to cardinality and hyperUnique aggregators. (#4720 ) * Add "round" option to cardinality and hyperUnique aggregators. Also turn it on by default in SQL, to make math on distinct counts work more as expected. * Fix some compile errors. * Fix test. * Formatting.	2017-08-28 14:52:11 -07:00
Roman Leventov	cbd1902db8	Add forbidden-apis plugin; prohibit using system time zone (#4611 ) * Forbidden APIs WIP * Remove some tests * Restore io.druid.math.expr.Function * Integration tests fix * Add comments * Fix in SimpleWorkerProvisioningStrategy * Formatting * Replace String.format() with StringUtils.format() in RemoteTaskRunnerTest * Address comments * Fix GroupByMultiSegmentTest	2017-08-21 13:02:42 -07:00
Himanshu	74a64c88ab	internal-discovery: interfaces for announcement/discovery, curator based impls (#4634 ) * internal-discovery: interfaces for announcement/discovery, curator impls * more tests * address some review comments * more fixes * address more review comments * simplify ObjectMapper setup in CuratorDruidNodeAnnouncerAndDiscoveryTest * fix KafkaIndexTaskTest * make lookupTier overridable via RealtimeIndexTask and KafkaIndexTask context * make teamcity build happy	2017-08-16 13:07:16 -07:00
Roman Leventov	bf28d0775b	Remove QueryRunner.run(Query, responseContext) and related legacy methods (#4482 ) * Remove QueryRunner.run(Query, responseContext) and related legacy methods * Remove local var	2017-08-11 09:12:38 +09:00
Jihoon Son	d5606bc558	Passing lockTimeout as a parameter for TaskLockbox.lock() (#4549 ) * Passing lockTimeout as a parameter for TaskLockbox.lock() * Remove TIME_UNIT * Fix tc fail * Add taskLockTimeout to TaskContext * Add caution	2017-08-08 18:21:07 -07:00
Roman Leventov	486b7a2347	TaskMaster deadlock fix (#4548 ) * Stop RemoteTaskRunner's cleanupExec using TaskMaster's lifecycle, not global injected lifecycle * Prohibit starting Lifecycle twice; Make Lifecycle to reject addMaybeStartHandler() attempts in the process of stopping rather than entering deadlock * Fix Lifecycle.addMaybeStartHandler() * Remove RemoteTaskRunnerFactoryTest * Add docs * Language * Address comments * Fix RemoteTaskRunnerTestUtils	2017-08-07 14:28:43 -07:00
Roman Leventov	aa7e4ae5e4	Enforce correct spacing with Checkstyle (#4651 )	2017-08-05 10:18:25 -07:00
David Lim	dd0b84e766	Fix bugs in RTR related to blacklisting, change default worker strategy (#4619 ) * fix bugs in RTR related to blacklisting, change default worker strategy to equalDistribution * code review and additional changes * fix errorprone * code review changes	2017-08-03 10:34:45 -07:00
Gian Merlino	a9c875d746	IndexTask: Use shared groupId when "appendToExisting" is on. (#4582 ) This allows the tasks to run concurrently. Additionally, rework the partition-determining code in a couple ways: - Use a task-id based sequenceName so concurrently running append tasks do not clobber each others' segments. - Make the list of shardSpecs empty when rollup is non-guaranteed, and let allocators handle the creation of incremental shardSpecs.	2017-07-24 09:57:23 -07:00
Roman Leventov	c0beb78ffd	Enforce brace formatting with Checkstyle (#4564 )	2017-07-21 10:26:59 -05:00
Gian Merlino	38b03f56b4	IndexTask: Raise default maxTotalRows. (#4579 ) 150k is very low, given that it should only be limited by disk space rather than JVM heap.	2017-07-20 13:55:07 -07:00
Akash Dwivedi	0b85c60869	Fix issue-4539 (#4546 ) * Protect double URI encoding * removeExtraLine	2017-07-19 09:38:29 -07:00
Roman Leventov	60cdf94677	Add PMD and prohibit unnecessary fully qualified class names in code (#4350 ) * Add PMD and prohibit unnecessary fully qualified class names in code * Extra fixes * Remove extra unnecessary fully-qualified names * Remove qualifiers * Remove qualifier	2017-07-17 22:22:29 +09:00
Roman Leventov	b7203510b8	Fix RemoteTaskRunner's auto-scaling (#3768 ) * Rename ResourceManagementStrategy to ProvisioningStrategy, similarly for related classes. Make ProvisioningService non-global, created per RemoteTaskRunner instead. Add OverlordBlinkLeadershipTest. * Fix RemoteTaskRunnerFactoryTest.testExecNotSharedBetweenRunners() * Small fix * Make SimpleProvisioner and PendingProvisioner more similar in details * Fix executor name * Style fixes * Use LifecycleLock in RemoteTaskRunner	2017-07-14 09:11:39 +09:00
Chris Gavin	960cb07ea6	Fix some unnecessary use of boxed types and incorrect format strings spotted by lgtm. (#4474 ) * Remove some unnecessary use of boxed types. * Fix some incorrect format strings. * Enable IDEA's MalformedFormatString inspection. * Add a Checkstyle check for finding uses of incorrect logging packages. * Fix some incorrect usages of the metamx logger. * Bypass incorrect logger Checkstyle check where using the correct logger is not simple. * Fix some more places where the wrong number of arguments are provided to format strings. * Suppress `MalformedFormatString` inspection on legacy logging test. * Use @SuppressWarnings rather than a noinspection suppression comment. * Fix some more incorrect format strings. * Suppress some more incorrect format string warnings where the incorrect string is intentional. * Log the aggregator when closing it fails. * Remove some unneeded log lines.	2017-07-13 12:15:32 -07:00
Roman Leventov	b2865b7c7b	Make possible to start Peon without DI loading of any querying-related stuff (#4516 ) * Make QueryRunnerFactoryConglomerate injection lazy in TaskToolbox/TaskToolboxFactory * Extract QueryablePeonModule and add druid.modules.excludeList config * Typo	2017-07-12 13:18:25 -05:00
Jihoon Son	6d2df2a542	Fix duplicated locks after sync from storage (#4521 ) * Fix duplicated locks after sync from storage * Remove unnecessary table creation	2017-07-11 10:10:11 -07:00
Akash Dwivedi	5f411f14af	Timeout for LockAcquireAction (#4461 ) * Timeout for LockAcquireAction * Static inner class. * Rebase changes. * makeAlert and throw exception in case of overlapping interval. * Addressed comments. * remove unused import. * Addressed comments	2017-07-11 18:59:32 +09:00
Jihoon Son	cc20260078	Early publishing segments in the middle of data ingestion (#4238 ) * Early publishing segments in the middle of data ingestion * Remove unnecessary logs * Address comments * Refactoring the patch according to #4292 and address comments * Set the total shard number of NumberedShardSpec to 0 * refactoring * Address comments * Fix tests * Address comments * Fix sync problem of committer and retry push only * Fix doc * Fix build failure * Address comments * Fix compilation failure * Fix transient test failure	2017-07-10 22:35:36 -07:00
Jihoon Son	8ed25acc15	Fix a bug for CSVParser/DelimitedParser when empty column exists in the header row (#4504 ) * Fix a bug when empty column exists in header row * Address comments	2017-07-07 16:19:25 -07:00
Parag Jain	6e2f78f552	TLS support (#4270 )	2017-07-06 17:40:12 -07:00
Roman Leventov	9ae457f7ad	Avoid using the default system Locale and printing to System.out in production code (#4409 ) * Avoid usages of Default system Locale and printing to System.out or System.err in production code * Fix Charset in DruidKerberosUtil * Remove redundant string format in GenericIndexed * Rename StringUtils.safeFormat() to unimportantSafeFormat(); add StringUtils.format() which fails as well as String.format() * Fix testSafeFormat() * More fixes of redundant StringUtils.format() inside ISE * Rename unimportantSafeFormat() to nonStrictFormat()	2017-06-29 14:06:19 -07:00
Roman Leventov	ae900a4934	Update versions to 0.11.0-SNAPSHOT (#4483 )	2017-06-28 17:05:58 -07:00
Jihoon Son	e3c13c246a	Respect reportParseExceptions option in IndexTask.determineShardSpecs() (#4467 ) * Respect reportParseExceptions option in IndexTask.determineShardSpecs() * Fix typo	2017-06-27 10:28:22 -07:00
Roman Leventov	05d58689ad	Remove the ability to create segments in v8 format (#4420 ) * Remove ability to create segments in v8 format * Fix IndexGeneratorJobTest * Fix parameterized test name in IndexMergerTest * Remove extra legacy merging stuff * Remove legacy serializer builders * Remove ConciseBitmapIndexMergerTest and RoaringBitmapIndexMergerTest	2017-06-26 13:21:39 -07:00
Jihoon Son	b37c9b5fe0	Fix a bug of CSV/TSV parsers when extracting columns from header (#4443 ) * Reset fieldNames whenever a new file begins * Fix test failure * Fix test failure	2017-06-23 14:29:26 -07:00
Goh Wei Xiang	f68a0693f3	Allow use of non-threadsafe ObjectCachingColumnSelectorFactory (#4397 ) * Adding a flag to indicate when ObjectCachingColumnSelectorFactory need not be threadsafe. * - Use of computeIfAbsent over putIfAbsent - Replace Maps.newXXXMap() with normal instantiation - Documentations on when is thread-safe required. - Use Builders for On/OffheapIncrementalIndex * - Optimization on computeIfAbsent - Constant EMPTY DimensionsSpec - Improvement on IncrementalIndexSchema.Builder - Remove setting of default values - Use var args for metrics - Correction on On/OffheapIncrementalIndex Builders - Combine On/OffheapIncrementalIndex Builders * - Removing unused imports. * - Helper method for testing with IncrementalIndex.Builder * - Correction on javadoc. * Style fix	2017-06-16 16:04:19 -05:00
Gian Merlino	1f2afccdf8	Expressions: Add ExprMacros. (#4365 ) * Expressions: Add ExprMacros, which have the same syntax as functions, but can convert themselves to any kind of Expr at parse-time. ExprMacroTable is an extension point for adding new ExprMacros. Anything that might need to parse expressions needs an ExprMacroTable, which can be injected through Guice. * Address code review comments.	2017-06-08 09:32:10 -04:00
Roman Leventov	63a897c278	Enable most IntelliJ 'Probable bugs' inspections (#4353 ) * Enable most IntelliJ 'Probable bugs' inspections * Fix in RemoteTestNG * Fix IndexSpec's equals() and hashCode() to include longEncoding * Fix inspection errors * Extract global instance of natural().nullsFirst(); address comments * Fix * Use noinspection comments instead of SuppressWarnings on method for IntelliJ-specific inspections * Prohibit Ordering.natural().nullsFirst() using Checkstyle	2017-06-07 09:54:25 -07:00
Roman Leventov	31d33b333e	Make using implicit system Charset an error (#4326 ) * Make using implicit system charset an error * Use StringUtils.toUtf8() and fromUtf8() instead of String.getBytes() and new String() * Use English locale in StringUtils.safeFormat() * Restore comment	2017-06-05 23:57:25 -07:00
David Lim	13ecf90923	Report Kafka lag information in supervisor status report (#4314 ) * refactor lag reporting and report lag at status endpoint * refactor offset reporting logic to fetch offsets periodically vs. at request time * remove JavaCompatUtils * code review changes * code review changes	2017-06-05 13:26:25 -07:00
Slim	a2584d214a	Delegate creation of segmentPath/LoadSpec to DataSegmentPushers and add S3a support (#4116 ) * Adding s3a schema and s3a implem to hdfs storage module. * use 2.7.3 * use segment pusher to make loadspec * move getStorageDir and makeLoad spec under DataSegmentPusher * fix uts * fix comment part1 * move to hadoop 2.8 * inject deep storage properties * set version to 2.7.3 * fix build issue about static class * fix comments * fix default hadoop default coordinate * fix create filesystem * downgrade aws sdk * bump the version	2017-06-04 00:55:09 -06:00
Jihoon Son	f876246af7	Rename FiniteAppenderatorDriver to AppenderatorDriver (#4356 )	2017-06-03 00:48:44 +09:00
Jihoon Son	1150bf7a2c	Refactoring Appenderator Driver (#4292 ) * Refactoring Appenderator 1) Added publishExecutor and handoffExecutor for background publishing and handing segments off 2) Change add() to not move segments out in it * Address comments 1) Remove publishTimeout for KafkaIndexTask 2) Simplifying registerHandoff() 3) Add incremental handoff test * Remove unused variable * Add persist() to Appenderator and more tests for AppenderatorDriver * Remove unused imports * Fix strict build * Address comments	2017-06-02 07:09:11 +09:00
chaoqiang	5fc4abcf71	fix equalDistribution worker select strategy (#4318 ) * fix equalDistribution worker select strategy * replace anonymous Comparator * keep previous version sorting comment * fix code style * update comment * move JsonProperty	2017-05-25 13:30:42 +09:00
Gian Merlino	adeecc0e72	Add /isLeader call to overlord and coordinator. (#4282 ) This is useful for putting them behind load balancers or proxies, as it lets the load balancer know which server is currently active through an http health check. Also makes the method naming a little more consistent between coordinator and overlord code.	2017-05-18 20:46:13 -05:00
Jihoon Son	733dfc9b30	Add PrefetchableTextFilesFirehoseFactory for cloud storage types (#4193 ) * Add PrefetcheableTextFilesFirehoseFactory * fix comment * exception handling * Fix wrong json property * Remove ReplayableFirehoseFactory and fix misspelling * Defer object initialization * Add a temporaryDirectory parameter to FirehoseFactory.connect() * fix when cache and fetch are disabled * Address comments * Add more test * Increase timeout for test * Add wrapObjectStream * Move methods to Firehose from PrefetchableFirehoseFactory * Cleanup comment * add directory listing to s3 firehose * Rename a variable * Addressing comments * Update document * Support disabling prefetch * Fix race condition * Add fetchLock * Remove ReplayableFirehoseFactoryTest * Fix compilation error * Fix test failure * Address comments * Add default implementation for new method	2017-05-18 15:37:18 +09:00
Himanshu	daa8ef8658	Optional long-polling based segment announcement via HTTP instead of Zookeeper (#3902 ) * Optional long-polling based segment announcement via HTTP instead of Zookeeper * address review comments * make endpoint /druid-internal/v1 instead of /druid/internal so that jetty qos filters can be configured easily when needed * update segment callback initialization to be called only after first segment list fetch has been succeeded from all servers * address review comments * remove size check not required anymore as only segment servers announce themselves and not all peon processes * announce segment server on historical only after cached segments are loaded * fix checkstyle errors	2017-05-17 16:31:58 -05:00
Roman Leventov	b7a52286e8	Make @Override annotation obligatory (#4274 ) * Make MissingOverride an error * Make travis script to fail fast * Add missing Override annotations * Comment	2017-05-16 13:30:30 -05:00
Benedict Jin	e823085866	Improve `collection` related things that reusing a immutable object instead of creating a new object (#4135 )	2017-05-17 01:38:51 +09:00
Jihoon Son	50a4ec2b0b	Add support for headers and skipping thereof for CSV and TSV (#4254 ) * initial commit * small fixes * fix bug * fix bug * address code review * more cr * more cr * more cr * fix * Skip head rows for CSV and TSV * Move checking skipHeadRows to FileIteratingFirehose * Remove checking null iterators * Remove unused imports * Address comments * Fix compilation error * Address comments * Add more tests * Add a comment to ReplayableFirehose * Addressing comments * Add docs and fix typos	2017-05-15 22:57:31 -07:00
Roman Leventov	1ebfa22955	Update Error prone configuration; Fix bugs (#4252 ) * Make Errorprone the default compiler * Address comments * Make Error Prone's ClassCanBeStatic rule a error * Preconditions allow only %s pattern * Fix DruidCoordinatorBalancerTester * Try to give the compiler more memory * Remove distribution module activation on jdk 1.8 because only jdk 1.8 is used now * Don't show compiler warnings * Try different travis script * Fix travis.yml * Make Error Prone optional again * For error-prone compiler * Increase compiler's maxmem * Don't run Error Prone for benchmarks because of OOM * Skip install step in Travis * Remove MetricHolder.writeToChannel() * In travis.yml, check compilation before tests, because it may fail faster	2017-05-12 15:55:17 +09:00
Himanshu	462f6482df	optionally add extensions to explicitly specified hadoopContainerClassPath (#4230 ) * optionally add extensions to explicitly specified hadoopContainerClassPath * note extensions always pushed in hadoop container when druid.extensions.hadoopContainerDruidClasspath is not provided explicitly	2017-05-08 14:24:14 -05:00
Gian Merlino	f0fd8ba191	Add supervisors to overlord console. (#4248 )	2017-05-04 11:13:12 -07:00
Gian Merlino	2ca7b00346	Update versions to 0.10.1-SNAPSHOT. (#4191 )	2017-04-20 18:12:28 -07:00
Roman Leventov	15f3a94474	Copy closer into Druid codebase (fixes #3652 ) (#4153 )	2017-04-10 09:38:45 +09:00
Parag Jain	7e0d4c9555	secure supervisor endpoints (#3985 )	2017-04-05 16:42:32 -07:00
JackyWoo	a0f2cf05d5	Add EqualDistributionWithAffinityWorkerSelectStrategy which balance w… (#3998 ) * Add EqualDistributionWithAffinityWorkerSelectStrategy which balance work load within affinity workers. * add docs to equalDistributionWithAffinity	2017-03-25 19:15:49 -07:00
Himanshu	de081c711b	RealtimeIndexTask to support alertTimeout in context (#4089 ) * RealtimeIndexTask to support alertTimeout in context and raise alert if task process exists after the timeout * move alertTimeout config to tuningConfig and document	2017-03-24 12:48:12 -07:00
Gian Merlino	b4289c0004	Remove "granularity" from IngestSegmentFirehose. (#4110 ) It wasn't doing anything useful (the sequences were being concatted, and cursor.getTime() wasn't being called) and it defaulted to Granularities.NONE. Changing it to Granularities.ALL gave me a 700x+ performance boost on a small dataset I was reindexing (2m27s to 365ms). Most of that was from avoiding making a lot of unnecessary column selectors.	2017-03-24 10:28:54 -07:00
Zhihui Jiao	6febcd9f24	Fix IngestSegmentFirehoseFactory (#4069 )	2017-03-17 16:57:25 -06:00
Parag Jain	c155d9a5e9	increase kill timeout (#4002 )	2017-03-08 09:00:34 -08:00
kaijianding	19ac1c7c2c	Add SameIntervalMergeTask for easier usage of MergeTask (#3981 ) * Add SameIntervalMergeTask for easier usage of MergeTask * fix a bug and add ut * remove same_interval_merge_sub from Task.java and remove other no needed code	2017-03-06 11:21:32 -06:00
Roman Leventov	ea1f5b7954	LifecycleLock for better synchronization in lifecycled objects (#3964 ) * Introduce LifecycleLock * Add LifecycleLockTest * Rename LifecycleLock.release() to exitStart() * Rewrite LifecycleLock using AbstractQueuedSynchronizer for more safety, added tests * Add LifecycleLock.exitStop() and reset() * Add LifecycleLock.awaitStarted(timeout) * Braces * Fix	2017-03-02 12:22:57 -08:00
Akash Dwivedi	94da5e80f9	Namespace optimization for hdfs data segments. (#3877 ) * NN optimization for hdfs data segments. * HdfsDataSegmentKiller, HdfsDataSegment finder changes to use new storage format.Docs update. * Common utility function in DataSegmentPusherUtil. * new static method `makeSegmentOutputPathUptoVersionForHdfs` in JobHelper * reuse getHdfsStorageDirUptoVersion in DataSegmentPusherUtil.getHdfsStorageDir() * Addressed comments. * Review comments. * HdfsDataSegmentKiller requested changes. * extra newline * Add maprfs.	2017-03-01 09:51:20 -08:00
praveev	5ccfdcc48b	Fix testDeadlock timeout delay (#3979 ) * No more singleton. Reduce iterations * Granularities * Fix the delay in the test * Add license header * Remove unused imports * Lot more unused imports from all the rearranging * CR feedback * Move javadoc to constructor	2017-02-28 12:51:41 -06:00
kaijianding	ef6a19c81b	buildV9Directly in MergeTask and AppendTask (#3976 ) * buildV9Directly in MergeTask and AppendTask * add doc	2017-02-28 10:04:32 -08:00
Parag Jain	469ae374a3	add kill task link on console (#3974 ) * add kill task link on console * refresh after kill	2017-02-25 14:58:16 +05:30
praveev	c3bf40108d	One granularity (#3850 ) * Refactor Segment Granularity * Beginning of one granularity * Copy the fix for custom periods in segment-granularity over here. * Remove the custom serialization for now. * Compilation cleanup * Reformat code * Fixing unit tests * Unify to use a single iterable * Backward compatibility for rolling upgrade * Minor check style. Cosmetic changes. * Rename length and millis to duration * CR feedback * Minor changes.	2017-02-25 01:02:29 -06:00
David Lim	3c54fc912a	fix numShards = -1 not being handled correctly (#3937 )	2017-02-14 18:45:38 -08:00
Himanshu	9dfcf0763a	disable javascript execution by default (#3818 )	2017-02-13 15:11:18 -08:00
Himanshu	8cf7ad1e3a	druid.coordinator.asOverlord.enabled flag at coordinator to make it an overlord too (#3711 )	2017-02-13 15:03:59 -08:00
Parag Jain	8e31a465ad	report hand off count finite appenderator driver (#3925 )	2017-02-13 10:41:24 -08:00
Gian Merlino	12317fd001	Bump version to 0.10.0-SNAPSHOT. (#3913 )	2017-02-06 17:54:35 -08:00
DaimonPl	93b71e265e	Extract HLL related code to separate module (#3900 )	2017-02-03 09:45:11 -08:00
Parag Jain	1aabb45a09	auto reset option for Kafka Indexing service (#3842 ) * auto reset option for Kafka Indexing service in case message at the offset being fetched is not present anymore at kafka brokers * review comments * review comments * reverted last change * review comments * review comments * fix typo	2017-02-02 14:57:45 -06:00
David Lim	ff52581bd3	IndexTask improvements (#3611 ) * index task improvements * code review changes * add null check	2017-01-18 14:24:37 -08:00
Jihoon Son	d80bec83cc	Enable auto license checking (#3836 ) * Enable license checking * Clean duplicated license headers	2017-01-10 18:13:47 -08:00
Charles Allen	229559b46a	Make TaskLockbox's ReentrantLock fair (#3828 )	2017-01-07 12:34:47 -08:00
Himanshu	4ca3b7f1e4	overlord helpers framework and tasklog auto cleanup (#3677 ) * overlord helpers framework and tasklog auto cleanup * review comment changes * further review comments addressed	2016-12-21 15:18:55 -08:00
Gian Merlino	6440ddcbca	Fix #3795 (Java 7 compatibility). (#3796 ) * Fix #3795 (Java 7 compatibility). Also introduce Animal Sniffer checks during build, which would have caught the original problems. * Add Animal Sniffer on caffeine-cache for JDK8.	2016-12-21 10:19:13 -08:00
Roman Leventov	70e83bea6d	Fix PathChildrenCache's ExecutorService leak (#3726 ) * Fix PathChildrenCache's executorService leak in Announcer, CuratorInventoryManager and RemoteTaskRunner * Use a single ExecutorService for all workerStatusPathChildrenCaches in RemoteTaskRunner	2016-12-07 13:00:10 -08:00
Gian Merlino	4e67dd28c0	RemoteTaskRunnerConfig: Fix Guice error on startup. (#3737 )	2016-12-06 00:19:53 +05:30
Charles Allen	27ab23ef44	Don't update segment metadata if archive doesn't move anything (#3476 ) * Don't update segment metadata if archive doesn't move anything * Fix restore task to handle potential null values * Don't try to update empty metadata * Address review comments * Move to druid-io java-util	2016-12-01 07:49:28 -08:00
Niketh Sabbineni	2640d170c3	Blacklist workers if they fail for too many times (#3643 ) * Blacklist workers if they fail for too many times * Adding documentation * Changing to timeout to period and updating docs * 1. Add configurable maxPercentageBlacklistWorkers 2. Rename variable * Change maxPercentageBlacklistWorkers to double * Remove thread.sleep	2016-11-29 12:38:56 +05:30
Roman Leventov	c070b4a816	Fix concurrency defects, remove unnecessary volatiles (#3701 )	2016-11-22 16:42:28 -08:00
Roman Leventov	7b56cec3b9	Fix resource leaks (#3702 )	2016-11-18 21:21:36 +05:30
Gian Merlino	bcd20441be	Make buildV9Directly the default. (#3688 )	2016-11-14 09:29:32 -08:00
Roman Leventov	fbbb55f867	Update emitter dependency to 0.4.0 and emit "version" dimension for all druid metrics (#3679 ) * Update emitter dependency to 0.4.0 and emit "version" dimension for all druid metrics, not only query metrics * Remove unused imports * Use empty string instead of "testing-version" as a version placeholder	2016-11-11 17:17:27 -06:00
Himanshu	b76b3f8d85	reset-cluster command to clean up druid state stored on metadata and deep storage (#3670 )	2016-11-09 11:07:01 -06:00
Akash Dwivedi	4b3bd8bd63	Migrating java-util from Metamarkets. (#3585 ) * Migrating java-util from Metamarkets. * checkstyle and updated license on java-util files. * Removed unused imports from whole project. * cherry pick metamx/java-util@826021f. * Copyright changes on java-util pom, address review comments.	2016-10-21 14:57:07 -07:00
Parag Jain	1e79a1be82	fix useExplicitVersion (#3559 )	2016-10-10 14:28:06 -05:00
Akash Dwivedi	078de4fcf9	Use explicit version from HadoopIngestionSpec. (#3554 )	2016-10-07 13:59:14 -07:00
Parag Jain	e419407eba	handle supervisor spec metadata failures (#3456 ) close kafka consumer in case supervisor start fails	2016-10-04 10:15:28 -07:00
Gian Merlino	40f2fe7893	Bump versions to 0.9.3-SNAPSHOT (#3524 )	2016-09-29 13:53:32 -07:00
David Lim	ca9114b41b	add supervisor reset API (#3484 ) * add supervisor reset API * CR doc changes and kill running tasks / clear offsets from supervisor	2016-09-22 17:51:06 -07:00
Gian Merlino	27bd5cb13a	Add forceExtendableShardSpecs option to Hadoop indexing, IndexTask. (#3473 ) Fixes #3241.	2016-09-21 13:40:04 -06:00
Gian Merlino	7a2a4bc6de	JavaScript: Disable now affects worker selection and router strategy too. (#3458 )	2016-09-13 16:37:42 -07:00
Dave Li	c4e8440c22	Adds long compression methods (#3148 ) * add read * update deprecated guava calls * add write and vsizeserde * add benchmark * separate encoding and compression * add header and reformat * update doc * address PR comment * fix buffer order * generate benchmark files * separate encoding strategy and format * fix benchmark * modify supplier write to channel * add float NONE handling * address PR comment * address PR comment 2	2016-08-30 16:17:46 -07:00
Nishant	4c2b8d29d3	Make RTR assign pending tasks by insertion order (#3405 )	2016-08-30 12:22:44 -07:00
Gian Merlino	2f46effc8e	FileTaskLogsTest: Throw unthrown exception. (#3352 )	2016-08-11 09:40:28 -07:00
Himanshu	03cfcf002b	fix the race described in #3174 (#3205 )	2016-08-10 11:29:50 -07:00
kaijianding	50d52a24fc	ability to not rollup at index time, make pre aggregation an option (#3020 ) * ability to not rollup at index time, make pre aggregation an option * rename getRowIndexForRollup to getPriorIndex * fix doc misspelling * test query using no-rollup indexes * fix benchmark fail due to jmh bug	2016-08-02 11:13:05 -07:00
David Lim	d5ed3f1347	change expected response from ACCEPTED to OK (#3280 )	2016-07-23 19:48:30 -07:00
Gian Merlino	06624c40c0	Share query handling between Appenderator and RealtimePlumber. (#3248 ) Fixes inconsistent metric handling between the two implementations. Formerly, RealtimePlumber only emitted query/segmentAndCache/time and query/wait and Appenderator only emitted query/partial/time and query/wait (all per sink). Now they both do the same thing: - query/segmentAndCache/time, query/segment/time are the time spent per sink. - query/cpu/time is the CPU time spent per query. - query/wait/time is the executor waiting time per sink. These generally match historical metrics, except segmentAndCache & segment mean the same thing here, because one Sink may be partially cached and partially uncached and we aren't splitting that out.	2016-07-19 22:15:13 -05:00
Hyukjin Kwon	55e7a52475	Replace deprecated usage for StringInputRowParser and JSONParseSpec (#3215 )	2016-07-14 09:19:17 -07:00
Gian Merlino	ea03906fcf	Configurable compressRunOnSerialization for Roaring bitmaps. (#3228 ) Defaults to true, which is a change in behavior (this used to be false and unconfigurable).	2016-07-08 10:24:19 +05:30
Xavier Léauté	485e381387	remove datasource from hadoop output path (#3196 ) fixes #2083, follow-up to #1702	2016-06-29 08:53:45 -07:00
Hyukjin Kwon	45f553fc28	Replace the deprecated usage of NoneShardSpec (#3166 )	2016-06-25 10:27:25 -07:00
Charles Allen	6be18376c0	Make forking task runner have more informative thread names during the long-blocking part (#3172 ) * Make forking task runner have more informative thread names during the long-blocking part * Make string.format do the work	2016-06-24 08:56:01 -07:00
Gian Merlino	ebf890fe79	Update master version to 0.9.2-SNAPSHOT. (#3133 )	2016-06-13 13:10:38 -07:00
David Lim	5a3db634ff	add synchronization to SupervisorManager (#3077 )	2016-06-07 00:29:23 -06:00
David Lim	a2290a8f05	support seamless config changes (#3051 )	2016-06-03 13:50:19 -07:00
Charles Allen	474286bbce	Make TaskMaster giant lock fair (#3050 )	2016-06-02 12:10:40 -07:00
David Lim	3ef24c03b3	Validate X-Druid-Task-Id header in request/response and support retrying on outdated TaskLocation information, add KafkaIndexTaskClient unit tests (#3006 ) * validate X-Druid-Task-Id header in request and add header to response * modify KafkaIndexTaskClient to take a TaskLocationProvider as the TaskLocation may not remain constant	2016-05-25 22:05:18 -07:00
Charles Allen	15ccf451f9	Move QueryGranularity static fields to QueryGranularities (#2980 ) * Move QueryGranularity static fields to QueryGranularityUtil * Fixes #2979 * Add test showing #2979 * change name to QueryGranularities	2016-05-17 16:23:48 -07:00
Charles Allen	eaaad01de7	[QTL] Datasource as lookupTier (#2955 ) * Datasource as lookup tier * Adds an option to let indexing service tasks pull their lookup tier from the datasource they are working for. * Fix bad docs for lookups lookupTier * Add Datasource name holder * Move task and datasource to be pulled from Task file * Make LookupModule pull from bound dataSource * Fix test * Fix code style on imports * Fix formatting * Make naming better * Address code comments about naming	2016-05-17 15:44:42 -07:00
David Lim	b489f63698	Supervisor for KafkaIndexTask (#2656 ) * supervisor for kafka indexing tasks * cr changes	2016-05-04 23:13:13 -07:00
Gian Merlino	f8ddfb9a4b	Split SegmentInsertAction and SegmentTransactionalInsertAction for backwards compat. (#2922 ) Fixes #2912.	2016-05-04 13:54:34 -07:00
Himanshu	50065c8288	fix spurious failure of RTR concurrency test (#2915 )	2016-05-04 10:30:20 -07:00
Charles Allen	3f71a4a302	Fix missing log arguments in PendingTaskBasedWorkerResourceManagementStrategy (#2898 )	2016-04-28 18:15:41 -07:00
Parag Jain	0d745ee120	Basic authorization support in Druid (#2424 ) - Introduce `AuthorizationInfo` interface, specific implementations of which would be provided by extensions - If the `druid.auth.enabled` is set to `true` then the `isAuthorized` method of `AuthorizationInfo` will be called to perform authorization checks - `AuthorizationInfo` object will be created in the servlet filters of specific extension and will be passed as a request attribute with attribute name as `AuthConfig.DRUID_AUTH_TOKEN` - As per the scope of this PR, all resources that needs to be secured are divided into 3 types - `DATASOURCE`, `CONFIG` and `STATE`. For any type of resource, possible actions are - `READ` or `WRITE` - Specific ResourceFilters are used to perform auth checks for all endpoints that corresponds to a specific resource type. This prevents duplication of logic and need to inject HttpServletRequest inside each endpoint. For example - `DatasourceResourceFilter` is used for endpoints where the datasource information is present after "datasources" segment in the request Path such as `/druid/coordinator/v1/datasources/`, `/druid/coordinator/v1/metadata/datasources/`, `/druid/v2/datasources/` - `RulesResourceFilter` is used where the datasource information is present after "rules" segment in the request Path such as `/druid/coordinator/v1/rules/` - `TaskResourceFilter` is used for endpoints is used where the datasource information is present after "task" segment in the request Path such as `druid/indexer/v1/task` - `ConfigResourceFilter` is used for endpoints like `/druid/coordinator/v1/config`, `/druid/indexer/v1/worker`, `/druid/worker/v1` etc - `StateResourceFilter` is used for endpoints like `/druid/broker/v1/loadstatus`, `/druid/coordinator/v1/leader`, `/druid/coordinator/v1/loadqueue`, `/druid/coordinator/v1/rules` etc - For endpoints where a list of resources is returned like `/druid/coordinator/v1/datasources`, `/druid/indexer/v1/completeTasks` etc. the list is filtered to return only the resources to which the requested user has access. In these cases, `HttpServletRequest` instance needs to be injected in the endpoint method. Note - JAX-RS specification provides an interface called `SecurityContext`. However, we did not use this but provided our own interface `AuthorizationInfo` mainly because it provides more flexibility. For example, `SecurityContext` has a method called `isUserInRole(String role)` which would be used for auth checks and if used then the mapping of what roles can access what resource needs to be modeled inside Druid either using some convention or some other means which is not very flexible as Druid has dynamic resources like datasources. Fixes #2355 with PR #2424	2016-04-28 16:50:28 -07:00
Himanshu	9669e79df2	fix misleading error log due to race in RTR and concurrency test (#2878 )	2016-04-28 10:28:00 -07:00
Nishant	c29cb7d711	add pending task based resource management strategy (#2086 )	2016-04-27 10:40:53 -07:00
Nishant	bf5e5e7b75	fix #2886 (#2887 ) Fixes https://github.com/druid-io/druid/issues/2886	2016-04-27 08:29:41 -07:00
David Lim	7641f2628f	add control and status endpoints to KafkaIndexTask (#2730 )	2016-04-21 15:34:59 -07:00
Nishant	dbf63f738f	Add ability to filter segments for specific dataSources on broker without creating tiers (#2848 ) * Add back FilteredServerView removed in `a32906c7fd` to reduce memory usage using watched tiers. * Add functionality to specify "druid.broker.segment.watchedDataSources"	2016-04-19 10:10:06 -07:00
Gian Merlino	08c784fbf6	KafkaIndexTask: Use a separate sequence per Kafka partition in order to make (#2844 ) segment creation deterministic. This means that each segment will contain data from just one Kafka partition. So, users will probably not want to have a super high number of Kafka partitions... Fixes #2703.	2016-04-18 22:29:52 -07:00
jon-wei	0e481d6f93	Allow filters to use extraction functions	2016-04-05 13:24:56 -07:00
Fangjin Yang	1e02eeab13	Merge pull request #2683 from metamx/default_retry Better defaults for Retry policy for task actions	2016-03-29 08:02:59 -07:00
Gian Merlino	195c9c5240	Overlord: Avoid a scary Jersey warning. Avoids the following message from being printed on Overlord startup: WARNING: Parameter 1 of type io.druid.indexing.common.actions.TaskActionHolder<T> from public <T> javax.ws.rs.core.Response io.druid.indexing.overlord.http.OverlordResource.doAction (io.druid.indexing.common.actions.TaskActionHolder<T>) is not resolvable to a concrete type	2016-03-28 19:08:56 -07:00
Fangjin Yang	c2284929dc	Merge pull request #2739 from gianm/fix-wtmtest-failure Fix handling of InterruptedException in WorkerTaskMonitor's mainLoop.	2016-03-28 14:52:10 -07:00
Gian Merlino	ee4bb96855	Fix handling of InterruptedException in WorkerTaskMonitor's mainLoop. I believe this will fix #2664.	2016-03-25 12:17:33 -07:00
Himanshu Gupta	004b00bb96	config to explicitly specify classpath for hadoop container during hadoop ingestion	2016-03-25 10:51:28 -05:00
Himanshu	00d7021291	Merge pull request #2607 from jon-wei/dim_schema Support use of DimensionSchema class in DimensionsSpec	2016-03-22 11:53:46 -05:00
Himanshu	3220b109ad	Merge pull request #2570 from binlijin/single_dimension_partitioning Single dimension hash-based partitioning	2016-03-22 11:51:06 -05:00
binlijin	bce600f5d5	Single dimension hash-based partitioning	2016-03-22 13:15:33 +08:00
jon-wei	a59c9ee1b1	Support use of DimensionSchema class in DimensionsSpec	2016-03-21 13:12:04 -07:00
Nishant	ed8f39fcfe	Better defaults for Retry policy for task actions This PR changes the retry of task actions to be a bit more aggressive by reducing the maxWait. Current defaults were 1 min to 10 mins, which lead to a very delayed recovery in case there are any transient network issues between the overlord and the peons. doc changes.	2016-03-18 11:59:55 -07:00
Charles Allen	c716af5b04	Merge pull request #2678 from metamx/fixImports Fix some google related imports	2016-03-17 11:53:16 -07:00
Charles Allen	a52c6d3bee	Fix some google related imports	2016-03-17 11:03:29 -07:00
Gian Merlino	738dcd8cd9	Update version to 0.9.1-SNAPSHOT. Fixes #2462	2016-03-17 10:34:20 -07:00
Nishant	9cceff2274	Use ImmutableWorkerInfo instead of ZKWorker review comments add test for equals and hashcode	2016-03-14 11:17:15 -07:00
Himanshu	d51a0a0cf4	Merge pull request #2220 from gianm/appenderator-kafka Appenderators, DataSource metadata, KafkaIndexTask	2016-03-14 13:14:36 -05:00
Nishant	cf7f6da392	Merge pull request #2634 from gianm/stopGracefully-avoid-interrupt ThreadPoolTaskRunner: Make graceful shutdown logs less scary.	2016-03-11 16:36:10 -08:00
Charles Allen	a3f0048ea4	Merge pull request #2631 from gianm/plumbers-rpe Better logging for ParseExceptions on index aggregation, and remove unnecessary exception handling.	2016-03-11 14:22:58 -08:00
Gian Merlino	79a95f7789	WorkerTaskMonitor: stop() waits for mainLoop to exit. Fixes #2637.	2016-03-11 11:40:13 -08:00
Gian Merlino	05397a9b4f	ThreadPoolTaskRunner: Make graceful shutdown logs less scary. - It's okay to suppress InterruptedException during graceful shutdown, as tasks may use it to accelerate their own shutdown. - It's okay to ignore return statuses during graceful shutdown (which may be FAILED!) because it actually doesn't matter what they are.	2016-03-11 07:49:29 -08:00
Gian Merlino	187569e702	DataSource metadata. Geared towards supporting transactional inserts of new segments. This involves an interface "DataSourceMetadata" that allows combining of partially specified metadata (useful for partitioned ingestion). DataSource metadata is stored in a new "dataSource" table.	2016-03-10 17:41:50 -08:00
Gian Merlino	3d2214377d	Appenderatoring. Appenderators are a way of getting more control over the ingestion process than a Plumber allows. The idea is that existing Plumbers could be implemented using Appenderators, but you could also implement things that Plumbers can't do. FiniteAppenderatorDrivers help simplify indexing a finite stream of data. Also: - Sink: Ability to consider itself "finished" vs "still writable". - Sink: Ability to return the number of rows contained within the sink.	2016-03-10 17:41:50 -08:00
Gian Merlino	08284fea62	Publish test-jar for indexing-service.	2016-03-10 16:50:37 -08:00
Gian Merlino	92c828f904	Make SegmentHandoffNotifier Closeable.	2016-03-10 16:50:37 -08:00
Gian Merlino	8a11161b20	Plumbers: Move plumber.add out of try/catch for ParseException. The incremental indexes handle that now so it's not necessary. Also, add debug logging and more detailed exceptions to the incremental indexes for the case where there are parse exceptions during aggregation.	2016-03-10 16:39:26 -08:00
Charles Allen	d299540efc	Make HadoopTask load hadoop dependency classes LAST for local isolated classrunner	2016-03-10 10:18:23 -08:00
Himanshu Gupta	0402636598	configurable handoffConditionTimeout in realtime tasks for segment handoff wait	2016-03-05 10:14:54 -06:00
Gian Merlino	e9c23bf376	OverlordResource: Use getZkWorkers on RemoteTaskRunner. Restores old behavior of this api, from before #2249 when getWorkers returned ZkWorkers.	2016-03-02 17:31:34 -08:00
Fangjin Yang	80d954578d	Merge pull request #2572 from gianm/fix-rit-taskresource Fix default TaskResource for RealtimeIndexTasks.	2016-03-02 10:20:27 -08:00
Gian Merlino	acd95d3e28	TaskLocation: Add toString method. Necessary because these objects are used in log messages.	2016-03-01 17:52:06 -08:00
Gian Merlino	a355bfb7a9	Fix default TaskResource for RealtimeIndexTasks. It was supposed to be the same as the task id, but it wasn't because "makeTaskId" has a random component.	2016-03-01 16:54:22 -08:00
Björn Zettergren	2462c82c0e	New defaults for maxRowsInMemory rowFlushBoundary To bring consistency to docs and source this commit changes the default values for maxRowsInMemory and rowFlushBoundary to 75000 after discussion in PR https://github.com/druid-io/druid/pull/2457. The previous default was 500000 and it's lower now on the grounds that it's better for a default to be somewhat less efficient, and work, than to reach for the stars and possibly result in "OutOfMemoryError: java heap space" errors.	2016-03-01 13:50:28 +01:00
Charles Allen	c6803c4364	Allow specifying peon javaOpts as an array	2016-02-26 13:24:35 -08:00
Himanshu Gupta	bc156effe7	RTR has multiple threads for assignment of pending tasks now.	2016-02-26 09:27:03 -06:00
Fangjin Yang	53a5f07c14	Merge pull request #2544 from metamx/fixMaxPort Limit PortFinder to 0xFFFF	2016-02-25 17:12:53 -08:00
Fangjin Yang	143e85eaa5	Merge pull request #2419 from gianm/task-hostports Plumb task peon host/ports back out to the overlord.	2016-02-25 17:11:53 -08:00
Charles Allen	3fa7a7ebfe	Limit PortFinder to 0xFFFF	2016-02-25 08:16:40 -08:00
Charles Allen	187b788089	UnRegister port in ForkingTaskRunner	2016-02-25 08:04:25 -08:00
Gian Merlino	cf0bc905fb	Plumb task peon host/ports back out to the overlord. - Add TaskLocation class - Add registerListener to TaskRunner - Add getLocation to TaskRunnerWorkItem - Implement location tracking in existing TaskRunners - Rework WorkerTaskMonitor to do management out of a single thread so it can handle status and location updates more simply.	2016-02-24 15:13:10 -08:00
Nishant	fb7eae34ed	Merge pull request #2249 from metamx/workerExpanded Use Worker instead of ZkWorker whenever possible	2016-02-24 13:23:22 +05:30
Charles Allen	ac13a5942a	Use Worker instead of ZkWorker whenever possible * Moves last run task state information to Worker * Makes WorkerTaskRunner a TaskRunner which has interfaces to help with getting information about a Worker	2016-02-23 15:02:03 -08:00
Gian Merlino	3534483433	Better handling of ParseExceptions. Two changes: - Allow IncrementalIndex to suppress ParseExceptions on "aggregate". - Add "reportParseExceptions" option to realtime tuning configs. By default this is "false". Behavior of the counters should now be: - processed: Number of rows indexed, including rows where some fields could be parsed and some could not. - thrownAway: Number of rows thrown away due to rejection policy. - unparseable: Number of rows thrown away due to being completely unparseable (no fields salvageable at all). If "reportParseExceptions" is true then "unparseable" will always be zero (because a parse error would cause an exception to be thrown). In addition, "processed" will only include fully parseable rows (because even partial parse failures will cause exceptions to be thrown). Fixes #2510.	2016-02-23 10:11:43 -08:00
Bingkun Guo	499288ff4b	Merge pull request #2509 from metamx/hadoopIsolatorTest Add hadoop classloader isolation tests for HadoopTask	2016-02-19 14:23:22 -06:00
Fangjin Yang	a3c29b91cc	Merge pull request #2505 from gianm/rt-exceptions Harmonize realtime indexing loop across the task and standalone nodes.	2016-02-19 11:23:14 -08:00
Charles Allen	9dff0e5dbd	Add hadoop classloader isolation tests for HadoopTask	2016-02-19 11:15:53 -08:00
Fangjin Yang	ddf913d626	Merge pull request #2508 from gianm/ftr-shutdown-logging ForkingTaskRunner: Better logging during orderly shutdown.	2016-02-19 10:02:24 -08:00
Gian Merlino	c0c6cf77fa	ForkingTaskRunner: Better logging during orderly shutdown.	2016-02-19 09:17:16 -08:00
Gian Merlino	243ac5399b	Harmonize realtime indexing loop across the task and standalone nodes. - Both now catch ParseExceptions on plumber.add (see https://groups.google.com/d/topic/druid-user/wmiRDvx2RvM/discussion) - Standalone now treats IndexSizeExceededException as fatal (previously only the task did)	2016-02-19 07:34:15 -08:00
Charles Allen	87752be740	Make HadoopTask's classloader a single one	2016-02-18 20:58:09 -08:00
Andrés Gomez	07d714b1b5	Fixed equal distribution strategy when a disabled middleManager exists with the same currCapacityUsed.	2016-02-17 08:40:42 +01:00
Himanshu	5779b32742	Merge pull request #2439 from metamx/fix2435 Make QuotableWhiteSpaceSplitter able to take JSON	2016-02-11 13:14:43 -06:00
Charles Allen	40ade32a1f	Fix dependencies. * Don't put druid***selfcontained.jar at the end of the hadoop isolated classpath Add `<scope>provided</scope>` to prevent repeated dependency inclusion in the extension directories	2016-02-11 07:30:14 -08:00
Charles Allen	3a6452c6d4	Make QuotableWhiteSpaceSplitter able to take json * Fixes #2435	2016-02-10 16:42:14 -08:00
Xavier Léauté	91f23583f5	Merge pull request #2436 from gianm/mm-less-suppressey Harmonize znode writing code in RTR and Worker.	2016-02-10 16:11:30 -08:00
Gian Merlino	fa92b77f5a	Harmonize znode writing code in RTR and Worker. - Throw most exceptions rather than suppressing them, which should help detect problems. Continue suppressing exceptions that make sense to suppress. - Handle payload length checks consistently, and improve error message. - Remove unused WorkerCuratorCoordinator.announceTaskAnnouncement method. - Max znode length should be int, not long. - Add tests.	2016-02-10 14:52:00 -08:00
Charles Allen	2bde8b1d68	Make hadoop classpath isolation more explicit * Fixes #2428	2016-02-10 12:09:17 -08:00
Charles Allen	a0728fa854	Allow ScalingStats to be null * Fixes #2378	2016-02-02 18:01:01 -08:00
Parag Jain	7853a9cc41	clean up TaskLifecycleTest	2016-01-31 11:19:20 -06:00
Gian Merlino	5fd4b79373	RealtimeIndexTask: Fix NPE caused by calling stopGracefully before a firehose had been connected.	2016-01-29 11:20:23 -08:00
Gian Merlino	c4fde52160	Fix 'graceful shutdown aborted' log message in ThreadPoolTaskRunner.	2016-01-29 11:07:17 -08:00
Nishant	dcb7830330	Merge pull request #984 from drcrallen/thread-priority-rebase Use thread priorities. (aka set `nice` values for background-like tasks)	2016-01-21 15:02:34 +05:30
Charles Allen	66e74b1a63	Minor field name change in RemoteTaskRunnerFactory to be more descriptive * Addresses https://github.com/druid-io/druid/pull/2309#discussion_r50335081	2016-01-20 17:43:20 -08:00
Charles Allen	3152d08844	Fix overlord scheduled executor injection * Fixes https://github.com/druid-io/druid/issues/2308	2016-01-20 14:16:14 -08:00
Charles Allen	2e1d6aaf3d	Use thread priorities. (aka set `nice` values for background-like tasks) * Defaults the thread priority to java.util.Thread.NORM_PRIORITY in io.druid.indexing.common.task.AbstractTask * Each exec service has its own Task Factory which is assigned a priority for spawned task. Therefore each priority class has a unique exec service * Added priority to tasks as taskPriority in the task context. <0 means low, 0 means take default, >0 means high. It is up to any particular implementation to determine how to handle these numbers * Add options to ForkingTaskRunner * Add "-XX:+UseThreadPriorities" default option * Add "-XX:ThreadPriorityPolicy=42" default option * AbstractTask - Removed unneeded @JsonIgnore on priority * Added priority to RealtimePlumber executors. All sub-executors (non query runners) get Thread.MIN_PRIORITY * Add persistThreadPriority and mergeThreadPriority to realtime tuning config	2016-01-20 14:00:31 -08:00
Nishant	ac6c90e657	Merge pull request #1953 from metamx/taskRunnerResourceManagement Move resource management to be the responsibility of the TaskRunner	2016-01-20 22:27:47 +05:30
Jonathan Wei	df2906a91c	Merge pull request #2290 from gianm/index-merger-v9-stuff Respect buildV9Directly in PlumberSchools, so it works on standalone realtime.	2016-01-19 13:04:00 -08:00
Fangjin Yang	0c31f007fc	Merge pull request #1728 from himanshug/aggregators_in_segment_metadata Store AggregatorFactory[] in segment metadata	2016-01-19 12:55:49 -08:00
Himanshu	fe841fd961	Merge pull request #2118 from guobingkun/fix_segment_loading Fix loading segment for historical	2016-01-19 14:25:48 -06:00
Himanshu Gupta	a99aef29a1	adding aggregators to segment metadata	2016-01-19 14:23:39 -06:00
Gian Merlino	1dcf22edb7	Respect buildV9Directly in PlumberSchools, so it works on standalone realtime nodes. Also parameterize some tests to run with/without buildV9Directly: - IndexGeneratorJobTest - RealtimeIndexTaskTest - RealtimePlumberSchoolTest	2016-01-19 12:15:06 -08:00
Bingkun Guo	c4ad50f92c	Fix loading segment for historical Historical will drop a segment that shouldn't be dropped in the following scenario: Historical node tried to load segmentA, but failed with SegmentLoadingException, then ZkCoordinator called removeSegment(segmentA, blah) to schedule a runnable that would drop segmentA by deleting its files. Now, before that runnable executed, another LOAD request was sent to this historical, this time historical actually succeeded on loading segmentA and announced it. But later on, the scheduled drop-of-segment runnable started executing and removed the segment files, while historical is still announcing segmentA.	2016-01-19 10:29:49 -06:00
Himanshu Gupta	164b0aad7a	removing Map<String,Object> segmentMetadata from methods in Index[Maker/Merger] and using Metadata class instead of a Map to store segment metadata	2016-01-18 22:03:46 -06:00
Kurt Young	82ff98c2bf	add config for build v9 directly and update docs	2016-01-16 11:26:34 +08:00

1583 Commits