druid

Commit Graph

Author	SHA1	Message	Date
Clint Wylie	074a45219d	add google cloud storage InputSource for native batch (#8907 ) * add google cloud storage InputSource for native batch * rename * checkstyle * fix * fix spelling * review comments	2019-11-19 19:49:43 -08:00
Chi Cao Minh	8365bdf62a	Address security vulnerabilities (#8878 ) * Address security vulnerabilities Security vulnerabilities addressed by upgrading 3rd party libs: - Upgrade avro-ipc to 1.9.1 - sonatype-2019-0115 - Upgrade caffeine to 2.8.0 - sonatype-2019-0282 - Upgrade commons-beanutils to 1.9.4 - CVE-2014-0114 - Upgrade commons-codec to 1.13 - sonatype-2012-0050 - Upgrade commons-compress to 1.19 - CVE-2019-12402 - sonatype-2018-0293 - Upgrade hadoop-common to 2.8.5 - CVE-2018-11767 - Upgrade hadoop-mapreduce-client-core to 2.8.5 - CVE-2017-3166 - Upgrade hibernate-validator to 5.2.5 - CVE-2017-7536 - Upgrade httpclient to 4.5.10 - sonatype-2017-0359 - Upgrade icu4j to 55.1 - CVE-2014-8147 - Upgrade jackson-databind to 2.6.7.3: - CVE-2017-7525 - Upgrade jetty-http to 9.4.12: - CVE-2017-7657 - CVE-2017-7658 - CVE-2017-7656 - CVE-2018-12545 - Upgrade log4j-core to 2.8.2 - CVE-2017-5645: - Upgrade netty to 3.10.6 - CVE-2015-2156 - Upgrade netty-common to 4.1.42 - CVE-2019-9518 - Upgrade netty-codec-http to 4.1.42 - CVE-2019-16869 - Upgrade nimbus-jose-jwt to 4.41.1 - CVE-2017-12972 - CVE-2017-12974 - Upgrade plexus-utils to 3.0.24 - CVE-2017-1000487 - sonatype-2015-0173 - sonatype-2016-0398 - Upgrade postgresql to 42.2.8 - CVE-2018-10936 Note that if users are using JDBC lookups with postgres, they may need to update the JDBC jar used by the lookup extension. * Fix license for postgresql	2019-11-19 09:14:33 -08:00
Chi Cao Minh	d60978343a	Improve missing JDBC driver error for lookups (#8872 ) If the JDBC drivers are missing from the lookup extensions, throw an exception that directs the user how to resolve the issue. This change is a follow up to #8825.	2019-11-18 11:42:38 -08:00
Jihoon Son	1611792855	Add InputSource and InputFormat interfaces (#8823 ) * Add InputSource and InputFormat interfaces * revert orc dependency * fix dimension exclusions and failing unit tests * fix tests * fix test * fix test * fix firehose and inputSource for parallel indexing task * fix tc * fix tc: remove unused method * Formattable * add needsFormat(); renamed to ObjectSource; pass metricsName for reader * address comments * fix closing resource * fix checkstyle * fix tests * remove verify from csv * Revert "remove verify from csv" This reverts commit `1ea7758489`. * address comments * fix import order and javadoc * flatMap * sampleLine * Add IntermediateRowParsingReader * Address comments * move csv reader test * remove test for verify * adjust comments * Fix InputEntityIteratingReader * rename source -> entity * address comments	2019-11-15 09:22:09 -08:00
Clint Wylie	cc54b2a9df	support for array expressions in TransformSpec with ExpressionTransform (#8744 ) * transformSpec + array expressions changes: * added array expression support to transformSpec * removed ParseSpec.verify since its only use afaict was preventing transform expr that did not replace their input from functioning * hijacked index task test to test changes * remove docs about being unsupported * re-arrange test assert * unused imports * imports * fix tests * preserve types * suppress warning, fixes, add test * formatting * cleanup * better list to array type conversion and tests * fix oops	2019-11-13 11:04:37 -08:00
fst0	80dbf44fca	Add reference to druid.storage.type (#8857 ) * Add reference to `druid.storage.type` This should be in here. Without setting storage type to S3 globally it will obviously not be used, even if all other parameters are correct. * Update s3.md Add global storage parameter to knob table. * Update s3.md	2019-11-13 10:03:41 -08:00
Lucas Capistrant	a066cc5648	Fix groupMapping endpoint URIs in druid-basic-security doc (#8847 )	2019-11-12 21:12:34 +05:30
Jonathan Wei	75ea0d592a	Add more datasketches doubles sketch SQL functions (#8843 ) * Add more datasketches doubles sketch SQL postaggs * style and lgtm	2019-11-08 18:05:06 -08:00
Gian Merlino	0e8c3f74d0	SQL: EARLIEST, LATEST aggregators. (#8815 ) * SQL: EARLIEST, LATEST aggregators. I chose these names instead of FIRST, LAST because those are already reserved functions in Calcite that mean something different. I think these are also better names anyway. * Finalify. * SQL updates. * Adjust aggregator calls. * Validations, test updates. * Review docs.	2019-11-08 16:29:25 -08:00
Clint Wylie	7aafcf8bca	parallel broker merges on fork join pool (#8578 ) * sketch of broker parallel merges done in small batches on fork join pool * fix non-terminating sequences, auto compute parallelism * adjust benches * adjust benchmarks * now hella more faster, fixed dumb * fix * remove comments * log.info for debug * javadoc * safer block for sequence to yielder conversion * refactor LifecycleForkJoinPool into LifecycleForkJoinPoolProvider which wraps a ForkJoinPool * smooth yield rate adjustment, more logs to help tune * cleanup, less logs * error handling, bug fixes, on by default, more parallel, more tests * remove unused var * comments * timeboundary mergeFn * simplify, more javadoc * formatting * pushdown config * use nanos consistently, move logs back to debug level, bit more javadoc * static terminal result batch * javadoc for nullability of createMergeFn * cleanup * oops * fix race, add docs * spelling, remove todo, add unhandled exception log * cleanup, revert unintended change * another unintended change * review stuff * add ParallelMergeCombiningSequenceBenchmark, fixes * hyper-threading is the enemy * fix initial start delay, lol * parallelism computer now balances partition sizes to partition counts using sqrt of sequence count instead of sequence count by 2 * fix those important style issues with the benchmarks code * lazy sequence creation for benchmarks * more benchmark comments * stable sequence generation time * update defaults to use 100ms target time, 4096 batch size, 16384 initial yield, also update user docs * add jmh thread based benchmarks, cleanup some stuff * oops * style * add spread to jmh thread benchmark start range, more comments to benchmarks parameters and purpose * retool benchmark to allow modeling more typical heterogenous heavy workloads * spelling * fix * refactor benchmarks * formatting * docs * add maxThreadStartDelay parameter to threaded benchmark * why does catch need to be on its own line but else doesnt	2019-11-07 11:58:46 -08:00
Jad Naous	ce3c0dae4d	Add note on JDBC libs for lookups (#8825 ) * Add note on JDBC libs for lookups * Fix directory and additional "the"	2019-11-06 13:31:26 -08:00
Himanshu	5adc8212b4	add documentation for druid docker and k8s operator (#8802 ) * add documentation for druid docker and k8s operator * address review comment and add Kubernetes to spelling file	2019-11-06 12:56:21 -08:00
Tijo Thomas	27acdbd2b8	'hadoop fs' command is deprecated . The new approach is to use hdfs command . Replacing 'hadoop fs' command with 'hdfs dfs' (#8762 )	2019-11-01 04:42:10 +05:30
Giuseppe Martino	9c171e2b1f	Message rejection absolute date (#8656 ) * Add option lateMessageRejectionStartDate * Use option lateMessageRejectionStartDate * Fix tests * Add lateMessageRejectionStartDate to kafka indexing service * Update tests kafka indexing service * Fix tests for KafkaSupervisorTest * Add lateMessageRejectionStartDate to KinesisSupervisorIOConfig * Fix var name * Update documentation * Add check lateMessageRejectionStartDateTime and lateMessageRejectionPeriod, fails if both were specified.	2019-10-31 15:13:02 -07:00
Clint Wylie	3ff5e02237	remove select query (#8739 ) * remove select query * thanks teamcity * oops * oops * add back a SelectQuery class that throws RuntimeExceptions linking to docs * adjust text * update docs per review * deprecated	2019-10-30 19:29:56 -07:00
Gian Merlino	7605c23354	Remove Tranquility configs and certain doc references. (#8793 ) Since it hasn't received updates or community interest in a while, it makes sense to de-emphasize it in the distribution and most documentation (outside of simple mentions of its existence).	2019-10-30 16:30:16 -07:00
Gian Merlino	c922d2c3c9	Use bundled ZooKeeper in tutorials. (#8792 )	2019-10-30 16:17:28 -07:00
Gian Merlino	aa81253cf4	Fix typos. (#8767 )	2019-10-28 12:47:01 -07:00
Gian Merlino	b65d2ac648	Add HDFS firehose (#8754 ) * Add HDFS firehose. * Tests, support for lists of paths. * Fixups. * Update list of firehoses. * Wildcards is a word.	2019-10-28 08:07:38 -07:00
Vadim Ogievetsky	f9b94a5db1	Docs: remove self link (#8760 ) This section links to itself in the description. I tried to follow that link and spit hot tea all over my monitor from laughter.	2019-10-27 22:33:22 -07:00
Clint Wylie	09f92818d4	update druid expression docs to indicate that array functions do not work at indexing time (#8734 ) * update druid expression docs to indicate that array functions are not supported in transformSpec * fix unrelated spelling check	2019-10-24 22:04:08 -07:00
Eyal Yurman	14e33428f0	Moving Average extention: Add Sum averagers (#8511 ) * Add sum averagers. * avoid casting double to long.	2019-10-24 16:37:24 -07:00
Vadim Ogievetsky	cc3650ee3b	fix doc headers (#8729 )	2019-10-24 11:17:39 -07:00
Jihoon Son	f5b9bf5525	Cluster-wide configuration for query vectorization (#8657 ) * Cluster-wide configuration for query vectorization * add doc * fix build * fix doc * rename to QueryConfig and add javadoc * fix checkstyle * fix variable names	2019-10-23 21:44:28 +08:00
David Glasser	b453fda251	docs: clarify native batch ingestion w/ overlapping segments (#8720 ) I was confused by a paragraph in the docs that I myself wrote!	2019-10-22 21:01:56 -07:00
Jad Naous	2ab43aa688	Update tutorial-kerberos-hadoop.md (#8689 ) * Update tutorial-kerberos-hadoop.md Fix up what looks like a bad merge. * Update tutorial-kerberos-hadoop.md Fix spelling issues	2019-10-22 14:40:41 -07:00
Abhishek Radhakrishnan	42cfe679f1	Update query result timestamp to match query intervals. (#8717 )	2019-10-22 14:39:47 -07:00
Surekha	e919eccc4b	Update docs to add metadataSegment configs (#8708 ) * Add metadataSegment configs to docs * rearrange in alphabetical order	2019-10-22 01:19:36 -07:00
Kamal Gurala	3ed5f9698a	gcs prefix doc fix (#8699 )	2019-10-21 08:29:54 -07:00
Surekha	98f59ddd7e	Add `sys.supervisors` table to system tables (#8547 ) * Add supervisors table to SystemSchema * Add docs * fix checkstyle * fix test * fix CI * Add comments * Fix javadoc teamcity error * comments * fix links in docs * fix links * rename fullStatus query param to system and remove it from docs	2019-10-18 15:16:42 -07:00
Jonathan Wei	d88075237a	Add initial SQL support for non-expression sketch postaggs (#8487 ) * Add initial SQL support for non-expression sketch postaggs * Checkstyle, spotbugs * checkstyle * imports * Update SQL docs * Checkstyle * Fix theta sketch operator docs * PR comments * Checkstyle fixes * Add missing entries for HLL sketch module * PR comments, add round param to HLL estimate operator, fix optional HLL param	2019-10-18 14:59:44 -07:00
Jihoon Son	30c15900be	Auto compaction based on parallel indexing (#8570 ) * Auto compaction based on parallel indexing * javadoc and doc * typo * update spell * addressing comments * address comments * fix log * fix build * fix test * increase default max input segment bytes per task * fix test	2019-10-18 13:24:14 -07:00
Mingming Qiu	2c758ef5ff	Support assign tasks to run on different categories of MiddleManagers (#7066 ) * Support assign tasks to run on different tiers of MiddleManagers * address comments * address comments * rename tier to category and docs * doc * fix doc * fix spelling errors * docs	2019-10-17 12:57:19 -07:00
Jad Naous	d54d2e1627	Update segments.md (#8693 ) Make bullet numbers clearer with parantheses, fix last reference to 2 being interpreted as a bullet point.	2019-10-17 11:55:23 -07:00
Jad Naous	9f4e11df32	Update tutorial-rollup.md (#8687 ) At this point there hasn't yet been an explanation in the tutorial of what "segments" are	2019-10-16 20:08:09 -06:00
Jonathan Wei	89ce6384f5	More Kinesis resharding adjustments (#8671 ) * More Kinesis resharding adjustments * Fix TC inspection * Fix comment' * Adjust comment, small refactor * Make repartition transition time configurable * Add spellcheck exclusion * Spelling fix	2019-10-15 23:19:17 -07:00
Jihoon Son	4046c86d62	Stateful auto compaction (#8573 ) * Stateful auto compaction * javaodc * add removed test back * fix test * adding indexSpec to compactionState * fix build * add lastCompactionState * address comments * extract CompactionState * fix doc * fix build and test * Add a task context to store compaction state; add javadoc * fix it test	2019-10-15 22:57:42 -07:00
Mitch Lloyd	1a78a0c98a	Add credentials for ECS (#8651 ) * Add credentials for ECS * Fix import order * Update S3 authentication methods table * Update .spelling for new documentation	2019-10-12 09:12:14 -07:00
Abhishek Radhakrishnan	d87840d894	Minor updates to documentation. (#8665 )	2019-10-12 09:11:03 -07:00
Jihoon Son	96d8523ecb	Use hash of Segment IDs instead of a list of explicit segments in auto compaction (#8571 ) * IOConfig for compaction task * add javadoc, doc, unit test * fix webconsole test * add spelling * address comments * fix build and test * address comments	2019-10-09 11:12:00 -07:00
Clint Wylie	8bda3afea4	fix spelling errors triggered by another doc PR (#8653 )	2019-10-08 23:43:58 -07:00
Nishant Bangarwa	0853273091	Add tier based usage metrics for historical nodes to help with autoscaling (#8636 ) * Add tier based usage metrics for historical nodes to help with druid historical autoscaling Add tier based usage metrics for historical nodes to help druid cluster orchestration systems understand the historical node usage and requirements. Following metrics would be helpful - tier/required/capacity- total capacity in bytes required in each tier. Dimensions - tier tier/total/capacity - total capacity in bytes available in a given tier. Dimension - tier tier/historical/count - no. of historical nodes available in each tier. Dimension - tier tier/replication/factor - configured maximum replication factor in given tier. Dimension - tier * fix unit test failures	2019-10-08 19:55:32 -07:00
Mohammad J. Khan	18758f5228	Support LDAP authentication/authorization (#6972 ) * Support LDAP authentication/authorization * fixed integration-tests * fixed Travis CI build errors related to druid-security module * fixed failing test * fixed failing test header * added comments, force build * fixes for strict compilation spotbugs checks * removed authenticator rolling credential update feature * removed escalator rolling credential update feature * fixed teamcity inspection deprecated API usage error * fixed checkstyle execution error, removed unused import * removed cached config as part of removing authenticator rolling credential update feature * removed config bundle entity as part of removing authenticator rolling credential update feature * refactored ldao configuration * added support for SSLContext configuration and TLSCertificateChecker * removed check to return authentication failure when user has no group assigned, will be checked and handled by the authorizer * Separate out authorizer checks between metadata-backed store user and LDAP user/groups * refactored BasicSecuritySSLSocketFactory usage to fix strict compilation spotbugs checks * fixes build issue * final review comments updates * final review comments updates * fixed LGTM and spellcheck alerts * Fixed Avatica auth failure error message check * Updated metadata credentials validator exception message string, replaced DB with metadata store	2019-10-08 17:08:27 -07:00
Clint Wylie	2f20799868	merge recommendations into basic-cluster-tuning, add additional info (#8649 ) * merge recommendations into basic-cluster-tuning, add additional info * stupid sidebar	2019-10-08 16:33:54 -07:00
Himanshu	c078ed40fd	groupBy query: optional limit push down to segment scan (#8426 ) * groupBy query: optional limit push down to segment scan * make segment level limit push down configurable * fix teamcity errors * fix segment limit pushdown flag handling on query level config override * use equals for comparator check * fix sql and null handling * fix unused imports * handle null offset in NullableValueGroupByColumnSelectorStrategy for buffer comparator similar to RowBasedGrouperHelper.NullableRowBasedKeySerdeHelper	2019-10-08 15:35:07 -07:00
Lucas Capistrant	d801ce2f29	Update rollup table to properly reflect 0.16.0 (#8638 ) This table stated that `index_parallel` tasks were best-effort only. However, this changed with #8061 and this documentation update was simply missed.	2019-10-07 12:37:15 -07:00
Xavier Léauté	1d42551d95	Fix statsd types (#8628 ) * fix segment underReplicated/unavailable counts to be gauges instead of counters * fix jvm/gc/cpu to be a counter instead of timre jvm/gc/cpu represents the total cpu time spent for multiple gc invocations, not the time spent in each gc cycle. the number needs to be divided by jvm/gc/count to get the average gc time per cycle * update docs * fix spellcheck	2019-10-06 14:14:09 -07:00
Parag Jain	f0d74b240d	password provider for basic authentication of HttpEmitterConfig (#8618 )	2019-10-02 15:59:17 -07:00
Nishant Bangarwa	8537fbeca7	Implementing dropwizard emitter for druid (#7363 ) * Implementing dropwizard emitter for druid making metric manager and alert emitters as optional * Refactor and make things work more improvements improve docs refactrings * Fix teamcity inspections * review comments * more review comments * add limit to max number of gauges * update pom version * fix pom * review comments * review comment * review comments * fix broken doc link review comments review comments * review comments * fix checkstyle * more spell check fixes * fix travis failures	2019-10-01 14:59:30 -07:00
pdeva	db65068c42	add reference to indexer nodes (#8607 )	2019-09-30 16:45:33 -06:00

1 2 3 4 5 ...

2000 Commits