druid

Commit Graph

Author	SHA1	Message	Date
Xavier Léauté	d26e1bc70d	update code check plugins for Java 15 support (#10978 ) * update maven-forbidden-api plugin to 3.1 * update maven-pmd-plugin to 3.14 * update spotbugs to 4.2.2 * fixes validation failures newly caught by those updates - fix SpotBugs NP_NONNULL_PARAM_VIOLATION - fix PMD UnnecessaryFullyQualifiedName	2021-03-11 07:31:41 -08:00
Xavier Léauté	7a68cd8b86	use maven enforcer to check maven version (#10977 ) * removes a warning about prerequisites only being allowed for plugins * update maven enforcer plugin to the latest version (3.0.0-M3)	2021-03-11 07:30:10 -08:00
Tianxin Zhao	a57c28e9ce	prometheus metric exporter (#10412 ) * prometheus-emitter * use existing jetty server to expose prometheus collection endpoint * unused variables * better variable names * removed unused dependencies * more metric definitions * reorganize * use prometheus HTTPServer instead of hooking into Jetty server * temporary empty help string * temporary non-empty help. fix incorrect dimension value in JSON (also updated statsd json) * added full help text. added metric conversion factor for timers that are not using seconds. Correct metric dimension name in documentation * added documentation for prometheus emitter * safety for invalid labelNames * fix travis checks * Unit test and better sanitization of metrics names and label values * add precondition to check namespace against regex * use precompiled regex * remove static imports. fix metric types * better docs. fix possible NPE in PrometheusEmitterConfig. Guard against multiple calls to PrometheusEmitter.start() * Update regex for label-value replacements to allow internal numeric values. Additional tests * Adds missing license header updates website/.spelling to add words used in prometheus-emitter docs. updates docs/operations/metrics.md to correct the spelling of bufferPoolName * fixes version in extensions-contrib/prometheus-emitter * fix style guide errors * update import ordering * add another word to website/.spelling * remove unthrown declared exception * remove unused import * Pushgateway strategy for metrics * typo * Format fix and nullable strategy * Update pom file for prometheus-emitter * code review comments. Counter to gauge for cache metrics, periodical task to pushGateway * Syntax fix * Dimension label regex include numeric character back, fix previous commit * bump prometheus-emitter pom dev version * Remove scheduled task inside poen that push metrics * Fix checkstyle * Unit test coverage * Unit test coverage * Spelling * Doc fix * spelling Co-authored-by: Michael Schiff <michael.schiff@tubemogul.com> Co-authored-by: Michael Schiff <schiff.michael@gmail.com> Co-authored-by: Tianxin Zhao <tianxin.zhao@tubemogul.com> Co-authored-by: Tianxin Zhao <tizhao@adobe.com>	2021-03-09 14:37:31 -08:00
Clint Wylie	96889cdebc	add avro + kafka + schema registry integration test (#10929 ) * add avro + schema registry integration test * style * retry init * maybe this * oops heh * this will fix it * review stuffs * fix comment	2021-03-08 08:12:12 -08:00
zhangyue19921010	bddacbb1c3	Dynamic auto scale Kafka-Stream ingest tasks (#10524 ) * druid task auto scale based on kafka lag * fix kafkaSupervisorIOConfig and KinesisSupervisorIOConfig * druid task auto scale based on kafka lag * fix kafkaSupervisorIOConfig and KinesisSupervisorIOConfig * test dynamic auto scale done * auto scale tasks tested on prd cluster * auto scale tasks tested on prd cluster * modify code style to solve 29055.10 29055.9 29055.17 29055.18 29055.19 29055.20 * rename test fiel function * change codes and add docs based on capistrant reviewed * midify test docs * modify docs * modify docs * modify docs * merge from master * Extract the autoScale logic out of SeekableStreamSupervisor to minimize putting more stuff inside there && Make autoscaling algorithm configurable and scalable. * fix ci failed * revert msic.xml * add uts to test autoscaler create && scale out/in and kafka ingest with scale enable * add more uts * fix inner class check * add IT for kafka ingestion with autoscaler * add new IT in groups=kafka-index named testKafkaIndexDataWithWithAutoscaler * review change * code review * remove unused imports * fix NLP * fix docs and UTs * revert misc.xml * use jackson to build autoScaleConfig with default values * add uts * use jackson to init AutoScalerConfig in IOConfig instead of Map<> * autoscalerConfig interface and provide a defaultAutoScalerConfig * modify uts * modify docs * fix checkstyle * revert misc.xml * modify uts * reviewed code change * reviewed code change * code reviewed * code review * log changed * do StringUtils.encodeForFormat when create allocationExec * code review && limit taskCountMax to partitionNumbers * modify docs * code review Co-authored-by: yuezhang <yuezhang@freewheel.tv>	2021-03-06 14:36:52 +05:30
Atul Mohan	6040c30fcd	Upgrade jetty to latest version (#10937 ) * Upgrade jetty * Fix license	2021-03-04 08:28:50 -06:00
Alexander Saydakov	f930cf14d6	Use the latest Apache DataSketches release 2.0.0 (#10917 ) * use the latest Apache DataSketches release 2.0.0 * updated datasketches version Co-authored-by: AlexanderSaydakov <AlexanderSaydakov@users.noreply.github.com>	2021-02-26 07:52:00 -06:00
Suneet Saldanha	bc7004006f	Update dependency-check plugin (#10883 ) * Use dependency-check aggregate * oops	2021-02-16 19:22:04 -08:00
Will Xu	c8d2654605	Use native git for git-commit-id-plugin to speed up build (#10881 ) * Segment timeline doesn't show results older than 3 months * Adoption testing patch for web segment timeline view and also refactoring default time config * Changing git-commit-id-plugin to use native git, shaving off 15% off build time Co-authored-by: dev <dev@dev.minitoken.com>	2021-02-12 09:31:07 -08:00
Jihoon Son	a2b5e01142	Bump DataSketches memory to 1.3.0 (#10789 )	2021-02-04 18:39:52 -08:00
Himadri Singh	1c1b396eaa	AWS Web Identity / IRSA Support (#10541 ) * AWS Web Identity Support required for AWS IRSA * Update kinesis-ingestion.md * disabling coverage tests https://github.com/apache/druid/pull/10541#issuecomment-737558213 * exclude coverage * Update licenses.yaml	2021-01-25 18:44:02 +05:30
Jihoon Son	95065bdf1a	Bump dev version to 0.22.0-SNAPSHOT (#10759 )	2021-01-15 13:16:23 -08:00
Himanshu	c7b1212a43	AWS RDS token based password provider (#9518 ) * refresh db pwd * aws iam token password provider * fix analyze-dependencies build * fix doc build * add ut for BasicDataSourceExt * more doc updates * more doc update * moving aws token password provider to new extension * remove duplicate changes * make all config inline * extension docs * refresh db password in SQL Firehose code path as well * add ut * fix build * add new extension to distribution * rds lib is not provided * fix license build * add version to license * change parent version to 0.19.0-snapshot * address review comments * fix core/ code coverage * Update server/src/main/java/org/apache/druid/metadata/BasicDataSourceExt.java Co-authored-by: Clint Wylie <cjwylie@gmail.com> * address review comments * fix spellchecker * remove inadvertant website file change Co-authored-by: Clint Wylie <cjwylie@gmail.com>	2021-01-06 21:15:29 -08:00
Xavier Léauté	b7a16d08a6	Update Apache Kafka to 2.7.0 (#10701 ) - align scala versions to match Kafka	2020-12-22 13:56:00 -08:00
Himanshu	ac1882bf74	kubernetes based discovery druid extension to run Druid on K8S without Zookeeper (#10544 ) * honor zk enablement config in more places in druid code * kubernetes based discovery module * fix spotbugs check * fix intellij checks error * fix doc link to kubernetes.md from extension * make spellchecker happy * update license.yaml * fix dependency check errors * update extension coverage * UTs for BaseNodeRoleWatcher * fix forbidden-api check * update k8s module coverage ignores * add Bouncy Castle License being same as MIT License for license checking purposes * further update licenses.yaml * label/annotation pre-existence assumption * address review comment	2020-12-14 21:10:31 -08:00
Jihoon Son	abcf624a2e	Bump up jackson-databind to 2.10.5.1 (#10655 ) * Bump up jackson version to 2.10.5.1 * only jackson-databind * license	2020-12-09 13:54:47 -08:00
Suneet Saldanha	c94be8a945	Revert "Update google client libraries (#10536 )" (#10599 ) This reverts commit `4537016cad`.	2020-12-03 20:14:52 +05:30
Ayush Kulshrestha	d0c2ede50c	Added CronScheduler support as a proof to clock drift while emitting metrics (#10448 ) Co-authored-by: Ayush Kulshrestha <ayush.kulshrestha@miqdigital.com>	2020-11-25 12:31:38 +01:00
Nishant Bangarwa	4537016cad	Update google client libraries (#10536 ) modify license.yaml Update google oauth client version	2020-11-20 15:23:30 -08:00
Suneet Saldanha	6c8a77b7a9	Bump jetty to latest version (#10563 ) This addresses CVE-2020-27216 which was flagged by the security vulnerability job.	2020-11-09 08:51:36 -08:00
Jonathan Wei	65c0d64676	Update version to 0.21.0-SNAPSHOT (#10450 ) * [maven-release-plugin] prepare release druid-0.21.0 * [maven-release-plugin] prepare for next development iteration * Update web-console versions	2020-10-03 16:08:34 -07:00
Abhishek Agarwal	d057c5149f	Fix the offset setting in GoogleStorage#get (#10449 ) * Fix the offset in get of GCP object * upgrade compute dependency * fix version * review comments * missed	2020-10-01 08:38:58 -07:00
Igor Dvorzhak	d0ee2e3a48	Upgrade ORC to 1.5.10 version (#10291 )	2020-09-18 13:38:45 -07:00
Xavier Léauté	225490474d	Update Kafka dependencies to 2.6.0 (#10286 ) * update Kafka dependencies to Kafka 2.6.0 * switch to Scala 2.13 build of Kafka * update integration tests * update Kafka tutorial	2020-08-15 07:56:40 -07:00
Richard Startin	e363b1cd20	Update RoaringBitmap to 0.9.0 (#9987 )	2020-07-23 19:29:25 -07:00
Gian Merlino	eeaf609fc0	Update Jetty to 9.4.30.v20200611. (#10098 ) * Update Jetty to 9.4.30.v20200611. This is the latest version currently available in the 9.4.x line. * Various adjustments. * Class name fixes. * Remove unused HttpClientModule code. * Add coverage suppressions. * Another coverage suppression. * Fix wildcards.	2020-07-07 14:24:02 -07:00
Clint Wylie	c86e7ce30b	bump version to 0.20.0-SNAPSHOT (#10124 )	2020-07-06 15:08:32 -07:00
frank chen	60c6bd5b4c	support Aliyun OSS service as deep storage (#9898 ) * init commit, all tests passed * fix format Signed-off-by: frank chen <frank.chen021@outlook.com> * data stored successfully * modify config path * add doc * add aliyun-oss extension to project * remove descriptor deletion code to avoid warning message output by aliyun client * fix warnings reported by lgtm-com * fix ci warnings Signed-off-by: frank chen <frank.chen021@outlook.com> * fix errors reported by intellj inspection check Signed-off-by: frank chen <frank.chen021@outlook.com> * fix doc spelling check Signed-off-by: frank chen <frank.chen021@outlook.com> * fix dependency warnings reported by ci Signed-off-by: frank chen <frank.chen021@outlook.com> * fix warnings reported by CI Signed-off-by: frank chen <frank.chen021@outlook.com> * add package configuration to support showing extension info Signed-off-by: frank chen <frank.chen021@outlook.com> * add IT test cases and fix bugs Signed-off-by: frank chen <frank.chen021@outlook.com> * 1. code review comments adopted 2. change schema from 'aliyun-oss' to 'oss' Signed-off-by: frank chen <frank.chen021@outlook.com> * add license info Signed-off-by: frank chen <frank.chen021@outlook.com> * fix doc Signed-off-by: frank chen <frank.chen021@outlook.com> * exclude execution of IT testcases of OSS extension from CI Signed-off-by: frank chen <frank.chen021@outlook.com> * put the extensions under contrib group and add to distribution * fix names in test cases * add unit test to cover OssInputSource * fix names in test cases * fix dependency problem reported by CI Signed-off-by: frank chen <frank.chen021@outlook.com>	2020-07-01 22:20:53 -07:00
Jihoon Son	657f8ee80f	Fix RetryQueryRunner to actually do the job (#10082 ) * Fix RetryQueryRunner to actually do the job * more javadoc * fix test and checkstyle * don't combine for testing * address comments * fix unit tests * always initialize response context in cachingClusteredClient * fix subquery * address comments * fix test * query id for builders * make queryId optional in the builders and ClusterQueryResult * fix test * suppress tests and unused methods * exclude groupBy builder * fix jacoco exclusion * add tests for builders * address comments * don't truncate	2020-07-01 14:02:21 -07:00
Suneet Saldanha	15a0b4ffe2	Filter http requests by http method (#10085 ) * Filter http requests by http method Add a config that allows a user which http methods to allow against their Druid server. Druid will only accept http requests with the method: GET, PUT, POST, DELETE and OPTIONS. If a Druid admin wants to allow other methods, they can do so by using the ServerConfig#allowedHttpMethods config. If a Druid user would like to disallow OPTIONS, this can be done by changing the AuthConfig#allowUnauthenticatedHttpOptions config * Exclude OPTIONS from always supported HTTP methods Add HEAD as an allowed method for web console e2e tests * fix docs * fix security IT * Actually fix the web console e2e tests * Ignore icode coverage for nitialization classes * code review	2020-06-29 16:59:31 -07:00
Clint Wylie	ec1f443a5c	update avatica to handle additional character sets over jdbc (#10074 ) * update avatica to handle additional character sets over jdbc * update license yaml, fix test * oops	2020-06-24 19:58:34 -07:00
Chi Cao Minh	67669b4ad4	Fix CVE-2020-13602 (#10024 ) Upgrade postgres jdbc driver to latest version to address CVE, which was fixed in 42.2.13.	2020-06-11 17:30:13 -07:00
Xavier Léauté	65280a6953	update kafka client version to 2.5.0 (#9902 ) - remove dependency on deprecated internal Kafka classes - keep LZ4 version in line with the version shipped with Kafka	2020-05-27 13:20:32 -07:00
Chi Cao Minh	427239f451	Enforce code coverage (#9863 ) * Enforce code coverage Add an automated way of checking if new code has adequate unit tests, since merging code coverage reports and check coverage thresholds via coveralls or codecov is unreliable. The following minimum unit test code coverage is now enforced: - 80% functions - 65% branch - 65% line Branch and line coverage thresholds are slightly lower for now as they are harder to achieve. After the code coverage check looks reliable, the thresholds can be increased later if needed. * Add comments	2020-05-20 09:31:37 -07:00
Alexander Saydakov	522df300c2	Datasketches 1 3 0 (#9880 ) * use the latest datasketches release * new sketch debug print Co-authored-by: AlexanderSaydakov <AlexanderSaydakov@users.noreply.github.com>	2020-05-16 14:09:23 -07:00
Francesco Nidito	e7e41e3a36	Adding support for autoscaling in GCE (#8987 ) * Adding support for autoscaling in GCE * adding extra google deps also in gce pom * fix link in doc * remove unused deps * adding terms to spelling file * version in pom 0.17.0-incubating-SNAPSHOT --> 0.18.0-SNAPSHOT * GCEXyz -> GceXyz in naming for consistency * add preconditions * add VisibleForTesting annotation * typos in comments * use StringUtils.format instead of String.format * use custom exception instead of exit * factorize interval time between retries * making literal value a constant * iter all network interfaces * use provided on google (non api) deps * adding missing dep * removing unneded this and use Objects methods instead o 3-way if in hash and comparison * adding import * adding retries around getRunningInstances and adding limit for operation end waiting * refactor GceEnvironmentConfig.hashCode * 0.18.0-SNAPSHOT -> 0.19.0-SNAPSHOT * removing unused config * adding tests to hash and equals * adding nullable to waitForOperationEnd * adding testTerminate * adding unit tests for createComputeService * increasing retries in unrelated integration-test to prevent sporadic failure (hopefully) * reverting queryResponseTemplate change * adding comment for Compute.Builder.build() returning null	2020-04-28 03:13:39 -07:00
Clint Wylie	fc5383cd00	revert datasketches-java version to 1.1.0-incubating until new version is released (#9751 ) * revert datasketches-java version to 1.1.0-incubating until fix is in place * fix tests * checkstyle	2020-04-24 12:52:12 -07:00
Maytas Monsereenusorn	16f5ae4405	Add integration tests for kafka ingestion (#9724 ) * add kafka admin and kafka writer * refactor kinesis IT * fix typo refactor * parallel * parallel * parallel * parallel works now * add kafka it * add doc to readme * fix tests * fix failing test * test * test * test * test * address comments * addressed comments	2020-04-22 10:43:34 -07:00
Jihoon Son	6a52bdc605	Skip license check for dependency reduced pom files (#9687 )	2020-04-11 18:11:53 -07:00
Chi Cao Minh	e6dd6a4119	Skip node dev dependency vulnerability scan (#9684 ) Since they are not production dependencies, security vulnerabilities in the dev dependencies can be ignored.	2020-04-11 14:24:25 -07:00
Chi Cao Minh	eb45981b60	Upgrade netty 4 to fix CVE-2020-11612 (#9651 )	2020-04-09 13:26:14 -07:00
Chi Cao Minh	84c1c2505d	Web console basic end-to-end-test (#9595 ) Load data and query (i.e., automate https://druid.apache.org/docs/latest/tutorials/tutorial-batch.html) to have some basic checks ensuring the web console is wired up to druid correctly. The new end-to-end tests (tutorial-batch.spec.ts) are added to `web-console/e2e-tests`. Within that directory: - `components` represent the various tabs of the web console. Currently, abstractions for `load data`, `ingestion`, `datasources`, and `query` are implemented. - `components/load-data/data-connector` contains abstractions for the different data source options available to the data loader's `Connect` step. Currently, only the `Local file` data source connector is implemented. - `components/load-data/config` contains abstractions for the different configuration options available for each step of the data loader flow. Currently, the `Configure Schema`, `Partition`, and `Publish` steps have initial implementation of their configuration options. - `util` contains various helper methods for the tests and does not contain abstractions of the web console. Changes to add the new tests to CI: - `.travis.yml`: New "web console end-to-end tests" job - `web-console/jest.*.js`: Refactor jest configurations to have different flavors for unit tests and for end-to-end tests. In particular, the latter adds a jest setup configuration to wait for the web console to be ready (`web-console/e2e-tests/util/setup.ts`). - `web-console/package.json`: Refactor run scripts to add new script for running end-to-end tests. - `web-console/script/druid`: Utility scripts for building, starting, and stopping druid. Other changes: - `pom.xml`: Refactor various settings disable java static checks and to disable java tests into two new maven profiles. Since the same settings are used in several places (e.g., .travis.yml, Dockerfiles, etc.), having them in maven profiles makes it more maintainable. - `web-console/src/console-application.tsx`: Fix typo ("the the").	2020-04-09 12:38:09 -07:00
Maytas Monsereenusorn	b95a1b9878	Fix NPE in RemoteTaskRunner event handler causes JVM shutdown (#9610 ) * Fix NPE in RemoteTaskRunner event handler causes JVM shutdown * address comments * fix compile * fix checkstyle * fix lgtm * fix merge * fix test * fix tests * change scope * address comments * address comments	2020-04-07 14:53:51 -07:00
bolkedebruin	2d99966933	Add Apache Ranger Authorization (#9579 )	2020-04-04 18:02:24 +02:00
Jihoon Son	0da8ffc3ff	Bump up development version to 0.19.0-SNAPSHOT (#9586 )	2020-03-30 16:24:04 -07:00
Himanshu	5604ac7963	druid extension for OpenID Connect auth using pac4j lib (#8992 ) * druid pac4j security extension for OpenID Connect OAuth 2.0 authentication * update version in druid-pac4j pom * introducing unauthorized resource filter * authenticated but authorized /unified-webconsole.html * use httpReq.getRequestURI() for matching callback path * add documentation * minor doc addition * licesne file updates * make dependency analyze succeed * fix doc build * hopefully fixes doc build * hopefully fixes license check build * yet another try on fixing license build * revert unintentional changes to website folder * update version to 0.18.0-SNAPSHOT * check session and its expiry on each request * add crypto service * code for encrypting the cookie * update doc with cookiePassphrase * update license yaml * make sessionstore in Pac4jFilter private non static * make Pac4jFilter fields final * okta: use sha256 for hmac * remove incubating * add UTs for crypto util and session store impl * use standard charsets * add license header * remove unused file * add org.objenesis.objenesis to license.yaml * a bit of nit changes in CryptoService and embedding EncryptionResult for clarity * rename alg to cipherAlgName * take cipher alg name, mode and padding as input * add java doc for CryptoService and make it more understandable * another UT for CryptoService * cache pac4j Config * use generics clearly in Pac4jSessionStore * update cookiePassphrase doc to mention PasswordProvider * mark stuff Nullable where appropriate in Pac4jSessionStore * update doc to mention jdbc * add error log on reaching callback resource * javadoc for Pac4jCallbackResource * introduce NOOP_HTTP_ACTION_ADAPTER * add correct module name in license file * correct extensions folder name in licenses.yaml * replace druid-kubernetes-extensions to druid-pac4j * cache SecureRandom instance * rename UnauthorizedResourceFilter to AuthenticationOnlyResourceFilter	2020-03-23 18:15:45 -07:00
Chi Cao Minh	e7b3dd9cd1	Update to mysql connector 5.1.48 (#9514 )	2020-03-16 10:38:31 -07:00
Clint Wylie	8b9fe6f584	query laning and load shedding (#9407 ) * prototype * merge QueryScheduler and QueryManager * everything in its right place * adjustments * docs * fixes * doc fixes * use resilience4j instead of semaphore * more tests * simplify * checkstyle * spelling * oops heh * remove unused * simplify * concurrency tests * add SqlResource tests, refactor error response * add json config tests * use LongAdder instead of AtomicLong * remove test only stuffs from scheduler * javadocs, etc * style * partial review stuffs * adjust * review stuffs * more javadoc * error response documentation * spelling * preserve user specified lane for NoSchedulingStrategy * more test, why not * doc adjustment * style * missed review for make a thing a constant * fixes and tests * fix test * Update docs/configuration/index.md Co-Authored-By: sthetland <steve.hetland@imply.io> * doc update Co-authored-by: sthetland <steve.hetland@imply.io>	2020-03-10 02:57:16 -07:00
Maytas Monsereenusorn	2db20afbb7	Integration test cluster supports override config (#9473 ) * integration test refactor * integration test refactor * refactor integration test * refactor integration test * refactor integration test * refactor integration test * refactor integration test * refactor integration test * refactor integration test * refactor integration test * address comments	2020-03-09 21:17:49 -07:00
zachjsh	d771b42ed1	Move Azure extension into Core (#9394 ) * Move Azure extension into Core Moving the azure extension into Core. * * Fix build failure * * Add The MIT License (MIT) to list of compatible licenses * * Address review comments * * change reference to contrib azure to core azure * * Fix spelling mistakes.	2020-02-25 17:49:16 -08:00
Chi Cao Minh	7fc99ee206	Add common optional dependencies for extensions (#9399 ) * Add common optional dependencies for extensions Include hadoop-aws and postgres JDBC connector jar to improve out-of-the-box experience for extensions. The mysql JDBC connector jar is not bundled as it is GPL. * Update docs * Fix typo	2020-02-25 00:04:00 -08:00
Fokko Driesprong	806dfe6de6	Bump Apache Avro to 1.9.2 (#9381 ) * Bump Apache Avro 1.9.2 Bugfixes that where discovered in other projects * Update missing license	2020-02-24 10:04:22 +01:00
Chi Cao Minh	a5c49cc4bd	Change security vulnerability scan to cron job (#9340 ) * Change security vulnerability scan to cron job Previously, when new CVEs were reported, the security vulnerability scan would unfortunately block PRs that did not modify any dependencies. To prevent this issue, the security scan is now run as a Travis cron job that runs on master and notifies the druid dev list if it fails. The security scan has also been added to the "apache-release" maven profile, to ensure that it passes before a release. Also adjusted some Travis CI job failure help messages to not be folded in the Travis CI job logs. * Dedup plugin configuration definition	2020-02-11 13:43:08 -08:00
Lucas Capistrant	53bb45fc9a	Forbid easily misused HashSet and HashMap constructors (#9165 ) * Forbid easily misused HashSet and HashMap constructors * Add two LinkedHashMap constructors to forbidden-apis and create utility method as replacement for them * Fix visibility of constant in CollectionUtils.java * Make an exception for an instance of LinkedHashMap#<init>(int) because proper sizing is used * revert changes to sql module tests that should be in separate PR * Finish reverting changes to sql module tests that were flagged in checkstyle during CI * Add netty dependency resulting from SupressForbidden	2020-02-07 10:44:09 +03:00
Gian Merlino	3ef5c2f2e8	Add MemoryOpenHashTable, a table similar to ByteBufferHashTable. (#9308 ) * Add MemoryOpenHashTable, a table similar to ByteBufferHashTable. With some key differences to improve speed and design simplicity: 1) Uses Memory rather than ByteBuffer for its backing storage. 2) Uses faster hashing and comparison routines (see HashTableUtils). 3) Capacity is always a power of two, allowing simpler design and more efficient implementation of findBucket. 4) Does not implement growability; instead, leaves that to its callers. The idea is this removes the need for subclasses, while still giving callers flexibility in how to handle table-full scenarios. * Fix LGTM warnings. * Adjust dependencies. * Remove easymock from druid-benchmarks. * Adjustments from review. * Fix datasketches unit tests. * Fix checkstyle.	2020-02-04 19:57:59 -08:00
Suneet Saldanha	33a97dfaae	Guicify druid sql module (#9279 ) * Guicify druid sql module Break up the SQLModule in to smaller modules and provide a binding that modules can use to register schemas with druid sql. * fix some tests * address code review * tests compile * Working tests * Add all the tests * fix up licenses and dependencies * add calcite dependency to druid-benchmarks * tests pass * rename the schemas	2020-02-04 11:33:48 -08:00
zachjsh	74ac9151c9	Fix / suppress netty CVEs CVE-2019-20445 and CVE-2019-20444 (#9300 ) * Suppress netty 3 vulnerabilites and upgrade netty 4 version * Upgrade netty 4 version to fix vulnerabilities CVE-2019-20445 and CVE-2019-20444 * suppress these CVEs for netty 3 * * simplify suppression xml file * update licenses file with new version of netty * * fix type in licenses.yaml	2020-01-31 14:51:54 -08:00
Clint Wylie	c6c8b80644	fix build by updating kafka client to 2.2.2 for CVE-2019-12399 (#9259 ) * fix build by updating kafka client to 2.2.2 for CVE-2019-12399 * one kafka version to rule them all * notice	2020-01-27 11:07:02 -08:00
Chi Cao Minh	0b0056b77f	More tests for range partition parallel indexing (#9232 ) Add more unit tests for range partition native batch parallel indexing. Also, fix a bug where ParallelIndexPhaseRunner incorrectly thinks that identical collected DimensionDistributionReports are not equal due to not overriding equals() in DimensionDistributionReport.	2020-01-21 12:59:43 -08:00
Fokko Driesprong	12b84cfb33	Bump Jackson to 2.10.2 (#9173 )	2020-01-17 11:39:32 +01:00
Jonathan Wei	58d337186b	Graduation update for ASF release process guide and download links (#9126 ) * Graduation update for ASF release process guide and download links * Fix release vote thread typo * Fix pom.xml	2020-01-06 15:00:33 -06:00
Jonathan Wei	aa539177ec	De-incubation cleanup in code, docs, packaging (#9108 ) * De-incubation cleanup in code, docs, packaging * remove unused docs script	2020-01-03 12:33:19 -05:00
Jonathan Wei	4e8368a5d9	Set version to 0.18.0-SNAPSHOT (#9109 )	2020-01-02 17:55:10 -05:00
Benedict Jin	7a7c948595	Exclude .asf.yaml from the configuration of the rat plugin (#9088 )	2019-12-23 13:08:23 -08:00
Suneet Saldanha	301c0649a7	Fix equalsAndHashCode in ClientCompactQueryTuningConfig (#9035 ) * Fix equalsAndHashCode in ClientCompactQueryTuningConfig This change introduces a dependency to EqualsVerifier for the test scope. The dependency is licensed under Apache 2. The library makes it trivial to add equals and hashCode checks to prevent bugs like this from happening in the future * fix checkstyle * fix test name	2019-12-16 14:33:00 -08:00
Jonathan Wei	8af41d7cd0	Update version to 0.18.0-incubating-SNAPSHOT (#9009 )	2019-12-11 14:04:03 -08:00
Chi Cao Minh	bab78fc80e	Parallel indexing single dim partitions (#8925 ) * Parallel indexing single dim partitions Implements single dimension range partitioning for native parallel batch indexing as described in #8769. This initial version requires the druid-datasketches extension to be loaded. The algorithm has 5 phases that are orchestrated by the supervisor in `ParallelIndexSupervisorTask#runRangePartitionMultiPhaseParallel()`. These phases and the main classes involved are described below: 1) In parallel, determine the distribution of dimension values for each input source split. `PartialDimensionDistributionTask` uses `StringSketch` to generate the approximate distribution of dimension values for each input source split. If the rows are ungrouped, `PartialDimensionDistributionTask.UngroupedRowDimensionValueFilter` uses a Bloom filter to skip rows that would be grouped. The final distribution is sent back to the supervisor via `DimensionDistributionReport`. 2) The range partitions are determined. In `ParallelIndexSupervisorTask#determineAllRangePartitions()`, the supervisor uses `StringSketchMerger` to merge the individual `StringSketch`es created in the preceding phase. The merged sketch is then used to create the range partitions. 3) In parallel, generate partial range-partitioned segments. `PartialRangeSegmentGenerateTask` uses the range partitions determined in the preceding phase and `RangePartitionCachingLocalSegmentAllocator` to generate `SingleDimensionShardSpec`s. The partition information is sent back to the supervisor via `GeneratedGenericPartitionsReport`. 4) The partial range segments are grouped. In `ParallelIndexSupervisorTask#groupGenericPartitionLocationsPerPartition()`, the supervisor creates the `PartialGenericSegmentMergeIOConfig`s necessary for the next phase. 5) In parallel, merge partial range-partitioned segments. `PartialGenericSegmentMergeTask` uses `GenericPartitionLocation` to retrieve the partial range-partitioned segments generated earlier and then merges and publishes them. * Fix dependencies & forbidden apis * Fixes for integration test * Address review comments * Fix docs, strict compile, sketch check, rollup check * Fix first shard spec, partition serde, single subtask * Fix first partition check in test * Misc rewording/refactoring to address code review * Fix doc link * Split batch index integration test * Do not run parallel-batch-index twice * Adjust last partition * Split ITParallelIndexTest to reduce runtime * Rename test class * Allow null values in range partitions * Indicate which phase failed * Improve asserts in tests	2019-12-09 23:05:49 -08:00
Chi Cao Minh	af74acaa85	Address security vulnerabilities CVSS >= 7 (#8980 ) * Address security vulnerabilities CVSS >= 7 Update dependencies to address security vulnerabilities with CVSS scores of 7 or higher. A new Travis CI job is added to prevent new high/critical security vulnerabilities from being added. Updated dependencies: - api-util 1.0.0 -> 1.0.3 - jackson 2.9.10 -> 2.10.1 - kafka 2.1.0 -> 2.1.1 - libthrift 0.10.0 -> 0.13.0 - protobuf 3.2.0 -> 3.11.0 The following high/critical security vulnerabilities are currently suppressed (so that the new Travis CI job can be added now) and are left as future work to fix: - hibernate-validator:5.2.5 - jackson-mapper-asl:1.9.13 - libthrift:0.6.1 - netty:3.10.6 - nimbus-jose-jwt:4.41.1 * Rename EDL1 license file * Fix inspection errors	2019-12-05 14:34:35 -08:00
jon-wei	dfbc066163	Revert "[maven-release-plugin] prepare release druid-0.16.1-incubating-rc1" This reverts commit `a0f21d9b07`.	2019-11-27 23:22:43 -08:00
jon-wei	0402ff85b8	Revert "[maven-release-plugin] prepare for next development iteration" This reverts commit `8ffa71e7e6`.	2019-11-27 23:22:32 -08:00
jon-wei	8ffa71e7e6	[maven-release-plugin] prepare for next development iteration	2019-11-27 23:18:48 -08:00
jon-wei	a0f21d9b07	[maven-release-plugin] prepare release druid-0.16.1-incubating-rc1	2019-11-27 23:18:37 -08:00
Chi Cao Minh	fba876b607	Update jackson to 2.9.10 (#8940 ) Addresses security vulnerabilities: - sonatype-2016-0397: https://github.com/FasterXML/jackson-core/issues/315 - sonatype-2017-0355: https://github.com/FasterXML/jackson-core/pull/322	2019-11-26 21:41:14 -08:00
Alexander Saydakov	4a9da3f3fc	use the latest release of datasketches (#8647 ) * use the latest release of datasketches * added datasketches-memory dependency * updated datasketches entries * use datasketches-memory-1.2.0 * updated dependencies * fixed tests	2019-11-25 19:45:51 -08:00
Jonathan Wei	dc6178d1f2	Upgrade Calcite to 1.21 (#8566 ) * Upgrade Calcite to 1.21 * Checkstyle, test fix' * Exclude calcite yaml deps, update license.yaml * Add method for exception chain handling * Checkstyle * PR comments, Add outer limit context flag * Revert project settings change * Update subquery test comment * Checkstyle fix * Fix test in sql compat mode * Fix test * Fix dependency analysis * Address PR comments * Checkstyle * Adjust testSelectStarFromSelectSingleColumnWithLimitDescending	2019-11-20 21:22:55 -08:00
Chi Cao Minh	8365bdf62a	Address security vulnerabilities (#8878 ) * Address security vulnerabilities Security vulnerabilities addressed by upgrading 3rd party libs: - Upgrade avro-ipc to 1.9.1 - sonatype-2019-0115 - Upgrade caffeine to 2.8.0 - sonatype-2019-0282 - Upgrade commons-beanutils to 1.9.4 - CVE-2014-0114 - Upgrade commons-codec to 1.13 - sonatype-2012-0050 - Upgrade commons-compress to 1.19 - CVE-2019-12402 - sonatype-2018-0293 - Upgrade hadoop-common to 2.8.5 - CVE-2018-11767 - Upgrade hadoop-mapreduce-client-core to 2.8.5 - CVE-2017-3166 - Upgrade hibernate-validator to 5.2.5 - CVE-2017-7536 - Upgrade httpclient to 4.5.10 - sonatype-2017-0359 - Upgrade icu4j to 55.1 - CVE-2014-8147 - Upgrade jackson-databind to 2.6.7.3: - CVE-2017-7525 - Upgrade jetty-http to 9.4.12: - CVE-2017-7657 - CVE-2017-7658 - CVE-2017-7656 - CVE-2018-12545 - Upgrade log4j-core to 2.8.2 - CVE-2017-5645: - Upgrade netty to 3.10.6 - CVE-2015-2156 - Upgrade netty-common to 4.1.42 - CVE-2019-9518 - Upgrade netty-codec-http to 4.1.42 - CVE-2019-16869 - Upgrade nimbus-jose-jwt to 4.41.1 - CVE-2017-12972 - CVE-2017-12974 - Upgrade plexus-utils to 3.0.24 - CVE-2017-1000487 - sonatype-2015-0173 - sonatype-2016-0398 - Upgrade postgresql to 42.2.8 - CVE-2018-10936 Note that if users are using JDBC lookups with postgres, they may need to update the JDBC jar used by the lookup extension. * Fix license for postgresql	2019-11-19 09:14:33 -08:00
Vadim Ogievetsky	17d773dca2	Web console: replace (and remove) old consoles (#8838 ) * first steps * clean licenses * fix capabilities * fix specs * more tests * new web console on coordinator and overlord, remove setup for old consoles, old configs * better message * update licenses * sync license files * more button * fix tslint issue * jetty-rewrite dependency to add redirects for old console paths * put dependency in the right place * fix overlord detection * fix notices, dedupe licenses * make segment timeline work in no SQL mode * update license * revert hard coded coordinator mode from testing * update restricted mode copy	2019-11-15 19:45:14 -08:00
Atul Mohan	517c14632e	Upgrade joda-time to 2.10.5 (#8821 ) * Upgrade joda * Update license	2019-11-06 14:30:22 -08:00
Roman Leventov	5c0fc0a13a	Fix ambiguity about IndexerSQLMetadataStorageCoordinator.getUsedSegmentsForInterval() returning only non-overshadowed or all used segments (#8564 ) * IndexerSQLMetadataStorageCoordinator.getTimelineForIntervalsWithHandle() don't fetch abutting intervals; simplify getUsedSegmentsForIntervals() * Add VersionedIntervalTimeline.findNonOvershadowedObjectsInInterval() method; Propagate the decision about whether only visible segmetns or visible and overshadowed segments should be returned from IndexerMetadataStorageCoordinator's methods to the user logic; Rename SegmentListUsedAction to RetrieveUsedSegmentsAction, SegmetnListUnusedAction to RetrieveUnusedSegmentsAction, and UsedSegmentLister to UsedSegmentsRetriever * Fix tests * More fixes * Add javadoc notes about returning Collection instead of Set. Add JacksonUtils.readValue() to reduce boilerplate code * Fix KinesisIndexTaskTest, factor out common parts from KinesisIndexTaskTest and KafkaIndexTaskTest into SeekableStreamIndexTaskTestBase * More test fixes * More test fixes * Add a comment to VersionedIntervalTimelineTestBase * Fix tests * Set DataSegment.size(0) in more tests * Specify DataSegment.size(0) in more places in tests * Fix more tests * Fix DruidSchemaTest * Set DataSegment's size in more tests and benchmarks * Fix HdfsDataSegmentPusherTest * Doc changes addressing comments * Extended doc for visibility * Typo * Typo 2 * Address comment	2019-11-06 11:07:04 -08:00
Jonathan Wei	526f04c47c	Fix missing jackson jars for hadoop ingestion (#8652 ) * Fix missing jackson jars for hadoop ingestion * PR comments * pom ordering * New approach * Remove all jackson-core/mapper-asl exclusions from hdfs storage	2019-10-08 23:54:55 -07:00
Nishant Bangarwa	8537fbeca7	Implementing dropwizard emitter for druid (#7363 ) * Implementing dropwizard emitter for druid making metric manager and alert emitters as optional * Refactor and make things work more improvements improve docs refactrings * Fix teamcity inspections * review comments * more review comments * add limit to max number of gauges * update pom version * fix pom * review comments * review comment * review comments * fix broken doc link review comments review comments * review comments * fix checkstyle * more spell check fixes * fix travis failures	2019-10-01 14:59:30 -07:00
Fokko Driesprong	a2363b6b61	Remove commons-httpclient (#8407 )	2019-09-27 02:14:58 -07:00
Fokko Driesprong	99c3e0bb3f	Bump HttpClient to 4.5.10 (#8404 ) * Bump HttpClient to 4.5.9 * Remove Licenses file * Revert license * Remove duplicate dependency * Bump HttpClient to 4.5.10	2019-09-27 02:14:36 -07:00
Chi Cao Minh	5f61374cb3	Fix dependency analyze warnings (#8230 ) * Fix dependency analyze warnings Update the maven dependency plugin to the latest version and fix all warnings for unused declared and used undeclared dependencies in the compile scope. Added new travis job to add the check to CI. Also fixed some source code files to use the correct packages for their imports and updated druid-forbidden-apis to prevent regressions. * Address review comments * Adjust scope for org.glassfish.jaxb:jaxb-runtime * Fix dependencies for hdfs-storage * Consolidate netty4 versions	2019-09-09 14:37:21 -07:00
Richard Startin	58e2634dc5	Update RoaringBitmap version to 0.8.11 (#8490 )	2019-09-09 13:42:16 -07:00
Chi Cao Minh	14a8613d69	Exit JVM on curator unhandled errors (#8458 ) * Exit JVM on curator unhandled errors If an unhandled error occurs when curator is talking to ZooKeeper, exit the JVM in addition to stopping the lifecycle to prevent the process from being left in a zombie state. With this change, BoundedExponentialBackoffRetryWithQuit is no longer needed as when curator exceeds the configured retries, it triggers its unhandled error listeners. A new "connectionTimeoutMs" CuratorConfig setting is added mostly to facilitate testing curator unhandled errors, but it may be useful for users as well. * Address review comments	2019-09-06 16:43:59 -07:00
Xavier Léauté	4b69ce0f09	enable unit tests with JDK11 (#8400 ) * enable unit tests with JDK11 This enables unit tests with openjdk11, splitting up the build into stages to have it fail faster The integration test docker image still uses openjdk8, so there is little reason to run those tests with JDK11 yet * remove stages	2019-08-28 10:29:13 -07:00
Chi Cao Minh	31e6280b75	Use Codecov (#8388 ) * Use Codecov Upload coverage reports to Codecov. For now, having Codecov comment on PRs or enforcing a minimum coverage threshold are both disabled until the Codecov coverage reports look reliable: https://codecov.io/gh/apache/incubator-druid * Split bash and curl into separate lines	2019-08-28 08:49:30 -07:00
Clint Wylie	c73a489335	bump master version to 0.17.0-incubating-SNAPSHOT (#8421 )	2019-08-28 01:58:36 -07:00
Clint Wylie	7afe473fd3	How to asf release (#8370 ) * add ASF release manager guide * fix broken link * fix bold * fix order * clean up * oops * pom * more * fix * fixes * fix * fix	2019-08-27 18:36:13 -07:00
Clint Wylie	44dd5b5f0d	add jaxb-runtime to fix exception with newer versions of java (#8409 ) * add jaxb-runtime to fix exception with jdk9+ * fix licenses * oops	2019-08-27 14:25:05 -06:00
Dylan Wylie	b2821a8371	do not exclude client core jar (#8339 ) make indexing service depend on hadoop client	2019-08-26 13:48:24 -07:00
Furkan KAMACI	02fe3db911	Zookeeper version is updated. (#8363 ) * Zookeeper version is updated. * Zookeeper version is updated at licenses.yaml * licenses.yaml is updated and dependencies are fixed to make the project successfully build. * Zookeeper versions are fixed at licenses.yaml	2019-08-24 22:00:43 -07:00
Chi Cao Minh	2383d9e522	Disable coveralls (#8382 ) The coveralls code coverage reports inaccurate coverage for our parallel builds. Disable it until it can be fixed or a better alternative can be found.	2019-08-23 08:05:37 -07:00
Benedict Jin	14a4238381	Bump JUnitParams from 1.0.4 to 1.1.1 (#8017 )	2019-08-20 16:15:12 -07:00
Fokko Driesprong	8821ac330d	Bump opencsv from 4.2 to 4.6 (#8294 ) * Bump opencsv from 4.2 to 4.6 * Fix transitive dependencies	2019-08-20 16:12:03 -07:00
Fokko Driesprong	3a58431bff	Bump jackson-jq from 0.0.7 to 0.0.10 (#8293 ) * Bump jackson-jq from 0.0.7 to 0.0.10 For the changelog: https://github.com/eiiches/jackson-jq/releases * Update dependent licenses	2019-08-20 16:09:04 -07:00
Chi Cao Minh	6fa22f6939	Enable code coverage (#8303 ) * Enable code coverage Code coverage was disabled via https://github.com/apache/incubator-druid/pull/3122 due to an issue with cobertura in Travis CI. Switch code coverage tool from cobertura to jacoco to avoid issue and re-enable coveralls for Travis CI. * Exclude non-production code * Exclude benchmark generated code * Exclude DruidTestRunnerFactory	2019-08-20 15:36:19 -07:00
Fokko Driesprong	cb1339e19a	Bump derby from 10.11.1.1 to 10.14.2.0 (#8292 ) * Bump derby from 10.11.1.1 to 10.15.1.3 * Update server/pom.xml as well * Move to derby 10.14.2.0 10.15.* is Java9+ https://db.apache.org/derby/derby_downloads.html	2019-08-20 14:03:32 -07:00
Fokko Driesprong	1a3aa1cfc0	Bump commons-io from 2.5 to 2.6 (#8006 ) * Bump commons-io from 2.5 to 2.6 * Update licenses.yaml * Address comments	2019-08-13 17:10:37 -07:00
Benedict Jin	170368999d	Bump rhino from 1.7R5 to 1.7.11 (#8008 ) * Bump rhino from 1.7R5 to 1.7.11 * Update the version of rhino in licenses.yaml	2019-08-09 13:10:54 -07:00
Benedict Jin	f7cf2f7cad	Bump httpcore from 4.4.4 to 4.4.11 (#7870 ) * Bump httpcore from 4.4.4 to 4.4.11 * Update the version of httpcore in licenses.yaml	2019-08-09 19:53:20 +03:00
Chi Cao Minh	b359c5b3d9	Fix SIGAR dependency connection timeout (#8258 ) After enabling parallel builds for "mvn install", the sigar dependency would sometimes resolve to the incorrect artifact repo for some of the maven modules. This issue seems to be fixed by moving the definition of the sigar dependency's artifact repo to the root POM. Also, depending on network speeds, "mvn -q install" may take longer than the default 10 minute timeout to print any output. Use travis_wait to extend the timeout to 15 minutes.	2019-08-08 20:13:18 -05:00
Chi Cao Minh	05b44e3467	Speedup Travis CI jobs (#8240 ) Reorganize Travis CI jobs into smaller faster (and more) jobs. Add various maven options to skip unnecessary work and refactored Travis CI job definitions to follow DRY. Detailed changes: .travis.yml - Refactor build logic to get rid of copy-and-paste logic - Skip static checks and enable parallelism for maven install - Split static analysis into different jobs to ease triage - Use "name" attribute instead of NAME environment variable - Split "indexing" and "web console" out of "other modules test" - Split 2 integration test jobs into multiple smaller jobs build.sh - Enable parallelism - Disable more static checks travis_script_integration.sh travis_script_integration_part2.sh integration-tests/README.md - Use TestNG groups instead of shell scripts and move definition of jobs into Travis CI yaml integration-tests/pom.xml - Show elapsed time of individual tests to aid in future rebalancing of Travis CI integration test jobs run time TestNGGroup.java - Use TestNG groups to make it easy to have multiple Travis CI integration test jobs. TestNG groups also make it easier to have an "other" integration test group and make it less likely a test will accidentally not be included in a CI job. IT*Test.java AbstractITBatchIndexTest.java AbstractKafkaIndexerTest.java - Add TestNG group - Fix various IntelliJ inspection warnings - Reduce scope of helper methods since the TestNG group annotation on the class makes TestNG consider all public methods as test methods pom.xml - Allow enforce plugin to be run from command-line - Bump resources plugin version so that "[debug] execute contextualize" output is correctly suppressed by "mvn -q" - Bump exec plugin version so that skip property is renamed from "skip" to "exec.skip" web-console/pom.xml - Add property to allow disabling javascript-related work. This property is overridden in Travis CI to speed up the jobs.	2019-08-07 09:52:42 -07:00
Chi Cao Minh	7783b31846	Add IPv4 druid expressions (#8197 ) * Add IPv4 druid expressions New druid expressions for filtering IPv4 addresses: - ipv4address_match: Check if IP address belongs to a subnet - ipv4address_parse: Convert string IP address to long - ipv4address_stringify: Convert long IP address to string These expressions operate on IP addresses represented as either strings or longs, so that they can be applied to dimensions with mixed representation of IP addresses. The filtering is more efficient when operating on IP addresses as longs. In other words, the intended use case is: 1) Use ipv4address_parse to convert to long at ingestion time 2) Use ipv4address_match to filter (on longs) at query time 3) Use ipv4address_stringify to convert to (readable) string at query time * Fix licenses and null handling * Simplify IPv4 expressions * Fix tests * Fix check for valid ipv4 address string	2019-08-01 11:45:04 -07:00
Chi Cao Minh	ab71a2e1e4	Revert "Fix dependency analyze warnings (#8128 )" (#8189 ) This reverts commit `5dd0d8e873`.	2019-07-29 11:42:16 -07:00
Chi Cao Minh	5dd0d8e873	Fix dependency analyze warnings (#8128 ) * Fix dependency analyze warnings Update the maven dependency plugin to the latest version and fix all warnings for unused declared and used undeclared dependencies in the compile scope. Added new travis job to add the check to CI. Also fixed some source code files to use the correct packages for their imports. * Fix licenses and dependencies * Fix licenses and dependencies again * Fix integration test dependency * Address review comments * Fix unit test dependencies * Fix integration test dependency * Fix integration test dependency again * Fix integration test dependency third time * Fix integration test dependency fourth time * Fix compile error * Fix assert package	2019-07-26 10:49:03 -07:00
Gian Merlino	ffa25b7832	Query vectorization. (#6794 ) * Benchmarks: New SqlBenchmark, add caching & vectorization to some others. - Introduce a new SqlBenchmark geared towards benchmarking a wide variety of SQL queries. Rename the old SqlBenchmark to SqlVsNativeBenchmark. - Add (optional) caching to SegmentGenerator to enable easier benchmarking of larger segments. - Add vectorization to FilteredAggregatorBenchmark and GroupByBenchmark. * Query vectorization. This patch includes vectorized timeseries and groupBy engines, as well as some analogs of your favorite Druid classes: - VectorCursor is like Cursor. (It comes from StorageAdapter.makeVectorCursor.) - VectorColumnSelectorFactory is like ColumnSelectorFactory, and it has methods to create analogs of the column selectors you know and love. - VectorOffset and ReadableVectorOffset are like Offset and ReadableOffset. - VectorAggregator is like BufferAggregator. - VectorValueMatcher is like ValueMatcher. There are some noticeable differences between vectorized and regular execution: - Unlike regular cursors, vector cursors do not understand time granularity. They expect query engines to handle this on their own, which a new VectorCursorGranularizer class helps with. This is to avoid too much batch-splitting and to respect the fact that vector selectors are somewhat more heavyweight than regular selectors. - Unlike FilteredOffset, FilteredVectorOffset does not leverage indexes for filters that might partially support them (like an OR of one filter that supports indexing and another that doesn't). I'm not sure that this behavior is desirable anyway (it is potentially too eager) but, at any rate, it'd be better to harmonize it between the two classes. Potentially they should both do some different thing that is smarter than what either of them is doing right now. 
- When vector cursors are created by QueryableIndexCursorSequenceBuilder, they use a morphing binary-then-linear search to find their start and end rows, rather than linear search. Limitations in this patch are: - Only timeseries and groupBy have vectorized engines. - GroupBy doesn't handle multi-value dimensions yet. - Vector cursors cannot handle virtual columns or descending order. - Only some filters have vectorized matchers: "selector", "bound", "in", "like", "regex", "search", "and", "or", and "not". - Only some aggregators have vectorized implementations: "count", "doubleSum", "floatSum", "longSum", "hyperUnique", and "filtered". - Dimension specs other than "default" don't work yet (no extraction functions or filtered dimension specs). Currently, the testing strategy includes adding vectorization-enabled tests to TimeseriesQueryRunnerTest, GroupByQueryRunnerTest, GroupByTimeseriesQueryRunnerTest, CalciteQueryTest, and all of the filtering tests that extend BaseFilterTest. In all of those classes, there are some test cases that don't support vectorization. They are marked by special function calls like "cannotVectorize" or "skipVectorize" that tell the test harness to either expect an exception or to skip the test case. Testing should be expanded in the future -- a project in and of itself. Related to #3011. * WIP * Adjustments for unused things. * Adjust javadocs. * DimensionDictionarySelector adjustments. * Add "clone" to BatchIteratorAdapter. * ValueMatcher javadocs. * Fix benchmark. * Fixups post-merge. * Expect exception on testGroupByWithStringVirtualColumn for IncrementalIndex. * BloomDimFilterSqlTest: Tag two non-vectorizable tests. * Minor adjustments. * Update surefire, bump up Xmx in Travis. * Some more adjustments. * Javadoc adjustments * AggregatorAdapters adjustments. * Additional comments. * Remove switching search. * Only missiles.	2019-07-12 12:54:07 -07:00
Clint Wylie	42a7b8849a	remove FirehoseV2 and realtime node extensions (#8020 ) * remove firehosev2 and realtime node extensions * revert intellij stuff * rat exclusion	2019-07-04 15:40:22 -07:00
Benedict Jin	6395c08309	Bump commons-codec from 1.7 to 1.12 (#7995 )	2019-06-29 07:40:19 -07:00
Benedict Jin	7a5bc5ffcd	Bump jaxb-api from 2.3.0 to 2.3.1 (#7978 )	2019-06-27 08:51:00 -07:00
Roman Leventov	46ea5b88b7	Add the pull-request template (#7206 ) * Add the pull-request template * Rewording * Replaced checklist link, added Rat exclusion * Update the PR template. Add Concurrency Checklist to the repository * Merge Description and Design sections. Softer language. Removed requirement to test in production environment. Added a committer's instruction to justify addition of meta tags. * Rephrase item about comments * Add license header * Add item to concurrency checklist	2019-06-27 15:51:25 +03:00
Benedict Jin	bc1413e4e3	Bump commons-cli from 1.2 to 1.3.1 (#7966 )	2019-06-26 08:05:13 -07:00
Fokko Driesprong	48f20fe754	Add Spotbugs (#7894 ) * Add Spotbugs Exclude all the issues for now, so we can add them one by one. (cherry picked from commit ceda4754dc8c703d1e0de85b48cd5f5409cfd5b7) * Add additional rules to the list * More rules * More rules * Add comments to the xml * Move the spotbugs-exclude.xml to codestyle/	2019-06-20 21:06:52 +03:00
Fokko Driesprong	41f23b5120	Bump commons-compress from 1.16 to 1.18 (#7924 )	2019-06-19 10:43:01 -07:00
Xue Yu	20d1db9dff	bump fastutil to 8.2.3 (#7920 )	2019-06-18 09:17:34 -07:00
Benedict Jin	fb7f8ec362	Bump RoaringBitmap from 0.8.0 to 0.8.6 (#7906 )	2019-06-17 17:02:52 +08:00
Jihoon Son	d00a9676b7	Set aws.region for unit tests automatically (#7868 ) * Set aws.region for unit tests automatically * Update README.template	2019-06-14 15:34:21 -07:00
Fokko Driesprong	f2b00023f8	Bump Checkstyle to 8.21 (#7826 )	2019-06-04 01:02:46 -07:00
Fokko Driesprong	c8e1511f12	Bump Joda time to 2.10.2 (#7809 )	2019-05-31 14:25:35 -07:00
Jihoon Son	7abfbb066a	Bump up snapshot version to 0.16.0 (#7802 )	2019-05-30 17:17:33 -07:00
Xavier Léauté	58a6f0d5d0	Enable compiling against Java 9+ (tests disabled) This change only enables compilation to ensure code compiles against recent Java versions going forward. Tests are still disabled in this profile until test failures are addressed.	2019-05-27 18:40:19 -07:00
Clint Wylie	db3792727e	use unminified jquery to be more friendly for source releases, fix license stuff (#7751 ) * use unminified jquery to be more friendly for source releases, fix license stuff * other license file * rats	2019-05-24 11:53:25 -07:00
awelsh93	6964ac23a2	Adding influxdb emitter as a contrib extension (#7717 ) * Adding influxdb emitter as a contrib extension * addressing code review comments	2019-05-23 11:11:48 -07:00
mcbrewster	1b284ca847	add tests to dialogs, components and views. Add index files to components and dialogs. add nested file structure (#7669 )	2019-05-22 20:36:51 -07:00
Gian Merlino	b6941551ae	Upgrade various build and doc links to https. (#7722 ) * Upgrade various build and doc links to https. Where it wasn't possible to upgrade build-time dependencies to https, I kept http in place but used hardcoded checksums or GPG keys to ensure that artifacts fetched over http are verified properly. * Switch to https://apache.org.	2019-05-21 11:30:14 -07:00
Clint Wylie	ddda8b74cb	update lz4-java to 1.6.0 (lz4 1.9.1) (#7700 )	2019-05-20 13:01:48 -07:00
Fokko Driesprong	2aa9613bed	Bump Checkstyle to 8.20 (#7651 ) * Bump Checkstyle to 8.20 Moderate severity vulnerability that affects: com.puppycrawl.tools:checkstyle Checkstyle prior to 8.18 loads external DTDs by default, which can potentially lead to denial of service attacks or the leaking of confidential information. Affected versions: < 8.18 * Oops, missed one * Oops, missed a few	2019-05-14 11:53:37 -07:00
Xue Yu	35a1fbefea	upgrade avatica to 1.12.0 (#7644 )	2019-05-12 14:38:06 -07:00
Clint Wylie	6a6c6d573d	Add plain text README.txt, use relative link from README.md to build.md (#7611 ) * use relative link to build instructions from top level readme * add textfile to readme * formatting * make README.BINARY plaintext, move LABELS.md to LABELS, README.txt to README * exclude README.BINARY still * remove jdk links/recommendations * add script to use DRUIDVERSION in textfile README instead of latest, add links to recommended jdk to build.md * license * better readme template, links to latest if does not detect an apache release version * fix	2019-05-09 21:29:26 -07:00
Samarth Jain	b542bb9f34	TDigest backed sketch aggregators (#7331 ) * First set of changes for tDigest histogram * Add license * Address code review comments * Add a doc page for new T-Digest sketch aggregators. Minor code cleanup and comments. * Remove synchronization from BufferAggregators. Address code review comments * Fix typo	2019-05-09 17:22:55 -07:00
Xavier Léauté	751e1c9ba7	add javax.xml.bind dependencies removed in jdk11 (#7604 ) We depend on javax.xml.bind at runtime. This change adds an explicit dependency on the J2EE module that was removed in Java 11.	2019-05-06 19:30:14 -07:00
Jonathan Wei	7c2ca474da	Add single-machine deployment example cfgs and scripts (#7590 ) * Add single-machine deployment example cfgs and scripts * Add (8u92+) * Use combined coordinator-overlord for single machine confs * RAT fix	2019-05-06 19:11:13 -07:00
Xavier Léauté	51a62cb31b	Update dependencies for JDK11 support (#7601 ) * update asm for jdk11 support * update jvm-attach-api for jdk11 support	2019-05-06 14:07:44 -07:00
Xavier Léauté	f7bfe8f269	Update mocking libraries for Java 11 support (#7596 ) * update easymock / powermock for to 4.0.2 / 2.0.2 for JDK11 support * update tests to use new easymock interfaces * fix tests failing due to easymock fixes * remove dependency on jmockit * fix race condition in ResourcePoolTest	2019-05-06 12:28:56 -07:00
Eyal Yurman	f02251ab2d	Contributing Moving-Average Query to open source. (#6430 ) * Contributing Moving-Average Query to open source. * Fix failing code inspections. * See if explicit types will invoke the correct comparison function. * Explicitly remove support for druid.generic.useDefaultValueForNull configuration parameter. * Update styling and headers for complience. * Refresh code with latest master changes: * Remove NullDimensionSelector. * Apply changes of RequestLogger. * Apply changes of TimelineServerView. * Small checkstyle fix. * Checkstyle fixes. * Fixing rat errors; Teamcity errors. * Removing support theta sketches. Will be added back in this pr or a following once DI conflicts with datasketches are resolved. * Implements some of the review fixes. * Contributing Moving-Average Query to open source. * Fix failing code inspections. * See if explicit types will invoke the correct comparison function. * Explicitly remove support for druid.generic.useDefaultValueForNull configuration parameter. * Update styling and headers for complience. * Refresh code with latest master changes: * Remove NullDimensionSelector. * Apply changes of RequestLogger. * Apply changes of TimelineServerView. * Small checkstyle fix. * Checkstyle fixes. * Fixing rat errors; Teamcity errors. * Removing support theta sketches. Will be added back in this pr or a following once DI conflicts with datasketches are resolved. * Implements some of the review fixes. * More fixes for review. * More fixes from review. * MapBasedRow is Unmodifiable. Create new rows instead of modifying existing ones. * Remove more changes related to datasketches support. * Refactor BaseAverager startFrom field and add a comment. * fakeEvents field: Refactor initialization and add comment. * Rename parameters (tiny change). * Fix variable name typo in test (JAN_4). * Fix styling of non camelCase fields. * Fix Preconditions.checkArgument for cycleSize. 
* Add more documentation to RowBucketIterable and other classes. * key/value comment on in MovingAverageIterable. * Fix anonymous makeColumnValueSelector returning null. * Replace IdentityYieldingAccumolator with Yielders.each(). * * internalNext() should return null instead of throwing exception. * Remove unused variables/prarameters. * Harden MovingAverageIterableTest (Switch anyOf to exact match). * Change internalNext() from recursion to iteration; Simplify next() and hasNext(). * Remove unused imports. * Address review comments. * Rename fakeEvents to emptyEvents. * Remove redundant parameter key from computeMovingAverage. * Check yielder as well in RowBucketIterable#hasNext() * Fix javadoc.	2019-04-26 17:07:48 -07:00
Clint Wylie	89bb43f382	'core' ORC extension (#7138 ) * orc extension reworked to use apache orc map-reduce lib, moved to core extensions, support for flattenSpec, tests, docs * change binary handling to be compatible with avro and parquet, Rows.objectToStrings now converts byte[] to base64, change date handling * better docs and tests * fix it * formatting * doc fix * fix it * exclude redundant dependencies * use latest orc-mapreduce, add hadoop jobProperties recommendations to docs * doc fix * review stuff and fix binaryAsString * cache for root level fields * more better	2019-04-09 09:03:26 -07:00
Richard Startin	d29a32062f	upgrade to RoaringBitmap 0.8.0 and serialise directly to ByteBuffer (#7408 )	2019-04-04 13:22:50 -04:00
Charles Allen	eeb3dbe79d	Move GCP to a core extension (#6953 ) * Move GCP to a core extension * Don't provide druid-core >.< * Keep AWS and GCP modules separate * Move AWSModule to its own module * Add aws ec2 extension and more modules in more places * Fix bad imports * Fix test jackson module * Include AWS and GCP core in server * Add simple empty method comment * Update version to 15 * One more 0.13.0-->0.15.0 change * Fix multi-binding problem * Grep for s3-extensions and update docs * Update extensions.md	2019-03-27 09:00:43 -07:00
Jonathan Wei	8ca7cb4886	Fix rat check for source assembly after build (#7333 )	2019-03-22 22:48:35 -07:00
Jonathan Wei	7a57bc0dc3	Exclude git.version from rat check (#7322 )	2019-03-21 20:54:27 -07:00
Jonathan Wei	5486c2abf8	Update LICENSE and NOTICE files (#7026 ) * Update LICENSE and NOTICE files * Update react-table version	2019-03-04 18:45:22 -08:00
Jonathan Wei	258485a2fb	Exclude github issue templates from license check (#7070 ) * Exclude github issue templates from license check * Adjust capitalization	2019-02-19 12:38:52 -08:00
Surekha	80a2ef7be4	Support kafka transactional topics (#5404 ) (#6496 ) * Support kafka transactional topics * update kafka to version 2.0.0 * Remove the skipOffsetGaps option since it's not used anymore * Adjust kafka consumer to use transactional semantics * Update tests * Remove unused import from test * Fix compilation * Invoke transaction api to fix a unit test * temporary modification of travis.yml for debugging * another attempt to get travis tasklogs * update kafka to 2.0.1 at all places * Remove druid-kafka-eight dependency from integration-tests, remove the kafka firehose test and deprecate kafka-eight classes * Add deprecated in docs for kafka-eight and kafka-simple extensions * Remove skipOffsetGaps and code changes for transaction support * Fix indentation * remove skipOffsetGaps from kinesis * Add transaction api to KafkaRecordSupplierTest * Fix indent * Fix test * update kafka version to 2.1.0	2019-02-18 11:50:08 -08:00
Edward Gan	90c1a54b86	Moments Sketch custom aggregator (#6581 ) * Moments Sketch Integration with Druid * updates, add documentation, fix warnings * nits * disallowed base64 * update to druid 0.14	2019-02-13 14:03:47 -08:00
Jonathan Wei	fafbc4a80e	Set version to 0.15.0-incubating-SNAPSHOT (#7014 )	2019-02-07 14:02:52 -08:00
Jonathan Wei	8bc5eaa908	Set version to 0.14.0-incubating-SNAPSHOT (#7003 )	2019-02-04 19:36:20 -08:00
Vadim Ogievetsky	7f1b19bfb1	Adding a Unified web console. (#6923 ) * Adding new web console. * fixed css * fix form height * fix typo * do import custom react-table css * added repo field so npm does not complain * ask travis for node 10 * move indexing-service/src/main/resources/indexer_static into web-console * fix resource names and paths * add licenses * fix exclude file * add licenses to misc files and tidy up * remove rebase marker * fix link * updated env variable name * tidy up licenses and surface errors * cleanup * remove unused code, fix missing await * TeamCity does not like the name aux * add more links to tasks view * rm pages * update gitignore * update readme to be accurate * make clean script * removed old console dependancy * update Jetty routes * add a comment for welcome files for coordinator * do not show inital notifaction for now * renamed overlord console back to console.html * fix coordinator console * rename coordinator-console.html to index.html	2019-01-31 17:26:41 -08:00
Ankit Kothari	8492d94f59	Kill Hadoop MR task on kill of Hadoop ingestion task (#6828 ) * KillTask from overlord UI now makes sure that it terminates the underlying MR job, thus saving unnecessary compute Run in jobby is now split into 2 1. submitAndGetHadoopJobId followed by 2. run submitAndGetHadoopJobId is responsible for submitting the job and returning the jobId as a string, run monitors this job for completion JobHelper writes this jobId in the path provided by HadoopIndexTask which in turn is provided by the ForkingTaskRunner HadoopIndexTask reads this path when kill task is clicked to get hte jobId and fire the kill command via the yarn api. This is taken care in the stopGracefully method which is called in SingleTaskBackgroundRunner. Have enabled `canRestore` method to return `true` for HadoopIndexTask in order for the stopGracefully method to be called HadoopJob files have been changed to incorporate the changes to jobby Addressing PR comments * Addressing PR comments - Fix taskDir * Addressing PR comments - For changing the contract of Task.stopGracefully() `SingleTaskBackgroundRunner` calls stopGracefully in stop() and then checks for canRestore condition to return the status of the task * Addressing PR comments 1. Formatting 2. Removing `submitAndGetHadoopJobId` from `Jobby` and calling writeJobIdToFile in the job itself * Addressing PR comments 1. POM change. Moving hadoop dependency to indexing-hadoop * Addressing PR comments 1. stopGracefully now accepts TaskConfig as a param Handling isRestoreOnRestart in stopGracefully for `AppenderatorDriverRealtimeIndexTask, RealtimeIndexTask, SeekableStreamIndexTask` Changing tests to make TaskConfig param isRestoreOnRestart to true	2019-01-25 15:43:06 -08:00
Gian Merlino	e497141e92	Update Curator to 4.1.0. (#6862 )	2019-01-15 14:12:07 -08:00

1568 Commits