druid

Commit Graph

Author	SHA1	Message	Date
Yi Yuan	de8daf8139	Delete buildV9Directly in Kafka and Kinesis Indexing Service (#11351 ) * delete_buildV9Directly_in_kafka_and_kinesis_indexing_service * delete * delete them from server * delete buildV9Directly from hadoop indexing * bug fixed Co-authored-by: yuanyi <yuanyi@freewheel.tv>	2021-06-23 16:36:46 -07:00
Xavier Léauté	c8b3f8cc00	avoid logging pause message multiple times (#11375 ) In some instances the ingestion thread could be woken up spuriously, resulting in the "Received pause command..." log message getting logged multiple times. This change ensures we only log it once the first time the pause is requested.	2021-06-22 09:30:38 -07:00
Kashif Faraz	f0b105ec63	Temporarily skip compaction for locked intervals (#11190 ) * Add overlord API /lockedIntervals. Skip compaction for locked intervals * Revert formatting changes * Add license info * Fix checkstyle * Remove invalid API invocation * Fix checkstyle * Add DatasourceIntervalsTest * Fix checkstyle * Remove LockedIntervalsResponse * Add integration tests for lockedIntervals * Add ITAutoCompactionLockContentionTest * Add config druid.coordinator.compaction.skipLockedIntervals * Add test for TaskQueue	2021-06-20 17:21:59 -07:00
Xavier Léauté	712f2a5d00	upgrade error-prone to 2.7.1 and support checks with Java 11+ (#11363 ) * upgrade error-prone to 2.7.1 and support checks with Java 11+ - upgrade error-prone to 2.7.1 - support running error-prone with Java 11 and above using -Xplugin instead of custom compiler - add compiler arguments to ignore warnings/errors in Java 15/16 - introduce strictCompile property to enable strict profiles since we now need multiple strict profiles for Java 8 - properly exclude all generated source files from error-prone - fix druid-processing overriding annotation processors from parent pom - fix druid-core disabling most non-default checks - align plugin and annotation errorprone versions - fix / suppress additional issues found by error-prone: * fix bug in SeekableStreamSupervisor initializing ArrayList size with the taskGroupdId * fix missing @Override annotations - remove outdated compiler plugin in benchmarks - remove deleted ParameterPackage error-prone rule - re-enable checks on benchmark module as well * fix IntelliJ inspections * disable LongFloatConversion due to bug in error-prone with JDK 8 * add comment about InsecureCrypto	2021-06-16 12:55:34 -07:00
dependabot[bot]	167044f715	Bump fastutil from 8.2.3 to 8.5.4 (#11347 ) * Bump fastutil from 8.2.3 to 8.5.4 Bumps [fastutil](https://github.com/vigna/fastutil) from 8.2.3 to 8.5.4. - [Release notes](https://github.com/vigna/fastutil/releases) - [Changelog](https://github.com/vigna/fastutil/blob/master/CHANGES) - [Commits](https://github.com/vigna/fastutil/compare/8.2.3...8.5.4) --- updated-dependencies: - dependency-name: it.unimi.dsi:fastutil dependency-type: direct:production update-type: version-update:semver-minor ... Signed-off-by: dependabot[bot] <support@github.com> * update licenses.yaml * update maven dependency list for -core and -extra libraries to pass maven dependency checks Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Co-authored-by: Xavier Léauté <xvrl@apache.org>	2021-06-10 07:43:18 -07:00
zachjsh	27f1b6cbf3	Fix Index hadoop failing with index.zip is not a valid DFS filename (#11316 ) * * Fix bug * * simplify class loading * * fix example configs for integration tests * Small classloader cleanup Co-authored-by: jon-wei <jon.wei@imply.io>	2021-06-01 19:14:50 -04:00
Xavier Léauté	b517c3339b	remove ZooKeeper 3.4 support + pass tests with Java 15 (#11073 ) With this change, Druid will only support ZooKeeper 3.5.x and later. In order to support Java 15 we need to switch to ZK 3.5.x client libraries and drop support for ZK 3.4.x (see #10780 for the detailed reasons) * remove ZooKeeper 3.4.x compatibility * exclude additional ZK 3.5.x netty dependencies to ensure we use our version * keep ZooKeeper version used for integration tests in sync with client library version * remove the need to specify ZK version at runtime for docker * add support to run integration tests with JDK 15 * build and run unit tests with Java 15 in travis	2021-05-25 12:49:49 -07:00
Jihoon Son	4100c5edc0	Fix taskQueue to honor (#11243 ) useLineageBasedSegmentAllocation in default taskContext	2021-05-14 09:30:50 +08:00
Agustin Gonzalez	8e5048e643	Avoid memory mapping hydrants after they are persisted & after they are merged for native batch ingestion (#11123 ) * Avoid mapping hydrants in create segments phase for native ingestion * Drop queriable indices after a given sink is fully merged * Do not drop memory mappings for realtime ingestion * Style fixes * Renamed to match use case better * Rollback memoization code and use the real time flag instead * Null ptr fix in FireHydrant toString plus adjustments to memory pressure tracking calculations * Style * Log some count stats * Make sure sinks size is obtained at the right time * BatchAppenderator unit test * Fix comment typos * Renamed methods to make them more readable * Move persisted metadata from FireHydrant class to AppenderatorImpl. Removed superfluous differences and fix comment typo. Removed custom comparator * Missing dependency * Make persisted hydrant metadata map concurrent and better reflect the fact that keys are Java references. Maintain persisted metadata when dropping/closing segments. * Replaced concurrent variables with normal ones * Added batchMemoryMappedIndex "fallback" flag with default "false". Set this to "true" make code fallback to previous code path. * Style fix. * Added note to new setting in doc, using Iterables.size (and removing a dependency), and fixing a typo in a comment. * Forgot to commit this edited documentation message	2021-05-11 14:34:26 -07:00
Maytas Monsereenusorn	4326e699bd	Add feature to automatically remove datasource metadata based on retention period (#11227 ) * add auto clean up datasource metadata * add test * fix checkstyle * add comments * fix error * address comments * Address comments * fix test * fix test * fix typo * add comment * fix test * fix test	2021-05-11 01:22:33 -07:00
Jihoon Son	2df42143ae	Fix idempotence of segment allocation and task report apis in native batch ingestion (#11189 ) * Fix idempotence of segment allocation and task report apis in native batch ingestion * better error and javadoc * checkstyle and dependency * fix tests and add more tests * task config instead of context; add doc * unused import and dependency * typo in doc * fix unintended changes * fix wrong import * remove unnecessary error handling * add task context back * default task context * fix test and doc * address comments * unused imports	2021-05-07 14:29:48 -07:00
zachjsh	99f39c7202	Hadoop segment index file rename (#11194 ) * Do stuff * Do more stuff * * Do more stuff * * Do more stuff * * working * * cleanup * * more cleanup * * more cleanup * * add license header * * Add unit tests * * add java docs * * add more unit tests * * Cleanup test * * Move removing of workingPath to index task rather than in hadoop job. * * Address review comments * * remove unused import * * Address review comments * Do not overwrite segment descriptor for segment if it already exists. * * add comments to FileSystemHelper class * * fix local hadoop integration test * * Fix failing test failures when running with java11 * Revert "Revert "Adjust HadoopIndexTask temp segment renaming to avoid potential race conditions (#11075)" (#11151)" This reverts commit `49a9c3ffb7`. * * remove JobHelperPowerMockTest * * remove FileSystemHelper class	2021-05-04 20:22:18 -04:00
Jonathan Wei	49a9c3ffb7	Revert "Adjust HadoopIndexTask temp segment renaming to avoid potential race conditions (#11075 )" (#11151 ) This reverts commit `a2892d9c40`.	2021-04-22 15:33:27 -07:00
zachjsh	a2892d9c40	Adjust HadoopIndexTask temp segment renaming to avoid potential race conditions (#11075 ) * Do stuff * Do more stuff * * Do more stuff * * Do more stuff * * working * * cleanup * * more cleanup * * more cleanup * * add license header * * Add unit tests * * add java docs * * add more unit tests * * Cleanup test * * Move removing of workingPath to index task rather than in hadoop job. * * Address review comments * * remove unused import * * Address review comments * Do not overwrite segment descriptor for segment if it already exists. * * add comments to FileSystemHelper class * * fix local hadoop integration test	2021-04-21 12:24:31 -07:00
Maytas Monsereenusorn	4576152e4a	Make dropExisting flag for Compaction configurable and add warning documentations (#11070 ) * Make dropExisting flag for Compaction configurable * fix checkstyle * fix checkstyle * fix test * add tests * fix spelling * fix docs * add IT * fix test * fix doc * fix doc	2021-04-09 00:12:28 -07:00
Lucas Capistrant	8264203cee	Allow client to configure batch ingestion task to wait to complete until segments are confirmed to be available by other (#10676 ) * Add ability to wait for segment availability for batch jobs * IT updates * fix queries in legacy hadoop IT * Fix broken indexing integration tests * address an lgtm flag * spell checker still flagging for hadoop doc. adding under that file header too * fix compaction IT * Updates to wait for availability method * improve unit testing for patch * fix bad indentation * refactor waitForSegmentAvailability * Fixes based off of review comments * cleanup to get compile after merging with master * fix failing test after previous logic update * add back code that must have gotten deleted during conflict resolution * update some logging code * fixes to get compilation working after merge with master * reset interrupt flag in catch block after code review pointed it out * small changes following self-review * fixup some issues brought on by merge with master * small changes after review * cleanup a little bit after merge with master * Fix potential resource leak in AbstractBatchIndexTask * syntax fix * Add a Compcation TuningConfig type * add docs stipulating the lack of support by Compaction tasks for the new config * Fixup compilation errors after merge with master * Remove erreneous newline	2021-04-08 21:03:00 -07:00
Xavier Léauté	15bdd6bc2f	Fix unit tests and GC settings for Java 15 (#11074 ) * JavaScript script engine support was removed in JDK 15: skip those tests for JDKs without it * Fix flaky HTTP client tests with Java 15 * Switch from CMS to G1GC in integration tests, since CMS is no longer available in JDK 15	2021-04-08 10:33:37 -07:00
Maytas Monsereenusorn	d7f5293364	Add an option for ingestion task to drop (mark unused) all existing segments that are contained by interval in the ingestionSpec (#11025 ) * Auto-Compaction can run indefinitely when segmentGranularity is changed from coarser to finer. * Add option to drop segments after ingestion * fix checkstyle * add tests * add tests * add tests * fix test * add tests * fix checkstyle * fix checkstyle * add docs * fix docs * address comments * address comments * fix spelling	2021-04-01 12:29:36 -07:00
Gian Merlino	bf20f9e979	DruidInputSource: Fix issues in column projection, timestamp handling. (#10267 ) * DruidInputSource: Fix issues in column projection, timestamp handling. DruidInputSource, DruidSegmentReader changes: 1) Remove "dimensions" and "metrics". They are not necessary, because we can compute which columns we need to read based on what is going to be used by the timestamp, transform, dimensions, and metrics. 2) Start using ColumnsFilter (see below) to decide which columns we need to read. 3) Actually respect the "timestampSpec". Previously, it was ignored, and the timestamp of the returned InputRows was set to the `__time` column of the input datasource. (1) and (2) together fix a bug in which the DruidInputSource would not properly read columns that are used as inputs to a transformSpec. (3) fixes a bug where the timestampSpec would be ignored if you attempted to set the column to something other than `__time`. (1) and (3) are breaking changes. Web console changes: 1) Remove "Dimensions" and "Metrics" from the Druid input source. 2) Set timestampSpec to `{"column": "__time", "format": "millis"}` for compatibility with the new behavior. Other changes: 1) Add ColumnsFilter, a new class that allows input readers to determine which columns they need to read. Currently, it's only used by the DruidInputSource, but it could be used by other columnar input sources in the future. 2) Add a ColumnsFilter to InputRowSchema. 3) Remove the metric names from InputRowSchema (they were unused). 4) Add InputRowSchemas.fromDataSchema method that computes the proper ColumnsFilter for given timestamp, dimensions, transform, and metrics. 5) Add "getRequiredColumns" method to TransformSpec to support the above. * Various fixups. * Uncomment incorrectly commented lines. * Move TransformSpecTest to the proper module. * Add druid.indexer.task.ignoreTimestampSpecForDruidInputSource setting. * Fix. * Fix build. * Checkstyle. * Misc fixes. * Fix test. * Move config. * Fix imports. * Fixup. * Fix ShuffleResourceTest. * Add import. * Smarter exclusions. * Fixes based on tests. Also, add TIME_COLUMN constant in the web console. * Adjustments for tests. * Reorder test data. * Update docs. * Update docs to say Druid 0.22.0 instead of 0.21.0. * Fix test. * Fix ITAutoCompactionTest. * Changes from review & from merging.	2021-03-25 10:32:21 -07:00
Jihoon Son	a041933017	Allow overlapping intervals for the compaction task (#10912 ) * Allow overlapping intervals for the compaction task * unused import * line indentation Co-authored-by: Maytas Monsereenusorn <maytasm@apache.org>	2021-03-23 11:21:54 -07:00
Maytas Monsereenusorn	f19c2e9ce4	If ingested data has sparse columns, the ingested data with forceGuaranteedRollup=true can result in imperfect rollup and final dimension ordering can be different from dimensionSpec ordering in the ingestionSpec (#10948 ) * add IT * add IT * add the fix * fix checkstyle * fix compile * fix compile * fix test * fix test * address comments	2021-03-18 17:04:28 -07:00
Suneet Saldanha	6b0c2e8996	CompactionTask throws exception on conflicting segmentGranularity (#10996 ) * CompactionTask throws exception on conflicting segmentGranularity * add comment	2021-03-16 12:51:50 -07:00
Maytas Monsereenusorn	ed91a2bb38	Fix Kinesis should not increment throwAway count on EOS record (#10976 ) * fix Kinesis increament throwAway on EOS record * fix checkstyle * fix IT * fix test * fix IT * fix IT * fix IT * fix IT	2021-03-11 22:04:58 -08:00
zhangyue19921010	bddacbb1c3	Dynamic auto scale Kafka-Stream ingest tasks (#10524 ) * druid task auto scale based on kafka lag * fix kafkaSupervisorIOConfig and KinesisSupervisorIOConfig * druid task auto scale based on kafka lag * fix kafkaSupervisorIOConfig and KinesisSupervisorIOConfig * test dynamic auto scale done * auto scale tasks tested on prd cluster * auto scale tasks tested on prd cluster * modify code style to solve 29055.10 29055.9 29055.17 29055.18 29055.19 29055.20 * rename test fiel function * change codes and add docs based on capistrant reviewed * midify test docs * modify docs * modify docs * modify docs * merge from master * Extract the autoScale logic out of SeekableStreamSupervisor to minimize putting more stuff inside there && Make autoscaling algorithm configurable and scalable. * fix ci failed * revert msic.xml * add uts to test autoscaler create && scale out/in and kafka ingest with scale enable * add more uts * fix inner class check * add IT for kafka ingestion with autoscaler * add new IT in groups=kafka-index named testKafkaIndexDataWithWithAutoscaler * review change * code review * remove unused imports * fix NLP * fix docs and UTs * revert misc.xml * use jackson to build autoScaleConfig with default values * add uts * use jackson to init AutoScalerConfig in IOConfig instead of Map<> * autoscalerConfig interface and provide a defaultAutoScalerConfig * modify uts * modify docs * fix checkstyle * revert misc.xml * modify uts * reviewed code change * reviewed code change * code reviewed * code review * log changed * do StringUtils.encodeForFormat when create allocationExec * code review && limit taskCountMax to partitionNumbers * modify docs * code review Co-authored-by: yuezhang <yuezhang@freewheel.tv>	2021-03-06 14:36:52 +05:30
Maytas Monsereenusorn	b7b0ee8362	Add query granularity to compaction task (#10900 ) * add query granularity to compaction task * fix checkstyle * fix checkstyle * fix test * fix test * add tests * fix test * fix test * cleanup * rename class * fix test * fix test * add test * fix test	2021-03-02 11:23:52 -08:00
Gian Merlino	07902f607b	Granularity: Introduce primitive-typed bucketStart, increment methods. (#10904 ) * Granularity: Introduce primitive-typed bucketStart, increment methods. Saves creation of unnecessary DateTime objects in timestamp_floor and timestamp_ceil expressions. * Fix style. * Amp up the test coverage.	2021-02-25 07:59:20 -08:00
Agustin Gonzalez	eabad0fb35	Keep query granularity of compacted segments after compaction (#10856 ) * Keep query granularity of compacted segments after compaction * Protect against null isRollup * Fix bugspot check RC_REF_COMPARISON_BAD_PRACTICE_BOOLEAN & edit an existing comment * Make sure that NONE is also included when comparing for the finer granularity * Update integration test check for segment size due to query granularity propagation affecting size * Minor code cleanup * Added functional test to verify queryGranlarity after compaction * Minor style fix * Update unit tests	2021-02-18 01:35:10 -08:00
Maytas Monsereenusorn	6541178c21	Support segmentGranularity for auto-compaction (#10843 ) * Support segmentGranularity for auto-compaction * Support segmentGranularity for auto-compaction * Support segmentGranularity for auto-compaction * Support segmentGranularity for auto-compaction * resolve conflict * Support segmentGranularity for auto-compaction * Support segmentGranularity for auto-compaction * fix tests * fix more tests * fix checkstyle * add unit tests * fix checkstyle * fix checkstyle * fix checkstyle * add unit tests * add integration tests * fix checkstyle * fix checkstyle * fix failing tests * address comments * address comments * fix tests * fix tests * fix test * fix test * fix test * fix test * fix test * fix test * fix test * fix test	2021-02-12 03:03:20 -08:00
Egor Riashin	5d8a69ecf9	fixed input source sampler buildReader exp (#10651 ) * fixed input source sampler buildReader exp * sampler exception unit test * styling Co-authored-by: egor-ryashin <egor.ryashin@rilldata.com>	2021-02-05 13:33:55 -08:00
Abhishek Agarwal	96d26e5338	Fix kinesis ingestion bugs (#10761 ) * add offsetFetchPeriod to kinesis ingestion doc * Remove jackson dependencies from extensions * Use fixed delay for lag collection * Metrics reset after finishing processing * comments * Broaden the list of exceptions to retry for * Unit tests * Add more tests * Refactoring * re-order metrics * Doc suggestions Co-authored-by: Charles Smith <38529548+techdocsmith@users.noreply.github.com> * Add tests Co-authored-by: Charles Smith <38529548+techdocsmith@users.noreply.github.com>	2021-02-05 02:49:58 -08:00
Jihoon Son	3f8f00a231	Fix CVE-2021-25646 (#10818 )	2021-02-04 11:21:43 -08:00
Agustin Gonzalez	0e4750bac2	Granularity interval materialization (#10742 ) * Prevent interval materialization for UniformGranularitySpec inside the overlord * Change API of bucketIntervals in GranularitySpec to return an Iterable<Interval> * Javadoc update, respect inputIntervals contract * Eliminate dependency on wrappedspec (i.e. ArbitraryGranularity) in UniformGranularitySpec * Added one boundary condition test to UniformGranularityTest and fixed Travis forbidden method errors in IntervalsByGranularity * Fix Travis style & other checks * Refactor TreeSet to facilitate re-use in UniformGranularitySpec * Make sure intervals are unique when there is no segment granularity * Style/bugspot fixes... * More travis checks * Add condensedIntervals method to GranularitySpec and pass it as needed to the lock method * Style & PR feedback * Fixed failing test * Fixed bug in IntervalsByGranularity iterator that it would return repeated elements (see added unit tests that were broken before this change) * Refactor so that we can get the condensed buckets without materializing the intervals * Get rid of GranularitySpec::condensedInputIntervals ... not needed * Travis failures fixes * Travis checkstyle fix * Edited/added javadoc comments and a method name (code review feedback) * Fixed jacoco coverage by moving class and adding more coverage * Avoid materializing the condensed intervals when locking * Deal with overlapping intervals * Remove code and use library code instead * Refactor intervals by granularity using the FluentIterable, add sanity checks * Change !hasNext() to inputIntervals().isEmpty() * Remove redundant lambda * Use materialized intervals here since this is outside the overlord (for performance) * Name refactor to reflect the fact that bucket intervals are sorted. * Style fixes * Removed redundant method and have condensedIntervalIterator throw IAE when element is null for consistency with other methods in this class (as well that null interval when condensing does not make sense) * Remove forbidden api * Move helper class inside common base class to reduce public space pollution	2021-01-29 06:02:10 -08:00
Abhishek Agarwal	0080e333cc	Fix cardinality estimation (#10762 ) * Fix cardinality estimation * Add unit test * code coverage * fix typo	2021-01-28 15:06:10 -08:00
Maytas Monsereenusorn	a46d561bd7	Fix byte calculation for maxBytesInMemory to take into account of Sink/Hydrant Object overhead (#10740 ) * Fix byte calculation for maxBytesInMemory to take into account of Sink/Hydrant Object overhead * Fix byte calculation for maxBytesInMemory to take into account of Sink/Hydrant Object overhead * Fix byte calculation for maxBytesInMemory to take into account of Sink/Hydrant Object overhead * Fix byte calculation for maxBytesInMemory to take into account of Sink/Hydrant Object overhead * fix checkstyle * Fix byte calculation for maxBytesInMemory to take into account of Sink/Hydrant Object overhead * Fix byte calculation for maxBytesInMemory to take into account of Sink/Hydrant Object overhead * fix test * fix test * add log * Fix byte calculation for maxBytesInMemory to take into account of Sink/Hydrant Object overhead * address comments * fix checkstyle * fix checkstyle * add config to skip overhead memory calculation * add test for the skipBytesInMemoryOverheadCheck config * add docs * fix checkstyle * fix checkstyle * fix spelling * address comments * fix travis * address comments	2021-01-27 00:34:56 -08:00
zhangyue19921010	2590ad4f67	Historical unloads damaged segments automatically when lazy on start. (#10688 ) * ready to test * tested on dev cluster * tested * code review * add UTs * add UTs * ut passed * ut passed * opti imports * done * done * fix checkstyle * modify uts * modify logs * changing the package of SegmentLazyLoadFailCallback.java to org.apache.druid.segment * merge from master * modify import orders * merge from master * merge from master * modify logs * modify docs * modify logs to rerun ci * modify logs to rerun ci * modify logs to rerun ci * modify logs to rerun ci * modify logs to rerun ci * modify logs to rerun ci * modify logs to rerun ci * modify logs to rerun ci Co-authored-by: yuezhang <yuezhang@freewheel.tv>	2021-01-16 19:53:30 -08:00
Jihoon Son	95065bdf1a	Bump dev version to 0.22.0-SNAPSHOT (#10759 )	2021-01-15 13:16:23 -08:00
Jihoon Son	ca32652932	Fix potential deadlock in batch ingestion (#10736 ) * Fix potential deadlock in batch ingestion * fix checkstyle and comment * this is better	2021-01-12 12:50:45 -08:00
Xavier Léauté	118b50195e	Introduce KafkaRecordEntity to support Kafka headers in InputFormats (#10730 ) Today Kafka message support in streaming indexing tasks is limited to message values, and does not provide a way to expose Kafka headers, timestamps, or keys, which may be of interest to more specialized Druid input formats. For instance, Kafka headers may be used to indicate payload format/encoding or additional metadata, and timestamps are often omitted from values in Kafka streams applications, since they are included in the record. This change proposes to introduce KafkaRecordEntity as InputEntity, which would give input formats full access to the underlying Kafka record, including headers, key, timestamps. It would also open access to low-level information such as topic, partition, offset if needed. KafkaEntity is a subclass of ByteEntity for backwards compatibility with existing input formats, and to avoid introducing unnecessary complexity for Kinesis indexing tasks.	2021-01-08 16:04:37 -08:00
Liran Funaro	08ab82f55c	IncrementalIndex Tests and Benchmarks Parametrization (#10593 ) * Remove redundant IncrementalIndex.Builder * Parametrize incremental index tests and benchmarks - Reveal and fix a bug in OffheapIncrementalIndex * Fix forbiddenapis error: Forbidden method invocation: java.lang.String#format(java.lang.String,java.lang.Object[]) [Uses default locale] * Fix Intellij errors: declared exception is never thrown * Add documentation and validate before closing objects on tearDown. * Add documentation to OffheapIncrementalIndexTestSpec * Doc corrections and minor changes. * Add logging for generated rows. * Refactor new tests/benchmarks. * Improve IncrementalIndexCreator documentation * Add required tests for DataGenerator * Revert "rollupOpportunity" to be a string	2021-01-07 22:18:47 -08:00
Jonathan Wei	68bb038b31	Multiphase segment merge for IndexMergerV9 (#10689 ) * Multiphase merge for IndexMergerV9 * JSON fix * Cleanup temp files * Docs * Address logging and add IT * Fix spelling and test unloader datasource name	2021-01-05 22:19:09 -08:00
Jihoon Son	ea2d51d61f	Better error message for compaction task when it sees no segments for compaction (#10728 )	2021-01-05 16:49:57 -08:00
Jonathan Wei	769c21cc87	Add sample method to IndexingServiceClient (#10729 ) * Add sample method to IndexingServiceClient * Add unit test * Fix LGTM	2021-01-05 15:02:44 -08:00
Clint Wylie	da0eabaa01	integration test for coordinator and overlord leadership client (#10680 ) * integration test for coordinator and overlord leadership, added sys.servers is_leader column * docs * remove not needed * fix comments * fix compile heh * oof * revert unintended * fix tests, split out docker-compose file selection from starting cluster, use docker-compose down to stop cluster * fixes * style * dang * heh * scripts are hard * fix spelling * fix thing that must not matter since was already wrong ip, log when test fails * needs more heap * fix merge * less aggro	2020-12-17 22:50:12 -08:00
Gian Merlino	96a387d972	Fixes and tests related to the Indexer process. (#10631 ) * Fixes and tests related to the Indexer process. Three bugs fixed: 1) Indexers would not announce themselves as segment servers if they did not have storage locations defined. This used to work, but was broken in #9971. Fixed this by adding an "isSegmentServer" method to ServerType and updating SegmentLoadDropHandler to always announce if this method returns true. 2) Certain batch task types were written in a way that assumed "isReady" would be called before "run", which is not guaranteed. In particular, they relied on it in order to initialize "taskLockHelper". Fixed this by updating AbstractBatchIndexTask to ensure "isReady" is called before "run" for these tasks. 3) UnifiedIndexerAppenderatorsManager did not properly handle complex datasources. Introduced DataSourceAnalysis in order to fix this. Test changes: 1) Add a new "docker-compose.cli-indexer.yml" config that spins up an Indexer instead of a MiddleManager. 2) Introduce a "USE_INDEXER" environment variable that determines if docker-compose will start up an Indexer or a MiddleManager. 3) Duplicate all the jdk8 tests and run them in both MiddleManager and Indexer mode. 4) Various adjustments to encourage fail-fast errors in the Docker build scripts. 5) Various adjustments to speed up integration tests and reduce memory usage. 6) Add another Mac-specific approach to determining a machine's own IP. This was useful on my development machine. 7) Update segment-count check in ITCompactionTaskTest to eliminate a race condition (it was looking for 6 segments, which only exist together briefly, until the older 4 are marked unused). Javadoc updates: 1) AbstractBatchIndexTask: Added javadocs to determineLockGranularityXXX that make it clear when taskLockHelper will be initialized as a side effect. (Related to the second bug above.) 2) Task: Clarified that "isReady" is not guaranteed to be called before "run". It was already implied, but now it's explicit. 3) ZkCoordinator: Clarified deprecation message. 4) DataSegmentServerAnnouncer: Clarified deprecation message. * Fix stop_cluster script. * Fix sanity check in script. * Fix hashbang lines. * Test and doc adjustments. * Additional tests, and adjustments for tests. * Split ITs back out. * Revert change to druid_coordinator_period_indexingPeriod. * Set Indexer capacity to match MM. * Bump up Historical memory. * Bump down coordinator, overlord memory. * Bump up Broker memory.	2020-12-08 16:02:26 -08:00
Gian Merlino	9acab0b646	DruidInputSource: Sort segments by ID before grouping into splits. (#10646 ) This is useful because it groups up segments for the same time chunk into the same splits, which in turn is useful because it minimizes the number of time chunks that each task will have to deal with.	2020-12-07 13:48:24 -08:00
egor-ryashin	f46cc4faaf	Revert "fixed input source sampler buildReader exp" This reverts commit `e688db8`	2020-12-07 18:34:59 +03:00
egor-ryashin	e688db8503	fixed input source sampler buildReader exp	2020-12-07 18:28:25 +03:00
Gian Merlino	b7641f644c	Two fixes related to encoding of % symbols. (#10645 ) * Two fixes related to encoding of % symbols. 1) TaskResourceFilter: Don't double-decode task ids. request.getPathSegments() returns already-decoded strings. Applying StringUtils.urlDecode on top of that causes erroneous behavior with '%' characters. 2) Update various ThreadFactoryBuilder name formats to escape '%' characters. This fixes situations where substrings starting with '%' are erroneously treated as format specifiers. ITs are updated to include a '%' in extra.datasource.name.suffix. * Avoid String.replace. * Work around surefire bug. * Fix xml encoding. * Another try at the proper encoding. * Give up on the emojis. * Less ambitious testing. * Fix an additional problem. * Adjust encodeForFormat to return null if the input is null.	2020-12-06 22:35:11 -08:00
Gian Merlino	17f39ab91e	Fix misspellings in druid-forbidden-apis. (#10634 ) These caused certain APIs to not actually be properly forbidden. Also removed two MoreExecutors entries for methods that don't exist in our version of Guava.	2020-12-05 15:26:57 -08:00
Jihoon Son	7462b0b953	Allow missing intervals for Parallel task with hash/range partitioning (#10592 ) * Allow missing intervals for Parallel task * fix row filter * fix tests * fix log	2020-11-25 14:50:22 -08:00

1 2 3 4 5 ...

1824 Commits