* Fixes and tests related to the Indexer process.
Three bugs fixed:
1) Indexers would not announce themselves as segment servers if they
did not have storage locations defined. This used to work, but was
broken in #9971. Fixed this by adding an "isSegmentServer" method
to ServerType and updating SegmentLoadDropHandler to always announce
if this method returns true.
2) Certain batch task types were written in a way that assumed "isReady"
would be called before "run", which is not guaranteed. In particular,
they relied on it in order to initialize "taskLockHelper". Fixed this
by updating AbstractBatchIndexTask to ensure "isReady" is called
before "run" for these tasks.
3) UnifiedIndexerAppenderatorsManager did not properly handle complex
datasources. Introduced DataSourceAnalysis in order to fix this.
Test changes:
1) Add a new "docker-compose.cli-indexer.yml" config that spins up an
Indexer instead of a MiddleManager.
2) Introduce a "USE_INDEXER" environment variable that determines if
docker-compose will start up an Indexer or a MiddleManager.
3) Duplicate all the jdk8 tests and run them in both MiddleManager and
Indexer mode.
4) Various adjustments to encourage fail-fast errors in the Docker
build scripts.
5) Various adjustments to speed up integration tests and reduce memory
usage.
6) Add another Mac-specific approach to determining a machine's own IP.
This was useful on my development machine.
7) Update segment-count check in ITCompactionTaskTest to eliminate a
race condition (it was looking for 6 segments, which only exist
together briefly, until the older 4 are marked unused).
Javadoc updates:
1) AbstractBatchIndexTask: Added javadocs to determineLockGranularityXXX
that make it clear when taskLockHelper will be initialized as a side
effect. (Related to the second bug above.)
2) Task: Clarified that "isReady" is not guaranteed to be called before
"run". It was already implied, but now it's explicit.
3) ZkCoordinator: Clarified deprecation message.
4) DataSegmentServerAnnouncer: Clarified deprecation message.
* Fix stop_cluster script.
* Fix sanity check in script.
* Fix hashbang lines.
* Test and doc adjustments.
* Additional tests, and adjustments for tests.
* Split ITs back out.
* Revert change to druid_coordinator_period_indexingPeriod.
* Set Indexer capacity to match MM.
* Bump up Historical memory.
* Bump down coordinator, overlord memory.
* Bump up Broker memory.
* Two fixes related to encoding of % symbols.
1) TaskResourceFilter: Don't double-decode task ids. request.getPathSegments()
returns already-decoded strings. Applying StringUtils.urlDecode on
top of that causes erroneous behavior with '%' characters.
2) Update various ThreadFactoryBuilder name formats to escape '%'
characters. This fixes situations where substrings starting with '%'
are erroneously treated as format specifiers.
ITs are updated to include a '%' in extra.datasource.name.suffix.
* Avoid String.replace.
* Work around surefire bug.
* Fix xml encoding.
* Another try at the proper encoding.
* Give up on the emojis.
* Less ambitious testing.
* Fix an additional problem.
* Adjust encodeForFormat to return null if the input is null.
* Integration tests for stream ingestion with various data formats
* fix npe
* better logging; fix tsv
* fix tsv
* exclude kinesis from travis
* some readme
* add kafka admin and kafka writer
* refactor kinesis IT
* fix typo refactor
* parallel
* parallel
* parallel
* parallel works now
* add kafka it
* add doc to readme
* fix tests
* fix failing test
* test
* test
* test
* test
* address comments
* addressed comments
* Test file format extensions for inputSource (orc, parquet)
* Test file format extensions for inputSource (orc, parquet)
* fix path
* resolve merge conflict
* fix typo
* kinesis IT
* Kinesis IT
* Kinesis IT
* Kinesis IT
* Kinesis IT
* Kinesis IT
* Kinesis IT
* Kinesis IT
* Kinesis IT
* Kinesis IT
* Kinesis IT
* Kinesis IT
* Kinesis IT
* Kinesis IT
* Kinesis IT
* fix kinesis timeout
* Kinesis IT
* Kinesis IT
* fix checkstyle
* Kinesis IT
* address comments
* fix checkstyle
* integration test refactor
* integration test refactor
* refactor integration test
* refactor integration test
* refactor integration test
* refactor integration test
* refactor integration test
* refactor integration test
* refactor integration test
* refactor integration test
* address comments
* Address security vulnerabilities CVSS >= 7
Update dependencies to address security vulnerabilities with CVSS scores
of 7 or higher. A new Travis CI job is added to prevent new
high/critical security vulnerabilities from being added.
Updated dependencies:
- api-util 1.0.0 -> 1.0.3
- jackson 2.9.10 -> 2.10.1
- kafka 2.1.0 -> 2.1.1
- libthrift 0.10.0 -> 0.13.0
- protobuf 3.2.0 -> 3.11.0
The following high/critical security vulnerabilities are currently
suppressed (so that the new Travis CI job can be added now) and are left
as future work to fix:
- hibernate-validator:5.2.5
- jackson-mapper-asl:1.9.13
- libthrift:0.6.1
- netty:3.10.6
- nimbus-jose-jwt:4.41.1
* Rename EDL1 license file
* Fix inspection errors
* Fix dependency analyze warnings
Update the maven dependency plugin to the latest version and fix all
warnings for unused declared and used undeclared dependencies in the
compile scope. Added new travis job to add the check to CI. Also fixed
some source code files to use the correct packages for their imports and
updated druid-forbidden-apis to prevent regressions.
* Address review comments
* Adjust scope for org.glassfish.jaxb:jaxb-runtime
* Fix dependencies for hdfs-storage
* Consolidate netty4 versions
Reorganize Travis CI jobs into smaller faster (and more) jobs. Add
various maven options to skip unnecessary work and refactored Travis CI
job definitions to follow DRY.
Detailed changes:
.travis.yml
- Refactor build logic to get rid of copy-and-paste logic
- Skip static checks and enable parallelism for maven install
- Split static analysis into different jobs to ease triage
- Use "name" attribute instead of NAME environment variable
- Split "indexing" and "web console" out of "other modules test"
- Split 2 integration test jobs into multiple smaller jobs
build.sh
- Enable parallelism
- Disable more static checks
travis_script_integration.sh
travis_script_integration_part2.sh
integration-tests/README.md
- Use TestNG groups instead of shell scripts and move definition of jobs
into Travis CI yaml
integration-tests/pom.xml
- Show elapsed time of individual tests to aid in future rebalancing of
Travis CI integration test jobs run time
TestNGGroup.java
- Use TestNG groups to make it easy to have multiple Travis CI
integration test jobs. TestNG groups also make it easier to have an
"other" integration test group and make it less likely a test will
accidentally not be included in a CI job.
IT*Test.java
AbstractITBatchIndexTest.java
AbstractKafkaIndexerTest.java
- Add TestNG group
- Fix various IntelliJ inspection warnings
- Reduce scope of helper methods since the TestNG group annotation on
the class makes TestNG consider all public methods as test methods
pom.xml
- Allow enforce plugin to be run from command-line
- Bump resources plugin version so that "[debug] execute contextualize"
output is correctly suppressed by "mvn -q"
- Bump exec plugin version so that skip property is renamed from "skip"
to "exec.skip"
web-console/pom.xml
- Add property to allow disabling javascript-related work. This property
is overridden in Travis CI to speed up the jobs.
* Fix dependency analyze warnings
Update the maven dependency plugin to the latest version and fix all
warnings for unused declared and used undeclared dependencies in the
compile scope. Added new travis job to add the check to CI. Also fixed
some source code files to use the correct packages for their imports.
* Fix licenses and dependencies
* Fix licenses and dependencies again
* Fix integration test dependency
* Address review comments
* Fix unit test dependencies
* Fix integration test dependency
* Fix integration test dependency again
* Fix integration test dependency third time
* Fix integration test dependency fourth time
* Fix compile error
* Fix assert package
* refactor lookups to be more chill to router
* remove accidental change
* fix and combine LookupIntrospectionResourceTest
* fix inspection
* rename RouterLookupModule to LookupSerdeModule and RouterLookupExtractorFactoryContainerProvider to NoopLookupExtractorFactoryContainerProvider
* make comment generic
* use ConfigResourceFilter instead of StateResourceFilter
* fix indentation
* unused import
* another unused import
* refactor some stuff into processing module, split up LookupModule.java classes into their own files
* Support kafka transactional topics
* update kafka to version 2.0.0
* Remove the skipOffsetGaps option since it's not used anymore
* Adjust kafka consumer to use transactional semantics
* Update tests
* Remove unused import from test
* Fix compilation
* Invoke transaction api to fix a unit test
* temporary modification of travis.yml for debugging
* another attempt to get travis tasklogs
* update kafka to 2.0.1 at all places
* Remove druid-kafka-eight dependency from integration-tests, remove the kafka firehose test and deprecate kafka-eight classes
* Add deprecated in docs for kafka-eight and kafka-simple extensions
* Remove skipOffsetGaps and code changes for transaction support
* Fix indentation
* remove skipOffsetGaps from kinesis
* Add transaction api to KafkaRecordSupplierTest
* Fix indent
* Fix test
* update kafka version to 2.1.0
* Rename io.druid to org.apache.druid.
* Fix META-INF files and remove some benchmark results.
* MonitorsConfig update for metrics package migration.
* Reorder some dimensions in inner queries for some reason.
* Fix protobuf tests.
* Enable parallel test
* Remove unnecessary NotThreadSafe annocation
* Randomize the start port when finding available ports
* Fix test failure
* Change to handle all negatives
- Switch to using Druid parent POM
- Add required fields for Sonatype
- Common plugin versions and settings have been moved to the parent pom
- Cleanup artifacts and POMs for consistent formatting
- Remove org.hyperic.sigar dependency and update docs to reflect necessary jars to add at runtime when sigar is needed