* Tidy up lifecycle, query, and ingestion logging.
The goal of this patch is to improve the clarity and usefulness of
Druid's logging for cluster operators. For more information, see
https://twitter.com/cowtowncoder/status/1195469299814555648.
Concretely, this patch does the following:
- Changes a lot of INFO logs to DEBUG, and DEBUG to TRACE, with the
goal of reducing redundancy and improving clarity by avoiding
showing rarely-useful log messages. This includes most "starting"
and "stopping" messages, and most messages related to individual
columns.
- Adds new log4j2 templates that show operators how to enabled DEBUG
logging for certain important packages.
- Eliminate stack traces for query errors, unless log level is DEBUG
or more. This is useful because query errors often indicate user
error rather than system error, but dumping stack trace often gave
operators the impression that there was a system failure.
- Adds task id to Appenderator, AppenderatorDriver thread names. In
the default log4j2 configuration, this will put them in log lines
as well. It's very useful if a user is using the Indexer, where
multiple tasks run in the same JVM.
- More consistent terminology when it comes to "sequences" (sets of
segments that are handed-off together by Kafka ingestion) and
"offsets" (cursors in partitions). These terms had been confused in
some log messages due to the fact that Kinesis calls offsets
"sequence numbers".
- Replaces some ugly toString calls with either the JSONification or
something more operator-accessible (like a URL or segment identifier,
instead of JSON object representing the same).
* Adjustments.
* Adjust integration test.
* Startup scripts: verify Java 8 (exactly), improve port/java verification messages.
Java 11 compatibility isn't fully baked yet (users have reported various
issues on Java 11), so block startup with an error message unless Java 8
is found. Allow overriding this decision with an environment variable.
* Message adjustments.
Since it hasn't received updates or community interest in a while, it makes sense
to de-emphasize it in the distribution and most documentation (outside of simple
mentions of its existence).
When following the instructions from Hadoop batch load tutorial in the
docs, building the Docker images fails with:
gzip: stdin: unexpected end of file
tar: Child returned status 1
tar: Error is not recoverable: exiting now
Updating nss allows the curl command for downloading Hadoop to succeed
while building the Docker image.
* Add checkstyle for "Local variable names shouldn't start with capital"
* Adjust some local variables to constants
* Replace StringUtils.LINE_SEPARATOR with System.lineSeparator()
* Upgrade various build and doc links to https.
Where it wasn't possible to upgrade build-time dependencies to https,
I kept http in place but used hardcoded checksums or GPG keys to ensure
that artifacts fetched over http are verified properly.
* Switch to https://apache.org.
* Some adjustments to config examples.
- Add ExitOnOutOfMemoryError to jvm.config examples. It was added a
pretty long time ago (8u92) and is helpful since it prevents zombie
processes from hanging around. (OOMEs tend to bork things)
- Disable Broker caching and enable it on Historicals in example
configs. This config tends to scale better since it enables the
Historicals to merge results rather than sending everything by-segment
to the Broker. Also switch to "caffeine" cache from "local".
- Increase concurrency a bit for Broker example config.
- Enable SQL in the example config, a baby step towards making SQL
more of a thing. (It's still off by default in the code.)
- Reduce memory use a bit for the quickstart configs.
- Add example Router configs, in case someone wants to use that. One
reason might be to get the fancy new console (#6923).
* Add example Router configs.
* Fix up router example properties.
* Add router to quickstart supervise conf.
* Prohibit some guava collection APIs and use JDK APIs directly
* reset files that changed by accident
* sort codestyle/druid-forbidden-apis.txt alphabetically
This PR accumulates many refactorings and small improvements that I did while preparing the next change set of https://github.com/druid-io/druid/projects/2. I finally decided to make them a separate PR to minimize the volume of the main PR.
Some of the changes:
- Renamed confusing "Generic Column" term to "Numeric Column" (what it actually implies) in many class names.
- Generified `ComplexMetricExtractor`
* Adding licenses and enable apache-rat-plugi.
Change-Id: I4685a2d9f1e147855dba69329b286f2d5bee3c18
* restore the copywrite of demo_table and add it to the list of allowed ones
Change-Id: I2a9efde6f4b984bc1ac90483e90d98e71f818a14
* revirew comments
Change-Id: I0256c930b7f9a5bb09b44b5e7a149e6ec48cb0ca
* more fixup
Change-Id: I1355e8a2549e76cd44487abec142be79bec59de2
* align
Change-Id: I70bc47ecb577bdf6b91639dd91b6f5642aa6b02f
* Rename io.druid to org.apache.druid.
* Fix META-INF files and remove some benchmark results.
* MonitorsConfig update for metrics package migration.
* Reorder some dimensions in inner queries for some reason.
* Fix protobuf tests.
* Various changes about druid-services module
* Patch improvements from reviewer
* Add ToArrayCallWithZeroLengthArrayArgument & ArraysAsListWithZeroOrOneArgument into inspection profile
* Fix ArraysAsListWithZeroOrOneArgument
* Fix conflict
* Fix ToArrayCallWithZeroLengthArrayArgument
* Fix AliEqualsAvoidNull
* Remove blank line
* Remove unused import clauses
* Fix code style in TopNQueryRunnerTest
* Fix conflict
* Don't use Collections.singletonList when converting the type of array type
* Add argLine into maven-surefire-plugin in druid-process module & increase the timeout value for testMoveSegment testcase
* Roll back the latest commit
* Add java.io.File#toURL() into druid-forbidden-apis
* Using Boolean.parseBoolean instead of Boolean.valueOf for CliCoordinator#isOverlord
* Add a new regexp element into stylecode xml file
* Fix style error for new regexp
* Set the level of ArraysAsListWithZeroOrOneArgument as WARNING
* Fix style error for new regexp
* Add option BY_LEVEL for ToArrayCallWithZeroLengthArrayArgument in inspection profile
* Roll back the level as ToArrayCallWithZeroLengthArrayArgument as ERROR
* Add toArray(new Object[0]) regexp into checkstyle config file & fix them
* Set the level of ArraysAsListWithZeroOrOneArgument as ERROR & Roll back the level of ToArrayCallWithZeroLengthArrayArgument as WARNING until Youtrack fix it
* Add a comment for string equals regexp in checkstyle config
* Fix code format
* Add RedundantTypeArguments as ERROR level inspection
* Fix cannot resolve symbol datasource
* fix freeSpacePercent in segmentCache.locations
* the check should probably test the other way around
* documentation should put the option in the right place
* examples have a superfluous backslash
* add test to verify correct behavior
* switch to Path and test with jimfs
Path allows to use different filesystems.
Jimfs provides an actual (in memory) filesystem.
This also allows more complex test scenarios.
The behavior should be unchanged by this commit.
* Revert "switch to Path and test with jimfs"
This reverts commit 8b9a418d65.