Fix errors related to zulu8 installation for building the Hadoop Docker image in the Load From Apache Hadoop tutorial.
The steps to download zulu8 in the Dockerfile and setup-zulu-repo.sh were replaced with the steps in the Dockerfile released by zulu-openjdk: be45d20302/centos/8u282-8.52.0.23/Dockerfile.
Add support for hadoop 3 profiles . Most of the details are captured in #11791 .
We use a combination of maven profiles and resource filtering to achieve this. Hadoop2 is supported by default and a new maven profile with the name hadoop3 is created. This will allow the user to choose the profile which is best suited for the use case.
* first pass compaction refactor. includes updated behavior for queryGranularity. removes duplicated doc
* fix links, typos, some reorganization
* fix spelling. TBD still there for work in progress
* updates tutorial examples, adds more clarification around compaction use cases
* add granularity spec to automatic compaction config
* final edits
* spelling fixes
* apply suggestions from review
* upadtes from review
* last edits
* move note
* clarify null
* fix links & spelling
* latest review
* edits to auto-compaction config
* add back rollup
* fix links & spelling
* Update compaction.md
add granularityspec to example
* Ability to use mirror of archive.apache.org
* Ability to use mirror of archive.apache.org: documentation
* Ability to use mirror of archive.apache.org: fix int test Dockerfile: missing COPY instruction
* WIP integration tests
* Add integration test for ingestion with transformSpec
* WIP almost working tests
* Add ignored tests
* checkstyle stuff
* remove newPage from index task ingestion spec
* more test cleanup
* still not quite working
* Actually disable the tests
* working tests
* fix codestyle
* dont use junit in integration tests
* actually fix the bug
* fix checkstyle
* bring index tests closer to reindex tests
* Tutorials use new ingestion spec where possible
There are 2 main changes
* Use task type index_parallel instead of index
* Remove the use of parser + firehose in favor of inputFormat + inputSource
index_parallel is the preferred method starting in 0.17. Setting the job to
index_parallel with the default maxNumConcurrentSubTasks(1) is the equivalent
of an index task
Instead of using a parserSpec, dimensionSpec and timestampSpec have been
promoted to the dataSchema. The format is described in the ioConfig as the
inputFormat.
There are a few cases where the new format is not supported
* Hadoop must use firehoses instead of the inputSource and inputFormat
* There is no equivalent of a combining firehose as an inputSource
* A Combining firehose does not support index_parallel
* fix typo
When following the instructions from Hadoop batch load tutorial in the
docs, building the Docker images fails with:
gzip: stdin: unexpected end of file
tar: Child returned status 1
tar: Error is not recoverable: exiting now
Updating nss allows the curl command for downloading Hadoop to succeed
while building the Docker image.
* Upgrade various build and doc links to https.
Where it wasn't possible to upgrade build-time dependencies to https,
I kept http in place but used hardcoded checksums or GPG keys to ensure
that artifacts fetched over http are verified properly.
* Switch to https://apache.org.
* Some adjustments to config examples.
- Add ExitOnOutOfMemoryError to jvm.config examples. It was added a
pretty long time ago (8u92) and is helpful since it prevents zombie
processes from hanging around. (OOMEs tend to bork things)
- Disable Broker caching and enable it on Historicals in example
configs. This config tends to scale better since it enables the
Historicals to merge results rather than sending everything by-segment
to the Broker. Also switch to "caffeine" cache from "local".
- Increase concurrency a bit for Broker example config.
- Enable SQL in the example config, a baby step towards making SQL
more of a thing. (It's still off by default in the code.)
- Reduce memory use a bit for the quickstart configs.
- Add example Router configs, in case someone wants to use that. One
reason might be to get the fancy new console (#6923).
* Add example Router configs.
* Fix up router example properties.
* Add router to quickstart supervise conf.
* Adding licenses and enable apache-rat-plugi.
Change-Id: I4685a2d9f1e147855dba69329b286f2d5bee3c18
* restore the copywrite of demo_table and add it to the list of allowed ones
Change-Id: I2a9efde6f4b984bc1ac90483e90d98e71f818a14
* revirew comments
Change-Id: I0256c930b7f9a5bb09b44b5e7a149e6ec48cb0ca
* more fixup
Change-Id: I1355e8a2549e76cd44487abec142be79bec59de2
* align
Change-Id: I70bc47ecb577bdf6b91639dd91b6f5642aa6b02f
* Rename io.druid to org.apache.druid.
* Fix META-INF files and remove some benchmark results.
* MonitorsConfig update for metrics package migration.
* Reorder some dimensions in inner queries for some reason.
* Fix protobuf tests.