* DruidInputSource: Fix issues in column projection, timestamp handling.
DruidInputSource, DruidSegmentReader changes:
1) Remove "dimensions" and "metrics". They are not necessary, because we
can compute which columns we need to read based on what is going to
be used by the timestamp, transform, dimensions, and metrics.
2) Start using ColumnsFilter (see below) to decide which columns we need
to read.
3) Actually respect the "timestampSpec". Previously, it was ignored, and
the timestamp of the returned InputRows was set to the `__time` column
of the input datasource.
(1) and (2) together fix a bug in which the DruidInputSource would not
properly read columns that are used as inputs to a transformSpec.
(3) fixes a bug where the timestampSpec would be ignored if you attempted
to set the column to something other than `__time`.
(1) and (3) are breaking changes.
Web console changes:
1) Remove "Dimensions" and "Metrics" from the Druid input source.
2) Set timestampSpec to `{"column": "__time", "format": "millis"}` for
compatibility with the new behavior.
Other changes:
1) Add ColumnsFilter, a new class that allows input readers to determine
which columns they need to read. Currently, it's only used by the
DruidInputSource, but it could be used by other columnar input sources
in the future.
2) Add a ColumnsFilter to InputRowSchema.
3) Remove the metric names from InputRowSchema (they were unused).
4) Add InputRowSchemas.fromDataSchema method that computes the proper
ColumnsFilter for given timestamp, dimensions, transform, and metrics.
5) Add "getRequiredColumns" method to TransformSpec to support the above.
* Various fixups.
* Uncomment incorrectly commented lines.
* Move TransformSpecTest to the proper module.
* Add druid.indexer.task.ignoreTimestampSpecForDruidInputSource setting.
* Fix.
* Fix build.
* Checkstyle.
* Misc fixes.
* Fix test.
* Move config.
* Fix imports.
* Fixup.
* Fix ShuffleResourceTest.
* Add import.
* Smarter exclusions.
* Fixes based on tests.
Also, add TIME_COLUMN constant in the web console.
* Adjustments for tests.
* Reorder test data.
* Update docs.
* Update docs to say Druid 0.22.0 instead of 0.21.0.
* Fix test.
* Fix ITAutoCompactionTest.
* Changes from review & from merging.
* no need for intervals
* don't set redundant fields
* fix tests
* better filter control
* work with not
* wrap callout with form group
* update snapshot
* add split hint
* highlight issues with spec
* fixes
* fix default value
* move intervals back to partition step
* work with all sorts of chars
* fix enabled view
- Update playwright to latest version
- Provide environment variable to disable/enable headless mode
- Allow running E2E tests against any druid cluster running on standard
ports (tutorial-batch.spec.ts now uses an absolute instead of relative
path for the input data)
- Provide environment variable to change target web console port
- Druid setup does not need to download zookeeper
Add an E2E test for the web console workflow of reindexing a Druid
datasource to change the secondary partitioning type. The new test
changes dynamic to single dim partitions since the autocompaction test
already does dynamic to hashed partitions.
Also, run the web console E2E tests in parallel to reduce CI time and
change naming convention for test datasources to make it easier to map
them to the corresponding test run.
Main changes:
1) web-consolee2e-tests/reindexing.spec.ts
- new E2E test
2) web-console/e2e-tests/component/load-data/data-connector/reindex.ts
- new data loader connector for druid input source
3) web-console/e2e-tests/component/load-data/config/partition.ts
- move partition spec definitions from compaction.ts
- add new single dim partition spec definition