* Fix four bugs with numeric dimension output types.
This patch includes the following bug fixes:
- TopNColumnSelectorStrategyFactory: Cast dimension values to the output type
during dimExtractionScanAndAggregate instead of updateDimExtractionResults.
This fixes a bug where, for example, grouping on doubles-cast-to-longs would
fail to merge two doubles that should have been combined into the same long value.
- TopNQueryEngine: Use DimExtractionTopNAlgorithm when treating string columns
as numeric dimensions. This fixes a similar bug: grouping on string-cast-to-long
would fail to merge two strings that should have been combined.
- GroupByQuery: Cast numeric types to the expected output type before comparing them
in compareDimsForLimitPushDown. This fixes #6123.
- GroupByQueryQueryToolChest: Convert Jackson-deserialized dimension values into
the proper output type. This fixes an inconsistency between cached and uncached
results: for example, Jackson sometimes deserializes integers as Integers and
sometimes as Longs.
And the following code-cleanup changes, related to the fixes above:
- DimensionHandlerUtils: Introduce convertObjectToType, compareObjectsAsType,
and converterFromTypeToType to make it easier to handle casting operations
(see the sketch after this list).
- TopN in general: Rename various "dimName" variables to "dimValue" where they
actually represent dimension values. The old names were confusing.
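As a rough sketch of the conversion idea behind these fixes (illustrative only, not the actual DimensionHandlerUtils code):

```java
// Illustrative sketch: coerce a dimension value to the declared output type (long
// here) before merging or comparing, so values that differ only in their raw
// representation land in the same grouping bucket.
final class OutputTypeCoercionSketch
{
  static Long coerceToLong(Object value)
  {
    if (value == null) {
      return null;
    } else if (value instanceof Number) {
      return ((Number) value).longValue();
    } else if (value instanceof String) {
      // Parse as double first so strings like "2.0" convert cleanly, then truncate.
      return (long) Double.parseDouble((String) value);
    } else {
      throw new IllegalArgumentException("Cannot coerce " + value.getClass() + " to long");
    }
  }

  public static void main(String[] args)
  {
    // Without coercion, 2.3 (a Double) and "2" (a String) occupy separate buckets;
    // cast to the long output type, both merge into the bucket for 2.
    System.out.println(coerceToLong(2.3)); // 2
    System.out.println(coerceToLong("2")); // 2
  }
}
```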
* Remove unused imports.
* Add PostgreSQLConnectorConfig to expose SSL configuration options for the Postgres Metadata Storage module (example configuration below).
* Fix checkstyle violations and add license header
* Convert properties in the postgres docs to their full property paths and fix a typo
* Fix grammar in sslFactory docs
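For context, a hypothetical runtime.properties snippet using this kind of option; the exact property names and values here are assumptions, so check the updated postgres docs:

```properties
# Assumed property paths for the PostgreSQL metadata storage extension's SSL options.
druid.metadata.postgres.ssl.useSSL=true
druid.metadata.postgres.ssl.sslMode=verify-full
druid.metadata.postgres.ssl.sslFactory=org.postgresql.ssl.jdbc4.LibPQFactory
```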
* 'shutdownAllTasks' API for a dataSource
Change-Id: I30d14390457d39e0427d23a48f4f224223dc5777
* fix api path and return
Change-Id: Ib463f31ee2c4cb168cf2697f149be845b57c42e5
* optimize implementation
Change-Id: I50a8dcd44dd9d36c9ecbfa78e103eb9bff32eab9
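As a usage sketch, hitting the endpoint with plain JDK HTTP; the path follows my reading of the Overlord API convention, and the host, port, and dataSource name are placeholders:

```java
import java.net.HttpURLConnection;
import java.net.URL;

public class ShutdownAllTasksExample
{
  public static void main(String[] args) throws Exception
  {
    // Placeholder Overlord address and dataSource name.
    URL url = new URL("http://localhost:8090/druid/indexer/v1/datasources/wikipedia/shutdownAllTasks");
    HttpURLConnection conn = (HttpURLConnection) url.openConnection();
    conn.setRequestMethod("POST"); // shutting down tasks changes state, hence POST
    System.out.println("Response code: " + conn.getResponseCode());
    conn.disconnect();
  }
}
```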
* Fix three bugs with segment publishing.
1. In AppenderatorImpl: always use a unique path if requested, even if the segment
was already pushed. This is important because, otherwise, it causes the issue
mentioned in #6124.
2. In IndexerSQLMetadataStorageCoordinator: Fix a bug that could cause it to return
a "not published" result instead of throwing an exception when a metadata update
failure was followed by an unrelated exception. This is done by resetting the
AtomicBoolean that tracks which case we're in each time the callback runs.
3. In BaseAppenderatorDriver: Only kill segments if we get an affirmative false
publish result; skip killing if we merely got an exception. The reason is that we
want to avoid killing segments that are in an unknown state.
Two other changes to clarify the contracts a bit and hopefully prevent future bugs:
1. Return SegmentPublishResult from TransactionalSegmentPublisher, to make it
more similar to announceHistoricalSegments.
2. Make it explicit, at multiple levels of javadocs, that a "false" publish result
must indicate that the publish _definitely_ did not happen. Unknown states must be
exceptions. This helps BaseAppenderatorDriver do the right thing (see the sketch
after this list).
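A self-contained sketch of the clarified contract; these are not the real Druid classes, just an illustration of the intent:

```java
import java.util.Set;

// "false" must mean the publish definitely did not happen, and unknown outcomes
// must surface as exceptions rather than as a return value.
interface Publisher
{
  PublishResult publish(Set<String> segmentIds) throws Exception;
}

final class PublishResult
{
  final boolean success;

  PublishResult(boolean success)
  {
    this.success = success;
  }
}

final class DriverSketch
{
  void publishAndCleanUp(Publisher publisher, Set<String> segmentIds) throws Exception
  {
    // Any exception from publish() propagates: the segments are then in an unknown
    // state, so we must not kill them.
    final PublishResult result = publisher.publish(segmentIds);

    if (!result.success) {
      // An affirmative "false": the publish definitely did not happen; killing is safe.
      killSegments(segmentIds);
    }
  }

  private void killSegments(Set<String> segmentIds)
  {
    // Placeholder for the actual segment cleanup.
  }
}
```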
* Remove javadoc-only import.
* Updates.
* Fix test.
* Fix tests.
* Fix missing exception handling in `io.druid.java.util.http.client.netty.HttpClientPipelineFactory`
* Extend SimpleChannelUpstreamHandler, remove sendUpstream, and use ExpectedException.
* Add more checks for channel
* Fix missing exception handler in NettyHttpClient and ChannelResourceFactory
* Rename the anonymous `SimpleChannelUpstreamHandler` instance to connectionErrorHandler and use `addLast` instead of `addFirst`
* Remove `removeHandlers()`
* Using expectedException.expect instead of Assert.assertNotNull in testHttpsEchoServer
* Using handshakeFuture.setFailure instead of logger
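A sketch of the resulting wiring, using Netty 3.x APIs (which this HTTP client is built on); the handler name and the close-on-error body are illustrative:

```java
import org.jboss.netty.channel.ChannelHandlerContext;
import org.jboss.netty.channel.ChannelPipeline;
import org.jboss.netty.channel.ExceptionEvent;
import org.jboss.netty.channel.SimpleChannelUpstreamHandler;

final class ConnectionErrorHandlerSketch
{
  static void install(ChannelPipeline pipeline)
  {
    // addLast (not addFirst) so this handler sees exceptions that the upstream
    // protocol handlers did not consume.
    pipeline.addLast(
        "connectionErrorHandler",
        new SimpleChannelUpstreamHandler()
        {
          @Override
          public void exceptionCaught(ChannelHandlerContext ctx, ExceptionEvent event)
          {
            // Close the channel instead of letting the error vanish silently.
            if (ctx.getChannel() != null && ctx.getChannel().isOpen()) {
              ctx.getChannel().close();
            }
          }
        }
    );
  }
}
```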
* Cache: Add maxEntrySize config.
The idea is that this makes it more feasible to cache query types that can
potentially generate large result sets, like groupBy and select, without fear of
writing too much to the cache per query.
Includes a refactor of cache population code in CachingQueryRunner and
CachingClusteredClient, such that they now use the same CachePopulator
interface with two implementations: one for foreground and one for
background.
The main reason for splitting the foreground and background impls is that the
foreground impl can enforce maxEntrySize more effectively: it can stop retaining
subvalues for the cache early (see the sketch below).
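A minimal sketch of that foreground behavior (illustrative; not the actual CachePopulator interface):

```java
import java.io.ByteArrayOutputStream;
import java.util.function.BiConsumer;

final class ForegroundPopulatorSketch
{
  private final BiConsumer<String, byte[]> cache; // stands in for the real cache
  private final long maxEntrySize;

  ForegroundPopulatorSketch(BiConsumer<String, byte[]> cache, long maxEntrySize)
  {
    this.cache = cache;
    this.maxEntrySize = maxEntrySize;
  }

  void populate(String key, Iterable<byte[]> subvalues)
  {
    final ByteArrayOutputStream buffer = new ByteArrayOutputStream();
    for (byte[] subvalue : subvalues) {
      if ((long) buffer.size() + subvalue.length > maxEntrySize) {
        // Entry would exceed maxEntrySize: stop retaining subvalues right away
        // instead of buffering the full (possibly huge) result.
        return;
      }
      buffer.write(subvalue, 0, subvalue.length);
    }
    cache.accept(key, buffer.toByteArray());
  }
}
```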
* Add CachePopulatorStats.
* Fix whitespace.
* Fix docs.
* Fix various tests.
* Add tests.
* Fix tests.
* Better tests
* Remove conflict markers.
* Fix licenses.
* Native parallel indexing without shuffle
* fix build
* fix ci
* fix ingestion without intervals
* fix retry
* add it test
* use chat handler
* fix build
* add docs
* fix ITUnionQueryTest
* fix failures
* disable metrics reporting
* working
* Fix split of static-s3 firehose
* Add endpoints to supervisor task and a unit test for endpoints
* increase timeout in test
* Added doc
* Address comments
* Fix overlapping locks
* address comments
* Fix static s3 firehose
* Fix test
* fix build
* fix test
* fix typo in docs
* add missing maxBytesInMemory to doc
* address comments
* fix race in test
* fix test
* Rename to ParallelIndexSupervisorTask
* fix teamcity
* address comments
* Fix license
* addressing comments
* addressing comments
* indexTaskClient-based segmentAllocator instead of CountingActionBasedSegmentAllocator
* Fix race in TaskMonitor and move HTTP endpoints to supervisorTask from runner
* Add more javadocs
* use StringUtils.nonStrictFormat for logging
* fix typo and remove unused class
* fix tests
* change package
* fix strict build
* tmp
* Fix overlord api according to the recent change in master
* Fix it test
* order using IncrementalIndexRowComparator at persist time when rollup is disabled, allowing increased effectiveness of dimension compression; resolves #6066
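The intuition, as a tiny illustrative example (not the actual comparator): sorting rows dimension-by-dimension at persist time creates runs of equal values per column, which column compression can exploit:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

final class PersistOrderingSketch
{
  public static void main(String[] args)
  {
    // Rows sit in insertion order when rollup is disabled...
    List<String[]> rows = new ArrayList<>(Arrays.asList(
        new String[]{"us", "ios"},
        new String[]{"fr", "android"},
        new String[]{"us", "android"},
        new String[]{"fr", "ios"}
    ));
    // ...but sorting by each dimension in turn, as a row comparator would, groups
    // identical values into runs before the segment is written.
    rows.sort(
        Comparator.comparing((String[] r) -> r[0])
            .thenComparing(r -> r[1])
    );
    rows.forEach(r -> System.out.println(String.join(",", r)));
  }
}
```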
* fix stuff from review
* validate that baseDataSource is a non-empty string in DerivativeDataSourceMetadata (sketched below); added a test
* added another test for validating null baseDataSource
* restored the order of the argument checks
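A sketch of the validation, in the Guava Preconditions idiom commonly used in the codebase; the class name and message here are illustrative:

```java
import com.google.common.base.Preconditions;
import com.google.common.base.Strings;

final class DerivativeMetadataSketch
{
  private final String baseDataSource;

  DerivativeMetadataSketch(String baseDataSource)
  {
    // Reject both null and empty strings up front, as the validation above does.
    Preconditions.checkArgument(
        !Strings.isNullOrEmpty(baseDataSource),
        "baseDataSource cannot be null or empty"
    );
    this.baseDataSource = baseDataSource;
  }
}
```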