* Add "like" filter.
* Addressed some PR comments.
* Slight simplifications to LikeFilter.
* Additional simplifications.
* Fix comment in LikeFilter.
* Clarify comment in LikeFilter.
* Simplify LikeMatcher a bit.
* No use going through the optimized path if prefix is empty.
* Add more tests.
* Add configs to enable running integration tests for cluster running behind proxy
As part of https://issues.apache.org/jira/browse/KNOX-758
I am working on adding support for proxying druid queries & UIs using
Apache KNOX gateway.
This PR adds support for integration-tests to be run using the proxy
gateway.
Changes Include -
1) Instead of hostName and port, ability to specify url in the config
file.
2) tests now use HTTPClient defined in DruidTestModule that can pass in
the request.
Note - the config changes are backwards compatible and existing configs
should work fine.
* review comments
* review comments
* URIExtractionNamespace: Treat null values in lookup maps as missing entries.
This is useful when many logical lookups are derived from the same base JSON file,
and some lookups' values may be unknown sometimes.
* Add test, logging message, and address other comments.
* Update docs.
* Support string type in math expression
addressed comments
addressed comments
Addressed comments
* Updated math function document
* Addressed comments
* Return better lastVersion from JDBCExtractionNamespaceCacheFactory's cache populator callable
* Return the lastVersion if URI lookup last modified date is not later than the last cached, from URIExtractionNamespaceCacheFactory's cache populator callable
* Fix a race condition in NamespaceExtractionCacheManager.cancelFuture()
* Don't delete cache from NamespaceExtractionCacheManager if the ExtractionNamespaceCacheFactory returned the same version as the last; Better exception treatment in the scheduled cache updater runnable in NamespaceExtractionCacheManager (in particular, don't consume Errors); throw AssertionError in StaticMapExtractionNamespaceCacheFactory if the lastVersion != null)
* In NamespaceExtractionCacheManager, put NamespaceImplData.latestVersion update in the same synchronized() block with swapAndClearCache(id, cacheId); Turn getPostRunnable which returns a callback into a simple updateNamespace() method
* In StaticMapExtractionNamespaceCacheFactory.getCachePopulator(), check the input directly, not inside a callback
* In URIExtractionNamespaceCacheFactory, allow URI last modified time to go backwards
* Better logging in NamespaceExtractionCacheManager
* Add comment on lastVersion nullability in URIExtractionNamespaceCacheFactory
* Remove unused ComplexColumnImpl class
* Remove throws IOException from close() in GenericColumn, ComplexColumn, IndexedFloats and IndexedLongs
* Use concise try-with-resources syntax in several places
* Fix resource leaks (ComplexColumn and GenericColumn) in SegmentAnalyzer, SearchQueryRunner, QueryableIndexIndexableAdapter and QueryableIndexStorageAdapter
* Use Closer in Iterable, returned from QueryableIndexIndexableAdapter.getRows(), in order to try to close everything even if closing some parts thew exceptions
* Supports expression-paramed aggregator (squashed and rebased on master) also includes math post aggregator (was #2820)
* Addressed comments
* addressed comments
* Remove unused numProcessed param from PooledTopNAlgorithm.aggregateDimValue()
* Replace AtomicInteger with simple int in PooledTopNAlgorithm.scanAndAggregate() and aggregateDimValue()
* Remove unused import
Despite the non-thread-safety of HyperLogLogCollector, it is actually currently used
by multiple threads during realtime indexing. HyperUniquesAggregator's "aggregate" and
"get" methods can be called simultaneously by OnheapIncrementalIndex, since its
"doAggregate" and "getMetricObjectValue" methods are not synchronized.
This means that the optimization of HyperLogLogCollector.fold in #3314 (saving and
restoring position rather than duplicating the storage buffer of the right-hand side)
could cause corruption in the face of concurrent writes.
This patch works around the issue by duplicating the storage buffer in "get" before
returning a collector. The returned collector still shares data with the original one,
but the situation is no worse than before #3314. In the future we may want to consider
making a thread safe version of HLLC that avoids these kinds of problems in realtime
indexing. But for now I thought it was best to do a small change that restored the old
behavior.
* Improve performance of StringDimensionMergerV9 and StringDimensionMergerLegacy by avoiding primitive int boxing by using IntIterator in IndexedInts instead of Iterator<Integer>; Extract some common logic for V9 and Legacy mergers; Minor improvements to resource handling in StringDimensionMergerV9
* Don't mask index in MergeIntIterator.makeQueueElement()
* DRY conversion RoaringBitmap's IntIterator to fastutil's IntIterator
* Do implement skip(n) in IntIterators extending AbstractIntIterator because original implementation is not reliable
* Use Test(expected=Exception.class) instead of try { } catch (Exception e) { /* ignore */ }
This is useful for the insert-segment-to-db tool, which would otherwise
potentially insert a lot of overshadowed segments as "used", causing
load and drop churn in the cluster.
* support finding segments from a AWS S3 storage.
* add more Uts
* address comments and add a document for the feature.
* update docs indentation
* update docs indentation
* address comments.
1. add a Ut for json ser/deser for the config object.
2. more informant error message in a Ut.
* address comments.
1. use @Min to validate the configuration object
2. change updateDescriptor to a string as it does not take an argument otherwise
* fix a Ut failure - delete a Ut for testing default max length.