* Monomorphic processing of topN queries with simple double aggregators and historical segments
* Add CalledFromHotLoop annocations to specialized methods in SimpleDoubleBufferAggregator
* Fix a bug in Historical1SimpleDoubleAggPooledTopNScannerPrototype
* Fix a bug in SpecializationService
* In SpecializationService, emit maxSpecializations warning only once
* Make GenericIndexed.theBuffer final
* Address comments
* Newline
* Reapply 439c906 (Make GenericIndexed.theBuffer final)
* Remove extra PooledTopNAlgorithm.capabilities field
* Improve CachingIndexed.inspectRuntimeShape()
* Fix CompressedVSizeIntsIndexedSupplier.inspectRuntimeShape()
* Don't override inspectRuntimeShape() in subclasses of CompressedVSizeIndexedInts
* Annotate methods in specializations of DimensionSelector and FloatColumnSelector with @CalledFromHotLoop
* Make ValueMatcher to implement HotLoopCallee
* Doc fix
* Fix inspectRuntimeShape() impl in ExpressionSelectors
* INFO logging of specialization events
* Remove modificator
* Fix OrFilter
* Fix AndFilter
* Refactor PooledTopNAlgorithm.scanAndAggregate()
* Small refactoring
* Add 'nothing to inspect' messages in empty HotLoopCallee.inspectRuntimeShape() implementations
* Don't care about runtime shape in tests
* Fix accessor bugs in Historical1SimpleDoubleAggPooledTopNScannerPrototype and HistoricalSingleValueDimSelector1SimpleDoubleAggPooledTopNScannerPrototype, cover them with tests
* Doc wording
* Address comments
* Remove MagicAccessorBridge and ensure Offset subclasses are public
* Attach error message to element
* Make Errorprone the default compiler
* Address comments
* Make Error Prone's ClassCanBeStatic rule a error
* Preconditions allow only %s pattern
* Fix DruidCoordinatorBalancerTester
* Try to give the compiler more memory
* Remove distribution module activation on jdk 1.8 because only jdk 1.8 is used now
* Don't show compiler warnings
* Try different travis script
* Fix travis.yml
* Make Error Prone optional again
* For error-prone compiler
* Increase compiler's maxmem
* Don't run Error Prone for benchmarks because of OOM
* Skip install step in Travis
* Remove MetricHolder.writeToChannel()
* In travis.yml, check compilation before tests, because it may fail faster
* coordinator lookups mgmt improvements
* revert replaces removal, deprecate it instead
* convert and use older specs stored in db
* more tests and updates
* review comments
* add behavior for 0.10.0 to 0.9.2 downgrade
* incorporating more review comments
* remove explicit lock and use LifecycleLock in LookupReferencesManager. use LifecycleLock in LookupCoordinatorManager as well
* wip on LookupCoordinatorManager
* lifecycle lock
* refactor thread creation into utility method
* more review comments addressed
* support smooth roll back of lookup snapshots from 0.10.0 to 0.9.2
* correctly use LifecycleLock in LookupCoordinatorManager and remove synchronization from start/stop
* run lookup mgmt on leader coordinator only
* wip: changes to do multiple start() and stop() on LookupCoordinatorManager
* lifecycleLock fix usage in LookupReferencesManagerTest
* add LifecycleLock back
* fix license hdr
* some fixes
* make LookupReferencesManager.getAllLookupsState() consistent while still being lockless
* address review comments
* addressing leventov's comments
* address charle's comments
* add IOE.java
* for safety in LookupReferencesManager mainThread check for lifecycle started state on each loop in addition to interrupt
* move thread creation utility method to Execs
* fix names
* add tests for LookupCoordinatorManager.lookupManagementLoop()
* add further tests for figuring out toBeLoaded and toBeDropped on LookupCoordinatorManager
* address leventov comments
* remove LookupsStateWithMap and parameterize LookupsState
* address review comments
* address more review comments
* misc fixes
* Make timeout behavior consistent to document
* Refactoring BlockingPool and add more methods to QueryContexts
* remove unused imports
* Addressed comments
* Address comments
* remove unused method
* Make default query timeout configurable
* Fix test failure
* Change timeout from period to millis
* Minor bug fixes in GenericIndexed; Refactor and optimize GenericIndexed; Remove some unnecessary ByteBuffer duplications in some deserialization paths; Add ZeroCopyByteArrayOutputStream
* Fixes
* Move GenericIndexedWriter.writeLongValueToOutputStream() and writeIntValueToOutputStream() to SerializerUtils
* Move constructors
* Add GenericIndexedBenchmark
* Comments
* Typo
* Note in Javadoc that IntermediateLongSupplierSerializer, LongColumnSerializer and LongMetricColumnSerializer are thread-unsafe
* Use primitive collections in IntermediateLongSupplierSerializer instead of BiMap
* Optimize TableLongEncodingWriter
* Add checks to SerializerUtils methods
* Don't restrict byte order in SerializerUtils.writeLongToOutputStream() and writeIntToOutputStream()
* Update GenericIndexedBenchmark
* SerializerUtils.writeIntToOutputStream() and writeLongToOutputStream() separate for big-endian and native-endian
* Add GenericIndexedBenchmark.indexOf()
* More checks in methods in SerializerUtils
* Use helperBuffer.arrayOffset()
* Optimizations in SerializerUtils
* Allow compilation as Java8 source and target for everything except API
* Remove conditions in tests which assume that we may run with Java 7
* Update easymock to 3.4
* Make Animal Sniffer to check Java 1.8 usage; remove redundant druid-caffeine-cache configuration
* Use try-with-resources in LargeColumnSupportedComplexColumnSerializerTest.testSanity()
* Remove java7 special for druid-api
* Add PostAggregators to generator cache keys for top-n queries
* Add tests for strings
* Remove debug comments
* Add type keys and list sizes to cache key
* Make post aggregators used for sort are considered for cache key generation
* Use assertArrayEquals()
* Improve findPostAggregatorsForSort()
* Address comments
* fix test failure
* address comments
* Specialize LoadBalancingPool as MemcacheClientPool, reduce locking and don't override Object.finalize()
* Remove locking and don't override Object.finalize() in ReferenceCountingResourceHolder
* Add leak counts in ReferenceCountingResourceHolder and MemcacheClientPool. Add tests for ReferenceCountingResourceHolder and MemcacheClientPool
* Fix a race condition in ReferenceCountingResourceHolder.increment()
* Add virtual column types, holder serde, and safety features.
Virtual columns:
- add long, float, dimension selectors
- put cache IDs in VirtualColumnCacheHelper
- adjust serde so VirtualColumns can be the holder object for Jackson
- add fail-fast validation for cycle detection and duplicates
- add expression virtual column in core
Storage adapters:
- move virtual column hooks before checking base columns, to prevent surprises
when a new base column is added that happens to have the same name as a
virtual column.
* Fix ExtractionDimensionSpecs with virtual dimensions.
* Fix unused imports.
* CR comments
* Merge one more time, with feeling.
* Removing unused code from io.druid.java.util.common.guava package; fix#3563 (more consistent and paranoiac resource handing in Sequences subsystem); Add Sequences.wrap() for DRY in MetricsEmittingQueryRunner, CPUTimeMetricQueryRunner and SpecificSegmentQueryRunner; Catch MissingSegmentsException in SpecificSegmentQueryRunner's yielder.next() method (follow up on #3617)
* Make Sequences.withEffect() execute the effect if the wrapped sequence throws exception from close()
* Fix strange code in MetricsEmittingQueryRunner
* Add comment on why YieldingSequenceBase is used in Sequences.withEffect()
* Use Closer in OrderedMergeSequence and MergeSequence to close multiple yielders
* Filters: Use ColumnSelectorFactory directly for building row-based matchers.
* Adjustments based on code review.
- BoundDimFilter: fewer volatiles, rename matchesAnything to !matchesNothing.
- HavingSpecs: Clarify that they are not thread-safe, and make DimFilterHavingSpec
not thread safe.
- Renamed rowType to rowSignature.
- Added specializations for time-based vs non-time-based DimensionSelector in RBCSF.
- Added convenience method DimensionHanderUtils.createColumnSelectorPlus.
- Added singleton ZeroIndexedInts.
- Added test cases for DimFilterHavingSpec.
* Make ValueMatcherColumnSelectorStrategy actually use the associated selector.
* Add RangeIndexedInts.
* DimFilterHavingSpec: Fix concurrent usage guard on jdk7.
* Add assertion to ZeroIndexedInts.
* Rename no-longer-volatile members.
* add first and last aggregator
* add test and fix
* moving around
* separate aggregator valueType
* address PR comment
* add finalize inner query and adjust v1 inner indexing
* better test and fixes
* java-util import fixes
* PR comments
* Add first/last aggs to ITWikipediaQueryTest
* Enable parallel test
* Remove unnecessary NotThreadSafe annocation
* Randomize the start port when finding available ports
* Fix test failure
* Change to handle all negatives
Excludes tests from AvoidStaticImport, since those are used often there and
I didn't want to make this changeset too large. Production code use was minimal
and I switched those to non-static imports.
* Support string type in math expression
addressed comments
addressed comments
Addressed comments
* Updated math function document
* Addressed comments
* Supports expression-paramed aggregator (squashed and rebased on master) also includes math post aggregator (was #2820)
* Addressed comments
* addressed comments
This is useful for the insert-segment-to-db tool, which would otherwise
potentially insert a lot of overshadowed segments as "used", causing
load and drop churn in the cluster.
1. Wrap temporaryStorage in a resource holder, to avoid spurious "Closed"
errors from already-running processing tasks.
2. Exit early from the merging accumulator if the query is cancelled.
* fix ConcurrentModificationException in CachingClusteredClient.run()
* obtain new copy of PartitionHolder to avoid potential multi-threads read/write issue
Refcounting prevents releasing the merge buffer, or closing the concurrent
grouper, before the processing threads have all finished. The better
error handling prevents an avalanche of per-runner exceptions when grouping
resources are exhausted, by grouping those all up into a single merged
exception.
This patch introduces a GroupByStrategy concept and two strategies: "v1"
is the current groupBy strategy and "v2" is a new one. It also introduces
a merge buffers concept in DruidProcessingModule, to try to better
manage memory used for merging.
Both of these are described in more detail in #2987.
There are two goals of this patch:
1. Make it possible for historical/realtime nodes to return larger groupBy
result sets, faster, with better memory management.
2. Make it possible for brokers to merge streams when there are no order-by
columns, avoiding materialization.
This patch does not do anything to help with memory management on the broker
when there are order-by columns or when there are nested queries. That could
potentially be done in a future patch.