* SQL: Upgrade to Calcite 1.14.0, some refactoring of internals.
This brings benefits:
- Ability to do GROUP BY and ORDER BY with ordinals.
- Ability to support IN filters beyond 19 elements (fixes#4203).
Some refactoring of druid-sql internals:
- Builtin aggregators and operators are implemented as SqlAggregators
and SqlOperatorConversions rather being special cases. This simplifies
the Expressions and GroupByRules code, which were becoming complex.
- SqlAggregator implementations are no longer responsible for filtering.
Added new functions:
- Expressions: strpos.
- SQL: TRUNCATE, TRUNC, LENGTH, CHAR_LENGTH, STRLEN, STRPOS, SUBSTR,
and DATE_TRUNC.
* Add missing @Override annotation.
* Adjustments for forbidden APIs.
* Adjustments for forbidden APIs.
* Disable GROUP BY alias.
* Doc reword.
* Added org.joda.time.DateTime#(java.lang.String) to forbidden API.
* Added org.joda.time.DateTime#(java.lang.String, org.joda.time.format.DateTimeFormatter) to forbidden API.
* Add additional APIs that may create DateTime with default time zone
* Add helper function that accepts formatter to parse String.
* Add additional forbidden APIs
* Replace existing usage of forbidden APIs
* Use wrapper class to enforce Chronology on DateTimeFormatter.
* Creates constant UtcFormatter for constant ISODateTimeFormat.
* Move caffeine out of extension.
* Remove `JsonTypeName` from the class itself
* Fix bad docs
* Fix distribution pom
* Fix unused import
* Make caffeine default
* Address code comments
* Add more description around the jre version in the readme
* Add suggested comments
* Move emitters from io.druid.server.initialization to the dedicated io.druid.server.emitter package; Update emitter library to 0.6.0; Add support for ParametrizedUriEmitter; Support hierarical properties in JsonConfigurator (was needed for ParametrizedUriEmitter)
* Log created RequestLoggers
* Fix forbidden API
* Test fix
* More Http and Parametrized Http Emitter docs
* Switch to debug level
* Move scan-query from a contrib extension into core.
Based on a proposal at: https://groups.google.com/d/topic/druid-development/ME_OatUDnbk/discussion
This patch also adds support for virtual columns to the Scan query,
and updates Druid SQL to use Scan instead of Select.
This patch also makes some behavioral changes to handling of the __time
column. In particular, it is now is returned as "__time" rather than
"timestamp"; it is no longer included if you do not specifically ask for
it in your "columns"; and it is returned as a long rather than a string.
Users can revert time handling to the legacy extension behavior by
setting "legacy" : true in their queries, or setting the property
druid.query.scan.legacy = true. This is meant to provide a migration
path for users that were formerly using the contrib extension.
* Adjustments from review.
* Add back Select query.
* Adjust SQL docs.
* Restore SelectQuery link.
* add jq expression in the flattenSpec
* more tests
* add benchmark
* fix style
* use JsonNode for both JSONPath and JQ
* clean up
* more clean up
* add documentation
* fix style
* move jackson-jq version to dependencyManagement section. remove commented code
* oops. revert wrong fix
* throw IllegalArgumentException for JQ syntax error
* remove e.printStackTrace() that is forbidden
* touch
* Use trusty for travis jobs.
The distro was set to "precise" in #4572 due to memory issues on trusty,
but we've been seeing performance issues on "precise" recently so let's
see how trusty is working these days.
* Less quiet.
* Adjust memory settings.
* Add back -q option.
* Tweak memory again.
* Adjustments.
* Try squeezing memory a bit more.
* SQL + Expressions = Best friends forever.
- Use expressions as a projection layer for anything that can't be
expressed using traditional Druid extractionFns. Sometimes they're
embedded directly (like "expression" filters, builtin aggregators,
or "expression" post-aggregators). Sometimes they're referenced
through virtual columns (like dimensionSpecs, which can't innately
reference functions of more than one column without the virtual
column layer).
- Add many new functions and operators, taking advantage of the
expression capability (see the querying/sql.md doc).
- Improve consistency of constant reduction and of casting by
using Druid expressions for this instead of Calcite's RexExecutor.
* Fix casting bug, and other code review comments.
* Fix docs.
* Make using implicit system charset an error
* Use StringUtils.toUtf8() and fromUtf8() instead of String.getBytes() and new String()
* Use English locale in StringUtils.safeFormat()
* Restore comment
* Adding s3a schema and s3a implem to hdfs storage module.
* use 2.7.3
* use segment pusher to make loadspec
* move getStorageDir and makeLoad spec under DataSegmentPusher
* fix uts
* fix comment part1
* move to hadoop 2.8
* inject deep storage properties
* set version to 2.7.3
* fix build issue about static class
* fix comments
* fix default hadoop default coordinate
* fix create filesytem
* downgrade aws sdk
* bump the version
* move ProtoBufInputRowParser from processing module to protobuf extensions
* Ported PR #3509
* add DynamicMessage
* fix local test stuff that slipped in
* add license header
* removed redundant type name
* removed commented code
* fix code style
* rename ProtoBuf -> Protobuf
* pom.xml: shade protobuf classes, handle .desc resource file as binary file
* clean up error messages
* pick first message type from descriptor if not specified
* fix protoMessageType null check. add test case
* move protobuf-extension from contrib to core
* document: add new configuration keys, and descriptions
* update document. add examples
* move protobuf-extension from contrib to core (2nd try)
* touch
* include protobuf extensions in the distribution
* fix whitespace
* include protobuf example in the distribution
* example: create new pb obj everytime
* document: use properly quoted json
* fix whitespace
* bump parent version to 0.10.1-SNAPSHOT
* ignore Override check
* touch
* Make Errorprone the default compiler
* Address comments
* Make Error Prone's ClassCanBeStatic rule a error
* Preconditions allow only %s pattern
* Fix DruidCoordinatorBalancerTester
* Try to give the compiler more memory
* Remove distribution module activation on jdk 1.8 because only jdk 1.8 is used now
* Don't show compiler warnings
* Try different travis script
* Fix travis.yml
* Make Error Prone optional again
* For error-prone compiler
* Increase compiler's maxmem
* Don't run Error Prone for benchmarks because of OOM
* Skip install step in Travis
* Remove MetricHolder.writeToChannel()
* In travis.yml, check compilation before tests, because it may fail faster
* Initial commit
* Apply another config: clustername
* Rename variable
* Fix bug
* Add retry logic
* Edit retry logic
* Upgrade kafka-clients version to the most recent release
* Make callback single object
* Write documentation
* Rewrite error message and emit logic
* Handling AlertEvent
* Override toString()
* make clusterName more optional
* bump up druid version
* add producer.config option which make user can apply another optional config value of kafka producer
* remove potential blocking in emit()
* using MemoryBoundLinkedBlockingQueue
* Fixing coding convention
* Remove logging every exception and just increment counting
* refactoring
* trivial modification
* logging when callback has exception
* Replace kafka-clients 0.10.1.1 with 0.10.2.0
* Resolve the problem related of classloader
* adopt try statement
* code reformatting
* make variables final
* rewrite toString
* Monomorphic processing: add HotLoopCallee, CalledFromHotLoop, RuntimeShapeInspector, SpecializationService. Specialize topN queries with 1 or 2 aggregators. Add Cursor.advanceUninterruptibly() and isDoneOrInterrupted() for exception-free query processing.
* Use Execs.singleThreaded()
* RuntimeShapeInspector to support nullable fields
* Make CalledFromHotLoop annotation Inherited
* Remove unnecessary conversion of array of ColumnSelectorPluses to list and back to array in CardinalityAggregatorFactory
* Close InputStream in SpecializationService
* Formatting
* Test specialized PooledTopNScanners
* Set flags in PooledTopNAlgorithm directly
* Fix tests, dependent on CountAggragatorFactory toString() form
* Fix
* Revert CountAggregatorFactory changes
* Implement inspectRuntimeShape() for LongWrappingDimensionSelector and FloatWrappingDimensionSelector
* Remove duplicate RoaringBitmap dependency in the extendedset pom.xml
* Fix
* Treat ByteBuffers specially in StringRuntimeShape
* Doc fix
* Annotate BufferAggregator.init() with CalledFromHotLoop
* Make triggerSpecializationIterationsThreshold an int
* Remove SpecializationService.PerPrototypeClassState.of()
* Add comments
* Limit the amount of specializations that SpecializationService could make
* Add default implementation for BufferAggregator.inspectRuntimeShape(), for compatibility with extensions
* Use more efficient ConcurrentMap's idioms in SpecializationService
* Allow compilation as Java8 source and target for everything except API
* Remove conditions in tests which assume that we may run with Java 7
* Update easymock to 3.4
* Make Animal Sniffer to check Java 1.8 usage; remove redundant druid-caffeine-cache configuration
* Use try-with-resources in LargeColumnSupportedComplexColumnSerializerTest.testSanity()
* Remove java7 special for druid-api
* Require Java 8 and include some Java 8 dependencies.
- Upgrade Jetty to 9.3.16.v20170120.
- Upgrade DataSketches to 0.8.4.
- Bundle caffeine-cache by default.
- Still target Java 7 when compiling base Druid classes.
* Update cluster, quickstart docs.
* Remove oraclejdk7 from travis.yml.
* Add extension for supporting kerberos security
- This PR adds an extension for supporting druid authentication via
Kerberos.
- Working on the docs.
* Add docs
* review comments
* more review comments
* Block all paths by default
* more review comments - use proper Oid
* Allow extensions to override httpclient for integration tests
* Add kerberos lock to prevent multithreaded issues.
* review comment - remove enabled flag and fix router injection
* Add Cookie Handling and more detailed docs
* review comment - rename DruidKerberosConfig -> AuthKerberosConfig
* review comments
* fix travis failure on jdk7
* streaming version of select query
* use columns instead of dimensions and metrics;prepare for valueVector;remove granularity
* respect query limit within historical
* use constant
* fix thread name corrupted bug when using jetty qtp thread rather than processing thread while working with SpecificSegmentQueryRunner
* add some test for scan query
* add scan query document
* fix merge conflicts
* add compactedList resultFormat, this format is better for json ser/der
* respect query timeout
* respect query limit on broker
* use static consts and remove unused code
* Exclude the transitive dependency LGPL jar since it is not needed
* add reason why exclude
* exclude from the root dependency
* add banning tool to enforce exclusions
* Lookup cache refactoring (the main part of druid-io/druid#3667)
* Use PowerMock's static methods in NamespaceLookupExtractorFactoryTest
* Fix KafkaLookupExtractorFactoryTest
* Use VisibleForTesting annotation instead of Javadoc comment
* Create a NamespaceExtractionCacheManager separately for each test in NamespaceExtractionCacheManagersTest
* Rename CacheScheduler.NoCache.ENTRY_DISPOSED to ENTRY_CLOSED
* Reduce visibility of NamespaceExtractionCacheManager.cacheCount() and monitor() implementations, and don't run NamespaceExtractionCacheManagerExecutorsTest with off-heap cache (it didn't before)
* In NamespaceLookupExtractorFactory, use safer idiom to check if CacheState is NoCache or VersionedCache
* More logging in CacheHandler constructor and close(), VersionedCache.close()
* PR comments addressed
* Make CacheScheduler.EntryImpl AutoCloseable, avoid 'dispose' verb in comments, logging and naming in CacheScheduler in favor of 'close'
* More Javadoc comments to CacheScheduler
* Fix NPE
* Remove logging in OnHeapNamespaceExtractionCacheManager.expungeCollectedCaches()
* Make NamespaceExtractionCacheManagersTest.testRacyCreation() to have similar load to what it be before the refactoring
* Unwrap NamespaceExtractionCacheManager.scheduledExecutorService from unneeded MoreExecutors.listeningDecorator() and specify that this is ScheduledThreadPoolExecutor, which ensures happens-before between periodic runs of the tasks
* More comments on MapDbCacheDisposer.disposed
* Replace concat with Long.toString()
* Comment on why NamespaceExtractionCacheManager.scheduledExecutorService() returns ScheduledThreadPoolExecutor
* Place logging statements in VersionedCache.close() and CacheHandler.close() after actual closing logic, because logging may fail
* Make JDBCExtractionNamespaceCacheFactory and StaticMapExtractionNamespaceCacheFactory to try to close newly created VersionedCache if population has failed, as it is done already in URIExtractionNamespaceCacheFactory
* Don't close the whole CacheScheduler.Entry, if the cache update task failed
* Replace AtomicLong updateCounter and firstRunLatch with Phaser-based UpdateCounter in CacheScheduler.EntryImpl