* Introduce "transformSpec" at ingest-time.
It accepts a "filter" (standard query filter object) and "transforms" (a
list of objects with "name" and "expression"). These can be used to do
filtering and single-row transforms without need for a separate data
processing job.
The "expression" fields use the same expression language as other
expression-based feature.
* Remove forbidden api.
* Fix compile error.
* Fix tests.
* Some more changes.
- Add nullable annotation to Firehose.nextRow.
- Add tests for index task, realtime task, kafka task, hadoop mapper,
and ingestSegment firehose.
* Fix bad merge.
* Adjust imports.
* Adjust whitespace.
* Make Transform into an interface.
* Add missing annotation.
* Switch logger.
* Switch logger.
* Adjust test.
* Adjustment to handling for DatasourceIngestionSpec.
* Fix test.
* CR comments.
* Remove unused method.
* Add javadocs.
* More javadocs, and always decorate.
* Fix bug in TransformingStringInputRowParser.
* Fix bad merge.
* Fix ISFF tests.
* Fix DORC test.
* Fix binary serialization in caching
The previous caching code just concatenated a list of objects into a
byte array -- this is actually not valid because jackson-databind uses
enumerated references to strings internally, and concatenating multiple
binary serialized objects can throw off the references.
This change uses a single JsonGenerator to serialize the object list
rather than concatenating byte arrays.
* remove unused imports
* introducing CuratorLoadQueuePeon
* HttpLoadQueuePeon based off of current code
* Revert "Remove SegmentLoaderConfig.numLoadingThreads config (#4829)"
This reverts commit d8b3bfa63c.
* SegmentLoadDropHandler copy/pasted from ZkCoordinator
* Revert "1-based counts in ZkCoordinator (#4917)"
This reverts commit e725ff4146.
* remove non-zk part from ZkCoordinator
* remove zk part from SegmentLoadDropHandler
* additional changes for segment load/drop management with http
* address review comments
* add some more logs
* Execs class is moved
* Only consider loaded replicants when computing replication status.
This affects the computation of segment/underReplicated/count and
segment/unavailable/count, as well as the loadstatus?simple and
loadstatus?full APIs.
I'm not sure why they currently consider segments in the load
queues, but it would make more sense to me if they only considered
segments that are actually loaded.
* Fix tests.
* Fix imports.
* Changes for lookup synchronization
* Refactor of Lookup classes
* Minor refactors and doc update
* Change coordinator instance to be retrieved by DruidLeaderClient
* Wait before thread shutdown
* Make disablelookups flag true by default
* Update docs
* Rename flag
* Move executorservice shutdown to finally block
* Update LookupConfig
* Refactoring and doc changes
* Remove lookup config constructor
* Revert Lookupconfig constructor changes
* Add tests to LookupConfig
* Make executorservice local
* Update LRM
* Move ListeningScheduledExecutorService to ExecutorCompletionService
* Move exception to outer block
* Remove check to see future is done
* Remove unnecessary assignment
* Add logging
* Add ability to optionally specify a sequence identifier to reduce the possibility of duplicate events entering the event receiver firehose
* Add ability to optionally specify a sequence identifier to reduce the possibility of duplicate events entering the event receiver firehose
* Add a hard coded limit to the maximum number of possible producer IDs to prevent a malicious (or uninformed) client from overflowing the heap
* add configs to enable fast request failure on broker
* address review comments
* fix styling error
* fix style error
* have enableRequestLimit config instead of having user specify max limit
* add comment
* fix style error
* add UT fo LimitRequestsFilter
* address review comments
* fix test
* make LimitRequestsFilterTest more robust
* fix JettyQosTest
* Fix#4647
* NPE protect bucketInterval as well
* Add test to verify timezone as well
* Also handle case when intervals are already present
* Fix checkstyle error
* Use factory method instead for Datetime
* Use Intervals factory method
* Priority on loading for primary replica
* Simplicity fixes
* Fix on skipping drop for quick return.
* change to debug logging for no replicants.
* Fix on filter logic
* swapping if-else
* Fix on wrong "hasTier" logic
* Refactoring of LoadRule
* Rename createPredicate to createLoadQueueSizeLimitingPredicate
* Rename getHolderList to getFilteredHolders
* remove varargs
* extract out currentReplicantsInTier
* rename holders to holdersInTier
* don't do temporary removal of tier.
* rename primaryTier to tierToSkip
* change LinkedList to ArrayList
* Change MinMaxPriorityQueue in DruidCluster to TreeSet.
* Adding some comments.
* Modify log messages in light of predicates.
* Add in-method comments
* Don't create new Object2IntOpenHashMap for each run() call.
* Cache result from strategy call in the primary assignment to be reused during the same run.
* Spelling mistake
* Cleaning up javadoc.
* refactor out loading in progress check.
* Removed redundant comment.
* Removed forbidden API
* Correct non-forbidden API.
* Precision in variable type for NavigableSet.
* Obsolete comment.
* Clarity in method call and moving retrieval of ServerHolder into method call.
* Comment on mutability of CoordinatoorStats.
* Added auxiliary fixture for dropping.
* Add identity to query metrics, logs.
Also fix a bug where unauthorized requests would not emit any logs or metrics,
and instead would log a "Tried to emit logs and metrics twice" warning.
Also rename QueryResource's "getServer" to "cancelQuery", because that's what
it does.
* Do not emit identity by default.
* remove ServerConfig from DruidNode as all information needs to be present in DruidNode serialized form
* sanitize output of /druid/coordinator/v1/cluster endpoint
* Added org.joda.time.DateTime#(java.lang.String) to forbidden API.
* Added org.joda.time.DateTime#(java.lang.String, org.joda.time.format.DateTimeFormatter) to forbidden API.
* Add additional APIs that may create DateTime with default time zone
* Add helper function that accepts formatter to parse String.
* Add additional forbidden APIs
* Replace existing usage of forbidden APIs
* Use wrapper class to enforce Chronology on DateTimeFormatter.
* Creates constant UtcFormatter for constant ISODateTimeFormat.
* Move caffeine out of extension.
* Remove `JsonTypeName` from the class itself
* Fix bad docs
* Fix distribution pom
* Fix unused import
* Make caffeine default
* Address code comments
* Add more description around the jre version in the readme
* Add suggested comments
* Move emitters from io.druid.server.initialization to the dedicated io.druid.server.emitter package; Update emitter library to 0.6.0; Add support for ParametrizedUriEmitter; Support hierarical properties in JsonConfigurator (was needed for ParametrizedUriEmitter)
* Log created RequestLoggers
* Fix forbidden API
* Test fix
* More Http and Parametrized Http Emitter docs
* Switch to debug level
* fixes HttpServerInventoryView to call server/segment callbacks correctly and Unit Tests for the class
* fix checkstyle and forbidden-api errors
* HttpServerInventoryView to finish start() only after server inventory is initialized
* fix compilation errors
* address review comments
* add exponential backoff instead of fixed 5 secs on successive failures
* update test to exercise server fail scenarios
* use AtomicInteger for requestNum and increment only once
* Move scan-query from a contrib extension into core.
Based on a proposal at: https://groups.google.com/d/topic/druid-development/ME_OatUDnbk/discussion
This patch also adds support for virtual columns to the Scan query,
and updates Druid SQL to use Scan instead of Select.
This patch also makes some behavioral changes to handling of the __time
column. In particular, it is now is returned as "__time" rather than
"timestamp"; it is no longer included if you do not specifically ask for
it in your "columns"; and it is returned as a long rather than a string.
Users can revert time handling to the legacy extension behavior by
setting "legacy" : true in their queries, or setting the property
druid.query.scan.legacy = true. This is meant to provide a migration
path for users that were formerly using the contrib extension.
* Adjustments from review.
* Add back Select query.
* Adjust SQL docs.
* Restore SelectQuery link.
* SQL: Full TRIM support.
- Support trimming arbitrary characters
- Support BOTH, LEADING, and TRAILING
* Remove unused import.
* Fix tests, add RTRIM / LTRIM.
* Remove unused imports.
* BTRIM and docs.
* Replace for with foreach.