* Remove DataSegmentFinder, InsertSegmentToDb, and descriptor.json file
* delete descriptor.file when killing segments
* fix test
* Add doc for ha
* improve warning
* document middle manager api
* re-arrange
* correction
* document more missing overlord api calls, minor re-arrange of some code i was referencing
* fix it
* this will fix it
* fixup
* link to other docs
* Prohibit assigning concurrent maps into Map-types variables and fields; Fix a race condition in CoordinatorRuleManager; improve logic in DirectDruidClient and ResourcePool
* Enforce that if compute(), computeIfAbsent(), computeIfPresent() or merge() is called on a ConcurrentHashMap, it's stored in a ConcurrentHashMap-typed variable, not ConcurrentMap; add comments explaining get()-before-computeIfAbsent() optimization; refactor Counters; fix a race condition in Intialization.java
* Remove unnecessary comment
* Checkstyle
* Fix getFromExtensions()
* Add a reference to the comment about guarded computeIfAbsent() optimization; IdentityHashMap optimization
* Fix UriCacheGeneratorTest
* Workaround issue with MaterializedViewQueryQueryToolChest
* Strengthen Appenderator's contract regarding concurrency
* * Add few methods about base64 into StringUtils
* Use `java.util.Base64` instead of others
* Add org.apache.commons.codec.binary.Base64 & com.google.common.io.BaseEncoding into druid-forbidden-apis
* Rename encodeBase64String & decodeBase64String
* Update druid-forbidden-apis
* Use multi-guava version friendly direct executor implementation
* Don't use a singleton
* Fix strict compliation complaints
* Copy Guava's DirectExecutor
* Fix javadoc
* Imports are the devil
* FilteredRequestLogger: Fix start/stop, invalid delegate behavior.
Fixes two bugs:
1) FilteredRequestLogger did not start/stop the delegate.
2) FilteredRequestLogger would ignore an invalid delegate type, and
instead silently substitute the "noop" logger. This was due to a larger
problem with RequestLoggerProvider setup in general; the fix here is
to remove "defaultImpl" from the RequestLoggerProvider interface, and
instead have JsonConfigurator be responsible for creating the
default implementations. It is stricter about things than the old system
was, and is only willing to make a noop logger if it doesn't see any
request logger configs. Otherwise, it'll raise a provision error.
* Remove unneeded annotations.
* FileUtils: Sync directory entry too on writeAtomically.
See the fsync(2) man page for why this is important:
https://linux.die.net/man/2/fsync
This also plumbs CompressionUtils's "zip" function through
writeAtomically, so the code for handling atomic local filesystem
writes is all done in the same place.
* Remove unused import.
* Avoid FileOutputStream.
* Allow non-atomic writes to overwrite.
* Add some comments. And no need to flush an unbuffered stream.
* autosize processing buffers based on direct memory sizing
* remove oops, more test
* max 1gb autosize buffers, test, start of docs
* fix oops
* revert accidental change
* print buffer size in exception
* change the things
Not putting this to 0.13 milestone because the found bugs are not critical (one is a harmless DI config duplicate, and another is in a benchmark.
Change in `DumpSegment` is just an indentation change.
* Add checkstyle rules about imports and empty lines between members
* Add suppressions
* Update Eclipse import order
* Add empty line
* Fix StatsDEmitter
Also add a "fromIndex" argument to the strpos expression function. There
are some -1 and +1 adjustment terms due to the fact that the strpos
expression behaves like Java indexOf (0-indexed), but the POSITION SQL
function is 1-indexed.
* Optimization for expressions that hit a single long column.
There was previously a single-long-input optimization that applied only
to the time column. These have been combined together. Also adds
type-specific value caching to ExprEval, which allowed simplifying
the SingleLongInputCachingExpressionColumnValueSelector code.
* Add more benchmarks.
* Don't use LRU cache for __time.
* Simplify a bit.
* Let the cache grow.
* move parquet-extensions from contrib to core, adds new hadoop parquet parser that does not convert to avro first and supports flattenSpec and int96 columns, add support for flattenSpec for parquet-avro conversion parser, much test with a bunch of files lifted from spark-sql
* fix avro flattener to support nullable primitives for auto discovery and now only supports primitive arrays instead of all arrays
* remove leftover print
* convert micro timestamp to millis
* checkstyle
* add ignore for .parquet and .parq to rat exclude
* fix legit test failure from avro flattern behavior change
* fix rebase
* add exclusions to pom to cut down on redundant jars
* refactor tests, add support for unwrapping lists for parquet-avro, review comments
* more comment
* fix oops
* tweak parquet-avro list handling
* more docs
* fix style
* grr styles
* Prohibit some guava collection APIs and use JDK APIs directly
* reset files that changed by accident
* sort codestyle/druid-forbidden-apis.txt alphabetically
@gianm gauge `emitter/buffers/failed` and `emitter/buffers/totalAllocated` metrics proved to be very inconvenient in practice: they are not additive across a fleet of nodes, are randomly reset when a JVM restarts (that may be completely independent from metrics emitting issues). Regarding [problem identification](https://github.com/apache/incubator-druid/pull/4973#issuecomment-338119125), `emitter/buffers/dropped` serves this well.
Currently in this PR I added `/delta` vs. `/total` metrics, but if there are no objections I would remove `/total` and so align `emitter/buffers/allocated` and `/failed` with `/dropped` and `/events/emitted` which are already delta-only.