com.maxmind.geoip2 2.6.0 depends on com.google.http-client 1.15.0-rc (3 years old).
This causes a problem when trying to include other libraries in Druid that require an up-to-date version of com.google.http-client.
Geared towards supporting transactional inserts of new segments. This involves an
interface "DataSourceMetadata" that allows combining of partially specified metadata
(useful for partitioned ingestion).
DataSource metadata is stored in a new "dataSource" table.
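A rough illustration of what "combining partially specified metadata" could look like. This is only a sketch: MergeableMetadata, PartitionOffsets and the plus method are hypothetical names invented here, not the actual DataSourceMetadata interface. It assumes each ingestion task knows the state of only its own partitions and that the committed metadata is the union of the pieces.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical interface (an assumption, not Druid's DataSourceMetadata).
interface MergeableMetadata {
    // Combine this metadata with another partial piece, returning a new object.
    MergeableMetadata plus(MergeableMetadata other);
}

// Hypothetical implementation: each task reports offsets only for its own partitions.
final class PartitionOffsets implements MergeableMetadata {
    final Map<Integer, Long> offsets;

    PartitionOffsets(Map<Integer, Long> offsets) {
        this.offsets = new HashMap<>(offsets);
    }

    @Override
    public MergeableMetadata plus(MergeableMetadata other) {
        if (!(other instanceof PartitionOffsets)) {
            throw new IllegalArgumentException("Cannot combine with " + other);
        }
        // Entries from "other" win when both sides specify the same partition.
        Map<Integer, Long> merged = new HashMap<>(offsets);
        merged.putAll(((PartitionOffsets) other).offsets);
        return new PartitionOffsets(merged);
    }
}
```

With this shape, two tasks that each cover one partition can commit independently, and the stored metadata ends up covering both partitions.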
Appenderators are a way of getting more control over the ingestion process
than a Plumber allows. The idea is that existing Plumbers could be implemented
using Appenderators, but you could also implement things that Plumbers can't do.
FiniteAppenderatorDrivers help simplify indexing a finite stream of data.
Also:
- Sink: Ability to consider itself "finished" vs "still writable".
- Sink: Ability to return the number of rows contained within the sink.
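For illustration, the two Sink abilities listed above might behave roughly like the sketch below. WritableSink and its method names are hypothetical and are not the real Sink API. The sketch assumes that a sink which has been marked finished rejects further rows.

```java
// Hypothetical sketch (not Druid's Sink class).
final class WritableSink {
    private boolean writable = true;
    private int numRows = 0;

    // Adds a row if the sink is still writable; returns false once it is finished.
    synchronized boolean add(String row) {
        if (!writable) {
            return false;
        }
        numRows++;
        return true;
    }

    // Marks the sink as finished; returns false if it was already finished.
    synchronized boolean finishWriting() {
        if (!writable) {
            return false;
        }
        writable = false;
        return true;
    }

    synchronized boolean isWritable() {
        return writable;
    }

    // Number of rows accepted by this sink so far.
    synchronized int getNumRows() {
        return numRows;
    }
}
```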
The incremental indexes handle that now so it's not necessary.
Also, add debug logging and more detailed exceptions to the incremental
indexes for the case where there are parse exceptions during aggregation.
After finding the FireChief for a specific partition, Druid needs to find the queryRunner for each segment being queried by passing the query to the FireChief. Currently Druid passes the original query, which contains all the segments that need to be queried. As a result, fireChief.getQueryRunner(query) may return more than one queryRunner, because query.getIntervals() is not specific to a single segment.
In this patch, for each segment being queried, Druid will update the query with its corresponding SpecificSegmentSpec.
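The per-segment rewrite can be pictured roughly as follows. This is a simplified sketch, not the patch itself: SegmentRef, SketchQuery and PerSegmentQueries are hypothetical stand-ins for Druid's segment descriptor, query and SpecificSegmentSpec handling. The point is only that each derived query names exactly one segment.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a segment identifier (interval plus partition number).
final class SegmentRef {
    final String interval;
    final int partitionNum;

    SegmentRef(String interval, int partitionNum) {
        this.interval = interval;
        this.partitionNum = partitionNum;
    }
}

// Hypothetical stand-in for a query; not Druid's Query interface.
final class SketchQuery {
    final String dataSource;
    final List<SegmentRef> segments;

    SketchQuery(String dataSource, List<SegmentRef> segments) {
        this.dataSource = dataSource;
        this.segments = new ArrayList<>(segments);
    }

    // Returns a copy of this query restricted to exactly one segment.
    SketchQuery withSingleSegment(SegmentRef segment) {
        List<SegmentRef> one = new ArrayList<>();
        one.add(segment);
        return new SketchQuery(dataSource, one);
    }
}

final class PerSegmentQueries {
    // Produces one single-segment query per segment of the original query.
    static List<SketchQuery> split(SketchQuery original) {
        List<SketchQuery> result = new ArrayList<>();
        for (SegmentRef segment : original.segments) {
            result.add(original.withSingleSegment(segment));
        }
        return result;
    }
}
```

Each single-segment query can then be handed to the FireChief for that segment's partition, so the lookup is no longer ambiguous.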
See stack traces here, from current master: https://gist.github.com/gianm/bd9a66c826995f97fc8f
1. The thread "qtp925672150-62" holds the lock on InternalInjectorCreator.class,
used by Scopes.SINGLETON, and wants the lock on "handlers" in Lifecycle.addMaybeStartHandler
called by DiscoveryModule.getServiceAnnouncer.
2. The main thread holds the lock on "handlers" in Lifecycle.addMaybeStartHandler, which it
took because it's trying to add the ExecutorLifecycle to the lifecycle. main is trying
to get the InternalInjectorCreator.class lock because it's running ExecutorLifecycle.start,
which does some Jackson deserialization, and Jackson needs that lock in order to inject
stuff into the Task it's deserializing.
This patch eagerly instantiates ChatHandlerResource (which I believe is what's trying to
create the ServiceAnnouncer in the qtp925672150-62 jetty thread) and the ExecutorLifecycle.
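The lock-order inversion described above, and why eager instantiation sidesteps it, can be sketched in a much-simplified form. Everything here is hypothetical: the two plain monitors stand in for InternalInjectorCreator.class and the Lifecycle "handlers" lock, and the method names are invented. The sketch assumes that eager instantiation means both paths run on the startup thread, one after the other.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model of the two locks involved; not Druid or Guice code.
final class EagerInitSketch {
    static final Object injectorLock = new Object(); // stands in for InternalInjectorCreator.class
    static final Object handlersLock = new Object(); // stands in for the Lifecycle "handlers" monitor
    static final List<String> handlers = new ArrayList<>();

    // Singleton creation path: takes the injector lock, then the handlers lock.
    static void createAnnouncer() {
        synchronized (injectorLock) {
            synchronized (handlersLock) {
                handlers.add("announcer");
            }
        }
    }

    // Lifecycle path: takes the handlers lock, then the injector lock (opposite order).
    static void addAndStartExecutorLifecycle() {
        synchronized (handlersLock) {
            handlers.add("executorLifecycle");
            synchronized (injectorLock) {
                // Stands in for Jackson deserialization that needs the injector.
            }
        }
    }

    // Eager variant: both paths run sequentially on one thread during startup,
    // so the opposite lock orders never interleave across two threads.
    static List<String> startEagerly() {
        createAnnouncer();
        addAndStartExecutorLifecycle();
        synchronized (handlersLock) {
            return new ArrayList<>(handlers);
        }
    }
}
```

If createAnnouncer ran lazily on a request thread while another thread was inside addAndStartExecutorLifecycle, each thread could end up holding one lock and waiting for the other.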
To bring consistency to docs and source this commit changes the default
values for maxRowsInMemory and rowFlushBoundary to 75000 after
discussion in PR https://github.com/druid-io/druid/pull/2457.
The previous default was 500000 and it's lower now on the grounds that
it's better for a default to be somewhat less efficient, and work,
than to reach for the stars and possibly result in
"OutOfMemoryError: java heap space" errors.
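As a small illustration of how such a default is typically applied, the sketch below falls back to 75000 when no value is configured. TuningDefaults and resolveMaxRowsInMemory are hypothetical names, not the actual tuning config code.

```java
// Hypothetical sketch of default handling; not Druid's tuning config classes.
final class TuningDefaults {
    // New default from this commit (the previous default was 500000).
    static final int DEFAULT_MAX_ROWS_IN_MEMORY = 75000;

    // Uses the configured value when present, otherwise the default.
    static int resolveMaxRowsInMemory(Integer configured) {
        return configured == null ? DEFAULT_MAX_ROWS_IN_MEMORY : configured;
    }
}
```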
Add tests that verify that RealtimeManager queries the correct FireChief for a specific partition.
Make FireChief static and package-private, and add latches in the unit tests.