This is a feature meant to allow realtime tasks to work without being told upfront
what shardSpec they should use (so we can potentially publish a variable number
of segments per interval).
The idea is that there is a "pendingSegments" table in the metadata store that
tracks allocated segments. Each one has a segment id (the same segment id we know
and love) and is also part of a sequence.
The sequences are an idea from @cheddar that offers a way of doing replication.
If there are N tasks reading exactly the same data with exactly the same logic
(think Kafka tasks reading a fixed range of offsets) then you can place them
in the same sequence, and they will generate the same sequence of segments.
1) Remove maven client from downloading extensions at runtime.
2) Provide a way to load Druid extensions and hadoop dependencies through file system.
3) Refactor pull-deps so that it can download extensions into extension directories.
4) Add documents on how to use this new extension loading mechanism.
5) Change the way how Druid tarball is generated. Now all the extensions + hadoop-client 2.3.0
are packaged within the Druid tarball.
- move assembly out of druid-services into a 'distribution' module
- create separate 'extensions-distribution' module and assembly to
package extensions and their dependencies into a local maven
repository
- include this extensions maven repository in the binaries tarball
review comments
more refactoring and cleaning of redundant code
add UT + docs + more refactoring
fixes + review comments
more cleanup
end points to fetch history
review comments
remove unnecessary changes
review comments rename header name
review comments + add test for MetadataRulesManager
review comments docs
- Uses method overrides instead of modified Jetty code, now that
ProxyServlet provides enough method hooks for proper overrides.
This means we may also benefit from any Jetty ProxyServlet fixes
- Adds test for async proxy servlet to make sure gzip encoding is
properly proxied.
- Router now proxies POST requests for requests that are not Druid
queries, by only treating /druid/v2/* endpoints as queries.
This commit also includes
1) the addition of a context parameter on timeseries queries that allows it to ignore empty buckets instead of generating results for them
2) A cleanup of an unused method on an interface