150 Commits

Author SHA1 Message Date
QiuMM b0cf8d0252 'shutdownAllTasks' API for a dataSource (#6185)
* 'shutdownAllTasks' API for a dataSource

Change-Id: I30d14390457d39e0427d23a48f4f224223dc5777

* fix api path and return

Change-Id: Ib463f31ee2c4cb168cf2697f149be845b57c42e5

* optimize implementation

Change-Id: I50a8dcd44dd9d36c9ecbfa78e103eb9bff32eab9
2018-08-17 12:57:09 -04:00
Jonathan Wei 2b0f03acb9 Unified API doc page (#6128)
* Unified API doc page

* PR comments

* Fix metadata endpoint
2018-08-09 14:27:42 -06:00
Gian Merlino 3525d4059e Cache: Add maxEntrySize config, make groupBy cacheable by default. (#5108)
* Cache: Add maxEntrySize config.

The idea is this makes it more feasible to cache query types that
can potentially generate large result sets, like groupBy and select,
without fear of writing too much to the cache per query.

Includes a refactor of cache population code in CachingQueryRunner and
CachingClusteredClient, such that they now use the same CachePopulator
interface with two implementations: one for foreground and one for
background.

The main reason for splitting the foreground / background impls is
that the foreground impl can have a more effective implementation of
maxEntrySize. It can stop retaining subvalues for the cache early.

* Add CachePopulatorStats.

* Fix whitespace.

* Fix docs.

* Fix various tests.

* Add tests.

* Fix tests.

* Better tests

* Remove conflict markers.

* Fix licenses.
2018-08-07 10:23:15 -07:00
Caroline1000 96feb479cd add order change needed for KIS in 0.12.0 (#5760) 2018-06-08 15:25:26 -07:00
Hongze Zhang cfa94b747b Update to jetty 9.4; Enable request decompression (#5624)
* Update to jetty 9.4; Enable request decompression; Add http compression config options

* Fix BadMessageException from jetty server at HttpGenerator.generateHeaders(...)
2018-06-08 14:53:08 -07:00
Kirill Kozlov 67d0b0ee42 Add taskType dimension to task metrics (#5664) 2018-05-07 09:42:26 -07:00
kaijianding c12c16385e support throw duplcate row during realtime ingestion in RealtimePlumber (#5693) 2018-05-04 10:12:25 -07:00
Jakub Kukul e2431ae161 Update defaultHadoopCoordinates in documentation. (#5720)
* Update defaultHadoopCoordinates in documentation.

To match changes applied in #5382.

* Remove a parameter with defaults from example configuration file.

If it has reasonable defaults, then why would it be in an example config file?

Also, it is yet another place that has been forgotten to be updated and will be forgotten in the future.

Also, if someone is running different hadoop version, then there's much more work to be done than just changing this property, so why give users false hopes?

* Fix typo in documentation.
2018-04-30 20:49:14 -07:00
David Lim 8ec2d2fe18 Use unique segment paths for Kafka indexing (#5692)
* support unique segment file paths

* forbiddenapis

* code review changes

* code review changes

* code review changes

* checkstyle fix
2018-04-29 21:59:48 -07:00
Gian Merlino 762f8829e4 Add task action metrics, add taskId metric dimension. (#5714)
* Add task action metrics, add taskId metric dimension.

Adds two new metrics: task/action/log/time and task/action/run/time. Also
adds taskId as a dimension, to give us the ability to drill down into metrics
for an individual task. Also standardizes metrics-attachment using two helper
methods in IndexTaskUtils.

* Fix typo
2018-04-29 21:24:06 -07:00
Hongze Zhang b084075279 Add http/https proxy options to PullDependencies.java (#5450) 2018-03-07 15:05:43 -08:00
Jonathan Wei 80419752b5 Add metamx emitter, http clients, and metrics packages to druid java-util (#5289)
* Add metamx java-util emitter, http clients, and metrics packages to druid java-util

* Remove metamx java-util from pom.xml files

* Checkstyle fixes

* Import fix

* TeamCity inspection fixes

* Use slf4j, move some version defs to master pom.xml

* Use parent jvm-attach-api and maven-surefire-plugin versions

* Add ] to log msg, suppress inspection
2018-01-24 22:10:36 +01:00
Jihoon Son 241efafbb2 Automatic compaction by coordinators (#5102)
* Automatic compaction by coordinator

* add links

* skip compaction for very recent segments if they are small

* fix finding search interval

* fix finding search interval

* fix TimelineHolder iteration

* add test for newestSegmentFirstPolicy

* add CompactionSegmentIterator

* add numTargetCompactionSegments

* add missing config

* fix skipping huge shards

* fix handling large number of segments per shard

* fix test failure

* change recursive call to loop

* fix logging

* fix build

* fix test failure

* address comments

* change dataSources type

* check running pendingTasks at each run

* fix test

* address comments

* fix build

* fix test

* address comments

* address comments

* add doc for segment size optimization

* address comment
2018-01-13 13:52:37 +09:00
Himanshu 2ecebb3173 Fix coordinator/overlord redirects when TLS is enabled (#5037)
* Fix coordinator/overlord redirects when TLS is enabled

* address review comment

* fix UTs

* workaround to not ignore URL instance to fix the teamcity build

* update tls doc
2017-11-09 13:10:28 -08:00
Jihoon Son 52d7f74226 Add streaming aggregation as the last step of ConcurrentGrouper if data are spilled (#4704)
* Add steaming grouper

* Fix doc

* Use a single dictionary while combining

* Revert GroupByBenchmark

* Removed unused code

* More cleanup

* Remove unused config

* Fix some typos and bugs

* Refactor Groupers.mergeIterators()

* Add comments for combining tree

* Refactor buildCombineTree

* Refactor iterator

* Add ParallelCombiner

* Add ParallelCombinerTest

* Handle InterruptedException

* use AbstractPrioritizedCallable

* Address comments

* [maven-release-plugin] prepare release druid-0.11.0-sg

* [maven-release-plugin] prepare for next development iteration

* Address comments

* Revert "[maven-release-plugin] prepare for next development iteration"

This reverts commit 5c6b31e488.

* Revert "[maven-release-plugin] prepare release druid-0.11.0-sg"

This reverts commit 0f5c3a8b82.

* Fix build failure

* Change list to array

* rename sortableIds

* Address comments

* change to foreach loop

* Fix comment

* Revert keyEquals()

* Remove loop

* Address comments

* Fix build fail

* Address comments

* Remove unused imports

* Fix method name

* Split intermediate and leaf combine degrees

* Add comments to StreamingMergeSortedGrouper

* Add more comments and fix overflow

* Address comments

* ConcurrentGrouperTest cleanup

* add thread number configuration for parallel combining

* improve doc

* address comments

* fix build
2017-10-17 23:24:08 -07:00
Parag Jain 7cc18226cd add more tls configs to enable/disable specific cipher suites and protocols (#4902)
* add more tls configs to enable/disable specific cipher suites and protocols

* fix doc, allow empty list
2017-10-09 13:53:12 -07:00
Himanshu f69c9280c4 remove ServerConfig from DruidNode as all information needs to be present in DruidNode serialized form (#4858)
* remove ServerConfig from DruidNode as all information needs to be present in DruidNode serialized form

* sanitize output of /druid/coordinator/v1/cluster endpoint
2017-09-28 10:40:59 -05:00
Himanshu a36adc63e4 [documentation] add more jvm and os guidelines (#4793)
* add more jvm and os guidelines

* address review comments

* add not so general guidelines too

* duplicate statement removal
2017-09-20 13:12:57 -07:00
Parag Jain b5e839b3db injectable sslcontextfactory for jetty server and key manager factory algorithm (#4769)
* injectable sslcontextfactory for jetty server

key manager factory algorithm

* explicitly set trustAll certificates to false in sslcontextfactory
2017-09-12 11:45:03 -07:00
Jonathan Wei 1bddfc089c Additional docs/log for direct memory usage (#4631)
* Additional docs/log for direct memory usage

* Tweak docs

* Doc rewording
2017-08-10 23:33:20 -07:00
Himanshu ae6780f62a rolling upgrade order change to bring coordinator and overlord together (#4281)
* rolling upgrade order change to bring coordinator and overlord together

* mentioned merged Coordinator-Overlord in upgrade order doc

* revert autoscaling doc change

* auto scaling doc fix
2017-07-25 12:54:12 -05:00
Parag Jain 6e2f78f552 TLS support (#4270) 2017-07-06 17:40:12 -07:00
Parag Jain 4502c207af fix injection bug and documentation (#4243) 2017-05-03 15:07:43 -05:00
Jihoon Son 7411b18df9 Add BroadcastDistributionRule (#4077)
* Add BroadcastDistributionRule

* Add missing null check

* Rename variable 'colocateDataSource' to 'colocatedDatasource'

* Address comments

* Document for broadcast rules

* Drop segments which are not co-located anymore

* Remove duplicated segment loading and dropping

* Add caveat

* address comments
2017-05-01 09:55:17 -07:00
Akash Dwivedi 94da5e80f9 Namespace optimization for hdfs data segments. (#3877)
* NN optimization for hdfs data segments.

* HdfsDataSegmentKiller, HdfsDataSegment finder changes to use new storage
format. Docs update.

* Common utility function in DataSegmentPusherUtil.

* new static method `makeSegmentOutputPathUptoVersionForHdfs` in JobHelper

* reuse getHdfsStorageDirUptoVersion in
DataSegmentPusherUtil.getHdfsStorageDir()

* Addressed comments.

* Review comments.

* HdfsDataSegmentKiller requested changes.

* extra newline

* Add maprfs.
2017-03-01 09:51:20 -08:00
Nishant 35160e5595 Add metrics for Query Count statistics (#3470)
* Add metrics for Query Count statistics

This PR adds a new metrics monitor “QueryCountStatsMonitor” which emits
three new metrics -
1) query/success/count - number of successful queries
2) query/failed/count - number of failed queries
3) query/interrupted/count - number of interrupted/timedout queries

fix bindings

* make fields final

* fix imports

* AsyncQueryForwardingServlet implement QueryStatsProvider

* remove unused import
2016-12-19 09:47:58 -08:00
kaijianding 4be3eb0ce7 report message gap, source gap and sink count in RealtimePlumber (#3744)
* report message gap, source gap and sink count in RealtimePlumber

* report message gap, sink count in Appenderator

* add ingest/events/sourceGap in metrics.md

* remove source gap
2016-12-13 11:23:02 -06:00
Jonathan Wei 7c63bee7f5 Add mapreduce.job.classloader.system.classes property to 'Other Hadoop Versions' docs (#3706) 2016-11-18 16:16:50 -08:00
Himanshu b76b3f8d85 reset-cluster command to clean up druid state stored on metadata and deep storage (#3670) 2016-11-09 11:07:01 -06:00
jaehong choi 6f21778364 Support finding segments in AWS S3. (#3399)
* support finding segments from a AWS S3 storage.

* add more Uts

* address comments and add a document for the feature.

* update docs indentation

* update docs indentation

* address comments.
1. add a Ut for json ser/deser for the config object.
2. more informant error message in a Ut.

* address comments.
1. use @Min to validate the configuration object
2. change updateDescriptor to a string as it does not take an argument otherwise

* fix a Ut failure - delete a Ut for testing default max length.
2016-10-10 17:27:09 -07:00
Ashish 6b40bf8b32 doc: added note to README, about necessary hdfs config after insert-segment-to-db (#3402) 2016-08-28 16:39:33 -07:00
Chanh Le d624037698 Pull-deps: correct the library directory in the document (#3361)
* Pull-deps: correct the library directory in the document

* Pull-deps: correct the library directory in the document in the last example command
2016-08-16 17:18:15 -07:00
Fangjin Yang 6beb8ac342 fix some docs and add new content (#3369) 2016-08-16 15:00:18 -07:00
Himanshu ed5b92d612 document how to check MM enabled/disabled (#3331) 2016-08-06 05:56:51 +08:00
Gian Merlino e5397ed316 Link up Hadoop class loading docs better. (#3302) 2016-07-29 10:19:54 -07:00
Charles Allen 546e4f79b0 Add size of pending deletes to historical metrics (#3295)
* Add size of pending deletes to historical metrics
2016-07-27 11:30:47 -07:00
Charles Allen b1e3fe77f5 More logging around how the coordinator balancer is happening (#3279)
* More logging around how the coordinator balancer is happening

* Address comments

* Address code review comments and add actual logging
2016-07-27 13:24:32 +05:30
Gian Merlino dd4ec751d0 Update docs for working with Hadoop dependencies. (#3252)
- Attempt to make things clearer in general
- Point out that HDFS deep storage and MR jobs don't use the same loading mechanism
- Recommend using mapreduce.job.classloader = true when possible
2016-07-18 07:47:58 -05:00
Gian Merlino 6a03a0cfec Fix ingest/persist/backPressure docs. (#3243) 2016-07-13 21:56:28 -07:00
Gian Merlino b8a4f4ea7b DumpSegment: Add --dump bitmaps option. (#3221)
Also make --dump metadata respect --column.
2016-07-06 12:42:50 -07:00
Parag Jain 99844dfeb5 remove need for tmp extensions dir (#3211)
correct lib path relative to base distribution dir
2016-07-01 12:55:57 -07:00
michaelschiff 66d8ad36d7 adds new coordinator metrics 'segment/unavailable/count' and (#3176)
'segment/underReplicated/count' (#3173)
2016-06-23 14:53:15 -07:00
Gian Merlino da660bb592 DumpSegment tool. (#3182)
Fixes #2723.
2016-06-23 14:37:50 -07:00
Gian Merlino 3b3e772748 Add --no-default-remote-repositories flag to pull-deps. (#3120) 2016-06-13 17:01:18 +05:30
Kirill Kozlov 4ab675e863 Fix command name in example (#3088) 2016-06-07 10:44:27 -07:00
Gian Merlino cd5c5419bb Make docs deploying better. (#3040)
- Make redirects for old links based on _redirects.json
- Replace #{DRUIDVERSION} tokens in docs with current version
- Allow origins named something other than "origin"
- Can use either s3cmd or awscli, depending on availability
2016-05-31 15:34:58 -07:00
Fangjin Yang 00de26c76a fix extensions docs (#2995)
* fix extensions docs

* fix mistakes
2016-05-19 14:01:06 -07:00
Himanshu 6c5bf91f9a publish metrics numJettyConns to see how number of active jetty connections change over time (#2839)
this can be compared with number of active queries to see if requests are waiting in jetty queue
2016-05-02 14:08:25 -07:00
du00cs 639d0630b8 jackson conflict workaround in hadooop ingestio & parquet extension coordinate update (#2817) 2016-04-13 14:20:33 -07:00
Sébastien Launay 37d2ab623e Merge pull request #2815 from slaunay/documentation/hadoop-classpath-issue-fix-with-configuration
Doc for mapreduce.job.user.classpath.first=true
2016-04-12 10:51:51 -07:00
fjy e3e932a4d4 refactor extensions into core and contrib 2016-03-08 17:12:09 -08:00
Bingkun Guo 18f9e05f0f improve doc on including druid and hadoop extensions 2016-02-26 13:53:08 -06:00
Fangjin Yang 083f019a48 Merge pull request #2465 from druid-io/more-doc-fix
more doc fixes
2016-02-17 11:00:38 -08:00
fjy 7da6594bfe more doc fixes 2016-02-17 09:43:47 -08:00
Slim e9f1c94822 Update metrics.md 2016-02-17 09:27:15 -06:00
Slim ebbb1aa74e Update metrics.md 2016-02-17 09:05:16 -06:00
Gian Merlino 95d5526e7c Freshen up rolling update docs
1. Clarify what "Indexing Service / Realtime" means
2. Add info about restore-based middle manager rolling restarts
3. Add info about what happens in middle manager updates
4. More consistent capitalization and spelling of node types
2016-02-09 13:57:04 -08:00
fjy 003f54e268 add doc rendering 2016-02-04 14:21:59 -08:00
fjy 1aa363cea7 new quickstart 2016-02-04 09:37:38 -08:00
Fangjin Yang 459c2a49ca Merge pull request #2364 from metamx/fix2356
Add more docs around timezone handling
2016-02-01 10:58:15 -08:00
Charles Allen c9393e5289 Add more docs around timezone handling
* Fixes #2356
2016-02-01 08:51:07 -08:00
Jaebin Yoon 66a74a2b88 Fixed the broken link 2016-02-01 01:07:24 -08:00
Robin c9368702fa do some editing of the instructions for using mysql for metadata 2016-01-21 10:37:30 -06:00
Gian Merlino 2d3f6e7705 Some more multitenancy docs 2016-01-17 17:47:49 -08:00
Himanshu d255f4baac Merge pull request #2234 from pjain1/emit_realtime_metrics
emit handoff count metrics
2016-01-08 14:24:16 -06:00
Parag Jain 9dba0f67e7 emit handoff count metrics 2016-01-08 12:36:13 -06:00
Fangjin Yang 3048b1f0a5 Merge pull request #2174 from metamx/ingest-size-metrics
Add metrics for ingest/bytes/received for EventReceiverFirehose
2016-01-06 22:05:55 -08:00
Fangjin Yang dd262f0451 Merge pull request #2215 from pjain1/fix_doc_metrics
correct metric name - segment/added/count -> segment/assigned/count
2016-01-06 16:21:54 -08:00
Parag Jain 768d07b702 correct metric name - segment/added/count -> segment/assigned/count 2016-01-06 15:55:11 -06:00
Nishant 14989f272d Add metrics for ingest/bytes/received for EventReceiverFirehose
review comments

review comments
2016-01-05 20:06:09 +05:30
Robin e280ab5f07 update zookeeper version to 3.4.7 2016-01-04 11:47:02 -06:00
Bingkun Guo 3c107c5757 Merge pull request #2150 from himanshug/emit_query_bytes
emit query/bytes metric
2015-12-30 13:44:19 -06:00
Fangjin Yang b1261035a7 Merge pull request #1861 from guobingkun/insert_segment_tool
insert-segment tool
2015-12-29 10:06:07 -08:00
Himanshu Gupta 1a8546a682 emit query/bytes metric 2015-12-23 00:29:44 -06:00
Himanshu Gupta b96f560255 emit query/node/bytes metric 2015-12-21 23:23:20 -06:00
Bingkun Guo 89b477970f DataSegmentFinder tool
`insert-segment-to-db` is a tool that can insert segments into Druid metadata storage. It is intended to be used
to update the segment table in metadata storage after people manually migrate segments from one place to another.
It can also be used to insert missing segment into Druid, or even recover metadata storage by telling it where the
segments are stored.

Note: This tool expects users to have Druid cluster running in a "safe" mode, where there are no active tasks to interfere with
the segments being inserted. Users can optionally bring down the cluster to make 100% sure nothing is interfering.
2015-12-21 00:02:04 -06:00
Gian Merlino e75c2a407d Merge pull request #1944 from druid-io/fix-doc
fix website rendering for this doc
2015-11-10 16:04:40 -08:00
fjy e923de3eea fix website rendering for this doc 2015-11-10 15:36:30 -08:00
Xavier Léauté cf779946ef Merge pull request #1791 from guobingkun/event_receiver_firehose_monitor
EventReceiverFirehoseMonitor
2015-11-10 11:09:42 -08:00
Bingkun Guo 962f65cc76 fix metadata typo and rename default extension directory 2015-11-03 14:50:42 -06:00
Oleg Zaezdny 95a5ae0373 Docs improved by adding more details about local cache and memory for segments on historicals. 2015-11-01 21:56:28 +02:00
Bingkun Guo c3b6fcce9d Add EventReceiverFirehoseMonitor
add an EventReceiverFirehoseMonitor so that we can monitor how many
events have been queued in the EventReceiverFirehose and get a sense
about whether the firehose is under too much pressure.
2015-10-30 11:40:02 -05:00
Bingkun Guo 657a5ac346 fix pull-deps remoteRepository option 2015-10-30 11:32:56 -05:00
Fangjin Yang 5f23703216 Merge pull request #1638 from guobingkun/remove_maven_client_code
Remove Maven client at runtime + Provide a way to load Druid extensions through local file system
2015-10-26 09:30:05 -07:00
Nishant 7cecc55045 Add segment merge time as a metric
Add merge and persist cpu time

Fix typo

review comment

move cpu time measuring to VMUtils

review comments.
2015-10-22 12:28:03 +05:30
Bingkun Guo 4914925d65 New extension loading mechanism
1) Remove maven client from downloading extensions at runtime.
2) Provide a way to load Druid extensions and hadoop dependencies through file system.
3) Refactor pull-deps so that it can download extensions into extension directories.
4) Add documents on how to use this new extension loading mechanism.
5) Change the way how Druid tarball is generated. Now all the extensions + hadoop-client 2.3.0
are packaged within the Druid tarball.
2015-10-21 14:22:36 -05:00
Bingkun Guo 620e334d0f fix ingestion faq link 2015-10-16 10:14:14 -05:00
Xavier Léauté b464da438c Merge pull request #1688 from metamx/moreMemcachedMetrics
More memcached metrics
2015-09-15 15:33:51 -07:00
Charles Allen 5813856819 More memcached metrics 2015-09-08 13:34:58 -07:00
Charles Allen fcf5cae81d Add CPU time to metrics for segment scanning. 2015-09-08 13:34:19 -07:00
fjy 4055f9ca48 more docs for common questions 2015-08-25 17:49:04 -07:00
Xavier Léauté 0cbda0c01d update version numbers in docs 2015-08-17 16:41:21 -07:00
Parag Jain 41fa9bf994 swap description and dimension for some JVM metrics 2015-08-17 15:03:06 -05:00
fjy 92293ef094 Added section on best practices for schema designa and a few other edits 2015-07-24 14:06:20 -07:00
Qi Wang 7211791585 add workaround for cdh 2015-07-19 14:11:47 -07:00
fjy 08d00cc80f rework the realtime examples a bit; add more faq 2015-07-07 14:07:14 -07:00
fjy 42ac41d55e add more docs based on proposed wishlist 2015-07-02 17:46:08 -07:00
Xavier Léauté 2da12de598 add back query/node/(time|ttfb) docs 2015-06-26 17:58:47 -07:00
Charles Allen fbcac10e00 Remove metrics emitting from caching clustered client 2015-06-26 10:49:13 -07:00
Himanshu Gupta 8edc2aaca3 renaming all *.md filenames to only have lowercase and dashes
so that they are editable on case-insensitive os as well
2015-05-29 20:55:42 -05:00