druid

Commit Graph

Author	SHA1	Message	Date
Michael Trelinski	347779b17a	Zookeeper loss (#6740 ) * Update init Fix bin/init to source from proper directory. * Fix for Proposal #6518: Shutdown druid processes upon complete loss of ZK connectivity * Zookeeper Loss: - Add feature documentation - Cosmetic refactors - Variable extractions - Remove getter * - Change config key name and reword documentation - Switch from Function<Void,Void> to Runnable/Lambda - try { … } finally { … } * Fix line length too long * - change to formatted string for logging - use System.err.println after lifecycle stops * commenting on makeEnsembleProvider()-created Zookeeper termination * Add javadoc * added java doc reference back to apache discussion thread. * move comment to other class * favor two-slash comments instead of multiline comments	2019-03-29 15:10:42 -07:00
Justin Borromeo	ad7862c58a	Time Ordering On Scans (#7133 ) * Moved Scan Builder to Druids class and started on Scan Benchmark setup * Need to form queries * It runs. * Stuff for time-ordered scan query * Move ScanResultValue timestamp comparator to a separate class for testing * Licensing stuff * Change benchmark * Remove todos * Added TimestampComparator tests * Change number of benchmark iterations * Added time ordering to the scan benchmark * Changed benchmark params * More param changes * Benchmark param change * Made Jon's changes and removed TODOs * Broke some long lines into two lines * nit * Decrease segment size for less memory usage * Wrote tests for heapsort scan result values and fixed bug where iterator wasn't returning elements in correct order * Wrote more tests for scan result value sort * Committing a param change to kick teamcity * Fixed codestyle and forbidden API errors * . * Improved conciseness * nit * Created an error message for when someone tries to time order a result set > threshold limit * Set to spaces over tabs * Fixing tests WIP * Fixed failing calcite tests * Kicking travis with change to benchmark param * added all query types to scan benchmark * Fixed benchmark queries * Renamed sort function * Added javadoc on ScanResultValueTimestampComparator * Unused import * Added more javadoc * improved doc * Removed unused import to satisfy PMD check * Small changes * Changes based on Gian's comments * Fixed failing test due to null resultFormat * Added config and get # of segments * Set up time ordering strategy decision tree * Refactor and pQueue works * Cleanup * Ordering is correct on n-way merge -> still need to batch events into ScanResultValues * WIP * Sequence stuff is so dirty :( * Fixed bug introduced by replacing deque with list * Wrote docs * Multi-historical setup works * WIP * Change so batching only occurs on broker for time-ordered scans Restricted batching to broker for time-ordered queries and adjusted tests Formatting Cleanup * Fixed mistakes in merge * Fixed failing tests * Reset config * Wrote tests and added Javadoc * Nit-change on javadoc * Checkstyle fix * Improved test and appeased TeamCity * Sorry, checkstyle * Applied Jon's recommended changes * Checkstyle fix * Optimization * Fixed tests * Updated error message * Added error message for UOE * Renaming * Finish rename * Smarter limiting for pQueue method * Optimized n-way merge strategy * Rename segment limit -> segment partitions limit * Added a bit of docs * More comments * Fix checkstyle and test * Nit comment * Fixed failing tests -> allow usage of all types of segment spec * Fixed failing tests -> allow usage of all types of segment spec * Revert "Fixed failing tests -> allow usage of all types of segment spec" This reverts commit `ec470288c7`. * Revert "Merge branch '6088-Time-Ordering-On-Scans-N-Way-Merge' of github.com:justinborromeo/incubator-druid into 6088-Time-Ordering-On-Scans-N-Way-Merge" This reverts commit `57033f36df`, reversing changes made to `8f01d8dd16`. * Check type of segment spec before using for time ordering * Fix bug in numRowsScanned * Fix bug messing up count of rows * Fix docs and flipped boolean in ScanQueryLimitRowIterator * Refactor n-way merge * Added test for n-way merge * Refixed regression * Checkstyle and doc update * Modified sequence limit to accept longs and added test for long limits * doc fix * Implemented Clint's recommendations	2019-03-28 14:37:09 -07:00
Surekha	be318f4de3	Add column type to sys table docs (#7359 ) * Add column type * oops should be used=1	2019-03-27 20:21:57 -07:00
Charles Allen	eeb3dbe79d	Move GCP to a core extension (#6953 ) * Move GCP to a core extension * Don't provide druid-core >.< * Keep AWS and GCP modules separate * Move AWSModule to its own module * Add aws ec2 extension and more modules in more places * Fix bad imports * Fix test jackson module * Include AWS and GCP core in server * Add simple empty method comment * Update version to 15 * One more 0.13.0-->0.15.0 change * Fix multi-binding problem * Grep for s3-extensions and update docs * Update extensions.md	2019-03-27 09:00:43 -07:00
Justin Borromeo	c7fea6ac8f	Added better QueryInterruptedException error message for UnsupportedOperationException (#7248 ) * Added error message for UOE * Updated docs * Doc change * Doc change	2019-03-26 15:20:24 -07:00
Gian Merlino	4ca5fe0f60	SQL: Add PARSE_LONG function. (#7326 ) * SQL: Add PARSE_LONG function. * Fix test.	2019-03-22 15:40:10 -07:00
Vadim Ogievetsky	e4f2dcacf2	Druid console docs (#7300 ) * console docs * fix typo	2019-03-21 00:37:33 -07:00
Justin Borromeo	ff94bd16e6	Fix conflicting information in configuration doc (#7299 ) * Doc fix * Fix typo	2019-03-19 14:55:58 -07:00
Qi Shu	5406aaa49d	Add SQL auto complete in druid console (#7244 ) * Add SQL auto complete in druid console * Add comment in sql.md to alert user to change create-sql-function-doc if sql.md format gets changed	2019-03-16 01:45:53 -07:00
Jihoon Son	892d1d35d6	Deprecate NoneShardSpec and drop support for automatic segment merge (#6883 ) * Deprecate noneShardSpec * clean up noneShardSpec constructor * revert unnecessary change * Deprecate mergeTask * add more doc * remove convert from indexMerger * Remove mergeTask * remove HadoopDruidConverterConfig * fix build * fix build * fix teamcity * fix teamcity * fix ServerModule * fix compilation * fix compilation	2019-03-15 23:29:25 -07:00
Atul Mohan	2daeb50008	Add support for optional client authentication on TLS (#7250 ) * Add optional client auth * Add docs	2019-03-15 15:14:34 -07:00
Hongze Zhang	f9d99b245b	Add missing doc link for operations/http-compression.html; Fix magic numbers in test cases using JettyServerInitUtils.wrapWithDefaultGzipHandler (#7110 )	2019-03-13 14:09:19 -07:00
Clint Wylie	3895914aa2	consolidate CompressionUtils.java since now in the same jar (#6908 )	2019-03-13 11:02:44 -04:00
Gian Merlino	9178793ab5	Further improve caching documentation. (#7236 ) Follow-up to #7223 that fixes a doc bug (a result-level cache property was misspelled), changes the recommended "small cluster" threshold from 20 to 5 servers, and clarifies behavior of the various caching options.	2019-03-11 17:57:00 -07:00
Pierre-Emile Ferron	a88fbcd5db	Improve caching doc (#7223 ) - Set correct default values for query context result cache parameters - Add details about broker cache impact on local historical merging	2019-03-11 20:06:28 -04:00
Venkatraman P	3118160387	Adding a tutorial in doc for using Kerberized Hadoop as deep storage. (#6863 ) * Adding a tutorial in doc for using Kerberized Hadoop as deep storage. * Update tutorial-kerberos-hadoop.md * Update tutorial-kerberos-hadoop.md * Update tutorial-kerberos-hadoop.md * Update tutorial-kerberos-hadoop.md * Update tutorial-kerberos-hadoop.md * Update tutorial-kerberos-hadoop.md * Update tutorial-kerberos-hadoop.md * Update tutorial-kerberos-hadoop.md * Update tutorial-kerberos-hadoop.md * Update tutorial-kerberos-hadoop.md * Update tutorial-kerberos-hadoop.md * Update tutorial-kerberos-hadoop.md * Update tutorial-kerberos-hadoop.md * Update tutorial-kerberos-hadoop.md Fixed - to ~ in Apache License section. * Update tutorial-kerberos-hadoop.md * Update tutorial-kerberos-hadoop.md	2019-03-11 11:39:15 -07:00
Jonathan Wei	e1d8c17746	Add commit ID milestone helper script (#7100 ) * Add commit ID milestone helper script * Filter on merged/closed in API call	2019-03-11 11:36:07 -07:00
Jonathan Wei	94463b5778	Add missing redirects and fix broken links (#7213 ) * Add missing redirects * Fix zookeeper redirect * Fix broken links	2019-03-09 15:16:23 -08:00
jorbay-au	62f0de9b89	Remove outdated instruction for rule updates (#7205 )	2019-03-08 16:42:08 -08:00
Clint Wylie	a44df6522c	rename maintenance mode to decommission (#7154 ) * rename maintenance mode to decommission * review changes * missed one * fix straggler, add doc about decommissioning stalling if no active servers * fix missed typo, docs * refine docs * doc changes, replace generals * add explicit comment to mention suppressed stats for balanceTier * rename decommissioningVelocity to decommissioningMaxSegmentsToMovePercent and update docs * fix precondition check * decommissioningMaxPercentOfMaxSegmentsToMove * fix test * fix test * fixes	2019-03-08 16:33:51 -08:00
Jihoon Son	e48a9c138e	Reduce default max # of subTasks to 1 for native parallel task (#7181 ) * Reduce # of max subTasks to 2 * fix typo and add more doc * add more doc and link * change default and add warning * fix doc * add test * fix it test	2019-03-05 22:06:36 -08:00
Jonathan Wei	9183e32876	Add more approximate algorithm docs (#7195 )	2019-03-05 16:44:02 -08:00
Xue Yu	65118277a3	support sin cos etc trigonometric function in sql (#7182 ) * support triangle function in sql * feedback address	2019-03-04 19:18:22 -08:00
Jonathan Wei	5486c2abf8	Update LICENSE and NOTICE files (#7026 ) * Update LICENSE and NOTICE files * Update react-table version	2019-03-04 18:45:22 -08:00
Roman Leventov	10c9f6d708	Fix and document concurrency of EventReceiverFirehose and TimedShutoffFirehose; Refine concurrency specification of Firehose (#7038 ) #### `EventReceiverFirehoseFactory` Fixed several concurrency bugs in `EventReceiverFirehoseFactory`: - Race condition over putting an entry into `producerSequences` in `checkProducerSequence()`. - `Stopwatch` used to measure time across threads, but it's a non-thread-safe class. - Use `System.nanoTime()` instead of `System.currentTimeMillis()` because the latter are [not suitable](https://stackoverflow.com/a/351571/648955) for measuring time intervals. - `close()` was not synchronized by could be called from multiple threads concurrently. Removed unnecessary `readLock` (protecting `hasMore()` and `nextRow()` which are always called from a single thread). Removed unnecessary `volatile` modifiers. Documented threading model and concurrent control flow of `EventReceiverFirehose` instances. Important: please read the updated Javadoc for `EventReceiverFirehose.addAll()`. It allows events from different requests (batches) to be interleaved in the buffer. Is this OK? #### `TimedShutoffFirehoseFactory` - Fixed a race condition that was possible because `close()` that was not properly synchronized. Documented threading model and concurrent control flow of `TimedShutoffFirehose` instances. #### `Firehose` Refined concurrency contract of `Firehose` based on `EventReceiverFirehose` implementation. Importantly, now it states that `close()` doesn't affect `hasMore()` and `nextRow()` and could be called concurrently with them. In other words, specified that `close()` is for "row supply" side rather than "row consume" side. However, I didn't check that other `Firehose` implementatations adhere to this contract. <hr> This issue is the result of reviewing `EventReceiverFirehose` and `TimedShutoffFirehose` using [this checklist](https://medium.com/@leventov/code-review-checklist-java-concurrency-49398c326154).	2019-03-04 18:50:03 -03:00
Jihoon Son	ded03d9d4c	Improve doc for auto compaction (#7117 ) * Improve doc for auto compaction * fix doc * address comments	2019-03-02 12:21:50 -08:00
Jihoon Son	45f12de9ad	Fix supported file formats for Hadoop vs Native batch doc (#7069 ) * Fix supported file formats * address comment	2019-02-28 19:44:45 -08:00
Jonathan Wei	32c418fdd8	Reword 'node' to 'process' (#7172 )	2019-02-28 18:10:39 -08:00
Jonathan Wei	a0afd7931d	Add web consoles doc page (#7123 ) * Add web consoles doc page * PR comments * Remove 'unified' * PR comments * Fix TOC * PR comments * More revisions * GUI -> UI * Update router docs * Reword router doc	2019-02-28 14:02:39 -08:00
Jonathan Wei	0b4f771062	Exclude hadoop-lzo from thrift-extensions build (#7151 )	2019-02-27 19:57:53 -08:00
Jonathan Wei	3d247498ef	Update tutorials for 0.14.0-incubating (#7157 )	2019-02-27 19:50:31 -08:00
Jihoon Son	6b232d8195	Improve compaction tutorial to demonstrate compaction with keepSegmentGranularity = true (#7079 ) * Improve compaction tutorial to demonstrate compaction with keepSegmentGranularity = true * typo * add a warning	2019-02-27 16:02:51 -08:00
Jihoon Son	4e2b085201	Remove DataSegmentFinder, InsertSegmentToDb, and descriptor.json file in deep storage (#6911 ) * Remove DataSegmentFinder, InsertSegmentToDb, and descriptor.json file * delete descriptor.file when killing segments * fix test * Add doc for ha * improve warning	2019-02-20 15:10:29 -08:00
Mingming Qiu	dd34691004	Coordinator await initialization before finishing startup (#6847 ) * Curator server inventory await initialization * address comments * print exception object in log * remove throws ISE * cachingCost awaitInitialization default to false	2019-02-20 11:56:23 -08:00
David Glasser	a81b1b8c9c	index_parallel: support !appendToExisting with no explicit intervals (#7046 ) * index_parallel: support !appendToExisting with no explicit intervals This enables ParallelIndexSupervisorTask to dynamically request locks at runtime if it is run without explicit intervals in the granularity spec and with appendToExisting set to false. Previously, it behaved as if appendToExisting was set to true, which was undocumented and inconsistent with IndexTask and Hadoop indexing. Also, when ParallelIndexSupervisorTask allocates segments in the explicit interval case, fail if its locks on the interval have been revoked. Also make a few other additions/clarifications to native ingestion docs. Fixes #6989. * Review feedback. PR description on GitHub updated to match. * Make native batch ingestion partitions start at 0 * Fix to previous commit * Unit test. Verified to fail without the other commits on this branch. * Another round of review * Slightly scarier warning	2019-02-20 10:54:26 -08:00
Surekha	2b04e6d0bc	add note on consistency of results for sys.segments queries (#7034 ) * add doc * change docs * PR comments * few more changes	2019-02-19 10:52:37 -08:00
Clint Wylie	cadb6c5280	Missing Overlord and MiddleManager api docs (#7042 ) * document middle manager api * re-arrange * correction * document more missing overlord api calls, minor re-arrange of some code i was referencing * fix it * this will fix it * fixup * link to other docs	2019-02-19 10:52:05 -08:00
Surekha	80a2ef7be4	Support kafka transactional topics (#5404 ) (#6496 ) * Support kafka transactional topics * update kafka to version 2.0.0 * Remove the skipOffsetGaps option since it's not used anymore * Adjust kafka consumer to use transactional semantics * Update tests * Remove unused import from test * Fix compilation * Invoke transaction api to fix a unit test * temporary modification of travis.yml for debugging * another attempt to get travis tasklogs * update kafka to 2.0.1 at all places * Remove druid-kafka-eight dependency from integration-tests, remove the kafka firehose test and deprecate kafka-eight classes * Add deprecated in docs for kafka-eight and kafka-simple extensions * Remove skipOffsetGaps and code changes for transaction support * Fix indentation * remove skipOffsetGaps from kinesis * Add transaction api to KafkaRecordSupplierTest * Fix indent * Fix test * update kafka version to 2.1.0	2019-02-18 11:50:08 -08:00
scrawfor	0fa9000849	Add Postgresql SqlFirehose (#6813 ) * Add Postgresql SqlFirehose * Fix Code Style. * Fix style. * Fix Import Order. * Add Line Break before package.	2019-02-14 22:52:03 -08:00
awelsh93	ee91e27fe7	Update api-reference.md doc (#7065 ) - moving description of coordinator isLeader endpoint	2019-02-14 14:38:09 +00:00
Edward Gan	90c1a54b86	Moments Sketch custom aggregator (#6581 ) * Moments Sketch Integration with Druid * updates, add documentation, fix warnings * nits * disallowed base64 * update to druid 0.14	2019-02-13 14:03:47 -08:00
Jihoon Son	970308463d	Add doc for Hadoop-based ingestion vs Native batch ingestion (#7044 ) * Add doc for Hadoop-based ingestion vs Native batch ingestion * add links * add links	2019-02-13 11:23:08 -08:00
Jihoon Son	b1c4a5de0d	Fix and improve doc for partitioning of local index (#7064 )	2019-02-13 11:20:52 -08:00
Jihoon Son	d42de574d6	Add an api to get all lookup specs (#7025 ) * Add an api to get all lookup specs * add doc	2019-02-08 11:05:59 -08:00
Jihoon Son	8e3a58f723	Improve druid.storage.sse.kms.keyId and druid.s3.protocol (#7012 ) * Improve druid.storage.sse.kms.keyId and druid.s3.protocol * fix article	2019-02-06 15:00:51 -08:00
Jihoon Son	75c70c2ccc	Add doc for S3 permissions settings (#7011 ) * Add doc for S3 permissions settings * add a comment about additional settings	2019-02-05 11:52:09 -08:00
Egor Riashin	97b6407983	maintenance mode for Historical (#6349 ) * maintenance mode for Historical forbidden api fix, config deserialization fix logging fix, unit tests * addressed comments * addressed comments * a style fix * addressed comments * a unit-test fix due to recent code-refactoring * docs & refactoring * addressed comments * addressed a LoadRule drop flaw * post merge cleaning up	2019-02-04 18:11:00 -08:00
Jonathan Wei	953b96d0a4	Add more sketch aggregator support in Druid SQL (#6951 ) * Add more sketch aggregator support in Druid SQL * Add docs * Tweak module serde register * Fix tests * Checkstyle * Test fix * PR comment * PR comment * PR comments	2019-02-02 22:34:53 -08:00
Surekha	7baa33049c	Introduce published segment cache in broker (#6901 ) * Add published segment cache in broker * Change the DataSegment interner so it's not based on DataSEgment's equals only and size is preserved if set * Added a trueEquals to DataSegment class * Use separate interner for realtime and historical segments * Remove trueEquals as it's not used anymore, change log message * PR comments * PR comments * Fix tests * PR comments * Few more modification to * change the coordinator api * removeall segments at once from MetadataSegmentView in order to serve a more consistent view of published segments * Change the poll behaviour to avoid multiple poll execution at same time * minor changes * PR comments * PR comments * Make the segment cache in broker off by default * Added a config to PlannerConfig * Moved MetadataSegmentView to sql module * Add doc for new planner config * Update documentation * PR comments * some more changes * PR comments * fix test * remove unintentional change, whether to synchronize on lifecycleLock is still in discussion in PR * minor changes * some changes to initialization * use pollPeriodInMS * Add boolean cachePopulated to check if first poll succeeds * Remove poll from start() * take the log message out of condition in stop()	2019-02-02 22:27:13 -08:00
Justin Borromeo	6430ef8e1b	lol (#6985 )	2019-02-01 14:21:13 -08:00
Clint Wylie	7a5827e12e	bloom filter sql aggregator (#6950 ) * adds sql aggregator for bloom filter, adds complex value serde for sql results * fix tests * checkstyle * fix copy-paste	2019-02-01 13:54:46 -08:00
lxqfy	e45f9ea5e9	Update metrics.md (#6976 )	2019-02-01 13:40:44 -08:00
jorbay-au	852fe86ea2	Remove repeated word in indexing-service.md (#6983 )	2019-02-01 13:38:22 -08:00
Furkan KAMACI	185a7d4fc5	Updated definition and added link for Zookeeper connection string. (#6961 ) * Updated definition and added link for Zookeeper connection string. * Conflicts are merged.	2019-01-31 10:14:42 -08:00
Gian Merlino	54735a5ad1	Kafka indexing: Remove experimental notice. (#6970 )	2019-01-31 09:54:22 -08:00
Surekha	4c211ab2b4	update sys table docs (#6955 ) * update sys table docs * Capitalize SQL	2019-01-31 08:51:39 -08:00
Jonathan Wei	82137874ea	Add master/data/query server concepts to docs/packaging (#6916 ) * Add master/data/query server concepts to docs/packaging * PR comments * TOC and markdown fix * Update image legend * PR comment * More PR comments	2019-01-30 19:41:07 -08:00
Jihoon Son	d4fbbb8deb	Support protocol configuration for S3 (#6954 ) * Support protocol configuration for S3 * Add doc	2019-01-30 19:32:00 -08:00
Gian Merlino	edee576a7a	Add doc for druid.storage.useS3aSchema. (#6964 )	2019-01-30 10:26:37 -08:00
Clint Wylie	a6d81c0d16	Adds bloom filter aggregator to 'druid-bloom-filters' extension (#6397 ) * blooming aggs * partially address review * fix docs * minor test refactor after rebase * use copied bloomkfilter * add ByteBuffer methods to BloomKFilter to allow agg to use in place, simplify some things, more tests * add methods to BloomKFilter to get number of set bits, use in comparator, fixes * more docs * fix * fix style * simplify bloomfilter bytebuffer merge, change methods to allow passing buffer offsets * oof, more fixes * more sane docs example * fix it * do the right thing in the right place * formatting * fix * avoid conflict * typo fixes, faster comparator, docs for comparator behavior * unused imports * use buffer comparator instead of deserializing * striped readwrite lock for buffer agg, null handling comparator, other review changes * style fixes * style * remove sync for now * oops * consistency * inspect runtime shape of selector instead of selector plus, static comparator, add inner exception on serde exception * CardinalityBufferAggregator inspect selectors instead of selectorPluses * fix style * refactor away from using ColumnSelectorPlus and ColumnSelectorStrategyFactory to instead use specialized aggregators for each supported column type, other review comments * adjustment * fix teamcity error? * rename nil aggs to empty, change empty agg constructor signature, add comments * use stringutils base64 stuff to be chill with master * add aggregate combiner, comment	2019-01-29 20:05:17 +07:00
Justin Borromeo	8d70ba69cf	Fix broken link on select query doc page (#6933 ) * Fixed broken link * Typo fix	2019-01-28 17:02:21 -08:00
Clint Wylie	af3cbc3687	add bloom filter druid expression (#6904 ) * add "bloom_filter_test" druid expression to support bloom filters in ExpressionVirtualColumn and ExpressionDimFilter and sql expressions * more docs * use java.util.Base64, doc fixes	2019-01-28 08:41:45 -05:00
Navin Kumar	ae4dba7785	Fix Configuration options (#6884 ) Change `druid.metadata.postgres.` to `druid.metadata.postgres.ssl.`	2019-01-27 12:35:27 -08:00
Gian Merlino	7c5a06bb85	More docs on data modeling. (#6899 ) * More docs on data modeling. * Try to fix formatting. * Fix indentation. * More details and adjustments after feedback.	2019-01-27 11:33:21 -08:00
Janek Lasocki-Biczysko	89f2475369	Move ingest/kafka/* metrics into a separate section on the metrics docs (#6895 ) The `ingest/kafka/*` metrics were grouped together with metrics relevant to RealtimeMetricsMonitor, whereas they should be in their own section.	2019-01-28 00:11:53 +08:00
Jihoon Son	3b020fd81b	Improve doc for auto compaction (#6782 ) * Improve doc for auto compaction * address comments * address comments * address comments	2019-01-23 16:21:45 -08:00
Justin Borromeo	86e171a234	Doc change and commands tested command on v5 and v8 (#6886 )	2019-01-18 15:13:11 -08:00
Jonathan Wei	68f744ec0a	Fixed buckets histogram aggregator (#6638 ) * Fixed buckets histogram aggregator * PR comments * More PR comments * Checkstyle * TeamCity * More TeamCity * PR comment * PR comment * Fix doc formatting	2019-01-17 14:51:16 -08:00
lxqfy	f6dcd63084	Fixed the format of broker client configration (#6878 )	2019-01-16 22:57:50 -08:00
Dayue Gao	5b8a221713	Add SQL id, request logs, and metrics (#6302 ) * use SqlLifecyle to manage sql execution, add sqlId * add sql request logger * fix UT * rename sqlId to sqlQueryId, sql/time to sqlQuery/time, etc * add docs and more sql request logger impls * add UT for http and jdbc * fix forbidden use of com.google.common.base.Charsets * fix UT in QuantileSqlAggregatorTest, supressed unused warning of getSqlQueryId * do not use default method in QueryMetrics interface * capitalize 'sql' everywhere in the non-property parts of the docs * use RequestLogger interface to log sql query * minor bugfixes and add switching request logger * add filePattern configs for FileRequestLogger * address review comments, adjust sql request log format * fix inspection error * try SuppressWarnings("RedundantThrows") to fix inspection error on ComposingRequestLoggerProvider	2019-01-15 23:12:59 -08:00
Jonathan Wei	9a8bade2fb	Update approximate aggregators docs (#6848 )	2019-01-11 21:50:51 -08:00
Furkan KAMACI	55927bf8e3	Kafka version is updated (#6835 ) Update Kafka version in tutorial from 0.10.2.0 to 0.10.2.2	2019-01-10 17:58:40 -08:00
Jihoon Son	c35a39d70b	Add support maxRowsPerSegment for auto compaction (#6780 ) * Add support maxRowsPerSegment for auto compaction * fix build * fix build * fix teamcity * add test * fix test * address comment	2019-01-10 09:50:14 -08:00
Furkan KAMACI	ea973fee6b	Tranquility version is updated (#6824 )	2019-01-10 09:46:58 +08:00
dongyifeng	def823124c	add version comparator for StringComparator (#6745 ) * add version comparator for StringComparator * add more test case and docs	2019-01-08 17:17:03 -08:00
Benjamin Hopp	ef80c4e036	Update sql.md (#6821 ) Corrected defaults for druid.sql.avatica.maxStatementsPerConnection and druid.sql.avatica.maxConnections	2019-01-08 10:15:12 -08:00
Janek Lasocki-Biczysko	b88e6304c4	Fix broken link in ingestion/schema-design.md docs (#6810 )	2019-01-06 18:20:53 -08:00
David Glasser	c08f391605	statsd-emitter: support constant DogStatsD tags (#6791 ) PR #6605 added support to the statsd emitter for DogStatsD tags. This commit lets you specify "constant tags" in the config file which are included with every event. This is helpful if you are running in an environment where you cannot configure your datadog-agent with tags like "cluster name" --- eg, a Kubernetes cluster with a datadog-agent on each node and different Druid deployments in different namespaces but sharing the same datadog-agent daemonset. Also fix the name of an existing boolean getter to start with 'is'.	2019-01-04 15:35:37 +08:00
thomask	0e04acca43	Show how to include classpath in command (#6802 ) Would have saved me some time	2019-01-03 18:31:55 -08:00
Jihoon Son	9ad6a733a5	Add support segmentGranularity for CompactionTask (#6758 ) * Add support segmentGranularity * add doc and fix combination of options * improve doc	2019-01-03 17:50:45 -08:00
Mingming Qiu	6761663509	make kafka poll timeout can be configured (#6773 ) * make kafka poll timeout can be configured * add doc * rename DEFAULT_POLL_TIMEOUT to DEFAULT_POLL_TIMEOUT_MILLIS	2019-01-03 12:16:02 +08:00
Mingming Qiu	114a9fc38f	change propertyBase in ServerViewModule (#6774 )	2019-01-02 16:44:02 +08:00
Clint Wylie	67f832957b	add bloom filter operator to general sql docs (#6785 )	2018-12-31 11:30:33 -08:00
Joshua Sun	7c7997e8a1	Add Kinesis Indexing Service to core Druid (#6431 ) * created seekablestream classes * created seekablestreamsupervisor class * first attempt to integrate kafa indexing service to use SeekableStream * seekablestream bug fixes * kafkarecordsupplier * integrated kafka indexing service with seekablestream * implemented resume/suspend and refactored some package names * moved kinesis indexing service into core druid extensions * merged some changes from kafka supervisor race condition * integrated kinesis-indexing-service with seekablestream * unite tests for kinesis-indexing-service * various bug fixes for kinesis-indexing-service * refactored kinesisindexingtask * finished up more kinesis unit tests * more bug fixes for kinesis-indexing-service * finsihed refactoring kinesis unit tests * removed KinesisParititons and KafkaPartitions to use SeekableStreamPartitions * kinesis-indexing-service code cleanup and docs * merge #6291 merge #6337 merge #6383 * added more docs and reordered methods * fixd kinesis tests after merging master and added docs in seekablestream * fix various things from pr comment * improve recordsupplier and add unit tests * migrated to aws-java-sdk-kinesis * merge changes from master * fix pom files and forbiddenapi checks * checkpoint JavaType bug fix * fix pom and stuff * disable checkpointing in kinesis * fix kinesis sequence number null in closed shard * merge changes from master * fixes for kinesis tasks * capitalized <partitionType, sequenceType> * removed abstract class loggers * conform to guava api restrictions * add docker for travis other modules test * address comments * improve RecordSupplier to supply records in batch * fix strict compile issue * add test scope for localstack dependency * kinesis indexing task refactoring * comments * github comments * minor fix * removed unneeded readme * fix deserialization bug * fix various bugs * KinesisRecordSupplier unable to catch up to earliest position in stream bug fix * minor changes to kinesis * implement deaggregate for kinesis * Merge remote-tracking branch 'upstream/master' into seekablestream * fix kinesis offset discrepancy with kafka * kinesis record supplier disable getPosition * pr comments * mock for kinesis tests and remove docker dependency for unit tests * PR comments * avg lag in kafkasupervisor #6587 * refacotred SequenceMetadata in taskRunners * small fix * more small fix * recordsupplier resource leak * revert .travis.yml formatting * fix style * kinesis docs * doc part2 * more docs * comments * comments2 revert string replace changes * comments * teamcity * comments part 1 * comments part 2 * comments part 3 * merge #6754 * fix injection binding * comments * KinesisRegion refactor * comments part idk lol * can't think of a commit msg anymore * remove possiblyResetDataSourceMetadata() for IncrementalPublishingTaskRunner * commmmmmmmmmments * extra error handling in KinesisRecordSupplier getRecords * comments * quickfix * typo * oof	2018-12-21 12:49:24 -07:00
Gian Merlino	7a09cde4de	Broker: Await initialization before finishing startup. (#6742 ) * Broker: Await initialization before finishing startup. In particular, hold off on announcing the service and starting the HTTP server until the server view and SQL metadata cache are finished initializing. This closes a window of time where a Broker could return partial results shortly after startup. As part of this, some simplification of server-lifecycle service announcements. This helps ensure that the two different kinds of announcements we do (legacy and new-style) stay in sync. * Remove unused imports. * Fix NPE in ServerRunnable.	2018-12-18 20:32:31 -08:00
Jihoon Son	2c380e3a26	Fix doc for automatic compaction (#6749 )	2018-12-17 11:44:33 -08:00
Jonathan Wei	c713116a75	Use @Coordinator leader client in CoordinatorRuleManager (#6729 )	2018-12-16 15:18:09 -08:00
Gian Merlino	04e7c7fbdc	FilteredRequestLogger: Fix start/stop, invalid delegate behavior. (#6637 ) * FilteredRequestLogger: Fix start/stop, invalid delegate behavior. Fixes two bugs: 1) FilteredRequestLogger did not start/stop the delegate. 2) FilteredRequestLogger would ignore an invalid delegate type, and instead silently substitute the "noop" logger. This was due to a larger problem with RequestLoggerProvider setup in general; the fix here is to remove "defaultImpl" from the RequestLoggerProvider interface, and instead have JsonConfigurator be responsible for creating the default implementations. It is stricter about things than the old system was, and is only willing to make a noop logger if it doesn't see any request logger configs. Otherwise, it'll raise a provision error. * Remove unneeded annotations.	2018-12-14 16:55:44 +08:00
Clint Wylie	4ec068642d	move parquet extension input formats up a level to `org.apache.druid.data.input.parquet.DruidParquetInputFormat` for `parquet` and `org.apache.druid.data.input.parquet.DruidParquetAvroInputFormat` for `parquet-avro` (#6727 )	2018-12-13 16:33:42 -08:00
David Lim	f7bbee2e65	Front Matter header needs to be on the first line for md to be rendered properly by jekyll (#6733 )	2018-12-13 11:47:20 -08:00
Vadim Ogievetsky	da4836f38c	Added titles and harmonized docs to improve usability and SEO (#6731 ) * added titles and harmonized docs * manually fixed some titles	2018-12-12 20:42:12 -08:00
Clint Wylie	55914687bb	Fix broken link in docs toc (#6728 ) Change 'peon.html' to the correct link, 'peons.html'. No redirect is needed because the file has always been 'peons', just an incorrect link was introduced in the toc here https://github.com/apache/incubator-druid/pull/6259/files#diff-45297643736c5fb6da0e92f2c3df5d68R89	2018-12-12 15:14:38 -08:00
Vincent Newkirk	cc44a4a28f	Correct Documentation for lowerStrict/upperStrict (#6707 ) The documentation for Bound filter's lowerStrict/upperStrict is incorrect. It is not consistent with the examples provided and actual behaviour of the bound filter. Correct this.	2018-12-06 10:14:50 -08:00
Mingming Qiu	607339003b	Add TaskCountStatsMonitor to monitor task count stats (#6657 ) * Add TaskCountStatsMonitor to monitor task count stats * address comments * add file header * tweak test	2018-12-04 13:37:17 -08:00
Clint Wylie	a1c9d0add2	autosize processing buffers based on direct memory sizing by default (#6588 ) * autosize processing buffers based on direct memory sizing * remove oops, more test * max 1gb autosize buffers, test, start of docs * fix oops * revert accidental change * print buffer size in exception * change the things	2018-12-03 18:40:02 -07:00
David Lim	e2bedab665	fix links to use relative references (#6696 )	2018-11-30 16:32:10 -08:00
David Lim	b332021c49	remove extensions from default configs that have configuration/library dependencies and update docs (#6694 )	2018-11-30 12:52:46 -08:00
rcgarcia74	9bf835b84f	remove #658 doc reference for Schema-less design (#6693 )	2018-11-30 12:53:57 -07:00
Jihoon Son	d6539abd0a	Fix overlord api and console (#6686 ) * Fix overlord APIs and console * remove getRunningTasksByDataSource * add missing path to isApplicable	2018-11-29 23:45:28 -08:00
Mingming Qiu	c5405bb592	emit maxLag/avgLag in KafkaSupervisor (#6587 ) * emit maxLag/totalLag/avgLag in KafkaSupervisor * modify ingest/kafka/totalLag to ingest/kafka/lag for backwards compatibility	2018-11-28 02:11:14 -08:00
Mingming Qiu	849ba867b2	fix missing property in JsonTypeInfo of SegmentWriteOutMediumFactory (#6656 )	2018-11-27 15:59:58 -08:00
Clint Wylie	efdec50847	bloom filter sql (#6502 ) * bloom filter sql support * docs * style fix * style fixes after rebase * use copied/patched bloomkfilter * remove context literal lookup function, changes from review * fix build * rename LookupOperatorConversion to QueryLookupOperatorConversion * remove doc * revert unintended change * add internal exception to bloom filter deserialization exception	2018-11-27 14:11:18 +08:00
Evans Hauser	03df481c9c	Docs: Fix wikipedia links in Ingestion:Rollup (#6659 ) The rendered site doesn't have automatic link detection, so we need to add these links in explicitly. This also fixes the Measure link, which included an extra `)` http://druid.io/docs/latest/ingestion/index.html#rollup	2018-11-23 16:28:05 -08:00
seoeun	22a5bf97a2	Fix issue that tasks tables in metadata storage are not cleared (#6592 ) * tasks tables in metadata storage are not cleared * address comments. remove tasklogs and revert obsolete changes * address comments. change comment and update doc. * address comments. update doc more detailed * address comments. remove redundant log and update doc more detailed. * address comments. update document	2018-11-22 11:50:31 +08:00
Jonathan Wei	e285b1103d	Use PasswordProvider for basic HTTP escalator (#6650 )	2018-11-21 07:34:15 -08:00
Caroline1000	a438a9b99c	fix typo in config page of docs (#6645 )	2018-11-19 16:32:58 -08:00
Deiwin Sarjas	e0d1dc5846	Support DogStatsD style tags in statsd-emitter (#6605 ) * Replace StatsD client library The [Datadog package][1] is a StatsD compatible drop-in replacement for the client library, but it seems to be [better maintained][2] and has support for Datadog DogStatsD specific features, which will be made use of in a subsequent commit. The `count`, `time`, and `gauge` methods are actually exactly compatible with the previous library and the modifications shouldn't be required, but EasyMock seems to have a hard time dealing with the variable arguments added by the DogStatsD library and causes tests to fail if no arguments are provided for the last String vararg. Passing an empty array fixes the test failures. [1]: https://github.com/DataDog/java-dogstatsd-client [2]: https://github.com/tim-group/java-statsd-client/issues/37#issuecomment-248698856 * Retain dimension key information for StatsD metrics This doesn't change behavior, but allows separating dimensions from the metric name in subsequent commits. There is a possible order change for values from `dimsBuilder.build().values()`, but from the tests it looks like it doesn't affect actual behavior and the order of user dimensions is also retained. * Support DogStatsD style tags in statsd-emitter Datadog [doesn't support name-encoded dimensions and uses a concept of _tags_ instead.][1] This change allows Datadog users to send the metrics without having to encode the various dimensions in the metric names. This enables building graphs and monitors with and without aggregation across various dimensions from the same data. As tests in this commit verify, the behavior remains the same for users who don't enable the `druid.emitter.statsd.dogstatsd` configuration flag. 
[1]: https://www.datadoghq.com/blog/the-power-of-tagged-metrics/#tags-decouple-collection-and-reporting * Disable convertRange behavior for DogStatsD users DogStatsD, unlike regular StatsD, supports floating-point values, so this behavior is unnecessary. It would be possible to still support `convertRange`, even with `dogstatsd` enabled, but that would mean that people using the default mapping would have some of the gauges unnecessarily converted. `time` is in milliseconds and doesn't support floating-point values.	2018-11-19 09:47:57 -08:00
Gian Merlino	7cd457f41c	Kafka: Add warning to doc for earlyMessageRejectionPeriod. (#6644 )	2018-11-18 15:47:38 -07:00
Benjamin Hopp	8a258d3a6a	Fix Hadoop Indexing doc to clarify segmentOutputPath only required for CLI indexer (#6636 ) * Updated hadoop indexing doc to reflect segmentOutputPath is only required when using CLI indexer; otherwise it must be NULL	2018-11-17 12:19:20 +08:00
Niketh Sabbineni	2ebdce20b1	Fix smile query documentation (#6620 )	2018-11-14 08:51:02 +08:00
Jihoon Son	cdae2fe7b5	Deprecate IntervalChunkingQueryRunner (#6591 ) * Deprecate IntervalChunkingQueryRunner * add doc * deprecate metric * fix doc	2018-11-14 06:33:27 +08:00
Gian Merlino	154b6fbcef	SQL: Add "POSITION" function. (#6596 ) Also add a "fromIndex" argument to the strpos expression function. There are some -1 and +1 adjustment terms due to the fact that the strpos expression behaves like Java indexOf (0-indexed), but the POSITION SQL function is 1-indexed.	2018-11-13 13:39:00 -08:00
Jihoon Son	7b262b7123	Remove unnecessary path param from auto compaction api (#6594 ) * Remove unnecessary path param from auto compaction api * fix ci	2018-11-13 09:46:13 -08:00
David Lim	afb239b17a	add missing license headers, in particular to MD files; clean up RAT … (#6563 ) * add missing license headers, in particular to MD files; clean up RAT exclusions * revert inadvertent doc changes * docs * cr changes * fix modified druid-production.svg	2018-11-13 09:38:37 -08:00
Clint Wylie	1224d8b746	overhaul 'druid-parquet-extensions' module, promoting from 'contrib' to 'core' (#6360 ) * move parquet-extensions from contrib to core, adds new hadoop parquet parser that does not convert to avro first and supports flattenSpec and int96 columns, add support for flattenSpec for parquet-avro conversion parser, much test with a bunch of files lifted from spark-sql * fix avro flattener to support nullable primitives for auto discovery and now only supports primitive arrays instead of all arrays * remove leftover print * convert micro timestamp to millis * checkstyle * add ignore for .parquet and .parq to rat exclude * fix legit test failure from avro flattern behavior change * fix rebase * add exclusions to pom to cut down on redundant jars * refactor tests, add support for unwrapping lists for parquet-avro, review comments * more comment * fix oops * tweak parquet-avro list handling * more docs * fix style * grr styles	2018-11-05 21:33:42 -08:00
David Lim	23ad3d214c	fixup docs to download from Apache mirror, fixup tarball name and path, change references from quickstart/* to quickstart/tutorial/* (#6570 )	2018-11-01 21:47:29 -07:00
Caroline1000	26d992840c	correct default tier name (#6568 )	2018-11-01 17:51:13 -07:00
QiuMM	ddd15a6907	correct default value for maxTotalRows (#6566 )	2018-11-01 16:53:15 -07:00
Jihoon Son	a92c2a197b	Move supervisor APIs to api-reference (#6555 ) * Move supervisor APIs to api-reference * fix kafka-specific docs * add ingestion stats report	2018-11-01 13:10:05 -07:00
QiuMM	7b34662462	Period load/drop/broadcast rules should include the future by default (#6414 ) * Period load/drop/broadcast rules should include the future by default * address comments * adjust coordinator console and tweak docs * address comments * fix travis-ci	2018-11-01 09:43:34 -07:00
Jihoon Son	d2a533c7c7	Add doc for missing balancerComputeThreads configuration (#6561 ) * Add doc for missing balancerComputeThreads configuration * remove duplicate	2018-10-31 18:43:12 -07:00
taiii	b1159174b7	Update mysql.md (#6545 )	2018-10-30 14:01:32 -07:00
Jonathan Wei	8382764900	Remove unused bin/init script, conf-quickstart reference (#6520 )	2018-10-26 11:30:01 -07:00
Jonathan Wei	b2d9b6f23d	Allow custom TLS cert checks (#6432 ) * Allow custom TLS cert checks * PR comment * Checkstyle, PR comment	2018-10-24 16:31:52 -07:00
QiuMM	601183b4c7	Add period drop before rule (#6415 ) * Add period drop before rule * add license header * support period drop before rule in coordinator console * address comments	2018-10-24 12:44:30 -07:00
David Lim	822e564f54	include mysql-metadata-storage extension in distribution, but without… (#6497 ) * include mysql-metadata-storage extension in distribution, but without the GPL-licensed connector library * Install mysql connector package * use symlinks to avoid versioning issues * add documentation for fetching the mysql connector	2018-10-20 18:18:58 -07:00
QiuMM	f5f4171a45	QueryCountStatsMonitor: emit query/count (#6473 ) Let `QueryCountStatsMonitor` emit `query/count`, then I can monitor QPS of my services, or I have to count it by myself.	2018-10-19 10:15:02 -03:00
patelh	c780aacc03	Add ability to specify dbcp properties file (#6419 ) * Add ability to specify dbcp properties file * Address PR comments, use mock config, remove setter * Add documentation * APRC, updated docs with example file contents * APRC, add @Nullable, @VisibleForTesting, update doc * APRC, remove error log, use props directly as jackson binding * Remove unused files	2018-10-16 12:27:19 -07:00
QiuMM	85a89e2703	make druid node bind address configurable (#6464 ) * make druid node bind address configurable * fix tests * fix travis-ci	2018-10-15 14:19:40 -07:00
robertervin	95ab1ea737	Fix Empty InDimFilter Failure (#6330 ) * fix empty InDimFilter failure (#6101) * Add test case for empty values input * Add documentation for empty values in InDimFilter	2018-10-14 20:43:16 -07:00
Clint Wylie	84598fba3b	combine druid-api, druid-common, java-util into druid-core (#6443 ) * combine druid-api, druid-common, java-util * spacing	2018-10-14 20:37:37 -07:00
dongyifeng	b06ac54a5e	add PrefixFilteredDimensionSpec for multi-value dimensions (#6307 ) * add PrefixFilteredDimensionSpec for multi-value dimensions * add docs for PrefixFilteredDimensionSpec * remove unnecessary null handling * add null check to the result of NullHandling	2018-10-12 17:51:09 -07:00
vishnu rao	6567fff9e7	Query Response format to be based on http 'accept' header & Query Payload content type to be based on 'content-type' header (#4033 ) * o- Query Response format to be based on http 'accept' header & Query Payload content type to be based on 'content-type' header * o- Query Response format to be based on http 'accept' header & Query Payload content type to be based on 'content-type' header o- if Accept header is absent, it defaults to Content-Type header * Feature: Query Response format to be based on http 'accept' header & Query Payload content type to be based on 'content-type' PR #4033 Minor change to a comment - restoring to previous wording * Feature: Query Response format to be based on http 'accept' header & Query Payload content type to be based on 'content-type' PR #4033 o- minor change to check for empty string	2018-10-12 14:29:14 -07:00
Atul Mohan	ab7b4798cc	Securing passwords used for SSL connections to Kafka (#6285 ) * Secure credentials in consumer properties * Merge master * Refactor property population into separate method * Fix property setter * Fix tests	2018-10-11 10:03:01 -07:00
QiuMM	f8f4526b16	Add suspend|resume|terminate all supervisors endpoints. (#6272 ) * ability to shut down all supervisors * add doc * address comments * fix code style * address comments * change ternary assignment to if statement * better docs	2018-10-10 21:41:59 -07:00
Clint Wylie	f7775d1db3	fixes for LookupReferencesManagerTest (#6444 ) * some fixes for LookupReferencesManagerTest * docs * formatting * more formatting fixes	2018-10-10 18:02:11 -07:00
Surekha	3a0a667fe0	Introduce SystemSchema tables (#5989 ) (#6094 ) * Added SystemSchema with following tables (#5989) * SEGMENTS table provides details on served and published segments * SERVERS table provides details on data servers * SERVERSEGMETS table is the JOIN of SEGMENTS and SERVERS * TASKS table provides details on tasks * Add documentation for system schema * Fix static-analysis warnings * Address PR comments Add unit tests Fix a test * Try to fix a test * Fix a bug around replica count * rename io.druid to org.apache.druid * Major change is to make tasks and segment queries streaming * Made tasks/segments stream to calcite instead of storing it in memory * Add num_rows to segments table * Refactor JsonParserIterator * Replace with closeable iterator * Fix docs, make num_rows column nullable, some unit test changes * make num_rows column type long, allow it to be null fix a compile error after merge, add TrafficCop param to InputStreamResponseHandler * Filter null rows for segments table from Linq4j enumerable * change num_replicas datatype to long in segments table * Fix some tests and address comments * Doc updates, other PR comments * Update tests * Address comments * Add auth check * Update docs * Refactoring * Fix teamcity warning, change the getQueryableServer in TimelineServerView * Fix compilation after rebase * Use the stream API from AuthorizationUtils * Added LeaderClient interface and NoopDruidLeaderClient class * Revert "Added LeaderClient interface and NoopDruidLeaderClient class" This reverts commit `100fa46e39`. 
* Make the naming consistent to server_segments for the join table * Add ForbiddenException on auth check failure * Remove static block from SystemSchema * Try to fix a test in CalciteQueryTest due to rename of server_segments * Fix the json output format in the coordinator API * Add auth check in the segments API * Add null check to avoid NPE * Use annonymous class object instead of mock for DruidLeaderClient in SqlBenchmark * Fix test failures, type long/BIGINT can be nullable * Revert long nullability to fix tests * Fix style for tests * PR comments * Address PR comments * Add the missing BytesAccumulatingResponseHandler class * Use Sequences.withBaggage in DruidPlanner * Fix docs, add comments * Close the iterator if hasNext returns false	2018-10-10 17:17:29 -07:00
QiuMM	d559dfecb2	replace deprecated druid.port by druid.plaintextPort in docs (#6427 )	2018-10-09 10:57:01 -07:00
Jihoon Son	88d23b77b7	Add support keepSegmentGranularity for automatic compaction (#6407 ) * Add support keepSegmentGranularity for automatic compaction * skip unknown dataSource * ignore single segment to compact * add doc * address comments * address comment	2018-10-07 16:48:58 -07:00
Jihoon Son	45aa51a00c	Add support hash partitioning by a subset of dimensions to indexTask (#6326 ) * Add support hash partitioning by a subset of dimensions to indexTask * add doc * fix style * fix test * fix doc * fix build	2018-10-06 16:45:07 -07:00
Roman Leventov	c5872bef41	Improve GC metrics documentation (#6423 )	2018-10-05 14:57:01 -07:00
Gian Merlino	244046fda5	SQL: Fix too-long headers in http responses. (#6411 ) Fixes #6409 by moving column name info from HTTP headers into the result body.	2018-10-01 18:13:08 -07:00
Jihoon Son	cb14a43038	Remove ConvertSegmentTask, HadoopConverterTask, and ConvertSegmentBackwardsCompatibleTask (#6393 ) * Remove ConvertSegmentTask, HadoopConverterTask, and ConvertSegmentBackwardsCompatibleTask * update doc and remove auto conversion * remove remaining doc * fix teamcity	2018-10-01 12:03:35 -07:00
Shiv Toolsidass	a56ffe6ab2	Added backpressure metric to docs and defaultMetricDimensions (#6405 ) * Added backpressure metric to docs and defaultMetricDimensions.json * Reworded description for backpressure metric in docs	2018-09-29 17:57:29 -07:00
adursun	6f44e568db	Add missing comma (#6399 )	2018-09-28 09:02:36 -07:00
QiuMM	47a6cca013	Add TimestampSpec format for microsecond (#6395 )	2018-09-27 09:38:44 -07:00
Jihoon Son	6fb503c073	Deprecate task audit logging (#6368 ) * Deprecate task audit logging * fix test * fix it test	2018-09-26 16:28:02 -07:00
Nishant Bangarwa	c9d281a2e9	Add ability to pass in Bloom filter from Hive Queries (#6222 ) * Bloom filter initial implementation fix checkstyle review comments Fix wierd failure review comments Revert "Fix wierd failure" This reverts commit a13a83ad7887e679f6d539191b52aeaaea85b613. * fix test * review comment	2018-09-26 16:04:26 -07:00
Caroline1000	034d006d24	add to docs on including Drop rule with Load rule (#6378 )	2018-09-25 20:13:52 -07:00
Benedict Jin	e5d9fcfe8f	Add maven.exec.xxx.skip option for exec-maven-plugin (#6162 ) * Fix conflicts * Modify io.druid into org.apache.druid	2018-09-25 10:05:26 -07:00
Jihoon Son	99428e20d2	Deprecate dimensions / metrics APIs on brokers (#6361 ) * Deprecate dimensions / metrics APIs on brokers * add segmentMetadataQuery link * add more doc	2018-09-24 17:56:38 -07:00
Jonathan Wei	ee7b565469	Docs for ingestion stat reports and new parse exception handling (#6373 )	2018-09-24 17:45:05 -07:00
Alexander Saydakov	93345064b5	HllSketch module (#5712 ) * HllSketch module * updated license and imports * updated package name * implemented makeAggregateCombiner() * removed json marks * style fix * added module * removed unnecessary import, side effect of package renaming * use ThreadLocalRandom * addressing code review points, mostly formatting and comments * javadoc * natural order with nulls * typo * factored out raw input value extraction * singleton * style fix * style fix * use Collections.singletonList instead of Arrays.asList * suppress warning	2018-09-24 08:41:56 -07:00
Jonathan Wei	f12ffd19a8	Add Kafka reset instructions for tutorial (#6362 )	2018-09-21 14:18:31 -07:00
Jonathan Wei	8972244c68	Mutual TLS support (#6076 ) * Mutual TLS support * Kafka test fixes * TeamCity fix * Split integration tests * Use localhost DOCKER_IP * Increase server thread count * Increase SSL handshake timeouts * Add broken pipe retries, use injected client config params * PR comments, Rat license check exclusion	2018-09-19 09:56:15 -07:00
Dayue Gao	edf0c13807	add a sql option to force user to specify time condition (#6246 ) * add a sql option to force user to specify time condition * rename forceTimeCondition to requireTimeCondition, refine error message	2018-09-17 13:52:24 -07:00
QiuMM	288aa4d504	Add missing metadata table information in docs (#6309 ) * Add missing metadata table information in doc file * address review comment	2018-09-14 12:17:05 -07:00
QiuMM	85391e9fb3	fix opentsdb emitter always be running and fail sending tags whose value contains colon (#6251 ) * fix opentsdb emitter always be running * check if emitter started * add more details about consumeDelay in doc * fix possible thread unsafe * fix fail sending tags whose value contain colon	2018-09-14 12:14:15 -07:00
QiuMM	87ccee05f7	Add ability to specify list of task ports and port range (#6263 ) * support specify list of task ports * fix typos * address comments * remove druid.indexer.runner.separateIngestionEndpoint config * tweak doc * fix doc * code cleanup * keep some useful comments	2018-09-13 19:36:04 -07:00
Jonathan Wei	fd6786ac6c	Fix endpoint permissions section in basic-security docs (#6331 )	2018-09-13 15:23:41 -07:00
Clint Wylie	91a37c692d	'suspend' and 'resume' support for supervisors (kafka indexing service, materialized views) (#6234 ) * 'suspend' and 'resume' support for kafka indexing service changes: * introduces `SuspendableSupervisorSpec` interface to describe supervisors which support suspend/resume functionality controlled through the `SupervisorManager`, which will gracefully shutdown the supervisor and it's tasks, update it's `SupervisorSpec` with either a suspended or running state, and update with the toggled spec. Spec updates are provided by `SuspendableSupervisorSpec.createSuspendedSpec` and `SuspendableSupervisorSpec.createRunningSpec` respectively. * `KafkaSupervisorSpec` extends `SuspendableSupervisorSpec` and now supports suspend/resume functionality. The difference in behavior between 'running' and 'suspended' state is whether the supervisor will attempt to ensure that indexing tasks are or are not running respectively. Behavior is identical otherwise. * `SupervisorResource` now provides `/druid/indexer/v1/supervisor/{id}/suspend` and `/druid/indexer/v1/supervisor/{id}/resume` which are used to suspend/resume suspendable supervisors * Deprecated `/druid/indexer/v1/supervisor/{id}/shutdown` and moved it's functionality to `/druid/indexer/v1/supervisor/{id}/terminate` since 'shutdown' is ambiguous verbage for something that effectively stops a supervisor forever * Added ability to get all supervisor specs from `/druid/indexer/v1/supervisor` by supplying the 'full' query parameter `/druid/indexer/v1/supervisor?full` which will return a list of json objects of the form `{"id":<id>, "spec":<SupervisorSpec>}` * Updated overlord console ui to enable suspend/resume, and changed 'shutdown' to 'terminate' * move overlord console status to own column in supervisor table so does not look like garbage * spacing * padding * other kind of spacing * fix rebase fail * fix more better * all supervisors now suspendable, updated materialized view supervisor to support 
suspend, more tests * fix log	2018-09-13 14:42:18 -07:00
Gian Merlino	d6cbdf86c2	Broker backpressure. (#6313 ) * Broker backpressure. Adds a new property "druid.broker.http.maxQueuedBytes" and a new context parameter "maxQueuedBytes". Both represent a maximum number of bytes queued per query before exerting backpressure on the channel to the data server. Fixes #4933. * Fix query context doc.	2018-09-10 09:33:29 -07:00
Gian Merlino	4669f0878f	SQL: UNION ALL operator. (#6314 ) * SQL: UNION ALL operator. * Remove unused import.	2018-09-09 22:32:56 -07:00
Clint Wylie	e6e068ce60	Add support for 'maxTotalRows' to incremental publishing kafka indexing task and appenderator based realtime task (#6129 ) * resolves #5898 by adding maxTotalRows to incremental publishing kafka index task and appenderator based realtime indexing task, as available in IndexTask * address review comments * changes due to review * merge fail	2018-09-07 13:17:49 -07:00
Jonathan Wei	60cbc64472	Use PasswordProvider, fix info on initial passwords in basic security extension docs (#6303 ) * Fix info on initial passwords in basic security extension docs * Use PasswordProvider * Compile fix	2018-09-05 17:07:16 -07:00
Jonathan Wei	4caa61d8fa	Fix tutorial sample data filename, fix logger classname in metrics docs (#6299 )	2018-09-04 21:47:12 -07:00
Eyal Yurman	10ca290d64	Correct file name typo in Quickstart tutorial (#6297 ) Correct name wikipedia-2015-09-12-sampled.json.gz to wikiticker-2015-09-12-sampled.json.gz	2018-09-04 14:20:17 -07:00
Jonathan Wei	180e3ccfad	Docs consistency cleanup (#6259 )	2018-09-04 12:54:41 -07:00
QiuMM	9b04846e6b	correct metric name in doc file (#6271 )	2018-08-30 10:57:35 -07:00
Gian Merlino	431d3d8497	Rename io.druid to org.apache.druid. (#6266 ) * Rename io.druid to org.apache.druid. * Fix META-INF files and remove some benchmark results. * MonitorsConfig update for metrics package migration. * Reorder some dimensions in inner queries for some reason. * Fix protobuf tests.	2018-08-30 09:56:26 -07:00
Himanshu	1fae6513e1	add "subtotalsSpec" attribute to groupBy query (#5280 ) * add subtotalsSpec attribute to groupBy query * dont sent subtotalsSpec to downstream nodes from broker and other updates * address review comment * fix checkstyle issues after merge to master * add docs for subtotalsSpec feature * address doc review comments	2018-08-28 17:46:38 -07:00
Jim Slattery	d957295b98	spelling: storage (#6248 )	2018-08-27 16:35:31 -07:00
Gian Merlino	0172326c62	SQL: Support more result formats, add columns header. (#6191 ) * SQL: Support more result formats, add columns header. - Add result formats for line-based JSON and CSV. - Add X-Druid-Sql-Columns header with a list of all columns that the response will contain. - Add more comprehensive documentation on what callers should expect when making Druid SQL queries. * Fix some tests. * Adjust tests. * Adjust trailer, add types header. * Fix trailers.	2018-08-26 23:00:14 -06:00
Susie	6e73ad6231	Fix bound query keys for Filtering on numeric values (#5881 ) It is currently showing the use of `lowerBound` and `upperBound` instead of `lower` and `upper` for the range.	2018-08-23 14:07:10 -07:00
QiuMM	ceb8f8e625	remove unnecessary tlsPortFinder to avoid potential port conflicts (#6194 )	2018-08-23 10:41:49 -07:00
Ryan Plessner	9c500fb69f	Add PostgreSQLConnectorConfig to expose SSL configuration options (#6181 ) * Add PostgreSQLConnectorConfig to expose SSL configuration options for the Postgres Metadata Storage module. * Fix checkstyle violations and add license header * Convert properties in the postgres docs to be the full property path and fix typo * Fix grammar in sslFactory docs	2018-08-21 16:45:27 -07:00
QiuMM	266f3dfbcb	remove duplicate link to operations/recommendations.html (#6193 )	2018-08-21 12:02:43 -07:00
QiuMM	b0cf8d0252	'shutdownAllTasks' API for a dataSource (#6185 ) * 'shutdownAllTasks' API for a dataSource Change-Id: I30d14390457d39e0427d23a48f4f224223dc5777 * fix api path and return Change-Id: Ib463f31ee2c4cb168cf2697f149be845b57c42e5 * optimize implementation Change-Id: I50a8dcd44dd9d36c9ecbfa78e103eb9bff32eab9	2018-08-17 12:57:09 -04:00
Jonathan Wei	0c3bb47558	Change hybrid cache default types in docs to caffeine (#6182 )	2018-08-17 12:17:43 -04:00
Caroline1000	f447b784de	update sigar link (#6175 )	2018-08-14 16:58:29 -07:00
QiuMM	69f555019b	convert all time-intervals in ISO 8601 format to uppercase in doc files (#6118 ) Change-Id: I904fed4cfb600a8a42664335557f611133a5078d	2018-08-13 12:58:47 -07:00
Jonathan Wei	94a937b5e8	New doc fixes (#6156 )	2018-08-13 11:11:32 -07:00
Atul Mohan	064c22c937	Fix redirects (#6151 )	2018-08-10 13:55:47 -07:00
Jonathan Wei	b0805540af	Fix kafka tutorial typo (#6141 )	2018-08-09 18:41:05 -07:00
Jonathan Wei	af0557c1f7	Unified configuration doc page (#6127 ) * Unified configuration doc page * Rename to index.md, update redirects * PR comments * PR comments * PR comment	2018-08-09 14:52:14 -07:00
Jonathan Wei	fea2ab7094	New docs intro (#6122 ) * New docs intro * PR comments * Fix arch diagram * PR comment * PR comment * PR comment	2018-08-09 14:19:11 -07:00
pdeva	c028d18d74	update redis-cache documentation (#6109 ) * update redis-cache documentation added clarifying info on setup and enablement * added link	2018-08-09 13:44:59 -07:00
Jonathan Wei	aa660b8751	Add docs for virtual columns and transform specs (#6119 ) * Add docs for virtual columns and transform specs * PR Comments * PR comment	2018-08-09 14:42:52 -06:00
Jonathan Wei	2b64025eaf	Separate hadoop and native batch docs more (#6120 ) * Separate hadoop and native batch docs more * Rebase with parallel batch * PR comments	2018-08-09 14:40:20 -06:00
Jonathan Wei	24f2e8ba26	New quickstart and tutorials (#6126 ) * New quickstart and tutorials * PR comments * Fix tranquility	2018-08-09 14:37:52 -06:00
Jonathan Wei	2b0f03acb9	Unified API doc page (#6128 ) * Unified API doc page * PR comments * Fix metadata endpoint	2018-08-09 14:27:42 -06:00
Gian Merlino	3525d4059e	Cache: Add maxEntrySize config, make groupBy cacheable by default. (#5108 ) * Cache: Add maxEntrySize config. The idea is this makes it more feasible to cache query types that can potentially generate large result sets, like groupBy and select, without fear of writing too much to the cache per query. Includes a refactor of cache population code in CachingQueryRunner and CachingClusteredClient, such that they now use the same CachePopulator interface with two implementations: one for foreground and one for background. The main reason for splitting the foreground / background impls is that the foreground impl can have a more effective implementation of maxEntrySize. It can stop retaining subvalues for the cache early. * Add CachePopulatorStats. * Fix whitespace. * Fix docs. * Fix various tests. * Add tests. * Fix tests. * Better tests * Remove conflict markers. * Fix licenses.	2018-08-07 10:23:15 -07:00
Jihoon Son	56ab4363ea	Native parallel batch indexing without shuffle (#5492 ) * Native parallel indexing without shuffle * fix build * fix ci * fix ingestion without intervals * fix retry * fix retry * add it test * use chat handler * fix build * add docs * fix ITUnionQueryTest * fix failures * disable metrics reporting * working * Fix split of static-s3 firehose * Add endpoints to supervisor task and a unit test for endpoints * increase timeout in test * Added doc * Address comments * Fix overlapping locks * address comments * Fix static s3 firehose * Fix test * fix build * fix test * fix typo in docs * add missing maxBytesInMemory to doc * address comments * fix race in test * fix test * Rename to ParallelIndexSupervisorTask * fix teamcity * address comments * Fix license * addressing comments * addressing comments * indexTaskClient-based segmentAllocator instead of CountingActionBasedSegmentAllocator * Fix race in TaskMonitor and move HTTP endpoints to supervisorTask from runner * Add more javadocs * use StringUtils.nonStrictFormat for logging * fix typo and remove unused class * fix tests * change package * fix strict build * tmp * Fix overlord api according to the recent change in master * Fix it test	2018-08-06 23:59:42 -07:00
Nishant Bangarwa	75c8a87ce1	Part 2 of changes for SQL Compatible Null Handling (#5958 ) * Part 2 of changes for SQL Compatible Null Handling * Review comments - break lines longer than 120 characters * review comments * review comments * fix license * fix test failure * fix CalciteQueryTest failure * Null Handling - Review comments * review comments * review comments * fix checkstyle * fix checkstyle * remove unrelated change * fix test failure * fix failing test * fix travis failures * Make StringLast and StringFirst aggregators nullable and fix travis failures	2018-08-02 08:20:25 -07:00
Andrés Gómez	e270362767	Add stringLast and stringFirst aggregators extension (#5789 ) * Add lastString and firstString aggregators extension * Remove duplicated class * Move first-last-string doc page to extensions-contrib * Fix ObjectStrategy compare method * Fix doc bad aggregatos type name * Create FoldingAggregatorFactory classes to fix SegmentMetadataQuery * Add getMaxStringBytes() method to support JSON serialization * Fix null pointer exception at segment creation phase when the string value is null * Control the valueSelector object class on BufferAggregators * Perform all improvements * Add java doc on SerializablePairLongStringSerde * Refactor ObjectStraty compare method * Remove unused ; * Add aggregateCombiner unit tests. Rename BufferAggregators unit tests * Remove unused imports * Add license header * Add class name to java doc class serde * Throw exception if value is unsupported class type * Move first-last-string extension into druid core * Update druid core docs * Fix null pointer exception when pair->string is null * Add null control unit tests * Remove unused imports * Add first/last string folding aggregator on AggregatorsModule to support segment metadata query * Change SerializablePairLongString to extend SerializablePair * Change vars from public to private * Convert vars to primitive type * Clarify compare comment * Change IllegalStateException to ISE * Remove TODO comments * Control possible null pointer exception * Add @Nullable annotation * Remove empty line * Remove unused parameter type * Improve AggregatorCombiner javadocs * Add filterNullValues option at StringLast and StringFirst aggregators * Add filterNullValues option at agg documentation * Fix checkstyle * Update header license * Fix StringFirstAggregatorFactory.VALUE_COMPARATOR * Fix StringFirstAggregatorCombiner * Fix if condition at StringFirstAggregateCombiner * Remove filterNullValues from string first/last aggregators * Add isReset flag in FirstAggregatorCombiner * Change Arrays.asList to Collections.singletonList	2018-08-01 10:52:54 -07:00
Caroline1000	7f89c72932	Add definition of 'NONE' to queryGranularity in ingestion.index doc (#6073 ) * Add meaning of granularity = None to queryGranularity * Fix format	2018-07-30 14:07:33 -07:00
Gian Merlino	63be028cee	CompactionTask: Reject empty intervals on construction. (#6059 ) * CompactionTask: Reject empty intervals on construction. They don't make sense anyway, and it's better to fail fast. * Switch API.	2018-07-30 08:52:50 -07:00
Eyal Yurman	94d6c9a0a5	Remove JDK 7 from build documentation. (#6031 ) See issue #6030	2018-07-26 17:05:07 -07:00
Jonathan Wei	efab3b0160	Add concat and textcat SQL functions (#6005 )	2018-07-20 11:21:04 -07:00
Gian Merlino	cd8ea3da8d	SQL: Add server-wide default time zone config. (#5993 ) * SQL: Add server-wide default time zone config. * Switch API.	2018-07-18 13:12:40 -07:00
Caroline1000	5f78a333ad	show that flatten will also work with avro extension (#5874 ) * show that flatten will also work with avro extension * fix url	2018-07-11 16:47:03 -07:00
Gian Merlino	04ea3c9f8c	Update license headers. (#5976 ) * Update license headers. For compliance with http://www.apache.org/legal/src-headers.html. * More license adjustments. * Fix mistakenly edited package line.	2018-07-11 09:55:18 -07:00
Caroline1000	b3976050ad	add definition of balancerComputeThreads (#5865 )	2018-07-05 09:54:36 -07:00
Caroline1000	ee4a5aafb0	add config values for GCS deep storage (#5875 ) * add config values for GCS deep storage * fix config values for GCS deep storage	2018-07-05 09:53:41 -07:00
Dylan Wylie	10642ef9ca	Fix filtered request logging docs (#5924 ) - Setting druid.request.logging.delegate has no effect. - The provider is injected based on a type parameter & this looks to be scoped to delegate for filtered loggers	2018-07-05 09:51:10 -07:00
scrawfor	bf2a31a5bc	Add new 'true' filter which always returns true. (#5711 ) * Add new 'true' filter which always returns true. * Add support for bitmap index. * Adds documentation. * Removes No-op Filter	2018-06-28 11:52:45 -07:00
Gian Merlino	a28314349c	Fix spelling of "propagate" in various places. (#5896 ) One of these is a configuration parameter (introduced in #5429), but it's never been in a release, so I think it's ok to rename it.	2018-06-25 09:18:08 -07:00
varaga	b4b1b2a020	Provisioning support for ZooKeeper Authorization (#5701 ) Review comments implemented	2018-06-15 14:02:01 -07:00
zhangxinyu	e43e5ebbcd	Materialized view implementation (#5556 ) * implement materialized view * modify code according to jihoonson's comments * modify code according to jihoonson's comments - 2 * add documentation about materialized view * use new HadoopTuningConfig in pr 5583 * add minDataLag and fix optimizer bug * correct value of DEFAULT_MIN_DATA_LAG_MS * modify code according to jihoonson's comments - 3 * use the boolean expression instead of if-else	2018-06-09 12:24:54 -07:00
Caroline1000	96feb479cd	add order change needed for KIS in 0.12.0 (#5760 )	2018-06-08 15:25:26 -07:00
Hongze Zhang	cfa94b747b	Update to jetty 9.4; Enable request decompression (#5624 ) * Update to jetty 9.4; Enable request decompression; Add http compression config options * Fix BadMessageException from jetty server at HttpGenerator.generateHeaders(...)	2018-06-08 14:53:08 -07:00
awelsh93	adbe22c05b	Security - add anonymous authenticator (#5842 ) * Anonymous authenticator that authenticates all requests and then directs them to an authorizer. * Adding documentation * Removed some fields from class AnonymousAuthenticator * Updating docs	2018-06-07 10:17:54 -07:00
Siddharth Subramanian	37409dc2f4	Fix minor documentation error (#5851 ) Adding a required `,` in the sample JSON	2018-06-06 12:51:56 -07:00
Ryan Plessner	ee45ee6915	Fix docs to reflect the correct default max total row count for the IndexTuningConfig (#5845 )	2018-06-05 13:15:12 -07:00
awelsh93	1a4707f09c	Remove extra slash in endpoint (#5822 )	2018-06-05 13:11:26 -07:00
Alexander Saydakov	d1cdcd4895	Datasketches doc correction (#5816 ) * func was renamed to operation during code review * added missing descriptions, some cleanup	2018-06-05 17:52:37 +05:30
Atul Mohan	50ad7a45ff	Fix authentication doc (#5813 )	2018-05-30 11:10:48 -07:00
Jihoon Son	67ff7dacbd	Support server-side encryption for s3 (#5740 ) * Support server-side encryption for s3 * fix teamcity * typo * address comments * Refactoring configuration injection * fix doc * fix doc	2018-05-28 20:22:08 -07:00
Joseph Glanville	5cbfb95e1f	docs: Document inputFormat on Hadoop InputSpecs (#5784 )	2018-05-24 21:44:37 -07:00
Gian Merlino	bc0ff251a3	Docs: Clarify the meaning of maxSplitSize. (#5803 )	2018-05-24 21:43:39 -07:00
Michael Schnupp	33b4eb624d	fix freeSpacePercent in segmentCache.locations (#5765 ) * fix freeSpacePercent in segmentCache.locations * the check should probably test the other way around * documentation should put the option in the right place * examples have a superfluous backslash * add test to verify correct behavior * switch to Path and test with jimfs Path allows to use different filesystems. Jimfs provides an actual (in memory) filesystem. This also allows more complex test scenarios. The behavior should be unchanged by this commit. * Revert "switch to Path and test with jimfs" This reverts commit `8b9a418d65`.	2018-05-24 11:15:30 +09:00
Atul Mohan	1b9611a60e	Local indexing from RDBMS (#5441 ) * Local indexing from RDBMS * Fix content * Remove pom changes * Remove extraneous space * Add tests and update documentation * Fix comments * Fix docs * Fix build related issue * Handle invalid strings * Make target database independent of metadata storage * Add firehose connector * Fix accessibility * Add docs * Remove unused def * Remove lazy instantiation of jsoniterator * Move unused changes * Move unused changes * Fix build * Make Sqlfirehose method private	2018-05-22 12:33:01 +09:00
Caroline1000	c73e3ea4f5	Provide examples to havingSpec filters (#5774 ) * expand examples * expand examples for filtered havingSpecs * expand other having examples * remove blank code block * add better AND/OR/NOT examples * fix indentation	2018-05-14 13:43:42 -07:00
Abhishek Kaushik	aa23fe6386	Typo fix in historical doc (#5753 )	2018-05-08 11:08:27 -07:00
Kirill Kozlov	67d0b0ee42	Add taskType dimension to task metrics (#5664 )	2018-05-07 09:42:26 -07:00
kaijianding	c12c16385e	support throw duplcate row during realtime ingestion in RealtimePlumber (#5693 )	2018-05-04 10:12:25 -07:00
Dylan Wylie	2c5f0038fd	Make lookup offheap buffer configurable (#5696 ) * Make lookup offheap buffer configurable Fixes #3663 * Address comments * Update docs * Update docs	2018-05-04 10:00:55 -07:00
Stuart McLean	c2b5e5ec95	Default caffeine cache size (#5738 ) * add default caffeine cache size based on runtime Xmx or max 1GB * update docs for caffeine cache * fix formatting * test caffeine size should never be less than 0 * set caffeine max default size to 1G not 1M * fix caffeine cache tests	2018-05-04 09:29:11 -07:00
Surekha	13c616ba24	'maxBytesInMemory' tuningConfig introduced for ingestion tasks (#5583 ) * This commit introduces a new tuning config called 'maxBytesInMemory' for ingestion tasks Currently a config called 'maxRowsInMemory' is present which affects how much memory gets used for indexing.If this value is not optimal for your JVM heap size, it could lead to OutOfMemoryError sometimes. A lower value will lead to frequent persists which might be bad for query performance and a higher value will limit number of persists but require more jvm heap space and could lead to OOM. 'maxBytesInMemory' is an attempt to solve this problem. It limits the total number of bytes kept in memory before persisting. * The default value is 1/3(Runtime.maxMemory()) * To maintain the current behaviour set 'maxBytesInMemory' to -1 * If both 'maxRowsInMemory' and 'maxBytesInMemory' are present, both of them will be respected i.e. the first one to go above threshold will trigger persist * Fix check style and remove a comment * Add overlord unsecured paths to coordinator when using combined service (#5579) * Add overlord unsecured paths to coordinator when using combined service * PR comment * More error reporting and stats for ingestion tasks (#5418) * Add more indexing task status and error reporting * PR comments, add support in AppenderatorDriverRealtimeIndexTask * Use TaskReport instead of metrics/context * Fix tests * Use TaskReport uploads * Refactor fire department metrics retrieval * Refactor input row serde in hadoop task * Refactor hadoop task loader names * Truncate error message in TaskStatus, add errorMsg to task report * PR comments * Allow getDomain to return disjointed intervals (#5570) * Allow getDomain to return disjointed intervals * Indentation issues * Adding feature thetaSketchConstant to do some set operation in PostAgg (#5551) * Adding feature thetaSketchConstant to do some set operation in PostAggregator * Updated review comments for PR #5551 - Adding thetaSketchConstant * Fixed CI build issue * Updated review comments 2 for PR #5551 - Adding thetaSketchConstant * Fix taskDuration docs for KafkaIndexingService (#5572) * With incremental handoff the changed line is no longer true. * Add doc for automatic pendingSegments (#5565) * Add missing doc for automatic pendingSegments * address comments * Fix indexTask to respect forceExtendableShardSpecs (#5509) * Fix indexTask to respect forceExtendableShardSpecs * add comments * Deprecate spark2 profile in pom.xml (#5581) Deprecated due to https://github.com/druid-io/druid/pull/5382 * CompressionUtils: Add support for decompressing xz, bz2, zip. (#5586) Also switch various firehoses to the new method. Fixes #5585. * This commit introduces a new tuning config called 'maxBytesInMemory' for ingestion tasks Currently a config called 'maxRowsInMemory' is present which affects how much memory gets used for indexing.If this value is not optimal for your JVM heap size, it could lead to OutOfMemoryError sometimes. A lower value will lead to frequent persists which might be bad for query performance and a higher value will limit number of persists but require more jvm heap space and could lead to OOM. 'maxBytesInMemory' is an attempt to solve this problem. It limits the total number of bytes kept in memory before persisting. * The default value is 1/3(Runtime.maxMemory()) * To maintain the current behaviour set 'maxBytesInMemory' to -1 * If both 'maxRowsInMemory' and 'maxBytesInMemory' are present, both of them will be respected i.e. the first one to go above threshold will trigger persist * Address code review comments * Fix the coding style according to druid conventions * Add more javadocs * Rename some variables/methods * Other minor issues * Address more code review comments * Some refactoring to put defaults in IndexTaskUtils * Added check for maxBytesInMemory in AppenderatorImpl * Decrement bytes in abandonSegment * Test unit test for multiple sinks in single appenderator * Fix some merge conflicts after rebase * Fix some style checks * Merge conflicts * Fix failing tests Add back check for 0 maxBytesInMemory in OnHeapIncrementalIndex * Address PR comments * Put defaults for maxRows and maxBytes in TuningConfig * Change/add javadocs * Refactoring and renaming some variables/methods * Fix TeamCity inspection warnings * Added maxBytesInMemory config to HadoopTuningConfig * Updated the docs and examples * Added maxBytesInMemory config in docs * Removed references to maxRowsInMemory under tuningConfig in examples * Set maxBytesInMemory to 0 until used Set the maxBytesInMemory to 0 if user does not set it as part of tuningConfing and set to part of max jvm memory when ingestion task starts * Update toString in KafkaSupervisorTuningConfig * Use correct maxBytesInMemory value in AppenderatorImpl * Update DEFAULT_MAX_BYTES_IN_MEMORY to 1/6 max jvm memory Experimenting with various defaults, 1/3 jvm memory causes OOM * Update docs to correct maxBytesInMemory default value * Minor to rename and add comment * Add more details in docs * Address new PR comments * Address PR comments * Fix spelling typo	2018-05-03 16:25:58 -07:00
Gian Merlino	739e347320	Allow Hadoop dataSource inputSpec to be specified multiple times. (#5717 ) * Allow Hadoop dataSource inputSpec to be specified multiple times. * Fix test	2018-05-03 13:51:57 -07:00
Stuart McLean	d2b8d880ea	include hybrid and caffeine in cache docs and show caffeine as default (#5737 )	2018-05-03 09:52:05 -07:00
Jihoon Son	d4311b4a5a	Support enablePathStyleAccess, disableChunkedEncoding, and forceGlobalBucketAccessEnabled for aws client (#5702 ) * Support enablePathStyleAccess and disableChunkedEncoding for aws client * add an option for forceGlobalBucketAccessEnabled * add missing doc	2018-05-02 10:45:38 -07:00
Jakub Kukul	e2431ae161	Update defaultHadoopCoordinates in documentation. (#5720 ) * Update defaultHadoopCoordinates in documentation. To match changes applied in #5382. * Remove a parameter with defaults from example configuration file. If it has reasonable defaults, then why would it be in an example config file? Also, it is yet another place that has been forgotten to be updated and will be forgotten in the future. Also, if someone is running different hadoop version, then there's much more work to be done than just changing this property, so why give users false hopes? * Fix typo in documentation.	2018-04-30 20:49:14 -07:00
Dylan Wylie	754c80e74a	Fix quickstart docs to specify that Java 8 is required. (#5722 ) See #4907 #5719	2018-04-30 13:25:59 -07:00
Gian Merlino	0f8493846e	Replace dev list references in docs. (#5723 )	2018-04-30 11:25:45 -07:00
David Lim	8ec2d2fe18	Use unique segment paths for Kafka indexing (#5692 ) * support unique segment file paths * forbiddenapis * code review changes * code review changes * code review changes * checkstyle fix	2018-04-29 21:59:48 -07:00
Gian Merlino	762f8829e4	Add task action metrics, add taskId metric dimension. (#5714 ) * Add task action metrics, add taskId metric dimension. Adds two new metrics: task/action/log/time and task/action/run/time. Also adds taskId as a dimension, to give us the ability to drill down into metrics for an individual task. Also standardizes metrics-attachment using two helper methods in IndexTaskUtils. * Fix typo	2018-04-29 21:24:06 -07:00
Joseph Glanville	90cd05696e	Document processing properties required for Middlemanager (#5660 )	2018-04-29 17:20:17 -07:00
Jihoon Son	86746f82d8	Use mergeBuffer instead of processingBuffer in parallelCombiner (#5634 ) * Use mergeBuffer instead of processingBuffer in parallelCombiner * Fix test * address comments * fix test * Fix test * Update comment * address comments * fix build * Fix test failure	2018-04-27 18:14:37 -07:00
Gian Merlino	f81855d607	Add unauthorized errorCode to query docs. (#5691 )	2018-04-26 13:06:25 -07:00
Caroline1000	fd76af9737	remove old prod cluster config link (#5676 )	2018-04-23 18:00:24 -07:00
scrawfor	15f4ab2b31	Expose noop filter to users (#5597 )	2018-04-18 07:57:07 -07:00
Gian Merlino	fbf3fc178e	Timeseries: Add "grandTotal" option. (#5640 ) * Timeseries: Add "grandTotal" option. * Modify whitespace. * Checkstyle workaround.	2018-04-16 18:22:19 -07:00
Jonathan Wei	d0b66a6af5	Fix HTTP OPTIONS request auth handling (#5638 ) * Fix HTTP OPTIONS request auth handling * PR comment * More PR comments * Fix * PR comment	2018-04-16 18:09:56 -07:00
Jihoon Son	6b3bde0143	Fix granularitySpec doc (#5647 )	2018-04-16 14:24:39 -04:00
Jonathan Wei	882b172318	Revert "Fix HTTP OPTIONS request auth handling (#5615 )" (#5637 ) This reverts commit `df51a7bcb7`.	2018-04-12 16:43:54 -07:00
Jonathan Wei	df51a7bcb7	Fix HTTP OPTIONS request auth handling (#5615 ) * Fix HTTP OPTIONS request auth handling * Flip configuration boolean	2018-04-12 14:02:20 -07:00
Caroline1000	48c1a1ef57	change header from Data Schema to Ingestion Spec (#5631 )	2018-04-11 21:42:54 -07:00
Nishant Bangarwa	e6efd75a3d	Add config to allow setting up custom unsecured paths for druid nodes. (#5614 ) * Add config to allow setting up custom unsecured paths for druid nodes. * return all resources for Unsecured paths * review comment - Add test * fix tests * fix test	2018-04-11 17:10:07 -07:00
Caroline1000	afa75e04b7	change header in overlord console; minor querydoc change (#5625 ) * change header in overlord console; minor querydoc change * remove change to overlord console * address Gian comments	2018-04-11 12:57:22 -07:00


1986 Commits