druid

Commit Graph

Author	SHA1	Message	Date
Atul Mohan	064c22c937	Fix redirects (#6151 )	2018-08-10 13:55:47 -07:00
Jonathan Wei	b0805540af	Fix kafka tutorial typo (#6141 )	2018-08-09 18:41:05 -07:00
Jonathan Wei	af0557c1f7	Unified configuration doc page (#6127 ) * Unified configuration doc page * Rename to index.md, update redirects * PR comments * PR comments * PR comment	2018-08-09 14:52:14 -07:00
Jonathan Wei	fea2ab7094	New docs intro (#6122 ) * New docs intro * PR comments * Fix arch diagram * PR comment * PR comment * PR comment	2018-08-09 14:19:11 -07:00
pdeva	c028d18d74	update redis-cache documentation (#6109 ) * update redis-cache documentation added clarifying info on setup and enablement * added link	2018-08-09 13:44:59 -07:00
Jonathan Wei	aa660b8751	Add docs for virtual columns and transform specs (#6119 ) * Add docs for virtual columns and transform specs * PR Comments * PR comment	2018-08-09 14:42:52 -06:00
Jonathan Wei	2b64025eaf	Separate hadoop and native batch docs more (#6120 ) * Separate hadoop and native batch docs more * Rebase with parallel batch * PR comments	2018-08-09 14:40:20 -06:00
Jonathan Wei	24f2e8ba26	New quickstart and tutorials (#6126 ) * New quickstart and tutorials * PR comments * Fix tranquility	2018-08-09 14:37:52 -06:00
Jonathan Wei	2b0f03acb9	Unified API doc page (#6128 ) * Unified API doc page * PR comments * Fix metadata endpoint	2018-08-09 14:27:42 -06:00
Gian Merlino	3525d4059e	Cache: Add maxEntrySize config, make groupBy cacheable by default. (#5108 ) * Cache: Add maxEntrySize config. The idea is this makes it more feasible to cache query types that can potentially generate large result sets, like groupBy and select, without fear of writing too much to the cache per query. Includes a refactor of cache population code in CachingQueryRunner and CachingClusteredClient, such that they now use the same CachePopulator interface with two implementations: one for foreground and one for background. The main reason for splitting the foreground / background impls is that the foreground impl can have a more effective implementation of maxEntrySize. It can stop retaining subvalues for the cache early. * Add CachePopulatorStats. * Fix whitespace. * Fix docs. * Fix various tests. * Add tests. * Fix tests. * Better tests * Remove conflict markers. * Fix licenses.	2018-08-07 10:23:15 -07:00
Jihoon Son	56ab4363ea	Native parallel batch indexing without shuffle (#5492 ) * Native parallel indexing without shuffle * fix build * fix ci * fix ingestion without intervals * fix retry * fix retry * add it test * use chat handler * fix build * add docs * fix ITUnionQueryTest * fix failures * disable metrics reporting * working * Fix split of static-s3 firehose * Add endpoints to supervisor task and a unit test for endpoints * increase timeout in test * Added doc * Address comments * Fix overlapping locks * address comments * Fix static s3 firehose * Fix test * fix build * fix test * fix typo in docs * add missing maxBytesInMemory to doc * address comments * fix race in test * fix test * Rename to ParallelIndexSupervisorTask * fix teamcity * address comments * Fix license * addressing comments * addressing comments * indexTaskClient-based segmentAllocator instead of CountingActionBasedSegmentAllocator * Fix race in TaskMonitor and move HTTP endpoints to supervisorTask from runner * Add more javadocs * use StringUtils.nonStrictFormat for logging * fix typo and remove unused class * fix tests * change package * fix strict build * tmp * Fix overlord api according to the recent change in master * Fix it test	2018-08-06 23:59:42 -07:00
Nishant Bangarwa	75c8a87ce1	Part 2 of changes for SQL Compatible Null Handling (#5958 ) * Part 2 of changes for SQL Compatible Null Handling * Review comments - break lines longer than 120 characters * review comments * review comments * fix license * fix test failure * fix CalciteQueryTest failure * Null Handling - Review comments * review comments * review comments * fix checkstyle * fix checkstyle * remove unrelated change * fix test failure * fix failing test * fix travis failures * Make StringLast and StringFirst aggregators nullable and fix travis failures	2018-08-02 08:20:25 -07:00
Andrés Gómez	e270362767	Add stringLast and stringFirst aggregators extension (#5789 ) * Add lastString and firstString aggregators extension * Remove duplicated class * Move first-last-string doc page to extensions-contrib * Fix ObjectStrategy compare method * Fix doc bad aggregatos type name * Create FoldingAggregatorFactory classes to fix SegmentMetadataQuery * Add getMaxStringBytes() method to support JSON serialization * Fix null pointer exception at segment creation phase when the string value is null * Control the valueSelector object class on BufferAggregators * Perform all improvements * Add java doc on SerializablePairLongStringSerde * Refactor ObjectStraty compare method * Remove unused ; * Add aggregateCombiner unit tests. Rename BufferAggregators unit tests * Remove unused imports * Add license header * Add class name to java doc class serde * Throw exception if value is unsupported class type * Move first-last-string extension into druid core * Update druid core docs * Fix null pointer exception when pair->string is null * Add null control unit tests * Remove unused imports * Add first/last string folding aggregator on AggregatorsModule to support segment metadata query * Change SerializablePairLongString to extend SerializablePair * Change vars from public to private * Convert vars to primitive type * Clarify compare comment * Change IllegalStateException to ISE * Remove TODO comments * Control possible null pointer exception * Add @Nullable annotation * Remove empty line * Remove unused parameter type * Improve AggregatorCombiner javadocs * Add filterNullValues option at StringLast and StringFirst aggregators * Add filterNullValues option at agg documentation * Fix checkstyle * Update header license * Fix StringFirstAggregatorFactory.VALUE_COMPARATOR * Fix StringFirstAggregatorCombiner * Fix if condition at StringFirstAggregateCombiner * Remove filterNullValues from string first/last aggregators * Add isReset flag in FirstAggregatorCombiner * Change Arrays.asList to Collections.singletonList	2018-08-01 10:52:54 -07:00
Caroline1000	7f89c72932	Add definition of 'NONE' to queryGranularity in ingestion.index doc (#6073 ) * Add meaning of granularity = None to queryGranularity * Fix format	2018-07-30 14:07:33 -07:00
Gian Merlino	63be028cee	CompactionTask: Reject empty intervals on construction. (#6059 ) * CompactionTask: Reject empty intervals on construction. They don't make sense anyway, and it's better to fail fast. * Switch API.	2018-07-30 08:52:50 -07:00
Eyal Yurman	94d6c9a0a5	Remove JDK 7 from build documentation. (#6031 ) See issue #6030	2018-07-26 17:05:07 -07:00
Jonathan Wei	efab3b0160	Add concat and textcat SQL functions (#6005 )	2018-07-20 11:21:04 -07:00
Gian Merlino	cd8ea3da8d	SQL: Add server-wide default time zone config. (#5993 ) * SQL: Add server-wide default time zone config. * Switch API.	2018-07-18 13:12:40 -07:00
Caroline1000	5f78a333ad	show that flatten will also work with avro extension (#5874 ) * show that flatten will also work with avro extension * fix url	2018-07-11 16:47:03 -07:00
Gian Merlino	04ea3c9f8c	Update license headers. (#5976 ) * Update license headers. For compliance with http://www.apache.org/legal/src-headers.html. * More license adjustments. * Fix mistakenly edited package line.	2018-07-11 09:55:18 -07:00
Caroline1000	b3976050ad	add definition of balancerComputeThreads (#5865 )	2018-07-05 09:54:36 -07:00
Caroline1000	ee4a5aafb0	add config values for GCS deep storage (#5875 ) * add config values for GCS deep storage * fix config values for GCS deep storage	2018-07-05 09:53:41 -07:00
Dylan Wylie	10642ef9ca	Fix filtered request logging docs (#5924 ) - Setting druid.request.logging.delegate has no effect. - The provider is injected based on a type parameter & this looks to be scoped to delegate for filtered loggers	2018-07-05 09:51:10 -07:00
scrawfor	bf2a31a5bc	Add new 'true' filter which always returns true. (#5711 ) * Add new 'true' filter which always returns true. * Add support for bitmap index. * Adds documentation. * Removes No-op Filter	2018-06-28 11:52:45 -07:00
Gian Merlino	a28314349c	Fix spelling of "propagate" in various places. (#5896 ) One of these is a configuration parameter (introduced in #5429), but it's never been in a release, so I think it's ok to rename it.	2018-06-25 09:18:08 -07:00
varaga	b4b1b2a020	Provisioning support for ZooKeeper Authorization (#5701 ) Review comments implemented	2018-06-15 14:02:01 -07:00
zhangxinyu	e43e5ebbcd	Materialized view implementation (#5556 ) * implement materialized view * modify code according to jihoonson's comments * modify code according to jihoonson's comments - 2 * add documentation about materialized view * use new HadoopTuningConfig in pr 5583 * add minDataLag and fix optimizer bug * correct value of DEFAULT_MIN_DATA_LAG_MS * modify code according to jihoonson's comments - 3 * use the boolean expression instead of if-else	2018-06-09 12:24:54 -07:00
Caroline1000	96feb479cd	add order change needed for KIS in 0.12.0 (#5760 )	2018-06-08 15:25:26 -07:00
Hongze Zhang	cfa94b747b	Update to jetty 9.4; Enable request decompression (#5624 ) * Update to jetty 9.4; Enable request decompression; Add http compression config options * Fix BadMessageException from jetty server at HttpGenerator.generateHeaders(...)	2018-06-08 14:53:08 -07:00
awelsh93	adbe22c05b	Security - add anonymous authenticator (#5842 ) * Anonymous authenticator that authenticates all requests and then directs them to an authorizer. * Adding documentation * Removed some fields from class AnonymousAuthenticator * Updating docs	2018-06-07 10:17:54 -07:00
Siddharth Subramanian	37409dc2f4	Fix minor documentation error (#5851 ) Adding a required `,` in the sample JSON	2018-06-06 12:51:56 -07:00
Ryan Plessner	ee45ee6915	Fix docs to reflect the correct default max total row count for the IndexTuningConfig (#5845 )	2018-06-05 13:15:12 -07:00
awelsh93	1a4707f09c	Remove extra slash in endpoint (#5822 )	2018-06-05 13:11:26 -07:00
Alexander Saydakov	d1cdcd4895	Datasketches doc correction (#5816 ) * func was renamed to operation during code review * added missing descriptions, some cleanup	2018-06-05 17:52:37 +05:30
Atul Mohan	50ad7a45ff	Fix authentication doc (#5813 )	2018-05-30 11:10:48 -07:00
Jihoon Son	67ff7dacbd	Support server-side encryption for s3 (#5740 ) * Support server-side encryption for s3 * fix teamcity * typo * address comments * Refactoring configuration injection * fix doc * fix doc	2018-05-28 20:22:08 -07:00
Joseph Glanville	5cbfb95e1f	docs: Document inputFormat on Hadoop InputSpecs (#5784 )	2018-05-24 21:44:37 -07:00
Gian Merlino	bc0ff251a3	Docs: Clarify the meaning of maxSplitSize. (#5803 )	2018-05-24 21:43:39 -07:00
Michael Schnupp	33b4eb624d	fix freeSpacePercent in segmentCache.locations (#5765 ) * fix freeSpacePercent in segmentCache.locations * the check should probably test the other way around * documentation should put the option in the right place * examples have a superfluous backslash * add test to verify correct behavior * switch to Path and test with jimfs Path allows to use different filesystems. Jimfs provides an actual (in memory) filesystem. This also allows more complex test scenarios. The behavior should be unchanged by this commit. * Revert "switch to Path and test with jimfs" This reverts commit `8b9a418d65`.	2018-05-24 11:15:30 +09:00
Atul Mohan	1b9611a60e	Local indexing from RDBMS (#5441 ) * Local indexing from RDBMS * Fix content * Remove pom changes * Remove extraneous space * Add tests and update documentation * Fix comments * Fix docs * Fix build related issue * Handle invalid strings * Make target database independent of metadata storage * Add firehose connector * Fix accessibility * Add docs * Remove unused def * Remove lazy instantiation of jsoniterator * Move unused changes * Move unused changes * Fix build * Make Sqlfirehose method private	2018-05-22 12:33:01 +09:00
Caroline1000	c73e3ea4f5	Provide examples to havingSpec filters (#5774 ) * expand examples * expand examples for filtered havingSpecs * expand other having examples * remove blank code block * add better AND/OR/NOT examples * fix indentation	2018-05-14 13:43:42 -07:00
Abhishek Kaushik	aa23fe6386	Typo fix in historical doc (#5753 )	2018-05-08 11:08:27 -07:00
Kirill Kozlov	67d0b0ee42	Add taskType dimension to task metrics (#5664 )	2018-05-07 09:42:26 -07:00
kaijianding	c12c16385e	support throw duplcate row during realtime ingestion in RealtimePlumber (#5693 )	2018-05-04 10:12:25 -07:00
Dylan Wylie	2c5f0038fd	Make lookup offheap buffer configurable (#5696 ) * Make lookup offheap buffer configurable Fixes #3663 * Address comments * Update docs * Update docs	2018-05-04 10:00:55 -07:00
Stuart McLean	c2b5e5ec95	Default caffeine cache size (#5738 ) * add default caffeine cache size based on runtime Xmx or max 1GB * update docs for caffeine cache * fix formatting * test caffeine size should never be less than 0 * set caffeine max default size to 1G not 1M * fix caffeine cache tests	2018-05-04 09:29:11 -07:00
Surekha	13c616ba24	'maxBytesInMemory' tuningConfig introduced for ingestion tasks (#5583 ) * This commit introduces a new tuning config called 'maxBytesInMemory' for ingestion tasks Currently a config called 'maxRowsInMemory' is present which affects how much memory gets used for indexing.If this value is not optimal for your JVM heap size, it could lead to OutOfMemoryError sometimes. A lower value will lead to frequent persists which might be bad for query performance and a higher value will limit number of persists but require more jvm heap space and could lead to OOM. 'maxBytesInMemory' is an attempt to solve this problem. It limits the total number of bytes kept in memory before persisting. * The default value is 1/3(Runtime.maxMemory()) * To maintain the current behaviour set 'maxBytesInMemory' to -1 * If both 'maxRowsInMemory' and 'maxBytesInMemory' are present, both of them will be respected i.e. the first one to go above threshold will trigger persist * Fix check style and remove a comment * Add overlord unsecured paths to coordinator when using combined service (#5579) * Add overlord unsecured paths to coordinator when using combined service * PR comment * More error reporting and stats for ingestion tasks (#5418) * Add more indexing task status and error reporting * PR comments, add support in AppenderatorDriverRealtimeIndexTask * Use TaskReport instead of metrics/context * Fix tests * Use TaskReport uploads * Refactor fire department metrics retrieval * Refactor input row serde in hadoop task * Refactor hadoop task loader names * Truncate error message in TaskStatus, add errorMsg to task report * PR comments * Allow getDomain to return disjointed intervals (#5570) * Allow getDomain to return disjointed intervals * Indentation issues * Adding feature thetaSketchConstant to do some set operation in PostAgg (#5551) * Adding feature thetaSketchConstant to do some set operation in PostAggregator * Updated review comments for PR #5551 - Adding thetaSketchConstant * Fixed CI build issue * Updated review comments 2 for PR #5551 - Adding thetaSketchConstant * Fix taskDuration docs for KafkaIndexingService (#5572) * With incremental handoff the changed line is no longer true. * Add doc for automatic pendingSegments (#5565) * Add missing doc for automatic pendingSegments * address comments * Fix indexTask to respect forceExtendableShardSpecs (#5509) * Fix indexTask to respect forceExtendableShardSpecs * add comments * Deprecate spark2 profile in pom.xml (#5581) Deprecated due to https://github.com/druid-io/druid/pull/5382 * CompressionUtils: Add support for decompressing xz, bz2, zip. (#5586) Also switch various firehoses to the new method. Fixes #5585. * This commit introduces a new tuning config called 'maxBytesInMemory' for ingestion tasks Currently a config called 'maxRowsInMemory' is present which affects how much memory gets used for indexing.If this value is not optimal for your JVM heap size, it could lead to OutOfMemoryError sometimes. A lower value will lead to frequent persists which might be bad for query performance and a higher value will limit number of persists but require more jvm heap space and could lead to OOM. 'maxBytesInMemory' is an attempt to solve this problem. It limits the total number of bytes kept in memory before persisting. * The default value is 1/3(Runtime.maxMemory()) * To maintain the current behaviour set 'maxBytesInMemory' to -1 * If both 'maxRowsInMemory' and 'maxBytesInMemory' are present, both of them will be respected i.e. the first one to go above threshold will trigger persist * Address code review comments * Fix the coding style according to druid conventions * Add more javadocs * Rename some variables/methods * Other minor issues * Address more code review comments * Some refactoring to put defaults in IndexTaskUtils * Added check for maxBytesInMemory in AppenderatorImpl * Decrement bytes in abandonSegment * Test unit test for multiple sinks in single appenderator * Fix some merge conflicts after rebase * Fix some style checks * Merge conflicts * Fix failing tests Add back check for 0 maxBytesInMemory in OnHeapIncrementalIndex * Address PR comments * Put defaults for maxRows and maxBytes in TuningConfig * Change/add javadocs * Refactoring and renaming some variables/methods * Fix TeamCity inspection warnings * Added maxBytesInMemory config to HadoopTuningConfig * Updated the docs and examples * Added maxBytesInMemory config in docs * Removed references to maxRowsInMemory under tuningConfig in examples * Set maxBytesInMemory to 0 until used Set the maxBytesInMemory to 0 if user does not set it as part of tuningConfing and set to part of max jvm memory when ingestion task starts * Update toString in KafkaSupervisorTuningConfig * Use correct maxBytesInMemory value in AppenderatorImpl * Update DEFAULT_MAX_BYTES_IN_MEMORY to 1/6 max jvm memory Experimenting with various defaults, 1/3 jvm memory causes OOM * Update docs to correct maxBytesInMemory default value * Minor to rename and add comment * Add more details in docs * Address new PR comments * Address PR comments * Fix spelling typo	2018-05-03 16:25:58 -07:00
Gian Merlino	739e347320	Allow Hadoop dataSource inputSpec to be specified multiple times. (#5717 ) * Allow Hadoop dataSource inputSpec to be specified multiple times. * Fix test	2018-05-03 13:51:57 -07:00
Stuart McLean	d2b8d880ea	include hybrid and caffeine in cache docs and show caffeine as default (#5737 )	2018-05-03 09:52:05 -07:00
Jihoon Son	d4311b4a5a	Support enablePathStyleAccess, disableChunkedEncoding, and forceGlobalBucketAccessEnabled for aws client (#5702 ) * Support enablePathStyleAccess and disableChunkedEncoding for aws client * add an option for forceGlobalBucketAccessEnabled * add missing doc	2018-05-02 10:45:38 -07:00

1604 Commits