David Lim
ff52581bd3
IndexTask improvements ( #3611 )
...
* index task improvements
* code review changes
* add null check
2017-01-18 14:24:37 -08:00
Gian Merlino
bcd20441be
Make buildV9Directly the default. ( #3688 )
2016-11-14 09:29:32 -08:00
praveev
52a74cf84f
Use timestamp in millis as Map key instead of DateTime object ( #3674 )
...
* Use Long timestamp as key instead of DateTime.
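A DateTime key cannot be found again when you store with one DateTime object
and read with a different DateTime object for the same instant, because DateTime equality also compares the chronology (including the time zone).
For example, the lookup below fails when you use DateTime as the key: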
```
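// Key created in an explicit time zone.
DateTime odt = DateTime.now(DateTimeUtils.getZone(DateTimeZone.forID("America/Los_Angeles")));
HashMap<DateTime, String> map = new HashMap<>();
map.put(odt, "abc");
// Same instant, but built in the default time zone.
DateTime dt = new DateTime(odt.getMillis());
// The lookup misses unless the default zone is also America/Los_Angeles.
System.out.println(map.get(dt));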
```
* Respect timezone when creating the file.
* Update docs with timezone caveat in granularity spec
* Remove unused imports
2016-11-11 10:20:20 -08:00
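The map-key problem described in the commit above (#3674) can be reproduced in a minimal, self-contained sketch. This is not the Druid code: it swaps in `java.time.ZonedDateTime` for Joda-Time's `DateTime` so that it runs on a plain JDK, and the class and method names are made up for illustration. The point is the same: a zone-aware timestamp object makes a fragile map key, while epoch milliseconds do not depend on the zone.

```java
import java.time.Instant;
import java.time.ZoneId;
import java.time.ZonedDateTime;
import java.util.HashMap;
import java.util.Map;

// Illustrative only: uses java.time instead of Joda-Time; all names are hypothetical.
class TimestampKeyDemo {
    // Stores under a zoned timestamp, then looks up the same instant expressed in another zone.
    static String lookupByZonedKey(ZonedDateTime stored, ZoneId lookupZone) {
        Map<ZonedDateTime, String> map = new HashMap<>();
        map.put(stored, "abc");
        ZonedDateTime sameInstant = stored.withZoneSameInstant(lookupZone);
        // equals() includes the zone, so this misses when the zones differ.
        return map.get(sameInstant);
    }

    // Stores under epoch milliseconds, which are identical regardless of zone.
    static String lookupByMillisKey(ZonedDateTime stored, ZoneId lookupZone) {
        Map<Long, String> map = new HashMap<>();
        map.put(stored.toInstant().toEpochMilli(), "abc");
        ZonedDateTime sameInstant = stored.withZoneSameInstant(lookupZone);
        return map.get(sameInstant.toInstant().toEpochMilli());
    }

    public static void main(String[] args) {
        ZonedDateTime stored =
            Instant.ofEpochMilli(1478888420000L).atZone(ZoneId.of("America/Los_Angeles"));
        ZoneId utc = ZoneId.of("UTC");
        System.out.println(lookupByZonedKey(stored, utc));  // null: the key is not found
        System.out.println(lookupByMillisKey(stored, utc)); // abc
    }
}
```

Keying by `Long` trades readability for a key whose equality depends only on the instant, which is the direction the commit takes.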
Akash Dwivedi
3a83e0513e
Doc update(batch-ingestion) to include useExplicitVersion. ( #3557 )
2016-10-07 14:48:00 -07:00
praveev
43cdc675c7
Add support for timezone in segment granularity ( #3528 )
...
* Add support for timezone in segment granularity
* CR feedback. Handle null timezone during equals check.
* Include timezone in docs.
Add timezone for ArbitraryGranularitySpec.
2016-10-03 08:15:42 -07:00
Gian Merlino
27bd5cb13a
Add forceExtendableShardSpecs option to Hadoop indexing, IndexTask. ( #3473 )
...
Fixes #3241.
2016-09-21 13:40:04 -06:00
Gian Merlino
e0e28866ee
JavaScript docs: Fix links and typos, add to TOC. ( #3457 )
2016-09-13 15:26:44 -07:00
Gian Merlino
76a24054e3
JavaScript docs, including docs for globals. ( #3454 )
2016-09-13 13:46:55 -07:00
Slim
ba6ddf307e
Adding hadoop kerberos authentication. ( #3419 )
...
* adding kerberos authentication
* make the 2 functions identical
2016-09-13 10:42:50 -07:00
Dave Li
c4e8440c22
Adds long compression methods ( #3148 )
...
* add read
* update deprecated guava calls
* add write and vsizeserde
* add benchmark
* separate encoding and compression
* add header and reformat
* update doc
* address PR comment
* fix buffer order
* generate benchmark files
* separate encoding strategy and format
* fix benchmark
* modify supplier write to channel
* add float NONE handling
* address PR comment
* address PR comment 2
2016-08-30 16:17:46 -07:00
kaijianding
50d52a24fc
ability to not rollup at index time, make pre aggregation an option ( #3020 )
...
* ability to not rollup at index time, make pre aggregation an option
* rename getRowIndexForRollup to getPriorIndex
* fix doc misspelling
* test query using no-rollup indexes
* fix benchmark failure due to JMH bug
2016-08-02 11:13:05 -07:00
Gian Merlino
e5397ed316
Link up Hadoop class loading docs better. ( #3302 )
2016-07-29 10:19:54 -07:00
Navis Ryu
cd7337fc8a
Calculate max split size based on numMapTask in DatasourceInputFormat ( #2882 )
...
* Calculate max split size based on numMapTask
* updated docs & fixed possible ArithmeticException
2016-07-20 16:53:51 -07:00
Gian Merlino
ea03906fcf
Configurable compressRunOnSerialization for Roaring bitmaps. ( #3228 )
...
Defaults to true, which is a change in behavior (this used to be false and unconfigurable).
2016-07-08 10:24:19 +05:30
Jonathan Wei
c5dbf364e3
Fix JSON flatten docs, add link to path expression tester ( #3105 )
2016-06-07 14:39:57 -07:00
Nishant
0ac1b27d53
Allow manual setting of shutoffTime for EventReceiverFirehose ( #2803 )
...
* Allow dynamically setting shutoffTime for EventReceiverFirehose
review comments and tests
* shut down exec on close
2016-05-24 07:24:00 -07:00
Gian Merlino
fffa9c8265
Fix flattenSpec docs, "nested" should be "path". ( #2924 )
2016-05-05 08:59:41 -07:00
David Lim
890bdb543d
doc fixes ( #2897 )
2016-04-28 15:34:58 -07:00
Fangjin Yang
abd951df1a
Document how to use roaring bitmaps ( #2824 )
...
* Document how to use roaring bitmaps
This fixes #2408.
While not all indexSpec properties are explained, it does explain how roaring bitmaps can be turned on.
* fix
* fix
* fix
* fix
2016-04-12 19:28:02 -07:00
Sébastien Launay
37d2ab623e
Merge pull request #2815 from slaunay/documentation/hadoop-classpath-issue-fix-with-configuration
...
Doc for mapreduce.job.user.classpath.first=true
2016-04-12 10:51:51 -07:00
Himanshu Gupta
004b00bb96
config to explicitly specify classpath for hadoop container during hadoop ingestion
2016-03-25 10:51:28 -05:00
Gian Merlino
2dfd3877c0
Fix a bunch of broken links in the docs.
2016-03-23 10:21:28 -07:00
fjy
943cbe6e76
refactor extensions into their own docs
2016-03-22 18:54:10 -07:00
binlijin
bce600f5d5
Single dimension hash-based partitioning
2016-03-22 13:15:33 +08:00
Gian Merlino
a2b1652787
Clarify parser docs.
...
- Clarify what parseSpecs are used for.
- Avro, Protobuf should use timeAndDims parseSpecs.
- Hadoop jobs should use hadoopyString string parsers.
2016-03-10 08:45:04 -08:00
fjy
e3e932a4d4
refactor extensions into core and contrib
2016-03-08 17:12:09 -08:00
Fangjin Yang
8e36e6fa43
Merge pull request #2610 from dclim/add-combineText-doc
...
add combineText property and cleanup batch ingestion doc
2016-03-08 12:54:16 -08:00
dclim
df29667a89
add combineText property and cleanup batch ingestion doc
2016-03-08 13:10:34 -07:00
Himanshu Gupta
0402636598
configurable handoffConditionTimeout in realtime tasks for segment handoff wait
2016-03-05 10:14:54 -06:00
Slim Bouguerra
623e89aa54
skip corrupt message
2016-03-04 08:30:40 -06:00
Björn Zettergren
2462c82c0e
New defaults for maxRowsInMemory and rowFlushBoundary
...
To bring consistency to docs and source this commit changes the default
values for maxRowsInMemory and rowFlushBoundary to 75000 after
discussion in PR https://github.com/druid-io/druid/pull/2457 .
The previous default was 500000 and it's lower now on the grounds that
it's better for a default to be somewhat less efficient, and work,
than to reach for the stars and possibly result in
"OutOfMemoryError: java heap space" errors.
2016-03-01 13:50:28 +01:00
Charles Allen
1fe277ee29
Merge pull request #2367 from se7entyse7en/feature-rackspace-cloud-files-static-firehose
...
Adds support to use Rackspace's cloudfiles as static firehose
2016-02-25 17:31:06 -08:00
Gian Merlino
3534483433
Better handling of ParseExceptions.
...
Two changes:
- Allow IncrementalIndex to suppress ParseExceptions on "aggregate".
- Add "reportParseExceptions" option to realtime tuning configs. By default this is "false".
Behavior of the counters should now be:
- processed: Number of rows indexed, including rows where some fields could be parsed and some could not.
- thrownAway: Number of rows thrown away due to rejection policy.
- unparseable: Number of rows thrown away due to being completely unparseable (no fields salvageable at all).
If "reportParseExceptions" is true then "unparseable" will always be zero (because a parse error would
cause an exception to be thrown). In addition, "processed" will only include fully parseable rows
(because even partial parse failures will cause exceptions to be thrown).
Fixes #2510.
2016-02-23 10:11:43 -08:00
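The counter behavior described in the commit above can be sketched as a small model. This is not Druid's implementation: the `ParseCounters` class, the `RowState` enum, and the choice of `IllegalStateException` are assumptions made for illustration, and rows arrive pre-classified so that no ordering between rejection and parsing is implied.

```java
// Hypothetical model of the three counters; none of these names come from Druid.
class ParseCounters {
    enum RowState { FULLY_PARSED, PARTIALLY_PARSED, UNPARSEABLE, REJECTED }

    long processed;
    long thrownAway;
    long unparseable;
    final boolean reportParseExceptions;

    ParseCounters(boolean reportParseExceptions) {
        this.reportParseExceptions = reportParseExceptions;
    }

    void accept(RowState row) {
        switch (row) {
            case REJECTED:
                thrownAway++; // dropped by the rejection policy
                break;
            case UNPARSEABLE:
                if (reportParseExceptions) {
                    throw new IllegalStateException("unparseable row");
                }
                unparseable++; // nothing salvageable in the row
                break;
            case PARTIALLY_PARSED:
                if (reportParseExceptions) {
                    throw new IllegalStateException("partially parseable row");
                }
                processed++; // indexed with the fields that did parse
                break;
            case FULLY_PARSED:
                processed++;
                break;
        }
    }

    public static void main(String[] args) {
        ParseCounters lenient = new ParseCounters(false);
        for (RowState row : RowState.values()) {
            lenient.accept(row);
        }
        System.out.println(lenient.processed + " " + lenient.thrownAway + " " + lenient.unparseable);
    }
}
```

With `reportParseExceptions` set to true, the model throws on the first partially or fully unparseable row, so `unparseable` stays at zero and `processed` counts only fully parseable rows, matching the description.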
Himanshu Gupta
21b0b8a07d
new coordinator endpoint to get the list of used segments given a dataSource and a list of intervals
2016-02-21 23:17:58 -06:00
Himanshu Gupta
09ffcae4ae
give user the option to specify the segments for dataSource inputSpec
2016-02-21 23:15:31 -06:00
Fangjin Yang
083f019a48
Merge pull request #2465 from druid-io/more-doc-fix
...
more doc fixes
2016-02-17 11:00:38 -08:00
fjy
7da6594bfe
more doc fixes
2016-02-17 09:43:47 -08:00
Gian Merlino
3a996216bd
Multivalued dimensions can be compressed since 0.8.0.
2016-02-17 08:33:21 -08:00
Himanshu
f6eebf5884
Merge pull request #2422 from rasahner/docMinorFixes
...
some minor doc changes
2016-02-09 10:03:22 -06:00
Robin
1d57e3267d
some minor doc changes
2016-02-09 08:20:53 -06:00
fjy
6fc5bcb1ef
fix docs
2016-02-08 13:40:53 -08:00
fjy
003f54e268
add doc rendering
2016-02-04 14:21:59 -08:00
fjy
1aa363cea7
new quickstart
2016-02-04 09:37:38 -08:00
Lou Marvin Caraig
9de57eb1c8
Added documentation
2016-02-02 14:32:12 +01:00
Björn Zettergren
d373573c25
DOCs: Missing 'type' for leaveIntermediate
...
Added missing 'Boolean' as type for leaveIntermediate row in table TuningConfig
2016-01-29 14:42:19 +01:00
Himanshu Gupta
b3437825f0
add ignoreWhenNoSegments flag to optionally ignore the dataSource inputSpec when no segments were found
2016-01-26 17:23:55 -06:00
binlijin
cd1c71ceb4
rename persistBackgroundCount to numBackgroundPersistThreads
2016-01-22 14:29:41 +08:00
Nishant
dcb7830330
Merge pull request #984 from drcrallen/thread-priority-rebase
...
Use thread priorities. (aka set `nice` values for background-like tasks)
2016-01-21 15:02:34 +05:30
Charles Allen
2a69a58570
Merge pull request #2149 from binlijin/master
...
Persist IncrementalIndex in another thread in IndexGeneratorReducer
2016-01-20 17:06:42 -08:00
Charles Allen
2e1d6aaf3d
Use thread priorities. (aka set `nice` values for background-like tasks)
...
* Defaults the thread priority to java.lang.Thread.NORM_PRIORITY in io.druid.indexing.common.task.AbstractTask
* Each exec service has its own Task Factory, which is assigned a priority for the tasks it spawns. Therefore each priority class has a unique exec service
* Added priority to tasks as taskPriority in the task context. <0 means low, 0 means take default, >0 means high. It is up to any particular implementation to determine how to handle these numbers
* Add options to ForkingTaskRunner
* Add "-XX:+UseThreadPriorities" default option
* Add "-XX:ThreadPriorityPolicy=42" default option
* AbstractTask - Removed unneeded @JsonIgnore on priority
* Added priority to RealtimePlumber executors. All sub-executors (non query runners) get Thread.MIN_PRIORITY
* Add persistThreadPriority and mergeThreadPriority to realtime tuning config
2016-01-20 14:00:31 -08:00
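The commit above leaves the interpretation of `taskPriority` to each implementation. One possible reading, sketched below, maps the sign of the value onto Java's thread priority constants and bakes the result into a `ThreadFactory`, so that each executor gets its own priority. The `TaskPriorities` class and both method names are hypothetical and are not taken from Druid.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.ThreadFactory;

// Hypothetical helper; one possible way to interpret taskPriority, not Druid's code.
class TaskPriorities {
    // <0 means low, 0 means take the default, >0 means high.
    static int toThreadPriority(int taskPriority) {
        if (taskPriority < 0) {
            return Thread.MIN_PRIORITY;
        }
        if (taskPriority > 0) {
            return Thread.MAX_PRIORITY;
        }
        return Thread.NORM_PRIORITY;
    }

    // Builds a factory whose threads all carry the priority derived from taskPriority.
    static ThreadFactory factoryFor(int taskPriority) {
        final int threadPriority = toThreadPriority(taskPriority);
        return runnable -> {
            Thread thread = new Thread(runnable);
            thread.setDaemon(true);
            thread.setPriority(threadPriority);
            return thread;
        };
    }

    public static void main(String[] args) throws Exception {
        // One executor per priority class, as the commit describes.
        ExecutorService background = Executors.newSingleThreadExecutor(factoryFor(-1));
        int seen = background.submit(() -> Thread.currentThread().getPriority()).get();
        System.out.println("background thread priority: " + seen);
        background.shutdown();
    }
}
```

Whether the operating system actually honors these priorities depends on the platform and on JVM options such as the ones this commit adds as defaults.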
Logan Linn
c3bdaefe1f
Update batch-ingestion.md
...
Fix documented type of the `dataGranularity` config
2016-01-19 17:20:47 -08:00
binlijin
8e43e2c446
Persist IncrementalIndex in another thread in IndexGeneratorReducer
2016-01-20 09:20:09 +08:00
Kurt Young
82ff98c2bf
add config for build v9 directly and update docs
2016-01-16 11:26:34 +08:00
Zhao Weinan
5e57ddb8cc
Adding avro support to realtime & hadoop batch indexing.
2016-01-05 10:21:27 +08:00
Robin
0961c0b703
trivial documentation fix
2016-01-04 12:39:10 -06:00
fjy
88f6b9b5ad
Multiple improvements for docs
2016-01-02 21:54:54 -08:00
Himanshu Gupta
48de9dfafa
doc update to make it easy to find how to do re-indexing or delta ingestion
2015-12-30 23:58:09 -06:00
fjy
398a3ec620
add docs for more specs
2015-12-17 18:06:30 -08:00
jon-wei
c53bf85d83
Add docs and benchmark for JSON flattening parser
2015-12-09 16:13:30 -08:00
Himanshu Gupta
efe3c9f4a5
update the examples for batch reindexing/delta ingestion to use "intervals" instead of deprecated "interval"
2015-12-06 00:22:20 -06:00
Himanshu Gupta
61aaa09012
support multiple intervals in dataSource input spec
2015-12-03 21:28:04 -06:00
jon-wei
95dca4440f
Update data formats doc with info about JSON multi-value dimensions
2015-11-24 14:38:06 -08:00
sahner
a4ed2ce2d1
fix formatting in schema-design
2015-11-17 16:50:53 -06:00
fjy
8f231fd3e3
cleanup druid codebase
2015-11-04 13:59:53 -08:00
Nishant
efc49da073
fix doc - correct default value for maxRowsInMemory
2015-11-01 22:09:24 -08:00
Bingkun Guo
4914925d65
New extension loading mechanism
...
1) Stop using the Maven client to download extensions at runtime.
2) Provide a way to load Druid extensions and hadoop dependencies through the file system.
3) Refactor pull-deps so that it can download extensions into extension directories.
4) Add documents on how to use this new extension loading mechanism.
5) Change how the Druid tarball is generated. Now all the extensions + hadoop-client 2.3.0
are packaged within the Druid tarball.
2015-10-21 14:22:36 -05:00
Gian Merlino
933cbdf780
Adjust realtime constraints in the docs.
2015-10-09 10:52:52 -07:00
Gian Merlino
b29cbf97a6
Docs: Suggest hadoopyString parser for Hadoop.
2015-09-16 10:19:42 -07:00
Himanshu Gupta
075b6d4385
update ingestion faq to mention dataSource inputSpec as an option of reindexing via hadoop
2015-09-10 14:41:13 -05:00
Xavier Léauté
d89b0fa76a
Merge pull request #1662 from qix/pathFormat-doc
...
Add documentation for pathFormat in batch ingestion
2015-08-31 11:14:54 -07:00
Josh Yudaken
29c29b42d3
Add default value and link to joda docs
2015-08-31 11:09:54 -07:00
lvjq
2237a8cf0f
kafka 8 simple consumer firehose
2015-08-27 20:50:46 -05:00
Bingkun
ae1f104c10
Fix batch ingestion doc
2015-08-26 15:16:21 -05:00
Gian Merlino
10946610f4
Merge pull request #1656 from druid-io/all-the-docs
...
more docs for common questions
2015-08-25 17:49:47 -07:00
fjy
4055f9ca48
more docs for common questions
2015-08-25 17:49:04 -07:00
sahner
3def847e28
add documentation about TimedShutoff firehose
2015-08-24 20:41:42 -05:00
Josh Yudaken
5e42aee49e
Add documentation for pathFormat in batch ingestion
2015-08-24 14:39:57 -07:00
Himanshu Gupta
cfd81bfac7
updating the docs on how to do hadoop batch re-ingestion and delta ingestion
2015-08-16 14:07:35 -05:00
fjy
012fff6616
fix firehose docs
2015-08-04 09:52:23 -07:00
Himanshu Gupta
7ee509bcd0
fix mysql references in tutorial docs
2015-07-30 22:05:05 -05:00
pdeva
ef0439229d
Specify dynamic dimension schema
...
Document how druid can dynamically infer dimension columns
2015-07-27 20:20:53 -07:00
sahner
4801de62a2
make "announce" the chathandler default in realtime node
...
remove doc references to chathandler type "announce" since it is the default now
2015-07-27 12:14:28 -05:00
pdeva
76bf8ccd8c
correct key name
2015-07-25 21:58:37 -07:00
fjy
92293ef094
Added section on best practices for schema design and a few other edits
2015-07-24 14:06:20 -07:00
Himanshu Gupta
119ec13d23
updating hadoop tuningConfig doc with useCombiner flag
2015-07-22 13:55:00 -05:00
Himanshu Gupta
dd95ef77c0
recommend druid-hdfs-storage and hadoop dependencies to be in the classpath instead of added as an extension
2015-07-18 16:18:12 -05:00
Charles Allen
e051e93d19
Merge pull request #1518 from RealROI/more-azure-features
...
Azure Blob Store support for Firehose and Indexing Service Logs
2015-07-17 16:10:22 -07:00
Zak Kristjanson
0bda7af52c
Add more support for Azure Blob Store
...
Azure Blob Store support for Task Logs and a firehose for data ingestion
2015-07-17 15:38:21 -07:00
Shiyu Qiu
bec8e8e23a
fix doc data-formats.md
2015-07-15 17:13:33 -05:00
Tim
3b692fb6f7
fix #1525 - typo: "HadoopBatchIndexer"
2015-07-14 20:48:24 -07:00
fjy
08d00cc80f
rework the realtime examples a bit; add more faq
2015-07-07 14:07:14 -07:00
sahner
acd20e8c00
say explicitly that local firehose searches directories recursively for files
2015-07-05 14:46:44 -05:00
Fangjin Yang
2544f3655e
Merge pull request #1457 from ravishrathod/rabbitmq-doc
...
updating doc for rabbitmq firehose
2015-06-23 08:24:49 -07:00
ravishrathod
9213fd3801
updating doc for rabbitmq firehose
2015-06-22 02:40:11 -04:00
fjy
9c74993559
fix protobuf impl and docs
2015-06-20 21:59:38 -07:00
fjy
74d8840414
Change tranquility links
2015-05-31 10:59:38 -07:00
Himanshu Gupta
be4ecc4b91
in batch ingestion metadataUpdateSpec->type is derby, mysql etc and not metadata
2015-05-29 22:16:18 -05:00
Xavier Léauté
d2346b6834
shorten links and file names
...
* remove redundant parts in file names
* delete unsupported "Druid-Personal-Demo-Cluster"
2015-05-29 20:55:42 -05:00
Himanshu Gupta
8edc2aaca3
renaming all *.md filenames to only have lowercase and dashes
...
so that they are editable on a case-insensitive OS as well
2015-05-29 20:55:42 -05:00