Sébastien Launay
37d2ab623e
Merge pull request #2815 from slaunay/documentation/hadoop-classpath-issue-fix-with-configuration
...
Doc for mapreduce.job.user.classpath.first=true
2016-04-12 10:51:51 -07:00
Himanshu Gupta
004b00bb96
config to explicitly specify classpath for hadoop container during hadoop ingestion
2016-03-25 10:51:28 -05:00
Gian Merlino
2dfd3877c0
Fix a bunch of broken links in the docs.
2016-03-23 10:21:28 -07:00
fjy
943cbe6e76
refactor extensions into their own docs
2016-03-22 18:54:10 -07:00
binlijin
bce600f5d5
Single dimension hash-based partitioning
2016-03-22 13:15:33 +08:00
Gian Merlino
a2b1652787
Clarify parser docs.
...
- Clarify what parseSpecs are used for.
- Avro, Protobuf should use timeAndDims parseSpecs.
- Hadoop jobs should use hadoopyString string parsers.
2016-03-10 08:45:04 -08:00
fjy
e3e932a4d4
refactor extensions into core and contrib
2016-03-08 17:12:09 -08:00
Fangjin Yang
8e36e6fa43
Merge pull request #2610 from dclim/add-combineText-doc
...
add combineText property and cleanup batch ingestion doc
2016-03-08 12:54:16 -08:00
dclim
df29667a89
add combineText property and cleanup batch ingestion doc
2016-03-08 13:10:34 -07:00
Himanshu Gupta
0402636598
configurable handoffConditionTimeout in realtime tasks for segment handoff wait
2016-03-05 10:14:54 -06:00
Slim Bouguerra
623e89aa54
skip corrupt message
2016-03-04 08:30:40 -06:00
Björn Zettergren
2462c82c0e
New defaults for maxRowsInMemory rowFlushBoundary
...
To bring consistency to docs and source this commit changes the default
values for maxRowsInMemory and rowFlushBoundary to 75000 after
discussion in PR https://github.com/druid-io/druid/pull/2457 .
The previous default was 500000 and it's lower now on the grounds that
it's better for a default to be somewhat less efficient, and work,
than to reach for the stars and possibly result in
"OutOfMemoryError: java heap space" errors.
2016-03-01 13:50:28 +01:00
Charles Allen
1fe277ee29
Merge pull request #2367 from se7entyse7en/feature-rackspace-cloud-files-static-firehose
...
Adds support to use Rackspace's cloudfiles as static firehose
2016-02-25 17:31:06 -08:00
Gian Merlino
3534483433
Better handling of ParseExceptions.
...
Two changes:
- Allow IncrementalIndex to suppress ParseExceptions on "aggregate".
- Add "reportParseExceptions" option to realtime tuning configs. By default this is "false".
Behavior of the counters should now be:
- processed: Number of rows indexed, including rows where some fields could be parsed and some could not.
- thrownAway: Number of rows thrown away due to rejection policy.
- unparseable: Number of rows thrown away due to being completely unparseable (no fields salvageable at all).
If "reportParseExceptions" is true then "unparseable" will always be zero (because a parse error would
cause an exception to be thrown). In addition, "processed" will only include fully parseable rows
(because even partial parse failures will cause exceptions to be thrown).
Fixes #2510 .
2016-02-23 10:11:43 -08:00
Himanshu Gupta
21b0b8a07d
new coordinator endpoint to get list of used segment given a dataSource and list of intervals
2016-02-21 23:17:58 -06:00
Himanshu Gupta
09ffcae4ae
give user the option to specify the segments for dataSource inputSpec
2016-02-21 23:15:31 -06:00
Fangjin Yang
083f019a48
Merge pull request #2465 from druid-io/more-doc-fix
...
more doc fixes
2016-02-17 11:00:38 -08:00
fjy
7da6594bfe
more doc fixes
2016-02-17 09:43:47 -08:00
Gian Merlino
3a996216bd
Multivalued dimensions can be compressed since 0.8.0.
2016-02-17 08:33:21 -08:00
Himanshu
f6eebf5884
Merge pull request #2422 from rasahner/docMinorFixes
...
some minor doc changes
2016-02-09 10:03:22 -06:00
Robin
1d57e3267d
some minor doc changes
2016-02-09 08:20:53 -06:00
fjy
6fc5bcb1ef
fix docs
2016-02-08 13:40:53 -08:00
fjy
003f54e268
add doc rendering
2016-02-04 14:21:59 -08:00
fjy
1aa363cea7
new quickstart
2016-02-04 09:37:38 -08:00
Lou Marvin Caraig
9de57eb1c8
Added documentation
2016-02-02 14:32:12 +01:00
Björn Zettergren
d373573c25
DOCs: Missing 'type' for leaveIntermediate
...
Added missing 'Boolean' as type for leaveIntermediate row in table TuningConfig
2016-01-29 14:42:19 +01:00
Himanshu Gupta
b3437825f0
add ignoreWhenNoSegments flag to optionally ignore the dataSource inputSpec when no segments were found
2016-01-26 17:23:55 -06:00
binlijin
cd1c71ceb4
rename persistBackgroundCount to numBackgroundPersistThreads
2016-01-22 14:29:41 +08:00
Nishant
dcb7830330
Merge pull request #984 from drcrallen/thread-priority-rebase
...
Use thread priorities. (aka set `nice` values for background-like tasks)
2016-01-21 15:02:34 +05:30
Charles Allen
2a69a58570
Merge pull request #2149 from binlijin/master
...
Do persist IncrementalIndex in another thread in IndexGeneratorReducer
2016-01-20 17:06:42 -08:00
Charles Allen
2e1d6aaf3d
Use thread priorities. (aka set `nice` values for background-like tasks)
...
* Defaults the thread priority to java.util.Thread.NORM_PRIORITY in io.druid.indexing.common.task.AbstractTask
* Each exec service has its own Task Factory which is assigned a priority for spawned tasks. Therefore each priority class has a unique exec service
* Added priority to tasks as taskPriority in the task context. <0 means low, 0 means take default, >0 means high. It is up to any particular implementation to determine how to handle these numbers
* Add options to ForkingTaskRunner
* Add "-XX:+UseThreadPriorities" default option
* Add "-XX:ThreadPriorityPolicy=42" default option
* AbstractTask - Removed unneeded @JsonIgnore on priority
* Added priority to RealtimePlumber executors. All sub-executors (non query runners) get Thread.MIN_PRIORITY
* Add persistThreadPriority and mergeThreadPriority to realtime tuning config
2016-01-20 14:00:31 -08:00
Logan Linn
c3bdaefe1f
Update batch-ingestion.md
...
Fix documented type of the `dataGranularity` config
2016-01-19 17:20:47 -08:00
binlijin
8e43e2c446
Do persist IncrementalIndex in another thread in IndexGeneratorReducer
2016-01-20 09:20:09 +08:00
Kurt Young
82ff98c2bf
add config for build v9 directly and update docs
2016-01-16 11:26:34 +08:00
Zhao Weinan
5e57ddb8cc
Adding avro support to realtime & hadoop batch indexing.
2016-01-05 10:21:27 +08:00
Robin
0961c0b703
trivial documentation fix
2016-01-04 12:39:10 -06:00
fjy
88f6b9b5ad
Multiple improvements for docs
2016-01-02 21:54:54 -08:00
Himanshu Gupta
48de9dfafa
doc update to make it easy to find how to do re-indexing or delta ingestion
2015-12-30 23:58:09 -06:00
fjy
398a3ec620
add docs for more specs
2015-12-17 18:06:30 -08:00
jon-wei
c53bf85d83
Add docs and benchmark for JSON flattening parser
2015-12-09 16:13:30 -08:00
Himanshu Gupta
efe3c9f4a5
update the examples for batch reindexing/delta ingestion to use "intervals" instead of deprecated "interval"
2015-12-06 00:22:20 -06:00
Himanshu Gupta
61aaa09012
support multiple intervals in dataSource input spec
2015-12-03 21:28:04 -06:00
jon-wei
95dca4440f
Update data formats doc with info about JSON multi-value dimensions
2015-11-24 14:38:06 -08:00
sahner
a4ed2ce2d1
fix formatting in schema-design
2015-11-17 16:50:53 -06:00
fjy
8f231fd3e3
cleanup druid codebase
2015-11-04 13:59:53 -08:00
Nishant
efc49da073
fix doc - correct default value for maxRowsInMemory
2015-11-01 22:09:24 -08:00
Bingkun Guo
4914925d65
New extension loading mechanism
...
1) Remove maven client from downloading extensions at runtime.
2) Provide a way to load Druid extensions and hadoop dependencies through file system.
3) Refactor pull-deps so that it can download extensions into extension directories.
4) Add documents on how to use this new extension loading mechanism.
5) Change how the Druid tarball is generated. Now all the extensions + hadoop-client 2.3.0
are packaged within the Druid tarball.
2015-10-21 14:22:36 -05:00
Gian Merlino
933cbdf780
Adjust realtime constraints in the docs.
2015-10-09 10:52:52 -07:00
Gian Merlino
b29cbf97a6
Docs: Suggest hadoopyString parser for Hadoop.
2015-09-16 10:19:42 -07:00
Himanshu Gupta
075b6d4385
update ingestion faq to mention dataSource inputSpec as an option of reindexing via hadoop
2015-09-10 14:41:13 -05:00
Xavier Léauté
d89b0fa76a
Merge pull request #1662 from qix/pathFormat-doc
...
Add documentation for pathFormat in batch ingestion
2015-08-31 11:14:54 -07:00
Josh Yudaken
29c29b42d3
Add default value and link to joda docs
2015-08-31 11:09:54 -07:00
lvjq
2237a8cf0f
kafka 8 simple consumer firehose
2015-08-27 20:50:46 -05:00
Bingkun
ae1f104c10
Fix batch ingestion doc
2015-08-26 15:16:21 -05:00
Gian Merlino
10946610f4
Merge pull request #1656 from druid-io/all-the-docs
...
more docs for common questions
2015-08-25 17:49:47 -07:00
fjy
4055f9ca48
more docs for common questions
2015-08-25 17:49:04 -07:00
sahner
3def847e28
add documentation about TimedShutoff firehose
2015-08-24 20:41:42 -05:00
Josh Yudaken
5e42aee49e
Add documentation for pathFormat in batch ingestion
2015-08-24 14:39:57 -07:00
Himanshu Gupta
cfd81bfac7
updating the docs on how to do hadoop batch re-ingestion and delta ingestion
2015-08-16 14:07:35 -05:00
fjy
012fff6616
fix firehose docs
2015-08-04 09:52:23 -07:00
Himanshu Gupta
7ee509bcd0
fix mysql references in tutorial docs
2015-07-30 22:05:05 -05:00
pdeva
ef0439229d
Specify dynamic dimension schema
...
Document how druid can dynamically infer dimension columns
2015-07-27 20:20:53 -07:00
sahner
4801de62a2
make "announce" the chathandler default in realtime node,
...
remove doc references to chathandler type "announce" since it is the default now
2015-07-27 12:14:28 -05:00
pdeva
76bf8ccd8c
correct key name
2015-07-25 21:58:37 -07:00
fjy
92293ef094
Added section on best practices for schema design and a few other edits
2015-07-24 14:06:20 -07:00
Himanshu Gupta
119ec13d23
updating hadoop tuningConfig doc with useCombiner flag
2015-07-22 13:55:00 -05:00
Himanshu Gupta
dd95ef77c0
recommend druid-hdfs-storage and hadoop dependencies to be in the classpath instead of added as an extension
2015-07-18 16:18:12 -05:00
Charles Allen
e051e93d19
Merge pull request #1518 from RealROI/more-azure-features
...
Azure Blob Store support for Firehose and Indexing Service Logs
2015-07-17 16:10:22 -07:00
Zak Kristjanson
0bda7af52c
Add more support for Azure Blob Store
...
Azure Blob Store support for Task Logs and a firehose for data ingestion
2015-07-17 15:38:21 -07:00
Shiyu Qiu
bec8e8e23a
fix doc data-formats.md
2015-07-15 17:13:33 -05:00
Tim
3b692fb6f7
fix #1525 - typo: "HadoopBatchIndexer"
2015-07-14 20:48:24 -07:00
fjy
08d00cc80f
rework the realtime examples a bit; add more faq
2015-07-07 14:07:14 -07:00
sahner
acd20e8c00
say explicitly that local firehose searches directories recursively for files
2015-07-05 14:46:44 -05:00
Fangjin Yang
2544f3655e
Merge pull request #1457 from ravishrathod/rabbitmq-doc
...
updating doc for rabbitmq firehose
2015-06-23 08:24:49 -07:00
ravishrathod
9213fd3801
updating doc for rabbitmq firehose
2015-06-22 02:40:11 -04:00
fjy
9c74993559
fix protobuf impl and docs
2015-06-20 21:59:38 -07:00
fjy
74d8840414
Change tranquility links
2015-05-31 10:59:38 -07:00
Himanshu Gupta
be4ecc4b91
in batch ingestion metadataUpdateSpec->type is derby, mysql etc and not metadata
2015-05-29 22:16:18 -05:00
Xavier Léauté
d2346b6834
shorten links and file names
...
* remove redundant parts in file names
* delete unsupported "Druid-Personal-Demo-Cluster"
2015-05-29 20:55:42 -05:00
Himanshu Gupta
8edc2aaca3
renaming all *.md filenames to only have lowercase and dashes
...
so that they are editable on case-insensitive os as well
2015-05-29 20:55:42 -05:00