mirror of https://github.com/apache/druid.git
fix some docs and add new content (#3369)
This commit is contained in:
parent e4f0eac8e6
commit 6beb8ac342
@@ -12,11 +12,11 @@ This extension enables Druid to ingest and understand the Apache Orc data format
 This is for batch ingestion using the HadoopDruidIndexer. The inputFormat of inputSpec in ioConfig must be set to `"org.apache.hadoop.hive.ql.io.orc.OrcNewInputFormat"`.
 
-Field | Type | Description | Required
-----------|-------------|----------------------------------------------------------------------------------------|---------
-type | String | This should say `orc` | yes
-parseSpec | JSON Object | Specifies the timestamp and dimensions of the data. Any parse spec that extends ParseSpec is possible, but only its TimestampSpec and DimensionsSpec are used. | yes
-typeString| String | String representation of the Orc struct type info. If not specified, it is auto-constructed from the parseSpec, but all metric columns are dropped. | no
+|Field | Type | Description | Required|
+|----------|-------------|----------------------------------------------------------------------------------------|---------|
+|type | String | This should say `orc` | yes|
+|parseSpec | JSON Object | Specifies the timestamp and dimensions of the data. Any parse spec that extends ParseSpec is possible, but only its TimestampSpec and DimensionsSpec are used. | yes|
+|typeString| String | String representation of the Orc struct type info. If not specified, it is auto-constructed from the parseSpec, but all metric columns are dropped. | no|
 
 As an example of a `typeString`, a string column `col1` and an array-of-strings column `col2` are represented by `"struct<col1:string,col2:array<string>>"`.
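Putting the fields above together, a `parser` section for an Orc ingestion spec might look like the following sketch. The column names, the timestamp field, and the `timeAndDims` parseSpec format are illustrative assumptions, not taken from the commit:

```json
{
  "type": "orc",
  "parseSpec": {
    "format": "timeAndDims",
    "timestampSpec": {
      "column": "timestamp",
      "format": "auto"
    },
    "dimensionsSpec": {
      "dimensions": ["col1", "col2"]
    }
  },
  "typeString": "struct<timestamp:string,col1:string,col2:array<string>>"
}
```

Note how the `typeString` covers the timestamp column as well as both dimensions, matching the `struct<...>` notation described above.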
@@ -12,10 +12,10 @@ This extension enables Druid to ingest and understand the Apache Parquet data format
 This is for batch ingestion using the HadoopDruidIndexer. The inputFormat of inputSpec in ioConfig must be set to `"io.druid.data.input.parquet.DruidParquetInputFormat"`. Also make sure to include "io.druid.extensions:druid-avro-extensions" as an extension.
 
-Field | Type | Description | Required
-----------|-------------|----------------------------------------------------------------------------------------|---------
-type | String | This should say `parquet` | yes
-parseSpec | JSON Object | Specifies the timestamp and dimensions of the data. Should be a timeAndDims parseSpec. | yes
+|Field | Type | Description | Required|
+|----------|-------------|----------------------------------------------------------------------------------------|---------|
+|type | String | This should say `parquet` | yes|
+|parseSpec | JSON Object | Specifies the timestamp and dimensions of the data. Should be a timeAndDims parseSpec. | yes|
 
 For example:
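The doc's own example is cut off by the diff context, but a minimal Parquet `parser` section following the table above could look like this sketch. The column and dimension names are hypothetical:

```json
{
  "type": "parquet",
  "parseSpec": {
    "format": "timeAndDims",
    "timestampSpec": {
      "column": "timestamp",
      "format": "auto"
    },
    "dimensionsSpec": {
      "dimensions": ["dim1", "dim2"]
    }
  }
}
```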
@@ -36,8 +36,11 @@ Core extensions are maintained by Druid committers.
 # Community Extensions
 
-A number of community members have contributed their own extensions to Druid that are not packaged with the default Druid tarball.
-Community extensions are not maintained by Druid committers, although we accept patches from community members using these extensions.
+<div class="note caution">
+Community extensions are not maintained by Druid committers, although we accept patches from community members using these extensions. They may not have been as extensively tested as the core extensions.
+</div>
+
+A number of community members have contributed their own extensions to Druid that are not packaged with the default Druid tarball.
+If you'd like to take on maintenance for a community extension, please post on the [druid-development group](https://groups.google.com/forum/#!forum/druid-development) to let us know!
 
 All of these community extensions can be downloaded using *pull-deps* with the coordinate io.druid.extensions.contrib:EXTENSION_NAME:LATEST_DRUID_STABLE_VERSION.
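As a sketch of the *pull-deps* invocation described above, run from a Druid installation directory (the classpath layout here is an assumption about a typical install; the coordinate placeholders are the ones from the text):

```shell
# Download a community extension and its transitive dependencies.
java -classpath "lib/*:config/_common" io.druid.cli.Main tools pull-deps \
  -c io.druid.extensions.contrib:EXTENSION_NAME:LATEST_DRUID_STABLE_VERSION
```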
@@ -128,5 +128,10 @@ The interval of a segment will be compared against the specified period. The per
 # Permanently Deleting Data
 
 Druid can fully drop data from the cluster, wipe the metadata store entry, and remove the data from deep storage for any segments that are
 marked as unused (segments dropped from the cluster via rules are always marked as unused). You can submit a [kill task](../ingestion/tasks.html) to the [indexing service](../design/indexing-service.html) to do this.
+
+# Reloading Dropped Data
+
+Data that has been dropped from a Druid cluster cannot be reloaded using only rules. To reload dropped data in Druid, you must first extend your retention period (e.g., change it from 1 month to 2 months), and
+then enable the datasource in the Druid coordinator console, or through the Druid coordinator endpoints.
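A kill task payload submitted to the indexing service, as a sketch, might look like the following. The datasource name and interval are hypothetical:

```json
{
  "type": "kill",
  "dataSource": "wikipedia",
  "interval": "2016-06-01/2016-06-02"
}
```

All unused segments of the named datasource whose intervals fall inside the specified interval are permanently deleted from deep storage and the metadata store.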