Commit Graph

288 Commits

Author SHA1 Message Date
Himanshu 14aec7fcec
add config to optionally disable all compression in intermediate segment persists while ingestion (#7919)
* disable all compression in intermediate segment persists while ingestion

* more changes and build fix

* by default retain existing indexingSpec for intermediate persisted segments

* document indexSpecForIntermediatePersists index tuning config

* fix build issues

* update serde tests
2019-07-10 12:22:24 -07:00
Eyal Yurman 2eee711653 Add missing reference to Materialized-View extension. (#8003)
* Reference Materialized View extension from extensions page.

* Add comma
2019-07-06 13:50:41 -07:00
Chi Cao Minh 0ded0ce414 Add round support for DS-HLL (#8023)
* Add round support for DS-HLL

Since the Cardinality aggregator has a "round" option to round off estimated
values generated from the HyperLogLog algorithm, add the same "round" option to
the DataSketches HLL Sketch module aggregators to be consistent.

* Fix checkstyle errors

* Change HllSketchSqlAggregator to do rounding

* Fix test for standard-compliant null handling mode
2019-07-05 15:37:58 -07:00
Clint Wylie 42a7b8849a remove FirehoseV2 and realtime node extensions (#8020)
* remove firehosev2 and realtime node extensions

* revert intellij stuff

* rat exclusion
2019-07-04 15:40:22 -07:00
Clint Wylie f7283378ac
remove deprecated standalone realtime node (#7915)
* remove CliRealtime, RealtimeManager, etc

* add redirects for deleted page to page that explains the deleted thing

* adjust docs
2019-07-02 18:12:17 -07:00
Eyal Yurman 3650eed1aa Improve pull-deps reference in extensions page. (#8002) 2019-07-01 11:18:27 -07:00
Alexander Saydakov f38a62e949 theta sketch to string post agg (#7937) 2019-06-27 15:09:57 -07:00
Jonathan Wei 35601bb7a0 Add finalizeAsBase64Binary option to FixedBucketsHistogramAggregatorFactory (#7784)
* Add finalizeAsBase64Binary option to FixedBucketsHistogramAggregatorFactory

* Add finalizeAsBase64Binary option to ApproximateHistogramFactory

* Update approx histogram doc
2019-06-21 18:00:19 -07:00
Xue Yu 456a3654ce add PolygonBound and missing extentions list doc (#7885) 2019-06-13 12:03:58 -07:00
Justin Borromeo 8032c4add8 Add errors and state to stream supervisor status API endpoint (#7428)
* Add state and error tracking for seekable stream supervisors

* Fixed nits in docs

* Made inner class static and updated spec test with jackson inject

* Review changes

* Remove redundant config param in supervisor

* Style

* Applied some of Jon's recommendations

* Add transience field

* write test

* implement code review changes except for reconsidering logic of markRunFinishedAndEvaluateHealth()

* remove transience reporting and fix SeekableStreamSupervisorStateManager impl

* move call to stateManager.markRunFinished() from RunNotice to runInternal() for tests

* remove stateHistory because it wasn't adding much value, some fixes, and add more tests

* fix tests

* code review changes and add HTTP health check status

* fix test failure

* refactor to split into a generic SupervisorStateManager and a specific SeekableStreamSupervisorStateManager

* fixup after merge

* code review changes - add additional docs

* cleanup KafkaIndexTaskTest

* add additional documentation for Kinesis indexing

* remove unused throws class
2019-05-31 17:16:01 -07:00
Jonathan Wei ec4d09a02f Remove obsolete isExcluded config from Kerberos authenticator (#7745) 2019-05-23 16:00:05 -07:00
awelsh93 6964ac23a2 Adding influxdb emitter as a contrib extension (#7717)
* Adding influxdb emitter as a contrib extension

* addressing code review comments
2019-05-23 11:11:48 -07:00
gocho1 bd899b9224 add s3 authentication method informations (#7674)
* add s3 authentication method informations

* add druid.s3.fileSessionCredentials related content

* remove authentication parameters to avoid confusion as it is more detailed in S3 Deep Storage page

* streamline s3 docs
2019-05-22 11:46:02 -07:00
Gian Merlino b6941551ae Upgrade various build and doc links to https. (#7722)
* Upgrade various build and doc links to https.

Where it wasn't possible to upgrade build-time dependencies to https,
I kept http in place but used hardcoded checksums or GPG keys to ensure
that artifacts fetched over http are verified properly.

* Switch to https://apache.org.
2019-05-21 11:30:14 -07:00
andrewluotechnologies 1add566411 Fix typo (ComplexMetricSerde class name was spelled incorrectly) (#7694) 2019-05-19 09:49:54 -07:00
Gian Merlino 0352f450d7 Fix broken links in docs, add broken link checker. (#7658)
Also adds back insert-segment-to-db.md with some docs about why and
when it was removed (in #6911).
2019-05-15 14:49:50 -07:00
Jonathan Wei e874da7cea
Add simpler permissions option to BasicAuthorizer GET APIs (#7635)
* Add simpler permissions option to BasicAuthorizer GET APIs

* Adjust log message

Co-Authored-By: Himanshu <g.himanshu@gmail.com>

* Adjust log message

Co-Authored-By: Himanshu <g.himanshu@gmail.com>
2019-05-15 12:59:32 -07:00
Alexander Saydakov ca1a6649f6 Datasketches quantiles more post-aggs (#7550)
* rank and CDF post-aggs

* added post-aggs to the module

* added new post-aggs

* moved post-agg IDs

* moved post-agg IDs
2019-05-10 11:46:54 -07:00
Clint Wylie 6a6c6d573d
Add plain text README.txt, use relative link from README.md to build.md (#7611)
* use relative link to build instructions from top level readme

* add textfile to readme

* formatting

* make README.BINARY plaintext, move LABELS.md to LABELS, README.txt to README

* exclude README.BINARY still

* remove jdk links/recommmendations

* add script to use DRUIDVERSION in textfile README instead of latest, add links to recommended jdk to build.md

* license

* better readme template, links to latest if does not detect an apache release version

* fix
2019-05-09 21:29:26 -07:00
Samarth Jain b542bb9f34 TDigest backed sketch aggregators (#7331)
* First set of changes for tDigest histogram

* Add license

* Address code review comments

* Add a doc page for new T-Digest sketch aggregators. Minor code cleanup and comments.

* Remove synchronization from BufferAggregators. Address code review comments

* Fix typo
2019-05-09 17:22:55 -07:00
Jinseon Lee 0ef435a16c add postgresql meta db table schema configuration property (#7137) (#7183)
* add postgresql meta db table schema configuration property (#7137)

If the postgresql db schema changes, you must set the configuration
values.
You do not need to set it if there is no change from the default schema
'public'.
druid.metadata.postgres.dbTableSchema=public

* create postgresql metadb table schema configuration property (#7137)
If the postgresql db schema changes, you must set the configuration
values.
You do not need to set it if there is no change from the default schema
'public'.
druid.metadata.postgres.dbTableSchema=public
check PostgreSQLTablesConfig.java

* modify postgresql readme file. - metadb table schema (#7137)
If the postgresql db schema changes, you must set the configuration
values.
You do not need to set it if there is no change from the default schema
'public'.
druid.metadata.postgres.dbTableSchema=public
check PostgreSQLTablesConfig.java
2019-05-08 12:56:30 -07:00
Gian Merlino 727b65c7e5 Remove SQL experimental banner and other doc adjustments. (#7591)
* Remove SQL experimental banner and other doc adjustments.

Also,

- Adjust the ToC and other docs a bit so SQL and native queries are
  presented on more equal footing.
- De-emphasize querying historicals and peons directly in the
  native query docs. This is a really niche thing and may have been
  confusing to include prominently in the very first paragraph.
- Remove DataSketches and Kafka indexing service from the experimental
  features ToC. They are not experimental any longer and were there in
  error.

* More notes.

* Slight tweak.

* Remove extra extra word.

* Remove RT node from ToC.
2019-05-06 12:31:51 -07:00
Jonathan Wei a013350018 Adjust required permissions for system schema (#7579)
* Adjust required permissions for system schema

* PR comments, fix current_size handling

* Checkstyle

* Set curr_size instead of current_size

* Adjust information schema docs

* Fix merge conflict

* Update tests
2019-05-02 07:18:02 -07:00
Eyal Yurman f02251ab2d Contributing Moving-Average Query to open source. (#6430)
* Contributing Moving-Average Query to open source.

* Fix failing code inspections.

* See if explicit types will invoke the correct comparison function.

* Explicitly remove support for druid.generic.useDefaultValueForNull configuration parameter.

* Update styling and headers for complience.

* Refresh code with latest master changes:

* Remove NullDimensionSelector.
* Apply changes of RequestLogger.
* Apply changes of TimelineServerView.

* Small checkstyle fix.

* Checkstyle fixes.

* Fixing rat errors; Teamcity errors.

* Removing support theta sketches. Will be added back in this pr or a following once DI conflicts with datasketches are resolved.

* Implements some of the review fixes.

* Contributing Moving-Average Query to open source.

* Fix failing code inspections.

* See if explicit types will invoke the correct comparison function.

* Explicitly remove support for druid.generic.useDefaultValueForNull configuration parameter.

* Update styling and headers for complience.

* Refresh code with latest master changes:

* Remove NullDimensionSelector.
* Apply changes of RequestLogger.
* Apply changes of TimelineServerView.

* Small checkstyle fix.

* Checkstyle fixes.

* Fixing rat errors; Teamcity errors.

* Removing support theta sketches. Will be added back in this pr or a following once DI conflicts with datasketches are resolved.

* Implements some of the review fixes.

* More fixes for review.

* More fixes from review.

* MapBasedRow is Unmodifiable. Create new rows instead of modifying existing ones.

* Remove more changes related to datasketches support.

* Refactor BaseAverager startFrom field and add a comment.

* fakeEvents field: Refactor initialization and add comment.

* Rename parameters (tiny change).

* Fix variable name typo in test (JAN_4).

* Fix styling of non camelCase fields.

* Fix Preconditions.checkArgument for cycleSize.

* Add more documentation to RowBucketIterable and other classes.

* key/value comment on in MovingAverageIterable.

* Fix anonymous makeColumnValueSelector returning null.

* Replace IdentityYieldingAccumolator with Yielders.each().

* * internalNext() should return null instead of throwing exception.
* Remove unused variables/prarameters.

* Harden MovingAverageIterableTest (Switch anyOf to exact match).

* Change internalNext() from recursion to iteration; Simplify next() and hasNext().

* Remove unused imports.

* Address review comments.

* Rename fakeEvents to emptyEvents.

* Remove redundant parameter key from computeMovingAverage.

* Check yielder as well in RowBucketIterable#hasNext()

* Fix javadoc.
2019-04-26 17:07:48 -07:00
Clint Wylie 09b7700d13 fix docs (#7556) 2019-04-25 22:00:37 -07:00
Jonathan Wei 8b1a4e18dd Additional Apache branding doc updates (#7524) 2019-04-23 14:39:16 -07:00
Jonathan Wei 74960e82bf Add more Apache branding to docs (#7515) 2019-04-19 15:52:26 -07:00
zhaojiandong 1d9450da81 Some docs optimization (#6890)
* some markdown docs optimization

* markdown escape
2019-04-12 17:30:57 -07:00
Clint Wylie 89bb43f382 'core' ORC extension (#7138)
* orc extension reworked to use apache orc map-reduce lib, moved to core extensions, support for flattenSpec, tests, docs

* change binary handling to be compatible with avro and parquet, Rows.objectToStrings now converts byte[] to base64, change date handling

* better docs and tests

* fix it

* formatting

* doc fix

* fix it

* exclude redundant dependencies

* use latest orc-mapreduce, add hadoop jobProperties recommendations to docs

* doc fix

* review stuff and fix binaryAsString

* cache for root level fields

* more better
2019-04-09 09:03:26 -07:00
Charles Allen eeb3dbe79d Move GCP to a core extension (#6953)
* Move GCP to a core extension

* Don't provide druid-core >.<

* Keep AWS and GCP modules separate

* Move AWSModule to its own module

* Add aws ec2 extension and more modules in more places

* Fix bad imports

* Fix test jackson module

* Include AWS and GCP core in server

* Add simple empty method comment

* Update version to 15

* One more 0.13.0-->0.15.0 change

* Fix multi-binding problem

* Grep for s3-extensions and update docs

* Update extensions.md
2019-03-27 09:00:43 -07:00
Clint Wylie 3895914aa2 consolidate CompressionUtils.java since now in the same jar (#6908) 2019-03-13 11:02:44 -04:00
Jonathan Wei 94463b5778 Add missing redirects and fix broken links (#7213)
* Add missing redirects

* Fix zookeeper redirect

* Fix broken links
2019-03-09 15:16:23 -08:00
Jonathan Wei 9183e32876 Add more approximate algorithm docs (#7195) 2019-03-05 16:44:02 -08:00
Jonathan Wei 32c418fdd8 Reword 'node' to 'process' (#7172) 2019-02-28 18:10:39 -08:00
Jonathan Wei a0afd7931d
Add web consoles doc page (#7123)
* Add web consoles doc page

* PR comments

* Remove 'unified'

* PR comments

* Fix TOC

* PR comments

* More revisions

* GUI -> UI

* Update router docs

* Reword router doc
2019-02-28 14:02:39 -08:00
Jonathan Wei 0b4f771062 Exclude hadoop-lzo from thrift-extensions build (#7151) 2019-02-27 19:57:53 -08:00
Jihoon Son 4e2b085201
Remove DataSegmentFinder, InsertSegmentToDb, and descriptor.json file in deep storage (#6911)
* Remove DataSegmentFinder, InsertSegmentToDb, and descriptor.json file

* delete descriptor.file when killing segments

* fix test

* Add doc for ha

* improve warning
2019-02-20 15:10:29 -08:00
Surekha 80a2ef7be4 Support kafka transactional topics (#5404) (#6496)
* Support kafka transactional topics

* update kafka to version 2.0.0
* Remove the skipOffsetGaps option since it's not used anymore
* Adjust kafka consumer to use transactional semantics
* Update tests

* Remove unused import from test

* Fix compilation

* Invoke transaction api to fix a unit test

* temporary modification of travis.yml for debugging

* another attempt to get travis tasklogs

* update kafka to 2.0.1 at all places

* Remove druid-kafka-eight dependency from integration-tests, remove the kafka firehose test and deprecate kafka-eight classes

* Add deprecated in docs for kafka-eight and kafka-simple extensions

* Remove skipOffsetGaps and code changes for transaction support

* Fix indentation

* remove skipOffsetGaps from kinesis

* Add transaction api to KafkaRecordSupplierTest

* Fix indent

* Fix test

* update kafka version to 2.1.0
2019-02-18 11:50:08 -08:00
Edward Gan 90c1a54b86 Moments Sketch custom aggregator (#6581)
* Moments Sketch Integration with Druid

* updates, add documentation, fix warnings

* nits

* disallowed base64

* update to druid 0.14
2019-02-13 14:03:47 -08:00
Jihoon Son 8e3a58f723
Improve druid.storage.sse.kms.keyId and druid.s3.protocol (#7012)
* Improve druid.storage.sse.kms.keyId and druid.s3.protocol

* fix article
2019-02-06 15:00:51 -08:00
Jihoon Son 75c70c2ccc Add doc for S3 permissions settings (#7011)
* Add doc for S3 permissions settings

* add a comment about additional settings
2019-02-05 11:52:09 -08:00
Clint Wylie 7a5827e12e bloom filter sql aggregator (#6950)
* adds sql aggregator for bloom filter, adds complex value serde for sql results

* fix tests

* checkstyle

* fix copy-paste
2019-02-01 13:54:46 -08:00
Gian Merlino 54735a5ad1 Kafka indexing: Remove experimental notice. (#6970) 2019-01-31 09:54:22 -08:00
Jonathan Wei 82137874ea Add master/data/query server concepts to docs/packaging (#6916)
* Add master/data/query server concepts to docs/packaging

* PR comments

* TOC and markdown fix

* Update image legend

* PR comment

* More PR comments
2019-01-30 19:41:07 -08:00
Jihoon Son d4fbbb8deb Support protocol configuration for S3 (#6954)
* Support protocol configuration for S3

* Add doc
2019-01-30 19:32:00 -08:00
Clint Wylie a6d81c0d16 Adds bloom filter aggregator to 'druid-bloom-filters' extension (#6397)
* blooming aggs

* partially address review

* fix docs

* minor test refactor after rebase

* use copied bloomkfilter

* add ByteBuffer methods to BloomKFilter to allow agg to use in place, simplify some things, more tests

* add methods to BloomKFilter to get number of set bits, use in comparator, fixes

* more docs

* fix

* fix style

* simplify bloomfilter bytebuffer merge, change methods to allow passing buffer offsets

* oof, more fixes

* more sane docs example

* fix it

* do the right thing in the right place

* formatting

* fix

* avoid conflict

* typo fixes, faster comparator, docs for comparator behavior

* unused imports

* use buffer comparator instead of deserializing

* striped readwrite lock for buffer agg, null handling comparator, other review changes

* style fixes

* style

* remove sync for now

* oops

* consistency

* inspect runtime shape of selector instead of selector plus, static comparator, add inner exception on serde exception

* CardinalityBufferAggregator inspect selectors instead of selectorPluses

* fix style

* refactor away from using ColumnSelectorPlus and ColumnSelectorStrategyFactory to instead use specialized aggregators for each supported column type, other review comments

* adjustment

* fix teamcity error?

* rename nil aggs to empty, change empty agg constructor signature, add comments

* use stringutils base64 stuff to be chill with master

* add aggregate combiner, comment
2019-01-29 20:05:17 +07:00
Clint Wylie af3cbc3687 add bloom filter druid expression (#6904)
* add "bloom_filter_test" druid expression to support bloom filters in ExpressionVirtualColumn and ExpressionDimFilter and sql expressions

* more docs

* use java.util.Base64, doc fixes
2019-01-28 08:41:45 -05:00
Navin Kumar ae4dba7785 Fix Configuration options (#6884)
Change `druid.metadata.postgres.*` to `druid.metadata.postgres.ssl.*`
2019-01-27 12:35:27 -08:00
Justin Borromeo 86e171a234 Doc change and commands tested command on v5 and v8 (#6886) 2019-01-18 15:13:11 -08:00
Jonathan Wei 68f744ec0a
Fixed buckets histogram aggregator (#6638)
* Fixed buckets histogram aggregator

* PR comments

* More PR comments

* Checkstyle

* TeamCity

* More TeamCity

* PR comment

* PR comment

* Fix doc formatting
2019-01-17 14:51:16 -08:00