* Fix missing temp dir for native single_dim
Native single_dim indexing throws a file-not-found exception from
InputEntityIteratingReader.java:81. This MR creates the required
temporary directory when setting up the
PartialDimensionDistributionTask. The change was tested on a Druid
cluster; with the change installed, native single_dim indexing
completes successfully.
* Fix indentation
* Use SinglePhaseSubTask as example for creating the temp dir
* Move temporary indexing dir creation into TaskToolbox
* Remove unused dependency
Co-authored-by: Morri Feldman <morri@appsflyer.com>
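The temp dir fix above amounts to making sure the directory exists before a reader writes into it. A minimal sketch of that idea, assuming a hypothetical helper `ensureTmpDir` and a hypothetical directory name `indexing-tmp` (neither is taken from TaskToolbox or the task classes):

```java
import java.io.File;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;

final class TmpDirSketch {
    // Hypothetical helper: creates <taskDir>/indexing-tmp if it is missing.
    // The name "indexing-tmp" is an assumption made for illustration only.
    static File ensureTmpDir(File taskDir) {
        File tmpDir = new File(taskDir, "indexing-tmp");
        try {
            // createDirectories is a no-op when the directory already exists.
            Files.createDirectories(tmpDir.toPath());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return tmpDir;
    }
}
```

Calling the helper twice is safe, so it can sit in task setup without checking whether an earlier step already created the directory.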
* Fill in the core partition set size properly for batch ingestion with
dynamic partitioning
* incomplete javadoc
* Address comments
* fix tests
* fix json serde, add tests
* checkstyle
* Set core partition set size for hash-partitioned segments properly in
batch ingestion
* test for both parallel and single-threaded task
* unused variables
* fix test
* unused imports
* add hash/range buckets
* some test adjustment and missing json serde
* centralized partition id allocation in parallel and simple tasks
* remove string partition chunk
* revive string partition chunk
* fill numCorePartitions for hadoop
* clean up hash stuffs
* resolved todos
* javadocs
* Fix tests
* add more tests
* doc
* unused imports
* Allow append to existing datasources when dynamic partitioning is used
* fix test
* checkstyle
* checkstyle
* fix test
* fix test
* fix other tests..
* checkstyle
* handle unknown core partitions size in overlord segment allocation
* fail to append when numCorePartitions is unknown
* log
* fix comment; rename to be more intuitive
* double append test
* cleanup complete(); add tests
* fix build
* add tests
* address comments
* checkstyle
* optimize for protobuf parsing
* fix import error and maven dependency
* add unit test in protobufInputrowParserTest for flatten data
* solve code duplication (remove the log and main())
* rename 'flatten' to 'flat' to make it clearer
Co-authored-by: xionghuilin <xionghuilin@bytedance.com>
* Druid user permissions apply in the console
* Update index.md
* noting user warning in console page; some minor shuffling
* noting user warning in console page; some minor shuffling 1
* touchups
* link checking fixes
* Updated per suggestions
* retry 500 and 503 errors against kinesis
* add test that exercises retry logic
* more branch coverage
* retry 500 and 503 on getRecords request when fetching sequence number
Co-authored-by: Harshpreet Singh <hrshpr@twitch.tv>
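The retry commits above treat HTTP 500 and 503 responses as transient. A minimal sketch of that pattern, assuming a hypothetical `ServiceException` that carries a status code and a hypothetical `withRetries` helper (this is not the Kinesis client API, and it omits the backoff a real client would want between attempts):

```java
import java.util.function.Supplier;

final class RetrySketch {
    // Hypothetical exception type carrying an HTTP status code.
    static final class ServiceException extends RuntimeException {
        final int statusCode;

        ServiceException(int statusCode) {
            super("service returned " + statusCode);
            this.statusCode = statusCode;
        }
    }

    // Only 500 and 503 are considered transient in this sketch.
    static boolean isRetryable(int statusCode) {
        return statusCode == 500 || statusCode == 503;
    }

    // Calls the supplier up to maxTries times, rethrowing on a
    // non-retryable status or once the tries are used up.
    static <T> T withRetries(Supplier<T> call, int maxTries) {
        for (int attempt = 1; ; attempt++) {
            try {
                return call.get();
            } catch (ServiceException e) {
                if (!isRetryable(e.statusCode) || attempt >= maxTries) {
                    throw e;
                }
            }
        }
    }
}
```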
* change default number of segment loading threads
* fix docs
* missed file
* min -> max for segment loading threads
Co-authored-by: Dylan <dwylie@spotx.tv>
* fix docs error: google to azure and hdfs to http
* fix docs error: indexSpecForIntermediatePersists of tuningConfig in hadoop-based batch part
* fix docs error: logParseExceptions of tuningConfig in hadoop-based batch part
* fix docs error: maxParseExceptions of tuningConfig in hadoop-based batch part
* IntelliJ inspection and checkstyle rule for "Collection.EMPTY_* field accesses replaceable with Collections.empty*()"
* Reverted checkstyle rule
* Added tests to pass CI
* Codestyle
* API to verify a datasource has the latest ingested data
* fix checkstyle
* fix spelling
* address comments
* fix checkstyle
* update docs
* fix tests
* fix doc
* address comments
* fix typo
* fix spelling
* address comments
* address comments
* fix typo in docs
* ROUND and having comparators correctly handle doubles
Double.NaN, Double.POSITIVE_INFINITY and Double.NEGATIVE_INFINITY are not
finite numbers. Because of this, they cannot be converted to BigDecimal;
attempting the conversion throws a NumberFormatException.
This change adds support for calculations that produce these numbers either
for use in the `ROUND` function or the HavingSpecMetricComparator by not
attempting to convert the number to a BigDecimal.
The bug in ROUND was first introduced in #7224 where we added the ability to
round to any decimal place. This PR changes the behavior back to using
`Math.round` if we recognize a number that cannot be converted to a
BigDecimal.
* Add tests and fix spellcheck
* update error message in ExpressionsTest
* Address comments
* fix up round for infinity
* round non numeric doubles returns a double
* fix spotbugs
* Update docs/misc/math-expr.md
* Update docs/querying/sql.md
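The guard described in the ROUND commit body can be sketched as follows. This is a minimal illustration of the first commit's approach (fall back to `Math.round` for values BigDecimal cannot represent), assuming a hypothetical helper `roundTo`. It is not Druid's actual implementation, and the later commits in this list adjust what is returned for NaN and the infinities.

```java
import java.math.BigDecimal;
import java.math.RoundingMode;

final class RoundSketch {
    // Hypothetical helper: rounds value to the given number of decimal places.
    static double roundTo(double value, int scale) {
        if (Double.isNaN(value) || Double.isInfinite(value)) {
            // BigDecimal.valueOf(value) would throw NumberFormatException here.
            // Math.round maps NaN to 0 and the infinities to
            // Long.MAX_VALUE / Long.MIN_VALUE.
            return Math.round(value);
        }
        return BigDecimal.valueOf(value)
                         .setScale(scale, RoundingMode.HALF_UP)
                         .doubleValue();
    }
}
```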
* Remove LegacyDataSource.
Its purpose was to enable deserialization of strings into TableDataSources.
But we can do this more straightforwardly with Jackson annotations.
* Slight test improvement.
* lpad and rpad functions deal with empty pad
Return null if the pad string used by the `lpad` and `rpad` functions is
an empty string
* Fix rpad
* Match PostgreSQL behavior in SQL compliant null handling mode
* Match PostgreSQL behavior for negative pad length
* address review comments
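The empty-pad rule from the lpad/rpad commit body can be sketched as below. The class `PadSketch` and its methods are hypothetical stand-ins, not Druid's functions. Only the null return for an empty pad comes from the commit text; truncating to `len` and treating a negative `len` as zero are assumptions modeled on PostgreSQL.

```java
final class PadSketch {
    // Hypothetical lpad: left-pads base to len characters using pad.
    static String lpad(String base, int len, String pad) {
        if (pad.isEmpty()) {
            return null; // nothing to pad with
        }
        if (len <= base.length()) {
            // Assumption: truncate on the right; negative len yields "".
            return base.substring(0, Math.max(len, 0));
        }
        StringBuilder sb = new StringBuilder(len);
        int padLen = len - base.length();
        for (int i = 0; i < padLen; i++) {
            sb.append(pad.charAt(i % pad.length()));
        }
        return sb.append(base).toString();
    }

    // Hypothetical rpad: right-pads base to len characters using pad.
    static String rpad(String base, int len, String pad) {
        if (pad.isEmpty()) {
            return null; // nothing to pad with
        }
        if (len <= base.length()) {
            return base.substring(0, Math.max(len, 0));
        }
        StringBuilder sb = new StringBuilder(len).append(base);
        int padLen = len - base.length();
        for (int i = 0; i < padLen; i++) {
            sb.append(pad.charAt(i % pad.length()));
        }
        return sb.toString();
    }
}
```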
* Fix broadcast rule drop and docs
* Remove racy test check
* Don't drop non-broadcast segments on tasks, add overshadowing handling
* Don't use realtimes for overshadowing
* Fix dropping for ingestion services
Commit 771870ae2d312d643e6d98f3d0af8a9618af9681 removed constructor
arguments from the rules. Therefore multiple parameters of the test are
now the same and can be removed.
The parameters generator uses CompressionStrategy.noNoneValues() instead
of CompressionStrategyTest.compressionStrategies() which wrapped each
strategy in a single element array. This improves readability of the
test.
* make joinables closeable
* tests and adjustments
* refactor to make join stuffs implement ReferenceCountedObject instead of Closeable, more tests
* fixes
* javadocs and stuff
* fix bugs
* more test
* fix lgtm alert
* simplify
* fixup javadoc
* review stuffs
* safeguard against exceptions
* i hate this checkstyle rule
* make IndexedTable extend Closeable
* remove incorrect and unnecessary overrides from BooleanVectorValueMatcher
* add test case
* add unit tests for ... part of VectorValueMatcherColumnProcessorFactory
* Update VectorValueMatcherColumnProcessorFactoryTest.java
* move benchmark data generator into druid-processing, add a GeneratorInputSource to fill up a cluster with data
* newlines
* make test coverage not fail maybe
* remove useless test
* Update pom.xml
* Update GeneratorInputSourceTest.java
* less passive aggressive test names