UniformGranularityTest's test to test a large number of intervals
runs through 10 years of 1 second intervals. This pushes a lot of
stuff through IntervalIterator and shows up in terms of test
runtime as one of the hottest tests. Most of the time is going to
constructing jodatime objects because it is doing things with
DateTime objects instead of millis. Change the calls to use
millis instead and things go faster.
If a server is removed during `HttpServerInventoryView.serverInventoryInitialized`,
the initialization gets stuck as this server is never synced. The method eventually times
out (default 250s).
Fix: Mark a server as stopped if it is removed. `serverInventoryInitialized` only waits for
non-stopped servers to sync.
Other changes:
- Add new metrics for better debugging of slow broker/coordinator startup
- `segment/serverview/sync/healthy`: whether the server view is syncing properly with a server
- `segment/serverview/sync/unstableTime`: time for which sync with a server has been unstable
- Clean up logging in `HttpServerInventoryView` and `ChangeRequestHttpSyncer`
- Minor refactor for readability
- Add utility class `Stopwatch`
- Add tests and stubs
After #13197 , several coordinator configs are now redundant as they are not being
used anymore, neither with `smartSegmentLoading` nor otherwise.
Changes:
- Remove dynamic configs `emitBalancingStats`: balancer error stats are always
emitted, debug stats can be logged by using `debugDimensions`
- `useBatchedSegmentSampler`, `percentOfSegmentsToConsiderPerMove`:
batched segment sampling is always used
- Add test to verify deserialization with unknown properties
- Update `CoordinatorRunStats` to always track stats, this can be optimized later.
with a RuntimeException. Now the RuntimeException is being replaced by an user facing DruidException of Invalid category which would allow calcite not to throw an uncategorized exception.
The latest topic offsets are polled frequently and used to determine the lag based on the current offsets. However, when the offsets are stale (which can happen due to connection issues commonly), we may see a negative lag .
This PR prevents emission of metrics when the offsets are stale and at least one of the partitions has a negative lag.
Currently, IntelliJ-inspections are run sequentially w.r.t static-checks, thereby increasing build time. Moving IntelliJ-inspections to a separate job to improve builds time and get a quick insight into such issues early on.
* Changes the get results API in SqlStatementResource to take a page number instead of row/offset.
* Adds "pages" containing information on each page to the results status.
* Update the "numRows" and "sizeInByes" to "numTotalRows" and "totalSizeInBytes" respectively, which are totalled across all pages.
* combine string column implementations
changes:
* generic indexed, front-coded, and auto string columns now all share the same column and index supplier implementations
* remove CachingIndexed implementation, which I think is largely no longer needed by the switch of many things to directly using ByteBuffer, avoiding the cost of creating Strings
* remove ColumnConfig.columnCacheSizeBytes since CachingIndexed was the only user
1) Fix a problem where the fault wasn't reported when the left-hand side
had too many buffered frames. (Instead, frames continued to be buffered,
eventually running the server out of memory.)
2) Always update the mark when rewinding isn't necessary. It fixes a problem where
frames would be needlessly buffered when there isn't a key match across
the two sides.
3) Memory reserved for building the trackers now change based on the heap sized
* Add "stringEncoding" parameter to DataSketches HLL.
Builds on the concept from #11172 and adds a way to feed HLL sketches
with UTF-8 bytes.
This must be an option rather than always-on, because prior to this
patch, HLL sketches used UTF-16LE encoding when hashing strings. To
remain compatible with sketch images created prior to this patch -- which
matters during rolling updates and when reading sketches that have been
written to segments -- we must keep UTF-16LE as the default.
Not currently documented, because I'm not yet sure how best to expose
this functionality to users. I think the first place would be in the SQL
layer: we could have it automatically select UTF-8 or UTF-16LE when
building sketches at query time. We need to be careful about this, though,
because UTF-8 isn't always faster. Sometimes, like for the results of
expressions, UTF-16LE is faster. I expect we will sort this out in
future patches.
* Fix benchmark.
* Fix style issues, improve test coverage.
* Put round back, to make IT updates easier.
* Fix test.
* Fix issue with filtered aggregators and add test.
* Use DS native update(ByteBuffer) method. Improve test coverage.
* Add another suppression.
* Fix ITAutoCompactionTest.
* Update benchmarks.
* Updates.
* Fix conflict.
* Adjustments.
In these other cases, stick to plain "filter". This simplifies lots of
logic downstream, and doesn't hurt since we don't have intervals-specific
optimizations outside of tables.
Fixes an issue where we couldn't properly filter on a column from an
external datasource if it was named __time.
Mocks generally have state and should not be static. In particular, the
"Yielder" included in one of the mocks can only be iterated once, which
made the test suite order-dependent.
* Support complex variance object inputs for variance SQL agg function
* Add test
* Include complexTypeChecker, address PR comments
* Checkstyle, javadoc link
Currently, github notification emails to commits@druid.apache.org have
a wide variety of subject lines. This foils email client threading and
makes the firehose of emails hard to follow. This is an attempt,
inspired by the config for plc4x, to make the emails easier to follow.
After this change, it may even make sense to send some to dev@ instead
of commits@.
* Properly read SQL-compatible segments in default-value mode.
Main changes:
1) Dictionary-encoded and front-coded string columns: in default-value
mode, detect cases where a dictionary has the empty string in it, then
either combine it with null (if null is present) or replace it with
null (if null is not present).
2) Numeric nullable columns: in default-value mode, ignore the null
value bitmap. This causes all null numbers to be read as zeroes.
Testing strategy:
1) Add a mmappedWithSqlCompatibleNulls case to BaseFilterTest that
writes segments under SQL-compatible mode, and reads them under
default-value mode.
2) Unit tests for the new wrapper classes (CombineFirstTwoEntriesIndexed,
CombineFirstTwoValuesColumnarInts, CombineFirstTwoValuesColumnarMultiInts,
CombineFirstTwoValuesIndexedInts).
* Fix a mistake, use more singlethreadedness.
* WIP
* Tests, improvements.
* Style.
* See Spot bug.
* Remove unused method.
* Address review comments.
1) Read bitmaps even if we don't retain them.
2) Combine StringFrontCodedDictionaryEncodedColumn and ScalarStringDictionaryEncodedColumn.
* Add missing tests.
This PR aims to expose a new API called
"@path("/druid/v2/sql/statements/")" which takes the same payload as the current "/druid/v2/sql" endpoint and allows users to fetch results in an async manner.
* Fix another infinite loop and remove Mockito usage
The ConfigManager objects were `started()` without ever being
stopped. This scheduled a poll call that never-ended, to make
matters worse, the poll interval was set to 0 ms, making an
infinite poll with 0 sleep, i.e. an infinite loop.
Also introduce test classes and remove usage of mocks
* Checkstyle
Adds support for automatic cleaning of a "query-results" directory in durable storage. This directory will be cleaned up only if the task id is not known to the overlord. This will allow the storage of query results after the task has finished running.
* Cache parsed expressions and binding analysis in more places.
Main changes:
1) Cache parsed and analyzed expressions within PlannerContext for a
single SQL query.
2) Cache parsed expressions together with input binding analysis using
a new class AnalyzeExpr.
This speeds up SQL planning, because SQL planning involves parsing
analyzing the same expression strings over and over again.
* Fixes.
* Fix style.
* Fix test.
* Simplify: get rid of AnalyzedExpr, focus on caching.
* Rename parse -> parseExpression.
Changes to `cost` strategy:
- In every `ServerHolder`, track the number of segments per datasource per interval
- Perform cost computations for a given interval just once, and then multiply by a constant
factor to account for the total segment count in that interval
- Do not perform joint cost computations with segments that are outside the compute interval
(± 45 days) for the segment being considered for move
- Remove metrics `segment/cost/*` as they were coordinator killers! Turning on these metrics
(by setting `emitBalancingStats` to true) has often caused the coordinator to be stuck for hours.
Moreover, they are too complicated to decipher and do not provide any meaningful insight into
a Druid cluster.
- Add new simpler metrics `segment/balancer/compute/*` to track cost computation time,
count and errors.
Other changes:
- Remove flaky test from `CostBalancerStrategyTest`.
- Add tests to verify that computed cost has remained unchanged
- Remove usages of mock `BalancerStrategy` from `LoadRuleTest`, `BalanceSegmentsTest`
- Clean up `BalancerStrategy` interface
Changes:
- Throw an `InsertCannotAllocateSegmentFault` if the allocated segment is not aligned with
the requested granularity.
- Tests to verify new behaviour
* Updates: use the target table directly, sanitized replace time chunks and clustered by cols.
* Add DruidSqlParserUtil and tests.
* minor refactor
* Use SqlUtil.isLiteral
* Throw ValidationException if CLUSTERED BY column descending order is specified.
- Fails query planning
* Some more tests.
* fixup existing comment
* Update comment
* checkstyle fix: remove unused imports
* Remove InsertCannotOrderByDescendingFault and deprecate the fault in readme.
* minor naming
* move deprecated field to the bottom
* update docs.
* add one more example.
* Collapsible query and result
* checkstyle fixes
* Code cleanup
* order by changes
* conditionally set attributes only for explain queries.
* Cleaner ordinal check.
* Add limit test and update javadoc.
* Commentary and minor adjustments.
* Checkstyle fixes.
* One more checkArg.
* add unexpected kind to exception.
* MSQ: Change default clusterStatisticsMergeMode to SEQUENTIAL.
This is an undocumented parameter that controls how cluster-by statistics
are merged. In PARALLEL mode, statistics are gathered from workers all
at once. In SEQUENTIAL mode, statistics are gathered time chunk by time
chunk. This improves accuracy for jobs with many time chunks, and reduces
memory usage.
The main downside of SEQUENTIAL is that it can take longer, but in most
situations I've seen, PARALLEL is only really usable in cases where the
sketches are small enough that SEQUENTIAL would also run relatively
quickly. So it seems like SEQUENTIAL is a better default.
* Switch off-test from SEQUENTIAL to PARALLEL.
* Fix sequential merge for situations where there are no time chunks at all.
* Add a couple more tests.
* Add OverlordStatusMonitor and CoordinatorStatusMonitor to monitor service leader status
* make the monitor more general
* resolve conflict
* use Supplier pattern to provide metrics
* reformat code and doc
* move service specific tag to dimension
* minor refine
* update doc
* reformat code
* address comments
* remove declared exception
* bind HeartbeatSupplier conditionally in Coordinator
Users can now add a guardrail to prevent subquery’s results from exceeding the set number of bytes by setting druid.server.http.maxSubqueryRows in Broker's config or maxSubqueryRows in the query context. This feature is experimental for now and would default back to row-based limiting in case it fails to get the accurate size of the results consumed by the query.
* SqlResults: Coerce arrays to lists for VARCHAR.
Useful for STRING_TO_MV, which returns VARCHAR at the SQL layer and an
ExprEval with String[] at the native layer.
* Fix style.
* Improve test coverage.
* Remove unnecessary throws.
Recently, we have seen flakiness in these two tests, apparently due to
computations based on Runtime.getRuntime().maxMemory() differing during
static initialization and in the actual tests. I can't think of a reason
why this would be happening, but anyway, this patch switches the tests to
use the statics instead of recomputing Runtime.getRuntime().maxMemory().
* SQL OperatorConversions: Introduce.aggregatorBuilder, allow CAST-as-literal.
Four main changes:
1) Provide aggregatorBuilder, a more consistent way of defining the
SqlAggFunction we need for all of our SQL aggregators. The mechanism
is analogous to the one we already use for SQL functions
(OperatorConversions.operatorBuilder).
2) Allow CASTs of constants to be considered as "literalOperands". This
fixes an issue where various of our operators are defined with
OperandTypes.LITERAL as part of their checkers, which doesn't allow
casts. However, in these cases we generally _do_ want to allow casts.
The important piece is that the value must be reducible to a constant,
not that the SQL text is literally a literal.
3) Update DataSketches SQL aggregators to use the new aggregatorBuilder
functionality. The main user-visible effect here is [2]: the aggregators
would now accept, for example, "CAST(0.99 AS DOUBLE)" as a literal
argument. Other aggregators could be updated in a future patch.
4) Rename "requiredOperands" to "requiredOperandCount", because the
old name was confusing. (It rhymes with "literalOperands" but the
arguments mean different things.)
* Adjust method calls.
* Clarify compaction docs.
The prior wording made it sound like segmentGranularity, queryGranularity,
and rollup are always required for granularitySpec. They are not required,
but they are strongly recommended. The adjusted wording hopefully does
a better job of making that clear.
* Fix link.
* Wording adjustments.