* Allow for appending tasks to co-exist with each other.
Add a config parameter for appending tasks to allow them to
use a SHARED lock. This will allow multiple appending tasks
to add segments to the same datasource at the same time.
This config should actually be the default, but it is added
as a config to enable a smooth transition/validation in
production settings before forcing it as the default
behavior going forward.
This change leverages the TaskLockType.SHARED that existed
previously, this used to carry the semantics of a READ lock,
which was "escalated" when the task wanted to actually
persist the segment. As of many moons before this diff, the
SHARED lock had stopped being used but was still piped into
the code. It turns out that with a few tweaks, it can be
adjusted to be a shared lock for append tasks to allow them
all to write to the same datasource, so that is what this does.
* Can only reuse the shared lock if using the same groupId
* Need to serialize out the task lock type
* Adjust Unit tests to expect new field in JSON
* Remove use of deprecated PMD ruleset
This fixes annoying warnings we were getting during build.
- Use a custom PMD ruleset, since the built-in one uses deprecated rules.
- UnnecessaryImport replaces most of the deprecated rules
- Update maven-pmd-plugin to 3.15
- Exclude ancient asm version from caliper, since this was causing
incompatibility warnings with PMD and could also affect our tests runs
in unexpected ways
In this PR, we will now return 400 instead of 500 when SQL query cannot be planned. I also fixed a bug where error messages were not getting sent to the users in case the rules throw UnsupportSQLQueryException.
DruidSchema consists of a concurrent HashMap of DataSource -> Segement -> AvailableSegmentMetadata. AvailableSegmentMetadata contains RowSignature of the segment, and for each segment, a new object is getting created. RowSignature is an immutable class, and hence it can be interned, and this can lead to huge savings of memory being used in broker, since a lot of the segments of a table would potentially have same RowSignature.
* replace deprecated forbidden-apis config failOnUnresolvableSignatures
with ignoreSignaturesOfMissingClasses which avoids warnings for
classes not present in a particular sub-module
* fix incorrect signature for Files.createTempDirectory
* clean up the balancing code around the batched vs deprecated way of sampling segments to balance
* fix docs, clarify comments, add deprecated annotations to legacy code
* remove unused variable
* update dynamic config dialog in console to state percentOfSegmentsToConsiderPerMove deprecated
* fix dynamic config text for percentOfSegmentsToConsiderPerMove
* run prettier to cleanup coordinator-dynamic-config.tsx changes
* update jest snapshot
* update documentation per review feedback
* fix bug where queries fail immediately when timeout is 0 instead of using default timeout
* fix to use serverside max
* more better
* less flaky test
* oops
* Metrics docs layout and info about query/bytes
Knowledge transfer from https://groups.google.com/g/druid-user/c/8fiflmSEoTQ - updated the layout of the Metrics part, adding links between docs pages.
Update index.md
Amended typo
* Update docs/configuration/index.md
Co-authored-by: Charles Smith <techdocsmith@gmail.com>
* Update docs/configuration/index.md
Co-authored-by: Charles Smith <techdocsmith@gmail.com>
* Update docs/operations/metrics.md
Co-authored-by: Charles Smith <techdocsmith@gmail.com>
* Update docs/operations/metrics.md
Co-authored-by: Charles Smith <techdocsmith@gmail.com>
* Update docs/operations/metrics.md
Co-authored-by: Charles Smith <techdocsmith@gmail.com>
* Feedback applied
Http --> HTTP and moved content / removed >
* Update docs/configuration/index.md
Co-authored-by: Charles Smith <techdocsmith@gmail.com>
* Update docs/configuration/index.md
Co-authored-by: Charles Smith <techdocsmith@gmail.com>
Co-authored-by: Charles Smith <techdocsmith@gmail.com>
* Update rollup.md
Added SE tip around roll-up.
* Update docs/ingestion/rollup.md
Co-authored-by: Charles Smith <techdocsmith@gmail.com>
Co-authored-by: Charles Smith <techdocsmith@gmail.com>
This PR fixes an issue in which if a lookup is configured incorreclty; does not serialize properly when being pulled by peon node, it causes the task to fail. The failure occurs because the peon and other leaf nodes (broker, historical), have retry logic that continues to retry the lookup loading for 3 minutes by default. The http listener thread on the peon task is not started until lookup loading completes, by default, the overlord waits 1 minute by default, to communicate with the peon task to get the task status, after which is orders the task to shut down, causing the ingestion task to fail.
To fix the issue, we catch the exception serialization error, and do not retry. Also fixed an issue in which a bad lookup config interferes with any other good lookup configs from being loaded.
This PR does two things
1. It adds the capability to surface missing features in SQL to users - The calcite planner will explore through multiple rules to convert a logical SQL query to a druid native query. Some rules change the shape of the query itself, optimize it and some rules are responsible for translating the query into a druid native query. These are DruidQueryRule, DruidOuterQueryRule, DruidJoinRule, DruidUnionDataSourceRule, DruidUnionRule etc. These rules will look at SQL and will do the necessary transformation. But if the rule can't transform the query, it returns back the control to the calcite planner without recording why was it not able to transform. E.g. there is a join query with a non-equal join condition. DruidJoinRule will look at the condition, see that it is not supported, and return back the control. The reason can be that a query can be planned in many different ways so if one rule can't parse it, the query may still be parseable by other rules. In this PR, we are intercepting these gaps and passing them back to the user if the query could not be planned at all.
2. The said capability has been used to generate actionable errors for some common unsupported SQL features. However, not all possible errors are covered and we can keep adding more in the future.
* Refactor ResponseContext
Fixes a number of issues in preparation for request trailers
and the query profile.
* Converts keys from an enum to classes for smaller code
* Wraps stored values in functions for easier capture for other uses
* Reworks the "header squeezer" to handle types other than arrays.
* Uses metadata for visibility, and ability to compress,
to replace ad-hoc code.
* Cleans up JSON serialization for the response context.
* Other miscellaneous cleanup.
* Handle unknown keys in deserialization
Also, make "Visibility" into a boolean.
* Revised comment
* Renamd variable
Fixes#10744
Fixes:
./bin/node.sh: 44: ./bin/node.sh: source: not found
Could not find java - please run /opt/druid/apache-druid-0.20.0/bin/verify-java to confirm it is installed.
Druid currently has 2 serverViews, regular serverView and filtered serverView. The regular serverView is used to monitor all segment announcements from all data nodes (historicals, tasks, indexers). The filtered serverView is used when you want to watch segment announcements from particular tiers. Since these server views keep track of different sets of druidServers and segments in memory, they should be maintained separately. However, they currently share the same name for their executorService, which can cause confusion and make debugging harder especially in the broker since it is using both serverViews, the filtered view for normal query processing and the regular view to serve the servers table (I'm unsure whether this is intended or whether this is a good behavior). This PR changes it to a more obvious name.
This PR also removes SingleServerInventoryView. This view was deprecated a long time ago and has not been documented at least since 0.13 (#6127). I also don't think this can be better in any case than BatchServerInventoryView. Finally, I merged AbstractCuratorServerInventoryView and BatchServerInventoryView as we no longer need AbstractCuratorServerInventoryView after SingleServerInventoryView is removed.
* Enable allocating segments at ALL granularity.
The main change is that Granularity.granularitiesFinerThan will return ALL if ALL
is passed in.
Allocating segments at ALL granularity is somewhat unconventional, but there
is nothing wrong with it, and it actually makes a lot of sense for tables that
are meant to be used for lookups or dimensions rather than main fact tables.
This change enables ALL segmentGranularity to work properly in appendToExisting
mode.
Also clarifies behavior in javadocs and tests.
* Move tests to improve coverage.
* Make nodeRole available during binding; add support for dynamic registration of DruidService
* fix checkstyle and test
* fix customRole test
* address comments
* add more javadoc
* Enhancements to IndexTaskClient.
1) Ability to use handlers other than StringFullResponseHandler. This
functionality is not used in production code yet, but is useful
because it will allow tasks to communicate with each other in
non-string-based formats and in streaming fashion. In the future,
we'll be able to use this to make task-to-task communication
more efficient.
2) Truncate server errors at 1KB, so long errors do not pollute logs.
3) Change error log level for retryable errors from WARN to INFO. (The
final error is still WARN.)
4) Harmonize log and exception messages to have a more consistent format.
* Additional tests and improvements.
* apply log file rolling strategy
* fix doc
Signed-off-by: frank chen <frank.chen021@outlook.com>
* Use absolute log path and allow spaces in log path
* Update log4j2 configuration
* apply FileAppender to ZooKeeper
* DO NOT redirect application's console log to file in supervisor
This PR fixes a problem where the com.sun.jndi.ldap.Connection tries to build BasicSecuritySSLSocketFactory when calling LDAPCredentialsValidator.validateCredentials since BasicSecuritySSLSocketFactory is in extension class loader and not visible to system classloader.
* allow `DruidSchema` to fallback to segment metadata type if typeSignature is null, to avoid producing incorrect SQL schema if broker is upgraded to 0.23 before historicals
* mmm, forbidden tests
changes:
* adds new config, druid.expressions.useStrictBooleans which make longs the official boolean type of all expressions
* vectorize logical operators and boolean functions, some only if useStrictBooleans is true
under "Aggregators", about the lgK setting, it said "Must be a power of 2 from 4 to 21 inclusively." 21 is not a power of 2, nor is 12, the given default. I think there may have been confusion because lgK represents log2 of K. We could say "K must be a power of 2...", or just say lgK must be between 4 and 21.
* Code cleanup from query profile project
* Fix spelling errors
* Fix Javadoc formatting
* Abstract out repeated test code
* Reuse constants in place of some string literals
* Fix up some parameterized types
* Reduce warnings reported by Eclipse
* Reverted change due to lack of tests
* Use intermediate-persist IndexSpec during multiphase merge.
The main change is the addition of an intermediate-persist IndexSpec
to the main "merge" method in IndexMerger. There are also a few minor
adjustments to the IndexMerger interface to encourage more harmonious
usage of its methods in the future.
* Additional changes inspired by the test coverage checker.
- Remove unused-in-production IndexMerger methods "append" and "convert".
- Add additional unit tests to UnifiedIndexerAppenderatorsManager.
* Additional adjustments.
* Even more additional adjustments.
* Test fixes.
Add a "guessAggregatorHeapFootprint" method to AggregatorFactory that
mitigates #6743 by enabling heap footprint estimates based on a specific
number of rows. The idea is that at ingestion time, the number of rows
that go into an aggregator will be 1 (if rollup is off) or will likely
be a small number (if rollup is on).
It's a heuristic, because of course nothing guarantees that the rollup
ratio is a small number. But it's a common case, and I expect this logic
to go wrong much less often than the current logic. Also, when it does
go wrong, users can fix it by lowering maxRowsInMemory or
maxBytesInMemory. The current situation is unintuitive: when the
estimation goes wrong, users get an OOME, but actually they need to
*raise* these limits to fix it.
* Add support for custom reset condition & support for other args to have defaults to make the method api consistent
* Add support for custom reset condition to InputEntity
* Fix test names
* Clarifying comments to why we need to read the message's content to identify S3's resettable exception
* Add unit test to verify custom resettable condition for S3Entity
* Provide a way to customize retries since they are expensive to test
Currently, when we try to do EXPLAIN PLAN FOR, it returns the structure of the SQL parsed (via Calcite's internal planner util), which is verbose (since it tries to explain about the nodes in the SQL, instead of the Druid Query), and not representative of the native Druid query which will get executed on the broker side.
This PR aims to change the format when user tries to EXPLAIN PLAN FOR for queries which are executed by converting them into Druid's native queries (i.e. not sys schemas).
* Use 404 instead of 400
* Use 404 instead of 400
* Add UT test cases
* Add IT testcases
* add UT for task resource filter
Signed-off-by: frank chen <frank.chen021@outlook.com>
* Using org.testing.Assert instead of org.junit.Assert
* Resolve comments and fix test
* Fix test
* Fix tests
* Resolve comments
Add the ability to pass time column in first/last aggregator (and latest/earliest SQL functions). It is to support cases where the time to query upon is stored as a part of a column different than __time. Also, some other logical time column can be specified.
* Consolidate a bunch of ad-hoc segments metadata SQL; fix some bugs.
This patch gathers together a variety of SQL from SqlSegmentsMetadataManager
and IndexerSQLMetadataStorageCoordinator into a new class SqlSegmentsMetadataQuery.
It focuses on SQL related to retrieving segment payloads and marking
segments used and unused.
In addition to cleaning up the code a bit, this patch also fixes a bug
with years before 0 or after 9999. The prior SQL did not work properly
because dates outside this range cannot be compared as strings. The new
code does work for these far-past and far-future years.
So, if you're ever interested in using Druid to analyze things from
ancient Babylon, you better apply this patch first!
* Fix test compiling.
* Fixes and improvements.
* Fix forbidden API.
* Additional fixes.
Simplifies logic for callers that only want to get a list of all the
column names, or column names and types. Updated callers SegmentAnalyzer,
HashJoinSegmentStorageAdapter, and DruidSegmentReader.