Follow up changes to #12599
Changes:
- Rename column `used_flag_last_updated` to `used_status_last_updated`
- Remove new CLI tool `UpdateTables`.
- We already have a `CreateTables` with similar functionality, which should be able to
handle update cases too.
- Any user running the cluster for the first time should either just have `connector.createTables`
enabled or run `CreateTables` which should create tables at the latest version.
- For instance, the `UpdateTables` tool would be inadequate when a new metadata table has
been added to Druid, and users would have to run `CreateTables` anyway.
- Remove `upgrade-prep.md` and include that info in `metadata-init.md`.
- Fix log messages to adhere to Druid style
- Use lambdas
* Add new configurable buffer period to create gap between mark unused and kill of segment
* Changes after testing
* fixes and improvements
* changes after initial self review
* self review changes
* update sql statement that was lacking last_used
* shore up some code in SqlMetadataConnector after self review
* fix derby compatibility and improve testing/docs
* fix checkstyle violations
* Fixes post merge with master
* add some unit tests to improve coverage
* ignore test coverage on new UpdateTools cli tool
* another attempt to ignore UpdateTables in coverage check
* change column name to used_flag_last_updated
* fix a method signature after column name switch
* update docs spelling
* Update spelling dictionary
* Fixing up docs/spelling and integrating altering tasks table with my alteration code
* Update NULL values for used_flag_last_updated in the background
* Remove logic to allow segs with null used_flag_last_updated to be killed regardless of bufferPeriod
* remove unneeded things now that the new column is automatically updated
* Test new background row updater method
* fix broken tests
* fix create table statement
* cleanup DDL formatting
* Revert adding columns to entry table by default
* fix compilation issues after merge with master
* discovered and fixed metastore inserts that were breaking integration tests
* fixup forgotten insert by using pattern of sharing now timestamp across columns
* fix issue introduced by merge
* fixup after merge with master
* add some directions to docs in the case of segment table validation issues
Co-authored-by: Charles Smith <techdocsmith@gmail.com>
Co-authored-by: Victoria Lim <vtlim@users.noreply.github.com>
Co-authored-by: Victoria Lim <lim.t.victoria@gmail.com>
* Various documentation updates.
1) Split out "data management" from "ingestion". Break it into thematic pages.
2) Move "SQL-based ingestion" into the Ingestion category. Adjust content so
all conceptual content is in concepts.md and all syntax content is in reference.md.
Shorten the known issues page to the most interesting ones.
3) Add SQL-based ingestion to the ingestion method comparison page. Remove the
index task, since index_parallel is just as good when maxNumConcurrentSubTasks: 1.
4) Rename various mentions of "Druid console" to "web console".
5) Add additional information to ingestion/partitioning.md.
6) Remove a mention of Tranquility.
7) Remove a note about upgrading to Druid 0.10.1.
8) Remove no-longer-relevant task types from ingestion/tasks.md.
9) Move ingestion/native-batch-firehose.md to the hidden section. It was previously deprecated.
10) Move ingestion/native-batch-simple-task.md to the hidden section. It is still linked in some
places, but it isn't very useful compared to index_parallel, so it shouldn't take up space
in the sidebar.
11) Make all br tags self-closing.
12) Certain other cosmetic changes.
13) Update to node-sass 7.
* make travis use node12 for docs
Co-authored-by: Vadim Ogievetsky <vadim@ogievetsky.com>
* remove things that do not apply
* fix more things
* pin node to a working version
* fix
* fixes
* known issues tidy up
* revert auto formatting changes
* remove management-uis page which is 100% lies
* don't mention the Coordinator console (that no longer exits)
* goodies
* fix typo
Added link to the relevant section of the Basic Cluster Tuning page on each process page.
This is in order to improve access to this information, which is not easy to find through search or nav.
* Update caching.md
Knowledge from https://the-asf.slack.com/archives/CJ8D1JTB8/p1597781107153900
Update caching.md
A few additional updates OTBO https://the-asf.slack.com/archives/CJ8D1JTB8/p1608669046041300
* Update caching.md
Typos
* Amendments on the segment cache
Significant updates on content around the segment cache, pull process, and in-memory cache
* Update docs/design/historical.md
Co-authored-by: Charles Smith <techdocsmith@gmail.com>
* Update docs/design/historical.md
Co-authored-by: Charles Smith <techdocsmith@gmail.com>
* Update docs/design/historical.md
Co-authored-by: Charles Smith <techdocsmith@gmail.com>
* Update docs/design/historical.md
Co-authored-by: Charles Smith <techdocsmith@gmail.com>
* Update docs/design/historical.md
Co-authored-by: Charles Smith <techdocsmith@gmail.com>
* Update docs/design/historical.md
Co-authored-by: Charles Smith <techdocsmith@gmail.com>
* Update docs/querying/caching.md
Co-authored-by: Charles Smith <techdocsmith@gmail.com>
* Update docs/querying/caching.md
Co-authored-by: Charles Smith <techdocsmith@gmail.com>
* Update docs/querying/caching.md
Co-authored-by: Charles Smith <techdocsmith@gmail.com>
* Update docs/querying/caching.md
Co-authored-by: Charles Smith <techdocsmith@gmail.com>
* Update docs/design/historical.md
Co-authored-by: Charles Smith <techdocsmith@gmail.com>
* Update docs/design/historical.md
Co-authored-by: Charles Smith <techdocsmith@gmail.com>
* Update docs/design/historical.md
Co-authored-by: Charles Smith <techdocsmith@gmail.com>
* Update docs/operations/basic-cluster-tuning.md
Co-authored-by: Charles Smith <techdocsmith@gmail.com>
* Update docs/querying/caching.md
Co-authored-by: Charles Smith <techdocsmith@gmail.com>
* Update docs/querying/caching.md
Co-authored-by: Charles Smith <techdocsmith@gmail.com>
* Update docs/querying/caching.md
Co-authored-by: Charles Smith <techdocsmith@gmail.com>
* Update docs/querying/caching.md
Co-authored-by: Charles Smith <techdocsmith@gmail.com>
* Update docs/querying/caching.md
Co-authored-by: Charles Smith <techdocsmith@gmail.com>
* Update docs/operations/basic-cluster-tuning.md
Co-authored-by: Charles Smith <techdocsmith@gmail.com>
* Update basic-cluster-tuning.md
typo
* Update docs/querying/caching.md
Co-authored-by: Charles Smith <techdocsmith@gmail.com>
* Whole-query caching update
Made more succinct and removed specific config to change.
* Update docs/design/historical.md
Co-authored-by: Charles Smith <techdocsmith@gmail.com>
* refactor and link fixes
* add sql docs to left nav
* code format for needle
* updated web console script
* link fixes
* update earliest/latest functions
* edits for grammar and style
* more link fixes
* another link
* update with #12226
* update .spelling file
* Rename field, fix router documentation
* Add more lines to doc
* Apply doc suggestions from code review
Co-authored-by: Charles Smith <techdocsmith@gmail.com>
Co-authored-by: Charles Smith <techdocsmith@gmail.com>
This PR adds a new property druid.router.sql.enable which allows the
Router to handle SQL queries when set to true.
This change does not affect Avatica JDBC requests and they are still routed
by hashing the Connection ID.
To allow parsing of the request object as a SqlQuery (contained in module druid-sql),
some classes have been moved from druid-server to druid-services with
the same package name.
This change allows the selection of a specific broker service (or broker tier) by the Router.
The newly added ManualTieredBrokerSelectorStrategy works as follows:
Check for the parameter brokerService in the query context. If this is a valid broker service, use it.
Check if the field defaultManualBrokerService has been set in the strategy. If this is a valid broker service, use it.
Move on to the next strategy
* Fixes and tests related to the Indexer process.
Three bugs fixed:
1) Indexers would not announce themselves as segment servers if they
did not have storage locations defined. This used to work, but was
broken in #9971. Fixed this by adding an "isSegmentServer" method
to ServerType and updating SegmentLoadDropHandler to always announce
if this method returns true.
2) Certain batch task types were written in a way that assumed "isReady"
would be called before "run", which is not guaranteed. In particular,
they relied on it in order to initialize "taskLockHelper". Fixed this
by updating AbstractBatchIndexTask to ensure "isReady" is called
before "run" for these tasks.
3) UnifiedIndexerAppenderatorsManager did not properly handle complex
datasources. Introduced DataSourceAnalysis in order to fix this.
Test changes:
1) Add a new "docker-compose.cli-indexer.yml" config that spins up an
Indexer instead of a MiddleManager.
2) Introduce a "USE_INDEXER" environment variable that determines if
docker-compose will start up an Indexer or a MiddleManager.
3) Duplicate all the jdk8 tests and run them in both MiddleManager and
Indexer mode.
4) Various adjustments to encourage fail-fast errors in the Docker
build scripts.
5) Various adjustments to speed up integration tests and reduce memory
usage.
6) Add another Mac-specific approach to determining a machine's own IP.
This was useful on my development machine.
7) Update segment-count check in ITCompactionTaskTest to eliminate a
race condition (it was looking for 6 segments, which only exist
together briefly, until the older 4 are marked unused).
Javadoc updates:
1) AbstractBatchIndexTask: Added javadocs to determineLockGranularityXXX
that make it clear when taskLockHelper will be initialized as a side
effect. (Related to the second bug above.)
2) Task: Clarified that "isReady" is not guaranteed to be called before
"run". It was already implied, but now it's explicit.
3) ZkCoordinator: Clarified deprecation message.
4) DataSegmentServerAnnouncer: Clarified deprecation message.
* Fix stop_cluster script.
* Fix sanity check in script.
* Fix hashbang lines.
* Test and doc adjustments.
* Additional tests, and adjustments for tests.
* Split ITs back out.
* Revert change to druid_coordinator_period_indexingPeriod.
* Set Indexer capacity to match MM.
* Bump up Historical memory.
* Bump down coordinator, overlord memory.
* Bump up Broker memory.
* Add availability and consistency docs.
Describes transactional ingestion and atomic replacement. Also, this patch
deletes some bad advice from the javadocs for SegmentTransactionalInsertAction.
* Fix missing word.
* Filter http requests by http method
Add a config that allows a user which http methods to allow against their
Druid server.
Druid will only accept http requests with the method: GET, PUT, POST, DELETE
and OPTIONS.
If a Druid admin wants to allow other methods, they can do so by using the
ServerConfig#allowedHttpMethods config.
If a Druid user would like to disallow OPTIONS, this can be done by changing
the AuthConfig#allowUnauthenticatedHttpOptions config
* Exclude OPTIONS from always supported HTTP methods
Add HEAD as an allowed method for web console e2e tests
* fix docs
* fix security IT
* Actually fix the web console e2e tests
* Ignore icode coverage for nitialization classes
* code review
* Druid user permissions apply in the console
* Update index.md
* noting user warning in console page; some minor shuffling
* noting user warning in console page; some minor shuffling 1
* touchups
* link checking fixes
* Updated per suggestions
* Refresh query docs.
Larger changes:
- New doc: querying/datasource.md describes the various kinds of
datasources you can use, and has examples for both SQL and native.
- New doc: querying/query-execution.md describes how native queries
are executed at a high level. It doesn't go into the details of specific
query engines or how queries run at a per-segment level. But I think it
would be good to add or link that content here in the future.
- Refreshed doc: querying/sql.md updated to refer to joins, reformatted
a bit, added a new "Query translation" section that explains how
queries are translated from SQL to native, and removed configuration
details (moved to configuration/index.md).
- Refreshed doc: querying/joins.md updated to refer to join datasources.
Smaller changes:
- Add helpful banners to the top of query documentation pages telling
people whether a given page describes SQL, native, or both.
- Add SQL metrics to operations/metrics.md.
- Add some color and cross-links in various places.
- Add native query component docs to the sidebar, and renamed them so
they look nicer.
- Remove Select query from the sidebar.
- Fix Broker SQL configs in configuration/index.md. Remove them from
querying/sql.md.
- Combined querying/searchquery.md and querying/searchqueryspec.md.
* Updates.
* Fix numbering.
* Fix glitches.
* Add new words to spellcheck file.
* Assorted changes.
* Further adjustments.
* Add missing punctuation.
* Reconcile terminology and method naming to 'used/unused segments'; Don't use terms 'enable/disable data source'; Rename MetadataSegmentManager to MetadataSegments; Make REST API methods which mark segments as used/unused to return server error instead of an empty response in case of error
* Fix brace
* Import order
* Rename withKillDataSourceWhitelist to withSpecificDataSourcesToKill
* Fix tests
* Fix tests by adding proper methods without interval parameters to IndexerMetadataStorageCoordinator instead of hacking with Intervals.ETERNITY
* More aligned names of DruidCoordinatorHelpers, rename several CoordinatorDynamicConfig parameters
* Rename ClientCompactTaskQuery to ClientCompactionTaskQuery for consistency with CompactionTask; ClientCompactQueryTuningConfig to ClientCompactionTaskQueryTuningConfig
* More variable and method renames
* Rename MetadataSegments to SegmentsMetadata
* Javadoc update
* Simplify SegmentsMetadata.getUnusedSegmentIntervals(), more javadocs
* Update Javadoc of VersionedIntervalTimeline.iterateAllObjects()
* Reorder imports
* Rename SegmentsMetadata.tryMark... methods to mark... and make them to return boolean and the numbers of segments changed and relay exceptions to callers
* Complete merge
* Add CollectionUtils.newTreeSet(); Refactor DruidCoordinatorRuntimeParams creation in tests
* Remove MetadataSegmentManager
* Rename millisLagSinceCoordinatorBecomesLeaderBeforeCanMarkAsUnusedOvershadowedSegments to leadingTimeMillisBeforeCanMarkAsUnusedOvershadowedSegments
* Fix tests, refactor DruidCluster creation in tests into DruidClusterBuilder
* Fix inspections
* Fix SQLMetadataSegmentManagerEmptyTest and rename it to SqlSegmentsMetadataEmptyTest
* Rename SegmentsAndMetadata to SegmentsAndCommitMetadata to reduce the similarity with SegmentsMetadata; Rename some methods
* Rename DruidCoordinatorHelper to CoordinatorDuty, refactor DruidCoordinator
* Unused import
* Optimize imports
* Rename IndexerSQLMetadataStorageCoordinator.getDataSourceMetadata() to retrieveDataSourceMetadata()
* Unused import
* Update terminology in datasource-view.tsx
* Fix label in datasource-view.spec.tsx.snap
* Fix lint errors in datasource-view.tsx
* Doc improvements
* Another attempt to please TSLint
* Another attempt to please TSLint
* Style fixes
* Fix IndexerSQLMetadataStorageCoordinator.createUsedSegmentsSqlQueryForIntervals() (wrong merge)
* Try to fix docs build issue
* Javadoc and spelling fixes
* Rename SegmentsMetadata to SegmentsMetadataManager, address other comments
* Address more comments