197 Commits

Author SHA1 Message Date
317brian 07883e311e
doc: fix unnecessary link (#13785)
CI errors look unrelated to this change.
2023-02-21 17:34:46 -08:00
benkrug c6b1576fc1
Update clean-metadata-store.md (#13131)
Co-authored-by: Victoria Lim <vtlim@users.noreply.github.com>
2023-02-21 12:53:54 -08:00
Suneet Saldanha 714ac07b52
Allow users to add additional metadata to ingestion metrics (#13760)
* Allow users to add additional metadata to ingestion metrics

When submitting an ingestion spec, users may pass a map of metadata
in the ingestion spec config that will be added to ingestion metrics.

This will make it possible for operators to tag metrics with other
metadata that doesn't necessarily line up with the existing tags
like taskId.

Druid clusters that ingest these metrics can take advantage of the
nested data columns feature to process this additional metadata.

* rename to tags

* docs

* tests

* fix test

* make code cov happy

* checkstyle
2023-02-08 18:07:23 -08:00
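
The tagging behaviour described in #13760 can be pictured with a minimal sketch. It assumes a plain dictionary as the metric event; the function name `build_metric_event` and the event layout are hypothetical and are not Druid's emitter code. Only the idea of a user-supplied `tags` map comes from the commit message.

```python
def build_metric_event(metric, value, task_id, tags=None):
    """Return a metric event dict, attaching user-supplied tags when present.

    Hypothetical helper: the event layout is an assumption for illustration.
    """
    event = {"metric": metric, "value": value, "taskId": task_id}
    if tags:
        # Copy the map so later changes by the caller do not alter the event.
        event["tags"] = dict(tags)
    return event


event = build_metric_event(
    "ingest/events/processed", 120, "task-1", tags={"team": "ads", "env": "prod"}
)
```

Keeping the tags in their own nested map, rather than flattening them into the event, is what would let a downstream cluster treat them as a nested column.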
AmatyaAvadhanula dcdae84888
Add server view initialization metrics (#13716)
* Add server view init metrics

* Test coverage

* Rename metrics
2023-02-07 20:02:00 +05:30
Suneet Saldanha bea18dc9e4
Update basic auth examples (#13750) 2023-02-03 14:45:48 -08:00
Sergio Ferragut 7f830b20d7
fixed init commands for both mysql and postgresql (#13713) 2023-02-01 18:07:31 -08:00
Suneet Saldanha cfc3115a59
Compaction history returns empty list instead of 404 when not found (#13730)
* Compaction history returns empty list instead of 404 when not found

* checkstyle
2023-02-01 17:44:07 -08:00
Suneet Saldanha 016c881795
Add API to return automatic compaction config history (#13699)
Add a new API that returns the history of changes to the automatic compaction config, making it easy for users to see what changes have been made to their auto-compaction config.

The API is scoped per dataSource to allow users to triage issues with an individual dataSource. The API responds with a list of configs when there is a change to either the settings that impact all auto-compaction configs on a cluster or the dataSource in question.
2023-01-23 13:23:45 -08:00
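
One way to read the per-dataSource scoping in #13699 is sketched below. It assumes the history is available as a chronological list of snapshots, each holding cluster-wide settings and a per-dataSource config map. The snapshot layout and the function name `config_history` are assumptions for illustration, not the real storage format or endpoint. An unknown dataSource yields an empty list, in line with #13730.

```python
def config_history(snapshots, data_source):
    """Return the snapshots where the global settings or the given
    dataSource's config differ from the previously returned entry.

    Hypothetical data model: each snapshot is
    {"global": {...}, "dataSources": {name: {...}}}.
    """
    history = []
    previous = None
    for snap in snapshots:
        config = snap["dataSources"].get(data_source)
        if config is None:
            continue  # no auto-compaction config for this dataSource here
        view = (snap["global"], config)
        if view != previous:
            history.append({"global": view[0], "config": view[1]})
            previous = view
    return history


snapshots = [
    {"global": {"ratio": 0.1}, "dataSources": {"wiki": {"skipOffset": "P1D"}}},
    # Only an unrelated dataSource changed: not part of wiki's history.
    {"global": {"ratio": 0.1},
     "dataSources": {"wiki": {"skipOffset": "P1D"}, "other": {"skipOffset": "P2D"}}},
    # Cluster-wide setting changed: part of wiki's history.
    {"global": {"ratio": 0.2}, "dataSources": {"wiki": {"skipOffset": "P1D"}}},
    # wiki's own config changed.
    {"global": {"ratio": 0.2}, "dataSources": {"wiki": {"skipOffset": "P7D"}}},
]
wiki_history = config_history(snapshots, "wiki")
```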
Eyal Yurman 44374f91bc
Fix broken links to Oracle JDK docs (#13687)
* Fix broken link for SSLContext java doc

* Update tls-support.md

* Update tls-support.md

* Update tls-support.md

* Update simple-client-sslcontext.md
2023-01-18 14:46:08 +05:30
Vadim Ogievetsky f97bcc69d3
Docs: reword single server page (#13659)
* reword single server page

* fix typo

* Update docs/operations/single-server.md

Co-authored-by: Charles Smith <techdocsmith@gmail.com>

* spelling

Co-authored-by: Charles Smith <techdocsmith@gmail.com>
2023-01-11 21:12:52 -08:00
Rishabh Singh 4ebdfe226d
Druid automated quickstart (#13365)
* Druid automated quickstart

* remove conf/druid/single-server/quickstart/_common/historical/jvm.config

* Minor changes in python script

* Add lower bound memory for some services

* Additional runtime properties for services

* Update supervise script to accept command arguments, corresponding changes in druid-quickstart.py

* File end newline

* Limit the ability to start multiple instances of a service, documentation changes

* simplify script arguments

* restore changes in medium profile

* run-druid refactor

* compute and pass middle manager runtime properties to run-druid
supervise script changes to process java opts array
use argparse, leave free memory, logging

* Remove extra quotes from mm task javaopts array

* Update logic to compute minimum memory

* simplify run-druid

* remove debug options from run-druid

* resolve the config_path provided

* comment out service specific runtime properties which are computed in the code

* simplify run-druid

* clean up docs, naming changes

* Throw ValueError exception on illegal state

* update docs

* rename args, compute_only -> compute, run_zk -> zk

* update help documentation

* update help documentation

* move task memory computation into separate method

* Add validation checks

* remove print

* Add validations

* remove start-druid bash script, rename start-druid-main

* Include tasks in lower bound memory calculation

* Fix test

* 256m instead of 256g

* caffeine cache uses 5% of heap

* ensure min task count is 2, task count is monotonic

* update configs and documentation for runtime props in conf/druid/single-server/quickstart

* Update docs

* Specify memory argument for each profile in single-server.md

* Update middleManager runtime.properties

* Move quickstart configs to conf/druid/base, add bash launch script, support python2

* Update supervise script

* rename base config directory to auto

* rename python script, changes to pass repeated args to supervise

* remove examples/conf/druid/base dir

* add docs

* restore changes in conf dir

* update start-druid-auto

* remove hashref for commands in supervise script

* start-druid-main java_opts array is comma separated

* update entry point script name in python script

* Update help docs

* documentation changes

* docs changes

* update docs

* add support for running indexer

* update supported services list

* update help

* Update python.md

* remove dir

* update .spelling

* Remove dependency on psutil and pathlib

* update docs

* Update get_physical_memory method

* Update help docs

* update docs

* update method to get physical memory on python

* update spelling

* update .spelling

* minor change

* Minor change

* memory computation for indexer

* update start-druid

* Update python.md

* Update single-server.md

* Update python.md

* run python3 --version to check if python is installed

* Update supervise script

* start-druid: echo message if python not found

* update anchor text

* minor change

* Update condition in supervise script

* JVM not jvm in docs
2022-12-09 11:04:02 -08:00
Kashif Faraz c7229fc787
Limit max batch size for segment allocation, add docs (#13503)
Changes:
- Limit max batch size in `SegmentAllocationQueue` to 500
- Rename `batchAllocationMaxWaitTime` to `batchAllocationWaitTime` since the actual
wait time may exceed this configured value.
- Replace usage of `SegmentInsertAction` in `TaskToolbox` with `SegmentTransactionalInsertAction`
2022-12-07 10:07:14 +05:30
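
The batch-size cap from #13503 can be illustrated with a small sketch. The limit of 500 comes from the commit message. The helper `split_into_batches` is hypothetical, and whether `SegmentAllocationQueue` splits, defers, or rejects oversized batches is not stated in the log, so the splitting shown here is an assumption.

```python
MAX_BATCH_SIZE = 500  # limit stated in the commit message


def split_into_batches(requests, max_batch_size=MAX_BATCH_SIZE):
    """Split a list of allocation requests into batches no larger than the cap.

    Illustration only: the real queue's handling of excess requests may differ.
    """
    return [
        requests[start:start + max_batch_size]
        for start in range(0, len(requests), max_batch_size)
    ]


batches = split_into_batches(list(range(1200)))
```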
Jill Osborne 5c520e0cf9
Update LDAP configuration docs (#13245)
* Update LDAP configuration docs

* Updated after review

* Update auth-ldap.md

Updated.

* Update auth-ldap.md

* Updated spelling file

* Update docs/operations/auth-ldap.md

Co-authored-by: Charles Smith <techdocsmith@gmail.com>

* Update docs/operations/auth-ldap.md

Co-authored-by: Charles Smith <techdocsmith@gmail.com>

* Update docs/operations/auth-ldap.md

Co-authored-by: Charles Smith <techdocsmith@gmail.com>

* Update auth-ldap.md

Co-authored-by: Charles Smith <techdocsmith@gmail.com>
2022-11-29 09:26:32 -08:00
Katya Macedo fd239305d9
Update metrics doc (#13316)
Changes:
- used inline code-style to format dimension names
- removed unnecessary punctuation
2022-11-21 09:43:52 +05:30
Clint Wylie 1231ce3b75
dump-segment tool support for examining nested columns (#13356)
* add nested mode to dump segment tool to dump nested columns

* docs

* more test

* fix it
2022-11-14 16:08:47 -08:00
Gian Merlino 77478f25fb
Add taskActionType dimension to task/action/run/time. (#13333)
* Add taskActionType dimension to task/action/run/time.

* Spelling.
2022-11-11 12:00:08 +05:30
Andreas Maechler 03175a2b8d
Add missing MSQ error code fields to docs (#13308)
* Fix typo

* Fix some spacing

* Add missing fields

* Cleanup table spacing

* Remove durable storage docs again

Thanks Brian for pointing out previous discussions.

* Update docs/multi-stage-query/reference.md

Co-authored-by: Charles Smith <techdocsmith@gmail.com>

* Mark codes as code

* And even more codes as code

* Another set of spaces

* Combine `ColumnTypeNotSupported`

Thanks Karan.

* More whitespaces and typos

* Add spelling and fix links

Co-authored-by: Charles Smith <techdocsmith@gmail.com>
2022-11-10 21:03:04 +05:30
AmatyaAvadhanula a2013e6566
Enhance streaming ingestion metrics (#13331)
Changes:
- Add a metric for partition-wise kafka/kinesis lag for streaming ingestion.
- Emit lag metrics for streaming ingestion when supervisor is not suspended and state is in {RUNNING, IDLE, UNHEALTHY_TASKS, UNHEALTHY_SUPERVISOR}
- Document metrics
2022-11-09 23:44:15 +05:30
Kashif Faraz ff8e0c3397
Fix issues with caching cost strategy (#13321)
The `cachingCost` strategy has some discrepancies when compared to the `cost` strategy.
This commit addresses two of these by retaining the same behaviour as the `cost` strategy
when computing the cost of moving a segment to a server:
- subtract the self cost of a segment if it is being served by the target server
- subtract the cost of segments that are marked to be dropped

Other changes:
- Add tests to verify fixed strategy. These tests would fail without the fixes made to `CachingCostStrategy.computeCost()`
- Fix the definition of the segment related metrics in the docs.
- Fix some docs issues introduced in #13181
2022-11-08 16:11:39 +05:30
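
The two corrections listed in #13321 can be sketched as follows. This is not the code in `CachingCostStrategy.computeCost()`: the function name, the set-based server model, and the caller-supplied `pair_cost` function are all assumptions for illustration.

```python
def cost_of_placing(segment, server_segments, marked_to_drop, pair_cost):
    """Cost of moving `segment` to a server, with the two corrections applied.

    Hypothetical model: `server_segments` is the set of segments the server
    holds, `marked_to_drop` the set scheduled for removal, and `pair_cost`
    a symmetric cost function between two segments.
    """
    total = sum(pair_cost(segment, other) for other in server_segments)
    if segment in server_segments:
        # Correction 1: do not charge a segment for its own presence.
        total -= pair_cost(segment, segment)
    for other in marked_to_drop:
        # Correction 2: segments about to be dropped should not add cost.
        # Skip the segment itself so its self cost is not subtracted twice.
        if other in server_segments and other != segment:
            total -= pair_cost(segment, other)
    return total


def toy_cost(a, b):
    """Toy cost: a segment is expensive next to itself, cheap next to others."""
    return 10.0 if a == b else 1.0


cost = cost_of_placing("s1", {"s1", "s2", "s3"}, {"s3"}, toy_cost)
```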
Jill Osborne d1a4de022a
Update retention rules doc (#13181)
* Update retention rules doc

* Update rule-configuration.md

* Updated

* Updated

* Updated

* Updated

* Update rule-configuration.md

* Update rule-configuration.md
2022-11-07 14:47:33 -08:00
AmatyaAvadhanula a738ac9ad7
Improve task pause logging and metrics for streaming ingestion (#13313)
* Improve task pause logging and metrics for streaming ingestion

* Add metrics doc

* Fix spelling
2022-11-07 21:33:54 +05:30
317brian c83115e4e1
api: change API page formatting (#13213)
Tracking additional improvements requested by @paul-rogers: #13239

* api: refactor page so that indented bullet is child and unindented portion is parent

* get rid of post etc headings and combine them with the endpoint

* Update docs/operations/api-reference.md

Co-authored-by: Kashif Faraz <kashif.faraz@gmail.com>

* fix broken links

* fix typo

Co-authored-by: Kashif Faraz <kashif.faraz@gmail.com>
2022-10-18 13:22:26 -07:00
Gian Merlino 3bbb76f17b
Docs: Add query/cpu/time to real-time metrics. (#13229) 2022-10-15 18:26:44 +05:30
Gian Merlino 2f731f356e
Update pull-deps docs with correct repo list. (#13134)
There is only one default remote repo at this time.
2022-09-21 12:16:57 -07:00
Vadim Ogievetsky bb0b810b1d
fix html tags in docs (#13117)
* fix html tags in docs

* revert not null
2022-09-18 19:40:33 -07:00
Gian Merlino d4967c38f8
Various documentation updates. (#13107)
* Various documentation updates.

1) Split out "data management" from "ingestion". Break it into thematic pages.

2) Move "SQL-based ingestion" into the Ingestion category. Adjust content so
   all conceptual content is in concepts.md and all syntax content is in reference.md.
   Shorten the known issues page to the most interesting ones.

3) Add SQL-based ingestion to the ingestion method comparison page. Remove the
   index task, since index_parallel is just as good when maxNumConcurrentSubTasks: 1.

4) Rename various mentions of "Druid console" to "web console".

5) Add additional information to ingestion/partitioning.md.

6) Remove a mention of Tranquility.

7) Remove a note about upgrading to Druid 0.10.1.

8) Remove no-longer-relevant task types from ingestion/tasks.md.

9) Move ingestion/native-batch-firehose.md to the hidden section. It was previously deprecated.

10) Move ingestion/native-batch-simple-task.md to the hidden section. It is still linked in some
    places, but it isn't very useful compared to index_parallel, so it shouldn't take up space
    in the sidebar.

11) Make all br tags self-closing.

12) Certain other cosmetic changes.

13) Update to node-sass 7.

* make travis use node12 for docs

Co-authored-by: Vadim Ogievetsky <vadim@ogievetsky.com>
2022-09-16 21:58:11 -07:00
Vadim Ogievetsky 2493eb17bf
Doc fixes around msq (#13090)
* remove things that do not apply

* fix more things

* pin node to a working version

* fix

* fixes

* known issues tidy up

* revert auto formatting changes

* remove management-uis page which is 100% lies

* don't mention the Coordinator console (that no longer exists)

* goodies

* fix typo
2022-09-16 02:15:26 -07:00
Benedict Jin 4bde50e683
Bump the version of Druid docker image from 0.16.0-incubating to latest (#13058) 2022-09-10 14:06:00 +05:30
Rohan Garg 7aa8d7f987
Add query/time metric for SQL queries from router (#12867)
* Add query/time metric for SQL queries from router

* Fix query cancel bug when user has overridden native query-id in a SQL query
2022-09-07 13:54:46 +05:30
317brian d4233ef2a1
msq: add multi-stage-query docs (#12983)
* msq: add multi-stage-query docs

* add screenshots

add back theta sketches tutorial

change filename

fix filename

fix link

fix headings

* fixes

* fixes

* fix spelling issues and update spell file

* address feedback from karan

* add missing guardrail to known issues

* update blurb

* fix typo

* remove durable storage info

* update titles

* Restore en.json

* Update query view

* address comments from vad

* Update docs/multi-stage-query/msq-known-issues.md

finish sentence

* add apache license to docs

* add apache license to docs

Co-authored-by: Katya Macedo <katya.macedo@imply.io>
Co-authored-by: Charles Smith <techdocsmith@gmail.com>
2022-09-06 23:06:09 +05:30
Gian Merlino 48ceab2153
Add Java 17 information to documentation. (#12990)
The docs say Java 17 support is experimental, and give tips on running
successfully with Java 17.

This patch also removes java.base/jdk.internal.perf and
jdk.management/com.sun.management.internal from the list of required
exports and opens, because they were formerly needed for JvmMonitor,
which was rewritten in #12481 to use MXBeans instead.
2022-08-30 12:32:49 -07:00
Santosh Pingale 31dc9004bd
Auto-reload TLS certs for druid endpoints (#12933)
* #12064 Auto-reload tls certs for druid endpoints

* #12064 Add missing toString param

* #12064 Add tests and new jks
Co-authored-by: zemin-piao <pzm6391@gmail.com>

* #12064 Refine tests

* #12064 Add documentation

* Apply suggestions from code review

Co-authored-by: Frank Chen <frankchen@apache.org>

Co-authored-by: santosh <santosh.pingale@adyen.com>
Co-authored-by: Frank Chen <frankchen@apache.org>
2022-08-25 20:12:43 +08:00
Rohan Garg 3c129f6728
Add sql planning time metric (#12923) 2022-08-22 11:09:44 +05:30
Ian Roberts 770358dc34
Update tls-support.md (#12916)
Fixing " lists all possible values for the configs belong" in TLS section
2022-08-18 09:46:30 +08:00
Gian Merlino ef6811ef88
Improved Java 17 support and Java runtime docs. (#12839)
* Improved Java 17 support and Java runtime docs.

1) Add a "Java runtime" doc page with information about supported
   Java versions, garbage collection, and strong encapsulation.

2) Update asm and equalsverifier to versions that support Java 17.

3) Add additional "--add-opens" lines to surefire configuration, so
   tests can pass successfully under Java 17.

4) Switch openjdk15 tests to openjdk17.

5) Update FrameFile to specifically mention Java runtime incompatibility
   as the cause of not being able to use Memory.map.

6) Update SegmentLoadDropHandler to log an error for Errors too, not
   just Exceptions. This is important because an IllegalAccessError is
   encountered when the correct "--add-opens" line is not provided,
   which would otherwise be silently ignored.

7) Update example configs to use druid.indexer.runner.javaOptsArray
   instead of druid.indexer.runner.javaOpts. (The latter is deprecated.)

* Adjustments.

* Use run-java in more places.

* Add run-java.

* Update .gitignore.

* Exclude hadoop-client-api.

Brought in when building on Java 17.

* Swap one more usage of java.

* Fix the run-java script.

* Fix flag.

* Include link to Temurin.

* Spelling.

* Update examples/bin/run-java

Co-authored-by: Xavier Léauté <xl+github@xvrl.net>

Co-authored-by: Xavier Léauté <xl+github@xvrl.net>
2022-08-03 23:16:05 -07:00
zachjsh c0380e7b0a
* fix duplicate dimension (#12778) 2022-07-14 10:39:03 +05:30
TSFenwick 8c02880d5f
Emit metrics for distribution of number of rows per segment (#12730)
* initial commit of bucket dimensions for metrics

return counts of segments that have rowcount in a bucket size for a datasource
return average value of rowcount per segment in a datasource
added unit test
naming could use a lot of work
buckets right now are not finalized
added javadocs
altered metrics.md

* fix checkstyle issues

* addressed review comments

add monitor test
move added functionality to new monitor
update docs

* address comments

renamed monitor
handle tombstones better
update docs
added javadocs

* Add support for tombstones in the segment distribution

* undo changes to tombstone segmentizer factory

* fix accidental whitespacing changes

* address comments regarding metrics documentation

and rename variable to be more accurate

* fix tests

* fix checkstyle issues

* fix broken test

* undo removal of timeout
2022-07-12 07:04:42 -07:00
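
The per-datasource figures described in #12730 (segment counts per row-count bucket, plus the average row count per segment) can be sketched as below. The bucket boundaries are hypothetical, since the commit notes that the buckets were not yet finalized, and the function name is also an assumption.

```python
import bisect

# Hypothetical inclusive upper bounds; larger values fall in a final bucket.
BUCKET_UPPER_BOUNDS = [1, 10_000, 1_000_000, 10_000_000]


def row_count_distribution(row_counts, upper_bounds=BUCKET_UPPER_BOUNDS):
    """Return (segments per bucket, average rows per segment)."""
    counts = [0] * (len(upper_bounds) + 1)
    for rows in row_counts:
        # bisect_left finds the first bucket whose upper bound is >= rows.
        counts[bisect.bisect_left(upper_bounds, rows)] += 1
    average = sum(row_counts) / len(row_counts) if row_counts else 0
    return counts, average


counts, average = row_count_distribution([0, 5_000, 10_000, 2_000_000, 50_000_000])
```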
Victoria Lim 94564b6ce6
Update screenshots for Druid console doc (#12593)
* druid console doc updates

* remove extra image

* Apply suggestions from code review

Co-authored-by: Katya Macedo  <38017980+ektravel@users.noreply.github.com>

* Apply suggestions from code review

* Apply suggestions from code review

Co-authored-by: Charles Smith <techdocsmith@gmail.com>

* updated screenshot labels

Co-authored-by: Katya Macedo  <38017980+ektravel@users.noreply.github.com>
Co-authored-by: Charles Smith <techdocsmith@gmail.com>
2022-06-15 16:42:20 -07:00
Victoria Lim 353475bd36
Docs for automatic compaction (#12569)
* docs for auto-compaction

* fix broken links

* another link

* Apply suggestions from code review

Co-authored-by: Suneet Saldanha <suneet@apache.org>

* Apply suggestions from code review

Co-authored-by: Suneet Saldanha <suneet@apache.org>

* Apply suggestions from code review

Co-authored-by: Charles Smith <techdocsmith@gmail.com>
Co-authored-by: Suneet Saldanha <suneet@apache.org>

* reorg content for skipOffset

* Update docs/ingestion/automatic-compaction.md

Co-authored-by: Charles Smith <techdocsmith@gmail.com>

* Apply suggestions from code review

Co-authored-by: Kashif Faraz <kashif.faraz@gmail.com>

Co-authored-by: Suneet Saldanha <suneet@apache.org>
Co-authored-by: Charles Smith <techdocsmith@gmail.com>
Co-authored-by: Kashif Faraz <kashif.faraz@gmail.com>
2022-06-09 14:55:12 -07:00
Victoria Lim 1506b26ce4
fix typo (#12607) 2022-06-04 13:14:18 +08:00
Agustin Gonzalez 2f3d7a4c07
Emit state of replace and append for native batch tasks (#12488)
* Emit state of replace and append for native batch tasks

* Emit count of one depending on batch ingestion mode (APPEND, OVERWRITE, REPLACE)

* Add metric to compaction job

* Avoid null ptr exc when null emitter

* Coverage

* Emit tombstone & segment counts

* Tasks need a type

* Spelling

* Integrate BatchIngestionMode in batch ingestion tasks functionality

* Typos

* Remove batch ingestion type from metric since it is already in a dimension. Move IngestionMode to AbstractTask to facilitate having mode as a dimension. Add metrics to streaming. Add missing coverage.

* Avoid inner class referenced by sub-class inspection. Refactor computation of IngestionMode to make it more robust to null IOConfig and fix test.

* Spelling

* Avoid polluting the Task interface

* Rename computeCompaction methods to avoid ambiguous java compiler error if they are passed null. Other minor cleanup.
2022-05-23 12:32:47 -07:00
Katya Macedo 177638f171
Fix typo, add comma (#12529) 2022-05-17 16:42:47 -07:00
Rohan Garg 2dd073c2cd
Pass metrics object for Scan, Timeseries and GroupBy queries during cursor creation (#12484)
* Pass metrics object for Scan, Timeseries and GroupBy queries during cursor creation

* fixup! Pass metrics object for Scan, Timeseries and GroupBy queries during cursor creation

* Document vectorized dimension
2022-05-09 10:40:17 -07:00
Victoria Lim 0206a2da5c
Update automatic compaction docs with consistent terminology (#12416)
* specify automatic compaction where applicable

* Apply suggestions from code review

Co-authored-by: Katya Macedo  <38017980+ektravel@users.noreply.github.com>

* update for style and consistency

* implement suggested feedback

* remove duplicate example

* Apply suggestions from code review

Co-authored-by: Katya Macedo  <38017980+ektravel@users.noreply.github.com>

* Update docs/ingestion/compaction.md

Co-authored-by: Katya Macedo  <38017980+ektravel@users.noreply.github.com>

* Update docs/operations/api-reference.md

* update .spelling

* Adopt review suggestions

Co-authored-by: Katya Macedo  <38017980+ektravel@users.noreply.github.com>
2022-05-03 16:22:25 -07:00
Rocky Chen 770ad95169
Add a metric for task duration in the pending queue (#12492)
This PR measures how long a task stays in the pending queue and emits the value with the metric task/pending/time. The metric is measured in RemoteTaskRunner and HttpRemoteTaskRunner.

An example of the metric:

```
2022-04-26T21:59:09,488 INFO [rtr-pending-tasks-runner-0] org.apache.druid.java.util.emitter.core.LoggingEmitter - {"feed":"metrics","timestamp":"2022-04-26T21:59:09.487Z","service":"druid/coordinator","host":"localhost:8081","version":"2022.02.0-iap-SNAPSHOT","metric":"task/pending/time","value":8,"dataSource":"wikipedia","taskId":"index_parallel_wikipedia_gecpcglg_2022-04-26T21:59:09.432Z","taskType":"index_parallel"}
```

------------------------------------------
Key changed/added classes in this PR

    Emit metric task/pending/time in classes RemoteTaskRunner and HttpRemoteTaskRunner.
    Update related factory classes and tests.
2022-05-02 23:47:25 -04:00
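
The measurement described in #12492 amounts to remembering when a task entered the pending queue and emitting the elapsed time when it leaves. The sketch below assumes that shape. The class `PendingTimeTracker`, its method names, and the millisecond unit are assumptions for illustration and do not mirror `RemoteTaskRunner` or `HttpRemoteTaskRunner`.

```python
import time


class PendingTimeTracker:
    """Track how long each task waits between being queued and being started."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock  # injectable so the tracker can be tested
        self._enqueued_at = {}

    def task_pending(self, task_id):
        """Record the moment a task enters the pending queue."""
        self._enqueued_at[task_id] = self._clock()

    def task_started(self, task_id):
        """Return a task/pending/time event, or None for an unknown task."""
        enqueued = self._enqueued_at.pop(task_id, None)
        if enqueued is None:
            return None
        # Assumption: the metric value is reported in milliseconds.
        elapsed_ms = round((self._clock() - enqueued) * 1000)
        return {"metric": "task/pending/time", "value": elapsed_ms, "taskId": task_id}


# Usage with a fake clock that returns 2.0 and then 2.5 seconds.
ticks = iter([2.0, 2.5])
tracker = PendingTimeTracker(clock=lambda: next(ticks))
tracker.task_pending("index_parallel_wikipedia")
pending_event = tracker.task_started("index_parallel_wikipedia")
```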
zachjsh 564d6defd4
Worker level task metrics (#12446)
* fix metric name inconsistency

* add task slot metrics for middle managers

* add new WorkerTaskCountStatsMonitor to report task count metrics
  from worker

* more stuff

* remove unused variable

* more stuff

* add javadocs

* fix checkstyle

* fix hadoop test failure

* cleanup

* add more code coverage in tests

* fix test failure

* add docs

* increase code coverage

* fix spelling

* fix failing tests

* remove dead code

* fix spelling
2022-04-26 11:44:44 -05:00
Peter Marshall 5167d328b1
Docs - query caching (#11584)
* Update caching.md

Knowledge from https://the-asf.slack.com/archives/CJ8D1JTB8/p1597781107153900

Update caching.md

A few additional updates OTBO https://the-asf.slack.com/archives/CJ8D1JTB8/p1608669046041300

* Update caching.md

Typos

* Amendments on the segment cache

Significant updates on content around the segment cache, pull process, and in-memory cache

* Update docs/design/historical.md

Co-authored-by: Charles Smith <techdocsmith@gmail.com>

* Update docs/design/historical.md

Co-authored-by: Charles Smith <techdocsmith@gmail.com>

* Update docs/design/historical.md

Co-authored-by: Charles Smith <techdocsmith@gmail.com>

* Update docs/design/historical.md

Co-authored-by: Charles Smith <techdocsmith@gmail.com>

* Update docs/design/historical.md

Co-authored-by: Charles Smith <techdocsmith@gmail.com>

* Update docs/design/historical.md

Co-authored-by: Charles Smith <techdocsmith@gmail.com>

* Update docs/querying/caching.md

Co-authored-by: Charles Smith <techdocsmith@gmail.com>

* Update docs/querying/caching.md

Co-authored-by: Charles Smith <techdocsmith@gmail.com>

* Update docs/querying/caching.md

Co-authored-by: Charles Smith <techdocsmith@gmail.com>

* Update docs/querying/caching.md

Co-authored-by: Charles Smith <techdocsmith@gmail.com>

* Update docs/design/historical.md

Co-authored-by: Charles Smith <techdocsmith@gmail.com>

* Update docs/design/historical.md

Co-authored-by: Charles Smith <techdocsmith@gmail.com>

* Update docs/design/historical.md

Co-authored-by: Charles Smith <techdocsmith@gmail.com>

* Update docs/operations/basic-cluster-tuning.md

Co-authored-by: Charles Smith <techdocsmith@gmail.com>

* Update docs/querying/caching.md

Co-authored-by: Charles Smith <techdocsmith@gmail.com>

* Update docs/querying/caching.md

Co-authored-by: Charles Smith <techdocsmith@gmail.com>

* Update docs/querying/caching.md

Co-authored-by: Charles Smith <techdocsmith@gmail.com>

* Update docs/querying/caching.md

Co-authored-by: Charles Smith <techdocsmith@gmail.com>

* Update docs/querying/caching.md

Co-authored-by: Charles Smith <techdocsmith@gmail.com>

* Update docs/operations/basic-cluster-tuning.md

Co-authored-by: Charles Smith <techdocsmith@gmail.com>

* Update basic-cluster-tuning.md

typo

* Update docs/querying/caching.md

Co-authored-by: Charles Smith <techdocsmith@gmail.com>

* Whole-query caching update

Made more succinct and removed specific config to change.

* Update docs/design/historical.md

Co-authored-by: Charles Smith <techdocsmith@gmail.com>
2022-04-18 17:00:21 +08:00
mark-imply d98cbd90f0
Update basic-cluster-tuning.md (#12412)
Changed "Other useful JVM flags" to "Other generally useful JVM flags" in order to align with the introduction to the doc.
2022-04-08 15:29:55 +05:30
Victoria Lim 9ed7aa33ec
Docs for request logging (#12363)
* add docs for request logging

* remove stray character

* Update docs/operations/request-logging.md

Co-authored-by: TSFenwick <tsfenwick@gmail.com>

* Apply suggestions from code review

Co-authored-by: Charles Smith <techdocsmith@gmail.com>

Co-authored-by: TSFenwick <tsfenwick@gmail.com>
Co-authored-by: Charles Smith <techdocsmith@gmail.com>
2022-03-28 14:09:41 -07:00
Sandeep 61e1ffc7f7
add a new query laning metric to visualize lane assignment (#12111)
* add a new query laning metric to visualize lane assignment

* fixes :spotbugs check

* Update docs/operations/metrics.md

Co-authored-by: Benedict Jin <asdf2014@apache.org>

* Update server/src/main/java/org/apache/druid/server/QueryScheduler.java

Co-authored-by: Benedict Jin <asdf2014@apache.org>

* Update server/src/main/java/org/apache/druid/server/QueryScheduler.java

Co-authored-by: Benedict Jin <asdf2014@apache.org>

Co-authored-by: Benedict Jin <asdf2014@apache.org>
2022-03-04 15:21:17 +08:00
Victoria Lim c61b19d443
Refactor SQL docs (#12239)
* refactor and link fixes

* add sql docs to left nav

* code format for needle

* updated web console script

* link fixes

* update earliest/latest functions

* edits for grammar and style

* more link fixes

* another link

* update with #12226

* update .spelling file
2022-02-11 14:43:30 -08:00
Suneet Saldanha ced1389d4c
Enable auto kill segments by default (#12187)
* Enable auto-kill by default

* tests

* wip

* test

* fix IT

* fix it

* remove from docs

* make coverage bot happy
2022-02-07 06:57:54 -08:00
Suneet Saldanha 159f97dcb0
Update docs for druid.processing.numThreads in brokers (#12231)
* Update docs for druid.processing.numThreads

* error msg

* one more reference
2022-02-04 17:34:21 -08:00
Victoria Lim 24716bfedc
Doc updates for metadata cleanup and storage (#12190)
* doc updates for metadata storage/cleanup

* Add comments for disabling cleanup

* Apply suggestions from code review

* updated for https://github.com/apache/druid/pull/12201

* Apply suggestions from code review

Co-authored-by: Maytas Monsereenusorn <maytasm@apache.org>

* move retention period line earlier; more concise text

* fix typo

Co-authored-by: Maytas Monsereenusorn <maytasm@apache.org>
2022-01-27 11:40:54 -08:00
Victoria Lim d2ac146365
Docs for cluster tiering to improve query concurrency (#12128)
* add new doc

* Apply suggestions from code review

Co-authored-by: Charles Smith <techdocsmith@gmail.com>

* reorder query laning properties

* rename doc

* new name in doc header

* organize material into "service tiering" section

* text edits and update sidebars.json

* update query laning

* how queries get assigned to lanes

* add more details to intro; use more consistent terminology

* more content

* Apply suggestions from code review

Co-authored-by: Jihoon Son <jihoonson@apache.org>

* Update docs/operations/mixed-workloads.md

* Apply suggestions from code review

Co-authored-by: Charles Smith <techdocsmith@gmail.com>

* typo

Co-authored-by: Charles Smith <techdocsmith@gmail.com>
Co-authored-by: Jihoon Son <jihoonson@apache.org>
2022-01-15 12:22:08 +08:00
Karan Kumar 377edff042
Ingestion metrics doc fix (#12066)
* Ingestion metrics doc fix.

* Fixing typo

* Adding missed keywords in ignore list
2021-12-15 12:51:53 +05:30
Lucas Capistrant 761fe9f144
Add new metric that quantifies how long batch ingest jobs waited for segment availability and whether or not that wait was successful (#12002)
* add a unit test that tests that new metric is emitted

* remove unused import

* clarify in doc that this is for batch tasks

* fix IndexTaskTest
2021-12-10 11:40:52 -06:00
Peter Marshall 0b3f0bbbd8
Docs - Metrics docs layout and info about query/bytes (#11481)
* Metrics docs layout and info about query/bytes

Knowledge transfer from https://groups.google.com/g/druid-user/c/8fiflmSEoTQ - updated the layout of the Metrics part, adding links between docs pages.

Update index.md

Amended typo

* Update docs/configuration/index.md

Co-authored-by: Charles Smith <techdocsmith@gmail.com>

* Update docs/configuration/index.md

Co-authored-by: Charles Smith <techdocsmith@gmail.com>

* Update docs/operations/metrics.md

Co-authored-by: Charles Smith <techdocsmith@gmail.com>

* Update docs/operations/metrics.md

Co-authored-by: Charles Smith <techdocsmith@gmail.com>

* Update docs/operations/metrics.md

Co-authored-by: Charles Smith <techdocsmith@gmail.com>

* Feedback applied

Http --> HTTP and moved content / removed >

* Update docs/configuration/index.md

Co-authored-by: Charles Smith <techdocsmith@gmail.com>

* Update docs/configuration/index.md

Co-authored-by: Charles Smith <techdocsmith@gmail.com>

Co-authored-by: Charles Smith <techdocsmith@gmail.com>
2021-12-07 09:45:24 -08:00
Charles Smith 7ed46800c3
Docs: Add multi-dimension partitioning doc; refactor native batch and separate into smaller topics. (#11983)
Adds documentation for multi-dimension partitioning. cc: @kfaraz
Refactors the native batch partitioning topic as follows:

Native batch ingestion covers parallel-index
Native batch simple task indexing covers index
Native batch input sources covers ioSource
Native batch ingestion with firehose covers deprecated firehose
2021-12-03 16:37:14 +05:30
Nikhil Navadiya 3c51136098
Add worker category dimension (#11554)
* Add worker category as dimension in TaskSlotCountStatsMonitor

* Change description

* Add workerConfig as field

* Modify HttpRemoteTaskRunnerTest to test worker category in taskslot metrics

* Fixing tests

* Fixing alerts

* Adding unit test in SingleTaskBackgroundRunnerTest for task slot metrics APIs

* Resolving false positive spell check

* addressing comments

* throw UnsupportedOperationException for task slot metrics APIs in SingleTaskBackgroundRunner

Co-authored-by: Nikhil Navadiya <nnavadiya@twitter.com>
2021-11-18 22:59:07 -08:00
Charles Smith 33a5cda061
Docs: Splits Kafka topic. Adds detailed example for kafka inputFormat (#11912)
* Splits Kafka topic according to function. Adds detailed example for kafka inputFormat

* Apply suggestions from code review

accept suggestions from review

Co-authored-by: sthetland <steve.hetland@imply.io>
Co-authored-by: Katya Macedo  <38017980+ektravel@users.noreply.github.com>

* Apply suggestions from code review

accept suggestions

Co-authored-by: sthetland <steve.hetland@imply.io>
Co-authored-by: Katya Macedo  <38017980+ektravel@users.noreply.github.com>

* accept suggestions

* accept suggestions

* final typos and clarifications

* bringing forward some syntax fixes

Co-authored-by: sthetland <steve.hetland@imply.io>
Co-authored-by: Katya Macedo  <38017980+ektravel@users.noreply.github.com>
2021-11-12 13:02:23 -08:00
Jian Wang 8e7e679984
Add more metrics for Jetty server thread pool usage (#11113)
Add more metrics for jetty server thread pool usage so we know if we have allocated enough http threads to handle requests.
2021-11-07 16:51:44 +05:30
Kashif Faraz abac9e39ed
Revert permission changes to Supervisor and Task APIs (#11819)
* Revert "Require Datasource WRITE authorization for Supervisor and Task access (#11718)"

This reverts commit f2d6100124.

* Revert "Require DATASOURCE WRITE access in SupervisorResourceFilter and TaskResourceFilter (#11680)"

This reverts commit 6779c4652d.

* Fix docs for the reverted commits

* Fix and restore deleted tests

* Fix and restore SystemSchemaTest
2021-10-25 14:50:38 +05:30
Charles Smith 10c5fa93f1
remove dupe sentence (#11821) 2021-10-25 14:48:20 +05:30
Charles Smith 6089a168ea
Docs - update dynamic config provider topic (#11795)
* update dynamic config provider

* update topic

* add examples for dynamic config provider:

* Update docs/development/extensions-core/kafka-ingestion.md

Co-authored-by: Katya Macedo  <38017980+ektravel@users.noreply.github.com>

* Update docs/development/extensions-core/kafka-ingestion.md

Co-authored-by: Katya Macedo  <38017980+ektravel@users.noreply.github.com>

* Update docs/development/extensions-core/kafka-ingestion.md

Co-authored-by: Katya Macedo  <38017980+ektravel@users.noreply.github.com>

* Update docs/operations/dynamic-config-provider.md

Co-authored-by: Katya Macedo  <38017980+ektravel@users.noreply.github.com>

* Update docs/operations/dynamic-config-provider.md

Co-authored-by: Katya Macedo  <38017980+ektravel@users.noreply.github.com>

* Update docs/operations/dynamic-config-provider.md

Co-authored-by: Katya Macedo  <38017980+ektravel@users.noreply.github.com>

* Update docs/operations/dynamic-config-provider.md

Co-authored-by: Katya Macedo  <38017980+ektravel@users.noreply.github.com>

* Update docs/development/extensions-core/kafka-ingestion.md

Co-authored-by: Katya Macedo  <38017980+ektravel@users.noreply.github.com>

* Update docs/operations/dynamic-config-provider.md

Co-authored-by: Clint Wylie <cjwylie@gmail.com>

* Update docs/operations/dynamic-config-provider.md

Co-authored-by: Clint Wylie <cjwylie@gmail.com>

* Update kafka-ingestion.md

Co-authored-by: Katya Macedo  <38017980+ektravel@users.noreply.github.com>
Co-authored-by: Clint Wylie <cjwylie@gmail.com>
2021-10-14 17:51:32 -07:00
Arun Ramani b6b42d3936
Minor processor quota computation fix + docs (#11783)
* cpu/cpuset cgroup and procfs data gathering

* Renames and default values

* Formatting

* Trigger Build

* Add cgroup monitors

* Return 0 if no period

* Update

* Minor processor quota computation fix + docs

* Address comments

* Address comments

* Fix spellcheck

Co-authored-by: arunramani-imply <84351090+arunramani-imply@users.noreply.github.com>
2021-10-08 22:52:03 -05:00
Kashif Faraz c2c724c065
Fix docs to explain that WRITE permissions do not include READ (#11785)
* Fix docs to explain that WRITE and READ are exclusive

* Fix indentation

* Use suggested doc style
2021-10-08 14:10:20 -07:00
Charles Smith 3ecbd3aec4
docs for changes to authorization in #11718 and #11720 (#11779)
* security recommendation

* Update docs/operations/security-overview.md

Co-authored-by: Victoria Lim <vtlim@users.noreply.github.com>

* Update docs/operations/security-user-auth.md

Co-authored-by: Victoria Lim <vtlim@users.noreply.github.com>

* Update docs/operations/security-user-auth.md

Co-authored-by: Victoria Lim <vtlim@users.noreply.github.com>

* Update security-user-auth.md

add newline

* Update docs/operations/security-overview.md

Co-authored-by: Victoria Lim <vtlim@users.noreply.github.com>

* Update security-overview.md

add suggestion for environment variable dynamic config provider

Co-authored-by: Victoria Lim <vtlim@users.noreply.github.com>
Co-authored-by: Clint Wylie <cwylie@apache.org>
2021-10-08 14:04:04 -07:00
Kashif Faraz f2d6100124
Require Datasource WRITE authorization for Supervisor and Task access (#11718)
Follow up PR for #11680

Description
Supervisor and Task APIs are related to ingestion and must always require Datasource WRITE
authorization even if they are purely informative.

Changes
Check Datasource WRITE in SystemSchema for tables "supervisors" and "tasks"
Check Datasource WRITE for APIs /supervisor/history and /supervisor/{id}/history
Check Datasource for all Indexing Task APIs
2021-10-08 10:39:48 +05:30
Charles Smith 621e5ac63f
docs: clarify RealtimeMetricsMonitor, HistoricalMetricsMonitor (#11565)
* docs: clarify RealtimeMetricsMonitor, HistoricalMetricsMonitor

* Update docs/configuration/index.md
2021-10-05 17:38:23 -07:00
Clint Wylie 5de26cf6d9
add optional system schema authorization (#11720)
* add optional system schema authorization

* remove unused

* adjust docs

* doc fixes, missing ldap config change for integration tests

* style
2021-09-21 13:28:26 -07:00
Charles Smith 91cd573472
fixes web console introduction and addresses linking issues (#11609)
* fixes web console introduction and addresses linking issues

* fix merge conflict
2021-08-18 08:37:05 -07:00
imply-jhan 332e68edb5
improve the metric definition (#11602) 2021-08-17 12:31:42 +07:00
Charles Smith 6524d838d7
Docs refactor of ingestion. Carries #11541 (#11576)
* Docs refactor of ingestion. Carries #11541

* Update docs/misc/math-expr.md

* add Apache license

* fix header, add topics to sidebar

* Update docs/ingestion/partitioning.md

* pick up changes to  and  md from c7fdf1d, #11479

Co-authored-by: Suneet Saldanha <suneet@apache.org>
Co-authored-by: Jihoon Son <jihoonson@apache.org>
2021-08-13 08:42:03 -07:00
Gian Merlino faebefecae
Docs: add pointers from api-reference to sql docs. (#11548) 2021-08-11 09:00:33 -07:00
Peter Marshall 60e3955adb
Docs - clarify datasource API sources (#11489)
* Update api-reference.md

Added note OTBO Druid slack

* Update api-reference.md

Changed to an alternative explanation

* Update api-reference.md

Oops fixed.

* Update docs/operations/api-reference.md

Co-authored-by: Suneet Saldanha <suneet@apache.org>

* Update docs/operations/api-reference.md

Co-authored-by: Suneet Saldanha <suneet@apache.org>

Co-authored-by: Suneet Saldanha <suneet@apache.org>
2021-08-05 11:29:33 -07:00
Yi Yuan 23d7d71ea5
Add Environment Variable DynamicConfigProvider (#11377)
* add_environment_variable_DynamicConfigProvider

* fix code

* code fixed

* code fixed

* add document

* fix doc

* fix doc

* add more unit test

* fix style

* fix document

* bug fixed

* fix unit test

* fix comment

* fix test

Co-authored-by: yuanyi <yuanyi@freewheel.tv>
2021-08-04 20:26:58 -07:00
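Note on #11377: the commit message does not show how the provider resolves values, so the following is only a minimal sketch of the general idea, assuming the provider is given a map from config keys to environment variable names. The function name `resolve_env_config`, the `variables` argument, and the choice to skip unset variables are all hypothetical and are not taken from the Druid source.

```python
import os


def resolve_env_config(variables, environ=None):
    """Return {config_key: value} by looking up each named environment variable.

    `variables` maps a config key to the NAME of an environment variable.
    Unset variables are skipped (an assumption of this sketch).
    """
    env = os.environ if environ is None else environ
    resolved = {}
    for key, env_name in variables.items():
        if env_name in env:
            resolved[key] = env[env_name]
    return resolved


# Hypothetical usage: the secret lives in the environment, not in the spec.
example = resolve_env_config(
    {"password": "EXAMPLE_DB_PASSWORD"},
    environ={"EXAMPLE_DB_PASSWORD": "s3cret"},
)
```

Passing `environ` explicitly keeps the sketch testable without touching the real process environment.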
Harini Rajendran 995d99d9e4
add ingest/notices/queueSize metric to give visibility into supervisor notices queue size (#11417) 2021-07-30 07:59:26 -07:00
Lucas Capistrant 9767b42e85
Add a new metric query/segments/count that is not emitted by default (#11394)
* Add a new metric query/segments/count that is not emitted by default

* docs

* test the default implementation of the metric

* fix spelling error in docs

* document the fact that query retries will result in additional metric emissions

* update using recommended text from @jihoonson
2021-07-22 17:57:35 -07:00
Maytas Monsereenusorn 05d5dd9289
compaction/status API retains status for datasources that no longer exist, causing in-memory usage to grow unbounded (#11426)
* compaction/status API retains status for datasources that no longer exist, causing in-memory usage to grow unbounded

* compaction/status API retains status for datasources that no longer exist, causing in-memory usage to grow unbounded

* compaction/status API retains status for datasources that no longer exist, causing in-memory usage to grow unbounded

* fix test

* fix test
2021-07-13 09:48:06 +07:00
Clint Wylie 17efa6f556
add single input string expression dimension vector selector and better expression planning (#11213)
* add single input string expression dimension vector selector and better expression planning

* better

* fixes

* oops

* rework how vector processor factories choose string processors, fix to be less aggressive about vectorizing

* oops

* javadocs, renaming

* more javadocs

* benchmarks

* use string expression vector processor with vector size 1 instead of expr.eval

* better logging

* javadocs, surprising number of the the

* more

* simplify
2021-07-06 11:20:49 -07:00
frank chen 906a704c55
Eliminate ambiguities of KB/MB/GB in the doc (#11333)
* GB ---> GiB

* suppress spelling check

* MB --> MiB, KB --> KiB

* Use IEC binary prefix

* Add reference link

* Fix doc style
2021-06-30 13:42:45 -07:00
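Note on #11333: the ambiguity this commit removes is that "GB" can mean 10^9 bytes or 2^30 bytes, while the IEC prefixes (KiB, MiB, GiB) always mean powers of 1024. A minimal sketch of the difference follows; the helper `format_iec` is hypothetical and only illustrates the unit convention, it is not part of Druid.

```python
IEC_UNITS = ["B", "KiB", "MiB", "GiB", "TiB"]


def format_iec(num_bytes):
    """Format a byte count with IEC binary prefixes (1 KiB = 1024 B)."""
    value = float(num_bytes)
    for unit in IEC_UNITS:
        # Stop at the first unit that keeps the value below 1024,
        # or at the largest unit this sketch knows about.
        if value < 1024 or unit == IEC_UNITS[-1]:
            return f"{value:.1f} {unit}"
        value /= 1024


one_gib = format_iec(2 ** 30)       # exactly one gibibyte
one_gb = format_iec(10 ** 9)        # one decimal gigabyte, expressed in IEC units
```

Here `one_gib` is "1.0 GiB" while `one_gb` is "953.7 MiB", which is the roughly 7% gap that makes a bare "GB" ambiguous in sizing guidance.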
Charles Smith fcb4eaa3d4
add docs for high-churn datasource cleanup (#11245)
* add docs for high-churn datasource cleanup

* fix most comments except for task log

* address comments

* update strategy recommendation

* address additional comments

* fix

* address comments

* address comments from @sthetland
2021-05-20 09:48:42 -07:00
Maytas Monsereenusorn 3455352241
Add feature to automatically remove compaction configurations for inactive datasources (#11232)
* add auto cleanup

* add auto cleanup

* add auto cleanup

* add tests

* add tests

* use retryutils

* use retryutils

* use retryutils

* address comments
2021-05-11 18:49:18 -07:00
Maytas Monsereenusorn 4326e699bd
Add feature to automatically remove datasource metadata based on retention period (#11227)
* add auto clean up datasource metadata

* add test

* fix checkstyle

* add comments

* fix error

* address comments

* Address comments

* fix test

* fix test

* fix typo

* add comment

* fix test

* fix test
2021-05-11 01:22:33 -07:00
Charles Smith cf2cde1d2d
add links to release notes, light refactor of landing page (#11051)
* add links to release notes, light refactor of landing page

* Update docs/design/index.md
2021-05-07 14:26:47 -07:00
Maytas Monsereenusorn d73f72e508
Add feature to automatically remove supervisor based on retention period (#11200)
* add auto clean up

* add test

* add test

* fix test

* Address comments

* Address comments
2021-05-06 22:25:23 -07:00
Maytas Monsereenusorn 84aac4832d
Add feature to automatically remove rules based on retention period (#11164)
* Add feature to automatically remove rules based on retention period

* Add feature to automatically remove rules based on retention period

* address comments
2021-05-03 11:50:45 -07:00
Maytas Monsereenusorn 6d2b5cdd7e
Add feature to automatically remove audit logs based on retention period (#11084)
* add docs

* add impl

* fix checkstyle

* fix test

* add test

* fix checkstyle

* fix checkstyle

* fix test

* Address comments

* Address comments

* fix spelling

* fix docs
2021-04-20 17:10:43 -07:00
Charles Smith 09dcf6aa36
fix syntax error for loadstatus api (#11136) 2021-04-20 14:17:20 +08:00
Charles Smith b51632b0bf
Update security overview with additional recommendations (#11016)
* update security overview with additional recommendations for improved security

* address first set of review questions

* Update docs/operations/security-overview.md

* Update docs/operations/security-overview.md

* apply changes from review

* Update docs/operations/security-overview.md

Co-authored-by: Suneet Saldanha <suneet@apache.org>

* Update docs/operations/security-overview.md

Co-authored-by: Suneet Saldanha <suneet@apache.org>

* Update docs/operations/security-overview.md

Co-authored-by: Suneet Saldanha <suneet@apache.org>

* Update security-overview.md

fix additional comments & typos cc: @suneet-s, @jihoonsoon

Co-authored-by: Suneet Saldanha <suneet@apache.org>
2021-04-14 08:58:17 -07:00
zhangyue19921010 95b82dd325
Add missing API references for coordinator (#10967)
* add missing API references for coordinator

* add missing API references for coordinator

* add missing API references for coordinator

Co-authored-by: yuezhang <yuezhang@freewheel.tv>
2021-04-09 18:20:47 -07:00
sthetland fb6751fa45
Fix old broken link (#11048)
* link check fixes

* updated link target

* Update aggregations.md

* spelling error
2021-04-07 20:40:50 -07:00
zachjsh 8cf1e83543
Add parameter to loadstatus API to compute under-replication against cluster view (#11056)
* Add parameter to loadstatus API to compute under-replication against cluster view

This change adds a query parameter `computeUsingClusterView` to the loadstatus APIs.
If specified, the Coordinator computes under-replication for segments based on the
number of services available within the cluster that the segment can be replicated
on, instead of the replication count configured in the load rule. A default
load rule is created in all clusters that specifies that all segments should be
replicated 2 times. As replicas are forced to be on separate nodes in the cluster,
this causes the loadstatus API to report that there are under-replicated segments
when there is only 1 data server in the cluster. In this case, calling the loadstatus
API without this new query parameter will always result in a response indicating
under-replication of segments.

* * fix exception mapper

* * Address review comments

* * update external API docs

* Apply suggestions from code review

Co-authored-by: Charles Smith <38529548+techdocsmith@users.noreply.github.com>

* * update more external docs

* * update javadoc

* Apply suggestions from code review

Co-authored-by: Charles Smith <38529548+techdocsmith@users.noreply.github.com>

Co-authored-by: Charles Smith <38529548+techdocsmith@users.noreply.github.com>
2021-04-05 00:02:43 -04:00
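Note on #11056: the single-data-server case described in the commit message can be sketched as below. This is an illustration of the described behavior only, not the Coordinator's actual code; the function `under_replicated` and its parameters are hypothetical names, and capping the target at the number of available servers is this sketch's reading of "computed against the cluster view".

```python
def under_replicated(required_replicas, loaded_replicas, available_servers=None):
    """Return how many replicas of a segment are still missing.

    required_replicas: replica count from the load rule.
    loaded_replicas:   replicas currently loaded.
    available_servers: if given, emulates the cluster-view mode by capping
                       the target at the number of servers that could hold
                       a replica (one replica per server).
    """
    target = required_replicas
    if available_servers is not None:
        target = min(required_replicas, available_servers)
    return max(0, target - loaded_replicas)


# Default rule asks for 2 replicas, but the cluster has one data server.
without_cluster_view = under_replicated(2, 1)
with_cluster_view = under_replicated(2, 1, available_servers=1)
```

Without the cluster view the segment is reported as missing one replica indefinitely; with it, the report drops to zero because no second server exists to hold another copy.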
Clint Wylie 470d659ca0
add documentation for coordinator dynamic configuration (#11052) 2021-04-02 22:01:43 -07:00
Tushar Raj 6789ed0a05
Update reset-cluster.md (#10990)
fixed Error: Could not find or load main class org.apache.druid.cli.Main
2021-03-29 20:38:35 -07:00
Charles Smith d69533dbd9
First refactor of compaction (#10935)
* first pass compaction refactor. includes updated behavior for queryGranularity. removes duplicated doc

* fix links, typos, some reorganization

* fix spelling. TBD still there for work in progress

* updates tutorial examples, adds more clarification around compaction use cases

* add granularity spec to automatic compaction config

* final edits

* spelling fixes

* apply suggestions from review

* updates from review

* last edits

* move note

* clarify null

* fix links & spelling

* latest review

* edits to auto-compaction config

* add back rollup

* fix links & spelling

* Update compaction.md

add granularityspec to example
2021-03-24 11:41:44 -07:00
Charles Smith 573de3bc0d
clarify security requirements around HTTPInputSource (#10914)
* clarify security requirements around HTTPInputSource

* explicitly mention write/datasource in best practices. clarify that the ingestion task is the risk

* Update docs/operations/security-overview.md

Co-authored-by: Suneet Saldanha <suneet@apache.org>

Co-authored-by: Suneet Saldanha <suneet@apache.org>
2021-02-26 09:37:47 -08:00
zachjsh 67eff4110d
Improve Druid ldap auth documentation (#10915)
* Improve Druid ldap auth documentation

Improved the ldap auth docs by clarifying that the object classes and
attributes noted are specific to Microsoft Active Directory, and could
be different depending on the specific ldap server being used. Also
emphasized the importance of the memberOf field and noted that the
step about adding users to roles is only needed in certain circumstances.

* * add another note

* Apply suggestions from code review

Co-authored-by: sthetland <steve.hetland@imply.io>

* * simplify

* * Address review comments

Co-authored-by: sthetland <steve.hetland@imply.io>
2021-02-24 15:28:41 -08:00
sthetland 1e40f51e65
Fix example names of security artifacts in docs (#10882)
* replacing example names

* unrelated typos

* unintended changes

* a few more typo fixes
2021-02-16 14:58:50 -08:00