Apache Druid web console
This is the Druid web console that serves as a data management interface for Druid.
Developing the console
Getting started
- You need to be within the `web-console` directory
- Install the modules with `npm install`
- Run `npm run compile` to compile the scss files (this usually needs to be done only once)
- Run `npm start` to start the console in development mode; it will proxy Druid requests to `localhost:8888`
Note: you can provide an environment variable to proxy to a different Druid host like so: `druid_host=1.2.3.4:8888 npm start`
Note: you can provide an environment variable to use webpack-bundle-analyzer as a plugin in the build script like so: `BUNDLE_ANALYZER_PLUGIN='TRUE' npm start`
To try the console in (say) coordinator mode you could run it as such: `druid_host=localhost:8081 npm start`
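
The environment variables above could be resolved roughly as in the following sketch. It is only an illustration: the helper names, the `http://` scheme, and the exact-match check on `'TRUE'` are assumptions and are not taken from `webpack.config.js`.

```typescript
// Hypothetical helpers; names and behavior are illustrative assumptions.
type Env = Record<string, string | undefined>;

// Default proxy target named in this README.
const DEFAULT_DRUID_HOST = 'localhost:8888';

// Returns the URL that Druid requests would be proxied to.
// Assumption: the host is reached over plain http.
function resolveProxyTarget(env: Env): string {
  const host = (env['druid_host'] || '').trim();
  return `http://${host || DEFAULT_DRUID_HOST}`;
}

// Assumption: the analyzer plugin is enabled only by the exact value 'TRUE'.
function bundleAnalyzerEnabled(env: Env): boolean {
  return env['BUNDLE_ANALYZER_PLUGIN'] === 'TRUE';
}
```

In a Node script, `process.env` could be passed as the `env` argument; taking it as a parameter keeps the helpers easy to test.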
Developing
You should use a TypeScript friendly IDE (such as WebStorm, or VS Code) to develop the web console.
The console relies on eslint (and various plugins), sass-lint, and prettier to enforce code style. If you are going to do any non-trivial development you should set up your IDE to automatically lint and fix your code as you make changes.
Configuring WebStorm
- Preferences | Languages & Frameworks | JavaScript | Code Quality Tools | ESLint
  - Select "Automatic ESLint Configuration"
  - Check "Run eslint --fix on save"
- Preferences | Languages & Frameworks | JavaScript | Prettier
  - Set "Run for files" to `{**/*,*}.{js,ts,jsx,tsx,css,scss}`
  - Check "On code reformat"
  - Check "On save"
Configuring VS Code
- Install the `dbaeumer.vscode-eslint` extension
- Install the `esbenp.prettier-vscode` extension
- Open User Settings (JSON) and set the following:

      "editor.defaultFormatter": "esbenp.prettier-vscode",
      "editor.formatOnSave": true,
      "editor.codeActionsOnSave": {
        "source.fixAll.eslint": true
      }
Auto-fixing manually
It is also possible to auto-fix and format code without making IDE changes by running the following script:
- `npm run autofix`: run code linters and formatter
You could also run fixers individually:
- `npm run eslint-fix`: run code linter and fix issues
- `npm run sasslint-fix`: run style linter and fix issues
- `npm run prettify`: reformat code and styles
Updating the list of license files
If you change the dependencies of the console in any way please run `script/licenses` (from the web-console directory).
It will analyze the changes and update the `../licenses` file as needed.
Please take care not to introduce dependencies on packages with Apache-incompatible licenses.
Running end-to-end tests
From the web-console directory:
- Build druid distribution: `script/druid build`
- Start druid cluster: `script/druid start`
- Run end-to-end tests: `npm run test-e2e`
- Stop druid cluster: `script/druid stop`
If you already have a druid cluster running on the standard ports, the steps to build/start/stop a druid cluster can be skipped.
Screenshots for debugging
`e2e-tests/util/debug.ts:saveScreenshotIfError()` is used to save a screenshot of the web console when the test fails. For example, if `e2e-tests/tutorial-batch.spec.ts` fails, it will create `load-data-from-local-disk-error-screenshot.png`.
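
The general pattern behind such a helper can be sketched as follows. This is not the code in `e2e-tests/util/debug.ts`: the function names, the `ScreenshotCapable` interface, and the rule that the file name is the prefix followed by `error-screenshot.png` are assumptions chosen to match the example file name above.

```typescript
// Minimal stand-in for the part of a browser page this sketch needs (assumed shape).
interface ScreenshotCapable {
  screenshot(options: { path: string }): Promise<unknown>;
}

// Hypothetical naming rule: prefix + "error-screenshot.png".
function errorScreenshotPath(filenamePrefix: string): string {
  return `${filenamePrefix}error-screenshot.png`;
}

// Runs the test body; on failure, tries to save a screenshot,
// then rethrows the original error so the test still fails.
async function runWithScreenshotOnError(
  filenamePrefix: string,
  page: ScreenshotCapable,
  testFn: () => Promise<void>,
): Promise<void> {
  try {
    await testFn();
  } catch (e) {
    try {
      await page.screenshot({ path: errorScreenshotPath(filenamePrefix) });
    } catch {
      // A failed screenshot should not mask the original test failure.
    }
    throw e;
  }
}
```

Rethrowing the original error matters here: the screenshot is a debugging aid and should never turn a failing test into a passing one.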
Disabling headless mode
Disabling headless mode while running the tests can be helpful. This can be done via the `DRUID_E2E_TEST_HEADLESS` environment variable, which defaults to `true`.
Like so: `DRUID_E2E_TEST_HEADLESS=false npm run test-e2e`
Running against alternate web console
The environment variable `DRUID_E2E_TEST_UNIFIED_CONSOLE_PORT` can be used to target a web console running on a non-default port (i.e., not port `8888`). For example, this environment variable can be used to target the development mode of the web console (started via `npm start`), which runs on port `18081`.
Like so: `DRUID_E2E_TEST_UNIFIED_CONSOLE_PORT=18081 npm run test-e2e`
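
As a rough illustration of how the two e2e environment variables and their documented defaults fit together, here is a minimal sketch. The `readE2eSettings` function and the `E2eSettings` type are hypothetical, and the parsing rules (only the string `false` disables headless mode, invalid ports are rejected) are assumptions rather than the behavior of the actual test utilities.

```typescript
type E2eEnv = Record<string, string | undefined>;

interface E2eSettings {
  headless: boolean;
  unifiedConsolePort: number;
}

// Hypothetical reader for the two variables described in this README.
function readE2eSettings(env: E2eEnv): E2eSettings {
  // Documented default: headless mode is on.
  // Assumption: only the literal "false" (any case) turns it off.
  const headlessRaw = (env['DRUID_E2E_TEST_HEADLESS'] || 'true').toLowerCase();

  // Documented default: the web console runs on port 8888.
  const portRaw = env['DRUID_E2E_TEST_UNIFIED_CONSOLE_PORT'] || '8888';
  const port = Number(portRaw);
  if (!Number.isInteger(port) || port < 1 || port > 65535) {
    throw new Error(`Invalid DRUID_E2E_TEST_UNIFIED_CONSOLE_PORT: ${portRaw}`);
  }

  return { headless: headlessRaw !== 'false', unifiedConsolePort: port };
}
```

For example, `readE2eSettings({ DRUID_E2E_TEST_HEADLESS: 'false', DRUID_E2E_TEST_UNIFIED_CONSOLE_PORT: '18081' })` would describe a visible browser pointed at the development-mode console.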
Running and debugging a single e2e test using Jest and Playwright
- Run: `jest --config jest.e2e.config.js e2e-tests/tutorial-batch.spec.ts`
- Debug: `PWDEBUG=console jest --config jest.e2e.config.js e2e-tests/tutorial-batch.spec.ts`
Description of the directory structure
As part of this directory:
- `assets/` - The images (and other assets) used within the console
- `e2e-tests/` - End-to-end tests for the console
- `lib/` - A place where keywords and generated docs live.
- `public/` - The compiled destination for the files powering this console
- `script/` - Some helper bash scripts for running this console
- `src/` - This directory (together with `lib`) constitutes all the source code for this console
List of non SQL data reading APIs used
GET /status
GET /druid/indexer/v1/supervisor?full
POST /druid/indexer/v1/worker
GET /druid/indexer/v1/workers
GET /druid/indexer/v1/tasks
GET /druid/coordinator/v1/loadqueue?simple
GET /druid/coordinator/v1/config
GET /druid/coordinator/v1/metadata/datasources?includeUnused
GET /druid/coordinator/v1/rules
GET /druid/coordinator/v1/config/compaction
GET /druid/coordinator/v1/tiers
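
To show how these paths combine with a Druid host, the sketch below builds full URLs for the `GET` endpoints listed above. The `buildReadUrl` helper and the `http://` scheme are assumptions for illustration; the console's own request code may be organized differently.

```typescript
// GET endpoints copied from the list above.
const READ_ENDPOINTS = [
  '/status',
  '/druid/indexer/v1/supervisor?full',
  '/druid/indexer/v1/workers',
  '/druid/indexer/v1/tasks',
  '/druid/coordinator/v1/loadqueue?simple',
  '/druid/coordinator/v1/config',
  '/druid/coordinator/v1/metadata/datasources?includeUnused',
  '/druid/coordinator/v1/rules',
  '/druid/coordinator/v1/config/compaction',
  '/druid/coordinator/v1/tiers',
] as const;

type ReadEndpoint = (typeof READ_ENDPOINTS)[number];

// Hypothetical helper: joins a host such as "localhost:8888" with one of the paths.
// Assumption: the host is reached over plain http.
function buildReadUrl(host: string, endpoint: ReadEndpoint): string {
  const base = host.replace(/\/+$/, ''); // drop any trailing slashes
  return `http://${base}${endpoint}`;
}
```

Typing the endpoint as a union of the listed paths means a misspelled path is caught at compile time rather than as a failed request.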