Integration Testing
To run integration tests, you have to specify the druid cluster the tests should use.
Druid comes with the mvn profile integration-tests for setting up druid running in docker containers, and using that cluster to run the integration tests.
To use a druid cluster that is already running, use the mvn profile int-tests-config-file, which uses a configuration file describing the cluster.
Integration Testing Using Docker
Before starting, if you don't already have docker on your machine, install it as described in the Docker installation instructions. Ensure that you have at least 4GB of memory allocated to the docker engine. (You can verify it under Preferences > Advanced.)
Also set the DOCKER_IP environment variable to localhost on your system, as follows:
export DOCKER_IP=127.0.0.1
Running tests
To run all tests from a test group using docker and mvn, run the following command (the list of test groups can be found in integration-tests/src/test/java/org/apache/druid/tests/TestNGGroup.java):
mvn verify -P integration-tests -Dgroups=<test_group>
To run only a single test using mvn, run the following command:
mvn verify -P integration-tests -Dit.test=<test_name>
Add -rf :druid-integration-tests when running integration tests for the second time or later without changing the code of core modules in between, to skip up-to-date checks for the whole module dependency tree.
Integration tests can also be run with either Java 8 or Java 11 by adding -Djvm.runtime=# to the mvn command, where # is either 8 or 11.
Druid's configuration (using Docker) can be overridden by providing -Doverride.config.path=<PATH_TO_FILE>. The file must contain one property per line; each key must start with druid_ and be written in snake case.
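The rule above (one property per line, keys starting with druid_) can be checked before a test run with a small script. The following is a minimal sketch, not part of the Druid tooling: the function name check_override_file is hypothetical, and it assumes the file uses a key=value layout. It only checks the druid_ prefix and the presence of "=", not the snake-case format.

```shell
# Hypothetical helper: report lines in an override file whose key does not
# start with "druid_". Assumes a key=value layout, one property per line.
check_override_file() {
  file="$1"
  bad=0
  while IFS= read -r line || [ -n "$line" ]; do
    # Skip blank lines.
    [ -z "$line" ] && continue
    case "$line" in
      druid_*=*) ;;                              # key has the expected prefix
      *) echo "invalid line: $line"; bad=1 ;;    # report anything else
    esac
  done < "$file"
  # Exit status 0 when every line passed, 1 otherwise.
  return "$bad"
}
```

For example, check_override_file my-overrides.env && mvn verify -P integration-tests -Doverride.config.path=my-overrides.env -Dgroups=<test_group> would only start the run when every line passes the check (my-overrides.env is a placeholder file name).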
Running Tests Using A Configuration File for Any Cluster
Make sure that you have at least 6GB of memory available before you run the tests.
To run tests on any druid cluster that is already running, create a configuration file:
{
"broker_host": "<broker_ip>",
"broker_port": "<broker_port>",
"router_host": "<router_ip>",
"router_port": "<router_port>",
"indexer_host": "<indexer_ip>",
"indexer_port": "<indexer_port>",
"coordinator_host": "<coordinator_ip>",
"coordinator_port": "<coordinator_port>",
"middlemanager_host": "<middle_manager_ip>",
"zookeeper_hosts": "<comma-separated list of zookeeper_ip:zookeeper_port>",
"cloud_bucket": "<(optional) cloud_bucket for test data if running cloud integration test>",
"cloud_path": "<(optional) cloud_path for test data if running cloud integration test>"
}
Set the environment variable CONFIG_FILE to the name of the configuration file:
export CONFIG_FILE=<config file name>
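As a rough illustration of the two steps above, the sketch below writes a configuration file for a cluster whose services all run on one host and then points CONFIG_FILE at it. The function name write_cluster_config and its argument order are hypothetical; the keys are the ones listed above, with the optional cloud entries omitted. All hosts and ports are caller-supplied placeholders.

```shell
# Hypothetical helper: write a cluster configuration file for a cluster
# whose services all run on a single host.
# Arguments: output file, host, broker port, router port, indexer port,
#            coordinator port, zookeeper host:port list.
write_cluster_config() {
  out="$1"; host="$2"
  broker_port="$3"; router_port="$4"
  indexer_port="$5"; coordinator_port="$6"
  zk_hosts="$7"
  cat > "$out" <<EOF
{
  "broker_host": "$host",
  "broker_port": "$broker_port",
  "router_host": "$host",
  "router_port": "$router_port",
  "indexer_host": "$host",
  "indexer_port": "$indexer_port",
  "coordinator_host": "$host",
  "coordinator_port": "$coordinator_port",
  "middlemanager_host": "$host",
  "zookeeper_hosts": "$zk_hosts"
}
EOF
}
```

A call such as write_cluster_config /tmp/druid-it.json <host> <broker_port> <router_port> <indexer_port> <coordinator_port> <zk_host:zk_port>, followed by export CONFIG_FILE=/tmp/druid-it.json, would prepare the environment for the commands below (the /tmp path is only an example).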
To run all tests from a test group using mvn, run the following command (the list of test groups can be found in integration-tests/src/test/java/org/apache/druid/tests/TestNGGroup.java):
mvn verify -P int-tests-config-file -Dgroups=<test_group>
To run only a single test using mvn, run the following command:
mvn verify -P int-tests-config-file -Dit.test=<test_name>
Running a Test That Uses Cloud
The integration tests that index from cloud storage or use cloud storage as deep storage are not run as part of the integration test run discussed above. Running these tests requires you to provide your own cloud storage.
Currently, the integration test supports Google Cloud Storage, Amazon S3, and Microsoft Azure. These can be run by providing "gcs-deep-storage", "s3-deep-storage", or "azure-deep-storage" to -Dgroups for Google Cloud Storage, Amazon S3, and Microsoft Azure respectively. Note that only one group should be run per mvn command.
In addition to passing -Dgroups to the mvn command, the following will need to be provided:
1. Set the bucket and path for your test data. This can be done by setting -Ddruid.test.config.cloudBucket and -Ddruid.test.config.cloudPath in the mvn command or setting "cloud_bucket" and "cloud_path" in the config file.
2. Copy wikipedia_index_data1.json, wikipedia_index_data2.json, and wikipedia_index_data3.json located in integration-tests/src/test/resources/data/batch_index to your Cloud storage at the location set in step 1.
3. Provide -Doverride.config.path=<PATH_TO_FILE> with your Cloud credentials/configs set. See integration-tests/docker/environment-configs/override-examples/ directory for env vars to provide for each Cloud storage.
To run the Google Cloud Storage tests, in addition to the above, you will also have to:
- Provide -Dresource.file.dir.path=<PATH_TO_FOLDER> with the folder that contains the GOOGLE_APPLICATION_CREDENTIALS file
For example, to run the integration test for Google Cloud Storage:
mvn verify -P integration-tests -Dgroups=gcs-deep-storage -Doverride.config.path=<PATH_TO_FILE> -Dresource.file.dir.path=<PATH_TO_FOLDER> -Ddruid.test.config.cloudBucket=test-bucket -Ddruid.test.config.cloudPath=test-data-folder/
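Because only one cloud group should be run per mvn command, it can help to assemble the command from its parts and reject unknown groups early. The sketch below only prints the command rather than running it. The function name cloud_test_command is hypothetical; the group names and -D flags are the ones described above.

```shell
# Hypothetical helper: print (not run) the mvn command for one cloud group.
# Arguments: group, override config path, cloud bucket, cloud path.
cloud_test_command() {
  group="$1"; override="$2"; bucket="$3"; cloud_path="$4"
  # Accept only the three groups named in this section.
  case "$group" in
    gcs-deep-storage|s3-deep-storage|azure-deep-storage) ;;
    *) echo "unknown group: $group" >&2; return 1 ;;
  esac
  printf '%s %s %s %s\n' \
    "mvn verify -P integration-tests -Dgroups=$group" \
    "-Doverride.config.path=$override" \
    "-Ddruid.test.config.cloudBucket=$bucket" \
    "-Ddruid.test.config.cloudPath=$cloud_path"
}
```

For Google Cloud Storage, -Dresource.file.dir.path=<PATH_TO_FOLDER> still has to be appended to the printed command, as described above.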
Running a Test That Uses Hadoop
The integration test that indexes from hadoop is not run as part of the integration test run discussed above. This is because druid test clusters might not, in general, have access to hadoop. That's the case (for now, at least) when using the docker cluster set up by the integration-tests profile, so the hadoop test has to be run using a cluster specified in a configuration file.
The data file is integration-tests/src/test/resources/hadoop/batch_hadoop.data. Create a directory called batchHadoop1 in the hadoop file system (anywhere you want) and put batch_hadoop.data into that directory (as its only file).
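The directory setup described above can be sketched with the standard Hadoop CLI (hdfs dfs -mkdir -p and hdfs dfs -put). The function name stage_hadoop_data and the HDFS_CMD variable are hypothetical conveniences. HDFS_CMD defaults to hdfs and can be set to echo for a dry run that only prints the commands.

```shell
# Hypothetical helper: create <parent_dir>/batchHadoop1 in the Hadoop file
# system and copy the data file into it.
# Arguments: parent directory in the Hadoop file system, local data file.
stage_hadoop_data() {
  parent_dir="$1"
  data_file="$2"
  # HDFS_CMD is an assumption of this sketch; it defaults to the hdfs CLI.
  hdfs_cmd="${HDFS_CMD:-hdfs}"
  "$hdfs_cmd" dfs -mkdir -p "$parent_dir/batchHadoop1" &&
    "$hdfs_cmd" dfs -put "$data_file" "$parent_dir/batchHadoop1/"
}
```

Run from the root of the druid checkout, a call could look like stage_hadoop_data <parent_dir> integration-tests/src/test/resources/hadoop/batch_hadoop.data. The target directory should not contain any other files, since batch_hadoop.data must be its only file.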
Add this key to the configuration file (see above):
"hadoopTestDir": "<name_of_dir_containing_batchHadoop1>"
Run the test using mvn:
mvn verify -P int-tests-config-file -Dit.test=ITHadoopIndexTest
In some test environments, the machine where the tests need to be executed cannot access the outside internet, so mvn cannot be run. In that case, do the following instead of running the tests using mvn:
Compile druid and the integration tests
On a machine that can do mvn builds:
cd druid
mvn clean package
cd integration-tests
mvn dependency:copy-dependencies package
Put the compiled test code into your test cluster
Copy the integration-tests directory to the test cluster.
Set CLASSPATH
TDIR=<directory containing integration-tests>/target
VER=<version of druid you built>
export CLASSPATH="$TDIR/dependency/*:$TDIR/druid-integration-tests-$VER.jar:$TDIR/druid-integration-tests-$VER-tests.jar"
Run the test
java -Duser.timezone=UTC -Dfile.encoding=UTF-8 -Ddruid.test.config.type=configFile -Ddruid.test.config.configFile=<pathname of configuration file> org.testng.TestNG -testrunfactory org.testng.DruidTestRunnerFactory -testclass org.apache.druid.tests.hadoop.ITHadoopIndexTest
Writing a New Test
What should we cover in integration tests
Every end-user functionality provided by druid should have an integration test verifying its correctness.
Rules to be followed while writing a new integration test
Every Integration Test must follow these rules:
- Name of the test must start with a prefix "IT"
- A test should be independent of other tests
- Tests are to be written in TestNG style (http://testng.org/doc/documentation-main.html#methods)
- If a test loads some data it is the responsibility of the test to clean up the data from the cluster
How to use Guice Dependency Injection in a test
A test can use the helper and utility classes provided by the test framework to access the Coordinator, Broker, etc. To enable Guice dependency injection in a test, annotate the test class with the following annotation:
@Guice(moduleFactory = DruidTestModuleFactory.class)
This will tell the test framework that the test class needs to be constructed using guice.
Helper Classes provided
- IntegrationTestingConfig - configuration of the test
- CoordinatorResourceTestClient - httpclient for coordinator endpoints
- OverlordResourceTestClient - httpclient for indexer endpoints
- QueryResourceTestClient - httpclient for broker endpoints
Static Utility classes
- RetryUtil - provides methods to retry an operation until it succeeds, up to a configurable number of times
- FromFileTestQueryHelper - reads queries with expected results from a file, executes them, and verifies the results using ResultVerifier
Refer to ITIndexerTest for an example of how to use dependency injection.