# Docker Compose Configuration

The integration tests use Docker Compose to launch Druid clusters. Each test defines its own cluster depending on what is to be tested. Since a large amount of the definition is common, we use inheritance to simplify cluster definition.

Tests are split into categories so that they can run in parallel. Some of these categories use the same cluster configuration. To further reduce redundancy, test categories can share cluster configurations.

See also:

* [Druid configuration](druid-config.md) which is done via Compose.
* [Test configuration](test-config.md) which tells tests about the cluster configuration.
* [Docker compose specification](https://github.com/compose-spec/compose-spec/blob/master/spec.md)

## File Structure

Docker Compose files live in the `druid-it-cases` module (`cases` folder) in the `cluster` directory. There is a separate subdirectory for each cluster type (subset of test categories), plus a `Common` folder for shared files.

### Cluster Directory

Each test category uses an associated cluster. In some cases, multiple tests use the same cluster definition. Each cluster is defined by a directory in `$MODULE/cluster/$CLUSTER_NAME`. The directory contains a variety of files, most of which are optional:

* `docker-compose.yaml` - Docker Compose file, if created explicitly.
* `docker-compose.py` - Docker Compose "template", if generated. The Python template format is preferred. (One of the `docker-compose.*` files is required.)
* `verify.sh` - Verify the environment for the cluster. Cloud tests require that a number of environment variables be set to pass keys and other setup to tests. (Optional)
* `setup.sh` - Additional cluster setup, such as populating the "shared" directory with test-specific items.
  (Optional)

The `verify.sh` and `setup.sh` scripts are sourced into one of the "master" scripts and can thus make use of environment variables already set:

* `BASE_MODULE_DIR` points to `integration-tests-ex/cases` where the "base" set of scripts and cluster definitions reside.
* `MODULE_DIR` points to the Maven module folder that contains the test.
* `CATEGORY` gives the name of the test category.
* `DRUID_INTEGRATION_TEST_GROUP` is the cluster name. Often the same as `CATEGORY`, but not always.

The `set -e` option is in effect so that any error fails the test.

## Shared Directory

Each test has a "shared" directory that is mounted into each container to hold things like logs, security files, etc. The directory is known as `/shared` within the container, and resides in `target/`. Even if two categories share a cluster configuration, they will have separate local versions of the shared directory. This is important to keep log files separate for each category.

## Base Configurations

Test clusters run some number of third-party "infrastructure" containers, and some number of Druid service containers. For the most part, each of these services (in Compose terms) is similar from test to test. Compose provides [an inheritance feature](https://github.com/compose-spec/compose-spec/blob/master/spec.md#extends) that we use to define base configurations.

* `cluster/Common/dependencies.yaml` defines external dependencies (MySQL, Kafka, ZK, etc.)
* `cluster/Common/druid.yaml` defines typical settings for each Druid service.

Test-specific configurations extend and customize the above.

### Druid Configuration

Docker Compose passes information to Docker in the form of environment variables. The tests use a variation of the environment-variable-based configuration used in the [public Docker image](https://druid.apache.org/docs/latest/tutorials/docker.html).
That is, variables of the form `druid_my_config` are converted, by the image launch script, into properties of the form `my.config`. These properties are then written to a launch-specific `runtime.properties` file.

Rather than have a test version of `runtime.properties`, we have a set of files that define properties as environment variables. All are located in `cases/cluster/Common/environment-configs`:

* `common.env` - Properties common to all services. This is the test equivalent to the `common.runtime.properties` file.
* `.env` - Properties unique to one service. This is the test equivalent to the `service/runtime.properties` files.

### MySQL Driver

Unit tests can use any MySQL driver, typically MySQL or MariaDB. The tests use MySQL by default. Choose a different driver by setting the `MYSQL_DRIVER_CLASSNAME` environment variable when running tests. The variable selects the driver both in the Druid server running in a container, and in the test "clients".

### Special Environment Variables

Druid properties can be a bit awkward and verbose in a test environment. A number of test-specific properties help:

* `druid_standard_loadList` - Common extension load list for all tests, in the form of a comma-delimited list of extensions (without the brackets). Defined in `common.env`.
* `druid_test_loadList` - A list of additional extensions to load for a specific test. Defined in the `docker-compose.yaml` file for that test category. Do not include quotes.

Example test-specific list:

```text
druid_test_loadList=druid-azure-extensions,my-extension
```

The launch script combines the two lists, and adds the required brackets and quotes.

## Test-Specific Cluster

Each test has a directory named `cluster/`. Docker Compose uses this name as the cluster name which appears in the Docker desktop UI. The folder contains the `docker-compose.yaml` file that defines the test cluster.
In the simplest case, the file just lists the services to run as extensions of the base services:

```text
services:
  zookeeper:
    extends:
      file: ../Common/dependencies.yaml
      service: zookeeper

  broker:
    extends:
      file: ../Common/compose/druid.yaml
      service: broker
...
```

## Cluster Configuration

If a test wants to run two of some service (say Coordinator), then it can use the "standard" definition for only one of them and must fill in the details (especially distinct port numbers) for the second. (See `HighAvailability` for an example.)

By default, the container and internal host name is the same as the service name. Thus, a `broker` service resides in a `broker` container known as host `broker` on the Docker overlay network. The service name is also usually the log file name. Thus `broker` logs to `/target//logs/broker.log`.

An environment variable `DRUID_INSTANCE` adds a suffix to the service name and causes the log file to be `broker-one.log` if the instance is `one`. In that case, the service should be given the full name `broker-one`.

Druid configuration comes from the common and service-specific environment files in `/compose/environment-config`. A test-specific service configuration can override any of these settings using the `environment` section. (See [Druid Configuration](druid-config.md) for details.) For special cases, the service can define its configuration in-line and not load the standard settings at all.

Each service can override the Java options. However, in practice, the only options that actually change are those for memory. As a result, the memory settings reside in `DRUID_SERVICE_JAVA_OPTS`, which you can easily change on a service-by-service or test-by-test basis.

Debugging is enabled on port 8000 in the container. Each service that wishes to expose debugging must map that container port to a distinct host port.

The easiest way to understand the above is to look at a few examples.

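
The naming convention for service instances can be sketched in a few lines of Python. The function names below are hypothetical, chosen only for illustration; they are not part of the test framework. The sketch simply restates the rule that `DRUID_INSTANCE` adds a suffix to the service name and that the log file takes the resulting name.

```python
from typing import Optional


def full_service_name(service: str, instance: Optional[str] = None) -> str:
    """Service name with the optional DRUID_INSTANCE suffix appended."""
    return f"{service}-{instance}" if instance else service


def log_file_name(service: str, instance: Optional[str] = None) -> str:
    """Log file name for a service, following the convention described above."""
    return full_service_name(service, instance) + ".log"


if __name__ == "__main__":
    print(log_file_name("broker"))         # broker.log
    print(log_file_name("broker", "one"))  # broker-one.log
```

A second Coordinator with instance `two` would, under the same rule, be named `coordinator-two` and log to `coordinator-two.log`.
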

## Service Names

The Docker Compose file sets up an "overlay" network to connect the containers. Each container is known via a host name taken from the service name. Thus "zookeeper" is the name of the ZK service and of the container that runs ZK. Use these names in configuration within each container.

### Host Ports

Outside of the application network, containers are accessible only via the host ports defined in the Docker Compose files. Thus, ZK is known as `localhost:2181` to tests and other code running outside of Docker.

## Test-Specific Configuration

In addition to the Druid configuration discussed above, the framework provides three ways to pass test-specific configuration to the tests. All of these methods override any configuration in the `docker-compose` or cluster `env` files.

The values here are passed into the Druid server as configuration values. The values apply to all services. (This mechanism does not allow service-specific values.) In all three approaches, use the `druid_` environment variable form.

Precedence is in the order below, with the user file having the lowest priority and environment variables the highest.

### User-specific `~/druid-it/`.

* Instead, create a template file: `templates/.py`.
* The minimal file appears below:

```python
from template import BaseTemplate, generate

# Generate the default ("generic") cluster with no customization.
generate(__file__, BaseTemplate())
```

The above will generate a "generic" cluster: one of each kind of service, with either a Middle Manager or Indexer depending on the `USE_INDEXER` env var. You customize your specific cluster by creating a test-specific template class which overrides the various methods that build up the cluster. By using Python, we first build the cluster as a set of Python dictionaries and arrays, then we let [PyYAML](https://pyyaml.org/wiki/PyYAMLDocumentation) convert the objects to a YAML file. Many methods exist to help you populate the configuration tree. See any of the existing files for examples.
For example, you can:

* Add test-specific environment config to one, some or all services.
* Add or remove services.
* Create multiples of selected services.

The advantage is that, as Druid evolves and we change the basics, those changes are automatically propagated to all test clusters.

Once you've created your file, the test framework will re-generate the `docker-compose.yaml` file on each run to reflect any per-run customization. The generated file is found in `target/cluster//docker-compose.yaml`. As with all generated files, resist the temptation to change the generated file; change the template instead.

The generated `docker-compose.yaml` file goes into a temporary folder: `target/cluster/`. The script copies over the `Common` directory as well.
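
The idea of building the cluster as Python dictionaries before writing it out can be sketched as follows. This is only an illustration under stated assumptions: the helper names (`extend_service`, `build_cluster`, `add_env`, `remove_service`) are hypothetical and are not the methods of the framework's `template` module, and the base-file paths are illustrative. To stay dependency-free, the sketch prints the structure as JSON rather than calling PyYAML.

```python
import json


def extend_service(base_file, service):
    """Return a service definition that inherits from a base Compose file."""
    return {"extends": {"file": base_file, "service": service}}


def build_cluster():
    """Build a small cluster tree as plain dictionaries (illustrative paths)."""
    return {
        "services": {
            "zookeeper": extend_service("../Common/dependencies.yaml", "zookeeper"),
            "coordinator": extend_service("../Common/druid.yaml", "coordinator"),
            "broker": extend_service("../Common/druid.yaml", "broker"),
        }
    }


def add_env(cluster, service_names, key, value):
    """Add one environment variable to each of the named services."""
    for name in service_names:
        env = cluster["services"][name].setdefault("environment", {})
        env[key] = value


def remove_service(cluster, name):
    """Drop a service from the cluster if it is present."""
    cluster["services"].pop(name, None)


if __name__ == "__main__":
    cluster = build_cluster()
    # Add a test-specific extension list to the Druid services only.
    add_env(cluster, ["coordinator", "broker"],
            "druid_test_loadList", "druid-azure-extensions,my-extension")
    remove_service(cluster, "broker")
    print(json.dumps(cluster, indent=2))
```

Because the tree is ordinary Python data, each customization is a small, local mutation, while the shared base definitions stay in one place.
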