Docker Compose Configuration
The integration tests use Docker Compose to launch Druid clusters. Each test defines its own cluster depending on what is to be tested. Since a large amount of the definition is common, we use inheritance to simplify cluster definition.
Tests are split into categories so that they can run in parallel. Some of these categories use the same cluster configuration. To further reduce redundancy, test categories can share cluster configurations.
See also:
- Druid configuration which is done via Compose.
- Test configuration which tells tests about the cluster configuration.
- Docker compose specification
File Structure
Docker Compose files live in the druid-it-cases module (cases folder) in the cluster directory. There is a separate subdirectory for each cluster type (subset of test categories), plus a Common folder for shared files.
Cluster Directory
Each test category uses an associated cluster. In some cases, multiple tests use the same cluster definition. Each cluster is defined by a directory in $MODULE/cluster/$CLUSTER_NAME. The directory contains a variety of files, most of which are optional:
- docker-compose.yaml - Docker Compose file, if created explicitly.
- docker-compose.py - Docker Compose "template", if generated. The Python template format is preferred. (One of the docker-compose.* files is required.)
- verify.sh - Verify the environment for the cluster. Cloud tests require that a number of environment variables be set to pass keys and other setup to tests. (Optional)
- setup.sh - Additional cluster setup, such as populating the "shared" directory with test-specific items. (Optional)
The verify.sh and setup.sh scripts are sourced into one of the "master" scripts and can thus make use of environment variables already set:
- BASE_MODULE_DIR points to integration-tests-ex/cases where the "base" set of scripts and cluster definitions reside.
- MODULE_DIR points to the Maven module folder that contains the test.
- CATEGORY gives the name of the test category.
- DRUID_INTEGRATION_TEST_GROUP is the cluster name. Often the same as CATEGORY, but not always.
The set -e option is in effect, so any error in these scripts fails the test.
Shared Directory
Each test has a "shared" directory that is mounted into each container to hold things like logs, security files, etc. The directory is known as /shared within the container, and resides in target/<category> on the host. Even if two categories share a cluster configuration, they will have separate local versions of the shared directory. This is important to keep log files separate for each category.
Base Configurations
Test clusters run some number of third-party "infrastructure" containers, and some number of Druid service containers. For the most part, each of these services (in Compose terms) is similar from test to test. Compose provides an inheritance feature that we use to define base configurations.
- cluster/Common/dependencies.yaml defines external dependencies (MySQL, Kafka, ZK, etc.).
- cluster/Common/druid.yaml defines typical settings for each Druid service.
Test-specific configurations extend and customize the above.
Druid Configuration
Docker Compose passes information to Docker in the form of environment variables. The tests use a variation of the environment-variable-based configuration used in the public Docker image. That is, variables of the form druid_my_config are converted, by the image launch script, into properties of the form my.config. These properties are then written to a launch-specific runtime.properties file.
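The conversion described above can be sketched in Python. This is only an illustration, not the image's launch script (which is a shell script): the function names are hypothetical, and whether the druid prefix is kept in the property name is controlled by a flag here because the text above shows the result without it, while Druid property names normally begin with druid.

```python
def env_to_properties(env, strip_prefix=False):
    """Convert druid_* environment variables into property names.

    Hypothetical helper: the real launch script may handle more cases
    (special variables, quoting) than this sketch does.
    """
    props = {}
    for name, value in env.items():
        if not name.startswith("druid_"):
            continue  # only druid_* variables are treated as configuration
        if strip_prefix:
            name = name[len("druid_"):]
        # Underscores in the variable name become dots in the property name.
        props[name.replace("_", ".")] = value
    return props


def render_runtime_properties(props):
    """Render the properties as the text of a runtime.properties file."""
    return "".join(f"{key}={value}\n" for key, value in sorted(props.items()))


if __name__ == "__main__":
    example = {"druid_my_config": "my_value", "PATH": "/usr/bin"}
    print(render_runtime_properties(env_to_properties(example)), end="")
```

Variables without the druid_ prefix, such as PATH in the usage example, are ignored.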
Rather than have a test version of runtime.properties, we have a set of files that define properties as environment variables. All are located in cases/cluster/Common/environment-configs:
- common.env - Properties common to all services. This is the test equivalent to the common.runtime.properties file.
- <service>.env - Properties unique to one service. This is the test equivalent to the service/runtime.properties files.
MySQL Driver
Tests can use any MySQL driver, typically MySQL or MariaDB. The tests use MySQL by default. Choose a different driver by setting the MYSQL_DRIVER_CLASSNAME environment variable when running tests. The variable selects the driver both in the Druid server running in a container and in the test "clients".
Special Environment Variables
Druid properties can be a bit awkward and verbose in a test environment. A number of test-specific properties help:
- druid_standard_loadList - Common extension load list for all tests, in the form of a comma-delimited list of extensions (without the brackets). Defined in common.env.
- druid_test_loadList - A list of additional extensions to load for a specific test. Defined in the docker-compose.yaml file for that test category. Do not include quotes.
Example test-specific list:
druid_test_loadList=druid-azure-extensions,my-extension
The launch script combines the two lists, and adds the required brackets and quotes.
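As a rough illustration of that combining step, the following Python sketch joins the two comma-delimited lists and adds the brackets and quotes. The function name and the sample extension names in the usage example are assumptions; the actual launch script is a shell script and may differ in detail.

```python
import json


def build_load_list(standard, test=""):
    """Combine two comma-delimited extension lists into a bracketed, quoted list.

    Hypothetical helper that mirrors the behavior described in the text;
    it is not the launch script itself.
    """
    names = []
    for raw in (standard, test):
        for item in raw.split(","):
            item = item.strip()
            if item:  # skip empty entries, e.g. when no test list is set
                names.append(item)
    # json.dumps adds the brackets and the double quotes around each name.
    return json.dumps(names, separators=(",", ":"))


if __name__ == "__main__":
    print(build_load_list("mysql-metadata-storage",
                          "druid-azure-extensions,my-extension"))
```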
Test-Specific Cluster
Each test has a directory named cluster/<category>. Docker Compose uses this name as the cluster name which appears in the Docker Desktop UI. The folder contains the docker-compose.yaml file that defines the test cluster.
In the simplest case, the file just lists the services to run as extensions of the base services:
services:
  zookeeper:
    extends:
      file: ../Common/dependencies.yaml
      service: zookeeper
  broker:
    extends:
      file: ../Common/druid.yaml
      service: broker
...
Cluster Configuration
If a test wants to run two of some service (say Coordinator), then it can use the "standard" definition for only one of them and must fill in the details (especially distinct port numbers) for the second. (See HighAvailability for an example.)
By default, the container and internal host name is the same as the service name. Thus, a broker service resides in a broker container known as host broker on the Docker overlay network.
The service name is also usually the log file name. Thus broker logs to /target/<category>/logs/broker.log.
An environment variable DRUID_INSTANCE adds a suffix to the service name and causes the log file to be broker-one.log if the instance is one. The service name should have the full name broker-one.
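The naming rule can be summarized with a small Python sketch. The function is hypothetical and only restates the convention described above; it assumes the log directory is target/<category>/logs relative to the module.

```python
def log_file_path(category, service, instance=None):
    """Return the expected log path for a service (illustrative only).

    `instance` stands in for the value of DRUID_INSTANCE; when it is set,
    it is appended to the service name with a hyphen.
    """
    name = f"{service}-{instance}" if instance else service
    return f"target/{category}/logs/{name}.log"


if __name__ == "__main__":
    print(log_file_path("HighAvailability", "broker"))
    print(log_file_path("HighAvailability", "broker", instance="one"))
```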
Druid configuration comes from the common and service-specific environment files in cluster/Common/environment-configs. A test-specific service configuration can override any of these settings using the environment section. (See Druid Configuration for details.)
For special cases, the service can define its configuration in-line and
not load the standard settings at all.
Each service can override the Java options. However, in practice, the only options that actually change are those for memory. As a result, the memory settings reside in DRUID_SERVICE_JAVA_OPTS, which you can easily change on a service-by-service or test-by-test basis.
Debugging is enabled on port 8000 in the container. Each service that wishes to expose debugging must map that container port to a distinct host port.
The easiest way to understand the above is to look at a few examples.
Service Names
The Docker Compose file sets up an "overlay" network to connect the containers. Each is known via a host name taken from the service name. Thus "zookeeper" is the name of the ZK service and of the container that runs ZK. Use these names in configuration within each container.
Host Ports
Outside of the application network, containers are accessible only via the
host ports defined in the Docker Compose files. Thus, ZK is known as localhost:2181
to tests and other code running outside of Docker.
Test-Specific Configuration
In addition to the Druid configuration discussed above, the framework provides three ways to pass test-specific configuration to the tests. All of these methods override any configuration in the docker-compose or cluster env files.
The values here are passed into the Druid server as configuration values. The values apply to all services. (This mechanism does not allow service-specific values.) In all three approaches, use the druid_ environment variable form.
Precedence follows the order below, with the user file having the lowest priority and environment variables the highest.
User-specific ~/druid-it/<category>.env file
If you are debugging a test, you may need to provide values specific to your setup. Examples include user names, passwords, credentials, cloud buckets, etc. Put these in a file in your home directory (not the Druid development directory). Create a subdirectory ~/druid-it, then create a separate file for each category that you want to customize. Create entries for your information:
druid_cloud_bucket=MyBucket
Test-specific OVERRIDE_ENV file
Build scripts can pass values into Druid via a file. Set the OVERRIDE_ENV environment variable to the path of the file. Each line is formatted as above. The variable can be set on the command line:
OVERRIDE_ENV=/tmp/special.env ./cluster.sh up Category
It can also be set in Maven, or passed from the build environment, through Maven, to the script.
Environment variables
Normally the environment of the script that runs Druid is separate from the environment passed to the container. However, the launch script will copy across any variable that starts with druid_. The variable can be set on the command line:
druid_my_config=my_value ./cluster.sh up Category
It can also be set in Maven, or passed from the build environment, through Maven, to the script. This is the preferred way to pass environment-specific information from Travis into the test containers.
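The precedence among the three mechanisms can be sketched in Python as a sequence of dictionary updates, lowest priority first. This is an illustration under stated assumptions, not the framework's script: the function names are hypothetical, and skipping blank lines and lines starting with # is a convenience of the sketch rather than documented behavior.

```python
import os
from pathlib import Path


def parse_env_file(path):
    """Read key=value lines from an env file; a missing file yields no values."""
    values = {}
    path = Path(path)
    if not path.is_file():
        return values
    for line in path.read_text().splitlines():
        line = line.strip()
        # Assumption: blank lines and '#' comments are ignored.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip()
    return values


def resolve_overrides(category, environ=None, home=None):
    """Merge the three sources; later updates win over earlier ones."""
    environ = os.environ if environ is None else environ
    home = Path.home() if home is None else Path(home)
    merged = {}
    # 1. User-specific file (lowest priority).
    merged.update(parse_env_file(home / "druid-it" / f"{category}.env"))
    # 2. File named by OVERRIDE_ENV, if set.
    override = environ.get("OVERRIDE_ENV")
    if override:
        merged.update(parse_env_file(override))
    # 3. druid_* variables from the calling environment (highest priority).
    merged.update({k: v for k, v in environ.items() if k.startswith("druid_")})
    return merged
```

With this ordering, a value set as druid_my_config on the command line would replace the same key from either file.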
Define a Test Cluster
To define a test cluster, do the following:
- Define the overlay network.
- Extend the third-party services required (at least ZK and MySQL).
- Extend each Druid service needed. Add a depends_on for zookeeper and, for the Coordinator and Overlord, metadata.
- If you need multiple instances of the same service, extend that service twice, and define distinct names and port numbers.
- Add any test-specific environment configuration required.
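The steps above can be sketched as a Python structure that mirrors the layout of a Compose file. Everything here is illustrative: the helper names, the network name druid-it-net, the network settings, and the sample environment entry are assumptions, and the result is printed as JSON only to make the structure easy to inspect.

```python
import json

COMMON = "../Common"  # relative location of the shared base files


def extend(base_file, service, **extra):
    """Build one service entry that extends a base definition."""
    entry = {"extends": {"file": f"{COMMON}/{base_file}", "service": service}}
    entry.update(extra)  # e.g. depends_on, environment, ports
    return entry


def build_cluster():
    """Assemble a minimal cluster following the steps listed above."""
    services = {
        # Third-party services: at least ZK and the metadata store.
        "zookeeper": extend("dependencies.yaml", "zookeeper"),
        "metadata": extend("dependencies.yaml", "metadata"),
        # Coordinator and Overlord depend on both ZK and the metadata store.
        "coordinator": extend("druid.yaml", "coordinator",
                              depends_on=["zookeeper", "metadata"]),
        "overlord": extend("druid.yaml", "overlord",
                           depends_on=["zookeeper", "metadata"]),
        # Other Druid services depend on ZK only.
        "broker": extend("druid.yaml", "broker",
                         depends_on=["zookeeper"],
                         environment=["druid_test_loadList=my-extension"]),
    }
    # Hypothetical network definition; real settings are not shown here.
    return {"networks": {"druid-it-net": {}}, "services": services}


if __name__ == "__main__":
    print(json.dumps(build_cluster(), indent=2))
```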
Generating docker-compose.yaml Files
Each test has somewhat different needs for its test cluster. Yet, there is a great amount of consistency across test clusters and across services. The result, if we create files by hand, is a great amount of copy/paste redundancy, with all the problems that copy/paste implies.
As an alternative, the framework provides a way to generate the docker-compose.yaml file using a simple Python-based template mechanism. To use this:
- Omit the test cluster directory: cluster/<category>.
- Instead, create a template file: templates/<category>.py.
- The minimal file appears below:
from template import BaseTemplate, generate
generate(__file__, BaseTemplate())
The above will generate a "generic" cluster: one of each kind of service, with either a Middle Manager or Indexer depending on the USE_INDEXER env var.
You customize your specific cluster by creating a test-specific template class which overrides the various methods that build up the cluster. By using Python, we first build the cluster as a set of Python dictionaries and arrays, then we let PyYAML convert the objects to a YAML file. Many methods exist to help you populate the configuration tree. See any of the existing files for examples.
For example, you can:
- Add test-specific environment config to one, some or all services.
- Add or remove services.
- Create multiples of selected services.
The advantage is that, as Druid evolves and we change the basics, those changes are automatically propagated to all test clusters.
Once you've created your file, the test framework will re-generate the docker-compose.yaml file on each run to reflect any per-run customization. The generated file is found in target/cluster/<category>/docker-compose.yaml.
As with all generated files, resist the temptation to change the generated file; change the template instead.
The generated docker-compose.yaml file goes into a temporary folder: target/cluster/<category>. The script copies over the Common directory as well.