---
title: "Ozone Containers"
summary: Ozone uses containers extensively for testing. This page documents the usage and best practices of Ozone.
weight: 2
---
<!---
Licensed to the Apache Software Foundation (ASF) under one or more
contributor license agreements. See the NOTICE file distributed with
this work for additional information regarding copyright ownership.
The ASF licenses this file to You under the Apache License, Version 2.0
(the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
-->
Docker is heavily used in Ozone development, with three principal use-cases:

* __dev__:
    * We use docker to start local pseudo-clusters (docker provides a unified environment, but no image creation is required).
* __test__:
    * We create docker images from the dev branches to test Ozone in Kubernetes and other container orchestrator systems.
    * We provide _apache/ozone_ images for each release to make the evaluation of Ozone easier.
      These images are __not__ created __for production__ usage.
      <div class="alert alert-warning" role="alert">
      We <b>strongly</b> recommend that you create your own custom images when you
      deploy ozone into production using containers. Please treat all the standard
      shipped container images and k8s resources as examples and guides to help you
      customize your own deployment.
      </div>
* __production__:
    * We have documentation on how you can create your own docker image for your production cluster.

Let's check out each of the use-cases in more detail:
## Development
The Ozone artifact contains example docker-compose directories to make it easier to start an Ozone cluster on your local machine.
From distribution:
```bash
cd compose/ozone
docker-compose up -d
```
After a local build:
```bash
cd hadoop-ozone/dist/target/ozone-*/compose
docker-compose up -d
```
These environments are important tools for starting different types of Ozone clusters at any time.
To make sure the compose files stay up-to-date, we also provide acceptance test suites which start
the cluster and check its basic behaviour.
The acceptance tests are part of the distribution, and you can find the test definitions in the `smoketest` directory.
You can start the tests from any compose directory:
For example:
```bash
cd compose/ozone
./test.sh
```
### Implementation details
The `compose` tests are based on the apache/hadoop-runner docker image. The image itself does not contain
any Ozone jar files or binaries, just the helper scripts to start Ozone.
hadoop-runner provides a fixed environment to run Ozone everywhere, while the Ozone distribution itself
is mounted from the containing directory:
(Example docker-compose fragment)
```yaml
scm:
   image: apache/hadoop-runner:jdk11
   volumes:
      - ../..:/opt/hadoop
   ports:
      - 9876:9876
```
The containers are configured via environment variables, and because the same environment
variables should be set for each container, we maintain the list of environment variables
in a separate file:
```yaml
scm:
   image: apache/hadoop-runner:jdk11
   #...
   env_file:
      - ./docker-config
```
The docker-config file contains the list of the required environment variables:
```
OZONE-SITE.XML_ozone.om.address=om
OZONE-SITE.XML_ozone.om.http-address=om:9874
OZONE-SITE.XML_ozone.scm.names=scm
OZONE-SITE.XML_ozone.enabled=True
#...
```
As you can see, we use a naming convention: based on the name of the environment variable, the
appropriate hadoop config XML file (`ozone-site.xml` in our case) is generated by a
[script](https://github.com/apache/hadoop/tree/docker-hadoop-runner-latest/scripts) which is
included in the `hadoop-runner` base image.
The [entrypoint](https://github.com/apache/hadoop/blob/docker-hadoop-runner-latest/scripts/starter.sh)
of the `hadoop-runner` image contains a helper shell script which triggers this transformation and
can perform additional actions (e.g. initialize scm/om storage, download required keytabs, etc.)
based on environment variables.
## Test/Staging
The `docker-compose` based approach is recommended only for local tests, not for multi-node clusters.
To use containers on a multi-node cluster we need a container orchestrator like Kubernetes.
Kubernetes example files are included in the `kubernetes` folder.
*Please note*: all the provided images are based on the `hadoop-runner` image, which contains all the
required tools for testing in staging environments. For production we recommend creating your own
hardened image, with your own base image.
### Test the release
The release can be tested by deploying any of the example clusters:
```bash
cd kubernetes/examples/ozone
kubectl apply -f .
```
Please note that in this case the latest released container image will be downloaded from Docker Hub.
### Test the development build
To test a development build you can create your own image and upload it to your own docker registry:
```bash
mvn clean install -f pom.ozone.xml -DskipTests -Pdocker-build,docker-push -Ddocker.image=myregistry:9000/name/ozone
```
The configured image will be used in all the generated kubernetes resource files (`image:` keys are adjusted during the build):
```bash
cd kubernetes/examples/ozone
kubectl apply -f .
```
## Production
<div class="alert alert-danger" role="alert">
We <b>strongly</b> recommend using your own image in your production cluster,
and adjusting the base image, umask, security settings and user settings
according to your own requirements.
</div>
You can use the source of our development images as an example:

* [Base image](https://github.com/apache/hadoop/blob/docker-hadoop-runner-jdk11/Dockerfile)
* [Docker image](https://github.com/apache/hadoop/blob/trunk/hadoop-ozone/dist/src/main/docker/Dockerfile)

Most of the elements are optional helpers, but to use the provided example
kubernetes resources you may need the scripts from
[here](https://github.com/apache/hadoop/tree/docker-hadoop-runner-jdk11/scripts):

* The two python scripts convert environment variables to real hadoop XML config files.
* The `start.sh` script executes the python scripts (and other initialization) based on environment variables.
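As a starting point, a minimal custom image could look like the sketch below. Every detail in it (the base image, user name, and paths) is an illustrative assumption, not the official build:

```dockerfile
# Sketch only: pick a base image that matches your security requirements.
FROM openjdk:11-jre-slim
# Run Ozone as a dedicated, non-root user.
RUN useradd --system --create-home ozone
# Copy an Ozone distribution that you built or downloaded yourself.
COPY ozone/ /opt/ozone/
ENV PATH=/opt/ozone/bin:$PATH
USER ozone
WORKDIR /opt/ozone
```

You would typically add your own hardening steps here as well (umask, locked-down file permissions, certificates), as recommended above.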
## Containers
Ozone related container images and source locations:
<table class="table table-dark">
<thead>
<tr>
<th scope="col">#</th>
<th scope="col">Container</th>
<th scope="col">Repository</th>
<th scope="col">Branch</th>
<th scope="col">Base</th>
<th scope="col">Tags</th>
<th scope="col">Comments</th>
</tr>
</thead>
<tbody>
<tr>
<th scope="row">1</th>
<td>apache/ozone</td>
<td>https://github.com/apache/hadoop-docker-ozone</td>
<td>ozone-... </td>
<td>hadoop-runner</td>
<td>0.3.0,0.4.0,0.4.1</td>
<td>For each Ozone release we create a new release tag.</td>
</tr>
<tr>
<th scope="row">2</th>
<td>apache/hadoop-runner </td>
<td>https://github.com/apache/hadoop</td>
<td>docker-hadoop-runner</td>
<td>centos</td>
<td>jdk11,jdk8,latest</td>
<td>This is the base image used for testing Hadoop Ozone.
It provides a set of utilities that make it easy to run Ozone.</td>
</tr>
<!---tr>
<th scope="row">3</th>
<td>apache/ozone:build (WIP)</td>
<td>https://github.com/apache/hadoop-docker-ozone</td>
<td>ozone-build </td>
<td> </td>
<td> </td>
<td>TODO: Add more documentation here.</td>
</tr-->
</tbody>
</table>