HDDS-2002. Update documentation for 0.4.1 release.

Signed-off-by: Anu Engineer <aengineer@apache.org>
This commit is contained in:
Nanda kumar 2019-08-21 22:47:41 +05:30 committed by Anu Engineer
parent 0b796754b9
commit b661dcf563
25 changed files with 269 additions and 258 deletions

View File

@ -25,8 +25,9 @@ Docker is heavily used in Ozone development with three principal use-cases:
* __dev__:
* We use docker to start local pseudo-clusters (docker provides unified environment, but no image creation is required)
* __test__:
* We create docker images from the dev branches to test ozone in kubernetes and other container orchestrator systems
* We provide _apache/ozone_ images for each release to make it easier to evaluate Ozone.
These images are __not__ created __for production__ usage.
<div class="alert alert-warning" role="alert">
We <b>strongly</b> recommend that you create your own custom images when you
@ -36,7 +37,7 @@ shipped container images and k8s resources as examples and guides to help you
</div>
* __production__:
* We have documentation on how you can create your own docker image for your production cluster.
Let's check out each of the use-cases in more detail:
@ -46,38 +47,41 @@ Ozone artifact contains example docker-compose directories to make it easier to
From the distribution:
```bash
cd compose/ozone
docker-compose up -d
```
After a local build:
```bash
cd hadoop-ozone/dist/target/ozone-*/compose
docker-compose up -d
```
These environments are very important tools for starting different types of Ozone clusters at any time.
To be sure that the compose files are up-to-date, we also provide acceptance test suites which start
the cluster and check the basic behaviour.
The acceptance tests are part of the distribution, and you can find the test definitions in the `smoketest` directory.
You can start the tests from any compose directory:
For example:
```bash
cd compose/ozone
./test.sh
```
### Implementation details
`compose` tests are based on the apache/hadoop-runner docker image. The image itself does not contain
any Ozone jar file or binary, just the helper scripts to start ozone.
hadoop-runner provides a fixed environment to run Ozone everywhere, but the ozone distribution itself
is mounted from the enclosing directory:
(Example docker-compose fragment)
@ -91,7 +95,9 @@ hadoop-runner provides a fixed environment to run Ozone everywhere, but the ozone
```
The containers are configured based on environment variables, but because the same environment
variables should be set for each container we maintain the list of the environment variables
in a separate file:
```
scm:
@ -111,23 +117,32 @@ OZONE-SITE.XML_ozone.enabled=True
#...
```
As you can see we use a naming convention. Based on the name of the environment variable, the
appropriate hadoop config XML (`ozone-site.xml` in our case) will be generated by a
[script](https://github.com/apache/hadoop/tree/docker-hadoop-runner-latest/scripts) which is
included in the `hadoop-runner` base image.
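For illustration, the mechanics of this convention can be sketched in plain shell (a minimal sketch of the idea, not the actual script from the image):

```shell
# Minimal sketch of the naming convention (illustrative, not the real script).
# "OZONE-SITE.XML_ozone.enabled=True": the prefix before the first underscore
# selects the target file; the remainder is the property name and value.
env_var='OZONE-SITE.XML_ozone.enabled=True'

file=$(printf '%s' "${env_var%%_*}" | tr 'A-Z' 'a-z')  # -> ozone-site.xml
pair=${env_var#*_}                                     # -> ozone.enabled=True
name=${pair%%=*}
value=${pair#*=}

printf '<property><name>%s</name><value>%s</value></property>\n' "$name" "$value"
```

The actual generator script additionally collects all variables with the same prefix and writes out a complete config file.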
The [entrypoint](https://github.com/apache/hadoop/blob/docker-hadoop-runner-latest/scripts/starter.sh)
of the `hadoop-runner` image contains a helper shell script which triggers this transformation and
can do additional actions (e.g. initialize scm/om storage, download required keytabs, etc.)
based on environment variables.
## Test/Staging
The `docker-compose` based approach is recommended only for local testing, not for multi-node clusters.
To use containers on a multi-node cluster we need a Container Orchestrator like Kubernetes.
Kubernetes example files are included in the `kubernetes` folder.
*Please note*: all the provided images are based on the `hadoop-runner` image which contains all the
required tools for testing in staging environments. For production we recommend creating your own,
hardened image with your own base image.
### Test the release
The release can be tested by deploying any of the example clusters:
```bash
cd kubernetes/examples/ozone
kubectl apply -f .
```
@ -139,13 +154,13 @@ Please note that in this case the latest released container will be downloaded fr
To test a development build you can create your own image and upload it to your own docker registry:
```bash
mvn clean install -f pom.ozone.xml -DskipTests -Pdocker-build,docker-push -Ddocker.image=myregistry:9000/name/ozone
```
The configured image will be used in all the generated kubernetes resource files (`image:` keys are adjusted during the build).
```bash
cd kubernetes/examples/ozone
kubectl apply -f .
```
@ -160,10 +175,12 @@ adjust base image, umask, security settings, user settings according to your own
You can use the source of our development images as an example:
* [Base image](https://github.com/apache/hadoop/blob/docker-hadoop-runner-jdk11/Dockerfile)
* [Docker image](https://github.com/apache/hadoop/blob/trunk/hadoop-ozone/dist/src/main/docker/Dockerfile)
Most of the elements are optional and just helper functions, but to use the provided example
kubernetes resources you may need the scripts from
[here](https://github.com/apache/hadoop/tree/docker-hadoop-runner-jdk11/scripts)
* The two python scripts convert environment variables to real hadoop XML config files
* The start.sh executes the python scripts (and other initialization) based on environment variables.
@ -205,7 +222,7 @@ Ozone related container images and source locations:
<td>This is the base image used for testing Hadoop Ozone.
This is a set of utilities that make it easy for us to run ozone.</td>
</tr>
<!---tr>
<th scope="row">3</th>
<td>apache/ozone:build (WIP)</td>
<td>https://github.com/apache/hadoop-docker-ozone</td>
@ -213,6 +230,6 @@ Ozone related container images and source locations:
<td> </td>
<td> </td>
<td>TODO: Add more documentation here.</td>
</tr-->
</tbody>
</table>

View File

@ -22,7 +22,9 @@ weight: 4
limitations under the License.
-->
In the `compose` directory of the ozone distribution there are multiple pseudo-cluster setups which
can be used to run Ozone in different ways (for example: secure cluster, with tracing enabled,
with prometheus etc.).
If the usage is not documented in a specific directory, the default usage is the following:
@ -31,8 +33,7 @@ cd compose/ozone
docker-compose up -d
```
The data of the container is ephemeral and deleted together with the docker volumes.
```bash
docker-compose down
```

View File

@ -56,7 +56,7 @@ To start ozone with HDFS you should start the following components:
2. HDFS Datanode (from the Hadoop distribution with the plugin on the
classpath from the Ozone distribution)
3. Ozone Manager (from the Ozone distribution)
4. Storage Container Manager (from the Ozone distribution)
Please check the log of the datanode to see whether the HDDS/Ozone plugin has started.
The log of the datanode should contain something like this:

View File

@ -36,7 +36,7 @@ actual data streams. This is the default Storage container format. From
Ozone's perspective, a container is a protocol spec; the actual storage layout
does not matter. In other words, it is trivial to extend or bring new
container layouts. Hence this should be treated as a reference implementation
of containers under Ozone.
## Understanding Ozone Blocks and Containers
@ -51,13 +51,13 @@ shows the logical layout of an Ozone block.
The container ID lets the clients discover the location of the container. The
authoritative information about where a container is located is with the
Storage Container Manager (SCM). In most cases, the container location will be
cached by Ozone Manager and will be returned along with the Ozone blocks.
Once the client is able to locate the container, that is, understand which
data nodes contain this container, the client will connect to the datanode
and read the data stream specified by _Container ID:Local ID_. In other
words, the local ID serves as index into the container which describes what
data stream we want to read from.
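As a toy illustration of the addressing scheme described above (not actual Ozone client code), a block address splits into its two parts like this:

```shell
# Toy illustration of the Ozone block address "Container ID:Local ID".
# The container ID is resolved (via SCM, or the OM cache) to the datanodes
# holding the container; the local ID indexes the data stream inside it.
block_address='105:12'   # hypothetical container ID 105, local ID 12

container_id=${block_address%%:*}
local_id=${block_address##*:}

echo "container=$container_id local=$local_id"
```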

View File

@ -23,7 +23,7 @@ summary: Storage Container Manager or SCM is the core metadata service of Ozone
Storage container manager provides multiple critical functions for the Ozone
cluster. SCM acts as the cluster manager, Certificate authority, Block
manager and the Replica manager.
{{<card title="Cluster Management" icon="tasks">}}
SCM is in charge of creating an Ozone cluster. When an SCM is booted up via the <kbd>init</kbd> command, SCM creates the cluster identity and root certificates needed for the SCM certificate authority. SCM manages the life cycle of a data node in the cluster.

View File

@ -56,7 +56,7 @@ Ozone.
![FunctionalOzone](FunctionalOzone.png)
Any distributed system can be viewed from different perspectives. One way to
look at Ozone is to imagine it as Ozone Manager, a name space service built on
top of HDDS, a distributed block store.
@ -67,8 +67,8 @@ Another way to visualize Ozone is to look at the functional layers; we have a
We have a data storage layer, which is basically the data nodes and they are
managed by SCM.
The replication layer, provided by Ratis, is used to replicate metadata (OM and SCM)
and also used for consistency when data is modified at the
data nodes.
We have a management server called Recon, that talks to all other components

View File

@ -21,14 +21,14 @@ summary: Ozone Manager is the principal name space service of Ozone. OM manages
limitations under the License.
-->
Ozone Manager (OM) is the namespace manager for Ozone.
This means that when you want to write some data, you ask Ozone
Manager for a block and Ozone Manager gives you a block and remembers that
information. When you want to read that file back, you need to find the
address of the block and Ozone Manager returns it to you.
Ozone Manager also allows users to organize keys under a volume and bucket.
Volumes and buckets are part of the namespace and managed by Ozone Manager.
Each ozone volume is the root of an independent namespace under OM.
@ -57,17 +57,17 @@ understood if we trace what happens during a key write and key read.
* To write a key to Ozone, a client tells Ozone manager that it would like to
write a key into a bucket that lives inside a specific volume. Once Ozone
Manager determines that you are allowed to write a key to the specified bucket,
OM needs to allocate a block for the client to write data.
* To allocate a block, Ozone Manager sends a request to Storage Container
Manager (SCM); SCM is the manager of data nodes. SCM picks three data nodes
into which the client can write data. SCM allocates the block and returns the
block ID to Ozone Manager.
* Ozone manager records this block information in its metadata and returns the
block and a block token (a security permission to write data to the block)
to the client.
* The client uses the block token to prove that it is allowed to write data to
the block and writes data to the data node.
@ -82,6 +82,6 @@ Ozone manager.
* Key reads are simpler: the client requests the block list from the Ozone
Manager.
* Ozone Manager will return the block list and block tokens which
allow the client to read the data from data nodes.
* Client connects to the data node and presents the block token and reads
the data from the data node.

View File

@ -74,21 +74,21 @@ It is possible to pass an array of arguments to the createVolume by creating vol
Once you have a volume, you can create buckets inside the volume.
{{< highlight java >}}
// Let us create a bucket called videos.
assets.createBucket("videos");
OzoneBucket video = assets.getBucket("videos");
{{< /highlight >}}
At this point we have a usable volume and a bucket. Our volume is called _assets_ and bucket is called _videos_.
Now we can create a Key.
### Reading and Writing a Key
With a bucket object the users can now read and write keys. The following code reads a video called intro.mp4 from the local disk and stores it in the _videos_ bucket that we just created.
{{< highlight java >}}
// read data from the file, this is a user provided function.
byte [] videoData = readFile("intro.mp4");

View File

@ -21,7 +21,7 @@ summary: Hadoop Compatible file system allows any application that expects an HD
limitations under the License.
-->
The Hadoop compatible file system interface allows storage backends like Ozone
to be easily integrated into Hadoop eco-system. Ozone file system is a
Hadoop compatible file system.
@ -36,7 +36,7 @@ ozone sh volume create /volume
ozone sh bucket create /volume/bucket
{{< /highlight >}}
Once this is created, please make sure that the bucket exists via the _list volume_ or _list bucket_ commands.
Please add the following entries to the core-site.xml.
@ -45,6 +45,10 @@ Please add the following entry to the core-site.xml.
<name>fs.o3fs.impl</name>
<value>org.apache.hadoop.fs.ozone.OzoneFileSystem</value>
</property>
<property>
<name>fs.AbstractFileSystem.o3fs.impl</name>
<value>org.apache.hadoop.fs.ozone.OzFs</value>
</property>
<property>
<name>fs.defaultFS</name>
<value>o3fs://bucket.volume</value>

View File

@ -26,7 +26,7 @@ Ozone provides S3 compatible REST interface to use the object store data with an
## Getting started
S3 Gateway is a separate component which provides the S3 compatible APIs. It should be started in addition to the regular Ozone components.
You can start a docker based cluster, including the S3 gateway, from the release package.
@ -93,7 +93,7 @@ If security is not enabled, you can *use* **any** AWS_ACCESS_KEY_ID and AWS_SECR
If security is enabled, you can get the key and the secret with the `ozone s3 getsecret` command (*kerberos based authentication is required).
```bash
kinit -kt /etc/security/keytabs/testuser.keytab testuser/scm@EXAMPLE.COM
ozone s3 getsecret
awsAccessKey=testuser/scm@EXAMPLE.COM
@ -103,7 +103,7 @@ awsSecret=c261b6ecabf7d37d5f9ded654b1c724adac9bd9f13e247a235e567e8296d2999
Now, you can use the key and the secret to access the S3 endpoint:
```bash
export AWS_ACCESS_KEY_ID=testuser/scm@EXAMPLE.COM
export AWS_SECRET_ACCESS_KEY=c261b6ecabf7d37d5f9ded654b1c724adac9bd9f13e247a235e567e8296d2999
aws s3api --endpoint http://localhost:9878 create-bucket --bucket bucket1
@ -116,7 +116,7 @@ aws s3api --endpoint http://localhost:9878 create-bucket --bucket bucket1
To show the storage location of a S3 bucket, use the `ozone s3 path <bucketname>` command.
```bash
aws s3api --endpoint-url http://localhost:9878 create-bucket --bucket=bucket1
ozone s3 path bucket1
@ -128,23 +128,23 @@ Ozone FileSystem Uri is : o3fs://bucket1.s3thisisakey
### AWS Cli
The `aws` CLI can be used by specifying the custom REST endpoint.
```bash
aws s3api --endpoint http://localhost:9878 create-bucket --bucket buckettest
```
Or
```bash
aws s3 ls --endpoint http://localhost:9878 s3://buckettest
```
### S3 Fuse driver (goofys)
Goofys is a S3 FUSE driver. It could be used to mount any Ozone bucket as a posix file system.
```bash
goofys --endpoint http://localhost:9878 bucket1 /mount/bucket1
```

View File

@ -32,28 +32,29 @@ compatible metrics endpoint where all the available hadoop metrics are published
## Monitoring with prometheus
* To enable the Prometheus metrics endpoint you need to add a new configuration to the `ozone-site.xml` file:
```xml
<property>
<name>hdds.prometheus.endpoint.enabled</name>
<value>true</value>
</property>
```
_Note_: for a Docker compose based pseudo cluster, add the \
`OZONE-SITE.XML_hdds.prometheus.endpoint.enabled=true` line to the `docker-config` file.
* Restart the Ozone Manager and Storage Container Manager and check the prometheus endpoints:
* http://scm:9874/prom
* http://ozoneManager:9876/prom
* Create a prometheus.yaml configuration with the previous endpoints:
```yaml
global:
scrape_interval: 15s
scrape_configs:
- job_name: ozone
@ -64,20 +65,21 @@ scrape_configs:
- "ozoneManager:9874"
```
* Start prometheus from the directory where you have the prometheus.yaml file:
```bash
prometheus
```
* Check the active targets in the prometheus web-ui:
http://localhost:9090/targets
![Prometheus target page example](prometheus.png)
* Check any metrics on the prometheus web ui.\
For example:
http://localhost:9090/graph?g0.range_input=1h&g0.expr=om_metrics_num_key_allocate&g0.tab=1

View File

@ -46,13 +46,13 @@ You also need the following:
First of all, create a docker image with the Spark image creator.
Execute the following from the Spark distribution:
```bash
./bin/docker-image-tool.sh -r myrepo -t 2.4.0 build
```
_Note_: if you use Minikube add the `-m` flag to use the docker daemon of the Minikube image:
```bash
./bin/docker-image-tool.sh -m -r myrepo -t 2.4.0 build
```
@ -64,18 +64,22 @@ Create a new directory for customizing the created docker image.
Copy the `ozone-site.xml` from the cluster:
```bash
kubectl cp om-0:/opt/hadoop/etc/hadoop/ozone-site.xml .
```
And create a custom `core-site.xml`:
```xml
<configuration>
<property>
<name>fs.o3fs.impl</name>
<value>org.apache.hadoop.fs.ozone.BasicOzoneFileSystem</value>
</property>
<property>
<name>fs.AbstractFileSystem.o3fs.impl</name>
<value>org.apache.hadoop.fs.ozone.OzFs</value>
</property>
</configuration>
```
@ -98,13 +102,13 @@ ENV SPARK_EXTRA_CLASSPATH=/opt/hadoop/conf
ADD hadoop-ozone-filesystem-lib-legacy-0.4.0-SNAPSHOT.jar /opt/hadoop-ozone-filesystem-lib-legacy.jar
```
```bash
docker build -t myrepo/spark-ozone .
```
For remote kubernetes cluster you may need to push it:
```bash
docker push myrepo/spark-ozone
```
@ -112,7 +116,7 @@ docker push myrepo/spark-ozone
Download any text file and put it at `/tmp/alice.txt` first.
```bash
kubectl port-forward s3g-0 9878:9878
aws s3api --endpoint http://localhost:9878 create-bucket --bucket=test
aws s3api --endpoint http://localhost:9878 put-object --bucket test --key alice.txt --body /tmp/alice.txt
@ -130,7 +134,7 @@ Write down the ozone filesystem uri as it should be used with the spark-submit c
## Create service account to use
```bash
kubectl create serviceaccount spark -n yournamespace
kubectl create clusterrolebinding spark-role --clusterrole=edit --serviceaccount=yournamespace:spark --namespace=yournamespace
```
@ -138,13 +142,14 @@ kubectl create clusterrolebinding spark-role --clusterrole=edit --serviceaccount
Execute the following spark-submit command, but change at least the following values:
* the kubernetes master url (you can check your _~/.kube/config_ to find the actual value)
* the kubernetes namespace (_yournamespace_ in this example)
* serviceAccountName (you can use the _spark_ value if you followed the previous steps)
* container.image (in this example this is _myrepo/spark-ozone_. This is pushed to the registry in the previous steps)
* location of the input file (o3fs://...), use the string which is identified earlier with the \
`ozone s3 path <bucketname>` command
```bash
bin/spark-submit \
--master k8s://https://kubernetes:6443 \
--deploy-mode cluster \
@ -162,7 +167,8 @@ bin/spark-submit \
Check the available `spark-word-count-...` pods with `kubectl get pod`
Check the output of the calculation with \
`kubectl logs spark-word-count-1549973913699-driver`
You should see the output of the wordcount job. For example:

View File

@ -24,5 +24,6 @@ weight: 8
{{<jumbotron title="Recipes of Ozone">}}
Standard how-to documents which describe how to use Ozone with other software.
For example, how to use Ozone with Apache Spark.
{{</jumbotron>}}

View File

@ -24,8 +24,8 @@ icon: user
Apache Ranger™ is a framework to enable, monitor and manage comprehensive data
security across the Hadoop platform. Any version of Apache Ranger which is greater
than 1.20 is aware of Ozone, and can manage an Ozone cluster.
To use Apache Ranger, you must have Apache Ranger installed in your Hadoop

View File

@ -31,11 +31,13 @@ secure networks where it is possible to deploy without securing the cluster.
This release of Ozone follows that model, but soon will move to _secure by
default._ Today to enable security in an ozone cluster, we need to set the
configuration **ozone.security.enabled** to _true_ and **hadoop.security.authentication**
to _kerberos_.
Property|Value
----------------------|---------
ozone.security.enabled| _true_
hadoop.security.authentication| _kerberos_
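Expressed as configuration XML, these two settings look like this (a minimal sketch of the entries from the table above):

```xml
<property>
  <name>ozone.security.enabled</name>
  <value>true</value>
</property>
<property>
  <name>hadoop.security.authentication</name>
  <value>kerberos</value>
</property>
```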
# Tokens #
@ -68,7 +70,7 @@ also enabled by default when security is enabled.
Each of the service daemons that make up Ozone needs a Kerberos service
principal name and a corresponding [kerberos key tab](https://web.mit.edu/kerberos/krb5-latest/doc/basic/keytab_def.html) file.
All these settings should be made in ozone-site.xml.
@ -77,101 +79,100 @@ All these settings should be made in ozone-site.xml.
<div class="card-body">
<h3 class="card-title">Storage Container Manager</h3>
<p class="card-text">
<br>
SCM requires two Kerberos principals, and the corresponding key tab files
for both of these principals.
<br>
<table class="table table-dark">
<thead>
<tr>
<th scope="col">Property</th>
<th scope="col">Description</th>
</tr>
</thead>
<tbody>
<tr>
<td>hdds.scm.kerberos.principal</td>
<td>The SCM service principal. <br/> e.g. scm/_HOST@REALM.COM</td>
</tr>
<tr>
<td>hdds.scm.kerberos.keytab.file</td>
<td>The keytab file used by SCM daemon to login as its service principal.</td>
</tr>
<tr>
<td>hdds.scm.http.kerberos.principal</td>
<td>SCM http server service principal.</td>
</tr>
<tr>
<td>hdds.scm.http.kerberos.keytab</td>
<td>The keytab file used by SCM http server to login as its service principal.</td>
</tr>
</tbody>
</table>
</div>
</div>
<div class="card">
<div class="card-body">
<h3 class="card-title">Ozone Manager</h3>
<p class="card-text">
<br>
Like SCM, OM also requires two Kerberos principals, and the
corresponding key tab files for both of these principals.
<br>
<table class="table table-dark">
<thead>
<tr>
<th scope="col">Property</th>
<th scope="col">Description</th>
</tr>
</thead>
<tbody>
<tr>
<td>ozone.om.kerberos.principal</td>
<td>The OzoneManager service principal. <br/> e.g. om/_HOST@REALM.COM</td>
</tr>
<tr>
<td>ozone.om.kerberos.keytab.file</td>
<td>The keytab file used by OM daemon to login as its service principal.</td>
</tr>
<tr>
<td>ozone.om.http.kerberos.principal</td>
<td>Ozone Manager http server service principal.</td>
</tr>
<tr>
<td>ozone.om.http.kerberos.keytab</td>
<td>The keytab file used by OM http server to login as its service principal.</td>
</tr>
</tbody>
</table>
</div>
</div>
<div class="card">
<div class="card-body">
<h3 class="card-title">S3 Gateway</h3>
<p class="card-text">
<br>
S3 gateway requires one service principal and here are the configuration values
needed in the ozone-site.xml.
<br>
<table class="table table-dark">
<thead>
<tr>
<th scope="col">Property</th>
<th scope="col">Description</th>
</tr>
</thead>
<tbody>
<tr>
<td>ozone.s3g.authentication.kerberos.principal</td>
<td>S3 Gateway principal. <br/> e.g. HTTP/_HOST@EXAMPLE.COM</td>
</tr>
<tr>
<td>ozone.s3g.keytab.file</td>
<td>The keytab file used by S3 gateway</td>
</tr>
</tbody>
</table>
</div>
</div>
</div>

View File

@ -32,10 +32,13 @@ However, we support the legacy Kerberos based Authentication to make it easy
for the current set of users. The HDFS configuration keys are the following,
which are set up in hdfs-site.xml.
Property|Description
--------|--------------
dfs.datanode.kerberos.principal|The datanode service principal. <br/> e.g. dn/_HOST@REALM.COM
dfs.datanode.keytab.file| The keytab file used by datanode daemon to login as its service principal.
hdds.datanode.http.kerberos.principal| Datanode http server service principal.
hdds.datanode.http.kerberos.keytab| The keytab file used by datanode http server to login as its service principal.
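Spelled out as hdfs-site.xml entries, the table above looks like the sketch below; the realm and keytab paths are placeholders that must be adapted to your environment:

```xml
<property>
  <name>dfs.datanode.kerberos.principal</name>
  <value>dn/_HOST@EXAMPLE.COM</value>
</property>
<property>
  <name>dfs.datanode.keytab.file</name>
  <value>/etc/security/keytabs/dn.service.keytab</value>
</property>
<property>
  <name>hdds.datanode.http.kerberos.principal</name>
  <value>HTTP/_HOST@EXAMPLE.COM</value>
</property>
<property>
  <name>hdds.datanode.http.kerberos.keytab</name>
  <value>/etc/security/keytabs/HTTP.keytab</value>
</property>
```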
## How a data node becomes secure.
@ -63,7 +66,7 @@ boot time to prove the identity of the data node container (This is also work
in progress.)
Once a certificate is issued, a Data node is secure and Ozone manager can
Once a certificate is issued, a data node is secure and Ozone manager can
issue block tokens. If there are no data node certificates or the SCM's root
certificate is not present in the data node, then the data node will register
itself and download the SCM's root certificate as well as get the certificates
@ -35,12 +35,12 @@ The user needs to `kinit` first and once they have authenticated via kerberos
* S3 clients can get the secret access id and user secret from OzoneManager.
```
```bash
ozone s3 getsecret
```
This command will talk to ozone, validate the user via kerberos and generate
the AWS credentials. The values will be printed out on the screen. You can
set these values up in your .aws file for automatic access while working
set these values up in your _.aws_ file for automatic access while working
against Ozone S3 buckets.
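Putting the pieces together, a minimal session could look like the sketch below; the principal name and the printed key/value format are illustrative, not authoritative:

```bash
# Authenticate via Kerberos first.
kinit testuser@EXAMPLE.COM

# Ask Ozone Manager to generate AWS-style credentials for this identity.
ozone s3 getsecret
# Typically prints two lines such as:
#   awsAccessKey=testuser@EXAMPLE.COM
#   awsSecret=<long random hex string>
```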
<div class="alert alert-danger" role="alert">
@ -51,7 +51,7 @@ against Ozone S3 buckets.
* Now you can proceed to setup these secrets in aws configs:
```
```bash
aws configure set default.s3.signature_version s3v4
aws configure set aws_access_key_id ${accessId}
aws configure set aws_secret_access_key ${secret}
@ -22,20 +22,19 @@ icon: lock
limitations under the License.
-->
## Transparent Data Encryption
Ozone TDE setup process and usage are very similar to HDFS TDE.
The major difference is that Ozone TDE is enabled at Ozone bucket level
when a bucket is created.
### Setting up the Key Management Server
To use TDE, clients must setup a Key Management server and provide that URI to
To use TDE, clients must setup a Key Management Server and provide that URI to
Ozone/HDFS. Since Ozone and HDFS can use the same Key Management Server, this
configuration can be provided via *hdfs-site.xml*.
Property| Value
-----------------------------------|-----------------------------------------
hadoop.security.key.provider.path | KMS uri. e.g. kms://http@kms-host:9600/kms
hadoop.security.key.provider.path | KMS uri. <br> e.g. kms://http@kms-host:9600/kms
### Using Transparent Data Encryption
If this is already configured for your cluster, then you can simply proceed
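If so, the per-bucket flow is typically the following sketch; the key and bucket names are made up, and the `-k` option on `bucket create` is an assumption to verify against your release:

```bash
# Create an encryption key in the configured KMS (name is illustrative).
hadoop key create encKey

# Create a bucket bound to that key; keys written into it are
# encrypted transparently.
ozone sh bucket create -k encKey /vol/encryptedbucket
```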
@ -21,9 +21,9 @@ summary: Native ACL support provides ACL functionality without Ranger integratio
limitations under the License.
-->
Ozone supports a set of native ACLs. These ACLs cane be used independently or
along with Ranger. If Apache Ranger is enabled, then ACL will be checked
first with Ranger and then Ozone's internal ACLs will be evaluated.
Ozone supports a set of native ACLs. These ACLs can be used independently or
along with Ranger. If Apache Ranger is enabled, then ACL will be checked
first with Ranger and then Ozone's internal ACLs will be evaluated.
Ozone ACLs are a superset of POSIX and S3 ACLs.
@ -31,10 +31,10 @@ The general format of an ACL is _object_:_who_:_rights_.
Where an _object_ can be:
1. **Volume** - An Ozone volume. e.g. /volume
2. **Bucket** - An Ozone bucket. e.g. /volume/bucket
3. **Key** - An object key or an object. e.g. /volume/bucket/key
4. **Prefix** - A path prefix for a specific key. e.g. /volume/bucket/prefix1/prefix2
1. **Volume** - An Ozone volume. e.g. _/volume_
2. **Bucket** - An Ozone bucket. e.g. _/volume/bucket_
3. **Key** - An object key or an object. e.g. _/volume/bucket/key_
4. **Prefix** - A path prefix for a specific key. e.g. _/volume/bucket/prefix1/prefix2_
Where a _who_ can be:
@ -63,23 +63,20 @@ volume and keys in a bucket. Please note: Under Ozone, Only admins can create vo
to the volume and buckets which allow listing of the child objects. Please note: The user and admins can list the volumes owned by the user.
3. **Delete** Allows the user to delete a volume, bucket or key.
4. **Read** Allows the user to read the metadata of a Volume and Bucket and
data stream and metadata of a key(object).
data stream and metadata of a key.
5. **Write** - Allows the user to write the metadata of a Volume and Bucket and
allows the user to overwrite an existing ozone key(object).
allows the user to overwrite an existing ozone key.
6. **Read_ACL** Allows a user to read the ACL on a specific object.
7. **Write_ACL** Allows a user to write the ACL on a specific object.
<h3>Ozone Native ACL APIs <span class="badge badge-secondary">Work in
progress</span></h3>
<h3>Ozone Native ACL APIs</h3>
The ACLs can be manipulated by a set of APIs supported by Ozone. The APIs
supported are:
1. **SetAcl** This API will take user principal, the name of the object, type
of the object and a list of ACLs.
2. **GetAcl** This API will take the name of an ozone object and type of the
object and will return a list of ACLs.
3. **RemoveAcl** - It is possible that we might support an API called RemoveACL
as a convenience API, but in reality it is just a GetACL followed by SetACL
with an etag to avoid conflicts.
1. **SetAcl** This API will take user principal, the name, type
of the ozone object and a list of ACLs.
2. **GetAcl** This API will take the name and type of the ozone object
and will return a list of ACLs.
3. **RemoveAcl** - This API will take the name, type of the
ozone object and the ACL that has to be removed.
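As an illustration, the Ozone shell surfaces these APIs as subcommands on each object type. The exact subcommand and flag spellings below are assumptions to check against `ozone sh bucket --help` in your release:

```bash
# Grant user "bilbo" read and write on a bucket
# (the ACL spec follows the object:who:rights pattern, e.g. user:bilbo:rw).
ozone sh bucket addacl --acl user:bilbo:rw /volume/bucket

# Read the ACL list back from the same object.
ozone sh bucket getacl /volume/bucket

# Remove the entry again.
ozone sh bucket removeacl --acl user:bilbo:rw /volume/bucket
```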
@ -29,7 +29,7 @@ Ozone shell supports the following bucket commands.
### Create
The bucket create command allows users to create a bucket.
The `bucket create` command allows users to create a bucket.
***Params:***
@ -46,7 +46,7 @@ Since no scheme was specified this command defaults to O3 (RPC) protocol.
### Delete
The bucket delete command allows users to delete a bucket. If the
The `bucket delete` command allows users to delete a bucket. If the
bucket is not empty then this command will fail.
***Params:***
@ -63,7 +63,8 @@ The above command will delete _jan_ bucket if it is empty.
### Info
The bucket info commands returns the information about the bucket.
The `bucket info` command returns the information about the bucket.
***Params:***
| Arguments | Comment |
@ -78,15 +79,15 @@ The above command will print out the information about _jan_ bucket.
### List
The bucket list command allows users to list the buckets in a volume.
The `bucket list` command allows users to list the buckets in a volume.
***Params:***
| Arguments | Comment |
|--------------------------------|-----------------------------------------|
| -l, --length | Maximum number of results to return. Default: 100
| -p, --prefix | Optional, Only buckets that match this prefix will be returned.
| -s, --start | The listing will start from key after the start key.
| -l, \-\-length | Maximum number of results to return. Default: 100
| -p, \-\-prefix | Optional, Only buckets that match this prefix will be returned.
| -s, \-\-start | The listing will start from key after the start key.
| Uri | The name of the _volume_.
{{< highlight bash >}}
@ -94,18 +95,3 @@ ozone sh bucket list /hive
{{< /highlight >}}
This command will list all buckets on the volume _hive_.
### path
The bucket command to provide ozone mapping for s3 bucket (Created via aws cli)
{{< highlight bash >}}
ozone s3 path <<s3Bucket>>
{{< /highlight >}}
The above command will print VolumeName and the mapping created for s3Bucket.
You can try out these commands from the docker instance of the [Alpha
Cluster](runningviadocker.html).
@ -34,7 +34,7 @@ Ozone shell supports the following key commands.
### Get
The key get command downloads a key from Ozone cluster to local file system.
The `key get` command downloads a key from Ozone cluster to local file system.
***Params:***
@ -52,7 +52,7 @@ local file sales.orc.
### Put
Uploads a file from the local file system to the specified bucket.
The `key put` command uploads a file from the local file system to the specified bucket.
***Params:***
@ -61,7 +61,7 @@ Uploads a file from the local file system to the specified bucket.
|--------------------------------|-----------------------------------------|
| Uri | The name of the key in **/volume/bucket/key** format.
| FileName | Local file to upload.
| -r, --replication | Optional, Number of copies, ONE or THREE are the options. Picks up the default from cluster configuration.
| -r, \-\-replication | Optional, Number of copies, ONE or THREE are the options. Picks up the default from cluster configuration.
{{< highlight bash >}}
ozone sh key put /hive/jan/corrected-sales.orc sales.orc
@ -70,7 +70,7 @@ The above command will put the sales.orc as a new key into _/hive/jan/corrected-
### Delete
The key delete command removes the key from the bucket.
The `key delete` command removes the key from the bucket.
***Params:***
@ -87,7 +87,8 @@ The above command deletes the key _/hive/jan/corrected-sales.orc_.
### Info
The key info commands returns the information about the key.
The `key info` command returns the information about the key.
***Params:***
| Arguments | Comment |
@ -103,15 +104,15 @@ key.
### List
The key list command allows user to list all keys in a bucket.
The `key list` command allows a user to list all keys in a bucket.
***Params:***
| Arguments | Comment |
|--------------------------------|-----------------------------------------|
| -l, --length | Maximum number of results to return. Default: 1000
| -p, --prefix | Optional, Only buckets that match this prefix will be returned.
| -s, --start | The listing will start from key after the start key.
| -l, \-\-length | Maximum number of results to return. Default: 1000
| -p, \-\-prefix | Optional, Only keys that match this prefix will be returned.
| -s, \-\-start | The listing will start from key after the start key.
| Uri | The name of the _bucket_ in **/volume/bucket** format.
{{< highlight bash >}}
@ -135,7 +136,4 @@ The `key rename` command changes the name of an existing key in the specified bu
{{< highlight bash >}}
ozone sh key rename /hive/jan sales.orc new_name.orc
{{< /highlight >}}
The above command will rename `sales.orc` to `new_name.orc` in the bucket `/hive/jan`.
You can try out these commands from the docker instance of the [Alpha
Cluster](runningviadocker.html).
The above command will rename _sales.orc_ to _new\_name.orc_ in the bucket _/hive/jan_.
@ -30,15 +30,15 @@ Volume commands generally need administrator privileges. The ozone shell support
### Create
The volume create command allows an administrator to create a volume and
The `volume create` command allows an administrator to create a volume and
assign it to a user.
***Params:***
| Arguments | Comment |
|--------------------------------|-----------------------------------------|
| -q, --quota | Optional, This argument that specifies the maximum size this volume can use in the Ozone cluster. |
| -u, --user | Required, The name of the user who owns this volume. This user can create, buckets and keys on this volume. |
| -q, \-\-quota | Optional, This argument specifies the maximum size this volume can use in the Ozone cluster. |
| -u, \-\-user | Required, The name of the user who owns this volume. This user can create buckets and keys on this volume. |
| Uri | The name of the volume. |
{{< highlight bash >}}
@ -50,7 +50,7 @@ volume has a quota of 1TB, and the owner is _bilbo_.
### Delete
The volume delete command allows an administrator to delete a volume. If the
The `volume delete` command allows an administrator to delete a volume. If the
volume is not empty then this command will fail.
***Params:***
@ -68,8 +68,9 @@ inside it.
### Info
The volume info commands returns the information about the volume including
The `volume info` command returns the information about the volume including
quota and owner information.
***Params:***
| Arguments | Comment |
@ -84,7 +85,7 @@ The above command will print out the information about hive volume.
### List
The volume list command will list the volumes owned by a user.
The `volume list` command will list the volumes owned by a user.
{{< highlight bash >}}
ozone sh volume list --user hadoop
@ -100,8 +101,8 @@ The volume update command allows changing of owner and quota on a given volume.
| Arguments | Comment |
|--------------------------------|-----------------------------------------|
| -q, --quota | Optional, This argument that specifies the maximum size this volume can use in the Ozone cluster. |
| -u, --user | Optional, The name of the user who owns this volume. This user can create, buckets and keys on this volume. |
| -q, \-\-quota | Optional, This argument specifies the maximum size this volume can use in the Ozone cluster. |
| -u, \-\-user | Optional, The name of the user who owns this volume. This user can create buckets and keys on this volume. |
| Uri | The name of the volume. |
{{< highlight bash >}}
@ -109,6 +110,3 @@ ozone sh volume update --quota=10TB /hive
{{< /highlight >}}
The above command updates the volume quota to 10TB.
You can try out these commands from the docker instance of the [Alpha
Cluster](runningviadocker.html).
@ -25,7 +25,7 @@ title: Ozone on Kubernetes
{{< /requirements >}}
As the _apache/ozone_ docker images are available from the dockerhub the deployment process is very similar Minikube deployment. The only big difference is that we have dedicated set of k8s files for hosted clusters (for example we can use one datanode per host)
As the _apache/ozone_ docker images are available from Docker Hub, the deployment process is very similar to the Minikube deployment. The only big difference is that we have a dedicated set of k8s files for hosted clusters (for example, we can use one datanode per host)
## Deploy to kubernetes
`kubernetes/examples` folder of the ozone distribution contains kubernetes deployment resource files for multiple use cases.
@ -33,7 +33,7 @@ requests blocks from SCM, to which clients can write data.
## Setting up an Ozone only cluster
* Please untar the ozone-<version> to the directory where you are going
* Please untar the ozone-\<version\> to the directory where you are going
to run Ozone from. We need Ozone jars on all machines in the cluster. So you
need to do this on all machines in the cluster.
@ -152,14 +152,13 @@ ozone om --init
{{< /highlight >}}
Once Ozone manager has created the Object Store, we are ready to run the name
services.
Once Ozone manager is initialized, we are ready to run the name service.
{{< highlight bash >}}
ozone --daemon start om
{{< /highlight >}}
At this point Ozone's name services, the Ozone manager, and the block service SCM is both running.
At this point Ozone's name service (the Ozone Manager) and the block service (SCM) are both running.\
**Please note**: If SCM is not running,
the ```om --init``` command will fail. SCM start will fail if its on-disk data structures are missing. So please make sure you have run both ```scm --init``` and ```om --init```.
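The required ordering can be summarized in one sketch:

```bash
# SCM must be initialized and running before the Ozone Manager.
ozone scm --init
ozone --daemon start scm

# Only then initialize and start the Ozone Manager.
ozone om --init
ozone --daemon start om
```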
@ -30,7 +30,7 @@ The easiest way to start up an all-in-one ozone container is to use the latest
docker image from docker hub:
```bash
docker run -P 9878:9878 -P 9876:9876 apache/ozone
docker run -p 9878:9878 -p 9876:9876 apache/ozone
```
This command will pull down the ozone image from docker hub and start all
ozone services in a single container. <br>
@ -40,7 +40,7 @@ Container Manager) one data node and the S3 compatible REST server
# Local multi-container cluster
If you would like to use a more realistic pseud-cluster where each components
If you would like to use a more realistic pseudo-cluster where each component
runs in its own container, you can start it with a docker-compose file.
We have shipped a docker-compose and an environment file as part of the
@ -65,7 +65,7 @@ If you need multiple datanodes, we can just scale it up:
```
# Running S3 Clients
Once the cluster is booted up and ready, you can verify it is running by
Once the cluster is booted up and ready, you can verify its status by
connecting to the SCM's UI at [http://localhost:9876](http://localhost:9876).
The S3 gateway endpoint will be exposed at port 9878. You can use Ozone's S3
@ -103,7 +103,6 @@ our bucket.
aws s3 --endpoint http://localhost:9878 ls s3://bucket1/testfile
```
<div class="alert alert-info" role="alert"> You can also check the internal
bucket browser supported by Ozone S3 interface by clicking on the below link.
<br>