HDFS-11963. Ozone: Documentation: Add getting started page. Contributed by Anu Engineer.
@ -0,0 +1,146 @@
<!---
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at

  http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License. See accompanying LICENSE file.
-->

Ozone Command Shell
===================

The Ozone command shell provides a command-line interface for working
against Ozone. Please note that this document assumes that the cluster is
deployed with simple authentication.

The Ozone commands take the following format, where `-command` stands for
one of the commands described below:

* `hdfs oz -command http://hostname:port/volume/bucket/key -user
<name> -root`

The *-root* option is a command-line shortcut that allows *hdfs oz*
commands to be run as the user that started the cluster. It is a convenient
way of saying that you want the commands to be run as some admin user; the
only reason for this option is that it makes the life of a lazy developer
easier.
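
For example, the list volumes command shown later in this document follows
this pattern:

* `hdfs oz -listVolume http://localhost:9864/ -user bilbo -root`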

Ozone Volume Commands
--------------------

The volume commands allow users to create, delete and list the volumes in
the ozone cluster.

### Create Volume

Volumes can be created only by admins. Here is an example of creating a volume.

* `hdfs oz -createVolume http://localhost:9864/hive -user bilbo -quota
100TB -root`

The above command creates a volume called `hive` owned by user `bilbo`. The
`-root` option allows the command to be executed as user `hdfs`, which is an
admin in the cluster.

### Update Volume

Updates information like ownership and quota on an existing volume.

* `hdfs oz -updateVolume http://localhost:9864/hive -quota 500TB -root`

The above command changes the volume quota of `hive` from 100TB to 500TB.
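
Ownership can be updated in the same way; the following is a sketch that
assumes `-user` carries the new owner, as it does in `-createVolume`, and
`frodo` is an illustrative user name.

* `hdfs oz -updateVolume http://localhost:9864/hive -user frodo -root`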

### Delete Volume

Deletes a volume if it is empty.

* `hdfs oz -deleteVolume http://localhost:9864/hive -root`

### Info Volume

The info volume command allows the owner or the administrator of the
cluster to read metadata about a specific volume.

* `hdfs oz -infoVolume http://localhost:9864/hive -root`

### List Volumes

The list volume command can be used by an administrator to list the volumes
of any user. It can also be used by a user to list their own volumes.

* `hdfs oz -listVolume http://localhost:9864/ -user bilbo -root`

The above command lists all volumes owned by user `bilbo`.

Ozone Bucket Commands
--------------------

Bucket commands follow a pattern similar to the volume commands. However,
bucket commands are designed to be run by the owner of the volume. The
following examples assume that these commands are run by the owner of the
volume or bucket.

### Create Bucket

The create bucket call allows the owner of a volume to create a bucket.

* `hdfs oz -createBucket http://localhost:9864/hive/january`

This call creates a bucket called `january` in the volume called `hive`. If
the volume does not exist, then this call will fail.

### Update Bucket

Updates bucket metadata, like ACLs.

* `hdfs oz -updateBucket http://localhost:9864/hive/january -addAcl
user:spark:rw`
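
Removing an ACL is presumably the mirror image of adding one; this is an
unverified sketch that assumes a matching `-removeAcl` flag.

* `hdfs oz -updateBucket http://localhost:9864/hive/january -removeAcl
user:spark:rw`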

### Delete Bucket

Deletes a bucket if it is empty.

* `hdfs oz -deleteBucket http://localhost:9864/hive/january`

### Info Bucket

Returns information about a given bucket.

* `hdfs oz -infoBucket http://localhost:9864/hive/january`

### List Buckets

Lists the buckets in a given volume.

* `hdfs oz -listBucket http://localhost:9864/hive`

Ozone Key Commands
------------------

Ozone key commands allow users to put, delete and get keys from ozone
buckets.

### Put Key

Creates or overwrites a key in the ozone store; `-file` points to the file
you want to upload.

* `hdfs oz -putKey http://localhost:9864/hive/january/processed.orc -file
processed.orc`

### Get Key

Downloads a file from the ozone bucket.

* `hdfs oz -getKey http://localhost:9864/hive/january/processed.orc -file
processed.orc.copy`

### Delete Key

Deletes a key from the ozone store.

* `hdfs oz -deleteKey http://localhost:9864/hive/january/processed.orc`

### Info Key

Reads key metadata from the ozone store.

* `hdfs oz -infoKey http://localhost:9864/hive/january/processed.orc`

### List Keys

Lists all keys in an ozone bucket.

* `hdfs oz -listKey http://localhost:9864/hive/january`
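
Putting it all together, here is an end-to-end sequence built only from the
commands shown above, using the same illustrative hostnames, users and file
names.

```
# Create a volume and a bucket, then upload, inspect and download a key.
hdfs oz -createVolume http://localhost:9864/hive -user bilbo -quota 100TB -root
hdfs oz -createBucket http://localhost:9864/hive/january
hdfs oz -putKey http://localhost:9864/hive/january/processed.orc -file processed.orc
hdfs oz -infoKey http://localhost:9864/hive/january/processed.orc
hdfs oz -getKey http://localhost:9864/hive/january/processed.orc -file processed.orc.copy
```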
@ -0,0 +1,257 @@
<!---
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at

  http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License. See accompanying LICENSE file.
-->
Ozone - Object store for Hadoop
==============================

Introduction
------------
Ozone is an object store for Hadoop. It is a redundant, distributed object
store built by leveraging primitives present in HDFS. Ozone supports a REST
API for accessing the store.

Getting Started
---------------
Ozone is a work in progress and currently lives in its own branch. To
use it, you have to build the package yourself and deploy a cluster.

### Building Ozone

To build Ozone, please check out the Hadoop sources from GitHub, then
check out the ozone branch, HDFS-7240, and build it.

- `git checkout HDFS-7240`
- `mvn clean package -DskipTests=true -Dmaven.javadoc.skip=true -Pdist -Dtar -DskipShade`

`-DskipShade` just makes compilation faster and is not really required.

This will give you a tarball in your distribution directory. This is the
tarball that can be used for deploying your Hadoop cluster. Here is an
example of the tarball that will be generated.

* `~/apache/hadoop/hadoop-dist/target/hadoop-3.0.0-alpha4-SNAPSHOT.tar.gz`
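
To deploy, extract the tarball on your cluster nodes. A minimal sketch,
assuming the tarball path above and `/opt` as an illustrative install
location:

```
# Unpack the distribution tarball into /opt (illustrative location).
tar -xzf ~/apache/hadoop/hadoop-dist/target/hadoop-3.0.0-alpha4-SNAPSHOT.tar.gz -C /opt
```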

Please proceed to set up a Hadoop cluster by creating the `hdfs-site.xml`
and other configuration files that are needed for your cluster.

### Ozone Configuration

Ozone relies on its own configuration file called `ozone-site.xml`. It is
just for convenience and ease of management -- you can add these settings
to `hdfs-site.xml`, if you don't want to keep ozone settings separate.
This document refers to `ozone-site.xml` so that ozone settings are in one
place and not mingled with HDFS settings.

* _*ozone.enabled*_ This is the most important setting for ozone.
Currently, Ozone is an opt-in subsystem of HDFS. By default, Ozone is
disabled. Setting this flag to `true` enables ozone in the HDFS cluster.
Here is an example,

```
<property>
  <name>ozone.enabled</name>
  <value>True</value>
</property>
```
* _*ozone.container.metadata.dirs*_ Ozone is designed with modern hardware
in mind. It tries to use SSDs effectively, so users can specify where the
datanode metadata must reside. Usually you pick your fastest disk (SSD if
you have them on your datanodes). Datanodes will write the container metadata
to these disks. This is a required setting; if it is missing, datanodes will
fail to come up. Here is an example,

```
<property>
  <name>ozone.container.metadata.dirs</name>
  <value>/data/disk1/container/meta</value>
</property>
```

* _*ozone.scm.names*_ Ozone is built on top of the container framework (see
Ozone Architecture TODO). The Storage Container Manager (SCM) is a
distributed block service which is used by ozone and other storage services.
This property allows datanodes to discover where SCM is, so that
datanodes can send heartbeats to SCM. SCM is designed to be highly available,
and datanodes assume there are multiple instances of SCM which form a highly
available ring. The HA feature of SCM is a work in progress, so for now we
configure `ozone.scm.names` to be a single machine. Here is an example,

```
<property>
  <name>ozone.scm.names</name>
  <value>scm.hadoop.apache.org</value>
</property>
```

* _*ozone.scm.datanode.id*_ Each datanode that speaks to SCM generates an ID
just like HDFS. This ID is stored in a location pointed to by this setting.
If this setting is not valid, datanodes will fail to come up. Please note:
this path will be created by datanodes to store the datanode ID. Here is an
example,

```
<property>
  <name>ozone.scm.datanode.id</name>
  <value>/data/disk1/scm/meta/node/datanode.id</value>
</property>
```

* _*ozone.scm.block.client.address*_ The Storage Container Manager (SCM)
offers a set of services that can be used to build a distributed storage
system. One of the services offered is the block service, which KSM and
HDFS use. This property describes where KSM can discover SCM's block service
endpoint. There are corresponding port settings and so on, but assuming that
we are using the default ports, the server address is the only required
field. Here is an example,

```
<property>
  <name>ozone.scm.block.client.address</name>
  <value>scm.hadoop.apache.org</value>
</property>
```

* _*ozone.ksm.address*_ KSM server address. This is used by the Ozone
handler and the Ozone file system.

```
<property>
  <name>ozone.ksm.address</name>
  <value>ksm.hadoop.apache.org</value>
</property>
```

Here is a quick summary of the settings needed by Ozone.

| Setting                        | Value           | Comment                                                      |
|--------------------------------|-----------------|--------------------------------------------------------------|
| ozone.enabled                  | True            | This enables SCM and containers in the HDFS cluster.         |
| ozone.container.metadata.dirs  | file path       | The container metadata will be stored here in the datanode.  |
| ozone.scm.names                | SCM server name | Hostname:port or IP:port address of SCM.                     |
| ozone.scm.datanode.id          | file path       | Data node ID is the location of the datanode's ID file.      |
| ozone.scm.block.client.address | SCM server name | Used by services like KSM.                                   |
| ozone.ksm.address              | KSM server name | Used by the Ozone handler and the Ozone file system.         |

Here is a working example of `ozone-site.xml`.

```
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
  <property>
    <name>ozone.enabled</name>
    <value>True</value>
  </property>

  <property>
    <name>ozone.container.metadata.dirs</name>
    <value>/data/disk1/scm/meta</value>
  </property>

  <property>
    <name>ozone.scm.names</name>
    <value>scm.hadoop.apache.org</value>
  </property>

  <property>
    <name>ozone.scm.datanode.id</name>
    <value>/data/disk1/scm/meta/node/datanode.id</value>
  </property>

  <property>
    <name>ozone.scm.block.client.address</name>
    <value>scm.hadoop.apache.org</value>
  </property>

  <property>
    <name>ozone.ksm.address</name>
    <value>ksm.hadoop.apache.org</value>
  </property>
</configuration>
```

### Starting Ozone

Ozone is designed to run concurrently with HDFS. The simplest way to [start
HDFS](../hadoop-common/ClusterSetup.html) is to run
`$HADOOP/sbin/start-dfs.sh`. Once HDFS is running, please verify that it is
fully functional by running some commands like

- `./hdfs dfs -mkdir /usr`
- `./hdfs dfs -ls /`

Once you are sure that HDFS is running, start Ozone. To start ozone, you
need to start SCM and KSM. Currently we assume that both KSM and SCM
are running on the same node; this will change in the future.

- `./hdfs --daemon start scm`
- `./hdfs --daemon start ksm`
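
The daemons can be stopped the same way, assuming the standard Hadoop 3
`--daemon` script behavior:

- `./hdfs --daemon stop scm`
- `./hdfs --daemon stop ksm`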

If you would like to start HDFS and Ozone together, you can do that by
running a single command:

- `$HADOOP/sbin/start-ozone.sh`

This command will start HDFS and then start the ozone components.

Once you have ozone running, you can use these ozone
[shell](./OzoneCommandShell.html) commands to create a volume, bucket and
keys.
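
For example, here is a quick smoke test of a fresh cluster, using the admin
shortcut from the shell page (`smoketest` is an illustrative volume name):

```
# Create a volume and list it back to confirm ozone is serving requests.
hdfs oz -createVolume http://localhost:9864/smoketest -user bilbo -quota 1TB -root
hdfs oz -listVolume http://localhost:9864/ -user bilbo -root
```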

### Diagnosing issues

Ozone tries not to pollute the existing HDFS streams of configuration and
logging, so ozone logs are by default configured to be written to a file
called `ozone.log`. This is controlled by the settings in the
`log4j.properties` file in the hadoop configuration directory.

Here are the log4j properties that are added by ozone.

```
#
# Add a logger for ozone that is separate from the Datanode.
#
#log4j.debug=true
log4j.logger.org.apache.hadoop.ozone=DEBUG,OZONE,FILE

# Do not log into datanode logs. Remove this line to have single log.
log4j.additivity.org.apache.hadoop.ozone=false

# For development purposes, log both to console and log file.
log4j.appender.OZONE=org.apache.log4j.ConsoleAppender
log4j.appender.OZONE.Threshold=info
log4j.appender.OZONE.layout=org.apache.log4j.PatternLayout
log4j.appender.OZONE.layout.ConversionPattern=%d{ISO8601} [%t] %-5p \
%X{component} %X{function} %X{resource} %X{user} %X{request} - %m%n

# Real ozone logger that writes to ozone.log
log4j.appender.FILE=org.apache.log4j.DailyRollingFileAppender
log4j.appender.FILE.File=${hadoop.log.dir}/ozone.log
log4j.appender.FILE.Threshold=debug
log4j.appender.FILE.layout=org.apache.log4j.PatternLayout
log4j.appender.FILE.layout.ConversionPattern=%d{ISO8601} [%t] %-5p \
(%F:%L) %X{function} %X{resource} %X{user} %X{request} - \
%m%n
```

If you would like to have a single datanode log instead of ozone entries
getting written to `ozone.log`, please remove the following line or set it
to true.

`log4j.additivity.org.apache.hadoop.ozone=false`

On the SCM/KSM side, you will be able to see

- `hadoop-hdfs-ksm-hostname.log`
- `hadoop-hdfs-scm-hostname.log`
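
While debugging, it is handy to follow the ozone log as it is written. A
sketch, assuming `$HADOOP_LOG_DIR` matches the `hadoop.log.dir` used above:

```
# Follow ozone.log as the ozone components write to it.
tail -f $HADOOP_LOG_DIR/ozone.log
```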

Please file any issues you see under [Object store in HDFS (HDFS-7240)](https://issues.apache.org/jira/browse/HDFS-7240)
as this is still a work in progress.
@ -103,6 +103,12 @@
<item name="Disk Balancer" href="hadoop-project-dist/hadoop-hdfs/HDFSDiskbalancer.html"/>
</menu>

<menu name="Ozone" inherit="top">
<item name="Getting Started" href="hadoop-project-dist/hadoop-hdfs/OzoneGettingStarted.html"/>
<item name="Commands Reference" href="hadoop-project-dist/hadoop-hdfs/OzoneCommandShell.html"/>
<item name="Ozone Metrics" href="hadoop-project-dist/hadoop-hdfs/Ozonemetrics.html"/>
</menu>

<menu name="MapReduce" inherit="top">
<item name="Tutorial" href="hadoop-mapreduce-client/hadoop-mapreduce-client-core/MapReduceTutorial.html"/>
<item name="Commands Reference" href="hadoop-mapreduce-client/hadoop-mapreduce-client-core/MapredCommands.html"/>