YARN-8875. [Submarine] Add documentation for submarine installation script details. (Xun Liu via wangda)

Change-Id: I1c8d39c394e5a30f967ea514919835b951f2c124
Wangda Tan 2018-10-16 13:36:09 -07:00
parent babd1449bf
commit ed08dd3b0c
7 changed files with 724 additions and 178 deletions

View File

@ -0,0 +1,36 @@
<!---
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License. See accompanying LICENSE file.
-->
# How to Install Dependencies
The Submarine project uses YARN Service, Docker containers, and GPUs (when GPU hardware is available and properly configured).
That means, as an admin, you have to properly set up the YARN Service related dependencies, including:
- YARN Registry DNS

Docker related dependencies, including:
- Docker binary with expected versions.
- A Docker network which allows Docker containers to talk to each other across different nodes.

And when GPUs are to be used:
- GPU driver.
- Nvidia-docker.

For your convenience, we provide installation documents to help you set up your environment. You can always choose to install the dependencies in your own way.
Use the Submarine installer to install dependencies: [EN](InstallationScriptEN.html) [CN](InstallationScriptCN.html)
Alternatively, you can install the dependencies manually: [EN](InstallationGuide.html) [CN](InstallationGuideChineseVersion.html)
Once you have installed the dependencies, please follow this guide: [TestAndTroubleshooting](TestAndTroubleshooting.html).

View File

@ -41,6 +41,4 @@ Click below contents if you want to understand more.
- [Developer guide](DeveloperGuide.html)
- [Installation guides](HowToInstall.html)

View File

@ -16,9 +16,11 @@
## Prerequisites
(Please note that all of the following prerequisites are just examples. You can always choose to install your own version of the kernel, different users, different drivers, etc.)
### Operating System
The operating system and kernel versions we have tested are shown in the following table; they are the recommended minimum versions.

| Environment | Version |
| ------ | ------ |
@ -27,7 +29,7 @@ The operating system and kernel versions we used are as shown in the following t
### User & Group
Some specific users and groups are recommended for installing hadoop/docker. Please create them if they are missing.
```
adduser hdfs
@ -45,7 +47,7 @@ usermod -aG docker hadoop
### GCC Version
Check the version of the GCC tool (used to compile the kernel).
```bash
gcc --version
@ -64,7 +66,7 @@ wget http://vault.centos.org/7.3.1611/os/x86_64/Packages/kernel-headers-3.10.0-5
rpm -ivh kernel-headers-3.10.0-514.el7.x86_64.rpm
```
### GPU Servers (Only for Nvidia GPU equipped nodes)
```
lspci | grep -i nvidia
@ -76,9 +78,9 @@ lspci | grep -i nvidia
### Nvidia Driver Installation (Only for Nvidia GPU equipped nodes)
To make a clean installation (for example, when you need to upgrade GPU drivers), any previously installed nvidia driver/cuda should be uninstalled first.
```
# uninstall cuda
@ -96,16 +98,16 @@ yum install nvidia-detect
nvidia-detect -v
Probing for supported NVIDIA devices...
[10de:13bb] NVIDIA Corporation GM107GL [Quadro K620]
This device requires the current xyz.nm NVIDIA driver kmod-nvidia
[8086:1912] Intel Corporation HD Graphics 530
An Intel display controller was also detected
```
Pay attention to `This device requires the current xyz.nm NVIDIA driver kmod-nvidia`.
Download an installer such as [NVIDIA-Linux-x86_64-390.87.run](https://www.nvidia.com/object/linux-amd64-display-archive.html).
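Once the preparatory work below is finished, running the downloaded installer is typically a single command. This is only a sketch; the file name reuses the example above, and yours will depend on the version nvidia-detect reports:

```bash
# File name is an assumed example; substitute the version nvidia-detect suggests
chmod +x NVIDIA-Linux-x86_64-390.87.run
sh ./NVIDIA-Linux-x86_64-390.87.run
```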
Some preparatory work is needed before the nvidia driver installation. (This follows the normal Nvidia GPU driver installation; it is just put here for your convenience.)
```
# It may take a while to update
@ -163,6 +165,8 @@ https://docs.nvidia.com/cuda/cuda-installation-guide-linux/index.html
### Docker Installation
We recommend using Docker version >= 1.12.5. The following steps are just for your reference; you can always choose other approaches to install Docker.
```
yum -y update
yum -y install yum-utils
@ -226,9 +230,9 @@ Server:
OS/Arch: linux/amd64
```
### Nvidia-docker Installation (Only for Nvidia GPU equipped nodes)
Submarine depends on nvidia-docker 1.0.
```
wget -P /tmp https://github.com/NVIDIA/nvidia-docker/releases/download/v1.0.1/nvidia-docker-1.0.1-1.x86_64.rpm
@ -285,7 +289,6 @@ Reference:
https://github.com/NVIDIA/nvidia-docker/tree/1.0
### Tensorflow Image
There is no need to install CUDNN and CUDA on the servers, because CUDNN and CUDA can be added to the docker images. We can get basic docker images by following WriteDockerfile.md.
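Once a Dockerfile is written, building the image and publishing it to a private registry might look like the following sketch. The image name is the one used in the example job later in this commit, and the registry address reuses the sample value from the installer configuration; both are assumptions to adapt:

```bash
# Build the image from the Dockerfile in the current directory
docker build -t gpu-cuda9.0-tf1.8.0-with-models .
# Tag and push to a private registry (address is a sample value)
docker tag gpu-cuda9.0-tf1.8.0-with-models 10.120.196.232:5000/gpu-cuda9.0-tf1.8.0-with-models
docker push 10.120.196.232:5000/gpu-cuda9.0-tf1.8.0-with-models
```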
@ -367,7 +370,7 @@ ENV PATH $PATH:$JAVA_HOME/bin
### Test tensorflow in a docker container
After the docker image is built, we can check the Tensorflow environment before submitting a yarn job.
```shell
$ docker run -it ${docker_image_name} /bin/bash
@ -394,10 +397,13 @@ If there are some errors, we could check the following configuration.
### Etcd Installation
Etcd is a distributed, reliable key-value store for the most critical data of a distributed system; here it handles the registration and discovery of the services used in containers.
You can also choose alternatives such as Zookeeper or Consul.
To install Etcd on specified servers, we can run Submarine-installer/install.sh
```shell
$ ./Submarine-installer/install.sh
# Etcd status
systemctl status Etcd.service
```
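As a quick sanity check (a sketch assuming `etcdctl` is on the PATH and the etcd v2 API is active, which is the default for etcd 3.0-3.3):

```bash
# Each member should report "healthy"
etcdctl cluster-health
```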
@ -421,7 +427,10 @@ b3d05464c356441a: name=etcdnode1 peerURLs=http://${etcd_host_ip3}:2380 clientURL
### Calico Installation
Calico creates and manages a flat layer-3 network, and each container is assigned a routable IP. We just add the steps here for your convenience.
You can also choose alternatives such as Flannel or OVS.
To install Calico on specified servers, we can run Submarine-installer/install.sh
```
systemctl start calico-node.service
@ -460,11 +469,8 @@ docker exec workload-A ping workload-B
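Connectivity can then be verified with two throwaway containers, along the lines of this sketch (the network and container names are assumptions matching the examples in this guide; use your own CALICO_NETWORK_NAME):

```bash
# Create a docker network backed by calico (requires the calico libnetwork plugin)
docker network create --driver calico --ipam-driver calico-ipam calico-network
docker run --net calico-network --name workload-A -tid busybox
docker run --net calico-network --name workload-B -tid busybox
# The ping succeeds only if calico routing works between the containers
docker exec workload-A ping -c 3 workload-B
```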
## Hadoop Installation
### Get Hadoop Release
You can either get a Hadoop release binary or compile it from source code. Please follow the guides at https://hadoop.apache.org/.
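For reference, building a release tarball from source uses the command that an earlier revision of this guide listed:

```bash
# Run from the root of the Hadoop source tree
mvn package -Pdist -DskipTests -Dtar
```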
### Start yarn service
@ -593,10 +599,10 @@ Add configurations in container-executor.cfg
...
# Add configurations in `[docker]` part
# /usr/bin/nvidia-docker is the path of nvidia-docker command
# nvidia_driver_<version> means that the nvidia driver version is <version>; the nvidia-smi command can be used to check the version
docker.allowed.volume-drivers=/usr/bin/nvidia-docker
docker.allowed.devices=/dev/nvidiactl,/dev/nvidia-uvm,/dev/nvidia-uvm-tools,/dev/nvidia1,/dev/nvidia0
docker.allowed.ro-mounts=nvidia_driver_<version>
[gpu]
module.enabled=true
@ -607,154 +613,3 @@ Add configurations in container-executor.cfg
root=/sys/fs/cgroup
yarn-hierarchy=/hadoop-yarn
```
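To fill in the `<version>` placeholder above, the installed driver version can be read directly from nvidia-smi:

```bash
# Prints just the driver version, e.g. 375.26
nvidia-smi --query-gpu=driver_version --format=csv,noheader
```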

View File

@ -0,0 +1,242 @@
<!---
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License. See accompanying LICENSE file.
-->
# submarine installer
## Introduction
Before introducing the **submarine-installer** project, we first need to describe **Hadoop {Submarine}**. **Hadoop {Submarine}** is the machine learning framework subproject newly released in hadoop 3.2. It lets hadoop support multiple deep learning frameworks such as `Tensorflow`, `MXNet`, `Caffe`, and `Spark`, and provides a full-featured system framework for machine learning algorithm development, distributed model training, model management, and model publishing. Combined with hadoop's innate data storage and data processing capabilities, it enables data scientists to better mine and realize the value of their data.
YARN has supported resource scheduling for Docker containers since hadoop 2.9. On that basis, **Hadoop {Submarine}** uses YARN to schedule and run distributed deep learning frameworks inside Docker containers.
Since a distributed deep learning framework needs to run in multiple Docker containers, and the services running in those containers must coordinate with each other to complete distributed model training and model publishing, this involves many system engineering problems such as `DNS`, `Docker`, `GPU`, `Network`, graphics cards, and `operating system kernel` modifications. Correctly deploying the **Hadoop {Submarine}** runtime environment is difficult and time-consuming.
To reduce the difficulty of deploying docker and the other components on hadoop 2.9 and above, we developed this `submarine-installer` project dedicated to deploying the `Hadoop {Submarine}` runtime environment. It provides a one-click installation script and can also install, uninstall, start, and stop each component step by step, explaining the main parameter configuration and considerations at every step. We have also submitted a [Chinese manual](InstallationGuideChineseVersion.md) and an [English manual](InstallationGuide.md) for deploying the `Hadoop {Submarine}` runtime environment to the hadoop community, to help users deploy more easily and resolve problems quickly.
## Prerequisites
**submarine-installer** currently only supports operating systems at or above `centos-release-7-3.1611.el7.centos.x86_64`.
## Configuration notes
Before deploying with **submarine-installer**, refer to the existing configuration parameters and format in the [install.conf](install.conf) file and configure the following parameters according to your situation:
+ **DNS configuration**
LOCAL_DNS_HOST: the server's local DNS IP address, which can be found in `/etc/resolv.conf`
YARN_DNS_HOST: the IP address on which the yarn dns server starts
+ **ETCD configuration**
Machine learning is a computation-intensive system with very high requirements on data transfer performance, so we use the ETCD network component, which has the smallest network efficiency loss. It supports overlay networks via BGP routing and supports tunnel mode for deployments across data centers.
You need to select at least three servers to run ETCD; this gives `Hadoop {Submarine}` better fault tolerance and stability.
Enter the IP array of the ETCD servers in the **ETCD_HOSTS** configuration item. The parameter configuration generally looks like this:
ETCD_HOSTS=(hostIP1 hostIP2 hostIP3); note that multiple hostIPs are separated with spaces.
+ **DOCKER_REGISTRY configuration**
You first need to install a usable docker image registry, which stores the image files of the various deep learning frameworks you need; then configure the registry's IP address and port. The parameter configuration generally looks like this: DOCKER_REGISTRY="10.120.196.232:5000"
+ **DOWNLOAD_SERVER configuration**
By default, `submarine-installer` downloads all dependencies (e.g. GCC, Docker, Nvidia drivers) directly from the network, which often takes a lot of time and does not work in environments where some servers cannot connect to the internet. We therefore built an HTTP download service into `submarine-installer`: run `submarine-installer` on a single server that can connect to the internet, and it can serve the dependency downloads for all the other servers. Just follow these steps:
1. First, set `DOWNLOAD_SERVER_IP` to the IP address of a server that can connect to the internet, and `DOWNLOAD_SERVER_PORT` to a port that is not commonly used.
2. After running the `submarine-installer/install.sh` command on the server with `DOWNLOAD_SERVER_IP`, select the `[start download server]` menu item in the installation interface. `submarine-installer` will download all deployment dependencies into the `submarine-installer/downloads` directory and then start an HTTP download service via the `python -m SimpleHTTPServer ${DOWNLOAD_SERVER_PORT}` command. Do not close the `submarine-installer` running on this server.
3. Run the `submarine-installer/install.sh` command on the other servers as well. When the components are installed one by one via the `[install component]` menu in the installation interface, the dependencies are automatically downloaded from the server with `DOWNLOAD_SERVER_IP` for installation and deployment.
4. Another use of **DOWNLOAD_SERVER** is that you can download the dependency packages manually, put them into the `submarine-installer/downloads` directory of one server, and then start `[start download server]`; this gives the whole cluster the ability to perform offline installation and deployment.
+ **YARN_CONTAINER_EXECUTOR_PATH configuration**
To compile YARN's container-executor: go to the `hadoop/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager` directory and run the `mvn package -Pnative -DskipTests` command, which produces the `./target/native/target/usr/local/bin/container-executor` file.
You need to put the full path of the `container-executor` file into the YARN_CONTAINER_EXECUTOR_PATH configuration item.
+ **YARN_HIERARCHY configuration**
Keep the same configuration as `yarn.nodemanager.linux-container-executor.cgroups.hierarchy` in the `yarn-site.xml` of the YARN cluster you are using; if that item is not configured in `yarn-site.xml`, the default is `/hadoop-yarn`.
+ **YARN_NODEMANAGER_LOCAL_DIRS configuration**
Keep the same configuration as `yarn.nodemanager.local-dirs` in the `yarn-site.xml` of the YARN cluster you are using.
+ **YARN_NODEMANAGER_LOG_DIRS configuration**
Keep the same configuration as `yarn.nodemanager.log-dirs` in the `yarn-site.xml` of the YARN cluster you are using.
## Usage notes
**submarine-installer** is written entirely in shell script and does not require any deployment tools such as ansible. This avoids incompatibilities caused by the differing server management rules of different companies; for example, some data centers do not allow the ROOT user to operate remote servers directly through a SHELL.
The deployment process of **submarine-installer** is driven entirely by menu selections, which prevents mistakes. Through the menu items you can also install, uninstall, start, and stop any individual component step by step, which gives great flexibility; when a component fails, you can also use **submarine-installer** to diagnose and repair the system.
**submarine-installer** displays log information on the screen during deployment. The log messages come in three font colors:
+ Red: a component installation error occurred and the deployment has stopped.
+ Green: the component installed correctly and the deployment is proceeding normally.
+ Blue: you need to manually enter commands in another SHELL terminal as prompted, generally to modify operating system kernel configuration; just follow the prompts in order.
**Start submarine-installer**
Run the `submarine-installer/install.sh` command to start. The deployment program first detects the IP addresses of the server's network cards; if the server has multiple network cards or multiple IPs configured, they are shown as a list, and you select the IP address you actually use.
**submarine-installer** menu description:
![alt text](./images/submarine-installer.gif "Submarine Installer")
## Deployment notes
The deployment process is as follows:
1. Following the configuration notes, configure the install.conf file according to your servers
2. Copy the entire `submarine-installer` folder to all server nodes
3. First, on the server configured as **DOWNLOAD_SERVER**:
+ Run the `submarine-installer/install.sh` command
+ Select the `[start download server]` menu item in the installation interface, wait for all dependency packages to download, and start the HTTP service
4. On the other servers to be deployed:
Run the `submarine-installer/install.sh` command. The main menu **[Main menu]** shows the following items:
+ prepare system environment
+ install component
+ uninstall component
+ start component
+ stop component
+ start download server
5. **prepare system environment**
+ **prepare operation system**
Check the operating system and version of the deployment server;
+ **prepare operation system kernel**
Show the commands for updating the operating system kernel and, according to your choice, automatically update the kernel version;
+ **prepare GCC version**
Show the current GCC version in the operating system, the commands for updating it, and, according to your choice, automatically update the GCC version;
+ **check GPU**
Check whether the server can detect the GPU card;
+ **prepare user&group**
Show the commands for adding the hadoop and docker users and user groups; following the prompts, you need to check yourself whether the required users and groups exist on the server;
+ **prepare nvidia environment**
Automatically update the operating system kernel and header files, and automatically install `epel-release` and `dkms`;
Show the commands for modifying the system kernel parameter configuration; you need to open another terminal and execute them in order;
6. install component
+ **install etcd**
Download the etcd bin files and install them into the `/usr/bin` directory;
Generate the `etcd.service` file according to the **ETCD_HOSTS** configuration item and install it into the `/etc/systemd/system/` directory;
+ **install docker**
Download the docker RPM packages and install them locally;
Generate the `daemon.json` configuration file and install it into the `/etc/docker/` directory;
Generate the `docker.service` configuration file and install it into the `/etc/systemd/system/` directory;
+ **install calico network**
Download the `calico`, `calicoctl`, and `calico-ipam` files and install them into the `/usr/bin` directory;
Generate the `calicoctl.cfg` configuration file and install it into the `/etc/calico/` directory;
Generate the `calico-node.service` configuration file and install it into the `/etc/systemd/system/` directory;
After installation, a calico network is created automatically according to the **CALICO_NETWORK_NAME** configuration item, and two Docker containers are created automatically to check whether the two containers can ping each other;
+ **install nvidia driver**
Download the `nvidia-detect` file and detect the graphics card version on the server;
Download the Nvidia driver installation package according to the graphics card version number;
Check whether Nouveau is disabled on this server; if it is not, the installation stops, and you need to run the **[prepare nvidia environment]** submenu item in the **[prepare system environment]** menu and follow the prompts;
If Nouveau has already been disabled on this server, a local installation is performed;
+ **install nvidia docker**
Download the `nvidia-docker` RPM package and install it;
Show the commands for checking whether `nvidia-docker` is usable; you need to open another terminal and execute them in order;
+ **install yarn container-executor**
Copy the `container-executor` file into the `/etc/yarn/sbin/Linux-amd64-64/` directory according to the **YARN_CONTAINER_EXECUTOR_PATH** configuration item;
Generate the `container-executor.cfg` file according to the configuration and copy it into the `/etc/yarn/sbin/etc/hadoop/` directory;
+ **install submarine autorun script**
Copy the `submarine.sh` file into the `/etc/rc.d/init.d/` directory;
Add `/etc/rc.d/init.d/submarine.sh` to the `/etc/rc.d/rc.local` system startup file;
7. uninstall component
Delete the binary and configuration files of the specified component (details omitted here):
- uninstall etcd
- uninstall docker
- uninstall calico network
- uninstall nvidia driver
- uninstall nvidia docker
- uninstall yarn container-executor
- uninstall submarine autorun script
8. start component
Restart the specified component (details omitted here):
- start etcd
- start docker
- start calico network
9. stop component
Stop the specified component (details omitted here):
- stop etcd
- stop docker
- stop calico network
10. start download server
This operation can only be performed on the server with the **DOWNLOAD_SERVER_IP** configuration item.

View File

@ -0,0 +1,250 @@
<!---
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License. See accompanying LICENSE file.
-->
# submarine installer
## Introduction
Hadoop {Submarine} is the latest machine learning framework subproject in the Hadoop 3.2 release. It allows Hadoop to support `Tensorflow`, `MXNet`, `Caffe`, `Spark`, and other deep learning frameworks, and provides a full-featured system framework for machine learning algorithm development, distributed model training, model management, and model publishing. Combined with hadoop's intrinsic data storage and data processing capabilities, it enables data scientists to better mine and realize the value of their data.
Hadoop has enabled YARN to support Docker containers since 2.x. **Hadoop {Submarine}** then uses YARN to schedule and run the distributed deep learning framework in the form of Docker containers.
Since the distributed deep learning framework needs to run in multiple Docker containers, and the services running in those containers must coordinate with each other to complete model training and model publishing for distributed machine learning, this involves multiple system engineering problems such as `DNS`, `Docker`, `GPU`, `Network`, graphics cards, and `operating system kernel` modifications. It is very difficult and time-consuming to properly deploy the **Hadoop {Submarine}** runtime environment.
In order to reduce the difficulty of deploying the components, we developed this **submarine-installer** project to deploy the **Hadoop {Submarine}** runtime environment. It provides a one-click installation script and can also install, uninstall, start, and stop individual components step by step, explaining the main parameter configuration and considerations for each step. We also submitted a [Chinese manual](InstallationGuideChineseVersion.md) and an [English manual](InstallationGuide.md) for the **Hadoop {Submarine}** runtime environment to the hadoop community, to help users deploy more easily and find problems in a timely manner.
This installer is just created for your convenience. You can choose to install required libraries by yourself.
## Prerequisites
**submarine-installer** currently only supports operating systems based on `centos-release-7-3.1611.el7.centos.x86_64` and above.
## Configuration instructions
Before deploying with submarine-installer, you can refer to the existing configuration parameters and format in the `install.conf` file, and configure the following parameters according to your usage (a sample fragment follows this list):
+ **DNS Configuration**
LOCAL_DNS_HOST: the server's local DNS IP address, which can be found in `/etc/resolv.conf`
YARN_DNS_HOST: the IP address on which the yarn dns server starts
+ **ETCD Configuration**
Machine learning is a computation-intensive system that requires very high data transfer performance. Therefore, we use the ETCD network component, which has the least network efficiency loss. It can support overlay networks through BGP routing and supports tunnel mode when deployed across data centers.
Please note that you can choose to use different Docker networks. ETCD is not the only network solution supported by Submarine.
You need to select at least three servers to run ETCD, which makes **Hadoop {Submarine}** more fault tolerant and stable.
Enter the IP array of the ETCD servers in the ETCD_HOSTS configuration item. The parameter configuration generally looks like this:
ETCD_HOSTS=(hostIP1 hostIP2 hostIP3). Note that multiple hostIPs are separated with spaces.
+ **DOCKER_REGISTRY Configuration**
You can follow the steps below to set up your own Docker registry, but this is not a hard requirement, since you can use a pre-existing Docker registry instead.
You first need to install a usable docker image registry, which stores the image files of the various deep learning frameworks you need; then configure the registry's IP address and port. The parameter configuration generally looks like this:
DOCKER_REGISTRY="10.120.196.232:5000"
+ **DOWNLOAD_SERVER Configuration**
By default, **submarine-installer** downloads all dependencies directly from the network (e.g. GCC, Docker, Nvidia drivers), which often takes a lot of time and does not work in environments where some servers cannot connect to the Internet. We therefore built an HTTP download service into **submarine-installer**: you only need to run **submarine-installer** on one server that can connect to the Internet, and it can serve the dependency downloads for all other servers. Just follow these configurations:
1. First, you need to configure `DOWNLOAD_SERVER_IP` as a server IP address that can connect to the Internet, and configure `DOWNLOAD_SERVER_PORT` as a port that is not very common.
2. After running the `submarine-installer/install.sh` command on the server where `DOWNLOAD_SERVER_IP` is located, select the `[start download server]` menu item in the installation interface. **submarine-installer** will download all of the deployment's dependencies into the `submarine-installer/downloads` directory on that server, and then start an HTTP download service with the `python -m SimpleHTTPServer ${DOWNLOAD_SERVER_PORT}` command. Do not close the **submarine-installer** running on this server.
3. When you run the `submarine-installer/install.sh` command on the other servers and install each component in turn via the `[install component]` menu in the installation interface, the dependencies are automatically downloaded from the server where `DOWNLOAD_SERVER_IP` is located for installation and deployment.
4. Another use of **DOWNLOAD_SERVER** is that you can download the dependencies manually, put them in the `submarine-installer/downloads` directory of one of the servers, and then open `[start download server]`; this gives the whole cluster the ability to perform offline installation and deployment.
+ **YARN_CONTAINER_EXECUTOR_PATH Configuration**
You can get the container-executor binary from a binary release package, or build it from source: go to the `hadoop-yarn-server-nodemanager` module and run `mvn package -Pnative -DskipTests`, which produces `./target/native/target/usr/local/bin/container-executor`.
You need to fill in the full path of the container-executor file in the `YARN_CONTAINER_EXECUTOR_PATH` configuration item.
+ **YARN_HIERARCHY Configuration**
Please keep the same configuration as `yarn.nodemanager.linux-container-executor.cgroups.hierarchy` in the `yarn-site.xml` configuration file of the YARN cluster you are using. If this is not configured in `yarn-site.xml`, the default is `/hadoop-yarn`.
+ **YARN_NODEMANAGER_LOCAL_DIRS Configuration**
Please keep the same configuration as `yarn.nodemanager.local-dirs` in the `yarn-site.xml` configuration file of the YARN cluster you are using.
+ **YARN_NODEMANAGER_LOG_DIRS Configuration**
Please keep the same configuration as `yarn.nodemanager.log-dirs` in the `yarn-site.xml` configuration file of the YARN cluster you are using.
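To make the items above concrete, here is a sample install.conf fragment. Every value is a placeholder for illustration (the DOCKER_REGISTRY address reuses the example given above); adapt each one to your cluster:

```bash
# Sample install.conf fragment -- all values are assumptions for illustration
LOCAL_DNS_HOST="172.17.0.9"          # taken from /etc/resolv.conf on the node
YARN_DNS_HOST="10.120.196.230"       # IP where the YARN registry DNS runs
ETCD_HOSTS=(10.120.196.230 10.120.196.231 10.120.196.232)  # at least 3 hosts
DOCKER_REGISTRY="10.120.196.232:5000"
DOWNLOAD_SERVER_IP="10.120.196.230"
DOWNLOAD_SERVER_PORT="19000"
YARN_CONTAINER_EXECUTOR_PATH="/etc/yarn/sbin/Linux-amd64-64/container-executor"
YARN_HIERARCHY="/hadoop-yarn"        # must match yarn-site.xml
YARN_NODEMANAGER_LOCAL_DIRS="/hadoop/yarn/local"
YARN_NODEMANAGER_LOG_DIRS="/hadoop/yarn/log"
```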
## Instructions for use
**submarine-installer** is written entirely in shell script and does not need any deployment tools such as ansible. This avoids problems caused by the differing server management rules of different companies; for example, some data centers do not allow the ROOT user to operate remote servers directly through a SHELL.
The deployment process of **submarine-installer** is performed entirely by selecting operations in a menu, which avoids misoperations. Through the menu items you can also install, uninstall, start, and stop any individual component step by step, which provides great flexibility; when a component has problems, the system can also be diagnosed and repaired with **submarine-installer**.
**submarine-installer** displays log information on the screen during the deployment process. The log messages come in three font colors:
+ Red font color: Indicates that the component installation has an error and the deployment has terminated.
+ Green text color: The component is installed properly and the deployment is working properly.
+ Blue text color: You need to manually enter commands in another SHELL terminal according to the prompt information, generally to modify operating system kernel configuration; just follow the prompts in order.
**Start submarine-installer**
Run the `submarine-installer/install.sh` command to start. The deployment program first detects the IP addresses of the network cards in the server. If the server has multiple network cards or multiple IP addresses configured, they are displayed in the form of a list; select the IP address you actually use.
**submarine-installer** menu description:
![alt text](./images/submarine-installer.gif "Submarine Installer")
## Deployment instructions
The deployment process is as follows:
1. Refer to the configuration instructions to configure the `install.conf` file based on your server usage.
2. Copy the entire **submarine-installer** folder to all server nodes
3. First, on the server configured as **DOWNLOAD_SERVER**:
+ Run the `submarine-installer/install.sh` command
+ Select the `[start download server]` menu item in the installation interface, wait for all dependency packages to download, and start the HTTP service.
4. **On the other servers that need to be deployed**
Run the `submarine-installer/install.sh` command. The main menu **[Main menu]** shows the following items:
+ prepare system environment
+ install component
+ uninstall component
+ start component
+ stop component
+ start download server
5. **prepare system environment**
- **prepare operation system**
Check the operating system and version of the deployment server;
- **prepare operation system kernel**
Display the commands for updating the operating system kernel and, according to your choice, automatically update the kernel version;
- **prepare GCC version**
Display the current GCC version in the operating system, the commands for updating it, and, according to your choice, automatically update the GCC version;
- **check GPU**
Check if the server can detect the GPU graphics card;
- **prepare user&group**
Display the prompts for the commands that add the hadoop and docker users and user groups. Following the prompt information, you need to check whether the required users and user groups exist on the server.
- **prepare nvidia environment**
Automatically update the operating system kernel and header files, and automatically install `epel-release` and `dkms`;
Display the commands for modifying the system kernel parameter configuration; you need to open another terminal and execute them in order;
6. **install component**
- **install etcd**
Download the bin file for etcd and install it in the `/usr/bin` directory;
Generate the `etcd.service` file according to the **ETCD_HOSTS** configuration item and install it into the `/etc/systemd/system/` directory.
- **install docker**
Download docker's RPM package for local installation;
Generate the `daemon.json` configuration file and install it into the `/etc/docker/` directory.
Generate the `docker.service` configuration file and install it into the `/etc/systemd/system/` directory.
- **install calico network**
Download the `calico`, `calicoctl`, and `calico-ipam` files and install them in the `/usr/bin` directory.
Generate the `calicoctl.cfg` configuration file and install it into the `/etc/calico/` directory.
Generate the `calico-node.service` configuration file and install it into the `/etc/systemd/system/` directory.
After the installation is complete, a calico network is created automatically according to the **CALICO_NETWORK_NAME** configuration item, and two Docker containers are created automatically to check whether the two containers can ping each other.
- **install nvidia driver**
Download the `nvidia-detect` file to detect the graphics card version on the server;
Download the Nvidia graphics driver installation package according to the graphics card version number;
Check whether Nouveau is disabled on this server. If it is not, the installation stops, and you need to execute the **[prepare nvidia environment]** submenu item in the **[prepare system environment]** menu and follow the prompts.
If Nouveau has been disabled in this server, it will be installed locally;
- **install nvidia docker**
Download the nvidia-docker RPM installation package and install it;
Display the command prompts to check whether nvidia-docker is usable; you need to open another terminal and execute them in order.
- **install yarn container-executor**
Copy the `container-executor` file to the `/etc/yarn/sbin/Linux-amd64-64/` directory according to the **YARN_CONTAINER_EXECUTOR_PATH** configuration item;
Generate the `container-executor.cfg` file according to the configuration and copy it to the `/etc/yarn/sbin/etc/hadoop/` directory.
- **install submarine autorun script**
Copy the submarine.sh file to the `/etc/rc.d/init.d/` directory;
Add `/etc/rc.d/init.d/submarine.sh` to the `/etc/rc.d/rc.local` system self-starting file;
7. uninstall component
Delete the binary and configuration files of the specified component (details omitted here):
- uninstall etcd
- uninstall docker
- uninstall calico network
- uninstall nvidia driver
- uninstall nvidia docker
- uninstall yarn container-executor
- uninstall submarine autorun script
8. start component
Restart the specified component (details omitted here):
- start etcd
- start docker
- start calico network
9. stop component
Stop the specified component (details omitted here):
- stop etcd
- stop docker
- stop calico network
10. start download server
This operation can only be performed on the server where the **DOWNLOAD_SERVER_IP** configuration item is located.

View File

@ -0,0 +1,165 @@
<!---
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License. See accompanying LICENSE file.
-->
#### Test with a tensorflow job
Distributed-shell + GPU + cgroup
```bash
./yarn jar /home/hadoop/hadoop-current/share/hadoop/yarn/hadoop-yarn-submarine-3.2.0-SNAPSHOT.jar job run \
--env DOCKER_JAVA_HOME=/opt/java \
--env DOCKER_HADOOP_HDFS_HOME=/hadoop-3.1.0 --name distributed-tf-gpu \
--env YARN_CONTAINER_RUNTIME_DOCKER_CONTAINER_NETWORK=calico-network \
--worker_docker_image gpu-cuda9.0-tf1.8.0-with-models \
--ps_docker_image dockerfile-cpu-tf1.8.0-with-models \
--input_path hdfs://${dfs_name_service}/tmp/cifar-10-data \
--checkpoint_path hdfs://${dfs_name_service}/user/hadoop/tf-distributed-checkpoint \
--num_ps 0 \
--ps_resources memory=4G,vcores=2,gpu=0 \
--ps_launch_cmd "python /test/cifar10_estimator/cifar10_main.py --data-dir=hdfs://${dfs_name_service}/tmp/cifar-10-data --job-dir=hdfs://${dfs_name_service}/tmp/cifar-10-jobdir --num-gpus=0" \
--worker_resources memory=4G,vcores=2,gpu=1 --verbose \
--num_workers 1 \
--worker_launch_cmd "python /test/cifar10_estimator/cifar10_main.py --data-dir=hdfs://${dfs_name_service}/tmp/cifar-10-data --job-dir=hdfs://${dfs_name_service}/tmp/cifar-10-jobdir --train-steps=500 --eval-batch-size=16 --train-batch-size=16 --sync --num-gpus=1"
```
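After submitting, a couple of stock YARN commands help confirm that the job is actually running. This is a sketch; substitute the real application id printed by the submission:

```bash
# List running applications, then fetch the logs of one of them
yarn application -list
yarn logs -applicationId <application_id>
```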
## Issues:
### Issue 1: Fail to start nodemanager after system reboot
```
2018-09-20 18:54:39,785 ERROR org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor: Failed to bootstrap configured resource subsystems!
org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.resources.ResourceHandlerException: Unexpected: Cannot create yarn cgroup Subsystem:cpu Mount points:/proc/mounts User:yarn Path:/sys/fs/cgroup/cpu,cpuacct/hadoop-yarn
at org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.resources.CGroupsHandlerImpl.initializePreMountedCGroupController(CGroupsHandlerImpl.java:425)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.resources.CGroupsHandlerImpl.initializeCGroupController(CGroupsHandlerImpl.java:377)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.resources.CGroupsCpuResourceHandlerImpl.bootstrap(CGroupsCpuResourceHandlerImpl.java:98)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.resources.CGroupsCpuResourceHandlerImpl.bootstrap(CGroupsCpuResourceHandlerImpl.java:87)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.resources.ResourceHandlerChain.bootstrap(ResourceHandlerChain.java:58)
at org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor.init(LinuxContainerExecutor.java:320)
at org.apache.hadoop.yarn.server.nodemanager.NodeManager.serviceInit(NodeManager.java:389)
at org.apache.hadoop.service.AbstractService.init(AbstractService.java:164)
at org.apache.hadoop.yarn.server.nodemanager.NodeManager.initAndStartNodeManager(NodeManager.java:929)
at org.apache.hadoop.yarn.server.nodemanager.NodeManager.main(NodeManager.java:997)
2018-09-20 18:54:39,789 INFO org.apache.hadoop.service.AbstractService: Service NodeManager failed in state INITED
```
Solution: Grant user yarn access to `/sys/fs/cgroup/cpu,cpuacct`, which is a subfolder of the cgroup mount destination.
```
chown :yarn -R /sys/fs/cgroup/cpu,cpuacct
chmod g+rwx -R /sys/fs/cgroup/cpu,cpuacct
```
If GPUs are used, access to the cgroup devices folder is needed as well:
```
chown :yarn -R /sys/fs/cgroup/devices
chmod g+rwx -R /sys/fs/cgroup/devices
```
### Issue 2: container-executor permission denied
```
2018-09-21 09:36:26,102 WARN org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.privileged.PrivilegedOperationExecutor: IOException executing command:
java.io.IOException: Cannot run program "/etc/yarn/sbin/Linux-amd64-64/container-executor": error=13, Permission denied
at java.lang.ProcessBuilder.start(ProcessBuilder.java:1048)
at org.apache.hadoop.util.Shell.runCommand(Shell.java:938)
at org.apache.hadoop.util.Shell.run(Shell.java:901)
at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:1213)
```
Solution: The permission of `/etc/yarn/sbin/Linux-amd64-64/container-executor` should be 6050.
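A sketch of applying that permission (the root:yarn ownership is an assumption; the group must be the one the NodeManager runs under):

```bash
# The setuid/setgid binary must be owned by root, group readable/executable only
chown root:yarn /etc/yarn/sbin/Linux-amd64-64/container-executor
chmod 6050 /etc/yarn/sbin/Linux-amd64-64/container-executor
```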
### Issue 3: How to get the docker service log
Solution: we can get the docker log with the following command
```
journalctl -u docker
```
### Issue 4: Docker can't remove containers, with errors like `device or resource busy`
```bash
$ docker rm 0bfafa146431
Error response from daemon: Unable to remove filesystem for 0bfafa146431771f6024dcb9775ef47f170edb2f1852f71916ba44209ca6120a: remove /app/docker/containers/0bfafa146431771f6024dcb9775ef47f170edb2f152f71916ba44209ca6120a/shm: device or resource busy
```
Solution: to find which process causes the `device or resource busy` error, add the following shell script, named `find-busy-mnt.sh`:
```bash
#!/bin/bash
# A simple script to get information about mount points and pids and their
# mount namespaces.
if [ $# -ne 1 ]; then
  echo "Usage: $0 <devicemapper-device-id>"
  exit 1
fi

ID=$1
MOUNTS=`find /proc/*/mounts | xargs grep $ID 2>/dev/null`
[ -z "$MOUNTS" ] && echo "No pids found" && exit 0

printf "PID\tNAME\t\tMNTNS\n"
echo "$MOUNTS" | while read LINE; do
  PID=`echo $LINE | cut -d ":" -f1 | cut -d "/" -f3`
  # Ignore self and thread-self
  if [ "$PID" == "self" ] || [ "$PID" == "thread-self" ]; then
    continue
  fi
  NAME=`ps -q $PID -o comm=`
  MNTNS=`readlink /proc/$PID/ns/mnt`
  printf "%s\t%s\t\t%s\n" "$PID" "$NAME" "$MNTNS"
done
```
Kill the process by the pid found by the script:
```bash
$ chmod +x find-busy-mnt.sh
./find-busy-mnt.sh 0bfafa146431771f6024dcb9775ef47f170edb2f152f71916ba44209ca6120a
# PID NAME MNTNS
# 5007 ntpd mnt:[4026533598]
$ kill -9 5007
```
### Issue 5: Failed to execute `sudo nvidia-docker run`
```
docker: Error response from daemon: create nvidia_driver_361.42: VolumeDriver.Create: internal error, check logs for details.
See 'docker run --help'.
```
Solution:
```
#check nvidia-docker status
$ systemctl status nvidia-docker
$ journalctl -n -u nvidia-docker
#restart nvidia-docker
systemctl stop nvidia-docker
systemctl start nvidia-docker
```
### Issue 6: Yarn failed to start containers
If the number of GPUs required by applications is larger than the number of GPUs in the cluster, some containers cannot be created.
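To see how many GPUs the cluster actually exposes before submitting, the node listing is a quick check (assuming the GPU resource plugin is enabled on the NodeManagers):

```bash
# Shows per-node resource details, including GPU counts
yarn node -list -showDetails
```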