Example of ZooKeeper and Solr cluster with Docker networking
------------------------------------------------------------

_Note: this article dates from Jan 2016. While this approach would still work, in Jan 2019 this would typically be done with Docker clustering and orchestration tools like Kubernetes. See for example [this blog post](https://lucidworks.com/2019/02/07/running-solr-on-kubernetes-part-1/)._

In this example I'll create a cluster with 3 ZooKeeper nodes and 3 Solr nodes, distributed over 3 machines (trinity10, trinity20, trinity30). I'll use an overlay network, specify fixed IP addresses when creating containers, and pass in explicit `/etc/hosts` entries to make sure they are available even when nodes are down. I won't show the configuration of the key-value store needed to enable overlay networking; see [the docs](https://docs.docker.com/engine/userguide/networking/get-started-overlay/) for that.

I'll not use Docker Swarm in this example, but will specifically place and configure containers where I want them, by ssh'ing into the appropriate Docker host. To make this example easier to understand I'll just use shell commands. For actual use you may want to use a fancier deployment tool like [Fabric](http://www.fabfile.org).

Note: this example requires Docker 1.10.

I'll run these commands from the first machine, trinity10.

Create a network named "netzksolr" for this cluster. The `--ip-range` specifies the range of addresses to use for containers, whereas the `--subnet` specifies all possible addresses in this network. So effectively, addresses in the subnet but outside the range are reserved for containers that specifically use the `--ip` option.
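As a back-of-the-envelope check of that split (illustration only, not part of the cluster setup), shell arithmetic on the CIDR prefix lengths shows how many addresses each pool gets:

```
# Sketch: pool sizes implied by --subnet 192.168.22.0/24 and --ip-range 192.168.22.128/25.
subnet_size=$((1 << (32 - 24)))   # addresses in the /24 subnet
range_size=$((1 << (32 - 25)))    # addresses in the /25 dynamic range
echo "subnet=$subnet_size dynamic=$range_size left-for--ip=$((subnet_size - range_size))"
# -> subnet=256 dynamic=128 left-for--ip=128
```

So roughly the lower half of the subnet (192.168.22.1-127) stays free for the fixed `--ip` assignments used below.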
```
docker network create --driver=overlay --subnet 192.168.22.0/24 --ip-range=192.168.22.128/25 netzksolr
```

As a simple test, check that automatic assignment and specific assignment work:

```
$ docker run -i --rm --net=netzksolr busybox ip -4 addr show eth0 | grep inet
    inet 192.168.22.129/24 scope global eth0
$ docker run -i --rm --net=netzksolr --ip=192.168.22.5 busybox ip -4 addr show eth0 | grep inet
    inet 192.168.22.5/24 scope global eth0
```

So next create containers for the ZooKeeper nodes. First define some environment variables for convenience:

```
# the machine to run the container on
ZK1_HOST=trinity10.lan
ZK2_HOST=trinity20.lan
ZK3_HOST=trinity30.lan
# the IP address for the container
ZK1_IP=192.168.22.10
ZK2_IP=192.168.22.11
ZK3_IP=192.168.22.12
# the Docker image
ZK_IMAGE=jplock/zookeeper
```

Then create the containers:

```
ssh -n $ZK1_HOST "docker pull jplock/zookeeper && docker create --ip=$ZK1_IP --net netzksolr --name zk1 --hostname=zk1 --add-host zk2:$ZK2_IP --add-host zk3:$ZK3_IP -it $ZK_IMAGE"
ssh -n $ZK2_HOST "docker pull jplock/zookeeper && docker create --ip=$ZK2_IP --net netzksolr --name zk2 --hostname=zk2 --add-host zk1:$ZK1_IP --add-host zk3:$ZK3_IP -it $ZK_IMAGE"
ssh -n $ZK3_HOST "docker pull jplock/zookeeper && docker create --ip=$ZK3_IP --net netzksolr --name zk3 --hostname=zk3 --add-host zk1:$ZK1_IP --add-host zk2:$ZK2_IP -it $ZK_IMAGE"
```

Next configure those containers by creating ZooKeeper's `zoo.cfg` and `myid` files:

```
# Add ZooKeeper nodes to the ZooKeeper config.
# If you use hostnames here, ZK will complain with UnknownHostException about the other nodes.
# In ZooKeeper 3.4.6 that stays broken forever; in 3.4.7 that does recover.
# If you use IP addresses you avoid the UnknownHostException and get a quorum more quickly,
# but IP address changes can impact you.
docker cp zk1:/opt/zookeeper/conf/zoo.cfg .
cat >>zoo.cfg <<EOF
server.1=zk1:2888:3888
server.2=zk2:2888:3888
server.3=zk3:2888:3888
EOF
```

Once the ZooKeeper and Solr containers are running, and a two-shard collection `my_collection1` has been created and loaded with example documents, we can query it and count the documents returned, in total and per shard:

```
mak@trinity10:~$ docker exec -it --user=solr zksolr1 bash -c "wget -q -O - 'http://zksolr1:8983/solr/my_collection1/select?q=*:*&indent=true&rows=100&fl=id' | egrep '<str name=\"id\">' | wc -l"
11
mak@trinity10:~$ docker exec -it --user=solr zksolr1 bash -c "wget -q -O - 'http://zksolr1:8983/solr/my_collection1/select?q=*:*&shards=shard1&rows=100&indent=true&fl=id' | grep '<str name=\"id\">' | wc -l"
4
mak@trinity10:~$ docker exec -it --user=solr zksolr1 bash -c "wget -q -O - 'http://zksolr1:8983/solr/my_collection1/select?q=*:*&shards=shard2&rows=100&indent=true&fl=id' | grep '<str name=\"id\">' | wc -l"
7
```

Now, to get access to this overlay network from the outside, we can use a container to proxy the connections. For a simple TCP proxy container with an exposed port on the Docker host, proxying to a single Solr node, you can use [brandnetworks/tcpproxy](https://github.com/brandnetworks/tcpproxy):

```
ssh -n trinity10.lan "docker pull brandnetworks/tcpproxy && docker run -p 8001 -p 8002 --net netzksolr --name zksolrproxy --hostname=zksolrproxy.netzksolr -tid brandnetworks/tcpproxy --connections 8002:zksolr1:8983"
docker port zksolrproxy 8002
```

Or use a suitably configured HAProxy to round-robin between all Solr nodes. Or, instead of the overlay network, use [Project Calico](http://www.projectcalico.org) and configure L3 routing so you do not need to mess with proxies.

Now I can get to Solr on http://trinity10:32774/solr/#/. In the Cloud -> Tree -> /live_nodes view I see the Solr nodes. From the Solr UI select the my_collection1 core, and click on Cloud -> Graph to see how it has created two shards across our Solr nodes.
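For the HAProxy alternative just mentioned, a minimal configuration sketch might look like this. This is my illustration, not part of the original setup: it assumes an HAProxy container joined to the `netzksolr` network (like the tcpproxy above), and the backend/server names and `check` option are my additions:

```
# Hypothetical haproxy.cfg: round-robin HTTP over the three Solr nodes.
defaults
    mode http
    timeout connect 5s
    timeout client  30s
    timeout server  30s

frontend solr_in
    bind *:8983
    default_backend solr_nodes

backend solr_nodes
    balance roundrobin
    server zksolr1 zksolr1:8983 check
    server zksolr2 zksolr2:8983 check
    server zksolr3 zksolr3:8983 check
```

Because the container is on the overlay network, the `zksolr1`-`zksolr3` hostnames resolve just as they do for the tcpproxy.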
Now, by way of test, we'll stop the Solr containers, start them out-of-order, verify the IP addresses are unchanged, and check that the same results come back:

```
ssh -n $ZKSOLR1_HOST docker kill zksolr1
ssh -n $ZKSOLR2_HOST docker kill zksolr2
ssh -n $ZKSOLR3_HOST docker kill zksolr3
ssh -n $ZKSOLR1_HOST docker start zksolr1
sleep 3
ssh -n $ZKSOLR3_HOST docker start zksolr3
sleep 3
ssh -n $ZKSOLR2_HOST docker start zksolr2
ssh -n $ZKSOLR1_HOST 'docker inspect --format "{{ .NetworkSettings.Networks.netzksolr.IPAddress }}" zksolr1'
ssh -n $ZKSOLR2_HOST 'docker inspect --format "{{ .NetworkSettings.Networks.netzksolr.IPAddress }}" zksolr2'
ssh -n $ZKSOLR3_HOST 'docker inspect --format "{{ .NetworkSettings.Networks.netzksolr.IPAddress }}" zksolr3'
docker exec -it --user=solr zksolr1 bash -c "wget -q -O - 'http://zksolr1:8983/solr/my_collection1/select?q=*:*&indent=true&rows=100&fl=id' | egrep '<str name=\"id\">' | wc -l"
docker exec -it --user=solr zksolr1 bash -c "wget -q -O - 'http://zksolr1:8983/solr/my_collection1/select?q=*:*&shards=shard1&rows=100&indent=true&fl=id' | grep '<str name=\"id\">' | wc -l"
docker exec -it --user=solr zksolr1 bash -c "wget -q -O - 'http://zksolr1:8983/solr/my_collection1/select?q=*:*&shards=shard2&rows=100&indent=true&fl=id' | grep '<str name=\"id\">' | wc -l"
```

Good, that works.

Finally, to clean up this example:

```
ssh -n $ZK1_HOST "docker kill zk1; docker rm zk1"
ssh -n $ZK2_HOST "docker kill zk2; docker rm zk2"
ssh -n $ZK3_HOST "docker kill zk3; docker rm zk3"
ssh -n $ZKSOLR1_HOST "docker kill zksolr1; docker rm zksolr1"
ssh -n $ZKSOLR2_HOST "docker kill zksolr2; docker rm zksolr2"
ssh -n $ZKSOLR3_HOST "docker kill zksolr3; docker rm zksolr3"
ssh -n trinity10.lan "docker kill zksolrproxy; docker rm zksolrproxy"
docker network rm netzksolr
```