k8s-int-test-build: zk-less druid cluster and http based segment/task management (#10686)

* zk-less druid cluster in k8s build

* attempt to fix build and use http based remote task management

* mm/router logs for debugging

* add default account k8s role and binding for pod, configMap access

* fix issue

* change router port to 8088 for common readinessProbe

* break build_run_k8s_cluster.sh into separate scripts

* revert changes to K8sDruidNodeAnnouncer.java

* k8s extension doc update

* add license to new file

* address review comments

* do not try to load lookups at startup to improve cluster startup time
Himanshu 2021-01-05 18:51:47 -08:00 committed by GitHub
parent ea2d51d61f
commit d2e6240cac
9 changed files with 237 additions and 197 deletions


@ -565,8 +565,11 @@ jobs:
jdk: openjdk8
services: &integration_test_services_k8s
- docker
env: CONFIG_FILE='k8s_run_config_file.json' IT_TEST='-Dit.test=ITNestedQueryPushDownTest'
before_script: integration-tests/script/build_run_k8s_cluster.sh
env: CONFIG_FILE='k8s_run_config_file.json' IT_TEST='-Dit.test=ITNestedQueryPushDownTest' POD_NAME=int-test POD_NAMESPACE=default
before_script:
- integration-tests/script/setup_k8s_cluster.sh
- integration-tests/script/setup_druid_operator_on_k8s.sh
- integration-tests/script/setup_druid_on_k8s.sh
script: &run_integration_test_k8s
- ${MVN} verify -pl integration-tests -P int-tests-config-file ${IT_TEST} ${MAVEN_SKIP}
after_script: integration-tests/script/stop_k8s_cluster.sh


@ -39,9 +39,7 @@ This extension works together with HTTP based segment and task management in Dru
`druid.indexer.runner.type=httpRemote`
`druid.discovery.type=k8s`
For node discovery, each Druid process running inside a pod "announces" itself by adding a few "labels" and "annotations" to the pod spec. So, to add those...
- The Druid process needs to know the pod name and namespace, which it reads from the environment variables `POD_NAME` and `POD_NAMESPACE`. These variable names can be changed; see the configuration below. In the end, each pod needs its pod name and namespace set as environment variables.
- The label/annotation path in the pod spec must exist, which is easily satisfied if the pod spec already has at least one label/annotation. This limitation may be removed in the future.
For node discovery, each Druid process running inside a pod "announces" itself by adding a few "labels" and "annotations" to the pod spec. The Druid process needs to know the pod name and namespace, which it reads from the environment variables `POD_NAME` and `POD_NAMESPACE`. These variable names can be changed; see the configuration below. In the end, each pod needs its own pod name and namespace set as environment variables.
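A startup guard along the following lines can catch a pod spec that does not pass the two variables. This is only a sketch: `check_pod_env` is a hypothetical helper, not part of Druid or this extension, and it assumes the default variable names `POD_NAME` and `POD_NAMESPACE`.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of Druid): returns 0 only if both
# POD_NAME and POD_NAMESPACE are set and non-empty.
check_pod_env() {
  local missing=0
  local var
  for var in POD_NAME POD_NAMESPACE; do
    # ${!var:-} is bash indirect expansion with an empty default.
    if [ -z "${!var:-}" ]; then
      echo "missing environment variable: ${var}" >&2
      missing=1
    fi
  done
  return "${missing}"
}

# Possible usage before launching a Druid process inside the pod:
#   check_pod_env || exit 1
```

If the variable names are changed through the extension configuration, the list in the loop would need to change with them.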
Additionally, this extension has the following configuration.
@ -57,3 +55,34 @@ Additionally, this extension has following configuration.
|`druid.discovery.k8s.renewDeadline`|`Duration`|Lease renewal period used by Leader.|PT17S|No|
|`druid.discovery.k8s.retryPeriod`|`Duration`|Retry wait used by Leader Election algorithm on failed operations.|PT5S|No|
### Gotchas
- The label/annotation path in each pod spec MUST EXIST, which is easily satisfied if the pod spec already has at least one label/annotation. This limitation may be removed in the future.
- Druid pods need permissions to add labels to their own pod, list and watch other pods, and create ConfigMaps for leader election. Assuming the "default" service account is used by Druid pods, you might need to add the following, or a similar, Kubernetes Role and RoleBinding.
```
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: druid-cluster
rules:
- apiGroups:
  - ""
  resources:
  - pods
  - configmaps
  verbs:
  - '*'
---
kind: RoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: druid-cluster
subjects:
- kind: ServiceAccount
  name: default
roleRef:
  kind: Role
  name: druid-cluster
  apiGroup: rbac.authorization.k8s.io
```
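Whether a service account actually holds these permissions can be checked with `kubectl auth can-i` and impersonation. The sketch below is an illustration, not part of the commit: `check_druid_rbac` is a hypothetical helper, and mapping "add labels" to the `patch` verb is an assumption.

```shell
#!/usr/bin/env bash
# Sketch only: check_druid_rbac is a hypothetical helper, not part of Druid.
# KUBECTL may be overridden, for example KUBECTL="sudo /usr/local/bin/kubectl".
KUBECTL="${KUBECTL:-kubectl}"

check_druid_rbac() {
  local namespace="${1:-default}"
  local account="${2:-default}"
  local subject="system:serviceaccount:${namespace}:${account}"
  local failed=0
  local check verb resource
  # Assumption: adding labels to a pod maps to the "patch" verb on pods.
  for check in patch:pods list:pods watch:pods create:configmaps; do
    verb="${check%%:*}"
    resource="${check##*:}"
    # KUBECTL is unquoted on purpose so "sudo kubectl" splits into words.
    if ! ${KUBECTL} auth can-i "${verb}" "${resource}" \
        --as="${subject}" -n "${namespace}" >/dev/null 2>&1; then
      echo "not allowed: ${verb} ${resource} as ${subject}" >&2
      failed=1
    fi
  done
  return "${failed}"
}

# Possible usage for the "default" service account in the "default" namespace:
#   check_druid_rbac default default
```

The function returns non-zero if any check fails, so it can gate a deployment script that runs with `set -e`.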


@ -0,0 +1,39 @@
# Licensed to the Apache Software Foundation (ASF) under one or more
# contributor license agreements. See the NOTICE file distributed with
# this work for additional information regarding copyright ownership.
# The ASF licenses this file to You under the Apache License, Version 2.0
# (the "License"); you may not use this file except in compliance with
# the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: druid-cluster
rules:
- apiGroups:
  - ""
  resources:
  - pods
  - configmaps
  verbs:
  - '*'
---
kind: RoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: druid-cluster
subjects:
- kind: ServiceAccount
  name: default
roleRef:
  kind: Role
  name: druid-cluster
  apiGroup: rbac.authorization.k8s.io


@ -26,6 +26,12 @@ spec:
podLabels:
environment: stage
release: alpha
podAnnotations:
dummy: k8s_extn_needs_atleast_one_annotation
readinessProbe:
httpGet:
path: /status/health
port: 8088
securityContext:
fsGroup: 0
runAsUser: 0
@ -60,10 +66,15 @@ spec:
</Configuration>
common.runtime.properties: |
# Zookeeper
druid.zk.service.host=tiny-cluster-zk-0.tiny-cluster-zk
druid.zk.paths.base=/druid
druid.zk.service.compress=false
#
# Zookeeper-less Druid Cluster
#
druid.zk.service.enabled=false
druid.discovery.type=k8s
druid.discovery.k8s.clusterIdentifier=druid-it
druid.serverview.type=http
druid.coordinator.loadqueuepeon.type=http
druid.indexer.runner.type=httpRemote
# Metadata Store
druid.metadata.storage.type=derby
@ -79,13 +90,30 @@ spec:
#
# Extensions
#
druid.extensions.loadList=["druid-avro-extensions","druid-hdfs-storage", "druid-kafka-indexing-service", "druid-datasketches"]
druid.extensions.loadList=["druid-avro-extensions","druid-hdfs-storage", "druid-kafka-indexing-service", "druid-datasketches", "druid-kubernetes-extensions"]
#
# Service discovery
#
druid.selectors.indexing.serviceName=druid/overlord
druid.selectors.coordinator.serviceName=druid/coordinator
druid.indexer.logs.type=file
druid.indexer.logs.directory=/druid/data/task-logs
druid.indexer.task.baseDir=/druid/data/task-base
druid.lookup.enableLookupSyncOnStartup=false
env:
- name: POD_NAME
valueFrom:
fieldRef:
fieldPath: metadata.name
- name: POD_NAMESPACE
valueFrom:
fieldRef:
fieldPath: metadata.namespace
nodes:
brokers:
# Optionally specify for running broker as Deployment
@ -174,7 +202,6 @@ spec:
druid.coordinator.asOverlord.enabled=true
druid.coordinator.asOverlord.overlordService=druid/overlord
druid.indexer.queue.startDelay=PT30S
druid.indexer.runner.type=local
extra.jvm.options: |-
-Xmx800m
-Xms800m
@ -237,16 +264,16 @@ spec:
routers:
nodeType: "router"
druid.port: 8888
druid.port: 8088
services:
- spec:
type: NodePort
ports:
- name: router-service-port
nodePort: 30400
port: 8888
port: 8088
protocol: TCP
targetPort: 8888
targetPort: 8088
selector:
nodeSpecUniqueStr: druid-tiny-cluster-routers
metadata:
@ -258,7 +285,7 @@ spec:
replicas: 1
runtime.properties: |
druid.service=druid/router
druid.plaintextPort=8888
druid.plaintextPort=8088
# HTTP proxy
druid.router.http.numConnections=50
@ -317,4 +344,4 @@ spec:
requests:
memory: "3G"
limits:
memory: "3G"
memory: "3G"


@ -1,84 +0,0 @@
#!/usr/bin/env bash
# Licensed to the Apache Software Foundation (ASF) under one or more
# contributor license agreements. See the NOTICE file distributed with
# this work for additional information regarding copyright ownership.
# The ASF licenses this file to You under the Apache License, Version 2.0
# (the "License"); you may not use this file except in compliance with
# the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
set -e
export DRUID_OPERATOR_VERSION=0.0.3
# setup client keystore
cd integration-tests
./docker/tls/generate-client-certs-and-keystores.sh
rm -rf docker/client_tls
cp -r client_tls docker/client_tls
cd ..
# Build Docker images for pods
mvn -B -ff -q dependency:go-offline \
install \
-Pdist,bundle-contrib-exts \
-Pskip-static-checks,skip-tests \
-Dmaven.javadoc.skip=true
docker build -t druid/cluster:v1 -f distribution/docker/DockerfileBuildTarAdvanced .
# Set Necessary ENV
export CHANGE_MINIKUBE_NONE_USER=true
export MINIKUBE_WANTUPDATENOTIFICATION=false
export MINIKUBE_WANTREPORTERRORPROMPT=false
export MINIKUBE_HOME=$HOME
export KUBECONFIG=$HOME/.kube/config
sudo apt install -y conntrack
# This tmp dir is used for MiddleManager pod and Historical Pod to cache segments.
mkdir tmp
chmod 777 tmp
# Launch K8S cluster
curl -Lo kubectl https://storage.googleapis.com/kubernetes-release/release/v1.18.1/bin/linux/amd64/kubectl && chmod +x kubectl && sudo mv kubectl /usr/local/bin/
curl -Lo minikube https://storage.googleapis.com/minikube/releases/v1.8.1/minikube-linux-amd64 && chmod +x minikube && sudo mv minikube /usr/local/bin/
sudo /usr/local/bin/minikube start --profile=minikube --vm-driver=none --kubernetes-version=v1.18.1
sudo /usr/local/bin/minikube update-context
# Prepare For Druid-Operator
git clone https://github.com/druid-io/druid-operator.git
cd druid-operator
git checkout -b druid-operator-$DRUID_OPERATOR_VERSION druid-operator-$DRUID_OPERATOR_VERSION
cd ..
sed -i "s|REPLACE_IMAGE|druidio/druid-operator:$DRUID_OPERATOR_VERSION|g" druid-operator/deploy/operator.yaml
cp integration-tests/tiny-cluster.yaml druid-operator/examples/
cp integration-tests/tiny-cluster-zk.yaml druid-operator/examples/
sed -i "s|REPLACE_VOLUMES|`pwd`|g" druid-operator/examples/tiny-cluster.yaml
# Create ZK, Historical, MiddleManager, Overlord-coordinator, Broker and Router pods using statefulset
sudo /usr/local/bin/kubectl create -f druid-operator/deploy/service_account.yaml
sudo /usr/local/bin/kubectl create -f druid-operator/deploy/role.yaml
sudo /usr/local/bin/kubectl create -f druid-operator/deploy/role_binding.yaml
sudo /usr/local/bin/kubectl create -f druid-operator/deploy/crds/druid.apache.org_druids_crd.yaml
sudo /usr/local/bin/kubectl create -f druid-operator/deploy/operator.yaml
sudo /usr/local/bin/kubectl apply -f druid-operator/examples/tiny-cluster-zk.yaml
sudo /usr/local/bin/kubectl apply -f druid-operator/examples/tiny-cluster.yaml
# Wait 4 * 15 seconds to launch pods.
#count=0
#JSONPATH='{range .items[*]}{@.metadata.name}:{range @.status.conditions[*]}{@.type}={@.status};{end}{end}'; until sudo /usr/local/bin/kubectl -n default get pods -lapp=travis-example -o jsonpath="$JSONPATH" 2>&1 | grep -q "Ready=True"; do sleep 4;if [ $count -eq 15 ];then break 2 ;else let "count++";fi;echo $i;echo "waiting for travis-example deployment to be available"; sudo /usr/local/bin/kubectl get pods -n default; done
sleep 120
## Debug And FastFail
sudo /usr/local/bin/kubectl get pod
sudo /usr/local/bin/kubectl get svc
docker images
sudo /usr/local/bin/kubectl describe pod druid-tiny-cluster-middlemanagers-0


@ -0,0 +1,52 @@
#!/usr/bin/env bash
# Licensed to the Apache Software Foundation (ASF) under one or more
# contributor license agreements. See the NOTICE file distributed with
# this work for additional information regarding copyright ownership.
# The ASF licenses this file to You under the Apache License, Version 2.0
# (the "License"); you may not use this file except in compliance with
# the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
set -e
export KUBECTL="sudo /usr/local/bin/kubectl"
# setup client keystore
cd integration-tests
./docker/tls/generate-client-certs-and-keystores.sh
rm -rf docker/client_tls
cp -r client_tls docker/client_tls
cd ..
# Build Docker images for pods
mvn -B -ff -q dependency:go-offline \
install \
-Pdist,bundle-contrib-exts \
-Pskip-static-checks,skip-tests \
-Dmaven.javadoc.skip=true
docker build -t druid/cluster:v1 -f distribution/docker/DockerfileBuildTarAdvanced .
# This tmp dir is used for MiddleManager pod and Historical Pod to cache segments.
mkdir tmp
chmod 777 tmp
$KUBECTL apply -f integration-tests/k8s/role-and-binding.yaml
sed -i "s|REPLACE_VOLUMES|`pwd`|g" integration-tests/k8s/tiny-cluster.yaml
$KUBECTL apply -f integration-tests/k8s/tiny-cluster.yaml
# Wait a bit
sleep 120
## Debug And FastFail
$KUBECTL get pod
$KUBECTL get svc
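The fixed `sleep 120` above could be replaced by polling with a timeout. The following is a minimal sketch, not what the commit does: `wait_until` is a hypothetical helper that retries an arbitrary command until it succeeds or the timeout (in seconds) expires.

```shell
#!/usr/bin/env bash
# Hypothetical helper: wait_until <timeout_seconds> <interval_seconds> <command...>
# Re-runs the command until it exits 0; returns 1 once the timeout has passed.
wait_until() {
  local timeout="$1"
  local interval="$2"
  shift 2
  # SECONDS is a bash built-in counter of seconds since the shell started.
  local deadline=$(( SECONDS + timeout ))
  until "$@"; do
    if (( SECONDS >= deadline )); then
      return 1
    fi
    sleep "${interval}"
  done
}
```

A possible call is `wait_until 300 5 curl -sf http://localhost:30400/status/health`, assuming the router NodePort 30400 from tiny-cluster.yaml is reachable on localhost in the test environment. A fixed sleep is simpler but always costs the full two minutes, while polling returns as soon as the check passes.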


@ -0,0 +1,37 @@
#!/usr/bin/env bash
# Licensed to the Apache Software Foundation (ASF) under one or more
# contributor license agreements. See the NOTICE file distributed with
# this work for additional information regarding copyright ownership.
# The ASF licenses this file to You under the Apache License, Version 2.0
# (the "License"); you may not use this file except in compliance with
# the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
set -e
export DRUID_OPERATOR_VERSION=0.0.3
export KUBECTL="sudo /usr/local/bin/kubectl"
# Prepare For Druid-Operator
git clone https://github.com/druid-io/druid-operator.git
cd druid-operator
git checkout -b druid-operator-$DRUID_OPERATOR_VERSION druid-operator-$DRUID_OPERATOR_VERSION
cd ..
sed -i "s|REPLACE_IMAGE|druidio/druid-operator:$DRUID_OPERATOR_VERSION|g" druid-operator/deploy/operator.yaml
# Deploy Druid Operator and Druid CR spec
$KUBECTL create -f druid-operator/deploy/service_account.yaml
$KUBECTL create -f druid-operator/deploy/role.yaml
$KUBECTL create -f druid-operator/deploy/role_binding.yaml
$KUBECTL create -f druid-operator/deploy/crds/druid.apache.org_druids_crd.yaml
$KUBECTL create -f druid-operator/deploy/operator.yaml
echo "Setup Druid Operator on K8S Done!"


@ -0,0 +1,34 @@
#!/usr/bin/env bash
# Licensed to the Apache Software Foundation (ASF) under one or more
# contributor license agreements. See the NOTICE file distributed with
# this work for additional information regarding copyright ownership.
# The ASF licenses this file to You under the Apache License, Version 2.0
# (the "License"); you may not use this file except in compliance with
# the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
set -e
# Set Necessary ENV
export CHANGE_MINIKUBE_NONE_USER=true
export MINIKUBE_WANTUPDATENOTIFICATION=false
export MINIKUBE_WANTREPORTERRORPROMPT=false
export MINIKUBE_HOME=$HOME
export KUBECONFIG=$HOME/.kube/config
sudo apt install -y conntrack
# Launch K8S cluster
curl -Lo kubectl https://storage.googleapis.com/kubernetes-release/release/v1.18.1/bin/linux/amd64/kubectl && chmod +x kubectl && sudo mv kubectl /usr/local/bin/
curl -Lo minikube https://storage.googleapis.com/minikube/releases/v1.8.1/minikube-linux-amd64 && chmod +x minikube && sudo mv minikube /usr/local/bin/
sudo /usr/local/bin/minikube start --profile=minikube --vm-driver=none --kubernetes-version=v1.18.1
sudo /usr/local/bin/minikube update-context
echo "Setup K8S Cluster Done!"


@ -1,97 +0,0 @@
# Licensed to the Apache Software Foundation (ASF) under one or more
# contributor license agreements. See the NOTICE file distributed with
# this work for additional information regarding copyright ownership.
# The ASF licenses this file to You under the Apache License, Version 2.0
# (the "License"); you may not use this file except in compliance with
# the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
---
apiVersion: v1
kind: Service
metadata:
  name: tiny-cluster-zk
spec:
  clusterIP: None
  ports:
    - name: zk-client-port
      port: 2181
    - name: zk-fwr-port
      port: 2888
    - name: zk-elec-port
      port: 3888
  selector:
    zk_cluster: tiny-cluster-zk
---
apiVersion: v1
kind: Service
metadata:
  name: tiny-cluster-zk-nodeport
spec:
  type: NodePort
  ports:
    - name: zk-service-port
      nodePort: 30600
      port: 2181
      protocol: TCP
      targetPort: 2181
  selector:
    zk_cluster: tiny-cluster-zk
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  labels:
    zk_cluster: tiny-cluster-zk
  name: tiny-cluster-zk
spec:
  replicas: 1
  selector:
    matchLabels:
      zk_cluster: tiny-cluster-zk
  serviceName: tiny-cluster-zk
  template:
    metadata:
      labels:
        zk_cluster: tiny-cluster-zk
    spec:
      containers:
        - env:
            - name: ZOO_SERVERS
              value: server.0=tiny-cluster-zk-0.tiny-cluster-zk:2888:3888
            - name: SERVER_JVMFLAGS
              value: -Xms256m -Xmx256m
          image: zookeeper:3.4.13
          name: tiny-cluster-zk
          command: ["/bin/sh"]
          args: ["-c", "ZOO_MY_ID=$(echo `hostname` | cut -d '-' -f2) /docker-entrypoint.sh zkServer.sh start-foreground"]
          ports:
            - containerPort: 2181
              name: zk-client-port
            - containerPort: 2888
              name: zk-fwr-port
            - containerPort: 3888
              name: zk-elec-port
          resources:
            limits:
              cpu: 1
              memory: 512Mi
            requests:
              cpu: 1
              memory: 512Mi
          volumeMounts:
            - mountPath: /data
              name: druid-test-zk-data
            - mountPath: /datalog
              name: druid-test-zk-data-log
      volumes:
        - name: druid-test-zk-data
          emptyDir: {}
        - name: druid-test-zk-data-log
          emptyDir: {}