  1. AMQ Streams
  2. ENTMQST-3034

Failed to connect to Zookeeper when using STRIMZI_OPERATOR_NAMESPACE_LABELS

Details

    • Type: Bug
    • Status: Closed
    • Priority: Critical
    • Resolution: Explained
    • Affects Version/s: 1.7.0.GA
    • Fix Version/s: None
    • Component/s: cluster-operator
    • Labels: None

    Description

      The STRIMZI_OPERATOR_NAMESPACE_LABELS environment variable only takes effect when the Cluster Operator runs in a namespace separate from the Kafka cluster; when the operator runs in the same namespace as the cluster, there is no namespaceSelector.

      If I use the following deployment procedure:

      # Namespaces for the operator and for the Kafka cluster it watches
      OPERATOR_NS="kafka-operator"
      CLUSTER_NS="kafka-cluster"
      CLUSTER_NAME="my-cluster"
      CUSTOM_LABEL="axa-cloud.com/namespace-name=kafka-amqstreams-dev-axa-ch"

      kubectl create ns "$OPERATOR_NS"
      kubectl create ns "$CLUSTER_NS"

      # Bind the operator's service account to the cluster roles it needs
      kubectl -n "$OPERATOR_NS" create clusterrolebinding strimzi-cluster-operator-namespaced \
          --clusterrole strimzi-cluster-operator-namespaced --serviceaccount "$OPERATOR_NS:strimzi-cluster-operator"
      kubectl -n "$OPERATOR_NS" create clusterrolebinding strimzi-cluster-operator-entity-operator-delegation \
          --clusterrole strimzi-entity-operator --serviceaccount "$OPERATOR_NS:strimzi-cluster-operator"
      kubectl -n "$OPERATOR_NS" create clusterrolebinding strimzi-cluster-operator-topic-operator-delegation \
          --clusterrole strimzi-topic-operator --serviceaccount "$OPERATOR_NS:strimzi-cluster-operator"

      # Install the operator and point it at the cluster namespace
      sed -i.bk "s/namespace: .*/namespace: $OPERATOR_NS/g" install/cluster-operator/*RoleBinding*.yaml
      kubectl -n "$OPERATOR_NS" create -f install/cluster-operator
      kubectl -n "$OPERATOR_NS" set env deploy strimzi-cluster-operator STRIMZI_NAMESPACE="$CLUSTER_NS"
      kubectl -n "$OPERATOR_NS" set env deploy strimzi-cluster-operator STRIMZI_OPERATOR_NAMESPACE_LABELS="$CUSTOM_LABEL"

      # Label the cluster namespace (the operator namespace is left unlabeled) and deploy Kafka
      kubectl label namespace "$CLUSTER_NS" "$CUSTOM_LABEL"
      kubectl -n "$CLUSTER_NS" create -f examples/kafka/kafka-persistent.yaml

      Only the ZooKeeper pods are deployed, and I get the following error (full log attached):

      2021-07-07 15:20:45 INFO  CrdOperator:112 - Status of Kafka my-cluster in namespace kafka-cluster has been updated
      2021-07-07 15:20:45 INFO  OperatorWatcher:40 - Reconciliation #5(watch) Kafka(kafka-cluster/my-cluster): Kafka my-cluster in namespace kafka-cluster was MODIFIED
      2021-07-07 15:20:45 WARN  AbstractOperator:508 - Reconciliation #0(watch) Kafka(kafka-cluster/my-cluster): Failed to reconcile
      io.strimzi.operator.cluster.operator.resource.ZookeeperScalingException: Failed to connect to Zookeeper my-cluster-zookeeper-0.my-cluster-zookeeper-nodes.kafka-cluster.svc:2181,my-cluster-zookeeper-1.my-cluster-zookeeper-nodes.kafka-cluster.svc:2181,my-cluster-zookeeper-2.my-cluster-zookeeper-nodes.kafka-cluster.svc:2181. Connection was not ready in 300000 ms.
      	at io.strimzi.operator.cluster.operator.resource.ZookeeperScaler.lambda$connect$5(ZookeeperScaler.java:157) ~[io.strimzi.cluster-operator-0.22.1.redhat-00004.jar:0.22.1.redhat-00004]
      	at io.vertx.core.impl.FutureImpl.tryFail(FutureImpl.java:195) ~[io.vertx.vertx-core-3.9.4.redhat-00001.jar:3.9.4.redhat-00001]
      	at io.vertx.core.impl.FutureImpl.fail(FutureImpl.java:125) ~[io.vertx.vertx-core-3.9.4.redhat-00001.jar:3.9.4.redhat-00001]
      	at io.strimzi.operator.common.Util$1.lambda$handle$1(Util.java:137) ~[io.strimzi.operator-common-0.22.1.redhat-00004.jar:0.22.1.redhat-00004]
      	at io.vertx.core.impl.ContextImpl.lambda$null$0(ContextImpl.java:327) ~[io.vertx.vertx-core-3.9.4.redhat-00001.jar:3.9.4.redhat-00001]
      	at io.vertx.core.impl.ContextImpl.executeTask(ContextImpl.java:366) ~[io.vertx.vertx-core-3.9.4.redhat-00001.jar:3.9.4.redhat-00001]
      	at io.vertx.core.impl.EventLoopContext.lambda$executeAsync$0(EventLoopContext.java:38) ~[io.vertx.vertx-core-3.9.4.redhat-00001.jar:3.9.4.redhat-00001]
      	at io.netty.util.concurrent.AbstractEventExecutor.safeExecute(AbstractEventExecutor.java:164) [io.netty.netty-common-4.1.60.Final-redhat-00001.jar:4.1.60.Final-redhat-00001]
      	at io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:472) [io.netty.netty-common-4.1.60.Final-redhat-00001.jar:4.1.60.Final-redhat-00001]
      	at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:500) [io.netty.netty-transport-4.1.60.Final-redhat-00001.jar:4.1.60.Final-redhat-00001]
      	at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:989) [io.netty.netty-common-4.1.60.Final-redhat-00001.jar:4.1.60.Final-redhat-00001]
      	at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74) [io.netty.netty-common-4.1.60.Final-redhat-00001.jar:4.1.60.Final-redhat-00001]
      	at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30) [io.netty.netty-common-4.1.60.Final-redhat-00001.jar:4.1.60.Final-redhat-00001]
      	at java.lang.Thread.run(Thread.java:829) [?:?]
      Caused by: io.strimzi.operator.common.operator.resource.TimeoutException: Exceeded timeout of 300000ms while waiting for ZooKeeperAdmin connection to my-cluster-zookeeper-0.my-cluster-zookeeper-nodes.kafka-cluster.svc:2181,my-cluster-zookeeper-1.my-cluster-zookeeper-nodes.kafka-cluster.svc:2181,my-cluster-zookeeper-2.my-cluster-zookeeper-nodes.kafka-cluster.svc:2181 to be connected
      	... 11 more
      

      If I remove the STRIMZI_OPERATOR_NAMESPACE_LABELS environment variable, the cluster deployment completes successfully.
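
One thing that may be worth checking, given that the procedure above labels $CLUSTER_NS rather than $OPERATOR_NS: as far as the Strimzi documentation describes it, STRIMZI_OPERATOR_NAMESPACE_LABELS is meant to hold the labels of the namespace where the Cluster Operator itself runs, and those labels feed the namespaceSelector of the generated network policies. The sketch below is only a diagnostic aid under that assumption. The helper name labels_subset is hypothetical and not part of Strimzi or kubectl, and the network policy name in the commented usage is an assumption based on the cluster name.

```shell
#!/usr/bin/env bash

# Hypothetical helper (not part of Strimzi or kubectl).
# $1: wanted labels, comma-separated key=value pairs
#     (the format used for STRIMZI_OPERATOR_NAMESPACE_LABELS above)
# $2: actual labels, comma-separated key=value pairs
#     (the format printed by `kubectl get ns --show-labels`)
# Returns 0 if every wanted pair is present in the actual labels, 1 otherwise.
labels_subset() {
  local wanted="$1"
  local actual=",$2,"
  local pair
  local IFS=','
  for pair in $wanted; do
    [ -z "$pair" ] && continue
    case "$actual" in
      *",$pair,"*) ;;      # pair found, keep checking the rest
      *) return 1 ;;       # pair missing from the namespace labels
    esac
  done
  return 0
}

# Usage against a live cluster (left commented out; requires kubectl access):
#
#   actual=$(kubectl get ns "$OPERATOR_NS" --show-labels --no-headers | awk '{print $NF}')
#   if labels_subset "$CUSTOM_LABEL" "$actual"; then
#     echo "operator namespace carries the configured labels"
#   else
#     echo "operator namespace is missing the configured labels"
#   fi
#
#   # Assumed policy name; inspect its namespaceSelector for the ZooKeeper port:
#   kubectl -n "$CLUSTER_NS" get networkpolicy "$CLUSTER_NAME-network-policy-zookeeper" -o yaml
```

If the operator namespace turns out not to carry the configured label, that would be consistent with the connection timeout in the log above, though the attached full log and the generated network policy would need to confirm it.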

          People

            Assignee: tbentley-1 Tom Bentley
            Reporter: rhn-support-fvaleri Federico Valeri