OCPBUGS-11222

kube-controller-manager cluster operator is degraded due to connection refused while querying rules

    • Release Note Not Required

      This is a clone of issue OCPBUGS-7440. The following is the description of the original issue:

      Description of problem:

      While trying to figure out why installing single-node OpenShift (SNO) takes so long, I noticed that the kube-controller-manager cluster operator is degraded for ~5 minutes due to:
      GarbageCollectorDegraded: error fetching rules: Get "https://thanos-querier.openshift-monitoring.svc:9091/api/v1/rules": dial tcp 172.30.119.108:9091: connect: connection refused
      I don't understand how the prometheusClient is successfully initialized when we get a connection refused as soon as we try to query the rules.
      Note that if the client initialization fails, the kube-controller-manager won't set GarbageCollectorDegraded to true.
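
One possible explanation, not confirmed in this report, is that the client opens its TCP connection lazily, so construction can succeed while the first request is refused. The sketch below shows that pattern using only Python's standard library as a stand-in. The operator itself is written in Go, and whether its Prometheus client behaves this way is an assumption. The helper names `find_closed_port` and `build_then_query` are hypothetical.

```python
import http.client
import socket


def find_closed_port() -> int:
    """Return a local port that nothing is listening on (hypothetical helper)."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        s.bind(("127.0.0.1", 0))   # let the OS pick a free port
        return s.getsockname()[1]  # the socket is closed on exit, so the port is closed


def build_then_query(host: str, port: int):
    """Build a client, then query it; return the error class name or None."""
    # Constructing the connection object does not dial the server.
    conn = http.client.HTTPConnection(host, port, timeout=2)
    try:
        # The TCP connection is only attempted here, on the first request.
        conn.request("GET", "/api/v1/rules")
        conn.getresponse()
        return None
    except OSError as exc:
        return type(exc).__name__
    finally:
        conn.close()
```

Calling `build_then_query("127.0.0.1", find_closed_port())` gets past construction without error and only fails at the request, which matches the shape of the symptom described above.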

      Version-Release number of selected component (if applicable):

      4.12

      How reproducible:

      100%

      Steps to Reproduce:

      1. Install SNO with bootstrap-in-place (https://github.com/eranco74/bootstrap-in-place-poc)
      
      2. Monitor the cluster operators' status
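
Step 2 can be scripted. The following is a minimal sketch, assuming the `oc` CLI is on the PATH and logged in to the cluster; it reads the ClusterOperator object as JSON and picks out its `Degraded` condition. The function names and the `SAMPLE` object are illustrative, and the sample only reproduces the message quoted in this report.

```python
import json
import subprocess


def fetch_cluster_operator(name: str = "kube-controller-manager") -> dict:
    """Fetch a ClusterOperator as a dict via `oc get clusteroperator <name> -o json`."""
    out = subprocess.run(
        ["oc", "get", "clusteroperator", name, "-o", "json"],
        check=True, capture_output=True, text=True,
    ).stdout
    return json.loads(out)


def degraded_condition(cluster_operator: dict):
    """Return the condition with type "Degraded", or None if it is absent."""
    for cond in cluster_operator.get("status", {}).get("conditions", []):
        if cond.get("type") == "Degraded":
            return cond
    return None


# Illustrative object; only the message text comes from this report.
SAMPLE = {
    "status": {
        "conditions": [
            {"type": "Available", "status": "True"},
            {
                "type": "Degraded",
                "status": "True",
                "message": (
                    "GarbageCollectorDegraded: error fetching rules: Get "
                    '"https://thanos-querier.openshift-monitoring.svc:9091/api/v1/rules": '
                    "dial tcp 172.30.119.108:9091: connect: connection refused"
                ),
            },
        ]
    }
}
```

Against a live cluster, `degraded_condition(fetch_cluster_operator())` can be called in a loop during installation to see when the condition flips and how long it stays set.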
      

      Actual results:

      GarbageCollectorDegraded: error fetching rules: Get "https://thanos-querier.openshift-monitoring.svc:9091/api/v1/rules": dial tcp 172.30.119.108:9091: connect: connection refused 

      Expected results:

      Expected the GarbageCollectorDegraded status to be false

      Additional info:

      It seems that PrometheusClient has to create a connection successfully in order to be initialised, yet we get connection refused once we make the query.
      Note that installing SNO with this patch (https://github.com/eranco74/cluster-kube-controller-manager-operator/commit/26e644503a8f04aa6d116ace6b9eb7b9b9f2f23f) reduces the installation time by 3 minutes.
      
      
      

              fkrepins@redhat.com Filip Krepinsky
              openshift-crt-jira-prow OpenShift Prow Bot
              ying zhou ying zhou