MON-3457 (OpenShift Monitoring)

[cpass-stage] Alertmanager and Prometheus pods failed to start after creating monitoringstack

    • Type: Bug
    • Resolution: Done
    • Priority: Blocker
    • None
    • openshift-4.14.z
    • observability-operator
    • None
    • MON Sprint 244

      The pods failed to start because they tried to pull an image from the wrong registry, "registry-proxy.engineering.redhat.com". The tag in rhobs-obo-prometheus-config-reloader:0.66.0-2 should also be replaced with a digest.
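
      To make the tag-versus-digest point concrete, here is a minimal sketch that classifies an image reference. `image_ref_kind` is a hypothetical helper written for this report, not part of `oc` or the operator, and it assumes digests use the usual `@sha256:` form.

```shell
# Hypothetical helper: report whether an image reference is pinned
# by digest (contains "@sha256:") or only by a mutable tag.
image_ref_kind() {
  case "$1" in
    *@sha256:*) echo digest ;;
    *)          echo tag ;;
  esac
}
```

      Run against the reference from this report, `image_ref_kind 'registry-proxy.engineering.redhat.com/rh-osbs/rhobs-obo-prometheus-config-reloader:0.66.0-2'` prints `tag`.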

      1. Create the CatalogSource:
        apiVersion: operators.coreos.com/v1alpha1
        kind: CatalogSource
        metadata:
          name: coo-stage-registry
          namespace: openshift-marketplace
        spec:
          image: brew.registry.redhat.io/rh-osbs/iib-pub-pending:v4.14
          publisher: Openshift QE
          sourceType: grpc
          updateStrategy:
            registryPoll:
              interval: 10m0s 

      2. Install COO on the cluster from the UI and create a monitoringstack with the default configuration.
      3. Check the related pods:

        % oc -n openshift-operators get pod                                                              
        NAME                                                         READY   STATUS                  RESTARTS   AGE
        alertmanager-sample-monitoring-stack-0                       0/2     Init:ImagePullBackOff   0          16m
        alertmanager-sample-monitoring-stack-1                       0/2     Init:ImagePullBackOff   0          16m
        obo-prometheus-operator-7548c8f586-qx9kh                     1/1     Running                 0          20m
        obo-prometheus-operator-admission-webhook-7d7f5d9479-7d7f4   1/1     Running                 0          20m
        obo-prometheus-operator-admission-webhook-7d7f5d9479-qlgff   1/1     Running                 0          20m
        observability-operator-69455f9d6d-26mv9                      1/1     Running                 0          20m
        prometheus-sample-monitoring-stack-0                         0/3     Init:ImagePullBackOff   0          16m
        prometheus-sample-monitoring-stack-1                         0/3     Init:ImagePullBackOff   0          16m
        % oc describe pod alertmanager-sample-monitoring-stack-0
        ----
        Events:
          Type     Reason          Age                    From               Message
          ----     ------          ----                   ----               -------
          Normal   Scheduled       27m                    default-scheduler  Successfully assigned openshift-operators/alertmanager-sample-monitoring-stack-0 to anligcp3-wdmml-worker-c-lbdch.c.openshift-qe.internal
          Normal   AddedInterface  27m                    multus             Add eth0 [10.128.2.35/23] from openshift-sdn
          Normal   Pulling         26m (x4 over 27m)      kubelet            Pulling image "registry-proxy.engineering.redhat.com/rh-osbs/rhobs-obo-prometheus-config-reloader:0.66.0-2"
          Warning  Failed          26m (x4 over 27m)      kubelet            Failed to pull image "registry-proxy.engineering.redhat.com/rh-osbs/rhobs-obo-prometheus-config-reloader:0.66.0-2": rpc error: code = Unknown desc = pinging container registry registry-proxy.engineering.redhat.com: Get "https://registry-proxy.engineering.redhat.com/v2/": dial tcp: lookup registry-proxy.engineering.redhat.com on 169.254.169.254:53: no such host
          Warning  Failed          26m (x4 over 27m)      kubelet            Error: ErrImagePull
          Warning  Failed          26m (x6 over 27m)      kubelet            Error: ImagePullBackOff
          Normal   BackOff         2m46s (x110 over 27m)  kubelet            Back-off pulling image "registry-proxy.engineering.redhat.com/rh-osbs/rhobs-obo-prometheus-config-reloader:0.66.0-2"
        
        % oc describe pod prometheus-sample-monitoring-stack-0| tail -n 10
        Events:
          Type     Reason          Age                    From               Message
          ----     ------          ----                   ----               -------
          Normal   Scheduled       29m                    default-scheduler  Successfully assigned openshift-operators/prometheus-sample-monitoring-stack-0 to anligcp3-wdmml-worker-b-5cbh6.c.openshift-qe.internal
          Normal   AddedInterface  29m                    multus             Add eth0 [10.129.2.33/23] from openshift-sdn
          Warning  Failed          28m (x6 over 29m)      kubelet            Error: ImagePullBackOff
          Normal   Pulling         28m (x4 over 29m)      kubelet            Pulling image "registry-proxy.engineering.redhat.com/rh-osbs/rhobs-obo-prometheus-config-reloader:0.66.0-2"
          Warning  Failed          28m (x4 over 29m)      kubelet            Failed to pull image "registry-proxy.engineering.redhat.com/rh-osbs/rhobs-obo-prometheus-config-reloader:0.66.0-2": rpc error: code = Unknown desc = pinging container registry registry-proxy.engineering.redhat.com: Get "https://registry-proxy.engineering.redhat.com/v2/": dial tcp: lookup registry-proxy.engineering.redhat.com on 169.254.169.254:53: no such host
          Warning  Failed          28m (x4 over 29m)      kubelet            Error: ErrImagePull
          Normal   BackOff         4m29s (x109 over 29m)  kubelet            Back-off pulling image "registry-proxy.engineering.redhat.com/rh-osbs/rhobs-obo-prometheus-config-reloader:0.66.0-2"
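
        Rather than describing each pod, the images referenced in the namespace could be listed in one pass. This is only a sketch: `list_pod_images` and `images_on_registry` are hypothetical helper names, and the jsonpath expression follows the pattern from the Kubernetes documentation for listing container images, assuming `oc` accepts it unchanged.

```shell
# Hypothetical helpers; the only real command used is `oc get pods -o jsonpath`.

# Print every container and init-container image used by pods in
# namespace "$1", one per line, deduplicated.
list_pod_images() {
  oc -n "$1" get pods \
    -o jsonpath="{.items[*].spec['initContainers', 'containers'][*].image}" |
    tr -s '[:space:]' '\n' | sort -u
}

# Read image references on stdin and keep only those whose registry
# host (the part before the first "/") equals "$1".
images_on_registry() {
  awk -v prefix="$1/" 'index($0, prefix) == 1'
}
```

        For example, `list_pod_images openshift-operators | images_on_registry registry-proxy.engineering.redhat.com` would show which images still point at the internal registry.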

              jfajersk@redhat.com Jan Fajerski
              hongyli@redhat.com Hongyan Li