OpenShift Bugs / OCPBUGS-5241

OCP-42855 failure with hostedcluster conditions Degraded is True

      Description of problem:

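      While debugging the OCP-42855 test failure, the HostedCluster was found to report the Degraded condition as True. The certified-operators-catalog and redhat-marketplace-catalog pods in the hosted control plane namespace are in CrashLoopBackOff, which leaves their deployments with unavailable replicas.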

      Version-Release number of selected component (if applicable):

      quay.io/openshift-release-dev/ocp-release:4.12.0-rc.6-x86_64

      How reproducible:

      Follow the OCP-42855 test steps.

      Steps to Reproduce:

      1. Create a basic hosted cluster using the hypershift tool.
      2. Check the HostedCluster conditions.
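
Step 2 amounts to reading `status.conditions` from the HostedCluster object. The following is a minimal sketch, assuming the JSON comes from something like `oc get hostedcluster mihuanghy -n clusters -o json` (name and namespace as they appear later in this report). The helper name `get_condition` and the trimmed sample object are hypothetical and only illustrate the lookup.

```python
import json


def get_condition(hosted_cluster: dict, cond_type: str):
    """Return the condition dict of the given type, or None if it is absent."""
    for cond in hosted_cluster.get("status", {}).get("conditions", []):
        if cond.get("type") == cond_type:
            return cond
    return None


# Trimmed, illustrative sample: only the fields this helper reads.
sample = json.loads("""
{"status": {"conditions": [
  {"type": "Available", "status": "True"},
  {"type": "Degraded", "status": "True", "reason": "UnavailableReplicas"}
]}}
""")

degraded = get_condition(sample, "Degraded")
print(degraded["status"], degraded["reason"])
```

The test expects the `status` of the `Degraded` entry to be `False`.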
      

      Actual results:

      [hmx@ovpn-12-45 hypershift]$ oc get pods -n clusters-mihuanghy
      NAME                                                  READY   STATUS             RESTARTS         AGE
      aws-ebs-csi-driver-controller-9c46694f-mqrlc          7/7     Running            0                55m
      aws-ebs-csi-driver-operator-5d7867bc9f-hqzd5          1/1     Running            0                55m
      capi-provider-6df855dbb5-tcmvq                        2/2     Running            0                58m
      catalog-operator-7544b8d6d8-dk4hh                     2/2     Running            0                57m
      certified-operators-catalog-7f8f6598b5-2blv4          0/1     CrashLoopBackOff   15 (4m20s ago)   57m
      cloud-network-config-controller-545fcfc797-mgszj      3/3     Running            0                55m
      cluster-api-54c7f7c477-kgvzn                          1/1     Running            0                58m
      cluster-autoscaler-658756f99-vr2hk                    1/1     Running            0                58m
      cluster-image-registry-operator-84d84dbc9f-zpcsq      3/3     Running            0                57m
      cluster-network-operator-9b6985cc8-sd7d7              1/1     Running            0                57m
      cluster-node-tuning-operator-65c8f6fbb9-xzpws         1/1     Running            0                57m
      cluster-policy-controller-b5c76cf58-b4rth             1/1     Running            0                57m
      cluster-storage-operator-7474f76c99-9chl7             1/1     Running            0                57m
      cluster-version-operator-646d97ccc9-l72m5             1/1     Running            0                57m
      community-operators-catalog-774fdb48fc-z6s4d          1/1     Running            0                57m
      control-plane-operator-5bc8c4c996-4nz8c               2/2     Running            0                58m
      csi-snapshot-controller-5b7d6bb685-vf8rf              1/1     Running            0                55m
      csi-snapshot-controller-operator-6f74db85c6-89bts     1/1     Running            0                57m
      csi-snapshot-webhook-57c5bd7f85-lqnwf                 1/1     Running            0                55m
      dns-operator-767c5bbdd8-rb7fl                         1/1     Running            0                57m
      etcd-0                                                2/2     Running            0                58m
      hosted-cluster-config-operator-88b9d49b7-2gvbt        1/1     Running            0                57m
      ignition-server-949d9fd8c-cgtxb                       1/1     Running            0                58m
      ingress-operator-5c6f5d4f48-gh7fl                     3/3     Running            0                57m
      konnectivity-agent-79c5ff9585-pqctc                   1/1     Running            0                58m
      konnectivity-server-65956d468c-lpwfv                  1/1     Running            0                58m
      kube-apiserver-d9f887c4b-xwdcx                        5/5     Running            0                58m
      kube-controller-manager-64b6f757f9-6qszq              2/2     Running            0                52m
      kube-scheduler-58ffcdf789-fch2n                       1/1     Running            0                57m
      machine-approver-559d66d4d6-2v64w                     1/1     Running            0                58m
      multus-admission-controller-8695985fbc-hjtqb          2/2     Running            0                55m
      oauth-openshift-6b9695fc7f-pf4j6                      2/2     Running            0                55m
      olm-operator-bf694b84-gvz6x                           2/2     Running            0                57m
      openshift-apiserver-55c69bc497-x8bft                  2/2     Running            0                52m
      openshift-controller-manager-8597c66d58-jb7w2         1/1     Running            0                57m
      openshift-oauth-apiserver-674cd6df6d-ckg55            1/1     Running            0                57m
      openshift-route-controller-manager-76d78f897c-9mfmj   1/1     Running            0                57m
      ovnkube-master-0                                      7/7     Running            0                55m
      packageserver-7988d8ddfc-wnh6l                        2/2     Running            0                57m
      redhat-marketplace-catalog-77547cc685-hnh65           0/1     CrashLoopBackOff   15 (4m15s ago)   57m
      redhat-operators-catalog-7784d45f54-58lgg             1/1     Running            0                57m
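
With more than forty pods in the namespace, the two failing ones are easy to miss. The sketch below filters the plain-text output of `oc get pods` for pods that are not fully ready or not `Running`. It assumes the default column layout shown above (NAME, READY, STATUS as the first three columns); `unhealthy_pods` is a hypothetical helper name, and the sample is a three-row excerpt of the listing.

```python
def unhealthy_pods(table: str):
    """Return (name, ready, status) for pods that are not fully ready or not Running."""
    bad = []
    for line in table.strip().splitlines():
        fields = line.split()
        if len(fields) < 3 or fields[0] == "NAME":
            continue  # skip blank lines and the header row
        name, ready, status = fields[0], fields[1], fields[2]
        ready_now, _, ready_total = ready.partition("/")
        if status != "Running" or ready_now != ready_total:
            bad.append((name, ready, status))
    return bad


# Excerpt of the listing above.
sample = """
NAME                                                  READY   STATUS             RESTARTS         AGE
catalog-operator-7544b8d6d8-dk4hh                     2/2     Running            0                57m
certified-operators-catalog-7f8f6598b5-2blv4          0/1     CrashLoopBackOff   15 (4m20s ago)   57m
redhat-marketplace-catalog-77547cc685-hnh65           0/1     CrashLoopBackOff   15 (4m15s ago)   57m
"""

for pod in unhealthy_pods(sample):
    print(pod)
```

Note that this simple filter would also flag pods in a `Completed` state, which is not necessarily a failure.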
      
      
       
      {
                      "lastTransitionTime": "2022-12-31T18:45:28Z",
                      "message": "[certified-operators-catalog deployment has 1 unavailable replicas, redhat-marketplace-catalog deployment has 1 unavailable replicas]",
                      "observedGeneration": 3,
                      "reason": "UnavailableReplicas",
                      "status": "True",
                      "type": "Degraded"
                  },
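
The `message` field names the deployments behind the Degraded condition. A small sketch for pulling those names out is shown below. It assumes the message keeps the wording seen in this report ("<name> deployment has N unavailable replicas"), which is not a documented format, and `degraded_deployments` is a hypothetical helper name.

```python
import re

# Pattern inferred from the message in this report; treat it as an assumption.
PATTERN = re.compile(r"([a-z0-9][a-z0-9.-]*) deployment has \d+ unavailable replicas")


def degraded_deployments(message: str):
    """Return the deployment names mentioned in a Degraded condition message."""
    return PATTERN.findall(message)


message = (
    "[certified-operators-catalog deployment has 1 unavailable replicas, "
    "redhat-marketplace-catalog deployment has 1 unavailable replicas]"
)
print(degraded_deployments(message))
```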
      

      Expected results:

      The HostedCluster Degraded condition is False.

      Additional info:

      $ oc describe pod certified-operators-catalog-7f8f6598b5-2blv4 -n clusters-mihuanghy
      Name:                 certified-operators-catalog-7f8f6598b5-2blv4
      Namespace:            clusters-mihuanghy
      Priority:             100000000
      Priority Class Name:  hypershift-control-plane
      Node:                 ip-10-0-202-149.us-east-2.compute.internal/10.0.202.149
      Start Time:           Sun, 01 Jan 2023 02:47:03 +0800
      Labels:               app=certified-operators-catalog
                            hypershift.openshift.io/control-plane-component=certified-operators-catalog
                            hypershift.openshift.io/hosted-control-plane=clusters-mihuanghy
                            olm.catalogSource=certified-operators
                            pod-template-hash=7f8f6598b5
      Annotations:          hypershift.openshift.io/release-image: quay.io/openshift-release-dev/ocp-release:4.12.0-rc.6-x86_64
                            k8s.v1.cni.cncf.io/network-status:
                              [{
                                  "name": "openshift-sdn",
                                  "interface": "eth0",
                                  "ips": [
                                      "10.131.0.38"
                                  ],
                                  "default": true,
                                  "dns": {}
                              }]
                            k8s.v1.cni.cncf.io/networks-status:
                              [{
                                  "name": "openshift-sdn",
                                  "interface": "eth0",
                                  "ips": [
                                      "10.131.0.38"
                                  ],
                                  "default": true,
                                  "dns": {}
                              }]
                            openshift.io/scc: restricted-v2
                            seccomp.security.alpha.kubernetes.io/pod: runtime/default
      Status:               Running
      IP:                   10.131.0.38
      IPs:
        IP:           10.131.0.38
      Controlled By:  ReplicaSet/certified-operators-catalog-7f8f6598b5
      Containers:
        registry:
          Container ID:   cri-o://f32b8d4c31b729c1b7deef0da622ddd661d840428aa4847968b1b2b3bf76b6cf
          Image:          registry.redhat.io/redhat/certified-operator-index:v4.11
          Image ID:       registry.redhat.io/redhat/certified-operator-index@sha256:93f667597eee33b9bdbc9a61af60978b414b6f6df8e7c5f496c4298c1dfe9b62
          Port:           50051/TCP
          Host Port:      0/TCP
          State:          Waiting
            Reason:       CrashLoopBackOff
          Last State:     Terminated
            Reason:       Error
            Exit Code:    1
            Started:      Sun, 01 Jan 2023 03:39:44 +0800
            Finished:     Sun, 01 Jan 2023 03:39:44 +0800
          Ready:          False
          Restart Count:  15
          Requests:
            cpu:        10m
            memory:     160Mi
          Liveness:     exec [grpc_health_probe -addr=:50051] delay=10s timeout=1s period=10s #success=1 #failure=3
          Readiness:    exec [grpc_health_probe -addr=:50051] delay=5s timeout=5s period=10s #success=1 #failure=3
          Startup:      exec [grpc_health_probe -addr=:50051] delay=0s timeout=1s period=10s #success=1 #failure=15
          Environment:  <none>
          Mounts:       <none>
      Conditions:
        Type              Status
        Initialized       True 
        Ready             False 
        ContainersReady   False 
        PodScheduled      True 
      Volumes:            <none>
      QoS Class:          Burstable
      Node-Selectors:     <none>
      Tolerations:        hypershift.openshift.io/cluster=clusters-mihuanghy:NoSchedule
                          hypershift.openshift.io/control-plane=true:NoSchedule
                          node.kubernetes.io/memory-pressure:NoSchedule op=Exists
                          node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
                          node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
      Events:
        Type     Reason          Age                    From               Message
        ----     ------          ----                   ----               -------
        Normal   Scheduled       54m                    default-scheduler  Successfully assigned clusters-mihuanghy/certified-operators-catalog-7f8f6598b5-2blv4 to ip-10-0-202-149.us-east-2.compute.internal
        Normal   AddedInterface  53m                    multus             Add eth0 [10.131.0.38/23] from openshift-sdn
        Normal   Pulling         53m                    kubelet            Pulling image "registry.redhat.io/redhat/certified-operator-index:v4.11"
        Normal   Pulled          53m                    kubelet            Successfully pulled image "registry.redhat.io/redhat/certified-operator-index:v4.11" in 40.628843349s
        Normal   Pulled          52m (x3 over 53m)      kubelet            Container image "registry.redhat.io/redhat/certified-operator-index:v4.11" already present on machine
        Normal   Created         52m (x4 over 53m)      kubelet            Created container registry
        Normal   Started         52m (x4 over 53m)      kubelet            Started container registry
        Warning  BackOff         3m59s (x256 over 53m)  kubelet            Back-off restarting failed container
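
In the `Last State` section above, `Started` and `Finished` carry the same timestamp, so the `registry` container ran for under a second before exiting with code 1. That suggests the process fails at startup rather than being stopped by a probe, although this report contains no container logs to confirm it. The sketch below computes the run time from the two timestamps; `run_seconds` is a hypothetical helper, and the format string is an assumption based on how `oc describe` prints times in this report.

```python
from datetime import datetime

# Timestamp layout as printed by `oc describe` in this report (assumed, English locale).
FMT = "%a, %d %b %Y %H:%M:%S %z"


def run_seconds(started: str, finished: str) -> float:
    """Return how long a container ran, in seconds, from its Started/Finished strings."""
    delta = datetime.strptime(finished, FMT) - datetime.strptime(started, FMT)
    return delta.total_seconds()


# Values copied from the Last State section of the pod description above.
print(run_seconds("Sun, 01 Jan 2023 03:39:44 +0800", "Sun, 01 Jan 2023 03:39:44 +0800"))
```

The output of the previous container instance, for example from `oc logs certified-operators-catalog-7f8f6598b5-2blv4 -n clusters-mihuanghy -c registry --previous`, would be the natural next piece of evidence; it is not part of this report.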
      
      $ oc describe pod redhat-marketplace-catalog-77547cc685-hnh65 -n clusters-mihuanghy
      Name:                 redhat-marketplace-catalog-77547cc685-hnh65
      Namespace:            clusters-mihuanghy
      Priority:             100000000
      Priority Class Name:  hypershift-control-plane
      Node:                 ip-10-0-202-149.us-east-2.compute.internal/10.0.202.149
      Start Time:           Sun, 01 Jan 2023 02:47:03 +0800
      Labels:               app=redhat-marketplace-catalog
                            hypershift.openshift.io/control-plane-component=redhat-marketplace-catalog
                            hypershift.openshift.io/hosted-control-plane=clusters-mihuanghy
                            olm.catalogSource=redhat-marketplace
                            pod-template-hash=77547cc685
      Annotations:          hypershift.openshift.io/release-image: quay.io/openshift-release-dev/ocp-release:4.12.0-rc.6-x86_64
                            k8s.v1.cni.cncf.io/network-status:
                              [{
                                  "name": "openshift-sdn",
                                  "interface": "eth0",
                                  "ips": [
                                      "10.131.0.40"
                                  ],
                                  "default": true,
                                  "dns": {}
                              }]
                            k8s.v1.cni.cncf.io/networks-status:
                              [{
                                  "name": "openshift-sdn",
                                  "interface": "eth0",
                                  "ips": [
                                      "10.131.0.40"
                                  ],
                                  "default": true,
                                  "dns": {}
                              }]
                            openshift.io/scc: restricted-v2
                            seccomp.security.alpha.kubernetes.io/pod: runtime/default
      Status:               Running
      IP:                   10.131.0.40
      IPs:
        IP:           10.131.0.40
      Controlled By:  ReplicaSet/redhat-marketplace-catalog-77547cc685
      Containers:
        registry:
          Container ID:   cri-o://7afba8993dac8f1c07a2946d8b791def3b0c80ce62d5d6160770a5a9990bf922
          Image:          registry.redhat.io/redhat/redhat-marketplace-index:v4.11
          Image ID:       registry.redhat.io/redhat/redhat-marketplace-index@sha256:074498ac11b5691ba8975e8f63fa04407ce11bb035dde0ced2f439d7a4640510
          Port:           50051/TCP
          Host Port:      0/TCP
          State:          Waiting
            Reason:       CrashLoopBackOff
          Last State:     Terminated
            Reason:       Error
            Exit Code:    1
            Started:      Sun, 01 Jan 2023 03:39:49 +0800
            Finished:     Sun, 01 Jan 2023 03:39:49 +0800
          Ready:          False
          Restart Count:  15
          Requests:
            cpu:        10m
            memory:     340Mi
          Liveness:     exec [grpc_health_probe -addr=:50051] delay=10s timeout=1s period=10s #success=1 #failure=3
          Readiness:    exec [grpc_health_probe -addr=:50051] delay=5s timeout=5s period=10s #success=1 #failure=3
          Startup:      exec [grpc_health_probe -addr=:50051] delay=0s timeout=1s period=10s #success=1 #failure=15
          Environment:  <none>
          Mounts:       <none>
      Conditions:
        Type              Status
        Initialized       True 
        Ready             False 
        ContainersReady   False 
        PodScheduled      True 
      Volumes:            <none>
      QoS Class:          Burstable
      Node-Selectors:     <none>
      Tolerations:        hypershift.openshift.io/cluster=clusters-mihuanghy:NoSchedule
                          hypershift.openshift.io/control-plane=true:NoSchedule
                          node.kubernetes.io/memory-pressure:NoSchedule op=Exists
                          node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
                          node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
      Events:
        Type     Reason          Age                  From               Message
        ----     ------          ----                 ----               -------
        Normal   Scheduled       55m                  default-scheduler  Successfully assigned clusters-mihuanghy/redhat-marketplace-catalog-77547cc685-hnh65 to ip-10-0-202-149.us-east-2.compute.internal
        Normal   AddedInterface  55m                  multus             Add eth0 [10.131.0.40/23] from openshift-sdn
        Normal   Pulling         55m                  kubelet            Pulling image "registry.redhat.io/redhat/redhat-marketplace-index:v4.11"
        Normal   Pulled          54m                  kubelet            Successfully pulled image "registry.redhat.io/redhat/redhat-marketplace-index:v4.11" in 40.862526792s
        Normal   Pulled          53m (x3 over 54m)    kubelet            Container image "registry.redhat.io/redhat/redhat-marketplace-index:v4.11" already present on machine
        Normal   Created         53m (x4 over 54m)    kubelet            Created container registry
        Normal   Started         53m (x4 over 54m)    kubelet            Started container registry
        Warning  BackOff         21s (x276 over 54m)  kubelet            Back-off restarting failed container
      
      $ oc describe deployment redhat-marketplace-catalog -n clusters-mihuanghy
      Name:                   redhat-marketplace-catalog
      Namespace:              clusters-mihuanghy
      CreationTimestamp:      Sun, 01 Jan 2023 02:47:03 +0800
      Labels:                 hypershift.openshift.io/managed-by=control-plane-operator
      Annotations:            deployment.kubernetes.io/revision: 1
      Selector:               olm.catalogSource=redhat-marketplace
      Replicas:               1 desired | 1 updated | 1 total | 0 available | 1 unavailable
      StrategyType:           RollingUpdate
      MinReadySeconds:        0
      RollingUpdateStrategy:  25% max unavailable, 25% max surge
      Pod Template:
        Labels:       app=redhat-marketplace-catalog
                      hypershift.openshift.io/control-plane-component=redhat-marketplace-catalog
                      hypershift.openshift.io/hosted-control-plane=clusters-mihuanghy
                      olm.catalogSource=redhat-marketplace
        Annotations:  hypershift.openshift.io/release-image: quay.io/openshift-release-dev/ocp-release:4.12.0-rc.6-x86_64
        Containers:
         registry:
          Image:      registry.redhat.io/redhat/redhat-marketplace-index:v4.11
          Port:       50051/TCP
          Host Port:  0/TCP
          Requests:
            cpu:              10m
            memory:           340Mi
          Liveness:           exec [grpc_health_probe -addr=:50051] delay=10s timeout=1s period=10s #success=1 #failure=3
          Readiness:          exec [grpc_health_probe -addr=:50051] delay=5s timeout=5s period=10s #success=1 #failure=3
          Startup:            exec [grpc_health_probe -addr=:50051] delay=0s timeout=1s period=10s #success=1 #failure=15
          Environment:        <none>
          Mounts:             <none>
        Volumes:              <none>
        Priority Class Name:  hypershift-control-plane
      Conditions:
        Type           Status  Reason
        ----           ------  ------
        Available      False   MinimumReplicasUnavailable
        Progressing    False   ProgressDeadlineExceeded
      OldReplicaSets:  <none>
      NewReplicaSet:   redhat-marketplace-catalog-77547cc685 (1/1 replicas created)
      Events:
        Type    Reason             Age   From                   Message
        ----    ------             ----  ----                   -------
        Normal  ScalingReplicaSet  22m   deployment-controller  Scaled up replica set redhat-marketplace-catalog-77547cc685 to 1
      
      [hmx@ovpn-12-45 hypershift]$ oc get hostedcluster -A
      NAMESPACE   NAME        VERSION       KUBECONFIG                   PROGRESS    AVAILABLE   PROGRESSING   MESSAGE
      clusters    mihuanghy   4.12.0-rc.6   mihuanghy-admin-kubeconfig   Completed   True        False         The hosted control plane is available
      
      
      $ oc describe deployment certified-operators-catalog -n clusters-mihuanghy
      Name:                   certified-operators-catalog
      Namespace:              clusters-mihuanghy
      CreationTimestamp:      Sun, 01 Jan 2023 02:47:03 +0800
      Labels:                 hypershift.openshift.io/managed-by=control-plane-operator
      Annotations:            deployment.kubernetes.io/revision: 1
      Selector:               olm.catalogSource=certified-operators
      Replicas:               1 desired | 1 updated | 1 total | 0 available | 1 unavailable
      StrategyType:           RollingUpdate
      MinReadySeconds:        0
      RollingUpdateStrategy:  25% max unavailable, 25% max surge
      Pod Template:
        Labels:       app=certified-operators-catalog
                      hypershift.openshift.io/control-plane-component=certified-operators-catalog
                      hypershift.openshift.io/hosted-control-plane=clusters-mihuanghy
                      olm.catalogSource=certified-operators
        Annotations:  hypershift.openshift.io/release-image: quay.io/openshift-release-dev/ocp-release:4.12.0-rc.6-x86_64
        Containers:
         registry:
          Image:      registry.redhat.io/redhat/certified-operator-index:v4.11
          Port:       50051/TCP
          Host Port:  0/TCP
          Requests:
            cpu:              10m
            memory:           160Mi
          Liveness:           exec [grpc_health_probe -addr=:50051] delay=10s timeout=1s period=10s #success=1 #failure=3
          Readiness:          exec [grpc_health_probe -addr=:50051] delay=5s timeout=5s period=10s #success=1 #failure=3
          Startup:            exec [grpc_health_probe -addr=:50051] delay=0s timeout=1s period=10s #success=1 #failure=15
          Environment:        <none>
          Mounts:             <none>
        Volumes:              <none>
        Priority Class Name:  hypershift-control-plane
      Conditions:
        Type           Status  Reason
        ----           ------  ------
        Available      False   MinimumReplicasUnavailable
        Progressing    False   ProgressDeadlineExceeded
      OldReplicaSets:  <none>
      NewReplicaSet:   certified-operators-catalog-7f8f6598b5 (1/1 replicas created)
      Events:
        Type    Reason             Age   From                   Message
        ----    ------             ----  ----                   -------
        Normal  ScalingReplicaSet  21m   deployment-controller  Scaled up replica set certified-operators-catalog-7f8f6598b5 to 1
      

              agarcial@redhat.com Alberto Garcia Lamela
              mihuang@redhat.com Mingxia Huang