OCPBUGS-708: UpdatingKubeStateMetricsFailed before Upgrade

    • Moderate
    • None
    • Approved
    • False

      Description of problem:
      UpdatingKubeStateMetricsFailed, got 1 unavailable replicas. The cluster was healthy before the upgrade, but the monitoring components developed a problem after some data preparation cases were run.

      testing profile:
      07_aarch64_UPI on Baremetal-packet & OVN

      How reproducible:
      Sometimes

      Steps to Reproduce:
      Trigger the cluster upgrade Jenkins job from 4.11.0-0.nightly-arm64-2022-08-10-161519 to 4.11.0-0.nightly-arm64-2022-08-10-192742

      Actual results:

      [2022-08-11T18:39:03.107Z] oc get clusteroperators: 
      [2022-08-11T18:39:03.107Z]  NAME                                       VERSION                                    AVAILABLE   PROGRESSING   DEGRADED   SINCE   MESSAGE
      [2022-08-11T18:39:03.107Z] authentication                             4.11.0-0.nightly-arm64-2022-08-10-161519   True        False         False      91m     
      [2022-08-11T18:39:03.107Z] baremetal                                  4.11.0-0.nightly-arm64-2022-08-10-161519   True        False         False      115m    
      [2022-08-11T18:39:03.107Z] cloud-controller-manager                   4.11.0-0.nightly-arm64-2022-08-10-161519   True        False         False      120m    
      [2022-08-11T18:39:03.107Z] cloud-credential                           4.11.0-0.nightly-arm64-2022-08-10-161519   True        False         False      124m    
      [2022-08-11T18:39:03.107Z] cluster-autoscaler                         4.11.0-0.nightly-arm64-2022-08-10-161519   True        False         False      114m    
      [2022-08-11T18:39:03.107Z] config-operator                            4.11.0-0.nightly-arm64-2022-08-10-161519   True        False         False      116m    
      [2022-08-11T18:39:03.107Z] console                                    4.11.0-0.nightly-arm64-2022-08-10-161519   True        False         False      96m     
      [2022-08-11T18:39:03.107Z] csi-snapshot-controller                    4.11.0-0.nightly-arm64-2022-08-10-161519   True        False         False      115m    
      [2022-08-11T18:39:03.107Z] dns                                        4.11.0-0.nightly-arm64-2022-08-10-161519   True        False         False      114m    
      [2022-08-11T18:39:03.107Z] etcd                                       4.11.0-0.nightly-arm64-2022-08-10-161519   True        False         False      113m    
      [2022-08-11T18:39:03.107Z] image-registry                             4.11.0-0.nightly-arm64-2022-08-10-161519   True        False         False      58m     
      [2022-08-11T18:39:03.107Z] ingress                                    4.11.0-0.nightly-arm64-2022-08-10-161519   True        False         False      99m     
      [2022-08-11T18:39:03.107Z] insights                                   4.11.0-0.nightly-arm64-2022-08-10-161519   True        False         False      109m    
      [2022-08-11T18:39:03.107Z] kube-apiserver                             4.11.0-0.nightly-arm64-2022-08-10-161519   True        False         False      110m    
      [2022-08-11T18:39:03.107Z] kube-controller-manager                    4.11.0-0.nightly-arm64-2022-08-10-161519   True        False         False      112m    
      [2022-08-11T18:39:03.107Z] kube-scheduler                             4.11.0-0.nightly-arm64-2022-08-10-161519   True        False         False      112m    
      [2022-08-11T18:39:03.107Z] kube-storage-version-migrator              4.11.0-0.nightly-arm64-2022-08-10-161519   True        False         False      56m     
      [2022-08-11T18:39:03.107Z] machine-api                                4.11.0-0.nightly-arm64-2022-08-10-161519   True        False         False      115m    
      [2022-08-11T18:39:03.108Z] machine-approver                           4.11.0-0.nightly-arm64-2022-08-10-161519   True        False         False      115m    
      [2022-08-11T18:39:03.108Z] machine-config                             4.11.0-0.nightly-arm64-2022-08-10-161519   True        False         False      114m    
      [2022-08-11T18:39:03.108Z] marketplace                                4.11.0-0.nightly-arm64-2022-08-10-161519   True        False         False      115m    
      [2022-08-11T18:39:03.108Z] monitoring                                 4.11.0-0.nightly-arm64-2022-08-10-161519   False       True          True       41m     Rollout of the monitoring stack failed and is degraded. Please investigate the degraded status error.
      [2022-08-11T18:39:03.108Z] network                                    4.11.0-0.nightly-arm64-2022-08-10-161519   True        False         False      115m    
      [2022-08-11T18:39:03.108Z] node-tuning                                4.11.0-0.nightly-arm64-2022-08-10-161519   True        False         False      115m    
      [2022-08-11T18:39:03.108Z] openshift-apiserver                        4.11.0-0.nightly-arm64-2022-08-10-161519   True        False         False      105m    
      [2022-08-11T18:39:03.108Z] openshift-controller-manager               4.11.0-0.nightly-arm64-2022-08-10-161519   True        False         False      111m    
      [2022-08-11T18:39:03.108Z] openshift-samples                          4.11.0-0.nightly-arm64-2022-08-10-161519   True        False         False      107m    
      [2022-08-11T18:39:03.108Z] operator-lifecycle-manager                 4.11.0-0.nightly-arm64-2022-08-10-161519   True        False         False      115m    
      [2022-08-11T18:39:03.108Z] operator-lifecycle-manager-catalog         4.11.0-0.nightly-arm64-2022-08-10-161519   True        False         False      115m    
      [2022-08-11T18:39:03.108Z] operator-lifecycle-manager-packageserver   4.11.0-0.nightly-arm64-2022-08-10-161519   True        False         False      59m     
      [2022-08-11T18:39:03.108Z] service-ca                                 4.11.0-0.nightly-arm64-2022-08-10-161519   True        False         False      116m    
      [2022-08-11T18:39:03.108Z] storage                                    4.11.0-0.nightly-arm64-2022-08-10-161519   True        False         False      116m
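
With this many rows it is easy to miss the one operator that is unhealthy. The following is a minimal sketch, not part of the original report, that filters console output of this shape down to the operators that are not Available or are Degraded. The function name `unhealthy_operators` and the shortened `sample` text are illustrative assumptions; the sketch only assumes the column order shown in the header above and an optional bracketed Jenkins timestamp prefix.

```python
import re

# Optional Jenkins console timestamp prefix, e.g. "[2022-08-11T18:39:03.107Z] ".
TIMESTAMP_PREFIX = re.compile(r"^\s*\[[^\]]+\]\s*")


def unhealthy_operators(output):
    """Return (name, available, progressing, degraded, message) per unhealthy operator."""
    unhealthy = []
    for raw in output.splitlines():
        line = TIMESTAMP_PREFIX.sub("", raw).strip()
        # Skip blank lines, the echoed command line and the header row.
        if not line or line.startswith("NAME") or line.startswith("oc get"):
            continue
        # Columns: NAME VERSION AVAILABLE PROGRESSING DEGRADED SINCE [MESSAGE...]
        fields = line.split(None, 6)
        if len(fields) < 6:
            continue
        name, _version, available, progressing, degraded = fields[:5]
        message = fields[6] if len(fields) > 6 else ""
        if available != "True" or degraded != "False":
            unhealthy.append((name, available, progressing, degraded, message))
    return unhealthy


# Shortened sample in the same shape as the output above (hypothetical excerpt).
sample = """\
[2022-08-11T18:39:03.107Z] oc get clusteroperators:
[2022-08-11T18:39:03.107Z]  NAME   VERSION   AVAILABLE   PROGRESSING   DEGRADED   SINCE   MESSAGE
[2022-08-11T18:39:03.108Z] marketplace   4.11.0-0.nightly-arm64-2022-08-10-161519   True   False   False   115m
[2022-08-11T18:39:03.108Z] monitoring   4.11.0-0.nightly-arm64-2022-08-10-161519   False   True   True   41m   Rollout of the monitoring stack failed and is degraded. Please investigate the degraded status error.
"""

for name, available, progressing, degraded, message in unhealthy_operators(sample):
    print(f"{name}: Available={available} Progressing={progressing} Degraded={degraded} {message}")
```

Run against the full output above, a filter like this would be expected to leave only the monitoring row, since it is the only operator with AVAILABLE=False and DEGRADED=True.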
      

      Logs in the must-gather data:

      Error: container create failed: time="2022-08-11T17:45:45Z" level=error msg="runc create failed: unable to start container process: unable to init seccomp: error loading seccomp filter into kernel: error loading seccomp filter: errno 524"
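
For triage, it can help to name the numeric code in this message. The sketch below is an illustration and not part of the original report: it pulls the errno out of a log line and maps it to a symbolic name. It relies on 524 being ENOTSUPP in the Linux kernel's internal error list (`include/linux/errno.h`), a code that is not part of the userspace errno set, which would explain why only the bare number is printed. The helper name `describe_errno` and the lookup table are assumptions made for this example, and the sketch does not identify why the seccomp filter failed to load.

```python
import errno
import re

# Kernel-internal error numbers that Python's errno module does not list.
# 524 is ENOTSUPP in the kernel's include/linux/errno.h.
KERNEL_INTERNAL_ERRNOS = {524: "ENOTSUPP"}

ERRNO_PATTERN = re.compile(r"errno (\d+)")


def describe_errno(log_line):
    """Return (number, symbolic name) for the first 'errno N' in the line, or None."""
    match = ERRNO_PATTERN.search(log_line)
    if match is None:
        return None
    number = int(match.group(1))
    # Check the kernel-internal table first, then fall back to the userspace errno set.
    name = KERNEL_INTERNAL_ERRNOS.get(number) or errno.errorcode.get(number, "unknown")
    return number, name


log_line = (
    'Error: container create failed: time="2022-08-11T17:45:45Z" level=error '
    'msg="runc create failed: unable to start container process: unable to init seccomp: '
    'error loading seccomp filter into kernel: error loading seccomp filter: errno 524"'
)

print(describe_errno(log_line))
```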

      Expected results:
      The monitoring stack upgrade succeeds.

      Additional info:
      must-gather

              spasquie@redhat.com Simon Pasquier
              tagao@redhat.com Tai Gao