OpenShift Bugs / OCPBUGS-12875

SNO OCP upgrade from 4.12 to 4.13 fails because the node-tuning operator is not available: tuned pod stuck in Terminating


Details

    • Important
    • No
    • Proposed
    • False
    • 3/22: Green as a tentative fix is posted
      3/14: telco reviewed

    Description

      This is a clone of issue OCPBUGS-11920. The following is the description of the original issue:

      This is a clone of issue OCPBUGS-11031. The following is the description of the original issue:

      This is a clone of issue OCPBUGS-10217. The following is the description of the original issue:

      Description of problem:

      The OCP upgrade from 4.12.7 to a 4.13 nightly failed because the node-tuning operator is not available.
      
      
      $ oc get clusterversion
      NAME      VERSION   AVAILABLE   PROGRESSING   SINCE   STATUS
      version   4.12.7    True        True          155m    Unable to apply 4.13.0-0.nightly-2023-03-13-122248: the cluster operator node-tuning is not available
      
      
      $ oc get co 
      ...
      node-tuning                                4.13.0-0.nightly-2023-03-13-122248   False       True          False      126m    DaemonSet "tuned" has no available Pod(s)
      
      
      The Terminating tuned pod reports the following events:
      
      $ oc get pods -n openshift-cluster-node-tuning-operator
      NAME                                            READY   STATUS        RESTARTS   AGE
      cluster-node-tuning-operator-5c458cc9d5-fp9zn   1/1     Running       0          136m
      tuned-94tcq                                     0/1     Terminating   2          4h43m
      
      $ oc describe pods -n openshift-cluster-node-tuning-operator tuned-94tcq
      ...
        Normal   Started            3h6m                 kubelet            Started container tuned
        Normal   Killing            126m                 kubelet            Stopping container tuned
        Warning  FailedPreStopHook  126m                 kubelet            Exec lifecycle hook ([/var/lib/tuned/bin/run stop]) for Container "tuned" in Pod "tuned-94tcq_openshift-cluster-node-tuning-operator(e12b341f-0683-4ccf-870c-bb3d24de073f)" failed - error: command '/var/lib/tuned/bin/run stop' exited with 1: openshift-tuned stop response: 
      , message: "openshift-tuned stop response: \n"
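
The two symptoms shown above (a tuned pod in Terminating and a FailedPreStopHook warning) can be checked with a small filter over saved command output. This is only a sketch: the function names `terminating_pods` and `failed_prestop_count` are hypothetical helpers, not part of `oc`, and the filter assumes the default `oc get pods` column layout (NAME, READY, STATUS, ...).

```shell
#!/usr/bin/env bash
# Hypothetical helpers for illustration; they read text from stdin so they
# can be tried on saved output without cluster access.

# Print the names of pods whose STATUS column is "Terminating".
# Assumes the default `oc get pods` table: NAME READY STATUS RESTARTS AGE.
terminating_pods() {
  # NR > 1 skips the header row; column 3 is STATUS, column 1 is NAME.
  awk 'NR > 1 && $3 == "Terminating" { print $1 }'
}

# Count lines mentioning FailedPreStopHook in `oc describe pods` output.
failed_prestop_count() {
  # grep -c prints 0 and exits non-zero when nothing matches, so mask the status.
  grep -c 'FailedPreStopHook' || true
}
```

Possible usage, with the namespace taken from this report: `oc get pods -n openshift-cluster-node-tuning-operator | terminating_pods`, and `oc describe pods -n openshift-cluster-node-tuning-operator tuned-94tcq | failed_prestop_count`.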
      

      Version-Release number of selected component (if applicable):

      4.13

      How reproducible:

      Intermittent

      Steps to Reproduce:

      1. Upgrade SNO from OCP 4.12.7 to 4.13 nightly
      

      Actual results:

      The upgrade is stuck waiting for the node-tuning operator to become available.
      
      $ oc get clusterversion
      NAME      VERSION   AVAILABLE   PROGRESSING   SINCE   STATUS
      version   4.12.7    True        True          155m    Unable to apply 4.13.0-0.nightly-2023-03-13-122248: the cluster operator node-tuning is not available
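
The STATUS text names the operator that blocks the upgrade. As a rough sketch, it can be pulled out of the `oc get clusterversion` output with `sed`. The function name `blocking_operator` is hypothetical, and the pattern assumes the exact wording seen in this report ("the cluster operator <name> is not available"); other failure modes may be worded differently and would print nothing.

```shell
#!/usr/bin/env bash
# Hypothetical helper for illustration: reads `oc get clusterversion` output
# from stdin and prints the operator named in the "not available" message.
blocking_operator() {
  # -n with the p flag prints only lines where the substitution matched.
  sed -n 's/.*the cluster operator \([^ ]*\) is not available.*/\1/p'
}
```

Possible usage: `oc get clusterversion | blocking_operator`, followed by `oc get co` on the printed name to see the operator's own status message.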

      Expected results:

      The upgrade succeeds.

      Additional info:

      Workaround: manually reboot the DUT (device under test).

       


            People

              jmencak Jiri Mencak
              openshift-crt-jira-prow OpenShift Prow Bot
              Liquan Cui Liquan Cui