  1. OpenShift Bugs
  2. OCPBUGS-39558

"Cluster operator X is updating versions" is not a reason for Failing=True condition


    • No
    • 1
    • OTA 263
    • 1
    • False
    • Previously, the Cluster Version Operator (CVO) did not filter internal errors that were propagated to the ClusterVersion Failing condition message. As a result, errors that did not negatively impact the update were shown in the ClusterVersion Failing condition message. With this release, the errors that are propagated to the ClusterVersion Failing condition message are filtered.
      ====
      *Cause*: The Cluster Version Operator did not filter out internal errors propagated to the ClusterVersion Failing condition message.
      *Consequence*: Errors that did not negatively impact the update were shown in the message of the ClusterVersion Failing condition.
      *Fix*: The errors propagated to the Failing condition message are now filtered.
      *Result*: The bug no longer occurs.
    • Bug Fix
    • Done
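
The filtering described in the release note above can be illustrated with a minimal sketch. This is not the CVO's actual implementation (the CVO is written in Go and classifies errors internally); the regular expression and the function names `filter_failing_errors` and `failing_message` are hypothetical, chosen only to show the idea of dropping "is updating versions" entries before the Failing message is built.

```python
import re

# Hypothetical pattern for the transient "operator is still updating" message.
# Assumption for illustration only: the real CVO does not match on message text like this.
UPDATING_VERSIONS = re.compile(r"^Cluster operator \S+ is updating versions$")


def filter_failing_errors(errors):
    """Drop messages that only report an operator that is still updating."""
    return [e for e in errors if not UPDATING_VERSIONS.match(e)]


def failing_message(errors):
    """Build a Failing condition message from the errors that remain after filtering."""
    kept = filter_failing_errors(errors)
    if not kept:
        return None  # nothing left that would justify Failing=True
    if len(kept) == 1:
        return kept[0]
    bullets = "\n".join(f"* {e}" for e in kept)
    return "Multiple errors are preventing progress:\n" + bullets
```

With the two errors from the report, only the customresourcedefinition error would survive, so the message would no longer carry the "Multiple errors" prefix.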

      This is a clone of issue OCPBUGS-15200. The following is the description of the original issue:

      Description of problem:

      During the build02 update from 4.14.0-ec.1 to 4.14.0-ec.2, I noticed the following:

      
      $ b02 get clusterversion version -o json | jq '.status.conditions[] | select (.type=="Failing")'
      {
        "lastTransitionTime": "2023-06-20T13:40:12Z",
        "message": "Multiple errors are preventing progress:\n* Cluster operator authentication is updating versions\n* Could not update customresourcedefinition \"alertingrules.monitoring.openshift.io\" (512 of 993): the object is invalid, possibly due to local cluster configuration",
        "reason": "MultipleErrors",
        "status": "True",
        "type": "Failing"
      }
      
      

      There is a valid error (the Could not update customresourcedefinition... one), but the message is cluttered by the "Cluster operator authentication is updating versions" entry, which in my opinion is not a legitimate reason for a Failing=True condition and should not be there. Before I captured this output, I saw the message list three operators instead of just one.
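
To make the clutter easier to see, the condition message can be split into its individual entries. The following is a small sketch that assumes the message uses the "\n* " bullet layout shown in the output above; the helper name `split_failing_message` is hypothetical.

```python
import json


def split_failing_message(condition):
    """Return the individual errors listed in a Failing condition message."""
    lines = condition["message"].split("\n")
    bullets = [line[2:] for line in lines if line.startswith("* ")]
    # A single-error message has no bullet list, so fall back to the whole message.
    return bullets or [condition["message"]]


# The Failing condition captured in the report, kept as raw JSON text.
condition = json.loads(r'''
{
  "lastTransitionTime": "2023-06-20T13:40:12Z",
  "message": "Multiple errors are preventing progress:\n* Cluster operator authentication is updating versions\n* Could not update customresourcedefinition \"alertingrules.monitoring.openshift.io\" (512 of 993): the object is invalid, possibly due to local cluster configuration",
  "reason": "MultipleErrors",
  "status": "True",
  "type": "Failing"
}
''')

entries = split_failing_message(condition)
# Entries that only say an operator is still updating are the noise discussed above.
noise = [e for e in entries if e.endswith("is updating versions")]
for entry in entries:
    label = "noise" if entry in noise else "error"
    print(f"[{label}] {entry}")
```

Here one of the two entries is labeled as noise, which matches the observation that only the customresourcedefinition entry is a real failure.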

      Version-Release number of selected component (if applicable):

      4.14.0-ec.2
      

      How reproducible:

      No idea
      

              dhurta@redhat.com David Hurta
              openshift-crt-jira-prow OpenShift Prow Bot
              Dinesh Kumar S Dinesh Kumar S