  1. OpenShift Bugs
  2. OCPBUGS-14183

Upgrade from 4.7 to 4.8 caused issue with installing operators


    • Bug
    • Resolution: Done
    • Major
    • None
    • 4.8.z
    • OLM / Registry
    • Critical
    • No
    • OPECO 237
    • 1
    • Rejected
    • False
    • Customer Escalated

      Description of problem:

      
      After the cluster upgrade, the customer cannot install any Red Hat or community operator. The operator always gets stuck in an unknown state. Only the Subscription is created; no InstallPlan (IP), CSV, or job is created.
      
      The catalog-operator pod logs the following errors:
      ~~~
      2023-05-28T13:04:22.954397188Z time="2023-05-28T13:04:22Z" level=warning msg="error encountered while listing bundles: rpc error: code = DeadlineExceeded desc = context deadline exceeded" catalog="{community-operators openshift-marketplace}"
      2023-05-28T13:04:22.954397188Z time="2023-05-28T13:04:22Z" level=warning msg="error encountered while listing bundles: rpc error: code = DeadlineExceeded desc = context deadline exceeded" catalog="{redhat-operators openshift-marketplace}"
      ~~~
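To see at a glance which catalogs are timing out, the warnings above can be filtered by their `catalog` field. The following is a minimal sketch, assuming the logrus-style `key="value"` text format shown in the two log lines above; the function names are hypothetical and are not part of OLM.

```python
import re

# Matches key=value and key="quoted value" pairs in logrus-style text output.
FIELD_RE = re.compile(r'(\w+)=(?:"((?:[^"\\]|\\.)*)"|(\S+))')


def parse_fields(line):
    """Return the key/value fields of one log line as a dict."""
    fields = {}
    for key, quoted, bare in FIELD_RE.findall(line):
        fields[key] = quoted if quoted else bare
    return fields


def catalogs_with_deadline_exceeded(lines):
    """Collect catalog names from log lines whose msg mentions DeadlineExceeded."""
    catalogs = set()
    for line in lines:
        fields = parse_fields(line)
        if "DeadlineExceeded" in fields.get("msg", ""):
            # The catalog field looks like "{<name> <namespace>}".
            name = fields.get("catalog", "").strip("{}").split(" ")[0]
            if name:
                catalogs.add(name)
    return sorted(catalogs)
```

Feeding it the lines of a saved catalog-operator log returns the sorted names of the catalogs whose bundle listing hit the deadline.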
      
      We see this mostly with the OCS operator, which cannot be upgraded.
      The error in the catalog logs is:
      ~~~
      2023-05-28T13:04:22.954876501Z time="2023-05-28T13:04:22Z" level=debug msg="resolution failed" error="constraints not satisfiable: no operators found from catalog redhat-operators in namespace openshift-marketplace referenced by subscription ocs-operator, subscription ocs-operator exists" id=wE2th namespace=openshift-storage
      ~~~
      
      Steps done:
      - Deleted the existing CSVs in the openshift-storage namespace.
      - Deleted the existing Subscription in the openshift-storage namespace.
      - Re-installed OCS v4.8 from the OCP console; the behaviour did not change.
      
      Only the Subscription for OCS v4.8 was created; no CSV and no InstallPlan were created.
      ~~~
      # oc get csv,ip,sub
      NAME                                             PACKAGE        SOURCE             CHANNEL
      subscription.operators.coreos.com/ocs-operator   ocs-operator   redhat-operators   stable-4.8
      ~~~
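The same check can be scripted against saved `oc get csv,ip,sub` output. This is a rough sketch, assuming that when several resource types are listed each row name is prefixed with `<resource>.<group>/`, as in the `subscription.operators.coreos.com/` row above. The expected prefixes for CSVs and InstallPlans are an assumption by analogy, and the function names are hypothetical.

```python
# Resource-name prefixes expected in the NAME column (assumed by analogy
# with the "subscription.operators.coreos.com/..." row in this report).
EXPECTED_KINDS = {"clusterserviceversion", "installplan", "subscription"}


def kinds_present(oc_get_output):
    """Return the set of resource kinds that have at least one row."""
    kinds = set()
    for line in oc_get_output.splitlines():
        columns = line.split()
        if columns and "/" in columns[0]:
            # "subscription.operators.coreos.com/ocs-operator" -> "subscription"
            kinds.add(columns[0].split("/")[0].split(".")[0])
    return kinds


def kinds_missing(oc_get_output):
    """Return the expected kinds that do not appear in the output, sorted."""
    return sorted(EXPECTED_KINDS - kinds_present(oc_get_output))
```

For the output shown above, only the Subscription row is present, so the CSV and InstallPlan kinds are reported as missing.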
      
      The observation is that the catalog operator is not able to list the packages from the channel.
      
      Steps taken to debug the issue:
      - Ran the command below from the catalog-operator pod in the openshift-operator-lifecycle-manager project. It takes about 2 minutes to list the packages, but grpcurl finishes successfully.
      # grpcurl -plaintext <redhat-operators_SVC_IP>:50051 api.Registry/ListPackages
      
      - The catalog-operator pod and the redhat-operators pod are hosted on two different nodes. To check whether network latency between the nodes was delaying the package listing, I ran the same command from the redhat-operators pod itself, replacing the service IP with localhost, and saw the same delay. Moreover, the community-operators pod runs on the same node as the redhat-operators pod, and the same grpcurl command returns instantly when given the community-operators service IP instead of the redhat-operators service IP. So network latency can be ruled out.
      
      - Deleting the pods in the OLM project didn't solve the issue. The issue still persists after new pods were started.
      
      - Increased the debug log level, but the logs show no additional information.
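The timing comparison from the debugging steps above (service IP versus localhost, redhat-operators versus community-operators) could be repeated with a small wrapper like the one below. It is only a sketch: it reuses the `grpcurl -plaintext <address> api.Registry/ListPackages` invocation from this report, the helper names and the 300-second timeout are assumptions, and it must run somewhere `grpcurl` is installed and the catalog service is reachable.

```python
import subprocess
import sys
import time


def list_packages_argv(address):
    """Build the grpcurl invocation used in this report for a host:port address."""
    return ["grpcurl", "-plaintext", address, "api.Registry/ListPackages"]


def time_command(argv, timeout_s=300.0):
    """Run argv and return (elapsed_seconds, return_code).

    return_code is None if the command did not finish within timeout_s.
    """
    start = time.monotonic()
    try:
        completed = subprocess.run(
            argv,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout_s,
        )
        code = completed.returncode
    except subprocess.TimeoutExpired:
        code = None
    return time.monotonic() - start, code


if __name__ == "__main__":
    # Each argument is a host:port address, e.g. a catalog service IP with port 50051.
    for address in sys.argv[1:]:
        elapsed, code = time_command(list_packages_argv(address))
        print(f"{address}: {elapsed:.1f}s, exit code {code}")
```

Passing both catalog addresses on the command line prints one timing per catalog, which makes the difference between the slow and the fast catalog easy to record alongside the logs.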
      
      

      Version-Release number of selected component (if applicable):

      - OCP Cluster version is 4.8.51
      

      How reproducible:

      #N/A
      

      Actual results:

      - OCS operator can't be installed
      - One of our application runtimes (Event-streams pods) has already been impacted, mostly due to the ongoing upgrade issues.
      

      Expected results:

      
      

      Additional info:

      - The customer is trying to upgrade to the next EUS version.
      

            rh-ee-cchantse Catherine Chan-Tse
            rhn-support-vwalek Vladislav Walek
            Jian Zhang Jian Zhang
            Votes: 0
            Watchers: 18
