OCPBUGS-8708: Node Feature Discovery Operator is failing to update from OpenShift Container Platform 4.11 to 4.12

    • Critical
    • No
    • Rejected
    • False

      Description of problem:

      When updating OpenShift Container Platform 4.11 to 4.12 with the NFD Operator installed, the NFD Operator update gets stuck and fails with the error below, as reported in the Subscription.
      
      $ oc get Subscription -n openshift-nfd nfd -o yaml
      apiVersion: operators.coreos.com/v1alpha1
      kind: Subscription
      metadata:
        creationTimestamp: "2023-03-08T15:05:16Z"
        generation: 1
        labels:
          operators.coreos.com/nfd.openshift-nfd: ""
        name: nfd
        namespace: openshift-nfd
        resourceVersion: "7149045"
        uid: 0d744118-8568-4c31-8984-cbdcb4cce971
      spec:
        channel: stable
        installPlanApproval: Automatic
        name: nfd
        source: redhat-operators
        sourceNamespace: openshift-marketplace
        startingCSV: nfd.4.12.0-202302280915
      status:
        catalogHealth:
        - catalogSourceRef:
            apiVersion: operators.coreos.com/v1alpha1
            kind: CatalogSource
            name: certified-operators
            namespace: openshift-marketplace
            resourceVersion: "7143592"
            uid: 52d3b288-4042-43a4-9d4d-0bd0dd59b203
          healthy: true
          lastUpdated: "2023-03-08T15:05:17Z"
        - catalogSourceRef:
            apiVersion: operators.coreos.com/v1alpha1
            kind: CatalogSource
            name: community-operators
            namespace: openshift-marketplace
            resourceVersion: "7145481"
            uid: c303d9f0-7856-4394-bae1-807c4a8972a3
          healthy: true
          lastUpdated: "2023-03-08T15:05:17Z"
        - catalogSourceRef:
            apiVersion: operators.coreos.com/v1alpha1
            kind: CatalogSource
            name: redhat-marketplace
            namespace: openshift-marketplace
            resourceVersion: "7143591"
            uid: 6a373335-8fea-4323-a872-83af6ea21322
          healthy: true
          lastUpdated: "2023-03-08T15:05:17Z"
        - catalogSourceRef:
            apiVersion: operators.coreos.com/v1alpha1
            kind: CatalogSource
            name: redhat-operators
            namespace: openshift-marketplace
            resourceVersion: "7141365"
            uid: c4f2102b-cd2f-4e32-8793-f409c7e3ad04
          healthy: true
          lastUpdated: "2023-03-08T15:05:17Z"
        conditions:
        - lastTransitionTime: "2023-03-08T15:05:17Z"
          message: all available catalogsources are healthy
          reason: AllCatalogSourcesHealthy
          status: "False"
          type: CatalogSourcesUnhealthy
        - lastTransitionTime: "2023-03-08T15:06:22Z"
          message: 'error validating existing CRs against new CRD''s schema for "nodefeaturediscoveries.nfd.openshift.io":
            error validating custom resource against new schema for NodeFeatureDiscovery
            openshift-nfd/nfd-instance: [[].status.conditions[0].message: Required value,
            [].status.conditions[0].reason: Required value, [].status.conditions[1].message:
            Required value, [].status.conditions[1].reason: Required value, [].status.conditions[2].message:
            Required value, [].status.conditions[2].reason: Required value, [].status.conditions[3].message:
            Required value, [].status.conditions[3].reason: Required value]'
          reason: InstallComponentFailed
          status: "True"
          type: InstallPlanFailed
        currentCSV: nfd.4.12.0-202302280915
        installPlanGeneration: 1
        installPlanRef:
          apiVersion: operators.coreos.com/v1alpha1
          kind: InstallPlan
          name: install-xlnk6
          namespace: openshift-nfd
          resourceVersion: "7148053"
          uid: 881d42d8-1846-45ef-9e43-a5a64771ca4f
        installedCSV: nfd.4.12.0-202302280915
        installplan:
          apiVersion: operators.coreos.com/v1alpha1
          kind: InstallPlan
          name: install-xlnk6
          uuid: 881d42d8-1846-45ef-9e43-a5a64771ca4f
        lastUpdated: "2023-03-08T15:06:22Z"
        state: AtLatestKnown
      
      Removing and re-installing the NFD Operator does not help as the CRD validation continues to fail.
      
      Removing the affected CRD is probably the only way to recover from this, but doing so can impact running workloads and is not something we recommend.
      
      So it is key to understand why this is happening and to have a fix that does not require customers to remove their custom resources and custom resource definitions.
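
To see which entries of an existing custom resource would trip the new schema, the check can be approximated offline against an exported copy of the resource. The following is a minimal Python sketch, not the OLM implementation: it assumes that the new CRD schema requires `message` and `reason` on every entry of `status.conditions` (as the error text suggests), and the function name `missing_condition_fields`, the condition types, and the sample resource are hypothetical.

```python
import json

# Fields the error message reports as "Required value". This list is an
# assumption taken from the Subscription condition, not read from the CRD.
REQUIRED_CONDITION_FIELDS = ("message", "reason")


def missing_condition_fields(resource, required=REQUIRED_CONDITION_FIELDS):
    """Return paths such as 'status.conditions[0].message' for absent fields."""
    conditions = resource.get("status", {}).get("conditions", [])
    missing = []
    for index, condition in enumerate(conditions):
        for field in required:
            if field not in condition:
                missing.append(f"status.conditions[{index}].{field}")
    return missing


# Hypothetical sample, shaped like a NodeFeatureDiscovery resource exported as JSON.
sample = json.loads("""
{
  "kind": "NodeFeatureDiscovery",
  "metadata": {"name": "nfd-instance", "namespace": "openshift-nfd"},
  "status": {"conditions": [
    {"type": "ExampleA", "status": "True"},
    {"type": "ExampleB", "status": "False", "reason": "r", "message": "m"}
  ]}
}
""")

if __name__ == "__main__":
    for path in missing_condition_fields(sample):
        print(path)
```

In this sample only the first condition lacks the two fields, so only its paths are reported. A resource whose conditions all carry `message` and `reason` yields an empty list.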

      Version-Release number of selected component (if applicable):

      OpenShift Container Platform 4.12

      How reproducible:

      Always

      Steps to Reproduce:

      1. Install OpenShift Container Platform 4.11 and NFD Operator from Red Hat
      2. Update to OpenShift Container Platform 4.12 and observe that the NFD Operator update is stuck
      

      Actual results:

      error validating existing CRs against new CRD's schema for "nodefeaturediscoveries.nfd.openshift.io": error validating custom resource against new schema for NodeFeatureDiscovery openshift-nfd/nfd-instance: [[].status.conditions[0].message: Required value, [].status.conditions[0].reason: Required value, [].status.conditions[1].message: Required value, [].status.conditions[1].reason: Required value, [].status.conditions[2].message: Required value, [].status.conditions[2].reason: Required value, [].status.conditions[3].message: Required value, [].status.conditions[3].reason: Required value]
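
The error above lists one entry per missing field, which is hard to read as a single line. As an illustration only, the following Python sketch groups the entries by condition index. The regular expression is an assumption based on the message format shown in this report, the function name `parse_required_value_errors` is hypothetical, and the `message` string is shortened to the first two conditions.

```python
import re

# Matches entries like "[].status.conditions[0].message: Required value".
# The pattern is derived from the message in this report only.
ENTRY = re.compile(r"\[\]\.status\.conditions\[(\d+)\]\.(\w+): Required value")


def parse_required_value_errors(message):
    """Group the missing field names by the index of the affected condition."""
    grouped = {}
    for index, field in ENTRY.findall(message):
        grouped.setdefault(int(index), []).append(field)
    return grouped


# Shortened excerpt of the reported error (first two conditions only).
message = (
    "openshift-nfd/nfd-instance: "
    "[[].status.conditions[0].message: Required value, "
    "[].status.conditions[0].reason: Required value, "
    "[].status.conditions[1].message: Required value, "
    "[].status.conditions[1].reason: Required value]"
)

if __name__ == "__main__":
    for index, fields in sorted(parse_required_value_errors(message).items()):
        print(f"condition {index}: missing {', '.join(fields)}")
```

Applied to the full message in this report, the same grouping would show that all four conditions (indices 0 to 3) lack both `message` and `reason`.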
       

      Expected results:

      The update works without manual intervention by the platform engineer
       

      Additional info:

       

            rhn-gps-cprocter Chris Procter
            rhn-support-sreber Simon Reber
            Guy Gordani Guy Gordani