OCPBUGS-23736

SRIOV operator fails to deploy for OCP-4.13.23


    • Bug
    • Resolution: Cannot Reproduce
    • 4.13.z
    • Networking / SR-IOV
    • Quality / Stability / Reliability
    • Important

      Description of problem:

      Deployment of the SR-IOV operator fails on the spoke cluster.
      
      The environment is a hub cluster running OCP 4.14.3 and ACM 2.9.
      The deployment targets an OCP 4.13.23 SNO spoke with the Telco DU profile applied.
      
      As a result, the SR-IOV operator does not start successfully:
      
      [kni@registry.kni-qe-55 sriov]$ oc get pods -A | grep -vE 'NAME|Runn|Comp'
      openshift-cluster-node-tuning-operator             cluster-node-tuning-operator-67b7779c9f-96xhm                     0/1     RunContainerError   1 (14h ago)   15h
      openshift-machine-api                              cluster-autoscaler-operator-578d7fbb69-tc85f                      1/2     RunContainerError   2 (14h ago)   15h
      openshift-marketplace                              marketplace-operator-66d94456c7-g8mj7                             0/1     RunContainerError   4 (14h ago)   15h
      openshift-sriov-network-operator                   sriov-network-operator-6976f885b-qlzz2                            0/1     RunContainerError   1 (14h ago)   14h
      vran-acceleration-operators                        sriov-fec-controller-manager-7cf9cfb79-rmqcm                      1/2     RunContainerError   2 (14h ago)   14h
      
      [kni@registry.kni-qe-55 sriov]$ oc logs -n openshift-sriov-network-operator                   sriov-network-operator-6976f885b-qlzz2  
      I1122 05:45:15.444629       1 request.go:682] Waited for 1.011832258s due to client-side throttling, not priority and fairness, request: GET:https://172.30.0.1:443/apis/packages.operators.coreos.com/v1?timeout=32s
      I1122 05:45:25.485854       1 request.go:682] Waited for 1.473778564s due to client-side throttling, not priority and fairness, request: GET:https://172.30.0.1:443/apis/project.openshift.io/v1?timeout=32s
      2023-11-22T05:42:31.913318948Z	INFO	controller-runtime.metrics	Metrics server is starting to listen	{"addr": ":8080"}
      2023-11-22T05:42:31.933602061Z	ERROR	Failed to get API Group-Resources	{"error": "Unauthorized"}
      sigs.k8s.io/controller-runtime/pkg/cluster.New
      	/go/src/github.com/k8snetworkplumbingwg/sriov-network-operator/vendor/sigs.k8s.io/controller-runtime/pkg/cluster/cluster.go:160
      sigs.k8s.io/controller-runtime/pkg/manager.New
      	/go/src/github.com/k8snetworkplumbingwg/sriov-network-operator/vendor/sigs.k8s.io/controller-runtime/pkg/manager/manager.go:340
      main.main
      	/go/src/github.com/k8snetworkplumbingwg/sriov-network-operator/main.go:118
      runtime.main
      	/usr/lib/golang/src/runtime/proc.go:250
      2023-11-22T05:42:31.933871238Z	ERROR	setup	unable to start global manager	{"error": "Unauthorized"}
      main.main
      	/go/src/github.com/k8snetworkplumbingwg/sriov-network-operator/main.go:123
      runtime.main
      	/usr/lib/golang/src/runtime/proc.go:250
       
      
          

      Version-Release number of selected component (if applicable):

      
      hub:
      OCP 4.14.3
      ACM v2.9.0-211
      
      spoke:
      OCP 4.13.23
      
          

      How reproducible:

          

      Steps to Reproduce:

          1. Deploy the hub with OCP and ACM.
          2. Deploy the spoke with the OCP version listed above and the DU profile applied.
          3. Observe the deployment and monitor the pods and pod logs.
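Step 3 can be scripted. The following is a minimal sketch, not part of the original report: the function names `failing_pods` and `collect_failing_logs` are hypothetical, and it assumes `oc` is already logged in to the spoke cluster and that STATUS is the fourth column of `oc get pods -A`, as in the output shown in the description.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read `oc get pods -A` output on stdin and print
# "namespace pod status" for every pod that is neither Running nor Completed.
failing_pods() {
  awk 'NF >= 4 && $1 != "NAMESPACE" && $4 != "Running" && $4 != "Completed" {
         print $1, $2, $4
       }'
}

# Hypothetical helper: print the last 50 log lines of each failing pod.
# Assumes `oc` is authenticated against the spoke cluster.
collect_failing_logs() {
  oc get pods -A | failing_pods | while read -r ns pod status; do
    echo "=== ${ns}/${pod} (${status}) ==="
    # stdin is redirected so `oc` cannot consume the pod list in the pipe
    oc logs -n "$ns" "$pod" --all-containers 2>&1 < /dev/null | tail -n 50
  done
}
```

Calling `collect_failing_logs` after the spoke deployment would then produce one log excerpt per failing pod, similar to the output gathered by hand under "Additional info".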
          

      Actual results:

      The SR-IOV operator pod does not start successfully.
          

      Expected results:

      The SR-IOV operator installs and starts successfully.
          

      Additional info:

      
      oc logs -n openshift-cluster-node-tuning-operator cluster-node-tuning-operator-67b7779c9f-96xhm 
      I1122 05:45:20.090211       1 main.go:73] Go Version: go1.19.13 X:strictfipsruntime
      I1122 05:45:20.095179       1 main.go:74] Go OS/Arch: linux/amd64
      I1122 05:45:20.095209       1 main.go:75] node-tuning Version: v4.13.0-202311131234.p0.g05a417a.assembly.stream-0-g75f01f2-dirty
      I1122 05:45:21.334884       1 request.go:601] Waited for 1.012858837s due to client-side throttling, not priority and fairness, request: GET:https://172.30.0.1:443/apis/sriovfec.intel.com/v2?timeout=32s
      I1122 05:45:31.356032       1 request.go:601] Waited for 1.492331213s due to client-side throttling, not priority and fairness, request: GET:https://172.30.0.1:443/apis/project.openshift.io/v1?timeout=32s
      F1122 05:42:37.355317       1 main.go:146] unable to migrate pinned single node infra status: Unauthorized
      
      oc logs -n openshift-machine-api cluster-autoscaler-operator-578d7fbb69-tc85f 
      I1122 05:45:20.991384       1 main.go:13] Go Version: go1.19.13 X:strictfipsruntime
      I1122 05:45:20.991500       1 main.go:14] Go OS/Arch: linux/amd64
      I1122 05:45:20.991505       1 main.go:15] Version: cluster-autoscaler-operator v4.13.0-202311021930.p0.g8531634.assembly.stream-dirty
      I1122 05:45:22.229585       1 request.go:690] Waited for 1.016338543s due to client-side throttling, not priority and fairness, request: GET:https://172.30.0.1:443/apis/authorization.k8s.io/v1?timeout=32s
      I1122 05:45:32.229640       1 request.go:690] Waited for 3.295298677s due to client-side throttling, not priority and fairness, request: GET:https://172.30.0.1:443/apis/sriovfec.intel.com/v2?timeout=32s
      W1122 05:42:33.611927       1 machineautoscaler_controller.go:150] Removing support for unregistered target type: cluster.k8s.io/v1beta1, Kind=MachineDeployment
      F1122 05:42:37.830221       1 main.go:33] Failed to create operator: failed to add controllers: the server has asked for the client to provide credentials
      
      oc logs -n openshift-marketplace marketplace-operator-66d94456c7-g8mj7 
      time="2023-11-22T05:45:12Z" level=info msg="Go Version: go1.19.13 X:strictfipsruntime"
      time="2023-11-22T05:45:12Z" level=info msg="Go OS/Arch: linux/amd64"
      time="2023-11-22T05:45:12Z" level=info msg="[metrics] Registering marketplace metrics"
      time="2023-11-22T05:45:12Z" level=info msg="[metrics] Serving marketplace metrics"
      time="2023-11-22T05:45:12Z" level=info msg="TLS keys set, using https for metrics"
      time="2023-11-22T05:45:12Z" level=info msg="Config API is available"
      time="2023-11-22T05:45:12Z" level=info msg="setting up scheme"
      I1122 05:45:13.792308       1 request.go:601] Waited for 1.038697123s due to client-side throttling, not priority and fairness, request: GET:https://172.30.0.1:443/apis/logging.openshift.io/v1?timeout=32s
      time="2023-11-22T05:45:19Z" level=info msg="setting up health checks"
      I1122 05:45:19.563843       1 leaderelection.go:248] attempting to acquire leader lease openshift-marketplace/marketplace-operator-lock...
      I1122 05:45:19.728714       1 leaderelection.go:258] successfully acquired lease openshift-marketplace/marketplace-operator-lock
      time="2023-11-22T05:45:19Z" level=info msg="became leader: marketplace-operator-66d94456c7-g8mj7"
      time="2023-11-22T05:45:19Z" level=info msg="registering components"
      time="2023-11-22T05:45:19Z" level=info msg="setting up the marketplace clusteroperator status reporter"
      I1122 05:45:23.829953       1 request.go:601] Waited for 1.338565292s due to client-side throttling, not priority and fairness, request: GET:https://172.30.0.1:443/apis/certificates.k8s.io/v1?timeout=32s
      time="2023-11-22T05:45:26Z" level=info msg="setting up controllers"
      time="2023-11-22T05:45:26Z" level=info msg="starting the marketplace clusteroperator status reporter"
      time="2023-11-22T05:45:26Z" level=info msg="starting manager"
      time="2023-11-22T05:45:27Z" level=info msg="Reconciling ConfigMap openshift-marketplace/marketplace-trusted-ca"
      time="2023-11-22T05:45:27Z" level=info msg="[ca] Certificate Authorization ConfigMap openshift-marketplace/marketplace-trusted-ca is in sync with disk." name=marketplace-trusted-ca type=ConfigMap
      time="2023-11-22T05:45:27Z" level=info msg="Reconciling OperatorHub cluster"
      time="2023-11-22T05:45:27Z" level=info msg="Reconciling OperatorHub cluster"
      time="2023-11-22T05:45:27Z" level=info msg="Reconciling OperatorHub cluster"
      time="2023-11-22T05:45:27Z" level=info msg="Reconciling OperatorHub cluster"
      time="2023-11-22T05:45:27Z" level=info msg="Reconciling OperatorHub cluster"
      time="2023-11-22T05:45:27Z" level=info msg="Reconciling OperatorHub cluster"
      time="2023-11-22T05:45:27Z" level=info msg="Reconciling OperatorHub cluster"
      time="2023-11-22T05:45:27Z" level=info msg="Reconciling OperatorHub cluster"
      E1122 05:42:46.397679       1 leaderelection.go:330] error retrieving resource lock openshift-marketplace/marketplace-operator-lock: Unauthorized
      E1122 05:43:16.474293       1 leaderelection.go:330] error retrieving resource lock openshift-marketplace/marketplace-operator-lock: Unauthorized
      I1122 05:43:46.388323       1 leaderelection.go:283] failed to renew lease openshift-marketplace/marketplace-operator-lock: timed out waiting for the condition
      E1122 05:43:46.388481       1 leaderelection.go:306] Failed to release lock: resource name may not be empty
      time="2023-11-22T05:43:46Z" level=warning msg="leader election lost for marketplace-operator-66d94456c7-g8mj7 identity"
      
      oc logs -n openshift-sriov-network-operator sriov-network-operator-6976f885b-qlzz2 
      I1122 05:45:15.444629       1 request.go:682] Waited for 1.011832258s due to client-side throttling, not priority and fairness, request: GET:https://172.30.0.1:443/apis/packages.operators.coreos.com/v1?timeout=32s
      I1122 05:45:25.485854       1 request.go:682] Waited for 1.473778564s due to client-side throttling, not priority and fairness, request: GET:https://172.30.0.1:443/apis/project.openshift.io/v1?timeout=32s
      2023-11-22T05:42:31.913318948Z	INFO	controller-runtime.metrics	Metrics server is starting to listen	{"addr": ":8080"}
      2023-11-22T05:42:31.933602061Z	ERROR	Failed to get API Group-Resources	{"error": "Unauthorized"}
      sigs.k8s.io/controller-runtime/pkg/cluster.New
      	/go/src/github.com/k8snetworkplumbingwg/sriov-network-operator/vendor/sigs.k8s.io/controller-runtime/pkg/cluster/cluster.go:160
      sigs.k8s.io/controller-runtime/pkg/manager.New
      	/go/src/github.com/k8snetworkplumbingwg/sriov-network-operator/vendor/sigs.k8s.io/controller-runtime/pkg/manager/manager.go:340
      main.main
      	/go/src/github.com/k8snetworkplumbingwg/sriov-network-operator/main.go:118
      runtime.main
      	/usr/lib/golang/src/runtime/proc.go:250
      2023-11-22T05:42:31.933871238Z	ERROR	setup	unable to start global manager	{"error": "Unauthorized"}
      main.main
      	/go/src/github.com/k8snetworkplumbingwg/sriov-network-operator/main.go:123
      runtime.main
      	/usr/lib/golang/src/runtime/proc.go:250
      
      oc logs -n vran-acceleration-operators sriov-fec-controller-manager-7cf9cfb79-rmqcm 
      I1122 05:45:24.890926       1 request.go:682] Waited for 1.045888078s due to client-side throttling, not priority and fairness, request: GET:https://172.30.0.1:443/apis/machine.openshift.io/v1?timeout=32s
      I1122 05:42:31.446274       1 request.go:682] Waited for 1.544535428s due to client-side throttling, not priority and fairness, request: GET:https://172.30.0.1:443/apis/cloudcredential.openshift.io/v1?timeout=32s
      {"addr":"127.0.0.1:8080","file":"/workspace/pkg/common/utils/logger_wrapper.go:26","func":"github.com/intel-collab/applications.orchestration.operators.sriov-fec-operator/pkg/common/utils.(*logrusWrapper).Info","level":"info","msg":"Metrics server is starting to listen","name":"events","time":"2023-11-22T05:42:33Z"}
      {"GVK":{"Group":"sriovfec.intel.com","Version":"v2","Kind":"SriovFecClusterConfig"},"file":"/workspace/pkg/common/utils/logger_wrapper.go:26","func":"github.com/intel-collab/applications.orchestration.operators.sriov-fec-operator/pkg/common/utils.(*logrusWrapper).Info","level":"info","msg":"skip registering a mutating webhook, object does not implement admission.Defaulter or WithDefaulter wasn't called","time":"2023-11-22T05:42:33Z"}
      {"GVK":{"Group":"sriovfec.intel.com","Version":"v2","Kind":"SriovFecClusterConfig"},"file":"/workspace/pkg/common/utils/logger_wrapper.go:26","func":"github.com/intel-collab/applications.orchestration.operators.sriov-fec-operator/pkg/common/utils.(*logrusWrapper).Info","level":"info","msg":"Registering a validating webhook","name":"webhooks","path":"/validate-sriovfec-intel-com-v2-sriovfecclusterconfig","time":"2023-11-22T05:42:33Z"}
      {"file":"/workspace/pkg/common/utils/logger_wrapper.go:26","func":"github.com/intel-collab/applications.orchestration.operators.sriov-fec-operator/pkg/common/utils.(*logrusWrapper).Info","level":"info","msg":"Registering webhook","time":"2023-11-22T05:42:33Z"}
      {"error":"Unauthorized","file":"/workspace/main.go:162","func":"main.createClient","level":"error","msg":"failed to create client","time":"2023-11-22T05:42:33Z"}
      [kni@registry.kni-qe-55 sriov]$ 
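All five pods above fail with an authentication error from the API server ("Unauthorized" or "the server has asked for the client to provide credentials"). The following is a small sketch for pulling just those lines out of a pod log. The function name `auth_errors` is hypothetical, and the pattern covers only the two messages seen in this report.

```shell
#!/usr/bin/env bash
# Hypothetical helper: keep only the authentication failures from log text
# read on stdin. The pattern matches the two messages seen in this report.
auth_errors() {
  grep -E 'Unauthorized|provide credentials'
}
```

For example, `oc logs -n openshift-sriov-network-operator sriov-network-operator-6976f885b-qlzz2 | auth_errors` would be expected to print the two ERROR lines from that pod's log.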
      
      
      
      
          

              bnemeth@redhat.com Balazs Nemeth
              rhn-support-dgonyier Dwaine Gonyier
              Zhanqi Zhao